Hi all,

This is a short explanation of the new instances of SAMOA.
https://github.com/abifet/moa/tree/master/moa/src/main/java/com/yahoo/labs/samoa/instances

Instances will be much simpler than the current implementation. They can be dense or sparse, and they contain only one array (or two for sparse) with all the attribute values. In the current implementation we have two arrays, one for input values and another for output values.

There are two main changes:

1/ All instances are going to be multi-label: they have input and output attributes, and we can get their values with getInputValue(i) and getOutputValue(i).

2/ Attributes are numeric by default, so we only keep information about discrete attributes (values). For example, if we have one million numeric attributes, we will not need to store attribute information for any of them.

Basically, we have:

- Instance: interface
- MultiLabelInstance: interface (empty interface that extends Instance)
- InstanceImpl implements MultiLabelInstance: implementation of Instance. Contains
    - InstanceData
    - InstancesHeader
- DenseInstance extends InstanceImpl
- SparseInstance extends InstanceImpl
- Instances: a list of instances and an InstanceInformation object
- InstancesHeader extends Instances
- InstanceData: interface
- DenseInstanceData implements InstanceData
- SparseInstanceData implements InstanceData
- InstanceInformation contains name, attribute information and attributes to predict.
- AttributesInformation contains two lists of Attributes (indices and values) for non-numerical attributes. Attributes are numerical by default.
- Range: attributes to predict

Cheers,
Albert
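
P.S. To make the dense/sparse storage idea concrete, here is a minimal sketch. It is not the code in the repository: InstanceSketch, Data, DenseData, SparseData and View are hypothetical names standing in for the InstanceData, DenseInstanceData and SparseInstanceData roles described above. It also assumes that output attributes are stored after the input attributes, which the real Range object may or may not do. Only getInputValue(i) and getOutputValue(i) are names taken from the description.

```java
// Hypothetical sketch; every type name here is illustrative, not the SAMOA/MOA API.
final class InstanceSketch {

    /** Stand-in for the InstanceData role: values addressed by attribute index. */
    interface Data {
        int numAttributes();
        double value(int index);
    }

    /** Dense storage: one array holding every attribute value. */
    static final class DenseData implements Data {
        private final double[] values;

        DenseData(double[] values) {
            this.values = values.clone();
        }

        public int numAttributes() { return values.length; }

        public double value(int index) { return values[index]; }
    }

    /** Sparse storage: two parallel arrays, sorted indices and their values. */
    static final class SparseData implements Data {
        private final int numAttributes;
        private final int[] indices;   // attribute indices, sorted ascending
        private final double[] values; // values[k] belongs to attribute indices[k]

        SparseData(int numAttributes, int[] indices, double[] values) {
            if (indices.length != values.length) {
                throw new IllegalArgumentException("indices and values must have equal length");
            }
            this.numAttributes = numAttributes;
            this.indices = indices.clone();
            this.values = values.clone();
        }

        public int numAttributes() { return numAttributes; }

        public double value(int index) {
            if (index < 0 || index >= numAttributes) {
                throw new IndexOutOfBoundsException("attribute index " + index);
            }
            // Attributes that are not stored explicitly are treated as 0.0.
            int pos = java.util.Arrays.binarySearch(indices, index);
            return pos >= 0 ? values[pos] : 0.0;
        }
    }

    /** Splits one value store into inputs and outputs (assumption: outputs come last). */
    static final class View {
        private final Data data;
        private final int numOutputs;

        View(Data data, int numOutputs) {
            this.data = data;
            this.numOutputs = numOutputs;
        }

        int numInputs() { return data.numAttributes() - numOutputs; }

        double getInputValue(int i) {
            if (i < 0 || i >= numInputs()) throw new IndexOutOfBoundsException("input " + i);
            return data.value(i);
        }

        double getOutputValue(int i) {
            if (i < 0 || i >= numOutputs) throw new IndexOutOfBoundsException("output " + i);
            return data.value(numInputs() + i);
        }
    }

    public static void main(String[] args) {
        // The same instance stored both ways: three inputs followed by one output.
        Data dense = new DenseData(new double[] {1.0, 0.0, 0.0, 2.0});
        Data sparse = new SparseData(4, new int[] {0, 3}, new double[] {1.0, 2.0});
        for (Data d : new Data[] {dense, sparse}) {
            View v = new View(d, 1);
            System.out.println(v.getInputValue(0) + " " + v.getInputValue(1) + " " + v.getOutputValue(0));
        }
    }
}
```

Both representations answer the same queries, so the calling code does not need to know which one backs an instance. The sparse variant only pays for the attributes that are actually stored.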
