Re: [nupic-discuss] Classifier inner workings

Allan Inocêncio de Souza Costa Wed, 22 Jan 2014 10:29:46 -0800

Of course!

Here are my model_params. My machine's memory can handle about 512 columns in the SP. 
Also, the pixel fields are set at the end:

MODEL_PARAMS = {
    # Type of model that the rest of these parameters apply to.
    'model': "CLA",

    # Version that specifies the format of the config.
    'version': 1,

    # Intermediate variables used to compute fields in modelParams and also
    # referenced from the control section.
    'aggregationInfo': {   'days': 0,
        'fields': [],
        'hours': 0,
        'microseconds': 0,
        'milliseconds': 0,
        'minutes': 0,
        'months': 0,
        'seconds': 0,
        'weeks': 0,
        'years': 0},

    'predictAheadTime': None,

    # Model parameter dictionary.
    'modelParams': {
        # The type of inference that this model will perform
        'inferenceType': 'NontemporalClassification',

        'sensorParams': {
            # Sensor diagnostic output verbosity control;
            # if > 0: sensor region will print out on screen what it's sensing
            # at each step 0: silent; >=1: some info; >=2: more info;
            # >=3: even more info (see compute() in py/regions/RecordSensor.py)
            'verbosity' : 0,

            # Example:
            #     dsEncoderSchema = [
            #       DeferredDictLookup('__field_name_encoder'),
            #     ],
            #
            # (value generated from DS_ENCODER_SCHEMA)
            'encoders': {
                u'label':     {  
                  'classifierOnly': True, 
                  'fieldname': u'label',
                  'n': 121,
                  'name': u'label',
                  'type': 'ScalarEncoder',
                  'minval':0,
                  'maxval':9,
                  'w': 21},
            },

            # A dictionary specifying the period for automatically-generated
            # resets from a RecordSensor;
            #
            # None = disable automatically-generated resets (also disabled if
            # all of the specified values evaluate to 0).
            # Valid keys is the desired combination of the following:
            #   days, hours, minutes, seconds, milliseconds, microseconds, weeks
            #
            # Example for 1.5 days: sensorAutoReset = dict(days=1,hours=12),
            #
            # (value generated from SENSOR_AUTO_RESET)
            'sensorAutoReset' : None,
        },

        'spEnable': True,

        'spParams': {
            # SP diagnostic output verbosity control;
            # 0: silent; >=1: some info; >=2: more info;
            'spVerbosity' : 0,

            'globalInhibition': 1,

            # Number of cell columns in the cortical region (same number for
            # SP and TP)
            # (see also tpNCellsPerCol)
            'columnCount': 256,

            'inputWidth': 256,

            # SP inhibition control (absolute value);
            # Maximum number of active columns in the SP region's output (when
            # there are more, the weaker ones are suppressed)
            'numActivePerInhArea': 40,

            'seed': 1956,

            # coincInputPoolPct
            # What percent of the columns's receptive field is available
            # for potential synapses. At initialization time, we will
            # choose coincInputPoolPct * (2*coincInputRadius+1)^2
            'coincInputPoolPct': 0.5,

            # The default connected threshold. Any synapse whose
            # permanence value is above the connected threshold is
            # a "connected synapse", meaning it can contribute to the
            # cell's firing. Typical value is 0.10. Cells whose activity
            # level before inhibition falls below minDutyCycleBeforeInh
            # will have their own internal synPermConnectedCell
            # threshold set below this default value.
            # (This concept applies to both SP and TP and so 'cells'
            # is correct here as opposed to 'columns')
            'synPermConnected': 0.1,

            'synPermActiveInc': 0.1,

            'synPermInactiveDec': 0.01,

            'randomSP': 0,
        },

        # Controls whether TP is enabled or disabled;
        # TP is necessary for making temporal predictions, such as predicting
        # the next inputs.  Without TP, the model is only capable of
        # reconstructing missing sensor inputs (via SP).
        'tpEnable' : False,

        'tpParams': {
            # TP diagnostic output verbosity control;
            # 0: silent; [1..6]: increasing levels of verbosity
            # (see verbosity in nta/trunk/py/nupic/research/TP.py and TP10X*.py)
            'verbosity': 0,

            # Number of cell columns in the cortical region (same number for
            # SP and TP)
            # (see also tpNCellsPerCol)
            'columnCount': 2048,

            # The number of cells (i.e., states), allocated per column.
            'cellsPerColumn': 32,

            'inputWidth': 2048,

            'seed': 1960,

            # Temporal Pooler implementation selector (see _getTPClass in
            # CLARegion.py).
            'temporalImp': 'cpp',

            # New Synapse formation count
            # NOTE: If None, use spNumActivePerInhArea
            #
            # TODO: need better explanation
            'newSynapseCount': 20,

            # Maximum number of synapses per segment
            #  > 0 for fixed-size CLA
            # -1 for non-fixed-size CLA
            #
            # TODO: for Ron: once the appropriate value is placed in TP
            # constructor, see if we should eliminate this parameter from
            # description.py.
            'maxSynapsesPerSegment': 32,

            # Maximum number of segments per cell
            #  > 0 for fixed-size CLA
            # -1 for non-fixed-size CLA
            #
            # TODO: for Ron: once the appropriate value is placed in TP
            # constructor, see if we should eliminate this parameter from
            # description.py.
            'maxSegmentsPerCell': 128,

            # Initial Permanence
            # TODO: need better explanation
            'initialPerm': 0.21,

            # Permanence Increment
            'permanenceInc': 0.1,

            # Permanence Decrement
            # If set to None, will automatically default to tpPermanenceInc
            # value.
            'permanenceDec' : 0.1,

            'globalDecay': 0.0,

            'maxAge': 0,

            # Minimum number of active synapses for a segment to be considered
            # during search for the best-matching segments.
            # None=use default
            # Replaces: tpMinThreshold
            'minThreshold': 12,

            # Segment activation threshold.
            # A segment is active if it has >= tpSegmentActivationThreshold
            # connected synapses that are active due to infActiveState
            # None=use default
            # Replaces: tpActivationThreshold
            'activationThreshold': 16,

            'outputType': 'normal',

            # "Pay Attention Mode" length. This tells the TP how many new
            # elements to append to the end of a learned sequence at a time.
            # Smaller values are better for datasets with short sequences,
            # higher values are better for datasets with long sequences.
            'pamLength': 1,
        },

        'clParams': {
            'regionName' : 'CLAClassifierRegion',

            # Classifier diagnostic output verbosity control;
            # 0: silent; [1..6]: increasing levels of verbosity
            'clVerbosity' : 0,

            # This controls how fast the classifier learns/forgets. Higher
            # values make it adapt faster and forget older patterns faster.
            'alpha': 0.001,

            # This is set after the call to updateConfigFromSubConfig and is
            # computed from the aggregationInfo and predictAheadTime.
            'steps': '0',
        },

        'anomalyParams': {   
          u'anomalyCacheRecords': None,
          u'autoDetectThreshold': None,
          u'autoDetectWaitRecords': None
        },

        'trainSPNetOnlyIfRequested': False,
    }
}

# Add one ScalarEncoder per pixel (28x28 = 784 pixels, values 0-255).
encoders = MODEL_PARAMS['modelParams']['sensorParams']['encoders']
for i in range(784):
    encoders['pixel%d' % i] = {
        'fieldname': u'pixel%d' % i,
        'n': 121,
        'name': u'pixel%d' % i,
        'type': 'ScalarEncoder',
        'minval': 0,
        'maxval': 255,
        'w': 21,
    }
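
For reference, the size of the input that these encoders produce can be estimated with a few lines of arithmetic. This is only a sketch under stated assumptions: each non-periodic ScalarEncoder emits exactly n bits with a resolution of (maxval - minval) / (n - w), the classifierOnly label field is not fed to the SP, and each SP column samples coincInputPoolPct of the whole input. The helper names are made up for illustration and are not part of NuPIC.

```python
# Rough size estimate for the encoder setup above (illustrative only;
# the function names are hypothetical, not NuPIC API).

def scalar_resolution(minval, maxval, n, w):
    """Value difference per bucket, assuming a non-periodic scalar
    encoder with n output bits, w of which are active."""
    return float(maxval - minval) / (n - w)


def sp_input_width(num_fields, n):
    """Total number of input bits if every field contributes n bits."""
    return num_fields * n


def potential_synapses(column_count, input_width, pool_pct):
    """Potential synapses if each column samples pool_pct of the input."""
    return int(column_count * input_width * pool_pct)


pixel_resolution = scalar_resolution(0, 255, n=121, w=21)
label_resolution = scalar_resolution(0, 9, n=121, w=21)
input_width = sp_input_width(num_fields=784, n=121)
synapses_256 = potential_synapses(256, input_width, 0.5)
synapses_2048 = potential_synapses(2048, input_width, 0.5)

print("pixel resolution: %.2f" % pixel_resolution)
print("label resolution: %.2f" % label_resolution)
print("SP input width:   %d bits" % input_width)
print("potential synapses, 256 columns:  %d" % synapses_256)
print("potential synapses, 2048 columns: %d" % synapses_2048)
```

Under these assumptions the input is about 95 thousand bits wide, so every extra column adds roughly 47 thousand potential synapses. That may be part of why memory fills up quickly as the column count grows.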

Best regards,
Allan

On Wednesday, January 22, 2014 at 16:19, Pedro Tabacof <[email protected]> 
wrote:

It's odd that the SP runs out of memory. Could you post your code here?

On Wed, Jan 22, 2014 at 4:15 PM, Allan Inocêncio de Souza Costa 
<[email protected]> wrote:

Thanks for the reply, Pedro and Mark. 
>
>@Pedro
>You're right, I'm not using SP or TP. I did try simply enabling the SP in 
>model_params.py, but it soon ran out of memory (I have 8 GB), so I have to 
>use only a few columns, and the results do not improve.
>
>@Mark
>I agree with point 3.
>About point 1: I think you're right about the classifier and I would like to 
>know more details about how it is implemented, so if someone knows, please let 
>me know.
>About point 2: the images are encoded as 1D arrays with 784 (28x28) features, 
>so I do lose topological information. But it is still a high-dimensional space 
>in which the data clusters well, so that even the hyperplanes obtained 
>by simple logistic regression can classify the digits with good 
>accuracy (> 90%).
>That's why I would like to get more information about the classifier itself.
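
As a rough illustration of what a classifier of this kind could look like, here is a simplified sketch. It is not NuPIC's CLAClassifier code. It assumes each input bit keeps a moving-average weight per label bucket, updated with rate alpha, and that inference sums the weights of the active bits and normalizes them into a distribution. The class and method names are hypothetical.

```python
from collections import defaultdict


class BitVoteClassifier(object):
    """Toy classifier in which every active input bit votes for buckets."""

    def __init__(self, alpha=0.001):
        self.alpha = alpha
        # weights[bit][bucket] is a moving average of how often `bucket`
        # was the label while `bit` was active.
        self.weights = defaultdict(lambda: defaultdict(float))
        self.buckets = set()

    def learn(self, active_bits, bucket):
        """Move the weights of the active bits toward the seen bucket."""
        self.buckets.add(bucket)
        for bit in active_bits:
            history = self.weights[bit]
            for b in self.buckets:
                target = 1.0 if b == bucket else 0.0
                history[b] += self.alpha * (target - history[b])

    def infer(self, active_bits):
        """Return a normalized {bucket: likelihood} dictionary."""
        votes = dict((b, 0.0) for b in self.buckets)
        for bit in active_bits:
            history = self.weights.get(bit)
            if history is None:
                continue  # this bit was never active during learning
            for b, value in history.items():
                votes[b] += value
        total = sum(votes.values())
        if total == 0.0:
            # Nothing learned for these bits: fall back to uniform.
            count = len(votes)
            return dict((b, 1.0 / count) for b in votes) if count else {}
        return dict((b, v / total) for b, v in votes.items())


clf = BitVoteClassifier(alpha=0.1)
for _ in range(50):
    clf.learn([0, 1, 2], bucket=3)
    clf.learn([2, 3, 4], bucket=7)

dist = clf.infer([0, 1])
best = max(dist, key=dist.get)
print("most likely bucket:", best)
```

In this sketch a larger alpha makes recent examples dominate the weights, which matches the description of alpha in the clParams comment.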
>
>Best regards,
>
>Allan
>
>
>
>On Wednesday, January 22, 2014 at 15:28, Pedro Tabacof 
><[email protected]> wrote:
> 
>Marek's second point is of utmost importance for anyone doing image 
>classification. It would be awesome if someone could make 2D topology easily 
>available. Convolutional neural networks are so much better than regular 
>neural networks for image classification.
>
>
>
>
>
>On Wed, Jan 22, 2014 at 3:18 PM, Marek Otahal <[email protected]> wrote:
>
>Hi Allan, 
>>
>>that was maybe me, it's great someone is working on the MNIST here! 
>>
>>1/ I'm not 100% clear about the Classifier, but I think it's just a helper 
>>utility, unrelated to the HTM/CLA, so you've been testing the performance of 
>>whatever algorithm the Classifier implements (not the CLA, imho). So you'd 
>>want to create a CLA (with SP only) and place the Classifier on top of it. 
>>The pipeline would look like: {MNIST-data[ith-example]} >>> CLA (without TP) 
>>>>> (you get an SDR) >>> Classifier (add MNIST-label[ith-example]) 
>>
>>2/ I assume the MNIST dataset is created from 2D images of handwritten digits 
>>that are simply flattened into a 1D array (??). 
>>Then you'll lose a lot of topological info passing it to the CLA as is. I 
>>think this will require resurrecting the Image Encoders, which take into 
>>account the distance between neighboring pixels (each pixel has 8 neighboring 
>>px); this is used in inhibition etc. 
>>
>>3/ You're probably overfitting; try experimenting with an 80%/20% data split 
>>instead. 
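
The 80%/20% split suggested in point 3 can be sketched as follows. This is a minimal illustration that assumes the examples are addressed by index. The function names, the example count of 1000, and the fixed seed are placeholders, not anything from the original experiment.

```python
import random


def split_indices(num_examples, train_fraction=0.8, seed=42):
    """Shuffle example indices and split them into train and test parts."""
    indices = list(range(num_examples))
    random.Random(seed).shuffle(indices)
    cut = int(round(num_examples * train_fraction))
    return indices[:cut], indices[cut:]


def accuracy(predictions, labels):
    """Fraction of predictions that equal the true labels."""
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return float(correct) / len(labels)


# Placeholder size; use the real number of examples in practice.
train_idx, test_idx = split_indices(1000)
print("train examples:", len(train_idx))
print("test examples: ", len(test_idx))
```

The model would learn only on the examples in train_idx, and accuracy would be reported only on the examples in test_idx with learning disabled, so the reported number reflects examples the model has not seen.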
>>
>>Cheers, Mark
>>
>>
>>
>>
>>On Wed, Jan 22, 2014 at 5:57 PM, Allan Inocêncio de Souza Costa 
>><[email protected]> wrote:
>>
>>
>>>
>>>Hi,
>>>
>>>
>>>I read a question that someone else asked here, but I couldn't find the 
>>>question or the answers (if any), so I will ask again, as I'm now working 
>>>with the classifier.
>>>
>>>
>>>I tried to apply the classifier to handwritten digit recognition 
>>>using the MNIST dataset. The best result I got was an overall accuracy 
>>>of about 42% (by that I mean that after training on the entire dataset, the 
>>>proportion of correct predictions from the first to the last training 
>>>example was 42%), after playing a little with the encoders. Of course 
>>>this is better than the expected 10% accuracy of a random guesser, 
>>>but it falls short of what other (linear) algorithms achieve. 
>>>For those interested, I attached a plot of the accuracy.
>>>
>>>
>>>
>>>So here comes the question: what are the inner workings of the classifier? 
>>>I'm puzzled, as it doesn't have an SP. Can someone help or point me to some 
>>>reading?
>>>
>>> 
>>>Best regards,
>>>Allan
>>>
>>>_______________________________________________
>>>nupic mailing list
>>>[email protected]
>>>http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>>>
>>>
>>
>>
>>-- 
>>Marek Otahal :o) 
>>
>>
>
>
>
>-- 
>Pedro Tabacof,
>Unicamp - Eng. de Computação 08.
>
>
>

-- 
Pedro Tabacof,
Unicamp - Eng. de Computação 08.

