Hi Michael and Scott,
Thank you very much for your explanations. Michael's explanation
implies that the Temporal Pooler contributes substantially to learning
spatial invariance from the training data, which I can see working.
But for question 3, Scott said "No need for TP. It won't help with
spatial representations." Scott, I was hoping you could expand on your
answer: how do you think the SP and TP each contribute to spatially
invariant recognition?
Best Regards,
Quinn Liu
On Mon, Jul 15, 2013 at 5:07 PM, Michael Ferrier
<[email protected]>wrote:
> Hi Quinn,
>
> The older version of HTM would group together the spatial patterns that
> would tend to occur in close temporal sequence with one another, and
> produce the same output when it saw any of the spatial patterns within a
> given group. So, if a network were trained on visual input of digits
> zig-zagging through the visual field, then any individual visual feature
> (for example a vertical line) would come to be represented by a temporal
> group that responds when it is presented with a vertical line at any of
> many nearby locations, because in the training data, a vertical line is
> often seen moving from one location to another nearby location. In this way
> it would learn invariance to position. At the lowest level of the hierarchy
> it would learn invariance to position for individual small visual features,
> and at higher levels it would learn invariance for more complex and larger
> arrangements of features and whole visual objects. Invariance to other
> transformations like scale, rotation, etc. could also be learned this way
> given the appropriate training data.
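
To make the temporal-grouping idea concrete, here is a toy sketch (my own
illustration, not code from the original HTM implementation): pattern IDs
that frequently occur in adjacent time steps are merged into one group via
union-find, so a vertical line seen drifting across nearby positions ends
up represented by a single group. All names below are made up for the
example.

```python
from collections import defaultdict

def temporal_groups(sequence, min_count=2):
    """Toy temporal pooling: merge pattern IDs that occur in adjacent
    time steps at least min_count times. Illustrative sketch only."""
    # Count how often each unordered pair of patterns is temporally adjacent.
    counts = defaultdict(int)
    for a, b in zip(sequence, sequence[1:]):
        if a != b:
            counts[frozenset((a, b))] += 1

    # Union-find over the pattern IDs.
    parent = {p: p for p in set(sequence)}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path compression
            p = parent[p]
        return p

    # Merge patterns whose adjacency count passes the threshold.
    for pair, c in counts.items():
        if c >= min_count:
            a, b = tuple(pair)
            ra, rb = find(a), find(b)
            if ra != rb:
                parent[ra] = rb

    groups = defaultdict(set)
    for p in parent:
        groups[find(p)].add(p)
    return list(groups.values())

# A vertical line drifting among nearby positions v0..v2, plus an
# unrelated pattern h0 that appears only briefly.
seq = ["v0", "v1", "v2", "v1", "v0", "v1", "v2", "h0",
       "v2", "v1", "v0", "v1"]
print(temporal_groups(seq, min_count=3))
```

With this drifting-line sequence, the three line positions v0-v2 collapse
into one group while h0, which is rarely adjacent to them, stays separate.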
>
> As Scott said, the old version of HTM worked very differently from the
> CLA, but they both model the same basic principles (the CLA just does so
> much more flexibly). Using a CLA region with one cell per column, a cell should
> become active when given a particular spatial pattern, but should become
> predictive when given any pattern that (during training) often occurs close
> by in temporal sequence to that spatial pattern. So, if a column's proximal
> segment represents the spatial pattern of a vertical line, then that
> column's cell should become predictive whenever a vertical line at any
> nearby position is presented, because during training a given vertical line
> is often followed by another nearby vertical line, since the training set
> is made up of animations of the visual objects smoothly zig-zagging around.
>
> And because a CLA region sends output from both its active and predictive
> cells, from the point of view of the next, higher region in the hierarchy,
> that cell is responding invariantly to any of a set of nearby vertical
> lines. This corresponds to how 'complex cells' respond in visual cortex.
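
The invariant output described above can be illustrated with a toy
single-cell-per-column region (again just a sketch of the idea, not NuPIC
code, and all names are invented for the example): presenting a line at
position p activates column p, and the columns for adjacent positions
become predictive, as if trained on smoothly moving input. The region
outputs the union, so nearby line positions look heavily overlapping to
the next region up.

```python
def region_output(position, n_columns=5):
    """Toy one-cell-per-column CLA region. The column matching the
    input position is active; columns for adjacent positions are
    predictive (standing in for learned smooth-motion transitions).
    The region's output is the union of both sets."""
    active = {position}
    predictive = {p for p in (position - 1, position + 1)
                  if 0 <= p < n_columns}
    return active | predictive

# From the next region's point of view, nearby line positions produce
# strongly overlapping outputs, i.e. a roughly invariant response.
out1, out2 = region_output(1), region_output(2)
print(out1, out2, out1 & out2)
```

Here region_output(1) and region_output(2) share two of their three
columns, which is the complex-cell-like invariance the text describes.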
>
> Does that make sense?
>
> -Mike
>
> _____________
> Michael Ferrier
> Department of Cognitive, Linguistic and Psychological Sciences, Brown
> University
> [email protected]
>
>
> On Mon, Jul 15, 2013 at 4:23 PM, Scott Purdy <[email protected]> wrote:
>
>> Quinn, the older HTM implementations were completely different algorithms
>> and are now obsolete.
>>
>>
>> On Mon, Jul 15, 2013 at 1:09 PM, Quinn Liu <[email protected]> wrote:
>>
>>> Hi Michael,
>>> I had an additional question. In your reply you remarked that "while
>>> digit recognition was successfully modeled with the original version of
>>> HTM, that doesn't seem to be the case with CLA yet". I was wondering if
>>> you or anyone else could expand on this, as I am unfamiliar with the
>>> original version of HTM. Assuming it is an early precursor to the current
>>> spatial and temporal learning algorithms, how does it differ? Thanks!
>>>
>>> Best Regards,
>>> Quinn Liu
>>>
>>> [email protected]
>>>
>>>
>>> On Mon, Jul 15, 2013 at 3:41 PM, Michael Ferrier <
>>> [email protected]> wrote:
>>>
>>>> Hi Fergal,
>>>>
>>>> I completely agree that a visual object recognition system would
>>>> greatly benefit from hierarchy. Causes in the world are hierarchical, and
>>>> the brain uses hierarchy to learn and represent them. The successful vision
>>>> models using the original implementation of HTM were also hierarchical. I
>>>> was just saying that, as far as I know, this hasn't been done with CLA yet
>>>> -- according to Jeff, in their vision experiments they were just beginning
>>>> to expand beyond one layer when they stopped working on vision.
>>>>
>>>> I think that both temporal pooling (for invariance) and hierarchy are
>>>> key to using CLA for visual recognition problems, but I don't know of
>>>> anyone who has put all the pieces together yet to do visual recognition
>>>> with CLA.
>>>>
>>>> -Mike
>>>>
>>>>
>>>>
>>>> _____________
>>>> Michael Ferrier
>>>> Department of Cognitive, Linguistic and Psychological Sciences, Brown
>>>> University
>>>> [email protected]
>>>>
>>>>
>>>> On Mon, Jul 15, 2013 at 11:44 AM, Fergal Byrne <
>>>> [email protected]> wrote:
>>>>
>>>>>
>>>>> Hi Michael,
>>>>>
>>>>> Handwritten characters are undoubtedly multi-component designs,
>>>>> which have evolved to connect with and trigger our ability to learn
>>>>> spatial, temporal and hierarchical patterns. We perceive the same
>>>>> characters even when fonts vary widely, and especially when reading
>>>>> different people's handwriting. We can fill in gaps and correct
>>>>> misspellings. So the learning and prediction must be several levels
>>>>> deep in the hierarchy.
>>>>>
>>>>> In terms of bottom-level mechanics, we use saccades to recognise and
>>>>> "delocalise" components such as characters, facial features, etc., in
>>>>> such a way as to allow this multi-level recognition (including a
>>>>> hierarchy of fixations: for strokes, junctions, topology, characters,
>>>>> letters, words, and even sentences).
>>>>>
>>>>> Speed-readers can saccade to read entire phrases and sentences at a
>>>>> time, allowing reading speeds of thousands of words per minute with
>>>>> better than 70% comprehension scores. With practice, I've been able
>>>>> to get scores in the 1-2000 wpm range. I can also read text in a
>>>>> mirror or upside-down at 50-60% of an average reader's speed. These
>>>>> things could only be done using big, complex region hierarchies with
>>>>> vast volumes of (normal) reading practice.
>>>>>
>>>>> I would have predicted that a single-layer CLA would struggle with
>>>>> this kind of data set, because it lacks the multi-level upward and
>>>>> downward structure which I feel this kind of performance requires.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Fergal Byrne
>>>>>
>>>>> _______________________________________________
>>>>> nupic mailing list
>>>>> [email protected]
>>>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>>>>>
>>>>>
>>>>
>>>
>>
>