Hi Scott, Michael and Jeff,

Thank you very much for your explanations. I greatly appreciate them.
Best Regards,
Quinn Liu

On Wed, Jul 17, 2013 at 8:26 PM, Jeff Hawkins <[email protected]> wrote:

> Sorry I have been a little absent on this list. I was travelling this week and I am preparing for OSCON next week, so I can't keep up with all the conversations.
>
> Most image classification systems rely on some form of what we call “temporal pooling”. (Mike described it well below.) For example, HMAX, a vision system out of Poggio's lab at MIT, uses a hard-coded pooling mechanism. They take their spatial features and hard-code representations that are active for spatial shifts of the feature. Hard-coded pooling works OK for the first level of a vision hierarchy, but it doesn't work in a general sense. For example, in audition we need to pool patterns in time that have no obvious spatial invariance. We might want to pool successive notes in a melody, and there is no equivalent of spatial invariance for that. Therefore, a cortical region must *learn* what patterns to pool over time.
>
> We did some vision work prior to the CLA. Those algorithms did not have a good temporal model, and we actually used hard-coded pooling à la HMAX. I was never happy about this, although it produced OK but not great results.
>
> When we first created the CLA and were using it for vision experiments, we spent a lot of time making sure it could learn temporal pooling. The idea is that you first learn a sequence, but this on its own doesn't do any pooling. To pool, you need cells to stay active over a sequence of patterns. The way we achieved this is that a cell first learns to predict its activity one step ahead in time. Once it has learned to do that, it can learn to predict its activity two steps ahead, and so on. By repeating patterns, a cell can learn to predict its activity well in advance.
> How far in advance depends on how predictable and varied the sequences are.
>
> I don't have time to go into all the details now, but as Mike suggests, if we have only one cell per column then the cell will pool no matter what direction a pattern is moving. It can't tell a left-moving line from a right-moving line, so it will produce a cell that responds to a line no matter where the line is and no matter what direction it is moving. However, if we have multiple cells per column, the result is a cell that responds when a line is moving in a particular direction. We see both types of cell in V1 in real brains. I have a theory (a highly speculative theory) that layer 4 cells are like the former and layer 3 cells are like the latter. There are several lines of evidence to suggest this. In this case layer 4 learns pure shift invariance, but layer 3 learns true sequences. By the way, layer 4 is large in the first couple of levels of cortex but disappears as you ascend the hierarchy. My explanation is that as you ascend the hierarchy, spatial invariance is solved and no longer needed, but sequences like melodies, language and actions continue to need the type of pooling done by layer 3.
>
> We got pooling to work in the CLA, but it took a lot of synapses and therefore memory and computation time. In the current form of the CLA we have sequence memory, but the pooling part is deactivated. We don't need pooling for the types of problems we are applying Grok to.
>
> One of the reasons I am hesitant to work on vision problems is that the temporal pooling requirement is large. Consider this: the amount of cortex dedicated to low-level vision (areas V1 and V2) dwarfs the amount of cortex dedicated to language (Broca's and Wernicke's areas). Low-level vision is much harder than language.
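The pooling mechanism Jeff describes can be sketched in a toy form: once a sequence has been learned, a "pooling" cell stays active across the whole sequence because each next input arrives predicted. This is an illustrative simplification, not NuPIC's actual CLA code; the class and method names here are invented for the example.

```python
# Toy sketch of temporal pooling: a pooled cell stays active for as
# long as successive inputs were predicted by learned transitions.
# First-order transitions only -- a deliberate simplification.

class ToyTemporalPooler:
    def __init__(self):
        self.transitions = {}   # learned transitions: pattern -> set of successors
        self.pooled_active = False

    def learn(self, sequence):
        """Learn the pairwise transitions in a training sequence."""
        for prev, nxt in zip(sequence, sequence[1:]):
            self.transitions.setdefault(prev, set()).add(nxt)

    def present(self, sequence):
        """Return how many steps the pooled cell stayed active."""
        active_steps = 0
        prev = None
        for pattern in sequence:
            predicted = prev is not None and pattern in self.transitions.get(prev, set())
            # The pooled cell turns on (or stays on) when the input was
            # predicted, so a familiar sequence keeps one stable
            # representation active across many time steps.
            self.pooled_active = predicted
            if predicted:
                active_steps += 1
            prev = pattern
        return active_steps

tp = ToyTemporalPooler()
tp.learn(["A", "B", "C", "D"])          # e.g. a line sweeping left-to-right
print(tp.present(["A", "B", "C", "D"])) # pooled cell active for 3 of 4 steps
print(tp.present(["D", "C", "B", "A"])) # unfamiliar order: 0 steps
```

Note that because this toy is first-order, training it on both sweep directions would make it pool either direction indiscriminately, much like the one-cell-per-column case Jeff describes; the CLA's multiple cells per column supply the sequence context this sketch lacks.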
> Amazing.
>
> Jeff
>
> *From:* nupic [mailto:[email protected]] *On Behalf Of* Scott Purdy
> *Sent:* Wednesday, July 17, 2013 10:58 AM
> *To:* NuPIC general mailing list
> *Subject:* Re: [nupic-dev] Training on Handwritten Digit Dataset using CLA
>
> I was wrong about that. I don't quite understand it well enough to give a proper response, so I am going to see if Jeff can write it up.
>
> The explanation I got was that you can train a temporal model by moving the letter around the image. Then, when you give it a test image, you expect it to predict the letter moving in different directions. The predicted cells are apparently useful as you move up the hierarchy. Time acts as a sort of supervisor for spatial invariants.
>
> But like I said, I am going to try to get someone to do a better explanation. There was quite a lot of vision work done that would be great to capture for you guys.
>
> On Wed, Jul 17, 2013 at 8:04 AM, Quinn Liu <[email protected]> wrote:
>
> Hi Michael and Scott,
>
> Thank you very much for your explanations. Michael's explanation implies that the Temporal Pooler greatly helps with spatial invariance learning on the training data, which I can see working.
>
> But for question 3, Scott said "No need for TP. It won't help with spatial representations." Scott, I was hoping you could expand on your answer and on how you think the SP and TP contribute to spatial invariance recognition.
>
> Best Regards,
> Quinn Liu
>
> On Mon, Jul 15, 2013 at 5:07 PM, Michael Ferrier <[email protected]> wrote:
>
> Hi Quinn,
>
> The older version of HTM would group together the spatial patterns that tend to occur in close temporal sequence with one another, and produce the same output when it saw any of the spatial patterns within a given group.
> So, if a network were trained on visual input of digits zig-zagging through the visual field, then any individual visual feature (for example a vertical line) would come to be represented by a temporal group that responds when it is presented with a vertical line at any of many nearby locations, because in the training data a vertical line is often seen moving from one location to another nearby location. In this way it would learn invariance to position. At the lowest level of the hierarchy it would learn invariance to position for individual small visual features, and at higher levels it would learn invariance for larger, more complex arrangements of features and whole visual objects. Invariance to other transformations like scale, rotation, etc. could also be learned this way, given the appropriate training data.
>
> Like Scott said, the old version of HTM worked very differently from the CLA, but they both model the same basic principles (the CLA does so much more flexibly). Using a CLA region with one cell per column, a cell should become active when given a particular spatial pattern, but should become predictive when given any pattern that (during training) often occurs close by in temporal sequence to that spatial pattern. So, if a column's proximal segment represents the spatial pattern of a vertical line, then that column's cell should become predictive whenever a vertical line at any nearby position is presented, because during training a given vertical line is often followed by another nearby vertical line, since the training set is made up of animations of visual objects smoothly zig-zagging around.
>
> And because a CLA region sends output from both its active and predictive cells, from the point of view of the next, higher region in the hierarchy, that cell is responding invariantly to any of a set of nearby vertical lines.
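The older HTM's temporal grouping that Mike describes can be sketched roughly as follows: spatial patterns that follow each other in time get merged into one group, and the region outputs the group's identity, so a vertical line at several nearby positions maps to the same output. This is a toy illustration only; the union-find grouping and all names are this example's invention, not the original HTM code.

```python
# Toy sketch of temporal grouping: merge temporally adjacent spatial
# patterns into one group and emit the group ID as the region's output,
# yielding position invariance for features that move during training.

class ToyTemporalGrouper:
    def __init__(self):
        self.parent = {}  # union-find forest over spatial patterns

    def _find(self, p):
        self.parent.setdefault(p, p)
        while self.parent[p] != p:
            p = self.parent[p]
        return p

    def learn(self, sequence):
        """Merge each pair of temporally adjacent patterns into one group."""
        for prev, nxt in zip(sequence, sequence[1:]):
            a, b = self._find(prev), self._find(nxt)
            if a != b:
                self.parent[b] = a

    def output(self, pattern):
        """The region's output: the group the pattern belongs to."""
        return self._find(pattern)

tg = ToyTemporalGrouper()
# A vertical line sweeping across nearby positions during training:
tg.learn(["vline@x0", "vline@x1", "vline@x2", "vline@x1", "vline@x0"])
tg.learn(["hline@y0", "hline@y1"])
# Any nearby vertical line now yields the same invariant output:
print(tg.output("vline@x2") == tg.output("vline@x0"))  # True
print(tg.output("vline@x0") == tg.output("hline@y0"))  # False
```

With the appropriate training animations, the same mechanism would group across scale or rotation as well, since grouping depends only on temporal adjacency, not on any notion of spatial nearness.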
> This corresponds to how “complex cells” respond in visual cortex.
>
> Does that make sense?
>
> -Mike
>
> _____________
> Michael Ferrier
> Department of Cognitive, Linguistic and Psychological Sciences, Brown University
> [email protected]
>
> On Mon, Jul 15, 2013 at 4:23 PM, Scott Purdy <[email protected]> wrote:
>
> Quinn, the older HTM implementations were completely different algorithms and are now obsolete.
>
> On Mon, Jul 15, 2013 at 1:09 PM, Quinn Liu <[email protected]> wrote:
>
> Hi Michael,
>
> I had an additional question. In your reply you remarked that "while digit recognition was successfully modeled with the original version of HTM, that doesn't seem to be the case with CLA yet". I was wondering if you or anyone else could expand on this, as I am unfamiliar with the original version of HTM. Assuming that it is an earlier version of the current spatial and temporal learning algorithms, how is it different? Thanks!
>
> Best Regards,
> Quinn Liu
> [email protected]
>
> On Mon, Jul 15, 2013 at 3:41 PM, Michael Ferrier <[email protected]> wrote:
>
> Hi Fergal,
>
> I completely agree that a visual object recognition system would greatly benefit from hierarchy. Causes in the world are hierarchical, and the brain uses hierarchy to learn and represent them. The successful vision models using the original implementation of HTM were also hierarchical.
> I was just saying that, as far as I know, this hasn't been done with CLA yet -- according to Jeff, in their vision experiments they were just beginning to expand beyond one layer when they stopped working on vision.
>
> I think that both temporal pooling (for invariance) and hierarchy are key to using CLA for visual recognition problems, but I don't know of anyone who has put all the pieces together yet to do visual recognition with CLA.
>
> -Mike
>
> _____________
> Michael Ferrier
> Department of Cognitive, Linguistic and Psychological Sciences, Brown University
> [email protected]
>
> On Mon, Jul 15, 2013 at 11:44 AM, Fergal Byrne <[email protected]> wrote:
>
> Hi Michael,
>
> Handwritten characters are undoubtedly multi-component designs, which have evolved to connect with and trigger our ability to learn spatial, temporal and hierarchical patterns. We perceive the same characters even when many things change across fonts, and especially when reading different people's handwriting. We can fill in gaps and correct misspellings, so the learning and prediction must be several levels deep in the hierarchy.
>
> In terms of bottom-level mechanics, we use saccades to recognise and "delocalise" components such as characters, facial features, etc., in such a way as to allow this multi-level recognition (including a hierarchy of fixations -- for strokes, junctions, topology, characters, letters, words, and even sentences).
>
> Speed-readers can saccade to read entire phrases and sentences at a time, allowing reading speeds of thousands of words per minute with better than 70% comprehension scores. With practice, I've been able to get scores in the 1,000-2,000 wpm range. I can also read text in a mirror or upside-down at speeds approaching 50-60% of an average reader's.
> These things could only be done using big, complex region hierarchies with vast volumes of (normal) reading practice.
>
> I would have predicted that a single-layer CLA would struggle with this kind of data set, because it lacks the multi-level upward and downward structure which I feel this kind of performance requires.
>
> Regards,
>
> Fergal Byrne
>
> _______________________________________________
> nupic mailing list
> [email protected]
> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
_______________________________________________
nupic mailing list
[email protected]
http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
