On Thu, Jul 4, 2013 at 11:08 PM, Hideaki Suzuki <[email protected]> wrote:
> Dear Jeff san and Scott san,
>
> Thank you very much for your kind reply with detailed answers. :)
> I'm sorry that I couldn't respond sooner.
>
> I understood. It is interesting to me that CLA can reverse-map from an
> SDR to the original values even after all input values are concatenated
> for the region and the fanout area spans the entire input region.
>

Are you referring to converting predicted cells back to predicted values?
If so, we call that process reconstruction
<https://github.com/numenta/nupic/wiki/Reconstruction> (I realize there
isn't much info at the link, but I wanted to point out where the info will
be once it is public), and we do not use it currently. Reconstruction
basically takes the predicted columns and chooses the input bits they are
connected to, and the encoders each have a method for determining the value
given the chosen bits. This description might be slightly off, but that is
the general idea.

Instead of reconstruction, we have the CLA classifier
<https://github.com/numenta/nupic/wiki/CLA-Classifier>, which sits on top
of the TP and converts the predicted cells to a predicted value. This isn't
a biologically inspired component; rather, it is a practical method for
turning predicted cells into a predicted value, and it works better than
reconstruction for our problems.

> Besides, another thought is, since the entire input area is connected
> to each column in Grok, I guess the inhibition radius would also become
> the entire region, which simplifies the inhibition logic a little (we can
> ignore the concept of neighbors). Am I right?
>

Yes, this is correct to my knowledge.

> For the memory usage, thank you for the information. Scott
> san's explanation sounds reasonable to me.
> Only 1024 columns (i.e. a very small 32x32 region) with 5% active ones can
> represent C(1024, 51) different patterns, which is 609 septenvigintillion
> (an 87-digit number) according to Wolfram|Alpha.
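
That count is easy to double-check locally. A minimal sketch, assuming
Python 3.8+ (for math.comb) and taking 5% of 1024 columns as 51 active
columns, as in the figure quoted above; the variable names are only for
illustration:

```python
import math

NUM_COLUMNS = 1024  # a 32 x 32 region, as in the quoted message
NUM_ACTIVE = 51     # about 5% of 1024, rounded down

# Number of distinct ways to choose which columns are active.
num_patterns = math.comb(NUM_COLUMNS, NUM_ACTIVE)

print(len(str(num_patterns)))  # number of decimal digits -> 87
print(f"{num_patterns:.3e}")   # the same count in scientific notation
```

This only counts possible column patterns. It says nothing about how many
of them a trained region can actually learn or tell apart.
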
> So, most memory must be used for connections and the mappings between
> input values and SDRs. I'm thinking of evaluating how well it can compress
> temporal and spatial pattern data into an HTM region, once I feel
> comfortable with my CLA gadget.
>
> Best Regards,
> Hideaki Suzuki.
>
> 2013/7/2 Jeff Hawkins <[email protected]>
>
>> *I will try to answer some more. My answers are preceded by >>*
>>
>> - A perceptron can relate various inputs to various outputs.
>>   People have used a perceptron to convert electrical signals along the
>>   arm muscles to the input for motors, so those who have lost an arm can
>>   move artificial arms by imagining moving their lost arms (you may
>>   know).
>>
>>   CLA looks similar to some extent: it learns the electrical signal
>>   input together with the output for the motors, and predicts.
>>   Do you have any detailed comparisons between CLA and perceptron?
>>   Is it good to combine them? e.g. can a perceptron be a good classifier
>>   to convert an SDR to the original value?
>>
>> Perceptrons are a very old and simple form of neural network. You
>> don’t hear that term much these days. There are other more modern neural
>> networks including the currently popular “deep learning networks”. Almost
>> all neural networks are spatial pattern classifiers. That means they have
>> no ability to recognize time-based patterns. The CLA learns both spatial
>> and temporal patterns and therefore it is hard to compare CLA to other
>> neural networks unless you ignore time. Other differences are that the
>> CLA is an online learning algorithm, the CLA can handle many different
>> types of inputs, and the CLA learns unsupervised. Other neural networks
>> may have some of these attributes but most don’t.
>> The CLA is also a biological theory, whereas most artificial neural
>> networks are not really.
>>
>> - With boosting, we can have a different SDR for the same input, after
>>   feeding the data again and again.
>>   If true, we may have multiple SP SDRs for a single input pattern. Is
>>   my understanding okay?
>>   If true, is this related to Gestaltzerfall? Possible?
>>
>> Yes, the same input could have different representations. Not just due
>> to boosting but also because the active input bit connections for each
>> column can change over time.
>>
>> - Do you have a specific reason that boosting is done by multiplication
>>   in the white paper, rather than addition?
>>
>> It might not be important.
>>
>> - If a column happens to drop all bottom-up synapses, can it become
>>   active again, without boosting?
>>
>> Yes it can. That is the whole point of boosting :) And when it does
>> become active again, it will start to form new connections.
>>
>> - Should we have fanout areas on the input space overlapped? If yes,
>>   how much should we?
>>   If we have two columns next to each other, and the two corresponding
>>   fanout areas connected to them overlap by 50% of the fanout radius
>>   vertically / horizontally, one fanout area is overlapped by the other
>>   eight fanout areas connected to the columns around it in 2D. This
>>   means one input bit can affect nine columns. If we have more overlap
>>   in fanout areas, one input bit can affect a larger number of columns.
>>   At the extreme, any input pattern will generate the same intensity in
>>   all columns.
>>
>>   Is my understanding okay? Probably, I'm missing something.
>>
>> Typically the fan out and fan in areas overlap. How much is
>> dependent on the statistics of the data.
>> In Grok all columns get input from the entire input area, but in a
>> vision application you wouldn’t do this. We use the concept of
>> “potential synapses”, which are the cells that can potentially connect.
>> Normally the potential synapses are a subset of all cells within a
>> radius, say 50%. So even if two columns receive input from the entire
>> input space, their 50% of cells within that space are different. So you
>> never have the exact same input to two columns.
>>
>> For the time series problems we do right now we basically concatenate
>> the encodings from each field. Since there isn't a logical fanout and
>> overlapping pattern for the potential input bit connections for each
>> column, we instead randomly select 50% (I think) of the input bits for
>> each column as the possible connected
>>
>> Our scalar encoders use overlapping sets of bits to represent each
>> bucket (range of the input space), which captures the semantics of
>> scalars (values closer together will have more overlap).
>>
>> - Do you have any data or analysis on memory usage?
>>   i.e. how much memory CLA uses to learn what it learns.
>>
>> We have done some pretty thorough analysis of this. Perhaps we can put
>> some more in-depth information in the wiki, but as a quick estimate you
>> can expect an untrained model to be pretty small and it will grow to be
>> several MB in memory as it gets more saturated. I am not sure what the
>> max size is. Also, if you predict 3 different steps, the classifier will
>> be 3x larger, so that can have a pretty big impact on memory usage.
>>
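
To make the two ideas quoted above more concrete (a random potential pool
per column, and scalar encodings whose overlap shrinks as values move
apart), here is a rough sketch in plain Python. It is not the NuPIC
implementation: the function names, the 400-bit input, the 100-bit encoder
and the 21-bit width are all made up for illustration, and the 50% figure
is the "I think" number from the message.

```python
import random


def potential_pool(num_inputs, fraction=0.5, rng=None):
    """Pick a random subset of input bit indices one column may connect to.

    Hypothetical helper: each column gets its own random sample, so two
    columns that see the whole input space still have different pools.
    """
    rng = rng or random.Random()
    return set(rng.sample(range(num_inputs), int(num_inputs * fraction)))


def encode_scalar(value, min_val, max_val, num_bits=100, width=21):
    """Toy scalar encoder: a contiguous run of `width` active bits.

    The start of the run (the bucket) moves with the value, so nearby
    values share most of their active bits.
    """
    value = min(max(value, min_val), max_val)  # clip into the valid range
    num_buckets = num_bits - width + 1
    bucket = int(round((value - min_val) / (max_val - min_val) * (num_buckets - 1)))
    return set(range(bucket, bucket + width))


rng = random.Random(42)
pools = [potential_pool(400, 0.5, rng) for _ in range(3)]
print([len(p) for p in pools])      # each pool covers half of the 400 inputs
print(len(pools[0] & pools[1]))     # pools overlap, but are not identical

a = encode_scalar(10.0, 0.0, 100.0)
b = encode_scalar(12.0, 0.0, 100.0)
c = encode_scalar(60.0, 0.0, 100.0)
print(len(a & b), len(a & c))       # close values overlap more than distant ones
```

A column's overlap score for a given input would then be the number of
active input bits that fall inside its pool (and are connected), which is
why two columns never see exactly the same input.
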
_______________________________________________ nupic mailing list [email protected] http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
