Scott,
On Fri, Nov 22, 2013 at 11:59 PM, Scott Purdy <[email protected]> wrote:

> Marek, there is a difference between reconstruction and feedback. There is no connection that I am aware of between the CLA theory and reconstruction (or the classifier, for that matter).

This is what I reached in the end: neither the classifier nor reconstruction is really used in the brain. But both prove useful for NuPIC's algorithms. Like I said, a plus point goes to reconstruction, which could be achievable with how a CLA region works, while classification is artificial.

> The classifier and reconstruction are tools that we use for business purposes. So what are the pragmatic benefits we get by switching from the classifier to reconstruction? Reconstruction gets us a single prediction for the following step. The classifier can provide the same. But it can make multiple predictions with probabilities. And it can be used to predict a different time step or time interval. And it can do so for multiple intervals, and all of these come with weighted sets of predictions.

I note the classifier has nice advantages here. For business purposes there probably wouldn't be any advantage; for research purposes, I'd like to see these implemented by means of a (hierarchy of) CLAs, because all the cells do only single-step predictions, no buffers, etc.

> So what practical benefit do you want to get from reconstruction?

More like from the inhibitory connections that go with it. I would be able to use the SP as a fault-tolerant associative memory. See my example:

"Now I will train: {7:45am : breakfast}, {2pm : lunch}, {7pm : dinner}; and I want to ask {7:48am : ???}. I can easily encode the 7:48 time and expect to get 'breakfast', as 7:45 and 7:48 share most of their bits. This is a use case where I want to feed an (incomplete) SDR of active columns and expect to get the input pattern back."

I think I could achieve the same with SP+classifier (right?), but this is just simpler.
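[Editor's note: Marek's time-of-day example can be sketched in a few lines. This is a toy illustration, not NuPIC code - the encoder, its parameters, and the overlap-based lookup are all invented for the sketch; a real SP-based version would match on column SDRs instead.]

```python
# Toy associative memory: similar times encode to heavily overlapping
# bit patterns, so a nearby query still retrieves the stored label.

def encode_minutes(minutes, width=100, on_bits=21):
    """Toy scalar encoder (hypothetical): a block of on_bits whose start
    position is proportional to the value, so nearby values overlap."""
    start = int((minutes / (24 * 60)) * (width - on_bits))
    return frozenset(range(start, start + on_bits))

memory = {}  # trained pairs: SDR -> label

def train(minutes, label):
    memory[encode_minutes(minutes)] = label

def recall(minutes):
    """Return the label whose stored SDR overlaps the query the most."""
    query = encode_minutes(minutes)
    return max(memory.items(), key=lambda kv: len(kv[0] & query))[1]

train(7 * 60 + 45, "breakfast")
train(14 * 60, "lunch")
train(19 * 60, "dinner")

print(recall(7 * 60 + 48))  # 7:48 shares most bits with 7:45 -> "breakfast"
```

The fault tolerance falls out of the representation: the query never has to match a trained pattern exactly, only overlap it more than the alternatives.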
Another thing is the TP, where I could put a pattern atop of it and ask: what was the pattern at T-1? Could I do that with the classifier? This may be useful for planning tasks, etc.

A big reason that came to my mind is "SDR math" - OR-ing, subsampling, etc. are mentioned. Example:

"1" -> ScalarEncoder -> SP -> SDR_1
"2" -> enc -> SP -> SDR_2
res := SDR_1 or SDR_2

How would I train a classifier for that? If I had reconstruction, the SP would give me a bit-array which the encoder should decode as [1, 2].

Other modularity issues come to mind - think GAME-NN, which is a neural net where the neurons are more complex algorithms. Having an inverse function for each layer of the region would be useful there for learning.

> I am actually not opposed to reconstruction and think it is a little easier to reason about. But my proposal would be to leave the classifier as the default for prediction problems and implement reconstruction completely outside the spatial and temporal pooler code. I.e. rather than adding the code directly into the SP and TP classes as it was before, implement it in different files and have it simply inspect the SP/TP state. For me, it falls into the same category as the classifier and encoders - not part of the core CLA theory, but useful for applying the CLA to real problems.

I'm glad you hold an open position here. I'd very much like to see the reconstruction resurrected. Best case the real one, with inhibitory synapses inside the SP/TP (maybe yet another implementation file?). Still, reconstruction providing the inverse of compute() in a separate class would be of good use too. My main motivation is neural research, where I'd like to see a CLA as close to the brain as possible, as described in the whitepaper - even though that means noticeably slower speed, fewer features and worse usability. I agree with your point that practical use and execution speed are important for the normal use case and make NuPIC/CLA successful.

Thanks, Mark

> Marek, Ian - what are your thoughts on that?
> Are there any actual benefits of implementing reconstruction?

On Fri, Nov 22, 2013 at 12:01 PM, Marek Otahal <[email protected]> wrote:

Hi Fergal, Ian, thank you very much for these ideas.

On Fri, Nov 22, 2013 at 7:02 PM, Fergal Byrne <[email protected]> wrote:

> Hi Marek, Some good points there. I'll address them together rather than point by point, as they're intertwined. First, the classifier is an artefact used to derive useful information from the state of a region; it was never meant to be part of the CLA theory.

OK, I can take that. It is useful for the practical purposes of some applications we do. However, if we could achieve the same with a reverse function at each layer, I'd be happier - because it would be more "correct". I use quotes here because the brain doesn't have to do either - it never turns patterns back into the original. But interestingly, the way the brain is hardwired, patterns are top-down reconstructable, while our (for-speed) approach isn't.

> The SP and TP are only separate in the current implementation for engineering reasons. You can use each alone if you like (useful for some applications) and you can also subclass each separately (useful for others).

Yep, agreed.

> We currently lack feedback and motor connections, as well as thalamic-mediated connections; these will be added in time. We do model inhibition in NuPIC, and local inhibition is close enough to match how the neocortex does it.

Yes, I agree it's close enough, but the fact that we don't keep the inhibition permanences prevents us from computing the inverse of the feed-forward compute().

> When discussing reconstruction of inputs, remember that it is the set of predictive cells which is used to identify the associated inputs, not the current set of active columns in the SP. There are no inhibited predictive cells to be consulted.

Not sure I follow here... In both the TP and SP case, the output is an SDR - an OR of active columns and predictive cells/columns.
...so, what you're saying is that reconstruction should work even with the current inhibition model, i.e. without inhibitory connections?

Imagine I have two categories, "cats" and "dogs". The encoder transcribes them to a 1000-bit array: cats = bits 1-500 ON, dogs = bits 501-1000 ON. For the encoder to decode back, I need to produce an array of the same quality. The SP makes an SDR out of it: some 20 bits for dogs, another 20 represent cats. Do these bits' synapses cover all 500 input bits? They should. So if I take a noisy input with bits 1-250 and 400-600 ON and feed it to the SP, I expect it to produce some 21-22 bits ON, where the majority (circa 19 bits) represents the cats' bits. (Btw, I don't see any predictive columns in the SP(?).) Now I want to ask: given this SDR, which output is the most likely? So I should take a random sub-sample back to 20 ON bits (this will most likely give me a cats-only SDR), apply the permanences and see which input synapses should have been active. There's a good chance it's the left 500 ones. Will this work as a means of reconstruction?

> We never need to reconstruct the inputs from the active columns - the input is what causes the current activation pattern.

This is not true. I have a learning application where I want to use this as a CAM memory. Let's assume the date encoder encodes similar times with similar patterns (it does). Now I will train: {7:45am : breakfast}, {2pm : lunch}, {7pm : dinner}; and I want to ask {7:48am : ???}. I can easily encode the 7:48 time and expect to get "breakfast", as 7:45 and 7:48 share most of their bits. This is a use case where I want to feed an (incomplete) SDR of active columns and expect to get the input pattern back.

Btw, interesting article about Google's deep NNs - scary and encouraging at the same time! And I'm glad YouTube can auto-identify funny cat videos, that's a killer feature ;)

Ian, I'd like to know how much slow-down this reconstruction caused?
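[Editor's note: the cats/dogs thought experiment can be sketched with a toy SP - random feed-forward "connected synapses" per column, global top-k inhibition, and reconstruction by projecting the winners' synapses back onto input space. All parameters (column count, synapse count, k) are invented for the sketch; this shows only the mechanism, not NuPIC's actual SP.]

```python
# Toy reconstruction: feed a noisy, mostly-"cat" input through a random
# SP-like layer, keep the winning columns, and vote each input bit by
# how many winners have a connected synapse on it.
import random

random.seed(42)
INPUT_SIZE = 1000
CATS = set(range(0, 500))        # "cats" input pattern: bits 0-499 ON
DOGS = set(range(500, 1000))     # "dogs" input pattern: bits 500-999 ON
N_COLS, K = 200, 20              # columns; winners kept by inhibition

# Each column's connected synapses: 50 random input bits.
synapses = [set(random.sample(range(INPUT_SIZE), 50)) for _ in range(N_COLS)]

def active_columns(input_bits, k=K):
    """Global-inhibition stand-in: keep the k columns with most overlap."""
    return sorted(range(N_COLS),
                  key=lambda c: len(synapses[c] & input_bits))[-k:]

def reconstruct(cols):
    """Per-bit votes: how many winning columns connect to each input bit."""
    votes = [0] * INPUT_SIZE
    for c in cols:
        for bit in synapses[c]:
            votes[bit] += 1
    return votes

# Noisy input: bits 0-249 and 400-599 ON (350 cat bits, 100 dog bits).
noisy = set(range(0, 250)) | set(range(400, 600))
votes = reconstruct(active_columns(noisy))
cat_votes = sum(votes[b] for b in CATS)
dog_votes = sum(votes[b] for b in DOGS)
print(cat_votes, dog_votes)  # the vote mass leans toward the cat bits
```

This mirrors the argument in the paragraph above: even without inhibitory permanences, projecting the feed-forward permanences of the surviving columns backwards biases the reconstruction toward the dominant category in the noisy input.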
I think the reconstruction should be re-enabled (maybe with a switch to enable/disable its use (and slowdown)), and the Classifier should not be part of nupic-core, but exported as a feature of the OPF model framework.

> When we have feedback from higher regions affecting the activity of a region, we'll then have to employ reconstruction (using feed-forward permanences or a classifier) to get the associated "imagined" inputs. In the brain, we never need to do this, as the perception is the input as far as our minds see it. Similarly, we don't need a classifier, as the output is the identity when viewed from above.
>
> Sent from Mailbox for iPhone

On Fri, Nov 22, 2013 at 5:41 PM, Marek Otahal <[email protected]> wrote:

In the following text I'll describe what (I think) the Classifier does, why I consider it wrong, and what can be done to avoid it.

1/ What is a Classifier?

The wiki doesn't say much; I gather it's a part of a Region that "translates SDRs back to input-space". Looking at the code in py/nupic/algorithms/CLAClassifier.py (and its C++ sibling), I see there's a function compute() that basically pairs an SDR (from the lower layer) with the input that caused the SDR - am I right?

*) There's a KNNClassifier, which would use the k-nearest-neighbours algorithm, and CLAClassifier... which uses what? An SP? A feed-forward NN seems to be a good candidate for such an implementation of a classifier.

2/ Why I consider it "wrong"

2.1/ The Classifier does not have a biological counterpart like other parts of HTM/CLA do.

The chain: input --> encoder --> SP --> TP --> Classifier??!
- input: can be whatever we have sensors to perceive, e.g. a "sound wave"; it can be of any possible data type - the input space.
- encoder: the function of the sensory organ - e.g. the cochlea translates vibrations into electrical pulses on the cells; it translates from the input space to a bit-vector (not an SDR, however).
- SP+TP: combined together in the brain in a (micro-)region; they both accept and produce an SDR.
- Classifier: does not have a counterpart, as the brain has no need to translate back to the input space. We, however, do, for CLAs to be useful in practical problems.

2.2/ The lack of top-down compute in the SP and TP breaks modularity.

The encoder has encode() and decode()/topDownCompute() methods. The SP and TP don't. To make these two useful building blocks, it would be necessary to have an inverse of compute() in them too.

In nature, there are four types of connections between cells/neurons in the brain - two vertical: feed-forward connections feeding input to the higher layer, and recurrent connections feeding more stable patterns to lower layers; and two horizontal: predictive connections (used in the TP) and inhibitory connections (missing in NuPIC).

The inhibitory connections are missing for performance reasons (I think), and we use global/local n-best inhibition instead. This makes it impossible to reconstruct, from a list of active columns (an SDR), the input that caused it. If we had them (inhibitory permanences), we could use them in reverse to boost the columns that had been silenced, and from these "active+" columns turn ON the appropriate synapses according to the permanences.

Such an implementation would be slower, but interesting for bio-inspired research; it would allow the SDR to be the messaging format between parts of a CLA (even different implementations) and reduce the possible space for errors happening in the Classifier.

Are my thoughts proper/wrong somewhere? What do you think?
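[Editor's note: the modularity argument in 2.2 can be made concrete. If every block paired compute() with an inverse, a whole encoder -> SP -> TP stack could be run backwards. This interface is purely illustrative - the method names mirror the encoder's encode()/topDownCompute() pairing, but no such shared base class exists in NuPIC.]

```python
# Hypothetical "invertible block" interface: each stage has a feed-forward
# compute() and an approximate inverse, so a stack can be run backwards.
from abc import ABC, abstractmethod

class InvertibleBlock(ABC):
    @abstractmethod
    def compute(self, pattern):
        """Feed-forward: input representation -> output representation."""

    @abstractmethod
    def top_down_compute(self, pattern):
        """Approximate inverse: output representation -> likely input."""

class ShiftBlock(InvertibleBlock):
    """Toy stand-in for an encoder/SP stage: shifts every active bit."""
    def __init__(self, offset):
        self.offset = offset
    def compute(self, pattern):
        return {b + self.offset for b in pattern}
    def top_down_compute(self, pattern):
        return {b - self.offset for b in pattern}

def reconstruct(chain, sdr):
    """If each stage is invertible, reconstruction is just the stack
    applied in reverse - no separate classifier needed."""
    for block in reversed(chain):
        sdr = block.top_down_compute(sdr)
    return sdr

chain = [ShiftBlock(3), ShiftBlock(10)]           # e.g. encoder -> SP
out = chain[1].compute(chain[0].compute({1, 2}))  # feed-forward pass
print(reconstruct(chain, out))                    # recovers {1, 2}
```

For the real SP/TP the inverse would of course be lossy and approximate (the permanence projection sketched earlier), but the interface shape - and the modularity benefit Marek describes - is the same.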
Cheers, Mark

--
Marek Otahal :o)

_______________________________________________
nupic mailing list
[email protected]
http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
