Thanks Ian. Yes, I get that real (clock-on-the-wall) time is not encoded. But the string of steps it took to get to a point in the CLA is encoded, since the connections are built up over the steps the CLA takes during training. When you convert that to a reconstructed SDR output you have effectively lost all that temporal info. It is collapsed into a single frame.
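To make concrete what I mean by "collapsed", here is a toy sketch (purely illustrative made-up code, nothing to do with the actual NuPIC classifier): a distribution over predicted SDRs carries information about several possible continuations of a sequence, but reporting only the best fit throws the rest away.

```python
# Toy illustration (not NuPIC code): picking the single most probable
# predicted SDR collapses a distribution over sequence continuations
# into one frame, discarding the probability mass on the alternatives.

def collapse(predictions):
    """Pick the single most probable SDR, discarding the alternatives."""
    best_sdr, _ = max(predictions.items(), key=lambda kv: kv[1])
    return best_sdr

# Hypothetical predicted SDRs (frozensets of active bit indices) with
# probabilities - e.g. three plausible next steps in a learned sequence.
predictions = {
    frozenset({3, 17, 42}): 0.60,  # path A: the likeliest continuation
    frozenset({5, 17, 99}): 0.35,  # path B: plausible, but dropped
    frozenset({8, 23, 71}): 0.05,  # path C: unlikely, also dropped
}

single_frame = collapse(predictions)
# Only path A survives; the 40% of probability mass on B and C is gone.
```

Feeding `single_frame` into a second CLA is exactly the situation I'm asking about below: the downstream region never sees paths B and C.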
I'm still trying to think this through. Suppose you are feeding this
'collapsed' SDR into another CLA. You can only pick one SDR (the best fit /
highest probability) to feed it. Have you now lost all the good info on the
other likely (less probable) paths in the sequence?

On Fri, Nov 22, 2013 at 5:24 PM, Ian Danforth <[email protected]> wrote:

> Doug,
>
> Whatever method you choose, it comes down to this: the output of a system
> like the CLA needs to make choices. Is this large collection of active
> cells a 5, or an 8? Or, more realistically, does this large collection of
> active cells mean I move my arm up or down?
>
> Temporal context gives you the ability to make the right choices at the
> right time, but those individual outcomes don't need to encode time
> themselves.
>
> To use Jeff's favorite metaphor, a melody is a series of notes, and there
> aren't that many on a normal scale, but learning to play them in the
> right order and at the right time takes temporal context. The output,
> however, of whether I play a C or an F# is a collapsed/reduced version of
> all the possibilities that exist in the brain at a given time. If I try
> to express/play/output BOTH, well, I'll probably mash the keys in
> between, and that's no good.
>
> Ian
>
> On Fri, Nov 22, 2013 at 5:13 PM, Doug King <[email protected]> wrote:
>
>> I'm trying to follow here. Am I getting this right?
>>
>> The output of the CLA needs to be reconstructed for business reasons -
>> we need a prediction or anomaly detection so we can get paid and do more
>> of this.
>>
>> A reconstruction is a single SDR output of the CLA that represents the
>> best fit for predicting the next step (t+1).
>>
>> A classifier is a super-set of reconstructed SDRs, with probabilities,
>> out to t+n.
>>
>> We want to chain CLAs in a biologically plausible way (HTMs), and we
>> need to feed the reconstruction of one (its output) into another.
>>
>> It seems to me that a reconstructor or classifier collapses the temporal
>> information, so we lose it.
>> To actually keep the temporal information we need to "propagate
>> information through synaptic weights; any claim to biological adherence
>> demands that."
>>
>> Jeff alluded to this in one of the Office Hours videos. He said it would
>> be very hard to do hierarchy correctly so as to not lose temporal
>> information. Should I take this to mean that all synaptic info needs to
>> be wired up to the next CLA's input?
>>
>> Are my assumptions / interpretations correct here?
>>
>> On Fri, Nov 22, 2013 at 3:54 PM, Ian Danforth <[email protected]> wrote:
>>
>>> Scott,
>>>
>>> Reconstruction was implemented because it was much closer to biology
>>> than any classifier. This holds true today. It may also be useful to
>>> learn feed-forward and feedback pathways independently; that would be
>>> biologically sound, and as you suggest, we should move in that
>>> direction.
>>>
>>> 1. The classifier adds complexity and memory footprint that could be
>>> eliminated.
>>> 2. Reconstruction also provides probabilities based on synaptic
>>> strengths, which is how the brain does it.
>>> 3. It is true that the classifier provides direct multistep prediction,
>>> but this is an egregiously non-biological hack.
>>>
>>> The utility of a classifier is pretty high, but ultimately it is an
>>> almost inexcusable distraction from the advancement of the core theory.
>>> It should remain as one of a family of possible classifiers to put on
>>> the output of the SP or TP (which is very common in mainline ML
>>> research) but shouldn't be part of any core logic.
>>>
>>> There is no question that a real solution here is based on the
>>> propagation of information through synaptic weights; any claim to
>>> biological adherence demands that.
>>>
>>> When we get around to putting the whitepaper up so the community can
>>> update it, a clear roadmap for either a feedback pathway, or a
>>> reconstruction pathway as a step toward feedback, will be required.
>>> Ian
>>>
>>> On Fri, Nov 22, 2013 at 2:59 PM, Scott Purdy <[email protected]> wrote:
>>>
>>>> Marek, there is a difference between reconstruction and feedback.
>>>> There is no connection that I am aware of between the CLA theory and
>>>> reconstruction (or the classifier, for that matter).
>>>>
>>>> The classifier and reconstruction are tools that we use for business
>>>> purposes. So what are the pragmatic benefits we get by switching from
>>>> the classifier to reconstruction?
>>>>
>>>> Reconstruction gets us a single prediction for the following step. The
>>>> classifier can provide the same. But it can also make multiple
>>>> predictions with probabilities. And it can be used to predict a
>>>> different time step or time interval. And it can do so for multiple
>>>> intervals, and all of these come with weighted sets of predictions.
>>>>
>>>> So what practical benefit do you want to get from reconstruction?
>>>>
>>>> I am actually not opposed to reconstruction and think it is a little
>>>> easier to reason about. But my proposal would be to leave the
>>>> classifier as the default for prediction problems and implement
>>>> reconstruction completely outside the spatial and temporal pooler
>>>> code. I.e., rather than adding the code directly into the SP and TP
>>>> classes as it was before, implement it in different files and have it
>>>> simply inspect the SP/TP state. For me, it falls into the same
>>>> category as the classifier and encoders - not part of the core CLA
>>>> theory, but useful for applying the CLA to real problems.
>>>>
>>>> Marek, Ian - what are your thoughts on that? Are there any actual
>>>> benefits to implementing reconstruction?
>>>>
>>>> On Fri, Nov 22, 2013 at 12:01 PM, Marek Otahal <[email protected]> wrote:
>>>>
>>>>> Hi Fergal, Ian,
>>>>> thank you very much for these ideas.
>>>>>
>>>>> On Fri, Nov 22, 2013 at 7:02 PM, Fergal Byrne <[email protected]> wrote:
>>>>>
>>>>>> Hi Marek,
>>>>>>
>>>>>> Some good points there.
>>>>>> I'll address them together rather than point by point, as they're
>>>>>> intertwined.
>>>>>>
>>>>>> First, the classifier is an artefact used to derive useful
>>>>>> information from the state of a region; it was never meant to be
>>>>>> part of the CLA theory.
>>>>>
>>>>> Ok, I can take that. It is useful for the practical purposes of some
>>>>> applications we do. However, if we could achieve the same with the
>>>>> reverse function at each layer, I'd be happier - because it would be
>>>>> more "correct". I use quotes here because the brain doesn't have to
>>>>> do either - it never turns patterns back into the original. But
>>>>> interestingly, the way it's hardwired in the brain, patterns are
>>>>> top-down reconstructable, while our (for speed) approach isn't.
>>>>>
>>>>>> The SP and TP are only separate in the current implementation for
>>>>>> engineering reasons. You can use each alone if you like (useful for
>>>>>> some applications) and you can also subclass each separately
>>>>>> (useful for others).
>>>>>
>>>>> Yep, agreed.
>>>>>
>>>>>> We currently lack feedback and motor connections, as well as
>>>>>> thalamically mediated connections; these will be added in time.
>>>>>>
>>>>>> We do model inhibition in NuPIC, and local inhibition is close
>>>>>> enough to match how the neocortex does it.
>>>>>
>>>>> Yes, I agree it's close enough, but the fact that we don't keep the
>>>>> inhibition permanences prevents us from computing the inverse of the
>>>>> feed-forward compute().
>>>>>
>>>>>> When discussing reconstruction of inputs, remember that it is the
>>>>>> set of predictive cells which is used to identify the associated
>>>>>> inputs, not the current set of active columns in the SP. There are
>>>>>> no inhibited predictive cells to be consulted.
>>>>>
>>>>> Not sure I follow here...
>>>>> In both the TP/SP case, the output is an SDR - an OR of active
>>>>> columns and predictive cells/cols.
>>>>> So what you're saying is, the reconstruction should work even with
>>>>> the current inhibition model, that is, without inhibitory
>>>>> connections?
>>>>>
>>>>> Imagine I have two categories - "cats" and "dogs". The encoder
>>>>> transcribes them to a 1000-bit array; cats = bits 1-500 ON, dogs =
>>>>> bits 501-1000 ON. For the encoder to decode back, I need to produce
>>>>> an array of the same quality.
>>>>>
>>>>> The SP makes an SDR out of it: some 20 bits for dogs, another 20
>>>>> represent cats. Do these bits' synapses cover all 500 input bits?
>>>>> They should.
>>>>>
>>>>> So if I take a noisy input, bits 1-250 ON and 400-600 ON, and feed it
>>>>> to the SP, I expect it to produce some 21-22 bits ON, where the
>>>>> majority (cca 19 bits) represents the cats' bits. (Btw, I don't see
>>>>> any predictive columns in the SP(?).) Now I want to ask: given this
>>>>> SDR, what output is most likely? So I should take a random sub-sample
>>>>> back down to 20 ON bits (this will most likely give me the cats-only
>>>>> SDR), apply the permanences, and see which input synapses should have
>>>>> been active. There's a good chance it's the left 500 ones.
>>>>>
>>>>> Will this work as a means of reconstruction?
>>>>>
>>>>>> We never need to reconstruct the inputs from the active columns -
>>>>>> the input is what causes the current activation pattern.
>>>>>
>>>>> This is not true. I have a learning application where I want to use
>>>>> this as a CAM memory. Let's assume the date encoder encodes similar
>>>>> times with similar patterns (it does).
>>>>> Now I will train:
>>>>> {7:45am : breakfast}; {2pm : lunch}; {7pm : dinner};
>>>>> and I want to ask {7:48am : ???}. I can easily encode the 7:48 time
>>>>> and expect to get "breakfast", as 7:45 and 7:48 share most of their
>>>>> bits. This is a use case where I want to feed in (incomplete) SDR
>>>>> active columns and expect to get the input pattern back.
>>>>> Btw,
>>>>> interesting article about Google's deep NNs - scary and encouraging
>>>>> at the same time! And I'm glad YouTube can auto-identify funny cat
>>>>> videos; that's a killer feature ;)
>>>>>
>>>>> Ian,
>>>>> I'd like to know how much of a slow-down this reconstruction caused.
>>>>> I think the reconstruction should be re-enabled (maybe with a switch
>>>>> to dis/enable its use (and the slowdown)), and the Classifier should
>>>>> not be part of nupic-core, but exported as a feature of the OPF model
>>>>> framework.
>>>>>
>>>>>> When we have feedback from higher regions affecting the activity of
>>>>>> a region, we'll then have to employ reconstruction (using
>>>>>> feed-forward permanences or a classifier) to get the associated
>>>>>> "imagined" inputs.
>>>>>>
>>>>>> In the brain, we never need to do this, as the perception is the
>>>>>> input as far as our minds see it. Similarly, we don't need a
>>>>>> classifier, as the output is the identity when viewed from above.
>>>>>> --
>>>>>> Sent from Mailbox <https://www.dropbox.com/mailbox> for iPhone
>>>>>>
>>>>>> On Fri, Nov 22, 2013 at 5:41 PM, Marek Otahal <[email protected]> wrote:
>>>>>>
>>>>>>> In the following text I'll describe what (I think) the Classifier
>>>>>>> does, why I consider it wrong, and what can be done to avoid it.
>>>>>>>
>>>>>>> 1/ What is a Classifier:
>>>>>>> The wiki doesn't say much; I gather it's a part of a Region that
>>>>>>> "translates SDRs back to input-space".
>>>>>>>
>>>>>>> Looking at the code in py/nupic/algorithms/CLAClassifier.py (and
>>>>>>> its C++ sibling) I see there's a function compute() that basically
>>>>>>> pairs an SDR (from the lower layer) with the input that caused it,
>>>>>>> am I right?
>>>>>>>
>>>>>>> *) There's a KNNClassifier, which would use the k-nearest
>>>>>>> neighbours algorithm, and CLAClassifier... which uses what? An SP?
>>>>>>> A feed-forward NN seems to be a good candidate for such an
>>>>>>> implementation of a classifier.
>>>>>>> 2/ Why I consider it "wrong"
>>>>>>>
>>>>>>> 2.1/ The Classifier does not have a biological counterpart like
>>>>>>> other parts of HTM/CLA do.
>>>>>>>
>>>>>>> The chain:
>>>>>>> input --> encoder --> SP --> TP --> Classifier??!
>>>>>>>
>>>>>>> Input can be whatever we have sensors to perceive, e.g. a "sound
>>>>>>> wave"; it's of any possible data type - the input-space.
>>>>>>>
>>>>>>> The encoder is the function of the sensory organ - e.g. "the
>>>>>>> cochlea translates the vibrations into electrochemical pulses in
>>>>>>> the cells"; it translates from input-space to a bit-vector (not an
>>>>>>> SDR, however).
>>>>>>>
>>>>>>> SP+TP are combined together in the brain in a (micro-)region; they
>>>>>>> both accept and produce an SDR.
>>>>>>>
>>>>>>> The Classifier does not have a counterpart, as the brain has no
>>>>>>> need to translate back to the input-space. We, however, do, for
>>>>>>> CLAs to be useful in practical problems.
>>>>>>>
>>>>>>> 2.2/ The lack of top-down compute in the SP and TP breaks
>>>>>>> modularity.
>>>>>>>
>>>>>>> The encoder does have encode() and decode()/topDownCompute()
>>>>>>> methods. The SP and TP don't. To make these two useful building
>>>>>>> blocks, it would be necessary to have an inverse of compute() in
>>>>>>> them too.
>>>>>>>
>>>>>>> In nature, there are four types of connections between
>>>>>>> cells/neurons in the brain. Two vertical: feed-forward, feeding
>>>>>>> input to the higher layer, and recurrent, feeding more stable
>>>>>>> patterns to lower layers.
>>>>>>>
>>>>>>> And two horizontal: predictive connections (used in the TP) and
>>>>>>> inhibitory (missing in NuPIC).
>>>>>>>
>>>>>>> The inhibition connections are missing for performance reasons (I
>>>>>>> think) and we use global/local n-best inhibition instead. This fact
>>>>>>> makes it impossible to reconstruct, from a list of active columns
>>>>>>> (an SDR), the input that had caused it. If we had them (inh.
>>>>>>> permanences) we could use them in reverse, meaning: boost the
>>>>>>> columns that have been silenced, and from these "active+" columns,
>>>>>>> according to the permanences, turn ON the appropriate synapses.
>>>>>>>
>>>>>>> Such an implementation would be slower, but interesting for
>>>>>>> bio-inspired research; it would allow the SDR to be the messaging
>>>>>>> format between parts of the CLA (even different implementations)
>>>>>>> and reduce the possible space for errors happening in the
>>>>>>> Classifier.
>>>>>>>
>>>>>>> Are my thoughts right, or wrong somewhere? What do you think?
>>>>>>>
>>>>>>> Cheers, Mark
>>>>>>>
>>>>>>> --
>>>>>>> Marek Otahal :o)
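P.S. A toy sketch of how I read Marek's reverse-permanence idea (illustrative only - the names, the permanence layout, and the threshold are made up, not the NuPIC API): given the active columns of an SDR, run the feed-forward permanences "in reverse" to recover which input bits likely caused them.

```python
# Illustrative sketch (not the NuPIC API): reconstruction of input bits
# from active columns by reading the feed-forward permanence matrix in
# reverse - the inverse direction of compute().

CONNECTED = 0.2  # hypothetical permanence threshold for a connected synapse

# Made-up permanences[column][input_bit] for a tiny 3-column, 6-bit input.
permanences = {
    0: {0: 0.50, 1: 0.40, 2: 0.10},   # column 0 mostly covers bits 0-1
    1: {2: 0.60, 3: 0.50},            # column 1 covers bits 2-3
    2: {4: 0.30, 5: 0.70, 0: 0.05},   # column 2 covers bits 4-5
}

def reconstruct(active_columns):
    """Turn ON every input bit reachable through a connected synapse
    of an active column."""
    bits = set()
    for col in active_columns:
        for bit, perm in permanences[col].items():
            if perm >= CONNECTED:  # only connected synapses contribute
                bits.add(bit)
    return bits

# If columns 0 and 1 were active, the reconstructed input covers bits 0-3;
# the weak (sub-threshold) synapses are ignored, as they would be in the SP.
```

If we also kept inhibition permanences, as Marek suggests, the same reverse pass could first un-silence ("boost") the inhibited columns before collecting bits.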
_______________________________________________
nupic mailing list
[email protected]
http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
