Correction: not all synapses, all active cells. In the Office Hours video, around 19:10, Jeff explains the specifics of how he thinks hierarchy should work: all active cells are wired up to the next level, with no classifiers between levels.
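A minimal sketch of that wiring, under made-up sizes (2048 columns x 32 cells are illustrative, not prescribed): the set of all active cells at one level, taken as a binary vector, is itself the feed-forward input of the next level, with nothing in between.

```python
import numpy as np

# Illustrative sizes only (not prescribed by the theory).
N_COLUMNS, CELLS_PER_COLUMN = 2048, 32
N_CELLS = N_COLUMNS * CELLS_PER_COLUMN     # width of one level's output

def level_output(active_cells):
    """All active cells of a level as one binary vector. That vector is
    itself the feed-forward input of the next level up; no classifier or
    decoding step sits between the levels."""
    out = np.zeros(N_CELLS, dtype=np.uint8)
    out[list(active_cells)] = 1
    return out

# The level-2 region would consume this directly as its input SDR.
l2_input = level_output({5, 700, 65535})
```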
On Fri, Nov 22, 2013 at 5:13 PM, Doug King <[email protected]> wrote:

> I'm trying to follow here. Am I getting this right?
>
> The output of the CLA needs to be reconstructed for business reasons: we need a prediction or anomaly detection so we can get paid and do more of this.
>
> A reconstruction is a single SDR output of the CLA that represents the best fit for predicting the next step (t+1).
>
> A classifier is a super-set of reconstructed SDRs, with probabilities, out to t+n.
>
> We want to chain CLAs in a biology-like way (HTMs), and we need to feed the reconstruction (output) of one into another.
>
> It seems to me that a reconstructor or classifier collapses the temporal information, so we lose it. To actually keep the temporal information we need to "propagate information through synaptic weights; any claim to biological adherence demands that."
>
> Jeff alluded to this in one of the Office Hours videos. He said it would be very hard to do hierarchy correctly so as not to lose temporal information. Should I take this to mean that all synaptic info needs to be wired up to the next CLA's input?
>
> Are my assumptions / interpretations correct here?
>
> On Fri, Nov 22, 2013 at 3:54 PM, Ian Danforth <[email protected]> wrote:
>
>> Scott,
>>
>> Reconstruction was implemented because it was much closer to biology than any classifier. This holds true today. It may also be useful to learn feed-forward and feedback pathways independently; that would be biologically sound, and, as you suggest, we should move in that direction.
>>
>> 1. The classifier adds complexity and memory footprint that could be eliminated.
>> 2. Reconstruction also provides probabilities based on synaptic strengths, which is how the brain does it.
>> 3. It is true that the classifier provides direct multistep prediction, but this is an egregiously non-biological hack.
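As a concrete illustration of the multistep, probability-weighted prediction being debated, here is a toy lookup classifier. It is a simplified stand-in (hypothetical class and method names, not NuPIC's CLAClassifier API) that pairs the SDR active at time t with the raw input that arrives `steps` ticks later:

```python
from collections import Counter, defaultdict

class ToyStepClassifier:
    """Toy lookup classifier: learn which raw input tends to follow a
    given SDR `steps` time steps later, and return that as a probability
    distribution. Hypothetical names; not NuPIC's CLAClassifier API."""

    def __init__(self, steps=1):
        self.steps = steps
        self.history = []                    # SDRs from the last `steps` ticks
        self.table = defaultdict(Counter)    # SDR key -> Counter of later inputs

    def learn_and_infer(self, sdr, actual_input):
        key = frozenset(sdr)
        self.history.append(key)
        if len(self.history) > self.steps:
            past = self.history.pop(0)
            self.table[past][actual_input] += 1   # input seen `steps` after `past`
        counts = self.table[key]
        total = sum(counts.values())
        return {v: c / total for v, c in counts.items()} if total else {}

# Alternating sequence: after the SDR that co-occurs with "B",
# the value one step later is always "A" (and vice versa).
clf = ToyStepClassifier(steps=1)
for value, sdr in [("A", {1, 2}), ("B", {3, 4}), ("A", {1, 2}), ("B", {3, 4})]:
    pred = clf.learn_and_infer(sdr, value)
```

Running one instance per horizon (steps=1, 2, ...) gives the "multiple intervals, each with weighted predictions" behaviour described in the thread.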
>> The utility of a classifier is pretty high, but ultimately it is an almost inexcusable distraction from the advancement of the core theory. It should remain as one of a family of possible classifiers to put on the output of the SP or TP (which is very common in mainline ML research), but it shouldn't be part of any core logic.
>>
>> There is no question that a real solution here is based on the propagation of information through synaptic weights; any claim to biological adherence demands that.
>>
>> When we get around to putting the whitepaper up so the community can update it, a clear roadmap for either a feedback pathway, or a reconstruction pathway as a step toward feedback, will be required.
>>
>> Ian
>>
>> On Fri, Nov 22, 2013 at 2:59 PM, Scott Purdy <[email protected]> wrote:
>>
>>> Marek, there is a difference between reconstruction and feedback. There is no connection that I am aware of between the CLA theory and reconstruction (or the classifier, for that matter).
>>>
>>> The classifier and reconstruction are tools that we use for business purposes. So what pragmatic benefits do we get by switching from the classifier to reconstruction?
>>>
>>> Reconstruction gets us a single prediction for the following step. The classifier can provide the same, but it can also make multiple predictions with probabilities. It can be used to predict a different time step or time interval, and it can do so for multiple intervals, all of which come with weighted sets of predictions.
>>>
>>> So what practical benefit do you want to get from reconstruction?
>>>
>>> I am actually not opposed to reconstruction and think it is a little easier to reason about. But my proposal would be to leave the classifier as the default for prediction problems and implement reconstruction completely outside the spatial and temporal pooler code.
>>> I.e., rather than adding the code directly into the SP and TP classes as it was before, implement it in different files and have it simply inspect the SP/TP state. For me, it falls into the same category as the classifier and encoders: not part of the core CLA theory, but useful for applying the CLA to real problems.
>>>
>>> Marek, Ian: what are your thoughts on that? Are there any actual benefits to implementing reconstruction?
>>>
>>> On Fri, Nov 22, 2013 at 12:01 PM, Marek Otahal <[email protected]> wrote:
>>>
>>>> Hi Fergal, Ian,
>>>> thank you very much for these ideas.
>>>>
>>>> On Fri, Nov 22, 2013 at 7:02 PM, Fergal Byrne <[email protected]> wrote:
>>>>
>>>>> Hi Marek,
>>>>>
>>>>> Some good points there. I'll address them together rather than point by point, as they're intertwined.
>>>>>
>>>>> First, the classifier is an artefact used to derive useful information from the state of a region; it was never meant to be part of the CLA theory.
>>>>
>>>> Ok, I can accept that. It is useful for the practical purposes of some applications we build. However, if we could achieve the same with a reverse function at each layer, I'd be happier, because it would be more "correct". I use quotes here because the brain doesn't have to do either; it never turns patterns back into the original. But interestingly, the way it's hardwired in the brain, patterns are top-down reconstructable, while our (for speed) approach isn't.
>>>>
>>>>> The SP and TP are only separate in the current implementation for engineering reasons. You can use each alone if you like (useful for some applications), and you can also subclass each separately (useful for others).
>>>>
>>>> Yep, agreed.
>>>>
>>>>> We currently lack feedback and motor connections, as well as thalamically mediated connections; these will be added in time.
>>>>> We do model inhibition in NuPIC, and local inhibition is close enough to match how the neocortex does it.
>>>>
>>>> Yes, I agree it's close enough, but the fact that we don't keep inhibition permanences prevents us from computing the inverse of the feed-forward compute().
>>>>
>>>>> When discussing reconstruction of inputs, remember that it is the set of predictive cells which is used to identify the associated inputs, not the current set of active columns in the SP. There are no inhibited predictive cells to be consulted.
>>>>
>>>> Not sure I follow here...
>>>> In both the TP and SP case, the output is an SDR: an OR over active columns and predictive cells/columns. So are you saying that reconstruction should work even with the current inhibition model, that is, without inhibitory connections?
>>>>
>>>> Imagine I have two categories, "cats" and "dogs". The encoder transcribes them to a 1000-bit array: cats = bits 1-500 ON, dogs = bits 501-1000 ON. For the encoder to decode back, I need to produce an array of the same quality.
>>>>
>>>> The SP makes an SDR out of it: some 20 bits represent dogs, another 20 represent cats. Do these bits' synapses cover all 500 input bits? They should.
>>>>
>>>> So if I take a noisy input, say bits 1-250 and 400-600 ON, and feed it to the SP, I expect it to produce some 21-22 bits ON, where the majority (ca. 19 bits) represent the cats' bits. (Btw, I don't see any predictive columns in the SP(?).) Now I want to ask: given this SDR, what input is most likely? So I should take a random sub-sample back down to 20 ON bits (this will most likely give me a cats-only SDR), apply the permanences, and see which input synapses should have been active. There's a good chance it's the left 500.
>>>>
>>>> Will this work as a means for reconstruction?
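Marek's cats/dogs reconstruction can be sketched with NumPy. Everything here is a simplification: the "SP" is a hand-built connected-synapse matrix with top-K inhibition (columns 0-99 pool over the cats bits, columns 100-199 over the dogs bits), standing in for a trained spatial pooler:

```python
import numpy as np

rng = np.random.default_rng(0)
N_IN, N_COLS, K = 1000, 200, 20            # input bits, columns, active columns

# Hand-built stand-in for a *trained* SP: columns 0-99 pool over the
# 'cats' bits (0-499), columns 100-199 over the 'dogs' bits (500-999),
# each column connecting to 50 random input bits in its half.
connected = np.zeros((N_COLS, N_IN), dtype=bool)
for c in range(N_COLS):
    lo = 0 if c < 100 else 500
    connected[c, rng.choice(np.arange(lo, lo + 500), size=50, replace=False)] = True

def sp_compute(input_bits):
    """Top-K 'inhibition': the K columns with most active connected synapses win."""
    overlap = connected @ input_bits
    active = np.zeros(N_COLS, dtype=np.uint8)
    active[np.argsort(overlap)[-K:]] = 1
    return active

def reconstruct(active_cols, n_bits=500):
    """Run the feed-forward connections in reverse: each active column votes,
    through its connected synapses, for the input bits that likely drove it."""
    votes = connected.T @ active_cols
    out = np.zeros(N_IN, dtype=np.uint8)
    out[np.argsort(votes)[-n_bits:]] = 1
    return out

# Noisy input, as in the example above: bits 1-250 and 400-600 ON.
noisy = np.zeros(N_IN, dtype=np.uint8)
noisy[:250] = 1
noisy[399:600] = 1

sdr = sp_compute(noisy)      # the winning columns are all 'cats' columns
rec = reconstruct(sdr)       # reversing the synapses recovers mostly bits 1-500
```

Because the noisy input overlaps the cats half far more than the dogs half, the winning columns are cats columns, and running their connections in reverse recovers mostly the left 500 bits, which is the reconstruction behaviour being asked about.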
>>>>> We never need to reconstruct the inputs from the active columns; the input is what causes the current activation pattern.
>>>>
>>>> This is not true. I have a learning application where I want to use this as a CAM (content-addressable memory). Let's assume the date encoder encodes similar times with similar patterns (it does).
>>>> Now I will train:
>>>> {7:45am : breakfast}; {2pm : lunch}; {7pm : dinner};
>>>> and when I ask {7:48am : ???}, I can easily encode the 7:48 time and expect to get "breakfast", since 7:45 and 7:48 share most of their bits. This is a use case where I want to feed an (incomplete) SDR of active columns and get the input pattern back.
>>>>
>>>> Btw, an interesting article about Google's deep NNs; scary and encouraging at the same time! And I'm glad YouTube can auto-identify funny cat videos; that's a killer feature ;)
>>>>
>>>> Ian,
>>>> I'd like to know how much slow-down the reconstruction caused. I think reconstruction should be re-enabled (maybe with a switch to enable/disable its use (and its slowdown)), and the Classifier should not be part of nupic-core, but exported as a feature of the OPF model framework.
>>>>
>>>>> When we have feedback from higher regions affecting the activity of a region, we'll then have to employ reconstruction (using feed-forward permanences or a classifier) to get the associated "imagined" inputs.
>>>>>
>>>>> In the brain, we never need to do this, as the perception is the input as far as our minds see it. Similarly, we don't need a classifier, as the output is the identity when viewed from above.
>>>>>
>>>>> —
>>>>> Sent from Mailbox <https://www.dropbox.com/mailbox> for iPhone
>>>>>
>>>>> On Fri, Nov 22, 2013 at 5:41 PM, Marek Otahal <[email protected]> wrote:
>>>>>
>>>>>> In the following text I'll describe what (I think) the Classifier does, why I consider it wrong, and what can be done to avoid it.
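Marek's content-addressable-memory use case above (similar times sharing most bits, queried by overlap) can be sketched as follows; the encoder, bit widths, and `recall` helper are all made up for illustration, not the actual NuPIC date encoder:

```python
import numpy as np

def encode_time(minutes, n_bits=480, width=40):
    """Toy time-of-day encoder: a block of `width` ON bits whose position
    tracks the time of day, so nearby times share most of their bits.
    (Made-up stand-in for NuPIC's date encoder.)"""
    start = minutes * (n_bits - width) // (24 * 60)
    sdr = np.zeros(n_bits, dtype=np.uint8)
    sdr[start:start + width] = 1
    return sdr

# 'Train' the CAM by storing (pattern, label) pairs.
memory = [
    (encode_time(7 * 60 + 45), "breakfast"),   # 7:45am
    (encode_time(14 * 60), "lunch"),           # 2pm
    (encode_time(19 * 60), "dinner"),          # 7pm
]

def recall(query_sdr):
    """Return the stored label whose pattern overlaps the query the most."""
    return max(memory, key=lambda pair: int(pair[0] @ query_sdr))[1]

print(recall(encode_time(7 * 60 + 48)))       # prints 'breakfast'
```

The 7:48 pattern shares 39 of its 40 bits with the stored 7:45 pattern and none with lunch or dinner, so the overlap lookup returns "breakfast", which is exactly the incomplete-pattern-to-input behaviour described above.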
>>>>>> 1/ What is a Classifier:
>>>>>> The wiki doesn't say much; I gather it's a part of a Region that "translates SDRs back to input-space".
>>>>>>
>>>>>> Looking at the code in py/nupic/algorithms/CLAClassifier.py (and its C++ sibling), I see there's a compute() function that basically pairs an SDR (from a lower layer) with the input that caused it. Am I right?
>>>>>>
>>>>>> *) There's a KNNClassifier, which uses the k-nearest-neighbours algorithm, and the CLAClassifier... which uses what? An SP? A feed-forward NN seems to be a good candidate for such an implementation of a classifier.
>>>>>>
>>>>>> 2/ Why I consider it "wrong":
>>>>>>
>>>>>> 2.1/ The Classifier does not have a biological counterpart like other parts of HTM/CLA do.
>>>>>>
>>>>>> The chain:
>>>>>> input --> encoder --> SP --> TP --> Classifier??!
>>>>>>
>>>>>> The input can be whatever we have sensors to perceive, e.g. a "sound wave"; it can be of any possible data type (the input-space).
>>>>>>
>>>>>> The encoder is the function of the sensory organ, e.g. "the cochlea translates vibrations into electrochemical pulses in the cells"; it translates from the input-space to a bit-vector (not an SDR, however).
>>>>>>
>>>>>> SP+TP: these are combined together in the brain in a (micro-)region; they both accept and produce an SDR.
>>>>>>
>>>>>> The Classifier does not have a counterpart, as the brain has no need to translate back to the input-space. We, however, do, for CLAs to be useful in practical problems.
>>>>>>
>>>>>> 2.2/ The lack of top-down compute in the SP and TP breaks modularity.
>>>>>>
>>>>>> The Encoder has encode() and decode()/topDownCompute() methods. The SP and TP don't. To make these two useful building blocks, it would be necessary to have an inverse of compute() in them too.
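The encode()/decode() pairing Marek points to can be illustrated with a toy category encoder. This is a hypothetical plain-Python example following the 1000-bit cats/dogs layout used earlier in the thread, not NuPIC's actual encoder API:

```python
# Toy category encoder with both encode() and its inverse decode(),
# following the 1000-bit cats/dogs layout from earlier in the thread.
# Hypothetical example; not NuPIC's actual encoder API.

CATEGORIES = ["cats", "dogs"]
BITS_PER_CATEGORY = 500

def encode(category):
    """Category name -> 1000-bit vector (cats: bits 0-499, dogs: 500-999)."""
    bits = [0] * (len(CATEGORIES) * BITS_PER_CATEGORY)
    i = CATEGORIES.index(category)
    for b in range(i * BITS_PER_CATEGORY, (i + 1) * BITS_PER_CATEGORY):
        bits[b] = 1
    return bits

def decode(bits):
    """Inverse of encode(): pick the category whose block has the most ON
    bits, so even a noisy or partial vector decodes to the best match."""
    scores = [sum(bits[i * BITS_PER_CATEGORY:(i + 1) * BITS_PER_CATEGORY])
              for i in range(len(CATEGORIES))]
    return CATEGORIES[scores.index(max(scores))]
```

An SP/TP inverse of compute() would play the analogous role: map an output SDR back toward the feed-forward input that most plausibly produced it.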
>>>>>> In nature, there are four types of connections between cells/neurons in the brain. Two vertical: feed-forward connections, feeding input to the higher layer, and feedback (recurrent) connections, feeding more stable patterns to lower layers.
>>>>>>
>>>>>> And two horizontal: predictive connections (used in the TP) and inhibitory connections (missing in NuPIC).
>>>>>>
>>>>>> The inhibitory connections are missing for performance reasons (I think), and we use global/local n-best inhibition instead. This makes it impossible to reconstruct, from a list of active columns (an SDR), the input that caused it. If we had them (the inhibitory permanences), we could use them in reverse to boost the columns that had been silenced, and then, from these "active+" columns, turn ON the appropriate synapses according to the permanences.
>>>>>>
>>>>>> Such an implementation would be slower, but interesting for bio-inspired research; it would allow the SDR to be the messaging format between parts of the CLA (even different implementations) and reduce the possible space for errors happening in the Classifier.
>>>>>>
>>>>>> Are my thoughts proper/wrong somewhere? What do you think?
>>>>>> Cheers,
>>>>>> Marek
>>>>>>
>>>>>> --
>>>>>> Marek Otahal :o)
>>>>>
>>>>> _______________________________________________
>>>>> nupic mailing list
>>>>> [email protected]
>>>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>>>>
>>>> --
>>>> Marek Otahal :o)
