Thanks Ian. Yes, I get that real (clock-on-the-wall) time is not encoded. But the string of steps it took to get to a point in the CLA is encoded, since the connections are built up over the steps the CLA takes during training. When you convert that to a reconstructed SDR output you have effectively lost all that temporal info. It is collapsed into a single frame.
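To make concrete what I mean by "collapsed", here is a toy sketch (purely illustrative made-up code, nothing to do with the actual NuPIC classifier): a distribution over predicted SDRs carries information about several possible continuations of a sequence, but reporting only the best fit throws the rest away.

```python
# Toy illustration (not NuPIC code): picking the single most probable
# predicted SDR collapses a distribution over sequence continuations
# into one frame, discarding the probability mass on the alternatives.

def collapse(predictions):
    """Pick the single most probable SDR, discarding the alternatives."""
    best_sdr, _ = max(predictions.items(), key=lambda kv: kv[1])
    return best_sdr

# Hypothetical predicted SDRs (frozensets of active bit indices) with
# probabilities - e.g. three plausible next steps in a learned sequence.
predictions = {
    frozenset({3, 17, 42}): 0.60,  # path A: the likeliest continuation
    frozenset({5, 17, 99}): 0.35,  # path B: plausible, but dropped
    frozenset({8, 23, 71}): 0.05,  # path C: unlikely, also dropped
}

single_frame = collapse(predictions)
# Only path A survives; the 40% of probability mass on B and C is gone.
```

Feeding `single_frame` into a second CLA is exactly the situation I'm asking about below: the downstream region never sees paths B and C.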
I'm still trying to think this through. Suppose you are feeding this
'collapsed' SDR into another CLA. You can only pick one SDR (the best fit /
highest probability) to feed it. Have you now lost all the good info on the
other likely (less probable) paths in the sequence?

On Fri, Nov 22, 2013 at 5:24 PM, Ian Danforth <[email protected]> wrote:

> Doug,
>
> Whatever method you choose, it comes down to this: the output of a system
> like the CLA needs to make choices. Is this large collection of active
> cells a 5, or an 8? Or, more realistically, does this large collection of
> active cells mean I move my arm up or down?
>
> Temporal context gives you the ability to make the right choices at the
> right time, but those individual outcomes don't need to encode time
> themselves.
>
> To use Jeff's favorite metaphor, a melody is a series of notes, and there
> aren't that many on a normal scale, but learning to play them in the
> right order and at the right time takes temporal context. The output,
> however, of whether I play a C or an F# is a collapsed/reduced version of
> all the possibilities that exist in the brain at a given time. If I try
> to express/play/output BOTH, well, I'll probably mash the keys in
> between, and that's no good.
>
> Ian
>
> On Fri, Nov 22, 2013 at 5:13 PM, Doug King <[email protected]> wrote:
>
>> I'm trying to follow here. Am I getting this right?
>>
>> The output of the CLA needs to be reconstructed for business reasons -
>> we need a prediction or anomaly detection so we can get paid and do more
>> of this.
>>
>> A reconstruction is a single SDR output of the CLA that represents the
>> best fit for predicting the next step (t+1).
>>
>> A classifier is a super-set of reconstructed SDRs, with probabilities,
>> out to t+n.
>>
>> We want to chain CLAs in a biologically plausible way (HTMs), and we
>> need to feed the reconstruction of one (its output) into another.
>>
>> It seems to me that a reconstructor or classifier collapses the temporal
>> information, so we lose it.
>> To actually keep the temporal information we need to "propagate
>> information through synaptic weights; any claim to biological adherence
>> demands that."
>>
>> Jeff alluded to this in one of the Office Hours videos. He said it would
>> be very hard to do hierarchy correctly so as to not lose temporal
>> information. Should I take this to mean that all synaptic info needs to
>> be wired up to the next CLA's input?
>>
>> Are my assumptions / interpretations correct here?
>>
>> On Fri, Nov 22, 2013 at 3:54 PM, Ian Danforth <[email protected]> wrote:
>>
>>> Scott,
>>>
>>> Reconstruction was implemented because it was much closer to biology
>>> than any classifier. This holds true today. It may also be useful to
>>> learn feed-forward and feedback pathways independently; that would be
>>> biologically sound, and as you suggest, we should move in that
>>> direction.
>>>
>>> 1. The classifier adds complexity and memory footprint that could be
>>> eliminated.
>>> 2. Reconstruction also provides probabilities based on synaptic
>>> strengths, which is how the brain does it.
>>> 3. It is true that the classifier provides direct multistep prediction,
>>> but this is an egregiously non-biological hack.
>>>
>>> The utility of a classifier is pretty high, but ultimately it is an
>>> almost inexcusable distraction from the advancement of the core theory.
>>> It should remain as one of a family of possible classifiers to put on
>>> the output of the SP or TP (which is very common in mainline ML
>>> research) but shouldn't be part of any core logic.
>>>
>>> There is no question that a real solution here is based on the
>>> propagation of information through synaptic weights; any claim to
>>> biological adherence demands that.
>>>
>>> When we get around to putting the whitepaper up so the community can
>>> update it, a clear roadmap for either a feedback pathway, or a
>>> reconstruction pathway as a step toward feedback, will be required.
>>> Ian
>>>
>>> On Fri, Nov 22, 2013 at 2:59 PM, Scott Purdy <[email protected]> wrote:
>>>
>>>> Marek, there is a difference between reconstruction and feedback.
>>>> There is no connection that I am aware of between the CLA theory and
>>>> reconstruction (or the classifier, for that matter).
>>>>
>>>> The classifier and reconstruction are tools that we use for business
>>>> purposes. So what are the pragmatic benefits we get by switching from
>>>> the classifier to reconstruction?
>>>>
>>>> Reconstruction gets us a single prediction for the following step. The
>>>> classifier can provide the same. But it can also make multiple
>>>> predictions with probabilities. And it can be used to predict a
>>>> different time step or time interval. And it can do so for multiple
>>>> intervals, and all of these come with weighted sets of predictions.
>>>>
>>>> So what practical benefit do you want to get from reconstruction?
>>>>
>>>> I am actually not opposed to reconstruction and think it is a little
>>>> easier to reason about. But my proposal would be to leave the
>>>> classifier as the default for prediction problems and implement
>>>> reconstruction completely outside the spatial and temporal pooler
>>>> code. I.e., rather than adding the code directly into the SP and TP
>>>> classes as it was before, implement it in different files and have it
>>>> simply inspect the SP/TP state. For me, it falls into the same
>>>> category as the classifier and encoders - not part of the core CLA
>>>> theory, but useful for applying the CLA to real problems.
>>>>
>>>> Marek, Ian - what are your thoughts on that? Are there any actual
>>>> benefits to implementing reconstruction?
>>>>
>>>> On Fri, Nov 22, 2013 at 12:01 PM, Marek Otahal <[email protected]> wrote:
>>>>
>>>>> Hi Fergal, Ian,
>>>>> thank you very much for these ideas.
>>>>>
>>>>> On Fri, Nov 22, 2013 at 7:02 PM, Fergal Byrne <[email protected]> wrote:
>>>>>
>>>>>> Hi Marek,
>>>>>>
>>>>>> Some good points there.
>>>>>> I'll address them together rather than point by point, as they're
>>>>>> intertwined.
>>>>>>
>>>>>> First, the classifier is an artefact used to derive useful
>>>>>> information from the state of a region; it was never meant to be
>>>>>> part of the CLA theory.
>>>>>
>>>>> Ok, I can take that. It is useful for the practical purposes of some
>>>>> applications we do. However, if we could achieve the same with the
>>>>> reverse function at each layer, I'd be happier - because it would be
>>>>> more "correct". I use quotes here because the brain doesn't have to
>>>>> do either - it never turns patterns back into the original. But
>>>>> interestingly, the way it's hardwired in the brain, patterns are
>>>>> top-down reconstructable, while our (for speed) approach isn't.
>>>>>
>>>>>> The SP and TP are only separate in the current implementation for
>>>>>> engineering reasons. You can use each alone if you like (useful for
>>>>>> some applications) and you can also subclass each separately
>>>>>> (useful for others).
>>>>>
>>>>> Yep, agreed.
>>>>>
>>>>>> We currently lack feedback and motor connections, as well as
>>>>>> thalamically mediated connections; these will be added in time.
>>>>>>
>>>>>> We do model inhibition in NuPIC, and local inhibition is close
>>>>>> enough to match how the neocortex does it.
>>>>>
>>>>> Yes, I agree it's close enough, but the fact that we don't keep the
>>>>> inhibition permanences prevents us from computing the inverse of the
>>>>> feed-forward compute().
>>>>>
>>>>>> When discussing reconstruction of inputs, remember that it is the
>>>>>> set of predictive cells which is used to identify the associated
>>>>>> inputs, not the current set of active columns in the SP. There are
>>>>>> no inhibited predictive cells to be consulted.
>>>>>
>>>>> Not sure I follow here...
>>>>> In both the TP/SP case, the output is an SDR - an OR of active
>>>>> columns and predictive cells/cols.
>>>>> So what you're saying is, the reconstruction should work even with
>>>>> the current inhibition model, that is, without inhibitory
>>>>> connections?
>>>>>
>>>>> Imagine I have two categories - "cats" and "dogs". The encoder
>>>>> transcribes them to a 1000-bit array; cats = bits 1-500 ON, dogs =
>>>>> bits 501-1000 ON. For the encoder to decode back, I need to produce
>>>>> an array of the same quality.
>>>>>
>>>>> The SP makes an SDR out of it: some 20 bits for dogs, another 20
>>>>> represent cats. Do these bits' synapses cover all 500 input bits?
>>>>> They should.
>>>>>
>>>>> So if I take a noisy input, bits 1-250 ON and 400-600 ON, and feed it
>>>>> to the SP, I expect it to produce some 21-22 bits ON, where the
>>>>> majority (cca 19 bits) represents the cats' bits. (Btw, I don't see
>>>>> any predictive columns in the SP(?).) Now I want to ask: given this
>>>>> SDR, what output is most likely? So I should take a random sub-sample
>>>>> back down to 20 ON bits (this will most likely give me the cats-only
>>>>> SDR), apply the permanences, and see which input synapses should have
>>>>> been active. There's a good chance it's the left 500 ones.
>>>>>
>>>>> Will this work as a means of reconstruction?
>>>>>
>>>>>> We never need to reconstruct the inputs from the active columns -
>>>>>> the input is what causes the current activation pattern.
>>>>>
>>>>> This is not true. I have a learning application where I want to use
>>>>> this as a CAM memory. Let's assume the date encoder encodes similar
>>>>> times with similar patterns (it does).
>>>>> Now I will train:
>>>>> {7:45am : breakfast}; {2pm : lunch}; {7pm : dinner};
>>>>> and I want to ask {7:48am : ???}. I can easily encode the 7:48 time
>>>>> and expect to get "breakfast", as 7:45 and 7:48 share most of their
>>>>> bits. This is a use case where I want to feed in (incomplete) SDR
>>>>> active columns and expect to get the input pattern back.
>>>>> Btw,
>>>>> interesting article about Google's deep NNs - scary and encouraging
>>>>> at the same time! And I'm glad YouTube can auto-identify funny cat
>>>>> videos; that's a killer feature ;)
>>>>>
>>>>> Ian,
>>>>> I'd like to know how much of a slow-down this reconstruction caused.
>>>>> I think the reconstruction should be re-enabled (maybe with a switch
>>>>> to dis/enable its use (and the slowdown)), and the Classifier should
>>>>> not be part of nupic-core, but exported as a feature of the OPF model
>>>>> framework.
>>>>>
>>>>>> When we have feedback from higher regions affecting the activity of
>>>>>> a region, we'll then have to employ reconstruction (using
>>>>>> feed-forward permanences or a classifier) to get the associated
>>>>>> "imagined" inputs.
>>>>>>
>>>>>> In the brain, we never need to do this, as the perception is the
>>>>>> input as far as our minds see it. Similarly, we don't need a
>>>>>> classifier, as the output is the identity when viewed from above.
>>>>>> --
>>>>>> Sent from Mailbox <https://www.dropbox.com/mailbox> for iPhone
>>>>>>
>>>>>> On Fri, Nov 22, 2013 at 5:41 PM, Marek Otahal <[email protected]> wrote:
>>>>>>
>>>>>>> In the following text I'll describe what (I think) the Classifier
>>>>>>> does, why I consider it wrong, and what can be done to avoid it.
>>>>>>>
>>>>>>> 1/ What is a Classifier:
>>>>>>> The wiki doesn't say much; I gather it's a part of a Region that
>>>>>>> "translates SDRs back to input-space".
>>>>>>>
>>>>>>> Looking at the code in py/nupic/algorithms/CLAClassifier.py (and
>>>>>>> its C++ sibling) I see there's a function compute() that basically
>>>>>>> pairs an SDR (from the lower layer) with the input that caused it,
>>>>>>> am I right?
>>>>>>>
>>>>>>> *) There's a KNNClassifier, which would use the k-nearest
>>>>>>> neighbours algorithm, and CLAClassifier... which uses what? An SP?
>>>>>>> A feed-forward NN seems to be a good candidate for such an
>>>>>>> implementation of a classifier.
>>>>>>> 2/ Why I consider it "wrong"
>>>>>>>
>>>>>>> 2.1/ The Classifier does not have a biological counterpart like
>>>>>>> other parts of HTM/CLA do.
>>>>>>>
>>>>>>> The chain:
>>>>>>> input --> encoder --> SP --> TP --> Classifier??!
>>>>>>>
>>>>>>> Input can be whatever we have sensors to perceive, e.g. a "sound
>>>>>>> wave"; it's of any possible data type - the input-space.
>>>>>>>
>>>>>>> The encoder is the function of the sensory organ - e.g. "the
>>>>>>> cochlea translates the vibrations into electrochemical pulses in
>>>>>>> the cells"; it translates from input-space to a bit-vector (not an
>>>>>>> SDR, however).
>>>>>>>
>>>>>>> SP+TP are combined together in the brain in a (micro-)region; they
>>>>>>> both accept and produce an SDR.
>>>>>>>
>>>>>>> The Classifier does not have a counterpart, as the brain has no
>>>>>>> need to translate back to the input-space. We, however, do, for
>>>>>>> CLAs to be useful in practical problems.
>>>>>>>
>>>>>>> 2.2/ The lack of top-down compute in the SP and TP breaks
>>>>>>> modularity.
>>>>>>>
>>>>>>> The encoder does have encode() and decode()/topDownCompute()
>>>>>>> methods. The SP and TP don't. To make these two useful building
>>>>>>> blocks, it would be necessary to have an inverse of compute() in
>>>>>>> them too.
>>>>>>>
>>>>>>> In nature, there are four types of connections between
>>>>>>> cells/neurons in the brain. Two vertical: feed-forward, feeding
>>>>>>> input to the higher layer, and recurrent, feeding more stable
>>>>>>> patterns to lower layers.
>>>>>>>
>>>>>>> And two horizontal: predictive connections (used in the TP) and
>>>>>>> inhibitory (missing in NuPIC).
>>>>>>>
>>>>>>> The inhibition connections are missing for performance reasons (I
>>>>>>> think) and we use global/local n-best inhibition instead. This fact
>>>>>>> makes it impossible to reconstruct, from a list of active columns
>>>>>>> (an SDR), the input that had caused it. If we had them (inh.
>>>>>>> permanences) we could use them in reverse, meaning: boost the
>>>>>>> columns that have been silenced, and from these "active+" columns,
>>>>>>> according to the permanences, turn ON the appropriate synapses.
>>>>>>>
>>>>>>> Such an implementation would be slower, but interesting for
>>>>>>> bio-inspired research; it would allow the SDR to be the messaging
>>>>>>> format between parts of the CLA (even different implementations)
>>>>>>> and reduce the possible space for errors happening in the
>>>>>>> Classifier.
>>>>>>>
>>>>>>> Are my thoughts right, or wrong somewhere? What do you think?
>>>>>>>
>>>>>>> Cheers, Mark
>>>>>>>
>>>>>>> --
>>>>>>> Marek Otahal :o)
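P.S. A toy sketch of how I read Marek's reverse-permanence idea (illustrative only - the names, the permanence layout, and the threshold are made up, not the NuPIC API): given the active columns of an SDR, run the feed-forward permanences "in reverse" to recover which input bits likely caused them.

```python
# Illustrative sketch (not the NuPIC API): reconstruction of input bits
# from active columns by reading the feed-forward permanence matrix in
# reverse - the inverse direction of compute().

CONNECTED = 0.2  # hypothetical permanence threshold for a connected synapse

# Made-up permanences[column][input_bit] for a tiny 3-column, 6-bit input.
permanences = {
    0: {0: 0.50, 1: 0.40, 2: 0.10},   # column 0 mostly covers bits 0-1
    1: {2: 0.60, 3: 0.50},            # column 1 covers bits 2-3
    2: {4: 0.30, 5: 0.70, 0: 0.05},   # column 2 covers bits 4-5
}

def reconstruct(active_columns):
    """Turn ON every input bit reachable through a connected synapse
    of an active column."""
    bits = set()
    for col in active_columns:
        for bit, perm in permanences[col].items():
            if perm >= CONNECTED:  # only connected synapses contribute
                bits.add(bit)
    return bits

# If columns 0 and 1 were active, the reconstructed input covers bits 0-3;
# the weak (sub-threshold) synapses are ignored, as they would be in the SP.
```

If we also kept inhibition permanences, as Marek suggests, the same reverse pass could first un-silence ("boost") the inhibited columns before collecting bits.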
_______________________________________________
nupic mailing list
[email protected]
http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
