Correction: not all synapses, all active cells. In the Office Hours video, around 19:10, Jeff explains the specifics of how he thinks hierarchy should work: all active cells are wired up to the next level, with no classifiers between levels.
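A minimal sketch of that wiring, under made-up sizes (2048 columns x 32 cells are illustrative, not prescribed): the set of all active cells at one level, taken as a binary vector, is itself the feed-forward input of the next level, with nothing in between.

```python
import numpy as np

# Illustrative sizes only (not prescribed by the theory).
N_COLUMNS, CELLS_PER_COLUMN = 2048, 32
N_CELLS = N_COLUMNS * CELLS_PER_COLUMN     # width of one level's output

def level_output(active_cells):
    """All active cells of a level as one binary vector. That vector is
    itself the feed-forward input of the next level up; no classifier or
    decoding step sits between the levels."""
    out = np.zeros(N_CELLS, dtype=np.uint8)
    out[list(active_cells)] = 1
    return out

# The level-2 region would consume this directly as its input SDR.
l2_input = level_output({5, 700, 65535})
```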
On Fri, Nov 22, 2013 at 5:13 PM, Doug King <[email protected]> wrote:

> I'm trying to follow here. Am I getting this right?
>
> The output of the CLA needs to be reconstructed for business reasons: we need a prediction or anomaly detection so we can get paid and do more of this.
>
> A reconstruction is a single SDR output of the CLA that represents the best fit for predicting the next step (t+1).
>
> A classifier is a super-set of reconstructed SDRs, with probabilities, out to t+n.
>
> We want to chain CLAs in a biology-like way (HTMs), and we need to feed the reconstruction (output) of one into another.
>
> It seems to me that a reconstructor or classifier collapses the temporal information, so we lose it. To actually keep the temporal information we need to "propagate information through synaptic weights; any claim to biological adherence demands that."
>
> Jeff alluded to this in one of the Office Hours videos. He said it would be very hard to do hierarchy correctly so as not to lose temporal information. Should I take this to mean that all synaptic info needs to be wired up to the next CLA's input?
>
> Are my assumptions / interpretations correct here?
>
> On Fri, Nov 22, 2013 at 3:54 PM, Ian Danforth <[email protected]> wrote:
>
>> Scott,
>>
>> Reconstruction was implemented because it was much closer to biology than any classifier. This holds true today. It may also be useful to learn feed-forward and feedback pathways independently; that would be biologically sound, and, as you suggest, we should move in that direction.
>>
>> 1. The classifier adds complexity and memory footprint that could be eliminated.
>> 2. Reconstruction also provides probabilities based on synaptic strengths, which is how the brain does it.
>> 3. It is true that the classifier provides direct multistep prediction, but this is an egregiously non-biological hack.
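As a concrete illustration of the multistep, probability-weighted prediction being debated, here is a toy lookup classifier. It is a simplified stand-in (hypothetical class and method names, not NuPIC's CLAClassifier API) that pairs the SDR active at time t with the raw input that arrives `steps` ticks later:

```python
from collections import Counter, defaultdict

class ToyStepClassifier:
    """Toy lookup classifier: learn which raw input tends to follow a
    given SDR `steps` time steps later, and return that as a probability
    distribution. Hypothetical names; not NuPIC's CLAClassifier API."""

    def __init__(self, steps=1):
        self.steps = steps
        self.history = []                    # SDRs from the last `steps` ticks
        self.table = defaultdict(Counter)    # SDR key -> Counter of later inputs

    def learn_and_infer(self, sdr, actual_input):
        key = frozenset(sdr)
        self.history.append(key)
        if len(self.history) > self.steps:
            past = self.history.pop(0)
            self.table[past][actual_input] += 1   # input seen `steps` after `past`
        counts = self.table[key]
        total = sum(counts.values())
        return {v: c / total for v, c in counts.items()} if total else {}

# Alternating sequence: after the SDR that co-occurs with "B",
# the value one step later is always "A" (and vice versa).
clf = ToyStepClassifier(steps=1)
for value, sdr in [("A", {1, 2}), ("B", {3, 4}), ("A", {1, 2}), ("B", {3, 4})]:
    pred = clf.learn_and_infer(sdr, value)
```

Running one instance per horizon (steps=1, 2, ...) gives the "multiple intervals, each with weighted predictions" behaviour described in the thread.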
>> The utility of a classifier is pretty high, but ultimately it is an almost inexcusable distraction from the advancement of the core theory. It should remain as one of a family of possible classifiers to put on the output of the SP or TP (which is very common in mainline ML research), but it shouldn't be part of any core logic.
>>
>> There is no question that a real solution here is based on the propagation of information through synaptic weights; any claim to biological adherence demands that.
>>
>> When we get around to putting the whitepaper up so the community can update it, a clear roadmap for either a feedback pathway, or a reconstruction pathway as a step toward feedback, will be required.
>>
>> Ian
>>
>> On Fri, Nov 22, 2013 at 2:59 PM, Scott Purdy <[email protected]> wrote:
>>
>>> Marek, there is a difference between reconstruction and feedback. There is no connection that I am aware of between the CLA theory and reconstruction (or the classifier, for that matter).
>>>
>>> The classifier and reconstruction are tools that we use for business purposes. So what pragmatic benefits do we get by switching from the classifier to reconstruction?
>>>
>>> Reconstruction gets us a single prediction for the following step. The classifier can provide the same, but it can also make multiple predictions with probabilities. It can be used to predict a different time step or time interval, and it can do so for multiple intervals, all of which come with weighted sets of predictions.
>>>
>>> So what practical benefit do you want to get from reconstruction?
>>>
>>> I am actually not opposed to reconstruction and think it is a little easier to reason about. But my proposal would be to leave the classifier as the default for prediction problems and implement reconstruction completely outside the spatial and temporal pooler code.
>>> I.e., rather than adding the code directly into the SP and TP classes as it was before, implement it in different files and have it simply inspect the SP/TP state. For me, it falls into the same category as the classifier and encoders: not part of the core CLA theory, but useful for applying the CLA to real problems.
>>>
>>> Marek, Ian: what are your thoughts on that? Are there any actual benefits to implementing reconstruction?
>>>
>>> On Fri, Nov 22, 2013 at 12:01 PM, Marek Otahal <[email protected]> wrote:
>>>
>>>> Hi Fergal, Ian,
>>>> thank you very much for these ideas.
>>>>
>>>> On Fri, Nov 22, 2013 at 7:02 PM, Fergal Byrne <[email protected]> wrote:
>>>>
>>>>> Hi Marek,
>>>>>
>>>>> Some good points there. I'll address them together rather than point by point, as they're intertwined.
>>>>>
>>>>> First, the classifier is an artefact used to derive useful information from the state of a region; it was never meant to be part of the CLA theory.
>>>>
>>>> Ok, I can accept that. It is useful for the practical purposes of some applications we build. However, if we could achieve the same with a reverse function at each layer, I'd be happier, because it would be more "correct". I use quotes here because the brain doesn't have to do either; it never turns patterns back into the original. But interestingly, the way it's hardwired in the brain, patterns are top-down reconstructable, while our (for speed) approach isn't.
>>>>
>>>>> The SP and TP are only separate in the current implementation for engineering reasons. You can use each alone if you like (useful for some applications), and you can also subclass each separately (useful for others).
>>>>
>>>> Yep, agreed.
>>>>
>>>>> We currently lack feedback and motor connections, as well as thalamically mediated connections; these will be added in time.
>>>>> We do model inhibition in NuPIC, and local inhibition is close enough to match how the neocortex does it.
>>>>
>>>> Yes, I agree it's close enough, but the fact that we don't keep inhibition permanences prevents us from computing the inverse of the feed-forward compute().
>>>>
>>>>> When discussing reconstruction of inputs, remember that it is the set of predictive cells which is used to identify the associated inputs, not the current set of active columns in the SP. There are no inhibited predictive cells to be consulted.
>>>>
>>>> Not sure I follow here...
>>>> In both the TP and SP case, the output is an SDR: an OR over active columns and predictive cells/columns. So are you saying that reconstruction should work even with the current inhibition model, that is, without inhibitory connections?
>>>>
>>>> Imagine I have two categories, "cats" and "dogs". The encoder transcribes them to a 1000-bit array: cats = bits 1-500 ON, dogs = bits 501-1000 ON. For the encoder to decode back, I need to produce an array of the same quality.
>>>>
>>>> The SP makes an SDR out of it: some 20 bits represent dogs, another 20 represent cats. Do these bits' synapses cover all 500 input bits? They should.
>>>>
>>>> So if I take a noisy input, say bits 1-250 and 400-600 ON, and feed it to the SP, I expect it to produce some 21-22 bits ON, where the majority (ca. 19 bits) represent the cats' bits. (Btw, I don't see any predictive columns in the SP(?).) Now I want to ask: given this SDR, what input is most likely? So I should take a random sub-sample back down to 20 ON bits (this will most likely give me a cats-only SDR), apply the permanences, and see which input synapses should have been active. There's a good chance it's the left 500.
>>>>
>>>> Will this work as a means for reconstruction?
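Marek's cats/dogs reconstruction can be sketched with NumPy. Everything here is a simplification: the "SP" is a hand-built connected-synapse matrix with top-K inhibition (columns 0-99 pool over the cats bits, columns 100-199 over the dogs bits), standing in for a trained spatial pooler:

```python
import numpy as np

rng = np.random.default_rng(0)
N_IN, N_COLS, K = 1000, 200, 20            # input bits, columns, active columns

# Hand-built stand-in for a *trained* SP: columns 0-99 pool over the
# 'cats' bits (0-499), columns 100-199 over the 'dogs' bits (500-999),
# each column connecting to 50 random input bits in its half.
connected = np.zeros((N_COLS, N_IN), dtype=bool)
for c in range(N_COLS):
    lo = 0 if c < 100 else 500
    connected[c, rng.choice(np.arange(lo, lo + 500), size=50, replace=False)] = True

def sp_compute(input_bits):
    """Top-K 'inhibition': the K columns with most active connected synapses win."""
    overlap = connected @ input_bits
    active = np.zeros(N_COLS, dtype=np.uint8)
    active[np.argsort(overlap)[-K:]] = 1
    return active

def reconstruct(active_cols, n_bits=500):
    """Run the feed-forward connections in reverse: each active column votes,
    through its connected synapses, for the input bits that likely drove it."""
    votes = connected.T @ active_cols
    out = np.zeros(N_IN, dtype=np.uint8)
    out[np.argsort(votes)[-n_bits:]] = 1
    return out

# Noisy input, as in the example above: bits 1-250 and 400-600 ON.
noisy = np.zeros(N_IN, dtype=np.uint8)
noisy[:250] = 1
noisy[399:600] = 1

sdr = sp_compute(noisy)      # the winning columns are all 'cats' columns
rec = reconstruct(sdr)       # reversing the synapses recovers mostly bits 1-500
```

Because the noisy input overlaps the cats half far more than the dogs half, the winning columns are cats columns, and running their connections in reverse recovers mostly the left 500 bits, which is the reconstruction behaviour being asked about.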
>>>>> We never need to reconstruct the inputs from the active columns; the input is what causes the current activation pattern.
>>>>
>>>> This is not true. I have a learning application where I want to use this as a CAM (content-addressable memory). Let's assume the date encoder encodes similar times with similar patterns (it does).
>>>> Now I will train:
>>>> {7:45am : breakfast}; {2pm : lunch}; {7pm : dinner};
>>>> and when I ask {7:48am : ???}, I can easily encode the 7:48 time and expect to get "breakfast", since 7:45 and 7:48 share most of their bits. This is a use case where I want to feed an (incomplete) SDR of active columns and get the input pattern back.
>>>>
>>>> Btw, an interesting article about Google's deep NNs; scary and encouraging at the same time! And I'm glad YouTube can auto-identify funny cat videos; that's a killer feature ;)
>>>>
>>>> Ian,
>>>> I'd like to know how much slow-down the reconstruction caused. I think reconstruction should be re-enabled (maybe with a switch to enable/disable its use (and its slowdown)), and the Classifier should not be part of nupic-core, but exported as a feature of the OPF model framework.
>>>>
>>>>> When we have feedback from higher regions affecting the activity of a region, we'll then have to employ reconstruction (using feed-forward permanences or a classifier) to get the associated "imagined" inputs.
>>>>>
>>>>> In the brain, we never need to do this, as the perception is the input as far as our minds see it. Similarly, we don't need a classifier, as the output is the identity when viewed from above.
>>>>>
>>>>> —
>>>>> Sent from Mailbox <https://www.dropbox.com/mailbox> for iPhone
>>>>>
>>>>> On Fri, Nov 22, 2013 at 5:41 PM, Marek Otahal <[email protected]> wrote:
>>>>>
>>>>>> In the following text I'll describe what (I think) the Classifier does, why I consider it wrong, and what can be done to avoid it.
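Marek's content-addressable-memory use case above (similar times sharing most bits, queried by overlap) can be sketched as follows; the encoder, bit widths, and `recall` helper are all made up for illustration, not the actual NuPIC date encoder:

```python
import numpy as np

def encode_time(minutes, n_bits=480, width=40):
    """Toy time-of-day encoder: a block of `width` ON bits whose position
    tracks the time of day, so nearby times share most of their bits.
    (Made-up stand-in for NuPIC's date encoder.)"""
    start = minutes * (n_bits - width) // (24 * 60)
    sdr = np.zeros(n_bits, dtype=np.uint8)
    sdr[start:start + width] = 1
    return sdr

# 'Train' the CAM by storing (pattern, label) pairs.
memory = [
    (encode_time(7 * 60 + 45), "breakfast"),   # 7:45am
    (encode_time(14 * 60), "lunch"),           # 2pm
    (encode_time(19 * 60), "dinner"),          # 7pm
]

def recall(query_sdr):
    """Return the stored label whose pattern overlaps the query the most."""
    return max(memory, key=lambda pair: int(pair[0] @ query_sdr))[1]

print(recall(encode_time(7 * 60 + 48)))       # prints 'breakfast'
```

The 7:48 pattern shares 39 of its 40 bits with the stored 7:45 pattern and none with lunch or dinner, so the overlap lookup returns "breakfast", which is exactly the incomplete-pattern-to-input behaviour described above.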
>>>>>> 1/ What is a Classifier:
>>>>>> The wiki doesn't say much; I gather it's a part of a Region that "translates SDRs back to input-space".
>>>>>>
>>>>>> Looking at the code in py/nupic/algorithms/CLAClassifier.py (and its C++ sibling), I see there's a compute() function that basically pairs an SDR (from a lower layer) with the input that caused it. Am I right?
>>>>>>
>>>>>> *) There's a KNNClassifier, which uses the k-nearest-neighbours algorithm, and the CLAClassifier... which uses what? An SP? A feed-forward NN seems to be a good candidate for such an implementation of a classifier.
>>>>>>
>>>>>> 2/ Why I consider it "wrong":
>>>>>>
>>>>>> 2.1/ The Classifier does not have a biological counterpart like other parts of HTM/CLA do.
>>>>>>
>>>>>> The chain:
>>>>>> input --> encoder --> SP --> TP --> Classifier??!
>>>>>>
>>>>>> The input can be whatever we have sensors to perceive, e.g. a "sound wave"; it can be of any possible data type (the input-space).
>>>>>>
>>>>>> The encoder is the function of the sensory organ, e.g. "the cochlea translates vibrations into electrochemical pulses in the cells"; it translates from the input-space to a bit-vector (not an SDR, however).
>>>>>>
>>>>>> SP+TP: these are combined together in the brain in a (micro-)region; they both accept and produce an SDR.
>>>>>>
>>>>>> The Classifier does not have a counterpart, as the brain has no need to translate back to the input-space. We, however, do, for CLAs to be useful in practical problems.
>>>>>>
>>>>>> 2.2/ The lack of top-down compute in the SP and TP breaks modularity.
>>>>>>
>>>>>> The Encoder has encode() and decode()/topDownCompute() methods. The SP and TP don't. To make these two useful building blocks, it would be necessary to have an inverse of compute() in them too.
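The encode()/decode() pairing Marek points to can be illustrated with a toy category encoder. This is a hypothetical plain-Python example following the 1000-bit cats/dogs layout used earlier in the thread, not NuPIC's actual encoder API:

```python
# Toy category encoder with both encode() and its inverse decode(),
# following the 1000-bit cats/dogs layout from earlier in the thread.
# Hypothetical example; not NuPIC's actual encoder API.

CATEGORIES = ["cats", "dogs"]
BITS_PER_CATEGORY = 500

def encode(category):
    """Category name -> 1000-bit vector (cats: bits 0-499, dogs: 500-999)."""
    bits = [0] * (len(CATEGORIES) * BITS_PER_CATEGORY)
    i = CATEGORIES.index(category)
    for b in range(i * BITS_PER_CATEGORY, (i + 1) * BITS_PER_CATEGORY):
        bits[b] = 1
    return bits

def decode(bits):
    """Inverse of encode(): pick the category whose block has the most ON
    bits, so even a noisy or partial vector decodes to the best match."""
    scores = [sum(bits[i * BITS_PER_CATEGORY:(i + 1) * BITS_PER_CATEGORY])
              for i in range(len(CATEGORIES))]
    return CATEGORIES[scores.index(max(scores))]
```

An SP/TP inverse of compute() would play the analogous role: map an output SDR back toward the feed-forward input that most plausibly produced it.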
>>>>>> In nature, there are four types of connections between cells/neurons in the brain. Two vertical: feed-forward connections, feeding input to the higher layer, and feedback (recurrent) connections, feeding more stable patterns to lower layers.
>>>>>>
>>>>>> And two horizontal: predictive connections (used in the TP) and inhibitory connections (missing in NuPIC).
>>>>>>
>>>>>> The inhibitory connections are missing for performance reasons (I think), and we use global/local n-best inhibition instead. This makes it impossible to reconstruct, from a list of active columns (an SDR), the input that caused it. If we had them (the inhibitory permanences), we could use them in reverse to boost the columns that had been silenced, and then, from these "active+" columns, turn ON the appropriate synapses according to the permanences.
>>>>>>
>>>>>> Such an implementation would be slower, but interesting for bio-inspired research; it would allow the SDR to be the messaging format between parts of the CLA (even different implementations) and reduce the possible space for errors happening in the Classifier.
>>>>>>
>>>>>> Are my thoughts proper/wrong somewhere? What do you think?
>>>>>> Cheers,
>>>>>> Marek
>>>>>>
>>>>>> --
>>>>>> Marek Otahal :o)
>>>>>
>>>>> _______________________________________________
>>>>> nupic mailing list
>>>>> [email protected]
>>>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>>>>
>>>> --
>>>> Marek Otahal :o)
