To be clear, when I say reconstruction in this email I am talking about the
process of 1) taking the columns for each predicted cell in the TP, 2)
selecting the input bits connected to those corresponding SP
coincidences/columns (possibly using the connectedness as a weight), and 3)
using the encoders to select the value closest to the selected input bits as
the predicted value.
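A rough sketch of those three steps (hypothetical code, not the actual NuPIC API: `sp_permanences` is assumed to be a dense columns-by-input-bits permanence matrix, and the encoder's decode step is reduced to a nearest-encoding lookup over candidate values):

```python
import numpy as np

def reconstruct_votes(predicted_columns, sp_permanences, connected_threshold):
    """Steps 1+2: accumulate weighted votes for input bits from the
    SP permanences of the columns holding predicted cells."""
    votes = np.zeros(sp_permanences.shape[1])
    for col in predicted_columns:
        perms = sp_permanences[col]
        connected = perms >= connected_threshold   # connected synapses only
        votes[connected] += perms[connected]       # connectedness as weight
    return votes

def closest_value(votes, candidate_encodings):
    """Step 3: stand-in for the encoder's decode - pick the candidate
    value whose encoding overlaps the voted input bits the most."""
    return max(candidate_encodings,
               key=lambda v: float(np.dot(votes, candidate_encodings[v])))

# Toy example: 2 columns over a 4-bit input space.
perms = np.array([[0.6, 0.1, 0.7, 0.0],
                  [0.0, 0.8, 0.0, 0.9]])
votes = reconstruct_votes([0], perms, connected_threshold=0.5)
encodings = {"A": np.array([1, 1, 0, 0]), "B": np.array([0, 0, 1, 1])}
print(closest_value(votes, encodings))  # prints "B"
```

The point of the sketch is only that reconstruction is a pure read of SP state plus an encoder lookup; it never touches the active or predicted cells themselves.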

On Fri, Nov 22, 2013 at 3:54 PM, Ian Danforth <[email protected]>
wrote:

> Scott,
>
>  Reconstruction was implemented because it was much closer to biology
than any classifier.

I think your rationale here is that there is information propagation
"downwards" in the brain and that this is what we are doing with
reconstruction. That may be true, but the way that the information is used
in reconstruction is different than the way it is used in feedback or other
processes that actually happen in the brain. So it is true that information
flows downward, but not true that the brain performs reconstruction with
that information.

Here is a good test - write down the benefits of feedback in the brain and
then see which of those things are solved by reconstruction. Then do the
reverse. It seems to me that they are solving completely orthogonal
problems.

The classifier and reconstruction are not methods of prediction (they have
absolutely no effect on current or future active or predicted cells).
Instead, they are methods for turning internal CLA predicted cells into
some other value type. For a time series prediction problem, this usually
means converting predicted TP cells into a scalar value. This is the crux
of the issue - we are doing something that doesn't have an analogy in the
human body. We are doing it to solve a business problem.

> This holds true today. It may also be useful to learn feed forward and
feed back pathways independently, and that would be biologically sound; as
you suggest, we should move in that direction.

Yes, feedback that matches the theory would be great to implement. It will
be fun to work on with hierarchy.

> 1. The classifier adds complexity and memory footprint that could be
eliminated.

You absolutely should not use the classifier. Unless you are attempting to
solve a business problem that it works well for, in which case you should
absolutely use the classifier.

>
> 2. Reconstruction also provides probabilities based on synaptic
strengths, which is how the brain does it
>
> 3. It is true that the classifier provides direct multistep prediction,
but this is an egregiously non-biological hack.
>
>
> The utility of a classifier is pretty high, but ultimately it is an almost
inexcusable distraction for the advancement of the core theory. It should
remain as one of a family of possible classifiers to put on the output of
the SP or TP (which is very common in mainline ML research) but shouldn't
be part of any core logic.

Agreed that it should not be included as part of the CLA theory.

>  There is no question that a real solution here is based on the
propagation of information through synaptic weights; any claim to
biological adherence demands that.
>
>  When we get around to putting the whitepaper up so the community can
update it, a clear roadmap for either a feedback pathway or a
reconstruction pathway as a step to feedback will be required.
>
>
> Ian
>
>
> On Fri, Nov 22, 2013 at 2:59 PM, Scott Purdy <[email protected]> wrote:
>>
>> Marek, there is a difference between reconstruction and feedback. There
is no connection that I am aware of between the CLA theory and
reconstruction (or the classifier for that matter).
>>
>> The classifier and reconstruction are tools that we use for business
purposes. So what are the pragmatic benefits we get by switching from the
classifier to reconstruction?
>>
>> Reconstruction gets us a single prediction for the following step. The
classifier can provide the same. But it can make multiple predictions with
probabilities. And it can be used to predict a different time step or time
interval. And it can do so for multiple intervals and all of these come
with weighted sets of predictions.
>>
>> So what practical benefit do you want to get from reconstruction?
>>
>> I am actually not opposed to reconstruction and think it is a little
easier to reason about. But my proposal would be to leave the classifier as
the default for prediction problems and implement reconstruction completely
outside the spatial and temporal pooler code. That is, rather than adding the
code directly into the SP and TP classes as it was before, implement it
in separate files and have it simply inspect the SP/TP state. For me, it
falls into the same category as the classifier and encoders - not part of
the core CLA theory but useful for applying the CLA to real problems.
>>
>> Marek, Ian - what are your thoughts on that? Are there any actual
benefits of implementing reconstruction?
>>
>>
>> On Fri, Nov 22, 2013 at 12:01 PM, Marek Otahal <[email protected]>
wrote:
>>>
>>> Hi Fergal, Ian,
>>> thank you very much for these ideas..
>>>
>>>
>>> On Fri, Nov 22, 2013 at 7:02 PM, Fergal Byrne <
[email protected]> wrote:
>>>>
>>>> Hi Marek,
>>>>
>>>> Some good points there. I'll address them together rather than point
by point, as they're intertwined.
>>>>
>>>> First, the classifier is an artefact used to derive useful information
from the state of a region, it's never meant to be part of the CLA theory.
>>>
>>>
>>> Ok, I can accept that. It is useful for the practical purposes of some
applications we build. However, if we could achieve the same with a reverse
function at each layer, I'd be happier - because it would be more
"correct". I use quotes here because the brain doesn't have to do either -
it never turns patterns back into the original. But interestingly, the way
it's hardwired in the brain, patterns are top-down reconstructable, while our
(speed-optimized) approach isn't.
>>>
>>>>
>>>>
>>>> The SP and TP are only separate in the current implementation for
engineering reasons. You can use each alone if you like (useful for some
applications) and you can also subclass each separately (useful for
others).
>>>
>>>
>>> Yep, agreed.
>>>
>>>>
>>>>
>>>> We currently lack feedback and motor connections, as well as thalamic
mediated connections; these will be added in time.
>>>>
>>>> We do model inhibition in NuPIC and local inhibition is close enough
to match how the neocortex does it.
>>>
>>>
>>> Yes, I agree it's close enough, but the fact that we don't keep the
inhibition permanences prevents us from computing the inverse of the
feed-forward compute().
>>>
>>>>
>>>>
>>>> When discussing reconstruction of inputs, remember that it is the set
of predictive cells which are used to identify the associated inputs, not
the current set of active columns in the SP. There are no inhibited
predictive cells to be consulted.
>>>
>>>
>>> Not sure I follow here...
>>> In both the TP and SP case, the output is an SDR - an OR of active columns
and predictive cells/columns. So, what you're saying is that reconstruction
should work even with the current inhibition model, that is, without inhibitory
connections?
>>>
>>> Imagine I have two categories - "cats" and "dogs". The encoder transcribes
them to a 1000-bit array; cats = bits 1-500 ON, dogs = bits 501-1000 ON. For the
encoder to decode back, I need to produce an array of the same kind.
>>>
>>> The SP makes an SDR out of it: some 20 bits for dogs, another 20
represent cats. Do these bits' synapses cover all 500 input bits? They
should.
>>>
>>> So if I take a noisy input (bits 1-250 ON and 400-600 ON) and feed it to
the SP, I expect it to produce some 21-22 bits ON, where the majority (circa
19 bits) represents the cats' bits. (Btw, I don't see any predictive columns
in the SP(?)) Now I want to ask: given this SDR, which output is most likely?
So I should take a random sub-sample back down to 20 ON bits (this will most
likely give me a cats-only SDR), apply the permanences, and see which input
synapses should have been active. There's a good chance it's the first 500
ones.
>>>
>>> Will this work as a means for reconstruction?
>>>
>>>
>>>> We never need to reconstruct the inputs from the active columns - the
input is what causes the current activation pattern.
>>>
>>>
>>> This is not true. I have a learning application where I want to use
this as CAM memory. Let's assume the date encoder encodes similar times with
similar patterns (it does).
>>> Now I will train:
>>> {7:45am : breakfast}; {2pm : lunch}; {7pm : dinner};
>>> and I want to ask {7:48am : ???}. I can easily encode the 7:48 time and
expect to get "breakfast", as 7:45 and 7:48 share most of the bits. This is
a use case where I want to feed (incomplete) SDR active columns and expect to
get the input pattern back.
>>>
>>>
>>>
>>> Btw,
interesting article about Google's deep NNs, scary and encouraging at the
same time! And I'm glad YouTube can auto-identify funny cat videos; that's
a killer feature ;)
>>>
>>>
>>> Ian,
>>> I'd like to know how much slow-down the reconstruction caused. I think
reconstruction should be re-enabled (maybe with a switch to enable/disable
its use (and the slowdown)) and the Classifier should not be part of
nupic-core, but exported as a feature of the OPF model framework.
>>>
>>>
>>>
>>>> When we have feedback from higher regions affecting the activity of a
region, we'll then have to employ reconstruction (using feed forward
permanences or a classifier) to get the associated "imagined" inputs.
>>>>
>>>> In the brain, we never need to do this, as the perception is the input
as far as our minds see it. Similarly, we don't need a classifier, as the
output is the identity when viewed from above.
>>>> —
>>>> Sent from Mailbox for iPhone
>>>>
>>>>
>>>> On Fri, Nov 22, 2013 at 5:41 PM, Marek Otahal <[email protected]>
wrote:
>>>>>
>>>>> In the following text I'll describe what (I think) the Classifier does,
>>>>> why I consider it wrong, and what can be done to avoid it.
>>>>>
>>>>> 1/ What is a Classifier:
>>>>> The wiki doesn't say much; I gather it's a part of the Region that
"translates SDRs back to input-space".
>>>>>
>>>>> Looking at the code in py/nupic/algorithms/CLAClassifier.py (and its
C++ sibling) I see there's a function compute() that basically pairs an SDR
(from the lower layer) with the input that caused it, am I right?
>>>>>
>>>>> *) There's a KNNClassifier, which uses the k-nearest-neighbours
algorithm, and CLAClassifier... which uses what? An SP? A feed-forward NN
seems to be a good candidate for such an implementation of a classifier.
>>>>>
>>>>>
>>>>>
>>>>> 2/ Why I consider it "wrong"
>>>>>
>>>>> 2.1/ Classifier does not have a biological counterpart like other
parts of HTM/CLA do.
>>>>>
>>>>> the chain :
>>>>>  input --> encoder --> SP --> TP --> Classifier??!
>>>>>
>>>>> input can be whatever we have sensors to perceive, e.g. a "sound wave";
it can be of any possible data type - the input-space
>>>>>
>>>>> encoder is the function of the sensory organ - e.g. "the cochlea
translates vibrations into electrical pulses on the cells"; it translates
from the input-space to a bit-vector (not an SDR, however)
>>>>>
>>>>> SP+TP: these are combined in the brain in a (micro-)region;
>>>>> they both accept and produce an SDR
>>>>>
>>>>> The Classifier does not have a counterpart, as the brain has no need to
translate back to the input-space. We, however, do need this for CLAs to be
useful in practical problems.
>>>>>
>>>>>
>>>>> 2.2/ lack of top-down compute in SP, TP breaks modularity.
>>>>>
>>>>> The Encoder has encode() and decode()/topDownCompute() methods. The SP
and TP don't. To make these two useful building blocks, it would be
necessary to have an inverse of compute() in them too.
>>>>>
>>>>> In nature, there are four types of connections between cells/neurons
in the brain. Two vertical: feed-forward, feeding input to the higher layer,
and recurrent, feeding more stable patterns to lower layers.
>>>>>
>>>>> And two horizontal: predictive connections (used in the TP) and
inhibitory connections (missing in NuPIC).
>>>>>
>>>>> The inhibitory connections are missing for performance reasons (I
think) and we use global/local n-best inhibition instead. This makes it
impossible to reconstruct, from a list of active columns (an SDR), the
input that caused it. If we had them (the inhibitory permanences), we could
use them in reverse to boost the columns that have been silenced, and from
these "active+" columns turn ON the appropriate synapses according to the
permanences.
>>>>>
>>>>>
>>>>>
>>>>> Such an implementation would be slower, but interesting for
bio-inspired research; it would allow the SDR to be the messaging format
between parts of the CLA (even different implementations) and reduce the
possible space for errors happening in the Classifier.
>>>>>
>>>>> Are my thoughts proper/wrong somewhere? What do you think?
>>>>>
>>>>> Cheers, Mark
>>>>>
>>>>>
>>>>> --
>>>>> Marek Otahal :o)
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> nupic mailing list
>>>> [email protected]
>>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>>>>
>>>
>>>
>>>
>>> --
>>> Marek Otahal :o)
>>>
>>
>>
>
>