Cheers Scott,

I'm right with you on all points you make. It could be a swarming
parameter.

The only problem I have is with the so-called adaptive encoder, which resets
everything if it gets one value out of range, and which will usually destroy
all your learning. A fixed encoder which burns out outliers is better than
that, and an encoder which slides out statistically to match a widening
range of data is better still.

Just to answer your points (I like doing that; I pride myself on being a
pedant):

1. You can slide back in if the data says so, and then you get more
precision. The learning should be relative anyway.
2. Once you decide your n and w, the mapping adapts to the data, and
changes only when you get many out-of-range values (and recontracts when
you have few). If something is really noise, it'll get damped out.
3. That's fine, but you lose topology that way. The brain uses topology
when it is useful.
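To make that concrete, here's a minimal sketch of the kind of sliding encoder I mean. Everything here (the class name, the 10% trigger, the adaptation rate) is my own illustration for this thread, not NuPIC code:

```python
import collections

class SlidingScalarEncoder:
    """Scalar encoder whose [minval, maxval] range drifts slowly to follow
    the statistics of the data, instead of resetting on a single outlier.
    Hypothetical illustration only -- not the NuPIC implementation."""

    def __init__(self, n=500, w=20, minval=0.0, maxval=100.0,
                 window=100, rate=0.02):
        self.n, self.w = n, w
        self.minval, self.maxval = minval, maxval
        self.recent = collections.deque(maxlen=window)  # recent raw values
        self.rate = rate  # fraction of the gap closed per adaptation step

    def encode(self, value):
        self.recent.append(value)
        self._adapt()
        # Clamp ("burn out") values still outside the current range.
        v = min(max(value, self.minval), self.maxval)
        span = self.maxval - self.minval
        # Map the clamped value to the first of w consecutive active bits.
        first = int((v - self.minval) / span * (self.n - self.w))
        bits = [0] * self.n
        for i in range(first, first + self.w):
            bits[i] = 1
        return bits

    def _adapt(self):
        # Slide the range out only when *many* recent values fall outside
        # it, so a lone outlier cannot destroy the existing semantics.
        lo, hi = min(self.recent), max(self.recent)
        above = sum(1 for v in self.recent if v > self.maxval)
        below = sum(1 for v in self.recent if v < self.minval)
        if above > 0.1 * len(self.recent):
            self.maxval += self.rate * (hi - self.maxval)
        if below > 0.1 * len(self.recent):
            self.minval -= self.rate * (self.minval - lo)
```

Recontraction on a narrowing range would work symmetrically; the point is that the mapping only drifts when the statistics of the data demand it.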

If there's any consensus about how our brains work, it is that we measure
things by their shape, by their relative position or amplitude, not by
precise absolutes. This presupposes a relativistic, statistically based
(and temporally adaptive) encoding system.
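For comparison, here's my own toy reading of the random-bucket scheme Jeff describes below (the bucket width and seed are arbitrary; this is a sketch, not the proposed implementation):

```python
import random

class RandomBucketEncoder:
    """Toy version of the scheme Jeff describes: buckets are created on
    demand as values grow, and each bucket is assigned to one of the n
    output bits at random, so the encoder never runs out of range.
    Illustration only -- not the actual proposed code."""

    def __init__(self, n=500, w=20, bucket_width=1.0, seed=42):
        self.n, self.w = n, w
        self.bucket_width = bucket_width
        self.rng = random.Random(seed)
        self.bucket_to_bit = {}  # bucket index -> randomly chosen bit

    def _bit_for_bucket(self, b):
        # Assign a bit the first time a bucket is seen, then keep the
        # assignment stable so learned patterns keep their meaning.
        if b not in self.bucket_to_bit:
            self.bucket_to_bit[b] = self.rng.randrange(self.n)
        return self.bucket_to_bit[b]

    def encode(self, value):
        # A value activates the w consecutive buckets it overlaps.
        first = int(value // self.bucket_width)
        bits = [0] * self.n
        for b in range(first, first + self.w):
            bits[self._bit_for_bucket(b)] = 1
        return bits
```

Note that under this scheme two values far apart on the number line share bits only by chance, which is exactly the loss of topology I mean in point 3 above.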

Regards

Fergal Byrne






On Sat, Oct 26, 2013 at 1:10 AM, Scott Purdy <[email protected]> wrote:

> Fergal, the encoder you describe mitigates the issues but does not solve
> them. Three reasons I like Jeff's solution (although I agree with Ian that
> we should test them):
>
> 1. With centroids or changing buckets, you lose learning. There is no way
> around this. It might happen slowly, but it still happens. And it is
> particularly problematic for smaller data sets.
> 2. With a fixed number of centroids or buckets that have changing ranges,
> you may use all of your buckets to represent noise. Since there is
> currently no way to know what is noise, you don't know if a small section
> that you see a lot of data in needs many buckets to represent the different
> values or if the values should all be thrown into the same bucket. In my
> experiments with the "arithmetic" encoder this proved to make it less
> effective than a fixed scalar encoder. Perhaps a centroid or other approach
> would not fall victim to this issue but it seems to me they would.
> 3. Jeff's encoder reuses the same bits but in different combinations. This
> is very similar to how cells in a TP column are reused once all of the
> cells have been used. In this sense I find it very elegant.
>
>
> On Fri, Oct 25, 2013 at 4:56 PM, Ian Danforth <[email protected]> wrote:
>
>> And of course the correct answer is "Try both!" and see which one works
>> better. And perhaps an even better approach is to actively normalize data.
>> Band-pass filters and automatic gain control are present in most
>> biological senses. Why re-evolve the range of your rods when you can evolve
>> an iris?
>>
>> Ian
>>
>>
>> On Fri, Oct 25, 2013 at 4:41 PM, Fergal Byrne <
>> [email protected]> wrote:
>>
>>> Hi Jeff,
>>>
>>> The lads mentioned this in the Sprint Meeting, but more or less said to
>>> wait until we'd heard from you.
>>>
>>> I disagree that this is a vastly complex problem, and I think there
>>> is a way to avoid the problems you raise. If you consider the idea that
>>> each bit represents a centroid of the range with a radius, then the case
>>> where your range must be extended is just a situation of adjusting the
>>> centroids and radii of each bit, so that they shift gradually to
>>> accommodate the new range of data.
>>>
>>> You can do this based on the statistics of the data, allowing the
>>> centroids to spread out quite slowly when new out-of-range data are
>>> encountered, and have an algorithm which gradually spreads out the meaning
>>> of all the bits, and enlarges the radii as new min or max values are
>>> encountered. So, if you get a couple of new values larger than the max,
>>> you let those values "burn out" the top encodings, but increment the
>>> centroid values of all bits a little in response. If lots of values keep
>>> appearing above the max, this gives rise to a gradually enlarging spread
>>> for the encoder, while all bits slowly change their semantic meaning for
>>> the SP.
>>>
>>> The patterns and sequences already learned by the region would migrate
>>> gradually to the slowly evolving range, but would retain all their learned
>>> understanding of the data space. There would not be any sudden shift in the
>>> semantics of any input bit, or any column.
>>>
>>> If you treat an outlier as such, in other words give it a statistical
>>> weight, based on how often it occurs in the data, that determines how much
>>> it shifts the min and max, then you should be able to gradually adjust the
>>> interpretation of the mapping from input scalar value to encoder bits.
>>>
>>> Regards,
>>>
>>> Fergal Byrne
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Oct 25, 2013 at 10:58 PM, Jeff Hawkins <
>>> [email protected]> wrote:
>>>
>>>>  We have had an issue with our encoders for some time.  I recently
>>>> came up with a solution that I want to share.  It would make a good
>>>> smallish project, something that could be done during the hack-a-thon for
>>>> example.
>>>>
>>>> *The Problem*
>>>>
>>>> Our current encoder needs to know in advance the max and min value it
>>>> will represent.  We usually look through any historical data we have to
>>>> find the max and min and add some for safety.  A problem occurs if the
>>>> range of actual values is greater than we anticipated.  It isn’t uncommon
>>>> for numbers to grow over time.  If we just change the encoder to represent
>>>> a larger range it will mess up all the previous learning in the CLA.
>>>>
>>>> It is analogous to how our cochlea evolved to represent 20 to 20KHz for
>>>> humans.  If we needed to start hearing patterns above 20KHz our cochlea
>>>> wouldn’t cut it.  If we replaced it with a new cochlea that had an extended
>>>> range then all of our previous auditory learning would be lost.
>>>>
>>>> We toyed with the idea of slowly modifying the encoder so all learning
>>>> wouldn’t be lost at once, but this has problems.  We didn’t have a good
>>>> solution for this problem.
>>>>
>>>> *The Proposed Solution*
>>>>
>>>> Let’s say our encoder produces a 500 bit output of which 20 bits are
>>>> active at once.  Recall that each bit represents a small span of the number
>>>> line (we refer to this as a “bucket”).  Adjacent bits in the output
>>>> represent overlapping buckets.  Any input number overlaps twenty buckets.
>>>> As the input value increases one bit will turn to zero and another will
>>>> become one.
>>>>
>>>> Now imagine the input value approaches and then exceeds the max value
>>>> of the encoder.  We have no more bits to encode the new high value, no more
>>>> buckets.  Today we represent any value over the max the same as the max.
>>>> This isn’t good.
>>>>
>>>> The solution is to continue creating new buckets beyond the max value
>>>> and assign them to one of the existing 500 bits *at random*.  As soon
>>>> as we do this, encoder bits will start representing two different ranges.
>>>> They will be assigned to two different buckets, the original one and the
>>>> new one that is above the max value.  It is important that the new extended
>>>> buckets are assigned to existing bits randomly.
>>>>
>>>> Imagine we have encoded a new value that is above our max.  It is
>>>> represented by 20 new buckets that have been randomly assigned to twenty
>>>> bits.  The original bucket ranges for the 20 bits representing the new high
>>>> value are not overlapping, but the new bucket ranges are overlapping.
>>>> Therefore the spatial pooler will not get confused by bits having two or
>>>> more ranges.  I am at a bit of a loss for words to describe exactly why
>>>> this is so; hopefully you can see why this works.  If not, I can try to
>>>> describe it further.
>>>>
>>>> The cleanest way to implement this might be to throw away the idea that
>>>> the first 500 buckets are assigned to adjacent bits.  Instead just start
>>>> assigning buckets to random bits and keep going as far as you need to.
>>>> This will eliminate edge issues.
>>>>
>>>> If you are interested in tackling this project, Scott at Numenta has
>>>> volunteered to provide assistance.  He can point you to the correct code
>>>> and help in other ways.
>>>>
>>>> Jeff
>>>>
>>>> _______________________________________________
>>>> nupic mailing list
>>>> [email protected]
>>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>>>>
>>>>
>>>
>>>
>>> --
>>>
>>> Fergal Byrne
>>>
>>> <http://www.examsupport.ie>Brenter IT
>>> [email protected] +353 83 4214179
>>> Formerly of Adnet [email protected] http://www.adnet.ie
>>>
>>>
>>>
>>
>>
>>
>
>
>


-- 

Fergal Byrne

<http://www.examsupport.ie>Brenter IT
[email protected] +353 83 4214179
Formerly of Adnet [email protected] http://www.adnet.ie
