Re: [nupic-discuss] Fun challenge: constant memory RDSE

Fergal Byrne Tue, 18 Feb 2014 12:21:32 -0800

Yes, but then you'd have to save all your buckets between runs of the
encoder, and reload them each time you use it. The SP is depending on those
encodings never changing, and for example if you decided to aggregate
values, or choose every nth one, you'd break the constancy of the encoding.


My workaround means that every encoding is identical no matter how you give
it data - it reconstructs all needed buckets the first time you give it
data, and continues to do so for every new one. The random number generator
provides everything you need to rebuild the encoding, so saving or
reloading necessary.

Regards

Fergal Byrne


On Tue, Feb 18, 2014 at 8:13 PM, Chetan Surpur <[email protected]> wrote:

> As long as the encoder maintains the mappings for the currently-existing
> buckets, why must the new encodings be independent of the order of
> presentation of the data?
>
>
> On Tue, Feb 18, 2014 at 12:10 PM, Fergal Byrne <
> [email protected]> wrote:
>
>> Hi Chetan,
>>
>> No, but the encodings should always be independent of the order of
>> presentation of the data, so it's a bug if they're not.  My code includes a
>> workaround which builds buckets out in both directions from a predefined
>> centre value until it encompasses each new value. This guarantees the same
>> encoding regardless of which values come in when. You could easily add a
>> version of this to your encoder, it's a small overhead for ensuring
>> identical encodings.
>>
>> This is the C4 idea in action - argue with patches... Just shows you how
>> useful an executable document can be when you're experimenting!
>>
>> Regards,
>>
>> Fergal
>>
>>
>> On Tue, Feb 18, 2014 at 6:55 PM, Chetan Surpur <[email protected]>wrote:
>>
>>> Oh, my mistake, I misunderstood the question. I thought Fergal was
>>> asking if the order has to be presented in a certain order to get correct
>>> results (results that have the desired overlap properties).
>>>
>>> So yes, order dependence exists in the currently implemented encoder,
>>> but it shouldn't affect correctness.
>>> On Feb 18, 2014 10:51 AM, "Scott Purdy" <[email protected]> wrote:
>>>
>>>> Fergal, I believe the implementation in NuPIC is dependent on the order
>>>> of data. Why do you ask? The constant-memory design I have brought up here
>>>> do not exist in NuPIC.
>>>>
>>>> Chetan, you are right that it extends separately in each directly but I
>>>> believe that the randomness is shared so the order of the data would affect
>>>> it. It wouldn't be difficult to change that though. But it also doesn't
>>>> really solve any problems.
>>>>
>>>>
>>>> On Tue, Feb 18, 2014 at 10:39 AM, Chetan Surpur <[email protected]>wrote:
>>>>
>>>>> From reading the code, it looks to me that the generation of buckets
>>>>> happens on the left and right boundaries of the currently-existing 
>>>>> buckets,
>>>>> and extends the boundaries to create buckets as necessary. Thus, it
>>>>> shouldn't matter what order the data is presented. Subutai can correct me
>>>>> if I'm mistaken.
>>>>>
>>>>>
>>>>> On Tue, Feb 18, 2014 at 10:35 AM, Fergal Byrne <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Cheers Scott.
>>>>>>
>>>>>> Have you checked the NuPIC implementation for dependence on the order
>>>>>> of data presented? My python isn't up to that ;{
>>>>>>
>>>>>> Regards
>>>>>>
>>>>>> Fergal Byrne
>>>>>>
>>>>>>
>>>>>> On Tue, Feb 18, 2014 at 5:37 PM, Scott Purdy <[email protected]>wrote:
>>>>>>
>>>>>>> Oh and Chetan's proposal is good but it still has memory and time
>>>>>>> constraints that are linear with the number of buckets (but it doesn't 
>>>>>>> have
>>>>>>> to keep the memory in use in between invocations).
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Feb 18, 2014 at 9:36 AM, Scott Purdy <[email protected]>wrote:
>>>>>>>
>>>>>>>> Thanks for the details on your implementation Fergal!
>>>>>>>>
>>>>>>>> Just to be clear, it is possible to create a constant memory and
>>>>>>>> constant time solution (assuming fixed w). The one that I came up with 
>>>>>>>> does
>>>>>>>> not use all nCw combinations of active bits though. Instead it uses 
>>>>>>>> (n/2)Cw
>>>>>>>> * (n/2)Cw.
>>>>>>>>
>>>>>>>> I am hoping someone can find a solution with the same time/memory
>>>>>>>> bounds but with a higher entropy solution. IE one that will have random
>>>>>>>> collisions less frequently.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Feb 18, 2014 at 1:01 AM, Fergal Byrne <
>>>>>>>> [email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi Scott, Chetan,
>>>>>>>>>
>>>>>>>>> Great thought experiment, Scott.
>>>>>>>>>
>>>>>>>>> Someone looking at my Clojure code [1] (a Clojure expert, not one
>>>>>>>>> on NuPIC) had some issues with my using a mutable data structure (ie
>>>>>>>>> memory) so much. I looked at ways of eliminating it, but there's no 
>>>>>>>>> simple
>>>>>>>>> way to do it unless you know that you'll never go backwards on the 
>>>>>>>>> number
>>>>>>>>> line. This also means that the encodings are dependent on the order in
>>>>>>>>> which you present the data to the encoder.
>>>>>>>>>
>>>>>>>>> For example, let's say the first value encoded is 0 (without loss
>>>>>>>>> of generality). If you have to encode 1 next, it will get the second 
>>>>>>>>> code,
>>>>>>>>> 2 will get the next one, and so on. But if -1 is provided after 1, 
>>>>>>>>> it'll
>>>>>>>>> get the next code (or at least some distortion of it) and thus 2 will 
>>>>>>>>> be
>>>>>>>>> encoded differently.
>>>>>>>>>
>>>>>>>>> This means that every encoder must remember its buckets in order
>>>>>>>>> to give back the same encoding for previously computed values, or else
>>>>>>>>> remember the entire sequence of values and rerun their computations 
>>>>>>>>> each
>>>>>>>>> time (which may cost more memory if many values per bucket must be 
>>>>>>>>> stored).
>>>>>>>>>
>>>>>>>>> I've added a test/demo for this to my document.
>>>>>>>>>
>>>>>>>>> Update: If you decide on 0 as a centre, you can precalculate bands
>>>>>>>>> of buckets out to your first data value (and repeat this for each new 
>>>>>>>>> one),
>>>>>>>>> which ensures the encoding is always the same:
>>>>>>>>>
>>>>>>>>> e.g given 22 as the first datum, generate buckets for -10...10,
>>>>>>>>> -20...20, -30...30 and return encoding(22).
>>>>>>>>>
>>>>>>>>> You could choose a different centre if you know more about your
>>>>>>>>> data. I've detailed this idea in the doc.
>>>>>>>>>
>>>>>>>>> [1] http://fergalbyrne.github.io/rdse.html
>>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>>
>>>>>>>>> Fergal Byrne
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Feb 18, 2014 at 6:06 AM, Chetan Surpur <[email protected]
>>>>>>>>> > wrote:
>>>>>>>>>
>>>>>>>>>> For this problem, this looks useful:
>>>>>>>>>> http://en.wikipedia.org/wiki/Linear_feedback_shift_register
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Feb 17, 2014 at 6:01 PM, Chetan Surpur <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> A very simple approach would be to trade speed for memory.
>>>>>>>>>>> Instead of storing a map between buckets and SDRs, we can go 
>>>>>>>>>>> through the
>>>>>>>>>>> bucket generation process every time we want to find the SDR for a 
>>>>>>>>>>> bucket.
>>>>>>>>>>> From what I understand, this bucket generation process is linear in 
>>>>>>>>>>> speed
>>>>>>>>>>> with the number of buckets you want to generate. So the linear 
>>>>>>>>>>> memory
>>>>>>>>>>> requirement would be translated into a linear speed requirement.
>>>>>>>>>>>
>>>>>>>>>>> In a nutshell, walk through the number line, generating buckets
>>>>>>>>>>> until you hit the target bucket you want a representation for, 
>>>>>>>>>>> *every
>>>>>>>>>>> time* you want to get a representation, and don't store
>>>>>>>>>>> anything. You'll need to use the same seed for the random number 
>>>>>>>>>>> generator
>>>>>>>>>>> though, to get consistent results.
>>>>>>>>>>>
>>>>>>>>>>> The advantage of this is that it's a simple modification to what
>>>>>>>>>>> is already implemented. On the other hand, it's slightly slower when
>>>>>>>>>>> outputting SDRs for previously-seen buckets.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Feb 17, 2014 at 5:15 PM, Scott Purdy 
>>>>>>>>>>> <[email protected]>wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi all, I thought some of you might enjoy trying to come up
>>>>>>>>>>>> with a solution for this problem. If you watch Chetan's 
>>>>>>>>>>>> presentation about
>>>>>>>>>>>> the random distributed scalar encoder (RDSE), you will see that we 
>>>>>>>>>>>> are
>>>>>>>>>>>> keeping a mapping between all buckets computed so far and the bits 
>>>>>>>>>>>> that
>>>>>>>>>>>> represent them. This was Subutai's implementation of Jeff's 
>>>>>>>>>>>> general idea
>>>>>>>>>>>> for the encoder. This design has a memory usage for the encoder 
>>>>>>>>>>>> that
>>>>>>>>>>>> increases linearly with the number of buckets that it has to 
>>>>>>>>>>>> represent.
>>>>>>>>>>>>
>>>>>>>>>>>> When originally discussing the design, I was trying to find a
>>>>>>>>>>>> way to statically compute the mapping so that you don't have to 
>>>>>>>>>>>> store
>>>>>>>>>>>> anything. But it has to have the property that buckets i and j 
>>>>>>>>>>>> have w-(j-i)
>>>>>>>>>>>> overlapping bits if j-i<w and also that a given index is never 
>>>>>>>>>>>> assigned
>>>>>>>>>>>> multiple times to the same bucket. I came up with a solution but 
>>>>>>>>>>>> it would
>>>>>>>>>>>> likely have more random collisions than Subutai's linear-memory 
>>>>>>>>>>>> solution
>>>>>>>>>>>> because it was limited in the number of possibly combinations of 
>>>>>>>>>>>> bits the
>>>>>>>>>>>> buckets could have. Curious if someone can come up with something 
>>>>>>>>>>>> better!
>>>>>>>>>>>>
>>>>>>>>>>>> And be sure to watch Chetan's presentation on the RDSE that
>>>>>>>>>>>> Subutai designed and implemented for background.
>>>>>>>>>>>>
>>>>>>>>>>>> *Note: the current implementation is fine for all practical
>>>>>>>>>>>> scenarios so this is just a fun exercise for those interested*
>>>>>>>>>>>>
>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>> nupic mailing list
>>>>>>>>>>>> [email protected]
>>>>>>>>>>>>
>>>>>>>>>>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> nupic mailing list
>>>>>>>>>> [email protected]
>>>>>>>>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>>
>>>>>>>>> Fergal Byrne, Brenter IT
>>>>>>>>>
>>>>>>>>> <http://www.examsupport.ie>http://inbits.com - Better Living
>>>>>>>>> through Thoughtful Technology
>>>>>>>>>
>>>>>>>>> e:[email protected] t:+353 83 4214179
>>>>>>>>> Formerly of Adnet [email protected] http://www.adnet.ie
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> nupic mailing list
>>>>>>>>> [email protected]
>>>>>>>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> nupic mailing list
>>>>>>> [email protected]
>>>>>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>>
>>>>>> Fergal Byrne, Brenter IT
>>>>>>
>>>>>> <http://www.examsupport.ie>http://inbits.com - Better Living through
>>>>>> Thoughtful Technology
>>>>>>
>>>>>> e:[email protected] t:+353 83 4214179
>>>>>> Formerly of Adnet [email protected] http://www.adnet.ie
>>>>>>
>>>>>> _______________________________________________
>>>>>> nupic mailing list
>>>>>> [email protected]
>>>>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>>>>>>
>>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> nupic mailing list
>>>>> [email protected]
>>>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>>>>>
>>>>>
>>>>
>>>> _______________________________________________
>>>> nupic mailing list
>>>> [email protected]
>>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>>>>
>>>>
>>> _______________________________________________
>>> nupic mailing list
>>> [email protected]
>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>>>
>>>
>>
>>
>> --
>>
>> Fergal Byrne, Brenter IT
>>
>> <http://www.examsupport.ie>http://inbits.com - Better Living through
>> Thoughtful Technology
>>
>> e:[email protected] t:+353 83 4214179
>> Formerly of Adnet [email protected] http://www.adnet.ie
>>
>> _______________________________________________
>> nupic mailing list
>> [email protected]
>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>>
>>
>
> _______________________________________________
> nupic mailing list
> [email protected]
> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>
>


-- 

Fergal Byrne, Brenter IT

<http://www.examsupport.ie>http://inbits.com - Better Living through
Thoughtful Technology

e:[email protected] t:+353 83 4214179
Formerly of Adnet [email protected] http://www.adnet.ie

_______________________________________________
nupic mailing list
[email protected]
http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org

Re: [nupic-discuss] Fun challenge: constant memory RDSE

Reply via email to