Cheers Scott.

Have you checked the NuPIC implementation for dependence on the order of
data presented? My python isn't up to that ;{

Regards

Fergal Byrne


On Tue, Feb 18, 2014 at 5:37 PM, Scott Purdy <[email protected]> wrote:

> Oh and Chetan's proposal is good but it still has memory and time
> constraints that are linear with the number of buckets (but it doesn't have
> to keep the memory in use in between invocations).
>
>
> On Tue, Feb 18, 2014 at 9:36 AM, Scott Purdy <[email protected]> wrote:
>
>> Thanks for the details on your implementation Fergal!
>>
>> Just to be clear, it is possible to create a constant memory and constant
>> time solution (assuming fixed w). The one that I came up with does not use
>> all nCw combinations of active bits though. Instead it uses (n/2)Cw *
>> (n/2)Cw.
>>
>> I am hoping someone can find a solution with the same time/memory bounds
>> but with a higher entropy solution. IE one that will have random collisions
>> less frequently.
>>
>>
>>
>> On Tue, Feb 18, 2014 at 1:01 AM, Fergal Byrne <
>> [email protected]> wrote:
>>
>>> Hi Scott, Chetan,
>>>
>>> Great thought experiment, Scott.
>>>
>>> Someone looking at my Clojure code [1] (a Clojure expert, not one on
>>> NuPIC) had some issues with my using a mutable data structure (ie memory)
>>> so much. I looked at ways of eliminating it, but there's no simple way to
>>> do it unless you know that you'll never go backwards on the number line.
>>> This also means that the encodings are dependent on the order in which you
>>> present the data to the encoder.
>>>
>>> For example, let's say the first value encoded is 0 (without loss of
>>> generality). If you have to encode 1 next, it will get the second code, 2
>>> will get the next one, and so on. But if -1 is provided after 1, it'll get
>>> the next code (or at least some distortion of it) and thus 2 will be
>>> encoded differently.
>>>
>>> This means that every encoder must remember its buckets in order to give
>>> back the same encoding for previously computed values, or else remember the
>>> entire sequence of values and rerun their computations each time (which may
>>> cost more memory if many values per bucket must be stored).
>>>
>>> I've added a test/demo for this to my document.
>>>
>>> Update: If you decide on 0 as a centre, you can precalculate bands of
>>> buckets out to your first data value (and repeat this for each new one),
>>> which ensures the encoding is always the same:
>>>
>>> e.g given 22 as the first datum, generate buckets for -10...10,
>>> -20...20, -30...30 and return encoding(22).
>>>
>>> You could choose a different centre if you know more about your data.
>>> I've detailed this idea in the doc.
>>>
>>> [1] http://fergalbyrne.github.io/rdse.html
>>>
>>> Regards
>>>
>>> Fergal Byrne
>>>
>>>
>>>
>>>
>>>
>>> On Tue, Feb 18, 2014 at 6:06 AM, Chetan Surpur <[email protected]>wrote:
>>>
>>>> For this problem, this looks useful:
>>>> http://en.wikipedia.org/wiki/Linear_feedback_shift_register
>>>>
>>>>
>>>> On Mon, Feb 17, 2014 at 6:01 PM, Chetan Surpur <[email protected]>wrote:
>>>>
>>>>> A very simple approach would be to trade speed for memory. Instead of
>>>>> storing a map between buckets and SDRs, we can go through the bucket
>>>>> generation process every time we want to find the SDR for a bucket. From
>>>>> what I understand, this bucket generation process is linear in speed with
>>>>> the number of buckets you want to generate. So the linear memory
>>>>> requirement would be translated into a linear speed requirement.
>>>>>
>>>>> In a nutshell, walk through the number line, generating buckets until
>>>>> you hit the target bucket you want a representation for, *every time* you
>>>>> want to get a representation, and don't store anything. You'll need to use
>>>>> the same seed for the random number generator though, to get consistent
>>>>> results.
>>>>>
>>>>> The advantage of this is that it's a simple modification to what is
>>>>> already implemented. On the other hand, it's slightly slower when
>>>>> outputting SDRs for previously-seen buckets.
>>>>>
>>>>>
>>>>> On Mon, Feb 17, 2014 at 5:15 PM, Scott Purdy <[email protected]>wrote:
>>>>>
>>>>>> Hi all, I thought some of you might enjoy trying to come up with a
>>>>>> solution for this problem. If you watch Chetan's presentation about the
>>>>>> random distributed scalar encoder (RDSE), you will see that we are 
>>>>>> keeping
>>>>>> a mapping between all buckets computed so far and the bits that represent
>>>>>> them. This was Subutai's implementation of Jeff's general idea for the
>>>>>> encoder. This design has a memory usage for the encoder that increases
>>>>>> linearly with the number of buckets that it has to represent.
>>>>>>
>>>>>> When originally discussing the design, I was trying to find a way to
>>>>>> statically compute the mapping so that you don't have to store anything.
>>>>>> But it has to have the property that buckets i and j have w-(j-i)
>>>>>> overlapping bits if j-i<w and also that a given index is never assigned
>>>>>> multiple times to the same bucket. I came up with a solution but it would
>>>>>> likely have more random collisions than Subutai's linear-memory solution
>>>>>> because it was limited in the number of possibly combinations of bits the
>>>>>> buckets could have. Curious if someone can come up with something better!
>>>>>>
>>>>>> And be sure to watch Chetan's presentation on the RDSE that Subutai
>>>>>> designed and implemented for background.
>>>>>>
>>>>>> *Note: the current implementation is fine for all practical scenarios
>>>>>> so this is just a fun exercise for those interested*
>>>>>>
>>>>>> _______________________________________________
>>>>>> nupic mailing list
>>>>>> [email protected]
>>>>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>>>>>>
>>>>>>
>>>>>
>>>>
>>>> _______________________________________________
>>>> nupic mailing list
>>>> [email protected]
>>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>>>>
>>>>
>>>
>>>
>>> --
>>>
>>> Fergal Byrne, Brenter IT
>>>
>>> <http://www.examsupport.ie>http://inbits.com - Better Living through
>>> Thoughtful Technology
>>>
>>> e:[email protected] t:+353 83 4214179
>>> Formerly of Adnet [email protected] http://www.adnet.ie
>>>
>>> _______________________________________________
>>> nupic mailing list
>>> [email protected]
>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>>>
>>>
>>
>
> _______________________________________________
> nupic mailing list
> [email protected]
> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>
>


-- 

Fergal Byrne, Brenter IT

<http://www.examsupport.ie>http://inbits.com - Better Living through
Thoughtful Technology

e:[email protected] t:+353 83 4214179
Formerly of Adnet [email protected] http://www.adnet.ie
_______________________________________________
nupic mailing list
[email protected]
http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org

Reply via email to