Thanks for the details on your implementation Fergal!

Just to be clear, it is possible to create a constant memory and constant
time solution (assuming fixed w). The one that I came up with does not use
all nCw combinations of active bits though. Instead it uses (n/2)Cw *
(n/2)Cw.

I am hoping someone can find a solution with the same time/memory bounds
but with a higher entropy solution. IE one that will have random collisions
less frequently.



On Tue, Feb 18, 2014 at 1:01 AM, Fergal Byrne
<[email protected]>wrote:

> Hi Scott, Chetan,
>
> Great thought experiment, Scott.
>
> Someone looking at my Clojure code [1] (a Clojure expert, not one on
> NuPIC) had some issues with my using a mutable data structure (ie memory)
> so much. I looked at ways of eliminating it, but there's no simple way to
> do it unless you know that you'll never go backwards on the number line.
> This also means that the encodings are dependent on the order in which you
> present the data to the encoder.
>
> For example, let's say the first value encoded is 0 (without loss of
> generality). If you have to encode 1 next, it will get the second code, 2
> will get the next one, and so on. But if -1 is provided after 1, it'll get
> the next code (or at least some distortion of it) and thus 2 will be
> encoded differently.
>
> This means that every encoder must remember its buckets in order to give
> back the same encoding for previously computed values, or else remember the
> entire sequence of values and rerun their computations each time (which may
> cost more memory if many values per bucket must be stored).
>
> I've added a test/demo for this to my document.
>
> Update: If you decide on 0 as a centre, you can precalculate bands of
> buckets out to your first data value (and repeat this for each new one),
> which ensures the encoding is always the same:
>
> e.g given 22 as the first datum, generate buckets for -10...10, -20...20,
> -30...30 and return encoding(22).
>
> You could choose a different centre if you know more about your data. I've
> detailed this idea in the doc.
>
> [1] http://fergalbyrne.github.io/rdse.html
>
> Regards
>
> Fergal Byrne
>
>
>
>
>
> On Tue, Feb 18, 2014 at 6:06 AM, Chetan Surpur <[email protected]> wrote:
>
>> For this problem, this looks useful:
>> http://en.wikipedia.org/wiki/Linear_feedback_shift_register
>>
>>
>> On Mon, Feb 17, 2014 at 6:01 PM, Chetan Surpur <[email protected]>wrote:
>>
>>> A very simple approach would be to trade speed for memory. Instead of
>>> storing a map between buckets and SDRs, we can go through the bucket
>>> generation process every time we want to find the SDR for a bucket. From
>>> what I understand, this bucket generation process is linear in speed with
>>> the number of buckets you want to generate. So the linear memory
>>> requirement would be translated into a linear speed requirement.
>>>
>>> In a nutshell, walk through the number line, generating buckets until
>>> you hit the target bucket you want a representation for, *every time* you
>>> want to get a representation, and don't store anything. You'll need to use
>>> the same seed for the random number generator though, to get consistent
>>> results.
>>>
>>> The advantage of this is that it's a simple modification to what is
>>> already implemented. On the other hand, it's slightly slower when
>>> outputting SDRs for previously-seen buckets.
>>>
>>>
>>> On Mon, Feb 17, 2014 at 5:15 PM, Scott Purdy <[email protected]> wrote:
>>>
>>>> Hi all, I thought some of you might enjoy trying to come up with a
>>>> solution for this problem. If you watch Chetan's presentation about the
>>>> random distributed scalar encoder (RDSE), you will see that we are keeping
>>>> a mapping between all buckets computed so far and the bits that represent
>>>> them. This was Subutai's implementation of Jeff's general idea for the
>>>> encoder. This design has a memory usage for the encoder that increases
>>>> linearly with the number of buckets that it has to represent.
>>>>
>>>> When originally discussing the design, I was trying to find a way to
>>>> statically compute the mapping so that you don't have to store anything.
>>>> But it has to have the property that buckets i and j have w-(j-i)
>>>> overlapping bits if j-i<w and also that a given index is never assigned
>>>> multiple times to the same bucket. I came up with a solution but it would
>>>> likely have more random collisions than Subutai's linear-memory solution
>>>> because it was limited in the number of possibly combinations of bits the
>>>> buckets could have. Curious if someone can come up with something better!
>>>>
>>>> And be sure to watch Chetan's presentation on the RDSE that Subutai
>>>> designed and implemented for background.
>>>>
>>>> *Note: the current implementation is fine for all practical scenarios
>>>> so this is just a fun exercise for those interested*
>>>>
>>>> _______________________________________________
>>>> nupic mailing list
>>>> [email protected]
>>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>>>>
>>>>
>>>
>>
>> _______________________________________________
>> nupic mailing list
>> [email protected]
>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>>
>>
>
>
> --
>
> Fergal Byrne, Brenter IT
>
> <http://www.examsupport.ie>http://inbits.com - Better Living through
> Thoughtful Technology
>
> e:[email protected] t:+353 83 4214179
> Formerly of Adnet [email protected] http://www.adnet.ie
>
> _______________________________________________
> nupic mailing list
> [email protected]
> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>
>
_______________________________________________
nupic mailing list
[email protected]
http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org

Reply via email to