Hi Chetan, No, but the encodings should always be independent of the order of presentation of the data, so it's a bug if they're not. My code includes a workaround which builds buckets out in both directions from a predefined centre value until it encompasses each new value. This guarantees the same encoding regardless of which values come in when. You could easily add a version of this to your encoder, it's a small overhead for ensuring identical encodings.
This is the C4 idea in action - argue with patches... Just shows you how useful an executable document can be when you're experimenting! Regards, Fergal On Tue, Feb 18, 2014 at 6:55 PM, Chetan Surpur <[email protected]> wrote: > Oh, my mistake, I misunderstood the question. I thought Fergal was asking > if the order has to be presented in a certain order to get correct results > (results that have the desired overlap properties). > > So yes, order dependence exists in the currently implemented encoder, but > it shouldn't affect correctness. > On Feb 18, 2014 10:51 AM, "Scott Purdy" <[email protected]> wrote: > >> Fergal, I believe the implementation in NuPIC is dependent on the order >> of data. Why do you ask? The constant-memory design I have brought up here >> do not exist in NuPIC. >> >> Chetan, you are right that it extends separately in each directly but I >> believe that the randomness is shared so the order of the data would affect >> it. It wouldn't be difficult to change that though. But it also doesn't >> really solve any problems. >> >> >> On Tue, Feb 18, 2014 at 10:39 AM, Chetan Surpur <[email protected]>wrote: >> >>> From reading the code, it looks to me that the generation of buckets >>> happens on the left and right boundaries of the currently-existing buckets, >>> and extends the boundaries to create buckets as necessary. Thus, it >>> shouldn't matter what order the data is presented. Subutai can correct me >>> if I'm mistaken. >>> >>> >>> On Tue, Feb 18, 2014 at 10:35 AM, Fergal Byrne < >>> [email protected]> wrote: >>> >>>> Cheers Scott. >>>> >>>> Have you checked the NuPIC implementation for dependence on the order >>>> of data presented? My python isn't up to that ;{ >>>> >>>> Regards >>>> >>>> Fergal Byrne >>>> >>>> >>>> On Tue, Feb 18, 2014 at 5:37 PM, Scott Purdy <[email protected]> wrote: >>>> >>>>> Oh and Chetan's proposal is good but it still has memory and time >>>>> constraints that are linear with the number of buckets (but it doesn't >>>>> have >>>>> to keep the memory in use in between invocations). >>>>> >>>>> >>>>> On Tue, Feb 18, 2014 at 9:36 AM, Scott Purdy <[email protected]>wrote: >>>>> >>>>>> Thanks for the details on your implementation Fergal! >>>>>> >>>>>> Just to be clear, it is possible to create a constant memory and >>>>>> constant time solution (assuming fixed w). The one that I came up with >>>>>> does >>>>>> not use all nCw combinations of active bits though. Instead it uses >>>>>> (n/2)Cw >>>>>> * (n/2)Cw. >>>>>> >>>>>> I am hoping someone can find a solution with the same time/memory >>>>>> bounds but with a higher entropy solution. IE one that will have random >>>>>> collisions less frequently. >>>>>> >>>>>> >>>>>> >>>>>> On Tue, Feb 18, 2014 at 1:01 AM, Fergal Byrne < >>>>>> [email protected]> wrote: >>>>>> >>>>>>> Hi Scott, Chetan, >>>>>>> >>>>>>> Great thought experiment, Scott. >>>>>>> >>>>>>> Someone looking at my Clojure code [1] (a Clojure expert, not one on >>>>>>> NuPIC) had some issues with my using a mutable data structure (ie >>>>>>> memory) >>>>>>> so much. I looked at ways of eliminating it, but there's no simple way >>>>>>> to >>>>>>> do it unless you know that you'll never go backwards on the number line. >>>>>>> This also means that the encodings are dependent on the order in which >>>>>>> you >>>>>>> present the data to the encoder. >>>>>>> >>>>>>> For example, let's say the first value encoded is 0 (without loss of >>>>>>> generality). If you have to encode 1 next, it will get the second code, >>>>>>> 2 >>>>>>> will get the next one, and so on. But if -1 is provided after 1, it'll >>>>>>> get >>>>>>> the next code (or at least some distortion of it) and thus 2 will be >>>>>>> encoded differently. >>>>>>> >>>>>>> This means that every encoder must remember its buckets in order to >>>>>>> give back the same encoding for previously computed values, or else >>>>>>> remember the entire sequence of values and rerun their computations each >>>>>>> time (which may cost more memory if many values per bucket must be >>>>>>> stored). >>>>>>> >>>>>>> I've added a test/demo for this to my document. >>>>>>> >>>>>>> Update: If you decide on 0 as a centre, you can precalculate bands >>>>>>> of buckets out to your first data value (and repeat this for each new >>>>>>> one), >>>>>>> which ensures the encoding is always the same: >>>>>>> >>>>>>> e.g given 22 as the first datum, generate buckets for -10...10, >>>>>>> -20...20, -30...30 and return encoding(22). >>>>>>> >>>>>>> You could choose a different centre if you know more about your >>>>>>> data. I've detailed this idea in the doc. >>>>>>> >>>>>>> [1] http://fergalbyrne.github.io/rdse.html >>>>>>> >>>>>>> Regards >>>>>>> >>>>>>> Fergal Byrne >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Tue, Feb 18, 2014 at 6:06 AM, Chetan Surpur >>>>>>> <[email protected]>wrote: >>>>>>> >>>>>>>> For this problem, this looks useful: >>>>>>>> http://en.wikipedia.org/wiki/Linear_feedback_shift_register >>>>>>>> >>>>>>>> >>>>>>>> On Mon, Feb 17, 2014 at 6:01 PM, Chetan Surpur >>>>>>>> <[email protected]>wrote: >>>>>>>> >>>>>>>>> A very simple approach would be to trade speed for memory. Instead >>>>>>>>> of storing a map between buckets and SDRs, we can go through the >>>>>>>>> bucket >>>>>>>>> generation process every time we want to find the SDR for a bucket. >>>>>>>>> From >>>>>>>>> what I understand, this bucket generation process is linear in speed >>>>>>>>> with >>>>>>>>> the number of buckets you want to generate. So the linear memory >>>>>>>>> requirement would be translated into a linear speed requirement. >>>>>>>>> >>>>>>>>> In a nutshell, walk through the number line, generating buckets >>>>>>>>> until you hit the target bucket you want a representation for, *every >>>>>>>>> time* you want to get a representation, and don't store anything. >>>>>>>>> You'll need to use the same seed for the random number generator >>>>>>>>> though, to >>>>>>>>> get consistent results. >>>>>>>>> >>>>>>>>> The advantage of this is that it's a simple modification to what >>>>>>>>> is already implemented. On the other hand, it's slightly slower when >>>>>>>>> outputting SDRs for previously-seen buckets. >>>>>>>>> >>>>>>>>> >>>>>>>>> On Mon, Feb 17, 2014 at 5:15 PM, Scott Purdy <[email protected]>wrote: >>>>>>>>> >>>>>>>>>> Hi all, I thought some of you might enjoy trying to come up with >>>>>>>>>> a solution for this problem. If you watch Chetan's presentation >>>>>>>>>> about the >>>>>>>>>> random distributed scalar encoder (RDSE), you will see that we are >>>>>>>>>> keeping >>>>>>>>>> a mapping between all buckets computed so far and the bits that >>>>>>>>>> represent >>>>>>>>>> them. This was Subutai's implementation of Jeff's general idea for >>>>>>>>>> the >>>>>>>>>> encoder. This design has a memory usage for the encoder that >>>>>>>>>> increases >>>>>>>>>> linearly with the number of buckets that it has to represent. >>>>>>>>>> >>>>>>>>>> When originally discussing the design, I was trying to find a way >>>>>>>>>> to statically compute the mapping so that you don't have to store >>>>>>>>>> anything. >>>>>>>>>> But it has to have the property that buckets i and j have w-(j-i) >>>>>>>>>> overlapping bits if j-i<w and also that a given index is never >>>>>>>>>> assigned >>>>>>>>>> multiple times to the same bucket. I came up with a solution but it >>>>>>>>>> would >>>>>>>>>> likely have more random collisions than Subutai's linear-memory >>>>>>>>>> solution >>>>>>>>>> because it was limited in the number of possibly combinations of >>>>>>>>>> bits the >>>>>>>>>> buckets could have. Curious if someone can come up with something >>>>>>>>>> better! >>>>>>>>>> >>>>>>>>>> And be sure to watch Chetan's presentation on the RDSE that >>>>>>>>>> Subutai designed and implemented for background. >>>>>>>>>> >>>>>>>>>> *Note: the current implementation is fine for all practical >>>>>>>>>> scenarios so this is just a fun exercise for those interested* >>>>>>>>>> >>>>>>>>>> _______________________________________________ >>>>>>>>>> nupic mailing list >>>>>>>>>> [email protected] >>>>>>>>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> _______________________________________________ >>>>>>>> nupic mailing list >>>>>>>> [email protected] >>>>>>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> >>>>>>> Fergal Byrne, Brenter IT >>>>>>> >>>>>>> <http://www.examsupport.ie>http://inbits.com - Better Living >>>>>>> through Thoughtful Technology >>>>>>> >>>>>>> e:[email protected] t:+353 83 4214179 >>>>>>> Formerly of Adnet [email protected] http://www.adnet.ie >>>>>>> >>>>>>> _______________________________________________ >>>>>>> nupic mailing list >>>>>>> [email protected] >>>>>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org >>>>>>> >>>>>>> >>>>>> >>>>> >>>>> _______________________________________________ >>>>> nupic mailing list >>>>> [email protected] >>>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org >>>>> >>>>> >>>> >>>> >>>> -- >>>> >>>> Fergal Byrne, Brenter IT >>>> >>>> <http://www.examsupport.ie>http://inbits.com - Better Living through >>>> Thoughtful Technology >>>> >>>> e:[email protected] t:+353 83 4214179 >>>> Formerly of Adnet [email protected] http://www.adnet.ie >>>> >>>> _______________________________________________ >>>> nupic mailing list >>>> [email protected] >>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org >>>> >>>> >>> >>> _______________________________________________ >>> nupic mailing list >>> [email protected] >>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org >>> >>> >> >> _______________________________________________ >> nupic mailing list >> [email protected] >> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org >> >> > _______________________________________________ > nupic mailing list > [email protected] > http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org > > -- Fergal Byrne, Brenter IT <http://www.examsupport.ie>http://inbits.com - Better Living through Thoughtful Technology e:[email protected] t:+353 83 4214179 Formerly of Adnet [email protected] http://www.adnet.ie
_______________________________________________ nupic mailing list [email protected] http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
