On Feb 18, 2014 12:11 PM, "Fergal Byrne" <[email protected]> wrote: > > Hi Chetan, > > No, but the encodings should always be independent of the order of presentation of the data, so it's a bug if they're not. My code includes a workaround which builds buckets out in both directions from a predefined centre value until it encompasses each new value. This guarantees the same encoding regardless of which values come in when. You could easily add a version of this to your encoder, it's a small overhead for ensuring identical encodings. > > This is the C4 idea in action - argue with patches... Just shows you how useful an executable document can be when you're experimenting!
Please keep that conversation separate. It isn't related to our appropriate for this thread. > > Regards, > > Fergal > > > On Tue, Feb 18, 2014 at 6:55 PM, Chetan Surpur <[email protected]> wrote: >> >> Oh, my mistake, I misunderstood the question. I thought Fergal was asking if the order has to be presented in a certain order to get correct results (results that have the desired overlap properties). >> >> So yes, order dependence exists in the currently implemented encoder, but it shouldn't affect correctness. >> >> On Feb 18, 2014 10:51 AM, "Scott Purdy" <[email protected]> wrote: >>> >>> Fergal, I believe the implementation in NuPIC is dependent on the order of data. Why do you ask? The constant-memory design I have brought up here do not exist in NuPIC. >>> >>> Chetan, you are right that it extends separately in each directly but I believe that the randomness is shared so the order of the data would affect it. It wouldn't be difficult to change that though. But it also doesn't really solve any problems. >>> >>> >>> On Tue, Feb 18, 2014 at 10:39 AM, Chetan Surpur <[email protected]> wrote: >>>> >>>> From reading the code, it looks to me that the generation of buckets happens on the left and right boundaries of the currently-existing buckets, and extends the boundaries to create buckets as necessary. Thus, it shouldn't matter what order the data is presented. Subutai can correct me if I'm mistaken. >>>> >>>> >>>> On Tue, Feb 18, 2014 at 10:35 AM, Fergal Byrne < [email protected]> wrote: >>>>> >>>>> Cheers Scott. >>>>> >>>>> Have you checked the NuPIC implementation for dependence on the order of data presented? My python isn't up to that ;{ >>>>> >>>>> Regards >>>>> >>>>> Fergal Byrne >>>>> >>>>> >>>>> On Tue, Feb 18, 2014 at 5:37 PM, Scott Purdy <[email protected]> wrote: >>>>>> >>>>>> Oh and Chetan's proposal is good but it still has memory and time constraints that are linear with the number of buckets (but it doesn't have to keep the memory in use in between invocations). >>>>>> >>>>>> >>>>>> On Tue, Feb 18, 2014 at 9:36 AM, Scott Purdy <[email protected]> wrote: >>>>>>> >>>>>>> Thanks for the details on your implementation Fergal! >>>>>>> >>>>>>> Just to be clear, it is possible to create a constant memory and constant time solution (assuming fixed w). The one that I came up with does not use all nCw combinations of active bits though. Instead it uses (n/2)Cw * (n/2)Cw. >>>>>>> >>>>>>> I am hoping someone can find a solution with the same time/memory bounds but with a higher entropy solution. IE one that will have random collisions less frequently. >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Tue, Feb 18, 2014 at 1:01 AM, Fergal Byrne < [email protected]> wrote: >>>>>>>> >>>>>>>> Hi Scott, Chetan, >>>>>>>> >>>>>>>> Great thought experiment, Scott. >>>>>>>> >>>>>>>> Someone looking at my Clojure code [1] (a Clojure expert, not one on NuPIC) had some issues with my using a mutable data structure (ie memory) so much. I looked at ways of eliminating it, but there's no simple way to do it unless you know that you'll never go backwards on the number line. This also means that the encodings are dependent on the order in which you present the data to the encoder. >>>>>>>> >>>>>>>> For example, let's say the first value encoded is 0 (without loss of generality). If you have to encode 1 next, it will get the second code, 2 will get the next one, and so on. But if -1 is provided after 1, it'll get the next code (or at least some distortion of it) and thus 2 will be encoded differently. >>>>>>>> >>>>>>>> This means that every encoder must remember its buckets in order to give back the same encoding for previously computed values, or else remember the entire sequence of values and rerun their computations each time (which may cost more memory if many values per bucket must be stored). >>>>>>>> >>>>>>>> I've added a test/demo for this to my document. >>>>>>>> >>>>>>>> Update: If you decide on 0 as a centre, you can precalculate bands of buckets out to your first data value (and repeat this for each new one), which ensures the encoding is always the same: >>>>>>>> >>>>>>>> e.g given 22 as the first datum, generate buckets for -10...10, -20...20, -30...30 and return encoding(22). >>>>>>>> >>>>>>>> You could choose a different centre if you know more about your data. I've detailed this idea in the doc. >>>>>>>> >>>>>>>> [1] http://fergalbyrne.github.io/rdse.html >>>>>>>> >>>>>>>> Regards >>>>>>>> >>>>>>>> Fergal Byrne >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Tue, Feb 18, 2014 at 6:06 AM, Chetan Surpur <[email protected]> wrote: >>>>>>>>> >>>>>>>>> For this problem, this looks useful: http://en.wikipedia.org/wiki/Linear_feedback_shift_register >>>>>>>>> >>>>>>>>> >>>>>>>>> On Mon, Feb 17, 2014 at 6:01 PM, Chetan Surpur <[email protected]> wrote: >>>>>>>>>> >>>>>>>>>> A very simple approach would be to trade speed for memory. Instead of storing a map between buckets and SDRs, we can go through the bucket generation process every time we want to find the SDR for a bucket. >From what I understand, this bucket generation process is linear in speed with the number of buckets you want to generate. So the linear memory requirement would be translated into a linear speed requirement. >>>>>>>>>> >>>>>>>>>> In a nutshell, walk through the number line, generating buckets until you hit the target bucket you want a representation for, every time you want to get a representation, and don't store anything. You'll need to use the same seed for the random number generator though, to get consistent results. >>>>>>>>>> >>>>>>>>>> The advantage of this is that it's a simple modification to what is already implemented. On the other hand, it's slightly slower when outputting SDRs for previously-seen buckets. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Mon, Feb 17, 2014 at 5:15 PM, Scott Purdy <[email protected]> wrote: >>>>>>>>>>> >>>>>>>>>>> Hi all, I thought some of you might enjoy trying to come up with a solution for this problem. If you watch Chetan's presentation about the random distributed scalar encoder (RDSE), you will see that we are keeping a mapping between all buckets computed so far and the bits that represent them. This was Subutai's implementation of Jeff's general idea for the encoder. This design has a memory usage for the encoder that increases linearly with the number of buckets that it has to represent. >>>>>>>>>>> >>>>>>>>>>> When originally discussing the design, I was trying to find a way to statically compute the mapping so that you don't have to store anything. But it has to have the property that buckets i and j have w-(j-i) overlapping bits if j-i<w and also that a given index is never assigned multiple times to the same bucket. I came up with a solution but it would likely have more random collisions than Subutai's linear-memory solution because it was limited in the number of possibly combinations of bits the buckets could have. Curious if someone can come up with something better! >>>>>>>>>>> >>>>>>>>>>> And be sure to watch Chetan's presentation on the RDSE that Subutai designed and implemented for background. >>>>>>>>>>> >>>>>>>>>>> Note: the current implementation is fine for all practical scenarios so this is just a fun exercise for those interested >>>>>>>>>>> >>>>>>>>>>> _______________________________________________ >>>>>>>>>>> nupic mailing list >>>>>>>>>>> [email protected] >>>>>>>>>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> _______________________________________________ >>>>>>>>> nupic mailing list >>>>>>>>> [email protected] >>>>>>>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> >>>>>>>> Fergal Byrne, Brenter IT >>>>>>>> >>>>>>>> http://inbits.com - Better Living through Thoughtful Technology >>>>>>>> >>>>>>>> e:[email protected] t:+353 83 4214179 >>>>>>>> Formerly of Adnet [email protected] http://www.adnet.ie >>>>>>>> >>>>>>>> _______________________________________________ >>>>>>>> nupic mailing list >>>>>>>> [email protected] >>>>>>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org >>>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> _______________________________________________ >>>>>> nupic mailing list >>>>>> [email protected] >>>>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> >>>>> Fergal Byrne, Brenter IT >>>>> >>>>> http://inbits.com - Better Living through Thoughtful Technology >>>>> >>>>> e:[email protected] t:+353 83 4214179 >>>>> Formerly of Adnet [email protected] http://www.adnet.ie >>>>> >>>>> _______________________________________________ >>>>> nupic mailing list >>>>> [email protected] >>>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org >>>>> >>>> >>>> >>>> _______________________________________________ >>>> nupic mailing list >>>> [email protected] >>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org >>>> >>> >>> >>> _______________________________________________ >>> nupic mailing list >>> [email protected] >>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org >>> >> >> _______________________________________________ >> nupic mailing list >> [email protected] >> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org >> > > > > -- > > Fergal Byrne, Brenter IT > > http://inbits.com - Better Living through Thoughtful Technology > > e:[email protected] t:+353 83 4214179 > Formerly of Adnet [email protected] http://www.adnet.ie > > _______________________________________________ > nupic mailing list > [email protected] > http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org >
_______________________________________________ nupic mailing list [email protected] http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
