Hi Nick, I believe your understanding is exactly right. If we are predicting 10 steps into the future, the classifier has to keep a rolling buffer of the last 10 sets of active bits. The classifier sort-of outputs the conditional probability of each bucket given the current activation. I say "sort-of" because there's a rolling average in there, so it's really a "recent conditional probability". This is how the OPF outputs probabilities for each set of predictions.
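In toy Python (a sketch of the idea only — this is not the actual CLA classifier / FastClaClassifier code, and every name here is made up): keep a rolling buffer of the last `steps` sets of active bit indices, maintain a moving-average vote per (bit, bucket) pair, and normalize the votes of the currently active bits at inference time to get that "recent conditional probability".

```python
from collections import defaultdict, deque

class ToyStepClassifier:
    """Toy N-step classifier sketch (illustrative names, not the real API)."""

    def __init__(self, steps=10, alpha=0.1):
        self.steps = steps                    # prediction horizon
        self.alpha = alpha                    # moving-average rate ("recent" weighting)
        self.history = deque(maxlen=steps)    # last `steps` sets of active bit indices
        # votes[bit][bucket] ~= recent P(bucket now | bit was active `steps` ago)
        self.votes = defaultdict(lambda: defaultdict(float))

    def learn(self, active_bits, current_bucket):
        """Associate the activation from `steps` ago with the bucket seen now."""
        if len(self.history) == self.steps:
            for bit in self.history[0]:       # activation `steps` steps back
                row = self.votes[bit]
                for bucket in row:            # decay every bucket's vote...
                    row[bucket] *= (1.0 - self.alpha)
                row[current_bucket] += self.alpha  # ...and boost the bucket seen now
        self.history.append(set(active_bits))

    def infer(self, active_bits):
        """Normalized bucket likelihoods given the current activation."""
        totals = defaultdict(float)
        for bit in active_bits:
            for bucket, vote in self.votes[bit].items():
                totals[bucket] += vote
        z = sum(totals.values())
        return {b: v / z for b, v in totals.items()} if z else {}
```

With `steps=2`, feeding a repeating sequence where pattern {1, 2} is always followed two steps later by bucket 7 drives the inferred distribution for {1, 2} toward bucket 7.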
I believe the implementation stores the indices only for the historical buffer. The C++ code for this is in nupic.core, in FastClaClassifier.hpp/cpp.

--Subutai

On Sat, Aug 16, 2014 at 6:14 AM, Nicholas Mitri <[email protected]> wrote:

> Hi Subutai,
>
> So we’re using the predictive state of the cells as a middle step (during
> learning) to encode context into the representation of the input pattern
> using only active bits? But that’s the extent of their practical use as far
> as the CLA classifier is concerned.
>
> I understood the point you made about the fact that context encoded into
> active bits gives us all the information we need for prediction, but
> there’s still one issue I’m having with the operation of the CLA
> classifier.
>
> If we’re only using active bits, then the RADC matrix we’re storing should
> maintain and update a coincidence counter between the current bucket and
> the active bits from a previous time step during its learning phase. In that
> way, when the classifier is in inference mode, the likelihood becomes the
> conditional probability of a future bucket given current activation. In
> other words, the classifier learning phase creates a relation between past
> info (active output of TP at time = t - x) and current input value (bucket
> index at time t) so that during inference we can use current information
> (at time = t) to predict future values (at time = t + x). (The document
> attached isn’t very clear on that point.)
>
> If that’s the case, then the active state of the region should be stored
> for future use. Is any of that accurate? And if so, would we be storing the
> state of every cell or only the index of the active ones?
>
> best,
> Nick
>
> On Aug 15, 2014, at 9:18 PM, Subutai Ahmad <[email protected]> wrote:
>
> Hi Nick,
>
> That’s a great question, and one we worked through as well. The classifier
> really does only use the active bits.
> If you think about it, the active
> bits include all the available information about the high order sequence.
> They include the full dynamic context, and all future predictions about this
> sequence can be derived from the active bits.
>
> For example, suppose you’ve learned different melodies and start listening
> to a song. Once the first few notes are played, there could be many
> different musical pieces that start the same way. The active state includes
> all possible melodies that start with these notes.
>
> Once you are in the middle of the melody and it’s now unambiguous, the
> active state at any point is unique to that melody as well as the position
> within that melody. If you are a musician, you could actually stop
> listening, take over, and play the rest of the song. Similarly, a classifier
> can take that state as input and predict the sequence of all those notes
> into the future with 100% accuracy. This is a very cool property. It is a
> result of the capacity inherent in sparse representations and critical to
> representing high order sequences.
>
> As such, the classifier only needs the active state to predict the next N
> steps.
>
> So what is the predictive state? The predictive state is in fact just a
> function of the active bits and the current set of segments. It doesn’t add
> new information. However, it has other uses. The predictive state is used in
> the Temporal Memory to update the set of active bits given new sensory
> information. This helps fine-tune the active state as you get new
> information. It also helps the system refine learning as new (possibly
> unpredicted) information comes in.
>
> —Subutai
>
> On Fri, Aug 15, 2014 at 7:40 AM, Nicholas Mitri <[email protected]>
> wrote:
>
>> Hi Subutai,
>>
>> Again, thanks for forwarding the document. It was really helpful.
>>
>> I have a quick question before I delve deeper into the classifier.
>> The document mentions that the classifier makes use of the ‘active’ bits
>> of the temporal pooler. Are we grouping active and predictive bits under
>> the label ‘active’ here?
>>
>> If the predictive bits are not mapped into actual values by the
>> classifier, then what module is performing that task when I query for the
>> predicted field value at any time step?
>>
>> If they are, what process is used to decouple multiple simultaneous
>> predictions and map each to its corresponding value to compare it against a
>> value after X time steps? Is it as simple as looking at the normalized RADC
>> table and picking the top 3 buckets with the highest likelihoods, mapping
>> them into their actual values, then attaching the likelihood to the
>> prediction as a confidence measure?
>>
>> There are clearly some major holes in my understanding of the algorithms
>> at play; I’d appreciate the clarifications :).
>>
>> thanks,
>> Nick
>>
>> On Aug 13, 2014, at 8:39 PM, Subutai Ahmad <[email protected]> wrote:
>>
>> Hi Nick,
>>
>> Nice diagram! In addition to the video David sent, we have a NuPIC issue
>> to create this document:
>>
>> https://github.com/numenta/nupic/issues/578
>>
>> I found some old documentation in our archives. Scott is planning to
>> update the wiki with this information. I have also attached it here for
>> reference (but warning, it may be a bit outdated!)
>>
>> --Subutai
>>
>> On Wed, Aug 13, 2014 at 9:03 AM, cogmission1 . <
>> [email protected]> wrote:
>>
>>> Hi Nicholas,
>>>
>>> This is the only source with any depth I have seen. Have you seen this?
>>>
>>> https://www.youtube.com/watch?v=z6r3ekreRzY
>>>
>>> David
>>>
>>> On Wed, Aug 13, 2014 at 10:46 AM, Nicholas Mitri <[email protected]>
>>> wrote:
>>>
>>>> Hey all,
>>>>
>>>> Based on my understanding of the material in the wiki, the CLA
>>>> algorithms can be depicted by the figure below.
>>>> There’s plenty of info about SP and TP in both theory and
>>>> implementation details.
>>>> I can’t seem to find much information about the classifier though.
>>>> If I’ve understood correctly, this is not a classifier in the Machine
>>>> Learning sense of the word but rather a mechanism to translate TP output
>>>> into values of the same data type as the input for comparison purposes.
>>>>
>>>> I’d really appreciate some more involved explanation of the process in
>>>> terms of what data is stored step to step and how the look-up/mapping
>>>> mechanics are implemented.
>>>>
>>>> best,
>>>> Nick
>>>>
>>>> <Screen Shot 2013-12-02 at 4.00.01 PM.png>
>>>>
>>>> _______________________________________________
>>>> nupic mailing list
>>>> [email protected]
>>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>>
>> <multistep_prediction.docx>
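A postscript on Nick’s "top 3 buckets" question above: given per-bucket likelihoods, picking the most likely buckets and mapping them back to scalar values could be as simple as the sketch below. This is illustrative only — the function name and the encoder parameters (`min_val`, `bucket_width`, midpoint decoding) are hypothetical stand-ins, not the real encoder/OPF API.

```python
def top_k_predictions(bucket_likelihoods, k=3, min_val=0.0, bucket_width=1.0):
    """Return [(predicted_value, likelihood), ...] for the k most likely buckets.

    Each bucket index is mapped back to a representative scalar via its
    bucket midpoint, and the likelihood rides along as a confidence measure.
    """
    ranked = sorted(bucket_likelihoods.items(), key=lambda kv: kv[1], reverse=True)
    return [(min_val + (bucket + 0.5) * bucket_width, likelihood)
            for bucket, likelihood in ranked[:k]]

# e.g. with unit-width buckets starting at 0:
# top_k_predictions({2: 0.6, 5: 0.3, 7: 0.1, 1: 0.0})
# -> [(2.5, 0.6), (5.5, 0.3), (7.5, 0.1)]
```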
