Hi Marek, Subutai,

Apologies for the radio silence over the last week or so. I've been deep-diving into something which I believe will be of interest to everyone - more on this in the next few days, as I still have some more experiments to do.
First off, great that you're doing some investigation into this, /Mar(e?)k/. It's really important that we grasp how powerful SDRs are, and even more so how efficient sequence learning over SDRs is. This is the key to Jeff's theory - nature discovered this idea a few hundred million years ago, and only perfected it when the big stone landed on the Yucatan, wiping out all our competitors.

The key to all of this is probability (or, conversely, confidence). Any learning system with sufficient capacity will only ever encounter an astronomically tiny proportion of the possible sequences of learned patterns, so when you see enough of a sequence you have effectively seen the whole thing (at each level of abstraction). This is the power of the TP. Each time you see a pattern you've predicted, the probability that the rest of the sequence is not going to happen drops exponentially. When you do see something new, it is happening so rarely that you have plenty of resources to either learn it or flag a major breach of your model of the world.

<ad>NuPIC has a major bug: Jeff's CLA theory says that predictive cells fire first, and for no good reason this is not in the SP. Fixing this is both trivial (in terms of work) and significant (in terms of effectiveness for many kinds of data). Anyone interested in adding the 10-15 lines of Python, and perhaps twice that of C++, might like to pick up my issue at https://github.com/numenta/nupic/issues/415</ad>

The enemy of all this is noise. /Mar(e?)k/'s example is nice and clean: loads of distinguishable A's with one hard B at the end. Most data in the real world is not like this, so you'll have many sequences with a spurious Æ, Å, or Ā in there. We should be able to treat these as semantic variants of A and believe that we're still looking at a sequence of A's. The data should decide how good this tolerance is - if we only ever get long sequences of A's (and their near relatives), followed by B's, then that is what we'll learn.
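To make the "astronomically tiny" point concrete, here is a minimal sketch in plain Python (not NuPIC code). The 2048-column / 40-active figures are the usual NuPIC layer defaults; the overlap threshold of 20 is an illustrative assumption, not a NuPIC parameter:

```python
# Sketch: why a partial SDR match is almost certainly the real pattern.
# n = columns in the layer, w = active columns per pattern (~2% sparsity).
from math import comb

n, w = 2048, 40
theta = 20  # assumed overlap needed to count as a "match"

# Number of distinct SDRs of this size: astronomically large.
num_patterns = comb(n, w)
print(f"possible SDRs: ~10^{len(str(num_patterns)) - 1}")

# Probability that a *random* SDR overlaps a given one in >= theta bits
# (hypergeometric tail). It is vanishingly small, so an observed overlap
# is overwhelmingly likely to be the learned pattern, not noise.
p_false_match = sum(
    comb(w, k) * comb(n - w, w - k) for k in range(theta, w + 1)
) / comb(n, w)
print(f"false-match probability: {p_false_match:.3e}")
```

The same arithmetic is what makes the noisy-A case workable: a spurious Æ either shares most of A's bits (a semantic variant) or it matches nothing learned, and random collisions are too rare to matter.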
If the statistics are that we often get 40±5 A's, then at that point there will be a prediction of 80% for another A and 20% for a B. But the probability of any other character will be essentially zero at that stage, allowing you to rule out the significance of Æ, Å, or Ā as anything other than noise.

/Mar(e?)k/, the numbers really tell here, as you hinted. The capacities are of the order of 10^n, where n is 40, 80, 120 or something, even for a small 2048-column layer. The real issue then becomes one of ensuring that these capacities are exploited. This is all about presenting the input data well, and having a good enough learning algorithm that the CLA will slice up the data space well.

Regards,

Fergal Byrne

On Sat, Dec 7, 2013 at 9:37 PM, Subutai Ahmad <[email protected]> wrote:

> Hi Mark,
>
> We haven’t had too much discussion about the TP on this list but you ask
> some interesting questions below. We don’t really know a huge amount but we
> do know it can learn extremely long sequences. Consider that each
> transition in a sequence is represented by a number of segments. Each step
> at a minimum would consume at least activationThreshold segments, since you
> need that many active columns to go on to the next step. So, one limit
> with our typical configuration is (128 segments per cell * 32 cells per
> column * 2048 columns) / activationThreshold. That’s a sequence about 1/2 million
> steps long! We could theoretically “hand construct” a sequence that long
> and it should work.
>
> In practice the length is likely to be a lot lower, but it’s still probably
> pretty long (it would be interesting to try this out with random SDRs).
> The length is not really the problem. The difficulty of the sequences
> (like the one you have below) is more interesting. We have some tests
> already in NuPIC of lower-order vs. higher-order sequences.
> Please take a look at this file:
>
> nupic/tests/integration/py2/nupic/algorithms/tp_test.py
>
> It would be really cool to expand on this type of test. There’s a lot
> more we could do to understand the TP better!
>
> —Subutai
>
>
> On Sun, Nov 17, 2013 at 3:00 PM, Marek Otahal <[email protected]> wrote:
>
>> I'm about to create and carry out some benchmarks of the CLA.
>>
>> - for the TP: given n sequences, what's the max length of the sequences it can recall?
>> - test with the hardest sequences? (AAAAAAAAAAAAAAAAAAAAAAAAAAAAB)
>> - resistance to noise (I think Subutai did these? Could we have the graphs, scripts, please?)
>>
> _______________________________________________
> nupic mailing list
> [email protected]
> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org

--
Fergal Byrne, Brenter IT
http://www.examsupport.ie
http://inbits.com - Better Living through Thoughtful Technology
e: [email protected] t: +353 83 4214179

Formerly of Adnet [email protected] http://www.adnet.ie
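Subutai's back-of-envelope capacity figure from the quoted message can be checked numerically. A minimal sketch in plain Python (not NuPIC code); activationThreshold = 16 is an assumed value here, since it is a configurable NuPIC parameter:

```python
# Upper bound on learnable sequence length, per Subutai's estimate:
# total segments in the layer divided by the segments consumed per step.
segments_per_cell = 128
cells_per_column = 32
columns = 2048
activation_threshold = 16  # assumed; this NuPIC parameter is tunable

total_segments = segments_per_cell * cells_per_column * columns
max_sequence_steps = total_segments // activation_threshold
print(max_sequence_steps)  # 524288 -- about half a million steps
```

As the quoted message notes, this is only a theoretical ceiling; practical sequence lengths will be lower, and sequence difficulty (not raw length) is the more interesting axis to test.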
