Hi Marek,

This is great. One suggestion is to steal from one of Geoff Hinton's students, who did exactly the same letter-by-letter prediction. What he did was to take the predictions, let's say:

    d: 0.33
    t: 0.27
    e: 0.2
    f: 0.2

and use a random generator to decide which of these to give it next, in proportion to their probabilities. So 1/3 of the time you give it a d, etc.
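That's only a couple of lines of Python. A minimal sketch (the dict of letter -> probability is illustrative, not the actual Linguist data structures):

    import random

    def sample_next_char(predictions):
        """Pick the next character in proportion to its predicted probability.

        predictions: dict mapping char -> probability (weights need not
        sum to exactly 1).
        """
        chars = list(predictions.keys())
        weights = [predictions[c] for c in chars]
        # One uniform draw, walked along the cumulative distribution.
        r = random.uniform(0, sum(weights))
        cumulative = 0.0
        for c, w in zip(chars, weights):
            cumulative += w
            if r <= cumulative:
                return c
        return chars[-1]  # guard against floating-point rounding

    # sample_next_char({'d': 0.33, 't': 0.27, 'e': 0.2, 'f': 0.2})
    # returns 'd' about a third of the time.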
On Sun, Nov 17, 2013 at 3:05 PM, Marek Otahal <[email protected]> wrote:

> Here's illustrative output on running a "xAAA. xBBB" dataset.
>
> ====== Repeat #100 =======
>
> [991] x ==> BBB|x (0.50 | 0.50 | 0.50 | 1.00 | 1.00)
> <<<<< learning correctly
> [992] A ==> AA|xB (0.88 | 0.78 | 0.78 | 0.78 | 1.00)
> [993] A ==> A|xBB (0.92 | 0.81 | 0.81 | 0.89 | 1.00)
> [994] A ==> |xBBB (0.80 | 0.80 | 0.80 | 0.88 | 1.00)
> [995] | ==> xBBB| (1.00 | 0.92 | 0.92 | 0.92 | 1.00)
> DEBUG: Result of PyRegion::executeCommand : 'None'
> reset
> [996] x ==> AAA|x (0.50 | 0.50 | 0.50 | 1.00 | 1.00)
> <<<<<< learning correctly
> [997] B ==> BB|xA (0.94 | 0.89 | 0.89 | 0.89 | 1.00)
> [998] B ==> B|xAA (0.91 | 0.85 | 0.85 | 0.94 | 1.00)
> [999] B ==> |xAAA (0.85 | 0.85 | 0.85 | 0.94 | 1.00)
> [1000] | ==> xAAA| (1.00 | 0.91 | 0.91 | 0.91 | 1.00)
> DEBUG: Result of PyRegion::executeCommand : 'None'
> reset
> ==========================================
> Welcome young adventurer, let me tell you a story!
> Enter story start (QUIT to go to work): x
> x x B B B    <<<< interpretation is always the same!!
>
> x B B B
>
> Enter story start (QUIT to go to work): x
> x x B B B
>
> x B B B
>
> Enter story start (QUIT to go to work): x
> x x B B B
>
>
> On Sun, Nov 17, 2013 at 4:01 PM, Marek Otahal <[email protected]> wrote:
>
>> I've added an "interactive" feature to Chetan's Linguist
>> https://github.com/chetan51/linguist - a story teller mode.
>>
>> It will (more or less) memorize the given text and then let you type
>> starting words (i.e. "So he ") and follow up on its own to complete the
>> sentence(s).
>>
>> ---------------------------------
>> Yet there's a problem.
>>
>> To describe the project briefly: it uses the TP to learn texts as
>> sequences of letters.
>>
>> At first it memorized the whole text as one long sequence. This worked
>> for smaller datasets, but for bigger ones the accuracy dropped quickly.
>>
>> I decided to simplify and split the text into separate sequences,
>> resetting the sequence memory of the temporal pooler at the end of
>> each sentence. This greatly improved prediction probabilities, as the
>> sequences are much shorter (average sentence length ~30 chars vs.
>> dataset length of hundreds to thousands of chars).
>>
>> The problem is that after the first end of sequence there's no "flow"
>> (I know, I've called a reset(), what could I expect? ;) ), so the
>> state with the highest statistical probability is selected (always the
>> same!).
>>
>> Example dataset:
>> "How are you?
>> I'm fine.
>> I'm tired.
>> Yayyyyy!"
>>
>> So when you start with "Ho", it'll correctly follow with "w are you?",
>> then "I'm fine." "I'm fine." "I'm fine." ... forever.
>>
>> The "I'm fine" is fine :) as from a new state it's the most probable
>> choice (2 out of 4). But it doesn't look good.
>>
>> I've come up with 2 solutions:
>>
>> # Idea 1:
>> After a sequence reset in generation mode, randomly generate the first
>> char manually, feed it to the TP and let it follow...
>> Should work: OK; principle: so-so.
>>
>> # Idea 2:
>> Even though I trained with a reset (= new unknown state) after each
>> sentence end, can I now somehow keep the flow spanning over more
>> sentences?
>>
>> Last but not least, the bug!
>> The bug is in the (CLA) model's result.inferences['prediction'].
>> By definition, this field should return the most probable state from
>> the inference. But what if there are two or more equally probable
>> states? I believe we should go random.
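Breaking ties at random is a one-liner, by the way. A minimal sketch, assuming the predictions come back as a dict of state -> probability (the shape is illustrative, not the actual result.inferences structure):

    import random

    def pick_prediction(probabilities):
        # probabilities: dict mapping candidate state -> probability
        best = max(probabilities.values())
        # Collect every state tied for the top probability...
        tied = [s for s, p in probabilities.items() if p == best]
        # ...and choose uniformly among them instead of in a fixed order.
        return random.choice(tied)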
>> While for debugging the fixed order is convenient, the random order
>> seems more natural. I believe it would (kind of) fix my problem with
>> the repetitive "I'm fine" above too.
>>
>> Proposed solution, if you agree: we'll add an init() parameter
>> debug=False which will keep the fixed ordering if needed, and by
>> default do random on equally probable states.
>>
>> Thanks for reading :)
>> mark
>>
>> --
>> Marek Otahal :o)
>
> --
> Marek Otahal :o)
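For your Idea 1, the sampling approach above drops in neatly: after each reset, seed with a random character, then let the weighted picks carry the flow. A sketch, where model, feed_char() and predicted_distribution() are stand-ins for however Linguist drives the TP, not its real API:

    import random
    import string

    def tell_sentence(model, max_len=80):
        model.reset()  # new, unknown state after each sentence
        # Idea 1: seed the sequence with a random first character.
        char = random.choice(string.ascii_lowercase)
        out = [char]
        for _ in range(max_len):
            model.feed_char(char)
            dist = model.predicted_distribution()  # dict: char -> prob
            char = sample_next_char(dist)  # weighted pick, as sketched above
            out.append(char)
            if char in '.?!':  # stop at sentence end
                break
        return ''.join(out)

That should also break up the repetitive "I'm fine", since "I'm tired." gets a proportional chance as well.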
--
Fergal Byrne, Brenter IT
http://www.examsupport.ie
http://inbits.com - Better Living through Thoughtful Technology
e:[email protected] t:+353 83 4214179
Formerly of Adnet [email protected] http://www.adnet.ie

_______________________________________________
nupic mailing list
[email protected]
http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
