Hi Marek,

This is great. One suggestion is to steal from one of Geoff Hinton's students, who did exactly the same letter-by-letter prediction. What he did was to take the predictions, let's say:

    d: 0.33
    t: 0.27
    e: 0.2
    f: 0.2

and use a random generator to decide which of these to give it next, in proportion to their probabilities. So 1/3 of the time you give it a d, etc.
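That's only a couple of lines of Python. A minimal sketch (the dict of letter -> probability is illustrative, not the actual Linguist data structures):

    import random

    def sample_next_char(predictions):
        """Pick the next character in proportion to its predicted probability.

        predictions: dict mapping char -> probability (weights need not
        sum to exactly 1).
        """
        chars = list(predictions.keys())
        weights = [predictions[c] for c in chars]
        # One uniform draw, walked along the cumulative distribution.
        r = random.uniform(0, sum(weights))
        cumulative = 0.0
        for c, w in zip(chars, weights):
            cumulative += w
            if r <= cumulative:
                return c
        return chars[-1]  # guard against floating-point rounding

    # sample_next_char({'d': 0.33, 't': 0.27, 'e': 0.2, 'f': 0.2})
    # returns 'd' about a third of the time.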
On Sun, Nov 17, 2013 at 3:05 PM, Marek Otahal <[email protected]> wrote:

> Here's illustrative output on running a "xAAA. xBBB" dataset.
>
> ====== Repeat #100 =======
>
> [991] x ==> BBB|x (0.50 | 0.50 | 0.50 | 1.00 | 1.00)
> <<<<< learning correctly
> [992] A ==> AA|xB (0.88 | 0.78 | 0.78 | 0.78 | 1.00)
> [993] A ==> A|xBB (0.92 | 0.81 | 0.81 | 0.89 | 1.00)
> [994] A ==> |xBBB (0.80 | 0.80 | 0.80 | 0.88 | 1.00)
> [995] | ==> xBBB| (1.00 | 0.92 | 0.92 | 0.92 | 1.00)
> DEBUG: Result of PyRegion::executeCommand : 'None'
> reset
> [996] x ==> AAA|x (0.50 | 0.50 | 0.50 | 1.00 | 1.00)
> <<<<<< learning correctly
> [997] B ==> BB|xA (0.94 | 0.89 | 0.89 | 0.89 | 1.00)
> [998] B ==> B|xAA (0.91 | 0.85 | 0.85 | 0.94 | 1.00)
> [999] B ==> |xAAA (0.85 | 0.85 | 0.85 | 0.94 | 1.00)
> [1000] | ==> xAAA| (1.00 | 0.91 | 0.91 | 0.91 | 1.00)
> DEBUG: Result of PyRegion::executeCommand : 'None'
> reset
> ==========================================
> Welcome young adventurer, let me tell you a story!
> Enter story start (QUIT to go to work): x
> x x B B B    <<<< interpretation is always the same!!
>
> x B B B
>
> Enter story start (QUIT to go to work): x
> x x B B B
>
> x B B B
>
> Enter story start (QUIT to go to work): x
> x x B B B
>
>
> On Sun, Nov 17, 2013 at 4:01 PM, Marek Otahal <[email protected]> wrote:
>
>> I've added an "interactive" feature to Chetan's Linguist
>> https://github.com/chetan51/linguist - a story teller mode.
>>
>> It will (more or less) memorize the given text and then let you type
>> starting words (i.e. "So he ") and follow up on its own to complete the
>> sentence(s).
>>
>> ---------------------------------
>> Yet there's a problem.
>>
>> To describe the project briefly: it uses the TP to learn texts as
>> sequences of letters.
>>
>> At first it memorized the whole text as one long sequence. This worked
>> for smaller datasets, but for bigger ones the accuracy dropped quickly.
>>
>> I decided to simplify and split the text into separate sequences,
>> resetting the sequence memory of the temporal pooler at the end of
>> each sentence. This greatly improved prediction probabilities, as the
>> sequences are much shorter (average sentence length ~30 chars vs.
>> dataset length of hundreds to thousands of chars).
>>
>> The problem is that after the first end of sequence there's no "flow"
>> (I know, I've called a reset(), what could I expect? ;) ), so the
>> state with the highest statistical probability is selected (always the
>> same!).
>>
>> Example dataset:
>> "How are you?
>> I'm fine.
>> I'm tired.
>> Yayyyyy!"
>>
>> So when you start with "Ho", it'll correctly follow with "w are you?",
>> then "I'm fine." "I'm fine." "I'm fine." ... forever.
>>
>> The "I'm fine" is fine :) as from a new state it's the most probable
>> choice (2 out of 4). But it doesn't look good.
>>
>> I've come up with 2 solutions:
>>
>> # Idea 1:
>> After a sequence reset in generation mode, randomly generate the first
>> char manually, feed it to the TP and let it follow...
>> Should work: OK; principle: so-so.
>>
>> # Idea 2:
>> Even though I trained with a reset (= new unknown state) after each
>> sentence end, can I now somehow keep the flow spanning over more
>> sentences?
>>
>> Last but not least, the bug!
>> The bug is in the (CLA) model's result.inferences['prediction'].
>> By definition, this field should return the most probable state from
>> the inference. But what if there are two or more equally probable
>> states? I believe we should go random.
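Breaking ties at random is a one-liner, by the way. A minimal sketch, assuming the predictions come back as a dict of state -> probability (the shape is illustrative, not the actual result.inferences structure):

    import random

    def pick_prediction(probabilities):
        # probabilities: dict mapping candidate state -> probability
        best = max(probabilities.values())
        # Collect every state tied for the top probability...
        tied = [s for s, p in probabilities.items() if p == best]
        # ...and choose uniformly among them instead of in a fixed order.
        return random.choice(tied)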
>> While for debugging the fixed order is convenient, the random order
>> seems more natural. I believe it would (kind of) fix my problem with
>> the repetitive "I'm fine" above too.
>>
>> Proposed solution, if you agree: we'll add an init() parameter
>> debug=False which will keep the fixed ordering if needed, and by
>> default do random on equally probable states.
>>
>> Thanks for reading :)
>> mark
>>
>> --
>> Marek Otahal :o)
>
> --
> Marek Otahal :o)
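For your Idea 1, the sampling approach above drops in neatly: after each reset, seed with a random character, then let the weighted picks carry the flow. A sketch, where model, feed_char() and predicted_distribution() are stand-ins for however Linguist drives the TP, not its real API:

    import random
    import string

    def tell_sentence(model, max_len=80):
        model.reset()  # new, unknown state after each sentence
        # Idea 1: seed the sequence with a random first character.
        char = random.choice(string.ascii_lowercase)
        out = [char]
        for _ in range(max_len):
            model.feed_char(char)
            dist = model.predicted_distribution()  # dict: char -> prob
            char = sample_next_char(dist)  # weighted pick, as sketched above
            out.append(char)
            if char in '.?!':  # stop at sentence end
                break
        return ''.join(out)

That should also break up the repetitive "I'm fine", since "I'm tired." gets a proportional chance as well.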
--
Fergal Byrne, Brenter IT
http://www.examsupport.ie
http://inbits.com - Better Living through Thoughtful Technology
e:[email protected] t:+353 83 4214179
Formerly of Adnet [email protected] http://www.adnet.ie

_______________________________________________
nupic mailing list
[email protected]
http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
