Re: [Jprogramming] markov chain algorithm

Kip Murray Sat, 12 Nov 2011 16:13:57 -0800

From Skip's discussion and a reread of Daniel's post, I see that Daniel may
have a simpler scheme in mind: choose a word at random from the population
list, then choose a word at random from that word's successor list, then
one from the second word's successor list, and so on.  That bypasses
finding the probabilities I had put into a transition matrix.
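
As a rough illustration of the successor-list scheme just described, here is
a minimal Python sketch (not J, and not anyone's posted solution).  The names
build_successors and generate, and the sample text, are made up for the
example.  It assumes the input is already split into words and that repeated
successors are kept in each list, so a uniform pick from a list follows the
observed frequencies without computing any probabilities.

```python
import random
from collections import defaultdict

def build_successors(words):
    """Map each word to the list of words that follow it (repeats kept)."""
    successors = defaultdict(list)
    for current, following in zip(words, words[1:]):
        successors[current].append(following)
    return successors

def generate(words, length, rng=random):
    """Pick a start word at random, then keep picking from successor lists."""
    successors = build_successors(words)
    word = rng.choice(words)
    sentence = [word]
    # Stop early if the current word was never followed by anything.
    while len(sentence) < length and successors.get(word):
        word = rng.choice(successors[word])
        sentence.append(word)
    return sentence

# Hypothetical sample text, used only for demonstration.
sample = "the cat sat on the mat and the cat ran".split()
print(" ".join(generate(sample, 8, random.Random(0))))
```

Because duplicates stay in the lists, a word that follows "the" twice as
often is picked twice as often.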


My discussion assumed someone had indexed the n distinct words in the 
population with integers 0 to n-1 and found the n^2 probabilities in my 
n by n transition matrix.  To generate a random sentence: when index k 
has been chosen, use the probabilities in row k to choose the next 
index; afterwards convert the chosen indices into words.  Too much 
trouble?  You decide.  When you know the transition matrix you can 
calculate other interesting matrices; see my discussion.
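
The matrix version can be sketched the same way, again in Python rather than
J and only under stated assumptions: words are indexed 0 to n-1 in sorted
order, row k holds the observed probabilities of each successor of word k,
and a word with no observed successor gets an all-zero row that ends the
sentence.  The names transition_matrix and generate_indices are hypothetical.

```python
import random

def transition_matrix(words):
    """Index the distinct words 0..n-1 and build the n by n transition matrix."""
    vocab = sorted(set(words))
    index = {w: k for k, w in enumerate(vocab)}
    n = len(vocab)
    counts = [[0] * n for _ in range(n)]
    for a, b in zip(words, words[1:]):
        counts[index[a]][index[b]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Assumption: a word never followed by anything gets an all-zero row.
        matrix.append([c / total for c in row] if total else [0.0] * n)
    return vocab, matrix

def generate_indices(matrix, start, length, rng=random):
    """From index k, use the probabilities in row k to choose the next index."""
    indices = [start]
    k = start
    while len(indices) < length and any(matrix[k]):
        k = rng.choices(range(len(matrix)), weights=matrix[k])[0]
        indices.append(k)
    return indices

# Hypothetical sample text, used only for demonstration.
sample = "the cat sat on the mat and the cat ran".split()
vocab, P = transition_matrix(sample)
chosen = generate_indices(P, 0, 8, random.Random(0))
# Convert the chosen indices into words only at the end.
print(" ".join(vocab[k] for k in chosen))
```

Both sketches sample from the same distribution; the matrix form costs more
to build but leaves the probabilities available for further calculation.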

Kip Murray

On 11/12/2011 4:49 PM, Skip Cave wrote:
> Take a look at Roger Hui's responses in my discussion thread entitled "The
> Travel Itinerary Problem"
> <http://www.jsoftware.com/pipermail/general/2006-March/026640.html>.
> This thread was posted in the General forum March 26, 2006. My problem was
> couched as calculating the probabilities of tourists taking specific tour
> routes through various cities given lots of collected trip data. However,
> the same calculations could be used to calculate the probabilities of word
> n-grams in a text. Those probabilities can then be used to generate texts
> which, while mostly nonsensical, can seem to emulate a particular author's
> writing style.
>
> Scientific American had an article a few years back about a program to
> analyze the word-frequencies and n-gram probabilities of most of
> Shakespeare's works. Then those probabilities were used to generate new
> texts that, while total gibberish, still had the flavor of Shakespeare.
> The Natural Language Toolkit (http://www.nltk.org/) has tools for that
> purpose, and actually provides several of Shakespeare's texts to test them
> on.
>
> I also wrote an APL program to do that same thing back then, but I'm afraid
> I lost that program years ago. Roger's functions in the
> previously-mentioned thread do the bulk of the work, however.
>
> Skip
>
> On Sat, Nov 12, 2011 at 12:43 AM, Daniel Lyons <[email protected]> wrote:
>
>> Hi,
>>
>> I'm playing with J, just trying to get into the J mindset. A small program
>> I wrote recently in another language was a small Markov chain sentence
>> generator. This program had essentially two parts: parsing some input into
>> some kind of internal representation, and generating sentences randomly
>> using that. I'm trying to figure out how one would go about doing this in J.
>>
>> My first thought is to make a bunch of nested boxed arrays, so given a
>> sample input string it would produce a pair of arrays, one with the unique
>> set of words, and another with boxed arrays of successor words. But
>> stumbling onto the lab that discusses a dice game it seems like it might be
>> more natural to write this in terms of some kind of transition table.
>>
>> I am not looking for a concrete solution so much as clues as to how a J
>> programmer would decompose this problem, and what techniques would be
>> involved in solving it.
>>
>> Thanks,
>>
>> --
>> Daniel Lyons
>>
>> ----------------------------------------------------------------------
>> For information about J forums see http://www.jsoftware.com/forums.htm
>>
>
>
>
