Take a look at Roger Hui's responses in my discussion thread, "The Travel Itinerary Problem" <http://www.jsoftware.com/pipermail/general/2006-March/026640.html>, posted to the General forum on March 26, 2006. I framed the problem as calculating the probabilities of tourists taking specific routes through various cities, given a large set of collected trip data. However, the same calculations can be used to estimate the probabilities of word n-grams in a text. Those probabilities can then be used to generate text which, while mostly nonsensical, can seem to emulate a particular author's writing style.
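To make the n-gram idea concrete, here is a minimal sketch in Python (not J, and not Roger's functions from the thread). It assumes the simplest case, bigrams: it counts how often each word follows another, normalizes the counts into probabilities, and then takes a random walk over the resulting table. The names bigram_probabilities and generate, and the sample sentence, are made up for illustration.

```python
import random
from collections import Counter, defaultdict


def bigram_probabilities(words):
    """Map each word to {successor: probability}, estimated from pair counts."""
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    table = {}
    for current, successors in counts.items():
        total = sum(successors.values())
        table[current] = {w: n / total for w, n in successors.items()}
    return table


def generate(table, start, length, rng=None):
    """Random walk over the table; stops early at a word with no successor."""
    if rng is None:
        rng = random.Random()
    out = [start]
    while len(out) < length and out[-1] in table:
        successors = table[out[-1]]
        choices = list(successors)
        weights = [successors[w] for w in choices]
        out.append(rng.choices(choices, weights=weights)[0])
    return out


# Hypothetical sample text, used only to exercise the functions.
sample = "the cat sat on the mat and the cat ran".split()
table = bigram_probabilities(sample)

# "the" is followed by "cat" twice and "mat" once,
# so the estimates are 2/3 for "cat" and 1/3 for "mat".
print(table["the"])
print(" ".join(generate(table, "the", 8, random.Random(0))))
```

The travel itinerary problem has the same shape: replace words with cities and word pairs with consecutive stops on a trip. Longer n-grams would key the table on tuples of preceding words instead of a single word.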
A few years back, Scientific American ran an article about a program that analyzed the word frequencies and n-gram probabilities of most of Shakespeare's works. Those probabilities were then used to generate new texts that, while total gibberish, still had the flavor of Shakespeare. The Natural Language Toolkit (http://www.nltk.org/) has tools for that purpose, and actually provides several of Shakespeare's texts to test them on.

I also wrote an APL program to do the same thing back then, but I'm afraid I lost that program years ago. Roger's functions in the previously mentioned thread do the bulk of the work, however.

Skip

On Sat, Nov 12, 2011 at 12:43 AM, Daniel Lyons <[email protected]> wrote:

> Hi,
>
> I'm playing with J, just trying to get into the J mindset. A small program
> I wrote recently in another language was a small Markov chain sentence
> generator. This program had essentially two parts: parsing some input into
> some kind of internal representation, and generating sentences randomly
> using that. I'm trying to figure out how one would go about doing this in J.
>
> My first thought is to make a bunch of nested boxed arrays, so given a
> sample input string it would produce a pair of arrays, one with the unique
> set of words, and another with boxed arrays of successor words. But
> stumbling onto the lab that discusses a dice game it seems like it might be
> more natural to write this in terms of some kind of transition table.
>
> I am not looking for a concrete solution so much as clues as to how a J
> programmer would decompose this problem, and what techniques would be
> involved in solving it.
>
> Thanks,
>
> --
> Daniel Lyons
>
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
>

--
Skip Cave
Cave Consulting LLC
Phone: 214-460-4861
