On Thu, May 2, 2013 at 5:22 PM, Jim Bromer <jimbro...@hotmail.com> wrote:
> I do not see myself working on an n-gram language model.
I'm sure you have other ideas. If I tried this approach and it didn't work, I would try something else.

The idea is that for a sentence like "Put the green ball on the red block.", a bag-of-words model is insufficient. There are at least four ways you could rearrange the words into something grammatically correct but different in meaning. But if your model is a bag of word pairs like "green ball", "ball on", etc., then there is no ambiguity: there is only one way the word pairs can be reassembled into a sentence. Therefore this 2-gram model should be sufficient both for understanding the command and for constructing responses.

My theory is that people actually learn language this way, rather than through Chomsky's context-free grammars. To test the theory, I would build a model and feed it grammatically incorrect sentences with bad spelling, to see whether the program could still make sense of them as a human would. For example, "Put the gren ball onthe red blok".

One problem is that I would need a lot of examples to train the system, so I would look for a domain where the data is already available. One example is text compression, or equivalently predicting the next word in a sentence. What kinds of grammar models give the best predictions?

--
Matt Mahoney, mattmahone...@gmail.com
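P.S. To make the reassembly claim concrete, here is a rough Python sketch (my own illustration; the function name and structure are mine, not anything from an existing system). It rebuilds the example sentence from an unordered bag of its word pairs by backtracking, since a greedy chain can dead-end at a word like "the" that begins more than one pair.

from collections import Counter

def reassemble(pairs):
    # Return a word sequence that uses every bigram exactly once.
    firsts = Counter(a for a, b in pairs)
    seconds = Counter(b for a, b in pairs)
    # The start word begins more pairs than it ends.
    start = next(w for w in firsts if firsts[w] > seconds[w])
    remaining = Counter(pairs)

    def extend(path):
        if sum(remaining.values()) == 0:   # every pair has been used
            return path
        for pair in list(remaining):
            if remaining[pair] > 0 and pair[0] == path[-1]:
                remaining[pair] -= 1       # try this continuation
                result = extend(path + [pair[1]])
                if result:
                    return result
                remaining[pair] += 1       # dead end; backtrack
        return None

    return extend([start])

words = "Put the green ball on the red block".split()
bag = list(zip(words, words[1:]))          # the unordered bag of word pairs
print(' '.join(reassemble(bag)))
# prints the original sentence, the only complete reassembly

For this example the search finds exactly one ordering that uses all seven pairs, which is the point: the pair constraints remove the ambiguity that a plain bag of words leaves behind.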