--- Richard Loosemore <[EMAIL PROTECTED]> wrote:

> Matt Mahoney wrote:
> > What did your simulation actually accomplish? What were the results?
> > What do you think you could achieve on a modern computer?
>
> Oh, I hope there's no misunderstanding: I did not build networks to do
> any kind of syntactic learning, they just learned relationships between
> phonemic representations and graphemes. (They learned to spell.) What
> they showed was something already known for the learning of
> pronunciation: that the system first learns spellings by rote, then
> increases its level of accuracy and at the same time starts to pick up
> regularities in the mapping. Then it starts to "regularize" the
> spellings. For example: having learned to spell "height" correctly in
> the early stages, it would then start to spell it incorrectly as "hite",
> because it had learned many other words in which the spelling of the
> phoneme sequence in "height" would involve "-ite". Then in the last
> stages it would learn the correct spellings again.
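That staged trajectory (rote, then over-regularization, then recovery) is easy to probe in a toy setup. Below is a minimal sketch of the kind of experiment you describe, not a reconstruction of your model: the word list, the encoding, and the hyperparameters are all invented, and whether the "hite" stage actually shows up depends on them.

import numpy as np
from sklearn.neural_network import MLPClassifier

# Toy words sharing the rime /aIt/: class 0 = spelled "-ite",
# class 1 = spelled "-eight" (the exception, "height").
words = [("b", 0), ("k", 0), ("s", 0), ("m", 0), ("r", 0), ("h", 1)]
onsets = sorted({o for o, _ in words})

def encode(onset):
    # One-hot onset plus a shared feature for the common rime, so the
    # regular words and the exception "height" overlap in the input.
    v = np.zeros(len(onsets) + 1)
    v[onsets.index(onset)] = 1.0
    v[-1] = 1.0
    return v

X = np.array([encode(o) for o, _ in words])
y = np.array([c for _, c in words])

net = MLPClassifier(hidden_layer_sizes=(8,), learning_rate_init=0.1,
                    random_state=0)
for epoch in range(200):
    net.partial_fit(X, y, classes=[0, 1])
    if epoch % 20 == 0:
        # Watch the exception word: does it pass through a
        # regularized "hite" phase before settling on "-eight"?
        guess = net.predict([encode("h")])[0]
        print(epoch, "height spelled as:", "h-eight" if guess else "hite")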
That's interesting, because children make similar mistakes at higher language levels. For example, a child will learn an irregular verb like "went", then later overgeneralize to "goed" before switching back to the correct form. I am convinced that similar neural learning mechanisms are involved at the lexical and syntactic levels, but on different scales. For example, we learn to classify letters into vowels and consonants by their context, just as we do for nouns and verbs. Then we learn sequential patterns: just as every word needs a vowel, every sentence needs a verb.

I think that learning syntax is a matter of computational power. Children learn the rules for segmenting continuous speech at 7-10 months, but don't learn grammar until years later. So you need more training data and a larger network. The reason I say the problem is O(n^2) is that when you double the information content of the training data, you need to double the number of connections to represent it; since training means passing all of the data through all of the connections, the total computation grows as n x n = n^2. Actually I think it is a little less than O(n^2) (maybe O(n^2/log n)?) because of redundancy in the training data. There are about 1000 times more words than there are letters, so this suggests you need on the order of 100,000 times more computing power for adult-level grammar than for spelling: 1000^2 = 10^6, knocked down by a log factor of roughly 10. This might explain why the problem is still unsolved.

-- Matt Mahoney, [EMAIL PROTECTED]
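A quick numeric check of that last estimate, under the stated O(n^2/log n) assumption (the factor-of-1000 words-to-letters ratio is itself only a rough estimate from the post):

import math

ratio = 1000  # assumed words-to-letters ratio
# cost ~ n^2 / log2(n), so scaling n up by 1000 multiplies the cost
# by about 1000^2 / log2(1000).
extra_compute = ratio**2 / math.log2(ratio)
print(f"{extra_compute:.2e}")  # prints about 1.00e+05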
