--- Richard Loosemore <[EMAIL PROTECTED]> wrote:
> Matt Mahoney wrote:
> > What did your simulation actually accomplish?  What were the results?
> > What do you think you could achieve on a modern computer?
> 
> Oh, I hope there's no misunderstanding:  I did not build networks to do 
> any kind of syntactic learning; they just learned relationships between 
> phonemic representations and graphemes.  (They learned to spell.)  What 
> they showed was something already known for the learning of 
> pronunciation:  that the system first learns spellings by rote, then 
> increases its level of accuracy and at the same time starts to pick up 
> regularities in the mapping.  Then it starts to "regularize" the 
> spellings.  For example: having learned to spell "height" correctly in 
> the early stages, it would then start to spell it incorrectly as "hite" 
> because it had learned many other words in which the spelling of the 
> phoneme sequence in "height" would involve "-ite".  Then in the last 
> stages it would learn the correct spellings again.

That's interesting, because children make similar mistakes at higher language
levels.  For example, a child will learn an irregular verb like "went", then
later generalize to "goed" before switching back to the correct form.
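
Here is a toy sketch (my own, not a reconstruction of your networks) of the
statistical pressure behind that regularization stage: a softmax classifier
trained on a handful of phoneme-to-grapheme pairs in which the regular "-ite"
spelling of /ait/ outnumbers the exceptional "eight" three to one.  All the
symbols and counts are made up for illustration.

import numpy as np

rng = np.random.default_rng(0)

# Toy inventory: a few phonemes, with two candidate spellings for /ait/.
phonemes  = ["k", "s", "h", "r", "ait"]
graphemes = ["k", "s", "h", "r", "ite", "eight"]

# Training pairs: the "-ite" spelling of /ait/ is three times as frequent.
pairs = [("k", "k"), ("s", "s"), ("h", "h"), ("r", "r"),
         ("ait", "ite"), ("ait", "ite"), ("ait", "ite"), ("ait", "eight")]

P = {p: i for i, p in enumerate(phonemes)}
G = {g: i for i, g in enumerate(graphemes)}
W = rng.normal(0.0, 0.1, (len(graphemes), len(phonemes)))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

for epoch in range(200):
    for ph, gr in pairs:
        x = np.zeros(len(phonemes)); x[P[ph]] = 1.0
        y = softmax(W @ x)
        t = np.zeros(len(graphemes)); t[G[gr]] = 1.0
        W += 0.5 * np.outer(t - y, x)   # cross-entropy gradient step

x = np.zeros(len(phonemes)); x[P["ait"]] = 1.0
print(dict(zip(graphemes, softmax(W @ x).round(2))))
# The net prefers "ite" for /ait/, the analogue of the "hite" stage:
# the regular mapping crowds out the rote exception.

With word identity as an extra input, such a network can eventually
re-learn the exception, which would be the third stage you describe.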

I am convinced that similar neural learning mechanisms are involved at the
lexical and syntactic levels, but on different scales.  For example, we learn
to classify letters into vowels and consonants by their context, just as we do
for nouns and verbs.  Then we learn sequential patterns.  Just as every word
needs a vowel, every sentence needs a verb.
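
As a sanity check on the letter-level claim, here is a small
distributional-clustering sketch (mine; the word list is an arbitrary
sample): represent each letter by the distribution of its left and right
neighbors, then run two-means on those vectors.  On a large corpus this
reliably separates vowels from consonants; on a tiny sample the split is
only approximate.

import numpy as np

# An arbitrary illustrative word sample; a real corpus works much better.
words = ("the quick brown fox jumps over a lazy dog while many other "
         "common english words share letter contexts").split()

letters = sorted(set("".join(words)))
L = {c: i for i, c in enumerate(letters)}
n = len(letters)

# Context vector per letter: counts of left and right neighbors,
# plus word-start and word-end markers.
ctx = np.zeros((n, 2 * n + 2))
for w in words:
    for i, c in enumerate(w):
        left  = L[w[i - 1]] if i > 0 else 2 * n
        right = n + L[w[i + 1]] if i < len(w) - 1 else 2 * n + 1
        ctx[L[c], left]  += 1
        ctx[L[c], right] += 1
ctx /= ctx.sum(axis=1, keepdims=True)   # normalize to distributions

# Plain two-means clustering on the context distributions.
rng = np.random.default_rng(1)
centers = ctx[rng.choice(n, 2, replace=False)]
for _ in range(20):
    d = ((ctx[:, None, :] - centers[None]) ** 2).sum(axis=2)
    assign = d.argmin(axis=1)
    for k in range(2):
        if (assign == k).any():
            centers[k] = ctx[assign == k].mean(axis=0)

for k in range(2):
    print(k, "".join(c for c in letters if assign[L[c]] == k))

No labels are given anywhere; the vowel/consonant distinction falls out of
context statistics alone, which is the point of the analogy to nouns and
verbs.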

I think that learning syntax is a matter of computational power.  Children
learn the rules for segmenting continuous speech at 7-10 months, but don't
learn grammar until years later.  So you need more training data and a larger
network.  The reason I say the problem is O(n^2) is that when you double the
information content of the training data, you need to double the number of
connections to represent it, and training time scales as data times
connections.  Actually I think it is a little less than O(n^2) (maybe
O(n^2/log n)?) because of redundancy in the training data.  There are about
1000 times more words than there are letters, so the pure O(n^2) estimate
gives a factor of 1000^2 = 10^6; the log correction brings this down to
something like 100,000 times more computing power for adult-level grammar.
This might explain why the problem is still unsolved.
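
For what it's worth, here is the arithmetic (the alphabet size of 26 is my
own assumption; the 1000x ratio and the n^2/log n form are from the argument
above):

import math

n_letters = 26          # assumed alphabet size (my assumption)
n_words   = 26 * 1000   # "about 1000 times more words than letters"

naive  = (n_words / n_letters) ** 2                 # pure O(n^2)
logged = (n_words ** 2 / math.log(n_words)) / \
         (n_letters ** 2 / math.log(n_letters))

print(f"O(n^2) factor:       {naive:.0e}")          # 1e+06
print(f"O(n^2/log n) factor: {logged:.0e}")         # ~3e+05

The corrected figure lands around 3 x 10^5, between the naive 10^6 and the
100,000 above, so the estimate holds at the order-of-magnitude level.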


-- Matt Mahoney, [EMAIL PROTECTED]
