How about a third approach?

C. Load a seed (minimal complete) grammar and vocabulary into a computationally tractable knowledge representation structure. Provide learning tools that allow the system to:

a) look up new words and create a new structure that defines them in terms of the minimal complete set plus already-learned terms,

b) recognize when known words are used in unknown ways and begin determining a new definition/structure for the new usage, and

c) recognize when a new grammatical structure is being used (most often because the "obvious" parts of a phrase have been skipped) and begin determining how to flesh out the new structure into a fully specified known (minimal + learned) structure.
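To make the division of labor between tools (a) and (b) concrete, here is a minimal sketch in Python. Everything in it (the vocabulary tables, the classify/learn_word functions, the part-of-speech tags) is my own hypothetical scaffolding, not part of the proposal above; it only shows how an incoming (word, usage) pair might be routed to the right learning tool.

```python
# Hypothetical sketch: a seed vocabulary plus a table of learned terms,
# and a dispatcher that decides which learning tool an observation triggers.

SEED_VOCAB = {
    "dog": {"pos": "noun"},
    "run": {"pos": "verb"},
}

LEARNED_VOCAB = {}  # tool (a) fills this in, defined only via known terms

def classify(word, observed_pos):
    """Route an observed (word, part-of-speech) pair to a learning tool."""
    known = {**SEED_VOCAB, **LEARNED_VOCAB}
    if word not in known:
        return "tool_a"   # unknown word: look it up, define via known terms
    if known[word]["pos"] != observed_pos:
        return "tool_b"   # known word in an unknown usage: start a new sense
    return "known"        # nothing new to learn from this token

def learn_word(word, definition_in_known_terms):
    """Tool (a): record a new word defined in seed + already-learned terms."""
    LEARNED_VOCAB[word] = definition_in_known_terms
```

Tool (c) would need a parallel dispatcher at the phrase level rather than the word level, comparing observed constituent patterns against the seed grammar's rules.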
Human beings clearly have very low-level internal mental knowledge representation structures/grammars that they are born with. Making a system derive these structures de novo will (IMO) make the learning task impossibly hard (even if you start from the human baby level).

>> If this is correct, then we must start with simple sentences (Basic English)
>> and not with mining the web or newspapers.

Sort of. Basic English is probably pretty close to the minimal complete seed grammar and vocabulary, but it's in an extremely computationally expensive structure. Figuring out a good structure, and how to parse Basic English into and out of it -- and then providing the tools mentioned above -- should allow for mining anything. (Now, the alert reader may well object that this merely pushes the burden of "learning" onto the three tools; however, it is my contention that this method reframes the NL problem into <what I believe are> soluble pieces.)

----- Original Message -----
From: YKY (Yan King Yin)
To: [email protected]
Sent: Thursday, April 26, 2007 1:38 PM
Subject: Re: [agi] rule-based NL system

I have an intuition about language learning... There are two different approaches:

A. Learning like a human baby. Start with single words, and then proceed to simple sentences, and so on, each successive "layer" building on the foundation of lower layers.

B. Learning directly from "adult" text corpora, i.e. going directly from age 0 to age 6.

My intuition is that B requires exponentially more computation than A. In other words, the "layers-based" learning pathway reduces computational complexity logarithmically. But I don't know how to prove it. Does it make sense? Can someone corroborate?

If this is correct, then we must start with simple sentences (Basic English) and not with mining the web or newspapers.
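One back-of-envelope way to make the quoted exponential-vs-layered intuition concrete (my own toy model, not a proof, and not from either post): suppose the grammar has k "layers" and each layer has b candidate rules. Learning layer by layer searches roughly b hypotheses per layer, so about b*k in total, while learning all layers jointly from adult text means searching the roughly b**k joint combinations. Since log(b**k) = k*log(b), the layered cost is proportional to the log of the direct cost, which matches the "reduces computational complexity logarithmically" phrasing.

```python
# Toy model of the layered-vs-direct learning cost (hypothetical assumptions:
# b candidate rules per layer, k layers, exhaustive hypothesis search).

def layered_cost(b, k):
    # Approach A: settle each layer's b candidates in turn.
    return b * k

def direct_cost(b, k):
    # Approach B: search all joint rule combinations at once.
    return b ** k

for k in (2, 4, 6):
    print(k, layered_cost(10, k), direct_cost(10, k))
```

With b = 10, the layered cost grows 20, 40, 60 while the direct cost grows 100, 10000, 1000000 -- exponential in k, as the intuition suggests. Real language learning is of course not an exhaustive search, so this only illustrates the shape of the claim.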
YKY

------------------------------
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?&
