>> I basically agree, but you're jumping steps.

I'm not jumping steps. You didn't read/understand what I wrote.
>> 1. First you create a minimal Basic English grammar, hand-coding the rules.

I am NOT suggesting a rule-based system at this level. First I figure out a good representation for the minimal Basic English grammar that fundamentally has the simplest grammatical rules embedded into its structure rather than expressed as rules (i.e. I *AM* hand-crafting the design of the initial structure -- nouns, verbs, adjectives, adverbs, prepositions, noun clauses, verb clauses, prepositional phrases, simple SV sentences, simple SVO sentences, etc. I am also setting up different types of inheritance and analogy links between words/terms/structures). But I am definitely not going to be hand-coding grammar "rules" per se. After the fact, you could obviously say that certain rules define the structure (and you could obviously design an analogous rule-based system -- with *a lot* more effort) but, at the lowest level, my version of the system is not going to be run as a rule-based system. Note that, as best we can determine, this is fundamentally also the way in which humans operate.

You're also missing the fact that language is both grammar and vocabulary. The system needs to be able to "understand" from the beginning ("understand", in this case, meaning being able to translate anything to "seed-only" form -- and thus, via the built-in structure, to be able to know how to transform between alternative forms and to know what aspects and data attach where).

>> 2. At this stage you can only read simple/short sentences.

Yes. And I should be able to write them as well. And as soon as my vocabulary grows a bit, I should be able to paraphrase freely.

>> 3. You then need to add more complex grammar rules to handle "real" English,
>> but such rules are difficult to hand-craft and thus will probably require
>> machine learning.

At this point, the tools I mentioned come into play to extend the structure until it can handle "real" English.
This extension is clearly in the realm of machine learning but, I believe, is structured and limited enough to be feasible -- particularly if I start by pointing it at "well-behaved/well-defined" sources like dictionaries, encyclopedias, etc.

>> 4. Only at this stage you can digest the web or newspapers.

Actually, it can probably start *attempting* to digest such sources fairly early. All it really has to do is be able to tell when it's pretty sure that it's correct and when it needs to try again after it has learned more (and discard the data until then). In particular, though, it can always go to a dictionary when it runs across a new word (or when it seems that a known word has another, unknown definition), and it can also go to a trusted human if it's really not sure how to parse a sentence (at which point it gives that human a list of alternatives to choose from -- which doesn't require extensive training to handle).

>> I guess (3) and (4) won't happen immediately. And after (2) we can start
>> collecting commonsense facts via Basic English. So it seems to me that a
>> viable "first product" could be a commonsense engine using Basic English,
>> without going to 3 & 4.

I disagree. Building the extension/learning tools is a fundamental part of the initial design. If you start collecting commonsense facts without a good data structure and the "understanding" described above, you end up with Cyc (the same facts encoded multiple ways, almost all facts inaccessible unless you access them in almost *exactly* the form in which they were encoded, etc., etc.). I don't find *any* value in that at all. Facts are only useful if you can access and use them.
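The early-digestion policy just described can be sketched as a small decision procedure. Everything here is an illustrative assumption (the threshold value, the function names, the stub components): accept a parse only when confident, go to the dictionary for unknown words first, set aside sentences it cannot parse yet, and otherwise hand a trusted human the list of alternatives to choose from.

```python
# Minimal sketch of the confidence-gated digestion policy described above.
CONFIDENCE_THRESHOLD = 0.9   # assumed cutoff for "pretty sure it's correct"

def digest(sentence, parser, known_words, lookup, ask_human, deferred):
    """Return an accepted parse, or None if the sentence is set aside."""
    # New word (or apparently new sense)? Go to the dictionary first.
    for word in sentence.split():
        if word not in known_words:
            known_words[word] = lookup(word)
    parses = parser(sentence)          # list of (parse, confidence) pairs
    if not parses:
        deferred.append(sentence)      # discard for now; retry after learning
        return None
    best, confidence = max(parses, key=lambda p: p[1])
    if confidence >= CONFIDENCE_THRESHOLD:
        return best                    # pretty sure it's correct: accept
    # Really not sure: give a trusted human the alternatives to choose from.
    return ask_human(sentence, [p for p, _ in parses])

# Toy usage with stub components standing in for the real tools:
known = {}
picked = digest(
    "dog sees cat",
    parser=lambda s: [("SVO(dog, sees, cat)", 0.95), ("SV(dog, sees)", 0.3)],
    known_words=known,
    lookup=lambda w: f"definition of {w}",
    ask_human=lambda s, alts: alts[0],
    deferred=[],
)
print(picked)  # SVO(dog, sees, cat)
```

Note that choosing among a short list of candidate parses is a far easier task for the human than annotating parses from scratch, which is why this path doesn't require extensive training to handle.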
----- Original Message -----
From: YKY (Yan King Yin)
To: [email protected]
Sent: Thursday, April 26, 2007 3:41 PM
Subject: Re: [agi] rule-based NL system

On 4/27/07, Mark Waser <[EMAIL PROTECTED]> wrote:
> >> If this is correct, then we must start with simple sentences (Basic
> >> English) and not with mining the web or newspapers.
>
> Sort of. Basic English is probably pretty close to the minimal complete seed grammar and vocabulary, but it's in an extremely computationally expensive structure. Figuring out a good structure and how to parse Basic English into and out of it -- and then providing the tools mentioned above -- should allow for mining anything. (Now, the alert reader may well object that this merely pushes the burden of "learning" onto the three tools; however, it is my contention that this method reframes the NL problem into <what I believe are> soluble pieces.)

I basically agree, but you're jumping steps.

1. First you create a minimal Basic English grammar, hand-coding the rules.
2. At this stage you can only read simple/short sentences.
3. You then need to add more complex grammar rules to handle "real" English, but such rules are difficult to hand-craft and thus will probably require machine learning.
4. Only at this stage you can digest the web or newspapers.

I guess (3) and (4) won't happen immediately. And after (2) we can start collecting commonsense facts via Basic English. So it seems to me that a viable "first product" could be a commonsense engine using Basic English, without going to 3 & 4.

YKY

------------------------------------------------------------------------------
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?&
