Re: [agi] Incremental Fluid Construction Grammar released
I agree with most everything you have said so far; it is in line with a lot of the thoughts I have had. How far along is your dialog system? I am here in Austin as well, and would be interested in talking with you further as time permits.

James

Stephen Reed wrote:
> Ben, I want to engage them as volunteers. The OpenMind project is a good example. Another is the game that Cycorp built: http://game.cyc.com . The bootstrap dialog system will operate using Jabber, a standard chat protocol (e.g. Google Chat), so it should easily scale and deploy to the Internet. -Steve

James Ratcliff - http://falazar.com
Re: [agi] Incremental Fluid Construction Grammar released
Hi James,

Your web site is informative. I very much seek comments, input, and collaboration with the AI Lab at the University of Texas. I see that your interest is knowledge-based systems. I worked indirectly with Dr. Porter during my tenure at Cycorp as its first manager for the DARPA Rapid Knowledge Formation project, which is strongly influencing my current project stage.

The design of the Texai bootstrap dialog system is published on my blog: http://texai.org/blog/2008/01/20/bootstrap-dialog-system-design and I am keeping that documentation up to date as I write the code. At the moment I am working on the developer's chat interface, which resembles an instant-messaging client; this is the node labeled UI console chat session node in my illustration. Pre-release source code is stored in the project's SourceForge repository, which can be browsed at http://texai.svn.sourceforge.net/viewvc/texai . The Incremental Fluid Construction Grammar rule application libraries are done, except for heuristics to choose the best rules out of the multitude that I think will eventually be present. All skills on the diagram remain to be written, but I expect the code volume to be reasonable given that this is a bootstrap system.

It would honor me greatly to present my work to you and any of your fellows at UT. I am developing a talk to give to my former coworkers at Cycorp soon. The Fifth International Conference on Construction Grammar is to be held at UT on September 26, 2008. I have submitted the abstract below, and if it is accepted then I'll write the associated paper. The research is already completed and briefly summarized on my blog. Furthermore, I edited the Wikipedia article on FCG to explain more about how it works.

A cognitively plausible implementation of Fluid Construction Grammar (FCG) is described in which the grammar rules are adopted from Double R Grammar (DRG). FCG provides a bi-directional rule application engine in which the working memory is a coupled semantic and syntactic feature structure. FCG itself does not commit to any particular lexical categories, nor does it commit to any particular organization of construction rules. DRG, previously implemented in the ACT-R cognitive architecture, is a linguistic theory of the grammatical encoding and integration of referential and relational meaning in English. Its referential and relational constructions facilitate the composition of logical forms. In this work, a set of bi-directional FCG rules is developed that complies with DRG. Results demonstrate both the lexically incremental parse of an utterance to a precise, discourse-referential logical form, and the semantically incremental production of the original utterance, given as input the discourse-grounded logical form.

Let's keep in touch.
-Steve

Stephen L. Reed, Artificial Intelligence Researcher, http://texai.org/blog, http://texai.org, 3008 Oak Crest Ave., Austin, Texas, USA 78704, 512.791.7860

- Original Message From: James Ratcliff, Sent: Wednesday, January 23, 2008 11:55:19 AM
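To make the abstract's central claim concrete, here is a minimal sketch of bi-directional rule application: a single store of form-meaning pairings drives both parsing and production. The pairing format and logical-form notation are illustrative assumptions, not FCG's or Texai's actual representations.

```python
# Minimal sketch of bi-directional form-meaning pairings; the pairing
# format and logical-form strings are invented for illustration.
PAIRINGS = [
    (("the", "ball"), "(object ball (determiner the))"),
    (("rolls",),      "(relation roll)"),
]

def parse(words):
    """Form -> meaning: match word spans against the stored pairings."""
    meaning, i = [], 0
    while i < len(words):
        for form, sem in PAIRINGS:
            if tuple(words[i:i + len(form)]) == form:
                meaning.append(sem)
                i += len(form)
                break
        else:
            i += 1  # unknown word: skip (a real system would query the user)
    return meaning

def produce(meaning):
    """Meaning -> form: the same pairings, applied in the other direction."""
    words = []
    for sem in meaning:
        for form, s in PAIRINGS:
            if s == sem:
                words.extend(form)
    return " ".join(words)

m = parse("the ball rolls".split())
print(m)           # ['(object ball (determiner the))', '(relation roll)']
print(produce(m))  # the ball rolls
```

The point of the sketch is the symmetry: parse and produce consult the same pairing store, which is what lets one rule set support both the incremental parse and the incremental production the abstract reports.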
Re: [agi] Incremental Fluid Construction Grammar released
Vladimir,

What do you mean by difference in processing here? I said the difference was after the initial processing; by processing I meant syntactic and semantic processing. After processing a syntax-related sentence, the realm of action is changing the system itself, rather than knowledge of how to act on the outside world. I'm fairly convinced that self-change/management/knowledge is the key thing that has been lacking in AI, which is why I find it different and interesting.

> I think that both instructions can be perceived by AI in the same manner, using the same kind of internal representations, if IO is implemented on a sufficiently low level, for example as a stream of letters (or even their binary codes). This way knowledge about spelling and syntax can work with low-level concepts influencing little chunks of IO perception and generation, and 'more semantic' knowledge can work with more high-level aspects. It's less convenient for quick dialog system setup or knowledge extraction from a text corpus, but it should provide flexibility.

I'm not quite sure of the representation or system you are describing, so I can't say what it can or cannot do. Would you expect it to be able to do the equivalent of switching to thinking in a different language?

Will
Re: [agi] Incremental Fluid Construction Grammar released
On Jan 11, 2008 3:01 PM, William Pearson wrote:
> I'm fairly convinced that self-change/management/knowledge is the key thing that has been lacking in AI, which is why I find it different and interesting.

I fully agree with this sentiment, which is why I take it a step further. Instead of building explicit lexical and syntax processing (however mutable), I propose processing textual input the same way all other semantics is handled. In other words, text isn't preprocessed before it's taken to the semantic level; it's dumped there without changes. The same processes that analyze semantics and extract high-level regularities would analyze sequences of symbols and extract words, syntactic structure, and so on. Because it's based on the same, inevitably mutable, knowledge representation, the problem with integration and mutability of language processing doesn't exist.

> Would you expect it to be able to do the equivalent of switching to think in a different language?

Certainly, including mixing of languages. (I'm not sure thinking itself is very language-dependent.) That is why it might be useful to supply binary codes of letters instead of just letters: this way any Unicode symbol can be fed in, so that the system could learn new alphabets without needing to learn a new, separate modality.

The representation I'm talking about, omitting learning for simplicity, is basically a production system that produces (activates) a set of unique symbols (concepts) each tick, based on the sets produced in the previous k ticks. For IO there are special symbols, so that input corresponds to external activation of symbols, and output consists in detecting that special output symbols have been activated by the system. Streamed input corresponds to sequential activation of the letters of the input text: the first letter is externally activated at the first tick, the second letter at the second tick, and so on.

-- Vladimir Nesov
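A minimal sketch of the tick-based production system Vladimir describes, assuming set-valued activations and IN:/OUT: prefixes as the special IO symbols (both conventions are illustrative, not his specification):

```python
# Toy production system: each tick activates a set of symbols based on
# the sets active in the previous k ticks plus external input.
from collections import deque

class ProductionSystem:
    def __init__(self, rules, k=2):
        self.rules = rules                # (frozenset conditions, set productions)
        self.history = deque(maxlen=k)    # active symbol sets from the last k ticks

    def tick(self, external_input=frozenset()):
        context = set(external_input)
        for past in self.history:
            context |= past
        activated = set(external_input)
        for conditions, productions in self.rules:
            if conditions <= context:     # rule fires when all conditions are active
                activated |= productions
        self.history.append(activated)
        # output = the special OUT: symbols the system activated itself
        return {s for s in activated if s.startswith("OUT:")}

# Streamed input: one letter externally activated per tick.
rules = [(frozenset({"IN:h", "IN:i"}), {"OUT:h", "OUT:i"})]  # toy echo rule
ps = ProductionSystem(rules, k=2)
for ch in "hi":
    print(ps.tick({f"IN:{ch}"}))  # tick 1: no output; tick 2: the OUT: echo fires
```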
Re: [agi] Incremental Fluid Construction Grammar released
On 10/01/2008, Benjamin Goertzel wrote:
> Processing a dictionary in a useful way requires quite sophisticated language understanding ability, though. Once you can do that, the hard part of the problem is already solved ;-)

While this kind of system requires sophisticated language understanding ability, I don't think that sophisticated language understanding ability implies the ability to use the dictionary... so you have to be careful to create a system with both abilities. For example, a language understanding system focused on understanding sophisticated sentences about the world external to itself need not be able to add to its own syntactic rules, which would make such systems a lot slower at learning language once they reach that level of language understanding.

I'll be a lot more interested when people start creating NLP systems that are syntactically and semantically processing statements about words, sentences and other linguistic structures, and adding syntactic and semantic rules based on those sentences. I think it is a thorny problem and needs to be dealt with in a creative way, but I would be interested to be proved wrong.

What sort of age of human do you think is capable of this kind of linguistic rule acquisition? I'd guess when kids start asking questions like What is that called? or What does that word mean?, if not before.

Will Pearson
Re: [agi] Incremental Fluid Construction Grammar released
> I'll be a lot more interested when people start creating NLP systems that are syntactically and semantically processing statements about words, sentences and other linguistic structures and adding syntactic and semantic rules based on those sentences.

Depending on exactly what you mean by this, it's not a very far-off thing, and there probably are systems that do this in various ways.

In a lexical grammar approach to NLP, most of the information about the grammar is in the lexicon. So all that's required for the system to learn new syntactic rules is to make the lexicon adaptive. For instance, in the link grammar framework, all that's required is for the AI to be able to edit the link grammar dictionary, which lists the syntactic link types associated with various words. This just requires a bit of abductive inference of the general form:

1) I have no way to interpret sentence S syntactically, yet pragmatically I know that sentence S is supposed to mean (set of logical relations) M
2) If word W (in sentence S) had syntactic link type L attached to it, then I could syntactically interpret sentence S to yield meaning M
3) Thus, I abductively infer that W should have L attached to it (with a certain level of probabilistic confidence)

There is nothing conceptually difficult here, and nothing beyond the state of the art. The link grammar exists (among other frameworks), and multiple frameworks for abductive inference exist (including Novamente's PLN framework). The bottleneck is really the presence of data of type 1), i.e. of instances in which the system knows what a sentence is supposed to mean even though it can't syntactically parse it.

One way to get a system this kind of data is via embodiment. But this is not the only way. It can also be done via pure conversation; for example, suppose I'm talking to an AI, as follows:

AI: What's your name?
Ben: I be Ben Goertzel
AI: What??
Ben: I am Ben Goertzel
AI: Thanks

Now, the AI may not know the grammatical rule needed to parse I be Ben Goertzel. But, after the conversation is done, it knows that the meaning is supposed to be equivalent to that of I am Ben Goertzel, and thus it can edit its grammar (e.g. the link parser dictionary) appropriately, in this case to incorporate the Ebonic grammatical structure of be.

Another way to provide training of type 1) would be if the system had a corpus of multiple different sentences all describing the same thing, wherein it could parse some of the sentences and not others.

In short, I feel that adapting grammar rules based on experience is not an extremely hard problem, though there are surely some moderate-level hidden gotchas. The bottlenecks in this regard appear to be:

-- getting the AI the experience
-- boring old systems integration

-- Ben G
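Steps 1)-3) can be made concrete with a toy sketch. The lexicon format and the parse test below are crude stand-ins, not the real link grammar API; the point is only the abductive move of copying link types from an aligned known word onto an unknown one.

```python
# Toy sketch of abductive lexicon repair: if a sentence fails to parse but a
# known paraphrase with the same meaning succeeds, hypothesize link types for
# the unknown word. Lexicon contents are invented for illustration.
lexicon = {
    "I":   {"Sp"},        # subject link
    "am":  {"Sp", "O"},   # verb linking subject and object
    "Ben": {"O"},
}

def parses(sentence, lex):
    """Toy parse test: every adjacent word pair must share a link type."""
    words = sentence.split()
    return all(lex.get(a, set()) & lex.get(b, set())
               for a, b in zip(words, words[1:]))

def abduce(sentence, paraphrase, lex):
    """Copy the link types of the positionally aligned known word onto the
    unknown word, as a low-confidence hypothesis (step 3 above)."""
    if parses(sentence, lex):
        return
    for w_new, w_old in zip(sentence.split(), paraphrase.split()):
        if w_new not in lex and w_old in lex:
            lex[w_new] = set(lex[w_old])

abduce("I be Ben", "I am Ben", lexicon)
print(parses("I be Ben", lexicon))  # True: "be" inherited the links of "am"
```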
Re: [agi] Incremental Fluid Construction Grammar released
Ben asked: What is the semantics of ?on-situation-localized-14 rdf:type texai:On-SituationLocalized

On-SituationLocalized is a term I created for this use case, while postponing its associated definitional assertions. What I have in mind is that On-SituationLocalized is a specialization of SituationLocalized in which some object is on some other object. Because the Texai KB is derived from the rule-lacking, RDF-compatible subset of OpenCyc, I am postponing for now the representation of the required commonsense rules about these terms.

Ben asked: How would your system parse The book is on neuroscience or The book is on the Washington Monument or The book is on fire or The book is on my shit list

Construction Grammar (CxG) differs from other grammars in that there is much less emphasis on grammar rules. Its basic principle is that language consists of pairings between form and meaning. That is, humans acquire these pairings as young children, matching their parents' intentions with some signs or utterances. As they develop linguistic skills, humans induce the patterns that exist among these pairings, allowing novel combinations to be understood and produced. According to CxG, there is no universal grammar, there are no grammar rules constraining what is well-formed or not, and there may not even be lexical categories (e.g. nouns, adverbs, etc.). Therefore, from the viewpoint of CxG, your example variations of the on construction have their own associated semantics, and are not necessarily covered by the rules that I developed for my sense of on.

In my system the construction rules are adopted from Double R Grammar, which renounces traditional lexical categories in favor of functional, semantically motivated constituents, namely (1) objects, things that can be referenced, and (2) relationships among referents. Double R Grammar thus facilitates the mapping of text form to semantics.

Let's pretend that the Texai dialog system for lexical acquisition exists, and that the system is to learn constructions for your examples. The dialog below is Controlled English, in which the system understands and generates constrained syntax and vocabulary.

[user] Do you understand The book is on neuroscience?
[texai] Not entirely; neuroscience is not a physical location, so how can the book be on it?
[user] The utterance means that the book is about neuroscience.
[texai] Could I generalize on neuroscience to any field of scientific discovery, such as on Astronomy, and mean that the book is about Astronomy?
[user] yes
[texai] Could I generalize field of scientific discovery to some general region, such as on deep space?
[user] yes, this role can be generalized to any topic.
[texai] Does the utterance the book is on Mars entail both the interpretation that (1) the book is physically located on Mars, and (2) the book is about the topic Mars?
[user] yes
...
[user] Do you understand the book is on fire?
[texai] Yes, the book is about the topic fire.
[user] There is another, more common, interpretation: there is a fire in which the book is the thing that is burning.
[texai] Could I generalize fire to any decomposition process, such as rot?
[user] no
...
[user] Do you understand the book is on my shit-list?
[texai] Yes, the book is about the topic shit-list.
[user] There is another, more common, interpretation: there is a shit-list, and the book is an element of the list.
[texai] I know from Wiktionary that a shitlist is a group of people whom a person holds in disregard, but a book is not a person.
[user] The elements of a shit-list can be things.
[texai] Now I understand that the book is on my shit-list commonly means that the book is an element of the group of things that you hold in disregard.
...

Hope this answers your questions. And thanks for advancing my use case!
-Steve

Stephen L. Reed, Artificial Intelligence Researcher, http://texai.org
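The dialog above effectively adds a second form-meaning pairing for on. A toy sketch of how such paired constructions might coexist and both fire, with the pattern notation and logical forms invented for illustration:

```python
# Toy store of CxG form-meaning pairings; every construction here has one
# open slot, and disambiguation between senses is left to the KB.
from dataclasses import dataclass

SLOT = "?"

@dataclass
class Construction:
    form: tuple    # word pattern; SLOT marks the open slot
    meaning: str   # logical-form template with a hole

constructions = [
    Construction(("the", "book", "is", "on", SLOT), "(aboutTopic book ?)"),  # topic sense
    Construction(("the", "book", "is", "on", SLOT), "(locatedOn book ?)"),   # location sense
]

def interpret(words):
    """Return every meaning whose form pattern matches the utterance."""
    out = []
    for c in constructions:
        if len(words) == len(c.form) and all(
                p == SLOT or p == w for p, w in zip(c.form, words)):
            filler = words[c.form.index(SLOT)]
            out.append(c.meaning.replace("?", filler))
    return out

print(interpret("the book is on neuroscience".split()))
# ['(aboutTopic book neuroscience)', '(locatedOn book neuroscience)']
# Ruling out the location reading (neuroscience is not a place) needs the KB.
```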
Re: [agi] Incremental Fluid Construction Grammar released
On Jan 10, 2008 9:59 AM, Stephen Reed wrote:
> [texai] Now I understand that the book is on my shit-list commonly means that the book is an element of the group of things that you hold in disregard.

If you successfully get this level of language usage from a machine, can you figure out a way to have people speak as succinctly?
Re: [agi] Incremental Fluid Construction Grammar released
- Original Message From: Benjamin Goertzel, Sent: Wednesday, January 9, 2008 4:04:58 PM

> > And how would a young child or foreigner interpret on the Washington Monument or shit list? Both are physical objects and a book *could* be resting on them.
>
> Sorry, my shit list is purely mental in nature ;-) ... at the moment, I maintain a task list but not a shit list... maybe I need to get better organized!!!
>
> > Ben, your question is *very* disingenuous.
>
> Who, **me** ???
>
> > There is a tremendous amount of domain/real-world knowledge that is absolutely required to parse your sentences. Do you have any better way of approaching the problem? I've been putting a lot of thought and work into trying to build and maintain precedence of knowledge structures with respect to disambiguating (and overriding incorrect) parsing... and I don't believe that it's going to be possible without a severe amount of knowledge... What do you think?
>
> OK... Let's assume one is working within the scope of an AI system that includes an NLP parser and a logical knowledge representation system, and needs some intelligent way to map the output of the former into the latter. Then, in this context, there are three approaches, which may be tried alone or in combination:
>
> 1) Hand-code rules to map the output of the parser into a much less ambiguous logical format
> 2) Use statistical learning across a huge corpus of text to somehow infer these rules [I did not ever flesh out this approach as it seemed implausible, but I have to recognize its theoretical possibility]
> 3) Use **embodied** learning, so that the system can statistically infer the rules from the combination of parse trees with the logical relationships that it observes to describe situations it sees [This is the best approach in principle, but may require years and years of embodied interaction for a system to learn.]
>
> Obviously, Cycorp has taken Approach 1, with only modest success. But I think part of the reason they have not been more successful is a combination of a bad choice of parser with a bad choice of knowledge representation. They use a phrase structure grammar parser and predicate logic, whereas I believe if one uses a dependency grammar parser and term logic, the process becomes a lot easier. So far as I can tell, in texai you are replicating Cyc's choices in this regard (phrase structure grammar + predicate logic).

Yes, the Texai implementation of Incremental Fluid Construction Grammar follows the phrase structure approach, in which leaf lexical constituents are grouped into a structure (i.e. construction) hierarchy. Yet, because it is incremental and thus cognitively plausible, it should scale to longer sentences better than any non-incremental alternative. The mapping of form to predicate logic (RDF-style) is facilitated both by Fluid Construction Grammar (FCG) and by Double R Grammar (DRG). I am using the production rule engine from FCG, enhanced to operate incrementally, and the construction theory from DRG, whose focus is on referents and the relationships among them. For quantifier scoping I expect to use Minimal Recursion Semantics, which should plug into the FCG feature structure.

-Steve

Stephen L. Reed, Artificial Intelligence Researcher, http://texai.org
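A small sketch of the incremental rule application Steve describes: each arriving word fires production rules against a coupled syntactic/semantic feature structure, so a partial logical form exists before the sentence ends. The two rules below are invented toy examples, not IFCG's actual constructions.

```python
# Toy incremental construction application: one word at a time extends a
# coupled syn/sem feature structure in working memory.
def apply_rules(fs, word, rules):
    """Fire every rule whose trigger matches; each rule may update both the
    syntactic and the semantic half of the feature structure."""
    for trigger, update in rules:
        if trigger(fs, word):
            update(fs, word)

rules = [
    # a determiner opens a referring expression
    (lambda fs, w: w == "the",
     lambda fs, w: fs["syn"].append(("Det", w))),
    # the following noun closes it and contributes a discourse referent
    (lambda fs, w: w != "the" and fs["syn"] and fs["syn"][-1][0] == "Det",
     lambda fs, w: (fs["syn"].append(("Noun", w)),
                    fs["sem"].append(f"(referent {w})"))),
]

fs = {"syn": [], "sem": []}
for word in "the book".split():
    apply_rules(fs, word, rules)
print(fs["sem"])  # ['(referent book)']
```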
Re: [agi] Incremental Fluid Construction Grammar released
On 10/01/2008, Benjamin Goertzel wrote:
> Depending on exactly what you mean by this, it's not a very far-off thing, and there probably are systems that do this in various ways.

What I mean by it is systems that can learn from lessons like the following: http://www.primaryresources.co.uk/english/PC_prefix2.htm

I could easily whip up something very narrow which didn't do too poorly for prefixes (involving regular expressions transforming the words). But it would be horribly brittle, specific only to prefixes, and would know what prefixes were beforehand.

And your I be example made me think of pirates rather than Ebonics :). It is also not what I am looking for, because it relies on the system looking for regularities, rather than being explicitly told about them. The benefit of being able to be told there are regularities is that you do not always have to be looking out for them, saving processing time and memory for other, more important tasks.

Will
Re: [agi] Incremental Fluid Construction Grammar released
Mike Dougherty wrote (Thursday, January 10, 2008 9:17:43 AM):
> If you successfully get this level of language usage from a machine, can you figure out a way to have people speak as succinctly?

Mike,

If I understand your question correctly, it asks whether a non-expert user can be guided to use Controlled English in a dialog system. In such a system it is expected that small differences exist between the few things that the system understands and the vast number of things that it does not. The differences can be morphological (e.g. spelling), lexical (e.g. vocabulary), syntactic (e.g. passive vs. active), or semantic (e.g. word sense). Therefore my challenge is to (1) find a polite, non-boring, engaging manner of getting the user to say things the way the system can understand, and (2) enable the system to learn new forms: the things the user is trying to say that it cannot yet understand. The Texai bootstrap dialog system will be an expert system on lexical knowledge acquisition, and hopefully will swiftly grow past the very-hard-to-use stage.

This is an idea that I wanted to try at Cycorp, but Doug Lenat said that it had been tried before and failed, due to great resistance among users to Controlled English. Let's see if this idea can be made to work now, or not.

-Steve

Stephen L. Reed, Artificial Intelligence Researcher, http://texai.org
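One piece of challenge (1) can be sketched concretely: when an utterance falls outside the controlled language, tell the user which kind of difference blocked understanding. The vocabulary, pattern format, and replies below are illustrative assumptions, not Texai's design.

```python
# Toy Controlled English gatekeeper: distinguish a lexical gap from a
# syntactic one, so the guidance given to the user can be specific.
VOCAB = {"the", "book", "is", "on", "table"}
PATTERNS = [("the", "*", "is", "on", "the", "*")]  # "*" matches any known word

def diagnose(utterance):
    words = utterance.lower().split()
    unknown = [w for w in words if w not in VOCAB]
    if unknown:  # lexical difference
        return f"I don't know the word '{unknown[0]}' yet. Can you rephrase?"
    for p in PATTERNS:
        if len(p) == len(words) and all(a in ("*", b) for a, b in zip(p, words)):
            return "Understood."
    return "I know those words, but not that sentence shape."  # syntactic difference

print(diagnose("The book is on the table"))  # Understood.
print(diagnose("The tome is on the table"))  # lexical gap: 'tome'
```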
Re: [agi] Incremental Fluid Construction Grammar released
Will,

Affixes are morphological constructions, and my system could have rules to handle them; I plan eventually to include such rules for novel combinations. However, the Texai lexicon will explicitly represent all common word forms and multi-word phrases that would otherwise be covered by rules, in order to accommodate exceptions. My goal is precise understanding and generation, and that goal is guided by the desire to be cognitively plausible (i.e. to do as humans do). I believe that the human mental lexicon caches the output of morphological rules as projected word forms paired with their semantics, and invokes those rules only when comprehending a new or uncommon combination.

-Steve

Stephen L. Reed, Artificial Intelligence Researcher, http://texai.org
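A sketch of the cached-lexicon-with-morphological-fallback idea, under the assumption of a toy un- rule and a simple logical-form notation (both invented for illustration):

```python
# Common forms are stored explicitly; the morphological rule fires only for
# novel combinations, and its result is cached, as Steve describes.
lexicon = {
    "happy":   "(property happy)",
    "unhappy": "(not (property happy))",  # common form cached explicitly
    "tidy":    "(property tidy)",
}

def lookup(word):
    # 1. Cached common and irregular forms are found directly.
    if word in lexicon:
        return lexicon[word]
    # 2. Only a new combination invokes the un- rule.
    if word.startswith("un") and word[2:] in lexicon:
        meaning = f"(not {lexicon[word[2:]]})"
        lexicon[word] = meaning  # cache the projected form for next time
        return meaning
    return None

print(lookup("unhappy"))  # cached: (not (property happy)), no rule needed
print(lookup("untidy"))   # novel: the rule fires and the result is cached
```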
Re: [agi] Incremental Fluid Construction Grammar released
On Jan 10, 2008 10:26 AM, William Pearson wrote:
> Your example didn't have statements *about* words, but new rules were inferred from word usage.

Well, here's the thing. Dictionary text and English-grammar-textbook text are highly ambiguous and complex English... so you'll need a very sophisticated NLP system to be able to grok them...

OTOH, you could fairly easily define a limited, controlled syntax encompassing a variety of statements about words, sentences and other linguistic structures, and then make a system add syntactic and semantic rules based on these sentences. But I don't see what the point would be, because telling the system stuff in the controlled syntax would be basically as much work as explicitly encoding the rules...

-- Ben
Re: [agi] Incremental Fluid Construction Grammar released
Hi,

> Yes, the Texai implementation of Incremental Fluid Construction Grammar follows the phrase structure approach, in which leaf lexical constituents are grouped into a structure (i.e. construction) hierarchy. Yet, because it is incremental and thus cognitively plausible, it should scale to longer sentences better than any non-incremental alternative.

I agree that the incremental approach to parsing is the correct one, as opposed to the whole-sentence-at-once approach taken in the link parser and most other parsers. However, this is really a quite separate issue from the choice of hierarchical phrase-structure-based grammar versus dependency grammar. For instance, Word Grammar is a dependency-based approach that incorporates incremental parsing (but has not been turned into a viable computational system).

-- Ben G
Re: [agi] Incremental Fluid Construction Grammar released
Granted that from a logical viewpoint, using a controlled English syntax to acquire rules is as much work as explicitly encoding the rules. However, a suitable, engaging bootstrap dialog system may permit a multitude of non-expert users to add the rules, thus dramatically reducing the amount of programmatic encoding, and the duration of the effort. That is my hypothesis and plan.

-Steve

Stephen L. Reed, Artificial Intelligence Researcher, http://texai.org

- Original Message From: Benjamin Goertzel, Sent: Thursday, January 10, 2008 11:06:45 AM
> But I don't see what the point would be, because telling the system stuff in the controlled syntax would be basically as much work as explicitly encoding the rules...
Re: [agi] Incremental Fluid Construction Grammar released
Do you plan to pay these non-experts, or recruit them as volunteers?

ben

On Jan 10, 2008 1:11 PM, Stephen Reed wrote:
> However, a suitable, engaging bootstrap dialog system may permit a multitude of non-expert users to add the rules, thus dramatically reducing the amount of programmatic encoding, and the duration of the effort. That is my hypothesis and plan.
Re: [agi] Incremental Fluid Construction Grammar released
Ben,

I want to engage them as volunteers. The OpenMind project is a good example; another is the game that Cycorp built: http://game.cyc.com . The bootstrap dialog system will operate using Jabber, a standard chat protocol (e.g. Google Chat), so it should easily scale and deploy to the Internet.

-Steve

Stephen L. Reed, Artificial Intelligence Researcher, http://texai.org

- Original Message From: Benjamin Goertzel, Sent: Thursday, January 10, 2008 12:16:59 PM
> Do you plan to pay these non-experts, or recruit them as volunteers?
Re: [agi] Incremental Fluid Construction Grammar released
On Jan 10, 2008 10:57 AM, Stephen Reed wrote:
> If I understand your question correctly, it asks whether a non-expert user can be guided to use Controlled English in a dialog system. ... This is an idea that I wanted to try at Cycorp, but Doug Lenat said that it had been tried before and failed, due to great resistance among users to Controlled English. Let's see if this idea can be made to work now, or not.

Basically, yes. I was also cynically suggesting that it would be difficult to teach the majority of existing human brains how to use Controlled English, even though you wouldn't have to build them first. If you have a semi-working prototype at some point, please email me an invitation; I am very interested in such a dialog. :)
Re: [agi] Incremental Fluid Construction Grammar released
Mike,

I'm beginning now to tear out my previous naive construction grammar code and plug in incremental FCG. When that is finished, maybe by month's end, I'll begin tediously hand-crafting the constructions, and procedures, to support minimal dialog. Then I'll get the dialog system interfaced with my Jabber client, and off we go. I can use either Google Chat or Jabber.org as the scalable chat server, and Texai will run as a Jabber client on my scalable, cheap Linux cluster.

Thanks for asking to participate!
-Steve

Stephen L. Reed, Artificial Intelligence Researcher, http://texai.org

- Original Message From: Mike Dougherty, Sent: Thursday, January 10, 2008 2:25:33 PM
> If you have a semi-working prototype at some point, please email me an invitation; I am very interested in such a dialog. :)
Re: [agi] Incremental Fluid Construction Grammar released
On 10/01/2008, Benjamin Goertzel wrote:
> Well, here's the thing. Dictionary text and English-grammar-textbook text are highly ambiguous and complex English... so you'll need a very sophisticated NLP system to be able to grok them...

Firstly, so what? Why not allow for the fact that there will hopefully be a sophisticated NLP system in the system at some point? Give it the hooks to use dictionary-style acquisition, even if it won't for the first x years of development. We are aiming for adult human-level in the end, right? Not just a 5-year-old. It will make adding French or another language a whole lot quicker when it comes to that level. Retrofitting the ability may or may not be easy at that stage; it would be better to figure out whether it is easy or not before settling on an architecture. My hunch is that it is not easy.

Secondly, I'm not buying that it is any more complex than dealing with other domains. You easily get equal complexity dealing with non-linguistic stuff. For example:

This is a battery. A battery can be part of a machine. Putting a battery in the battery holder gives the machine power.

is as complex, if not more so, than:

un- is a prefix. A prefix is the front part of a word. Adding un- to a word is equivalent to saying not word.

What the system does after processing these different sets of sentences is vastly different, and that difference is worth exploring before settling on an architecture, IMO. Not building the potential for a capability into a baby-based AI, even if it is not initially used, means that when the AI is grown up it still won't be able to have that capability. Unless you are relying on it getting to the self-modifying-code phase before the asking-what-words-mean phase.

Will
Re: [agi] Incremental Fluid Construction Grammar released
I am very interested in parsing the constructions used in WordNet and Wiktionary glosses (i.e. definitions). Here are some samples from WordNet online, http://wordnet.princeton.edu/perl/webwn . The glosses are parenthesized, and usage examples follow them.

(1) very simple patterns

ruble - (the basic unit of money in Tajikistan)
ruble, rouble (the basic unit of money in Russia)
lira, Maltese lira (the basic unit of money on Malta; equal to 100 cents)
lira, Turkish lira (the basic unit of money in Turkey)
lira, Italian lira (formerly the basic unit of money in Italy; equal to 100 centesimi)

(2) complex constructions

break (terminate) She interrupted her pregnancy; break a lucky streak; break the cycle of poverty
break, separate, split up, fall apart, come apart (become separated into pieces or fragments) The figurine broke; The freshly baked loaf fell apart
break (render inoperable or ineffective) You broke the alarm clock when you took it apart!
break, bust (ruin completely) He busted my radio!
break (destroy the integrity of; usually by force; cause to separate into pieces or fragments) He broke the glass plate; She broke the match
transgress, offend, infract, violate, go against, breach, break (act in disregard of laws, rules, contracts, or promises) offend all laws of humanity; violate the basic laws or human civilization; break a law; break a promise
break, break out, break away (move away or escape suddenly) The horses broke from the stable; Three inmates broke jail; Nobody can break out--this prison is high security
break (scatter or part) The clouds broke after the heavy downpour

Having a dialog system gives one the ability to query a contributing user about otherwise confusing or circular glosses. Plus, one can always recurse into a session to understand a word or phrase used in a containing gloss. And after the commonly occurring word senses from Wiktionary / WordNet glosses are understood and incorporated into the KB, it's on to Wikipedia. For the latter, I'm closely monitoring the Cyc Foundation effort to link OpenCyc with Wikipedia topics.

-Steve

Stephen L. Reed, Artificial Intelligence Researcher, http://texai.org
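The "very simple patterns" really are machine-friendly. A toy sketch of mining the basic-unit-of-money gloss pattern above, where the regular expression and the output relation name are assumptions for illustration:

```python
# Extract (currencyOf place word) facts from the simple WordNet gloss pattern.
import re

PATTERN = re.compile(
    r"^(?P<word>[\w ,]+?)\s*[-,]?\s*\((formerly )?"
    r"the basic unit of money (in|on) (?P<place>[\w ]+?)[;)]")

glosses = [
    "ruble - (the basic unit of money in Tajikistan)",
    "lira, Maltese lira (the basic unit of money on Malta; equal to 100 cents)",
]
for g in glosses:
    m = PATTERN.match(g)
    if m:
        head = m.group('word').split(',')[0].strip()  # first listed synonym
        print(f"(currencyOf {m.group('place').strip()} {head})")
# (currencyOf Tajikistan ruble)
# (currencyOf Malta lira)
```

The complex constructions under (2) are exactly where such pattern mining stops working and the dialog system's ability to query a contributing user becomes the fallback.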
Re: [agi] Incremental Fluid Construction Grammar released
All this discussion of building a grammar seems to ignore the obvious fact that in humans, language learning is a continuous process that does not require any explicit encoding of rules. I think either your model should learn this way, or you need to explain why your model would be more successful by taking a different route. Explicit encoding of grammars has a long history of failure, so your explanation should be good. At a minimum, the explanation should describe how humans actually learn language and why your method is better.

Natural language has a structure that allows it to be learned in the same order that children learn: lexical, semantics, grammar. Artificial language lacks this structure.

1. Lexical: word boundaries occur where the mutual information between n-grams (phoneme or letter sequences) on opposite sides is smallest. Words have a Zipf distribution, so that the vocabulary grows at a constant rate.
2. Semantics: words with related meanings are more likely to co-occur within a small time window.
3. Grammar: words of the same type (part of speech) are more likely to occur in the same immediate context.

The problem with statistical models trained on text is that the semantics is not grounded. A model can learn associations like rain...wet...water, but does not associate these words with sensory or motor I/O as humans do. So your language model might pass a text compression test or a Turing test, but would still lack the knowledge needed to integrate it into a robot. Some have argued that this is a good enough reason to code knowledge explicitly (i.e. expert systems, Cyc), but I don't buy it. Where is the mechanism for updating the knowledge base during a conversation?

Some have argued that we should use an artificial or simplified language to make the problem easier, but I don't buy it. Artificial languages are designed to be processed in the wrong order: lexical, grammar, semantics. How do you transition to natural language? You cannot parse natural language without knowing the meanings of the words. You would have avoided that problem if you learned the meanings first, before learning the grammar.

-- Matt Mahoney
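Point 1 is easy to demonstrate on a toy corpus: place word boundaries where the pointwise mutual information between adjacent letters is at a local minimum. The corpus, word list, and local-minimum rule below are illustrative assumptions, not Matt's specification.

```python
# Unsupervised word segmentation: adjacent letters that predict each other
# poorly (low PMI) suggest a word boundary, per Matt's point 1.
import random
from collections import Counter
from math import log

random.seed(0)
words = ["the", "cat", "sat", "on", "mat"]
corpus = "".join(random.choice(words) for _ in range(20000))  # spaces removed

pairs = Counter(zip(corpus, corpus[1:]))
singles = Counter(corpus)
n = len(corpus)

def pmi(a, b):
    """Pointwise mutual information of the adjacent letter pair (a, b)."""
    return log(pairs[(a, b)] / (n - 1)) - log(singles[a] / n) - log(singles[b] / n)

text = "thecatsatonthemat"
scores = [pmi(a, b) for a, b in zip(text, text[1:])]  # scores[i]: link text[i]-text[i+1]

out = ""
for i, ch in enumerate(text):
    out += ch
    if 1 <= i <= len(scores) - 2 and scores[i] < scores[i - 1] and scores[i] < scores[i + 1]:
        out += " "  # local minimum in PMI: propose a word boundary
print(out)  # with a corpus this size, should recover: the cat sat on the mat
```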
Re: [agi] Incremental Fluid Construction Grammar released
On Jan 10, 2008 10:03 PM, Matt Mahoney [EMAIL PROTECTED] wrote: All this discussion of building a grammar seems to ignore the obvious fact that in humans, language learning is a continuous process that does not require any explicit encoding of rules. I think either your model should learn this way, or you need to explain why your model would be more successful by taking a different route. Explicit encoding of grammars has a long history of failure, so your explanation should be good. At a minimum, the explanation should describe how humans actually learn language and why your method is better.

Matt,

If you read the paper at the top of this list, http://www.novamente.net/papers/ , you will see a brief summary of the reasoning behind the approach I am taking. It is only 8 pages long, so it should be quick to read, though it obviously does not explain all the details in that length. The abstract is as follows:

* Abstract— Current work is described wherein simplified versions of the Novamente Cognition Engine (NCE) are being used to control virtual agents in virtual worlds such as game engines and Second Life. In this context, an IRC (imitation-reinforcement-correction) methodology is being used to teach the agents various behaviors, including simple tricks and communicative acts. Here we describe how this work may potentially be exploited and extended to yield a pathway toward giving the NCE robust, ultimately human-level natural language conversation capability. The pathway starts via using the current system to instruct NCE-controlled agents in semiosis and gestural communication; and then continues via integration of a particular sort of hybrid rule-based/statistical NLP system (which is currently partially complete) into the NCE-based virtual agent system, in such a way as to allow experiential adaptation of the rules underlying the NLP system, *

I do not think that a viable design for an AGI needs to include a description of human learning (of language or anything else). No one understands exactly how the human brain works yet, but that doesn't mean we can't potentially have success with non-brain-emulating AGI approaches.

My favorite theorists of human language are Richard Hudson (see his 2007 book Language Networks) and Tomasello (see his book Constructing a Language). I actually believe my approach to language in AGI is quite close to their ideas, but I don't have time/space to justify this statement in an email.

-- Ben
Re: [agi] Incremental Fluid Construction Grammar released
Matt, I agree with Ben.

Tomasello's book Constructing a Language: A Usage-Based Theory of Language Acquisition argues that young children develop the skill to discern the intentional actions of others. Construction Grammar (CxG) is a simple pairing of form and meaning. According to this theory, implemented in my Incremental Fluid Construction Grammar, children learn the pairings of what their parents say and what their parents' intentions are. Children generalize (i.e. induce) patterns among the instance pairings they experience; CxG names these patterns constructions. In my Incremental Fluid Construction Grammar these constructions are assembled from the input utterance word by word, by the application of production rules against a growing feature structure in working memory.

To get my bootstrap dialog system going I will explicitly code as few of these rules as are necessary. However, I do not believe that my system should learn these rules from percepts alone. Let's see if it works. I hope that in some months we can debate its actual behavior.

-Steve

Stephen L. Reed
Artificial Intelligence Researcher
http://texai.org/blog
http://texai.org
3008 Oak Crest Ave. Austin, Texas, USA 78704
512.791.7860
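A minimal sketch of the word-by-word scheme Steve describes (not the actual IFCG code; the rule format, patterns, and meaning templates are invented placeholders): each construction pairs a form pattern with a meaning fragment, and every matching rule fires against the growing word buffer as each word arrives.

from dataclasses import dataclass, field

@dataclass
class Construction:
    name: str
    form: list      # word pattern; "?" matches any single word
    meaning: str    # meaning template over pattern positions

@dataclass
class FeatureStructure:
    words: list = field(default_factory=list)
    meanings: list = field(default_factory=list)

RULES = [
    Construction("DeterminerNoun", ["the", "?"], "definite({1})"),
    Construction("OnLocative", ["?", "is", "on", "the", "?"],
                 "On-SituationLocalized(above={0}, below={4})"),
]

def parse_incrementally(utterance):
    # Apply every matching rule as each new word extends working memory.
    fs = FeatureStructure()
    for word in utterance.split():
        fs.words.append(word)
        for rule in RULES:
            n = len(rule.form)
            window = fs.words[-n:]
            if len(window) == n and all(
                p in ("?", w) for p, w in zip(rule.form, window)
            ):
                fs.meanings.append(rule.meaning.format(*window))
    return fs

print(parse_incrementally("the book is on the table").meanings)
# ['definite(book)', 'definite(table)',
#  'On-SituationLocalized(above=book, below=table)']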
[agi] Incremental Fluid Construction Grammar released
On the SourceForge project site, I just released the Java library for Incremental Fluid Construction Grammar. Fluid Construction Grammar is a natural language parsing and generation system developed by researchers at emergent-languages.org. The system features a production rule mechanism for both parsing and generation using a reversible grammar. This library extends FCG so that it operates incrementally, word by word, left to right in English. Furthermore, its construction rules are adapted from Double R Grammar. See this blog post for more information about Double R Grammar. Execution scripts for a parsing benchmark and for the unit test cases are supplied in Linux and Windows versions.

Next tasks are to integrate IFCG into the existing, but not yet released, dialog framework. The framework will heuristically guide the application of construction rules during parsing, and plan the application of rules during generation. Furthermore, the framework will incrementally prune alternate interpretations during parsing by employing Walter Kintsch's Construction/Integration method for discourse comprehension.

-Steve

Stephen L. Reed
Artificial Intelligence Researcher
http://texai.org/blog
http://texai.org
3008 Oak Crest Ave. Austin, Texas, USA 78704
512.791.7860
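The reversibility property can be sketched in miniature. This is not the released Java library: the rule format below is invented, and real FCG operates over feature structures rather than flat templates. The idea is simply that the same form-meaning pairing is run forward to parse and backward to generate.

from dataclasses import dataclass

@dataclass
class ReversibleRule:
    form: list      # surface pattern; "{x}"-style items are slots
    meaning: str    # meaning template over the same slot names

rule = ReversibleRule(
    form=["{x}", "is", "on", "{y}"],
    meaning="On-SituationLocalized(above={x}, below={y})",
)

def parse(rule, words):
    # Forward direction: bind slots against the words, emit the meaning.
    if len(words) != len(rule.form):
        return None
    bindings = {}
    for pat, word in zip(rule.form, words):
        if pat.startswith("{"):
            bindings[pat[1:-1]] = word
        elif pat != word:
            return None
    return rule.meaning.format(**bindings)

def generate(rule, bindings):
    # Reverse direction: fill the surface pattern from the same slots.
    return " ".join(p.format(**bindings) for p in rule.form)

print(parse(rule, "book is on table".split()))
# On-SituationLocalized(above=book, below=table)
print(generate(rule, {"x": "book", "y": "table"}))
# book is on table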
Re: [agi] Incremental Fluid Construction Grammar released
One of the things that I quickly discovered when first working on my convert-it-all-to-Basic-English project is that the simplest words (prepositions and the simplest verbs in particular) are the biggest problem, because they have so many different (though obscurely related) meanings, not to mention being part of one-off phrases.

Some of the problems are resolved by stronger typing (as in variable typing). For example, On-SituationLocalized is clearly meant to deal with two physical objects and shouldn't apply to neuroscience. But *that* sentence is easy after you realize that neuroscience really can only have the type field-of-study or topic. The "on" becomes obvious then -- provided that you have that many variable types and rules for prepositions (not an easy thing).

And how would a young child or foreigner interpret "on the Washington Monument" or "shit list"? Both are physical objects and a book *could* be resting on them. It's just that there are more likely alternatives. "On" has a specific meaning (a-member-of-this-ordered-group) for lists and another specific meaning (about-this-topic) for books, movies, and other subject-matter-describers. The special "on" overrides the generic "on" -- provided that you have even more variable types and special rules for prepositions. And "on fire" is a simple override phrase -- provided that you're keeping track of even more specific instances . . . .

- - - - -

Ben, your question is *very* disingenuous. There is a tremendous amount of domain/real-world knowledge that is absolutely required to parse your sentences. Do you have any better way of approaching the problem? I've been putting a lot of thought and work into trying to build and maintain precedence of knowledge structures with respect to disambiguating (and overriding incorrect) parsing . . . . and I don't believe that it's going to be possible without a severe amount of knowledge . . . . What do you think?

- Original Message - From: Benjamin Goertzel [EMAIL PROTECTED] To: agi@v2.listbox.com Sent: Wednesday, January 09, 2008 3:51 PM Subject: Re: [agi] Incremental Fluid Construction Grammar released

What is the semantics of ?on-situation-localized-14 rdf:type texai:On-SituationLocalized ? How would your system parse The book is on neuroscience or The book is on the Washington Monument or The book is on fire or The book is on my shit list ?

thx
Ben

On Jan 9, 2008 3:37 PM, Stephen Reed [EMAIL PROTECTED] wrote:

Ben,
The use case utterance the book is on the table yields the following RDF statements (i.e. subject, predicate, object triples). A yet-to-be-written discourse mechanism will resolve ?obj-4 to the known book and ?obj-18 to the known table.

Parsed statements about the book:
?obj-4 rdf:type cyc:BookCopy
?obj-4 rdf:type texai:FCGClauseSubject
?obj-4 rdf:type texai:PreviouslyIntroducedThingInThisDiscourse
?obj-4 texai:fcgDiscourseRole texai:external
?obj-4 texai:fcgStatus texai:SingleObject

Parsed statements about the table:
?obj-18 rdf:type cyc:Table
?obj-18 rdf:type texai:PreviouslyIntroducedThingInThisDiscourse
?obj-18 texai:fcgDiscourseRole texai:external
?obj-18 texai:fcgStatus texai:SingleObject

Parsed statements about the book on the table:
?on-situation-localized-14 rdf:type texai:On-SituationLocalized
?on-situation-localized-14 texai:aboveObject ?obj-4
?on-situation-localized-14 texai:belowObject ?obj-18

Parsed statements about the fact that the book is on the table (the fact that ?on-situation-localized-14 is a proper sub-situation of ?situation-localized-10 should also be here):
?situation-localized-10 rdf:type cyc:Situation-Localized
?situation-localized-10 texai:situationHappeningOnDate cyc:Now
?situation-localized-10 cyc:situationConstituents ?obj-4

Cyc parsing is based upon semantic translation templates, which are stitched together with procedural code following the determination of constituent structure by a plug-in parser such as the CMU link-grammar. My method differs in that: (1) I want to get the entire and precise semantics from the utterance. (2) FCG is reversible; the same construction rules not only parse input text, but can be applied in reverse to re-create the original utterance from its semantics. Cyc has a separate system for NL generation. (3) Cyc hand-codes their semantic translation templates, whereas I have in mind building an expert English dialog system using minimal hand-coded Controlled English, for the purpose of interacting with a multitude of non-linguists to extend its linguistic knowledge.

-Steve

Stephen L. Reed
Artificial Intelligence Researcher
http://texai.org/blog
http://texai.org
3008 Oak Crest Ave. Austin, Texas, USA 78704
512.791.7860

- Original Message From: Benjamin Goertzel [EMAIL PROTECTED] To: agi@v2.listbox.com Sent: Wednesday, January 9, 2008 1:45:34 PM Subject: Re: [agi
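The type-driven override scheme Mark describes can be made concrete: give each sense of "on" a signature over argument types and let the most specific matching sense win. The senses, type names, and lexicon below are invented placeholders that merely echo, and are not, the texai vocabulary.

# Each sense of "on" declares the argument types it accepts; more
# specific senses are listed first so they override the generic one.
SENSES_OF_ON = [
    ("about-this-topic",       ("Book", "FieldOfStudy")),
    ("a-member-of-this-group", ("Book", "List")),
    ("on-situation-localized", ("PhysicalObject", "PhysicalObject")),
]

TYPES = {
    "book": ("Book", "PhysicalObject"),
    "neuroscience": ("FieldOfStudy",),
    "table": ("PhysicalObject",),
    "shit list": ("List",),
}

def disambiguate_on(subj, obj):
    # Pick the first (most specific) sense whose signature matches.
    for sense, (t_subj, t_obj) in SENSES_OF_ON:
        if t_subj in TYPES[subj] and t_obj in TYPES[obj]:
            return sense
    raise ValueError("no sense of 'on' matches")

print(disambiguate_on("book", "neuroscience"))  # about-this-topic
print(disambiguate_on("book", "shit list"))     # a-member-of-this-group
print(disambiguate_on("book", "table"))         # on-situation-localized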
Re: [agi] Incremental Fluid Construction Grammar released
And how would a young child or foreigner interpret on the Washington Monument or shit list? Both are physical objects and a book *could* be resting on them.

Sorry, my shit list is purely mental in nature ;-) ... at the moment, I maintain a task list but not a shit list... maybe I need to get better organized!!!

Ben, your question is *very* disingenuous.

Who, **me** ???

There is a tremendous amount of domain/real-world knowledge that is absolutely required to parse your sentences. Do you have any better way of approaching the problem? I've been putting a lot of thought and work into trying to build and maintain precedence of knowledge structures with respect to disambiguating (and overriding incorrect) parsing . . . . and don't believe that it's going to be possible without a severe amount of knowledge . . . . What do you think?

OK... Let's assume one is working within the scope of an AI system that includes an NLP parser and a logical knowledge representation system, and needs some intelligent way to map the output of the former into the latter. Then, in this context, there are three approaches, which may be tried alone or in combination:

1) Hand-code rules to map the output of the parser into a much less ambiguous logical format

2) Use statistical learning across a huge corpus of text to somehow infer these rules [I did not ever flesh out this approach as it seemed implausible, but I have to recognize its theoretical possibility]

3) Use **embodied** learning, so that the system can statistically infer the rules from the combination of parse-trees with logical relationships that it observes to describe situations it sees [This is the best approach in principle, but may require years and years of embodied interaction for a system to learn.]

Obviously, Cycorp has taken Approach 1, with only modest success. But I think part of the reason they have not been more successful is a combination of a bad choice of parser with a bad choice of knowledge representation. They use a phrase structure grammar parser and predicate logic, whereas I believe that if one uses a dependency grammar parser and term logic, the process becomes a lot easier. So far as I can tell, in texai you are replicating Cyc's choices in this regard (phrase structure grammar + predicate logic).

In Novamente, we are aiming at a combination of the 3 approaches. We are encoding a bunch of rules, but we don't ever expect to get anywhere near complete coverage with them, and we have mechanisms (some designed, some already in place) that can generalize the rule base to learn new, probabilistic rules, based on statistical corpus analysis and based on embodied experience.

In our rule encoding approach, we will need about 5000 mapping rules to map syntactic parses of commonsense sentences into term logic relationships. Our inference engine will then generalize these into hundreds of thousands or millions of specialized rules. This is current work, research in progress. We have about 1000 rules in place now and will soon stop coding them and start experimenting with using inference to generalize and apply them. If this goes well, then we'll put in the work to encode the rest of the rules (which is not very fun work, as you might imagine).

Emotionally and philosophically, I am more drawn to approach 3 (embodied learning), but pragmatically, I have reluctantly concluded that the hybrid approach we're currently taking has the greatest odds of rapid success.

In the longer term, we intend to throw out the standalone grammar parser we're using and have syntax parsing done via our core AI processing -- but for now we're using a standalone grammar parser as a sort of scaffolding. I note that this is not the main Novamente R&D thrust right now -- it is at the moment somewhat separate from our work on embodied imitative/reinforcement/corrective learning of virtual agents. However, the two streams of work are intended to come together, as I've outlined in my paper for WCCI 2008, http://www.goertzel.org/new_research/WCCI_AGI.pdf

-- Ben G
Re: [agi] Incremental Fluid Construction Grammar released
In our rule encoding approach, we will need about 5000 mapping rules to map syntactic parses of commonsense sentences into term logic relationships. Our inference engine will then generalize these into hundreds of thousands or millions of specialized rules.

How would your rules handle the "on" cases that you gave? What do your rules match on (specific words, word types, object types, something else)? Are your rules all at the same level, or are they tiered somehow? My gut instinct is that 5000 rules is way, way high for both the most general and second tiers, and that you can do exception-based learning after those two tiers.

We have about 1000 rules in place now and will soon stop coding them and start experimenting with using inference to generalize and apply them. If this goes well, then we'll put in the work to encode the rest of the rules (which is not very fun work, as you might imagine).

Can you give about ten examples of rules? (That would answer a lot of my questions above.) Where did you get the rules? Did you hand-code them or get them from somewhere?
Re: [agi] Incremental Fluid Construction Grammar released
A perhaps nicer example is Get me the ball, for which RelEx outputs

definite(ball)
singular(ball)
imperative(get)
singular(me)
definite(me)
_obj(get, me)
_obj2(get, ball)

and RelExToFrame outputs

Bringing:Theme(get,me)
Bringing:Beneficiary(get,me)
Bringing:Theme(get,ball)
Bringing:Agent(get,you)

Note that the RelEx output is already abstracted and semantified compared to what comes out of a grammar parser.

-- Ben

On Jan 9, 2008 5:59 PM, Benjamin Goertzel [EMAIL PROTECTED] wrote:

Can you give about ten examples of rules? (That would answer a lot of my questions above)

That would just lead to a really long list of questions that I don't have time to answer right now. In a month or two, we'll write a paper on the rule-encoding approach we're using, and I'll post it to the list, which will make this approach clearer.

Where did you get the rules? Did you hand-code them or get them from somewhere?

As you know, we have a system called RelEx that transforms the output of the link parser into higher-level semantic relationships. We then have a system of rules that map RelEx output into a set of frame-element relationships constructed mostly based on FrameNet. For the sentence Ben kills chickens, RelEx outputs

_obj(kill, chicken)
present(kill)
plural(chicken)
uncountable(Ben)
_subj(kill, Ben)

and the RelExToFrame rules output

Killing:Killer(kill,Ben)
Killing:Victim(kill,chicken)
Temporal_colocation:Event(present,kill)

But I really don't have time to explain all the syntax and notation in detail... if it's not transparent...

And I want to stress that I consider this kind of system pretty useless on its own; it's only potentially valuable if coupled with other components like we have in Novamente, such as an uncertain inference engine and an embodied learning system... Such rules IMO are mainly valuable to give a starting-point to a learning system, not as the sole or primary cognitive material of an AI system. And using them as a starting-point requires very careful design...

The 5000 rules figure is roughly rooted in the 825 frames in FrameNet; each frame corresponds to a number of rules, most of which are related to specific verb/preposition combinations. Another way to look at it is that each rule corresponds roughly to a Lojban word/argument combination... pretty much, FrameNet and the Lojban dictionary are doing the same thing, which is to precisely specify commonsense subcategorization frames.

-- Ben
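The shape of such a mapping rule is easy to sketch. This is a toy reconstruction, not the actual RelEx/RelExToFrame rule base; the frame lexicon below is invented and greatly simplified (it produces only a subset of the outputs quoted above): match a verb's dependency relations against a per-verb frame entry and emit frame-element relationships.

# Toy dependency-to-frame mapping; FRAME_LEXICON is an invented stand-in
# for the real rule base discussed in this thread.
FRAME_LEXICON = {
    "kill": ("Killing", {"_subj": "Killer", "_obj": "Victim"}),
    "get": ("Bringing", {"_obj": "Beneficiary", "_obj2": "Theme"}),
}

def relex_to_frames(relations):
    # Map (relation, verb, argument) triples to Frame:Element(verb, arg).
    out = []
    for rel, verb, arg in relations:
        if verb in FRAME_LEXICON:
            frame, roles = FRAME_LEXICON[verb]
            if rel in roles:
                out.append("%s:%s(%s,%s)" % (frame, roles[rel], verb, arg))
    return out

print(relex_to_frames([("_subj", "kill", "Ben"), ("_obj", "kill", "chicken")]))
# ['Killing:Killer(kill,Ben)', 'Killing:Victim(kill,chicken)']
print(relex_to_frames([("_obj", "get", "me"), ("_obj2", "get", "ball")]))
# ['Bringing:Beneficiary(get,me)', 'Bringing:Theme(get,ball)']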
Re: [agi] Incremental Fluid Construction Grammar released
Processing a dictionary in a useful way requires quite sophisticated language understanding ability, though. Once you can do that, the hard part of the problem is already solved ;-)

Ben

On Jan 9, 2008 7:22 PM, William Pearson [EMAIL PROTECTED] wrote:

On 09/01/2008, Benjamin Goertzel [EMAIL PROTECTED] wrote: Let's assume one is working within the scope of an AI system that includes an NLP parser and a logical knowledge representation system, and needs some intelligent way to map the output of the former into the latter. Then, in this context, there are three approaches, which may be tried alone or in combination: 1) Hand-code rules to map the output of the parser into a much less ambiguous logical format 2) Use statistical learning across a huge corpus of text to somehow infer these rules 3) Use **embodied** learning, so that the system can statistically infer the rules from the combination of parse-trees with logical relationships that it observes to describe situations it sees

Isn't there a 4th potential one? I would define the 4th as being something like:

4) Use a language that can describe itself, to quickly bootstrap new phrase usage. This can be seen in humans when processing dictionary/thesaurus-like statements or learning a new language.

The following paragraphs are examples of sentences that a system of this kind would need to deal with and make use of:

The word on can be used in many different situations. One of these is to imply one thing is above another and supported by it.

The prefix dis can mean apart or break apart.

Enchant can mean to take control by magical means. What might disenchant mean? *

---End examples

It requires the system to be able to process such a statement and then add the appropriate rules. It may be tentative in keeping or using the rules, gathering information on how useful it finds them while processing text. It is different from hand-coding, because it should enable anyone to add rules after a minimal set of language-description language has been added. It should be combined with 3, however, so that rules don't always need to be given explicitly.

I think this type of learning/instruction has the ability to be a lot quicker than any system that mainly relies on inference. I don't know of systems that are using this sort of thing, and it is a bit above the level I am working at, at the moment. Does anyone know of systems that parse and then use sentences in this fashion?

Will Pearson

* I'm unsure how much work people are doing on the use of prefixes and suffixes to infer the meaning/usage of new words. I certainly use it a lot myself.
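Will's fourth approach can be sketched in miniature. Everything below (the statement pattern, the lexicon, and the confidence bookkeeping) is invented for illustration, not an existing system: parse a controlled meta-statement into a tentative prefix rule, then use it to gloss an unseen word.

import re

LEXICON = {"enchant": "to take control by magical means"}
prefix_rules = {}  # prefix -> (gloss, confidence)

def learn_from_statement(sentence):
    # Turn "The prefix X can mean Y" into a tentative morphological rule.
    m = re.match(r"The prefix (\w+) can mean (.+)", sentence)
    if m:
        # New rules start tentative; evidence from later text would
        # raise or lower the confidence, per Will's suggestion.
        prefix_rules[m.group(1)] = (m.group(2), 0.5)

def guess_meaning(word):
    # Combine a learned prefix rule with a known stem to gloss a new word.
    for prefix, (gloss, conf) in prefix_rules.items():
        stem = word[len(prefix):] if word.startswith(prefix) else None
        if stem in LEXICON:
            return "(%s) + (%s)  [confidence %.1f]" % (
                gloss, LEXICON[stem], conf)
    return None

learn_from_statement("The prefix dis can mean apart or break apart")
print(guess_meaning("disenchant"))
# (apart or break apart) + (to take control by magical means)  [confidence 0.5]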