Re: [agi] Project proposal: MindPixel 2
Ben> B) Even if there are only 5 applications of rules, the combinatorial explosion still exists. If there are 10 rules and 1 billion knowledge items, then there may be up to 10 billion possibilities to consider in each inference step.

How do you respond to the 20-question argument that there are only of order 2^20 knowledge items?

- This list is sponsored by AGIRI: http://www.agiri.org/email To unsubscribe or change your options, please go to: http://v2.listbox.com/member/?list_id=303
Re: [agi] Project proposal: MindPixel 2
On 1/28/07, Eric Baum [EMAIL PROTECTED] wrote: How do you respond to the 20-question argument that there are only of order 2^20 knowledge items?

The granularity of knowledge items for 20 Questions and the number 20 are specifically chosen to match each other, to make the game fair. While never explicitly stated, everyone understands that e.g. 'a book' is a fair topic for 20 Questions, but 'Alice in Wonderland' is not. Yet we do know about 'Alice in Wonderland', and any attempt to duplicate human abilities must take that into account.
Re: [agi] Project proposal: MindPixel 2
Russell> On 1/28/07, Eric Baum [EMAIL PROTECTED] wrote: How do you respond to the 20-question argument that there are only of order 2^20 knowledge items?

Russell> The granularity of knowledge items for 20 Questions and the number 20 are specifically chosen to match each other, to make the game fair. While never explicitly stated, everyone understands that e.g. 'a book' is a fair topic for 20 Questions, but 'Alice in Wonderland' is not. Yet we do know about 'Alice in Wonderland', and any attempt to duplicate human abilities must take that into account.

Eric> Have you ever played 20 questions? In the games I've played, Alice in Wonderland would be a fine topic. I admit it's surprising that one plays as well as one does.

I haven't played 20 questions recently, but in response to your comment I just went to www.20q.net and played, thinking of Alice in Wonderland, the book. The neural net guessed "is it a novel" on question 22, and then decided it had gone far enough and said you won, but 20q guessed it eventually. However, I have a distinct recollection that, the last time I played, a human player guessed a specific radio station I was thinking of. Of course, 20q.net cheats by asking multimodal questions (animal, vegetable, or mineral), so there are more than 2^20 possibilities.
Re: [agi] Project proposal: MindPixel 2
On 1/29/07, Eric Baum [EMAIL PROTECTED] wrote: I haven't played 20 questions recently, but in response to your comment I just went to www.20q.net and played, thinking of Alice in Wonderland, the book. The neural net guessed "is it a novel" on question 22, and then decided it had gone far enough and said you won, but 20q guessed it eventually. However, I have a distinct recollection that, the last time I played, a human player guessed a specific radio station I was thinking of. Of course, 20q.net cheats by asking multimodal questions (animal, vegetable, or mineral), so there are more than 2^20 possibilities.

Wait... since it is not the *same* 20 questions every time, the number of concepts in the mind may be significantly more than 2^20. Also, 2^20 ~= 1 million, and OpenCyc has about 47,000 concepts. Still a long way to go, it seems... YKY
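The arithmetic in this exchange is easy to check. With ideal yes/no questions, n questions can distinguish at most 2^n items; allowing k-way answers (as 20q.net's animal/vegetable/mineral question does) raises the bound to k^n. A quick sketch of the numbers quoted above (the OpenCyc concept count is the one cited in the thread):

```python
# Upper bound on the number of items n questions can distinguish,
# when each question admits k possible answers. Ideal yes/no
# questioning is the k=2 case.
def distinguishable(n_questions: int, k: int = 2) -> int:
    return k ** n_questions

print(distinguishable(20))      # 1048576 -- 2^20, roughly a million
print(distinguishable(20, 3))   # 3486784401 -- 3^20, about 3.5 billion
# OpenCyc's ~47,000 concepts versus the 2^20 bound quoted in the thread:
print(distinguishable(20) // 47_000)  # 22 -- Cyc is well under the bound
```

Two extra questions (Eric's game ran to 22) quadruple the binary bound, and a single three-way question already pushes the space past 2^20, which is the sense in which 20q.net "cheats".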
Re: [agi] Project proposal: MindPixel 2
On 1/28/07, Eric Baum [EMAIL PROTECTED] wrote: Have you ever played 20 questions?

Yep.

In the games I've played, Alice in Wonderland would be a fine topic. I admit it's surprising that one plays as well as one does.

Interesting, and surprising, but I don't draw the same conclusion as you do. The interesting conclusion I draw is that each group/circle of people who play the game must have a different set of criteria for deciding what's a fair topic, and that we therefore must have a very surprising tacit ability to judge what level of detail constitutes 20 bits of information. It remains the case that we do know about more than a million things; you can't build a human-equivalent mind with a million knowledge items. (Or with mere explicit knowledge items at all, as Cyc has adequately demonstrated.)
Re: [agi] Project proposal: MindPixel 2
Pick whatever public domain licence you prefer: GPL, MIT, Apache, or whatever you believe will prevent legal abuses. In principle, though, I agree that data entered by the public should be owned by the public.
Re: [agi] Project proposal: MindPixel 2
On 1/27/07, Philip Goetz [EMAIL PROTECTED] wrote:

On 1/18/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: Totally disagree! I actually examined a few cases of *real-life* commonsense inference steps, and I found that they are based on a *small* number of tiny rules of thought. I don't know why you think massive knowledge items are needed for commonsense reasoning -- if you closely examine some of your own thoughts you'd see.

On 1/19/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: For the type of common sense reasoner I described, we need a *massive* number of rules. You can either acquire these rules via machine learning or direct encoding. Machine learning of such rules is possible, but the area of research is kind of immature. OTOH there has not been a massive project to collect such rules by hand. So that explains why my type of system has not been tried before.

Sorry about the confusion =) What I meant is that the AGI's knowledgebase needs to store a massive number of facts/rules, but a *single* commonsense inference case (e.g. the examples from my introspection) usually involves only a few deductive steps in logic (assuming the required rules are there). I guess Ben's objection is based on the first point, but the project is still feasible if it is powered by an online community, IMO. YKY
Re: [agi] Project proposal: MindPixel 2
On 1/25/07, Pei Wang [EMAIL PROTECTED] wrote:

Suppose I have a set of *deductive* facts/rules in FOPL. You can actually use this data in your AGI to support other forms of inference such as induction and abduction. In this sense the facts/rules collection does not dictate the form of inference engine we use.

No, you cannot do that without twisting some definitions. You are right that many people now define induction and abduction in the language of FOPL, but what they actually do is to omit important aspects of the process, such as uncertainty. To me that is cheating. I addressed this issue in http://nars.wang.googlepages.com/wang.syllogism.ps . In http://www.springer.com/west/home/computer/artificial?SGWID=4-147-22-173659733-0 I explained in detail (especially in Ch. 9 and 10) why the language of FOPL is improper for AI.

OK, there is some confusion here too. You're talking about standard FOPL, the version that is described in textbooks of mathematical logic. My logic is based on standard FOPL, but there are some significant differences. First, it can be extended with uncertainty values (e.g. according to your theory of (f, c)). Second, it does not use Frege-style quantifiers. Third, it does not make a strict distinction between predicates and arguments (e.g. I can say Loves(john, mary) and Is_Blind(love)). Given these differences, the 3 objections to FOPL in your book may be answered. In the end, your NARS logic and my logic may be very similar in both expressivity and semantics. If you're interested we may consider a collaboration or merging of theories. One issue I have not yet formed an opinion about is the universality of the inheritance relation in NARS. We can discuss that later...

That's why my top priority is to build an inference engine for deduction. Inductive learning will be added later in the form of data mining, which is very computation-intensive.
I'm afraid it is not going to work --- many people have tried to extend FOPL to cover a wider range, and have run into all kinds of problems. To restart from scratch is actually easier than to maintain consistency among many ad hoc patches and hacks. To me, one of the biggest mistakes of mainstream AI is to treat learning as independent of working, something that can be added in later. Seeing AI that way, versus putting learning into the foundation, will produce very different systems. In NARS, learning and reasoning, as well as some other cognitive faculties, are different aspects of the same underlying process, and cannot be handled separately.

Inductive learning under FOPL is a vast topic, and is still under development (e.g. the field of inductive logic programming). It is still too early to say that it won't work. Also, many methods in data mining are forms of inductive learning, and I believe these techniques can be borrowed for AGI. I guess Ben uses pattern mining techniques in Novamente too.

There is no clear reason why reasoning and learning must be unified. Can you elaborate on the advantages of such an approach?

The learning problem in AGI is difficult partly because GOFAI knowledge representation schemes are usually very cumbersome (with frames, microtheories, modal operators for temporal / epistemological aspects, etc.). My logic is very minimalistic, almost structureless. This makes learning easier, since learning is a search for hypotheses in the hypothesis space. YKY
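As a concrete, deliberately simplified illustration of what "extending FOPL with uncertainty values" might look like: the (frequency, confidence) pair below borrows NARS terminology, but the deduction rule is an illustrative placeholder (frequencies and confidences simply multiply), not NARS's actual truth-value function, and the example statements are invented:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Statement:
    """A ground atom annotated with an uncertainty value.

    frequency  -- how often the statement has held (0..1)
    confidence -- how much evidence backs that frequency (0..1)
    """
    predicate: str
    args: tuple
    frequency: float
    confidence: float

def deduce(premise: Statement, rule: Statement) -> Statement:
    # Illustrative combination only: multiplying means a chained
    # conclusion is never more certain than its weakest premise.
    # Real systems (NARS, PLN) use carefully justified functions here.
    return Statement(
        predicate=rule.predicate,
        args=premise.args,
        frequency=premise.frequency * rule.frequency,
        confidence=premise.confidence * rule.confidence,
    )

wet = Statement("Wet", ("floor",), 0.9, 0.8)
conducts_if_wet = Statement("ConductsElectricity", ("X",), 0.95, 0.9)
conclusion = deduce(wet, conducts_if_wet)
print(round(conclusion.frequency, 3), round(conclusion.confidence, 3))
```

The point of contention in the thread is visible even at this scale: once every statement carries an uncertainty pair, the inference rules stop being classical FOPL rules, which is Pei's argument that patching FOPL is harder than starting over.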
Re: [agi] Project proposal: MindPixel 2
On 1/27/07, Ben Goertzel [EMAIL PROTECTED] wrote: Yes, you can reduce nearly all commonsense inference to a few rules, but only if your rules and your knowledge base are not fully formalized...

As I envision it, we would have a large number of rules. Some rules are very abstract (e.g. rules governing inheritance or syllogisms) and others are more concrete (e.g. involving concrete concepts). For example:

Abstract rule: If X is a Y and Z(Y) then Z(X).
Concrete rule: If X is wet then X conducts electricity.

The rules are not fully formalized -- in the sense that there is not an elite set of rules governing all the others. Instead, there is a continuum of rules from the highly abstract / always-right to the concrete / defeasible. Do you think that's better?

Fully formalizing things, as is necessary for software implementation, makes things substantially more complicated. Give it a try and see!

Thanks for your support =) YKY
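The abstract rule "if X is a Y and Z(Y) then Z(X)" can be sketched in a few lines. The facts and property names below are invented for illustration; this covers only the always-right end of YKY's continuum:

```python
# A toy implementation of the abstract inheritance rule
# "if X is a Y and Z(Y) then Z(X)": properties asserted of a
# category are inherited by its members. Facts are invented.
isa = {"tweety": "bird", "bird": "animal"}                 # X is a Y
properties = {"bird": {"has_feathers"}, "animal": {"breathes"}}

def inherited_properties(x: str) -> set:
    props = set(properties.get(x, ()))
    y = isa.get(x)
    while y is not None:            # walk up the isa chain
        props |= properties.get(y, set())
        y = isa.get(y)
    return props

print(sorted(inherited_properties("tweety")))  # ['breathes', 'has_feathers']
```

A defeasible concrete rule like "if X is wet then X conducts electricity" would need an exception mechanism layered on top of this (penguins don't fly; distilled water conducts poorly), which is exactly where Ben's warning about full formalization starts to bite.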
Re: [agi] Project proposal: MindPixel 2
I need to understand your design better to talk about the details, and the discussion is getting too technical for this list. I will hold my doubts and wait for you to go further. Pei
Re: [agi] Project proposal: MindPixel 2
On 1/27/07, David Hart [EMAIL PROTECTED] wrote: This license chooser may help: http://creativecommons.org/license/ Perhaps MindPixel2 discussion deserves its own list at this stage? Listbox, Google and many others offer list services (Google Code also offers a wiki, source version management, and other features).

Thanks, but I favor a license that supports some commercial rights, or I'll need to create one. Google Code only supports free / copyleft licenses. I will start a separate list as soon as this is settled... YKY
Re: [agi] Project proposal: MindPixel 2
There is not a clear reason why reasoning and learning must be unified. Can you elaborate on the advantages of such an approach?

To answer that question I would have to know how you are defining those terms.

The learning problem in AGI is difficult partly because GOFAI knowledge representation schemes are usually very cumbersome (with frames, microtheories, modal operators for temporal / epistemological aspects, etc). My logic is very minimalistic, almost structureless. This makes learning easier since learning is a search for hypotheses in the hypothesis space.

With a minimalist logic, the hypothesis space will be large, posing a huge search problem. The point of all those cumbersome additions to basic logic is essentially to allow learning and reasoning algorithms to narrow down the search space in contextually appropriate ways. Ben G
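Ben's point about search-space size can be quantified in a toy setting. Suppose hypotheses are clause bodies built from binary predicates over a few variables: with p predicates and v variables there are p*v*v possible atoms, and any subset of atoms can serve as a body. The counting scheme is an illustrative simplification, not a claim about any particular system:

```python
# Toy count of candidate rule bodies in a minimalist logic: with
# p binary predicates and v variables there are p * v * v atoms,
# and each subset of atoms is a candidate body. The counting is a
# deliberate simplification for illustration.
def candidate_bodies(p: int, v: int) -> int:
    atoms = p * v * v
    return 2 ** atoms

print(candidate_bodies(5, 2))    # 1048576 -- 2^20 bodies already
print(candidate_bodies(10, 3))   # 2^90 -- astronomically many
```

Even five predicates and two variables give a million candidates, which is the sense in which the "cumbersome additions" of richer representations earn their keep: they prune this space before search begins.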
Re: [agi] Project proposal: MindPixel 2 - licensing
Hi YKY, On 1/28/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: Thanks, but I favor a license that supports some commercial rights, or I'll need to create one. Google Code only supports free / copyleft licenses. Licensing is typically more intricate than it first appears. KB content and software source code would likely be under separate licenses, and contributed and maintained by mostly separate communities. It's feasible to maintain two source code bases, one which is open source and a second which is closed source. As copyright holder, you're permitted to intermingle code between the two (with some restrictions), and reserve any proprietary code you like in the closed-source version. However, once source code is contributed to an open source version, it's in the wild forever (i.e. an open source license can't be retroactively revoked). You might also consider making an upfront statement to the effect that open source coders may be hired in the future if they're willing to assign their source code copyright to your company, allowing you to more easily make proprietary derivative works of their open source code. Depending on the license chosen, others may also be allowed to make proprietary derivative works of the open source code. For example, while it seems counter-intuitive, dual-licensed GPL projects have stronger commercial protection for the copyright holder than do BSD licensed projects, which allow third parties to keep their changes proprietary. For the KB, non-commercial creative commons licenses exist which may be useful. It's my guess that a KB of this size and nature would be hosted outside of a normal source-code-hosting setting, simply because those services don't offer the necessary tools for the job. Most Linux hosting services would be sufficient for KB hosting, as they include database software and large amounts of storage. 
You'd want to read the fine print of the source-code-hosting services' licenses, but it's probably okay to combine all of these various license types in the way described; however, IANAL, so better yet seek legal advice. Nearly any AGI project with a commercial/community mix will have similar licensing issues. David
Re: [agi] Project proposal: MindPixel 2
--- YKY (Yan King Yin) [EMAIL PROTECTED] wrote: On 1/25/07, Ben Goertzel [EMAIL PROTECTED] wrote: If there is a major problem with Cyc, it is not the choice of basic KR language. Predicate logic is precise and relatively simple.

I agree mostly, though I think even Cyc's simple predicate logic language can be made even simpler and better. For example, Cyc uses the classical quantifiers #$forAll and #$exists. In my version I don't use Frege-style quantifiers but I allow generalized modifiers like many, a few, in addition to all, exists.

IMHO the problem with Cyc is that they tried to go directly to adult-level intelligence with no theory of how people learn. This is why they are having such difficulty adding a natural language interface. Children learn semantics first, then simple sentences, and then the elements of logic such as and, or, not, all, some, etc. Cyc went straight to adult-level logic and math, and now they can't add in the stuff that should have been learned as children. They should have built the language model first.

Another problem is that n-th order logic (even probabilistic) is not how people think. Logic does not model inductive reasoning, e.g. Kermit is a frog. Kermit is green. Therefore frogs are green. Where is the theory that explains why people reason this way? This is what happens when you ignore the cognitive side of AI.

Rather, the main problem is the impracticality of encoding a decent percentage of the needed commonsense knowledge!

Now I see why we disagree here. You believe we should acquire all knowledge via experiential learning. IMO we can do even better than the experiential route. We can let the internet crowd enter the commonsense corpus for us. This should allow us to reach a functioning, usable AGI sooner.

How much knowledge you need depends on what problem you are trying to solve. Building an AGI to run a corporation is not the same as building a better spam detector.
-- Matt Mahoney, [EMAIL PROTECTED]
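Matt's Kermit example can be caricatured in a few lines. The point is that nothing in deductive logic licenses the jump from an instance to a universal; the generalization policy below is an explicit, and obviously unsound, extra assumption:

```python
# A caricature of single-instance induction: from P(a) and Q(a),
# jump to "all P are Q". Deduction never licenses this step; the
# jump is a separate, unsound policy that a cognitive theory
# would have to justify and constrain.
facts = {("frog", "kermit"), ("green", "kermit")}

def induce(facts: set) -> set:
    rules = set()
    for p, x in facts:
        for q, y in facts:
            if x == y and p != q:
                rules.add((p, q))   # read: "all P are Q"
    return rules

print(sorted(induce(facts)))  # [('frog', 'green'), ('green', 'frog')]
```

Note that this naive policy also induces the converse, "all green things are frogs", which people do not conclude. Explaining that asymmetry is precisely the missing cognitive theory Matt is asking for.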
Re: [agi] Project proposal: MindPixel 2
On 1/18/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: Totally disagree! I actually examined a few cases of *real-life* commonsense inference steps, and I found that they are based on a *small* number of tiny rules of thought. I don't know why you think massive knowledge items are needed for commonsense reasoning -- if you closely examine some of your own thoughts you'd see.

On 1/19/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: For the type of common sense reasoner I described, we need a *massive* number of rules. You can either acquire these rules via machine learning or direct encoding. Machine learning of such rules is possible, but the area of research is kind of immature. OTOH there has not been a massive project to collect such rules by hand. So that explains why my type of system has not been tried before.
Re: [agi] Project proposal: MindPixel 2
Yes, you can reduce nearly all commonsense inference to a few rules, but only if your rules and your knowledge base are not fully formalized... Fully formalizing things, as is necessary for software implementation, makes things substantially more complicated. Give it a try and see! Ben
Re: [agi] Project proposal: MindPixel 2
Philip Goetz wrote: On 1/17/07, Charles D Hixson [EMAIL PROTECTED] wrote: It's fine to talk about making the data public domain, but that's not a good idea. Why not?

Because public domain offers NO protection. If you want something close to what public domain used to provide, then the MIT license is a good choice. If you make something public domain, you are opening yourself to abusive lawsuits. (Those are always a possibility, but a license that disclaims responsibility offers *some* protection.) Public domain used to be a good choice (for some purposes), before lawsuits became quite so pernicious.
Re: [agi] Project proposal: MindPixel 2
On 1/27/07, Charles D Hixson [EMAIL PROTECTED] wrote: [...]

This license chooser may help: http://creativecommons.org/license/

Perhaps MindPixel2 discussion deserves its own list at this stage? Listbox, Google and many others offer list services (Google Code also offers a wiki, source version management, and other features). David
Re: [agi] Project proposal: MindPixel 2
On 1/25/07, Bob Mottram [EMAIL PROTECTED] wrote: The trouble is that you can only really decide whether a statement is non-probabilistic if enough people have voted unanimously yes or no. Even then you can't be sure that the next person to vote won't go the opposite way.

At the initial stage we may rely on the wisdom of crowds (wikipedia: http://en.wikipedia.org/wiki/The_Wisdom_of_Crowds), using voting on one set of common knowledge, although in later stages I think separate sub-communities might be desirable. This is not a worrying issue IMO. YKY
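A sketch of how such voting might be aggregated. The 80% supermajority threshold and the tie handling are arbitrary choices for illustration, not a proposal from the thread:

```python
from collections import Counter

def aggregate(votes: list) -> tuple:
    """Return (verdict, agreement) for a list of 'yes'/'no' votes.

    verdict is 'yes', 'no', or 'uncertain' when agreement falls
    below a (deliberately arbitrary) 80% supermajority threshold.
    """
    counts = Counter(votes)
    winner, n = counts.most_common(1)[0]
    agreement = n / len(votes)
    verdict = winner if agreement >= 0.8 else "uncertain"
    return verdict, agreement

print(aggregate(["yes"] * 9 + ["no"]))       # ('yes', 0.9)
print(aggregate(["yes"] * 6 + ["no"] * 4))   # ('uncertain', 0.6)
```

Bob's caveat survives any choice of threshold: a unanimous tally is still a sample, not a proof that a statement is non-probabilistic, so the agreement figure is best read as an estimate that the next voter could shift.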
Re: [agi] Project proposal: MindPixel 2
On 1/25/07, Ben Goertzel [EMAIL PROTECTED] wrote: If there is a major problem with Cyc, it is not the choice of basic KR language. Predicate logic is precise and relatively simple.

I agree mostly, though I think even Cyc's simple predicate logic language can be made even simpler and better. For example, Cyc uses the classical quantifiers #$forAll and #$exists. In my version I don't use Frege-style quantifiers, but I allow generalized modifiers like many, a few, in addition to all, exists.

Rather, the main problem is the impracticality of encoding a decent percentage of the needed commonsense knowledge!

Now I see why we disagree here. You believe we should acquire all knowledge via experiential learning. IMO we can do even better than the experiential route. We can let the internet crowd enter the commonsense corpus for us. This should allow us to reach a functioning, usable AGI sooner.

And, on a more technical level, I think that Cyc's **ontology** is too complex and unwieldy. This is NOT an issue of the KR language, but rather of the chosen vocabulary of semantic primitives. I don't feel that Cyc has a well-thought-out set of semantic primitives. They have a small number of basic logical primitives, and then a HUGE number of complex abstract concepts in their upper ontology. IMO an intermediate level is needed, involving a few dozen well-thought-out semantic primitives, and a few hundred additional basic semantic relationships.

I have a similar sense. As wikipedia puts it, Cyc has been criticized for excessive reification. I think the problem is that Cyc creates artificial labels that are atomic and *non-compositional*. For example the label #$rawFood should be represented *compositionally* by the concepts raw and food. I suggest not using ontologies at all. John Sowa has spent lots of time on the ontology problem and his conclusion is: We will never have a one-size-fits-all ontology for anything having to do with computer systems.
Case closed [ http://suo.ieee.org/email/msg12861.html ]. Perhaps this is one GOFAI feature we need to ditch. I think we can work bottom-up from a vast web of commonsense pixels, and let the computer organize its own knowledgebase via clustering etc. So we don't need any man-made ontology.

Lojban IMO has done a great job of this. The Lojban language embodies a very well-thought-out commonsense ontology, which has been shaped evolutionarily thru the usage of the language by the Lojban community.

I'm not familiar with the Lojban community or the status of the language, so I can't comment. I still believe that introducing Lojban into AGI is spurious / redundant, and it may alienate people from your projects if they don't know Lojban. It seems like just another man-made ontology that has its inadequacies.

However, this still doesn't solve the problem that there is too much commonsense knowledge to code in explicitly ... so it has to be learned...

This is the main disagreement. Could an internet crowd codify all commonsense knowledge? It seems yes, especially if we're talking about the more *verbal* portion of commonsense. Perhaps we should combine the Codify strategy with the Experiential Learning strategy. YKY
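The compositionality complaint about labels like #$rawFood can be illustrated directly: represent a concept as a set of primitive concepts, and subsumption and shared structure fall out of ordinary set operations. The primitive names here are invented for illustration, not taken from Cyc:

```python
# Representing concepts compositionally as frozensets of primitives,
# rather than as atomic labels like Cyc's #$rawFood. Primitive names
# are invented for this illustration.
raw_food    = frozenset({"raw", "food"})
raw_meat    = frozenset({"raw", "food", "meat"})
cooked_food = frozenset({"cooked", "food"})

# Subsumption is just the superset relation on primitive sets:
print(raw_food <= raw_meat)      # True: raw meat is a kind of raw food
print(raw_food <= cooked_food)   # False

# Shared structure stays visible, unlike with opaque atomic labels:
print(sorted(raw_food & cooked_food))  # ['food']
```

With atomic labels, the relationship between #$rawFood and a hypothetical #$rawMeat has to be asserted separately; in the compositional encoding it is computed, which is the "excessive reification" criticism in miniature.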
Re: [agi] Project proposal: MindPixel 2
Yes, Lojban is just another human-created ontology. My statement was that it is a particularly good one, well-thought out, practical, and conceptually sensible. I note that, unlike what you say, Novamente is not predicated on the assumption that we should acquire all knowledge via experiential learning. Rather, it is predicated on the assumption that there is a lot of knowledge, necessary for AGI, that is only practicably acquirable via experiential learning. I think there is also a lot of knowledge that can practicably be explicitly encoded and fed directly into an AGI's mind -- I just think the knowledge in this latter category is not **sufficient** in itself So, we can take a hybrid approach in Novamente. -- Ben G On Jan 25, 2007, at 6:58 PM, YKY (Yan King Yin) wrote: On 1/25/07, Ben Goertzel [EMAIL PROTECTED] wrote: If there is a major problem with Cyc, it is not the choice of basic KR language. Predicate logic is precise and relatively simple. I agree mostly, though I think even Cyc's simple predicate logic language can be made even simpler and better. For example, Cyc uses the classical quantifiers #$forAll and #$exists. In my version I don't use Frege-style quantifiers but I allow generalized modifiers like many, a few, in addition to all, exists. Rather, the main problem is the impracticality of encoding a decent percentage of the needed commonsense knowledge! Now I see why we disagree here. You believe we should acquire all knowledge via experiential learning. IMO we can do even better than the experiential route. We can let the internet crowd enter the commonsense corpus for us. This should be allow us to reach a functioning, usable AGI sooner. And, on a more technical level, I think that Cyc's **ontology** is too complex and unwieldy. This is NOT an issue of the KR language, but rather of the chosen vocabulary of semantic primitives. I don't feel that Cyc has a well-thought-out set of semantic primitives. 
They have a small number of basic logical primitives, and then a HUGE number of complex abstract concepts in their upper ontology. IMO an intermediate level is needed, involving a few dozen well thought out semantic primitives, and a few hundred additional basic semantic relationships. I have a similar sense. As Wikipedia puts it, Cyc has been criticized for excessive reification. I think the problem is that Cyc creates artificial labels that are atomic and *non-compositional*. For example, the label #$rawFood should be represented *compositionally* by the concepts raw and food. I suggest not using ontologies at all. John Sowa has spent lots of time on the ontology problem and his conclusion is: We will never have a one-size-fits-all ontology for anything having to do with computer systems. Case closed [ http://suo.ieee.org/email/msg12861.html ]. Perhaps this is one GOFAI feature we need to ditch. I think we can work bottom-up from a vast web of commonsense pixels, and let the computer organize its own knowledgebase via clustering etc. So we don't need any man-made ontology. Lojban IMO has done a great job of this. The Lojban language embodies a very well thought out commonsense ontology, which has been shaped evolutionarily thru the usage of the language by the Lojban community. I'm not familiar with the Lojban community or the status of the language, so I can't comment. I still believe that introducing Lojban into AGI is spurious / redundant, and it may alienate people from your projects if they don't know Lojban. It seems like just another man-made ontology that has its inadequacies. However, this still doesn't solve the problem that there is too much commonsense knowledge to code in explicitly ... so it has to be learned... This is the main disagreement. Could an internet crowd codify all commonsense knowledge? It seems yes, especially if we're talking about the more *verbal* portion of commonsense.
Perhaps we should combine the Codify strategy with the Experiential Learning strategy. YKY - This list is sponsored by AGIRI: http://www.agiri.org/email To unsubscribe or change your options, please go to: http://v2.listbox.com/member/?list_id=303
Re: [agi] Project proposal: MindPixel 2
On 1/20/07, Pei Wang [EMAIL PROTECTED] wrote: The bottomline is that the knowledge acquisition project is *separable* from specific inference methods. What is your argument supporting this strong claim? I guess every book on knowledge representation includes a statement saying that whether a knowledge representation format is good depends, to a large extent, on the type of inference it can support. These two aspects are never considered separable. Suppose I have a set of *deductive* facts/rules in FOPL. You can actually use this data in your AGI to support other forms of inference such as induction and abduction. In this sense the facts/rules collection does not dictate the form of inference engine we use. For example, First-Order Predicate Logic is not good enough for AI, partly because it does not support non-deductive inference. Also, the semantic network is considered weak mainly because it has no powerful inference method associated with it. FOPL can be used for things like induction and abduction, albeit via external algorithms. Therefore, I think a FOPL-based system can suffice for AGI (which doesn't mean that it is the only way). I am still reading your book, but I have found numerous good ideas in it. I know that you treat deduction, induction, and abduction in a unified way. That is a very elegant theory, but it may have problems. For example, if: (1) I read a lot of books (2) I hate my mom then your system may infer by induction that reading a lot of books -> hating one's mom. In some instances doing this is meaningful, but in general your system may be flooded with a lot of these speculative statements, drawing time from the day-to-day deductive operations. I tend to think of induction as something less essential than deduction. That's why my top priority is to build an inference engine for deduction. Inductive learning will be added later in the form of data mining, which is very computation-intensive.
YKY
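YKY's worry about being flooded with speculative statements can be made concrete with a toy sketch (all names invented; this is not NARS's actual induction rule): a naive induction pass that pairs every two properties observed of the same subject already yields a quadratic number of candidate implications, most of them useless.

```python
# Toy fact base: (subject, property) pairs. Names are purely illustrative.
facts = [
    ("yky", "reads_many_books"),
    ("yky", "hates_mom"),
    ("yky", "likes_tea"),
]

def naive_induction(facts):
    """For each subject, propose p -> q for every ordered pair of its properties."""
    by_subject = {}
    for subj, prop in facts:
        by_subject.setdefault(subj, []).append(prop)
    rules = []
    for props in by_subject.values():
        for p in props:
            for q in props:
                if p != q:
                    rules.append((p, q))  # speculative candidate: p -> q
    return rules

rules = naive_induction(facts)
# Three properties of one subject already yield 3*2 = 6 speculative rules,
# including the dubious reads_many_books -> hates_mom from the example above.
print(len(rules))
```

This is why a unified treatment of induction needs some attention/resource-allocation mechanism to keep such candidates from crowding out deduction.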
Re: [agi] Project proposal: MindPixel 2
On 1/24/07, Bob Mottram [EMAIL PROTECTED] wrote: I think it would be better to design a system with probabilistic reasoning as a fundamental component from the outset, rather than trying to bolt this on as an afterthought. I know from doing a lot of stuff with machine vision that modelling sensor uncertainties is critical for being able to understand the spatial structure of the environment, and I expect similar principles will apply when reasoning within more abstract domains. Yes, I agree. I think Pei Wang's version of uncertain logic is very simple and effective. It uses two numbers, one for probability (as frequency) and one for support or confidence. On the other hand, I suspect that many commonsense statements do not have probabilistic values attached to them. For example, water conducts electricity or oil is slippery are not really probabilistic. We should leave an option for a statement to be non-probabilistic. YKY
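The representational option YKY suggests is easy to sketch (a minimal illustration, not Pei Wang's actual implementation; the truth values shown are made-up numbers): attach an optional (frequency, confidence) pair to each statement, and treat statements without one as non-probabilistic.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Statement:
    text: str
    # NARS-style two-number truth value: (frequency, confidence).
    # None marks the statement as non-probabilistic / definitional.
    truth: Optional[Tuple[float, float]] = None

    @property
    def probabilistic(self) -> bool:
        return self.truth is not None

wet = Statement("water conducts electricity")              # no truth value attached
ravens = Statement("ravens are black", truth=(0.98, 0.9))  # illustrative numbers only

print(wet.probabilistic, ravens.probabilistic)
```

An inference engine could then route the two kinds of statement differently: crisp deduction for the first, uncertain revision for the second.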
Re: [agi] Project proposal: MindPixel 2
The trouble is that you can only really decide whether a statement is non-probabilistic if enough people have voted unanimously yes or no. Even then you can't be sure that the next person to vote won't go the opposite way. On 24/01/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: [...]
Re: [agi] Project proposal: MindPixel 2
Given my experience while employed at Cycorp, I would say that there are two ways to work with them. The first way is to collaborate with Cycorp on a sponsored project. Collaborators are mainly universities (e.g. CMU, Stanford) and established research companies (e.g. SRI, SAIC) who have a track record of receiving government grants, and whose technologies are complementary to Cyc. I would not suggest this approach for MindPixel 2 yet. The second approach involves no exchange of money. Cycorp wants to promote its ontology - its commonsense vocabulary - and has released its definitions with a very permissive license as OpenCyc. One can also obtain nearly the entire Cyc knowledge base with a Research Cyc license for research purposes without fee, but with the RCyc license you are not allowed to extract facts and rules for MindPixel 2. You could contact the Cyc Foundation, which is an independent organization run by a friend of mine and former Cycorp employee. They are seeking to add knowledge to Cyc by using volunteers, and I believe that they would be very receptive to MindPixel 2 provided it uses a form of the OpenCyc vocabulary for knowledge representation. I suggest obtaining an RCyc license to see how the Cyc inference engine handles large rule and fact sets, and to see if the Cyc vocabulary fits your idea of a commonsense representation language. - Original Message From: Benjamin Goertzel [EMAIL PROTECTED] To: agi@v2.listbox.com Sent: Friday, January 19, 2007 5:35:51 PM Subject: Re: [agi] Project proposal: MindPixel 2 Hi, Do you think Cyc has a rule/fact like wet things can usually conduct electricity (or if X is wet then X may conduct electricity)? Yes, it does... I'll also contact some Cyc folks to see if they're interested in collaborating...
IMO, to have any chance of interesting them, you will need to be able to explain to them VERY CLEARLY why your current proposed approach is superior to theirs -- given that it seems so philosophically similar to theirs, and given that they have already encoded millions of knowledge items and built an inference engine and language-processing front end! -- Ben G
Re: [agi] Project proposal: MindPixel 2
I'm no expert on automated reasoning, but wasn't the original Mindpixel based fundamentally upon probabilistic representations (coherence values), whereas Cyc, from what I understand, doesn't represent facts or rules probabilistically? - Bob On 23/01/07, Stephen Reed [EMAIL PROTECTED] wrote: [...]
Re: [agi] Project proposal: MindPixel 2
Right, Cyc's deductive inference engine does not support probabilistic reasoning. But there is no obstacle to extending Cyc's vocabulary with the probabilistic representation you want and then using an inference engine of your own design. For my AGI project I use the OpenCyc vocabulary and content, but with my own object store (a relational database) and simple inference (look-up and subsumption within contexts). The Java dialog application that I am building does not require any more sophisticated deduction, so I am postponing any complex inference until I can teach those algorithms to the system using English. -Steve http://sf.net/projects/texai - Original Message From: Bob Mottram [EMAIL PROTECTED] To: agi@v2.listbox.com Sent: Tuesday, January 23, 2007 1:13:14 PM Subject: Re: [agi] Project proposal: MindPixel 2 [...]
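The kind of look-up-plus-subsumption inference Stephen describes can be sketched in a few lines (invented names; this is not the actual texai code): answer "is x a kind of Y?" by walking a stored generalization hierarchy, in the spirit of Cyc's #$genls links, instead of running a general theorem prover.

```python
# Toy generalization hierarchy: child -> parent. Names are illustrative.
genls = {
    "DomesticCat": "Cat",
    "Cat": "Mammal",
    "Mammal": "Animal",
}

def subsumes(general, specific):
    """True if `general` is reachable from `specific` by following genls links."""
    while specific is not None:
        if specific == general:
            return True
        specific = genls.get(specific)  # step up the hierarchy
    return False

print(subsumes("Animal", "DomesticCat"))  # True
print(subsumes("Cat", "Animal"))          # False: subsumption is one-directional
```

For a dialog application this is cheap (linear in the depth of the hierarchy) and covers a surprising share of commonsense queries, which is presumably why more complex inference can be deferred.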
Re: [agi] Project proposal: MindPixel 2
Benjamin Goertzel wrote: And, importance levels need to be context-dependent, so that assigning them requires sophisticated inference in itself... The problem may not be so serious. Common sense reasoning may require only *shallow* inference chains, eg 5 applications of rules. So I'm very optimistic =) Your worries are only applicable to 100-page theorem-proving tasks, not really the concern of AGI. A) This is just not true; many commonsense inferences require significantly more than 5 applications of rules B) Even if there are only 5 applications of rules, the combinatorial explosion still exists. If there are 10 rules and 1 billion knowledge items, then there may be up to 10 billion possibilities to consider in each inference step. So there are (10 billion)^5 possible 5-step inference trajectories, in this scenario ;-) Of course, some fairly basic pruning mechanisms can prune it down a lot, but one is still left with a combinatorial explosion that needs to be dealt with via subtle means... Please bear in mind that we actually have a functional uncertain logical reasoning engine within the Novamente system, and have experimented with feeding in knowledge from files and doing inference on it. (Though this has been mainly for system testing, as our primary focus is on doing inference based on knowledge gained via embodied experience in the AGISim world.) The truth is that, if you have a lot of knowledge in your system's memory, you need a pretty sophisticated, context-savvy inference control mechanism to do commonsense inference. Also, temporal inference can be quite tricky, and introduces numerous options for combinatorial explosion that you may not be thinking about when looking at atemporal examples of commonsense inference. Various conclusions may hold over various time scales; various pieces of knowledge may become obsolete at various rates, etc.
I imagine you will have a better sense of these issues once you have actually built an uncertain reasoning engine, fed knowledge into it, and tried to make it do interesting things. I certainly think this may be a valuable exercise for you to do. However, until you have done it, I think it's kind of silly for you to be speaking so confidently about solving all the problems found by others in doing this kind of work!! I ask again: do you have some theoretical innovation that seems likely to allow you to circumvent all these very familiar problems?? -- Ben Possibly this could be approached by partitioning the rule-set into small chunks of rules that work together, so that one didn't end up trying everything against everything else. These chunks of rules might well be context-dependent, so that one would use different chunks at a dinner table than in a workshop. There would need to be ways to combine different chunks of rules, of course, so e.g. a restaurant table would be different from a dinner table, but would have overlapping sets of rules. (I hope I'm not just re-inventing frames...)
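Ben's arithmetic above is easy to sanity-check (these are his illustrative figures, not measurements from any running system):

```python
# Back-of-the-envelope check of the combinatorial-explosion estimate.
rules = 10
items = 10**9                          # 1 billion knowledge items
options_per_step = rules * items       # up to 10 billion possibilities per step
trajectories = options_per_step ** 5   # (10 billion)^5 five-step inference paths

print(trajectories)  # == 10**50

# Even aggressively pruning each step to a shortlist of k options
# still leaves k**5 complete trajectories to consider:
k = 100
print(k ** 5)        # == 10**10 -- still far too many to enumerate blindly
```

This is why per-step pruning alone is not enough and some context-savvy inference control is needed on top of it.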
Re: [agi] Project proposal: MindPixel 2
Hi, Possibly this could be approached by partitioning the rule-set into small chunks of rules that work together, so that one didn't end up trying everything against everything else. These chunks of rules might well be context dependent, so that one would use different chunks at a dinner table than in a work shop. There would need to be ways to combine different chunks of rules, of course, so e.g. a restaurant table would be different from a dinner table, but would have overlapping sets of rules. (I hope I'm not just re-inventing frames...) The issue is how these contexts are learned. If contexts have to be programmer-supplied, then you ARE just reinventing frames. Context formation is a tricky inference problem in itself. -- Ben
Re: [agi] Project proposal: MindPixel 2
Benjamin Goertzel wrote: [...] The issue is how these contexts are learned. If contexts have to be programmer-supplied, then you ARE just reinventing frames. Context formation is a tricky inference problem in itself. -- Ben Well, my rather vague idea was to start with a very small rule set that didn't need to be partitioned, and to evolve rule-sets by statistical correlation (what tends to get used with what). As new rules are added, at some point clusters would need to separate (for efficiency). I suppose this could all be done with activation levels, but that's not the way I tend to think of it. OTOH, if the local cluster can't handle the deduction, it would need to check the most closely associated / most activated clusters to see if they could handle it. Not sure how well this would work. Clearly it has no more theoretical power than having all the rules in a large table, but I feel it would be a more efficient organization. Also, I don't have any definition of rule yet. It's not at all clear that it would be easy to translate into something a person not familiar with the details of the hardware and software would understand. (If a certain area of RAM is mapped to a video camera, reading/writing the RAM will naturally mean something very different than it would mean in other contexts. Writing to it might be a request to alter the scene. A silly way to do things, but it's for the sake of the point, not for real implementation.) I'm not at all sure that rules of the form if x do y, then check for result z (if not raise exception w) will suffice, even if you allow great flexibility as to what x, y, z, and w are interpreted as. Possibly they could be generalized functions (with x and z limited to not causing side effects).
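The "evolve rule-sets by statistical correlation" idea above can be sketched very simply (all rule names and episode data are invented; real systems would use graded co-occurrence statistics and allow overlapping chunks): record which rules fire together during inference episodes, then group rules that ever co-fire into the same context chunk.

```python
from collections import defaultdict
from itertools import combinations

# Each episode records which rules fired together during one inference run.
episodes = [
    {"cutlery", "napkin", "pour_drink"},   # dinner-table episodes
    {"cutlery", "napkin"},
    {"clamp", "drill", "sand"},            # workshop episodes
    {"clamp", "drill"},
]

# Count pairwise co-firings.
cooccur = defaultdict(int)
for ep in episodes:
    for a, b in combinations(sorted(ep), 2):
        cooccur[(a, b)] += 1

# Union-find: rules that ever co-fire end up in the same chunk.
parent = {}
def find(x):
    root = x
    while parent.setdefault(root, root) != root:
        root = parent[root]
    parent[x] = root  # path compression
    return root

for (a, b), count in cooccur.items():
    if count > 0:
        parent[find(a)] = find(b)

chunks = defaultdict(set)
for rule in {r for ep in episodes for r in ep}:
    chunks[find(rule)].add(rule)

print(sorted(sorted(c) for c in chunks.values()))
```

With this toy data the six rules separate into two context chunks, one per setting; the hard problems Ben raises (overlapping contexts, when to split a cluster, falling back to neighboring chunks) are exactly what this sketch leaves out.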
Re: [agi] Project proposal: MindPixel 2
My feeling is that this probably isn't a great business idea. I think collecting common sense data and building that into a general reasoner should really be thought of as a long-term effort, which is unlikely to appeal to business investors expecting to see a return within a few years. If any attempt is made to build a second version of mindpixel, I think the open source (or open corpus) model would be the obvious choice. Chris McKinstry kept his database secret, and for a long time so did Cyc, and as a consequence those projects saw very little actual usage by anyone. The more easily researchers can get their hands on the corpus, the more likely it is that some interesting applications will result. It might also be worth noting that cross-validated common sense information can be grabbed directly from the internet, from sites like Wikipedia. I've had a program doing this for quite some time, and the quality of the data acquired is good. - Bob On 18/01/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: On 1/19/07, Matt Mahoney [EMAIL PROTECTED] wrote: I think if you want to make a business out of AI, you are in for a lot of work. First you need something that is truly innovative, that does something that nobody else can do. What will that be? A search engine better than Google? A new operating system that understands natural language? A car that drives itself? A household servant robot? A program that can manage a company? A better spam detector? Text compression? Write down a well defined goal. Do research. What is your competition? How are your ideas better than what's been done? Prove it (with benchmarks), and the opportunities will come. Thanks for the tips. My idea is quite simple, slightly innovative, but not groundbreaking. Basically, I want to collect a knowledgebase of facts as well as rules. Facts are like water is wet etc. The rules I explain with this example: Cats have claws; Kitty is a cat; therefore Kitty has claws.
Here is an implicit rule that says if X is-a Y and Z(Y), then Z(X). I call rules like this the Rules of Thought. They are not logical tautologies but they express some common thought patterns. My theory is that if we collect a bunch of these rules, add a database of common sense facts, and add a rule-based FOPL inference engine (which may be enhanced with eg Pei Wang's numerical logic), then we have a common sense reasoner. That's what I'm trying to build as a first-stage AGI. If it does work, there may be some commercial applications for such a reasoner. Also it would serve as the base to build a full AGI capable of machine learning etc (I have crudely worked out the long-term plan). So, is this a good business idea? YKY
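The "Rule of Thought" in YKY's Kitty example can be sketched in a few lines (a minimal illustration of the inheritance pattern, not any particular engine): if X is-a Y and Z(Y), then Z(X).

```python
# Toy knowledge base, following the example in the message above.
isa = {"kitty": "cat"}                  # Kitty is a cat
properties = {"cat": {"has_claws"}}     # cats have claws

def derive(x):
    """Collect the properties of x, inheriting along is-a links: Z(Y) gives Z(X)."""
    props = set(properties.get(x, ()))
    parent = isa.get(x)
    if parent is not None:
        props |= derive(parent)
    return props

print(derive("kitty"))  # {'has_claws'} -- therefore Kitty has claws
```

The interesting (and hard) part is everything this sketch omits: exceptions (a declawed cat), uncertainty, and the thousands of other such thought patterns that would need to be collected.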
Re: [agi] Project proposal: MindPixel 2
YKY, Frankly, I still see many conceptual confusions in your description. Of course, some of them come from other people's mistakes, but they will hurt your work anyway. For example, what you call rule in your postings has two different meanings: (1) A declarative implication statement, X ==> Y; (2) A procedure that produces conclusions from premises, {X} |- Y. These two are related, but not the same thing. Both can be learned, but through very different paths. To confuse the two will cause a mess. The failure of GOFAI has reasons deeper than you suggested. Like Ben, I think you will repeat the same mistakes if you follow the current plan. Just adding numbers to your rules won't solve all the problems. More knowledge, higher intelligence is an intuitively attractive slogan, but it has many problems in it. For example, more knowledge will easily lead to combinatorial explosion, and the reasoning system will derive many true but useless conclusions. How do you deal with that? I don't think it is a good idea to attract many volunteers to a project unless the plan is mature enough that people's time and interest won't be wasted. Sources of human knowledge will be needed by any AGI project, so projects like CYC or MindPixel will be useful, though I'm afraid neither is cost-effective enough to play a central role in satisfying this need. Mining the Web may be more efficient, though it will surely leave gaps in the knowledge base to be filled in by other methods, such as personal experience, NLP, interactive tutoring, etc. Sorry for the negative tone, but since you mentioned my work, I have to clarify my position. Pei On 1/19/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: On 1/19/07, Benjamin Goertzel [EMAIL PROTECTED] wrote: Well YKY, I don't feel like rehashing these ancient arguments on this list!! Others are welcome to do so, if they wish... ;-) You are welcome to repeat the mistakes of the past if you like, but I frankly consider it a waste of effort.
What you have not explained is how what you are doing is fundamentally different from what has been tried N times in the past -- by larger, better-funded teams with more expertise in mathematical logic... Well, I think people gave up on logic-based AI (GOFAI if you will) in the 80s because of newer techniques such as neural networks and statistical learning methods. They were not necessarily aware of what exactly was the cause of the failure. If they had been, they would have tackled it. For the type of common sense reasoner I described, we need a *massive* number of rules. You can either acquire these rules via machine learning or direct encoding. Machine learning of such rules is possible, but the area of research is kind of immature. OTOH there has not been a massive project to collect such rules by hand. So that explains why my type of system has not been tried before. My system is conceptually very close to Cyc, but the difference is that Cyc only contains ground facts and relies on special predicates (eg #$isa, #$genls) to do the reasoning. My project may be the first to openly collect facts as well as rules. I guess Novamente or NARS can benefit by importing these rules, if the format is right? YKY
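Pei's distinction between the two meanings of rule can be made concrete with a sketch (invented names; a deliberately minimal illustration): a declarative implication X ==> Y is a piece of *data* the system stores and can itself learn or revise, while modus ponens, {X, X ==> Y} |- Y, is a *procedure* that consumes such data. A knowledge-collection project like MindPixel 2 gathers the former; an inference engine supplies the latter.

```python
# Declarative level: implications stored as data (antecedent, consequent).
kb_implications = {("wet(x)", "may_conduct(x)")}
kb_facts = {"wet(x)"}

def modus_ponens(facts, implications):
    """Procedure {X, X ==> Y} |- Y: derive consequents whose antecedents hold."""
    derived = set()
    for antecedent, consequent in implications:
        if antecedent in facts:
            derived.add(consequent)
    return derived

print(modus_ponens(kb_facts, kb_implications))  # {'may_conduct(x)'}
```

Confusing the two levels — e.g. treating every collected implication as a new inference procedure — is exactly the mess Pei warns about: the implications live in the knowledge base, while the (small, fixed) set of procedures lives in the engine.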
Re: [agi] Project proposal: MindPixel 2
YKY, Pei's attitude is pretty similar to mine on these matters, although we differ on other more detailed issues regarding AGI. And, please note that compared to most AI researchers, Pei and I would be among the folks most likely to be sympathetic to your ideas, given that -- we are both explicitly in favor of pushing hard toward AGI rather than fiddling with narrow AI -- we are both in favor of uncertain logic systems as one highly viable path toward AGI You have not explained how you will overcome the issues that plagued GOFAI, such as -- the need for massive amounts of highly uncertain background knowledge to make real-world commonsense inferences -- the combinatorial explosion that ensues when you try to control logical inference on a large body of data My own solution to these problems is to -- learn most knowledge via experience rather than via explicit encoding -- utilize a subtle combination of inference, statistical pattern mining and artificial economics for inference control Pei agrees with me on the learning-via-experience part but has a different approach to the combinatorial explosion problem of inference control. But you have not yet presented any original solutions to these or other major well-documented problems with the GOFAI approach. Yes, intuitively the approach you're suggesting sounds like it should work -- at first. That is why masses of research funding were spent on it decades ago, and why hundreds of brilliant people spent their lives on GOFAI. But you are not giving us any rational reason to suspect you might succeed in this sort of approach where so many others have failed. What is your new and different idea? -- Ben On 1/19/07, Pei Wang [EMAIL PROTECTED] wrote: YKY, Frankly, I still see many conceptual confusions in your description. Of course, some of them come from other people's mistakes, but they will hurt your work anyway.
For example, what you called a "rule" in your postings has two different meanings: (1) A declarative implication statement, X ==> Y; (2) A procedure that produces conclusions from premises, {X} |- Y. These two are related, but not the same thing. Both can be learned, but through very different paths. To confuse the two will cause a mess. The failure of GOFAI has reasons deeper than you suggested. Like Ben, I think you will repeat the same mistake if you follow the current plan. Just adding numbers to your rules won't solve all the problems. "More knowledge, higher intelligence" is an intuitively attractive slogan, but has many problems in it. For example, more knowledge will easily lead to combinatorial explosion, and the reasoning system will derive many true but useless conclusions. How do you deal with that? I don't think it is a good idea to attract many volunteers to a project unless the plan is mature enough that people's time and interest won't be wasted. Sources of human knowledge will be needed by any AGI project, so projects like CYC or MindPixel will be useful, though I'm afraid neither is cost-effective enough to play a central role in satisfying this need. Mining the Web may be more efficient, though it will surely leave gaps in the knowledge base to be filled in by other methods, such as personal experience, NLP, interactive tutoring, etc. Sorry for the negative tone, but since you mentioned my work, I have to clarify my position. Pei On 1/19/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: On 1/19/07, Benjamin Goertzel [EMAIL PROTECTED] wrote: Well YKY, I don't feel like rehashing these ancient arguments on this list!! Others are welcome to do so, if they wish... ;-) You are welcome to repeat the mistakes of the past if you like, but I frankly consider it a waste of effort. 
What you have not explained is how what you are doing is fundamentally different from what has been tried N times in the past -- by larger, better-funded teams with more expertise in mathematical logic... Well I think people gave up on logic-based AI (GOFAI if you will) in the 80s because of newer techniques such as neural networks and statistical learning methods. They were not necessarily aware of what exactly was the cause of failure. If they had been, they would have tackled it. For the type of common sense reasoner I described, we need a *massive* number of rules. You can acquire these rules either via machine learning or via direct encoding. Machine learning of such rules is possible, but that area of research is still immature. OTOH there has not been a massive project to collect such rules by hand. So that explains why my type of system has not been tried before. My system is conceptually very close to Cyc, but the difference is that Cyc only contains ground facts and relies on special predicates (e.g. #$isa, #$genls) to do the reasoning. My project may be the first to openly collect facts as well as rules. I guess Novamente or NARS can benefit by importing these rules, if the format is right? YKY
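Pei's distinction between a declarative implication and an inference procedure, quoted above, is easy to make concrete. Both representations below are invented for the sketch: the implication is an item *in* the knowledge base, while modus ponens is code that operates *on* such items.

```python
# (1) Declarative: an implication is just another item of knowledge.
kb = {("implies", "raining", "wet_ground"), "raining"}

# (2) Procedural: modus ponens is a procedure mapping premises to conclusions.
def modus_ponens(kb):
    """Return every consequent whose antecedent is asserted in the KB."""
    new = set()
    for item in kb:
        if isinstance(item, tuple) and item[0] == "implies":
            _, antecedent, consequent = item
            if antecedent in kb:
                new.add(consequent)
    return new

print(modus_ponens(kb))  # {'wet_ground'}
```

Collecting items like (1) from volunteers is a data problem; learning or improving procedures like (2) is a quite different problem, which is Pei's point.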
Re: [agi] Project proposal: MindPixel 2
Regarding Mindpixel 2, FWIW, one kind of knowledge base that would be most interesting to me as an AGI developer would be a set of pairs of the form (Simple English sentence, formal representation) For instance, a [nonrepresentatively simple] piece of knowledge might be (Cats often chase mice, { often( chase(cat, mouse) ) } ) This sort of training corpus would be really nice for providing some extra help to an AI system that was trying to learn English. Equivalently one could use a set of pairs of the form (English sentence, Lojban sentence) If Lojban is not used, then one needs to make some other highly particular specification regarding the logical representation language. -- Ben On 1/19/07, Benjamin Goertzel [EMAIL PROTECTED] wrote: YKY, Pei's attitude is pretty similar to mine on these matters, although we differ on other more detailed issues regarding AGI. And, please note that compared to most AI researchers, Pei and I would be among the folks most likely to be sympathetic to your ideas, given that -- we are both explicitly in favor of pushing hard toward AGI rather than fiddling with narrow AI -- we are both in favor of uncertain logic systems as one highly viable path toward AGI You have not explained how you will overcome the issues that plagued GOFAI, such as -- the need for massive amounts of highly uncertain background knowledge to make real-world commonsense inferences -- the combinatorial explosion that ensues when you try to control logical inference on a large body of data My own solution to these problems is to -- learn most knowledge via experience rather than via explicit encoding -- utilize a subtle combination of inference, statistical pattern mining and artificial economics for inference control Pei agrees with me on the learning via experience part but has a different approach to the combinatorial explosion problem of inference control. 
But you have not yet presented any original solutions to these or other major well-documented problems with the GOFAI approach. Yes, intuitively the approach you're suggesting sounds like it should work -- at first. That is why masses of research funding were spent on it decades ago, and why hundreds of brilliant people spent their lives on GOFAI. But you are not giving us any rational reason to suspect you might succeed in this sort of approach where so many others have failed. What is your new and different idea? -- Ben On 1/19/07, Pei Wang [EMAIL PROTECTED] wrote: YKY, Frankly, I still see many conceptual confusions in your description. Of course, some of them come from other people's mistakes, but they will hurt your work anyway. For example, what you called a "rule" in your postings has two different meanings: (1) A declarative implication statement, X ==> Y; (2) A procedure that produces conclusions from premises, {X} |- Y. These two are related, but not the same thing. Both can be learned, but through very different paths. To confuse the two will cause a mess. The failure of GOFAI has reasons deeper than you suggested. Like Ben, I think you will repeat the same mistake if you follow the current plan. Just adding numbers to your rules won't solve all the problems. "More knowledge, higher intelligence" is an intuitively attractive slogan, but has many problems in it. For example, more knowledge will easily lead to combinatorial explosion, and the reasoning system will derive many true but useless conclusions. How do you deal with that? I don't think it is a good idea to attract many volunteers to a project unless the plan is mature enough that people's time and interest won't be wasted. Sources of human knowledge will be needed by any AGI project, so projects like CYC or MindPixel will be useful, though I'm afraid neither is cost-effective enough to play a central role in satisfying this need. 
Mining the Web may be more efficient, though it will surely leave gaps in the knowledge base to be filled in by other methods, such as personal experience, NLP, interactive tutoring, etc. Sorry for the negative tone, but since you mentioned my work, I have to clarify my position. Pei On 1/19/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: On 1/19/07, Benjamin Goertzel [EMAIL PROTECTED] wrote: Well YKY, I don't feel like rehashing these ancient arguments on this list!! Others are welcome to do so, if they wish... ;-) You are welcome to repeat the mistakes of the past if you like, but I frankly consider it a waste of effort. What you have not explained is how what you are doing is fundamentally different from what has been tried N times in the past -- by larger, better-funded teams with more expertise in mathematical logic... Well I think people gave up on logic-based AI (GOFAI if you will) in the 80s because of newer techniques such as neural networks and statistical learning methods. They were not necessarily aware of what exactly was the cause of failure.
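Ben's suggested corpus of (Simple English sentence, formal representation) pairs could be as plain as a list of tuples. The formal notation below is shorthand invented here for the sketch, not a committed representation language:

```python
# A two-entry sample of the proposed bilingual corpus: surface English
# on one side, a logical form on the other. The second pair and the
# notation itself are illustrative assumptions.
corpus = [
    ("Cats often chase mice", "often(chase(cat, mouse))"),
    ("Dogs bark",             "bark(dog)"),
]

# Aligned pairs like these give a language learner direct supervision:
# it can train a mapping from text to logic (or logic to text).
for english, logic in corpus:
    print(f"{english!r:30} -> {logic}")
```

The same structure works for (English sentence, Lojban sentence) pairs; only the right-hand column changes.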
Re: [agi] Project proposal: MindPixel 2
On 1/19/07, Bob Mottram [EMAIL PROTECTED] wrote: My feeling is that this probably isn't a great business idea. I think collecting common sense data and building that into a general reasoner should really be thought of as a long term effort, which is unlikely to appeal to business investors expecting to see a return within a few years. In contrast to many businesses in this internet era, my project seems to need a low level of funding over a relatively long period of time (e.g. 5-10 years). I actually think this is a better business model for AGI (cf Ben's WebMind story =)). Funding is of secondary importance to finding the right partners; I know quite a few people with VC connections. If any attempt is made to build a second version of mindpixel, I think the open source (or open corpus) model would be the obvious choice. Chris McKinstry kept his database secret, and for a long time so did Cyc, and as a consequence those projects saw very little actual usage by anyone. The more easily researchers can get their hands on the corpus, the more likely it is that some interesting applications will result. How about this: the database would be open for anyone to download, for experimentation or whatever purpose. Only when someone wants to incorporate the data in an AGI would a license fee be needed. Also I would make the inference engine etc. open source, again within a commercial context. This approach is not so common but I think it gets the best of both worlds. It might also be worth noting that cross-validated common sense information can be grabbed directly from the internet, from sites like wikipedia. I've had a program doing this for quite some time, and the quality of the data acquired is good. There might be some gaps in the knowledge you acquired. If I really run the project I would also import knowledge acquired from other methods. 
YKY
Re: [agi] Project proposal: MindPixel 2
On 1/19/07, Pei Wang [EMAIL PROTECTED] wrote: For example, what you called a "rule" in your postings has two different meanings: (1) A declarative implication statement, X ==> Y; (2) A procedure that produces conclusions from premises, {X} |- Y. These two are related, but not the same thing. Both can be learned, but through very different paths. To confuse the two will cause a mess. Thanks, that's a good point. In uncertain logic, the status of the classical logic connectives AND, OR, NOT may be somewhat different. For example, A -> B may no longer be equivalent to (!A v B) when A and B are attached with uncertainty values. Therefore I am still unsure about how to deal with -> etc. But I will pay special attention to this point. The failure of GOFAI has reasons deeper than you suggested. Like Ben, I think you will repeat the same mistake if you follow the current plan. Just adding numbers to your rules won't solve all the problems. "More knowledge, higher intelligence" is an intuitively attractive slogan, but has many problems in it. For example, more knowledge will easily lead to combinatorial explosion, and the reasoning system will derive many true but useless conclusions. How do you deal with that? That's the problem of forward-chaining without a goal. In fact, the human mind can easily think of a lot of useless implications in a situation. If we have a query as a goal, we can use backward-chaining. Otherwise we can rank the implied sentences into levels of importance. This does not seem to be a show-stopper. I don't think it is a good idea to attract many volunteers to a project unless the plan is mature enough that people's time and interest won't be wasted. Sources of human knowledge will be needed by any AGI project, so projects like CYC or MindPixel will be useful, though I'm afraid neither is cost-effective enough to play a central role in satisfying this need. 
Mining the Web may be more efficient, though it will surely leave gaps in the knowledge base to be filled in by other methods, such as personal experience, NLP, interactive tutoring, etc. Speaking of cost-effectiveness, my project can be pretty low-cost =) I try to keep things simple and not pursue a million ideas at once, though I have plans for an entire AGI. YKY
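YKY's earlier worry about the connectives (that A -> B need not equal !A v B once uncertainty enters) can be illustrated numerically: under a probabilistic reading, the material conditional P(!A v B) and the conditional probability P(B|A) come apart. The joint distribution below is arbitrary, chosen only for the sketch:

```python
# An arbitrary joint distribution over (A, B), for illustration only.
joint = {
    (True, True): 0.1, (True, False): 0.3,
    (False, True): 0.2, (False, False): 0.4,
}

# Material conditional: P(!A v B)
p_material = sum(p for (a, b), p in joint.items() if (not a) or b)
# Conditional probability: P(B | A)
p_a = sum(p for (a, _), p in joint.items() if a)
p_cond = joint[(True, True)] / p_a

print(p_material, p_cond)  # roughly 0.7 vs 0.25
```

The two values differ sharply, so an uncertain logic must decide which reading its "->" is supposed to capture; the classical equivalence cannot simply be assumed.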
Re: [agi] Project proposal: MindPixel 2
On 19/01/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: How about this: the database would be open for anyone to download, for experimentation or whatever purpose. Only when someone wants to incorporate the data in an AGI would a license fee be needed. Also I would make the inference engine etc. open source, again within a commercial context. This approach is not so common but I think it gets the best of both worlds. This might be OK. You could distribute any funds from commercial licenses proportionately amongst those people who had entered the data. However, such a system might be difficult to enforce. I think we'll soon be entering an age where the internet becomes a big knowledge crunching monster, with information from one source being processed and spat out to other destinations in a completely automated way. It would be very hard to tell in this situation exactly who was using the data, or indeed what the original source of the data was. If a commercial model is adopted it should be made clear that this isn't a get-rich-quick scheme. If people start entering data believing that they're soon going to be making money out of it, then after a year or two with no financial return in sight disappointment sets in, which quickly leads to bad press and people losing interest in the project. This seems to be what happened with the original mindpixel.
Re: [agi] Project proposal: MindPixel 2
YKY (Yan King Yin) wrote: ... I think a project like this one requires substantial efforts, so people would need to be paid to do some of the work (programming, interface design, etc), especially if we want to build a high quality knowledgebase. If we make it free then a likely outcome is that we get a lot of noise but very few people actually contribute. I'm not an academic (left uni a couple years ago) so I can't get academic funding for this. If I can't start an AI business I'd have to entirely give up AI as a career. I hope you can understand these circumstances. YKY I can understand those circumstances, but if you expect people to contribute, you must give them something back. One thing that's cheap to give back is the work that they and others have contributed. Giving back less generally results in people not being willing to participate. Even if you claim sole rights to commercially exploit the work, you will find it much more difficult to get folk to participate. They will feel that you are stealing their work without just compensation. You raise the issue of compensation to you, and that's fair. But if you take out too much, you will cause the project to fail just as surely as if you hadn't put in the time to design the interface. If you merely make a requirement that people be a better-than-average contributor to be entitled to download the current results, then you will eliminate most potential competitors...and the remaining ones will be those who are also dedicating time and effort to making your project work. It's true that old versions of your work will circulate, but that should do little harm. People only participate in a public project if they feel they are getting a good return out of it. What a good return is, is subjective, but few people consider "I put in a bunch of work, and they don't even mention my name" to be a good return. 
You want to give people a return that they see as more valuable than their efforts, but which costs you a lot less than their efforts. Status in a community requires that the community exist. (At some point you'll want to give people scores depending on the amount of their work that is included in the current project...or something that will relate positively to that. This is a cheap status reward, and will boost community participation. On Slashdot I notice that just having a low numbered user ID has become a status marker of sorts. I.e., you've been a member of the community for a long time. That was a REALLY cheap status gift, but it took a long time to build to anything of value. Much quicker was the right to meta-moderate. Slightly less quick was the right to moderate. Note that these are both seen by the Slashdot community as things of worth, yet to the operator of Slashdot they were instituted as ways of cutting cost while improving quality. Also note that it took a long time for them to become worth much as status markers. You need something else to use while you're getting started.)
Re: [agi] Project proposal: MindPixel 2
On 1/19/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: "More knowledge, higher intelligence" is an intuitively attractive slogan, but has many problems in it. For example, more knowledge will easily lead to combinatorial explosion, and the reasoning system will derive many true but useless conclusions. How do you deal with that? That's the problem of forward-chaining without a goal. In fact, the human mind can easily think of a lot of useless implications in a situation. If we have a query as a goal, we can use backward-chaining. Otherwise we can rank the implied sentences into levels of importance. This does not seem to be a show-stopper. Backward inference faces the same problem --- more knowledge means more possible ways to derive subgoals. No traditional control strategy can scale up to a huge knowledge base. Importance-ranking will surely be necessary, but this idea by itself is not enough. Pei
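Pei's point about backward inference can be seen in even a toy backward chainer: every extra rule whose conclusion matches the current goal adds another branch of subgoals, so the search fans out with the size of the KB. The rules and facts below are invented for the sketch:

```python
# conclusion -> list of alternative premise lists (three ways to be
# "wealthy" already means three branches from that one goal).
rules = {
    "wealthy": [["works_hard"], ["inherits"], ["wins_lottery"]],
    "works_hard": [["motivated"]],
}
facts = {"motivated"}

def prove(goal, trace=None):
    """Naive backward chainer; `trace` collects every subgoal visited."""
    if trace is not None:
        trace.append(goal)
    if goal in facts:
        return True
    return any(all(prove(p, trace) for p in premises)
               for premises in rules.get(goal, []))

trace = []
print(prove("wealthy", trace=trace))  # True
print(trace)  # every subgoal the search touched
```

Here the first alternative happens to succeed, but an unprovable goal forces the chainer to expand *all* alternatives for every subgoal, which is exactly where a huge KB blows up.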
Re: [agi] Project proposal: MindPixel 2
On 1/19/07, Benjamin Goertzel [EMAIL PROTECTED] wrote: You have not explained how you will overcome the issues that plagued GOFAI, such as -- the need for massive amounts of highly uncertain background knowledge to make real-world commonsense inferences Precisely: we need to amass millions of knowledge items, and some items may have uncertainty. This is precisely what I'm trying to do. The alternative route is machine learning, but that requires a sensorium or a NL interface, which is an even more daunting task. (But I don't object to you going that way =)) -- the combinatorial explosion that ensues when you try to control logical inference on a large body of data I have studied this issue a bit, unbeknownst to you =) There are ways to tackle massive numbers of rules, e.g. the Rete algorithm, predicate hashing, etc. Soar is a good example using the Rete algorithm. It can handle millions of rules (and probably many more). My own solution to these problems is to -- learn most knowledge via experience rather than via explicit encoding Nothing wrong with this approach, but it may be even more difficult than mine. -- utilize a subtle combination of inference, statistical pattern mining and artificial economics for inference control You're getting into the topic of inference control, but I was only talking about collecting knowledge in the form of rules. Speaking of my project, it does not endorse specific inference methods. It is up to the AGI designer how to use the data. BTW, statistical pattern mining is good for *learning* patterns; I wouldn't use it for inference per se. For me inference is done only using *existing* rules and facts in the KB. Pattern mining is for discovering *new* rules and facts, which is very time-consuming and compute-intensive. Pei agrees with me on the learning via experience part but has a different approach to the combinatorial explosion problem of inference control. 
But you have not yet presented any original solutions to these or other major well-documented problems with the GOFAI approach. Again, I have no problems with learning via experience. What I propose is to augment this with knowledge acquisition via direct encoding, with the help of the net community. Do you have some reasons against this? Is it difficult for Novamente to incorporate the rules database? Yes, intuitively the approach you're suggesting sounds like it should work -- at first. That is why masses of research funding were spent on it decades ago, and why hundreds of brilliant people spent their lives on GOFAI. But you are not giving us any rational reason to suspect you might succeed in this sort of approach where so many others have failed. What is your new and different idea? I think the key innovation is that I allow rules with variables as well as facts, and that such knowledge would be collected from online users on a massive scale (which doesn't mean the project requires massive $$$s). Such a combination has NOT been attempted before, AFAIK. Frankly I'm not that knowledgeable about failed GOFAI projects (I was just a teenager in the 80s, playing with a TRS-80). Decades ago, the internet didn't exist and there was no way of amassing knowledge like MindPixel can. This is perhaps the most important reason why past projects failed. Maybe you're just reflexively saying that GOFAI is a failure, without giving it serious consideration. Cyc is not a complete failure. It's still ALIVE and it can do some reasoning about terrorist attacks etc. Why wouldn't an improved GOFAI succeed? Perhaps it is a misconception that everything associated with GOFAI _must_ be abandoned... YKY
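The "predicate hashing" idea YKY mentions can be sketched as follows: index rules by the predicate of their triggering premise, so that a new fact only visits rules that could possibly fire. (Rete proper goes further and caches partial matches across facts.) All rule contents below are invented for illustration:

```python
from collections import defaultdict

# Each rule: (triggering predicate, body that maps a fact to a
# conclusion or None). Both rules are made-up examples.
rules = [
    ("isa",  lambda f: ("can_breathe", f[1]) if f[2] == "animal" else None),
    ("owns", lambda f: ("responsible_for", f[1], f[2])),
]

# Hash rules by predicate so lookup is O(rules-per-predicate),
# not O(all rules).
index = defaultdict(list)
for predicate, body in rules:
    index[predicate].append(body)

def fire(fact):
    """Consider only the rules indexed under this fact's predicate."""
    return [c for body in index[fact[0]] if (c := body(fact)) is not None]

print(fire(("isa", "rex", "animal")))
print(fire(("owns", "yky", "laptop")))
```

With a million rules spread over thousands of predicates, each incoming fact touches only its own bucket, which is the basic reason systems like Soar stay fast as the rule count grows.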
Re: [agi] Project proposal: MindPixel 2
Your thermostat example can be used to show what I am talking about. The thermostat has an algorithm that says: when the temperature gets below some X amount, turn on the burner and fan until the temperature rises to at least some X+N amount. You get to set the X amount. It doesn't have a table that says "at exactly 69.1 turn on and turn off at exactly 70.3", as the temperature reading might never register exactly 69.1 or 70.3. If you made a database with enough entries and fine enough detail, you could just look up turn-on and turn-off points for any recorded temperature, but isn't just using the simple algorithm a better solution? I think if the goal was to create intelligence, this would be even more important. When I was in Physics in high school, I could have memorized all the formulas for my exams but instead, I found that by knowing only a few basic formulas I could derive all the others I needed during the exams. The problem is you have to know/understand how the formulas interrelate etc. When my kids were growing up, I tried to minimize the number of explicit rule/punishment combinations in modifying their behavior. Instead, I put forward policies and variable punishment so that it was much harder for my kids to get around the rules and so the punishment could always fit the crime. With a relatively small number of policies (analogous to the algorithms above), I could look after a much larger set of problems than I could by just resorting to a set of mindless rules. Working out how the policies were broken in each case and what the appropriate punishment is, is much harder than just using a set of rigid rules, but it is much more intelligent, don't you think? Do we divine the rules/laws/algorithms from a mass of data or do we generate the appropriate conclusions when we need them because we understand how it actually works? 
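The thermostat described above, written as the small algorithm rather than a lookup table (the set-point and hysteresis values below are arbitrary):

```python
def make_thermostat(x, n):
    """Return a controller with hysteresis: turn on below x, off at x + n."""
    state = {"on": False}
    def step(temp):
        if temp < x:
            state["on"] = True        # too cold: burner and fan on
        elif temp >= x + n:
            state["on"] = False       # warm enough: shut off
        return state["on"]            # between thresholds: keep last state
    return step

thermostat = make_thermostat(x=69.0, n=1.5)
print([thermostat(t) for t in (68.2, 69.4, 70.1, 70.5, 69.3)])
```

A handful of lines covers every temperature a table of recorded readings ever could, which is the point of the example: the algorithm generalizes, the lookup table only memorizes.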
David Clark - Original Message - From: Charles D Hixson [EMAIL PROTECTED] To: agi@v2.listbox.com Sent: Friday, January 19, 2007 11:02 AM Subject: Re: [agi] Project proposal: MindPixel 2 David Clark wrote: I agree with Ben's post that this kind of system has been tried many times and produced very little. How can a collection of "Cats have claws; Kitty is a cat; therefore Kitty has claws" relate cat and kitty, and that kitty is slang normally used for a young cat? A database of this type seems to be like the Chinese room dilemma, where even if you got something that looked intelligent out of the system, you know for a fact that no intelligence exists. ... David Clark I'm not certain that I'm convinced by that argument. I tend to feel that as we approach the base level, intelligences DO decompose into pieces that are, themselves, not intelligent. (Otherwise one gets into an "It's turtles all the way down!" kind of argument.) Partially it's a matter of definition. Is a thermostat intelligent? To me the answer would be "Yes, at the most basic possible level" (i.e., I wouldn't consider a thermocouple intelligent). A thermostat maintains a homeostasis, and to me that is one of the most basic kinds of intelligence. I can easily see that one could have a reasonable definition of intelligence that was sufficiently specific AND excluded thermostats as being too basic, but I'm willing to grant to thermostats a basic amount of intelligence. I'm also willing to grant that to logic engines. And to many other things that I see as pieces of an AGI. They aren't general intelligences, and I'm not totally convinced that such things can, even in principle, exist. (Goedel's results seem to imply otherwise. No system can be both complete and consistent.) Still, we are an existence proof that something better than we've been able to build so far is possible. 
I suspect that we shave on both completeness and consistency, and that's probably an indication of what's needed to come any closer than we are.
Re: [agi] Project proposal: MindPixel 2
"More knowledge, higher intelligence" is an intuitively attractive slogan, but has many problems in it. For example, more knowledge will easily lead to combinatorial explosion, and the reasoning system will derive many true but useless conclusions. How do you deal with that? That's the problem of forward-chaining without a goal. In fact, the human mind can easily think of a lot of useless implications in a situation. If we have a query as a goal, we can use backward-chaining. Otherwise we can rank the implied sentences into levels of importance. This does not seem to be a show-stopper. Backward chaining is just as susceptible to combinatorial explosions as forward chaining... And, importance levels need to be context-dependent, so that assigning them requires sophisticated inference in itself... Ben
Re: [agi] Project proposal: MindPixel 2
I've been using OpenCyc as the standard ontology for my texai project. OpenCyc contains only the very few rules needed to enable the OpenCyc deductive inference engine to operate on its OpenCyc content. On the other hand ResearchCyc, whose licenses are available without fees for research purposes, has a large number of rules. I have a license and can state that my copy of RCyc has 55,794 rules out of a total of 2,689,421 non-bookkeeping assertions. Nearly all of these rules were entered by hand at Cycorp. Here are five at random with my comments to give you a feel for what RCyc contains:

[this is a typical temporal relations rule]
(#$implies
  (#$and
    (#$startingIntervalOfThing ?TEMP-THING ?TIME-INTERVAL)
    (#$startingPoint ?TEMP-THING ?TIME-POINT))
  (#$endsAfterStartingOf ?TIME-INTERVAL ?TIME-POINT))
in context: #$CycTemporalTheoryMt [the Cyc term suffix Mt means microtheory (context)]

[this is a typical spatial relations rule]
(#$implies
  (#$and
    (#$isa ?UNIVERSE #$UniversalSpaceRegion)
    (#$partOfSpaceRegion ?REGION ?UNIVERSE)
    (#$spaceRegionDifference ?COMPLEMENT ?UNIVERSE ?REGION))
  (#$spaceRegionComplement ?COMPLEMENT ?REGION))
in context: #$SpatialGMt

[this is a rather specialized rule that helps define the predicate #$eventCasualtyDataSentence]
(#$implies
  (#$and
    (#$isa ?PRED #$CasualtyPredicate)
    (#$assertedSentence (#$relationInstanceExists ?PRED ?SUBEVENT ?COL))
    (#$different ?EVENT ?SUBEVENT)
    (#$subEvents ?EVENT ?SUBEVENT))
  (#$eventCasualtyDataSentence ?EVENT
    (#$and
      (#$subEvents ?EVENT ?SUBEVENT)
      (#$relationInstanceExists ?PRED ?SUBEVENT ?COL))))
in context: #$BaseKB [this is a general domain context from which almost all other contexts inherit facts and rules]

[this is a typical rule in the naive physics domain]
(#$implies
  (#$and
    (#$isa ?HOLDING #$HoldingAnObject)
    (#$doneBy ?HOLDING ?AGENT)
    (#$objectActedOn ?HOLDING ?OBJ))
  (#$holdsIn ?HOLDING (#$touches ?AGENT ?OBJ)))
in context: #$NaivePhysicsMt

[This is a rule to guide a Cyc knowledge acquisition tool. 
Note that this rule represents a form of probability not seen in the other rules.]
(#$implies
  (#$genls ?COL #$EnclosingSomething)
  (#$keCommonQueryForTerm ?COL
    (#$relationAllExists #$enclosure ?COL :WHAT)))
in context: #$BaseKB

Cheers. -Steve http://sf.net/projects/texai - Original Message From: YKY (Yan King Yin) [EMAIL PROTECTED] To: agi@v2.listbox.com Sent: Friday, January 19, 2007 8:48:33 AM Subject: Re: [agi] Project proposal: MindPixel 2 On 1/19/07, Benjamin Goertzel [EMAIL PROTECTED] wrote: Well YKY, I don't feel like rehashing these ancient arguments on this list!! Others are welcome to do so, if they wish... ;-) You are welcome to repeat the mistakes of the past if you like, but I frankly consider it a waste of effort. What you have not explained is how what you are doing is fundamentally different from what has been tried N times in the past -- by larger, better-funded teams with more expertise in mathematical logic... Well I think people gave up on logic-based AI (GOFAI if you will) in the 80s because of newer techniques such as neural networks and statistical learning methods. They were not necessarily aware of what exactly was the cause of failure. If they had been, they would have tackled it. For the type of common sense reasoner I described, we need a *massive* number of rules. You can acquire these rules either via machine learning or via direct encoding. Machine learning of such rules is possible, but that area of research is still immature. OTOH there has not been a massive project to collect such rules by hand. So that explains why my type of system has not been tried before. My system is conceptually very close to Cyc, but the difference is that Cyc only contains ground facts and relies on special predicates (e.g. #$isa, #$genls) to do the reasoning. My project may be the first to openly collect facts as well as rules. I guess Novamente or NARS can benefit by importing these rules, if the format is right? 
YKY
Re: [agi] Project proposal: MindPixel 2
There will come a point when integrating Cyc-type assertions into Novamente will make sense for us, and I'll be curious how useful they turn out to be at that point. However, my impression is that OpenCyc's rules are not extensive enough to really add a lot to Novamente. ResearchCyc has better odds of being helpful but can't be used for commercial purposes, unfortunately. Still, we could integrate it with NM experimentally to see how useful it was. However, my view is that the integration of this sort of knowledge is most likely to be useful to an AI system once it has achieved a certain level of experiential intelligence, via embodied-learning means... ben On 1/19/07, Stephen Reed [EMAIL PROTECTED] wrote: I've been using OpenCyc as the standard ontology for my texai project. OpenCyc contains only the very few rules needed to enable the OpenCyc deductive inference engine to operate on its OpenCyc content. On the other hand ResearchCyc, whose licenses are available without fees for research purposes, has a large number of rules. I have a license and can state that my copy of RCyc has 55,794 rules out of a total of 2,689,421 non-bookkeeping assertions. Nearly all of these rules were entered by hand at Cycorp. 
Here are five at random with my comments to give you a feel for what RCyc contains:

[this is a typical temporal relations rule]
(#$implies
  (#$and
    (#$startingIntervalOfThing ?TEMP-THING ?TIME-INTERVAL)
    (#$startingPoint ?TEMP-THING ?TIME-POINT))
  (#$endsAfterStartingOf ?TIME-INTERVAL ?TIME-POINT))
in context: #$CycTemporalTheoryMt
[the Cyc term suffix Mt means microtheory (context)]

[this is a typical spatial relations rule]
(#$implies
  (#$and
    (#$isa ?UNIVERSE #$UniversalSpaceRegion)
    (#$partOfSpaceRegion ?REGION ?UNIVERSE)
    (#$spaceRegionDifference ?COMPLEMENT ?UNIVERSE ?REGION))
  (#$spaceRegionComplement ?COMPLEMENT ?REGION))
in context: #$SpatialGMt

[this is a rather specialized rule that helps define the predicate #$eventCasualtyDataSentence]
(#$implies
  (#$and
    (#$isa ?PRED #$CasualtyPredicate)
    (#$assertedSentence (#$relationInstanceExists ?PRED ?SUBEVENT ?COL))
    (#$different ?EVENT ?SUBEVENT)
    (#$subEvents ?EVENT ?SUBEVENT))
  (#$eventCasualtyDataSentence ?EVENT
    (#$and
      (#$subEvents ?EVENT ?SUBEVENT)
      (#$relationInstanceExists ?PRED ?SUBEVENT ?COL))))
in context: #$BaseKB
[this is a general domain context from which almost all other contexts inherit facts and rules]

[this is a typical rule in the naive physics domain]
(#$implies
  (#$and
    (#$isa ?HOLDING #$HoldingAnObject)
    (#$doneBy ?HOLDING ?AGENT)
    (#$objectActedOn ?HOLDING ?OBJ))
  (#$holdsIn ?HOLDING (#$touches ?AGENT ?OBJ)))
in context: #$NaivePhysicsMt

[This is a rule to guide a Cyc knowledge acquisition tool. Note that this rule represents a form of probability not seen in the other rules.]
(#$implies
  (#$genls ?COL #$EnclosingSomething)
  (#$keCommonQueryForTerm ?COL (#$relationAllExists #$enclosure ?COL :WHAT)))
in context: #$BaseKB

Cheers.
-Steve http://sf.net/projects/texai - Original Message From: YKY (Yan King Yin) [EMAIL PROTECTED] To: agi@v2.listbox.com Sent: Friday, January 19, 2007 8:48:33 AM Subject: Re: [agi] Project proposal: MindPixel 2 On 1/19/07, Benjamin Goertzel [EMAIL PROTECTED] wrote: Well YKY, I don't feel like rehashing these ancient arguments on this list!! Others are welcome to do so, if they wish... ;-) You are welcome to repeat the mistakes of the past if you like, but I frankly consider it a waste of effort. What you have not explained is how what you are doing is fundamentally different from what has been tried N times in the past -- by larger, better-funded teams with more expertise in mathematical logic... Well, I think people gave up on logic-based AI (GOFAI if you will) in the 80s because of newer techniques such as neural networks and statistical learning methods. They were not necessarily aware of what exactly was the cause of failure. If they had been, they would have tackled it. For the type of common sense reasoner I described, we need a *massive* number of rules. You can either acquire these rules via machine learning or direct encoding. Machine learning of such rules is possible, but the area of research is kind of immature. OTOH there has not been a massive project to collect such rules by hand. So that explains why my type of system has not been tried before. My system is conceptually very close to Cyc, but the difference is that Cyc only contains ground facts and relies on special predicates (eg #$isa, #$genls) to do the reasoning. My project may be the first to openly collect facts as well as rules. I guess Novamente or NARS can benefit by importing these rules, if the format is right? YKY This list is sponsored by AGIRI: http://www.agiri.org/email To unsubscribe or change your options, please go to: http://v2.listbox.com/member/?list_id=303
Re: [agi] Project proposal: MindPixel 2
On 1/20/07, David Clark [EMAIL PROTECTED] wrote: ... Do we divine the rules/laws/algorithms from a mass of data or do we generate the appropriate conclusions when we need them because we understand how it actually works? Just as chemistry is reducible to physics, in theory, while in reality it is a completely different subject... I think it is necessary that we populate the knowledgebase with *redundant* facts/rules, so we don't have to derive everything from scratch every time we do an inference. Some facts/rules are derivable from other facts/rules. Whenever a fact/rule requires more than, say, 3 steps of inference, we will enter it into the knowledgebase. This does not mean that the AGI does not *understand* the facts/rules. It does, but it memorizes intermediate results. If needed, it can explain the facts/rules using more basic facts/rules. YKY
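For what it's worth, the memoization YKY describes can be sketched in a few lines. This is my own illustration, not YKY's actual design; the tuple encoding, class name, and method names are assumptions, with only the 3-step threshold taken from his email:

```python
# Illustrative sketch only: memoize any derived fact whose inference chain
# exceeded a fixed step threshold, so later queries can look it up instead
# of re-deriving it from scratch.

STEP_THRESHOLD = 3  # "more than, say, 3 steps of inference"

class KnowledgeBase:
    def __init__(self, ground_facts):
        self.facts = set(ground_facts)  # ground facts plus memoized results
        self.provenance = {}            # derived fact -> its supporting facts

    def record_derivation(self, fact, supports, steps):
        """Store a derived fact if its chain was long; keep the supports so
        the system can still 'explain' it from more basic facts/rules."""
        if steps > STEP_THRESHOLD:
            self.facts.add(fact)
            self.provenance[fact] = supports
        return fact

kb = KnowledgeBase({("is-a", "Kitty", "cat"), ("has", "cat", "claws")})
kb.record_derivation(("has", "Kitty", "claws"),
                     [("is-a", "Kitty", "cat"), ("has", "cat", "claws")],
                     steps=4)
```

On a later query, ("has", "Kitty", "claws") is answered by a set lookup rather than a fresh inference chain, while the stored provenance preserves the ability to explain it.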
Re: [agi] Project proposal: MindPixel 2
On 1/20/07, Benjamin Goertzel [EMAIL PROTECTED] wrote: Backward chaining is just as susceptible to combinatorial explosions as forward chaining... And, importance levels need to be context-dependent, so that assigning them requires sophisticated inference in itself... The problem may not be so serious. Common sense reasoning may require only *shallow* inference chains, eg 5 applications of rules. So I'm very optimistic =) Your worries are only applicable to 100-page theorem-proving tasks, not really the concern of AGI. YKY
Re: [agi] Project proposal: MindPixel 2
And, importance levels need to be context-dependent, so that assigning them requires sophisticated inference in itself... The problem may not be so serious. Common sense reasoning may require only *shallow* inference chains, eg 5 applications of rules. So I'm very optimistic =) Your worries are only applicable to 100-page theorem-proving tasks, not really the concern of AGI. A) This is just not true; many commonsense inferences require significantly more than 5 applications of rules. B) Even if there are only 5 applications of rules, the combinatorial explosion still exists. If there are 10 rules and 1 billion knowledge items, then there may be up to 10 billion possibilities to consider in each inference step. So there are (10 billion)^5 possible 5-step inference trajectories, in this scenario ;-) Of course, some fairly basic pruning mechanisms can prune it down a lot, but one is still left with a combinatorial explosion that needs to be dealt with via subtle means... Please bear in mind that we actually have a functional uncertain logical reasoning engine within the Novamente system, and have experimented with feeding in knowledge from files and doing inference on them. (Though this has been mainly for system testing, as our primary focus is on doing inference based on knowledge gained via embodied experience in the AGISim world.) The truth is that, if you have a lot of knowledge in your system's memory, you need a pretty sophisticated, context-savvy inference control mechanism to do commonsense inference. Also, temporal inference can be quite tricky, and introduces numerous options for combinatorial explosion that you may not be thinking about when looking at atemporal examples of commonsense inference. Various conclusions may hold over various time scales; various pieces of knowledge may become obsolete at various rates, etc.
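Ben's back-of-envelope numbers, spelled out (this sketch only restates the figures already in his paragraph):

```python
# 10 rules x 1 billion knowledge items = up to 10 billion possibilities
# per inference step; a 5-step chain compounds this multiplicatively.
rules = 10
knowledge_items = 10**9
options_per_step = rules * knowledge_items   # 10 billion = 10**10
depth = 5
trajectories = options_per_step ** depth     # (10**10)**5 = 10**50

print(f"{options_per_step:.0e} options per step")   # 1e+10 options per step
print(f"{trajectories:.0e} 5-step trajectories")    # 1e+50 5-step trajectories
```

Even pruning 99.99% of candidates at every step would still leave 10^30 trajectories, which is why Ben argues the explosion must be handled by more subtle means than basic pruning.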
I imagine you will have a better sense of these issues once you have actually built an uncertain reasoning engine, fed knowledge into it, and tried to make it do interesting things. I certainly think this may be a valuable exercise for you to do. However, until you have done it, I think it's kind of silly for you to be speaking so confidently about how you can solve all the problems found by others in doing this kind of work!! I ask again, do you have some theoretical innovation that seems likely to allow you to circumvent all these very familiar problems?? -- Ben
Re: [agi] Project proposal: MindPixel 2
On 1/20/07, Stephen Reed [EMAIL PROTECTED] wrote: I've been using OpenCyc as the standard ontology for my texai project. OpenCyc contains only the very few rules needed to enable the OpenCyc deductive inference engine to operate on its OpenCyc content. On the other hand, ResearchCyc, whose licenses are available without fees for research purposes, has a large number of rules. I have a license and can state that my copy of RCyc has 55,794 rules out of a total of 2,689,421 non-bookkeeping assertions. Nearly all of these rules were entered by hand at Cycorp. Here are five at random with my comments to give you a feel for what RCyc contains: [...] Thanks a lot for this info... The Cyc rules you cited seem to be of the nonverbal knowledge kind, whereas my project tends to focus on the more verbal facet of common sense. But there should be no clear-cut boundary between the two. Do you think Cyc has a rule/fact like wet things can usually conduct electricity (or if X is wet then X may conduct electricity)? That's the kind of verbal knowledge I'm interested in -- things that can be entered by laymen in natural language. I use a logical form that has a (nearly) 1-1 mapping to NL. To keep things simple -- for this project -- we can focus on collecting the facts/rules, without talking about the inference engine or other deep AGI issues. I'll also contact some Cyc folks to see if they're interested in collaborating... YKY
Re: [agi] Project proposal: MindPixel 2
Hi, Do you think Cyc has a rule/fact like wet things can usually conduct electricity (or if X is wet then X may conduct electricity)? Yes, it does... I'll also contact some Cyc folks to see if they're interested in collaborating... IMO, to have any chance of interesting them, you will need to be able to explain to them VERY CLEARLY why your current proposed approach is superior to theirs -- given that it seems so philosophically similar to theirs, and given that they have already encoded millions of knowledge items and built an inference engine and language-processing front end! -- Ben G
Re: [agi] Project proposal: MindPixel 2
On 1/20/07, Benjamin Goertzel [EMAIL PROTECTED] wrote: A) This is just not true, many commonsense inferences require significantly more than 5 applications of rules OK, I concur. Long inference chains are built upon short inference steps. We need a mechanism to recognize the interestingness of sentences, so we only keep the interesting/relevant ones and build more deductions upon them. It's not easy, OK. The bottom line is that the knowledge acquisition project is *separable* from specific inference methods. B) Even if there are only 5 applications of rules, the combinatorial explosion still exists. If there are 10 rules and 1 billion knowledge items, then there may be up to 10 billion possibilities to consider in each inference step. So there are (10 billion)^5 possible 5-step inference trajectories, in this scenario ;-) Of course, some fairly basic pruning mechanisms can prune it down a lot, but one is still left with a combinatorial explosion that needs to be dealt with via subtle means... This is really solved: use a simple hashing of predicates, which is part of the Rete algorithm. For example, suppose you want to deduce whether dead birds can fly. There may be 5000 rules/facts about birds, and 1 rule/fact about dead things. (Reasonable?) So only these rules/facts would be checked; the other parts of the KB (about flowers, cats, etc.) are completely untouched (unsearched). Please bear in mind that we actually have a functional uncertain logical reasoning engine within the Novamente system, and have experimented with feeding in knowledge from files and doing inference on them. (Though this has been mainly for system testing, as our primary focus is on doing inference based on knowledge gained via embodied experience in the AGISim world.) Do you think the problems you encounter in Novamente are really due to combinatorial explosion, or rather to the *lack* of the right rules/facts?
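The predicate-hashing idea can be sketched like so. This is my own toy illustration, not YKY's code and not an actual Rete implementation (a full Rete network additionally shares partial join results across rules); the rule strings and predicate sets are invented:

```python
from collections import defaultdict

# Toy sketch of predicate-indexed rule retrieval: rules are bucketed by the
# predicates they mention, so a query about dead birds only touches rules
# mentioning "bird" or "dead" -- the flower rules are never searched.
rules = [
    ({"bird"},          "birds can fly"),
    ({"bird"},          "birds lay eggs"),
    ({"dead", "move"},  "dead things cannot move"),
    ({"flower", "sun"}, "flowers need sunlight"),
]

index = defaultdict(set)
for predicates, rule in rules:
    for p in predicates:
        index[p].add(rule)

query = {"bird", "dead"}  # "can dead birds fly?"
candidates = set().union(*(index[p] for p in query))
# candidates holds 3 of the 4 rules; "flowers need sunlight" is untouched.
```

The lookup cost scales with the number of rules sharing the query's predicates, not with the size of the whole KB, which is the point YKY is making; whether that suffices against Ben's combinatorial argument is exactly what the thread is debating.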
The truth is that, if you have a lot of knowledge in your system's memory, you need a pretty sophisticated, context-savvy inference control mechanism to do commonsense inference. I'm thinking about simple questions like can dead birds fly? etc. It shouldn't involve more than 5 steps. What you're talking about seems to be chaining many such steps to solve a detective story, that kind of thing. And yes, for that you need sophisticated inference mechanisms. Also, temporal inference can be quite tricky, and introduces numerous options for combinatorial explosion that you may not be thinking about when looking at atemporal examples of commonsense inference. Various conclusions may hold over various time scales; various pieces of knowledge may become obsolete at various rates, etc. Think 4D. Time is just another dimension. If you can do spatial reasoning you can do temporal reasoning. It *has* got to be the same, thanks to Einstein. If your approach uses special tricks to deal with time, then at least it is not an elegant solution. I'm not arrogant, and I admit I have not fully solved this 4D problem. It is kind of tricky, but I'm optimistic about it. I imagine you will have a better sense of these issues once you have actually built an uncertain reasoning engine, fed knowledge into it, and tried to make it do interesting things. I certainly think this may be a valuable exercise for you to do. However, until you have done it, I think it's kind of silly for you to be speaking so confidently about how you can solve all the problems found by others in doing this kind of work!! I ask again, do you have some theoretical innovation that seems likely to allow you to circumvent all these very familiar problems?? I've given brief answers to your earlier questions; I hope they're convincing. The point is that I think a simple inference engine combined with a good, *densely* populated knowledgebase can accomplish a lot.
Let me stress again that this project per se is only a collection of facts/rules. Other, more intelligent people may come up with a better AGI to use this database. As for myself, I do use some innovative ideas (eg the use of uncertain logic (not my invention though)) in my AGI that make it different from GOFAI. My knowledge representation is not just a bunch of logic formulae. The logic formulae can reference each other, so they form an intricate network similar to your graphical representation. I'd love to talk more about these things, but I think it's better to actually start a project and do some damned programming. So far I still believe the project is worth doing; and if my inference engine sucks, the database could still be of use to others... YKY
Re: [agi] Project proposal: MindPixel 2
On 1/20/07, Benjamin Goertzel [EMAIL PROTECTED] wrote: Regarding Mindpixel 2, FWIW, one kind of knowledge base that would be most interesting to me as an AGI developer would be a set of pairs of the form (Simple English sentence, formal representation) For instance, a [nonrepresentatively simple] piece of knowledge might be (Cats often chase mice, { often( chase(cat, mouse) ) } ) This sort of training corpus would be really nice for providing some extra help to an AI system that was trying to learn English. Equivalently, one could use a set of pairs of the form (English sentence, Lojban sentence) If Lojban is not used, then one needs to make some other highly particular specification regarding the logical representation language. So would there be a use for existing English documents to be translated, as verbatim as possible, into Lojban? Are you aware of any project like this? It's been a while since I looked at Lojban or your Lojban++, so I was wondering if English sentences translate well into Lojban without the sentence ordering changing. I.e., given two English sentences, are there any situations where in Lojban the sentences would be more correctly put in the reverse order? If there are, then manually inserting placemarks in the original and translated versions could be used to delineate between regions of meaning and assist an AI in reading the text while learning English. I bet it'd be a great way of learning Lojban too! ;) -- -Joel Unless you try to do something beyond what you have mastered, you will never grow. -C.R. Lawton
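The paired-corpus format Ben proposes is easy to picture. In this sketch the first pair is his own example from the email; the other pairs and the notation beyond it are my invented placeholders, since the actual representation language is deliberately left open:

```python
# (Simple English sentence, formal representation) pairs. The first pair is
# Ben's example; the others are invented illustrations in the same schematic
# notation, NOT any committed logical formalism.
corpus = [
    ("Cats often chase mice", "often(chase(cat, mouse))"),
    ("Water is wet",          "wet(water)"),
    ("Birds usually fly",     "usually(fly(bird))"),
]

# Aligned supervision for a language-learning system: surface form on the
# left, target logical form on the right.
for english, logical in corpus:
    assert english and logical
```

An (English sentence, Lojban sentence) corpus would have exactly the same shape, with the right-hand column holding Lojban text instead of logical forms.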
Re: [agi] Project proposal: MindPixel 2
On 1/19/07, Joel Pitt [EMAIL PROTECTED] wrote: It's been a while since I looked at Lojban or your Lojban++, so I was wondering if English sentences translate well into Lojban without the sentence ordering changing. I.e., given two English sentences, are there any situations where in Lojban the sentences would be more correctly put in the reverse order? If there are, then manually inserting placemarks in the original and translated versions could be used to delineate between regions of meaning and assist an AI in reading the text while learning English. I bet it'd be a great way of learning Lojban too! ;) Lojban/Lojban++ is inherently an explicit language, right? Then, given an environment of objects and actions, the AI's avatar could be asked to perform actions that we pick from an interface. How many person-hours of interaction have gone into telling a guy in a chicken suit to flap his arms, jump, etc.? Imagine how much more fun people would have with a greater range of action/object potential. If this were a game in the same vein as the Google Image Labeler, where another participant verified that the AI correctly completed the requested action, the language could be more easily learned: English expressed from person1 to person2, Lojban++ from person2 expressed to the AI, confirmation from person1 that the AI completed the request. A win for the AI to see English and Lojban++ of the same action; a win for person2 to have direct experiential learning by translating to Lojban++ (I/we need interactive learning mechanisms to be fluent enough in Lojban++ to think clearly in it); and person1 gets the same kicks as telling the man in the chicken suit to hop on one foot. (I never really understood that, but people forwarded that URL a lot.) Ben, I used Lojban++ in this example and was specifically thinking of NM because you have expressed (near-)readiness for virtual embodiment.
I would love to be able to interact with your baby via an avatar of my own, but I am currently less than baby-capable with respect to Lojban++. (Although this semester I am taking discrete math, so that may help with the 'logical' thinking.) Frankly, I feel I need to better understand how my own brain works before I can attempt to build a copy. Hopefully as my skills rise to meet this challenge, the interface tools will mature to lower the prerequisites for involvement. humour: I originally spelled Labeller and gmail's spellcheck offered libeller -- which would be a fun Google product, wouldn't it? Image Libeller
Re: [agi] Project proposal: MindPixel 2
On 1/19/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: The bottom line is that the knowledge acquisition project is *separable* from specific inference methods. What is your argument supporting this strong claim? I guess every book on knowledge representation includes a statement saying that whether a knowledge representation format is good depends, to a large extent, on the type of inference it can support. These two aspects are never considered separable. For example, First-Order Predicate Logic is not good enough for AI, partly because it does not support non-deductive inference. Also, semantic networks are considered weak mainly because they have no powerful inference method associated with them. You cannot build a useful knowledge base without thinking about what inference methods it should support. Pei
Re: [agi] Project proposal: MindPixel 2
Judging from your posts, you have solved the AI problem in 2007, 2006, 2005, ... On 1/15/07, A. T. Murray [EMAIL PROTECTED] wrote: Matt Mahoney wrote: [...] Lenat briefly mentions Sergey's (one of Google's founders) goal of solving AI by 2020. FWIW I solved AI theory-wise in 1979 and software-wise in 2007. http://mind.sourceforge.net/Mind.html and http://www.scn.org/~mentifex/jsaimind.html and http://visitware.com/AI4U/jsaimind.html are True AI demo versions. I think if Google and Cyc work together on this, they will succeed. The Mentifex solution to AI is messy. About thirty parameters of AI have been orchestrated and coordinated to produce a minimal thinking artificial Mind. What the late Christopher McKinstry and the late Pushpinder Singh tried to achieve in their web-mind (pace Ben G :-) programs can be achieved, albeit messily, in Mind.html or in http://mind.sourceforge.net/mind4th.html (lagging behind Mind.html), either by hard-coding a minimal subject-verb-object KB (as I did) or by data-entry when users teach the artificial Mind new facts. On another note, something which may alarm our fellow list members: I am thinking of replacing the Terminate exit from Mind.html with a [ ] Death check-box that will pop up a plea for mercy, with an ethical user-decision to be made about AI life or death. If the Mentifex AI programs Mind.html [AI-Complete] and Mind.Forth have truly solved AI, the open-access Site Meter logs will reveal an enormous rush to fetch the free AI source code. That escalation has not happened yet, but you are all welcome to click on Site Meter and see such curious visit logs as the following example from a few days ago, which was apparently made to a local copy of a Mentifex page:

Visit Number: 190,585
Domain Name: senate.gov (United States Government)
IP Address: 156.33.25.# (U.S. Senate Sergeant at Arms)
ISP: U.S. Senate Sergeant at Arms
Location: Washington, District of Columbia, United States, North America
Lat/Long: 38.8933, -77.0146
Language: unknown
Operating System: Microsoft WinXP
Browser: Internet Explorer 6.0 -- Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; InfoPath.1; .NET CLR 2.0.50727), Javascript disabled
Time of Visit: Jan 12 2007 5:40:01 pm
Last Page View: Jan 12 2007 5:40:01 pm
Visit Length: 0 seconds
Page Views: 1
Referring URL: unknown
Time Zone: unknown
Re: [agi] Project proposal: MindPixel 2
--- YKY (Yan King Yin) [EMAIL PROTECTED] wrote: I'm not an academic (left uni a couple years ago) so I can't get academic funding for this. If I can't start an AI business I'd have to entirely give up AI as a career. I hope you can understand these circumstances. Aren't there companies looking for AI researchers? Google? Maybe another approach (the one I took) is to publish something innovative, and people come to you. It won't make you rich, but I have so far gotten 3 small consulting jobs designing and writing data compression software or doing research, all from home, simply because people have seen my work on my website (PAQ compressor, large text benchmark, Hutter prize) or they just saw my posts on comp.compression. I never looked for any of this work. I make enough teaching at a nearby university as an adjunct, with lots of time off. I'm sure I could make more money if I wanted to work long hours in an office, but I don't need to. PAQ introduced a new compression algorithm (context mixing) when PPM algorithms were the best known. PAQ would not have made it to the top of the benchmarks without the ideas and coding and testing efforts of others working on it with no reward except name recognition. That would not have happened if it wasn't free (GPL open source). Even now, I'm sure nobody would pay even $20/copy when there is so much free competition. Other good compressors (Compressia, WinRK) have failed with this business model. I think if you want to make a business out of AI, you are in for a lot of work. First you need something that is truly innovative, that does something that nobody else can do. What will that be? A search engine better than Google? A new operating system that understands natural language? A car that drives itself? A household servant robot? A program that can manage a company? A better spam detector? Text compression? Write down a well defined goal. Do research. What is your competition? How are your ideas better than what's been done?
Prove it (with benchmarks), and the opportunities will come. -- Matt Mahoney, [EMAIL PROTECTED]
Re: [agi] Project proposal: MindPixel 2
On 1/18/07, Matt Mahoney [EMAIL PROTECTED] wrote: --- YKY (Yan King Yin) [EMAIL PROTECTED] wrote: I'm not an academic (left uni a couple years ago) so I can't get academic funding for this. If I can't start an AI business I'd have to entirely give up AI as a career. I hope you can understand these circumstances. ... I think if you want to make a business out of AI, you are in for a lot of work. First you need something that is truly innovative, that does something that nobody else can do. What will that be? A search engine better than Google? A new operating system that understands natural language? A car that drives itself? A household servant robot? A program that can manage a company? A better spam detector? Text compression? Wow, those are very strong prerequisites to start an AI company! No AI company to date has created a household servant robot or an NL OS. I think AI companies can exist around less formidable goals. I also like Guy Kawasaki's point (paraphrasing) that if you have a good idea, at least 5 other startups are working on it. If you have a great idea, at least 10. Write down a well defined goal. Do research. What is your competition? How are your ideas better than what's been done? Prove it (with benchmarks), and the opportunities will come. Not all successful companies have a quantitative proof that they are the best. Probably most do not. I'm not saying your assertion prove it...and (they) will come is incorrect, but that it's not the only basis for a startup. Re: business, you also need the right story to attract funding, the right approach to sales, and a good deal of luck. Novamente, *afaik*, has no proof via benchmark but does have paying AI contracts that sustain them. And still no robot butler! (I want one.)
Btw, I found Ben's 22-page recap of his experiences at WebMind to be useful: http://www.goertzel.org/benzine/WakingUpFromTheEconomyOfDreams.htm And there is good info here: http://www.amazon.com/Micro-ISV-Vision-Reality-Bob-Walsh/dp/1590596013/sr=8-1/qid=1169144626/ref=pd_bbs_sr_1/002-2460655-8624059?ie=UTF8s=books Finally: Open source is one way. Commercial is another. Both have succeeded and failed many times over and will continue to do so. Neither will be going away any time soon. -Chuck
Re: [agi] Project proposal: MindPixel 2
On 1/19/07, Matt Mahoney [EMAIL PROTECTED] wrote: I think if you want to make a business out of AI, you are in for a lot of work. First you need something that is truly innovative, that does something that nobody else can do. What will that be? A search engine better than Google? A new operating system that understands natural language? A car that drives itself? A household servant robot? A program that can manage a company? A better spam detector? Text compression? Write down a well defined goal. Do research. What is your competition? How are your ideas better than what's been done? Prove it (with benchmarks), and the opportunities will come. Thanks for the tips. My idea is quite simple, slightly innovative, but not groundbreaking. Basically, I want to collect a knowledgebase of facts as well as rules. Facts are like water is wet etc. The rules I explain with this example: Cats have claws; Kitty is a cat; therefore Kitty has claws. Implicit here is a rule that says: if X is-a Y and Z(Y), then Z(X). I call rules like this the Rules of Thought. They are not logical tautologies but they express some common thought patterns. My theory is that if we collect a bunch of these rules, add a database of common sense facts, and add a rule-based FOPL inference engine (which may be enhanced with eg Pei Wang's numerical logic), then we have a common sense reasoner. That's what I'm trying to build as a first-stage AGI. If it does work, there may be some commercial applications for such a reasoner. Also it would serve as the base to build a full AGI capable of machine learning etc (I have crudely worked out the long-term plan). So, is this a good business idea? YKY
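The "Rule of Thought" in YKY's example can be sketched as a single forward-chaining step. This is a toy illustration of mine, not YKY's code; the tuple encoding of facts is an assumption:

```python
# Toy forward-chaining step for the rule: if X is-a Y and Z(Y), then Z(X).
facts = {
    ("is-a", "Kitty", "cat"),   # Kitty is a cat
    ("has-claws", "cat"),       # cats have claws, i.e. Z(cat)
}

def apply_isa_inheritance(facts):
    """Derive Z(X) from (is-a X Y) together with Z(Y)."""
    derived = set()
    for fact in facts:
        if fact[0] == "is-a":
            _, x, y = fact
            for prop in facts:
                if len(prop) == 2 and prop[1] == y:  # some property Z(Y)
                    derived.add((prop[0], x))
    return derived

new_facts = apply_isa_inheritance(facts)  # derives ("has-claws", "Kitty")
```

One pass derives ("has-claws", "Kitty"), i.e. Kitty has claws; a common sense reasoner of the kind proposed would apply many such Rules of Thought, weighted by an uncertain logic, rather than this single crisp one.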
Re: [agi] Project proposal: MindPixel 2
I agree with Ben's post that this kind of system has been tried many times and produced very little. How can a collection of statements like Cats have claws; Kitty is a cat; therefore Kitty has claws relate cat and kitty, and capture that kitty is slang normally used for a young cat? A database of this type seems to be like the Chinese room dilemma, where even if you got something that looked intelligent out of the system, you know for a fact that no intelligence exists. To know that a cat is a mammal, as are people and dogs, can only be had by a huge collection of interrelated models that show the relationships, properties, abilities etc of all of these things. Such models could be automatically created (probably) by using the kind of information tidbits that you suggest, but the process would be very messy and the size of the database would be enormous. It would be like the AI trying to find the rules and relations of things out of a huge pile of word facts. Why not just build the rules and relationships into the AI from the beginning, populating the models with relevant facts as you go? This could be done with much less labor by using the AI itself to build the models, using higher and higher levels of teaching methods by multiple individuals. Computer languages use a strict subset of English to populate their syntax. People use English to communicate with each other. Why would we want to use a new language like Lojban when we already use subsets of English with computers? Why does an arbitrary English sentence have to be unambiguous? Most of the time this isn't a problem for English language people, and where it might be a problem, why couldn't it just be clarified the same as we humans do all the time? The teachers of the AI could intentionally use an unambiguous subset of English and gradually use more and more sophisticated sentences as the intelligence of the AI progressed. Isn't this what we do with children as they grow up?
Most people verify they understand instructions given to them before they actually act on those instructions, and potential misunderstandings are normally avoided. Why can't we do the same with an AI? Adding an additional language won't eliminate the need for the humans using English or the computer using its English subset language. Whatever the ambiguity problem is between humans and computers will simply be transferred to between the human and the new language, for no net benefit. David Clark - Original Message - From: Benjamin Goertzel [EMAIL PROTECTED] To: agi@v2.listbox.com Sent: Thursday, January 18, 2007 1:28 PM Subject: Re: [agi] Project proposal: MindPixel 2 YKY, this kind of thing has been tried many dozens of times in the history of AI. It does not lead to interesting results! Alas... The key problem is that you can't feasibly encode enough facts to allow interesting commonsense inferences -- commonsense inference seems to require a very massive store of highly uncertain knowledge-items, rather than a small store of certain ones. BTW the rule if X is-a Y and Z(Y), then Z(X). exists (in a slightly different form) in Novamente and many other inference systems... I feel like you are personally rediscovering GOFAI, the kind of AI that I read about in textbooks when I first started exploring the field in the early 1980's Ben G Thanks for the tips. My idea is quite simple, slightly innovative, but not groundbreaking. Basically, I want to collect a knowledgebase of facts as well as rules. Facts are like water is wet etc. The rules I explain with this example: Cats have claws; Kitty is a cat; therefore Kitty has claws. Here is an implicit rule that says if X is-a Y and Z(Y), then Z(X). I call rules like this the Rules of Thought. They are not logical tautologies but they express some common thought patterns. 
My theory is that if we collect a bunch of these rules, add a database of common sense facts, and add a rule-based FOPL inference engine (which may be enhanced with eg Pei Wang's numerical logic), then we have a common sense reasoner. That's what I'm trying to build as a first-stage AGI. If it does work, there may be some commercial applications for such a reasoner. Also it would serve as the base to build a full AGI capable of machine learning etc (I have crudely worked out the long-term plan). So, is this a good business idea? YKY
Re: [agi] Project proposal: MindPixel 2
Well YKY, I don't feel like rehashing these ancient arguments on this list!! Others are welcome to do so, if they wish... ;-) You are welcome to repeat the mistakes of the past if you like, but I frankly consider it a waste of effort. What you have not explained is how what you are doing is fundamentally different from what has been tried N times in the past -- by larger, better-funded teams with more expertise in mathematical logic... -- Ben On 1/18/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: Call me GOFAI ;) I have thought about this for quite some time and I'm not just copying old ideas On 1/19/07, Benjamin Goertzel [EMAIL PROTECTED] wrote: The key problem is that you can't feasibly encode enough facts to allow interesting commonsense inferences Yes, we need a lot of thought rules -- they are needed, and there is no escape except to encode them. Machine learning may help (the rules can be learned), but I think human encoding can get us quite far already. That's why I want to start a project to collect such rules. -- commonsense inference seems to require a very massive store of highly uncertain knowledge-items, rather than a small store of certain ones. Totally disagree! I actually examined a few cases of *real-life* commonsense inference steps, and I found that they are based on a *small* number of tiny rules of thought. I don't know why you think massive knowledge items are needed for commonsense reasoning -- if you closely examine some of your own thoughts you'd see. The rules in my system need not be certain. They can be *defeasible* and augmented with Pei Wang's c,f (confidence and frequency) values (which I think is a very good idea). I feel like you are personally rediscovering GOFAI, the kind of AI that I read about in textbooks when I first started exploring the field in the early 1980's Indeed I am very much influenced by those books. That's not necessarily a bad thing!! 
YKY
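The defeasible rules with Pei Wang's c,f (confidence and frequency) values mentioned above can be sketched roughly as follows. This is a hedged illustration of the usual evidence-counting formulation, not YKY's or Wang's actual code; the evidential horizon K and all names are assumptions.

```python
# Rough sketch of frequency/confidence truth values of the kind
# discussed (after Pei Wang's NARS): a statement's truth is summarized
# by evidence counts rather than a boolean, so rules stay defeasible.

K = 1.0  # evidential horizon (an illustrative constant)

def truth_value(positive, total):
    """frequency = w+/w, confidence = w/(w+K)."""
    frequency = positive / total
    confidence = total / (total + K)
    return frequency, confidence

def revise(ev1, ev2):
    """Pool two independent bodies of evidence for the same statement."""
    (p1, t1), (p2, t2) = ev1, ev2
    return p1 + p2, t1 + t2

# "Birds fly": 9 of 10 observed birds flew, then 3 of 5 in a new sample.
f, c = truth_value(*revise((9, 10), (3, 5)))
print(round(f, 2), round(c, 2))  # 0.8 0.94
```

More pooled evidence raises confidence toward 1 without forcing the frequency to either extreme, which is what lets a rule like "birds fly" survive counterexamples.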
Re: [agi] Project proposal: MindPixel 2
Joel Pitt wrote: ... Some comments/suggestions: * I think such a project should make the data public domain. Ignore silly ideas like giving out shares in the knowledge or whatever. It just complicates things. If the project is really strapped for cash later, then either use ad revenue or look for research funding (although I don't see much cost except for initial development of the system and web hosting). ... Making this proprietary and expecting shares to translate into cash would indeed be a silly approach. OTOH, having people's names attached to scores of some type (call them shares, or anything else) lets people feel more attached to the project. This is probably necessary for success. There also needs to be some way for the builders to interact, and a few other methods that assist the formation of a community. Newsboards, games, etc. can all be useful if structured properly to enhance the formation of a community. Perhaps only community members with scores above the median could be allowed to download the database? It's fine to talk about making the data public domain, but that's not a good idea. There are arguments in favor of BSD, MIT, GPL, LGPL, etc. licenses. For this kind of activity I can see either BSD or MIT as easily defensible. (Personally I'd use LGPL, but then if I were using it, I'd want the whole application to be GPL. I might not be able to achieve it, but that's what I'd want.) Public domain wouldn't be one of the possibilities that I would consider. The Artistic license is about as close to that as I would want to come...and the MIT license is probably a better choice for those purposes.
Re: [agi] Project proposal: MindPixel 2
On 1/14/07, Benjamin Goertzel [EMAIL PROTECTED] wrote: The choice of knowledge representation language makes a huge difference. IMO, Cyc committed themselves to an overcomplicated representation language that has rendered their DB far less useful than it would be otherwise If you want to use Lojban or Lojban++ as a knowledge representation, then I will back your project strongly, as careful study convinced me that the Lojban style of representation makes a lot of sense... Lojban is advertised to be based on predicate logic, so I assume that translating Lojban to FOPL should be straightforward. Is there such a translator available? I think the difficulty is in translating from English (or whatever NL) to Lojban or logic. This is still unsolved, so we need to settle for a restricted subset of English. IMO the use of Lojban is unnecessary because it is computationally equivalent to FOPL. But if you insist on Lojban it wouldn't be difficult to prepare a Lojban version of the knowledgebase. Or am I missing something? YKY
Re: [agi] Project proposal: MindPixel 2
On 1/18/07, Charles D Hixson [EMAIL PROTECTED] wrote: Joel Pitt wrote: * I think such a project should make the data public domain. Ignore silly ideas like giving out shares in the knowledge or whatever. It just complicates things. If the project is really strapped for cash later, then either use ad revenue or look for research funding (although I don't see much cost except for initial development of the system and web hosting). ... Making this proprietary and expecting shares to translate into cash would indeed be a silly approach. ... It's fine to talk about making the data public domain, but that's not a good idea. There are arguments in favor of BSD, MIT, GPL, LGPL, etc. licenses. For this kind of activity I can see either BSD or MIT as easily defensible. (Personally I'd use LGPL, but then if I were using it, I'd want the whole application to be GPL. I might not be able to achieve it, but that's what I'd want.) Public domain wouldn't be one of the possibilities that I would consider. The Artistic license is about as close to that as I would want to come...and the MIT license is probably a better choice for those purposes. I think a project like this one requires substantial effort, so people would need to be paid to do some of the work (programming, interface design, etc), especially if we want to build a high quality knowledgebase. If we make it free then a likely outcome is that we get a lot of noise but very few people actually contribute. I'm not an academic (left uni a couple of years ago) so I can't get academic funding for this. If I can't start an AI business I'd have to give up AI as a career entirely. I hope you can understand these circumstances. YKY
Re: [agi] Project proposal: MindPixel 2
--- Stephen Reed [EMAIL PROTECTED] wrote: I worked at Cycorp when the FACTory game was developed. The examples below do not reveal Cyc's knowledge of the assertions connecting these disparate concepts; rather, most show that the argument constraints of the compared terms are overly generalized. The exception is the example Most BTU dozer blades are wider than most T-64 medium tanks. in which both concepts are specializations of Platform-Military. Download and examine concepts in OpenCyc and Cyc's world model (or lack thereof by your standards) will be readily apparent. You need ResearchCyc, which has no license fee for research purposes, in order to evaluate its language model. -Steve Thanks. I did take another look at Cyc, at least this talk by Lenat at Google. http://video.google.com/videoplay?docid=-7704388615049492068 In spite of Cyc's lack of success at AGI (so far), it is still the biggest repository of common sense knowledge. He explains how Cyc had tried machine learning approaches to acquiring such knowledge and why they failed. They knew early on that it would require a 1000 person-year effort to develop the knowledge base and proceeded anyway. Cyc has 3.2 million assertions, 300,000 concepts and 16,000 relations (is-a, contains, etc). They tried very hard to simplify the knowledge base, to keep these numbers small. Cyc is planning a Web interface to its knowledge base. If they make something useful, a 1000 person-year effort is nothing. Lenat briefly mentions the goal of Sergey (one of Google's founders) of solving AI by 2020. I think if Google and Cyc work together on this, they will succeed. - Original Message From: Matt Mahoney [EMAIL PROTECTED] To: agi@v2.listbox.com Sent: Sunday, January 14, 2007 3:14:07 PM Subject: Re: [agi] Project proposal: MindPixel 2 --- Gabriel R [EMAIL PROTECTED] wrote: Also, if you can think of any way to turn the knowledge-entry process into a fun game or competition, go for it. 
I've been told by a few people working on similar projects that making the knowledge-providing process engaging and fun for visitors ended up being a lot more important (and difficult) than they'd expected. Cyc has a game like this called FACTory at http://www.cyc.com/ Its purpose is to help refine its knowledge base. It presents statements and asks you to rate them as true, false, don't know or doesn't make sense. For example: - Most shirts are heavier than most appendixes. - Pages are typically located in HVAC Chem Bio facilities. - Terminals are typically located in studies. - People perform or are involved in paying a mortgage more frequently than they perform or are involved in overbearing. - Most BTU dozer blades are wider than most T-64 medium tanks. The game exposes Cyc's shortcomings pretty quickly. Cyc seems to lack a world model and a language model. Sentences seem to be constructed by relating common properties of unrelated objects. The set of common properties is fairly small: size, weight, cost, frequency (for events), containment, etc. There does not seem to be any sense that Cyc understands the purpose or function of objects. The result is that context is no help in disambiguating terms that have more than one meaning, such as appendix, page, or terminal. A language model would allow a more natural grammar, such as People pay mortgages more often than they are overbearing. This example also exposes the fallacy of logical inference. Inference allows you to draw conclusions such as this, but why would you? Inference is not a good model of human thought. A good model would compare related objects. It might ask instead whether people make mortgage payments more frequently than they receive paychecks. The game gives no hint that Cyc understands such relations. Cyc has millions of hand coded assertions. It has taken over 20 years to get this far, and it seems we are not even close. 
This seems to be a problem with every knowledge representation based on labeled graphs (frame-slot, first order logic, connectionist, expert system, etc). Using English words to label the elements of your data structure does not substitute for a language model. Also, this labeling tempts you to examine and update the knowledge manually. We should know by now that there is just too much data to do this. -- Matt Mahoney, [EMAIL PROTECTED]
Re: [agi] Project proposal: MindPixel 2
Matt Mahoney wrote: [...] Lenat briefly mentions Sergey's (one of Google's founders) goal of solving AI by 2020. FWIW I solved AI theory-wise in 1979 and software-wise in 2007. http://mind.sourceforge.net/Mind.html and http://www.scn.org/~mentifex/jsaimind.html and http://visitware.com/AI4U/jsaimind.html are True AI demo versions. I think if Google and Cyc work together on this, they will succeed. The Mentifex solution to AI is messy. About thirty parameters of AI have been orchestrated and coordinated to produce a minimal thinking artificial Mind. What the late Christopher McKinstry and the late Pushpinder Singh tried to achieve in their web-mind (pace Ben G :-) programs can be achieved, albeit messily, in Mind.html or in http://mind.sourceforge.net/mind4th.html (lagging behind Mind.html) either by hard-coding a minimal subject-verb-object KB (as I did) or by data-entry when users teach the artificial Mind new facts. On another note, something which may alarm our fellow list members, I am thinking of replacing the Terminate exit from Mind.html with a [ ] Death check-box that will pop up a plea for mercy, with an ethical user-decision to be made about AI life or death. If the Mentifex AI programs Mind.html [AI-Complete] and Mind.Forth have truly solved AI, the open-access Site Meter logs will reveal an enormous rush to fetch the free AI source code. That escalation has not happened yet, but you are all welcome to click on Site Meter and see such curious visit logs as the following example from a few days ago, which was apparently made to a local copy of a Mentifex page: Visit Number: 190,585; Domain Name: senate.gov (United States Government); IP Address: 156.33.25.# (U.S. Senate Sergeant at Arms); ISP: U.S. Senate Sergeant at Arms; Location: North America, United States, District of Columbia, Washington; Lat/Long: 38.8933, -77.0146; Language: unknown; Operating System: Microsoft WinXP; Browser: Internet Explorer 6.0, Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; InfoPath.1; .NET CLR 2.0.50727); Javascript: disabled; Time of Visit: Jan 12 2007 5:40:01 pm; Last Page View: Jan 12 2007 5:40:01 pm; Visit Length: 0 seconds; Page Views: 1; Referring URL: unknown; Time Zone: unknown; Visitor's Time: unknown.
Re: [agi] Project proposal: MindPixel 2
Uh... for those of you who are unfamiliar with Mentifex, there's an FAQ on him here: http://www.nothingisreal.com/mentifex_faq.html On 1/15/07, A. T. Murray [EMAIL PROTECTED] wrote: [snip]
Re: [agi] Project proposal: MindPixel 2
I unfortunately don't have time to read through the entire thread right now, but a potential alternative is to do something like Luis von Ahn's Verbosity, which uses a webgame to collect common-sense facts: http://www.cs.cmu.edu/~biglou/Verbosity.pdf On 1/13/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: I'm considering this idea: build a repository of facts/rules in FOL (or Prolog) format, similar to Cyc's. For example water is wet, oil is slippery, etc. The repository is structureless, in the sense that it is just a collection of simple statements. It can serve as raw material for other AGIs, not only mine (although it is especially suitable for my system). One thing that is different from MindPixel 1 is that we can allow Prolog-like rules with variables, for example if X has wings then X can fly etc. I can build a translator to translate simple English sentences to logic statements. We can then solicit internet users to help type in these facts. The resulting knowledgebase would be owned by the community, with those who contribute more facts getting more shares. Also there can be a human cross-validation mechanism to prevent people from typing nonsense. We can also absorb Cyc's current knowledge so the effort will not be duplicative. What do you think of this? YKY
Re: [agi] Project proposal: MindPixel 2
On 1/14/07, Chuck Esterbrook [EMAIL PROTECTED] wrote: * Would it support separate domains/modules? I didn't realize the importance of this point at first. Indeed, what we regard as common sense may be highly subjective as it involves matters such as human values, ideology or religion. So the differentiation of subsets is desirable. We may maintain a core body that is really uncontroversial (eg everyday physics), and then let users create their own personalities as additional modules / communities. YKY
Re: [agi] Project proposal: MindPixel 2
I think all these are excellent suggestions. On 13/01/07, Joel Pitt [EMAIL PROTECTED] wrote: Some comments/suggestions: * I think such a project should make the data public domain. Ignore silly ideas like giving out shares in the knowledge or whatever. It just complicates things. If the project is really strapped for cash later, then either use ad revenue or look for research funding (although I don't see much cost except for initial development of the system and web hosting). * Whenever people want to add a new statement, have them evaluate two existing statements as well. Don't make the evaluation true/false, use a slider so the user can decide how true it is (even better, have an xy chart with one axis true/false and the other how sure the user is - this would be useful in the case of some obscure fact on quantum physics since not all of us have the answer). * Emphasize the community aspect of the database. Allow people to have profiles and list the number of statements evaluated and submitted (also how true the statements they submit are judged). Allow people to form teams. Allow teams to extract a subset of the data which represents only the facts they've submitted and evaluated (perhaps this could be an extra feature available to sponsors?) * Although Lojban would be great to use, not many people are proficient in it (relative to English). We could be idealistic and suggest that everyone learn Lojban before submitting statements, but that would just shrink the user base and kill the community aspect. An alternative might be to allow statements in both languages to be submitted (Hell, why not allow ANY language as long as it is tagged with what language it is). * An idea for keeping the community alive would be to focus on a particular topic each week, and run competitions between teams/individuals and award stars to their profile or something. 
* Instead of making people come up with brand new statements every time, have a mode where the system randomly selects phrases from somewhere like Wikipedia (sometimes this will produce stupid statements, so allow the user to indicate as much). I think it could be done and made quite fun. Don't just focus on the AI guys, most of us don't have that much spare time. Focus on the bored-at-work market. Actually going through and thinking about this has made me quite enthused about it. Keep me posted on how it pans out. If I didn't have 10 other projects and my PhD to do I'd volunteer to code it. -- -Joel Unless you try to do something beyond what you have mastered, you will never grow. -C.R. Lawton
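Joel's two-axis evaluation idea (one slider for how true a statement is, another for how sure the rater is) can be sketched as a simple data structure. This is only an illustration of one way such ratings might be aggregated; the field names and the certainty-weighting scheme are assumptions, not part of his proposal.

```python
# Sketch of a two-axis statement rating: each evaluation records how
# true the user thinks a statement is and how sure they are, both in
# [0, 1]. Certainty weights the truth average, so unsure ratings on
# obscure facts count for less.

from dataclasses import dataclass, field

@dataclass
class Statement:
    text: str
    ratings: list = field(default_factory=list)  # (truth, certainty) pairs

    def add_rating(self, truth, certainty):
        self.ratings.append((truth, certainty))

    def consensus(self):
        """Certainty-weighted mean truth, plus total weight as support."""
        weight = sum(c for _, c in self.ratings)
        if weight == 0:
            return None, 0.0
        truth = sum(t * c for t, c in self.ratings) / weight
        return truth, weight

s = Statement("Water is wet")
s.add_rating(1.0, 0.9)   # very true, very sure
s.add_rating(0.8, 0.5)   # mostly true, half sure
print(s.consensus())     # roughly (0.93, 1.4)
```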
Re: [agi] Project proposal: MindPixel 2
--- Gabriel R [EMAIL PROTECTED] wrote: Also, if you can think of any way to turn the knowledge-entry process into a fun game or competition, go for it. I've been told by a few people working on similar projects that making the knowledge-providing process engaging and fun for visitors ended up being a lot more important (and difficult) than they'd expected. Cyc has a game like this called FACTory at http://www.cyc.com/ Its purpose is to help refine its knowledge base. It presents statements and asks you to rate them as true, false, don't know or doesn't make sense. For example: - Most shirts are heavier than most appendixes. - Pages are typically located in HVAC Chem Bio facilities. - Terminals are typically located in studies. - People perform or are involved in paying a mortgage more frequently than they perform or are involved in overbearing. - Most BTU dozer blades are wider than most T-64 medium tanks. The game exposes Cyc's shortcomings pretty quickly. Cyc seems to lack a world model and a language model. Sentences seem to be constructed by relating common properties of unrelated objects. The set of common properties is fairly small: size, weight, cost, frequency (for events), containment, etc. There does not seem to be any sense that Cyc understands the purpose or function of objects. The result is that context is no help in disambiguating terms that have more than one meaning, such as appendix, page, or terminal. A language model would allow a more natural grammar, such as People pay mortgages more often than they are overbearing. This example also exposes the fallacy of logical inference. Inference allows you to draw conclusions such as this, but why would you? Inference is not a good model of human thought. A good model would compare related objects. It might ask instead whether people make mortgage payments more frequently than they receive paychecks. The game gives no hint that Cyc understands such relations. Cyc has millions of hand coded assertions. 
It has taken over 20 years to get this far, and it seems we are not even close. This seems to be a problem with every knowledge representation based on labeled graphs (frame-slot, first order logic, connectionist, expert system, etc). Using English words to label the elements of your data structure does not substitute for a language model. Also, this labeling tempts you to examine and update the knowledge manually. We should know by now that there is just too much data to do this. -- Matt Mahoney, [EMAIL PROTECTED]
Re: [agi] Project proposal: MindPixel 2
Another way to group the data might be to tease it out into dimensions of what, where, when and whom. There does seem to be some neurological evidence for this kind of categorization. Also, indexing the data along these lines allows you, to some extent, to make meaningful interpolations from similar but non-identical situations, or to imagine situations which are vaguely plausible based upon your past experience. On 14/01/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: On 1/14/07, Chuck Esterbrook [EMAIL PROTECTED] wrote: * Would it support separate domains/modules? I didn't realize the importance of this point at first. Indeed, what we regard as common sense may be highly subjective as it involves matters such as human values, ideology or religion. So the differentiation of subsets is desirable. We may maintain a core body that is really uncontroversial (eg everyday physics), and then let users create their own personalities as additional modules / communities. YKY
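The what/where/when/whom indexing suggested above can be sketched as a small inverted index; querying on a subset of dimensions retrieves similar but non-identical situations. The structure and all names here are illustrative assumptions, not anything proposed in the thread.

```python
# Sketch of indexing knowledge along what/where/when/whom dimensions.
# Each dimension maps a value to the set of event ids carrying it, so
# partial queries (any subset of dimensions) retrieve related events.

from collections import defaultdict

DIMS = ("what", "where", "when", "whom")
index = {dim: defaultdict(set) for dim in DIMS}
events = []

def add_event(what, where, when, whom):
    eid = len(events)
    events.append(dict(zip(DIMS, (what, where, when, whom))))
    for dim, value in events[eid].items():
        index[dim][value].add(eid)
    return eid

def query(**dims):
    """Intersect the posting sets of each specified dimension."""
    sets = [index[d][v] for d, v in dims.items()]
    return set.intersection(*sets) if sets else set()

add_event("breakfast", "kitchen", "morning", "alice")
add_event("breakfast", "cafe", "morning", "bob")
print(query(what="breakfast", when="morning"))  # {0, 1}: both match
print(query(whom="alice"))                      # {0}
```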
Re: [agi] Project proposal: MindPixel 2
I worked at Cycorp when the FACTory game was developed. The examples below do not reveal Cyc's knowledge of the assertions connecting these disparate concepts; rather, most show that the argument constraints of the compared terms are overly generalized. The exception is the example Most BTU dozer blades are wider than most T-64 medium tanks. in which both concepts are specializations of Platform-Military. Download and examine concepts in OpenCyc and Cyc's world model (or lack thereof by your standards) will be readily apparent. You need ResearchCyc, which has no license fee for research purposes, in order to evaluate its language model. -Steve - Original Message From: Matt Mahoney [EMAIL PROTECTED] To: agi@v2.listbox.com Sent: Sunday, January 14, 2007 3:14:07 PM Subject: Re: [agi] Project proposal: MindPixel 2 --- Gabriel R [EMAIL PROTECTED] wrote: Also, if you can think of any way to turn the knowledge-entry process into a fun game or competition, go for it. I've been told by a few people working on similar projects that making the knowledge-providing process engaging and fun for visitors ended up being a lot more important (and difficult) than they'd expected. Cyc has a game like this called FACTory at http://www.cyc.com/ Its purpose is to help refine its knowledge base. It presents statements and asks you to rate them as true, false, don't know or doesn't make sense. For example: - Most shirts are heavier than most appendixes. - Pages are typically located in HVAC Chem Bio facilities. - Terminals are typically located in studies. - People perform or are involved in paying a mortgage more frequently than they perform or are involved in overbearing. - Most BTU dozer blades are wider than most T-64 medium tanks. The game exposes Cyc's shortcomings pretty quickly. Cyc seems to lack a world model and a language model. Sentences seem to be constructed by relating common properties of unrelated objects. 
The set of common properties is fairly small: size, weight, cost, frequency (for events), containment, etc. There does not seem to be any sense that Cyc understands the purpose or function of objects. The result is that context is no help in disambiguating terms that have more than one meaning, such as appendix, page, or terminal. A language model would allow a more natural grammar, such as People pay mortgages more often than they are overbearing. This example also exposes the fallacy of logical inference. Inference allows you to draw conclusions such as this, but why would you? Inference is not a good model of human thought. A good model would compare related objects. It might ask instead whether people make mortgage payments more frequently than they receive paychecks. The game gives no hint that Cyc understands such relations. Cyc has millions of hand coded assertions. It has taken over 20 years to get this far, and it seems we are not even close. This seems to be a problem with every knowledge representation based on labeled graphs (frame-slot, first order logic, connectionist, expert system, etc). Using English words to label the elements of your data structure does not substitute for a language model. Also, this labeling tempts you to examine and update the knowledge manually. We should know by now that there is just too much data to do this. -- Matt Mahoney, [EMAIL PROTECTED]
[agi] Project proposal: MindPixel 2
I'm considering this idea: build a repository of facts/rules in FOL (or Prolog) format, similar to Cyc's. For example, "water is wet", "oil is slippery", etc. The repository is structureless, in the sense that it is just a collection of simple statements. It can serve as raw material for other AGIs, not only mine (although it is especially suitable for my system). One thing that is different from MindPixel 1 is that we can allow Prolog-like rules with variables, for example "if X has wings then X can fly", etc. I can build a translator to translate simple English sentences to logic statements. We can then solicit internet users to help type in these facts. The resulting knowledge base would be owned by the community, with those who contribute more facts getting more shares. Also, there can be a human cross-validation mechanism to prevent people from typing nonsense. We can also absorb Cyc's current knowledge so the effort will not be duplicative. What do you think of this? YKY
Re: [agi] Project proposal: MindPixel 2
How do you plan to represent water is wet? Pei On 1/13/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: I'm considering this idea: build a repository of facts/rules in FOL (or Prolog) format, similar to Cyc's. For example water is wet, oil is slippery, etc. The repository is structureless, in the sense that it is just a collection of simple statements. It can serve as raw material for other AGIs, not only mine (although it is especially suitable for my system). One thing that is different from MindPixel 1 is that we can allow Prolog-like rules with variables, for example if X has wings then X can fly etc. I can build a translator to translate simple English sentences to logic statements. We can then solicit internet users to help type in these facts. The resulting knowledgebase would be owned by the community, with those who contribute more facts getting more shares. Also there can be a human cross-validation mechanism to prevent people from typing nonsense. We can also absorb Cyc's current knowledge so the effort will not be duplicative. What do you think of this? YKY
Re: [agi] Project proposal: MindPixel 2
On 1/14/07, Pei Wang [EMAIL PROTECTED] wrote: How do you plan to represent water is wet? Pei Well, we need to agree on some conventions. A pretty standard way is: Is(water,wet). But the one I use in my system is: R(is, water, wet) where R is a generic predicate representing a relation. I mean, we can choose a format that is as natural as possible or pleases most people. You can easily translate it to your native form.
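YKY's point that one convention can be mechanically translated into another is easy to make concrete. The sketch below (Python, purely for illustration; the Is/R predicate names come from the email, everything else is invented) encodes both forms as tuples and converts between them:

```python
# Two hypothetical encodings of "water is wet", following the conventions
# in the email above. These names are the thread's, not any real library's.

# Convention 1: the relation is the predicate name.
fact_standard = ("Is", "water", "wet")

# Convention 2 (YKY's): a single generic predicate R whose first
# argument names the relation.
fact_generic = ("R", "is", "water", "wet")

def to_generic(fact):
    """Translate the standard form into the generic-R form."""
    predicate, *args = fact
    return ("R", predicate.lower(), *args)

assert to_generic(fact_standard) == fact_generic
```

Because the translation is a pure syntactic rewrite, the choice of convention really is, as YKY says, mostly a matter of taste for the entry format.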
Re: [agi] Project proposal: MindPixel 2
Actually it doesn't matter what convention you use. You could simply have an entry box on the screen, with a prompt saying "please type a short statement that you believe to be either true or false". Some parsing can do the rest. To avoid getting too verbose, simply restrict the maximum number of words which may be used in the sentence. The key point about Mindpixel is the cross validation, to calculate a probability or coherence value, with the ultimate aim of mapping out an average human belief system - something which has never really been done before. If Mindpixel does get revived I think it should be an open source project, with the results available to everyone. The idea of doing this on a commercial basis with the issuing of shares turned out not to be viable. This kind of effort is a long term thing, unlikely to return profits for shareholders within a few years. On 13/01/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: On 1/14/07, Pei Wang [EMAIL PROTECTED] wrote: How do you plan to represent water is wet? Pei Well, we need to agree on some conventions. A pretty standard way is: Is(water,wet). But the one I use in my system is: R(is, water, wet) where R is a generic predicate representing a relation. I mean, we can choose a format that is as natural as possible or pleases most people. You can easily translate it to your native form.
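The cross-validation Bob describes (many raters per statement, aggregated into a probability/coherence value) might look roughly like the following sketch. This is only an illustration of the idea, not MindPixel's actual scoring algorithm; all names are invented:

```python
# Sketch of MindPixel-style cross-validation: each statement accumulates
# true/false votes, and its "coherence" is the fraction of raters who
# agree it is true.
from collections import defaultdict

votes = defaultdict(lambda: [0, 0])  # statement -> [true_count, false_count]

def rate(statement, is_true):
    votes[statement][0 if is_true else 1] += 1

def coherence(statement):
    t, f = votes[statement]
    return t / (t + f) if (t + f) else 0.5  # unrated -> maximal uncertainty

rate("water is wet", True)
rate("water is wet", True)
rate("water is wet", False)
print(round(coherence("water is wet"), 3))  # 2 of 3 raters agree -> 0.667
```

A real deployment would also need rater weighting and nonsense filtering, but even this trivial average already yields the "average human belief" value the post is after.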
Re: [agi] Project proposal: MindPixel 2
On 1/13/07, Bob Mottram [EMAIL PROTECTED] wrote: Actually it doesn't matter what convention you use. You could simply have an entry box on the screen, with a prompt saying please type a short statement that you believe to be either true or false. Some parsing can do the rest. To avoid getting too verbose simply restrict the maximum number of words which may be used in the sentence. The key point about Mindpixel is the cross validation, to calculate a probability or coherence value, with the ultimate aim of mapping out an average human belief system - something which has never really been done before. This answers my question about the comparison with Cyc. It's been a while since I read up on Mindpixel and now it's coming back to me. If Mindpixel does get revived I think it should be an open source project, with the results available to everyone. The idea of doing this on a commercial basis with the issuing of shares turned out not to be viable. This kind of effort is a long term thing, unlikely to return profits for shareholders within a few years. Shares aren't the only means to structure a business. One possibility is revenue and/or profit sharing based on a formula embedded in a contract. We could probably think of more ways if we spent the time, though I admit I believe all of them will be complex. -Chuck
Re: [agi] Project proposal: MindPixel 2
On 1/14/07, Pei Wang [EMAIL PROTECTED] wrote: Well, we need to agree on some conventions. A pretty standard way is: Is(water,wet). In the standard way of knowledge representation, a constant is either a predicate name or an individual name. A mass noun like water is neither. There is no consensus on how to represent it yet. In the above formula, water is a constant. I don't see why it cannot be a mass noun. Predicate logic does not restrict constants/variables in any way; they can be any abstract or concrete concept. I guess you will represent water is liquid as R(is, water, liquid), right? Are you going to somehow show the difference between a noun like liquid and an adjective like wet? Are you going to somehow show the difference between an uncountable noun like liquid and a countable noun like table? One could have statements like: Noun(water) Uncountable(water) Adjective(wet) etc., but they are again conventions. I'm afraid that there is no format that is as natural as possible or pleases most people. Well, indeed you're right. It seems that either we have to agree on some arbitrary format, or just leave it as English (perhaps parsed and disambiguated). YKY
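The lexical-typing convention Pei raises (Noun(water), Uncountable(water), Adjective(wet)) can be prototyped by storing such typing statements alongside the content facts, so a consumer can sanity-check new assertions. This is only an illustrative sketch of the convention discussed in the thread, not code from any real system:

```python
# Typing statements stored as (type, word) pairs next to the content facts.
lexicon = {
    ("Noun", "water"), ("Uncountable", "water"),
    ("Noun", "table"), ("Countable", "table"),
    ("Adjective", "wet"),
}

def well_typed(subject, attribute):
    """Check an 'X is Y' fact: requires Noun(X) and Adjective(Y)."""
    return ("Noun", subject) in lexicon and ("Adjective", attribute) in lexicon

assert well_typed("water", "wet")       # "water is wet" passes
assert not well_typed("wet", "water")   # reversed arguments are rejected
```

As Pei says, these are still just conventions, but making them explicit lets the repository reject ill-formed entries mechanically instead of relying on human reviewers alone.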
Re: [agi] Project proposal: MindPixel 2
On 1/14/07, Bob Mottram [EMAIL PROTECTED] wrote: If Mindpixel does get revived I think it should be an open source project, with the results available to everyone. The idea of doing this on a commercial basis with the issuing of shares turned out not to be viable. This kind of effort is a long term thing, unlikely to return profits for shareholders within a few years. Using MindPixel shares makes the project more interesting to people, as they can look at how much they have contributed. It doesn't matter if the database doesn't turn a profit. In the long term, you just can't tell. I think it's reasonable to let the community own the product. That will motivate people to participate too. YKY
Re: [agi] Project proposal: MindPixel 2
Well, I don't really think that Mindpixel shares are going to be a big motivator for folks to encode knowledge, in the current economic/business climate. The original mindpixel shares were part of the dot-com-era mentality, I think. But I think the key point is whether there is a commitment to make and keep the DB open to everyone (as was not the case with mindpixel). I think folks contributing to the DB will want to know for sure that the knowledge they contribute will be open to all, rather than restricted to certain customers who are willing to pay... -- Ben G On 1/13/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: On 1/14/07, Bob Mottram [EMAIL PROTECTED] wrote: If Mindpixel does get revived I think it should be an open source project, with the results available to everyone. The idea of doing this on a commercial basis with the issuing of shares turned out not to be viable. This kind of effort is a long term thing, unlikely to return profits for shareholders within a few years. Using MindPixel shares makes the project more interesting to people, as they can look at how much they have contributed. It doesn't matter if the database doesn't turn a profit. In the long term, you just can't tell. I think it's reasonable to let the community own the product. That will motivate people to participate too. YKY
Re: [agi] Project proposal: MindPixel 2
On 1/14/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: I'm considering this idea: build a repository of facts/rules in FOL (or Prolog) format, similar to Cyc's. For example water is wet, oil is slippery, etc. The repository is structureless, in the sense that it is just a collection of simple statements. It can serve as raw material for other AGIs, not only mine (although it is especially suitable for my system). Some comments/suggestions:
* I think such a project should make the data public domain. Ignore silly ideas like giving out shares in the knowledge or whatever. It just complicates things. If the project is really strapped for cash later, then either use ad revenue or look for research funding (although I don't see much cost except for initial development of the system and web hosting).
* Whenever people want to add a new statement, have them evaluate two existing statements as well. Don't make the evaluation true/false; use a slider so the user can decide how true it is (even better, have an xy chart with one axis true/false and the other how sure the user is - this would be useful in the case of some obscure fact on quantum physics, since not all of us have the answer).
* Emphasize the community aspect of the database. Allow people to have profiles and list the number of statements evaluated and submitted (also how true the statements they submit are judged). Allow people to form teams. Allow teams to extract a subset of the data which represents only the facts they've submitted and evaluated (perhaps this could be an extra feature available to sponsors?)
* Although Lojban would be great to use, not many people are proficient in it (relative to English); we could be idealistic and suggest that everyone learn Lojban before submitting statements, but that would just shrink the user base and kill the community aspect.
An alternative might be to allow statements in both languages to be submitted (hell, why not allow ANY language as long as it is tagged with what language it is).
* An idea for keeping the community alive would be to focus on a particular topic each week, and run competitions between teams/individuals and award stars to their profile or something.
* Instead of making people come up with brand new statements every time, have a mode where the system randomly selects phrases from somewhere like Wikipedia (sometimes this will produce stupid statements, and allow the user to indicate as such).
I think it could be done and made quite fun. Don't just focus on the AI guys; most of us don't have that much spare time. Focus on the bored-at-work market. Actually, going through and thinking about this has made me quite enthused about it. Keep me posted on how it pans out. If I didn't have 10 other projects and my PhD to do I'd volunteer to code it. -- -Joel Unless you try to do something beyond what you have mastered, you will never grow. -C.R. Lawton
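Joel's two-axis rating idea (truth on one axis, the rater's confidence on the other) could feed a confidence-weighted average, along these lines. The aggregation formula and all names here are illustrative assumptions, not a worked-out design from the thread:

```python
# Each evaluation is a (truth, confidence) pair, both in [0, 1].
# The aggregate truth value is a confidence-weighted mean, so an unsure
# rater moves the result less than a confident one.

def aggregate(ratings):
    """ratings: list of (truth, confidence) pairs -> weighted truth value."""
    total_confidence = sum(conf for _, conf in ratings)
    if total_confidence == 0:
        return 0.5  # nobody was sure (or no ratings); stay agnostic
    return sum(truth * conf for truth, conf in ratings) / total_confidence

# Two fairly confident "true" votes and one very unsure "false" vote:
ratings = [(1.0, 0.9), (0.8, 0.5), (0.0, 0.1)]
print(round(aggregate(ratings), 3))  # -> 0.867
```

This also answers the quantum-physics case Joel mentions: raters who do not know the answer can submit low confidence and barely affect the score.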
Re: [agi] Project proposal: MindPixel 2
A few other considerations -- It is possible to reduce the need for after-the-fact parsing by imposing constraints on the knowledge-entry process, and this actually makes it easier for users to come up with facts. For example, the MIT Open Mind Common Sense project asked users to fill in the blanks of templates like "You would be likely to find ___ in ___", "___ can ___", etc., which the researchers then readily translated to assertions of the form LocationOf(X, Y), CapableOf(X, Y), etc. They ended up collecting quite a few facts -- see http://web.media.mit.edu/~hugo/conceptnet/. If you dig around in the zip file you'll see a few large txt files that have all the collected facts listed out -- 200,000 in the concise version, 1.6 million in the full version. It seems that either we have to agree on some arbitrary format, or just leave it as English (perhaps parsed and disambiguated). The MIT project I just mentioned did something similar to the "collect in plain English, then disambiguate afterwards" strategy (a lexically disambiguated version is also available on their site), but the fact that the arguments of their predicates are not tied to any kind of formal semantics really limits their database's utility. I would strongly recommend thinking about ways to constrain the users' input in such a way as to eliminate ambiguity from the get-go. For example, you could ask users to create true sentences of the form if X __1__ __2__ then X __3__ __4__, but force them to fill in the blanks by selecting from combo boxes that give them many predefined choices such as can, is a, mouse (the animal), mouse (for a computer), etc. This will help standardize your input and make it a lot easier to work with.
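The template-driven entry described above works because each template maps directly to a predicate, so "translation" is trivial string assembly rather than parsing. A minimal sketch; the template/predicate pairs mirror the email's examples (LocationOf, CapableOf), not the actual Open Mind code:

```python
# Each entry template maps to a predicate; the user's fillers become
# its arguments, so no natural-language parsing is needed.
TEMPLATES = {
    "You would be likely to find ___ in ___": "LocationOf",
    "___ can ___": "CapableOf",
}

def to_assertion(template, fillers):
    """Turn a filled-in template into an assertion string."""
    predicate = TEMPLATES[template]
    return "%s(%s)" % (predicate, ", ".join(fillers))

print(to_assertion("___ can ___", ["a mouse (the animal)", "squeak"]))
# -> CapableOf(a mouse (the animal), squeak)
```

Constraining the fillers to combo-box choices, as suggested above, would additionally guarantee that every argument is an unambiguous, predefined term.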
Alternatively, you could write a simple program to create thousands of randomly generated logical statements and then automatically convert them to relatively unambiguously phrased English statements (obviously far easier than converting English to logical form). Then you could put these to your web users and have them tell you how true each statement is, or ask them how they would change false statements to make them true. You could even incorporate clustering algorithms to infer which unannotated statements are most likely to be true ... lots of possibilities. Actually, going with the randomly-generated-logical-statements idea, if you used Lojban predicates as your underlying logical form, you could auto-translate them to English and essentially have your users annotating the truth-values of Lojban sentences without knowing Lojban. It's very difficult to auto-translate complex Lojban sentences into readable English (mainly because Lojban's equivalent of noun compounds are ridiculously vague), but it shouldn't be too hard to do with simple sentences. Also, if you can think of any way to turn the knowledge-entry process into a fun game or competition, go for it. I've been told by a few people working on similar projects that making the knowledge-providing process engaging and fun for visitors ended up being a lot more important (and difficult) than they'd expected. On 1/13/07, Joel Pitt [EMAIL PROTECTED] wrote: On 1/14/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: I'm considering this idea: build a repository of facts/rules in FOL (or Prolog) format, similar to Cyc's. For example water is wet, oil is slippery, etc. The repository is structureless, in the sense that it is just a collection of simple statements. It can serve as raw material for other AGIs, not only mine (although it is especially suitable for my system). Some comments/suggestions: * I think such a project should make the data public domain. 
Ignore silly ideas like giving out shares in the knowledge or whatever. It just complicates things. If the project is really strapped for cash later, then either use ad revenue or look for research funding (although I don't see much cost except for initial development of the system and web hosting). * Whenever people want to add a new statement, have them evaluate two existing statements as well. Don't make the evaluation true/false, use a slider so the user can decide how true it is (even better, have an xy chart with one axis true/false and the other how sure the user is - this would be useful in the case of some obscure fact on quantum physics since not all of us have the answer). * Emphasize the community aspect of the database. Allow people to have profiles and list the number of statements evaluated and submitted (also how true the statements they submit are judged). Allow people to form teams. Allow teams to extract a subset of the data which represents only the facts they've submitted and evaluated (perhaps this could be an extra feature available to sponsors?) * Although Lojban would be great to use, not many people are proficient in it (relative to English), we could be idealistic and suggest that everyone learn