Re: [agi] Project proposal: MindPixel 2
Ben> B) Even if there are only 5 applications of rules, the combinatorial explosion still exists. If there are 10 rules and 1 billion knowledge items, then there may be up to 10 billion possibilities to consider in each inference step.

How do you respond to the 20-question argument that there are only of order 2^20 knowledge items?

- This list is sponsored by AGIRI: http://www.agiri.org/email To unsubscribe or change your options, please go to: http://v2.listbox.com/member/?list_id=303
Re: [agi] Project proposal: MindPixel 2
On 1/28/07, Eric Baum [EMAIL PROTECTED] wrote: How do you respond to the 20-question argument that there are only of order 2^20 knowledge items?

The granularity of knowledge items for 20 Questions and the number 20 are specifically chosen to match each other, to make the game fair. While never explicitly stated, everyone understands that e.g. 'a book' is a fair topic for 20 Questions, but 'Alice in Wonderland' is not. Yet we do know about 'Alice in Wonderland', and any attempt to duplicate human abilities must take that into account.
Re: [agi] Project proposal: MindPixel 2
Russell> On 1/28/07, Eric Baum [EMAIL PROTECTED] wrote: How do you respond to the 20-question argument that there are only of order 2^20 knowledge items?

Russell> The granularity of knowledge items for 20 Questions and the number 20 are specifically chosen to match each other, to make the game fair. While never explicitly stated, everyone understands that e.g. 'a book' is a fair topic for 20 Questions, but 'Alice in Wonderland' is not. Yet we do know about 'Alice in Wonderland', and any attempt to duplicate human abilities must take that into account.

Eric> Have you ever played 20 questions? In the games I've played, Alice in Wonderland would be a fine topic. I admit it's surprising that one plays as well as one does.

I haven't played 20 questions recently, but in response to your comment I just went to www.20q.net and played, thinking of Alice in Wonderland, the book. The neural net guessed "is it a novel" on question 22, and then decided it had gone far enough and said you won, but 20q guessed it eventually. However, I have a distinct recollection that, the last time I played, a human player guessed a specific radio station I was thinking of. Of course, 20q.net cheats by asking multimodal questions (animal, vegetable, or mineral), so there are more than 2^20 possibilities.
Re: [agi] Project proposal: MindPixel 2
On 1/29/07, Eric Baum [EMAIL PROTECTED] wrote: I haven't played 20 questions recently, but in response to your comment I just went to www.20q.net and played, thinking of Alice in Wonderland, the book. The neural net guessed "is it a novel" on question 22, and then decided it had gone far enough and said you won, but 20q guessed it eventually. However, I have a distinct recollection that, the last time I played, a human player guessed a specific radio station I was thinking of. Of course, 20q.net cheats by asking multimodal questions (animal, vegetable, or mineral), so there are more than 2^20 possibilities.

Wait... since it is not the *same* 20 questions every time, the number of concepts in the mind may be significantly more than 2^20. Also, 2^20 ~= 1 million, and OpenCyc has about 47,000 concepts. Still a long way to go, it seems... YKY
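The arithmetic in this exchange is easy to check. With ideal yes/no questions, n questions can distinguish at most 2^n items; allowing k-way answers (as 20q.net's animal/vegetable/mineral question does) raises the bound to k^n. A quick sketch of the numbers quoted above (the OpenCyc concept count is the one cited in the thread):

```python
# Upper bound on the number of items n questions can distinguish,
# when each question admits k possible answers. Ideal yes/no
# questioning is the k=2 case.
def distinguishable(n_questions: int, k: int = 2) -> int:
    return k ** n_questions

print(distinguishable(20))      # 1048576 -- 2^20, roughly a million
print(distinguishable(20, 3))   # 3486784401 -- 3^20, about 3.5 billion
# OpenCyc's ~47,000 concepts versus the 2^20 bound quoted in the thread:
print(distinguishable(20) // 47_000)  # 22 -- Cyc is well under the bound
```

Two extra questions (Eric's game ran to 22) quadruple the binary bound, and a single three-way question already pushes the space past 2^20, which is the sense in which 20q.net "cheats".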
Re: [agi] Project proposal: MindPixel 2
On 1/28/07, Eric Baum [EMAIL PROTECTED] wrote: Have you ever played 20 questions?

Yep.

In the games I've played, Alice in Wonderland would be a fine topic. I admit it's surprising that one plays as well as one does.

Interesting, and surprising, but I don't draw the same conclusion as you do. The interesting conclusion I draw is that each group/circle of people who play the game must have a different set of criteria for deciding what's a fair topic, and that we therefore must have a very surprising tacit ability to judge what level of detail constitutes 20 bits of information. It remains the case that we do know about more than a million things; you can't build a human-equivalent mind with a million knowledge items. (Or with mere explicit knowledge items at all, as Cyc has adequately demonstrated.)
Re: [agi] Project proposal: MindPixel 2
Pick whatever public domain licence you prefer: GPL, MIT, Apache, or whatever you believe will prevent legal abuses. In principle, though, I agree that data entered by the public should be owned by the public.
Re: [agi] Project proposal: MindPixel 2
On 1/27/07, Philip Goetz [EMAIL PROTECTED] wrote:

On 1/18/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: Totally disagree! I actually examined a few cases of *real-life* commonsense inference steps, and I found that they are based on a *small* number of tiny rules of thought. I don't know why you think massive knowledge items are needed for commonsense reasoning -- if you closely examine some of your own thoughts you'd see.

On 1/19/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: For the type of common sense reasoner I described, we need a *massive* number of rules. You can either acquire these rules via machine learning or direct encoding. Machine learning of such rules is possible, but the area of research is kind of immature. OTOH there has not been a massive project to collect such rules by hand. So that explains why my type of system has not been tried before.

Sorry about the confusion =) What I meant is that the AGI's knowledgebase needs to store a massive number of facts/rules, but a *single* commonsense inference case (e.g. the examples from my introspection) usually involves only a few deductive steps in logic (assuming the required rules are there). I guess Ben's objection is based on the first point, but the project is still feasible if it is powered by an online community, IMO. YKY
Re: [agi] Project proposal: MindPixel 2
On 1/25/07, Pei Wang [EMAIL PROTECTED] wrote:

Suppose I have a set of *deductive* facts/rules in FOPL. You can actually use this data in your AGI to support other forms of inference such as induction and abduction. In this sense the facts/rules collection does not dictate the form of inference engine we use.

No, you cannot do that without twisting some definitions. You are right that many people now define induction and abduction in the language of FOPL, but what they actually do is to omit important aspects of the process, such as uncertainty. To me that is cheating. I addressed this issue in http://nars.wang.googlepages.com/wang.syllogism.ps . In http://www.springer.com/west/home/computer/artificial?SGWID=4-147-22-173659733-0 I explained in detail (especially in Ch. 9 and 10) why the language of FOPL is improper for AI.

OK, there is some confusion here too. You're talking about standard FOPL, the version that is described in textbooks of mathematical logic. My logic is based on standard FOPL, but there are some significant differences. First, it can be extended with uncertainty values (e.g. according to your theory of (f, c)). Second, it does not use Frege-style quantifiers. Third, it does not make a strict distinction between predicates and arguments (e.g. I can say Loves(john, mary) and Is_Blind(love)). Given these differences, the 3 objections to FOPL in your book may be answered. In the end, your NARS logic and my logic may be very similar in both expressivity and semantics. If you're interested we may consider a collaboration or merging of theories. One issue I have not yet formed an opinion about is the universality of the inheritance relation in NARS. We can discuss that later...

That's why my top priority is to build an inference engine for deduction. Inductive learning will be added later in the form of data mining, which is very computation-intensive.
I'm afraid it is not going to work --- many people have tried to extend FOPL to cover a wider range, and have run into all kinds of problems. To restart from scratch is actually easier than to maintain consistency among many ad hoc patches and hacks. To me, one of the biggest mistakes of mainstream AI is to treat learning as independent of working, something that can be added in later. Seeing AI that way, versus putting learning into the foundation, will produce very different systems. In NARS, learning and reasoning, as well as some other cognitive faculties, are different aspects of the same underlying process, and cannot be handled separately.

Inductive learning under FOPL is a vast topic, and is still under development (e.g. the field of inductive logic programming). It is still too early to say that it won't work. Also, many methods in data mining are forms of inductive learning, and I believe these techniques can be borrowed for AGI. I guess Ben uses pattern mining techniques in Novamente too.

There is no clear reason why reasoning and learning must be unified. Can you elaborate on the advantages of such an approach?

The learning problem in AGI is difficult partly because GOFAI knowledge representation schemes are usually very cumbersome (with frames, microtheories, modal operators for temporal / epistemological aspects, etc.). My logic is very minimalistic, almost structureless. This makes learning easier, since learning is a search for hypotheses in the hypothesis space. YKY
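As a concrete, deliberately simplified illustration of what "extending FOPL with uncertainty values" might look like: the (frequency, confidence) pair below borrows NARS terminology, but the deduction rule is an illustrative placeholder (frequencies and confidences simply multiply), not NARS's actual truth-value function, and the example statements are invented:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Statement:
    """A ground atom annotated with an uncertainty value.

    frequency  -- how often the statement has held (0..1)
    confidence -- how much evidence backs that frequency (0..1)
    """
    predicate: str
    args: tuple
    frequency: float
    confidence: float

def deduce(premise: Statement, rule: Statement) -> Statement:
    # Illustrative combination only: multiplying means a chained
    # conclusion is never more certain than its weakest premise.
    # Real systems (NARS, PLN) use carefully justified functions here.
    return Statement(
        predicate=rule.predicate,
        args=premise.args,
        frequency=premise.frequency * rule.frequency,
        confidence=premise.confidence * rule.confidence,
    )

wet = Statement("Wet", ("floor",), 0.9, 0.8)
conducts_if_wet = Statement("ConductsElectricity", ("X",), 0.95, 0.9)
conclusion = deduce(wet, conducts_if_wet)
print(round(conclusion.frequency, 3), round(conclusion.confidence, 3))
```

The point of contention in the thread is visible even at this scale: once every statement carries an uncertainty pair, the inference rules stop being classical FOPL rules, which is Pei's argument that patching FOPL is harder than starting over.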
Re: [agi] Project proposal: MindPixel 2
On 1/27/07, Ben Goertzel [EMAIL PROTECTED] wrote: Yes, you can reduce nearly all commonsense inference to a few rules, but only if your rules and your knowledge base are not fully formalized...

As I envision it, we would have a large number of rules. Some rules are very abstract (e.g. rules governing inheritance or syllogisms) and others are more concrete (e.g. involving concrete concepts). For example:

Abstract rule: If X is a Y and Z(Y) then Z(X).
Concrete rule: If X is wet then X conducts electricity.

The rules are not fully formalized -- in the sense that there is not an elite set of rules governing all the others. Instead, there is a continuum of rules from the highly abstract / always-right to the concrete / defeasible. Do you think that's better?

Fully formalizing things, as is necessary for software implementation, makes things substantially more complicated. Give it a try and see!

Thanks for your support =) YKY
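The abstract rule "if X is a Y and Z(Y) then Z(X)" can be sketched in a few lines. The facts and property names below are invented for illustration; this covers only the always-right end of YKY's continuum:

```python
# A toy implementation of the abstract inheritance rule
# "if X is a Y and Z(Y) then Z(X)": properties asserted of a
# category are inherited by its members. Facts are invented.
isa = {"tweety": "bird", "bird": "animal"}                 # X is a Y
properties = {"bird": {"has_feathers"}, "animal": {"breathes"}}

def inherited_properties(x: str) -> set:
    props = set(properties.get(x, ()))
    y = isa.get(x)
    while y is not None:            # walk up the isa chain
        props |= properties.get(y, set())
        y = isa.get(y)
    return props

print(sorted(inherited_properties("tweety")))  # ['breathes', 'has_feathers']
```

A defeasible concrete rule like "if X is wet then X conducts electricity" would need an exception mechanism layered on top of this (penguins don't fly; distilled water conducts poorly), which is exactly where Ben's warning about full formalization starts to bite.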
Re: [agi] Project proposal: MindPixel 2
I need to understand your design better to talk about the details, and the discussion is getting too technical for this list. I will hold my doubts and wait for you to go further. Pei
Re: [agi] Project proposal: MindPixel 2
On 1/27/07, David Hart [EMAIL PROTECTED] wrote: This license chooser may help: http://creativecommons.org/license/ Perhaps MindPixel2 discussion deserves its own list at this stage? Listbox, Google and many others offer list services (Google Code also offers a wiki, source version management, and other features).

Thanks, but I favor a license that supports some commercial rights, or I'll need to create one. Google Code only supports free / copyleft licenses. I will start a separate list as soon as this is settled... YKY
Re: [agi] Project proposal: MindPixel 2
There is not a clear reason why reasoning and learning must be unified. Can you elaborate on the advantages of such an approach?

To answer that question I would have to know how you are defining those terms.

The learning problem in AGI is difficult partly because GOFAI knowledge representation schemes are usually very cumbersome (with frames, microtheories, modal operators for temporal / epistemological aspects, etc). My logic is very minimalistic, almost structureless. This makes learning easier since learning is a search for hypotheses in the hypothesis space.

With a minimalist logic, the hypothesis space will be large, posing a huge search problem. The point of all those cumbersome additions to basic logic is essentially to allow learning and reasoning algorithms to narrow down the search space in contextually appropriate ways. Ben G
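Ben's point about search-space size can be quantified in a toy setting. Suppose hypotheses are clause bodies built from binary predicates over a few variables: with p predicates and v variables there are p*v*v possible atoms, and any subset of atoms can serve as a body. The counting scheme is an illustrative simplification, not a claim about any particular system:

```python
# Toy count of candidate rule bodies in a minimalist logic: with
# p binary predicates and v variables there are p * v * v atoms,
# and each subset of atoms is a candidate body. The counting is a
# deliberate simplification for illustration.
def candidate_bodies(p: int, v: int) -> int:
    atoms = p * v * v
    return 2 ** atoms

print(candidate_bodies(5, 2))    # 1048576 -- 2^20 bodies already
print(candidate_bodies(10, 3))   # 2^90 -- astronomically many
```

Even five predicates and two variables give a million candidates, which is the sense in which the "cumbersome additions" of richer representations earn their keep: they prune this space before search begins.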
Re: [agi] Project proposal: MindPixel 2 - licensing
Hi YKY, On 1/28/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: Thanks, but I favor a license that supports some commercial rights, or I'll need to create one. Google Code only supports free / copyleft licenses. Licensing is typically more intricate than it first appears. KB content and software source code would likely be under separate licenses, and contributed and maintained by mostly separate communities. It's feasible to maintain two source code bases, one which is open source and a second which is closed source. As copyright holder, you're permitted to intermingle code between the two (with some restrictions), and reserve any proprietary code you like in the closed-source version. However, once source code is contributed to an open source version, it's in the wild forever (i.e. an open source license can't be retroactively revoked). You might also consider making an upfront statement to the effect that open source coders may be hired in the future if they're willing to assign their source code copyright to your company, allowing you to more easily make proprietary derivative works of their open source code. Depending on the license chosen, others may also be allowed to make proprietary derivative works of the open source code. For example, while it seems counter-intuitive, dual-licensed GPL projects have stronger commercial protection for the copyright holder than do BSD licensed projects, which allow third parties to keep their changes proprietary. For the KB, non-commercial creative commons licenses exist which may be useful. It's my guess that a KB of this size and nature would be hosted outside of a normal source-code-hosting setting, simply because those services don't offer the necessary tools for the job. Most Linux hosting services would be sufficient for KB hosting, as they include database software and large amounts of storage. 
You'd want to read the fine print of the source-code-hosting services' licenses, but it's probably okay to combine all of these various license types in the way described; however, IANAL, so better yet seek legal advice. Nearly any AGI project with a commercial/community mix will have similar licensing issues. David
Re: [agi] Project proposal: MindPixel 2
--- YKY (Yan King Yin) [EMAIL PROTECTED] wrote: On 1/25/07, Ben Goertzel [EMAIL PROTECTED] wrote: If there is a major problem with Cyc, it is not the choice of basic KR language. Predicate logic is precise and relatively simple.

I agree mostly, though I think even Cyc's simple predicate logic language can be made even simpler and better. For example, Cyc uses the classical quantifiers #$forAll and #$exists. In my version I don't use Frege-style quantifiers but I allow generalized modifiers like many, a few, in addition to all, exists.

IMHO the problem with Cyc is that they tried to go directly to adult-level intelligence with no theory of how people learn. This is why they are having such difficulty adding a natural language interface. Children learn semantics first, then simple sentences, and then the elements of logic such as and, or, not, all, some, etc. Cyc went straight to adult-level logic and math, and now they can't add in the stuff that should have been learned as children. They should have built the language model first.

Another problem is that n-th order logic (even probabilistic) is not how people think. Logic does not model inductive reasoning, e.g. Kermit is a frog. Kermit is green. Therefore frogs are green. Where is the theory that explains why people reason this way? This is what happens when you ignore the cognitive side of AI.

Rather, the main problem is the impracticality of encoding a decent percentage of the needed commonsense knowledge!

Now I see why we disagree here. You believe we should acquire all knowledge via experiential learning. IMO we can do even better than the experiential route. We can let the internet crowd enter the commonsense corpus for us. This should allow us to reach a functioning, usable AGI sooner.

How much knowledge you need depends on what problem you are trying to solve. Building an AGI to run a corporation is not the same as building a better spam detector.
-- Matt Mahoney, [EMAIL PROTECTED]
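Matt's Kermit example can be caricatured in a few lines. The point is that nothing in deductive logic licenses the jump from an instance to a universal; the generalization policy below is an explicit, and obviously unsound, extra assumption:

```python
# A caricature of single-instance induction: from P(a) and Q(a),
# jump to "all P are Q". Deduction never licenses this step; the
# jump is a separate, unsound policy that a cognitive theory
# would have to justify and constrain.
facts = {("frog", "kermit"), ("green", "kermit")}

def induce(facts: set) -> set:
    rules = set()
    for p, x in facts:
        for q, y in facts:
            if x == y and p != q:
                rules.add((p, q))   # read: "all P are Q"
    return rules

print(sorted(induce(facts)))  # [('frog', 'green'), ('green', 'frog')]
```

Note that this naive policy also induces the converse, "all green things are frogs", which people do not conclude. Explaining that asymmetry is precisely the missing cognitive theory Matt is asking for.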
Re: [agi] Project proposal: MindPixel 2
On 1/18/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: Totally disagree! I actually examined a few cases of *real-life* commonsense inference steps, and I found that they are based on a *small* number of tiny rules of thought. I don't know why you think massive knowledge items are needed for commonsense reasoning -- if you closely examine some of your own thoughts you'd see.

On 1/19/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: For the type of common sense reasoner I described, we need a *massive* number of rules. You can either acquire these rules via machine learning or direct encoding. Machine learning of such rules is possible, but the area of research is kind of immature. OTOH there has not been a massive project to collect such rules by hand. So that explains why my type of system has not been tried before.
Re: [agi] Project proposal: MindPixel 2
Yes, you can reduce nearly all commonsense inference to a few rules, but only if your rules and your knowledge base are not fully formalized... Fully formalizing things, as is necessary for software implementation, makes things substantially more complicated. Give it a try and see! Ben
Re: [agi] Project proposal: MindPixel 2
Philip Goetz wrote: On 1/17/07, Charles D Hixson [EMAIL PROTECTED] wrote: It's fine to talk about making the data public domain, but that's not a good idea. Why not?

Because public domain offers NO protection. If you want something close to what public domain used to provide, then the MIT license is a good choice. If you make something public domain, you are opening yourself to abusive lawsuits. (Those are always a possibility, but a license that disclaims responsibility offers *some* protection.) Public domain used to be a good choice (for some purposes), before lawsuits became quite so pernicious.
Re: [agi] Project proposal: MindPixel 2
On 1/27/07, Charles D Hixson [EMAIL PROTECTED] wrote: [...]

This license chooser may help: http://creativecommons.org/license/

Perhaps MindPixel2 discussion deserves its own list at this stage? Listbox, Google and many others offer list services (Google Code also offers a wiki, source version management, and other features). David
Re: [agi] Project proposal: MindPixel 2
On 1/25/07, Bob Mottram [EMAIL PROTECTED] wrote: The trouble is that you can only really decide whether a statement is non-probabilistic if enough people have voted unanimously yes or no. Even then you can't be sure that the next person to vote won't go the opposite way.

At the initial stage we may rely on the wisdom of crowds (wikipedia: http://en.wikipedia.org/wiki/The_Wisdom_of_Crowds), using voting on one set of common knowledge, although in later stages I think separate sub-communities might be desirable. This is not a worrying issue IMO. YKY
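A sketch of how such voting might be aggregated. The 80% supermajority threshold and the tie handling are arbitrary choices for illustration, not a proposal from the thread:

```python
from collections import Counter

def aggregate(votes: list) -> tuple:
    """Return (verdict, agreement) for a list of 'yes'/'no' votes.

    verdict is 'yes', 'no', or 'uncertain' when agreement falls
    below a (deliberately arbitrary) 80% supermajority threshold.
    """
    counts = Counter(votes)
    winner, n = counts.most_common(1)[0]
    agreement = n / len(votes)
    verdict = winner if agreement >= 0.8 else "uncertain"
    return verdict, agreement

print(aggregate(["yes"] * 9 + ["no"]))       # ('yes', 0.9)
print(aggregate(["yes"] * 6 + ["no"] * 4))   # ('uncertain', 0.6)
```

Bob's caveat survives any choice of threshold: a unanimous tally is still a sample, not a proof that a statement is non-probabilistic, so the agreement figure is best read as an estimate that the next voter could shift.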
Re: [agi] Project proposal: MindPixel 2
On 1/25/07, Ben Goertzel [EMAIL PROTECTED] wrote: If there is a major problem with Cyc, it is not the choice of basic KR language. Predicate logic is precise and relatively simple.

I agree mostly, though I think even Cyc's simple predicate logic language can be made even simpler and better. For example, Cyc uses the classical quantifiers #$forAll and #$exists. In my version I don't use Frege-style quantifiers, but I allow generalized modifiers like many, a few, in addition to all, exists.

Rather, the main problem is the impracticality of encoding a decent percentage of the needed commonsense knowledge!

Now I see why we disagree here. You believe we should acquire all knowledge via experiential learning. IMO we can do even better than the experiential route. We can let the internet crowd enter the commonsense corpus for us. This should allow us to reach a functioning, usable AGI sooner.

And, on a more technical level, I think that Cyc's **ontology** is too complex and unwieldy. This is NOT an issue of the KR language, but rather of the chosen vocabulary of semantic primitives. I don't feel that Cyc has a well-thought-out set of semantic primitives. They have a small number of basic logical primitives, and then a HUGE number of complex abstract concepts in their upper ontology. IMO an intermediate level is needed, involving a few dozen well-thought-out semantic primitives, and a few hundred additional basic semantic relationships.

I have a similar sense. As wikipedia puts it, Cyc has been criticized for excessive reification. I think the problem is that Cyc creates artificial labels that are atomic and *non-compositional*. For example the label #$rawFood should be represented *compositionally* by the concepts raw and food. I suggest not using ontologies at all. John Sowa has spent lots of time on the ontology problem and his conclusion is: We will never have a one-size-fits-all ontology for anything having to do with computer systems.
Case closed [ http://suo.ieee.org/email/msg12861.html ]. Perhaps this is one GOFAI feature we need to ditch. I think we can work bottom-up from a vast web of commonsense pixels, and let the computer organize its own knowledgebase via clustering etc. So we don't need any man-made ontology.

Lojban IMO has done a great job of this. The Lojban language embodies a very well-thought-out commonsense ontology, which has been shaped evolutionarily thru the usage of the language by the Lojban community.

I'm not familiar with the Lojban community or the status of the language, so I can't comment. I still believe that introducing Lojban into AGI is spurious / redundant, and it may alienate people from your projects if they don't know Lojban. It seems like just another man-made ontology that has its inadequacies.

However, this still doesn't solve the problem that there is too much commonsense knowledge to code in explicitly ... so it has to be learned...

This is the main disagreement. Could an internet crowd codify all commonsense knowledge? It seems yes, especially if we're talking about the more *verbal* portion of commonsense. Perhaps we should combine the Codify strategy with the Experiential Learning strategy. YKY
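The compositionality complaint about labels like #$rawFood can be illustrated directly: represent a concept as a set of primitive concepts, and subsumption and shared structure fall out of ordinary set operations. The primitive names here are invented for illustration, not taken from Cyc:

```python
# Representing concepts compositionally as frozensets of primitives,
# rather than as atomic labels like Cyc's #$rawFood. Primitive names
# are invented for this illustration.
raw_food    = frozenset({"raw", "food"})
raw_meat    = frozenset({"raw", "food", "meat"})
cooked_food = frozenset({"cooked", "food"})

# Subsumption is just the superset relation on primitive sets:
print(raw_food <= raw_meat)      # True: raw meat is a kind of raw food
print(raw_food <= cooked_food)   # False

# Shared structure stays visible, unlike with opaque atomic labels:
print(sorted(raw_food & cooked_food))  # ['food']
```

With atomic labels, the relationship between #$rawFood and a hypothetical #$rawMeat has to be asserted separately; in the compositional encoding it is computed, which is the "excessive reification" criticism in miniature.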
Re: [agi] Project proposal: MindPixel 2
Yes, Lojban is just another human-created ontology. My statement was that it is a particularly good one, well-thought out, practical, and conceptually sensible. I note that, unlike what you say, Novamente is not predicated on the assumption that we should acquire all knowledge via experiential learning. Rather, it is predicated on the assumption that there is a lot of knowledge, necessary for AGI, that is only practicably acquirable via experiential learning. I think there is also a lot of knowledge that can practicably be explicitly encoded and fed directly into an AGI's mind -- I just think the knowledge in this latter category is not **sufficient** in itself So, we can take a hybrid approach in Novamente. -- Ben G On Jan 25, 2007, at 6:58 PM, YKY (Yan King Yin) wrote: On 1/25/07, Ben Goertzel [EMAIL PROTECTED] wrote: If there is a major problem with Cyc, it is not the choice of basic KR language. Predicate logic is precise and relatively simple. I agree mostly, though I think even Cyc's simple predicate logic language can be made even simpler and better. For example, Cyc uses the classical quantifiers #$forAll and #$exists. In my version I don't use Frege-style quantifiers but I allow generalized modifiers like many, a few, in addition to all, exists. Rather, the main problem is the impracticality of encoding a decent percentage of the needed commonsense knowledge! Now I see why we disagree here. You believe we should acquire all knowledge via experiential learning. IMO we can do even better than the experiential route. We can let the internet crowd enter the commonsense corpus for us. This should be allow us to reach a functioning, usable AGI sooner. And, on a more technical level, I think that Cyc's **ontology** is too complex and unwieldy. This is NOT an issue of the KR language, but rather of the chosen vocabulary of semantic primitives. I don't feel that Cyc has a well-thought-out set of semantic primitives. 
They have a small number of basic logical primitives, and then a HUGE number of complex abstract concepts in their upper ontology. IMO an intermediate level is needed, involving a few dozen well thought out semantic primitives, and a few hundred additional basic semantic relationships. I have a similar sense. As Wikipedia puts it, Cyc has been criticized for excessive reification. I think the problem is that Cyc creates artificial labels that are atomic and *non-compositional*. For example, the label #$rawFood should be represented *compositionally* by the concepts raw and food. I suggest not using ontologies at all. John Sowa has spent lots of time on the ontology problem and his conclusion is: We will never have a one-size-fits-all ontology for anything having to do with computer systems. Case closed [ http://suo.ieee.org/email/msg12861.html ]. Perhaps this is one GOFAI feature we need to ditch. I think we can work bottom-up from a vast web of commonsense pixels, and let the computer organize its own knowledgebase via clustering etc. So we don't need any man-made ontology. Lojban IMO has done a great job of this. The Lojban language embodies a very well thought out commonsense ontology, which has been shaped evolutionarily thru the usage of the language by the Lojban community. I'm not familiar with the Lojban community or the status of the language, so I can't comment. I still believe that introducing Lojban into AGI is spurious / redundant, and it may alienate people from your projects if they don't know Lojban. It seems like just another man-made ontology that has its inadequacies. However, this still doesn't solve the problem that there is too much commonsense knowledge to code in explicitly ... so it has to be learned... This is the main disagreement. Could an internet crowd codify all commonsense knowledge? It seems yes, especially if we're talking about the more *verbal* portion of commonsense.
Perhaps we should combine the Codify strategy with the Experiential Learning strategy. YKY - This list is sponsored by AGIRI: http://www.agiri.org/email To unsubscribe or change your options, please go to: http://v2.listbox.com/member/?list_id=303
Re: [agi] Project proposal: MindPixel 2
On 1/20/07, Pei Wang [EMAIL PROTECTED] wrote: The bottomline is that the knowledge acquisition project is *separable* from specific inference methods. What is your argument supporting this strong claim? I guess every book on knowledge representation includes a statement saying that whether a knowledge representation format is good depends, to a large extent, on the type of inference it can support. These two aspects are never considered separable. Suppose I have a set of *deductive* facts/rules in FOPL. You can actually use this data in your AGI to support other forms of inference such as induction and abduction. In this sense the facts/rules collection does not dictate the form of inference engine we use. For example, First-Order Predicate Logic is not good enough for AI, partly because it does not support non-deductive inference. Also, the semantic network is considered weak mainly because it has no powerful inference method associated with it. FOPL can be used for things like induction and abduction, albeit via external algorithms. Therefore, I think a FOPL-based system can suffice for AGI (which doesn't mean that it is the only way). I am still reading your book, but I have found numerous good ideas in it. I know that you treat deduction, induction, and abduction in a unified way. That is a very elegant theory, but it may have problems. For example, if: (1) I read a lot of books (2) I hate my mom then your system may infer by induction that reading a lot of books -> hating one's mom. In some instances doing this is meaningful, but in general your system may be flooded with a lot of these speculative statements, drawing time from the day-to-day deductive operations. I tend to think of induction as something less essential than deduction. That's why my top priority is to build an inference engine for deduction. Inductive learning will be added later in the form of data mining, which is very computation-intensive.
YKY
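YKY's worry about being flooded with speculative statements can be made concrete with a toy sketch (all names invented; this is not NARS's actual induction rule): a naive induction pass that pairs every two properties observed of the same subject already yields a quadratic number of candidate implications, most of them useless.

```python
# Toy fact base: (subject, property) pairs. Names are purely illustrative.
facts = [
    ("yky", "reads_many_books"),
    ("yky", "hates_mom"),
    ("yky", "likes_tea"),
]

def naive_induction(facts):
    """For each subject, propose p -> q for every ordered pair of its properties."""
    by_subject = {}
    for subj, prop in facts:
        by_subject.setdefault(subj, []).append(prop)
    rules = []
    for props in by_subject.values():
        for p in props:
            for q in props:
                if p != q:
                    rules.append((p, q))  # speculative candidate: p -> q
    return rules

rules = naive_induction(facts)
# Three properties of one subject already yield 3*2 = 6 speculative rules,
# including the dubious reads_many_books -> hates_mom from the example above.
print(len(rules))
```

This is why a unified treatment of induction needs some attention/resource-allocation mechanism to keep such candidates from crowding out deduction.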
Re: [agi] Project proposal: MindPixel 2
On 1/24/07, Bob Mottram [EMAIL PROTECTED] wrote: I think it would be better to design a system with probabilistic reasoning as a fundamental component from the outset, rather than trying to bolt this on as an afterthought. I know from doing a lot of stuff with machine vision that modelling sensor uncertainties is critical for being able to understand the spatial structure of the environment, and I expect similar principles will apply when reasoning within more abstract domains. Yes, I agree. I think Pei Wang's version of uncertain logic is very simple and effective. It uses two numbers, one for probability (as frequency) and one for support or confidence. On the other hand, I suspect that many commonsense statements do not have probabilistic values attached to them. For example, water conducts electricity or oil is slippery are not really probabilistic. We should leave an option for a statement to be non-probabilistic. YKY
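The representational option YKY suggests is easy to sketch (a minimal illustration, not Pei Wang's actual implementation; the truth values shown are made-up numbers): attach an optional (frequency, confidence) pair to each statement, and treat statements without one as non-probabilistic.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Statement:
    text: str
    # NARS-style two-number truth value: (frequency, confidence).
    # None marks the statement as non-probabilistic / definitional.
    truth: Optional[Tuple[float, float]] = None

    @property
    def probabilistic(self) -> bool:
        return self.truth is not None

wet = Statement("water conducts electricity")              # no truth value attached
ravens = Statement("ravens are black", truth=(0.98, 0.9))  # illustrative numbers only

print(wet.probabilistic, ravens.probabilistic)
```

An inference engine could then route the two kinds of statement differently: crisp deduction for the first, uncertain revision for the second.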
Re: [agi] Project proposal: MindPixel 2
The trouble is that you can only really decide whether a statement is non-probabilistic if enough people have voted unanimously yes or no. Even then you can't be sure that the next person to vote won't go the opposite way. On 24/01/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: [...]
Re: [agi] Project proposal: MindPixel 2
Given my experience while employed at Cycorp, I would say that there are two ways to work with them. The first way is to collaborate with Cycorp on a sponsored project. Collaborators are mainly universities (e.g. CMU, Stanford) and established research companies (e.g. SRI, SAIC) who have a track record of receiving government grants, and whose technologies are complementary to Cyc. I would not suggest this approach for MindPixel 2 yet. The second approach involves no exchange of money. Cycorp wants to promote its ontology - its commonsense vocabulary - and has released its definitions with a very permissive license as OpenCyc. One can also obtain nearly the entire Cyc knowledge base with a Research Cyc license for research purposes without fee, but with the RCyc license you are not allowed to extract facts and rules for MindPixel 2. You could contact the Cyc Foundation, which is an independent organization run by a friend of mine and former Cycorp employee. They are seeking to add knowledge to Cyc by using volunteers, and I believe that they would be very receptive to MindPixel 2 provided it uses a form of the OpenCyc vocabulary for knowledge representation. I suggest obtaining an RCyc license to see how the Cyc inference engine handles large rule and fact sets, and to see if the Cyc vocabulary fits your idea of a commonsense representation language. - Original Message From: Benjamin Goertzel [EMAIL PROTECTED] To: agi@v2.listbox.com Sent: Friday, January 19, 2007 5:35:51 PM Subject: Re: [agi] Project proposal: MindPixel 2 Hi, Do you think Cyc has a rule/fact like wet things can usually conduct electricity (or if X is wet then X may conduct electricity)? Yes, it does... I'll also contact some Cyc folks to see if they're interested in collaborating...
IMO, to have any chance of interesting them, you will need to be able to explain to them VERY CLEARLY why your current proposed approach is superior to theirs -- given that it seems so philosophically similar to theirs, and given that they have already encoded millions of knowledge items and built an inference engine and language-processing front end! -- Ben G
Re: [agi] Project proposal: MindPixel 2
I'm no expert on automated reasoning, but wasn't the original Mindpixel based fundamentally upon probabilistic representations (coherence values), whereas Cyc, from what I understand, doesn't represent facts or rules probabilistically? - Bob On 23/01/07, Stephen Reed [EMAIL PROTECTED] wrote: [...]
Re: [agi] Project proposal: MindPixel 2
Right, Cyc's deductive inference engine does not support probabilistic reasoning. But there is no obstacle to extending Cyc's vocabulary with the probabilistic representation you want and then using an inference engine of your own design. For my AGI project I use the OpenCyc vocabulary and content, but with my own object store (a relational database) and simple inference (look-up and subsumption within contexts). The Java dialog application that I am building does not require any more sophisticated deduction, so I am postponing any complex inference until I can teach those algorithms to the system using English. -Steve http://sf.net/projects/texai - Original Message From: Bob Mottram [EMAIL PROTECTED] To: agi@v2.listbox.com Sent: Tuesday, January 23, 2007 1:13:14 PM Subject: Re: [agi] Project proposal: MindPixel 2 [...]
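The kind of look-up-plus-subsumption inference Stephen describes can be sketched in a few lines (invented names; this is not the actual texai code): answer "is x a kind of Y?" by walking a stored generalization hierarchy, in the spirit of Cyc's #$genls links, instead of running a general theorem prover.

```python
# Toy generalization hierarchy: child -> parent. Names are illustrative.
genls = {
    "DomesticCat": "Cat",
    "Cat": "Mammal",
    "Mammal": "Animal",
}

def subsumes(general, specific):
    """True if `general` is reachable from `specific` by following genls links."""
    while specific is not None:
        if specific == general:
            return True
        specific = genls.get(specific)  # step up the hierarchy
    return False

print(subsumes("Animal", "DomesticCat"))  # True
print(subsumes("Cat", "Animal"))          # False: subsumption is one-directional
```

For a dialog application this is cheap (linear in the depth of the hierarchy) and covers a surprising share of commonsense queries, which is presumably why more complex inference can be deferred.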
Re: [agi] Project proposal: MindPixel 2
Benjamin Goertzel wrote: And, importance levels need to be context-dependent, so that assigning them requires sophisticated inference in itself... The problem may not be so serious. Common sense reasoning may require only *shallow* inference chains, eg 5 applications of rules. So I'm very optimistic =) Your worries are only applicable to 100-page theorem-proving tasks, not really the concern of AGI. A) This is just not true; many commonsense inferences require significantly more than 5 applications of rules B) Even if there are only 5 applications of rules, the combinatorial explosion still exists. If there are 10 rules and 1 billion knowledge items, then there may be up to 10 billion possibilities to consider in each inference step. So there are (10 billion)^5 possible 5-step inference trajectories, in this scenario ;-) Of course, some fairly basic pruning mechanisms can prune it down a lot, but one is still left with a combinatorial explosion that needs to be dealt with via subtle means... Please bear in mind that we actually have a functional uncertain logical reasoning engine within the Novamente system, and have experimented with feeding in knowledge from files and doing inference on it. (Though this has been mainly for system testing, as our primary focus is on doing inference based on knowledge gained via embodied experience in the AGISim world.) The truth is that, if you have a lot of knowledge in your system's memory, you need a pretty sophisticated, context-savvy inference control mechanism to do commonsense inference. Also, temporal inference can be quite tricky, and introduces numerous options for combinatorial explosion that you may not be thinking about when looking at atemporal examples of commonsense inference. Various conclusions may hold over various time scales; various pieces of knowledge may become obsolete at various rates, etc.
I imagine you will have a better sense of these issues once you have actually built an uncertain reasoning engine, fed knowledge into it, and tried to make it do interesting things. I certainly think this may be a valuable exercise for you to do. However, until you have done it, I think it's kind of silly for you to be speaking so confidently about solving all the problems found by others in doing this kind of work!! I ask again: do you have some theoretical innovation that seems likely to allow you to circumvent all these very familiar problems?? -- Ben Possibly this could be approached by partitioning the rule-set into small chunks of rules that work together, so that one didn't end up trying everything against everything else. These chunks of rules might well be context-dependent, so that one would use different chunks at a dinner table than in a workshop. There would need to be ways to combine different chunks of rules, of course, so e.g. a restaurant table would be different from a dinner table, but would have overlapping sets of rules. (I hope I'm not just re-inventing frames...)
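Ben's arithmetic above is easy to sanity-check (these are his illustrative figures, not measurements from any running system):

```python
# Back-of-the-envelope check of the combinatorial-explosion estimate.
rules = 10
items = 10**9                          # 1 billion knowledge items
options_per_step = rules * items       # up to 10 billion possibilities per step
trajectories = options_per_step ** 5   # (10 billion)^5 five-step inference paths

print(trajectories)  # == 10**50

# Even aggressively pruning each step to a shortlist of k options
# still leaves k**5 complete trajectories to consider:
k = 100
print(k ** 5)        # == 10**10 -- still far too many to enumerate blindly
```

This is why per-step pruning alone is not enough and some context-savvy inference control is needed on top of it.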
Re: [agi] Project proposal: MindPixel 2
Hi, Possibly this could be approached by partitioning the rule-set into small chunks of rules that work together, so that one didn't end up trying everything against everything else. These chunks of rules might well be context dependent, so that one would use different chunks at a dinner table than in a work shop. There would need to be ways to combine different chunks of rules, of course, so e.g. a restaurant table would be different from a dinner table, but would have overlapping sets of rules. (I hope I'm not just re-inventing frames...) The issue is how these contexts are learned. If contexts have to be programmer-supplied, then you ARE just reinventing frames. Context formation is a tricky inference problem in itself. -- Ben
Re: [agi] Project proposal: MindPixel 2
Benjamin Goertzel wrote: [...] The issue is how these contexts are learned. If contexts have to be programmer-supplied, then you ARE just reinventing frames. Context formation is a tricky inference problem in itself. -- Ben Well, my rather vague idea was to start with a very small rule set that didn't need to be partitioned, and to evolve rule-sets by statistical correlation (what tends to get used with what). As new rules are added, at some point clusters would need to separate (for efficiency). I suppose this could all be done with activation levels, but that's not the way I tend to think of it. OTOH, if the local cluster can't handle the deduction, it would need to check the most closely associated / most activated clusters to see if they could handle it. Not sure how well this would work. Clearly it has no more theoretical power than having all the rules in a large table, but I feel it would be a more efficient organization. Also, I don't have any definition of rule yet. It's not at all clear that it would be easy to translate into something a person not familiar with the details of the hardware and software would understand. (If a certain area of RAM is mapped to a video camera, reading/writing the RAM will naturally mean something very different than it would mean in other contexts. Writing to it might be a request to alter the scene. A silly way to do things, but it's for the sake of the point, not for real implementation.) I'm not at all sure that rules of the form if x do y, then check for result z (if not raise exception w) will suffice, even if you allow great flexibility as to what x, y, z, and w are interpreted as. Possibly they could be generalized functions (with x and z limited to not causing side effects).
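The "evolve rule-sets by statistical correlation" idea above can be sketched very simply (all rule names and episode data are invented; real systems would use graded co-occurrence statistics and allow overlapping chunks): record which rules fire together during inference episodes, then group rules that ever co-fire into the same context chunk.

```python
from collections import defaultdict
from itertools import combinations

# Each episode records which rules fired together during one inference run.
episodes = [
    {"cutlery", "napkin", "pour_drink"},   # dinner-table episodes
    {"cutlery", "napkin"},
    {"clamp", "drill", "sand"},            # workshop episodes
    {"clamp", "drill"},
]

# Count pairwise co-firings.
cooccur = defaultdict(int)
for ep in episodes:
    for a, b in combinations(sorted(ep), 2):
        cooccur[(a, b)] += 1

# Union-find: rules that ever co-fire end up in the same chunk.
parent = {}
def find(x):
    root = x
    while parent.setdefault(root, root) != root:
        root = parent[root]
    parent[x] = root  # path compression
    return root

for (a, b), count in cooccur.items():
    if count > 0:
        parent[find(a)] = find(b)

chunks = defaultdict(set)
for rule in {r for ep in episodes for r in ep}:
    chunks[find(rule)].add(rule)

print(sorted(sorted(c) for c in chunks.values()))
```

With this toy data the six rules separate into two context chunks, one per setting; the hard problems Ben raises (overlapping contexts, when to split a cluster, falling back to neighboring chunks) are exactly what this sketch leaves out.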
Re: [agi] Project proposal: MindPixel 2
My feeling is that this probably isn't a great business idea. I think collecting common sense data and building that into a general reasoner should really be thought of as a long-term effort, which is unlikely to appeal to business investors expecting to see a return within a few years. If any attempt is made to build a second version of mindpixel, I think the open source (or open corpus) model would be the obvious choice. Chris McKinstry kept his database secret, and for a long time so did Cyc, and as a consequence those projects saw very little actual usage by anyone. The more easily researchers can get their hands on the corpus, the more likely it is that some interesting applications will result. It might also be worth noting that cross-validated common sense information can be grabbed directly from the internet, from sites like Wikipedia. I've had a program doing this for quite some time, and the quality of the data acquired is good. - Bob On 18/01/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: On 1/19/07, Matt Mahoney [EMAIL PROTECTED] wrote: I think if you want to make a business out of AI, you are in for a lot of work. First you need something that is truly innovative, that does something that nobody else can do. What will that be? A search engine better than Google? A new operating system that understands natural language? A car that drives itself? A household servant robot? A program that can manage a company? A better spam detector? Text compression? Write down a well defined goal. Do research. What is your competition? How are your ideas better than what's been done? Prove it (with benchmarks), and the opportunities will come. Thanks for the tips. My idea is quite simple, slightly innovative, but not groundbreaking. Basically, I want to collect a knowledgebase of facts as well as rules. Facts are like water is wet etc. The rules I explain with this example: Cats have claws; Kitty is a cat; therefore Kitty has claws.
Here is an implicit rule that says if X is-a Y and Z(Y), then Z(X). I call rules like this the Rules of Thought. They are not logical tautologies but they express some common thought patterns. My theory is that if we collect a bunch of these rules, add a database of common sense facts, and add a rule-based FOPL inference engine (which may be enhanced with eg Pei Wang's numerical logic), then we have a common sense reasoner. That's what I'm trying to build as a first-stage AGI. If it does work, there may be some commercial applications for such a reasoner. Also it would serve as the base to build a full AGI capable of machine learning etc (I have crudely worked out the long-term plan). So, is this a good business idea? YKY
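The "Rule of Thought" in YKY's Kitty example can be sketched in a few lines (a minimal illustration of the inheritance pattern, not any particular engine): if X is-a Y and Z(Y), then Z(X).

```python
# Toy knowledge base, following the example in the message above.
isa = {"kitty": "cat"}                  # Kitty is a cat
properties = {"cat": {"has_claws"}}     # cats have claws

def derive(x):
    """Collect the properties of x, inheriting along is-a links: Z(Y) gives Z(X)."""
    props = set(properties.get(x, ()))
    parent = isa.get(x)
    if parent is not None:
        props |= derive(parent)
    return props

print(derive("kitty"))  # {'has_claws'} -- therefore Kitty has claws
```

The interesting (and hard) part is everything this sketch omits: exceptions (a declawed cat), uncertainty, and the thousands of other such thought patterns that would need to be collected.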
Re: [agi] Project proposal: MindPixel 2
YKY, Frankly, I still see many conceptual confusions in your description. Of course, some of them come from other people's mistakes, but they will hurt your work anyway. For example, what you call rule in your postings has two different meanings: (1) A declarative implication statement, X ==> Y; (2) A procedure that produces conclusions from premises, {X} |- Y. These two are related, but not the same thing. Both can be learned, but through very different paths. To confuse the two will cause a mess. The failure of GOFAI has reasons deeper than you suggested. Like Ben, I think you will repeat the same mistakes if you follow the current plan. Just adding numbers to your rules won't solve all the problems. More knowledge, higher intelligence is an intuitively attractive slogan, but it has many problems in it. For example, more knowledge will easily lead to combinatorial explosion, and the reasoning system will derive many true but useless conclusions. How do you deal with that? I don't think it is a good idea to attract many volunteers to a project unless the plan is mature enough that people's time and interest won't be wasted. Sources of human knowledge will be needed by any AGI project, so projects like CYC or MindPixel will be useful, though I'm afraid neither is cost-effective enough to play a central role in satisfying this need. Mining the Web may be more efficient, though it will surely leave gaps in the knowledge base to be filled in by other methods, such as personal experience, NLP, interactive tutoring, etc. Sorry for the negative tone, but since you mentioned my work, I have to clarify my position. Pei On 1/19/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: On 1/19/07, Benjamin Goertzel [EMAIL PROTECTED] wrote: Well YKY, I don't feel like rehashing these ancient arguments on this list!! Others are welcome to do so, if they wish... ;-) You are welcome to repeat the mistakes of the past if you like, but I frankly consider it a waste of effort.
What you have not explained is how what you are doing is fundamentally different from what has been tried N times in the past -- by larger, better-funded teams with more expertise in mathematical logic... Well, I think people gave up on logic-based AI (GOFAI if you will) in the 80s because of newer techniques such as neural networks and statistical learning methods. They were not necessarily aware of what exactly was the cause of the failure. If they had been, they would have tackled it. For the type of common sense reasoner I described, we need a *massive* number of rules. You can either acquire these rules via machine learning or direct encoding. Machine learning of such rules is possible, but the area of research is kind of immature. OTOH there has not been a massive project to collect such rules by hand. So that explains why my type of system has not been tried before. My system is conceptually very close to Cyc, but the difference is that Cyc only contains ground facts and relies on special predicates (eg #$isa, #$genls) to do the reasoning. My project may be the first to openly collect facts as well as rules. I guess Novamente or NARS can benefit by importing these rules, if the format is right? YKY
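Pei's distinction between the two meanings of rule can be made concrete with a sketch (invented names; a deliberately minimal illustration): a declarative implication X ==> Y is a piece of *data* the system stores and can itself learn or revise, while modus ponens, {X, X ==> Y} |- Y, is a *procedure* that consumes such data. A knowledge-collection project like MindPixel 2 gathers the former; an inference engine supplies the latter.

```python
# Declarative level: implications stored as data (antecedent, consequent).
kb_implications = {("wet(x)", "may_conduct(x)")}
kb_facts = {"wet(x)"}

def modus_ponens(facts, implications):
    """Procedure {X, X ==> Y} |- Y: derive consequents whose antecedents hold."""
    derived = set()
    for antecedent, consequent in implications:
        if antecedent in facts:
            derived.add(consequent)
    return derived

print(modus_ponens(kb_facts, kb_implications))  # {'may_conduct(x)'}
```

Confusing the two levels — e.g. treating every collected implication as a new inference procedure — is exactly the mess Pei warns about: the implications live in the knowledge base, while the (small, fixed) set of procedures lives in the engine.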
Re: [agi] Project proposal: MindPixel 2
YKY, Pei's attitude is pretty similar to mine on these matters, although we differ on other more detailed issues regarding AGI. And, please note that compared to most AI researchers, Pei and I would be among the folks most likely to be sympathetic to your ideas, given that -- we are both explicitly in favor of pushing hard toward AGI rather than fiddling with narrow AI -- we are both in favor of uncertain logic systems as one highly viable path toward AGI You have not explained how you will overcome the issues that plagued GOFAI, such as -- the need for massive amounts of highly uncertain background knowledge to make real-world commonsense inferences -- the combinatorial explosion that ensues when you try to control logical inference on a large body of data My own solution to these problems is to -- learn most knowledge via experience rather than via explicit encoding -- utilize a subtle combination of inference, statistical pattern mining and artificial economics for inference control Pei agrees with me on the learning-via-experience part but has a different approach to the combinatorial explosion problem of inference control. But you have not yet presented any original solutions to these or other major well-documented problems with the GOFAI approach. Yes, intuitively the approach you're suggesting sounds like it should work -- at first. That is why masses of research funding were spent on it decades ago, and why hundreds of brilliant people spent their lives on GOFAI. But you are not giving us any rational reason to suspect you might succeed in this sort of approach where so many others have failed. What is your new and different idea? -- Ben On 1/19/07, Pei Wang [EMAIL PROTECTED] wrote: YKY, Frankly, I still see many conceptual confusions in your description. Of course, some of them come from other people's mistakes, but they will hurt your work anyway.
For example, what you called a "rule" in your postings has two different meanings: (1) A declarative implication statement, X ==> Y; (2) A procedure that produces conclusions from premises, {X} |- Y. These two are related, but not the same thing. Both can be learned, but through very different paths. To confuse the two will cause a mess. The failure of GOFAI has reasons deeper than you suggested. Like Ben, I think you will repeat the same mistake if you follow the current plan. Just adding numbers to your rules won't solve all the problems. "More knowledge, higher intelligence" is an intuitively attractive slogan, but has many problems in it. For example, more knowledge will easily lead to combinatorial explosion, and the reasoning system will derive many true but useless conclusions. How do you deal with that? I don't think it is a good idea to attract many volunteers to a project unless the plan is mature enough that people's time and interest won't be wasted. Sources of human knowledge will be needed by any AGI project, so projects like CYC or MindPixel will be useful, though I'm afraid neither is cost-effective enough to play a central role in satisfying this need. Mining the Web may be more efficient, though it will surely leave gaps in the knowledge base to be filled in by other methods, such as personal experience, NLP, interactive tutoring, etc. Sorry for the negative tone, but since you mentioned my work, I have to clarify my position. Pei On 1/19/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: On 1/19/07, Benjamin Goertzel [EMAIL PROTECTED] wrote: Well YKY, I don't feel like rehashing these ancient arguments on this list!! Others are welcome to do so, if they wish... ;-) You are welcome to repeat the mistakes of the past if you like, but I frankly consider it a waste of effort. 
What you have not explained is how what you are doing is fundamentally different from what has been tried N times in the past -- by larger, better-funded teams with more expertise in mathematical logic... Well I think people gave up on logic-based AI (GOFAI if you will) in the 80s because of newer techniques such as neural networks and statistical learning methods. They were not necessarily aware of what exactly was the cause of failure. If they had been, they would have tackled it. For the type of common sense reasoner I described, we need a *massive* number of rules. You can acquire these rules either via machine learning or via direct encoding. Machine learning of such rules is possible, but that area of research is still immature. OTOH there has not been a massive project to collect such rules by hand. So that explains why my type of system has not been tried before. My system is conceptually very close to Cyc, but the difference is that Cyc only contains ground facts and relies on special predicates (e.g. #$isa, #$genls) to do the reasoning. My project may be the first to openly collect facts as well as rules. I guess Novamente or NARS can benefit by importing these rules, if the format is right? YKY
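Pei's distinction between a declarative implication and an inference procedure, quoted above, is easy to make concrete. Both representations below are invented for the sketch: the implication is an item *in* the knowledge base, while modus ponens is code that operates *on* such items.

```python
# (1) Declarative: an implication is just another item of knowledge.
kb = {("implies", "raining", "wet_ground"), "raining"}

# (2) Procedural: modus ponens is a procedure mapping premises to conclusions.
def modus_ponens(kb):
    """Return every consequent whose antecedent is asserted in the KB."""
    new = set()
    for item in kb:
        if isinstance(item, tuple) and item[0] == "implies":
            _, antecedent, consequent = item
            if antecedent in kb:
                new.add(consequent)
    return new

print(modus_ponens(kb))  # {'wet_ground'}
```

Collecting items like (1) from volunteers is a data problem; learning or improving procedures like (2) is a quite different problem, which is Pei's point.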
Re: [agi] Project proposal: MindPixel 2
Regarding Mindpixel 2, FWIW, one kind of knowledge base that would be most interesting to me as an AGI developer would be a set of pairs of the form (Simple English sentence, formal representation) For instance, a [nonrepresentatively simple] piece of knowledge might be (Cats often chase mice, { often( chase(cat, mouse) ) } ) This sort of training corpus would be really nice for providing some extra help to an AI system that was trying to learn English. Equivalently one could use a set of pairs of the form (English sentence, Lojban sentence) If Lojban is not used, then one needs to make some other highly particular specification regarding the logical representation language. -- Ben On 1/19/07, Benjamin Goertzel [EMAIL PROTECTED] wrote: YKY, Pei's attitude is pretty similar to mine on these matters, although we differ on other more detailed issues regarding AGI. And, please note that compared to most AI researchers, Pei and I would be among the folks most likely to be sympathetic to your ideas, given that -- we are both explicitly in favor of pushing hard toward AGI rather than fiddling with narrow AI -- we are both in favor of uncertain logic systems as one highly viable path toward AGI You have not explained how you will overcome the issues that plagued GOFAI, such as -- the need for massive amounts of highly uncertain background knowledge to make real-world commonsense inferences -- the combinatorial explosion that ensues when you try to control logical inference on a large body of data My own solution to these problems is to -- learn most knowledge via experience rather than via explicit encoding -- utilize a subtle combination of inference, statistical pattern mining and artificial economics for inference control Pei agrees with me on the learning via experience part but has a different approach to the combinatorial explosion problem of inference control. 
But you have not yet presented any original solutions to these or other major well-documented problems with the GOFAI approach. Yes, intuitively the approach you're suggesting sounds like it should work -- at first. That is why masses of research funding were spent on it decades ago, and why hundreds of brilliant people spent their lives on GOFAI. But you are not giving us any rational reason to suspect you might succeed in this sort of approach where so many others have failed. What is your new and different idea? -- Ben On 1/19/07, Pei Wang [EMAIL PROTECTED] wrote: YKY, Frankly, I still see many conceptual confusions in your description. Of course, some of them come from other people's mistakes, but they will hurt your work anyway. For example, what you called a "rule" in your postings has two different meanings: (1) A declarative implication statement, X ==> Y; (2) A procedure that produces conclusions from premises, {X} |- Y. These two are related, but not the same thing. Both can be learned, but through very different paths. To confuse the two will cause a mess. The failure of GOFAI has reasons deeper than you suggested. Like Ben, I think you will repeat the same mistake if you follow the current plan. Just adding numbers to your rules won't solve all the problems. "More knowledge, higher intelligence" is an intuitively attractive slogan, but has many problems in it. For example, more knowledge will easily lead to combinatorial explosion, and the reasoning system will derive many true but useless conclusions. How do you deal with that? I don't think it is a good idea to attract many volunteers to a project unless the plan is mature enough that people's time and interest won't be wasted. Sources of human knowledge will be needed by any AGI project, so projects like CYC or MindPixel will be useful, though I'm afraid neither is cost-effective enough to play a central role in satisfying this need. 
Mining the Web may be more efficient, though it will surely leave gaps in the knowledge base to be filled in by other methods, such as personal experience, NLP, interactive tutoring, etc. Sorry for the negative tone, but since you mentioned my work, I have to clarify my position. Pei On 1/19/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: On 1/19/07, Benjamin Goertzel [EMAIL PROTECTED] wrote: Well YKY, I don't feel like rehashing these ancient arguments on this list!! Others are welcome to do so, if they wish... ;-) You are welcome to repeat the mistakes of the past if you like, but I frankly consider it a waste of effort. What you have not explained is how what you are doing is fundamentally different from what has been tried N times in the past -- by larger, better-funded teams with more expertise in mathematical logic... Well I think people gave up on logic-based AI (GOFAI if you will) in the 80s because of newer techniques such as neural networks and statistical learning methods. They were not necessarily aware of what exactly was the cause of failure.
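Ben's suggested corpus of (Simple English sentence, formal representation) pairs could be as plain as a list of tuples. The formal notation below is shorthand invented here for the sketch, not a committed representation language:

```python
# A two-entry sample of the proposed bilingual corpus: surface English
# on one side, a logical form on the other. The second pair and the
# notation itself are illustrative assumptions.
corpus = [
    ("Cats often chase mice", "often(chase(cat, mouse))"),
    ("Dogs bark",             "bark(dog)"),
]

# Aligned pairs like these give a language learner direct supervision:
# it can train a mapping from text to logic (or logic to text).
for english, logic in corpus:
    print(f"{english!r:30} -> {logic}")
```

The same structure works for (English sentence, Lojban sentence) pairs; only the right-hand column changes.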
Re: [agi] Project proposal: MindPixel 2
On 1/19/07, Bob Mottram [EMAIL PROTECTED] wrote: My feeling is that this probably isn't a great business idea. I think collecting common sense data and building that into a general reasoner should really be thought of as a long term effort, which is unlikely to appeal to business investors expecting to see a return within a few years. In contrast to many businesses in this internet era, my project seems to need a low level of funding over a relatively long period of time (e.g. 5-10 years). I actually think this is a better business model for AGI (cf Ben's WebMind story =)). Funding is of secondary importance to finding the right partners; I know quite a few people with VC connections. If any attempt is made to build a second version of mindpixel, I think the open source (or open corpus) model would be the obvious choice. Chris McKinstry kept his database secret, and for a long time so did Cyc, and as a consequence those projects saw very little actual usage by anyone. The more easily researchers can get their hands on the corpus, the more likely it is that some interesting applications will result. How about this: the database would be open for anyone to download, for experimentation or whatever purpose. Only when someone wants to incorporate the data in an AGI would a license fee be needed. Also I would make the inference engine etc. open source, again within a commercial context. This approach is not so common but I think it gets the best of both worlds. It might also be worth noting that cross-validated common sense information can be grabbed directly from the internet, from sites like wikipedia. I've had a program doing this for quite some time, and the quality of the data acquired is good. There might be some gaps in the knowledge you acquired. If I really run the project I would also import knowledge acquired from other methods. 
YKY
Re: [agi] Project proposal: MindPixel 2
On 1/19/07, Pei Wang [EMAIL PROTECTED] wrote: For example, what you called a "rule" in your postings has two different meanings: (1) A declarative implication statement, X ==> Y; (2) A procedure that produces conclusions from premises, {X} |- Y. These two are related, but not the same thing. Both can be learned, but through very different paths. To confuse the two will cause a mess. Thanks, that's a good point. In uncertain logic, the status of the classical logic connectives AND, OR, NOT may be somewhat different. For example, A -> B may no longer be equivalent to (!A v B) when A and B are attached with uncertainty values. Therefore I am still unsure about how to deal with -> etc. But I will pay special attention to this point. The failure of GOFAI has reasons deeper than you suggested. Like Ben, I think you will repeat the same mistake if you follow the current plan. Just adding numbers to your rules won't solve all the problems. "More knowledge, higher intelligence" is an intuitively attractive slogan, but has many problems in it. For example, more knowledge will easily lead to combinatorial explosion, and the reasoning system will derive many true but useless conclusions. How do you deal with that? That's the problem of forward-chaining without a goal. In fact, the human mind can easily think of a lot of useless implications in a situation. If we have a query as a goal, we can use backward-chaining. Otherwise we can rank the implied sentences into levels of importance. This does not seem to be a show-stopper. I don't think it is a good idea to attract many volunteers to a project unless the plan is mature enough that people's time and interest won't be wasted. Sources of human knowledge will be needed by any AGI project, so projects like CYC or MindPixel will be useful, though I'm afraid neither is cost-effective enough to play a central role in satisfying this need. 
Mining the Web may be more efficient, though it will surely leave gaps in the knowledge base to be filled in by other methods, such as personal experience, NLP, interactive tutoring, etc. Speaking of cost-effectiveness, my project can be pretty low-cost =) I try to keep things simple and not pursue a million ideas at once, though I have plans for an entire AGI. YKY
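YKY's earlier worry about the connectives (that A -> B need not equal !A v B once uncertainty enters) can be illustrated numerically: under a probabilistic reading, the material conditional P(!A v B) and the conditional probability P(B|A) come apart. The joint distribution below is arbitrary, chosen only for the sketch:

```python
# An arbitrary joint distribution over (A, B), for illustration only.
joint = {
    (True, True): 0.1, (True, False): 0.3,
    (False, True): 0.2, (False, False): 0.4,
}

# Material conditional: P(!A v B)
p_material = sum(p for (a, b), p in joint.items() if (not a) or b)
# Conditional probability: P(B | A)
p_a = sum(p for (a, _), p in joint.items() if a)
p_cond = joint[(True, True)] / p_a

print(p_material, p_cond)  # roughly 0.7 vs 0.25
```

The two values differ sharply, so an uncertain logic must decide which reading its "->" is supposed to capture; the classical equivalence cannot simply be assumed.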
Re: [agi] Project proposal: MindPixel 2
On 19/01/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: How about this: the database would be open for anyone to download, for experimentation or whatever purpose. Only when someone wants to incorporate the data in an AGI would a license fee be needed. Also I would make the inference engine etc. open source, again within a commercial context. This approach is not so common but I think it gets the best of both worlds. This might be OK. You could distribute any funds from commercial licenses proportionately amongst those people who had entered the data. However, such a system might be difficult to enforce. I think we'll soon be entering an age where the internet becomes a big knowledge crunching monster, with information from one source being processed and spat out to other destinations in a completely automated way. It would be very hard to tell in this situation exactly who was using the data, or indeed what the original source of the data was. If a commercial model is adopted it should be made clear that this isn't a get-rich-quick scheme. If people start entering data believing that they're soon going to be making money out of it, then after a year or two with no financial return in sight disappointment sets in, which quickly leads to bad press and people losing interest in the project. This seems to be what happened with the original mindpixel.
Re: [agi] Project proposal: MindPixel 2
YKY (Yan King Yin) wrote: ... I think a project like this one requires substantial efforts, so people would need to be paid to do some of the work (programming, interface design, etc), especially if we want to build a high quality knowledgebase. If we make it free then a likely outcome is that we get a lot of noise but very few people actually contribute. I'm not an academic (left uni a couple years ago) so I can't get academic funding for this. If I can't start an AI business I'd have to entirely give up AI as a career. I hope you can understand these circumstances. YKY I can understand those circumstances, but if you expect people to contribute, you must give them something back. One thing that's cheap to give back is the work that they and others have contributed. Giving back less generally results in people not being willing to participate. Even if you claim sole rights to commercially exploit the work, you will find it much more difficult to get folk to participate. They will feel that you are stealing their work without just compensation. You raise the issue of compensation to you, and that's fair. But if you take out too much, you will cause the project to fail just as surely as if you hadn't put in the time to design the interface. If you merely make a requirement that people be a better-than-average contributor to be entitled to download the current results, then you will eliminate most potential competitors...and the remaining ones will be those who are also dedicating time and effort to making your project work. It's true that old versions of your work will circulate, but that should do little harm. People only participate in a public project if they feel they are getting a good return out of it. What a good return is, is subjective, but few people consider "I put in a bunch of work, and they don't even mention my name" to be a good return. 
You want to give people a return that they see as more valuable than their efforts, but which costs you a lot less than their efforts. Status in a community requires that the community exist. (At some point you'll want to give people scores depending on the amount of their work that is included in the current project...or something that will relate positively to that. This is a cheap status reward, and will boost community participation. On Slashdot I notice that just having a low numbered user ID has become a status marker of sorts. I.e., you've been a member of the community for a long time. That was a REALLY cheap status gift, but it took a long time to build to anything of value. Much quicker was the right to meta-moderate. Slightly less quick was the right to moderate. Note that these are both seen by the Slashdot community as things of worth, yet to the operator of Slashdot they were instituted as ways of cutting cost while improving quality. Also note that it took a long time for them to become worth much as status markers. You need something else to use while you're getting started.)
Re: [agi] Project proposal: MindPixel 2
On 1/19/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: "More knowledge, higher intelligence" is an intuitively attractive slogan, but has many problems in it. For example, more knowledge will easily lead to combinatorial explosion, and the reasoning system will derive many true but useless conclusions. How do you deal with that? That's the problem of forward-chaining without a goal. In fact, the human mind can easily think of a lot of useless implications in a situation. If we have a query as a goal, we can use backward-chaining. Otherwise we can rank the implied sentences into levels of importance. This does not seem to be a show-stopper. Backward inference faces the same problem --- more knowledge means more possible ways to derive subgoals. No traditional control strategy can scale up to a huge knowledge base. Importance-ranking will surely be necessary, but this idea by itself is not enough. Pei
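Pei's point about backward inference can be seen in even a toy backward chainer: every extra rule whose conclusion matches the current goal adds another branch of subgoals, so the search fans out with the size of the KB. The rules and facts below are invented for the sketch:

```python
# conclusion -> list of alternative premise lists (three ways to be
# "wealthy" already means three branches from that one goal).
rules = {
    "wealthy": [["works_hard"], ["inherits"], ["wins_lottery"]],
    "works_hard": [["motivated"]],
}
facts = {"motivated"}

def prove(goal, trace=None):
    """Naive backward chainer; `trace` collects every subgoal visited."""
    if trace is not None:
        trace.append(goal)
    if goal in facts:
        return True
    return any(all(prove(p, trace) for p in premises)
               for premises in rules.get(goal, []))

trace = []
print(prove("wealthy", trace=trace))  # True
print(trace)  # every subgoal the search touched
```

Here the first alternative happens to succeed, but an unprovable goal forces the chainer to expand *all* alternatives for every subgoal, which is exactly where a huge KB blows up.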
Re: [agi] Project proposal: MindPixel 2
On 1/19/07, Benjamin Goertzel [EMAIL PROTECTED] wrote: You have not explained how you will overcome the issues that plagued GOFAI, such as -- the need for massive amounts of highly uncertain background knowledge to make real-world commonsense inferences Precisely: we need to amass millions of knowledge items, and some items may have uncertainty. This is precisely what I'm trying to do. The alternative route is machine learning, but that requires a sensorium or a NL interface, which is an even more daunting task. (But I don't object to you going that way =)) -- the combinatorial explosion that ensues when you try to control logical inference on a large body of data I have studied this issue a bit, unbeknownst to you =) There are ways to tackle massive numbers of rules, e.g. the Rete algorithm, predicate hashing, etc. Soar is a good example using the Rete algorithm. It can handle millions of rules (and probably many more). My own solution to these problems is to -- learn most knowledge via experience rather than via explicit encoding Nothing wrong with this approach, but it may be even more difficult than mine. -- utilize a subtle combination of inference, statistical pattern mining and artificial economics for inference control You're getting into the topic of inference control, but I was only talking about collecting knowledge in the form of rules. Speaking of my project, it does not endorse specific inference methods. It is up to the AGI designer how to use the data. BTW, statistical pattern mining is good for *learning* patterns; I wouldn't use it for inference per se. For me inference is done only using *existing* rules and facts in the KB. Pattern mining is for discovering *new* rules and facts, which is very time-consuming and compute-intensive. Pei agrees with me on the learning via experience part but has a different approach to the combinatorial explosion problem of inference control. 
But you have not yet presented any original solutions to these or other major well-documented problems with the GOFAI approach. Again, I have no problems with learning via experience. What I propose is to augment this with knowledge acquisition via direct encoding, with the help of the net community. Do you have some reasons against this? Is it difficult for Novamente to incorporate the rules database? Yes, intuitively the approach you're suggesting sounds like it should work -- at first. That is why masses of research funding were spent on it decades ago, and why hundreds of brilliant people spent their lives on GOFAI. But you are not giving us any rational reason to suspect you might succeed in this sort of approach where so many others have failed. What is your new and different idea? I think the key innovation is that I allow rules with variables as well as facts, and that such knowledge would be collected from online users on a massive scale (which doesn't mean the project requires massive $$$s). Such a combination has NOT been attempted before, AFAIK. Frankly I'm not that knowledgeable about failed GOFAI projects (I was just a teenager in the 80s, playing with a TRS-80). Decades ago, the internet didn't exist and there was no way of amassing knowledge like MindPixel can. This is perhaps the most important reason why past projects failed. Maybe you're just reflexively saying that GOFAI is a failure, without giving it serious consideration. Cyc is not a complete failure. It's still ALIVE and it can do some reasoning about terrorist attacks etc. Why wouldn't an improved GOFAI succeed? Perhaps it is a misconception that everything associated with GOFAI _must_ be abandoned... YKY
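The "predicate hashing" idea YKY mentions can be sketched as follows: index rules by the predicate of their triggering premise, so that a new fact only visits rules that could possibly fire. (Rete proper goes further and caches partial matches across facts.) All rule contents below are invented for illustration:

```python
from collections import defaultdict

# Each rule: (triggering predicate, body that maps a fact to a
# conclusion or None). Both rules are made-up examples.
rules = [
    ("isa",  lambda f: ("can_breathe", f[1]) if f[2] == "animal" else None),
    ("owns", lambda f: ("responsible_for", f[1], f[2])),
]

# Hash rules by predicate so lookup is O(rules-per-predicate),
# not O(all rules).
index = defaultdict(list)
for predicate, body in rules:
    index[predicate].append(body)

def fire(fact):
    """Consider only the rules indexed under this fact's predicate."""
    return [c for body in index[fact[0]] if (c := body(fact)) is not None]

print(fire(("isa", "rex", "animal")))
print(fire(("owns", "yky", "laptop")))
```

With a million rules spread over thousands of predicates, each incoming fact touches only its own bucket, which is the basic reason systems like Soar stay fast as the rule count grows.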
Re: [agi] Project proposal: MindPixel 2
Your thermostat example can be used to show what I am talking about. The thermostat has an algorithm that says: when the temperature gets below some X amount, turn on the burner and fan until the temperature rises to at least some X+N amount. You get to set the X amount. It doesn't have a table that says "at exactly 69.1 turn on and turn off at exactly 70.3", as the temperature reading might never register exactly 69.1 or 70.3. If you made a database with enough entries and fine enough detail, you could just look up turn-on and turn-off points for any recorded temperature, but isn't just using the simple algorithm a better solution? I think if the goal was to create intelligence, this would be even more important. When I was in Physics in high school, I could have memorized all the formulas for my exams but instead, I found that by knowing only a few basic formulas I could derive all the others I needed during the exams. The problem is you have to know/understand how the formulas interrelate etc. When my kids were growing up, I tried to minimize the number of explicit rule/punishment combinations in modifying their behavior. Instead, I put forward policies and variable punishment so that it was much harder for my kids to get around the rules and so the punishment could always fit the crime. With a relatively small number of policies (analogous to the algorithms above), I could look after a much larger set of problems than I could by just resorting to a set of mindless rules. Working out how the policies were broken in each case and what the appropriate punishment is, is much harder than just using a set of rigid rules, but it is much more intelligent, don't you think? Do we divine the rules/laws/algorithms from a mass of data or do we generate the appropriate conclusions when we need them because we understand how it actually works? 
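The thermostat described above, written as the small algorithm rather than a lookup table (the set-point and hysteresis values below are arbitrary):

```python
def make_thermostat(x, n):
    """Return a controller with hysteresis: turn on below x, off at x + n."""
    state = {"on": False}
    def step(temp):
        if temp < x:
            state["on"] = True        # too cold: burner and fan on
        elif temp >= x + n:
            state["on"] = False       # warm enough: shut off
        return state["on"]            # between thresholds: keep last state
    return step

thermostat = make_thermostat(x=69.0, n=1.5)
print([thermostat(t) for t in (68.2, 69.4, 70.1, 70.5, 69.3)])
```

A handful of lines covers every temperature a table of recorded readings ever could, which is the point of the example: the algorithm generalizes, the lookup table only memorizes.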
David Clark - Original Message - From: Charles D Hixson [EMAIL PROTECTED] To: agi@v2.listbox.com Sent: Friday, January 19, 2007 11:02 AM Subject: Re: [agi] Project proposal: MindPixel 2 David Clark wrote: I agree with Ben's post that this kind of system has been tried many times and produced very little. How can a collection of "Cats have claws; Kitty is a cat; therefore Kitty has claws" relate cat and kitty, and that kitty is slang normally used for a young cat? A database of this type seems to be like the Chinese room dilemma, where even if you got something that looked intelligent out of the system, you know for a fact that no intelligence exists. ... David Clark I'm not certain that I'm convinced by that argument. I tend to feel that as we approach the base level, intelligences DO decompose into pieces that are, themselves, not intelligent. (Otherwise one gets into an "It's turtles all the way down!" kind of argument.) Partially it's a matter of definition. Is a thermostat intelligent? To me the answer would be "Yes, at the most basic possible level" (i.e., I wouldn't consider a thermocouple intelligent). A thermostat maintains a homeostasis, and to me that is one of the most basic kinds of intelligence. I can easily see that one could have a reasonable definition of intelligence that was sufficiently specific AND excluded thermostats as being too basic, but I'm willing to grant to thermostats a basic amount of intelligence. I'm also willing to grant that to logic engines. And to many other things that I see as pieces of an AGI. They aren't general intelligences, and I'm not totally convinced that such things can, even in principle, exist. (Goedel's results seem to imply otherwise. No system can be both complete and consistent.) Still, we are an existence proof that something better than we've been able to build so far is possible. 
I suspect that we shave on both completeness and consistency, and that's probably an indication of what's needed to come any closer than we are.
Re: [agi] Project proposal: MindPixel 2
"More knowledge, higher intelligence" is an intuitively attractive slogan, but has many problems in it. For example, more knowledge will easily lead to combinatorial explosion, and the reasoning system will derive many true but useless conclusions. How do you deal with that? That's the problem of forward-chaining without a goal. In fact, the human mind can easily think of a lot of useless implications in a situation. If we have a query as a goal, we can use backward-chaining. Otherwise we can rank the implied sentences into levels of importance. This does not seem to be a show-stopper. Backward chaining is just as susceptible to combinatorial explosions as forward chaining... And, importance levels need to be context-dependent, so that assigning them requires sophisticated inference in itself... Ben
Re: [agi] Project proposal: MindPixel 2
I've been using OpenCyc as the standard ontology for my texai project. OpenCyc contains only the very few rules needed to enable the OpenCyc deductive inference engine to operate on its OpenCyc content. On the other hand ResearchCyc, whose licenses are available without fees for research purposes, has a large number of rules. I have a license and can state that my copy of RCyc has 55,794 rules out of a total of 2,689,421 non-bookkeeping assertions. Nearly all of these rules were entered by hand at Cycorp. Here are five at random with my comments to give you a feel for what RCyc contains:

[this is a typical temporal relations rule]
(#$implies
  (#$and
    (#$startingIntervalOfThing ?TEMP-THING ?TIME-INTERVAL)
    (#$startingPoint ?TEMP-THING ?TIME-POINT))
  (#$endsAfterStartingOf ?TIME-INTERVAL ?TIME-POINT))
in context: #$CycTemporalTheoryMt [the Cyc term suffix Mt means microtheory (context)]

[this is a typical spatial relations rule]
(#$implies
  (#$and
    (#$isa ?UNIVERSE #$UniversalSpaceRegion)
    (#$partOfSpaceRegion ?REGION ?UNIVERSE)
    (#$spaceRegionDifference ?COMPLEMENT ?UNIVERSE ?REGION))
  (#$spaceRegionComplement ?COMPLEMENT ?REGION))
in context: #$SpatialGMt

[this is a rather specialized rule that helps define the predicate #$eventCasualtyDataSentence]
(#$implies
  (#$and
    (#$isa ?PRED #$CasualtyPredicate)
    (#$assertedSentence (#$relationInstanceExists ?PRED ?SUBEVENT ?COL))
    (#$different ?EVENT ?SUBEVENT)
    (#$subEvents ?EVENT ?SUBEVENT))
  (#$eventCasualtyDataSentence ?EVENT
    (#$and
      (#$subEvents ?EVENT ?SUBEVENT)
      (#$relationInstanceExists ?PRED ?SUBEVENT ?COL))))
in context: #$BaseKB [this is a general domain context from which almost all other contexts inherit facts and rules]

[this is a typical rule in the naive physics domain]
(#$implies
  (#$and
    (#$isa ?HOLDING #$HoldingAnObject)
    (#$doneBy ?HOLDING ?AGENT)
    (#$objectActedOn ?HOLDING ?OBJ))
  (#$holdsIn ?HOLDING (#$touches ?AGENT ?OBJ)))
in context: #$NaivePhysicsMt

[This is a rule to guide a Cyc knowledge acquisition tool. 
Note that this rule represents a form of probability not seen in the other rules.]
(#$implies
  (#$genls ?COL #$EnclosingSomething)
  (#$keCommonQueryForTerm ?COL
    (#$relationAllExists #$enclosure ?COL :WHAT)))
in context: #$BaseKB

Cheers. -Steve http://sf.net/projects/texai - Original Message From: YKY (Yan King Yin) [EMAIL PROTECTED] To: agi@v2.listbox.com Sent: Friday, January 19, 2007 8:48:33 AM Subject: Re: [agi] Project proposal: MindPixel 2 On 1/19/07, Benjamin Goertzel [EMAIL PROTECTED] wrote: Well YKY, I don't feel like rehashing these ancient arguments on this list!! Others are welcome to do so, if they wish... ;-) You are welcome to repeat the mistakes of the past if you like, but I frankly consider it a waste of effort. What you have not explained is how what you are doing is fundamentally different from what has been tried N times in the past -- by larger, better-funded teams with more expertise in mathematical logic... Well I think people gave up on logic-based AI (GOFAI if you will) in the 80s because of newer techniques such as neural networks and statistical learning methods. They were not necessarily aware of what exactly was the cause of failure. If they had been, they would have tackled it. For the type of common sense reasoner I described, we need a *massive* number of rules. You can acquire these rules either via machine learning or via direct encoding. Machine learning of such rules is possible, but that area of research is still immature. OTOH there has not been a massive project to collect such rules by hand. So that explains why my type of system has not been tried before. My system is conceptually very close to Cyc, but the difference is that Cyc only contains ground facts and relies on special predicates (e.g. #$isa, #$genls) to do the reasoning. My project may be the first to openly collect facts as well as rules. I guess Novamente or NARS can benefit by importing these rules, if the format is right? 
YKY
Re: [agi] Project proposal: MindPixel 2
There will come a point when integrating Cyc-type assertions into Novamente will make sense for us, and I'll be curious how useful they turn out to be at that point. However, my impression is that OpenCyc's rules are not extensive enough to really add a lot to Novamente. ResearchCyc has better odds of being helpful but can't be used for commercial purposes, unfortunately. Still, we could integrate it with NM experimentally to see how useful it was. However, my view is that the integration of this sort of knowledge is most likely to be useful to an AI system once it has achieved a certain level of experiential intelligence, via embodied-learning means... ben On 1/19/07, Stephen Reed [EMAIL PROTECTED] wrote: I've been using OpenCyc as the standard ontology for my texai project. OpenCyc contains only the very few rules needed to enable the OpenCyc deductive inference engine to operate on its OpenCyc content. On the other hand ResearchCyc, whose licenses are available without fees for research purposes, has a large number of rules. I have a license and can state that my copy of RCyc has 55,794 rules out of a total of 2,689,421 non-bookkeeping assertions. Nearly all of these rules were entered by hand at Cycorp. 
Here are five at random with my comments to give you a feel for what RCyc contains:

[this is a typical temporal relations rule]
(#$implies
  (#$and
    (#$startingIntervalOfThing ?TEMP-THING ?TIME-INTERVAL)
    (#$startingPoint ?TEMP-THING ?TIME-POINT))
  (#$endsAfterStartingOf ?TIME-INTERVAL ?TIME-POINT))
in context: #$CycTemporalTheoryMt
[the Cyc term suffix Mt means microtheory (context)]

[this is a typical spatial relations rule]
(#$implies
  (#$and
    (#$isa ?UNIVERSE #$UniversalSpaceRegion)
    (#$partOfSpaceRegion ?REGION ?UNIVERSE)
    (#$spaceRegionDifference ?COMPLEMENT ?UNIVERSE ?REGION))
  (#$spaceRegionComplement ?COMPLEMENT ?REGION))
in context: #$SpatialGMt

[this is a rather specialized rule that helps define the predicate #$eventCasualtyDataSentence]
(#$implies
  (#$and
    (#$isa ?PRED #$CasualtyPredicate)
    (#$assertedSentence (#$relationInstanceExists ?PRED ?SUBEVENT ?COL))
    (#$different ?EVENT ?SUBEVENT)
    (#$subEvents ?EVENT ?SUBEVENT))
  (#$eventCasualtyDataSentence ?EVENT
    (#$and
      (#$subEvents ?EVENT ?SUBEVENT)
      (#$relationInstanceExists ?PRED ?SUBEVENT ?COL))))
in context: #$BaseKB
[this is a general domain context from which almost all other contexts inherit facts and rules]

[this is a typical rule in the naive physics domain]
(#$implies
  (#$and
    (#$isa ?HOLDING #$HoldingAnObject)
    (#$doneBy ?HOLDING ?AGENT)
    (#$objectActedOn ?HOLDING ?OBJ))
  (#$holdsIn ?HOLDING (#$touches ?AGENT ?OBJ)))
in context: #$NaivePhysicsMt

[This is a rule to guide a Cyc knowledge acquisition tool. Note that this rule represents a form of probability not seen in the other rules.]
(#$implies
  (#$genls ?COL #$EnclosingSomething)
  (#$keCommonQueryForTerm ?COL (#$relationAllExists #$enclosure ?COL :WHAT)))
in context: #$BaseKB

Cheers.
-Steve http://sf.net/projects/texai - Original Message From: YKY (Yan King Yin) [EMAIL PROTECTED] To: agi@v2.listbox.com Sent: Friday, January 19, 2007 8:48:33 AM Subject: Re: [agi] Project proposal: MindPixel 2 On 1/19/07, Benjamin Goertzel [EMAIL PROTECTED] wrote: Well YKY, I don't feel like rehashing these ancient arguments on this list!! Others are welcome to do so, if they wish... ;-) You are welcome to repeat the mistakes of the past if you like, but I frankly consider it a waste of effort. What you have not explained is how what you are doing is fundamentally different from what has been tried N times in the past -- by larger, better-funded teams with more expertise in mathematical logic... Well, I think people gave up on logic-based AI (GOFAI if you will) in the 80s because of newer techniques such as neural networks and statistical learning methods. They were not necessarily aware of what exactly was the cause of failure. If they had been, they would have tackled it. For the type of common sense reasoner I described, we need a *massive* number of rules. You can either acquire these rules via machine learning or direct encoding. Machine learning of such rules is possible, but the area of research is kind of immature. OTOH there has not been a massive project to collect such rules by hand. So that explains why my type of system has not been tried before. My system is conceptually very close to Cyc, but the difference is that Cyc only contains ground facts and relies on special predicates (eg #$isa, #$genls) to do the reasoning. My project may be the first to openly collect facts as well as rules. I guess Novamente or NARS can benefit by importing these rules, if the format is right? YKY This list is sponsored by AGIRI: http://www.agiri.org/email To unsubscribe or change your options, please go to: http://v2.listbox.com/member/?list_id=303
Re: [agi] Project proposal: MindPixel 2
On 1/20/07, David Clark [EMAIL PROTECTED] wrote: ... Do we divine the rules/laws/algorithms from a mass of data or do we generate the appropriate conclusions when we need them because we understand how it actually works? Just as chemistry is reducible to physics, in theory, while in reality it is a completely different subject... I think it is necessary that we populate the knowledgebase with *redundant* facts/rules, so we don't have to derive everything from scratch every time we do an inference. Some facts/rules are derivable from other facts/rules. Whenever a fact/rule requires more than, say, 3 steps of inference, we will enter it into the knowledgebase. This does not mean that the AGI does not *understand* the facts/rules. It does, but it memorizes intermediate results. If needed, it can explain the facts/rules using more basic facts/rules. YKY
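For what it's worth, the memoization YKY describes can be sketched in a few lines. This is my own illustration, not YKY's actual design; the tuple encoding, class name, and method names are assumptions, with only the 3-step threshold taken from his email:

```python
# Illustrative sketch only: memoize any derived fact whose inference chain
# exceeded a fixed step threshold, so later queries can look it up instead
# of re-deriving it from scratch.

STEP_THRESHOLD = 3  # "more than, say, 3 steps of inference"

class KnowledgeBase:
    def __init__(self, ground_facts):
        self.facts = set(ground_facts)  # ground facts plus memoized results
        self.provenance = {}            # derived fact -> its supporting facts

    def record_derivation(self, fact, supports, steps):
        """Store a derived fact if its chain was long; keep the supports so
        the system can still 'explain' it from more basic facts/rules."""
        if steps > STEP_THRESHOLD:
            self.facts.add(fact)
            self.provenance[fact] = supports
        return fact

kb = KnowledgeBase({("is-a", "Kitty", "cat"), ("has", "cat", "claws")})
kb.record_derivation(("has", "Kitty", "claws"),
                     [("is-a", "Kitty", "cat"), ("has", "cat", "claws")],
                     steps=4)
```

On a later query, ("has", "Kitty", "claws") is answered by a set lookup rather than a fresh inference chain, while the stored provenance preserves the ability to explain it.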
Re: [agi] Project proposal: MindPixel 2
On 1/20/07, Benjamin Goertzel [EMAIL PROTECTED] wrote: Backward chaining is just as susceptible to combinatorial explosions as forward chaining... And, importance levels need to be context-dependent, so that assigning them requires sophisticated inference in itself... The problem may not be so serious. Common sense reasoning may require only *shallow* inference chains, eg 5 applications of rules. So I'm very optimistic =) Your worries are only applicable to 100-page theorem-proving tasks, not really the concern of AGI. YKY
Re: [agi] Project proposal: MindPixel 2
And, importance levels need to be context-dependent, so that assigning them requires sophisticated inference in itself... The problem may not be so serious. Common sense reasoning may require only *shallow* inference chains, eg 5 applications of rules. So I'm very optimistic =) Your worries are only applicable to 100-page theorem-proving tasks, not really the concern of AGI. A) This is just not true; many commonsense inferences require significantly more than 5 applications of rules. B) Even if there are only 5 applications of rules, the combinatorial explosion still exists. If there are 10 rules and 1 billion knowledge items, then there may be up to 10 billion possibilities to consider in each inference step. So there are (10 billion)^5 possible 5-step inference trajectories, in this scenario ;-) Of course, some fairly basic pruning mechanisms can prune it down a lot, but one is still left with a combinatorial explosion that needs to be dealt with via subtle means... Please bear in mind that we actually have a functional uncertain logical reasoning engine within the Novamente system, and have experimented with feeding in knowledge from files and doing inference on them. (Though this has been mainly for system testing, as our primary focus is on doing inference based on knowledge gained via embodied experience in the AGISim world.) The truth is that, if you have a lot of knowledge in your system's memory, you need a pretty sophisticated, context-savvy inference control mechanism to do commonsense inference. Also, temporal inference can be quite tricky, and introduces numerous options for combinatorial explosion that you may not be thinking about when looking at atemporal examples of commonsense inference. Various conclusions may hold over various time scales; various pieces of knowledge may become obsolete at various rates, etc.
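Ben's back-of-envelope numbers, spelled out (this sketch only restates the figures already in his paragraph):

```python
# 10 rules x 1 billion knowledge items = up to 10 billion possibilities
# per inference step; a 5-step chain compounds this multiplicatively.
rules = 10
knowledge_items = 10**9
options_per_step = rules * knowledge_items   # 10 billion = 10**10
depth = 5
trajectories = options_per_step ** depth     # (10**10)**5 = 10**50

print(f"{options_per_step:.0e} options per step")   # 1e+10 options per step
print(f"{trajectories:.0e} 5-step trajectories")    # 1e+50 5-step trajectories
```

Even pruning 99.99% of candidates at every step would still leave 10^30 trajectories, which is why Ben argues the explosion must be handled by more subtle means than basic pruning.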
I imagine you will have a better sense of these issues once you have actually built an uncertain reasoning engine, fed knowledge into it, and tried to make it do interesting things. I certainly think this may be a valuable exercise for you to do. However, until you have done it, I think it's kind of silly for you to be speaking so confidently about how you can solve all the problems found by others in doing this kind of work!! I ask again, do you have some theoretical innovation that seems likely to allow you to circumvent all these very familiar problems?? -- Ben
Re: [agi] Project proposal: MindPixel 2
On 1/20/07, Stephen Reed [EMAIL PROTECTED] wrote: I've been using OpenCyc as the standard ontology for my texai project. OpenCyc contains only the very few rules needed to enable the OpenCyc deductive inference engine to operate on its OpenCyc content. On the other hand, ResearchCyc, whose licenses are available without fees for research purposes, has a large number of rules. I have a license and can state that my copy of RCyc has 55,794 rules out of a total of 2,689,421 non-bookkeeping assertions. Nearly all of these rules were entered by hand at Cycorp. Here are five at random with my comments to give you a feel for what RCyc contains: [...] Thanks a lot for this info... The Cyc rules you cited seem to be of the nonverbal knowledge kind, whereas my project tends to focus on the more verbal facet of common sense. But there should be no clear-cut boundary between the two. Do you think Cyc has a rule/fact like wet things can usually conduct electricity (or if X is wet then X may conduct electricity)? That's the kind of verbal knowledge I'm interested in -- things that can be entered by laymen in natural language. I use a logical form that has a (nearly) 1-1 mapping to NL. To keep things simple -- for this project -- we can focus on collecting the facts/rules, without talking about the inference engine or other deep AGI issues. I'll also contact some Cyc folks to see if they're interested in collaborating... YKY
Re: [agi] Project proposal: MindPixel 2
Hi, Do you think Cyc has a rule/fact like wet things can usually conduct electricity (or if X is wet then X may conduct electricity)? Yes, it does... I'll also contact some Cyc folks to see if they're interested in collaborating... IMO, to have any chance of interesting them, you will need to be able to explain to them VERY CLEARLY why your current proposed approach is superior to theirs -- given that it seems so philosophically similar to theirs, and given that they have already encoded millions of knowledge items and built an inference engine and language-processing front end! -- Ben G
Re: [agi] Project proposal: MindPixel 2
On 1/20/07, Benjamin Goertzel [EMAIL PROTECTED] wrote: A) This is just not true, many commonsense inferences require significantly more than 5 applications of rules OK, I concur. Long inference chains are built upon short inference steps. We need a mechanism to recognize the interestingness of sentences, so we only keep the interesting/relevant ones and build more deductions upon them. It's not easy, OK. The bottom line is that the knowledge acquisition project is *separable* from specific inference methods. B) Even if there are only 5 applications of rules, the combinatorial explosion still exists. If there are 10 rules and 1 billion knowledge items, then there may be up to 10 billion possibilities to consider in each inference step. So there are (10 billion)^5 possible 5-step inference trajectories, in this scenario ;-) Of course, some fairly basic pruning mechanisms can prune it down a lot, but one is still left with a combinatorial explosion that needs to be dealt with via subtle means... This is really solved: use a simple hashing of predicates, which is part of the Rete algorithm. For example, suppose you want to deduce whether dead birds can fly. There may be 5000 rules/facts about birds, and 1 rule/fact about dead things. (Reasonable?) So only these rules/facts would be checked; the other parts of the KB (about flowers, cats, etc.) are completely untouched (unsearched). Please bear in mind that we actually have a functional uncertain logical reasoning engine within the Novamente system, and have experimented with feeding in knowledge from files and doing inference on them. (Though this has been mainly for system testing, as our primary focus is on doing inference based on knowledge gained via embodied experience in the AGISim world.) Do you think the problems you encounter in Novamente are really due to combinatorial explosion, or rather to the *lack* of the right rules/facts?
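The predicate-hashing idea can be sketched like so. This is my own toy illustration, not YKY's code and not an actual Rete implementation (a full Rete network additionally shares partial join results across rules); the rule strings and predicate sets are invented:

```python
from collections import defaultdict

# Toy sketch of predicate-indexed rule retrieval: rules are bucketed by the
# predicates they mention, so a query about dead birds only touches rules
# mentioning "bird" or "dead" -- the flower rules are never searched.
rules = [
    ({"bird"},          "birds can fly"),
    ({"bird"},          "birds lay eggs"),
    ({"dead", "move"},  "dead things cannot move"),
    ({"flower", "sun"}, "flowers need sunlight"),
]

index = defaultdict(set)
for predicates, rule in rules:
    for p in predicates:
        index[p].add(rule)

query = {"bird", "dead"}  # "can dead birds fly?"
candidates = set().union(*(index[p] for p in query))
# candidates holds 3 of the 4 rules; "flowers need sunlight" is untouched.
```

The lookup cost scales with the number of rules sharing the query's predicates, not with the size of the whole KB, which is the point YKY is making; whether that suffices against Ben's combinatorial argument is exactly what the thread is debating.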
The truth is that, if you have a lot of knowledge in your system's memory, you need a pretty sophisticated, context-savvy inference control mechanism to do commonsense inference. I'm thinking about simple questions like can dead birds fly? etc. It shouldn't involve more than 5 steps. What you're talking about seems to be chaining many such steps to solve a detective story, that kind of thing. And yes, for that you need sophisticated inference mechanisms. Also, temporal inference can be quite tricky, and introduces numerous options for combinatorial explosion that you may not be thinking about when looking at atemporal examples of commonsense inference. Various conclusions may hold over various time scales; various pieces of knowledge may become obsolete at various rates, etc. Think 4D. Time is just another dimension. If you can do spatial reasoning you can do temporal reasoning. It *has* got to be the same, thanks to Einstein. If your approach uses special tricks to deal with time, then at least it is not an elegant solution. I'm not arrogant, and I admit I have not fully solved this 4D problem. It is kind of tricky, but I'm optimistic about it. I imagine you will have a better sense of these issues once you have actually built an uncertain reasoning engine, fed knowledge into it, and tried to make it do interesting things. I certainly think this may be a valuable exercise for you to do. However, until you have done it, I think it's kind of silly for you to be speaking so confidently about how you can solve all the problems found by others in doing this kind of work!! I ask again, do you have some theoretical innovation that seems likely to allow you to circumvent all these very familiar problems?? I've given brief answers to your earlier questions; I hope they're convincing. The point is that I think a simple inference engine combined with a good, *densely* populated knowledgebase can accomplish a lot.
Let me stress again that this project per se is only a collection of facts/rules. Other, more intelligent people may come up with a better AGI to use this database. As for myself, I do use some innovative ideas (eg the use of uncertain logic (not my invention though)) in my AGI that make it different from GOFAI. My knowledge representation is not just a bunch of logic formulae. The logic formulae can reference each other, so they form an intricate network similar to your graphical representation. I'd love to talk more about these things, but I think it's better to actually start a project and do some damned programming. So far I still believe the project is worth doing; and if my inference engine sucks, the database could still be of use to others... YKY
Re: [agi] Project proposal: MindPixel 2
On 1/20/07, Benjamin Goertzel [EMAIL PROTECTED] wrote: Regarding Mindpixel 2, FWIW, one kind of knowledge base that would be most interesting to me as an AGI developer would be a set of pairs of the form (Simple English sentence, formal representation) For instance, a [nonrepresentatively simple] piece of knowledge might be (Cats often chase mice, { often( chase(cat, mouse) ) } ) This sort of training corpus would be really nice for providing some extra help to an AI system that was trying to learn English. Equivalently, one could use a set of pairs of the form (English sentence, Lojban sentence) If Lojban is not used, then one needs to make some other highly particular specification regarding the logical representation language. So would there be a use for existing English documents to be translated, as verbatim as possible, into Lojban? Are you aware of any project like this? It's been a while since I looked at Lojban or your Lojban++, so I was wondering if English sentences translate well into Lojban without the sentence ordering changing. I.e., given two English sentences, are there any situations where in Lojban the sentences would be more correctly put in the reverse order? If there are, then manually inserting placemarks in the original and translated versions could be used to delineate between regions of meaning and assist an AI in reading the text while learning English. I bet it'd be a great way of learning Lojban too! ;) -- -Joel Unless you try to do something beyond what you have mastered, you will never grow. -C.R. Lawton
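The paired-corpus format Ben proposes is easy to picture. In this sketch the first pair is his own example from the email; the other pairs and the notation beyond it are my invented placeholders, since the actual representation language is deliberately left open:

```python
# (Simple English sentence, formal representation) pairs. The first pair is
# Ben's example; the others are invented illustrations in the same schematic
# notation, NOT any committed logical formalism.
corpus = [
    ("Cats often chase mice", "often(chase(cat, mouse))"),
    ("Water is wet",          "wet(water)"),
    ("Birds usually fly",     "usually(fly(bird))"),
]

# Aligned supervision for a language-learning system: surface form on the
# left, target logical form on the right.
for english, logical in corpus:
    assert english and logical
```

An (English sentence, Lojban sentence) corpus would have exactly the same shape, with the right-hand column holding Lojban text instead of logical forms.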
Re: [agi] Project proposal: MindPixel 2
On 1/19/07, Joel Pitt [EMAIL PROTECTED] wrote: It's been a while since I looked at Lojban or your Lojban++, so I was wondering if English sentences translate well into Lojban without the sentence ordering changing. I.e., given two English sentences, are there any situations where in Lojban the sentences would be more correctly put in the reverse order? If there are, then manually inserting placemarks in the original and translated versions could be used to delineate between regions of meaning and assist an AI in reading the text while learning English. I bet it'd be a great way of learning Lojban too! ;) Lojban/Lojban++ is inherently an explicit language, right? Then, given an environment of objects and actions, the AI's avatar could be asked to perform actions that we pick from an interface. How many person-hours of interaction have gone into telling a guy in a chicken suit to flap his arms, jump, etc.? Imagine how much more fun people would have with a greater range of action/object potential. If this were a game in the same vein as the Google Image Labeler, where another participant verified that the AI correctly completed the requested action, the language could be more easily learned: English expressed from person1 to person2, Lojban++ from person2 expressed to the AI, confirmation from person1 that the AI completed the request. A win for the AI to see English and Lojban++ of the same action; a win for person2 to have direct experiential learning by translating to Lojban++ (I/we need interactive learning mechanisms to be fluent enough in Lojban++ to think clearly in it); and person1 gets the same kicks as telling the man in the chicken suit to hop on one foot. (I never really understood that, but people forwarded that URL a lot.) Ben, I used Lojban++ in this example and was specifically thinking of NM because you have expressed (near-)readiness for virtual embodiment.
I would love to be able to interact with your baby via an avatar of my own, but I am currently less than baby-capable with respect to Lojban++. (Although this semester I am taking discrete math, so that may help with the 'logical' thinking.) Frankly, I feel I need to better understand how my own brain works before I can attempt to build a copy. Hopefully as my skills rise to meet this challenge, the interface tools will mature to lower the prerequisites for involvement. humour: I originally spelled Labeller and gmail's spellcheck offered libeller -- which would be a fun Google product, wouldn't it? Image Libeller
Re: [agi] Project proposal: MindPixel 2
On 1/19/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: The bottom line is that the knowledge acquisition project is *separable* from specific inference methods. What is your argument supporting this strong claim? I guess every book on knowledge representation includes a statement saying that whether a knowledge representation format is good depends, to a large extent, on the type of inference it can support. These two aspects are never considered separable. For example, First-Order Predicate Logic is not good enough for AI, partly because it does not support non-deductive inference. Also, semantic networks are considered weak mainly because they have no powerful inference method associated with them. You cannot build a useful knowledge base without thinking about what inference methods it should support. Pei
Re: [agi] Project proposal: MindPixel 2
Judging from your posts, you have solved the AI problem in 2007, 2006, 2005, ... On 1/15/07, A. T. Murray [EMAIL PROTECTED] wrote: Matt Mahoney wrote: [...] Lenat briefly mentions Sergey's (one of Google's founders) goal of solving AI by 2020. FWIW I solved AI theory-wise in 1979 and software-wise in 2007. http://mind.sourceforge.net/Mind.html and http://www.scn.org/~mentifex/jsaimind.html and http://visitware.com/AI4U/jsaimind.html are True AI demo versions. I think if Google and Cyc work together on this, they will succeed. The Mentifex solution to AI is messy. About thirty parameters of AI have been orchestrated and coordinated to produce a minimal thinking artificial Mind. What the late Christopher McKinstry and the late Pushpinder Singh tried to achieve in their web-mind (pace Ben G :-) programs can be achieved, albeit messily, in Mind.html or in http://mind.sourceforge.net/mind4th.html (lagging behind Mind.html), either by hard-coding a minimal subject-verb-object KB (as I did) or by data-entry when users teach the artificial Mind new facts. On another note, something which may alarm our fellow list members: I am thinking of replacing the Terminate exit from Mind.html with a [ ] Death check-box that will pop up a plea for mercy, with an ethical user-decision to be made about AI life or death. If the Mentifex AI programs Mind.html [AI-Complete] and Mind.Forth have truly solved AI, the open-access Site Meter logs will reveal an enormous rush to fetch the free AI source code. That escalation has not happened yet, but you are all welcome to click on Site Meter and see such curious visit logs as the following example from a few days ago, which was apparently made to a local copy of a Mentifex page:

Visit Number: 190,585
Domain Name: senate.gov (United States Government)
IP Address: 156.33.25.# (U.S. Senate Sergeant at Arms)
ISP: U.S. Senate Sergeant at Arms
Location: Washington, District of Columbia, United States, North America
Lat/Long: 38.8933, -77.0146
Language: unknown
Operating System: Microsoft WinXP
Browser: Internet Explorer 6.0 -- Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; InfoPath.1; .NET CLR 2.0.50727), Javascript disabled
Time of Visit: Jan 12 2007 5:40:01 pm
Last Page View: Jan 12 2007 5:40:01 pm
Visit Length: 0 seconds
Page Views: 1
Referring URL: unknown
Time Zone: unknown
Re: [agi] Project proposal: MindPixel 2
--- YKY (Yan King Yin) [EMAIL PROTECTED] wrote: I'm not an academic (left uni a couple years ago) so I can't get academic funding for this. If I can't start an AI business I'd have to entirely give up AI as a career. I hope you can understand these circumstances. Aren't there companies looking for AI researchers? Google? Maybe another approach (the one I took) is to publish something innovative, and people come to you. It won't make you rich, but I have so far gotten 3 small consulting jobs designing and writing data compression software or doing research, all from home, simply because people have seen my work on my website (PAQ compressor, large text benchmark, Hutter prize) or they just saw my posts on comp.compression. I never looked for any of this work. I make enough teaching at a nearby university as an adjunct, with lots of time off. I'm sure I could make more money if I wanted to work long hours in an office, but I don't need to. PAQ introduced a new compression algorithm (context mixing) when PPM algorithms were the best known. PAQ would not have made it to the top of the benchmarks without the ideas and coding and testing efforts of others working on it with no reward except name recognition. That would not have happened if it wasn't free (GPL open source). Even now, I'm sure nobody would pay even $20/copy when there is so much free competition. Other good compressors (Compressia, WinRK) have failed with this business model. I think if you want to make a business out of AI, you are in for a lot of work. First you need something that is truly innovative, that does something that nobody else can do. What will that be? A search engine better than Google? A new operating system that understands natural language? A car that drives itself? A household servant robot? A program that can manage a company? A better spam detector? Text compression? Write down a well defined goal. Do research. What is your competition? How are your ideas better than what's been done?
Prove it (with benchmarks), and the opportunities will come. -- Matt Mahoney, [EMAIL PROTECTED]
Re: [agi] Project proposal: MindPixel 2
On 1/18/07, Matt Mahoney [EMAIL PROTECTED] wrote: --- YKY (Yan King Yin) [EMAIL PROTECTED] wrote: I'm not an academic (left uni a couple years ago) so I can't get academic funding for this. If I can't start an AI business I'd have to entirely give up AI as a career. I hope you can understand these circumstances. ... I think if you want to make a business out of AI, you are in for a lot of work. First you need something that is truly innovative, that does something that nobody else can do. What will that be? A search engine better than Google? A new operating system that understands natural language? A car that drives itself? A household servant robot? A program that can manage a company? A better spam detector? Text compression? Wow, those are very strong prerequisites to start an AI company! No AI company to date has created a household servant robot or an NL OS. I think AI companies can exist around less formidable goals. I also like Guy Kawasaki's point (paraphrasing) that if you have a good idea, at least 5 other startups are working on it. If you have a great idea, at least 10. Write down a well defined goal. Do research. What is your competition? How are your ideas better than what's been done? Prove it (with benchmarks), and the opportunities will come. Not all successful companies have a quantitative proof that they are the best. Probably most do not. I'm not saying your assertion prove it...and (they) will come is incorrect, but that it's not the only basis for a startup. Re: business, you also need the right story to attract funding, the right approach to sales, and a good deal of luck. Novamente, *afaik*, has no proof via benchmark but does have paying AI contracts that sustain them. And still no robot butler! (I want one.)
Btw, I found Ben's 22-page recap of his experiences at WebMind to be useful: http://www.goertzel.org/benzine/WakingUpFromTheEconomyOfDreams.htm And there is good info here: http://www.amazon.com/Micro-ISV-Vision-Reality-Bob-Walsh/dp/1590596013/sr=8-1/qid=1169144626/ref=pd_bbs_sr_1/002-2460655-8624059?ie=UTF8s=books Finally: Open source is one way. Commercial is another. Both have succeeded and failed many times over and will continue to do so. Neither will be going away any time soon. -Chuck
Re: [agi] Project proposal: MindPixel 2
On 1/19/07, Matt Mahoney [EMAIL PROTECTED] wrote: I think if you want to make a business out of AI, you are in for a lot of work. First you need something that is truly innovative, that does something that nobody else can do. What will that be? A search engine better than Google? A new operating system that understands natural language? A car that drives itself? A household servant robot? A program that can manage a company? A better spam detector? Text compression? Write down a well defined goal. Do research. What is your competition? How are your ideas better than what's been done? Prove it (with benchmarks), and the opportunities will come. Thanks for the tips. My idea is quite simple, slightly innovative, but not groundbreaking. Basically, I want to collect a knowledgebase of facts as well as rules. Facts are like water is wet etc. The rules I explain with this example: Cats have claws; Kitty is a cat; therefore Kitty has claws. Implicit here is a rule that says: if X is-a Y and Z(Y), then Z(X). I call rules like this the Rules of Thought. They are not logical tautologies but they express some common thought patterns. My theory is that if we collect a bunch of these rules, add a database of common sense facts, and add a rule-based FOPL inference engine (which may be enhanced with eg Pei Wang's numerical logic), then we have a common sense reasoner. That's what I'm trying to build as a first-stage AGI. If it does work, there may be some commercial applications for such a reasoner. Also it would serve as the base to build a full AGI capable of machine learning etc (I have crudely worked out the long-term plan). So, is this a good business idea? YKY
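The "Rule of Thought" in YKY's example can be sketched as a single forward-chaining step. This is a toy illustration of mine, not YKY's code; the tuple encoding of facts is an assumption:

```python
# Toy forward-chaining step for the rule: if X is-a Y and Z(Y), then Z(X).
facts = {
    ("is-a", "Kitty", "cat"),   # Kitty is a cat
    ("has-claws", "cat"),       # cats have claws, i.e. Z(cat)
}

def apply_isa_inheritance(facts):
    """Derive Z(X) from (is-a X Y) together with Z(Y)."""
    derived = set()
    for fact in facts:
        if fact[0] == "is-a":
            _, x, y = fact
            for prop in facts:
                if len(prop) == 2 and prop[1] == y:  # some property Z(Y)
                    derived.add((prop[0], x))
    return derived

new_facts = apply_isa_inheritance(facts)  # derives ("has-claws", "Kitty")
```

One pass derives ("has-claws", "Kitty"), i.e. Kitty has claws; a common sense reasoner of the kind proposed would apply many such Rules of Thought, weighted by an uncertain logic, rather than this single crisp one.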
Re: [agi] Project proposal: MindPixel 2
I agree with Ben's post that this kind of system has been tried many times and produced very little. How can a collection of statements like Cats have claws; Kitty is a cat; therefore Kitty has claws relate cat and kitty, and capture that kitty is slang normally used for a young cat? A database of this type seems to be like the Chinese room dilemma, where even if you got something that looked intelligent out of the system, you know for a fact that no intelligence exists. To know that a cat is a mammal, as are people and dogs, can only be had by a huge collection of interrelated models that show the relationships, properties, abilities etc of all of these things. Such models could be automatically created (probably) by using the kind of information tidbits that you suggest, but the process would be very messy and the size of the database would be enormous. It would be like the AI trying to find the rules and relations of things out of a huge pile of word facts. Why not just build the rules and relationships into the AI from the beginning, populating the models with relevant facts as you go? This could be done with much less labor by using the AI itself to build the models, using higher and higher levels of teaching methods by multiple individuals. Computer languages use a strict subset of English to populate their syntax. People use English to communicate with each other. Why would we want to use a new language like Lojban when we already use subsets of English with computers? Why does an arbitrary English sentence have to be unambiguous? Most of the time this isn't a problem for English language people, and where it might be a problem, why couldn't it just be clarified the same as we humans do all the time? The teachers of the AI could intentionally use an unambiguous subset of English and gradually use more and more sophisticated sentences as the intelligence of the AI progressed. Isn't this what we do with children as they grow up?
Most people verify they understand instructions given to them before they actually act on those instructions, and potential misunderstandings are normally avoided. Why can't we do the same with an AI? Adding an additional language won't eliminate the need for the humans using English or the computer using its English subset language. Whatever the ambiguity problem is between humans and computers will simply be transferred to between the human and the new language, for no net benefit. David Clark - Original Message - From: Benjamin Goertzel [EMAIL PROTECTED] To: agi@v2.listbox.com Sent: Thursday, January 18, 2007 1:28 PM Subject: Re: [agi] Project proposal: MindPixel 2 YKY, this kind of thing has been tried many dozens of times in the history of AI. It does not lead to interesting results! Alas... The key problem is that you can't feasibly encode enough facts to allow interesting commonsense inferences -- commonsense inference seems to require a very massive store of highly uncertain knowledge-items, rather than a small store of certain ones. BTW the rule if X is-a Y and Z(Y), then Z(X). exists (in a slightly different form) in Novamente and many other inference systems... I feel like you are personally rediscovering GOFAI, the kind of AI that I read about in textbooks when I first started exploring the field in the early 1980's Ben G Thanks for the tips. My idea is quite simple, slightly innovative, but not groundbreaking. Basically, I want to collect a knowledgebase of facts as well as rules. Facts are like water is wet etc. The rules I explain with this example: Cats have claws; Kitty is a cat; therefore Kitty has claws. Here is an implicit rule that says if X is-a Y and Z(Y), then Z(X). I call rules like this the Rules of Thought. They are not logical tautologies but they express some common thought patterns. 
My theory is that if we collect a bunch of these rules, add a database of common sense facts, and add a rule-based FOPL inference engine (which may be enhanced with eg Pei Wang's numerical logic), then we have a common sense reasoner. That's what I'm trying to build as a first-stage AGI. If it does work, there may be some commercial applications for such a reasoner. Also it would serve as the base to build a full AGI capable of machine learning etc (I have crudely worked out the long-term plan). So, is this a good business idea? YKY
Re: [agi] Project proposal: MindPixel 2
Well YKY, I don't feel like rehashing these ancient arguments on this list!! Others are welcome to do so, if they wish... ;-) You are welcome to repeat the mistakes of the past if you like, but I frankly consider it a waste of effort. What you have not explained is how what you are doing is fundamentally different from what has been tried N times in the past -- by larger, better-funded teams with more expertise in mathematical logic... -- Ben On 1/18/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: Call me GOFAI ;) I have thought about this for quite some time and I'm not just copying old ideas On 1/19/07, Benjamin Goertzel [EMAIL PROTECTED] wrote: The key problem is that you can't feasibly encode enough facts to allow interesting commonsense inferences Yes, we need a lot of thought rules -- they are needed, and there is no escape except to encode them. Machine learning may help (the rules can be learned), but I think human encoding can get us quite far already. That's why I want to start a project to collect such rules. -- commonsense inference seems to require a very massive store of highly uncertain knowledge-items, rather than a small store of certain ones. Totally disagree! I actually examined a few cases of *real-life* commonsense inference steps, and I found that they are based on a *small* number of tiny rules of thought. I don't know why you think massive knowledge items are needed for commonsense reasoning -- if you closely examine some of your own thoughts you'd see. The rules in my system need not be certain. They can be *defeasible* and augmented with Pei Wang's c,f (confidence and frequency) values (which I think is a very good idea). I feel like you are personally rediscovering GOFAI, the kind of AI that I read about in textbooks when I first started exploring the field in the early 1980's Indeed I am very much influenced by those books. That's not necessarily a bad thing!! 
YKY
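The defeasible rules with Pei Wang's c,f (confidence and frequency) values mentioned above can be sketched roughly as follows. This is a hedged illustration of the usual evidence-counting formulation, not YKY's or Wang's actual code; the evidential horizon K and all names are assumptions.

```python
# Rough sketch of frequency/confidence truth values of the kind
# discussed (after Pei Wang's NARS): a statement's truth is summarized
# by evidence counts rather than a boolean, so rules stay defeasible.

K = 1.0  # evidential horizon (an illustrative constant)

def truth_value(positive, total):
    """frequency = w+/w, confidence = w/(w+K)."""
    frequency = positive / total
    confidence = total / (total + K)
    return frequency, confidence

def revise(ev1, ev2):
    """Pool two independent bodies of evidence for the same statement."""
    (p1, t1), (p2, t2) = ev1, ev2
    return p1 + p2, t1 + t2

# "Birds fly": 9 of 10 observed birds flew, then 3 of 5 in a new sample.
f, c = truth_value(*revise((9, 10), (3, 5)))
print(round(f, 2), round(c, 2))  # 0.8 0.94
```

More pooled evidence raises confidence toward 1 without forcing the frequency to either extreme, which is what lets a rule like "birds fly" survive counterexamples.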
Re: [agi] Project proposal: MindPixel 2
Joel Pitt wrote: ... Some comments/suggestions: * I think such a project should make the data public domain. Ignore silly ideas like giving out shares in the knowledge or whatever. It just complicates things. If the project is really strapped for cash later, then either use ad revenue or look for research funding (although I don't see much cost except for initial development of the system and web hosting). ... Making this proprietary and expecting shares to translate into cash would indeed be a silly approach. OTOH, having people's names attached to scores of some type (call them shares, or anything else) lets people feel more attached to the project. This is probably necessary for success. There also needs to be some way for the builders to interact, and a few other methods that assist the formation of a community. Newsboards, games, etc. can all be useful if structured properly to enhance the formation of a community. Perhaps only community members with scores above the median could be allowed to download the database? It's fine to talk about making the data public domain, but that's not a good idea. There are arguments in favor of BSD, MIT, GPL, LGPL, etc. licenses. For this kind of activity I can see either BSD or MIT as easily defensible. (Personally I'd use LGPL, but then if I were using it, I'd want the whole application to be GPL. I might not be able to achieve it, but that's what I'd want.) Public domain wouldn't be one of the possibilities that I would consider. The Artistic license is about as close to that as I would want to come...and the MIT license is probably a better choice for those purposes.
Re: [agi] Project proposal: MindPixel 2
On 1/14/07, Benjamin Goertzel [EMAIL PROTECTED] wrote: The choice of knowledge representation language makes a huge difference. IMO, Cyc committed themselves to an overcomplicated representation language that has rendered their DB far less useful than it would be otherwise If you want to use Lojban or Lojban++ as a knowledge representation, then I will back your project strongly, as careful study convinced me that the Lojban style of representation makes a lot of sense... Lojban is advertised to be based on predicate logic, so I assume that translating Lojban to FOPL should be straightforward. Is there such a translator available? I think the difficulty is in translating from English (or whatever NL) to Lojban or logic. This is still unsolved, so we need to settle for a restricted subset of English. IMO the use of Lojban is unnecessary because it is computationally equivalent to FOPL. But if you insist on Lojban it wouldn't be difficult to prepare a Lojban version of the knowledgebase. Or am I missing something? YKY
Re: [agi] Project proposal: MindPixel 2
On 1/18/07, Charles D Hixson [EMAIL PROTECTED] wrote: Joel Pitt wrote: * I think such a project should make the data public domain. Ignore silly ideas like giving out shares in the knowledge or whatever. It just complicates things. If the project is really strapped for cash later, then either use ad revenue or look for research funding (although I don't see much cost except for initial development of the system and web hosting). ... Making this proprietary and expecting shares to translate into cash would indeed be a silly approach. ... It's fine to talk about making the data public domain, but that's not a good idea. There are arguments in favor of BSD, MIT, GPL, LGPL, etc. licenses. For this kind of activity I can see either BSD or MIT as easily defensible. (Personally I'd use LGPL, but then if I were using it, I'd want the whole application to be GPL. I might not be able to achieve it, but that's what I'd want.) Public domain wouldn't be one of the possibilities that I would consider. The Artistic license is about as close to that as I would want to come...and the MIT license is probably a better choice for those purposes. I think a project like this one requires substantial effort, so people would need to be paid to do some of the work (programming, interface design, etc), especially if we want to build a high quality knowledgebase. If we make it free then a likely outcome is that we get a lot of noise but very few people actually contribute. I'm not an academic (left uni a couple of years ago) so I can't get academic funding for this. If I can't start an AI business I'd have to give up AI as a career entirely. I hope you can understand these circumstances. YKY
Re: [agi] Project proposal: MindPixel 2
--- Stephen Reed [EMAIL PROTECTED] wrote: I worked at Cycorp when the FACTory game was developed. The examples below do not reveal Cyc's knowledge of the assertions connecting these disparate concepts; rather, most show that the argument constraints of the compared terms are overly generalized. The exception is the example Most BTU dozer blades are wider than most T-64 medium tanks. in which both concepts are specializations of Platform-Military. Download and examine concepts in OpenCyc and Cyc's world model (or lack thereof by your standards) will be readily apparent. You need ResearchCyc, which has no license fee for research purposes, in order to evaluate its language model. -Steve Thanks. I did take another look at Cyc, at least this talk by Lenat at Google. http://video.google.com/videoplay?docid=-7704388615049492068 In spite of Cyc's lack of success at AGI (so far), it is still the biggest repository of common sense knowledge. He explains how Cyc had tried machine learning approaches to acquiring such knowledge and why they failed. They knew early on that it would require a 1000 person-year effort to develop the knowledge base and proceeded anyway. Cyc has 3.2 million assertions, 300,000 concepts and 16,000 relations (is-a, contains, etc). They tried very hard to simplify the knowledge base, to keep these numbers small. Cyc is planning a Web interface to its knowledge base. If they make something useful, a 1000 person-year effort is nothing. Lenat briefly mentions the goal of Sergey (one of Google's founders) of solving AI by 2020. I think if Google and Cyc work together on this, they will succeed. - Original Message From: Matt Mahoney [EMAIL PROTECTED] To: agi@v2.listbox.com Sent: Sunday, January 14, 2007 3:14:07 PM Subject: Re: [agi] Project proposal: MindPixel 2 --- Gabriel R [EMAIL PROTECTED] wrote: Also, if you can think of any way to turn the knowledge-entry process into a fun game or competition, go for it. 
I've been told by a few people working on similar projects that making the knowledge-providing process engaging and fun for visitors ended up being a lot more important (and difficult) than they'd expected. Cyc has a game like this called FACTory at http://www.cyc.com/ Its purpose is to help refine its knowledge base. It presents statements and asks you to rate them as true, false, don't know or doesn't make sense. For example: - Most shirts are heavier than most appendixes. - Pages are typically located in HVAC Chem Bio facilities. - Terminals are typically located in studies. - People perform or are involved in paying a mortgage more frequently than they perform or are involved in overbearing. - Most BTU dozer blades are wider than most T-64 medium tanks. The game exposes Cyc's shortcomings pretty quickly. Cyc seems to lack a world model and a language model. Sentences seem to be constructed by relating common properties of unrelated objects. The set of common properties is fairly small: size, weight, cost, frequency (for events), containment, etc. There does not seem to be any sense that Cyc understands the purpose or function of objects. The result is that context is no help in disambiguating terms that have more than one meaning, such as appendix, page, or terminal. A language model would allow a more natural grammar, such as People pay mortgages more often than they are overbearing. This example also exposes the fallacy of logical inference. Inference allows you to draw conclusions such as this, but why would you? Inference is not a good model of human thought. A good model would compare related objects. It might ask instead whether people make mortgage payments more frequently than they receive paychecks. The game gives no hint that Cyc understands such relations. Cyc has millions of hand coded assertions. It has taken over 20 years to get this far, and it seems we are not even close. 
This seems to be a problem with every knowledge representation based on labeled graphs (frame-slot, first order logic, connectionist, expert system, etc). Using English words to label the elements of your data structure does not substitute for a language model. Also, this labeling tempts you to examine and update the knowledge manually. We should know by now that there is just too much data to do this. -- Matt Mahoney, [EMAIL PROTECTED]
Re: [agi] Project proposal: MindPixel 2
Matt Mahoney wrote: [...] Lenat briefly mentions Sergey's (one of Google's founders) goal of solving AI by 2020. FWIW I solved AI theory-wise in 1979 and software-wise in 2007. http://mind.sourceforge.net/Mind.html and http://www.scn.org/~mentifex/jsaimind.html and http://visitware.com/AI4U/jsaimind.html are True AI demo versions. I think if Google and Cyc work together on this, they will succeed. The Mentifex solution to AI is messy. About thirty parameters of AI have been orchestrated and coordinated to produce a minimal thinking artificial Mind. What the late Christopher McKinstry and the late Pushpinder Singh tried to achieve in their web-mind (pace Ben G :-) programs can be achieved, albeit messily, in Mind.html or in http://mind.sourceforge.net/mind4th.html (lagging behind Mind.html) either by hard-coding a minimal subject-verb-object KB (as I did) or by data-entry when users teach the artificial Mind new facts. On another note, something which may alarm our fellow list members, I am thinking of replacing the Terminate exit from Mind.html with a [ ] Death check-box that will pop up a plea for mercy, with an ethical user-decision to be made about AI life or death. If the Mentifex AI programs Mind.html [AI-Complete] and Mind.Forth have truly solved AI, the open-access Site Meter logs will reveal an enormous rush to fetch the free AI source code. That escalation has not happened yet, but you are all welcome to click on Site Meter and see such curious visit logs as the following example from a few days ago, which was apparently made to a local copy of a Mentifex page: Visit Number: 190,585; Domain Name: senate.gov (United States Government); IP Address: 156.33.25.# (U.S. Senate Sergeant at Arms); ISP: U.S. Senate Sergeant at Arms; Location: North America, United States, District of Columbia, Washington; Lat/Long: 38.8933, -77.0146; Language: unknown; Operating System: Microsoft WinXP; Browser: Internet Explorer 6.0, Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; InfoPath.1; .NET CLR 2.0.50727); Javascript: disabled; Time of Visit: Jan 12 2007 5:40:01 pm; Last Page View: Jan 12 2007 5:40:01 pm; Visit Length: 0 seconds; Page Views: 1; Referring URL: unknown; Time Zone: unknown; Visitor's Time: unknown.
Re: [agi] Project proposal: MindPixel 2
Uh... for those of you who are unfamiliar with Mentifex, there's an FAQ on him here: http://www.nothingisreal.com/mentifex_faq.html On 1/15/07, A. T. Murray [EMAIL PROTECTED] wrote: [snip]
Re: [agi] Project proposal: MindPixel 2
I unfortunately don't have time to read through the entire thread right now, but a potential alternative is to do something like Luis von Ahn's Verbosity, which uses a webgame to collect common-sense facts: http://www.cs.cmu.edu/~biglou/Verbosity.pdf On 1/13/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: I'm considering this idea: build a repository of facts/rules in FOL (or Prolog) format, similar to Cyc's. For example water is wet, oil is slippery, etc. The repository is structureless, in the sense that it is just a collection of simple statements. It can serve as raw material for other AGIs, not only mine (although it is especially suitable for my system). One thing that is different from MindPixel 1 is that we can allow Prolog-like rules with variables, for example if X has wings then X can fly etc. I can build a translator to translate simple English sentences to logic statements. We can then solicit internet users to help type in these facts. The resulting knowledgebase would be owned by the community, with those who contribute more facts getting more shares. Also there can be a human cross-validation mechanism to prevent people from typing nonsense. We can also absorb Cyc's current knowledge so the effort will not be duplicative. What do you think of this? YKY
Re: [agi] Project proposal: MindPixel 2
On 1/14/07, Chuck Esterbrook [EMAIL PROTECTED] wrote: * Would it support separate domains/modules? I didn't realize the importance of this point at first. Indeed, what we regard as common sense may be highly subjective as it involves matters such as human values, ideology or religion. So the differentiation of subsets is desirable. We may maintain a core body that is really uncontroversial (eg everyday physics), and then let users create their own personalities as additional modules / communities. YKY
Re: [agi] Project proposal: MindPixel 2
I think all these are excellent suggestions. On 13/01/07, Joel Pitt [EMAIL PROTECTED] wrote: Some comments/suggestions: * I think such a project should make the data public domain. Ignore silly ideas like giving out shares in the knowledge or whatever. It just complicates things. If the project is really strapped for cash later, then either use ad revenue or look for research funding (although I don't see much cost except for initial development of the system and web hosting). * Whenever people want to add a new statement, have them evaluate two existing statements as well. Don't make the evaluation true/false, use a slider so the user can decide how true it is (even better, have an xy chart with one axis true/false and the other how sure the user is - this would be useful in the case of some obscure fact on quantum physics since not all of us have the answer). * Emphasize the community aspect of the database. Allow people to have profiles and list the number of statements evaluated and submitted (also how true the statements they submit are judged). Allow people to form teams. Allow teams to extract a subset of the data which represents only the facts they've submitted and evaluated (perhaps this could be an extra feature available to sponsors?) * Although Lojban would be great to use, not many people are proficient in it (relative to English). We could be idealistic and suggest that everyone learn Lojban before submitting statements, but that would just shrink the user base and kill the community aspect. An alternative might be to allow statements in both languages to be submitted (Hell, why not allow ANY language as long as it is tagged with what language it is). * An idea for keeping the community alive would be to focus on a particular topic each week, and run competitions between teams/individuals and award stars to their profile or something. 
* Instead of making people come up with brand new statements every time, have a mode where the system randomly selects phrases from somewhere like Wikipedia (sometimes this will produce stupid statements, so allow the user to indicate as much). I think it could be done and made quite fun. Don't just focus on the AI guys, most of us don't have that much spare time. Focus on the bored-at-work market. Actually going through and thinking about this has made me quite enthused about it. Keep me posted on how it pans out. If I didn't have 10 other projects and my PhD to do I'd volunteer to code it. -- -Joel Unless you try to do something beyond what you have mastered, you will never grow. -C.R. Lawton
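Joel's two-axis evaluation idea (one slider for how true a statement is, another for how sure the rater is) can be sketched as a simple data structure. This is only an illustration of one way such ratings might be aggregated; the field names and the certainty-weighting scheme are assumptions, not part of his proposal.

```python
# Sketch of a two-axis statement rating: each evaluation records how
# true the user thinks a statement is and how sure they are, both in
# [0, 1]. Certainty weights the truth average, so unsure ratings on
# obscure facts count for less.

from dataclasses import dataclass, field

@dataclass
class Statement:
    text: str
    ratings: list = field(default_factory=list)  # (truth, certainty) pairs

    def add_rating(self, truth, certainty):
        self.ratings.append((truth, certainty))

    def consensus(self):
        """Certainty-weighted mean truth, plus total weight as support."""
        weight = sum(c for _, c in self.ratings)
        if weight == 0:
            return None, 0.0
        truth = sum(t * c for t, c in self.ratings) / weight
        return truth, weight

s = Statement("Water is wet")
s.add_rating(1.0, 0.9)   # very true, very sure
s.add_rating(0.8, 0.5)   # mostly true, half sure
print(s.consensus())     # roughly (0.93, 1.4)
```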
Re: [agi] Project proposal: MindPixel 2
--- Gabriel R [EMAIL PROTECTED] wrote: Also, if you can think of any way to turn the knowledge-entry process into a fun game or competition, go for it. I've been told by a few people working on similar projects that making the knowledge-providing process engaging and fun for visitors ended up being a lot more important (and difficult) than they'd expected. Cyc has a game like this called FACTory at http://www.cyc.com/ Its purpose is to help refine its knowledge base. It presents statements and asks you to rate them as true, false, don't know or doesn't make sense. For example: - Most shirts are heavier than most appendixes. - Pages are typically located in HVAC Chem Bio facilities. - Terminals are typically located in studies. - People perform or are involved in paying a mortgage more frequently than they perform or are involved in overbearing. - Most BTU dozer blades are wider than most T-64 medium tanks. The game exposes Cyc's shortcomings pretty quickly. Cyc seems to lack a world model and a language model. Sentences seem to be constructed by relating common properties of unrelated objects. The set of common properties is fairly small: size, weight, cost, frequency (for events), containment, etc. There does not seem to be any sense that Cyc understands the purpose or function of objects. The result is that context is no help in disambiguating terms that have more than one meaning, such as appendix, page, or terminal. A language model would allow a more natural grammar, such as People pay mortgages more often than they are overbearing. This example also exposes the fallacy of logical inference. Inference allows you to draw conclusions such as this, but why would you? Inference is not a good model of human thought. A good model would compare related objects. It might ask instead whether people make mortgage payments more frequently than they receive paychecks. The game gives no hint that Cyc understands such relations. Cyc has millions of hand coded assertions. 
It has taken over 20 years to get this far, and it seems we are not even close. This seems to be a problem with every knowledge representation based on labeled graphs (frame-slot, first order logic, connectionist, expert system, etc). Using English words to label the elements of your data structure does not substitute for a language model. Also, this labeling tempts you to examine and update the knowledge manually. We should know by now that there is just too much data to do this. -- Matt Mahoney, [EMAIL PROTECTED]
Re: [agi] Project proposal: MindPixel 2
Another way to group the data might be to tease it out into dimensions of what, where, when and whom. There does seem to be some neurological evidence for this kind of categorization. Also, indexing the data along these lines allows you, to some extent, to make meaningful interpolations from similar but non-identical situations, or to imagine situations which are vaguely plausible based upon your past experience. On 14/01/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: On 1/14/07, Chuck Esterbrook [EMAIL PROTECTED] wrote: * Would it support separate domains/modules? I didn't realize the importance of this point at first. Indeed, what we regard as common sense may be highly subjective as it involves matters such as human values, ideology or religion. So the differentiation of subsets is desirable. We may maintain a core body that is really uncontroversial (eg everyday physics), and then let users create their own personalities as additional modules / communities. YKY
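The what/where/when/whom indexing suggested above can be sketched as a small inverted index; querying on a subset of dimensions retrieves similar but non-identical situations. The structure and all names here are illustrative assumptions, not anything proposed in the thread.

```python
# Sketch of indexing knowledge along what/where/when/whom dimensions.
# Each dimension maps a value to the set of event ids carrying it, so
# partial queries (any subset of dimensions) retrieve related events.

from collections import defaultdict

DIMS = ("what", "where", "when", "whom")
index = {dim: defaultdict(set) for dim in DIMS}
events = []

def add_event(what, where, when, whom):
    eid = len(events)
    events.append(dict(zip(DIMS, (what, where, when, whom))))
    for dim, value in events[eid].items():
        index[dim][value].add(eid)
    return eid

def query(**dims):
    """Intersect the posting sets of each specified dimension."""
    sets = [index[d][v] for d, v in dims.items()]
    return set.intersection(*sets) if sets else set()

add_event("breakfast", "kitchen", "morning", "alice")
add_event("breakfast", "cafe", "morning", "bob")
print(query(what="breakfast", when="morning"))  # {0, 1}: both match
print(query(whom="alice"))                      # {0}
```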
Re: [agi] Project proposal: MindPixel 2
I worked at Cycorp when the FACTory game was developed. The examples below do not reveal Cyc's knowledge of the assertions connecting these disparate concepts; rather, most show that the argument constraints of the compared terms are overly generalized. The exception is the example Most BTU dozer blades are wider than most T-64 medium tanks. in which both concepts are specializations of Platform-Military. Download and examine concepts in OpenCyc and Cyc's world model (or lack thereof by your standards) will be readily apparent. You need ResearchCyc, which has no license fee for research purposes, in order to evaluate its language model. -Steve - Original Message From: Matt Mahoney [EMAIL PROTECTED] To: agi@v2.listbox.com Sent: Sunday, January 14, 2007 3:14:07 PM Subject: Re: [agi] Project proposal: MindPixel 2 --- Gabriel R [EMAIL PROTECTED] wrote: Also, if you can think of any way to turn the knowledge-entry process into a fun game or competition, go for it. I've been told by a few people working on similar projects that making the knowledge-providing process engaging and fun for visitors ended up being a lot more important (and difficult) than they'd expected. Cyc has a game like this called FACTory at http://www.cyc.com/ Its purpose is to help refine its knowledge base. It presents statements and asks you to rate them as true, false, don't know or doesn't make sense. For example: - Most shirts are heavier than most appendixes. - Pages are typically located in HVAC Chem Bio facilities. - Terminals are typically located in studies. - People perform or are involved in paying a mortgage more frequently than they perform or are involved in overbearing. - Most BTU dozer blades are wider than most T-64 medium tanks. The game exposes Cyc's shortcomings pretty quickly. Cyc seems to lack a world model and a language model. Sentences seem to be constructed by relating common properties of unrelated objects. 
The set of common properties is fairly small: size, weight, cost, frequency (for events), containment, etc. There does not seem to be any sense that Cyc understands the purpose or function of objects. The result is that context is no help in disambiguating terms that have more than one meaning, such as appendix, page, or terminal. A language model would allow a more natural grammar, such as People pay mortgages more often than they are overbearing. This example also exposes the fallacy of logical inference. Inference allows you to draw conclusions such as this, but why would you? Inference is not a good model of human thought. A good model would compare related objects. It might ask instead whether people make mortgage payments more frequently than they receive paychecks. The game gives no hint that Cyc understands such relations. Cyc has millions of hand coded assertions. It has taken over 20 years to get this far, and it seems we are not even close. This seems to be a problem with every knowledge representation based on labeled graphs (frame-slot, first order logic, connectionist, expert system, etc). Using English words to label the elements of your data structure does not substitute for a language model. Also, this labeling tempts you to examine and update the knowledge manually. We should know by now that there is just too much data to do this. -- Matt Mahoney, [EMAIL PROTECTED]
[agi] Project proposal: MindPixel 2
I'm considering this idea: build a repository of facts/rules in FOL (or Prolog) format, similar to Cyc's. For example, "water is wet", "oil is slippery", etc. The repository is structureless, in the sense that it is just a collection of simple statements. It can serve as raw material for other AGIs, not only mine (although it is especially suitable for my system). One thing that is different from MindPixel 1 is that we can allow Prolog-like rules with variables, for example "if X has wings then X can fly", etc. I can build a translator to translate simple English sentences to logic statements. We can then solicit internet users to help type in these facts. The resulting knowledge base would be owned by the community, with those who contribute more facts getting more shares. Also, there can be a human cross-validation mechanism to prevent people from typing nonsense. We can also absorb Cyc's current knowledge so the effort will not be duplicative. What do you think of this? YKY
Re: [agi] Project proposal: MindPixel 2
How do you plan to represent water is wet? Pei On 1/13/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: I'm considering this idea: build a repository of facts/rules in FOL (or Prolog) format, similar to Cyc's. For example water is wet, oil is slippery, etc. The repository is structureless, in the sense that it is just a collection of simple statements. It can serve as raw material for other AGIs, not only mine (although it is especially suitable for my system). One thing that is different from MindPixel 1 is that we can allow Prolog-like rules with variables, for example if X has wings then X can fly etc. I can build a translator to translate simple English sentences to logic statements. We can then solicit internet users to help type in these facts. The resulting knowledgebase would be owned by the community, with those who contribute more facts getting more shares. Also there can be a human cross-validation mechanism to prevent people from typing nonsense. We can also absorb Cyc's current knowledge so the effort will not be duplicative. What do you think of this? YKY
Re: [agi] Project proposal: MindPixel 2
On 1/14/07, Pei Wang [EMAIL PROTECTED] wrote: How do you plan to represent water is wet? Pei Well, we need to agree on some conventions. A pretty standard way is: Is(water,wet). But the one I use in my system is: R(is, water, wet) where R is a generic predicate representing a relation. I mean, we can choose a format that is as natural as possible or pleases most people. You can easily translate it to your native form.
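YKY's point that one convention can be mechanically translated into another is easy to make concrete. The sketch below (Python, purely for illustration; the Is/R predicate names come from the email, everything else is invented) encodes both forms as tuples and converts between them:

```python
# Two hypothetical encodings of "water is wet", following the conventions
# in the email above. These names are the thread's, not any real library's.

# Convention 1: the relation is the predicate name.
fact_standard = ("Is", "water", "wet")

# Convention 2 (YKY's): a single generic predicate R whose first
# argument names the relation.
fact_generic = ("R", "is", "water", "wet")

def to_generic(fact):
    """Translate the standard form into the generic-R form."""
    predicate, *args = fact
    return ("R", predicate.lower(), *args)

assert to_generic(fact_standard) == fact_generic
```

Because the translation is a pure syntactic rewrite, the choice of convention really is, as YKY says, mostly a matter of taste for the entry format.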
Re: [agi] Project proposal: MindPixel 2
Actually it doesn't matter what convention you use. You could simply have an entry box on the screen, with a prompt saying "please type a short statement that you believe to be either true or false". Some parsing can do the rest. To avoid getting too verbose, simply restrict the maximum number of words which may be used in the sentence. The key point about Mindpixel is the cross validation, to calculate a probability or coherence value, with the ultimate aim of mapping out an average human belief system - something which has never really been done before. If Mindpixel does get revived I think it should be an open source project, with the results available to everyone. The idea of doing this on a commercial basis with the issuing of shares turned out not to be viable. This kind of effort is a long term thing, unlikely to return profits for shareholders within a few years. On 13/01/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: On 1/14/07, Pei Wang [EMAIL PROTECTED] wrote: How do you plan to represent water is wet? Pei Well, we need to agree on some conventions. A pretty standard way is: Is(water,wet). But the one I use in my system is: R(is, water, wet) where R is a generic predicate representing a relation. I mean, we can choose a format that is as natural as possible or pleases most people. You can easily translate it to your native form.
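The cross-validation Bob describes (many raters per statement, aggregated into a probability/coherence value) might look roughly like the following sketch. This is only an illustration of the idea, not MindPixel's actual scoring algorithm; all names are invented:

```python
# Sketch of MindPixel-style cross-validation: each statement accumulates
# true/false votes, and its "coherence" is the fraction of raters who
# agree it is true.
from collections import defaultdict

votes = defaultdict(lambda: [0, 0])  # statement -> [true_count, false_count]

def rate(statement, is_true):
    votes[statement][0 if is_true else 1] += 1

def coherence(statement):
    t, f = votes[statement]
    return t / (t + f) if (t + f) else 0.5  # unrated -> maximal uncertainty

rate("water is wet", True)
rate("water is wet", True)
rate("water is wet", False)
print(round(coherence("water is wet"), 3))  # 2 of 3 raters agree -> 0.667
```

A real deployment would also need rater weighting and nonsense filtering, but even this trivial average already yields the "average human belief" value the post is after.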
Re: [agi] Project proposal: MindPixel 2
On 1/13/07, Bob Mottram [EMAIL PROTECTED] wrote: Actually it doesn't matter what convention you use. You could simply have an entry box on the screen, with a prompt saying please type a short statement that you believe to be either true or false. Some parsing can do the rest. To avoid getting too verbose simply restrict the maximum number of words which may be used in the sentence. The key point about Mindpixel is the cross validation, to calculate a probability or coherence value, with the ultimate aim of mapping out an average human belief system - something which has never really been done before. This answers my question about the comparison with Cyc. It's been a while since I read up on Mindpixel and now it's coming back to me. If Mindpixel does get revived I think it should be an open source project, with the results available to everyone. The idea of doing this on a commercial basis with the issuing of shares turned out not to be viable. This kind of effort is a long term thing, unlikely to return profits for shareholders within a few years. Shares aren't the only means to structure a business. One possibility is revenue and/or profit sharing based on a formula embedded in a contract. We could probably think of more ways if we spent the time, though I admit I believe all of them will be complex. -Chuck
Re: [agi] Project proposal: MindPixel 2
On 1/14/07, Pei Wang [EMAIL PROTECTED] wrote: Well, we need to agree on some conventions. A pretty standard way is: Is(water,wet). In the standard way of knowledge representation, a constant is either a predicate name or an individual name. A mass noun like water is neither. There is no consensus on how to represent it yet. In the above formula, water is a constant. I don't see why it cannot be a mass noun. Predicate logic does not restrict constants/variables in any way; they can be any abstract or concrete concept. I guess you will represent water is liquid as R(is, water, liquid), right? Are you going to somehow show the difference between a noun like liquid and an adjective like wet? Are you going to somehow show the difference between an uncountable noun like liquid and a countable noun like table? One could have statements like: Noun(water) Uncountable(water) Adjective(wet) etc., but they are again conventions. I'm afraid that there is no format that is as natural as possible or pleases most people. Well, indeed you're right. It seems that either we have to agree on some arbitrary format, or just leave it as English (perhaps parsed and disambiguated). YKY
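The lexical-typing convention Pei raises (Noun(water), Uncountable(water), Adjective(wet)) can be prototyped by storing such typing statements alongside the content facts, so a consumer can sanity-check new assertions. This is only an illustrative sketch of the convention discussed in the thread, not code from any real system:

```python
# Typing statements stored as (type, word) pairs next to the content facts.
lexicon = {
    ("Noun", "water"), ("Uncountable", "water"),
    ("Noun", "table"), ("Countable", "table"),
    ("Adjective", "wet"),
}

def well_typed(subject, attribute):
    """Check an 'X is Y' fact: requires Noun(X) and Adjective(Y)."""
    return ("Noun", subject) in lexicon and ("Adjective", attribute) in lexicon

assert well_typed("water", "wet")       # "water is wet" passes
assert not well_typed("wet", "water")   # reversed arguments are rejected
```

As Pei says, these are still just conventions, but making them explicit lets the repository reject ill-formed entries mechanically instead of relying on human reviewers alone.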
Re: [agi] Project proposal: MindPixel 2
On 1/14/07, Bob Mottram [EMAIL PROTECTED] wrote: If Mindpixel does get revived I think it should be an open source project, with the results available to everyone. The idea of doing this on a commercial basis with the issuing of shares turned out not to be viable. This kind of effort is a long term thing, unlikely to return profits for shareholders within a few years. Using MindPixel shares makes the project more interesting to people, as they can look at how much they have contributed. It doesn't matter if the database doesn't turn a profit. In the long term, you just can't tell. I think it's reasonable to let the community own the product. That will motivate people to participate too. YKY
Re: [agi] Project proposal: MindPixel 2
Well, I don't really think that Mindpixel shares are going to be a big motivator for folks to encode knowledge, in the current economic/business climate. The original mindpixel shares were part of the dot-com-era mentality, I think. But I think the key point is whether there is a commitment to make and keep the DB open to everyone (as was not the case with mindpixel). I think folks contributing to the DB will want to know for sure that the knowledge they contribute will be open to all, rather than restricted to certain customers who are willing to pay... -- Ben G On 1/13/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: On 1/14/07, Bob Mottram [EMAIL PROTECTED] wrote: If Mindpixel does get revived I think it should be an open source project, with the results available to everyone. The idea of doing this on a commercial basis with the issuing of shares turned out not to be viable. This kind of effort is a long term thing, unlikely to return profits for shareholders within a few years. Using MindPixel shares makes the project more interesting to people, as they can look at how much they have contributed. It doesn't matter if the database doesn't turn a profit. In the long term, you just can't tell. I think it's reasonable to let the community own the product. That will motivate people to participate too. YKY
Re: [agi] Project proposal: MindPixel 2
On 1/14/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: I'm considering this idea: build a repository of facts/rules in FOL (or Prolog) format, similar to Cyc's. For example water is wet, oil is slippery, etc. The repository is structureless, in the sense that it is just a collection of simple statements. It can serve as raw material for other AGIs, not only mine (although it is especially suitable for my system). Some comments/suggestions:
* I think such a project should make the data public domain. Ignore silly ideas like giving out shares in the knowledge or whatever. It just complicates things. If the project is really strapped for cash later, then either use ad revenue or look for research funding (although I don't see much cost except for initial development of the system and web hosting).
* Whenever people want to add a new statement, have them evaluate two existing statements as well. Don't make the evaluation true/false; use a slider so the user can decide how true it is (even better, have an xy chart with one axis true/false and the other how sure the user is - this would be useful in the case of some obscure fact on quantum physics, since not all of us have the answer).
* Emphasize the community aspect of the database. Allow people to have profiles and list the number of statements evaluated and submitted (also how true the statements they submit are judged). Allow people to form teams. Allow teams to extract a subset of the data which represents only the facts they've submitted and evaluated (perhaps this could be an extra feature available to sponsors?)
* Although Lojban would be great to use, not many people are proficient in it (relative to English); we could be idealistic and suggest that everyone learn Lojban before submitting statements, but that would just shrink the user base and kill the community aspect.
An alternative might be to allow statements in both languages to be submitted (hell, why not allow ANY language as long as it is tagged with what language it is).
* An idea for keeping the community alive would be to focus on a particular topic each week, and run competitions between teams/individuals and award stars to their profile or something.
* Instead of making people come up with brand new statements every time, have a mode where the system randomly selects phrases from somewhere like Wikipedia (sometimes this will produce stupid statements, and allow the user to indicate as such).
I think it could be done and made quite fun. Don't just focus on the AI guys; most of us don't have that much spare time. Focus on the bored-at-work market. Actually, going through and thinking about this has made me quite enthused about it. Keep me posted on how it pans out. If I didn't have 10 other projects and my PhD to do I'd volunteer to code it. -- -Joel Unless you try to do something beyond what you have mastered, you will never grow. -C.R. Lawton
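Joel's two-axis rating idea (truth on one axis, the rater's confidence on the other) could feed a confidence-weighted average, along these lines. The aggregation formula and all names here are illustrative assumptions, not a worked-out design from the thread:

```python
# Each evaluation is a (truth, confidence) pair, both in [0, 1].
# The aggregate truth value is a confidence-weighted mean, so an unsure
# rater moves the result less than a confident one.

def aggregate(ratings):
    """ratings: list of (truth, confidence) pairs -> weighted truth value."""
    total_confidence = sum(conf for _, conf in ratings)
    if total_confidence == 0:
        return 0.5  # nobody was sure (or no ratings); stay agnostic
    return sum(truth * conf for truth, conf in ratings) / total_confidence

# Two fairly confident "true" votes and one very unsure "false" vote:
ratings = [(1.0, 0.9), (0.8, 0.5), (0.0, 0.1)]
print(round(aggregate(ratings), 3))  # -> 0.867
```

This also answers the quantum-physics case Joel mentions: raters who do not know the answer can submit low confidence and barely affect the score.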
Re: [agi] Project proposal: MindPixel 2
A few other considerations -- It is possible to reduce the need for after-the-fact parsing by imposing constraints on the knowledge-entry process, and this actually makes it easier for users to come up with facts. For example, the MIT Open Mind Common Sense project asked users to fill in the blanks of templates like "You would be likely to find ___ in ___", "___ can ___", etc., which the researchers then readily translated to assertions of the form LocationOf(X, Y), CapableOf(X, Y), etc. They ended up collecting quite a few facts -- see http://web.media.mit.edu/~hugo/conceptnet/. If you dig around in the zip file you'll see a few large txt files that have all the collected facts listed out -- 200,000 in the concise version, 1.6 million in the full version. It seems that either we have to agree on some arbitrary format, or just leave it as English (perhaps parsed and disambiguated). The MIT project I just mentioned did something similar to the "collect in plain English, then disambiguate afterwards" strategy (a lexically disambiguated version is also available on their site), but the fact that the arguments of their predicates are not tied to any kind of formal semantics really limits their database's utility. I would strongly recommend thinking about ways to constrain the users' input in such a way as to eliminate ambiguity from the get-go. For example, you could ask users to create true sentences of the form if X __1__ __2__ then X __3__ __4__, but force them to fill in the blanks by selecting from combo boxes that give them many predefined choices such as can, is a, mouse (the animal), mouse (for a computer), etc. This will help standardize your input and make it a lot easier to work with.
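The template-driven entry described above works because each template maps directly to a predicate, so "translation" is trivial string assembly rather than parsing. A minimal sketch; the template/predicate pairs mirror the email's examples (LocationOf, CapableOf), not the actual Open Mind code:

```python
# Each entry template maps to a predicate; the user's fillers become
# its arguments, so no natural-language parsing is needed.
TEMPLATES = {
    "You would be likely to find ___ in ___": "LocationOf",
    "___ can ___": "CapableOf",
}

def to_assertion(template, fillers):
    """Turn a filled-in template into an assertion string."""
    predicate = TEMPLATES[template]
    return "%s(%s)" % (predicate, ", ".join(fillers))

print(to_assertion("___ can ___", ["a mouse (the animal)", "squeak"]))
# -> CapableOf(a mouse (the animal), squeak)
```

Constraining the fillers to combo-box choices, as suggested above, would additionally guarantee that every argument is an unambiguous, predefined term.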
Alternatively, you could write a simple program to create thousands of randomly generated logical statements and then automatically convert them to relatively unambiguously phrased English statements (obviously far easier than converting English to logical form). Then you could put these to your web users and have them tell you how true each statement is, or ask them how they would change false statements to make them true. You could even incorporate clustering algorithms to infer which unannotated statements are most likely to be true ... lots of possibilities. Actually, going with the randomly-generated-logical-statements idea, if you used Lojban predicates as your underlying logical form, you could auto-translate them to English and essentially have your users annotating the truth-values of Lojban sentences without knowing Lojban. It's very difficult to auto-translate complex Lojban sentences into readable English (mainly because Lojban's equivalent of noun compounds are ridiculously vague), but it shouldn't be too hard to do with simple sentences. Also, if you can think of any way to turn the knowledge-entry process into a fun game or competition, go for it. I've been told by a few people working on similar projects that making the knowledge-providing process engaging and fun for visitors ended up being a lot more important (and difficult) than they'd expected. On 1/13/07, Joel Pitt [EMAIL PROTECTED] wrote: On 1/14/07, YKY (Yan King Yin) [EMAIL PROTECTED] wrote: I'm considering this idea: build a repository of facts/rules in FOL (or Prolog) format, similar to Cyc's. For example water is wet, oil is slippery, etc. The repository is structureless, in the sense that it is just a collection of simple statements. It can serve as raw material for other AGIs, not only mine (although it is especially suitable for my system). Some comments/suggestions: * I think such a project should make the data public domain. 
Ignore silly ideas like giving out shares in the knowledge or whatever. It just complicates things. If the project is really strapped for cash later, then either use ad revenue or look for research funding (although I don't see much cost except for initial development of the system and web hosting). * Whenever people want to add a new statement, have them evaluate two existing statements as well. Don't make the evaluation true/false, use a slider so the user can decide how true it is (even better, have an xy chart with one axis true/false and the other how sure the user is - this would be useful in the case of some obscure fact on quantum physics since not all of us have the answer). * Emphasize the community aspect of the database. Allow people to have profiles and list the number of statements evaluated and submitted (also how true the statements they submit are judged). Allow people to form teams. Allow teams to extract a subset of the data which represents only the facts they've submitted and evaluated (perhaps this could be an extra feature available to sponsors?) * Although Lojban would be great to use, not many people are proficient in it (relative to English), we could be idealistic and suggest that everyone learn