Re: [agi] just a thought

2009-01-14 Thread Pei Wang
On Wed, Jan 14, 2009 at 4:40 PM, Joshua Cowan  wrote:
> Is having a strong sense of self one aspect of "mature enough"?

I meant something more basic --- you need to have an individual system
complete and running, before you can have a society of individuals.

> Also, Dr. Wang, do you see this as a primary way for teaching empathy.

Yes, as well as everything else that depends on social experience.

> I believe Ben
> has written about hardwiring the desire to work with other agents as a
> possible means of encouraging empathy. Do you agree with this approach
> and/or have other ideas for encouraging empathy (assuming you see empathy as
> a good goal)?

It is too big a topic for me to explain at the moment, but you can
take my abstract at http://nars.wang.googlepages.com/gti-5 as a
starting point.

Pei

>
>> From: "Pei Wang" 
>> Reply-To: agi@v2.listbox.com
>> To: agi@v2.listbox.com
>> Subject: Re: [agi] just a thought
>> Date: Wed, 14 Jan 2009 16:21:23 -0500
>>
>> I guess something like this is in the plan of many, if not all, AGI
>> projects. For NARS, see
>> http://nars.wang.googlepages.com/wang.roadmap.pdf , under "(4)
>> Socialization" in page 11.
>>
>> It is just that to attempt any non-trivial multi-agent experiment, the
>> work in single agent needs to be mature enough. The AGI projects are
>> not there yet.
>>
>> Pei
>>
>> On Wed, Jan 14, 2009 at 4:10 PM, Valentina Poletti 
>> wrote:
>> > Cool,
>> >
>> > this idea has already been applied successfully to some areas of AI,
>> > such as
>> > ant-colony algorithms and swarm intelligence algorithms. But I was
>> > thinking
>> > that it would be interesting to apply it at a high level. For example,
>> > consider that you create the best AGI agent you can come up with and,
>> > instead of running just one, you create several copies of it (perhaps
>> > with
>> > slight variations), and you initiate each in a different part of your
>> > reality or environment for such agents, after letting them have the
>> > ability
>> > to communicate. In this way whenever one such agents learns anything
>> > meaningful he passes the information to all other agents as well, that
>> > is,
>> > it not only modifies its own policy but it also affects the other's to
>> > some
>> > extent (determined by some constant or/and by how much the other agent
>> > likes
>> > this one, that is how useful learning from it has been in the past and
>> > so
>> > on). This way not only each agent would learn much faster, but also the
>> > agents could learn to use this communication ability to their advantage
>> > to
>> > ameliorate. I just think it would be interesting to implement this, not
>> > that
>> > I am capable of right now.
>> >
>> >
>> > On Wed, Jan 14, 2009 at 2:34 PM, Bob Mottram  wrote:
>> >>
>> >> 2009/1/14 Valentina Poletti :
>> >> > Anyways my point is, the reason why we have achieved so much
>> >> > technology,
>> >> > so
>> >> > much knowledge in this time is precisely the "we", it's the union of
>> >> > several
>> >> > individuals together with their ability to communicate with one-other
>> >> > that
>> >> > has made us advance so much. In a sense we are a single being with
>> >> > millions
>> >> > of eyes, ears, hands, brains, which alltogether can create amazing
>> >> > things.
>> >> > But take any human being alone, isolate him/her from any contact with
>> >> > any
>> >> > other human being and rest assured he/she will not achieve a single
>> >> > artifact
>> >> > of technology. In fact he/she might not survive long.
>> >>
>> >>
>> >> Yes.  I think Ben made a similar point in The Hidden Pattern.  People
>> >> studying human intelligence - psychologists, psychiatrists, cognitive
>> >> scientists, etc - tend to focus narrowly on the individual brain, but
>> >> human intelligence is more of an emergent networked phenomena
>> >> populated by strange meta-entities such as archetypes and memes.  Even
>> >> the greatest individuals from the world of science or art didn't make
>> >> their achievements in a vacuum, and were influenced by earlier works.
>> >>
>> >> Years ago I was chatting with someone who was abo

Re: [agi] just a thought

2009-01-14 Thread Pei Wang
I guess something like this is in the plans of many, if not all, AGI
projects. For NARS, see
http://nars.wang.googlepages.com/wang.roadmap.pdf , under "(4)
Socialization" on page 11.

It is just that to attempt any non-trivial multi-agent experiment, the
work on a single agent needs to be mature enough. The AGI projects are
not there yet.

Pei

On Wed, Jan 14, 2009 at 4:10 PM, Valentina Poletti  wrote:
> Cool,
>
> this idea has already been applied successfully to some areas of AI, such as
> ant-colony algorithms and swarm intelligence algorithms. But I was thinking
> that it would be interesting to apply it at a high level. For example,
> consider that you create the best AGI agent you can come up with and,
> instead of running just one, you create several copies of it (perhaps with
> slight variations), and you initiate each in a different part of your
> reality or environment for such agents, after letting them have the ability
> to communicate. In this way whenever one such agents learns anything
> meaningful he passes the information to all other agents as well, that is,
> it not only modifies its own policy but it also affects the other's to some
> extent (determined by some constant or/and by how much the other agent likes
> this one, that is how useful learning from it has been in the past and so
> on). This way not only each agent would learn much faster, but also the
> agents could learn to use this communication ability to their advantage to
> ameliorate. I just think it would be interesting to implement this, not that
> I am capable of right now.
>
>
> On Wed, Jan 14, 2009 at 2:34 PM, Bob Mottram  wrote:
>>
>> 2009/1/14 Valentina Poletti :
>> > Anyways my point is, the reason why we have achieved so much technology,
>> > so
>> > much knowledge in this time is precisely the "we", it's the union of
>> > several
>> > individuals together with their ability to communicate with one-other
>> > that
>> > has made us advance so much. In a sense we are a single being with
>> > millions
>> > of eyes, ears, hands, brains, which alltogether can create amazing
>> > things.
>> > But take any human being alone, isolate him/her from any contact with
>> > any
>> > other human being and rest assured he/she will not achieve a single
>> > artifact
>> > of technology. In fact he/she might not survive long.
>>
>>
>> Yes.  I think Ben made a similar point in The Hidden Pattern.  People
>> studying human intelligence - psychologists, psychiatrists, cognitive
>> scientists, etc - tend to focus narrowly on the individual brain, but
>> human intelligence is more of an emergent networked phenomena
>> populated by strange meta-entities such as archetypes and memes.  Even
>> the greatest individuals from the world of science or art didn't make
>> their achievements in a vacuum, and were influenced by earlier works.
>>
>> Years ago I was chatting with someone who was about to patent some
>> piece of machinery.  He had his name on the patent, but was pointing
>> out that it's very difficult to be able to say exactly who made the
>> invention - who was the "guiding mind".  In this case many individuals
>> within his company had some creative input, and there was really no
>> one "inventor" as such.  I think many human-made artifacts are like
>> this.
>>
>>
>> ---
>> agi
>> Archives: https://www.listbox.com/member/archive/303/=now
>> RSS Feed: https://www.listbox.com/member/archive/rss/303/
>> Modify Your Subscription: https://www.listbox.com/member/?&;
>> Powered by Listbox: http://www.listbox.com
>
>
>
> --
> A true friend stabs you in the front. - O. Wilde
>
> Einstein once thought he was wrong; then he discovered he was wrong.
>
> For every complex problem, there is an answer which is short, simple and
> wrong. - H.L. Mencken
> 
> agi | Archives | Modify Your Subscription




Re: [agi] Relevance of SE in AGI

2008-12-21 Thread Pei Wang
At the current time, almost all AGI projects are still working on
conceptual design issues, and the systems developed are just
prototypes, so software engineering is not that relevant yet. In the
future, when most of the theoretical problems have been solved, and
especially when it becomes clear that one approach is going to lead us
to AGI, software engineering will become truly relevant.

The existing "AI applications" are not that different from just
"computer applications", for which software engineering is necessary,
but there isn't much intelligence in them.

BTW, in a sense "software engineering" is just the opposite of
"artificial intelligence": while the latter tries to make machines
work as flexibly as humans, the former tries to make humans
(programmers) work as rigidly as machines. ;-)

Pei

On Sat, Dec 20, 2008 at 8:28 PM, Valentina Poletti  wrote:
> I have a question for you AGIers.. from your experience as well as from your
> background, how relevant do you think software engineering is in developing
> AI software and, in particular AGI software? Just wondering.. does software
> verification as well as correctness proving serve any use in this field? Or
> is this something used just for Nasa and critical applications?
> Valentina
> 
> agi | Archives | Modify Your Subscription




Re: Cross-Cultural Discussion using English [WAS Re: [agi] Creativity ...]

2008-12-19 Thread Pei Wang
Richard and Ben,

If you think I, as a Chinese person, have overreacted to Mike Tintner's
writing style, and this is just a cultural difference, please let me
know. In that case I'll try my best to learn his way of communicating,
at least when talking to British and American people --- who knows, it
may even improve my marketing ability. ;-)

Pei

On Fri, Dec 19, 2008 at 7:01 PM, Ben Goertzel  wrote:
>
> And when a Chinese doesn't answer a question, it usually means "No" ;-)
>
> Relatedly, I am discussing with some US gov't people a potential project
> involving customizing an AI reasoning system to emulate the different
> inferential judgments of people from different cultures...
>
> ben
>
> On Fri, Dec 19, 2008 at 5:29 PM, Richard Loosemore 
> wrote:
>>
>> Ben Goertzel wrote:
>>>
>>> yeah ... that's not a matter of the English language but rather a matter
>>> of the American Way ;-p
>>>
>>> Through working with many non-Americans I have noted that what Americans
>>> often intend as a "playful obnoxiousness" is interpreted by non-Americans
>>> more seriously...
>>
>> Except that, in fact, Mike is not American but British.
>>
>> As a result of long experience talking to Americans, I have discovered
>> that what British people intend as routine discussion, Americans interpret
>> as serious, intentional obnoxiousness.  And then, what Americans (as you
>> say) intend as playful obnoxiousness, non-Americans interpret more
>> seriously.
>>
>>
>>
>> Richard Loosemore
>>
>>
>>
>>
>>
>>
>>
>>> I think we had some mutual colleagues in the past who favored such a
>>> style of discourse ;-)
>>>
>>> ben
>>>
>>> On Fri, Dec 19, 2008 at 1:49 PM, Pei Wang >> <mailto:mail.peiw...@gmail.com>> wrote:
>>>
>>>On Fri, Dec 19, 2008 at 1:40 PM, Ben Goertzel >><mailto:b...@goertzel.org>> wrote:
>>> >
>>> > IMHO, Mike Tintner is not often rude, and is not exactly a
>>>"troll" because I
>>> > feel he is genuinely trying to understand the deeper issues
>>>related to AGI,
>>> > rather than mainly trying to stir up trouble or cause irritation
>>>
>>>Well, I guess my English is not good enough to tell the subtle
>>>difference in tones, but his comments often sound that "You AGIers are
>>>so obviously wrong that I don't even bother to understand what you are
>>>saying ... Now let me tell you ...".
>>>
>>>I don't enjoy this tone.
>>>
>>>Pei
>>
>>
>>
>>
>> ---
>> agi
>> Archives: https://www.listbox.com/member/archive/303/=now
>> RSS Feed: https://www.listbox.com/member/archive/rss/303/
>> Modify Your Subscription: https://www.listbox.com/member/?&;
>> Powered by Listbox: http://www.listbox.com
>
>
>
> --
> Ben Goertzel, PhD
> CEO, Novamente LLC and Biomind LLC
> Director of Research, SIAI
> b...@goertzel.org
>
> "I intend to live forever, or die trying."
> -- Groucho Marx
>
> 
> agi | Archives | Modify Your Subscription




Re: [agi] Creativity and Rationality (was: Re: Should I get a PhD?)

2008-12-19 Thread Pei Wang
On Fri, Dec 19, 2008 at 1:40 PM, Ben Goertzel  wrote:
>
> IMHO, Mike Tintner is not often rude, and is not exactly a "troll" because I
> feel he is genuinely trying to understand the deeper issues related to AGI,
> rather than mainly trying to stir up trouble or cause irritation

Well, I guess my English is not good enough to tell the subtle
differences in tone, but his comments often sound like "You AGIers are
so obviously wrong that I don't even bother to understand what you are
saying ... Now let me tell you ...".

I don't enjoy this tone.

Pei


> However, I find conversing with him generally frustrating because he
> combines
>
> A)
> extremely strong intuitive opinions about AGI topics
>
> with
>
> B)
> almost utter ignorance of the details of AGI (or standard AI), or the
> background knowledge needed to appreciate these details when compactly
> communicated
>
>
> This means that discussions with Mike never seem to get anywhere... and,
> frankly, I usually regret getting into them
>
> I would find it more rewarding by far to engage in discussion with someone
> who had Mike's same philosophy and ideas (which I disagree strongly with),
> but had enough technical background to actually debate the details of AGI in
> a meaningful way
>
> For example, Selmer Bringjord (an AI expert, not on this list) seems to
> share a fair number of Mike's ideas, but discussions with him are less
> frustrating because rather than wasting time on misunderstandings, basics
> and terminology, one cuts VERY QUICKLY to the deep points of conceptual
> disagreement
>
> ben g
>
>
>
> On Fri, Dec 19, 2008 at 1:19 PM, Pei Wang  wrote:
>>
>> BillK,
>>
>> Thanks for the reminder. I didn't reply to him, but still got involved.
>> :-(
>>
>> I certainty don't want to encourage bad behaviors in this mailing
>> list. Here "bad behaviors" are not in the conclusions or arguments,
>> but in the way they are presented, as well as in the
>> politeness/rudeness toward other people.
>>
>> Pei
>>
>> On Fri, Dec 19, 2008 at 11:38 AM, BillK  wrote:
>> > On Fri, Dec 19, 2008 at 3:55 PM, Mike Tintner wrote:
>> >>
>> >> (On the contrary, Pei, you can't get more narrow-minded than rational
>> >> thinking. That's its strength and its weakness).
>> >>
>> >
>> >
>> > Pei
>> >
>> > In case you haven't noticed, you won't gain anything from trying to
>> > engage with the troll.
>> >
>> > Mike does not discuss anything. He states his opinions in many
>> > different ways, pretending to respond to those that waste their time
>> > talking to him. But no matter what points are raised in discussion
>> > with him, they will only be used as an excuse to produce yet another
>> > variation of his unchanged opinions.  He doesn't have any technical
>> > programming or AI background, so he can't understand that type of
>> > argument.
>> >
>> > He is against the whole basis of AGI research. He believes that
>> > rationality is a dead end, a dying culture, so deep-down, rational
>> > arguments mean little to him.
>> >
>> > Don't feed the troll!
>> > (Unless you really, really, think he might say something useful to you
>> > instead of just wasting your time).
>> >
>> >
>> > BillK
>> >
>> >
>> > ---
>> > agi
>> > Archives: https://www.listbox.com/member/archive/303/=now
>> > RSS Feed: https://www.listbox.com/member/archive/rss/303/
>> > Modify Your Subscription: https://www.listbox.com/member/?&;
>> > Powered by Listbox: http://www.listbox.com
>> >
>>
>>
>> ---
>> agi
>> Archives: https://www.listbox.com/member/archive/303/=now
>> RSS Feed: https://www.listbox.com/member/archive/rss/303/
>> Modify Your Subscription: https://www.listbox.com/member/?&;
>> Powered by Listbox: http://www.listbox.com
>
>
>
> --
> Ben Goertzel, PhD
> CEO, Novamente LLC and Biomind LLC
> Director of Research, SIAI
> b...@goertzel.org
>
> "I intend to live forever, or die trying."
> -- Groucho Marx
>
> 
> agi | Archives | Modify Your Subscription




Re: [agi] Creativity and Rationality (was: Re: Should I get a PhD?)

2008-12-19 Thread Pei Wang
BillK,

Thanks for the reminder. I didn't reply to him, but still got involved. :-(

I certainly don't want to encourage bad behaviors on this mailing
list. Here "bad behaviors" refers not to the conclusions or arguments,
but to the way they are presented, as well as to the
politeness/rudeness toward other people.

Pei

On Fri, Dec 19, 2008 at 11:38 AM, BillK  wrote:
> On Fri, Dec 19, 2008 at 3:55 PM, Mike Tintner wrote:
>>
>> (On the contrary, Pei, you can't get more narrow-minded than rational
>> thinking. That's its strength and its weakness).
>>
>
>
> Pei
>
> In case you haven't noticed, you won't gain anything from trying to
> engage with the troll.
>
> Mike does not discuss anything. He states his opinions in many
> different ways, pretending to respond to those that waste their time
> talking to him. But no matter what points are raised in discussion
> with him, they will only be used as an excuse to produce yet another
> variation of his unchanged opinions.  He doesn't have any technical
> programming or AI background, so he can't understand that type of
> argument.
>
> He is against the whole basis of AGI research. He believes that
> rationality is a dead end, a dying culture, so deep-down, rational
> arguments mean little to him.
>
> Don't feed the troll!
> (Unless you really, really, think he might say something useful to you
> instead of just wasting your time).
>
>
> BillK
>
>
> ---
> agi
> Archives: https://www.listbox.com/member/archive/303/=now
> RSS Feed: https://www.listbox.com/member/archive/rss/303/
> Modify Your Subscription: https://www.listbox.com/member/?&;
> Powered by Listbox: http://www.listbox.com
>




Re: [agi] Creativity and Rationality (was: Re: Should I get a PhD?)

2008-12-19 Thread Pei Wang
Agreed.

As long as a system is not purely deductive, it can be creative. What
is usually called "creative thinking" can often be analyzed as a
combination of induction, abduction, analogy, etc., as well as
deduction. When these inferences are properly justified, they are rational.
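
As an illustration only (plain Python, not code from NARS or any other
system; the example terms are made up), the syllogistic patterns usually
meant by these names can be written over statements of the form
(subject, predicate), read as "subject -> predicate":

    def deduction(p1, p2):
        # (M -> P) and (S -> M) yield (S -> P); truth-preserving
        (m1, p), (s, m2) = p1, p2
        return (s, p) if m1 == m2 else None

    def induction(p1, p2):
        # (M -> P) and (M -> S) yield (S -> P); a defeasible generalization
        (m1, p), (m2, s) = p1, p2
        return (s, p) if m1 == m2 else None

    def abduction(p1, p2):
        # (P -> M) and (S -> M) yield (S -> P); a defeasible explanation
        (p, m1), (s, m2) = p1, p2
        return (s, p) if m1 == m2 else None

    print(deduction(("bird", "flyer"), ("raven", "bird")))   # ('raven', 'flyer')
    print(induction(("raven", "black"), ("raven", "bird")))  # ('bird', 'black')

The deductive conclusion is guaranteed by its premises; the inductive and
abductive ones are only tentative, which is why they need the kind of
evidence-based justification mentioned above to count as rational.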

To treat "creative" and "rational" as opposite to each other is indeed
based on a very narrow understanding of rationality and logic.

Pei

On Fri, Dec 19, 2008 at 6:25 AM, Kaj Sotala  wrote:
> On Fri, Dec 19, 2008 at 1:47 AM, Mike Tintner  
> wrote:
>> Ben,
>>
>> I radically disagree. Human intelligence involves both creativity and
>> rationality, certainly.  But  rationality - and the rational systems  of
>> logic/maths and formal languages, [on which current AGI depends]  -  are
>> fundamentally *opposed* to creativity and the generation of new ideas.  What
>> I intend to demonstrate in a while is that just about everything that is bad
>> thinking from a rational POV is *good [or potentially good] thinking* from a
>> creative POV (and vice versa). To take a small example, logical fallacies
>> are indeed illogical and irrational - an example of rationally bad thinking.
>> But they are potentially good thinking from a creative POV -   useful
>> skills, for example, in a political spinmeister's art. (And you and Pei use
>> them a lot in arguing for your AGI's  :)).
>
> I think this example is more about needing to apply different kinds of
> reasoning rules in different domains, rather than the underlying
> reasoning process itself being different.
>
> In the domain of classical logic, if you encounter a contradiction,
> you'll want to apply a reasoning rule saying that your premises are
> inconsistent, and at least one of them needs to be eliminated or at
> least modified.
>
> In the domain of politics, if you encounter a contradiction, you'll
> want to apply a reasoning rule saying that this may come useful as a
> rhetorical argument. Note that even then, you need to apply
> rationality in order to figure out what kinds of contradictions are
> effective on your intended audience, and what kinds of contradictions
> you'll want to avoid. You can't just go around proclaiming "it is my
> birthday and it is not my birthday" and expect people to take you
> seriously.
>
> It seems to me like Mike is committing the fallacy of interpreting
> "rationality" in a too narrow way, thinking it to be something like a
> slightly expanded version of classical formal logic. That's a common
> mistake (oh, what damage Gene Roddenberry did to humanity when he
> created the character of Spock), but a mistake nonetheless.
>
> Furthermore, this currently seems to be mostly a debate over
> semantics, and the appropriate meaning of labels... if both Ben and
> Mike took the approach advocated in
> http://www.overcomingbias.com/2008/02/taboo-words.html and taboo'd
> both "rationality" and "creativity", so that e.g.
>
> rationalityBen = [a process by which ideas are verified for internal
> consistency]
> creativityBen = [a process, currently not entirely understood, by
> which new ideas are generated]
> rationalityMike = [a set of techniques such as math and logic]
> creativityMike = well, not sure of what Mike's exact definition for
> creativity *would* be
>
> then, instead of sentences like "the wider culture has always known
> that rationality and creativity are  opposed" (to quote Mike's earlier
> mail), we'd get sentences like "the wider culture has always known
> that the set of techniques of math and logic are opposed to
> creativity", which would be much easier to debate. No need to keep
> guessing what, exactly, the other person *means* with "rationality"
> and "logic"...
>
>
> ---
> agi
> Archives: https://www.listbox.com/member/archive/303/=now
> RSS Feed: https://www.listbox.com/member/archive/rss/303/
> Modify Your Subscription: https://www.listbox.com/member/?&;
> Powered by Listbox: http://www.listbox.com
>




Re: [agi] Should I get a PhD?

2008-12-17 Thread Pei Wang
As Ben said, the fact that both of us have relations with Temple
University is indeed a coincidence.

Peter enrolled in my AI course a few years ago, though I don't know
how much influence it had on him --- probably not too much.

Pei

On Wed, Dec 17, 2008 at 11:22 AM, Joshua Fox  wrote:
> About graduate programs and AGI: It seems that Temple University has an
> affinity for AGI people--Ben Goertzel, Pei Wang, and now Peter de Blanc. Is
> this just a coincidence?
> Joshua
>
> On Wed, Dec 17, 2008 at 5:48 PM, Ben Goertzel  wrote:
>>
>>
>>
>>>
>>>
>>> Can I start the PhD directly without getting the MS first?
>>
>>
>> You can start a PhD without having an MS first, but you'll still need to
>> take all the coursework corresponding to the MS
>>
>> I don't personally know of any university that lets you go directly from a
>> BS/BA to a PhD without doing a couple years of coursework first [whether the
>> coursework takes the role of a MS, or just coursework taken in the process
>> of getting the PhD]
>>
>> And I think this makes sense!  The PhD is supposed to indicate that you
>> have broadly-based expertise in a field, as well as capability to do
>> independent research...
>>
>> **Possibly** you could convince some university to let you "test out" of
>> coursework if you could ace the final exams of all the MS-level courses, but
>> I never actually heard of this occurring...
>>
>> ben
>>
>> 
>> agi | Archives | Modify Your Subscription
>
> 
> agi | Archives | Modify Your Subscription




Re: [agi] Should I get a PhD?

2008-12-17 Thread Pei Wang
YKY,

As many people have said, US universities usually have coursework
requirements for the PhD.

Furthermore, in most US universities the application deadline for 2009
has passed or will pass soon --- our deadline is Dec. 15.

Therefore, you'd better consider Europe.

Pei

On Wed, Dec 17, 2008 at 8:30 AM, YKY (Yan King Yin)
 wrote:
> Hi group,
>
> I'm considering getting a PhD somewhere, and I've accumulated some
> material for a thesis in my 50%-finished AGI book.  I think getting a
> PhD will put my work in a more rigorous form and get it published.
> Also it may help me get funding afterwards, either in academia or in
> the business world.
>
> I want to maximize the time spent on my thesis while minimizing time
> spent on other 'coursework' (ie things that aren't directly related to
> my project, exams, classes, homework, etc).  Which universities should
> I look at?  Or should I contact some professors directly?
>
> Thanks! =)
> YKY
>
>
> ---
> agi
> Archives: https://www.listbox.com/member/archive/303/=now
> RSS Feed: https://www.listbox.com/member/archive/rss/303/
> Modify Your Subscription: https://www.listbox.com/member/?&;
> Powered by Listbox: http://www.listbox.com
>




[agi] an interesting article on "Body Swapping"

2008-12-03 Thread Pei Wang
If I Were You: Perceptual Illusion of Body Swapping

Valeria I. Petkova*, H. Henrik Ehrsson

Department of Neuroscience, Karolinska Institutet, Stockholm, Sweden

Abstract
The concept of an individual swapping his or her body with that of
another person has captured the imagination of writers and artists for
decades. Although this topic has not been the subject of investigation
in science, it exemplifies the fundamental question of why we have an
ongoing experience of being located inside our bodies. Here we report
a perceptual illusion of body-swapping that addresses directly this
issue. Manipulation of the visual perspective, in combination with the
receipt of correlated multisensory information from the body was
sufficient to trigger the illusion that another person's body or an
artificial body was one's own. This effect was so strong that people
could experience being in another person's body when facing their own
body and shaking hands with it. Our results are of fundamental
importance because they identify the perceptual processes that produce
the feeling of ownership of one's body.

http://www.plosone.org/article/info:doi/10.1371/journal.pone.0003832




Re: [agi] Mushed Up Decision Processes

2008-11-30 Thread Pei Wang
Stephen,

Does that mean that what you did at Cycorp on transfer learning is
similar to what Taylor presented at AGI-08?

Pei

On Sun, Nov 30, 2008 at 1:01 PM, Stephen Reed <[EMAIL PROTECTED]> wrote:
> Matt Taylor was also an intern at Cycorp where was on Cycorp's Transfer
> Learning team with me.
> -Steve
>
> Stephen L. Reed
>
> Artificial Intelligence Researcher
> http://texai.org/blog
> http://texai.org
> 3008 Oak Crest Ave.
> Austin, Texas, USA 78704
> 512.791.7860
>
> 
> From: Pei Wang <[EMAIL PROTECTED]>
> To: agi@v2.listbox.com
> Sent: Sunday, November 30, 2008 10:48:59 AM
> Subject: Re: [agi] Mushed Up Decision Processes
>
> On Sun, Nov 30, 2008 at 11:17 AM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>> There was a DARPA program on "transfer learning" a few years back ...
>> I believe I applied and got rejected (with perfect marks on the
>> technical proposal, as usual ...) ... I never checked to see who got
>> the $$ and what they did with it...
>
> See http://www.cs.utexas.edu/~mtaylor/Publications/AGI08-taylor.pdf
>
> Pei
>
>> ben g
>>
>> On Sun, Nov 30, 2008 at 11:12 AM, Philip Hunt <[EMAIL PROTECTED]>
>> wrote:
>>> 2008/11/30 Ben Goertzel <[EMAIL PROTECTED]>:
>>>> Hi,
>>>>
>>>>> I have proposed a problem domain called "function predictor" whose
>>>>> purpose is to allow an AI to learn across problem sub-domains,
>>>>> carrying its learning from one domain to another. (See
>>>>> http://www.includipedia.com/wiki/User:Cabalamat/Function_predictor )
>>>>>
>>>>> I also think it would be useful if there was a regular (maybe annual)
>>>>> competition in the function predictor domain (or some similar domain).
>>>>> A bit like the Loebner Prize, except that it would be more useful to
>>>>> the advancement of AI, since the Loebner prize is silly.
>>>>>
>>>>> --
>>>>> Philip Hunt, <[EMAIL PROTECTED]>
>>>>
>>>> How does that differ from what is generally called "transfer learning" ?
>>>
>>> I don't think it does differ. ("Transfer learning" is not a term I'd
>>> previously come across).
>>>
>>> --
>>> Philip Hunt, <[EMAIL PROTECTED]>
>>> Please avoid sending me Word or PowerPoint attachments.
>>> See http://www.gnu.org/philosophy/no-word-attachments.html
>>>
>>>
>>> ---
>>> agi
>>> Archives: https://www.listbox.com/member/archive/303/=now
>>> RSS Feed: https://www.listbox.com/member/archive/rss/303/
>>> Modify Your Subscription: https://www.listbox.com/member/?&;
>>> Powered by Listbox: http://www.listbox.com
>>>
>>
>>
>>
>> --
>> Ben Goertzel, PhD
>> CEO, Novamente LLC and Biomind LLC
>> Director of Research, SIAI
>> [EMAIL PROTECTED]
>>
>> "I intend to live forever, or die trying."
>> -- Groucho Marx
>>
>>
>> ---
>> agi
>> Archives: https://www.listbox.com/member/archive/303/=now
>> RSS Feed: https://www.listbox.com/member/archive/rss/303/
>> Modify Your Subscription: https://www.listbox.com/member/?&;
>> Powered by Listbox: http://www.listbox.com
>>
>
>
> ---
> agi
> Archives: https://www.listbox.com/member/archive/303/=now
> RSS Feed: https://www.listbox.com/member/archive/rss/303/
> Modify Your Subscription: https://www.listbox.com/member/?&;
> Powered by Listbox: http://www.listbox.com
>
> 
> agi | Archives | Modify Your Subscription




Re: [agi] Mushed Up Decision Processes

2008-11-30 Thread Pei Wang
On Sun, Nov 30, 2008 at 11:17 AM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
> There was a DARPA program on "transfer learning" a few years back ...
> I believe I applied and got rejected (with perfect marks on the
> technical proposal, as usual ...) ... I never checked to see who got
> the $$ and what they did with it...

See http://www.cs.utexas.edu/~mtaylor/Publications/AGI08-taylor.pdf

Pei

> ben g
>
> On Sun, Nov 30, 2008 at 11:12 AM, Philip Hunt <[EMAIL PROTECTED]> wrote:
>> 2008/11/30 Ben Goertzel <[EMAIL PROTECTED]>:
>>> Hi,
>>>
 I have proposed a problem domain called "function predictor" whose
 purpose is to allow an AI to learn across problem sub-domains,
 carrying its learning from one domain to another. (See
 http://www.includipedia.com/wiki/User:Cabalamat/Function_predictor )

 I also think it would be useful if there was a regular (maybe annual)
 competition in the function predictor domain (or some similar domain).
 A bit like the Loebner Prize, except that it would be more useful to
 the advancement of AI, since the Loebner prize is silly.

 --
 Philip Hunt, <[EMAIL PROTECTED]>
>>>
>>> How does that differ from what is generally called "transfer learning" ?
>>
>> I don't think it does differ. ("Transfer learning" is not a term I'd
>> previously come across).
>>
>> --
>> Philip Hunt, <[EMAIL PROTECTED]>
>> Please avoid sending me Word or PowerPoint attachments.
>> See http://www.gnu.org/philosophy/no-word-attachments.html
>>
>>
>> ---
>> agi
>> Archives: https://www.listbox.com/member/archive/303/=now
>> RSS Feed: https://www.listbox.com/member/archive/rss/303/
>> Modify Your Subscription: https://www.listbox.com/member/?&;
>> Powered by Listbox: http://www.listbox.com
>>
>
>
>
> --
> Ben Goertzel, PhD
> CEO, Novamente LLC and Biomind LLC
> Director of Research, SIAI
> [EMAIL PROTECTED]
>
> "I intend to live forever, or die trying."
> -- Groucho Marx
>
>
> ---
> agi
> Archives: https://www.listbox.com/member/archive/303/=now
> RSS Feed: https://www.listbox.com/member/archive/rss/303/
> Modify Your Subscription: https://www.listbox.com/member/?&;
> Powered by Listbox: http://www.listbox.com
>




Re: [agi] JAGI submission

2008-11-24 Thread Pei Wang
On Mon, Nov 24, 2008 at 7:20 PM, Matt Mahoney <[EMAIL PROTECTED]> wrote:
> I submitted my paper "A Model for Recursively Self Improving Programs" to 
> JAGI and it is ready for open review. For those who have already read it, it 
> is essentially the same paper except that I have expanded the abstract. The 
> paper describes a mathematical model of RSI in closed environments (e.g. 
> boxed AI) and shows that such programs exist in a certain sense. It can be 
> found here:
>
> http://journal.agi-network.org/Submissions/tabid/99/ctrl/ViewOneU/ID/9/Default.aspx
>
> JAGI has an open review process where anyone can comment, but you will need 
> to register to do so. You don't need to register to read the paper. This is a 
> new journal started by Pei Wang.
>
> -- Matt Mahoney, [EMAIL PROTECTED]

Thanks, Matt, for supporting JAGI. I hope more and more people on
this mailing list will be willing to present their ideas in a more
organized manner, as well as to treat other people's ideas in the same
way.

Two minor corrections:

(1) JAGI was started not just by me, but by a group of researchers ---
see http://journal.agi-network.org/EditorialBoard/tabid/92/Default.aspx

(2) The public review at JAGI is considered part of the journal's
"peer review" process. Therefore, it does not allow just anyone to
post comments on the journal website (of course they can post them
elsewhere). Instead, only "AGI Network Members" can post reviews on
the journal website. Roughly speaking, "AGI Network Members" are
researchers who have an AI-related graduate degree, or graduate
students working toward such a degree. Other people are approved
on a case-by-case basis.

Pei




Re: [agi] Hunting for a Brainy Computer

2008-11-20 Thread Pei Wang
Derek,

I have no doubt that their proposal contains interesting ideas and
will produce interesting and valuable results --- most AI projects do,
though the results and their value are often not what the projects
targeted (or claimed to be targeting) initially.

"Biologically inspired approaches" are attractive, partly because they
have existing proof for the mechanism to work. However, we need to
remember that "inspired" by a working solution is one thing, and to
treat that solution as the best way to achieve a goal is another.
Furthermore, the difficult part in these approaches is to separate the
aspect of the biological mechanism/process that should be duplicated
from the aspects that shouldn't.

Yes, maybe I should market NARS as a theory of the brain, just a very
high-level one. ;-)

Pei

On Thu, Nov 20, 2008 at 10:06 AM, Derek Zahn <[EMAIL PROTECTED]> wrote:
> Pei Wang:
>
>> --- I have problem with each of these assumptions and beliefs, though
>> I don't think anyone can convince someone who just get a big grant
>> that they are moving in a wrong direction. ;-)
>
> With his other posts about the Singularity Summit and his invention of the
> word "Synaptronics", Modha certainly seems to be a kindred spirit to many on
> this list.
>
> I think what he's trying to do with this project (to the extent I understand
> it) seems like a reasonably promising approach (not really to AGI as such,
> but experimenting with soft computing substrates is kind of a cool
> enterprise to me).  Let a thousand flowers bloom.
>
> However, when he says things on his blog like "In my opinion, there are
> three reasons why the time is now ripe to begin to draw inspiration from
> structure, dynamics, function, and behavior of the brain for developing
> novel computing architectures and cognitive systems." -- I despair again.
>
> Dr. Wang, if you want to get some funding maybe you should start promoting
> NARS as a theory of the brain :)
>
> 
> agi | Archives | Modify Your Subscription




Re: [agi] Hunting for a Brainy Computer

2008-11-20 Thread Pei Wang
The basic assumptions behind the project, from the webpage of its team
lead at http://www.modha.org/ :

"The mind arises from the wetware of the brain. Thus, it would seem
that reverse engineering the computational function of the brain is
perhaps the cheapest and quickest way to engineer computers that mimic
the robustness and versatility of the mind.

"Cognitive computing, seeks to engineer holistic intelligent machines
that neatly tie together all of the pieces. Cognitive computing seeks
to uncover the core micro and macro circuits of the brain underlying a
wide variety of abilities. So, it aims to proceeds in algorithm-first,
problems-later fashion.

"I believe that spiking computation is a key to achieving this vision."

--- I have problems with each of these assumptions and beliefs, though
I don't think anyone can convince someone who just got a big grant
that they are moving in the wrong direction. ;-)

Pei

On Thu, Nov 20, 2008 at 8:29 AM, Rafael C.P. <[EMAIL PROTECTED]> wrote:
> http://bits.blogs.nytimes.com/2008/11/20/hunting-for-a-brainy-computer/
>
> ===[ Rafael C.P. ]===
> 
> agi | Archives | Modify Your Subscription




[agi] PhD study opportunity at Temple University

2008-11-19 Thread Pei Wang
Hi,

I may accept a few PhD students in 2009. Interested people please
visit http://www.cis.temple.edu/~pwang/students.html

Pei Wang
http://www.cis.temple.edu/~pwang/




Re: [agi] Unification by index?

2008-10-31 Thread Pei Wang
I didn't directly code it myself, but as far as I know, nested lists
should be fine, though the N expressions probably should remain
constant.

Pei

On Fri, Oct 31, 2008 at 4:44 PM, Russell Wallace
<[EMAIL PROTECTED]> wrote:
> On Fri, Oct 31, 2008 at 8:00 PM, Pei Wang <[EMAIL PROTECTED]> wrote:
>> The closest thing I can think of is Rete algorithm --- see
>> http://en.wikipedia.org/wiki/Rete_algorithm
>
> Thanks! If I'm understanding correctly, the Rete algorithm only
> handles lists of constants and variables, not general expressions
> which include nested lists?
>
>
> ---
> agi
> Archives: https://www.listbox.com/member/archive/303/=now
> RSS Feed: https://www.listbox.com/member/archive/rss/303/
> Modify Your Subscription: https://www.listbox.com/member/?&;
> Powered by Listbox: http://www.listbox.com
>




Re: [agi] Unification by index?

2008-10-31 Thread Pei Wang
The closest thing I can think of is the Rete algorithm --- see
http://en.wikipedia.org/wiki/Rete_algorithm

Pei
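
To make the idea concrete, here is a minimal illustrative sketch in
plain Python --- not Rete, not code from any existing system, and with
made-up names. It unifies nested list expressions containing variables,
and keeps a coarse index on the outermost symbol so that a query is
unified only against plausible candidates instead of all N stored
expressions:

    from collections import defaultdict

    def is_var(t):
        # variables are written as strings starting with "?", e.g. "?A"
        return isinstance(t, str) and t.startswith("?")

    def walk(t, s):
        # follow variable bindings in substitution s
        while is_var(t) and t in s:
            t = s[t]
        return t

    def unify(x, y, s):
        # return an extended substitution unifying x and y, or None
        # (occurs check omitted for brevity)
        x, y = walk(x, s), walk(y, s)
        if x == y:
            return s
        if is_var(x):
            return {**s, x: y}
        if is_var(y):
            return {**s, y: x}
        if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
            for xi, yi in zip(x, y):
                s = unify(xi, yi, s)
                if s is None:
                    return None
            return s
        return None

    class IndexedStore:
        # index stored expressions by their outermost symbol, so a query
        # is unified only against candidates that could possibly match
        def __init__(self):
            self.by_head = defaultdict(list)
            self.var_headed = []          # expressions whose head is a variable

        @staticmethod
        def head(expr):
            return expr[0] if isinstance(expr, tuple) and expr else expr

        def add(self, expr):
            if is_var(self.head(expr)):
                self.var_headed.append(expr)
            else:
                self.by_head[self.head(expr)].append(expr)

        def matches(self, query):
            h = self.head(query)
            candidates = list(self.var_headed)
            if is_var(h):
                for bucket in self.by_head.values():
                    candidates.extend(bucket)     # variable head: check everything
            else:
                candidates.extend(self.by_head[h])
            for expr in candidates:
                s = unify(query, expr, {})
                if s is not None:
                    yield expr, s

    store = IndexedStore()
    store.add(("FOO", 42))
    store.add(("FOO", ("BAR", 1)))
    store.add(("BAZ", "?X", "?X"))
    print(list(store.matches(("FOO", "?A"))))  # ?A binds to 42 and to ("BAR", 1)

Real term-indexing schemes (path indexing, discrimination trees, or the
Rete network mentioned above) refine the same idea by indexing on deeper
positions of each expression rather than just the head symbol.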

On Fri, Oct 31, 2008 at 3:39 PM, Russell Wallace
<[EMAIL PROTECTED]> wrote:
> In classical logic programming, there is the concept of unification,
> where one expression is matched against another, and one or both
> expressions may contain variables. For example, (FOO ?A) unifies with
> (FOO 42) by setting the variable ?A = 42.
>
> Suppose you have a database of N expressions, and are given a new
> expression, and want to find which of the existing ones unify against
> the new one. This can obviously be done by unifying against each
> expression in turn. However, this takes O(N) time, which is slow if
> the database is large.
>
> It seems to me that by appropriate use of indexes, it should be
> possible to unify against the entire database simultaneously, or at
> least to isolate a small fraction of it as potential matches so that
> the individual unification algorithm need not be run against every
> expression in the database.
>
> I'm obviously not the first person to run into this problem, and
> presumably not the first to think of that kind of solution. Before I
> go ahead and work out the whole design myself, I figure it's worth
> checking: does anyone know of any existing examples of this?
>
>
> ---
> agi
> Archives: https://www.listbox.com/member/archive/303/=now
> RSS Feed: https://www.listbox.com/member/archive/rss/303/
> Modify Your Subscription: https://www.listbox.com/member/?&;
> Powered by Listbox: http://www.listbox.com
>




Re: [agi] "the universe is computable" [Was: Occam's Razor and its abuse]

2008-10-30 Thread Pei Wang
Matt,

How about the following argument:

A. "Since in principle all human knowledge about the universe can be
expressed in English, we say that the universe exists as a English
essay --- though we don't know which one yet".

B. "Because of A, the ultimate scientific research method is to
exhaustively produce all possible English essays, test each of them
against the universe, and keep the best --- since there are infinite
number of them, the process won't terminate, but it can be used as an
idealized model of scientific research, and the best scientific theory
will always be produced in this way."

C. "As a practical version of B, we can limit the length of the essay,
in characters, to a constant N. Then this algorithm will surely find
the best scientific theory within length N in finite time. Now
everyone doing science should approximate this process as closely as
possible, and the only remaining issue is computational power to reach
larger and larger N."

Of course, I don't mean that your argument is this silly --- the
research paradigm you argued for is interesting and valuable in
certain aspects --- though I do feel some similarity between the two
cases.

Pei


On Thu, Oct 30, 2008 at 7:30 PM, Matt Mahoney <[EMAIL PROTECTED]> wrote:
> Ben, you missed my point. We use Turing machines in all kinds of computer
> science proofs, even though you can't build one. Turing machines have
> infinite memory, so it is not unreasonable to assume that if Turing machines
> did exist, then one could store the 2^409 bits needed to describe the
> quantum state of the observable universe and then perform computations on
> that data to predict the future.
>
> I described how a Turing machine could obtain that knowledge in about 2^818
> steps by enumerating all possible universes until intelligent life is found.
> As evidence, I suggest that the algorithmic complexity of the free
> parameters in string theory, general relativity, and the initial state of
> the Big Bang is on the order of a few hundred bits.
>
> -- Matt Mahoney, [EMAIL PROTECTED]
>
> --- On Thu, 10/30/08, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>
> From: Ben Goertzel <[EMAIL PROTECTED]>
> Subject: Re: [agi] "the universe is computable" [Was: Occam's Razor and its
> abuse]
> To: agi@v2.listbox.com
> Date: Thursday, October 30, 2008, 6:02 PM
>
>
>>
>>
>> If I can assume that Turing machines exist, then I can assume perfect
>> knowledge of the state of the universe. It doesn't change my conclusion that
>> the universe is computable.
>>
>> -- Matt Mahoney, [EMAIL PROTECTED]
>
>
> 1)
> Turing machines are mathematical abstractions and don't physically exist
>
> 2)
> I thought **I** had a lot of hubris but ... wow!  Color me skeptical that
> you possess perfect knowledge of the state of the universe ;-)
>
>
> ben g
> 
> agi | Archives | Modify Your Subscription
>
> 
> agi | Archives | Modify Your Subscription




Re: [agi] "the universe is computable" [Was: Occam's Razor and its abuse]

2008-10-30 Thread Pei Wang
On Thu, Oct 30, 2008 at 5:36 PM, Matt Mahoney <[EMAIL PROTECTED]> wrote:
>
> The point is not that AGI should model things at the level of atoms.

I didn't blame anyone for doing that. What I said is: to predict the
environment as a Turing Machine (symbol by symbol) is just like
constructing a building atom by atom. The problem is not merely one of
complexity, but of the level of description.

> The point is that we should apply the principle of Occam's Razor to machine 
> learning and AGI.

If by "Occam's Razor" you mean "the learning mechanism should prefer
simpler result", I don't think anyone has disagreed (though people may
not use that term, or may justify it differently), but if by "Occam's
Razor" you mean "learning should start by giving simpler hypotheses
higher prior probability", I still don't see why.

> We already do that in all practical learning algorithms. For example in NARS, 
> a link between two concepts like (if X then Y) has a probability and a 
> confidence that depends on the counts of (X,Y) and (X, not Y).

Yes, except it is not a "probability" in the sense of "limit of frequency".
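
To make the count-based measure being discussed concrete, here is a
minimal sketch assuming the standard NARS definitions, in which
frequency and confidence are both derived from the amounts of positive
and negative evidence (k is the usual evidential-horizon constant; the
counts below are made up):

    def truth_from_counts(w_plus, w_minus, k=1.0):
        # w_plus = evidence for "if X then Y", w_minus = evidence against it
        w = w_plus + w_minus
        frequency = w_plus / w if w > 0 else 0.5   # convention for the no-evidence case
        confidence = w / (w + k)                   # approaches 1 but never reaches it
        return frequency, confidence

    # e.g. 8 observations of (X, Y) and 2 of (X, not Y):
    print(truth_from_counts(8, 2))   # (0.8, 0.909...)

The confidence component is what keeps this from being a probability in
the "limit of frequency" sense: no finite amount of evidence drives it
to 1, so the value always remains revisable.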

> This model is a simplification from a sequence of n events (with algorithmic 
> complexity 2n) to two small integers (with algorithmic complexity 2 log n).

This is your interpretation, which is fine, but I don't see why I
must see it the same way, though I do agree that it is a summary of
experience.

> The reason this often works in practice is Occam's Razor. That might not be 
> the case if physics were not computable.

Again, this is a description of your belief, not a justification of this belief.

Pei




Re: [agi] "the universe is computable" [Was: Occam's Razor and its abuse]

2008-10-30 Thread Pei Wang
Agreed.

As I mentioned before, science used to be seen as the pursuit of
"truth", and its theories were taken to describe aspects of the world
"as it is". That view is now regarded as wrong. Science is organized
human experience, which is fundamentally based on human cognitive
capabilities and human experience, so its description of the world
never perfectly matches the world itself.

Actually, this is what makes science an everlasting enterprise.
Otherwise there would be a day when science really told us everything
about the world --- as a Turing Machine or not --- and then it would
stop there. Now we know that this will never happen. There will always
be phenomena that no existing theory can explain or predict --- this
is the "insufficient knowledge and resources" situation at the level
of the whole human society.

Obviously many people working in science still hold the old
("classical"?) view of science, which usually does not make too big a
difference in what they do in their research. However, if the subject
of the research is science itself or the human cognition process, then
the old view is not even good enough as an approximation or
idealization.

For people who think the above is just my personal bias, I recommend
the following readings:

*. any textbook in philosophy of science, as long as it includes Kuhn
and Lakatos
*. Philosophy in the Flesh: The Embodied Mind and Its Challenge to
Western Thought, by George Lakoff and Mark Johnson
*. Why We See What We Do: An Empirical Theory of Vision, by Dale
Purves and R. Beau Lotto

Pei

On Thu, Oct 30, 2008 at 5:35 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>
> I note that physicists have frequently, throughout the last few hundred
> years, expressed confidence in their understanding of the whole universe ...
> and then been proven wrong by later generations of physicists...
>
> Personally I find it highly unlikely that the current physical understanding
> of the universe as a whole is going to survive the next century ...
> especially with the Singularity looming and all that.  Most likely,
> superintelligent AGIs will tell us why our current physics ideas are very
> limited.
>
> Fortunately, we don't seem to need to understand the physical universe very
> completely in order to build AGIs at the human level and beyond.
>
> -- Ben G
>
> On Thu, Oct 30, 2008 at 5:29 PM, Pei Wang <[EMAIL PROTECTED]> wrote:
>>
>> On Thu, Oct 30, 2008 at 5:18 PM, Matt Mahoney <[EMAIL PROTECTED]>
>> wrote:
>> > --- On Thu, 10/30/08, Pei Wang <[EMAIL PROTECTED]> wrote:
>> >> So there are physicists who think in principle the stock
>> >> market can be
>> >> accurately predicated from quantum theory alone? I'd
>> >> like to get a
>> >> reference on that. ;-)
>> >
>> > If you had a Turing machine, yes.
>> >
>> > It also assumes you know which of the possible 2^(2^409) possible states
>> > the universe is in. (2^409 ~ 2.9 x 10^122 bits = entropy of the universe).
>> > So don't expect any experimental verification.
>>
>> Matt,
>>
>> Even if all of our models of the universe can be put into a Turing
>> Machine, your conclusion still doesn't follow, because you need to
>> further assume the model is perfect, that is, it describes the
>> universe *as it is*. This is another conclusion that conflict with the
>> current understanding of science.
>>
>> Pei
>>
>>
>> ---
>> agi
>> Archives: https://www.listbox.com/member/archive/303/=now
>> RSS Feed: https://www.listbox.com/member/archive/rss/303/
>> Modify Your Subscription: https://www.listbox.com/member/?&;
>> Powered by Listbox: http://www.listbox.com
>
>
>
> --
> Ben Goertzel, PhD
> CEO, Novamente LLC and Biomind LLC
> Director of Research, SIAI
> [EMAIL PROTECTED]
>
> "A human being should be able to change a diaper, plan an invasion, butcher
> a hog, conn a ship, design a building, write a sonnet, balance accounts,
> build a wall, set a bone, comfort the dying, take orders, give orders,
> cooperate, act alone, solve equations, analyze a new problem, pitch manure,
> program a computer, cook a tasty meal, fight efficiently, die gallantly.
> Specialization is for insects."  -- Robert Heinlein
>
>
> 
> agi | Archives | Modify Your Subscription




Re: [agi] "the universe is computable" [Was: Occam's Razor and its abuse]

2008-10-30 Thread Pei Wang
On Thu, Oct 30, 2008 at 5:18 PM, Matt Mahoney <[EMAIL PROTECTED]> wrote:
> --- On Thu, 10/30/08, Pei Wang <[EMAIL PROTECTED]> wrote:
>> So there are physicists who think in principle the stock
>> market can be
>> accurately predicated from quantum theory alone? I'd
>> like to get a
>> reference on that. ;-)
>
> If you had a Turing machine, yes.
>
> It also assumes you know which of the possible 2^(2^409) possible states the 
> universe is in. (2^409 ~ 2.9 x 10^122 bits = entropy of the universe). So 
> don't expect any experimental verification.

Matt,

Even if all of our models of the universe can be put into a Turing
Machine, your conclusion still doesn't follow, because you need to
further assume the model is perfect, that is, that it describes the
universe *as it is*. This is another conclusion that conflicts with
the current understanding of science.

Pei




Re: [agi] "the universe is computable" [Was: Occam's Razor and its abuse]

2008-10-30 Thread Pei Wang
On Thu, Oct 30, 2008 at 5:00 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>
>> So there are physicists who think in principle the stock market can be
>> accurately predicated from quantum theory alone? I'd like to get a
>> reference on that. ;-)
>
> Pei, I think a majority -- or at least a substantial plurality -- of
> physicists think that.  Really.
>
> ben

Too bad --- they should all take a course in philosophy of science.

Even if that is the case, I don't accept it as a reason to tolerate
this opinion in AGI research.

Pei




Re: [agi] "the universe is computable" [Was: Occam's Razor and its abuse]

2008-10-30 Thread Pei Wang
On Thu, Oct 30, 2008 at 4:53 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>
>
> On Thu, Oct 30, 2008 at 4:50 PM, Pei Wang <[EMAIL PROTECTED]> wrote:
>>
>> On Thu, Oct 30, 2008 at 3:07 PM, Matt Mahoney <[EMAIL PROTECTED]>
>> wrote:
>> >
>> > An exact description of the quantum state of the universe gives you
>> > everything else.
>>
>> Why? Just because it is the smallest object we know? Is this a
>> self-evident commonsense, or a conclusion from physics?
>
>
> It's an implication of quantum theory.

So there are physicists who think that, in principle, the stock market
can be accurately predicted from quantum theory alone? I'd like to get
a reference on that. ;-)

Pei

> However, it's not yet fully validated experimentally or theoretically.
>
> No one has ever, for instance, derived the periodic table of the elements
> from the laws of physics, without making a lot of hacky assumptions that
> amount to using known facts of chemistry to tune various constants in the
> derivations.
>
> ben g
>
>>
>> As I said before, this is a very strong version of reductionism. It
>> was widely accepted in the time of Newton and Laplace, but I don't
>> think it is still considered a valid theory in philosophy of
>> science. This position is not only unjustifiable, but also leads
>> research in wrong directions. It is like suggesting that an architect
>> analyze the structure of a building at the atomic level, because all
>> building materials are made of atoms, after all. The fact that all
>> building materials are indeed made of atoms only makes the suggestion
>> even more harmful than a suggestion based on false statements.
>>
>> Pei
>>
>>
>
>
>
> --
> Ben Goertzel, PhD
> CEO, Novamente LLC and Biomind LLC
> Director of Research, SIAI
> [EMAIL PROTECTED]
>
> "A human being should be able to change a diaper, plan an invasion, butcher
> a hog, conn a ship, design a building, write a sonnet, balance accounts,
> build a wall, set a bone, comfort the dying, take orders, give orders,
> cooperate, act alone, solve equations, analyze a new problem, pitch manure,
> program a computer, cook a tasty meal, fight efficiently, die gallantly.
> Specialization is for insects."  -- Robert Heinlein
>
>




Re: [agi] "the universe is computable" [Was: Occam's Razor and its abuse]

2008-10-30 Thread Pei Wang
On Thu, Oct 30, 2008 at 3:07 PM, Matt Mahoney <[EMAIL PROTECTED]> wrote:
>
> An exact description of the quantum state of the universe gives you 
> everything else.

Why? Just because it is the smallest object we know? Is this
self-evident common sense, or a conclusion from physics?

As I said before, this is a very strong version of reductionism. It
was widely accepted in the time of Newton and Laplace, but I don't
think it is still considered a valid theory in philosophy of
science. This position is not only unjustifiable, but also leads
research in wrong directions. It is like suggesting that an architect
analyze the structure of a building at the atomic level, because all
building materials are made of atoms, after all. The fact that all
building materials are indeed made of atoms only makes the suggestion
even more harmful than a suggestion based on false statements.

Pei




Re: [agi] "the universe is computable" [Was: Occam's Razor and its abuse]

2008-10-30 Thread Pei Wang
Matt,

I understand your explanation, but you haven't answered my main
problem here: why, to simulate the universe, do we only need physics,
and not chemistry, biology, psychology, history, philosophy, ...? Why
not say "all human knowledge"?

Pei

On Thu, Oct 30, 2008 at 1:51 PM, Matt Mahoney <[EMAIL PROTECTED]> wrote:
> --- On Thu, 10/30/08, Pei Wang <[EMAIL PROTECTED]> wrote:
>> [C]. "Because of B, the universe can be simulated in
>> Turing Machine".
>>
>> This is where I start to feel uncomfortable.
>
> The theory cannot be tested directly because there is no such thing as a real 
> Turing machine. But we can show that the observable universe has finite 
> information content according to the known laws of physics and cosmology, 
> which assumes finite age, size, and mass. In particular, the Bekenstein bound 
> of the Hubble radius gives an exact number (2.91 x 10^122 bits).
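
For the curious, a back-of-the-envelope sketch of where a number of that
order comes from; the inputs below (a Hubble constant of 70 km/s/Mpc and a
mass equal to the critical density times the Hubble-sphere volume) are
illustrative assumptions, not necessarily the exact values behind the quoted
2.91 x 10^122 figure:

    import math

    # Physical constants (SI units)
    c    = 2.998e8       # speed of light, m/s
    hbar = 1.055e-34     # reduced Planck constant, J*s
    G    = 6.674e-11     # gravitational constant, m^3 kg^-1 s^-2

    # Illustrative cosmological inputs (assumptions for this sketch)
    H0    = 70e3 / 3.086e22                   # Hubble constant, 1/s
    R     = c / H0                            # Hubble radius, m
    rho_c = 3 * H0**2 / (8 * math.pi * G)     # critical density, kg/m^3
    M     = rho_c * (4 / 3) * math.pi * R**3  # mass within the Hubble sphere, kg
    E     = M * c**2                          # corresponding energy, J

    # Bekenstein bound in bits: I <= 2*pi*R*E / (hbar * c * ln 2)
    bits = 2 * math.pi * R * E / (hbar * c * math.log(2))
    print(f"~{bits:.2e} bits")                # ~3e122, the same order as the quoted figure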
>
> http://en.wikipedia.org/wiki/Bekenstein_bound
> http://en.wikipedia.org/wiki/Age_of_the_universe
>
> This does not mean you could model the universe. It would be impossible to 
> build a memory this large. Any physically realizable computer would have to 
> be built inside our observable universe. But that is not a requirement for 
> Occam's Razor to hold.
>
> I realize we don't have a complete theory of physics. In particular, quantum 
> mechanics has not been unified with general relativity. I also realize that 
> even if we did have a complete theory, we couldn't prove it.
>
> -- Matt Mahoney, [EMAIL PROTECTED]
>
>
>
>
>




[agi] "the universe is computable" [Was: Occam's Razor and its abuse]

2008-10-30 Thread Pei Wang
On Wed, Oct 29, 2008 at 10:44 PM, Matt Mahoney <[EMAIL PROTECTED]> wrote:
>
> My assumption is that the physics of the observable universe is computable 
> (which is widely believed to be true).

To me, this is another topic where several different claims are tangled together.

[A]. "Every law or model proposed in physics is computable and
therefore can be calculated by a Turing Machine."

This is what Zuse argued in his original paper, which started this
whole discussion.

Given my limited knowledge of physics, it sounds correct, and I have
no problem here.

[B]. "Because of A, all laws and models in physics can be calculated
by a Turing Machine."

This is Zuse's hypothesis. He didn't really argue for it, but
suggested it as a possibility. I think that this is theoretically
possible, but practically very unlikely to happen. Physics, like any
science, consists of incompatible theories each focusing on one aspect
of the world. A consistent "Theory of Physics" is very hard to get, if
not impossible.

I don't have a high expectation for B to be realized, though do accept
it as a meaningful and interesting possibility to consider, or even as
an ultimate goal of research for a physicist.

[C]. "Because of B, the universe can be simulated in Turing Machine".

This is where I start to feel uncomfortable. Even if someone has a
theory that explains all observed physical phenomena, and has formulated
it on a Turing machine, it is still only a description of the universe
at a certain level. To claim it describes "the universe", rather than
just "all observed physical phenomena", you need to assume a strong
version of Physicalism and Reductionism, so that all phenomena can be
reduced to physical phenomena with no information loss.

To me, this claim is philosophically incorrect. There is no single
language or level of description that captures "the true world" while
all the other descriptions are just approximations of it.

[D]. "Because of C, the universe is a Turing Machine".

To me, this is a confusion between an object and a
description/simulation of an object. Even if C is true, a simulated
universe is still not a universe itself, just as a simulated hurricane
in a computer is not a real hurricane itself, because it does not have
the defining property of "hurricane" in our world (which is different
from the simulated world in the computer).

Of course, some people will go to the extreme of saying that I'm really
living in a simulated world; it is just that I haven't realized it yet.
This is a reasonable argument, but since I don't see what difference
it will make if I accept it, it won't be considered here.

In summary, what I cannot accept in AGI research is the assumption
that there is a real/true/objective description of the universe, of
which all theories or knowledge are partial approximations. The Turing
Machine is just one form of this "objective truth". To me, though this
opinion is acceptable in many situations in everyday life, it will lead
AGI research in a wrong direction.

Pei




Re: [agi] Occam's Razor and its abuse

2008-10-29 Thread Pei Wang
Ben,

It goes back to what "justification" we are talking about. "To prove
it" is a strong version, and "to show supporting evidence" is a weak
version. Hume pointed out that induction cannot be justified in the
sense that there is no way to guarantee that all inductive conclusions
will be confirmed.

I don't think Hume can be cited to support the assumption that
"complexity is correlated to probability", or that this assumption
does not need justification, just because inductive conclusions can be
wrong. There are many more reasons to accept induction than to accept
the above assumption.

Pei

On Wed, Oct 29, 2008 at 12:31 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>
>
>>
>> However, it does not mean that all assumptions are equally acceptable,
>> or that as soon as something is called an "assumption", the author is
>> released from the duty of justifying it.
>
> Hume argued that at the basis of any approach to induction, there will
> necessarily lie some assumption that is *not* inductively justified, but
> must in essence be taken "on faith" or "as an unjustified assumption"
>
> He claimed that humans make certain unjustified assumptions of this nature
> automatically due to "human nature"
>
> This is an argument that not all assumptions can be expected to be justified
> ...
>
> Comments?
> ben g
>




Re: [agi] Occam's Razor and its abuse

2008-10-29 Thread Pei Wang
Ben,

I never claimed that NARS is based only on "truths" rather than on
assumptions (or call them biases). It surely is based on assumptions,
and many of them are my beliefs and intuitions, which I cannot convince
other people to accept any time soon.

However, it does not mean that all assumptions are equally acceptable,
or that as soon as something is called an "assumption", the author is
released from the duty of justifying it.

Going back to the original topic, since "simplicity/complexity of a
description is correlated with its prior probability" is the core
assumption of certain research paradigms, it should be justified.
Calling it "Occam's Razor" so as to suggest that it is self-evident is
not the proper way to do the job. This is all I want to argue in this
discussion.

Pei

On Wed, Oct 29, 2008 at 12:10 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>
> But, NARS as an overall software system will perform more effectively
> (i.e., learn more rapidly) in
> some environments than in others, for a variety of reasons.  There are many
> biases built into the NARS architecture in various ways ... it's just not
> obvious
> to spell out what they are, because the NARS system was not explicitly
> designed based on that sort of thinking...
>
> The same is true of every other complex AGI architecture...
>
> ben g
>
>
> On Wed, Oct 29, 2008 at 12:07 PM, Pei Wang <[EMAIL PROTECTED]> wrote:
>>
>> Ed,
>>
>> When NARS extrapolates its past experience to the current and the
>> future, it is indeed based on the assumption that its future
>> experience will be similar to its past experience (otherwise any
>> prediction would be equally valid); however, it does not assume that
>> the world can be captured by any specific mathematical model, such as
>> a Turing Machine or a probability distribution defined on a
>> propositional space.
>>
>> Concretely speaking, when a statement S has been tested N times, and
>> it is true in M of those tests and false in the other N-M, then NARS's
>> "expectation value" for it to be true in the next test is E(S) =
>> (M+0.5)/(N+1) [if there is no other relevant knowledge], and the
>> system will use this value to decide whether to accept a bet on S.
>> However, neither the system nor its designer assumes that there is a
>> "true probability" for S to occur, of which the above expectation is
>> an approximation. Also, it is not assumed that E(S) will converge
>> when the testing of S continues.
>>
>> Pei
>>
>>
>> On Wed, Oct 29, 2008 at 11:33 AM, Ed Porter <[EMAIL PROTECTED]> wrote:
>> > Pei,
>> >
>> > My understanding is that when you reason from data, you often want the
>> > ability to extrapolate, which requires some sort of assumptions about
>> > the
>> > type of mathematical model to be used.  How do you deal with that in
>> > NARS?
>> >
>> > Ed Porter
>> >
>> > -Original Message-
>> > From: Pei Wang [mailto:[EMAIL PROTECTED]
>> > Sent: Tuesday, October 28, 2008 9:40 PM
>> > To: agi@v2.listbox.com
>> > Subject: Re: [agi] Occam's Razor and its abuse
>> >
>> >
>> > Ed,
>> >
>> > Since NARS doesn't follow the Bayesian approach, there are no initial
>> > priors
>> > to be assumed. If we use a more general term, such as "initial
>> > knowledge" or
>> > "innate beliefs", then yes, you can add them into the system, will will
>> > improve the system's performance. However, they are optional. In NARS,
>> > all
>> > object-level (i.e., not meta-level) innate beliefs can be learned by the
>> > system afterward.
>> >
>> > Pei
>> >
>> > On Tue, Oct 28, 2008 at 5:37 PM, Ed Porter <[EMAIL PROTECTED]> wrote:
>> >> It appears to me that the assumptions about initial priors used by a
>> >> self learning AGI or an evolutionary line of AGI's could be quite
>> >> minimal.
>> >>
>> >> My understanding is that once a probability distribution starts
>> >> receiving random samples from its distribution the effect of the
>> >> original prior becomes rapidly lost, unless it is a rather rare one.
>> >> Such rare problem priors would get selected against quickly by
>> >> evolution.  Evolution would tend to tune for the most appropriate
>> >> priors for the success of subsequent generations (either of computing
>> >> in the same system if it is capable of enough change or of descendant
>> 

Re: [agi] Occam's Razor and its abuse

2008-10-29 Thread Pei Wang
Ed,

When NARS extrapolates its past experience to the current and the
future, it is indeed based on the assumption that its future
experience will be similar to its past experience (otherwise any
prediction would be equally valid); however, it does not assume that
the world can be captured by any specific mathematical model, such as
a Turing Machine or a probability distribution defined on a
propositional space.

Concretely speaking, when a statement S has been tested N times, and
it is true in M of those tests and false in the other N-M, then NARS's
"expectation value" for it to be true in the next test is E(S) =
(M+0.5)/(N+1) [if there is no other relevant knowledge], and the
system will use this value to decide whether to accept a bet on S.
However, neither the system nor its designer assumes that there is a
"true probability" for S to occur, of which the above expectation is
an approximation. Also, it is not assumed that E(S) will converge
when the testing of S continues.
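
A minimal sketch of this calculation (the function and variable names are
illustrative, not taken from the NARS implementation):

    def expectation(n_total, n_positive):
        # NARS-style expectation for a statement S tested N times and found
        # true M times, assuming no other relevant knowledge:
        #   E(S) = (M + 0.5) / (N + 1)
        return (n_positive + 0.5) / (n_total + 1)

    # Example: S tested 10 times, true in 7 of them.
    print(expectation(10, 7))   # 0.6818..., the value used to decide whether to bet on S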

Pei


On Wed, Oct 29, 2008 at 11:33 AM, Ed Porter <[EMAIL PROTECTED]> wrote:
> Pei,
>
> My understanding is that when you reason from data, you often want the
> ability to extrapolate, which requires some sort of assumptions about the
> type of mathematical model to be used.  How do you deal with that in NARS?
>
> Ed Porter
>
> -Original Message-
> From: Pei Wang [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, October 28, 2008 9:40 PM
> To: agi@v2.listbox.com
> Subject: Re: [agi] Occam's Razor and its abuse
>
>
> Ed,
>
> Since NARS doesn't follow the Bayesian approach, there are no initial priors
> to be assumed. If we use a more general term, such as "initial knowledge" or
> "innate beliefs", then yes, you can add them into the system, will will
> improve the system's performance. However, they are optional. In NARS, all
> object-level (i.e., not meta-level) innate beliefs can be learned by the
> system afterward.
>
> Pei
>
> On Tue, Oct 28, 2008 at 5:37 PM, Ed Porter <[EMAIL PROTECTED]> wrote:
>> It appears to me that the assumptions about initial priors used by a
>> self learning AGI or an evolutionary line of AGI's could be quite
>> minimal.
>>
>> My understanding is that once a probability distribution starts
>> receiving random samples from its distribution the effect of the
>> original prior becomes rapidly lost, unless it is a rather rare one.
>> Such rare problem priors would get selected against quickly by
>> evolution.  Evolution would tend to tune for the most appropriate
>> priors for the success of subsequent generations (either of computing
>> in the same system if it is capable of enough change or of descendant
>> systems).  Probably the best priors would generally be ones that could
>> be trained moderately rapidly by data.
>>
>> So it seems an evolutionary system or line could initially learn
>> priors without any assumptions for priors other than a random picking
>> of priors. Over time and multiple generations it might develop
>> hereditary priors, and perhaps even different hereditary priors for
>> parts of its network connected to different inputs, outputs or
>> internal controls.
>>
>> The use of priors in an AGI could be greatly improved by having a
>> gen/comp hierarchy in which models for a given concept could be
>> inherited from the priors of sets of models for similar concepts, and
>> that the set of priors appropriate could change contextually.  It
>> would also seem that the notion of a prior could be improved by
>> blending information from episodic and probabilistic models.
>>
>> It would appear that in almost any generally intelligent system, being
>> able to approximate reality in a manner sufficient for evolutionary
>> success with the most efficient representations would be a
>> characteristic that would be greatly preferred by evolution, because
>> it would allow systems to better model more of their environment
>> sufficiently well for evolutionary success with whatever current
>> modeling capacity they have.
>>
>> So, although a completely accurate description of virtually anything
>> may not find much use for Occam's Razor, as a practically useful
>> representation it often will.  It seems to me that Occam's Razor is
>> more oriented to deriving meaningful generalizations than it is to exact
>> descriptions of anything.
>>
>> Furthermore, it would seem to me that a simpler set of
>> preconditions is generally more probable than a more complex one,
>> because it requires less coincidence.  It would seem to me this would
>> be true under most random sets of priors

Re: [agi] Occam's Razor and its abuse

2008-10-28 Thread Pei Wang
Eric,

I highly respect your work, though we clearly have different opinions
on what intelligence is, as well as on how to achieve it. For example,
though learning and generalization play central roles in my theory
about intelligence, I don't think PAC learning (or the other learning
algorithms proposed so far) provides a proper conceptual framework for
the typical situation of this process. Generally speaking, I'm not
"building some system that learns about the world", in the sense that
there is a correct way to describe the world waiting to be discovered,
which can be captured by some algorithm. Instead, learning to me is a
non-algorithmic open-ended process by which the system summarizes its
own experience, and uses it to predict the future. I fully understand
that most people in this field probably consider this opinion wrong,
though I haven't been convinced yet by the arguments I've seen so far.

Instead of addressing all of the relevant issues, in this discussion I
have a very limited goal. To rephrase what I said initially, I see
that under the term "Occam's Razor", currently there are three
different statements:

(1) Simplicity (in conclusions, hypotheses, theories, etc.) is preferred.

(2) The preference to simplicity does not need a reason or justification.

(3) Simplicity is preferred because it is correlated with correctness.

I agree with (1), but not (2) and (3). I know many people have
different opinions, and I don't attempt to argue with them here ---
these problems are too complicated to be settled by email exchanges.

However, I do hope to convince people in this discussion that the
three statements are not logically equivalent, and (2) and (3) are not
implied by (1), so using "Occam's Razor" to refer to all of them is
not a good idea, because it is going to mix different issues.
Therefore, I suggest that people use "Occam's Razor" in its original and
basic sense, that is, (1), and use other terms to refer to (2) and
(3). Otherwise, when people talk about "Occam's Razor", I just don't
know what to say.

Pei

On Tue, Oct 28, 2008 at 8:09 PM, Eric Baum <[EMAIL PROTECTED]> wrote:
>
> Pei> Triggered by several recent discussions, I'd like to make the
> Pei> following position statement, though won't commit myself to long
> Pei> debate on it. ;-)
>
> Pei> Occam's Razor, in its original form, goes like "entities must not
> Pei> be multiplied beyond necessity", and it is often stated as "All
> Pei> other things being equal, the simplest solution is the best" or
> Pei> "when multiple competing theories are equal in other respects,
> Pei> the principle recommends selecting the theory that introduces the
> Pei> fewest assumptions and postulates the fewest entities" --- all
> Pei> from http://en.wikipedia.org/wiki/Occam's_razor
>
> Pei> I fully agree with all of the above statements.
>
> Pei> However, to me, there are two common misunderstandings associated
> Pei> with it in the context of AGI and philosophy of science.
>
> Pei> (1) To take this statement as self-evident or a stand-alone
> Pei> postulate
>
> Pei> To me, it is derived or implied by the insufficiency of
> Pei> resources. If a system has sufficient resources, it has no good
> Pei> reason to prefer a simpler theory.
>
> With all due respect, this is mistaken.
> Occam's Razor, in some form, is the heart of Generalization, which
> is the essence (and G) of GI.
>
> For example, if you study concept learning from examples,
> say in the PAC learning context (related theorems
> hold in some other contexts as well),
> there are theorems to the effect that if you find
> a hypothesis from a simple enough class of hypotheses
> it will with very high probability accurately classify new
> examples chosen from the same distribution,
>
> and conversely theorems that state (roughly speaking) that
> any method that chooses a hypothesis from too expressive a class
> of hypotheses will have a probability that can be bounded below
> by some reasonable number like 1/7,
> of having large error in its predictions on new examples--
> in other words it is impossible to PAC learn without respecting
> Occam's Razor.
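
One standard bound of this kind, offered only as an illustration (a textbook
result for a learner that outputs any hypothesis consistent with the training
data, drawn from a finite class H; not necessarily the exact theorems Eric has
in mind), says that m >= (1/eps) * (ln|H| + ln(1/delta)) examples suffice for
true error below eps with probability at least 1 - delta:

    import math

    def sufficient_examples(description_bits, eps, delta):
        # Sample size sufficient for a consistent learner over a finite hypothesis
        # class describable in `description_bits` bits (so |H| = 2^bits) to reach
        # true error below eps with probability at least 1 - delta.
        ln_h = description_bits * math.log(2)
        return math.ceil((ln_h + math.log(1 / delta)) / eps)

    # Simpler (shorter-to-describe) hypothesis classes need fewer examples.
    for bits in (10, 100, 1000):
        print(bits, "description bits ->", sufficient_examples(bits, 0.05, 0.01), "examples")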
>
> For discussion of the above paragraphs, I'd refer you to
> Chapter 4 of What is Thought? (MIT Press, 2004).
>
> In other words, if you are building some system that learns
> about the world, it had better respect Occam's razor if you
> want whatever it learns to apply to new experience.
> (I use the term Occam's razor loosely; using
> hypotheses that are highly constrained in ways other than
> just being concise may work, but you'd better respect
> "simplicity" broadly defined. See Chap 6 of WIT? for
> more discussion of this point.)
>
> The core problem of GI is generalization: you want to be able to
> figure out new problems as they come along that you haven't seen
> before. In order to do that, you basically must implicitly or
> explicitly employ some version
> of Occam's Razor, independent of how much resources you have.
>
> In my view, th

Re: [agi] Occam's Razor and its abuse

2008-10-28 Thread Pei Wang
Ed,

Since NARS doesn't follow the Bayesian approach, there are no initial
priors to be assumed. If we use a more general term, such as "initial
knowledge" or "innate beliefs", then yes, you can add them into the
system, which will improve the system's performance. However, they are
optional. In NARS, all object-level (i.e., not meta-level) innate
beliefs can be learned by the system afterward.

Pei

On Tue, Oct 28, 2008 at 5:37 PM, Ed Porter <[EMAIL PROTECTED]> wrote:
> It appears to me that the assumptions about initial priors used by a self
> learning AGI or an evolutionary line of AGI's could be quite minimal.
>
> My understanding is that once a probability distribution starts receiving
> random samples from its distribution the effect of the original prior
> becomes rapidly lost, unless it is a rather rare one.  Such rare problem
> priors would get selected against quickly by evolution.  Evolution would
> tend to tune for the most appropriate priors for the success of subsequent
> generations (either of computing in the same system if it is capable of
> enough change or of descendant systems).  Probably the best priors would
> generally be ones that could be trained moderately rapidly by data.
>
> So it seems an evolutionary system or line could initially learn priors
> without any assumptions for priors other than a random picking of priors.
> Over time and multiple generations it might develop hereditary priors, and
> perhaps even different hereditary priors for parts of its network connected
> to different inputs, outputs or internal controls.
>
> The use of priors in an AGI could be greatly improved by having a gen/comp
> hierarchy in which models for a given concept could be inherited from the
> priors of sets of models for similar concepts, and that the set of priors
> appropriate could change contextually.  It would also seem that the notion
> of a prior could be improved by blending information from episodic and
> probabilistic models.
>
> It would appear that in almost any generally intelligent system, being able
> to approximate reality in a manner sufficient for evolutionary success with
> the most efficient representations would be a characteristic that would be
> greatly preferred by evolution, because it would allow systems to better
> model more of their environment sufficiently well for evolutionary success
> with whatever current modeling capacity they have.
>
> So, although a completely accurate description of virtually anything may not
> find much use for Occam's Razor, as a practically useful representation it
> often will.  It seems to me that Occam's Razor is more oriented to deriving
> meaningful generalizations than it is to exact descriptions of anything.
>
> Furthermore, it would seem to me that a simpler set of preconditions is
> generally more probable than a more complex one, because it requires less
> coincidence.  It would seem to me this would be true under most random sets
> of priors for the probabilities of the possible sets of components involved
> and Occam's Razor type selection.
>
> These are the musings of an untrained mind, since I have not spent much time
> studying philosophy, because such a high percent of it was so obviously
> stupid (such as what was commonly said when I was young, that you can't have
> intelligence without language) and my understanding of math is much less
> than that of many on this list.  But none the less I think much of what I
> have said above is true.
>
> I think its gist is not totally dissimilar to what Abram has said.
>
> Ed Porter
>
>
>
>
> -Original Message-
> From: Pei Wang [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, October 28, 2008 3:05 PM
> To: agi@v2.listbox.com
> Subject: Re: [agi] Occam's Razor and its abuse
>
>
> Abram,
>
> I agree with your basic idea in the following, though I usually put it in
> different form.
>
> Pei
>
> On Tue, Oct 28, 2008 at 2:52 PM, Abram Demski <[EMAIL PROTECTED]> wrote:
>> Ben,
>>
>> You assert that Pei is forced to make an assumption about the
> regularity of the world to justify adaptation. Pei could also take a
>> different argument. He could try to show that *if* a strategy exists
>> that can be implemented given the finite resources, NARS will
>> eventually find it. Thus, adaptation is justified on a sort of "we
>> might as well try" basis. (The proof would involve showing that NARS
> searches the space of finite-state machines that can be implemented
>> with the resources at hand, and is more probable to stay for longer
>> periods of time in configurations that give more reward, such that
>> NARS would eventually 

Re: [agi] Occam's Razor and its abuse

2008-10-28 Thread Pei Wang
Matt,

The "currently known laws of physics" is a *description* of the
universe at a certain level, which is fundamentally different from the
universe itself. Also, "All human knowledge can be reduced to
physics" is not a viewpoint accepted by everyone.

Furthermore, "computable" is a property of a mathematical function. It
takes a bunch of assumptions to be applied to a statement, and some
additional ones to be applied to an object --- Is the Earth
"computable"? Does the previous question ever make sense?

Whenever someone "prove" something outside mathematics, it is always
based on certain assumptions. If the assumptions are not well
justified, there is no strong reason for people to accept the
conclusion, even though the proof process is correct.

Pei

On Tue, Oct 28, 2008 at 5:23 PM, Matt Mahoney <[EMAIL PROTECTED]> wrote:
> Hutter proved Occam's Razor (AIXI) for the case of any environment with a 
> computable probability distribution. It applies to us because the observable 
> universe is Turing computable according to currently known laws of physics. 
> Specifically, the observable universe has a finite description length 
> (approximately 2.91 x 10^122 bits, the Bekenstein bound of the Hubble radius).
>
> AIXI has nothing to do with insufficiency of resources. Given unlimited 
> resources we would still prefer the (algorithmically) simplest explanation 
> because it is the most likely under a Solomonoff distribution of possible 
> environments.
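
As a toy illustration of what such a distribution does (a sketch only; the
"hypotheses" and their description lengths below are made up, standing in for
actual programs): each candidate gets prior weight proportional to 2^(-length),
so shorter descriptions dominate before any data are seen.

    # Toy Occam-style prior: weight each candidate description by 2^(-length in bits)
    # and normalize. Real Solomonoff induction weights programs, not labels like these.
    description_bits = {"short_hypothesis": 8, "medium_hypothesis": 16, "long_hypothesis": 32}

    weights = {h: 2.0 ** -bits for h, bits in description_bits.items()}
    total = sum(weights.values())
    priors = {h: w / total for h, w in weights.items()}

    for h, p in priors.items():
        print(h, f"{p:.3g}")   # ~0.996, ~0.00389, ~5.9e-08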
>
> Also, AIXI does not state "the simplest answer is the best answer". It says 
> that the simplest answer consistent with observation so far is the best 
> answer. When we are short on resources (and we always are because AIXI is not 
> computable), then we may choose a different explanation than the simplest 
> one. However this does not make the alternative correct.
>
> -- Matt Mahoney, [EMAIL PROTECTED]
>
>
> --- On Tue, 10/28/08, Pei Wang <[EMAIL PROTECTED]> wrote:
>
>> From: Pei Wang <[EMAIL PROTECTED]>
>> Subject: [agi] Occam's Razor and its abuse
>> To: agi@v2.listbox.com
>> Date: Tuesday, October 28, 2008, 11:58 AM
>> Triggered by several recent discussions, I'd like to
>> make the
>> following position statement, though won't commit
>> myself to long
>> debate on it. ;-)
>>
>> Occam's Razor, in its original form, goes like
>> "entities must not be
>> multiplied beyond necessity", and it is often stated
>> as "All other
>> things being equal, the simplest solution is the best"
>> or "when
>> multiple competing theories are equal in other respects,
>> the principle
>> recommends selecting the theory that introduces the fewest
>> assumptions
>> and postulates the fewest entities" --- all from
>> http://en.wikipedia.org/wiki/Occam's_razor
>>
>> I fully agree with all of the above statements.
>>
>> However, to me, there are two common misunderstandings
>> associated with
>> it in the context of AGI and philosophy of science.
>>
>> (1) To take this statement as self-evident or a stand-alone
>> postulate
>>
>> To me, it is derived or implied by the insufficiency of
>> resources. If
>> a system has sufficient resources, it has no good reason to
>> prefer a
>> simpler theory.
>>
>> (2) To take it to mean "The simplest answer is usually
>> the correct answer."
>>
>> This is a very different statement, which cannot be
>> justified either
>> analytically or empirically.  When theory A is an
>> approximation of
>> theory B, usually the former is simpler than the latter,
>> but less
>> "correct" or "accurate", in terms of
>> its relation with all available
>> evidence. When we are short in resources and have a low
>> demand on
>> accuracy, we often prefer A over B, but it does not mean
>> that by doing
>> so we judge A as more correct than B.
>>
>> In summary, in choosing among alternative theories or
>> conclusions, the
>> preference for simplicity comes from shortage of resources,
>> though
>> simplicity and correctness are logically independent of
>> each other.
>>
>> Pei
>
>
>
>




Re: [agi] Occam's Razor and its abuse

2008-10-28 Thread Pei Wang
Abram,

I agree with your basic idea in the following, though I usually put it
in different form.

Pei

On Tue, Oct 28, 2008 at 2:52 PM, Abram Demski <[EMAIL PROTECTED]> wrote:
> Ben,
>
> You assert that Pei is forced to make an assumption about the
>> regularity of the world to justify adaptation. Pei could also take a
> different argument. He could try to show that *if* a strategy exists
> that can be implemented given the finite resources, NARS will
> eventually find it. Thus, adaptation is justified on a sort of "we
> might as well try" basis. (The proof would involve showing that NARS
>> searches the space of finite-state machines that can be implemented
> with the resources at hand, and is more probable to stay for longer
> periods of time in configurations that give more reward, such that
> NARS would eventually settle on a configuration if that configuration
> consistently gave the highest reward.)
>
> So, some form of learning can take place with no assumptions. The
> problem is that the search space is exponential in the resources
> available, so there is some maximum point where the system would
>> perform best (because the amount of resources matches the problem), but
> giving the system more resources would hurt performance (because the
> system searches the unnecessarily large search space). So, in this
> sense, the system's behavior seems counterintuitive-- it does not seem
> to be taking advantage of the increased resources.
>
>> I'm not claiming NARS would have that problem, of course, just that
> a theoretical no-assumption learner would.
>
> --Abram
>
> On Tue, Oct 28, 2008 at 2:12 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>>
>>
>> On Tue, Oct 28, 2008 at 10:00 AM, Pei Wang <[EMAIL PROTECTED]> wrote:
>>>
>>> Ben,
>>>
>>> Thanks. So the other people now see that I'm not attacking a straw man.
>>>
>>> My solution to Hume's problem, as embedded in the experience-grounded
>>> semantics, is to assume no predictability, but to justify induction as
>>> adaptation. However, it is a separate topic which I've explained in my
>>> other publications.
>>
>> Right, but justifying induction as adaptation only works if the environment
>> is assumed to have certain regularities which can be adapted to.  In a
>> random environment, adaptation won't work.  So, still, to justify induction
>> as adaptation you have to make *some* assumptions about the world.
>>
>> The Occam prior gives one such assumption: that (to give just one form) sets
>> of observations in the world tend to be producible by short computer
>> programs.
>>
>> For adaptation to successfully carry out induction, *some* vaguely
>> comparable property to this must hold, and I'm not sure if you have
>> articulated which one you assume, or if you leave this open.
>>
>> In effect, you implicitly assume something like an Occam prior, because
>> you're saying that  a system with finite resources can successfully adapt to
>> the world ... which means that sets of observations in the world *must* be
>> approximately summarizable via subprograms that can be executed within this
>> system.
>>
>> So I argue that, even though it's not your preferred way to think about it,
>> your own approach to AI theory and practice implicitly assumes some variant
>> of the Occam prior holds in the real world.
>>>
>>>
>>> Here I just want to point out that the original and basic meaning of
>>> Occam's Razor and those two common (mis)usages of it are not
>>> necessarily the same. I fully agree with the former, but not the
>>> latter, and I haven't seen any convincing justification of the latter.
>>> Instead, they are often taken as granted, under the name of Occam's
>>> Razor.
>>
>> I agree that the notion of an Occam prior is a significant conceptual step
>> beyond the original "Occam's Razor" precept enounced long ago.
>>
>> Also, I note that, for those who posit the Occam prior as a **prior
>> assumption**, there is not supposed to be any convincing justification for
>> it.  The idea is simply that: one must make *some* assumption (explicitly or
>> implicitly) if one wants to do induction, and this is the assumption that
>> some people choose to make.
>>
>> -- Ben G
>>
>>
>>
>
>
>




Re: [agi] Occam's Razor and its abuse

2008-10-28 Thread Pei Wang
On Tue, Oct 28, 2008 at 3:01 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>
> I believe I could prove that *mathematically*, in order for a NARS system to
> consistently, successfully achieve goals in an environment, that environment
> would need to have some Occam-prior-like property.

Maybe you can, but "to consistently, successfully achieve goals in an
environment" is not in my working definition of "intelligence", so I
don't really mind.

> However, even if so, that doesn't mean such is the best way to think about
> NARS ... that's a different issue.

Exactly. I'm glad we finally agree again. ;-)

Pei

> -- Ben G
>
> On Tue, Oct 28, 2008 at 11:58 AM, Pei Wang <[EMAIL PROTECTED]> wrote:
>>
>> Ben,
>>
>> It seems that you agree the issue I pointed out really exists, but
>> just take it as a necessary evil. Furthermore, you think I also
>> assumed the same thing, though I failed to see it. I won't argue
>> against the "necessary evil" part --- as far as you agree that those
>> "postulates" (such as "the universe is computable") are not
>> convincingly justified. I won't try to disprove them.
>>
>> As for the latter part, I don't think you can convince me that you
>> know me better than I know myself. ;-)
>>
>> The following is from
>> http://nars.wang.googlepages.com/wang.semantics.pdf , page 28:
>>
>> If the answers provided by NARS are fallible, in what sense these answers
>> are
>> "better" than arbitrary guesses? This leads us to the concept of
>> "rationality".
>> When infallible predictions cannot be obtained (due to insufficient
>> knowledge
>> and resources), answers based on past experience are better than arbitrary
>> guesses, if the environment is relatively stable. To say an answer is only
>> a
>> summary of past experience (thus no future confirmation guaranteed) does
>> not make it equal to an arbitrary conclusion — it is what "adaptation"
>> means.
>> Adaptation is the process in which a system changes its behaviors as if
>> the
>> future is similar to the past. It is a rational process, even though
>> individual
>> conclusions it produces are often wrong. For this reason, valid inference
>> rules
>> (deduction, induction, abduction, and so on) are the ones whose
>> conclusions
>> correctly (according to the semantics) summarize the evidence in the
>> premises.
>> They are "truth-preserving" in this sense, not in the model-theoretic
>> sense that
>> they always generate conclusions which are immune from future revision.
>>
>> --- so you see, I don't assume adaptation will always be successful,
>> even successful to a certain probability. You can dislike this
>> conclusion, though you cannot say it is the same as what is assumed by
>> Novamente and AIXI.
>>
>> Pei
>>
>> On Tue, Oct 28, 2008 at 2:12 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>> >
>> >
>> > On Tue, Oct 28, 2008 at 10:00 AM, Pei Wang <[EMAIL PROTECTED]>
>> > wrote:
>> >>
>> >> Ben,
>> >>
>> >> Thanks. So the other people now see that I'm not attacking a straw man.
>> >>
>> >> My solution to Hume's problem, as embedded in the experience-grounded
>> >> semantics, is to assume no predictability, but to justify induction as
>> >> adaptation. However, it is a separate topic which I've explained in my
>> >> other publications.
>> >
>> > Right, but justifying induction as adaptation only works if the
>> > environment
>> > is assumed to have certain regularities which can be adapted to.  In a
>> > random environment, adaptation won't work.  So, still, to justify
>> > induction
>> > as adaptation you have to make *some* assumptions about the world.
>> >
>> > The Occam prior gives one such assumption: that (to give just one form)
>> > sets
>> > of observations in the world tend to be producible by short computer
>> > programs.
>> >
>> > For adaptation to successfully carry out induction, *some* vaguely
>> > comparable property to this must hold, and I'm not sure if you have
>> > articulated which one you assume, or if you leave this open.
>> >
>> > In effect, you implicitly assume something like an Occam prior, because
>> > you're saying that  a system with finite resources can successfully
>> > adapt to
>> 

Re: [agi] Occam's Razor and its abuse

2008-10-28 Thread Pei Wang
We can say the same thing for the human mind, right?

Pei

On Tue, Oct 28, 2008 at 2:54 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>
> Sure ... but my point is that unless the environment satisfies a certain
> Occam-prior-like property, NARS will be useless...
>
> ben
>
> On Tue, Oct 28, 2008 at 11:52 AM, Abram Demski <[EMAIL PROTECTED]>
> wrote:
>>
>> Ben,
>>
>> You assert that Pei is forced to make an assumption about the
>> regularity of the world to justify adaptation. Pei could also take a
>> different argument. He could try to show that *if* a strategy exists
>> that can be implemented given the finite resources, NARS will
>> eventually find it. Thus, adaptation is justified on a sort of "we
>> might as well try" basis. (The proof would involve showing that NARS
>> searches the space of finite-state machines that can be implemented
>> with the resources at hand, and is more probable to stay for longer
>> periods of time in configurations that give more reward, such that
>> NARS would eventually settle on a configuration if that configuration
>> consistently gave the highest reward.)
>>
>> So, some form of learning can take place with no assumptions. The
>> problem is that the search space is exponential in the resources
>> available, so there is some maximum point where the system would
>> perform best (because the amount of resources matches the problem), but
>> giving the system more resources would hurt performance (because the
>> system searches the unnecessarily large search space). So, in this
>> sense, the system's behavior seems counterintuitive-- it does not seem
>> to be taking advantage of the increased resources.
>>
>> I'm not claiming NARS would have that problem, of course.... just that
>> a theoretical no-assumption learner would.
>>
>> --Abram
>>
>> On Tue, Oct 28, 2008 at 2:12 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>> >
>> >
>> > On Tue, Oct 28, 2008 at 10:00 AM, Pei Wang <[EMAIL PROTECTED]>
>> > wrote:
>> >>
>> >> Ben,
>> >>
>> >> Thanks. So the other people now see that I'm not attacking a straw man.
>> >>
>> >> My solution to Hume's problem, as embedded in the experience-grounded
>> >> semantics, is to assume no predictability, but to justify induction as
>> >> adaptation. However, it is a separate topic which I've explained in my
>> >> other publications.
>> >
>> > Right, but justifying induction as adaptation only works if the
>> > environment
>> > is assumed to have certain regularities which can be adapted to.  In a
>> > random environment, adaptation won't work.  So, still, to justify
>> > induction
>> > as adaptation you have to make *some* assumptions about the world.
>> >
>> > The Occam prior gives one such assumption: that (to give just one form)
>> > sets
>> > of observations in the world tend to be producible by short computer
>> > programs.
>> >
>> > For adaptation to successfully carry out induction, *some* vaguely
>> > comparable property to this must hold, and I'm not sure if you have
>> > articulated which one you assume, or if you leave this open.
>> >
>> > In effect, you implicitly assume something like an Occam prior, because
>> > you're saying that  a system with finite resources can successfully
>> > adapt to
>> > the world ... which means that sets of observations in the world *must*
>> > be
>> > approximately summarizable via subprograms that can be executed within
>> > this
>> > system.
>> >
>> > So I argue that, even though it's not your preferred way to think about
>> > it,
>> > your own approach to AI theory and practice implicitly assumes some
>> > variant
>> > of the Occam prior holds in the real world.
>> >>
>> >>
>> >> Here I just want to point out that the original and basic meaning of
>> >> Occam's Razor and those two common (mis)usages of it are not
>> >> necessarily the same. I fully agree with the former, but not the
>> >> latter, and I haven't seen any convincing justification of the latter.
>> >> Instead, they are often taken as granted, under the name of Occam's
>> >> Razor.
>> >
>> > I agree that the notion of an Occam prior is a significant conceptual
>> > step beyond
>> > the original "Occam

Re: [agi] Occam's Razor and its abuse

2008-10-28 Thread Pei Wang
Ben,

It seems that you agree the issue I pointed out really exists, but
just take it as a necessary evil. Furthermore, you think I also
assumed the same thing, though I failed to see it. I won't argue
against the "necessary evil" part --- as far as you agree that those
"postulates" (such as "the universe is computable") are not
convincingly justified. I won't try to disprove them.

As for the latter part, I don't think you can convince me that you
know me better than I know myself. ;-)

The following is from
http://nars.wang.googlepages.com/wang.semantics.pdf , page 28:

If the answers provided by NARS are fallible, in what sense these answers are
"better" than arbitrary guesses? This leads us to the concept of "rationality".
When infallible predictions cannot be obtained (due to insufficient knowledge
and resources), answers based on past experience are better than arbitrary
guesses, if the environment is relatively stable. To say an answer is only a
summary of past experience (thus no future confirmation guaranteed) does
not make it equal to an arbitrary conclusion — it is what "adaptation" means.
Adaptation is the process in which a system changes its behaviors as if the
future is similar to the past. It is a rational process, even though individual
conclusions it produces are often wrong. For this reason, valid inference rules
(deduction, induction, abduction, and so on) are the ones whose conclusions
correctly (according to the semantics) summarize the evidence in the premises.
They are "truth-preserving" in this sense, not in the model-theoretic sense that
they always generate conclusions which are immune from future revision.

--- so you see, I don't assume adaptation will always be successful,
even successful to a certain probability. You can dislike this
conclusion, though you cannot say it is the same as what is assumed by
Novamente and AIXI.

Pei

On Tue, Oct 28, 2008 at 2:12 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>
>
> On Tue, Oct 28, 2008 at 10:00 AM, Pei Wang <[EMAIL PROTECTED]> wrote:
>>
>> Ben,
>>
>> Thanks. So the other people now see that I'm not attacking a straw man.
>>
>> My solution to Hume's problem, as embedded in the experience-grounded
>> semantics, is to assume no predictability, but to justify induction as
>> adaptation. However, it is a separate topic which I've explained in my
>> other publications.
>
> Right, but justifying induction as adaptation only works if the environment
> is assumed to have certain regularities which can be adapted to.  In a
> random environment, adaptation won't work.  So, still, to justify induction
> as adaptation you have to make *some* assumptions about the world.
>
> The Occam prior gives one such assumption: that (to give just one form) sets
> of observations in the world tend to be producible by short computer
> programs.
>
> For adaptation to successfully carry out induction, *some* vaguely
> comparable property to this must hold, and I'm not sure if you have
> articulated which one you assume, or if you leave this open.
>
> In effect, you implicitly assume something like an Occam prior, because
> you're saying that  a system with finite resources can successfully adapt to
> the world ... which means that sets of observations in the world *must* be
> approximately summarizable via subprograms that can be executed within this
> system.
>
> So I argue that, even though it's not your preferred way to think about it,
> your own approach to AI theory and practice implicitly assumes some variant
> of the Occam prior holds in the real world.
>>
>>
>> Here I just want to point out that the original and basic meaning of
>> Occam's Razor and those two common (mis)usages of it are not
>> necessarily the same. I fully agree with the former, but not the
>> latter, and I haven't seen any convincing justification of the latter.
>> Instead, they are often taken as granted, under the name of Occam's
>> Razor.
>
> I agree that the notion of an Occam prior is a significant conceptual step
> beyond the original "Occam's Razor" precept enounced long ago.
>
> Also, I note that, for those who posit the Occam prior as a **prior
> assumption**, there is not supposed to be any convincing justification for
> it.  The idea is simply that: one must make *some* assumption (explicitly or
> implicitly) if one wants to do induction, and this is the assumption that
> some people choose to make.
>
> -- Ben G
>
>
>




Re: [agi] Occam's Razor and its abuse

2008-10-28 Thread Pei Wang
Ben,

Thanks. So the other people now see that I'm not attacking a straw man.

My solution to Hume's problem, as embedded in the experience-grounded
semantics, is to assume no predictability, but to justify induction as
adaptation. However, it is a separate topic which I've explained in my
other publications.

Here I just want to point out that the original and basic meaning of
Occam's Razor and those two common (mis)usages of it are not
necessarily the same. I fully agree with the former, but not the
latter, and I haven't seen any convincing justification of the latter.
Instead, they are often taken as granted, under the name of Occam's
Razor.

Pei

On Tue, Oct 28, 2008 at 12:37 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>
> Hi Pei,
>
> This is an interesting perspective; I just want to clarify for others on the
> list that it is a particular and controversial perspective, and contradicts
> the perspectives of many other well-informed research professionals and deep
> thinkers on relevant topics.
>
> Many serious thinkers in the area *do* consider Occam's Razor a standalone
> postulate.  This fits in naturally with the Bayesian perspective, in which
> one needs to assume *some* prior distribution, so one often assumes some
> sort of Occam prior (e.g. the Solomonoff-Levin prior, the speed prior, etc.)
> as a standalone postulate.
>
> Hume pointed out that induction (in the old sense of extrapolating from the
> past into the future) is not solvable except by introducing some kind of a
> priori assumption.  Occam's Razor, in one form or another, is a suitable a
> priori assumption to plug into this role.
>
> If you want to replace the Occam's Razor assumption with the assumption that
> "the world is predictable by systems with limited resources, and we will
> prefer explanations that consume less resources", that seems unproblematic
> as it's basically equivalent to assuming an Occam prior.
>
> On the other hand, I just want to point out that to get around Hume's
> complaint you do need to make *some* kind of assumption about the regularity
> of the world.  What kind of assumption of this nature underlies your work on
> NARS (if any)?
>
> ben
>
> On Tue, Oct 28, 2008 at 8:58 AM, Pei Wang <[EMAIL PROTECTED]> wrote:
>>
>> Triggered by several recent discussions, I'd like to make the
>> following position statement, though won't commit myself to long
>> debate on it. ;-)
>>
>> Occam's Razor, in its original form, goes like "entities must not be
>> multiplied beyond necessity", and it is often stated as "All other
>> things being equal, the simplest solution is the best" or "when
>> multiple competing theories are equal in other respects, the principle
>> recommends selecting the theory that introduces the fewest assumptions
>> and postulates the fewest entities" --- all from
>> http://en.wikipedia.org/wiki/Occam's_razor
>>
>> I fully agree with all of the above statements.
>>
>> However, to me, there are two common misunderstandings associated with
>> it in the context of AGI and philosophy of science.
>>
>> (1) To take this statement as self-evident or a stand-alone postulate
>>
>> To me, it is derived or implied by the insufficiency of resources. If
>> a system has sufficient resources, it has no good reason to prefer a
>> simpler theory.
>>
>> (2) To take it to mean "The simplest answer is usually the correct
>> answer."
>>
>> This is a very different statement, which cannot be justified either
>> analytically or empirically.  When theory A is an approximation of
>> theory B, usually the former is simpler than the latter, but less
>> "correct" or "accurate", in terms of its relation with all available
>> evidence. When we are short in resources and have a low demand on
>> accuracy, we often prefer A over B, but it does not mean that by doing
>> so we judge A as more correct than B.
>>
>> In summary, in choosing among alternative theories or conclusions, the
>> preference for simplicity comes from shortage of resources, though
>> simplicity and correctness are logically independent of each other.
>>
>> Pei
>>
>>
>
>
>
> --
> Ben Goertzel, PhD
> CEO, Novamente LLC and Bi

[agi] Occam's Razor and its abuse

2008-10-28 Thread Pei Wang
Triggered by several recent discussions, I'd like to make the
following position statement, though won't commit myself to long
debate on it. ;-)

Occam's Razor, in its original form, goes like "entities must not be
multiplied beyond necessity", and it is often stated as "All other
things being equal, the simplest solution is the best" or "when
multiple competing theories are equal in other respects, the principle
recommends selecting the theory that introduces the fewest assumptions
and postulates the fewest entities" --- all from
http://en.wikipedia.org/wiki/Occam's_razor

I fully agree with all of the above statements.

However, to me, there are two common misunderstandings associated with
it in the context of AGI and philosophy of science.

(1) To take this statement as self-evident or a stand-alone postulate

To me, it is derived or implied by the insufficiency of resources. If
a system has sufficient resources, it has no good reason to prefer a
simpler theory.

(2) To take it to mean "The simplest answer is usually the correct answer."

This is a very different statement, which cannot be justified either
analytically or empirically.  When theory A is an approximation of
theory B, usually the former is simpler than the latter, but less
"correct" or "accurate", in terms of its relation with all available
evidence. When we are short of resources and have a low demand for
accuracy, we often prefer A over B, but it does not mean that by doing
so we judge A as more correct than B.

In summary, in choosing among alternative theories or conclusions, the
preference for simplicity comes from shortage of resources, though
simplicity and correctness are logically independent of each other.
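As a toy illustration of the resource-based preference in (1) (a minimal
sketch in Python; the accuracy and cost numbers are invented): a
resource-bounded chooser picks the simpler, cheaper theory only when the
more accurate one is unaffordable.

    def choose_theory(theories, budget):
        # theories: list of (name, accuracy, cost), with cost in abstract
        # resource units. Keep the theories the system can afford, then
        # return the most accurate affordable one.
        affordable = [t for t in theories if t[2] <= budget]
        return max(affordable, key=lambda t: t[1]) if affordable else None

    # Theory A: a simple approximation; Theory B: more accurate but expensive.
    theories = [("A (simple)", 0.90, 10), ("B (complex)", 0.99, 1000)]

    print(choose_theory(theories, budget=10**6))  # sufficient resources   -> B
    print(choose_theory(theories, budget=50))     # insufficient resources -> A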

Pei




Re: [agi] Reasoning by analogy recommendations

2008-10-17 Thread Pei Wang
Don't know what you've read, so have to start from the general references:

http://nars.wang.googlepages.com/wang.AGI-Curriculum.html , C.11:
Analogy and metaphor

http://cognet.mit.edu/library/books/view?isbn=0262571390 , The
Analogical Mind, Gentner, Holyoak, Kokinov (eds)

http://nars.wang.googlepages.com/wang.analogy.pdf , analogy in NARS

Pei

On Fri, Oct 17, 2008 at 11:09 AM, Harry Chesley <[EMAIL PROTECTED]> wrote:
> I find myself needing to more thoroughly understand reasoning by analogy.
> (I've read/thought about it to a degree, but would like more.) Anyone have
> any recommendation for books and/or papers on the subject?
>
> Thanks.
>
>
>
>




[agi] NEWS: Scientist develops programme to understand alien languages

2008-10-17 Thread Pei Wang
"... even an alien language far removed from any on Earth is likely to
have recognisable patterns that could help reveal how intelligent the
life forms are."

the news: 
http://www.telegraph.co.uk/earth/main.jhtml?view=DETAILS&grid=&xml=/earth/2008/10/15/scialien115.xml

the researcher: http://www.lmu.ac.uk/ies/comp/staff/jelliott/jre.htm




[agi] Announcement: Journal of Artificial General Intelligence

2008-10-13 Thread Pei Wang
Journal of Artificial General Intelligence (JAGI) has opened to submissions.

See http://journal.agi-network.org




Re: [agi] two types of semantics [Was: NARS and probability]

2008-10-12 Thread Pei Wang
On Sun, Oct 12, 2008 at 3:06 PM, Abram Demski <[EMAIL PROTECTED]> wrote:
> Pei,
>
> You are right, it doesn't make any such assumptions while Bayesian
> practice does. But, the parameter 'k' still fixes the length of time
> into the future that we are interested in predicting, right? So it
> seems to me that the truth value must be predictive, if its
> calculation depends on what we want to predict.

The truth-value is defined/measured according to past experience, but
is used to predict future experience. In particular, this is what the
"expectation" function is about. But still, a high expectation only
means that the system will behave under the assumption that the
statement may be confirmed again, which by no means guarantees the
actual confirmation of the statement in the future.

> That is why 'k' is hard to incorporate into the probabilistic NARSian
> scheme I want to formulate...

For this purpose, the interval version of the truth value may be easier.

Pei

> --Abram
>
> On Sun, Oct 12, 2008 at 2:07 PM, Pei Wang <[EMAIL PROTECTED]> wrote:
>> Abram: The parameter 'k' does not really depend on the future, because
>> it makes no assumption about what will happen in that period of time.
>> It is just a "ruler" or "weight" (used with scale) to measure the
>> amount of evidence, as a "reference amount".
>>
>> For other people: the definition of confidence, c = w/(w+k), says that
>> confidence is the proportion of current evidence within the total
>> evidence after future evidence of amount k arrives.
>>
>> Pei
>>
>> On Sun, Oct 12, 2008 at 1:48 PM, Abram Demski <[EMAIL PROTECTED]> wrote:
>>> Pei,
>>>
>>> In this context, how do you justify the use of 'k'? It seems like, by
>>> introducing 'k', you add a reliance on the truth of the future "after
>>> k observations" into the semantics. Since the induction/abduction
>>> formula is dependent on 'k', the truth values that result no longer
>>> only summarize experience; they are calculated with prediction in
>>> mind.
>>>
>>> --Abram
>>>
>>> On Sun, Oct 12, 2008 at 8:29 AM, Pei Wang <[EMAIL PROTECTED]> wrote:
>>>> A brief and non-technical description of the two types of semantics
>>>> mentioned in the previous discussions:
>>>>
>>>> (1) Model-Theoretic Semantics (MTS)
>>>>
>>>> (1.1) There is a world existing independently outside the intelligent
>>>> system (human or machine).
>>>>
>>>> (1.2) In principle, there is an objective description of the world, in
>>>> terms of objects, their properties, and relations among them.
>>>>
>>>> (1.3) Within the intelligent system, its knowledge is an approximation
>>>> of the objective description of the world.
>>>>
>>>> (1.4) The meaning of a symbol within the system is the object it
>>>> refers to in the world.
>>>>
>>>> (1.5) The truth-value of a statement within the system measures how
>>>> close it approximates the fact in the world.
>>>>
>>>> (2) Experience-Grounded Semantics (EGS)
>>>>
>>>> (2.1) There is a world existing independently outside the intelligent
>>>> system (human or machine). [same as (1.1), but the agreement stops
>>>> here]
>>>>
>>>> (2.2) Even in principle, there is no objective description of the
>>>> world. What the system has is its experience, the history of its
>>>> interaction with the world.
>>>>
>>>> (2.3) Within the intelligent system, its knowledge is a summary of its
>>>> experience.
>>>>
>>>> (2.4) The meaning of a symbol within the system is determined by its
>>>> role in the experience.
>>>>
>>>> (2.5) The truth-value of a statement within the system measures how
>>>> close it summarizes the relevant part of the experience.
>>>>
>>>> To further simplify the description, in the context of learning and
>>>> reasoning: MTS takes "objective truth" of statements and "real
>>>> meaning" of terms as aim of approximation, while EGS refuses them, but
>>>> takes experience (input data) as the only thing to depend on.
>>>>
>>>> As usual, each theory has its strength and limitation. The issue is
>>>> which one is more proper for AGI. MTS has been dominating in math,
>>>> logic, 

Re: [agi] two types of semantics [Was: NARS and probability]

2008-10-12 Thread Pei Wang
True. Similar parameters can be found in the work of Carnap and
Walley, with different interpretations.

Pei

On Sun, Oct 12, 2008 at 2:11 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>
> On the other hand, in PLN's indefinite probabilities there is a parameter k
> which
> plays a similar mathematical role,  yet **is** explicitly interpreted as
> being about
> a "number of hypothetical future observations" ...
>
> Clearly the interplay btw algebra and interpretation is one of the things
> that makes
> this area of research (uncertain logic) "interesting" ...
>
> ben g
>
> On Sun, Oct 12, 2008 at 2:07 PM, Pei Wang <[EMAIL PROTECTED]> wrote:
>>
>> Abram: The parameter 'k' does not really depend on the future, because
>> it makes no assumption about what will happen in that period of time.
>> It is just a "ruler" or "weight" (used with scale) to measure the
>> amount of evidence, as a "reference amount".
>>
>> For other people: the definition of confidence, c = w/(w+k), says that
>> confidence is the proportion of current evidence within the total
>> evidence after future evidence of amount k arrives.
>>
>> Pei
>>
>> On Sun, Oct 12, 2008 at 1:48 PM, Abram Demski <[EMAIL PROTECTED]>
>> wrote:
>> > Pei,
>> >
>> > In this context, how do you justify the use of 'k'? It seems like, by
>> > introducing 'k', you add a reliance on the truth of the future "after
>> > k observations" into the semantics. Since the induction/abduction
>> > formula is dependent on 'k', the truth values that result no longer
>> > only summarize experience; they are calculated with prediction in
>> > mind.
>> >
>> > --Abram
>> >
>> > On Sun, Oct 12, 2008 at 8:29 AM, Pei Wang <[EMAIL PROTECTED]>
>> > wrote:
>> >> A brief and non-technical description of the two types of semantics
>> >> mentioned in the previous discussions:
>> >>
>> >> (1) Model-Theoretic Semantics (MTS)
>> >>
>> >> (1.1) There is a world existing independently outside the intelligent
>> >> system (human or machine).
>> >>
>> >> (1.2) In principle, there is an objective description of the world, in
>> >> terms of objects, their properties, and relations among them.
>> >>
>> >> (1.3) Within the intelligent system, its knowledge is an approximation
>> >> of the objective description of the world.
>> >>
>> >> (1.4) The meaning of a symbol within the system is the object it
>> >> refers to in the world.
>> >>
>> >> (1.5) The truth-value of a statement within the system measures how
>> >> close it approximates the fact in the world.
>> >>
>> >> (2) Experience-Grounded Semantics (EGS)
>> >>
>> >> (2.1) There is a world existing independently outside the intelligent
>> >> system (human or machine). [same as (1.1), but the agreement stops
>> >> here]
>> >>
>> >> (2.2) Even in principle, there is no objective description of the
>> >> world. What the system has is its experience, the history of its
>> >> interaction with the world.
>> >>
>> >> (2.3) Within the intelligent system, its knowledge is a summary of its
>> >> experience.
>> >>
>> >> (2.4) The meaning of a symbol within the system is determined by its
>> >> role in the experience.
>> >>
>> >> (2.5) The truth-value of a statement within the system measures how
>> >> close it summarizes the relevant part of the experience.
>> >>
>> >> To further simplify the description, in the context of learning and
>> >> reasoning: MTS takes "objective truth" of statements and "real
>> >> meaning" of terms as aim of approximation, while EGS refuses them, but
>> >> takes experience (input data) as the only thing to depend on.
>> >>
>> >> As usual, each theory has its strength and limitation. The issue is
>> >> which one is more proper for AGI. MTS has been dominating in math,
>> >> logic, and computer science, and therefore is accepted by the majority of
>> >> people. Even so, it has been attacked by other people (not only the
>> >> EGS believers) for many reasons.
>> >>
>> >> A while ago I made a figure to illustrate this difference, which is at
>> >> ht

Re: [agi] two types of semantics [Was: NARS and probability]

2008-10-12 Thread Pei Wang
Abram: The parameter 'k' does not really depend on the future, because
it makes no assumption about what will happen in that period of time.
It is just a "ruler" or "weight" (used with scale) to measure the
amount of evidence, as a "reference amount".

For other people: the definition of confidence, c = w/(w+k), says that
confidence is the proportion of current evidence within the total
evidence after future evidence of amount k arrives.
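A minimal numeric sketch of this (Python; the evidence amounts and k are
arbitrary; frequency and expectation are computed by the standard NARS
definitions, f = w+/w and e = c(f - 1/2) + 1/2):

    def nars_truth(w_plus, w_minus, k=1.0):
        # w_plus, w_minus: amounts of positive and negative evidence;
        # k: the constant "reference amount" of future evidence.
        w = w_plus + w_minus                      # total current evidence
        frequency = w_plus / w if w > 0 else 0.5  # f = w+ / w (0.5 as a no-evidence placeholder)
        confidence = w / (w + k)                  # c = w / (w + k)
        expectation = confidence * (frequency - 0.5) + 0.5
        return frequency, confidence, expectation

    # 9 positive and 1 negative pieces of evidence, with k = 1:
    print(nars_truth(9, 1))   # f = 0.9, c = 10/11 ~ 0.909, e ~ 0.864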

Pei

On Sun, Oct 12, 2008 at 1:48 PM, Abram Demski <[EMAIL PROTECTED]> wrote:
> Pei,
>
> In this context, how do you justify the use of 'k'? It seems like, by
> introducing 'k', you add a reliance on the truth of the future "after
> k observations" into the semantics. Since the induction/abduction
> formula is dependent on 'k', the truth values that result no longer
> only summarize experience; they are calculated with prediction in
> mind.
>
> --Abram
>
> On Sun, Oct 12, 2008 at 8:29 AM, Pei Wang <[EMAIL PROTECTED]> wrote:
>> A brief and non-technical description of the two types of semantics
>> mentioned in the previous discussions:
>>
>> (1) Model-Theoretic Semantics (MTS)
>>
>> (1.1) There is a world existing independently outside the intelligent
>> system (human or machine).
>>
>> (1.2) In principle, there is an objective description of the world, in
>> terms of objects, their properties, and relations among them.
>>
>> (1.3) Within the intelligent system, its knowledge is an approximation
>> of the objective description of the world.
>>
>> (1.4) The meaning of a symbol within the system is the object it
>> refers to in the world.
>>
>> (1.5) The truth-value of a statement within the system measures how
>> close it approximates the fact in the world.
>>
>> (2) Experience-Grounded Semantics (EGS)
>>
>> (2.1) There is a world existing independently outside the intelligent
>> system (human or machine). [same as (1.1), but the agreement stops
>> here]
>>
>> (2.2) Even in principle, there is no objective description of the
>> world. What the system has is its experience, the history of its
>> interaction with the world.
>>
>> (2.3) Within the intelligent system, its knowledge is a summary of its
>> experience.
>>
>> (2.4) The meaning of a symbol within the system is determined by its
>> role in the experience.
>>
>> (2.5) The truth-value of a statement within the system measures how
>> close it summarizes the relevant part of the experience.
>>
>> To further simplify the description, in the context of learning and
>> reasoning: MTS takes "objective truth" of statements and "real
>> meaning" of terms as aim of approximation, while EGS refuses them, but
>> takes experience (input data) as the only thing to depend on.
>>
>> As usual, each theory has its strength and limitation. The issue is
>> which one is more proper for AGI. MTS has been dominating in math,
>> logic, and computer science, and therefore is accepted by the majority of
>> people. Even so, it has been attacked by other people (not only the
>> EGS believers) for many reasons.
>>
>> A while ago I made a figure to illustrate this difference, which is at
>> http://nars.wang.googlepages.com/wang.semantics-figure.pdf . A
>> manifesto of EGS is at
>> http://nars.wang.googlepages.com/wang.semantics.pdf
>>
>> Since the debate on the nature of "truth" and "meaning" has existed
>> for thousands of years, I don't think we can settle it here by
>> some email exchanges. I just want to let the interested people know
>> the theoretical background of the related discussions.
>>
>> Pei
>>
>>
>> On Sat, Oct 11, 2008 at 8:34 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>>>
>>>
>>>
>>> Hi,
>>>
>>>>
>>>> > What this highlights for me is the idea that NARS truth values attempt
>>>> > to reflect the evidence so far, while probabilities attempt to reflect
>>>> > the world
>>>
>>> I agree that probabilities attempt to reflect the world
>>>
>>>>
>>>> .
>>>>
>>>> Well said. This is exactly the difference between an
>>>> experience-grounded semantics and a model-theoretic semantics.
>>>
>>> I don't agree with this distinction ... unless you are construing "model
>>> theoretic semantics" in a very restrictive way, which then does not apply to
>>> PLN.

Re: [agi] two types of semantics [Was: NARS and probability]

2008-10-12 Thread Pei Wang
Ben,

Of course, "probability theory", in its mathematical form, is not
bounded to any semantics at all, though it implicitly exclude some
possibilities. A semantic theory is associated to it when probability
theory is applied to a practical situation.

There are several major schools in the interpretation of probability
(see http://plato.stanford.edu/entries/probability-interpret/), and
their relations with NARS is explained in Section 8.5.1 of my book.

As for the interpretation of probability in PLN, I'd rather wait for
your book than to make comment based on your brief explanation.

Pei


On Sun, Oct 12, 2008 at 9:13 AM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>
> Thanks Pei,
>
> I would add (for others, obviously you know this stuff) that there are many
> different
> theoretical justifications of probability theory, hence that the use of
> probability
> theory does not imply model-theoretic semantics nor any other particular
> approach to semantics.
>
> My own philosophy is even further from your summary of model-theoretic
> semantics than it is from (my reading of) Tarski's original version of model
> theoretic semantics.  I am not an objectivist whatsoever  (I read too
> many
> Oriental philosophy books in my early youth, when my mom was studying
> for her PhD in Chinese history, and my brain was even more pliant  ;-).
> I deal extensively with objectivity/subjectivity/intersubjectivity issues in
> "The Hidden Pattern."
>
> As an example, if one justifies probability theory according a Cox's-axioms
> approach, no model theory is necessary.  In this approach, it is justified
> as a set of a priori constraints that the system chooses to impose on its
> own
> reasoning.
>
> In a de Finetti approach, it is justified because the system wants to
> be able to "win bets" with other agents.  The intersection between this
> notion and the hypothesis of an "objective world" is unclear, but it's not
> obvious why these hypothetical agents need to have objective existence.
>
> As you say, this is a deep philosophical rat's-nest... my point is just that
> it's
> not correct to imply "probability theory = traditional
> model theoretic semantics"
>
> -- Ben G
>
> On Sun, Oct 12, 2008 at 8:29 AM, Pei Wang <[EMAIL PROTECTED]> wrote:
>>
>> A brief and non-technical description of the two types of semantics
>> mentioned in the previous discussions:
>>
>> (1) Model-Theoretic Semantics (MTS)
>>
>> (1.1) There is a world existing independently outside the intelligent
>> system (human or machine).
>>
>> (1.2) In principle, there is an objective description of the world, in
>> terms of objects, their properties, and relations among them.
>>
>> (1.3) Within the intelligent system, its knowledge is an approximation
>> of the objective description of the world.
>>
>> (1.4) The meaning of a symbol within the system is the object it
>> refers to in the world.
>>
>> (1.5) The truth-value of a statement within the system measures how
>> close it approximates the fact in the world.
>>
>> (2) Experience-Grounded Semantics (EGS)
>>
>> (2.1) There is a world existing independently outside the intelligent
>> system (human or machine). [same as (1.1), but the agreement stops
>> here]
>>
>> (2.2) Even in principle, there is no objective description of the
>> world. What the system has is its experience, the history of its
>> interaction with the world.
>>
>> (2.3) Within the intelligent system, its knowledge is a summary of its
>> experience.
>>
>> (2.4) The meaning of a symbol within the system is determined by its
>> role in the experience.
>>
>> (2.5) The truth-value of a statement within the system measures how
>> close it summarizes the relevant part of the experience.
>>
>> To further simplify the description, in the context of learning and
>> reasoning: MTS takes "objective truth" of statements and "real
>> meaning" of terms as aim of approximation, while EGS refuses them, but
>> takes experience (input data) as the only thing to depend on.
>>
>> As usual, each theory has its strength and limitation. The issue is
>> which one is more proper for AGI. MTS has been dominating in math,
>> logic, and computer science, and therefore is accepted by the majority of
>> people. Even so, it has been attacked by other people (not only the
>> EGS believers) for many reasons.
>>
>> A while ago I made a figure to illustrate this difference, which is at
>> http://nars.wang.googlepages.com/wang.semantics-figur

[agi] two types of semantics [Was: NARS and probability]

2008-10-12 Thread Pei Wang
A brief and non-technical description of the two types of semantics
mentioned in the previous discussions:

(1) Model-Theoretic Semantics (MTS)

(1.1) There is a world existing independently outside the intelligent
system (human or machine).

(1.2) In principle, there is an objective description of the world, in
terms of objects, their properties, and relations among them.

(1.3) Within the intelligent system, its knowledge is an approximation
of the objective description of the world.

(1.4) The meaning of a symbol within the system is the object it
refers to in the world.

(1.5) The truth-value of a statement within the system measures how
close it approximates the fact in the world.

(2) Experience-Grounded Semantics (EGS)

(2.1) There is a world existing independently outside the intelligent
system (human or machine). [same as (1.1), but the agreement stops
here]

(2.2) Even in principle, there is no objective description of the
world. What the system has is its experience, the history of its
interaction with the world.

(2.3) Within the intelligent system, its knowledge is a summary of its
experience.

(2.4) The meaning of a symbol within the system is determined by its
role in the experience.

(2.5) The truth-value of a statement within the system measures how
close it summarizes the relevant part of the experience.

To further simplify the description, in the context of learning and
reasoning: MTS takes "objective truth" of statements and "real
meaning" of terms as aim of approximation, while EGS refuses them, but
takes experience (input data) as the only thing to depend on.

As usual, each theory has its strength and limitation. The issue is
which one is more proper for AGI. MTS has been dominating in math,
logic, and computer science, and therefore is accepted by the majority of
people. Even so, it has been attacked by other people (not only the
EGS believers) for many reasons.

A while ago I made a figure to illustrate this difference, which is at
http://nars.wang.googlepages.com/wang.semantics-figure.pdf . A
manifesto of EGS is at
http://nars.wang.googlepages.com/wang.semantics.pdf

Since the debate on the nature of "truth" and "meaning" has existed
for thousands of years, I don't think we can settle it here by
some email exchanges. I just want to let the interested people know
the theoretical background of the related discussions.

Pei


On Sat, Oct 11, 2008 at 8:34 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>
>
>
> Hi,
>
>>
>> > What this highlights for me is the idea that NARS truth values attempt
>> > to reflect the evidence so far, while probabilities attempt to reflect
>> > the world
>
> I agree that probabilities attempt to reflect the world
>
>>
>> .
>>
>> Well said. This is exactly the difference between an
>> experience-grounded semantics and a model-theoretic semantics.
>
> I don't agree with this distinction ... unless you are construing "model
> theoretic semantics" in a very restrictive way, which then does not apply to
> PLN.
>
> If by model-theoretic semantics you mean something like what Wikipedia says
> at http://en.wikipedia.org/wiki/Formal_semantics,
>
> ***
> Model-theoretic semantics is the archetype of Alfred Tarski's semantic
> theory of truth, based on his T-schema, and is one of the founding concepts
> of model theory. This is the most widespread approach, and is based on the
> idea that the meaning of the various parts of the propositions are given by
> the possible ways we can give a recursively specified group of
> interpretation functions from them to some predefined mathematical domains:
> an interpretation of first-order predicate logic is given by a mapping from
> terms to a universe of individuals, and a mapping from propositions to the
> truth values "true" and "false".
> ***
>
> then yes, PLN's semantics is based on a mapping from terms to a universe of
> individuals, and a mapping from propositions to truth values.  On the other
> hand, these "individuals" may be for instance **elementary sensations or
> actions**, rather than higher-level individuals like, say, a specific cat,
> or the concept "cat".  So there is nothing non-experience-based about
> mapping terms into a "individuals" that are the system's direct experience
> ... and then building up more abstract terms by grouping these
> directly-experience-based terms.
>
> IMO, the dichotomy between experience-based and model-based semantics is a
> misleading one.  Model-based semantics has often been used in a
> non-experience-based way, but that is not because it fundamentally **has**
> to be used in that way.
>
> To say that PLN tries to model the world, is then just to say that it tries
> to make probabilistic predictions about sensations and actions that have not
> yet been experienced ... which is certainly the case.
>
>>
>> Once
>> again, the difference in truth-value functions is reduced to the
>> difference in semantics, that is, what the "truth-value" attempts to
>> measure.
>
> Agreed...
>
> Be

Re: [agi] NARS and probability

2008-10-11 Thread Pei Wang
On Sat, Oct 11, 2008 at 5:56 PM, Abram Demski <[EMAIL PROTECTED]> wrote:
>
>> I see your point --- it comes from the fact that "As are Bs" and "Bs
>> are As" have the same positive evidence (both in NARS and in PLN),
>> plus the additional assumption that "no positive evidence means
>> negative evidence". Here the problem is in the additional assumption.
>> Indeed it is assumed both in traditional logic and probability theory
>> that "everything matters for every statement" (as revealed by Hempel's
>> Paradox).
>
> Hmm... other additional assumptions will do the job here as well, and
> I don't see why you mentioned the one you did. An assumption closer to
> the argument I gave would be "The more negative evidence we've seen,
> the less positive evidence we should expect".

Yes, for this topic, your assumption may be more proper, though it is
still unjustified, unless it is further assumed that the total
amount of evidence is fixed.

Pei




Re: [agi] NARS and probability

2008-10-11 Thread Pei Wang
On Sat, Oct 11, 2008 at 4:10 PM, Abram Demski <[EMAIL PROTECTED]> wrote:
> Pei, Ben,
>
> I am going to try to spell out an arguments for each side (arguing for
> symmetry, then for asymmetry).
>
> For Symmetry:
>
> Suppose we get negative evidence for "As are Bs", such that we are
> tempted to say "no As are Bs". We then consider the statement "Bs are
> As", with no other info. We think, "If we found a B that was an A,
> then we would also have found an A that was a B; I don't think any
> exist; so, I don't think there are any Bs that are As." Thus, evidence
> against "As are Bs" is also evidence against "Bs are As".

I see your point --- it comes from the fact that "As are Bs" and "Bs
are As" have the same positive evidence (both in NARS and in PLN),
plus the additional assumption that "no positive evidence means
negative evidence". Here the problem is in the additional assumption.
Indeed it is assumed both in traditional logic and probability theory
that "everything matters for every statement" (as revealed by Hempel's
Paradox).

> Against Symmetry:
>
> If we are counting empirical frequencies, then an A that is not a B
> will lower the frequency of "As are Bs"; however, it will not alter
> the frequency count for "Bs are As".

Exactly.
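A small counting sketch of that point (Python; the observation list is
invented): an A that is not a B lowers the frequency of "As are Bs" but
not of "Bs are As", and something that is neither A nor B changes
neither frequency.

    def frequencies(observations):
        # observations: list of (is_A, is_B) pairs.
        pos = sum(1 for a, b in observations if a and b)         # positive evidence, shared
        neg_ab = sum(1 for a, b in observations if a and not b)  # negative evidence for "As are Bs"
        neg_ba = sum(1 for a, b in observations if b and not a)  # negative evidence for "Bs are As"
        return pos / (pos + neg_ab), pos / (pos + neg_ba)

    obs = [(True, True)] * 5                     # five things that are both A and B
    print(frequencies(obs))                      # (1.0, 1.0)
    print(frequencies(obs + [(True, False)]))    # an A that is not a B -> (0.833..., 1.0)
    print(frequencies(obs + [(False, True)]))    # a B that is not an A -> (1.0, 0.833...)
    print(frequencies(obs + [(False, False)]))   # neither (a "red apple") -> (1.0, 1.0)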

> What this highlights for me is the idea that NARS truth values attempt
> to reflect the evidence so far, while probabilities attempt to reflect
> the world.

Well said. This is exactly the difference between an
experience-grounded semantics and a model-theoretic semantics. Once
again, the difference in truth-value functions is reduced to the
difference in semantics, that is, what the "truth-value" attempts to
measure.

Pei

> --Abram
>
>
>




Re: [agi] NARS and probability

2008-10-11 Thread Pei Wang
Ben,

My summary was on "the asymmetry of induction/abduction" topic alone,
not on NARS vs. PLN in general --- of course NARS is counterintuitive
in several places!

Under that restriction, I assume you'll agree with my summary.

Please note that this issue is related to Hempel's Paradox, but not
the same --- the former is on negative evidence, while the latter is
on positive evidence.

I won't address the other issues here --- as you said, they are
complicated, and email discussion is not always enough. I'm looking
forward to the PLN book and your future publications on the related
topics.

Pei

On Sat, Oct 11, 2008 at 11:54 AM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>
> Thanks Pei!
>
> This is an interesting dialogue, but indeed, I have some reservations about
> putting so much energy into email dialogues -- for a couple reasons
>
> 1)
> because, once they're done,
> the text generated basically just vanishes into messy, barely-searchable
> archives.
>
> 2)
> because I tend to answer emails on the fly and hastily, without putting
> careful thought into phrasing, as I do when writing papers or books ... and
> this hastiness can sometimes add confusion
>
> It would be better to further explore these issues in some other forum where
> the
> discussion would be preserved in a more easily readable form, and where
> the medium is more conducive to carefully-thought-out phrasings...
>
>
>> Go back to where this debate starts: the asymmetry of
>> induction/abduction. To me, here is what the discussion  has revealed
>> so far:
>>
>> (1) The PLN solution is consistent with the Bayesian tradition and
>> probability theory in general, though it is counterintuitive.
>>
>> (2) The NARS solution fits people's intuition, though it violates
>> probability theory.
>
> I don't fully agree with this summary, sorry.
>
> I agree that the PLN approach
> is counterintuitive in some respects (e.g. the Hempel puzzle)
>
> I also note that the more innovative aspects of PLN don't seem
> to introduce any new counterintuitiveness.  The counterintuitiveness
> that is there is just inherited from plain old probability theory, it seems.
>
> However, I also feel
> the NARS approach is counterintuitive in some respects.  One
> example is the fact that in NARS induction/abduction, the frequency
> component of the conclusion depends on only one of the premises.
>
> Another example is the lack of Bayes
> rule in NARS: there is loads of evidence that humans and animals intuitively
> reason according to Bayes rule in various situations.
>
> Which approach (PLN or NARS) is more agreeable with human intuition, on the
> whole,
> is not clear to me.   And, as I argued in my prior email, this is not the
> most
> interesting issue from my point of view ... for two reasons, actually (only
> one
> of which I elaborated carefully before)
>
> 1)
> I'm not primarily trying to model humans, but rather trying to create a
> powerful
> AGI
>
> 2)
> Human intuition about human practice does not always match human
> practice.  What we feel like we're
> doing may not match what we're actually doing in our brains.  This is very
> plainly
> demonstrated for instance in the area of mental arithmetic: the algorithms
> people
> think they're following, could not possibly lead to the timing-patterns that
> people
> generate when actually solving mental arithmetic problems.  The same thing
> may hold for inference: the rules people think they're following may not be
> the
> ones they actually follow.  So that "intuitiveness" is of significant yet
> limited
> value in figuring out what people actually do unconsciously when thinking.
>
>
> -- Ben G
>
>
>
>
>
>




Re: [agi] "Logical Intuition"

2008-10-11 Thread Pei Wang
For people who really want to know, the issue of how NARS gets its
intuition is addressed in Section 14.1.3 of my book. However, I don't
think I can explain it to everyone's satisfaction by email.

Pei

On Sat, Oct 11, 2008 at 10:16 AM, Mike Tintner <[EMAIL PROTECTED]> wrote:
> Pei:The NARS solution fits people's intuition
>
> You guys keep talking - perfectly reasonably - about how your logics do or
> don't fit your intuition. The logical question is - how - on what principles
> - does your intuition work? What ideas do you have about this?
>
>
>
>




Re: [agi] NARS and probability

2008-10-11 Thread Pei Wang
Ben,

Your reply raised several interesting topics, and most of them cannot
be settled down in this kind of email exchanges. Therefore, I won't
address every of them here, but will propose another solution, in a
separate private email.

Go back to where this debate starts: the asymmetry of
induction/abduction. To me, here is what the discussion  has revealed
so far:

(1) The PLN solution is consistent with the Bayesian tradition and
probability theory in general, though it is counterintuitive.

(2) The NARS solution fits people's intuition, though it violates
probability theory.

Please note that on this topic, what is involved is not just "Pei's
intuition" (though in some other topics, it is) --- Hempel's Paradox
looks counterintuitive to everyone, including you (which you admitted)
and Hempel himself, though you, Hempel, and most of the others
involved in this research, choose to accept the counterintuitive
conclusion, because of the belief that probability theory should be
followed in commonsense reasoning.

As I said before, I don't think I can change your belief in
probability theory very soon. Therefore, as long as you think my above
summary is fair, I've reached my goal in this round of exchange.

Pei


On Sat, Oct 11, 2008 at 8:45 AM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>
> Pei etc.,
>
> First high level comment here, mostly to the non-Pei audience ... then I'll
> respond to some of the details:
>
> This dialogue -- so far -- feels odd to me because I have not been
>  defending anything special, peculiar or inventive about PLN here.
> There are some things about PLN that would be considered to fall into that
> category
> (e.g. the treatment of intension which uses my "pattern theory", and the
> treatment of quantifiers which uses third-order probabilities ... or even
> the
> use of indefinite truth values).   Those are the things that I would expect
> to be arguing about!  Even more interesting would be to argue about
> strategies
> for controlling combinatorial explosion in inference trees, which IMO is the
> truly crucial issue, more so than the particulars of the inference and
> uncertainty
> management formalism (though those particulars need to be workable too, if
> one is to have an AI with explicit inference as a significant component).
>
> Instead, in this dialogue, I am essentially defending the standard usage
> of probability theory, which is the **least** interesting and inventive part
> of
> PLN.  I'm defending the use of Bayes rule ... re-presenting the standard
> Bayesian argument about the Hempel confirmation problem, etc.
>
> This is rather a reversal of positions for me, as I more often these days
> argue
> with people who are hard-core Bayesians, who believe that explicitly doing
> Bayesian inference is the key to AGI ... and  my argument with them is that
> a) you need to supplement probability theory with heuristics, because
> otherwise
> things become intractable; b) these "heuristics" are huge and subtle and in
> fact wind up constituting a whole cognitive architecture of which explicit
> probability
> theory is just one component (but the whole architecture appears to the
> probabilistic-reasoning component as a set of heuristic assumptions).
>
> So anyway this is  not, so far, so much of a "PLN versus NARS" debate as a
> "probability theoretic AI versus NARS" debate, in the sense that none of the
> more odd/questionable/fun/inventive parts of PLN are being invoked here ...
> only the parts that are common to PLN and a lot of other approaches...
>
> But anyway, back to defending Bayes and elementary probability theory in
> (its application to common sense reasoning; obviously Pei is not disputing
> the actual mathematics!)
>
> Maybe in this reply I will get a chance to introduce some of the more
> interesting
> aspects of PLN, we'll see...
>
>>
>>
>> Since each inference rule usually only considers two premises, whether
>> the meanings of the involved concepts are rich or poor (i.e., whether
>> they are also involved in other statements not considered by the rule)
>> shouldn't matter in THAT STEP, right?
>
> It doesn't matter in the sense of determining
> what the system does in that step, but it matters in terms
> of the human "intuitiveness evaluation" of that step, because we are
> intuitively accustomed to evaluating inferences regarding rich concepts
> that have a lot of links, and for which we have some intuitive understanding
> of the relevant term probabilities.
>
>
>
>>
>> >> Further questions:
>> >>
>> >> (1) Don't you intuitively feel that the evidence provided by
>> >> non-swimming birds says more about "Birds are swimmers" than
>> >> "Swimmers are birds"?
>> >
>> > Yes, but only because I know intuitively that swimmers are more common
>> > in my everyday world than birds.
>>
>> Please note that this issue is different from our previous debate.
>> "Node probability" have nothing to do with the asymmetry in
>> induction/abduction.
>
> I don't remember our previo

Re: [agi] NARS and probability

2008-10-11 Thread Pei Wang
Brad,

Thanks for the encouragement.

For people who cannot fully grok the discussion from the email alone,
the relevant NARS references are
http://nars.wang.googlepages.com/wang.semantics.pdf and
http://nars.wang.googlepages.com/wang.confidence.pdf

Pei

On Sat, Oct 11, 2008 at 1:13 AM, Brad Paulsen <[EMAIL PROTECTED]> wrote:
> Pei, Ben G. and Abram,
>
> Oh, man, is this stuff GOOD!  This is the real nitty-gritty of the AGI
> matter.  How does your approach handle counter-evidence?  How does your
> approach deal with insufficient evidence?  (Those are rhetorical questions,
> by the way -- I don't want to influence the course of this thread, just want
> to let you know I dig it and, mostly, grok it as well).  I love this stuff.
>  You guys are brilliant.  Actually, I think it would make a good
> publication: "PLN vs. NARS -- The AGI Smack-down!"  A win-win contest.
>
> This is a rare treat for an old hacker like me.  And, I hope, educational
> for all (including the participants)!  Keep it coming, please!
>
> Cheers,
> Brad
>
> Pei Wang wrote:
>>
>> On Fri, Oct 10, 2008 at 8:03 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>>>
>>> Yah, according to Bayes rule if one assumes P(bird) = P(swimmer) this
>>> would
>>> be the case...
>>>
>>> (Of course, this kind of example is cognitively misleading, because if
>>> the
>>> only knowledge
>>> the system has is "Swallows are birds" and "Swallows are NOT swimmers"
>>> then
>>> it doesn't
>>> really know that the terms involved are "swallows", "birds", "swimmers"
>>> etc.
>>> ... then in
>>> that case they're just almost-meaningless tokens to the system, right?)
>>
>> Well, it depends on the semantics. According to model-theoretic
>> semantics, if a term has no reference, it has no meaning. According to
>> experience-grounded semantics, every term in experience has meaning
>> --- by the role it plays.
>>
>> Further questions:
>>
>> (1) Don't you intuitively feel that the evidence provided by
>> non-swimming birds says more about "Birds are swimmers" than
>> "Swimmers are birds"?
>>
>> (2) If your answer for (1) is "yes", then think about "Adults are
>> alcohol-drinkers" and "Alcohol-drinkers are adults" --- do they have
>> the same set of counter examples, intuitively speaking?
>>
>> (3) According to your previous explanation, will PLN also take a red
>> apple as negative evidence for "Birds are swimmers" and "Swimmers are
>> birds", because it reduces the "candidate pool" by one? Of course, the
>> probability adjustment may be very small, but qualitatively, isn't it
>> the same as a non-swimming bird? If not, then what the system will do
>> about it?
>>
>> Pei
>>
>>
>>> On Fri, Oct 10, 2008 at 7:34 PM, Pei Wang <[EMAIL PROTECTED]> wrote:
>>>>
>>>> Ben,
>>>>
>>>> I see your position.
>>>>
>>>> Let's go back to the example. If the only relevant domain knowledge
>>>> PLN has is "Swallows are birds" and "Swallows are
>>>> NOT swimmers", will the system assigns the same lower-than-default
>>>> probability to "Birds are swimmers" and  "Swimmers are birds"? Again,
>>>> I only need a qualitative answer.
>>>>
>>>> Pei
>>>>
>>>> On Fri, Oct 10, 2008 at 7:24 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>>>>>
>>>>> Pei,
>>>>>
>>>>> I finally took a moment to actually read your email...
>>>>>
>>>>>>
>>>>>> However, the negative evidence of one conclusion is no evidence of the
>>>>>> other conclusion. For example, "Swallows are birds" and "Swallows are
>>>>>> NOT swimmers" suggests "Birds are NOT swimmers", but says nothing
>>>>>> about whether "Swimmers are birds".
>>>>>>
>>>>>> Now I wonder if PLN shows a similar asymmetry in induction/abduction
>>>>>> on negative evidence. If it does, then how can that effect come out of
>>>>>> a symmetric truth-function? If it doesn't, how can you justify the
>>>>>> conclusion, which looks counter-intuitive?
>>>>>
>>>>> According to Bayes rule,
>>>>>

Re: [agi] NARS and probability

2008-10-11 Thread Pei Wang
On Fri, Oct 10, 2008 at 8:56 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:

>> Well, it depends on the semantics. According to model-theoretic
>> semantics, if a term has no reference, it has no meaning. According to
>> experience-grounded semantics, every term in experience has meaning
>> --- by the role it plays.
>
> That's why I said "almost-meaningless" ... if those are the only
> relationships
> known to the system, then the terms in those relationships play almost
> no roles, hence have almost no meanings...

Since each inference rule usually only considers two premises, whether
the meanings of the involved concepts are rich or poor (i.e., whether
they are also involved in other statements not considered by the rule)
shouldn't matter in THAT STEP, right?

>> Further questions:
>>
>> (1) Don't you intuitively feel that the evidence provided by
>> non-swimming birds says more about "Birds are swimmers" than
>> "Swimmers are birds"?
>
> Yes, but only because I know intuitively that swimmers are more common
> in my everyday world than birds.

Please note that this issue is different from our previous debate.
"Node probability" have nothing to do with the asymmetry in
induction/abduction.

For example, "non-swimmer birds" is negative evidence for "Birds are
swimmers" but irrelevant to "Swimmers are birds", while "non-bird
swimmers" is negative evidence for "Swimmers are birds" but irrelevant
to "Birds are swimmers". No matter which of the two nodes is more
common, you cannot have both cases right.

>> (2) If your answer for (1) is "yes", then think about "Adults are
>> alcohol-drinkers" and "Alcohol-drinkers are adults" --- do they have
>> the same set of counter examples, intuitively speaking?
>
> Again, our intuitions for this are colored by the knowledge that there
> are more adults than alcohol-drinkers.

As above, the two sets of counter examples are "non-alcohol-drinking
adult" and "non-adult alcohol-drinker", respectively. The fact that
these two statements have different negative evidence has nothing to
do with the size of the related sets (node probability).

> Consider high school, which has 4 years: freshman, sophomore,
> junior, senior.
>
> Then think about "Juniors & seniors are women" and "women
> are juniors & seniors"
>
> It seems quite intuitive to me that, in this case, the same pieces of
> evidence support the truth values of these two hypotheses.
>
> This is because the term probabilities of "juniors and seniors"
> and "women" are intuitively known to be about equal.

Instead of "supporting evidence", you should address "refuting
evidence" (because that is where the issue is). For "Juniors & seniors
are women", it is "juniors & seniors man", and for "women are juniors
& seniors", it is "freshman & sophomore women".

What I argued is: the counter evidence of statement "A is B" is not
counter evidence of the converse statement "B is A", and vice versa.
You cannot explain this in both directions by node probability.

>> (3) According to your previous explanation, will PLN also take a red
>> apple as negative evidence for "Birds are swimmers" and "Swimmers are
>> birds", because it reduces the "candidate pool" by one? Of course, the
>> probability adjustment may be very small, but qualitatively, isn't it
>> the same as a non-swimming bird? If not, then what the system will do
>> about it?
>
> Yes, in principle, PLN will behave in "Hempel's confirmation paradox" in
> a similar way to other Bayesian systems.
>
> I do find this counterintuitive, personally, and I spent a while trying to
> work
> around it ... but finally I decided that my intuition is the faulty thing.
> As you note,
> it's a very small probability adjustment in these cases, so it's not
> surprising
> if human intuition is not tuned to make such small probability adjustments
> in a correct or useful way...

Well, actually your previous explanation is exactly the opposite of
the standard Bayesian answer --- see
http://en.wikipedia.org/wiki/Raven_paradox

Now we have three different opinions on the relationship between
statement "Birds are swimmers" and the evidence provided by a red
apple:
(1) NARS: it is irrelevant (neither positive nor negative)
(2) PLN: it is negative evidence (though very small)
(3) Bayesian: it is positive evidence (though very small)

Everyone agrees that (2) and (3) are counterintuitive, but most people
trust probability theory more than their own intuition --- after all,
nobody is perfect ... :-(

To me, "small probability adjustments" is a bad excuse. No matter how
small the adjustment is, as long as it is not infinitely small, it
cannot always be ignored, since it will accumulate. If all non-bird
objects are taken as (either positive or negative) evidence for "Birds
are swimmers", then the huge number of them cannot be ignored.

It is always possible to save a theory (probability theory, in this
situation) if you are willing to pay the price. The problem is whether
the price is too high.

Pei



Re: [agi] NARS and probability

2008-10-10 Thread Pei Wang
On Fri, Oct 10, 2008 at 8:03 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>
> Yah, according to Bayes rule if one assumes P(bird) = P(swimmer) this would
> be the case...
>
> (Of course, this kind of example is cognitively misleading, because if the
> only knowledge
> the system has is "Swallows are birds" and "Swallows are NOT swimmers" then
> it doesn't
> really know that the terms involved are "swallows", "birds", "swimmers" etc.
> ... then in
> that case they're just almost-meaningless tokens to the system, right?)

Well, it depends on the semantics. According to model-theoretic
semantics, if a term has no reference, it has no meaning. According to
experience-grounded semantics, every term in experience has meaning
--- by the role it plays.

Further questions:

(1) Don't you intuitively feel that the evidence provided by
non-swimming birds says more about "Birds are swimmers" than
"Swimmers are birds"?

(2) If your answer for (1) is "yes", then think about "Adults are
alcohol-drinkers" and "Alcohol-drinkers are adults" --- do they have
the same set of counter examples, intuitively speaking?

(3) According to your previous explanation, will PLN also take a red
apple as negative evidence for "Birds are swimmers" and "Swimmers are
birds", because it reduces the "candidate pool" by one? Of course, the
probability adjustment may be very small, but qualitatively, isn't it
the same as a non-swimming bird? If not, then what the system will do
about it?

Pei


>
> On Fri, Oct 10, 2008 at 7:34 PM, Pei Wang <[EMAIL PROTECTED]> wrote:
>>
>> Ben,
>>
>> I see your position.
>>
>> Let's go back to the example. If the only relevant domain knowledge
>> PLN has is "Swallows are birds" and "Swallows are
>> NOT swimmers", will the system assigns the same lower-than-default
>> probability to "Birds are swimmers" and  "Swimmers are birds"? Again,
>> I only need a qualitative answer.
>>
>> Pei
>>
>> On Fri, Oct 10, 2008 at 7:24 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>> >
>> > Pei,
>> >
>> > I finally took a moment to actually read your email...
>> >
>> >>
>> >>
>> >> However, the negative evidence of one conclusion is no evidence of the
>> >> other conclusion. For example, "Swallows are birds" and "Swallows are
>> >> NOT swimmers" suggests "Birds are NOT swimmers", but says nothing
>> >> about whether "Swimmers are birds".
>> >>
>> >> Now I wonder if PLN shows a similar asymmetry in induction/abduction
>> >> on negative evidence. If it does, then how can that effect come out of
>> >> a symmetric truth-function? If it doesn't, how can you justify the
>> >> conclusion, which looks counter-intuitive?
>> >
>> > According to Bayes rule,
>> >
>> > P(bird | swimmer) P(swimmer) = P(swimmer | bird) P(bird)
>> >
>> > So, in PLN, evidence for P(bird | swimmer) will also count as evidence
>> > for P(swimmer | bird), though potentially with a different weighting
>> > attached to each piece of evidence
>> >
>> > If P(bird) = P(swimmer) is assumed, then each piece of evidence
>> > for each of the two conditional probabilities, will count for the other
>> > one symmetrically.
>> >
>> > The intuition here is the standard Bayesian one.
>> > Suppose you know there
>> > are 1 things in the universe, and 1000 swimmers.
>> > Then if you find out that swallows are not
>> > swimmers ... then, unless you think there are zero swallows,
>> > this does affect P(bird | swimmer).  For instance, suppose
>> > you think there are 10 swallows and 100 birds.  Then, if you know for
>> > sure
>> > that swallows are not swimmers, and you have no other
>> > info but the above, your estimate of P(bird|swimmer)
>> > should decrease... because of the 1000 swimmers, you now know there
>> > are only 990 that might be birds ... whereas before you thought
>> > there were 1000 that might be birds.
>> >
>> > And the same sort of reasoning holds for **any** probability
>> > distribution you place on the number of things in the universe,
>> > the number of swimmers, the number of birds, the number of swallows.
>> > It doesn't matter what assumption you make, whether you look at
>> > n'th order pdf's or whatever ... t

Re: [agi] NARS and probability

2008-10-10 Thread Pei Wang
Ben,

I see your position.

Let's go back to the example. If the only relevant domain knowledge
PLN has is "Swallows are birds" and "Swallows are
NOT swimmers", will the system assigns the same lower-than-default
probability to "Birds are swimmers" and  "Swimmers are birds"? Again,
I only need a qualitative answer.

Pei

On Fri, Oct 10, 2008 at 7:24 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>
> Pei,
>
> I finally took a moment to actually read your email...
>
>>
>>
>> However, the negative evidence of one conclusion is no evidence of the
>> other conclusion. For example, "Swallows are birds" and "Swallows are
>> NOT swimmers" suggests "Birds are NOT swimmers", but says nothing
>> about whether "Swimmers are birds".
>>
>> Now I wonder if PLN shows a similar asymmetry in induction/abduction
>> on negative evidence. If it does, then how can that effect come out of
>> a symmetric truth-function? If it doesn't, how can you justify the
>> conclusion, which looks counter-intuitive?
>
> According to Bayes rule,
>
> P(bird | swimmer) P(swimmer) = P(swimmer | bird) P(bird)
>
> So, in PLN, evidence for P(bird | swimmer) will also count as evidence
> for P(swimmer | bird), though potentially with a different weighting
> attached to each piece of evidence
>
> If P(bird) = P(swimmer) is assumed, then each piece of evidence
> for each of the two conditional probabilities, will count for the other
> one symmetrically.
>
> The intuition here is the standard Bayesian one.
> Suppose you know there
> are 1 things in the universe, and 1000 swimmers.
> Then if you find out that swallows are not
> swimmers ... then, unless you think there are zero swallows,
> this does affect P(bird | swimmer).  For instance, suppose
> you think there are 10 swallows and 100 birds.  Then, if you know for sure
> that swallows are not swimmers, and you have no other
> info but the above, your estimate of P(bird|swimmer)
> should decrease... because of the 1000 swimmers, you now know there
> are only 990 that might be birds ... whereas before you thought
> there were 1000 that might be birds.
>
> And the same sort of reasoning holds for **any** probability
> distribution you place on the number of things in the universe,
> the number of swimmers, the number of birds, the number of swallows.
> It doesn't matter what assumption you make, whether you look at
> n'th order pdf's or whatever ... the same reasoning works...
>
> From what I understand, your philosophical view is that it's somehow
> wrong for a mind to make some assumption about the pdf underlying
> the world around it?  Is that correct?  If so I don't agree with this... I
> think this kind of assumption is just part of the "inductive bias" with
> which
> a mind approaches the world.
>
> The human mind may well have particular pdf's for stuff like birds and
> trees wired into it, as we evolved to deal with these things.  But that's
> not really the point.  The inductive bias may be much more abstract --
> ultimately, it can just be an "occam bias" that biases the mind to
> prior distributions (over the space of procedures for generating
> prior distributions for handling specific cases)
> that are simplest according to some wired-in
> simplicity measure
>
> So again we get back to basic differences in philosophy...
>
> -- Ben G
>
>
>
>
>
>
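The Bayes-rule relation quoted above is easy to check numerically (a
minimal sketch in Python; the term probabilities are invented): lowering
P(bird|swimmer) lowers P(swimmer|bird) by the fixed ratio
P(swimmer)/P(bird), and by exactly the same amount when the two term
probabilities are equal.

    def flip(p_b_given_s, p_s, p_b):
        # P(B|S) P(S) = P(S|B) P(B)  =>  P(S|B) = P(B|S) P(S) / P(B)
        return p_b_given_s * p_s / p_b

    p_bird, p_swimmer = 0.02, 0.10
    print(flip(0.10, p_swimmer, p_bird))  # P(swimmer|bird) = 0.5 when P(bird|swimmer) = 0.10
    print(flip(0.08, p_swimmer, p_bird))  # drops to 0.4 when P(bird|swimmer) drops to 0.08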




Re: [agi] NARS and probability

2008-10-10 Thread Pei Wang
Ben,

Maybe your memory is correct --- we use "strength" in Webmind to keep
some distance from NARS.

Anyway, I don't like that term because it can be easily interpreted in
several ways, while the reason I don't like "probability" is just the
opposite --- it has a widely accepted interpretation, which is hard to
bend to mean what I want the term to mean.

Pei

On Fri, Oct 10, 2008 at 6:58 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>
>
> On Fri, Oct 10, 2008 at 6:01 PM, Pei Wang <[EMAIL PROTECTED]> wrote:
>>
>> On Fri, Oct 10, 2008 at 5:52 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>> >
>> > I meant frequency, sorry
>> >
>> > "Strength" is a term Pei used for frequency in some old sicsussions...
>>
>> Another correction: "strength" is never used in any NARS publication.
>> It was used in some Webmind documents, though I guess it must be your
>> idea, since I never liked this term. ;-)
>
> As I recall, the use of the term (in discussions rather than publications)
> was your idea, *but* the context in which it was
> suggested was as follows.  We wanted a term for a variable in the Webmind
> Java code that would be applicable to both NARS and PLN truth values, and
> would be
> burdened as little as possible with specific theoretical interpretation.  So
> you suggested strength.
>
> I'm not sure why we didn't just use "frequency" instead.  I remember you did
> not want to call it "probability."
>
> (This was, unbelievably, 10 years ago, so I don't want to bet my right arm
> on the details of my recollection ... but that's how I remember it...)
>
>>
>> > The exact formulas used in NARS are basically heuristics derived based
>> > on "endpoint conditions", so replicating those exact formulas is really
>> > not important IMO... the key would be replicating their qualitative
>> > behavior...
>>
>> I have to say that I don't like the term "heuristics" either, since
>> it usually refers to a "quick-and-dirty" replacement of the "real
>> thing".
>
> I didn't mean anything negative via the word "heuristic" ... and you didn't
> suggest
> an alternative word ;-)
>
>
> ben
>
> 
> agi | Archives | Modify Your Subscription




Re: [agi] NARS and probability

2008-10-10 Thread Pei Wang
On Fri, Oct 10, 2008 at 5:52 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>
> I meant frequency, sorry
>
> "Strength" is a term Pei used for frequency in some old sicsussions...

Another correction: "strength" is never used in any NARS publication.
It was used in some Webmind documents, though I guess it must be your
idea, since I never liked this term. ;-)

> The exact formulas used in NARS are basically heuristics derived based
> on "endpoint conditions", so replicating those exact formulas is really
> not important IMO... the key would be replicating their qualitative
> behavior...

I have to say that I don't like the term "heuristics" either, since
it usually refers to a "quick-and-dirty" replacement of the "real
thing".

I fully agree with you that what really matters is the qualitative
behavior, rather than the exact formula.

Pei




Re: [agi] NARS and probability

2008-10-10 Thread Pei Wang
Abram,

Ben's "strength" is my "frequency".

Pei

On Fri, Oct 10, 2008 at 5:49 PM, Abram Demski <[EMAIL PROTECTED]> wrote:
> Pei,
>
> You agree that the abduction and induction "strength" formulas only
> rely on one of the two premises?
>
> Is there some variable called strength that I missed?
>
> --Abram
>
> On Fri, Oct 10, 2008 at 5:38 PM, Pei Wang <[EMAIL PROTECTED]> wrote:
>> Ben,
>>
>> I agree with what you said in the previous email.
>>
>> However, since we already touched this point in the second time, there
>> may be people wondering what the difference between NARS and PLN
>> really is.
>>
>> Again let me use an example to explain why the truth-value function of
>> abduction/induction should be asymmetric, at least to me. Since
>> induction is more intuitive, I'll use it.
>>
>> The general induction rule in NARS has the following form
>>
>> M-->P <t_a>
>> M-->S <t_b>
>> ---------------
>> S-->P <F(t_a, t_b)>
>> P-->S <F(t_b, t_a)>
>>
>> where each truth value has a "frequency" factor (for
>> positive/negative), and a "confidence" factor (for sure/unsure).
>>
>> A truth-value function is symmetric with respect to the premises, if
>> and only if F(t_a, t_b) = F(t_b, t_a), for all t_a and t_b. Last time you
>> mentioned the following abduction function of PLN:
>>   s3  = s1 s2 + w (1-s1)(1-s2)
>> which is symmetric in this sense.
>>
>> Now, instead of discussing the details of the NARS function, I only
>> explain why it is not symmetric, that is, when t_a and t_b are
>> different.
>>
>> First, positive evidence leads to symmetric conclusions, that is, if M
>> support S-->P, it will also support P-->S. For example, "Swans are
>> birds" and "Swans are swimmers" support both "Birds are swimmers" and
>> "Swimmers are birds", to the same extent.
>>
>> However, the negative evidence of one conclusion is no evidence of the
>> other conclusion. For example, "Swallows are birds" and "Swallows are
>> NOT swimmers" suggests "Birds are NOT swimmers", but says nothing
>> about whether "Swimmers are birds".
>>
>> Now I wonder if PLN shows a similar asymmetry in induction/abduction
>> on negative evidence. If it does, then how can that effect come out of
>> a symmetric truth-function? If it doesn't, how can you justify the
>> conclusion, which looks counter-intuitive?
>>
>> Pei
>>
>>
>>
>> On Fri, Oct 10, 2008 at 4:57 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>>>
>>> Sorry Pei, you are right, I sloppily  mis-stated!
>>>
>>> What I should have said was:
>>>
>>> "
>>> the result that the NARS induction and abduction *strength* formulas
>>> each depend on **only one** of their premise truth values ...
>>> "
>>>
>>> Anyway, my point in that particular post was not to say that NARS is either
>>> good or bad in this aspect ... but just to note that this IMO is a
>>> conceptually
>>> important point that should somehow "fall right out" of a probabilistic
>>> (or nonprobabilistic) derivation of NARS, rather than being achieved via
>>> carefully fitting complex formulas to produce it...
>>>
>>> ben g
>>>
>>> On Fri, Oct 10, 2008 at 4:48 PM, Pei Wang <[EMAIL PROTECTED]> wrote:
>>>>
>>>> On Fri, Oct 10, 2008 at 4:24 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>>>> >
>>>> > In particular, the result that NARS induction and abduction each
>>>> > depend on **only one** of their premise truth values ...
>>>>
>>>> Ben,
>>>>
>>>> I'm sure you know it in your mind, but this simple description will
>>>> make some people think that NARS is obviously wrong.
>>>>
>>>> In NARS, in induction and abduction the truth value of the conclusion
>>>> depends on the truth values of both premises, but in an asymmetric
>>>> way. It is the "frequency" factor of the conclusion that only depends
>>>> on the frequency of one premise, but not the other.
>>>>
>>>> Unlike deduction, the truth-value functions of induction and abduction
>>>> are fundamentally asymmetric (on negative evidence) with respect to
>>>> the two premises. Actually, it is the PLN functions that look wrong
>>>> to me in this respect. ;-)
>>>>

Re: [agi] NARS and probability

2008-10-10 Thread Pei Wang
Ben,

I agree with what you said in the previous email.

However, since we already touched this point in the second time, there
may be people wondering what the difference between NARS and PLN
really is.

Again let me use an example to explain why the truth-value function of
abduction/induction should be asymmetric, at least to me. Since
induction is more intuitive, I'll use it.

The general induction rule in NARS has the following form

M-->P <t_a>
M-->S <t_b>
---------------
S-->P <F(t_a, t_b)>
P-->S <F(t_b, t_a)>

where each truth value has a "frequency" factor (for
positive/negative), and a "confidence" factor (for sure/unsure).

A truth-value function is symmetric with respect to the premises, if
and only if F(t_a, t_b) = F(t_b, t_a), for all t_a and t_b. Last time you
mentioned the following abduction function of PLN:
   s3  = s1 s2 + w (1-s1)(1-s2)
which is symmetric in this sense.
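
A quick Python check of that symmetry, with w left as a free parameter
as in the formula above:

def pln_abduction_strength(s1, s2, w):
    return s1 * s2 + w * (1.0 - s1) * (1.0 - s2)

# Swapping the premise strengths changes nothing, for any w:
print(pln_abduction_strength(0.9, 0.2, 1.0))   # 0.26
print(pln_abduction_strength(0.2, 0.9, 1.0))   # 0.26 -- the same value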

Now, instead of discussing the details of the NARS function, I only
explain why it is not symmetric, that is, when t_a and t_b are
different.

First, positive evidence leads to symmetric conclusions, that is, if M
support S-->P, it will also support P-->S. For example, "Swans are
birds" and "Swans are swimmers" support both "Birds are swimmers" and
"Swimmers are birds", to the same extent.

However, the negative evidence of one conclusion is no evidence of the
other conclusion. For example, "Swallows are birds" and "Swallows are
NOT swimmers" suggests "Birds are NOT swimmers", but says nothing
about whether "Swimmers are birds".

Now I wonder if PLN shows a similar asymmetry in induction/abduction
on negative evidence. If it does, then how can that effect come out of
a symmetric truth-function? If it doesn't, how can you justify the
conclusion, which looks counter-intuitive?

Pei



On Fri, Oct 10, 2008 at 4:57 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>
> Sorry Pei, you are right, I sloppily  mis-stated!
>
> What I should have said was:
>
> "
> the result that the NARS induction and abduction *strength* formulas
> each depend on **only one** of their premise truth values ...
> "
>
> Anyway, my point in that particular post was not to say that NARS is either
> good or bad in this aspect ... but just to note that this IMO is a
> conceptually
> important point that should somehow "fall right out" of a probabilistic
> (or nonprobabilistic) derivation of NARS, rather than being achieved via
> carefully fitting complex formulas to produce it...
>
> ben g
>
> On Fri, Oct 10, 2008 at 4:48 PM, Pei Wang <[EMAIL PROTECTED]> wrote:
>>
>> On Fri, Oct 10, 2008 at 4:24 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>> >
>> > In particular, the result that NARS induction and abduction each
>> > depend on **only one** of their premise truth values ...
>>
>> Ben,
>>
>> I'm sure you know it in your mind, but this simple description will
>> make some people think that NARS is obviously wrong.
>>
>> In NARS, in induction and abduction the truth value of the conclusion
>> depends on the truth values of both premises, but in an asymmetric
>> way. It is the "frequency" factor of the conclusion that only depends
>> on the frequency of one premise, but not the other.
>>
>> Unlike deduction, the truth-value functions of induction and abduction
>> are fundamentally asymmetric (on negative evidence) with respect to
>> the two premises. Actually, it is the PLN functions that look wrong
>> to me in this respect. ;-)
>>
>> Pei
>>
>>
>> ---
>> agi
>> Archives: https://www.listbox.com/member/archive/303/=now
>> RSS Feed: https://www.listbox.com/member/archive/rss/303/
>> Modify Your Subscription: https://www.listbox.com/member/?&;
>> Powered by Listbox: http://www.listbox.com
>
>
>
> --
> Ben Goertzel, PhD
> CEO, Novamente LLC and Biomind LLC
> Director of Research, SIAI
> [EMAIL PROTECTED]
>
> "Nothing will ever be attempted if all possible objections must be first
> overcome "  - Dr Samuel Johnson
>
>
> 
> agi | Archives | Modify Your Subscription




Re: [agi] NARS and probability

2008-10-10 Thread Pei Wang
On Fri, Oct 10, 2008 at 4:24 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>
> In particular, the result that NARS induction and abduction each
> depend on **only one** of their premise truth values ...

Ben,

I'm sure you know it in your mind, but this simple description will
make some people think that NARS is obviously wrong.

In NARS, in induction and abduction the truth value of the conclusion
depends on the truth values of both premises, but in an asymmetric
way. It is the "frequency" factor of the conclusion that only depends
on the frequency of one premise, but not the other.

Unlike deduction, the truth-value functions of induction and abduction
are fundamentally asymmetric (on negative evidence) with respect to
the two premises. Actually, it is the PLN functions that look wrong
to me in this respect. ;-)

Pei




Re: [agi] NARS and probability

2008-10-10 Thread Pei Wang
On Wed, Oct 8, 2008 at 5:15 PM, Abram Demski <[EMAIL PROTECTED]> wrote:

> Given those three assumptions, plus the NARS formula for revision,
> there is (I think) only one possible formula relating the NARS
> variables 'f' and 'w' to the value of 'par': the probability density
> function p(par | w, f) = par^(w*f) * (1-par)^(w*(1-f)). Note: NARS
> truth values are more often (I think?) represented by the pair 'f'
> 'c', where 'c' is computed from 'w' by the formula c=w/(w+k), where k
> is a fixed constant. This is of little consequence at this point, and
> it was more intuitive to use 'f' and 'w' (at least for me).

At this stage, you are right. Since c and w fully determine each
other, in principle you can use either, and w is more intuitive.
However, in designing the truth-value functions, it is more convenient
to use c, a real number in [0, 1], than w, which has no upper bound.
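
For concreteness, a small Python sketch of the two relations quoted
above --- the NARS conversion c = w/(w+k) and the density Abram
proposes; k = 1 here is only an illustrative choice:

def confidence(w, k=1.0):
    # NARS: confidence from evidence weight w and the constant k
    return w / (w + k)

def weight(c, k=1.0):
    # the inverse mapping: evidence weight from confidence
    return k * c / (1.0 - c)

def density(par, w, f):
    # Abram's (unnormalized) p(par | w, f) = par^(w*f) * (1-par)^(w*(1-f))
    return par ** (w * f) * (1.0 - par) ** (w * (1.0 - f))

print(confidence(9.0))             # 0.9 for w = 9, k = 1
print(weight(0.9))                 # 9.0 (up to float rounding)
print(density(0.8, w=9, f=1.0))    # relative likelihood of par = 0.8 after
                                   # nine pieces of purely positive evidence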

> Here's the math. In NARS, the operation we're interested in is taking
> two pools of evidence, one concerning A=>X and the other concerning
> B=>X, and combining them to calculate the evidence they lend to A=>B.

Now things get tricky. In my derivation, in abduction/deduction the
evidence of a premise is not directly used as evidence for the
conclusion. Instead, it is the premise itself, as a summary of its own
evidence, that is used as evidence. That is, X is not a set, but an
individual. Consequently, the operation is not "taking two pools of
evidence" and somehow combining them into one pool (as in the revision
rule).

> So probabilistically, we want to determine the probability of the
> evidence for A=>X and B=>X given each possible 'par' value of A=>B.

According to the semantics of NARS, A=>X or B=>X, by itself, doesn't
provide evidence for A=>B.

Overall, it is a nice try, but given the difference in semantics
between probability theory and NARS, I'm still doubtful about how far
you can go in this direction.

Pei




Re: [agi] NARS probability

2008-09-28 Thread Pei Wang
I got it from an internal source.

Pei

On Sun, Sep 28, 2008 at 8:24 PM, Brad Paulsen <[EMAIL PROTECTED]> wrote:
> Pei,
>
> Would you mind sharing the link (that is, if you found it on the Internet)?
>
> Thanks,
> Brad
>
> Pei Wang wrote:
>>
>> I found the paper.
>>
>> As I guessed, their update operator is defined on the whole
>> probability distribution function, rather than on a single probability
>> value of an event. I don't think it is practical for AGI --- we cannot
>> afford the time to re-evaluate every belief on each piece of new
>> evidence. Also, I haven't seen a convincing argument on why an
>> intelligent system should follow the ME Principle.
>>
>> Also this paper doesn't directly solve my example, because it doesn't
>> use second-order probability.
>>
>> Pei
>>
>> On Sat, Sep 20, 2008 at 10:13 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>>>
>>> The approach in that paper doesn't require any special assumptions, and
>>> could be applied to your example, but I don't have time to write up an
>>> explanation of how to do the calculations ... you'll have to read the
>>> paper
>>> yourself if you're curious ;-)
>>>
>>> That approach is not implemented in PLN right now but we have debated
>>> integrating it with PLN as in some ways it's subtler than what we
>>> currently
>>> do in the code...
>>>
>>> ben
>>>
>>> On Sat, Sep 20, 2008 at 10:02 PM, Pei Wang <[EMAIL PROTECTED]>
>>> wrote:
>>>>
>>>> I didn't know this paper, but I do know approaches based on the
>>>> principle of maximum/optimum entropy. They usually requires much more
>>>> information (or assumptions) than what is given in the following
>>>> example.
>>>>
>>>> I'd be interested to know what the solution they will suggest for such
>>>> a situation.
>>>>
>>>> Pei
>>>>
>>>> On Sat, Sep 20, 2008 at 9:53 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>>>>>>
>>>>>> Think about a concrete example: if from one source the system gets
>>>>>> P(A-->B) = 0.9, and P(P(A-->B) = 0.9) = 0.5, while from another source
>>>>>> P(A-->B) = 0.2, and P(P(A-->B) = 0.2) = 0.7, then what will be the
>>>>>> conclusion when the two sources are considered together?
>>>>>
>>>>> There are many approaches to this within the probabilistic framework,
>>>>> one of which is contained within this paper, for example...
>>>>>
>>>>> http://cat.inist.fr/?aModele=afficheN&cpsidt=16174172
>>>>>
>>>>> (I have a copy of the paper but I'm not sure where it's available for
>>>>> free online ... if anyone finds it please post the link... thx)
>>>>>
>>>>> Ben
>>>>> 
>>>>> agi | Archives | Modify Your Subscription
>>>>
>>>> ---
>>>> agi
>>>> Archives: https://www.listbox.com/member/archive/303/=now
>>>> RSS Feed: https://www.listbox.com/member/archive/rss/303/
>>>> Modify Your Subscription: https://www.listbox.com/member/?&;
>>>> Powered by Listbox: http://www.listbox.com
>>>
>>>
>>> --
>>> Ben Goertzel, PhD
>>> CEO, Novamente LLC and Biomind LLC
>>> Director of Research, SIAI
>>> [EMAIL PROTECTED]
>>>
>>> "Nothing will ever be attempted if all possible objections must be first
>>> overcome " - Dr Samuel Johnson
>>>
>>>
>>> 
>>> agi | Archives | Modify Your Subscription
>>
>>
>> ---
>> agi
>> Archives: https://www.listbox.com/member/archive/303/=now
>> RSS Feed: https://www.listbox.com/member/archive/rss/303/
>> Modify Your Subscription: https://www.listbox.com/member/?&;
>> Powered by Listbox: http://www.listbox.com
>>
>
>
> ---
> agi
> Archives: https://www.listbox.com/member/archive/303/=now
> RSS Feed: https://www.listbox.com/member/archive/rss/303/
> Modify Your Subscription:
> https://www.listbox.com/member/?&;
> Powered by Listbox: http://www.listbox.com
>




Re: [agi] NARS vs. PLN [Was: NARS probability]

2008-09-24 Thread Pei Wang
Abram,

Some comments are added into your writing after my first reading. It
seems I need to read it again.

Pei

On Tue, Sep 23, 2008 at 7:26 PM, Abram Demski <[EMAIL PROTECTED]> wrote:
> Wow! I did not mean to stir up such an argument between you two!!
>
> Pei,
>
> What if instead of using "node probability", the knowledge that "wrote
> an AGI book" is rare was inserted as a low frequency (high confidence)
> truth value on "human" => "wrote an AGI book"? Could NARS use that to
> do what Ben wants? More specifically, could it do so with only the
> knowledge:
>
> Ben is agi-author 
> guy is agi-author 
> Ben is human 
> guy is human 
> human is agi-author 
>
> If this was literally all NARS knew, what difference would
> adding/removing the last item make to the system's opinion of "guy is
> Ben"?
>
> To answer your earlier question, I am still ignoring confidence. It
> could always be calculated from the frequencies, of course. But, that
> does not justify using them in the calculations the way you do.
> Perhaps once I figure out the exact formulas for everything, I will
> see if they match up to a particular value of the parameter k. Or,
> perhaps, a value of k that moves according to the specific situation.
> Hmm... actually... that could be used as a fudge factor to get
> everything to "match up"... :)
>
> Also, attached is my latest revision. I have found that NARS deduction
> does not quite fit with my definitions. Induction and abduction are OK
> so far. If in the end I merely have something "close" to NARS, I will
> consider this a success-- it is an interpretation that fits well
> enough to show where NARS essentially differs from probability theory.
>
> On Tue, Sep 23, 2008 at 5:54 PM, Pei Wang <[EMAIL PROTECTED]> wrote:
>> Yes, I know them, though I don't like any of them that I've seen. I
>> wonder whether Abram can find something better.
>>
>> To tell you the truth, my whole idea of confidence actually came from
>> a probabilistic formula, after my re-interpretation of it.
>>
>> Pei
>>
>> On Tue, Sep 23, 2008 at 4:35 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>>>
>>> Note that formally, the
>>>
>>> c = n/(n+k)
>>>
>>> equation also exists in the math of the beta distribution, which is used
>>> in Walley's imprecise probability theory and also in PLN's indefinite
>>> probabilities...
>>>
>>> So there seems some hope of making such a correspondence, based on
>>> algebraic evidence...
>>>
>>> ben
>>>
>>> On Tue, Sep 23, 2008 at 4:29 PM, Pei Wang <[EMAIL PROTECTED]> wrote:
>>>>
>>>> Abram,
>>>>
>>>> Can your approach give the Confidence measurement a probabilistic
>>>> interpretation? It is what really differs NARS from the other
>>>> approaches.
>>>>
>>>> Pei
>>>>
>>>> On Mon, Sep 22, 2008 at 11:22 PM, Abram Demski <[EMAIL PROTECTED]>
>>>> wrote:
>>>> >>> This example also shows why NARS and PLN are similar on deduction, but
>>>> >>> very different in abduction and induction.
>>>> >>
>>>> >> Yes.  One of my biggest practical complaints with NARS is that the
>>>> >> induction
>>>> >> and abduction truth value formulas don't make that much sense to me.
>>>> >
>>>> > Interesting in the context of these statements that my current
>>>> > "justification" for NARS probabilistically justifies induction and
>>>> > abduction but isn't as clear concerning deduction. (I'm working on
>>>> > it...)
>>>> >
>>>> > --Abram Demski
>>>> >
>>>> >
>>>> > ---
>>>> > agi
>>>> > Archives: https://www.listbox.com/member/archive/303/=now
>>>> > RSS Feed: https://www.listbox.com/member/archive/rss/303/
>>>> > Modify Your Subscription: https://www.listbox.com/member/?&;
>>>> > Powered by Listbox: http://www.listbox.com
>>>> >
>>>>
>>>>
>>>> ---
>>>> agi
>>>> Archives: https://www.listbox.com/member/archive/303/=now
>>>> RSS Feed: https://www.listbox.com/member/archive/rss/303/
>>>> Modify Your Subscription: https://www.listbox.com/member/?

Re: [agi] NARS vs. PLN [Was: NARS probability]

2008-09-24 Thread Pei Wang
In NARS, if we have

Ben --> AGI-author <f1;c1>
Dude --> AGI-author <f2;c2>
|-
Dude --> Ben <f3;c3>

then f1 only contributes to c3, not to f3, because it decides whether
"AGI-author" is in the intension of Ben, and therefore whether the
premises really have anything to say about the conclusion. However, it
says nothing about how much of the evidence is positive/negative. On
the other hand, f2 has nothing to do with c3 (whether "AGI-author" is
indeed evidence for the conclusion), but if there is evidence, f2
decides whether (and how much) it is positive evidence --- given that Ben
is an AGI-author, if Dude is an AGI-author, too, it is positive evidence
for "Dude --> Ben" (which means Dude inherits Ben's properties, rather
than that the two are identical); if Dude is not an AGI-author, then it is
negative evidence. Unlike in deduction, in abduction and induction the
truth-value function is not symmetric with respect to the two premises,
and this is especially so when the conclusion is negative.

This result directly follows from the definition of evidence in NARS.
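
For concreteness, here is a Python sketch of that asymmetry. The
formulas follow the published NAL abduction truth-value function as far
as I can reconstruct it (w+ = f1*c1*f2*c2, w = f1*c1*c2, k = 1), so
treat them as an approximation of the actual implementation:

def nars_abduction(f1, c1, f2, c2, k=1.0):
    # premises:   Ben  --> AGI-author <f1;c1>
    #             Dude --> AGI-author <f2;c2>
    # conclusion: Dude --> Ben <f3;c3>
    w_plus = f1 * c1 * f2 * c2      # positive evidence
    w = f1 * c1 * c2                # total evidence -- note: no f2 here
    f3 = w_plus / w if w > 0 else 0.5
    c3 = w / (w + k)
    return f3, c3

print(nars_abduction(1.0, 0.9, 1.0, 0.9))   # (1.0, ~0.45), as in the worked
                                            # example elsewhere in this thread
print(nars_abduction(1.0, 0.9, 0.0, 0.9))   # (0.0, ~0.45) -- negative evidence
print(nars_abduction(0.0, 0.9, 1.0, 0.9))   # w = 0: no evidence at all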

I'd rather not spend a lot of time debating this topic again.
After all, "doesn't feel right to Ben" is a much less serious crime
than "doesn't feel right to any intelligent human", and I have little
hope of changing Ben's intuition. ;-)

Pei


On Wed, Sep 24, 2008 at 12:33 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>
>>
>> If we have
>>
>> Ben ==> AGI-author <s1>
>> Dude ==> AGI-author <s2>
>> |-
>> Dude ==> Ben <s3>
>>
>> the PLN abduction rule would yield
>>
>> s3  = s1 s2 + w (1-s1)(1-s2)
>
>
>  But ... before we move on to psychic powers, let me note that this PLN
> abduction strength rule (simplified for the case of equal node
> probabilities) does depend on both s1 and s2
>
> The NARS abduction strength rule  depends on only one of the premise
> strengths, which intuitively and pretheoretically does not feel right to
> me...
>
> ben g
>
> 
> agi | Archives | Modify Your Subscription




Re: [agi] NARS vs. PLN [Was: NARS probability]

2008-09-24 Thread Pei Wang
Thanks for the detailed answer. Now I'm happy, and we can turn to
something else. ;-)

Pei

On Wed, Sep 24, 2008 at 12:09 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>
>>
>> >> I guess my previous question was not clear enough: if the only domain
>> >> knowledge PLN has is
>> >>
>> >> > Ben is an author of a book on AGI 
>> >> > This dude is an author of a book on AGI 
>> >>
>> >> and
>> >>
>> >> > Ben is odd 
>> >> > This dude is odd 
>> >>
>> >> Will the system derive anything?
>> >
>> > Yes, via making default assumptions about node probability...
>>
>> Then what are the conclusions, with their truth-values, in each of the
>> two cases?
>
>
> Without node probability tv's, PLN actually behaves pretty similarly
> to NARS in this case...
>
> If we have
>
> Ben ==> AGI-author <s1>
> Dude ==> AGI-author <s2>
> |-
> Dude ==> Ben <s3>
>
> the PLN abduction rule would yield
>
> s3  = s1 s2 + w (1-s1)(1-s2)
>
> where w is a parameter of the form
>
> w = p/ (1-p)
>
> and if we set w=1 which is a principle of indifference type
> assumption then we just have
>
> s3 = 1 - s1 - s2 + 2s1s2
>
> In any case, regardless of w, s1=s2=1 implies s3=1
> in this formula, which is the same answer NARS gives
> in this case (of crisp premises)
>
> Similar to NARS, PLN also gives a fairly low confidence
> to this case, but the confidence formula is a pain and I
> won't write it out here...  (i.e., PLN assigns this a beta
> distribution with 1 in its support, but a pretty high variance...)
>
> So, similar to NARS, without node probability info PLN cannot
> distinguish the two inference examples I gave .. no system could...
>
> However, PLN incorporates the node probabilities when available,
> immediately and easily, without requiring knowledge of math on
> the part of the system... and it incorporates them according to Bayes
> rule which I believe the right approach ...
>
> What is counterintuitive to me is having an inference engine that
> does not immediately and automatically use the node probability info
> when it is available...
>
> As evidence about Bayesian neural population coding in the brain suggests,
> use of Bayes rule is probably MORE cognitively primary than use of
> these other more complex inference rules...
>
> -- ben g
>
>
> p.s.
> details:
>
> In PLN,
> simple abduction consists of the inference problem:
> Given P(A), P(B), P(C), P(B|A) and P(B|C), find P(C|A).
>
> and the simplest, independence-assumption + Bayes rule based formula
> for this is
>
> abdAC:=(sA,sB,sC,sAB,sCB)->(sAB*sCB*sC/sB+(1-sAB)*(1-sCB)*sC/(1-sB))
>
> [or, more fully including all consistency conditions,
>
> abdAC:=
> (sA,sB,sC,sAB,sCB)->(sAB*sCB*sC/sB+(1-sAB)*(1-sBC)*sC/(1-sB))*(Heaviside(sAB-max(((sA+sB-1)/sA),0))-Heaviside(sAB-min(1,(sB/sA*(Heaviside(sCB-max(((sB+sC-1)/sC),0))-Heaviside(sCB-min(1,(sB/sC;
>
> ]
>
> (This is Maple notation...)
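
The same calculation in Python, for readability; the Heaviside
consistency factors of the fuller Maple expression are omitted here:

def pln_simple_abduction(sA, sB, sC, sAB, sCB):
    # Given P(A), P(B), P(C), P(B|A) and P(B|C), estimate P(C|A) under
    # an independence assumption plus Bayes rule.
    return sAB * sCB * sC / sB + (1.0 - sAB) * (1.0 - sCB) * sC / (1.0 - sB)

def pln_abduction_equal_node_probs(s1, s2, p):
    # The simplified strength formula quoted earlier, with w = p/(1-p);
    # p = 0.5 corresponds to the indifference assumption (w = 1).
    w = p / (1.0 - p)
    return s1 * s2 + w * (1.0 - s1) * (1.0 - s2)

# With equal node probabilities the two forms agree:
print(pln_simple_abduction(0.5, 0.5, 0.5, 0.9, 0.9))   # 0.82
print(pln_abduction_equal_node_probs(0.9, 0.9, 0.5))   # 0.82
print(pln_abduction_equal_node_probs(1.0, 1.0, 0.5))   # 1.0, as noted above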
>
> 
> agi | Archives | Modify Your Subscription


---
agi
Archives: https://www.listbox.com/member/archive/303/=now
RSS Feed: https://www.listbox.com/member/archive/rss/303/
Modify Your Subscription: 
https://www.listbox.com/member/?member_id=8660244&id_secret=114414975-3c8e69
Powered by Listbox: http://www.listbox.com


Re: [agi] NARS vs. PLN [Was: NARS probability]

2008-09-24 Thread Pei Wang
I see your point and agree.

Now, how about the other question? If PLN gets two different
truth-values, where does the difference come from? If it gets the same
truth-value (as NARS), why don't you think it is counter-intuitive?

Pei

On Wed, Sep 24, 2008 at 11:44 AM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>
>
> On Wed, Sep 24, 2008 at 11:43 AM, Pei Wang <[EMAIL PROTECTED]> wrote:
>>
>> The distinction between object-level and meta-level knowledge is very
>> clear in NARS, though I won't push this issue any further.
>
> yes, but some of the things you push into the meta-level knowledge in NARS,
> seem more like the things we consider object-level knowledge in PLN
>
> ben g
>
> 
> agi | Archives | Modify Your Subscription




Re: [agi] NARS vs. PLN [Was: NARS probability]

2008-09-24 Thread Pei Wang
The distinction between object-level and meta-level knowledge is very
clear in NARS, though I won't push this issue any further.

However, I do want to know your answer to my other question, which I
repeat in the following:

>> if the only domain knowledge PLN has is
>>
>> > Ben is an author of a book on AGI 
>> > This dude is an author of a book on AGI 
>>
>> and
>>
>> > Ben is odd 
>> > This dude is odd 
>>
>> Will the system derive anything?
>
> Yes, via making default assumptions about node probability...

Then what are the conclusions, with their truth-values, in each of the
two cases?

Pei


On Wed, Sep 24, 2008 at 11:32 AM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>
>>
>> >
>> > I mean assumptions like "symmetric treatment of intension and
>> > extension",
>> > which are technical mathematical assumptions...
>>
>> But they are still not assumptions about domain knowledge, like node
>> probability.
>
>
> Well, in PLN the balance between intensional and extensional knowledge is
> calculated based on domain knowledge, in each case...
>
> So, from a PLN view, this symmetry assumption of NARS's **is** effectively
> an assumption about domain knowledge
>
> What constitutes domain knowledge, versus an a priori assumption, is
> not very clear really... the distinction seems to be theory-laden and
> dependent on the semantics of the inference system in question...
>
> ben g
>
> 
> agi | Archives | Modify Your Subscription




Re: [agi] NARS vs. PLN [Was: NARS probability]

2008-09-24 Thread Pei Wang
On Tue, Sep 23, 2008 at 9:59 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>
>> > PLN needs to make assumptions about node probability in this case; but
>> > NARS
>> > also makes assumptions, it's just that NARS's assumptions are more
>> > deeply
>> > hidden in the formalism...
>>
>> If you mean assumptions like "insufficient knowledge and resources",
>> you are right, but that is not at the same level as assumptions about
>> the values of node probability.
>
> I mean assumptions like "symmetric treatment of intension and extension",
> which are technical mathematical assumptions...

But they are still not assumptions about domain knowledge, like node
probability.

>> I guess my previous question was not clear enough: if the only domain
>> knowledge PLN has is
>>
>> > Ben is an author of a book on AGI 
>> > This dude is an author of a book on AGI 
>>
>> and
>>
>> > Ben is odd 
>> > This dude is odd 
>>
>> Will the system derive anything?
>
> Yes, via making default assumptions about node probability...

Then what are the conclusions, with their truth-values, in each of the
two cases?

Pei




Re: [agi] NARS vs. PLN [Was: NARS probability]

2008-09-23 Thread Pei Wang
On Tue, Sep 23, 2008 at 7:26 PM, Abram Demski <[EMAIL PROTECTED]> wrote:
> Wow! I did not mean to stir up such an argument between you two!!

Abram: This argument has been going on for about 10 years, with some
"on" periods and "off" periods, so don't feel responsible for it ---
you just raised the right topic in the right time to turn it "on"
again. ;-)

> Pei,
>
> What if instead of using "node probability", the knowledge that "wrote
> an AGI book" is rare was inserted as a low frequency (high confidence)
> truth value on "human" => "wrote an AGI book"? Could NARS use that to
> do what Ben wants? More specifically, could it do so with only the
> knowledge:
>
> Ben is agi-author 
> guy is agi-author 
> Ben is human 
> guy is human 
> human is agi-author 
>
> If this was literally all NARS knew, what difference would
> adding/removing the last item make to the system's opinion of "guy is
> Ben"?

Not much, since "Ben" and "guy" play symmetric roles in the knowledge.

> To answer your earlier question, I am still ignoring confidence. It
> could always be calculated from the frequencies, of course.

Not really. The frequency and confidence of a judgment are independent
(in my sense) of each other. If you mean "from the frequencies of the
premises", then it is true, but still you need to provide confidence
for the premises, too.

> But, that
> does not justify using them in the calculations the way you do.
> Perhaps once I figure out the exact formulas for everything, I will
> see if they match up to a particular value of the parameter k. Or,
> perhaps, a value of k that moves according to the specific situation.
> Hmm... actually... that could be used as a fudge factor to get
> everything to "match up"... :)

Probably not. The choice of k doesn't change the overall nature of the
uncertainty calculus.

> Also, attached is my latest revision. I have found that NARS deduction
> does not quite fit with my definitions. Induction and abduction are OK
> so far. If in the end I merely have something "close" to NARS, I will
> consider this a success-- it is an interpretation that fits well
> enough to show where NARS essentially differs from probability theory.

I'll find time to read it carefully.

Pei




Re: [agi] NARS vs. PLN [Was: NARS probability]

2008-09-23 Thread Pei Wang
On Tue, Sep 23, 2008 at 5:28 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>
>> > Yes.  One of my biggest practical complaints with NARS is that the
>> > induction
>> > and abduction truth value formulas don't make that much sense to me.
>>
>> I guess since you are trained as a mathematician, your "sense" has
>> been formalized by probability theory to some extent. ;-)
>
> Actually, the main reason the NARS induction and abduction truth value
> formulas
> seem counterintuitive to me has nothing to do with my math training...

Of course I was half joking when I said that. But the non-joking half
has some truth in it, I believe --- when we get used to a theory, it
gets into our judgments implicitly. For example, many people would say
that predicate logic is simpler or more natural than term logic, which
is really because the former is what they learned in school.

> it has to do
> with the fact that, in each case, the strength of the conclusion relies on
> the strength
> of only **one** of the premises.  This just does not feel right to me, quite
> apart
> from any mathematical intuitions or knowledge of probability theory.  It
> happens
> that in this case probability theory agrees with my naive, pretheoretic
> intuition...

Understand, and I don't think I can change that soon, so I'll stop here.

Just to let you know that I'm much more confident about the
abduction/induction function than about the deduction function. In the
history of NARS, the latter has been modified several times, and I'm
still not fully happy with the current one. The former, on the
contrary, has never been changed, and I'm perfectly happy with it.

>> > PLN is able to make judgments, in every case, using *exactly* the same
>> > amount of evidence that NARS is.
>>
>> Without assumptions on "node probability"? In your example, what is
>> the conclusion from PLN if it is only given (1)-(4) ?
>
> PLN needs to make assumptions about node probability in this case; but NARS
> also makes assumptions, it's just that NARS's assumptions are more deeply
> hidden in the formalism...

If you mean assumptions like "insufficient knowledge and resources",
you are right, but that is not at the same level as assumptions about
the values of node probability.

I guess my previous question was not clear enough: if the only domain
knowledge PLN has is

> Ben is an author of a book on AGI 
> This dude is an author of a book on AGI 

and

> Ben is odd 
> This dude is odd 

Will the system derive anything?

Pei




Re: [agi] NARS vs. PLN [Was: NARS probability]

2008-09-23 Thread Pei Wang
Yes, I know them, though I don't like any of them that I've seen. I
wonder whether Abram can find something better.

To tell you the truth, my whole idea of confidence actually came from
a probabilistic formula, after my re-interpretation of it.

Pei

On Tue, Sep 23, 2008 at 4:35 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>
> Note that formally, the
>
> c = n/(n+k)
>
> equation also exists in the math of the beta distribution, which is used
> in Walley's imprecise probability theory and also in PLN's indefinite
> probabilities...
>
> So there seems some hope of making such a correspondence, based on
> algebraic evidence...
>
> ben
>
> On Tue, Sep 23, 2008 at 4:29 PM, Pei Wang <[EMAIL PROTECTED]> wrote:
>>
>> Abram,
>>
>> Can your approach give the Confidence measurement a probabilistic
>> interpretation? It is what really differs NARS from the other
>> approaches.
>>
>> Pei
>>
>> On Mon, Sep 22, 2008 at 11:22 PM, Abram Demski <[EMAIL PROTECTED]>
>> wrote:
>> >>> This example also shows why NARS and PLN are similar on deduction, but
>> >>> very different in abduction and induction.
>> >>
>> >> Yes.  One of my biggest practical complaints with NARS is that the
>> >> induction
>> >> and abduction truth value formulas don't make that much sense to me.
>> >
>> > Interesting in the context of these statements that my current
>> > "justification" for NARS probabilistically justifies induction and
>> > abduction but isn't as clear concerning deduction. (I'm working on
>> > it...)
>> >
>> > --Abram Demski
>> >
>> >
>> > ---
>> > agi
>> > Archives: https://www.listbox.com/member/archive/303/=now
>> > RSS Feed: https://www.listbox.com/member/archive/rss/303/
>> > Modify Your Subscription: https://www.listbox.com/member/?&;
>> > Powered by Listbox: http://www.listbox.com
>> >
>>
>>
>> ---
>> agi
>> Archives: https://www.listbox.com/member/archive/303/=now
>> RSS Feed: https://www.listbox.com/member/archive/rss/303/
>> Modify Your Subscription: https://www.listbox.com/member/?&;
>> Powered by Listbox: http://www.listbox.com
>
>
>
> --
> Ben Goertzel, PhD
> CEO, Novamente LLC and Biomind LLC
> Director of Research, SIAI
> [EMAIL PROTECTED]
>
> "Nothing will ever be attempted if all possible objections must be first
> overcome " - Dr Samuel Johnson
>
>
> 
> agi | Archives | Modify Your Subscription




Re: [agi] NARS vs. PLN [Was: NARS probability]

2008-09-23 Thread Pei Wang
On Mon, Sep 22, 2008 at 10:09 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>
> One interesting observation is that these truth values approximate
> relatively
> uninformative points on the probability distributions that PLN would attach
> to these relationships.
>
> That is, <1.0;0.45> , if interpreted as a probabilistic truth value, would
> indicate a fairly wide interval of probabilities containing 1.0
>
> Which is not necessarily wrong, but is not maximally interesting ... there
> might
> be a narrower interval centered somewhere besides 1.0
>
> (the confidence 0.45, in a PLN-like interpretation, is inverse to
> probability interval width)

From the beginning (see
http://www.cogsci.indiana.edu/pub/wang.inheritance_nal.ps) the
truth-value of NARS has an interval interpretation, where the width of
the interval (called "ignorance") is the opposite of confidence.
Therefore, deductive conclusions usually have narrower intervals than
abductive/inductive conclusions.
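
In that interval reading, a small sketch (the endpoint formulas are as
I recall them from the cited paper): the frequency interval is
[f*c, 1 - c*(1-f)], so its width is exactly 1 - c:

def frequency_interval(f, c):
    # lower/upper bounds on frequency; the width ("ignorance") is 1 - c
    lower = f * c
    upper = 1.0 - c * (1.0 - f)
    return lower, upper

print(frequency_interval(1.0, 0.45))   # (0.45, 1.0) -- wide, abductive case
print(frequency_interval(1.0, 0.81))   # (0.81, 1.0) -- narrower, deductive case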

> This is all correct, but the problem I have is that something which should
> IMO be very simple and instinctive is being done in an overly
> complicated way  Knowledge of math should not be needed to
> do an inference this simple...

It is simple to me --- at least I don't need "node probability". ;-)

>> The same result can be obtained in other ways. Even if NARS doesn't
>> know math, if the system has met AGI author many times, and only in
>> one percent of the times the person happens to be Ben, the system will
>> also learn something like (7). The same for (8).
>
> But also, observations of Ben should not be needed to do this inference...

If Ben isn't in "AGI authors", this concept won't be involved in this inference.

>> What does this mean? To me, it once again shows what I've been saying
>> all the time: NARS doesn't always give better results than PLN or
>> other probability-based approaches, but it does assume less knowledge
>> and fewer resources. In this example, from knowledge (1)-(4) alone, NARS
>> derives (5)-(6), but probability-based approaches, including PLN, cannot
>> derive anything until knowledge is obtained (or assumptions are made) about
>> the involved "node probabilities". For NARS, when this information
>> becomes available, it may be taken into consideration to change the
>> system's conclusions, though it is not demanded in all cases.
>
> It is simple enough, in PLN, to assume that all terms have equal
> probability ... in the absence of knowledge to the contrary.

Though it is usually intuitive and natural, it is still an assumption
to be made, and it does cause problems in certain situations.
Furthermore, what is the "knowledge to the contrary" that leads the
system to undo the assumption? How?

>> This example also shows why NARS and PLN are similar on deduction, but
>> very different in abduction and induction.
>
> Yes.  One of my biggest practical complaints with NARS is that the induction
> and abduction truth value formulas don't make that much sense to me.

I guess since you are trained as a mathematician, your "sense" has
been formalized by probability theory to some extent. ;-)

> I understand
> their mathematical/conceptual derivation using boundary conditions, but to
> me
> they seem to produce generally uninteresting conclusion truth values,
> corresponding
> roughly to "suboptimally informative points on the conclusion truth value's
> probability
> distribution" ...

... just like metaphors, which cannot be used to prove anything.

A single non-deductive conclusion is almost never useful. The value of
such conclusions comes when they are accumulated in the long run.

> in OpenCogPrime
> we use other methods for hypothesis generation, then use probability theory
> for estimating the truth values of these hypotheses...

Many people have argued that "hypothesis generation" and "hypothesis
evaluation" should be separated. I strongly think that is wrong,
though I don't have the time to argue on that now.

> PLN is able to make judgments, in every case, using *exactly* the same
> amount of evidence that NARS is.

Without assumptions on "node probability"? In your example, what is
the conclusion from PLN if it is only given (1)-(4) ?

Pei




Re: [agi] NARS vs. PLN [Was: NARS probability]

2008-09-23 Thread Pei Wang
Abram,

Can your approach give the Confidence measurement a probabilistic
interpretation? It is what really differs NARS from the other
approaches.

Pei

On Mon, Sep 22, 2008 at 11:22 PM, Abram Demski <[EMAIL PROTECTED]> wrote:
>>> This example also shows why NARS and PLN are similar on deduction, but
>>> very different in abduction and induction.
>>
>> Yes.  One of my biggest practical complaints with NARS is that the induction
>> and abduction truth value formulas don't make that much sense to me.
>
> Interesting in the context of these statements that my current
> "justification" for NARS probabilistically justifies induction and
> abduction but isn't as clear concerning deduction. (I'm working on
> it...)
>
> --Abram Demski
>
>
> ---
> agi
> Archives: https://www.listbox.com/member/archive/303/=now
> RSS Feed: https://www.listbox.com/member/archive/rss/303/
> Modify Your Subscription: https://www.listbox.com/member/?&;
> Powered by Listbox: http://www.listbox.com
>




Re: [agi] re: NARS probability

2008-09-22 Thread Pei Wang
On Mon, Sep 22, 2008 at 8:05 PM, Abram Demski <[EMAIL PROTECTED]> wrote:
>> by treating terms as sets,
>> inheritance as partial subset, and frequency as the extent of partial
>> inclusion,
>
> No.
>
>> then I don't think it is possible.
>
> If you read the definitions I attached, you'll see that I am not
> taking inheritance to be set-membership. (It is fine if you didn't
> take time to read it, time is always limited.) Instead, I've taken
> A=>B to be the probability distribution that is average between
> p(A=>C|B=>C) and p(C=>B|C=>A).

Yes, I noticed that (which is nice) when reading the document, though
I didn't consider that point when writing the email. My mistake.

However, I'm still not convinced that it can be done --- see the email
I just sent.

Pei

> --Abram Demski
>
>
> ---
> agi
> Archives: https://www.listbox.com/member/archive/303/=now
> RSS Feed: https://www.listbox.com/member/archive/rss/303/
> Modify Your Subscription: https://www.listbox.com/member/?&;
> Powered by Listbox: http://www.listbox.com
>




[agi] NARS vs. PLN [Was: NARS probability]

2008-09-22 Thread Pei Wang
Given the nature of this topic, I start a new thread for it.

Ben proposed the following example to reveal a difference between NARS
and PLN, with the hope to show why PLN is better. Now let me use the
same example to show the opposite conclusion. ;-)

In the first part, I'll just translate Ben's example into Narsese,
then derive his conclusion, with more details.
[ For the grammar of Narsese, see
http://code.google.com/p/open-nars/wiki/InputOutputFormat ]

Assuming 4 input judgments, with the same default confidence value (0.9):

(1) {Ben} --> AGI-author <1.0;0.9>
(2) {dude-101} --> AGI-author <1.0;0.9>
(3) {Ben} --> odd-people <1.0;0.9>
(4) {dude-102} --> odd-people <1.0;0.9>

From (1) and (2), by abduction, NARS derives (5)
(5) {dude-101} --> {Ben} <1.0;0.45>

Since (3) and (4) give the same evidence, they derive the same conclusion
(6) {dude-102} --> {Ben} <1.0;0.45>

Ben argues that since there are many more odd people than AGI authors,
(5) should have a "higher" truth-value, in a certain sense, which is
the case in PLN, by using Bayes rule.

So far, I agree with Ben, but just add that in NARS, the information
"there are many more odd people than AGI authors" has not been taken
into consideration yet.

That information can be added in several different forms. For example,
after NARS learns some math, from the information that there are only
about 100 AGI authors but 100 odd people (a conservative
estimation, I guess), plus the fact that Ben is in both categories, and the principle
of indifference, the system should have the following knowledge:
(7) AGI-author --> {Ben} <0.01;0.9>
(8) odd-people --> {Ben} <0.01;0.9>

Now from (2) and (7), by deduction, NARS gets
(9) {dude-101} --> {Ben} <0.01;0.81>

and from (4) and (8), also by deduction, the conclusion is
(10) {dude-102} --> {Ben} <0.01;0.81>

[Here I'm taking a shortcut. In the current implementation, the
deduction rule only directly produces strong positive conclusions,
while strong negative conclusions are produced with the help of the
negation operator, which is something I skipped in this discussion.
So, in the actual case, the confidence will be lower than 0.81 (the
product of the confidence values of the premises), but not by too
much.]

The same result can be obtained in other ways. Even if NARS doesn't
know math, if the system has met AGI authors many times, and only in
one percent of those times the person happens to be Ben, the system will
also learn something like (7). The same for (8).

When the system gets both (5)-(6) and (9)-(10), the latter is chosen
as the final conclusion, given their higher confidence values. [The two
pairs won't be merged, because they come from overlapping evidence ---
(2) and (4) are used in both cases.]

Now NARS gives exactly the conclusion Ben asked.
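
A compact Python check of the last two steps, using the shortcut from
the bracketed note above (f = f1*f2 and c roughly equal to c1*c2 for the
deduction; the exact function gives a slightly lower confidence):

# Conclusions (5)/(6) from the abduction step, as given above:
abductive = (1.0, 0.45)

def deduction_approx(f1, c1, f2, c2):
    # shortcut: frequency f1*f2, confidence approximately c1*c2
    return f1 * f2, c1 * c2

# (2) {dude-101} --> AGI-author <1.0;0.9> with (7) AGI-author --> {Ben} <0.01;0.9>:
deductive = deduction_approx(1.0, 0.9, 0.01, 0.9)
print(deductive)                                          # (0.01, ~0.81)

# The deductive conclusion wins because its confidence is higher:
print(max([abductive, deductive], key=lambda fc: fc[1]))  # (0.01, ~0.81)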

So, what is going on here? The information referred to as "node
probability" in PLN sometimes (though maybe not always) builds
"reverse" links in NARS, and consequently turns an abduction (or
induction) into a deduction, whose conclusion will "override" the
abductive/inductive conclusion, because deductive conclusions usually
have higher confidence values. This is not really news, because
abduction and induction are implemented in PLN as
deduction-on-reversed-link.

What does this mean? To me, it once again shows what I've been saying
all the time: NARS doesn't always give better results than PLN or
other probability-based approaches, but it does assume less knowledge
and fewer resources. In this example, from knowledge (1)-(4) alone, NARS
derives (5)-(6), but probability-based approaches, including PLN, cannot
derive anything until knowledge is obtained (or assumptions are made) about
the involved "node probabilities". For NARS, when this information
becomes available, it may be taken into consideration to change the
system's conclusions, though it is not demanded in all cases.

This example also shows why NARS and PLN are similar on deduction, but
very different in abduction and induction. In my opinion, what are called
"abduction" and "induction" in PLN are special forms of deduction,
which produce solid conclusions but also demand more evidence to start
with. Actually, probability theory is about (multi-valued) deduction
only. It doesn't build tentative conclusions first and then use
additional evidence to revise or override them, which is how
non-deductive inference works.

NARS can deliberately use probability theory by coding P(E) = 0.8 into
a Narsese judgment like "(*, E, 0.8) --> probability-of <1.0;0.99>",
though this is not built-in, but must be learned by the system, just
as it is by us. Its "native logic" is similar to probability theory here
and there, but is based on very different assumptions.

Pei


On Sun, Sep 21, 2008 at 10:46 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>
> As an example inference, consider
>
> Ben is an author of a book on AGI 
> This dude is an author of a book on AGI 
> |-
> This dude is Ben 
>
> versus
>
> Ben is odd 
> This dude is odd 
> |-
> This dude is Ben 
>

Re: [agi] re: NARS probability

2008-09-22 Thread Pei Wang
On Mon, Sep 22, 2008 at 5:41 PM, Pei Wang <[EMAIL PROTECTED]> wrote:
> [...] To me, to "to interpret NARS probabilistically"
> means to take the make assumption as NARS, [...]

Sorry, I mean "to make the same assumption ..."

Pei




Re: [agi] re: NARS probability

2008-09-22 Thread Pei Wang
Abram,

In the document, you wrote: "Pei Wang suggested that I should make no
reference to objective probabilities from an external world. But, this
does not fit with my goal-- I'm trying to give a probabilistic
interpretation, after all. However, the system never manipulates the
external probabilities. They are referenced only to indicate the
semantics."

However, semantics (what the numbers measure under what assumptions)
is exactly what differs NARS from the conventional applications of
probability theory. To me, to "to interpret NARS probabilistically"
means to take the make assumption as NARS, but re-do all the
truth-value functions according to probability theory (as a pure
mathematical theory applied to this situation).

If by "probabilistic justification for NARS" you mean the current NARS
truth-value functions can be justified even according to the common
semantics of probability theory, by treating terms as sets,
inheritance as partial subset, and frequency as the extent of partial
inclusion, then I don't think it is possible.

Let me spend some time to analyze the example Ben raised in a
following email. Hopefully it will show you why NARS is not based on
probability theory.

Pei

On Sun, Sep 21, 2008 at 10:10 PM, Abram Demski <[EMAIL PROTECTED]> wrote:
> Attached is my attempt at a probabilistic justification for NARS.
>
> --Abram
>
>
>
> ---
> agi
> Archives: https://www.listbox.com/member/archive/303/=now
> RSS Feed: https://www.listbox.com/member/archive/rss/303/
> Modify Your Subscription: https://www.listbox.com/member/?&;
> Powered by Listbox: http://www.listbox.com
>




Re: [agi] Waiting to gain information before acting

2008-09-21 Thread Pei Wang
On Sun, Sep 21, 2008 at 4:15 PM, William Pearson <[EMAIL PROTECTED]> wrote:
> 2008/9/21 Pei Wang <[EMAIL PROTECTED]>:
>> There are several issues involved in this example, though the basic is:
>>
>> (1) There is a decision to be made before a deadline (after 10 days),
>> let's call it goal A, written as A?
>> (2) At the current moment, the available information is not enough to
>> support a confident conclusion, that is, the system has belief A,
>> though the confidence c is below the threshold (to trigger an
>> immediate betting action).
>
> Can you get it so that without the knowledge of the possibility of
> future information it would still act? E.g. is the threshold
> adjustable in some way.

In the current implementation the threshold is just a constant
(DECISION_THRESHOLD in
http://code.google.com/p/open-nars/source/browse/trunk/nars/main/Parameters.java),
but in the future the decision will also depend on other factors, such
as the urgency of the goal --- if the decision must be taken very
soon, the system won't wait for more information, but will just use
whatever evidence is available.
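
For concreteness, a minimal sketch of such a decision rule is below. Only
the name DECISION_THRESHOLD is taken from the Parameters.java linked
above; the method, the urgency adjustment, and the numbers are
hypothetical, meant only to illustrate "act when confident enough, or
when the deadline leaves no room to wait".

    // Hypothetical sketch of a confidence-gated decision rule. Only the name
    // DECISION_THRESHOLD comes from the open-nars Parameters.java mentioned
    // above; the rest is illustrative, not the actual open-nars logic.
    public class DecisionSketch {
        static final double DECISION_THRESHOLD = 0.30;   // illustrative value

        // confidence: confidence of the best belief supporting the action
        // urgency: 0 = the deadline is far away, 1 = must decide right now
        static boolean actNow(double confidence, double urgency) {
            double effectiveThreshold = DECISION_THRESHOLD * (1.0 - urgency);
            return confidence >= effectiveThreshold;
        }

        public static void main(String[] args) {
            System.out.println(actNow(0.20, 0.1));   // far from deadline: wait
            System.out.println(actNow(0.20, 0.9));   // deadline near: act anyway
        }
    }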

>> (3) It is known that future evidence B (the weather in Russia 5 days
>> before the deadline) will provide a better solution, that is, B==>A
>> with a high truth-value.
>
> It is this step I am interested in.
>
> Normally knowing X does not imply that you know X is useful for
> solving all problems that X is useful for helping to solve.  If I tell
> you that Gordon Brown the current Prime Minister of the UK is in
> political difficulties, you don't know that this will be useful for
> answering a prize quiz question such as, "Who resigned after losing a
> party leadership election this week" (it hasn't happened yet, but
> might do).  You might figure out that the fact I gave is useful for
> this question and then guess the answer is Gordon Brown, without
> having better information.

Of course.

> So there would need to be some form of search or linkage, so that you
> construct the potential usefulness of facts as yet unknown, with their
> usefulness as parts of the solution of problems.
>
> Does NARS do this?

In NARS, what information will be useful for what problem is a kind of
knowledge the system learns from its experience. It is not obtained by
search, but by inference --- a constructive process. Even so, it is
always possible that fact F will be helpful for solving problem P,
though the system hasn't realized that yet. The system cannot afford
the resources to exhaust all linkages, though it will try to find as
many as it can.

> I should probably reformulate the scenario as the problem of which
> question to phone a friend on "Who wants to be a millionaire". I shall
> try and do so.

Yes, it makes sense.

Pei

>  Will Pearson
>
>




Re: [agi] NARS probability

2008-09-21 Thread Pei Wang
When working on your new proposal, remember that in NARS all
measurements must be based on what the system has --- limited evidence
and resources. I don't allow any "objective probability" that only
exists in a Platonic world or the infinite future.

Pei

On Sun, Sep 21, 2008 at 1:53 PM, Abram Demski <[EMAIL PROTECTED]> wrote:
> Hmm... I didn't mean infinite evidence, only infinite time and space
> with which to compute the consequences of evidence. But that is
> interesting too.
>
> The higher-order probabilities I'm talking about introducing do not
> reflect inaccuracy at all. :)
> This may seem odd, but it seems to me to follow from your development
> of NARS... so the difficulty for me is to account for why you can
> exclude it in your system. Of course, this need only arises from
> interpreting your definitions probabilistically.
>
> I think I have come up with a more specific proposal. I will try to
> write it up properly and see if it works.
>
> --Abram
>
> On Sat, Sep 20, 2008 at 11:28 PM, Pei Wang <[EMAIL PROTECTED]> wrote:
>> On Sat, Sep 20, 2008 at 11:02 PM, Abram Demski <[EMAIL PROTECTED]> wrote:
>>> You are right in what you say about (1). The truth is, my analysis is
>>> meant to apply to NARS operating with unrestricted time and memory
>>> resources (which of course is not the point of NARS!). So, the
>>> question is whether NARS approaches a probability calculation as it is
>>> given more time to use all its data.
>>
>> That is an interesting question. When the weight of evidence w goes to
>> infinity, confidence approaches its maximum, and frequency converges to
>> the limiting proportion of positive evidence among all evidence, so it
>> becomes a probability, under a certain interpretation. Therefore, as far
>> as a single truth value is concerned, probability theory is an extreme
>> case of NARS.
>>
>> However, to take all truth values in the system into account, it is
>> not necessarily true, because the two theories specify the relations
>> among statements/propositions differently. For example, probability
>> theory has conditional B|A, while NARS uses implication A==>B, which
>> are similar, but not the same. Of course, there are some overlaps,
>> such as disjunction and conjunction, where NARS converges to
>> probability theory in the extreme case (infinite evidence).
>>
>>> As for higher values... NARS and PLN may be using them for the purpose
>>> you mention, but that is not the purpose I am giving them in my
>>> analysis! In my analysis, I am simply trying to justify the deductions
>>> allowed in NARS in a probabilistic way. Higher-order probabilities are
>>> potentially useful here because of the way you sum evidence. Simply
>>> put, it is as if NARS purposefully ignores the distinction between
>>> different probability levels, so that a NARS frequency is also a
>>> frequency-of-frequencies and frequency-of-frequency-of frequencies and
>>> so on, all the way up.
>>
>> I see what you mean, but as it is currently defined, in NARS there is
>> no need to introduce higher-order probabilities --- frequency is not
>> an estimation of a "true probability". It is uncertain because of the
>> influence of new evidence, not because it is inaccurate.
>>
>>> The simple way of dealing with this is to say that it is wrong, and
>>> results from a confusion of similar-looking mathematical entities.
>>> But, to some extent, it is intuitive: I should not care too much in
>>> normal reasoning which "level" of inheritance I'm using when I say
>>> that a truck is a type of vehicle. So the question is, can this be
>>> justified probabilistically? I think I can give a very tentative
>>> "yes".
>>
>> Hopefully we'll know better about that when you explore further. ;-)
>>
>> Pei
>>
>>> --Abram
>>>
>>> On Sat, Sep 20, 2008 at 9:38 PM, Pei Wang <[EMAIL PROTECTED]> wrote:
>>>> On Sat, Sep 20, 2008 at 9:09 PM, Abram Demski <[EMAIL PROTECTED]> wrote:
>>>>>>
>>>>>> (1) In probability theory, an event E has a constant probability P(E)
>>>>>> (which can be unknown). Given the assumption of insufficient knowledge
>>>>>> and resources, in NARS P(A-->B) would change over time, when more and
>>>>>> more evidence is taken into account. This process cannot be treated as
>>>>>> conditioning, because, among other things, the system can neither
>>>>>> explicitly list all evidence as condition, nor upd

Re: [agi] Waiting to gain information before acting

2008-09-21 Thread Pei Wang
There are several issues involved in this example, though the basic one is:

(1) There is a decision to be made before a deadline (after 10 days),
let's call it goal A, written as A?
(2) At the current moment, the available information is not enough to
support a confident conclusion, that is, the system has belief A,
though the confidence c is below the threshold (to trigger an
immediate betting action).
(3) It is known that future evidence B (the weather in Russia 5 days
before the deadline) will provide a better solution, that is, B==>A
with a high truth-value.
(4) By backward inference, the system produces a derived goal B?
(5) But the only way to answer B is to wait for 5 days, that is, C==>B
(6) Then, again by backward inference, the system gets a derived goal
C? --- to wait for 5 days.

Of course, to actually run this example in NARS, the situation is much
more complicated, but the above will be roughly what will happen.
Similarly, the "weather prediction software" provides a way to achieve
a goal, but some waiting time is needed as a precondition for that
path to be taken. In all these cases "wait" becomes an action the
system will take (while working on other tasks).
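
A schematic sketch of steps (1)-(6), reduced to its bare bones, is below;
the term names and the map-based "belief base" are invented, and this is
not the open-nars inference engine.

    import java.util.*;

    // Bare-bones sketch of the backward inference described above: from goal A?
    // and beliefs B==>A and C==>B (where C is "wait 5 days"), derive the goals
    // B? and C?. Illustrative only.
    public class BackwardChainSketch {
        // beliefs of the form "precondition ==> conclusion", keyed by conclusion
        static final Map<String, String> beliefs = Map.of(
            "bet-correctly", "know-russian-weather",     // B ==> A
            "know-russian-weather", "wait-5-days"        // C ==> B
        );

        public static void main(String[] args) {
            String goal = "bet-correctly";               // the original goal A?
            while (goal != null) {
                System.out.println("goal: " + goal + "?");
                goal = beliefs.get(goal);                // backward inference step
            }
        }
    }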

Pei


On Sun, Sep 21, 2008 at 11:00 AM, William Pearson <[EMAIL PROTECTED]> wrote:
> I've started to wander away from my normal sub-cognitive level of AI,
> and have been thinking about reasoning systems. One scenario I have
> come up with is the "foresight of extra knowledge" scenario.
>
> Suppose Alice and Bob have decided to bet $10 on whether the weather in
> Alaska 10 days from now will be warmer or colder than average, and it is
> Bob's turn to pick his side. He already thinks that it is going to be
> warmer than average (p 0.6) based on global warming and prevailing
> conditions. But he also knows that the weather in russia 5 day before
> is a good indicator of the conditions, that is he has a p 0.9 that if
> the russian weather is colder than average on day x alaskan weather
> will be colder than average on day x+5 and likewise for warmer. He has
> to pick his side of the bet 3 days before the due date so he can
> afford to wait.
>
> My question is, are currently proposed reasoning systems able to act so
> that Bob doesn't bet straight away, and waits for the extra
> information from Russia before making the bet?
>
> Lets try some backward chaining.
>
> Make money <- Win bet <- Pick most likely side <- Get more information
> about the most likely side
>
> The probability that a warm russia implies a warm alaska, does not
> intrinsically indicate that it gives you more information, allowing
> you to make a better bet.
>
> So, this is where I come to a halt, somewhat. How do you proceed the
> inference from here, it would seem you would have to do something
> special and treat every possible event that increases your ability to
> make a good guess on this bet as implying you have got more
> information (and some you don't?). You also would need to go with the
> meta-probability or some other indication of how good an estimate is,
> so that "more information" could be quantified.
>
> There are also more esoteric examples of waiting for more information,
> for example suppose Bob doesn't know about the russia-alaska
> connections but knows that a piece of software is going to be released
> that improves weather predictions in general. Can we still hook up
> that knowledge somehow?
>
>  Will Pearson
>
>




Re: [agi] NARS probability

2008-09-20 Thread Pei Wang
On Sat, Sep 20, 2008 at 11:02 PM, Abram Demski <[EMAIL PROTECTED]> wrote:
> You are right in what you say about (1). The truth is, my analysis is
> meant to apply to NARS operating with unrestricted time and memory
> resources (which of course is not the point of NARS!). So, the
> question is whether NARS approaches a probability calculation as it is
> given more time to use all its data.

That is an interesting question. When the weight of evidence w goes to
infinity, confidence approaches its maximum, and frequency converges to
the limiting proportion of positive evidence among all evidence, so it
becomes a probability, under a certain interpretation. Therefore, as far
as a single truth value is concerned, probability theory is an extreme
case of NARS.
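
A numeric sketch of that limit, using the usual NARS definitions
f = w+/w and c = w/(w+k) with evidential horizon k = 1; the fixed 0.8
ratio of positive evidence is arbitrary.

    // Numeric sketch of the limit discussed above, using the standard NARS
    // definitions f = w+/w and c = w/(w+k), here with k = 1. The fixed ratio
    // of positive evidence (0.8) is arbitrary.
    public class LimitSketch {
        public static void main(String[] args) {
            double k = 1.0, positiveRatio = 0.8;
            for (double w = 1; w <= 1_000_000; w *= 10) {
                double f = (positiveRatio * w) / w;   // frequency
                double c = w / (w + k);               // confidence
                System.out.printf("w=%9.0f  f=%.3f  c=%.6f%n", w, f, c);
            }
            // As w grows without bound, c approaches 1 and f stays at the
            // limiting proportion of positive evidence.
        }
    }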

However, to take all truth values in the system into account, it is
not necessarily true, because the two theories specify the relations
among statements/propositions differently. For example, probability
theory has conditional B|A, while NARS uses implication A==>B, which
are similar, but not the same. Of course, there are some overlaps,
such as disjunction and conjunction, where NARS converges to
probability theory in the extreme case (infinite evidence).

> As for higher values... NARS and PLN may be using them for the purpose
> you mention, but that is not the purpose I am giving them in my
> analysis! In my analysis, I am simply trying to justify the deductions
> allowed in NARS in a probabilistic way. Higher-order probabilities are
> potentially useful here because of the way you sum evidence. Simply
> put, it is as if NARS purposefully ignores the distinction between
> different probability levels, so that a NARS frequency is also a
> frequency-of-frequencies and frequency-of-frequency-of frequencies and
> so on, all the way up.

I see what you mean, but as it is currently defined, in NARS there is
no need to introduce higher-order probabilities --- frequency is not
an estimation of a "true probability". It is uncertain because of the
influence of new evidence, not because it is inaccurate.

> The simple way of dealing with this is to say that it is wrong, and
> results from a confusion of similar-looking mathematical entities.
> But, to some extent, it is intuitive: I should not care too much in
> normal reasoning which "level" of inheritance I'm using when I say
> that a truck is a type of vehicle. So the question is, can this be
> justified probabilistically? I think I can give a very tentative
> "yes".

Hopefully we'll know better about that when you explore further. ;-)

Pei

> --Abram
>
> On Sat, Sep 20, 2008 at 9:38 PM, Pei Wang <[EMAIL PROTECTED]> wrote:
>> On Sat, Sep 20, 2008 at 9:09 PM, Abram Demski <[EMAIL PROTECTED]> wrote:
>>>>
>>>> (1) In probability theory, an event E has a constant probability P(E)
>>>> (which can be unknown). Given the assumption of insufficient knowledge
>>>> and resources, in NARS P(A-->B) would change over time, when more and
>>>> more evidence is taken into account. This process cannot be treated as
>>>> conditioning, because, among other things, the system can neither
>>>> explicitly list all evidence as condition, nor update the probability
>>>> of all statements in the system for each piece of new evidence (so as
>>>> to treat all background knowledge as a default condition).
>>>> Consequently, at any moment P(A-->B) and P(B-->C) may be based on
>>>> different, though unspecified, data, so it is invalid to use them in a
>>>> rule to calculate the "probability" of A-->C --- probability theory
>>>> does not allow cross-distribution probability calculation.
>>>
>>> This is not a problem the way I set things up. The likelihood of a
>>> statement is welcome to change over time, as the evidence changes.
>>
>> If each of them is changed independently, you don't have a single
>> probability distribution anymore, but a bunch of them. In the above
>> case, you don't really have P(A-->B) and P(B-->C), but P_307(A-->B)
>> and P_409(B-->C). How can you use two probability values together if
>> they come from different distributions?
>>
>>>> (2) For the same reason, in NARS a statement might get different
>>>> "probability" attached, when derived from different evidence.
>>>> Probability theory does not have a general rule to handle
>>>> inconsistency within a probability distribution.
>>>
>>> The same statement holds for PLN, right?
>>
>> Yes. Ben proposed a solution, which I won't comment until I see all
>> the details in the PLN book.
>>
>>&

Re: [agi] NARS probability

2008-09-20 Thread Pei Wang
I found the paper.

As I guessed, their update operator is defined on the whole
probability distribution function, rather than on a single probability
value of an event. I don't think it is practical for AGI --- we cannot
afford the time to re-evaluate every belief on each piece of new
evidence. Also, I haven't seen a convincing argument on why an
intelligent system should follow the ME Principle.

Also this paper doesn't directly solve my example, because it doesn't
use second-order probability.

Pei

On Sat, Sep 20, 2008 at 10:13 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>
> The approach in that paper doesn't require any special assumptions, and
> could be applied to your example, but I don't have time to write up an
> explanation of how to do the calculations ... you'll have to read the paper
> yourself if you're curious ;-)
>
> That approach is not implemented in PLN right now but we have debated
> integrating it with PLN as in some ways it's subtler than what we currently
> do in the code...
>
> ben
>
> On Sat, Sep 20, 2008 at 10:02 PM, Pei Wang <[EMAIL PROTECTED]> wrote:
>>
>> I didn't know this paper, but I do know approaches based on the
>> principle of maximum/optimum entropy. They usually require much more
>> information (or assumptions) than what is given in the following
>> example.
>>
>> I'd be interested to know what the solution they will suggest for such
>> a situation.
>>
>> Pei
>>
>> On Sat, Sep 20, 2008 at 9:53 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>> >
>> >>
>> >>
>> >> Think about a concrete example: if from one source the system gets
>> >> P(A-->B) = 0.9, and P(P(A-->B) = 0.9) = 0.5, while from another source
>> >> P(A-->B) = 0.2, and P(P(A-->B) = 0.2) = 0.7, then what will be the
>> >> conclusion when the two sources are considered together?
>> >
>> > There are many approaches to this within the probabilistic framework,
>> > one of which is contained within this paper, for example...
>> >
>> > http://cat.inist.fr/?aModele=afficheN&cpsidt=16174172
>> >
>> > (I have a copy of the paper but I'm not sure where it's available for
>> > free online ... if anyone finds it please post the link... thx)
>> >
>> > Ben
>> > 
>>
>>
>
>
>
> --
> Ben Goertzel, PhD
> CEO, Novamente LLC and Biomind LLC
> Director of Research, SIAI
> [EMAIL PROTECTED]
>
> "Nothing will ever be attempted if all possible objections must be first
> overcome " - Dr Samuel Johnson
>
>
> 




Re: [agi] NARS probability

2008-09-20 Thread Pei Wang
I didn't know this paper, but I do know approaches based on the
principle of maximum/optimum entropy. They usually require much more
information (or assumptions) than what is given in the following
example.

I'd be interested to know what the solution they will suggest for such
a situation.

Pei

On Sat, Sep 20, 2008 at 9:53 PM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>
>>
>>
>> Think about a concrete example: if from one source the system gets
>> P(A-->B) = 0.9, and P(P(A-->B) = 0.9) = 0.5, while from another source
>> P(A-->B) = 0.2, and P(P(A-->B) = 0.2) = 0.7, then what will be the
>> conclusion when the two sources are considered together?
>
> There are many approaches to this within the probabilistic framework,
> one of which is contained within this paper, for example...
>
> http://cat.inist.fr/?aModele=afficheN&cpsidt=16174172
>
> (I have a copy of the paper but I'm not sure where it's available for
> free online ... if anyone finds it please post the link... thx)
>
> Ben
> 




Re: [agi] NARS probability

2008-09-20 Thread Pei Wang
On Sat, Sep 20, 2008 at 9:09 PM, Abram Demski <[EMAIL PROTECTED]> wrote:
>>
>> (1) In probability theory, an event E has a constant probability P(E)
>> (which can be unknown). Given the assumption of insufficient knowledge
>> and resources, in NARS P(A-->B) would change over time, when more and
>> more evidence is taken into account. This process cannot be treated as
>> conditioning, because, among other things, the system can neither
>> explicitly list all evidence as condition, nor update the probability
>> of all statements in the system for each piece of new evidence (so as
>> to treat all background knowledge as a default condition).
>> Consequently, at any moment P(A-->B) and P(B-->C) may be based on
>> different, though unspecified, data, so it is invalid to use them in a
>> rule to calculate the "probability" of A-->C --- probability theory
>> does not allow cross-distribution probability calculation.
>
> This is not a problem the way I set things up. The likelihood of a
> statement is welcome to change over time, as the evidence changes.

If each of them is changed independently, you don't have a single
probability distribution anymore, but a bunch of them. In the above
case, you don't really have P(A-->B) and P(B-->C), but P_307(A-->B)
and P_409(B-->C). How can you use two probability values together if
they come from different distributions?

>> (2) For the same reason, in NARS a statement might get different
>> "probability" attached, when derived from different evidence.
>> Probability theory does not have a general rule to handle
>> inconsistency within a probability distribution.
>
> The same statement holds for PLN, right?

Yes. Ben proposed a solution, which I won't comment until I see all
the details in the PLN book.

>> The first half is fine, but the second isn't. As the previous example
>> shows, in NARS a high Confidence does imply that the Frequency value
>> is a good summary of evidence, but a low Confidence does not imply
>> that the Frequency is bad, just that it is not very stable.
>
> But I'm not talking about confidence when I say "higher". I'm talking
> about the system of levels I defined, for which it is perfectly OK.

Yes, but the whole purpose of adding another value is to handle
inconsistency and belief revision. Higher-order probability is
mathematically sound, but won't do this work.

Think about a concrete example: if from one source the system gets
P(A-->B) = 0.9, and P(P(A-->B) = 0.9) = 0.5, while from another source
P(A-->B) = 0.2, and P(P(A-->B) = 0.2) = 0.7, then what will be the
conclusion when the two sources are considered together?
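
For contrast, the way NARS itself combines two independent sources is its
revision rule, which pools amounts of evidence instead of stacking
probabilities of probabilities. The sketch below is written from my own
reading of the NAL material (evidential horizon k = 1); it treats each
source as a (frequency, confidence) judgment, so the inputs only loosely
echo the numbers above and are not a translation of the second-order
probabilities.

    // Sketch of the NARS revision rule: convert each (f, c) judgment to an
    // amount of evidence w = c/(1-c) (with k = 1), pool the evidence of the
    // two independent sources, and convert back. Written from my reading of
    // the NAL papers; the inputs are illustrative.
    public class RevisionSketch {
        static double[] revise(double f1, double c1, double f2, double c2) {
            double w1 = c1 / (1 - c1), w2 = c2 / (1 - c2);   // total evidence
            double wPlus = f1 * w1 + f2 * w2;                // positive evidence
            double w = w1 + w2;
            return new double[] { wPlus / w, w / (w + 1) };  // {f, c}, k = 1
        }

        public static void main(String[] args) {
            double[] r = revise(0.9, 0.5, 0.2, 0.7);
            System.out.printf("revised: f=%.3f  c=%.3f%n", r[0], r[1]);
        }
    }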

Pei




Re: The brain does not implement formal logic (was Re: [agi] Where the Future of AGI Lies)

2008-09-20 Thread Pei Wang
Matt,

I really hope NARS can be simplified, but until you give me the
details, such as how to calculate the truth value in your "converse"
rule, I cannot see how you can do the same things with a simpler
design.

NARS has this conversion rule, which, with the deduction rule, can
"replace" induction/abduction, just as you suggested. However,
conclusions produced in this way usually have lower confidence than
those directly generated by induction/abduction, so this trick is not
that useful in NARS.

This result is discussed in
http://www.cogsci.indiana.edu/pub/wang.inheritance_nal.ps , page 27.

For your original claim that "The brain does not implement formal
logic", my brief answers are:

(1) So what? Who said AI must duplicate the brain? Just because we
cannot imagine another possibility?

(2) In a broad sense, "formal logic" is nothing but
"domain-independent and justifiable data manipulation schemes". I
haven't seen any argument for why AI cannot be achieved by
implementing that. After all, "formal logic" is not limited to
"First-Order Predicate Calculus plus Model Theory".

Pei


On Sat, Sep 20, 2008 at 4:44 PM, Matt Mahoney <[EMAIL PROTECTED]> wrote:
> --- On Fri, 9/19/08, Jan Klauck <[EMAIL PROTECTED]> wrote:
>
>> Formal logic doesn't scale up very well in humans. That's why this
>> kind of reasoning is so unpopular. Our capacities are that
>> small and we connect to other human entities for a kind of
>> distributed problem solving. Logic is just a tool for us to
>> communicate and reason systematically about problems we would
>> mess up otherwise.
>
> Exactly. That is why I am critical of probabilistic or uncertain logic. 
> Humans are not very good at logic and arithmetic problems requiring long 
> sequences of steps, but duplicating these defects in machines does not help. 
> It does not solve the problem of translating natural language into formal 
> language and back. When we need to solve such a problem, we use pencil and 
> paper, or a calculator, or we write a program. The problem for AI is to 
> convert natural language to formal language or a program and back. The formal 
> reasoning we already know how to do.
>
> Even though a language model is probabilistic, probabilistic logic is not a 
> good fit. For example, in NARS we have deduction (P->Q, Q->R) => (P->R), 
> induction (P->Q, P->R) => (Q->R), and abduction (P->R, Q->R) => (P->Q). 
> Induction and abduction are not strictly true, of course, but in a 
> probabilistic logic we can assign them partial truth values.
>
> For language modeling, we can simplify the logic. If we accept the "converse" 
> rule (P->Q) => (Q->P) as partially true (if rain predicts clouds, then clouds 
> may predict rain), then we can derive induction and abduction from deduction 
> and converse. For induction, (P->Q, P->R) => (Q->P, P->R) => (Q->R). 
> Abduction is similar. Allowing converse, the statement (P->Q) is really a 
> fuzzy equivalence or association (P ~ Q), e.g. (rain ~ clouds).
>
> A language model is a set of associations between concepts. Language learning 
> consists of two operations carried out on a massively parallel scale: forming 
> associations and forming new concepts by clustering in context space. An 
> example of the latter is:
>
> the dog is
> the cat is
> the house is
> ...
> the (noun) is
>
> So if we read "the glorp is" we learn that "glorp" is a noun. Likewise, we 
> learn something of its meaning from its more distant context, e.g. "the glorp 
> is eating my flowers". We do this by the transitive property of association, 
> e.g. (glorp ~ eating flowers ~ rabbit).
>
> This is not to say NARS or other systems are wrong, but rather that they have 
> more capability than we need to solve reasoning in AI. Whether the extra 
> capability helps or not is something that requires experimental verification.
>
> -- Matt Mahoney, [EMAIL PROTECTED]
>
>
>




Re: [agi] NARS probability

2008-09-20 Thread Pei Wang
On Sat, Sep 20, 2008 at 2:22 PM, Abram Demski <[EMAIL PROTECTED]> wrote:
> It has been mentioned several times on this list that NARS has no
> proper probabilistic interpretation. But, I think I have found one
> that works OK. Not perfectly. There are some differences, but the
> similarity is striking (at least to me).

Abram,

There is indeed a lot of similarity between NARS and probability
theory. When I started this project, my plan was to use probability
theory to handle uncertainty. I moved away from it after I came to
believe that what is needed cannot be fully obtained from that theory
and its extensions. Even so, NARS still agrees with probability theory
here and there, as mentioned in my papers.

The key, therefore, is whether NARS can be FULLY treated as an
application of probability theory, by following the probability
axioms, and only adding justifiable consistent assumptions when
necessary.

> I imagine that what I have come up with is not too different from what
> Ben Goertzel and Pei Wang have already hashed out in their attempts to
> reconcile the two, but we'll see. The general idea is to treat NARS as
> probability plus a good number of regularity assumptions that justify
> the inference steps of NARS. However, since I make so many
> assumptions, it is very possible that some of them conflict. This
> would show that NARS couldn't fit into probability theory after all,
> but it is still interesting even if that's the case...

I assume by "treat NARS as probability" you mean "to treat the
Frequency in NARS as a measurement following the axioms of probability
theory". I mentioned this because there is another measurement in
NARS, Expectation (which is derived from Frequency and Confidence),
which is also intuitively similar to probability.

> So, here's an outline. We start with the primitive inheritance
> relation, A inh B; this could be called "definite inheritance",
> because it means that A inherits all of B's properties, and B inherits
> all of A's instances. B is a superset of A. The truth value is 1 or 0.

Fine.

> Then, we define "probabilistic inheritance", which carries a
> probability that a given property of B will be inherited by A and that
> a given instance of A will be inherited by B.

There is a tricky issue here. When evaluating the truth value of
A-->B, NARS doesn't only check "properties" and "instances", but also
checks "supersets" and "subsets", intuitively speaking. For example,
when the system is told that "Swans are birds" and "Swans fly", it
derives "Birds fly" by induction. In this process "swan" is counted as
one piece of evidence, rather than a set of instances. How many swans
the system knows doesn't matter in this step. That is why in the
definitions I use "extension/intension", not "instance/property",
because the latter are just special cases of the former. Actually, the
truth value of A-->B measures how often the two terms can substitute
each other (in different ways), not how much one set is included in
the other, which is the usual probabilistic reading of an inheritance.

This is one reason why NARS does not define "node probability".
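
To make the extension/intension reading concrete, here is a small sketch
of how evidence for A-->B is counted, as I read the published NAL
definitions (positive evidence = shared extension terms of A and B plus
shared intension terms of B and A; total evidence = A's extension plus
B's intension; f = w+/w, c = w/(w+k), k = 1). The terms in the sets are
invented, and the point is that they are counted as terms, not as sets
of instances.

    import java.util.*;

    // Sketch of evidence counting for A --> B, following my reading of the NAL
    // definitions. Each term (e.g. "swan") counts as one piece of evidence,
    // regardless of how many instances it has. The terms are invented.
    public class EvidenceSketch {
        static long shared(Set<String> x, Set<String> y) {
            return x.stream().filter(y::contains).count();
        }

        public static void main(String[] args) {
            Set<String> extA = Set.of("swan", "robin");        // known specializations of A
            Set<String> intA = Set.of("animal", "feathered");  // known generalizations of A
            Set<String> extB = Set.of("swan", "robin", "bat");
            Set<String> intB = Set.of("animal", "flying");

            double wPlus = shared(extA, extB) + shared(intB, intA);  // 2 + 1
            double w = extA.size() + intB.size();                    // 2 + 2
            System.out.printf("w+=%.0f  w=%.0f  f=%.2f  c=%.2f%n",
                    wPlus, w, wPlus / w, w / (w + 1));
        }
    }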

> Probabilistic
> inheritance behaves somewhat like the full NARS inheritance: if we
> reason about likelihoods (the probability of the data assuming (A
> prob_inh B) = x), the math is actually the same EXCEPT we can only use
> primitive inheritance as evidence, so we can't spread evidence around
> the network by (1) treating prob_inh with high evidence as if it were
> primitive inh or (2) attempting to use deduction to accumulate
> evidence as we might want to, so that evidence for "A prob_inh B" and
> evidence for "B prob_inh C" gets combined to evidence for "A prob_inh
> C".

Besides the problem you mentioned, there are other issues. Let me start
with the basic ones:

(1) In probability theory, an event E has a constant probability P(E)
(which can be unknown). Given the assumption of insufficient knowledge
and resources, in NARS P(A-->B) would change over time, when more and
more evidence is taken into account. This process cannot be treated as
conditioning, because, among other things, the system can neither
explicitly list all evidence as condition, nor update the probability
of all statements in the system for each piece of new evidence (so as
to treat all background knowledge as a default condition).
Consequently, at any moment P(A-->B) and P(B-->C) may be based on
different, though unspecified, data, so it is invalid to use them in a
rule to calculate the "probability" of A-->C --- probability theory
does not allow cross-distribution probability calculation.

(2) For the same 

[agi] Re: Case-by-case Problem Solving (draft)

2008-09-19 Thread Pei Wang
Instead of responding to each comment, I'd make the following answers
altogether:

1. This paper assumes a background of algorithm analysis. People
without that won't correctly understand what I mean.

2. A CPS system is "non-algorithmic" with respect to some problems,
while still being "algorithmic" with respect to some other problems. For
NARS, the former is the case for all user-level "problems" (that the
user provides in Narsese), and the latter is the case at the micro level
(single step) and the macro level (lifelong experience). In contrast,
an APS system is "algorithmic" at all three levels. As the system's
designer, I write algorithms to make NARS run, though I don't write
any algorithm to handle the problems the system meets in its own life
cycle. This difference has been explained in
http://nars.wang.googlepages.com/wang.computation.pdf .

3. No, NARS hasn't solved any problem that no human can (for what the
current implementation can do, visit
http://code.google.com/p/open-nars/). The point the paper wants to make
is that the "problems" an AI system can "solve" are not bounded by
computability theory and computational complexity theory, though it is
still too early to tell how far it can go in this direction.

Pei

On Thu, Sep 18, 2008 at 4:05 PM, Pei Wang <[EMAIL PROTECTED]> wrote:
> TITLE: Case-by-case Problem Solving (draft)
>
> AUTHOR: Pei Wang
>
> ABSTRACT: Case-by-case Problem Solving is an approach in which the
> system solves the current occurrence of a problem instance by taking
> the available knowledge into consideration, under the restriction of
> available resources. It is different from the traditional Algorithmic
> Problem Solving in which the system applies a given algorithm to each
> problem instance. Case-by-case Problem Solving is suitable for
> situations where the system has no applicable algorithm for a problem.
> This approach gives the system flexibility, originality, and
> scalability, at the cost of predictability. This paper introduces the
> basic notion of case-by-case problem solving, as well as its most
> recent implementation in NARS, an AGI project.
>
> URL: http://nars.wang.googlepages.com/wang.CaseByCase.pdf
>




[agi] Case-by-case Problem Solving (draft)

2008-09-18 Thread Pei Wang
TITLE: Case-by-case Problem Solving (draft)

AUTHOR: Pei Wang

ABSTRACT: Case-by-case Problem Solving is an approach in which the
system solves the current occurrence of a problem instance by taking
the available knowledge into consideration, under the restriction of
available resources. It is different from the traditional Algorithmic
Problem Solving in which the system applies a given algorithm to each
problem instance. Case-by-case Problem Solving is suitable for
situations where the system has no applicable algorithm for a problem.
This approach gives the system flexibility, originality, and
scalability, at the cost of predictability. This paper introduces the
basic notion of case-by-case problem solving, as well as its most
recent implementation in NARS, an AGI project.

URL: http://nars.wang.googlepages.com/wang.CaseByCase.pdf




Re: [agi] uncertain logic criteria

2008-09-18 Thread Pei Wang
On Wed, Sep 17, 2008 at 10:54 PM, Abram Demski <[EMAIL PROTECTED]> wrote:
> Pei,
>
> You are right, that does sound better than "quick-and-dirty". And more
> relevant, because my primary interest here is to get a handle on what
> normative epistemology should tell us to conclude if we do not have
> time to calculate the full set of consequences to (uncertain) facts.

Fully understand. As far as uncertain reasoning is concerned, NARS
aims at a normative model that is optimal under certain restrictions,
and in this sense it is not inferior to probability theory, but
designed under different assumptions. Especially, NARS is not an
approximation or a second-rate substitute for probability theory, just
as probability theory is not a second-rate substitute for binary logic.

> It is unfortunate that I had to use biased language, but probability
> is of course what I am familiar with... I suppose, though, that most
> of the terms could be roughly translated into NARS? Especially
> independence, and I should hope conditional independence as well.
> Collapsing probabilities can be restated as generally collapsing
> uncertainty.

From page 80 of my book: "We call quantities mutually independent of
each other, when given the values of any of them, the remaining ones
cannot be determined, or even bounded approximately."

> Thanks for the links. The reason for singling out these three, of
> course, is that they have already been discussed on this list. If
> anybody wants to point out any others in particular, that would be
> great.

Understand. The UAI community used to be an interesting one, though in
recent years it has been too much dominated by the Bayesians, who
assume they already get the big picture right, and all the remaining
issues are in the details. For discussions on the fundamental
properties of uncertain reasoning, I recommend the works of Henry
Kyburg and Susan Haack.

Pei

> --Abram
>
> On Wed, Sep 17, 2008 at 3:54 PM, Pei Wang <[EMAIL PROTECTED]> wrote:
>> On Wed, Sep 17, 2008 at 1:46 PM, Abram Demski <[EMAIL PROTECTED]> wrote:
>>> Hi everyone,
>>>
>>> Most people on this list should know about at least 3 uncertain logics
>>> claiming to be AGI-grade (or close):
>>>
>>> --Pie Wang's NARS
>>
>> Yes, I heard of this guy a few times, who happens to use the same name
>> for his project as mine. ;-)
>>
>>> Here is my list:
>>>
>>> 1. Well-defined uncertainty semantics (either probability theory or a
>>> well-argued alternative)
>>
>> Agree, and I'm glad that you mentioned this item first.
>>
>>> 2. Good at quick-and-dirty reasoning when needed
>>> --a. Makes unwarranted independence assumptions
>>> --b. Collapses probability distributions down to the most probable
>>> item when necessary for fast reasoning
>>> --c. Uses the maximum entropy distribution when it doesn't have time
>>> to calculate the true distribution
>>> --d. Learns simple conditional models (like 1st-order markov models)
>>> for use later when full models are too complicated to quickly use
>>
>> As you admitted in the following, the language is biased. Using
>> theory-neutral language, I'd say the requirement is "to derive
>> conclusions with available knowledge and resources only", which sounds
>> much better than "quick-and-dirty" to me.
>>
>>> 3. Capable of "repairing" initial conclusions based on the bad models
>>> through further reasoning
>>> --a. Should have a good way of representing the special sort of
>>> uncertainty that results from the methods above
>>> --b. Should have a "repair" algorithm based on that higher-order uncertainty
>>
>> As soon as you don't assume there is a "model", this item and the
>> above one become similar, which are what I called "revision" and
>> "inference", respectively, in
>> http://www.cogsci.indiana.edu/pub/wang.uncertainties.ps
>>
>>> The 3 logics mentioned above vary in how well they address these
>>> issues, of course, but they are all essentially descended from NARS.
>>> My impression is that as a result they are strong in (2a) and (3b) at
>>> least, but I am not sure about the rest. (Of course, it is hard to
>>> evaluate NARS on most of the points in #2 since I stated them in the
>>> language of probability theory. And, opinions will differ on (1).)
>>>
>>> Anyone else have lists? Or thoughts?
>>
>> If you consider approaches with various scope and maturity, there

Re: [agi] uncertain logic criteria

2008-09-17 Thread Pei Wang
On Wed, Sep 17, 2008 at 1:46 PM, Abram Demski <[EMAIL PROTECTED]> wrote:
> Hi everyone,
>
> Most people on this list should know about at least 3 uncertain logics
> claiming to be AGI-grade (or close):
>
> --Pie Wang's NARS

Yes, I heard of this guy a few times, who happens to use the same name
for his project as mine. ;-)

> Here is my list:
>
> 1. Well-defined uncertainty semantics (either probability theory or a
> well-argued alternative)

Agree, and I'm glad that you mentioned this item first.

> 2. Good at quick-and-dirty reasoning when needed
> --a. Makes unwarranted independence assumptions
> --b. Collapses probability distributions down to the most probable
> item when necessary for fast reasoning
> --c. Uses the maximum entropy distribution when it doesn't have time
> to calculate the true distribution
> --d. Learns simple conditional models (like 1st-order markov models)
> for use later when full models are too complicated to quickly use

As you admitted in the following, the language is biased. Using
theory-neutral language, I'd say the requirement is "to derive
conclusions with available knowledge and resources only", which sounds
much better than "quick-and-dirty" to me.

> 3. Capable of "repairing" initial conclusions based on the bad models
> through further reasoning
> --a. Should have a good way of representing the special sort of
> uncertainty that results from the methods above
> --b. Should have a "repair" algorithm based on that higher-order uncertainty

As soon as you don't assume there is a "model", this item and the
above one become similar, which are what I called "revision" and
"inference", respectively, in
http://www.cogsci.indiana.edu/pub/wang.uncertainties.ps

> The 3 logics mentioned above vary in how well they address these
> issues, of course, but they are all essentially descended from NARS.
> My impression is that as a result they are strong in (2a) and (3b) at
> least, but I am not sure about the rest. (Of course, it is hard to
> evaluate NARS on most of the points in #2 since I stated them in the
> language of probability theory. And, opinions will differ on (1).)
>
> Anyone else have lists? Or thoughts?

If you consider approaches of various scope and maturity, there are
many more than these three, and I'm sure most of the people
working on them will claim that they are also "general purpose".
Interested people may want to browse http://www.auai.org/ and
http://www.elsevier.com/wps/find/journaldescription.cws_home/505787/description#description

Pei




Re: [agi] A model for RSI

2008-09-14 Thread Pei Wang
Matt,

Thanks for the paper. Some random comments:

*. "If RSI is possible, then it is critical that the initial goals of
the first iteration of agents (seed AI) are friendly to humans and
that the goals not drift through successive iterations."

As I commented on Ben's paper recently, here the implicit assumption
is that the initial goals fully determine the goal structure, which I
don't think is correct. If you think otherwise, you should argue for
it, or at least make it explicit.

*. "Turing [5] defined AI as the ability of a machine to fool a human
into believing it was another human."

No he didn't. Turing proposed the imitation game as a sufficient
condition for intelligence, and he made it clear that it may not be a
necessary condition by saying "May not machines carry out something
which ought to be described as thinking but which is very different
from what a man does? This objection is a very strong one, but at
least we can say that if, nevertheless, a machine can be constructed
to play the imitation game satisfactorily, we need not be troubled by
this objection."

*. "This would solve the general intelligence problem once and for
all, except for the fact that the strategy is not computable."

Not only that. Other exceptions include the situations where the
definition doesn't apply, such as in systems where goals change over
time, where no immediate and reliable reward signals are given, etc.,
not to mention the unrealistic assumption on infinite resources.

*. "AIXI has insufficient knowledge (none initially) ..."

But it assumes a reward signal, which contains sufficient knowledge to
evaluate behaviors. What if the reward signal is wrong?

*. "Hutter also proved that in the case of space bound l and time bound t ..."

That is not the same thing as "insufficient resources".

*. "We define a goal as a function G: N → R mapping natural numbers
... to real numbers."

I'm sure you can build systems with such a goal, though calling it a
"definition of goal" seems too strong --- are you claiming that all
the "goals" in the AGI context can be put into this format? On the
other hand, are all N → R functions goals? If not, how to distinguish
them?

*. "running P longer will eventually produce a better result and never
produce a worse result afterwards"

This is true for certain goals, though not for all. Some goals ask for
keeping some parameter (such as body temperature) at a certain value,
which cannot be covered by your definition using a monotonically
increasing function.
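
A small worked illustration of this point; the disturbance sequence, the
controller, and the scoring function are all invented. The goal value at
step n is the negative deviation from the setpoint, and even with a
sensible controller it moves up and down as the environment acts, so no
program can make it "never get worse" from some point on.

    // Illustration of a homeostatic goal: score(n) = -|temp(n) - setpoint|.
    // The disturbances and the simple proportional controller are invented.
    // The printed scores fluctuate, which is the point: such a goal does not
    // fit a definition built on eventually monotone improvement over time.
    public class HomeostasisSketch {
        public static void main(String[] args) {
            double setpoint = 37.0, temp = 37.0;
            java.util.Random disturbance = new java.util.Random(42);
            for (int n = 1; n <= 10; n++) {
                temp += disturbance.nextGaussian() * 0.5;   // environment acts
                temp += 0.5 * (setpoint - temp);            // controller reacts
                double score = -Math.abs(temp - setpoint);  // goal value at step n
                System.out.printf("n=%2d  score=%.3f%n", n, score);
            }
        }
    }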

 *. "Define an improving sequence with respect to G as an infinite
sequence of programs P1, P2, P3,... such that for all i > 0, Pi+1
improves on Pi with respect to goal G."

If this is what people mean by RSI, I don't think it can be designed
to happen --- it will either be impossible or only happen by
accident. All realistic learning and adaptation is tentative --- you
make a change with the belief that it will be a better strategy,
according to your experience, though you can never be absolutely sure,
because the future is different from the past. There is no guaranteed
improvement in an open system.

Pei

On Sat, Sep 13, 2008 at 11:39 PM, Matt Mahoney <[EMAIL PROTECTED]> wrote:
> I have produced a mathematical model for recursive self improvement and would 
> appreciate any comments before I publish this.
>
> http://www.mattmahoney.net/rsi.pdf
>
> In the paper, I try to give a sensible yet precise definition of what it 
> means for a program to have a goal. Then I describe infinite sequences of 
> programs that improve with respect to reaching a goal within fixed time 
> bounds, and finally I give an example (in C) of a program that outputs the 
> next program in this sequence. Although it is my long sought goal to prove or 
> disprove RSI, it doesn't entirely resolve the question because the rate of 
> knowledge gain is O(log n) and I prove that is the best you can do given 
> fixed goals.
>
> -- Matt Mahoney, [EMAIL PROTECTED]
>
>




Re: [agi] any advice

2008-09-09 Thread Pei Wang
IDSIA is AGI-related, though you need to love math and theoretical
computer science to work with Schmidhuber.

Don't know about Verona.

Pei

On Tue, Sep 9, 2008 at 8:27 AM, Valentina Poletti <[EMAIL PROTECTED]> wrote:
> I am applying for a research program and I have to chose between these two
> schools:
>
> Dalle Molle Institute of Artificial Intelligence
> University of Verona (Artificial Intelligence dept)
> 




Re: [agi] draft paper: a hybrid logic for AGI

2008-09-08 Thread Pei Wang
Sorry I don't have the time to type a detailed reply, but for your
second point, see the example in
http://www.cogsci.indiana.edu/pub/wang.fuzziness.ps , page 9, 4th
paragraph:

If these two types of uncertainty [randomness and fuzziness] are
different, why bother to treat them in an uniform way?
The basic reason is: in many practical problems, they are involved
with each other. Smets stressed
the importance of this issue, and provided some examples, in which
randomness and fuzziness are
encountered in the same sentence ([20]). It is also true for
inferences. Let's take medical diagnosis
as an example. When a doctor want to determine whether a patient A is
suffering from disease D,
(at least) two types of information need to be taken into account: (1)
whether A has D's symptoms,
and (2) whether D is a common illness. Here (1) is evaluated by
comparing A's symptoms with D's
typical symptoms, so the result is usually fuzzy, and (2) is
determined by previous statistics. After
the total certainty of "A is suffering from D" is evaluated, it should
be combined with the certainty
of  "T is a proper treatment to D" (which is usually a statistic
statement, too) to get the doctor's
"degree of belief" for "T should be applied to A". In such a situation
(which is the usual case,
rather than an exception), even if randomness and fuzziness can be
distinguished in the premises,
they are mixed in the middle and final conclusions.

Pei

On Mon, Sep 8, 2008 at 3:55 PM, YKY (Yan King Yin)
<[EMAIL PROTECTED]> wrote:
> A somewhat revised version of my paper is at:
> http://www.geocities.com/genericai/AGI-ch4-logic-9Sep2008.pdf
> (sorry it is now a book chapter and the bookmarks are lost when extracting)
>
> On Tue, Sep 2, 2008 at 7:05 PM, Pei Wang <[EMAIL PROTECTED]> wrote:
>>>
>>>   I intend to use NARS confidence in a way compatible with
>>> probability...
>
>> I'm pretty sure it won't, as I argued in several publications, such as
>> http://nars.wang.googlepages.com/wang.confidence.pdf and the book.
>
> I understood your argument about defining the confidence c, and agree
> with it.  But I don't see why c cannot be used together with f (as
> *traditional* probability).
>
>> In summary, I don't think it is a good idea to mix B, P, and Z. As Ben
>> said, the key is semantics, that is, what is measured by your truth
>> values. I prefer a unified treatment than a hybrid, because the former
>> is semantically consistent, while the later isn't.
>
> My logic actually does *not* mix B, P, and Z.  They are kept
> orthogonal, and so the semantics can be very simple.  Your approach
> mixes fuzziness with probability which can result in ambiguity in some
> everyday examples:  eg, John tries to find a 0.9 pretty girl (degree)
> vs  Mary is 0.9 likely to be pretty (probability).  The difference is
> real, but subtle, and I agree that you can mix them but you must
> always acknowledge that the measure is mixed.
>
> Maybe you've mistaken what I'm trying to do, 'cause my theory should
> not be semantically confusing...
>
> YKY
>
>




Re: Language modeling (was Re: [agi] draft for comment)

2008-09-06 Thread Pei Wang
I won't argue against your  "preference test" here, since this is a
big topic, and I've already made my position clear in the papers I
mentioned.

As for "compression", yes every intelligent system needs to 'compress'
its experience in the sense of "keeping the essence but using less
space". However, it is clearly not loseless. It is even not what we
usually call "loosy compression", because what to keep and in what
form is highly context-sensitive. Consequently, this process is not
reversible --- no decompression, though the result can be applied in
various ways. Therefore I prefer not to call it compression to avoid
confusing this process with the technical sense of "compression",
which is reversible, at least approximately.

Legg and Hutter's "universal intelligence" definition is way too
narrow to cover various attempts towards AI, even as an idealization.
Therefore, I don't take it as a goal to aim at and approach as closely
as possible. However, as I said before, I'd rather leave this
topic for the future, when I have enough time to give it a fair
treatment.

Pei

On Sat, Sep 6, 2008 at 4:29 PM, Matt Mahoney <[EMAIL PROTECTED]> wrote:
> --- On Fri, 9/5/08, Pei Wang <[EMAIL PROTECTED]> wrote:
>
>> Thanks for taking the time to explain your ideas in detail.
>> As I said,
>> our different opinions on how to do AI come from our very
>> different
>> understanding of "intelligence". I don't take
>> "passing Turing Test" as
>> my research goal (as explained in
>> http://nars.wang.googlepages.com/wang.logic_intelligence.pdf
>> and
>> http://nars.wang.googlepages.com/wang.AI_Definitions.pdf).
>> I disagree
>> with Hutter's approach, not because his SOLUTION is not
>> computable,
>> but because his PROBLEM is too idealized and simplified to
>> be relevant
>> to the actual problems of AI.
>
> I don't advocate the Turing test as the ideal test of intelligence. Turing 
> himself was aware of the problem when he gave an example of a computer 
> answering an arithmetic problem incorrectly in his famous 1950 paper:
>
> Q: Please write me a sonnet on the subject of the Forth Bridge.
> A: Count me out on this one. I never could write poetry.
> Q: Add 34957 to 70764.
> A: (Pause about 30 seconds and then give as answer) 105621.
> Q: Do you play chess?
> A: Yes.
> Q: I have K at my K1, and no other pieces.  You have only K at K6 and R at 
> R1.  It is your move.  What do you play?
> A: (After a pause of 15 seconds) R-R8 mate.
>
> I prefer a "preference test", which a machine passes if you prefer to talk to 
> it over a human. Such a machine would be too fast and make too few errors to 
> pass a Turing test. For example, if you had to add two large numbers, I think 
> you would prefer to use a calculator than ask someone. You could, I suppose, 
> measure intelligence as the fraction of questions for which the machine gives 
> the preferred answer, which would be 1/4 in Turing's example.
>
> If you know the probability distribution P of text, and therefore know the 
> distribution P(A|Q) for any question Q and answer A, then to pass the Turing 
> test you would randomly choose answers from this distribution. But to pass 
> the preference test for all Q, you would choose A that maximizes P(A|Q) 
> because the most probable answer is usually the correct one. Text compression 
> measures progress toward either test.
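
A toy sketch of the contrast being drawn here; the answer distribution
standing in for P(A|Q) is invented (the question is Turing's addition
example), and the two methods differ only in whether they sample from the
distribution or take its mode.

    import java.util.*;

    // Toy sketch for Turing's question "Add 34957 to 70764". The distribution
    // standing in for P(A|Q) is invented. A Turing-test style responder samples
    // from it (reproducing human variability, including errors); a
    // preference-test style responder returns the most probable answer,
    // here the correct sum 105721.
    public class AnswerPolicySketch {
        static final Map<String, Double> pAnswerGivenQ = Map.of(
            "105721", 0.70,   // correct sum
            "105621", 0.20,   // the slip in Turing's dialogue
            "105521", 0.10
        );

        static String sample(Random rng) {
            double r = rng.nextDouble(), cum = 0;
            for (Map.Entry<String, Double> e : pAnswerGivenQ.entrySet()) {
                cum += e.getValue();
                if (r < cum) return e.getKey();
            }
            return "105721";  // guard against rounding; probabilities sum to 1
        }

        static String mostProbable() {
            return Collections.max(pAnswerGivenQ.entrySet(),
                    Map.Entry.comparingByValue()).getKey();
        }

        public static void main(String[] args) {
            System.out.println("sampled (Turing test):    " + sample(new Random(1)));
            System.out.println("argmax (preference test): " + mostProbable());
        }
    }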
>
> I believe that compression measures your definition of intelligence, i.e. 
> adaptation given insufficient knowledge and resources. In my benchmark, there 
> are two parts: the size of the decompression program, which measures the 
> initial knowledge, and the compressed size, which measures prediction errors 
> that occur as the system adapts. Programs must also meet practical time and 
> memory constraints to be listed in most benchmarks.
>
> Compression is also consistent with Legg and Hutter's universal intelligence, 
> i.e. expected reward of an AIXI universal agent in an environment simulated 
> by a random program. Suppose you have a compression oracle that inputs any 
> string x and outputs the shortest program that outputs a string with prefix 
> x. Then this reduces the (uncomputable) AIXI problem to using the oracle to 
> guess which environment is consistent with the interaction so far, and 
> figuring out which future outputs by the agent will maximize reward.
>
> Of course universal intelligence is also not testable because it requires an 
> infinite number of environments. Instead, we have to choose a practical data 
> set. I use Wikipedia text, which has fewer errors than average text, bu

Re: Language modeling (was Re: [agi] draft for comment)

2008-09-05 Thread Pei Wang
Matt,

Thanks for taking the time to explain your ideas in detail. As I said,
our different opinions on how to do AI come from our very different
understanding of "intelligence". I don't take "passing Turing Test" as
my research goal (as explained in
http://nars.wang.googlepages.com/wang.logic_intelligence.pdf and
http://nars.wang.googlepages.com/wang.AI_Definitions.pdf).  I disagree
with Hutter's approach, not because his SOLUTION is not computable,
but because his PROBLEM is too idealized and simplified to be relevant
to the actual problems of AI.

Even so, I'm glad that we can still agree on some things, such as that
semantics comes before syntax. In my plan for NLP, there won't be
separate 'parsing' and 'semantic mapping' stages. I'll say more when I
have concrete results to share.

Pei

On Fri, Sep 5, 2008 at 8:39 PM, Matt Mahoney <[EMAIL PROTECTED]> wrote:
> --- On Fri, 9/5/08, Pei Wang <[EMAIL PROTECTED]> wrote:
>
>> As with many existing AI works, my disagreement with you is not so
>> much about the solution you proposed (I can see the value), but about
>> the problem you specified as the goal of AI. For example, I have no
>> doubt about the theoretical and practical values of compression, but
>> I don't think it has much to do with intelligence.
>
> In http://cs.fit.edu/~mmahoney/compression/rationale.html I explain why text 
> compression is an AI problem. To summarize, if you know the probability 
> distribution of text, then you can compute P(A|Q) for any question Q and 
> answer A to pass the Turing test. Compression allows you to precisely measure 
> the accuracy of your estimate of P. Compression (actually, word perplexity) 
> has been used since the early 1990s to measure the quality of language 
> models for speech recognition, since it correlates well with word error rate.
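
As an illustration of how compression scores the estimate of P, here is a toy
character-level example: the ideal code length under a model is the sum of
-log2 P(symbol | context), and perplexity is 2 raised to the per-symbol
average (the unigram model below is purely illustrative, not a real compressor):

    import math
    from collections import Counter

    def code_length_bits(text, model_prob):
        # Ideal compressed size in bits: an arithmetic coder approaches this
        # bound, so a smaller value means a more accurate model of P.
        return sum(-math.log2(model_prob(text[:i], text[i]))
                   for i in range(len(text)))

    def perplexity(text, model_prob):
        return 2 ** (code_length_bits(text, model_prob) / max(len(text), 1))

    # Toy order-0 model: letter frequencies with add-one smoothing.
    def make_unigram_model(text):
        counts = Counter(text)
        total = len(text) + len(counts)
        return lambda context, symbol: (counts[symbol] + 1) / total

    sample = "the quick brown fox jumps over the lazy dog"
    model = make_unigram_model(sample)
    print(code_length_bits(sample, model), perplexity(sample, model))
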
>
> The purpose of this work is not to solve general intelligence, such as the 
> universal intelligence proposed by Legg and Hutter [1]. That is not 
> computable, so you have to make a somewhat arbitrary choice of test 
> environments, that is, of which problems you are going to solve. I believe the goal 
> of AGI should be to do useful work for humans, so I am making a not so 
> arbitrary choice to solve a problem that is central to what most people 
> regard as useful intelligence.
>
> I had hoped that my work would lead to an elegant theory of AI, but that 
> hasn't been the case. Rather, the best compression programs were developed as 
> a series of thousands of hacks and tweaks, e.g. change a 4 to a 5 because it 
> gives 0.002% better compression on the benchmark. The result is an opaque 
> mess. I guess I should have seen it coming, since it is predicted by 
> information theory (e.g. [2]).
>
> Nevertheless the architectures of the best text compressors are consistent 
> with cognitive development models, i.e. phoneme (or letter) sequences -> 
> lexical -> semantics -> syntax, which are themselves consistent with layered 
> neural architectures. I already described a neural semantic model in my last 
> post. I also did work supporting Hutchens and Alder showing that lexical 
> models can be learned from n-gram statistics, consistent with the observation 
> that babies learn the rules for segmenting continuous speech before they 
> learn any words [3].
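
A minimal sketch in the spirit of that observation (not Hutchens and Alder's
actual algorithm): propose a word boundary wherever the character-level
transition probability drops below a threshold, using statistics gathered from
the unsegmented text itself.

    from collections import Counter

    def segment(text, threshold=0.25):
        # Toy segmenter: count character bigrams, then insert a boundary
        # after a character whenever P(next | current) is low. Real systems
        # use longer contexts, but the principle is the same.
        bigrams = Counter(zip(text, text[1:]))
        firsts = Counter(text[:-1])
        words, current = [], text[0]
        for a, b in zip(text, text[1:]):
            if bigrams[(a, b)] / firsts[a] < threshold:
                words.append(current)
                current = b
            else:
                current += b
        words.append(current)
        return words

    print(segment("thecatsawthedogthedogsawthecat"))
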
>
> I agree it should also be clear that semantics is learned before grammar, 
> contrary to the way artificial languages are processed. Grammar requires 
> semantics, but not the other way around. Search engines work using semantics 
> only. Yet we cannot parse sentences like "I ate pizza with Bob", "I ate pizza 
> with pepperoni", "I ate pizza with chopsticks", without semantics.
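
A small sketch of why these sentences need semantics: the syntax is identical
in all three, so the attachment of "with X" has to be decided from knowledge
about the words themselves, here approximated by made-up co-occurrence counts.

    # Hypothetical co-occurrence counts standing in for corpus statistics;
    # the numbers are invented for illustration only.
    cooccur = {
        ("ate", "Bob"): 40,        ("pizza", "Bob"): 1,
        ("ate", "pepperoni"): 3,   ("pizza", "pepperoni"): 80,
        ("ate", "chopsticks"): 50, ("pizza", "chopsticks"): 2,
    }

    def attach(verb, noun, pp_object):
        # Attach "with <pp_object>" to whichever head co-occurs with it more.
        v = cooccur.get((verb, pp_object), 0)
        n = cooccur.get((noun, pp_object), 0)
        return verb if v >= n else noun

    for x in ["Bob", "pepperoni", "chopsticks"]:
        print(f"I ate pizza with {x} -> attaches to '{attach('ate', 'pizza', x)}'")
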
>
> My benchmark does not prove that there aren't better language models, but it 
> is strong evidence. It represents the work of about 100 researchers who have 
> tried and failed to find more accurate, faster, or less memory intensive 
> models. The resource requirements seem to increase as we go up the chain from 
> n-grams to grammar, contrary to symbolic approaches. This is the basis of my 
> argument that AI is bound by a lack of hardware, not a lack of theory.
>
> 1. Legg, Shane, and Marcus Hutter (2006), A Formal Measure of Machine 
> Intelligence, Proc. Annual machine learning conference of Belgium and The 
> Netherlands (Benelearn-2006). Ghent, 2006.  
> http://www.vetta.org/documents/ui_benelearn.pdf
>
> 2. Legg, Shane, (2006), Is There an Elegant Universal Theory of Prediction?,  
> Technical Report IDSIA-12-06, IDSIA / USI-SUPSI, Dalle Molle Institute for 
> Artificial Intelligence, Galleria 2, 6928 Manno, Switzerland.
>

Re: Language modeling (was Re: [agi] draft for comment)

2008-09-05 Thread Pei Wang
On Fri, Sep 5, 2008 at 6:15 PM, Matt Mahoney <[EMAIL PROTECTED]> wrote:
> --- On Fri, 9/5/08, Pei Wang <[EMAIL PROTECTED]> wrote:
>
>> NARS indeed can learn semantics before syntax --- see
>> http://nars.wang.googlepages.com/wang.roadmap.pdf
>
> Yes, I see this corrects many of the problems with Cyc and with traditional 
> language models. I didn't see a description of a mechanism for learning new 
> terms in your other paper. Clearly this could be added, although I believe it 
> should be a statistical process.

I don't have a separate paper on term composition, so you'd have to
read my book. It is indeed a statistical process, in the sense that
most of the composed terms won't be useful, and so will be forgotten
gradually. Only the "useful patterns" will be kept for a long time in
the form of compound terms.

> I am interested in determining the computational cost of language modeling. 
> The evidence I have so far is that it is high. I believe the algorithmic 
> complexity of a model is 10^9 bits. This is consistent with Turing's 1950 
> prediction that AI would require this much memory, with Landauer's estimate 
> of human long term memory, and is about how much language a person processes 
> by adulthood assuming an information content of 1 bit per character as 
> Shannon estimated in 1950. This is why I use a 1 GB data set in my 
> compression benchmark.

I see your point, though I think analyzing this problem in terms of
computational complexity is not the right way to go, because the
process does not follow a predetermined algorithm. Instead, language
learning is an incremental process, without a well-defined beginning
or end.

> However there is a 3 way tradeoff between CPU speed, memory, and model 
> accuracy (as measured by compression ratio). I added two graphs to my 
> benchmark at http://cs.fit.edu/~mmahoney/compression/text.html (below the 
> main table) which shows this clearly. In particular the size-memory tradeoff 
> is an almost perfectly straight line (with memory on a log scale) over tests 
> of 104 compressors. These tests suggest to me that CPU and memory are indeed 
> bottlenecks to language modeling. The best models in my tests use simple 
> semantic and grammatical models, well below adult human level. The 3 top 
> programs on the memory graph map words to tokens using dictionaries that 
> group semantically and syntactically related words together, but only one 
> (paq8hp12any) uses a semantic space of more than one dimension. All have 
> large vocabularies, although not implausibly large for an educated person. 
> Other top programs like nanozipltcb and WinRK use smaller dictionaries and
>  strictly lexical models. Lesser programs model only at the n-gram level.
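
The straight-line relationship can be checked with an ordinary least-squares
fit of compressed size against log2(memory); the data points below are
placeholders, not readings from the benchmark table.

    import math

    # Placeholder (memory_MB, compressed_size_MB) pairs, invented for illustration.
    points = [(16, 260), (64, 240), (256, 220), (1024, 200), (4096, 180)]

    xs = [math.log2(m) for m, _ in points]   # memory on a log scale
    ys = [s for _, s in points]
    n = len(points)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
            sum((x - mx) ** 2 for x in xs)
    intercept = my - slope * mx
    print(f"size_MB ~ {slope:.1f} * log2(memory_MB) + {intercept:.1f}")
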

As with many existing AI works, my disagreement with you is not so
much about the solution you proposed (I can see the value), but about
the problem you specified as the goal of AI. For example, I have no
doubt about the theoretical and practical values of compression, but I
don't think it has much to do with intelligence. I don't think this
kind of issue can be handled efficiently in an email discussion like
this one. I've been thinking about writing a paper to compare my ideas
with the ideas represented by AIXI, which is closely related to yours,
though that project hasn't had enough priority on my to-do list.
Hopefully I'll find the time to make myself clear on this topic.

> I don't yet have an answer to my question, but I believe efficient 
> human-level NLP will require hundreds of GB or perhaps 1 TB of memory. The 
> slowest programs are already faster than real time, given that equivalent 
> learning in humans would take over a decade. I think you could use existing 
> hardware in a speed-memory tradeoff to get real time NLP, but it would not be 
> practical for doing experiments where each source code change requires 
> training the model from scratch. Model development typically requires 
> thousands of tests.

I guess we are exploring very different paths in NLP, and it is still
too early to tell which one will do better.

Pei




Re: Language modeling (was Re: [agi] draft for comment)

2008-09-05 Thread Pei Wang
On Fri, Sep 5, 2008 at 11:15 AM, Matt Mahoney <[EMAIL PROTECTED]> wrote:
> --- On Thu, 9/4/08, Pei Wang <[EMAIL PROTECTED]> wrote:
>
>> I guess you still see NARS as using model-theoretic
>> semantics, so you
>> call it "symbolic" and contrast it with system
>> with sensors. This is
>> not correct --- see
>> http://nars.wang.googlepages.com/wang.semantics.pdf and
>> http://nars.wang.googlepages.com/wang.AI_Misconceptions.pdf
>
> I mean NARS is symbolic in the sense that you write statements in Narsese 
> like "raven -> bird <0.97, 0.92>" (probability=0.97, confidence=0.92). I 
> realize that the meanings of "raven" and "bird" are determined by their 
> relations to other symbols in the knowledge base and that the probability and 
> confidence change with experience. But in practice you are still going to 
> write statements like this because it is the easiest way to build the 
> knowledge base.

Yes.

> You aren't going to specify the brightness of millions of pixels in a vision 
> system in Narsese, and there is no mechanism I am aware of to collect this 
> knowledge from a natural language text corpus.

Of course not. To have visual experience, there must be a device to
convert visual signals into an internal representation in Narsese. I
never suggested otherwise.

> There is no mechanism to add new symbols to the knowledge base through 
> experience. You have to explicitly add them.

"New symbols" either come from the outside in experience (experience
can be verbal), or composed by the concept-formation rules from
existing ones. The latter case is explained in my book.

> Natural language has evolved to be learnable on a massively parallel network 
> of slow computing elements. This should be apparent when we compare 
> successful language models with unsuccessful ones. Artificial language models 
> usually consist of tokenization, parsing, and semantic analysis phases. This 
> does not work on natural language because artificial languages have precise 
> specifications and natural languages do not.

It depends on which aspect of the language you are talking about.
Narsese has "precise specifications" in its syntax, but the meaning of
the terms is a function of experience, and changes from time to time.

> No two humans use exactly the same language, nor does the same human at two 
> points in time. Rather, language is learnable by example, so that each 
> message causes the language of the receiver to be a little more like that of 
> the sender.

Same thing in NARS --- if two implementations of NARS have different
experience, they will disagree on the meaning of a term. When they
begin to learn natural language, the same will be true for grammar.
Since I haven't done any concrete NLP work yet, I don't expect you to
believe me on the second point, but you cannot rule out that
possibility just because no traditional system can do it.

> Children learn semantics before syntax, which is the opposite order from 
> which you would write an artificial language interpreter.

NARS indeed can learn semantics before syntax --- see
http://nars.wang.googlepages.com/wang.roadmap.pdf

I won't comment on the following detailed statements, since I agree
with your criticism of the traditional processing of formal languages,
but that is not how NARS handles language. Don't think of NARS as
another Cyc just because both use a "formal language". The same
"ravens are birds" is treated very differently in the two systems.

Pei


> An example of a successful language model is a search engine. We know that 
> most of the meaning of a text document depends only on the words it contains, 
> ignoring word order. A search engine matches the semantics of the query with 
> the semantics of a document mostly by matching words, but also by matching 
> semantically related words like "water" to "wet".
>
> Here is an example of a computationally intensive but biologically plausible 
> language model. A semantic model is a word-word matrix A such that A_ij is 
> the degree to which words i and j are related, which you can think of as the 
> probability of finding i and j together in a sliding window over a huge text 
> corpus. However, semantic relatedness is a fuzzy identity relation, meaning 
> it is reflexive, symmetric, and transitive. If i is related to j and j to 
> k, then i is related to k. Deriving transitive relations in A, also known as 
> latent semantic analysis, is performed by singular value decomposition, 
> factoring A = USV where S is diagonal, then discarding the small terms of S, 
> which has the effect of lossy compression. Typically, A has about 10^6 
> elements and we keep only a few hundred elem
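
The message is cut off here in the archive. For the procedure it describes up
to this point, a minimal sketch of the SVD step on a tiny, made-up
co-occurrence matrix (the vocabulary, counts, and number of retained
dimensions are all toy choices):

    import numpy as np

    # Toy word-word co-occurrence matrix A over a four-word vocabulary.
    words = ["water", "wet", "dry", "fire"]
    A = np.array([[0., 3., 1., 0.],
                  [3., 0., 0., 0.],
                  [1., 0., 0., 2.],
                  [0., 0., 2., 0.]])

    # Factor A = U S V^T, discard all but the k largest singular values
    # (lossy compression), and reconstruct. Entries that were zero in A can
    # become nonzero, capturing indirect ("transitive") relatedness.
    U, S, Vt = np.linalg.svd(A)
    k = 2
    A_k = U[:, :k] @ np.diag(S[:k]) @ Vt[:k, :]

    def relatedness(w1, w2):
        return A_k[words.index(w1), words.index(w2)]

    print(relatedness("water", "wet"), relatedness("wet", "dry"))
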

Re: [agi] open models, closed models, priors

2008-09-05 Thread Pei Wang
On Thu, Sep 4, 2008 at 11:17 PM, Abram Demski <[EMAIL PROTECTED]> wrote:
> Pei,
>
> I sympathize with your care in wording, because I'm very aware of the
> strange meaning that the word "model" takes on in formal accounts of
> semantics. While a cognitive scientist might talk about a person's
> "model of the world", a logician would say that the world is "a model
> of a first-order theory". I do want to avoid the second meaning. But,
> I don't think I could fare well by saying "system" instead, because
> the models are only a part of the larger system... so I'm not sure
> there is a word that is both neutral and sufficiently meaningful.

Yes, the first usage of "model" is less evil than the second, though
it still carries the sense of "representing the world as it is" and
"building a one-to-one mapping between the symbols and the objects".
As I write in the draft, it is better to take knowledge as "a
representation of the experience of the system, after summarization
and organization."

> Do you think it is impossible to apply probability to open
> models/theories/systems, or merely undesirable?

Well, "to apply probability" can be done in many ways. What I have
argued (e.g., in
http://nars.wang.googlepages.com/wang.bayesianism.pdf) is that if a
system is open to new information and works in real time, it is
practically impossible to maintain a (consistent) probability
distribution among its beliefs --- incremental revision is not
supported by the theory, and re-building the distribution from raw
data is not affordable. It only works on toy problems and cannot scale
up.

Pei
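
One back-of-the-envelope way to see the resource point in the last paragraph
(a sketch of the scaling, not the full argument in the paper): an explicit
joint distribution over n binary propositions has 2^n entries, so every newly
encountered proposition doubles the table that would have to be rebuilt to
stay globally consistent.

    # Entries in a full joint distribution over n binary propositions.
    for n in [10, 20, 30, 40]:
        print(f"{n} propositions -> {2 ** n:,} joint entries")
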




Re: [agi] open models, closed models, priors

2008-09-04 Thread Pei Wang
Abram,

I agree with the spirit of your post, and I even go further to include
"being open" in my working definition of intelligence --- see
http://nars.wang.googlepages.com/wang.logic_intelligence.pdf

I also agree with your comment on Solomonoff induction and Bayesian prior.

However, I talk about "open system", not "open model", because I think
model-theoretic semantics is the wrong theory to be used here --- see
http://nars.wang.googlepages.com/wang.semantics.pdf

Pei

On Thu, Sep 4, 2008 at 2:19 PM, Abram Demski <[EMAIL PROTECTED]> wrote:
> A closed model is one that is interpreted as representing all truths
> about that which is modeled. An open model is instead interpreted as
> making a specific set of assertions, and leaving the rest undecided.
> Formally, we might say that a closed model is interpreted to include
> all of the truths, so that any other statements are false. This is
> also known as the closed-world assumption.
>
> A typical example of an open model is a set of statements in predicate
> logic. This could be changed to a closed model simply by applying the
> closed-world assumption. A possibly more typical example of a
> closed-world model is a computer program that outputs the data so far
> (and predicts specific future output), as in Solomonoff induction.
>
> These two types of model are very different! One important difference
> is that we can simply *add* to an open model if we need to account for
> new data, while we must always *modify* a closed model if we want to
> account for more information.
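
A minimal sketch of the two interpretations, assuming a knowledge base given
as a set of ground facts (the facts themselves are made up):

    facts = {("bird", "tweety"), ("bird", "woody")}

    def closed_world_query(fact):
        # Closed model: whatever is not asserted is taken to be false.
        return fact in facts

    def open_world_query(fact):
        # Open model: whatever is not asserted is left undecided.
        return True if fact in facts else "unknown"

    print(closed_world_query(("bird", "polly")))  # False
    print(open_world_query(("bird", "polly")))    # "unknown"

    # New information is simply *added* to the open model; the closed model's
    # earlier answer about polly silently flips, so it has to be revised.
    facts.add(("bird", "polly"))
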
>
> The key difference I want to ask about here is: a length-based
> bayesian prior seems to apply well to closed models, but not so well
> to open models.
>
> First, such priors are generally supposed to apply to entire joint
> states; in other words, probability theory itself (and in particular
> bayesian learning) is built with an assumption of an underlying space
> of closed models, not open ones.
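
As a sketch of what such a prior presupposes: weighting each candidate model
by 2^-length and normalizing only makes sense over a fixed, enumerable space
of models, which is essentially the closed-space assumption (the lengths
below are hypothetical).

    # Hypothetical description lengths, in bits, for three candidate models.
    lengths = {"m1": 10, "m2": 12, "m3": 15}
    weights = {m: 2.0 ** -l for m, l in lengths.items()}
    z = sum(weights.values())
    prior = {m: w / z for m, w in weights.items()}
    print(prior)
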
>
> Second, an open model always has room for additional stuff somewhere
> else in the universe, unobserved by the agent. This suggests that,
> made probabilistic, open models would generally predict universes with
> infinite description length. Whatever information was known, there
> would be an infinite number of chances for other unknown things to be
> out there; so it seems as if the probability of *something* more being
> there would converge to 1. (This is not, however, mathematically
> necessary.) If so, then taking that other thing into account, the same
> argument would still suggest something *else* was out there, and so
> on; in other words, a probabilistic open-model-learner would seem to
> predict a universe with an infinite description length. This does not
> make it easy to apply the description length principle.
>
> I am not arguing that open models are a necessity for AI, but I am
> curious if anyone has ideas of how to handle this. I know that Pei
> Wang suggests abandoning standard probability in order to learn open
> models, for example.
>
> --Abram Demski
>
>



