Re: [agi] A probabilistic/algorithmic puzzle...

2003-02-22 Thread Bill Hibbard
Hi Brad,

> > Yes, it's amazing what even simple animal brains can do with
> > simple learning problems, when rewards quickly follow
> > behaviors. The forebrain evolved to solve the hard learning
> > problems, when there are long delays between behaviors and
> > rewards, and multiple behaviors precede rewards. To solve
> > this 'credit assignment problem' it needs a model of how
> > the world works. The forebrain seems stupid unless we award
> > it points for difficulty.
>
> I thought about that too, but then I thought of games like slot machines, craps and 
> roulette.  In these cases, reward follows commission very quickly, and yet we're 
> still hopelessly bad at stopping.
>
> Obviously a time delay has a big impact on credit assignment, but even some games 
> with short time scales are outside of our rationality agents' range of analysis.  I 
> think it's an issue of task complexity (very odd given that roulette's probabilities 
> are so trivial).

Yes, reward comes quickly in casino games, but it takes
a long period of observation to see that the average reward
is negative. And of course most people bet modestly
or not at all. A small percentage become addicted to
the emotions of win rewards. I think these people are
a lot like drug addicts, failing to balance long-term
and short-term rewards.

It would be an interesting experiment to create a lever
game equivalent to blackjack, playing perhaps several
hands per second and with the long-run win/loss record
expressed in a lever angle, so that players would
feel the long-run averages over short time periods.
Perhaps that would move the game from the forebrain
to simple motor learning.

In any case, I am quite sure about the role of conscious
reasoning in providing a world simulation model for
difficult learning problems. Chess is a good illustration
of that. But I agree it is interesting how simple casino
games can befuddle some people's forebrains.

Cheers,
Bill

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/[EMAIL PROTECTED]


Re: [agi] A probabilistic/algorithmic puzzle...

2003-02-21 Thread Brad Wyble

> 
> Yes, it's amazing what even simple animal brains can do with
> simple learning problems, when rewards quickly follow
> behaviors. The forebrain evolved to solve the hard learning
> problems, when there are long delays between behaviors and
> rewards, and multiple behaviors precede rewards. To solve
> this 'credit assignment problem' it needs a model of how
> the world works. The forebrain seems stupid unless we award
> it points for difficulty.



I thought about that too, but then I thought of games like slot machines, craps and 
roulette.  In these cases, reward follows commission very quickly, and yet we're still 
hopelessly bad at stopping.  

Obviously a time delay has a big impact on credit assignment, but even some games with 
short time scales are outside of our rationality agents' range of analysis.  I think 
it's an issue of task complexity (very odd given that roulette's probabilities are so 
trivial).   


"Strange game, the only winning move is... not to play"
-WarGames



Re: [agi] A probabilistic/algorithmic puzzle...

2003-02-21 Thread Bill Hibbard
On Fri, 21 Feb 2003, Brad Wyble wrote:

> . . .
> Interestingly, there are some primitive parts of our brain that are better at logic 
> and are more rational than our executive function.  Animals (and humans) in a 
> classical conditioning paradigm are *excellent* at performing simple behaviors in a 
> way that maximizes reward.  We can determine the proper ratio of performance on a 
> two lever task without even being consciously aware of the contingencies.  Rats can 
> do this too.  In fact, sometimes our advanced forebrain gets in the way of our more 
> primitive structures trying to do what they do best.  This is probably why people 
> gamble and play the lottery.  I would guess that the payoff matrices for all forms 
> of casino gambling are too subtle and complicated for our primitive rationality 
> agents to comprehend, and so the stupid forebrain gets to have its way.

Yes, it's amazing what even simple animal brains can do with
simple learning problems, when rewards quickly follow
behaviors. The forebrain evolved to solve the hard learning
problems, when there are long delays between behaviors and
rewards, and multiple behaviors precede rewards. To solve
this 'credit assignment problem' it needs a model of how
the world works. The forebrain seems stupid unless we award
it points for difficulty.

As you point out sometimes we can see that a learning
problem with delayed reward is actually equivalent to an
easier problem with immediate reward, but the forebrain
is still stuck with its slow but general simulation.

Cheers,
Bill



RE: Re: Re: [agi] A probabilistic/algorithmic puzzle...

2003-02-21 Thread Ben Goertzel

Brad said, responding to Moshe:
> > We have insufficient knowledge, so we need to make some assumptions to
> > approximate P(Xi|Xj).  I argue that under these circumstances, the best
> > assumption to make is that Xi and Xj are independent (i.e.,
> P(Xi|Xj)=P(Xi)).
> > Does this clarify things?
>
>
> You are basically saying, for each unknown P(Xi|Xj), assume it
> equals P(Xi).
>


One can do better than such a simplistic form of independence assumption
though...

For instance we have

P(C|A) = P(B|A)*P(C|B) + (1 - P(B|A)) * (P(C) - P(B)*P(C|B)) / (1 - P(B))

if we assume that

* (A intersect B) and (C intersect B) are independent [the first term]
* (A intersect ~B) and (C intersect ~B) are independent [the second term]

So if you know P(B|A) and P(C|B) then you can guess P(C|A) if you're willing
to assume A and C are independent in B and in ~B, but not universally
independent.

This is basically the PTL "probabilistic deduction rule."
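A minimal numeric sketch of this rule in Python (the function name and the sample numbers are mine, not from PTL):

```python
def ptl_deduction(pBA, pCB, pB, pC):
    """Estimate P(C|A) from P(B|A), P(C|B), P(B), P(C), assuming A and C
    are independent within B and within ~B (but not universally)."""
    if not 0 < pB < 1:
        raise ValueError("P(B) must lie strictly between 0 and 1")
    # P(C|~B) recovered from the marginals: (P(C) - P(B)*P(C|B)) / (1 - P(B))
    pC_given_notB = (pC - pB * pCB) / (1 - pB)
    return pBA * pCB + (1 - pBA) * pC_given_notB

# With P(B|A)=0.8, P(C|B)=0.7, P(B)=0.4, P(C)=0.5:
# P(C|~B) = (0.5 - 0.28)/0.6 ≈ 0.367, so P(C|A) ≈ 0.8*0.7 + 0.2*0.367 ≈ 0.633
```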

So one reasonable heuristic inference strategy is to prefer trains of
inference where this kind of "localized independence" can most plausibly be
assumed at each step along the way.

But even so, after a lot of inference steps, these independence assumptions
can sometimes (not always) lead to substantial error.

As noted, the human brain incurs substantial error when doing this sort of
reasoning, and this could be because it employs similar heuristic
independence assumptions, which can cause problems when applied iteratively
in the large scale.

-- Ben G



Re: Re: Re: [agi] A probabilistic/algorithmic puzzle...

2003-02-21 Thread Brad Wyble

> 
> Brad wrote:
> > I think this is a core principle of AGI design and that a system that
> > only makes inferences it *knows* are correct would be fairly
> > uninteresting and incapable of performing in the real world.  The fact
> > that the information in the P(xi|xj) list is very incomplete is what
> > makes the problem interesting.
> > 
> > Or maybe I'm misinterpreting your intent.
> >
> I agree perfectly with your "core principle", and my proposal was not to
> only make inferences that you know are correct. I think you may be
> misinterpreting: let's say that we know P(Xi), and want to guess at P(Xi|Xj).
> We have insufficient knowledge, so we need to make some assumptions to
> approximate P(Xi|Xj).  I argue that under these circumstances, the best
> assumption to make is that Xi and Xj are independent (i.e., P(Xi|Xj)=P(Xi)).
> Does this clarify things?


You are basically saying, for each unknown P(Xi|Xj), assume it equals P(Xi).  

I think this conservative approach, while well grounded in rationality, doesn't really 
allow for the existence of useful and interesting inference.   An AGI has to tolerate, 
and work with, large degrees of uncertainty.  This includes assuming dependencies 
without sufficient evidence.  I can say that in the biological sciences, one has to do 
this constantly.  What separates the good scientists from the not-so-good is an 
ability to keep track of many low-confidence assumptions simultaneously, shake them up 
and see what theories fall out that violate the fewest of them.   

-Brad




> 
> Moshe
> 
> > 
> > 
> > -Brad
> > 



Re: [agi] A probabilistic/algorithmic puzzle...

2003-02-21 Thread Brad Wyble
> This is also an example of how weird the brain can be from an algorithmic
> perspective.  In designing an AI system, one tends to abstract cognitive
> processes and create specific processes based on these abstractions.  (And
> this is true in NN type AI architectures, not just logicist ones.)  But
> evolution is a hacker sometimes: often, rather than abstracting, it reuses
> stuff that was created for another purpose, providing hacky mappings to
> enable the reuse.  This is terrible software engineering practice, but
> evolution has a lot of computational resources to work with, and it does
> create a lot of buggy things ;)
> 

The study of historical constraints on evolution's design principles is fascinating.
I took a class with this guy: http://www.mcz.harvard.edu/Departments/Fish/kfl.htm, and
he focuses on very interesting problems within systems that would seem to be very
boring (the evolution of jaw structures in cichlids).

For example, consider hemoglobin, the current means of transporting oxygen in the
body.  There might be a better way to do it; in fact, it's almost certain that there
is.  But evolution would have a very hard time finding it, because we're already
heavily invested in the hemoglobin track.

The same thing applies to the brain, of course: evolution has invested a lot of effort
into developing sensory and motor facilities.  Logic and reason are crude hacks,
tacked on top of a system designed to do nothing of the sort.  It's like figuring out
how to attach a swimming pool to the space shuttle (and, miracle of miracles, it
somehow works, albeit crudely).

Small wonder that we are so terribly bad at logic.  

http://plus.maths.org/issue20/reviews/book1/


Interestingly, there are some primitive parts of our brain that are better at logic 
and are more rational than our executive function.  Animals (and humans) in a 
classical conditioning paradigm are *excellent* at performing simple behaviors in a 
way that maximizes reward.  We can determine the proper ratio of performance on a two 
lever task without even being consciously aware of the contingencies.  Rats can do 
this too.  In fact, sometimes our advanced forebrain gets in the way of our more 
primitive structures trying to do what they do best.  This is probably why people 
gamble and play the lottery.  I would guess that the payoff matrices for all forms of 
casino gambling are too subtle and complicated for our primitive rationality agents to 
comprehend, and so the stupid forebrain gets to have its way.

-brad



RE: Re: [agi] A probabilistic/algorithmic puzzle...

2003-02-21 Thread Ben Goertzel


> Brad wrote:
> > I think this is a core principle of AGI design and that a system that
> > only makes inferences it *knows* are correct would be fairly
> > uninteresting and incapable of performing in the real world.  The fact
> > that the information in the P(xi|xj) list is very incomplete is what
> > makes the problem interesting.
> >
> > Or maybe I'm misinterpreting your intent.
> >
> I agree perfectly with your "core principle", and my proposal was not to
> only make inferences that you know are correct. I think you may be
> misinterpreting: let's say that we know P(Xi), and want to guess
> at P(Xi|Xj).
> We have insufficient knowledge, so we need to make some assumptions to
> approximate P(Xi|Xj).  I argue that under these circumstances, the best
> assumption to make is that Xi and Xj are independent (i.e.,
> P(Xi|Xj)=P(Xi)).
> Does this clarify things?
>
> Moshe

Moshe: your approach is conceptually quite right

Empirically, however, making a bunch of iterated independence assumptions on
a large body of knowledge can rapidly lead to nonsense conclusions,
especially if one allows any conclusion-based premise correction (not really
needed in the rectangles test).

One lesson is that to have reliable inferences one wants to spend a LOT of
time seeking dependency information, because while each individual
independence assumption only adds a little error on average, the errors can
really pile up in an iterated inference context.
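To make this concrete, here is a contrived joint distribution (my own toy numbers, not from Novamente) in which A and C are perfectly correlated, yet a single localized independence assumption halves the estimate:

```python
# Contrived joint over binary A, B, C in which A == C always,
# so the true P(C|A) is 1.0.
joint = {
    (1, 1, 1): 0.3, (1, 0, 1): 0.1,
    (0, 1, 0): 0.2, (0, 0, 0): 0.4,
}

def marginal(pred):
    """Total probability of the cells where pred holds."""
    return sum(p for cell, p in joint.items() if pred(cell))

pA = marginal(lambda c: c[0] == 1)                    # 0.4
pB = marginal(lambda c: c[1] == 1)                    # 0.5
pC = marginal(lambda c: c[2] == 1)                    # 0.4
pB_given_A = marginal(lambda c: c[0] and c[1]) / pA   # 0.75
pC_given_B = marginal(lambda c: c[1] and c[2]) / pB   # 0.6

# Deduction under the localized independence assumption:
estimate = pB_given_A * pC_given_B + \
    (1 - pB_given_A) * (pC - pB * pC_given_B) / (1 - pB)
# estimate comes out to 0.5, while the true P(C|A) is 1.0
```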

-- Ben G




Re: Re: [agi] A probabilistic/algorithmic puzzle...

2003-02-21 Thread Moshe Looks
Brad wrote:
> I think this is a core principle of AGI design and that a system that
> only makes inferences it *knows* are correct would be fairly
> uninteresting and incapable of performing in the real world.  The fact
> that the information in the P(xi|xj) list is very incomplete is what
> makes the problem interesting.
> 
> Or maybe I'm misinterpreting your intent.
>
I agree perfectly with your "core principle", and my proposal was not to
only make inferences that you know are correct. I think you may be
misinterpreting: let's say that we know P(Xi), and want to guess at P(Xi|Xj).
We have insufficient knowledge, so we need to make some assumptions to
approximate P(Xi|Xj).  I argue that under these circumstances, the best
assumption to make is that Xi and Xj are independent (i.e., P(Xi|Xj)=P(Xi)).
Does this clarify things?

Moshe

> 
> 
> -Brad
> 




RE: Re: [agi] A probabilistic/algorithmic puzzle...

2003-02-21 Thread Ben Goertzel

> > Hi Ben,
> >
> > Thanks for the brain teaser!  As a sometimes believer in
> Occam's Razor, I
> > think it makes sense to assume that Xi and Xj are independent,
> unless we know
> > otherwise.  This simplifies things, and is the "rational" thing
> to do (for
> > some definition of rational ;->).  So why not construct a bayes
> net modeling
> > the distributions, with causal links only where you _know_ two
> variables are
> > dependent?  For reasoning about "orphan variables" (e.g., you
> know nothing
> > at all about Xi), just assume the average of all other
> distributions.  If
> > you have P(Xi|Xj), and want P(Xj|Xi), fudge something together
> with Bayes'
> > rule.  This isn't a complete solution, but it's how I would
> start... Is this
> > one of the things you've tried?
> >
> > Cheers,
> > Moshe

Well, the Novamente PTL (probabilistic term logic) module is not a Bayes
net, but it has some similarities.  It is based on elementary probability
theory, and its two main rules are

* inversion, which is just Bayes rule for guessing P(B|A) from P(A|B) (with
help from P(A) and P(B) )
* deduction, which is indeed based on an independence assumption, which goes
from P(A|B) and P(C|A) to P(C|B)

There are variants of the deduction rule to use when dependency information
is known; otherwise independence is assumed.
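A sketch of the inversion rule (this is just Bayes rule; the function name is mine):

```python
def invert(pAB, pA, pB):
    """Guess P(B|A) from P(A|B), P(A), P(B) via Bayes rule."""
    if pA <= 0:
        raise ValueError("P(A) must be positive")
    return pAB * pB / pA

# e.g. P(A|B)=0.4, P(A)=0.2, P(B)=0.15  ->  P(B|A) = 0.4*0.15/0.2 = 0.3
```

Note that if the three inputs are mutually inconsistent, the result can come out above 1, which is one way inconsistent premises reveal themselves.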

A major difference from a Bayes net however, is that PTL doesn't impose a
hierarchical structure like a Bayes net does.   In a realistically complex
web of probabilistic relationships, one is going to have heterarchical
interdependencies, and no hierarchy will capture what's going on.  Each
P(A|B) corresponds to an InheritanceLink in Novamente, but these links
don't need to form a hierarchy.

There is more similarity between PTL and "loopy Bayes nets", but those are
kinda hacky -- they involve using algorithms proved correct for hierarchies,
and just applying them to heterarchies and hoping for the best.  PTL is
designed to work with nonhierarchical probability webs...

-- Ben









RE: [agi] A probabilistic/algorithmic puzzle...

2003-02-21 Thread Ben Goertzel

> Lakoff and Nunez
> (http://perso.unifr.ch/rafael.nunez/reviews.html) have a theory
> that we compare lengths in our head to do arithmetic, when we're
> not using school-learned rules.  Our innate mathematical ability
> is based on visuo-spatial comparisons in their view.
>
> This would basically be #2, and to use this capability we need to
> get familiar enough with the problem that our mind translates the
> numbers involved into length.
>
>
>
> -Brad


yes!  This is exactly the sort of thing I was thinking of

We have pretty good "inference" capability in the perceptual and motor
domains, and it may be that one of our main (unconscious) strategies for
cognitive processing is to map cognitive problems into the intuitive form of
perceptual/motor problems.  And, as you say, doing this mapping is much
easier when the domain in question is familiar.  Translation of extents into
lengths so they can be reasoned on using length-handling circuitry is an
excellent example...

This is also an example of how weird the brain can be from an algorithmic
perspective.  In designing an AI system, one tends to abstract cognitive
processes and create specific processes based on these abstractions.  (And
this is true in NN type AI architectures, not just logicist ones.)  But
evolution is a hacker sometimes: often, rather than abstracting, it reuses
stuff that was created for another purpose, providing hacky mappings to
enable the reuse.  This is terrible software engineering practice, but
evolution has a lot of computational resources to work with, and it does
create a lot of buggy things ;)

ben



Re: Re: [agi] A probabilistic/algorithmic puzzle...

2003-02-21 Thread Brad Wyble
> 
> Hi Ben,
> 
> Thanks for the brain teaser!  As a sometimes believer in Occam's Razor, I
> think it makes sense to assume that Xi and Xj are independent, unless we know
> otherwise.  This simplifies things, and is the "rational" thing to do (for
> some definition of rational ;->).  So why not construct a bayes net modeling
> the distributions, with causal links only where you _know_ two variables are
> dependent?  For reasoning about "orphan variables" (e.g., you know nothing
> at all about Xi), just assume the average of all other distributions.  If
> you have P(Xi|Xj), and want P(Xj|Xi), fudge something together with Bayes'
> rule.  This isn't a complete solution, but it's how I would start... Is this
> one of the things you've tried?
> 
> Cheers,
> Moshe


As Pei Wang said:  Intelligence is the ability to work and adapt to the environment 
with insufficient knowledge and resources.

I think this is a core principle of AGI design and that a system that only makes 
inferences it *knows* are correct would be fairly uninteresting and incapable of 
performing in the real world.  The fact that the information in the P(xi|xj) list is 
very incomplete is what makes the problem interesting.

Or maybe I'm misinterpreting your intent.


-Brad




Re: [agi] A probabilistic/algorithmic puzzle...

2003-02-21 Thread Moshe Looks
Hi Ben,

Thanks for the brain teaser!  As a sometimes believer in Occam's Razor, I
think it makes sense to assume that Xi and Xj are independent, unless we know
otherwise.  This simplifies things, and is the "rational" thing to do (for
some definition of rational ;->).  So why not construct a bayes net modeling
the distributions, with causal links only where you _know_ two variables are
dependent?  For reasoning about "orphan variables" (e.g., you know nothing
at all about Xi), just assume the average of all other distributions.  If
you have P(Xi|Xj), and want P(Xj|Xi), fudge something together with Bayes'
rule.  This isn't a complete solution, but it's how I would start... Is this
one of the things you've tried?

Cheers,
Moshe
> 
> Hi,
> 
> This one is for the more mathematically/algorithmically inclined people
> on the list.
> 
> I'm going to present a mathematical problem that's come up in the
> Novamente development process.  We have two different solutions for it,
> each with strengths and weaknesses.  I'm curious if, perhaps, someone
> on this list will suggest an alternate approach.  (If not, at least the
> problem itself may stimulate somebody's mind ;)
> 
> I'll describe the problem here in a very simple form.  Actually, inside
> Novamente, this simple problem exists in many "transformed" variants
> and takes many different guises.  It is posed here in terms of simple
> conditional probabilities, but it also presents itself in other forms,
> involving n-ary relationships, complex procedures and predicates, etc.
> etc.
> 
> Without further ado
> 
> Let X_i, i=1,...,n, denote a set of discrete random variables (think of
> them as concepts or percepts)
> 
> Let's say we have a set of N << n^2 conditional probability
> relationships of the form
> 
> P(X_j|X_i)
> 
> where i, j are drawn from {1,...,n}.
> 
> Let's say we also have a set of M <= n probabilities
> 
> P(X_i)
> 
> The problem is:
> 
> * Infer the rest of the P(X_i|X_j) and P(X_i): the ones that aren't
> given
> 
> * Specifically, infer cases where P(X_i|X_j) differs significantly from
> P(X_i)
> 
> Clearly this is a massively "underdetermined" problem: the given data
> will generally not be enough to uniquely determine the results.  This
> is what makes it interesting!
> 
> As I said, we have two solutions for this, one implemented the other
> just designed; so we know the problem is approximately and
> heuristically solvable in a plausible computational timeframe.  But I
> wonder if there aren't radically different solutions from the ones
> we've come up with...
> 
> Any thoughts?
> 
> -- Ben G
> 





Re: [agi] A probabilistic/algorithmic puzzle...

2003-02-21 Thread Brad Wyble


> 
> 1) Humans use special-case algorithms to solve these problems, a different
> algorithm for each domain
> 
> 2) Humans have a generalized mental tool for solving these problems, but
> this tool can only be invoked when complemented by some domain-specific
> knowledge
> 
> My intuitive inclination is that the correct explanation is 2) not 1).  But
> of course, which explanation is correct for humans isn't all that relevant
> to AI work in the Novamente vein.


Lakoff and Nunez (http://perso.unifr.ch/rafael.nunez/reviews.html) have a theory that 
we compare lengths in our head to do arithmetic, when we're not using school-learned 
rules.  Our innate mathematical ability is based on visuo-spatial comparisons in their 
view.

This would basically be #2, and to use this capability we need to get familiar enough 
with the problem that our mind translates the numbers involved into length.



-Brad






RE: [agi] A probabilistic/algorithmic puzzle...

2003-02-21 Thread Ben Goertzel

> > The problem at hand is, you're given some absolute and
> > some conditional probabilities regarding the concepts
> > at hand, and you want to infer a bunch of others.
>
> Hmm. The thing I find interesting here is that humans don't have a good
> solution to this problem. Give a typical human a set of data like
> the above,
> and he'll just give you a blank look. Give him a specific problem
> and he'll
> do some first-order inference (e.g., Fluffy is more likely to be a
> cat's name
> than a dog's), but we rarely take it more than one step. Also, it seems to
> me that humans usually only look for the specific data required by a
> problem, rather than trying to figure out all the logical
> consequences of a
> set of data.
>
> This does not, of course, mean that you should give Novamente the
> ability to
> solve this kind of problem. But it does hint that what you're
> building is a
> different kind of mind than what humans have...
>
> Billy Brown

Yes, we are explicitly trying to build a different kind of mind than what
humans have.  Computers have a capability for precision that seems to vastly
exceed that of the human brain.  Exploiting this capability for precision in
an AI design seems appropriate.  Perhaps it can make up for the lack of the
massive parallelism that the human brain possesses.

As for humans' abilities at probabilistic inference: there has been plenty
of research on this in the cognitive psych community.  It seems that humans
are OK at solving this kind of problem *only in familiar contexts*.

That is, we can sometimes approximately solve problems formally mappable
into this kind of probabilistic inference problem, but our ability at
solving them is vastly better if the problems occur in familiar domains
(physical objects, social interactions, etc.) than if the problems occur in
an "abstracted" form.

There are two possible explanations for this:

1) Humans use special-case algorithms to solve these problems, a different
algorithm for each domain

2) Humans have a generalized mental tool for solving these problems, but
this tool can only be invoked when complemented by some domain-specific
knowledge

My intuitive inclination is that the correct explanation is 2) not 1).  But
of course, which explanation is correct for humans isn't all that relevant
to AI work in the Novamente vein.


-- Ben G





RE: [agi] A probabilistic/algorithmic puzzle...

2003-02-21 Thread Billy Brown
Ben Goertzel wrote:
> Suppose you have a large set of people, say, all the people on Earth
>
> Then you have a bunch of categories you're interested in, say:

...

> The problem at hand is, you're given some absolute and
> some conditional probabilities regarding the concepts
> at hand, and you want to infer a bunch of others.

Hmm. The thing I find interesting here is that humans don't have a good
solution to this problem. Give a typical human a set of data like the above,
and he'll just give you a blank look. Give him a specific problem and he'll
do some first-order inference (e.g., Fluffy is more likely to be a cat's name
than a dog's), but we rarely take it more than one step. Also, it seems to
me that humans usually only look for the specific data required by a
problem, rather than trying to figure out all the logical consequences of a
set of data.

This does not, of course, mean that you should give Novamente the ability to
solve this kind of problem. But it does hint that what you're building is a
different kind of mind than what humans have...

Billy Brown




RE: [agi] A probabilistic/algorithmic puzzle...

2003-02-21 Thread Ben Goertzel

> I think I proposed about 6 or so basic variations on this
> theme to test the reasoning system's ability to deal with
> various levels of noise and missing data...  you can come up
> with all sorts of interesting variations with a bit of thought.
>
> Yeah, just a fancy Venn diagram really used to generate
> reasonably consistent data sets.
>
> Cheers
> Shane

Right -- and two things should be emphasized:

1) this is just one little puzzle (within a much larger puzzle whose pieces
are smaller puzzles ;) ... solving it only brings you a leeetle bit of the
way toward an effective AGI design, of course

2) we actually have an apparently adequate solution, I'm just curious if
there are better ones out there.  One aspect of the Novamente design is that
various aspects of it can often be improved "independently" of the rest of
the system.  (Though there are complex limitations on this substitutability)

ben





RE: [agi] A probabilistic/algorithmic puzzle...

2003-02-21 Thread Philip Sutton
Ben,

> OK... life lesson #567: When a mathematical explanation confuses
> non-math people, another mathematical explanation is not likely to
> help 

While I can't help with the solution, I can say that this version of your 
problem at last made sense to me - previous versions were
incomprehensible to me; this last version leaped off the page as
comprehensible communication.  So your rule above holds very well.

If you can teach Novamente to do what you have just done here you've 
made a big leap forward in human / Novamente communication.

Cheers, Philip

From:   "Ben Goertzel" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Subject:        RE: [agi] A probabilistic/algorithmic puzzle...
Date sent:  Thu, 20 Feb 2003 14:25:54 -0500
Send reply to:  [EMAIL PROTECTED]



OK... life lesson #567: When a mathematical explanation confuses 
non-math people, another mathematical explanation is not likely to 
help

The basic situation can be thought of as follows.

Suppose you have a large set of people, say, all the people on Earth

Then you have a bunch of categories you're interested in, say:

Chinese
Arab
fat
skinny
smelly 
female
...


Then you have some absolute probabilities, e.g.

P(Chinese) = .2
P(fat) = .15

etc. , which tell you how likely a randomly chosen person is to fall into 
each of the categories

Then you have some conditional probabilities, e.g.

P(fat | skinny)=0
P(smelly|male) = .62
P(fat | American) = .4
P(slow|fat) = .7

The third one, for instance, tells you that if you know someone is
American, then there's a .4 chance the person is fat (i.e. 40% of 
Americans are fat).

The problem at hand is, you're given some absolute and some 
conditional probabilities regarding the concepts at hand, and you want 
to infer a bunch of others.

In localized cases this is easy, for instance using probability theory one 
can get evidence for

P(slow|American)

from the combination of

P(slow|fat)

and

P(fat | American)

Given n concepts there are n^2 conditional probabilities to look at. 
The most interesting ones to find are the ones for which

P(A|B) is very different from P(A)

just as for instance

P(fat|American) is very different from P(fat)

This problem is covered by elementary probability theory. Solving it in 
principle is no issue. The tricky problem is solving it approximately, for 
a large number of concepts and probabilities, in a very rapid 
computational way.

Bayesian networks try to solve the problem by seeking a set of 
concepts that are arranged in an "independence hierarchy" (a directed 
acyclic graph with a concept at each node, so that each concept is 
independent of its parents conditional on its ancestors -- and no I don't 
feel like explaining that in nontechnical terms at the moment ;). But 
this can leave out a lot of information because real conceptual 
networks may be grossly interdependent. Of course, then one can try 
to learn a whole bunch of different Bayes nets and merge the 
probability estimates obtained from each one

One thing that complicates the problem is that, in some cases, as well 
as inferring probabilities one hasn't been given, one may want to make 
corrections to probabilities one HAS been given. For instance, 
sometimes one may be given inconsistent information, and one has to 
choose which information to accept.

For example, if you're told

P(male) = .5
P(young|male) = .4
P(young) = .1

then something's gotta give, because the first two probabilities imply 
P(young) >= .5*.4 = .2
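That bound (P(A) >= P(B)*P(A|B), since everyone who is both A and B is certainly A) is mechanical to check; a minimal sketch:

```python
def check_marginal_bound(p_b, p_a_given_b, p_a):
    """P(A) can be no smaller than P(B) * P(A|B), because A-and-B is a
    subset of A.  Returns (implied lower bound, violated?)."""
    lower_bound = p_b * p_a_given_b
    return lower_bound, p_a < lower_bound

# Ben's example: P(male)=.5, P(young|male)=.4, P(young)=.1
bound, violated = check_marginal_bound(p_b=0.5, p_a_given_b=0.4, p_a=0.1)
print(bound, violated)  # 0.2 True
```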

Novamente's probabilistic reasoning system handles this problem pretty 
well, but one thing we're struggling with now is keeping this "correction 
of errors in the premises" under control. If you let the system revise its 
premises to correct errors (a necessity in an AGI context), then it can 
easily get carried away in cycles of revising premises based on 
conclusions, then revising conclusions based on the new premises, and 
so on in a chaotic trajectory leading to meaningless inferred 
probabilities.

As I said before, this is a very simple incarnation of a problem that 
takes a lot of other forms, more complex but posing the same essential 
challenge.

-- Ben G







---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]



Re: [agi] A probabilistic/algorithmic puzzle...

2003-02-20 Thread Shane Legg

Hi Cliff and others,

As I came up with this kind of a test perhaps I should
say a few things about its motivation...

The problem was that the Webmind system had a number of
proposed reasoning systems and it wasn't clear which was
the best.  Essentially the reasoning systems took as input
a whole lot of data like:

Fluffy is a Cat
Snuggles is a Cat
Tweety is a Bird
Cats are animals
Cats are mammals
Cats are dogs

and so on...  This data might have errors, it might be
very biased in its sample of the outside world, it might
contain contradictions and so on... nevertheless we
would expect some basic level of consistency to it.

The reasoning systems take this and come up with all
sorts of conclusions like: Fluffy is an animal based
on the fact that Fluffy is a Cat and Cats seem to be
animals...  In a sense the reasoning system is trying
to "fill in the gaps" in our data by looking at the
data it has and drawing simple conclusions.

So what I wanted to do was to somehow artificially
generate test sets that I could use to automatically
test the systems against each other.  I would vary the
number of entities in the space (Fluffy, Cat, Bird...)
the amount of noise in the data set, the number of
data points and so on...

Now the problem is that you can't just randomly generate
any old data points, you actually need at least some kind
of consistency, which is a bit tricky when you have some
A's being B's and most B's being C's and all B's not being
D's but all D's being A's.  Before long your data is totally
self-contradictory and you are basically just feeding your
reasoning system complete junk, so it isn't a very
interesting test of the system's ability.

So my idea was basically to create a virtual Venn diagram,
using randomly placed rectangles as the sets, to compute
the probability for each entity in the space and the conditional
probabilities of their various intersections.  This way your
fundamental underlying system has consistent probabilities,
which is a good start.  You can then randomly sample points
from the space, or directly compute the probabilities from the
rectangle areas (actually n-dimensional rectangles, as this gives
more interesting intersection possibilities), and so on, to get
your data sets.  You can then look at how well the system is
able to approximate the true probabilities based on the
incomplete data that it has been given (you can compute the
true probabilities directly as you know the rectangle areas).
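A minimal sketch of the construction, using 1-D intervals instead of n-dimensional rectangles for brevity (the function names and interface are invented; the idea is Shane's):

```python
import random

def make_concepts(n, seed=0):
    """Each concept is a random interval in [0, 1); the underlying
    'population' is the unit interval, so probabilities are lengths."""
    rng = random.Random(seed)
    return [tuple(sorted(rng.random() for _ in range(2))) for _ in range(n)]

def p(c):                        # absolute probability = interval length
    return c[1] - c[0]

def p_and(c1, c2):               # probability of the intersection
    lo, hi = max(c1[0], c2[0]), min(c1[1], c2[1])
    return max(0.0, hi - lo)

def p_cond(c1, c2):              # conditional probability P(c1 | c2)
    return p_and(c1, c2) / p(c2) if p(c2) > 0 else 0.0

concepts = make_concepts(5)
# Data set: sample points and estimate P(concept 0) empirically,
# then compare against the exactly computable true probability.
rng = random.Random(1)
samples = [rng.random() for _ in range(10000)]
est = sum(concepts[0][0] <= x < concepts[0][1] for x in samples) / len(samples)
print(round(est, 3), round(p(concepts[0]), 3))
```

The empirical estimate converges on the true interval length as the sample grows, which is exactly the property that makes the test set "reasonably consistent."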

I think I proposed about 6 or so basic variations on this
theme to test the reasoning system's ability to deal with
various levels of noise and missing data...  you can come up
with all sorts of interesting variations with a bit of thought.

Yeah, just a fancy Venn diagram really used to generate
reasonably consistent data sets.

Cheers
Shane



RE: [agi] A probabilistic/algorithmic puzzle...

2003-02-20 Thread Ben Goertzel


> Isn't there some way, if a "full curve" is too computationally
> expensive, of expressing, say, 2 sigmas (standard deviations)
> or whatever? E.g. 74% will fall within 1 standard dev. of optimum X?

We tried that, but generally, after a few inference iterations, the
confidence intervals tend to become meaningless.

What we do when we want pdf truth values is to use polynomials (so that,
e.g. we can look at truth value distributions fitted by 10th-degree
polynomials).  Polynomials are nice in that they allow rapid algebraic
manipulation.


> Finally, isn't there some precise equation or set of equations you are
> approximating?

Sure, and I worked that out in detail, but it's not computationally
feasible.

You can take a possible worlds approach.  In the case where the premises are
consistent, for instance, you can look at all possible worlds consistent
with the given premises.  Then to find an unknown probability

P(A|B)

you average the value this would take in each of the possible worlds.  If
you have a priori knowledge about which possible worlds are more likely, you
can use it too.  This is a "correct" approach, but only computationally
feasible for small problem cases.  The math proofs of the Novamente
inference rules explicitly involve an approximation to this approach.
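The possible-worlds average can be approximated by rejection sampling over tiny random worlds, which makes its cost obvious; a sketch with an invented world model (20 people, two boolean traits) and an invented acceptance tolerance:

```python
import random

def possible_worlds_estimate(n_people, constraints, query, n_worlds=2000, seed=0):
    """Sample random 'worlds' (assignments of two boolean traits to each
    of n_people), keep only worlds consistent with the constraints, and
    average the queried quantity over the survivors."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_worlds):
        world = [(rng.random() < 0.5, rng.random() < 0.5)
                 for _ in range(n_people)]
        if all(c(world) for c in constraints):
            values.append(query(world))
    return sum(values) / len(values) if values else None

def p_male(world):
    return sum(m for m, _ in world) / len(world)

def p_young_given_male(world):
    young = [y for m, y in world if m]
    return sum(young) / len(young) if young else 0.0

# Accept worlds where P(male) is within .1 of .5, then average P(young|male)
est = possible_worlds_estimate(
    20, [lambda w: abs(p_male(w) - 0.5) <= 0.1], p_young_given_male)
print(round(est, 2))  # close to .5, since 'young' is generated unbiased here
```

Even this toy version enumerates worlds of only 20 people; the combinatorics are what make the exact approach infeasible beyond small cases.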


-- Ben




Re: [agi] A probabilistic/algorithmic puzzle...

2003-02-20 Thread Cliff Stabbert
Thursday, February 20, 2003, 8:11:48 PM, Ben Goertzel wrote:

CS> Somehow I see this ending up as finding a set of bell curves (i.e.
CS> their height, spread and optimum) for each estimate.  That is to say I
CS> don't see *just* the probability as relevant but the probability
CS> distribution...if I sample 10 people, the curves are all "wider" than
CS> if I sample 100 people out of a 1,000 total.

BG> You can do that, it's true.  One option we prototyped in Novamente was using
BG> "probability distribution truth values" instead of simple probability truth
BG> values.  However, it vastly increases the computational cost, and in many
BG> cases there's not enough data to support a distributional truth value
BG> meaningfully.

BG> So the system is designed to be able to switch between distributional and
BG> simple truth values adaptively ;-)

BG> However, a truth value distribution need not be a bell curve.

BG> For example, how about the truth value of

BG> P( male | human )

BG> As a number it's .5

BG> As a distribution, it's bimodal, not Gaussian at all

Well, it's bimodal *IF* you know the *real* probability distribution.

Put it this way: given 100 samples from a 1,000 total population

  100 samples
   51 exhibit property X
    3 exhibit property Y

Let's say we *know* that property X is "male" and the sampled
population is "people", (i.e. those are givens) then this is both
expected and not too far off. 

But property Y, because it's so "small", doesn't tell us as much about
"the probability of property Y".  E.g., if Y is having green eyes and
red hair, and this is a rare property, a few more or less in a sample
of 100 is "pretty possible".  Whereas a number of males >> .5n or
<< .5n is much less likely.  If in a random sample of 100 people only
5 were male, we'd be, ehm, surprised.

Isn't there some equation that determines the likelihood of out-of-
bounds results in a random sample?  My probability theory is too rusty
here...
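The equation being reached for here is presumably the binomial distribution: with n samples and true probability p, the hit count has mean n*p and standard deviation sqrt(n*p*(1-p)). A quick sketch:

```python
import math

def binomial_stddev(n, p):
    """Standard deviation of the number of hits in n independent samples
    when each sample is a hit with probability p."""
    return math.sqrt(n * p * (1.0 - p))

# 100 samples: a common trait (p = .5) vs. a rare one (p = .03)
print(binomial_stddev(100, 0.5))   # 5.0 -- 5 males out of 100 is ~9 sigmas off
print(binomial_stddev(100, 0.03))  # ~1.71 -- a few more or fewer is routine
```

This matches the intuition above: the rare-trait count carries less information precisely because a deviation of a few is well within its expected spread.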

Isn't there some way, if a "full curve" is too computationally
expensive, of expressing, say, 2 sigmas (standard deviations)
or whatever? E.g. 74% will fall within 1 standard dev. of optimum X?

Finally, isn't there some precise equation or set of equations you are
approximating?


--
Cliff




RE: [agi] A probabilistic/algorithmic puzzle...

2003-02-20 Thread Ben Goertzel


Hi Cliff,

> BG> One thing that complicates the problem is that ,in some
> cases, as well as
> BG> inferring probabilities one hasn't been given, one may want to make
> BG> corrections to probabilities one HAS been given.  For
> instance, sometimes
> BG> one may be given inconsistent information, and one has to choose which
> BG> information to accept.
>
> If I'm following this, this corresponds in your second statement to
> the random point selection leading to *approximations* of
> probabilities,

Actually, in the rectangles test, we will never have inconsistencies, unless
we specifically add noise to the data.  Sorry if I made a mis-statement
there.

On the other hand, in real-world inference problems, one often has
inconsistencies.  For instance, in data loaded from biology databases, there
may be inconsistencies due to errors in some of the databases.  Similarly,
in data loaded from sensors, there may be inconsistencies due to erroneous
perceptions by one or another sensor (and the system may not know which is
erroneous in a given case).

> So don't we, in order to make assessments of the accuracy of the
> approximations need to know the number of samples taken and have some
> given confidence level of the randomness of the sampling process?

That would be nice, but in real inference examples this kind of information
is usually not available.

> Somehow I see this ending up as finding a set of bell curves (i.e.
> their height, spread and optimum) for each estimate.  That is to say I
> don't see *just* the probability as relevant but the probability
> distribution...if I sample 10 people, the curves are all "wider" than
> if I sample 100 people out of a 1,000 total.

You can do that, it's true.  One option we prototyped in Novamente was using
"probability distribution truth values" instead of simple probability truth
values.  However, it vastly increases the computational cost, and in many
cases there's not enough data to support a distributional truth value
meaningfully.

So the system is designed to be able to switch between distributional and
simple truth values adaptively ;-)

However, a truth value distribution need not be a bell curve.

For example, how about the truth value of

P( male | human )

As a number it's .5

As a distribution, it's bimodal, not Gaussian at all

-- Ben




RE: [agi] A probabilistic/algorithmic puzzle...

2003-02-20 Thread Ben Goertzel

Interestingly, in our system, we nearly always get an equilibrium even
without any kind of "rate of change decay factor."  It's just that if too
much "conclusion based premise revision" goes on, then the equilibrium may
reflect a too-much-revised illusory world.  Basically, the process of
revising premises based on conclusions is difficult (but possible) to
control and has a tendency to lead to chaotic inference trajectories, if
things aren't set up carefully.

We have a mechanism based on keeping track of "weight of evidence" that
works pretty much like your decay factor; and what we find is that it's a
bit fussy, that's all.

The philosophical conclusion, perhaps, is that "deviations from 'seeing is
believing' have to be handled with great care" ...

[I note that this "nearly always get an equilibrium" result is only the case
when NOTHING BUT first-order probabilistic inference is going on.  When
other processes are running in the system too, say the nonlinear-dynamical
attention-allocation function that drives the system's focus of attention,
or new-node-formation processes, etc. then convergence does not occur.

You mention new information being added into the system via GoalNodes, which
is one route, but in Novamente there are also other processes besides
GoalNodes and first-order prob. inference that may add knowledge to the
system as well.]

-- Ben

> -Original Message-
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On
> Behalf Of Brad Wyble
> Sent: Thursday, February 20, 2003 3:26 PM
> To: [EMAIL PROTECTED]
> Subject: Re: [agi] A probabilistic/algorithmic puzzle...
>
>
> >
> > But anyway, using the weighted-averaging rule dynamically and
> iteratively
> > can lead to problems in some cases.  Maybe the mechanism you
> suggest -- a
> > nonlinear average of some sort -- would have better behavior, I'll think
> > about it.
>
> The part of the idea that guaranteed an eventual equilibrium was
> to add decay to the variables that can trigger internal
> probability adjustments (in my case, what I called "truth").
> Eventually the system will stop self-modifying when the
> energy(truth) runs out.  The only way to add more truth to the
> system would be to acquire new information via adding goal nodes
> for that purpose.  You could say that the internal consistency
> checker "feeds on" the truth energy introduced into the system by
> the completion of data-acquisition goals(which are capable of
> incrementing truth values).
>
> This should guarantee the prevention of  infinite self-modification loops.
>
> -Brad
>




Re: [agi] A probabilistic/algorithmic puzzle...

2003-02-20 Thread Cliff Stabbert
Thursday, February 20, 2003, 2:25:54 PM, Ben Goertzel wrote:

BG> The basic situation can be thought of as follows.



Thanks, this does clarify things a lot.  Your first statement of the
problem did leave some things out though...but, perhaps
unsurprisingly, I'm still a bit puzzled.

I don't mean to nag, so if you don't have the time, just leave it --
perhaps someone else will volunteer as probability professor...

BG> One thing that complicates the problem is that ,in some cases, as well as
BG> inferring probabilities one hasn't been given, one may want to make
BG> corrections to probabilities one HAS been given.  For instance, sometimes
BG> one may be given inconsistent information, and one has to choose which
BG> information to accept.

If I'm following this, this corresponds in your second statement to
the random point selection leading to *approximations* of
probabilities, i.e., we only have samples such as
  Cliff(Fat, Smelly, Slow, American, Sucks at Math)
and
  Ben(Slender, Smelly, Fast, American, Math Geek)
etc.

from which the probabilities P(fat|american) etc. are derived.

So don't we, in order to make assessments of the accuracy of the
approximations need to know the number of samples taken and have some
given confidence level of the randomness of the sampling process?

Or is the randomness of sampling more or less a given, and we're
dealing with n << t total samples, i.e. we've sampled (n/t) of the
population?

In that case, all probabilities we've inferred have the same initial
"certainty quotient" which depends in a straightforward way on the
ratio of n to t...?

Or is the probability *itself* a factor in certainty, i.e. if only 1
in a 1,000 people have property X, then the number of people you
sample who have property X is "more random" in a 100 person sample
than if 1 in 2 people have that property. 

I.e. if the probability of having lung cancer is 1/300 and the
probability of being male is 1/2 than if we have a sample of
  100 people
   48 male
    2 lung cancer

the 2 is less significant (informative) because it's so small, while
the 48 "gives you more information" or "more certainty"... ?

BG> For example, if you're told

BG> P(male) = .5
BG> P(young|male) = .4
BG> P(young) = .1

BG> then something's gotta give, because the first two probabilities imply
BG> P(young) >= .5*.4 = .2

Right, because out of a population of
  1000 people
   100 young
   500 males
   200 young males
==
?? does not compute: 100 young or >=200 young??

So the sample size / population size and possibly the sampling method
have a large influence here.

BG> Novamente's probabilistic reasoning system handles this problem pretty well,
BG> but one thing we're struggling with now is keeping this "correction of
BG> errors in the premises" under control.  If you let the system revise its
BG> premises to correct errors (a necessity in an AGI context), then it can
BG> easily get carried away in cycles of revising premises based on conclusions,
BG> then revising conclusions based on the new premises, and so on in a chaotic
BG> trajectory leading to meaningless inferred probabilities.

Say we randomly sampled 100 people of the total 1,000 people to arrive
at the above probabilities: this is what I don't get.  In order to
find the above probabilities wouldn't we have to have found, out of
  100 samples, that
   10 are young
   50 are males
   20 are young males
??  We couldn't have.

So are we separately sampling
  100 samples
   10 are young
   50 are male

and then sampling a *different* random set of
 n  males  of which
  0.2n  are young
??

Can you (or somebody) sketch me a scenario in which we arrive from
some set of samples at "contradictory" probability estimates?
   
BG> As I said before, this is a very simple incarnation of a problem that takes
BG> a lot of other forms, more complex but posing the same essential challenge.

From everything you're saying above I get the sense that

- there's *some* set of equations governing the *relationship
  between* the actual values of the probabilities and the estimated
  probabilities
- these relationships depend on the number of samples and the
  estimated probabilities (estimated based on those samples),
  i.e. the certainty of the estimate P(X_i) depends both on the
  sample size relative to the population, and how many of the sample
  exhibited X_i (e.g. 50 males out of a 100 "tells you more
  certainly" about the probability of male than 1 lung cancer out of
  a 100 tells you about the probability of lung cancer)
- you want to converge quickly (computationally cheaply) on some
  optimum estimate of those relationships (because they involve
  something nastily nonlinear, somehow)

Somehow I see this ending up as finding a set of bell curves (i.e.
their height, spread and optimum) for each estimate.  That is to say I
don't see *just* the probability as relevant but the probability
distribution...if I sample 10 people, the curves are all "wider" than
if I sample 100 people out of a 1,000 total.

Re: [agi] A probabilistic/algorithmic puzzle...

2003-02-20 Thread Brad Wyble
> 
> But anyway, using the weighted-averaging rule dynamically and iteratively
> can lead to problems in some cases.  Maybe the mechanism you suggest -- a
> nonlinear average of some sort -- would have better behavior, I'll think
> about it.

The part of the idea that guaranteed an eventual equilibrium was to add decay to the 
variables that can trigger internal probability adjustments (in my case, what I called 
"truth").  Eventually the system will stop self-modifying when the energy (truth) 
runs out.  The only way to add more truth to the system would be to acquire new 
information via adding goal nodes for that purpose.  You could say that the internal 
consistency checker "feeds on" the truth energy introduced into the system by the 
completion of data-acquisition goals (which are capable of incrementing truth values).

This should guarantee the prevention of  infinite self-modification loops.

-Brad




RE: [agi] A probabilistic/algorithmic puzzle...

2003-02-20 Thread Ben Goertzel

> If P1 and P2 are contradictory, compare the truth values of the
> assertions.  If they are very similar, do nothing, because it's
> impossible to know which is correct.  If they vary
> significantly(and at least one of them is above a certain
> threshold), alter the probabilities towards one another, with
> respect to their relative truth.   So if P1 has truth .95 and P2
> has truth .2, adjust P1 slightly in the direction to relieve the
> contradiction.  Adjust P2 greatly.Then, decrement the truth
> values of both of them using some nonlinear function.  High truth
> assertions should probably be "sticky", in that they decrease
> very slowly, so that you need a great number of contradictory
> low-truth contradictions to bring a single high-truth value down
> to mid-range truth values.

yeah, this is handled in Novamente via the "revision rule" and "rule of
choice", two inference rules.

If two estimates are reasonably close, they are weighted-averaged in a
certain way; if they are very different, they may both be maintained pending
future evidence that one of them is totally wrong...

The Novamente situation is a little subtler because in addition to
probabilities we retain for each probability a number indicating the "amount
of evidence" on which the probability is based.  This can be useful, e.g.
for the weighting in the above-mentioned weighted average rule.
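The evidence-weighted averaging described here presumably amounts to something like the following (a sketch; the names and signature are invented, not Novamente's actual API):

```python
def revise(p1, n1, p2, n2):
    """Merge two estimates of the same probability, weighting each by the
    amount of evidence (observation count) behind it."""
    n = n1 + n2
    return (p1 * n1 + p2 * n2) / n, n

# .6 seen on 30 observations merged with .9 seen on 10 observations
p, n = revise(0.6, 30, 0.9, 10)
print(round(p, 3), n)  # 0.675 40
```

The merged estimate inherits the combined evidence count, which is what lets later revisions weight it appropriately.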

But anyway, using the weighted-averaging rule dynamically and iteratively
can lead to problems in some cases.  Maybe the mechanism you suggest -- a
nonlinear average of some sort -- would have better behavior, I'll think
about it.

-- Ben




Re: [agi] A probabilistic/algorithmic puzzle...

2003-02-20 Thread Brad Wyble
> 
> One thing that complicates the problem is that ,in some cases, as well as
> inferring probabilities one hasn't been given, one may want to make
> corrections to probabilities one HAS been given.  For instance, sometimes
> one may be given inconsistent information, and one has to choose which
> information to accept.
> 
> For example, if you're told
> 
> P(male) = .5
> P(young|male) = .4
> P(young) = .1
> 
> then something's gotta give, because the first two probabilities imply
> P(young) >= .5*.4 = .2
> 
> Novamente's probabilistic reasoning system handles this problem pretty well,
> but one thing we're struggling with now is keeping this "correction of
> errors in the premises" under control.  If you let the system revise its
> premises to correct errors (a necessity in an AGI context), then it can
> easily get carried away in cycles of revising premises based on conclusions,
> then revising conclusions based on the new premises, and so on in a chaotic
> trajectory leading to meaningless inferred probabilities.
> 
> As I said before, this is a very simple incarnation of a problem that takes
> a lot of other forms, more complex but posing the same essential challenge.
> 
> -- Ben G

The first thing that occurs to me (and probably has to you) as a solution to this dilemma 
is to use truth (confidence) values to set cut-off points for correction of 
contradictions.

If P1 and P2 are contradictory, compare the truth values of the assertions.  If they 
are very similar, do nothing, because it's impossible to know which is correct.  If 
they vary significantly (and at least one of them is above a certain threshold), alter 
the probabilities towards one another, with respect to their relative truth.  So if 
P1 has truth .95 and P2 has truth .2, adjust P1 slightly in the direction to relieve 
the contradiction.  Adjust P2 greatly.  Then, decrement the truth values of both of 
them using some nonlinear function.  High truth assertions should probably be 
"sticky", in that they decrease very slowly, so that you need a great number of 
contradictory low-truth contradictions to bring a single high-truth value down to 
mid-range truth values.  
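A sketch of this adjustment rule in code, with invented constants for the threshold and the decay curve:

```python
def resolve_contradiction(p1, t1, p2, t2, threshold=0.3, decay=0.9):
    """Pull two contradictory probability estimates toward each other in
    proportion to their relative truth (confidence), then decay both
    truth values nonlinearly so that high-truth values are 'sticky'."""
    if abs(t1 - t2) < threshold:
        return p1, t1, p2, t2            # too close to call: do nothing
    total = t1 + t2
    consensus = (p1 * t1 + p2 * t2) / total
    p1 += (consensus - p1) * (t2 / total)  # high truth -> small adjustment
    p2 += (consensus - p2) * (t1 / total)  # low truth -> large adjustment
    # truth near 1 barely decays; low truth decays proportionally faster
    t1 *= decay + (1.0 - decay) * t1
    t2 *= decay + (1.0 - decay) * t2
    return p1, t1, p2, t2

p1, t1, p2, t2 = resolve_contradiction(0.8, 0.95, 0.2, 0.2)
# p1 moves only slightly (to ~.78); p2 moves a long way (to ~.61)
```

With these constants the behavior matches the description: the .95-truth estimate barely budges, the .2-truth estimate is dragged most of the way toward it, and both truth values are decremented, the lower one faster.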

By decreasing the energy in the truth table with each sweep, and only effecting 
changes as long as truth values are above a cut-off threshold, you are guaranteed to 
reach a state of equilibrium eventually.   

However, the system I've described can end up with a stable state of many mutually 
contradictory statements of similar truth values.  These states would need to be 
resolved by another system, perhaps one that creates goal nodes to address 
contradictions through the acquisition of new information.  


-Brad




RE: [agi] A probabilistic/algorithmic puzzle...

2003-02-20 Thread Ben Goertzel



 
Yes, of course, the overlaps are the whole subtlety to the problem!  This is
what's known as "probabilistic dependency" ;-)

> -Original Message-
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
> On Behalf Of Kevin
> Sent: Thursday, February 20, 2003 2:43 PM
> To: [EMAIL PROTECTED]
> Subject: Re: [agi] A probabilistic/algorithmic puzzle...
> Isn't this problem made more complex when we consider that things belong
> to various categories.
>
> For instance, if we know that
> -40% of americans are fat
> -americans are "people"
> -a person can be male or female
>
> then we can make the initial guess that 40% of american males are fat,
> and 40% of american women are fat.
>
> this seems to add a lot of layers of complexity to the problem, which I'm
> sure you've already considered...
>
> Kevin
  
> - Original Message -
> From: Ben Goertzel
> To: [EMAIL PROTECTED]
> Sent: Thursday, February 20, 2003 2:25 PM
> Subject: RE: [agi] A probabilistic/algorithmic puzzle...

 

Re: [agi] A probabilistic/algorithmic puzzle...

2003-02-20 Thread Kevin



Isn't this problem made more complex when we consider that things belong
to various categories.

For instance, if we know that
-40% of americans are fat
-americans are "people"
-a person can be male or female

then we can make the initial guess that 40% of american males are fat,
and 40% of american women are fat.

this seems to add a lot of layers of complexity to the problem, which I'm
sure you've already considered...

Kevin

> - Original Message -
> From: Ben Goertzel
> To: [EMAIL PROTECTED]
> Sent: Thursday, February 20, 2003 2:25 PM
> Subject: RE: [agi] A probabilistic/algorithmic puzzle...
  
   
  based on conclusions, then revising conclusions based on the new premises, and 
  so on in a chaotic trajectory leading to meaningless inferred 
  probabilities.
   
  As I said before, this is a very simple incarnation 
  of a problem that takes a lot of other forms, more complex but posing the same 
  essential challenge.
   
  -- Ben G
   
   
   
   
   
   
   


RE: [agi] A probabilistic/algorithmic puzzle...

2003-02-20 Thread Ben Goertzel



 
OK... life lesson #567: when a mathematical explanation confuses non-math people, another mathematical explanation is not likely to help.

The basic situation can be thought of as follows.

Suppose you have a large set of people, say, all the people on Earth.

Then you have a bunch of categories you're interested in, say:

Chinese
Arab
fat
skinny
smelly
female
...

Then you have some absolute probabilities, e.g.

P(Chinese) = .2
P(fat) = .15

etc., which tell you how likely a randomly chosen person is to fall into each of the categories.

Then you have some conditional probabilities, e.g.

P(fat | skinny) = 0
P(smelly | male) = .62
P(fat | American) = .4
P(slow | fat) = .7

The third one, for instance, tells you that if you know someone is American, then there's a .4 chance the person is fat (i.e. 40% of Americans are fat).

The problem at hand is: you're given some absolute and some conditional probabilities regarding the concepts at hand, and you want to infer a bunch of others.

In localized cases this is easy; for instance, using probability theory one can get evidence for

P(slow | American)

from the combination of

P(slow | fat)

and

P(fat | American)

Given n concepts there are n^2 conditional probabilities to look at.  The most interesting ones to find are the ones for which

P(A|B) is very different from P(A)

just as, for instance,

P(fat | American) is very different from P(fat)

This problem is covered by elementary probability theory.  Solving it in principle is no issue.  The tricky problem is solving it approximately, for a large number of concepts and probabilities, in a very rapid computational way.
 
Bayesian networks try to solve the problem by seeking a set of concepts that are arranged in an "independence hierarchy" (a directed acyclic graph with a concept at each node, so that each concept is independent of its non-descendants conditional on its parents -- and no, I don't feel like explaining that in nontechnical terms at the moment ;).  But this can leave out a lot of information, because real conceptual networks may be grossly interdependent.  Of course, one can then try to learn a whole bunch of different Bayes nets and merge the probability estimates obtained from each one.

One thing that complicates the problem is that, in some cases, as well as inferring probabilities one hasn't been given, one may want to make corrections to probabilities one HAS been given.  For instance, sometimes one may be given inconsistent information, and one has to choose which information to accept.
 
For example, if you're told
 
P(male) = .5
P(young|male) = .4
P(young) = .1
 
then something's gotta give, because the first two 
probabilities imply P(young) >= .5*.4 = .2
 
Novamente's probabilistic reasoning system handles this 
problem pretty well, but one thing we're struggling with now is keeping this 
"correction of errors in the premises" under control.  If you let the 
system revise its premises to correct errors (a necessity in an AGI context), 
then it can easily get carried away in cycles of revising premises based on 
conclusions, then revising conclusions based on the new premises, and so on in a 
chaotic trajectory leading to meaningless inferred 
probabilities.
 
As I said before, this is a very simple incarnation of 
a problem that takes a lot of other forms, more complex but posing the same 
essential challenge.
 
-- Ben G


RE: [agi] A probabilistic/algorithmic puzzle...

2003-02-20 Thread Ben Goertzel

> BG> I don't know if this "test problem" will clarify things or
> confuse them ;-)
>
> For me, it's confused them.  I thought I was following it before,
> sorta...

OK, well I'm pressed for time today, so I'll write a nonmathematical version
of the problem late tonight or tomorrow or over the weekend.

> BG> Consider n rectangles inside the unit square.  Call these
> rectangles X_1,
> BG> X_2,..., X_n
>
> Are these rectangles at arbitrary locations or all with their origin
> at the unit square's origin?

Arbitrary locations

> BG> Now, we may define some probabilities:
>
> BG> P(X_i) = the area of the rectangle X_i
>
> BG> P(X_i | X_j) = (the area of the intersection between the
> rectangle X_i and
> BG> the rectangle X_j) / (the area of the rectangle X_j)
>
> So P(X_i | X_j) = how much (from 0 to 1) of X_j is covered by X_i, right?

Yes

> BG> 1) Choose a large number of points in the unit square
> BG> 2) Based on these points, evaluate *some of* the
> probabilities P(X_i) and
> BG>P(X_i | X_j)
>
> I don't understand how we get this from the points...are we getting
> approximations by selecting arbitrary points and finding out how many
> hit X_i and X_j?

yes, exactly

> Do we know n (how many rectangles total)?

yeah... of course if NO relationships involving a given rectangle are given,
then you won't even know it exists and can't infer anything about it

>Are there some other
> relationships involved between these rectangles?  All I'm getting out
> of this is "we know some rectangles in the unit square, we know some
> overlaps, now figure out the rest" but I can't see what constrains
> "the rest" from being completely arbitrary in the given scenario.

Figure out the rest of the overlaps between the rectangles you know about
(the overlaps you haven't been given)

Figure out the areas of any rectangles whose areas you haven't been given

And if possible, correct any errors in the given overlaps and areas (errors
due to the "points" approximation)

I'll write a nonmathematical explanation of the problem later, but I don't
have time right now...

-- Ben

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]



Re: [agi] A probabilistic/algorithmic puzzle...

2003-02-20 Thread Cliff Stabbert
Thursday, February 20, 2003, 10:58:57 AM, Ben Goertzel wrote:


BG> OK... I can see that I formulated the problem too formally for a lot of
BG> people

BG> I will now rephrase it in the context of a specific "test problem."



BG> I don't know if this "test problem" will clarify things or confuse them ;-)

For me, it's confused them.  I thought I was following it before,
sorta...Maybe you're leaving out some initial assumptions, or maybe
Shane can clarify.

BG> Consider the unit square in the plane, i.e. a square of area 1.

BG> Consider n rectangles inside the unit square.  Call these rectangles X_1,
BG> X_2,..., X_n

Are these rectangles at arbitrary locations or all with their origin
at the unit square's origin?

BG> Now, we may define some probabilities:

BG> P(X_i) = the area of the rectangle X_i

BG> P(X_i | X_j) = (the area of the intersection between the rectangle X_i and
BG> the rectangle X_j) / (the area of the rectangle X_j)

So P(X_i | X_j) = how much (from 0 to 1) of X_j is covered by X_i, right?

BG> Conceptually, if you like, you can think of the set of rectangles as a Venn
BG> diagram, so that the points in the unit square are things in the world, and
BG> each X_i represents some concept (defined extensionally as a set of things)

BG> Now the test problem is as follows.

BG> 1) Choose a large number of points in the unit square
BG> 2) Based on these points, evaluate *some of* the probabilities P(X_i) and
BG>P(X_i | X_j)

I don't understand how we get this from the points...are we getting
approximations by selecting arbitrary points and finding out how many
hit X_i and X_j?  Or do we select a point (x,y) and get some
information back?

BG> 3) Provide an AI system with no information other than the probabilities
BG>evaluated in step 2
BG> 4) Ask the AI system to evaluate the rest of the probabilities P(X_i) and
BG>P(X_i | X_j)

BG> This is an actual test problem we've used to tweak parameters of Novamente's
BG> first-order inference module (which embodies one solution to the problem)...

Do we know n (how many rectangles total)?  Are there some other
relationships involved between these rectangles?  All I'm getting out
of this is "we know some rectangles in the unit square, we know some
overlaps, now figure out the rest" but I can't see what constrains
"the rest" from being completely arbitrary in the given scenario.

Maybe I need more math and/or coffee, but I think something's being left
out...

--
Cliff




RE: [agi] A probabilistic/algorithmic puzzle...

2003-02-20 Thread Ben Goertzel



 
OK... I can see that I formulated the problem too formally for a lot of people.

I will now rephrase it in the context of a specific "test problem."

This test problem was invented by Shane Legg in 2000, when he worked at Webmind Inc.

It is of no intrinsic interest, but it presents the mathematical puzzle in a simple visual context.

I'll describe the test problem in a 2D context, but it could be recast in n dimensions just as well.

First I'll define the basic players in the test problem.

Consider the unit square in the plane, i.e. a square of area 1.

Consider n rectangles inside the unit square.  Call these rectangles X_1, X_2, ..., X_n.

Now, we may define some probabilities:

P(X_i) = the area of the rectangle X_i

P(X_i | X_j) = (the area of the intersection between the rectangle X_i and the rectangle X_j) / (the area of the rectangle X_j)

Of course, if one knows the rectangles (the coordinates of their corners, say) then one can compute all these probabilities.

Conceptually, if you like, you can think of the set of rectangles as a Venn diagram, so that the points in the unit square are things in the world, and each X_i represents some concept (defined extensionally as a set of things).

Now the test problem is as follows.

1) Choose a large number of points in the unit square
2) Based on these points, evaluate *some of* the probabilities P(X_i) and P(X_i|X_j)
3) Provide an AI system with no information other than the probabilities evaluated in step 2
4) Ask the AI system to evaluate the rest of the probabilities P(X_i) and P(X_i | X_j)

I don't know if this "test problem" will clarify things or confuse them ;-)

This is an actual test problem we've used to tweak parameters of Novamente's first-order inference module (which embodies one solution to the problem)...

-- Ben
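Steps 1 and 2 of the test problem can be sketched directly. The following is a minimal Monte Carlo estimator under my own choice of representation (a rectangle as an (x0, y0, x1, y1) tuple); it is not Shane's original code, just an illustration of how the "given" probabilities would be produced:

```python
import random

# A rectangle is (x0, y0, x1, y1) with 0 <= x0 < x1 <= 1, 0 <= y0 < y1 <= 1.
def contains(rect, pt):
    x0, y0, x1, y1 = rect
    x, y = pt
    return x0 <= x <= x1 and y0 <= y <= y1

def estimate(rects, n_points=100_000, seed=0):
    """Monte Carlo estimates of P(X_i) and P(X_i | X_j) from random points."""
    rng = random.Random(seed)
    pts = [(rng.random(), rng.random()) for _ in range(n_points)]
    hits = [[contains(r, pt) for pt in pts] for r in rects]
    p = [sum(h) / n_points for h in hits]
    cond = {}
    for j, hj in enumerate(hits):
        nj = sum(hj)
        for i, hi in enumerate(hits):
            if i != j and nj > 0:
                # P(X_i | X_j) = (# points in both) / (# points in X_j)
                cond[(i, j)] = sum(a and b for a, b in zip(hi, hj)) / nj
    return p, cond

rects = [(0.0, 0.0, 0.5, 0.5), (0.25, 0.25, 0.75, 0.75)]
p, cond = estimate(rects)
print(p)             # both areas are 0.25, so roughly [0.25, 0.25]
print(cond[(0, 1)])  # overlap is 0.0625, so P(X_0|X_1) is roughly 0.25
```

Step 3 then hands the AI system only a subset of `p` and `cond`, and step 4 asks it to recover the rest without ever seeing the rectangles themselves.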


Re[2]: [agi] A probabilistic/algorithmic puzzle...

2003-02-20 Thread Cliff Stabbert
Thursday, February 20, 2003, 4:25:24 AM, Jonathan Standley wrote:

JS> a challenge! cool :)  but let me try to put it in less-math terms
JS> for myself and others who are not math-types 

BG>Let X_i, i=1,...,n, denote a set of discrete random variables

JS> X_i is the set of all integers between i and n, initial value for i is 1?
JS> or is i any member of the set X?
JS> or does i function only as a lower bound to set X?

Jonathan, I'll try to restate it.  Note that I'm not a mathematician
either; so this could be off.  I'd appreciate further clarification
from other list members.

X_i is a set of variables X_1, X_2, X_3 ... X_n -- you can think of
these as a set of statements.  They're "random" in that there's no
discernable pattern.  It's not a series or anything, think of a set
of possibilities.

P(X_i|X_j) is the probability of X_i GIVEN X_j
[ i.e. say that X_i is "You test positive for cancer"
and X_j is "You have cancer"
and the test is 90% accurate
   then P(X_i|X_j) is 0.9
   i.e. IF you have cancer THEN the chance of testing positive is
0.9
  conversely,
   say that X_i is "You have cancer"
and X_j is "You test positive for cancer"
   then P(X_i|X_j) DEPENDS very strongly on the overall
probability of having cancer, because some % of the
people who don't have cancer will test positive. ]
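Cliff's point that P(cancer | positive) depends strongly on the base rate can be made concrete with Bayes' rule. The base rate and false-positive rate below are hypothetical numbers chosen for illustration; only the 90% sensitivity comes from his example:

```python
# Bayes' rule: P(cancer | positive)
#   = P(positive | cancer) * P(cancer) / P(positive)
# The base rate and false-positive rate are hypothetical.

p_cancer = 0.01            # hypothetical base rate of cancer
p_pos_given_cancer = 0.9   # test sensitivity, as in the example
p_pos_given_healthy = 0.1  # hypothetical false-positive rate

p_pos = (p_pos_given_cancer * p_cancer
         + p_pos_given_healthy * (1 - p_cancer))

p_cancer_given_pos = p_pos_given_cancer * p_cancer / p_pos
print(round(p_cancer_given_pos, 3))  # about 0.083
```

So with a 1% base rate, a positive result from a "90% accurate" test still leaves only about an 8% chance of cancer, because most positives come from the much larger healthy population.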

BG> Let's say we have a set of N << n^2 conditional probability relationships of
BG> the form
BG>
BG> P(X_j|X_i)
BG>
BG> where i, j are drawn from {1,...,n}.

i.e. we know a number far less (<<) than the total number of
the conditional probabilities i.e. only a few of them.

BG> Let's say we also have a set of M <= n probabilities
BG>
BG> P(X_i)
BG>

i.e. we also know a number of, maybe all, the basic probabilities.

BG> The problem is:
BG>
BG> * Infer the rest of the P(X_i|X_j) and P(X_i): the ones that aren't given
BG>
BG> * Specifically, infer cases where P(X_i|X_j) differs significantly from
BG> P(X_i)

i.e. how can we infer the rest.

This is where my confusion sets in...

- Is the sum of all conditional probabilities P(X_i|X_j) for all i
  given a specific j equal to 1?

- Is the sum of all probabilities P(X_i) for all i equal to 1?

If not, what sort of relationships hold?  I have the feeling I'm
missing some basic assumption here.

Thanks

--
Cliff




Re: [agi] A probabilistic/algorithmic puzzle...

2003-02-20 Thread Jonathan Standley



 
 

> Let X_i, i=1,...,n, denote a set of discrete random variables

X_i is the set of all integers between i and n, initial value for i is 1?
or is i any member of the set X?
or does i function only as a lower bound to set X?

Hi, me again.  I forgot to ask: is i,...,n an integer index pointing to an array of related variables?  Could X_i have this structure: [1,6,3,7]?  In other words, is the internal order significant?  Can this exist: X_i = [3,5,8,1,8]?

Related question: does X_i represent constructions in Novamente's knowledge representation system?  Or thoughts, or trains of thought?  Or a hybrid thought/knowledge construction?

Thanks

J Standley


Re: [agi] A probabilistic/algorithmic puzzle...

2003-02-20 Thread Jonathan Standley



A challenge! cool :)  But let me try to put it in less-math terms for myself and others who are not math-types.

> Let X_i, i=1,...,n, denote a set of discrete random variables

X_i is the set of all integers between i and n, initial value for i is 1?
or is i any member of the set X?
or does i function only as a lower bound to set X?

> Let's say we have a set of N << n^2 conditional probability relationships of
> the form

set N consists of relationships numbering n^2, and n is the upper boundary of the set X?

> P(X_j|X_i)

what does this notation "|" mean?
does P(X_j|X_i) mean the probability of a subset occurring within the set X, where X is bounded by [i,n]?

> where i, j are drawn from {1,...,n}.

does X_i represent the whole set and j is a subset of set X_i?
or is i the lower bound of set X and j is an arbitrary member of set X?

> Let's say we also have a set of M <= n probabilities

is this a different n than the upper bound of set X?

> P(X_i)

M = P(X_i)?
if so, it means that M is the chance of set X existing within a larger 'parent' set (i.e. the Novamente system)?

> The problem is:
>
> * Infer the rest of the P(X_i|X_j) and P(X_i): the ones that aren't given
>
> * Specifically, infer cases where P(X_i|X_j) differs significantly from
> P(X_i)

Is the above asking: given a set X bounded upper = n, lower = i, and given j, an arbitrary subset of set X in X's initial state, use the available data to approximate the probability of a hypothetical integer set (a new value for j) appearing within X in a future state?  Specifically, try to find initial conditions which will lead to a large difference between the chances of X existing and the chances of an X that contains j existing at a future state?  All of this assuming that set X is being acted upon by a specified algorithm or process?

> Clearly this is a massively "underdetermined" problem: the given data will
> generally not be enough to uniquely determine the results.  This is what
> makes it interesting!
>
> As I said, we have two solutions for this, one implemented the other just
> designed; so we know the problem is approximately and heuristically solvable
> in a plausible computational timeframe.  But I wonder if there aren't
> radically different solutions from the ones we've come up with...

would you mind trying to put the problem in 'word problem' form?  I do much better with concepts than equations ;)

If it's not worth the effort don't bother :)

J Standley


[agi] A probabilistic/algorithmic puzzle...

2003-02-19 Thread Ben Goertzel

Hi,

This one is for the more mathematically/algorithmically inclined people on
the list.

I'm going to present a mathematical problem that's come up in the Novamente
development process.  We have two different solutions for it, each with
strengths and weaknesses.  I'm curious if, perhaps, someone on this list
will suggest an alternate approach.  (If not, at least the problem itself
may stimulate somebody's mind ;)

I'll describe the problem here in a very simple form.  Actually, inside
Novamente, this simple problem exists in many "transformed" variants and
takes many different guises.  It is posed here in terms of simple
conditional probabilities, but it also presents itself in other forms,
involving n-ary relationships, complex procedures and predicates, etc. etc.

Without further ado

Let X_i, i=1,...,n, denote a set of discrete random variables (think of them
as concepts or percepts)

Let's say we have a set of N << n^2 conditional probability relationships of
the form

P(X_j|X_i)

where i, j are drawn from {1,...,n}.

Let's say we also have a set of M <= n probabilities

P(X_i)

The problem is:

* Infer the rest of the P(X_i|X_j) and P(X_i): the ones that aren't given

* Specifically, infer cases where P(X_i|X_j) differs significantly from
P(X_i)

Clearly this is a massively "underdetermined" problem: the given data will
generally not be enough to uniquely determine the results.  This is what
makes it interesting!
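One naive baseline for this puzzle can be sketched in a few lines. To be clear, this is NOT one of the two Novamente solutions (the post does not describe them); it simply Bayes-inverts any conditional known in the opposite direction, and otherwise defaults to independence, P(X_i|X_j) = P(X_i):

```python
# Naive baseline for filling in missing conditionals:
# (a) Bayes-invert any known P(X_j|X_i) via P(X_i|X_j) = P(X_j|X_i)*P(X_i)/P(X_j)
# (b) otherwise assume independence: P(X_i|X_j) = P(X_i)
# Illustrative only -- not one of the two solutions mentioned in the post.

def infer(p, cond):
    """p: {i: P(X_i)}; cond: {(i, j): P(X_i | X_j)}, the given ones."""
    out = dict(cond)
    for i in p:
        for j in p:
            if i == j or (i, j) in out:
                continue
            if (j, i) in cond and p[j] > 0:
                out[(i, j)] = cond[(j, i)] * p[i] / p[j]  # Bayes inversion
            else:
                out[(i, j)] = p[i]  # independence default
    return out

p = {"fat": 0.15, "American": 0.05}
cond = {("fat", "American"): 0.4}
result = infer(p, cond)
print(result[("American", "fat")])  # 0.4 * 0.05 / 0.15, about 0.133
```

The independence default is exactly what makes this baseline weak: it can never discover the interesting cases where P(X_i|X_j) differs significantly from P(X_i) unless they are implied by a single Bayes inversion, which is presumably why more sophisticated chaining and revision machinery is needed.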

As I said, we have two solutions for this, one implemented the other just
designed; so we know the problem is approximately and heuristically solvable
in a plausible computational timeframe.  But I wonder if there aren't
radically different solutions from the ones we've come up with...

Any thoughts?

-- Ben G
