----- Original Message -----
Sent: Thursday, February 20, 2003 2:25 PM
Subject: RE: [agi] A probabilistic/algorithmic puzzle...
OK... life lesson #567: When a mathematical explanation confuses non-math people, another mathematical explanation is not likely to help.

The basic situation can be thought of as follows.
Suppose you have a large set of people, say, all the people on Earth. Then you have a bunch of categories you're interested in, say:
Chinese
Arab
fat
skinny
smelly
female
...
Then you have some absolute probabilities, e.g.

P(Chinese) = .2
P(fat) = .15

etc., which tell you how likely a randomly chosen person is to fall into each of the categories.
Then you have some conditional probabilities, e.g.

P(fat | skinny) = 0
P(smelly | male) = .62
P(fat | American) = .4
P(slow | fat) = .7
The third one, for instance, tells you that if you know someone is American, then there's a .4 chance the person is fat (i.e. 40% of Americans are fat).
The problem at hand is: you're given some absolute and some conditional probabilities regarding the concepts at hand, and you want to infer a bunch of others.
In localized cases this is easy; for instance, using probability theory one can get evidence for

P(slow | American)

from the combination of

P(slow | fat)

and

P(fat | American)
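That inference step can be sketched in a few lines. The sketch below assumes, purely for illustration, that fatness "screens off" nationality from slowness (so slowness depends on nationality only through fatness); P(slow | fat) = .7 and P(fat | American) = .4 are from the numbers above, while the value for P(slow | not fat) is made up.

```python
# Estimate P(slow | American) from P(slow | fat) and P(fat | American),
# under the screening-off assumption:
#   P(slow | American) = P(slow | fat) * P(fat | American)
#                      + P(slow | not fat) * P(not fat | American)

def deduce(p_c_given_b, p_c_given_not_b, p_b_given_a):
    """Chain A -> B -> C, assuming B screens A off from C."""
    return p_c_given_b * p_b_given_a + p_c_given_not_b * (1 - p_b_given_a)

p_slow_given_fat = 0.7       # given in the text
p_fat_given_american = 0.4   # given in the text
p_slow_given_not_fat = 0.1   # NOT given; an invented illustrative value

p_slow_given_american = deduce(p_slow_given_fat, p_slow_given_not_fat,
                               p_fat_given_american)
print(p_slow_given_american)  # 0.7*0.4 + 0.1*0.6 = 0.34
```

Without the screening-off assumption the two given numbers only bound P(slow | American); the point estimate requires it.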
Given n concepts there are n^2 conditional probabilities to look at. The most interesting ones to find are the ones for which

P(A|B) is very different from P(A)

just as, for instance,

P(fat | American)

is very different from P(fat).
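Surfacing those "interesting" pairs is just a matter of ranking by departure from the base rate. In the sketch below, only P(fat) = .15, P(fat | American) = .4, and P(slow | fat) = .7 come from the text; the other numbers are invented.

```python
# Rank conditional probabilities by how far P(A|B) departs from P(A).
base = {"fat": 0.15, "slow": 0.3, "smelly": 0.2}   # only "fat" is from the text
cond = {("fat", "American"): 0.4,   # from the text
        ("slow", "fat"): 0.7,       # from the text
        ("smelly", "fat"): 0.21}    # invented; barely above the base rate

# Sort pairs by |P(A|B) - P(A)|, largest departures first.
interesting = sorted(cond.items(),
                     key=lambda kv: abs(kv[1] - base[kv[0][0]]),
                     reverse=True)
for (a, b), p in interesting:
    print(f"P({a}|{b}) = {p:.2f}  vs  P({a}) = {base[a]:.2f}")
```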
This problem is covered by elementary probability theory. Solving it in principle is no issue. The tricky problem is solving it approximately, for a large number of concepts and probabilities, in a very rapid computational way.
Bayesian networks try to solve the problem by seeking a set of concepts that are arranged in an "independence hierarchy" (a directed acyclic graph with a concept at each node, so that each concept is conditionally independent of its non-descendants given its parents -- and no, I don't feel like explaining that in nontechnical terms at the moment ;). But this can leave out a lot of information, because real conceptual networks may be grossly interdependent. Of course, one can then try to learn a whole bunch of different Bayes nets and merge the probability estimates obtained from each one....
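A toy version of the Bayes-net idea, using a hypothetical three-node chain American -> fat -> slow: the DAG asserts that slow is independent of American given fat, so the joint factors into three local tables. P(fat | American) = .4 and P(slow | fat) = .7 are from the text; the remaining numbers are invented for illustration.

```python
from itertools import product

# Local tables for the chain American -> fat -> slow.
p_american = 0.05                  # invented
p_fat = {True: 0.4, False: 0.1}    # P(fat | american?); True case from text
p_slow = {True: 0.7, False: 0.1}   # P(slow | fat?); True case from text

def joint(american, fat, slow):
    """P(american, fat, slow) via the DAG factorization."""
    pa = p_american if american else 1 - p_american
    pf = p_fat[american] if fat else 1 - p_fat[american]
    ps = p_slow[fat] if slow else 1 - p_slow[fat]
    return pa * pf * ps

# Exact inference by enumeration: P(slow | american),
# marginalizing out the unobserved "fat" variable.
num = sum(joint(True, f, True) for f in (True, False))
den = sum(joint(True, f, s) for f, s in product((True, False), repeat=2))
print(num / den)  # 0.7*0.4 + 0.1*0.6 = 0.34
```

Enumeration like this is exponential in the number of variables, which is exactly why the approximate, large-scale version of the problem is hard.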
One thing that complicates the problem is that, in some cases, as well as inferring probabilities one hasn't been given, one may want to make corrections to probabilities one HAS been given. For instance, sometimes one may be given inconsistent information, and one has to choose which information to accept.
For example, if you're told

P(male) = .5
P(young|male) = .4
P(young) = .1

then something's gotta give, because the first two probabilities imply P(young) >= .5 * .4 = .2.
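That bound follows from the law of total probability, P(young) = P(young|male)P(male) + P(young|not male)P(not male) >= P(young|male)P(male), and it can be checked mechanically; a minimal sketch:

```python
# Check a given P(A) against the total-probability lower bound
# implied by a given P(A|B) and P(B):  P(A) >= P(A|B) * P(B).

def consistent(p_a, p_a_given_b, p_b):
    """True iff P(A) respects the lower bound P(A|B) * P(B)."""
    return p_a >= p_a_given_b * p_b

p_male = 0.5
p_young_given_male = 0.4
p_young = 0.1

print(consistent(p_young, p_young_given_male, p_male))  # False: .1 < .2
```

Detecting the inconsistency is the easy part; deciding which of the three numbers to revise is the hard part discussed next.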
Novamente's probabilistic reasoning system handles this problem pretty well, but one thing we're struggling with now is keeping this "correction of errors in the premises" under control. If you let the system revise its premises to correct errors (a necessity in an AGI context), then it can easily get carried away in cycles of revising premises based on conclusions, then revising conclusions based on the new premises, and so on, in a chaotic trajectory leading to meaningless inferred probabilities.
As I said before, this is a very simple incarnation of a problem that takes a lot of other forms, more complex but posing the same essential challenge.

-- Ben G