Pei etc.,

First high level comment here, mostly to the non-Pei audience ... then I'll
respond to some of the details:

This dialogue -- so far -- feels odd to me because I have not been
 defending anything special, peculiar or inventive about PLN here.
There are some things about PLN that would be considered to fall into that
category
(e.g. the treatment of intension which uses my "pattern theory", and the
treatment of quantifiers which uses third-order probabilities ... or even
the
use of indefinite truth values).   Those are the things that I would expect
to be arguing about!  Even more interesting would be to argue about
strategies
for controlling combinatorial explosion in inference trees, which IMO is the
truly crucial issue, more so than the particulars of the inference and
uncertainty
management formalism (though those particulars need to be workable too, if
one is to have an AI with explicit inference as a significant component).

Instead, in this dialogue, I am essentially defending the standard usage
of probability theory, which is the **least** interesting and inventive part
of
PLN.  I'm defending the use of Bayes rule ... re-presenting the standard
Bayesian argument about the Hempel confirmation problem, etc.

This is rather a reversal of positions for me, as I more often these days
argue
with people who are hard-core Bayesians, who believe that explicitly doing
Bayesian inference is the key to AGI ... and  my argument with them is that
a) you need to supplement probability theory with heuristics, because
otherwise
things become intractable; b) these "heuristics" are huge and subtle and in
fact wind up constituting a whole cognitive architecture of which explicit
probability
theory is just one component (but the whole architecture appears to the
probabilistic-reasoning component as a set of heuristic assumptions).

So anyway this is  not, so far, so much of a "PLN versus NARS" debate as a
"probability theoretic AI versus NARS" debate, in the sense that none of the
more odd/questionable/fun/inventive parts of PLN are being invoked here ...
only the parts that are common to PLN and a lot of other approaches...

But anyway, back to defending Bayes and elementary probability theory in
its application to common sense reasoning (obviously Pei is not disputing
the actual mathematics!)

Maybe in this reply I will get a chance to introduce some of the more
interesting
aspects of PLN, we'll see...


>
> Since each inference rule usually only considers two premises, whether
> the meanings of the involved concepts are rich or poor (i.e., whether
> they are also involved in other statements not considered by the rule)
> shouldn't matter in THAT STEP, right?



It doesn't matter in the sense of determining
what the system does in that step, but it matters in terms
of the human "intuitiveness evaluation" of that step, because we are
intuitively accustomed to evaluating inferences regarding rich concepts
that have a lot of links, and for which we have some intuitive understanding
of the relevant term probabilities.




>
>
> >> Further questions:
> >>
> >> (1) Don't you intuitively feel that the evidence provided by
> >> non-swimming birds says more about "Birds are swimmers" than
> >> "Swimmers are birds"?
> >
> > Yes, but only because I know intuitively that swimmers are more common
> > in my everyday world than birds.
>
> Please note that this issue is different from our previous debate.
> "Node probability" have nothing to do with the asymmetry in
> induction/abduction.



I don't remember our previous debate and don't have time to study my
email archives (I don't really have time to answer this email, but I'm
doing it anyway ;-) ...

Anyway, in PLN, if we map "is" into ExtensionalInheritance, then the point
that

P(swimmer | bird) P(bird) = P(bird | swimmer) P(swimmer)

lets me answer your question without even thinking much about the context.

Due to Bayes rule, in any Bayesian inference system, evidence for one of
{ P(swimmer|bird), P(bird|swimmer) } may be considered as evidence for the
other, in principle.  [How that evidence is propagated through the system's
memory is another question, etc. etc.]  And Bayes rule tells you how to
convert evidence for one of these conditionals into evidence for the other.
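
To make this concrete, here is a minimal Python sketch of the conversion
(just the bare Bayes-rule arithmetic, with invented node probabilities; this
is not PLN's actual evidence-propagation machinery):

# Toy illustration: converting an estimate of P(swimmer|bird) into an
# estimate of P(bird|swimmer) via Bayes rule.  The node probabilities
# are made-up numbers, chosen only to show the mechanics.

p_bird = 0.01                # node probability of "bird" (assumed)
p_swimmer = 0.05             # node probability of "swimmer" (assumed)
p_swimmer_given_bird = 0.3   # suppose evidence supports this estimate

# Bayes rule: P(bird|swimmer) = P(swimmer|bird) * P(bird) / P(swimmer)
p_bird_given_swimmer = p_swimmer_given_bird * p_bird / p_swimmer

print(p_bird_given_swimmer)  # ~0.06 -- the node probabilities supply
                             # the quantitative conversion factor

So evidence bearing on one conditional constrains the other, scaled by the
ratio of the node probabilities.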

Getting back to the "odd versus standard" aspects of PLN, if we introduce an
odd
aspect we can model "is" as IntensionalInheritance, or a weighted average of
ExtensionalInheritance and IntensionalInheritance.

In the intensional case, for instance,

"bird is swimmer"

comes out to mean

P(X is in PAT_swimmer | X is in PAT_bird)

where PAT_A is the fuzzy set of patterns in A.

A quick cut and paste from the PLN book, page 257, here:

***
Note a significant difference from NARS here. In NARS, it is assumed that X
inherits from Y if X extensionally inherits from Y but Y intensionally
inherits from (inherits properties from) X. We take a different approach
here. We say that X inherits from Y if X's members are members of Y, and
the properties associated with X are also associated with Y. The duality of
properties and members is taken to be provided via Bayes' rule, where
appropriate.
***

But even so we have

P(X in PAT_swimmer | X in PAT_bird) P(X in PAT_bird) =
P(X in PAT_bird | X in PAT_swimmer) P(X in PAT_swimmer)

so at the moment I don't see how the introduction of PLN intensional
inheritance affects the points we're discussing... because in PLN, intension
boils down to extension on sets of patterns (where simple, compression-based
"pattern theory" definitions are used to define fuzzy set membership in
pattern sets).

(In the artificial test cases you're discussing here, the fuzzy sets PAT_A
would
all be empty anyway, except for the relations introduced in the test
inferences,
which then would have to be excluded by the trail mechanism from use in both
the PAT_A sets and the inferences we're discussing, to avoid double-counting
of evidence....)
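
(Tangentially, for concreteness: here is a toy Python sketch of the
"intension boils down to extension on pattern sets" idea.  The example
pattern sets and the particular fuzzy-conditional formula are illustrative
assumptions I'm making for this email, not the exact definitions from the
PLN book:)

# Toy sketch: intensional inheritance as extensional inheritance over
# fuzzy pattern sets.  PAT_A maps each pattern to a fuzzy membership
# degree in [0,1]; the sets and formula below are illustrative only.

PAT_bird = {"has_wings": 0.9, "lays_eggs": 0.8, "flies": 0.7}
PAT_swimmer = {"streamlined_body": 0.8, "lays_eggs": 0.5, "flies": 0.2}

def fuzzy_conditional(pat_b, pat_a):
    """Estimate P(X in pat_b | X in pat_a), with min() as fuzzy intersection."""
    patterns = set(pat_a) | set(pat_b)
    overlap = sum(min(pat_a.get(p, 0.0), pat_b.get(p, 0.0)) for p in patterns)
    total = sum(pat_a.get(p, 0.0) for p in patterns)
    return overlap / total if total else 0.0

# "bird is swimmer" (intensionally) ~ P(X in PAT_swimmer | X in PAT_bird)
print(fuzzy_conditional(PAT_swimmer, PAT_bird))  # ~0.29 with these toy numbers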


>
>
> For example, "non-swimmer birds" is negative evidence for "Birds are
> swimmers" but irrelevant to "Swimmers are birds", while "non-bird
> swimmers" is negative evidence for "Swimmers are birds" but irrelevant
> to "Birds are swimmers". No matter which of the two nodes is more
> common, you cannot have both cases right.


When you say "irrelevant", you mean according to your NARS logic or your
own personal human intuition, but not necessarily according to probability
theory.

Also, when you say "right", what you seem to mean is "in agreement with
NARS logic" or "Pei's intuition"

 ... is that right?  ;-)


>
>
> >> (2) If your answer for (1) is "yes", then think about "Adults are
> >> alcohol-drinkers" and "Alcohol-drinkers are adults" --- do they have
> >> the same set of counter examples, intuitively speaking?
> >
> > Again, our intuitions for this are colored by the knowledge that there
> > are more adults than alcohol-drinkers.
>
> As above, the two sets of counter examples are "non-alcohol-drinking
> adult" and "non-adult alcohol-drinker", respectively. The fact that
> these two statements have different negative evidence has nothing to
> do with the size of the related sets (node probability).


The node probabilities tell you the quantitative factor via which evidence
in favor of P(A|B) converts into evidence in favor of P(B|A).  Thus
according
to Bayes rule, they are relevant.
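
As a quick worked example (with invented numbers): if P(adult) = 0.7 and
P(drinker) = 0.3, then Bayes rule gives

P(adult | drinker) = P(drinker | adult) P(adult) / P(drinker)
                   = P(drinker | adult) * (0.7 / 0.3)

so knowledge about either conditional converts into knowledge about the
other, scaled by the ratio of node probabilities -- here, a factor of 7/3.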


>
> What I argued is: the counter evidence of statement "A is B" is not
> counter evidence of the converse statement "B is A", and vice versa.
> You cannot explain this in both directions by node probability.



Yes, and your argument based on {your intuition plus your assumed logic}
contradicts my argument based on {my intuition plus Bayes rule}.

As I said, node probability explains the conversion of (both intensional
and extensional) knowledge about either of {B is A, A is B} into knowledge
about the other, via a certain quantitative conversion factor.


>
>
> >> (3) According to your previous explanation, will PLN also take a red
> >> apple as negative evidence for "Birds are swimmers" and "Swimmers are
> >> birds", because it reduces the "candidate pool" by one? Of course, the
> >> probability adjustment may be very small, but qualitatively, isn't it
> >> the same as a non-swimming bird? If not, then what the system will do
> >> about it?
> >
>
...

>
>
> Well, actually your previous explanation is exactly the opposite of
> the standard Bayesian answer --- see
> http://en.wikipedia.org/wiki/Raven_paradox



Sorry, maybe I did not read your previous email sufficiently carefully
before replying.  I am confident there is nothing eccentric in PLN's
treatment of Hempel's raven situation, as I looked into this very carefully
a couple of years ago when you, Matt and I were writing a paper on the topic
(wow, we never finished that, did we????).  Rather than recapitulating that
whole train of thought in emails, I'd rather put the time into finishing
that paper and posting it somewhere ;-p


>
> To me, "small probability adjustments" is a bad excuse. No matter how
> small the adjustment is, as long as it is not infinitely small, it
> cannot always be ignored, since it will accumulate. If all non-bird
> objects are taken as (either positive or negative) evidence for "Birds
> are swimmers", then the huge number of them cannot be ignored.
>
> It is always possible to save a theory (probability theory, in this
> situation) if you are willing to pay the price. The problem is whether
> the price is too high.


I am not making excuses for probability theory here.  I don't think there is
any problem with it.

However, I want to be clear that I am not advocating PLN, or probability
theory as a whole, as an accurate model of **human reasoning** or of
**human intuition about how reasoning should be done**.

If probability theory or PLN differs from human intuition in some case, this
does NOT mean that PLN has a problem for which an "excuse" needs
to be made.

Similarly, when I throw a ball at a target, the trajectory I choose may be
different than the one that would be chosen by an automated device for
throwing a ball at a target.  This could be for multiple reasons.  One
reason could be that the device correctly implements the laws of physics
for this case, and my brain does not.  So I might miss the target more
often ... even though I would be throwing the ball according to the
trajectory
that "feels right" to me intuitively.  With training perhaps I'd be able to
improve my intuition so that it would better reflect the laws of physics.

I think the human brain approximates probability theory, and approximates
something like pattern theory for handling intensions.  If its
approximations suck, that doesn't mean the theory is wrong; it means that
either

a) the brain is just dumb in some aspects

b) the brain embodies heuristic approximations that are adaptive in some
contexts (the ones it evolved for) and maladaptive in others

If probabilistic calculations result in small probability-factors that seem
counterintuitive to us humans, then this means that "pure probability
theory" is a bad model for the human mind, but "heuristic approximations
to probability theory" could still be a good model.

Ultimately, though, modeling the human mind is not my main interest here --
if PLN wound up being a terrible model of human intuitive inference (which
seems not to be the case so far, but...) but a great basis for a rational
AGI system, I'd be pretty happy...

-- Ben G


