Ben,

Your reply raised several interesting topics, most of which cannot
be settled in this kind of email exchange. Therefore, I won't
address every one of them here, but will propose another solution in a
separate private email.

Going back to where this debate started: the asymmetry of
induction/abduction. To me, here is what the discussion has revealed
so far:

(1) The PLN solution is consistent with the Bayesian tradition and
probability theory in general, though it is counterintuitive.

(2) The NARS solution fits people's intuition, though it violates
probability theory.
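
To make the contrast concrete, here is a minimal sketch (in Python,
with toy sets invented purely for illustration, and intensional
evidence ignored) of the extensional evidence counting that NARS uses
for a statement "S is P":

# NARS-style extensional evidence counting for "S is P":
# positive evidence = |S and P|, negative evidence = |S - P|.
# The extensions below are toy data, invented for illustration.
birds    = {"robin", "sparrow", "penguin", "duck"}
swimmers = {"penguin", "duck", "fish"}

def evidence(S, P):
    positive = len(S & P)  # members of S that are also in P
    negative = len(S - P)  # members of S that are not in P
    return positive, negative

print(evidence(birds, swimmers))  # (2, 2): robin and sparrow count
                                  # against "birds are swimmers"
print(evidence(swimmers, birds))  # (2, 1): only fish counts against
                                  # "swimmers are birds"

The two statements collect different negative evidence, so induction
and abduction come out asymmetric, and no node probability enters the
counts; under probability theory, by contrast, Bayes rule ties the two
conditionals together.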

Please note that on this topic, what is involved is not just "Pei's
intuition" (though on some other topics, it is) --- Hempel's Paradox
looks counterintuitive to everyone, including you (as you have
admitted) and Hempel himself, though you, Hempel, and most of the
others involved in this research choose to accept the counterintuitive
conclusion, because of the belief that probability theory should be
followed in commonsense reasoning.
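
The standard Bayesian reading of the paradox can be made concrete with
a toy model (a minimal sketch; all the numbers are invented for
illustration):

# Toy Bayesian sketch of Hempel's Paradox (all numbers invented).
# A world of 1000 objects contains 10 ravens and 500 non-black
# non-ravens. Sample one non-black object; it turns out not to be a
# raven. Compare two hypotheses:
#   H1: all 10 ravens are black  -> 500 non-black objects in total
#   H2: 2 ravens are non-black   -> 502 non-black objects in total
p_obs_h1 = 500 / 500           # P(non-raven | non-black, H1) = 1.0
p_obs_h2 = 500 / 502           # P(non-raven | non-black, H2) ~ 0.996
prior_h1 = prior_h2 = 0.5      # indifferent prior over the hypotheses
posterior_h1 = (p_obs_h1 * prior_h1) / \
               (p_obs_h1 * prior_h1 + p_obs_h2 * prior_h2)
print(round(posterior_h1, 4))  # 0.501: the non-black non-raven
                               # *slightly* confirms "all ravens are
                               # black" -- the counterintuitive
                               # conclusion at issue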

As I said before, I don't think I can change your belief in
probability theory very soon. Therefore, as long as you think my above
summary is fair, I've reached my goal in this round of exchange.

Pei


On Sat, Oct 11, 2008 at 8:45 AM, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>
> Pei etc.,
>
> First high level comment here, mostly to the non-Pei audience ... then I'll
> respond to some of the details:
>
> This dialogue -- so far -- feels odd to me because I have not been
> defending anything special, peculiar or inventive about PLN here.
> There are some things about PLN that would be considered to fall into
> that category (e.g. the treatment of intension which uses my "pattern
> theory", and the treatment of quantifiers which uses third-order
> probabilities ... or even the use of indefinite truth values).  Those
> are the things that I would expect to be arguing about!  Even more
> interesting would be to argue about strategies for controlling
> combinatorial explosion in inference trees, which IMO is the truly
> crucial issue, more so than the particulars of the inference and
> uncertainty management formalism (though those particulars need to be
> workable too, if one is to have an AI with explicit inference as a
> significant component).
>
> Instead, in this dialogue, I am essentially defending the standard
> usage of probability theory, which is the **least** interesting and
> inventive part of PLN.  I'm defending the use of Bayes rule ...
> re-presenting the standard Bayesian argument about the Hempel
> confirmation problem, etc.
>
> This is rather a reversal of positions for me, as I more often these
> days argue with people who are hard-core Bayesians, who believe that
> explicitly doing Bayesian inference is the key to AGI ... and my
> argument with them is that a) you need to supplement probability
> theory with heuristics, because otherwise things become intractable;
> b) these "heuristics" are huge and subtle and in fact wind up
> constituting a whole cognitive architecture of which explicit
> probability theory is just one component (but the whole architecture
> appears to the probabilistic-reasoning component as a set of
> heuristic assumptions).
>
> So anyway this is  not, so far, so much of a "PLN versus NARS" debate as a
> "probability theoretic AI versus NARS" debate, in the sense that none of the
> more odd/questionable/fun/inventive parts of PLN are being invoked here ...
> only the parts that are common to PLN and a lot of other approaches...
>
> But anyway, back to defending Bayes and elementary probability theory
> in its application to common sense reasoning (obviously Pei is not
> disputing the actual mathematics!)...
>
> Maybe in this reply I will get a chance to introduce some of the more
> interesting aspects of PLN, we'll see...
>
>>
>>
>> Since each inference rule usually only considers two premises, whether
>> the meaning of the involved concepts is rich or poor (i.e., whether
>> they are also involved in other statements not considered by the rule)
>> shouldn't matter in THAT STEP, right?
>
> It doesn't matter in the sense of determining
> what the system does in that step, but it matters in terms
> of the human "intuitiveness evaluation" of that step, because we are
> intuitively accustomed to evaluating inferences regarding rich concepts
> that have a lot of links, and for which we have some intuitive understanding
> of the relevant term probabilities.
>
>
>
>>
>> >> Further questions:
>> >>
>> >> (1) Don't you intuitively feel that the evidence provided by
>> >> non-swimming birds says more about "Birds are swimmers" than
>> >> "Swimmers are birds"?
>> >
>> > Yes, but only because I know intuitively that swimmers are more common
>> > in my everyday world than birds.
>>
>> Please note that this issue is different from our previous debate.
>> "Node probability" has nothing to do with the asymmetry in
>> induction/abduction.
>
> I don't remember our previous debate and don't have time to study my
> email archives (I don't really have time to answer this email but I'm
> doing it anyway ;-) ...
>
> Anyway, in PLN, if we map "is" into ExtensionalInheritance, then the
> point that
>
> P(swimmer | bird) P(bird) = P(bird | swimmer) P(swimmer)
>
> lets me answer your question without even thinking much about the context.
>
> Due to Bayes rule, in any Bayesian inference system, evidence for one
> of { P(swimmer|bird), P(bird|swimmer) } may be considered as evidence
> for the other, in principle.  [How that evidence is propagated through
> the system's memory is another question, etc. etc.]  And Bayes rule
> tells you how to convert evidence for one of these conditionals into
> evidence for the other.
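
For concreteness, the conversion described here can be sketched with
invented node probabilities (a minimal sketch, not PLN's actual
machinery):

# Converting evidence about P(swimmer|bird) into evidence about
# P(bird|swimmer) via Bayes rule; all numbers invented.
p_bird, p_swimmer = 0.01, 0.20        # hypothetical node probabilities
for p_swim_given_bird in (0.5, 0.6):  # before / after new evidence
    p_bird_given_swim = p_swim_given_bird * p_bird / p_swimmer
    print(p_swim_given_bird, "->", round(p_bird_given_swim, 3))
# 0.5 -> 0.025 and 0.6 -> 0.03: the update carries over, scaled by
# the node-probability ratio p_bird / p_swimmer.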
>
> Getting back to the "odd versus standard" aspects of PLN: if we
> introduce an odd aspect, we can model "is" as IntensionalInheritance,
> or as a weighted average of ExtensionalInheritance and
> IntensionalInheritance.
>
> In the Intensional case, then, for instance
>
> "bird is swimmer"
>
> comes out to mean
>
> P(X is in PAT_swimmer | X is in PAT_bird)
>
> where PAT_A is the fuzzy set of patterns in A.
>
> A quick cut and paste from the PLN book, page 257, here:
>
> ***
> Note a significant difference from NARS here. In NARS, it is assumed
> that X inherits from Y if X extensionally inherits from Y but Y
> intensionally inherits from (inherits properties from) X. We take a
> different approach here. We say that X inherits from Y if X's members
> are members of Y, and the properties associated with X are also
> associated with Y. The duality of properties and members is taken to
> be provided via Bayes' rule, where appropriate.
> ***
>
> But even so we have
>
> P(X in PAT_swimmer | X in PAT_bird) P(X in PAT_bird) =
> P(X in PAT_bird | X in PAT_swimmer) P(X in PAT_swimmer)
>
> so at the moment I don't see how the introduction of PLN intensional
> inheritance affects the points we're discussing... because in PLN,
> intension boils down to extension on sets of patterns (where simple,
> compression-based "pattern theory" definitions are used to define
> fuzzy set membership in pattern sets).
>
> (In the artificial test cases you're discussing here, the fuzzy sets PAT_A
> would
> all be empty anyway, except for the relations introduced in the test
> inferences,
> which then would have to be excluded by the trail mechanism from use in both
> the PAT_A sets and the inferences we're discussing, to avoid double-counting
> of evidence....)
>
>>
>> For example, "non-swimmer birds" is negative evidence for "Birds are
>> swimmers" but irrelevant to "Swimmers are birds", while "non-bird
>> swimmers" is negative evidence for "Swimmers are birds" but irrelevant
>> to "Birds are swimmers". No matter which of the two nodes is more
>> common, you cannot get both cases right.
>
> When you say "irrelevant", you mean according to your NARS logic or your
> own personal human intuition, but not necessarily according to probability
> theory.
>
> Also, when you say "right", what you seem to mean is "in agreement with
> NARS logic" or "Pei's intuition"
>
>  ... is that right?  ;-)
>
>>
>> >> (2) If your answer for (1) is "yes", then think about "Adults are
>> >> alcohol-drinkers" and "Alcohol-drinkers are adults" --- do they have
>> >> the same set of counter examples, intuitively speaking?
>> >
>> > Again, our intuitions for this are colored by the knowledge that there
>> > are more adults than alcohol-drinkers.
>>
>> As above, the two sets of counter examples are "non-alcohol-drinking
>> adult" and "non-adult alcohol-drinker", respectively. The fact that
>> these two statements have different negative evidence has nothing to
>> do with the size of the related sets (node probability).
>
> The node probabilities tell you the quantitative factor via which
> evidence in favor of P(A|B) converts into evidence in favor of P(B|A).
> Thus, according to Bayes rule, they are relevant.
>
>>
>>
>> What I argued is: the counter evidence of statement "A is B" is not
>> counter evidence of the converse statement "B is A", and vice versa.
>> You cannot explain this in both directions by node probability.
>
> Yes, and your argument based on {your intuition plus your assumed
> logic} contradicts my argument based on {my intuition plus Bayes rule}.
>
> As I said, node probability explains the conversion of (both
> intensional and extensional) knowledge about any one of {B is A, A is
> B} into knowledge about the other, via a certain quantitative
> conversion factor.
>
>>
>> >> (3) According to your previous explanation, will PLN also take a red
>> >> apple as negative evidence for "Birds are swimmers" and "Swimmers are
>> >> birds", because it reduces the "candidate pool" by one? Of course, the
>> >> probability adjustment may be very small, but qualitatively, isn't it
>> >> the same as a non-swimming bird? If not, then what will the system do
>> >> about it?
>> >
>
> ...
>>
>>
>> Well, actually your previous explanation is exactly the opposite of
>> the standard Bayesian answer --- see
>> http://en.wikipedia.org/wiki/Raven_paradox
>
> Sorry, maybe I did not read your previous email sufficiently carefully
> before replying.  I am confident there is nothing eccentric in PLN's
> treatment of Hempel's raven situation, as I did look into this very
> carefully a couple of years ago when you, Matt and I were writing a
> paper on the topic (wow, we never finished that, did we????).  Rather
> than going through that whole train of thought again and
> recapitulating it in emails, I'd rather put the time into finishing
> that paper and posting it somewhere ;-p
>
>>
>>
>> To me, "small probability adjustments" is a bad excuse. No matter how
>> small the adjustment is, as far as it is not infinitely small, it
>> cannot be always ignored, since it will accumulate. If all non-bird
>> objects are taken as (either positive or negative) evidence for "Birds
>> are swimmers", then the huge number of them cannot be ignored.
>>
>> It is always possible to save a theory (probability theory, in this
>> situation) if you are willing to pay the price. The problem is whether
>> the price is too high.
>
> I am not making excuses for probability theory here.  I don't think there is
> any problem with it.
>
> However, I want to be clear that I am not advocating PLN, or
> probability theory as a whole, as an accurate model of **human
> reasoning** or **human intuition about how reasoning should be done**.
>
> If probability theory or PLN differs from human intuition in some
> case, this does NOT mean that PLN has a problem for which an "excuse"
> needs to be made.
>
> Similarly, when I throw a ball at a target, the trajectory I choose
> may be different than the one that would be chosen by an automated
> device for throwing a ball at a target.  This could be for multiple
> reasons.  One reason could be that the device correctly implements
> the laws of physics for this case, and my brain does not.  So I might
> miss the target more often ... even though I would be throwing the
> ball according to the trajectory that "feels right" to me intuitively.
> With training perhaps I'd be able to improve my intuition so that it
> would better reflect the laws of physics.
>
> I think the human brain approximates probability theory, and
> approximates something like pattern theory for handling intensions.
> If its approximations suck, that doesn't mean the theory is wrong; it
> means either
>
> a) the brain is just dumb in some aspects
>
> b) the brain embodies heuristic approximations that are adaptive in some
> contexts (the ones it evolved for) and maladaptive in others
>
> If probabilistic calculations result in small probability-factors that seem
> counterintuitive to us humans, then this means that "pure probability
> theory" is a bad model for the human mind, but "heuristic approximations
> to probability theory" could still be a good model.
>
> Ultimately, though, modeling the human mind is not my main interest
> here -- if PLN wound up a terrible model of human intuitive inference
> (which seems not to be the case so far, but...) but a great basis for
> a rational AGI system, I'd be pretty happy...
>
> -- Ben G