On Sat, Sep 20, 2008 at 2:22 PM, Abram Demski <[EMAIL PROTECTED]> wrote:
> It has been mentioned several times on this list that NARS has no
> proper probabilistic interpretation. But, I think I have found one
> that works OK. Not perfectly. There are some differences, but the
> similarity is striking (at least to me).

Abram,

There is indeed a lot of similarity between NARS and probability
theory. When I started this project, my plan was to use probability
theory to handle uncertainty. I moved away from it after becoming
convinced that what is needed cannot be fully obtained from that
theory and its extensions. Even so, NARS still agrees with probability
theory here and there, as mentioned in my papers.

The key, therefore, is whether NARS can be FULLY treated as an
application of probability theory, by following the probability
axioms, and only adding justifiable consistent assumptions when
necessary.

> I imagine that what I have come up with is not too different from what
> Ben Goertzel and Pei Wang have already hashed out in their attempts to
> reconcile the two, but we'll see. The general idea is to treat NARS as
> probability plus a good number of regularity assumptions that justify
> the inference steps of NARS. However, since I make so many
> assumptions, it is very possible that some of them conflict. This
> would show that NARS couldn't fit into probability theory after all,
> but it is still interesting even if that's the case...

I assume by "treat NARS as probability" you mean "to treat the
Frequency in NARS as a measurement following the axioms of probability
theory". I mentioned this because there is another measurement in
NARS, Expectation (which is derived from Frequency and Confidence),
which is also intuitively similar to probability.
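
For reference, here is roughly how these measurements are related, as
a small Python sketch (w+ is the positive evidence, w the total
evidence, and k the personality parameter, 1 by default):

    # Sketch of the NARS truth-value measurements.
    def frequency(w_plus, w):
        return w_plus / float(w)       # proportion of positive evidence

    def confidence(w, k=1):
        return w / float(w + k)        # approaches 1 as evidence accumulates

    def expectation(f, c):
        return c * (f - 0.5) + 0.5     # equals (w+ + k/2) / (w + k)

    f = frequency(3, 4)                # 0.75
    c = confidence(4)                  # 0.8
    e = expectation(f, c)              # 0.7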

> So, here's an outline. We start with the primitive inheritance
> relation, A inh B; this could be called "definite inheritance",
> because it means that A inherits all of B's properties, and B inherits
> all of A's instances. B is a superset of A. The truth value is 1 or 0.

Fine.

> Then, we define "probabilistic inheritance", which carries a
> probability that a given property of B will be inherited by A and that
> a given instance of A will be inherited by B.

There is a tricky issue here. When evaluating the truth value of
A-->B, NARS doesn't only check "properties" and "instances", but also
checks "supersets" and "subsets", intuitively speaking. For example,
when the system is told that "Swans are birds" and "Swans fly", it
derives "Birds fly" by induction. In this process "swan" is counted as
one piece of evidence, rather than as a set of instances. How many
swans the system knows doesn't matter in this step. That is why in the
definitions I use "extension/intension", not "instance/property":
the latter are just special cases of the former. Actually, the
truth value of A-->B measures how often the two terms can be
substituted for each other (in different ways), not how much one set
is included in the other, which is the usual probabilistic reading of
an inheritance.

This is one reason why NARS does not define "node probability".
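
To make the evidence counting concrete, here is a toy tally
(illustration only; the actual induction rule in NARS operates on the
truth values of the premises, not on raw tallies like this):

    # Toy evidence tally for the induction "bird --> fly".
    # Each shared term ("swan", "penguin") is ONE piece of evidence,
    # no matter how many individual swans the system knows about.
    birds = ["swan", "penguin"]              # terms in the extension of "bird"
    flies = {"swan": True, "penguin": False}
    w_plus = sum(1 for t in birds if flies[t])       # 1 (swan)
    w = len(birds)                                   # 2 (swan, penguin)
    f, c = w_plus / float(w), w / float(w + 1)       # f = 0.5, c = 0.67 (k = 1)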

> Probabilistic
> inheritance behaves somewhat like the full NARS inheritance: if we
> reason about likelihoods (the probability of the data assuming (A
> prob_inh B) = x), the math is actually the same EXCEPT we can only use
> primitive inheritance as evidence, so we can't spread evidence around
> the network by (1) treating prob_inh with high evidence as if it were
> primitive inh or (2) attempting to use deduction to accumulate
> evidence as we might want to, so that evidence for "A prob_inh B" and
> evidence for "B prob_inh C" gets combined to evidence for "A prob_inh
> C".

Besides the problem you mentioned, there are other issues. Let me
start with the basic ones:

(1) In probability theory, an event E has a constant probability P(E)
(which can be unknown). Given the assumption of insufficient knowledge
and resources, in NARS P(A-->B) would change over time, as more and
more evidence is taken into account. This process cannot be treated as
conditioning because, among other things, the system can neither
explicitly list all evidence as a condition, nor update the
probability of all statements in the system for each piece of new
evidence (so as to treat all background knowledge as a default
condition). Consequently, at any moment P(A-->B) and P(B-->C) may be
based on different, though unspecified, data, so it is invalid to use
them in a rule to calculate the "probability" of A-->C: probability
theory does not allow cross-distribution probability calculation.
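
To see the point with concrete (made-up) numbers: even within a single
distribution, P(B|A) and P(C|B) do not determine P(C|A), so across two
unspecified distributions the calculation is hopeless. A sketch:

    # Two joint distributions over binary A, B, C that agree on
    # P(B|A) = 0.8 and P(C|B) = 0.8 but disagree on P(C|A).
    from itertools import product

    def prob(joint, event):
        return sum(p for w, p in joint.items() if event(w))

    def cond(joint, target, given):
        return prob(joint, lambda w: target(w) and given(w)) / prob(joint, given)

    def build(pc_ab, pc_nab):
        # P(A) = 0.5, P(B|A) = P(B|not A) = 0.8, and C occurs only inside B
        joint = {}
        for a, b, c in product([0, 1], repeat=3):
            pb = 0.8 if b else 0.2
            pc = (pc_ab if a else pc_nab) if b else 0.0
            joint[(a, b, c)] = 0.5 * pb * (pc if c else 1 - pc)
        return joint

    for d in (build(1.0, 0.6), build(0.6, 1.0)):
        print(cond(d, lambda w: w[1], lambda w: w[0]),  # P(B|A) = 0.8 in both
              cond(d, lambda w: w[2], lambda w: w[1]),  # P(C|B) = 0.8 in both
              cond(d, lambda w: w[2], lambda w: w[0]))  # P(C|A): 0.8 vs. 0.48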

(2) For the same reason, in NARS a statement might get different
"probability" values attached when derived from different evidence.
Probability theory does not have a general rule to handle
inconsistency within a probability distribution.
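
What NARS provides instead is the revision rule: when the same
statement comes with two truth values based on (assumedly) distinct
bodies of evidence, the evidence is pooled, rather than the
"inconsistency" rejected. A sketch:

    # NARS revision: pool the evidence behind two judgments on the
    # same statement, e.g. "A --> B" from two different sources.
    def revise(wp1, w1, wp2, w2, k=1):
        wp, w = wp1 + wp2, w1 + w2
        return wp / float(w), w / float(w + k)   # new (Frequency, Confidence)

    f, c = revise(4, 4, 1, 5)   # source 1: 4/4 positive; source 2: 1/5
    # f = 0.56, c = 0.9: higher confidence than either judgment alone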

> So, we can define a second-order-probabilistic-inheritance "prob_inh2"
> that is for prob_inh what prob_inh is for inh. We can define a
> third-order over the second-order, a fourth over the third, and so
> on. In fact, each of these is a generalization: simple inheritance can
> be seen as a special case of prob_inh (where the probability is 1),
> prob_inh is a special case of prob_inh2, and so on. This means we can
> define an infinite-order probabilistic inheritance, prob_inh_inf,
> which is a generalization of any given level. The truth value of
> prob_inh_inf will be very complicated (since each prob_inhN has a more
> complicated truth value than the last, and prob_inh_inf will include
> the truth values from each level).

Higher-order probability has been introduced by several people in
various ways, but none has gone very far. Besides the infinite
regress and the computational complexity, it does not really
capture the uncertainty in the first-order probability in the desired
way.

In NARS, Confidence is introduced as a "higher-order uncertainty",
because it, in a sense, measures the uncertainty in the Frequency
value. However, mathematically it is not a 2nd-order probability. For
example, if for A-->B the Frequency is 0.5, and Confidence is 0, it
really means "I have no idea" (no evidence), rather than "The true
probability of A-->B is NOT 0.5". The latter one, as a 2nd-order
probability statement, can be defined without any mathematical
trouble, but it won't be useful in reasoning.
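
The "no idea" case, in the same sketch notation:

    # Total ignorance: no evidence at all, so w = 0.
    w, k = 0, 1
    c = w / float(w + k)   # c = 0; Frequency is undefined (0/0)
    # Expectation = c * (f - 0.5) + 0.5 = 0.5 for ANY f, i.e. indifference,
    # not a claim that the true probability differs from 0.5.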

> My proposal is to add 2 regularity assumptions to this structure.
> First, we assume that the prior over probability values for prob_inh
> is even. This gives us some permission to act like the probability
> and the likelihood are the same thing, which brings the math closer to
> NARS.

That is intuitively acceptable, if interpreted properly.
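
In fact, with a uniform (Beta(1,1)) prior, the posterior mean after w+
positive cases out of w is Laplace's rule of succession, which
coincides with the NARS Expectation when k = 2:

    # Uniform prior => posterior Beta(w+ + 1, w - w+ + 1), whose mean
    # (w+ + 1) / (w + 2) is Laplace's rule of succession.
    def laplace(wp, w):
        return (wp + 1) / float(w + 2)

    def nars_expectation(wp, w, k=2):
        return (wp + k / 2.0) / (w + k)   # NARS: e = c*(f - 1/2) + 1/2

    assert abs(laplace(3, 4) - nars_expectation(3, 4)) < 1e-12   # both 0.666...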

> Second, assume that a "high" truth value on one level strongly
> implies a high one on the next level, and similarly that low implies
> low.

The first half is fine, but the second isn't. As the previous example
shows, in NARS a high Confidence does imply that the Frequency value
is a good summary of the evidence, but a low Confidence does not imply
that the Frequency is bad, just that it is not very stable.

> They will already weakly imply each other, but I think the math
> could be brought closer to NARS with a stronger assumption. I don't
> have any precise suggestions however. The idea here is to allow
> evidence that properly should only be counted for prob_inh2 to count
> for prob_inh as well, which is the case in NARS. This is point (1)
> above. More generally, it justifies the NARSian practice of using the
> simple prob_inh likelihood as if it were a likelihood for
> prob_inh_inf, so that it recursively acts on other instances of itself
> rather than only on simple inh.

If you work out a detailed solution along your path, you will see that
it will be similar to NARS when both are doing deduction with strong
evidence. The difference will show up (1) in cases where evidence is
rare, and (2) in non-deductive inferences, such as induction and
abduction. I believe this is also where NARS and PLN differ most.

> Of course, since I have not given precise definitions, this solution
> is difficult to evaluate. But, I thought it would be of interest.

I appreciate your effort.

Pei

> --Abram Demski

