Richard,
The idea of the PLN semantics underlying Novamente's probabilistic
truth values is that we can have **both**
-- simple probabilistic truth values without highly specific interpretation
-- more complex, logically refined truth values, when this level of
precision is necessary
To make the discussion more concrete, I'll use a specific example
to do with virtual animals in Second Life. Our first version of the
virtual pets won't use PLN in this sort of way; it'll be focused on MOSES
evolutionary learning; but, this is planned for the second version and
is within the scope of what Novamente can feasibly be expected to
do with modest effort.
Consider an avatar identified as Bob_Yifu.
And consider the concept of "friend", which is a ConceptNode
-- associated to the WordNode "friend" via a learned ReferenceLink
-- defined operationally via a number of links such as
ImplicationLink
    AND
        InheritanceLink X friend
        EvaluationLink near (I, X)
    Pleasure
(this one just says that being near a friend confers pleasure. Other
links about friendship may contain knowledge such as that friends
often give one food, friends help one find things, etc.)
The concept of "friend" may be learned, via mining of the animal's
experience-base --
basically, this is a matter of learning that there are certain predicates
whose SatisfyingSets (the set of Atoms that fulfill the predicate)
have significant intersection, and creating a ConceptNode to denote
that intersection.
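As a rough sketch of that mining step (the predicate names and the toy experience base below are invented for illustration, not actual Novamente structures):

```python
# Hypothetical sketch of concept formation by SatisfyingSet mining.
# Predicates and atoms are illustrative, not real Novamente API calls.

def satisfying_set(predicate, atoms):
    """Return the set of atoms that fulfill the predicate."""
    return {a for a in atoms if predicate(a)}

# Toy experience base: avatars the virtual pet has observed.
atoms = {"Bob_Yifu", "Sue_Avatar", "Hostile_Av", "Stranger_Av"}
gives_food = lambda a: a in {"Bob_Yifu", "Sue_Avatar"}
near_is_pleasant = lambda a: a in {"Bob_Yifu", "Sue_Avatar", "Stranger_Av"}

# The two SatisfyingSets intersect significantly, so a new ConceptNode
# ("friend") is created to denote the intersection.
friend_extension = satisfying_set(gives_food, atoms) & \
                   satisfying_set(near_is_pleasant, atoms)
print(sorted(friend_extension))  # ['Bob_Yifu', 'Sue_Avatar']
```

In the real system the "significant intersection" test would of course be statistical rather than exact set intersection.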
Then, once the concept of "friend" has been formed, more links pertaining
to it may be learned via mining the experience base and via inference rules.
Then, we may find that
InheritanceLink Bob_Yifu friend <.9,1>
(where the <.9,1> is an interval probability, interpreted according to
the indefinite probabilities framework). This link mixes intensional
and extensional inheritance, and thus is only useful for heuristic
reasoning (which is, however, a very important kind).
What this link means is basically that Bob_Yifu's node in the memory
has a lot of the same links as the "friend" node -- or rather, that it
**would**, if all its links were allowed to exist rather than being
pruned to save memory. So, note that the semantics are actually
tied to the mind itself.
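To make that "shares a lot of the same links" reading concrete, here is a toy sketch; the link names and the overlap-fraction measure are my illustrative inventions, not PLN's actual strength formula:

```python
# Illustrative sketch only: estimating the strength of
# "InheritanceLink Bob_Yifu friend" as the overlap between the links
# attached to the two nodes in the system's memory.

friend_links = {"gives_food", "helps_find_things", "pleasant_to_be_near",
                "responds_to_name", "shares_toys"}
bob_links    = {"gives_food", "helps_find_things", "pleasant_to_be_near",
                "responds_to_name", "plays_fetch"}

# Fraction of the friend-node's links that Bob_Yifu's node also has.
strength = len(friend_links & bob_links) / len(friend_links)
print(strength)  # 0.8
```

The point of the sketch is just that the number is defined relative to what links happen to exist in this particular memory, which is why the semantics are tied to the mind itself.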
Or we can make more specialized logical constructs if we really
want to, denoting stuff like
-- at certain times Bob_Yifu is a friend
-- Bob displays some characteristics of friendship very strongly,
and others not at all
-- etc.
We can also do crude, heuristic contextualization like
ContextLink <.7,.8>
    home
    InheritanceLink Bob_Yifu friend
which suggests that Bob is less friendly at home than
in general.
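The heuristic behind such a ContextLink can be sketched as re-estimating the strength from only the observations made in the given context (the event counts below are invented for illustration):

```python
# Rough sketch of contextualization: the same strength estimate,
# restricted to observations from one context. Counts are invented.

observations = [
    # (context, acted_friendly)
    ("home", True), ("home", False), ("home", True), ("home", False),
    ("park", True), ("park", True), ("park", True), ("park", False),
]

def strength(obs, context=None):
    """Fraction of (optionally context-restricted) friendly observations."""
    relevant = [friendly for c, friendly in obs
                if context is None or c == context]
    return sum(relevant) / len(relevant)

print(strength(observations))          # 0.625  (in general)
print(strength(observations, "home"))  # 0.5    (lower at home)
```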
Again this doesn't capture all the subtleties of Bob's friendship in
relation to being at home -- and one could do so if one wanted to, but
it would require introducing a larger complex of nodes and links,
which is not always the most appropriate thing to do.
The PLN inference rules are designed to give heuristically
correct conclusions based on heuristically interpreted links;
or more precise conclusions based on more precisely interpreted
links.
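As one concrete illustration of such a rule, here is a single-strength-value sketch of probabilistic deduction of the kind PLN uses (the real rules operate on indefinite probabilities and include consistency checks; the particular numbers fed in below are invented):

```python
# Sketch of a probabilistic deduction rule in the PLN style, on plain
# strength values rather than indefinite probabilities.

def deduction(sAB, sBC, sB, sC):
    """Strength of A->C from A->B and B->C, under an independence
    assumption for C conditional on B and on not-B."""
    if sB >= 1.0:
        return sBC
    # P(C|~B) inferred from the term probabilities of B and C.
    s_C_given_notB = (sC - sB * sBC) / (1.0 - sB)
    return sAB * sBC + (1.0 - sAB) * s_C_given_notB

# E.g. "Bob is a friend" (.9) and "friends give food" (.8), with
# invented term probabilities sB=.5, sC=.6:
print(round(deduction(0.9, 0.8, sB=0.5, sC=0.6), 3))  # 0.76
```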
Finally, the semantics of PLN relationships is explicitly an
**experiential** semantics. (One of the early chapters in the PLN
book, to appear via Springer next year, is titled "Experiential
Semantics.") So, all node and link truth values in PLN are
intended to be settable and adjustable via experience, rather than
via programming or importation from databases or something like
that.
Now, the above example is of course a quite simple one.
Discussing a more complex example would go beyond the scope
of what I'm willing to do in an email conversation, but the mechanisms
I've described are not limited to such simple examples.
I am aware that identifying Bob_Yifu as a coherent, distinct entity is
a problem faced by humans and robots, and eliminated via the
simplicity of the SL environment. However, there is detailed
discussion in the (proprietary) NM book of how these same mechanisms
may be used to do object recognition and classification, as well.
You may of course argue that these mechanisms won't scale up
to large knowledge bases and rich experience streams. I believe that
they will, and have arguments but not rigorous proofs that they will.
-- Ben G
On Nov 13, 2007 12:34 PM, Richard Loosemore <[EMAIL PROTECTED]> wrote:
Mark Waser wrote:
> I'm going to try to put some words into Richard's mouth here since
> I'm curious to see how close I am . . . . (while radically changing
> the words).
>
> I think that Richard is not arguing about the possibility of
> Novamente-type solutions as much as he is arguing about the
> predictability of *very* flexible Novamente-type solutions as they
> grow larger and more complex (and the difficulty in getting it to not
> instantaneously "crash-and-burn"). Indeed, I have heard a very faint
> shadow of Richard's concerns in your statements about the "tuning"
> problems that you had with BioMind.
This is true, but not precise enough to capture the true nature of
my worry.
Let me focus on one aspect of the problem. My goal here is to describe
in a little detail how the Complex Systems Problem actually bites in a
particular case.
Suppose that in some significant part of Novamente there is a
representation system that uses "probability" or "likelihood" numbers to
encode the strength of facts, as in [I like cats](p=0.75). The (p=0.75)
is supposed to express the idea that the statement [I like cats] is in
some sense "75% true".
[Quick qualifier: I know that this oversimplifies the real situation in
Novamente, but I need to do this simplification in order to get my point
across, and I am pretty sure this will not affect my argument, so bear
with me].
We all know that this p value is not quite a "probability" or
"likelihood" or "confidence factor". It plays a very ambiguous role in
the system, because on the one hand we want it to be very much like a
probability in the sense that we want to do calculations with it: we
NEED a calculus of such values in order to combine facts in the system
to make inferences. But we also do not want to lock ourselves into a
particular interpretation of what it means, because we know full well
that we do not really have a clear semantics for these numbers.
Either way, we have a problem: a fact like [I like cats](p=0.75) is
ungrounded because we have to interpret it. Does it mean that I like
cats 75% of the time? That I like 75% of all cats? 75% of each cat?
Are the cats that I like always the same ones, or is the chance of an
individual cat being liked by me something that changes? Does it mean
that I like all cats, but only 75% as much as I like my human family,
which I like(p=1.0)? And so on and so on.
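The force of this objection is that the same experience stream can support several incompatible readings of the same number. A toy sketch (the experience log and both reading rules are invented for illustration):

```python
# Two readings of "[I like cats](p=0.75)" computed from one toy log.

# (cat, encounter_was_liked) pairs from an invented experience log.
log = [("tabby", True), ("tabby", True), ("siamese", False),
       ("siamese", True), ("persian", True), ("persian", False),
       ("manx", True), ("manx", True)]

# Reading 1: "I like cats 75% of the time" (fraction of encounters).
p_time = sum(liked for _, liked in log) / len(log)

# Reading 2: "I like 75% of all cats" (fraction of individual cats
# liked on a majority of their encounters).
cats = {c for c, _ in log}
def majority_liked(cat):
    enc = [liked for c, liked in log if c == cat]
    return sum(enc) * 2 > len(enc)
p_cats = sum(majority_liked(c) for c in cats) / len(cats)

print(p_time, p_cats)  # 0.75 0.5 -- the two readings disagree
```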
Digging down to the root of this problem (and this is the point where I
am skipping from baby stuff to hard core AI) we want these numbers
to be
semantically compositional and interpretable, but in order to make sure
they are grounded, the system itself is going to have to build them and
interpret them without our help ... and it is not clear that this
grounding can be completely implemented. Why is it not clear? Because
when you try to build the entire grounding mechanism(s) you are forced
to become explicit about what these numbers mean, during the process of
building a grounding system that you can trust to be doing its job:
you
cannot create a mechanism that you *know* is constructing sensible p
numbers and facts during all of its development *unless* you finally
bite the bullet and say what the p numbers really mean, in fully cashed
out terms.
[Suppose you did not do this. Suppose you built the grounding mechanism
but remained ambiguous about the meaning of the p numbers. What would
the resulting system be computing? From end to end it would be
building
facts with p numbers, but you the human observer would still be imposing
an interpretation on the facts. And if you are still doing anything to
interpret, it cannot be grounded].
Now, as far as I understand it, the standard approach to this
conundrum
is that researchers (in Novamente and elsewhere) do indeed make an
attempt to disambiguate the p numbers, but they do it by developing more
sophisticated logical systems. First, perhaps, error-value bands of p
values instead of sharp values. And temporal logic mechanisms to deal
with time. Perhaps clusters of p and q and r and s values, each with
some slightly different zones of applicability. More generally, people
try to give structure to the qualifiers that are appended to the facts:
[I like cats](qualifier=value) instead of [I like cats](p=0.75).
The question is, does this process of refinement have an end? Does it
really lead to a situation where the qualifier is disambiguated and the
semantics is clear enough to build a trustworthy grounding system? Is
there a closed-form solution to the problem of building a logic that
disambiguates the qualifiers?
Here is what I think will happen if this process is continued. In
order
to make the semantics unambiguous enough to let the system ground its
own knowledge without the interpretation of p values, researchers will
develop more and more sophisticated logics (with more and more
structured replacements for that simple p value), until they are forced
to introduce ideas that are so complicated that they do not allow you to
do the full job of compositionality any more: you cannot combine some
facts and have the combination of the complicated p-structures still be
interpretable. For example, if the system is encoded with such stuff as
[I like cats](general-likelihood=0.75 +- 0.05,
mood-variability=0.10 +-0.01,
time-stability=0.99 +0.005- 0.03,
overall-unsureness=0.07,
special-circumstances-count=5 )
Then can we be *absolutely* sure that a combination of facts of this
sort is going to preserve its accuracy across long ranges of inference?
Can we combine this fact with an [I am allergic to cats](....) fact to
come to a clear conclusion about the proposition [I want to sit down and
let Sleti jump onto my lap](....)?
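One way to make this worry concrete: even with a simple interval qualifier and a well-defined combination rule, the intervals widen along a chain of inferences until the conclusion carries little usable information. The combination rule below is standard interval arithmetic under unknown dependence (Frechet bounds), not PLN's actual rule:

```python
# Propagating an interval strength through a chain of conjunctions.

def and_interval(a, b):
    """Interval for P(A and B) when the dependence between A and B
    is unknown (Frechet bounds)."""
    lo = max(0.0, a[0] + b[0] - 1.0)
    hi = min(a[1], b[1])
    return (lo, hi)

fact = (0.70, 0.80)        # e.g. "[I like cats]" with interval strength
conclusion = fact
for _ in range(3):         # conjoin three more such facts
    conclusion = and_interval(conclusion, fact)

print(conclusion)  # (0.0, 0.8) -- the lower bound has become vacuous
```

PLN's independence assumptions are precisely what keep the intervals from degenerating this fast; the question at issue is whether those assumptions stay trustworthy over long inference chains.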
If we built a calculus to handle such structured facts, would we be
kidding ourselves about whether the semantics was *really*
compositional...? Or would we just be sweeping the ambiguity of the
interpretation of these facts under the carpet? Hiding the ambiguity
inside an impossibly dense thicket of qualfiers?
Here, then, are the two conclusions from this phase of my comment:
1) I do not believe anyone seriously knows if there is any end to the
research process of trying to get a logic that does this disambiguation.
I think it is an endeavor driven by pure hope.
2) I believe that, in the end, this search for a good enough logic will
result in the construction of a grounding system (i.e. a mechanism that is
able to pick up and autonomously interpret all its own facts about the
world) that actually has NOT been disambiguated, and that for this
reason it will start to fall apart when used in large scale situations -
with large numbers of facts and/or over large stretches of autonomous
functioning. I think people will sweep the disambiguation problem under
the carpet, and then only notice that they are getting bitten by it when
the large-scale system does not seem to generate coherent, sensible
knowledge when left to its own devices.
This second point is where I finally meet up with your comment about
problems on the larger scale, and the system crashing and burning. I
think it will be a slowish crash. Incidentally, I presume I do not
need
to labor the point about how this will probably appear on the larger
scale but might not be so obvious for small scale or toy demonstrations
of the mechanisms.
I need to finish by making a point about what I see as the underlying
cause of this problem.
The whole thing started because we wanted our p numbers to be
interpretable. What I believe will happen as a result of imposing this
design constraint is that we severely restrict the space of possible
grounding mechanisms that we allow ourselves to consider. By doing so,
we box ourselves into an increasingly tight corner, searching for a
solution that preserves compositional semantics, THEN quietly giving up
on the idea when we get into the depths of some horrendous
temporal/pragmatic/affective/case-based logic 8-) that we cannot, after
all, interpret ...... and then, having boxed ourselves into that
neighborhood of the space of all possible representational systems, we
find that there simply is no solution, given all those constraints.
(But, being stubborn, we carry on hacking away at it forever anyway).
So what is the solution? Well, easy: do not even try to make those p
numbers interpretable. Build systems that build their own
representations, give 'em p numbers to play with (and q and r and s
numbers if they want them), but let the mechanisms themselves use those
numbers without ever trying to exactly interpret them. Frankly, why
should we expect those numbers to be interpretable at all? Why should
we expect there to be a *calculus* that allows us to prove that a system
is truth-preserving?
In such a system the "truth value" of a fact would not be represented
inside the object(s) that encoded the fact, it would be the result of a
cluster of objects constraining one another. So, if the system has in
it the fact [I like cats], this would be connected to a host of other
facts, in such a way that if the system were asked "Do you like cats?"
it would build a large representation of the question and the
implications that were relevant in the present context, and the result
of all those objects interacting would be the thing that generated the
answer. If the person were responding to a questionnaire that forced
them to give an answer on a continuous scale between 0 and 100, they
might well put their mark at the 75% level, but this would not be the
result of retrieving a p value, it would be a nebulous, fleeting result
of the interaction of all the structures involved (and next time they
were asked, the value would probably be different).
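A toy sketch of that alternative picture (the fact store, the keyword-activation rule, and all the weights are my inventions, meant only to show an answer assembled per-query rather than retrieved):

```python
# No stored p value for "I like cats"; the answer is recomputed each
# time from whichever related facts the current context activates.

facts = {
    "cats are soft":           +0.6,
    "cats purr":               +0.5,
    "I am allergic to cats":   -0.8,
    "a cat scratched me once": -0.3,
}

def answer(context):
    """Aggregate only the facts that the context happens to activate
    (crude keyword matching stands in for real relevance spreading)."""
    active = [w for f, w in facts.items() if any(k in f for k in context)]
    return sum(active) / len(active) if active else 0.0

# The "same" question yields different values in different contexts:
print(answer({"soft", "purr"}))           # positive on a good day
print(answer({"allergic", "scratched"}))  # negative while sneezing
```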
Similarly, if the system were trying to decide whether or not to allow a
particular cat to jump up on its lap, given that it generally liked
cats, but was somewhat allergic, the decision would not be the
result of
a combination of p numbers (be they ever so complicated), but the result
of a collision of some huge, extended structures involving many facts.
The collision would certainly involve some weighing of p, q, r and s
(etc.) numbers stored in these objects, but these numbers would not be
interpretable, and the combination process would not be consistent with
a logical calculus.
There is much more that could be said about the methodology needed to
find mechanisms that could do this, but leaving that aside for the
moment, there is just the big philosophical question of whether to give
up our obsession with interpretable semantics, or whether to be so
scared of complex systems (because of course, such a system would very
likely introduce the Complex Systems Bogeyman) that we do not dare
try it.
That is a huge difference in philosophy. It is not just a small matter
of technique, it is a huge perspective change.
So, to conclude, when I say that intelligence involves an irreducible
amount of complexity, I mean only that there are some situations in the
design of AGI systems, like the case I have just described, where I see
people going through a bizarre process:
Step 1) We decide that we must make our AGI as non-complex as
possible, so we can prove *something* about how knowledge-bits combine
to make reliable new knowledge-bits (in the above case, try to make it
as much like a probability calculus or logical calculus as possible,
because we know
that in the purest examples of such things, we can preserve truth as
knowledge is added).
Step 2) We are eventually forced to compromise our principles and
introduce hacks that flush the truth-preservation guarantees down the
toilet: in the above case, we complicate the qualifiers in our logic
until we can no longer really be sure what the semantics is when we
combine them (and in the related case of inference control engines, we
allow such engines to do funky, truncated explorations of the space of
possible inferences, with unpredictable consequences).
Step 3) We then refuse to acknowledge that what we have got, now, is
a compromise that *is* a complex system: its overall behavior is subtly
dependent on interactions down at the low level. One reason that we get
away with this blindness for so long is that it does not necessarily
show itself in small systems or in relatively small scale runs, or in
systems where the developmental mechanisms (the worst culprits for
bringing out the complexity) have not yet been implemented.
Step 4) Having let some complexity in through the back door, we then
keep hacking away at the design, hoping that somewhere in the design
neighborhood there is a solution that is both ALMOST compositional (i.e.
interpretable semantics, truth-preserving, etc.) and slightly complex.
In reality, we have most likely boxed ourselves in because of our
initial (quixotic) emphasis on making the semantics interpretable.
Hmmm... if my luck runs the way it usually does, all this will be as
clear as mud. Oh well. :-(
This commentary is not, of course, specific to Novamente, but is really
about an entire class of AGI systems that belong in the same family as
Novamente. My problem with Novamente is really that I do not see it
being flexible enough to throw out the meaningful, interpretable
parameters.
Richard Loosemore