Alexey, Nil, Zar, Linas, others...
And here we go...

GENERAL BLATHER

About the general issue of what role logic and theorem-proving have to play in AGI, obviously there is room for different opinions here. My own view, as most of you know, is:

1) There are going to be many viable routes to human-level and superhuman AGI, not just one.

2) A route with a large role for probabilistic-logic theorem-proving is one viable route.

3) The advantages of a route of type 2) would seem to be:

3a) A system that can make good use of current computing hardware, which is logic-based (whereas e.g. a neural-net-based architecture would make better use of analog computing hardware).

3b) A system that should be good at scientific and engineering thinking, and also good at meta-thinking about how to improve its own code ... i.e. a good candidate to figure out how to cure aging, create nanotech and femtotech, launch the Singularity, etc.

3c) A system that should be coherent in maintaining its goals over time, as compared to a system operating more similarly to mammals.

Anyway, that is my view, which is the result of a lot of thought from diverse directions; but of course none of us knows for sure how to create advanced AGI yet, so disagreement is understandable.

I note also that since, by the Curry-Howard correspondence, programs and proofs are equivalent, the statement that AGI can be founded centrally on theorem-proving is equivalent to the statement that AGI can be founded centrally on program execution. The statement that AGI can be founded centrally on probabilistic-logic theorem-proving is equivalent to the statement that AGI can be founded centrally on probabilistic programming. The difference between a theorem-proving-based AI and a program-learning-based AI is merely an "implementation detail" ;-) ... Now, more interestingly, let me explore some specifics I have been thinking about lately!
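(A tiny aside to make the Curry-Howard point above concrete. This is my own purely illustrative toy, nothing to do with any actual prover code: a proof of an implication is literally a program, and modus ponens is function application.)

```python
# A toy rendering (my own, purely illustrative) of the Curry-Howard
# correspondence: a proof of the implication A -> B is a program mapping
# evidence for A to evidence for B, and modus ponens is just function
# application.  "Propositions" are simulated here by plain Python classes.
class A:
    pass

class B:
    pass

def a_implies_b(a: A) -> B:      # a "proof" of A -> B is a function
    return B()

a_proof = A()                    # a "proof" of A
b_proof = a_implies_b(a_proof)   # modus ponens = applying the function
print(type(b_proof).__name__)    # B
```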
CONNECTOR THEOREM PROVING

First of all, Linas, if you haven't seen it before you will enjoy the diagrams in http://www.scholarpedia.org/article/Connection_method which are explained in more detail in "A Vision for Automated Deduction Rooted in the Connection Method" (W. Bibel) https://www.researchgate.net/publication/318226306_A_Vision_for_Automated_Deduction_Rooted_in_the_Connection_Method?enrichId=rgreq-e84163672cff86aca93bd93eac91d998-XXX&enrichSource=Y292ZXJQYWdlOzMxODIyNjMwNjtBUzo1NDg5MjU5MjY4OTU2MTZAMTUwNzg4NTU0NzA0NQ%3D%3D&el=1_x_3&_esc=publicationCoverPdf and in even more detail in the book

Wolfgang Bibel: Automated Theorem Proving. Vieweg Verlag, Wiesbaden, 293 pp. (1982); 2nd edition, 289 pp., 1987.

which I found online in pdf via sci-hub ...

This is connection-based theorem proving, in which a proof of a theorem is constructed by drawing connections (links) between terms in the theorem, in such a way that each link joins two instances that involve the same predicate, but one in negated form and one in non-negated form (where "negated vs. non-negated" is assessed based on whether an instance would be negated in a DNF-normalized version of the theorem). A proof is a bunch of links that form a set of complete paths through the theorem (so every instance is along some path leading to the final instance of some term in the theorem), and such that there is some unification of all the terms linked that plays nicely with equating each linked pair of instances.

A complication is that sometimes there need to be a couple of links emanating from a given instance, i.e. an instance in the theorem may get used twice, three times, etc. (This counting of usages would provide an obvious connection with linear logic, which may or may not be useful...)

This general concept has been around since the 1980s, but recently has been used within some highly effective theorem-provers, i.e. leanCoP http://www.leancop.de and its descendants.
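To make the link structure concrete, here is a minimal Python sketch (my own toy code, not leanCoP's actual Prolog, and with no occurs check in the unifier) that represents a DNF matrix and enumerates the candidate connections, i.e. complementary literal pairs whose arguments unify:

```python
# A minimal sketch of the core data structures of the connection method.
# A DNF formula is a "matrix": a list of clauses, each clause a list of
# literals.  A literal is a triple (sign, predicate, args); a connection
# links two literals that share a predicate but differ in sign, provided
# their argument tuples unify.

def unify(a, b, subst=None):
    """Syntactic unification of two argument tuples.  Strings starting
    with an uppercase letter are variables; everything else is a
    constant.  (Toy code: no occurs check.)"""
    if subst is None:
        subst = {}

    def is_var(t):
        return isinstance(t, str) and t[:1].isupper()

    def walk(t):
        while is_var(t) and t in subst:
            t = subst[t]
        return t

    stack = [(a, b)]
    while stack:
        x, y = stack.pop()
        x, y = walk(x), walk(y)
        if x == y:
            continue
        if is_var(x):
            subst[x] = y
        elif is_var(y):
            subst[y] = x
        elif isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
            stack.extend(zip(x, y))
        else:
            return None  # clash of distinct constants or arities
    return subst

def connections(matrix):
    """Enumerate links between complementary, unifiable literals
    sitting in distinct clauses of the matrix."""
    links = []
    for i, ci in enumerate(matrix):
        for j in range(i + 1, len(matrix)):
            for (s1, p1, a1) in ci:
                for (s2, p2, a2) in matrix[j]:
                    if p1 == p2 and s1 != s2 and unify(a1, a2) is not None:
                        links.append((i, (s1, p1, a1), j, (s2, p2, a2)))
    return links

# Matrix for:  P(a) & (~P(X) | Q(X)) & ~Q(a)
matrix = [
    [(True, 'P', ('a',))],
    [(False, 'P', ('X',)), (True, 'Q', ('X',))],
    [(False, 'Q', ('a',))],
]
print(len(connections(matrix)))  # 2: P(a)~P(X) and Q(X)~Q(a)
```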
These provers are interesting, among other reasons, because they consist of extremely concise Prolog code, yet they work nearly as well as the state-of-the-art theorem-provers that are much more complex as code and have been pretty heavily optimized...

leanCoP operates on theorems in disjunctive normal form. However, a variant called nanoCoP http://www.leancop.de/nanocop/ operates on non-normalized theorems, using a straightforward and elegant variant of the connection-based method given here: http://www.jens-otten.de/papers/concalc_tab11.pdf Also, randoCoP http://ceur-ws.org/Vol-373/paper-08.pdf is like leanCoP but with a more efficient search strategy. leanCoP and nanoCoP basically use chaining-based search involving two operators, "extension" and "reduction." XXX ... What randoCoP does is consider many randomizations of the order of the clauses in the DNF version of the theorem, and do the chaining search on each of these. This ends up working faster. (This is not surprising; it is in line with many other results in the search literature showing that doing a shallow search from a lot of randomly selected starting points can often be better than doing a deep search from a single starting point...)

SAT SOLVERS FOR CONNECTOR THEOREM PROVING

As a semi-aside here, one purely algorithmic thought I had when reading all this is that the search+unification here can probably be done way faster using a SAT solver, at least for all but the shortest proofs. This is vaguely similar to how, with the link parser, it's way faster for long sentences to use SAT for parsing than a standard NL parsing algorithm... Use of SAT for theorem-proving has been tried already in a slightly different context; see https://www.lsi.upc.edu/~oliveras/espai/smtSlides/lynch.pdf There are also various other tricky ways to code e.g.
the unification part as a SAT problem, https://pdfs.semanticscholar.org/59ad/1b79d7a99c9330a494787deb8d6d3020d376.pdf (that paper is not about SAT, it's about unification using neural nets, but the part where they code unification as constraint satisfaction would apply to SAT as well as to NNs). So the bottom line is, there are lots of ways to play with the encoding, but one could use SAT or SMT here as a substitute for, or augmentation to, chaining.

To use SAT here you'd presumably have to set a bound on how many times each instance can be re-used. Then you could increase the bound incrementally, trying the SAT solver again on the constraints generated with each new bound. Though the SAT solver would want to work on the normalized version of the theorem, one could use information from the grouping of instances in the non-normalized version to guide the search inside the SAT solver.

CONNECTORS, PROOF SKETCHES AND AGI

Now why do I like this from an AGI perspective? Because it lets one do higher-level, abstract probabilistic reasoning just by reasoning about the links (and setting the unification aside). To see how this might work, consider that a "proof sketch" could be written in connector form as a set of connectors that is not complete and skips some steps. For instance, in a proof sketch one might have a link skipping from an instance of P in clause 1 to an instance of P in clause 10, bypassing instances of P in clauses 3 and 5. Or, in a proof sketch, one might have an instance of P used for the second time, but not for the first time (leaving it open when it will be used for the first time). Or one might have an instance of P used for the k'th time in one link, and used for the m'th time in another link, with the constraint that k < m but no commitment made about the values of k and m. In general, then, a proof sketch denotes a set of proofs sharing some common elements.
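One toy way to formalize part of the "sketch" notion is the following Python sketch. It is my own illustration, and the "skipping steps" semantics here (a sketch link is satisfied by any chain of same-predicate links between its endpoints) is just one possible reading; the usage-count constraints from the text are not modeled.

```python
# Toy formalization of proof sketches as partial link sets.  A concrete
# proof is a set of links (pred, i, j) between clause indices; a sketch
# link (pred, i, j) is satisfied if the proof contains a chain of
# pred-links leading from clause i to clause j ("skipping steps").
from collections import defaultdict

def refines(proof_links, sketch):
    """Does the concrete proof refine (fill in) the sketch?"""
    succ = defaultdict(set)
    for p, i, j in proof_links:
        succ[(p, i)].add(j)

    def reachable(p, i, j):
        seen, frontier = set(), {i}
        while frontier:
            k = frontier.pop()
            if k == j:
                return True
            if k not in seen:
                seen.add(k)
                frontier |= succ[(p, k)]
        return False

    return all(reachable(p, i, j) for p, i, j in sketch)

proof = {('P', 1, 3), ('P', 3, 5), ('P', 5, 10), ('Q', 2, 4)}
sketch = [('P', 1, 10)]        # one sketch link, skipping clauses 3 and 5
print(refines(proof, sketch))  # True
```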
If one is doing probabilistic induction or abduction across a bunch of connector proofs, one is going to learn a lot about which connections, and which combinations of connections, occur in which kinds of contexts within which kinds of proofs. This learning will naturally lend itself to generating conjectural proof sketches for newly presented theorems. The process of filling in a proof sketch to get a proof can be confronted just like the problem of proving a theorem de novo: by chaining, or by SAT/SMT, or some combination thereof.

SYNTACTIC LINKS AND LOGICAL CONNECTORS

As a minor aside, given the loose analogy between link parsing and connector proofs, it's also interesting to look at how the links in a sentence's link parse relate to the connectors in that same sentence's logical interpretation. The links in a sentence's link parse become atomic relationships Li(w_a, w_b), where w_a and w_b are word-instances and Li is a link of type i. The logical interpretation of a sentence then involves a bunch of implications such as

L1(w_1, w_3) & L5(w_3, w_7) & L9(w_1, w_7) ==> P4(w_1, w_7) & P8(w_7, w_3) <p>

where the Pi are logical relationships and <p> is a probability value. Sometimes there may be complex quantifier relations on the right-hand side of the implication.

The overall semantic interpretation of a sentence then looks like an implication of the form

(conjunction of all the syntactic links found in the sentence) ==> (logical formula involving conjunctions, disjunctions and negations of logical relationships P_i between logical terms corresponding to the word-instances in the sentence)

where, in the logical formula on the right-hand side, each term may be tagged with a probability value.
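A minimal sketch of this syntactic-to-logical mapping (my own toy; the link types L1/L5/L9 and relations P4/P8 are just the placeholder names from the formula above, and nothing here is actual link-grammar or OpenCog code):

```python
# A rule pairs a set of required syntactic links (the conjunction on the
# left of "==>") with probability-tagged logical conclusions P_i.

def apply_rule(links, rule):
    antecedent, conclusions, p = rule
    if antecedent <= links:  # all required syntactic links are present
        return [(rel, args, p) for rel, args in conclusions]
    return []

# Links found in a parse, as (link_type, word_a, word_b) triples
links = {('L1', 'w1', 'w3'), ('L5', 'w3', 'w7'), ('L9', 'w1', 'w7')}

# L1(w1,w3) & L5(w3,w7) & L9(w1,w7) ==> P4(w1,w7) & P8(w7,w3) <0.9>
rule = ({('L1', 'w1', 'w3'), ('L5', 'w3', 'w7'), ('L9', 'w1', 'w7')},
        [('P4', ('w1', 'w7')), ('P8', ('w7', 'w3'))],
        0.9)

print(apply_rule(links, rule))
# [('P4', ('w1', 'w7'), 0.9), ('P8', ('w7', 'w3'), 0.9)]
```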
The existence of a syntactic link between two word-instances in a sentence has the effect of causing the logical terms corresponding to the word-instances to be grouped into (one or more of) the same probability-tagged conjunctions of logical relationships in the logical interpretation of the sentence. If the sentence describes N different situations (each one with different observation-instances providing groundings for the word-instances) S1, S2, ..., S_N, then a proof of the validity of the logical interpretation of the sentence as a model of the situations is produced by drawing connectors from the observation-instances to the logical terms...

On Tue, May 15, 2018 at 6:43 AM, Ben Goertzel <[email protected]> wrote:
> Ah, I have some new thoughts on the "theorem proving + AGI + parsing"
> side, but will have to wait to type them in till I get a little time
> at the computer, I'm traveling between meetings and conferences in
> Europe just now...
>
> On Tue, May 15, 2018 at 12:44 AM, Linas Vepstas <[email protected]> wrote:
>>
>> On Mon, May 14, 2018 at 4:54 PM, Alexey Potapov <[email protected]> wrote:
>>>
>>> Hi Linas,
>>> this is quite an interesting discussion, and I believe we should involve Ben and others in it.
>>
>> Ben has been involved in this discussion for a decade; I think he knows the general outline, even as we argue violently about the details. Anyway, this is why I say "we should discuss on the mailing list", a dictum that I have been violating.
>>
>>> Actually, I have some concerns regarding too strong a focus on formal logic and automatic theorem proving in OpenCog, since I'm not completely convinced of its fundamental role in AGI... ;)
>>
>> Yes, I agree. I am very much trying to pursue a probabilistic approach. The question is then "probabilities on what?" and I have an answer that I like, but I have trouble sharing that answer in a way that is understandable.
Let
>> me try to sketch that, in this context.
>>
>> You know probabilistic programming very very much better than I, so please correct my mis-steps. In probabilistic programming, one essentially has lambda-expressions, and is assigning probabilities to them. For example: what is the probability of taking this branch (this if-statement) or not? The probability of running this loop N times? The probability of calling this routine? The probability of making this list/array longer or shorter? This might be done with either custom-built probabilistic programming languages, or with modified versions of C or Lisp. At any rate, these can be abstractly understood as being lambda-expressions, with probabilities assigned to them.
>>
>> However, as I exhort below, lambdas are closely tied to cartesian products, and to classical logic, and we have plenty of evidence that suggests that this is inappropriate for natural, human intelligence. So what is the right thing? To keep a long story short, for me, it is this concept of "disjoined connector sets" (aka "jigsaw-puzzle pieces"), and I attach probabilities to those. More generally, more vaguely, the structures are "patterns"; this is what the pattern miner looks for. There are various theoretical arguments that show that these are sufficient for general purposes. For example, theorem-proving is like assembling jigsaw-puzzle pieces, where the puzzle-pieces are rules of inference; this is why theorem-proving is like parsing. However, general knowledge representation is also like this: solving Sherlock Holmes mysteries has long been said to be like "solving a jigsaw puzzle": well, except now, we have the mathematical formalism to make this statement precise. I think much of what neural nets and deep learning do also fits into this general framework; I want to write a paper on this, but have not had the time yet.
>>
>> So, I am not really into formal logic, per se; rather, I am trying to amass evidence for why probabilistic patterns are the correct approach, while probabilistic lambdas are too stringent, too constrained, too tight to be useful. One has to assign probabilities to the patterns, the arrangements, and NOT to the individual components of the pattern. (Also, I keep saying "probability" when I actually mean "mutual information" or maybe "surprisingness", as the case may be. Just as lambda is not quite right, I get the feeling that probability is not quite right, and that talking about mutual information and surprisingness is more correct than talking about probability. But this intuition remains difficult to explain.)
>>
>> -- Linas
>>
>>> Best regards,
>>> Alexey.
>>>
>>> 2018-05-15 0:13 GMT+03:00 Linas Vepstas <[email protected]>:
>>>>
>>>> Alexey, Konstantin,
>>>>
>>>> On Mon, May 14, 2018 at 2:41 PM, Константин Тимофеев <[email protected]> wrote:
>>>>>
>>>>> Nil, Linas, thanks for the explanations. It seems like I have to learn some new things before I can understand clearly what the issue behind this discussion is. Hopefully it will not take too much time.
>>>>
>>>> If it doesn't take much time, then you are a super-human genius working in the wrong field. To understand the issue, you'd have to read a book or two on proof theory, along with a large rainbow of related topics. It's taken me ten years to understand what the problem is. I can't magically transfer this knowledge, but I can tell you what keywords to search for and read on. Besides proof theory and model theory, anything that describes linear logic should make clear what the issue with cartesian products is (in proof theory, it corresponds to the rules of weakening and contraction, which are assumed by the lambda abstraction but are forbidden by tensor products.)
>>>> To understand why products in linguistics are tensor products and not cartesian products, look at the wikipedia article on "pregroup grammar". The earliest reference that I know of that discusses this clearly is Solomon Marcus, "Algebraic Linguistics" (1967), which your compatriots in Novosibirsk sent me. The categorial grammars there are discussed on pages 90-120, which you can match up to what the wikipedia pregroup grammar article states.
>>>>
>>>> Last time I said that "theorem proving is like parsing", I said it to Ben's son Zar, who is studying theorem proving, and his response was "yeah, so what, everyone knows that". If you already know that, good. My point is that since we know that theorem proving is incompatible with lambda, we should stop using lambda so much. To me, this still remains a deep, important insight; I'm trying to use it to guide the design of atomese. I can't imagine that you will understand in only a few months. A few years, maybe. I'm just saying, now is the time to start reading about it.
>>>>
>>>> -- Linas
>>>>>
>>>>> On Mon, May 14, 2018 at 8:45 PM, Linas Vepstas <[email protected]> wrote:
>>>>>>
>>>>>> On Mon, May 14, 2018 at 12:54 AM, Nil Geisweiller <[email protected]> wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> On 05/11/2018 11:36 PM, Linas Vepstas wrote:
>>>>>>>>
>>>>>>>> light sprinkling of lambda calculus on top. Nil keeps putting in too much lambda calculus, I can't seem to get him to stop. It needs more prolog, I keep telling him.
>>>>>>>
>>>>>>> But
>>>>>>
>>>>>> I meant that to be funny. Like saying "it needs more pepper, no, it needs more salt".
>>>>>>
>>>>>>> the left hand side term of a prolog fact shares the same goodies I'm after when I use a LambdaLink. It exposes a list of variables, in a determined order, abstracting the body away.
>>>>>>
>>>>>> But Alexey and Konstantin have not heard this argument before, so let me sketch it briefly.
>>>>>>
>>>>>> * Yes, lambdas provide abstraction. They're useful. But ...
>>>>>> * lambda calc is old; it was originally developed in the view of Hilbert's program, of providing a foundational basis for classical logic.
>>>>>> * prolog is new, developed for knowledge representation and theorem proving, that is, non-classical logic.
>>>>>> * lambda calc is the internal language of cartesian closed categories: viz. anything with lists, pairs, cartesian products. Lambda calc corresponds to classical logic.
>>>>>>
>>>>>> * by contrast, theorem proving is known to NOT use classical logic; the logic of theorem proving is intuitionist logic, and more generally, the logic of Kripke frames. Insofar as the rule engine and PLN are a kind-of-like theorem prover, they should be based on the principles given in books on proof theory, and NOT on classical logic! People who work on proof theory have made a lot of advancements since the 1930's. We should know of, understand, and employ those concepts. Proof theory is a big deal. We should not ignore it.
>>>>>>
>>>>>> * There are other things that classical logic fails to describe, besides proof theory: natural language and biochemistry. Both of these appear to be described by intuitionistic logic, or, more properly, by fragments of linear logic. The difference is that linear logic throws away the notion of pairs, lists, cartesian products and lambdas: lambdas cannot be used to describe language or biochemistry. (In physics, the failure/incorrectness of lambda is "well-known", and is called the "no-cloning theorem".
The fundamental reason for this is that tensor products violate the assumption of the existence of pairs, lists, cartesian products. Tensor products are incompatible with cartesian products. Insofar as the lambda abstracts the cartesian product, it too must be discarded from the theory, as being incompatible. Any theory that has tensor products cannot have lambdas in it. They can't co-exist without contradiction.)
>>>>>>
>>>>>> * Insofar as we work with language, learning and theorem proving, where there are "well-known" theorems that prove that lambda calc is incompatible, we should perhaps stop focusing on it so much. Lambda calc has its place -- lambda is great for abstraction, but not everything can be abstracted in that way. Most notably (for me), the link-types of link-grammar cannot be abstracted by lambdas, because the disjuncts of link-grammar (alternately, the VP, NP of phrase-structure grammars) look like tensor products, and not like cartesian products.
>>>>>>
>>>>>> For these reasons, I think we are over-using lambdas. They offer a deeply-incorrect, basically-broken view of the world we are interested in: proof theory, linguistics, knowledge, biochemistry.
>>>>>>
>>>>>>> I know that FreeLink automatically digs up the variables, but it has other problems
>>>>>>>
>>>>>>> 1. The variable order depends on their appearances in the body. Even worse, this may depend on the variable names (see https://github.com/opencog/atomspace/issues/1677).
>>>>>>
>>>>>> Yes, that is the definition of a "free variable". Do you want to propose some kind of semi-free variable?
>>>>>>>
>>>>>>> 2. There's no way to tell FreeLink to escape some variable (unless a QuoteLink is used).
>>>>>>
>>>>>> Yeah, quote links are awful.
I think a very very high priority is to figure out how to do the things we need to do, without having to use quotes. I'm not sure, but the need for quotes seems to be a side-effect of using lambdas or something like that. I don't understand. In all the work I do, with link-grammar, with language learning, quotes are never needed; the concept of lambda abstraction is replaced by connectors and disjuncts; and these never need to be quoted. Actually, I can't even imagine how one could "quote" a connector or disjunct; it doesn't make sense, conceptually. I would have to think hard about that.
>>>>>>
>>>>>> (Well, there is also no concept of "variables", either. The role of variables is taken up by connectors, which are kind-of-like typed free variables. Sort of. Again, this requires deep thought.)
>>>>>>
>>>>>> Upshot: maybe if we used lambdas less, then we would not need quotes. Maybe they are partners in crime.
>>>>>>
>>>>>>> 3. Variables cannot be typed.
>>>>>>
>>>>>> If they were typed, they would not be free, right? Maybe there is a way to have typed free variables; I'm not sure; it's certainly not obvious; but I can also see how it might be possible. You'd have to design it carefully, to avoid contradictions.
>>>>>>
>>>>>>> For these 3 reasons I generally stay away from FreeLink, even though I'm sure it has good uses, if one is careful.
>>>>>>
>>>>>> !? I don't recall ever suggesting that we should use FreeLink for anything other than what it is currently used for. It provides some basic utilities for locating variables in an expression, but that's it. I don't understand why you even brought it up... ??
>>>>>>
>>>>>> --linas
>>>>>>
>>>>>> --
>>>>>> cassette tapes - analog TV - film cameras - you

--
Ben Goertzel, PhD
http://goertzel.org

"Only those who will risk going too far can possibly find out how far they can go." - T.S. Eliot

--
You received this message because you are subscribed to the Google Groups "opencog" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/opencog.
To view this discussion on the web visit https://groups.google.com/d/msgid/opencog/CACYTDBfwXFb0MFrtTVGsV%2BLLu%2BFUsBc1ugQ72wp422X%2BHf7-6g%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
