Hi Michele,

On Wed, Mar 24, 2021 at 8:50 AM Michele Thiella <[email protected]>
wrote:

> OK, a lot of things here!
> I'm trying to learn more than 10 years of your work in a few months.
>

You are avoiding 10 years of confusion and mistakes. Anyway, later in life,
you may find time to stop and smell the flowers. But the standard academic
trajectory is to push you as fast as possible to the very edge of what is
known, and do research there.


> On Tuesday, March 23, 2021 at 18:10:24 UTC+1 linas wrote:
>
>> Oh, please let me kill you! That's where all the fun is!  Based on
>> discussions with many people, there is a wide-spread misunderstanding of
>> what AGI is or how it might be achieved. Although what you said is
>> superficially, simplistically correct, I want to point out that
>> "excellence" cannot be achieved by hand-crafting knowledge bases. Very few
>> people seem to understand this, and seem to believe that somehow just
>> slapping a bunch of parts together will result in AGI. That designing AGI
>> is like designing an airplane, that it's just a matter of "excellent
>> design" and it will fly by itself. This is not the case.
>>
>> Thus, I was trying to be careful in distinguishing the "scaffolding",
>> which is hand-crafted, from actual AGI type work. The scaffolding is needed
>> to bring data into a format where an AGI type system can interface with
>> it.  At every point of design, you have to ask: is this piece of code just
>> some more hand-crafted (human-crafted) special-case code that is being used
>> to convert the external world into a form that a computer algorithm can
>> interact with? Or is this piece of code "AGI" (or as close to AGI as we can
>> get right now)?  So I am trying to draw a contrast between "those things
>> that are AGI" and "ancillary support services".
>>
>
> I'm starting to understand what you mean.
> Probably all the code that I'm thinking of is "scaffolding". So, "bring data
> into a format where an AGI type system can interface with it".
> Maybe it's not clear to me what AGI code is. I had seen /learn and
> /generate only from the readme, maybe because I found them hard at that time
> (and at this time, for sure).
> I agree that hand-crafting KB and "excellence" don't mix. But then, your
> proposal would be to achieve "excellence" via a knowledge base built by
> what?
>

The distinction between AGI and scaffolding is not clear. But I can
illustrate with an example.

Link Grammar is a natural-language parser, for English and other languages.
It consists of two parts: the parser itself, which embodies a theory of
natural language, and a lexis, or dictionary, that encodes the actual
grammar for different languages. (There is one for English, one for Russian,
and another 8+ demo dictionaries.) It was created in the 1990's, and was
"cutting edge" back then -- it was at the forefront of computational
linguistics research.

There are other parsers as well, created in a similar timeframe, having a
similar division: a generic algorithm, and a hand-crafted dictionary
encoding a specific language.  For this discussion, the dictionary can be
called a "knowledge base".

One goal of "true AGI" would be to automatically learn that dictionary.
Attempts to do this were made in the 1990's, and probably earlier, and
continue to this day, with varying levels of activity and varying theories.
And obviously, most of the neural-net crowd has given up on symbolic AI,
and is instead trying to learn a collection of weight-vectors. So,
although the neural-net people don't have/use a parser, they do have a
"knowledge base". (Unfortunately, it's a black box: a collection of
floating-point numbers, with no idea of what they mean.)

So, naively, simplistically, a "true AGI system" should be capable of
learning the "knowledge base", instead of relying on humans to craft one.

Now comes the blurry parts: the learning algorithm itself is hand-crafted,
so isn't that a form of cheating? We are once again relying on humans to do
the work. For example, for neural nets, effectively all of them are trained
on a selection of images curated by human beings. The neural net learns how
to recognize a photo of a horse, but it was trained on a human-curated
training set. So, again, that's "cheating". Have you heard the expression
"it's turtles all the way down"? Well, for neural nets, it's hand-crafted
datasets all the way down. It's people all the way down. The goal of
building a true AGI is to avoid this.  Step 1 is to avoid hand-crafted
training sets. Step 2 is to avoid hand-crafted algorithms.  I'm working on
Step 1. I suppose that Step 2 is beyond the abilities of what can be done
today. It's a bit blurry.


> I had seen the beginning of the work and it is very interesting. In the
>>> next few days I will look at the current state.
>>> Two quick questions:
>>> 1) How complicated is it to work directly with Ros + Gazebo compared to
>>> Malmo and Gym?
>>>
>>
>> I have only used ROS. The design is straight-forward.  If a ROS event
>> comes in (some face is perceived; there is some loud noise, other
>> environmental change) there is a python snippet (ROS is easiest to use with
>> python) that converts that event into Atomese, and sends that Atomese to
>> the cogserver (the cogserver is a network server, nothing more). So for
>> example, a loud sound might be converted to `(StateLink (PredicateNode
>> "ambient sound") (ConceptNode "loud sound"))` Then, on the opencog side,
>> processing does whatever you've set it up to do with this kind of
>> information.  Exactly how sophisticated you want to be is up to you.
>>
>> For output, it's even easier: `(cog-evaluate! (EvaluationLink
>> (GroundedPredicateNode "py:twiddle_ROS_message") (ListLink ... arguments
>> ...)))` which calls a python function "twiddle_ROS_message" to send some
>> data somewhere in ROS.
>>
>> My remarks about "excellent design" and "AGI" above means that python
>> wrappers for converting ROS data to Atomese should be minimal, or that they
>> should do just enough to bring in external information into the AtomSpace.
>> You want to avoid a game of writing large, complex python scripts. So when
>> you ask "How complicated is it to work directly with Ros + Gazebo compared
>> to Malmo and Gym?" The answer should be "about the same" and "not
>> complicated" because there should be only minimalistic shims to convert
>> to/from Atomese and the message formats these other systems use.  If you
>> are creating something complicated in these systems, you are not doing AGI,
>> you are doing robotics.
>>
>>
> I saw the Python wrappers from ROS to Atomese (I used ROS with C++ in a
> robotics course and yes, python is simpler) in the Eva folder and they are
> really minimal.
>

Well, also, someone reorganized the github repos, and most of that code was
moved somewhere else ... and then, after it got to its new home, it was
cut down, ... this is one reason why things may not work. Some parts might
have been lost in the move.
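To make the shape of such a shim concrete, here is a sketch in Python. The port number, topic name, and 70 dB threshold are all assumptions made up for illustration; the actual Eva shims certainly differed in detail:

```python
# Hypothetical minimal shim: convert a ROS-style sensory event into
# an Atomese s-expression and ship it to the cogserver as plain text.
# The port number, topic, and 70 dB threshold are assumptions.
import socket

COGSERVER_HOST = "localhost"
COGSERVER_PORT = 17001   # assumed cogserver network-shell port

def sound_event_to_atomese(level_db: float) -> str:
    """Map a raw loudness reading onto a symbolic ambient-sound state."""
    label = "loud sound" if level_db > 70.0 else "quiet"
    return ('(StateLink (PredicateNode "ambient sound") '
            f'(ConceptNode "{label}"))')

def send_to_cogserver(atomese: str) -> None:
    """Push one Atomese expression to the cogserver over the network."""
    with socket.create_connection((COGSERVER_HOST, COGSERVER_PORT)) as sock:
        sock.sendall((atomese + "\n").encode("utf-8"))

# In a real ROS node this would sit in a subscriber callback, e.g.
#   rospy.Subscriber("/audio/level", Float32, lambda msg:
#       send_to_cogserver(sound_event_to_atomese(msg.data)))
```

The whole point is that this is all the shim should ever be: one small mapping function and one network write.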


> But there is a thing that I do not completely understand: to activate a
> GroundedPredicateNode to execute a py function I can use its STI, right? It
> should be automatic; how does it work?
>

It is not automatic.  You have to `cog-execute!` whatever code you want to
trigger. There are several ways of doing this.
1) by hand .. obviously.
2) write some scheme or some python code that loops over whatever needs to
be looped over, searching for high or low STI or any other Value or
StateLink or whatever might be changing, and call cog-execute! as needed.
3) Do the above, but entirely in Atomese.  There is a way to do an infinite
loop in atomese -- it is actually a tail-recursive call to the function
itself. It should be possible to do everything you need to do in "pure
atomese". Now, atomese was never meant to be a full-scale programming
language, like python or scheme, so it is missing many commonplace ideas
that make python/scheme/c++/etc. "human friendly". But it does have enough
to make most things possible, and many things "easy" (-ish).
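Approach (2) can be sketched with a toy stand-in. The `Atom` class and the 0.8 threshold below are illustrative inventions, not the real opencog Python API; the shape of the loop is the point:

```python
# Sketch of approach (2): a plain polling loop that scans atoms for
# high STI and triggers the attached grounded code. The Atom class
# and the 0.8 threshold are toy stand-ins, not the real AtomSpace API.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Atom:
    name: str
    sti: float = 0.0                        # short-term importance
    on_execute: Callable[[], None] = lambda: None

def attention_loop(atoms: List[Atom], threshold: float = 0.8) -> List[str]:
    """One sweep of the loop: run every atom whose STI is high enough."""
    fired = []
    for atom in atoms:
        if atom.sti >= threshold:
            atom.on_execute()               # stands in for cog-execute!
            fired.append(atom.name)
    return fired

atoms = [Atom("wave-hello", sti=0.9, on_execute=lambda: print("waving")),
         Atom("sit-still", sti=0.1)]
print(attention_loop(atoms))                # only "wave-hello" fires
```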

The Eva/Sophia code used approach 3. I forget where the main loop is; it's
only 3 lines of code total, so it's easy to miss. It might be in one of the
repos that was moved around. Everything else was controlled by
SequentialAndLinks, which stepped through a tree of decisions, triggering a
GroundedPredicate whenever some condition was met.

There were three design goals:
a) Make sure atomese had everything it needed to control a robot
b) Make sure that the atomese was simple enough that other algorithms could
analyze it and modify it. For example, it should be possible (in principle)
for MOSES or URE or PLN or some other system to analyze and modify the
robot-control code. (in practice, this was never done)

Keeping the robot code in the form of a decision tree should mean that it
is simple enough that other systems could analyze that tree, edit that
tree, modify it, extend it, and thus create brand-new robot behaviors out
of "thin air".

c) Make sure that the design of atomese itself was simple enough and usable
enough to allow a) and b) above. This is an ongoing project.
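Goal (b) can be illustrated with a toy stand-in: keep the control logic as plain data (here, a list of condition/action pairs playing the role of a SequentialAnd tree), so that another program can inspect and rewrite it. All names below are made up:

```python
# Toy illustration of design goal (b): the robot's behavior is a plain
# tree of (condition, action) pairs -- data, not code -- so that another
# program can inspect and rewrite it. All names here are made up.
def run_tree(tree, state):
    """Walk a SequentialAnd-style list and fire the first matching branch."""
    for condition, action in tree:
        if condition(state):
            return action
    return "do-nothing"

behavior = [
    (lambda s: s.get("face-visible"), "smile"),
    (lambda s: s.get("loud-noise"),   "turn-toward-sound"),
]

print(run_tree(behavior, {"loud-noise": True}))   # turn-toward-sound

# Because the tree is just data, another algorithm can extend it,
# creating a brand-new behavior "out of thin air":
behavior.append((lambda s: s.get("bored"), "look-around"))
print(run_tree(behavior, {"bored": True}))        # look-around
```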


> Again: scaffolding vs AGI. So, 3D location is part of the external world,
>> and the scaffolding must interface to the external world, and take 3D data
>> and convert it into a format that the AGI code can operate on.  If you have
>> AGI code that can work directly with 3D point clouds, then great! No
>> scaffolding is needed! If you (like me) have proto-AGI code that wants to
>> work with symbolic-natural-language, then some scaffolding is needed to
>> convert point-clouds into prepositions.  Some day in the future, maybe we
>> can remove some of the scaffolding.
>>
>> However, up until now, almost all work that has been done, that is being
>> done, is on scaffolding. If you are not careful, you will find yourself
>> doing the same. This is not bad: it's educational, and it's important, and
>> it helps show where the boundary is between the scaffolding and the AGI. --
>> if nothing else, this is called "learning at the school of hard knocks" --
>> "I built one and it didn't work, but I learned something". At the forefront
>> of knowledge, that's the only school that is open. That's what science is.
>>
>>
> Ideally, is there an AGI code idea that works directly with pointcloud 3D?
>

Ideally, there is AGI code that can see and listen, can sense true magnetic
North, swim in the ocean, etc., so sure, of course.

> I also suppose that working with symbolic-natural-language and thus with
> propositions is more efficient! Point clouds are heavy and it takes a lot
> of work to extract information,
>

Yes.

> so why would we want this?
>

Because, eventually, it needs to be something that happens. And just
because I don't know how to do this today does not mean that someone clever
won't be able to figure out how to process a point-cloud and discern shapes
in it. I suppose someone at Microsoft or at Tesla is already doing
something along those lines.


>
>> Reasoning and inference is a very dangerous place to start, and may kill
>> your project before it even gets started. There are several reasons for
>> this.
>>
>
>  I'm feeling it!
>
> * Reasoning presumes that you have already decided on a representation for
>> your data (either hand-crafted it, or automatically learned, somehow.) Once
>> you have this representation, then you can reason on it. But do you have
>> this representation? No, you don't. You might borrow one from blocks-world,
>> or borrow the one from Eva, or borrow the one from rocca (or the one from
>> agi-bio, which represents DNA, RNA and proteins).  You then have the
>> problem of pulling external data and placing it into your representation,
>> where "external data" is vision, sound, text, or RNA/DNA genetic sequences.
>> This is scaffolding.
>>
> * Reasoning presumes that you have inference rules. Where did these come
>> from? Did you hand-craft them? PLN has a bunch of hand-crafted inference
>> rules that Ben and friends hand-crafted 10-15 years ago, and Nil has
>> carefully implemented in C code. They work, kind-of, whenever you have a
>> hand-crafted representation for your data that is PLN-compatible. Nil
>> spends a lot of time, a huge amount of time (the last 10 years) getting the
>> hand-crafted rules to fit with the hand-crafted representation, and to get
>> reasoning working efficiently and quickly. But if your representation does
>> not fit the PLN structure, then it won't work.  (None of my language work
>> was ever able to fit with PLN. My new AGI work (at opencog/learn) will
>> almost surely not fit with PLN; the goal there is to learn brand-new
>> inference rules, instead of using the hand-crafted ones.)
>>
>> * The actual implementation of the URE is "hard-core comp-sci" or maybe
>> "good old-fashioned comp sci": its a set of algorithms to apply some
>> rewrite rules to a network. There are many non-opencog systems that do
>> something similar, such as SAT-solvers, constraint satisfaction systems,
>> ASP-answer-set programming, the "lambda cube", higher-order logic,
>> theorem-proving systems, etc. It's hard core, it's not easy.  Many of these
>> systems are much much faster, and are much more flexible, *if* your data
>> representation is not PLN, but is something else: e.g. boolean expressions
>> or prolog-like assertions. So we are back again to "what is your internal
>> model"?
>>
>> For example, in robotics, for a robot inside an office building, a common
>> inference task is "is the door open? If the door is open then roll through
>> it, else grasp the door handle and open the door."  The standard
>> grad-school robotics approach to solve this is to use ROS or something
>> similar to "see" the door, and then to use ASP (answer-set programming) to
>> perform very fast crisp-logic reasoning and inference. It works. It's what
>> 90% of all university robotics departments use. It is reasoning and
>> inference. It's not AGI.
>>
>
> I don't think the following is exactly a representation of the data, but...
> I thought I was starting with a trivial representation,
> objects are described by (ConceptNode "English-object-name"),
> primitive robot actions by GroundedPredicateNode which call py functions
> that actually perform those actions via ROS.
> A vision algorithm recognizes certain objects and returns the English name
> and their 3D coordinates.
> The robot receives goals to complete via English sentences with
> Relex2Logic. Once the inference rules are written, the robot tries to solve
> the goals. When it doesn't know what to do, it tries randomly and ramps up
> its KB from the sensors and continues to make inferences.
>
> The following is an example in a very Pseudo-language.
> That's what my mind thinks when planning the resolution of this problem.
> It certainly has many wrong ideas, concepts, ways of doing and dealing
> with things...
>

Yes, more or less.

What are the critical errors that I've made?
> What are the main differences from Eva?
>

I did not use relex2logic. That was designed for something else.

Before reasoning is possible, one must have a world-model. This model has
several parts to it:
* The people in the room, and their 3D coordinates
* The objects on the table and their 3D coordinates.
* The self-model (current position of robot, and of its arms, etc.)
The above is updated rapidly, by sensor information.

Then there is some long-term knowledge:
* The names of everyone who is known. A dictionary linking names to faces.

Then there is some common-sense knowledge:
* you can talk to people,
* you can pick up bottles on a table
* you cannot talk to bottles
* you cannot pick up people.
* bottles can be picked up with the arm.
* facial expressions and arm movements can be used to communicate with
people.

The world model needs to represent all of this. It also needs to store all
of the above in a representation that is accessible to natural language, so
that it can talk about the position of its arm, the location of the bottle,
and the name of the person it is talking to.
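A minimal sketch of the idea: everything above, kept in one uniform store of (subject, relation, object) triples, so that fast sensor updates, long-term facts, and common-sense rules all answer through a single query mechanism. This illustrates the principle only; it is not the AtomSpace representation:

```python
# A minimal sketch: every kind of knowledge above -- fast sensor
# updates, long-term facts, common-sense rules -- kept in one uniform
# store of (subject, relation, object) triples. Illustrative only.
world = set()

def tell(s, r, o):
    world.add((s, r, o))

def ask(s=None, r=None, o=None):
    """Return all triples matching the given pattern (None = wildcard)."""
    return [(a, b, c) for (a, b, c) in world
            if s in (None, a) and r in (None, b) and o in (None, c)]

# Fast-changing sensor facts:
tell("bottle", "at", (1.0, 1.0, 1.0))
tell("self-arm", "at", (0.2, 0.0, 0.5))
# Long-term knowledge:
tell("face-42", "named", "Michele")
# Common-sense knowledge:
tell("bottle", "can-be", "picked-up")
tell("person", "can-be", "talked-to")

# One query mechanism answers both "where is the bottle?" and
# "what can be done with it?", which is what a language layer needs:
print(sorted(ask("bottle")))
```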

Reasoning is possible only *after* all of the above has been satisfied, not
before.  Attempts to do reasoning before the above has been built will
always come up short, because some important piece of information will be
missing, or will be stored somewhere, in some format that the reasoning
system does not have access to.

The point here is that people have been building "reasoning systems" for
the last 30 or 40 years. They are always frail and fragile. They are always
missing key information.  I think it is important to try to understand how
to represent information in a uniform manner, so that reasoning does not
stumble.



> Atomspace:

>   Concepts: "name" - "3D pose"
>   - bottle - Na
>   - table - Na
>   (Predicate: "over" List ("bottle") ("table"))
>   Actions:
>   - Go random
>   - Go to coord
>   - Grab obj
>
> Goal: (bottle in hand)    // = grab bottle
>
> Inference rules: all the necessary rules, i.e.
> * grab-rule: preconditions: (robot-coord = obj-coord) ..., effects: (obj
> in hand) ...
> * coord-rule: if x is in "coord1" and y is over x then y is in "coord1"
>
> -> So, the robot tries backward chaining to find the behavior tree to run. It
> doesn't find it, it lacks knowledge, it doesn't know where the bottle is
> (let's leave out partial trees).
> -> Go random ...
> -> Vision sensor recognizes table
> -> atomspace update: table in coord (1,1,1)
> -> forward chaining -> bottle in coord (1,1,1)
> -> backward chaining finds a tree, that is
> Go to coord (1,1,1) + Grab obj
> -> goal achieved
>

This is a more-or-less textbook robotics homework assignment. It has
certainly been solved in many different ways by many different people using
many different technologies, over the last 40-60 years. Algorithms like
A-star search are one of the research results of trying to solve the above.
The AtomSpace would be a horrible technology to solve the above problem:
it's too slow, too bulky, too complicated.
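For comparison, here is the quoted homework problem solved the textbook way: a few lines of plain forward chaining, no AtomSpace required. Facts and rule names mirror the pseudocode above:

```python
# The quoted grab-the-bottle example, done the textbook way: a few
# lines of plain forward chaining, no AtomSpace required. Facts and
# rule names mirror the pseudocode above.
facts = {("over", "bottle", "table")}      # long-term knowledge

def forward_chain(facts):
    """coord-rule: if x is at coord and the bottle is over x,
    then the bottle is at coord. Repeat until nothing new appears."""
    while True:
        new = {("at", "bottle", coord)
               for (rel, x, coord) in facts
               if rel == "at" and ("over", "bottle", x) in facts}
        if new <= facts:
            return facts
        facts |= new

# Vision sensor fires: the table is seen at (1, 1, 1).
facts.add(("at", "table", (1, 1, 1)))
facts = forward_chain(facts)

# The planner reads off the goal location and emits the behavior tree:
goal_at = next(c for (r, o, c) in facts if r == "at" and o == "bottle")
plan = [("go-to", goal_at), ("grab", "bottle")]
print(plan)   # [('go-to', (1, 1, 1)), ('grab', 'bottle')]
```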

The chaining steps can be called "inference", but it is inference devoid of
natural language, devoid of "true understanding". My goal is to have a
conversation with the robot:

"What do you see?"
"A bottle"
"where is it?"
"on the table"
"can you reach it?"
"no"
"could you reach it if you move to a different place?"
"yes"
"where would you move?"
"closer to the bottle"
"can you please move closer to the bottle?"
(robot moves)

This can be solved by carefully hand-crafting a chatbot dialog tree. (The
ghost chatbot system in opencog was designed to allow such dialog trees to
be created.) Over the decades, many chatbots have been written. Again, there
are common problems:

-- the text is hard-coded, and not linguistic.  Minor changes in wording
cause the chatbot to get confused.
-- there is no world-model, or it is ad hoc and scattered over many places
-- no ability to perform reasoning
-- no memory of the dialog ("what were we talking about?" - well, chatbots
do have a one-word "topic" variable, so the chatbot can answer "we are
talking about baseball", but that's it. There is no "world model" of the
conversation, and no "world model" of who the conversation was with ("On
Sunday, I talked to John about a bottle on a table and how to grasp it")

Note that ghost has all of the above problems. It's not linguistic, it has
no world-model, it has no defined representation that can be reasoned over,
and it has no memory.
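The first problem is easy to demonstrate with a caricature: a dialog tree keyed on exact wording answers the scripted questions above and nothing else. (This is not the ghost rule syntax, just an illustration of the brittleness.)

```python
# Caricature of the first problem: a dialog tree keyed on exact
# wording. It answers the scripted questions above and nothing else;
# a trivial rephrase defeats it. (Not the ghost rule syntax.)
dialog_tree = {
    "what do you see": "A bottle",
    "where is it": "on the table",
    "can you reach it": "no",
}

def reply(utterance: str) -> str:
    key = utterance.lower().strip(" ?")
    return dialog_tree.get(key, "I don't understand")

print(reply("What do you see?"))           # A bottle
print(reply("What are you looking at?"))   # I don't understand
```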

20 years ago, it was hard to build a robot that could grasp a bottle. It
was hard to create a good chatbot.

What is the state of the art, today? Well, Tesla has self-driving cars, and
Amazon and Apple have chatbots that are very sophisticated.  There is no
open source for any of this, and there are no open standards, so if you are
a university grad student (or a university professor) it is still very very
hard to build a robot that can grasp a bottle, or a robot that you can talk
to.  And yet, these basic tasks have become "engineering"; they are no
longer "science".  The science resides at a more abstract level.

--linas


>
>>
>>> * Ideally my goal was to extend the "model of the world" to work more
>>> with objects than people and to extend the "self-model" to execute
>>> navigation and manipulation plans. In all of this, I haven't yet explored
>>> the learning.
>>>
>>
>> For Eva, the self-model and world-model are all part of the same thing,
>> and they were hand-crafted (not learned).  The goal was to interface
>> language to movement and perception. The inspiration was to use concepts
>> and ideas from Melcuk's "Meaning-Text Theory" (MTT) for the world-model.
>>
>> Getting this to work involved a sequence of rickety and fragile
>> transformations: from sound to text (via google voice-to-text) which is
>> inaccurate. From text to a parse-tree (via link-grammar). From parse-tree
>> to the internal model. From the internal model to robot motion/action.
>> Changing anything anywhere was both conceptually hard (no one else
>> understood what the heck I was doing, including, among others, "the
>> management" (Ben and David) and without management support, the going gets
>> tough.)  Also, it was abstract enough and complex enough that other
>> programmers were unwilling to learn how it worked, and so were unwilling to
>> help.  If you personally  want to work on this, then be aware that it is
>> abstract and complex. And fragile. (Part of the goal of "good engineering"
>> is to compartmentalize the complexity so that it becomes "easy to use" and
>> non-fragile. This code base needed a little bit more "good engineering"
>> than it ever got.)
>>
>> My goal with the opencog/learn project is to automate all of the above,
>> including the reasoning, inference, and world-model, but it is far away
>> from that, so far. I think I know how to do these things, but now I have to
>> ... do them.
>>
>> -- Linas
>>
>
> I haven't looked at the Meaning-Text Theory yet (a serious gap, I think!).
> I'll fix that!
> What I have described goes precisely in this direction, it seems to me, but it
> was still only an idea. I can still change it; I will have to speak to my
> supervisors to also evaluate the new possibilities you have shown me!
> In the meantime, thanks to everyone, my knowledge base is improving a lot
> too!
>
> Michele
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "opencog" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/opencog/2c33b2f2-c02d-486f-bf58-4122ed11b73dn%40googlegroups.com
> <https://groups.google.com/d/msgid/opencog/2c33b2f2-c02d-486f-bf58-4122ed11b73dn%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>


-- 
Patrick: Are they laughing at us?
Sponge Bob: No, Patrick, they are laughing next to us.
