Hi Ben,

Thank you for this update.
What I am a bit unclear about is which part of language learning is explicit, pre-defined knowledge, and which part is learned without explicit instruction. Stated differently, if explicitly defined knowledge presents a scope, does implicit language learning extend this scope -- or is it about mapping the implicit to the explicit?

Thank you,
Daniel

On Friday, 29 July 2016 04:32:11 UTC+3, Ben Goertzel wrote:
> (proposed R&D project for fall 2016 - 2017)
>
> We are now pretty close (a month away, perhaps?) to having an initial, reasonably reliable version of an OpenCog-controlled Hanson robot head, carrying out basic verbal and nonverbal interactions. This will be able to serve as a platform for Hanson Robotics product development, and also for ongoing OpenCog R&D aimed at increasing levels of embodied intelligence.
>
> This email makes a suggestion regarding the thrust of the R&D side of the ongoing work, to be done once the initial version is ready. This R&D could start around the beginning of September and is expected to take 9-12 months.
>
> GENERAL IDEA:
> An initial experiment in using OpenCog for learning language from experience, using the Hanson robot heads and associated tools.
>
> In other words, the idea is to use simple conversational English regarding small groups of people observed by a robot head as a context in which to experiment with our already-written-down ideas about experience-based language learning.
>
> BASIC PERCEPTION:
>
> I think we can do some interesting language-learning work without dramatic extensions of our current perception framework. Extending the perception framework is valuable, but can be done in parallel with using the current framework to drive language-learning work.
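As a rough sketch only (all names here are hypothetical illustrations, not the actual OpenCog or Hanson Robotics APIs), the per-timestep output of such a perception framework can be thought of as a stream of timestamped predicates over persistent labels:

```python
# Sketch of a timestamped percept stream; names are hypothetical.
from dataclasses import dataclass

@dataclass(frozen=True)
class Percept:
    t: float          # timestamp in seconds
    predicate: str    # e.g. "face_at", "is_talking", "looking_at"
    args: tuple       # persistent labels and values, e.g. ("Bob", "Jane")

stream = [
    Percept(0.0, "face_at", ("Bob", (120, 80))),
    Percept(0.0, "face_at", ("Jane", (310, 95))),
    Percept(0.5, "is_talking", ("Bob",)),
    Percept(0.5, "looking_at", ("Bob", "Jane")),
]

# A trivial derived inference of the kind mentioned below: if A is
# talking while A's face is pointed at B, A is likely talking to B.
talking = {p.args[0] for p in stream if p.predicate == "is_talking"}
talking_to = [p.args for p in stream
              if p.predicate == "looking_at" and p.args[0] in talking]
print(talking_to)  # [('Bob', 'Jane')]
```

The point of the sketch is only that perception delivers symbolic, time-indexed facts that language learning can consume alongside linguistic input.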
> What I think we need to drive language-learning work initially is that the robot can tell, at each point in time:
>
> — where people's faces are (and assign a persistent label to each person's face)
> — which people are talking
> — whether an utterance is happy or unhappy (and maybe some additional sentiment)
> — if person A's face is pointed at person B's face (so that if A is talking, A is likely talking to B) [not yet implemented, but can be done soon]
> — the volume of a person's voice
> — via speech-to-text, what people are saying
> — where a person's hand is pointing [not yet implemented, but can be done soon]
> — when a person is moving, leaving or arriving [not yet implemented, but can be done soon]
> — when a person sits down or stands up [not yet implemented, but can be done soon]
> — gender recognition (woman/man), maybe age recognition
>
> EXAMPLES OF LANGUAGE ABOUT THESE BASIC PERCEPTIONS
>
> While simple, this set of initial basic perceptions lets a wide variety of linguistic constructs get uttered, e.g.:
>
> Bob is looking at Ben
> Bob is telling Jane some bad news
> Bob looked at Jane before walking away
> Bob said he was tired and then sat down
> People more often talk to the people they are next to
> Men are generally taller than women
> Jane is a woman
> Do you think women tend to talk more quietly than men?
> Do you think women are quieter than men?
>
> etc. etc.
>
> It seems clear that this limited domain nevertheless supports a large amount of linguistic and communicative complexity.
>
> SECOND STAGE OF PERCEPTIONS
>
> A second stage of perceptual sophistication, beyond the basic perceptions, would be recognition of a closed class of objects, events and properties, e.g.:
>
> Objects:
> — Feet, hands, hair, arms, legs (we should be able to get a lot of this from the skeleton tracker)
> — Beard
> — Glasses
> — Head
> — Bottle (e.g. water bottle), cup (e.g. coffee cup)
> — Phone
> — Tablet
>
> Properties:
> — Colors: a list of color values can be recognized, I guess
> — Tall, short, fat, thin, bald — for people
> — Big, small — for a person
> — Big, small — for a bottle, phone or tablet
>
> Events:
> — Handshake (between people)
> — Kick (person A kicks person B)
> — Punch
> — Pat on the head
> — Jump up and down
> — Fall down
> — Get up
> — Drop (object)
> — Pick up (object)
> — Give (A gives object X to B)
> — Put down (object) on table or floor
>
> CORPUS PREPARATION
>
> While the crux of the proposed project is learning via real-time interaction between the robot and humans, in the early stages it will also be useful to experiment with "batch learning" from recorded videos of human interactions, filmed from the robot's point of view.
>
> As one part of supporting this effort, I'd suggest that we:
>
> 1) create a corpus of videos of 1-5 people interacting in front of the robot, from the robot's cameras
>
> 2) create a corpus of sentences describing the people, objects and events in the videos, associating each sentence with a particular time interval in one of the videos
>
> 3) translate the sentences to Lojban and add them to our parallel Lojban corpus, so we can be sure we have good logical mappings of all the sentences in the corpus
>
> Obviously, including the Second Stage perceptions along with the Basic Perceptions allows a much wider range of descriptions, e.g.:
>
> A tall man with a hat is next to a short woman with long brown hair
> The tall man is holding a briefcase in his left hand
> The girl who just walked in is a midget with only one leg
> Fred is bald
> Vytas fell down, then Ruiting picked him up
> Jim is pointing at her hat.
> Jim pointing at her hat and smiling made her blush.
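The three corpus-preparation steps above amount to a time-aligned annotation scheme: each descriptive sentence is tied to an interval of a robot's-eye video, with a parallel Lojban rendering. A minimal sketch, where the field names are assumptions rather than a fixed schema:

```python
# Sketch of one time-aligned annotation record; field names are assumed.
from dataclasses import dataclass

@dataclass
class Annotation:
    video_id: str           # which robot's-eye video this describes (step 1)
    t_start: float          # start of the described interval, in seconds
    t_end: float            # end of the interval
    english: str            # descriptive sentence (step 2)
    lojban: str = ""        # parallel Lojban translation (step 3)
    basic_only: bool = True # True if only Basic Perceptions are involved

corpus = [
    Annotation("vid_003", 12.0, 15.5, "Bob is looking at Ben"),
    Annotation("vid_003", 15.5, 21.0,
               "Vytas fell down, then Ruiting picked him up",
               basic_only=False),
]

# Per the suggestion below, at least half the sentences should involve
# only Basic Perceptions:
frac_basic = sum(a.basic_only for a in corpus) / len(corpus)
print(frac_basic)  # 0.5
```

Keeping the Basic/Second-Stage distinction as an explicit flag makes it easy to filter the corpus for the initial experiments.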
> However, for initial work, I would say it's best if at least 50% of the descriptive sentences involve only Basic Perceptions, so we can get language-learning experimentation rolling right away, without waiting for extended perception.
>
> LANGUAGE LEARNING
>
> What I then suggest is that we:
>
> 1) Use the ideas from Linas & Ben's "unsupervised language learning" paper to learn a small "link grammar dictionary" from the corpus mentioned above. Critically, the features associated with each word should include features from non-linguistic PERCEPTION, not just features from language. (The algorithms in the paper support this, even though non-linguistic features are only very briefly mentioned in the paper.) There are various ways to use PLN inference chaining and Shujing's information-theoretic Pattern Miner (both within OpenCog) in the implementation of these ideas.
>
> 2) Once (1) is done, we then have a parallel corpus of quintuples of the form
>
> [audiovisual scene, English sentence, parse of sentence via link grammar with learned dictionary, Lojban sentence, PLN-Atomese interpretation of Lojban sentence]
>
> We can take the pairs
>
> [parse of sentence via link grammar with learned dictionary, PLN-Atomese interpretation of Lojban sentence]
>
> from this corpus and use them as the input to a pattern-mining process (maybe a suitably restricted version of the OpenCog Pattern Miner, maybe a specialized implementation), which will mine ImplicationLinks serving the function of current RelEx2Logic rules.
>
> The above can be done for sentences about Basic Perceptions only, and also for sentences about Second Stage Perceptions.
>
> NEXT STEPS FOR LANGUAGE LEARNING
>
> The link grammar dictionary learned as described above will have limited scope.
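The learning signal behind step (1) is essentially word-pair association strength. As a toy illustration only — not the OpenCog pipeline, which works over parse disjuncts and can fold in perceptual features — pointwise mutual information over adjacent word pairs in a tiny corpus can be computed like this:

```python
# Toy illustration of word-pair pointwise mutual information (PMI);
# not the actual OpenCog unsupervised-language-learning implementation.
import math
from collections import Counter

sentences = [
    "bob is looking at ben",
    "bob is talking to jane",
    "jane is looking at bob",
]

word_counts = Counter()
pair_counts = Counter()
for s in sentences:
    words = s.split()
    word_counts.update(words)
    pair_counts.update(zip(words, words[1:]))  # adjacent pairs only

n_words = sum(word_counts.values())
n_pairs = sum(pair_counts.values())

def pmi(w1, w2):
    """PMI of an adjacent word pair: log2 of observed vs. chance rate."""
    p_pair = pair_counts[(w1, w2)] / n_pairs
    return math.log2(p_pair / ((word_counts[w1] / n_words) *
                               (word_counts[w2] / n_words)))

# "looking at" co-occurs far more strongly than chance:
print(round(pmi("looking", "at"), 2))  # 3.23
```

In the real setting, high-association pairs seed link-grammar connectors, and word features drawn from the percept stream (who was talking, what was visible) enter the same statistics alongside the purely linguistic ones.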
However, it can potentially be used as the SEED for a larger link grammar dictionary to be learned from unsupervised analysis of a larger text corpus, for which nonlinguistic correlates of the linguistic constructs are not available. This will be a next step of experimentation.
>
> NEXT STEPS FOR INTEGRATION
>
> Obviously, what can be done with simple perceptions can be done with more complex perceptions as well; the assumption of simple perceptions is just because that's what we have working or almost-working right now. But Hanson Robotics will put significant effort into making better visual perception for their robots, and as this becomes a reality we will be able to use it within the above process.
>
> --
> Ben Goertzel, PhD
> http://goertzel.org
>
> Super-benevolent super-intelligence is the thought the Global Brain is currently struggling to form...
