Hi Ben,

Thank you for this update.
What I am a bit unclear about is which part of language learning is explicit, pre-defined knowledge, and which part is learned without explicit instruction. Stated differently, if explicitly defined knowledge presents a scope, does implicit language learning extend this scope -- or is it about mapping the implicit to the explicit?

Thank you,
Daniel

On Friday, 29 July 2016 04:32:11 UTC+3, Ben Goertzel wrote:
> (proposed R&D project for fall 2016 - 2017)
>
> We are now pretty close (a month away, perhaps?) to having an initial, reasonably reliable version of an OpenCog-controlled Hanson robot head, carrying out basic verbal and nonverbal interactions. This will be able to serve as a platform for Hanson Robotics product development, and also for ongoing OpenCog R&D aimed at increasing levels of embodied intelligence.
>
> This email makes a suggestion regarding the thrust of the R&D side of the ongoing work, to be done once the initial version is ready. This R&D could start around the beginning of September and is expected to take 9-12 months.
>
> GENERAL IDEA:
> An initial experiment in using OpenCog for learning language from experience, using the Hanson robot heads and associated tools.
>
> In other words, the idea is to use simple conversational English regarding small groups of people observed by a robot head as a context in which to experiment with our already-written-down ideas about experience-based language learning.
>
> BASIC PERCEPTION:
>
> I think we can do some interesting language-learning work without dramatic extensions of our current perception framework. Extending the perception framework is valuable, but can be done in parallel with using the current framework to drive language-learning work.
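As a rough sketch only (all names here are hypothetical illustrations, not the actual OpenCog or Hanson Robotics APIs), the per-timestep output of such a perception framework can be thought of as a stream of timestamped predicates over persistent labels:

```python
# Sketch of a timestamped percept stream; names are hypothetical.
from dataclasses import dataclass

@dataclass(frozen=True)
class Percept:
    t: float          # timestamp in seconds
    predicate: str    # e.g. "face_at", "is_talking", "looking_at"
    args: tuple       # persistent labels and values, e.g. ("Bob", "Jane")

stream = [
    Percept(0.0, "face_at", ("Bob", (120, 80))),
    Percept(0.0, "face_at", ("Jane", (310, 95))),
    Percept(0.5, "is_talking", ("Bob",)),
    Percept(0.5, "looking_at", ("Bob", "Jane")),
]

# A trivial derived inference of the kind mentioned below: if A is
# talking while A's face is pointed at B, A is likely talking to B.
talking = {p.args[0] for p in stream if p.predicate == "is_talking"}
talking_to = [p.args for p in stream
              if p.predicate == "looking_at" and p.args[0] in talking]
print(talking_to)  # [('Bob', 'Jane')]
```

The point of the sketch is only that perception delivers symbolic, time-indexed facts that language learning can consume alongside linguistic input.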
> What I think we need to drive language-learning work initially is that the robot can tell, at each point in time:
>
> — where people's faces are (and assign a persistent label to each person's face)
> — which people are talking
> — whether an utterance is happy or unhappy (and maybe some additional sentiment)
> — if person A's face is pointed at person B's face (so that if A is talking, A is likely talking to B) [not yet implemented, but can be done soon]
> — the volume of a person's voice
> — via speech-to-text, what people are saying
> — where a person's hand is pointing [not yet implemented, but can be done soon]
> — when a person is moving, leaving or arriving [not yet implemented, but can be done soon]
> — when a person sits down or stands up [not yet implemented, but can be done soon]
> — gender recognition (woman/man), maybe age recognition
>
> EXAMPLES OF LANGUAGE ABOUT THESE BASIC PERCEPTIONS
>
> While simple, this set of initial basic perceptions lets a wide variety of linguistic constructs get uttered, e.g.:
>
> Bob is looking at Ben
> Bob is telling Jane some bad news
> Bob looked at Jane before walking away
> Bob said he was tired and then sat down
> People more often talk to the people they are next to
> Men are generally taller than women
> Jane is a woman
> Do you think women tend to talk more quietly than men?
> Do you think women are quieter than men?
>
> etc. etc.
>
> It seems clear that this limited domain nevertheless supports a large amount of linguistic and communicative complexity.
>
> SECOND STAGE OF PERCEPTIONS
>
> A second stage of perceptual sophistication, beyond the basic perceptions, would be recognition of a closed class of objects, events and properties, e.g.:
>
> Objects:
> — Feet, hands, hair, arms, legs (we should be able to get a lot of this from the skeleton tracker)
> — Beard
> — Glasses
> — Head
> — Bottle (e.g. water bottle), cup (e.g. coffee cup)
> — Phone
> — Tablet
>
> Properties:
> — Colors: a list of color values can be recognized, I guess
> — Tall, short, fat, thin, bald — for people
> — Big, small — for a person
> — Big, small — for a bottle, phone or tablet
>
> Events:
> — Handshake (between people)
> — Kick (person A kicks person B)
> — Punch
> — Pat on the head
> — Jump up and down
> — Fall down
> — Get up
> — Drop (object)
> — Pick up (object)
> — Give (A gives object X to B)
> — Put down (object) on table or floor
>
> CORPUS PREPARATION
>
> While the crux of the proposed project is learning via real-time interaction between the robot and humans, in the early stages it will also be useful to experiment with "batch learning" from recorded videos of human interactions, filmed from the robot's point of view.
>
> As one part of supporting this effort, I'd suggest that we:
>
> 1) create a corpus of videos of 1-5 people interacting in front of the robot, from the robot's cameras
>
> 2) create a corpus of sentences describing the people, objects and events in the videos, associating each sentence with a particular time interval in one of the videos
>
> 3) translate the sentences to Lojban and add them to our parallel Lojban corpus, so we can be sure we have good logical mappings of all the sentences in the corpus
>
> Obviously, including the Second Stage perceptions along with the Basic Perceptions allows a much wider range of descriptions, e.g.:
>
> A tall man with a hat is next to a short woman with long brown hair
> The tall man is holding a briefcase in his left hand
> The girl who just walked in is a midget with only one leg
> Fred is bald
> Vytas fell down, then Ruiting picked him up
> Jim is pointing at her hat.
> Jim pointing at her hat and smiling made her blush.
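The three corpus-preparation steps above amount to a time-aligned annotation scheme: each descriptive sentence is tied to an interval of a robot's-eye video, with a parallel Lojban rendering. A minimal sketch, where the field names are assumptions rather than a fixed schema:

```python
# Sketch of one time-aligned annotation record; field names are assumed.
from dataclasses import dataclass

@dataclass
class Annotation:
    video_id: str           # which robot's-eye video this describes (step 1)
    t_start: float          # start of the described interval, in seconds
    t_end: float            # end of the interval
    english: str            # descriptive sentence (step 2)
    lojban: str = ""        # parallel Lojban translation (step 3)
    basic_only: bool = True # True if only Basic Perceptions are involved

corpus = [
    Annotation("vid_003", 12.0, 15.5, "Bob is looking at Ben"),
    Annotation("vid_003", 15.5, 21.0,
               "Vytas fell down, then Ruiting picked him up",
               basic_only=False),
]

# Per the suggestion below, at least half the sentences should involve
# only Basic Perceptions:
frac_basic = sum(a.basic_only for a in corpus) / len(corpus)
print(frac_basic)  # 0.5
```

Keeping the Basic/Second-Stage distinction as an explicit flag makes it easy to filter the corpus for the initial experiments.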
> However, for initial work, I would say it's best if at least 50% of the descriptive sentences involve only Basic Perceptions, so we can get language-learning experimentation rolling right away, without waiting for extended perception.
>
> LANGUAGE LEARNING
>
> What I then suggest is that we:
>
> 1) Use the ideas from Linas & Ben's "unsupervised language learning" paper to learn a small "link grammar dictionary" from the corpus mentioned above. Critically, the features associated with each word should include features from non-linguistic PERCEPTION, not just features from language. (The algorithms in the paper support this, even though non-linguistic features are only very briefly mentioned in the paper.) There are various ways to use PLN inference chaining and Shujing's information-theoretic Pattern Miner (both within OpenCog) in the implementation of these ideas.
>
> 2) Once (1) is done, we then have a parallel corpus of quintuples of the form
>
> [audiovisual scene, English sentence, parse of sentence via link grammar with learned dictionary, Lojban sentence, PLN-Atomese interpretation of Lojban sentence]
>
> We can take the pairs
>
> [parse of sentence via link grammar with learned dictionary, PLN-Atomese interpretation of Lojban sentence]
>
> from this corpus and use them as the input to a pattern-mining process (maybe a suitably restricted version of the OpenCog Pattern Miner, maybe a specialized implementation), which will mine ImplicationLinks serving the function of current RelEx2Logic rules.
>
> The above can be done for sentences about Basic Perceptions only, and also for sentences about Second Stage Perceptions.
>
> NEXT STEPS FOR LANGUAGE LEARNING
>
> The link grammar dictionary learned as described above will have limited scope.
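The learning signal behind step (1) is essentially word-pair association strength. As a toy illustration only — not the OpenCog pipeline, which works over parse disjuncts and can fold in perceptual features — pointwise mutual information over adjacent word pairs in a tiny corpus can be computed like this:

```python
# Toy illustration of word-pair pointwise mutual information (PMI);
# not the actual OpenCog unsupervised-language-learning implementation.
import math
from collections import Counter

sentences = [
    "bob is looking at ben",
    "bob is talking to jane",
    "jane is looking at bob",
]

word_counts = Counter()
pair_counts = Counter()
for s in sentences:
    words = s.split()
    word_counts.update(words)
    pair_counts.update(zip(words, words[1:]))  # adjacent pairs only

n_words = sum(word_counts.values())
n_pairs = sum(pair_counts.values())

def pmi(w1, w2):
    """PMI of an adjacent word pair: log2 of observed vs. chance rate."""
    p_pair = pair_counts[(w1, w2)] / n_pairs
    return math.log2(p_pair / ((word_counts[w1] / n_words) *
                               (word_counts[w2] / n_words)))

# "looking at" co-occurs far more strongly than chance:
print(round(pmi("looking", "at"), 2))  # 3.23
```

In the real setting, high-association pairs seed link-grammar connectors, and word features drawn from the percept stream (who was talking, what was visible) enter the same statistics alongside the purely linguistic ones.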
However, it can potentially be used as the SEED for a larger link grammar dictionary to be learned from unsupervised analysis of a larger text corpus, for which nonlinguistic correlates of the linguistic constructs are not available. This will be a next step of experimentation.
>
> NEXT STEPS FOR INTEGRATION
>
> Obviously, what can be done with simple perceptions can be done with more complex perceptions as well; the assumption of simple perceptions is just because that's what we have working or almost-working right now. But Hanson Robotics will put significant effort into making better visual perception for their robots, and as this becomes a reality we will be able to use it within the above process.
>
> --
> Ben Goertzel, PhD
> http://goertzel.org
>
> Super-benevolent super-intelligence is the thought the Global Brain is currently struggling to form...
