> > INPUT
> > =====
> > I suggest it should initially include 3 inputs:
> >
> > 1) vision
> > The visual input would be the most computationally demanding,
> > and I suggest reducing the resolution to as low as 32x32.
> >
> > 2) text
> > These text inputs will be passed literally, without
> > understanding. I include this as a separate "sense" in
> > a standardized AGI framework.
> >
> > 3) sound
> > May be added later.
>
> Actually, I suggest doing vision, sound, and touch processing roughly
> simultaneously. Sound and touch are computationally easier than vision,
> and many kinds of learning will go more easily when you have multiple
> sensory inputs regarding the same perceived world.
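To make the multi-sensory suggestion concrete, here is a minimal Python
sketch of what one time-slice of combined input might look like, with
vision reduced to 32x32 as proposed above. The class layout, the sample
counts, and the touch-array size are all my own assumptions, purely for
illustration:

import numpy as np

class SensoryFrame:
    """One time-slice of multi-modal input (hypothetical layout).

    vision: 32x32 grayscale image, values in [0, 1]
    sound:  a short window of audio samples
    touch:  readings from an (imagined) pressure-sensor array
    """
    def __init__(self, vision, sound, touch):
        assert vision.shape == (32, 32)
        self.vision = vision
        self.sound = sound
        self.touch = touch

def downsample_to_32x32(image):
    """Crude block-averaging downsample; a real system would likely
    use proper anti-aliased resizing instead."""
    h, w = image.shape
    bh, bw = h // 32, w // 32
    return image[:bh * 32, :bw * 32].reshape(32, bh, 32, bw).mean(axis=(1, 3))

# Example: a 256x256 camera frame reduced to the suggested 32x32 input.
frame = SensoryFrame(
    vision=downsample_to_32x32(np.random.rand(256, 256)),
    sound=np.zeros(441),   # e.g. 10 ms of audio at 44.1 kHz
    touch=np.zeros(16),    # e.g. a 16-cell pressure array
)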
I am totally unfamiliar with artificial touch, but I do know some
neuroscience of touch and proprioception. I will try to include these
senses, but I'll need the details of such a robotic arm/body, and
preferably one to play with =)

> > OUTPUT
> > ======
> > Its outputs are simple propositions that describe sensory
> > events: basically the "what" and "where", and perhaps
> > motion classifications.

Agreed: simple logic propositions! Maybe the representation of Cyc can
be used. They use a form of predicate logic too, and they made a point
that it is the same form that mathematicians use. (A rough sketch of
such propositions is at the end of this message.)

> > ATTENTION
> > =========
> > It should also contain an "attention" input which specifies
> > what kind of information should be focused on. Examples:
> >
> > 1) a specific area in the visual field
> > 2) a certain class of objects such as faces or shapes
> > 3) combinations of the above
> > 4) higher level concepts (which will be developed later)
>
> Attention should be both input and output. Cognition can try to direct
> perception, but perception also must be able to determine what's
> interesting and tell cognition about it...

Personally, I think "what's interesting" is a cognitive thing, i.e. you
cannot tell what's interesting from low-level processing alone. But there
may be some amount of it in the intermediate stages. (A sketch of the
two-way attention interface is also below.)

> > INTERACTIVITY
> > =============
> > Object recognition can be partial. For example, you can ask
> > it to recognize an object and it may output "30% A, 70% B"
> > or something like that. Then it's up to the developer to
> > synthesize information gathered from multiple perspectives.
> > (I don't think the sensory module should be concerned with
> > the latter task.)
>
> Object recognition is arguably more cognitive than sensory... Anyway, I'd
> advocate a layered approach where there's a lower level that produces
> "macro features" based on primitive sensory inputs, then a mid level that
> does object recognition based on the lower level's outputs, and then a
> third level that tries to understand the whole scene based on the
> tentative object recognition. And of course there is feedback between the
> third and mid levels, but not so much feedback into the lowest level,
> which is kinda hard-wired...

During *learning*, communication among ALL layers would still be
necessary. But I generally agree on the layered structure. (A toy version
of the three layers is sketched below as well.)
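Re the OUTPUT section: a minimal Python sketch of what the propositions
might look like. The predicate names and symbols are my own inventions,
not Cyc's actual representation (CycL is considerably richer):

from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Proposition:
    """A minimal predicate-logic-style assertion about a sensory event."""
    predicate: str         # e.g. "isa", "located_at", "moving"
    args: Tuple[str, ...]  # constant symbols for objects, regions, etc.

    def __str__(self):
        return f"{self.predicate}({', '.join(self.args)})"

# The "what", the "where", and a motion classification for one event:
what   = Proposition("isa", ("obj17", "Cup"))
where  = Proposition("located_at", ("obj17", "region_3_5"))
motion = Proposition("moving", ("obj17", "leftward"))

for p in (what, where, motion):
    print(p)  # isa(obj17, Cup), located_at(obj17, region_3_5), ...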
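Re the ATTENTION section: the point that attention should flow both ways
might look roughly like this. Both message types are invented here just
to illustrate the idea:

from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class AttentionRequest:
    """Top-down: cognition asks perception to focus on something."""
    region: Optional[Tuple[int, int, int, int]] = None  # x, y, w, h in the visual field
    object_class: Optional[str] = None                  # e.g. "face", "shape"

@dataclass
class AttentionReport:
    """Bottom-up: perception flags something as potentially interesting."""
    region: Tuple[int, int, int, int]
    reason: str      # e.g. "sudden motion", "high contrast"
    salience: float  # 0..1, how strongly it grabbed low-level attention

# Cognition directs perception toward faces in one quadrant...
req = AttentionRequest(region=(0, 0, 16, 16), object_class="face")
# ...and perception can answer back, unprompted:
rep = AttentionReport(region=(20, 4, 8, 8), reason="sudden motion", salience=0.8)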
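Re INTERACTIVITY and the layered proposal: a toy version of the three
levels, with the mid level returning a probability distribution ("30% A,
70% B") rather than a hard label, plus one round of feedback from the top
level. The feature extractor, the class list, and the feedback rule are
all stand-ins:

import numpy as np

def low_level(frame):
    """Level 1 (roughly hard-wired): primitive input -> "macro features".
    Stand-in: mean intensity per 8x8 block of a 32x32 image."""
    return frame.reshape(4, 8, 4, 8).mean(axis=(1, 3)).ravel()

def mid_level(features, prior=None):
    """Level 2: tentative object recognition as a distribution."""
    classes = ["A", "B"]
    scores = np.array([features.sum(), features.std() + 1e-6])  # toy scores
    if prior is not None:  # top-down feedback re-weights the hypotheses
        scores = scores * np.array([prior.get(c, 1.0) for c in classes])
    return dict(zip(classes, scores / scores.sum()))

def high_level(recognition):
    """Level 3: scene interpretation; may push a prior back down."""
    if recognition["B"] > 0.6:  # toy rule: expect more "B" in this scene
        return {"B": 1.5}       # feedback is a bias, not a command
    return None

feats = low_level(np.random.rand(32, 32))
rec = mid_level(feats)
feedback = high_level(rec)
if feedback is not None:        # one feedback pass between levels 3 and 2
    rec = mid_level(feats, prior=feedback)
print(rec)                      # e.g. {'A': 0.97, 'B': 0.03}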
YKY