> > INPUT
> > =====
> > I suggest it should initially include 3 inputs:
> > 1) vision
> >     The visual input would be the most computationally demanding,
> >     and I suggest reducing the resolution to as low as 32x32.
> > 2) text
> >     Text inputs will be passed through literally, without
> >     understanding. I include them as a separate "sense" in
> >     a standardized AGI framework.
> > 3) sound
> >     May be added later.
> 
> Actually, I suggest doing vision, sound, and touch processing roughly
> simultaneously.  Sound and touch are computationally easier than vision,
> and many kinds of learning go more easily when you have multiple sensory
> inputs regarding the same perceived world.

I am totally unfamiliar with artificial touch, but I do know some
neuroscience of touch and proprioception. I will try to include
these senses, but I'll need details of such a robotic arm/body,
and preferably one to play with =)
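
For concreteness, here is a minimal sketch of what one time-slice of
such a multi-sensory input might look like. The names and the
sound/touch fields are my own placeholders, not part of the proposal;
only the 32x32 vision resolution comes from above:

    from dataclasses import dataclass
    from typing import Optional
    import numpy as np

    @dataclass
    class SensoryFrame:
        """One time-slice of raw input to the perception module."""
        vision: np.ndarray                   # 32x32 grayscale, values in [0, 1]
        text: Optional[str] = None           # passed through literally
        sound: Optional[np.ndarray] = None   # placeholder: window of audio samples
        touch: Optional[np.ndarray] = None   # placeholder: pressure readings

    def downsample_vision(image: np.ndarray, size: int = 32) -> np.ndarray:
        """Crude block-average downsampling of a grayscale image to
        size x size, cropping any remainder pixels first."""
        h, w = image.shape
        image = image[: h - h % size, : w - w % size]
        return image.reshape(size, h // size, size, w // size).mean(axis=(1, 3))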

> > OUTPUT
> > ======
> > Its outputs are simple propositions that describe sensory
> > events: basically the "what" and the "where", and perhaps
> > motion classifications.
> 
> Agreed: simple logic propositions!

Maybe Cyc's representation can be used. They use a form of
predicate logic too, and they made a point of it being the
same form that mathematicians use.
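
For concreteness, a sketch of propositions in that spirit: ground
atoms of a simple predicate logic. The predicate names here are my
own hypothetical examples; CycL itself is a much richer language.

    from dataclasses import dataclass
    from typing import Tuple

    @dataclass(frozen=True)
    class Proposition:
        """A ground atom, e.g. isa(obj3, Face) or at(obj3, (12, 5))."""
        predicate: str
        args: Tuple

    # Possible output of the sensory module for one visual frame:
    frame_output = [
        Proposition("isa", ("obj3", "Face")),      # the "what"
        Proposition("at", ("obj3", (12, 5))),      # the "where" (grid cell)
        Proposition("moving", ("obj3", "left")),   # coarse motion class
    ]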

> > ATTENTION
> > =========
> > It should also contain an "attention" input which specifies
> > what kind of information should be focused on. Examples:
> >
> > 1) a specific area in the visual field
> > 2) a certain class of objects such as faces or shapes
> > 3) combinations of the above
> > 4) higher level concepts (which will be developed later)
> 
> Attention should be both input and output.  Cognition can try to direct
> perception, but perception also must be able to determine what's interesting
> and tell cognition about it...

Personally, I think "what's interesting" is a cognitive thing,
i.e. you cannot tell what's interesting from low-level processing
alone. But some of it may be computed in the intermediate
stages.
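
One way to reconcile the two views is to pass small messages in both
directions: top-down requests and bottom-up salience reports. A rough
sketch, with all field names being my own assumptions:

    from dataclasses import dataclass
    from typing import Optional, Tuple

    @dataclass
    class AttentionRequest:
        """Cognition -> perception: what to focus on next."""
        region: Optional[Tuple[int, int, int, int]] = None  # (x, y, w, h)
        object_class: Optional[str] = None                  # e.g. "Face"

    @dataclass
    class SalienceReport:
        """Perception -> cognition: this region looks interesting."""
        region: Tuple[int, int, int, int]
        score: float   # bottom-up salience in [0, 1], e.g. motion/contrast

Whether the score should come from low-level features alone or from
the intermediate stages is exactly the open question above.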

> > INTERACTIVITY
> > =============
> > Object recognition can be partial. For example, you can ask
> > it to recognize an object and it may output "30% A, 70% B"
> > or something like that. It is then up to the developer to
> > synthesize information gathered from multiple perspectives.
> > (I don't think the sensory module should be concerned with
> > the latter task.)
> 
> Object recognition is arguably more cognitive than sensory...  Anyway, I'd
> advocate a layered approach where there's a lower level that produces "macro
> features" based on primitive sensory inputs, then a mid level that does
> object recognition based on the lower level's outputs, and then a third
> level that tries to understand the whole scene based on the tentative
> object recognition.  And of course there is feedback between the third and
> mid levels, but not so much feedback into the lowest level, which is kinda
> hard-wired...

During *learning*, communication among ALL layers would still
be necessary. But I generally agree on the layered structure.
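
On synthesizing the graded outputs ("30% A, 70% B") from multiple
perspectives: a minimal sketch of one standard way to do it, treating
each view as independent evidence, multiplying, and renormalizing
(a naive-Bayes-style independence assumption, which is of course
only approximate):

    def fuse_views(view_distributions):
        """Combine per-view class probabilities by multiplying and
        renormalizing. Each element of view_distributions maps a
        class label to that view's probability for it."""
        classes = view_distributions[0].keys()
        fused = {c: 1.0 for c in classes}
        for dist in view_distributions:
            for c in classes:
                fused[c] *= dist[c]
        total = sum(fused.values())
        return {c: p / total for c, p in fused.items()}

    # Example: two views of the same object
    print(fuse_views([{"A": 0.3, "B": 0.7}, {"A": 0.6, "B": 0.4}]))
    # -> {'A': 0.391..., 'B': 0.608...}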

YKY