Re: Complexity of vision (was Re: [agi] Utilizing kickstarter.com?)

Juan Carlos Kuri Pinto Thu, 04 Apr 2013 16:34:36 -0700

The complexity of vision depends on ...
... how complex your vision system is. ^_^



On Thu, Apr 4, 2013 at 6:02 PM, Mike Tintner <[email protected]> wrote:

> You're assuming that vision is mainly about object recognition.  That's
> still mindblowingly difficult, but relatively easy compared to the main
> task of vision.
>
> The main task is not to recognize the objects in a scene - it is to
> recognize how those objects *connect*, including "object mechanics." Who or
> what is doing what to who or what? Did this guy fall because that guy
> punched him or because he stumbled? Is the chair supporting him, or is he
> squatting just over it? Is the book lying on the box, or stuck to it? Are
> the flowers bending because of a wind or what? Is that a mess, or an
> orderly array of papers?
>
> "Object mechanics" includes how objects keep moving. Where will that
> moving object end up? Will he walk straight into the guy ahead? Will that
> ball hit the window, or will he have time to catch it first?
>
> And so on and on.
>
> The main task of vision/common sense/consciousness is to understand the
> *object connectivity* and *mechanics* of the agent's world. For living
> agents, the relevant objects and mechanics of the world to be analysed
> increase in complexity with the complexity of the agent's body and mind,
> and therefore with its capacity to interact with the world. (So start your
> AGI at the worm level or simpler, not at the human level.)
>
> P.S.  One should add that present scientists and technologists are
> *extremely* ill-equipped to deal with human or animal vision from any AGI
> perspective. They are totally conditioned to think and look analytically in
> considering any visual scene.  They are conditioned by the
> sentential/propositional form of logic, maths and language. They think in
> terms of CAT ... SAT ... MAT. They don't even realise that
> the reality being referred to here is not a set of building blocks, but a
> movie of an object/animal moving continuously through a complex scene. They
> don't have the artistic/synthetic sensibility which looks at scenes as
> wholes. They will have to acquire it. When humans look at scenes, we look
> as both analytic scientists and synthetic artists.
>
> -----Original Message----- From: Matt Mahoney
> Sent: Thursday, April 04, 2013 11:25 PM
> To: AGI
> Subject: Complexity of vision (was Re: [agi] Utilizing kickstarter.com?)
>
>
> On Wed, Apr 3, 2013 at 12:11 PM, Ben Goertzel <[email protected]> wrote:
>
>>>> By using more efficient algorithms than the human brain does ...
>>>>
>>>
>>> How do you know that such algorithms exist? How do you calculate the
>>> complexity?
>>>
>>>
>> What matters is the average case complexity, relative to the
>> probability distributions characterizing the actual environments and
>> goals relevant to the AGI system...
>>
>> There is no good math for calculating this kind of complexity...
>>
>> So, we are relying in significant part on intuition here....
>>
>
> Turing's intuition was that computers were already fast enough to
> solve AI. This was before vacuum tube computers like ENIAC, so I
> presume he meant mechanical relays.
>
> Anyway, I would like opinions on the computational complexity of human
> vision. Specifically, how would you optimize Google's cat face
> recognizer and bring it up to human level?
> http://128.84.158.119/abs/1112.6209v3
>
> Their current implementation is a 9 layer neural network with 10^9
> connections. It was trained on 10^7 256x256 grayscale images for 3
> days on 16,000 CPU cores. It is 15.8% accurate on ImageNet, a 70% relative
> improvement over any other system. Presumably, humans would be able to
> recognize most of the images.
>
> Google's system recognizes only still images in isolation. To bring it
> to human level, it would have to model motion, color, and stereoscopic
> depth perception. It would have a fovea and model saccades, for
> example, scanning important visual features such as corners, faces,
> words, and moving objects. It would have to be integrated with other
> senses to aid recognition. For example, when you turn your head, the
> model should predict how the image will change and extract features
> from the residual errors. Vision makes heavy use of context. For
> example, you can more easily recognize a co-worker at work than at the
> store.
>
> By adulthood we see the equivalent of 10^10 images at a frame rate of
> around 10 per second. Each frame has 10^8 pixels, although to be fair,
> this is reduced to 10^6 low-level features by the retina. A single
> processor running at 10^10 OPS could easily do this. It is harder to
> estimate the number of higher level features processed by the (much
> larger) visual cortex, such as lines, edges, and movement, and then
> going up the hierarchy, corners, letters, words, faces, and familiar
> objects. The number of top level features would be at least as large
> as our vocabulary, about 10^5, although it is probably much higher or
> else we could adequately use words to convey pictures.
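
[A rough check of the arithmetic in the paragraph above, as a minimal Python
sketch. The constant and function names are illustrative and do not come from
the thread. The inputs are the order-of-magnitude estimates quoted above, and
the 365.25-day year is an assumption.]

```python
# Back-of-envelope check of the quoted figures. All inputs are the thread's
# order-of-magnitude estimates, not measurements.
FRAMES_BY_ADULTHOOD = 1e10      # "equivalent of 10^10 images"
FRAMES_PER_SECOND = 10          # "around 10 per second"
FEATURES_PER_FRAME = 1e6        # low-level features after the retina
PROCESSOR_OPS = 1e10            # single processor, operations per second
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumption: Julian year


def viewing_years(frames, fps):
    """Years of continuous viewing needed to see `frames` at `fps`."""
    return frames / fps / SECONDS_PER_YEAR


def ops_budget_per_feature(ops, features_per_frame, fps):
    """Operations available per low-level feature at the given frame rate."""
    return ops / (features_per_frame * fps)


years = viewing_years(FRAMES_BY_ADULTHOOD, FRAMES_PER_SECOND)
budget = ops_budget_per_feature(PROCESSOR_OPS, FEATURES_PER_FRAME, FRAMES_PER_SECOND)

print(f"{years:.1f} years of continuous viewing")
print(f"{budget:.0f} operations per retinal feature")
```

Under these figures, 10^10 frames at 10 per second is 10^9 seconds, or roughly
32 years of continuous viewing, and a 10^10 OPS processor has on the order of
10^3 operations to spend on each low-level feature.
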
>
> Google's system is trained on 10^11 bits. The optic nerve transmits
> 10^16 bits by adulthood, or 10^5 times as much. Coincidentally, our
> brain has 10^5 times as many synapses (10^14) as Google's model. We
> don't need 10^5 times as many processors because the computation is
> spread out over decades, rather than 3 days. I estimate 10^6 cores at
> 10^9 to 10^10 OPS each.
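
[The ratios in the paragraph above can be reproduced with a short Python
sketch. The names are illustrative and not from the thread. The lifetime of
10^9 seconds is an assumption taken from the earlier figure of 10^10 frames at
10 per second.]

```python
import math

# Order-of-magnitude figures quoted in the thread (estimates, not measurements).
GOOGLE_TRAINING_BITS = 1e11
OPTIC_NERVE_BITS = 1e16
GOOGLE_CONNECTIONS = 1e9
BRAIN_SYNAPSES = 1e14
GOOGLE_TRAINING_SECONDS = 3 * 24 * 3600  # 3 days of training
# Assumption: 1e10 frames seen at 10 frames per second.
LIFETIME_SECONDS = 1e10 / 10


def orders_of_magnitude(larger, smaller):
    """Return log10 of the ratio between two positive quantities."""
    return math.log10(larger / smaller)


data_gap = orders_of_magnitude(OPTIC_NERVE_BITS, GOOGLE_TRAINING_BITS)
size_gap = orders_of_magnitude(BRAIN_SYNAPSES, GOOGLE_CONNECTIONS)
time_stretch = LIFETIME_SECONDS / GOOGLE_TRAINING_SECONDS

print(f"data gap:  10^{data_gap:.0f}")
print(f"size gap:  10^{size_gap:.0f}")
print(f"time available relative to 3 days: about {time_stretch:.0f}x")
```

This only reproduces the stated ratios. It does not derive the 10^6 core
estimate, which the message gives without showing its intermediate steps.
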
>
> Is it possible to solve the problem with less hardware? How?
>
> --
> -- Matt Mahoney, [email protected]
>
>
> -------------------------------------------
> AGI
> Archives: https://www.listbox.com/member/archive/303/=now
> RSS Feed: https://www.listbox.com/member/archive/rss/303/6952829-59a2eca5
> Modify Your Subscription: https://www.listbox.com/member/?&;
> Powered by Listbox: http://www.listbox.com
