The complexity of vision depends on ... how complex your vision system is. ^_^

On Thu, Apr 4, 2013 at 6:02 PM, Mike Tintner <[email protected]> wrote:

> You're assuming that vision is mainly about object recognition. That's
> still mindblowingly difficult, but relatively easy compared to the main
> task of vision.
>
> The main task is not to recognize the objects in a scene - it is to
> recognize how those objects *connect*, including "object mechanics." Who or
> what is doing what to who or what? Did this guy fall because that guy
> punched him or because he stumbled? Is the chair supporting him, or is he
> squatting just over it? Is the book lying on the box, or stuck to it? Are
> the flowers bending because of a wind or what? Is that a mess, or an
> orderly array of papers?
>
> "Object mechanics" includes how objects keep moving. Where will that
> moving object end up? Will he walk straight into the guy ahead? Will that
> ball hit the window, or will he have time to catch it first?
>
> And so on and on.
>
> The main task of vision/common sense/consciousness is to understand the
> *object connectivity* and *mechanics* of the agent's world. For living
> agents, the relevant objects and mechanics of the world to be analysed,
> increase in complexity with the complexity of the agent's body and mind,
> and therefore capacity to interact with the world. (So start your AGI at
> worm or simpler level not human level).
>
> P.S. One should add that present scientists and technologists are
> *extremely* ill-equipped to deal with human or animal vision from any AGI
> perspective. They are totally conditioned to think and look analytically in
> considering any visual scene. They are conditioned by the
> sentential/propositional form of logic, maths and language. They think in
> terms of CAT ... SAT ... MAT. They don't even realise that
> the reality being referred to here is not a set of building blocks, but a
> movie of an object/animal moving continuously through a complex scene.
> They don't have the artistic/synthetic sensibility which looks at scenes as
> wholes. They will have to acquire it. When humans look at scenes, we look
> as both analytic scientists and synthetic artists.
>
> -----Original Message-----
> From: Matt Mahoney
> Sent: Thursday, April 04, 2013 11:25 PM
> To: AGI
> Subject: Complexity of vision (was Re: [agi] Utilizing kickstarter.com?)
>
> On Wed, Apr 3, 2013 at 12:11 PM, Ben Goertzel <[email protected]> wrote:
>
>> By using more efficient algorithms than the human brain does ...
>>>
>>> How do you know that such algorithms exist? How do you calculate the
>>> complexity?
>>>
>> What matters is the average case complexity, relative to the
>> probability distributions characterizing the actual environments and
>> goals relevant to the AGI system...
>>
>> There is no good math for calculating this kind of complexity...
>>
>> So, we are relying in significant part on intuition here....
>>
>
> Turing's intuition was that computers were already fast enough to
> solve AI. This was before vacuum tube computers like ENIAC, so I
> presume he meant mechanical relays.
>
> Anyway, I would like opinions on the computational complexity of human
> vision. Specifically, how would you optimize Google's cat face
> recognizer and bring it up to human level?
> http://128.84.158.119/abs/1112.6209v3
>
> Their current implementation is a 9 layer neural network with 10^9
> connections. It was trained on 10^7 256x256 grayscale images for 3
> days on 16,000 CPU cores. It is 15.8% accurate on ImageNet, 70% better
> than any other system. Presumably, humans would be able to recognize
> most of the images.
>
> Google's system recognizes only still images in isolation. To bring it
> to human level, it would have to model motion, color, and stereoscopic
> depth perception.
> It would have a fovea and model saccades, for
> example, scanning important visual features such as corners, faces,
> words, and moving objects. It would have to be integrated with other
> senses to aid recognition. For example, when you turn your head, the
> model should predict how the image will change and extract features
> from the residual errors. Vision makes heavy use of context. For
> example, you can more easily recognize a co-worker at work than at the
> store.
>
> By adulthood we see the equivalent of 10^10 images at a frame rate of
> around 10 per second. Each frame has 10^8 pixels, although to be fair,
> this is reduced to 10^6 low-level features by the retina. A single
> processor running at 10^10 OPS could easily do this. It is harder to
> estimate the number of higher level features processed by the (much
> larger) visual cortex, such as lines, edges, and movement, and then
> going up the hierarchy, corners, letters, words, faces, and familiar
> objects. The number of top level features would be at least as large
> as our vocabulary, about 10^5, although it is probably much higher or
> else we could adequately use words to convey pictures.
>
> Google's system is trained on 10^11 bits. The optic nerve transmits
> 10^16 bits by adulthood, or 10^5 times as much. Coincidentally, our
> brain has 10^5 times as many synapses (10^14) as Google's model. We
> don't need 10^5 times as many processors because the computation is
> spread out over decades, rather than 3 days. I estimate 10^6 cores at
> 10^9 to 10^10 OPS each.
>
> Is it possible to solve the problem with less hardware? How?
>
> --
> -- Matt Mahoney, [email protected]

-------------------------------------------
AGI
Archives: https://www.listbox.com/member/archive/303/=now
RSS Feed: https://www.listbox.com/member/archive/rss/303/21088071-f452e424
Modify Your Subscription: https://www.listbox.com/member/?member_id=21088071&id_secret=21088071-58d57657
Powered by Listbox: http://www.listbox.com
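
The order-of-magnitude figures in Matt Mahoney's quoted message can be checked with a few lines of arithmetic. The sketch below is only a sanity check, not his actual derivation: the constants are the numbers given in the message, while the function names and the "one operation per synapse per frame" cost model behind `cores_needed` are assumptions made here for illustration. The message does not say how the 10^6-core figure was obtained.

```python
import math

# Figures quoted in the message (orders of magnitude only).
GOOGLE_TRAINING_BITS = 1e11   # training data for Google's network
GOOGLE_CONNECTIONS = 1e9      # connections in Google's network
OPTIC_NERVE_BITS = 1e16       # bits sent over the optic nerve by adulthood
BRAIN_SYNAPSES = 1e14         # synapses in the human brain
FRAMES_BY_ADULTHOOD = 1e10    # equivalent images seen by adulthood
FRAME_RATE_HZ = 10            # approximate frames per second

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def orders_of_magnitude(larger, smaller):
    """Return the power of ten separating two quantities, rounded."""
    return round(math.log10(larger / smaller))


def viewing_years(frames, frame_rate_hz):
    """Years of continuous viewing needed to see `frames` images."""
    return frames / frame_rate_hz / SECONDS_PER_YEAR


def cores_needed(synapses, frame_rate_hz, ops_per_core):
    """Cores required to touch every synapse once per frame.

    ASSUMPTION (not stated in the message): one operation per synapse
    per frame, with perfect parallel scaling across cores.
    """
    return synapses * frame_rate_hz / ops_per_core


if __name__ == "__main__":
    print("data ratio:  10^%d" % orders_of_magnitude(
        OPTIC_NERVE_BITS, GOOGLE_TRAINING_BITS))
    print("model ratio: 10^%d" % orders_of_magnitude(
        BRAIN_SYNAPSES, GOOGLE_CONNECTIONS))
    print("viewing time: %.1f years" % viewing_years(
        FRAMES_BY_ADULTHOOD, FRAME_RATE_HZ))
    for ops in (1e9, 1e10):
        print("cores at %.0e OPS: %.0e" % (
            ops, cores_needed(BRAIN_SYNAPSES, FRAME_RATE_HZ, ops)))
```

Under these assumptions both ratios come out at 10^5, 10^10 frames at 10 per second correspond to roughly 32 years of viewing, and the core count lands between 10^5 and 10^6, which is in the same range as the estimate in the message. A different cost model (for example, several operations per synapse per frame) would shift the core count proportionally.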
