I am replying only to the AGI list because I was put on moderation for trolling on the OpenCog list. Sorry, I guess I was.
On Wed, Apr 10, 2013 at 4:55 AM, Ben Goertzel <[email protected]> wrote:
> A single modern GPU card is enough to run a DeSTIN hierarchy
> processing a video feed with
> reasonably high resolution, in real time. Or so it seems. An open
> question is how many DeSTIN
> centroids are needed in each DeSTIN node, to achieve a high level of
> recognition ability. The
> answer to this question will tell us how big an OpenCog Atomspace
> needs to be, in order to
> appropriately coordinate with DeSTIN...

My estimate is 10^6 to 10^7 for human-level vision, but there are probably some interesting problems that could be solved with a lot less, for example reading text, recognizing faces, or driving a car.

There are two ways we can estimate this number.

1. The number of objects we can visually recognize is greater than our vocabulary, because there are many things we can see but cannot convey in words. Human faces are an example.

2. The information content stored in a neural network (I realize that DeSTIN is different) is on the order of 1 bit per connection, so the network must be big enough to represent the salient knowledge in the training data. We get about 10^7 bits per second from roughly 10^6 optic nerve fibers, over 10^8 to 10^9 seconds by age 3 to 30. It is not clear how much of this we remember or need. If it is 1%, then we would need up to 10^14 connections, or at least 10^7 fully connected neurons (a fully connected network of N neurons has N^2 connections).

A centroid in DeSTIN detects a feature, the same as a neuron does. In a neural net, features can be learned without supervision by using lateral inhibition to form winner-take-all networks. I don't think the clustering algorithm used in DeSTIN is greatly more or less efficient than that.

There are some optimizations we can do on fast sequential machines that we can't do with slow neurons in parallel. For example, the lower layers of the visual cortex contain arrays of feature detectors for lines and edges that vary only in rotation and position relative to the fovea. In a computer, this could be simulated by scanning a single block of filter coefficients over the image. This saves memory, but it does not save computation. Also, this trick would not work for the higher layers, where we detect more complex objects like printed words or faces.

In DeSTIN and many neural architectures, the number of features decreases as you go up the hierarchy. I don't think this is the case for human-level vision, where you have to be able to detect millions of different objects. For the easier problems I mentioned, the number of features would be smaller. Note that bees can navigate in flight with far smaller brains than ours.

Anyway, my suggestion is:

1. Devise specific tests and measurement criteria, for example precision and recall on ImageNet, or reading text.
2. Estimate the computation required and decide whether the goal is feasible.
3. Test the algorithm and publish the results.

It seems that the last result produced by OpenCog (other than commercial projects that nobody knows about) was intelligent game characters in a simulated world several years ago. That tests none of the hard problems in AI, like language, vision, art, or robotics. All of the work has been on software development, and none on basic research. How do you know you are on the right path without ever doing tests or experiments along the way?

Below are a few rough sketches of the arithmetic and mechanisms I described above.
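
First, the connection-count arithmetic from point 2. A minimal sketch using only the order-of-magnitude figures quoted above; the 1% retention figure is an assumption, not a measurement:

# Back-of-envelope estimate of network size for human-level vision.
# All figures are the rough orders of magnitude from the text above.

bits_per_second = 1e7        # ~10^7 bits/s through ~10^6 optic nerve fibers
seconds_of_experience = 1e9  # ~10^9 s is roughly age 30 (1e8 s is roughly age 3)
fraction_retained = 0.01     # assumed: we keep about 1% of what we see

bits_retained = bits_per_second * seconds_of_experience * fraction_retained
connections = bits_retained   # ~1 bit of capacity per connection
neurons = connections ** 0.5  # a fully connected net of N neurons has N^2 connections

print(f"connections: ~{connections:.0e}")          # ~1e+14
print(f"fully connected neurons: ~{neurons:.0e}")  # ~1e+07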
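
Second, the point about centroids as feature detectors and unsupervised learning by lateral inhibition. The sketch below is a generic online winner-take-all centroid update (essentially online k-means), given only to illustrate the idea; it is not DeSTIN's actual clustering rule:

import numpy as np

def winner_take_all_update(centroids, x, lr=0.05):
    """One unsupervised update: the centroid closest to input x 'wins'
    (lateral inhibition silences the rest) and moves a little toward x."""
    dists = np.linalg.norm(centroids - x, axis=1)
    winner = np.argmin(dists)
    centroids[winner] += lr * (x - centroids[winner])
    return winner

# Toy usage: 32 centroids learning features of random 4x4 image patches.
rng = np.random.default_rng(0)
centroids = rng.normal(size=(32, 16))
for _ in range(1000):
    patch = rng.normal(size=16)
    winner_take_all_update(centroids, patch)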
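
Third, the memory-versus-computation point about scanning a block of filter coefficients over the image. A deliberately naive sketch (not how a GPU or the cortex organizes the work) that makes the trade-off explicit: one shared kernel no matter how large the image, but still one dot product per output position:

import numpy as np

def scan_filter(image, kernel):
    """Slide one shared block of filter coefficients over the image.
    Memory: a single kernel, regardless of image size.
    Computation: still one dot product per output position, so no work is saved."""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

# Toy example: a vertical-edge detector scanned over a random image.
image = np.random.rand(64, 64)
edge_kernel = np.array([[1.0, -1.0],
                        [1.0, -1.0]])
response = scan_filter(image, edge_kernel)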
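
Finally, for suggestion 1: precision and recall are cheap to compute once a test set is fixed. The function below is just an illustrative helper, not part of any existing benchmark code:

def precision_recall(predicted, relevant):
    """Precision: fraction of predicted labels that are correct.
    Recall: fraction of relevant (ground-truth) labels that were predicted."""
    predicted, relevant = set(predicted), set(relevant)
    true_positives = len(predicted & relevant)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall

# Example: the recognizer tags an image {"cat", "dog", "car"}; ground truth is {"cat", "tree"}.
print(precision_recall({"cat", "dog", "car"}, {"cat", "tree"}))  # (0.333..., 0.5)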
--
Matt Mahoney, [email protected]
