Steve, where are you getting your cost estimate for AGI? Is it a gut feeling,
or something like the common management practice of "I can afford $X so it will
cost $X"?
My estimate of US $10^15 is based on the value of the world economy: about
$66 trillion per year, growing 5% annually, accumulated over the next 30 years.
That is roughly how long it will take for the internet to grow to the
computational power of 10^10 human brains (10^15 bits of memory and 10^16 OPS
each) at the current rate of growth, doubling every couple of years. Even if
you disagree with these numbers by a factor of 1000, it only moves the time to
AGI by a few years, so the cost estimate hardly changes.
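The arithmetic behind that estimate can be checked directly. This is only a
sketch: the $66 trillion base and 5% growth rate come from the paragraph above,
while the 40-year comparison horizon is my own illustration of the claim that
the cost estimate is insensitive to errors in the hardware numbers.

```python
# Cumulative world economic output, starting at $66 trillion/year and
# growing 5% annually (figures from the post above).
base = 66e12                    # US$ per year
growth = 1.05                   # 5% annual growth

def cumulative_output(years):
    """Total world output summed over the given number of years."""
    return sum(base * growth ** t for t in range(years))

print(f"30-year total: ${cumulative_output(30):.1e}")   # on the order of $10^15

# Insensitivity check: pushing the AGI date out by a decade changes the
# cumulative figure by less than a factor of two, so the order-of-magnitude
# cost estimate survives large errors in the underlying hardware numbers.
print(f"40-year total: ${cumulative_output(40):.1e}")
print(f"ratio: {cumulative_output(40) / cumulative_output(30):.2f}")
```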
And even if the hardware is free, you still have to program or teach the AGI
about 10^16 to 10^17 bits of knowledge, assuming 10^9 bits of knowledge per
brain [1] and that 1% to 10% of this is not known by anyone else. Software and
training costs are not affected by Moore's law. Even assuming human-level
language understanding and perfect sharing of knowledge, it will cost you 1% to
10% of your working life to train the AGI to do your job.
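The knowledge-volume range above can be reproduced with the same figures. A
sketch: the 10^9 bits per brain is Landauer's estimate [1] and the 1% to 10%
unique fraction is the post's assumption; the contributing population of order
10^9 people is my own reading of the implied figure, since it is what makes the
stated 10^16 to 10^17 range come out.

```python
# Reproducing the knowledge-volume range above.
bits_per_brain = 1e9      # learned knowledge per person (Landauer [1])
population = 1e9          # contributing people, order of magnitude (assumed)
total_bits = bits_per_brain * population        # ~10^18 bits in all heads

for frac in (0.01, 0.10):                       # 1% to 10% known by no one else
    unique = total_bits * frac
    print(f"unique at {frac:.0%}: {unique:.0e} bits")
# prints 1e+16 and 1e+17 bits -- the 10^16 to 10^17 range quoted above
```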
Also, we have made *some* progress toward AGI since 1965, but it is mainly a
better understanding of why it is so hard, e.g.
- We know that general intelligence is not computable [2] or provable [3].
There is no "neat" theory.
- From Cyc, we know that coding common sense is more than a 20 year effort.
Lenat doesn't know how much more, but guesses it is maybe between 0.1% and 10%
finished.
- Google is the closest we have to AI after a half trillion dollar effort.
1. Landauer, Tom (1986), "How much do people remember? Some estimates of the
quantity of learned information in long-term memory", Cognitive Science (10),
pp. 477-493.
2. Hutter, Marcus (2003), "A Gentle Introduction to the Universal Algorithmic
Agent AIXI", in Artificial General Intelligence, B. Goertzel and C. Pennachin,
eds., Springer. http://www.idsia.ch/~marcus/ai/aixigentle.htm
3. Legg, Shane (2006), "Is There an Elegant Universal Theory of Prediction?",
Technical Report IDSIA-12-06, IDSIA / USI-SUPSI, Dalle Molle Institute for
Artificial Intelligence, Galleria 2, 6928 Manno, Switzerland.
http://www.vetta.org/documents/IDSIA-12-06-1.pdf
-- Matt Mahoney, [EMAIL PROTECTED]
--- On Sat, 9/6/08, Steve Richfield <[EMAIL PROTECTED]> wrote:
From: Steve Richfield <[EMAIL PROTECTED]>
Subject: Re: AI isn't cheap (was Re: Real vs. simulated environments (was Re:
[agi] draft for comment.. P.S.))
To: [email protected]
Date: Saturday, September 6, 2008, 2:58 PM
Matt,
I heartily disagree with your view as expressed here, and as stated to me by
heads of CS departments and other "high ranking" CS PhDs, nearly (but not
quite) all of whom have lost the "fire in the belly" that we all once had for
CS/AGI.
I DO agree that CS is like every other technological endeavor, in that almost
everything that can be done as a PhD thesis has already been done. But there is
a HUGE gap between a PhD-thesis-scale project and what that same person can do
with another few million dollars and a couple more years, especially if allowed
to ignore the naysayers.
The reply is even more complex than your well documented statement, but I'll
take my best shot at it, time permitting. Here, the angel is in the details.
On 9/5/08, Matt Mahoney <[EMAIL PROTECTED]> wrote:
--- On Fri, 9/5/08, Steve Richfield <[EMAIL PROTECTED]> wrote:
>I think that a billion or so, divided up into small pieces to fund EVERY
>disparate approach to see where the "low hanging fruit" is, would go a
>LONG way in guiding subsequent billions. I doubt that it would take a
>trillion to succeed.
Sorry, the low hanging fruit was all picked by the early 1960's. By then we had
neural networks [1,6,7,11,12],
... but we STILL do not have any sort of useful unsupervised NN, the equivalent
of which seems to be needed for any good AGI. Note my recent postings about a
potential "theory of everything" that would most directly hit unsupervised NNs,
providing not only a good way of operating them, but possibly the provably best
way.
natural language processing and language translation [2],
My Dr. Eliza is right there, and it shows that useful "understanding" outside
of a precise context is almost certainly impossible. I regularly meet with the
folks working on the Russian translator project, and rest assured, things are
STILL advancing fairly rapidly. Here, there is continuing funding, and I expect
that the Russian translator will eventually succeed (they already claim
success).
models of human decision making [3],
These are curious, but I believe them to be emergent properties of processes
that we don't understand at all, so they have no value other than for testing
future systems. Note that "human decision making" does NOT generally include
many advanced sorts of logic that simply don't occur to ordinary humans, which
is where an AGI could shine. Hence, understanding the human but not the
non-human processes is nearly worthless.
automatic theorem proving [4,8,10],
Great for when you already have the answer - but what is it good for?!
natural language databases [5],
Which are only useful if the provably false presumption that NL "understanding"
is generally possible were true.
game playing programs [9,13],
Not relevant for AGI.
optical character recognition [14],
Only recently have methods emerged that are truly font-independent. This SHOULD
have been accomplished long ago (like shortly after your 1960 reference), but
no one wanted to throw significant money at it. I nearly launched an OCR
company (Cognitext) in 1981, but funding eventually failed because I had done
the research and had a new (but unproven) method that was truly
font-independent.
handwriting and speech recognition [15],
... both of which are now good enough for AI interaction (e.g. my Gracie speech
I/O interface to Dr. Eliza), but NOT good enough for general dictation.
Unfortunately, the methods used don't seem to shed much light on how the
underlying processes work in us.
and important theoretical work [16,17,18].
Note again my call for work/help on what I call "computing's theory of
everything" leveraging off of principal component analysis.
Since then we have had mostly just incremental improvements.
YES. This only shows that the support process has long been broken, and NOT
that there isn't a LOT of value just out of reach of PhD-sized projects.
Big companies like Google and Microsoft have strong incentives to develop AI
Internal politics at both (that I have personally run into) restrict
expenditures to PROVEN methods, as a single technical failure spells doom for
the careers of everyone working on them. Hence, their R&D is all D and no R.
and have billions to spend.
Not one dollar of which goes into what I would call genuine "research".
Maybe the problem really is hard.
... and maybe it is just a little difficult. My own Dr. Eliza program
has seemingly unbelievable NL-stated problem solving capabilities, but is built
mostly on the same sort of 1960s technology you cited. Why wasn't it built
before 1970? I see two simple reasons:
1. Joe Weizenbaum, in his Computer Power and Human Reason, explained why this
approach could never work. That immediately made it impossible to get any
related effort funded or acceptable in a university setting.
2. It took about a year to make a demonstrable real-world NL problem solving
system, which would have been at the outer reaches of a PhD or casual personal
project.
I have a similar story for processor architecture which I have discussed here.
It appears possible to build processors that are ~10,000 times faster on the
same fabrication equipment, but the corporate cultures at Intel and others make
this impossible.
From my vantage point, the fallen fruit has already been picked up, but the
best fruit is still on the tree and waiting to be picked. The lowest-hanging
branches may have been cleared, but if you just stand on your tiptoes, most of
it is right there.
Steve Richfield
================
References
1. Ashby, W. Ross (1960), Design for a Brain, 2nd ed., London: Wiley.
Describes a 4 neuron electromechanical neural network.
2. Borko, Harold (1967), Automated Language Processing, The State of the Art,
New York: Wiley. Cites 72 NLP systems prior to 1965, and the 1959-61 U.S.
government Russian-English translation project.
3. Feldman, Julian (1961), "Simulation of Behavior in the Binary Choice
Experiment", Proceedings of the Western Joint Computer Conference 19:133-144
4. Gelernter, H. (1959), "Realization of a Geometry-Theorem Proving Machine",
Proceedings of an International Conference on Information Processing, Paris:
UNESCO House, pp. 273-282.
5. Green, Bert F. Jr., Alice K. Wolf, Carol Chomsky, and Kenneth Laughery
(1961), "Baseball: An Automatic Question Answerer", Proceedings of the Western
Joint Computer Conference, 19:219-224.
6. Hebb, D. O. (1949), The Organization of Behavior, New York: Wiley. Proposed
the first model of learning in neurons: when two neurons fire simultaneously,
the synapse between them becomes stimulating.
7. McCulloch, Warren S., and Walter Pitts (1943), "A logical calculus of the
ideas immanent in nervous activity", Bulletin of Mathematical Biophysics (5)
pp. 115-133.
8. Newell, Allen, J. C. Shaw, H. A. Simon (1957), "Empirical Explorations with
the Logic Theory Machine: A Case Study in Heuristics", Proceedings of the
Western Joint Computer Conference, 15:218-239.
9. Newell, Allen, J. C. Shaw, and H. A. Simon (1958), "Chess-Playing Programs
and the Problem of Complexity", IBM Journal of Research and Development,
2:320-335.
10. Newell, Allen, H. A. Simon (1961), "GPS: A Program that Simulates Human
Thought", Lernende Automaten, Munich: R. Oldenbourg KG.
11. Rochester, N., J. H. Holland, L. H. Haibt, and W. L. Duda (1956), "Tests on
a cell assembly theory of the action of the brain, using a large digital
computer", IRE Transactions on Information Theory IT-2: pp. 80-93.
12. Rosenblatt, F. (1958), "The perceptron: a probabilistic model for
information storage and organization in the brain", Psychological Review (65)
pp. 386-408.
13. Samuel, A. L. (1959), "Some Studies in Machine Learning using the Game of
Checkers", IBM Journal of Research and Development, 3:211-229.
14. Selfridge, Oliver G., Ulric Neisser (1960), "Pattern Recognition by
Machine", Scientific American, Aug., 203:60-68.
15. Uhr, Leonard, Charles Vossler (1963) "A Pattern-Recognition Program that
Generates, Evaluates, and Adjusts its own Operators", Computers and Thought, E.
A. Feigenbaum and J. Feldman eds, New York: McGraw Hill, pp. 251-268.
16. Turing, A. M., (1950) "Computing Machinery and Intelligence", Mind,
59:433-460.
17. Shannon, Claude, and Warren Weaver (1949), The Mathematical Theory of
Communication, Urbana: University of Illinois Press.
18. Minsky, Marvin (1961), "Steps toward Artificial Intelligence", Proceedings
of the Institute of Radio Engineers, 49:8-30.
-- Matt Mahoney, [EMAIL PROTECTED]
-------------------------------------------
agi
Archives: https://www.listbox.com/member/archive/303/=now
RSS Feed: https://www.listbox.com/member/archive/rss/303/
Modify Your Subscription: https://www.listbox.com/member/?&
Powered by Listbox: http://www.listbox.com