Among the many AGI designs and proposals mentioned in this thread, it was
refreshing to see some actual results from Peter Voss's Aigo. (Also
entertaining as my Alexa was listening and answering back while I played
the demo videos). Experimental results are a lot more work to obtain than
ideas, which is why most publishers and reviewers require them. I realize
this is difficult for AGI, which I guess is why 85% of the papers accepted
to the AGI conference still lacked a results section the last time I looked.

My last 20 years of research can be summarized as finding experimental
evidence (not proof) supporting the following hypotheses:

1. The best language models are based on neural networks.
2. Intelligence grows logarithmically with CPU time and memory.
3. Automating all human labor with AGI will probably cost $1 quadrillion.

We recently learned that the best vision models are neural networks. My
work suggests the same is true of language. That conclusion is based on
testing thousands of versions of 200 compression programs since 2006 on
a 1 GB text benchmark (http://mattmahoney.net/dc/text.html). Text
compression measures text prediction or modeling, because a compressor
is just a predictive model plus a coder, and coding is a solved problem.
The top models use dictionary preprocessing to convert words into
tokens, followed by PAQ-style compression that predicts one bit at a
time using ad hoc context features and shallow neural networks. They
implement essentially toddler-level language models with hard-coded
lexical features, proximity-based semantics, simple flat (n-gram)
grammars, and dictionaries sorted by grammatical role (e.g. grouping
"monday" with "tuesday" or "brother" with "sister"). The models so far
lack the advanced grammar needed to understand math, software, or
complex sentences.
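The core PAQ idea, several context models each guessing the next bit,
combined by a small neural mixer trained online, can be sketched
roughly as follows. This is a minimal illustration of logistic mixing,
not the actual PAQ code: real PAQ uses many hand-tuned context models,
probability tables, and further refinement stages, and the class name
and learning rate here are my own simplifications.

```python
import math

class LogisticMixer:
    """Toy sketch of PAQ-style bit prediction: each context model emits
    a probability that the next bit is 1; the mixer combines them in
    the logit domain and updates its weights online by gradient descent
    on log loss."""

    def __init__(self, n_models, lr=0.02):
        self.w = [0.0] * n_models   # one weight per context model
        self.lr = lr

    @staticmethod
    def stretch(p):                 # probability -> logit
        return math.log(p / (1.0 - p))

    @staticmethod
    def squash(x):                  # logit -> probability
        return 1.0 / (1.0 + math.exp(-x))

    def predict(self, probs):
        # Clamp inputs away from 0 and 1, mix in the logit domain.
        self.x = [self.stretch(min(max(p, 1e-6), 1 - 1e-6)) for p in probs]
        self.p = self.squash(sum(w * xi for w, xi in zip(self.w, self.x)))
        return self.p

    def update(self, bit):
        # Gradient of log loss with respect to the mixer output.
        err = bit - self.p
        for i, xi in enumerate(self.x):
            self.w[i] += self.lr * err * xi
```

After each predicted bit, the coder consumes the probability and the
mixer is updated with the actual bit, so models that predict well in
the current context gain weight.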

Before my work on PAQ-based compression, the best models were PPM
(prediction by partial match), which held the lead until about 2003.
PPM predicts bytes rather than bits using the longest matching
contexts. I started work on neural network based compression in 1998,
five years before achieving this result.
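For contrast, the core PPM idea, predicting the next byte from the
longest previously seen context, can be sketched as below. This is a
toy illustration only: real PPM maintains counts incrementally and
assigns escape probabilities to blend the different orders, both of
which this omits.

```python
from collections import defaultdict

def ppm_predict(history, max_order=4):
    """Toy sketch of PPM: count next-byte occurrences for every context
    up to max_order bytes, then predict from the longest context that
    has been seen before (falling back to shorter ones otherwise)."""
    counts = defaultdict(lambda: defaultdict(int))
    for order in range(max_order + 1):
        for i in range(order, len(history)):
            ctx = history[i - order:i]
            counts[ctx][history[i]] += 1
    # Try the longest matching context first, then escape to shorter ones.
    for order in range(max_order, -1, -1):
        ctx = history[len(history) - order:]
        if counts.get(ctx):
            seen = counts[ctx]
            total = sum(seen.values())
            return {sym: c / total for sym, c in seen.items()}
    return {}
```

For example, after seeing b"abracadabra" the longest matching context
is "abra", which was previously followed only by "c".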

The second hypothesis has several caveats. By intelligence, I mean
text prediction accuracy. I show that human-level prediction (which we
have not yet achieved) implies passing the Turing test. Not everyone
accepts the Turing test as a measure of general intelligence, since it
omits non-text processing such as vision, music, and robotics, all of
which are requirements for AGI or for automating labor. Also, my tests
(on the same benchmark) only show a logarithmic trend over the range of
a few bytes up to 32 GB of memory and 1 to 10^6 operations per byte. If
we assume that 10% of the human brain is used to process language, then
the goal figure is 10^13 bits of memory and 10^14 operations per
character.
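A logarithmic trend of this kind is just a straight line when bits per
character is plotted against log(memory), so it can be checked with an
ordinary least-squares fit. The function below is generic; the numbers
in the usage note are made-up round values purely to exercise it, not
actual benchmark results.

```python
import math

def fit_log_trend(mem_bytes, bpc):
    """Least-squares fit of bpc ~ a + b*log10(memory), i.e. the
    logarithmic trend of prediction accuracy versus memory.
    Returns the intercept a and slope b."""
    xs = [math.log10(m) for m in mem_bytes]
    n = len(xs)
    mx = sum(xs) / n
    my = sum(bpc) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, bpc)) / \
        sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return a, b
```

For synthetic points lying exactly on bpc = 4 - (1/3)*log10(mem), such
as (10^3, 3.0), (10^6, 2.0), (10^9, 1.0), the fit recovers a = 4 and
b = -1/3.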

For my third hypothesis, please note I am estimating the cost of several
billion human level intelligences, not just one human level AGI. The two
pieces of evidence I produced in support of my claim are:

3A. My 1998 master's thesis, in which I showed the scalability and
robustness of distributed indexing using computer simulations.
Distributed indexing is an essential feature of an AGI design built
from many independently developed, competing narrow AIs, such as my
2008 proposal. (The thesis is here:
https://cs.fit.edu/~mmahoney/thesis.html ).

3B. I showed that recursive self improvement in a closed environment (boxed
AI, sometimes proposed as a shortcut to AGI or a singularity) is
impossible. http://mattmahoney.net/rsi.pdf
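One simple way to picture the distributed indexing in 3A: hash each
term to the node that owns its posting list, so both storage and query
load spread across peers, and a query only touches the nodes owning its
terms. The sketch below is my own toy illustration of that general
idea, not the specific design simulated in the thesis.

```python
import hashlib

class DistributedIndex:
    """Toy sketch of distributed indexing: each term is hashed to one
    of n nodes, which holds that term's posting list (the set of
    document ids containing it)."""

    def __init__(self, n_nodes):
        # Each "node" is simulated here as a local dict: term -> doc ids.
        self.nodes = [dict() for _ in range(n_nodes)]

    def _node(self, term):
        # Deterministic term-to-node assignment via a hash.
        h = int(hashlib.sha256(term.encode()).hexdigest(), 16)
        return self.nodes[h % len(self.nodes)]

    def add(self, doc_id, text):
        for term in set(text.lower().split()):
            self._node(term).setdefault(term, set()).add(doc_id)

    def search(self, term):
        term = term.lower()
        return self._node(term).get(term, set())
```

Because the assignment is a pure function of the term, any peer can
compute which node to ask without a central directory.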

Of course none of this disproves the possibility of other, less expensive
routes to AGI. But logic based AI is probably not one of them (per my first
result) and early progress does not predict success (per my second result).

-- 
-- Matt Mahoney, mattmahone...@gmail.com

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T731509cdd81e3f5f-Mda6e59327c21a47a77423b17