It is easy to come up with ideas and to declare which approaches to AGI ought to be obvious. It is much harder to test them, and when we do, we are sometimes surprised that our ideas don't work. Linas's dissertation-length paper on the equivalence of symbolic and neural language systems would be a good review of dozens of different approaches if it had an experimental results section telling us which ones were worth pursuing. The idea goes back to Rumelhart and McClelland in the 1980s, who proposed neural language models with neurons representing letters or phonemes, words, and parts of speech. It makes sense. We just lacked the computing power to implement it.
Rule-based grammars seem to make sense because they work on artificial languages using very little computing power. You parse the sentence and analyze the tree for semantics. You can cover about half of all English sentences with a few dozen rules. But English doesn't work that way. Nobody knows how many rules you need to parse the other half. Why do we prefer "salt and pepper" over "pepper and salt"? Furthermore, you have to understand sentences before you can parse them. Consider: "I ate pizza with a fork." "I ate pizza with pepperoni." "I ate pizza with Bob."

The standard measure of language model performance is word perplexity, or equivalently, text prediction or compression. It is a quantitative version of the Turing test, because successful prediction of a dialog is equivalent to predicting how a human would answer the same questions. At http://mattmahoney.net/dc/text.html I have evaluated over 1000 versions of 200 language models over the last 12 years, measuring the text prediction accuracy of each one to 9 significant digits. So I can tell you what really works. The top 3 programs, and 13 of the top 14, use a context mixing algorithm, which uses neural networks to combine the predictions of hundreds or thousands of context models. (Practical compressors may use 4 to 8 models.) The input text is preprocessed, replacing capital letters with lower case plus a special symbol and merging white space. Words are replaced with tokens from a dictionary that groups related words like "monday" and "tuesday" or "mother" and "father". There are automated methods of building the dictionary: omitting rare words and spelling them out, grouping words by proximity for semantics, and clustering in context space for syntactic role. The tokenized stream is then fed to hundreds of context models, which predict one bit at a time. The predictions are combined by a hierarchy of neural networks.
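As a toy illustration of the capitalization step (the function names and escape byte are my own choices; real preprocessors also merge whitespace and substitute dictionary tokens), in Python:

```python
def encode_caps(text, esc='\x01'):
    # Replace each capital letter with an escape symbol plus its
    # lowercase form, so the models see a single-case alphabet.
    out = []
    for c in text:
        if c.isupper():
            out.append(esc + c.lower())
        else:
            out.append(c)
    return ''.join(out)

def decode_caps(text, esc='\x01'):
    # Invert the transform: the escape symbol restores the capital.
    out = []
    i = 0
    while i < len(text):
        if text[i] == esc:
            out.append(text[i + 1].upper())
            i += 2
        else:
            out.append(text[i])
            i += 1
    return ''.join(out)
```

The transform is lossless, so the decompressor can undo it exactly; its purpose is only to make "The" and "the" look the same to the context models.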
A network takes bit probabilities P in (0,1) as input, stretches each one: X = ln(P/(1-P)), computes a weighted sum X = SUM Wi*Xi, and squashes the output: P = 1/(1+e^-X). The weights are then updated in proportion to the prediction error, Wi += L*Xi*(Y-P), where Y is the actual bit and L ~ .001 is the learning rate. This is single layer gradient descent in coding cost space, which is simpler than descending in RMSE using Wi += L*Xi*P*(1-P)*(Y-P) as in back propagation. The actual bit Y is then arithmetic or ABS coded in -log2 P(Y) bits.

There are hundreds of models because there are many ways to predict bits from a context. The simplest is to look up the last N bits (usually starting at a byte or word boundary), or a hash of them, in a table that outputs P, then adjust P up or down depending on Y. A more common method is an indirect context model, where the context hash indexes a state representing a bit history like 0000000001. Rather than assume a stationary model and output P = 0.1, or a rapidly changing model and guess P = 0.5 or 0.9, we look up the state in a second table, output P, then adjust P depending on Y. Other context models might look for long matches and guess whatever bit followed the last match. A model might be a mix of other models where the weight table W is selected by a small context. A context can skip letters or words, or it can drop bits from a token to group related words into the same context. A model can be tuned by feeding P and a context into an interpolated table that outputs a new P trained on Y.

Developing a compressor takes several years of work: testing thousands of variations, observing tiny differences in compression ratio, like 0.001%, and deciding which changes are worth keeping, because each model costs time and space. The programs are complex, with tens of thousands of lines of code, in keeping with Legg's theorem that powerful predictors are necessarily complex and simple universal predictors do not exist.
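Here is a minimal Python sketch of the mixing network described above (the names and toy setup are mine; real context mixing compressors such as PAQ implement this in C++ with fixed-point tables):

```python
import math

def stretch(p):
    # Map a probability in (0,1) to the logit domain: ln(P/(1-P)).
    return math.log(p / (1 - p))

def squash(x):
    # Inverse of stretch: the logistic function 1/(1+e^-x).
    return 1.0 / (1.0 + math.exp(-x))

class Mixer:
    # Single-layer logistic mixer: combines model predictions in the
    # stretched domain and trains by gradient descent on coding cost.
    def __init__(self, n, learning_rate=0.001):
        self.w = [0.0] * n    # one weight per input model
        self.lr = learning_rate
        self.x = [0.0] * n    # last stretched inputs

    def mix(self, probs):
        # probs: each model's probability that the next bit is 1.
        self.x = [stretch(p) for p in probs]
        return squash(sum(w * x for w, x in zip(self.w, self.x)))

    def update(self, p, y):
        # Descending the coding cost -log2 P(Y) gives the update
        # Wi += L * Xi * (Y - P), where y is the actual bit (0 or 1).
        for i, xi in enumerate(self.x):
            self.w[i] += self.lr * xi * (y - p)
```

With this rule a model that is consistently right earns a large positive weight, and one that is consistently wrong earns a negative weight, so the mixer can even exploit anti-correlated predictors. The coder then charges -log2 P(Y) bits per prediction: guessing 0.9 for a bit that turns out to be 1 costs about 0.15 bits.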
Some of the top programs use 32 GB of memory and take a week to compress 1 GB of text, in keeping with my own observation that prediction accuracy increases logarithmically with computing power. This is still 1000 times faster than the human brain processes text. The best compressor achieves 0.93665 bits per character on this particular benchmark, which is within the range of 0.6 to 1.3 bpc estimated by Shannon in 1950, and below Cover and King's 1978 estimate of 1.3 to 1.7 bpc on different texts. I don't claim that any of these programs could pass the Turing test or achieve human-level prediction, because that remains to be evaluated.

On Tue, Feb 5, 2019 at 2:08 AM Nanograte Knowledge Technologies <[email protected]> wrote: > > Fair enough question. I'm involved directly in designing pseudo code (systems > models, policies, and logic in a computational format). Second, I was a 4GL > developer and worked directly as a professional in systems dev and systems > engineering for 22 years (my own R&D excluded). I currently employ a small > development team to set up the dev environment for my practical case. I'm > more hands-on than my time permits, but that means I'm learning a lot about > google and specialized plugins and how all works together. As such, I've > identified the need to develop a custom-encryption system to protect the data > with. This is possible via a semantic application of my NLU. In this sense, > semantic means something else. > > The next step would be to start coding the actual NLU - already being > deployed for many years - as well as other, mature frameworks, which would > form the layered, reasoning/unreasoning backbone of the eventual system. All > these frameworks are expressed as systems models in the NLU format. First > level = collecting, translating, and normalization to knowledge-maturity > level 5 (my own hierarchies). Second level has application for evolutionary > systems. 
> > In the 1st and 2nd level events, I would employ the best I (and hopefully my > co-funders) would be able to afford to start implementing the series of > designs and algorithms, which already exist in design format. I might even > team up with a university and their postgrad programs. Except for not yet > having been able to resolve IP issues, this has been explored over a number > of years and it seems highly feasible. > > Robert Benjamin > > > > > > ________________________________ > From: Stefan Reich via AGI <[email protected]> > Sent: Monday, 04 February 2019 6:48 PM > To: AGI > Subject: Re: [agi] The future of AGI > > Thanks for your input, it's interesting. Are you involved in any code > production? (Sorry if I should know already...) > > Stefan > > On Mon, 4 Feb 2019 at 16:58, Nanograte Knowledge Technologies > <[email protected]> wrote: > > Hi Stefan > > I meant that there seems to be a popular view emerging, which nudges in the > direction of rethinking the prevailing architectural approach towards > enabling agi. It further means I'm recognizing how the pattern might be > shifting, and that I'm in support of such a view. In my opinion - and with > respect to the incredible effort that has gone into such ventures - > attempting to duplicate the human brain was never a sound-enough approach. > Such a fallible organ. > > Modern-day, real-time language translators offer sufficient advancement in > NLU, do they not? I like your suggestion about converging around image/audio > recognition and learning logic as a single unit of cognition (perhaps). The > latest AI can accurately read lips at a distance. Furthermore, apps now > perform facial recognition from among crowds and track those faces. Some AI > apps monitor and analyze bio-metric forces (electro-magnetic forces) around > the body and other visible human characteristics as tell-tale indicators of > inner intent and emotional states. It helps to identify potential criminals > and deceivers. 
In addition, many computer games have shown a > reactive-learning capability based on cause-effect scenarios. And then you go > and casually plonk in the mother lode - evolutionary algorithms. > > This is the exact point at which I restate the likely need of a radical new > approach. If we cannot express computational evolution in terms of > recombination and diversification, we may have not yet managed to cross our > own, intellectual Abyss. > > As some suggested here (in my own words); we are inherently restricted by our > own human-reasoning universe. Is constructive reasoning about an unreasoning > universe the required level of super-positional madness designers should > attain, or should we rather entice the machine to indulge itself accordingly? > Maybe then, a bit of both. > > I think, first, we should ourselves evolve via recombination, not adaptation. > Morphing, not mimicking. If researchers and designers voluntarily became agi, > perhaps we would understand it a little better. Sure, the world would > probably reject us and call us nuts (as was done with Tesla), but they would > still appropriate our output. > > Such a radical approach. How to do our damndest not to try and make any sense > of it at all, purely relying on our collective ken and instinct. Some say > ancient-astronautical mindsets, merely following in the footprints that were > already laid down for those who would follow after and read the signs. > > Only time would tell. I'm enjoying the journey. The destination is not my > concern. There is no more right, or wrong. Only to be correct in every > instance of a moment presented to our manifestation (in the sense of a > physical artifact with identity). In my lifetime I'd love to synergize with > fellow pilgrims though. I see a think tank of the quality that Alexander > Graham Bell founded and where scientists and intellectuals and inventors and > passionate others flocked to. 
I think, this is how humankind might get closer > to manifesting agi. > > Robert Benjamin > > ________________________________ > From: Stefan Reich via AGI <[email protected]> > Sent: Monday, 04 February 2019 2:01 PM > To: AGI > Subject: Re: [agi] The future of AGI > > > Many commentators here agreed (over time) how agi development requires a > > radically-different approach to all other computational endeavors to date. > > Not sure what that means. A really good NLU will go a very long way, and then > we'll have to find a new "magic learner" module that replaces neural > networks, both for image/audio recognition and learning logic. I suggest > evolutionary algorithms. > > On Mon, 4 Feb 2019 at 05:45, Nanograte Knowledge Technologies > <[email protected]> wrote: > > Perhaps it's because, for its exponential complexity, agi defies theoretical > science. If no executable framework of computational intelligence exists, > what's the use of being able to run at the speed of light? > > Many commentators here agreed (over time) how agi development requires a > radically-different approach to all other computational endeavors to date. > As evidenced, developing a feasible approach (in the sense of a platform) > would require at least 10 years of R&D. In my opinion, that is correct. In my > case it took more than 22 years - part-time. Towards an agi prototype then, > with 10-years' concentrated effort, perhaps another additional 5-7 years? > > Perhaps we should start pooling our research and resources with those who > offer the best 10-year result to date? I'm beginning to think this would be > the best way forward. Imagine a safe, inclusive, collaborative environment > where R&D parties could post real problems they needed solving and tangible > credit was given to the authors of such solutions? We're talking sharing in > the pot of gold at the end of the rainbow of course. > > Except for those sticky-finger, big boys who do not play well with others at > all. 
I'm quite certain they monitor this list trying to farm it yet never > contributing one bit of usefulness to others. Those we should weed out from > any "collaborative" setup at every opportunity. They are only in it for > themselves, not for the industry, or the benefit of the world. Yes, you know > who you are! > > This is the extent of my professional opinion. > > Robert Benjamin > > ________________________________ > From: Linas Vepstas <[email protected]> > Sent: Monday, 04 February 2019 6:16 AM > To: AGI > Subject: Re: [agi] The future of AGI > > I have no clue what Peter is actually thinking because he's coy and > secretive. But I'm not pessimistic. I'm just perplexed why no one ever seems > to try the obvious things. Or why I can never seem to explain obvious things > to anyone and have them understand it. I am quite certain that one can do > better than neural nets and more easily, too, and have explained exactly how > more times than I can count, but my words are not connecting with anyone who > understands them. So, whatever. Day at a time. > > --linas > > On Sun, Feb 3, 2019 at 5:28 PM <[email protected]> wrote: > > I’m not that pessimistic at all. > > > > Our own AGI project has made steady progress over the past 17 years in spite > of only spending about $10 million – about 150 man-years of focused effort. > We’ve managed to successfully commercialize an early version of our proto-AGI > engine in a company that now employs about 100 people www.smartaction.com . > For the last 5 years my full-time team of about 10 people has been working on > the next generation engine www.AGIinnovations.com / www.Aigo.ai . We are now > ready to commercialize this more advanced platform. > > > > Our focus has been limited to natural language comprehension/ learning, > question answering/ inference, and conversation management. > > I think that $100 million could go a long way towards functional, > demonstrable proto AGI. 
It seems to me that DeepMind hasn’t made good use of > the $200 or $300 million spend so far – they lack a proper theory of > intelligence. I don’t know why Vicarious, the other well-funded AGI company, > hasn’t made better progress in perception/ action – my guess, for the same > reason…. > > I think all of the theoretical calculations of processing power are wildly > off the mark – we’re not trying to reverse-engineer a bird – just need to > build a flying machine. > > > > My articles are here: > https://medium.com/@petervoss/my-ai-articles-f154c5adfd37 > > > > Peter Voss > > > > From: Linas Vepstas <[email protected]> > Sent: Friday, February 1, 2019 10:26 PM > To: AGI <[email protected]> > Subject: Re: [agi] The future of AGI > > > > Thanks Matt, very nice post! We're on the same wavelength, it seems. -- Linas > > > > On Thu, Jan 31, 2019 at 3:17 PM Matt Mahoney <[email protected]> wrote: > > When I asked Linas Vepstas, one of the original developers of OpenCog > led by Ben Goertzel, about its future, he responded with a blog post. > He compared research in AGI to astronomy. Anyone can do amateur > astronomy with a pair of binoculars. But to make important > discoveries, you need expensive equipment like the Hubble telescope. > https://blog.opencog.org/2019/01/27/the-status-of-agi-and-opencog/ > > OpenCog began 10 years ago in 2009 with high hopes of solving AGI, > building on the lessons learned from the prior 12 years of experience > with WebMind and Novamente. At the time, its major components were > DeStin, a neural vision system that could recognize handwritten > digits, MOSES, an evolutionary learner that output simple programs to > fit its training data, RelEx, a rule based language model, and > AtomSpace, a hypergraph based knowledge representation for both > structured knowledge and neural networks, intended to tie together the > other components. Initial progress was rapid. 
There were chatbots, > virtual environments for training AI agents, and dabbling in robotics. > The timeline in 2011 had OpenCog progressing through a series of > developmental stages leading up to "full-on human level AGI" in > 2019-2021, and consulting with the Singularity Institute for AI (now > MIRI) on the safety and ethics of recursive self improvement. > > Of course this did not happen. DeStin and MOSES never ran on hardware > powerful enough to solve anything beyond toy problems. RelEx had all > the usual problems of rule based systems like brittleness, parse > ambiguity, and the lack of an effective learning mechanism from > unstructured text. AtomSpace scaled poorly across distributed systems > and was never integrated. There is no knowledge base. Investors and > developers lost interest…. > > -- > cassette tapes - analog TV - film cameras - you > > -- > Stefan Reich > BotCompany.de // Java-based operating systems

-- Matt Mahoney, [email protected]

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: https://agi.topicbox.com/groups/agi/Ta6fce6a7b640886a-M4eee284726098a93549144ec
Delivery options: https://agi.topicbox.com/groups/agi/subscription
