Interesting thread. My opinion is the best course of action is to pick a scheme you favor and damn the torpedoes! Get to work and plan on a lot of failures. Maybe some success. God, I hope so.
On Tuesday, February 5, 2019, Matt Mahoney <[email protected]> wrote: > It is easy to come up with ideas and to say what approaches to AGI > should be obvious. It is much harder to test them, and when we do we > are sometimes surprised that our ideas don't work. Linas's > dissertation length paper on the equivalence of symbolic and neural > language systems would be a good review of dozens of different > approaches if there were an experimental results section that told us > which ones were worth pursuing. The idea goes back to Rumelhart and > McClelland in the 1980's who proposed neural language models with > neurons representing letters or phonemes, words, and parts of speech. > It makes sense. But we just lacked the computing power to implement > it. > > Rule based grammars seem to make sense because they work on artificial > languages using very little computing power. You parse the sentence > and analyze the tree for semantics. You can cover about half of all > English sentences with a few dozen rules. But English doesn't work > that way. Nobody knows how many rules you need to parse the other > half. Why do we prefer "salt and pepper" over "pepper and salt"? > Furthermore, you have to understand sentences before you can parse > them. Consider: > > I ate pizza with a fork. > I ate pizza with pepperoni. > I ate pizza with Bob. > > The standard measure of language model performance is word perplexity, > or equivalently, text prediction or compression. It is equivalent to a > quantitative version of the Turing test because successful prediction > of a dialog is equivalent to predicting how a human would answer the > same questions. In http://mattmahoney.net/dc/text.html I evaluated > over 1000 versions of 200 language models over the last 12 years, > measuring text prediction accuracy of each one to 9 significant > digits. So I can tell you what really works. 
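The prediction-compression equivalence described above can be made concrete: under arithmetic coding, a symbol assigned probability P costs -log2 P bits, so a model's average coding cost in bits per character is exactly its cross-entropy. A minimal sketch, with made-up per-character probabilities rather than output from any real compressor:

```python
import math

# Toy example: probabilities a model assigned to each character it saw.
# (Illustrative values only, not from any real language model.)
predicted_probs = [0.5, 0.9, 0.7, 0.99, 0.6]

# Arithmetic coding spends -log2(p) bits on a symbol given probability p.
bits = [-math.log2(p) for p in predicted_probs]

# Bits per character = cross-entropy; lower means better prediction.
bpc = sum(bits) / len(bits)

print(f"total bits: {sum(bits):.3f}, bits/char: {bpc:.3f}")
```

Better prediction (probabilities closer to 1 on the actual text) drives bpc down, which is why the compression benchmark doubles as a prediction benchmark.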
> > The top 3 programs and 13 of the top 14 use a context mixing > algorithm, which uses neural networks to combine the predictions of > hundreds or thousands of context models. (Practical compressors may > use 4 to 8 models). The input text is preprocessed, replacing capital > letters with lower case plus a special symbol and merging white space. > Words are replaced with tokens from a dictionary that groups related > words like "monday" and "tuesday" or "mother" and "father". There are > automated methods of building the dictionary, omitting or spelling out > rare words and grouping words by proximity for semantics and > clustering in context space for syntactic role. Next the tokenized > stream is fed to hundreds of context models which predict one bit at a > time. The predictions are combined by a hierarchy of neural networks. > A network takes a bit probability P in (0,1) as input, stretches it: X > = ln(P/(1-P)), computes a weighted sum X = SUM WiXi, and squashes > the output P = 1/(1+e^-X). The weights are then updated by the > prediction error, Wi += L*Xi*(Y-P) depending on the actual bit Y, where > L ~ .001 is the learning rate. This is single layer gradient descent > in coding cost space, which is simpler than descending in RMSE using Wi > += L*Xi*P(1-P)(Y-P) as in back propagation. The final prediction of Y is > arithmetic or ABS coded in -log2 P(Y) bits. > > There are hundreds of models because there are many ways to predict > bits from a context. The simplest is to look up the last N bits > (usually starting at a byte or word boundary) or its hash in a table > that outputs P, then adjust P up or down depending on Y. A more common > method is an indirect context model, where the context hash indexes a > state representing a bit history like 0000000001. Rather than assume a > stationary model and output P = 0.1 or a rapidly changing model and > guess P = 0.5 or 0.9, we look up the state in a second table, output > P, then adjust P depending on Y. 
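The stretch/squash mixing and weight update described above can be sketched as follows. The model probabilities, initial weights, and learning rate here are illustrative placeholders, not values from any actual compressor:

```python
import math

def stretch(p):
    # ln(p / (1 - p)): maps a probability in (0,1) onto the real line
    return math.log(p / (1 - p))

def squash(x):
    # Logistic function, the inverse of stretch
    return 1 / (1 + math.exp(-x))

def mix(probs, weights):
    # Combine model predictions as a weighted sum in the stretched domain
    x = sum(w * stretch(p) for w, p in zip(weights, probs))
    return squash(x)

def update(probs, weights, y, lr=0.001):
    # One step of gradient descent on coding cost: w_i += L * x_i * (y - p)
    p = mix(probs, weights)
    return [w + lr * stretch(pi) * (y - p) for w, pi in zip(weights, probs)]

# Three toy context models predict the next bit; the actual bit is 1.
probs = [0.6, 0.9, 0.3]
weights = [0.4, 0.4, 0.2]
p = mix(probs, weights)          # mixed probability that the next bit is 1
weights = update(probs, weights, y=1)
```

Because the update multiplies the error (Y - P) by the stretched input Xi, a model that confidently predicted the correct bit gains weight while one that confidently predicted the wrong bit loses it; models near P = 0.5 contribute little either way.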
Other context models might look for > long matches and guess whatever bit followed the last match. A model > might be a mix of other models where the weight table W is selected by > a small context. A context can skip letters or words, or it can drop > bits from a token to group related words into the same context. A > model can be tuned by feeding P and a context into an interpolated > table that outputs a new P trained on Y. > > Developing a compressor takes several years of work, testing thousands > of variations, observing tiny variations in compression ratio, like > 0.001%, and deciding which changes are worth keeping because each > model uses time and space. The programs are complex with tens of > thousands of lines of code, in keeping with Legg's theorem that > powerful predictors are necessarily complex and simple universal > predictors do not exist. Some of the top programs use 32 GB of memory > and take a week to compress 1 GB of text, in keeping with my own > observation that prediction accuracy increases logarithmically with > computing power. This is still 1000 times faster than the human brain > processes text. The best compressor achieves 0.93665 bits per > character on this particular benchmark, which is in the range 0.6 to > 1.3 bpc estimated by Shannon in 1950 and below Cover and King's 1978 > estimate of 1.3 to 1.7 bpc on different texts. I don't claim any of > these programs could pass the Turing test or achieve human level > prediction on this particular benchmark because that remains to be > evaluated. > > On Tue, Feb 5, 2019 at 2:08 AM Nanograte Knowledge Technologies > <[email protected]> wrote: > > > > Fair enough question. I'm involved directly in designing pseudo code > (systems models, policies, and logic in a computational format). Second, I > was a 4GL developer and worked directly as a professional in systems dev > and systems engineering for 22 years (my own R&D excluded). 
I currently > employ a small development team to set up the dev environment for my > practical case. I'm more hands-on than my time permits, but that means I'm > learning a lot about Google and specialized plugins and how it all works > together. As such, I've identified the need to develop a custom-encryption > system to protect the data. This is possible via a semantic > application of my NLU. In this sense, semantic means something else. > > > > The next step would be to start coding the actual NLU - already being > deployed for many years - as well as other, mature frameworks, which would > form the layered, reasoning/unreasoning backbone of the eventual system. > All these frameworks are expressed as systems models in the NLU format. > First level = collecting, translating, and normalization to > knowledge-maturity level 5 (my own hierarchies). Second level has > application for evolutionary systems. > > > > In the 1st and 2nd level events, I would employ the best I (and > hopefully my co-funders) would be able to afford to start implementing the > series of designs and algorithms, which already exist in design format. I > might even team up with a university and their postgrad programs. Except > for not yet having been able to resolve IP issues, this has been explored > over a number of years and it seems highly feasible. > > > > Robert Benjamin > > > > > > > > > > > > ________________________________ > > From: Stefan Reich via AGI <[email protected]> > > Sent: Monday, 04 February 2019 6:48 PM > > To: AGI > > Subject: Re: [agi] The future of AGI > > > > Thanks for your input, it's interesting. Are you involved in any code > production? (Sorry if I should know already...) > > > > Stefan > > > > Am Mo., 4. Feb. 
2019 16:58 hat Nanograte Knowledge Technologies < > [email protected]> geschrieben: > > > > Hi Stefan > > > > I meant that there seems to be a popular view emerging, which nudges in > the direction of rethinking the prevailing architectural approach towards > enabling agi. It further means I'm recognizing how the pattern might be > shifting, and that I'm in support of such a view. In my opinion - and with > respect to the incredible effort that has gone into such ventures - > attempting to duplicate the human brain was never a sound-enough approach. > Such a fallible organ. > > > > Modern-day, real-time language translators offer sufficient advancement > in NLU, do they not? I like your suggestion about converging around > image/audio recognition and learning logic as a single unit of cognition > (perhaps). The latest AI can accurately read lips at a distance. > Furthermore, apps now perform facial recognition from among crowds and > track those faces. Some AI apps monitor and analyze bio-metric forces > (electro-magnetic forces) around the body and other visible human > characteristics as tell-tale indicators of inner intent and emotional > states. It helps to identify potential criminals and deceivers. In > addition, many computer games have shown a reactive-learning capability > based on cause-effect scenarios. And then you go and casually plonk in the > mother lode - evolutionary algorithms. > > > > This is the exact point at which I restate the likely need of a radical > new approach. If we cannot express computational evolution in terms of > recombination and diversification, we may not yet have managed to cross our > own intellectual Abyss. > > > > As some suggested here (in my own words); we are inherently restricted > by our own human-reasoning universe. Is constructive reasoning about an > unreasoning universe the required level of super-positional madness > designers should attain, or should we rather entice the machine to indulge > itself accordingly? 
Maybe then, a bit of both. > > > > I think, first, we should ourselves evolve via recombination, not > adaptation. Morphing, not mimicking. If researchers and designers > voluntarily became agi, perhaps we would understand it a little better. > Sure, the world would probably reject us and call us nuts (as was done with > Tesla), but they would still appropriate our output. > > > > Such a radical approach. How to do our damnedest not to try and make any > sense of it at all, purely relying on our collective ken and instinct. Some > say ancient-astronautical mindsets, merely following in the footprints that > were already laid down for those who would follow after and read the signs. > > > > Only time would tell. I'm enjoying the journey. The destination is not > my concern. There is no more right, or wrong. Only to be correct in every > instance of a moment presented to our manifestation (in the sense of a > physical artifact with identity). In my lifetime I'd love to synergize with > fellow pilgrims though. I see a think tank of the quality that Alexander > Graham Bell founded and where scientists and intellectuals and inventors > and passionate others flocked to. I think, this is how humankind might get > closer to manifesting agi. > > > > Robert Benjamin > > > > ________________________________ > > From: Stefan Reich via AGI <[email protected]> > > Sent: Monday, 04 February 2019 2:01 PM > > To: AGI > > Subject: Re: [agi] The future of AGI > > > > > Many commentators here agreed (over time) how agi development requires > a radically-different approach to all other computational endeavors to date. > > > > Not sure what that means. A really good NLU will go a very long way, and > then we'll have to find a new "magic learner" module that replaces neural > networks, both for image/audio recognition and learning logic. I suggest > evolutionary algorithms. 
> > > > On Mon, 4 Feb 2019 at 05:45, Nanograte Knowledge Technologies < > [email protected]> wrote: > > > > Perhaps it's because, for its exponential complexity, agi defies > theoretical science. If no executable framework of computational > intelligence exists, what's the use of being able to run at the speed of > light? > > > > Many commentators here agreed (over time) how agi development requires a > radically-different approach to all other computational endeavors to date. > As evidenced, developing a feasible approach (in the sense of a platform) > would require at least 10 years of R&D. In my opinion, that is correct. In > my case it took more than 22 years - part-time. Towards an agi prototype > then, with 10-years' concentrated effort, perhaps another additional 5-7 > years? > > > > Perhaps we should start pooling our research and resources with those > who offer the best 10-year result to date? I'm beginning to think this > would be the best way forward. Imagine a safe, inclusive, collaborative > environment where R&D parties could post real problems they needed solving > and tangible credit was given to the authors of such solutions? We're > talking sharing in the pot of gold at the end of the rainbow of course. > > > > Except for those sticky-finger, big boys who do not play well with > others at all. I'm quite certain they monitor this list trying to farm it > yet never contributing one bit of usefulness to others. Those we should > weed out from any "collaborative" setup at every opportunity. They are only > in it for themselves, not for the industry, or the benefit of the world. > Yes, you know who you are! > > > > This is the extent of my professional opinion. > > > > Robert Benjamin > > > > ________________________________ > > From: Linas Vepstas <[email protected]> > > Sent: Monday, 04 February 2019 6:16 AM > > To: AGI > > Subject: Re: [agi] The future of AGI > > > > I have no clue what Peter is actually thinking because he's coy and > secretive. 
But I'm not pessimistic. I'm just perplexed why no one ever > seems to try the obvious things. Or why I can never seem to explain obvious > things to anyone and have them understand it. I am quite certain that one > can do better than neural nets and more easily, too, and have explained > exactly how more times than I can count, but my words are not connecting > with anyone who understands them. So, whatever. Day at a time. > > > > --linas > > > > On Sun, Feb 3, 2019 at 5:28 PM <[email protected]> wrote: > > > > I’m not that pessimistic at all. > > > > > > > > Our own AGI project has made steady progress over the past 17 years in > spite of only spending about $10 million – about 150 man-years of focused > effort. We’ve managed to successfully commercialize an early version of > our proto-AGI engine in a company that now employs about 100 people > www.smartaction.com . For the last 5 years my full-time team of about 10 > people has been working on the next generation engine > www.AGIinnovations.com / www.Aigo.ai . We are now ready to commercialize > this more advanced platform. > > > > > > > > Our focus has been limited to natural language comprehension/ learning, > question answering/ inference, and conversation management. > > > > I think that $100 million could go a long way towards functional, > demonstrable proto AGI. It seems to me that DeepMind hasn’t made good use > of the $200 or $300 million spent so far – they lack a proper theory of > intelligence. I don’t know why Vicarious, the other well-funded AGI > company, hasn’t made better progress in perception/ action – my guess, for > the same reason…. > > > > I think all of the theoretical calculations of processing power are > widely off the mark – we’re not trying to reverse-engineer a bird – just > need to build a flying machine. 
> > > > > > > > My articles are here: https://medium.com/@petervoss/my-ai-articles-f154c5adfd37 > > > > > > > > Peter Voss > > > > > > > > From: Linas Vepstas <[email protected]> > > Sent: Friday, February 1, 2019 10:26 PM > > To: AGI <[email protected]> > > Subject: Re: [agi] The future of AGI > > > > > > > > Thanks Matt, very nice post! We're on the same wavelength, it seems. -- > Linas > > > > > > > > On Thu, Jan 31, 2019 at 3:17 PM Matt Mahoney <[email protected]> > wrote: > > > > When I asked Linas Vepstas, one of the original developers of OpenCog > > led by Ben Goertzel, about its future, he responded with a blog post. > > He compared research in AGI to astronomy. Anyone can do amateur > > astronomy with a pair of binoculars. But to make important > > discoveries, you need expensive equipment like the Hubble telescope. > > https://blog.opencog.org/2019/01/27/the-status-of-agi-and-opencog/ > > > > OpenCog began 10 years ago in 2009 with high hopes of solving AGI, > > building on the lessons learned from the prior 12 years of experience > > with WebMind and Novamente. At the time, its major components were > > DeStin, a neural vision system that could recognize handwritten > > digits, MOSES, an evolutionary learner that output simple programs to > > fit its training data, RelEx, a rule based language model, and > > AtomSpace, a hypergraph based knowledge representation for both > > structured knowledge and neural networks, intended to tie together the > > other components. Initial progress was rapid. There were chatbots, > > virtual environments for training AI agents, and dabbling in robotics. > > The timeline in 2011 had OpenCog progressing through a series of > > developmental stages leading up to "full-on human level AGI" in > > 2019-2021, and consulting with the Singularity Institute for AI (now > > MIRI) on the safety and ethics of recursive self improvement. > > > > Of course this did not happen. 
DeStin and MOSES never ran on hardware > > powerful enough to solve anything beyond toy problems. RelEx had all > > the usual problems of rule based systems like brittleness, parse > > ambiguity, and the lack of an effective learning mechanism from > > unstructured text. AtomSpace scaled poorly across distributed systems > > and was never integrated. There is no knowledge base. Investors and > > developers lost interest…. > > > > > > -- > > cassette tapes - analog TV - film cameras - you > > > > -- > > Stefan Reich > > BotCompany.de // Java-based operating systems > > -- > -- Matt Mahoney, [email protected] ------------------------------------------ Artificial General Intelligence List: AGI Permalink: https://agi.topicbox.com/groups/agi/Ta6fce6a7b640886a-Meb01cd83baff21194102456e Delivery options: https://agi.topicbox.com/groups/agi/subscription
