See "Coffee test", "Robot college student test", "employment test" ;)
https://analyticsindiamag.com/5-ways-to-test-whether-agi-has-truly-arrived/

Or, more seriously, look at the section "Scenarios for Assessing AGI" in our old paper "Mapping the Landscape of AGI": https://ojs.aaai.org//index.php/aimagazine/article/view/2322

On Sat, May 1, 2021 at 2:59 AM Jon P <[email protected]> wrote:
>
> Hi Ben,
>
> Thanks for hosting the AGI discussion yesterday, it was interesting to hear more about the Galois Connections and COFO ideas you've been working on. I've been reading your papers a bit and they are interesting.
>
> One question I have: how do you know if you've built an AGI? Is there a test set of some kind which can be used to verify how general the intelligence is? For example, if a single system could play chess, identify pictures of dogs, and answer maths problems written in words, is that sufficient to declare it an AGI, or would it need to be able to do a lot of other tasks? I am not sure if there has been a lot of work on this already and whether benchmarks are well defined.
>
> Jon
>
> On Friday, April 23, 2021 at 5:55:55 AM UTC+1 Ben Goertzel wrote:
>>
>> >> The bio-Atomspace we are experimenting with now contains only a small % of the biomedical knowledge we would like it to, which is because of RAM and processing speed limitations in current OpenCog.
>> >>
>> >> Recent optimizations help but don't remotely come close to solving the problem.
>> >
>> > OK. Well, that's news to me. I try to keep everyone happy, and when there aren't any comments or complaints, I assume everyone is happy. Do you have actual examples, where you are running out of RAM, and where things are going too slow? Or is this just a gut-feel issue, for which you have no actual data?
>>
>> Yes, Hedra (who is working with PLN on bio-Atomspace) is hitting these issues all the time, and because of this she limits the amount of data imported into Atomspace and the scope of queries run against Atomspace (e.g.
>> filtering a query to focus on just a few genes rather than all the genes of interest, etc.)
>>
>> Nil is aware of this work and understands it in greater detail than I do (from an OpenCog usage view, not a biology view), and if you're curious to dig in, asking him is probably the best idea...
>>
>> > I cannot repeat this often enough or strongly enough: the kinds of optimizations that are performed on software systems are extremely data-dependent and algorithm-dependent. It is effectively impossible to perform optimizations without having a specific use case. This is a kind-of theorem of computer science.
>>
>> This is way overstated IMO. For instance, optimizations made to allow fast matrix operations on GPUs for computer graphics turned out to be useful for all sorts of NN and other AI algorithms. Of course things can be further optimized for specific NNs or other AI algos, but nonetheless the more generic optimizations made w/ computer graphics in mind were pretty helpful for optimizing AI...
>>
>> Theorems like "no free lunch" etc. operate at a level of abstraction and extremity that doesn't really help in practical cases, I feel...
>>
>> >> The neural-symbolic grammar learning that Andres Suarez and I prototyped last spring also couldn't viably be done using OpenCog, for similar reasons (RAM and processing speed limitations).
>> >
>> > No one ever complained about RAM or processing speeds, so it's kind of unfair to just bring this up a year later. I had the impression that the theory you were developing wasn't working out; I wasn't surprised, but I never fully understood it.
>>
>> The theory IMO is highly promising, and we paused that work because of other priorities, not any problems w/ the ideas nor any lack of quality in the prototype results.
>> However, to pursue that work using Atomspace in a straightforward way would require importing way more data into a single Atomspace than can be done in RAM on a single current-day machine.
>>
>> > This spring, I restarted work on https://github.com/opencog/learn -- you can review the README for the current status. I get good results. It's a big project. Things go slowly. Not enough time in the day.
>>
>> Cool, I will take a look...
>>
>> >> The experimentation on pattern mining from inference histories for automated inference control, that Nil was doing a year ago, was incredibly slow, also due to Atomspace limitations.
>> >
>> > Ben, that is also incredibly unfair. Never-ever did you or Nil or anyone else ever complain about "atomspace limitations". So you can't just start blaming it now. If there is an actual performance problem, open a github issue, and describe it. Provide instrumentation, bottlenecks.
>>
>> Some problems are too obvious and too severe for it to make sense to take this sort of approach -- problems that clearly can't be fixed by incremental improvements.
>>
>> > I watched those projects from afar, and ... well, all I can say is "that's not how I would have done it". The fact that you had performance problems is almost surely a statement about your algorithms, and not a statement about the atomspace. The atomspace is what it is, and if you use it incorrectly, you'll get disappointing results. It's not a magic wand. It's just software, like any other kind of software.
>>
>> This could have been said about all neural net AI algorithms in the period before we had modern GPUs and their associated software tools. But in fact many algorithms that were run in the 1980s and 1990s with poor results were run a couple decades later on more modern hardware and associated software frameworks, with really exciting results.
>> Without any changes to the core algorithms -- though often some parameter tweaks and straightforward network architecture improvements (of the sort that are straightforward once you're able to iterate quickly, running experiments at the appropriate scale).
>>
>> So the history of AI contains a lot of cases that contradict the sort of assertion you're making. Often algorithms that worked poorly at one scale ended up working great at a more appropriate scale (where "scale" means amount of data and also amount of processor and RAM, appropriately deployed...).
>>
>> >> It is possibly true that for each such case, one could design a specialized architecture to support just that case, working around the need for a general-purpose DAS in that particular case....
>> >
>> > You are describing things that sound like (to me) inadequate or inappropriate algorithms, and then switching the topic to DAS. You don't have to use the atomspace -- you could have done the inference mining on any one of a half-dozen map-reduce platforms out there -- many of them from the Apache.org people -- and you would not have gotten performance that is any better than what the atomspace provides.
>>
>> That is not really true... the mining part could be done way faster than is possible using current OpenCog tools. However, these other tools don't have the flexibility to do the inference part in any non-convoluted way. And if we're going to set things up w/ closely coupled back-and-forth between pattern mining of inference patterns, and inference itself, it's nice if the two aspects are not implemented in totally separate systems with a slow communication channel btw them...
>>
>> > Nothing that I saw Shujing doing with pattern mining was any different than what anyone else in the industry does when they data-mine.
>>
>> Standard datamining algorithms do not look for surprising patterns in hypergraphs.
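To make "surprising patterns" concrete: one common way to score surprisingness is to compare a pattern's observed frequency against what independence of its components would predict. The sketch below is an illustrative toy in Python, not the OpenCog pattern miner -- the hyperedge data and function names are invented for illustration:

```python
from collections import Counter
from itertools import combinations

def surprisingness(pattern_count, component_counts, total):
    """Observed probability of the joint pattern divided by the probability
    expected if its components occurred independently. Scores well above 1
    flag a pattern that co-occurs more often than chance predicts."""
    p_observed = pattern_count / total
    p_expected = 1.0
    for c in component_counts:
        p_expected *= c / total
    return p_observed / p_expected

# Toy "hypergraph": each hyperedge is a frozenset of node labels.
edges = [
    frozenset({"geneA", "geneB", "disease"}),
    frozenset({"geneA", "geneB", "disease"}),
    frozenset({"geneA", "geneC"}),
    frozenset({"geneB", "geneD"}),
]
total = len(edges)
node_counts = Counter(n for e in edges for n in e)
pair_counts = Counter(p for e in edges for p in combinations(sorted(e), 2))

# Score every co-occurring pair of nodes against the independence baseline.
scores = {
    pair: surprisingness(cnt, [node_counts[pair[0]], node_counts[pair[1]]], total)
    for pair, cnt in pair_counts.items()
}
```

Here ("geneA", "geneC") scores above 1 (it co-occurs more than independence predicts) while ("geneA", "geneB") scores slightly below 1. A real miner would of course search a far larger pattern space (arbitrary subhypergraphs, not just node pairs) and use more refined surprisingness measures.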
>> > Given that I don't understand the "applications such as those above", I don't know how to respond. You would have to describe those applications in engineering terms, in order to understand how they could be implemented so as to run efficiently and scalably... without an actual description of what it is, it's not a solvable problem. There's just an insufficient amount of detail.
>>
>> Yeah, of course my email did not contain full detail about these applications; that would be infeasible to give in such a short space and time.
>>
>> >> What are the main differences btw what I described above and what your prototypes do?
>> >
>> > Rocks does not do sharding across the network.
>> >
>> > If you have different fragments of an atomspace dataset on 10 different networked machines, and you want to write a pattern match that will run across all of those machines in parallel, and join together the results, I could write that snippet of code in the proverbial afternoon. It's so simple, in fact, that it could be written as an example, to add to the set of examples. (Actually, I think one of the examples already does this, more or less. Actually, it's a mashup of these two demos: https://github.com/opencog/atomspace/blob/master/examples/atomspace/persist-multi.scm and https://github.com/opencog/atomspace/blob/master/examples/atomspace/persist-query.scm )
>> >
>> > That's not the hard part.
>>
>> Yes, agreed; the above is understood. We were doing something similar in HK some years ago with Mandeep's gearman-based implementation, but clearly your current system is better in various ways.
>>
>> The hard part is a ladder of requirements:
>>
>> > * How do you get the shards of data onto those machines? Do you use rsync to copy files, or do you want to send them via atomspace? If you use rsync, then where will you keep the script for it?
>> > * Where do you keep the list of the currently-active set of 10 machines? Do you need a GUI for that? A phone app?
>> > * What do you do if one or more of them hasn't booted, or has crashed?
>> > * Are they password-protected? The atomspace is not password-protected!
>> >
>> > There are atomspace issues:
>> > * The simplest solution is to wait until all ten have returned results, and then join them together.
>> > * Another possibility is to let the results dribble in, and join them as they arrive. This is more complex, and requires more sophistication. The 10-line demo program now becomes a 100- or 200-line program.
>> > * What if one of the machines has crashed during processing? E.g. bad network card, failed disk, power outage?
>> > * Perhaps you want to load-balance, so that the slowest machine is not always the bottleneck. This requires measuring each machine to see if it is idle or not, and giving it more work if it is idle. This is non-trivial. Most engineers would do this outside of the atomspace, but you could also do it inside the atomspace if you write custom Atoms for it. Does your design require custom Atoms for load-balancing?
>> > * Perhaps the dataset is badly sharded, so that one of the machines is always a bottleneck. This requires not only finding the busiest machine, but then re-sharding the data. Many databases do this automatically. The conventional way in which this is done is to find a sequence of "least cruel cuts" in the Tononi sense, and move those to other machines. Find the cuts that hurt Phi the least. Talking about Phi is fancy-pants buzzword-slinging, but all the people who do data-mining have had a very intuitive understanding of Tononi's Phi for many, many decades, because it's key to both software and hardware optimization. This is easy to say, but finding those cuts is hard to do.
>> > Nothing in opencog today does this automatically. However, I can imagine several possible solutions, ranging from really easy ones to really complex ones, each having pros and cons. Vendors like Oracle have had solutions for this, for decades. They've invested hundreds of man-years into it.
>> > * There's more. I wanted to mention concepts like "explain vacuum analyze" and "query planning", but perhaps some other day. Everyone gets to solve the query planning problem, including Hyperon. There's no free lunch.
>> >
>> > Then there are the data-design issues and meta-issues:
>> > * Perhaps you are storing data as Atoms that should have been Values. Values are a lot faster than Atoms, but they get this performance with a set of function trade-offs.
>> > * Perhaps your data should not be kept in the atomspace at all. This includes audio, video live-streams, text files, medical records, and a zillion other data types.
>>
>> Yes, these are all among the issues that need to be solved in order to make an effective DAS (a goal which you don't have a great affinity for, I understand). The fact that this long list of issues (which is not complete ofc) still remains to be addressed means to me that your RocksDB-based prototype is not actually very close to what we would need for a DAS.
>>
>> But could one usefully build a DAS on top of your current RocksDB code? Surely one could, but it's not yet clear to me that's the optimal approach... maybe it is...
>>
>> >> Hmm, at a high level we did guess a pattern cache was going to be useful -- and Senna implemented one some time ago.
>> >
>> > The concept of a cache is about as generic as the concept of a loop or an if statement. You are effectively saying that Senna thought of using if-statements and loops, and implemented a program that used them. That's just crazy-making!
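To make the fan-out/join step in the ladder above concrete: the "let the results dribble in, and join them as they arrive" option can be sketched as a scatter-gather loop. The shard dictionaries and query_shard stub below are invented in-process stand-ins for networked AtomSpace shards, not any actual AtomSpace/DAS API:

```python
import concurrent.futures as cf

# Hypothetical per-shard query stub; a real system would issue a remote
# pattern match against a networked AtomSpace shard here.
def query_shard(shard, pattern):
    if shard.get("down"):
        raise ConnectionError(f"shard {shard['name']} unreachable")
    return [x for x in shard["data"] if pattern in x]

def scatter_gather(shards, pattern, timeout=5.0):
    """Fan the query out to every shard and merge results as each one
    finishes (rather than waiting for the slowest), recording shards
    that fail; the caller decides whether partial results are OK."""
    merged, failed = [], []
    with cf.ThreadPoolExecutor(max_workers=len(shards)) as pool:
        futures = {pool.submit(query_shard, s, pattern): s for s in shards}
        for fut in cf.as_completed(futures, timeout=timeout):
            shard = futures[fut]
            try:
                merged.extend(fut.result())   # join-as-they-arrive
            except ConnectionError:
                failed.append(shard["name"])  # crashed/unreachable shard
    return merged, failed

shards = [
    {"name": "s1", "data": ["gene:A", "gene:B"]},
    {"name": "s2", "data": ["gene:C", "protein:X"]},
    {"name": "s3", "data": ["gene:D"], "down": True},
]
results, failed = scatter_gather(shards, "gene:")
```

Joining as futures complete means a slow shard delays only its own results, and a crashed shard (s3 here) is reported rather than hanging the whole query -- which is exactly where the 10-line demo starts growing toward a 100- or 200-line program.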
>> What I mean is that Senna implemented in 2017 a Pattern Index that allows one to create special indices to accelerate lookup of particular sorts of patterns in Atomspace:
>>
>> https://github.com/andre-senna/opencog/blob/feature_pattern_index/opencog/learning/pattern-index/README.md
>>
>> It's not the same exact idea as your pattern cache, though.
>>
>> -- Ben
>
> --
> You received this message because you are subscribed to the Google Groups "opencog" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
> To view this discussion on the web visit https://groups.google.com/d/msgid/opencog/42d634d7-364b-4cca-b669-57ff2b0c7c0bn%40googlegroups.com.

--
Ben Goertzel, PhD
http://goertzel.org

"He not busy being born is busy dying" -- Bob Dylan

--
You received this message because you are subscribed to the Google Groups "opencog" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/opencog/CACYTDBfz-UyutNLuLu6jfcmj9Wfmxz93OXq4ChrfsSFb4BNhiA%40mail.gmail.com.
