Re: [agi] A New Metaphor for Intelligence - the Computer/Organiser

2008-09-06 Thread William Pearson
2008/9/5 Mike Tintner [EMAIL PROTECTED]:
 MT: By contrast, all deterministic/programmed machines and computers are
 guaranteed to complete any task they begin.

 Will: If only such could be guaranteed! We would never have system hangs,
 deadlocks. Even if it could be made so, computer systems would not
 always want to do so.

 Will,

 That's a legalistic, not a valid objection (although heartfelt!). In the
 above case, the computer is guaranteed to hang - and it does, strictly,
 complete its task.

Not necessarily: the task could be interrupted and the process stopped
or paused indefinitely.

 What's happened is that you have had imperfect knowledge of the program's
 operations. Had you known more, you would have known that it would hang.

If it hung because of multi-process issues, you would need perfect
knowledge of the environment to know the possible timing issues as
well.

 Were your computer like a human mind, it would have been able to say (as
 you/we all do) - well, if that part of the problem is going to be difficult,
 I'll ignore it... or... I'll just make up an answer... or... by God I'll keep
 trying other ways until I do solve this... or... or...
 Computers, currently, aren't free thinkers.


Computers aren't free thinkers, but that does not follow from any
inability to switch, cancel, pause, restart or modify tasks - all of
which they can do admirably. They just don't tend to do so, because
they aren't smart enough (and cannot change themselves to be so) to
know when it might be appropriate for what they are trying to do, so
it is left up to the human operator.

I'm very interested in computers that self-maintain, that is, reduce
(or eliminate) the need for a human to be in the loop or to know much
about the internal workings of the computer. However, it doesn't need a
vastly different computing paradigm; it just needs a different way of
thinking about the systems. E.g. how can you design a system that does
not need a human around to fix mistakes, upgrade it or maintain it in
general? A rough sketch of what I mean is below.
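For illustration only, a bare-bones sketch in Python of the kind of loop I
mean - a supervisor that kills and retries a task that overruns its budget,
with the task and the limits invented for the example:

    import multiprocessing, time

    def flaky_job():                      # stand-in for any long-running task
        time.sleep(60)                    # pretend it hangs

    def supervise(target, time_limit=5.0, max_retries=3):
        # Run target; if it overruns its time budget, kill it and retry,
        # and eventually give up - no human operator required.
        for attempt in range(1, max_retries + 1):
            p = multiprocessing.Process(target=target)
            p.start()
            p.join(time_limit)            # wait, but only this long
            if not p.is_alive():
                return "completed"
            p.terminate()                 # the system, not the human, pulls the plug
            print("attempt %d overran its budget, restarting" % attempt)
        return "abandoned"

    if __name__ == "__main__":
        print(supervise(flaky_job))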

As they change their own system I will not know what they are going to
do, because they can get information from the environment about how to
act. This will make it a 'free thinker' of sorts. Whether it will be
enough to get what you want is an empirical matter, as far as I am
concerned.

 Will




Re: [agi] A New Metaphor for Intelligence - the Computer/Organiser

2008-09-06 Thread Mike Tintner

Will,

Yes, humans are manifestly a RADICALLY different machine paradigm - if you 
care to stand back and look at the big picture.


Employ a machine of any kind and, in general, you know what you're getting - 
some glitches (esp. with complex programs) etc., sure - but basically it 
will do its job.


Humans are only human, not a machine. Employ one of those, incl. yourself, 
and, by comparison, you have only a v. limited idea of what you're getting - 
whether they'll do the job at all, to what extent, how well. Employ a 
programmer, a plumber, etc. Can you get a good one these days?... 
VAST difference.


And that's the negative side of our positive side - the fact that we're 1) 
supremely adaptable, and 2) can tackle those problems that no machine or 
current AGI  - (actually of course, there is no such thing at the mo, only 
pretenders) - can even *begin* to tackle.


Our unreliability
.

That, I suggest, only comes from having no set structure - no computer 
program - no program of action in the first place. (Hey, good idea - who 
needs a program?)


Here's a simple, extreme example.

Will,  I want you to take up to an hour, and come up with a dance, called 
the Keyboard Shuffle. (A very ill-structured problem.)


Hey, you can do that. You can tackle a seriously ill-structured problem. You 
can embark on an activity you've never done before, presumably had no 
training for, have no structure for, yet you will, if cooperative, come up 
with something - cobble together a session of that activity, and an 
end-product, an actual dance. It may be shit, but it'll be a dance.


And that's only an extreme example of how you approach EVERY activity. You 
similarly don't have a structure for your next hour[s], if you're writing an 
essay, or a program, or spending time watching TV, flipping channels. You may 
quickly *adopt* or *form* certain structures/routines. But they only go 
part way, and you do have to adopt and/or create them.


Now, I assert,  that's what an AGI is - a machine that has no programs, (no 
preset, complete structures for any activities), designed to tackle 
ill-structured problems by creating and adopting structures, not 
automatically following ones that have been laboured over for ridiculous 
amounts of time by human programmers offstage.


And that in parallel, though in an obviously more constrained way, is what 
every living organism is - an extraordinary machine that builds itself 
adaptively and flexibly, as it goes along  -  Dawkins' famous plane that 
builds itself in mid-air. Just as we construct our activities in mid-air. 
Also a very different machine paradigm to any we have at the mo  (although 
obviously lots of people are currently trying to design/understand such 
self-building machines).


P.S. The irony is that scientists and rational philosophers, faced with the 
extreme nature of human imperfection - our extreme fallibility (in the sense 
described above - i.e. liable to fail/give up/procrastinate at any given 
activity at any point in a myriad of ways) - have dismissed it as, 
essentially, down to bugs in the system. Things that can be fixed.


AGI-ers have the capacity like no one else to see and truly appreciate that 
such fallibility = highly desirable adaptability and that humans/animals 
really are fundamentally different machines.


P.P.S.  BTW that's the proper analogy for constructing an AGI - not 
inventing the plane (easy-peasy), but inventing the plane that builds itself 
in mid-air, (whole new paradigm of machine- and mind- invention).


Re: [agi] A New Metaphor for Intelligence - the Computer/Organiser

2008-09-06 Thread Mike Tintner

Sorry - the para 'Our unreliability...' should have continued:

Our unreliability is the negative flip-side of our positive ability to stop 
an activity at any point, incl. the beginning, and completely change tack/ 
course or whole approach, incl. the task itself, and even completely 
contradict ourselves. 







RE: Language modeling (was Re: [agi] draft for comment)

2008-09-06 Thread John G. Rose
Thinking out loud here as I find the relationship between compression and
intelligence interesting:

Compression in itself has the overriding goal of reducing storage bits.
Intelligence involves compression coincidentally - there is resource management
there - but I do think that it is not ONLY coincidental. Knowledge has
structure which can be organized and will naturally collapse into a lower
complexity storage state. Things have order, based on physics and other
mathematical relationships. The relationship between compression, stored
knowledge and intelligence is intriguing. But knowledge can be compressed
inefficiently, to where it inhibits extraction and other operations, so there
are differences between compression and intelligence related to computational
expense. Optimal intelligence would have a variational compression structure -
IOW, some stuff needs fast access time with minimal decompression resource
expenditure, and other stuff has high storage priority but computational
expense and access time are not a priority.

And then, when you say the word compression, there is an implicit question of
utility. A compressor that has general intelligence still has the goal of
reducing storage bits. I think that compression can be a byproduct of the
stored knowledge created by a general intelligence. But if you have a
compressor with general intelligence built in and you assign it the goal of
taking input data and reducing the storage space, it still may result in a
series of hacks, because that may be the best way of accomplishing that goal.


Sure, there may be some new undiscovered hacks that require general
intelligence to uncover. And a compressor that is generally intelligent may
produce richer lossily compressed data from varied sources. The best
lossy compressor is probably generally intelligent. They are very similar, as
you indicate... but when you start getting really lossy - when you start asking
questions of your lossy compressed data that are not related to just the
uncompressed input - there is a difference. Compression itself is just
one-dimensional. Intelligence is multi-dimensional. 

John 



 -Original Message-
 From: Matt Mahoney [mailto:[EMAIL PROTECTED]
 Sent: Friday, September 05, 2008 6:39 PM
 To: agi@v2.listbox.com
 Subject: Re: Language modeling (was Re: [agi] draft for comment)
 
 --- On Fri, 9/5/08, Pei Wang [EMAIL PROTECTED] wrote:
 
  Like with many existing AI works, my disagreement with you is not that
  much on the solution you proposed (I can see the value), but on the
  problem you specified as the goal of AI. For example, I have no doubt
  about the theoretical and practical values of compression, but don't
  think it has much to do with intelligence.
 
 In http://cs.fit.edu/~mmahoney/compression/rationale.html I explain why
 text compression is an AI problem. To summarize, if you know the
 probability distribution of text, then you can compute P(A|Q) for any
 question Q and answer A to pass the Turing test. Compression allows you
 to precisely measure the accuracy of your estimate of P. Compression
 (actually, word perplexity) has been used since the early 1990's to
 measure the quality of language models for speech recognition, since it
 correlates well with word error rate.
 
 The purpose of this work is not to solve general intelligence, such as
 the universal intelligence proposed by Legg and Hutter [1]. That is not
 computable, so you have to make some arbitrary choice with regard to
 test environments about what problems you are going to solve. I believe
 the goal of AGI should be to do useful work for humans, so I am making a
 not so arbitrary choice to solve a problem that is central to what most
 people regard as useful intelligence.
 
 I had hoped that my work would lead to an elegant theory of AI, but that
 hasn't been the case. Rather, the best compression programs were
 developed as a series of thousands of hacks and tweaks, e.g. change a 4
 to a 5 because it gives 0.002% better compression on the benchmark. The
 result is an opaque mess. I guess I should have seen it coming, since it
 is predicted by information theory (e.g. [2]).
 
  Nevertheless the architectures of the best text compressors are
  consistent with cognitive development models, i.e. phoneme (or letter)
  sequences -> lexical -> semantics -> syntax, which are themselves
  consistent with layered neural architectures. I already described a
  neural semantic model in my last post. I also did work supporting
  Hutchens and Alder showing that lexical models can be learned from
  n-gram statistics, consistent with the observation that babies learn the
  rules for segmenting continuous speech before they learn any words [3].
 
 I agree it should also be clear that semantics is learned before
 grammar, contrary to the way artificial languages are processed. Grammar
 requires semantics, but not the other way around. Search engines work
 using semantics only. Yet we cannot parse sentences like I ate pizza
 with Bob, I 

Re: [agi] A New Metaphor for Intelligence - the Computer/Organiser

2008-09-06 Thread William Pearson
2008/9/6 Mike Tintner [EMAIL PROTECTED]:
 Will,

 Yes, humans are manifestly a RADICALLY different machine paradigm- if you
 care to stand back and look at the big picture.

 Employ a machine of any kind and in general, you know what you're getting -
 some glitches (esp. with complex programs) etc sure - but basically, in
 general,  it will do its job.

What exactly is a desktop computer's job?

 Humans are only human, not a machine. Employ one of those, incl. yourself,
 and, by comparison, you have only a v. limited idea of what you're getting -
 whether they'll do the job at all, to what extent, how well. Employ a
 programmer, a plumber etc etc.. Can you get a good one these days?... VAST
 difference.

If you give me a new computer and I do not know how it has been
programmed (whether it runs Linux or Windows, and what version), I also
lack knowledge of what it is going to do. Aibo is a computer as well!
It follows a program.

 And that's the negative side of our positive side - the fact that we're 1)
 supremely adaptable, and 2) can tackle those problems that no machine or
 current AGI  - (actually of course, there is no such thing at the mo, only
 pretenders) - can even *begin* to tackle.

 Our unreliability
 .

 That, I suggest, only comes from having no set structure - no computer
 program - no program of action in the first place. (Hey, good  idea, who
 needs a program?)

You equate set structure with computer program. A computer program is
not set! There is set structure of some sort in the brain, at the
neural level anyway, so you would have to be more precise about what you
mean by lack of set structure.

Wait, program of action? You don't think computer programs are like
lists of things to do in the real world, do you? That is just
something cooked up by the language writers to make things easier to
deal with; a computer program is really only about memory
manipulation. Some of the memory locations might be hooked up to the
real world, but at the end of the day the computer treats it all as
semanticless memory manipulation. Since the things that control the memory
manipulations are themselves in memory, they too can be manipulated! A toy
illustration is below.
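Purely as an illustration - a made-up three-cell instruction encoding, not
any real machine - here are a few lines of Python in which the program sits
in the same memory it manipulates, so one instruction can rewrite another
before it runs:

    # opcode 0 = HALT, 1 = ADD (mem[a] += mem[b]),
    # 2 = STORE (mem[a] = mem[b]), 3 = JUMP to a
    def run(mem, pc=0, max_steps=100):
        for _ in range(max_steps):
            op, a, b = mem[pc], mem[pc + 1], mem[pc + 2]
            if op == 0:              # HALT
                return mem
            elif op == 1:            # ADD
                mem[a] += mem[b]
            elif op == 2:            # STORE: copy one memory cell to another
                mem[a] = mem[b]
            elif op == 3:            # JUMP
                pc = a
                continue
            pc += 3
        return mem

    # The first instruction overwrites the opcode of the second (cell 3)
    # with the value in cell 9 (0 = HALT), so the machine changes what it
    # "was going to do" while it is running.
    mem = [2, 3, 9,    # STORE mem[3] = mem[9]
           1, 10, 11,  # ADD mem[10] += mem[11]  (never runs as ADD)
           0, 0, 0,    # HALT
           0,          # cell 9: value 0, i.e. the HALT opcode
           5, 7]       # cells 10, 11: data
    print(run(mem))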

 Here's a simple, extreme example.

 Will,  I want you to take up to an hour, and come up with a dance, called
 the Keyboard Shuffle. (A very ill-structured problem.)

How about you go learn about self-modifying assembly language,
preferably with real-time interrupts. That would be a better use of
the time, I think.


 Will Pearson




RE: [agi] A New Metaphor for Intelligence - the Computer/Organiser

2008-09-06 Thread Derek Zahn
It has been explained many times to Tintner that even though computer hardware 
works with a particular set of primitive operations running in sequence, a 
hardwired set of primitive logical operations operating in sequence is NOT the 
theory of intelligence that any AGI researchers are proposing (to my 
knowledge).  A computer is just a system for holding a theory of intelligence 
which does not look like those primitives (at least not since the view that 
intelligence consists of simple interpretations of atomic tokens representing 
physical objects in small numbers of relationships with other such tokens was 
given up decades ago as insufficient).  As an example, the representational 
mechanisms in Novamente and the dynamics of the mind agents that operate on them 
are probably better thought of as churning masses of probability relationships 
with varying and often non-specific semantic interpretations than Tintner's 
narrow view of what a computer is -- although I do not yet understand Novamente 
in detail.  He has to ignore all such efforts, though, because if he paid 
attention he would have to stop saying that NONE of us understand ANYTHING 
about how REAL intelligence is actually based on line drawings, or keyboards, 
or other childish notions.
 
Though he's in my killfile I do see his posts when others take the bait.  So 
Mike, please try to finally understand this:  AGI researchers do not think of 
intelligence as what you think of as a computer program -- some rigid sequence 
of logical operations programmed by a designer to mimic intelligent behavior.  
We know it is deeper than that.  This has been clear to just about everybody 
for many many years.  By engaging the field at such a level you do nothing 
worthwhile.







Re: AI isn't cheap (was Re: Real vs. simulated environments (was Re: [agi] draft for comment.. P.S.))

2008-09-06 Thread Steve Richfield
Matt,

I heartily disagree with your view as expressed here, and as stated to me by
heads of CS departments and other high ranking CS PhDs, nearly (but not
quite) all of whom have lost the fire in the belly that we all once had
for CS/AGI.

I DO agree that CS is like every other technological endeavor, in that
almost everything that can be done as a PhD thesis has already been done,
but there is a HUGE gap between a PhD-thesis-scale project and what that
same person can do with a few more millions and a couple more years,
especially if allowed to ignore the naysayers.

The reply is even more complex than your well-documented statement, but
I'll take my best shot at it, time permitting. Here, the angel is in the
details.

On 9/5/08, Matt Mahoney [EMAIL PROTECTED] wrote:

 --- On Fri, 9/5/08, Steve Richfield [EMAIL PROTECTED] wrote:
 I think that a billion or so, divided up into small pieces to fund EVERY
 disparate approach to see where the low hanging fruit is, would go a
 LONG way in guiding subsequent billions. I doubt that it would take a
 trillion to succeed.

 Sorry, the low hanging fruit was all picked by the early 1960's. By then we
 had neural networks [1,6,7,11,12],


... but we STILL do not have any sort of useful *unsupervised* NN, the
equivalent of which seems to be needed for any good AGI. Note my recent
postings about a potential theory of everything that would most directly
hit unsupervised NN, providing not only a good way of operating these, but
possibly the provably best way of operating.

natural language processing and language translation [2],


My Dr. Eliza is right there, showing that useful understanding outside of a
precise context is almost certainly impossible. I regularly meet with the
folks working on the Russian translator project, and rest assured, things
are STILL advancing fairly rapidly. Here, there is continuing funding, and I
expect that the Russian translator will eventually succeed (they already
claim success).

models of human decision making [3],


These are curious, but I believe them to be emergent properties of
processes that we don't understand at all, so they have no value other than
for testing of future systems. Note that human decision making does NOT
generally include many advanced sorts of logic that simply don't occur to
ordinary humans, which is where an AGI could shine. Hence, understanding the
human but not the non-human processes is nearly worthless.

automatic theorem proving [4,8,10],


Great for when you already have the answer - but what is it good for?!

natural language databases [5],


Which are only useful if/when the provably false presumption is true that NL
understanding is generally possible.

game playing programs [9,13],


Not relevant for AGI.

optical character recognition [14],


Only recently have methods emerged that are truly font-independent. This
SHOULD have been accomplished long ago (like shortly after your 1960
reference), but no one wanted to throw significant money at it. I nearly
launched an OCR company (Cognitext) in 1981, but funding eventually failed *
because* I had done the research and had a new (but *un*proven) method that
was truly font-independent.

handwriting and speech recognition [15],


... both of which are now good enough for AI interaction (e.g. my Gracie
speech I/O interface to Dr. Eliza), but NOT good enough for general
dictation. Unfortunately, the methods used don't seem to shed much light on
how the underlying processes work in us.

and important theoretical work [16,17,18].


Note again my call for work/help on what I call computing's theory of
everything leveraging off of principal component analysis.

Since then we have had mostly just incremental improvements.


YES. This only shows that the support process has long been broken, and NOT
that there isn't a LOT of value that is just out of reach of PhD-sized
projects.

Big companies like Google and Microsoft have strong incentives to develop AI


Internal politics at both (that I have personally run into) restrict
expenditures to PROVEN methods, as a single technical failure spells doom
for the careers of everyone working on them. Hence, their R&D is all D and
no R.

and have billions to spend.


Not one dollar of which goes into what I would call genuine research.

Maybe the problem really is hard.


... and maybe it is just a little difficult. My own Dr. Eliza program
has seemingly unbelievable NL-stated problem solving capabilities, but is
built mostly on the same sort of 1960s technology you cited. Why wasn't it
built before 1970? I see two simple reasons:
1.  Joe Weizenbaum, in his *Computer Power and Human Reason,* explained why
this approach could never work. That immediately made it impossible to get
any related effort funded or acceptable in a university setting.
2.  It took about a year to make a demonstrable real-world NL problem
solving system, which would have been at the outer reaches of a PhD or
casual personal project.

I have 

Re: Language modeling (was Re: [agi] draft for comment)

2008-09-06 Thread Matt Mahoney
--- On Fri, 9/5/08, Pei Wang [EMAIL PROTECTED] wrote:

 Thanks for taking the time to explain your ideas in detail. As I said,
 our different opinions on how to do AI come from our very different
 understanding of intelligence. I don't take passing Turing Test as
 my research goal (as explained in
 http://nars.wang.googlepages.com/wang.logic_intelligence.pdf and
 http://nars.wang.googlepages.com/wang.AI_Definitions.pdf). I disagree
 with Hutter's approach, not because his SOLUTION is not computable,
 but because his PROBLEM is too idealized and simplified to be relevant
 to the actual problems of AI.

I don't advocate the Turing test as the ideal test of intelligence. Turing 
himself was aware of the problem when he gave an example of a computer 
answering an arithmetic problem incorrectly in his famous 1950 paper:

Q: Please write me a sonnet on the subject of the Forth Bridge.
A: Count me out on this one. I never could write poetry.
Q: Add 34957 to 70764.
A: (Pause about 30 seconds and then give as answer) 105621.
Q: Do you play chess?
A: Yes.
Q: I have K at my K1, and no other pieces.  You have only K at K6 and R at R1.  
It is your move.  What do you play?
A: (After a pause of 15 seconds) R-R8 mate.

I prefer a preference test, which a machine passes if you prefer to talk to 
it over a human. Such a machine would be too fast and make too few errors to 
pass a Turing test. For example, if you had to add two large numbers, I think 
you would prefer to use a calculator rather than ask someone. You could, I suppose, 
measure intelligence as the fraction of questions for which the machine gives 
the preferred answer, which would be 1/4 in Turing's example.

If you know the probability distribution P of text, and therefore know the 
distribution P(A|Q) for any question Q and answer A, then to pass the Turing 
test you would randomly choose answers from this distribution. But to pass the 
preference test for all Q, you would choose A that maximizes P(A|Q) because the 
most probable answer is usually the correct one. Text compression measures 
progress toward either test.
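To make the distinction concrete, here is a toy sketch in Python; the
distribution is invented, standing in for a real model's P(A|Q) over all
possible answers:

    import random

    # Hypothetical model output for one question: answer -> P(A|Q)
    p_given_q = {"105621": 0.6, "105721": 0.3, "I never could do sums": 0.1}

    # Turing-test behaviour: sample from the distribution, errors and all.
    answers = list(p_given_q)
    weights = list(p_given_q.values())
    print(random.choices(answers, weights=weights, k=1)[0])

    # Preference-test behaviour: always give the most probable answer.
    print(max(p_given_q, key=p_given_q.get))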

I believe that compression measures your definition of intelligence, i.e. 
adaptation given insufficient knowledge and resources. In my benchmark, there 
are two parts: the size of the decompression program, which measures the 
initial knowledge, and the compressed size, which measures prediction errors 
that occur as the system adapts. Programs must also meet practical time and 
memory constraints to be listed in most benchmarks.

Compression is also consistent with Legg and Hutter's universal intelligence, 
i.e. expected reward of an AIXI universal agent in an environment simulated by 
a random program. Suppose you have a compression oracle that inputs any string 
x and outputs the shortest program that outputs a string with prefix x. Then 
this reduces the (uncomputable) AIXI problem to using the oracle to guess which 
environment is consistent with the interaction so far, and figuring out which 
future outputs by the agent will maximize reward.

Of course universal intelligence is also not testable because it requires an 
infinite number of environments. Instead, we have to choose a practical data 
set. I use Wikipedia text, which has fewer errors than average text, but I 
believe that is consistent with my goal of passing the preference test.


-- Matt Mahoney, [EMAIL PROTECTED]





RE: Language modeling (was Re: [agi] draft for comment)

2008-09-06 Thread Matt Mahoney
--- On Sat, 9/6/08, John G. Rose [EMAIL PROTECTED] wrote:

 Compression in itself has the overriding goal of reducing
 storage bits.

Not the way I use it. The goal is to predict what the environment will do next. 
Lossless compression is a way of measuring how well we are doing.

-- Matt Mahoney, [EMAIL PROTECTED]





Re: Language modeling (was Re: [agi] draft for comment)

2008-09-06 Thread Pei Wang
I won't argue against your preference test here, since this is a
big topic, and I've already made my position clear in the papers I
mentioned.

As for compression, yes, every intelligent system needs to 'compress'
its experience in the sense of keeping the essence but using less
space. However, it is clearly not lossless. It is not even what we
usually call lossy compression, because what to keep and in what
form is highly context-sensitive. Consequently, this process is not
reversible --- no decompression, though the result can be applied in
various ways. Therefore I prefer not to call it compression, to avoid
confusing this process with the technical sense of compression,
which is reversible, at least approximately.

Legg and Hutter's universal intelligence definition is way too
narrow to cover various attempts towards AI, even as an idealization.
Therefore, I don't take it as a goal to aim at and approach as closely
as possible. However, as I said before, I'd rather leave this
topic for the future, when I have enough time to give it a fair
treatment.

Pei


Re: AI isn't cheap (was Re: Real vs. simulated environments (was Re: [agi] draft for comment.. P.S.))

2008-09-06 Thread Matt Mahoney
Steve, where are you getting your cost estimate for AGI? Is it a gut feeling, 
or something like the common management practice of 'I can afford $X, so it 
will cost $X'?

My estimate of $10^15 is based on the value of the world economy, US $66 
trillion per year and increasing 5% annually over the next 30 years, which is 
how long it will take for the internet to grow to the computational power of 
10^10 human brains (at 10^15 bits and 10^16 OPS each) at the current rate of 
growth, doubling every couple of years. Even if you disagree with these numbers 
by a factor of 1000, it only moves the time to AGI by a few years, so the cost 
estimate hardly changes.
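(As a back-of-the-envelope check of that order of magnitude, with the same
assumptions plugged in:)

    # Rough check of the $10^15 figure from the stated assumptions:
    # $66 trillion/year world economy growing 5% annually for ~30 years.
    gdp, growth, years = 66e12, 1.05, 30
    total = sum(gdp * growth**t for t in range(years))
    print("cumulative world output over %d years: $%.1e" % (years, total))  # ~ $4e15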

And even if the hardware is free, you still have to program or teach about 
10^16 to 10^17 bits of knowledge, assuming 10^9 bits of knowledge per brain [1] 
and 1% to 10% of this is not known by anyone else. Software and training costs 
are not affected by Moore's law. Even if we assume human level language 
understanding and perfect sharing of knowledge, the training cost will be 1% to 
10% of your working life to train the AGI to do your job.

Also, we have made *some* progress toward AGI since 1965, but it is mainly a 
better understanding of why it is so hard, e.g.

- We know that general intelligence is not computable [2] or provable [3]. 
There is no neat theory.

- From Cyc, we know that coding common sense is more than a 20 year effort. 
Lenat doesn't know how much more, but guesses it is maybe between 0.1% and 10% 
finished.

- Google is the closest we have to AI after a half trillion dollar effort.

 

1. Landauer, Tom (1986), “How much do people remember? Some estimates of the
quantity of learned information in long term memory”, Cognitive Science (10),
pp. 477-493.

2. Hutter, Marcus (2003), “A Gentle Introduction to The Universal Algorithmic
Agent AIXI”, in Artificial General Intelligence, B. Goertzel and C. Pennachin,
eds., Springer. http://www.idsia.ch/~marcus/ai/aixigentle.htm

3. Legg, Shane (2006), “Is There an Elegant Universal Theory of Prediction?”,
Technical Report IDSIA-12-06, IDSIA / USI-SUPSI, Dalle Molle Institute for
Artificial Intelligence, Galleria 2, 6928 Manno, Switzerland.
http://www.vetta.org/documents/IDSIA-12-06-1.pdf


-- Matt Mahoney, [EMAIL PROTECTED]


Re: [agi] A New Metaphor for Intelligence - the Computer/Organiser

2008-09-06 Thread Mike Tintner
DZ: AGI researchers do not think of intelligence as what you think of as a 
computer program -- some rigid sequence of logical operations programmed by a 
designer to mimic intelligent behavior.

1. Sequence/Structure. The concept I've been using is not that a program is a 
sequence of operations but a structure - including, as per NARS (as I've 
read Pei), a structure that may change more or less continuously. Techno-idiot 
that I am, I am fairly aware that many modern programs are extremely 
sophisticated and complex structures. I take into account, for example, 
Minsky's idea of a possible society of mind, with many different parts 
perhaps competing - not obviously realised in program form yet. 

But programs are nevertheless manifestly structures. Would you dispute that?

And a central point I've been making is that human life and activities are 
manifestly *unstructured* - that in just about everything we do, we struggle to 
impose structure on our activities - to impose order and organization, 
planning, focus etc.

Especially in AGI's central challenge - creativity. Creative activities are 
outstanding examples of unstructured activities, in which structures have to be 
created - painting scenes, writing stories, designing new machines, writing 
music/pop songs - often starting from an entirely blank page. (What's the 
program equivalent?)

2. A Programmer on Programs. "I am persuaded on multiple grounds that the 
human mind is not always algorithmic, nor merely computational in the syntactic 
sense of computational."
- S. Kauffman, Reinventing the Sacred

Try Chap. 12. Computationally, he trumps most AGI-ers in terms of most AI 
departments, incl. complexity, bioinformatics and general standing, no? Read 
the whole book in fact - it can be read as being entirely about the creative 
problem/challenge of AGI - if you liked Barsalou, you'll like this.






Re: Language modeling (was Re: [agi] draft for comment)

2008-09-06 Thread Matt Mahoney
--- On Sat, 9/6/08, Pei Wang [EMAIL PROTECTED] wrote:

 As for compression, yes, every intelligent system needs to 'compress'
 its experience in the sense of keeping the essence but using less
 space. However, it is clearly not lossless. It is not even what we
 usually call lossy compression, because what to keep and in what
 form is highly context-sensitive. Consequently, this process is not
 reversible --- no decompression, though the result can be applied in
 various ways. Therefore I prefer not to call it compression, to avoid
 confusing this process with the technical sense of compression,
 which is reversible, at least approximately.

I think you misunderstand my use of compression. The goal is modeling or 
prediction. Given a string, predict the next symbol. I use compression to 
estimate how accurate the model is. It is easy to show that if your model is 
accurate, then when you connect your model to an ideal coder (such as an 
arithmetic coder), then compression will be optimal. You could actually skip 
the coding step, but it is cheap, so I use it so that there is no question of 
making a mistake in the measurement. If a bug in the coder produces a too small 
output, then the decompression step won't reproduce the original file.
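Here is a minimal sketch of that "skip the coding step" measurement in Python,
assuming only that the model returns a probability for each next symbol given
the history (the toy model is invented for illustration):

    import math

    def ideal_code_length(model, data):
        # Bits an ideal arithmetic coder would emit for data under this model.
        bits = 0.0
        for i, symbol in enumerate(data):
            p = model(data[:i], symbol)   # model's P(next symbol | history)
            bits += -math.log2(p)         # optimal code length for that symbol
        return bits

    # Toy uniform model over bytes: every symbol costs exactly 8 bits.
    uniform = lambda history, symbol: 1.0 / 256
    print(ideal_code_length(uniform, b"hello"))   # 40.0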

In fact, many speech recognition experiments do skip the coding step in their 
tests and merely calculate what the compressed size would be. (More precisely, 
they calculate word perplexity, which is equivalent). The goal of speech 
recognition is to find the text y that maximizes P(y|x) for utterance x. It is 
common to factor the model using Bayes law: P(y|x) = P(x|y)P(y)/P(x). We can 
drop P(x) since it is constant, leaving the acoustic model P(x|y) and language 
model P(y) to evaluate. We know from experiments that compression tests on P(y) 
correlate well with word error rates for the overall system.
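(Word perplexity is just the per-word form of the same quantity; a toy sketch
with made-up per-word probabilities from the language model P(y):)

    import math

    def word_perplexity(word_probs):
        # Perplexity = 2^(average bits per word) under the language model.
        bits_per_word = sum(-math.log2(p) for p in word_probs) / len(word_probs)
        return 2 ** bits_per_word

    print(word_perplexity([0.1, 0.02, 0.25]))   # ~ 12.6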

Internally, all lossless compressors use lossy compression or data reduction to 
make predictions. Most commonly, a context is truncated and possibly hashed 
before looking up the statistics for the next symbol. The top lossless 
compressors in my benchmark use more sophisticated forms of data reduction, 
such as mapping upper and lower case letters together, or mapping groups of 
semantically or syntactically related words to the same context.
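A stripped-down sketch of that truncate-and-hash idea - an order-2 byte model
with invented details, not any particular compressor:

    from collections import defaultdict

    class Order2Model:
        # Predict the next byte from counts keyed by a hash of the last two bytes.
        def __init__(self, table_size=1 << 16):
            self.table_size = table_size
            self.counts = defaultdict(lambda: defaultdict(int))

        def _context(self, history):
            return hash(bytes(history[-2:])) % self.table_size  # truncate + hash

        def predict(self, history, symbol):
            ctx = self.counts[self._context(history)]
            total = sum(ctx.values()) + 256          # add-one smoothing over bytes
            return (ctx[symbol] + 1) / total

        def update(self, history, symbol):
            self.counts[self._context(history)][symbol] += 1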

As a test, lossless compression is only appropriate for text. For other hard AI 
problems such as vision, art, and music, incompressible noise would overwhelm 
the human-perceptible signal. Theoretically you could compress video to 2 bits 
per second (the rate of human long term memory) by encoding it as a script. The 
decompressor would read the script and create a new movie. The proper test 
would be lossy compression, but this requires human judgment to evaluate how 
well the reconstructed data matches the original.


-- Matt Mahoney, [EMAIL PROTECTED]



