[agi] Re: [agi] P≠NP

2010-08-16 Thread Matt Mahoney
Does anyone have any comments on this proof? I don't have the mathematical 
background to tell if it is correct. But it seems related to the idea from 
algorithmic information theory that the worst case complexity for any algorithm 
is equal to the average case for compressed inputs. Then to show that P != NP 
you would show that SAT (specifically 9-SAT) with compressed inputs has 
exponential average case complexity. That is not quite the approach the paper 
takes, probably because compression is not computable.
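Not a substitute for a proof, but the average-case-on-incompressible-inputs intuition 
can at least be poked at empirically: a uniformly random k-SAT instance is incompressible 
with high probability, and a naive solver's work on such instances grows steeply near the 
satisfiability threshold. Below is a purely illustrative sketch (3-SAT rather than the 
9-SAT the paper works with, tiny sizes, naive DPLL with no unit propagation); every 
parameter is arbitrary.

import random

def random_3sat(n_vars, n_clauses, rng):
    # Each clause: 3 distinct variables, each negated with probability 1/2.
    clauses = []
    for _ in range(n_clauses):
        chosen = rng.sample(range(1, n_vars + 1), 3)
        clauses.append(tuple(v if rng.random() < 0.5 else -v for v in chosen))
    return clauses

def dpll(clauses, assignment, counter):
    counter[0] += 1                       # count recursive calls as "work"
    simplified = []
    for clause in clauses:
        if any(assignment.get(abs(lit)) == (lit > 0) for lit in clause):
            continue                      # clause already satisfied
        rest = [lit for lit in clause if abs(lit) not in assignment]
        if not rest:
            return False                  # clause falsified under this assignment
        simplified.append(rest)
    if not simplified:
        return True                       # every clause satisfied
    var = abs(simplified[0][0])           # naive branching rule
    for value in (True, False):
        assignment[var] = value
        if dpll(simplified, assignment, counter):
            return True
        del assignment[var]
    return False

if __name__ == "__main__":
    rng = random.Random(0)
    for n in (12, 16, 20, 24):
        m = int(4.27 * n)                 # near the 3-SAT phase transition
        calls = 0
        for _ in range(10):
            counter = [0]
            dpll(random_3sat(n, m, rng), {}, counter)
            calls += counter[0]
        print("n=%2d  mean DPLL calls: %d" % (n, calls // 10))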

 -- Matt Mahoney, matmaho...@yahoo.com





From: Kaj Sotala xue...@gmail.com
To: agi agi@v2.listbox.com
Sent: Thu, August 12, 2010 2:18:13 AM
Subject: [agi] Re: [agi] P≠NP

2010/8/12 John G. Rose johnr...@polyplexic.com

 BTW here is the latest one:

 http://www.win.tue.nl/~gwoegi/P-versus-NP/Deolalikar.pdf

See also:

http://www.ugcs.caltech.edu/~stansife/pnp.html - brief summary of the proof

Discussion about whether it's correct:

http://rjlipton.wordpress.com/2010/08/08/a-proof-that-p-is-not-equal-to-np/
http://rjlipton.wordpress.com/2010/08/09/issues-in-the-proof-that-p≠np/
http://rjlipton.wordpress.com/2010/08/10/update-on-deolalikars-proof-that-p≠np/
http://rjlipton.wordpress.com/2010/08/11/deolalikar-responds-to-issues-about-his-p≠np-proof/

http://news.ycombinator.com/item?id=1585850

Wiki page summarizing a lot of the discussion, as well as collecting
many of the links above:

http://michaelnielsen.org/polymath1/index.php?title=Deolalikar%27s_P!%3DNP_paper#Does_the_argument_prove_too_much.3F





Re: [agi] Help requested: Making a list of (non-robotic) AGI low hanging fruit apps

2010-08-07 Thread Matt Mahoney
Wouldn't it depend on the other researcher's area of expertise?

 -- Matt Mahoney, matmaho...@yahoo.com





From: Ben Goertzel b...@goertzel.org
To: agi agi@v2.listbox.com
Sent: Sat, August 7, 2010 9:10:23 PM
Subject: [agi] Help requested: Making a list of (non-robotic) AGI low hanging 
fruit apps

Hi,

A fellow AGI researcher sent me this request, so I figured I'd throw it
out to you guys


I'm putting together an AGI pitch for investors and thinking of low
hanging fruit applications to argue for. I'm intentionally not
involving any mechanics (robots, moving parts, etc.). I'm focusing on
voice (i.e. conversational agents) and perhaps vision-based systems.
Helen Keller AGI, if you will :)

Along those lines, I'd like any ideas you may have that would fall
under this description. I need to substantiate the case for such AGI
technology by making an argument for high-value apps. All ideas are
welcome.


All serious responses will be appreciated!!

Also, I would be grateful if we
could keep this thread closely focused on direct answers to this
question, rather than
digressive discussions on Helen Keller, the nature of AGI, the definition of AGI
versus narrow AI, the achievability or unachievability of AGI, etc.
etc.  If you think
the question is bad or meaningless or unclear or whatever, that's
fine, but please
start a new thread with a different subject line to make your point.

If the discussion is useful, my intention is to mine the answers into a compact
list to convey to him

Thanks!
Ben G




Re: [agi] Epiphany - Statements of Stupidity

2010-08-06 Thread Matt Mahoney
Mike Tintner wrote:
 What will be the SIMPLEST thing that will mark the first sign of AGI ? - 
 Given 
that there are zero but zero examples of AGI.
 
Machines have already surpassed human intelligence. If you don't think so, try 
this IQ test. http://mattmahoney.net/iq/

Or do you prefer to define intelligence as more like a human? In that case I 
agree that AGI will never happen. No machine will ever be more like a human 
than 
a human.

I really don't care how you define it. Either way, computers are profoundly 
affecting the way people interact with each other and with the world. Where is 
the threshold when machines do most of our thinking for us? Who cares as long 
as 
the machines still give us the feeling that we are in charge.

-- Matt Mahoney, matmaho...@yahoo.com





From: Mike Tintner tint...@blueyonder.co.uk
To: agi agi@v2.listbox.com
Sent: Fri, August 6, 2010 5:57:33 AM
Subject: Re: [agi] Epiphany - Statements of Stupidity


STEVE: I have posted plenty about statements of ignorance, our probable 
inability to comprehend what an advanced intelligence might be thinking, 

 
What will be the SIMPLEST thing that will mark the  first sign of AGI ? - Given 
that there are zero but zero examples of  AGI.
 
Don't you think it would be a good idea to begin at  the beginning? With 
initial AGI? Rather than advanced AGI? 



Re: [agi] Comments On My Skepticism of Solomonoff Induction

2010-08-06 Thread Matt Mahoney
Jim, see http://www.scholarpedia.org/article/Algorithmic_probability
I think this answers your questions.

 -- Matt Mahoney, matmaho...@yahoo.com





From: Jim Bromer jimbro...@gmail.com
To: agi agi@v2.listbox.com
Sent: Fri, August 6, 2010 2:18:09 PM
Subject: Re: [agi] Comments On My Skepticism of Solomonoff Induction


I meant:
Did Solomonoff's original idea use randomization to determine the bits of the 
programs that are used to produce the prior probabilities?  I think that the 
answer to that is obviously no.  The randomization of the next bit would be used 
in 
the test of the prior probabilities as done using a random sampling.  He 
probably found that students who had some familiarity with statistics would 
initially assume that the prior probability was based on some subset of 
possible 
programs as would be expected from a typical sample, so he gave this statistics 
type of definition to emphasize the extent of what he had in mind.
 
I asked this question just to make sure that I understood what Solomonoff 
Induction was, because Abram had made some statement indicating that I really 
didn't know.  Remember, this particular branch of the discussion was originally 
centered around the question of whether Solomonoff Induction would 
be convergent, even given a way around the incomputability of finding only 
those 
programs that halted.  So while the random testing of the prior probabilities 
is 
of interest to me, I wanted to make sure that there is no evidence that 
Solomonoff Induction is convergent. I am not being petty about this, but I also 
needed to make sure that I understood what Solomonoff Induction is.
 
I am interested in hearing your ideas about your variation of 
Solomonoff Induction because your convergent series, in this context, was 
interesting.
Jim Bromer


On Fri, Aug 6, 2010 at 6:50 AM, Jim Bromer jimbro...@gmail.com wrote:

Jim: So, did Solomonoff's original idea involve randomizing whether the next 
bit 
would be a 1 or a 0 in the program? 

Abram: Yep. 

I meant, did Solomonoff's original idea involve randomizing whether the next 
bit 
in the program's that are originally used to produce the prior probabilities 
involve the use of randomizing whether the next bit would be a 1 or a 0?  I 
have 
not been able to find any evidence that it was.  I thought that my question was 
clear but on second thought I guess it wasn't. I think that the part about the 
coin flips was only a method to express that he was interested in the 
probability that a particular string would be produced from all possible 
programs, so that when actually testing the prior probability of a particular 
string the program that was to be run would have to be randomly generated.
Jim Bromer
 
 

 
On Wed, Aug 4, 2010 at 10:27 PM, Abram Demski abramdem...@gmail.com wrote:

Jim,


Your function may be convergent but it is not a probability. 


True! All the possibilities sum to less than 1. There are ways of addressing 
this (ie, multiply by a normalizing constant which must also be approximated 
in 
a convergent manner), but for the most part adherents of Solomonoff induction 
don't worry about this too much. What we care about, mostly, is comparing 
different hypotheses to decide which to favor. The normalizing constant doesn't 
help us here, so it usually isn't mentioned. 




You said that Solomonoff's original construction involved flipping a coin for 
the next bit.  What good does that do?

Your intuition is that running totally random programs to get predictions will 
just produce garbage, and that is fine. The idea of Solomonoff induction, 
though, is that it will produce systematically less garbage than just flipping 
coins to get predictions. Most of the garbage programs will be knocked out of 
the running by the data itself. This is supposed to be the least garbage we 
can 
manage without domain-specific knowledge.
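(A toy way to see the filtering effect Abram describes, using a deliberately small, 
computable stand-in for "all programs": hypotheses are repeating bit patterns weighted 
by 2^-length, the observed data knocks out the inconsistent ones, and the surviving 
weighted vote predicts the next bit far better than a coin flip on patterned data. The 
hypothesis class and weights here are illustrative only, not Solomonoff induction proper.)

from itertools import product

def hypotheses(max_len=8):
    # "Programs": every bit pattern up to max_len, repeated forever,
    # with weight 2^-(pattern length) so shorter patterns count more.
    for k in range(1, max_len + 1):
        for bits in product("01", repeat=k):
            yield "".join(bits), 2.0 ** -k

def predict_next(observed, max_len=8):
    # Weighted vote of all hypotheses consistent with the data;
    # returns P(next bit = '1'), falling back to a coin flip if none survive.
    w0 = w1 = 0.0
    for pattern, weight in hypotheses(max_len):
        generated = pattern * (len(observed) // len(pattern) + 2)
        if generated.startswith(observed):
            if generated[len(observed)] == "1":
                w1 += weight
            else:
                w0 += weight
    total = w0 + w1
    return 0.5 if total == 0.0 else w1 / total

if __name__ == "__main__":
    print(predict_next("010101010101"))   # low: the mixture expects '0' next
    print(predict_next("00100100100"))    # high: the mixture expects '1' next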

This is backed up with the proof of dominance, which we haven't talked about 
yet, but which is really the key argument for the optimality of Solomonoff 
induction. 




And how does that prove that his original idea was convergent?

The proofs of equivalence between all the different formulations of Solomonoff 
induction are something I haven't cared to look into too deeply. 




Since his idea is incomputable, there are no algorithms that can be run to 
demonstrate what he was talking about so the basic idea is papered with all 
sorts of unverifiable approximations.

I gave you a proof of convergence for one such approximation, and if you wish 
I 
can modify it to include a normalizing constant to ensure that it is a 
probability measure. It would be helpful to me if your criticisms were more 
specific to that proof.



So, did Solomonoff's original idea involve randomizing whether the next bit 
would be a 1 or a 0 in the program? 



Yep. 



Even ignoring the halting problem what kind of result would that give?


Well, the general idea is this. An even

Re: [agi] Walker Lake

2010-08-03 Thread Matt Mahoney
Samantha Atkins wrote:
 No it hasn't. People want public surveillance.
 Guess I am not people then. 

Then why are you posting your response to a public forum instead of replying by 
encrypted private email? People want their words to be available to the world.

 I don't think the global brain needs to know exactly how often I have sex or 
with whom or in what varieties.  Do you?  

A home surveillance system needs to know who is in your house and whether they 
belong there. If it is intelligent then it will know that you prefer not to 
have 
video of you having sex broadcast on the internet. At the same time, it has to 
recognize what you are doing.

Public surveillance is less objectionable because it will be two-way and can't 
be abused. If someone searches for information about you, then you get a 
notification of who it was and what they learned. I describe how this works 
in http://mattmahoney.net/agi2.html

 No humans will control it and it is not going to be that expensive.

Humans will eventually lose control of anything that is smarter than them. But 
we should delay that as long as possible by making the required threshold the 
organized intelligence of all humanity, and make that organization as efficient 
as possible. The cost is on the order of $1 quadrillion because the knowledge 
that AGI needs is mostly in billions of human brains and there is no quick way 
to extract it.

 -- Matt Mahoney, matmaho...@yahoo.com





From: Samantha Atkins sjatk...@gmail.com
To: agi agi@v2.listbox.com
Sent: Tue, August 3, 2010 6:49:34 AM
Subject: Re: [agi] Walker Lake

Matt Mahoney wrote: 
Steve Richfield wrote:
 How about an international ban on the deployment of all unmanned and 
 automated 
weapons?
 
How about a ban on suicide bombers to level the playing field?


 1984 has truly arrived.


No it hasn't. People want public surveillance. 
Guess I am not people then.  Actually I think surveillance is inevitable given 
current and all but certain future tech.  However, I recognize that human 
beings 
today, and especially their governments, are not remotely ready for it.   To be 
ready for it at the very least the State would have to consider a great number 
of things none of its business to attempt to legislate for or against.  As it 
is 
with the current incredible number of arcane laws on the books it would be very 
easy to see the already ridiculously large prison population of the US double.  
 
Also, please note that full surveillance means no successful rebellion no 
matter 
how bad the powers that be become and how ineffectual the means that let remain 
legal are to change things.  Ever. 



It is also necessary for AGI. In order for machines to do what you want, they 
have to know what you know. 

It is not necessary to have every waking moment surveilled in order to have AGI 
know what we want. 



In order for a global brain to use that knowledge, it has to be public. 
I don't think the global brain needs to know exactly how often I have sex or 
with whom or in what varieties.  Do you?  



AGI has to be a global brain because it is too expensive to build any other 
way, 
and because it would be too dangerous if the whole world didn't control it.
No humans will control it and it is not going to be that expensive.

- samantha




Re: [agi] Walker Lake

2010-08-03 Thread Matt Mahoney
Steve Richfield wrote:
 Disaster scenarios aside, what would YOU have YOUR AGI do to navigate this 
future?

It won't be my AGI. If it were, I would be a despot and billions of people 
would 
suffer, just like if any other person ruled the world with absolute power. We 
will be much better off if everyone has a voice, and we have an AGI that makes 
that voice available to everyone else.

 -- Matt Mahoney, matmaho...@yahoo.com





From: Steve Richfield steve.richfi...@gmail.com
To: agi agi@v2.listbox.com
Sent: Mon, August 2, 2010 10:03:27 PM
Subject: Re: [agi] Walker Lake

Matt,


On Mon, Aug 2, 2010 at 1:10 PM, Matt Mahoney matmaho...@yahoo.com wrote:

Steve Richfield wrote:
 How about an international ban on the deployment of all unmanned and 
 automated 
weapons?
 
How about a ban on suicide bombers to level the playing field?

Of course we already have that. Unfortunately, one begets the other. Hence, we 
seem to have a choice, neither or both. I vote for neither. 




 1984 has truly arrived.


No it hasn't. People want public surveillance.

I'm not sure what you mean by public surveillance. Monitoring private phone 
calls? Monitoring otherwise unused web cams? Monitoring your output when you 
use 
the toilet? Where, if anywhere, do YOU draw the line?
 

It is also necessary for AGI. In order for machines to do what you want, they 
have to know what you know.

Unfortunately, knowing everything, any use of this information will either be 
to 
my benefit, or my detriment. Do you foresee any way to limit use to only 
beneficial use?

BTW, decades ago I developed the plan of, when my kids got in some sort of 
trouble in school or elsewhere, to represent their interests as well as 
possible, regardless of whether I agreed with them or not. This worked 
EXTREMELY 
well for me, and for several other families who have tried this. The point is 
that to successfully represent their interests, I had to know what was 
happening. Potential embarrassment and explainability limited the kids' 
actions. 
I wonder if the same would work for AGIs?
 

In order for a global brain to use that knowledge, it has to be public.

Again, where do you draw the line between public and private?
 

AGI has to be a global  brain because it is too expensive to build any other 
way, and because it would be too dangerous if the whole world didn't control it.

I'm not sure what you mean by control. 


Here is the BIG question in my own mind, that I have asked in various ways, so 
far without any recognizable answer:

There are plainly lots of things wrong with our society. We got here by doing 
what we wanted, and by having our representatives do what we wanted them to do. 
Clearly some social re-engineering is in our future, if we are to thrive in the 
foreseeable future. All changes are resisted by some, and I suspect that some 
needed changes will be resisted by most, and perhaps nearly everyone. Disaster 
scenarios aside, what would YOU have YOUR AGI do to navigate this future?

To help guide your answer, I see that the various proposed systems of ethics 
would prevent breaking the eggs needed to make a good futuristic omelet. I 
suspect that completely democratic systems have run their course. Against this 
is letting AGI loose has its own unfathomable hazards. I've been hanging 
around here for quite a while, and I don't yet see any success path to work 
toward.

I'm on your side in that any successful AGI would have to have the information 
and the POWER to succeed, akin to Colossus, the Forbin Project, which I 
personally see as more of a success story than a horror scenario. Absent that, 
AGIs will only add to our present problems.

What is the success path that you see?

Steve


 


Re: [agi] AGI & Int'l Relations

2010-08-02 Thread Matt Mahoney
Steve Richfield wrote:
 I would feel a **LOT** better if someone explained SOME scenario to 
 eventually 
emerge from our current economic mess.

What economic mess?
http://www.google.com/publicdata?ds=wb-wdi&ctype=l&strail=false&nselm=h&met_y=ny_gdp_mktp_cd&scale_y=lin&ind_y=false&rdim=country&idim=country:USA&tdim=true&tstart=-31561920&tunit=Y&tlen=48&hl=en&dl=en


 Unemployment appears to be permanent and getting worse, 

When you pay people not to work, they are less inclined to work.

 -- Matt Mahoney, matmaho...@yahoo.com





From: Steve Richfield steve.richfi...@gmail.com
To: agi agi@v2.listbox.com
Sent: Mon, August 2, 2010 11:54:25 AM
Subject: Re: [agi] AGI & Int'l Relations

Jan

I can see that I didn't state one of my points clearly enough...


On Sun, Aug 1, 2010 at 3:04 PM, Jan Klauck jkla...@uni-osnabrueck.de wrote:



 My simple (and completely unacceptable) cure for this is to tax savings,
 to force the money back into the economy.

You have either consumption or savings. The savings are put back into
the economy in form of credits to those who invest the money.


Our present economic problem is that those credits aren't being turned over 
fast enough to keep the economic engine running well. At present, with present 
systems in place, there is little motivation to quickly turn over one's wealth, 
and lots of motivation to very carefully protect it. The result is that most of 
the wealth of the world is just sitting there in various accounts, and is NOT 
being spent/invested on various business propositions to benefit the population 
of the world.

We need to do SOMETHING to get the wealth out of the metaphorical mattresses 
and 
back into the economy. Taxation is about the only effective tool that the 
government hasn't already dulled beyond utility. However, taxation doesn't 
stand 
a chance without the cooperation of other countries to do the same. There seems 
to be enough lobbying power in the hands of those with the money to stop any 
such efforts, or at least to leave enough safe havens to make such efforts 
futile.

I would feel a **LOT** better if someone explained SOME scenario to eventually 
emerge from our current economic mess. Unemployment appears to be permanent and 
getting worse, as does the research situation. All I hear are people citing 
stock prices and claiming that the economy is turning around, when I see little 
connection between stock prices and on-the-street economy.

This is an IR problems of monumental proportions. What would YOU do about it?

Steve



 


Re: [agi] Walker Lake

2010-08-02 Thread Matt Mahoney
Steve Richfield wrote:
 How about an international ban on the deployment of all unmanned and 
 automated 
weapons?
 
How about a ban on suicide bombers to level the playing field?

 1984 has truly arrived.

No it hasn't. People want public surveillance. It is also necessary for AGI. In 
order for machines to do what you want, they have to know what you know. In 
order for a global brain to use that knowledge, it has to be public. AGI has to 
be a global brain because it is too expensive to build any other way, and 
because it would be too dangerous if the whole world didn't control it.

-- Matt Mahoney, matmaho...@yahoo.com





From: Steve Richfield steve.richfi...@gmail.com
To: agi agi@v2.listbox.com
Sent: Mon, August 2, 2010 10:40:20 AM
Subject: [agi] Walker Lake

Sometime when you are flying between the northwest US to/from Las Vegas, look 
out your window as you fly over Walker Lake in eastern Nevada. At the south end 
you will see a system of roads leading to tiny buildings, all surrounded by 
military security. From what I have been able to figure out, you will find the 
U.S. arsenal of chemical and biological weapons housed there. No, we are not 
now 
making these weapons, but neither are we disposing of them.

Similarly, there has been discussion of developing advanced military technology 
using AGI and other computer-related methods. I believe that these efforts are 
fundamentally anti-democratic, as they allow a small number of people to 
control 
a large number of people. Gone are the days when people voted with their 
swords. 
We now have the best government that money can buy monitoring our every email, 
including this one, to identify anyone resisting such efforts. 1984 has truly 
arrived. This can only lead to a horrible end to freedom, with AGIs doing their 
part and more.

Like chemical and biological weapons, unmanned and automated weapons should be 
BANNED. Unfortunately, doing so would provide a window of opportunity for 
others 
to deploy them. However, if we make these and stick them in yet another 
building 
at the south end of Walker Lake, we would be ready in case other nations deploy 
such weapons.

How about an international ban on the deployment of all unmanned and automated 
weapons? The U.S. won't now even agree to ban land mines. At least this would 
restore SOME relationship between popular support and military might. Doesn't 
it 
sound ethical to insist that a human being decide when to end another human 
being's life? Doesn't it sound fair to require the decision maker to be in 
harm's way, especially when the person being killed is in or around their own 
home? Doesn't it sound unethical to add to the present situation? When deployed 
on a large scale, aren't these WMDs?
 
Steve

 


Re: [agi] Re: Shhh!

2010-08-02 Thread Matt Mahoney
Jim, you are thinking out loud. There is no such thing as trans-infinite. How 
about posting when you actually solve the problem.

 -- Matt Mahoney, matmaho...@yahoo.com





From: Jim Bromer jimbro...@gmail.com
To: agi agi@v2.listbox.com
Sent: Mon, August 2, 2010 9:06:53 AM
Subject: [agi] Re: Shhh!


I think I can write an abbreviated version, but there would only be a few 
people 
in the world who would both believe me and understand why it would work.


On Mon, Aug 2, 2010 at 8:53 AM, Jim Bromer jimbro...@gmail.com wrote:

I can write an algorithm that is capable of describing ('reaching') every 
possible irrational number - given infinite resources.  The infinite is not a 
number-like object, it is an active form of incrementation or concatenation.  
So 
I can write an algorithm that can write every finite state of every possible 
number.  However, it would take another algorithm to 'prove' it.  Given an 
irrational number, this other algorithm could find the infinite incrementation 
for every digit of the given number.  Each possible number (including 
the incrementation of those numbers that cannot be represented in truncated 
form) is embedded within a single infinite incrementation of digits 
that is produced by the algorithm, so the second algorithm would have to 
calculate where you would find each digit of the given irrational number by 
increment.  But the thing is, both functions would be computable and provable.  
(I haven't actually figured the second algorithm out yet, but it is not 
a difficult problem.)
 
This means that the Trans-Infinite Is Computable.  But don't tell anyone about 
this, it's a secret.
 



Re: [agi] AGI & Alife

2010-07-28 Thread Matt Mahoney
Ian Parker wrote:
 Matt Mahoney has costed his view of AGI. I say that costs must be recoverable 
as we go along. Matt, don't frighten people with a high estimate of cost. 
Frighten people instead with the bill they are paying now for dumb systems.

It is not my intent to scare people out of building AGI, but rather to be 
realistic about its costs. Building machines that do what we want is a much 
harder problem than building intelligent machines. Machines surpassed human 
intelligence 50 years ago. But getting them to do useful work is still a $60 
trillion per year problem. It's going to happen, but not as quickly as one 
might 
hope.

 -- Matt Mahoney, matmaho...@yahoo.com





From: Ian Parker ianpark...@gmail.com
To: agi agi@v2.listbox.com
Sent: Wed, July 28, 2010 6:54:05 AM
Subject: Re: [agi] AGI & Alife




On 27 July 2010 21:06, Jan Klauck jkla...@uni-osnabrueck.de wrote:


 Second observation about societal punishment eliminating free loaders. The
 fact of the matter is that *freeloading* is less of a problem in
 advanced societies than misplaced unselfishness.

Fact of the matter, hm? Freeloading is an inherent problem in many
social configurations. 9/11 brought down two towers, freeloading can
bring down an entire country.



There are very considerable knock on costs. There is the mushrooming cost of 
security. This manifests itself in many ways. There is the cost of disruption 
to 
air travel. If someone rides on a plane without a ticket no one's life is put 
at 
risk. There are the military costs, it costs $1m per year to keep a soldier in 
Afghanistan. I don't know how much a Taliban fighter costs, but it must be a 
lot 
less.

Clearly any reduction in these costs would be welcomed. If someone were to come 
along in the guise of social simulation and offer a reduction in these costs 
the 
research would pay for itself many times over. That is what you are interested in.

This may be a somewhat unpopular thing to say, but money is important. Matt 
Mahoney has costed his view of AGI. I say that costs must be recoverable as we 
go along. Matt, don't frighten people with a high estimate of cost. Frighten 
people instead with the bill they are paying now for dumb systems.
 
 simulations seem :-

 1) To be better done by Calculus.

You usually use both, equations and heuristics. It depends on the
problem, your resources, your questions, the people working with it
a.s.o.


That is the way things should be done. I agree absolutely. We could in fact 
take 
steepest descent (Calculus) and GAs and combine them together in a single 
composite program. This would in fact be quite a useful exercise. We would also 
eliminate genes that simply dealt with Calculus and steepest descent.

I don't know whether it is useful to think in topological terms.


  - Ian Parker
 




Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-24 Thread Matt Mahoney
David Jones wrote:
 I should also mention that I ran into problems mainly because I was having a 
hard time deciding how to identify objects and determine what is really going 
on 
in a scene.

I think that your approach makes the problem harder than it needs to be (not 
that it is easy). Natural language processing is hard, so researchers in an 
attempt to break down the task into simpler parts, focused on steps like 
lexical 
analysis, parsing, part of speech resolution, and semantic analysis. While 
these 
problems went unsolved, Google went directly to a solution by skipping them.

Likewise, parsing an image into physically separate objects and then building a 
3-D model makes the problem harder, not easier. Again, look at the whole 
picture. You input an image and output a response. Let the system figure out 
which features are important. If your goal is to count basketball passes, then 
it is irrelevant whether the AGI recognizes that somebody is wearing a gorilla 
suit.

 -- Matt Mahoney, matmaho...@yahoo.com





From: David Jones davidher...@gmail.com
To: agi agi@v2.listbox.com
Sent: Sat, July 24, 2010 2:25:49 PM
Subject: Re: [agi] Re: Huge Progress on the Core of AGI

Abram,

I should also mention that I ran into problems mainly because I was having a 
hard time deciding how to identify objects and determine what is really going 
on 
in a scene. This adds a whole other layer of complexity to hypotheses. It's not 
just about what is more predictive of the observations, it is about deciding 
what exactly you are observing in the first place. (although you might say its 
the same problem).

I ran into this problem when my algorithm finds matches between items that are 
not the same. Or it may not find any matches between items that are the same, 
but have changed. So, how do you decide whether it is 1) the same object, 2) a 
different object or 3) the same object but it has changed. 

And how do you decide its relationship to something else...  is it 1) 
dependently attached 2) semi-dependently attached(can move independently, but 
only in certain ways. Yet also moves dependently) 3) independent 4) sometimes 
dependent 5) was dependent, but no longer is, 6) was dependent on something 
else, but then was independent, but now is dependent on something new. 


These hypotheses are different ways of explaining the same observations, but 
are 
complicated by the fact that we aren't sure of the identity of the objects we 
are observing in the first place. Multiple hypotheses may fit the same 
observations, and its hard to decide why one is simpler or better than the 
other. The object you were observing at first may have disappeared. A new 
object 
may have appeared at the same time (this is why screenshots are a bit 
malicious). Or the object you were observing may have changed. In screenshots, 
sometimes the objects that you are trying to identify as different never appear 
at the same time because they always completely occlude each other. So, that 
can 
make it extremely difficult to decide whether they are the same object that has 
changed or different objects.

Such ambiguities are common in AGI. It is unclear to me yet how to deal with 
them effectively, although I am continuing to work hard on it. 


I know its a bit of a mess, but I'm just trying to demonstrate the trouble I've 
run into. 


I hope that makes it more clear why I'm having so much trouble finding a way of 
determining what hypothesis is most predictive and simplest.

Dave


On Thu, Jul 22, 2010 at 10:23 PM, Abram Demski abramdem...@gmail.com wrote:

David,

What are the different ways you are thinking of for measuring the 
predictiveness? I can think of a few different possibilities (such as 
measuring 
number incorrect vs measuring fraction incorrect, et cetera) but I'm wondering 
which variations you consider significant/troublesome/etc.

--Abram


On Thu, Jul 22, 2010 at 7:12 PM, David Jones davidher...@gmail.com wrote:

It's certainly not as simple as you claim. First, assigning a probability is 
not 
always possible, nor is it easy. The factors in calculating that probability 
are 
unknown and are not the same for every instance. Since we do not know what 
combination of observations we will see, we cannot have a predefined set of 
probabilities, nor is it any easier to create a probability function that 
generates them for us. That is exactly what I meant by quantitatively 
define the predictiveness... it would be proportional to the probability. 

Second, if you can define a program in a way that is always simpler when it is 
smaller, then you can do the same thing without a program. I don't think it 
makes any sense to do it this way. 

It is not that simple. If it was, we could solve a large portion of agi 
easily.
On Thu, Jul 22, 2010 at 3:16 PM, Matt Mahoney matmaho...@yahoo.com wrote:
David Jones wrote:
 But, I am amazed at how difficult it is to quantitatively define more

Re: [agi] Comments On My Skepticism of Solomonoff Induction

2010-07-24 Thread Matt Mahoney
Jim Bromer wrote:
 Solomonoff Induction may require a trans-infinite level of complexity just to 
run each program. 

Trans-infinite is not a mathematically defined term as far as I can tell. 
Maybe you mean larger than infinity, as in the infinite set of real numbers is 
larger than the infinite set of natural numbers (which is true).

But it is not true that Solomonoff induction requires more than aleph-null 
operations. (Aleph-null is the size of the set of natural numbers, the 
smallest 
infinity). An exact calculation requires that you test aleph-null programs for 
aleph-null time steps each. There are aleph-null programs because each program 
is a finite length string, and there is a 1 to 1 correspondence between the set 
of finite strings and N, the set of natural numbers. Also, each program 
requires 
aleph-null computation in the case that it runs forever, because each step in 
the infinite computation can be numbered 1, 2, 3...

However, the total amount of computation is still aleph-null because each step 
of each program can be described by an ordered pair (m,n) in N^2, meaning the 
n'th step of the m'th program, where m and n are natural numbers. The 
cardinality of N^2 is the same as the cardinality of N because there is a 1 to 
1 
correspondence between the sets. You can order the ordered pairs as (1,1), 
(1,2), (2,1), (1,3), (2,2), (3,1), (1,4), (2,3), (3,2), (4,1), (1,5), etc. 
See http://en.wikipedia.org/wiki/Countable_set#More_formal_introduction

Furthermore you may approximate Solomonoff induction to any desired precision 
with finite computation. Simply interleave the execution of all programs as 
indicated in the ordering of ordered pairs that I just gave, where the programs 
are ordered from shortest to longest. Take the shortest program found so far 
that outputs your string, x. It is guaranteed that this algorithm will approach 
and eventually find the shortest program that outputs x given sufficient time, 
because this program exists and it halts.
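(The interleaving Matt describes can be sketched in runnable form. A real implementation 
would enumerate all programs of a universal machine; here, purely for illustration, the 
"programs" are two Python generators with assumed lengths, and only the diagonal 
scheduling and keep-the-shortest-match bookkeeping are the point.)

def diagonal_pairs():
    # Yield (program index, step) in the order (0,0),(0,1),(1,0),(0,2),(1,1),(2,0),...
    s = 0
    while True:
        for m in range(s + 1):
            yield m, s - m
        s += 1

def dovetail(programs, target, max_total_steps=10_000):
    # programs: list of (assumed length, generator factory). Return the shortest
    # length whose accumulated output equals target, or None within the budget.
    runs = [(length, factory()) for length, factory in programs]
    outputs = ["" for _ in programs]
    best = None
    for (m, _), _ in zip(diagonal_pairs(), range(max_total_steps)):
        if m >= len(runs):
            continue
        length, gen = runs[m]
        try:
            outputs[m] += next(gen)          # advance program m by one step
        except StopIteration:
            continue
        if outputs[m] == target and (best is None or length < best):
            best = length
    return best

if __name__ == "__main__":
    def repeat_01():                         # stand-in for "print 01 forever"
        while True:
            yield "01"
    def literal(s):                          # stand-in for "print the literal string s"
        def gen():
            yield s
        return gen
    programs = [(5, repeat_01), (16, literal("0101010101010101"))]   # assumed lengths
    print(dovetail(programs, "0101010101010101"))   # -> 5, the shorter program wins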

In case you are wondering how Solomonoff induction is not computable, the 
problem is that after this algorithm finds the true shortest program that 
outputs x, it will keep running forever and you might still be wondering if a 
shorter program is forthcoming. In general you won't know.

 -- Matt Mahoney, matmaho...@yahoo.com





From: Jim Bromer jimbro...@gmail.com
To: agi agi@v2.listbox.com
Sent: Sat, July 24, 2010 3:59:18 PM
Subject: Re: [agi] Comments On My Skepticism of Solomonoff Induction


Solomonoff Induction may require a trans-infinite level of complexity just to 
run each program.  Suppose each program is iterated through the enumeration of 
its instructions.  Then, not only do the infinity of possible programs need to 
be run, many combinations of the infinite programs from each simulated Turing 
Machine also have to be tried.  All the possible combinations of (accepted) 
programs, one from any two or more of the (accepted) programs produced by each 
simulated Turing Machine, have to be tried.  Although these combinations of 
programs from each of the simulated Turing Machine may not all be unique, they 
all have to be tried.  Since each simulated Turing Machine would produce 
infinite programs, I am pretty sure that this means that Solmonoff Induction 
is, 
by definition, trans-infinite.
Jim Bromer


On Thu, Jul 22, 2010 at 2:06 PM, Jim Bromer jimbro...@gmail.com wrote:

I have to retract my claim that the programs of Solomonoff Induction would be 
trans-infinite.  Each of the infinite individual programs could be enumerated 
by 
their individual instructions so some combination of unique individual programs 
would not correspond to a unique program but to the enumerated program that 
corresponds to the string of their individual instructions.  So I got that one 
wrong.
Jim Bromer



Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-24 Thread Matt Mahoney
Mike Tintner wrote:
 Huh, Matt? What examples of this holistic scene analysis are there (or are 
you thinking about)?
 
I mean a neural model with increasingly complex features, as opposed to an 
algorithmic 3-D model (like video game graphics in reverse).

Of course David rejects such ideas ( http://practicalai.org/Prize/Default.aspx 
) 
even though the one proven working vision model uses it.

-- Matt Mahoney, matmaho...@yahoo.com





From: Mike Tintner tint...@blueyonder.co.uk
To: agi agi@v2.listbox.com
Sent: Sat, July 24, 2010 6:16:07 PM
Subject: Re: [agi] Re: Huge Progress on the Core of AGI


Huh, Matt? What examples of this holistic scene  analysis are there (or are 
you thinking about)?


From: Matt Mahoney 
Sent: Saturday, July 24, 2010 10:25 PM
To: agi 
Subject: Re: [agi] Re: Huge Progress on the Core of  AGI

David Jones wrote:
 I should also mention that I ran into  problems mainly because I was having a 
hard time deciding how to identify  objects and determine what is really going 
on in a scene.

I think that your approach makes the problem harder than it needs to be  (not 
that it is easy). Natural language processing is hard, so researchers in an  
attempt to break down the task into simpler parts, focused on steps like 
lexical  
analysis, parsing, part of speech resolution, and semantic analysis. While 
these  
problems went unsolved, Google went directly to a solution by skipping  them.

Likewise, parsing an image into physically separate objects and then  building 
a 
3-D model makes the problem harder, not easier. Again, look at the  whole 
picture. You input an image and output a response. Let the system figure  out 
which features are important. If your goal is to count basketball passes,  then 
it is irrelevant whether the AGI recognizes that somebody is wearing a  gorilla 
suit.

 


Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-24 Thread Matt Mahoney
Mike Tintner wrote:
 Which is?
 
The one right behind your eyes.

-- Matt Mahoney, matmaho...@yahoo.com





From: Mike Tintner tint...@blueyonder.co.uk
To: agi agi@v2.listbox.com
Sent: Sat, July 24, 2010 9:00:42 PM
Subject: Re: [agi] Re: Huge Progress on the Core of AGI


Matt: 
I mean a neural model with increasingly complex features, as opposed to an  
algorithmic 3-D model (like video game graphics in reverse). Of course David  
rejects such ideas ( http://practicalai.org/Prize/Default.aspx )  even though 
the one proven working vision model uses it.
 
 
Which is? and does what?  (I'm starting to consider that vision and  visual 
perception  -  or perhaps one should say common sense, since  no sense in 
humans works independent of the others -  may well be  considerably *more* 
complex than language. The evolutionary time required to  develop our common 
sense perception and conception of the world was vastly  greater than that 
required to develop language. And we are as a culture merely  in our babbling 
infancy in beginning to understand how sensory images work and  are processed).


Re: [agi] Pretty worldchanging

2010-07-23 Thread Matt Mahoney
The video says it has 2 GB of memory. I assume that's SSD and there is no disk.

It's actually not hard to find a computer for $35. People are always throwing 
away old computers that still work.

 -- Matt Mahoney, matmaho...@yahoo.com





From: Mike Tintner tint...@blueyonder.co.uk
To: agi agi@v2.listbox.com
Sent: Fri, July 23, 2010 9:50:44 AM
Subject: [agi] Pretty worldchanging


this strikes me as socially worldchanging if it  works - potentially leading to 
you-ain't-see-nothing-yet changes in world  education ( commerce) levels over 
the next decade:
 
http://www.physorg.com/news199083092.html
 
Any comments on its technical  massproduction  viability ?


Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-22 Thread Matt Mahoney
David Jones wrote:
 But, I am amazed at how difficult it is to quantitatively define more 
predictive and simpler for specific problems. 

It isn't hard. To measure predictiveness, you assign a probability to each 
possible outcome. If the actual outcome has probability p, you score a penalty 
of log(1/p) bits. To measure simplicity, use the compressed size of the code 
for 
your prediction algorithm. Then add the two scores together. That's how it is 
done in the Calgary challenge http://www.mailcom.com/challenge/ and in my own 
text compression benchmark.
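(A minimal sketch of that scoring rule: log-loss in bits for predictiveness plus the 
compressed size of the predictor's source as a crude stand-in for code length. The 
Laplace-style predictor below is a throwaway example, not anyone's actual method.)

import inspect
import math
import zlib

def predictor(history):
    # Toy predictor: P(next bit = 1) = (ones + 1) / (len + 2), the Laplace rule.
    ones = sum(history)
    return (ones + 1) / (len(history) + 2)

def score(predictor, bits):
    log_loss = 0.0
    for i, actual in enumerate(bits):
        p1 = predictor(bits[:i])
        p = p1 if actual == 1 else 1.0 - p1
        log_loss += math.log2(1.0 / p)               # penalty in bits for this outcome
    # Simplicity term: compressed size of the predictor's source code, in bits.
    code_size = 8 * len(zlib.compress(inspect.getsource(predictor).encode()))
    return log_loss + code_size                      # total cost in bits

if __name__ == "__main__":
    data = [0, 1] * 50                               # a highly regular sequence
    print(score(predictor, data))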

 -- Matt Mahoney, matmaho...@yahoo.com





From: David Jones davidher...@gmail.com
To: agi agi@v2.listbox.com
Sent: Thu, July 22, 2010 3:11:46 PM
Subject: Re: [agi] Re: Huge Progress on the Core of AGI

Because simpler is not better if it is less predictive.



On Thu, Jul 22, 2010 at 1:21 PM, Abram Demski abramdem...@gmail.com wrote:

Jim,

Why more predictive *and then* simpler?

--Abram


On Thu, Jul 22, 2010 at 11:49 AM, David Jones davidher...@gmail.com wrote:

An Update

I think the following gets to the heart of general AI and what it takes to  
achieve it. It also provides us with evidence as to why general AI is so 
difficult. With this new knowledge in mind, I think I will be much more 
capable 
now  of solving the problems and making it work. 


I've come to the conclusion lately that the best hypothesis is better because 
it 
is more predictive and then simpler than other hypotheses (in that order 
more predictive... then simpler). But, I am amazed at how difficult it is to 
quantitatively define more predictive and simpler for specific problems. This 
is 
why I have sometimes doubted the truth of the statement.

In addition, the observations that the AI gets are not representative of all 
observations! This means that if your measure of predictiveness depends on 
the 
number of certain observations, it could make mistakes! So, the specific 
observations you are aware of may be unrepresentative of the predictiveness 
of a 
hypothesis relative to the truth. If you try to calculate which hypothesis is 
more predictive and you don't have the critical observations that would give 
you 
the right answer, you may get the wrong answer! This all depends of course on 
your method of calculation, which is quite elusive to define. 


Visual input from screenshots, for example, can be somewhat malicious. Things 
can move, appear, disappear or occlude each other suddenly. So, without 
sufficient knowledge it is hard to decide whether matches you find between 
such 
large changes are because it is the same object or a different object. This 
may 
indicate that bias and preprogrammed experience should be introduced to the 
AI 
before training. Either that or the training inputs should be carefully 
chosen 
to avoid malicious input and to make them nice for learning. 


This is the correspondence problem that is typical of computer vision and 
has 
never been properly solved. Such malicious input also makes it difficult to 
learn automatically because the AI doesn't have sufficient experience to know 
which changes or transformations are acceptable and which are not. It is 
immediately bombarded with malicious inputs.

I've also realized that if a hypothesis is more explanatory, it may be 
better. 
But quantitatively defining explanatory is also elusive and truly depends on 
the 
specific problems you are applying it to because it is a heuristic. It is not 
a 
true measure of correctness. It is not loyal to the truth. More explanatory 
is 
really a heuristic that helps us find hypothesis that are more predictive. 
The 
true measure of whether a hypothesis is better is simply the most accurate 
and 
predictive hypothesis. That is the ultimate and true measure of correctness.

Also, since we can't measure every possible prediction or every last 
prediction 
(and we certainly can't predict everything), our measure of predictiveness 
can't 
possibly be right all the time! We have no choice but to use a heuristic of 
some 
kind.

So, its clear to me that the right hypothesis is more predictive and then 
simpler. But, it is also clear that there will never be a single measure of 
this that can be applied to all problems. I hope to eventually find a nice 
model 
for how to apply it to different problems though. This may be the reason that 
so 
many people have tried and failed to develop general AI. Yes, there is a 
solution. But there is no silver bullet that can be applied to all problems. 
Some methods are better than others. But I think another major reason of the 
failures is that people think they can predict things without sufficient 
information. By approaching the problem this way, we compound the need for 
heuristics and the errors they produce because we simply don't have 
sufficient 
information to make a good decision with limited evidence. If approached 
correctly, the right solution would solve many more

Re: [agi] How do we hear music

2010-07-22 Thread Matt Mahoney
deepakjnath wrote:

 Why do we listen to a song sung in a different scale and yet identify it as the 
same song?  Does it have something to do with the fundamental way in which we 
store memory?

For the same reason that gray looks green on a red background. You have more 
neurons that respond to differences in tones than to absolute frequencies.
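(A small illustration of the relative-pitch point: encode a melody as the intervals 
between successive notes, here MIDI note numbers, and transposing the whole tune 
leaves that encoding unchanged. The melody below is arbitrary.)

def intervals(midi_notes):
    # Differences between successive pitches; invariant under transposition.
    return [b - a for a, b in zip(midi_notes, midi_notes[1:])]

melody     = [60, 62, 64, 65, 67, 67]        # C D E F G G
transposed = [note + 7 for note in melody]   # same tune, a fifth higher

print(intervals(melody) == intervals(transposed))   # True: identical interval pattern
print(melody == transposed)                         # False: different absolute pitches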

 -- Matt Mahoney, matmaho...@yahoo.com





From: deepakjnath deepakjn...@gmail.com
To: agi agi@v2.listbox.com
Sent: Thu, July 22, 2010 3:59:57 PM
Subject: [agi] How do we hear music

Why do we listen to a song sung in a different scale and yet identify it as the 
same song?  Does it have something to do with the fundamental way in which we 
store memory?

cheers,
Deepak



Re: [agi] Comments On My Skepticism of Solomonoff Induction

2010-07-22 Thread Matt Mahoney
Jim Bromer wrote:
 Please give me a little more explanation why you say the fundamental method 
 is 
that the probability of a string x is proportional to the sum of all programs 
M 
that output x weighted by 2^-|M|.  Why is the M in a bracket?

By |M| I mean the length of the program M in bits. Why 2^-|M|? Because each bit 
means you can have twice as many programs, so they should count half as much.
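(One way to see where 2^-|M| comes from is the coin-flipping picture discussed elsewhere 
in this thread: feed the machine random bits, and a prefix-free program of length |M| is 
the one that gets executed with probability 2^-|M|. A Monte Carlo sketch, with made-up 
prefix-free codewords standing in for programs:)

import random

programs = ["0", "10", "110", "111"]      # prefix-free: no codeword extends another
rng = random.Random(0)
counts = {p: 0 for p in programs}
trials = 200_000

for _ in range(trials):
    flips = ""
    while flips not in counts:            # flip coins until some codeword is matched
        flips += rng.choice("01")
    counts[flips] += 1

for p in programs:
    print("|M|=%d  observed %.3f  expected %.3f"
          % (len(p), counts[p] / trials, 2 ** -len(p)))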

Being uncomputable doesn't make it wrong. The fact that there is no general 
procedure for finding the shortest program that outputs a string doesn't mean 
that you can never find it, or that for many cases you can't approximate it.

You apply Solomonoff induction all the time. What is the next bit in these 
sequences?

1. 0101010101010101010101010101010

2. 11001001110110101010001

In sequence 1 there is an obvious pattern with a short description. You can 
find 
a short program that outputs 0 and 1 alternately forever, so you predict the 
next bit will be 1. It might not be the shortest program, but it is enough that 
alternate 0 and 1 forever is shorter than alternate 0 and 1 15 times 
followed 
by 00 that you can confidently predict the first hypothesis is more likely.

The second sequence is not so obvious. It looks like random bits. With enough 
intelligence (or computation) you might discover that the sequence is a binary 
representation of pi, and therefore the next bit is 0. But the fact that you 
might not discover the shortest description does not invalidate the principle. 
It just says that you can't always apply Solomonoff induction and get the 
number 
you want.

Perhaps http://en.wikipedia.org/wiki/Kolmogorov_complexity will make this clear.

 -- Matt Mahoney, matmaho...@yahoo.com





From: Jim Bromer jimbro...@gmail.com
To: agi agi@v2.listbox.com
Sent: Thu, July 22, 2010 5:06:12 PM
Subject: Re: [agi] Comments On My Skepticism of Solomonoff Induction


On Wed, Jul 21, 2010 at 8:47 PM, Matt Mahoney matmaho...@yahoo.com wrote:
The fundamental method is that the probability of a string x is proportional to 
the sum of all programs M that output x weighted by 2^-|M|. That probability is 
dominated by the shortest program, but it is equally uncomputable either way.
Also, please point me to this mathematical community that you claim rejects 
Solomonoff induction. Can you find even one paper that refutes it?
 
You give a precise statement of the probability in general terms, but then say 
that it is uncomputable.  Then you ask if there is a paper that refutes it.  
Well, why would any serious mathematician bother to refute it since you 
yourself 
acknowledge that it is uncomputable and therefore unverifiable and therefore 
not 
a mathematical theorem that can be proven true or false?  It isn't like you 
claimed that the mathematical statement is verifiable. It is as if you are 
making a statement and then ducking any responsibility for it by denying that 
it 
is even an evaluation.  You honestly don't see the irregularity?
 
My point is that the general mathematical community doesn't accept Solomonoff 
Induction, not that I have a paper that refutes it, whatever that would mean.
 
Please give me a little more explanation why you say the fundamental method is 
that the probability of a string x is proportional to the sum of all programs M 
that output x weighted by 2^-|M|.  Why is the M in a bracket?

 
On Wed, Jul 21, 2010 at 8:47 PM, Matt Mahoney matmaho...@yahoo.com wrote:

Jim Bromer wrote:
 The fundamental method of Solomonoff Induction is trans-infinite.


The fundamental method is that the probability of a string x is proportional 
to 
the sum of all programs M that output x weighted by 2^-|M|. That probability 
is 
dominated by the shortest program, but it is equally uncomputable either way. 
How does this approximation invalidate Solomonoff induction?


Also, please point me to this mathematical community that you claim rejects 
Solomonoff induction. Can you find even one paper that refutes it?

 -- Matt Mahoney, matmaho...@yahoo.com 






 From: Jim Bromer jimbro...@gmail.com
To: agi agi@v2.listbox.com
Sent: Wed, July 21, 2010 3:08:13 PM 

Subject: Re: [agi] Comments On My Skepticism of Solomonoff Induction
 


I should have said, It would be unwise to claim that this method could stand 
as 
an ideal for some valid and feasible application of probability.
Jim Bromer


On Wed, Jul 21, 2010 at 2:47 PM, Jim Bromer jimbro...@gmail.com wrote:

The fundamental method of Solomonoff Induction is trans-infinite.  Suppose you 
iterate through all possible programs, combining different programs as you go. 
 
Then you have an infinite number of possible programs which have a 
trans-infinite number of combinations, because each tier of combinations can 
then be recombined to produce a second, third, fourth,... tier of 
recombinations.
 
Anyone who claims that this method is the ideal for a method of applied 
probability is unwise

Re: [agi] The Collective Brain

2010-07-21 Thread Matt Mahoney
Mike Tintner wrote:
 The fantasy of a superAGI machine that can grow individually without a vast 
society supporting it, is another one of the wild fantasies of AGI-ers 
and Singularitarians that violate truly basic laws of nature. Individual 
brains 
cannot flourish individually in the real world, only societies of brains (and 
bodies) can.

I agree. It is the basis of my AGI design, to supplement a global brain with 
computers. http://mattmahoney.net/agi2.html

 -- Matt Mahoney, matmaho...@yahoo.com





From: Mike Tintner tint...@blueyonder.co.uk
To: agi agi@v2.listbox.com
Sent: Tue, July 20, 2010 1:50:45 PM
Subject: [agi] The Collective Brain


http://www.ted.com/talks/matt_ridley_when_ideas_have_sex.html?utm_source=newsletter_weekly_2010-07-20&utm_campaign=newsletter_weekly&utm_medium=email

 
Good lecture worth looking at about how trade -  exchange of both goods and 
ideas - has fostered civilisation. Near the end  introduces a v. important idea 
- the collective brain. In other words, our  apparently individual 
intelligence is actually a collective intelligence. Nobody  he points out 
actually knows how to make a computer mouse, although that may  seem 
counterintuitive  - it's an immensely complex piece of equipment,  simple as it 
may appear, that engages the collective,  interdependent intelligence and 
productive efforts of vast numbers of  people.
 
When you start thinking like that, you realise that  there is v. little we know 
how to do, esp of an intellectual nature,  individually, without the implicit 
and explicit collaboration of vast numbers of  people and sectors of society. 

 
The fantasy of a superAGI machine that can grow  individually without a vast 
society supporting it, is another one of the  wild fantasies of AGI-ers 
and Singularitarians that violate truly  basic laws of nature. Individual 
brains 
cannot flourish individually in the real  world, only societies of brains (and 
bodies) can. 

 
(And of course computers can do absolutely nothing  or in any way survive 
without their human masters - even if it may appear that  way, if you don't 
look 
properly at their whole  operation)


Re: [agi] Comments On My Skepticism of Solomonoff Induction

2010-07-21 Thread Matt Mahoney
Jim Bromer wrote:
 The question was asked whether, given infinite resources could Solomonoff 
Induction work.  I made the assumption that it was computable and found that 
it 
wouldn't work.  

On what infinitely powerful computer did you do your experiment?

 My conclusion suggests that the use of Solomonoff Induction as an ideal for 
compression or something like MDL is not only unsubstantiated but based on a 
massive inability to comprehend the idea of a program that runs every possible 
program. 

It is sufficient to find the shortest program consistent with past results, not 
all programs. The difference is no more than the language-dependent constant. 
Legg proved this in the paper that Ben and I both pointed you to. Do you 
dispute 
his proof? I guess you don't, because you didn't respond the last 3 times this 
was pointed out to you.

 I am comfortable with the conclusion that the claim that Solomonoff Induction 
is an ideal for compression or induction or anything else is pretty shallow 
and not based on careful consideration.

I am comfortable with the conclusion that the world is flat because I have a 
gut 
feeling about it and I ignore overwhelming evidence to the contrary.

 There is a chance that I am wrong

So why don't you drop it?

 -- Matt Mahoney, matmaho...@yahoo.com





From: Jim Bromer jimbro...@gmail.com
To: agi agi@v2.listbox.com
Sent: Tue, July 20, 2010 3:10:40 PM
Subject: Re: [agi] Comments On My Skepticism of Solomonoff Induction


The question was asked whether, given infinite resources could Solomonoff 
Induction work.  I made the assumption that it was computable and found that it 
wouldn't work.  It is not computable, even with infinite resources, for the 
kind 
of thing that was claimed it would do. (I believe that with a governance 
program 
it might actually be programmable) but it could not be used to predict (or 
compute the probability of) a subsequent string given some prefix string.  Not 
only is the method impractical, it is theoretically inane.  My conclusion 
suggests that the use of Solomonoff Induction as an ideal for compression or 
something like MDL is not only unsubstantiated but based on a massive inability 
to comprehend the idea of a program that runs every possible program.  

 
I am comfortable with the conclusion that the claim that Solomonoff Induction 
is 
an ideal for compression or induction or anything else is pretty shallow and 
not based on careful consideration.
 
There is a chance that I am wrong, but I am confident that there is nothing in 
the definition of Solomonoff Induction that could be used to prove it.
Jim Bromer


Re: [agi] Comments On My Skepticism of Solomonoff Induction

2010-07-21 Thread Matt Mahoney
Jim Bromer wrote:
 The fundamental method of Solomonoff Induction is trans-infinite.

The fundamental method is that the probability of a string x is proportional to 
the sum, over all programs M that output x, of the weights 2^-|M|. That 
probability is dominated by the shortest program, but it is equally uncomputable 
either way. How does this approximation invalidate Solomonoff induction?
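
As a purely illustrative sketch of that weighting (Solomonoff induction itself is 
uncomputable; the finite program table below, with its outputs and bit lengths, is 
invented just for the example):

/* Toy sketch: sum the weights 2^-|M| of the (hypothetical) programs that
 * output the target string x.  Real Solomonoff induction sums over all
 * programs of a universal machine and cannot be computed. */
#include <stdio.h>
#include <string.h>
#include <math.h>

struct prog { const char *output; int len_bits; };   /* output and |M| in bits */

int main(void) {
    struct prog tbl[] = { {"0000", 3}, {"0000", 6}, {"0001", 7}, {"1111", 4} };
    int n = sizeof(tbl) / sizeof(tbl[0]);
    const char *x = "0000";
    double p = 0.0;
    for (int i = 0; i < n; i++)
        if (strcmp(tbl[i].output, x) == 0)
            p += pow(2.0, -tbl[i].len_bits);          /* add weight 2^-|M| */
    printf("unnormalized P(%s) = %g\n", x, p);        /* 2^-3 + 2^-6 = 0.140625 */
    return 0;
}

The 3-bit program contributes 0.125 of the 0.140625 total, which is the sense in 
which the shortest program dominates.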

Also, please point me to this mathematical community that you claim rejects 
Solomonoff induction. Can you find even one paper that refutes it?

 -- Matt Mahoney, matmaho...@yahoo.com





From: Jim Bromer jimbro...@gmail.com
To: agi agi@v2.listbox.com
Sent: Wed, July 21, 2010 3:08:13 PM
Subject: Re: [agi] Comments On My Skepticism of Solomonoff Induction


I should have said, "It would be unwise to claim that this method could stand as 
an ideal for some valid and feasible application of probability."
Jim Bromer


On Wed, Jul 21, 2010 at 2:47 PM, Jim Bromer jimbro...@gmail.com wrote:

The fundamental method of Solomonoff Induction is trans-infinite.  Suppose you 
iterate through all possible programs, combining different programs as you go.  
Then you have an infinite number of possible programs which have a 
trans-infinite number of combinations, because each tier of combinations can 
then be recombined to produce a second, third, fourth,... tier of 
recombinations.
 
Anyone who claims that this method is the ideal for a method of applied 
probability is unwise.
 Jim Bromer



Re: [agi] Of definitions and tests of AGI

2010-07-20 Thread Matt Mahoney
Mike, I think we all agree that we should not have to tell an AGI the steps to 
solving problems. It should learn and figure it out, like the way that people 
figure it out.

The question is how to do that. We know that it is possible. For example, I 
could write a chess program that I could not win against. I could write the 
program in such a way that it learns to improve its game by playing against 
itself or other opponents. I could write it in such a way that initially does 
not know the rules for chess, but instead learns the rules by being given 
examples of legal and illegal moves.

What we have not yet been able to do is scale this type of learning and problem 
solving up to general, human level intelligence. I believe it is possible, but 
it will require lots of training data and lots of computing power. It is not 
something you could do on a PC, and it won't be cheap.

 -- Matt Mahoney, matmaho...@yahoo.com





From: Mike Tintner tint...@blueyonder.co.uk
To: agi agi@v2.listbox.com
Sent: Mon, July 19, 2010 9:07:53 PM
Subject: Re: [agi] Of definitions and tests of AGI


The issue isn't what a computer can do. The issue  is how you structure the 
computer's or any agent's thinking about a  problem. Programs/Turing machines 
are only one way of structuring  thinking/problemsolving - by, among other 
things, giving the  computer a method/process of solution. There is an 
alternative way of  structuring a computer's thinking, which incl., among other 
things, not giving  it a method/ process of solution, but making it rather than 
a human  programmer do the real problemsolving.  More of that another  time.


From: Matt Mahoney 
Sent: Tuesday, July 20, 2010 1:38 AM
To: agi 
Subject: Re: [agi] Of definitions and tests of AGI

Creativity is the good feeling you get when you discover a clever solution  to 
a 
hard problem without knowing the process you used to discover it.

I think a computer could do that.

 -- Matt Mahoney, matmaho...@yahoo.com 





 From: Mike Tintner tint...@blueyonder.co.uk
To: agi agi@v2.listbox.com
Sent: Mon, July 19, 2010 2:08:28  PM
Subject: Re: [agi] Of  definitions and tests of AGI


Yes that's what people do, but it's not what  programmed computers do.
 
The useful formulation that emerges here  is:
 
narrow AI (and in fact all rational) problems   have *a method of solution*  
(to 
be equated with general  method)   - and are programmable (a program is a 
method of  solution)
 
AGI  (and in fact all creative) problems do  NOT have *a method of solution* 
(in 
the general sense)  -  rather a one-off *way of solving the problem* has to be 
improvised each  time.
 
AGI/creative problems do not in fact have a method  of solution, period. There 
is no (general) method of solving either the toy box  or the build-a-rock-wall 
problem - one essential feature which makes them  AGI.
 
You can learn, as you indicate, from *parts* of any  given AGI/creative 
solution, and apply the lessons to future problems - and  indeed with practice, 
should improve at solving any given kind of AGI/creative  problem. But you can 
never apply a *whole* solution/way to further  problems.
 
P.S. One should add that in terms of computers, we  are talking here of 
*complete, step-by-step* methods of  solution.
 


From: rob levy 
Sent: Monday, July 19, 2010 5:09 PM
To: agi 
Subject: Re: [agi] Of definitions and tests of AGI
  
And are you happy with:
 
AGI is about devising *one-off* methods of problemsolving (that only apply to 
the individual problem, and cannot be re-used - at 
least not in their totality)
 

Yes exactly, isn't that what people do?  Also, I think that being  able to 
recognize where past solutions can be generalized and where past  solutions can 
be varied and reused is a detail of how intelligence works that is  likely to 
be 
universal.

 
vs
 
narrow AI is about applying pre-existing *general* methods of problemsolving 
(applicable to whole classes of problems)?
 
 


From: rob levy 
Sent: Monday, July 19, 2010 4:45 PM
To: agi 
Subject: Re: [agi] Of definitions and tests of AGI

Well, solving ANY problem is a little too strong.  This is AGI, not AGH 
(artificial godhead), though AGH could be an unintended consequence ;).  So I 
would rephrase solving any problem as being able to come up with reasonable 
approaches and strategies to any problem (just as humans are able to do).


On Mon, Jul 19, 2010 at 11:32 AM, Mike Tintner tint...@blueyonder.co.uk 
wrote:

Whaddya mean by solve the problem of how to  solve problems? Develop a 
universal approach to solving any problem?  Or find a method of solving a 
class of problems? Or what?


From: rob levy 
Sent: Monday, July 19, 2010 1:26 PM
To: agi 
Subject: Re: [agi] Of definitions and tests of  AGI


 
However, I see that there are no valid definitions of AGI that explain 
what AGI is generally, and why

Re: [agi] Of definitions and tests of AGI

2010-07-19 Thread Matt Mahoney
Creativity is the good feeling you get when you discover a clever solution to a 
hard problem without knowing the process you used to discover it.

I think a computer could do that.

 -- Matt Mahoney, matmaho...@yahoo.com





From: Mike Tintner tint...@blueyonder.co.uk
To: agi agi@v2.listbox.com
Sent: Mon, July 19, 2010 2:08:28 PM
Subject: Re: [agi] Of definitions and tests of AGI


Yes that's what people do, but it's not what  programmed computers do.
 
The useful formulation that emerges here  is:
 
narrow AI (and in fact all rational) problems   have *a method of solution*  
(to 
be equated with general  method)   - and are programmable (a program is a 
method of  solution)
 
AGI  (and in fact all creative) problems do  NOT have *a method of solution* 
(in 
the general sense)  -  rather a one-off *way of solving the problem* has to be 
improvised each  time.
 
AGI/creative problems do not in fact have a method  of solution, period. There 
is no (general) method of solving either the toy box  or the build-a-rock-wall 
problem - one essential feature which makes them  AGI.
 
You can learn, as you indicate, from *parts* of any  given AGI/creative 
solution, and apply the lessons to future problems - and  indeed with practice, 
should improve at solving any given kind of AGI/creative  problem. But you can 
never apply a *whole* solution/way to further  problems.
 
P.S. One should add that in terms of computers, we  are talking here of 
*complete, step-by-step* methods of  solution.
 


From: rob levy 
Sent: Monday, July 19, 2010 5:09 PM
To: agi 
Subject: Re: [agi] Of definitions and tests of AGI
  
And are you happy with:
 
AGI is about devising *one-off* methods of problemsolving (that only apply to 
the individual problem, and cannot be re-used - at 
least not in their totality)
 

Yes exactly, isn't that what people do?  Also, I think that being  able to 
recognize where past solutions can be generalized and where past  solutions can 
be varied and reused is a detail of how intelligence works that is  likely to 
be 
universal.

 
vs
 
narrow AI is about applying pre-existing *general* methods of problemsolving 
(applicable to whole classes of problems)?
 
 


From: rob levy 
Sent: Monday, July 19, 2010 4:45 PM
To: agi 
Subject: Re: [agi] Of definitions and tests of AGI

Well, solving ANY problem is a little too strong.  This is AGI, not AGH 
(artificial godhead), though AGH could be an unintended consequence ;).  So I 
would rephrase solving any problem as being able to come up with reasonable 
approaches and strategies to any problem (just as humans are able to do).


On Mon, Jul 19, 2010 at 11:32 AM, Mike Tintner tint...@blueyonder.co.uk 
wrote:

Whaddya mean by solve the problem of how to  solve problems? Develop a 
universal approach to solving any problem?  Or find a method of solving a 
class of problems? Or what?


From: rob levy 
Sent: Monday, July 19, 2010 1:26 PM
To: agi 
Subject: Re: [agi] Of definitions and tests of  AGI


 
However, I see that there are no valid definitions of AGI that explain 
what AGI is generally, and why these tests are indeed AGI. Google - 
there are v. few defs. of AGI or Strong AI, period.




I like Fogel's idea that intelligence is the ability to solve the  
problem 
of how to solve problems in new and changing environments.  I  don't 
think 
Fogel's method accomplishes this, but the goal he expresses  seems to be 
the 
goal of AGI as I understand it. 


Rob


Re: [agi] Of definitions and tests of AGI

2010-07-18 Thread Matt Mahoney
http://www.loebner.net/Prizef/loebner-prize.html

 -- Matt Mahoney, matmaho...@yahoo.com





From: David Jones davidher...@gmail.com
To: agi agi@v2.listbox.com
Sent: Sun, July 18, 2010 3:10:12 PM
Subject: Re: [agi] Of definitions and tests of AGI

If you can't convince someone, clearly something is wrong with it. I don't 
think 
a test is the right way to do this. Which is why I haven't commented much. 
When you understand how to create AGI, it will be obvious that it is AGI or 
that 
it is what you intend it to be. You'll then understand how what you have built 
fits into the bigger scheme of things. There is no such point at which you can 
say something is AGI and not AGI. Intelligence is a very subjective thing 
that really depends on your goals. Someone will always say it is not good 
enough. But if it really works, people will quickly realize it based on results.

What you want is to develop a system that can learn about the world or its 
environment in a general way so that it can solve arbitrary problems, be able 
to 
plan in general ways, act in general ways and perform the types of goals you 
want it to perform. 


Dave


On Sun, Jul 18, 2010 at 3:03 PM, deepakjnath deepakjn...@gmail.com wrote:

So if I have a system that is close to AGI, I have no way of really knowing it 
right? 


Even if I believe that my system is a true AGI there is no way of convincing 
the 
others irrefutably that this system is indeed an AGI and not just an advanced AI 
system.

I have read the toy box problem and the rock wall problem, but I am sure not 
many people will be convinced.

I wanted to know if there is any consensus on a general problem which can 
be solved and only solved by a true AGI. Without such a test bench, how will we 
know if we are moving closer to or further from our quest? There is no map.

Deepak





On Sun, Jul 18, 2010 at 11:50 PM, Mike Tintner tint...@blueyonder.co.uk 
wrote:

I realised that what is needed is a *joint*  definition *and*  range of tests 
of 
AGI.
 
Benjamin Johnston has submitted one valid test - the toy box problem. (See 
archives).
 
I have submitted another still simpler valid test -  build a rock wall from 
rocks given, (or fill an earth hole with  rocks).
 
However, I see that there are no valid definitions  of AGI that explain what 
AGI 
is generally , and why these tests are indeed  AGI. Google - there are v. few 
defs. of AGI or Strong AI, period.
 
The most common: "AGI is human-level intelligence" -  is an 
embarrassing non-starter - what distinguishes human  intelligence? No 
explanation offered.
 
The other two are also inadequate if not as bad:  Ben's solves a variety of 
complex problems in a variety of complex  environments. Nope, so does  a 
multitasking narrow AI. Complexity does not  distinguish AGI. Ditto Pei's - 
something to do with insufficient knowledge and  resources...
Insufficient is open to narrow AI  interpretations and reducible to 
mathematically calculable probabilities or uncertainties. That doesn't 
distinguish AGI from narrow AI.
 
The one thing we should all be able to agree on  (but who can be sure?) is 
that:
 
** an AGI is a general intelligence system,  capable of independent learning**
 
i.e. capable of independently learning new  activities/skills with minimal 
guidance or even, ideally, with zero guidance (as  humans and animals are) - 
and 
thus acquiring a general, all-round range of  intelligence..  

 
This is an essential AGI goal -  the capacity  to keep entering and mastering 
new domains of both mental and physical skills  WITHOUT being specially 
programmed each time - that crucially distinguishes it  from narrow AI's, 
which 
have to be individually programmed anew for each new  task. Ben's AGI dog 
exemplified this in a v simple way -  the dog is  supposed to be able to 
learn 
to fetch a ball, with only minimal instructions, as  real dogs do - they can 
learn a whole variety of new skills with minimal  instruction.  But I am 
confident Ben's dog can't actually do  this.
 
However, the independent learning def. while  focussing on the distinctive 
AGI 
goal,  still is not detailed enough by  itself.
 
It requires further identification of the  **cognitive operations** which 
distinguish AGI,  and wh. are exemplified by  the above tests.
 
[I'll stop there for interruptions/comments   continue another time].
 
 P.S. Deepakjnath,
 
It is vital to realise that the overwhelming  majority of AGI-ers do not * 
want* 
an AGI test -  Ben has never gone near  one, and is merely typical in this 
respect. I'd put almost all AGI-ers here in  the same league as the US banks, 
who only want mark-to-fantasy rather than  mark-to-market tests of their 
assets.


-- 
cheers,
Deepak


Re: [agi] Comments On My Skepticism of Solomonoff Induction

2010-07-18 Thread Matt Mahoney
Jim Bromer wrote:
 The definition of all possible programs, like the definition of all 
 possible 
mathematical functions, is not a proper mathematical problem that can be 
comprehended in an analytical way.

Finding just the shortest program is close enough because it dominates the 
probability. Or which step in the proof of theorem 1.7.2 
in http://www.vetta.org/documents/disSol.pdf do you disagree with?
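
For reference, the sense in which the shortest program dominates can be written 
out. If K(x) is the length of the shortest program for x on a prefix-free 
universal machine, the shortest program alone contributes 2^-K(x) to the sum, and 
the coding theorem bounds the sum from above, so roughly

  2^-K(x)  <=  sum over programs M that output x of 2^-|M|  <=  c * 2^-K(x)

where c is a constant depending only on the choice of universal machine.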

You have been saying that you think Solomonoff induction is wrong, but offering 
no argument except your own intuition. So why should we care?

 -- Matt Mahoney, matmaho...@yahoo.com





From: Jim Bromer jimbro...@gmail.com
To: agi agi@v2.listbox.com
Sent: Sun, July 18, 2010 9:09:36 PM
Subject: Re: [agi] Comments On My Skepticism of Solomonoff Induction


Abram,
I was going to drop the discussion, but then I thought I figured out why you 
kept trying to paper over the difference.  Of course, our personal disagreement 
is trivial; it isn't that important.  But the problem with Solomonoff Induction 
is that not only is the output hopelessly tangled and seriously infinite, but 
the input is as well.  The definition of all possible programs, like the 
definition of all possible mathematical functions, is not a proper 
mathematical problem that can be comprehended in an analytical way.  I think 
that is the part you haven't totally figured out yet (if you will excuse the 
pun).  Total program space, does not represent a comprehensible computational 
concept.  When you try find a way to work out feasible computable examples it 
is 
not enough to limit the output string space, you HAVE to limit the program 
space 
in the same way.  That second limitation makes the entire concept of total 
program space, much too weak for our purposes.  You seem to know this at an 
intuitive operational level, but it seems to me that you haven't truly grasped 
the implications.
 
I say that Solomonoff Induction is computational but I have to use a trick to 
justify that remark.  I think the trick may be acceptable, but I am not sure.  
But the possibility that the concept of all possible programs, might be 
computational doesn't mean that that it is a sound mathematical concept.  This 
underlies the reason that I intuitively came to the conclusion that Solomonoff 
Induction was transfinite.  However, I wasn't able to prove it because the 
hypothetical concept of all possible program space, is so pretentious that it 
does not lend itself to mathematical analysis.
 
I just wanted to point this detail out because your implied view that you 
agreed 
with me but total program space was mathematically well-defined did not 
make 
any sense.
Jim Bromer


Re: [agi] NL parsing

2010-07-16 Thread Matt Mahoney
That that that Buffalo buffalo that Buffalo buffalo buffalo buffalo that 
Buffalo 
buffalo that Buffalo buffalo buffalo.

 -- Matt Mahoney, matmaho...@yahoo.com



- Original Message 
From: Mike Tintner tint...@blueyonder.co.uk
To: agi agi@v2.listbox.com
Sent: Fri, July 16, 2010 11:05:51 AM
Subject: Re: [agi] NL parsing

Or if you want to be pedantic about caps, the speaker is identifying 3 
buffaloes from Buffalo,  2 from elsewhere.

Anyone got any other readings?

--
From: Jiri Jelinek jjelinek...@gmail.com
Sent: Friday, July 16, 2010 3:12 PM
To: agi agi@v2.listbox.com
Subject: [agi] NL parsing

 Believe it or not, this sentence is grammatically correct and has
 meaning: 'Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo
 buffalo.'

 source: http://www.mentalfloss.com/blogs/archives/13120

 :-)




Re: [agi] Comments On My Skepticism of Solomonoff Induction

2010-07-15 Thread Matt Mahoney
Jim Bromer wrote:
 Since you cannot fully compute every string that may be produced at a certain 
iteration, you cannot make the claim that you even know the probabilities of 
any 
possible string before infinity and therefore your claim that the sum of the 
probabilities can be computed is not provable.
 
 But I could be wrong.

Could be. Theorem 1.7.2 in http://www.vetta.org/documents/disSol.pdf proves 
that 
finding just the shortest program that outputs x gives you a probability for x 
close to the result you would get if you found all of the (infinite number of) 
programs that output x. Either number could be used for Solomonoff induction 
because the difference is bounded only by the choice of language.
 -- Matt Mahoney, matmaho...@yahoo.com





From: Jim Bromer jimbro...@gmail.com
To: agi agi@v2.listbox.com
Sent: Thu, July 15, 2010 8:18:13 AM
Subject: Re: [agi] Comments On My Skepticism of Solomonoff Induction


On Wed, Jul 14, 2010 at 7:46 PM, Abram Demski abramdem...@gmail.com wrote:

Jim,

There is a simple proof of convergence for the sum involved in defining the 
probability of a given string in the Solomonoff distribution:

At its greatest, a particular string would be output by *all* programs. In 
this 
case, its sum would come to 1. This puts an upper bound on the sum. Since 
there 
is no subtraction, there is a lower bound at 0 and the sum monotonically 
increases as we take the limit. Knowing these facts, suppose it *didn't* 
converge. It must then increase without bound, since it cannot fluctuate back 
and forth (it can only go up). But this contradicts the upper bound of 1. So, 
the sum must stop at 1 or below (and in fact we can prove it stops below 1, 
though we can't say where precisely without the infinite computing power 
required to compute the limit).

--Abram
 
I believe that Solomonoff Induction would be computable given infinite time and 
infinite resources (the Godel Theorem fits into this category) but some people 
disagree for reasons I do not understand.  

 
If it is not computable then it is not a mathematical theorem and the question 
of whether the sum of probabilities equals 1 is pure fantasy.
 
If it is computable then the central issue is whether it could (given infinite 
time and infinite resources) be used to determine the probability of a 
particular string being produced from all possible programs.  The question 
about 
the sum of all the probabilities is certainly an interesting question. However, 
the problem of making sure that the function was actually computable would 
interfere with this process of determining the probability of each particular 
string that can be produced.  For example, since some strings would be 
infinite, 
the computability problem makes it imperative that the infinite strings be 
partially computed at an iteration (or else the function would be hung up at 
some particular iteration and the infinite other calculations could not be 
considered computable).  

 
My criticism is that even though I believe the function may be theoretically 
computable, the fact that each particular probability (of each particular 
string 
that is produced) cannot be proven to approach a limit through mathematical 
analysis, and since the individual probabilities will fluctuate with each new 
string that is produced, one would have to know how to reorder the production 
of 
the probabilities in order to demonstrate that the individual probabilities do 
approach a limit.  If they don't, then the claim that this function could be 
used to define the probabilities of a particular string from all possible 
program is unprovable.  (Some infinite calculations fluctuate infinitely.)  
Since you do not have any way to determine how to reorder the infinite 
probabilities a priori, your algorithm would have to be able to compute all 
possible reorderings to find the ordering and filtering of the computations 
that 
would produce evaluable limits.  Since there are trans infinite rearrangements 
of an infinite list (I am not sure that I am using the term 'trans infinite' 
properly) this shows that the conclusion that the theorem can be used to derive 
the desired probabilities is unprovable through a variation of Cantor's 
Diagonal 
Argument, and that you can't use Solomonoff Induction the way you have been 
talking about using it.
 
Since you cannot fully compute every string that may be produced at a certain 
iteration, you cannot make the claim that you even know the probabilities of 
any 
possible string before infinity and therefore your claim that the sum of the 
probabilities can be computed is not provable.
 
But I could be wrong.
Jim Bromer

Re: [agi] How do we Score Hypotheses?

2010-07-15 Thread Matt Mahoney
Hypotheses are scored using Bayes law. Let D be your observed data and H be 
your 
hypothesis. Then p(H|D) = p(D|H)p(H)/p(D). Since p(D) is constant, you can 
remove it and rank hypotheses by p(D|H)p(H).

p(H) can be estimated using the minimum description length principle or 
Solomonoff induction. Ideally, p(H) = 2^-|H| where |H| is the length (in bits) 
of the description of the hypothesis. The value is language dependent, so this 
method is not perfect.
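
A minimal sketch of that ranking, with made-up likelihoods and description 
lengths (nothing below is tied to any particular model):

/* Rank hypotheses by p(D|H)*p(H), using the MDL prior p(H) = 2^-|H|.
 * The likelihoods and description lengths are invented for illustration. */
#include <stdio.h>
#include <math.h>

int main(void) {
    const char *name[] = {"H0 (10-bit description)", "H1 (25-bit description)"};
    double likelihood[] = {0.30, 0.90};     /* assumed p(D|H) */
    int desc_bits[]     = {10, 25};         /* assumed |H| in bits */
    int best = 0;
    double best_score = -1.0;
    for (int i = 0; i < 2; i++) {
        double score = likelihood[i] * pow(2.0, -desc_bits[i]);  /* p(D|H) * 2^-|H| */
        printf("%s: score = %.3g\n", name[i], score);
        if (score > best_score) { best_score = score; best = i; }
    }
    printf("preferred: %s\n", name[best]);  /* p(D) cancels, so ranking is enough */
    return 0;
}

With these made-up numbers the 10-bit hypothesis wins even though its likelihood 
is lower, because the 15 extra description bits cost a factor of 2^15.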

 -- Matt Mahoney, matmaho...@yahoo.com





From: David Jones davidher...@gmail.com
To: agi agi@v2.listbox.com
Sent: Thu, July 15, 2010 10:22:44 AM
Subject: Re: [agi] How do we Score Hypotheses?

It is no wonder that I'm having a hard time finding documentation on hypothesis 
scoring. Few can agree on how to do it and there is much debate about it. 


I noticed though that a big reason for the problems is that explanatory 
reasoning is being applied to many diverse problems. I think, like I mentioned 
before, that people should not try to come up with a single universal rule set 
for applying explanatory reasoning to every possible problem. So, maybe that's 
where the hold up is. 


I've been testing my ideas out on complex examples. But now I'm going to go 
back 
to simplified model testing (although not as simple as black squares :) ) and 
work my way up again. 


Dave


On Wed, Jul 14, 2010 at 12:59 PM, David Jones davidher...@gmail.com wrote:

Actually, I just realized that there is a way to included inductive knowledge 
and experience into this algorithm. Inductive knowledge and experience about a 
specific object or object type can be exploited to know which hypotheses in the 
past were successful, and therefore which hypothesis is most likely. By 
choosing 
the most likely hypothesis first, we skip a lot of messy hypothesis comparison 
processing and analysis. If we choose the right hypothesis first, all we really 
have to do is verify that this hypothesis reveals in the data what we expect to 
be there. If we confirm what we expect, that is reason enough not to look for 
other hypotheses because the data is explained by what we originally believed 
to 
be likely. We only look for additional hypotheses when we find something 
unexplained. And even then, we don't look at the whole problem. We only look at 
what we have to to explain the unexplained data. In fact, we could even ignore 
the unexplained data if we believe, from experience, that it isn't pertinent. 


I discovered this because I'm analyzing how a series of hypotheses are 
navigated 
when analyzing images. It seems to me that it is done very similarly to the way we 
do it. We sort of confirm what we expect and try to explain what we don't 
expect. We try out hypotheses in a sort of trial-and-error manner and see how 
each hypothesis affects what we find in the image. If we confirm things 
because 
of the hypothesis, we are likely to keep it. We keep going, navigating the 
tree 
of hypotheses, conflicts and unexpected observations until we find a good 
hypothesis. Something like that. I'm attempting to construct an algorithm for 
doing this as I analyze specific problems. 


Dave



On Wed, Jul 14, 2010 at 10:22 AM, David Jones davidher...@gmail.com wrote:

What do you mean by definitive events? 

I guess the first problem I see with my approach is that the movement of the 
window is also a hypothesis. I need to analyze it in more detail and see how 
the 
tree of hypotheses affects the hypotheses regarding the es on the windows. 


What I believe is that these problems can be broken down into types of 
hypotheses,  types of events and types of relationships. then those types can 
be 
reasoned about in a general way. If possible, then you have a method for 
reasoning about any object that is covered by the types of hypotheses, events 
and relationships that you have defined.

How to reason about specific objects should not be preprogrammed. But, I 
think 
the solution to this part of AGI is to find general ways to reason about a 
small 
set of concepts that can be combined to describe specific objects and 
situations. 


There are other parts to AGI that I am not considering yet. I believe the 
problem has to be broken down into separate pieces and understood before 
putting 
it back together into a complete system. I have not covered inductive 
learning 
for example, which would be an important part of AGI. I have also not yet 
incorporated learned experience into the algorithm, which is also important. 


The general AI problem is way too complicated to consider all at once. I 
simply 
can't solve hypothesis generation, comparison and disambiguation while at the 
same time solving induction and experience-based reasoning. It becomes 
unwieldly. So, I'm starting where I can and I'll work my way up to the full 
complexity of the problem. 


I don't really understand what you mean here: The central unsolved problem, 
in 
my view, is: How can hypotheses

Re: [agi] What is the smallest set of operations that can potentially define everything and how do you combine them ?

2010-07-14 Thread Matt Mahoney
Actually, Fibonacci numbers can be computed without loops or recursion.

#include <math.h>   /* needed for pow(), sqrt(), round() */

int fib(int x) {     /* Binet's closed form: no loop or recursion */
  return (int)round(pow((1+sqrt(5))/2, x)/sqrt(5));
}

unless you argue that loops are needed to compute sqrt() and pow().

The brain and DNA use redundancy and parallelism and don't use loops because 
their operations are slow and unreliable. This is not necessarily the best 
strategy for computers because computers are fast and reliable but don't have a 
lot of parallelism.

 -- Matt Mahoney, matmaho...@yahoo.com



- Original Message 
From: Michael Swan ms...@voyagergaming.com
To: agi agi@v2.listbox.com
Sent: Wed, July 14, 2010 12:18:40 AM
Subject: Re: [agi] What is the smallest set of operations that can potentially  
define everything and how do you combine them ?

Brain loops:


Premise:
Biological brain code does not contain looping constructs, or the
ability to creating looping code, (due to the fact they are extremely
dangerous on unreliable hardware) except for 1 global loop that fires
about 200 times a second.

Hypothesis:
Brains cannot calculate iterative problems quickly, where calculations
in the previous iteration are needed for the next iteration and, where
brute force operations are the only valid option.

Proof:
Take as an example, Fibonacci numbers
http://en.wikipedia.org/wiki/Fibonacci_number

What are the first 100 Fibonacci numbers?

int Fibonacci[102];
Fibonacci[0] = 0;
Fibonacci[1] = 1;
for(int i = 0; i < 100; i++)
{
// Getting the next Fibonacci number relies on the previous values
Fibonacci[i+2] = Fibonacci[i] + Fibonacci[i+1];
}  

My brain knows the process to solve this problem but it can't directly
write a looping construct into itself. And so it solves it very slowly
compared to a computer. 

The brain probably consists of vast repeating look-up tables. Of course,
run in parallel these seem fast.


DNA has vast tracts of repeating data. Why would DNA contain repeating
data, instead of just having the data once and the number of times it's
repeated like in a loop? One explanation is that DNA can't do looping
construct either.



On Wed, 2010-07-14 at 02:43 +0100, Mike Tintner wrote:
 Michael: We can't do operations that
 require 1,000,000 loop iterations.  I wish someone would give me a PHD
 for discovering this ;) It far better describes our differences than any
 other theory.
 
 Michael,
 
 This isn't a competitive point - but I think I've made that point several 
 times (and so of course has Hawkins). Quite obviously, (unless you think the 
 brain has fabulous hidden powers), it conducts searches and other operations 
 with extremely few limited steps, and nothing remotely like the routine 
 millions to billions of current computers.  It must therefore work v. 
 fundamentally differently.
 
 Are you saying anything significantly different to that?
 
 --
 From: Michael Swan ms...@voyagergaming.com
 Sent: Wednesday, July 14, 2010 1:34 AM
 To: agi agi@v2.listbox.com
 Subject: Re: [agi] What is the smallest set of operations that can 
 potentially  define everything and how do you combine them ?
 
 
  On Tue, 2010-07-13 at 07:00 -0400, Ben Goertzel wrote:
  Well, if you want a simple but complete operator set, you can go with
 
  -- Schonfinkel combinator plus two parentheses
 
  I'll check this out soon.
  or
 
  -- S and K combinator plus two parentheses
 
  and I suppose you could add
 
  -- input
  -- output
  -- forget
 
  statements to this, but I'm not sure what this gets you...
 
  Actually, adding other operators doesn't necessarily
  increase the search space your AI faces -- rather, it
  **decreases** the search space **if** you choose the right operators, 
  that
  encapsulate regularities in the environment faced by the AI
 
  Unfortunately, an AGI needs to be absolutely general. You are right that
  higher level concepts reduce combinations, however, using them, will
  increase combinations for simpler operator combinations, and if you
  miss a necessary operator, then some concepts will be impossible to
  achieve. The smallest set can define higher level concepts, these
  concepts can be later integrated as single operations, which means
  using operators than can be understood in terms of smaller operators
  in the beginning, will definitely increase you combinations later on.
 
  The smallest operator set is like absolute zero. It has a defined end. A
  defined way of finding out what they are.
 
 
 
 
  Exemplifying this, writing programs doing humanly simple things
  using S and K is a pain and involves piling a lot of S and K and 
  parentheses
  on top of each other, whereas if we introduce loops and conditionals and
  such, these programs get shorter.  Because loops and conditionals happen
  to match the stuff that our human-written programs need to do...
  Loops are evil in most situations.
 
  Let me show you why:
  Draw a square using put_pixel(x,y)
  // loops are more scalable, but, damage this code

Re: [agi] Comments On My Skepticism of Solomonoff Induction

2010-07-14 Thread Matt Mahoney
Jim Bromer wrote:
 Last week I came up with a sketch that I felt showed that Solomonoff 
 Induction 
was incomputable in practice using a variation of Cantor's Diagonal Argument.

Cantor proved that there are more sequences (infinite length strings) than 
there 
are (finite length) strings, even though both sets are infinite. This means 
that 
some, but not all, sequences have finite length descriptions or are the output 
of finite length programs (which is the same thing in a more formal sense). For 
example, the digits of pi or sqrt(2) are infinite length sequences that have 
finite descriptions (or finite programs that output them). There are many more 
sequences that don't have finite length descriptions, but unfortunately I can't 
describe any of them except to say they contain infinite amounts of random data.

Cantor does not prove that Solomonoff induction is not computable. That was 
proved by Kolmogorov (and also by Solomonoff). Solomonoff induction says to use 
the shortest program that outputs the observed sequence to predict the next 
symbol. However, there is no procedure for finding the length of the shortest 
description. The proof sketch is that if there were, then I could describe the 
first string that cannot be described in less than a million bits even though 
I 
just did. The formal proof 
is 
http://en.wikipedia.org/wiki/Kolmogorov_complexity#Incomputability_of_Kolmogorov_complexity
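
Spelled out, that sketch runs as follows: suppose the shortest description length 
K were computable. Then for any n one could write a program of about c + log2(n) 
bits that enumerates strings until it finds one with K(x) >= n and prints it. That 
program is itself a description of x in far fewer than n bits once n is large, 
contradicting K(x) >= n, so K cannot be computable.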


I think your confusion is using the uncomputability of Solomonoff induction to 
question its applicability. That is an experimental question, not one of 
mathematics. The validity of using the shortest or simplest explanation of the 
past to predict the future was first observed by William of Ockham in the 
1400's. It is standard practice in all fields of science. The minimum 
description length principle is applicable to all branches of machine learning.

However, in the conclusion 
of http://mattmahoney.net/dc/dce.html#Section_Conclusion I argue for Solomonoff 
induction on the basis of physics. Solomonoff induction supposes that all 
observable strings are finite prefixes of computable sequences. Occam's Razor 
might not hold if it were possible for the universe to produce uncomputable 
sequences, i.e. infinite sources of random data. I argue that is not possible 
because the observable universe if finitely computable according to the laws of 
physics as they are now understood.

 -- Matt Mahoney, matmaho...@yahoo.com





From: Jim Bromer jimbro...@gmail.com
To: agi agi@v2.listbox.com
Sent: Wed, July 14, 2010 11:29:13 AM
Subject: [agi] Comments On My Skepticism of Solomonoff Induction


Last week I came up with a sketch that I felt showed that Solomonoff Induction 
was incomputable in practice using a variation of Cantor's Diagonal Argument.  
I 
wondered if my argument made sense or not.  I will explain why I think it did.
 
First of all, I should have started out by saying something like, "Suppose 
Solomonoff Induction was computable," since there is some reason why people feel 
that it isn't.
 
Secondly I don't think I needed to use Cantor's Diagonal Argument (for the in 
practice case), because it would be sufficient to point out that since it was 
impossible to say whether or not the probabilities ever approached any 
sustained 
(collared) limits due to the lack of adequate mathematical definition of the 
concept all programs, it would be impossible to make the claim that they were 
actual representations of the probabilities of all programs that could produce 
certain strings.
 
But before I start to explain why I think my variation of the Diagonal Argument 
was valid, I would like to make another comment about what was being claimed.
 
Take a look at the n-ary expansion of the square root of 2 (such as the decimal 
expansion or the binary expansion).  The decimal expansion or the binary 
expansion of the square root of 2 is an infinite string.  To say that the 
algorithm that produces the value is predicting the value is a torturous use 
of the meaning of the word 'prediction'.  Now I have less than perfect grammar, but 
the idea of prediction is so important in the field of intelligence that I do 
not feel that this kind of reduction of the concept of prediction is 
illuminating.  

 
 Incidentally, there are infinitely many ways to produce the square root of 2 (sqrt 2 
+1-1, sqrt2 +2-2, sqrt2 +3-3,...).  So the idea that the square root of 2 is 
unlikely is another stretch of conventional thinking.  But since there are 
infinitely many ways for a program to produce any number (that can be produced by a 
program) we would imagine that the probability that one of the infinite ways to 
produce the square root of 2 approaches 0 but never reaches it.  We can imagine 
it, but we cannot prove that this occurs in Solomonoff Induction because 
Solomonoff Induction is not limited to just this class of programs (which could 
be proven to approach a limit).  For example, we could make

Re: [agi] What is the smallest set of operations that can potentially define everything and how do you combine them ?

2010-07-14 Thread Matt Mahoney
Michael Swan wrote:
 What 3456/6 ?
 we don't know, at least not from the top of our head.

No, it took me about 10 or 20 seconds to get 576. Starting with the first 
digit, 
3/6 = 1/2 (from long term memory) and 3 is in the thousands place, so 1/2 of 
1000 is 500 (1/2 = .5 from LTM). I write 500 into short term memory (STM), 
which 
only has enough space to hold about 7 digits. Then to divide 45/6 I get 42/6 = 
7 
with a remainder of 3, or 7.5, but since this is in the tens place I get 75. I 
put 75 in STM, add to 500 to get 575, put the result back in STM replacing 500 
and 75 for which there is no longer room. Finally, 6/6 = 1, which I add to 575 
to get 576. I hold this number in STM long enough to check with a calculator.
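
Written as a single line, that decomposition is just 
3456/6 = 3000/6 + 450/6 + 6/6 = 500 + 75 + 1 = 576.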

One could argue that this calculation in my head uses a loop iterator (in STM) 
to keep track of which digit I am working on. It definitely involves a sequence 
of instructions with intermediate results being stored temporarily. The brain 
can only execute 2 or 3 sequential instructions per second and has very limited 
short term memory, so it needs to draw from a large database of rules to 
perform 
calculations like this. A calculator, being faster and having more RAM, is able 
to use simpler but more tedious algorithms such as converting to binary, 
division by shift and subtract, and converting back to decimal. Doing this with 
a carbon based computer would require pencil and paper to make up for lack of 
STM, and it would require enough steps to have a high probability of making a 
mistake.

Intelligence = knowledge + computing power. The human brain has a lot of 
knowledge. The calculator has less knowledge, but makes up for it in speed and 
memory.

 -- Matt Mahoney, matmaho...@yahoo.com



- Original Message 
From: Michael Swan ms...@voyagergaming.com
To: agi agi@v2.listbox.com
Sent: Wed, July 14, 2010 7:53:33 PM
Subject: Re: [agi] What is the smallest set of operations that can potentially  
define everything and how do you combine them ?

On Wed, 2010-07-14 at 07:48 -0700, Matt Mahoney wrote:
 Actually, Fibonacci numbers can be computed without loops or recursion.
 
 int fib(int x) {
   return round(pow((1+sqrt(5))/2, x)/sqrt(5));
 }
;) I know. I was wondering if someone would pick up on it. This won't
prove that brains have loops though, so I wasn't concerned about the
shortcuts. 
 unless you argue that loops are needed to compute sqrt() and pow().
 
I would find it extremely unlikely that brains have *, /, and even more
unlikely to have sqrt and pow inbuilt. Even more unlikely, even if it
did have them, to figure out how to combine them to round(pow((1
+sqrt(5))/2, x)/sqrt(5)). 

Does this mean we should discount all maths that use any complex
operations ? 

I suspect the brain is full of look-up tables mainly, with some fairly
primitive methods of combining the data. 

eg What's 6 / 3 ?
ans = 2 - most people would get that because it's been rote learnt, a
common problem.

What 3456/6 ?
we don't know, at least not from the top of our head.


 The brain and DNA use redundancy and parallelism and don't use loops because 
 their operations are slow and unreliable. This is not necessarily the best 
 strategy for computers because computers are fast and reliable but don't have 
 a 

 lot of parallelism.

The brain's slow and unreliable methods I think are the price paid for
generality and innately unreliable hardware. Imagine writing a computer
program that runs for 120 years without crashing and surviving damage
like a brain can. I suspect the perfect AGI program is a rigorous
combination of the 2. 


 
  -- Matt Mahoney, matmaho...@yahoo.com
 
 
 
 - Original Message 
 From: Michael Swan ms...@voyagergaming.com
 To: agi agi@v2.listbox.com
 Sent: Wed, July 14, 2010 12:18:40 AM
 Subject: Re: [agi] What is the smallest set of operations that can 
 potentially  

 define everything and how do you combine them ?
 
 Brain loops:
 
 
 Premise:
 Biological brain code does not contain looping constructs, or the
 ability to creating looping code, (due to the fact they are extremely
 dangerous on unreliable hardware) except for 1 global loop that fires
 about 200 times a second.
 
 Hypothesis:
 Brains cannot calculate iterative problems quickly, where calculations
 in the previous iteration are needed for the next iteration and, where
 brute force operations are the only valid option.
 
 Proof:
 Take as an example, Fibonacci numbers
 http://en.wikipedia.org/wiki/Fibonacci_number
 
 What are the first 100 Fibonacci numbers?
 
 int Fibonacci[102];
 Fibonacci[0] = 0;
 Fibonacci[1] = 1;
 for(int i = 0; i < 100; i++)
 {
 // Getting the next Fibonacci number relies on the previous values
 Fibonacci[i+2] = Fibonacci[i] + Fibonacci[i+1];
 }  
 
 My brain knows the process to solve this problem but it can't directly
 write a looping construct into itself. And so it solves it very slowly
 compared to a computer. 
 
 The brain probably consists of vast repeating look-up

Re: [agi] Mechanical Analogy for Neural Operation!

2010-07-12 Thread Matt Mahoney
Steve Richfield wrote:
 No, I am NOT proposing building mechanical contraptions, just using the 
 concept 
to compute neuronal characteristics (or AGI formulas for learning).

Funny you should mention that. Ross Ashby actually built such a device in 1948 
called a homeostat ( http://en.wikipedia.org/wiki/Homeostat ), a fully 
interconnected neural network with 4 neurons using mechanical components and 
vacuum tubes. Synaptic weights were implemented by motor driven water filled 
potentiometers in which electrodes moved through a tank to vary the electrical 
resistance. It implemented a type of learning algorithm in which weights were 
varied using a rotating switch wired randomly using the RAND book of a million 
random digits. He described the device in his 1960 book, Design for a Brain.
 -- Matt Mahoney, matmaho...@yahoo.com





From: Steve Richfield steve.richfi...@gmail.com
To: agi agi@v2.listbox.com
Sent: Mon, July 12, 2010 2:02:20 AM
Subject: [agi] Mechanical Analogy for Neural Operation!

Everyone has heard about the water analogy for electrical operation. I have a 
mechanical analogy for neural operation that just might be solid enough to 
compute at least some characteristics optimally.

No, I am NOT proposing building mechanical contraptions, just using the concept 
to compute neuronal characteristics (or AGI formulas for learning).

Suppose neurons were mechanical contraptions that receive inputs and 
communicate outputs via mechanical movements. If one or more of the neurons 
connected to an output of a neuron can't make sense of a given input given its 
other inputs, then its mechanism would physically resist the several inputs 
that 
didn't make mutual sense because its mechanism would jam, with the resistance 
possibly coming from some downstream neuron.

This would utilize position to resolve opposing forces, e.g. one force being 
the observed inputs, and the other force being that they don't make sense, 
suggest some painful outcome, etc. In short, this would enforce the sort of 
equation over the present formulaic view of neurons (and AGI coding) that I 
have 
suggested in past postings may be present, and show that the math may not be 
all 
that challenging.

Uncertainty would be expressed in stiffness/flexibility, computed limitations 
would be handled with over-running clutches, etc.

Propagation of forces would come close (perfect?) to being able to identify 
just 
where in a complex network something should change to learn as efficiently as 
possible.

Once the force concentrates at some point, it then gives, something slips or 
bends, to unjam the mechanism. Thus, learning is effected.

Note that this suggests little difference between forward propagation and 
backwards propagation, though real-world wet design considerations would 
clearly 
prefer fast mechanisms for forward propagation, and compact mechanisms for 
backwards propagation.

Epiphany or mania?

Any thoughts?

Steve

 


Re: [agi] Solomonoff Induction is Not Universal and Probability is not Prediction

2010-07-09 Thread Matt Mahoney
Ben Goertzel wrote:
 Secondly, since it cannot be computed it is useless.  Third, it is not the 
 sort 
of thing that is useful for AGI in the first place.

 I agree with these two statements

The principle of Solomonoff induction can be applied to computable subsets of 
the (infinite) hypothesis space. For example, if you are using a neural network 
to make predictions, the principle says to use the smallest network that 
computes the past training data.
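
As a toy sketch of that selection rule, with simple functions standing in for 
networks of increasing size (the candidates, their assumed ordering by 
description length, and the training data are all invented for illustration):

/* Try candidate hypotheses in order of assumed description length and keep
 * the first one consistent with all past observations. */
#include <stdio.h>

typedef int (*hypothesis)(int x);                    /* predicts y from x */

static int h_one(int x)    { (void)x; return 1; }    /* "always 1"  - shortest */
static int h_parity(int x) { return x % 2; }         /* "x mod 2"   - longer   */
static int h_square(int x) { return x * x; }         /* "x squared" - longest  */

int main(void) {
    hypothesis h[]     = { h_one, h_parity, h_square };   /* shortest first */
    const char *name[] = { "always-1", "parity", "square" };
    int xs[] = {1, 2, 3}, ys[] = {1, 0, 1};               /* past training data */
    for (int i = 0; i < 3; i++) {
        int consistent = 1;
        for (int j = 0; j < 3; j++)
            if (h[i](xs[j]) != ys[j]) { consistent = 0; break; }
        if (consistent) { printf("selected: %s\n", name[i]); break; }
    }
    return 0;
}

Here "always 1" fails on the second observation, so the next-shortest candidate, 
parity, is selected.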
 -- Matt Mahoney, matmaho...@yahoo.com





From: Ben Goertzel b...@goertzel.org
To: agi agi@v2.listbox.com
Sent: Fri, July 9, 2010 7:56:53 AM
Subject: Re: [agi] Solomonoff Induction is Not Universal and Probability is 
not Prediction




On Fri, Jul 9, 2010 at 7:49 AM, Jim Bromer jimbro...@gmail.com wrote:

Abram,
Solomonoff Induction would produce poor predictions if it could be used to 
compute them.  


Solomonoff induction is a mathematical, not verbal, construct.  Based on the 
most obvious mapping from the verbal terms you've used above into mathematical 
definitions in terms of which Solomonoff induction is constructed, the above 
statement of yours is FALSE.

If you're going to argue against a mathematical theorem, your argument must be 
mathematical not verbal.  Please explain one of

1) which step in the proof about Solomonoff induction's effectiveness you 
believe is in error

2) which of the assumptions of this proof you think is inapplicable to real 
intelligence [apart from the assumption of infinite or massive compute 
resources]

Otherwise, your statement is in the same category as the statement by the 
protagonist of Dostoesvky's Notes from the Underground --

I admit that two times two makes four is an excellent thing, but if we are to 
give everything its due, two times two makes five is sometimes a very charming 
thing too.

;-)

 
Secondly, since it cannot be computed it is useless.  Third, it is not the sort 
of thing that is useful for AGI in the first place.

I agree with these two statements

-- ben G 




Re: [agi] Solomonoff Induction is Not Universal and Probability is not Prediction

2010-07-07 Thread Matt Mahoney
 Jim Bromer wrote:
 But, a more interesting question is, given that the first digits are 000, 
 what 
are the chances that the next digit will be 1?  Dim Induction will report .5, 
which of course is nonsense and a whole lot less useful than making a rough guess.

Wrong. The probability of a 1 is p(0001)/(p(0000)+p(0001)) where the 
probabilities are computed using Solomonoff induction. A program that outputs 
0000 will be shorter in most languages than a program that outputs 0001, so 0 is 
the most likely next bit.

More generally, probability and prediction are equivalent by the chain rule. 
Given any 2 strings x followed by y, the prediction p(y|x) = p(xy)/p(x).
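
For a concrete illustration with made-up program lengths: if the shortest program 
printing 0000 were 5 bits and the shortest printing 0001 were 8 bits, and these 
were the only continuations with non-negligible weight, then using the dominant 
terms p(0000) ~ 2^-5 and p(0001) ~ 2^-8 gives 
p(1|000) ~ 2^-8 / (2^-5 + 2^-8) = 1/9, and p(0|000) ~ 8/9.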

 -- Matt Mahoney, matmaho...@yahoo.com





From: Jim Bromer jimbro...@gmail.com
To: agi agi@v2.listbox.com
Sent: Wed, July 7, 2010 10:10:37 AM
Subject: [agi] Solomonoff Induction is Not Universal and Probability is not 
Prediction


Suppose you have sets of programs that produce two strings.  One set of 
outputs is 00 and the other is 11. Now suppose you used these sets of 
programs to chart the probabilities of the output of the strings.  If the two 
strings were each output by the same number of programs then you'd have a .5 
probability that either string would be output.  That's ok.  But, a more 
interesting question is, given that the first digits are 000, what are the 
chances that the next digit will be 1?  Dim Induction will report .5, which of 
course is nonsense and a whole lot less useful than making a rough guess.
 
But, of course, Solomonoff Induction purports to be able, if it was feasible, 
to 
compute the possibilities for all possible programs.  Ok, but now, try thinking 
about this a little bit.  If you have ever tried writing random program 
instructions what do you usually get?  Well, I'll take a hazard and guess (a 
lot 
better than the bogus method of confusing shallow probability with prediction 
in my example) and say that you will get a lot of programs that crash.  Well, 
most of my experiments with that have ended up with programs that go into an 
infinite loop or which crash.  Now on a universal Turing machine, the results 
would probably look a little different.  Some strings will output nothing and 
go 
into an infinite loop.  Some programs will output something and then either 
stop 
outputting anything or start outputting an infinite loop of the same substring. 
 
Other programs will go on to infinity producing something that looks like 
random 
strings.  But the idea that all possible programs would produce well 
distributed 
strings is complete hogwash.  Since Solomonoff Induction does not define what 
kind of programs should be used, the assumption that the distribution would 
produce useful data is absurd.  In particular, the use of the method to 
determine the probability based given an initial string (as in what follows 
given the first digits are 000) is wrong as in really wrong.  The idea that 
this 
crude probability can be used as prediction is unsophisticated.
 
Of course you could develop an infinite set of Solomonoff Induction values for 
each possible given initial sequence of digits.  Hey when you're working with 
infeasible functions why not dream anything?
 
I might be wrong of course.  Maybe there is something you guys haven't been 
able 
to get across to me.  Even if you can think for yourself you can still make 
mistakes.  So if anyone has actually tried writing a program to output all 
possible programs (up to some feasible point) on a Turing Machine simulator, 
let 
me know how it went.
 
Jim Bromer
 


Re: [agi] Hutter - A fundamental misdirection?

2010-07-07 Thread Matt Mahoney
Gorrell and Webb describe a neural implementation of LSA that seems more 
biologically plausible than the usual matrix factoring implementation.
http://www.dcs.shef.ac.uk/~genevieve/gorrell_webb.pdf
 
In the usual implementation, a word-word matrix A is factored to A = USV where 
S 
is diagonal (containing eigenvalues), and then the smaller elements of S are 
discarded. In the Gorrell model, U and V are the weights of a 3 layer neural 
network mapping words to words, and the nonzero elements of S represent the 
semantic space in the middle layer. As the network is trained, neurons are 
added 
to S. Thus, the network is trained online in a single pass, unlike factoring, 
which is offline.
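
For reference, a compact statement of the factorization being discussed, in standard LSA notation rather than the paper's own:

  A \approx U_k S_k V_k^{\top}, \qquad
  S_k = \mathrm{diag}(\sigma_1,\dots,\sigma_k), \quad
  \sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_k

where only the k largest values are kept; in the neural reading above, U_k and V_k play the role of the input and output weight matrices and the k retained values span the middle (semantic) layer.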

-- Matt Mahoney, matmaho...@yahoo.com





From: Gabriel Recchia grecc...@gmail.com
To: agi agi@v2.listbox.com
Sent: Wed, July 7, 2010 12:12:00 PM
Subject: Re: [agi] Hutter - A fundamental misdirection?

 In short, instead of a pot of neurons, we might instead have a pot of 
 dozens 
of types of 

 neurons that each have their own complex rules regarding what other types of 
neurons they 

 can connect to, and how they process information...

 ...there is plenty of evidence (from the slowness of evolution, the large 
number (~200) 

 of neuron types, etc.), that it is many-layered and quite complex...

The disconnect between the low-level neural hardware and the implementation of 
algorithms that build conceptual spaces via dimensionality reduction--which 
generally ignore facts such as the existence of different types of neurons, the 
apparently hierarchical organization of neocortex, etc.--seems significant. 
Have 
there been attempts to develop computational models capable of LSA-style feats 
(e.g., constructing a vector space in which words with similar meanings tend to 
be relatively close to each other) that take into account basic facts about how 
neurons actually operate (ideally in a more sophisticated way than the nodes of 
early connectionist networks which, as we now know, are not particularly 
neuron-like at all)? If so, I would love to know about them.



On Tue, Jun 29, 2010 at 3:02 PM, Ian Parker ianpark...@gmail.com wrote:

The paper seems very similar in principle to LSA. What you need for a concept 
vector  (or position) is the application of LSA followed by K-Means which will 
give you your concept clusters.


I would not knock Hutter too much. After all LSA reduces {primavera, 
mamanthal, 
salsa, resorte} to one word giving 2 bits saving on Hutter.




  - Ian Parker



On 29 June 2010 07:32, rob levy r.p.l...@gmail.com wrote:

Sorry, the link I included was invalid, this is what I meant: 


http://www.geog.ucsb.edu/~raubal/Publications/RefConferences/ICSC_2009_AdamsRaubal_Camera-FINAL.pdf




On Tue, Jun 29, 2010 at 2:28 AM, rob levy r.p.l...@gmail.com wrote:

On Mon, Jun 28, 2010 at 5:23 PM, Steve Richfield steve.richfi...@gmail.com 
wrote:

Rob,

I just LOVE opaque postings, because they identify people who see things 
differently than I do. I'm not sure what you are saying here, so I'll make 
some 
random responses to exhibit my ignorance and elicit more explanation.




I think based on what you wrote, you understood (mostly) what I was trying 
to 
get across.  So I'm glad it was at least quasi-intelligible. :)
 
 It sounds like this is a finer measure than the dimensionality that I was 
referencing. However, I don't see how to reduce anything as quantized as 
dimensionality into finer measures. Can you say some more about this?




I was just referencing Gardenfors' research program of conceptual spaces 
(I 
was intentionally vague about committing to this fully though because I 
don't 
necessarily think this is the whole answer).  Page 2 of this article 
summarizes 
it pretty succinctly: 
http://www.geog.ucsb.edu/.../ICSC_2009_AdamsRaubal_Camera-FINAL.pdf


 
However, different people's brains, even the brains of identical twins, have 
DIFFERENT mappings. This would seem to mandate experience-formed topology.
 



Yes definitely.
 
Since these conceptual spaces that structure sensorimotor 
expectation/prediction 
(including in higher order embodied exploration of concepts I think) are 
multidimensional spaces, it seems likely that some kind of neural 
computation 
over these spaces must occur,

I agree.
 

though I wonder what it actually would be in terms of neurons, (and if that 
matters).

I don't see any route to the answer except via neurons.


I agree this is true of natural intelligence, though maybe in modeling, the 
neural level can be shortcut to the topo map level without recourse to 
neural 
computation (use some more straightforward computation like matrix algebra 
instead).

Rob


Re: [agi] Reward function vs utility

2010-07-04 Thread Matt Mahoney
Perhaps we now have a better understanding of the risks of uploading to a form 
where we could modify our own software. We already do this to some extent using 
drugs. Evolution will eliminate such failures.

 -- Matt Mahoney, matmaho...@yahoo.com





From: Abram Demski abramdem...@gmail.com
To: agi agi@v2.listbox.com
Sent: Sun, July 4, 2010 11:43:46 AM
Subject: Re: [agi] Reward function vs utility

Joshua,

But couldn't it game the external utility function by taking actions which 
modify it? For example, if the suggestion is taken literally and you have a 
person deciding the reward at each moment, an AI would want to focus on making 
that person *think* the reward should be high, rather than focusing on actually 
doing well at whatever task it's set...and the two would tend to diverge 
greatly for more and more complex/difficult tasks, since these tend to be 
harder to judge. Furthermore, the AI would be very pleased to knock the human 
out of the loop and push its own buttons. Similar comments would apply to 
automated reward calculations.

--Abram


On Sun, Jul 4, 2010 at 4:40 AM, Joshua Fox joshuat...@gmail.com wrote:

Another point. I'm probably repeating the obvious, but perhaps this will be 
useful to some.


On the one hand,  an agent could not game a Legg-like intelligence metric by 
altering the utility function, even an internal one, since the metric is 
based on the function before any such change.


On the other hand, since an  internally-calculated utility function would 
necessarily be a function of observations, rather than of actual world state, 
it could be successfully gamed by altering observations.  


This latter objection does not apply to functions which are externally 
calculated, whether known or unknown.

Joshua







On Fri, Jul 2, 2010 at 7:23 PM, Joshua Fox joshuat...@gmail.com wrote:



I found the answer as given by Legg, Machine Superintelligence, p. 72, copied 
below. A reward function is used to bypass potential difficulty in 
communicating a utility function to the agent.


Joshua



The existence of a goal raises the problem of how the agent knows what the
goal is. One possibility would be for the goal to be known in advance and
for this knowledge to be built into the agent. The problem with this is that
it limits each agent to just one goal. We need to allow agents that are more
flexible, specifically, we need to be able to inform the agent of what the 
goal
is. For humans this is easily done using language. In general however, the
possession of a sufficiently high level of language is too strong an assumption
to make about the agent. Indeed, even for something as intelligent as a dog
or a cat, direct explanation is not very effective.


Fortunately there is another possibility which is, in some sense, a blend of
the above two. We define an additional communication channel with the simplest
possible semantics: a signal that indicates how good the agent’s current
situation is. We will call this signal the reward. The agent simply has to
maximise the amount of reward it receives, which is a function of the goal. In
a complex setting the agent might be rewarded for winning a game or solving
a puzzle. If the agent is to succeed in its environment, that is, receive a 
lot of
reward, it must learn about the structure of the environment and in particular
what it needs to do in order to get reward.







On Mon, Jun 28, 2010 at 1:32 AM, Ben Goertzel b...@goertzel.org wrote:

You can always build the utility function into the assumed universal 
Turing machine underlying the definition of algorithmic information...

I guess this will improve learning rate by some additive constant, in the 
long run ;)

ben


On Sun, Jun 27, 2010 at 4:22 PM, Joshua Fox joshuat...@gmail.com wrote:

This has probably been discussed at length, so I will appreciate a reference 
on this:


Why does Legg's definition of intelligence (following on Hutter's AIXI and 
related work) involve a reward function rather than a utility function? For 
this purpose, reward is a function of the world state/history which is 
unknown to the agent while  a utility function is known to the agent. 


Even if  we replace the former with the latter, we can still have a 
definition of intelligence that integrates optimization capacity over 
possible all utility functions. 


What is the real  significance of the difference between the two types of 
functions here?


Joshua



-- 
Ben Goertzel, PhD
CEO, Novamente LLC and Biomind LLC
CTO, Genescient Corp
Vice Chairman, Humanity+
Advisor, Singularity University and Singularity Institute




External Research Professor, Xiamen University, China
b...@goertzel.org

 
“When nothing seems to help, I go look at a stonecutter hammering away at 
his rock, perhaps a hundred times without as much as a crack showing in it. 
Yet at the hundred and first blow it will split in two, and I know

Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-03 Thread Matt Mahoney
Jim Bromer wrote:
 You can't assume a priori that the diagonal argument is not relevant. 

When I say infinite in my proof of Solomonoff induction, I mean countably 
infinite, as in aleph-null, as in there is a 1 to 1 mapping between the set and 
N, the set of natural numbers. There are a countably infinite number of finite 
strings, or of finite programs, or of finite length descriptions of any 
particular string. For any finite length string or program or description x 
with nonzero probability, there are a countably infinite number of finite 
length strings or programs or descriptions that are longer and less likely than 
x, and a finite number of finite length strings or programs or descriptions 
that are either shorter or more likely or both than x.

Aleph-null is larger than any finite integer. This means that for any finite 
set and any countably infinite set, there is not a 1 to 1 mapping between the 
elements, and if you do map all of the elements of the finite set to elements 
of the infinite set, then there are unmapped elements of the infinite set left 
over.

Cantor's diagonalization argument proves that there are infinities larger than 
aleph-null, such as the cardinality of the set of real numbers, which we call 
uncountably infinite. But since I am not using any uncountably infinite sets, I 
don't understand your objection.

 -- Matt Mahoney, matmaho...@yahoo.com





From: Jim Bromer jimbro...@gmail.com
To: agi agi@v2.listbox.com
Sent: Sat, July 3, 2010 9:43:15 AM
Subject: Re: [agi] Re: Huge Progress on the Core of AGI

On Fri, Jul 2, 2010 at 6:08 PM, Matt Mahoney matmaho...@yahoo.com wrote:

Jim, to address all of your points,


Solomonoff induction claims that the probability of a string is proportional 
to the number of programs that output the string, where each program M is 
weighted by 2^-|M|. The probability is dominated by the shortest program 
(Kolmogorov complexity), but it is not exactly the same. The difference is 
small enough that we may neglect it, just as we neglect differences that 
depend on choice of language.
 
 
The infinite number of programs that could output the infinite number of 
strings that are to be considered (for example while using Solomonoff induction 
to predict what string is being output) lays out the potential for the 
diagonal argument.  You can't assume a priori that the diagonal argument is not 
relevant.  I don't believe that you can prove that it isn't relevant since as 
you say, Kolmogorov Complexity is not computable, and you cannot be sure that 
you have listed all the programs that were able to output a particular string. 
This creates a situation in which the underlying logic of using Solmonoff 
induction is based on incomputable reasoning which can be shown using the 
diagonal argument.
 
This kind of criticism cannot be answered with the kinds of presumptions that 
you used to derive the conclusions that you did.  It has to be answered 
directly.  I can think of other infinity to infinity relations in which the 
potential mappings can be countably derived from the formulas or equations, but 
I have yet to see any analysis which explains why this usage can be.  Although 
you may imagine that the summation of the probabilities can be used just like 
it was an ordinary number, the unchecked usage is faulty.  In other words the 
criticism has to be considered more carefully by someone capable of dealing 
with complex mathematical problems that involve the legitimacy of claims 
between infinite to infinite mappings.
 
Jim Bromer
 
 
 
On Fri, Jul 2, 2010 at 6:08 PM, Matt Mahoney matmaho...@yahoo.com wrote:

Jim, to address all of your points,


Solomonoff induction claims that the probability of a string is proportional 
to the number of programs that output the string, where each program M is 
weighted by 2^-|M|. The probability is dominated by the shortest program 
(Kolmogorov complexity), but it is not exactly the same. The difference is 
small enough that we may neglect it, just as we neglect differences that 
depend on choice of language.


Here is the proof that Kolmogorov complexity is not computable. Suppose it 
were. Then I could test the Kolmogorov complexity of strings in increasing 
order of length (breaking ties lexicographically) and describe the first 
string that cannot be described in less than a million bits, contradicting 
the fact that I just did. (Formally, I could write a program that outputs the 
first string whose Kolmogorov complexity is at least n bits, choosing n to be 
larger than my program).


Here is the argument that Occam's Razor and Solomonoff distribution must be 
true. Consider all possible probability distributions p(x) over any infinite 
set X of possible finite strings x, i.e. any X = {x: p(x) > 0} that is 
infinite. All such distributions must favor shorter strings over longer ones. 
Consider any x in X. Then p(x) > 0. There can be at most a finite number (less 
than 1/p(x

Re: [agi] masterpiece on an iPad

2010-07-02 Thread Matt Mahoney
It could be done a lot faster if the iPad had a camera.

 -- Matt Mahoney, matmaho...@yahoo.com





From: Mike Tintner tint...@blueyonder.co.uk
To: agi agi@v2.listbox.com
Sent: Fri, July 2, 2010 6:28:58 AM
Subject: [agi] masterpiece on an iPad


http://www.telegraph.co.uk/culture/culturevideo/artvideo/7865736/Artist-creates-masterpiece-on-an-iPad.html
 
McLuhan argues that touch is the central sense - 
the one that binds the others. He may be right. The i-devices integrate touch 
into intelligence.


Re: [agi] masterpiece on an iPad

2010-07-02 Thread Matt Mahoney
AGI is all about building machines that think, so you don't have to.

 -- Matt Mahoney, matmaho...@yahoo.com





From: Mike Tintner tint...@blueyonder.co.uk
To: agi agi@v2.listbox.com
Sent: Fri, July 2, 2010 9:37:51 AM
Subject: Re: [agi] masterpiece on an iPad


that's like saying cartography or 
cartoons could be done a lot faster if they just used cameras -  ask 
Michael to explain what the hand can draw that the 
camera can't


From: Matt Mahoney 
Sent: Friday, July 02, 2010 2:21 PM
To: agi 
Subject: Re: [agi] masterpiece on an iPad

It could be done a lot faster if the iPad had a camera.

 -- Matt Mahoney, matmaho...@yahoo.com 





 From: Mike Tintner tint...@blueyonder.co.uk
To: agi agi@v2.listbox.com
Sent: Fri, July 2, 2010 6:28:58 
AM
Subject: [agi] masterpiece 
on an iPad


http://www.telegraph.co.uk/culture/culturevideo/artvideo/7865736/Artist-creates-masterpiece-on-an-iPad.html
 
McLuhan argues that touch is the central sense - 
the one that binds the others. He may be right. The i-devices integrate touch 
into intelligence.


Re: [agi] masterpiece on an iPad

2010-07-02 Thread Matt Mahoney
An AGI only has to predict your behavior so that it can serve you better by 
giving you what you want without you asking for it. It is not a copy of your 
mind. It is a program that can call a function that simulates your mind for 
some arbitrary purpose determined by its programmer.

 -- Matt Mahoney, matmaho...@yahoo.com





From: John G. Rose johnr...@polyplexic.com
To: agi agi@v2.listbox.com
Sent: Fri, July 2, 2010 11:39:23 AM
Subject: RE: [agi] masterpiece on an iPad


An AGI may not really think like we do, it may just execute code. 
 
Though I suppose you could program a lot of fuzzy loops and idle speculation, 
entertaining possibilities, having human think envy.. 
 
John
 
From:Matt Mahoney [mailto:matmaho...@yahoo.com] 
Sent: Friday, July 02, 2010 8:21 AM
To: agi
Subject: Re: [agi] masterpiece on an iPad
 
AGI is all about building machines that think, so you don't have to.

 
-- Matt Mahoney, matmaho...@yahoo.com
 
 



From:Mike Tintner tint...@blueyonder.co.uk
To: agi agi@v2.listbox.com
Sent: Fri, July 2, 2010 9:37:51 AM
Subject: Re: [agi] masterpiece on an iPad
that's like saying cartography or cartoons could be done a lot faster if they 
just used cameras -  ask Michael to explain what the hand can draw that the 
camera can't
 
From:Matt Mahoney 
Sent:Friday, July 02, 2010 2:21 PM
To:agi 
Subject:Re: [agi] masterpiece on an iPad
 
It could be done a lot faster if the iPad had a camera.

 
-- Matt Mahoney, matmaho...@yahoo.com 
 
 



From:Mike Tintner tint...@blueyonder.co.uk
To: agi agi@v2.listbox.com
Sent: Fri, July 2, 2010 6:28:58 AM
Subject: [agi] masterpiece on an iPad
http://www.telegraph.co.uk/culture/culturevideo/artvideo/7865736/Artist-creates-masterpiece-on-an-iPad.html
 
McLuhan argues that touch is the central sense - the one that binds the others. 
He may be right. The i-devices integrate touch into intelligence.


Re: [agi] Re: Huge Progress on the Core of AGI

2010-07-02 Thread Matt Mahoney
Jim, to address all of your points,

Solomonoff induction claims that the probability of a string is proportional to 
the number of programs that output the string, where each program M is weighted 
by 2^-|M|. The probability is dominated by the shortest program (Kolmogorov 
complexity), but it is not exactly the same. The difference is small enough 
that we may neglect it, just as we neglect differences that depend on choice of 
language.
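
Written out, the weighting described above is the usual algorithmic prior (standard notation, added here for reference):

  M(x) \;=\; \sum_{p \,:\, U(p) = x\ast} 2^{-|p|}

where U is a universal prefix machine and the sum runs over programs p whose output begins with x; the shortest such program contributes the dominant term 2^{-K(x)}.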

Here is the proof that Kolmogorov complexity is not computable. Suppose it 
were. Then I could test the Kolmogorov complexity of strings in increasing 
order of length (breaking ties lexicographically) and describe the first 
string that cannot be described in less than a million bits, contradicting the 
fact that I just did. (Formally, I could write a program that outputs the first 
string whose Kolmogorov complexity is at least n bits, choosing n to be larger 
than my program).
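
The contradiction can be written as a short program, assuming a hypothetical oracle kolmogorov() that returns the exact complexity of a string; the point of the proof is that no such oracle can exist, since the few hundred bits below would then describe a string whose shortest description is at least a million bits.

  /* Hypothetical sketch: kolmogorov() cannot actually be implemented.
     If it could, this short program would print the first string whose
     complexity is at least N bits -- contradicting the fact that the
     program itself is far shorter than N bits. Buffer size is illustrative. */
  #include <stdio.h>
  #include <string.h>

  long kolmogorov(const char *s);      /* assumed oracle; does not exist */

  /* enumerate bit strings from shortest to longest, ties lexicographic */
  static void next_string(char *s) {
      size_t n = strlen(s);
      for (size_t i = n; i-- > 0; ) {
          if (s[i] == '0') { s[i] = '1'; return; }
          s[i] = '0';
      }
      s[n] = '0'; s[n + 1] = '\0';     /* all bits carried: grow by one */
  }

  int main(void) {
      enum { N = 1000000 };
      static char s[4096] = "0";
      while (kolmogorov(s) < N)
          next_string(s);
      printf("%s\n", s);               /* first string with complexity >= N */
      return 0;
  }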

Here is the argument that Occam's Razor and Solomonoff distribution must be 
true. Consider all possible probability distributions p(x) over any infinite 
set X of possible finite strings x, i.e. any X = {x: p(x) > 0} that is 
infinite. All such distributions must favor shorter strings over longer ones. 
Consider any x in X. Then p(x) > 0. There can be at most a finite number (less 
than 1/p(x)) of strings that are more likely than x, and therefore an infinite 
number of strings which are less likely than x. Of this infinite set, only a 
finite number (less than 2^|x|) can be shorter than x, and therefore there must 
be an infinite number that are longer than x. So for each x we can partition X 
into 4 subsets as follows:

- shorter and more likely than x: finite
- shorter and less likely than x: finite
- longer and more likely than x: finite
- longer and less likely than x: infinite.

So in this sense, any distribution over the set of strings must favor shorter 
strings over longer ones.
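
The two counting bounds used above can be stated compactly (a restatement of the argument, not an addition to it): for any x with p(x) > 0,

  \#\{\,y : p(y) > p(x)\,\} \;<\; \frac{1}{p(x)}, \qquad
  \#\{\,y : |y| < |x|\,\} \;<\; 2^{|x|},

so all but finitely many strings are simultaneously longer and less likely than x.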

-- Matt Mahoney, matmaho...@yahoo.com





From: Jim Bromer jimbro...@gmail.com
To: agi agi@v2.listbox.com
Sent: Fri, July 2, 2010 4:09:38 PM
Subject: Re: [agi] Re: Huge Progress on the Core of AGI




On Fri, Jul 2, 2010 at 2:25 PM, Jim Bromer jimbro...@gmail.com wrote:  
There cannot be a one to one correspondence to the representation of the 
shortest program to produce a string and the strings that they produce.  This 
means that if the consideration of the hypotheses were to be put into general 
mathematical form it must include the potential of many to one relations 
between candidate programs (or subprograms) and output strings.
 
But, there is also no way to determine what the shortest program is, since 
there may be different programs that are the same length.  That means that 
there is a many to one relation between programs and program length.  So the 
claim that you could just iterate through programs by length is false.  This is 
the goal of algorithmic information theory not a premise of a methodology that 
can be used.  So you have the diagonalization problem. 
 
A counter argument is that there are only a finite number of Turing Machine 
programs of a given length.  However, since you guys have specifically 
designated that this theorem applies to any construction of a Turing Machine it 
is not clear that this counter argument can be used.  And there is still the 
specific problem that you might want to try a program that writes a longer 
program to output a string (or many strings).  Or you might want to write a 
program that can be called to write longer programs on a dynamic basis.  I 
think these cases, where you might consider a program that outputs a longer 
program, (or another instruction string for another Turing Machine) constitutes 
a serious problem, that at the least, deserves to be answered with sound 
analysis.
 
Part of my original intuitive argument, that I formed some years ago, was that 
without a heavy constraint on the instructions for the program, it will be 
practically impossible to test or declare that some program is indeed the 
shortest program.  However, I can't quite get to the point now that I can say 
that there is definitely a diagonalization problem.
 
Jim Bromer



Re: [agi] Re: Huge Progress on the Core of AGI

2010-06-30 Thread Matt Mahoney
Jim, what evidence do you have that Occam's Razor or algorithmic information 
theory is wrong, besides your own opinions? It is well established that elegant 
(short) theories are preferred in all branches of science because they have 
greater predictive power.

Also, what does this have to do with Cantor's diagonalization argument? AIT 
considers only the countably infinite set of hypotheses.

 -- Matt Mahoney, matmaho...@yahoo.com





From: Jim Bromer jimbro...@gmail.com
To: agi agi@v2.listbox.com
Sent: Wed, June 30, 2010 9:13:44 AM
Subject: Re: [agi] Re: Huge Progress on the Core of AGI


On Tue, Jun 29, 2010 at 11:46 PM, Abram Demski abramdem...@gmail.com wrote:
In brief, the answer to your question is: we formalize the description length 
heuristic by assigning lower probabilities to longer hypotheses, and we apply 
Bayes law to update these probabilities given the data we observe. This 
updating captures the idea that we should reward theories which explain/expect 
more of the observations; it also provides a natural way to balance simplicity 
vs explanatory power, so that we can compare any two theories with a single 
scoring mechanism. Bayes Law automatically places the right amount of pressure 
to avoid overly elegant explanations which don't get much right, and to avoid 
overly complex explanations which fit the observations perfectly but which 
probably won't generalize to new data.
...
If you go down this path, you will eventually come to understand (and, 
probably, accept) algorithmic information theory. Matt may be tring to force it 
on you too soon. :)
--Abram 
 
David was asking about theories of explanation, and here you are suggesting 
that following a certain path of reasoning will lead to accepting AIT.  What 
nonsense.  Even assuming that Baye's law can be used to update probabilities of 
idealized utility, the connection between description length and explanatory 
power in general AI is tenuous.  And when you realize that AIT is an 
unattainable idealism that lacks mathematical power (I do not believe that it 
is a valid mathematical method because it is incomputable and therefore 
innumerable and cannot be used to derive probability distributions even as 
ideals) you have to accept that the connection between explanatory theories and 
AIT is not established except as a special case based on the imagination that a 
similarities between a subclass of practical examples is the same as a powerful 
generalization of those examples.  
 
The problem is that while compression seems to be related to intelligence, it 
is not equivalent to intelligence.  A much stronger but similarly false 
argument is that memory is intelligence.  Of course memory is a major part of 
intelligence, but it is not everything.  The argument that AIT is a reasonable 
substitute for developing more sophisticated theories about conceptual 
explanation is not well founded, it lacks any experimental evidence other than 
a spattering of results on simplistic cases, and it is just wrong to suggest 
that there is no reason to consider other theories of explanation.
 
Yes compression has something to do with intelligence and, in some special 
cases it can be shown to act as an idealism for numerical rationality.  And yes 
unattainable theories that examine the boundaries of productive mathematical 
systems is a legitimate subject for mathematics.  But there is so much more to 
theories of explanatory reasoning that I genuinely feel sorry for those of you, 
who originally motivated to develop better AGI programs, would get caught in 
the obvious traps of AIT and AIXI.
 
Jim Bromer 

 
On Tue, Jun 29, 2010 at 11:46 PM, Abram Demski abramdem...@gmail.com wrote:

David,

What Matt is trying to explain is all right, but I think a better way of 
answering your question would be to invoke the mighty mysterious Bayes' Law.

I had an epiphany similar to yours (the one that started this thread) about 5 
years ago now. At the time I did not know that it had all been done before. I 
think many people feel this way about MDL. Looking into the MDL (minimum 
description length) literature would be a good starting point.

In brief, the answer to your question is: we formalize the description length 
heuristic by assigning lower probabilities to longer hypotheses, and we apply 
Bayes law to update these probabilities given the data we observe. This 
updating captures the idea that we should reward theories which explain/expect 
more of the observations; it also provides a natural way to balance simplicity 
vs explanatory power, so that we can compare any two theories with a single 
scoring mechanism. Bayes Law automatically places the right amount of pressure 
to avoid overly elegant explanations which don't get much right, and to avoid 
overly complex explanations which fit the observations perfectly but which 
probably won't generalize to new data.

Bayes' Law and MDL have strong connections, though

Re: [agi] Re: Huge Progress on the Core of AGI

2010-06-29 Thread Matt Mahoney
David Jones wrote:
 If anyone has any knowledge of or references to the state of the art in 
 explanation-based reasoning, can you send me keywords or links? 

The simplest explanation of the past is the best predictor of the future.
http://en.wikipedia.org/wiki/Occam's_razor
http://www.scholarpedia.org/article/Algorithmic_probability

 -- Matt Mahoney, matmaho...@yahoo.com





From: David Jones davidher...@gmail.com
To: agi agi@v2.listbox.com
Sent: Tue, June 29, 2010 9:05:45 AM
Subject: [agi] Re: Huge Progress on the Core of AGI

If anyone has any knowledge of or references to the state of the art in 
explanation-based reasoning, can you send me keywords or links? I've read some 
through google, but I'm not really satisfied with anything I've found. 

Thanks,

Dave


On Sun, Jun 27, 2010 at 1:31 AM, David Jones davidher...@gmail.com wrote:

A method for comparing hypotheses in explanatory-based reasoning: 

We prefer the hypothesis or explanation that *expects* more observations. If 
both explanations expect the same observations, then the simpler of the two is 
preferred (because the unnecessary terms of the more complicated explanation 
do not add to the predictive power). 

Why are expected events so important? They are a measure of 1) explanatory 
power and 2) predictive power. The more predictive and 
the more explanatory a hypothesis is, the more likely the hypothesis is when 
compared to a competing hypothesis.

Here are two case studies I've been analyzing from sensory perception of 
simplified visual input:


The goal of the case studies is to answer the following: How do you generate 
the most likely motion hypothesis in a way that is 
general and applicable to AGI?
Case Study 1) Here is a link to an example: animated gif of two black squares 
move from left to right. Description: Two black squares are moving in unison 
from left to right across a white screen. In each frame the black squares 
shift to the right so that square 1 steals square 2's original position and 
square two moves an equal distance to the right.
Case Study 2) Here is a link to an example: the interrupted square. 
Description: A single square is moving from left to right. Suddenly in the 
third frame, a single black square is added in the middle of the expected path 
of the original black square. This second square just stays there. So, what 
happened? Did the square moving from left to right keep moving? Or did it stop 
and then another square suddenly appeared and moved from left to right?

Here is a simplified version of how we solve case study 1:
The important hypotheses to consider are: 
1) the square from frame 1 of the video that has a very close position to the 
square from frame 2 should be matched (we hypothesize that they are the same 
square and that any difference in position is motion).  So, what happens is 
that in each two frames of the video, we only match one square. The other 
square goes unmatched.   


2) We do the same thing as in hypothesis #1, but this time we also match the 
remaining squares and hypothesize motion as follows: the first square jumps 
over the second square from left to right. We hypothesize that this happens 
over and over in each frame of the video. Square 2 stops and square 1 jumps 
over it over and over again. 


3) We hypothesize that both squares move to the right in unison. This is the 
correct hypothesis.

So, why should we prefer the correct hypothesis, #3 over the other two?

Well, first of all, #3 is correct because it has the most explanatory power of 
the three and is the simplest of the three. Simpler is better because, with 
the given evidence and information, there is no reason to desire a more 
complicated hypothesis such as #2. 

So, the answer to the question is because explanation #3 expects the most 
observations, such as: 
1) the consistent relative positions of the squares in each frame are 
expected. 
2) It also expects their new positions in each from based on velocity 
calculations. 


3) It expects both squares to occur in each frame. 

Explanation 1 ignores 1 square from each frame of the video, because it can't 
match it. Hypothesis #1 doesn't have a reason for why the a new square appears 
in each frame and why one disappears. It doesn't expect these observations. In 
fact, explanation 1 doesn't expect anything that happens because something new 
happens in each frame, which doesn't give it a chance to confirm its 
hypotheses in subsequent frames.

The power of this method is immediately clear. It is general and it solves the 
problem very cleanly.

Here is a simplified version of how we solve case study 2:
We expect the original square to move at a similar velocity from left to right 
because we hypothesized that it did move from left to right and we calculated 
its velocity. If this expectation is confirmed, then it is more likely than 
saying that the square suddenly stopped and another started moving

Re: [agi] Re: Huge Progress on the Core of AGI

2010-06-29 Thread Matt Mahoney
 Right. But Occam's Razor is not complete. It says simpler is better, but 1) 
 this only applies when two hypotheses have the same explanatory power and 2) 
 what defines simpler? 

A hypothesis is a program that outputs the observed data. It explains the 
data if its output matches what is observed. The simpler hypothesis is the 
shorter program, measured in bits.

The language used to describe the data can be any Turing complete programming 
language (C, Lisp, etc) or any natural language such as English. It does not 
matter much which language you use, because for any two languages there is a 
fixed length procedure, described in either of the languages, independent of 
the data, that translates descriptions in one language to the other.
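
Stated as an inequality, this is the usual invariance argument (standard form, added for reference): for any two description languages L1 and L2 there is a constant c, depending only on the pair of languages and not on the data, such that

  K_{L_1}(x) \;\le\; K_{L_2}(x) + c_{L_1,L_2} \quad \text{for all } x,

where c is the length of a fixed translator from L2 descriptions to L1 descriptions.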

 For example, the simplest hypothesis for all visual interpretation is that 
 everything in the first image is gone in the second image, and everything in 
 the second image is a new object. Simple. Done. Solved :) right? 

The hypothesis is not the simplest. The program that outputs the two frames as 
if independent cannot be smaller than the two frames compressed independently. 
The program could be made smaller if it only described how the second frame is 
different than the first. It would be more likely to correctly predict the 
third frame if it continued to run and described how it would be different than 
the second frame.
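
A small sketch of why the difference description wins, using made-up frame data: when two frames are nearly identical, listing only the changed pixels takes far fewer symbols than spelling out the second frame in full.

  /* Illustration only: cost of describing frame 2 outright versus as a
     list of pixels that differ from frame 1. */
  #include <stdio.h>

  #define W 8
  #define H 8

  int main(void) {
      unsigned char f1[H][W] = {{0}}, f2[H][W] = {{0}};
      f1[3][2] = 1; f1[3][3] = 1;      /* a small square in frame 1 */
      f2[3][3] = 1; f2[3][4] = 1;      /* the same square, shifted right */

      int changed = 0;
      for (int y = 0; y < H; ++y)
          for (int x = 0; x < W; ++x)
              if (f1[y][x] != f2[y][x]) ++changed;

      printf("full frame description:  %d pixels\n", W * H);            /* 64 */
      printf("difference description:  %d changed pixels\n", changed);  /* 2 */
      return 0;
  }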

 I don't think much progress has been made in this area, but I'd like to know 
 what other people have done and any successes they've had.

Kolmogorov proved that the solution is not computable. Given a hypothesis (a 
description of the observed data, or a program that outputs the observed data), 
there is no general procedure or test to determine whether a shorter (simpler, 
better) hypothesis exists. Proof: suppose there were. Then I could describe 
the first data set that cannot be described in less than a million bits even 
though I just did. (By first I mean the first data set encoded by a string 
from shortest to longest, breaking ties lexicographically).

That said, I believe the state of the art in both language and vision are based 
on hierarchical neural models, i.e. pattern recognition using learned weighted 
combinations of simpler patterns. I am more familiar with language. The top 
ranked programs can be found at http://mattmahoney.net/dc/text.html

 -- Matt Mahoney, matmaho...@yahoo.com





From: David Jones davidher...@gmail.com
To: agi agi@v2.listbox.com
Sent: Tue, June 29, 2010 10:44:41 AM
Subject: Re: [agi] Re: Huge Progress on the Core of AGI

Thanks Matt,

Right. But Occam's Razor is not complete. It says simpler is better, but 1) 
this only applies when two hypotheses have the same explanatory power and 2) 
what defines simpler? 

So, maybe what I want to know from the state of the art in research is: 

1) how precisely do other people define simpler
and
2) More importantly, how do you compare competing explanations/hypotheses that 
have more or less explanatory power. Simpler does not apply unless you are 
comparing equally explanatory hypotheses. 

For example, the simplest hypothesis for all visual interpretation is that 
everything in the first image is gone in the second image, and everything in 
the second image is a new object. Simple. Done. Solved :) right? Well, clearly 
a more complicated explanation is warranted because a more complicated 
explanation is more *explanatory* and a better explanation. So, why is it 
better? Can it be defined as better in a precise way so that you can compare 
arbitrary hypotheses or explanations? That is what I'm trying to learn about. I 
don't think much progress has been made in this area, but I'd like to know what 
other people have done and any successes they've had.

Dave



On Tue, Jun 29, 2010 at 10:29 AM, Matt Mahoney matmaho...@yahoo.com wrote:

David Jones wrote:
 If anyone has any knowledge of or references to the state of the art in 
 explanation-based reasoning, can you send me keywords or links? 


The simplest explanation of the past is the best predictor of the future.
http://en.wikipedia.org/wiki/Occam's_razor
http://www.scholarpedia.org/article/Algorithmic_probability

 -- Matt Mahoney, matmaho...@yahoo.com






From: David Jones davidher...@gmail.com

To: agi agi@v2.listbox.com
Sent: Tue, June 29, 2010 9:05:45 AM
Subject: [agi] Re: Huge Progress on the Core of AGI


If anyone has any knowledge of or references to the state of the art in 
explanation-based reasoning, can you send me keywords or links? I've read some 
through google, but I'm not really satisfied with anything I've found. 

Thanks,

Dave


On Sun, Jun 27, 2010 at 1:31 AM, David Jones davidher...@gmail.com wrote:

A method for comparing hypotheses in explanatory-based reasoning: 

We prefer the hypothesis or explanation that *expects* more observations. If 
both explanations expect the same observations, then the simpler of the two

Re: [agi] A Primary Distinction for an AGI

2010-06-29 Thread Matt Mahoney
David Jones wrote:
 I wish people understood this better.

For example, animals can be intelligent even though they lack language because 
they can see. True, but an AGI with language skills is more useful than one 
without.

And yes, I realize that language, vision, motor skills, hearing, and all the 
other senses and outputs are tied together. Skills in any area make learning 
the others easier.

 -- Matt Mahoney, matmaho...@yahoo.com





From: David Jones davidher...@gmail.com
To: agi agi@v2.listbox.com
Sent: Tue, June 29, 2010 1:42:51 PM
Subject: Re: [agi] A Primary Distinction for an AGI

Mike, 

THIS is the flawed reasoning that causes people to ignore vision as the right 
way to create AGI. And I've finally come up with a great way to show you how 
wrong this reasoning is. 

I'll give you an extremely obvious argument that proves that vision requires 
much less knowledge to interpret than language does. Let's say that you have 
never been to egypt, you have never seen some particular movie before.  But if 
you see the movie, an alien landscape, an alien world, a new place or any such 
new visual experience, you can immediately interpret it in terms of spacial, 
temporal, compositional and other relationships. 

Now, go to egypt and listen to them speak. Can you interpret it? Nope. Why?! 
Because you don't have enough information. The language itself does not contain 
any information to help you interpret it. We do not learn language simply by 
listening. We learn based on evidence from how the language is used and how it 
occurs in our daily lives. Without that experience, you cannot interpret it.

But with vision, you do not need extra knowledge to interpret a new situation. 
You can recognize completely new objects without any training except for simply 
observing them in their natural state. 

I wish people understood this better.

Dave


On Tue, Jun 29, 2010 at 12:51 PM, Mike Tintner tint...@blueyonder.co.uk wrote:





Just off the cuff here - isn't the same true for 
vision? You can't learn vision from vision. Just as all NLP has no connection 
with the real world, and totally relies on the human programmer's knowledge of 
that world. 
 
Your visual program actually relies totally on your 
visual vocabulary - not its own. That is the inevitable penalty of 
processing 
unreal signals on a computer screen which are not in fact connected to the 
real world any more than the verbal/letter signals involved in NLP 
are.
 
What you need to do - what anyone in your situation 
with anything like your asprations needs to do - is to hook up with a 
roboticist. Everyone here should be doing that.
 


From: David Jones 
Sent: Tuesday, June 29, 2010 5:27 PM
To: agi 
Subject: Re: [agi] A Primary Distinction for an 
AGI


You can't learn language from language without embedding way more knowledge 
than is reasonable. Language does not contain the information required for its 
interpretation. There is no *reason* to interpret the language into any of the 
infinite possible interpretations. There is nothing to explain but it requires 
explanatory reasoning to determine the correct real world interpretation
On Jun 29, 2010 10:58 AM, Matt Mahoney matmaho...@yahoo.com wrote:


David Jones wrote:
 Natural language 
  requires more than the words on the page in the real world. Of...
Any knowledge that can be demonstrated over a 
  text-only channel (as in the Turing test) can also be learned over a 
 text-only 
  channel.


 Cyc also is trying to store knowledge 
  about a super complicated world in simplistic forms and al...
Cyc failed because it lacks natural language. The vast knowledge 
  store of the internet is unintelligible to Cyc. The average person can't 
  use it because they don't speak Cycl and because they have neither the 
 ability 
  nor the patience to translate their implicit thoughts into augmented first 
  order logic. Cyc's approach was understandable when they started in 1984 
 when 
  they had neither the internet nor the vast computing power that is required 
 to 
  learn natural language from unlabeled examples like children do.


 Vision and other sensory interpretaion, on 
  the other hand, do not require more info because that...
Without natural language, your system will fail too. You don't have 
  enough computing power to learn language, much less the million times more 
  computing power you need to learn to see.


 
-- Matt Mahoney, matmaho...@yahoo.com




From: David Jones 
  davidher...@gmail.com
To: agi 
  a...@v2.listbox.c...sent: Mon, June 28, 2010 9:28:57 PM 
 

Subject: Re: [agi] A Primary Distinction for an 
  AGI 

Natural language requires more than the words on 
  the page in the real world. Of course that didn't ...

Re: [agi] Re: Huge Progress on the Core of AGI

2010-06-29 Thread Matt Mahoney
David Jones wrote:
 I really don't think this is the right way to calculate simplicity. 

I will give you an example, because examples are more convincing than proofs.

Suppose you perform a sequence of experiments whose outcome can either be 0 or 
1. In the first 10 trials you observe 0000000000. What do you expect to observe 
in the next trial?

Hypothesis 1: the outcome is always 0.
Hypothesis 2: the outcome is 0 for the first 10 trials and 1 thereafter.

Hypothesis 1 is shorter than 2, so it is more likely to be correct.

If I describe the two hypotheses in French or Chinese, then 1 is still shorter 
than 2.

If I describe the two hypotheses in C, then 1 is shorter than 2.

  #include <stdio.h>

  /* prints 0 forever */
  void hypothesis_1() {
    while (1) printf("0");
  }

  /* prints ten 0s, then 1 forever */
  void hypothesis_2() {
    int i;
    for (i=0; i<10; ++i) printf("0");
    while (1) printf("1");
  }

If I translate these programs into Perl or Lisp or x86 assembler, then 1 will 
still be shorter than 2.

I realize there might be smaller equivalent programs. But I think you could 
find a smaller program equivalent to hypothesis_1 than hypothesis_2.

I realize there are other hypotheses than 1 or 2. But I think that the smallest 
one you can find that outputs eleven bits of which the first ten are zeros will 
be a program that outputs another zero.

I realize that you could rewrite 1 so that it is longer than 2. But it is the 
shortest version that counts. More specifically consider all programs in which 
the first 10 outputs are 0. Then weight each program by 2^-length. So the 
shortest programs dominate.

I realize you could make up a language where the shortest encoding of 
hypothesis 2 is shorter than 1. You could do this for any pair of hypotheses. 
However, I think if you stick to simple languages (and I realize this is a 
circular definition), then 1 will usually be shorter than 2.
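
As a rough numeric illustration of that 2^-length weighting (the bit counts below are invented, not measured encodings of the two programs): if hypothesis 1 has a 60-bit shortest encoding and hypothesis 2 a 90-bit one, the prior odds favor hypothesis 1 by 2^30, about a billion to one.

  /* Prior odds under the 2^-length weighting; lengths are placeholders. */
  #include <stdio.h>
  #include <math.h>

  int main(void) {
      double len1 = 60.0, len2 = 90.0;       /* assumed shortest encodings, bits */
      double odds = pow(2.0, len2 - len1);   /* weight(h1) / weight(h2) = 2^30 */
      printf("prior odds for hypothesis 1: about %.0f to 1\n", odds);
      return 0;
  }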

 -- Matt Mahoney, matmaho...@yahoo.com





From: David Jones davidher...@gmail.com
To: agi agi@v2.listbox.com
Sent: Tue, June 29, 2010 1:31:01 PM
Subject: Re: [agi] Re: Huge Progress on the Core of AGI




On Tue, Jun 29, 2010 at 11:26 AM, Matt Mahoney matmaho...@yahoo.com wrote:

 Right. But Occam's Razor is not complete. It says simpler is better, but 1) 
 this only applies when two hypotheses have the same explanatory power and 2) 
 what defines simpler? 


A hypothesis is a program that outputs the observed data. It explains the 
data if its output matches what is observed. The simpler hypothesis is the 
shorter program, measured in bits.

I can't be confident that bits is the right way to do it. I suspect bits is an 
approximation of a more accurate method. I also suspect that you can write a 
more complex explanation program with the same number of bits. So, there are 
some flaws with this approach. It is an interesting idea to consider though. 
 



The language used to describe the data can be any Turing complete programming 
language (C, Lisp, etc) or any natural language such as English. It does not 
matter much which language you use, because for any two languages there is a 
fixed length procedure, described in either of the languages, independent of 
the data, that translates descriptions in
 one language to the other.

Hypotheses don't have to be written in actual computer code and probably 
shouldn't be because hypotheses are not really meant to be run per se. And 
outputs are not necessarily the right way to put it either. Outputs imply 
prediction. And as mike has often pointed out, things cannot be precisely 
predicted. We can, however, determine whether a particular observation fits 
expectations, rather than equals some prediction. There may be multiple 
possible outcomes that we expect and which would be consistent with a 
hypothesis, which is why actual prediction should not be used.


 For example, the simplest hypothesis for all visual interpretation is that 
 everything in the first image is gone in the second image, and everything in 
 the second image is a new object. Simple. Done. Solved :) right? 


The hypothesis is not the simplest. The program that outputs the two frames as 
if independent cannot be smaller than the two frames compressed independently. 
The program could be made smaller if it only described how the second frame is 
different than the first. It would be more likely to correctly predict the 
third frame if it continued to run and described how it would be different 
than the second frame.

I really don't think this is the right way to calculate simplicity. 
 



 I don't think much progress has been made in this area, but I'd like to know 
 what other people have done and any successes they've had.


Kolmogorov proved that the solution is not
 computable. Given a hypothesis (a description of the observed data, or a 
 program that outputs the observed data), there is no general procedure or 
 test to determine whether a shorter (simpler, better) hypothesis exists. 
 Proof: suppose there were. Then I could describe

Re: [agi] A Primary Distinction for an AGI

2010-06-29 Thread Matt Mahoney
Experiments in text compression show that text alone is sufficient for learning 
to predict text.

I realize that for a machine to pass the Turing test, it needs a visual model 
of the world. Otherwise it would have a hard time with questions like what 
word in this ernai1 did I spell wrong? Obviously the easiest way to build a 
visual model is with vision, but it is not the only way.

 -- Matt Mahoney, matmaho...@yahoo.com





From: David Jones davidher...@gmail.com
To: agi agi@v2.listbox.com
Sent: Tue, June 29, 2010 3:22:33 PM
Subject: Re: [agi] A Primary Distinction for an AGI

I certainly agree that the techniques and explanation generating algorithms for 
learning language are hard coded into our brain. But, those techniques alone 
are not sufficient to learn language in the absence of sensory perception or 
some other way of getting the data required.

Dave


On Tue, Jun 29, 2010 at 3:19 PM, Matt Mahoney matmaho...@yahoo.com wrote:

David Jones wrote:
  The knowledge for interpreting language though should not be 
 pre-programmed. 


I think that human brains are wired differently than other animals to make 
language learning easier. We have not been successful in training other 
primates to speak, even though they have all the right anatomy such as vocal 
chords, tongue, lips, etc. When primates have been taught sign language, they 
have not successfully mastered forming sentences.

 -- Matt Mahoney, matmaho...@yahoo.com






From: David Jones davidher...@gmail.com
To: agi agi@v2.listbox.com
Sent: Tue, June 29, 2010 3:00:09 PM

Subject: Re: [agi] A Primary Distinction for an AGI


The point I was trying to make is that an approach that tries to interpret 
language just using language itself and without sufficient information or the 
means to realistically acquire that information, *should* fail. 

On the other hand, an approach that tries to interpret vision with minimal 
upfront knowledge needs *should* succeed because the knowledge required to 
automatically learn to interpret images is amenable to preprogramming. In 
addition, such knowledge must be pre-programmed. The knowledge for 
interpreting language though should not be pre-programmed. 

Dave


On Tue, Jun 29, 2010 at 2:51 PM, Matt Mahoney matmaho...@yahoo.com wrote:

David Jones wrote:
 I wish people understood this better.


For example, animals can be intelligent even though they lack language 
because they can see. True, but an AGI with language skills is more useful 
than one without.


And yes, I realize that language, vision, motor skills, hearing, and all the 
other senses and outputs are tied together. Skills in any area make learning 
the others easier.

 -- Matt Mahoney, matmaho...@yahoo.com






From: David Jones davidher...@gmail.com
To: agi agi@v2.listbox.com
Sent: Tue, June 29, 2010 1:42:51 PM


Subject: Re: [agi] A Primary Distinction for an AGI


Mike, 

THIS is the flawed reasoning that causes people to ignore vision as the right 
way to create AGI. And I've finally come up with a great way to show you how 
wrong this reasoning is. 

I'll give you an extremely obvious argument that proves that vision requires 
much less knowledge to interpret than language does. Let's say that you have 
never been to egypt, you have never seen some particular movie before.  But
 if you see the movie, an alien landscape, an alien world, a new place or any 
 such new visual experience, you can immediately interpret it in terms of 
 spatial, temporal, compositional and other relationships. 

Now, go to egypt and listen to them speak. Can you interpret it? Nope. Why?! 
Because you don't have enough information. The language itself does not 
contain any information to help you interpret it. We do not learn language 
simply by listening. We learn based on evidence from how the language is used 
and how it occurs in our daily lives. Without that experience, you cannot 
interpret it.

But with vision, you do not need extra knowledge to interpret a new 
situation. You can recognize completely new objects without any training 
except for simply observing them in their natural state. 

I wish people understood this better.

Dave


On Tue, Jun 29, 2010 at 12:51 PM, Mike Tintner tint...@blueyonder.co.uk 
wrote:





Just off the cuff here - isn't the same true for vision? You can't learn 
vision from vision. Just as all NLP has no connection with the real world, 
and totally relies on the human programmer's knowledge of that world. 

Your visual program actually relies totally on your visual vocabulary - not 
its own. That is the inevitable penalty of processing unreal signals on a 
computer screen which are not in fact connected to the real world any more 
than the verbal/letter signals involved in NLP are. 

What you need to do - what anyone in your situation with anything like your 
aspirations needs to do - is to hook up with a 
roboticist

Re: [agi] A Primary Distinction for an AGI

2010-06-29 Thread Matt Mahoney
Answering questions is the same problem as predicting the answers. If you can 
compute p(A|Q) where Q is the question (and previous context of the 
conversation) and A is the answer, then you can also choose an answer A from 
the same distribution. If p() correctly models human communication, then the 
response would be indistinguishable from a human in a Turing test.
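
As a rough sketch of that last step (the candidate answers and their 
probabilities below are made up for illustration), choosing an answer from 
p(A|Q) is just sampling from a discrete distribution:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* draw an index from a discrete distribution p[0..n-1] that sums to 1 */
int sample(const double *p, int n) {
  double r = (double)rand() / RAND_MAX, cum = 0.0;
  for (int i = 0; i < n; i++) {
    cum += p[i];
    if (r <= cum) return i;
  }
  return n - 1;               /* guard against rounding error */
}

int main(void) {
  const char *answers[] = {"yes", "no", "maybe"};  /* hypothetical candidates */
  double p[] = {0.6, 0.3, 0.1};                    /* assumed values of p(A|Q) */
  srand((unsigned)time(NULL));
  printf("%s\n", answers[sample(p, 3)]);
  return 0;
}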

 -- Matt Mahoney, matmaho...@yahoo.com





From: David Jones davidher...@gmail.com
To: agi agi@v2.listbox.com
Sent: Tue, June 29, 2010 3:43:53 PM
Subject: Re: [agi] A Primary Distinction for an AGI

The purpose of text is to convey something. It has to be interpreted. Who cares 
about predicting the next word if you can't interpret a single bit of it?


On Tue, Jun 29, 2010 at 3:43 PM, David Jones davidher...@gmail.com wrote:

People do not predict the next words of text. We anticipate them, but when 
something different shows up, we accept it if it is *explanatory*. 
Compression-like algorithms, though, will never be able to do this type of 
explanatory reasoning, which is required to disambiguate text. It is certainly 
not sufficient for learning language, which is not at all about predicting text.



On Tue, Jun 29, 2010 at 3:38 PM, Matt Mahoney matmaho...@yahoo.com wrote:


Experiments in text compression show that text alone is sufficient for 
learning to predict text.


I realize that for a machine to pass the Turing test, it needs a visual model 
of the world. Otherwise it would have a hard time with questions like "what 
word in this ernai1 did I spell wrong?" Obviously the easiest way to build a 
visual model is with vision, but it is not the only way.

 -- Matt Mahoney, matmaho...@yahoo.com






From: David Jones
 davidher...@gmail.com
To: agi agi@v2.listbox.com
Sent: Tue, June 29, 2010 3:22:33 PM

Subject: Re: [agi] A Primary Distinction for an AGI


I certainly agree that the techniques and explanation-generating algorithms 
for learning language are hard-coded into our brain. But those techniques 
alone are not sufficient to learn language in the absence of sensory 
perception or some other way of getting the data required.

Dave


On Tue, Jun 29, 2010 at 3:19 PM, Matt Mahoney matmaho...@yahoo.com wrote:

David Jones wrote:
  The knowledge for interpreting language though should not be 
 pre-programmed. 


I think that human brains are wired differently than other animals to make 
language learning easier. We have not been successful in training other 
primates to speak, even though they have all the right anatomy such as vocal 
cords, tongue, lips, etc. When primates have been taught sign language, 
they have not successfully mastered forming sentences.

 -- Matt Mahoney, matmaho...@yahoo.com







From: David Jones davidher...@gmail.com
To: agi agi@v2.listbox.com
Sent: Tue, June 29, 2010 3:00:09 PM



Subject: Re: [agi] A Primary Distinction for an AGI


The point I was trying to make is that an approach that tries to interpret 
language just using language itself and without sufficient information or 
the means to realistically acquire that information, *should* fail. 

On the other hand, an approach that tries to interpret vision with 
minimal upfront knowledge needs *should* succeed because the knowledge 
required to automatically learn to interpret images is amenable to 
preprogramming. In addition, such knowledge must be pre-programmed. The 
knowledge for interpreting language though should not be pre-programmed. 

Dave


On Tue, Jun 29, 2010 at 2:51 PM, Matt Mahoney matmaho...@yahoo.com wrote:

David Jones wrote:
 I wish people understood this better.


For example, animals can be intelligent even though they lack language 
because they can see. True, but an AGI with language skills is more useful 
than one without.


And yes, I realize that language, vision, motor skills, hearing, and all 
the other senses and outputs are tied together. Skills in any area make 
learning the others easier.

 -- Matt Mahoney, matmaho...@yahoo.com






From: David Jones davidher...@gmail.com
To: agi agi@v2.listbox.com
Sent: Tue, June 29, 2010 1:42:51 PM




Subject: Re: [agi] A Primary Distinction for an AGI


Mike, 

THIS is the flawed reasoning that causes people to ignore vision as the 
right way to create AGI. And I've finally come up with a great way to show 
you how wrong this reasoning is. 

I'll give you an extremely obvious argument that proves that vision 
requires much less knowledge to interpret than language does. Let's say 
that you have never been to Egypt and have never seen some particular 
movie before. But if you see the movie, an alien landscape, an alien world, 
a new place or any such new visual experience, you can immediately interpret 
it in terms of spatial, temporal, compositional and other relationships. 

Now, go to Egypt and listen to them speak. Can you interpret it? Nope

Re: [agi] Re: Huge Progress on the Core of AGI

2010-06-29 Thread Matt Mahoney
You can always find languages that favor either hypothesis. Suppose that you 
want to predict the sequence 10, 21, 32, ? and we write our hypothesis as a 
function that takes the trial number (0, 1, 2, 3...) and returns the outcome. 
The sequence 10, 21, 32, 43, 54... would be coded:

int hypothesis_1(int trial) {
  return trial*11+10;
}

The sequence 10, 21, 32, 10, 21, 32... would be coded

int hypothesis_2(int trial) {
  return trial%3*11+10;
}

which is longer and therefore less likely.

Here is another example: predict the sequence 0, 1, 4, 9, 16, 25, 36, 49, ?

Can you find a program shorter than this that doesn't predict 64?

int hypothesis_1(int trial) {
  return trial*trial;
}
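
For what it's worth, the last example can be checked directly; here is a 
self-contained version with a small driver added for illustration:

#include <stdio.h>

/* the hypothesis above: predicts the square of the trial number */
int hypothesis_1(int trial) {
  return trial * trial;
}

int main(void) {
  /* trials 0..7 reproduce 0, 1, 4, 9, 16, 25, 36, 49; trial 8 is the prediction */
  for (int trial = 0; trial <= 8; trial++)
    printf("%d ", hypothesis_1(trial));
  printf("\n");   /* prints: 0 1 4 9 16 25 36 49 64 */
  return 0;
}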

 -- Matt Mahoney, matmaho...@yahoo.com





From: David Jones davidher...@gmail.com
To: agi agi@v2.listbox.com
Sent: Tue, June 29, 2010 3:48:01 PM
Subject: Re: [agi] Re: Huge Progress on the Core of AGI

Such an example is no where near sufficient to accept the assertion that 
program size is the right way to define simplicity of a hypothesis.

Here is a counter example. It requires a slightly more complex example because 
all zeros doesn't leave any room for alternative hypotheses.

Here is the sequence: 10, 21, 32

void hypothesis_1() {
  int ten = 10;
  int counter = 0;
  while (1) {
    printf("%d ", ten + counter);  /* prints 10, 21, 32, 43, ... */
    ten = ten + 10;
    counter = counter + 1;
  }
}

void hypothesis_2() {
  while (1)
    printf("10 21 32 ");           /* repeats the observed data forever */
}

Hypothesis 2 is simpler, yet clearly wrong. These examples don't really show 
anything.

Dave


On Tue, Jun 29, 2010 at 3:15 PM, Matt Mahoney matmaho...@yahoo.com wrote:

David Jones wrote:
 I really don't think this is the right way to calculate simplicity. 


I will give you an example, because examples are more convincing than proofs.


Suppose you perform a sequence of experiments whose outcome can either be 0 or 
1. In the first 10 trials you observe 0000000000. What do you expect to 
observe in the next trial?


Hypothesis 1: the outcome is always 0.
Hypothesis 2: the outcome is 0 for the first 10 trials and 1 thereafter.


Hypothesis 1 is shorter than 2, so it is more likely to be correct.


If I describe
 the two hypotheses in French or Chinese, then 1 is still shorter than 2.


If I describe the two hypotheses in C, then 1 is shorter than 2.


  void hypothesis_1() {
    while (1) printf("0");                   /* always output 0 */
  }


  void hypothesis_2() {
    int i;
    for (i = 0; i < 10; ++i) printf("0");    /* ten 0s ... */
    while (1) printf("1");                   /* ... then 1s forever */
  }


If I translate these programs into Perl or Lisp or x86 assembler, then 1 will 
still be shorter than 2.


I realize there might be smaller equivalent programs. But I think the smallest 
program equivalent to hypothesis_1 will still be smaller than the smallest 
program equivalent to hypothesis_2.


I realize there are other hypotheses than 1 or 2. But I think that the 
smallest one you can find that outputs
 eleven bits of which the first ten are zeros will be a program that outputs 
 another zero.


I realize that you could rewrite 1 so that it is longer than 2. But it is the 
shortest version that counts. More specifically consider all programs in which 
the first 10 outputs are 0. Then weight each program by 2^-length. So the 
shortest programs dominate.
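
As a toy illustration of that weighting (the program lengths below are 
invented; real lengths would come from actual code), the prediction is just a 
2^-length weighted vote among the programs that reproduce the ten zeros:

#include <stdio.h>
#include <math.h>

int main(void) {
  /* candidate programs consistent with the ten observed zeros */
  double len_bits[] = {120, 200, 230};  /* assumed program lengths in bits */
  int    next_bit[] = {0,   1,   0};    /* the bit each one outputs next   */
  double w0 = 0, w1 = 0;

  for (int i = 0; i < 3; i++) {
    double w = pow(2.0, -len_bits[i]);  /* prior weight 2^-length */
    if (next_bit[i] == 0) w0 += w; else w1 += w;
  }
  /* dominated by the shortest program, so p(next=0) is very close to 1 */
  printf("p(next=0) = %g\n", w0 / (w0 + w1));
  return 0;
}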


I realize you could make up a language where the shortest encoding of 
hypothesis 2 is shorter than 1. You could do this for any pair of hypotheses. 
However, I think if you stick to simple languages (and I realize this is a 
circular definition), then 1 will usually be shorter than 2.

 -- Matt Mahoney, matmaho...@yahoo.com






From: David Jones davidher...@gmail.com
To: agi agi@v2.listbox.com
Sent: Tue, June 29, 2010 1:31:01 PM

Subject: Re: [agi] Re: Huge Progress on the Core of AGI





On Tue, Jun 29, 2010 at 11:26 AM, Matt Mahoney matmaho...@yahoo.com wrote:


 Right. But Occam's Razor is not complete. It says simpler is better, but 1) 
 this only applies when two hypotheses have the same explanatory power and 
 2) what defines simpler? 


A hypothesis is a program that outputs the observed data. It explains the 
data if its output matches what is observed. The simpler hypothesis is the 
shorter program, measured in bits.

I can't be confident that bits is the right way to do it. I suspect bits is an 
approximation of a more accurate method. I also suspect that you can write a 
more complex explanation program with the same number of bits. So, there are 
some flaws with this approach. It is an interesting idea to consider though. 

 



The language used to describe the data can be any Turing complete programming 
language (C, Lisp, etc) or any natural language such as English. It does not 
matter much which language you use, because for any two languages there is a 
fixed length procedure, described in either of the languages, independent of 
the data, that translates descriptions in
 one language to the other.
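
The usual way to state that invariance, in LaTeX notation, is

  K_{L_1}(x) \le K_{L_2}(x) + c_{L_1,L_2} \quad \text{for all } x

where K_L(x) is the length of the shortest description of x in language L and 
the constant c_{L_1,L_2} is the length of the translation procedure, which 
depends only on the two languages and not on the data.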

Hypotheses don't have to be written in actual computer code and probably 
shouldn't

Re: [agi] A Primary Distinction for an AGI

2010-06-28 Thread Matt Mahoney
David Jones wrote:
 I also want to mention that I develop solutions to the toy problems with the 
 real problems in mind. I also fully intend to work my way up to the real 
 thing by incrementally adding complexity and exploring the problem well at 
 each level of complexity.

A little research will show you the folly of this approach. For example, the 
toy approach to language modeling is to write a simplified grammar that 
approximates English, then write a parser, then some code to analyze the parse 
tree and take some action. The classic example is SHRDLU (blocks world, 
http://en.wikipedia.org/wiki/SHRDLU ). Efforts like that have always stalled. 
That is not how people learn language. People learn from lots of examples, not 
explicit rules, and they learn semantics before grammar.

For a second example, the toy approach to modeling logical reasoning is to 
design a knowledge representation based on augmented first order logic, then 
write code to implement deduction, forward chaining, backward chaining, etc. 
The classic example is Cyc. Efforts like that have always stalled. That is not 
how people reason. People learn to associate events that occur in quick 
succession, and then reason by chaining associations. This model is built in. 
People might later learn math, programming, and formal logic as rules for 
manipulating symbols within the framework of natural language learning.

For a third example, the toy approach to modeling vision is to segment the 
image into regions and try to interpret the meaning of each region. Efforts 
like that have always stalled. That is not how people see. People learn to 
recognize visual features that they have seen before. Features are made up of 
weighted sums of lots of simpler features with learned weights. Features range 
from dots, edges, color, and motion at the lowest levels, to complex objects 
like faces at the higher levels. Vision is integrated with lots of other 
knowledge sources. You see what you expect to see.
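
A bare-bones sketch of the "weighted sum of simpler features" idea (the inputs 
and weights below are placeholders, not a model of the visual cortex):

#include <stdio.h>

/* response of one higher-level feature to n lower-level feature responses */
double feature_response(const double *lower, const double *weight, int n) {
  double sum = 0.0;
  for (int i = 0; i < n; i++)
    sum += weight[i] * lower[i];
  return sum > 0.0 ? sum : 0.0;   /* simple rectifying nonlinearity */
}

int main(void) {
  double edges[]   = {0.9, 0.1, 0.7};   /* e.g. responses of three edge detectors */
  double weights[] = {0.5, -0.2, 0.8};  /* learned weights (made up here) */
  printf("feature response: %.2f\n", feature_response(edges, weights, 3));
  return 0;
}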

The common theme is that real AGI consists of a learning algorithm, an opaque 
knowledge representation, and a vast amount of training data and computing 
power. It is not an extension of a toy system where you code all the knowledge 
yourself. That doesn't scale. You can't know more than an AGI that knows more 
than you. So I suggest you do a little research instead of continuing to repeat 
all the mistakes that were made 50 years ago. You aren't the first person to do 
these kinds of experiments.

 -- Matt Mahoney, matmaho...@yahoo.com





From: David Jones davidher...@gmail.com
To: agi agi@v2.listbox.com
Sent: Mon, June 28, 2010 4:00:24 PM
Subject: Re: [agi] A Primary Distinction for an AGI

I also want to mention that I develop solutions to the toy problems with the 
real problems in mind. I also fully intend to work my way up to the real thing 
by incrementally adding complexity and exploring the problem well at each level 
of complexity. As you do this, the flaws in the design will be clear and I can 
retrace my steps to create a different solution. The benefit to this strategy 
is that we fully understand the problems at each level of complexity. When you 
run into something that is not accounted for, you are much more likely to know how 
to solve it. Despite its difficulties, I prefer my strategy to the alternatives.

Dave


On Mon, Jun 28, 2010 at 3:56 PM, David Jones davidher...@gmail.com wrote:

That does not have to be the case. Yes, you need to know what problems you 
might have in more complicated domains to avoid developing completely useless 
theories on toy problems. But, as you develop for full complexity problems, 
you are confronted with several sub problems. Because you have no previous 
experience, what tends to happen is you hack together a solution that barely 
works and simply isn't right or scalable because we don't have a full 
understanding of the individual sub problems. Having experience with the full 
problem is important, but forcing yourself to solve every sub problem at once 
is not a better strategy at all. You may think my strategy has flaws, but I 
know that and still chose it because the alternative strategies are worse.

Dave



On Mon, Jun 28, 2010 at 3:41 PM, Russell Wallace russell.wall...@gmail.com 
wrote:

On Mon, Jun 28, 2010 at 4:54 PM, David Jones davidher...@gmail.com wrote:
 But, that's why it is important to force oneself to solve them in such a 
 way that it IS applicable to AGI. It doesn't mean that you have to choose 
 a problem that is so hard you can't cheat. It's unnecessary to do that 
 unless you can't control your desire to cheat. I can.

That would be relevant if it was entirely a problem of willpower and
self-discipline, but it isn't. It's also a problem of guidance. A real
problem gives you feedback at every step of the way, it keeps blowing
your ideas out of the water until you come up with one that will
actually work, that you would never have

Re: [agi] Questions for an AGI

2010-06-27 Thread Matt Mahoney
This is wishful thinking. Wishful thinking is dangerous. How about instead of 
hoping that AGI won't destroy the world, you study the problem and come up with 
a safe design.

 -- Matt Mahoney, matmaho...@yahoo.com





From: rob levy r.p.l...@gmail.com
To: agi agi@v2.listbox.com
Sent: Sat, June 26, 2010 1:14:22 PM
Subject: Re: [agi] Questions for an AGI

why should AGIs give a damn about us?


I like to think that they will give a damn because humans have a unique way of 
experiencing reality and there is no reason to not take advantage of that 
precious opportunity to create astonishment or bliss. If anything is important 
in the universe, it's ensuring positive experiences for all areas in which it is 
conscious, I think it will realize that. And with the resources available in 
the solar system alone, I don't think we will be much of a burden. 

I like that idea.  Another reason might be that we won't crack the problem of 
autonomous general intelligence, but the singularity will proceed regardless as 
a symbiotic relationship between life and AI.  That would be beneficial to us 
as a form of intelligence expansion, and beneficial to the artificial entity as a 
way of being alive and having an experience of the world.  


Re: [agi] Reward function vs utility

2010-06-27 Thread Matt Mahoney
The definition of universal intelligence being over all utility functions 
implies that the utility function is unknown. Otherwise there is a fixed 
solution.

 -- Matt Mahoney, matmaho...@yahoo.com





From: Joshua Fox joshuat...@gmail.com
To: agi agi@v2.listbox.com
Sent: Sun, June 27, 2010 4:22:19 PM
Subject: [agi] Reward function vs utility


This has probably been discussed at length, so I will appreciate a reference on 
this:

Why does Legg's definition of intelligence (following on Hutter's AIXI and 
related work) involve a reward function rather than a utility function? For 
this purpose, reward is a function of the world state/history which is unknown 
to the agent, while a utility function is known to the agent. 

Even if we replace the former with the latter, we can still have a definition 
of intelligence that integrates optimization capacity over all possible utility 
functions. 

What is the real  significance of the difference between the two types of 
functions here?
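
For reference, the Legg-Hutter universal intelligence measure is usually 
written, in LaTeX notation, as

  \Upsilon(\pi) = \sum_{\mu \in E} 2^{-K(\mu)} \, V^{\pi}_{\mu}

where E is a set of computable environments with bounded total reward, K(\mu) 
is the Kolmogorov complexity of environment \mu, and V^{\pi}_{\mu} is the 
expected total reward agent \pi receives in \mu.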

Joshua


Re: [agi] Questions for an AGI

2010-06-27 Thread Matt Mahoney
rob levy wrote:
 This is wishful thinking.
 I definitely agree, however we lack a convincing model or plan of any sort 
 for the construction of systems demonstrating subjectivity, 

Define subjectivity. An objective decision might appear subjective to you only 
because you aren't intelligent enough to understand the decision process.

 Therefore it is reasonable to consider symbiosis

How does that follow?

 as both a safe design 

How do you know that a self-replicating organism that we create won't evolve to 
kill us instead? Do we control evolution?

 and potentially the only possible design 

It is not the only possible design. It is possible to create systems that are 
more intelligent than a single human but less intelligent than all of humanity, 
without the capability to modify itself or reproduce without the collective 
permission of the billions of humans that own and maintain control over it. An 
example would be the internet.

 -- Matt Mahoney, matmaho...@yahoo.com





From: rob levy r.p.l...@gmail.com
To: agi agi@v2.listbox.com
Sent: Sun, June 27, 2010 2:37:15 PM
Subject: Re: [agi] Questions for an AGI

I definitely agree, however we lack a convincing model or plan of any sort for 
the construction of systems demonstrating subjectivity, and it seems plausible 
that subjectivity is functionally necessary for general intelligence. Therefore 
it is reasonable to consider symbiosis as both a safe design and potentially 
the only possible design (at least at first), depending on how creative and 
resourceful we get in cog sci/ AGI in coming years.


On Sun, Jun 27, 2010 at 1:13 PM, Matt Mahoney matmaho...@yahoo.com wrote:

This is wishful thinking. Wishful thinking is dangerous. How about instead of 
hoping that AGI won't destroy the world, you study the problem and come up with 
a safe design.

 -- Matt Mahoney, matmaho...@yahoo.com






From: rob levy r.p.l...@gmail.com
To: agi agi@v2.listbox.com
Sent: Sat, June 26, 2010 1:14:22 PM
Subject: Re: [agi]
 Questions for an AGI


why should AGIs give a damn about us?


I like to think that they will give a damn because humans have a unique way of 
experiencing reality and there is no reason to not take advantage of that 
precious opportunity to create astonishment or bliss. If anything is important 
in the universe, its insuring positive experiences for all areas in which it 
is conscious, I think it will realize that. And with the resources available 
in the solar system alone, I don't think we will be much of a burden. 


I like that idea.  Another reason might be that we won't crack the problem of 
autonomous general intelligence, but the singularity will proceed regardless 
as a symbiotic relationship between life and AI.  That would be beneficial to 
us as a form of intelligence expansion, and beneficial to the artificial 
entity as a way of being alive and having an experience of the world.  



Re: [agi] Questions for an AGI

2010-06-27 Thread Matt Mahoney
Travis Lenting wrote:
 I don't like the idea of enhancing human intelligence before the singularity.

The singularity is a point of infinite collective knowledge, and therefore 
infinite unpredictability. Everything has to happen before the singularity 
because there is no after.

 I think crime has to be made impossible even for enhanced humans first. 

That is easy. Eliminate all laws.

 I would like to see the singularity enabling AI to be as little like a 
 reproduction machine as possible.

Is there a difference between enhancing our intelligence by uploading and 
creating killer robots? Think about it.

 Does it really need to be a general AI to cause a singularity? Can it not 
 just stick to scientific data and quantify human uncertainty?  It seems like 
 it would be less likely to ever care about killing all humans so it can rule 
 the galaxy or that it's an omnipotent servant. 

Assume we succeed. People want to be happy. Depending on how our minds are 
implemented, it's either a matter of rewiring our neurons or rewriting our 
software. Is that better than a gray goo accident?

 -- Matt Mahoney, matmaho...@yahoo.com





From: Travis Lenting travlent...@gmail.com
To: agi agi@v2.listbox.com
Sent: Sun, June 27, 2010 5:21:24 PM
Subject: Re: [agi] Questions for an AGI

I don't like the idea of enhancing human intelligence before the singularity. I 
think crime has to be made impossible even for enhanced humans first. I 
think life is too apt to abuse opportunities if possible. I would like to 
see the singularity enabling AI to be as little like a reproduction machine as 
possible. Does it really need to be a general AI to cause a singularity? Can it 
not just stick to scientific data and quantify human uncertainty? It seems 
like it would be less likely to ever care about killing all humans so it can 
rule the galaxy or that it's an omnipotent servant.


On Sun, Jun 27, 2010 at 11:39 AM, The Wizard key.unive...@gmail.com wrote:

This is wishful thinking. Wishful thinking is dangerous. How about instead of 
hoping that AGI won't destroy the world, you study the problem and come up with 
a safe design.



Agreed on this dangerous thought! 


On Sun, Jun 27, 2010 at 1:13 PM, Matt Mahoney matmaho...@yahoo.com wrote:


This is wishful thinking. Wishful thinking is dangerous. How about instead of 
hoping that AGI won't destroy the world, you study the problem and come up 
with a safe design.

 -- Matt Mahoney, matmaho...@yahoo.com






 From: rob levy r.p.l...@gmail.com
To: agi agi@v2.listbox.com
Sent: Sat, June 26, 2010 1:14:22 PM
Subject: Re: [agi]
 Questions for an AGI


why should AGIs give a damn about us?



I like to think that they will give a damn because humans have a unique way 
of experiencing reality and there is no reason to not take advantage of that 
precious opportunity to create astonishment or bliss. If anything is 
important in the universe, it's ensuring positive experiences for all areas in 
which it is conscious, I think it will realize that. And with the resources 
available in the solar system alone, I don't think we will be much of a 
burden. 


I like that idea.  Another reason might be that we won't crack the problem of 
autonomous general intelligence, but the singularity will proceed regardless 
as a symbiotic relationship between life and AI.  That would be beneficial to 
us as a form of intelligence expansion, and beneficial to the artificial 
entity as a way of being alive and having an experience of the world.  




-- 
Carlos A Mejia

Taking life one singularity at a time.
www.Transalchemy.com  




Re: [agi] Theory of Hardcoded Intelligence

2010-06-27 Thread Matt Mahoney
Correct. Intelligence = log(knowledge) + log(computing power). At the extreme 
left of your graph is AIXI, which has no knowledge but infinite computing 
power. At the extreme right you have a giant lookup table.
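
Taking that formula at face value, knowledge and computing power trade off 
against each other along any line of constant intelligence, and doubling 
either one buys the same fixed increment:

  I = \log(\text{knowledge}) + \log(\text{computing power})
  \Rightarrow \log(\text{knowledge}) + \log(2 \cdot \text{computing power}) = I + \log 2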

 -- Matt Mahoney, matmaho...@yahoo.com





From: M E botag...@hotmail.com
To: agi agi@v2.listbox.com
Sent: Sun, June 27, 2010 5:36:38 PM
Subject: [agi] Theory of Hardcoded Intelligence

  I sketched a graph the other day which represented my thoughts on the 
usefulness of hardcoding knowledge into an AI.  (Graph attached)

Basically, the more hardcoded knowledge you include in an AI, or AGI, the lower 
the overall intelligence it will have, but the faster you will reach 
that value. I would place any real AGI toward the left of the 
graph, with systems like CYC toward the right.

Matt




Re: [agi] Questions for an AGI

2010-06-27 Thread Matt Mahoney
Travis Lenting wrote:
 Is there a difference between enhancing our intelligence by uploading and 
 creating killer robots? Think about it.

 Well yes, we're not all bad but I think you read me wrong because that's 
 basically my worry.

What I mean is that one way to look at uploading is to create a robot that 
behaves like you and then die. The question is whether you become the 
robot. But it is a nonsense question. Nothing changes whichever way you answer 
it.

 Assume we succeed. People want to be happy. Depending on how our minds are 
 implemented, it's either a matter of rewiring our neurons or rewriting our 
 software. Is that better than a gray goo accident?

 Are you asking if changing your hardware or software ends your true existence 
 like a grey goo accident would?

A state of maximum happiness or maximum utility is a degenerate mental state 
where any thought or perception would be unpleasant because it would result in 
a different mental state. In a competition with machines that can't have 
everything they want (for example, they fear death and later die), the other 
machines would win because you would have no interest in self preservation and 
they would.

 Assuming the goo is unconscious, 

What do you mean by unconscious?

 it would be worse because there is the potential for a peaceful experience 
 free from the power struggle for limited resources, whether or not humans 
 truly exist.

That result could be reached by a dead planet, which BTW, is the only stable 
attractor in the chaotic process of evolution.

 Does anyone else worry about how we're going to keep this machine's 
 unprecedented resourcefulness from being abused by an elite few to further 
 protect and advance their social superiority?

If the elite few kill off all their competition, then theirs is the only 
ethical model that matters. From their point of view, it would be a good thing. 
How do you feel about humans currently being at the top of the food chain?

 To me it seems like if we can't create a democratic society where people have 
 real choices concerning the issues that affect them most and it  just ends up 
 being a continuation of the class war we have today, then maybe grey goo 
 would be the better option before we start promoting democracy throughout 
 the universe.

Freedom and fairness are important to us because they were programmed into our 
ethical models, not because they are actually important. As a counterexample, 
they are irrelevant to evolution. Gray goo might be collectively vastly more 
intelligent than humanity, if that makes you feel any better.
 -- Matt Mahoney, matmaho...@yahoo.com





From: Travis Lenting travlent...@gmail.com
To: agi agi@v2.listbox.com
Sent: Sun, June 27, 2010 6:53:14 PM
Subject: Re: [agi] Questions for an AGI

Everything has to happen before the singularity because there is no after.

I meant when machines take over technological evolution. 

That is easy. Eliminate all laws.

I would prefer a surveillance state. I should say impossible to get away with 
if conducted in public. 

Is there a difference between enhancing our intelligence by uploading and 
creating killer robots? Think about it.

Well yes, we're not all bad but I think you read me wrong because that's 
basically my worry.

Assume we succeed. People want to be happy. Depending on how our minds are 
implemented, it's either a matter of rewiring our neurons or rewriting our 
software. Is that better than a gray goo accident?

Are you asking if changing your hardware or software ends your true existence 
like a grey goo accident would? Assuming the goo is unconscious, it would be 
worse because there is the potential for a peaceful experience free from the 
power struggle for limited resources, whether or not humans truly exist. 
Does anyone else worry about how we're going to keep this machine's 
unprecedented resourcefulness from being abused by an elite few to further 
protect and advance their social superiority? To me it seems like if we can't 
create a democratic society where people have real choices concerning the 
issues that affect them most and it  just ends up being a continuation of the 
class war we have today, then maybe grey goo would be the better option before 
we start promoting democracy throughout the universe.


On Sun, Jun 27, 2010 at 2:43 PM, Matt Mahoney matmaho...@yahoo.com wrote:

Travis Lenting wrote:
 I don't like the idea of enhancing human intelligence before the singularity.


The singularity is a point of infinite collective knowledge, and therefore 
infinite unpredictability. Everything has to happen before the singularity 
because there is no after.


 I think crime has to be made impossible even for enhanced humans first. 


That is easy. Eliminate all laws.


 I would like to see the singularity enabling AI to be as little like a 
 reproduction machine as possible.


Is there a difference between enhancing our intelligence

Re: [agi] Questions for an AGI

2010-06-24 Thread Matt Mahoney
Am I a human or am I an AGI?

Dana Ream wrote:
 How do you work?
 
Just like you designed me to.

deepakjnath wrote: 
 What should I ask if I could ask AGI anything?
The Wizard wrote:
 What should I ask an agi


You don't need to ask me anything. I will do all of your thinking for you.

Florent Bethert wrote:
 Tell me what I need to know, by order of importance.

Nothing. I will do all of your thinking for you.

A. T. Murray wrote:
 Who killed Donald Young, a gay sex partner of U.S. President Barack Obama

It must have been that other AGI, Mentifex. I never did trust it ;-)


-- Matt Mahoney, matmaho...@yahoo.com




Re: [agi] An alternative plan to discover self-organization theory

2010-06-21 Thread Matt Mahoney
Mike Tintner wrote:
 Matt: It is like the way evolution works, except that there is a human in the 
 loop to make the process a little more intelligent.
  
 IOW this is like AGI, except that it's narrow AI. That's the whole point - 
 you have to remove the human from the loop.  In fact, it also sounds like a 
 misconceived and rather literal idea of evolution as opposed to the reality. 
You're right. It is narrow AI. You keep pointing out that we haven't solved the 
general problem. You are absolutely correct.

So, do you have any constructive ideas on how to solve it? Preferably something 
that takes less than 3 billion years on a planet sized molecular computer.

-- Matt Mahoney, matmaho...@yahoo.com





From: Mike Tintner tint...@blueyonder.co.uk
To: agi agi@v2.listbox.com
Sent: Mon, June 21, 2010 7:59:29 AM
Subject: Re: [agi] An alternative plan to discover self-organization theory


Matt: It is like the way evolution works, except that there is a human in the 
loop to make the process a little more intelligent.

IOW this is like AGI, except that it's narrow AI. That's the whole point - you 
have to remove the human from the loop. In fact, it also sounds like a 
misconceived and rather literal idea of evolution as opposed to the reality.


From: Matt Mahoney 
Sent: Monday, June 21, 2010 3:01 AM
To: agi 
Subject: Re: [agi] An alternative plan to discover self-organization 
theory

Steve Richfield wrote:
 He suggested that I construct a simple 
NN that couldn't work without self organizing, and make dozens/hundreds of 
different neuron and synapse operational characteristics selectable ala genetic 
programming, put it on the fastest computer I could get my hands on, turn it 
loose trying arbitrary combinations of characteristics, and see what the 
winning combination turns out to be. Then, armed with that knowledge, refine 
the genetic characteristics and do it again, and iterate until 
it efficiently self organizes. This might go on for months, but 
self-organization theory might just emerge from such an effort. 

Well, that is the process that created human intelligence, no? But months? 
It actually took 3 billion years on a planet sized molecular computer.

That doesn't mean it won't work. It just means you have to narrow your 
search space and lower your goals.

I can give you an example of a similar process. Look at the code for 
PAQ8HP12ANY and LPAQ9M data compressors by Alexander Ratushnyak, which are the 
basis of winning Hutter prize submissions. The basic principle is that you have 
a model that receives a stream of bits from an unknown source and it uses a 
complex hierarchy of models to predict the next bit. It is sort of like a 
neural 
network because it averages together the results of lots of adaptive pattern 
recognizers by processes that are themselves adaptive. But I would describe the 
code as inscrutable, kind of like your DNA. There are lots of parameters to 
tweak, such as how to preprocess the data, arrange the dictionary, compute 
various contexts, arrange the order of prediction flows, adjust various 
learning 
rates and storage capacities, and make various tradeoffs sacrificing 
compression 
to meet memory and speed requirements. It is simple to describe the process of 
writing the code. You make random changes and keep the ones that work. It is 
like the way evolution works, except that there is a human in the loop to make 
the process a little more intelligent.

There are also fully automated optimizers for compression algorithms, but 
they are more limited in their search space. For example, the experimental PPM 
based EPM by Serge Osnach includes a program EPMOPT that adjusts 20 numeric 
parameters up or down using a hill climbing search to find the best 
compression. 
It can be very slow. Another program, M1X2 by Christopher Mattern, uses a 
context mixing (PAQ like) algorithm in which the contexts are selected by using 
a hill climbing genetic algorithm to select a set of 64-bit masks. One version 
was run for 3 days to find the best options to compress a file that normally 
takes 45 seconds.

 -- Matt Mahoney, matmaho...@yahoo.com 





 From: Steve Richfield 
steve.richfi...@gmail.com
To: agi 
agi@v2.listbox.com
Sent: Sun, June 20, 2010 2:06:55 
AM
Subject: [agi] An 
alternative plan to discover self-organization theory

No, I 
haven't been smokin' any wacky tobacy. Instead, I was having a long talk with 
my 
son Eddie, about self-organization theory. This is his proposal:

He suggested that I construct a simple NN that couldn't work 
without self organizing, and make dozens/hundreds of different neuron and 
synapse operational characteristics selectable ala genetic programming, put it 
on the fastest computer I could get my hands on, turn it loose trying arbitrary 
combinations of characteristics, and see what the winning combination turns 
out to be. Then, armed with that knowledge, refine

Re: [agi] Re: High Frame Rates Reduce Uncertainty

2010-06-21 Thread Matt Mahoney
Your computer monitor flashes 75 frames per second, but you don't notice any 
flicker because light sensing neurons have a response delay of about 100 ms. 
Motion detection begins in the retina by cells that respond to contrast between 
light and dark moving in specific directions computed by simple, fixed weight 
circuits. Higher up in the processing chain, you detect motion when your eyes 
and head smoothly track moving objects using kinesthetic feedback from your eye 
and neck muscles and input from your built in accelerometer in the semicircular 
canals in your ears. This is all very complicated of course. You are more 
likely to detect motion in objects that you recognize and expect to move, like 
people, animals, cars, etc.

 -- Matt Mahoney, matmaho...@yahoo.com





From: David Jones davidher...@gmail.com
To: agi agi@v2.listbox.com
Sent: Mon, June 21, 2010 9:39:30 AM
Subject: [agi] Re: High Frame Rates Reduce Uncertainty

Ignoring Steve because we are simply going to have to agree to disagree... And 
I don't see enough value in trying to understand his paper. I said the math was 
overly complex, but what I really meant is that the approach is overly complex 
and so filled with research-specific jargon, I don't care to try to understand it. 
It is overly concerned with copying the way that the brain does things. I don't 
care how the brain does it. I care about why the brain does it. It's the same as 
the analogy of giving a man a fish or teaching him to fish. You may figure out 
how the brain works, but it does you little good if you don't understand why it 
works that way. You would have to create a synthetic brain to take advantage of 
the knowledge, which is not a viable approach to AGI for many reasons. There are a 
million other ways, even better ways, to do it than the way the brain does it. 
Just because the brain accidentally found 1 way out of a million to do it 
doesn't make it the
 right way for us to develop AGI. 

So, moving on

I can't find references online, but I've read that the Air Force studied the 
ability of the human eye to identify aircraft in images that were flashed on a 
screen at 1/220th of a second. So, clearly, the human eye can at least 
distinguish 220 fps if it operated that way. Of course, it may not operate in 
terms of frames per second, but that is beside the point. I've also heard other people say 
that a study has shown that the human eye takes 1000 exposures per second. They 
had no references though, so it is hearsay.

The point was that the brain takes advantage of the fact that with such a high 
exposure rate, the changes between each image are very small if the objects are 
moving. This allows it to distinguish movement and visual changes with 
extremely low uncertainty. If it detects that the changes required to match two 
parts of an image are too high or the distance between matches is too far, it 
can reject a match. This allows it to distinguish only very low uncertainty 
changes and reject changes that have high uncertainty. 
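
A rough sketch of that matching test (frame contents and thresholds are 
invented for illustration): at a high frame rate an object moves at most a 
pixel or two between frames, so a tiny search window suffices and anything 
requiring a large displacement or a large pixel difference is simply rejected.

#include <stdio.h>
#include <stdlib.h>

#define W 8
#define H 8
#define MAX_SHIFT 1    /* expected inter-frame motion at a high frame rate */
#define MAX_SAD   40   /* reject matches with a larger total difference   */

/* sum of absolute differences between frame a and frame b shifted by (dx,dy) */
int sad_shift(const int a[H][W], const int b[H][W], int dx, int dy) {
  int s = 0;
  for (int y = 0; y < H; y++)
    for (int x = 0; x < W; x++) {
      int xx = x + dx, yy = y + dy;
      if (xx < 0 || xx >= W || yy < 0 || yy >= H) continue;
      s += abs(a[y][x] - b[yy][xx]);
    }
  return s;
}

int main(void) {
  int f0[H][W] = {{0}}, f1[H][W] = {{0}};
  f0[3][3] = 200;                 /* a bright "object" ...          */
  f1[3][4] = 200;                 /* ... shifted right by one pixel */

  int best = -1, bdx = 0, bdy = 0;
  for (int dy = -MAX_SHIFT; dy <= MAX_SHIFT; dy++)
    for (int dx = -MAX_SHIFT; dx <= MAX_SHIFT; dx++) {
      int s = sad_shift(f0, f1, dx, dy);
      if (best < 0 || s < best) { best = s; bdx = dx; bdy = dy; }
    }
  if (best <= MAX_SAD)
    printf("match: shift (%d,%d), SAD %d\n", bdx, bdy, best);
  else
    printf("no confident match (SAD %d)\n", best);
  return 0;
}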

I think this is a very significant discovery regarding how the brain is able to 
learn in such an ambiguous world with so many variables that are difficult to 
disambiguate, interpret and understand. 

Dave

On Fri, Jun 18, 2010 at 2:19 PM, David Jones davidher...@gmail.com wrote:

I just came up with an awesome idea. I just realized that the brain takes 
advantage of high frame rates to reduce uncertainty when it is estimating 
motion. The slower the frame rate, the more uncertainty there is because 
objects may have traveled too far between images to match with high certainty 
using simple techniques. 

So, this made me think, what if the secret to the brain's ability to learn 
generally stems from this high frame rate trick. What if we made a system that 
could process even higher frame rates than the brain can. By doing this you can 
reduce the uncertainty of matches to very, very low levels (well, in my theory so far). If 
you can do that, then you can learn about the objects in a video, how they 
move together or separately with very high certainty. 

You see, matching is the main barrier when learning about objects. But with a 
very high frame rate, we can use a fast algorithm and could potentially reduce 
the uncertainty to almost nothing. Once we learn about objects, matching gets 
easier because now we have training data and experience to take advantage of. 

In addition, you can also gain knowledge about lighting, color variation, 
noise, etc. With that knowledge, you can then automatically create a model of 
the object with extremely high confidence. You will also be able to determine 
the effects of light and noise on the object's appearance, which will help 
match the object invariantly in the future. It allows you to determine what is 
expected and unexpected for the object's appearance with much higher 
confidence. 

Pretty cool idea huh?

Dave



Re: [agi] Fwd: AGI question

2010-06-21 Thread Matt Mahoney
rob levy wrote:
 I am secondarily motivated by the fact that (considerations of morality or 
 amorality aside) AGI is inevitable, though it is far from being a forgone 
 conclusion that powerful general thinking machines will have a first-hand 
 subjective relationship to a world, as living creatures do-- and therefore it 
 is vital that we do as well as possible in understanding what makes systems 
 conscious.  A zombie machine intelligence singularity is something I would 
 refer to rather as a holocaust, even if no one were directly killed, 
 assuming these entities could ultimately prevail over the previous forms of 
 life on our planet.

What do you mean by conscious? If your brain were removed and replaced by a 
functionally equivalent computer that simulated your behavior (presumably a 
zombie), how would you be any different? Why would it matter?

 -- Matt Mahoney, matmaho...@yahoo.com





From: rob levy r.p.l...@gmail.com
To: agi agi@v2.listbox.com
Sent: Mon, June 21, 2010 11:53:29 AM
Subject: [agi] Fwd: AGI question

Hi


I'm new to this list, but I've been thinking about consciousness, cognition and 
AI for about half of my life (I'm 32 years old).  As is probably the case for 
many of us here, my interests began with direct recognition of the depth and 
wonder of varieties of phenomenological experiences-- and attempting to 
comprehend how these constellations of significance fit in with a larger 
picture of what we can reliably know about the natural world.  

I am secondarily motivated by the fact that (considerations of morality or 
amorality aside) AGI is inevitable, though it is far from being a forgone 
conclusion that powerful general thinking machines will have a first-hand 
subjective relationship to a world, as living creatures do-- and therefore it 
is vital that we do as well as possible in understanding what makes systems 
conscious.  A zombie machine intelligence singularity is something I would 
refer to rather as a holocaust, even if no one were directly killed, assuming 
these entities could ultimately prevail over the previous forms of life on our 
planet.

I'm sure I'm not the only one on this list who sees a behavioral/ecological 
level of analysis as the most likely correct level at which to study perception 
and cognition, and perception as being a kind of active relationship between an 
organism and an environment.  Having thoroughly convinced my self of a 
non-dualist, embodied, externalist perspective on cognition, I turn to the 
nature of life itself (and possibly even physics but maybe that level will not 
be necessary) to make sense of the nature of subjectivity.  I like Bohm's or 
Bateson's panpsychism about systems as wholes, and significance as 
informational distinctions (which it would be natural to understand as being 
the basis of subjective experience), but this is descriptive rather than 
explanatory.

I am not a biologist, but I am increasingly interested in finding answers to 
what it is about living organisms that gives them a unity such that something 
is something to the system as a whole.  The line of investigation that 
theoretical biologists like Robert Rosen and other NLDS/chaos people have 
pursued is interesting, but I am unfamiliar with related work that might have 
made more progress on the system-level properties that give life its 
characteristic unity and system-level responsiveness.  To me, this seems the 
most likely candidate for a paradigm shift that would produce AGI.  In contrast 
I'm not particularly convinced that modeling a brain is a good way to get AGI, 
although I'd guess we could learn a few more things about the coordination of 
complex behavior if we could really understand them.

Another way to put this is that obviously evolutionary computation would be 
more than just boring hill-climbing if we knew what an organism even IS 
(perhaps in a more precise computational sense). If we can know what an 
organism is then it should be (maybe) trivial to model concepts, consciousness, 
and high level semantics to the umpteenth degree, or at least this would be a 
major hurdle, I think.

Even assuming a solution to the problem posed above, there is still plenty of 
room for other minds skepticism in non-living entities implemented on 
questionably foreign mediums but there would be a lot more reason to sleep well 
that the science/technology is leading in a direction in which questions about 
subjectivity could be meaningfully investigated.

Rob




Re: [agi] An alternative plan to discover self-organization theory

2010-06-21 Thread Matt Mahoney
rob levy wrote:
 On a related note, what is everyone's opinion on why evolutionary algorithms 
 are such a miserable failure as creative machines, despite their successes 
 in narrow optimization problems?

Lack of computing power. How much computation would you need to simulate the 3 
billion years of evolution that created human intelligence?

 -- Matt Mahoney, matmaho...@yahoo.com





From: rob levy r.p.l...@gmail.com
To: agi agi@v2.listbox.com
Sent: Mon, June 21, 2010 11:56:53 AM
Subject: Re: [agi] An alternative plan to discover self-organization theory

(I'm a little late in this conversation.  I tried to send this message the 
other day but I had my list membership configured wrong. -Rob)


-- Forwarded message --
From: rob levy r.p.l...@gmail.com
Date: Sun, Jun 20, 2010 at 5:48 PM
Subject: Re: [agi] An alternative plan to discover self-organization theory
To: agi@v2.listbox.com


On a related note, what is everyone's opinion on why evolutionary algorithms 
are such a miserable failure as creative machines, despite their successes in 
narrow optimization problems?


I don't want to conflate the possibly separable problems of biological 
development and evolution, though they are interrelated.  There are various 
approaches to evolutionary theory such as Lima de Faria's evolution without 
selection ideas and Reid's evolution by natural experiment that suggest 
natural selection is not  all it's cracked up to be, and that the step of 
generating (mutating, combining, ...) is where the more interesting stuff 
happens.  Most of the alternatives to Neodarwinian Synthesis I have seen are 
based in dynamic models of emergence in complex systems. The upshot is, you 
don't get creativity for free, you actually still need to solve a problem that 
is as hard as AGI in order to get creativity for free. 


So, you would need to solve the AGI-hard problem of evolution and development 
of life, in order to then solve AGI itself (reminds me of the old SNL sketch: 
first, get a million dollars...).  Also, my hunch is that there is quite a 
bit of overlap between the solutions to the two problems.

Rob

Disclaimer: I'm discussing things above that I'm not and don't claim to be an 
expert in, but from what I have seen so far on this list, that should be 
alright.  AGI is by its nature very multidisciplinary which necessitates often 
being breadth-first, and therefore shallow in some areas.




On Sun, Jun 20, 2010 at 2:06 AM, Steve Richfield steve.richfi...@gmail.com 
wrote:

No, I haven't been smokin' any wacky tobacy. Instead, I was having a long talk 
with my son Eddie, about self-organization theory. This is his proposal:

He suggested that I construct a simple NN that couldn't work without self 
organizing, and make dozens/hundreds of different neuron and synapse 
operational characteristics selectable ala genetic programming, put it on the 
fastest computer I could get my hands on, turn it loose trying arbitrary 
combinations of characteristics, and see what the winning combination turns 
out to be. Then, armed with that knowledge, refine the genetic characteristics 
and do it again, and iterate until it efficiently self organizes. This might 
go on for months, but self-organization theory might just emerge from such an 
effort. I had a bunch of objections to his approach, e.g.

Q.  What if it needs something REALLY strange to work?
A.  Who better than you to come up with a long list of really strange 
functionality?

Q.  There are at least hundreds of bits in the genome.


A.  Try combinations in pseudo-random order, with each bit getting asserted in 
~half of the tests. If/when you stumble onto a combination that sort of works, 
switch to varying the bits one-at-a-time, and iterate in this way until the 
best combination is found.

Q.  Where are we if this just burns electricity for a few months and finds 
nothing?
A.  Print out the best combination, break out the wacky tobacy, and come up 
with even better/crazier parameters to test.

I have never written a line of genetic programming, but I know that others 
here have. Perhaps you could bring some rationality to this discussion?

What would be a simple NN that needs self-organization? Maybe a small pot 
of neurons that could only work if they were organized into layers, e.g. a 
simple 64-neuron system that would work as a 4x4x4-layer visual recognition 
system, given the input that I fed it?

Any thoughts on how to score partial successes?

Has anyone tried anything like this in the past?

Is anyone here crazy enough to want to help with such an effort?

This Monte Carlo approach might just be simple enough to work, and simple 
enough that it just HAS to be tried.

All thoughts, stones, and rotten fruit will be gratefully appreciated.

Thanks in advance.

Steve

 


Re: [agi] An alternative plan to discover self-organization theory

2010-06-20 Thread Matt Mahoney
Steve Richfield wrote:
 He suggested that I construct a simple NN that couldn't work without self 
 organizing, and make dozens/hundreds of different neuron and synapse 
 operational characteristics selectable ala genetic programming, put it on the 
 fastest computer I could get my hands on, turn it loose trying arbitrary 
 combinations of characteristics, and see what the winning combination turns 
 out to be. Then, armed with that knowledge, refine the genetic 
 characteristics and do it again, and iterate until it efficiently self 
 organizes. This might go on for months, but self-organization theory might 
 just emerge from such an effort. 

Well, that is the process that created human intelligence, no? But months? It 
actually took 3 billion years on a planet sized molecular computer.

That doesn't mean it won't work. It just means you have to narrow your search 
space and lower your goals.

I can give you an example of a similar process. Look at the code for 
PAQ8HP12ANY and LPAQ9M data compressors by Alexander Ratushnyak, which are the 
basis of winning Hutter prize submissions. The basic principle is that you have 
a model that receives a stream of bits from an unknown source and it uses a 
complex hierarchy of models to predict the next bit. It is sort of like a 
neural network because it averages together the results of lots of adaptive 
pattern recognizers by processes that are themselves adaptive. But I would 
describe the code as inscrutable, kind of like your DNA. There are lots of 
parameters to tweak, such as how to preprocess the data, arrange the 
dictionary, compute various contexts, arrange the order of prediction flows, 
adjust various learning rates and storage capacities, and make various 
tradeoffs sacrificing compression to meet memory and speed requirements. It is 
simple to describe the process of writing the code. You make random
 changes and keep the ones that work. It is like the way evolution works, 
except that there is a human in the loop to make the process a little more 
intelligent.

There are also fully automated optimizers for compression algorithms, but they 
are more limited in their search space. For example, the experimental PPM based 
EPM by Serge Osnach includes a program EPMOPT that adjusts 20 numeric 
parameters up or down using a hill climbing search to find the best 
compression. It can be very slow. Another program, M1X2 by Christopher Mattern, 
uses a context mixing (PAQ like) algorithm in which the contexts are selected 
by using a hill climbing genetic algorithm to select a set of 64-bit masks. One 
version was run for 3 days to find the best options to compress a file that 
normally takes 45 seconds.
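
A stripped-down sketch of that kind of hill climbing (evaluate() below is a 
stand-in for "run the compressor and measure the output size"; the real 
objective and parameters are of course much more complicated):

#include <stdio.h>

#define NPARAM 4

/* pretend objective: smaller is better, minimized at (3, 1, 4, 1) */
double evaluate(const int *p) {
  int target[NPARAM] = {3, 1, 4, 1};
  double cost = 0;
  for (int i = 0; i < NPARAM; i++) {
    int d = p[i] - target[i];
    cost += d * d;
  }
  return cost;
}

int main(void) {
  int p[NPARAM] = {0, 0, 0, 0};
  double best = evaluate(p);
  int improved = 1;
  while (improved) {                       /* keep sweeping until no move helps */
    improved = 0;
    for (int i = 0; i < NPARAM; i++)
      for (int step = -1; step <= 1; step += 2) {
        p[i] += step;                      /* nudge one parameter up or down */
        double c = evaluate(p);
        if (c < best) { best = c; improved = 1; }
        else          { p[i] -= step; }    /* undo a move that didn't help */
      }
  }
  printf("best parameters: %d %d %d %d (cost %.0f)\n", p[0], p[1], p[2], p[3], best);
  return 0;
}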

 -- Matt Mahoney, matmaho...@yahoo.com





From: Steve Richfield steve.richfi...@gmail.com
To: agi agi@v2.listbox.com
Sent: Sun, June 20, 2010 2:06:55 AM
Subject: [agi] An alternative plan to discover self-organization theory

No, I haven't been smokin' any wacky tobacy. Instead, I was having a long talk 
with my son Eddie, about self-organization theory. This is his proposal:

He suggested that I construct a simple NN that couldn't work without self 
organizing, and make dozens/hundreds of different neuron and synapse 
operational characteristics selectable ala genetic programming, put it on the 
fastest computer I could get my hands on, turn it loose trying arbitrary 
combinations of characteristics, and see what the winning combination turns 
out to be. Then, armed with that knowledge, refine the genetic characteristics 
and do it again, and iterate until it efficiently self organizes. This might go 
on for months, but self-organization theory might just emerge from such an 
effort. I had a bunch of objections to his approach, e.g.

Q.  What if it needs something REALLY strange to work?
A.  Who better than you to come up with a long list of really strange 
functionality?

Q.  There are at least hundreds of bits in the genome.
A.  Try combinations in pseudo-random order, with each bit getting asserted in 
~half of the tests. If/when you stumble onto a combination that sort of works, 
switch to varying the bits one-at-a-time, and iterate in this way until the 
best combination is found.

Q.  Where are we if this just burns electricity for a few months and finds 
nothing?
A.  Print out the best combination, break out the wacky tobacy, and come up 
with even better/crazier parameters to test.
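
In rough code, the two-phase search described above (pseudo-random combinations, then one-at-a-time bit variation) would look something like the following. This is a sketch only, nothing that has been implemented; score() is a placeholder for however partial success ends up being measured.

#include <bitset>
#include <cstdio>
#include <random>

const int BITS = 200;                        // "hundreds of bits in the genome"

// Placeholder fitness: here just the number of set bits. A real run would
// build the network this genome encodes and score how well it self-organizes.
double score(const std::bitset<BITS>& g) { return double(g.count()); }

int main() {
    std::mt19937 rng(12345);
    std::bernoulli_distribution coin(0.5);   // each bit asserted in ~half the tests

    std::bitset<BITS> best;
    double best_score = score(best);

    for (int trial = 0; trial < 1000; ++trial) {       // phase 1: pseudo-random combos
        std::bitset<BITS> g;
        for (int i = 0; i < BITS; ++i) g[i] = coin(rng);
        if (score(g) > best_score) { best = g; best_score = score(g); }
    }
    for (int i = 0; i < BITS; ++i) {                   // phase 2: one bit at a time
        std::bitset<BITS> g = best;
        g.flip(i);
        if (score(g) > best_score) { best = g; best_score = score(g); }
    }
    std::printf("best score = %.0f\n", best_score);
    return 0;
}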

I have never written a line of genetic programming, but I know that others here 
have. Perhaps you could bring some rationality to this discussion?

What would be a simple NN that needs self-organization? Maybe a small pot 
of neurons that could only work if they were organized into layers, e.g. a 
simple 64-neuron system that would work as a 4x4x4-layer visual recognition 
system, given the input that I fed it?

Any thoughts on how to score partial successes?

Has anyone tried anything like this in the past?

Is anyone here crazy

Re: [agi] just a thought

2009-01-14 Thread Matt Mahoney
--- On Wed, 1/14/09, Christopher Carr cac...@pdx.edu wrote:

 Problems with IQ notwithstanding, I'm confident that, were my silly IQ
of 145 merely doubled, I could convince Dr. Goertzel to give me the
majority of his assets, including control of his businesses. And if he
were to really meet someone that bright, he would be a fool or
super-human not to do so, which he isn't (a fool, that is).

First, if you knew what you would do if you were twice as smart, you would 
already be that smart. Therefore you don't know.

Second, you have never even met anyone with an IQ of 290. How do you know what 
they would do?

How do you measure an IQ of 100n?

- Ability to remember n times as much?
- Ability to learn n times faster?
- Ability to solve problems n times faster?
- Ability to do the work of n people?
- Ability to make n times as much money?
- Ability to communicate with n people at once?

Please give me an IQ test that measures something that can't be done by n log n 
people (allowing for some organizational overhead).

-- Matt Mahoney, matmaho...@yahoo.com





Re: [agi] What Must a World Be That a Humanlike Intelligence May Develop In It?

2009-01-13 Thread Matt Mahoney
My response to Ben's paper is to be cautious about drawing conclusions from 
simulated environments. Human level AGI has an algorithmic complexity of 10^9 
bits (as estimated by Landauer). It is not possible to learn this much 
information from an environment that is less complex. If a baby AI did perform 
well in a simplified simulation of the world, it would not imply that the same 
system would work in the real world. It would be like training a language model 
on a simple, artificial language and then concluding that the system could be 
scaled up to learn English.

This is a lesson from my dissertation work in network intrusion anomaly 
detection. This was a machine learning task in which the system was trained on 
attack-free network traffic, and then identified anything out of the ordinary 
as malicious. For development and testing, we used the 1999 MIT-DARPA Lincoln 
Labs data set consisting of 5 weeks of synthetic network traffic with hundreds 
of labeled attacks. The test set developers took great care to make the data as 
realistic as possible. They collected statistics from real networks, built an 
isolated network of 4 real computers running different operating systems plus 
thousands of simulated computers that generated HTTP requests to public 
websites and mailing lists, synthetic email using English word bigram 
frequencies, and other kinds of traffic.

In my work I discovered a simple algorithm that beat the best intrusion 
detection systems available at the time. I parsed network packets into 
individual 1-4 byte fields, recorded all the values that ever occurred at least 
once in training, and flagged any new value in the test data as suspicious, 
with a score inversely proportional to the size of the set of values observed 
in training and proportional to the time since the previous anomaly.
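
In outline, that scoring rule is just the following (a paraphrase in code, not the original dissertation software; field names and numbers are made up):

#include <cstdio>
#include <map>
#include <set>
#include <string>

// Training: remember every value each packet field ever takes.
// Testing: a never-before-seen value in field f at time t scores
// (t - time of previous anomaly) / (number of distinct training values of f).
struct AnomalyDetector {
    std::map<std::string, std::set<long>> seen;
    double t_last = 0;                                  // time of previous anomaly

    void train(const std::string& field, long value) { seen[field].insert(value); }

    double test(const std::string& field, long value, double t) {
        std::set<long>& vals = seen[field];
        if (vals.count(value)) return 0.0;              // seen in training: not suspicious
        double r = vals.empty() ? 1.0 : double(vals.size());
        double s = (t - t_last) / r;                    // higher = more suspicious
        t_last = t;
        return s;
    }
};

int main() {
    AnomalyDetector d;
    d.train("ip_ttl", 64);
    d.train("ip_ttl", 128);
    std::printf("score = %.2f\n", d.test("ip_ttl", 63, 10.0));   // novel TTL value
    return 0;
}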

Not surprisingly, the simple algorithm failed on real network traffic. There 
were too many false alarms for it to be even remotely useful. The reason it 
worked on the synthetic traffic was that it was algorithmically simple compared 
to real traffic. For example, one of the most effective tests was the TTL 
value, a counter that decrements with each IP routing hop, intended to prevent 
routing loops. It turned out that most of the attacks were simulated from a 
machine that was one hop further away than the machines simulating normal 
traffic.

A problem like that could have been fixed, but there were a dozen others that I 
found, and probably many that I didn't find. It's not that the test set 
developers weren't careful. They spent probably $1 million developing it 
(several people over 2 years). It's that you can't simulate the high complexity 
of thousands of computers and human users with anything less than that. Simple 
problems have simple solutions, but that's not AGI.

-- Matt Mahoney, matmaho...@yahoo.com


--- On Fri, 1/9/09, Ben Goertzel b...@goertzel.org wrote:

 From: Ben Goertzel b...@goertzel.org
 Subject: [agi] What Must a World Be That a Humanlike Intelligence May Develop 
 In It?
 To: agi@v2.listbox.com
 Date: Friday, January 9, 2009, 5:58 PM
 Hi all,
 
 I intend to submit the following paper to JAGI shortly, but
 I figured
 I'd run it past you folks on this list first, and
 incorporate any
 useful feedback into the draft I submit
 
 This is an attempt to articulate a virtual world
 infrastructure that
 will be adequate for the development of human-level AGI
 
 http://www.goertzel.org/papers/BlocksNBeadsWorld.pdf
 
 Most of the paper is taken up by conceptual and
 requirements issues,
 but at the end specific world-design proposals are made.
 
 This complements my earlier paper on AGI Preschool.  It
 attempts to
 define what kind of underlying virtual world infrastructure
 an
 effective AGI preschool would minimally require.
 
 thx
 Ben G
 
 
 
 -- 
 Ben Goertzel, PhD
 CEO, Novamente LLC and Biomind LLC
 Director of Research, SIAI
 b...@goertzel.org
 
 I intend to live forever, or die trying.
 -- Groucho Marx





Re: [agi] What Must a World Be That a Humanlike Intelligence May Develop In It?

2009-01-13 Thread Matt Mahoney
--- On Tue, 1/13/09, Ben Goertzel b...@goertzel.org wrote:

 The complexity of a simulated environment is tricky to estimate, if
 the environment contains complex self-organizing dynamics, random
 number generation, and complex human interactions ...

In fact it's not computable. But if you write 10^6 bits of code for your 
simulator, you know it's less than 10^6 bits.

But I wonder which is a better test of AI.

http://cs.fit.edu/~mmahoney/compression/text.html
is based on natural language prediction, equivalent to the Turing test. The 
data has 10^9 bits of complexity, just enough to train a human adult language 
model.

http://cs.fit.edu/~mmahoney/compression/uiq/
is based on Legg and Hutter's universal intelligence. It probably has a few 
hundred bits of complexity, designed to be just beyond the reach of 
tractability for universal algorithms like AIXI^tl.

-- Matt Mahoney, matmaho...@yahoo.com





Re: [agi] [WAS The Smushaby] The Logic of Creativity

2009-01-13 Thread Matt Mahoney
I think what Mike is saying is that I could draw what I think a flying house 
would look like, and you could look at my picture and say it was a flying 
house, even though neither of us has ever seen one. Therefore, AGI should be 
able to solve the same kind of problems, and why aren't we designing and 
testing AGI this way? But don't worry about it. Mike doesn't know how to solve 
the problem either.

-- Matt Mahoney, matmaho...@yahoo.com


--- On Tue, 1/13/09, Jim Bromer jimbro...@gmail.com wrote:

 From: Jim Bromer jimbro...@gmail.com
 Subject: Re: [agi] [WAS The Smushaby] The Logic of Creativity
 To: agi@v2.listbox.com
 Date: Tuesday, January 13, 2009, 3:02 PM
 I am reluctant to say this, but I am not sure if I actually
 understand
 what Mike is getting at. He described a number of logical
 (in the
 greater sense of being reasonable and structured) methods
 by which one
 could achieve some procedural goal, and then he declares
 that logic
 (in this greater sense that I believe acknowledged) was
 incapable of
 achieving it.
 
 Let's take a flying house.  I have to say that there
 was a very great
 chance that I misunderstood what Mike was saying, since I
 believe that
 he effectively said that a computer program, using
 logically derived
 systems could not come to the point where it could
 creatively draw a
 picture of a flying house like a child might.
 
 If that was what he was saying then it is very strange. 
 Obviously,
 one could program a computer to draw a flying house.  So
 right away,
 his point must have been under stated, because that means
 that a
 computer program using computer logic (somewhere within
 this greater
 sense of the term) could follow a program designed to get
 it to draw a
 flying house.
 
 So right away, Mike's challenge can't be taken
 seriously.  If we can
 use logical design to get the computer program to draw a
 flying house,
 we can find more creative ways to get it to the same point.
  Do you
 understand what I am saying?  You aren't actually going
 to challenge
 me to write a rather insipid program that will draw a
 flying house for
 you are you?  You accept the statement that I could do that
 if I
 wanted to right?  If you do accept that statement, then you
 should be
 able to accept the fact that I could also write a more
 elaborate
 computer program to do the same thing, only it might, for
 example, do
 so only after the words house and
 flying were input. I think you
 understand that I could write a slightly more elaborate
 computer
 program to do the something like that.  Ok, now I could
 keep making it
 more complicated and eventually I could get to the point
 where where
 it could take parts of pictures that it was exposed to and
 draw them
 in more creative combinations.   If it was exposed to
 pictures of
 airplanes flying, and if it was exposed to pictures of
 houses, it
 might,. through quasi random experimentation try drawing a
 picture of
 the airplane flying past the house as if the house was an
 immense
 mountain, and then it might try some clouds as landscaping
 for the
 house and then it might try a cloud with a driveway,
 garbage can and a
 chimney, and eventually it might even draw a picture of a
 house with
 wings.  All I need to do that is to use some shape
 detecting
 algorithms that have been developed for graphics programs
 and are used
 all the time by graphic artists that can approximately
 determine the
 shape of the house and airplane in the different pictures
 and then it
 would just be a matter of time before it could (and would)
 try to draw
 a flying house.
 
 Which step do you doubt, or did I completely misunderstand
 you?
 1. I could (I hope I don't have to) write a program
 that could draw a
 flying house.
 2. I could make it slightly more elaborate so, for example,
 that it
 would only draw the flying house if the words
 'house' and 'flying'
 were input.
 3. I could vary the program in many other ways.  Now
 suppose that I
 showed you one of these programs.  After that I could make
 it more
 complicated so that it went through a slightly more
 creative process
 than the program you saw the previous time.
 4. I could continue to make the program more and more
 complicated. I
 could, (with a lot of graphics techniques that I know about
 but
 haven't actually mastered) write the program so that if
 it was exposed
 to pictures of houses and to pictures of flying, would have
 the
 ability to eventually draw a picture of a flying house
 (along with a
 lot of other creative efforts that you have not) even
 thought of.  But
 the thing is, that I can do this without using advanced AGI
 techniques!
 
 So, I must retain the recognition that I may not have been
 able to
 understand you because what you are saying is not totally
 reasonable
 to me.
 Jim Bromer
 
 

Re: [agi] [WAS The Smushaby] The Logic of Creativity

2009-01-13 Thread Matt Mahoney
--- On Tue, 1/13/09, Mike Tintner tint...@blueyonder.co.uk wrote:

 Oh and just to answer Matt - if you want to keep doing
 narrow AI, like everyone else, then he's right -
 don't worry about it. Pretend it doesn't exist.
 Compress things :).

Now, Mike, it is actually a simple problem.

1. Collect about 10^8 random photos (about what we see in a lifetime).

2. Label all the ones of houses, and all the ones of things flying.

3. Train an image recognition system (a hierarchical neural network, probably 
3-5 layers, 10^7 neurons, 10^11 connections) to detect these two features. 
You'll need about 10^19 CPU operations, or about a month on a 1000 CPU cluster.

4. Invert the network by iteratively drawing images that activate these two 
features and work down the hierarchy. (Should be faster than step 3). When you 
are done, you will have a picture of a flying house.
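
To sketch what step 4 means, here is a toy with two made-up linear detectors and a 4x4 image, nothing like the full hierarchical network: start from a gray image and repeatedly nudge each pixel in the direction that increases both detectors' activations.

#include <cstdio>
#include <vector>

int main() {
    const int N = 16;                              // a 4x4 "image", flattened
    std::vector<double> w_house(N), w_flying(N);   // made-up linear detectors
    std::vector<double> x(N, 0.5);                 // start from a gray image
    for (int i = 0; i < N; ++i) {
        w_house[i]  = (i % 4 < 2) ?  1.0 : -0.5;   // arbitrary illustrative weights
        w_flying[i] = (i / 4 < 2) ?  0.8 : -0.8;
    }
    double step = 0.01;
    for (int iter = 0; iter < 200; ++iter)
        for (int i = 0; i < N; ++i) {
            x[i] += step * (w_house[i] + w_flying[i]);   // ascend both activations
            if (x[i] < 0) x[i] = 0;                      // keep pixels in [0,1]
            if (x[i] > 1) x[i] = 1;
        }
    double a_house = 0, a_flying = 0;
    for (int i = 0; i < N; ++i) {
        a_house  += w_house[i] * x[i];
        a_flying += w_flying[i] * x[i];
    }
    std::printf("house=%.2f flying=%.2f\n", a_house, a_flying);
    return 0;
}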

Let me know if you have any trouble implementing this.

And BTW the first 2 steps are done.
http://images.google.com/images?q=flying+house&um=1&ie=UTF-8&sa=X&oi=image_result_group&resnum=5&ct=title

-- Matt Mahoney, matmaho...@yahoo.com





Re: [agi] [WAS The Smushaby] The Logic of Creativity

2009-01-13 Thread Matt Mahoney
Mike, it's not cheating. It's called research :-)

-- Matt Mahoney, matmaho...@yahoo.com


--- On Tue, 1/13/09, Mike Tintner tint...@blueyonder.co.uk wrote:

 From: Mike Tintner tint...@blueyonder.co.uk
 Subject: Re: [agi] [WAS The Smushaby] The Logic of Creativity
 To: agi@v2.listbox.com
 Date: Tuesday, January 13, 2009, 7:38 PM
 Matt,
 
 Well little Matt, as your class teacher, in one sense
 this is quite clever of you. But you see, little Matt, when
 I gave you and the class that exercise, the idea was for you
 to show me what *you* could do - what you could produce from
 your own brain. I didn't mean you to copy someone
 else's flying house from a textbook. That's cheating
 Matt, - getting someone else to do the work for you -  and
 we don't like cheats do we? So perhaps you can go away
 and draw a flying house all by yourself - a superduper one
 with lots of fabbo new bits that no one has ever drawn
 before, and all kinds of wonderful bells and whistles, that
 will be ten times better than that silly old foto.  I know
 you can Matt, I have faith in you. And I know if you really,
 really try, you can understand the difference between
 creating your own drawing, and copying someone else's.
 Because, well frankly, Matt, every time I give you an
 exercise - ask you to write an essay, or tell me a story in
 your own words - you always, always copy from other people,
 even if you try to disguise it by copying from several
 people. Now that's not fair, is it Matt? That's not
 the American way. You have to get over this lack of
 confidence in yourself. 
 
 Matt/Mike Tintner tint...@blueyonder.co.uk wrote:
  
  Oh and just to answer Matt - if you want to keep
 doing
  narrow AI, like everyone else, then he's right
 -
  don't worry about it. Pretend it doesn't
 exist.
  Compress things :).
  
  Now, Mike, it is actually a simple problem.
  
  1. Collect about 10^8 random photos (about what we see
 in a lifetime).
  
  2. Label all the ones of houses, and all the ones of
 things flying.
  
  3. Train an image recognition system (a hierarchical
 neural network, probably 3-5 layers, 10^7 neurons, 10^11
 connections) to detect these two features. You'll need
 about 10^19 CPU operations, or about a month on a 1000 CPU
 cluster.
  
  4. Invert the network by iteratively drawing images
 that activate these two features and work down the
 hierarchy. (Should be faster than step 3). When you are
 done, you will have a picture of a flying house.
  
  Let me know if you have any trouble implementing this.
  
  And BTW the first 2 steps are done.
 
  http://images.google.com/images?q=flying+house&um=1&ie=UTF-8&sa=X&oi=image_result_group&resnum=5&ct=title
  
  -- Matt Mahoney, matmaho...@yahoo.com
  
  
  


Re: [agi] just a thought

2009-01-13 Thread Matt Mahoney
--- On Tue, 1/13/09, Valentina Poletti jamwa...@gmail.com wrote:

 Anyways my point is, the reason why we have achieved so much technology, so 
 much knowledge in this time is precisely the we, it's the union of several 
 individuals together with their ability to communicate with one-other that 
 has made us advance so much.

I agree. A machine that is 10 times as smart as a human in every way could not 
achieve much more than hiring 10 more people. In order to automate the economy, 
we have to replicate the capabilities of not one human mind, but a system of 
10^10 minds. That is why my AGI proposal is so hideously expensive.
http://www.mattmahoney.net/agi2.html

-- Matt Mahoney, matmaho...@yahoo.com





Re: [agi] initial reaction to A2I2's call center product

2009-01-12 Thread Matt Mahoney
--- On Mon, 1/12/09, Ben Goertzel b...@goertzel.org wrote:

 AGI company A2I2 has released a product for automating call
 center functionality, see...
 
 http://www.smartaction.com/index.html

It would be nice to see some transcripts of actual conversation between the 
system and customers to get some idea of how well the system actually works.

You'll notice that the contact for more information is a live human...


-- Matt Mahoney, matmaho...@yahoo.com





Re: [agi] fuzzy-probabilistic logic again

2009-01-12 Thread Matt Mahoney
--- On Mon, 1/12/09, YKY (Yan King Yin) generic.intellige...@gmail.com wrote:

 I have refined my P(Z) logic a bit.  Now the truth values are all
 unified to one type, probability distribution over Z, which has a
 pretty nice interpretation.  The new stuff are at sections 4.4.2 and
 4.4.3.
 
 http://www.geocities.com/genericai/P-Z-logic-excerpt-12-Jan-2009.pdf

Do you have any experimental results supporting your proposed probabilistic 
fuzzy logic implementation? How would you devise such an experiment (for 
example, a prediction task) to test alternative interpretations of logical 
operators like AND, OR, NOT, IF-THEN, etc.? Maybe you could manually encode 
knowledge in your system (like you did with Goldilocks) and test whether it can 
make inferences? I'd be more interested to see results on real data, however.

(Also, instead of a disclaimer about political correctness, couldn't you just 
find examples that don't reveal your obsession with sex?)

-- Matt Mahoney, matmaho...@yahoo.com




 
 I'm wondering if anyone is interested in helping me
 implement the
 logic or develop an AGI basing on it?  I have already
 written part of
 the inference engine in Lisp.
 
 Also, is anyone here working on fuzzy or probabilistic
 logics, other
 than Ben and Pei and me?
 
 YKY
 
 


Re: [agi] The Smushaby of Flatway.

2009-01-09 Thread Matt Mahoney
Mike, after a sequence of free associations, you drift from the original 
domain. How is that incompatible with the model I described? I use A, B, C, as 
variables to represent arbitrary thoughts.

-- Matt Mahoney, matmaho...@yahoo.com

--- On Fri, 1/9/09, Mike Tintner tint...@blueyonder.co.uk wrote:
From: Mike Tintner tint...@blueyonder.co.uk
Subject: Re: [agi] The Smushaby of Flatway.
To: agi@v2.listbox.com
Date: Friday, January 9, 2009, 10:08 AM



Matt,

I mainly want to lay down a marker here for a future discussion.

What you have done is what all AGI-ers/AI-ers do. Faced with the problem of
domain-switching - (I pointed out that the human brain and human thought are
*freely domain-switching*) - you have simply ignored it - and, I imagine, are
completely unaware that you have done so. And this, remember, is *the* problem
of AGI - what should be the central focus of all discussion here.

If you look at your examples, you will find that they are all *intra-domain*
and do not address domain-switching at all -

a. if you learned the associations A-B and B-C, then A will predict C. That
is called reasoning

b) a word-word matrix M from a large text corpus, ..gives you something
similar to your free association chain like rain-wet-water-...

No domain-switching there.

Compare these with my

b) domain-switching chain - COW - DOG - TAIL - CURRENT CRISIS - LOCAL VS
GLOBAL THINKING - WHAT A NICE DAY - MUST GET ON - CANT SPEND MUCH MORE TIME
ON THIS

(switching between the domains of - Animals - Politics/Economics - Weather -
Personal Timetable)

a) your (extremely limited) idea of (logical) reasoning is also entirely
intra-domain - the domain of the Alphabet (A-B-C).

But my creative and similar creative chains are analogous to switching from
say an Alphabet domain (A-B-C) to a Foreign Languages domain (alpha - omega)
to a Semiotics one (symbol - sign - representation) to a Fonts one (Courier -
Times Roman) etc. etc. - i.e. we could all easily and spontaneously form such
a domain-switching chain.

Your programs and all the programs ever written are still incapable of doing
this - switching domains. This, it bears repeating, is the problem of AGI.

Because you're ignoring it, you don't see that you're in effect maintaining an
absurdity

Re: [agi] The Smushaby of Flatway.

2009-01-09 Thread Matt Mahoney
--- On Thu, 1/8/09, Vladimir Nesov robot...@gmail.com wrote:

  I claim that K(P) > K(Q) because any description of P must include
  a description of Q plus a description of what P does for at least one other 
  input.
 
 
 Even if you somehow must represent P as concatenation of Q and
 something else (you don't need to), it's not true that always
 K(P) > K(Q). It's only true that length(P) > length(Q), and longer strings
 can easily have smaller programs that output them. If P is
 10^(10^10) symbols X, and Q is some random number of X smaller
 than 10^(10^10), it's probably K(P) < K(Q), even though Q is a
 substring of P.

Well, it is true that you can find |P| < |Q| for some cases of P nontrivially 
simulating Q depending on the choice of language. However, it is not true on 
average. It is also not possible for P to nontrivially simulate itself because 
it is a contradiction to say that P does everything that Q does and at least 
one thing that Q doesn't do if P = Q.

-- Matt Mahoney, matmaho...@yahoo.com





Re: [agi] The Smushaby of Flatway.

2009-01-08 Thread Matt Mahoney
--- On Wed, 1/7/09, Ben Goertzel b...@goertzel.org wrote:
if proving Fermat's Last theorem was just a matter of doing math, it would 
have been done 150 years ago ;-p

obviously, all hard problems that can be solved have already been solved...

???

In theory, FLT could be solved by brute force enumeration of proofs until a 
match to Wiles' is found. In theory, AGI could be solved by coding all the 
knowledge in LISP. The difference is that 50 years ago people actually expected 
the latter to work. Some people still believe so.

AGI is an engineering and policy problem. We already have small scale neural 
models of learning, language, vision, and motor control. We currently lack the 
computing power (10^16 OPS, 10^15 bits) to implement these at human levels, but 
Moore's law will take care of that.

But that is not the hard part of the problem. AGI is a system that eliminates 
our need to work, to think, and to function in the real world. Its value is USD 
10^15, the value of the global economy. Once we have the hardware, we still 
need to extract 10^18 bits of knowledge from human brains. That is the 
complexity of the global economy (assuming 10^10 people x 10^9 bits per person 
x 0.1 fraction consisting of unique job skills). This is far bigger than the 
internet. The only way to extract this knowledge without new technology like 
brain scanning is by communication at the rate of 2 bits per second per person. 
The cheapest option is a system of pervasive surveillance where everything you 
say and do is public knowledge.

AGI is too expensive for any person or group to build or own. It is a vastly 
improved internet, a communication system so efficient that the world's 
population starts to look like a single entity, and nobody notices or cares as 
silicon gradually replaces carbon.

-- Matt Mahoney, matmaho...@yahoo.com





Re: [agi] The Smushaby of Flatway.

2009-01-08 Thread Matt Mahoney
--- On Thu, 1/8/09, Mike Tintner tint...@blueyonder.co.uk wrote:

 What then do you see as the way people *do* think? You
 surprise me, Matt, because both the details of your answer
 here and your thinking generally strike me as *very*
 logicomathematical - with lots of emphasis on numbers and
 compression - yet you seem to be acknowledging here, like
 Jim,  the fundamental deficiencies of the logicomathematical
 form - and it is indeed only one form - of thinking. 

Pattern recognition in parallel, and hierarchical learning of increasingly 
complex patterns by classical conditioning (association), clustering in context 
space (feature creation), and reinforcement learning to meet evolved goals.

You can't write a first order logic expression that inputs a picture and tells 
you whether it is a cat or a dog. Yet any child can do it. Logic is great for 
abstract mathematics. We regard it as the highest form of thought, the hardest 
thing that humans can learn, yet it is the easiest problem to solve on a 
computer.

-- Matt Mahoney, matmaho...@yahoo.com





Re: [agi] The Smushaby of Flatway.

2009-01-08 Thread Matt Mahoney
--- On Thu, 1/8/09, Mike Tintner tint...@blueyonder.co.uk wrote:

 Matt,
 
 Thanks. But how do you see these:
 
 Pattern recognition in parallel, and hierarchical
 learning of increasingly complex patterns by classical
 conditioning (association), clustering in context space
 (feature creation), and reinforcement learning to meet
 evolved goals.
 
 as fundamentally different from logicomathematical
 thinking? (Reinforcement learning strikes me as
 literally extraneous and not a mode of thinking). Perhaps
 you need to explain why conditioned association is
 different.

Free association is the basic way of recalling memories. If you experience A 
followed by B, then the next time you experience A you will think of (or 
predict) B. Pavlov demonstrated this type of learning in animals in 1927. Hebb 
proposed a neural model in 1949 which has since been widely accepted. The model 
is unrelated to first order logic. It is a strengthening of the connections 
from neuron A to neuron B.
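
In code, that kind of associative strengthening is about this simple (an illustrative sketch with made-up concepts and a made-up learning rate, not a claim about actual neural parameters):

#include <cstdio>

int main() {
    const int N = 3;                          // concepts: 0=bell, 1=food, 2=walk
    double w[N][N] = {};                      // w[a][b]: strength of a -> b
    double rate = 0.1;
    int experience[][2] = {{0,1},{0,1},{0,1},{2,1}};   // (A, then B) pairs
    for (auto& e : experience)
        w[e[0]][e[1]] += rate * (1.0 - w[e[0]][e[1]]); // strengthen A->B toward 1
    std::printf("bell->food %.3f, walk->food %.3f\n", w[0][1], w[2][1]);
    return 0;
}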

-- Matt Mahoney, matmaho...@yahoo.com





Re: [agi] The Smushaby of Flatway.

2009-01-08 Thread Matt Mahoney
Mike,

Your own thought processes only seem mysterious because you can't predict what 
you will think without actually thinking it. It's not just a property of the 
human brain, but of all Turing machines. No program can non-trivially model 
itself. (By model, I mean that P models Q if for any input x, P can compute the 
output Q(x). By non-trivial, I mean that P does something else besides just 
model Q. (Every program trivially models itself). The proof is that for P to 
non-trivially model Q requires K(P) > K(Q), where K is Kolmogorov complexity, 
because P needs a description of Q plus whatever else it does to make it 
non-trivial. It is obviously not possible for K(P) > K(P)).

So if you learned the associations A-B and B-C, then A will predict C. That is 
called reasoning.

Also, each concept is associated with thousands of other concepts, not just 
A-B. If you pick the strongest associated concept not previously activated, you 
get the semi-random thought chain you describe. You can demonstrate this with a 
word-word matrix M from a large text corpus, where M[i,j] is the degree to 
which the i'th word in the vocabulary is associated with the j'th word, as 
measured by the probability of finding both words near each other in the 
corpus. Thus, M[rain,wet] and M[wet,water] have high values because the words 
often appear in the same paragraph. Traversing related words in M gives you 
something similar to your free association chain like rain-wet-water-...
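
A toy version of that traversal, with a tiny hand-made matrix standing in for real corpus statistics: from the current word, move to the most strongly associated word not yet visited.

#include <cstdio>
#include <map>
#include <set>
#include <string>

int main() {
    // M[a][b] = how often a and b appear near each other (toy counts).
    std::map<std::string, std::map<std::string, int>> M;
    M["rain"]["wet"] = 9;   M["wet"]["water"] = 8;   M["water"]["drink"] = 7;
    M["wet"]["rain"] = 9;   M["water"]["wet"] = 8;   M["drink"]["water"] = 7;
    M["rain"]["cloud"] = 4; M["drink"]["cup"] = 5;

    std::string word = "rain";
    std::set<std::string> visited = {word};
    std::printf("%s", word.c_str());
    for (int step = 0; step < 4; ++step) {
        std::string next;
        int best = 0;
        for (auto& kv : M[word])              // strongest association not yet visited
            if (!visited.count(kv.first) && kv.second > best) { next = kv.first; best = kv.second; }
        if (next.empty()) break;
        std::printf(" - %s", next.c_str());
        visited.insert(next);
        word = next;
    }
    std::printf("\n");
    return 0;
}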

-- Matt Mahoney, matmaho...@yahoo.com


--- On Thu, 1/8/09, Mike Tintner tint...@blueyonder.co.uk wrote:

 From: Mike Tintner tint...@blueyonder.co.uk
 Subject: Re: [agi] The Smushaby of Flatway.
 To: agi@v2.listbox.com
 Date: Thursday, January 8, 2009, 3:54 PM
 Matt:Free association is the basic way of recalling
 memories. If you experience A followed by B, then the next
 time you experience A you will think of (or predict) B.
 Pavlov demonstrated this type of learning in animals in
 1927.
 
 Matt,
 
 You're not thinking your argument through. Look
 carefully at my spontaneous
 
 COW - DOG - TAIL - CURRENT CRISIS - LOCAL VS
 GLOBAL
 THINKING - WHAT A NICE DAY - MUST GET ON- CANT SPEND MUCH
 MORE TIME ON
 THIS etc. etc
 
 that's not A-B association.
 
 That's 1. A-B-C  then  2. Gamma-Delta then  3.
 Languages  then  4. Number of Lines in Letters.
 
 IOW the brain is typically not only freely associating
 *ideas* but switching freely across, and connecting, 
 radically different *domains* in any given chain of
 association. [e.g above from Animals to Economics/Politics
 to Weather to Personal Timetable]
 
 It can do this partly because
 
 a) single ideas have multiple, often massively mutiple, 
 idea/domain connections in the human brain, and allow one to
 go off in any of multiple tangents/directions
 b) humans have many things - and therefore multiple domains
 - on their mind at the same time concurrently  - and can
 switch as above from the immediate subject to  some other
 pressing subject  domain (e.g. from economics/politics
 (local vs global) to the weather (what a nice day).
 
 If your A-B, everything-is-memory-recall thesis
 were true, our chains-of-thought-association would be
 largely repetitive, and the domain switches inevitable..
 
 In fact, our chains (or networks) of free association and
 domain-switching are highly creative, and each one is
 typically, from a purely technical POV, novel and
 surprising. (I have never connected TAIL and CURRENT CRISIS
 before - though Animals and Politics yes. Nor have I
 connected LOCAL VS GLOBAL THINKING before with WHAT A NICE
 DAY and the weather).
 
 IOW I'm suggesting, the natural mode of human thought -
 and our continuous streams of association - are creative.
 And achieving such creativity is the principal problem/goal
 of AGI.
 
 So maybe it's worth taking 20 secs. of time - producing
 your own chain-of-free-association starting say with
 MAHONEY  and going on for another 10 or so items
 -  and trying to figure out how the result could.possibly be
 the  narrow kind of memory-recall you're arguing for.
 It's an awful lot to ask for, but could you possibly try
 it, analyse it and report back?
 
 [Ben claims to have heard every type of argument I make
 before,  (somewhat like your A-B memory claim), so perhaps
 he can tell me where he's read before about the Freely
 Associative, Freely Domain Switching nature of human thought
 - I'd be interested to follow up on it]. 





Re: [agi] The Smushaby of Flatway.

2009-01-08 Thread Matt Mahoney
--- On Thu, 1/8/09, Vladimir Nesov robot...@gmail.com wrote:

 On Fri, Jan 9, 2009 at 12:19 AM, Matt Mahoney
 matmaho...@yahoo.com wrote:
  Mike,
 
  Your own thought processes only seem mysterious
 because you can't predict what you will think without
 actually thinking it. It's not just a property of the
 human brain, but of all Turing machines. No program can
 non-trivially model itself. (By model, I mean that P models
 Q if for any input x, P can compute the output Q(x). By
 non-trivial, I mean that P does something else besides just
 model Q. (Every program trivially models itself). The proof
 is that for P to non-trivially model Q requires K(P) >
 K(Q), where K is Kolmogorov complexity, because P needs a
 description of Q plus whatever else it does to make it
 non-trivial. It is obviously not possible for K(P) >
 K(P)).
 
 
 Matt, please stop. I even constructed an explicit
 counterexample to
 this pseudomathematical assertion of yours once. You
 don't pay enough
 attention to formal definitions: what this has a
 description means,
 and which reference TMs specific Kolmogorov complexities
 are measured
 in.

Your earlier counterexample was a trivial simulation. It simulated itself but 
did nothing else. If P did something that Q didn't, then Q would not be 
simulating P.

This applies regardless of your choice of universal TM.

I suppose I need to be more precise. I say P simulates Q if for all x, 
P("what is Q(x)?") = "Q(x)=y" iff Q(x)=y (where x and y are arbitrary strings). 
When I say that P does something else, I mean that it accepts at least one 
input not of the form "what is Q(x)?". I claim that K(P) > K(Q) because any 
description of P must include a description of Q plus a description of what P 
does for at least one other input.


-- Matt Mahoney, matmaho...@yahoo.com





Re: [agi] The Smushaby of Flatway.

2009-01-07 Thread Matt Mahoney
Logic has not solved AGI because logic is a poor model of the way people think.

Neural networks have not solved AGI because you would need about 10^15 bits of 
memory and 10^16 OPS to simulate a human brain sized network.

Genetic algorithms have not solved AGI because the computational requirements 
are even worse. You would need 10^36 bits just to model all the world's DNA, 
and even if you could simulate it in real time, it took 3 billion years to 
produce human intelligence the first time.

Probabilistic reasoning addresses only one of the many flaws of first order 
logic as a model of AGI. Reasoning under uncertainty is fine, but you haven't 
solved learning by induction, reinforcement learning, complex pattern 
recognition (e.g. vision), and language. If it was just a matter of writing the 
code, then it would have been done 50 years ago.

-- Matt Mahoney, matmaho...@yahoo.com


--- On Wed, 1/7/09, Jim Bromer jimbro...@gmail.com wrote:

 From: Jim Bromer jimbro...@gmail.com
 Subject: [agi] The Smushaby of Flatway.
 To: agi@v2.listbox.com
 Date: Wednesday, January 7, 2009, 8:23 PM
 All of the major AI paradigms, including those that are
 capable of
 learning, are flat according to my definition.  What makes
 them flat
 is that the method of decision making is
 minimally-structured and they
 funnel all reasoning through a single narrowly focused
 process that
 smushes different inputs to produce output that can appear
 reasonable
 in some cases but is really flat and lacks any structure
 for complex
 reasoning.
 
 The classic example is of course logic.  Every proposition
 can be
 described as being either True or False and any collection
 of
 propositions can be used in the derivation of a conclusion
 regardless
 of whether the input propositions had any significant
 relational
 structure that would actually have made it reasonable to
 draw the
 definitive conclusion that was drawn from them.
 
 But logic didn't do the trick, so along came neural
 networks and
 although the decision making is superficially distributed
 and can be
 thought of as being comprised of a structure of layer-like
 stages in
 some variations, the methodology of the system is really
 just as flat.
  Again anything can be dumped into the neural network and a
 single
 decision making process works on the input through a
 minimally-structured reasoning system and output is
 produced
 regardless of the lack of appropriate relative structure in
 it.  In
 fact, this lack of discernment was seen as a major
 breakthrough!
 Surprise, neural networks did not work just like the mind
 works in
 spite of the years and years of hype-work that went into
 repeating
 this slogan in the 1980's.
 
 Then came Genetic Algorithms and finally we had a system
 that could
 truly learn to improve on its previous learning and how did
 it do
 this?  It used another flat reasoning method whereby
 combinations of
 data components were processed according to one simple
 untiring method
 that was used over and over again regardless of any
 potential to see
 input as being structured in more ways than one.  Is anyone
 else
 starting to discern a pattern here?
 
 Finally we reach the next century to find that the future
 of AI has
 already arrived and that future is probabilistic reasoning!
  And how
 is probabilistic reasoning different?  Well, it can solve
 problems
 that logic, neural networks, genetic algorithms
 couldn't!  And how
 does probabilistic reasoning do this?  It uses a funnel
 minimally-structured method of reasoning whereby any input
 can be
 smushed together with other disparate input to produce a
 conclusion
 which is only limited by the human beings who strive to
 program it!
 
 The very allure of minimally-structured reasoning is that
 it works
 even in some cases where it shouldn't.  It's the
 hip hooray and bally
 hoo of the smushababies of Flatway.
 
 Jim Bromer





RE: [agi] Universal intelligence test benchmark

2008-12-30 Thread Matt Mahoney
John,
So if consciousness is important for compression, then I suggest you write two 
compression programs, one conscious and one not, and see which one compresses 
better. 

Otherwise, this is nonsense.

-- Matt Mahoney, matmaho...@yahoo.com

--- On Tue, 12/30/08, John G. Rose johnr...@polyplexic.com wrote:
From: John G. Rose johnr...@polyplexic.com
Subject: RE: [agi] Universal intelligence test benchmark
To: agi@v2.listbox.com
Date: Tuesday, December 30, 2008, 9:46 AM

If the agents were p-zombies or just not conscious they would have different
motivations.

Consciousness has properties of communication protocol and affects inter-agent
communication. The idea being it enhances agents' existence and survival. I
assume it facilitates collective intelligence, generally. For a multi-agent
system with a goal of compression or prediction the agent consciousness would
have to be catered for. So introducing -

Consciousness of X is: the idea or feeling that X is correlated with
Consciousness of X

to the agents would give them more glue if they expended that consciousness on
one another. The communications dynamics of the system would change versus a
similar non-conscious multi-agent system.

John

From: Ben Goertzel [mailto:b...@goertzel.org]
Sent: Monday, December 29, 2008 2:30 PM
To: agi@v2.listbox.com
Subject: Re: [agi] Universal intelligence test benchmark

Consciousness of X is: the idea or feeling that X is correlated with
Consciousness of X

;-)

ben g

On Mon, Dec 29, 2008 at 4:23 PM, Matt Mahoney matmaho...@yahoo.com wrote:

--- On Mon, 12/29/08, John G. Rose johnr...@polyplexic.com wrote:

  What does consciousness have to do with the rest of your argument?

 Multi-agent systems should need individual consciousness to achieve advanced
 levels of collective intelligence. So if you are programming a multi-agent
 system, potentially a compressor, having consciousness in the agents could
 have an intelligence amplifying effect instead of having non-conscious
 agents. Or some sort of primitive consciousness component since higher level
 consciousness has not really been programmed yet.

 Agree?

No. What do you mean by consciousness?

Some people use consciousness and intelligence interchangeably. If that is
the case, then you are just using a circular argument. If not, then what is
the difference?

-- Matt Mahoney, matmaho...@yahoo.com






Re: [agi] Universal intelligence test benchmark

2008-12-29 Thread Matt Mahoney
--- On Sun, 12/28/08, Philip Hunt cabala...@googlemail.com wrote:

  Please remember that I am not proposing compression as
  a solution to the AGI problem. I am proposing it as a
  measure of progress in an important component (prediction).
 
 Then why not cut out the middleman and measure prediction
 directly?

Because a compressor proves the correctness of the measurement software at no 
additional cost in either space or time complexity or software complexity. The 
hard part of compression is modeling. Arithmetic coding is essentially a solved 
problem. A decompressor uses exactly the same model as a compressor. In high 
end compressors like PAQ, the arithmetic coder takes up about 1% of the 
software, 1% of the CPU time, and less than 1% of memory.

In speech recognition research it is common to use word perplexity as a measure 
of the quality of a language model. Experimentally, it correlates well with 
word error rate. Perplexity is defined as 2^H where H is the average number of 
bits needed to encode a word. Unfortunately this is sometimes done in 
nonstandard ways, such as with restricted vocabularies and different methods of 
handling words outside the vocabulary, parsing, stemming, capitalization, 
punctuation, spacing, and numbers. Without accounting for this additional data, 
it makes published results difficult to compare. Compression removes the 
possibility of such ambiguities.
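
For concreteness, the conversion from a compression result to a perplexity figure is just the following (hypothetical numbers, not measurements from any particular model):

#include <cmath>
#include <cstdio>

int main() {
    double n_bits  = 8.0e6;              // hypothetical compressed size in bits
    double n_words = 1.0e6;              // hypothetical corpus size in words
    double H = n_bits / n_words;         // average bits per word
    double perplexity = std::pow(2.0, H);
    std::printf("H = %.2f bits/word, perplexity = %.1f\n", H, perplexity);
    return 0;
}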

-- Matt Mahoney, matmaho...@yahoo.com





Re: [agi] Universal intelligence test benchmark

2008-12-29 Thread Matt Mahoney
--- On Mon, 12/29/08, Philip Hunt cabala...@googlemail.com wrote:

 Incidently, reading Matt's posts got me interested in writing a
 compression program using Markov-chain prediction. The prediction bit
 was a piece of piss to write; the compression code is proving
 considerably more difficult.

Well, there is plenty of open source software.
http://cs.fit.edu/~mmahoney/compression/

If you want to write your own model and just need a simple arithmetic coder, 
you probably want fpaq0. Most of the other programs on this page use the same 
coder or some minor variation of it.

-- Matt Mahoney, matmaho...@yahoo.com





RE: [agi] Universal intelligence test benchmark

2008-12-29 Thread Matt Mahoney
--- On Mon, 12/29/08, John G. Rose johnr...@polyplexic.com wrote:

 Well that's a question. Does death somehow enhance a
 lifeforms' collective intelligence?

Yes, by weeding out the weak and stupid.

 Agents competing over finite resources.. I'm wondering if
 there were multi-agent evolutionary genetics going on would there be a
 finite resource of which there would be a relation to the collective goal of
 predicting the next symbol.

No, prediction is a secondary goal. The primary goal is to have a lot of 
descendants.

 Agent knowledge is not only passed on in their
 genes, it is also passed around to other agents Does agent death hinder
 advances in intelligence or enhance it? And then would the intelligence
 collected thus be applicable to the goal. And if so, consciousness may be
 valuable.

What does consciousness have to do with the rest of your argument?

-- Matt Mahoney, matmaho...@yahoo.com





RE: [agi] Universal intelligence test benchmark

2008-12-29 Thread Matt Mahoney
--- On Mon, 12/29/08, John G. Rose johnr...@polyplexic.com wrote:

  What does consciousness have to do with the rest of your argument?
  
 
 Multi-agent systems should need individual consciousness to
 achieve advanced
 levels of collective intelligence. So if you are
 programming a multi-agent
 system, potentially a compressor, having consciousness in
 the agents could
 have an intelligence amplifying effect instead of having
 non-conscious
 agents. Or some sort of primitive consciousness component
 since higher level
 consciousness has not really been programmed yet. 
 
 Agree?

No. What do you mean by consciousness?

Some people use consciousness and intelligence interchangeably. If that is 
the case, then you are just using a circular argument. If not, then what is the 
difference?

-- Matt Mahoney, matmaho...@yahoo.com





Re: [agi] [Science Daily] Our Unconscious Brain Makes The Best Decisions Possible

2008-12-29 Thread Matt Mahoney
--- On Mon, 12/29/08, Richard Loosemore r...@lightlink.com wrote:

 8-) Don't say that too loudly, Yudkowsky might hear
 you. :-)
...
 When I suggested that someone go check some of his ravings
 with an outside authority, he banned me from his discussion
 list.

Yudkowsky's side of the story might be of interest...

http://www.sl4.org/archive/0608/15895.html
http://www.sl4.org/archive/0608/15928.html

-- Matt Mahoney, matmaho...@yahoo.com


 From: Richard Loosemore r...@lightlink.com
 Subject: Re: [agi] [Science Daily] Our Unconscious Brain Makes The Best 
 Decisions Possible
 To: agi@v2.listbox.com
 Date: Monday, December 29, 2008, 4:02 PM
 Lukasz Stafiniak wrote:
 
 http://www.sciencedaily.com/releases/2008/12/081224215542.htm
  
  Nothing surprising ;-)
 
 Nothing surprising?!!
 
 8-) Don't say that too loudly, Yudkowsky might hear
 you. :-)
 
 The article is a bit naughty when it says, of Tversky and
 Kahnemann, that ...this has become conventional wisdom
 among cognition researchers.  Actually, the original
 facts were interpreted in a variety of ways, some of which
 strongly disagreed with T & K's original
 interpretation, just like this one you reference above.  The
 only thing that is conventional wisdom is that the topic
 exists, and is the subject of dispute.
 
 And, as many people know, I made the mistake of challenging
 Yudkowsky on precisely this subject back in 2006, when he
 wrote an essay strongly advocating T&K's original
 interpretation.  Yudkowsky went completely berserk, accused
 me of being an idiot, having no brain, not reading any of
 the literature, never answering questions, and generally
 being something unspeakably worse than a slime-oozing crank.
  He literally wrote an essay denouncing me as equivalent to
 a flat-earth believing crackpot.
 
 When I suggested that someone go check some of his ravings
 with an outside authority, he banned me from his discussion
 list.
 
 Ah, such are the joys of being speaking truth to power(ful
 idiots).
 
 ;-)
 
 As far as this research goes, it sits somewhere down at the
 lower end of the available theories.  My friend Mike
 Oaksford in the UK has written several papers giving a
 higher level cognitive theory that says that people are, in
 fact, doing something like bayesian estimation when then
 make judgments.  In fact, people are very good at being
 bayesians, contra the loud protests of the I Am A Bayesian
 Rationalist crowd, who think they were the first to do it.
 
 
 
 
 
 Richard Loosemore





Re: [agi] Universal intelligence test benchmark

2008-12-29 Thread Matt Mahoney
--- On Mon, 12/29/08, Philip Hunt cabala...@googlemail.com wrote:
 Am I right in understanding that the coder from fpaq0 could
 be used with any other predictor?

Yes. It has a simple interface. You have a class called Predictor which is your 
bit sequence predictor. It has 2 member functions that you have to write. p() 
should return your estimated probability that the next bit will be a 1, as a 12 
bit number (0 to 4095). update(y) then tells you what that bit actually was, a 
0 or 1. The encoder will alternately call these 2 functions for each bit of the 
sequence. The predictor doesn't know whether it is compressing or decompressing 
because it sees exactly the same sequence either way.

So the easy part is done :)
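
For example, a bare-bones order-0 Predictor written against the interface described above might look like this (a sketch only, not fpaq0's actual model; a real model would also use context such as the previous bits of the current byte):

// p() returns P(next bit = 1) scaled to 0..4095; update(y) then receives the
// actual bit. This version just keeps running counts of zeros and ones.
class Predictor {
    int n0, n1;                               // counts of 0s and 1s seen so far
public:
    Predictor() : n0(1), n1(1) {}             // start with a uniform estimate
    int p() const {                           // 12-bit probability of a 1
        return 4096 * n1 / (n0 + n1 + 1);
    }
    void update(int y) {                      // y is the bit that actually occurred
        if (y) n1++; else n0++;
        if (n0 + n1 > 65536) { n0 >>= 1; n1 >>= 1; }   // keep counts bounded, stay adaptive
    }
};

Dropping something like this into the coder should compress any input with a skewed 0/1 distribution, which is about all an order-0 model can do.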

-- Matt Mahoney, matmaho...@yahoo.com





RE: [agi] Universal intelligence test benchmark

2008-12-28 Thread Matt Mahoney
--- On Sun, 12/28/08, John G. Rose johnr...@polyplexic.com wrote:

 So maybe for improved genetic
 algorithms used for obtaining max compression there needs to be a
 consciousness component in the agents? Just an idea. I think there is
 potential for distributed consciousness inside of command line compressors
 :)

No, consciousness (as the term is commonly used) is the large set of properties 
of human mental processes that distinguish life from death, such as ability to 
think, learn, experience, make decisions, take actions, communicate, etc. It is 
only relevant as an independent concept to agents that have a concept of death 
and the goal of avoiding it. The only goal of a compressor is to predict the 
next input symbol. 

-- Matt Mahoney, matmaho...@yahoo.com





RE: [agi] Universal intelligence test benchmark

2008-12-27 Thread Matt Mahoney
--- On Sat, 12/27/08, John G. Rose johnr...@polyplexic.com wrote:

   How does consciousness fit into your compression
   intelligence modeling?
  
  It doesn't. Why is consciousness important?
  
 
 I was just prodding you on this. Many people on this list talk about the
 requirements of consciousness for AGI and I was imagining some sort of
 consciousness in one of your command line compressors :) 
 I've yet to grasp
 the relationship between intelligence and consciousness, though lately I
 think consciousness may be more of an evolutionary social thing. Home-grown
 digital intelligence, since it is a loner, may not require much
 consciousness IMO.

What we commonly call consciousness is a large collection of features that 
distinguish living human brains from dead human brains: ability to think, 
communicate, perceive, make decisions, learn, move, talk, see, etc. We only 
attach significance to it because we evolved, like all animals, to fear a large 
set of things that can kill us.

   Max compression implies hacks, kludges and a
 large decompressor.
  
  As I discovered with the large text benchmark.
  
 
 Yep and the behavior of the metrics near max theoretical
 compression is erratic I think?

It shouldn't be. There is a well defined (but possibly not computable) limit 
for each of the well defined universal Turing machines that the benchmark 
accepts (x86, C, C++, etc).

I was hoping to discover an elegant theory for AI. It didn't quite work that 
way. It seems to be a kind of genetic algorithm: make random changes to the 
code and keep the ones that improve compression.
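
In code, that loop (which in practice has a human making the edits) might be caricatured as a simple mutate-and-keep search over numeric tuning parameters rather than over source code. This is only a sketch; the evaluate callback, which stands in for "compress the benchmark and measure the output size", is an assumption, not part of any real tool:

// Mutate-and-keep tuning: make a small random change to the parameters and
// keep it only if the measured compressed size improves. A caricature of the
// process described above, with the human replaced by a random number generator.
#include <cstddef>
#include <functional>
#include <random>
#include <vector>

std::vector<int> tuneByMutation(std::vector<int> best,
                                const std::function<size_t(const std::vector<int>&)>& evaluate,
                                int iterations) {
    std::mt19937 rng(42);
    std::uniform_int_distribution<size_t> pick(0, best.size() - 1);
    size_t bestSize = evaluate(best);                 // baseline compressed size
    for (int i = 0; i < iterations; ++i) {
        std::vector<int> trial = best;
        trial[pick(rng)] += (rng() % 2 ? 1 : -1);     // one small random change
        size_t trialSize = evaluate(trial);
        if (trialSize < bestSize) {                   // keep only improvements
            best = std::move(trial);
            bestSize = trialSize;
        }
    }
    return best;
}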

-- Matt Mahoney, matmaho...@yahoo.com





RE: [agi] Universal intelligence test benchmark

2008-12-27 Thread Matt Mahoney
--- On Sat, 12/27/08, John G. Rose johnr...@polyplexic.com wrote:

 Well I think consciousness must be some sort of out of band intelligence
 that bolsters an entity in terms of survival. Intelligence probably
 stratifies or optimizes in zonal regions of similar environmental
 complexity, consciousness being one or an overriding out-of-band one...

No, consciousness only seems mysterious because human brains are programmed 
that way. For example, I should logically be able to convince you that pain 
is just a signal that reduces the probability of you repeating whatever actions 
immediately preceded it. I can't do that because emotionally you are convinced 
that pain is real. Emotions can't be learned the way logical facts can, so 
emotions always win. If you could accept the logical consequences of your brain 
being just a computer, then you would not pass on your DNA. That's why you 
can't.

BTW the best I can do is believe both that consciousness exists and 
consciousness does not exist. I realize these positions are inconsistent, and I 
leave it at that.

  I was hoping to discover an elegant theory for AI. It didn't quite work
  that way. It seems to be a kind of genetic algorithm: make random
  changes to the code and keep the ones that improve compression.
  
 
 Is this true for most data? For example would PI digit compression attempts
 result in genetic emergences the same as say compressing environmental
 noise? I'm just speculating that genetically originated data would require
 compression avenues of similar algorithmic complexity descriptors, for
 example PI digit data does not originate genetically so compression attempts
 would not show genetic emergences as chained as, say, environmental
 noise... basically I'm asking if you can tell the difference between data that
 has a genetic origination ingredient versus all non-genetic...

No, pi can be compressed to a simple program whose size is dominated by the log 
of the number of digits you want.

For text, I suppose I should be satisfied that a genetic algorithm compresses 
it, except for the fact that so far the algorithm requires a human in the loop, 
so it doesn't solve the AI problem.

-- Matt Mahoney, matmaho...@yahoo.com





Re: [agi] Universal intelligence test benchmark

2008-12-27 Thread Matt Mahoney
1 FC4A5294A52000
1 FC4A473E25239F1291C000
1 FC4A0941282504A09400
1 FC4A00

The best compressors will compress this data to just under 3 MB, which implies 
an average algorithmic complexity of less than 24 bits per string. However, the 
language allows the construction of arbitrary 128 bit strings in a fairly 
straightforward manner.
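
(For scale, if the full set has the 10^6 strings described elsewhere in this thread, and reading MB as 10^6 bytes: 3 MB is 2.4 x 10^7 bits, and 2.4 x 10^7 bits / 10^6 strings = 24 bits per string on average.)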

-- Matt Mahoney, matmaho...@yahoo.com







RE: [agi] Universal intelligence test benchmark

2008-12-26 Thread Matt Mahoney
--- On Fri, 12/26/08, John G. Rose johnr...@polyplexic.com wrote:

 Human memory storage may be lossy compression and recall may be
 decompression. Some very rare individuals remember every
 day of their life
 in vivid detail, not sure what that means in terms of
 memory storage.

Human perception is a form of lossy compression which has nothing to do with 
the lossless compression that I use to measure prediction accuracy. Many 
lossless compressors use lossy filters too. A simple example is an order-n 
context where we discard everything except the last n symbols.
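
As a sketch of that kind of lossy filter (illustrative only; the hashing step and constants are just one common way to bound table size, not anything from a specific compressor):

// Order-n context: the model's "memory" of the input is only the last n bytes;
// everything older is discarded -- a lossy map from the whole history to a
// small state, used here to index a table of bit predictors.
#include <cstddef>
#include <cstdint>
#include <deque>

class OrderNContext {
    std::size_t n;                    // how many recent bytes to keep
    std::deque<uint8_t> last;         // the last n bytes, oldest first
public:
    explicit OrderNContext(std::size_t order) : n(order) {}
    void add(uint8_t byte) {          // call once per input byte
        last.push_back(byte);
        if (last.size() > n) last.pop_front();    // forget everything older
    }
    uint32_t index(std::size_t tableSize) const { // map the context to a table slot
        uint32_t h = 2166136261u;                 // FNV-1a style hash
        for (uint8_t b : last) { h ^= b; h *= 16777619u; }
        return uint32_t(h % tableSize);
    }
};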

 How does consciousness fit into your compression
 intelligence modeling?

It doesn't. Why is consciousness important?

 Max compression implies hacks, kludges and a large decompressor. 

As I discovered with the large text benchmark.

-- Matt Mahoney, matmaho...@yahoo.com





Spatial indexing (was Re: [agi] Universal intelligence test benchmark)

2008-12-26 Thread Matt Mahoney
--- On Fri, 12/26/08, J. Andrew Rogers and...@ceruleansystems.com wrote:

 For example, there is no general indexing algorithm
 described in computer science.

Which was my thesis topic and is the basis of my AGI design.
http://www.mattmahoney.net/agi2.html

(I wanted to do my dissertation on AI/compression, but funding issues got in 
the way).

Distributed indexing is critical to an AGI design consisting of a huge number 
of relatively dumb specialists and an infrastructure for getting messages to 
the right ones. In my thesis, I proposed a vector space model where messages 
are routed in O(n) time over n nodes. The problem is that the number of 
connections per node has to be on the order of the number of dimensions in the 
search space. For text, that is about 10^5.

There are many other issues, of course, such as fault tolerance, security and 
ownership issues. There has to be an economic incentive to contribute knowledge 
and computing resources, because it is too expensive for anyone to own it.

 The human genome size has no meaningful relationship to the
 complexity of coding AGI.

Yes it does. It is an upper bound on the complexity of a baby.

-- Matt Mahoney, matmaho...@yahoo.com





Re: [agi] Universal intelligence test benchmark

2008-12-26 Thread Matt Mahoney
--- On Fri, 12/26/08, Ben Goertzel b...@goertzel.org wrote:

 IMO the test is *too* generic  ...

Hopefully this work will lead to general principles of learning and prediction 
that could be combined with more specific techniques. For example, a common way 
to compress text is to encode it with one symbol per word and feed the result 
to a general purpose compressor. Generic compression should improve the back 
end.
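
A sketch of that word-level front end (the names here are made up; real dictionary preprocessors also deal with capitalization, punctuation and dictionary size limits, which this ignores):

// Replace each whitespace-delimited word with a small integer symbol; the
// resulting symbol stream is what gets handed to a general-purpose back-end
// compressor.
#include <cstdint>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

std::vector<uint32_t> wordsToSymbols(const std::string& text,
                                     std::unordered_map<std::string, uint32_t>& dict) {
    std::vector<uint32_t> symbols;
    std::istringstream in(text);
    std::string word;
    while (in >> word) {
        auto it = dict.find(word);
        if (it == dict.end())                                     // first occurrence:
            it = dict.emplace(word, uint32_t(dict.size())).first; // assign next free id
        symbols.push_back(it->second);
    }
    return symbols;    // feed this (plus the dictionary) to the back-end compressor
}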

My concern is the data is not generic enough. A string has an algorithmic 
complexity that is independent of language up to a small constant, but in 
practice that constant (the algorithmic complexity of the compiler) can be much 
larger than the string. I have not been able to find a good solution to this 
problem. I realize there are some very simple, Turing-complete systems, such as 
a 2 state machine with a 3 symbol alphabet, and a 6 state binary machine, as 
well as various cellular automata (like rule 110). The problem is that 
programming simple machines often requires long programs to do simple things. 
For example, it is difficult to find a simple language where the smallest 
program to output 100 zero bits is shorter than 100 bits. Existing languages 
and instruction sets tend to be complex and ad-hoc in order to allow 
programmers to be expressive.

-- Matt Mahoney, matmaho...@yahoo.com





Re: [agi] Universal intelligence test benchmark

2008-12-26 Thread Matt Mahoney
--- On Fri, 12/26/08, Philip Hunt cabala...@googlemail.com wrote:

  Humans are very good at predicting sequences of
  symbols, e.g. the next word in a text stream.
 
 Why not have that as your problem domain, instead of text
 compression?

That's the same thing, isn't it?

 While you're at it you may want to change the size of the chunks in
 each item of prediction, from characters to either strings or
 s-expressions. Though doing so doesn't fundamentally alter the
 problem.

In the generic test, the fundamental units are bits. It's not entirely suitable 
for most existing compressors, which tend to be byte oriented. But they are 
only byte oriented because a lot of data is structured that way. In general, it 
doesn't need to be.

-- Matt Mahoney, matmaho...@yahoo.com





Re: Spatial indexing (was Re: [agi] Universal intelligence test benchmark)

2008-12-26 Thread Matt Mahoney
--- On Sat, 12/27/08, Matt Mahoney matmaho...@yahoo.com wrote:

 In my thesis, I proposed a vector space model where
 messages are routed in O(n) time over n nodes.

Oops, O(log n).

-- Matt Mahoney, matmaho...@yahoo.com





[agi] Universal intelligence test benchmark

2008-12-23 Thread Matt Mahoney
I have been developing an experimental test set along the lines of Legg and 
Hutter's universal intelligence ( 
http://www.idsia.ch/idsiareport/IDSIA-04-05.pdf ). They define general 
intelligence as the expected reward of an AIXI agent in a Solomonoff 
distribution of environments (simulated by random Turing machines). AIXI is 
essentially a compression problem (find the shortest program consistent with 
the interaction so far). Thus, my benchmark is a large number (10^6) of small 
strings (1-32 bytes) generated by random Turing machines. The benchmark is 
here: http://cs.fit.edu/~mmahoney/compression/uiq/

I believe I have solved the technical issues related to experimental 
uncertainty and ensuring the source is cryptographically random. My goal was to 
make it an open benchmark with verifiable results while making it impossible to 
hard-code any knowledge of the test data into the agent. Other benchmarks solve 
this problem by including the decompressor size in the measurement, but my 
approach makes this unnecessary. However, I would appreciate any comments.

A couple of issues arose in designing the benchmark. One is that compression 
results are highly dependent on the choice of universal Turing machine, even 
though all machines are theoretically equivalent. The problem is that even 
though any machine can simulate any other by appending a compiler or 
interpreter, this small constant is significant in practice where the 
complexity of the programs is already small. I tried to create a simple but 
expressive language based on a 2 tape machine (working plus output, both one 
sided and binary) and an instruction set that outputs a bit with each 
instruction. There are, of course, many options. I suppose I could use an 
experimental approach of finding languages that rank compressors in the same 
order as other benchmarks. But there doesn't seem to be a guiding principle.

Also, it does not seem even possible to sample a Solomonoff distribution. Legg 
proved in http://arxiv.org/abs/cs.AI/0606070 that there are strings that are 
hard to learn, but that the time to create them grows as fast as the busy 
beaver problem. Of course I can't create such strings in my benchmark. I can 
create algorithmically complex sources, but they are necessarily easy to learn 
(for example, 100 random bits followed by all zero bits).

Is it possible to test the intelligence of an agent without having at least as 
much computing power? Legg's paper seems to say no.

-- Matt Mahoney, matmaho...@yahoo.com




Re: FW: [agi] Lamarck Lives!(?)

2008-12-11 Thread Matt Mahoney
--- On Thu, 12/11/08, Eric Burton brila...@gmail.com wrote:

 You can see how genetic memory encoding opens the door to
 acquired phenotype changes over an organism's life, though, and those
 could become communicable. I think Lysenko was onto something like
 this. Let us hope all those Soviet farmers wouldn't have just starved!
 ;3

No, apparently you didn't understand anything I wrote.

Please explain how the memory encoded separately as one bit each in 10^11 
neurons through DNA methylation (the mechanism for cell differentiation, not 
genetic changes) is all collected together and encoded into genetic changes in 
a single egg or sperm cell, and back again to the brain when the organism 
matures.

And please explain why you think that Lysenko's work should not have been 
discredited. http://en.wikipedia.org/wiki/Trofim_Lysenko

-- Matt Mahoney, matmaho...@yahoo.com


 On 12/11/08, Matt Mahoney matmaho...@yahoo.com
 wrote:
  --- On Thu, 12/11/08, Eric Burton
 brila...@gmail.com wrote:
 
  It's all a big vindication for genetic memory,
 that's for certain. I
  was comfortable with the notion of certain
 templates, archetypes,
  being handed down as aspects of brain design via
 natural selection,
  but this really clears the way for organisms'
 life experiences to
  simply be copied in some form to their offspring.
 DNA form!
 
  No it's not.
 
  1. There is no experimental evidence that learned
 memories are passed to
  offspring in humans or any other species.
 
  2. If memory is encoded by DNA methylation as proposed
 in
 
 http://www.newscientist.com/article/mg20026845.000-memories-may-be-stored-on-your-dna.html
  then how is the memory encoded in 10^11 separate
 neurons (not to mention
  connectivity information) transferred to a single egg
 or sperm cell with
  less than 10^5 genes? The proposed mechanism is to
 activate one gene and
  turn off another -- 1 or 2 bits.
 
  3. The article at
 http://www.technologyreview.com/biomedicine/21801/ says
  nothing about where memory is encoded, only that
 memory might be enhanced by
  manipulating neuron chemistry. There is nothing
 controversial here. It is
  well known that certain drugs affect learning.
 
  4. The memory mechanism proposed in
 
 http://www.ncbi.nlm.nih.gov/pubmed/16822969?ordinalpos=14itool=EntrezSystem2.PEntrez.Pubmed.Pubmed_ResultsPanel.Pubmed_DefaultReportPanel.Pubmed_RVDocSum
  is distinct from (2). It proposes protein regulation
 at the mRNA level near
  synapses (consistent with the Hebbian model) rather
 than DNA in the nucleus.
  Such changes could not make their way back to the
 nucleus unless there was a
  mechanism to chemically distinguish the tens of
 thousands of synapses and
  encode this information, along with the connectivity
 information (about 10^6
  bits per neuron) back to the nuclear DNA.
 
  Last week I showed how learning could occur in neurons
 rather than synapses
  in randomly and sparsely connected neural networks
 where all of the outputs
  of a neuron are constrained to have identical weights.
 The network is
  trained by tuning neurons toward excitation or
 inhibition to reduce the
  output error. In general an arbitrary X to Y bit
  binary function with N = Y*2^X bits of complexity can be
  learned using about 1.5N to 2N neurons with ~N^(1/2)
  synapses each and ~N log N training cycles. As
 an example I posted a
  program that learns a 3 by 3 bit multiplier in about
 20 minutes on a PC
  using 640 neurons with 36 connections each.
 
  This is slower than Hebbian learning by a factor of
 O(N^1/2) on sequential
  computers, as well as being inefficient because sparse
 networks cannot be
  simulated efficiently using typical vector processing
 parallel hardware or
  memory optimized for sequential access. However this
 architecture is what we
  actually observe in neural tissue, which nevertheless
 does everything in
  parallel. The presence of neuron-centered learning
 does not preclude Hebbian
  learning occurring at the same time (perhaps at a
 different rate). However,
  the number of neurons (10^11) is much closer to
 Landauer's estimate of human
  long term memory capacity (10^9 bits) than the number
 of synapses (10^15).
 
  However, I don't mean to suggest that memory in
 either form can be
  inherited. There is no biological evidence for such a
 thing.
 
  -- Matt Mahoney, matmaho...@yahoo.com





Re: FW: [agi] Lamarck Lives!(?)

2008-12-11 Thread Matt Mahoney
--- On Thu, 12/11/08, Eric Burton brila...@gmail.com wrote:

 I don't think that each inheritor receives a full set of the
 original's memories. But there may have *evolved* in spite of the
 obvious barriers, a means of transferring primary or significant
 experience from one organism to another in genetic form...
 we can imagine such a thing given this news!

Well, we could, if there was any evidence whatsoever for Lamarckian evolution, 
and if we thought with our reproductive organs.

To me, it suggests that AGI could be implemented with a 10^4 speedup over whole 
brain emulation -- maybe. Is it possible to emulate a sparse neural network 
with 10^11 adjustable neurons and 10^15 fixed, random connections using a 
non-sparse neural network with 10^11 adjustable connections?

-- Matt Mahoney, matmaho...@yahoo.com





Re: FW: [agi] Lamarck Lives!(?)

2008-12-11 Thread Matt Mahoney
--- On Thu, 12/11/08, Eric Burton brila...@gmail.com wrote:

 I don't know how you derived the value 10^4, Matt, but that seems
 reasonable to me. Terren, let me go back to the article and try to
 understand what exactly it says is happening. Certainly that's my
 editorial's crux

A simulation of a neural network with 10^15 synapses requires 10^15 operations 
to update the activation levels of the neurons. If we assume 100 ms resolution, 
that is 10^16 operations per second.

If memory is stored in neurons rather than synapses, as suggested in the 
original paper (see http://www.cell.com/neuron/retrieve/pii/S0896627307001420 ) 
then the brain has a memory capacity of at most 10^11 bits, which could be 
simulated by a neural network with 10^11 connections (or 10^12 operations per 
second).

This assumes that (1) the networks are equivalent and (2) that there isn't any 
secondary storage in synapses in addition to neurons. The program I posted last 
week was intended to show (1). However (2) has not been shown. The fact that 
DNA methylation occurs in the cortex does not exclude the possibility of more 
than one memory mechanism. As a counter argument, the cortex has about 10^4 
times as much storage as the hippocampus (10^4 days vs. 1 day), but is not 10^4 
times larger.

-- Matt Mahoney, matmaho...@yahoo.com





Re: [agi] Machine Knowledge and Inverse Machine Knowledge...

2008-12-09 Thread Matt Mahoney
Steve, the difference between Cyc and Dr. Eliza is that Cyc has much more 
knowledge. Cyc has millions of rules. The OpenCyc download is hundreds of MB 
compressed. Several months ago you posted the database file for Dr. Eliza. I 
recall it was a few hundred rules and I think under 1 MB. Both of these 
databases are far too small for AGI because neither has solved the learning 
problem.

 -- Matt Mahoney, [EMAIL PROTECTED]





From: Steve Richfield [EMAIL PROTECTED]
To: agi@v2.listbox.com
Sent: Tuesday, December 9, 2008 3:06:08 AM
Subject: [agi] Machine Knowledge and Inverse Machine Knowledge...


Larry Lefkowitz, Stephen Reed, et al,
 
First, thanks Steve for your pointer to Larry Lefkowitz, and thanks Larry for 
so much time and effort in trying to relate our two approaches.
 
After discussions with Larry Lefkowitz of Cycorp, I have had a bit of an 
epiphany regarding machine knowledge that I would like to share for all to 
comment on...
 
First, it wasn't as though there were points of incompatibility between 
Cycorp's idea of machine knowledge and that used in DrEliza.com, but rather, 
there were no apparent points of connection. How could two related things be so 
completely different, especially when both are driven by the real world?
 
Then it struck me. Cycorp and others here on this forum seek to represent the 
structures of real world domains in a machine, whereas Dr. Eliza seeks only to 
represent the structure of the malfunctions within structures, while making no 
attempt whatever to represent the structures in which those malfunctions occur, 
as though those malfunctions have their very own structure, as they truly do. 
This seems a bit like simulating the holes in a semiconductor.
 
OF COURSE there were no points of connection.
 
Larry pointed out the limitations in my approach - which I already knew, 
namely, Dr. Eliza will NEVER EVER understand normal operation when all it has 
to go on are ABnormalities.
 
Similarly, I pointed out that Cycorp's approach had the inverse problem, in 
that it would probably take the quadrillion dollars that Matt Mahoney keeps 
talking about to ever understand malfunctions starting from the wrong side (as 
seen from Dr. Eliza's viewpoint) of things.
 
In short, I see both of these as being quite valid but completely incompatible 
approaches, that accomplish very different things via very different methods. 
Each could move toward the other's capabilities given infinite resources, but 
only a madman (like Matt Mahoney?) would ever throw money at such folly. 
 
Back to my reason for contacting Cycorp - to see if some sort of web standard 
to represent metadata could be hammered out. Neither Larry nor I could see how 
Dr. Eliza's approach could be adapted to Cycorp, and further, this is aside 
from Cycorp's present interests. Hence, I am on my own here.
 
Hence, it is my present viewpoint that I should proceed with my present 
standard to accompany the only semi-commercial program that models malfunctions 
rather than the real world, somewhat akin to the original Eliza program. 
However, I should prominently label the standard and appropriate fields therein 
appropriately so that there is no future confusion between machine knowledge 
and Dr. Eliza's sort of inverse machine knowledge.
 
Any thoughts?
 
Steve Richfield
 


 

