[agi] Remembering Caught in the Act

2008-09-05 Thread Brad Paulsen

http://www.nytimes.com/2008/09/05/science/05brain.html?_r=3partner=rssnytemc=rssoref=sloginoref=sloginoref=slogin

or, indirectly,

http://science.slashdot.org/article.pl?sid=08/09/05/0138237from=rss




Re: [agi] Remembering Caught in the Act

2008-09-05 Thread Bob Mottram
As the article says, this has long been suspected but until now hadn't
been demonstrated.  Edelman was describing the same phenomenon as the
"remembered present" well over a decade ago, and his idea seems to have
been loosely inspired by ideas from Freud and James.

Remembering seems to be an act of reassembly of stored percepts.  If
you can intervene and tinker around with the reassembly process you
can alter how people remember things, and this seems to be the essence
of psychoanalysis.




Re: [agi] Remembering Caught in the Act

2008-09-05 Thread Kaj Sotala
On Fri, Sep 5, 2008 at 11:21 AM, Brad Paulsen [EMAIL PROTECTED] wrote:
 http://www.nytimes.com/2008/09/05/science/05brain.html?_r=3partner=rssnytemc=rssoref=sloginoref=sloginoref=slogin

http://www.sciencemag.org/cgi/content/short/1164685 for the original study.




Re: [agi] Remembering Caught in the Act

2008-09-05 Thread Mike Tintner
Er, sorry - my question is answered in the interesting Slashdot thread 
(thanks again):


"Past studies have shown how many neurons are involved in a single, simple 
memory. Researchers might be able to isolate a few single neurons in the 
process of summoning a memory, but that is like saying that they have 
isolated a few water molecules in the runoff of a giant hydroelectric dam. 
The practical utility of this is highly questionable."  (and much more... 
good thread) 







Re: [agi] A NewMetaphor for Intelligence - the Computer/Organiser

2008-09-05 Thread Mike Tintner

OK, I'll bite: what's nondeterministic programming if not a contradiction?

Again - v. briefly - it's a reality - nondeterministic programming is a 
reality, so there's no material, mechanistic, software problem in getting a 
machine to decide either way. The only problem is a logical one of doing it 
for sensible reasons. And that's the long part - there is a continuous 
stream of sensible reasons, as there is for current nondeterministic 
computer choices.


Yes, strictly, a nondeterministic *program* can be regarded as a 
contradiction - i.e. a structured *series* of instructions to "decide 
freely". The way the human mind is programmed is that we are not only free 
to, and have to, *decide* either way about certain decisions, but we are 
also free to *think* about it - i.e. to decide metacognitively whether and 
how we decide at all - we continually decide, for example, to put off the 
decision till later.


So the simple reality of being as free to decide and think as you are is 
that when you sit down to engage in any task, like write a post, essay, or 
have a conversation, or almost literally anything, there is no guarantee 
that you will start, or continue to the 2nd, 3rd, 4th step, let alone 
complete it. You may jack in your post more or less immediately.  This is at 
once the bane and the blessing of your life, and why you have such 
extraordinary problems finishing so many things. Procrastination.


By contrast, all deterministic/programmed machines and computers are 
guaranteed to complete any task they begin. (Zero procrastination or 
deviation). Very different kinds of machines to us. Very different paradigm. 
(No?)


I would say then that the human mind is strictly not so much 
nondeterministically programmed as briefed. And that's how an AGI will 
have to function. 







Re: [agi] open models, closed models, priors

2008-09-05 Thread Pei Wang
On Thu, Sep 4, 2008 at 11:17 PM, Abram Demski [EMAIL PROTECTED] wrote:
 Pei,

 I sympathize with your care in wording, because I'm very aware of the
 strange meaning that the word model takes on in formal accounts of
 semantics. While a cognitive scientist might talk about a person's
 model of the world, a logician would say that the world is a model
 of a first-order theory. I do want to avoid the second meaning. But,
 I don't think I could fare well by saying system instead, because
 the models are only a part of the larger system... so I'm not sure
 there is a word that is both neutral and sufficiently meaningful.

Yes, the first usage of "model" is less evil than the second, though
it still carries the sense of representing the world as it is and
building a one-to-one mapping between the symbols and the objects.
As I write in the draft, it is better to take knowledge as a
representation of the experience of the system, after summarization
and organization.

 Do you think it is impossible to apply probability to open
 models/theories/systems, or merely undesirable?

Well, applying probability can be done in many ways. What I have
argued (e.g., in
http://nars.wang.googlepages.com/wang.bayesianism.pdf) is that if a
system is open to new information and works in real time, it is
practically impossible to maintain a (consistent) probability
distribution among its beliefs --- incremental revision is not
supported by the theory, and re-building the distribution from raw
data is not affordable. It only works on toy problems and cannot scale
up.
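
As a toy illustration of the scaling point (my sketch, not Pei Wang's formal
argument): if global consistency is enforced by keeping a full joint
distribution over the system's beliefs, the table itself becomes unmanageable
as beliefs are added, and admitting a new proposition is a global rebuild
rather than a local revision.

from itertools import product

# Toy sketch (not NARS, not a full Bayesian system): a globally consistent
# joint distribution over n binary beliefs needs one number per truth
# assignment, i.e. 2**n entries.
def uniform_joint(n):
    return {bits: 1.0 / 2 ** n for bits in product((True, False), repeat=n)}

for n in (10, 20, 30):
    print(f"{n} beliefs -> {2 ** n:,} joint entries to keep consistent")

# Admitting an extra proposition doubles the table: every existing entry is
# split and renormalized, a global operation rather than an incremental
# revision of a single belief.
joint10 = uniform_joint(10)
joint11 = {bits + (value,): p / 2 for bits, p in joint10.items()
           for value in (True, False)}
print(len(joint10), "->", len(joint11))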

Pei




Re: [agi] Remembering Caught in the Act

2008-09-05 Thread Bob Mottram
2008/9/5 Mike Tintner [EMAIL PROTECTED]:
 Past studies have shown how many neurons are involved in a single, simple
 memory. Researchers might be able to isolate a few single neurons in the
 process of summoning a memory, but that is like saying that they have
 isolated a few water molecules in the runoff of a giant hydroelectric dam.
 The practical utility of this is highly questionable.  (and much more..
 good thread)

It's true that there isn't much practical utility to be gained from
this from the standpoint of designing more intelligent systems.  Most
people with any interest in neuroscience already thought this to be
the case anyway.

There's still much more to be learned about how the various stored
percepts are reintegrated into a coherent memory or a conscious
introspection.  It's believed that the thalamus acts as a kind of
central switching hub, dynamically recruiting neural assemblies from
diverse areas of the cortex - what Edelman called the "dynamic core" -
although the exact details of how this works remain to be
characterised.




Re: [agi] A NewMetaphor for Intelligence - the Computer/Organiser

2008-09-05 Thread William Pearson
2008/9/5 Mike Tintner [EMAIL PROTECTED]:
 By contrast, all deterministic/programmed machines and computers are
 guaranteed to complete any task they begin.

If only such could be guaranteed! We would never have system hangs or
deadlocks. Even if it could be made so, computer systems would not
always want to do so. Have you ever had a programmed computer system
say to you, "This program is not responding, do you wish to terminate
it?" There is no reason in principle why the decision to terminate the
program couldn't be made automatically.

 (Zero procrastination or
 deviation).

Multi-tasking systems deviate all the time...

 Very different kinds of machines to us. Very different paradigm.
 (No?)

We commonly talk about single-program systems because they are
generally interesting, and can be analysed simply. My discussion on
self-modifying systems ignored the interrupt-driven, multi-tasking
nature of the system I want to build, because that makes analysis a
lot harder. I will still be building an interrupt-driven,
multi-tasking system.

  Will Pearson




Re: [agi] A NewMetaphor for Intelligence - the Computer/Organiser

2008-09-05 Thread Mike Tintner

MT:By contrast, all deterministic/programmed machines and computers are

guaranteed to complete any task they begin.


Will:If only such could be guaranteed! We would never have system hangs,
dead locks. Even if it could be made so, computer systems would not
always want to do so.

Will,

That's a legalistic, not a valid, objection (although heartfelt!). In the 
above case, the computer is guaranteed to hang - and it does, strictly, 
complete its task.


What's happened is that you have had imperfect knowledge of the program's 
operations. Had you known more, you would have known that it would hang.


Were your computer like a human mind, it would have been able to say (as 
you/we all do) - "well, if that part of the problem is going to be difficult, 
I'll ignore it"... or "I'll just make up an answer"... or "by God, I'll keep 
trying other ways until I do solve this"... or... or... 
Computers, currently, aren't free thinkers. 







Language modeling (was Re: [agi] draft for comment)

2008-09-05 Thread Matt Mahoney
--- On Thu, 9/4/08, Pei Wang [EMAIL PROTECTED] wrote:

 I guess you still see NARS as using model-theoretic
 semantics, so you
 call it symbolic and contrast it with system
 with sensors. This is
 not correct --- see
 http://nars.wang.googlepages.com/wang.semantics.pdf and
 http://nars.wang.googlepages.com/wang.AI_Misconceptions.pdf

I mean NARS is symbolic in the sense that you write statements in Narsese like 
raven --> bird <0.97, 0.92> (probability=0.97, confidence=0.92). I realize 
that the meanings of "raven" and "bird" are determined by their relations to 
other symbols in the knowledge base and that the probability and confidence 
change with experience. But in practice you are still going to write statements 
like this because it is the easiest way to build the knowledge base. You aren't 
going to specify the brightness of millions of pixels in a vision system in 
Narsese, and there is no mechanism I am aware of to collect this knowledge from 
a natural language text corpus. There is no mechanism to add new symbols to the 
knowledge base through experience. You have to explicitly add them.

 You have made this point on CPU power several
 times, and I'm still
 not convinced that the bottleneck of AI is hardware
 capacity. Also,
 there is no reason to believe an AGI must be designed in a
 biologically plausible way.

Natural language has evolved to be learnable on a massively parallel network of 
slow computing elements. This should be apparent when we compare successful 
language models with unsuccessful ones. Artificial language models usually 
consist of tokenization, parsing, and semantic analysis phases. This does not 
work on natural language because artificial languages have precise 
specifications and natural languages do not. No two humans use exactly the same 
language, nor does the same human at two points in time. Rather, language is 
learnable by example, so that each message causes the language of the receiver 
to be a little more like that of the sender.

Children learn semantics before syntax, which is the opposite order from which 
you would write an artificial language interpreter. An example of a successful 
language model is a search engine. We know that most of the meaning of a text 
document depends only on the words it contains, ignoring word order. A search 
engine matches the semantics of the query with the semantics of a document 
mostly by matching words, but also by matching semantically related words like 
water to wet.

Here is an example of a computationally intensive but biologically plausible 
language model. A semantic model is a word-word matrix A such that A_ij is the 
degree to which words i and j are related, which you can think of as the 
probability of finding i and j together in a sliding window over a huge text 
corpus. However, semantic relatedness is a fuzzy identity relation, meaning it 
is reflexive, symmetric, and transitive. If i is related to j and j to k, 
then i is related to k. Deriving transitive relations in A, also known as 
latent semantic analysis, is performed by singular value decomposition, 
factoring A = USV where S is diagonal, then discarding the small terms of S, 
which has the effect of lossy compression. Typically, A has about 10^6 elements 
and we keep only a few hundred elements of S. Fortunately there is a parallel 
algorithm that incrementally updates the matrices as the system learns: a 3 
layer neural network where S is the hidden layer (which can grow) and U and V 
are weight matrices [1].
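
For concreteness, here is a minimal batch sketch of that idea in Python; the
tiny corpus, window radius, and rank below are illustrative placeholders, and
the incremental Hebbian variant described in [1] is not shown.

import numpy as np

# Minimal sketch of latent semantic analysis on a word-word co-occurrence
# matrix. Corpus, window radius and rank are placeholders.
corpus = "the rain made the ground wet because water falls as rain".split()
window = 2   # sliding-window radius
rank = 2     # number of singular values kept (the lossy compression step)

vocab = sorted(set(corpus))
index = {w: i for i, w in enumerate(vocab)}

# A_ij ~ how often words i and j occur within the same window.
A = np.zeros((len(vocab), len(vocab)))
for pos, w in enumerate(corpus):
    for other in corpus[max(0, pos - window):pos + window + 1]:
        if other != w:
            A[index[w], index[other]] += 1.0

# Factor A = U S V^T and discard the small singular values.
U, S, Vt = np.linalg.svd(A)
A_reduced = (U[:, :rank] * S[:rank]) @ Vt[:rank, :]

# After the reduction, words that never co-occurred directly can acquire
# nonzero relatedness via shared neighbors: the transitive, "latent"
# relations described above.
print(np.round(A_reduced, 2))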

Traditional language processing has failed because the task of converting 
natural language statements like "ravens are birds" to formal language is 
itself an AI problem. It requires humans who have already learned what ravens 
are and how to form and recognize grammatically correct sentences so they 
understand all of the hundreds of ways to express the same statement. You have 
to have human-level understanding of the logic to realize that "ravens are 
coming" doesn't mean ravens --> coming. If you solve the translation problem, 
then you must have already solved the natural language problem. You can't take a 
shortcut directly to the knowledge base, tempting as it might be. You have to 
learn the language first, going through all the childhood stages. I would have 
hoped we had learned a lesson from Cyc.

1. Gorrell, Genevieve (2006), Generalized Hebbian Algorithm for Incremental 
Singular Value Decomposition in Natural Language Processing, Proceedings of 
EACL 2006, Trento, Italy.
http://www.aclweb.org/anthology-new/E/E06/E06-1013.pdf

-- Matt Mahoney, [EMAIL PROTECTED]






Re: Language modeling (was Re: [agi] draft for comment)

2008-09-05 Thread Pei Wang
On Fri, Sep 5, 2008 at 11:15 AM, Matt Mahoney [EMAIL PROTECTED] wrote:
 --- On Thu, 9/4/08, Pei Wang [EMAIL PROTECTED] wrote:

 I guess you still see NARS as using model-theoretic
 semantics, so you
 call it symbolic and contrast it with system
 with sensors. This is
 not correct --- see
 http://nars.wang.googlepages.com/wang.semantics.pdf and
 http://nars.wang.googlepages.com/wang.AI_Misconceptions.pdf

 I mean NARS is symbolic in the sense that you write statements in Narsese 
 like raven --> bird <0.97, 0.92> (probability=0.97, confidence=0.92). I 
 realize that the meanings of raven and bird are determined by their 
 relations to other symbols in the knowledge base and that the probability and 
 confidence change with experience. But in practice you are still going to 
 write statements like this because it is the easiest way to build the 
 knowledge base.

Yes.

 You aren't going to specify the brightness of millions of pixels in a vision 
 system in Narsese, and there is no mechanism I am aware of to collect this 
 knowledge from a natural language text corpus.

Of course not. To have visual experience, there must be a device to
convert visual signals into an internal representation in Narsese. I
never suggested otherwise.

 There is no mechanism to add new symbols to the knowledge base through 
 experience. You have to explicitly add them.

New symbols either come from the outside in experience (experience
can be verbal), or are composed by the concept-formation rules from
existing ones. The latter case is explained in my book.

 Natural language has evolved to be learnable on a massively parallel network 
 of slow computing elements. This should be apparent when we compare 
 successful language models with unsuccessful ones. Artificial language models 
 usually consist of tokenization, parsing, and semantic analysis phases. This 
 does not work on natural language because artificial languages have precise 
 specifications and natural languages do not.

It depends on which aspect of the language you talk about. Narsese has
precise specifications in syntax, but the meaning of the terms is a
function of experience, and changes from time to time.

 No two humans use exactly the same language, nor does the same human at two 
 points in time. Rather, language is learnable by example, so that each 
 message causes the language of the receiver to be a little more like that of 
 the sender.

Same thing in NARS --- if two implementations of NARS have different
experience, they will disagree on what is the meaning of a term. When
they begin to learn natural language, it will also be true for
grammar. Since I haven't done any concrete NLP yet, I don't expect you
to believe me on the second point, but you cannot rule out that
possibility just because no traditional system can do that.

 Children learn semantics before syntax, which is the opposite order from 
 which you would write an artificial language interpreter.

NARS indeed can learn semantics before syntax --- see
http://nars.wang.googlepages.com/wang.roadmap.pdf

I won't comment on the following detailed statements, since I agree
with your criticism of the traditional processing of formal language,
but that is not how NARS handles languages. Don't think of NARS as
another Cyc just because both use a formal language. The same "ravens
are birds" is treated very differently in the two systems.

Pei


 An example of a successful language model is a search engine. We know that 
 most of the meaning of a text document depends only on the words it contains, 
 ignoring word order. A search engine matches the semantics of the query with 
 the semantics of a document mostly by matching words, but also by matching 
 semantically related words like water to wet.

 Here is an example of a computationally intensive but biologically plausible 
 language model. A semantic model is a word-word matrix A such that A_ij is 
 the degree to which words i and j are related, which you can think of as the 
 probability of finding i and j together in a sliding window over a huge text 
 corpus. However, semantic relatedness is a fuzzy identity relation, meaning 
 it is reflexive, commutative, and transitive. If i is related to j and j to 
 k, then i is related to k. Deriving transitive relations in A, also known as 
 latent semantic analysis, is performed by singular value decomposition, 
 factoring A = USV where S is diagonal, then discarding the small terms of S, 
 which has the effect of lossy compression. Typically, A has about 10^6 
 elements and we keep only a few hundred elements of S. Fortunately there is a 
 parallel algorithm that incrementally updates the matrices as the system 
 learns: a 3 layer neural network where S is the hidden layer
  (which can grow) and U and V are weight matrices. [1].

 Traditional language processing has failed because the task of converting 
 natural language statements like ravens are birds to formal language is 
 itself an AI problem. It 

Re: Real vs. simulated environments (was Re: [agi] draft for comment.. P.S.)

2008-09-05 Thread Steve Richfield
Matt,

FINALLY, someone here is saying some of the same things that I have been
saying. With general agreement with your posting, I will make some
comments...

On 9/4/08, Matt Mahoney [EMAIL PROTECTED] wrote:

 --- On Thu, 9/4/08, Valentina Poletti [EMAIL PROTECTED] wrote:
 Ppl like Ben argue that the concept/engineering aspect of intelligence is
 independent of the type of environment. That is, given you understand how
  to make it in a virtual environment you can then transpose that concept
 into a real environment more safely.


This is probably a good starting point, to avoid beating the world up during
the debugging process.


 Some other ppl on the other hand believe intelligence is a property of
 humans only.


Only people who haven't had a pet believe such things. I have seen too many
animals find clever solutions to problems.

So you have to simulate every detail about humans to get
 that intelligence. I'd say that among the two approaches the first one
 (Ben's) is safer and more realistic.

 The issue is not what is intelligence, but what do you want to create? In
 order for machines to do more work for us, they may need language and
 vision, which we associate with human intelligence.


Not necessarily, as even text-interfaced knowledge engines can handily
outperform humans in many complex problem solving tasks. The still open
question is: What would best do what we need done but can NOT presently do
(given computers, machinery, etc.)? So far, the talk here on this forum
has been about what we could do and how we might do it, rather than about
what we NEED done.

Right now, we NEED resources to work productively in the directions that we
have been discussing, yet the combined intelligence of those here on this
forum is apparently unable to solve even this seemingly trivial problem.
Perhaps something more than raw intelligence is needed?

But building artificial humans is not necessarily useful. We already know
 how to create humans, and we are doing so at an unsustainable rate.

 I suggest that instead of the imitation game (Turing test) for AI, we
 should use a preference test. If you prefer to talk to a machine vs. a
 human, then the machine passes the test.


YES, like what is it that our AGI can do that we need done but can NOT
presently do?

Prediction is central to intelligence. If you can predict a text stream,
 then for any question Q and any answer A, you can compute the probability
 distribution P(A|Q) = P(QA)/P(Q). This passes the Turing test. More
 importantly, it allows you to output max_A P(QA), the most likely answer
 from a group of humans. This passes the preference test because a group is
 usually more accurate than any individual member. (It may fail a Turing test
 for giving too few wrong answers, a problem Turing was aware of in 1950 when
 he gave an example of a computer incorrectly answering an arithmetic
 problem).


Unfortunately, this also tests the ability to incorporate the very
misunderstandings that presently limit our thinking. We need to give credit
for compression algorithms that clean up our grammar, correct our
technical errors, etc., as these can probably be done in the process of
better compressing the text.

Text compression is equivalent to AI because we have already solved the
 coding problem. Given P(x) for string x, we know how to optimally and
 efficiently code x in log_2(1/P(x)) bits (e.g. arithmetic coding). Text
  compression has an advantage over the Turing or preference tests in that
  incremental progress in modeling can be measured precisely and the test
 is repeatable and verifiable.

 If I want to test a text compressor, it is important to use real data
 (human generated text) rather than simulated data, i.e. text generated by a
 program. Otherwise, I know there is a concise code for the input data, which
 is the program that generated it. When you don't understand the source
 distribution (i.e. the human brain), the problem is much harder, and you
 have a legitimate test.


Wouldn't it be better to understand the problem domain while ignoring human
(mis)understandings? After all, if humans need an AGI to work in a difficult
domain, it is probably made more difficult by incorporating human
misunderstandings.

Of course, humans state human problems, so it is important to be able to
semantically communicate, but also useful to separate the communications
from the problems.

I understand that Ben is developing AI for virtual worlds. This might
 produce interesting results, but I wouldn't call it AGI. The value of AGI is
 on the order of US $1 quadrillion. It is a global economic system running on
 a smarter internet. I believe that any attempt to develop AGI on a budget of
 $1 million or $1 billion or $1 trillion is just wishful thinking.


I think that a billion or so, divided up into small pieces to fund EVERY
disparate approach to see where the low hanging fruit is, would go a LONG
way in guiding subsequent billions. I doubt that it would 

Re: [agi] A NewMetaphor for Intelligence - the Computer/Organiser

2008-09-05 Thread Abram Demski
Mike,

Will's objection is not quite so easily dismissed. You need to argue
that there is an alternative, not just that Will's is more of the
same.

--Abram

On Fri, Sep 5, 2008 at 9:34 AM, Mike Tintner [EMAIL PROTECTED] wrote:
 MT:By contrast, all deterministic/programmed machines and computers are

 guaranteed to complete any task they begin.

 Will:If only such could be guaranteed! We would never have system hangs,
 dead locks. Even if it could be made so, computer systems would not
 always want to do so.

 Will,

 That's a legalistic, not a valid objection, (although heartfelt!).In the
 above case, the computer is guaranteed to hang - and it does, strictly,
 complete its task.

 What's happened is that you have had imperfect knowledge of the program's
 operations. Had you known more, you would have known that it would hang.

 Were your computer like a human mind, it would have been able to say (as
 you/we all do) - well if that part of the problem is going to be difficult,
 I'll ignore it  or.. I'll just make up an answer... or by God I'll keep
 trying other ways until I do solve this.. or... ..  or ...
 Computers, currently, aren't free thinkers.








Re: [agi] A NewMetaphor for Intelligence - the Computer/Organiser

2008-09-05 Thread Abram Demski
Mike,

The philosophical paradigm I'm assuming is that the only two
alternatives are deterministic and random. Either the next state is
completely determined by the last, or it is only probabilistically
determined.

Deterministic does not mean computable, since physical processes can
be totally well-defined without being computable (take Newton's
physics for example).

So,

1) Is the next action that your creativity machine will take intended
to be uniquely defined, given its experience and inputs?

2) is the next action intended to be computable from the experience
and inputs? Meaning (approximately), could the creativity machine be
implemented on a computer?

--Abram

On Fri, Sep 5, 2008 at 9:26 AM, Mike Tintner [EMAIL PROTECTED] wrote:
 Abram: In that case I do not see how your view differs from simplistic

 dualism, as Terren cautioned. If your goal is to make a creativity
 machine, in what sense would the machine be non-algorithmic? Physical
 random processes?


 Abram,

 You're operating within a philosophical paradigm that says all actions and
 problem-solving must be preprogrammed. Nothing else is possible. That ignores
 the majority of real life problems where no program is possible, period.

 Sometimes the best plan is no plan  If you're confronted with the task of
 finding something in a foreign territory, you simply don't (and couldn't)
 have the luxury of a program.

 All you have is a rough idea, as opposed to an algorithm, of the sort of
 things you can do. You know roughly what you're looking for - an object
 somewhere in that territory. You know roughly how to travel and put one
 foot in front of the other and avoid obstacles and pick things up etc.

 (Let's say - you have to find a key that has been lost somewhere in a
 house).

 Well you certainly don't have an algorithm for finding a lost key in a
 house. In fact, if you or anyone would care to spend 5 mins on this problem,
 you would start to realise that no algorithm is possible. Check out
 Kauffman's interview on edge.com for similar problems and arguments.
 So what do/can you do? Make it up as you go along. Start somewhere and keep
 going, and after a while if that doesn't work, try somewhere and something
 else...

 But there's no algorithm for this. Just as there is, or was,  no algorithm
 for your putting the pieces of a jigsaw puzzle together (a much simpler,
 more tightly defined problem).  You just got stuck in. Somewhere. Anywhere
 reasonable.

 Algorithms, from a human POV, are for literal people who have to do things
 by the book - people with a compulsive obsessional disorder - who can't
 bear to confront a blank page. :). V useful *after* you've solved a problem,
 but not in the beginning

 There are no physical, computational, mechanical reasons why machines can't
 be designed on these principles - to proceed with rough ideas of what to do,
 freely consulting and combining options and looking around for fresh ones,
 as they go along, rather than following a preprogrammed list.

 P.S. Nothing in this is strictly random - as in a narrow AI, randomly,
 blindly working its way through a preprogrammed list. You only try options
 that are appropriate -  routes that appear likely to lead to your goal. I
 would call this unstructured but not (blindly) random thinking.








Re: [agi] What is Friendly AI?

2008-09-05 Thread Steve Richfield
Vladimir,

On 9/4/08, Vladimir Nesov [EMAIL PROTECTED] wrote:

 On Thu, Sep 4, 2008 at 12:02 PM, Valentina Poletti [EMAIL PROTECTED]
 wrote:
  Also, Steve made another good point here: loads of people at any moment
 do
  whatever they can to block the advancement and progress of human beings
 as
  it is now. How will those people react to a progress as advanced as AGI?
  That's why I keep stressing the social factor in intelligence as very
  important part to consider.

 No, it's not important, unless these people start to pose a serious
 threat to the project.


Here we are, lunch-money funded, working on the project with the MOST
economic potential of any project in the history of man. NO ONE will invest
the few millions needed to check out the low hanging fruit and kick this
thing into high gear. Sure, no one is holding guns to investors' heads and
saying "don't invest", but neither is it socially acceptable to invest in
such directions. That social system is crafted by the Christian majority
here in the U.S. Hence, I see U.S. Christians as being THE really SERIOUS
threat to AGI.



 You need to care about what is the correct
 answer, not what is a popular one, in the case where popular answer is
 dictated by ignorance.


As Reverse Reductio ad Absurdum shows ever so well, you can't even
understand the answers without some education. This is akin to learning that
a Game Theory solution consists of a list of probabilities, with the final
decision being made as a weighted random decision. Hence, there appears to
be NO prospect of an AGI being useful to people who lack this sort of
education, which nearly all of the population and all of the world leaders
now lack. Given a ubiquitous understanding of these principles, people are
probably smart enough to figure things out for themselves, so AGIs may
not even be needed.


Most disputes are NOT about what is the best answer, but rather about what
the goal is. Special methods like Reverse Reductio ad Absurdum are needed in
situations with conflicting goals.

The Koran states that most evil is done by people who think they are doing
good. However, Christians, seeing another competing religious book as itself
being evil, reject all of the wisdom therein, and in their misdirected
actions, confirm this very statement. When the majority of people reject
wisdom simply because of its source, and AGIs must necessarily displace
religions as they identify the misstatements made therein, it seems pretty
obvious to me that a war without limits lies ahead between the Christian
majority and AGIs.

Steve Richfield





Re: [agi] A NewMetaphor for Intelligence - the Computer/Organiser

2008-09-05 Thread Mike Tintner


Abram,

I don't understand why/how I need to argue an alternative - please explain. 
If it helps, a deterministic, programmed machine can, at any given point, 
only follow one route through a given territory or problem space or maze - 
even if surprising and *appearing* to halt/deviate from the plan - to the 
original, less-than-omniscient-of-what-he-hath-wrought programmer. (A 
fundamental programming problem, right?) A creative free machine, like a 
human, really can follow any of what may be a vast range of routes - and you 
really can't predict what it will do or, at a basic level, be surprised by 
it.



Mike,

Will's objection is not quite so easily dismissed. You need to argue
that there is an alternative, not just that Will's is more of the
same.

--Abram

On Fri, Sep 5, 2008 at 9:34 AM, Mike Tintner [EMAIL PROTECTED] 
wrote:

MT:By contrast, all deterministic/programmed machines and computers are


guaranteed to complete any task they begin.


Will:If only such could be guaranteed! We would never have system hangs,
dead locks. Even if it could be made so, computer systems would not
always want to do so.

Will,

That's a legalistic, not a valid objection, (although heartfelt!).In the
above case, the computer is guaranteed to hang - and it does, strictly,
complete its task.

What's happened is that you have had imperfect knowledge of the program's
operations. Had you known more, you would have known that it would hang.

Were your computer like a human mind, it would have been able to say (as
you/we all do) - well if that part of the problem is going to be 
difficult,
I'll ignore it  or.. I'll just make up an answer... or by God I'll 
keep

trying other ways until I do solve this.. or... ..  or ...
Computers, currently, aren't free thinkers.















Re: [agi] A NewMetaphor for Intelligence - the Computer/Organiser

2008-09-05 Thread Terren Suydam

Hi Mike, comments below...

--- On Fri, 9/5/08, Mike Tintner [EMAIL PROTECTED] wrote:

 Again - v. briefly - it's a reality - nondeterministic
 programming is a 
 reality, so there's no material, mechanistic, software
 problem in getting a 
 machine to decide either way. 

It is inherently dualistic to say this. On one hand you're calling it a 
'reality' and on the other you're denying the influence of material or 
mechanism. What exactly is deciding then, a soul?  How do you get one of those 
into an AI? 

 Yes, strictly, a nondeterministic *program* can be regarded
 as a 
 contradiction - i.e. a structured *series* of instructions
 to decide freely. 

At some point you will have to explain how this deciding freely works. As of 
now, all you have done is name it. 

 The way the human mind is programmed is that
 we are not only free, and 
 have to, *decide* either way about certain decisions, but
 we are also free 
 to *think* about it - i.e. to decide metacognitively
 whether and how we 
 decide at all - we continually decide. for
 example, to put off the 
 decision till later.

There is an entire school of thought, quite mainstream now, in cognitive 
science that says that what appears to be free will is an illusion. Of 
course, you can say that you are free to choose whatever you like, but that 
only speaks to the strength of the illusion - that in itself is not enough to 
disprove the claim. 

In fact, it is plain to see that if you do not commit yourself to this view 
(free will as illusion), you are either a dualist, or you must invoke some kind 
of probabilistic mechanism (as some like Penrose have done by saying that the 
free-will buck stops at the level of quantum mechanics). 

So, Mike, is free will:

1) an illusion based on some kind of unpredictable, complex but *deterministic* 
interaction of physical components
2) the result of probabilistic physics - a *non-deterministic* interaction 
described by something like quantum mechanics
3) the expression of our god-given spirit, or some other non-physical mover of 
physical things


 By contrast, all deterministic/programmed machines and
 computers are 
 guaranteed to complete any task they begin. (Zero
 procrastination or 
 deviation). Very different kinds of machines to us. Very
 different paradigm. 
 (No?)

I think the difference of paradigm between computers and humans is not that one 
is deterministic and one isn't, but rather that one is a paradigm of top-down, 
serialized control, and the other is bottom-up, massively parallel, and 
emergent. It comes down to design vs. emergence.

Terren


  




Re: [agi] A NewMetaphor for Intelligence - the Computer/Organiser

2008-09-05 Thread Abram Demski
Mike,

On Fri, Sep 5, 2008 at 1:15 PM, Mike Tintner [EMAIL PROTECTED] wrote:

 Abram,

 I don't understand why.how I need to argue an alternative - please explain.

I am not sure what to say, but here is my view of the situation. You
are claiming that there is a broad range of things that algorithmic
systems cannot do. You gave some examples. William took a couple of
these examples and argued that they are routinely done by
multi-tasking systems. You say  that those methods do not really
count, because they reduce to normal computation. I say that that is
not a valid response, because that was exactly Will's point, that they
do reduce to normal computations. To make your objection work, you
need to argue that humans do not do the same sort of thing when we
change our minds about something.

 If it helps, a deterministic, programmed machine can, at any given point,
 only follow one route through a given territory or problem space or maze -
 even if surprising  *appearing* to halt/deviate from the plan -   to the
 original, less-than-omniscient-of-what-he-hath-wrought programmer. (A
 fundamental programming problem, right?) A creative free machine, like a
 human, really can follow any of what may be a vast range of routes - and you
 really can't predict what it will do or, at a basic level, be surprised by
 it.

It still sounds like you are describing physical randomness.

--Abram




Re: Language modeling (was Re: [agi] draft for comment)

2008-09-05 Thread Matt Mahoney
--- On Fri, 9/5/08, Pei Wang [EMAIL PROTECTED] wrote:

 NARS indeed can learn semantics before syntax --- see
 http://nars.wang.googlepages.com/wang.roadmap.pdf

Yes, I see this corrects many of the problems with Cyc and with traditional 
language models. I didn't see a description of a mechanism for learning new 
terms in your other paper. Clearly this could be added, although I believe it 
should be a statistical process.

I am interested in determining the computational cost of language modeling. The 
evidence I have so far is that it is high. I believe the algorithmic complexity 
of a model is 10^9 bits. This is consistent with Turing's 1950 prediction that 
AI would require this much memory, with Landauer's estimate of human long term 
memory, and is about how much language a person processes by adulthood assuming 
an information content of 1 bit per character as Shannon estimated in 1950. 
This is why I use a 1 GB data set in my compression benchmark.
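(Roughly: a body of text whose information content is 10^9 bits at ~1 bit per 
character is about 10^9 characters, and stored at one byte per raw character 
that is about 1 GB on disk, hence the benchmark size.)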

However, there is a 3-way tradeoff between CPU speed, memory, and model accuracy 
(as measured by compression ratio). I added two graphs to my benchmark at 
http://cs.fit.edu/~mmahoney/compression/text.html (below the main table) which 
show this clearly. In particular the size-memory tradeoff is an almost 
perfectly straight line (with memory on a log scale) over tests of 104 
compressors. These tests suggest to me that CPU and memory are indeed 
bottlenecks to language modeling. The best models in my tests use simple 
semantic and grammatical models, well below adult human level. The 3 top 
programs on the memory graph map words to tokens using dictionaries that group 
semantically and syntactically related words together, but only one 
(paq8hp12any) uses a semantic space of more than one dimension. All have large 
vocabularies, although not implausibly large for an educated person. Other top 
programs like nanozipltcb and WinRK use smaller dictionaries and strictly 
lexical models. Lesser programs model only at the n-gram level.

I don't yet have an answer to my question, but I believe efficient human-level 
NLP will require hundreds of GB or perhaps 1 TB of memory. The slowest programs 
are already faster than real time, given that equivalent learning in humans 
would take over a decade. I think you could use existing hardware in a 
speed-memory tradeoff to get real time NLP, but it would not be practical for 
doing experiments where each source code change requires training the model 
from scratch. Model development typically requires thousands of tests.


-- Matt Mahoney, [EMAIL PROTECTED]





Re: Language modeling (was Re: [agi] draft for comment)

2008-09-05 Thread Pei Wang
On Fri, Sep 5, 2008 at 6:15 PM, Matt Mahoney [EMAIL PROTECTED] wrote:
 --- On Fri, 9/5/08, Pei Wang [EMAIL PROTECTED] wrote:

 NARS indeed can learn semantics before syntax --- see
 http://nars.wang.googlepages.com/wang.roadmap.pdf

 Yes, I see this corrects many of the problems with Cyc and with traditional 
 language models. I didn't see a description of a mechanism for learning new 
 terms in your other paper. Clearly this could be added, although I believe it 
 should be a statistical process.

I don't have a separate paper on term composition, so you'd have to
read my book. It is indeed a statistical process, in the sense that
most of the composed terms won't be useful, and so will be forgotten
gradually. Only the useful patterns will be kept for a long time in
the form of compound terms.

 I am interested in determining the computational cost of language modeling. 
 The evidence I have so far is that it is high. I believe the algorithmic 
 complexity of a model is 10^9 bits. This is consistent with Turing's 1950 
 prediction that AI would require this much memory, with Landauer's estimate 
 of human long term memory, and is about how much language a person processes 
 by adulthood assuming an information content of 1 bit per character as 
 Shannon estimated in 1950. This is why I use a 1 GB data set in my 
 compression benchmark.

I see your point, though I think analyzing this problem in terms of
computational complexity is not the correct way to go, because this
process does not follow a predetermined algorithm. Instead, language
learning is an incremental process, without a well-defined beginning
and ending.

 However there is a 3 way tradeoff between CPU speed, memory, and model 
 accuracy (as measured by compression ratio). I added two graphs to my 
 benchmark at http://cs.fit.edu/~mmahoney/compression/text.html (below the 
 main table) which shows this clearly. In particular the size-memory tradeoff 
 is an almost perfectly straight line (with memory on a log scale) over tests 
 of 104 compressors. These tests suggest to me that CPU and memory are indeed 
 bottlenecks to language modeling. The best models in my tests use simple 
 semantic and grammatical models, well below adult human level. The 3 top 
 programs on the memory graph map words to tokens using dictionaries that 
 group semantically and syntactically related words together, but only one 
 (paq8hp12any) uses a semantic space of more than one dimension. All have 
 large vocabularies, although not implausibly large for an educated person. 
 Other top programs like nanozipltcb and WinRK use smaller dictionaries and
  strictly lexical models. Lesser programs model only at the n-gram level.

As with many existing AI works, my disagreement with you is not that
much on the solution you proposed (I can see the value), but on the
problem you specified as the goal of AI. For example, I have no doubt
about the theoretical and practical values of compression, but don't
think it has much to do with intelligence. I don't think this kind of
issue can be efficiently handled by an email discussion like this one. I've
been thinking about writing a paper to compare my ideas with the
ideas represented by AIXI, which is closely related to yours, though
this project hasn't got enough priority in my to-do list. Hopefully
I'll find the time to make myself clear on this topic.

 I don't yet have an answer to my question, but I believe efficient 
 human-level NLP will require hundreds of GB or perhaps 1 TB of memory. The 
 slowest programs are already faster than real time, given that equivalent 
 learning in humans would take over a decade. I think you could use existing 
 hardware in a speed-memory tradeoff to get real time NLP, but it would not be 
 practical for doing experiments where each source code change requires 
 training the model from scratch. Model development typically requires 
 thousands of tests.

I guess we are exploring very different paths in NLP, and now it is
too early to tell which one will do better.

Pei




Re: Language modeling (was Re: [agi] draft for comment)

2008-09-05 Thread Matt Mahoney
--- On Fri, 9/5/08, Pei Wang [EMAIL PROTECTED] wrote:

 Like to many existing AI works, my disagreement with you is
 not that
 much on the solution you proposed (I can see the value),
 but on the
 problem you specified as the goal of AI. For example, I
 have no doubt
 about the theoretical and practical values of compression,
 but don't
 think it has much to do with intelligence.

In http://cs.fit.edu/~mmahoney/compression/rationale.html I explain why text 
compression is an AI problem. To summarize, if you know the probability 
distribution of text, then you can compute P(A|Q) for any question Q and answer 
A to pass the Turing test. Compression allows you to precisely measure the 
accuracy of your estimate of P. Compression (actually, word perplexity) has 
been used since the early 1990's to measure the quality of language models for 
speech recognition, since it correlates well with word error rate.
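
A toy sketch of that reduction (not the benchmark code; string_prob() below is
a deliberately crude placeholder standing in for a real language model):

import math

# Answering by prediction: P(A|Q) = P(QA) / P(Q).
def string_prob(s):
    # Placeholder model: pretend every character is drawn uniformly from
    # 27 symbols. Any model that assigns a probability to a string could
    # be substituted here.
    return (1.0 / 27.0) ** len(s)

def answer_logprob(question, answer):
    # log P(A|Q) = log P(QA) - log P(Q)
    return (math.log(string_prob(question + answer))
            - math.log(string_prob(question)))

question = "The capital of France is "
candidates = ["Paris", "London", "a city somewhere in Europe"]
best = max(candidates, key=lambda a: answer_logprob(question, a))
print(best)  # with a real model, the most probable continuation wins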

The purpose of this work is not to solve general intelligence, such as the 
universal intelligence proposed by Legg and Hutter [1]. That is not computable, 
so you have to make some arbitrary choice with regard to test environments 
about what problems you are going to solve. I believe the goal of AGI should be 
to do useful work for humans, so I am making a not so arbitrary choice to solve 
a problem that is central to what most people regard as useful intelligence.

I had hoped that my work would lead to an elegant theory of AI, but that hasn't 
been the case. Rather, the best compression programs were developed as a series 
of thousands of hacks and tweaks, e.g. change a 4 to a 5 because it gives 
0.002% better compression on the benchmark. The result is an opaque mess. I 
guess I should have seen it coming, since it is predicted by information theory 
(e.g. [2]).

Nevertheless the architectures of the best text compressors are consistent with 
cognitive development models, i.e. phoneme (or letter) sequences -> lexical -> 
semantics -> syntax, which are themselves consistent with layered neural 
architectures. I already described a neural semantic model in my last post. I 
also did work supporting Hutchens and Alder showing that lexical models can be 
learned from n-gram statistics, consistent with the observation that babies 
learn the rules for segmenting continuous speech before they learn any words 
[3].
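
A minimal sketch of one such boundary-learning idea (successor variety over
character n-grams; my illustration, not the exact method in [3]):

from collections import defaultdict

# Count how many distinct characters have followed each bigram in an
# unsegmented training text; when scanning new text, positions where this
# "successor variety" peaks tend to be word boundaries. The tiny corpus is
# a placeholder, so the signal is rough.
training = "thecatsatonthematthedogatethematthedogsatonthecat"
successors = defaultdict(set)
for i in range(2, len(training)):
    successors[training[i - 2:i]].add(training[i])

test = "thedogsatonthemat"
for i in range(2, len(test)):
    variety = len(successors[test[i - 2:i]])
    # Peaks in 'variety' often line up with word boundaries.
    print(f"{test[:i]} | {test[i:]}   successor variety = {variety}")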

I agree it should also be clear that semantics is learned before grammar, 
contrary to the way artificial languages are processed. Grammar requires 
semantics, but not the other way around. Search engines work using semantics 
only. Yet we cannot parse sentences like "I ate pizza with Bob", "I ate pizza 
with pepperoni", "I ate pizza with chopsticks", without semantics.

My benchmark does not prove that there aren't better language models, but it is 
strong evidence. It represents the work of about 100 researchers who have tried 
and failed to find more accurate, faster, or less memory intensive models. The 
resource requirements seem to increase as we go up the chain from n-grams to 
grammar, contrary to symbolic approaches. This is my argument for why I think AI 
bound by lack of hardware, not lack of theory.

1. Legg, Shane, and Marcus Hutter (2006), A Formal Measure of Machine 
Intelligence, Proc. Annual machine learning conference of Belgium and The 
Netherlands (Benelearn-2006). Ghent, 2006.  
http://www.vetta.org/documents/ui_benelearn.pdf

2. Legg, Shane, (2006), Is There an Elegant Universal Theory of Prediction?,  
Technical Report IDSIA-12-06, IDSIA / USI-SUPSI, Dalle Molle Institute for 
Artificial Intelligence, Galleria 2, 6928 Manno, Switzerland.
http://www.vetta.org/documents/IDSIA-12-06-1.pdf

3. M. Mahoney (2000), A Note on Lexical Acquisition in Text without Spaces, 
http://cs.fit.edu/~mmahoney/dissertation/lex1.html


-- Matt Mahoney, [EMAIL PROTECTED]





AI isn't cheap (was Re: Real vs. simulated environments (was Re: [agi] draft for comment.. P.S.))

2008-09-05 Thread Matt Mahoney
--- On Fri, 9/5/08, Steve Richfield [EMAIL PROTECTED] wrote:
I think that a billion or so, divided up into small pieces to fund EVERY
disparate approach to see where the low hanging fruit is, would go a
LONG way in guiding subsequent billions. I doubt that it would take a
trillion to succeed.

Sorry, the low hanging fruit was all picked by the early 1960's. By then we had 
neural networks [1,6,7,11,12], natural language processing and language 
translation [2], models of human decision making [3], automatic theorem proving 
[4,8,10], natural language databases [5], game playing programs [9,13], optical 
character recognition [14], handwriting and speech recognition [15], and 
important theoretical work [16,17,18]. Since then we have had mostly just 
incremental improvements.

Big companies like Google and Microsoft have strong incentives to develop AI 
and have billions to spend. Maybe the problem really is hard.

References

1. Ashby, W. Ross (1960), Design for a Brain, 2nd Ed., London: Wiley. 
Describes a 4 neuron electromechanical neural network.

2. Borko, Harold (1967), Automated Language Processing, The State of the Art, 
New York: Wiley.  Cites 72 NLP systems prior to 1965, and the 1959-61 U.S. 
government Russian-English translation project.

3. Feldman, Julian (1961), Simulation of Behavior in the Binary Choice 
Experiment, Proceedings of the Western Joint Computer Conference 19:133-144

4. Gelernter, H. (1959), Realization of a Geometry-Theorem Proving Machine, 
Proceedings of an International Conference on Information Processing, Paris: 
UNESCO House, pp. 273-282.

5. Green, Bert F. Jr., Alice K. Wolf, Carol Chomsky, and Kenneth Laughery 
(1961), Baseball: An Automatic Question Answerer, Proceedings of the Western 
Joint Computer Conference, 19:219-224.

6. Hebb, D. O. (1949), The Organization of Behavior, New York: Wiley.  Proposed 
the first model of learning in neurons: when two neurons fire simultaneously, 
the synapse between them becomes stimulating.

7. McCulloch, Warren S., and Walter Pitts (1943), A logical calculus of the 
ideas immanent in nervous activity, Bulletin of Mathematical Biophysics (5) pp. 
115-133.

8. Newell, Allen, J. C. Shaw, H. A. Simon (1957), Empirical Explorations with 
the Logic Theory Machine: A Case Study in Heuristics, Proceedings of the 
Western Joint Computer Conference, 15:218-239.

9. Newell, Allen, J. C. Shaw, and H. A. Simon (1958), Chess-Playing Programs 
and the Problem of Complexity, IBM Journal of Research and Development, 
2:320-335.

10. Newell, Allen, H. A. Simon (1961), GPS: A Program that Simulates Human 
Thought, Lernende Automaten, Munich: R. Oldenbourg KG.

11. Rochester, N., J. H. Holland, L. H. Haibt, and W. L. Duda (1956), Tests on 
a cell assembly theory of the action of the brain, using a large digital 
computer, IRE Transactions on Information Theory IT-2: pp. 80-93. 

12. Rosenblatt, F. (1958), The perceptron: a probabilistic model for 
information storage and organization in the brain, Psychological Review (65) 
pp. 386-408.

13. Samuel, A. L. (1959), Some Studies in Machine Learning using the Game of 
Checkers, IBM Journal of Research and Development, 3:211-229.

14. Selfridge, Oliver G., Ulric Neisser (1960), Pattern Recognition by 
Machine, Scientific American, Aug., 203:60-68.

15. Uhr, Leonard, Charles Vossler (1963) A Pattern-Recognition Program that 
Generates, Evaluates, and Adjusts its own Operators, Computers and Thought, E. 
A. Feigenbaum and J. Feldman eds, New York: McGraw Hill, pp. 251-268.

16. Turing, A. M., (1950) Computing Machinery and Intelligence, Mind, 
59:433-460.

17. Shannon, Claude, and Warren Weaver (1949), The Mathematical Theory of 
Communication, Urbana: University of Illinois Press. 

18. Minsky, Marvin (1961), Steps toward Artificial Intelligence, Proceedings 
of the Institute of Radio Engineers, 49:8-30. 


-- Matt Mahoney, [EMAIL PROTECTED]





Re: Language modeling (was Re: [agi] draft for comment)

2008-09-05 Thread Pei Wang
Matt,

Thanks for taking the time to explain your ideas in detail. As I said,
our different opinions on how to do AI come from our very different
understanding of intelligence. I don't take passing the Turing Test as
my research goal (as explained in
http://nars.wang.googlepages.com/wang.logic_intelligence.pdf and
http://nars.wang.googlepages.com/wang.AI_Definitions.pdf).  I disagree
with Hutter's approach, not because his SOLUTION is not computable,
but because his PROBLEM is too idealized and simplified to be relevant
to the actual problems of AI.

Even so, I'm glad that we can still agree on some things, like
"semantics comes before syntax". In my plan for NLP, there won't be
separate 'parsing' and 'semantic mapping' stages. I'll say more when I
have concrete results to share.

Pei

On Fri, Sep 5, 2008 at 8:39 PM, Matt Mahoney [EMAIL PROTECTED] wrote:
 --- On Fri, 9/5/08, Pei Wang [EMAIL PROTECTED] wrote:

 Like to many existing AI works, my disagreement with you is
 not that
 much on the solution you proposed (I can see the value),
 but on the
 problem you specified as the goal of AI. For example, I
 have no doubt
 about the theoretical and practical values of compression,
 but don't
 think it has much to do with intelligence.

 In http://cs.fit.edu/~mmahoney/compression/rationale.html I explain why text 
 compression is an AI problem. To summarize, if you know the probability 
 distribution of text, then you can compute P(A|Q) for any question Q and 
 answer A to pass the Turing test. Compression allows you to precisely measure 
 the accuracy of your estimate of P. Compression (actually, word perplexity) 
 has been used since the early 1990's to measure the quality of language 
 models for speech recognition, since it correlates well with word error rate.

 The purpose of this work is not to solve general intelligence, such as the 
 universal intelligence proposed by Legg and Hutter [1]. That is not 
 computable, so you have to make some arbitrary choice with regard to test 
 environments about what problems you are going to solve. I believe the goal 
 of AGI should be to do useful work for humans, so I am making a not so 
 arbitrary choice to solve a problem that is central to what most people 
 regard as useful intelligence.

 I had hoped that my work would lead to an elegant theory of AI, but that 
 hasn't been the case. Rather, the best compression programs were developed as 
 a series of thousands of hacks and tweaks, e.g. change a 4 to a 5 because it 
 gives 0.002% better compression on the benchmark. The result is an opaque 
 mess. I guess I should have seen it coming, since it is predicted by 
 information theory (e.g. [2]).

 Nevertheless the architectures of the best text compressors are consistent 
 with cognitive development models, i.e. phoneme (or letter) sequences -> 
 lexical -> semantics -> syntax, which are themselves consistent with layered 
 neural architectures. I already described a neural semantic model in my last 
 post. I also did work supporting Hutchens and Alder showing that lexical 
 models can be learned from n-gram statistics, consistent with the observation 
 that babies learn the rules for segmenting continuous speech before they 
 learn any words [3].

 I agree it should also be clear that semantics is learned before grammar, 
 contrary to the way artificial languages are processed. Grammar requires 
 semantics, but not the other way around. Search engines work using semantics 
 only. Yet we cannot parse sentences like I ate pizza with Bob, I ate pizza 
 with pepperoni, I ate pizza with chopsticks, without semantics.

 My benchmark does not prove that there aren't better language models, but it 
 is strong evidence. It represents the work of about 100 researchers who have 
 tried and failed to find more accurate, faster, or less memory intensive 
 models. The resource requirements seem to increase as we go up the chain from 
 n-grams to grammar, contrary to symbolic approaches. This is my argument why 
 I think AI is bound by lack of hardware, not lack of theory.

 1. Legg, Shane, and Marcus Hutter (2006), A Formal Measure of Machine 
 Intelligence, Proc. Annual machine learning conference of Belgium and The 
 Netherlands (Benelearn-2006). Ghent, 2006.  
 http://www.vetta.org/documents/ui_benelearn.pdf

 2. Legg, Shane, (2006), Is There an Elegant Universal Theory of Prediction?,  
 Technical Report IDSIA-12-06, IDSIA / USI-SUPSI, Dalle Molle Institute for 
 Artificial Intelligence, Galleria 2, 6928 Manno, Switzerland.
 http://www.vetta.org/documents/IDSIA-12-06-1.pdf

 3. M. Mahoney (2000), A Note on Lexical Acquisition in Text without Spaces, 
 http://cs.fit.edu/~mmahoney/dissertation/lex1.html


 -- Matt Mahoney, [EMAIL PROTECTED]


