Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-13 Thread James Ratcliff
It shouldnt matter how a general ontology is used, it should be available for 
multiple different AI and AGI processes, to be generally useful.

And the key thing about this usage is it doesnt get any information from a 
single text, but extracts patterns from the mass usage, reading a single 
passage is much more difficult.

I have also used this in conjunction with Google news feed, where many many 
articles can be gathered in a short period on a single topic, and reinforce the 
information.

James Ratcliff


Vladimir Nesov [EMAIL PROTECTED] wrote: On Dec 13, 2007 12:09 AM, James 
Ratcliff  wrote:
   Mainly as a primer ontology / knowledge representation data set for an AGI
 to work with.
   Having a number of facts known without having to be typed in about many
 frames and connections between frames gives an AGI a good booster to start
 with.

   Taken a simple set of common words in a house chair, table, sock, closet
 etc, a house agi bot could get a feel for objects it would expect to find in
 a house, and what locations to look for say a sock, and properties of a
 sock, without having to have that information typed in from a human user.
   Then that information would be updated thru experience, and with a human
 trainer working with an embodied (probably virtual) agi.


Yes, it's how story usually goes. But if you don't specify how
ontology will be used, why do you believe that it will be more useful
than original texts? Probably at a point where you'll be able to make
use of ontology you'd also be able to analyze texts directly (that is,
if you aim that high, otherwise it's a different issue entirely).


-- 
Vladimir Nesovmailto:[EMAIL PROTECTED]

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?;



___
James Ratcliff - http://falazar.com
Looking for something...
   
-
Never miss a thing.   Make Yahoo your homepage.

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=75658037-88df5d

Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-12 Thread James Ratcliff
  Mainly as a primer ontology / knowledge representation data set for an AGI to 
work with.
  Having a number of facts known without having to be typed in about many 
frames and connections between frames gives an AGI a good booster to start with.

  Taken a simple set of common words in a house chair, table, sock, closet etc, 
a house agi bot could get a feel for objects it would expect to find in a 
house, and what locations to look for say a sock, and properties of a sock, 
without having to have that information typed in from a human user.
  Then that information would be updated thru experience, and with a human 
trainer working with an embodied (probably virtual) agi.

The novels gave a really good data set that reinforced the factoids extracted 
and was a bit more world-knowledge common sense than other extraction projects 
using Wall Street Journal, or a subset of the web as hole, and removed much of 
the junk data.

James


Vladimir Nesov [EMAIL PROTECTED] wrote: On Dec 11, 2007 7:26 PM, James 
Ratcliff  wrote:
 Here's a basic abstract I did last year I think:

 http://www.falazar.com/AI/AAAI05_Student_Abtract_James_Ratcliff.pdf

 Would like to work with others on a full fledged Reprensentation system that
 could use these kind of techniques
 I hacked this together by myself, so I know a real team could put this kind
 of stuff to much better use.



 James

Do you have any particular path in mind to put this kind of thing to
work? Finding patterns is fine, and somewhat inevitable, but what are
those ontologies good for, and why?

-- 
Vladimir Nesovmailto:[EMAIL PROTECTED]

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?;



___
James Ratcliff - http://falazar.com
Looking for something...
   
-
Looking for last minute shopping deals?  Find them fast with Yahoo! Search.

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=75372394-eb4d01

RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])

2007-12-12 Thread James Ratcliff
I had been thinking about something along these lines, though not worded as you 
have in this message yet.

What I would be most interested in at this point is a knowledge gathering 
system somewhere along these lines, where the main AGI could be 
centralized/clustered or distributed, but where questions and information would 
be posed to the Bot on each persons node and collected together.
The system would remember any facts and domain that a person has contributed so 
any future unique questions could be posed to the knowledgeable expert users.
  This would allow a large amount of knowledge to be extracted in a distributed 
manner, keeping track of the quality of information gathered from each person 
as a trust metric, and many facts would be gathered and checked for truth.

  Mainly the system should have an ability to ACTIVELY go out in search of the 
answer, by chatting with known users to find and confirm any conflicting 
results.

For instance, it would randomly ask me Who is the highest paid baseball 
player?
and I would pass on that question... the system would put a lower score for any 
further baseball questions sent towards me, but based on my answering of other 
computer questions and ones about Austin, TX, it would be more likely to ask me 
questions about them.
And only me and a couple other people here would get the questions about 
Austin, TX.

Something along the lines of a higher quality Yahoo Questions, with an active 
component, and central knowledge base.
I think the knowledge base is one of the most important pieces of these, and 
hope to start seeing some more of ppls ideas and implementations of KR db's.

James Ratcliff

Matt Mahoney [EMAIL PROTECTED] wrote: --- Jean-Paul Van Belle  wrote:

 Hi Matt, Wonderful idea, now it will even show the typical human trait of
 lying...when i ask it do you still love me? most answers in its database
 will have Yes as an answer  but when i ask it 'what's my name?' it'll call
 me John?

My proposed message posting service allows anyone to contribute to its
knowledge base, just like Wikipedia, so it could certainly contain some false
or useless information.  However, the number of peers that keep a copy of a
message will depend on the number of peers that accept it according to the
peers' policies, which are set individually by their owners.  The network
provides an incentive for peers to produce useful information so that other
peers will accept it.  Thus, useful and truthful information is more likely to
be propagated.


-- Matt Mahoney, [EMAIL PROTECTED]

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?;



___
James Ratcliff - http://falazar.com
Looking for something...
   
-
Looking for last minute shopping deals?  Find them fast with Yahoo! Search.

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=75375812-111ad4

Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-12 Thread Vladimir Nesov
On Dec 13, 2007 12:09 AM, James Ratcliff [EMAIL PROTECTED] wrote:
   Mainly as a primer ontology / knowledge representation data set for an AGI
 to work with.
   Having a number of facts known without having to be typed in about many
 frames and connections between frames gives an AGI a good booster to start
 with.

   Taken a simple set of common words in a house chair, table, sock, closet
 etc, a house agi bot could get a feel for objects it would expect to find in
 a house, and what locations to look for say a sock, and properties of a
 sock, without having to have that information typed in from a human user.
   Then that information would be updated thru experience, and with a human
 trainer working with an embodied (probably virtual) agi.


Yes, it's how story usually goes. But if you don't specify how
ontology will be used, why do you believe that it will be more useful
than original texts? Probably at a point where you'll be able to make
use of ontology you'd also be able to analyze texts directly (that is,
if you aim that high, otherwise it's a different issue entirely).


-- 
Vladimir Nesovmailto:[EMAIL PROTECTED]

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=75420074-1ea3c3


Re: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])

2007-12-12 Thread Mike Dougherty
On 12/12/07, James Ratcliff [EMAIL PROTECTED] wrote:
   This would allow a large amount of knowledge to be extracted in a
 distributed manner, keeping track of the quality of information gathered
 from each person as a trust metric, and many facts would be gathered and
 checked for truth.

 Something along the lines of a higher quality Yahoo Questions, with an
 active component, and central knowledge base.
 I think the knowledge base is one of the most important pieces of these, and
 hope to start seeing some more of ppls ideas and implementations of KR db's.

I believe where you said central knowledge base you mean
distributed KB - right?  The idea of keeping local KB at each node
shares the burden for storage/bandwidth to every node in the network.
Your trust metrics are how nodes conditionally connect for per-topic
fact-checking.

I have already volunteered my free CPU/bandwidth to a prototype of
this model.  Of course, I'd like to be a collaborator of mechanisms
involved in addition to a user of the grid.  Even if it starts out
only a toy or hobby, it would still teach us a great deal.

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=75442948-fd876c


RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-11 Thread James Ratcliff
Here's a basic abstract I did last year I think:

http://www.falazar.com/AI/AAAI05_Student_Abtract_James_Ratcliff.pdf

Would like to work with others on a full fledged Reprensentation system that 
could use these kind of techniques 
I hacked this together by myself, so I know a real team could put this kind of 
stuff to much better use.

James


Ed Porter [EMAIL PROTECTED] wrote:   I   James,
   
  Do you have any description or examples of you results.  
   
  This is something I have been telling people for years.   That you should be 
able to extract a significant amount (but probably far from all) world 
knowledge by scanning large corpora of text.  I would love to see how well it 
actually works for a given size of corpora, and for a given level of 
algorithmic sophistication.
   
  Ed Porter
   
  -Original Message-
 From: James Ratcliff [mailto:[EMAIL PROTECTED] 
 Sent: Thursday, December 06, 2007 4:51 PM
 To: agi@v2.listbox.com
 Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
   
  Richard,
   What is your specific complaint about the 'viability of the framework'?
 
 
 Ed,
   This line of data gathering is very interesting to me as well, though I 
found quickly that using all web sources quickly devolved into insanity.
 By using scanned text novels, I was able to extract lots of relational 
information on a range of topics. 
With a well defined ontology system, and some human overview, a large 
amount of information can be extracted and many probabilities learned.
 
 James
 
 
 Ed Porter [EMAIL PROTECTED] wrote:
  
 RICHARD LOOSEMORE=
 You are implicitly assuming a certain framework for solving the problem of 
representing knowledge ... and then all your discussion is about whether or not 
it is feasible to implement that framework (to overcome various issues to do 
with searches that have to be done within that framework).
 
 But I am not challenging the implementation issues, I am challenging the 
viability of the framework itself.
 
 JAMES--- What e
 
 
 ED PORTER= So what is wrong with my framework? What is wrong with a
 system of recording patterns, and a method for developing compositions and
 generalities from those patterns, in multiple hierarchical levels, and for
 indicating the probabilities of certain patterns given certain other pattern
 etc? 
 
 I know it doesn't genuflect before the alter of complexity. But what is
 wrong with the framework other than the fact that it is at a high level and
 thus does not explain every little detail of how to actually make an AGI
 work?
 
 
 
 RICHARD LOOSEMORE= These models you are talking about are trivial
 exercises in public 
 relations, designed to look really impressive, and filled with hype 
 designed to attract funding, which actually accomplish very little.
 
 Please, Ed, don't do this to me. Please don't try to imply that I need 
 to open my mind any more. Th implication seems to be that I do not 
 understand the issues in enough depth, and need to do some more work to 
 understand you points. I can assure you this is not the case.
 
 
 
 ED PORTER= Shastri's Shruiti is a major piece of work. Although it is
 a highly simplified system, for its degree of simplification it is amazingly
 powerful. It has been very helpful to my thinking about AGI. Please give
 me some excuse for calling it trivial exercise in public relations. I
 certainly have not published anything as important. Have you?
 
 The same for Mike Collins's parsers which, at least several years ago I was
 told by multiple people at MIT was considered one of the most accurate NL
 parsers around. Is that just a trivial exercise in public relations? 
 
 With regard to Hecht-Nielsen's work, if it does half of what he says it does
 it is pretty damned impressive. It is also a work I think about often when
 thinking how to deal with certain AI problems. 
 
 Richard if you insultingly dismiss such valid work as trivial exercises in
 public relations it sure as hell seems as if either you are quite lacking
 in certain important understandings -- or you have a closed mind -- or both.
 
 
 
 Ed Porter
 
 -
 This list is sponsored by AGIRI: http://www.agiri.org/email
 To unsubscribe or change your options, please go to:
 http://v2.listbox.com/member/?;
 
 -
 This list is sponsored by AGIRI: http://www.agiri.org/email
 To unsubscribe or change your options, please go to:
 http://v2.listbox.com/member/?;
  
 
 
 ___
 James Ratcliff - http://falazar.com
 Looking for something...


-
  
  Be a better friend, newshound, and know-it-all with Yahoo! Mobile. Try it now.

-
  
  This list is sponsored by AGIRI: http://www.agiri.org/email
 To unsubscribe or change your options, please go to:
 http://v2.listbox.com/member/?;
  
  
-
 This list is sponsored by AGIRI: http

RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-11 Thread Ed Porter
James,

 

I read your paper.  Your project seems right on the mark.  It provides a
domain-limited example of the general type of learning algorithm that will
probably be the central learning algorithm of AGI, i.e., finding patterns,
and hierarchies of patterns in the AGI's experience in a largely
unsupervised manner.

 

The application of the type of learning algorithm to text makes sense
because, with the web, it is one of the easiest types of experience to get
in large volumes.  It is very much the type of project I have been
advocating for years.  When I first heard of the Google project to put
millions of books into digital form, I assumed it was for exactly such
purposes, and told multiple people so.  (Ditto for the CMU million book
project.)  It seems to be the conventional wisdom that Google is not using
its vast resources for such an obvious purpose, but I wouldn't be so sure.

 

It seems to me that fiction books, at an estimated average length of 300
pages at 300 words/page, would only have about 100K words each, so that 600
of them would only be about 60 Million words, which is amazingly small for
learning from corpora studies.  That you were able to learn so much from so
little is encouraging, but it would really be interesting to see such a
project done on very large corpora, 10 or 100s of billions of words.  It
would be interesting to see how much of human common sense (and expertise)
they could, and could not, derive.

 

Ed Porter

-Original Message-
From: James Ratcliff [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, December 11, 2007 11:26 AM
To: agi@v2.listbox.com
Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

 

Here's a basic abstract I did last year I think:

http://www.falazar.com/AI/AAAI05_Student_Abtract_James_Ratcliff.pdf

Would like to work with others on a full fledged Reprensentation system that
could use these kind of techniques 
I hacked this together by myself, so I know a real team could put this kind
of stuff to much better use.

James


Ed Porter [EMAIL PROTECTED] wrote:

James,

 

Do you have any description or examples of you results.  

 

This is something I have been telling people for years.   That you should be
able to extract a significant amount (but probably far from all) world
knowledge by scanning large corpora of text.  I would love to see how well
it actually works for a given size of corpora, and for a given level of
algorithmic sophistication.

 

Ed Porter

 

-Original Message-
From: James Ratcliff [mailto:[EMAIL PROTECTED] 
Sent: Thursday, December 06, 2007 4:51 PM
To: agi@v2.listbox.com
Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

 

Richard,
  What is your specific complaint about the 'viability of the framework'?


Ed,
  This line of data gathering is very interesting to me as well, though I
found quickly that using all web sources quickly devolved into insanity.
By using scanned text novels, I was able to extract lots of relational
information on a range of topics. 
   With a well defined ontology system, and some human overview, a large
amount of information can be extracted and many probabilities learned.

James


Ed Porter [EMAIL PROTECTED] wrote:


RICHARD LOOSEMORE=
You are implicitly assuming a certain framework for solving the problem of
representing knowledge ... and then all your discussion is about whether or
not it is feasible to implement that framework (to overcome various issues
to do with searches that have to be done within that framework).

But I am not challenging the implementation issues, I am challenging the
viability of the framework itself.

JAMES--- What e


ED PORTER= So what is wrong with my framework? What is wrong with a
system of recording patterns, and a method for developing compositions and
generalities from those patterns, in multiple hierarchical levels, and for
indicating the probabilities of certain patterns given certain other pattern
etc? 

I know it doesn't genuflect before the alter of complexity. But what is
wrong with the framework other than the fact that it is at a high level and
thus does not explain every little detail of how to actually make an AGI
work?



RICHARD LOOSEMORE= These models you are talking about are trivial
exercises in public 
relations, designed to look really impressive, and filled with hype 
designed to attract funding, which actually accomplish very little.

Please, Ed, don't do this to me. Please don't try to imply that I need 
to open my mind any more. Th implication seems to be that I do not 
understand the issues in enough depth, and need to do some more work to 
understand you points. I can assure you this is not the case.



ED PORTER= Shastri's Shruiti is a major piece of work. Although it is
a highly simplified system, for its degree of simplification it is amazingly
powerful. It has been very helpful to my thinking about AGI. Please give
me some excuse for calling it trivial

RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])

2007-12-11 Thread Matt Mahoney
--- Jean-Paul Van Belle [EMAIL PROTECTED] wrote:

 Hi Matt, Wonderful idea, now it will even show the typical human trait of
 lying...when i ask it do you still love me? most answers in its database
 will have Yes as an answer  but when i ask it 'what's my name?' it'll call
 me John?

My proposed message posting service allows anyone to contribute to its
knowledge base, just like Wikipedia, so it could certainly contain some false
or useless information.  However, the number of peers that keep a copy of a
message will depend on the number of peers that accept it according to the
peers' policies, which are set individually by their owners.  The network
provides an incentive for peers to produce useful information so that other
peers will accept it.  Thus, useful and truthful information is more likely to
be propagated.


-- Matt Mahoney, [EMAIL PROTECTED]

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=74671775-73001c


Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-11 Thread Vladimir Nesov
On Dec 11, 2007 7:26 PM, James Ratcliff [EMAIL PROTECTED] wrote:
 Here's a basic abstract I did last year I think:

 http://www.falazar.com/AI/AAAI05_Student_Abtract_James_Ratcliff.pdf

 Would like to work with others on a full fledged Reprensentation system that
 could use these kind of techniques
 I hacked this together by myself, so I know a real team could put this kind
 of stuff to much better use.



 James

Do you have any particular path in mind to put this kind of thing to
work? Finding patterns is fine, and somewhat inevitable, but what are
those ontologies good for, and why?

-- 
Vladimir Nesovmailto:[EMAIL PROTECTED]

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=75044005-87874a


RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])

2007-12-07 Thread Jean-Paul Van Belle
Hi Matt, Wonderful idea, now it will even show the typical human trait of 
lying...when i ask it do you still love me? most answers in its database will 
have Yes as an answer  but when i ask it 'what's my name?' it'll call me John?

However, your approach is actually already being implemented to a certain 
extent. Apparantly (was it newsweek, time?) the No 1 search engine in 
(Singapore? Hong Kong? Taiwan? - sorry I forgot) is *not* Google but a local 
language QA system that works very much the way you envisage it (except it 
collects the answers in its own SAN i.e. not distributed over the user machines)

=Jean-Paul
 On 2007/12/07 at 18:58, in message
 [EMAIL PROTECTED], Matt Mahoney
 [EMAIL PROTECTED] wrote:
 
 Hi Matt
 
 You call it an AGI proposal but it is described as a distributed search
 algorithms that (merely) appears intelligent i.e. design for an
 Internet-wide message posting and search service. There doesn't appear to
 be any grounding or semantic interpretation by the AI system? How will it
 become more intelligent?

Turing was careful to make no distinction between being intelligent and
appearing intelligent.  The requirement for passing the Turing test is to be
able to compute a probability distribution P over text strings that varies
from the true distribution no more than it varies between different people. 
Once you can do this, then given a question Q, you can compute answer A that
maximizes P(A|Q) = P(QA)/P(Q).

This does not require grounding.  The way my system appears intelligent is by
directing Q to the right experts, and by being big enough to have experts on
nearly every conceivable topic of interest to humans.

A lot of AGI research seems to be focused on how to represent knowledge and
thought efficiently on a (much too small) computer, rather than on what
services the AGI should provide for us.

-- 

Research Associate: CITANDA
Post-Graduate Section Head 
Department of Information Systems
Phone: (+27)-(0)21-6504256
Fax: (+27)-(0)21-6502280
Office: Leslie Commerce 4.21


-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=73912948-7bb204

Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-06 Thread Mark Waser
THE KEY POINT I WAS TRYING TO GET ACROSS WAS ABOUT NOT HAVING TO 
EXPLICITLY DEAL WITH 500K TUPLES


And I asked -- Do you believe that this is some sort of huge conceptual 
breakthrough?




-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=73155533-eaf7a5


RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-06 Thread Ed Porter
Mark,

First you attack me for making a statement which you falsely claimed
indicated I did not understand the math in the Collins' article (and
potentially discreted everything I said on this list).  Once it was show
that that attack was unfair, rather than apologizing sufficiently for the
unfair attack, now you seem to be coming back with another swing.  Now you
are implicitly attacking me for implying it is new to think you could deal
with vectors in some sort of compressed representation.

I was aware that there were previous methods for dealing with vectors in
high dimensional spaces using various compression schemes, although I had
only heard of a few examples.  I personally had been planning for years
prior to reading Collin's paper to score matches based mainly on the number
of similar features, and not all the dissimilar features(except in certain
cases) to avoid the curse of high dimensionalities.  

But I was also aware of many discussions, such as one in a current best
selling AI textbook, which implies that a certain problem becomes
intractable easily because it assumes one is saddled with dealing with the
full possible dimensionality of the problem space being represented, when it
is clear you can accomplish a high percent of the same thing with a GNG type
approach by only placing represention where there are significant
probabilities.

So, all though it may not be new to you, it seems to be new to some that the
curse of high dimensionality can often be avoided in many classes of
problems.  I was citing the Collins paper as one example for showing that AI
systems have been able to deal well with high dimensionality.  I attended a
lecture at MIT that a few years after the Collin's paper came out where the
major thrust of the speech was that recently great headway was being made in
many field of AI because people were beginning to realize all sorts of
efficient hacks that avoid many of the problems of combinatorial explosion
of high dimensionality that had previously thwarted their efforts.  The
Collins paper is an example of that movement.

When it was relatively new, the Collins paper was treated by several people
I talked to as quite a breakthrough, because in conjunction of the work of
people like Haussler it showed a relatively simple way to apply the Kernel
trick to graph mapping.  As you may be aware the Kernel trick not only
allows one to score matches, but also allows many of the analytical tools of
linear algebra to be applied through the kernel, greatly reducing the
complexity of applying such tools in the much higher dimensional space
represented by the kernel mapping.  I am not a historian of this field of
math, but in its day the Kernel trick was getting a lot of buzz from many
people in the field.  I attended an NL conference at CMU in the early '90s.
The use of support vector classifiers using the kernel trick was all the
rage at the conference, and the kernels they were use seemed much less
appropriate than that Collin's paper discloses.

Ed Porter


-Original Message-
From: Mark Waser [mailto:[EMAIL PROTECTED] 
Sent: Thursday, December 06, 2007 9:09 AM
To: agi@v2.listbox.com
Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

 THE KEY POINT I WAS TRYING TO GET ACROSS WAS ABOUT NOT HAVING TO 
 EXPLICITLY DEAL WITH 500K TUPLES

And I asked -- Do you believe that this is some sort of huge conceptual 
breakthrough?



-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?;

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=73199664-8396eaattachment: winmail.dat

Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-06 Thread Mark Waser

Ed,

   Get a grip.  Try to write with complete words in complete sentences 
(unless discreted means a combination of excreted and discredited -- which 
works for me :-).


   I'm not coming back for a second swing.  I'm still pursuing the first 
one.  You just aren't oriented well enough to realize it.


Now you are implicitly attacking me for implying it is new to think you 
could deal with vectors in some sort of compressed representation.


   Nope.  First of all, compressed representation is *absolutely* the wrong 
term for what you're looking for.


   Second, I actually am still trying to figure out what *you* think you 
ARE gushing about.  (And my quest is not helped by such gems as all though 
[sic] it may not be new to you, it seems to be new to some)


   Why don't you just answer my question?  Do you believe that this is some 
sort of huge conceptual breakthrough?  For NLP (as you were initially 
pushing) or just for some nice computational tricks?


   I'll also note that you've severely changed the focus of this away from 
the NLP that you were initially raving about as such quality work -- and 
while I'll agree that kernel mapping is a very elegant tool -- Collin's work 
is emphatically *not* what I would call a shining example of it (I mean, 
*look* at his results -- they're terrible).  Yet you were touting it because 
of your 500,000 dimension fantasies and you're belief that it's good NLP 
work.


   So, in small words -- and not whining about an attack -- what precisely 
are you saying?



-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=73247008-aecb7f


RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-06 Thread Ed Porter
Mark,

You claimed I made a particular false statement about the Collins paper.
(That by itself could have just been a misunderstanding or an honest
mistake.) But then you added an insult to that by implying I had probably
made the alleged error because I was incapable of understand the mathematics
involved.  As if that wasn't enough in the way of gratuitous insults, you
suggested my alleged error called in to question the validity of the other
things I have said on this list.  

That is a pretty deep, purposely and unnecessarily, insulting put down.

I think I have shown that I did understood the math in question, perhaps
better than you, since you initially totally ignored the part of the paper
that supported my statement.  I have shown that my statement was in fact
correct by a reasonable interpretation of my words.  Thus, not only was your
accusation of my error unjustified, but also, even more so, the two insults
placed on top of it.

You have not apologized for your unjustified accusation of error and the two
additional unnecessary insults (unless your statement Ok. I'll bite. is
considered an appropriate apology for such an improper set of deep insults).
Instead you have continued in an even more insulting tone, including
starting one subsequent email with a comment about something I had said that
went as follows: 

HeavySarcasmWow.  Is that what dot products
are?/HeavySarcasm

I don't mind people questioning me, or pointing out errors when I make them.
I even have a fair amount of tolerance for people mistakenly accusing me of
making an error, if they make the false accusation honestly and not in a
purposely insulting manner, as did you.

Why should I waste more time conversing with someone who wants to converse
in such an insulting tone?

Mark, you have been quick to publicly call other people on this list
trolls, in effect to their face, in front of the whole list.  This is a
behavior most people would consider very hurtful.  So what do you call
people on this list who not only falsely accuse other people of errors, add
several unnecessary insults based on the false accusation, and then when
shown to be in error, continue addressing comments to the falsely accused
person in a HeavySarcasm style?  

How about mean spirited.

Mark, you are an intelligent person, and I have found some of your posts
valuable.  That day a few weeks ago when you and Ben were riffing back and
forth, I was offended by your tone, but I thought many of your questions
were valuable.  If you wish to continue any sort of communication with me,
feel free to question and challenge, but please lay off the HeavySarcasm
and insults which do nothing to further the exchange and clarification of
ideas.

With regard to your questions below, If you actually took the time to read
my prior responses, I think you will see I have substantially answered them.

Ed Porter

-Original Message-
From: Mark Waser [mailto:[EMAIL PROTECTED] 
Sent: Thursday, December 06, 2007 1:24 PM
To: agi@v2.listbox.com
Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

Ed,

Get a grip.  Try to write with complete words in complete sentences 
(unless discreted means a combination of excreted and discredited -- which 
works for me :-).

I'm not coming back for a second swing.  I'm still pursuing the first 
one.  You just aren't oriented well enough to realize it.

 Now you are implicitly attacking me for implying it is new to think you 
 could deal with vectors in some sort of compressed representation.

Nope.  First of all, compressed representation is *absolutely* the wrong

term for what you're looking for.

Second, I actually am still trying to figure out what *you* think you 
ARE gushing about.  (And my quest is not helped by such gems as all though 
[sic] it may not be new to you, it seems to be new to some)

Why don't you just answer my question?  Do you believe that this is some

sort of huge conceptual breakthrough?  For NLP (as you were initially 
pushing) or just for some nice computational tricks?

I'll also note that you've severely changed the focus of this away from 
the NLP that you were initially raving about as such quality work -- and 
while I'll agree that kernel mapping is a very elegant tool -- Collin's work

is emphatically *not* what I would call a shining example of it (I mean, 
*look* at his results -- they're terrible).  Yet you were touting it because

of your 500,000 dimension fantasies and you're belief that it's good NLP 
work.

So, in small words -- and not whining about an attack -- what precisely 
are you saying?


-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?;

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=73284487

RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])

2007-12-06 Thread Matt Mahoney

--- Ed Porter [EMAIL PROTECTED] wrote:

 I have a lot of respect for Google, but I don't like monopolies, whether it
 is Microsoft or Google.  I think it is vitally important that there be
 several viable search competators.  
 
 I wish this wicki one luck.  As I said, it sounds a lot like your idea.

Partly.  The main difference is that I am also proposing a message posting
service, where messages become instantly searchable and are also directed to
persistent queries.

Wikia has a big hurdle to get over.  People will ask how is this better than
Google? before they bother to download the software.  For example, Grub
(distributed spider) uses a lot of bandwidth and disk without providing much
direct benefit to the user.  The major benefit of Wikia seems to be that users
provide feedback on relevance to query responses, which in theory ought to
provide a better ranking algorithm than something like Google's PageRank.  But
assuming they get enough users to get to this level, spammers could still game
the system by flooding the network with with high rankings for their websites.

In a distributed message posting service, each peer would have its own policy
regarding which messages to relay, keep in its cache, or ignore.  If a
document is valuable, then lots of peers would keep a copy.  A client could
then rank query responses by the number of copies received weighted by the
peer's reputation.  Spammers could try to game the system by adding lots of
peers and flooding the network with advertising, but this would fail because
most other peers would be configured to ignore peers that don't provide
reciprocal services by routing their own outgoing messages.  Any peer not so
configured would quickly be abused and isolated from the network in the same
way that open relay SMTP servers get abused by spammers and blacklisted by
spam filters.

Of course a message posting service would have a big hurdle too.  Initially,
the service would have to be well integrated with the existing Internet. 
Client queries would have to go to the major search engines, and there would
have to be websites set up as peers without the user having to install
software.  Most computers are not configured to run as servers (dynamic IP,
behind firewalls, slow upload, etc), so peers will probably need to allow
message passing over client HTTP (website polling), by email, and over instant
messaging protocols.

File sharing networks became popular because they offered a service not
available elsewhere (free music).  But I don't intend for the message posting
service to be used to evade copyright or censorship (although it probably
could be).  The protocol requires that the message's originator and
intermediate routers all be identified by a reply address and time stamp.  It
won't work otherwise.


-- Matt Mahoney, [EMAIL PROTECTED]

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=73286384-77b385


RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])

2007-12-06 Thread Ed Porter
Matt,

Does a PC become more vulnerable to viruses, worms, Trojan horses, root
kits, and other web attacks if it becomes part of a P2P network? And if so
why and how much.  

Ed Porter

-Original Message-
From: Matt Mahoney [mailto:[EMAIL PROTECTED] 
Sent: Thursday, December 06, 2007 3:01 PM
To: agi@v2.listbox.com
Subject: RE: Distributed search (was RE: Hacker intelligence level [WAS Re:
[agi] Funding AGI research])


--- Ed Porter [EMAIL PROTECTED] wrote:

 I have a lot of respect for Google, but I don't like monopolies, whether
it
 is Microsoft or Google.  I think it is vitally important that there be
 several viable search competators.  
 
 I wish this wicki one luck.  As I said, it sounds a lot like your idea.

Partly.  The main difference is that I am also proposing a message posting
service, where messages become instantly searchable and are also directed to
persistent queries.

Wikia has a big hurdle to get over.  People will ask how is this better
than
Google? before they bother to download the software.  For example, Grub
(distributed spider) uses a lot of bandwidth and disk without providing much
direct benefit to the user.  The major benefit of Wikia seems to be that
users
provide feedback on relevance to query responses, which in theory ought to
provide a better ranking algorithm than something like Google's PageRank.
But
assuming they get enough users to get to this level, spammers could still
game
the system by flooding the network with with high rankings for their
websites.

In a distributed message posting service, each peer would have its own
policy
regarding which messages to relay, keep in its cache, or ignore.  If a
document is valuable, then lots of peers would keep a copy.  A client could
then rank query responses by the number of copies received weighted by the
peer's reputation.  Spammers could try to game the system by adding lots of
peers and flooding the network with advertising, but this would fail because
most other peers would be configured to ignore peers that don't provide
reciprocal services by routing their own outgoing messages.  Any peer not so
configured would quickly be abused and isolated from the network in the same
way that open relay SMTP servers get abused by spammers and blacklisted by
spam filters.

Of course a message posting service would have a big hurdle too.  Initially,
the service would have to be well integrated with the existing Internet. 
Client queries would have to go to the major search engines, and there would
have to be websites set up as peers without the user having to install
software.  Most computers are not configured to run as servers (dynamic IP,
behind firewalls, slow upload, etc), so peers will probably need to allow
message passing over client HTTP (website polling), by email, and over
instant
messaging protocols.

File sharing networks became popular because they offered a service not
available elsewhere (free music).  But I don't intend for the message
posting
service to be used to evade copyright or censorship (although it probably
could be).  The protocol requires that the message's originator and
intermediate routers all be identified by a reply address and time stamp.
It
won't work otherwise.


-- Matt Mahoney, [EMAIL PROTECTED]

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?;

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=73293460-0b3fcd

RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])

2007-12-06 Thread Matt Mahoney
--- Ed Porter [EMAIL PROTECTED] wrote:

 Matt,
 
 Does a PC become more vulnerable to viruses, worms, Trojan horses, root
 kits, and other web attacks if it becomes part of a P2P network? And if so
 why and how much.  

It does if the P2P software has vulnerabilities, just like any other server or
client.  Worms would be especially dangerous because they could spread quickly
without user intervention, but slowly spreading viruses that are well hidden
can be dangerous too.  There is no foolproof defense, but it helps to keep the
protocol and software as simple as possible, to run the P2P software as a
nonprivileged process, use open source code, and not to depend to any large
extent on a single source of software.

The protocol I have in mind is that a message contain searchable natural
language text, possibly some nonsearchable attached files, and a header with
the reply address and timestamp of the originator and any intermediate peers
through which the message was routed.  The protocol is not dangerous except
for the attached files, but these have to be included because it is a useful
service.  If you don't include it, people will figure out how to embed
arbitrary data in the message text, which would make the protocol more
dangerous because it wasn't planned for.

In theory, you could use the P2P network to spread information about malicious
peers and deliver software patches.  But I think this would introduce more
problems than it solves because it would also introduce a mechanism for
spreading false information and patches containing trojans.  Peers should have
defenses that operate independently of the network, including disconnecting
itself if it detects anomalies in its own behavior.

Of course the network is vulnerable even if the peers behave properly. 
Malicious peers could forge headers, for example, to hide the true source of
messages or to force replies to be directed to unintended targets.  Some
attacks could be very complex depending on the idiosyncratic behavior of
particular peers.



-- Matt Mahoney, [EMAIL PROTECTED]

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=73321137-bba914


RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-06 Thread James Ratcliff
Richard,
  What is your specific complaint about the 'viability of the framework'?


Ed,
  This line of data gathering is very interesting to me as well, though I found 
quickly that using all web sources quickly devolved into insanity.
By using scanned text novels, I was able to extract lots of relational 
information on a range of topics. 
   With a well defined ontology system, and some human overview, a large amount 
of information can be extracted and many probabilities learned.

James


Ed Porter [EMAIL PROTECTED] wrote: 
RICHARD LOOSEMORE=
You are implicitly assuming a certain framework for solving the problem of 
representing knowledge ... and then all your discussion is about whether or not 
it is feasible to implement that framework (to overcome various issues to do 
with searches that have to be done within that framework).

But I am not challenging the implementation issues, I am challenging the 
viability of the framework itself.

JAMES--- What e


ED PORTER= So what is wrong with my framework?  What is wrong with a
system of recording patterns, and a method for developing compositions and
generalities from those patterns, in multiple hierarchical levels, and for
indicating the probabilities of certain patterns given certain other pattern
etc?  

I know it doesn't genuflect before the alter of complexity.  But what is
wrong with the framework other than the fact that it is at a high level and
thus does not explain every little detail of how to actually make an AGI
work?



RICHARD LOOSEMORE= These models you are talking about are trivial
exercises in public 
relations, designed to look really impressive, and filled with hype 
designed to attract funding, which actually accomplish very little.

Please, Ed, don't do this to me. Please don't try to imply that I need 
to open my mind any more.  Th implication seems to be that I do not 
understand the issues in enough depth, and need to do some more work to 
understand you points.  I can assure you this is not the case.



ED PORTER= Shastri's Shruiti is a major piece of work.  Although it is
a highly simplified system, for its degree of simplification it is amazingly
powerful.  It has been very helpful to my thinking about AGI.  Please give
me some excuse for calling it trivial exercise in public relations.  I
certainly have not published anything as important.  Have you?

The same for Mike Collins's parsers which, at least several years ago I was
told by multiple people at MIT was considered one of the most accurate NL
parsers around.  Is that just a trivial exercise in public relations?  

With regard to Hecht-Nielsen's work, if it does half of what he says it does
it is pretty damned impressive.  It is also a work I think about often when
thinking how to deal with certain AI problems.  

Richard if you insultingly dismiss such valid work as trivial exercises in
public relations it sure as hell seems as if either you are quite lacking
in certain important understandings -- or you have a closed mind -- or both.



Ed Porter

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?;

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?;


___
James Ratcliff - http://falazar.com
Looking for something...
   
-
Be a better friend, newshound, and know-it-all with Yahoo! Mobile.  Try it now.

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=73349390-542055

RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])

2007-12-06 Thread Ed Porter
Matt,  
So if it is perceived as something that increases a machine's vulnerability,
it seems to me that would be one more reason for people to avoid using it.
Ed Porter

-Original Message-
From: Matt Mahoney [mailto:[EMAIL PROTECTED] 
Sent: Thursday, December 06, 2007 4:06 PM
To: agi@v2.listbox.com
Subject: RE: Distributed search (was RE: Hacker intelligence level [WAS Re:
[agi] Funding AGI research])

--- Ed Porter [EMAIL PROTECTED] wrote:

 Matt,
 
 Does a PC become more vulnerable to viruses, worms, Trojan horses, root
 kits, and other web attacks if it becomes part of a P2P network? And if so
 why and how much.  

It does if the P2P software has vulnerabilities, just like any other server
or
client.  Worms would be especially dangerous because they could spread
quickly
without user intervention, but slowly spreading viruses that are well hidden
can be dangerous too.  There is no foolproof defense, but it helps to keep
the
protocol and software as simple as possible, to run the P2P software as a
nonprivileged process, use open source code, and not to depend to any large
extent on a single source of software.

The protocol I have in mind is that a message contain searchable natural
language text, possibly some nonsearchable attached files, and a header with
the reply address and timestamp of the originator and any intermediate peers
through which the message was routed.  The protocol is not dangerous except
for the attached files, but these have to be included because it is a useful
service.  If you don't include it, people will figure out how to embed
arbitrary data in the message text, which would make the protocol more
dangerous because it wasn't planned for.

In theory, you could use the P2P network to spread information about
malicious
peers and deliver software patches.  But I think this would introduce more
problems than it solves because it would also introduce a mechanism for
spreading false information and patches containing trojans.  Peers should
have
defenses that operate independently of the network, including disconnecting
itself if it detects anomalies in its own behavior.

Of course the network is vulnerable even if the peers behave properly. 
Malicious peers could forge headers, for example, to hide the true source of
messages or to force replies to be directed to unintended targets.  Some
attacks could be very complex depending on the idiosyncratic behavior of
particular peers.



-- Matt Mahoney, [EMAIL PROTECTED]

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?;

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=73357661-483045

Re: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])

2007-12-06 Thread William Pearson
On 06/12/2007, Ed Porter [EMAIL PROTECTED] wrote:
 Matt,
 So if it is perceived as something that increases a machine's vulnerability,
 it seems to me that would be one more reason for people to avoid using it.
 Ed Porter


Why are you having this discussion on an AGI list?

  Will Pearson

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=73366106-264b25


Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-06 Thread Vladimir Nesov
On Dec 7, 2007 1:20 AM, Ed Porter [EMAIL PROTECTED] wrote:

 This is something I have been telling people for years.   That you should be
 able to extract a significant amount (but probably far from all) world
 knowledge by scanning large corpora of text.  I would love to see how well
 it actually works for a given size of corpora, and for a given level of
 algorithmic sophistication.


But what's knowledge?


-- 
Vladimir Nesovmailto:[EMAIL PROTECTED]

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=73373961-20dc54


RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-06 Thread Ed Porter
James,

 

Do you have any description or examples of you results.  

 

This is something I have been telling people for years.   That you should be
able to extract a significant amount (but probably far from all) world
knowledge by scanning large corpora of text.  I would love to see how well
it actually works for a given size of corpora, and for a given level of
algorithmic sophistication.

 

Ed Porter

 

-Original Message-
From: James Ratcliff [mailto:[EMAIL PROTECTED] 
Sent: Thursday, December 06, 2007 4:51 PM
To: agi@v2.listbox.com
Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

 

Richard,
  What is your specific complaint about the 'viability of the framework'?


Ed,
  This line of data gathering is very interesting to me as well, though I
found quickly that using all web sources quickly devolved into insanity.
By using scanned text novels, I was able to extract lots of relational
information on a range of topics. 
   With a well defined ontology system, and some human overview, a large
amount of information can be extracted and many probabilities learned.

James


Ed Porter [EMAIL PROTECTED] wrote:


RICHARD LOOSEMORE=
You are implicitly assuming a certain framework for solving the problem of
representing knowledge ... and then all your discussion is about whether or
not it is feasible to implement that framework (to overcome various issues
to do with searches that have to be done within that framework).

But I am not challenging the implementation issues, I am challenging the
viability of the framework itself.

JAMES--- What e


ED PORTER= So what is wrong with my framework? What is wrong with a
system of recording patterns, and a method for developing compositions and
generalities from those patterns, in multiple hierarchical levels, and for
indicating the probabilities of certain patterns given certain other pattern
etc? 

I know it doesn't genuflect before the alter of complexity. But what is
wrong with the framework other than the fact that it is at a high level and
thus does not explain every little detail of how to actually make an AGI
work?



RICHARD LOOSEMORE= These models you are talking about are trivial
exercises in public 
relations, designed to look really impressive, and filled with hype 
designed to attract funding, which actually accomplish very little.

Please, Ed, don't do this to me. Please don't try to imply that I need 
to open my mind any more. Th implication seems to be that I do not 
understand the issues in enough depth, and need to do some more work to 
understand you points. I can assure you this is not the case.



ED PORTER= Shastri's Shruiti is a major piece of work. Although it is
a highly simplified system, for its degree of simplification it is amazingly
powerful. It has been very helpful to my thinking about AGI. Please give
me some excuse for calling it trivial exercise in public relations. I
certainly have not published anything as important. Have you?

The same for Mike Collins's parsers which, at least several years ago I was
told by multiple people at MIT was considered one of the most accurate NL
parsers around. Is that just a trivial exercise in public relations? 

With regard to Hecht-Nielsen's work, if it does half of what he says it does
it is pretty damned impressive. It is also a work I think about often when
thinking how to deal with certain AI problems. 

Richard if you insultingly dismiss such valid work as trivial exercises in
public relations it sure as hell seems as if either you are quite lacking
in certain important understandings -- or you have a closed mind -- or both.



Ed Porter

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?;

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?;




___
James Ratcliff - http://falazar.com
Looking for something...

  

  _  

Be a better friend, newshound, and know-it-all with Yahoo! Mobile. Try
http://us.rd.yahoo.com/evt=51733/*http:/mobile.yahoo.com/;_ylt=Ahu06i62sR8H
DtDypao8Wcj9tAcJ%20  it now.

  _  

This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?
http://v2.listbox.com/member/?;


-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=73371326-7ffb17

RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])

2007-12-06 Thread Ed Porter
It was part of a discussion of using a P2P network with OpenCog to develop
distributed AGI's.

-Original Message-
From: William Pearson [mailto:[EMAIL PROTECTED] 
Sent: Thursday, December 06, 2007 5:20 PM
To: agi@v2.listbox.com
Subject: Re: Distributed search (was RE: Hacker intelligence level [WAS Re:
[agi] Funding AGI research])

On 06/12/2007, Ed Porter [EMAIL PROTECTED] wrote:
 Matt,
 So if it is perceived as something that increases a machine's
vulnerability,
 it seems to me that would be one more reason for people to avoid using it.
 Ed Porter


Why are you having this discussion on an AGI list?

  Will Pearson

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?;

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=73390249-cd905b

Re: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])

2007-12-06 Thread Matt Mahoney

--- William Pearson [EMAIL PROTECTED] wrote:

 On 06/12/2007, Ed Porter [EMAIL PROTECTED] wrote:
  Matt,
  So if it is perceived as something that increases a machine's
 vulnerability,
  it seems to me that would be one more reason for people to avoid using it.
  Ed Porter
 
 
 Why are you having this discussion on an AGI list?

Because this is an AGI design.  The intelligence comes from having a lot of
specialized experts on narrow topics and a distributed infrastructure that
directs your queries to the right experts.  The P2P protocol is natural
language text.  I will write up the proposal so it will make more sense than
the current collection of posts.


-- Matt Mahoney, [EMAIL PROTECTED]

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=73390737-69c951


RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])

2007-12-06 Thread Ed Porter
Are you saying the increase in vulnerability would be no more than that?

-Original Message-
From: Matt Mahoney [mailto:[EMAIL PROTECTED] 
Sent: Thursday, December 06, 2007 6:17 PM
To: agi@v2.listbox.com
Subject: RE: Distributed search (was RE: Hacker intelligence level [WAS Re:
[agi] Funding AGI research])


--- Ed Porter [EMAIL PROTECTED] wrote:

 Matt,  
 So if it is perceived as something that increases a machine's
vulnerability,
 it seems to me that would be one more reason for people to avoid using it.
 Ed Porter

A web browser and email increases your computer's vulnerability, but it
doesn't stop people from using them.

 
 -Original Message-
 From: Matt Mahoney [mailto:[EMAIL PROTECTED] 
 Sent: Thursday, December 06, 2007 4:06 PM
 To: agi@v2.listbox.com
 Subject: RE: Distributed search (was RE: Hacker intelligence level [WAS
Re:
 [agi] Funding AGI research])
 
 --- Ed Porter [EMAIL PROTECTED] wrote:
 
  Matt,
  
  Does a PC become more vulnerable to viruses, worms, Trojan horses, root
  kits, and other web attacks if it becomes part of a P2P network? And if
so
  why and how much.  
 
 It does if the P2P software has vulnerabilities, just like any other
server
 or
 client.  Worms would be especially dangerous because they could spread
 quickly
 without user intervention, but slowly spreading viruses that are well
hidden
 can be dangerous too.  There is no foolproof defense, but it helps to keep
 the
 protocol and software as simple as possible, to run the P2P software as a
 nonprivileged process, use open source code, and not to depend to any
large
 extent on a single source of software.
 
 The protocol I have in mind is that a message contain searchable natural
 language text, possibly some nonsearchable attached files, and a header
with
 the reply address and timestamp of the originator and any intermediate
peers
 through which the message was routed.  The protocol is not dangerous
except
 for the attached files, but these have to be included because it is a
useful
 service.  If you don't include it, people will figure out how to embed
 arbitrary data in the message text, which would make the protocol more
 dangerous because it wasn't planned for.
 
 In theory, you could use the P2P network to spread information about
 malicious
 peers and deliver software patches.  But I think this would introduce more
 problems than it solves because it would also introduce a mechanism for
 spreading false information and patches containing trojans.  Peers should
 have
 defenses that operate independently of the network, including
disconnecting
 itself if it detects anomalies in its own behavior.
 
 Of course the network is vulnerable even if the peers behave properly. 
 Malicious peers could forge headers, for example, to hide the true source
of
 messages or to force replies to be directed to unintended targets.  Some
 attacks could be very complex depending on the idiosyncratic behavior of
 particular peers.
 
 
 
 -- Matt Mahoney, [EMAIL PROTECTED]
 
 -
 This list is sponsored by AGIRI: http://www.agiri.org/email
 To unsubscribe or change your options, please go to:
 http://v2.listbox.com/member/?;
 
 -
 This list is sponsored by AGIRI: http://www.agiri.org/email
 To unsubscribe or change your options, please go to:
 http://v2.listbox.com/member/?;


-- Matt Mahoney, [EMAIL PROTECTED]

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?;

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=73394329-17b2b6

RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])

2007-12-06 Thread Matt Mahoney

--- Ed Porter [EMAIL PROTECTED] wrote:

 Matt,  
 So if it is perceived as something that increases a machine's vulnerability,
 it seems to me that would be one more reason for people to avoid using it.
 Ed Porter

A web browser and email increases your computer's vulnerability, but it
doesn't stop people from using them.

 
 -Original Message-
 From: Matt Mahoney [mailto:[EMAIL PROTECTED] 
 Sent: Thursday, December 06, 2007 4:06 PM
 To: agi@v2.listbox.com
 Subject: RE: Distributed search (was RE: Hacker intelligence level [WAS Re:
 [agi] Funding AGI research])
 
 --- Ed Porter [EMAIL PROTECTED] wrote:
 
  Matt,
  
  Does a PC become more vulnerable to viruses, worms, Trojan horses, root
  kits, and other web attacks if it becomes part of a P2P network? And if so
  why and how much.  
 
 It does if the P2P software has vulnerabilities, just like any other server
 or
 client.  Worms would be especially dangerous because they could spread
 quickly
 without user intervention, but slowly spreading viruses that are well hidden
 can be dangerous too.  There is no foolproof defense, but it helps to keep
 the
 protocol and software as simple as possible, to run the P2P software as a
 nonprivileged process, use open source code, and not to depend to any large
 extent on a single source of software.
 
 The protocol I have in mind is that a message contain searchable natural
 language text, possibly some nonsearchable attached files, and a header with
 the reply address and timestamp of the originator and any intermediate peers
 through which the message was routed.  The protocol is not dangerous except
 for the attached files, but these have to be included because it is a useful
 service.  If you don't include it, people will figure out how to embed
 arbitrary data in the message text, which would make the protocol more
 dangerous because it wasn't planned for.
 
 In theory, you could use the P2P network to spread information about
 malicious
 peers and deliver software patches.  But I think this would introduce more
 problems than it solves because it would also introduce a mechanism for
 spreading false information and patches containing trojans.  Peers should
 have
 defenses that operate independently of the network, including disconnecting
 itself if it detects anomalies in its own behavior.
 
 Of course the network is vulnerable even if the peers behave properly. 
 Malicious peers could forge headers, for example, to hide the true source of
 messages or to force replies to be directed to unintended targets.  Some
 attacks could be very complex depending on the idiosyncratic behavior of
 particular peers.
 
 
 
 -- Matt Mahoney, [EMAIL PROTECTED]
 
 -
 This list is sponsored by AGIRI: http://www.agiri.org/email
 To unsubscribe or change your options, please go to:
 http://v2.listbox.com/member/?;
 
 -
 This list is sponsored by AGIRI: http://www.agiri.org/email
 To unsubscribe or change your options, please go to:
 http://v2.listbox.com/member/?;


-- Matt Mahoney, [EMAIL PROTECTED]

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=73388768-0927ef


Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-06 Thread Vladimir Nesov
Edward,

It's certainly a trick question, since if you don't define semantics
for this knowledge thing, it can turn out to be anything from simplest
do-nothings to full-blown physically-infeasible superintelligences. So
you assertion doesn't cut the viability of knowledge extraction for
various purposes, and without that it's not clear what you actually
mean.


On Dec 7, 2007 1:20 AM, Ed Porter [EMAIL PROTECTED] wrote:
 This is something I have been telling people for years.   That you should be
 able to extract a significant amount (but probably far from all) world
 knowledge by scanning large corpora of text.  I would love to see how well
 it actually works for a given size of corpora, and for a given level of
 algorithmic sophistication.


-- 
Vladimir Nesovmailto:[EMAIL PROTECTED]

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=73400395-303d49


Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-06 Thread Vladimir Nesov
Yes, it's what triggered my nitpicking reflex; I am sorry about that.

Your comment sounds fine when related to viability of teaching an AGI
in a text-only mode without too much manual assistance, but semantics
of what it was given to is quite different.


On Dec 7, 2007 3:13 AM, Ed Porter [EMAIL PROTECTED] wrote:
 Vlad,

 My response was to the following message

 ==
 Ed,
   This line of data gathering is very interesting to me as well, though I
 found quickly that using all web sources quickly devolved into insanity.
 By using scanned text novels, I was able to extract lots of relational
 information on a range of topics.
With a well defined ontology system, and some human overview, a large
 amount of information can be extracted and many probabilities learned.

 James
 =
 so I was asking what sort of knowledge he had extracted as part of the lots
 of relational information on a range of topics.

 Ed Porter



 -Original Message-
 From: Vladimir Nesov [mailto:[EMAIL PROTECTED]
 Sent: Thursday, December 06, 2007 7:02 PM
 To: agi@v2.listbox.com
 Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]


 Edward,

 It's certainly a trick question, since if you don't define semantics
 for this knowledge thing, it can turn out to be anything from simplest
 do-nothings to full-blown physically-infeasible superintelligences. So
 you assertion doesn't cut the viability of knowledge extraction for
 various purposes, and without that it's not clear what you actually
 mean.


 On Dec 7, 2007 1:20 AM, Ed Porter [EMAIL PROTECTED] wrote:
  This is something I have been telling people for years.   That you should
 be
  able to extract a significant amount (but probably far from all) world
  knowledge by scanning large corpora of text.  I would love to see how well
  it actually works for a given size of corpora, and for a given level of
  algorithmic sophistication.


 --
 Vladimir Nesovmailto:[EMAIL PROTECTED]

 -
 This list is sponsored by AGIRI: http://www.agiri.org/email
 To unsubscribe or change your options, please go to:
 http://v2.listbox.com/member/?;

 -
 This list is sponsored by AGIRI: http://www.agiri.org/email
 To unsubscribe or change your options, please go to:
 http://v2.listbox.com/member/?;



-- 
Vladimir Nesovmailto:[EMAIL PROTECTED]

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=73408474-ba1629


RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-06 Thread Ed Porter
Vlad,

My response was to the following message

==
Ed,
  This line of data gathering is very interesting to me as well, though I
found quickly that using all web sources quickly devolved into insanity.
By using scanned text novels, I was able to extract lots of relational
information on a range of topics. 
   With a well defined ontology system, and some human overview, a large
amount of information can be extracted and many probabilities learned.

James
=
so I was asking what sort of knowledge he had extracted as part of the lots
of relational information on a range of topics.

Ed Porter



-Original Message-
From: Vladimir Nesov [mailto:[EMAIL PROTECTED] 
Sent: Thursday, December 06, 2007 7:02 PM
To: agi@v2.listbox.com
Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

Edward,

It's certainly a trick question, since if you don't define semantics
for this knowledge thing, it can turn out to be anything from simplest
do-nothings to full-blown physically-infeasible superintelligences. So
you assertion doesn't cut the viability of knowledge extraction for
various purposes, and without that it's not clear what you actually
mean.


On Dec 7, 2007 1:20 AM, Ed Porter [EMAIL PROTECTED] wrote:
 This is something I have been telling people for years.   That you should
be
 able to extract a significant amount (but probably far from all) world
 knowledge by scanning large corpora of text.  I would love to see how well
 it actually works for a given size of corpora, and for a given level of
 algorithmic sophistication.


-- 
Vladimir Nesovmailto:[EMAIL PROTECTED]

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?;

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=73401551-1f6d58

RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])

2007-12-06 Thread Matt Mahoney

--- Ed Porter [EMAIL PROTECTED] wrote:

 Are you saying the increase in vulnerability would be no more than that?

Yes, at least short term if we are careful with the design.  But then again,
you can't predict what AGI will do, or else it wouldn't be intelligent.  I
can't say for certain long term (2040s?) it wouldn't launch a singularity, or
even that it wouldn't create an intelligent worm that would eat the Internet. 
I don't think anyone is smart enough to get it right, but it is going to
happen in one form or another.

I wrote up a quick description of my AGI proposal at
http://www.mattmahoney.net/agi.html
basically summarizing what I posted over the last several emails, including
various attack scenarios.  I'm sure I didn't think of everything.  It is kind
of sketchy because it's not an area I am actively pursuing.  It should be a
useful service at least in the short term before it destroys us.


 
 -Original Message-
 From: Matt Mahoney [mailto:[EMAIL PROTECTED] 
 Sent: Thursday, December 06, 2007 6:17 PM
 To: agi@v2.listbox.com
 Subject: RE: Distributed search (was RE: Hacker intelligence level [WAS Re:
 [agi] Funding AGI research])
 
 
 --- Ed Porter [EMAIL PROTECTED] wrote:
 
  Matt,  
  So if it is perceived as something that increases a machine's
 vulnerability,
  it seems to me that would be one more reason for people to avoid using it.
  Ed Porter
 
 A web browser and email increases your computer's vulnerability, but it
 doesn't stop people from using them.
 
  
  -Original Message-
  From: Matt Mahoney [mailto:[EMAIL PROTECTED] 
  Sent: Thursday, December 06, 2007 4:06 PM
  To: agi@v2.listbox.com
  Subject: RE: Distributed search (was RE: Hacker intelligence level [WAS
 Re:
  [agi] Funding AGI research])
  
  --- Ed Porter [EMAIL PROTECTED] wrote:
  
   Matt,
   
   Does a PC become more vulnerable to viruses, worms, Trojan horses, root
   kits, and other web attacks if it becomes part of a P2P network? And if
 so
   why and how much.  
  
  It does if the P2P software has vulnerabilities, just like any other
 server
  or
  client.  Worms would be especially dangerous because they could spread
  quickly
  without user intervention, but slowly spreading viruses that are well
 hidden
  can be dangerous too.  There is no foolproof defense, but it helps to keep
  the
  protocol and software as simple as possible, to run the P2P software as a
  nonprivileged process, use open source code, and not to depend to any
 large
  extent on a single source of software.
  
  The protocol I have in mind is that a message contain searchable natural
  language text, possibly some nonsearchable attached files, and a header
 with
  the reply address and timestamp of the originator and any intermediate
 peers
  through which the message was routed.  The protocol is not dangerous
 except
  for the attached files, but these have to be included because it is a
 useful
  service.  If you don't include it, people will figure out how to embed
  arbitrary data in the message text, which would make the protocol more
  dangerous because it wasn't planned for.
  
  In theory, you could use the P2P network to spread information about
  malicious
  peers and deliver software patches.  But I think this would introduce more
  problems than it solves because it would also introduce a mechanism for
  spreading false information and patches containing trojans.  Peers should
  have
  defenses that operate independently of the network, including
 disconnecting
  itself if it detects anomalies in its own behavior.
  
  Of course the network is vulnerable even if the peers behave properly. 
  Malicious peers could forge headers, for example, to hide the true source
 of
  messages or to force replies to be directed to unintended targets.  Some
  attacks could be very complex depending on the idiosyncratic behavior of
  particular peers.
  
  
  
  -- Matt Mahoney, [EMAIL PROTECTED]
  
  -
  This list is sponsored by AGIRI: http://www.agiri.org/email
  To unsubscribe or change your options, please go to:
  http://v2.listbox.com/member/?;
  
  -
  This list is sponsored by AGIRI: http://www.agiri.org/email
  To unsubscribe or change your options, please go to:
  http://v2.listbox.com/member/?;
 
 
 -- Matt Mahoney, [EMAIL PROTECTED]
 
 -
 This list is sponsored by AGIRI: http://www.agiri.org/email
 To unsubscribe or change your options, please go to:
 http://v2.listbox.com/member/?;
 
 -
 This list is sponsored by AGIRI: http://www.agiri.org/email
 To unsubscribe or change your options, please go to:
 http://v2.listbox.com/member/?;


-- Matt Mahoney, [EMAIL PROTECTED]

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=73450735-649fdc


Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-05 Thread Mark Waser
Interesting.  Since I am interested in parsing, I read Collin's paper.  It's a 
solid piece of work (though with the stated error percentages, I don't believe 
that it really proves anything worthwhile at all) -- but your 
over-interpretations of it are ridiculous.

You claim that It is actually showing that you can do something roughly 
equivalent to growing neural gas (GNG) in a space with something approaching 
500,000 dimensions, but you can do it without normally having to deal with more 
than a few of those dimensions at one time.  Collins makes no claims that even 
remotely resembles this.  He *is* taking a deconstructionist approach (which 
Richard and many others would argue vehemently with) -- but that is virtually 
the entirety of the overlap between his paper and your claims.  Where do you 
get all this crap about 500,000 dimensions, for example?

You also make statements that are explicitly contradicted in the paper.  For 
example, you say But there really seem to be no reason why there should be any 
limit to the dimensionality of the space in which the Collin's algorithm works, 
because it does not use an explicit vector representation while his paper 
quite clearly states Each tree is represented by an n dimensional vector where 
the i'th component counts the number of occurences of the i'th tree fragment. 
(A mistake I believe you made because you didn't understand the prevceding 
sentence -- or, more critically, *any* of the math).

Are all your claims on this list this far from reality if one pursues them? 


- Original Message - 
From: Ed Porter [EMAIL PROTECTED]
To: agi@v2.listbox.com
Sent: Tuesday, December 04, 2007 10:52 PM
Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]


The particular NL parser paper in question, Collins's Convolution Kernels
for Natural Language
(http://l2r.cs.uiuc.edu/~danr/Teaching/CS598-05/Papers/Collins-kernels.pdf)
is actually saying something quite important that extends way beyond parsers
and is highly applicable to AGI in general.  

It is actually showing that you can do something roughly equivalent to
growing neural gas (GNG) in a space with something approaching 500,000
dimensions, but you can do it without normally having to deal with more than
a few of those dimensions at one time.  GNG is an algorithm I learned about
from reading Peter Voss that allows one to learn how to efficiently
represent a distribution in a relatively high dimensional space in a totally
unsupervised manner.  But there really seem to be no reason why there should
be any limit to the dimensionality of the space in which the Collin's
algorithm works, because it does not use an explicit vector representation,
nor, if I recollect correctly, a Euclidian distance metric, but rather a
similarity metric which is generally much more appropriate for matching in
very high dimensional spaces.

But what he is growing are not just points representing where data has
occurred in a high dimensional space, but sets of points that define
hyperplanes for defining the boundaries between classes.  My recollection is
that this system learns automatically from both labeled data (instances of
correct parse trees) and randomly generated deviations from those instances.
His particular algorithm matches tree structures, but with modification it
would seem to be extendable to matching arbitrary nets.  Other versions of
it could be made to operate, like GNG, in an unsupervised manner.

If you stop and think about what this is saying and generalize from it, it
provides an important possible component in an AGI tool kit. What it shows
is not limited to parsing, but it would seem possibly applicable to
virtually any hierarchical or networked representation, including nets of
semantic web RDF triples, and semantic nets, and predicate logic
expressions.  At first glance it appears it would even be applicable to
kinkier net matching algorithms, such as an Augmented transition network
(ATN) matching.

So if one reads this paper with a mind to not only what it specifically
shows, but to what how what it shows could be expanded, this paper says
something very important.  That is, that one can represent, learn, and
classify things in very high dimensional spaces -- such as 10^1
dimensional spaces -- and do it efficiently provided the part of the space
being represented is sufficiently sparsely connected.

I had already assumed this, before reading this paper, but the paper was
valuable to me because it provided a mathematically rigorous support for my
prior models, and helped me better understand the mathematical foundations
of my own prior intuitive thinking.  

It means that systems like Novemente can deal in very high dimensional
spaces relatively efficiently. It does not mean that all processes that can
be performed in such spaces will be computationally cheap (for example,
combinatorial searches), but it means that many of them, such as GNG like
recording

RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-05 Thread Ed Porter
Dave, 

 

Thanks for the link.  Seems like it gives Matt the right to say to the world
I told you so.  

 

I wonder if OpenCog could get involved in this, or something like this, in a
productive way.

 

Ed Porter

 

-Original Message-
From: David Hart [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, December 05, 2007 3:16 AM
To: agi@v2.listbox.com
Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

 

On 12/5/07, Matt Mahoney [EMAIL PROTECTED] wrote:


[snip]  Centralized search is limited to a few big players that
can keep a copy of the Internet on their servers.  Google is certainly
useful,
but imagine if it searched a space 1000 times larger and if posts were 
instantly added to its index, without having to wait days for its spider to
find them.  Imagine your post going to persistent queries posted days
earlier.
Imagine your queries being answered by real human beings in addition to
other 
peers.

I probably won't be the one writing this program, but where there is a need,
I
expect it will happen.



Wikia, the company run by Wikipedia founder Jimmy Wales, is tackling the
Internet-scale distributed search problem -
http://search.wikia.com/wiki/Atlas

Connecting to related threads (some recent, some not-so-recent), the Grub
distributed crawler ( http://search.wikia.com/wiki/Grub ) is intended to be
one of many plug-in Atlas Factories. A development goal for Grub is to
enhance it with a NL toolkit (e.g. the soon-to-be-released RelEx), so it can
do more than parse simple keywords and calculate statistical word
relationships. 

-dave

 

 

  _  

This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?
http://v2.listbox.com/member/?;


-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=72270417-205c60

Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-05 Thread Richard Loosemore
 two of
the Collin's paper cited in my prior email.  The language you quoted
occurred in the following context.

Conceptually we begin by enumerating all tree fragments that occur in the
training data
1,,n. NOTE THAT THIS IS DONE ONLY IMPLICITLY. Each tree is represented
by an n dimensional vector where the i'th component counts the number of
occurences of the i'th tree
fragment. (capitalization is added for emphasis)

This is the discussion of the conceptually very high dimensional space his
system effectively computes in but normally avoid having to explicitly deals
in.  In that conceptually high dimensional space patterns are represented
conceptually by vectors having a scalar associated with each dimension of
the high dimensional space.  But this vector is only the conceptual
representation, not the one his system actually explicitly uses for
computation.  This is the very high dimensional space I was talking about,
not the reduced dimensionality I talked about in which most operations are
performed.

The 4th paragraph on Page 3 of the paper starts with The key to our
efficient use of this high dimensional representation is the definition of
an appropriate kernel.  This kernel method it discusses uses a kernel
function C(n1,n2) which is at the end of the major equation what has three
equal signs and spans the width of page 3.  Immediately below is an image of
this equations for those reading in rich text



This function C(n1,n2) is summarized in the following text at the start of
the first full paragraph on page 4.

To see that this recursive definition is correct, note that C(n1,n2) simply
counts the number
of common subtrees that are found rooted at both n1 and n2.  In the above
equation, n1 and n2 are iteratively each node, respectively, in each of the
two trees being matched. 


Thus this kernel function deals with much less than all of the i subtrees
that occur in the training data mentioned in the above quoted text that
starts with the word Conceptually.  Instead it only deals with that subset
of the i subtrees that occur in the two parse trees that are being compared.
Since the vector referred to in the conceptually paragraph that had the
full dimensionality i is not used in the kernel function, it never needs to
be explicitly dealt with.  THUS, THE ALLEGATION BELOW THAT I MISUNDERSTOOD
THE MATH BECAUSE THOUGHT COLLIN'S PARSER DIDN'T HAVE TO DEAL WITH A VECTOR
HAVING THE FULL DIMENSIONALITY OF THE SPACE BEING DEALT WITH IS CLEARLY
FALSE. 


QED

What this is saying is rather common sensical.  It says that regardless of
how many dimensions a space has, you can compare things based on the number
of dimensions they share, which is normally a very small subset of the total
number of dimensions.  This is often called a dot product comparision, and
the matching metric is often called a similarity rather than a distance.
This is different than a normal distance comparison, which, by common
definition, measures the similarity or lack thereof in all dimensions.   But
in an extremely high dimensional space such computations become extremely
complex, and the distance is dominated by the extremely large number of
dimension that are for many purposes irrelevant to the comparison.  Of
course in the case of Collin's paper the comparison is made a little more
complex because in involves a mapping, not just a measure of the similarity
along each shared i dimension.

So, in summary, Mark, before you trash me so harshly, please take a little
more care to be sure your criticisms are actually justified.

Ed Porter
  



-Original Message-
		From: Mark Waser [mailto:[EMAIL PROTECTED] 
		Sent: Wednesday, December 05, 2007 10:27 AM

To: agi@v2.listbox.com
Subject: Re: Hacker intelligence level [WAS Re: [agi]
Funding AGI research]

Interesting.  Since I am interested in parsing, I read
Collin's paper.  It's a solid piece of work (though with the stated error
percentages, I don't believe that it really proves anything worthwhile at
all) -- but your over-interpretations of it are ridiculous.
		 
		You claim that It is actually showing that you can do

something roughly equivalent to growing neural gas (GNG) in a space with
something approaching 500,000 dimensions, but you can do it without normally
having to deal with more than a few of those dimensions at one time.
Collins makes no claims that even remotely resembles this.  He *is* taking a
deconstructionist approach (which Richard and many others would argue
vehemently with) -- but that is virtually the entirety of the overlap
between his paper and your claims.  Where do you get all this crap about
500,000 dimensions, for example?
		 
		You also make statements that are explicitly contradicted in

the paper.  For example, you say But there really seem to be no reason why
there should be any limit to the dimensionality of the space in which the
Collin's algorithm works, because it does not use

Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-05 Thread Mark Waser

ED PORTER= The 500K dimensions were mentioned several times in a
lecture Collins gave at MIT about his parse.  This was probably 5 years ago
so I am not 100% sure the number was 500K, but I am about 90% sure that was
the number used, and 100% sure the number was well over 100K.

OK.  I'll bite.  So what do *you* believe that these dimensions are?  Words? 
Word pairs?  Entire sentences?  Different trees? 



-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=72410952-199e0d


RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-05 Thread Ed Porter
Richard,

It actually is more valuable than you say.  

First, the same kernel trick can be used for GNG type unsupervised learning
in high dimensional spaces.  So it is not limited to supervised learning.

Second, you are correct is saying that through the kernel trick it is doing
the actually doing almost all of its computations in a lower dimensional
space.  

But unlike with many kernel tricks, in this one the system actually directly
access each of the dimensions in the space in different combinations as
necessary.  That is important.  It means that you can have a space with as
many dimensions as there are features or patterns in your system and still
efficiently do similarity matching (but not distance matching.)

Ed Porter

-Original Message-
From: Richard Loosemore [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, December 05, 2007 2:37 PM
To: agi@v2.listbox.com
Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

Ed Porter wrote:
 Mark,
 
 MARK WASER=== You claim that It is actually showing that you can do
 something roughly equivalent to growing neural gas (GNG) in a space with
 something approaching 500,000 dimensions, but you can do it without
normally
 having to deal with more than a few of those dimensions at one time.
 Collins makes no claims that even remotely resembles this.  He *is* taking
a
 deconstructionist approach (which Richard and many others would argue
 vehemently with) -- but that is virtually the entirety of the overlap
 between his paper and your claims.  Where do you get all this crap about
 500,000 dimensions, for example?
 
 ED PORTER= The 500K dimensions were mentioned several times in a
 lecture Collins gave at MIT about his parse.  This was probably 5 years
ago
 so I am not 100% sure the number was 500K, but I am about 90% sure that
was
 the number used, and 100% sure the number was well over 100K.  The very
 large size of the number of dimensions was mentioned repeatedly by both
 Collin's and at least one other professor with whom I talked after the
 lecture.  One of the points both emphasized was that by use of the kernel
 trick he was effectively matching in a 500K dimensional space, without
 having to deal with most of those dimensions at any one time (although, it
 is my understanding, that over many parses the system would deal with a
 large percent of all those dimensions.)  

It sounds like you may have misunderstood the relevance of the high 
number of dimensions.

Correct me if I am wrong, but Collins is not really matching in large 
numbers of dimensions, he is using the kernel trick to transform a 
nonlinear CLASSIFICATION problem into a high-dimensional linear 
classification.

This is just a trick to enable a better type of supervised learning.

Would you follow me if I said that using supervised learning is of no 
use in general?  Because it means that someone has already (a) decided 
on the dimensions of representation in the initial problem domain, and 
(b) already done all the work of classifying the sentences into 
syntactically correct and syntactically incorrect.  All that the SVM 
is doing is summarizing this training data in a nice compact form:  the 
high number of dimensions involved at one stage of the problem appear to 
be just an artifact of the method, it means nothing in general.

It especially does not mean that this supervised training algorithm is 
somehow able to break out and become and unsupervised, feature-discovery 
method, which it would have to do to be of any general interest.

I still have not read Collins' paper:  I am just getting this from my 
understanding of the math you have mentioned here.

It seems that whether or not he mentioned 500K dimensions or an infinite 
number of dimensions (which he could have done) makes no difference to 
anything.

If you think it does make a big difference, could you explain why?




Richard Loosemore




 If you read papers on support vector machines using kernel methods you
will
 realize that it is well know that you can do certain types of matching and
 other operations in high dimensional spaces with out having to actually
 normally deal in the high dimensions by use of the kernel trick.  The
 issue is often that of finding a particular kernel that works well for
your
 problem.  Collins shows the kernel trick can be extended to parse tree net
 matching.  
 
 With regard to my statement that the efficiency of the kernel trick could
be
 applied relatively generally, it is quite well supported by the following
 text from page 4 of the paper.
 
 This paper and previous work by Lodhi et al. [12] examining the
application
 of convolution kernels to strings provide some evidence that convolution
 kernels may provide an extremely useful tool for applying modern machine
 learning techniques to highly structured objects. The key idea here is
that
 one may take a structured object and split it up into parts. If one can
 construct kernels over the parts then one can

RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-05 Thread Ed Porter
Mark,

The paper said:

Conceptually we begin by enumerating all tree fragments that occur in the
training data 1,...,n.

Those are the dimensions, all of the parse tree fragments in the training
data.  And as I pointed out in an email I just sent to Richard, although
usually only a small set of them are involved in any one match between two
parse trees, they can all be used over set of many such matches.

So the full dimensionality is actually there, it is just that only a
particular subset of them are being used at any one time.  And when the
system is waiting for the next tree to match it is potentially capability of
matching it against any of its dimensions.

Ed Porter

-Original Message-
From: Mark Waser [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, December 05, 2007 3:07 PM
To: agi@v2.listbox.com
Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

ED PORTER= The 500K dimensions were mentioned several times in a
lecture Collins gave at MIT about his parse.  This was probably 5 years ago
so I am not 100% sure the number was 500K, but I am about 90% sure that was
the number used, and 100% sure the number was well over 100K.

OK.  I'll bite.  So what do *you* believe that these dimensions are?  Words?

Word pairs?  Entire sentences?  Different trees? 


-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?;

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=72646193-0bde77attachment: winmail.dat

Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-05 Thread Mark Waser
Dimensions is an awfully odd word for that since dimensions are normally 
assumed to be orthogonal.


- Original Message - 
From: Ed Porter [EMAIL PROTECTED]

To: agi@v2.listbox.com
Sent: Wednesday, December 05, 2007 5:08 PM
Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]


Mark,

The paper said:

Conceptually we begin by enumerating all tree fragments that occur in the
training data 1,...,n.

Those are the dimensions, all of the parse tree fragments in the training
data.  And as I pointed out in an email I just sent to Richard, although
usually only a small set of them are involved in any one match between two
parse trees, they can all be used over set of many such matches.

So the full dimensionality is actually there, it is just that only a
particular subset of them are being used at any one time.  And when the
system is waiting for the next tree to match it is potentially capability of
matching it against any of its dimensions.

Ed Porter

-Original Message-
From: Mark Waser [mailto:[EMAIL PROTECTED]
Sent: Wednesday, December 05, 2007 3:07 PM
To: agi@v2.listbox.com
Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

ED PORTER= The 500K dimensions were mentioned several times in a
lecture Collins gave at MIT about his parse.  This was probably 5 years ago
so I am not 100% sure the number was 500K, but I am about 90% sure that was
the number used, and 100% sure the number was well over 100K.

OK.  I'll bite.  So what do *you* believe that these dimensions are?  Words?

Word pairs?  Entire sentences?  Different trees?


-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?;

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?;



-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=72664919-0f4727


Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-05 Thread Mark Waser

HeavySarcasmWow.  Is that what dot products are?/HeavySarcasm

You're confusing all sorts of related concepts with a really garbled 
vocabulary.


Let's do this with some concrete 10-D geometry . . . . Vector A runs from 
(0,0,0,0,0,0,0,0,0,0) to (1, 1, 0,0,0,0,0,0,0,0).  Vector B runs from 
(0,0,0) to (1, 0, 1,0,0,0,0,0,0,0).


Clearly A and B share the first dimension.  Do you believe that they share 
the second and the third dimension?  Do you believe that dropping out the 
fourth through tenth dimension in all calculations is some sort of huge 
conceptual breakthrough?


The two vectors are similar in the first dimension (indeed, in all but the 
second and third) but otherwise very distant from each other (i.e. they are 
*NOT* similar).  Do you believe that these vectors are similar or distant?


THE ALLEGATION BELOW THAT I MISUNDERSTOOD THE MATH BECAUSE THOUGHT 
COLLIN'S PARSER DIDN'T HAVE TO DEAL WITH A VECTOR HAVING THE FULL 
DIMENSIONALITY OF THE SPACE BEING DEALT WITH IS CLEARLY FALSE.


My allegation was that you misunderstood the math because you claimed that 
Collin's paper does not use an explicit vector representation while 
Collin's statements and the math itself makes it quite clear that they are 
dealing with a vector representation scheme.  I'm now guessing that you're 
claiming that you intended explicit to mean full dimensionality. 
Whatever.  Don't invent your own meanings for words and you'll be 
misunderstood less often (unless you continue to drop out key words like in 
the capitalized sentence above).



-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=72452073-36665f


RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-05 Thread Ed Porter
Mark, 

Your last email started OK.  I'll bite. 

I guess you didn't bite for very long.  We are already back to explicitly
marked HeavySarcasm mode.

I guess one could argue, as you seem to be doing, that indicating which of
500k dimensions had a match between two subtrees currently being compared,
could be considered equivalent to explicitly representing a huge 500k
dimensional binary vector -- but i think one could more strongly claim that
such an indication would be, at best, only an implicit representation of the
500k vector.  

THE KEY POINT I WAS TRYING TO GET ACROSS WAS ABOUT NOT HAVING TO EXPLICITLY
DEAL WITH 500K TUPLES in each match, which is what I meant when I said not
explicitly deal with the high dimensional vectors.  This is a big plus in
terms of representational and computational efficiency.  I did not say there
was nothing equivalent to an implicit use of the high dimensional vector,
because kernels implicitly do use high dimensional vectors, but they do so
implicitly rather than explicitly.  That is why they increase efficiency.

My Merriam-Webster's Collegiate Dictionary gives as its first, which usually
means most common, definition of  explicit the following:

 fully revealed or expressed without vagueness,
implication, or ambiguity.

The information that two subtree to be matched contains a given set of
subtrees, defined by their indicies, without more, does not by itself define
a full 500K vector, nor even the full dimensionality of the vector.  That
information can only be derived from other information, which presumably is
not even used in the match procedure

Of course there are other definitions of the world explicit which mean
exact, and you could argue that indicating a few of the 500K indicies is
equivalent to exactly specifying a corresponding 500K dimensional vector,
once one takes into account other information.

When a use of a word in a given statement has two interpretations one of
which is correct, it is not clear one has the right to attack the person
making that statement for being incorrect.  At most you can attack him for
being ambiguous.  And normally on this list people do not attack other
people as rudely as you have attached me for merely being ambiguous.

Ed Porter


-Original Message-
From: Mark Waser [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, December 05, 2007 3:40 PM
To: agi@v2.listbox.com
Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

HeavySarcasmWow.  Is that what dot products are?/HeavySarcasm

You're confusing all sorts of related concepts with a really garbled 
vocabulary.

Let's do this with some concrete 10-D geometry . . . . Vector A runs from 
(0,0,0,0,0,0,0,0,0,0) to (1, 1, 0,0,0,0,0,0,0,0).  Vector B runs from 
(0,0,0) to (1, 0, 1,0,0,0,0,0,0,0).

Clearly A and B share the first dimension.  Do you believe that they share 
the second and the third dimension?  Do you believe that dropping out the 
fourth through tenth dimension in all calculations is some sort of huge 
conceptual breakthrough?

The two vectors are similar in the first dimension (indeed, in all but the 
second and third) but otherwise very distant from each other (i.e. they are 
*NOT* similar).  Do you believe that these vectors are similar or distant?

 THE ALLEGATION BELOW THAT I MISUNDERSTOOD THE MATH BECAUSE THOUGHT 
 COLLIN'S PARSER DIDN'T HAVE TO DEAL WITH A VECTOR HAVING THE FULL 
 DIMENSIONALITY OF THE SPACE BEING DEALT WITH IS CLEARLY FALSE.

My allegation was that you misunderstood the math because you claimed that 
Collin's paper does not use an explicit vector representation while 
Collin's statements and the math itself makes it quite clear that they are 
dealing with a vector representation scheme.  I'm now guessing that you're 
claiming that you intended explicit to mean full dimensionality. 
Whatever.  Don't invent your own meanings for words and you'll be 
misunderstood less often (unless you continue to drop out key words like in 
the capitalized sentence above).


-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?;

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=72881028-794447attachment: winmail.dat

RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-05 Thread Ed Porter
They need not be.

-Original Message-
From: Mark Waser [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, December 05, 2007 6:04 PM
To: agi@v2.listbox.com
Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

Dimensions is an awfully odd word for that since dimensions are normally 
assumed to be orthogonal.

- Original Message - 
From: Ed Porter [EMAIL PROTECTED]
To: agi@v2.listbox.com
Sent: Wednesday, December 05, 2007 5:08 PM
Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]


Mark,

The paper said:

Conceptually we begin by enumerating all tree fragments that occur in the
training data 1,...,n.

Those are the dimensions, all of the parse tree fragments in the training
data.  And as I pointed out in an email I just sent to Richard, although
usually only a small set of them are involved in any one match between two
parse trees, they can all be used over set of many such matches.

So the full dimensionality is actually there, it is just that only a
particular subset of them are being used at any one time.  And when the
system is waiting for the next tree to match it is potentially capability of
matching it against any of its dimensions.

Ed Porter

-Original Message-
From: Mark Waser [mailto:[EMAIL PROTECTED]
Sent: Wednesday, December 05, 2007 3:07 PM
To: agi@v2.listbox.com
Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

ED PORTER= The 500K dimensions were mentioned several times in a
lecture Collins gave at MIT about his parse.  This was probably 5 years ago
so I am not 100% sure the number was 500K, but I am about 90% sure that was
the number used, and 100% sure the number was well over 100K.

OK.  I'll bite.  So what do *you* believe that these dimensions are?  Words?

Word pairs?  Entire sentences?  Different trees?


-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?;

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?;



-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?;

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=72742511-f9bb8b

Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-05 Thread David Hart
On 12/5/07, Matt Mahoney [EMAIL PROTECTED] wrote:


 [snip]  Centralized search is limited to a few big players that
 can keep a copy of the Internet on their servers.  Google is certainly
 useful,
 but imagine if it searched a space 1000 times larger and if posts were
 instantly added to its index, without having to wait days for its spider
 to
 find them.  Imagine your post going to persistent queries posted days
 earlier.
 Imagine your queries being answered by real human beings in addition to
 other
 peers.

 I probably won't be the one writing this program, but where there is a
 need, I
 expect it will happen.



Wikia, the company run by Wikipedia founder Jimmy Wales, is tackling the
Internet-scale distributed search problem -
http://search.wikia.com/wiki/Atlas

Connecting to related threads (some recent, some not-so-recent), the Grub
distributed crawler ( http://search.wikia.com/wiki/Grub ) is intended to be
one of many plug-in Atlas Factories. A development goal for Grub is to
enhance it with a NL toolkit (e.g. the soon-to-be-released RelEx), so it can
do more than parse simple keywords and calculate statistical word
relationships.

-dave

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=72165246-397899

RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-05 Thread John G. Rose
 From: Matt Mahoney [mailto:[EMAIL PROTECTED]
 My design would use most of the Internet (10^9 P2P nodes).  Messages
 would be
 natural language text strings, making no distinction between documents,
 queries, and responses.  Each message would have a header indicating the
 ID
 and time stamp of the originator and any intermediate nodes through
 which the
 message was routed.  A message could also have attached files.  Each
 node
 would have a cache of messages and its own policy on which messages it
 decides
 to keep or discard.
 
 The goal of the network is to route messages to other nodes that store
 messages with matching terms.  To route an incoming message x, it
 matches
 terms in x to terms in stored messages and sends copies to nodes that
 appear
 in those headers, appending its own ID and time stamp to the header of
 the
 outgoing copies.  It also keeps a copy, so that the receiving nodes
 knows that
 they know it has a copy of x (at least temporarily).
 
 The network acts as a distributed database with a distributed search
 function.
  If X posts a document x and Y posts a query y with matching terms, then
 the
 network acts to route x to Y and y to X.


The very tricky but required part of creating a global network like this is
going from zero nodes to whatever the goal is. I think that much emphasis of
a design needs to be put into the growth function. If you have 50 nodes
running how do you get to 500? And 500 to 5,000? And then if it goes down
from 50,000 to 10,000 fast how is it revived before crash? Engineering
expertise, ingenuity + maybe psychological and sociological wisdom can be
used to make this happen. And we all know that the growth could happen
quickly, even overnight. 

Then once getting to 10^9 nodes they have to be maintained or they can die
quickly and even instantaneously. 

Having an intelligent botnet has its advantages. Once it's running and users
try to uninstall it the botnet can try to fight for survival by reasoning
with the users. You could make it such that a user has to verbally
communicate with it to remove it. The botnet could stall and ask things like
Why are you doing this to me after all I have done for you? User:sorry
charlie, I command you to uninstall! Bot:OK let's cut a deal... I know we
can work this out...

John


-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=72911975-ce1dcc


Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])

2007-12-05 Thread Matt Mahoney

--- Ed Porter [EMAIL PROTECTED] wrote:

 Matt,
 
 Perhaps your are right.  
 
 But one problem is that big Google-like compuplexes in the next five to ten
 years will be powerful enough to do AGI and they will be much more efficient
 for AGI search because the physical closeness of their machines will make it
 possible for them to perform the massive interconnected needed for powerful
 AGI much more efficiently.

Google controls about 0.1% of the world's computing power.  But I think their
ability to achieve AGI first will not be so much due to the high bandwidth of
their CPU cluster, as that nobody controls the other 99.9%.

Centralized search tends to produce monopolies as the cost of entry goes up. 
It is not so bad now because Google still has a (dwindling) set of
competitors.  They can't yet hide content that threatens them.

Distributed search like Wikia/Atlas/Grub is interesting, but if people don't
see a compelling need for it, it won't happen.  How big will it have to get
before it is better than Google?  File sharing networks would probably be a
lot bigger and more useful (with mostly legitimate content) if we could solve
the distributed search problem.


-- Matt Mahoney, [EMAIL PROTECTED]

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=72969535-74e4ee


Distrubuted message pool (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])

2007-12-05 Thread Matt Mahoney
--- John G. Rose [EMAIL PROTECTED] wrote:

  From: Matt Mahoney [mailto:[EMAIL PROTECTED]
  My design would use most of the Internet (10^9 P2P nodes).  Messages
  would be
  natural language text strings, making no distinction between documents,
  queries, and responses.  Each message would have a header indicating the
  ID
  and time stamp of the originator and any intermediate nodes through
  which the
  message was routed.  A message could also have attached files.  Each
  node
  would have a cache of messages and its own policy on which messages it
  decides
  to keep or discard.
  
  The goal of the network is to route messages to other nodes that store
  messages with matching terms.  To route an incoming message x, it
  matches
  terms in x to terms in stored messages and sends copies to nodes that
  appear
  in those headers, appending its own ID and time stamp to the header of
  the
  outgoing copies.  It also keeps a copy, so that the receiving nodes
  knows that
  they know it has a copy of x (at least temporarily).
  
  The network acts as a distributed database with a distributed search
  function.
   If X posts a document x and Y posts a query y with matching terms, then
  the
  network acts to route x to Y and y to X.
 
 
 The very tricky but required part of creating a global network like this is
 going from zero nodes to whatever the goal is. I think that much emphasis of
 a design needs to be put into the growth function. If you have 50 nodes
 running how do you get to 500? And 500 to 5,000? And then if it goes down
 from 50,000 to 10,000 fast how is it revived before crash? Engineering
 expertise, ingenuity + maybe psychological and sociological wisdom can be
 used to make this happen. And we all know that the growth could happen
 quickly, even overnight. 

Getting the network to grow means providing enough incentive that people will
want to install your software.  A distributed message pool offers two
services: distributed search and a message posting service.  Information has
negative value, so it is the second service that provides the incentive.  You
type your message into a client window, and it instantly becomes available to
anyone who enters a query with matching terms.

 Then once getting to 10^9 nodes they have to be maintained or they can die
 quickly and even instantaneously. 

How?  A peer would a piece of software that people would use every day, like a
web browser or email.  People aren't going to suddenly decide to uninstall
them all at once or turn off their computers.  One possible scenario is a
virus or worm spreading quickly from peer to peer.  Hopefully there will be a
wide variety of peers offering different services, so that individual
vulnerabilities could affect only a small part of the network.

 Having an intelligent botnet has its advantages. Once it's running and users
 try to uninstall it the botnet can try to fight for survival by reasoning
 with the users. You could make it such that a user has to verbally
 communicate with it to remove it. The botnet could stall and ask things like
 Why are you doing this to me after all I have done for you? User:sorry
 charlie, I command you to uninstall! Bot:OK let's cut a deal... I know we
 can work this out...

Well, I expect the intelligence to come from having a large number of
specialized but relatively dumb peers, and a network that can direct your
queries to the right ones.  Peers would individually be under the control of
their human owners, just as web servers and clients are now.  It's not like
you could command the Internet to uninstall anyway.

Eventually we will need to deal with the problem of the network becoming
smarter than us, but I think the threshold of concern is when the collective
computing power in silicon exceeds the collective computing power in carbon. 
Right now the Internet has about as much computing power as a few hundred
human brains, but we still have a ways to go to the singularity.


-- Matt Mahoney, [EMAIL PROTECTED]

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=73000478-537c13


RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])

2007-12-05 Thread Ed Porter
I have a lot of respect for Google, but I don't like monopolies, whether it
is Microsoft or Google.  I think it is vitally important that there be
several viable search competators.  

I wish this wicki one luck.  As I said, it sounds a lot like your idea.

Ed Porter

-Original Message-
From: Matt Mahoney [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, December 05, 2007 9:24 PM
To: agi@v2.listbox.com
Subject: Distributed search (was RE: Hacker intelligence level [WAS Re:
[agi] Funding AGI research])


--- Ed Porter [EMAIL PROTECTED] wrote:

 Matt,
 
 Perhaps your are right.  
 
 But one problem is that big Google-like compuplexes in the next five to
ten
 years will be powerful enough to do AGI and they will be much more
efficient
 for AGI search because the physical closeness of their machines will make
it
 possible for them to perform the massive interconnected needed for
powerful
 AGI much more efficiently.

Google controls about 0.1% of the world's computing power.  But I think
their
ability to achieve AGI first will not be so much due to the high bandwidth
of
their CPU cluster, as that nobody controls the other 99.9%.

Centralized search tends to produce monopolies as the cost of entry goes up.

It is not so bad now because Google still has a (dwindling) set of
competitors.  They can't yet hide content that threatens them.

Distributed search like Wikia/Atlas/Grub is interesting, but if people don't
see a compelling need for it, it won't happen.  How big will it have to get
before it is better than Google?  File sharing networks would probably be a
lot bigger and more useful (with mostly legitimate content) if we could
solve
the distributed search problem.


-- Matt Mahoney, [EMAIL PROTECTED]

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?;

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=73068614-a9079e

RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-04 Thread Ed Porter
Bryan, The name grub sounds familiar.  That is probably it.  Ed

-Original Message-
From: Bryan Bishop [mailto:[EMAIL PROTECTED] 
Sent: Monday, December 03, 2007 10:47 PM
To: agi@v2.listbox.com
Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

On Thursday 29 November 2007, Ed Porter wrote:
 Somebody (I think it was David Hart) told me there is a shareware
 distributed web crawler already available, but I don't know the
 details, such as how good or fast it is.

http://grub.org/
Previous owner went by the name of 'kordless'. I found him on Slashdot.

- Bryan

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?;

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=71801708-39700e


RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-04 Thread Ed Porter


RICHARD LOOSEMORE= You have no idea of the context in which I made
that sweeping dismissal. 
  If you have enough experience of research in this area you will know 
that it is filled with bandwagons, hype and publicity-seeking.  Trivial 
models are presented as if they are fabulous achievements when, in fact, 
they are just engineered to look very impressive but actually solve an 
easy problem.  Have you had experience of such models?  Have you been 
around long enough to have seen something promoted as a great 
breakthrough even though it strikes you as just a trivial exercise in 
public relations, and then watch history unfold as the great 
breakthrough leads to  absolutely nothing at all, and is then 
quietly shelved by its creator?  There is a constant ebb and flow of 
exaggeration and retreat, exaggeration and retreat.  You are familiar 
with this process, yes?

ED PORTER= Richard, the fact that a certain percent of theories and
demonstrations are false and/or misleading does not give you the right to
dismiss any theory or demonstration that counters your position in an
argument as 

trivial exercises in public relations, designed to look
really impressive, and filled with hype designed to attract funding, which
actually accomplish very little

without at least giving some supporting argument for your dismissal.
Otherwise you could deny any aspect of scientific, mathematical, or
technological knowledge, no matter how sound, that proved inconvenient to
whatever argument you were making.  

There are people who argue in that dishonest fashion, but it is questionable
how much time one should spend conversing with them.  Do you want to be such
a person?

The fact that one of the pieces of evidence you so rudely dismissed is a
highly functional program that has been used by many other researchers,
shows the blindness with which you dismiss the arguments of others.


RICHARD LOOSEMORE=This entire discussion baffles me.  Does it matter
at all to you that I 
have been working in this field for decades?  Would you go up to someone 
at your local university and tell them how to do their job?  Would you 
listen to what they had to say about issues that arise in their field of 
expertise, or would you consider your own opinion entirely equal to 
theirs, with only a tiny fraction of their experience?

ED PORTER= No mater how many years you have been studying something, if
your argumentative and intellectual approach is to dismiss evidence contrary
to your position on clearly false bases, as you did with you dismissal of my
evidence with your above quoted insult, a serious question is raised as to
whether you are worth listening to or conversing with. 


ED PORTER





-Original Message-
From: Richard Loosemore [mailto:[EMAIL PROTECTED] 
Sent: Monday, December 03, 2007 10:47 PM
To: agi@v2.listbox.com
Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

Ed Porter wrote:
 
 
I'm sorry, but this is not addressing the actual
 issues involved.
 
 You are implicitly assuming a certain framework for solving the problem 
 of representing knowledge ... and then all your discussion is about 
 whether or not it is feasible to implement that framework (to overcome 
 various issues to do with searches that have to be done within that 
 framework).
 
 But I am not challenging the implementation issues, I am challenging the 
 viability of the framework itself.
 
 
 ED PORTER= So what is wrong with my framework?  What is wrong with a
 system of recording patterns, and a method for developing compositions and
 generalities from those patterns, in multiple hierarchical levels, and for
 indicating the probabilities of certain patterns given certain other
pattern
 etc?  
 
 I know it doesn't genuflect before the alter of complexity.  But what is
 wrong with the framework other than the fact that it is at a high level
and
 thus does not explain every little detail of how to actually make an AGI
 work?
 
 
 
 RICHARD LOOSEMORE= These models you are talking about are trivial
 exercises in public 
 relations, designed to look really impressive, and filled with hype 
 designed to attract funding, which actually accomplish very little.
 
 Please, Ed, don't do this to me. Please don't try to imply that I need 
 to open my mind any more.  Th implication seems to be that I do not 
 understand the issues in enough depth, and need to do some more work to 
 understand you points.  I can assure you this is not the case.
 
 
 
 ED PORTER= Shastri's Shruiti is a major piece of work.  Although it
is
 a highly simplified system, for its degree of simplification it is
amazingly
 powerful.  It has been very helpful to my thinking about AGI.  Please give
 me some excuse for calling it trivial exercise in public relations.  I
 certainly have not published anything as important.  Have you?
 
 The same for Mike Collins's parsers which, at least several years ago I
was
 told

RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-04 Thread Ed Porter
John, 

I am sure there is interesting stuff that can be done.  It would be
interesting just to see what sort of an agi could be made on a PC.

I would be interested in you Ideas for how to make a powerful AGI without a
vast amount of interconnect.  The major schemes I know about for reducting
interconnect involve allocating what interconnect you have to the links with
the highest probability or importance, varying those measures of probability
and importance in a contest specific way, and being guided by prior similar
experiences.

Ed Porter

-Original Message-
From: John G. Rose [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, December 04, 2007 1:42 AM
To: agi@v2.listbox.com
Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

Ed,

Well it'd be nice having a supercomputer but P2P is a poor man's
supercomputer and beggars can't be choosy.

Honestly the type of AGI that I have been formulating in my mind has not
been at all closely related to simulating neural activity through
orchestrating partial and mass activations at low frequencies and I had been
avoiding those contagious cog sci memes on purpose. But your expose on the
subject is quite interesting and I wasn't that aware that that is how things
have been being done.

But getting more than a few thousand P2P nodes is difficult. Going from 10K
to 20K nodes and up, getting more difficult to the point of being
prohibitively expensive to being impossible or extremely lucky.  There are
ways to do it but according to your calculations the supercomputer mayt be
more of a wise choice as going out and scrounging up funding for that would
be easier.

Still though (besides working on my group theory heavy design) exploring the
crafting and chiseling of an activation model you are talking about to the
P2P network could be fruitful. I feel that through a number of up front and
unfortunately complicated design changes/adaptations that the activation
orchestrations could be improved thus bringing down the message rate
requirements, reducing activation requirements, depths and frequencies,
through a sort of computational resource topology consumption,
self-organizational design molding.

You do indicate some dynamic resource adaption and things like intelligent
inference guiding schemes in your description but it doesn't seem like it
melts enough into the resource space. But having a design be less static
risks excessive complications...

A major problem though with P2P and the activation methodology is that there
are so many variances in the latencies and availability that serious
synchronicity/simultaneity issues would exist that even more messaging might
be required. Since there are so many variables in public P2P, empirical data
also would be necessary to get a gander on feasibility.

I still feel strongly that the way to do AGI P2P (with public P2P as core
not augmental) is to understand the grid, and build the AGI design based on
that and what it will be in a few years, instead of taking a design and
morphing it to the resource space. That said, there are finite designs that
will work so the number of choices is few.

John


_
From: Ed Porter [mailto:[EMAIL PROTECTED] 
Sent: Monday, December 03, 2007 6:17 PM
To: agi@v2.listbox.com
Subject: RE: Hacker intelligence level [WAS Re: [agi]
Funding AGI research]


John, 

You raised some good points.  The problem is that the total
number of messages/sec that can be received is relatively small.  It is not
as if you are dealing with a multidimensional grid or toroidal net in which
spreading tree activation can take advantage of the fact that the total
parallel bandwidth for regional messaging can be much greater than the
x-sectional bandwidth.  

In a system where each node is a server class node with
multiple processors and 32 or 64Gbytes of ram, much of which is allocable to
representation, sending messages to local indices on each machine could
fairly efficiently activate all occurrences of something in a 32 to 64 TByte
knowledge base with a max of 1K internode messages, if there was only 1K
nodes.

But in a PC based P2P system the ratio of nodes to
representation space is high and the total number of 128 byte messages/sec
than can be received is limited to about 100, so neither methods of trying
to increase number of patterns than can be activated with the given
interconnect of the network buy you as much.

Human level context sensitivity arises because a large
number of things that can depend on a large number of things in the current
context are made aware of those dependencies.  This takes a lot of
messaging, and I don't see how a P2P system where each node can only receive
about 100 relatively short messages a second is going to make this possible
unless you had a huge

Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-04 Thread Richard Loosemore

Ed Porter wrote:



RICHARD LOOSEMORE= You have no idea of the context in which I made
that sweeping dismissal. 
  If you have enough experience of research in this area you will know 
that it is filled with bandwagons, hype and publicity-seeking.  Trivial 
models are presented as if they are fabulous achievements when, in fact, 
they are just engineered to look very impressive but actually solve an 
easy problem.  Have you had experience of such models?  Have you been 
around long enough to have seen something promoted as a great 
breakthrough even though it strikes you as just a trivial exercise in 
public relations, and then watch history unfold as the great 
breakthrough leads to  absolutely nothing at all, and is then 
quietly shelved by its creator?  There is a constant ebb and flow of 
exaggeration and retreat, exaggeration and retreat.  You are familiar 
with this process, yes?


ED PORTER= Richard, the fact that a certain percent of theories and
demonstrations are false and/or misleading does not give you the right to
dismiss any theory or demonstration that counters your position in an
argument as 


trivial exercises in public relations, designed to look
really impressive, and filled with hype designed to attract funding, which
actually accomplish very little

without at least giving some supporting argument for your dismissal.
Otherwise you could deny any aspect of scientific, mathematical, or
technological knowledge, no matter how sound, that proved inconvenient to
whatever argument you were making.  


There are people who argue in that dishonest fashion, but it is questionable
how much time one should spend conversing with them.  Do you want to be such
a person?

The fact that one of the pieces of evidence you so rudely dismissed is a
highly functional program that has been used by many other researchers,
shows the blindness with which you dismiss the arguments of others.


Ed,

You are misunderstanding this situation.  You repeatedly make extremely 
strong statements about the subject matter of AGI, but you do not have 
enough knowledge of the issues to understand the replies you get.


Now, there is nothing wrong with not understanding, but what happens 
next is quite intolerable:  you argue back as if your opinion was just 
as valid as the hard-won knowledge that someone else took 25 years to 
acquire.


Not only that, but you go on to sprinkle your comments with instructions 
to that person to open their mind as if the were somehow being 
closed-minded.


AND not only that, but when I display some impatience with this behavior 
and decline to write a massive essay to explain stuff that you should be 
learning for yourself, you decide to fling out accusations such as that 
i am arguing in a dishonest manner, or that I am dismissing an 
argument or theory just because it counters my position.


If you look at the broad sweep of my postings on these lists you will 
notice that I spend much more time than I should writing out 
explanations when people say that they find something I wrote confusing 
or incomplete.  When someone starts behaving rudely, however, I lose 
patience.  What you are experiencing now is lost patience, that is all.




Richard Loosemore

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=71815518-2fa3ba


RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-04 Thread Ed Porter
Richard,

It is not clear how valuable your 25 years of hard won learning is if it
causes you to dismiss valuable scientific work that seems to have eclipsed
the importance of anything I or you have published as trivial exercises in
public relations without giving any reason whatsoever for the particular
dismissal.

I welcome criticism in this forum provided it is well reasoned and without
venom.  But to dismiss a list of examples I give to support an argument as
trivial exercises in public relations without any justification other than
the fact that in general a certain numbers of published papers are
inaccurate and/or overblown, is every bit as dishonest as calling someone a
liar with regard to a particular statement based on nothing more than the
knowledge some people are liars.

In my past exchanges with you, sometimes your responses have been helpful.
But I have noticed that although you are very quick to question me (and
others), if I question you, rather than respond directly to my arguments you
often don't respond to them at all -- such as your recent refusal to justify
your allegation that my whole framework, presumably for understanding AGI,
was wrong (a pretty insulting statement which should not be flung around
without some justification).  Or if you do respond to challenges, you often
dismiss them as invalid without any substantial evidence, or you
substantially change the subject, such as by focusing on one small part of
my argument that I have not yet fully supported, while refusing to
acknowledge the major support I have shown for the major thrust of my
argument.

When you argue like that there really is no purpose in continuing the
conversation.  What's the point.  Under those circumstance your not dealing
with someone who is likely to tell you anything of worth.  Rather you are
only likely to hear lame defensive arguments from somebody who is either
incapable of properly defending or unwilling to properly defend their
arguments, and, thus, is unlikely to communicate anything of value in the
exchange.

Your 25 years of experience doesn't mean squat about how much you truly
understand AGI unless you are capable of being more intellectually honest,
both with yourself and with others -- and unless you are capable of actually
reasonably defending your understandings, head-on, against reasoned
questioning and countering evidence.  To dismiss counter evidence cited
against your arguments as trivial exercises in public relations without
any specific justification is not a reasonable defense, and the fact that
you so often result to such intellectually dishonest tactics to defend your
stated understandings relating to AGI really does call into question the
quality of those understandings.

In summary, don't go around attacking other people's statements unless you
are willing to defend those attacks in an intellectually honest manner.

Ed Porter

P.S. This is my last response in this thread.  You can have the last say if
you so wish.  

-Original Message-
From: Richard Loosemore [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, December 04, 2007 9:58 AM
To: agi@v2.listbox.com
Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

Ed Porter wrote:
 
 RICHARD LOOSEMORE= You have no idea of the context in which I made
 that sweeping dismissal. 
   If you have enough experience of research in this area you will know 
 that it is filled with bandwagons, hype and publicity-seeking.  Trivial 
 models are presented as if they are fabulous achievements when, in fact, 
 they are just engineered to look very impressive but actually solve an 
 easy problem.  Have you had experience of such models?  Have you been 
 around long enough to have seen something promoted as a great 
 breakthrough even though it strikes you as just a trivial exercise in 
 public relations, and then watch history unfold as the great 
 breakthrough leads to  absolutely nothing at all, and is then 
 quietly shelved by its creator?  There is a constant ebb and flow of 
 exaggeration and retreat, exaggeration and retreat.  You are familiar 
 with this process, yes?
 
 ED PORTER= Richard, the fact that a certain percent of theories and
 demonstrations are false and/or misleading does not give you the right to
 dismiss any theory or demonstration that counters your position in an
 argument as 
 
   trivial exercises in public relations, designed to look
 really impressive, and filled with hype designed to attract funding, which
 actually accomplish very little
 
 without at least giving some supporting argument for your dismissal.
 Otherwise you could deny any aspect of scientific, mathematical, or
 technological knowledge, no matter how sound, that proved inconvenient to
 whatever argument you were making.  
 
 There are people who argue in that dishonest fashion, but it is
questionable
 how much time one should spend conversing with them.  Do you want to be
such
 a person?
 
 The fact that one

RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-04 Thread John G. Rose
 From: Ed Porter [mailto:[EMAIL PROTECTED]
 John,
 
 I am sure there is interesting stuff that can be done.  It would be
 interesting just to see what sort of an agi could be made on a PC.

Yes it would be interesting to see what could be done on a small cluster of
modern server grade computers. I like to think about the newer Penryn 45nm,
SSE4, quadcore quadproc servers with lots of FB DDR3 800mhz RAM running 64
bit OS (sorry I prefer coding in Windows) using standard gigabit Ethernet
quad NICs, with solid state drives, and 15,000 RPM SAS for the slower stuff,
and a take maybe 10 of these servers. There HAS to be enough resource there
to get some small prototype going. 

And look at next year's 8 core Nehalem procs coming out...

Interserver messaging should make heavy use of IP multicasting. Then another
messaging channel with the new USB 3.0... Supposedly USB 3.0 is 4.8
gigabits.


 I would be interested in you Ideas for how to make a powerful AGI
 without a
 vast amount of interconnect.  The major schemes I know about for
 reducting
 interconnect involve allocating what interconnect you have to the links
 with
 the highest probability or importance, varying those measures of
 probability
 and importance in a contest specific way, and being guided by prior
 similar
 experiences.

Well I actually don't have the theory far enough to calculate interconnect
metrics. But I try to minimize that through storage structure. What gets
stored, how it gets stored, where it's stored, how systems are modeled, what
a model is, what a system of models are, how systems of models are stored,..
don't store dupes, store diffs... mixing code and data, collapsing data into
code, what is code and what is data? Basically a lot of intelligent
indexing, like real intelligent indexing... 

I'm working on using CA's as universal symbolistic indexors and generators -
IOW exploring a theory of uncalculated precalcs for computational complexity
indexing using CA's in order to control uncertainty and manage complexity...

Lots of addicting brain candy stuff...

John



 

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=71921419-8e0002


RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-04 Thread Ed Porter
John,

As you say the hardware is just going to get better and better.  In five
years the PC's of most of the people on this list will probably have at
least 8 cores and 16 gig of ram.

But even with a current 32 bit PC with say 4G of Ram you should be able to
build an AGI that would be a meaningful proof of concept.  Lets say 3G is
for representation, at say 60 bytes per atom (less than my usual 100
bytes/atom because using 32bit pointers), that would allow you roughly
50Million atoms.  Over 1 million seconds (very roughly two weeks 24/7) that
would allow an average of 50 atoms a second of representation.  Of course
your short term memory would record at a much higher frequency, and over
time more and more of your representation would go into models rather than
episodic recording.  But as this happened the vocabulary of patterns would
grow and thus one atom, on average would be able to represent more.

But it seems to me such an AGI should be able to have meaningful world
knowledge about certain simple worlds, or certain simple subparts of the
world.  For example, it should be able to have a pretty good model for the
world of many early video games, such as pong and perhaps even pac-man (Its
been so long since I've seen pac-man I don't know how complex it is, but I
am assuming 50 million atoms, many of which, over time, would represent
complex patterns, would be able to catch most of the meaningful
generalizations of pac-man including its control mechanisms and the results
they occur).

Is I said in an earlier email, if we want AGI-at-Home to catch on it would
be valuable to think of some sort of application that would either inspire
through importance or entice by usefulness or amusement to cause people let
it use a substantial part of their machine cycles.  

You mention an interest in intelligent indexing.  Of course, hierarchical
memory provides a fairly good from of intelligent indexing, in the sense
that it automatically promotes indexing through learned combinations of
indicies, and can be easily made to have probabilistic and importance
weights on its index links to more efficiency allocate index activations.  

How does your intelligent indexing work?

Ed Porter


-Original Message-
From: John G. Rose [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, December 04, 2007 2:17 PM
To: agi@v2.listbox.com
Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

 From: Ed Porter [mailto:[EMAIL PROTECTED]
 John,
 
 I am sure there is interesting stuff that can be done.  It would be
 interesting just to see what sort of an agi could be made on a PC.

Yes it would be interesting to see what could be done on a small cluster of
modern server grade computers. I like to think about the newer Penryn 45nm,
SSE4, quadcore quadproc servers with lots of FB DDR3 800mhz RAM running 64
bit OS (sorry I prefer coding in Windows) using standard gigabit Ethernet
quad NICs, with solid state drives, and 15,000 RPM SAS for the slower stuff,
and a take maybe 10 of these servers. There HAS to be enough resource there
to get some small prototype going. 

And look at next year's 8 core Nehalem procs coming out...

Interserver messaging should make heavy use of IP multicasting. Then another
messaging channel with the new USB 3.0... Supposedly USB 3.0 is 4.8
gigabits.


 I would be interested in you Ideas for how to make a powerful AGI
 without a
 vast amount of interconnect.  The major schemes I know about for
 reducting
 interconnect involve allocating what interconnect you have to the links
 with
 the highest probability or importance, varying those measures of
 probability
 and importance in a contest specific way, and being guided by prior
 similar
 experiences.

Well I actually don't have the theory far enough to calculate interconnect
metrics. But I try to minimize that through storage structure. What gets
stored, how it gets stored, where it's stored, how systems are modeled, what
a model is, what a system of models are, how systems of models are stored,..
don't store dupes, store diffs... mixing code and data, collapsing data into
code, what is code and what is data? Basically a lot of intelligent
indexing, like real intelligent indexing... 

I'm working on using CA's as universal symbolistic indexors and generators -
IOW exploring a theory of uncalculated precalcs for computational complexity
indexing using CA's in order to control uncertainty and manage complexity...

Lots of addicting brain candy stuff...

John



 

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?;

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=71938578-e534ed

RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-04 Thread John G. Rose
 From: Ed Porter [mailto:[EMAIL PROTECTED]
 
 But even with a current 32 bit PC with say 4G of Ram you should be able
 to
 build an AGI that would be a meaningful proof of concept.  Lets say 3G
 is
 for representation, at say 60 bytes per atom (less than my usual 100
 bytes/atom because using 32bit pointers), that would allow you roughly
 50Million atoms.  Over 1 million seconds (very roughly two weeks 24/7)
 that
 would allow an average of 50 atoms a second of representation.  Of
 course
 your short term memory would record at a much higher frequency, and over
 time more and more of your representation would go into models rather
 than
 episodic recording.  But as this happened the vocabulary of patterns
 would
 grow and thus one atom, on average would be able to represent more.
 But it seems to me such an AGI should be able to have meaningful world
 knowledge about certain simple worlds, or certain simple subparts of the
 world.  For example, it should be able to have a pretty good model for
 the
 world of many early video games, such as pong and perhaps even pac-man
 (Its
 been so long since I've seen pac-man I don't know how complex it is, but
 I
 am assuming 50 million atoms, many of which, over time, would represent
 complex patterns, would be able to catch most of the meaningful
 generalizations of pac-man including its control mechanisms and the
 results
 they occur).


Yes I can imagine this. But how much information would be in each 60 byte
atom? Is it a pointer to a pattern stored on disk, or is it some sort of
index, or is it a portion of a pattern, or is it a full pattern in a simple
pacman type world?
 

 Is I said in an earlier email, if we want AGI-at-Home to catch on it
 would
 be valuable to think of some sort of application that would either
 inspire
 through importance or entice by usefulness or amusement to cause people
 let
 it use a substantial part of their machine cycles.


Well I can't elaborate publicly but I actually have this application
running, still in pre-alpha mode... ahh.. but I have to sell this thing
enabling me to buy RD time to potentially convert it to a protoAGI...so no
open source on that one :(

BUT there are many other applications that could be the delivery mechanism.
There are a number of ways to do it... one way was discussed earlier where
you sell your PC resources. That is a good idea!

 
 You mention an interest in intelligent indexing.  Of course,
 hierarchical
 memory provides a fairly good from of intelligent indexing, in the sense
 that it automatically promotes indexing through learned combinations of
 indicies, and can be easily made to have probabilistic and importance
 weights on its index links to more efficiency allocate index
 activations.
 
 How does your intelligent indexing work?

Well I can describe briefly, there are two basic types of virtual indexing,
the actual disk based indexing I'm trying to still use a DBMS for that since
they do it so well. First type is based on algebraic structure
decomposition. I see everything as algebraic structure; an AGI computer can
do the same, but way better. When everything is converted to algebraic
structure things become very index friendly, in fact so friendly it looks
like many things collapse or telescope down. The other type of indexing that
I just started working on is CA based universal symbolistic
generation/indexing. Algebraic structure is good for skeltoidal but you
need some filler. CA's seem like they can do the trick. The thing with CA's
is that they can be indexed based on uncalculated values. If a CA structure
is so darn complex why waste the cycles calculating it? The CA's have
infinite symbolistic properties that only a portion of them need be
calculated (take up resources). Linking the algebraic structure indexing
with CA indexing I'm trying to smooth out with group semiautomata, but a lot
of magic still happens there :)

So that's it without getting too into details. Very primitive still ...

John


 
 -Original Message-
 From: John G. Rose [mailto:[EMAIL PROTECTED]
 Sent: Tuesday, December 04, 2007 2:17 PM
 To: agi@v2.listbox.com
 Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI
 research]
 
  From: Ed Porter [mailto:[EMAIL PROTECTED]
  John,
 
  I am sure there is interesting stuff that can be done.  It would be
  interesting just to see what sort of an agi could be made on a PC.
 
 Yes it would be interesting to see what could be done on a small cluster
 of
 modern server grade computers. I like to think about the newer Penryn
 45nm,
 SSE4, quadcore quadproc servers with lots of FB DDR3 800mhz RAM running
 64
 bit OS (sorry I prefer coding in Windows) using standard gigabit
 Ethernet
 quad NICs, with solid state drives, and 15,000 RPM SAS for the slower
 stuff,
 and a take maybe 10 of these servers. There HAS to be enough resource
 there
 to get some small prototype going.
 
 And look at next year's 8 core Nehalem procs coming out...
 
 Interserver messaging

RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-04 Thread Matt Mahoney
--- Ed Porter [EMAIL PROTECTED] wrote:

 Matt,
 
 IN my Mon 12/3/2007 8:17 PM post to John Rose from which your are probably
 quoting below I discussed the bandwidth issues.  I am assuming nodes
 directly talk to each other, which is probably overly optimistic, but still
 are limited by the fact that each node can only receive somewhere roughly
 around 100 128 byte messages a second.  Unless you have a really big P2P
 system, that just isn't going to give you much bandwidth.  If you had 100
 million P2P nodes it would.  Thus, a key issue is how many participants is
 an AGI-at-Home P2P system going to get.  

My design would use most of the Internet (10^9 P2P nodes).  Messages would be
natural language text strings, making no distinction between documents,
queries, and responses.  Each message would have a header indicating the ID
and time stamp of the originator and any intermediate nodes through which the
message was routed.  A message could also have attached files.  Each node
would have a cache of messages and its own policy on which messages it decides
to keep or discard.

The goal of the network is to route messages to other nodes that store
messages with matching terms.  To route an incoming message x, it matches
terms in x to terms in stored messages and sends copies to nodes that appear
in those headers, appending its own ID and time stamp to the header of the
outgoing copies.  It also keeps a copy, so that the receiving nodes knows that
they know it has a copy of x (at least temporarily).

The network acts as a distributed database with a distributed search function.
 If X posts a document x and Y posts a query y with matching terms, then the
network acts to route x to Y and y to X.

 I mean, what would motivate the average American, or even the average
 computer geek turn over part of his computer to it?  It might not be an easy
 sell for more than several hundred or several thousand people, at least
 until it could do something cool, like index their videos for them, be a
 funny chat bot, or something like that.

The value is the ability to post messages that can be found by search, without
having to create a website.  Information has negative value; people will trade
CPU resources for the ability to advertise.

 In addition to my last email, I don't understand what your were saying below
 about complexity.  Are you saying that as a system becomes bigger it
 naturally becomes unstable, or what?

When a system's Lyapunov exponent (or its discrete approximation) becomes
positive, it becomes unmaintainable.  This is solved by reducing its
interconnectivity.  For example, in software we use scope, data abstraction,
packages, protocols, etc. to reduce the degree to which one part of the
program can affect another.  This allows us to build larger programs.

In a message passing network, the critical parameter is the ratio of messages
out to messages in.  The ratio cannot exceed 1 on average.  Each node can have
 its own independent policy of prioritizing messages, but will probably send
messages at a nearly constant maximum rate regardless of the input rate.  This
reaches equilibrium at a ratio of 1, but it would also allow rare but
important messages to propagate to a large number of nodes.  All critically
balanced complex systems are subject to rare but significant events, for
example software (state changes and failures), evolution (population
explosions, plagues, and mass extinctions), and gene regulatory networks (cell
differentiation).


-- Matt Mahoney, [EMAIL PROTECTED]

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=72111983-b0ec39


RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-04 Thread Ed Porter
MATT MAHONEY= My design would use most of the Internet (10^9 P2P
nodes).
ED PORTER= That's ambitious.  Easier said than done unless you have a
Google, Microsoft, or mass popular movement backing you.

 ED PORTER= I mean, what would motivate the average American, or even
the average computer geek turn over part of his computer to it?...
MATT MAHONEY= The value is the ability to post messages that can be
found by search, without having to create a website.  Information has
negative value; people will trade CPU resources for the ability to
advertise.
ED PORTER=It sounds theoretically possible.  But actually making it
happen in a world with so much competition for mind and machine share might
be quite difficult.  Again it is something that would probably require a
major force of the type I listed above to make it happen.


 ED PORTER= Are you saying that as a system becomes bigger it
naturally becomes unstable, or what?
MATT MAHONEY= 
When a system's Lyapunov exponent (or its discrete approximation) becomes
positive, it becomes unmaintainable.  This is solved by reducing its
interconnectivity.  For example, in software we use scope, data abstraction,
packages, protocols, etc. to reduce the degree to which one part of the
program can affect another.  This allows us to build larger programs.

In a message passing network, the critical parameter is the ratio of
messages
out to messages in.  The ratio cannot exceed 1 on average.
ED PORTER= Thanks for the info.  By unmaintainable what do you mean?

I don't understand why more messages coming in than going out creates a
problem, unless most of what nodes do is relay message, which is not what
they do in my system.

The unruly chaotic side of AGI is not something I have thought much about.
I have tried to design my system to largely avoid it.  So this is something
I don't know much about, although I have thought about net congestion a fair
amount which can be very dynamic, and that sounds like it is a related to
what you are talking about.  

I have tried to design my system as a largly asynchronous messaging system
so most processes are relatively loosely linked, as browsers and servers
generally are on the internet.  As such, the major type of instability I
have worried about is that of network traffic congestion, such as if all of
a sudden many nodes want to talk to the same node, both for computer nodes
and pattern nodes.

I WOULD BE INTERESTED IN ANY THOUGHTS ON THE OTHER TYPES OF DYNAMIC
INSTABILITIES A HIERARCHICAL MEMORY SYSTEM -- WITH PROBABILISTIC INDEX-BASED
SPREADING ACTIVATION -- MIGHT HAVE.

Matt, it sounds as if OpenCog ever tries to build a large P2P network you
the man.

Ed Porter


-Original Message-
From: Matt Mahoney [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, December 04, 2007 7:42 PM
To: agi@v2.listbox.com
Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

--- Ed Porter [EMAIL PROTECTED] wrote:

 Matt,
 
 IN my Mon 12/3/2007 8:17 PM post to John Rose from which your are probably
 quoting below I discussed the bandwidth issues.  I am assuming nodes
 directly talk to each other, which is probably overly optimistic, but
still
 are limited by the fact that each node can only receive somewhere roughly
 around 100 128 byte messages a second.  Unless you have a really big P2P
 system, that just isn't going to give you much bandwidth.  If you had 100
 million P2P nodes it would.  Thus, a key issue is how many participants is
 an AGI-at-Home P2P system going to get.  

My design would use most of the Internet (10^9 P2P nodes).  Messages would
be
natural language text strings, making no distinction between documents,
queries, and responses.  Each message would have a header indicating the ID
and time stamp of the originator and any intermediate nodes through which
the
message was routed.  A message could also have attached files.  Each node
would have a cache of messages and its own policy on which messages it
decides
to keep or discard.

The goal of the network is to route messages to other nodes that store
messages with matching terms.  To route an incoming message x, it matches
terms in x to terms in stored messages and sends copies to nodes that appear
in those headers, appending its own ID and time stamp to the header of the
outgoing copies.  It also keeps a copy, so that the receiving nodes knows
that
they know it has a copy of x (at least temporarily).

The network acts as a distributed database with a distributed search
function.
 If X posts a document x and Y posts a query y with matching terms, then the
network acts to route x to Y and y to X.

 I mean, what would motivate the average American, or even the average
 computer geek turn over part of his computer to it?  It might not be an
easy
 sell for more than several hundred or several thousand people, at least
 until it could do something cool, like index their videos for them, be a
 funny chat bot, or something like

Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-04 Thread Richard Loosemore


Ed Porter wrote:

Richard,

It is not clear how valuable your 25 years of hard won learning is if
 it causes you to dismiss valuable scientific work that seems to have
 eclipsed the importance of anything I or you have published as 
trivial exercises in public relations without giving any reason 
whatsoever for the particular dismissal.


I welcome criticism in this forum provided it is well reasoned and 
without venom.  But to dismiss a list of examples I give to support 
an argument as trivial exercises in public relations without any 
justification other than the fact that in general a certain numbers 
of published papers are inaccurate and/or overblown, is every bit as 
dishonest as calling someone a liar with regard to a particular 
statement based on nothing more than the knowledge some people are 
liars.


In my past exchanges with you, sometimes your responses have been 
helpful. But I have noticed that although you are very quick to 
question me (and others), if I question you, rather than respond 
directly to my arguments you often don't respond to them at all -- 
such as your recent refusal to justify your allegation that my whole 
framework, presumably for understanding AGI, was wrong (a pretty 
insulting statement which should not be flung around without some 
justification).  Or if you do respond to challenges, you often 
dismiss them as invalid without any substantial evidence, or you 
substantially change the subject, such as by focusing on one small 
part of my argument that I have not yet fully supported, while 
refusing to acknowledge the major support I have shown for the major 
thrust of my argument.


When you argue like that there really is no purpose in continuing the
 conversation.  What's the point.  Under those circumstance your not 
dealing with someone who is likely to tell you anything of worth. 
Rather you are only likely to hear lame defensive arguments from 
somebody who is either incapable of properly defending or unwilling 
to properly defend their arguments, and, thus, is unlikely to 
communicate anything of value in the exchange.


Your 25 years of experience doesn't mean squat about how much you 
truly understand AGI unless you are capable of being more 
intellectually honest, both with yourself and with others -- and 
unless you are capable of actually reasonably defending your 
understandings, head-on, against reasoned questioning and countering 
evidence.  To dismiss counter evidence cited against your arguments 
as trivial exercises in public relations without any specific 
justification is not a reasonable defense, and the fact that you so 
often result to such intellectually dishonest tactics to defend your

 stated understandings relating to AGI really does call into question
 the quality of those understandings.

In summary, don't go around attacking other people's statements 
unless you are willing to defend those attacks in an intellectually 
honest manner.


I confess, I would rather that I had not so quickly dismissed those
researchers you mentioned - mostly because my motivation at the time was
to dismiss the exaggerated value that *you* placed on these results.

But let me explain the reason why I still feel that it was valid to
dismiss them.

They are examples of a category of research that addresses issues that
are completely compromised by the lack of solutions to other issues.
Thus: building a NL parser, no matter how good it is, is of no use
whatsoever unless it can be shown to emerge from (or at least fit with)
a learning mechanism that allows the system itself to generate its own
understanding (or, at least, acquisition) of grammar IN THE CONTEXT OF A
MECHANISM THAT ALSO ACCOMPLISHES REAL UNDERSTANDING. When that larger
issue is dealt with, a NL parser will arise naturally, and any previous
work on non-developmental, hand-built parsers will be completely
discarded. You were trumpeting the importance of work that I know will
be thrown away later, and in the mean time will be of no help in
resolving the important issues.

Now, I am harsh about these researchers not because they in particular
were irresponsible, but because they are part of a tradition in which
everyone is looking for cheap results that superficially appear good to
peer reviewers, so they can get things published, so they can get more
research grants, so they can get higher salaries. There is an
appallingly high incidence of research that is carried out because it
fits the ideal paper-publication template, not because the work itself
addresses important issues. This is a kind of low-level academic 
corruption, and I will continue to call it what it is, even if you don't 
have the slightest idea that this corruption exists.


It was towards *that* issue that my criticism was directed.

I would have been perfectly happy to explain this to you before, but
instead of appreciating where I was coming from, you launched into a
tirade about my dishonesty and stupidity in rejecting papers 

Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-04 Thread Benjamin Goertzel
 Thus: building a NL parser, no matter how good it is, is of no use
 whatsoever unless it can be shown to emerge from (or at least fit with)
 a learning mechanism that allows the system itself to generate its own
 understanding (or, at least, acquisition) of grammar IN THE CONTEXT OF A
 MECHANISM THAT ALSO ACCOMPLISHES REAL UNDERSTANDING. When that larger
 issue is dealt with, a NL parser will arise naturally, and any previous
 work on non-developmental, hand-built parsers will be completely
 discarded. You were trumpeting the importance of work that I know will
 be thrown away later, and in the mean time will be of no help in
 resolving the important issues.

Richard, you discount the possibility that said NL parser will play a key
role in the adaptive emergence of a system that can generate its own
linguistic understanding.  I.e., you discount the possibility that, with the
right learning mechanism and instructional environment, hand-coded
rules may serve as part of the initial seed for a learning process that will
eventually generate knowledge obsoleting these initial hand-coded
rules.

It's fine that you discount this possibility -- I just want to point out that
in doing so, you are making a bold and unsupported theoretical hypothesis,
rather than stating an obvious or demonstrated fact.

Vaguely similarly, the grammar of child language is largely thrown
away in adulthood, yet it was useful as scaffolding in leading to the
emergence of adult language.

-- Ben G

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=72129171-2bf67a


RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-04 Thread Matt Mahoney

--- Ed Porter [EMAIL PROTECTED] wrote:

 MATT MAHONEY= My design would use most of the Internet (10^9 P2P
 nodes).
 ED PORTER= That's ambitious.  Easier said than done unless you have a
 Google, Microsoft, or mass popular movement backing you.

It would take some free software that people find useful.  The Internet has
been transformed before.  Remember when there were no web browsers and no
search engines?  You can probably think of transformations that would make the
Internet more useful.  Centralized search is limited to a few big players that
can keep a copy of the Internet on their servers.  Google is certainly useful,
but imagine if it searched a space 1000 times larger and if posts were
instantly added to its index, without having to wait days for its spider to
find them.  Imagine your post going to persistent queries posted days earlier.
 Imagine your queries being answered by real human beings in addition to other
peers.

I probably won't be the one writing this program, but where there is a need, I
expect it will happen.


 In a message passing network, the critical parameter is the ratio of
 messages
 out to messages in.  The ratio cannot exceed 1 on average.
 ED PORTER= Thanks for the info.  By unmaintainable what do you mean?
 
 I don't understand why more messages coming in than going out creates a
 problem, unless most of what nodes do is relay message, which is not what
 they do in my system.

I meant the other way, which would flood the network with duplicate messages. 
But I believe the network would be stable against this, even in the face of
spammers and malicious nodes, because most nodes would be configured to ignore
duplicates and any messages that it deemed irrelevant.


-- Matt Mahoney, [EMAIL PROTECTED]

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=72132605-fe415f


Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-04 Thread Richard Loosemore

Benjamin Goertzel wrote:

Thus: building a NL parser, no matter how good it is, is of no use
whatsoever unless it can be shown to emerge from (or at least fit with)
a learning mechanism that allows the system itself to generate its own
understanding (or, at least, acquisition) of grammar IN THE CONTEXT OF A
MECHANISM THAT ALSO ACCOMPLISHES REAL UNDERSTANDING. When that larger
issue is dealt with, a NL parser will arise naturally, and any previous
work on non-developmental, hand-built parsers will be completely
discarded. You were trumpeting the importance of work that I know will
be thrown away later, and in the mean time will be of no help in
resolving the important issues.


Richard, you discount the possibility that said NL parser will play a key
role in the adaptive emergence of a system that can generate its own
linguistic understanding.  I.e., you discount the possibility that, with the
right learning mechanism and instructional environment, hand-coded
rules may serve as part of the initial seed for a learning process that will
eventually generate knowledge obsoleting these initial hand-coded
rules.

It's fine that you discount this possibility -- I just want to point out that
in doing so, you are making a bold and unsupported theoretical hypothesis,
rather than stating an obvious or demonstrated fact.

Vaguely similarly, the grammar of child language is largely thrown
away in adulthood, yet it was useful as scaffolding in leading to the
emergence of adult language.


The problem is that this discussion has drifted away from the original 
context in which I made the remarks.


I do *not* discount the possibility that an ordinary NL parser may play 
a role in the future.


What I was attacking was the idea that a NL parser that does a wonderful 
job today (but which is built on a formalism that ignores all the issues 
involved in getting an adaptive language-understanding system working) 
is IPSO FACTO going to be a valuable step in the direction of a full 
adaptive system.


It was the linkage that I dismissed.  It was the idea that BECAUSE the 
NL parser did such a great job, therefore it has a very high probability 
of being a great step on the road to a full adaptive (etc) language 
understanding system.


If the NL parser completely ignores those larger issues I am justified 
in saying that it is a complete crap shoot whether or not this 
particular parser is going to be of use in future, more complete 
theories of language.


But that is not the same thing as making a blanket dismissal of all 
parsers, saying they cannot be of any use as (as you point out) seed 
material in the design of a complete system.


I was objecting to Ed's pushing this particular NL parser in my face and 
insisting that I should respect it as a substantial step towards full 
AGI   .   and my objection was that I find models like that all show 
and no deep substance precisely because they ignore the larger issues 
and go for the short-term gratification of a parser that works really well.


So I was not taking the position you thought I was.




Richard Loosemore





-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=72135004-3fc959


RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-04 Thread Ed Porter
The particular NL parser paper in question, Collins's Convolution Kernels
for Natural Language
(http://l2r.cs.uiuc.edu/~danr/Teaching/CS598-05/Papers/Collins-kernels.pdf)
is actually saying something quite important that extends way beyond parsers
and is highly applicable to AGI in general.  

It is actually showing that you can do something roughly equivalent to
growing neural gas (GNG) in a space with something approaching 500,000
dimensions, but you can do it without normally having to deal with more than
a few of those dimensions at one time.  GNG is an algorithm I learned about
from reading Peter Voss that allows one to learn how to efficiently
represent a distribution in a relatively high dimensional space in a totally
unsupervised manner.  But there really seem to be no reason why there should
be any limit to the dimensionality of the space in which the Collin's
algorithm works, because it does not use an explicit vector representation,
nor, if I recollect correctly, a Euclidian distance metric, but rather a
similarity metric which is generally much more appropriate for matching in
very high dimensional spaces.

But what he is growing are not just points representing where data has
occurred in a high dimensional space, but sets of points that define
hyperplanes for defining the boundaries between classes.  My recollection is
that this system learns automatically from both labeled data (instances of
correct parse trees) and randomly generated deviations from those instances.
His particular algorithm matches tree structures, but with modification it
would seem to be extendable to matching arbitrary nets.  Other versions of
it could be made to operate, like GNG, in an unsupervised manner.

If you stop and think about what this is saying and generalize from it, it
provides an important possible component in an AGI tool kit. What it shows
is not limited to parsing, but it would seem possibly applicable to
virtually any hierarchical or networked representation, including nets of
semantic web RDF triples, and semantic nets, and predicate logic
expressions.  At first glance it appears it would even be applicable to
kinkier net matching algorithms, such as an Augmented transition network
(ATN) matching.

So if one reads this paper with a mind to not only what it specifically
shows, but to what how what it shows could be expanded, this paper says
something very important.  That is, that one can represent, learn, and
classify things in very high dimensional spaces -- such as 10^1
dimensional spaces -- and do it efficiently provided the part of the space
being represented is sufficiently sparsely connected.

I had already assumed this, before reading this paper, but the paper was
valuable to me because it provided a mathematically rigorous support for my
prior models, and helped me better understand the mathematical foundations
of my own prior intuitive thinking.  

It means that systems like Novemente can deal in very high dimensional
spaces relatively efficiently. It does not mean that all processes that can
be performed in such spaces will be computationally cheap (for example,
combinatorial searches), but it means that many of them, such as GNG like
recording of experience, and simple indexed based matching can scale
relatively well in a sparsely connected world.

That is important, for those with the vision to understand.

Ed Porter

-Original Message-
From: Benjamin Goertzel [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, December 04, 2007 8:59 PM
To: agi@v2.listbox.com
Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

 Thus: building a NL parser, no matter how good it is, is of no use
 whatsoever unless it can be shown to emerge from (or at least fit with)
 a learning mechanism that allows the system itself to generate its own
 understanding (or, at least, acquisition) of grammar IN THE CONTEXT OF A
 MECHANISM THAT ALSO ACCOMPLISHES REAL UNDERSTANDING. When that larger
 issue is dealt with, a NL parser will arise naturally, and any previous
 work on non-developmental, hand-built parsers will be completely
 discarded. You were trumpeting the importance of work that I know will
 be thrown away later, and in the mean time will be of no help in
 resolving the important issues.

Richard, you discount the possibility that said NL parser will play a key
role in the adaptive emergence of a system that can generate its own
linguistic understanding.  I.e., you discount the possibility that, with the
right learning mechanism and instructional environment, hand-coded
rules may serve as part of the initial seed for a learning process that will
eventually generate knowledge obsoleting these initial hand-coded
rules.

It's fine that you discount this possibility -- I just want to point out
that
in doing so, you are making a bold and unsupported theoretical hypothesis,
rather than stating an obvious or demonstrated fact.

Vaguely similarly, the grammar of child language

RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-04 Thread Ed Porter
Matt,

Perhaps your are right.  

But one problem is that big Google-like compuplexes in the next five to ten
years will be powerful enough to do AGI and they will be much more efficient
for AGI search because the physical closeness of their machines will make it
possible for them to perform the massive interconnected needed for powerful
AGI much more efficiently.

Ed Porter

-Original Message-
From: Matt Mahoney [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, December 04, 2007 9:18 PM
To: agi@v2.listbox.com
Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]


--- Ed Porter [EMAIL PROTECTED] wrote:

 MATT MAHONEY= My design would use most of the Internet (10^9 P2P
 nodes).
 ED PORTER= That's ambitious.  Easier said than done unless you have a
 Google, Microsoft, or mass popular movement backing you.

It would take some free software that people find useful.  The Internet has
been transformed before.  Remember when there were no web browsers and no
search engines?  You can probably think of transformations that would make
the
Internet more useful.  Centralized search is limited to a few big players
that
can keep a copy of the Internet on their servers.  Google is certainly
useful,
but imagine if it searched a space 1000 times larger and if posts were
instantly added to its index, without having to wait days for its spider to
find them.  Imagine your post going to persistent queries posted days
earlier.
 Imagine your queries being answered by real human beings in addition to
other
peers.

I probably won't be the one writing this program, but where there is a need,
I
expect it will happen.


 In a message passing network, the critical parameter is the ratio of
 messages
 out to messages in.  The ratio cannot exceed 1 on average.
 ED PORTER= Thanks for the info.  By unmaintainable what do you
mean?
 
 I don't understand why more messages coming in than going out creates a
 problem, unless most of what nodes do is relay message, which is not what
 they do in my system.

I meant the other way, which would flood the network with duplicate
messages. 
But I believe the network would be stable against this, even in the face of
spammers and malicious nodes, because most nodes would be configured to
ignore
duplicates and any messages that it deemed irrelevant.


-- Matt Mahoney, [EMAIL PROTECTED]

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?;

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=72151542-9bffdb

Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-04 Thread Benjamin Goertzel
OK, understood...

On Dec 4, 2007 9:32 PM, Richard Loosemore [EMAIL PROTECTED] wrote:

 Benjamin Goertzel wrote:
  Thus: building a NL parser, no matter how good it is, is of no use
  whatsoever unless it can be shown to emerge from (or at least fit with)
  a learning mechanism that allows the system itself to generate its own
  understanding (or, at least, acquisition) of grammar IN THE CONTEXT OF A
  MECHANISM THAT ALSO ACCOMPLISHES REAL UNDERSTANDING. When that larger
  issue is dealt with, a NL parser will arise naturally, and any previous
  work on non-developmental, hand-built parsers will be completely
  discarded. You were trumpeting the importance of work that I know will
  be thrown away later, and in the mean time will be of no help in
  resolving the important issues.
 
  Richard, you discount the possibility that said NL parser will play a key
  role in the adaptive emergence of a system that can generate its own
  linguistic understanding.  I.e., you discount the possibility that, with the
  right learning mechanism and instructional environment, hand-coded
  rules may serve as part of the initial seed for a learning process that will
  eventually generate knowledge obsoleting these initial hand-coded
  rules.
 
  It's fine that you discount this possibility -- I just want to point out 
  that
  in doing so, you are making a bold and unsupported theoretical hypothesis,
  rather than stating an obvious or demonstrated fact.
 
  Vaguely similarly, the grammar of child language is largely thrown
  away in adulthood, yet it was useful as scaffolding in leading to the
  emergence of adult language.

 The problem is that this discussion has drifted away from the original
 context in which I made the remarks.

 I do *not* discount the possibility that an ordinary NL parser may play
 a role in the future.

 What I was attacking was the idea that a NL parser that does a wonderful
 job today (but which is built on a formalism that ignores all the issues
 involved in getting an adaptive language-understanding system working)
 is IPSO FACTO going to be a valuable step in the direction of a full
 adaptive system.

 It was the linkage that I dismissed.  It was the idea that BECAUSE the
 NL parser did such a great job, therefore it has a very high probability
 of being a great step on the road to a full adaptive (etc) language
 understanding system.

 If the NL parser completely ignores those larger issues I am justified
 in saying that it is a complete crap shoot whether or not this
 particular parser is going to be of use in future, more complete
 theories of language.

 But that is not the same thing as making a blanket dismissal of all
 parsers, saying they cannot be of any use as (as you point out) seed
 material in the design of a complete system.

 I was objecting to Ed's pushing this particular NL parser in my face and
 insisting that I should respect it as a substantial step towards full
 AGI   .   and my objection was that I find models like that all show
 and no deep substance precisely because they ignore the larger issues
 and go for the short-term gratification of a parser that works really well.

 So I was not taking the position you thought I was.




 Richard Loosemore





 -
 This list is sponsored by AGIRI: http://www.agiri.org/email
 To unsubscribe or change your options, please go to:
 http://v2.listbox.com/member/?;


-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=72155184-923590


Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-03 Thread Bryan Bishop
On Sunday 02 December 2007, John G. Rose wrote:
 Building up parse trees and word sense models, let's say that would
 be a first step. And then say after a while this was accomplished and
 running on some peers. What would the next theoretical step be?

I am not sure what the next step would be. The first step might be 
enough for the moment. When you have the network functioning at all, 
expose an API so that other programmers can come in and try to utilize 
sentence analysis (and other functions) as if the network is just 
another lobe of the brain or another component for ai. This would allow 
others who are possibly more creative than us to take advantage of what 
looks to be interesting work.

- Bryan

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=71422338-8cb1da


Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-03 Thread Richard Loosemore

Ed Porter wrote:

Once you build up good models for parsing and word sense, then you read
large amounts of text and start building up model of the realities described
and generalizations from them.

Assuming this is a continuation of the discussion of an AGI-at-home P2P
system, you are going to be very limited by the lack of bandwidth,
particularly for attacking the high dimensional problem of seeking to
understand the meaning of text, which often involve multiple levels of
implication, which would normally be accomplished by some sort of search of
a large semantic space, which is going to be difficult with limited
bandwidth.

But a large amount of text with appropriate parsing and word sense labeling
would still provide a valuable aid for web and text search and for many
forms of automatic learning.  And the level of understanding that such a P2P
system could derive from reading huge amounts of text could be a valuable
initial source of one component of world knowledge for use by AGI.


I know you always find it teious when I express scepticism, so I will 
preface my remarks with:  take this advice or ignore it, your choice.


This description of how to get AGI done reminds me of my childhood 
project to build a Mars-bound spacecraft modeled after James Blish's 
Book Welcome to Mars.  I Knew that I could build it in time for the 
next conjunction of Mars, but I hadn't quite gotten the anti-gravity 
drive sorted out, so instead I collected all the other materials 
described in the book, so everything would be ready when the AG drive 
started working...


The reason it reminds me of this episode is that you are calmly talking 
here about the high dimensional problem of seeking to understand the 
meaning of text, which often involve multiple levels of implication, 
which would normally be accomplished by some sort of search of a large 
semantic space . this is your equivalent of the anti-gravity 
drive.  This is the part that needs extremely detailed knowledge of AI 
and psychology, just to be understand the nature of the problem (never 
mind to solve it).  If you had any idea bout how to solve this part of 
the problem, everything else would drop into your lap.  You wouldn't 
need a P2P AGI-at-home system, because with this solution in hand you 
would have people beating down your door to give you a supercomputer.


Menawhile, unfortunately, solving all those other issues like making 
parsers and trying to do word-sense disambiguation would not help one 
whit to get the real theoretical task done.


I am not being negative, I am just relaying the standard understanding 
of priorities in the AGI field as a whole.  Send complaints addressed to 
AGI Community, not to me, please.




Richard Loosemore


-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=71451441-4352c5


RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-03 Thread Ed Porter
Once you build up good models for parsing and word sense, then you read
large amounts of text and start building up model of the realities described
and generalizations from them.

Assuming this is a continuation of the discussion of an AGI-at-home P2P
system, you are going to be very limited by the lack of bandwidth,
particularly for attacking the high dimensional problem of seeking to
understand the meaning of text, which often involve multiple levels of
implication, which would normally be accomplished by some sort of search of
a large semantic space, which is going to be difficult with limited
bandwidth.

But a large amount of text with appropriate parsing and word sense labeling
would still provide a valuable aid for web and text search and for many
forms of automatic learning.  And the level of understanding that such a P2P
system could derive from reading huge amounts of text could be a valuable
initial source of one component of world knowledge for use by AGI.

Ed Porter

-Original Message-
From: Bryan Bishop [mailto:[EMAIL PROTECTED] 
Sent: Monday, December 03, 2007 7:33 AM
To: agi@v2.listbox.com
Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

On Sunday 02 December 2007, John G. Rose wrote:
 Building up parse trees and word sense models, let's say that would
 be a first step. And then say after a while this was accomplished and
 running on some peers. What would the next theoretical step be?

I am not sure what the next step would be. The first step might be 
enough for the moment. When you have the network functioning at all, 
expose an API so that other programmers can come in and try to utilize 
sentence analysis (and other functions) as if the network is just 
another lobe of the brain or another component for ai. This would allow 
others who are possibly more creative than us to take advantage of what 
looks to be interesting work.

- Bryan

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?;

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=71438525-d92982

RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-03 Thread John G. Rose
 From: Richard Loosemore [mailto:[EMAIL PROTECTED]
 
 The reason it reminds me of this episode is that you are calmly talking
 here about the high dimensional problem of seeking to understand the
 meaning of text, which often involve multiple levels of implication,
 which would normally be accomplished by some sort of search of a large
 semantic space . this is your equivalent of the anti-gravity
 drive.  This is the part that needs extremely detailed knowledge of AI
 and psychology, just to be understand the nature of the problem (never
 mind to solve it).  If you had any idea bout how to solve this part of
 the problem, everything else would drop into your lap.  You wouldn't
 need a P2P AGI-at-home system, because with this solution in hand you
 would have people beating down your door to give you a supercomputer.


This is naïve. It almost never works this way, where if someone has a
solution to a well known unsolved engineering problem that resources just
come knocking at the door.

 
 Menawhile, unfortunately, solving all those other issues like making
 parsers and trying to do word-sense disambiguation would not help one
 whit to get the real theoretical task done.

This is impractical. ... 
 
 I am not being negative, I am just relaying the standard understanding
 of priorities in the AGI field as a whole.  Send complaints addressed to
 AGI Community, not to me, please.

You are being negative! And since when have the priorities of understandings
in the AGI field been standardized? Perhaps that is part the limiting factor
and self-defeating narrow-mindedness.

John


-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=71501965-68a77a

Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-03 Thread Matt Mahoney
--- Richard Loosemore [EMAIL PROTECTED] wrote:
 Menawhile, unfortunately, solving all those other issues like making 
 parsers and trying to do word-sense disambiguation would not help one 
 whit to get the real theoretical task done.

I agree.  AI has a long history of doing the easy part of the problem first:
solving the mathematics or logic of a word problem, and deferring the hard
part, which is extracting the right formal statement from the natural language
input.  This is the opposite order of how children learn.  The proper order
is: lexical rules first, then semantics, then grammar, and then the problem
solving.  The whole point of using massive parallel computation is to do the
hard part of the problem.


-- Matt Mahoney, [EMAIL PROTECTED]

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=71493437-c427ac


Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-03 Thread Mike Tintner

Matt::  The whole point of using massive parallel computation is to do the
hard part of the problem.

I get it : you and most other AI-ers are equating hard with very, very 
complex, right?  But you don't seriously think that the human mind 
successfully deals with language by massive parallel computation, do you? 
Isn't it obvious that the brain is able to understand the wealth of language 
by relatively few computations - quite intricate, hierarchical, 
multi-levelled processing, yes, (in order to understand, for example, any of 
the sentences you or I are writing here), but only a tiny fraction of the 
operations that computers currently perform?


The whole idea of massive parallel computation here, surely has to be wrong. 
And yet none of you seem able to face this to my mind obvious truth.


I only saw this term recently - perhaps it's v. familiar to you (?) - that 
the human brain works by look-up rather than search.  Hard problems can 
have relatively simple but ingenious solutions.



-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=71504832-b01a2d


RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-03 Thread John G. Rose
 From: Bryan Bishop [mailto:[EMAIL PROTECTED]
 I am not sure what the next step would be. The first step might be
 enough for the moment. When you have the network functioning at all,
 expose an API so that other programmers can come in and try to utilize
 sentence analysis (and other functions) as if the network is just
 another lobe of the brain or another component for ai. This would allow
 others who are possibly more creative than us to take advantage of what
 looks to be interesting work.
 

This is true and a way to get utility out of it. And getting the first step
accomplished is quite a bit of work as is maintaining it. Having just a few
basic baby steps actually materialize in front of you eliminates some of the
complexity so that the larger problem may appear just a bit less daunting.
Also communal developer feedback is a constructive motivator.

John

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=71510117-536e83


RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-03 Thread John G. Rose
 From: Ed Porter [mailto:[EMAIL PROTECTED]
 Once you build up good models for parsing and word sense, then you read
 large amounts of text and start building up model of the realities
 described
 and generalizations from them.
 
 Assuming this is a continuation of the discussion of an AGI-at-home P2P
 system, you are going to be very limited by the lack of bandwidth,
 particularly for attacking the high dimensional problem of seeking to
 understand the meaning of text, which often involve multiple levels of
 implication, which would normally be accomplished by some sort of search
 of
 a large semantic space, which is going to be difficult with limited
 bandwidth.
 
 But a large amount of text with appropriate parsing and word sense
 labeling
 would still provide a valuable aid for web and text search and for many
 forms of automatic learning.  And the level of understanding that such a
 P2P
 system could derive from reading huge amounts of text could be a
 valuable
 initial source of one component of world knowledge for use by AGI.


I kind of see the small bandwidth between (most) individual nodes as not a
limiting factor as sets of nodes act as temporary single group entities. IOW
the BW between one set of 50 nodes and another set of 50 nodes is quite
large actually and individual nodes' data access would depend on - indexes
of indexes to minimize their individual BW requirements.

Does this not apply to your model?

John


-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=71511001-15807d


Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-03 Thread Richard Loosemore

John G. Rose wrote:

From: Richard Loosemore [mailto:[EMAIL PROTECTED]

[snip]

I am not being negative, I am just relaying the standard understanding
of priorities in the AGI field as a whole.  Send complaints addressed to
AGI Community, not to me, please.


You are being negative! And since when have the priorities of understandings
in the AGI field been standardized? Perhaps that is part the limiting factor
and self-defeating narrow-mindedness.


It is easy for a research field to agree that certain problems are 
really serious and unsolved.


A hundred years ago, the results of the Michelson-Morley experiments 
were a big unsolved problem, and pretty serious for the foundations of 
physics.  I don't think it would have been self-defeating 
narrow-mindedness for someone to have pointed to that problem and said 
this is a serious problem.




Richard Loosemore

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=71517612-4f04ee


Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-03 Thread Richard Loosemore

Mike Tintner wrote:

Matt::  The whole point of using massive parallel computation is to do the
hard part of the problem.

I get it : you and most other AI-ers are equating hard with very, 
very complex, right?  But you don't seriously think that the human mind 
successfully deals with language by massive parallel computation, do 
you? Isn't it obvious that the brain is able to understand the wealth of 
language by relatively few computations - quite intricate, hierarchical, 
multi-levelled processing, yes, (in order to understand, for example, 
any of the sentences you or I are writing here), but only a tiny 
fraction of the operations that computers currently perform?


The whole idea of massive parallel computation here, surely has to be 
wrong. And yet none of you seem able to face this to my mind obvious truth.


I only saw this term recently - perhaps it's v. familiar to you (?) - 
that the human brain works by look-up rather than search.  Hard 
problems can have relatively simple but ingenious solutions.


You need to check the psychology data:  it emphatically disagrees with 
your position here.


One thing that can be easily measured is the activation of lexical 
items related in various ways to a presented word (i.e. show the subject 
the word Doctor and test to see if the word Nurse gets activated).


It turns out that within an extremely short time of the forst word being 
seen, a very large numbmer of other words have their activations raised 
significantly.  Now, whichever way you interpret these (so called 
priming) results, one thing is not in doubt:  there is massively 
parallel activation of lexical units going on during language processing.




Richard Loosemore

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=71515718-ac1ab7


Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-03 Thread Richard Loosemore

Ed Porter wrote:

Richard,

It is false to imply that knowledge of how to draw implications from a
series of statements by some sort of search mechanism is equally unknown as
that of how to make an anti-gravity drive -- if by anti-gravity drive you
mean some totally unknown form of physics, rather than just anything, such
as human legs, that can push against gravity.  


It is unfair because there is a fair amount of knowledge about how to draw
implications from sequences of statements.  For example view Shastri's
www.icsi.berkeley.edu/~shastri/psfiles/cogsci00.ps.  Also Ben Goertzel has
demonstrated a program that draws implications from statements contained in
different medical texts.

Ed Porter 


P.S., I have enclosed an inexact, but, at least to me, useful drawing I made
of the type of search involved in understanding the multiple implications
contained in the series of statements contained in Shastri's John fell in
the Hallway. Tom had cleaned it.  He was hurt example.  Of course, what is
most missing from this drawing are all the other, dead end, implications
which do not provide a likely implication.  Only one of such dead end is
shown (the implication between fall and trip).  As a result you don't sense
how many dead ends have to be searched to find the implications which best
explain the statements.   EWP


Well, bear in mind that I was not meaning the analogy to be *that* 
exact, or I would have given up on AGI long ago - I'm sure you know that 
I don't believe that getting an understanding system working is as 
impossible as getting an AG drive built.


The purpose of my comment was to point to a huge gap in understanding, 
and the mistaken strategy of dealing with all the peripheral issues 
before having a clear idea how to solve the central problem.


I cannot even begin to do justice, here, to the issues involved in 
solving the high dimensional problem of seeking to understand the 
meaning of text, which often involve multiple levels of implication, 
which would normally be accomplished by some sort of search of a large 
semantic space


You talk as if an extension of some current strategy will solve this ... 
but it is not at all clear that any current strategy for solving this 
problem actually does scale up to a full solution to the problem.  I 
don't care how many toy examples you come up with, you have to show a 
strategy for dealing with some of the core issues, AND reasons to 
believe that those strategies really will work (other than I find them 
quite promising).


Not only that, but there at least some people (to wit, myself) who 
believe there are positive reasons to believe that the current 
strategies *will* not scale up.




Richard Loosemore




-Original Message-
From: Richard Loosemore [mailto:[EMAIL PROTECTED] 
Sent: Monday, December 03, 2007 10:07 AM

To: agi@v2.listbox.com
Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

Ed Porter wrote:

Once you build up good models for parsing and word sense, then you read
large amounts of text and start building up model of the realities

described

and generalizations from them.

Assuming this is a continuation of the discussion of an AGI-at-home P2P
system, you are going to be very limited by the lack of bandwidth,
particularly for attacking the high dimensional problem of seeking to
understand the meaning of text, which often involve multiple levels of
implication, which would normally be accomplished by some sort of search

of

a large semantic space, which is going to be difficult with limited
bandwidth.

But a large amount of text with appropriate parsing and word sense

labeling

would still provide a valuable aid for web and text search and for many
forms of automatic learning.  And the level of understanding that such a

P2P

system could derive from reading huge amounts of text could be a valuable
initial source of one component of world knowledge for use by AGI.


I know you always find it teious when I express scepticism, so I will 
preface my remarks with:  take this advice or ignore it, your choice.


This description of how to get AGI done reminds me of my childhood 
project to build a Mars-bound spacecraft modeled after James Blish's 
Book Welcome to Mars.  I Knew that I could build it in time for the 
next conjunction of Mars, but I hadn't quite gotten the anti-gravity 
drive sorted out, so instead I collected all the other materials 
described in the book, so everything would be ready when the AG drive 
started working...


The reason it reminds me of this episode is that you are calmly talking 
here about the high dimensional problem of seeking to understand the 
meaning of text, which often involve multiple levels of implication, 
which would normally be accomplished by some sort of search of a large 
semantic space . this is your equivalent of the anti-gravity 
drive.  This is the part that needs extremely detailed knowledge of AI 
and psychology, just

RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-03 Thread John G. Rose
 From: Richard Loosemore [mailto:[EMAIL PROTECTED]
 It is easy for a research field to agree that certain problems are
 really serious and unsolved.
 
 A hundred years ago, the results of the Michelson-Morley experiments
 were a big unsolved problem, and pretty serious for the foundations of
 physics.  I don't think it would have been self-defeating
 narrow-mindedness for someone to have pointed to that problem and said
 this is a serious problem.
 

Well the definition of problems and the approaches to solving the problems
can be narrow-minded or looked at with a narrow-human-psychological AI
perspective.

Most of these problems boil down to engineering problems and the theory
already exists in some other form; it is a matter of putting things together
IMO.

But myself not being in the cog sci world for that long, only thinking of
AGI in terms of computers, math and AI, I am unaware of the details of some
of the particular AGI unsolved mysteries that are talked about. Not to say I
haven't thought about them from my own narrow-human-psychological AI
perspective :)

John

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=71519373-6b5212


RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-03 Thread Ed Porter

MIKE TINTNER Isn't it obvious that the brain is able to understand the
wealth of language by relatively few computations - quite intricate,
hierarchical, multi-levelled processing,

ED PORTER How do you find the right set of relatively few computations
and/or models that are appropriate in a complex context without massive
computation?  

-Original Message-
From: Mike Tintner [mailto:[EMAIL PROTECTED] 
Sent: Monday, December 03, 2007 12:12 PM
To: agi@v2.listbox.com
Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

Matt::  The whole point of using massive parallel computation is to do the
hard part of the problem.

I get it : you and most other AI-ers are equating hard with very, very 
complex, right?  But you don't seriously think that the human mind 
successfully deals with language by massive parallel computation, do you? 
Isn't it obvious that the brain is able to understand the wealth of language

by relatively few computations - quite intricate, hierarchical, 
multi-levelled processing, yes, (in order to understand, for example, any of

the sentences you or I are writing here), but only a tiny fraction of the 
operations that computers currently perform?

The whole idea of massive parallel computation here, surely has to be wrong.

And yet none of you seem able to face this to my mind obvious truth.

I only saw this term recently - perhaps it's v. familiar to you (?) - that 
the human brain works by look-up rather than search.  Hard problems can 
have relatively simple but ingenious solutions.


-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?;

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=71590357-a986d6

Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-03 Thread Mike Dougherty
On Dec 3, 2007 12:12 PM, Mike Tintner [EMAIL PROTECTED] wrote:
 I get it : you and most other AI-ers are equating hard with very, very
 complex, right?  But you don't seriously think that the human mind
 successfully deals with language by massive parallel computation, do you?

Very very complex tends to exceed one's ability to properly model and
especially predict.  Even if the human mind invokes some special kind
of magical cleverness, do you think you (judging from your writing)
have some unique ability to isolate that function (noun) without
simultaneously using that function (verb) ?   I often imagine that I
understand the working of my own mind almost perfectly.  Those that
claim to have grasped the quintessential bit typically end up so far
over the edge that they are unable to express it in meaningful or
useful terms.

 Isn't it obvious that the brain is able to understand the wealth of language
 by relatively few computations - quite intricate, hierarchical,
 multi-levelled processing, yes, (in order to understand, for example, any of
 the sentences you or I are writing here), but only a tiny fraction of the
 operations that computers currently perform?

I believe you are making that statement because you wish it to be
true.  I see no basis for anything to be obvious - especially the
formalism required to define what the term means.  This is due
primarily to the complexity associated with recursive self-reflection.

 The whole idea of massive parallel computation here, surely has to be wrong.
 And yet none of you seem able to face this to my mind obvious truth.

We each continue to persist in our delusions.  Yours may be no
different in the end. :)

 I only saw this term recently - perhaps it's v. familiar to you (?) - that
 the human brain works by look-up rather than search.  Hard problems can
 have relatively simple but ingenious solutions.

How is the look-up table built?  Usually by experience.  When we have
enough similar experiences to look up a solution to general adaptive
intelligence, we will have likely been close enough to it for so long
that (probably) nobody will be surprised.

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=71652723-808348


Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-03 Thread Mike Tintner

RL: One thing that can be easily measured is the activation of lexical
items related in various ways to a presented word (i.e. show the subject
the word Doctor and test to see if the word Nurse gets activated).
It turns out that within an extremely short time of the forst word being
seen, a very large numbmer of other words have their activations raised
significantly.  Now, whichever way you interpret these (so called
priming) results, one thing is not in doubt:  there is massively
parallel activation of lexical units going on during language processing.

Thanks for reply. How many associations are activated? How do we know 
neuroscientifically they are associations to the words being processed and 
not something else entirely? Out of interest, can you give me a ball park 
estimate of how many associations you personally think are activated, say, 
in in a few seconds, in processing sentences like:


The doctor made a move on the nurse.
Relationships between staff in health organizations are fraught with 
complexities


No, I'm not trying to be ridiculously demanding or asking you to be 
ridiculously exact. As you probably know by now, I see the processing of 
sentences as involving several levels, especially for the second sentence, 
but I don't see the number of associations as that many. Let's be generous 
and guess hundreds for the items in the above sentences. But a computer 
program, as I understand, will be typically searching through anywhere 
between thousands, millions and way upwards.


On the one hand, we can perhaps agree that one of the brain's glories is 
that it can very rapidly draw analogies - that I can quickly produce a 
string of associations like, say,  snake, rope, chain, spaghetti 
strand, - and you may quickly be able to continue that string with further 
associations, (like string). I believe that power is mainly based on 
look-up - literally finding matching shapes at speed. But I don't see the 
brain as checking through huge numbers of such shapes. (It would be 
enormously demanding on resources, given that these are complex pictures, 
no?).


As evidence , I'd point to what happens if you try to keep producing further 
analogies. The brain rapidly slows down. It gets harder and harder. And yet 
you will be able to keep producing further examples from memory virtually 
for ever - just slower and slower. Relevant images/ concepts are there, but 
it's not easy to access them. That's why copywriters get well paid to, in 
effect, keep searching for similar analogies (as cool/refreshing as...). 
It's hard work. If that many relevant shapes were being unconsciously 
activated as you seem to be suggesting, it shouldn't be such protracted 
work.


The brain can literally connect any thing to any other thing with, so to 
speak, 6 degrees of separation - but I don't think it can conect that many 
things at once.


I accept that this is still neuroscientifically an open issue, ( I'd be 
grateful for  pointers to the research you're referring to). But I would 
have thought it obvious that the brain has massively inferior search 
capabilities to those of computers - that, surely, is a major reasonwhy we 
invented computers in the first place - they're a massive extension of our 
powers.


And yet the brain can draw analogies, and basically, with minor exceptions, 
computers still can't. I think it's clear that computers won't catch up here 
by quantitatively increasing their powers still further. If you're digging a 
hole in the wrong place, digging further  quicker won't help. (I'm arguing 
a variant of your own argument against  Edward P!). But of course when your 
education and technology dispose you to dig in just those places, it's 
extremely hard to change your ways - or even believe, pace Edward, that 
change is necessary at all. After all, look at the size of those holes.. 
surely, we'll hit the Promised Land anytime now.


P.S. In general, the brain is hugely irrational - it can only maintain a 
reflective, concentrated train of thought for literally seconds, not minutes 
before going off at tangents. It continually and necessarily jumps to 
conclusions. Such irrationality is highly adaptive in a fast-moving world 
where you can't hang around thinking about things for long.  The idea that 
this same brain is systematically, thoroughly searching through, let's say, 
thousands or millions of variants on ideas, seems to me seriously at odds 
with this irrationality. (But I'm interested in all relevant research).




-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=71651016-b43e51


Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-03 Thread Matt Mahoney
--- Mike Tintner [EMAIL PROTECTED] wrote:
 On the one hand, we can perhaps agree that one of the brain's glories is 
 that it can very rapidly draw analogies - that I can quickly produce a 
 string of associations like, say,  snake, rope, chain, spaghetti 
 strand, - and you may quickly be able to continue that string with further 
 associations, (like string). I believe that power is mainly based on 
 look-up - literally finding matching shapes at speed. But I don't see the 
 brain as checking through huge numbers of such shapes. (It would be 
 enormously demanding on resources, given that these are complex pictures, 
 no?).

Semantic models learn associations by proximity in the training text.  The
degree to which you associate snake and rope depends on how often these
words appear near each other.  You can create an association matrix A, e.g.
A[snake][rope] is the degree of association between these words.

Among the most successful of these models is latent semantic analysis (LSA),
where A is factored: A = USV by singular value decomposition (SVD), such that
U and V are orthonormal and S is diagonal, and then discard all but the
largest elements of S.  In a typical LSA model, A is 20K by 20K, and S is
reduced to about 200.  This approximates A to two 20K by 200 matrices, using
about 2% as much space.

One effect of lossy compression by LSA is to derive associations by the
transitive property of semantics.  For example, if snake is associated with
rope and rope with chain, then the LSA approximation will derive an
association of snake with chain even if it was not seen in the training
data.

SVD has an efficient parallel implementation.  It is most easily visualized as
a 20K by 200 by 20K 3-layer linear neural network [1].  But this really should
not be surprising, because natural language evolved to be processed
efficiently on a slow but highly parallel computer.

1. Gorrell, Genevieve (2006), “Generalized Hebbian Algorithm for Incremental
Singular Value Decomposition in Natural Language Processing”, Proceedings of
EACL 2006, Trento, Italy.
http://www.aclweb.org/anthology-new/E/E06/E06-1013.pdf


-- Matt Mahoney, [EMAIL PROTECTED]

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=71675396-27fd0e


RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-03 Thread Ed Porter
 was relatively efficient. 

-5-Hecht-Nielsen's sentence completion program (produced by his
confabulation see http://r.ucsd.edu), just by appropriately tying together
probabilistic implications learned from sequences of words, automatically
creates grammatically correct sentences that are related to a prior
sentense, allegedly without any knowledge of grammar, using millions of
probability activations per word, without any un-computable combinatorial
explosion.  The search space that is being explored at any one time
theoretically is considering more possibilities than there are particles in
the known universe -- yet it works.  At any given time several, lets, say 6
to 12 word or phrase slots can be under computation, in which each of
approximately 100K or so words or phrases is receiving scores.  One could
consider the search space to include each of the possible words or phrase
being considered in each of those say 10 ordered slots as the possible
permutation of 10 slot fillers each chosen from a set of about 10^5 words or
phrases, a permuation that has (10^5)!/(10^4)! possibilities.  This is a
very large search space  -- just 100!/10! is over 10^151¸and (10^5)!/(10^4)!
is much, much, much larger space than that -- and yet it all compute with
somewhere within several orders of magnitude of a billion opps.  This very
large search space is actually handled with a superposition of probabilities
(somewhat as in quantum computing) which are collapsed in a sequential
manner, in a rippling propagation of decisions and ensuing probability
propagations. 

So Richard there are ways to do searches efficiently in very high
dimensional spaces, including in the case of confabulation spaces that are
in some ways trillions and trillions of times larger than the known universe
-- all on relatively small computers.  

So lift thine eyes up unto Hecht-Nielsen -- (and his cat with whom he
generously shares credit for Confabulation) -- and believe!

Ed Porter



-Original Message-
From: Richard Loosemore [mailto:[EMAIL PROTECTED] 
Sent: Monday, December 03, 2007 12:49 PM
To: agi@v2.listbox.com
Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

Ed Porter wrote:
 Richard,
 
 It is false to imply that knowledge of how to draw implications from a
 series of statements by some sort of search mechanism is equally unknown
as
 that of how to make an anti-gravity drive -- if by anti-gravity drive
you
 mean some totally unknown form of physics, rather than just anything, such
 as human legs, that can push against gravity.  
 
 It is unfair because there is a fair amount of knowledge about how to draw
 implications from sequences of statements.  For example view Shastri's
 www.icsi.berkeley.edu/~shastri/psfiles/cogsci00.ps.  Also Ben Goertzel has
 demonstrated a program that draws implications from statements contained
in
 different medical texts.
 
 Ed Porter 
 
 P.S., I have enclosed an inexact, but, at least to me, useful drawing I
made
 of the type of search involved in understanding the multiple
implications
 contained in the series of statements contained in Shastri's John fell in
 the Hallway. Tom had cleaned it.  He was hurt example.  Of course, what
is
 most missing from this drawing are all the other, dead end, implications
 which do not provide a likely implication.  Only one of such dead end is
 shown (the implication between fall and trip).  As a result you don't
sense
 how many dead ends have to be searched to find the implications which best
 explain the statements.   EWP

Well, bear in mind that I was not meaning the analogy to be *that* 
exact, or I would have given up on AGI long ago - I'm sure you know that 
I don't believe that getting an understanding system working is as 
impossible as getting an AG drive built.

The purpose of my comment was to point to a huge gap in understanding, 
and the mistaken strategy of dealing with all the peripheral issues 
before having a clear idea how to solve the central problem.

I cannot even begin to do justice, here, to the issues involved in 
solving the high dimensional problem of seeking to understand the 
meaning of text, which often involve multiple levels of implication, 
which would normally be accomplished by some sort of search of a large 
semantic space

You talk as if an extension of some current strategy will solve this ... 
but it is not at all clear that any current strategy for solving this 
problem actually does scale up to a full solution to the problem.  I 
don't care how many toy examples you come up with, you have to show a 
strategy for dealing with some of the core issues, AND reasons to 
believe that those strategies really will work (other than I find them 
quite promising).

Not only that, but there at least some people (to wit, myself) who 
believe there are positive reasons to believe that the current 
strategies *will* not scale up.



Richard Loosemore



 -Original Message-
 From: Richard Loosemore [mailto

Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-03 Thread Mike Tintner

MIKE TINTNER Isn't it obvious that the brain is able to understand the
wealth of language by relatively few computations - quite intricate,
hierarchical, multi-levelled processing,

ED PORTER How do you find the right set of relatively few computations
and/or models that are appropriate in a complex context without massive
computation?

Ed, Contrary to my PM, maybe I should answer this in more precise detail.My 
hypothesis is as follows: the brain does most of its thinking, and 
particularly adaptive thinking, by look-up not by blind search.


How can you or I deal with :

Get that box out of this house now..

How is it say, that I will be able to think of a series of ideas like get 
ten men to carry it, get a fork-lift truck to move it, use large 
levers,  get hold of some heavy ropes ... etc etc. straight off the top 
of my head in well under a minute?


All of those ideas are derived from visual/sensory images/ schemas of large 
objects being moved.  The brain does not, I suggest, consult digital/ verbal 
lists or networks of verbal ideas about moving boxes out of houses or any 
similar set of verbal concepts, (except v. occasionally).


How then does the brain rapidly pull relevant large-object-moving shapes out 
of  memory? (There are obviously more operations involved here than just 
shape search, but that's what I want to concentrate on).  Now this is where 
I confess again to being a general techno-idiot (although I suspect that in 
this particular area most of you may be, too). My confused idea is that if 
you have a stack of shapes, there are ways to pull out/ spot the relevant 
ones quickly without sorting through the stack one by one. I think Hawkins 
suggests something like this in ON INtelligence. Maybe you can have thoughts 
about this.


(Alternatively, the again confused idea occurs that certain neuronal areas, 
when stimulated with a certain shape, may be able to remember similar shapes 
that have been there before -  v. loosely as certain metals when heated, can 
remember/ resume old forms)


Whatever, I am increasingly confident  that the brain does work v. 
extensively by matching shapes physically, (rather than by first converting 
them into digital/symbolic form). And I recommend here Sandra Blakeslee's 
latest book on body maps -  the opening Ramachandran quote -


When a reporter asked the famous biologist JBS Haldane what his biological 
studies had taught about God, Haldane replied:The creator if he exists must 
have an inordinate fondness for beetles since there are more species of 
beetle than any other group of living creqtures. By the same token, a 
neurologist might conclude that God is a cartographer. He must have an 
inordinate fondness for maps, for everywhere you look in the brain maps 
abound.


If I'm headed even loosely in the right direction here,  only analog 
computation will be able to handle the kind of rapid shape matching and 
searches I'm talking about, as opposed to the inordinately long, blind 
symbolic searches of digital computation. And you're going to need a whole 
new kind of computer. But none of you guys are prepared to even contemplate 
that.


P.S. One important feature of shape searches by contrast with digital, 
symbolic searches is that you don't make mistakes.  IOW when we think 
about a problem like getting the box out of a house, all our ideas, I 
suggest, will be to some extent relevant. They may not totally solve the 
problem, but they will fit some of the requirements, precisely because they 
have been derived by shape comparison. When a computer blindly searches 
lists of symbols by contrast, most of them of course are totally irrelevant.




-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=71680486-77dd12


Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-03 Thread Richard Loosemore

Ed Porter wrote:

RICHARD LOOSEMORE I cannot even begin to do justice, here, to the issues
involved in solving the high dimensional problem of seeking to understand
the meaning of text, which often involve multiple levels of implication,
which would normally be accomplished by some sort of search of a large
semantic space

You talk as if an extension of some current strategy will solve this ... but
it is not at all clear that any current strategy for solving this 
problem actually does scale up to a full solution to the problem.  I don't
care how many toy examples you come up with, you have to show a 
strategy for dealing with some of the core issues, AND reasons to believe
that those strategies really will work (other than I find them 
quite promising).


Not only that, but there at least some people (to wit, myself) who believe
there are positive reasons to believe that the current 
strategies *will* not scale up.


ED PORTER  I don't know if you read the Shastri paper I linked to or
not, but it shows we do know how to do many of the types of implication
which are used in NL.  What he shows needs some extensions, so it is more
generalized, but it and other known inference schemes explain a lot of how
text understanding could be done.  


With regard to the scaling issue, it is a real issue.  But there are
multiple reasons to believe the scaling problems can be overcome.  Not
proofs, Richard, so you are entitled to your doubts.  But open your mind to
the possibilities they present.  They include:

-1-the likely availability of roughly brain level representational,
computational, and interconnect capacities within the several hundred
thousand to 1 million dollar range in seven to ten years.

-2-the fact that human experience and representation does not
explode combinatorially.  Instead it is quite finite.  It fits insides our
heads.  


Thus, although you are dealing with extremely high dimensional spaces, most
of that space is empty.  There are know ways to deal with extremely high
dimensional spaces while avoiding the exponential explosion made possible by
such high dimensionality.  


Take the well know Growing Neural Gas (GNG) algorithm.  It automatically
creates a relative compact representation of a possibly infinite dimensional
space, by allocated nodes to only those parts of the high dimensional space
where there is stuff, or, if resource are more limited, where the most stuff
is.

Or take indexing, it takes one only to places in the hyperspace where
something actually occurred or was thought about.  One can have
probabilitistically selected hierarchical indexing (something like John Rose
suggested) which make indexing much more efficient.


I'm sorry, but this is not addressing the actual issues involved.

You are implicitly assuming a certain framework for solving the problem 
of representing knowledge ... and then all your discussion is about 
whether or not it is feasible to implement that framework (to overcome 
various issues to do with searches that have to be done within that 
framework).


But I am not challenging the implementation issues, I am challenging the 
viability of the framework itself.


My mind is completely open.  But right now I raised one issue, and this 
is not answered.


I am talking about issues that could prevent that framework from ever 
working no matter how much computing power is available.


You must be able to see this:  you are familiar with the fact that it is 
possible to frame a solution to certain problems in such a way that the 
proposed solution is KNOWN to not converge on an answer?  An answer can 
be perfectly findable IF you use a different representation, but there 
are some ways of representing the problem that lead to a type of 
solution that is completely incomputable.


This is an analogy:  I suggest to you that the framework you have in 
mind when you discuss the solution of the AGI problem is like those 
broken representations.




-3-experiential computers focus most learning, most models, and most
search on things that actually have happened in the past or on things that
in many ways are similar to what has happened in the past.  This tends to
greatly reduce representational and search spaces.

When such a system synthesizes or perceives new patterns that have never
happened before the system will normally have to explore large search
spaces, but because of the capacity of brain level hardware it will have
considerable capability to do so.  The type of hardware that will be
available for human-level agi in the next decade will probably have
sustainable cross sectional bandwidths of 10G to 1T messages/sec with 64Byte
payloads/msg.  With branching tree activations and the fact that many
messages will be regional, the total amount of messaging could well be 100G
to 100T such msg/sec.

Lets assume our hardware has 10T msg/sec and that we want to read 10 words a
second.  That would allow 1T msg/word.  With a dumb spreading 

Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-03 Thread Richard Loosemore

Ed Porter wrote:

RICHARD LOOSEMORE I cannot even begin to do justice, here, to the issues
involved in solving the high dimensional problem of seeking to understand
the meaning of text, which often involve multiple levels of implication,
which would normally be accomplished by some sort of search of a large
semantic space

You talk as if an extension of some current strategy will solve this ... but
it is not at all clear that any current strategy for solving this 
problem actually does scale up to a full solution to the problem.  I don't
care how many toy examples you come up with, you have to show a 
strategy for dealing with some of the core issues, AND reasons to believe
that those strategies really will work (other than I find them 
quite promising).


Not only that, but there at least some people (to wit, myself) who believe
there are positive reasons to believe that the current 
strategies *will* not scale up.


ED PORTER  I don't know if you read the Shastri paper I linked to or
not, but it shows we do know how to do many of the types of implication
which are used in NL.  What he shows needs some extensions, so it is more
generalized, but it and other known inference schemes explain a lot of how
text understanding could be done.  


With regard to the scaling issue, it is a real issue.  But there are
multiple reasons to believe the scaling problems can be overcome.  Not
proofs, Richard, so you are entitled to your doubts.  But open your mind to
the possibilities they present.  They include:

-1-the likely availability of roughly brain level representational,
computational, and interconnect capacities within the several hundred
thousand to 1 million dollar range in seven to ten years.

-2-the fact that human experience and representation does not
explode combinatorially.  Instead it is quite finite.  It fits insides our
heads.  


Thus, although you are dealing with extremely high dimensional spaces, most
of that space is empty.  There are know ways to deal with extremely high
dimensional spaces while avoiding the exponential explosion made possible by
such high dimensionality.  


Take the well know Growing Neural Gas (GNG) algorithm.  It automatically
creates a relative compact representation of a possibly infinite dimensional
space, by allocated nodes to only those parts of the high dimensional space
where there is stuff, or, if resource are more limited, where the most stuff
is.

Or take indexing, it takes one only to places in the hyperspace where
something actually occurred or was thought about.  One can have
probabilitistically selected hierarchical indexing (something like John Rose
suggested) which make indexing much more efficient.

-3-experiential computers focus most learning, most models, and most
search on things that actually have happened in the past or on things that
in many ways are similar to what has happened in the past.  This tends to
greatly reduce representational and search spaces.

When such a system synthesizes or perceives new patterns that have never
happened before the system will normally have to explore large search
spaces, but because of the capacity of brain level hardware it will have
considerable capability to do so.  The type of hardware that will be
available for human-level agi in the next decade will probably have
sustainable cross sectional bandwidths of 10G to 1T messages/sec with 64Byte
payloads/msg.  With branching tree activations and the fact that many
messages will be regional, the total amount of messaging could well be 100G
to 100T such msg/sec.

Lets assume our hardware has 10T msg/sec and that we want to read 10 words a
second.  That would allow 1T msg/word.  With a dumb spreading activation
rule that would allow you to: active the 30K most probably implications; and
for each of them the 3K most probable implications; and for each of them the
300 most probable implications; and for each of them the 30 most probable
implications.  As dumb as this method of inferencing would be, it actually
would make a high percent of the appropriate multi-step inferences,
particularly when you consider that the probability of activation at the
successive stages would be guided by probabilities from other activations in
the current context.

Of course there are much more intelligent ways to guide activation that
this.

Also it is important to understand that at every level in many of the
searches or explorations in such a system there will be guidance and
limitations provided by similar models from past experience, greatly
reducing the amount of or the number of explorations that are required to
produce reasonable results.

-4-Michael Collins a few years ago had was many AI researches
considered to be the best grammatical parser, which used the kernel trick to
effectively match parse trees in, I think it was, 500K dimensions.  By use
of the Kernel trick the actual computation usually was performed in a small
subset of these dimensions 

Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-03 Thread Mike Tintner
Matt: Semantic models learn associations by proximity in the training text. 
The

degree to which you associate snake and rope depends on how often these
words appear near each other

Correct me - but it's the old, old problem here, isn't it? Those semantic 
models/programs  won't be able to form any *new* analogies, will they? Or 
understand newly minted analogies in texts?  And I'm v. dubious about their 
powers to even form valid associations of much value in the ways you 
describe from existing texts.


You're saying that there's a semantic model/program that can answer, if 
asked,:


yes - 'snake, chain, rope, spaghetti strand'  is a legitimate/ valid series 
of associations/ yes, they fit together  (based on previous textual 
analysis) ?


or:  the odd one out in 'snake/ chain/ cigarette/ rope  is 'cigarette'?

I have yet to find or be given a single useful analogy drawn by computers 
(despite asking many times). The only kind of analogy I can remember here is 
Ed, I think,  pointing to Hofstader's analogies along the lines of  xxyy 
is  like .  Not exactly a big deal. No doubt there must be more, 
but my impression is that in general computers are still pathetic here. 



-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=71683316-d0bd3c


RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-03 Thread Matt Mahoney
--- Ed Porter [EMAIL PROTECTED] wrote:
 We do not know the number and width of the spreading activation that is
 necessary for human level reasoning over world knowledge.  Thus, we really
 don't know how much interconnect is needed and thus how large of a P2P net
 would be needed for impressive AGI.  But I think it would have to be larger
 than say 10K nodes.

In complex systems on the boundary between stability and chaos, the degree of
interconnectedness per node is constant.  Complex systems always evolve to
this boundary because stable systems aren't complex and chaotic systems can't
be incrementally updated.

In my thesis ( http://cs.fit.edu/~mmahoney/thesis.html ) I did not estimate
the communication bandwidth.  But it is O(n log n) because the distance
between nodes grows as O(log n).  For each message sent or received, a node
must also relay O(log n) messages.

If the communication protocol is natural language text, then I am pretty sure
our existing networks can handle it.


-- Matt Mahoney, [EMAIL PROTECTED]

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=71684400-910726


Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-03 Thread Matt Mahoney

--- Mike Tintner [EMAIL PROTECTED] wrote:

 Matt: Semantic models learn associations by proximity in the training text. 
 The
 degree to which you associate snake and rope depends on how often these
 words appear near each other
 
 Correct me - but it's the old, old problem here, isn't it? Those semantic 
 models/programs  won't be able to form any *new* analogies, will they? Or 
 understand newly minted analogies in texts?  And I'm v. dubious about their 
 powers to even form valid associations of much value in the ways you 
 describe from existing texts.
 
 You're saying that there's a semantic model/program that can answer, if 
 asked,:
 yes - 'snake, chain, rope, spaghetti strand'  is a legitimate/ valid series
 of associations/ yes, they fit together  (based on previous textual 
 analysis) ?

Yes, because each adjacent pair of words has a high frequency of co-occurrence
in a corpus of training text.

 or:  the odd one out in 'snake/ chain/ cigarette/ rope  is 'cigarette'?

Yes, because cigarette does not have a high co-occurrence with the other
words.

 I have yet to find or be given a single useful analogy drawn by computers 
 (despite asking many times). The only kind of analogy I can remember here is
 Ed, I think,  pointing to Hofstader's analogies along the lines of  xxyy 
 is  like .  Not exactly a big deal. No doubt there must be more, 
 but my impression is that in general computers are still pathetic here.

This simplistic vector space model I described has been used to pass the word
analogy section of the SAT exams.  See: 

Turney, P., Human Level Performance on Word Analogy Questions by Latent
Relational Analysis (2004), National Research Council of Canada,
http://iit-iti.nrc-cnrc.gc.ca/iit-publications-iti/docs/NRC-47422.pdf


-- Matt Mahoney, [EMAIL PROTECTED]

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=71685861-05fe0f


RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-03 Thread Ed Porter
Mike

-Original Message-
From: Mike Tintner [mailto:[EMAIL PROTECTED] 
Sent: Monday, December 03, 2007 8:25 PM
To: agi@v2.listbox.com
Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

MIKE TINTNER Isn't it obvious that the brain is able to understand
the
wealth of language by relatively few computations - quite intricate,
hierarchical, multi-levelled processing,

ED PORTER How do you find the right set of relatively few
computations
and/or models that are appropriate in a complex context without massive
computation?

MIKE TINTNER How then does the brain rapidly pull relevant
large-object-moving shapes out 
of  memory? (There are obviously more operations involved here than just 
shape search, but that's what I want to concentrate on).  Now this is where 
I confess again to being a general techno-idiot (although I suspect that in 
this particular area most of you may be, too). My confused idea is that if 
you have a stack of shapes, there are ways to pull out/ spot the relevant 
ones quickly without sorting through the stack one by one. I think Hawkins 
suggests something like this in ON INtelligence. Maybe you can have thoughts

about this.

ED One way is by indexing some thing by its features, but this is a form
of a search, which if done completely activates each occurrence of each
feature searched for, and then selects the one or more pattern with the best
activation score.  Others on the list can probably name other methods

Another used in perception is to hierarchically match inputs against
patterns that represent given shapes under different conditions.

MIKE TINTNER (Alternatively, the again confused idea occurs that
certain neuronal areas, 
when stimulated with a certain shape, may be able to remember similar shapes

that have been there before -  v. loosely as certain metals when heated, can

remember/ resume old forms)

Whatever, I am increasingly confident  that the brain does work v. 
extensively by matching shapes physically, (rather than by first converting 
them into digital/symbolic form). And I recommend here Sandra Blakeslee's 
latest book on body maps -  the opening Ramachandran quote -

ED there clearly is some shape matching in the brain.

MIKE TINTNER P.S. One important feature of shape searches by contrast
with digital, 
symbolic searches is that you don't make mistakes.  IOW when we think 
about a problem like getting the box out of a house, all our ideas, I 
suggest, will be to some extent relevant. They may not totally solve the 
problem, but they will fit some of the requirements, precisely because they 
have been derived by shape comparison. When a computer blindly searches 
lists of symbols by contrast, most of them of course are totally irrelevant.


ED Yes, but there are a lot of types of thinking that cannot be done by
shape alone, and shape is actually much more complicated than shape.  There
is shape, and shape distorted by perspective, and shape changed by bending,
and shape changed by size.  There is shape of objects, shape of
trajectories, 2d shapes, 3d shapes.  There are visual memories, where we
don't really remember all the shapes, but instead remember the types of
things that were their and fill in most of the actual shapes.  In sum, it's
a lot more complicated that just finding a matching photograph.


-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=71691780-efaeb1attachment: winmail.dat

RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-03 Thread Ed Porter



RICHARD LOOSEMORE= I'm sorry, but this is not addressing the actual
issues involved.

You are implicitly assuming a certain framework for solving the problem 
of representing knowledge ... and then all your discussion is about 
whether or not it is feasible to implement that framework (to overcome 
various issues to do with searches that have to be done within that 
framework).

But I am not challenging the implementation issues, I am challenging the 
viability of the framework itself.


ED PORTER= So what is wrong with my framework?  What is wrong with a
system of recording patterns, and a method for developing compositions and
generalities from those patterns, in multiple hierarchical levels, and for
indicating the probabilities of certain patterns given certain other pattern
etc?  

I know it doesn't genuflect before the alter of complexity.  But what is
wrong with the framework other than the fact that it is at a high level and
thus does not explain every little detail of how to actually make an AGI
work?



RICHARD LOOSEMORE= These models you are talking about are trivial
exercises in public 
relations, designed to look really impressive, and filled with hype 
designed to attract funding, which actually accomplish very little.

Please, Ed, don't do this to me. Please don't try to imply that I need 
to open my mind any more.  Th implication seems to be that I do not 
understand the issues in enough depth, and need to do some more work to 
understand you points.  I can assure you this is not the case.



ED PORTER= Shastri's Shruiti is a major piece of work.  Although it is
a highly simplified system, for its degree of simplification it is amazingly
powerful.  It has been very helpful to my thinking about AGI.  Please give
me some excuse for calling it trivial exercise in public relations.  I
certainly have not published anything as important.  Have you?

The same for Mike Collins's parsers which, at least several years ago I was
told by multiple people at MIT was considered one of the most accurate NL
parsers around.  Is that just a trivial exercise in public relations?  

With regard to Hecht-Nielsen's work, if it does half of what he says it does
it is pretty damned impressive.  It is also a work I think about often when
thinking how to deal with certain AI problems.  

Richard if you insultingly dismiss such valid work as trivial exercises in
public relations it sure as hell seems as if either you are quite lacking
in certain important understandings -- or you have a closed mind -- or both.



Ed Porter

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?;

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=71696956-846847

RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-03 Thread Ed Porter

Richard Loosemore= None of the above is relevant.  The issue is not
whether toy problems 
set within the current paradigm can be done with this or that search 
algorithm, it is whether the current paradigm can be made to converge at 
all for non-toy problems.

Ed Porter= Richard, I wouldn't call a state of the art NL parser that
matches parse trees in 500K dimensions a toy problem.  Yes, it is much less
than a complete human brain, but it is not a toy problem.

With regard to Hecht-Nielsen's sentence completion program it is arguably a
toy problem, but it operates extremely efficiently (i.e., converges) in an
astronomically large search space, with a significant portion of that search
space having some arguable activation.  The fact that there is such
efficient convergence in such a large search space is meaningful, and the
fact that you just dismiss it, as you did in your last email as a trivial
publicity stunt is also meaningful.

Ed Porter


-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=71705619-d121f2

RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-03 Thread Ed Porter
Matt,

IN my Mon 12/3/2007 8:17 PM post to John Rose from which your are probably
quoting below I discussed the bandwidth issues.  I am assuming nodes
directly talk to each other, which is probably overly optimistic, but still
are limited by the fact that each node can only receive somewhere roughly
around 100 128 byte messages a second.  Unless you have a really big P2P
system, that just isn't going to give you much bandwidth.  If you had 100
million P2P nodes it would.  Thus, a key issue is how many participants is
an AGI-at-Home P2P system going to get.  

I mean, what would motivate the average American, or even the average
computer geek turn over part of his computer to it?  It might not be an easy
sell for more than several hundred or several thousand people, at least
until it could do something cool, like index their videos for them, be a
funny chat bot, or something like that.

Ed Porter

-Original Message-
From: Matt Mahoney [mailto:[EMAIL PROTECTED] 
Sent: Monday, December 03, 2007 8:51 PM
To: agi@v2.listbox.com
Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

--- Ed Porter [EMAIL PROTECTED] wrote:
 We do not know the number and width of the spreading activation that is
 necessary for human level reasoning over world knowledge.  Thus, we really
 don't know how much interconnect is needed and thus how large of a P2P net
 would be needed for impressive AGI.  But I think it would have to be
larger
 than say 10K nodes.

In complex systems on the boundary between stability and chaos, the degree
of
interconnectedness per node is constant.  Complex systems always evolve to
this boundary because stable systems aren't complex and chaotic systems
can't
be incrementally updated.

In my thesis ( http://cs.fit.edu/~mmahoney/thesis.html ) I did not estimate
the communication bandwidth.  But it is O(n log n) because the distance
between nodes grows as O(log n).  For each message sent or received, a node
must also relay O(log n) messages.

If the communication protocol is natural language text, then I am pretty
sure
our existing networks can handle it.


-- Matt Mahoney, [EMAIL PROTECTED]

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?;

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=71708450-da8cab

RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-03 Thread Ed Porter
Matt,

In addition to my last email, I don't understand what your were saying below
about complexity.  Are you saying that as a system becomes bigger it
naturally becomes unstable, or what?

Ed Porter 

-Original Message-
From: Matt Mahoney [mailto:[EMAIL PROTECTED] 
Sent: Monday, December 03, 2007 8:51 PM
To: agi@v2.listbox.com
Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

--- Ed Porter [EMAIL PROTECTED] wrote:
 We do not know the number and width of the spreading activation that is
 necessary for human level reasoning over world knowledge.  Thus, we really
 don't know how much interconnect is needed and thus how large of a P2P net
 would be needed for impressive AGI.  But I think it would have to be
larger
 than say 10K nodes.

In complex systems on the boundary between stability and chaos, the degree
of
interconnectedness per node is constant.  Complex systems always evolve to
this boundary because stable systems aren't complex and chaotic systems
can't
be incrementally updated.

In my thesis ( http://cs.fit.edu/~mmahoney/thesis.html ) I did not estimate
the communication bandwidth.  But it is O(n log n) because the distance
between nodes grows as O(log n).  For each message sent or received, a node
must also relay O(log n) messages.

If the communication protocol is natural language text, then I am pretty
sure
our existing networks can handle it.


-- Matt Mahoney, [EMAIL PROTECTED]

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?;

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=71710422-50e2fa

Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-03 Thread Bryan Bishop
On Thursday 29 November 2007, Ed Porter wrote:
 Somebody (I think it was David Hart) told me there is a shareware
 distributed web crawler already available, but I don't know the
 details, such as how good or fast it is.

http://grub.org/
Previous owner went by the name of 'kordless'. I found him on Slashdot.

- Bryan

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=71712384-417a60


Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-03 Thread Richard Loosemore

Ed Porter wrote:

Richard Loosemore= None of the above is relevant.  The issue is not
whether toy problems 
set within the current paradigm can be done with this or that search 
algorithm, it is whether the current paradigm can be made to converge at 
all for non-toy problems.


Ed Porter= Richard, I wouldn't call a state of the art NL parser that
matches parse trees in 500K dimensions a toy problem.  Yes, it is much less
than a complete human brain, but it is not a toy problem.


This is a toy problem.

Parsing is a deep problem?  Do you understand the relationship between 
parsing NL and extracting semantics?  Do you understand what this great 
NL parser would do if confronted with a syntactically incorrect but 
contextually meaningful sentence?  Has it been analysed to see what its 
behavior is on ambiguous sentences?  Could it learn to cope with someone 
speaking a pidgin version of NL, or would someone have to write an 
entire grammar for the language before the system could even start 
parsing it?  Can it generate syntactically correct sentences that 
express an idea?  Can it cope with speech errors, recgnising the nature 
o fteh error and backfilling, or does it just collapse with no viable 
parse?  Would the parser have to be completely rewritten in the future 
when someone else finally solves the problem of representing the 
semantics of language?


Finally, if you are impressed by the claim about 500K dimensions then 
what can I say?  Can you explain to me in what sense it matches parse 
trees in 500K dimensions, and why that is so impressive?


Perhaps I am being unnecessarily hard on you, Ed.  I don't mean to be 
personally rude, you know, but it is sometimes exhausting to have 
someone trying to teach you how to suck eggs



Richard Loosemore



With regard to Hecht-Nielsen's sentence completion program it is arguably a
toy problem, but it operates extremely efficiently (i.e., converges) in an
astronomically large search space, with a significant portion of that search
space having some arguable activation.  The fact that there is such
efficient convergence in such a large search space is meaningful, and the
fact that you just dismiss it, as you did in your last email as a trivial
publicity stunt is also meaningful.

Ed Porter


-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?;



-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=71714474-5576ff


Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-03 Thread Richard Loosemore

Ed Porter wrote:




RICHARD LOOSEMORE= I'm sorry, but this is not addressing the actual

issues involved.

You are implicitly assuming a certain framework for solving the problem 
of representing knowledge ... and then all your discussion is about 
whether or not it is feasible to implement that framework (to overcome 
various issues to do with searches that have to be done within that 
framework).


But I am not challenging the implementation issues, I am challenging the 
viability of the framework itself.



ED PORTER= So what is wrong with my framework?  What is wrong with a
system of recording patterns, and a method for developing compositions and
generalities from those patterns, in multiple hierarchical levels, and for
indicating the probabilities of certain patterns given certain other pattern
etc?  


I know it doesn't genuflect before the alter of complexity.  But what is
wrong with the framework other than the fact that it is at a high level and
thus does not explain every little detail of how to actually make an AGI
work?




RICHARD LOOSEMORE= These models you are talking about are trivial
exercises in public 
relations, designed to look really impressive, and filled with hype 
designed to attract funding, which actually accomplish very little.


Please, Ed, don't do this to me. Please don't try to imply that I need 
to open my mind any more.  Th implication seems to be that I do not 
understand the issues in enough depth, and need to do some more work to 
understand you points.  I can assure you this is not the case.




ED PORTER= Shastri's Shruiti is a major piece of work.  Although it is
a highly simplified system, for its degree of simplification it is amazingly
powerful.  It has been very helpful to my thinking about AGI.  Please give
me some excuse for calling it trivial exercise in public relations.  I
certainly have not published anything as important.  Have you?

The same for Mike Collins's parsers which, at least several years ago I was
told by multiple people at MIT was considered one of the most accurate NL
parsers around.  Is that just a trivial exercise in public relations?  


With regard to Hecht-Nielsen's work, if it does half of what he says it does
it is pretty damned impressive.  It is also a work I think about often when
thinking how to deal with certain AI problems.  


Richard if you insultingly dismiss such valid work as trivial exercises in
public relations it sure as hell seems as if either you are quite lacking
in certain important understandings -- or you have a closed mind -- or both.


Ed,

You have no idea of the context in which I made that sweeping dismissal. 
 If you have enough experience of research in this area you will know 
that it is filled with bandwagons, hype and publicity-seeking.  Trivial 
models are presented as if they are fabulous achievements when, in fact, 
they are just engineered to look very impressive but actually solve an 
easy problem.  Have you had experience of such models?  Have you been 
around long enough to have seen something promoted as a great 
breakthrough even though it strikes you as just a trivial exercise in 
public relations, and then watch history unfold as the great 
breakthrough leads to  absolutely nothing at all, and is then 
quietly shelved by its creator?  There is a constant ebb and flow of 
exaggeration and retreat, exaggeration and retreat.  You are familiar 
with this process, yes?


This entire discussion baffles me.  Does it matter at all to you that I 
have been working in this field for decades?  Would you go up to someone 
at your local university and tell them how to do their job?  Would you 
listen to what they had to say about issues that arise in their field of 
expertise, or would you consider your own opinion entirely equal to 
theirs, with only a tiny fraction of their experience?




Richard Loosemore


-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=71711822-0e911b


Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-03 Thread Mike Tintner
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]ED Yes, 
but there are a lot of types of thinking that cannot be done by shape alone, 
and shape is actually much more complicated than shape.  There is shape, and 
shape distorted by perspective, and shape changed by bending, and shape changed 
by size.  There is shape of objects, shape of trajectories, 2d shapes, 3d 
shapes.  There are visual memories, where we don't really remember all the 
shapes, but instead remember the types of things that were their and fill in 
most of the actual shapes.  In sum, it's a lot more complicated that just 
finding a matching photograph.

Ed,

I am not suggesting that shape matching is everything, merely that it is 
central to a great many of the brain's operations - and to its ability to 
search rapidly and briefly and locate analogical ideas (and if that's true, as 
I believe it is, then, sorry, AGI's stuckness is going to continue for a long 
time yet).

The reason I'm replying though is a further thought occurred to me. Essentially 
I've been suggesting that the brain has some means to locate matching shapes 
quickly in very few operations where a digital computer laboriously searches 
through long lists or networks of symbols in a great many operations. One v. 
crude idea for the mechanism I suggested was that neuronal areas somehow 
retain memories of shapes, which can be stimulated by similar incoming shapes - 
so that analogies can be drawn with extreme rapidity, more or less on the spot. 
[Spot checks]

It's occurred to me that this may well happen over and over throughout the body 
 related brain areas.  The same body areas that today feel stiff / expanded/ 
cold  , felt loose/ contracted/ warm yesterday. The same hand that was a ball, 
and many other shapes, is now a fist. So perhaps these memories are all somehow 
laid on top of each in the same brain areas..Map upon map upon map .Just an 
extremely rough idea, but I think it does go some way to showing how shape 
matching could indeed be extremely rapid and effective in the brain, by 
contrast with computers' blind, disembodied search. 

It follows BTW re your points above, that the same brain areas will also retain 
many morphic variations on the same basic shapes - objects/cups seen say 
moving, from different angles, zooming in and out etc. 

And if it's true, as I believe, that the brain uses loose, highly flexible 
templates for visual object perception - then that too should mean that it will 
easily and rapidly be able to connect closely related shapes as in snake/ 
chain/ rope/ spaghetti strand. Analogies and perception are interwoven for the 
brain. Blakeslee makes a good deal of the brain using flexible, morphic body 
maps. 

Thanks for your reply. Further thoughts re mechanisms welcome. As Blakeslee 
points out, this whole area is just beginning to open up.

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=71724560-1bc574

RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-03 Thread John G. Rose
Ed,

Well it'd be nice having a supercomputer but P2P is a poor man's
supercomputer and beggars can't be choosy.

Honestly the type of AGI that I have been formulating in my mind has not
been at all closely related to simulating neural activity through
orchestrating partial and mass activations at low frequencies and I had been
avoiding those contagious cog sci memes on purpose. But your expose on the
subject is quite interesting and I wasn't that aware that that is how things
have been being done.

But getting more than a few thousand P2P nodes is difficult. Going from 10K
to 20K nodes and up, getting more difficult to the point of being
prohibitively expensive to being impossible or extremely lucky.  There are
ways to do it but according to your calculations the supercomputer mayt be
more of a wise choice as going out and scrounging up funding for that would
be easier.

Still though (besides working on my group theory heavy design) exploring the
crafting and chiseling of an activation model you are talking about to the
P2P network could be fruitful. I feel that through a number of up front and
unfortunately complicated design changes/adaptations that the activation
orchestrations could be improved thus bringing down the message rate
requirements, reducing activation requirements, depths and frequencies,
through a sort of computational resource topology consumption,
self-organizational design molding.

You do indicate some dynamic resource adaption and things like intelligent
inference guiding schemes in your description but it doesn't seem like it
melts enough into the resource space. But having a design be less static
risks excessive complications...

A major problem though with P2P and the activation methodology is that there
are so many variances in the latencies and availability that serious
synchronicity/simultaneity issues would exist that even more messaging might
be required. Since there are so many variables in public P2P, empirical data
also would be necessary to get a gander on feasibility.

I still feel strongly that the way to do AGI P2P (with public P2P as core
not augmental) is to understand the grid, and build the AGI design based on
that and what it will be in a few years, instead of taking a design and
morphing it to the resource space. That said, there are finite designs that
will work so the number of choices is few.

John


_
From: Ed Porter [mailto:[EMAIL PROTECTED] 
Sent: Monday, December 03, 2007 6:17 PM
To: agi@v2.listbox.com
Subject: RE: Hacker intelligence level [WAS Re: [agi]
Funding AGI research]


John, 

You raised some good points.  The problem is that the total
number of messages/sec that can be received is relatively small.  It is not
as if you are dealing with a multidimensional grid or toroidal net in which
spreading tree activation can take advantage of the fact that the total
parallel bandwidth for regional messaging can be much greater than the
x-sectional bandwidth.  

In a system where each node is a server class node with
multiple processors and 32 or 64Gbytes of ram, much of which is allocable to
representation, sending messages to local indices on each machine could
fairly efficiently activate all occurrences of something in a 32 to 64 TByte
knowledge base with a max of 1K internode messages, if there was only 1K
nodes.

But in a PC based P2P system the ratio of nodes to
representation space is high and the total number of 128 byte messages/sec
than can be received is limited to about 100, so neither methods of trying
to increase number of patterns than can be activated with the given
interconnect of the network buy you as much.

Human level context sensitivity arises because a large
number of things that can depend on a large number of things in the current
context are made aware of those dependencies.  This takes a lot of
messaging, and I don't see how a P2P system where each node can only receive
about 100 relatively short messages a second is going to make this possible
unless you had a huge number of nodes. As Richard Loosemore said in his Mon
12/3/2007 12:57 PM post.

It turns out that within an extremely short
time of the forst word being 
seen, a very large numbmer of other words
have their activations raised 
significantly.  Now, whichever way you
interpret these (so called 
priming) results, one thing is not in
doubt:  there is massively 
parallel activation of lexical units going
on during language processing. 

With special software, a $10M dollar supercomputer cluster
with 1K nodes, 32TBytes of Ram, and a dual ported 20Mb infiniband
interconnect send about 1

RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-02 Thread John G. Rose
Ed,

Building up parse trees and word sense models, let's say that would be a
first step. And then say after a while this was accomplished and running on
some peers. What would the next theoretical step be?

Also, what would you try to accomplish if there was more bandwidth and more
computing power? The reason I ask is that a public peer network can be
constructed in many ways and a subset of nodes can be higher bandwidth - 10,
20, 30+ mbits and some legs can be very high approaching 400 mbits.
Computing power doesn't get that high 'cept for a small subset where you
have multiproc/multicore servers but these are rare. Also, even with the
basic lower end, lower quality nodes, including DSL, etc. the computational
resource topology can molded and optimized for particular computational goal
structures.

John


 -Original Message-
 From: Ed Porter [mailto:[EMAIL PROTECTED]
 Sent: Saturday, December 01, 2007 6:41 PM
 To: agi@v2.listbox.com
 Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI
 research]
 
 John,
 
 I tested Exeter, NH to LA at 5371kbs download, and 362Kbs upload.
 Strangelly
 my scores were slightly slower to NYC.
 
 
 Just throwing out ideas, for example, AGI-at-home PC's in the net could
 crawl the web looking for reasonable NL text.  Use current NL tools to
 guess
 parse and word sense.  For each word in text, send it and it surrounding
 text, Part of speech labeling, surrounding parse tree, and word sense
 guess,
 to another P2P node that specializes in that word in similar contexts
 and
 separately another P2P node that specializes in similar parse trees.
 These
 specialist node could then develop statistical models for word senses
 based
 on clustering or other technique.  Then over time the statistical models
 would get send down to the reading nodes, and this EM cycle could be
 constantly repeated.
 
 Of course, without the cross-sectional bandwidth of proper AGI hardware,
 you
 are going to be severely limited from doing a lot of the things you
 would
 really like to be able to do.  But I think you should be able to come up
 with pretty good word sense models.
 
 Ed Porter
 
 -Original Message-
 From: John G. Rose [mailto:[EMAIL PROTECTED]
 Sent: Friday, November 30, 2007 2:55 PM
 To: agi@v2.listbox.com
 Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI
 research]
 
 Ed,
 
 That is probably a good rough estimate. There are more headers for the
 more
 frequently transmitted smaller messages but a 16 byte header may be a
 bit
 large.
 
 Here is a speedtest link -
 http://www.speedtest.net/
 
 My Comcast cable from Denver to NYC tests at 3537 kb/sec DL and 1588
 kb/sec
 UL much larger than the calculations 256kb/sec. The variance between
 tests
 to the same location is quite large on the DL side but UL is relatively
 stable. Saturating either DL or UL would impact the other.
 
 You can get higher efficiencies if you use UDP transmission without
 message
 serialization. Also you can do things like compression, only sending
 changes, etc..
 
 Distributed crawling with NL learning fits the scenario well since nodes
 download at higher speeds, process the download into a smaller dataset,
 then
 UL communicate the results to the server or share with peers. When one
 peer
 shares with many peers you hit the UL limit fast though so it has to be
 managed. And you have to figure out how the knowledge will be spread out
 -
 server centric, shared, hybrid... As the knowledge size increases with
 peer
 storage you have to come up with distributed indexes.
 
 John
 
 
  -Original Message-
  From: Ed Porter [mailto:[EMAIL PROTECTED]
  Sent: Friday, November 30, 2007 12:06 PM
  To: agi@v2.listbox.com
  Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI
  research]
 
  John,
 
  Thanks.  I guess that means and AGI-at-home system could be both up-
  loading
  and receiving about 27 1K msgs/sec if it wasn't being used for
 anything
  else
  and the networks weren't backed up in its neck of the woods.
 
  Presumably the number for say 128Byte messages would be say, roughly,
 8
  times faster (minus some percent for the latency associated with each
  message, so lets say roughly about 5 times faster or 135msg/sec.  Is
  that
  reasonable?
 
  So, it seems for example it would be quite possible to do
  estimation/maximilation type NL learning in a distributed manner with
 a
  lot
  of cable-box connected PC's and a distributed web crawler.
 
  Ed Porter
 
  -Original Message-
  From: John G. Rose [mailto:[EMAIL PROTECTED]
  Sent: Friday, November 30, 2007 12:33 PM
  To: agi@v2.listbox.com
  Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI
  research]
 
  Hi Ed,
 
  If the peer is not running other apps utilizing the network it could
 do
  the
  same. Typically a peer first needs to locate other peers. There may be
  servers involved but these are just for the few bytes transmitted for
  public
  IP address discovery

RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-12-01 Thread Ed Porter
John,

I tested Exeter, NH to LA at 5371kbs download, and 362Kbs upload. Strangelly
my scores were slightly slower to NYC.


Just throwing out ideas, for example, AGI-at-home PC's in the net could
crawl the web looking for reasonable NL text.  Use current NL tools to guess
parse and word sense.  For each word in text, send it and it surrounding
text, Part of speech labeling, surrounding parse tree, and word sense guess,
to another P2P node that specializes in that word in similar contexts and
separately another P2P node that specializes in similar parse trees.  These
specialist node could then develop statistical models for word senses based
on clustering or other technique.  Then over time the statistical models
would get send down to the reading nodes, and this EM cycle could be
constantly repeated.  

Of course, without the cross-sectional bandwidth of proper AGI hardware, you
are going to be severely limited from doing a lot of the things you would
really like to be able to do.  But I think you should be able to come up
with pretty good word sense models.

Ed Porter

-Original Message-
From: John G. Rose [mailto:[EMAIL PROTECTED] 
Sent: Friday, November 30, 2007 2:55 PM
To: agi@v2.listbox.com
Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

Ed,

That is probably a good rough estimate. There are more headers for the more
frequently transmitted smaller messages but a 16 byte header may be a bit
large.

Here is a speedtest link - 
http://www.speedtest.net/ 

My Comcast cable from Denver to NYC tests at 3537 kb/sec DL and 1588 kb/sec
UL much larger than the calculations 256kb/sec. The variance between tests
to the same location is quite large on the DL side but UL is relatively
stable. Saturating either DL or UL would impact the other.

You can get higher efficiencies if you use UDP transmission without message
serialization. Also you can do things like compression, only sending
changes, etc.. 

Distributed crawling with NL learning fits the scenario well since nodes
download at higher speeds, process the download into a smaller dataset, then
UL communicate the results to the server or share with peers. When one peer
shares with many peers you hit the UL limit fast though so it has to be
managed. And you have to figure out how the knowledge will be spread out -
server centric, shared, hybrid... As the knowledge size increases with peer
storage you have to come up with distributed indexes.

John


 -Original Message-
 From: Ed Porter [mailto:[EMAIL PROTECTED]
 Sent: Friday, November 30, 2007 12:06 PM
 To: agi@v2.listbox.com
 Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI
 research]
 
 John,
 
 Thanks.  I guess that means and AGI-at-home system could be both up-
 loading
 and receiving about 27 1K msgs/sec if it wasn't being used for anything
 else
 and the networks weren't backed up in its neck of the woods.
 
 Presumably the number for say 128Byte messages would be say, roughly, 8
 times faster (minus some percent for the latency associated with each
 message, so lets say roughly about 5 times faster or 135msg/sec.  Is
 that
 reasonable?
 
 So, it seems for example it would be quite possible to do
 estimation/maximilation type NL learning in a distributed manner with a
 lot
 of cable-box connected PC's and a distributed web crawler.
 
 Ed Porter
 
 -Original Message-
 From: John G. Rose [mailto:[EMAIL PROTECTED]
 Sent: Friday, November 30, 2007 12:33 PM
 To: agi@v2.listbox.com
 Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI
 research]
 
 Hi Ed,
 
 If the peer is not running other apps utilizing the network it could do
 the
 same. Typically a peer first needs to locate other peers. There may be
 servers involved but these are just for the few bytes transmitted for
 public
 IP address discovery as many(or most) peers reside hidden behind NATs.
 DNS
 names also require lookups but these are just for doing the initial
 match of
 hostname to IP address, if DNS is used at all.
 
 We're just talking basic P2P, one peer talking to one other peer,
 nothing
 complicated. As you can imagine P2P can take on many flavors as the
 number
 of peers increases.
 
 John
 
  -Original Message-
  From: Ed Porter [mailto:[EMAIL PROTECTED]
  Sent: Friday, November 30, 2007 10:10 AM
  To: agi@v2.listbox.com
  Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI
  research]
 
  John,
 
  Thanks.
 
  Can P2P transmission match the same roughly 27 1Kmsg/sec rate as the
  client
  to server upload you discribed?
 
  Ed Porter
 
  -Original Message-
  From: John G. Rose [mailto:[EMAIL PROTECTED]
  Sent: Thursday, November 29, 2007 11:40 PM
  To: agi@v2.listbox.com
  Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI
  research]
 
  OK for a guestimate take a half-way decent cable connection say
 Comcast
  on a
  good day with DL of 4mbits max and UL of 256kbits max with an
  undiscriminated protocol

RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-11-30 Thread John G. Rose
Hi Ed,

If the peer is not running other apps utilizing the network it could do the
same. Typically a peer first needs to locate other peers. There may be
servers involved but these are just for the few bytes transmitted for public
IP address discovery as many(or most) peers reside hidden behind NATs. DNS
names also require lookups but these are just for doing the initial match of
hostname to IP address, if DNS is used at all.

We're just talking basic P2P, one peer talking to one other peer, nothing
complicated. As you can imagine P2P can take on many flavors as the number
of peers increases.  

John

 -Original Message-
 From: Ed Porter [mailto:[EMAIL PROTECTED]
 Sent: Friday, November 30, 2007 10:10 AM
 To: agi@v2.listbox.com
 Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI
 research]
 
 John,
 
 Thanks.
 
 Can P2P transmission match the same roughly 27 1Kmsg/sec rate as the
 client
 to server upload you discribed?
 
 Ed Porter
 
 -Original Message-
 From: John G. Rose [mailto:[EMAIL PROTECTED]
 Sent: Thursday, November 29, 2007 11:40 PM
 To: agi@v2.listbox.com
 Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI
 research]
 
 OK for a guestimate take a half-way decent cable connection say Comcast
 on a
 good day with DL of 4mbits max and UL of 256kbits max with an
 undiscriminated protocol, an unknown TCP based protocol, talking to a
 fat-pipe, low latency server. Assume say 16 byte message header wrappers
 for
 all of your 128, 1024 and 10k byte message sizes.
 
 So upload is 256kbits, go ahead and saturate it fully with either of
 your
 128+16bytes, 1024+16bytes, and 10k+16bytes packet streams. Using TCP for
 reliability and assume some overhead say subtract 10% from the saturated
 value, retransmits, latency.
 
 What are we left with? Assume the PC has 1gigbit NIC so it is usually
 waiting to squeeze out the 256kbits of cable upload capacity.
 
 Oh right this is just upstream, DL is 4mbits cable into PC NIC or
 1gigbit
 (assume 60% saturation) so there is  ample PC NIC BW for this.
 
 ...
 
 So for 256kbits/sec = 256,000 bits/sec,
 
 (256,000 bits/sec) / ((1024 + 16)bytes x 8bits/ (message bytes)) =
 30.769
 messages / sec.
 
 So 30.769 messages/sec - 10% = 27.692 messages /sec.
 
 
 About 27.692 message per sec for the 1024 byte message upload stream.
 
 Download = 16x UL = 443.072 messages/sec
 
 My calculation look right?
 
 Note: some Comcast cable connections allow as much as 1.4mbits upload.
 UL is
 always way less than DL (dependant on protocol). Other cable companies
 are
 similar depends on the company and geographic region...
 
 
 John
 
 
  -Original Message-
  From: Ed Porter [mailto:[EMAIL PROTECTED]
  Sent: Thursday, November 29, 2007 6:50 PM
  To: agi@v2.listbox.com
  Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI
  research]
 
  John,
 
  Somebody (I think it was David Hart) told me there is a shareware
  distributed web crawler already available, but I don't know the
 details,
  such as how good or fast it is.
 
  How fast could P2P communication be done on one PC, on average both
  sending
  upstream and receiving downstream from servers with fat pipes?
 Roughly
  how
  many msgs a second for cable connected PC's, say at 128byte and
  1024byte,
  and 10K byte message sizes?
 
  Decent guestimates on such numbers would help me think about what sort
  of
  interesting distributed NL learning tasks could be done with by AGI-
 at-
  Home
  network. (of course once it showed any promise Google would start
 doing
  it a
  thousand times faster, but at least it would be open source).
 
  Ed Porter
 
 
  -Original Message-
  From: John G. Rose [mailto:[EMAIL PROTECTED]
  Sent: Thursday, November 29, 2007 8:31 PM
  To: agi@v2.listbox.com
  Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI
  research]
 
  Ed,
 
  That is the http protocol, it is a client server request/response
  communication. Your browser asked for the contents at
  http://www.nytimes.com. The NY Times server(s) dumped the response
  stream
  data to your external IP address. You probably have a NAT'd cable
  address
  and NAT'ted again by your local router (if you have one). This
  communication
  is mainly one way - except for your original few bytes of http
 request.
  For
  a full ack-nack real-time dynamically addressed protocol there is more
  involved but say OpenCog could be setup to act as an http server and
 you
  could have a http client (browser or whatever) for simplicity in
  communications. Http is very firewall friendly since it is universally
  used
  on the internet.
 
  A distributed web crawler is a stretch though the communications
 is
  more
  complicated.
 
  John
 
   -Original Message-
   From: Ed Porter [mailto:[EMAIL PROTECTED]
   Sent: Thursday, November 29, 2007 6:13 PM
   To: agi@v2.listbox.com
   Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI
   research

RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-11-30 Thread John G. Rose
Ed,

That is probably a good rough estimate. There are more headers for the more
frequently transmitted smaller messages but a 16 byte header may be a bit
large.

Here is a speedtest link - 
http://www.speedtest.net/

My Comcast cable from Denver to NYC tests at 3537 kb/sec DL and 1588 kb/sec
UL much larger than the calculations 256kb/sec. The variance between tests
to the same location is quite large on the DL side but UL is relatively
stable. Saturating either DL or UL would impact the other.

You can get higher efficiencies if you use UDP transmission without message
serialization. Also you can do things like compression, only sending
changes, etc.. 

Distributed crawling with NL learning fits the scenario well since nodes
download at higher speeds, process the download into a smaller dataset, then
UL communicate the results to the server or share with peers. When one peer
shares with many peers you hit the UL limit fast though so it has to be
managed. And you have to figure out how the knowledge will be spread out -
server centric, shared, hybrid... As the knowledge size increases with peer
storage you have to come up with distributed indexes.

John


 -Original Message-
 From: Ed Porter [mailto:[EMAIL PROTECTED]
 Sent: Friday, November 30, 2007 12:06 PM
 To: agi@v2.listbox.com
 Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI
 research]
 
 John,
 
 Thanks.  I guess that means and AGI-at-home system could be both up-
 loading
 and receiving about 27 1K msgs/sec if it wasn't being used for anything
 else
 and the networks weren't backed up in its neck of the woods.
 
 Presumably the number for say 128Byte messages would be say, roughly, 8
 times faster (minus some percent for the latency associated with each
 message, so lets say roughly about 5 times faster or 135msg/sec.  Is
 that
 reasonable?
 
 So, it seems for example it would be quite possible to do
 estimation/maximilation type NL learning in a distributed manner with a
 lot
 of cable-box connected PC's and a distributed web crawler.
 
 Ed Porter
 
 -Original Message-
 From: John G. Rose [mailto:[EMAIL PROTECTED]
 Sent: Friday, November 30, 2007 12:33 PM
 To: agi@v2.listbox.com
 Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI
 research]
 
 Hi Ed,
 
 If the peer is not running other apps utilizing the network it could do
 the
 same. Typically a peer first needs to locate other peers. There may be
 servers involved but these are just for the few bytes transmitted for
 public
 IP address discovery as many(or most) peers reside hidden behind NATs.
 DNS
 names also require lookups but these are just for doing the initial
 match of
 hostname to IP address, if DNS is used at all.
 
 We're just talking basic P2P, one peer talking to one other peer,
 nothing
 complicated. As you can imagine P2P can take on many flavors as the
 number
 of peers increases.
 
 John
 
  -Original Message-
  From: Ed Porter [mailto:[EMAIL PROTECTED]
  Sent: Friday, November 30, 2007 10:10 AM
  To: agi@v2.listbox.com
  Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI
  research]
 
  John,
 
  Thanks.
 
  Can P2P transmission match the same roughly 27 1Kmsg/sec rate as the
  client
  to server upload you discribed?
 
  Ed Porter
 
  -Original Message-
  From: John G. Rose [mailto:[EMAIL PROTECTED]
  Sent: Thursday, November 29, 2007 11:40 PM
  To: agi@v2.listbox.com
  Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI
  research]
 
  OK for a guestimate take a half-way decent cable connection say
 Comcast
  on a
  good day with DL of 4mbits max and UL of 256kbits max with an
  undiscriminated protocol, an unknown TCP based protocol, talking to a
  fat-pipe, low latency server. Assume say 16 byte message header
 wrappers
  for
  all of your 128, 1024 and 10k byte message sizes.
 
  So upload is 256kbits, go ahead and saturate it fully with either of
  your
  128+16bytes, 1024+16bytes, and 10k+16bytes packet streams. Using TCP
 for
  reliability and assume some overhead say subtract 10% from the
 saturated
  value, retransmits, latency.
 
  What are we left with? Assume the PC has 1gigbit NIC so it is usually
  waiting to squeeze out the 256kbits of cable upload capacity.
 
  Oh right this is just upstream, DL is 4mbits cable into PC NIC or
  1gigbit
  (assume 60% saturation) so there is  ample PC NIC BW for this.
 
  ...
 
  So for 256kbits/sec = 256,000 bits/sec,
 
  (256,000 bits/sec) / ((1024 + 16)bytes x 8bits/ (message bytes)) =
  30.769
  messages / sec.
 
  So 30.769 messages/sec - 10% = 27.692 messages /sec.
 
 
  About 27.692 message per sec for the 1024 byte message upload stream.
 
  Download = 16x UL = 443.072 messages/sec
 
  My calculation look right?
 
  Note: some Comcast cable connections allow as much as 1.4mbits upload.
  UL is
  always way less than DL (dependant on protocol). Other cable companies
  are
  similar depends on the company and geographic region

Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

2007-11-30 Thread Mike Tintner

RL:However, I have previously written a good deal about the design of
different types of motivation system, and my understanding of the likely
situation is that by the time we had gotten the AGI working, its
motivations would have been arranged in such a way that it would *want*
to be extremely cooperative.

You do keep saying this. An autonomous mobile agent that did not have 
fundamentally conflicting emotions about each and every activity and part of 
the world,  would not succeed and survive. An AGI that trusted and 
cooperated with every human would not succeed and survive. Conflict is 
essential in a world fraught with risks, where time and effort can be 
wasted, essential needs can be neglected,  and life and limb are under more 
or less continuous threat. Conflict is as fundamental and essential to 
living creatures and any emotional system as gravity is to the physical 
world. (But I can't recall any mention of it in your writings about 
emotions).


No one wants to be extremely cooperative with anybody. Everyone wants and 
needs a balance of give-and-take. (And right away, an agent's interests and 
emotions of giving must necessarily conflict with their emotions of taking). 
Anything approaching a perfect balance of interests between extremely 
complex creatures/ psychoeconomies with extremely complex interests, is 
quite impossible - hence the simply massive literature dealing with the 
massive reality of relationship problems. And all living creatures have 
them.


Obviously, living creatures can have highly cooperative and smooth 
relationships -but they tend to be in the small minority. Ditto 
relationships between humans and pets. And there is no reason to think any 
different odds would apply to artificial and living creatures. (Equally, 
extremely uncooperative, aggressive relationships also tend to be in the 
minority, and similar odds should apply about that).


P.S. Perhaps the balance of cooperative/uncooperative relationsbips on this 
forum might give representative odds?! :)





-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244id_secret=70905478-e0c379


  1   2   >