--- "J. Andrew Rogers" <[EMAIL PROTECTED]> wrote:

> 
> On Apr 30, 2008, at 11:41 AM, Matt Mahoney wrote:
> > By distributing the problem across the internet.  AGI can be divided
> > into lots of specialized experts and a network for getting messages
> > to the right experts.  http://www.mattmahoney.net/agi.html
> 
> 
> There are a few problems with your model that need to be fixed before
> it is legitimately viable, though you do acknowledge some of them in
> the paper:
>
> 1.)  The protocol design is naive and will not scale up to the level
> you think it will, simplifying away by assumption topology
> characteristics where deviations from the assumption will have a
> major impact.  There is no general, computable solution to the
> underlying issues (neither in literature nor in unpublished research
> that I know of), and you gloss over or do not consider problems that
> would have a pathological expression if you actually tried to build
> it.  This is an important and active area of mathematics research in a
> couple of different fields.

Which protocol are you referring to, the one described on my web page
or the abstract one described in my thesis?  In the abstract one I
described a network of n identical (but unreliable) peers, each
connected to c = O(log n) peers, using a vector space model with
messages uniformly distributed in d dimensions.  In simulations it
scales to large n but does poorly when d > c.  This would seem to
preclude text indexing where d ~ 10^5.  However, the simulation is
worst case.  In practice, text tends to cluster in a vector space,
effectively reducing the number of dimensions to a few hundred.  Also,
I expect computing resources to be distributed unevenly; a few big
nodes with large c and lots of smaller ones with small c.  An example
would be connecting Google as a peer with c ~ 10^9.
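
To make the abstract model concrete, here is a minimal toy sketch in
Python.  It is not the simulation from the thesis.  It assumes peers sit
at uniform random points in a d-dimensional unit cube, that each peer
picks c = ceil(log2 n) neighbors at random, and that a message is
forwarded greedily to whichever neighbor is closest to the message's
vector.  The names build_network, dist2 and greedy_route are
hypothetical, and the random neighbor choice is an assumption; the
actual protocol may select neighbors differently.

```python
import math
import random


def build_network(n, d, seed=0):
    """Place n >= 2 peers at uniform random points in [0, 1)^d.

    Each peer gets c = ceil(log2 n) neighbors chosen at random
    (an assumption made for this sketch).
    """
    rng = random.Random(seed)
    c = max(1, math.ceil(math.log2(n)))
    positions = [[rng.random() for _ in range(d)] for _ in range(n)]
    neighbors = [
        rng.sample([j for j in range(n) if j != i], c) for i in range(n)
    ]
    return positions, neighbors, c


def dist2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def greedy_route(positions, neighbors, start, target, max_hops=1000):
    """Forward a message toward `target`, one hop at a time.

    At each hop the message moves to the neighbor closest to the
    target.  It stops when no neighbor is strictly closer than the
    current peer.  Returns (final_peer, hops).
    """
    current, hops = start, 0
    while hops < max_hops:
        best = min(neighbors[current],
                   key=lambda j: dist2(positions[j], target))
        if dist2(positions[best], target) >= dist2(positions[current], target):
            break  # local minimum: no neighbor improves on this peer
        current = best
        hops += 1
    return current, hops


if __name__ == "__main__":
    positions, neighbors, c = build_network(n=1024, d=3)
    final, hops = greedy_route(positions, neighbors, 0, [0.5, 0.5, 0.5])
    print("c =", c, "final peer =", final, "hops =", hops)
```

Raising d while holding c fixed in a toy like this is one way to see
how often greedy forwarding gets stuck at a peer with no closer
neighbor.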

Which assumptions, if violated, would break it?


> 2.)  There is nothing in published literature that will do the kind
> of indexing you want to do in the spatial domain, but it is possible
> in theory.

Yes, that was my main challenge.  My motivation in 1997 was the lack of
decentralized search capability for P2P networks.  Then I forgot about
it while I investigated language modeling for AI and started my
dissertation on it (and then changed the topic in order to get
funded).  Only last year did I see the application of distributed
indexing to AGI.

> For your purposes in the broadest sense, things like kD-trees  
> will drop dead for pretty trivial systems, never mind for something  
> ambitious.  On the other hand, generalized distribution with O(n)  
> storage complexity was solved last year which may or may not address 
> your issues.

Storage is O(n) using an organizational tree, but that would not be
robust.  I think the O(log n) factor is not a big penalty.  You want to
have more pointers per peer as the network grows, and you want to have
more cached and backup copies of messages.
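
As a rough illustration of that penalty, the sketch below compares the
total number of pointers in a tree (n - 1) with a network where every
peer keeps ceil(log2 n) pointers.  The base-2 logarithm and the
constant factor of 1 are assumptions chosen for illustration, and
pointer_counts is a hypothetical helper.

```python
import math


def pointer_counts(n):
    """Return (tree_total, per_peer, mesh_total) for n >= 2 peers.

    tree_total: pointers in a tree, one per non-root node.
    per_peer:   ceil(log2 n) pointers held by each peer (assumed).
    mesh_total: n * per_peer pointers across the whole network.
    """
    tree_total = n - 1
    per_peer = math.ceil(math.log2(n))
    mesh_total = n * per_peer
    return tree_total, per_peer, mesh_total


if __name__ == "__main__":
    for n in (10**3, 10**6, 10**9):
        tree_total, per_peer, mesh_total = pointer_counts(n)
        print(n, tree_total, per_peer, mesh_total)
```

Under these assumptions, going from a thousand peers to a billion only
triples the pointers each peer has to hold.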

Anyway I agree the protocol design needs more work before it is ready
to deploy.  I expect there will be many problems we didn't anticipate. 
I am in no hurry.


-- Matt Mahoney, [EMAIL PROTECTED]

