John, 

I am sure there is interesting stuff that can be done.  It would be
interesting just to see what sort of an agi could be made on a PC.

I would be interested in your ideas for how to make a powerful AGI without a
vast amount of interconnect.  The major schemes I know about for reducing
interconnect involve allocating what interconnect you have to the links with
the highest probability or importance, varying those measures of probability
and importance in a context-specific way, and being guided by prior similar
experiences.
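
To make the first of those schemes concrete, here is a rough Python sketch of
allocating a fixed per-node message budget to the links with the highest
probability times importance, with the weights shifted by the current context.
The link list, the scoring function, and the budget are made-up illustrations,
not taken from any particular design:

import heapq

# Illustrative only: each outgoing link has a base probability and importance,
# and the current context can boost or dampen how much a target matters.
links = [
    # (target_node, base_probability, base_importance)
    ("pattern_A", 0.40, 0.90),
    ("pattern_B", 0.05, 0.20),
    ("pattern_C", 0.25, 0.60),
    ("pattern_D", 0.10, 0.95),
]

context_boost = {"pattern_D": 2.0}  # e.g., prior similar experience says D matters now

def score(link):
    target, prob, imp = link
    return prob * imp * context_boost.get(target, 1.0)

MESSAGE_BUDGET = 2  # messages this node can afford to send this cycle

# Spend the limited interconnect only on the highest-scoring links.
for target, prob, imp in heapq.nlargest(MESSAGE_BUDGET, links, key=score):
    print("send activation to", target, "score =", round(score((target, prob, imp)), 3))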

Ed Porter

-----Original Message-----
From: John G. Rose [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, December 04, 2007 1:42 AM
To: agi@v2.listbox.com
Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

Ed,

Well, it'd be nice having a supercomputer, but P2P is a poor man's
supercomputer and beggars can't be choosy.

Honestly, the type of AGI that I have been formulating in my mind is not
closely related to simulating neural activity by orchestrating partial and
mass activations at low frequencies, and I had been avoiding those contagious
cog sci memes on purpose. But your exposé on the subject is quite interesting,
and I wasn't aware that this is how things have been done.

But getting more than a few thousand P2P nodes is difficult. Going from 10K
to 20K nodes and up, it gets progressively harder, to the point of being
prohibitively expensive, if not impossible without extreme luck.  There are
ways to do it, but according to your calculations the supercomputer may be
the wiser choice, since going out and scrounging up funding for that would be
easier.

Still, though (besides working on my group-theory-heavy design), exploring
how the activation model you are talking about could be crafted and chiseled
to fit the P2P network could be fruitful. I feel that through a number of
up-front and unfortunately complicated design changes/adaptations, the
activation orchestrations could be improved, bringing down the message rate
requirements and reducing activation requirements, depths, and frequencies,
through a sort of self-organizational molding of the design to the
computational resource topology.

You do indicate some dynamic resource adaptation and things like "intelligent
inference guiding schemes" in your description, but it doesn't seem like it
melds enough into the resource space. But having a design be less static
risks excessive complications...

A major problem, though, with P2P and the activation methodology is that
there is so much variance in latency and availability that serious
synchronicity/simultaneity issues would exist, and even more messaging might
be required. Since there are so many variables in public P2P, empirical data
would also be necessary to get a handle on feasibility.
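
Just to illustrate the synchronicity worry with a toy simulation (the latency
distribution below is pulled out of the air, not measured from any real P2P
network): if one activation step has to wait on replies from N peers whose
latencies vary widely, the step is gated by the slowest peer, so the usable
number of synchronized steps per second collapses as N grows.

import random

random.seed(1)

def simulate_step(n_peers):
    # Hypothetical public-P2P latencies in ms: mostly ~100 ms, with a long tail.
    latencies = [20 + random.expovariate(1 / 100.0) for _ in range(n_peers)]
    return max(latencies)  # the step completes only when the slowest peer replies

for n in (10, 100, 1000):
    steps = [simulate_step(n) for _ in range(200)]
    avg_ms = sum(steps) / len(steps)
    print(f"{n:5d} peers -> avg step {avg_ms:6.0f} ms, ~{1000 / avg_ms:4.1f} synced steps/sec")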

I still feel strongly that the way to do AGI P2P (with public P2P as the
core, not as an augmentation) is to understand the grid and build the AGI
design based on that, and on what it will be in a few years, instead of
taking a design and morphing it to the resource space. That said, only a
finite set of designs will work, so the number of choices is few.

John


                _____________________________________________
                From: Ed Porter [mailto:[EMAIL PROTECTED] 
                Sent: Monday, December 03, 2007 6:17 PM
                To: agi@v2.listbox.com
                Subject: RE: Hacker intelligence level [WAS Re: [agi]
Funding AGI research]
                

                John, 

                You raised some good points.  The problem is that the total
number of messages/sec that can be received is relatively small.  It is not
as if you are dealing with a multidimensional grid or toroidal net in which
spreading tree activation can take advantage of the fact that the total
parallel bandwidth for regional messaging can be much greater than the
cross-sectional bandwidth.  

                In a system where each node is a server-class node with
multiple processors and 32 or 64 GBytes of RAM, much of which is allocable to
representation, sending messages to local indices on each machine could
fairly efficiently activate all occurrences of something in a 32 to 64 TByte
knowledge base with a max of 1K internode messages, if there were only 1K
nodes.

                But in a PC-based P2P system the ratio of nodes to
representation space is high and the total number of 128-byte messages/sec
that can be received is limited to about 100, so neither method of trying
to increase the number of patterns that can be activated with the given
interconnect of the network buys you as much.
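
                To make the local-index point concrete, here is a toy sketch
(the class and method names are mine, purely for illustration): each node
keeps an inverted index from pattern to its local occurrences, so one
internode message per node activates every local occurrence, and the
internode cost of a global activation is bounded by the node count rather
than by the number of occurrences.

from collections import defaultdict

class Node:
    """One machine holding a slice of the knowledge base in RAM."""
    def __init__(self):
        self.local_index = defaultdict(list)  # pattern -> local occurrence ids

    def add_occurrence(self, pattern, occurrence_id):
        self.local_index[pattern].append(occurrence_id)

    def activate(self, pattern):
        # One incoming internode message fans out locally at RAM speed.
        return self.local_index.get(pattern, [])

def global_activation(nodes, pattern):
    internode_messages = 0
    activated = 0
    for node in nodes:  # at most one internode message per node
        internode_messages += 1
        activated += len(node.activate(pattern))
    return internode_messages, activated

# Few big server nodes vs. many small PC nodes holding the same toy pattern.
for n_nodes in (1_000, 10_000):
    nodes = [Node() for _ in range(n_nodes)]
    for i in range(0, n_nodes, 3):  # sprinkle occurrences across a third of the nodes
        nodes[i].add_occurrence("wet_floor", i)
    msgs, hits = global_activation(nodes, "wet_floor")
    print(f"{n_nodes:6d} nodes: {msgs:6d} internode messages, {hits} occurrences hit")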

                Human-level context sensitivity arises because a large
number of things that can depend on a large number of things in the current
context are made aware of those dependencies.  This takes a lot of
messaging, and I don't see how a P2P system where each node can only receive
about 100 relatively short messages a second is going to make this possible
unless you had a huge number of nodes.  As Richard Loosemore said in his Mon
12/3/2007 12:57 PM post:

                                "It turns out that within an extremely short
time of the first word being seen, a very large number of other words have
their activations raised significantly.  Now, whichever way you interpret
these (so-called "priming") results, one thing is not in doubt:  there is
massively parallel activation of lexical units going on during language
processing."

                With special software, a $10M supercomputer cluster
with 1K nodes, 32 TBytes of RAM, and a dual-ported 20Gbit InfiniBand
interconnect can send about 1 to 5 billion 128-byte messages/sec.  Since
there are only 1K nodes, that means a global activation would require a max
of 1K internode messages.

                If you had 10K P2P nodes, each with 2 GB of RAM dedicated to
representation, each of which could receive only about 100 128-byte
messages/sec, you would have a total message limit of 1M 128-byte msgs/sec
addressing 20 TBytes of RAM.  A global activation would require 10K messages,
meaning only 100 could be done a second.  Of course, with such fine-grained
nodes many activations of all occurrences of a given pattern would not
activate all 10K nodes, but many would probably activate at least a third of
them, meaning only 100 to 300 system-wide activations of the occurrences of
a pattern could be done a second, which is about 10K times slower than the
above-mentioned supercomputer.

                If you had 1M P2P nodes, each with 2 GB of RAM dedicated to
representation, each of which could receive only about 100 128-byte
messages/sec, you would have a total message limit of 100M 128-byte msgs/sec
addressing 2000 TBytes of RAM.  This is 1/10 to 1/50 the number of similarly
sized messages on the supercomputer.  

                But because of the much larger number of separate machines,
a larger number of messages is required for a spreading activation from a
pattern to reach all of the other patterns in which it occurs.  A global
activation talking to all compute nodes would require 1M messages, meaning
only 100 could be done a second.  
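
                Putting the last few paragraphs' figures into one
back-of-envelope (same assumptions as above: ~100 incoming 128-byte
messages/sec per PC, one message per node for a global activation, and the
low end of the 1 to 5 billion messages/sec estimate for the cluster):

# (name, node_count, aggregate 128-byte messages/sec the network can absorb)
configs = [
    ("1K-node supercomputer", 1_000,     1_000_000_000),
    ("10K-node P2P",          10_000,    10_000 * 100),
    ("1M-node P2P",           1_000_000, 1_000_000 * 100),
]

for name, nodes, msgs_per_sec in configs:
    msgs_per_global_activation = nodes  # one message per node
    rate = msgs_per_sec / msgs_per_global_activation
    print(f"{name:22s}: {rate:12,.0f} global activations/sec")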

                But due to the fact that much of world knowledge is sparsely
connected, a complete activation of all of a pattern's occurrences might
activate anywhere roughly between 1 and 10M patterns.  We don't know the
average level of interconnection appropriate for world knowledge, but let us
assume an average of 20K occurrences for the average pattern.  (It should be
remembered, however, that activations are often made not only from a given
pattern or concept node, but also from a set of similar pattern nodes.)  Let
us assume on average each message to a P2P node activates 2 related pattern
nodes in its RAM, so only 10K messages on average are needed for each
activation; that would allow 10K full activations to be spread per second.
This figure of 20K is just a guessed average of the total number of
occurrences for the average pattern in the world knowledge.

                So the 1M node P2P network could only do full-occurrence
activations for 10K patterns/sec, which is 1/100th the amount that would be
allowed on the supercomputer mentioned above.  
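
                In code, using only the guessed figures above (20K
occurrences per average pattern, 2 pattern nodes activated per incoming
message):

OCCURRENCES_PER_PATTERN = 20_000  # guessed average, as above
PATTERNS_PER_MESSAGE = 2          # related pattern nodes activated per incoming message

msgs_per_full_activation_p2p = OCCURRENCES_PER_PATTERN // PATTERNS_PER_MESSAGE  # 10K

p2p_budget = 100_000_000                # 1M PCs x 100 msgs/sec
super_budget = 1_000_000_000            # low end for the 1K-node supercomputer
msgs_per_full_activation_super = 1_000  # bounded by the node count, not the occurrences

p2p_rate = p2p_budget / msgs_per_full_activation_p2p        # 10,000 full activations/sec
super_rate = super_budget / msgs_per_full_activation_super  # 1,000,000 full activations/sec
print(p2p_rate, super_rate, p2p_rate / super_rate)          # -> 10000.0 1000000.0 0.01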

                Fully activating 10K patterns a second might sound like a
lot, until you realize that spreading activation is typically done by
relaying such spreading activation through multiple levels of implication,
with the number of activated nodes growing somewhat exponentially at each
level unless one greatly prunes down the percent of links that get activated
at each successive radial level out in the search.  

                Say you have 10 pattern nodes a second that are added to
your context, and for each we send out full activations from it and 10
similar nodes.  That would use 100 of the 10K average full activations
(i.e., 20K pattern nodes reached, and 10K messages each), so now we have 2M
second-level partial activations.  Let's wildly guess the number of activated
second-level pattern nodes should be reduced to 200K, because there probably
is a lot of overlap in these activations.  These 200K 2nd-level patterns have
to share an amount of messaging equal to 9900 full average activations
(assumed above to be 10K messages each).  This only allows an average of 1/20
of a full activation, or 500 messages, from each of the 200K second-level
activations.  It would not leave any messages for the millions of third-level
activations that received such messages.
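
                Worked through explicitly, with the same guessed numbers as
in the paragraph above:

TOTAL_MSGS_PER_SEC = 100_000_000  # 1M PCs x 100 msgs/sec
MSGS_PER_FULL_ACT = 10_000        # messages in one full-occurrence activation
PATTERNS_PER_FULL = 20_000        # pattern nodes reached by one full activation

new_context_patterns = 10
similar_per_pattern = 10
first_level_acts = new_context_patterns * similar_per_pattern  # 100 full activations

second_level_raw = first_level_acts * PATTERNS_PER_FULL        # 2,000,000 partial activations
second_level_pruned = 200_000                                  # wild guess after overlap

msgs_left = TOTAL_MSGS_PER_SEC - first_level_acts * MSGS_PER_FULL_ACT  # 99,000,000
msgs_per_second_level = msgs_left / second_level_pruned                # ~495, i.e. ~1/20 of a full activation
print(first_level_acts, second_level_raw, msgs_left, msgs_per_second_level)
# ...and nothing is left over for the millions of third-level activations.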

                And this doesn't account for spreading activation from
similar nodes at each successive generation of an activation.  It also
doesn't allow for the fact that in a system with multiple changing
constraints on a node's activation level, a node's activation level can
change many times a second, and if the change is large enough, news of such
changes should be relayed via multi-level spreading activation multiple
times a second, not only for the new concepts activated each second, but
also for many of the concept nodes in the context that have been previously
activated in STM.  In Shastri's Shruti system, he assumes activations take
place at roughly gamma-wave frequencies (30-40Hz), that most slot-filler
concepts are repeated about 4 times a second, and that relationship concepts
send messages to each other multiple times a second to update the values of
the relationships they are connected to. 

                The 100M internode messages/sec in the 1M node P2P example
should be large enough to at least do interesting exploration in large AGI.
If one could greatly limit the combinatorial rate of growth of spreading
activation messaging, one might actually be able to do some impressive
multilevel implication.  At this time we just don't know how well implication
can be controlled and how much in the way of spreading activation messaging
will be required.

                The 1M node P2P example, which would allow 100M 128-byte
messages/sec, would allow roughly 10 activations a second, each activating 2K
other patterns, each activating 200 other patterns, each activating 20 other
patterns (the actual numbers could be higher because of the overlap of
activations).   You should be able to do some sort of inference with this,
particularly if one takes into account some of the features mentioned in my
recent prior email to Richard Loosemore.   If by indexes of indexes, John,
you mean a probabilistic hierarchy of activation patterns used to activate
each other, you might actually be able to do something impressive in
improving the efficiency of activation.   
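
                Checking that tapered fan-out against the 100M messages/sec
ceiling (ignoring the overlap among activations, which would only make the
effective numbers better):

roots_per_sec = 10
fanout = [2_000, 200, 20]  # patterns activated per pattern at each successive level

level_counts = [roots_per_sec]
for f in fanout:
    level_counts.append(level_counts[-1] * f)  # 10 -> 20K -> 4M -> 80M

# Roughly one message per activated pattern (about half that if each message
# activates 2 related patterns, as assumed earlier).
total_msgs = sum(level_counts[1:])
print(level_counts, total_msgs)  # ~84M messages/sec, under the 100M ceiling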

                In the Shruti example I copied in the picture I sent out
earlier today (and am copying again), the implication required activation
across 8 nodes from "fall" to "cleans".  This is a radial search, so it
requires an average of four hops from each of those two nodes.  You can see
the amount of inferencing provided by the 1M node P2P network would not
allow such 4-level-deep activation across any significant part of many large
pattern spaces at a rapid speed.  Even the above-mentioned supercomputer,
with several million activations a second, would not, but with intelligent
inference-guiding schemes it would at least have a much better shot.

                Of course, it is my belief that in this "John Fell in the
Hallway.  Tom had cleaned it. He was hurt." Shastri example shown in the
attached figure, we humans are actually likely to have at least one
individual pattern which includes slipping and falling on a wet floor that
had been cleaned, all in one pattern, requiring many fewer hops.  The fact
that we humans have a very large experiential knowledge base is one of the
things that makes search relatively efficient, because our large number of
relatively complex patterns greatly reduces the amount of search that is
required to solve many problems.

                We do not know the number and width of the spreading
activations that are necessary for human-level reasoning over world
knowledge.  Thus, we really don't know how much interconnect is needed, and
thus how large a P2P net would be needed for impressive AGI.  But I think it
would have to be larger than, say, 10K nodes.

                Ed Porter
                 << File: SHRUTI IMPLICATION-2.jpg >> 
                -----Original Message-----
                From: John G. Rose [mailto:[EMAIL PROTECTED] 
                Sent: Monday, December 03, 2007 12:37 PM
                To: agi@v2.listbox.com
                Subject: RE: Hacker intelligence level [WAS Re: [agi]
Funding AGI research]

                > From: Ed Porter [mailto:[EMAIL PROTECTED]
                > Once you build up good models for parsing and word sense,
                > then you read large amounts of text and start building up
                > model of the realities described and generalizations from
                > them.
                > 
                > Assuming this is a continuation of the discussion of an
                > AGI-at-home P2P system, you are going to be very limited by
                > the lack of bandwidth, particularly for attacking the high
                > dimensional problem of seeking to understand the meaning of
                > text, which often involve multiple levels of implication,
                > which would normally be accomplished by some sort of search
                > of a large semantic space, which is going to be difficult
                > with limited bandwidth.
                > 
                > But a large amount of text with appropriate parsing and word
                > sense labeling would still provide a valuable aid for web and
                > text search and for many forms of automatic learning.  And
                > the level of understanding that such a P2P system could
                > derive from reading huge amounts of text could be a valuable
                > initial source of one component of world knowledge for use
                > by AGI.


                I kind of see the small bandwidth between (most) individual
nodes as not a limiting factor, as sets of nodes act as temporary single
group entities.  IOW the BW between one set of 50 nodes and another set of
50 nodes is actually quite large, and individual nodes' data access would
depend on indexes of indexes to minimize their individual BW requirements.

                Does this not apply to your model?

                John


-----
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?&;
