Ed,

Building up parse trees and word sense models would, let's say, be a first step. Then say that after a while this was accomplished and running on some peers. What would the next theoretical step be?

Also, what would you try to accomplish if there were more bandwidth and more computing power? The reason I ask is that a public peer network can be constructed in many ways, and a subset of nodes can be higher bandwidth (10, 20, 30+ Mbit/s), with some legs very high, approaching 400 Mbit/s. Computing power doesn't get that high except for a small subset where you have multiprocessor/multicore servers, but these are rare. Also, even with the basic lower-end, lower-quality nodes, including DSL, etc., the computational resource topology can be molded and optimized for particular computational goal structures.

John

> -----Original Message-----
> From: Ed Porter [mailto:[EMAIL PROTECTED]
> Sent: Saturday, December 01, 2007 6:41 PM
> To: [email protected]
> Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
>
> John,
>
> I tested Exeter, NH to LA at 5371 kb/s download and 362 kb/s upload. Strangely, my scores were slightly slower to NYC.
>
> Just throwing out ideas: for example, AGI-at-home PCs in the net could crawl the web looking for reasonable NL text and use current NL tools to guess parse and word sense. For each word in the text, send it and its surrounding text, part-of-speech labeling, surrounding parse tree, and word sense guess to another P2P node that specializes in that word in similar contexts, and separately to another P2P node that specializes in similar parse trees. These specialist nodes could then develop statistical models for word senses based on clustering or another technique. Then over time the statistical models would get sent down to the reading nodes, and this EM cycle could be constantly repeated.
>
> Of course, without the cross-sectional bandwidth of proper AGI hardware, you are going to be severely limited from doing a lot of the things you would really like to be able to do. But I think you should be able to come up with pretty good word sense models.
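The crawl / specialize / redistribute cycle Ed describes above can be sketched in a few lines. This is only a toy illustration under stated assumptions, not anyone's actual design: the sense inventory, the cue lexicon standing in for "current NL tools", the three-sentence corpus, and every function and class name here are hypothetical. Specialist peers are picked by a stable hash of the word, reading nodes produce soft sense guesses, and specialists turn the accumulated soft counts into a model that is sent back down for the next round. Treat it as EM-style self-training rather than a rigorous EM derivation.

```python
import zlib
from collections import defaultdict

# Hypothetical toy data: one ambiguous word, two senses, and a cue lexicon
# standing in for "current NL tools".
SENSES = {"bank": ["finance", "river"]}
CUES = {"money": "finance", "water": "river"}
CORPUS = [
    ("bank", ["money", "loan"]),
    ("bank", ["water", "shore"]),
    ("bank", ["loan", "deposit"]),
]


def specialist_for(word, num_nodes):
    """Stable hash so every reading node routes a word to the same peer."""
    return zlib.crc32(word.encode("utf-8")) % num_nodes


def tool_guess(word, context):
    """Stand-in for an existing word-sense tool: 0.8 to a cued sense."""
    senses = SENSES[word]
    cued = [CUES[c] for c in context if c in CUES]
    if not cued:
        return {s: 1.0 / len(senses) for s in senses}
    rest = 0.2 / (len(senses) - 1)
    return {s: (0.8 if s == cued[0] else rest) for s in senses}


def guess_senses(word, context, model):
    """Reading node: combine the tool's guess with the downloaded model."""
    senses = SENSES[word]
    scores = tool_guess(word, context)
    for ctx in context:
        for s in senses:
            # Unseen (word, context, sense) triples contribute no evidence.
            scores[s] *= model.get((word, ctx, s), 1.0 / len(senses))
    total = sum(scores.values())
    return {s: v / total for s, v in scores.items()}


class SpecialistNode:
    """Specialist peer: accumulate soft counts, emit P(sense | word, ctx)."""

    def __init__(self):
        self.counts = defaultdict(float)

    def observe(self, word, context, sense_probs):
        for ctx in context:
            for sense, p in sense_probs.items():
                self.counts[(word, ctx, sense)] += p

    def build_model(self):
        totals = defaultdict(float)
        for (word, ctx, _sense), c in self.counts.items():
            totals[(word, ctx)] += c
        return {k: c / totals[k[:2]] for k, c in self.counts.items()}


def run_rounds(corpus, num_nodes=4, rounds=3):
    """Repeat the guess -> aggregate -> redistribute cycle."""
    model = {}
    for _ in range(rounds):
        nodes = [SpecialistNode() for _ in range(num_nodes)]
        for word, context in corpus:
            probs = guess_senses(word, context, model)
            nodes[specialist_for(word, num_nodes)].observe(word, context, probs)
        model = {}
        for node in nodes:
            model.update(node.build_model())
    return model


model = run_rounds(CORPUS)
print(model[("bank", "deposit", "finance")])
```

In this toy run the word "deposit" never appears next to a cue word, yet after a few rounds the model leans toward the finance sense because "deposit" co-occurs with "loan", which in turn co-occurs with "money". In a real network each `observe` call would be an upload to a remote peer and each model refresh a download, which is where the messages-per-second limits discussed in this thread come in.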
>
> Ed Porter
>
> -----Original Message-----
> From: John G. Rose [mailto:[EMAIL PROTECTED]
> Sent: Friday, November 30, 2007 2:55 PM
> To: [email protected]
> Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
>
> Ed,
>
> That is probably a good rough estimate. There are more headers for the more frequently transmitted smaller messages, but a 16-byte header may be a bit large.
>
> Here is a speedtest link:
> http://www.speedtest.net/
>
> My Comcast cable from Denver to NYC tests at 3537 kb/sec DL and 1588 kb/sec UL, much larger than the 256 kb/sec used in the calculations. The variance between tests to the same location is quite large on the DL side, but UL is relatively stable. Saturating either DL or UL would impact the other.
>
> You can get higher efficiencies if you use UDP transmission without message serialization. You can also do things like compression, only sending changes, etc.
>
> Distributed crawling with NL learning fits the scenario well, since nodes download at higher speeds, process the download into a smaller dataset, then communicate the results over the UL to the server or share them with peers. When one peer shares with many peers you hit the UL limit fast, though, so it has to be managed. And you have to figure out how the knowledge will be spread out: server-centric, shared, hybrid... As the knowledge size increases with peer storage, you have to come up with distributed indexes.
>
> John
>
> > -----Original Message-----
> > From: Ed Porter [mailto:[EMAIL PROTECTED]
> > Sent: Friday, November 30, 2007 12:06 PM
> > To: [email protected]
> > Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
> >
> > John,
> >
> > Thanks. I guess that means an AGI-at-home system could be both uploading and receiving about 27 1K msgs/sec if it wasn't being used for anything else and the networks weren't backed up in its neck of the woods.
> >
> > Presumably the number for, say, 128-byte messages would be roughly 8 times faster (minus some percent for the latency associated with each message), so let's say roughly 5 times faster, or 135 msgs/sec. Is that reasonable?
> >
> > So it seems, for example, that it would be quite possible to do expectation/maximization-type NL learning in a distributed manner with a lot of cable-box-connected PCs and a distributed web crawler.
> >
> > Ed Porter
> >
> > -----Original Message-----
> > From: John G. Rose [mailto:[EMAIL PROTECTED]
> > Sent: Friday, November 30, 2007 12:33 PM
> > To: [email protected]
> > Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
> >
> > Hi Ed,
> >
> > If the peer is not running other apps utilizing the network, it could do the same. Typically a peer first needs to locate other peers. There may be servers involved, but these are just for the few bytes transmitted for public IP address discovery, as many (or most) peers reside hidden behind NATs. DNS names also require lookups, but these are just for doing the initial match of hostname to IP address, if DNS is used at all.
> >
> > We're just talking basic P2P, one peer talking to one other peer, nothing complicated. As you can imagine, P2P can take on many flavors as the number of peers increases.
> >
> > John
> >
> > > -----Original Message-----
> > > From: Ed Porter [mailto:[EMAIL PROTECTED]
> > > Sent: Friday, November 30, 2007 10:10 AM
> > > To: [email protected]
> > > Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
> > >
> > > John,
> > >
> > > Thanks.
> > >
> > > Can P2P transmission match the same roughly 27 1K msgs/sec rate as the client-to-server upload you described?
> > >
> > > Ed Porter
> > >
> > > -----Original Message-----
> > > From: John G. Rose [mailto:[EMAIL PROTECTED]
> > > Sent: Thursday, November 29, 2007 11:40 PM
> > > To: [email protected]
> > > Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
> > >
> > > OK, for a guesstimate, take a half-way decent cable connection, say Comcast on a good day, with DL of 4 Mbit/s max and UL of 256 kbit/s max with an undiscriminated protocol (an unknown TCP-based protocol) talking to a fat-pipe, low-latency server. Assume, say, 16-byte message header wrappers for all of your 128, 1024, and 10K byte message sizes.
> > >
> > > So upload is 256 kbit/s; go ahead and saturate it fully with any of your 128+16 byte, 1024+16 byte, and 10K+16 byte packet streams. Use TCP for reliability and assume some overhead for retransmits and latency; say, subtract 10% from the saturated value.
> > >
> > > What are we left with? Assume the PC has a 1-gigabit NIC, so it is usually waiting to squeeze out the 256 kbit/s of cable upload capacity.
> > >
> > > Oh right, this is just upstream. DL is 4 Mbit/s of cable into a PC NIC of 1 gigabit (assume 60% saturation), so there is ample PC NIC BW for this.
> > >
> > > ...
> > >
> > > So for 256 kbit/s = 256,000 bits/sec:
> > >
> > > (256,000 bits/sec) / ((1024 + 16) bytes/message x 8 bits/byte) = 30.769 messages/sec.
> > >
> > > So 30.769 messages/sec - 10% = 27.692 messages/sec.
> > >
> > > About 27.692 messages per sec for the 1024-byte message upload stream.
> > >
> > > Download = 16x UL = 443.072 messages/sec
> > >
> > > My calculation look right?
> > >
> > > Note: some Comcast cable connections allow as much as 1.4 Mbit/s upload. UL is always way less than DL (dependent on protocol). Other cable companies are similar; it depends on the company and geographic region...
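The estimate above is easy to rerun for other message sizes. The following is a minimal sketch of the same arithmetic; the function name and the reading of "10K" as 10,240 bytes are my assumptions, while the 16-byte header and the flat 10% overhead come from the message itself. It ignores per-message latency, so small-message figures are upper bounds.

```python
def msgs_per_sec(link_bits_per_sec, payload_bytes, header_bytes=16, overhead=0.10):
    """Messages/sec on a saturated link, minus a flat protocol overhead."""
    bits_per_msg = (payload_bytes + header_bytes) * 8
    return link_bits_per_sec / bits_per_msg * (1.0 - overhead)


UPLOAD_BPS = 256_000      # 256 kbit/s cable upload, as in the estimate above
DOWNLOAD_BPS = 4_000_000  # 4 Mbit/s cable download

# 10_240 assumes "10K" means 10 * 1024 bytes.
for size in (128, 1024, 10_240):
    up = msgs_per_sec(UPLOAD_BPS, size)
    down = msgs_per_sec(DOWNLOAD_BPS, size)
    print(f"{size:>6} bytes: {up:9.3f} up, {down:9.3f} down (msgs/sec)")
```

For 1024-byte messages this reproduces the 27.692 msgs/sec upload figure. The download figure comes out near 432.7 rather than 443.072, because 4,000,000 / 256,000 is 15.625 rather than a round 16. For 128-byte messages the same formula gives about 200 msgs/sec on upload before any per-message latency penalty is applied.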
> > >
> > > John
> > >
> > > > -----Original Message-----
> > > > From: Ed Porter [mailto:[EMAIL PROTECTED]
> > > > Sent: Thursday, November 29, 2007 6:50 PM
> > > > To: [email protected]
> > > > Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
> > > >
> > > > John,
> > > >
> > > > Somebody (I think it was David Hart) told me there is a shareware distributed web crawler already available, but I don't know the details, such as how good or fast it is.
> > > >
> > > > How fast could P2P communication be done on one PC, on average, both sending upstream and receiving downstream from servers with fat pipes? Roughly how many msgs a second for cable-connected PCs, say at 128-byte, 1024-byte, and 10K-byte message sizes?
> > > >
> > > > Decent guesstimates on such numbers would help me think about what sort of interesting distributed NL learning tasks could be done by an AGI-at-Home network. (Of course, once it showed any promise Google would start doing it a thousand times faster, but at least it would be open source.)
> > > >
> > > > Ed Porter
> > > >
> > > > -----Original Message-----
> > > > From: John G. Rose [mailto:[EMAIL PROTECTED]
> > > > Sent: Thursday, November 29, 2007 8:31 PM
> > > > To: [email protected]
> > > > Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
> > > >
> > > > Ed,
> > > >
> > > > That is the HTTP protocol; it is a client-server request/response communication. Your browser asked for the contents at http://www.nytimes.com. The NY Times server(s) dumped the response stream data to your external IP address. You probably have a NAT'd cable address that is NAT'd again by your local router (if you have one).
> > > > This communication is mainly one way, except for your original few bytes of HTTP request. For a full ack-nack, real-time, dynamically addressed protocol there is more involved, but say OpenCog could be set up to act as an HTTP server, and you could have an HTTP client (browser or whatever) for simplicity in communications. HTTP is very firewall friendly since it is universally used on the internet.
> > > >
> > > > A distributed web crawler is a stretch though... the communication is more complicated.
> > > >
> > > > John
> > > >
> > > > > -----Original Message-----
> > > > > From: Ed Porter [mailto:[EMAIL PROTECTED]
> > > > > Sent: Thursday, November 29, 2007 6:13 PM
> > > > > To: [email protected]
> > > > > Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
> > > > >
> > > > > John,
> > > > >
> > > > > Thank you for the info.
> > > > >
> > > > > I just did a rough count of all the "IMG SRC="http://" in the source of the NYTimes home page, which downloaded to my cable-modem-connected computer in about 3 seconds. I counted roughly 50 occurrences of that string. I assume there are many other downloaded files, such as for layout info. Let's guess a total of at least 100 files that have to be requested, downloaded, and displayed. That would be about 33 per second. So what could one do with a system that could do on average about 20 accesses a second at a sustained rate, if a user was leaving it on at night as part of an OpenCog-at-Home project?
> > > > >
> > > > > It seems to me that that would be enough for some interesting large-corpus NL work in conjunction with a distributed web crawler.
> > > > >
> > > > > Ed Porter
> > > > >
> > > > > -----Original Message-----
> > > > > From: John G. Rose [mailto:[EMAIL PROTECTED]
> > > > > Sent: Thursday, November 29, 2007 7:27 PM
> > > > > To: [email protected]
> > > > > Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
> > > > >
> > > > > > From: Ed Porter [mailto:[EMAIL PROTECTED]
> > > > > > As I have said many times before, to have brain-level AGI I believe you need within several orders of magnitude the representational, computational, and interconnect capability of the human mind.
> > > > > >
> > > > > > If you had 1 million PC bots on the web, the representational and computational power would be there. But what sort of interconnect would you have? What is the average cable-box-connected computer's upload bandwidth?
> > > > > >
> > > > > > Is it about 1 Mbit/sec? If so, that would be a total bandwidth of 1 Tbit/sec. But presumably only a small percent of that total 1 Tbit/sec could be effectively used, say 100 Gbit/sec. That's way below brain level, but it is high enough to do valuable AGI research.
> > > > > >
> > > > > > But would even 10% of this total 1 Tbit/sec bandwidth be practically available?
> > > > > >
> > > > > > How many messages a second can a PC upload at, say, 100K, 10K, 1K, and 128 bytes each? Does anybody know?
> > > > >
> > > > > I've gone through all this while being in VOIP R&D.
> > > > > MANY different connections at many different bandwidths, latencies, QOS; it's dirty across the board. Communication between different points is very non-homogeneous. There are "deep" connections and "surface" ones, alluding to the deep web and surface web, though network topology is somewhat independent of permissions. The physical infrastructure of the internet allows for certain extremely high bandwidth, low latency connections, whereas the edge is typically lower bandwidth and higher latency, but it does depend on the hop graph, time of day, etc.
> > > > >
> > > > > Messages per sec depends on many factors: network topology starting from the PC bus, to NIC, to LAN switch and router, to other routers to ISPs, between ISPs, back in the other end, etc. A cable box usually does anywhere from 64 kbit/s to 1.4 Mbit/s upload depending on things such as provider, protocol, and hop distance; it totally depends... usually a test is required.
> > > > >
> > > > > > On the net, can one bot directly talk to another bot, or does the communication have to go through some sort of server (other than those provided gratis on the web, such as DNS servers)?
> > > > > >
> > > > > > If two bots send messages to a third bot at the same time, does the net infrastructure hold the second of the conflicting messages until the first has been received, or what?
> > > > >
> > > > > This is called protocol and there are many (see RFCs and ITU for standards), but better ones are custom made. There are connectionless and connection-oriented protocols, broadcast, multicast, C/S, P2P, etc.
> > > > > Existing protocol standards can be extended, piggybacked or parasited.
> > > > >
> > > > > Bots can talk direct or go through a server, using or not using DNS. It also depends on topology: is one point (or both) behind a NAT?
> > > > >
> > > > > Message simultaneity handling is dependent on protocol.
> > > > >
> > > > > > To me the big hurdle to achieving the equivalent of a SETI-at-home AGI is getting the bandwidth necessary to allow the interactive computing of large amounts of knowledge. If we could solve that problem, then it should be pretty easy to get some great tests going, such as with something like OpenCog.
> > > > >
> > > > > Like I was saying before, it is better to design based on what you have to work with than to try to do something like fit the human brain design on the "unbounded nondeterministic" internet grid. I'm not sure, though, what the architecture of OpenCog looks like...
> > > > >
> > > > > John
> > > > >
> > > > > -----
> > > > > This list is sponsored by AGIRI: http://www.agiri.org/email
> > > > > To unsubscribe or change your options, please go to:
> > > > > http://v2.listbox.com/member/?&

-----
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244&id_secret=71372336-9364b9
