Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
It shouldn't matter how a general ontology is used; it should be available to multiple different AI and AGI processes, to be generally useful. And the key thing about this usage is that it doesn't get its information from a single text, but extracts patterns from mass usage; reading a single passage is much more difficult. I have also used this in conjunction with the Google News feed, where many articles can be gathered in a short period on a single topic and reinforce the information.

James Ratcliff

Vladimir Nesov [EMAIL PROTECTED] wrote:

On Dec 13, 2007 12:09 AM, James Ratcliff wrote:

Mainly as a primer ontology / knowledge-representation data set for an AGI to work with. Having a number of facts known, without having to be typed in, about many frames and the connections between frames gives an AGI a good booster to start with. Taking a simple set of common words in a house (chair, table, sock, closet, etc.), a house AGI bot could get a feel for the objects it would expect to find in a house, where to look for, say, a sock, and the properties of a sock, without having that information typed in by a human user. That information would then be updated through experience, and with a human trainer working with an embodied (probably virtual) AGI.

Yes, that's how the story usually goes. But if you don't specify how the ontology will be used, why do you believe that it will be more useful than the original texts? Probably at the point where you're able to make use of an ontology you'd also be able to analyze texts directly (that is, if you aim that high; otherwise it's a different issue entirely).

-- Vladimir Nesov [EMAIL PROTECTED]

___
James Ratcliff - http://falazar.com
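James's point about extracting patterns from mass usage rather than from any single text can be sketched as a simple co-occurrence count over many short passages. This is only an illustrative toy (the sentences, the word list, and the counting scheme are all made up here), but it shows how repeated mentions across articles reinforce a pattern like sock-in-closet:

```python
from collections import Counter
from itertools import combinations

# Hypothetical mini-corpus standing in for many gathered news articles / novels.
sentences = [
    "the sock was in the closet",
    "she found a sock in the closet",
    "a chair stood by the table",
    "he put the sock in the drawer",
]

household = {"sock", "closet", "chair", "table", "drawer"}

# Count how often pairs of household words co-occur within a sentence.
pair_counts = Counter()
for s in sentences:
    words = set(s.split()) & household
    for a, b in combinations(sorted(words), 2):
        pair_counts[(a, b)] += 1

# (closet, sock) is reinforced by two independent mentions;
# pairs seen only once stay weak, as would noise from a single passage.
print(pair_counts[("closet", "sock")])  # 2
print(pair_counts[("chair", "table")])  # 1
```

No single sentence establishes the pattern; the aggregate count does, which is the "mass usage" idea in miniature.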
- This list is sponsored by AGIRI: http://www.agiri.org/email To unsubscribe or change your options, please go to: http://v2.listbox.com/member/?member_id=8660244&id_secret=75658037-88df5d
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Mainly as a primer ontology / knowledge-representation data set for an AGI to work with. Having a number of facts known, without having to be typed in, about many frames and the connections between frames gives an AGI a good booster to start with. Taking a simple set of common words in a house (chair, table, sock, closet, etc.), a house AGI bot could get a feel for the objects it would expect to find in a house, where to look for, say, a sock, and the properties of a sock, without having that information typed in by a human user. That information would then be updated through experience, and with a human trainer working with an embodied (probably virtual) AGI. The novels gave a really good data set that reinforced the extracted factoids, contained more common-sense world knowledge than other extraction projects using the Wall Street Journal or a subset of the web as a whole, and removed much of the junk data.

James

Vladimir Nesov [EMAIL PROTECTED] wrote:

On Dec 11, 2007 7:26 PM, James Ratcliff wrote:

Here's a basic abstract I did last year, I think: http://www.falazar.com/AI/AAAI05_Student_Abtract_James_Ratcliff.pdf I would like to work with others on a full-fledged representation system that could use these kinds of techniques. I hacked this together by myself, so I know a real team could put this kind of stuff to much better use.

James

Do you have any particular path in mind to put this kind of thing to work? Finding patterns is fine, and somewhat inevitable, but what are those ontologies good for, and why?

-- Vladimir Nesov [EMAIL PROTECTED]

___
James Ratcliff - http://falazar.com
RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])
I had been thinking about something along these lines, though not worded as you have in this message. What I would be most interested in at this point is a knowledge-gathering system along these lines, where the main AGI could be centralized/clustered or distributed, but where questions and information would be posed to the bot on each person's node and collected together. The system would remember the facts and domains each person has contributed, so any future unique questions could be posed to the knowledgeable expert users. This would allow a large amount of knowledge to be extracted in a distributed manner, keeping track of the quality of information gathered from each person as a trust metric, and many facts would be gathered and checked for truth. Mainly, the system should have the ability to ACTIVELY go out in search of an answer, chatting with known users to find and confirm any conflicting results.

For instance, it might randomly ask me Who is the highest-paid baseball player? and I would pass on that question; the system would then assign a lower score to any further baseball questions sent toward me, but based on my answering of other questions about computers and about Austin, TX, it would be more likely to ask me questions about those. And only I and a couple of other people here would get the questions about Austin, TX. Something along the lines of a higher-quality Yahoo Answers, with an active component and a central knowledge base. I think the knowledge base is one of the most important pieces of this, and I hope to start seeing more of people's ideas and implementations of KR DBs.

James Ratcliff

Matt Mahoney [EMAIL PROTECTED] wrote:

--- Jean-Paul Van Belle wrote: Hi Matt, Wonderful idea, now it will even show the typical human trait of lying... when I ask it do you still love me? most answers in its database will have Yes as an answer, but when I ask it what's my name? it'll call me John?
My proposed message posting service allows anyone to contribute to its knowledge base, just like Wikipedia, so it could certainly contain some false or useless information. However, the number of peers that keep a copy of a message will depend on the number of peers that accept it according to the peers' policies, which are set individually by their owners. The network provides an incentive for peers to produce useful information so that other peers will accept it. Thus, useful and truthful information is more likely to be propagated.

-- Matt Mahoney, [EMAIL PROTECTED]

___
James Ratcliff - http://falazar.com
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
On Dec 13, 2007 12:09 AM, James Ratcliff [EMAIL PROTECTED] wrote:

Mainly as a primer ontology / knowledge-representation data set for an AGI to work with. Having a number of facts known, without having to be typed in, about many frames and the connections between frames gives an AGI a good booster to start with. Taking a simple set of common words in a house (chair, table, sock, closet, etc.), a house AGI bot could get a feel for the objects it would expect to find in a house, where to look for, say, a sock, and the properties of a sock, without having that information typed in by a human user. That information would then be updated through experience, and with a human trainer working with an embodied (probably virtual) AGI.

Yes, that's how the story usually goes. But if you don't specify how the ontology will be used, why do you believe that it will be more useful than the original texts? Probably at the point where you're able to make use of an ontology you'd also be able to analyze texts directly (that is, if you aim that high; otherwise it's a different issue entirely).

-- Vladimir Nesov [EMAIL PROTECTED]
Re: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])
On 12/12/07, James Ratcliff [EMAIL PROTECTED] wrote:

This would allow a large amount of knowledge to be extracted in a distributed manner, keeping track of the quality of information gathered from each person as a trust metric, and many facts would be gathered and checked for truth. Something along the lines of a higher-quality Yahoo Answers, with an active component and a central knowledge base. I think the knowledge base is one of the most important pieces of this, and I hope to start seeing more of people's ideas and implementations of KR DBs.

I believe that where you said central knowledge base you meant a distributed KB, right? The idea of keeping a local KB at each node spreads the storage/bandwidth burden across every node in the network. Your trust metrics are how nodes conditionally connect for per-topic fact-checking. I have already volunteered my free CPU/bandwidth to a prototype of this model. Of course, I'd like to be a collaborator on the mechanisms involved in addition to a user of the grid. Even if it starts out as only a toy or hobby, it would still teach us a great deal.
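The per-topic trust metric discussed in this thread could be sketched as follows. This is a minimal, hypothetical scheme (the names, the 0.1 step, and the 0.0-1.0 clamping are all assumptions, not anything specified by James or Matt): a pass lowers a user's score for that topic, a good answer raises it, and questions are routed to the highest-scoring users.

```python
# (user, topic) -> trust score in [0, 1]; unknown pairs default to neutral 0.5.
trust = {}

def update(user, topic, answered_well):
    """Nudge a user's per-topic trust up or down after an interaction."""
    key = (user, topic)
    score = trust.get(key, 0.5)
    delta = 0.1 if answered_well else -0.1
    trust[key] = min(1.0, max(0.0, score + delta))

def pick_experts(topic, users, k=2):
    """Route a question to the k most trusted users for this topic."""
    return sorted(users, key=lambda u: trust.get((u, topic), 0.5),
                  reverse=True)[:k]

update("james", "baseball", False)   # James passed on the baseball question
update("james", "austin_tx", True)   # ...but answered Austin, TX questions
update("james", "austin_tx", True)
update("alice", "baseball", True)

print(pick_experts("baseball", ["james", "alice"]))   # ['alice', 'james']
print(pick_experts("austin_tx", ["james", "alice"]))  # ['james', 'alice']
```

A real deployment would need decay over time and resistance to gaming, but the routing behavior James describes (baseball questions drift away from him, Austin questions toward him) falls out of even this crude update rule.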
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Here's a basic abstract I did last year, I think: http://www.falazar.com/AI/AAAI05_Student_Abtract_James_Ratcliff.pdf I would like to work with others on a full-fledged representation system that could use these kinds of techniques. I hacked this together by myself, so I know a real team could put this kind of stuff to much better use.

James

Ed Porter [EMAIL PROTECTED] wrote:

James, Do you have any description or examples of your results? This is something I have been telling people for years: that you should be able to extract a significant amount (but probably far from all) of world knowledge by scanning large corpora of text. I would love to see how well it actually works for a given size of corpus and for a given level of algorithmic sophistication.

Ed Porter

-----Original Message----- From: James Ratcliff [mailto:[EMAIL PROTECTED]] Sent: Thursday, December 06, 2007 4:51 PM To: agi@v2.listbox.com Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

Richard, What is your specific complaint about the 'viability of the framework'?

Ed, This line of data gathering is very interesting to me as well, though I found quickly that using all web sources quickly devolved into insanity. By using scanned text novels, I was able to extract lots of relational information on a range of topics. With a well-defined ontology system and some human overview, a large amount of information can be extracted and many probabilities learned.

James

Ed Porter [EMAIL PROTECTED] wrote:

RICHARD LOOSEMORE= You are implicitly assuming a certain framework for solving the problem of representing knowledge ... and then all your discussion is about whether or not it is feasible to implement that framework (to overcome various issues to do with searches that have to be done within that framework). But I am not challenging the implementation issues, I am challenging the viability of the framework itself.

ED PORTER= So what is wrong with my framework?
What is wrong with a system for recording patterns, with a method for developing compositions and generalities from those patterns, in multiple hierarchical levels, and for indicating the probabilities of certain patterns given certain other patterns, etc.? I know it doesn't genuflect before the altar of complexity. But what is wrong with the framework, other than the fact that it is at a high level and thus does not explain every little detail of how to actually make an AGI work?

RICHARD LOOSEMORE= These models you are talking about are trivial exercises in public relations, designed to look really impressive, and filled with hype designed to attract funding, which actually accomplish very little. Please, Ed, don't do this to me. Please don't try to imply that I need to open my mind any more. The implication seems to be that I do not understand the issues in enough depth and need to do some more work to understand your points. I can assure you this is not the case.

ED PORTER= Shastri's Shruti is a major piece of work. Although it is a highly simplified system, for its degree of simplification it is amazingly powerful. It has been very helpful to my thinking about AGI. Please give me some excuse for calling it a trivial exercise in public relations. I certainly have not published anything as important. Have you? The same goes for Mike Collins's parsers, which, at least several years ago, I was told by multiple people at MIT were considered among the most accurate NL parsers around. Is that just a trivial exercise in public relations? With regard to Hecht-Nielsen's work, if it does half of what he says it does, it is pretty damned impressive. It is also work I think about often when thinking about how to deal with certain AI problems. Richard, if you insultingly dismiss such valid work as trivial exercises in public relations, it sure as hell seems as if either you are quite lacking in certain important understandings, or you have a closed mind, or both.
Ed Porter

___
James Ratcliff - http://falazar.com
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
James, I read your paper. Your project seems right on the mark. It provides a domain-limited example of the general type of learning algorithm that will probably be the central learning algorithm of AGI, i.e., finding patterns, and hierarchies of patterns, in the AGI's experience in a largely unsupervised manner. Applying this type of learning algorithm to text makes sense because, with the web, it is one of the easiest types of experience to get in large volumes. It is very much the type of project I have been advocating for years. When I first heard of the Google project to put millions of books into digital form, I assumed it was for exactly such purposes, and told multiple people so. (Ditto for the CMU Million Book Project.) It seems to be the conventional wisdom that Google is not using its vast resources for such an obvious purpose, but I wouldn't be so sure.

It seems to me that fiction books, at an estimated average length of 300 pages at 300 words/page, would only have about 100K words each, so 600 of them would only be about 60 million words, which is amazingly small for learning-from-corpora studies. That you were able to learn so much from so little is encouraging, but it would really be interesting to see such a project done on very large corpora, tens or hundreds of billions of words. It would be interesting to see how much of human common sense (and expertise) it could, and could not, derive.

Ed Porter

-----Original Message----- From: James Ratcliff [mailto:[EMAIL PROTECTED]] Sent: Tuesday, December 11, 2007 11:26 AM To: agi@v2.listbox.com Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

Here's a basic abstract I did last year, I think: http://www.falazar.com/AI/AAAI05_Student_Abtract_James_Ratcliff.pdf I would like to work with others on a full-fledged representation system that could use these kinds of techniques. I hacked this together by myself, so I know a real team could put this kind of stuff to much better use.
James

Ed Porter [EMAIL PROTECTED] wrote:

James, Do you have any description or examples of your results? This is something I have been telling people for years: that you should be able to extract a significant amount (but probably far from all) of world knowledge by scanning large corpora of text. I would love to see how well it actually works for a given size of corpus and for a given level of algorithmic sophistication.

Ed Porter

-----Original Message----- From: James Ratcliff [mailto:[EMAIL PROTECTED]] Sent: Thursday, December 06, 2007 4:51 PM To: agi@v2.listbox.com Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

Richard, What is your specific complaint about the 'viability of the framework'?

Ed, This line of data gathering is very interesting to me as well, though I found quickly that using all web sources quickly devolved into insanity. By using scanned text novels, I was able to extract lots of relational information on a range of topics. With a well-defined ontology system and some human overview, a large amount of information can be extracted and many probabilities learned.

James

Ed Porter [EMAIL PROTECTED] wrote:

RICHARD LOOSEMORE= You are implicitly assuming a certain framework for solving the problem of representing knowledge ... and then all your discussion is about whether or not it is feasible to implement that framework (to overcome various issues to do with searches that have to be done within that framework). But I am not challenging the implementation issues, I am challenging the viability of the framework itself.

ED PORTER= So what is wrong with my framework? What is wrong with a system for recording patterns, with a method for developing compositions and generalities from those patterns, in multiple hierarchical levels, and for indicating the probabilities of certain patterns given certain other patterns, etc.? I know it doesn't genuflect before the altar of complexity.
But what is wrong with the framework, other than the fact that it is at a high level and thus does not explain every little detail of how to actually make an AGI work?

RICHARD LOOSEMORE= These models you are talking about are trivial exercises in public relations, designed to look really impressive, and filled with hype designed to attract funding, which actually accomplish very little. Please, Ed, don't do this to me. Please don't try to imply that I need to open my mind any more. The implication seems to be that I do not understand the issues in enough depth and need to do some more work to understand your points. I can assure you this is not the case.

ED PORTER= Shastri's Shruti is a major piece of work. Although it is a highly simplified system, for its degree of simplification it is amazingly powerful. It has been very helpful to my thinking about AGI. Please give me some excuse for calling it trivial
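Ed's corpus-size estimate in the message above can be checked in a couple of lines (his figures are round numbers, so the exact product comes out slightly under his "about 100K / about 60 million"):

```python
# Back-of-envelope check of the corpus-size estimate: 600 fiction books,
# ~300 pages each, ~300 words per page.
pages_per_book = 300
words_per_page = 300
words_per_book = pages_per_book * words_per_page   # 90,000 -- "about 100K"
corpus_words = 600 * words_per_book                # 54,000,000 -- "about 60 million"
print(words_per_book, corpus_words)  # 90000 54000000
```

Either way, tens of millions of words is small next to the tens or hundreds of billions Ed asks about, which is the point of his comparison.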
RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])
--- Jean-Paul Van Belle [EMAIL PROTECTED] wrote:

Hi Matt, Wonderful idea, now it will even show the typical human trait of lying... when I ask it do you still love me? most answers in its database will have Yes as an answer, but when I ask it what's my name? it'll call me John?

My proposed message posting service allows anyone to contribute to its knowledge base, just like Wikipedia, so it could certainly contain some false or useless information. However, the number of peers that keep a copy of a message will depend on the number of peers that accept it according to the peers' policies, which are set individually by their owners. The network provides an incentive for peers to produce useful information so that other peers will accept it. Thus, useful and truthful information is more likely to be propagated.

-- Matt Mahoney, [EMAIL PROTECTED]
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
On Dec 11, 2007 7:26 PM, James Ratcliff [EMAIL PROTECTED] wrote:

Here's a basic abstract I did last year, I think: http://www.falazar.com/AI/AAAI05_Student_Abtract_James_Ratcliff.pdf I would like to work with others on a full-fledged representation system that could use these kinds of techniques. I hacked this together by myself, so I know a real team could put this kind of stuff to much better use.

James

Do you have any particular path in mind to put this kind of thing to work? Finding patterns is fine, and somewhat inevitable, but what are those ontologies good for, and why?

-- Vladimir Nesov [EMAIL PROTECTED]
RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])
Hi Matt, Wonderful idea, now it will even show the typical human trait of lying... when I ask it do you still love me? most answers in its database will have Yes as an answer, but when I ask it what's my name? it'll call me John? However, your approach is actually already being implemented to a certain extent. Apparently (was it Newsweek? Time?) the No. 1 search engine in (Singapore? Hong Kong? Taiwan? Sorry, I forget) is *not* Google but a local-language QA system that works very much the way you envisage it (except it collects the answers in its own SAN, i.e., not distributed over the user machines).

=Jean-Paul

On 2007/12/07 at 18:58, Matt Mahoney [EMAIL PROTECTED] wrote:

Hi Matt, You call it an AGI proposal but it is described as a distributed search algorithm that (merely) appears intelligent, i.e., a design for an Internet-wide message posting and search service. There doesn't appear to be any grounding or semantic interpretation by the AI system? How will it become more intelligent?

Turing was careful to make no distinction between being intelligent and appearing intelligent. The requirement for passing the Turing test is to be able to compute a probability distribution P over text strings that varies from the true distribution no more than it varies between different people. Once you can do this, then given a question Q, you can compute the answer A that maximizes P(A|Q) = P(QA)/P(Q). This does not require grounding. The way my system appears intelligent is by directing Q to the right experts, and by being big enough to have experts on nearly every conceivable topic of interest to humans. A lot of AGI research seems to be focused on how to represent knowledge and thought efficiently on a (much too small) computer, rather than on what services the AGI should provide for us.
-- Research Associate: CITANDA Post-Graduate Section Head Department of Information Systems Phone: (+27)-(0)21-6504256 Fax: (+27)-(0)21-6502280 Office: Leslie Commerce 4.21
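Mahoney's argument above, that the best answer A to a question Q maximizes P(A|Q) = P(QA)/P(Q), can be illustrated with a toy distribution. The probabilities below are invented for the example; the only real point is that P(Q) is constant across candidate answers, so maximizing the joint probability P(QA) suffices:

```python
# Made-up distribution P over text strings (question + answer concatenations).
P = {
    "what's my name? john": 0.04,
    "what's my name? mary": 0.01,
    "do you still love me? yes": 0.05,
    "do you still love me? no": 0.02,
}

def best_answer(question, candidates):
    # P(A|Q) = P(QA)/P(Q); P(Q) is the same for every candidate answer,
    # so the argmax over P(QA) is the argmax over P(A|Q).
    return max(candidates, key=lambda a: P.get(question + " " + a, 0.0))

print(best_answer("what's my name?", ["john", "mary"]))     # john
print(best_answer("do you still love me?", ["yes", "no"]))  # yes
```

This also makes Jean-Paul's joke concrete: since most strings in the corpus answer "do you still love me?" with yes, the system says yes, and calls everyone John.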
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
THE KEY POINT I WAS TRYING TO GET ACROSS WAS ABOUT NOT HAVING TO EXPLICITLY DEAL WITH 500K TUPLES And I asked -- Do you believe that this is some sort of huge conceptual breakthrough?
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Mark, First you attacked me for making a statement which you falsely claimed indicated I did not understand the math in the Collins article (and potentially discredited everything I said on this list). Once it was shown that that attack was unfair, rather than apologizing sufficiently for the unfair attack, you now seem to be coming back with another swing. Now you are implicitly attacking me for implying it is new to think you could deal with vectors in some sort of compressed representation. I was aware that there were previous methods for dealing with vectors in high-dimensional spaces using various compression schemes, although I had only heard of a few examples. I personally had been planning, for years prior to reading Collins' paper, to score matches based mainly on the number of similar features, and not all the dissimilar features (except in certain cases), to avoid the curse of high dimensionality. But I was also aware of many discussions, such as one in a current best-selling AI textbook, which imply that a certain problem easily becomes intractable because one assumes one is saddled with dealing with the full possible dimensionality of the problem space being represented, when it is clear you can accomplish a high percentage of the same thing with a GNG-type approach by only placing representation where there are significant probabilities. So, although it may not be new to you, it seems to be new to some that the curse of high dimensionality can often be avoided in many classes of problems. I was citing the Collins paper as one example showing that AI systems have been able to deal well with high dimensionality.
I attended a lecture at MIT a few years after the Collins paper came out where the major thrust of the speech was that great headway had recently been made in many fields of AI because people were beginning to realize all sorts of efficient hacks that avoid many of the problems of the combinatorial explosion of high dimensionality that had previously thwarted their efforts. The Collins paper is an example of that movement. When it was relatively new, the Collins paper was treated by several people I talked to as quite a breakthrough, because, in conjunction with the work of people like Haussler, it showed a relatively simple way to apply the kernel trick to graph mapping. As you may be aware, the kernel trick not only allows one to score matches, but also allows many of the analytical tools of linear algebra to be applied through the kernel, greatly reducing the complexity of applying such tools in the much higher-dimensional space represented by the kernel mapping. I am not a historian of this field of math, but in its day the kernel trick was getting a lot of buzz from many people in the field. I attended an NL conference at CMU in the early '90s; the use of support vector classifiers using the kernel trick was all the rage at the conference, and the kernels they were using seemed much less appropriate than the one Collins' paper discloses.

Ed Porter

-----Original Message----- From: Mark Waser [mailto:[EMAIL PROTECTED]] Sent: Thursday, December 06, 2007 9:09 AM To: agi@v2.listbox.com Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

THE KEY POINT I WAS TRYING TO GET ACROSS WAS ABOUT NOT HAVING TO EXPLICITLY DEAL WITH 500K TUPLES And I asked -- Do you believe that this is some sort of huge conceptual breakthrough?
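The two ideas Ed refers to can be shown in miniature. This is a generic illustration, not Collins' actual parsing kernel: (1) scoring a match by shared features only, so the cost scales with the nonzero features rather than the full 500K-dimensional space; (2) the kernel trick, where a cheap kernel function equals an inner product in a much larger implicit feature space. The feature names and vectors are made up:

```python
import math

# (1) Sparse feature overlap: cost proportional to the features present,
# not to the huge space they nominally live in.
a = {"subj:dog", "verb:ran", "obj:ball"}
b = {"subj:dog", "verb:ran", "obj:stick"}
overlap = len(a & b)                       # 2 shared features

# (2) Polynomial kernel k(x, y) = (x . y)^2 equals the dot product of
# explicit degree-2 feature maps, without ever building those maps.
x, y = (1.0, 2.0), (3.0, 1.0)
dot = x[0] * y[0] + x[1] * y[1]            # 5.0
k = dot ** 2                               # 25.0

def phi(u):
    # Explicit degree-2 feature map: (u1^2, sqrt(2)*u1*u2, u2^2).
    return (u[0] ** 2, math.sqrt(2) * u[0] * u[1], u[1] ** 2)

explicit = sum(p * q for p, q in zip(phi(x), phi(y)))  # also 25 (up to rounding)
print(overlap, k, round(explicit, 6))
```

With a degree-d kernel on n-dimensional input, the implicit space has roughly n^d dimensions, yet the kernel evaluation stays O(n), which is the "avoiding the curse of high dimensionality" point.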
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Ed, Get a grip. Try to write with complete words in complete sentences (unless discreted means a combination of excreted and discredited -- which works for me :-). I'm not coming back for a second swing. I'm still pursuing the first one. You just aren't oriented well enough to realize it.

Now you are implicitly attacking me for implying it is new to think you could deal with vectors in some sort of compressed representation.

Nope. First of all, compressed representation is *absolutely* the wrong term for what you're looking for. Second, I actually am still trying to figure out what *you* think you ARE gushing about. (And my quest is not helped by such gems as all though [sic] it may not be new to you, it seems to be new to some.) Why don't you just answer my question? Do you believe that this is some sort of huge conceptual breakthrough? For NLP (as you were initially pushing), or just some nice computational tricks? I'll also note that you've severely changed the focus of this away from the NLP that you were initially raving about as such quality work -- and while I'll agree that kernel mapping is a very elegant tool, Collins' work is emphatically *not* what I would call a shining example of it (I mean, *look* at his results -- they're terrible). Yet you were touting it because of your 500,000-dimension fantasies and your belief that it's good NLP work. So, in small words -- and without whining about an attack -- what precisely are you saying?
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Mark, You claimed I made a particular false statement about the Collins paper. (That by itself could have just been a misunderstanding or an honest mistake.) But then you added an insult to that by implying I had probably made the alleged error because I was incapable of understanding the mathematics involved. As if that wasn't enough in the way of gratuitous insults, you suggested my alleged error called into question the validity of the other things I have said on this list. That is a pretty deep, purposely and unnecessarily insulting put-down. I think I have shown that I did understand the math in question, perhaps better than you, since you initially totally ignored the part of the paper that supported my statement. I have shown that my statement was in fact correct by a reasonable interpretation of my words. Thus, not only was your accusation of error unjustified, but, even more so, the two insults placed on top of it. You have not apologized for your unjustified accusation of error and the two additional unnecessary insults (unless your statement Ok. I'll bite. is considered an appropriate apology for such an improper set of deep insults). Instead you have continued in an even more insulting tone, including starting one subsequent email with a comment about something I had said that went as follows: <HeavySarcasm>Wow. Is that what dot products are?</HeavySarcasm> I don't mind people questioning me, or pointing out errors when I make them. I even have a fair amount of tolerance for people mistakenly accusing me of making an error, if they make the false accusation honestly and not in a purposely insulting manner, as you did. Why should I waste more time conversing with someone who wants to converse in such an insulting tone? Mark, you have been quick to publicly call other people on this list trolls, in effect to their faces, in front of the whole list. This is a behavior most people would consider very hurtful.
So what do you call people on this list who not only falsely accuse other people of errors, add several unnecessary insults based on the false accusation, and then, when shown to be in error, continue addressing comments to the falsely accused person in a <HeavySarcasm> style? How about "mean-spirited"? Mark, you are an intelligent person, and I have found some of your posts valuable. That day a few weeks ago when you and Ben were riffing back and forth, I was offended by your tone, but I thought many of your questions were valuable. If you wish to continue any sort of communication with me, feel free to question and challenge, but please lay off the <HeavySarcasm> and insults, which do nothing to further the exchange and clarification of ideas. With regard to your questions below, if you actually take the time to read my prior responses, I think you will see I have substantially answered them. Ed Porter -Original Message- From: Mark Waser [mailto:[EMAIL PROTECTED] Sent: Thursday, December 06, 2007 1:24 PM To: agi@v2.listbox.com Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research] Ed, Get a grip. Try to write with complete words in complete sentences (unless "discreted" means a combination of "excreted" and "discredited" -- which works for me :-). I'm not coming back for a second swing. I'm still pursuing the first one. You just aren't oriented well enough to realize it. Now you are implicitly attacking me for implying it is new to think you could deal with vectors in some sort of compressed representation. Nope. First of all, compressed representation is *absolutely* the wrong term for what you're looking for. Second, I actually am still trying to figure out what *you* think you ARE gushing about. (And my quest is not helped by such gems as "all though [sic] it may not be new to you, it seems to be new to some") Why don't you just answer my question? Do you believe that this is some sort of huge conceptual breakthrough? 
For NLP (as you were initially pushing) or just for some nice computational tricks? I'll also note that you've severely changed the focus of this away from the NLP that you were initially raving about as such quality work -- and while I'll agree that kernel mapping is a very elegant tool -- Collins' work is emphatically *not* what I would call a shining example of it (I mean, *look* at his results -- they're terrible). Yet you were touting it because of your 500,000 dimension fantasies and your belief that it's good NLP work. So, in small words -- and not whining about an attack -- what precisely are you saying? - This list is sponsored by AGIRI: http://www.agiri.org/email To unsubscribe or change your options, please go to: http://v2.listbox.com/member/?member_id=8660244id_secret=73284487
RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])
--- Ed Porter [EMAIL PROTECTED] wrote: I have a lot of respect for Google, but I don't like monopolies, whether it is Microsoft or Google. I think it is vitally important that there be several viable search competitors. I wish this wiki one luck. As I said, it sounds a lot like your idea. Partly. The main difference is that I am also proposing a message posting service, where messages become instantly searchable and are also directed to persistent queries. Wikia has a big hurdle to get over. People will ask "How is this better than Google?" before they bother to download the software. For example, Grub (a distributed spider) uses a lot of bandwidth and disk without providing much direct benefit to the user. The major benefit of Wikia seems to be that users provide feedback on the relevance of query responses, which in theory ought to provide a better ranking algorithm than something like Google's PageRank. But assuming they get enough users to get to this level, spammers could still game the system by flooding the network with high rankings for their websites. In a distributed message posting service, each peer would have its own policy regarding which messages to relay, keep in its cache, or ignore. If a document is valuable, then lots of peers would keep a copy. A client could then rank query responses by the number of copies received, weighted by each peer's reputation. Spammers could try to game the system by adding lots of peers and flooding the network with advertising, but this would fail because most other peers would be configured to ignore peers that don't provide reciprocal services by routing their own outgoing messages. Any peer not so configured would quickly be abused and isolated from the network, in the same way that open relay SMTP servers get abused by spammers and blacklisted by spam filters. Of course a message posting service would have a big hurdle too. Initially, the service would have to be well integrated with the existing Internet. 
Client queries would have to go to the major search engines, and there would have to be websites set up as peers without the user having to install software. Most computers are not configured to run as servers (dynamic IP, behind firewalls, slow upload, etc), so peers will probably need to allow message passing over client HTTP (website polling), by email, and over instant messaging protocols. File sharing networks became popular because they offered a service not available elsewhere (free music). But I don't intend for the message posting service to be used to evade copyright or censorship (although it probably could be). The protocol requires that the message's originator and intermediate routers all be identified by a reply address and time stamp. It won't work otherwise. -- Matt Mahoney, [EMAIL PROTECTED]
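[Editor's note: the ranking scheme Matt describes above — score each document by the number of copies received, weighted by the reputation of each returning peer — can be sketched as follows. The function names and the [0, 1] reputation scale are illustrative assumptions, not part of any actual protocol.]

```python
from collections import defaultdict

def rank_responses(responses, reputation):
    """Rank documents by number of copies received, weighted by the
    reputation of each peer that returned a copy.

    responses:  list of (doc_id, peer_id) pairs, one per copy received
    reputation: dict mapping peer_id -> weight in [0, 1] (assumed scale)
    """
    scores = defaultdict(float)
    for doc_id, peer_id in responses:
        # Unknown peers get a small default weight rather than zero,
        # so new peers can still contribute some evidence.
        scores[doc_id] += reputation.get(peer_id, 0.1)
    # Highest aggregate score first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Example: doc "a" returned by two reputable peers outranks doc "b"
# returned by three unknown (possibly spamming) peers.
ranked = rank_responses(
    [("a", "p1"), ("a", "p2"), ("b", "x1"), ("b", "x2"), ("b", "x3")],
    {"p1": 0.9, "p2": 0.8},
)
```

This captures why flooding the network with copies from low-reputation peers fails: their aggregate weight stays small no matter how many copies they inject.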
RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])
Matt, Does a PC become more vulnerable to viruses, worms, Trojan horses, root kits, and other web attacks if it becomes part of a P2P network? And if so, why and how much? Ed Porter -Original Message- From: Matt Mahoney [mailto:[EMAIL PROTECTED] Sent: Thursday, December 06, 2007 3:01 PM To: agi@v2.listbox.com Subject: RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])
RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])
--- Ed Porter [EMAIL PROTECTED] wrote: Matt, Does a PC become more vulnerable to viruses, worms, Trojan horses, root kits, and other web attacks if it becomes part of a P2P network? And if so, why and how much? It does if the P2P software has vulnerabilities, just like any other server or client. Worms would be especially dangerous because they could spread quickly without user intervention, but slowly spreading viruses that are well hidden can be dangerous too. There is no foolproof defense, but it helps to keep the protocol and software as simple as possible, to run the P2P software as a nonprivileged process, to use open source code, and not to depend to any large extent on a single source of software. The protocol I have in mind is that a message contains searchable natural language text, possibly some nonsearchable attached files, and a header with the reply address and timestamp of the originator and of any intermediate peers through which the message was routed. The protocol is not dangerous except for the attached files, but these have to be included because they are a useful service. If you don't include them, people will figure out how to embed arbitrary data in the message text, which would make the protocol more dangerous because it wasn't planned for. In theory, you could use the P2P network to spread information about malicious peers and deliver software patches. But I think this would introduce more problems than it solves, because it would also introduce a mechanism for spreading false information and patches containing trojans. Peers should have defenses that operate independently of the network, including disconnecting themselves if they detect anomalies in their own behavior. Of course the network is vulnerable even if the peers behave properly. Malicious peers could forge headers, for example, to hide the true source of messages or to force replies to be directed to unintended targets. 
Some attacks could be very complex depending on the idiosyncratic behavior of particular peers. -- Matt Mahoney, [EMAIL PROTECTED]
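[Editor's note: the message format Matt outlines — searchable text, optional attached files, and a header identifying the originator and every intermediate router by reply address and timestamp — can be sketched as a minimal data structure. The type and field names are illustrative assumptions; the original posts specify only the header contents, not any concrete encoding.]

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Hop:
    reply_address: str   # where replies to or through this peer should go
    timestamp: float     # when this peer originated or relayed the message

@dataclass
class Message:
    text: str                                               # searchable natural-language text
    attachments: List[bytes] = field(default_factory=list)  # nonsearchable attached files
    route: List[Hop] = field(default_factory=list)          # originator first, then each relay

def relay(msg: Message, peer_address: str, now: float) -> Message:
    """Append this peer's identification before forwarding, since the
    protocol requires every intermediate router to be identified."""
    msg.route.append(Hop(peer_address, now))
    return msg

# The originator stamps itself first; each relay appends its own hop.
msg = Message(text="query: distributed search")
relay(msg, "originator@example", 0.0)
relay(msg, "relay1@example", 1.5)
```

Note that nothing here prevents the header forgery Matt mentions; any peer can write an arbitrary `Hop`, which is exactly why forged routes remain an attack on the honest network.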
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Richard, What is your specific complaint about the 'viability of the framework'? Ed, This line of data gathering is very interesting to me as well, though I found quickly that using all web sources quickly devolved into insanity. By using scanned text novels, I was able to extract lots of relational information on a range of topics. With a well defined ontology system, and some human overview, a large amount of information can be extracted and many probabilities learned. James Ed Porter [EMAIL PROTECTED] wrote: RICHARD LOOSEMORE= You are implicitly assuming a certain framework for solving the problem of representing knowledge ... and then all your discussion is about whether or not it is feasible to implement that framework (to overcome various issues to do with searches that have to be done within that framework). But I am not challenging the implementation issues, I am challenging the viability of the framework itself. JAMES--- What e ED PORTER= So what is wrong with my framework? What is wrong with a system of recording patterns, and a method for developing compositions and generalities from those patterns, in multiple hierarchical levels, and for indicating the probabilities of certain patterns given certain other patterns, etc.? I know it doesn't genuflect before the altar of complexity. But what is wrong with the framework other than the fact that it is at a high level and thus does not explain every little detail of how to actually make an AGI work? RICHARD LOOSEMORE= These models you are talking about are trivial exercises in public relations, designed to look really impressive, and filled with hype designed to attract funding, which actually accomplish very little. Please, Ed, don't do this to me. Please don't try to imply that I need to open my mind any more. The implication seems to be that I do not understand the issues in enough depth, and need to do some more work to understand your points. I can assure you this is not the case. 
ED PORTER= Shastri's Shruti is a major piece of work. Although it is a highly simplified system, for its degree of simplification it is amazingly powerful. It has been very helpful to my thinking about AGI. Please give me some excuse for calling it a trivial exercise in public relations. I certainly have not published anything as important. Have you? The same for Mike Collins's parsers, which, at least several years ago, I was told by multiple people at MIT were considered among the most accurate NL parsers around. Is that just a trivial exercise in public relations? With regard to Hecht-Nielsen's work, if it does half of what he says it does, it is pretty damned impressive. It is also a work I think about often when thinking how to deal with certain AI problems. Richard, if you insultingly dismiss such valid work as trivial exercises in public relations, it sure as hell seems as if either you are quite lacking in certain important understandings -- or you have a closed mind -- or both. Ed Porter ___ James Ratcliff - http://falazar.com
RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])
Matt, So if it is perceived as something that increases a machine's vulnerability, it seems to me that would be one more reason for people to avoid using it. Ed Porter -Original Message- From: Matt Mahoney [mailto:[EMAIL PROTECTED] Sent: Thursday, December 06, 2007 4:06 PM To: agi@v2.listbox.com Subject: RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])
Re: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])
On 06/12/2007, Ed Porter [EMAIL PROTECTED] wrote: Matt, So if it is perceived as something that increases a machine's vulnerability, it seems to me that would be one more reason for people to avoid using it. Ed Porter Why are you having this discussion on an AGI list? Will Pearson
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
On Dec 7, 2007 1:20 AM, Ed Porter [EMAIL PROTECTED] wrote: This is something I have been telling people for years. That you should be able to extract a significant amount (but probably far from all) world knowledge by scanning large corpora of text. I would love to see how well it actually works for a given size of corpora, and for a given level of algorithmic sophistication. But what's knowledge? -- Vladimir Nesov mailto:[EMAIL PROTECTED]
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
James, Do you have any description or examples of your results? This is something I have been telling people for years: that you should be able to extract a significant amount (but probably far from all) world knowledge by scanning large corpora of text. I would love to see how well it actually works for a given size of corpora, and for a given level of algorithmic sophistication. Ed Porter -Original Message- From: James Ratcliff [mailto:[EMAIL PROTECTED] Sent: Thursday, December 06, 2007 4:51 PM To: agi@v2.listbox.com Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
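[Editor's note: James does not describe his extraction algorithm. One of the simplest instances of the idea discussed in this thread — learning which household objects associate with which locations from raw text, without a human typing the facts in — is plain sentence-level co-occurrence counting. The function and corpus below are illustrative assumptions only, not James's actual method.]

```python
import re
from collections import Counter
from itertools import combinations

def cooccurrence_counts(sentences, vocabulary):
    """Count how often pairs of vocabulary words co-occur in a sentence.
    Ratios such as count(sock, closet) / count(sock) can then serve as
    crude conditional probabilities for a primer ontology."""
    counts = Counter()
    vocab = set(vocabulary)
    for sentence in sentences:
        # Lowercase tokenization; keep only words from the target vocabulary.
        words = set(re.findall(r"[a-z]+", sentence.lower())) & vocab
        # Store each unordered pair under a canonical (sorted) key.
        for pair in combinations(sorted(words), 2):
            counts[pair] += 1
    return counts

corpus = [
    "She found a sock in the closet.",
    "A sock lay under the chair by the table.",
    "The closet held a spare chair.",
]
counts = cooccurrence_counts(corpus, ["sock", "closet", "chair", "table"])
```

Counts from a large enough corpus, rather than a single passage, are what make this kind of statistic reliable: any one sentence is noisy, but mass usage reinforces the stable associations.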
RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])
It was part of a discussion of using a P2P network with OpenCog to develop distributed AGIs. -Original Message- From: William Pearson [mailto:[EMAIL PROTECTED] Sent: Thursday, December 06, 2007 5:20 PM To: agi@v2.listbox.com Subject: Re: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]) On 06/12/2007, Ed Porter [EMAIL PROTECTED] wrote: Matt, So if it is perceived as something that increases a machine's vulnerability, it seems to me that would be one more reason for people to avoid using it. Ed Porter Why are you having this discussion on an AGI list? Will Pearson
Re: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])
--- William Pearson [EMAIL PROTECTED] wrote: On 06/12/2007, Ed Porter [EMAIL PROTECTED] wrote: Matt, So if it is perceived as something that increases a machine's vulnerability, it seems to me that would be one more reason for people to avoid using it. Ed Porter Why are you having this discussion on an AGI list? Because this is an AGI design. The intelligence comes from having a lot of specialized experts on narrow topics and a distributed infrastructure that directs your queries to the right experts. The P2P protocol is natural language text. I will write up the proposal so it will make more sense than the current collection of posts. -- Matt Mahoney, [EMAIL PROTECTED]
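[Editor's note: Matt's claim that "the intelligence comes from ... a distributed infrastructure that directs your queries to the right experts" might be sketched as keyword-overlap routing. This is a guess at one possible mechanism, since his posts do not specify how experts are matched; the expert profiles below are invented for illustration.]

```python
def route_query(query, experts):
    """Direct a natural-language query to the expert peers whose
    advertised keyword profiles best overlap the query's words.

    experts: dict mapping peer_id -> set of topic keywords (assumed to
    be advertised by each peer; the actual proposal leaves this open).
    """
    words = set(query.lower().split())
    scored = [(len(words & topics), peer) for peer, topics in experts.items()]
    # Keep only peers with at least one matching topic, best match first.
    return [peer for score, peer in sorted(scored, reverse=True) if score > 0]

peers = {
    "compression-expert": {"compression", "entropy", "coding"},
    "nlp-expert": {"parsing", "grammar", "language"},
    "p2p-expert": {"routing", "peers", "network"},
}
targets = route_query("how does entropy coding work", peers)
```

A real network would presumably combine this matching score with the peer-reputation weighting discussed earlier in the thread, so that a spammer advertising every keyword does not attract all traffic.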
RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])
Are you saying the increase in vulnerability would be no more than that? -Original Message- From: Matt Mahoney [mailto:[EMAIL PROTECTED] Sent: Thursday, December 06, 2007 6:17 PM To: agi@v2.listbox.com Subject: RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]) --- Ed Porter [EMAIL PROTECTED] wrote: Matt, So if it is perceived as something that increases a machine's vulnerability, it seems to me that would be one more reason for people to avoid using it. Ed Porter A web browser and email increase your computer's vulnerability, but that doesn't stop people from using them.
RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])
--- Ed Porter [EMAIL PROTECTED] wrote: Matt, So if it is perceived as something that increases a machine's vulnerability, it seems to me that would be one more reason for people to avoid using it. Ed Porter A web browser and email increases your computer's vulnerability, but it doesn't stop people from using them. -Original Message- From: Matt Mahoney [mailto:[EMAIL PROTECTED] Sent: Thursday, December 06, 2007 4:06 PM To: agi@v2.listbox.com Subject: RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]) --- Ed Porter [EMAIL PROTECTED] wrote: Matt, Does a PC become more vulnerable to viruses, worms, Trojan horses, root kits, and other web attacks if it becomes part of a P2P network? And if so why and how much. It does if the P2P software has vulnerabilities, just like any other server or client. Worms would be especially dangerous because they could spread quickly without user intervention, but slowly spreading viruses that are well hidden can be dangerous too. There is no foolproof defense, but it helps to keep the protocol and software as simple as possible, to run the P2P software as a nonprivileged process, use open source code, and not to depend to any large extent on a single source of software. The protocol I have in mind is that a message contain searchable natural language text, possibly some nonsearchable attached files, and a header with the reply address and timestamp of the originator and any intermediate peers through which the message was routed. The protocol is not dangerous except for the attached files, but these have to be included because it is a useful service. If you don't include it, people will figure out how to embed arbitrary data in the message text, which would make the protocol more dangerous because it wasn't planned for. In theory, you could use the P2P network to spread information about malicious peers and deliver software patches. 
But I think this would introduce more problems than it solves, because it would also introduce a mechanism for spreading false information and patches containing trojans. Peers should have defenses that operate independently of the network, including disconnecting themselves if they detect anomalies in their own behavior. Of course the network is vulnerable even if the peers behave properly. Malicious peers could forge headers, for example, to hide the true source of messages or to force replies to be directed to unintended targets. Some attacks could be very complex, depending on the idiosyncratic behavior of particular peers. -- Matt Mahoney, [EMAIL PROTECTED] - This list is sponsored by AGIRI: http://www.agiri.org/email To unsubscribe or change your options, please go to: http://v2.listbox.com/member/?member_id=8660244id_secret=73388768-0927ef
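The message format Matt describes above (searchable text, optional attached files, and a header recording the originator and each relay's reply address and timestamp) can be sketched as a data structure. This is only an illustrative guess at the fields; the names and types are assumptions, not part of any actual protocol proposal:

```python
# Hypothetical sketch of the message format described above; field names
# and types are illustrative assumptions, not a real protocol definition.
from dataclasses import dataclass, field
from typing import List


@dataclass
class Hop:
    reply_address: str  # where replies to this peer should be sent
    timestamp: float    # when this peer originated or forwarded the message


@dataclass
class Message:
    text: str                                               # searchable natural-language text
    attachments: List[bytes] = field(default_factory=list)  # nonsearchable attached files
    route: List[Hop] = field(default_factory=list)          # originator first, then each relay
```

A relay would append its own Hop before forwarding. Note that, as the thread itself points out, nothing in such a structure stops a malicious peer from forging earlier entries in the route.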
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Edward, It's certainly a trick question, since if you don't define semantics for this knowledge thing, it can turn out to be anything from the simplest do-nothings to full-blown, physically infeasible superintelligences. So your assertion doesn't address the viability of knowledge extraction for various purposes, and without that it's not clear what you actually mean. On Dec 7, 2007 1:20 AM, Ed Porter [EMAIL PROTECTED] wrote: This is something I have been telling people for years: that you should be able to extract a significant amount (but probably far from all) of world knowledge by scanning large corpora of text. I would love to see how well it actually works for a given size of corpus and a given level of algorithmic sophistication. -- Vladimir Nesov mailto:[EMAIL PROTECTED]
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Yes, it's what triggered my nitpicking reflex; I am sorry about that. Your comment sounds fine when related to the viability of teaching an AGI in a text-only mode without too much manual assistance, but the semantics of what it would be given are quite different. On Dec 7, 2007 3:13 AM, Ed Porter [EMAIL PROTECTED] wrote: Vlad, My response was to the following message == Ed, This line of data gathering is very interesting to me as well, though I found quickly that using all web sources quickly devolved into insanity. By using scanned text novels, I was able to extract lots of relational information on a range of topics. With a well-defined ontology system and some human overview, a large amount of information can be extracted and many probabilities learned. James = so I was asking what sort of knowledge he had extracted as part of the lots of relational information on a range of topics. Ed Porter -Original Message- From: Vladimir Nesov [mailto:[EMAIL PROTECTED] Sent: Thursday, December 06, 2007 7:02 PM To: agi@v2.listbox.com Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research] Edward, It's certainly a trick question, since if you don't define semantics for this knowledge thing, it can turn out to be anything from the simplest do-nothings to full-blown, physically infeasible superintelligences. So your assertion doesn't address the viability of knowledge extraction for various purposes, and without that it's not clear what you actually mean. On Dec 7, 2007 1:20 AM, Ed Porter [EMAIL PROTECTED] wrote: This is something I have been telling people for years: that you should be able to extract a significant amount (but probably far from all) of world knowledge by scanning large corpora of text. I would love to see how well it actually works for a given size of corpus and a given level of algorithmic sophistication.
-- Vladimir Nesov mailto:[EMAIL PROTECTED]
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Vlad, My response was to the following message == Ed, This line of data gathering is very interesting to me as well, though I found quickly that using all web sources quickly devolved into insanity. By using scanned text novels, I was able to extract lots of relational information on a range of topics. With a well-defined ontology system and some human overview, a large amount of information can be extracted and many probabilities learned. James = so I was asking what sort of knowledge he had extracted as part of the lots of relational information on a range of topics. Ed Porter -Original Message- From: Vladimir Nesov [mailto:[EMAIL PROTECTED] Sent: Thursday, December 06, 2007 7:02 PM To: agi@v2.listbox.com Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research] Edward, It's certainly a trick question, since if you don't define semantics for this knowledge thing, it can turn out to be anything from the simplest do-nothings to full-blown, physically infeasible superintelligences. So your assertion doesn't address the viability of knowledge extraction for various purposes, and without that it's not clear what you actually mean. On Dec 7, 2007 1:20 AM, Ed Porter [EMAIL PROTECTED] wrote: This is something I have been telling people for years: that you should be able to extract a significant amount (but probably far from all) of world knowledge by scanning large corpora of text. I would love to see how well it actually works for a given size of corpus and a given level of algorithmic sophistication. -- Vladimir Nesov mailto:[EMAIL PROTECTED]
RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])
--- Ed Porter [EMAIL PROTECTED] wrote: Are you saying the increase in vulnerability would be no more than that? Yes, at least short term, if we are careful with the design. But then again, you can't predict what an AGI will do, or else it wouldn't be intelligent. I can't say for certain that long term (2040s?) it wouldn't launch a singularity, or even that it wouldn't create an intelligent worm that would eat the Internet. I don't think anyone is smart enough to get it right, but it is going to happen in one form or another. I wrote up a quick description of my AGI proposal at http://www.mattmahoney.net/agi.html basically summarizing what I posted over the last several emails, including various attack scenarios. I'm sure I didn't think of everything. It is kind of sketchy because it's not an area I am actively pursuing. It should be a useful service, at least in the short term before it destroys us. -Original Message- From: Matt Mahoney [mailto:[EMAIL PROTECTED] Sent: Thursday, December 06, 2007 6:17 PM To: agi@v2.listbox.com Subject: RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]) --- Ed Porter [EMAIL PROTECTED] wrote: Matt, So if it is perceived as something that increases a machine's vulnerability, it seems to me that would be one more reason for people to avoid using it. Ed Porter A web browser and email increase your computer's vulnerability, but that doesn't stop people from using them. -Original Message- From: Matt Mahoney [mailto:[EMAIL PROTECTED] Sent: Thursday, December 06, 2007 4:06 PM To: agi@v2.listbox.com Subject: RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]) --- Ed Porter [EMAIL PROTECTED] wrote: Matt, Does a PC become more vulnerable to viruses, worms, Trojan horses, rootkits, and other web attacks if it becomes part of a P2P network? And if so, why, and by how much? It does if the P2P software has vulnerabilities, just like any other server or client.
Worms would be especially dangerous because they could spread quickly without user intervention, but slowly spreading viruses that are well hidden can be dangerous too. There is no foolproof defense, but it helps to keep the protocol and software as simple as possible, to run the P2P software as a nonprivileged process, to use open source code, and not to depend to any large extent on a single source of software. The protocol I have in mind is that a message contains searchable natural language text, possibly some nonsearchable attached files, and a header with the reply address and timestamp of the originator and of any intermediate peers through which the message was routed. The protocol is not dangerous except for the attached files, but these have to be included because they make for a useful service. If you don't include them, people will figure out how to embed arbitrary data in the message text, which would make the protocol more dangerous because it wasn't planned for. In theory, you could use the P2P network to spread information about malicious peers and deliver software patches. But I think this would introduce more problems than it solves, because it would also introduce a mechanism for spreading false information and patches containing trojans. Peers should have defenses that operate independently of the network, including disconnecting themselves if they detect anomalies in their own behavior. Of course the network is vulnerable even if the peers behave properly. Malicious peers could forge headers, for example, to hide the true source of messages or to force replies to be directed to unintended targets. Some attacks could be very complex, depending on the idiosyncratic behavior of particular peers.
-- Matt Mahoney, [EMAIL PROTECTED]
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Interesting. Since I am interested in parsing, I read Collins's paper. It's a solid piece of work (though with the stated error percentages, I don't believe that it really proves anything worthwhile at all) -- but your over-interpretations of it are ridiculous. You claim that It is actually showing that you can do something roughly equivalent to growing neural gas (GNG) in a space with something approaching 500,000 dimensions, but you can do it without normally having to deal with more than a few of those dimensions at one time. Collins makes no claims that even remotely resemble this. He *is* taking a deconstructionist approach (which Richard and many others would argue vehemently with) -- but that is virtually the entirety of the overlap between his paper and your claims. Where do you get all this crap about 500,000 dimensions, for example? You also make statements that are explicitly contradicted in the paper. For example, you say But there really seems to be no reason why there should be any limit to the dimensionality of the space in which Collins's algorithm works, because it does not use an explicit vector representation while his paper quite clearly states Each tree is represented by an n dimensional vector where the i'th component counts the number of occurrences of the i'th tree fragment. (A mistake I believe you made because you didn't understand the preceding sentence -- or, more critically, *any* of the math). Are all your claims on this list this far from reality if one pursues them?
- Original Message - From: Ed Porter [EMAIL PROTECTED] To: agi@v2.listbox.com Sent: Tuesday, December 04, 2007 10:52 PM Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research] The particular NL parser paper in question, Collins's Convolution Kernels for Natural Language (http://l2r.cs.uiuc.edu/~danr/Teaching/CS598-05/Papers/Collins-kernels.pdf), is actually saying something quite important that extends way beyond parsers and is highly applicable to AGI in general. It is actually showing that you can do something roughly equivalent to growing neural gas (GNG) in a space with something approaching 500,000 dimensions, but you can do it without normally having to deal with more than a few of those dimensions at one time. GNG is an algorithm I learned about from reading Peter Voss that allows one to learn how to efficiently represent a distribution in a relatively high dimensional space in a totally unsupervised manner. But there really seems to be no reason why there should be any limit to the dimensionality of the space in which Collins's algorithm works, because it does not use an explicit vector representation, nor, if I recollect correctly, a Euclidean distance metric, but rather a similarity metric, which is generally much more appropriate for matching in very high dimensional spaces. But what he is growing are not just points representing where data has occurred in a high dimensional space, but sets of points that define hyperplanes for defining the boundaries between classes. My recollection is that this system learns automatically from both labeled data (instances of correct parse trees) and randomly generated deviations from those instances. His particular algorithm matches tree structures, but with modification it would seem to be extendable to matching arbitrary nets. Other versions of it could be made to operate, like GNG, in an unsupervised manner.
If you stop and think about what this is saying and generalize from it, it provides an important possible component in an AGI tool kit. What it shows is not limited to parsing, but would seem applicable to virtually any hierarchical or networked representation, including nets of semantic web RDF triples, semantic nets, and predicate logic expressions. At first glance it appears it would even be applicable to kinkier net matching algorithms, such as augmented transition network (ATN) matching. So if one reads this paper with a mind not only to what it specifically shows, but to how what it shows could be expanded, this paper says something very important. That is, that one can represent, learn, and classify things in very high dimensional spaces -- spaces with hundreds of thousands of dimensions -- and do it efficiently, provided the part of the space being represented is sufficiently sparsely connected. I had already assumed this before reading this paper, but the paper was valuable to me because it provided mathematically rigorous support for my prior models, and helped me better understand the mathematical foundations of my own prior intuitive thinking. It means that systems like Novamente can deal in very high dimensional spaces relatively efficiently. It does not mean that all processes that can be performed in such spaces will be computationally cheap (for example, combinatorial searches), but it means that many of them, such as GNG like recording
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Dave, Thanks for the link. Seems like it gives Matt the right to say to the world, I told you so. I wonder if OpenCog could get involved in this, or something like this, in a productive way. Ed Porter -Original Message- From: David Hart [mailto:[EMAIL PROTECTED] Sent: Wednesday, December 05, 2007 3:16 AM To: agi@v2.listbox.com Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research] On 12/5/07, Matt Mahoney [EMAIL PROTECTED] wrote: [snip] Centralized search is limited to a few big players that can keep a copy of the Internet on their servers. Google is certainly useful, but imagine if it searched a space 1000 times larger, and if posts were instantly added to its index without having to wait days for its spider to find them. Imagine your post going to persistent queries posted days earlier. Imagine your queries being answered by real human beings in addition to other peers. I probably won't be the one writing this program, but where there is a need, I expect it will happen. Wikia, the company run by Wikipedia founder Jimmy Wales, is tackling the Internet-scale distributed search problem - http://search.wikia.com/wiki/Atlas Connecting to related threads (some recent, some not-so-recent), the Grub distributed crawler ( http://search.wikia.com/wiki/Grub ) is intended to be one of many plug-in Atlas Factories. A development goal for Grub is to enhance it with an NL toolkit (e.g. the soon-to-be-released RelEx), so it can do more than parse simple keywords and calculate statistical word relationships. -dave
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
The language you quoted is from the Collins paper cited in my prior email, and occurred in the following context. Conceptually we begin by enumerating all tree fragments that occur in the training data 1,...,n. NOTE THAT THIS IS DONE ONLY IMPLICITLY. Each tree is represented by an n dimensional vector where the i'th component counts the number of occurrences of the i'th tree fragment. (capitalization is added for emphasis) This is the discussion of the conceptually very high dimensional space his system effectively computes in but normally avoids having to explicitly deal in. In that conceptually high dimensional space, patterns are represented conceptually by vectors having a scalar associated with each dimension of the high dimensional space. But this vector is only the conceptual representation, not the one his system actually explicitly uses for computation. This is the very high dimensional space I was talking about, not the reduced-dimensionality one I talked about in which most operations are performed. The 4th paragraph on page 3 of the paper starts with The key to our efficient use of this high dimensional representation is the definition of an appropriate kernel. The kernel method it discusses uses a kernel function C(n1,n2), which appears at the end of the major equation that has three equal signs and spans the width of page 3. (An image of this equation appeared here for those reading in rich text.) This function C(n1,n2) is summarized in the following text at the start of the first full paragraph on page 4. To see that this recursive definition is correct, note that C(n1,n2) simply counts the number of common subtrees that are found rooted at both n1 and n2. In the above equation, n1 and n2 iterate over each node, respectively, in each of the two trees being matched. Thus this kernel function deals with far fewer than all of the i subtree fragments that occur in the training data mentioned in the above quoted text that starts with the word Conceptually.
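The recursion described above -- C(n1,n2) counting the common subtrees rooted at both nodes, summed over all node pairs so that the huge fragment-count vectors are never built -- can be sketched in a few lines. This is a simplified reconstruction of the Collins-Duffy tree kernel, not code from the paper; trees here are nested tuples (label, child, ...), leaves are bare strings, and the decay factor from the paper is omitted for clarity:

```python
# Simplified sketch of the tree kernel discussed above: trees are nested
# tuples (label, child, ...); a leaf is a bare string. C(n1, n2) counts the
# subtree fragments common to both nodes; the kernel sums C over all node
# pairs, so the implicit fragment-count vectors never need to be built.

def label(t):
    return t[0] if isinstance(t, tuple) else t

def children(t):
    return t[1:] if isinstance(t, tuple) else ()

def production(t):
    # a node's grammar production: its label plus its children's labels
    return (label(t), tuple(label(c) for c in children(t)))

def C(n1, n2):
    # zero unless both nodes expand with the same production
    if not children(n1) or not children(n2) or production(n1) != production(n2):
        return 0
    result = 1
    for c1, c2 in zip(children(n1), children(n2)):
        result *= 1 + C(c1, c2)  # each child pair either stops or recurses
    return result

def nodes(t):
    if not isinstance(t, tuple):
        return []
    out = [t]
    for c in children(t):
        out.extend(nodes(c))
    return out

def tree_kernel(t1, t2):
    # implicit dot product over all fragment counts, computed pair by pair
    return sum(C(a, b) for a in nodes(t1) for b in nodes(t2))
```

For example, for t1 = ("NP", ("D", "the"), ("N", "dog")) and t2 = ("NP", ("D", "the"), ("N", "cat")), tree_kernel(t1, t2) counts the three fragments they share -- (D the), (NP D N), and (NP (D the) N) -- without ever enumerating the full fragment space.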
Instead it only deals with that subset of the i subtrees that occur in the two parse trees being compared. Since the vector referred to in the Conceptually paragraph, which had the full dimensionality i, is not used in the kernel function, it never needs to be explicitly dealt with. THUS, THE ALLEGATION BELOW THAT I MISUNDERSTOOD THE MATH BECAUSE THOUGHT COLLIN'S PARSER DIDN'T HAVE TO DEAL WITH A VECTOR HAVING THE FULL DIMENSIONALITY OF THE SPACE BEING DEALT WITH IS CLEARLY FALSE. QED What this is saying is rather common-sensical. It says that regardless of how many dimensions a space has, you can compare things based on the number of dimensions they share, which is normally a very small subset of the total number of dimensions. This is often called a dot product comparison, and the matching metric is often called a similarity rather than a distance. This is different from a normal distance comparison, which, by common definition, measures the similarity or lack thereof in all dimensions. But in an extremely high dimensional space such computations become extremely complex, and the distance is dominated by the extremely large number of dimensions that are for many purposes irrelevant to the comparison. Of course in the case of Collins's paper the comparison is made a little more complex because it involves a mapping, not just a measure of the similarity along each shared i dimension. So, in summary, Mark, before you trash me so harshly, please take a little more care to be sure your criticisms are actually justified. Ed Porter -Original Message- From: Mark Waser [mailto:[EMAIL PROTECTED] Sent: Wednesday, December 05, 2007 10:27 AM To: agi@v2.listbox.com Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research] Interesting. Since I am interested in parsing, I read Collins's paper.
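The point above about comparing items only along the dimensions they share can be illustrated with sparse vectors stored as dictionaries. The dimension names and counts below are made up for illustration; the pattern itself is standard:

```python
# Sparse feature vectors keyed by dimension id: a dot product then touches
# only the dimensions both vectors actually have nonzero values in, no
# matter how large the full (implicit) space is.

def sparse_dot(u, v):
    if len(u) > len(v):  # iterate over the smaller vector's nonzero entries
        u, v = v, u
    return sum(x * v[k] for k, x in u.items() if k in v)

a = {"frag_17": 2, "frag_503881": 1}  # nonzero in 2 of a huge space of fragments
b = {"frag_17": 1, "frag_9": 4}
sparse_dot(a, b)  # only the shared dimension frag_17 contributes: 2*1 = 2
```

A Euclidean distance, by contrast, would in principle have to account for every dimension in which either vector is nonzero, which is exactly why the similarity framing scales better in very high dimensional sparse spaces.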
It's a solid piece of work (though with the stated error percentages, I don't believe that it really proves anything worthwhile at all) -- but your over-interpretations of it are ridiculous. You claim that It is actually showing that you can do something roughly equivalent to growing neural gas (GNG) in a space with something approaching 500,000 dimensions, but you can do it without normally having to deal with more than a few of those dimensions at one time. Collins makes no claims that even remotely resemble this. He *is* taking a deconstructionist approach (which Richard and many others would argue vehemently with) -- but that is virtually the entirety of the overlap between his paper and your claims. Where do you get all this crap about 500,000 dimensions, for example? You also make statements that are explicitly contradicted in the paper. For example, you say But there really seems to be no reason why there should be any limit to the dimensionality of the space in which Collins's algorithm works, because it does not use
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
ED PORTER= The 500K dimensions were mentioned several times in a lecture Collins gave at MIT about his parser. This was probably 5 years ago, so I am not 100% sure the number was 500K, but I am about 90% sure that was the number used, and 100% sure the number was well over 100K. OK. I'll bite. So what do *you* believe these dimensions are? Words? Word pairs? Entire sentences? Different trees?
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Richard, It actually is more valuable than you say. First, the same kernel trick can be used for GNG-type unsupervised learning in high dimensional spaces. So it is not limited to supervised learning. Second, you are correct in saying that through the kernel trick it is doing almost all of its computations in a lower dimensional space. But unlike with many kernel tricks, in this one the system actually directly accesses each of the dimensions in the space, in different combinations as necessary. That is important. It means that you can have a space with as many dimensions as there are features or patterns in your system and still efficiently do similarity matching (but not distance matching). Ed Porter -Original Message- From: Richard Loosemore [mailto:[EMAIL PROTECTED] Sent: Wednesday, December 05, 2007 2:37 PM To: agi@v2.listbox.com Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research] Ed Porter wrote: Mark, MARK WASER=== You claim that It is actually showing that you can do something roughly equivalent to growing neural gas (GNG) in a space with something approaching 500,000 dimensions, but you can do it without normally having to deal with more than a few of those dimensions at one time. Collins makes no claims that even remotely resemble this. He *is* taking a deconstructionist approach (which Richard and many others would argue vehemently with) -- but that is virtually the entirety of the overlap between his paper and your claims. Where do you get all this crap about 500,000 dimensions, for example? ED PORTER= The 500K dimensions were mentioned several times in a lecture Collins gave at MIT about his parser. This was probably 5 years ago, so I am not 100% sure the number was 500K, but I am about 90% sure that was the number used, and 100% sure the number was well over 100K.
The very large size of the number of dimensions was mentioned repeatedly by both Collins and at least one other professor with whom I talked after the lecture. One of the points both emphasized was that by use of the kernel trick he was effectively matching in a 500K dimensional space without having to deal with most of those dimensions at any one time (although, it is my understanding that over many parses the system would deal with a large percent of all those dimensions). It sounds like you may have misunderstood the relevance of the high number of dimensions. Correct me if I am wrong, but Collins is not really matching in large numbers of dimensions; he is using the kernel trick to transform a nonlinear CLASSIFICATION problem into a high-dimensional linear classification. This is just a trick to enable a better type of supervised learning. Would you follow me if I said that using supervised learning is of no use in general? Because it means that someone has already (a) decided on the dimensions of representation in the initial problem domain, and (b) already done all the work of classifying the sentences into syntactically correct and syntactically incorrect. All that the SVM is doing is summarizing this training data in a nice compact form: the high number of dimensions involved at one stage of the problem appears to be just an artifact of the method; it means nothing in general. It especially does not mean that this supervised training algorithm is somehow able to break out and become an unsupervised, feature-discovery method, which it would have to do to be of any general interest. I still have not read Collins's paper: I am just getting this from my understanding of the math you have mentioned here. It seems that whether he mentioned 500K dimensions or an infinite number of dimensions (which he could have done) makes no difference to anything. If you think it does make a big difference, could you explain why?
Richard Loosemore If you read papers on support vector machines using kernel methods, you will realize that it is well known that you can do certain types of matching and other operations in high dimensional spaces without normally having to actually deal in the high dimensions, by use of the kernel trick. The issue is often that of finding a particular kernel that works well for your problem. Collins shows the kernel trick can be extended to parse tree net matching. With regard to my statement that the efficiency of the kernel trick could be applied relatively generally, it is quite well supported by the following text from page 4 of the paper. This paper and previous work by Lodhi et al. [12] examining the application of convolution kernels to strings provide some evidence that convolution kernels may provide an extremely useful tool for applying modern machine learning techniques to highly structured objects. The key idea here is that one may take a structured object and split it up into parts. If one can construct kernels over the parts then one can
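The "split a structured object into parts and build a kernel over the parts" idea quoted above can be illustrated with strings, the other structured object the Lodhi et al. work addresses. What follows is only a toy sketch (a k-gram "spectrum" count, a much simpler relative of their subsequence kernel), not the algorithm from either paper:

```python
# Toy convolution-style kernel over strings: the "parts" are character
# k-grams, and the kernel counts how many parts two strings share -- an
# implicit dot product of k-gram count vectors that is never built explicitly.
from collections import Counter

def kgrams(s, k):
    # multiset of all length-k substrings of s
    return Counter(s[i:i + k] for i in range(len(s) - k + 1))

def spectrum_kernel(s, t, k=3):
    u, v = kgrams(s, k), kgrams(t, k)
    return sum(c * v[g] for g, c in u.items() if g in v)

spectrum_kernel("parse tree", "sparse trees")  # shared 3-grams: 8
```

The same decomposition pattern is what lets the tree version work: replace "k-grams of a string" with "subtree fragments of a parse tree" and the kernel is again a sum over shared parts.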
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Mark, The paper said: Conceptually we begin by enumerating all tree fragments that occur in the training data 1,...,n. Those are the dimensions: all of the parse tree fragments in the training data. And as I pointed out in an email I just sent to Richard, although usually only a small set of them are involved in any one match between two parse trees, they can all be used over a set of many such matches. So the full dimensionality is actually there; it is just that only a particular subset of the dimensions are being used at any one time. And when the system is waiting for the next tree to match, it is potentially capable of matching it against any of its dimensions. Ed Porter -Original Message- From: Mark Waser [mailto:[EMAIL PROTECTED] Sent: Wednesday, December 05, 2007 3:07 PM To: agi@v2.listbox.com Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research] ED PORTER= The 500K dimensions were mentioned several times in a lecture Collins gave at MIT about his parser. This was probably 5 years ago, so I am not 100% sure the number was 500K, but I am about 90% sure that was the number used, and 100% sure the number was well over 100K. OK. I'll bite. So what do *you* believe these dimensions are? Words? Word pairs? Entire sentences? Different trees?
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Dimensions is an awfully odd word for that, since dimensions are normally assumed to be orthogonal. - Original Message - From: Ed Porter [EMAIL PROTECTED] To: agi@v2.listbox.com Sent: Wednesday, December 05, 2007 5:08 PM Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research] Mark, The paper said: Conceptually we begin by enumerating all tree fragments that occur in the training data 1,...,n. Those are the dimensions: all of the parse tree fragments in the training data. And as I pointed out in an email I just sent to Richard, although usually only a small set of them are involved in any one match between two parse trees, they can all be used over a set of many such matches. So the full dimensionality is actually there; it is just that only a particular subset of the dimensions are being used at any one time. And when the system is waiting for the next tree to match, it is potentially capable of matching it against any of its dimensions. Ed Porter -Original Message- From: Mark Waser [mailto:[EMAIL PROTECTED] Sent: Wednesday, December 05, 2007 3:07 PM To: agi@v2.listbox.com Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research] ED PORTER= The 500K dimensions were mentioned several times in a lecture Collins gave at MIT about his parser. This was probably 5 years ago, so I am not 100% sure the number was 500K, but I am about 90% sure that was the number used, and 100% sure the number was well over 100K. OK. I'll bite. So what do *you* believe these dimensions are? Words? Word pairs? Entire sentences? Different trees?
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
<HeavySarcasm>Wow. Is that what dot products are?</HeavySarcasm> You're confusing all sorts of related concepts with a really garbled vocabulary. Let's do this with some concrete 10-D geometry . . . . Vector A runs from (0,0,0,0,0,0,0,0,0,0) to (1,1,0,0,0,0,0,0,0,0). Vector B runs from (0,0,0,0,0,0,0,0,0,0) to (1,0,1,0,0,0,0,0,0,0). Clearly A and B share the first dimension. Do you believe that they share the second and the third dimensions? Do you believe that dropping out the fourth through tenth dimensions in all calculations is some sort of huge conceptual breakthrough? The two vectors are similar in the first dimension (indeed, in all but the second and third) but otherwise very distant from each other (i.e. they are *NOT* similar). Do you believe that these vectors are similar or distant? THE ALLEGATION BELOW THAT I MISUNDERSTOOD THE MATH BECAUSE THOUGHT COLLIN'S PARSER DIDN'T HAVE TO DEAL WITH A VECTOR HAVING THE FULL DIMENSIONALITY OF THE SPACE BEING DEALT WITH IS CLEARLY FALSE. My allegation was that you misunderstood the math because you claimed that Collins's paper does not use an explicit vector representation, while Collins's statements and the math itself make it quite clear that they are dealing with a vector representation scheme. I'm now guessing that you're claiming that you intended "explicit" to mean "full dimensionality." Whatever. Don't invent your own meanings for words and you'll be misunderstood less often (unless you continue to drop out key words like in the capitalized sentence above).
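Mark's 10-D example is easy to check directly; this quick sketch uses nothing beyond the two vectors he gives:

```python
# Mark's two 10-D vectors, both anchored at the origin.
a = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
b = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

dot = sum(x * y for x, y in zip(a, b))                 # only dim 1 contributes
dist = sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5  # Euclidean distance

print(dot)             # -> 1
print(round(dist, 3))  # -> 1.414 (they differ in dims 2 and 3)
```

So the dot product registers only the one shared dimension, while the distance comes entirely from the two dimensions where they disagree.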
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Mark, Your last email started "OK. I'll bite." I guess you didn't bite for very long. We are already back to explicitly marked HeavySarcasm mode. I guess one could argue, as you seem to be doing, that indicating which of 500K dimensions had a match between two subtrees currently being compared could be considered equivalent to explicitly representing a huge 500K-dimensional binary vector -- but I think one could more strongly claim that such an indication would be, at best, only an implicit representation of the 500K vector. THE KEY POINT I WAS TRYING TO GET ACROSS WAS ABOUT NOT HAVING TO EXPLICITLY DEAL WITH 500K TUPLES in each match, which is what I meant when I said not explicitly deal with the high-dimensional vectors. This is a big plus in terms of representational and computational efficiency. I did not say there was nothing equivalent to an implicit use of the high-dimensional vector, because kernels implicitly do use high-dimensional vectors, but they do so implicitly rather than explicitly. That is why they increase efficiency. My Merriam-Webster's Collegiate Dictionary gives as its first, which usually means most common, definition of "explicit" the following: "fully revealed or expressed without vagueness, implication, or ambiguity." The information that the two subtrees to be matched contain a given set of subtrees, defined by their indices, without more, does not by itself define a full 500K vector, nor even the full dimensionality of the vector. That information can only be derived from other information, which presumably is not even used in the match procedure. Of course there are other definitions of the word "explicit" which mean exact, and you could argue that indicating a few of the 500K indices is equivalent to exactly specifying a corresponding 500K-dimensional vector, once one takes into account other information.
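Ed's implicit-versus-explicit distinction can be made concrete. A minimal sketch, with the 500K figure and the index values chosen purely for illustration:

```python
# Illustrative contrast: the explicit and implicit forms of the same
# sparse 500K-dimensional binary vector (index values hypothetical).

N_DIMS = 500_000
nonzero = {17, 5023, 499_998}

# Explicit: one entry per dimension, almost all of them zero.
explicit = [0] * N_DIMS
for i in nonzero:
    explicit[i] = 1

# Implicit: just the nonzero indices -- the only thing a kernel-style
# matcher actually touches during a comparison.
implicit = nonzero

assert sum(explicit) == len(implicit) == 3
assert all(explicit[i] == 1 for i in implicit)
print(len(explicit), len(implicit))  # -> 500000 3
```

Both encode the same vector; the dispute in the thread is only over whether the three-index form deserves the word "explicit."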
When a use of a word in a given statement has two interpretations, one of which is correct, it is not clear one has the right to attack the person making that statement for being incorrect. At most you can attack him for being ambiguous. And normally on this list people do not attack other people as rudely as you have attacked me for merely being ambiguous. Ed Porter -Original Message- From: Mark Waser [mailto:[EMAIL PROTECTED] Sent: Wednesday, December 05, 2007 3:40 PM To: agi@v2.listbox.com Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research] [snip]
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
They need not be. -Original Message- From: Mark Waser [mailto:[EMAIL PROTECTED] Sent: Wednesday, December 05, 2007 6:04 PM To: agi@v2.listbox.com Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research] Dimensions is an awfully odd word for that since dimensions are normally assumed to be orthogonal. [snip]
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
On 12/5/07, Matt Mahoney [EMAIL PROTECTED] wrote: [snip] Centralized search is limited to a few big players that can keep a copy of the Internet on their servers. Google is certainly useful, but imagine if it searched a space 1000 times larger and if posts were instantly added to its index, without having to wait days for its spider to find them. Imagine your post going to persistent queries posted days earlier. Imagine your queries being answered by real human beings in addition to other peers. I probably won't be the one writing this program, but where there is a need, I expect it will happen. Wikia, the company run by Wikipedia founder Jimmy Wales, is tackling the Internet-scale distributed search problem - http://search.wikia.com/wiki/Atlas Connecting to related threads (some recent, some not-so-recent), the Grub distributed crawler ( http://search.wikia.com/wiki/Grub ) is intended to be one of many plug-in Atlas Factories. A development goal for Grub is to enhance it with an NL toolkit (e.g. the soon-to-be-released RelEx), so it can do more than parse simple keywords and calculate statistical word relationships. -dave
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
From: Matt Mahoney [mailto:[EMAIL PROTECTED] My design would use most of the Internet (10^9 P2P nodes). Messages would be natural language text strings, making no distinction between documents, queries, and responses. Each message would have a header indicating the ID and time stamp of the originator and any intermediate nodes through which the message was routed. A message could also have attached files. Each node would have a cache of messages and its own policy on which messages it decides to keep or discard. The goal of the network is to route messages to other nodes that store messages with matching terms. To route an incoming message x, a node matches terms in x to terms in stored messages and sends copies to nodes that appear in those headers, appending its own ID and time stamp to the header of the outgoing copies. It also keeps a copy, so that the receiving nodes know that it has a copy of x (at least temporarily). The network acts as a distributed database with a distributed search function. If X posts a document x and Y posts a query y with matching terms, then the network acts to route x to Y and y to X. The very tricky but required part of creating a global network like this is going from zero nodes to whatever the goal is. I think that much emphasis of a design needs to be put into the growth function. If you have 50 nodes running, how do you get to 500? And 500 to 5,000? And then if it goes down from 50,000 to 10,000 fast, how is it revived before a crash? Engineering expertise, ingenuity + maybe psychological and sociological wisdom can be used to make this happen. And we all know that the growth could happen quickly, even overnight. Then once getting to 10^9 nodes, they have to be maintained or they can die quickly and even instantaneously. Having an intelligent botnet has its advantages. Once it's running and users try to uninstall it, the botnet can try to fight for survival by reasoning with the users.
You could make it such that a user has to verbally communicate with it to remove it. The botnet could stall and ask things like "Why are you doing this to me after all I have done for you?" User: sorry charlie, I command you to uninstall! Bot: OK let's cut a deal... I know we can work this out... John
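The term-matching routing rule in Matt's design can be sketched as a toy function. The cache layout and node names below are invented for illustration; nothing here comes from a real implementation.

```python
# Toy sketch of term-based routing: forward an incoming message toward
# any node recorded in the header of a cached message that shares at
# least one term with it.  Cache structure and node IDs are hypothetical.

def route(incoming_terms, cache):
    """cache: list of (node_ids_in_header, stored_terms) pairs.
    Returns the set of node IDs the incoming message should be copied to."""
    targets = set()
    for node_ids, stored_terms in cache:
        if incoming_terms & stored_terms:   # any matching term?
            targets |= node_ids             # forward toward those nodes
    return targets

cache = [
    ({"nodeA"}, {"distributed", "search"}),
    ({"nodeB", "nodeC"}, {"parser", "kernel"}),
]
print(route({"distributed", "hashing"}, cache))  # -> {'nodeA'}
```

A real node would also append its own ID and time stamp to the outgoing copies and keep a copy itself, as the design describes.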
Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])
--- Ed Porter [EMAIL PROTECTED] wrote: Matt, Perhaps you are right. But one problem is that big Google-like compuplexes in the next five to ten years will be powerful enough to do AGI, and they will be much more efficient for AGI search because the physical closeness of their machines will make it possible for them to perform the massive interconnect needed for powerful AGI much more efficiently. Google controls about 0.1% of the world's computing power. But I think their ability to achieve AGI first will not be so much due to the high bandwidth of their CPU cluster, as that nobody controls the other 99.9%. Centralized search tends to produce monopolies as the cost of entry goes up. It is not so bad now because Google still has a (dwindling) set of competitors. They can't yet hide content that threatens them. Distributed search like Wikia/Atlas/Grub is interesting, but if people don't see a compelling need for it, it won't happen. How big will it have to get before it is better than Google? File sharing networks would probably be a lot bigger and more useful (with mostly legitimate content) if we could solve the distributed search problem. -- Matt Mahoney, [EMAIL PROTECTED]
Distrubuted message pool (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])
--- John G. Rose [EMAIL PROTECTED] wrote: From: Matt Mahoney [mailto:[EMAIL PROTECTED] My design would use most of the Internet (10^9 P2P nodes). [snip] The very tricky but required part of creating a global network like this is going from zero nodes to whatever the goal is. I think that much emphasis of a design needs to be put into the growth function. If you have 50 nodes running, how do you get to 500? And 500 to 5,000? And then if it goes down from 50,000 to 10,000 fast, how is it revived before a crash? Engineering expertise, ingenuity + maybe psychological and sociological wisdom can be used to make this happen. And we all know that the growth could happen quickly, even overnight. Getting the network to grow means providing enough incentive that people will want to install your software. A distributed message pool offers two services: distributed search and a message posting service.
Information has negative value, so it is the second service that provides the incentive. You type your message into a client window, and it instantly becomes available to anyone who enters a query with matching terms. "Then once getting to 10^9 nodes they have to be maintained or they can die quickly and even instantaneously." How? A peer would be a piece of software that people would use every day, like a web browser or email. People aren't going to suddenly decide to uninstall them all at once or turn off their computers. One possible scenario is a virus or worm spreading quickly from peer to peer. Hopefully there will be a wide variety of peers offering different services, so that individual vulnerabilities could affect only a small part of the network. "Having an intelligent botnet has its advantages. Once it's running and users try to uninstall it the botnet can try to fight for survival by reasoning with the users. You could make it such that a user has to verbally communicate with it to remove it. The botnet could stall and ask things like Why are you doing this to me after all I have done for you? User: sorry charlie, I command you to uninstall! Bot: OK let's cut a deal... I know we can work this out..." Well, I expect the intelligence to come from having a large number of specialized but relatively dumb peers, and a network that can direct your queries to the right ones. Peers would individually be under the control of their human owners, just as web servers and clients are now. It's not like you could command the Internet to uninstall anyway. Eventually we will need to deal with the problem of the network becoming smarter than us, but I think the threshold of concern is when the collective computing power in silicon exceeds the collective computing power in carbon. Right now the Internet has about as much computing power as a few hundred human brains, but we still have a ways to go to the singularity.
-- Matt Mahoney, [EMAIL PROTECTED]
RE: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research])
I have a lot of respect for Google, but I don't like monopolies, whether it is Microsoft or Google. I think it is vitally important that there be several viable search competitors. I wish this wiki one luck. As I said, it sounds a lot like your idea. Ed Porter -Original Message- From: Matt Mahoney [mailto:[EMAIL PROTECTED] Sent: Wednesday, December 05, 2007 9:24 PM To: agi@v2.listbox.com Subject: Distributed search (was RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]) [snip]
-- Matt Mahoney, [EMAIL PROTECTED]
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Bryan, The name grub sounds familiar. That is probably it. Ed -Original Message- From: Bryan Bishop [mailto:[EMAIL PROTECTED] Sent: Monday, December 03, 2007 10:47 PM To: agi@v2.listbox.com Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research] On Thursday 29 November 2007, Ed Porter wrote: Somebody (I think it was David Hart) told me there is a shareware distributed web crawler already available, but I don't know the details, such as how good or fast it is. http://grub.org/ Previous owner went by the name of 'kordless'. I found him on Slashdot. - Bryan
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
RICHARD LOOSEMORE= You have no idea of the context in which I made that sweeping dismissal. If you have enough experience of research in this area you will know that it is filled with bandwagons, hype and publicity-seeking. Trivial models are presented as if they are fabulous achievements when, in fact, they are just engineered to look very impressive but actually solve an easy problem. Have you had experience of such models? Have you been around long enough to have seen something promoted as a great breakthrough even though it strikes you as just a trivial exercise in public relations, and then watch history unfold as the great breakthrough leads to absolutely nothing at all, and is then quietly shelved by its creator? There is a constant ebb and flow of exaggeration and retreat, exaggeration and retreat. You are familiar with this process, yes? ED PORTER= Richard, the fact that a certain percent of theories and demonstrations are false and/or misleading does not give you the right to dismiss any theory or demonstration that counters your position in an argument as trivial exercises in public relations, designed to look really impressive, and filled with hype designed to attract funding, which actually accomplish very little without at least giving some supporting argument for your dismissal. Otherwise you could deny any aspect of scientific, mathematical, or technological knowledge, no matter how sound, that proved inconvenient to whatever argument you were making. There are people who argue in that dishonest fashion, but it is questionable how much time one should spend conversing with them. Do you want to be such a person? The fact that one of the pieces of evidence you so rudely dismissed is a highly functional program that has been used by many other researchers, shows the blindness with which you dismiss the arguments of others. RICHARD LOOSEMORE=This entire discussion baffles me. 
Does it matter at all to you that I have been working in this field for decades? Would you go up to someone at your local university and tell them how to do their job? Would you listen to what they had to say about issues that arise in their field of expertise, or would you consider your own opinion entirely equal to theirs, with only a tiny fraction of their experience? ED PORTER= No matter how many years you have been studying something, if your argumentative and intellectual approach is to dismiss evidence contrary to your position on clearly false bases, as you did with your dismissal of my evidence with your above-quoted insult, a serious question is raised as to whether you are worth listening to or conversing with. ED PORTER -Original Message- From: Richard Loosemore [mailto:[EMAIL PROTECTED] Sent: Monday, December 03, 2007 10:47 PM To: agi@v2.listbox.com Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research] Ed Porter wrote: I'm sorry, but this is not addressing the actual issues involved. You are implicitly assuming a certain framework for solving the problem of representing knowledge ... and then all your discussion is about whether or not it is feasible to implement that framework (to overcome various issues to do with searches that have to be done within that framework). But I am not challenging the implementation issues, I am challenging the viability of the framework itself. ED PORTER= So what is wrong with my framework? What is wrong with a system of recording patterns, and a method for developing compositions and generalities from those patterns, in multiple hierarchical levels, and for indicating the probabilities of certain patterns given certain other patterns, etc.? I know it doesn't genuflect before the altar of complexity. But what is wrong with the framework other than the fact that it is at a high level and thus does not explain every little detail of how to actually make an AGI work?
RICHARD LOOSEMORE= These models you are talking about are trivial exercises in public relations, designed to look really impressive, and filled with hype designed to attract funding, which actually accomplish very little. Please, Ed, don't do this to me. Please don't try to imply that I need to open my mind any more. The implication seems to be that I do not understand the issues in enough depth, and need to do some more work to understand your points. I can assure you this is not the case. ED PORTER= Shastri's Shruti is a major piece of work. Although it is a highly simplified system, for its degree of simplification it is amazingly powerful. It has been very helpful to my thinking about AGI. Please give me some excuse for calling it a trivial exercise in public relations. I certainly have not published anything as important. Have you? The same for Mike Collins's parsers which, at least several years ago I was told
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
John, I am sure there is interesting stuff that can be done. It would be interesting just to see what sort of an AGI could be made on a PC. I would be interested in your ideas for how to make a powerful AGI without a vast amount of interconnect. The major schemes I know about for reducing interconnect involve allocating what interconnect you have to the links with the highest probability or importance, varying those measures of probability and importance in a context-specific way, and being guided by prior similar experiences. Ed Porter -Original Message- From: John G. Rose [mailto:[EMAIL PROTECTED] Sent: Tuesday, December 04, 2007 1:42 AM To: agi@v2.listbox.com Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research] Ed, Well it'd be nice having a supercomputer, but P2P is a poor man's supercomputer and beggars can't be choosy. Honestly the type of AGI that I have been formulating in my mind has not been at all closely related to simulating neural activity through orchestrating partial and mass activations at low frequencies, and I had been avoiding those contagious cog-sci memes on purpose. But your exposé on the subject is quite interesting, and I wasn't aware that that is how things have been being done. But getting more than a few thousand P2P nodes is difficult. Going from 10K to 20K nodes and up gets more difficult, to the point of being prohibitively expensive, impossible, or a matter of extreme luck. There are ways to do it, but according to your calculations the supercomputer may be the wiser choice, as going out and scrounging up funding for that would be easier. Still though (besides working on my group-theory-heavy design), exploring the crafting and chiseling of the activation model you are talking about onto the P2P network could be fruitful.
I feel that through a number of up front and unfortunately complicated design changes/adaptations that the activation orchestrations could be improved thus bringing down the message rate requirements, reducing activation requirements, depths and frequencies, through a sort of computational resource topology consumption, self-organizational design molding. You do indicate some dynamic resource adaption and things like intelligent inference guiding schemes in your description but it doesn't seem like it melts enough into the resource space. But having a design be less static risks excessive complications... A major problem though with P2P and the activation methodology is that there are so many variances in the latencies and availability that serious synchronicity/simultaneity issues would exist that even more messaging might be required. Since there are so many variables in public P2P, empirical data also would be necessary to get a gander on feasibility. I still feel strongly that the way to do AGI P2P (with public P2P as core not augmental) is to understand the grid, and build the AGI design based on that and what it will be in a few years, instead of taking a design and morphing it to the resource space. That said, there are finite designs that will work so the number of choices is few. John _ From: Ed Porter [mailto:[EMAIL PROTECTED] Sent: Monday, December 03, 2007 6:17 PM To: agi@v2.listbox.com Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research] John, You raised some good points. The problem is that the total number of messages/sec that can be received is relatively small. It is not as if you are dealing with a multidimensional grid or toroidal net in which spreading tree activation can take advantage of the fact that the total parallel bandwidth for regional messaging can be much greater than the x-sectional bandwidth. 
In a system where each node is a server-class node with multiple processors and 32 or 64 GBytes of RAM, much of which is allocable to representation, sending messages to local indices on each machine could fairly efficiently activate all occurrences of something in a 32 to 64 TByte knowledge base with a max of 1K internode messages, if there were only 1K nodes. But in a PC-based P2P system the ratio of nodes to representation space is high, and the total number of 128-byte messages/sec that can be received is limited to about 100, so neither method of trying to increase the number of patterns that can be activated with the given interconnect of the network buys you as much. Human-level context sensitivity arises because a large number of things that can depend on a large number of things in the current context are made aware of those dependencies. This takes a lot of messaging, and I don't see how a P2P system where each node can only receive about 100 relatively short messages a second is going to make this possible unless you had a huge
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Ed Porter wrote: [snip] Ed, You are misunderstanding this situation.
You repeatedly make extremely strong statements about the subject matter of AGI, but you do not have enough knowledge of the issues to understand the replies you get. Now, there is nothing wrong with not understanding, but what happens next is quite intolerable: you argue back as if your opinion were just as valid as the hard-won knowledge that someone else took 25 years to acquire. Not only that, but you go on to sprinkle your comments with instructions to that person to open their mind, as if they were somehow being closed-minded. AND not only that, but when I display some impatience with this behavior and decline to write a massive essay to explain stuff that you should be learning for yourself, you decide to fling out accusations such as that I am arguing in a dishonest manner, or that I am dismissing an argument or theory just because it counters my position.

If you look at the broad sweep of my postings on these lists you will notice that I spend much more time than I should writing out explanations when people say that they find something I wrote confusing or incomplete. When someone starts behaving rudely, however, I lose patience. What you are experiencing now is lost patience, that is all. Richard Loosemore - This list is sponsored by AGIRI: http://www.agiri.org/email To unsubscribe or change your options, please go to: http://v2.listbox.com/member/?member_id=8660244id_secret=71815518-2fa3ba
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Richard, It is not clear how valuable your 25 years of hard-won learning is if it causes you to dismiss valuable scientific work that seems to have eclipsed the importance of anything I or you have published as trivial exercises in public relations, without giving any reason whatsoever for the particular dismissal. I welcome criticism in this forum provided it is well reasoned and without venom. But to dismiss a list of examples I give to support an argument as trivial exercises in public relations, without any justification other than the fact that in general a certain number of published papers are inaccurate and/or overblown, is every bit as dishonest as calling someone a liar with regard to a particular statement based on nothing more than the knowledge that some people are liars.

In my past exchanges with you, sometimes your responses have been helpful. But I have noticed that although you are very quick to question me (and others), if I question you, rather than respond directly to my arguments you often don't respond to them at all -- such as your recent refusal to justify your allegation that my whole framework, presumably for understanding AGI, was wrong (a pretty insulting statement which should not be flung around without some justification). Or if you do respond to challenges, you often dismiss them as invalid without any substantial evidence, or you substantially change the subject, such as by focusing on one small part of my argument that I have not yet fully supported, while refusing to acknowledge the major support I have shown for the major thrust of my argument. When you argue like that there really is no purpose in continuing the conversation. What's the point? Under those circumstances you're not dealing with someone who is likely to tell you anything of worth. 
Rather, you are only likely to hear lame defensive arguments from somebody who is either incapable of properly defending or unwilling to properly defend their arguments, and who is thus unlikely to communicate anything of value in the exchange. Your 25 years of experience doesn't mean squat about how much you truly understand AGI unless you are capable of being more intellectually honest, both with yourself and with others -- and unless you are capable of actually reasonably defending your understandings, head-on, against reasoned questioning and countering evidence. To dismiss counter-evidence cited against your arguments as trivial exercises in public relations without any specific justification is not a reasonable defense, and the fact that you so often resort to such intellectually dishonest tactics to defend your stated understandings relating to AGI really does call into question the quality of those understandings. In summary, don't go around attacking other people's statements unless you are willing to defend those attacks in an intellectually honest manner. Ed Porter

P.S. This is my last response in this thread. You can have the last say if you so wish.
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
From: Ed Porter [mailto:[EMAIL PROTECTED] John, I am sure there is interesting stuff that can be done. It would be interesting just to see what sort of an AGI could be made on a PC.

Yes, it would be interesting to see what could be done on a small cluster of modern server-grade computers. I like to think about the newer Penryn 45nm, SSE4, quad-core quad-proc servers with lots of FB DDR3 800MHz RAM running a 64-bit OS (sorry, I prefer coding in Windows), using standard gigabit Ethernet quad NICs, with solid state drives and 15,000 RPM SAS for the slower stuff -- and take maybe 10 of these servers. There HAS to be enough resource there to get some small prototype going. And look at next year's 8-core Nehalem procs coming out... Interserver messaging should make heavy use of IP multicasting. Then another messaging channel with the new USB 3.0... Supposedly USB 3.0 is 4.8 gigabits.

I would be interested in your ideas for how to make a powerful AGI without a vast amount of interconnect. The major schemes I know about for reducing interconnect involve allocating what interconnect you have to the links with the highest probability or importance, varying those measures of probability and importance in a context-specific way, and being guided by prior similar experiences.

Well, I actually don't have the theory far enough along to calculate interconnect metrics. But I try to minimize that through storage structure. What gets stored, how it gets stored, where it's stored, how systems are modeled, what a model is, what a system of models is, how systems of models are stored... don't store dupes, store diffs... mixing code and data, collapsing data into code, what is code and what is data? Basically a lot of intelligent indexing, like real intelligent indexing... I'm working on using CAs as universal symbolistic indexers and generators - IOW exploring a theory of uncalculated precalcs for computational complexity indexing using CAs in order to control uncertainty and manage complexity... 
Lots of addicting brain candy stuff... John
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
John, As you say, the hardware is just going to get better and better. In five years the PCs of most of the people on this list will probably have at least 8 cores and 16 gig of RAM. But even with a current 32-bit PC with, say, 4G of RAM you should be able to build an AGI that would be a meaningful proof of concept.

Let's say 3G is for representation, at say 60 bytes per atom (less than my usual 100 bytes/atom because of using 32-bit pointers); that would allow you roughly 50 million atoms. Over 1 million seconds (very roughly two weeks 24/7) that would allow an average of 50 atoms a second of representation. Of course your short-term memory would record at a much higher frequency, and over time more and more of your representation would go into models rather than episodic recording. But as this happened the vocabulary of patterns would grow, and thus one atom, on average, would be able to represent more.

But it seems to me such an AGI should be able to have meaningful world knowledge about certain simple worlds, or certain simple subparts of the world. For example, it should be able to have a pretty good model of the world of many early video games, such as Pong and perhaps even Pac-Man (it's been so long since I've seen Pac-Man that I don't know how complex it is, but I am assuming 50 million atoms, many of which, over time, would represent complex patterns, would be able to catch most of the meaningful generalizations of Pac-Man, including its control mechanisms and the results they cause).

As I said in an earlier email, if we want AGI-at-Home to catch on, it would be valuable to think of some sort of application that would either inspire through importance or entice by usefulness or amusement to cause people to let it use a substantial part of their machine cycles. You mention an interest in intelligent indexing. 
Of course, hierarchical memory provides a fairly good form of intelligent indexing, in the sense that it automatically promotes indexing through learned combinations of indices, and can easily be made to have probabilistic and importance weights on its index links to more efficiently allocate index activations. How does your intelligent indexing work? Ed Porter
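The atom-budget arithmetic in Ed's message above checks out; a few lines make the figures explicit (all assumptions are his: 3 GB for representation, 60 bytes per atom, a 10^6-second run):

```python
# Checking the atom-budget arithmetic from the message above.
# All figures are the post's own assumptions.

representation_bytes = 3 * 10**9   # 3 GB of a 4 GB 32-bit PC reserved for representation
bytes_per_atom = 60                # below the usual 100 bytes/atom, thanks to 32-bit pointers

atoms = representation_bytes // bytes_per_atom
print(atoms)                       # 50,000,000 atoms

seconds = 10**6                    # roughly two weeks of 24/7 operation
atoms_per_second = atoms // seconds
print(atoms_per_second)            # an average of 50 new atoms per second
```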
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
From: Ed Porter [mailto:[EMAIL PROTECTED] But even with a current 32-bit PC with, say, 4G of RAM you should be able to build an AGI that would be a meaningful proof of concept. Let's say 3G is for representation, at say 60 bytes per atom (less than my usual 100 bytes/atom because of using 32-bit pointers); that would allow you roughly 50 million atoms. Over 1 million seconds (very roughly two weeks 24/7) that would allow an average of 50 atoms a second of representation. Of course your short-term memory would record at a much higher frequency, and over time more and more of your representation would go into models rather than episodic recording. But as this happened the vocabulary of patterns would grow, and thus one atom, on average, would be able to represent more. But it seems to me such an AGI should be able to have meaningful world knowledge about certain simple worlds, or certain simple subparts of the world. For example, it should be able to have a pretty good model of the world of many early video games, such as Pong and perhaps even Pac-Man.

Yes, I can imagine this. But how much information would be in each 60-byte atom? Is it a pointer to a pattern stored on disk, or is it some sort of index, or is it a portion of a pattern, or is it a full pattern in a simple Pac-Man-type world?

As I said in an earlier email, if we want AGI-at-Home to catch on, it would be valuable to think of some sort of application that would either inspire through importance or entice by usefulness or amusement to cause people to let it use a substantial part of their machine cycles.

Well, I can't elaborate publicly, but I actually have this application running, still in pre-alpha mode... ahh.. 
but I have to sell this thing, enabling me to buy R&D time to potentially convert it to a proto-AGI... so no open source on that one :( BUT there are many other applications that could be the delivery mechanism. There are a number of ways to do it... one way was discussed earlier where you sell your PC resources. That is a good idea!

You mention an interest in intelligent indexing. Of course, hierarchical memory provides a fairly good form of intelligent indexing, in the sense that it automatically promotes indexing through learned combinations of indices, and can easily be made to have probabilistic and importance weights on its index links to more efficiently allocate index activations. How does your intelligent indexing work?

Well, I can describe it briefly. There are two basic types of virtual indexing; for the actual disk-based indexing I'm still trying to use a DBMS, since they do it so well. The first type is based on algebraic structure decomposition. I see everything as algebraic structure; an AGI computer can do the same, but way better. When everything is converted to algebraic structure, things become very index-friendly -- in fact so friendly that it looks like many things collapse or telescope down. The other type of indexing, which I just started working on, is CA-based universal symbolistic generation/indexing. Algebraic structure is good for the skeltoidal, but you need some filler. CAs seem like they can do the trick. The thing with CAs is that they can be indexed based on uncalculated values. If a CA structure is so darn complex, why waste the cycles calculating it? The CAs have infinite symbolistic properties, only a portion of which need be calculated (take up resources). The linking of the algebraic structure indexing with CA indexing I'm trying to smooth out with group semiautomata, but a lot of magic still happens there :) So that's it without getting too into details. Very primitive still ... John
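John leaves the CA-based indexing scheme unspecified, so no faithful implementation is possible from these posts. Purely as a toy illustration of one ingredient -- a cellular automaton used as a deterministic symbol generator that is only computed when, and as far as, needed -- one could derive index bits for a key by evolving an elementary CA from a key-seeded state. Every detail here (the choice of rule 110, the 64-cell width, the fold to an integer) is an invented assumption, not John's design:

```python
def rule110_step(state):
    """One step of elementary CA rule 110 on a tuple of 0/1 cells (wrapping edges)."""
    n = len(state)
    rule = {(1,1,1): 0, (1,1,0): 1, (1,0,1): 1, (1,0,0): 0,
            (0,1,1): 1, (0,1,0): 1, (0,0,1): 1, (0,0,0): 0}
    return tuple(rule[(state[(i - 1) % n], state[i], state[(i + 1) % n])]
                 for i in range(n))

def ca_index(key, width=64, steps=32):
    """Deterministic index bits for `key`, derived by evolving a CA from a
    state seeded with the key's bytes. Purely illustrative; the evolution
    could be deepened lazily ('uncalculated values') if more bits were needed."""
    seed = [0] * width
    for i, byte in enumerate(key.encode()):
        for j in range(8):
            seed[(i * 8 + j) % width] ^= (byte >> j) & 1
    state = tuple(seed)
    for _ in range(steps):
        state = rule110_step(state)
    # Fold the final row into an integer index.
    return sum(bit << i for i, bit in enumerate(state))

print(ca_index("sock") == ca_index("sock"))   # deterministic: True
print(ca_index("sock"), ca_index("chair"))    # distinct keys, (almost certainly) distinct indices
```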
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
--- Ed Porter [EMAIL PROTECTED] wrote: Matt, in my Mon 12/3/2007 8:17 PM post to John Rose, from which you are probably quoting below, I discussed the bandwidth issues. I am assuming nodes directly talk to each other, which is probably overly optimistic, but they are still limited by the fact that each node can only receive somewhere roughly around 100 128-byte messages a second. Unless you have a really big P2P system, that just isn't going to give you much bandwidth. If you had 100 million P2P nodes it would. Thus, a key issue is how many participants an AGI-at-Home P2P system is going to get.

My design would use most of the Internet (10^9 P2P nodes). Messages would be natural language text strings, making no distinction between documents, queries, and responses. Each message would have a header indicating the ID and time stamp of the originator and any intermediate nodes through which the message was routed. A message could also have attached files. Each node would have a cache of messages and its own policy on which messages it decides to keep or discard. The goal of the network is to route messages to other nodes that store messages with matching terms. To route an incoming message x, a node matches terms in x to terms in stored messages and sends copies to nodes that appear in those headers, appending its own ID and time stamp to the header of the outgoing copies. It also keeps a copy, so that the receiving nodes know that it has a copy of x (at least temporarily). The network acts as a distributed database with a distributed search function. If X posts a document x and Y posts a query y with matching terms, then the network acts to route x to Y and y to X.

I mean, what would motivate the average American, or even the average computer geek, to turn over part of his computer to it? 
It might not be an easy sell for more than several hundred or several thousand people, at least until it could do something cool, like index their videos for them, be a funny chat bot, or something like that.

The value is the ability to post messages that can be found by search, without having to create a website. Information has negative value; people will trade CPU resources for the ability to advertise.

In addition to my last email, I don't understand what you were saying below about complexity. Are you saying that as a system becomes bigger it naturally becomes unstable, or what?

When a system's Lyapunov exponent (or its discrete approximation) becomes positive, it becomes unmaintainable. This is solved by reducing its interconnectivity. For example, in software we use scope, data abstraction, packages, protocols, etc. to reduce the degree to which one part of the program can affect another. This allows us to build larger programs. In a message passing network, the critical parameter is the ratio of messages out to messages in. The ratio cannot exceed 1 on average. Each node can have its own independent policy for prioritizing messages, but will probably send messages at a nearly constant maximum rate regardless of the input rate. This reaches equilibrium at a ratio of 1, but it would also allow rare but important messages to propagate to a large number of nodes. All critically balanced complex systems are subject to rare but significant events, for example software (state changes and failures), evolution (population explosions, plagues, and mass extinctions), and gene regulatory networks (cell differentiation). -- Matt Mahoney, [EMAIL PROTECTED]
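Matt's routing rule is concrete enough to sketch. The following single-process toy implements just the forwarding decision he describes -- forward an incoming message to every node named in the headers of cached messages that share a term with it, stamping our own ID on the way. Class and field names are invented, and time stamps, attachments, and retention policy are omitted:

```python
# Toy sketch of term-matching message routing, after the design described
# above. One Node object stands in for a single peer's routing decision.

from dataclasses import dataclass

@dataclass
class Message:
    text: str
    header: list  # IDs of the originator and relay nodes, in order

    def terms(self):
        return set(self.text.lower().split())

class Node:
    def __init__(self, node_id):
        self.node_id = node_id
        self.cache = []  # kept messages (each node sets its own policy)

    def route(self, msg):
        """Return the set of node IDs that should get a copy of `msg`."""
        targets = set()
        for cached in self.cache:
            if msg.terms() & cached.terms():   # any matching term
                targets.update(cached.header)  # nodes known to have seen it
        msg.header.append(self.node_id)        # stamp our own ID on the copy
        self.cache.append(msg)                 # keep a copy ourselves
        targets.discard(self.node_id)
        return targets

# A document arrives first; a later query with a matching term
# is routed back toward the document's originator.
n = Node("N1")
n.route(Message("cheap quantum widgets", header=["X"]))
out = n.route(Message("widgets wanted", header=["Y"]))
print(out)  # {'X'}
```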
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
MATT MAHONEY= My design would use most of the Internet (10^9 P2P nodes).

ED PORTER= That's ambitious. Easier said than done unless you have a Google, Microsoft, or mass popular movement backing you.

ED PORTER= I mean, what would motivate the average American, or even the average computer geek, to turn over part of his computer to it?...

MATT MAHONEY= The value is the ability to post messages that can be found by search, without having to create a website. Information has negative value; people will trade CPU resources for the ability to advertise.

ED PORTER= It sounds theoretically possible. But actually making it happen in a world with so much competition for mind and machine share might be quite difficult. Again, it is something that would probably require a major force of the type I listed above to make it happen.

ED PORTER= Are you saying that as a system becomes bigger it naturally becomes unstable, or what?

MATT MAHONEY= When a system's Lyapunov exponent (or its discrete approximation) becomes positive, it becomes unmaintainable. This is solved by reducing its interconnectivity. For example, in software we use scope, data abstraction, packages, protocols, etc. to reduce the degree to which one part of the program can affect another. This allows us to build larger programs. In a message passing network, the critical parameter is the ratio of messages out to messages in. The ratio cannot exceed 1 on average.

ED PORTER= Thanks for the info. By unmaintainable, what do you mean? I don't understand why more messages coming in than going out creates a problem, unless most of what nodes do is relay messages, which is not what they do in my system. The unruly chaotic side of AGI is not something I have thought much about. I have tried to design my system to largely avoid it. So this is something I don't know much about, although I have thought a fair amount about net congestion, which can be very dynamic, and that sounds like it is related to what you are talking about. 
I have tried to design my system as a largely asynchronous messaging system, so most processes are relatively loosely linked, as browsers and servers generally are on the Internet. As such, the major type of instability I have worried about is that of network traffic congestion, such as if all of a sudden many nodes want to talk to the same node -- both for computer nodes and pattern nodes. I WOULD BE INTERESTED IN ANY THOUGHTS ON THE OTHER TYPES OF DYNAMIC INSTABILITIES A HIERARCHICAL MEMORY SYSTEM -- WITH PROBABILISTIC INDEX-BASED SPREADING ACTIVATION -- MIGHT HAVE. Matt, it sounds as if, should OpenCog ever try to build a large P2P network, you're the man. Ed Porter
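On Ed's question about dynamic instabilities: Matt's out/in ratio criterion can be illustrated with a toy branching-process calculation. If each received message triggers on average r outgoing copies, the total traffic spawned by one seed message stays bounded for r < 1 and grows without bound for r > 1. This is an illustration of the criterion only, not a simulation of either design:

```python
# Toy branching-process view of Matt's stability criterion: the average
# out/in message ratio r must not exceed 1, or traffic explodes.

def total_traffic(r, generations=20):
    """Expected total messages spawned by one seed message after
    `generations` rounds of forwarding, with mean fan-out `r`."""
    in_flight, total = 1.0, 1.0
    for _ in range(generations):
        in_flight *= r      # each in-flight message spawns r copies on average
        total += in_flight
    return total

print(round(total_traffic(0.9)))  # 9: converging toward the finite limit 1/(1-r) = 10
print(round(total_traffic(1.1)))  # 64: and still growing without bound
```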
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Ed Porter wrote: [...] In summary, don't go around attacking other people's statements unless you are willing to defend those attacks in an intellectually honest manner.

I confess, I would rather that I had not so quickly dismissed those researchers you mentioned - mostly because my motivation at the time was to dismiss the exaggerated value that *you* placed on these results. But let me explain the reason why I still feel that it was valid to dismiss them. They are examples of a category of research that addresses issues that are completely compromised by the lack of solutions to other issues. Thus: building a NL parser, no matter how good it is, is of no use whatsoever unless it can be shown to emerge from (or at least fit with) a learning mechanism that allows the system itself to generate its own understanding (or, at least, acquisition) of grammar IN THE CONTEXT OF A MECHANISM THAT ALSO ACCOMPLISHES REAL UNDERSTANDING. 
When that larger issue is dealt with, a NL parser will arise naturally, and any previous work on non-developmental, hand-built parsers will be completely discarded. You were trumpeting the importance of work that I know will be thrown away later, and in the meantime will be of no help in resolving the important issues.

Now, I am harsh about these researchers not because they in particular were irresponsible, but because they are part of a tradition in which everyone is looking for cheap results that superficially appear good to peer reviewers, so they can get things published, so they can get more research grants, so they can get higher salaries. There is an appallingly high incidence of research that is carried out because it fits the ideal paper-publication template, not because the work itself addresses important issues. This is a kind of low-level academic corruption, and I will continue to call it what it is, even if you don't have the slightest idea that this corruption exists. It was towards *that* issue that my criticism was directed. I would have been perfectly happy to explain this to you before, but instead of appreciating where I was coming from, you launched into a tirade about my dishonesty and stupidity in rejecting papers
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Thus: building a NL parser, no matter how good it is, is of no use whatsoever unless it can be shown to emerge from (or at least fit with) a learning mechanism that allows the system itself to generate its own understanding (or, at least, acquisition) of grammar IN THE CONTEXT OF A MECHANISM THAT ALSO ACCOMPLISHES REAL UNDERSTANDING. When that larger issue is dealt with, a NL parser will arise naturally, and any previous work on non-developmental, hand-built parsers will be completely discarded. You were trumpeting the importance of work that I know will be thrown away later, and in the mean time will be of no help in resolving the important issues. Richard, you discount the possibility that said NL parser will play a key role in the adaptive emergence of a system that can generate its own linguistic understanding. I.e., you discount the possibility that, with the right learning mechanism and instructional environment, hand-coded rules may serve as part of the initial seed for a learning process that will eventually generate knowledge obsoleting these initial hand-coded rules. It's fine that you discount this possibility -- I just want to point out that in doing so, you are making a bold and unsupported theoretical hypothesis, rather than stating an obvious or demonstrated fact. Vaguely similarly, the grammar of child language is largely thrown away in adulthood, yet it was useful as scaffolding in leading to the emergence of adult language. -- Ben G - This list is sponsored by AGIRI: http://www.agiri.org/email To unsubscribe or change your options, please go to: http://v2.listbox.com/member/?member_id=8660244id_secret=72129171-2bf67a
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
--- Ed Porter [EMAIL PROTECTED] wrote: MATT MAHONEY= My design would use most of the Internet (10^9 P2P nodes). ED PORTER= That's ambitious. Easier said than done unless you have a Google, Microsoft, or mass popular movement backing you. It would take some free software that people find useful. The Internet has been transformed before. Remember when there were no web browsers and no search engines? You can probably think of transformations that would make the Internet more useful. Centralized search is limited to a few big players that can keep a copy of the Internet on their servers. Google is certainly useful, but imagine if it searched a space 1000 times larger and if posts were instantly added to its index, without having to wait days for its spider to find them. Imagine your post going to persistent queries posted days earlier. Imagine your queries being answered by real human beings in addition to other peers. I probably won't be the one writing this program, but where there is a need, I expect it will happen. In a message passing network, the critical parameter is the ratio of messages out to messages in. The ratio cannot exceed 1 on average. ED PORTER= Thanks for the info. By unmaintainable what do you mean? I don't understand why more messages coming in than going out creates a problem, unless most of what nodes do is relay messages, which is not what they do in my system. I meant the other way, which would flood the network with duplicate messages. But I believe the network would be stable against this, even in the face of spammers and malicious nodes, because most nodes would be configured to ignore duplicates and any messages that they deemed irrelevant. -- Matt Mahoney, [EMAIL PROTECTED]
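Matt's duplicate-suppression point can be sketched concretely: if each node remembers the message IDs it has already seen and relays a message at most once, a flood terminates and the average out/in ratio stays bounded. A toy illustration (the Node class, topology, and message format here are invented for illustration, not part of any proposed design):

```python
# Toy sketch of duplicate suppression in a flooding P2P network: each node
# keeps a set of message IDs it has processed and relays a message only once,
# so flooding cannot amplify indefinitely.

class Node:
    def __init__(self, name):
        self.name = name
        self.peers = []      # neighbouring nodes
        self.seen = set()    # message IDs already processed
        self.received = 0    # total messages arriving, duplicates included

    def receive(self, msg_id, payload):
        self.received += 1
        if msg_id in self.seen:
            return           # ignore duplicates instead of re-flooding them
        self.seen.add(msg_id)
        for peer in self.peers:
            peer.receive(msg_id, payload)

# Fully connected triangle: without the seen-set this flood would never end.
a, b, c = Node("a"), Node("b"), Node("c")
a.peers, b.peers, c.peers = [b, c], [a, c], [a, b]
a.receive("msg-1", "hello")
```

Each node processes the message exactly once; the extra receives are the duplicates that get dropped on arrival.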
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Benjamin Goertzel wrote: Thus: building a NL parser, no matter how good it is, is of no use whatsoever unless it can be shown to emerge from (or at least fit with) a learning mechanism that allows the system itself to generate its own understanding (or, at least, acquisition) of grammar IN THE CONTEXT OF A MECHANISM THAT ALSO ACCOMPLISHES REAL UNDERSTANDING. When that larger issue is dealt with, a NL parser will arise naturally, and any previous work on non-developmental, hand-built parsers will be completely discarded. You were trumpeting the importance of work that I know will be thrown away later, and in the mean time will be of no help in resolving the important issues. Richard, you discount the possibility that said NL parser will play a key role in the adaptive emergence of a system that can generate its own linguistic understanding. I.e., you discount the possibility that, with the right learning mechanism and instructional environment, hand-coded rules may serve as part of the initial seed for a learning process that will eventually generate knowledge obsoleting these initial hand-coded rules. It's fine that you discount this possibility -- I just want to point out that in doing so, you are making a bold and unsupported theoretical hypothesis, rather than stating an obvious or demonstrated fact. Vaguely similarly, the grammar of child language is largely thrown away in adulthood, yet it was useful as scaffolding in leading to the emergence of adult language. The problem is that this discussion has drifted away from the original context in which I made the remarks. I do *not* discount the possibility that an ordinary NL parser may play a role in the future. What I was attacking was the idea that a NL parser that does a wonderful job today (but which is built on a formalism that ignores all the issues involved in getting an adaptive language-understanding system working) is IPSO FACTO going to be a valuable step in the direction of a full adaptive system. 
It was the linkage that I dismissed. It was the idea that BECAUSE the NL parser did such a great job, therefore it has a very high probability of being a great step on the road to a full adaptive (etc) language understanding system. If the NL parser completely ignores those larger issues I am justified in saying that it is a complete crap shoot whether or not this particular parser is going to be of use in future, more complete theories of language. But that is not the same thing as making a blanket dismissal of all parsers, saying they cannot be of any use as (as you point out) seed material in the design of a complete system. I was objecting to Ed's pushing this particular NL parser in my face and insisting that I should respect it as a substantial step towards full AGI ... and my objection was that I find models like that all show and no deep substance precisely because they ignore the larger issues and go for the short-term gratification of a parser that works really well. So I was not taking the position you thought I was. Richard Loosemore
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
The particular NL parser paper in question, Collins's Convolution Kernels for Natural Language (http://l2r.cs.uiuc.edu/~danr/Teaching/CS598-05/Papers/Collins-kernels.pdf), is actually saying something quite important that extends way beyond parsers and is highly applicable to AGI in general. It is actually showing that you can do something roughly equivalent to growing neural gas (GNG) in a space with something approaching 500,000 dimensions, but you can do it without normally having to deal with more than a few of those dimensions at one time. GNG is an algorithm I learned about from reading Peter Voss that allows one to learn how to efficiently represent a distribution in a relatively high dimensional space in a totally unsupervised manner. But there really seems to be no reason why there should be any limit to the dimensionality of the space in which Collins's algorithm works, because it does not use an explicit vector representation, nor, if I recollect correctly, a Euclidean distance metric, but rather a similarity metric, which is generally much more appropriate for matching in very high dimensional spaces. But what he is growing are not just points representing where data has occurred in a high dimensional space, but sets of points that define hyperplanes for defining the boundaries between classes. My recollection is that this system learns automatically from both labeled data (instances of correct parse trees) and randomly generated deviations from those instances. His particular algorithm matches tree structures, but with modification it would seem to be extendable to matching arbitrary nets. Other versions of it could be made to operate, like GNG, in an unsupervised manner. If you stop and think about what this is saying and generalize from it, it provides an important possible component in an AGI tool kit. 
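For readers unfamiliar with GNG, the flavor of the idea -- unsupervised units growing to cover a data distribution -- can be sketched in a few lines. This is a drastically simplified stand-in (plain incremental vector quantization with an invented distance threshold), not Fritzke's full growing-neural-gas algorithm with edges, ageing, and accumulated error:

```python
# Simplified sketch of GNG-flavored unsupervised learning: units are added
# where the data lives, so the model grows to match the distribution.
import random

def grow_units(samples, threshold=1.0, lr=0.1):
    """Place units to cover a stream of 2-D samples, growing as needed."""
    units = []
    for x, y in samples:
        if not units:
            units.append([x, y])
            continue
        # nearest existing unit (squared Euclidean distance)
        nearest = min(units, key=lambda u: (u[0] - x) ** 2 + (u[1] - y) ** 2)
        d2 = (nearest[0] - x) ** 2 + (nearest[1] - y) ** 2
        if d2 > threshold ** 2:
            units.append([x, y])                 # grow: new unit at the sample
        else:
            nearest[0] += lr * (x - nearest[0])  # adapt: nudge unit toward sample
            nearest[1] += lr * (y - nearest[1])
    return units

random.seed(0)
# Two well-separated clusters -> the model should end up with units near both.
data = [(random.gauss(0, 0.1), random.gauss(0, 0.1)) for _ in range(50)] \
     + [(random.gauss(5, 0.1), random.gauss(5, 0.1)) for _ in range(50)]
units = grow_units(data)
```

The point of the sketch is only the growth rule: representation capacity is spent where data actually occurs, with no labels required.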
What it shows is not limited to parsing, but it would seem possibly applicable to virtually any hierarchical or networked representation, including nets of semantic web RDF triples, and semantic nets, and predicate logic expressions. At first glance it appears it would even be applicable to kinkier net matching algorithms, such as Augmented Transition Network (ATN) matching. So if one reads this paper with a mind not only to what it specifically shows, but to how what it shows could be expanded, this paper says something very important. That is, that one can represent, learn, and classify things in very high dimensional spaces -- such as 10^5-dimensional spaces -- and do it efficiently provided the part of the space being represented is sufficiently sparsely connected. I had already assumed this before reading this paper, but the paper was valuable to me because it provided mathematically rigorous support for my prior models, and helped me better understand the mathematical foundations of my own prior intuitive thinking. It means that systems like Novamente can deal in very high dimensional spaces relatively efficiently. It does not mean that all processes that can be performed in such spaces will be computationally cheap (for example, combinatorial searches), but it means that many of them, such as GNG-like recording of experience and simple index-based matching, can scale relatively well in a sparsely connected world. That is important, for those with the vision to understand. 
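The core trick of the Collins paper -- computing similarity over an enormous implicit feature space (one dimension per subtree) without ever materializing the vectors -- can be sketched as a recursive tree kernel. This is a minimal illustration in the style of the Collins-Duffy kernel, with an invented tuple encoding of trees and an arbitrary down-weighting parameter, not the paper's full algorithm:

```python
# Minimal sketch of a convolution (tree) kernel: similarity between parse
# trees is the number of shared subtrees, computed recursively without
# enumerating the (potentially astronomical) space of subtree features.
# Trees are nested tuples: (label, child, child, ...); leaves are strings.

def production(node):
    """The production at a node: its label plus the labels of its children."""
    label, *children = node
    return (label, tuple(c if isinstance(c, str) else c[0] for c in children))

def c_score(n1, n2, lam=0.5):
    """Weighted count of common subtrees rooted at n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    score = lam
    for a, b in zip(n1[1:], n2[1:]):
        if not isinstance(a, str):          # recurse into internal children
            score *= 1.0 + c_score(a, b, lam)
    return score

def tree_kernel(t1, t2, lam=0.5):
    """Sum c_score over all node pairs -- similarity, no explicit vectors."""
    def nodes(t):
        if isinstance(t, str):
            return []
        return [t] + [n for c in t[1:] for n in nodes(c)]
    return sum(c_score(a, b, lam) for a in nodes(t1) for b in nodes(t2))
```

Trees sharing more structure score higher, which is exactly the "similarity metric rather than Euclidean distance" property noted above.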
Ed Porter -Original Message- From: Benjamin Goertzel [mailto:[EMAIL PROTECTED] Sent: Tuesday, December 04, 2007 8:59 PM To: agi@v2.listbox.com Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research] Thus: building a NL parser, no matter how good it is, is of no use whatsoever unless it can be shown to emerge from (or at least fit with) a learning mechanism that allows the system itself to generate its own understanding (or, at least, acquisition) of grammar IN THE CONTEXT OF A MECHANISM THAT ALSO ACCOMPLISHES REAL UNDERSTANDING. When that larger issue is dealt with, a NL parser will arise naturally, and any previous work on non-developmental, hand-built parsers will be completely discarded. You were trumpeting the importance of work that I know will be thrown away later, and in the mean time will be of no help in resolving the important issues. Richard, you discount the possibility that said NL parser will play a key role in the adaptive emergence of a system that can generate its own linguistic understanding. I.e., you discount the possibility that, with the right learning mechanism and instructional environment, hand-coded rules may serve as part of the initial seed for a learning process that will eventually generate knowledge obsoleting these initial hand-coded rules. It's fine that you discount this possibility -- I just want to point out that in doing so, you are making a bold and unsupported theoretical hypothesis, rather than stating an obvious or demonstrated fact. Vaguely similarly, the grammar of child language
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Matt, Perhaps you are right. But one problem is that big Google-like computing complexes in the next five to ten years will be powerful enough to do AGI, and they will be much more efficient for AGI search, because the physical closeness of their machines will make it possible for them to perform the massive interconnection needed for powerful AGI much more efficiently. Ed Porter -Original Message- From: Matt Mahoney [mailto:[EMAIL PROTECTED] Sent: Tuesday, December 04, 2007 9:18 PM To: agi@v2.listbox.com Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research] --- Ed Porter [EMAIL PROTECTED] wrote: MATT MAHONEY= My design would use most of the Internet (10^9 P2P nodes). ED PORTER= That's ambitious. Easier said than done unless you have a Google, Microsoft, or mass popular movement backing you. It would take some free software that people find useful. The Internet has been transformed before. Remember when there were no web browsers and no search engines? You can probably think of transformations that would make the Internet more useful. Centralized search is limited to a few big players that can keep a copy of the Internet on their servers. Google is certainly useful, but imagine if it searched a space 1000 times larger and if posts were instantly added to its index, without having to wait days for its spider to find them. Imagine your post going to persistent queries posted days earlier. Imagine your queries being answered by real human beings in addition to other peers. I probably won't be the one writing this program, but where there is a need, I expect it will happen. In a message passing network, the critical parameter is the ratio of messages out to messages in. The ratio cannot exceed 1 on average. ED PORTER= Thanks for the info. By unmaintainable what do you mean? I don't understand why more messages coming in than going out creates a problem, unless most of what nodes do is relay messages, which is not what they do in my system. 
I meant the other way, which would flood the network with duplicate messages. But I believe the network would be stable against this, even in the face of spammers and malicious nodes, because most nodes would be configured to ignore duplicates and any messages that they deemed irrelevant. -- Matt Mahoney, [EMAIL PROTECTED]
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
OK, understood... On Dec 4, 2007 9:32 PM, Richard Loosemore [EMAIL PROTECTED] wrote: Benjamin Goertzel wrote: Thus: building a NL parser, no matter how good it is, is of no use whatsoever unless it can be shown to emerge from (or at least fit with) a learning mechanism that allows the system itself to generate its own understanding (or, at least, acquisition) of grammar IN THE CONTEXT OF A MECHANISM THAT ALSO ACCOMPLISHES REAL UNDERSTANDING. When that larger issue is dealt with, a NL parser will arise naturally, and any previous work on non-developmental, hand-built parsers will be completely discarded. You were trumpeting the importance of work that I know will be thrown away later, and in the mean time will be of no help in resolving the important issues. Richard, you discount the possibility that said NL parser will play a key role in the adaptive emergence of a system that can generate its own linguistic understanding. I.e., you discount the possibility that, with the right learning mechanism and instructional environment, hand-coded rules may serve as part of the initial seed for a learning process that will eventually generate knowledge obsoleting these initial hand-coded rules. It's fine that you discount this possibility -- I just want to point out that in doing so, you are making a bold and unsupported theoretical hypothesis, rather than stating an obvious or demonstrated fact. Vaguely similarly, the grammar of child language is largely thrown away in adulthood, yet it was useful as scaffolding in leading to the emergence of adult language. The problem is that this discussion has drifted away from the original context in which I made the remarks. I do *not* discount the possibility that an ordinary NL parser may play a role in the future. 
What I was attacking was the idea that a NL parser that does a wonderful job today (but which is built on a formalism that ignores all the issues involved in getting an adaptive language-understanding system working) is IPSO FACTO going to be a valuable step in the direction of a full adaptive system. It was the linkage that I dismissed. It was the idea that BECAUSE the NL parser did such a great job, therefore it has a very high probability of being a great step on the road to a full adaptive (etc) language understanding system. If the NL parser completely ignores those larger issues I am justified in saying that it is a complete crap shoot whether or not this particular parser is going to be of use in future, more complete theories of language. But that is not the same thing as making a blanket dismissal of all parsers, saying they cannot be of any use as (as you point out) seed material in the design of a complete system. I was objecting to Ed's pushing this particular NL parser in my face and insisting that I should respect it as a substantial step towards full AGI ... and my objection was that I find models like that all show and no deep substance precisely because they ignore the larger issues and go for the short-term gratification of a parser that works really well. So I was not taking the position you thought I was. Richard Loosemore
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
On Sunday 02 December 2007, John G. Rose wrote: Building up parse trees and word sense models, let's say that would be a first step. And then say after a while this was accomplished and running on some peers. What would the next theoretical step be? I am not sure what the next step would be. The first step might be enough for the moment. When you have the network functioning at all, expose an API so that other programmers can come in and try to utilize sentence analysis (and other functions) as if the network is just another lobe of the brain or another component for AI. This would allow others who are possibly more creative than us to take advantage of what looks to be interesting work. - Bryan
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Ed Porter wrote: Once you build up good models for parsing and word sense, then you read large amounts of text and start building up models of the realities described and generalizations from them. Assuming this is a continuation of the discussion of an AGI-at-home P2P system, you are going to be very limited by the lack of bandwidth, particularly for attacking the high dimensional problem of seeking to understand the meaning of text, which often involves multiple levels of implication, which would normally be accomplished by some sort of search of a large semantic space, which is going to be difficult with limited bandwidth. But a large amount of text with appropriate parsing and word sense labeling would still provide a valuable aid for web and text search and for many forms of automatic learning. And the level of understanding that such a P2P system could derive from reading huge amounts of text could be a valuable initial source of one component of world knowledge for use by AGI. I know you always find it tedious when I express scepticism, so I will preface my remarks with: take this advice or ignore it, your choice. This description of how to get AGI done reminds me of my childhood project to build a Mars-bound spacecraft modeled after James Blish's book Welcome to Mars. I knew that I could build it in time for the next conjunction of Mars, but I hadn't quite gotten the anti-gravity drive sorted out, so instead I collected all the other materials described in the book, so everything would be ready when the AG drive started working... The reason it reminds me of this episode is that you are calmly talking here about the high dimensional problem of seeking to understand the meaning of text, which often involves multiple levels of implication, which would normally be accomplished by some sort of search of a large semantic space ... this is your equivalent of the anti-gravity drive. 
This is the part that needs extremely detailed knowledge of AI and psychology, just to understand the nature of the problem (never mind to solve it). If you had any idea about how to solve this part of the problem, everything else would drop into your lap. You wouldn't need a P2P AGI-at-home system, because with this solution in hand you would have people beating down your door to give you a supercomputer. Meanwhile, unfortunately, solving all those other issues like making parsers and trying to do word-sense disambiguation would not help one whit to get the real theoretical task done. I am not being negative, I am just relaying the standard understanding of priorities in the AGI field as a whole. Send complaints addressed to AGI Community, not to me, please. Richard Loosemore
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Once you build up good models for parsing and word sense, then you read large amounts of text and start building up models of the realities described and generalizations from them. Assuming this is a continuation of the discussion of an AGI-at-home P2P system, you are going to be very limited by the lack of bandwidth, particularly for attacking the high dimensional problem of seeking to understand the meaning of text, which often involves multiple levels of implication, which would normally be accomplished by some sort of search of a large semantic space, which is going to be difficult with limited bandwidth. But a large amount of text with appropriate parsing and word sense labeling would still provide a valuable aid for web and text search and for many forms of automatic learning. And the level of understanding that such a P2P system could derive from reading huge amounts of text could be a valuable initial source of one component of world knowledge for use by AGI. Ed Porter -Original Message- From: Bryan Bishop [mailto:[EMAIL PROTECTED] Sent: Monday, December 03, 2007 7:33 AM To: agi@v2.listbox.com Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research] On Sunday 02 December 2007, John G. Rose wrote: Building up parse trees and word sense models, let's say that would be a first step. And then say after a while this was accomplished and running on some peers. What would the next theoretical step be? I am not sure what the next step would be. The first step might be enough for the moment. When you have the network functioning at all, expose an API so that other programmers can come in and try to utilize sentence analysis (and other functions) as if the network is just another lobe of the brain or another component for AI. This would allow others who are possibly more creative than us to take advantage of what looks to be interesting work. 
- Bryan
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
From: Richard Loosemore [mailto:[EMAIL PROTECTED] The reason it reminds me of this episode is that you are calmly talking here about the high dimensional problem of seeking to understand the meaning of text, which often involves multiple levels of implication, which would normally be accomplished by some sort of search of a large semantic space ... this is your equivalent of the anti-gravity drive. This is the part that needs extremely detailed knowledge of AI and psychology, just to understand the nature of the problem (never mind to solve it). If you had any idea about how to solve this part of the problem, everything else would drop into your lap. You wouldn't need a P2P AGI-at-home system, because with this solution in hand you would have people beating down your door to give you a supercomputer. This is naïve. It almost never works this way, where if someone has a solution to a well-known unsolved engineering problem the resources just come knocking at the door. Meanwhile, unfortunately, solving all those other issues like making parsers and trying to do word-sense disambiguation would not help one whit to get the real theoretical task done. This is impractical. ... I am not being negative, I am just relaying the standard understanding of priorities in the AGI field as a whole. Send complaints addressed to AGI Community, not to me, please. You are being negative! And since when have the priorities of understandings in the AGI field been standardized? Perhaps that is part of the limiting factor and self-defeating narrow-mindedness. John
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
--- Richard Loosemore [EMAIL PROTECTED] wrote: Meanwhile, unfortunately, solving all those other issues like making parsers and trying to do word-sense disambiguation would not help one whit to get the real theoretical task done. I agree. AI has a long history of doing the easy part of the problem first: solving the mathematics or logic of a word problem, and deferring the hard part, which is extracting the right formal statement from the natural language input. This is the opposite order of how children learn. The proper order is: lexical rules first, then semantics, then grammar, and then the problem solving. The whole point of using massive parallel computation is to do the hard part of the problem. -- Matt Mahoney, [EMAIL PROTECTED]
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Matt:: The whole point of using massive parallel computation is to do the hard part of the problem. I get it: you and most other AI-ers are equating hard with very, very complex, right? But you don't seriously think that the human mind successfully deals with language by massive parallel computation, do you? Isn't it obvious that the brain is able to understand the wealth of language by relatively few computations - quite intricate, hierarchical, multi-levelled processing, yes (in order to understand, for example, any of the sentences you or I are writing here), but only a tiny fraction of the operations that computers currently perform? The whole idea of massive parallel computation here surely has to be wrong. And yet none of you seem able to face this to my mind obvious truth. I only saw this term recently - perhaps it's v. familiar to you (?) - that the human brain works by look-up rather than search. Hard problems can have relatively simple but ingenious solutions.
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
From: Bryan Bishop [mailto:[EMAIL PROTECTED] I am not sure what the next step would be. The first step might be enough for the moment. When you have the network functioning at all, expose an API so that other programmers can come in and try to utilize sentence analysis (and other functions) as if the network is just another lobe of the brain or another component for AI. This would allow others who are possibly more creative than us to take advantage of what looks to be interesting work. This is true and a way to get utility out of it. And getting the first step accomplished is quite a bit of work, as is maintaining it. Having just a few basic baby steps actually materialize in front of you eliminates some of the complexity, so that the larger problem may appear just a bit less daunting. Also, communal developer feedback is a constructive motivator. John
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
From: Ed Porter [mailto:[EMAIL PROTECTED] Once you build up good models for parsing and word sense, then you read large amounts of text and start building up models of the realities described and generalizations from them. Assuming this is a continuation of the discussion of an AGI-at-home P2P system, you are going to be very limited by the lack of bandwidth, particularly for attacking the high dimensional problem of seeking to understand the meaning of text, which often involves multiple levels of implication, which would normally be accomplished by some sort of search of a large semantic space, which is going to be difficult with limited bandwidth. But a large amount of text with appropriate parsing and word sense labeling would still provide a valuable aid for web and text search and for many forms of automatic learning. And the level of understanding that such a P2P system could derive from reading huge amounts of text could be a valuable initial source of one component of world knowledge for use by AGI. I kind of see the small bandwidth between (most) individual nodes as not a limiting factor, as sets of nodes act as temporary single group entities. IOW the BW between one set of 50 nodes and another set of 50 nodes is quite large actually, and individual nodes' data access would depend on indexes of indexes to minimize their individual BW requirements. Does this not apply to your model? John
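John's "indexes of indexes" idea can be sketched as two-level routing: a node consults a tiny top-level index mapping key ranges to peer groups, then queries only the group holding the relevant shard, instead of flooding every peer. Everything here -- the peer names, the prefix partition, the shard contents -- is invented purely for illustration:

```python
# Hypothetical sketch of index-of-indexes routing in a P2P search network:
# a small top-level index routes each query to one shard, keeping per-node
# bandwidth low regardless of total network size.

top_level = {           # key range -> group of peers holding that shard
    "a-m": ["peer1", "peer2"],
    "n-z": ["peer3", "peer4"],
}

shards = {              # per-group inverted index (would normally be remote)
    "a-m": {"apple": ["doc1"], "mars": ["doc2"]},
    "n-z": {"parser": ["doc3"]},
}

def lookup(word):
    """Route a query through the index of indexes instead of flooding."""
    prefix = "a-m" if word[0] <= "m" else "n-z"
    peers = top_level[prefix]            # tiny top-level lookup
    return peers, shards[prefix].get(word, [])

peers, docs = lookup("parser")
```

The top-level index is small enough for every node to cache, so each query costs one group round-trip rather than contact with the whole network.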
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
John G. Rose wrote: From: Richard Loosemore [mailto:[EMAIL PROTECTED] [snip] I am not being negative, I am just relaying the standard understanding of priorities in the AGI field as a whole. Send complaints addressed to AGI Community, not to me, please. You are being negative! And since when have the priorities of understandings in the AGI field been standardized? Perhaps that is part of the limiting factor and self-defeating narrow-mindedness. It is easy for a research field to agree that certain problems are really serious and unsolved. A hundred years ago, the results of the Michelson-Morley experiments were a big unsolved problem, and pretty serious for the foundations of physics. I don't think it would have been self-defeating narrow-mindedness for someone to have pointed to that problem and said this is a serious problem. Richard Loosemore
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Mike Tintner wrote: Matt:: The whole point of using massive parallel computation is to do the hard part of the problem. I get it: you and most other AI-ers are equating hard with very, very complex, right? But you don't seriously think that the human mind successfully deals with language by massive parallel computation, do you? Isn't it obvious that the brain is able to understand the wealth of language by relatively few computations - quite intricate, hierarchical, multi-levelled processing, yes, (in order to understand, for example, any of the sentences you or I are writing here), but only a tiny fraction of the operations that computers currently perform? The whole idea of massive parallel computation here, surely has to be wrong. And yet none of you seem able to face this to my mind obvious truth. I only saw this term recently - perhaps it's v. familiar to you (?) - that the human brain works by look-up rather than search. Hard problems can have relatively simple but ingenious solutions. You need to check the psychology data: it emphatically disagrees with your position here. One thing that can be easily measured is the activation of lexical items related in various ways to a presented word (i.e. show the subject the word Doctor and test to see if the word Nurse gets activated). It turns out that within an extremely short time of the first word being seen, a very large number of other words have their activations raised significantly. Now, whichever way you interpret these (so-called priming) results, one thing is not in doubt: there is massively parallel activation of lexical units going on during language processing. Richard Loosemore
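The priming effect Richard describes can be illustrated with a toy spreading-activation model. The lexicon, link weights, decay constant, and depth below are all invented for illustration; real priming effects are measured experimentally, not simulated, but the sketch shows how presenting one word raises many lexical items in parallel:

```python
# Hypothetical association network: link weights are made-up values,
# not measured lexical association strengths.
links = {
    "doctor": {"nurse": 0.8, "hospital": 0.7, "patient": 0.6, "surgeon": 0.5},
    "nurse": {"doctor": 0.8, "hospital": 0.6},
    "hospital": {"doctor": 0.7, "ward": 0.4},
}

def prime(word, decay=0.5, depth=2):
    """Propagate activation outward from a presented word."""
    activation = {word: 1.0}
    frontier = {word: 1.0}
    for _ in range(depth):
        nxt = {}
        for w, a in frontier.items():
            for neighbor, weight in links.get(w, {}).items():
                boost = a * weight * decay
                # Keep only boosts that raise a word above its current level.
                if boost > max(activation.get(neighbor, 0.0),
                               nxt.get(neighbor, 0.0)):
                    nxt[neighbor] = boost
        activation.update(nxt)
        frontier = nxt
    return activation

act = prime("doctor")
# Many lexical items are raised at once -- the "massively parallel
# activation" that the priming data show.
print(sorted(act, key=act.get, reverse=True))
```

One presented word activates its whole neighborhood in a single sweep; nothing sequential has to enumerate the lexicon one entry at a time.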
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Ed Porter wrote: Richard, It is false to imply that knowledge of how to draw implications from a series of statements by some sort of search mechanism is as unknown as how to make an anti-gravity drive -- if by anti-gravity drive you mean some totally unknown form of physics, rather than just anything, such as human legs, that can push against gravity. It is unfair because there is a fair amount of knowledge about how to draw implications from sequences of statements. For example, view Shastri's www.icsi.berkeley.edu/~shastri/psfiles/cogsci00.ps. Also Ben Goertzel has demonstrated a program that draws implications from statements contained in different medical texts. Ed Porter P.S., I have enclosed an inexact, but, at least to me, useful drawing I made of the type of search involved in understanding the multiple implications contained in Shastri's "John fell in the Hallway. Tom had cleaned it. He was hurt" example. Of course, what is most missing from this drawing are all the other, dead-end implications which do not provide a likely implication. Only one such dead end is shown (the implication between fall and trip). As a result you don't sense how many dead ends have to be searched to find the implications which best explain the statements. EWP Well, bear in mind that I was not meaning the analogy to be *that* exact, or I would have given up on AGI long ago - I'm sure you know that I don't believe that getting an understanding system working is as impossible as getting an AG drive built. The purpose of my comment was to point to a huge gap in understanding, and the mistaken strategy of dealing with all the peripheral issues before having a clear idea how to solve the central problem. 
I cannot even begin to do justice, here, to the issues involved in solving the high dimensional problem of seeking to understand the meaning of text, which often involve multiple levels of implication, which would normally be accomplished by some sort of search of a large semantic space You talk as if an extension of some current strategy will solve this ... but it is not at all clear that any current strategy for solving this problem actually does scale up to a full solution to the problem. I don't care how many toy examples you come up with, you have to show a strategy for dealing with some of the core issues, AND reasons to believe that those strategies really will work (other than "I find them quite promising"). Not only that, but there are at least some people (to wit, myself) who believe there are positive reasons to believe that the current strategies *will* not scale up. Richard Loosemore -Original Message- From: Richard Loosemore [mailto:[EMAIL PROTECTED] Sent: Monday, December 03, 2007 10:07 AM To: agi@v2.listbox.com Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research] Ed Porter wrote: Once you build up good models for parsing and word sense, then you read large amounts of text and start building up models of the realities described and generalizations from them. Assuming this is a continuation of the discussion of an AGI-at-home P2P system, you are going to be very limited by the lack of bandwidth, particularly for attacking the high dimensional problem of seeking to understand the meaning of text, which often involve multiple levels of implication, which would normally be accomplished by some sort of search of a large semantic space, which is going to be difficult with limited bandwidth. But a large amount of text with appropriate parsing and word sense labeling would still provide a valuable aid for web and text search and for many forms of automatic learning. 
And the level of understanding that such a P2P system could derive from reading huge amounts of text could be a valuable initial source of one component of world knowledge for use by AGI. I know you always find it tedious when I express scepticism, so I will preface my remarks with: take this advice or ignore it, your choice. This description of how to get AGI done reminds me of my childhood project to build a Mars-bound spacecraft modeled after James Blish's book Welcome to Mars. I knew that I could build it in time for the next conjunction of Mars, but I hadn't quite gotten the anti-gravity drive sorted out, so instead I collected all the other materials described in the book, so everything would be ready when the AG drive started working... The reason it reminds me of this episode is that you are calmly talking here about the high dimensional problem of seeking to understand the meaning of text, which often involve multiple levels of implication, which would normally be accomplished by some sort of search of a large semantic space ... this is your equivalent of the anti-gravity drive. This is the part that needs extremely detailed knowledge of AI and psychology, just
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
From: Richard Loosemore [mailto:[EMAIL PROTECTED] It is easy for a research field to agree that certain problems are really serious and unsolved. A hundred years ago, the results of the Michelson-Morley experiments were a big unsolved problem, and pretty serious for the foundations of physics. I don't think it would have been self-defeating narrow-mindedness for someone to have pointed to that problem and said this is a serious problem. Well the definition of problems and the approaches to solving the problems can be narrow-minded or looked at with a narrow-human-psychological AI perspective. Most of these problems boil down to engineering problems and the theory already exists in some other form; it is a matter of putting things together IMO. But myself not being in the cog sci world for that long, only thinking of AGI in terms of computers, math and AI, I am unaware of the details of some of the particular AGI unsolved mysteries that are talked about. Not to say I haven't thought about them from my own narrow-human-psychological AI perspective :) John
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
MIKE TINTNER Isn't it obvious that the brain is able to understand the wealth of language by relatively few computations - quite intricate, hierarchical, multi-levelled processing, ED PORTER How do you find the right set of relatively few computations and/or models that are appropriate in a complex context without massive computation? -Original Message- From: Mike Tintner [mailto:[EMAIL PROTECTED] Sent: Monday, December 03, 2007 12:12 PM To: agi@v2.listbox.com Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research] Matt:: The whole point of using massive parallel computation is to do the hard part of the problem. I get it : you and most other AI-ers are equating hard with very, very complex, right? But you don't seriously think that the human mind successfully deals with language by massive parallel computation, do you? Isn't it obvious that the brain is able to understand the wealth of language by relatively few computations - quite intricate, hierarchical, multi-levelled processing, yes, (in order to understand, for example, any of the sentences you or I are writing here), but only a tiny fraction of the operations that computers currently perform? The whole idea of massive parallel computation here, surely has to be wrong. And yet none of you seem able to face this to my mind obvious truth. I only saw this term recently - perhaps it's v. familiar to you (?) - that the human brain works by look-up rather than search. Hard problems can have relatively simple but ingenious solutions.
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
On Dec 3, 2007 12:12 PM, Mike Tintner [EMAIL PROTECTED] wrote: I get it : you and most other AI-ers are equating hard with very, very complex, right? But you don't seriously think that the human mind successfully deals with language by massive parallel computation, do you? Very very complex tends to exceed one's ability to properly model and especially predict. Even if the human mind invokes some special kind of magical cleverness, do you think you (judging from your writing) have some unique ability to isolate that function (noun) without simultaneously using that function (verb) ? I often imagine that I understand the working of my own mind almost perfectly. Those that claim to have grasped the quintessential bit typically end up so far over the edge that they are unable to express it in meaningful or useful terms. Isn't it obvious that the brain is able to understand the wealth of language by relatively few computations - quite intricate, hierarchical, multi-levelled processing, yes, (in order to understand, for example, any of the sentences you or I are writing here), but only a tiny fraction of the operations that computers currently perform? I believe you are making that statement because you wish it to be true. I see no basis for anything to be obvious - especially the formalism required to define what the term means. This is due primarily to the complexity associated with recursive self-reflection. The whole idea of massive parallel computation here, surely has to be wrong. And yet none of you seem able to face this to my mind obvious truth. We each continue to persist in our delusions. Yours may be no different in the end. :) I only saw this term recently - perhaps it's v. familiar to you (?) - that the human brain works by look-up rather than search. Hard problems can have relatively simple but ingenious solutions. How is the look-up table built? Usually by experience. 
When we have enough similar experiences to look up a solution to general adaptive intelligence, we will have likely been close enough to it for so long that (probably) nobody will be surprised.
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
RL: One thing that can be easily measured is the activation of lexical items related in various ways to a presented word (i.e. show the subject the word Doctor and test to see if the word Nurse gets activated). It turns out that within an extremely short time of the first word being seen, a very large number of other words have their activations raised significantly. Now, whichever way you interpret these (so-called priming) results, one thing is not in doubt: there is massively parallel activation of lexical units going on during language processing. Thanks for the reply. How many associations are activated? How do we know neuroscientifically that they are associations to the words being processed and not something else entirely? Out of interest, can you give me a ballpark estimate of how many associations you personally think are activated, say, in a few seconds, in processing sentences like: "The doctor made a move on the nurse." "Relationships between staff in health organizations are fraught with complexities." No, I'm not trying to be ridiculously demanding or asking you to be ridiculously exact. As you probably know by now, I see the processing of sentences as involving several levels, especially for the second sentence, but I don't see the number of associations as that many. Let's be generous and guess hundreds for the items in the above sentences. But a computer program, as I understand, will typically be searching through anywhere between thousands, millions and way upwards. On the one hand, we can perhaps agree that one of the brain's glories is that it can very rapidly draw analogies - that I can quickly produce a string of associations like, say, snake, rope, chain, spaghetti strand - and you may quickly be able to continue that string with further associations (like string). I believe that power is mainly based on look-up - literally finding matching shapes at speed. But I don't see the brain as checking through huge numbers of such shapes. 
(It would be enormously demanding on resources, given that these are complex pictures, no?). As evidence, I'd point to what happens if you try to keep producing further analogies. The brain rapidly slows down. It gets harder and harder. And yet you will be able to keep producing further examples from memory virtually for ever - just slower and slower. Relevant images/concepts are there, but it's not easy to access them. That's why copywriters get well paid to, in effect, keep searching for similar analogies (as cool/refreshing as...). It's hard work. If that many relevant shapes were being unconsciously activated as you seem to be suggesting, it shouldn't be such protracted work. The brain can literally connect any thing to any other thing with, so to speak, 6 degrees of separation - but I don't think it can connect that many things at once. I accept that this is still neuroscientifically an open issue (I'd be grateful for pointers to the research you're referring to). But I would have thought it obvious that the brain has massively inferior search capabilities to those of computers - that, surely, is a major reason why we invented computers in the first place - they're a massive extension of our powers. And yet the brain can draw analogies, and basically, with minor exceptions, computers still can't. I think it's clear that computers won't catch up here by quantitatively increasing their powers still further. If you're digging a hole in the wrong place, digging further quicker won't help. (I'm arguing a variant of your own argument against Edward P!). But of course when your education and technology dispose you to dig in just those places, it's extremely hard to change your ways - or even believe, pace Edward, that change is necessary at all. After all, look at the size of those holes... surely, we'll hit the Promised Land anytime now. P.S. 
In general, the brain is hugely irrational - it can only maintain a reflective, concentrated train of thought for literally seconds, not minutes, before going off at tangents. It continually and necessarily jumps to conclusions. Such irrationality is highly adaptive in a fast-moving world where you can't hang around thinking about things for long. The idea that this same brain is systematically, thoroughly searching through, let's say, thousands or millions of variants on ideas, seems to me seriously at odds with this irrationality. (But I'm interested in all relevant research.)
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
--- Mike Tintner [EMAIL PROTECTED] wrote: On the one hand, we can perhaps agree that one of the brain's glories is that it can very rapidly draw analogies - that I can quickly produce a string of associations like, say, snake, rope, chain, spaghetti strand - and you may quickly be able to continue that string with further associations (like string). I believe that power is mainly based on look-up - literally finding matching shapes at speed. But I don't see the brain as checking through huge numbers of such shapes. (It would be enormously demanding on resources, given that these are complex pictures, no?). Semantic models learn associations by proximity in the training text. The degree to which you associate snake and rope depends on how often these words appear near each other. You can create an association matrix A, e.g. A[snake][rope] is the degree of association between these words. Among the most successful of these models is latent semantic analysis (LSA), where A is factored as A = USV^T by singular value decomposition (SVD), such that U and V are orthonormal and S is diagonal, and then all but the largest elements of S are discarded. In a typical LSA model, A is 20K by 20K, and S is reduced to about 200 elements. This approximates A with two 20K by 200 matrices, using about 2% as much space. One effect of lossy compression by LSA is to derive associations by the transitive property of semantics. For example, if snake is associated with rope and rope with chain, then the LSA approximation will derive an association of snake with chain even if it was not seen in the training data. SVD has an efficient parallel implementation. It is most easily visualized as a 20K by 200 by 20K 3-layer linear neural network [1]. But this really should not be surprising, because natural language evolved to be processed efficiently on a slow but highly parallel computer. 1. 
Gorrell, Genevieve (2006), “Generalized Hebbian Algorithm for Incremental Singular Value Decomposition in Natural Language Processing”, Proceedings of EACL 2006, Trento, Italy. http://www.aclweb.org/anthology-new/E/E06/E06-1013.pdf -- Matt Mahoney, [EMAIL PROTECTED]
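The transitive-association effect Matt describes can be sketched in a few lines of NumPy. The tiny term-document matrix below is invented for illustration (a real LSA model uses a term matrix on the order of 20K terms, as he notes), but the rank reduction works the same way:

```python
import numpy as np

# Toy term-document matrix: doc 0 mentions snake and rope,
# doc 1 mentions rope and chain. (Invented data for illustration.)
terms = ["snake", "rope", "chain"]
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0]])

# In the full-rank data, snake and chain never co-occur.
assert float(A[0] @ A[2]) == 0.0

# Factor A = U S V^T and keep only the largest singular value,
# mirroring the reduction from ~20K dimensions to ~200 described above.
U, S, Vt = np.linalg.svd(A, full_matrices=False)
k = 1
A_hat = U[:, :k] * S[:k] @ Vt[:k, :]

# The lossy reconstruction derives a transitive association: snake and
# chain are now similar, because both are associated with rope.
snake_chain = float(A_hat[0] @ A_hat[2])
print(round(snake_chain, 3))
```

The association appears only after the lossy compression; that is exactly the "transitive property of semantics" effect, reproduced at toy scale.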
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
was relatively efficient. -5-Hecht-Nielsen's sentence completion program (produced by his confabulation method; see http://r.ucsd.edu), just by appropriately tying together probabilistic implications learned from sequences of words, automatically creates grammatically correct sentences that are related to a prior sentence, allegedly without any knowledge of grammar, using millions of probability activations per word, without any un-computable combinatorial explosion. The search space that is being explored at any one time theoretically is considering more possibilities than there are particles in the known universe -- yet it works. At any given time several, let's say 6 to 12, word or phrase slots can be under computation, in which each of approximately 100K or so words or phrases is receiving scores. One could consider the search space to include each of the possible words or phrases being considered in each of those say 10 ordered slots as the possible permutations of 10 slot fillers each chosen from a set of about 10^5 words or phrases, a permutation space that has (10^5)!/(10^4)! possibilities. This is a very large search space -- just 100!/10! is over 10^151, and (10^5)!/(10^4)! is a much, much, much larger space than that -- and yet it is all computed with somewhere within several orders of magnitude of a billion ops. This very large search space is actually handled with a superposition of probabilities (somewhat as in quantum computing) which are collapsed in a sequential manner, in a rippling propagation of decisions and ensuing probability propagations. So Richard there are ways to do searches efficiently in very high dimensional spaces, including in the case of confabulation spaces that are in some ways trillions and trillions of times larger than the known universe -- all on relatively small computers. So lift thine eyes up unto Hecht-Nielsen -- (and his cat with whom he generously shares credit for Confabulation) -- and believe! 
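The "sequential collapse" idea Ed describes can be sketched in miniature: rather than enumerating the astronomical space of all slot fillers, each slot is scored against the words already fixed and collapsed to its best filler before the next slot is considered. The association counts and five-word vocabulary below are invented for illustration; Hecht-Nielsen's actual system learns its probabilities from large corpora and scores on the order of 100K candidates per slot:

```python
import math

# Hypothetical pairwise association counts (invented for illustration).
assoc = {
    ("john", "fell"): 8, ("fell", "down"): 20, ("fell", "asleep"): 12,
    ("down", "stairs"): 50, ("john", "down"): 2, ("john", "asleep"): 3,
}
vocab = ["down", "asleep", "stairs", "john", "fell"]

def score(word, context):
    # Sum of log association strengths with every already-fixed word;
    # unseen pairs get count 1, i.e. log contribution 0.
    return sum(math.log(assoc.get((c, word), 1)) for c in context)

def collapse(context, n_slots):
    """Fill n_slots one at a time, each collapsing to its best-scoring word."""
    out = list(context)
    for _ in range(n_slots):
        best = max(vocab, key=lambda w: score(w, out))
        out.append(best)
    return out

print(collapse(["john", "fell"], 2))
```

The greedy collapse touches only vocabulary-size-times-slots candidates instead of the exponential space of all joint fills, at the price of not guaranteeing the globally best sentence; that trade-off is the essence of the efficiency claim above.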
Ed Porter -Original Message- From: Richard Loosemore [mailto:[EMAIL PROTECTED] Sent: Monday, December 03, 2007 12:49 PM To: agi@v2.listbox.com Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research] Ed Porter wrote: Richard, It is false to imply that knowledge of how to draw implications from a series of statements by some sort of search mechanism is equally unknown as that of how to make an anti-gravity drive -- if by anti-gravity drive you mean some totally unknown form of physics, rather than just anything, such as human legs, that can push against gravity. It is unfair because there is a fair amount of knowledge about how to draw implications from sequences of statements. For example view Shastri's www.icsi.berkeley.edu/~shastri/psfiles/cogsci00.ps. Also Ben Goertzel has demonstrated a program that draws implications from statements contained in different medical texts. Ed Porter P.S., I have enclosed an inexact, but, at least to me, useful drawing I made of the type of search involved in understanding the multiple implications contained in the series of statements contained in Shastri's John fell in the Hallway. Tom had cleaned it. He was hurt example. Of course, what is most missing from this drawing are all the other, dead end, implications which do not provide a likely implication. Only one of such dead end is shown (the implication between fall and trip). As a result you don't sense how many dead ends have to be searched to find the implications which best explain the statements. EWP Well, bear in mind that I was not meaning the analogy to be *that* exact, or I would have given up on AGI long ago - I'm sure you know that I don't believe that getting an understanding system working is as impossible as getting an AG drive built. The purpose of my comment was to point to a huge gap in understanding, and the mistaken strategy of dealing with all the peripheral issues before having a clear idea how to solve the central problem. 
I cannot even begin to do justice, here, to the issues involved in solving the high dimensional problem of seeking to understand the meaning of text, which often involve multiple levels of implication, which would normally be accomplished by some sort of search of a large semantic space You talk as if an extension of some current strategy will solve this ... but it is not at all clear that any current strategy for solving this problem actually does scale up to a full solution to the problem. I don't care how many toy examples you come up with, you have to show a strategy for dealing with some of the core issues, AND reasons to believe that those strategies really will work (other than I find them quite promising). Not only that, but there at least some people (to wit, myself) who believe there are positive reasons to believe that the current strategies *will* not scale up. Richard Loosemore -Original Message- From: Richard Loosemore [mailto
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
MIKE TINTNER Isn't it obvious that the brain is able to understand the wealth of language by relatively few computations - quite intricate, hierarchical, multi-levelled processing, ED PORTER How do you find the right set of relatively few computations and/or models that are appropriate in a complex context without massive computation? Ed, Contrary to my PM, maybe I should answer this in more precise detail. My hypothesis is as follows: the brain does most of its thinking, and particularly adaptive thinking, by look-up not by blind search. How can you or I deal with: "Get that box out of this house now"? How is it, say, that I will be able to think of a series of ideas like get ten men to carry it, get a fork-lift truck to move it, use large levers, get hold of some heavy ropes ... etc etc. straight off the top of my head in well under a minute? All of those ideas are derived from visual/sensory images/schemas of large objects being moved. The brain does not, I suggest, consult digital/verbal lists or networks of verbal ideas about moving boxes out of houses or any similar set of verbal concepts (except v. occasionally). How then does the brain rapidly pull relevant large-object-moving shapes out of memory? (There are obviously more operations involved here than just shape search, but that's what I want to concentrate on). Now this is where I confess again to being a general techno-idiot (although I suspect that in this particular area most of you may be, too). My confused idea is that if you have a stack of shapes, there are ways to pull out/spot the relevant ones quickly without sorting through the stack one by one. I think Hawkins suggests something like this in On Intelligence. Maybe you can have thoughts about this. (Alternatively, the again confused idea occurs that certain neuronal areas, when stimulated with a certain shape, may be able to remember similar shapes that have been there before - v. 
loosely as certain metals, when heated, can remember/resume old forms.) Whatever, I am increasingly confident that the brain does work v. extensively by matching shapes physically (rather than by first converting them into digital/symbolic form). And I recommend here Sandra Blakeslee's latest book on body maps - the opening Ramachandran quote - When a reporter asked the famous biologist JBS Haldane what his biological studies had taught about God, Haldane replied: The creator, if he exists, must have an inordinate fondness for beetles, since there are more species of beetle than any other group of living creatures. By the same token, a neurologist might conclude that God is a cartographer. He must have an inordinate fondness for maps, for everywhere you look in the brain maps abound. If I'm headed even loosely in the right direction here, only analog computation will be able to handle the kind of rapid shape matching and searches I'm talking about, as opposed to the inordinately long, blind symbolic searches of digital computation. And you're going to need a whole new kind of computer. But none of you guys are prepared to even contemplate that. P.S. One important feature of shape searches by contrast with digital, symbolic searches is that you don't make mistakes. IOW when we think about a problem like getting the box out of a house, all our ideas, I suggest, will be to some extent relevant. They may not totally solve the problem, but they will fit some of the requirements, precisely because they have been derived by shape comparison. When a computer blindly searches lists of symbols, by contrast, most of them of course are totally irrelevant.
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Ed Porter wrote: RICHARD LOOSEMORE I cannot even begin to do justice, here, to the issues involved in solving the high dimensional problem of seeking to understand the meaning of text, which often involve multiple levels of implication, which would normally be accomplished by some sort of search of a large semantic space You talk as if an extension of some current strategy will solve this ... but it is not at all clear that any current strategy for solving this problem actually does scale up to a full solution to the problem. I don't care how many toy examples you come up with, you have to show a strategy for dealing with some of the core issues, AND reasons to believe that those strategies really will work (other than "I find them quite promising"). Not only that, but there are at least some people (to wit, myself) who believe there are positive reasons to believe that the current strategies *will* not scale up. ED PORTER I don't know if you read the Shastri paper I linked to or not, but it shows we do know how to do many of the types of implication which are used in NL. What he shows needs some extensions, so it is more generalized, but it and other known inference schemes explain a lot of how text understanding could be done. With regard to the scaling issue, it is a real issue. But there are multiple reasons to believe the scaling problems can be overcome. Not proofs, Richard, so you are entitled to your doubts. But open your mind to the possibilities they present. They include: -1-the likely availability of roughly brain level representational, computational, and interconnect capacities within the several hundred thousand to 1 million dollar range in seven to ten years. -2-the fact that human experience and representation does not explode combinatorially. Instead it is quite finite. It fits inside our heads. Thus, although you are dealing with extremely high dimensional spaces, most of that space is empty. 
There are known ways to deal with extremely high dimensional spaces while avoiding the exponential explosion made possible by such high dimensionality. Take the well-known Growing Neural Gas (GNG) algorithm. It automatically creates a relatively compact representation of a possibly infinite dimensional space, by allocating nodes to only those parts of the high dimensional space where there is stuff, or, if resources are more limited, where the most stuff is. Or take indexing: it takes one only to places in the hyperspace where something actually occurred or was thought about. One can have probabilistically selected hierarchical indexing (something like John Rose suggested) which makes indexing much more efficient. I'm sorry, but this is not addressing the actual issues involved. You are implicitly assuming a certain framework for solving the problem of representing knowledge ... and then all your discussion is about whether or not it is feasible to implement that framework (to overcome various issues to do with searches that have to be done within that framework). But I am not challenging the implementation issues, I am challenging the viability of the framework itself. My mind is completely open. But right now I raised one issue, and this is not answered. I am talking about issues that could prevent that framework from ever working no matter how much computing power is available. You must be able to see this: you are familiar with the fact that it is possible to frame a solution to certain problems in such a way that the proposed solution is KNOWN to not converge on an answer? An answer can be perfectly findable IF you use a different representation, but there are some ways of representing the problem that lead to a type of solution that is completely incomputable. This is an analogy: I suggest to you that the framework you have in mind when you discuss the solution of the AGI problem is like those broken representations. 
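The Growing Neural Gas idea Ed invokes - allocate representation only where the data actually lives - can be shown in a stripped-down sketch. This is not full GNG (the real algorithm also maintains an edge graph with edge aging, as in Fritzke's formulation); it keeps only the two ingredients relevant to the point: move the nearest node toward each sample, and periodically grow a new node near the highest-error node. The cluster positions, learning rate, and growth schedule are invented for illustration:

```python
import random

random.seed(0)

def sample():
    # Data occupies two small clusters of the plane; the rest is empty.
    cx, cy = random.choice([(0.0, 0.0), (10.0, 10.0)])
    return (cx + random.gauss(0, 0.5), cy + random.gauss(0, 0.5))

nodes = [[random.uniform(0, 10), random.uniform(0, 10)] for _ in range(2)]
error = [0.0, 0.0]

for step in range(1, 2001):
    x, y = sample()
    # Move the nearest node a little toward the sample.
    d2 = [(n[0] - x) ** 2 + (n[1] - y) ** 2 for n in nodes]
    i = d2.index(min(d2))
    error[i] += d2[i]
    nodes[i][0] += 0.05 * (x - nodes[i][0])
    nodes[i][1] += 0.05 * (y - nodes[i][1])
    # Every so often, grow a new node beside the node with the most error.
    if step % 500 == 0 and len(nodes) < 8:
        j = error.index(max(error))
        nodes.append([nodes[j][0] + 0.5, nodes[j][1] + 0.5])
        error[j] /= 2.0
        error.append(error[j])

# Every node ends up near one of the two occupied regions; the empty
# expanse of the space gets no representation at all.
print([[round(a, 1), round(b, 1)] for a, b in nodes])
```

Six nodes end up covering two clusters; nothing is spent on the empty regions, which is the compactness argument made above, reproduced at toy scale.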
-3- experiential computers focus most learning, most models, and most search on things that actually have happened in the past or on things that are in many ways similar to what has happened in the past. This tends to greatly reduce representational and search spaces. When such a system synthesizes or perceives new patterns that have never happened before, the system will normally have to explore large search spaces, but because of the capacity of brain-level hardware it will have considerable capability to do so. The type of hardware that will be available for human-level AGI in the next decade will probably have sustainable cross-sectional bandwidths of 10G to 1T messages/sec with 64-byte payloads per message. With branching tree activations and the fact that many messages will be regional, the total amount of messaging could well be 100G to 100T such msg/sec. Let's assume our hardware has 10T msg/sec and that we want to read 10 words a second. That would allow 1T msg/word. With a dumb spreading
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Ed Porter wrote:

RICHARD LOOSEMORE
I cannot even begin to do justice, here, to the issues involved in solving the high dimensional problem of seeking to understand the meaning of text, which often involves multiple levels of implication, and which would normally be accomplished by some sort of search of a large semantic space. You talk as if an extension of some current strategy will solve this ... but it is not at all clear that any current strategy for solving this problem actually does scale up to a full solution to the problem. I don't care how many toy examples you come up with; you have to show a strategy for dealing with some of the core issues, AND reasons to believe that those strategies really will work (other than "I find them quite promising"). Not only that, but there are at least some people (to wit, myself) who believe there are positive reasons to believe that the current strategies *will not* scale up.

ED PORTER
I don't know if you read the Shastri paper I linked to or not, but it shows we do know how to do many of the types of implication which are used in NL. What he shows needs some extension to be more generalized, but it and other known inference schemes explain a lot of how text understanding could be done. With regard to the scaling issue, it is a real issue. But there are multiple reasons to believe the scaling problems can be overcome. Not proofs, Richard, so you are entitled to your doubts. But open your mind to the possibilities they present. They include:

-1- the likely availability of roughly brain-level representational, computational, and interconnect capacities within the several-hundred-thousand to one-million-dollar range in seven to ten years.

-2- the fact that human experience and representation does not explode combinatorially. Instead it is quite finite; it fits inside our heads. Thus, although you are dealing with extremely high dimensional spaces, most of that space is empty.
There are known ways to deal with extremely high dimensional spaces while avoiding the exponential explosion made possible by such high dimensionality. Take the well-known Growing Neural Gas (GNG) algorithm. It automatically creates a relatively compact representation of a possibly infinite dimensional space by allocating nodes to only those parts of the high dimensional space where there is stuff, or, if resources are more limited, where the most stuff is. Or take indexing: it takes one only to places in the hyperspace where something actually occurred or was thought about. One can have probabilistically selected hierarchical indexing (something like John Rose suggested), which makes indexing much more efficient.

-3- experiential computers focus most learning, most models, and most search on things that actually have happened in the past or on things that are in many ways similar to what has happened in the past. This tends to greatly reduce representational and search spaces. When such a system synthesizes or perceives new patterns that have never happened before, the system will normally have to explore large search spaces, but because of the capacity of brain-level hardware it will have considerable capability to do so. The type of hardware that will be available for human-level AGI in the next decade will probably have sustainable cross-sectional bandwidths of 10G to 1T messages/sec with 64-byte payloads per message. With branching tree activations and the fact that many messages will be regional, the total amount of messaging could well be 100G to 100T such msg/sec. Let's assume our hardware has 10T msg/sec and that we want to read 10 words a second. That would allow 1T msg/word. With a dumb spreading activation rule that would allow you to: activate the 30K most probable implications; and for each of them the 3K most probable implications; and for each of them the 300 most probable implications; and for each of them the 30 most probable implications.
As dumb as this method of inferencing would be, it actually would make a high percentage of the appropriate multi-step inferences, particularly when you consider that the probability of activation at the successive stages would be guided by probabilities from other activations in the current context. Of course there are much more intelligent ways to guide activation than this. Also it is important to understand that at every level in many of the searches or explorations in such a system there will be guidance and limitations provided by similar models from past experience, greatly reducing the number of explorations that are required to produce reasonable results.

-4- Michael Collins a few years ago had what many AI researchers considered to be the best grammatical parser, which used the kernel trick to effectively match parse trees in, I think it was, 500K dimensions. By use of the kernel trick the actual computation usually was performed in a small subset of these dimensions
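Ed's fan-out arithmetic above can be checked with a quick sketch. All of the figures are his stated assumptions (10T msg/sec hardware, 10 words/sec reading, a 30K/3K/300/30 fan-out), not measured numbers:

```python
# Sketch of the message budget in Ed's fan-out arithmetic above.
# All figures are his stated assumptions, not measured numbers.
msgs_per_sec = 10 * 10**12          # assumed hardware: 10T msg/sec
words_per_sec = 10                  # target reading speed
budget = msgs_per_sec // words_per_sec          # 1T messages per word

# Four-level fan-out: 30K implications, then 3K, 300, 30 per node.
fanout = [30_000, 3_000, 300, 30]
level_sizes = []
nodes = 1
for f in fanout:
    nodes *= f
    level_sizes.append(nodes)       # activations at each level

total = sum(level_sizes)            # messages across all four levels
print(f"{total:,} of {budget:,}")   # 837,090,030,000 of 1,000,000,000,000
```

The four-level cascade costs about 0.84T messages per word, so it does fit inside the 1T msg/word budget Ed derives, with the last level dominating the total.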
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Matt: "Semantic models learn associations by proximity in the training text. The degree to which you associate snake and rope depends on how often these words appear near each other."

Correct me - but it's the old, old problem here, isn't it? Those semantic models/programs won't be able to form any *new* analogies, will they? Or understand newly minted analogies in texts? And I'm v. dubious about their powers to even form valid associations of much value in the ways you describe from existing texts. You're saying that there's a semantic model/program that can answer, if asked: yes - 'snake, chain, rope, spaghetti strand' is a legitimate/valid series of associations / yes, they fit together (based on previous textual analysis)? Or: the odd one out in 'snake/ chain/ cigarette/ rope' is 'cigarette'? I have yet to find or be given a single useful analogy drawn by computers (despite asking many times). The only kind of analogy I can remember here is Ed, I think, pointing to Hofstadter's analogies along the lines of xxyy is like . Not exactly a big deal. No doubt there must be more, but my impression is that in general computers are still pathetic here. - This list is sponsored by AGIRI: http://www.agiri.org/email To unsubscribe or change your options, please go to: http://v2.listbox.com/member/?member_id=8660244id_secret=71683316-d0bd3c
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
--- Ed Porter [EMAIL PROTECTED] wrote: We do not know the number and width of the spreading activation that is necessary for human level reasoning over world knowledge. Thus, we really don't know how much interconnect is needed and thus how large of a P2P net would be needed for impressive AGI. But I think it would have to be larger than say 10K nodes. In complex systems on the boundary between stability and chaos, the degree of interconnectedness per node is constant. Complex systems always evolve to this boundary because stable systems aren't complex and chaotic systems can't be incrementally updated. In my thesis ( http://cs.fit.edu/~mmahoney/thesis.html ) I did not estimate the communication bandwidth. But it is O(n log n) because the distance between nodes grows as O(log n). For each message sent or received, a node must also relay O(log n) messages. If the communication protocol is natural language text, then I am pretty sure our existing networks can handle it. -- Matt Mahoney, [EMAIL PROTECTED]
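Matt's O(n log n) scaling claim can be illustrated with a toy calculation. The per-node send rate below is an invented constant for illustration, not a figure from his thesis; only the log2(n) relay cost reflects his argument:

```python
import math

# Toy illustration of Matt's O(n log n) traffic estimate: if the
# distance between nodes grows as O(log n), each end-to-end message
# costs about log2(n) relay hops, so total network traffic scales
# as n * log(n) for n nodes each originating messages at a fixed rate.
def total_traffic(n, msgs_per_node_per_sec=1.0):
    hops = math.log2(n)             # relays per end-to-end message
    return n * msgs_per_node_per_sec * hops

for n in (1_024, 1_048_576):
    print(f"{n:>9,} nodes -> ~{total_traffic(n):,.0f} relayed msg/sec")
```

Doubling the network slightly more than doubles total traffic, which is why the relay overhead stays modest even for very large networks.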
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
--- Mike Tintner [EMAIL PROTECTED] wrote: Matt: "Semantic models learn associations by proximity in the training text. The degree to which you associate snake and rope depends on how often these words appear near each other." Correct me - but it's the old, old problem here, isn't it? Those semantic models/programs won't be able to form any *new* analogies, will they? Or understand newly minted analogies in texts? And I'm v. dubious about their powers to even form valid associations of much value in the ways you describe from existing texts. You're saying that there's a semantic model/program that can answer, if asked: yes - 'snake, chain, rope, spaghetti strand' is a legitimate/valid series of associations / yes, they fit together (based on previous textual analysis)?

Yes, because each adjacent pair of words has a high frequency of co-occurrence in a corpus of training text.

Or: the odd one out in 'snake/ chain/ cigarette/ rope' is 'cigarette'?

Yes, because cigarette does not have a high co-occurrence with the other words.

I have yet to find or be given a single useful analogy drawn by computers (despite asking many times). The only kind of analogy I can remember here is Ed, I think, pointing to Hofstadter's analogies along the lines of xxyy is like . Not exactly a big deal. No doubt there must be more, but my impression is that in general computers are still pathetic here.

This simplistic vector space model I described has been used to pass the word analogy section of the SAT exams. See: Turney, P., "Human Level Performance on Word Analogy Questions by Latent Relational Analysis" (2004), National Research Council of Canada, http://iit-iti.nrc-cnrc.gc.ca/iit-publications-iti/docs/NRC-47422.pdf -- Matt Mahoney, [EMAIL PROTECTED]
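The odd-one-out reasoning Matt describes can be sketched with a minimal co-occurrence model. The corpus below is invented for illustration; the model Turney actually used works on a far larger corpus and adds latent relational analysis on top of raw counts:

```python
from collections import Counter
from itertools import combinations

# Minimal sketch of the co-occurrence reasoning Matt describes.
# The corpus is invented for illustration only.
corpus = [
    "the snake coiled like a rope on the floor",
    "a chain is like a rope made of metal",
    "a spaghetti strand looks like a snake",
    "he lit a cigarette and smoked it",
    "the rope and the chain hung together",
]

def cooccurrences(sentences):
    """Count how often each pair of words shares a sentence."""
    counts = Counter()
    for sent in sentences:
        for pair in combinations(sorted(set(sent.split())), 2):
            counts[pair] += 1
    return counts

counts = cooccurrences(corpus)

def link(a, b):
    return counts[tuple(sorted((a, b)))]

words = ["snake", "chain", "cigarette", "rope"]
# Odd one out: lowest total co-occurrence with the other words.
scores = {w: sum(link(w, o) for o in words if o != w) for w in words}
print(min(scores, key=scores.get))  # cigarette
```

Because "cigarette" never shares a sentence with the other three words, its score is zero and it falls out as the odd one, exactly as Matt's argument predicts for a frequency-based model.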
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Mike -Original Message- From: Mike Tintner [mailto:[EMAIL PROTECTED] Sent: Monday, December 03, 2007 8:25 PM To: agi@v2.listbox.com Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

MIKE TINTNER: Isn't it obvious that the brain is able to understand the wealth of language by relatively few computations - quite intricate, hierarchical, multi-levelled processing?

ED PORTER: How do you find the right set of relatively few computations and/or models that are appropriate in a complex context without massive computation?

MIKE TINTNER: How then does the brain rapidly pull relevant large-object-moving shapes out of memory? (There are obviously more operations involved here than just shape search, but that's what I want to concentrate on.) Now this is where I confess again to being a general techno-idiot (although I suspect that in this particular area most of you may be, too). My confused idea is that if you have a stack of shapes, there are ways to pull out/spot the relevant ones quickly without sorting through the stack one by one. I think Hawkins suggests something like this in On Intelligence. Maybe you can have thoughts about this.

ED: One way is by indexing something by its features, but this is a form of search, which if done completely activates each occurrence of each feature searched for, and then selects the one or more patterns with the best activation score. Others on the list can probably name other methods. Another, used in perception, is to hierarchically match inputs against patterns that represent given shapes under different conditions.

MIKE TINTNER: (Alternatively, the again confused idea occurs that certain neuronal areas, when stimulated with a certain shape, may be able to remember similar shapes that have been there before - v. loosely as certain metals, when heated, can remember/resume old forms.) Whatever, I am increasingly confident that the brain does work v.
extensively by matching shapes physically (rather than by first converting them into digital/symbolic form). And I recommend here Sandra Blakeslee's latest book on body maps - the opening Ramachandran quote -

ED: There clearly is some shape matching in the brain.

MIKE TINTNER: P.S. One important feature of shape searches, by contrast with digital, symbolic searches, is that you don't make mistakes. IOW when we think about a problem like getting the box out of a house, all our ideas, I suggest, will be to some extent relevant. They may not totally solve the problem, but they will fit some of the requirements, precisely because they have been derived by shape comparison. When a computer blindly searches lists of symbols, by contrast, most of them of course are totally irrelevant.

ED: Yes, but there are a lot of types of thinking that cannot be done by shape alone, and shape itself is actually much more complicated than it sounds. There is shape, and shape distorted by perspective, and shape changed by bending, and shape changed by size. There is shape of objects, shape of trajectories, 2d shapes, 3d shapes. There are visual memories, where we don't really remember all the shapes, but instead remember the types of things that were there and fill in most of the actual shapes. In sum, it's a lot more complicated than just finding a matching photograph.
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
RICHARD LOOSEMORE= I'm sorry, but this is not addressing the actual issues involved. You are implicitly assuming a certain framework for solving the problem of representing knowledge ... and then all your discussion is about whether or not it is feasible to implement that framework (to overcome various issues to do with searches that have to be done within that framework). But I am not challenging the implementation issues, I am challenging the viability of the framework itself.

ED PORTER= So what is wrong with my framework? What is wrong with a system of recording patterns, and a method for developing compositions and generalities from those patterns, in multiple hierarchical levels, and for indicating the probabilities of certain patterns given certain other patterns, etc.? I know it doesn't genuflect before the altar of complexity. But what is wrong with the framework other than the fact that it is at a high level and thus does not explain every little detail of how to actually make an AGI work?

RICHARD LOOSEMORE= These models you are talking about are trivial exercises in public relations, designed to look really impressive, and filled with hype designed to attract funding, which actually accomplish very little. Please, Ed, don't do this to me. Please don't try to imply that I need to open my mind any more. The implication seems to be that I do not understand the issues in enough depth, and need to do some more work to understand your points. I can assure you this is not the case.

ED PORTER= Shastri's Shruti is a major piece of work. Although it is a highly simplified system, for its degree of simplification it is amazingly powerful. It has been very helpful to my thinking about AGI. Please give me some excuse for calling it a trivial exercise in public relations. I certainly have not published anything as important. Have you?
The same for Mike Collins's parser which, at least several years ago, I was told by multiple people at MIT was considered one of the most accurate NL parsers around. Is that just a trivial exercise in public relations? With regard to Hecht-Nielsen's work, if it does half of what he says it does it is pretty damned impressive. It is also a work I think about often when thinking how to deal with certain AI problems. Richard, if you insultingly dismiss such valid work as trivial exercises in public relations it sure as hell seems as if either you are quite lacking in certain important understandings -- or you have a closed mind -- or both. Ed Porter
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Richard Loosemore= None of the above is relevant. The issue is not whether toy problems set within the current paradigm can be done with this or that search algorithm, it is whether the current paradigm can be made to converge at all for non-toy problems.

Ed Porter= Richard, I wouldn't call a state-of-the-art NL parser that matches parse trees in 500K dimensions a toy problem. Yes, it is much less than a complete human brain, but it is not a toy problem. With regard to Hecht-Nielsen's sentence completion program, it is arguably a toy problem, but it operates extremely efficiently (i.e., converges) in an astronomically large search space, with a significant portion of that search space having some arguable activation. The fact that there is such efficient convergence in such a large search space is meaningful, and the fact that you just dismiss it, as you did in your last email, as a trivial publicity stunt is also meaningful. Ed Porter
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Matt, In my Mon 12/3/2007 8:17 PM post to John Rose, from which you are probably quoting below, I discussed the bandwidth issues. I am assuming nodes directly talk to each other, which is probably overly optimistic, but they still are limited by the fact that each node can only receive somewhere roughly around 100 128-byte messages a second. Unless you have a really big P2P system, that just isn't going to give you much bandwidth. If you had 100 million P2P nodes it would. Thus, a key issue is how many participants an AGI-at-Home P2P system is going to get. I mean, what would motivate the average American, or even the average computer geek, to turn over part of his computer to it? It might not be an easy sell for more than several hundred or several thousand people, at least until it could do something cool, like index their videos for them, be a funny chat bot, or something like that. Ed Porter

-Original Message- From: Matt Mahoney [mailto:[EMAIL PROTECTED] Sent: Monday, December 03, 2007 8:51 PM To: agi@v2.listbox.com Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research] --- Ed Porter [EMAIL PROTECTED] wrote: We do not know the number and width of the spreading activation that is necessary for human level reasoning over world knowledge. Thus, we really don't know how much interconnect is needed and thus how large of a P2P net would be needed for impressive AGI. But I think it would have to be larger than say 10K nodes. In complex systems on the boundary between stability and chaos, the degree of interconnectedness per node is constant. Complex systems always evolve to this boundary because stable systems aren't complex and chaotic systems can't be incrementally updated. In my thesis ( http://cs.fit.edu/~mmahoney/thesis.html ) I did not estimate the communication bandwidth. But it is O(n log n) because the distance between nodes grows as O(log n). For each message sent or received, a node must also relay O(log n) messages.
If the communication protocol is natural language text, then I am pretty sure our existing networks can handle it. -- Matt Mahoney, [EMAIL PROTECTED]
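Ed's per-node receive limit implies aggregate figures that are easy to check. Both constants below are his estimates from the post above (roughly 100 received messages/sec per node, 128 bytes per message), not measured values:

```python
# Aggregate P2P message budget under Ed's assumed per-node limit:
# ~100 messages/sec received per node, 128 bytes per message.
def aggregate(nodes, msgs_per_node=100, bytes_per_msg=128):
    msgs = nodes * msgs_per_node
    return msgs, msgs * bytes_per_msg

for n in (10_000, 100_000_000):
    msgs, nbytes = aggregate(n)
    print(f"{n:>11,} nodes: {msgs:,} msg/sec, {nbytes / 1e9:.2f} GB/sec")
```

This is the gap Ed is pointing at: a 10K-node network yields roughly a million messages a second in total, while his 100-million-node figure is what it takes to reach the trillions-per-second regime his earlier fan-out arithmetic assumed.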
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Matt, In addition to my last email, I don't understand what you were saying below about complexity. Are you saying that as a system becomes bigger it naturally becomes unstable, or what? Ed Porter

-Original Message- From: Matt Mahoney [mailto:[EMAIL PROTECTED] Sent: Monday, December 03, 2007 8:51 PM To: agi@v2.listbox.com Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research] --- Ed Porter [EMAIL PROTECTED] wrote: We do not know the number and width of the spreading activation that is necessary for human level reasoning over world knowledge. Thus, we really don't know how much interconnect is needed and thus how large of a P2P net would be needed for impressive AGI. But I think it would have to be larger than say 10K nodes. In complex systems on the boundary between stability and chaos, the degree of interconnectedness per node is constant. Complex systems always evolve to this boundary because stable systems aren't complex and chaotic systems can't be incrementally updated. In my thesis ( http://cs.fit.edu/~mmahoney/thesis.html ) I did not estimate the communication bandwidth. But it is O(n log n) because the distance between nodes grows as O(log n). For each message sent or received, a node must also relay O(log n) messages. If the communication protocol is natural language text, then I am pretty sure our existing networks can handle it. -- Matt Mahoney, [EMAIL PROTECTED]
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
On Thursday 29 November 2007, Ed Porter wrote: Somebody (I think it was David Hart) told me there is a shareware distributed web crawler already available, but I don't know the details, such as how good or fast it is. http://grub.org/ Previous owner went by the name of 'kordless'. I found him on Slashdot. - Bryan
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Ed Porter wrote:

Richard Loosemore= None of the above is relevant. The issue is not whether toy problems set within the current paradigm can be done with this or that search algorithm, it is whether the current paradigm can be made to converge at all for non-toy problems.

Ed Porter= Richard, I wouldn't call a state-of-the-art NL parser that matches parse trees in 500K dimensions a toy problem. Yes, it is much less than a complete human brain, but it is not a toy problem.

This is a toy problem. Parsing is a deep problem? Do you understand the relationship between parsing NL and extracting semantics? Do you understand what this great NL parser would do if confronted with a syntactically incorrect but contextually meaningful sentence? Has it been analysed to see what its behavior is on ambiguous sentences? Could it learn to cope with someone speaking a pidgin version of NL, or would someone have to write an entire grammar for the language before the system could even start parsing it? Can it generate syntactically correct sentences that express an idea? Can it cope with speech errors, recognising the nature of the error and backfilling, or does it just collapse with no viable parse? Would the parser have to be completely rewritten in the future when someone else finally solves the problem of representing the semantics of language? Finally, if you are impressed by the claim about 500K dimensions then what can I say? Can you explain to me in what sense it matches parse trees in 500K dimensions, and why that is so impressive? Perhaps I am being unnecessarily hard on you, Ed.
I don't mean to be personally rude, you know, but it is sometimes exhausting to have someone trying to teach you how to suck eggs. Richard Loosemore

With regard to Hecht-Nielsen's sentence completion program, it is arguably a toy problem, but it operates extremely efficiently (i.e., converges) in an astronomically large search space, with a significant portion of that search space having some arguable activation. The fact that there is such efficient convergence in such a large search space is meaningful, and the fact that you just dismiss it, as you did in your last email, as a trivial publicity stunt is also meaningful. Ed Porter
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Ed Porter wrote:

RICHARD LOOSEMORE= I'm sorry, but this is not addressing the actual issues involved. You are implicitly assuming a certain framework for solving the problem of representing knowledge ... and then all your discussion is about whether or not it is feasible to implement that framework (to overcome various issues to do with searches that have to be done within that framework). But I am not challenging the implementation issues, I am challenging the viability of the framework itself.

ED PORTER= So what is wrong with my framework? What is wrong with a system of recording patterns, and a method for developing compositions and generalities from those patterns, in multiple hierarchical levels, and for indicating the probabilities of certain patterns given certain other patterns, etc.? I know it doesn't genuflect before the altar of complexity. But what is wrong with the framework other than the fact that it is at a high level and thus does not explain every little detail of how to actually make an AGI work?

RICHARD LOOSEMORE= These models you are talking about are trivial exercises in public relations, designed to look really impressive, and filled with hype designed to attract funding, which actually accomplish very little. Please, Ed, don't do this to me. Please don't try to imply that I need to open my mind any more. The implication seems to be that I do not understand the issues in enough depth, and need to do some more work to understand your points. I can assure you this is not the case.

ED PORTER= Shastri's Shruti is a major piece of work. Although it is a highly simplified system, for its degree of simplification it is amazingly powerful. It has been very helpful to my thinking about AGI. Please give me some excuse for calling it a trivial exercise in public relations. I certainly have not published anything as important. Have you?
The same for Mike Collins's parser which, at least several years ago, I was told by multiple people at MIT was considered one of the most accurate NL parsers around. Is that just a trivial exercise in public relations? With regard to Hecht-Nielsen's work, if it does half of what he says it does it is pretty damned impressive. It is also a work I think about often when thinking how to deal with certain AI problems. Richard, if you insultingly dismiss such valid work as trivial exercises in public relations it sure as hell seems as if either you are quite lacking in certain important understandings -- or you have a closed mind -- or both.

Ed, You have no idea of the context in which I made that sweeping dismissal. If you have enough experience of research in this area you will know that it is filled with bandwagons, hype and publicity-seeking. Trivial models are presented as if they are fabulous achievements when, in fact, they are just engineered to look very impressive but actually solve an easy problem. Have you had experience of such models? Have you been around long enough to have seen something promoted as a great breakthrough even though it strikes you as just a trivial exercise in public relations, and then watched history unfold as the great breakthrough leads to absolutely nothing at all, and is then quietly shelved by its creator? There is a constant ebb and flow of exaggeration and retreat, exaggeration and retreat. You are familiar with this process, yes? This entire discussion baffles me. Does it matter at all to you that I have been working in this field for decades? Would you go up to someone at your local university and tell them how to do their job? Would you listen to what they had to say about issues that arise in their field of expertise, or would you consider your own opinion entirely equal to theirs, with only a tiny fraction of their experience?
Richard Loosemore
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
ED: Yes, but there are a lot of types of thinking that cannot be done by shape alone, and shape itself is actually much more complicated than it sounds. There is shape, and shape distorted by perspective, and shape changed by bending, and shape changed by size. There is shape of objects, shape of trajectories, 2d shapes, 3d shapes. There are visual memories, where we don't really remember all the shapes, but instead remember the types of things that were there and fill in most of the actual shapes. In sum, it's a lot more complicated than just finding a matching photograph.

Ed, I am not suggesting that shape matching is everything, merely that it is central to a great many of the brain's operations - and to its ability to search rapidly and briefly and locate analogical ideas (and if that's true, as I believe it is, then, sorry, AGI's stuckness is going to continue for a long time yet). The reason I'm replying, though, is that a further thought occurred to me. Essentially I've been suggesting that the brain has some means to locate matching shapes quickly in very few operations, where a digital computer laboriously searches through long lists or networks of symbols in a great many operations. One v. crude idea for the mechanism I suggested was that neuronal areas somehow retain memories of shapes, which can be stimulated by similar incoming shapes - so that analogies can be drawn with extreme rapidity, more or less on the spot. [Spot checks] It's occurred to me that this may well happen over and over throughout the body-related brain areas. The same body areas that today feel stiff/expanded/cold felt loose/contracted/warm yesterday. The same hand that was a ball, and many other shapes, is now a fist.
So perhaps these memories are all somehow laid on top of each other in the same brain areas. Map upon map upon map. Just an extremely rough idea, but I think it does go some way to showing how shape matching could indeed be extremely rapid and effective in the brain, by contrast with computers' blind, disembodied search. It follows, BTW, re your points above, that the same brain areas will also retain many morphic variations on the same basic shapes - objects/cups seen, say, moving, from different angles, zooming in and out, etc. And if it's true, as I believe, that the brain uses loose, highly flexible templates for visual object perception - then that too should mean that it will easily and rapidly be able to connect closely related shapes as in snake/chain/rope/spaghetti strand. Analogies and perception are interwoven for the brain. Blakeslee makes a good deal of the brain using flexible, morphic body maps. Thanks for your reply. Further thoughts re mechanisms welcome. As Blakeslee points out, this whole area is just beginning to open up.
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Ed, Well, it'd be nice having a supercomputer, but P2P is a poor man's supercomputer, and beggars can't be choosers. Honestly, the type of AGI that I have been formulating in my mind has not been at all closely related to simulating neural activity through orchestrating partial and mass activations at low frequencies, and I had been avoiding those contagious cog-sci memes on purpose. But your exposé on the subject is quite interesting; I wasn't aware that that is how things have been done. But getting more than a few thousand P2P nodes is difficult. Going from 10K to 20K nodes and up, it gets more difficult, to the point of being prohibitively expensive if not impossible without extreme luck. There are ways to do it, but according to your calculations the supercomputer may be the wiser choice, as going out and scrounging up funding for that would be easier. Still, though (besides working on my group-theory-heavy design), exploring the crafting and chiseling of the activation model you are talking about onto the P2P network could be fruitful. I feel that through a number of up-front and unfortunately complicated design changes/adaptations, the activation orchestrations could be improved, thus bringing down the message-rate requirements - reducing activation requirements, depths and frequencies - through a sort of computational-resource-topology-consuming, self-organizational design molding. You do indicate some dynamic resource adaptation and things like intelligent inference-guiding schemes in your description, but it doesn't seem like it melts enough into the resource space. But having a design be less static risks excessive complications... A major problem, though, with P2P and the activation methodology is that there are so many variances in the latencies and availability that serious synchronicity/simultaneity issues would exist, such that even more messaging might be required.
Since there are so many variables in public P2P, empirical data would also be necessary to get a gander at feasibility. I still feel strongly that the way to do AGI P2P (with public P2P as the core, not an add-on) is to understand the grid, and build the AGI design based on that and on what it will be in a few years, instead of taking a design and morphing it to the resource space. That said, there are only finitely many designs that will work, so the number of choices is few. John -Original Message- From: Ed Porter [mailto:[EMAIL PROTECTED] Sent: Monday, December 03, 2007 6:17 PM To: agi@v2.listbox.com Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research] John, You raised some good points. The problem is that the total number of messages/sec that can be received is relatively small. It is not as if you are dealing with a multidimensional grid or toroidal net, in which spreading tree activation can take advantage of the fact that the total parallel bandwidth for regional messaging can be much greater than the cross-sectional bandwidth. In a system where each node is a server-class node with multiple processors and 32 or 64 GBytes of RAM, much of which is allocable to representation, sending messages to local indices on each machine could fairly efficiently activate all occurrences of something in a 32 to 64 TByte knowledge base with a max of 1K internode messages, if there were only 1K nodes. But in a PC-based P2P system the ratio of nodes to representation space is high, and the total number of 128-byte messages/sec that can be received is limited to about 100, so neither method of trying to increase the number of patterns that can be activated with the given interconnect of the network buys you as much. Human-level context sensitivity arises because a large number of things that can depend on a large number of things in the current context are made aware of those dependencies.
This takes a lot of messaging, and I don't see how a P2P system where each node can only receive about 100 relatively short messages a second is going to make this possible unless you had a huge number of nodes. As Richard Loosemore said in his Mon 12/3/2007 12:57 PM post: It turns out that within an extremely short time of the first word being seen, a very large number of other words have their activations raised significantly. Now, whichever way you interpret these (so-called priming) results, one thing is not in doubt: there is massively parallel activation of lexical units going on during language processing. With special software, a $10M supercomputer cluster with 1K nodes, 32 TBytes of RAM, and a dual-ported 20Mb InfiniBand interconnect send about 1
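Ed's bottleneck argument here lends itself to a quick back-of-envelope check. A minimal sketch in Python: the ~100 msgs/sec per-node receive rate is Ed's figure, while the activation counts are hypothetical numbers chosen only to show the scale:

```python
# Back-of-envelope check of the fan-out bottleneck Ed describes: spreading
# activation generates messages in bulk, but each P2P node can only absorb
# a limited number of small messages per second.

def nodes_needed(activations_per_step, steps_per_sec, msgs_per_activation,
                 node_recv_rate):
    """Minimum node count so the total inbound message load fits capacity."""
    total_msgs_per_sec = activations_per_step * steps_per_sec * msgs_per_activation
    return total_msgs_per_sec / node_recv_rate

# Ed's figure: a PC peer receives roughly 100 x 128-byte messages/sec.
# The activation numbers below are invented, purely to illustrate scale.
needed = nodes_needed(activations_per_step=1_000_000, steps_per_sec=10,
                      msgs_per_activation=1, node_recv_rate=100)
print(f"{needed:,.0f} nodes needed just to absorb the traffic")  # 100,000
```

Even under these modest assumptions the node count lands far above the 10K-20K that John calls prohibitively expensive to recruit.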
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Ed, Building up parse trees and word-sense models - let's say that would be a first step. And then say after a while this was accomplished and running on some peers. What would the next theoretical step be? Also, what would you try to accomplish if there was more bandwidth and more computing power? The reason I ask is that a public peer network can be constructed in many ways, and a subset of nodes can be higher bandwidth - 10, 20, 30+ mbits - and some legs can be very high, approaching 400 mbits. Computing power doesn't get that high 'cept for a small subset where you have multiproc/multicore servers, but these are rare. Also, even with the basic lower-end, lower-quality nodes, including DSL, etc., the computational resource topology can be molded and optimized for particular computational goal structures. John -Original Message- From: Ed Porter [mailto:[EMAIL PROTECTED] Sent: Saturday, December 01, 2007 6:41 PM To: agi@v2.listbox.com Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research] John, I tested Exeter, NH to LA at 5371kbs download, and 362Kbs upload. Strangely, my scores were slightly slower to NYC. Just throwing out ideas: for example, AGI-at-home PC's in the net could crawl the web looking for reasonable NL text. Use current NL tools to guess parse and word sense. For each word in text, send it and its surrounding text, part-of-speech labeling, surrounding parse tree, and word-sense guess to another P2P node that specializes in that word in similar contexts, and separately to another P2P node that specializes in similar parse trees. These specialist nodes could then develop statistical models for word senses based on clustering or some other technique. Then over time the statistical models would get sent down to the reading nodes, and this EM cycle could be constantly repeated.
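Ed's crawl → guess → specialist → cluster → redistribute cycle might be sketched, much simplified, as a single-process loop. Everything here is a placeholder: real reader and specialist nodes would be separate peers exchanging messages, and the one-line "clustering" rule stands in for whatever statistical technique the specialist nodes would actually use:

```python
from collections import Counter, defaultdict

# Much-simplified, single-process sketch of the distributed word-sense loop:
# "reader" nodes emit (word, context) observations; a "specialist" node per
# word groups contexts into crude sense models; models are sent back down
# and the cycle repeats.

def specialist_update(observations):
    """Group the contexts seen for one word into crude senses.
    The grouping key (first context word) is a trivial stand-in for a real
    clustering/EM step."""
    senses = defaultdict(Counter)
    for context in observations:
        key = context[0]               # placeholder sense key
        senses[key].update(context)
    return {sense: counts.most_common(3) for sense, counts in senses.items()}

# Reader-node output for the ambiguous word "bank" (invented examples):
obs = [["river", "water"], ["money", "loan"], ["river", "fishing"]]
model = specialist_update(obs)
```

In the distributed version, `obs` would arrive as messages from many reading nodes, and `model` would be the statistical model periodically sent back down to them.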
Of course, without the cross-sectional bandwidth of proper AGI hardware, you are going to be severely limited from doing a lot of the things you would really like to be able to do. But I think you should be able to come up with pretty good word-sense models. Ed Porter -Original Message- From: John G. Rose [mailto:[EMAIL PROTECTED] Sent: Friday, November 30, 2007 2:55 PM To: agi@v2.listbox.com Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research] Ed, That is probably a good rough estimate. There are more headers for the more frequently transmitted smaller messages, but a 16-byte header may be a bit large. Here is a speedtest link - http://www.speedtest.net/ My Comcast cable from Denver to NYC tests at 3537 kb/sec DL and 1588 kb/sec UL, much larger than the calculation's 256 kb/sec. The variance between tests to the same location is quite large on the DL side, but UL is relatively stable. Saturating either DL or UL would impact the other. You can get higher efficiencies if you use UDP transmission without message serialization. Also you can do things like compression, only sending changes, etc. Distributed crawling with NL learning fits the scenario well, since nodes download at higher speeds, process the download into a smaller dataset, then UL the results to the server or share with peers. When one peer shares with many peers you hit the UL limit fast, though, so it has to be managed. And you have to figure out how the knowledge will be spread out - server-centric, shared, hybrid... As the knowledge size increases with peer storage, you have to come up with distributed indexes. John -Original Message- From: Ed Porter [mailto:[EMAIL PROTECTED] Sent: Friday, November 30, 2007 12:06 PM To: agi@v2.listbox.com Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research] John, Thanks.
I guess that means an AGI-at-home system could be both uploading and receiving about 27 1K msgs/sec if it wasn't being used for anything else and the networks weren't backed up in its neck of the woods. Presumably the number for, say, 128-byte messages would be, roughly, 8 times faster (minus some percent for the latency associated with each message, so let's say roughly about 5 times faster, or 135 msgs/sec). Is that reasonable? So, it seems, for example, it would be quite possible to do estimation/maximization-type NL learning in a distributed manner with a lot of cable-box-connected PC's and a distributed web crawler. Ed Porter -Original Message- From: John G. Rose [mailto:[EMAIL PROTECTED] Sent: Friday, November 30, 2007 12:33 PM To: agi@v2.listbox.com Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research] Hi Ed, If the peer is not running other apps utilizing the network, it could do the same. Typically a peer first needs to locate other peers. There may be servers involved, but these are just for the few bytes transmitted for public IP address discovery
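Ed's figures just above (about 27 x 1 KB msgs/sec, and roughly 5x that for 128-byte messages) can be sanity-checked against John's assumptions elsewhere in the thread - a 256 kbit/s upload, 16-byte headers, and ~10% TCP overhead:

```python
# Message-rate estimate from the thread's assumed figures: 256 kbit/s
# upstream, a 16-byte header per message, ~10% lost to TCP retransmits
# and latency.

UPLOAD_BPS = 256_000   # bits/sec of cable upstream
HEADER = 16            # bytes of header per message
EFFICIENCY = 0.90      # keep ~90% after TCP overhead

def msgs_per_sec(payload_bytes):
    return EFFICIENCY * UPLOAD_BPS / ((payload_bytes + HEADER) * 8)

rate_1k = msgs_per_sec(1024)    # ~27.7, matching the figure in the thread
rate_128 = msgs_per_sec(128)    # ~200 on message size alone
print(f"{rate_1k:.1f} and {rate_128:.1f} msgs/sec")
```

Size alone gives roughly a 7x speedup for 128-byte messages; Ed's rounding down to ~5x (about 135 msgs/sec) allows for additional per-message latency beyond raw size.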
RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
Hi Ed, If the peer is not running other apps utilizing the network, it could do the same. Typically a peer first needs to locate other peers. There may be servers involved, but these are just for the few bytes transmitted for public IP address discovery, as many (or most) peers reside hidden behind NATs. DNS names also require lookups, but these are just for doing the initial match of hostname to IP address, if DNS is used at all. We're just talking basic P2P, one peer talking to one other peer, nothing complicated. As you can imagine, P2P can take on many flavors as the number of peers increases. John -Original Message- From: Ed Porter [mailto:[EMAIL PROTECTED] Sent: Friday, November 30, 2007 10:10 AM To: agi@v2.listbox.com Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research] John, Thanks. Can P2P transmission match the same roughly 27 1K msg/sec rate as the client-to-server upload you described? Ed Porter -Original Message- From: John G. Rose [mailto:[EMAIL PROTECTED] Sent: Thursday, November 29, 2007 11:40 PM To: agi@v2.listbox.com Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research] OK, for a guesstimate, take a half-way decent cable connection - say Comcast on a good day, with DL of 4 mbits max and UL of 256 kbits max - with an undiscriminated protocol, an unknown TCP-based protocol, talking to a fat-pipe, low-latency server. Assume, say, 16-byte message header wrappers for all of your 128-, 1024- and 10K-byte message sizes. So upload is 256 kbits; go ahead and saturate it fully with any of your 128+16-byte, 1024+16-byte, and 10K+16-byte packet streams. Using TCP for reliability, assume some overhead - say, subtract 10% from the saturated value for retransmits and latency. What are we left with? Assume the PC has a 1-gigabit NIC, so it is usually waiting to squeeze out the 256 kbits of cable upload capacity. Oh right, this is just upstream; DL is 4 mbits cable into the PC NIC of 1 gigabit (assume 60% saturation), so there is ample PC NIC BW for this. ...
So for 256 kbits/sec = 256,000 bits/sec: (256,000 bits/sec) / ((1024 + 16) bytes x 8 bits/byte) = 30.769 messages/sec. Then 30.769 messages/sec - 10% = 27.692 messages/sec. About 27.692 messages per sec for the 1024-byte message upload stream. Download ≈ 16x UL = 443.072 messages/sec. Do my calculations look right? Note: some Comcast cable connections allow as much as 1.4 mbits upload. UL is always way less than DL (dependent on protocol). Other cable companies are similar; it depends on the company and geographic region... John -Original Message- From: Ed Porter [mailto:[EMAIL PROTECTED] Sent: Thursday, November 29, 2007 6:50 PM To: agi@v2.listbox.com Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research] John, Somebody (I think it was David Hart) told me there is a shareware distributed web crawler already available, but I don't know the details, such as how good or fast it is. How fast could P2P communication be done on one PC, on average, both sending upstream and receiving downstream from servers with fat pipes? Roughly how many msgs a second for cable-connected PC's, say at 128-byte, 1024-byte, and 10K-byte message sizes? Decent guesstimates on such numbers would help me think about what sort of interesting distributed NL learning tasks could be done with an AGI-at-Home network. (Of course, once it showed any promise Google would start doing it a thousand times faster, but at least it would be open source.) Ed Porter -Original Message- From: John G. Rose [mailto:[EMAIL PROTECTED] Sent: Thursday, November 29, 2007 8:31 PM To: agi@v2.listbox.com Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research] Ed, That is the HTTP protocol; it is a client-server request/response communication. Your browser asked for the contents at http://www.nytimes.com. The NY Times server(s) dumped the response stream data to your external IP address.
You probably have a NAT'd cable address, and are NAT'd again by your local router (if you have one). This communication is mainly one-way - except for your original few bytes of HTTP request. For a full ack-nack, real-time, dynamically addressed protocol there is more involved, but say OpenCog could be set up to act as an HTTP server and you could have an HTTP client (browser or whatever) for simplicity in communications. HTTP is very firewall-friendly since it is universally used on the internet. A distributed web crawler is a stretch, though; the communications is more complicated. John -Original Message- From: Ed Porter [mailto:[EMAIL PROTECTED] Sent: Thursday, November 29, 2007 6:13 PM To: agi@v2.listbox.com Subject: RE: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
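John's point that HTTP is firewall-friendly and easy to set up can be illustrated with a minimal sketch: one node serving messages over plain HTTP, another posting to it. The endpoint, JSON message format, and ack scheme are invented for illustration - nothing here is from OpenCog:

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# One peer acts as a plain HTTP server (firewall-friendly, as John notes);
# another sends it a message with an ordinary HTTP POST.

class PeerHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers["Content-Length"]))
        msg = json.loads(body)
        reply = json.dumps({"ack": msg.get("id")}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)

    def log_message(self, *args):  # keep the demo quiet
        pass

server = ThreadingHTTPServer(("127.0.0.1", 0), PeerHandler)  # port 0: any free port
threading.Thread(target=server.serve_forever, daemon=True).start()

# The "client" peer posts one message and reads back the ack:
req = urllib.request.Request(
    f"http://127.0.0.1:{server.server_address[1]}/msg",
    data=json.dumps({"id": 42, "payload": "hello"}).encode(),
    headers={"Content-Type": "application/json"})
with urllib.request.urlopen(req) as resp:
    result = json.loads(resp.read())
server.shutdown()
print(result)  # {'ack': 42}
```

Because the exchange is ordinary HTTP, it passes NATs and firewalls the same way a browser request does, which is the whole appeal for a public P2P deployment.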
Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]
RL: However, I have previously written a good deal about the design of different types of motivation system, and my understanding of the likely situation is that by the time we had gotten the AGI working, its motivations would have been arranged in such a way that it would *want* to be extremely cooperative. You do keep saying this. An autonomous mobile agent that did not have fundamentally conflicting emotions about each and every activity and part of the world would not succeed and survive. An AGI that trusted and cooperated with every human would not succeed and survive. Conflict is essential in a world fraught with risks, where time and effort can be wasted, essential needs can be neglected, and life and limb are under more or less continuous threat. Conflict is as fundamental and essential to living creatures and any emotional system as gravity is to the physical world. (But I can't recall any mention of it in your writings about emotions.) No one wants to be extremely cooperative with anybody. Everyone wants and needs a balance of give-and-take. (And right away, an agent's interests and emotions of giving must necessarily conflict with their emotions of taking.) Anything approaching a perfect balance of interests between extremely complex creatures/psychoeconomies with extremely complex interests is quite impossible - hence the simply massive literature dealing with the massive reality of relationship problems. And all living creatures have them. Obviously, living creatures can have highly cooperative and smooth relationships - but they tend to be in the small minority. Ditto relationships between humans and pets. And there is no reason to think any different odds would apply to artificial and living creatures. (Equally, extremely uncooperative, aggressive relationships also tend to be in the minority, and similar odds should apply there.) P.S. Perhaps the balance of cooperative/uncooperative relationships on this forum might give representative odds?!
:)