Re: Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize
Hi,

> Agreed, however, you previously referred to "background information that an
> intelligent program has inferred from previous inputs". Information inferred
> from previous inputs is certainly included in the compression size (either
> as part of the program or as a necessary resource to the program's
> correct/optimal operation).

Mark, I'm not sure I fully grok the context of this passage, but it seems to me that an adaptive compression program could infer information from each file it sees, and then store this information in its own memory -- and then use this information to figure out how to do excellent file compression on new files, but without storing much of this information in any of the new files it compresses.

For example, a huge knowledge base about the world could be learned by a software program reading a lot of texts. This KB would be stored in the program's main memory and would help it compress future texts, but in each future text it compressed, only a tiny amount of this knowledge would be embodied...

ben

---
To unsubscribe, change your address, or temporarily deactivate your subscription, please go to http://v2.listbox.com/member/[EMAIL PROTECTED]
Re: Goertzel/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize
Hi Ben,

I agree with everything that you're saying; however, looking at the specific task:

  Create a compressed version (self-extracting archive) of the 100MB file enwik8 of less than 18MB. More precisely:
  a. Create a Linux or Windows executable archive8.exe of size S < L := 18'324'887 = previous record.
  b. If run, it produces (without input from other sources) a 10^8 byte file data8 that is identical to enwik8.
  c. If we can verify your claim, you are eligible for a prize of 50'000€×(1-S/L). Minimum claim is 500€.

. . . . there clearly isn't the opportunity for it to store knowledge from other/previous files except in its executable, since it explicitly says "without input from other sources" -- and the size of the executable counts as part of the compressed size.

Mark
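The payout rule quoted above is simple enough to sketch in a few lines (the function name is my own; the constants are taken from the quoted rules):

```python
# Sketch of the quoted payout rule: prize = 50'000 EUR x (1 - S/L), where S is
# the submitted self-extracting archive size and L = 18'324'887 bytes is the
# previous record. The function name is invented for illustration.
def hutter_prize_eur(S: int, L: int = 18_324_887, pool: float = 50_000.0) -> float:
    """Payout in euros for an archive of S bytes (0 if there is no improvement)."""
    if S >= L:
        return 0.0  # no improvement over the record, no prize
    return pool * (1.0 - S / L)
```

So an archive 10% smaller than the record earns roughly 5'000€, and per the quoted rules claims below the 500€ minimum are not payable.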
Re: Goertzel/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize
Yes, but the compression software could have learned stuff before trying the Hutter Challenge, via compressing a bunch of other files ... and storing the knowledge it learned via this experience in its long-term memory...

-- Ben
Re: Mahoney/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize
>> I don't see any point in this debate over lossless vs. lossy compression

Let's see if I can simplify it.

The stated goal is compressing human knowledge. The exact same knowledge can always be expressed in a *VERY* large number of different bit strings. Not being able to reproduce the exact bit string is lossy compression when viewed from the bit viewpoint, but can be lossless from the knowledge viewpoint. Therefore, reproducing the bit string is an additional requirement above and beyond the stated goal, and I strongly believe that this additional requirement will necessitate a *VERY* large amount of additional work not necessary for the stated goal. In addition, by information theory, reproducing the exact bit string will require additional information beyond the knowledge contained in it (since numerous different strings can encode the same knowledge). Assuming optimal compression, also by information theory, this additional information will add to the compressed size (i.e. lead to a less optimal result).

So the question is: "Given that bit-level reproduction is harder, not necessary for knowledge compression/intelligence, and doesn't allow for the same degree of compression, why make life tougher when it isn't necessary for your stated purposes and makes your results (i.e. compression) worse?"

- Original Message -
From: Matt Mahoney
To: agi@v2.listbox.com
Sent: Tuesday, August 15, 2006 12:55 AM
Subject: Re: Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize

Where will the knowledge to compress text come from? There are 3 possibilities:

1. externally supplied, like the lexical models (dictionaries) for paq8h and WinRK.
2. learned from the input in a separate pass, like xml-wrt|ppmonstr.
3. learned online in one pass, like paq8f and slim.

These all have the same effect on compressed size. In the first case, you increase the size of the decompressor.
In the second, you have to append the model you learned from the first pass to the compressed file so it is available to the decompressor. In the third case, compression is poor at the beginning. From the viewpoint of information theory, there is no difference in these three approaches. The penalty is the same.

To improve compression further, you will need to model semantics and/or syntax. No compressor currently does this. I think the reason is that it is not worthwhile unless you have hundreds of megabytes of natural language text. In fact, only the top few compressors even have lexical models. All the rest are byte-oriented n-gram models.

A semantic model would know what words are related, like "star" and "moon". It would learn this by their tendency to appear together. You can build a dictionary of such knowledge from the data set itself, or you can build it some other way (such as WordNet) and include it in the decompressor. If you learn it from the input, you could do it in a separate pass (like LSA) or you could do it in one pass (maybe an equivalent neural network) so that you build the model as you compress.

To learn syntax, you can cluster words by similarity of their immediate context. These clusters correspond to parts of speech. For instance, "the X is" tells you that X is a noun. You can model simple grammars as n-grams over their classifications, such as (Art Noun Verb). Again, you can use any of 3 approaches.

Learning semantics and syntax is a hard problem, but I think you can see it can be done with statistical modeling. The training data you need is in the input itself.

I don't see any point in this debate over lossless vs. lossy compression. You have to solve the language learning problem in either case to improve compression. I think it will be more productive to discuss how this can be done.
-- Matt Mahoney, [EMAIL PROTECTED]
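The semantic (co-occurrence) and syntactic (context-clustering) statistics Matt describes can be sketched on a toy corpus. This is a sketch of the statistics only, not of a compressor; the corpus and variable names are invented:

```python
# Two statistical models on a toy corpus (invented for illustration):
#  - semantics: words are "related" if they tend to appear near each other
#  - syntax: words cluster into rough word classes by their immediate contexts
from collections import Counter, defaultdict

corpus = "the star is bright the moon is bright the star and the moon shine".split()

# Semantic model: co-occurrence counts within a +/-2 word window.
window = 2
cooc = Counter()
for i, w in enumerate(corpus):
    for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
        if i != j:
            cooc[(w, corpus[j])] += 1

# Syntactic clustering: group words by the (left, right) contexts they occur
# in; words that share contexts behave like members of the same word class.
contexts = defaultdict(set)
for i in range(1, len(corpus) - 1):
    contexts[corpus[i]].add((corpus[i - 1], corpus[i + 1]))

# "star" and "moon" share the context ("the", "is"), hinting that they belong
# to the same class (nouns) and tend to be semantically related.
shared = contexts["star"] & contexts["moon"]
```

A real model would of course use millions of words and smoothed counts, but the principle is the same: the training data is in the input itself.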
Re: **SPAM** Re: Goertzel/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize
I think that our difference is that I am interpreting "without input from other sources" as not allowing that "bunch of other files" UNLESS that "long-term memory" is counted as part of the executable size.

Mark
Re: Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize
On Tuesday 15 August 2006 00:55, Matt Mahoney wrote:
...
> To improve compression further, you will need to model semantics and/or
> syntax. No compressor currently does this.

Has anyone looked at the statistical parsers? There is a big subfield of computational linguistics doing exactly this, cf. e.g. Charniak (down the page to statistical parsing): http://www.cs.brown.edu/%7Eec/

I would speculate, btw, that the decompressor should be a virtual machine for some powerful macro-expander (which are equivalent to the lambda calculus, ergo Turing machines), and that the probabilistic regularities in the source be reflected in the encoding -- which would be implemented by the "executable" compressed file.

Josh
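Josh's macro-expander decompressor can be illustrated with a toy sketch. The macro table here is invented; in a real submission each entry would encode a regularity actually found in enwik8:

```python
# Toy sketch of a macro-expander decompressor: the "compressed file" is a set
# of macro definitions plus a start string, and decompression is just
# repeated substitution. The macros below are invented for illustration.
def expand(s: str, macros: dict) -> str:
    # Repeatedly substitute macro names until none remain (assumes no cycles).
    changed = True
    while changed:
        changed = False
        for name, body in macros.items():
            if name in s:
                s = s.replace(name, body)
                changed = True
    return s

macros = {
    "%NP%": "the %N%",   # macros may expand to text containing other macros
    "%N%": "moon",
}
expand("%NP% is bright", macros)  # -> "the moon is bright"
```

Since macro expansion of this general kind is Turing-complete, the scheme puts no ceiling on how clever the encoding of the source's regularities can be.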
Re: Goertzel/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize
On Tuesday 15 August 2006 09:03, Ben Goertzel wrote:
> Yes, but the compression software could have learned stuff before
> trying the Hutter Challenge, via compressing a bunch of other files
> ... and storing the knowledge it learned via this experience in its
> long-term memory...

This could have a secondary value in helping the compressor know what kind of regularities to look for in the source file -- but if the regularity isn't in the source file, you obviously don't want any information about it in either the compressed file or the decompressor. So the compressor might get hints from such knowledge, but any regularities it actually (should) use are going to be present in the source file by assumption.

Josh
Re: Goertzel/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize
On 8/15/06, Mark Waser <[EMAIL PROTECTED]> wrote:
> Hi Ben,
>
> I agree with everything that you're saying; however, looking at the
> specific task:
>
> . . . . there clearly isn't the opportunity for it to store knowledge
> from other/previous files except in its executable since it explicitly says
> "without input from other sources" -- and the size of the executable counts
> as part of the compressed size.

Right. This is probably necessary for the contest; it would be hard to verify that a program with a large database wasn't in some way storing lots of Wiki-specific information in that database. Unfortunately, this restriction makes the contest much less relevant to AGI.
Re: Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize
I've read Charniak's book, Statistical Language Learning. A lot of researchers in language modeling are using perplexity (compression ratio) to compare models. But there are some problems with the way this is done.

1. Many evaluations are done on corpora from the LDC which are not free, like TREC, WSJ, Brown, etc.

2. Many evaluations use offline models. They train on a portion of the data set and evaluate on the rest, or use leave-one-out, or maybe divide into 3 parts including a validation set. This makes it difficult to compare work by different researchers because there is no consistency in the details of these experiments.

3. The input is usually preprocessed in various ways. Normally, case is folded, the words are converted to tokens from a fixed vocabulary, and punctuation is removed. Again there is no consistency in the details, like the size of the vocabulary, whether to include numbers, etc. Also this filtering removes useful information, so it is difficult to evaluate the true perplexity of the model.

I think a good language model will need to combine many techniques in lexical modeling (vocabulary acquisition, stemming, recognizing multiword phrases and compound words, dealing with rare words, misspelled words, capitalization, punctuation, and various nontext forms of junk), semantics (distant bigrams, LSA), and syntax (statistical parsers, hidden Markov models) in a uniform framework. Most work is usually in the form of a word trigram model plus one other technique on cleaned-up text. Nobody has put all this stuff together. As a result, the best compressors still use byte-level n-gram statistics and at most some crude lexical parsing. I think we can do better.

-- Matt Mahoney, [EMAIL PROTECTED]
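For concreteness, the perplexity measure used to compare these models can be sketched directly from its definition (the probabilities below are toy numbers; real evaluations assign them with a trained model on held-out text):

```python
# Perplexity as used to compare language models: from the per-word
# probabilities a model assigns to a held-out text, compute the
# cross-entropy H in bits per word, then perplexity = 2^H. Lower is better,
# and H is exactly the ideal compressed size per word, which is why
# perplexity and compression ratio are interchangeable measures.
import math

def perplexity(probs):
    """probs: probabilities the model assigned to each word of a held-out text."""
    H = -sum(math.log2(p) for p in probs) / len(probs)  # bits per word
    return 2 ** H

# A model that assigns 1/4 to every word of a text has perplexity 4:
perplexity([0.25, 0.25, 0.25, 0.25])
```

The preprocessing inconsistencies Matt lists matter precisely because they change which word sequence `probs` is computed over, so two published perplexities are rarely comparable.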
Re: [agi] confirmation paradox
> > A further example is:
> > S1 = "The fall of the Roman empire is due to Christianity".
> > S2 = "The fall of the Roman empire is due to lead poisoning".
> > I'm not sure whether S1 or S2 is "more" true. But the question is how can
> > you define the meaning of the NTV associated with S1 or S2? If we can't,
> > why not just leave these statements as non-numerical?
> >
> > YKY
>
> If you cannot tell the difference, of course you can assign them the same
> value. However, very often we state both S1 and S2 as "possible", but when
> we are forced to make a choice, we can still say that S1 is "more likely".
>
> Pei

YKY is advocating the post-modern viewpoint that knowledge is context-dependent, and that true-false assignments and numeric value judgements are both extremely problematic. Pei is pointing out the commonsense, classicist position, and also the refutation of the post-modern tradition: that some ways of building bridges make bridges that stay standing, and other ways make bridges that fall down. I think that the task of "completing the Modernist project", uniting the many important observations of both Enlightenment and post-modernist thinking, has fallen to AI; we MUST resolve these two viewpoints before we can create an AGI.

- Phil
Re: [agi] Marcus Hutter's lossless compression of human knowledge prize
On 8/12/06, Matt Mahoney <[EMAIL PROTECTED]> wrote:
> A common objection to compression as a test for AI is that humans can't do
> compression, so it has nothing to do with AI. The reason people can't
> compress is that compression requires both AI and deterministic computation.
> The human brain is not deterministic because it is made of neurons, which
> are noisy analog devices.

People do compression extremely well. Your eyes send you about 3 gigabytes of data per second; at deeper levels, that is reduced to the roughly one byte per second that your brain "processes". (The 3G/sec figure is pretty accurate; the one byte/second is more contentious; it is more well-supported to say that we can make choices at a rate of around one byte per second, IIRC.)

Compression is almost ALL WE DO. Cognition = compression. Most of what we do is make observations of the environment, and compress that into representations, which trigger responses that are compressed representations of actions. Or, if you like, we compress environment inputs directly into selection of actions which are appropriate in those circumstances.

- Phil
Re: [agi] Marcus Hutter's lossless compression of human knowledge prize
On 8/15/06, Philip Goetz <[EMAIL PROTECTED]> wrote:
> People do compression extremely well. Your eyes send you about 3
> gigabytes of data per second; at deeper levels,

Oops. Looking at my notes, that should be 2.7 Gbits/second (1.4 per eye), not 3 Gbytes/second. This is reduced to about 165 Mbits/second per eye by the time the signal leaves the lateral geniculate nucleus, which is before it hits the thalamus or any other part of the brain.

- Phil
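Taking the corrected figures in this message at face value, the implied reduction factor at this first stage works out to roughly 8x:

```python
# Rough arithmetic on the figures above (taken at face value, not verified):
# ~2.7 Gbit/s enters at the retinas (both eyes combined) and ~165 Mbit/s per
# eye leaves the LGN, so the early visual pathway already discards ~7/8 of
# the raw signal.
retina_bps = 2.7e9        # both eyes combined, per the post
lgn_bps = 2 * 165e6       # 165 Mbit/s per eye, two eyes
reduction = retina_bps / lgn_bps  # a bit over 8x
```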
Re: Goertzel/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize
Ah... well, in that case the contest is indeed even less AGI-relevant than I thought...

This particular shortcoming of the contest is more a pragmatic than a philosophical one: Wikipedia is not a complete knowledge domain... it's an advanced knowledge domain that is only meant to be interpretable by reference to an additional body of (more elementary) world-knowledge. (Similar to the world-knowledge that Cyc tries, fairly unsuccessfully, to capture.)

But this additional world-knowledge is voluminous -- so it's quite possible that the "AGI-natural" ways to compress Wikipedia, using an AGI system with a lot of world-knowledge, are quite different from the best ways to compress Wikipedia using a small executable file.

Conceptually, a better (though still deeply flawed) contest would be: Compress this file of advanced knowledge, assuming as background knowledge this other file of elementary knowledge, in terms of which the advanced knowledge is defined.

-- Ben G
Re: Mahoney/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize
I realize it is tempting to use lossy text compression as a test for AI because that is what the human brain does when we read text and recall it in paraphrased fashion. We remember the ideas and discard details about the expression of those ideas. A lossy text compressor that did the same thing would certainly demonstrate AI.

But there are two problems with using lossy compression as a test of AI:

1. The test is subjective.
2. Lossy compression does not imply AI.

Let's assume we solve the subjectivity problem by having human judges evaluate whether the decompressed output is "close enough" to the input. We already do this with lossy image, audio and video compression (without much consensus).

The second problem remains: ideal lossy compression does not imply passing the Turing test. For lossless compression, it can be proven that it does. Let p(s) be the (unknown) probability that s will be the prefix of a text dialog. Then a machine that can compute p(s) exactly is able to generate response A to question Q with the distribution p(QA)/p(Q), which is indistinguishable from human. The same model minimizes the compressed size, E[log 1/p(s)].

This proof does not hold for lossy compression because different lossless models map to identical lossy models. The desired property of a lossy compressor C is that if and only if s1 and s2 have the same meaning (to most people), then the encodings C(s1) = C(s2). This code will ideally have length log 1/(p(s1)+p(s2)). But this does not imply that the decompressor knows p(s1) or p(s2). Thus, the decompressor may decompress to s1 or s2 or choose randomly between them. In general, the output distribution will be different from the true distribution p(s1), p(s2), so it will be distinguishable from human even if the compression ratio is ideal.
-- Matt Mahoney, [EMAIL PROTECTED]
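Matt's point that a lossless model doubles as a dialog generator can be sketched with a toy distribution. The strings and probabilities below are invented; in the actual argument, p would be the true (unknown) distribution over text, here approximated by a learned model:

```python
# Sketch of the argument above: a model that knows the string distribution p
# both (a) achieves the minimal code length log2(1/p(s)) per string under
# ideal lossless coding, and (b) can answer a prompt Q by sampling a
# continuation A with probability p(QA)/p(Q). The toy p below is invented.
import math
import random

p = {"hi there": 0.5, "hi all": 0.25, "bye now": 0.25}

# (a) ideal lossless code length in bits for each complete string
code_len = {s: math.log2(1.0 / q) for s, q in p.items()}

# (b) respond to the prefix Q by sampling continuations with prob p(QA)/p(Q)
def respond(Q, rng=random.Random(0)):
    cont = {s: q for s, q in p.items() if s.startswith(Q)}
    pQ = sum(cont.values())            # p(Q) = total mass of strings extending Q
    r, acc = rng.random() * pQ, 0.0
    for s, q in cont.items():          # inverse-CDF sampling over continuations
        acc += q
        if r <= acc:
            return s[len(Q):]
```

Lossy compression breaks exactly step (b): once same-meaning strings share one code, the decoder no longer knows the relative probabilities p(s1), p(s2) and cannot reproduce the human output distribution.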
Re: Goertzel/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize
Ben >> Conceptually, a better (though still deeply flawed) contest would be: Compress this file of advanced knowledge, assuming as background knowledge this other file of elementary knowledge, in terms of which the advanced knowledge is defined. Nah. It wouldn't be much of a contest if they gave the elementary knowledge file and *much* harder on the organizers. A much better contest would be if they just had several other undisclosed Wikipedia-chunk files and the program had to have comparable compression ratios on the undisclosed files as well. That way, the contestant is responsible for assembling the elementary knowledge in a compact format. (And the verification against undisclosed files will eliminate cheating). - Original Message - From: "Ben Goertzel" <[EMAIL PROTECTED]> To: Sent: Tuesday, August 15, 2006 12:11 PM Subject: **SPAM** Re: Goertzel/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize Ah... well, in that case the contest is indeed even less AGI-relevant than I thought... This particular shortcoming of the contest is more a pragmatic than a philosophical one: Wikipedia is not a complete knowledge domain... it's an advanced knowledge domain that is only meant to be interpretable by reference to an additional body of (more elementary) world-knowledge. (Similar to the world-knowledge that Cyc tries, fairly unsuccessfully, to capture.) But this additional world-knowledge is voluminous -- so it's quite possible that the "AGI-natural" ways to compress Wikipedia using an AGI system with a lot of world-knowledge, are quite different from the best ways to compress Wikipedia using a small executable file. Conceptually, a better (though still deeply flawed) contest would be: Compress this file of advanced knowledge, assuming as background knowledge this other file of elementary knowledge, in terms of which the advanced knowledge is defined. 
-- Ben G

On 8/15/06, Philip Goetz <[EMAIL PROTECTED]> wrote:

On 8/15/06, Mark Waser <[EMAIL PROTECTED]> wrote:
> Hi Ben,
> I agree with everything that you're saying; however, looking at the specific task:
> . . . . there clearly isn't the opportunity for it to store knowledge from other/previous files except in its executable, since it explicitly says "without input from other sources" -- and the size of the executable counts as part of the compressed size.

Right. This is probably necessary for the contest; it would be hard to verify that a program with a large database wasn't in some way storing lots of Wiki-specific information in that database. Unfortunately, this restriction makes the contest much less relevant to AGI.

--- To unsubscribe, change your address, or temporarily deactivate your subscription, please go to http://v2.listbox.com/member/[EMAIL PROTECTED]
Re: [agi] Marcus Hutter's lossless compression of human knowledge prize
Philip Goetz wrote: On 8/15/06, Philip Goetz <[EMAIL PROTECTED]> wrote: People do compression extremely well. Your eyes send you about 3 ... This is reduced to about 165Mbits/second per eye by the time the signal leaves the lateral geniculate nucleus, which is before it hits the thalamus or any other part of the brain. - Phil Well, various things indicate that much of what we do is "taking the deltas", i.e., we only notice changes in the signals. That accounts for much of the compression. There's also something special about how we handle repetitive patterns. (That's probably related "somehow" to hypnosis...which might be revealing if we knew enough.)
Re: Mahoney/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize
>> 1. The test is subjective.

I disagree. If you have an automated test with clear criteria like the following, it will be completely objective:

a) the compressing program must be able to output all inconsistencies in the corpus (in their original string form) AND
b) the decompressing program must be able to do the following when presented with a standard list of test ideas/pieces of knowledge.

FOR EACH IDEA/PIECE OF KNOWLEDGE IN THE TEST WHICH IS NOT IN THE LIST OF INCONSISTENCIES:
- if the knowledge is in the corpus, recognize that it is in the corpus.
- if the negation of the knowledge is in the corpus, recognize that the test knowledge is false according to the corpus.
- if an incorrect substitution has been made to create the test item from an item in the corpus (i.e. red for yellow, ten for twenty, etc.), recognize that the test knowledge is false according to the corpus.
- if a possibly correct (hierarchical) substitution has been made to create the test item from the corpus, recognize either a) that the substitution is in the corpus for broader concepts (i.e. testing red for corpus lavender, testing dozens for corpus thirty-seven, etc.) or b) that there is related information in the corpus which the test idea further refines, for narrower substitutions.

>> 2. Lossy compression does not imply AI.

and two sentences before:

>> A lossy text compressor that did the same thing (recall it in paraphrased fashion) would certainly demonstrate AI.

Require that the decompressing program be able to output all of the compressed file's knowledge in ordinary English. This is a pretty trivial task compared to everything else.

Mark

- Original Message - From: Matt Mahoney To: agi@v2.listbox.com Sent: Tuesday, August 15, 2006 12:27 PM Subject: Re: Mahoney/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize

I realize it is tempting to use lossy text compression as a test for AI because that is what the human brain does when we read text and recall it in paraphrased fashion.
We remember the ideas and discard details about the expression of those ideas. A lossy text compressor that did the same thing would certainly demonstrate AI. But there are two problems with using lossy compression as a test of AI:

1. The test is subjective.
2. Lossy compression does not imply AI.

Let's assume we solve the subjectivity problem by having human judges evaluate whether the decompressed output is "close enough" to the input. We already do this with lossy image, audio and video compression (without much consensus).

The second problem remains: ideal lossy compression does not imply passing the Turing test. For lossless compression, it can be proven that it does. Let p(s) be the (unknown) probability that s will be the prefix of a text dialog. Then a machine that can compute p(s) exactly is able to generate response A to question Q with the distribution p(QA)/p(Q), which is indistinguishable from human. The same model minimizes the compressed size, E[log 1/p(s)].

This proof does not hold for lossy compression because different lossless models map to identical lossy models. The desired property of a lossy compressor C is that if and only if s1 and s2 have the same meaning (to most people), then the encodings C(s1) = C(s2). This code will ideally have length log 1/(p(s1)+p(s2)). But this does not imply that the decompressor knows p(s1) or p(s2). Thus, the decompressor may decompress to s1 or s2 or choose randomly between them. In general, the output distribution will be different from the true distribution p(s1), p(s2), so it will be distinguishable from human even if the compression ratio is ideal.

-- Matt Mahoney, [EMAIL PROTECTED]

- Original Message From: Mark Waser <[EMAIL PROTECTED]> To: agi@v2.listbox.com Sent: Tuesday, August 15, 2006 9:28:26 AM Subject: Re: Mahoney/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize

>> I don't see any point in this debate over lossless vs. lossy compression

Let's see if I can simplify it.
The stated goal is compressing human knowledge.

- The exact same knowledge can always be expressed in a *VERY* large number of different bit strings.
- Not being able to reproduce the exact bit string is lossy compression when viewed from the bit viewpoint, but can be lossless from the knowledge viewpoint.
- Therefore, reproducing the bit string is an additional requirement above and beyond the stated goal.
- I strongly believe that this additional requirement will necessitate a *VERY* large amount of additional work not necessary for the stated goal.
- In addition, by information theory, reproducing the exact bit string will require additional information beyond the knowledge.
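Matt's information-theoretic claim, that the true model p minimizes the expected compressed size E[log 1/p(s)], can be checked numerically on a toy distribution (the symbols and probabilities below are made up purely for illustration; this is Gibbs' inequality):

```python
import math

def expected_bits(true_p, model_q):
    # Expected code length E[log2 1/q(s)] when s is drawn from true_p
    # but coded with model_q (i.e. the cross-entropy).
    return sum(p * math.log2(1.0 / model_q[s]) for s, p in true_p.items())

true_p  = {"A": 0.7, "B": 0.2, "C": 0.1}
wrong_q = {"A": 0.4, "B": 0.4, "C": 0.2}

# Coding with the true distribution never does worse than any other model,
# which is why compressed size can rank language models.
assert expected_bits(true_p, true_p) < expected_bits(true_p, wrong_q)
```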
Re: [agi] confirmation paradox
Phil,

I see no conceptual problems with using probability theory to define context-dependent or viewpoint-dependent probabilities... Regarding YKY's example, "causation" is a subtle concept going beyond probability (but strongly probabilistically based), and indeed any mind needs to have fairly general and at least moderately clever methods for dealing with it. But I see no problem with the assignment of numerical truth values to causal statements. Judea Pearl's math does it; Novamente's math does it...

ben g

On 8/15/06, Philip Goetz <[EMAIL PROTECTED]> wrote:
> > A further example is:
> > S1 = "The fall of the Roman empire is due to Christianity".
> > S2 = "The fall of the Roman empire is due to lead poisoning".
> > I'm not sure whether S1 or S2 is "more" true. But the question is how can
> > you define the meaning of the NTV associated with S1 or S2? If we can't,
> > why not just leave these statements as non-numerical?
> >
> > YKY
>
> If you cannot tell the difference, of course you can assign them the
> same value. However, very often we state both S1 and S2 as "possible",
> but when we are forced to make a choice, can still say that S1 is "more
> likely".
>
> Pei

YKY is advocating the post-modern viewpoint that knowledge is context-dependent, and true-false assignments and numeric value judgements are both extremely problematic. Pei is pointing out the commonsense, classicist position, and also the refutation of the post-modern tradition: that some ways of building bridges make bridges that stay standing, and other ways make bridges that fall down. I think that the task of "completing the Modernist project", and uniting the many important observations of both enlightenment and post-modernist thinking, has fallen to AI; we MUST resolve these two viewpoints before we can create an AGI.
- Phil
Re: Mahoney/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize
Mark,

Could you please write a test program to objectively test for lossy text compression using your algorithm? You can start by listing all of the inconsistencies in Wikipedia. To make the test objective, you will either need a function to test whether two strings are inconsistent or not, or else you need to show that people will never disagree on this matter.

>> Lossy compression does not imply AI.
>> A lossy text compressor that did the same thing (recall it in paraphrased fashion) would certainly demonstrate AI.

I disagree that these are inconsistent. Demonstrating and implying are different things.

-- Matt Mahoney, [EMAIL PROTECTED]

- Original Message From: Mark Waser <[EMAIL PROTECTED]> To: agi@v2.listbox.com Sent: Tuesday, August 15, 2006 12:55:24 PM Subject: Re: Mahoney/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize
Re: Goertzel/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize
Ben wrote:
>Conceptually, a better (though still deeply flawed) contest would be:
>Compress this file of advanced knowledge, assuming as background
>knowledge this other file of elementary knowledge, in terms of which
>the advanced knowledge is defined.

How about if you sort the input to put the elementary knowledge at the front?

-- Matt Mahoney, [EMAIL PROTECTED]
Re: [agi] confirmation paradox
On 8/15/06, Ben Goertzel <[EMAIL PROTECTED]> wrote:
> But I see no problem with the assignment of numerical truth values to causal statements. Judea Pearl's math does it; Novamente's math does it...

There isn't a problem in doing it, but there are serious doubts whether an approach in which symbols have constant meanings (the same symbol has the same semantics in different propositions) can lead to AI.
Re: Goertzel/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize
On 8/15/06, Mark Waser <[EMAIL PROTECTED]> wrote:
> Nah. It wouldn't be much of a contest if they gave out the elementary knowledge file, and it would be *much* harder on the organizers.

How about using OpenCyc?
Re: Goertzel/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize
On 8/15/06, Matt Mahoney <[EMAIL PROTECTED]> wrote:
> How about if you sort the input to put the elementary knowledge at the front?

I think that sort will take considerably longer than n log n. :)
Re: Mahoney/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize
>> Could you please write a test program to objectively test for lossy text compression using your algorithm?

Writing the test program for the decompressing program is relatively easy. Since the requirement was that the decompressing program be able to recognize when a piece of knowledge is in the corpus, when its negation is in the corpus, when an incorrect substitution has been made, and when a correct substitution has been made -- all you/I would need to do is invent (or obtain -- see two paragraphs down) a reasonably sized set of knowledge pieces to test, put them in a file, feed them to the decompressing program, and automatically grade its answers as to which category each falls into. A reasonably small number of test cases should suffice as long as you don't advertise exactly which test cases are in the final test, but once you're having competitors generate each other's tests, you can go hog-wild with the number.

Writing the test program for the compressing program is also easy, but developing the master list of inconsistencies is going to be a real difficulty -- unless you use the various contenders themselves to generate various versions of the list. I strongly doubt that most contenders will get false positives, but strongly suspect that finding all of the inconsistencies will be a major area for improvement as the systems become more sophisticated.

Note also that minor modifications of any decompressing program should also be able to create test cases for your decompressor test. Simply ask it for a random sampling of knowledge, for the negations of a random sampling of knowledge, for some incorrect substitutions, and some hierarchical substitutions of each type. Any *real* contenders should be able to easily generate the tests for you.

>> You can start by listing all of the inconsistencies in Wikipedia.
see paragraph 2 above

>> To make the test objective, you will either need a function to test whether two strings are inconsistent or not, or else you need to show that people will never disagree on this matter.

It is impossible to show that people will never disagree on a matter. On the other hand, a knowledge compressor is going to have to recognize when two pieces of knowledge conflict (i.e. when two strings parse into knowledge statements that cannot coexist). You can always have a contender evaluate whether a competitor's "inconsistencies" are incorrect and then do some examination by hand on a representative sample where the contender says it can't tell (since, again, I suspect you'll find few misidentified inconsistencies -- but that finding all of the inconsistencies will be ever subject to improvement).

>> >> Lossy compression does not imply AI.
>> >> A lossy text compressor that did the same thing (recall it in paraphrased fashion) would certainly demonstrate AI.
>> I disagree that these are inconsistent. Demonstrating and implying are different things.

I didn't say that they were inconsistent. What I meant to say was that:
1. a decompressing program that is able to output all of the compressed file's knowledge in ordinary English would, in your words, "certainly demonstrate AI".
2. given statement 1, it's not a problem that "lossy compression does not imply AI", since the decompressing program would still "certainly demonstrate AI".

- Original Message - From: Matt Mahoney To: agi@v2.listbox.com Sent: Tuesday, August 15, 2006 2:23 PM Subject: Re: Mahoney/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize
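Mark's automated grading idea can be sketched in heavily simplified form. Everything below is hypothetical: the corpus is a set of literal statements, "negation" is just a string prefix, and substitution checks are omitted entirely; a real harness would need actual parsing of the knowledge statements:

```python
def grade(corpus, item):
    """Classify a test item against a corpus of known statements."""
    if item in corpus:
        return "in corpus"
    if item.startswith("not ") and item[4:] in corpus:
        return "negation of corpus statement"
    return "not decidable from corpus"

corpus = {"the sky is blue", "grass is green"}

assert grade(corpus, "the sky is blue") == "in corpus"
assert grade(corpus, "not grass is green") == "negation of corpus statement"
assert grade(corpus, "the sky is red") == "not decidable from corpus"
```

The point of the sketch is only that, given clear category definitions, grading is mechanical; the hard part Mark and Matt are debating is whether those definitions can be made objective for real text.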
Re: Goetz/Goertzel/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize
> How about using OpenCyc?

Actually, instructing the competitors to compress both the OpenCyc corpus AND then the Wikipedia sample in sequence, and measuring the size of both, *would* be an interesting and probably good contest.

- Original Message - From: "Philip Goetz" <[EMAIL PROTECTED]> To: Sent: Tuesday, August 15, 2006 3:16 PM Subject: Re: Goertzel/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize
Re: Goetz/Goertzel/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize
On 8/15/06, Mark Waser <[EMAIL PROTECTED]> wrote:
> Actually, instructing the competitors to compress both the OpenCyc corpus AND then the Wikipedia sample in sequence and measuring the size of both *would* be an interesting and probably good contest.

I think it would be more interesting for it to use the OpenCyc corpus as its knowledge for compressing the Wikipedia sample. The point is to demonstrate intelligent use of information, not to get a wider variety of data.
Re: [agi] Marcus Hutter's lossless compression of human knowledge prize
I proposed knowledge-based text compression as a dissertation topic back around 1991, but my advisor turned it down. I never got back to the topic because there wasn't any money in it -- text is already so small, relative to audio and video, that it was clear that the money was in audio and video compression. Also, non-AI methods were already getting close enough to the theoretical limits for text compression that it didn't seem worthwhile.
Re: Mahoney/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize
You could use Keogh's compression dissimilarity measure to test for inconsistency: http://www.cs.ucr.edu/~eamonn/SIGKDD_2004_long.pdf

CDM(x,y) = C(xy)/(C(x)+C(y))

where x and y are strings, and C(x) means the compressed size of x (lossless). The measure ranges from about 0.5 if x = y to about 1.0 if x and y do not share any information. Then,

CDM("it is hot", "it is very warm") < CDM("it is hot", "it is cold")

assuming your compressor uses a good language model. Now if only we had some test to tell which compressors have the best language models...

-- Matt Mahoney, [EMAIL PROTECTED]

- Original Message From: Mark Waser <[EMAIL PROTECTED]> To: agi@v2.listbox.com Sent: Tuesday, August 15, 2006 3:22:10 PM Subject: Re: Mahoney/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize
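Matt's CDM formula is easy to try with an off-the-shelf lossless compressor. A byte-level compressor like zlib only captures surface redundancy, not meaning (so it would not reproduce the "hot"/"very warm" vs. "hot"/"cold" ordering Matt gives; that part really does need a language model), but it shows the mechanics; the example strings are made up:

```python
import zlib

def C(x: bytes) -> int:
    # Compressed size of x under a lossless compressor (zlib stands in here).
    return len(zlib.compress(x, 9))

def cdm(x: bytes, y: bytes) -> float:
    # Keogh's compression dissimilarity measure:
    # roughly 0.5 when x == y, approaching 1.0 when x and y share nothing.
    return C(x + y) / (C(x) + C(y))

x = b"it is hot today and the sun is very warm. " * 30
y = b"quarterly revenue figures were filed late. " * 30
assert cdm(x, x) < cdm(x, y)  # identical strings share all their information
```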
Re: [agi] confirmation paradox
Hi,

Phil wrote:
> There isn't a problem in doing it, but there are serious doubts whether an approach in which symbols have constant meanings (the same symbol has the same semantics in different propositions) can lead to AI.

Sure, but neither Novamente nor NARS (for example) has the problematic issue you mention. In both of these systems, symbols and other patterns may have context-dependent semantics...

-- Ben
Re: Goetz/Goertzel/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize
> I think it would be more interesting for it to use the OpenCyc corpus as its knowledge for compressing the Wikipedia sample. The point is to demonstrate intelligent use of information, not to get a wider variety of data.

:-) My assumption is that the compression program is building/adding to a knowledge base when it reads a file/corpus, and then it exports a compressed version of either 1) all of the knowledge from the newest file/corpus OR 2) the file/corpus knowledge minus whatever was previously known from a previous file/corpus. If you "compressed" the OpenCyc corpus, threw away the compressed file, and then compressed the Wikipedia sample with the output option set to #2 above (excluding the OpenCyc corpus), then my program would be doing exactly what you are suggesting (and doing it in the easiest possible way, since you need some way to get the OpenCyc corpus into the knowledge base). The *only* real difference between your suggestion and mine is that you are ignoring the size of the compressed OpenCyc file.

- Original Message - From: "Philip Goetz" <[EMAIL PROTECTED]> To: Sent: Tuesday, August 15, 2006 3:37 PM Subject: Re: Goetz/Goertzel/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize
Re: Mahoney/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize
On 8/15/06, Matt Mahoney <[EMAIL PROTECTED]> wrote:
> But there are two problems with using lossy compression as a test of AI:
> 1. The test is subjective.
> 2. Lossy compression does not imply AI.
> ... For lossless compression, it can be proven that it does. Let p(s) be the (unknown) probability that s will be the prefix of a text dialog. Then a machine that can compute p(s) exactly is able to generate response A to question Q with the distribution p(QA)/p(Q), which is indistinguishable from human. The same model minimizes the compressed size, E[log 1/p(s)].

This proof is really not useful. The Turing test is subjective; all you are saying is that lossy compression is lossy, and lossless compression is not. A solution to the first problem would also solve the second problem. It is necessary to allow lossy compression in order for this compression test to be useful for AI, because lossless, uncomprehending compression is already bumping up against the theoretical limits for text compression.

- Phil
Re: Mahoney/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize
>> You could use Keogh's compression dissimilarity measure to test for inconsistency.

I don't think so. Take the following strings: "I only used red and yellow paint in the painting", "I painted the rose in my favorite color", "My favorite color is pink", "Orange is created by mixing red and yellow", "Pink is created by mixing red and white". How is Keogh's measure going to help you with that? The problem is that Keogh's measure is intended for data mining, where you have separate instances, not one big entwined Gordian knot.

>> Now if only we had some test to tell which compressors have the best language models...

Huh? By definition, the compressor with the best language model is the one with the highest compression ratio.

- Original Message -
From: Matt Mahoney
To: agi@v2.listbox.com
Sent: Tuesday, August 15, 2006 3:54 PM
Subject: Re: Mahoney/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize

You could use Keogh's compression dissimilarity measure to test for inconsistency: http://www.cs.ucr.edu/~eamonn/SIGKDD_2004_long.pdf

CDM(x,y) = C(xy)/(C(x)+C(y)), where x and y are strings, and C(x) means the compressed size of x (lossless). The measure ranges from about 0.5 if x = y to about 1.0 if x and y do not share any information. Then CDM("it is hot", "it is very warm") < CDM("it is hot", "it is cold"), assuming your compressor uses a good language model. Now if only we had some test to tell which compressors have the best language models...

-- Matt Mahoney, [EMAIL PROTECTED]

- Original Message -
From: Mark Waser <[EMAIL PROTECTED]>
To: agi@v2.listbox.com
Sent: Tuesday, August 15, 2006 3:22:10 PM
Subject: Re: Mahoney/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize

>> Could you please write a test program to objectively test for lossy text compression using your algorithm?

Writing the test program for the decompressing program is relatively easy.
Since the requirement was that the decompressing program be able to recognize when a piece of knowledge is in the corpus, when its negation is in the corpus, when an incorrect substitution has been made, and when a correct substitution has been made -- all you/I would need to do is invent (or obtain -- see two paragraphs down) a reasonably sized set of knowledge pieces to test, put them in a file, feed them to the decompressing program, and automatically grade its answers as to which category each falls into. A reasonably small number of test cases should suffice as long as you don't advertise exactly which test cases are in the final test, but once you're having competitors generate each other's tests, you can go hog-wild with the number.

Writing the test program for the compressing program is also easy, but developing the master list of inconsistencies is going to be a real difficulty -- unless you use the various contenders themselves to generate various versions of the list. I strongly doubt that most contenders will get false positives, but strongly suspect that finding all of the inconsistencies will be a major area for improvement as the systems become more sophisticated.

Note also that minor modifications of any decompressing program should also be able to create test cases for your decompressor test. Simply ask it for a random sampling of knowledge, for the negations of a random sampling of knowledge, for some incorrect substitutions, and some hierarchical substitutions of each type. Any *real* contenders should be able to easily generate the tests for you.

>> You can start by listing all of the inconsistencies in Wikipedia.

See paragraph 2 above.

>> To make the test objective, you will either need a function to test whether two strings are inconsistent or not, or else you need to show that people will never disagree on this matter.

It is impossible to show that people will never disagree on a matter.
On the other hand, a knowledge compressor is going to have to recognize when two pieces of knowledge conflict (i.e., when two strings parse into knowledge statements that cannot coexist). You can always have a contender evaluate whether a competitor's "inconsistencies" are incorrect and then do some examination by hand on a representative sample where the contender says it can't tell (since, again, I suspect you'll find few misidentified inconsistencies -- but that finding all of the inconsistencies will be ever subject to improvement).

>> >> Lossy compression does not imply AI.
>> >> A lossy text compressor that did the same thing (recall it in paraphrased fashion) would certainly demonstrate AI.
>> I disagree that these are inconsistent.

Demonstrating and implying are different things. I didn't say tha
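[Editor's note: the grading loop Mark describes could be sketched as follows. The category labels, the toy test cases, and the `classify` stand-in for the decompressor under test are all hypothetical, invented for illustration:

```python
# Hypothetical grading harness for the decompressor test described above.
# A real harness would feed each knowledge piece to the decompressing
# program and collect its categorization; here `classify` is a stand-in.
CATEGORIES = {"in_corpus", "negated", "incorrect_substitution", "correct_substitution"}

test_cases = [
    ("The rose is pink.", "in_corpus"),
    ("The rose is not pink.", "negated"),
]

def grade(classify, cases):
    """Return the fraction of test items the program categorizes correctly."""
    correct = sum(1 for text, label in cases if classify(text) == label)
    return correct / len(cases)

# A trivial stand-in "contender" that always answers in_corpus
# gets exactly one of the two toy cases right.
score = grade(lambda text: "in_corpus", test_cases)
assert score == 0.5
```

As the passage notes, the hard part is not this loop but building the master list of test cases; any real contender could generate them.]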
Re: Mahoney/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize
Mark wrote:
> Huh? By definition, the compressor with the best language model is the one with the highest compression ratio.

I'm glad we finally agree :-)

> >> You could use Keogh's compression dissimilarity measure to test for inconsistency.
> I don't think so. Take the following strings: "I only used red and yellow paint in the painting", "I painted the rose in my favorite color", "My favorite color is pink", "Orange is created by mixing red and yellow", "Pink is created by mixing red and white". How is Keogh's measure going to help you with that?

You group the strings into a fixed set and a variable set and concatenate them. The variable set could be just "I only used red and yellow paint in the painting", and you compare the CDM replacing "yellow" with "white". Of course, your compressor must be capable of abstract reasoning and have a world model.

To answer Phil's post: text compression is only near the theoretical limits for small files. For large files, there is progress to be made integrating known syntactic and semantic modeling techniques into general-purpose compressors. The theoretical limit is about 1 bpc and we are not there yet. See the graph at http://cs.fit.edu/~mmahoney/dissertation/

The proof that I gave that a language model implies passing the Turing test is for the ideal case where all people share identical models. The ideal case is deterministic. For the real case where models differ, passing the test is easier, because a judge will attribute some machine errors to normal human variation. I discuss this in more detail at http://cs.fit.edu/~mmahoney/compression/rationale.html (text compression is equivalent to AI).

It is really hard to get funding for text compression research (or AI). I had to change my dissertation topic to network security in 1999 because my advisor had funding for that. As a postdoc I applied for a $50K NSF grant for a text compression contest. It was rejected, so I started one without funding (which we now have).
The problem is that many people do not believe that text compression is related to AI (even though speech recognition researchers have been evaluating models by perplexity since the early 1990's).
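[Editor's note: the fixed-set/variable-set procedure Matt describes can be sketched with an off-the-shelf compressor standing in for the language model. zlib is only a surface string matcher, so -- exactly as Matt says -- it will not separate the semantically consistent variant from the inconsistent one; the sketch shows only the mechanics of the comparison. The sentence groupings are taken from Mark's example:

```python
import zlib

def C(s: bytes) -> int:
    """Compressed size of s (lossless); zlib stands in for the real compressor."""
    return len(zlib.compress(s, 9))

def cdm(x: bytes, y: bytes) -> float:
    """Keogh's compression dissimilarity measure: C(xy) / (C(x) + C(y)).
    Roughly 0.5 when x == y, approaching 1.0 when x and y share nothing."""
    return C(x + y) / (C(x) + C(y))

# Fixed set: the background statements from the example.
fixed = (b"I painted the rose in my favorite color. "
         b"My favorite color is pink. "
         b"Orange is created by mixing red and yellow. "
         b"Pink is created by mixing red and white.")

# Variable set: the statement under test, in two variants.
consistent = b"I only used red and white paint in the painting."
inconsistent = b"I only used red and yellow paint in the painting."

# A compressor with a world model should give the consistent variant the
# lower CDM; zlib sees only byte overlap, so its two scores are near-equal.
print(cdm(fixed, consistent), cdm(fixed, inconsistent))
```

The CDM range itself is easy to confirm with zlib: comparing the fixed set against itself scores near 0.5, and against unrelated bytes it scores near 1.0.]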
-- Matt Mahoney, [EMAIL PROTECTED]
Re: Goetz/Goertzel/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize
On 8/15/06, Mark Waser <[EMAIL PROTECTED]> wrote:

> I think it would be more interesting for it to use the OpenCyc corpus
> as its knowledge for compressing the Wikipedia sample. The point is
> to demonstrate intelligent use of information, not to get a wider
> variety of data. :-)

My assumption is that the compression program is building/adding to a knowledge base when it reads a file/corpus, and then it exports a compressed version of either 1) all of the knowledge from the newest file/corpus OR 2) the file/corpus knowledge minus whatever was previously known from a previous file/corpus. If you "compressed" the OpenCyc corpus, threw away the compressed file, and then compressed the Wikipedia sample with the output option set to #2 above excluding the OpenCyc corpus, then my program would be doing exactly what you are suggesting (and doing it in the easiest possible way, since you need some way to get the OpenCyc corpus into the knowledge base).

Yes. Right.

The *only* real difference between your suggestion and mine is that you are ignoring the size of the compressed OpenCyc file.

Right. Which is an important difference.