Re: Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize

2006-08-15 Thread Ben Goertzel

Hi,


Agreed, however, you previously referred to "background information that an
intelligent program has inferred from previous inputs".  Information
inferred from previous inputs is certainly included in the compression size
(either as part of the program or as a necessary resource to the program's
correct/optimal operation).


Mark, I'm not sure I fully grok the context of this passage, but it
seems to me that an adaptive compression program could infer
information from each file it sees, and then store this information in
its own memory --- and then use this information to figure out how to
do excellent file compression on new files, but without storing much
of this information in any of the new files it compresses.

For example, a huge knowledge base about the world could be learned by
a software program reading a lot of texts.  This KB would be stored in
the program's main memory and would help it compress future texts, but
in each future text it compressed, only a tiny amount of this
knowledge would be embodied...
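A minimal sketch of that kind of cross-file adaptation, with simple character
counts standing in for a real knowledge base (the model and numbers are
invented for illustration):

    import math
    from collections import Counter

    # Toy "adaptive compressor": statistics learned from earlier files live in
    # the program's own memory, so later, similar files cost fewer bits even
    # though none of that memory is stored inside the compressed files.
    class AdaptiveModel:
        def __init__(self):
            self.counts = Counter()
            self.total = 0

        def cost_bits(self, text):
            # Ideal code length of text under the current model (Laplace-smoothed).
            return sum(-math.log2((self.counts[ch] + 1) / (self.total + 256))
                       for ch in text)

        def learn(self, text):
            self.counts.update(text)
            self.total += len(text)

    model = AdaptiveModel()
    doc = "the cat sat on the mat " * 20
    before = model.cost_bits(doc)
    model.learn(doc)              # knowledge goes into long-term memory
    after = model.cost_bits(doc)  # a similar future text now codes much smaller
    print(before > after)         # True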

ben



Re: Goertzel/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize

2006-08-15 Thread Mark Waser

Hi Ben,

   I agree with everything that you're saying; however, looking at the 
specific task:


Create a compressed version (self-extracting archive) of the 100MB file 
enwik8 of less than 18MB. More precisely:
  a. Create a Linux or Windows executable archive8.exe of size S < L := 
18'324'887 = previous record.
  b. If run, it produces (without input from other sources) a 10^8 byte file 
data8 that is identical to enwik8.
  c. If we can verify your claim, you are eligible for a prize of 
50'000€×(1-S/L). Minimum claim is 500€.
   . . . . there clearly isn't the opportunity for it to store knowledge 
from other/previous files except in its executable, since it explicitly says 
"without input from other sources" -- and the size of the executable counts 
as part of the compressed size.
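
For concreteness, a minimal sketch of the size constraint and payout formula
quoted above (the value of S below is hypothetical):

    # Hutter Prize payout as quoted above; the example size S is made up.
    L = 18_324_887        # previous record, bytes
    S = 17_000_000        # hypothetical size of your archive8.exe, bytes

    assert S < L, "no prize unless the archive beats the previous record"
    prize = 50_000 * (1 - S / L)       # in euros
    print(round(prize))                # ~3615; claims under 500 EUR are not paid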


   Mark



Re: Goertzel/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize

2006-08-15 Thread Ben Goertzel

Yes, but the compression software could have learned stuff before
trying the Hutter Challenge, via compressing a bunch of other files
... and storing the knowledge it learned via this experience in its
long-term memory...

-- Ben



Re: Mahoney/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize

2006-08-15 Thread Mark Waser



>> I don't see any point in this debate over lossless vs. lossy compression

Let's see if I can simplify it.

1. The stated goal is compressing human knowledge.
2. The exact same knowledge can always be expressed in a *VERY* large number 
   of different bit strings.
3. Not being able to reproduce the exact bit string is lossy compression when 
   viewed from the bit viewpoint, but can be lossless from the knowledge 
   viewpoint.
4. Therefore, reproducing the bit string is an additional requirement above 
   and beyond the stated goal.
5. I strongly believe that this additional requirement will necessitate a 
   *VERY* large amount of additional work not necessary for the stated goal.
6. In addition, by information theory, reproducing the exact bit string will 
   require additional information beyond the knowledge contained in it (since 
   numerous different strings can encode the same knowledge).
7. Assuming optimal compression, also by information theory, additional 
   information will add to the compressed size (i.e. lead to a less optimal 
   result).

So the question is: given that bit-level reproduction is harder, not necessary 
for knowledge compression/intelligence, and doesn't allow for the same degree 
of compression, why make life tougher when it isn't necessary for your stated 
purposes and makes your results (i.e. compression) worse?
 

- Original Message - 
From: Matt Mahoney
To: agi@v2.listbox.com
Sent: Tuesday, August 15, 2006 12:55 AM
Subject: Re: Sampo: [agi] Marcus Hutter's lossless compression of human 
knowledge prize
  
Where will the knowledge to compress text come from?  There are 3 
possibilities.
1. externally supplied, like the lexical models (dictionaries) for paq8h and 
   WinRK.
2. learned from the input in a separate pass, like xml-wrt|ppmonstr.
3. learned online in one pass, like paq8f and slim.

These all have the same effect on compressed size.  In the first case, you 
increase the size of the decompressor.  In the second, you have to append the 
model you learned from the first pass to the compressed file so it is 
available to the decompressor.  In the third case, compression is poor at the 
beginning.  From the viewpoint of information theory, there is no difference 
in these three approaches.  The penalty is the same.

To improve compression further, you will need to model semantics and/or 
syntax.  No compressor currently does this.  I think the reason is that it is 
not worthwhile unless you have hundreds of megabytes of natural language 
text.  In fact, only the top few compressors even have lexical models.  All 
the rest are byte oriented n-gram models.

A semantic model would know what words are related, like "star" and "moon".  
It would learn this by their tendency to appear together.  You can build a 
dictionary of such knowledge from the data set itself or you can build it 
some other way (such as Wordnet) and include it in the decompressor.  If you 
learn it from the input, you could do it in a separate pass (like LSA) or you 
could do it in one pass (maybe an equivalent neural network) so that you 
build the model as you compress.

To learn syntax, you can cluster words by similarity of their immediate 
context.  These clusters correspond to part of speech.  For instance, "the X 
is" tells you that X is a noun.  You can model simple grammars as n-grams 
over their classifications, such as (Art Noun Verb).  Again, you can use any 
of 3 approaches.

Learning semantics and syntax is a hard problem, but I think you can see it 
can be done with statistical modeling.  The training data you need is in the 
input itself.

I don't see any point in this debate over lossless vs. lossy compression.  
You have to solve the language learning problem in either case to improve 
compression.  I think it will be more productive to discuss how this can be 
done.
   -- Matt Mahoney, [EMAIL PROTECTED]
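
A minimal sketch of the co-occurrence idea described above; the window size
and toy sentence are arbitrary choices, not anything from the thread:

    from collections import Counter

    # Count how often word pairs appear within a small window of each other.
    # Pairs that co-occur unusually often ("star"/"moon") are candidates for a
    # learned semantic dictionary of related words.
    def cooccurrence(tokens, window=10):
        pairs = Counter()
        for i in range(len(tokens)):
            for j in range(i + 1, min(i + window, len(tokens))):
                a, b = sorted((tokens[i], tokens[j]))
                if a != b:
                    pairs[(a, b)] += 1
        return pairs

    text = "the moon and the star shine ; the star fades as the moon sets".split()
    pairs = cooccurrence(text)
    print(pairs[("moon", "star")])   # 4 -- co-occurrence count for a related pair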
  
  



Re: **SPAM** Re: Goertzel/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize

2006-08-15 Thread Mark Waser
   I think that our difference is that I am interpreting "without input 
from other sources" as not allowing that "bunch of other files" UNLESS that 
"long-term memory" is counted as part of the executable size.




Re: Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize

2006-08-15 Thread J. Storrs Hall, PhD.
On Tuesday 15 August 2006 00:55, Matt Mahoney wrote:
...
> To improve compression further, you will need to model semantics and/or
> syntax.  No compressor currently does this. 

Has anyone looked at the statistical parsers? There is a big subfield of 
computational linguistics doing exactly this, cf e.g. Charniak (down the page 
to statistical parsing) http://www.cs.brown.edu/%7Eec/

I would speculate, btw, that the decompressor should be a virtual machine for 
some powerful macro-expander (which is equivalent to the lambda calculus, ergo 
Turing machines), and that the probabilistic regularities in the source be 
reflected in the encoding -- which would be implemented by the "executable" 
compressed file.
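
A toy illustration of the macro-expander idea (the macro table and input are
invented; in a real entry the table would count toward the archive size):

    # Toy macro-expanding "decompressor": the compressed file is a macro table
    # plus a body, and expansion is repeated substitution.
    def expand(macros, body):
        changed = True
        while changed:
            changed = False
            for name, expansion in macros.items():
                if name in body:
                    body = body.replace(name, expansion)
                    changed = True
        return body

    macros = {"%W": "Wikipedia", "%A": "article about %W"}
    print(expand(macros, "An %A, and another %A."))
    # -> An article about Wikipedia, and another article about Wikipedia.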

Josh



Re: Goertzel/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize

2006-08-15 Thread J. Storrs Hall, PhD.
On Tuesday 15 August 2006 09:03, Ben Goertzel wrote:
> Yes, but the compression software could have learned stuff before
> trying the Hutter Challenge, via compressing a bunch of other files
> ... and storing the knowledge it learned via this experience in its
> long-term memory...

This could have a secondary value in helping the compressor know what kind of 
regularities to look for in the source file -- but if the regularity isn't in 
the source file, you obviously don't want any information about it in either 
the compressed file or the decompressor. So the compressor might get hints 
from such knowledge, but any regularities it actually (should) use are going 
to be present in the source file by assumption.

Josh



Re: Goertzel/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize

2006-08-15 Thread Philip Goetz

On 8/15/06, Mark Waser <[EMAIL PROTECTED]> wrote:

Hi Ben,

I agree with everything that you're saying; however, looking at the
specific task:

. . . . there clearly isn't the opportunity for it to store knowledge
from other/previous files except in its executable, since it explicitly says
"without input from other sources" -- and the size of the executable counts
as part of the compressed size.


Right.  This is probably necessary for the contest; it would be hard
to verify that a program with a large database wasn't in some way
storing lots of Wiki-specific information in that database.
Unfortunately, this restriction makes the contest much less relevant
to AGI.



Re: Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize

2006-08-15 Thread Matt Mahoney
I've read Charniak's book, Statistical Language Learning.  A lot of researchers 
in language modeling are using perplexity (compression ratio) to compare 
models.  But there are some problems with the way this is done.

1. Many evaluations are done on corpora from the LDC which are not free, like 
TREC, WSJ, Brown, etc.
2. Many evaluations use offline models.  They train on a portion of the data 
set and evaluate on the rest, or use leave-one-out, or maybe divide into 3 
parts including a validation set.  This makes it difficult to compare work by 
different researchers because there is no consistency in the details of these 
experiments.
3. The input is usually preprocessed in various ways.  Normally, case is 
folded, the words are converted to tokens from a fixed vocabulary and 
punctuation is removed.  Again there is no consistency in the details, like the 
size of the vocabulary, whether to include numbers, etc.  Also this filtering 
removes useful information, so it is difficult to evaluate the true perplexity 
of the model.

I think a good language model will need to combine many techniques in lexical 
modeling (vocabulary acquisition, stemming, recognizing multiword phrases and 
compound words, dealing with rare words, misspelled words, capitalization, 
punctuation and various nontext forms of junk), semantics (distant bigrams, 
LSA), and syntax (statistical parsers, hidden Markov models) in a uniform 
framework.  Most work is usually in the form of a word trigram model plus one 
other technique on cleaned-up text.  Nobody has put all this stuff together.  
As a result, the best compressors still use byte-level n-gram statistics and at 
most some crude lexical parsing.  I think we can do better.
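
A minimal sketch of the perplexity/compression-ratio correspondence mentioned
above (toy unigram model and text, invented for illustration):

    import math

    # Perplexity and compressed size are two views of the same quantity: the
    # cross-entropy of the model on the test text.
    model = {"the": 0.5, "cat": 0.25, "sat": 0.25}   # hypothetical word probabilities
    text = ["the", "cat", "sat"]

    bits = sum(-math.log2(model[w]) for w in text)   # ideal compressed size in bits
    cross_entropy = bits / len(text)                 # bits per word
    perplexity = 2 ** cross_entropy

    print(bits, cross_entropy, perplexity)           # 5.0 bits, ~1.67 bits/word, ~3.17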
 
-- Matt Mahoney, [EMAIL PROTECTED]



Re: [agi] confirmation paradox

2006-08-15 Thread Philip Goetz

> A further example is:
> S1 = "The fall of the Roman empire is due to Christianity".
> S2 = "The fall of the Roman empire is due to lead poisoning".
> I'm not sure whether S1 or S2 is "more" true.  But the question is how can
> you define the meaning of the NTV associated with S1 or S2?  If we can't,
> why not just leave these statements as non-numerical?
>
> YKY

If you cannot tell the difference, of course you can assign them the
same value. However, very often we state both S1 and S2 as "possible",
but when we are forced to make a choice, we can still say that S1 is "more
likely".

Pei


YKY is advocating the post-modern viewpoint that knowledge is
context-dependent, and true-false assignments and numeric value
judgements are both extremely problematic.  Pei is pointing out the
commonsense, classicist position, and also the refutation of the
post-modern tradition, that some ways of building bridges make bridges
that stay standing, and other ways make bridges that fall down.

I think that the task of "completing the Modernist project", and
uniting the many important observations of both Enlightenment and
post-modernist thinking, has fallen to AI; we MUST resolve these two
viewpoints before we can create an AGI.

- Phil



Re: [agi] Marcus Hutter's lossless compression of human knowledge prize

2006-08-15 Thread Philip Goetz

On 8/12/06, Matt Mahoney <[EMAIL PROTECTED]> wrote:

A common objection to compression as a test for AI is that humans can't do 
compression, so it has nothing to do with AI.  The reason people can't compress 
is that compression requires both AI and deterministic computation.  The human 
brain is not deterministic because it is made of neurons, which are noisy 
analog devices.


People do compression extremely well.  Your eyes send you about 3
gigabytes of data per second; at deeper levels, that is reduced to the
roughly one byte per second that your brain "processes".  (The 3G/sec
figure is pretty accurate; the one byte/second is more contentious; it
is more well-supported to say that we can make choices at a rate of
around one byte per second, IIRC.)

Compression is almost ALL WE DO.  Cognition = compression.  Most of
what we do is make observations of the environment, and compress that
into representations, which trigger responses that are compressed
representations of actions.  Or, if you like, we compress environment
inputs directly into selection of actions which are appropriate in
those circumstances.

- Phil



Re: [agi] Marcus Hutter's lossless compression of human knowledge prize

2006-08-15 Thread Philip Goetz

On 8/15/06, Philip Goetz <[EMAIL PROTECTED]> wrote:

People do compression extremely well.  Your eyes send you about 3
gigabytes of data per second; at deeper levels,


Oops.  Looking at my notes, that should be 2.7Gbits/second (1.4 per
eye), not 3Gbytes/second.

This is reduced to about 165Mbits/second per eye by the time the
signal leaves the lateral geniculate nucleus, which is before it hits
the thalamus or any other part of the brain.
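
A quick back-of-the-envelope check of the reduction these figures imply, using
the numbers exactly as stated:

    # Rough reduction factor implied by the figures above.
    retina_per_eye = 2.7e9 / 2     # ~1.35 Gbit/s per eye at the retina
    post_lgn_per_eye = 165e6       # ~165 Mbit/s per eye after the LGN
    print(retina_per_eye / post_lgn_per_eye)   # ~8.2x reduction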

- Phil



Re: Goertzel/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize

2006-08-15 Thread Ben Goertzel

Ah... well, in that case the contest is indeed even less AGI-relevant
than I thought...

This particular shortcoming of the contest is more a pragmatic than a
philosophical one: Wikipedia is not a complete knowledge domain...
it's an advanced knowledge domain that is only meant to be
interpretable by reference to an additional body of (more elementary)
world-knowledge.  (Similar to the world-knowledge that Cyc tries,
fairly unsuccessfully, to capture.)   But this additional
world-knowledge is voluminous -- so it's quite possible that the
"AGI-natural" ways to compress Wikipedia using an AGI system with a
lot of world-knowledge, are quite different from the best ways to
compress Wikipedia using a small executable file.

Conceptually, a better (though still deeply flawed) contest would be:
Compress this file of advanced knowledge, assuming as background
knowledge this other file of elementary knowledge, in terms of which
the advanced knowledge is defined.

-- Ben G



Re: Mahoney/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize

2006-08-15 Thread Matt Mahoney
I realize it is tempting to use lossy text compression as a test for AI 
because that is what the human brain does when we read text and recall it in 
paraphrased fashion.  We remember the ideas and discard details about the 
expression of those ideas.  A lossy text compressor that did the same thing 
would certainly demonstrate AI.

But there are two problems with using lossy compression as a test of AI:
1. The test is subjective.
2. Lossy compression does not imply AI.

Let's assume we solve the subjectivity problem by having human judges evaluate 
whether the decompressed output is "close enough" to the input.  We already do 
this with lossy image, audio and video compression (without much consensus).

The second problem remains: ideal lossy compression does not imply passing the 
Turing test.  For lossless compression, it can be proven that it does.  Let 
p(s) be the (unknown) probability that s will be the prefix of a text dialog.  
Then a machine that can compute p(s) exactly is able to generate response A to 
question Q with the distribution p(QA)/p(Q), which is indistinguishable from 
human.  The same model minimizes the compressed size, E[log 1/p(s)].

This proof does not hold for lossy compression because different lossless 
models map to identical lossy models.  The desired property of a lossy 
compressor C is that if and only if s1 and s2 have the same meaning (to most 
people), then the encodings C(s1) = C(s2).  This code will ideally have length 
log 1/(p(s1)+p(s2)).  But this does not imply that the decompressor knows 
p(s1) or p(s2).  Thus, the decompressor may decompress to s1 or s2 or choose 
randomly between them.  In general, the output distribution will be different 
from the true distribution p(s1), p(s2), so it will be distinguishable from 
human even if the compression ratio is ideal.

-- Matt Mahoney, [EMAIL PROTECTED]
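
A minimal sketch of the relationship being used here, with a toy distribution
p over short dialogs (invented for illustration): the same model gives both
the ideal code length E[log 1/p(s)] and the response distribution p(QA)/p(Q).

    import math, random

    # Toy model: explicit probabilities over short question+answer strings.
    p = {"Q:hi A:hello": 0.6, "Q:hi A:go away": 0.2, "Q:bye A:bye": 0.2}

    def code_length_bits(s):
        return -math.log2(p[s])                  # ideal lossless size of s

    def respond(question):
        options = {s: q for s, q in p.items() if s.startswith(question)}
        total = sum(options.values())            # this is p(Q)
        r = random.random() * total
        for s, q in options.items():             # sample with probability p(QA)/p(Q)
            r -= q
            if r <= 0:
                return s

    print(code_length_bits("Q:hi A:hello"))      # ~0.74 bits
    print(respond("Q:hi"))                       # "Q:hi A:hello" about 75% of the time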

 



Re: Goertzel/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize

2006-08-15 Thread Mark Waser

Ben >> Conceptually, a better (though still deeply flawed) contest would be:
Compress this file of advanced knowledge, assuming as background
knowledge this other file of elementary knowledge, in terms of which
the advanced knowledge is defined.

Nah.  It wouldn't be much of a contest if they gave out the elementary 
knowledge file, and it would be *much* harder on the organizers.


A much better contest would be if they just had several other undisclosed 
Wikipedia-chunk files and the program had to have comparable compression 
ratios on the undisclosed files as well.  That way, the contestant is 
responsible for assembling the elementary knowledge in a compact format. 
(And the verification against undisclosed files will eliminate cheating).




Re: [agi] Marcus Hutter's lossless compression of human knowledge prize

2006-08-15 Thread Charles D Hixson

Philip Goetz wrote:

On 8/15/06, Philip Goetz <[EMAIL PROTECTED]> wrote:

People do compression extremely well.  Your eyes send you about 3 ...


This is reduced to about 165Mbits/second per eye by the time the
signal leaves the lateral geniculate nucleus, which is before it hits
the thalamus or any other part of the brain.

- Phil 
Well, various things indicate that much of what we do is "taking the 
deltas", i.e., we only notice changes in the signals.  That accounts for 
much of the compression.  There's also something special about how we 
handle repetitive patterns.  (That's probably related "somehow" to 
hypnosis...which might be revealing if we knew enough.)
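
A tiny sketch of "taking the deltas" (the signal values are invented):

    # Delta encoding: store only changes between successive samples.  Slowly
    # varying signals compress well because most deltas are near zero.
    signal = [100, 101, 101, 103, 103, 103, 110]
    deltas = [signal[0]] + [b - a for a, b in zip(signal, signal[1:])]
    print(deltas)                    # [100, 1, 0, 2, 0, 0, 7]

    # Reconstruction is a running sum of the deltas.
    restored, total = [], 0
    for d in deltas:
        total += d
        restored.append(total)
    assert restored == signal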




Re: Mahoney/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize

2006-08-15 Thread Mark Waser



>> 1. The test is subjective.

I disagree.  If you have an automated test with clear criteria like the 
following, it will be completely objective:

  a) the compressing program must be able to output all inconsistencies in 
the corpus (in their original string form), AND
  b) the decompressing program must be able to do the following when 
presented with a standard list of test ideas/pieces of knowledge:

    FOR EACH IDEA/PIECE OF KNOWLEDGE IN THE TEST WHICH IS NOT IN THE LIST OF 
    INCONSISTENCIES
    - if the knowledge is in the corpus, recognize that it is in the corpus.
    - if the negation of the knowledge is in the corpus, recognize that the 
      test knowledge is false according to the corpus.
    - if an incorrect substitution has been made to create the test item from 
      an item in the corpus (i.e. red for yellow, ten for twenty, etc.), 
      recognize that the test knowledge is false according to the corpus.
    - if a possibly correct (hierarchical) substitution has been made to 
      create the test item from the corpus, recognize either a) that the 
      substitution is in the corpus for broader concepts (i.e. testing red 
      for corpus lavender, testing dozens for corpus thirty-seven, etc.) or 
      b) that there is related information in the corpus which the test idea 
      further refines, for narrower substitutions.

>> 2. Lossy compression does not imply AI.
and two sentences before
>> A lossy text compressor that did the same thing (recall it in paraphrased 
fashion) would certainly demonstrate AI.

Require that the decompressing program be able to output all of the 
compressed file's knowledge in ordinary English.  This is a pretty trivial 
task compared to everything else.

    Mark
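
A self-contained toy sketch of the kind of automated grading described above;
the corpus, the test items, and the classify() behaviour are invented
stand-ins, not part of any real contest harness:

    # Hypothetical grading loop: feed test statements to the decompressing
    # program under test and check its classification against the expected one.
    CORPUS = {"the sky is blue", "grass is green"}

    def classify(statement):
        # Stand-in for the program under test: recognize corpus knowledge and
        # negations of corpus knowledge; a real entrant would go far beyond this.
        if statement in CORPUS:
            return "in corpus"
        if statement.replace("not ", "") in CORPUS:
            return "negation of corpus"
        return "unknown"

    tests = [("the sky is blue", "in corpus"),
             ("grass is not green", "negation of corpus")]

    score = sum(classify(s) == expected for s, expected in tests) / len(tests)
    print(score)    # 1.0 for this toy "decompressor"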
 

Re: [agi] confirmation paradox

2006-08-15 Thread Ben Goertzel

Phil, I see no conceptual problems with using probability theory to
define context-dependent or viewpoint-dependent probabilities...

Regarding YKY's example, "causation" is a subtle concept going beyond
probability (but strongly probabilistically based), and indeed any
mind needs to have fairly general and at least moderately clever
methods for dealing with it...  But I see no problem with the
assignment of numerical truth values to causal statements.  Judea
Pearl's math does it; Novamente's math does it...

ben g



Re: Mahoney/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize

2006-08-15 Thread Matt Mahoney
Mark,

Could you please write a test program to objectively test for lossy text 
compression using your algorithm?  You can start by listing all of the 
inconsistencies in Wikipedia.  To make the test objective, you will either 
need a function to test whether two strings are inconsistent or not, or else 
you need to show that people will never disagree on this matter.

>> Lossy compression does not imply AI.
>> A lossy text compressor that did the same thing (recall it in paraphrased 
fashion) would certainly demonstrate AI.

I disagree that these are inconsistent.  Demonstrating and implying are 
different things.

-- Matt Mahoney, [EMAIL PROTECTED]

 




Re: Goertzel/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize

2006-08-15 Thread Matt Mahoney
Ben wrote:
>Conceptually, a better (though still deeply flawed) contest would be:
>Compress this file of advanced knowledge, assuming as background
>knowledge this other file of elementary knowledge, in terms of which
>the advanced knowledge is defined.

How about if you sort the input to put the elementary knowledge at the front?

-- Matt Mahoney, [EMAIL PROTECTED]
 





Re: [agi] confirmation paradox

2006-08-15 Thread Philip Goetz

On 8/15/06, Ben Goertzel <[EMAIL PROTECTED]> wrote:

Phil, I see no conceptual problems with using probability theory to
define context-dependent or viewpoint-dependent probabilities...

Regarding YKY's example, "causation" is a subtle concept going beyond
probability (but strongly probabilistically based), and indeed any
mind needs to have fairly general and at least moderately clever
methods for dealing with it  But I see no problem with the
assignment of numerical truth values to causal statements.  Judea
Pearl's math does it; Novamente's math does it...


There isn't a problem in doing it, but there are serious doubts about whether
an approach in which symbols have constant meanings (the same symbol
has the same semantics in different propositions) can lead to AI.



Re: Goertzel/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize

2006-08-15 Thread Philip Goetz

On 8/15/06, Mark Waser <[EMAIL PROTECTED]> wrote:

Ben >> Conceptually, a better (though still deeply flawed) contest would be:
Compress this file of advanced knowledge, assuming as background
knowledge this other file of elementary knowledge, in terms of which
the advanced knowledge is defined.

Nah.  It wouldn't be much of a contest if they gave out the elementary 
knowledge file, and it would be *much* harder on the organizers.


How about using OpenCyc?



Re: Goertzel/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize

2006-08-15 Thread Philip Goetz

On 8/15/06, Matt Mahoney <[EMAIL PROTECTED]> wrote:

Ben wrote:
>Conceptually, a better (though still deeply flawed) contest would be:
>Compress this file of advanced knowledge, assuming as background
>knowledge this other file of elementary knowledge, in terms of which
>the advanced knowledge is defined.

How about if you sort the input to put the elementary knowledge at the front?


I think that sort will take considerably longer than nlogn.  :)



Re: Mahoney/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize

2006-08-15 Thread Mark Waser



>> Could you please write a test program to objectively test for lossy text 
compression using your algorithm?

Writing the test program for the decompressing program is relatively easy.  
Since the requirement was that the decompressing program be able to recognize 
when a piece of knowledge is in the corpus, when its negation is in the 
corpus, when an incorrect substitution has been made, and when a correct 
substitution has been made -- all you/I would need to do is invent (or obtain 
-- see two paragraphs down) a reasonably sized set of knowledge pieces to 
test, put them in a file, feed them to the decompressing program, and 
automatically grade its answers as to which category each falls into.  A 
reasonably small number of test cases should suffice as long as you don't 
advertise exactly which test cases are in the final test, but once you're 
having competitors generate each other's tests, you can go hog-wild with the 
number.

Writing the test program for the compressing program is also easy, but 
developing the master list of inconsistencies is going to be a real 
difficulty -- unless you use the various contenders themselves to generate 
various versions of the list.  I strongly doubt that most contenders will get 
false positives but strongly suspect that finding all of the inconsistencies 
will be a major area for improvement as the systems become more sophisticated.

Note also that minor modifications of any decompressing program should also 
be able to create test cases for your decompressor test.  Simply ask it for a 
random sampling of knowledge, for the negations of a random sampling of 
knowledge, for some incorrect substitutions, and some hierarchical 
substitutions of each type.

Any *real* contenders should be able to easily generate the tests for you.

>> You can start by listing all of the inconsistencies in Wikipedia.

See paragraph 2 above.

>> To make the test objective, you will either need a function to test 
whether two strings are inconsistent or not, or else you need to show that 
people will never disagree on this matter.

It is impossible to show that people will never disagree on a matter.

On the other hand, a knowledge compressor is going to have to recognize when 
two pieces of knowledge conflict (i.e. when two strings parse into knowledge 
statements that cannot coexist).  You can always have a contender evaluate 
whether a competitor's "inconsistencies" are incorrect and then do some 
examination by hand on a representative sample where the contender says it 
can't tell (since, again, I suspect you'll find few misidentified 
inconsistencies -- but that finding all of the inconsistencies will be ever 
subject to improvement).

>> >> Lossy compression does not imply AI.
>> >> A lossy text compressor that did the same thing (recall it in 
paraphrased fashion) would certainly demonstrate AI.
>> I disagree that these are inconsistent.  Demonstrating and implying are 
different things.

I didn't say that they were inconsistent.  What I meant to say was:

1. that a decompressing program that is able to output all of the compressed 
   file's knowledge in ordinary English would, in your words, "certainly 
   demonstrate AI".
2. given statement 1, it's not a problem that "lossy compression does not 
   imply AI" since the decompressing program would still "certainly 
   demonstrate AI".
 

Re: Goetz/Goertzel/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize

2006-08-15 Thread Mark Waser

How about using OpenCyc?


Actually, instructing the competitors to compress both the OpenCyc corpus 
AND then the Wikipedia sample in sequence and measuring the size of both 
*would* be an interesting and probably good contest.


- Original Message - 
From: "Philip Goetz" <[EMAIL PROTECTED]>

To: 
Sent: Tuesday, August 15, 2006 3:16 PM
Subject: **SPAM** Re: Goertzel/Sampo: [agi] Marcus Hutter's lossless 
compression of human knowledge prize




On 8/15/06, Mark Waser <[EMAIL PROTECTED]> wrote:
Ben >> Conceptually, a better (though still deeply flawed) contest would 
be:

Compress this file of advanced knowledge, assuming as background
knowledge this other file of elementary knowledge, in terms of which
the advanced knowledge is defined.

Nah.  It wouldn't be much of a contest if they gave the elementary knowledge
file, and it would be *much* harder on the organizers.


How about using OpenCyc?

---
To unsubscribe, change your address, or temporarily deactivate your 
subscription, please go to 
http://v2.listbox.com/member/[EMAIL PROTECTED]





---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/[EMAIL PROTECTED]


Re: Goetz/Goertzel/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize

2006-08-15 Thread Philip Goetz

On 8/15/06, Mark Waser <[EMAIL PROTECTED]> wrote:

Actually, instructing the competitors to compress both the OpenCyc corpus
AND then the Wikipedia sample in sequence and measuring the size of both
*would* be an interesting and probably good contest.


I think it would be more interesting for it to use the OpenCyc corpus
as its knowledge for compressing the Wikipedia sample.  The point is
to demonstrate intelligent use of information, not to get a wider
variety of data.

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/[EMAIL PROTECTED]


Re: [agi] Marcus Hutter's lossless compression of human knowledge prize

2006-08-15 Thread Philip Goetz

I proposed knowledge-based text compression as a dissertation topic,
back around 1991, but my advisor turned it down.  I never got back to
the topic because there wasn't any money in it - text is already so
small, relative to audio and video, that it was clear that the money
was in audio and video compression.  Also, non-AI methods were already
getting near enough the theoretical limits for compression that it
didn't seem worthwhile.

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/[EMAIL PROTECTED]


Re: Mahoney/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize

2006-08-15 Thread Matt Mahoney
You could use Keogh's compression dissimilarity measure to test for inconsistency.
http://www.cs.ucr.edu/~eamonn/SIGKDD_2004_long.pdf

  CDM(x,y) = C(xy)/(C(x)+C(y)).

where x and y are strings, and C(x) means the compressed size of x (lossless).  The measure ranges from about 0.5 if x = y to about 1.0 if x and y do not share any information.  Then,

  CDM("it is hot", "it is very warm") < CDM("it is hot", "it is cold").

assuming your compressor uses a good language model.

Now if only we had some test to tell which compressors have the best language models...

-- Matt Mahoney, [EMAIL PROTECTED]

- Original Message - 
From: Mark Waser <[EMAIL PROTECTED]>
To: agi@v2.listbox.com
Sent: Tuesday, August 15, 2006 3:22:10 PM
Subject: Re: Mahoney/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize

 


>> Could 
you please write a test program to objectively test for lossy text compression 
using your algorithm? 
 
Writing the test program for the decompressing 
program is relatively easy.  Since the requirement was that the 
decompressing program be able to recognize when a piece of knowledge is in the 
corpus, when its negation is in the corpus, when an incorrect substitution has 
been made, and when a correct substitution has been made -- all you/I would need 
to do is invent (or obtain -- see two paragraphs down) a 
reasonably sized set of knowledge pieces to test, put them in a file, feed them 
to the decompressing program, and automatically grade its answers as to which 
category each falls into.  A reasonably small number of test 
cases should suffice as long as you don't advertise exactly which test cases are 
in the final test but once you're having competitors generate each other's 
tests, you can go hog-wild with the number.
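
A minimal sketch of that automatic grading step (Python; the classify() 
interface and the four category labels are my invention purely for 
illustration -- the contest specifies nothing of the sort):

import csv

# The four answer categories described above.
LABELS = {"in_corpus", "negation_in_corpus",
          "incorrect_substitution", "correct_substitution"}

def grade(classify, test_file):
    # classify: the decompressor's hypothetical interface -- a callable that
    # takes a knowledge statement (string) and returns one of LABELS.
    # test_file: CSV with one "statement,expected_label" pair per row.
    right = total = 0
    with open(test_file, newline="", encoding="utf-8") as f:
        for statement, expected in csv.reader(f):
            if expected not in LABELS:
                raise ValueError("bad label: " + expected)
            right += int(classify(statement) == expected)
            total += 1
    return right / total

Each contender's classify() would then be scored against a holdout file of 
test cases that is kept secret, per the above.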
 
Writing the test program for the compressing 
program is also easy but developing the master list of inconsistencies is going 
to be a real difficulty -- unless you use the various contenders themselves to 
generate various versions of the list.  I strongly doubt that most 
contenders will get false positives but strongly suspect that finding all of the 
inconsistencies will be a major area for improvement as the systems become more 
sophisticated.
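
A much-simplified sketch of that cross-check, assuming (my assumption, purely 
for illustration) that each contender can emit its claimed inconsistencies as 
a set of sentence pairs:

import random

def cross_check(lists, sample_size=50):
    # lists: dict mapping contender name -> set of (string, string) pairs
    # that the contender claims are mutually inconsistent.
    master = set().union(*lists.values())      # candidate master list
    # pairs not claimed by every contender get flagged for review
    disputed = [p for p in master
                if not all(p in claimed for claimed in lists.values())]
    # hand-examine only a representative sample of the disputed pairs
    return master, random.sample(disputed, min(sample_size, len(disputed)))

The undisputed pairs go straight onto the master list; only a sample of the 
disputed ones would need examination by hand.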
 
Note also that minor modifications of any 
decompressing program should also be able to create test cases for your 
decompressor test.  Simply ask it for a random sampling of knowledge, for 
the negations of a random sampling of knowledge, for some incorrect 
substitutions, and some hierarchical substitutions of each type.
 
Any *real* contenders should be able to easily 
generate the tests for you.
 
>> You 
can start by listing all of the inconsistencies in 
Wikipedia.
 
see paragraph 2 above
 
>> To 
make the test objective, you will either need a function to test whether two 
strings are inconsistent or not, or else you need to show that people will never 
disagree on this matter.
It is impossible to show that people will never 
disagree on a matter.
 
On the other hand, a knowledge compressor is going 
to have to recognize when two pieces of knowledge conflict (i.e. when two 
strings parse into knowledge statements that cannot coexist).  You can 
always have a contender evaluate whether a competitor's 
"inconsistencies" are incorrect and then do some examination by hand on a 
representative sample where the contender says it can't tell (since, 
again, I suspect you'll find few misidentified inconsistencies -- but that 
finding all of the inconsistencies will always be subject to improvement).
 
>> >> Lossy compression does not imply AI.
>> >> A lossy text compressor that did the same thing (recall it in paraphrased fashion) would certainly demonstrate AI.
>> I disagree that these are inconsistent.  Demonstrating and implying are different things.
 
I didn't say that they were inconsistent.  What I meant to say was:

  1. that a decompressing program that is able to output all of the compressed file's knowledge in ordinary English would, in your words, "certainly demonstrate AI".
  2. given statement 1, it's not a problem that "lossy compression does not imply AI" since the decompressing program would still "certainly demonstrate AI".
 
- Original Message - 
  From: Matt Mahoney
  To: agi@v2.listbox.com
  Sent: Tuesday, August 15, 2006 2:23 PM
  Subject: Re: Mahoney/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize
  
  Mark,

  Could you please write a test program to objectively test for lossy text 
  compression using your algorithm?  You can start by listing all of the 
  inconsistencies in Wikipedia.  To make the test objective, you will either 
  need a function to test whether two strings are inconsistent or not, or 
  else you need to show that people will never disagree on this matter.

  >> Lossy compression does not imply AI.
  >> A lossy text compressor that did the same thing (recall it in paraphrased fashion) would certainly demonstrate AI.

  I d

Re: [agi] confirmation paradox

2006-08-15 Thread Ben Goertzel

Hi,

Phil wrote:

There isn't a problem in doing it, but there's serious doubts whether
an approach in which symbols have constant meanings (the same symbol
has the same semantics in different propositions) can lead to AI.


Sure, but neither Novamente nor NARS (for example) has the problematic
issue you mention.  In both of these systems, symbols and other
patterns may have context-dependent semantics...

-- Ben

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/[EMAIL PROTECTED]


Re: Goetz/Goertzel/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize

2006-08-15 Thread Mark Waser

I think it would be more interesting for it to use the OpenCyc corpus
as its knowledge for compressing the Wikipedia sample.  The point is
to demonstrate intelligent use of information, not to get a wider
variety of data.


:-)  My assumption is that the compression program is building/adding to a 
knowledge base when it reads a file/corpus and then it exports a compressed 
version of either 1) all of the knowledge from the newest file/corpus OR 2) 
the file/corpus knowledge minus whatever was previously known from a 
previous file/corpus.


   If you "compressed" the OpenCyc corpus, threw away the compressed file, 
and then compressed the Wikipedia sample with the output option set to #2 
above excluding the OpenCyc corpus, then my program would be doing exactly 
what you are suggesting (and doing it in the easiest possible way since you 
need some way to get the OpenCyc corpus into the knowledge base).


   The *only* real difference between your suggestion and mine is that you 
are ignoring the size of the compressed OpenCyc file.
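
A back-of-the-envelope sketch of the two scoring rules (Python, with zlib 
standing in for the entry's compressor and made-up file names; a real entry 
would carry the model state it learned from OpenCyc into the Wikipedia pass 
rather than literally concatenating files):

import zlib

def C(data):
    # compressed size in bytes under the stand-in compressor
    return len(zlib.compress(data, 9))

opencyc = open("opencyc.txt", "rb").read()          # hypothetical file names
wiki = open("wikipedia_sample.txt", "rb").read()

c_opencyc = C(opencyc)
c_both = C(opencyc + wiki)
c_wiki_given_opencyc = c_both - c_opencyc           # option #2: new knowledge only

my_score = c_both                     # count the OpenCyc and Wikipedia output together
your_score = c_wiki_given_opencyc     # ignore the size of the compressed OpenCyc part

print(my_score, your_score)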


- Original Message - 
From: "Philip Goetz" <[EMAIL PROTECTED]>

To: 
Sent: Tuesday, August 15, 2006 3:37 PM
Subject: **SPAM** Re: Goetz/Goertzel/Sampo: [agi] Marcus Hutter's lossless 
compression of human knowledge prize




On 8/15/06, Mark Waser <[EMAIL PROTECTED]> wrote:

Actually, instructing the competitors to compress both the OpenCyc corpus
AND then the Wikipedia sample in sequence and measuring the size of both
*would* be an interesting and probably good contest.


I think it would be more interesting for it to use the OpenCyc corpus
as its knowledge for compressing the Wikipedia sample.  The point is
to demonstrate intelligent use of information, not to get a wider
variety of data.

---
To unsubscribe, change your address, or temporarily deactivate your 
subscription, please go to 
http://v2.listbox.com/member/[EMAIL PROTECTED]





---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/[EMAIL PROTECTED]


Re: Mahoney/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize

2006-08-15 Thread Philip Goetz

On 8/15/06, Matt Mahoney <[EMAIL PROTECTED]> wrote:



I realize it is tempting to use lossy text compression as a test for AI
because that is what the human brain does when we read text and recall it in
paraphrased fashion.  We remember the ideas and discard details about the
expression of those ideas.  A lossy text compressor that did the same thing
would certainly demonstrate AI.

But there are two problems with using lossy compression as a test of AI:
1. The test is subjective.
2. Lossy compression does not imply AI.

Let's assume we solve the subjectivity problem by having human judges
evaluate whether the decompressed output is "close enough" to the input.  We
already do this with lossy image, audio and video compression (without much
consensus).

The second problem remains: ideal lossy compression does not imply passing
the Turing test.  For lossless compression, it can be proven that it does.
Let p(s) be the (unknown) probability that s will be the prefix of a text
dialog.  Then a machine that can compute p(s) exactly is able to generate
response A to question Q with the distribution p(QA)/p(Q) which is
indistinguishable from human.  The same model minimizes the compressed size,
E[log 1/p(s)].
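
To make the quoted construction concrete: any model that assigns a 
probability p(s) to strings can be used both ways -- to sample a reply A 
from p(QA)/p(Q) and to code text losslessly in about log2 1/p(s) bits.  A 
toy sketch (mine, not Matt's), with a made-up character bigram model 
standing in for p:

import math, random

ALPHABET = "ab "
P = {                      # P[prev][next] = p(next | prev); each row sums to 1
    "a": {"a": 0.1, "b": 0.6, " ": 0.3},
    "b": {"a": 0.5, "b": 0.2, " ": 0.3},
    " ": {"a": 0.5, "b": 0.5, " ": 0.0},
}

def prob(s, start=" "):
    # p(s): probability of string s under the chain, starting from a space
    p, prev = 1.0, start
    for c in s:
        p *= P[prev][c]
        prev = c
    return p

def sample_response(Q, length=5):
    # draw A with probability p(QA)/p(Q) by sampling the conditionals symbol by symbol
    prev, A = (Q[-1] if Q else " "), ""
    for _ in range(length):
        nxt = random.choices(ALPHABET, weights=[P[prev][c] for c in ALPHABET])[0]
        A, prev = A + nxt, nxt
    return A

def code_length(s):
    # ideal lossless code length in bits, log2(1/p(s)) -- essentially what an
    # arithmetic coder driven by the same model achieves
    return -math.log2(prob(s))

print(sample_response("ab"), code_length("ab"))

The point is only that a single object, p, drives both the chat-style 
generation and the compression.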


This proof is really not useful.  The Turing test is subjective; all
you are saying is that lossy compression is lossy, and lossless
compression is not.  A solution to the first problem would also solve
the second problem.

It is necessary to allow lossy compression in order for this
compression test to be useful for AI, because lossless and
uncomprehending compression is already bumping up against the
theoretical limits for text compression.

- Phil

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/[EMAIL PROTECTED]


Re: Mahoney/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize

2006-08-15 Thread Mark Waser



>> You 
could use Keogh's compression dissimilarity measure to test for 
inconsistency.
I don't think so.  Take the following strings: 
"I only used red and yellow paint in the painting", "I painted the rose in my 
favorite color", "My favorite color is pink", "Orange is created by mixing red 
and yellow", "Pink is created by mixing red and white".  How is Keogh's 
measure going to help you with that?
 
The problem is that Keogh's measure is intended for 
data-mining where you have separate instances, not one big entwined Gordian 
knot.
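
As an aside, the measure itself is trivial to compute -- a minimal sketch in 
Python, with zlib standing in for the compressor C purely for illustration 
(Matt's comparison assumes a compressor with a real language model, which 
zlib is not):

import zlib

def C(s):
    # compressed size of s under the stand-in lossless compressor
    return len(zlib.compress(s.encode("utf-8"), 9))

def CDM(x, y):
    # Keogh's compression dissimilarity measure, C(xy) / (C(x) + C(y))
    return C(x + y) / (C(x) + C(y))

print(CDM("it is hot", "it is very warm"))   # should come out lower ...
print(CDM("it is hot", "it is cold"))        # ... than this, given a good model

With a strong language model the first value should be the smaller one; with 
zlib on strings this short, don't expect the inequality to hold.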
 
>> Now if 
only we had some test to tell which compressors have the best language 
models...
Huh? By definition, the compressor with the best 
language model is the one with the highest compression ratio.
 

  - Original Message - 
  From: Matt Mahoney
  To: agi@v2.listbox.com
  Sent: Tuesday, August 15, 2006 3:54 PM
  Subject: Re: Mahoney/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize
  
  You could use Keogh's compression dissimilarity measure to test for inconsistency.
  http://www.cs.ucr.edu/~eamonn/SIGKDD_2004_long.pdf

  CDM(x,y) = C(xy)/(C(x)+C(y)).

  where x and y are strings, and C(x) means the compressed size of x (lossless).  The measure ranges from about 0.5 if x = y to about 1.0 if x and y do not share any information.  Then,

  CDM("it is hot", "it is very warm") < CDM("it is hot", "it is cold").

  assuming your compressor uses a good language model.

  Now if only we had some test to tell which compressors have the best language models...

  -- Matt Mahoney, [EMAIL PROTECTED]
  
  - Original Message - 
  From: Mark Waser <[EMAIL PROTECTED]>
  To: agi@v2.listbox.com
  Sent: Tuesday, August 15, 2006 3:22:10 PM
  Subject: Re: Mahoney/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize
  

  >> Could you please write a test program to objectively test for lossy 
  text compression using your algorithm? 
   
  Writing the test program for the 
  decompressing program is relatively easy.  Since the requirement was 
  that the decompressing program be able to recognize when a piece of knowledge 
  is in the corpus, when its negation is in the corpus, when an incorrect 
  substitution has been made, and when a correct substitution has been made -- 
  all you/I would need to do is invent (or obtain -- see two 
  paragraphs down) a reasonably sized set of knowledge pieces to test, put 
  them in a file, feed them to the decompressing program, and automatically 
  grade its answers as to which category each falls into.  A 
  reasonably small number of test cases should suffice as long as you don't 
  advertise exactly which test cases are in the final test but once you're 
  having competitors generate each other's tests, you can go hog-wild with the 
  number.
   
  Writing the test program for the compressing 
  program is also easy but developing the master list of inconsistencies is 
  going to be a real difficulty -- unless you use the various contenders 
  themselves to generate various versions of the list.  I strongly doubt 
  that most contenders will get false positives but strongly suspect that 
  finding all of the inconsistencies will be a major area for improvement as the 
  systems become more sophisticated.
   
  Note also that minor modifications of any 
  decompressing program should also be able to create test cases for your 
  decompressor test.  Simply ask it for a random sampling of knowledge, for 
  the negations of a random sampling of knowledge, for some incorrect 
  substitutions, and some hierarchical substitutions of each type.
   
  Any *real* contenders should be able to easily 
  generate the tests for you.
   
  >> You 
  can start by listing all of the inconsistencies in 
  Wikipedia.
   
  see paragraph 2 above
   
  >> To 
  make the test objective, you will either need a function to test whether two 
  strings are inconsistent or not, or else you need to show that people will 
  never disagree on this matter.
  It is impossible to show that people will never 
  disagree on a matter.
   
  On the other hand, a knowledge compressor is 
  going to have to recognize when two pieces of knowledge conflict (i.e. when 
  two strings parse into knowledge statements that cannot coexist).  You 
  can always have a contender evaluate whether a competitor's 
  "inconsistencies" are incorrect and then do some examination by hand on a 
  representative sample where the contender says it can't tell (since, 
  again, I suspect you'll find few misidentified inconsistencies -- but 
  that finding all of the inconsistencies will always be subject to improvement).
   
  >> >> Lossy compression does not imply AI.
  >> >> A lossy text compressor that did the same thing (recall it in paraphrased fashion) would certainly demonstrate AI.
  >> I disagree that these are inconsistent.  Demonstrating and implying are different things.
   
  I didn't say tha

Re: Mahoney/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize

2006-08-15 Thread Matt Mahoney
Mark wrote:
>Huh? By definition, the compressor with the best language model is the one with the highest compression ratio.

I'm glad we finally agree :-)

>> You could use Keogh's compression dissimilarity measure to test for inconsistency.

I don't think so.  Take the following strings: "I only used red and yellow paint in the painting", "I painted the rose in my favorite color", "My favorite color is pink", "Orange is created by mixing red and yellow", "Pink is created by mixing red and white".  How is Keogh's measure going to help you with that?

You group the strings into a fixed set and a variable set and concatenate them.  The variable set could be just "I only used red and yellow paint in the painting", and you compare the CDM replacing "yellow" with "white".  Of course your compressor must be capable of abstract reasoning and have a world model.

To answer Phil's post: Text compression is only near the theoretical limits for small files.  For large files, there is progress to be made integrating known syntactic and semantic modeling techniques into general purpose compressors.  The theoretical limit is about 1 bpc and we are not there yet.  See the graph at http://cs.fit.edu/~mmahoney/dissertation/

The proof that I gave that a language model implies passing the Turing test is for the ideal case where all people share identical models.  The ideal case is deterministic.  For the real case where models differ, passing the test is easier because a judge will attribute some machine errors to normal human variation.  I discuss this in more detail at http://cs.fit.edu/~mmahoney/compression/rationale.html (text compression is equivalent to AI).

It is really hard to get funding for text compression research (or AI).  I had to change my dissertation topic to network security in 1999 because my advisor had funding for that.  As a postdoc I applied for a $50K NSF grant for a text compression contest.  It was rejected, so I started one without funding (which we now have).  The problem is that many people do not believe that text compression is related to AI (even though speech recognition researchers have been evaluating models by perplexity since the early 1990's).

-- Matt Mahoney, [EMAIL PROTECTED]

- Original Message - 
From: Mark Waser <[EMAIL PROTECTED]>
To: agi@v2.listbox.com
Sent: Tuesday, August 15, 2006 5:00:47 PM
Subject: Re: Mahoney/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize

 


>> You 
could use Keogh's compression dissimilarity measure to test for 
inconsistency.
I don't think so.  Take the following strings: 
"I only used red and yellow paint in the painting", "I painted the rose in my 
favorite color", "My favorite color is pink", "Orange is created by mixing red 
and yellow", "Pink is created by mixing red and white".  How is Keogh's 
measure going to help you with that?
 
The problem is that Keogh's measure is intended for 
data-mining where you have separate instances, not one big entwined Gordian 
knot.
 
>> Now if 
only we had some test to tell which compressors have the best language 
models...
Huh? By definition, the compressor with the best 
language model is the one with the highest compression ratio.
 

  - Original Message - 
  From: Matt Mahoney
  To: agi@v2.listbox.com
  Sent: Tuesday, August 15, 2006 3:54 PM
  Subject: Re: Mahoney/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize
  
  You could use Keogh's compression dissimilarity measure to test for inconsistency.
  http://www.cs.ucr.edu/~eamonn/SIGKDD_2004_long.pdf

  CDM(x,y) = C(xy)/(C(x)+C(y)).

  where x and y are strings, and C(x) means the compressed size of x (lossless).  The measure ranges from about 0.5 if x = y to about 1.0 if x and y do not share any information.  Then,

  CDM("it is hot", "it is very warm") < CDM("it is hot", "it is cold").

  assuming your compressor uses a good language model.

  Now if only we had some test to tell which compressors have the best language models...

  -- Matt Mahoney, [EMAIL PROTECTED]
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/[EMAIL PROTECTED]


Re: Goetz/Goertzel/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize

2006-08-15 Thread Philip Goetz

On 8/15/06, Mark Waser <[EMAIL PROTECTED]> wrote:

> I think it would be more interesting for it to use the OpenCyc corpus
> as its knowledge for compressing the Wikipedia sample.  The point is
> to demonstrate intelligent use of information, not to get a wider
> variety of data.

:-)  My assumption is that the compression program is building/adding to a
knowledge base when it reads a file/corpus and then it exports a compressed
version of either 1) all of the knowledge from the newest file/corpus OR 2)
the file/corpus knowledge minus whatever was previously known from a
previous file/corpus.

If you "compressed" the OpenCyc corpus, threw away the compressed file,
and then compressed the Wikipedia sample with the output option set to #2
above excluding the OpenCyc corpus, then my program would be doing exactly
what you are suggesting (and doing it in the easiest possible way since you
need some way to get the OpenCyc corpus into the knowledge base).


Yes.  Right.


The *only* real difference between your suggestion and mine is that you
are ignoring the size of the compressed OpenCyc file.


Right.  Which is an important difference.

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/[EMAIL PROTECTED]