>> 1. The test is subjective.

I disagree. If you have an automated test with clear criteria like the following, it will be completely objective (a sketch of such a harness follows the criteria below):
a) the compressing program must be able to output all inconsistencies in the corpus (in their original string form), AND
b) the decompressing program must be able to do the following when presented with a standard list of test ideas/pieces of knowledge:

FOR EACH IDEA/PIECE OF KNOWLEDGE IN THE TEST WHICH IS NOT IN THE LIST OF INCONSISTENCIES:
- if the knowledge is in the corpus, recognize that it is in the corpus.
- if the negation of the knowledge is in the corpus, recognize that the test knowledge is false according to the corpus.
- if an incorrect substitution has been made to create the test item from an item in the corpus (e.g. red for yellow, ten for twenty, etc.), recognize that the test knowledge is false according to the corpus.
- if a possibly correct (hierarchical) substitution has been made to create the test item from an item in the corpus, recognize either a) that the corpus contains the broader concept (e.g. testing red against corpus lavender, testing dozens against corpus thirty-seven, etc.) or b) that there is related information in the corpus which the test idea further refines, for narrower substitutions.
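To make criterion (b) concrete, here is a minimal Python sketch of what such an automated harness might look like. Everything in it is hypothetical: the decompressor interface, the judge() method, the verdict labels, and the test-item format are my assumptions, not part of any existing compressor.

# Hypothetical harness for the objectivity test sketched above.
# The decompressor interface and the Verdict labels are assumptions.
from enum import Enum

class Verdict(Enum):
    IN_CORPUS = 1            # knowledge is in the corpus
    FALSE_NEGATION = 2       # negation is in the corpus
    FALSE_SUBSTITUTION = 3   # incorrect substitution detected
    BROADER_IN_CORPUS = 4    # corpus holds the broader concept
    REFINES_CORPUS = 5       # test idea refines related corpus info

def run_test(decompressor, test_items, inconsistencies):
    """Score a decompressor on a standard list of (item, expected) pairs,
    where `expected` is the Verdict the test's authors assigned."""
    scored = []
    for item, expected in test_items:
        if item in inconsistencies:
            continue  # criterion (b) skips listed inconsistencies
        scored.append(decompressor.judge(item) == expected)
    return sum(scored) / len(scored)  # objective score in [0, 1]

Because the score is computed mechanically from fixed criteria, two runs on the same decompressor always agree, which is the sense in which the test is objective.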
>> 2. Lossy compression does not imply AI.

and two sentences before:

>> A lossy text compressor that did the same thing (recall it in paraphrased fashion) would certainly demonstrate AI.

Require that the decompressing program be able to output all of the compressed file's knowledge in ordinary English. This is a pretty trivial task compared to everything else.
Mark
----- Original Message -----
Sent: Tuesday, August 15, 2006 12:27 PM
Subject: Re: Mahoney/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize
I realize it is tempting to use lossy text compression as a test for AI because that is what the human brain does when we read text and recall it in paraphrased fashion. We remember the ideas and discard details about the expression of those ideas. A lossy text compressor that did the same thing would certainly demonstrate AI.

But there are two problems with using lossy compression as a test of AI: 1. The test is subjective. 2. Lossy compression does not imply AI.
Let's assume we solve the subjectivity problem by having human judges evaluate whether the decompressed output is "close enough" to the input. We already do this with lossy image, audio and video compression (without much consensus).
The second problem remains: ideal lossy compression does not imply passing the Turing test. For lossless compression, it can be proven that it does. Let p(s) be the (unknown) probability that s will be the prefix of a text dialog. Then a machine that can compute p(s) exactly is able to generate response A to question Q with the distribution p(QA)/p(Q), which is indistinguishable from human. The same model minimizes the compressed size, E[log 1/p(s)].
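To see the duality concretely, here is a toy Python sketch; the bigram character model and the sample corpus are stand-ins I invented for any real model of p(s). The same conditional probabilities drive both response generation (sampling A with probability p(QA)/p(Q)) and the ideal lossless code length log 1/p(s) that an arithmetic coder would approach.

# Toy illustration: one probability model serves both prediction and
# lossless compression. The corpus and bigram model are invented.
import math, random
from collections import Counter, defaultdict

corpus = "the star and the moon appear together at night. the moon is bright."
ALPHABET = sorted(set(corpus))

counts = defaultdict(Counter)        # bigram counts, add-one smoothed below
for a, b in zip(corpus, corpus[1:]):
    counts[a][b] += 1

def p_next(ctx, c):                  # p(next char | previous char)
    return (counts[ctx][c] + 1) / (sum(counts[ctx].values()) + len(ALPHABET))

def code_length_bits(s):
    """Ideal lossless code length: log2 1/p(s), summed per symbol."""
    return sum(math.log2(1 / p_next(a, b)) for a, b in zip(s, s[1:]))

def respond(question, n=20):
    """Sample a continuation A with probability p(QA)/p(Q) by chaining
    the same conditionals the compressor uses."""
    s = question
    for _ in range(n):
        r, acc = random.random(), 0.0
        for c in ALPHABET:
            acc += p_next(s[-1], c)
            if acc >= r:
                s += c
                break
        else:
            s += ALPHABET[-1]        # guard against rounding at acc ~ 1.0
    return s[len(question):]

print(code_length_bits(corpus))      # bits an arithmetic coder would need
print(respond("the moon "))          # a continuation drawn from p(QA)/p(Q)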
This proof does not hold for lossy compression because different lossless models map to identical lossy models. The desired property of a lossy compressor C is that the encodings C(s1) = C(s2) if and only if s1 and s2 have the same meaning (to most people). This code will ideally have length log 1/(p(s1)+p(s2)). But this does not imply that the decompressor knows p(s1) or p(s2). Thus, the decompressor may decompress to s1 or s2 or choose randomly between them. In general, the output distribution will be different from the true distribution p(s1), p(s2), so it will be distinguishable from human even if the compression ratio is ideal.
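A tiny numerical illustration of this (the probabilities are invented): let s1 and s2 be two paraphrases of the same idea with true probabilities 0.03 and 0.01.

# Ideal lossy code length vs. the output distribution the decoder loses.
# The probabilities are invented for the example.
import math

p_s1, p_s2 = 0.03, 0.01
ideal_bits = math.log2(1 / (p_s1 + p_s2))  # ideal length of the shared code

# The decoder sees one codeword for both strings. Not knowing p(s1) and
# p(s2), it might emit each with probability 1/2, instead of the true
# conditional split below -- detectable, despite the ideal ratio.
true_split = p_s1 / (p_s1 + p_s2)          # P(s1 | meaning) = 0.75
print(f"{ideal_bits:.2f} bits; true split {true_split:.2f} vs. decoder 0.50")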
-- Matt Mahoney, [EMAIL PROTECTED]
----- Original Message -----
From: Mark Waser <[EMAIL PROTECTED]>
To: [email protected]
Sent: Tuesday, August 15, 2006 9:28:26 AM
Subject: Re: Mahoney/Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize
>> I don't see any point in this debate over lossless vs. lossy compression

Let's see if I can simplify it.
- The stated goal is compressing human knowledge.
- The exact same knowledge can always be expressed in a *VERY* large number of different bit strings.
- Not being able to reproduce the exact bit string is lossy compression when viewed from the bit viewpoint, but it can be lossless from the knowledge viewpoint.
- Therefore, reproducing the bit string is an additional requirement above and beyond the stated goal.
- I strongly believe that this additional requirement will necessitate a *VERY* large amount of additional work not necessary for the stated goal.
- In addition, by information theory, reproducing the exact bit string will require additional information beyond the knowledge contained in it (since numerous different strings can encode the same knowledge).
- Assuming optimal compression, also by information theory, this additional information will add to the compressed size (i.e. lead to a less optimal result); a back-of-the-envelope sketch follows below.
So the question is: given that bit-level reproduction is harder, is not necessary for knowledge compression/intelligence, and doesn't allow the same degree of compression, why make life tougher when it isn't necessary for your stated purposes and makes your results (i.e. compression) worse?
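A back-of-the-envelope sketch of that information-theoretic point (all numbers invented): if k distinct strings can express one piece of knowledge, lossless reproduction must spend about log2(k) extra bits per piece just to record which rendering the source happened to use.

# Overhead a lossless scheme pays to pin down one of k equivalent
# renderings per idea. All numbers are invented for illustration.
import math

k = 1_000_000                 # hypothetical renderings per idea
ideas = 100_000               # hypothetical ideas in the corpus
extra_bits = math.log2(k)     # ~19.9 bits per idea
print(f"{ideas * extra_bits / 8 / 1e6:.2f} MB beyond the knowledge itself")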
----- Original Message -----
Sent: Tuesday, August 15, 2006 12:55 AM
Subject: Re: Sampo: [agi] Marcus Hutter's lossless compression of human knowledge prize
Where will the knowledge to compress text come from? There are 3 possibilities:
1. externally supplied, like the lexical models (dictionaries) for paq8h and WinRK.
2. learned from the input in a separate pass, like xml-wrt|ppmonstr.
3. learned online in one pass, like paq8f and slim.

These all have the same effect on compressed size. In the first case, you increase the size of the decompressor. In the second, you have to append the model you learned from the first pass to the compressed file so it is available to the decompressor. In the third case, compression is poor at the beginning. From the viewpoint of information theory, there is no difference among these three approaches. The penalty is the same.
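Case 1 can be seen in miniature with Python's zlib, which really does support preset dictionaries; the sample text and dictionary below are invented. The payload typically shrinks, but honest accounting charges the dictionary's size to the decompressor, so the penalty moves rather than disappears.

# Externally supplied dictionary (case 1) via zlib's preset-dictionary
# feature. Sample text and dictionary are invented.
import zlib

dictionary = b"the star and the moon appear together at night"
text = b"the moon and the star appear together at night, bright at night"

plain = zlib.compress(text)

co = zlib.compressobj(zdict=dictionary)
with_dict = co.compress(text) + co.flush()

do = zlib.decompressobj(zdict=dictionary)
assert do.decompress(with_dict) == text   # decoder needs the dictionary too

# Compare payload alone vs. payload plus the dictionary it depends on.
print(len(plain), len(with_dict), len(with_dict) + len(dictionary))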
To improve compression further, you will need to model semantics and/or syntax. No compressor currently does this. I think the reason is that it is not worthwhile unless you have hundreds of megabytes of natural language text. In fact, only the top few compressors even have lexical models. All the rest are byte-oriented n-gram models. A semantic model would know which words are related, like "star" and "moon". It would learn this from their tendency to appear together. You can build a dictionary of such knowledge from the data set itself, or you can build it some other way (such as WordNet) and include it in the decompressor. If you learn it from the input, you could do it in a separate pass (like LSA) or in one pass (maybe an equivalent neural network) so that you build the model as you compress.
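Here is a minimal sketch of the co-occurrence idea, using raw window counts rather than LSA; the two corpus lines are invented.

# Learn that "star" and "moon" are related from their tendency to
# appear together within a small window. The corpus is invented.
from collections import Counter

corpus = ("the star and the moon appear together at night "
          "the moon and the star fade at dawn").split()

WINDOW = 4
cooc = Counter()
for i in range(len(corpus)):
    for j in range(i + 1, min(i + WINDOW, len(corpus))):
        cooc[frozenset((corpus[i], corpus[j]))] += 1

# Frequent pairs are candidates for a shared semantic class, which a
# compressor could exploit when predicting the next word.
print(cooc[frozenset(("star", "moon"))])   # 2 in this toy corpus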
To learn syntax, you can cluster words by the similarity of their immediate contexts. These clusters correspond to parts of speech. For instance, "the X is" tells you that X is a noun. You can model simple grammars as n-grams over these classifications, such as (Art Noun Verb). Again, you can use any of the 3 approaches.
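And a correspondingly small sketch of context clustering; the sentences are invented, and a real system would use statistical similarity of context distributions rather than exact set equality.

# Words that share immediate contexts ("the X is") fall into
# part-of-speech-like classes. The sentences are invented.
from collections import defaultdict

sentences = ["the cat is here", "the dog is here",
             "a cat was here", "a dog was here"]

contexts = defaultdict(set)
for s in sentences:
    w = s.split()
    for i in range(1, len(w) - 1):
        contexts[w[i]].add((w[i - 1], w[i + 1]))   # (left, right) context

# "cat" and "dog" get identical context sets -> same class (nouns).
print(contexts["cat"] == contexts["dog"])          # True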
Learning semantics and syntax is a hard problem, but I think you can see that it can be done with statistical modeling. The training data you need is in the input itself. I don't see any point in this debate over lossless vs. lossy compression. You have to solve the language learning problem in either case to improve compression. I think it will be more productive to discuss how this can be done.
-- Matt Mahoney, [EMAIL PROTECTED]