Let me state one more time why a lossless model has more knowledge. If x
and x' have the same meaning to a lossy compressor (they compress to
identical codes), then the lossy model only knows p(x)+p(x'). A lossless
model also knows p(x) and p(x'). You can argue that if x and x' are not
distinguishable then this extra knowledge is not important. But all text
strings are distinguishable to humans.
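The arithmetic behind that merge can be made concrete. Below is a toy sketch (the strings and probabilities are invented for illustration): a lossy model that maps x and x' to one code can only assign them the pooled probability p(x)+p(x'), while a lossless model keeps the two probabilities separate.

```python
import math

# Invented toy distribution over three strings.
p = {"the cat sat": 0.7, "the cat sate": 0.1, "a dog ran": 0.2}

# Lossless model: each string gets its own code of length -log2 p(x).
lossless_bits = {x: -math.log2(px) for x, px in p.items()}

# Lossy model: treat the first two strings as "the same meaning";
# both share one code of length -log2(p(x) + p(x')).
merged = p["the cat sat"] + p["the cat sate"]
lossy_bits = -math.log2(merged)

# The shared code is shorter, but the model can no longer say which
# of the two strings is 7x more likely -- that distinction is gone.
print(lossless_bits["the cat sat"], lossless_bits["the cat sate"], lossy_bits)
```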
There is a difference between information and knowledge. Your argument is
100% correct for information. It is not correct for knowledge. Information
only counts as knowledge if it is *usable*. PKZip has exactly ONE piece of
knowledge --> the exact string that was fed to it. It can't do anything
else with what it has other than reproduce that string.
Also in the opinion of speech recognition researchers studying language
models since the early 1990's.
Duh. If your purpose is to recognize speech then you don't want to lose any
of it. Your stated purpose was different -- thus, it makes sense to have
different judging criteria -- like, maybe, ones that are dictated by your

goals.
Deciding if a lossy decompression is "close enough" is an AI problem, or
it requires subjective judging by humans.
Absolutely not. We've covered this before. You can judge how much
knowledge a file contains by requiring that the decompression program output
it in a standard canonical form. The "smartest" program will probably
output far more knowledge than a team of puny humans could develop in a
large number of man-years (as well as give you some ideas for useful
research projects).
- - - - -
Seriously, dude -- I DO understand your defense of the contest but
insisting on lossless compression has *nothing* to do with KNOWLEDGE
(though, maybe everything to do with judging).
----- Original Message -----
From: "Matt Mahoney" <[EMAIL PROTECTED]>
To: <agi@v2.listbox.com>
Sent: Friday, August 25, 2006 7:54 PM
Subject: Re: [agi] Lossy *&* lossless compression
----- Original Message ----
From: Mark Waser <[EMAIL PROTECTED]>
To: agi@v2.listbox.com
Sent: Friday, August 25, 2006 5:58:02 PM
Subject: Re: [agi] Lossy *&* lossless compression
However, a machine with a lossless model will still outperform one with a
lossy model because the lossless model has more knowledge.
PKZip has a lossless model. Are you claiming that it has more knowledge?
More data/information *might* be arguable but certainly not knowledge --
and PKZip certainly can't use any "knowledge" that you claim it "has".
DEL has a lossy model, and nothing compresses smaller. Is it smarter than
PKZip?
Let me state one more time why a lossless model has more knowledge. If x
and x' have the same meaning to a lossy compressor (they compress to
identical codes), then the lossy model only knows p(x)+p(x'). A lossless
model also knows p(x) and p(x'). You can argue that if x and x' are not
distinguishable then this extra knowledge is not important. But all text
strings are distinguishable to humans.
But let me give an example of what we have already learned from lossless
compression tests.
1. PKZip, bzip2, ppmd, etc. model text at the character (ngram) level.
2. WinRK and paq8h model text at the lexical level using static
dictionaries. They compress better than (1).
3. xml-wrt|ppmonstr and paq8hp1 model text at the lexical level using
dictionaries learned from the input. They compress better than (2).
I think you can see the pattern.
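The pattern in (1)-(3) can be sketched with a toy unigram entropy estimate: coding the same text at the word level exploits more structure than coding it character by character, so the word-level code is shorter. The corpus is invented, and the word-level figure assumes (as the dictionary-based compressors above do) that the vocabulary itself is shared separately.

```python
import math
from collections import Counter

# Invented toy corpus, repeated so counts are stable.
corpus = "the cat sat on the mat the cat ran " * 50

def unigram_bits(symbols):
    """Total bits to code the sequence under its own unigram distribution."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum(c * math.log2(c / total) for c in counts.values())

char_bits = unigram_bits(list(corpus))   # character (ngram-style) model
word_bits = unigram_bits(corpus.split()) # lexical model, dictionary assumed shared

# Word-level coding is much cheaper for the same text.
print(char_bits, word_bits)
```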
There has been research in semantic models using distant bigrams and LSA.
These compress cleaned text (restricted vocabulary, no punctuation) better
than models without these capabilities, as measured by word perplexity.
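Word perplexity, the measure cited above, is just 2 raised to the model's average cross-entropy per word, so a model that codes the text in fewer bits per word has lower perplexity. A minimal sketch (the probabilities fed in are illustrative):

```python
import math

def perplexity(word_log2_probs):
    """word_log2_probs: list of log2 p(word | context) under the model."""
    avg_bits = -sum(word_log2_probs) / len(word_log2_probs)
    return 2.0 ** avg_bits

# A model assigning every word probability 1/8 costs 3 bits/word:
print(perplexity([math.log2(1 / 8)] * 10))  # 8.0
```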
Currently there are no general purpose compressors that model syntax or
semantics, probably because such models are only useful on large text
corpora, not the kind of files people normally compress. I think that
will change if there is a financial incentive.
This does not change the fact that lossless compression is the right way
to evaluate a language model.
. . . . in *your* opinion. I might argue that it is the *easiest* way to
evaluate a language model but certainly NOT the best -- and I would then
argue, therefore, not the "right" way either.
Also in the opinion of speech recognition researchers studying language
models since the early 1990's.
A lossy model cannot be evaluated objectively
Bullsh*t. I've given you several examples of how. You've discarded them
because you felt that they were "too difficult" and/or you didn't
understand them.
Deciding if a lossy decompression is "close enough" is an AI problem, or
it requires subjective judging by humans. Look at benchmarks for video or
audio codecs. Which sounds better, AAC or Ogg?
-- Matt Mahoney, [EMAIL PROTECTED]
-------
To unsubscribe, change your address, or temporarily deactivate your
subscription,
please go to http://v2.listbox.com/member/[EMAIL PROTECTED]