Let me state one more time why a lossless model has more knowledge. If x and x' have the same meaning to a lossy compressor (they compress to identical codes), then the lossy model only knows p(x)+p(x'). A lossless model also knows p(x) and p(x'). You can argue that if x and x' are not distinguishable then this extra knowledge is not important. But all text strings are distinguishable to humans.

There is a difference between information and knowledge. Your argument is 100% correct for information. It is not correct for knowledge. Information only counts as knowledge if it is *usable*. PKZip has exactly ONE piece of knowledge --> the exact string that was fed to it. It can't do anything else with what it has other than reproduce that string.

Also in the opinion of speech recognition researchers studying language models since the early 1990's.

Duh. If your purpose is to recognize speech, then you don't want to lose any of it. Your stated purpose was different -- thus, it makes sense to have different judging criteria -- like, maybe, ones that are dictated by your goals.

Deciding if a lossy decompression is "close enough" is an AI problem, or it requires subjective judging by humans.

Absolutely not. We've covered this before. You can judge how much knowledge a file contains by requiring that the decompression program output it in a standard canonical form. The "smartest" program will probably output far more knowledge than a team of puny humans could develop in a large number of man-years (as well as give you some ideas for useful research projects).

- - - - -

Seriously, dude -- I DO understand your defense of the contest, but insisting on lossless compression has *nothing* to do with KNOWLEDGE (though, maybe everything to do with judging).

----- Original Message ----- From: "Matt Mahoney" <[EMAIL PROTECTED]>
To: <agi@v2.listbox.com>
Sent: Friday, August 25, 2006 7:54 PM
Subject: Re: [agi] Lossy *&* lossless compression


----- Original Message ----
From: Mark Waser <[EMAIL PROTECTED]>
To: agi@v2.listbox.com
Sent: Friday, August 25, 2006 5:58:02 PM
Subject: Re: [agi] Lossy *&* lossless compression

However, a machine with a lossless model will still outperform one with a
lossy model because the lossless model has more knowledge.

PKZip has a lossless model.  Are you claiming that it has more knowledge?
More data/information *might* be arguable, but certainly not knowledge -- and
PKZip certainly can't use any "knowledge" that you claim it "has".

DEL has a lossy model, and nothing compresses smaller. Is it smarter than PKZip?

Let me state one more time why a lossless model has more knowledge. If x and x' have the same meaning to a lossy compressor (they compress to identical codes), then the lossy model only knows p(x)+p(x'). A lossless model also knows p(x) and p(x'). You can argue that if x and x' are not distinguishable then this extra knowledge is not important. But all text strings are distinguishable to humans.
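To make the p(x) vs. p(x)+p(x') point concrete, here is a minimal sketch; the two strings and their probabilities are made up purely for illustration:

    import math

    # Two strings a lossy coder would treat as "the same meaning";
    # the probabilities are invented for the example.
    p = {"the cat sat": 0.03, "the cat sat.": 0.01}

    # Lossless model: each string keeps its own probability and code length.
    for x, px in p.items():
        print(f"lossless: p({x!r}) = {px}, code length = {-math.log2(px):.2f} bits")

    # Lossy model: both strings map to one code, so only the sum is recoverable.
    p_merged = sum(p.values())
    print(f"lossy:    p(code) = {p_merged}, code length = {-math.log2(p_merged):.2f} bits")

The lossy code comes out a little shorter, but the model can no longer say which of the two strings it saw, or that one was three times as likely as the other -- that is exactly the extra knowledge the lossless model keeps.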

But let me give an example of what we have already learned from lossless compression tests.

1. PKZip, bzip2, ppmd, etc. model text at the character (ngram) level.
2. WinRK and paq8h model text at the lexical level using static dictionaries. They compress better than (1).
3. xml-wrt|ppmonstr and paq8hp1 model text at the lexical level using dictionaries learned from the input. They compress better than (2).

I think you can see the pattern.
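A rough sketch of the jump from (1) to (3): learn a word dictionary from the input and substitute short tokens before handing the bytes to an ordinary compressor. The file name is hypothetical, and the escaping needed for a truly lossless round trip is omitted -- this only illustrates the modeling level, not how xml-wrt or paq8hp1 actually work:

    import zlib
    from collections import Counter

    # Hypothetical input file -- substitute any large plain-text sample.
    text = open("sample.txt", "rb").read()

    # Character/ngram level: hand the raw bytes straight to a general compressor.
    raw = zlib.compress(text, 9)

    # Lexical level with a dictionary learned from the input: map the most
    # frequent words to short two-byte tokens before compressing.
    words = text.split()
    common = [w for w, _ in Counter(words).most_common(250)]
    codes = {w: bytes([0xFF, i]) for i, w in enumerate(common)}
    tokenized = b" ".join(codes.get(w, w) for w in words)
    lexical = zlib.compress(tokenized, 9)

    # (Escaping of 0xFF bytes and exact whitespace are ignored here, so this
    # version is not strictly invertible -- it only shows the idea.)
    print(len(text), len(raw), len(lexical))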

There has been research in semantic models using distant bigrams and LSA (latent semantic analysis). These compress cleaned text (restricted vocabulary, no punctuation) better than models without these capabilities, as measured by word perplexity. Currently there are no general-purpose compressors that model syntax or semantics, probably because such models are only useful on large text corpora, not the kind of files people normally compress. I think that will change if there is a financial incentive.
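For what it's worth, word perplexity and lossless code length are two views of the same number: a model's ideal compressed size is the sum of -log2 p(word) over the corpus, and perplexity is 2 raised to the average bits per word. A toy illustration -- the corpus and the unigram model here are just stand-ins for any word-level language model:

    import math
    from collections import Counter

    # Toy corpus; any tokenized word list would do.
    corpus = "the cat sat on the mat the dog sat on the log".split()

    # Unigram model estimated from the corpus itself (a stand-in for any
    # word-level language model).
    counts = Counter(corpus)
    n = len(corpus)
    prob = {w: c / n for w, c in counts.items()}

    # Ideal lossless code length under the model: sum of -log2 p(w) bits.
    bits = sum(-math.log2(prob[w]) for w in corpus)
    bits_per_word = bits / n

    # Perplexity is 2 ** (bits per word), so compressed size and perplexity
    # rank language models the same way.
    print(f"{bits:.1f} bits total, {bits_per_word:.2f} bits/word, "
          f"perplexity {2 ** bits_per_word:.2f}")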

This does not change the fact that lossless compression is the right way
to evaluate a language model.

. . . . in *your* opinion.  I might argue that it is the *easiest* way to
evaluate a language model but certainly NOT the best -- and I would then
argue, therefore, not the "right" way either.

Also in the opinion of speech recognition researchers studying language models since the early 1990's.

A lossy model cannot be evaluated objectively

Bullsh*t.  I've given you several examples of how.  You've discarded them
because you felt that they were "too difficult" and/or you didn't understand
them.

Deciding if a lossy decompression is "close enough" is an AI problem, or it requires subjective judging by humans. Look at benchmarks for video or audio codecs. Which sounds better, AAC or Ogg?

-- Matt Mahoney, [EMAIL PROTECTED]




