On 09/05/07, Ted Mittelstaedt <[EMAIL PROTECTED]> wrote:
> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] Behalf Of Gary Kline
> Sent: Tuesday, May 08, 2007 7:19 PM
> To: [EMAIL PROTECTED]
> Cc: Gary Kline; FreeBSD Mailing List
> Subject: Re: Another slightly OT q...
>
> > So it *was* a hoax? Rats. Some weeks ago on Public
> > Broadcasting, a few sentences were spoken on the potential of
> > fractal geometry to achieve [I'm guessing] data-compression on
> > the order of what Sloot was claiming. So far, no one has figured
> > it out. It may be a dream... .
>
> There's some cool math out there that explains all of this but I
> never liked math, but it isn't necessary to know the math to
> understand the issue. Just consider the problem for a while and you
> will realize that the compression ratio of a specific data stream
> varies dependent on the amount of repetition in the input datastream.
>
> A perfectly unrandom datastream, like a constant series of logical
> 1's, carries no information, but has a compression ratio that is
> infinite. A perfectly random datastream, on the other hand, also
> carries no information, but has a compression ratio that is zero.
>
> I believe that a datastream that is 50% of the way between either
> extreme carries the most information, and I believe your typical
> datastream is much closer to the perfectly unrandom side than the
> perfectly random side, compression is merely the process of pushing
> the randomness of the stream closer to the random side.
Actually, the more information (as such) a data stream carries, the closer it is to perfectly random. The relationship might be asymptotic, but I am no maths major.
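For anyone who wants to poke at this, here is a rough sketch in Python that estimates Shannon entropy from a byte histogram. The function name and the three sample streams are made up for illustration; nothing here comes from the thread itself. It is only an order-0 estimate (it counts how often each byte value occurs and ignores patterns across bytes), so treat the numbers as indicative rather than exact.

```python
import math
import os
from collections import Counter


def entropy_bits_per_byte(data: bytes) -> float:
    """Empirical (order-0) Shannon entropy of a byte string, in bits per byte."""
    if not data:
        return 0.0
    n = len(data)
    counts = Counter(data)
    # Sum of p * log2(1/p) over every byte value that actually occurs.
    return sum((c / n) * math.log2(n / c) for c in counts.values())


# Hypothetical sample streams, chosen only to illustrate the two extremes.
constant = b"\xff" * 65536      # a constant series of logical 1's
text_like = b"the quick brown fox jumps over the lazy dog " * 1500
random_bytes = os.urandom(65536)  # as close to "perfectly random" as the OS offers

for name, blob in [("constant", constant),
                   ("text-like", text_like),
                   ("random", random_bytes)]:
    print(f"{name:10s} {entropy_bits_per_byte(blob):.3f} bits/byte")
```

The constant stream should come out at zero bits per byte and the random one close to the maximum of eight, with the text somewhere in between. By this measure the random stream is the one carrying the most information per byte, not the least.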
> Thus, if the input datastream is very close to the perfectly unrandom
> side - meaning it has a very high amount of repetition in it, you can
> get some pretty spectacular compression ratios. But as you move
> closer to unrandom, you carry less data. So, the better applications
> emit datastreams that are less unrandom, therefore compression does
> not work as well on them.
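The link between repetition and compression ratio is easy to check with the standard-library bz2 module. This is only a sketch on synthetic data: the `ratio` helper and the sample buffers are invented for illustration, and the exact figures will vary with the compressor and the input.

```python
import bz2
import os


def ratio(data: bytes) -> float:
    """Original size divided by bz2-compressed size (higher = more compressible)."""
    return len(data) / len(bz2.compress(data))


size = 1_000_000  # arbitrary sample size, assumed for the demo

repetitive = b"\x01" * size  # one byte value repeated: about as unrandom as it gets
line = b"All work and no play makes Jack a dull boy.\n"
repeated_text = (line * (size // len(line) + 1))[:size]  # same sentence over and over
noise = os.urandom(size)  # effectively incompressible

for name, blob in [("repetitive", repetitive),
                   ("repeated text", repeated_text),
                   ("noise", noise)]:
    print(f"{name:14s} {ratio(blob):12.2f}:1")
```

The repetitive inputs should show very large ratios, while the random input should land just under 1:1, meaning the "compressed" output is slightly bigger than the input because of the container overhead.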
I suppose this leads to the discussion about what "data" and "information" really are. Imagine a can. The can is data. Imagine the can is full of worms.
> This of course is completely ignoring the other data issue, is the
> application data efficient to begin with? For example, you can
> transfer about a page of information in ASCII that consumes about 1K
> of data, that same page of information in a MS Word file consumes a
> hundred times that amount of space - Word is therefore extremely
> inefficient with data.
In this case, since Word "has to" replace typesetting, layout, and formatting software in addition to being a word processor, the header and meta information tend to bloat the files quite a lot.

Every few years someone comes along making mad claims about some new buzzword-enhanced compression technology. Obviously, if there is ever a radical leap forward in that area, the theory will have to follow, since modern theory cannot accommodate (lossless) compression past the point of randomness (generally less than 16:1, even for Danielle Steel). mp3, avi, real media, mpeg, et al. are a different story entirely, since they are lossy and optimised for their respective information.

-rw-r--r-- 1 1705 1705 7826420 May 9 10:58 ssion_i_really_fuckin_care_about_you.rm
-rw-r--r-- 1 1705 1705 7791691 May 9 10:58 ssion_i_really_fuckin_care_about_you.rm.bz2

In this case the file is only very slightly compressible (bzip2 saves about 0.4%); with some data the resulting file will be slightly larger. Yet the raw datastream would probably have been many tens, if not several hundreds, of megabytes. (It looks like it was filmed from a cameraphone, though most likely an 8mm digicam; these, I believe, compress on the fly, so the raw datastream never touches tape.)

Remember life before the tweel?