On 09/05/07, Ted Mittelstaedt <[EMAIL PROTECTED]> wrote:

> -----Original Message-----
> [mailto:[EMAIL PROTECTED] Behalf Of Gary Kline
> Sent: Tuesday, May 08, 2007 7:19 PM
> Cc: Gary Kline; FreeBSD Mailing List
> Subject: Re: Another slightly OT q...
>       So it *was* a hoax?  Rats.  Some weeks ago on Public
>       Broadcasting, a few sentences were spoken on the potential of
>       fractal geometry to achieve [I'm guessing] data-compression on
>       the order of what Sloot was claiming.  So far, no one has figured
>       it out.  It may be a dream... .

There's some cool math out there that explains all of this, but I never liked
math, and it isn't necessary to know the math to understand the issue.  Just
consider the problem for a while and you will realize that the compression
ratio of a specific data stream depends on the amount of repetition in
the input datastream.  A perfectly unrandom datastream, like a constant
series of logical 1's, carries no information, but has a compression ratio
that is infinite.  A perfectly random datastream, on the other hand,
also carries no information, but has a compression ratio that is zero.
I believe that a datastream that is 50% of the way between the two extremes
carries the most information, and I believe your typical datastream is much
closer to the perfectly unrandom side than the perfectly random side.
Compression is merely the process of pushing the stream closer to the
random side.

Actually, the more information (as such) the closer
the data stream is to perfectly random.  The relationship
might be asymptotic, but I am no maths major.
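
One rough way to put a number on "information" here is Shannon entropy over byte frequencies. The sketch below is only an illustration under that assumption: the function name, the sample inputs, and the seeded pseudo-random generator are all made up for the example, and byte-frequency entropy ignores ordering, so it is a crude measure at best.

```python
import math
import random
from collections import Counter


def entropy_bits_per_byte(data: bytes) -> float:
    """Shannon entropy of the byte-frequency distribution, in bits per byte."""
    counts = Counter(data)
    total = len(data)
    # Sum of p * log2(1/p) over every byte value that actually occurs.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


n = 65536
constant = b"\x01" * n                                # "perfectly unrandom"
text = b"the quick brown fox jumps over the lazy dog. " * 1000
rng = random.Random(0)                                # seeded, so repeatable
noise = bytes(rng.getrandbits(8) for _ in range(n))   # stand-in for "random"

for name, data in (("constant", constant), ("text", text), ("noise", noise)):
    print(f"{name:8s} {entropy_bits_per_byte(data):.3f} bits/byte")
```

By this measure the constant stream scores zero and the pseudo-random stream lands close to the 8 bits/byte ceiling, with ordinary text somewhere in between.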

Thus, if the input datastream is very close to the perfectly unrandom side,
meaning it has a very high amount of repetition in it, you can get some
pretty spectacular compression ratios.  But as you move closer to unrandom,
you carry less data.  So the better applications emit datastreams that
are less unrandom, and compression therefore does not work as well on them.
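
The effect is easy to try with any general-purpose compressor. This is a minimal sketch using Python's zlib module; the ratio is taken to mean original size divided by compressed size, which is an assumption of the example, as are the helper name and the sample inputs.

```python
import random
import zlib


def compression_ratio(data: bytes) -> float:
    """Original size divided by zlib-compressed size (bigger means better)."""
    return len(data) / len(zlib.compress(data, 9))


n = 100_000
constant = b"\x01" * n                                # maximal repetition
text = b"the quick brown fox jumps over the lazy dog. " * (n // 45)
rng = random.Random(0)                                # seeded, so repeatable
noise = bytes(rng.getrandbits(8) for _ in range(n))   # no repetition to exploit

for name, data in (("constant", constant), ("text", text), ("noise", noise)):
    print(f"{name:8s} ratio {compression_ratio(data):10.2f}")
```

With the ratio defined this way, the constant stream compresses by a very large but finite factor, while the pseudo-random stream stays at about 1:1 and actually comes out slightly larger than it went in.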

I suppose this leads to the discussion about what
"data" and "information" really are.  Imagine a can.
The can is data.  Imagine the can is full of worms.

This of course completely ignores the other data issue: is the
data efficient to begin with?  For example, you can transfer about a page of
information in ASCII that consumes about 1K of data; that same page of
information in an MS Word file consumes a hundred times that amount of
space.  Word is therefore extremely inefficient with data.

In this case, since Word "has to" replace typesetting,
layout, and formatting software in addition to being a
word processor, the header and meta information tend
to bloat the files quite a lot.

Every few years someone comes along who makes
some mad claims about some new buzzword-enhanced
compression technology.  Obviously, if there is ever a
radical leap forward in that area the theory will have to
follow, since modern theory cannot accommodate (lossless)
compression past the point of randomness (generally less
than 16:1 even for Danielle Steel).  mp3, avi, real media,
mpeg, et al. are a different story entirely, since they are
lossy and optimised for their respective information.

-rw-r--r--  1 1705  1705  7826420 May  9 10:58
-rw-r--r--  1 1705  1705  7791691 May  9 10:58

In this case, the file is only very slightly compressible; with some
data your resulting file will even be slightly larger.  Yet the raw
datastream (it looks like it was filmed on a cameraphone here, though
most likely on an 8mm digicam, which I believe compresses on the fly
so that the raw datastream never touches tape) would probably have
been many tens, if not several hundreds, of megabytes.
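
For the two sizes listed above, the arithmetic works out as follows. The listing carries no file names, so this sketch assumes the first line is the original and the second is the compressed copy; the variable names are only for illustration.

```python
# Sizes in bytes, copied from the listing above.
# Assumption: the first line is the original file, the second the compressed one.
original_size = 7826420
compressed_size = 7791691

saved = original_size - compressed_size       # bytes removed by compression
ratio = original_size / compressed_size       # original : compressed
percent = 100 * saved / original_size         # saving as a share of the original

print(f"saved {saved} bytes, ratio {ratio:.4f}:1, {percent:.2f}% smaller")
```

Under that assumption the saving is 34729 bytes, well under half a percent of the file.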

Remember life before the tweel?

freebsd-questions@freebsd.org mailing list
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
