Thanks Steve!
(Anybody seen my Excedrin?) :-(

Don

> -----Original Message-----
> From: Steve Jolly [mailto:[EMAIL PROTECTED]
> Sent: Saturday, November 13, 2004 4:07 PM
> To: [EMAIL PROTECTED]
> Subject: JPEG Compression Made Slightly-Less-Complicated (was Re:
> Reducing File Size with Photoshop)
>
>
> William Robb wrote:
> > This should be entertaining.
> > Please, elucidate.
>
> OK. This is going to be bloody impossible to do entirely in text
> without attachments, but I'll give it a try. :-)
>
> The first thing that JPEG compression does is divide your image up into
> 8-pixel by 8-pixel blocks. These blocks are then compressed separately.
> This is the reason that highly-compressed JPEGs look blocky. :-)
>
> You're going to have to think like a mathematician now. What you think
> of as an 8-pixel by 8-pixel crop of your photo, a mathematician thinks
> of as a "function". They would write something like B(x,y) - brightness
> as a function of position in two dimensions. Your 8x8 block contains
> the values of B(x,y) at 64 points in the image.
>
> The second thing that JPEG compression does is, for each block in the
> image, take the data and run it through a "Discrete Cosine
> Transformation", or DCT. This is just a mathematical equation - a "black
> box" that takes in a function B(x,y) and outputs another function
> A(u,v). Now, what are A, u and v, I hear you cry? That's where it
> starts getting tricky. A(u,v) is a new 8x8 block that contains *spatial
> frequencies* of the brightness information in the original 8x8 block. I
> reckon that needs a bit more explanation.
>
> Forget the two-dimensions aspect for a minute; think about a single 8x1
> pixel crop of your image. If I draw a graph that shows their
> brightnesses, it might look something like this:
>
> [ASCII bar graph, alignment not preserved: brightness B on the vertical
> axis against pixel number x (1 to 8) on the horizontal axis]
>
> Now, that looks a bit like a sine wave to me.
> If I draw a sine wave with a suitable frequency, I could get something
> that looks like this:
>
> [ASCII graph, alignment not preserved: a sine curve drawn in asterisks
> on the same axes, B against pixel number x (1 to 8)]
>
> Now that's not quite the same shape as the original data, but it's a
> start. If we picked a second sine wave with a different frequency and
> added it to the first one, we could get closer. If we added a third
> one, we'd be closer still, and so on. As it happens, there's a quirk of
> maths that says that if we use eight very specific sine-like waves
> (cosines, in fact) that are the same every time, but with different
> amplitudes, and add them, we can get *exactly* the same shape as we
> started out with again. The formula that says what the amplitudes
> should be is the DCT.
>
> So, still thinking in *one* dimension, a DCT takes B(x) (the brightness
> of each of the eight positions represented by x) and outputs A(u), where
> A is the amplitude of each of the eight waves represented by u. I hope
> it's not too hard to imagine that a *two* dimensional DCT takes B(x,y)
> and outputs A(u,v). Because if it is, I've lost you. :-)
>
> Right.
>
> So far, we haven't done any lossy compression. If we take A(u,v) and
> add all those waves together, we get B(x,y) back *exactly* as it
> was originally. The lossy bit happens next, in a process called
> "quantisation". You see, the values of A aren't whole numbers (0,1,2,3
> etc) like the original brightness values were; they're "real" numbers -
> they can take *any* value, positive or negative, up to some maximum
> size that you don't need to know about. Because they can take an
> infinite number of values, they'd need an infinite amount of disk space
> to store them in, which isn't much good for a compression scheme. So,
> what we do is we take each of those numbers, and we work out what the
> closest we can get to it using a *fixed* number of bits is.
> And we don't use the same number of bits for each value of u and v; we
> take advantage of the fact that the human eye is less sensitive to
> certain spatial frequencies under certain circumstances to "weight" the
> process - frequencies that the eye sees better get encoded more
> accurately, and vice versa. You can vary the overall amount of
> compression by varying the total number of bits used to compress each
> block.
>
> And that's it! You can take all the bits that represent all the blocks,
> save them to a file and you have a new representation of the image; one
> that takes a bit of work to decode again, but that can take up much less
> storage space than the original while being perceptually identical.
>
> Well, I hope that made *some* kind of sense... you may have noticed that
> I only mentioned stuff relevant to greyscale images. Colour ones are
> more complicated (but use the same principles).
>
> S
>
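
For anyone who wants to poke at the one-dimensional case described above,
here is a minimal Python sketch. It is not from Steve's message and it is
not how a real JPEG codec is written. The brightness values in `row` and
the step sizes in `steps` are made up for illustration, and the function
names are hypothetical. It uses the usual orthonormal form of the DCT,
where the eight waves are fixed cosines and only their amplitudes are
stored, and it stands in for "a fixed number of bits" by rounding each
amplitude to a whole number of steps, with coarser steps for the higher
frequencies.

```python
import math

N = 8  # one 8x1 strip, as in the single-row example


def _scale(u):
    # Orthonormal scaling, so the inverse uses the same eight waves.
    return math.sqrt(1 / N) if u == 0 else math.sqrt(2 / N)


def dct_1d(b):
    """Forward DCT: brightness B(x) -> amplitudes A(u) of eight cosines."""
    return [
        _scale(u) * sum(b[x] * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                        for x in range(N))
        for u in range(N)
    ]


def idct_1d(a):
    """Inverse DCT: add the eight scaled cosine waves back together."""
    return [
        sum(_scale(u) * a[u] * math.cos((2 * x + 1) * u * math.pi / (2 * N))
            for u in range(N))
        for x in range(N)
    ]


def quantise(a, steps):
    """Round each amplitude to a whole number of steps (the lossy part)."""
    return [round(a_u / s) for a_u, s in zip(a, steps)]


def dequantise(q, steps):
    """Turn the stored whole numbers back into approximate amplitudes."""
    return [q_u * s for q_u, s in zip(q, steps)]


# Made-up brightness values (0-255) for one 8x1 strip.
row = [52, 80, 120, 150, 140, 100, 70, 60]

amps = dct_1d(row)
restored = idct_1d(amps)  # same as row, up to floating-point error

# Made-up step sizes, coarser at higher frequencies. Not a real JPEG table.
steps = [4, 6, 8, 12, 16, 24, 32, 48]
lossy = idct_1d(dequantise(quantise(amps, steps), steps))

print("original:", row)
print("restored:", [round(v) for v in restored])
print("lossy:   ", [round(v) for v in lossy])
```

Making the numbers in `steps` bigger throws away more detail and leaves
smaller whole numbers to store; making them smaller does the opposite. The
two-dimensional version applies the same transform along the rows and then
along the columns of each 8x8 block.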

