[videoblogging] Secrets of Video Compression

Joshua Czikowski Mon, 25 Jul 2005 22:06:14 -0700

Secrets of Video Compression

Well, I don't know how secret they are, but there are some techniques you can use to get the most out of your web video. A lot of it just has to do with understanding how data (read: movies) is transmitted over the net. Some of it goes into how we poor humans perceive things like color -- and the differences between TV screens and a computer screen.

Standard D1 video (like what you see on TV) is 640 pixels wide by 480 tall and is made up of 30 frames of two fields per second -- or 60 interlaced fields per second of transmission. That's 27 megabytes per second of data to process. Most computers can't even play that straight from the hard drive; they just don't have the processing power to handle all that data that fast. So to make it workable and playable we use a codec -- short for compressor/decompressor -- to squeeze that data down to a manageable level.
DV, or the Digital Video codec, is probably the video codec you're most familiar with. It's the near-lossless compression technology that's used in most digital video cameras. It gives us about 5:1 compression, so our original 27 MB of data drops to roughly a fifth of its size. Even so, DV source material has a data rate of about 3.5 megabytes per second (nearly 30,000 kilobits per second). Getting that down to a size that can be streamed over the Internet to viewers with broadband connections, or even modems, requires a combination of intelligent preprocessing and powerful compression technologies.
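
To see where those numbers come from, here is a rough back-of-the-envelope sketch in Python. It assumes 24-bit color (3 bytes per pixel) and 1 MB = 1,000,000 bytes; neither is stated above, so treat them as my assumptions rather than part of any standard.

```python
def raw_data_rate(width, height, fps, bytes_per_pixel=3):
    """Bytes per second for uncompressed video (assumes 3 bytes per pixel)."""
    return width * height * bytes_per_pixel * fps

raw = raw_data_rate(640, 480, 30)   # 27,648,000 bytes per second
raw_mb = raw / 1_000_000            # about 27.6 MB per second

dv_mb = 3.5                         # DV figure quoted above, in MB per second
dv_kbps = dv_mb * 1000 * 8          # megabytes -> kilobytes -> kilobits

print(f"raw: {raw_mb:.1f} MB/s, DV: {dv_kbps:,.0f} kbps")
```

Thirty full 640x480 frames a second at 3 bytes per pixel lands right around the 27 MB figure, and 3.5 MB per second works out to 28,000 kilobits per second.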

Basically there are two types of video compression: spatial and temporal. Spatial compression is compression within a single frame of video. Usually spatial compression techniques are based on the idea that computers are stupid. The easiest thing to compress is a solid color square. That's because the computer starts out by recording the square's color by saying, "the pixel with coordinates x=0 and y=0 is black" and then does that for every individual pixel. It's a little like marking your calendar Monday 12:30 Lunch, Tuesday 12:30 Lunch, Wednesday 12:30 Lunch, etc. A "lossless" compressor like LZW (a common compressor for still images in TIFF format) rewrites the data to simply say, "Lunch 12:30 Monday-Friday", or in this case "all the pixels from 0,0 to 100,100 are black". It eliminates all those extra lines of data, creating a smaller file without sacrificing any of the original information. Lossless compressors can usually compress down to about 50% of the original file size, sometimes less, but almost never more.
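
The "Lunch 12:30 Monday-Friday" idea can be sketched with run-length encoding. To be clear, this is not LZW (LZW builds a dictionary of repeated patterns and is more involved); it is a much simpler lossless scheme that shows the same principle of replacing repetition with a short description. The function names are just for illustration.

```python
def rle_encode(pixels):
    """Collapse runs of identical values into [value, count] pairs."""
    runs = []
    for p in pixels:
        if runs and runs[-1][0] == p:
            runs[-1][1] += 1        # same as the previous pixel: extend the run
        else:
            runs.append([p, 1])     # new value: start a new run
    return runs

def rle_decode(runs):
    """Expand [value, count] pairs back into the original pixel list."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out

row = ["black"] * 100 + ["white"] * 3
encoded = rle_encode(row)           # two runs instead of 103 entries
assert rle_decode(encoded) == row   # lossless: we get everything back
```

A solid row of 100 black pixels becomes a single pair, which is why flat color compresses so well. A noisy row where every pixel differs would not shrink at all.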

JPEG is also a spatial compression technology, but it's what is called "lossy". It works by throwing out some of the information stored in the image. Ideally it does this intelligently by discarding first that information which is less useful. So at lower levels of compression it throws out things like colors that are outside of the range of human perception. As you apply higher levels of compression it relies more on averaging across pixels to flatten out the color range and create less data to store. That's why it's a bad idea to open and resave JPEG or other lossy compressed files. Every time you do they get recompressed, throwing out more data, till you eventually have crap.

Temporal compression is also lossy. It works by compressing across many frames of video -- compression across time -- and temporally compressed frames of video are called "difference frames" or "delta frames". That's because the codec only records the differences between frames, not all the information in each frame. The best codecs use a combination of both spatial and temporal compression. The MPEG4 codec, for example, sets a keyframe (spatial compression) every so often as a reference point, then uses delta frames (temporal compression) between keyframes to get even more compression. The combination results in higher quality at smaller sizes. With a codec like MPEG4 you may get upwards of 90-95% compression and still have a pretty good looking video.
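
Here is a toy sketch of the keyframe-plus-delta idea. It is not how MPEG4 or any real codec works internally: frames here are just short lists of numbers, the deltas are stored losslessly, and the `keyframe_interval` parameter and function names are made up for illustration. Real codecs add motion estimation and lossy quantization on top.

```python
def encode(frames, keyframe_interval=3):
    """Store a full frame every keyframe_interval frames, deltas otherwise."""
    encoded = []
    for i, frame in enumerate(frames):
        if i % keyframe_interval == 0:
            encoded.append(("key", list(frame)))
        else:
            prev = frames[i - 1]
            # Record only the positions whose value changed since the last frame.
            delta = {pos: new for pos, (old, new) in enumerate(zip(prev, frame))
                     if old != new}
            encoded.append(("delta", delta))
    return encoded

def decode(encoded):
    """Rebuild every frame by applying deltas to the previous frame."""
    frames = []
    for kind, data in encoded:
        if kind == "key":
            frame = list(data)
        else:
            frame = list(frames[-1])
            for pos, value in data.items():
                frame[pos] = value
        frames.append(frame)
    return frames

clip = [[0, 0, 0, 0], [0, 0, 9, 0], [0, 0, 9, 9], [1, 1, 1, 1]]
packed = encode(clip)
assert decode(packed) == clip
```

In this clip the second and third frames each shrink to a single changed value, while the fourth is stored whole because it falls on a keyframe.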

Video codecs are generally broken down into transfer codecs and delivery codecs. Transfer codecs are used to transfer files from one place to another, for example from one editing station to another, or to store files using less drive space. These codecs are generally lossless or minimally lossy and afford minimal compression, but very high quality. Apple's new Pixlet codec was developed at the request of Pixar Studios (the people that did Toy Story and Finding Nemo) to do just that -- move and store files. Delivery codecs are just what they sound like, a way to deliver compressed video, usually on the web or CD. Delivery codecs are generally lossy and sacrifice quality for playability.

In all, we can reduce video sizes in three ways: reduce the data rate (increase the compression), reduce the physical size (number of pixels), or reduce the number of frames per second. Play with any combination of those numbers and you've got a smaller file, more easily playable over the web or from CD-ROM.

Data Rate

A 56 kilobits per second (Kbps) modem transfers data at 56 kilobits per second -- except it doesn't really. It's actually rated at 53 Kbps, and in reality it'll probably get between 4 and 5 kilobytes (KB) of actual data per second. See that -- small "b" bit -- big "B" byte. The files on your hard drive are measured in kilobytes and megabytes. Data throughput is measured in kilobits per second.

The rule of thumb is to divide by eight (there are eight bits to a byte). That'll get you in the ballpark. 53 ÷ 8 = 6.6, a little high, but not bad for government work. Wanna be safe? Cut that back by a third or so. Shoot for four or five and you'll be covered for those nasty dips and spikes that are inevitable in an imperfect system. So really, to get a movie that will play in real time on a 56K modem you're only looking at getting about 4 KB of data per second for BOTH video and audio.
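
The divide-by-eight rule as a tiny Python sketch, in case the bits-versus-bytes business is still fuzzy. The function name is mine; the numbers are the ones quoted above.

```python
BITS_PER_BYTE = 8

def kilobits_to_kilobytes(kilobits_per_second):
    """Convert a line speed in kilobits/s to kilobytes/s."""
    return kilobits_per_second / BITS_PER_BYTE

advertised = kilobits_to_kilobytes(56)   # 7.0 KB per second on paper
rated = kilobits_to_kilobytes(53)        # 6.625 KB per second at the rated speed

print(f"56K modem: {advertised} KB/s advertised, {rated} KB/s rated")
```

Either way, the theoretical number is comfortably above the 4 to 5 KB per second you should actually plan around.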

So your data rate is going to depend on your target audience. In their training materials, Sorenson recommends using the following formula to calculate data rate:

datarate = (width x height x frames per second) ÷ 48000

Most codecs perform well at no less than half the resulting number, and no more than twice that. For example, if you have a 320 x 240 movie at 30fps, you'll get a baseline data rate of 48KB/s. So a pretty static clip, one without a lot of motion, will probably still look okay at 24KB/s (half that), while an action clip with a lot of jerky camera movements and fast cuts will probably need about twice that, or somewhere around 96KB/s. You'll still need to test to get the best results, but that gives you a place to start looking.
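
If you'd rather not do that by hand, the formula and the half-to-double range fit in a few lines of Python. The function names are mine, and the result is in KB per second to match the 320 x 240 example above.

```python
def baseline_datarate(width, height, fps):
    """Starting-point data rate in KB/s: (width x height x fps) / 48000."""
    return width * height * fps / 48000

def datarate_range(width, height, fps):
    """Return (low, baseline, high): half, one times, and twice the baseline."""
    base = baseline_datarate(width, height, fps)
    return base / 2, base, base * 2

low, base, high = datarate_range(320, 240, 30)
print(f"static clip: ~{low} KB/s, baseline: {base} KB/s, action clip: ~{high} KB/s")
```

Treat the output as a place to start testing, not a final setting.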

Another good rule, if you're doing progressive download, is to give your viewers some options. If you've been to the Apple QuickTime movie trailer site, you've probably seen some of the trailers offered in small, medium and large (even full screen) options. For the viewer the choices are small crappy video that starts playing quickly, or big beautiful video that requires a wait to get it started. My advice is to let them decide. You should be aware, though, that according to Apple's numbers something like half of all the movie viewers on their site choose the biggest movie -- regardless of their connection. That means you have people on dial-up waiting hours while they download a 40-50MB movie. The next most popular? The smallest size.
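
Just how long is "hours"? A quick sketch, assuming a 45 MB file (the middle of that 40-50MB range), a real-world dial-up throughput of 4.5 KB per second, and 1 MB = 1000 KB. All three figures are my assumptions for the sake of the arithmetic.

```python
def download_seconds(file_mb, throughput_kb_per_s):
    """Seconds needed to pull file_mb megabytes at throughput_kb_per_s KB/s."""
    return file_mb * 1000 / throughput_kb_per_s

seconds = download_seconds(45, 4.5)
hours = seconds / 3600

print(f"{seconds:.0f} seconds, or about {hours:.1f} hours")
```

That comes to 10,000 seconds, a bit under three hours, and that's assuming the connection never drops.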

Make it Pretty

So we've covered the science, now let's discuss the art. The real trick to making it pretty is not to try to do too much at once. First you clean, then you compress. It's called pre-processing and most compressionists agree it is the secret to getting great web video.

Personally, my favorite codec at this time is still Sorenson Video 3. I'm not a huge fan of MPEG4 video, despite Apple's marketing push. Love the AAC sound compressor, but the video codec just isn't quite there yet. Even so, the following recommendations will work with just about any of the current crop of codecs. As I've explained, most of these codecs use algorithms that try to squeeze the video by saving only the data that is different between one frame and the next. Codecs that work this way are called "interframe" codecs. So the more alike each frame is, the better we can compress. That means high motion shots, fast movements and lots of jerky camera work are harder to compress than static headshots.

Whether you're using Discreet's cleaner, HipFlics, Squeeze, or even just QuickTime Pro, you've got several options for adjusting brightness, contrast, and hue. Depending on the software you've probably got a couple more options -- like white and black restore. The trick is to take your uncompressed video into your favorite program and NOT compress it, at least not on the first pass. Your basic uncompressed video has schmutz -- pros call these "artifacts". So we're going to raise the contrast and brightness to get rid of some or all of that. We're looking to blow out the bright areas and crush the blacks. By eliminating some of the useless detail in those areas, we eliminate some of the differences between frames. Differences that interframe codecs would interpret as data and try to encode, thereby creating artifacts.
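
To show why crushing the blacks and blowing out the whites helps, here is a minimal sketch of a levels adjustment on 8-bit brightness values. The black and white points (16 and 235) and the sample pixel values are hypothetical, picked only for illustration; your software's black and white restore controls will have their own scales.

```python
def apply_levels(value, black_point=16, white_point=235):
    """Clamp darks to 0 and brights to 255, stretching everything in between."""
    if value <= black_point:
        return 0
    if value >= white_point:
        return 255
    return round((value - black_point) * 255 / (white_point - black_point))

# The same six pixels in two consecutive frames: noisy shadows and highlights,
# one steady midtone.
frame_a = [3, 7, 12, 120, 240, 250]
frame_b = [9, 2, 15, 120, 252, 238]

changed_before = sum(a != b for a, b in zip(frame_a, frame_b))

clean_a = [apply_levels(v) for v in frame_a]
clean_b = [apply_levels(v) for v in frame_b]
changed_after = sum(a != b for a, b in zip(clean_a, clean_b))

print(f"pixels that differ: {changed_before} before, {changed_after} after")
```

Before cleaning, five of the six pixels change between frames even though nothing in the picture really moved. After cleaning, none do, so an interframe codec has nothing to waste bits on.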

Clean up the audio by normalizing it, adjusting the volume, and applying whatever filters your software provides. The real key is to do all these things -- clean, crop, adjust, whatever you need to do OTHER than compress -- in your first pass. Save your movie with audio and video compression of "none". You're going to get a darn big file, but it'll be absolutely pristine, so once you start compression your end result will be as high a quality as possible given your other variables.
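
For the curious, "normalizing" usually means peak normalization: find the loudest sample and scale everything so that sample hits a target level. A minimal sketch, assuming samples are floats between -1.0 and 1.0 (your audio tool handles the real file formats, and may use a different method such as loudness-based normalization):

```python
def normalize_peak(samples, target_peak=1.0):
    """Scale samples so the loudest one reaches target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0:
        return list(samples)          # silence (or empty): nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]

quiet = [0.1, -0.25, 0.2]
louder = normalize_peak(quiet)        # loudest sample now sits at full scale
print(max(abs(s) for s in louder))
```

Every sample gets the same gain, so the relative dynamics of the recording stay exactly as they were.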

Certain codecs calculate a certain way, so learn where your specific codec's sweet spot is and exploit it. For example, Sorenson in particular samples in blocks of eight. So cropping to a size divisible by eight is going to give you the best result. And always, always crop. Not only will it get rid of edge noise sometimes picked up and magnified by the compression process, but shaving down the physical dimensions will also save you bandwidth, even if it's just a few pixels.
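
Finding the nearest size divisible by eight is simple arithmetic. This sketch rounds each dimension down and centers the crop; the centering is my choice for illustration, and as a rule you'd still eyeball the framing yourself.

```python
def crop_to_multiple(size, block=8):
    """Largest value not exceeding size that divides evenly by block."""
    return size - (size % block)

def crop_box(width, height, block=8):
    """Return (left, top, new_width, new_height) for a centered crop."""
    new_w = crop_to_multiple(width, block)
    new_h = crop_to_multiple(height, block)
    left = (width - new_w) // 2
    top = (height - new_h) // 2
    return left, top, new_w, new_h

print(crop_box(317, 235))   # an odd-sized clip after trimming edge noise
```

A 317 x 235 frame comes out as 312 x 232, trimming a few pixels from the edges.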

Most any compression software you use will have a way to preview the results of your filtering, so find the darkest areas of the video and the lightest and do a little testing. Also, frame out your video at the smallest crops possible and scrub through the timeline to make sure you're not cropping out pieces of a face, or other important information. You might even need to crop separate parts of the movie individually, then paste them together later. I've sometimes found that one scene might be framed a bit high, another a bit low. Don't be afraid to split the movie up and handle it this way. The splitting and stitching can all be handled right in QuickTime Pro.

Compress Pass

Now that you have your big fat video that's been cropped, crushed, and otherwise scrubbed clean of annoying stray pixels that'll never be noticed anyway, it's time to do what you really came for in the first place and compress the living crap out of the file.

Remember the rules: you can reduce file size and make the video more deliverable in three ways: change the height and width dimensions of the video, change the frame rate, or change the data rate. It'll always be some combination of the three that will give you the best results, but don't forget that everything is a trade-off. Despite what your girlfriend tells you, bigger is better. Reducing the dimensions decreases the perceived value to your audience and may also interfere with the clear communication of your message. Nobody wants to watch a postage stamp-sized video, no matter how pretty it is. Set the frame rate too low and you'll get something that looks more like a slide show than a video, and squeeze the data rate down too much and the video will be blocky, blurry and otherwise unattractive.

So what are the secret settings that make the perfect video every time? Unfortunately, they change depending on the video, and in fact the best results might come from using different codecs on different parts of your video and stitching it all up later. As I noted earlier, QuickTime can handle mixed codecs, multiple tracks, and even different bits with different data-rates without missing a step. Don't be afraid to experiment for the best results.

It's important to realize that the best codecs are interframe codecs, so they are looking for changes from frame to frame. If you take nothing else away from this section and the one on camera techniques, take that. Let it roll around in your head like a BB in a basketball until you start to understand the implications of just what that means.

Ultimately, at least in terms of compression, change is bad.

So scenes with little change, talking heads (newscasts for example) are easy to compress. Solid color backgrounds, flat lighting and slow deliberate movements, all yield better compression results. They also make for a pretty darn boring piece of video. The things that make your movie engaging are the things that are hardest to compress, namely action. Running, jumping, quick pans, that cool jerky hand-held technique like they used to do on the TV show Homicide, all make lousy candidates for compression.

With QuickTime we can compensate for this by working the compression program and massaging our three variables till we get to the best possible compromise, and that's going to be different every time for every piece of video. Even for experienced compressionists, it's really a matter of trial and error. Like negotiating a divorce, it's all about what you're willing to give up.
