I have read the COBS paper, and below I propose modifications to COBS that should make it significantly faster and a better fit for the Hadoop use case. When I first skimmed the paper, I identified the same issues that Doug did -- particularly that it is a byte-by-byte encoding. Java is SLOW at byte-by-byte anything; its JIT compilers simply don't optimize byte-at-a-time work as well as a C compiler does.
I just saw that Matt Massie opened a JIRA ticket for this, so I'll leave the
bulk of the details for that and send what I have written so far below.
Initial comments inline below, with a summary at the end. JIRA details later.

On 5/4/09 10:43 AM, "Doug Cutting" <[email protected]> wrote:

> Matt Massie wrote:
>> (2) Why are we using a 16-byte UUID block delimiter to mark block boundaries
>> instead of using say Consistent Overhead Byte Stuffing (COBS)?
>
> In part because I didn't yet know about COBS when I implemented this
> container. (You're the second person to ask this.) Perhaps we should
> instead use COBS. The only potential disadvantage I see is that COBS
> seems to require byte-by-byte processing. When projecting records to a
> subset schema, we've seen huge speedups if we skip through data by
> chunks, passing over strings, blobs, etc. just incrementing the file
> pointer. So I worry that COBS, while quite fast, might add a
> significant cost to such skipping. Compression also requires
> byte-by-byte processing, but provides more tangible value than COBS. So
> COBS would need to add truly negligible CPU overhead, which it might.

If it's going byte by byte, it's a performance risk for the low-overhead,
high-throughput case. A useful test might be to scan a ByteBuffer and simply
count the number of zero values encountered (the counting prevents dead code
elimination). One could compare this to the same scan done with an IntBuffer
or LongBuffer, counting the occurrences of zero values there.
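Something along these lines is what I have in mind -- a rough, untested sketch
(the class and method names are just placeholders). It isn't an apples-to-apples
comparison, since the second pass counts zero longs rather than zero bytes, but
it should show the relative raw scan throughput of byte-at-a-time versus
64-bits-at-a-time in the JVM:

import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.LongBuffer;
import java.util.Random;

public class ZeroScanBench {

  public static void main(String[] args) {
    final int size = 64 * 1024 * 1024;   // 64 MB of random test data
    ByteBuffer data = ByteBuffer.allocateDirect(size).order(ByteOrder.nativeOrder());
    Random random = new Random(42);
    byte[] chunk = new byte[8192];
    while (data.hasRemaining()) {
      random.nextBytes(chunk);
      data.put(chunk, 0, Math.min(chunk.length, data.remaining()));
    }

    // Warm up the JIT on the first passes, then look at the later ones.
    for (int pass = 0; pass < 5; pass++) {
      long t0 = System.nanoTime();
      long zeroBytes = countZeroBytes(data);
      long t1 = System.nanoTime();
      long zeroLongs = countZeroLongs(data);
      long t2 = System.nanoTime();
      System.out.printf("pass %d: byte scan %d zeros at %.0f MB/s, long scan %d zeros at %.0f MB/s%n",
          pass, zeroBytes, size / ((t1 - t0) / 1e3), zeroLongs, size / ((t2 - t1) / 1e3));
    }
  }

  // Byte-at-a-time scan: roughly the access pattern a COBS pass is stuck with.
  // Returning the count keeps the JIT from eliminating the loop as dead code.
  static long countZeroBytes(ByteBuffer data) {
    ByteBuffer b = data.duplicate();
    b.clear();
    long count = 0;
    while (b.hasRemaining()) {
      if (b.get() == 0) {
        count++;
      }
    }
    return count;
  }

  // 64-bits-at-a-time scan: roughly the access pattern a word/long variant could use.
  static long countZeroLongs(ByteBuffer data) {
    ByteBuffer b = data.duplicate();
    b.clear();
    LongBuffer longs = b.asLongBuffer();
    long count = 0;
    while (longs.hasRemaining()) {
      if (longs.get() == 0L) {
        count++;
      }
    }
    return count;
  }
}

If the long scan comes out several times faster, that gives a rough idea of
what a byte-oriented COBS pass would cost us relative to a word-oriented one.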
>> COBS
>> also allows for in-place encoding and decoding with very little overhead and
>> copying (see the paper for test results).
> [ ... ]
>
> Thanks for all your COBS analysis! It does seem attractive. Skimming
> the paper, I see mostly space overhead benchmarks, not CPU overhead. In
> projection benchmarks, we saw a 4x speedup from just skipping strings.
> AVRO-25 proposes to make entire arrays and maps skippable, which should,
> e.g., speed the projection of a single field from large, complex records
> by 10x or more. But if compression is always used, perhaps the right
> comparison is with something like LZO. If COBS is negligible next to
> LZO, that could be good enough.

First, I do suspect that compression will be common, but there will be very
important non-compressed use cases as well.

Next, even if COBS is roughly on par with LZO, that is too slow. It would have
to be a couple of times faster than LZO to have minimal CPU impact. A fast
checksum operation is a few times faster than LZO, in large part because most
checksums work on 32-bit or 64-bit chunks at a time.

LZO itself is also "slow". Google mentions a compression scheme for BigTable
that is faster than LZO. From the Google BigTable whitepaper (2006):

"Many clients use a two-pass custom compression scheme. The first pass uses
Bentley and McIlroy’s scheme [6], which compresses long common strings across
a large window. The second pass uses a fast compression algorithm that looks
for repetitions in a small 16 KB window of the data. Both compression passes
are very fast—they encode at 100–200 MB/s, and decode at 400–1000 MB/s on
modern machines."

LZO and similar schemes are somewhat slower, especially in Java:
http://quicklz.com/

Sun's lzjb compression in ZFS claims to be a bit faster (benchmarks seem to
put it in the neighborhood of LZO, and they are related algorithms):
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/os/compress.c

In short, if you want fast streaming in Java, I suspect COBS won't work in the
byte-by-byte form described in the paper. However ...

> Hopefully we'll develop a benchmark suite for Avro that includes
> projection & compression, so we can more easily evaluate things like this.
>
> Doug

... COBS has fantastic properties. My proposal is essentially to go from COBS
to COWS or COLS: Constant Overhead Word Stuffing or Constant Overhead Long
Stuffing. The properties of COBS all carry over with larger chunks, just
faster. Java may not compile byte-by-byte operations into efficient code, but
it does pretty well with 32-bit or 64-bit chunks. So the proposal is to do
almost the same thing as COBS, but in 32- or 64-bit chunks. This should be
efficient in many languages.

Eyeballing the future, it should probably be 64-bit chunks. Sun is about to
release JVM 1.6.0_14, which has pointer compression, so 64-bit JVMs will
perform better than 32-bit ones on x86-64 with nearly the same memory
overhead.

When scanning for the beginning of a 'packet', one can look for a zero WORD or
LONG rather than a zero BYTE (a rough sketch of such a scan is at the end of
this message). There are drawbacks to this, which I'll address in the JIRA.
There are many ways to handle them, however, and the overhead can be kept to a
maximum of 16 bits. Since we expect most Hadoop-related streams to be larger
than the design goals of COBS, we probably want to encode longer runs so we
can 'skip' more data at a time, and we can afford more overhead per packet.

The truth is, if we had access to SSE instructions, we'd probably do this 128
bits at a time and approach a few GB/sec in memory on newer hardware. The COBS
paper seems to target network protocols and low-level network devices and
hardware, which already operate one byte at a time and so would not have a
performance problem from that aspect.
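To make the scanning idea concrete, here is a rough, untested sketch of what
resynchronizing on a zero LONG could look like from the reader's side. The
method name and the assumption that an aligned all-zero long marks a packet
boundary are only illustrative -- the real framing details belong in the JIRA:

import java.nio.ByteBuffer;
import java.nio.LongBuffer;

public class BoundaryScan {

  /**
   * Returns the offset (relative to the start of the buffer) of the first
   * 8-byte-aligned all-zero long at or after the current position, or -1 if
   * there isn't one. If the encoding guarantees that no all-zero long can
   * appear inside an encoded packet, this is all a reader has to do to find
   * the next packet boundary in the middle of a stream.
   */
  static long nextZeroLongBoundary(ByteBuffer data) {
    ByteBuffer b = data.duplicate();
    // Step forward to the next 8-byte alignment boundary, if necessary.
    int misalignment = b.position() & 7;
    if (misalignment != 0) {
      int step = 8 - misalignment;
      if (b.remaining() < step) {
        return -1;
      }
      b.position(b.position() + step);
    }
    int start = b.position();
    LongBuffer longs = b.asLongBuffer();   // view the remaining bytes as longs
    for (int i = 0; i < longs.limit(); i++) {
      if (longs.get(i) == 0L) {            // one 64-bit compare per iteration
        return start + 8L * i;
      }
    }
    return -1;
  }
}

The point is that the inner loop does one 64-bit comparison per iteration
instead of one byte comparison, which is exactly the kind of loop HotSpot
compiles well.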
