I just hit an interesting pack failure because of how git (mis-)uses zlib,
and I'm wondering what to do about it.
In particular, the "git-unpack-objects" code gets a data stream, and only
knows the _unpacked_ size of each object, because writing the packed size is
extremely inconvenient in many ways (let me count the ways: at pack time,
we want to fill in the size field before we've even packed things, and at
unpack time, we really don't care about the packed size, but we _do_ care
about the unpacked size in order to be able to allocate the right-sized
buffer for the result).
However, it turns out that there's a silly special case: a zero-sized
"blob" object will be encoded as a single byte "0x30" followed by the
"packed representation of empty".
Now, you'd expect the packed representation of empty to be empty, but
that's apparently not what zlib does. It actually seems to pack zero bytes
as 8 bytes of "78 9c 03 00 00 00 00 01". Which is fine, I don't care, with
git this will literally happen for only one single object, so it's not
like I care about the expansion.
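
For anybody who wants to see those eight bytes for themselves, here is a
minimal sketch using Python's zlib module (my choice purely for
illustration, git itself calls the C library directly). The variable names
are made up. With the default compression settings it should print the same
stream as quoted above: a two-byte zlib header, a two-byte "final, empty"
deflate block, and a four-byte Adler-32 checksum, which is 1 for empty input.

```python
import zlib

# Compress zero bytes of input with zlib's default settings.
packed_empty = zlib.compress(b"")

# Show the length and a hex dump of the resulting stream.
hex_dump = " ".join(f"{b:02x}" for b in packed_empty)
print(len(packed_empty), hex_dump)

# The stream is not empty, but it still inflates back to nothing.
print(zlib.decompress(packed_empty) == b"")
```
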
But what I care about is that when git-unpack-objects sees that it wants a
zero-byte object, and asks zlib to unpack it, zlib will not actually consume
the bytes that the compressor wrote - it will just say "oh, you wanted zero
bytes, here's zero bytes". Which means that the stream handling gets upset,
because those bytes are still sitting unread in front of the next object.
Now, I can easily fix this by just teaching the packing code that it
should pack the zero-byte object as zero bytes, and not let zlib mess it
up. In fact, I've done exactly that. However, now I worry that there's
some other case where zlib uncompression doesn't eat everything that zlib
compression generated. I've not found it, and I think a zero sized case is
special (it's kind of like a "break" event), but this is yet another cry
for zlib expertise in case somebody knows...
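
One rough way to go looking for such a case is to compress payloads of
various sizes, glue a marker onto the end, and check that the decompressor
stops exactly at the marker. The sketch below does that in Python; the
helper name, the marker and the list of sizes are all hypothetical. Note the
limits: Python's wrapper keeps inflating until it sees end-of-stream, so it
does not reproduce the zero-sized output buffer that git-unpack-objects
hands to zlib. It only checks that compressor and decompressor agree on
where a stream ends.

```python
import zlib


def consumed_exactly(payload: bytes, trailer: bytes = b"NEXT") -> bool:
    """True if inflating payload's stream leaves exactly `trailer` unread."""
    stream = zlib.compress(payload) + trailer
    d = zlib.decompressobj()
    out = d.decompress(stream)
    # eof is set once the end-of-stream marker was seen; unused_data holds
    # whatever input followed the compressed stream.
    return out == payload and d.eof and d.unused_data == trailer


# Arbitrary sizes, including the zero-sized case that started all this.
sizes = [0, 1, 2, 15, 16, 255, 4096, 65536]
results = {n: consumed_exactly(b"a" * n) for n in sizes}
print(results)
```

Any False in the output would be a size where the two sides disagree about
the end of the stream. A run like this proves nothing about sizes or
contents it didn't try, of course.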