On Tue, Apr 12, 2016 at 3:37 PM, Dan Burkert <[email protected]> wrote:
> Sounds good to me. It should also be possible to begin preallocating in
> the future if we see a need, since it's the same on disk format, right?
>
Yeah. We'd likely just want to change the validation rules.
> The only other concern about preallocation is making sure that the length
> seed for a new record is never 0, since you could read a string of null
> bytes as repeated 0 length records. E.G., assuming CRC(null bytes, 0) == 0
This assumption doesn't appear to hold. I wasn't sure, so I checked. This
is the test I wrote (which fails):
TEST_F(CrcTest, TestCRC32CZeroSeed) {
uint64_t zero = 0;
ASSERT_EQ(0, Crc32c(&zero, sizeof(zero)));
}
The output, on my machine, is:
[ RUN ] CrcTest.TestCRC32CZeroSeed
../../src/kudu/util/crc-test.cc:92: Failure
Value of: Crc32c(&zero, sizeof(zero))
Actual: 2351477386
Expected: 0
So I think we are safe not chaining the CRCs, and not changing the CRC
seed, unless we think that it's very important to enforce ordering.
Mike