[
https://issues.apache.org/jira/browse/KAFKA-739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jay Kreps updated KAFKA-739:
----------------------------
Attachment: KAFKA-739-v2.patch
New patch rebased to trunk and addresses Neha's comments:
1. Changed delete retention to 24 hours
2. Fixed broken logic in warning statement so it warns when your buffer is too
big.
3. Yes, that was in the patch, just got lost in the conflict?
4. The dump log segments tool was printing the value as the key; fixed.
5. SimpleKafkaETLMapper didn't handle null. This isn't an easy fix since the
text format has no out-of-range marker to represent null. It now returns an
empty string, which is ambiguous but better than crashing.
6. Linear probing has the problem that it tends to lead to "runs", i.e. if you
have a fixed probing step size of N and you hit a collision, the
probability that the spot N slots over is also full is going to be higher. So the
ideal probing approach would be a sequence of fully random hashes which were
completely uncorrelated with one another. That is the motivation for using the
rest of the md5 before degrading to linear probing since we have already
computed 16 bytes of random hash. The second question is whether it is legit to
increment byte by byte, since this effectively reuses bytes of the hash.
I agree it is a little sketchy, though it does seem to work.
7. Clarified the purpose of dump logs.
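The probing scheme discussed in point 6 can be sketched roughly as follows. This is a minimal illustration, not the code in the patch: the names probe_slot and ProbingMap, the 4-byte window, and the big-endian decoding are all assumptions made for the example. It slides a window over the 16-byte MD5 digest one byte at a time (so consecutive probes reuse three bytes of the hash) and degrades to linear probing once the digest is exhausted.

```python
import hashlib

HASH_SIZE = 16   # length of an MD5 digest in bytes
WINDOW = 4       # assumed: bytes of the digest read per probe


def probe_slot(digest: bytes, attempt: int, num_slots: int) -> int:
    """Return the slot index to try on the given 0-based attempt."""
    # Slide a 4-byte window across the digest, one byte per attempt.
    offset = min(attempt, HASH_SIZE - WINDOW)
    value = int.from_bytes(digest[offset:offset + WINDOW], "big")
    # Once the window reaches the end of the digest, step linearly instead.
    linear_step = max(0, attempt - (HASH_SIZE - WINDOW))
    return (value + linear_step) % num_slots


class ProbingMap:
    """Hypothetical fixed-size open-addressing map keyed by MD5 digest."""

    def __init__(self, num_slots: int):
        self.num_slots = num_slots
        self.slots = [None] * num_slots  # each entry is (digest, value) or None
        # The linear phase visits every slot, so this bound guarantees that
        # a free slot is found whenever one exists.
        self.max_attempts = HASH_SIZE - WINDOW + num_slots

    def put(self, key: bytes, value) -> None:
        digest = hashlib.md5(key).digest()
        for attempt in range(self.max_attempts):
            slot = probe_slot(digest, attempt, self.num_slots)
            entry = self.slots[slot]
            if entry is None or entry[0] == digest:
                self.slots[slot] = (digest, value)
                return
        raise RuntimeError("map is full")

    def get(self, key: bytes):
        digest = hashlib.md5(key).digest()
        for attempt in range(self.max_attempts):
            entry = self.slots[probe_slot(digest, attempt, self.num_slots)]
            if entry is None:
                return None          # an empty slot ends the probe sequence
            if entry[0] == digest:
                return entry[1]
        return None
```

The first 13 probes come from overlapping windows of the digest, which is the "a little sketchy" reuse of hash bytes; only later probes step by one slot at a time.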
> Handle null values in Message payload
> -------------------------------------
>
> Key: KAFKA-739
> URL: https://issues.apache.org/jira/browse/KAFKA-739
> Project: Kafka
> Issue Type: Bug
> Reporter: Jay Kreps
> Assignee: Jay Kreps
> Fix For: 0.8.1
>
> Attachments: KAFKA-739-v1.patch, KAFKA-739-v2.patch
>
>
> Add tests for null message payloads in producer, server, and consumer.
> Ensure log cleaner treats these as deletes.
> Test that null keys are rejected on dedupe logs.
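
The behaviour described in the issue above (null payloads treated as deletes by the log cleaner, null keys rejected on dedupe logs) can be illustrated with a small model. This is a simplified sketch under stated assumptions, not Kafka's cleaner: the compact function, the (timestamp, key, value) record shape, and per-record timestamps are hypothetical, and the 24-hour default only mirrors the delete retention mentioned in point 1.

```python
DELETE_RETENTION_SECS = 24 * 60 * 60  # assumed default, per point 1 above


def compact(records, now, delete_retention_secs=DELETE_RETENTION_SECS):
    """Deduplicate a log given as a list of (timestamp, key, value) tuples.

    Keeps only the latest record per key. A record whose value is None is
    a delete marker: it is kept until it is older than the retention
    period, then dropped together with the key.
    """
    latest_offset = {}
    for offset, (_, key, _) in enumerate(records):
        if key is None:
            # A dedupe log cannot work without keys, so reject the record.
            raise ValueError("null key not allowed in a dedupe log")
        latest_offset[key] = offset

    kept = []
    for offset, (ts, key, value) in enumerate(records):
        if latest_offset[key] != offset:
            continue  # superseded by a later record for the same key
        if value is None and now - ts > delete_retention_secs:
            continue  # delete marker has outlived its retention
        kept.append((ts, key, value))
    return kept
```

Keeping the delete marker for a while gives consumers that are behind a chance to see the delete before the key disappears from the log entirely.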