[
https://issues.apache.org/jira/browse/KAFKA-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15942069#comment-15942069
]
ASF GitHub Bot commented on KAFKA-1449:
---------------------------------------
GitHub user ijuma opened a pull request:
https://github.com/apache/kafka/pull/2739
KAFKA-1449: Use CRC32C for checksum of V2 message format
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/ijuma/kafka kafka-1449-crc32c
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/kafka/pull/2739.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2739
----
commit ec8dcc18aa21a819349b00e09c8e2a13e71a3643
Author: Ismael Juma
Date: 2017-03-25T23:35:27Z
Use CRC32C for checksum of V2 message format
----
> Extend wire protocol to allow CRC32C
> ------------------------------------
>
> Key: KAFKA-1449
> URL: https://issues.apache.org/jira/browse/KAFKA-1449
> Project: Kafka
> Issue Type: Improvement
> Components: consumer
> Reporter: Albert Strasheim
> Assignee: Neha Narkhede
>
> Howdy
> We are currently building out a number of Kafka consumers in Go, based on a
> patched version of the Sarama library that Shopify released a while back.
> We have a reasonably fast serialization protocol (Cap'n Proto), a 10G network
> and lots of cores. We have various consumers computing all kinds of
> aggregates on a reasonably high volume access log stream (1.1e6 messages/sec
> peak, about 500-600 bytes per message uncompressed).
> When profiling our consumer, our single hottest function (until we disabled
> it) was the CRC32 checksum validation, since the deserialization and
> aggregation in these consumers are pretty cheap.
> We believe things could be improved by extending the wire protocol to support
> CRC-32C (Castagnoli), since SSE 4.2 has an instruction to accelerate its
> calculation.
> https://en.wikipedia.org/wiki/SSE4#SSE4.2
> It might be hard to use from Java, but consumers written in most other
> languages will benefit a lot.
> To give you an idea, here are some benchmarks for the Go CRC32 functions
> running on an Intel(R) Core(TM) i7-3540M CPU @ 3.00GHz core:
> BenchmarkCrc32KB              90196 ns/op    363.30 MB/s
> BenchmarkCrcCastagnoli32KB     3404 ns/op   9624.42 MB/s
> I believe BenchmarkCrc32 written in C would do about 600-700 MB/s, and the
> CRC-32C speed in C should be close to what one achieves in Go.
> (Met Todd and Clark at the meetup last night. Thanks for the great
> presentation!)