Ismael Juma created KAFKA-3565:
----------------------------------
Summary: Producer's throughput lower with compressed data after KIP-31/32
Key: KAFKA-3565
URL: https://issues.apache.org/jira/browse/KAFKA-3565
Project: Kafka
Issue Type: Bug
Reporter: Ismael Juma
Priority: Critical
Fix For: 0.10.0.0
KIP-31 introduced relative offsets so that the broker no longer has to
recompress data (recompression was previously required after offsets were
assigned). The implicit assumption was that removing the CPU cost of
recompression would increase producer throughput for compressed data.
However, this doesn't seem to be the case. With snappy compression, throughput
dropped by roughly 18% (from about 519,000 to about 427,000 records per second):
{code}
Commit: eee95228fabe1643baa016a2d49fb0a9fe2c66bd (one before KIP-31/32)
test_id: 2016-04-15--012.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100.compression_type=snappy
status: PASS
run time: 59.030 seconds
{"records_per_sec": 519418.343653, "mb_per_sec": 49.54}
{code}
Full results: https://gist.github.com/ijuma/0afada4ff51ad6a5ac2125714d748292
{code}
Commit: fa594c811e4e329b6e7b897bce910c6772c46c0f (KIP-31/32)
test_id: 2016-04-15--013.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100.compression_type=snappy
status: PASS
run time: 1 minute 0.243 seconds
{"records_per_sec": 427308.818848, "mb_per_sec": 40.75}
{code}
Full results: https://gist.github.com/ijuma/e49430f0548c4de5691ad47696f5c87d
The drop for the uncompressed case is smaller, about 9%, and within what one
would expect given the additional size overhead of the timestamp field added by
KIP-32:
{code}
Commit: eee95228fabe1643baa016a2d49fb0a9fe2c66bd (one before KIP-31/32)
test_id: 2016-04-15--010.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100
status: PASS
run time: 1 minute 4.176 seconds
{"records_per_sec": 321018.17747, "mb_per_sec": 30.61}
{code}
Full results: https://gist.github.com/ijuma/5fec369d686751a2d84debae8f324d4f
{code}
Commit: fa594c811e4e329b6e7b897bce910c6772c46c0f (KIP-31/32)
test_id: 2016-04-15--014.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100
status: PASS
run time: 1 minute 5.079 seconds
{"records_per_sec": 291777.608696, "mb_per_sec": 27.83}
{code}
Full results: https://gist.github.com/ijuma/1d35bd831ff9931448b0294bd9b787ed
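
For reference, the relative change between the two commits can be worked out
from the records_per_sec figures quoted above. The snippet below is only an
illustrative sketch: the helper names (RESULTS, percent_change, summarize) are
made up for this example and are not part of kafkatest, and the numbers are
simply copied from the four runs in this report.

```python
# records_per_sec values copied from the four benchmark runs quoted above.
# "before" is commit eee9522 (one before KIP-31/32), "after" is fa594c8.
RESULTS = {
    "snappy": {"before": 519418.343653, "after": 427308.818848},
    "uncompressed": {"before": 321018.17747, "after": 291777.608696},
}


def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


def summarize(results):
    """Map each scenario name to its throughput change, rounded to 0.1%."""
    return {
        name: round(percent_change(run["before"], run["after"]), 1)
        for name, run in results.items()
    }


if __name__ == "__main__":
    for name, change in summarize(RESULTS).items():
        print(f"{name}: {change:+.1f}% records/sec")
```

With these inputs the snappy case comes out at about -17.7% and the
uncompressed case at about -9.1%, so the regression with compression is roughly
twice as large as the one without. These are single runs, so run-to-run
variance is not accounted for.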