We have 3 DCs with a 5-node Kafka cluster in each, and the 3 DCs are connected
using MirrorMaker for replication. We were running a performance test with the
Kafka producer performance tool to load 100 million rows into 7 topics. We
expected the data to be spread evenly across the 7 topics, but 4 topics were
loaded with ~2 million messages and the remaining 3 topics with 90 million
messages. The nodes that were leaders of those 3 topics ran out of disk space
and went down.
We tried to bring these 2 nodes back by doing the following:
1. Stopped the Kafka service.
2. Deleted a couple of topics that were taking up too much space, i.e.
   /var/kafka/logs/{topic$}/, after which the file system showed 47% available.
3. Brought the Kafka nodes back up.
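
For step 2, a rough sketch of how the per-topic disk usage could be measured before deciding what to delete. It assumes the usual Kafka layout of one directory per partition named <topic>-<partition> under the log directory; the function names (dir_size_bytes, topic_of, usage_by_topic) are made up for this illustration, and /var/kafka/logs is only the default taken from the path above.

```python
import os
import sys
from collections import defaultdict


def dir_size_bytes(path):
    """Sum the sizes of all files below path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                pass  # file was removed while we were scanning
    return total


def topic_of(partition_dir):
    """Return the topic for a '<topic>-<partition>' directory name, else None."""
    topic, sep, partition = partition_dir.rpartition("-")
    return topic if sep and topic and partition.isdigit() else None


def usage_by_topic(log_dir):
    """Map each topic to the total bytes used by its partition directories."""
    usage = defaultdict(int)
    for entry in os.listdir(log_dir):
        full = os.path.join(log_dir, entry)
        topic = topic_of(entry)
        if topic is not None and os.path.isdir(full):
            usage[topic] += dir_size_bytes(full)
    return dict(usage)


if __name__ == "__main__":
    # Assumption: log directory as mentioned in this mail; pass another as argv[1].
    log_dir = sys.argv[1] if len(sys.argv) > 1 else "/var/kafka/logs"
    if os.path.isdir(log_dir):
        ranked = sorted(usage_by_topic(log_dir).items(),
                        key=lambda kv: kv[1], reverse=True)
        for topic, size in ranked:
            print(f"{size / 2**30:10.2f} GiB  {topic}")
```

Run on each broker, this only reports sizes; it does not delete anything.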
As soon as the nodes were back, we saw the file system growing again, and
within 15 minutes the mount point was full. The deleted topics were recreated
and are taking up space again. kafka.log shows many messages like the ones
below. Ultimately the node goes down. We don't need to recover the data now; we
just want to bring the nodes back. What are the steps to bring these nodes back?
[2015-03-11 20:52:36,323] INFO Rolled new log segment for 'dc2-perf-topic5-0'
in 3 ms. (kafka.log.Log)
[2015-03-11 15:58:07,321] INFO [Kafka Server 1021124614], started
(kafka.server.KafkaServer)
[2015-03-11 15:58:07,882] INFO Completed load of log dc2-perf-topic5-0 with log
end offset 0 (kafka.log.Log)
[2015-03-11 15:58:07,900] INFO Created log for partition [dc2-perf-topic5,0] in
/var/kafka/log with properties {segment.index.bytes -> 10485760,
file.delete.delay.ms -> 60000, segment.bytes -> 1073741824, flush.ms ->
9223372036854775807, delete.retention.ms -> 3600000, index.interval.bytes ->
4096, retention.bytes -> -1, cleanup.policy -> delete, segment.ms -> 604800000,
max.message.bytes -> 1000012, flush.messages -> 9223372036854775807,
min.cleanable.dirty.ratio -> 0.5, retention.ms -> 604800000}.
(kafka.log.LogManager)
[2015-03-11 15:58:07,914] INFO Completed load of log dc2-perf-topic2-0 with log
end offset 0 (kafka.log.Log)
[2015-03-11 15:58:07,916] INFO Created log for partition [dc2-perf-topic2,0] in
/var/kafka/log with properties {segment.index.bytes -> 10485760,
file.delete.delay.ms -> 60000, segment.bytes -> 1073741824, flush.ms ->
9223372036854775807, delete.retention.ms -> 3600000, index.interval.bytes ->
4096, retention.bytes -> -1, cleanup.policy -> delete, segment.ms -> 604800000,
max.message.bytes -> 1000012, flush.messages -> 9223372036854775807,
min.cleanable.dirty.ratio -> 0.5, retention.ms -> 604800000}.
(kafka.log.LogManager)
[2015-03-11 15:58:07,935] INFO Completed load of log dc2-perf-topic9-0 with log
end offset 0 (kafka.log.Log)
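
To make the properties in the log lines above easier to read, here is a small sketch that converts them to human units. The values are copied from the "Created log for partition" lines; the describe helper is made up for this illustration, and the reading of retention.bytes = -1 as "no size-based limit" assumes the usual Kafka meaning of that setting.

```python
from datetime import timedelta

LONG_MAX = 2**63 - 1  # Java's Long.MAX_VALUE, as printed for flush.ms / flush.messages

# Values copied from the "Created log for partition" lines in kafka.log.
PROPS = {
    "segment.bytes": 1073741824,
    "segment.index.bytes": 10485760,
    "retention.bytes": -1,
    "retention.ms": 604800000,
    "segment.ms": 604800000,
    "delete.retention.ms": 3600000,
    "file.delete.delay.ms": 60000,
    "flush.ms": 9223372036854775807,
}


def describe(name, value):
    """Render one property value in human-readable units."""
    if value == LONG_MAX:
        return "Long.MAX_VALUE (never reached in practice)"
    if name == "retention.bytes" and value < 0:
        return "no size-based limit"  # assumption: usual Kafka meaning of -1
    if name.endswith(".ms"):
        return str(timedelta(milliseconds=value))
    if name.endswith(".bytes"):
        return f"{value / 2**20:g} MiB"
    return str(value)


for name, value in PROPS.items():
    print(f"{name:22} {describe(name, value)}")
```

If that reading is right, the topics keep data for 7 days with no size cap, so nothing written during the test would be removed by retention yet.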
SP Naidu