Zhipeng Liu created KAFKA-13222:
-----------------------------------
Summary: Records in newly rolled segment couldn't be fetched by
consumer
Key: KAFKA-13222
URL: https://issues.apache.org/jira/browse/KAFKA-13222
Project: Kafka
Issue Type: Bug
Reporter: Zhipeng Liu
We encountered an issue with a Kafka broker in our production environment: one
of the consumers in a consumer group suddenly became unable to fetch messages
from the partition it was assigned. The offset that the consumer couldn't fetch
is exactly the base offset of a new segment file.
{code:java}
# /opt/kafka/bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group privisioning
TOPIC      PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID                                     HOST           CLIENT-ID
......
nrss_event 0         18898345       18899120       775 consumer-2-0af9de42-7d3d-4a9b-a7af-f70cd51e518f /10.199.149.89 consumer-2
dbqt_event 0         83             83             0   consumer-2-0af9de42-7d3d-4a9b-a7af-f70cd51e518f /10.199.149.89 consumer-2
......
{code}
Below is the status of the log segments.
{code:java}
# ls -lrth
total 432K
-rw-r--r-- 1 kafka kafka 10 Aug 4 22:03 00000000000018897133.snapshot
-rw-r--r-- 1 kafka kafka 246K Aug 11 11:03 00000000000018897133.log
-rw-r--r-- 1 kafka kafka 464 Aug 11 22:29 00000000000018897133.index
-rw-r--r-- 1 kafka kafka 708 Aug 11 22:29 00000000000018897133.timeindex
-rw-r--r-- 1 kafka kafka 10 Aug 11 22:29 00000000000018898345.snapshot
-rw-r--r-- 1 kafka kafka 10M Aug 13 23:32 00000000000018898345.timeindex
-rw-r--r-- 1 kafka kafka 10M Aug 13 23:32 00000000000018898345.index
-rw-r--r-- 1 kafka kafka 154K Aug 13 23:42 00000000000018898345.log
-rw-r--r-- 1 kafka kafka 28 Aug 16 18:46 leader-epoch-checkpoint
{code}
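To make the relationship between the stuck offset and the segment files explicit, here is a minimal Python sketch. It assumes only that a segment's file name is its zero-padded base offset, as in the listing above; the helper name `segment_for_offset` is hypothetical and not part of any Kafka tooling.

```python
import bisect


def segment_for_offset(segment_files, offset):
    """Return the .log file whose base offset is the largest one <= offset."""
    bases = sorted(
        int(name.split(".")[0]) for name in segment_files if name.endswith(".log")
    )
    if not bases:
        raise ValueError("no .log segment files given")
    idx = bisect.bisect_right(bases, offset) - 1
    if idx < 0:
        raise ValueError(f"offset {offset} is below the first base offset {bases[0]}")
    # Segment file names are the base offset padded to 20 digits.
    return f"{bases[idx]:020d}.log"


# Values copied from the consumer group description and the directory listing.
files = ["00000000000018897133.log", "00000000000018898345.log"]
current_offset = 18898345
log_end_offset = 18899120

segment = segment_for_offset(files, current_offset)
is_base_offset = int(segment.split(".")[0]) == current_offset
lag = log_end_offset - current_offset

print(segment, is_base_offset, lag)
```

With the values above, the consumer's current offset maps to the newer segment and equals its base offset, and the computed lag matches the LAG column (775).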
We dumped the older segment 00000000000018897133.log (starting offset:
18897133) and found that its latest record (offset: 18898344) was created at
11:03 on Aug 11. We checked the server log for Aug 11 and found that a segment
roll happened at 22:29.
{code:java}
[2021-08-11 22:03:46,004] INFO [Log partition=nrss_event-0,
dir=/kafka/kafka/data] Found deletable segments with base offsets [18895822]
due to retention time 604800000ms breach (kafka.log.Log)
[2021-08-11 22:03:46,004] INFO [Log partition=nrss_event-0,
dir=/kafka/kafka/data] Scheduling log segment [baseOffset 18895822, size
266974] for deletion. (kafka.log.Log)
[2021-08-11 22:03:46,005] INFO [Log partition=nrss_event-0,
dir=/kafka/kafka/data] Incrementing log start offset to 18897133 (kafka.log.Log)
[2021-08-11 22:04:46,005] INFO [Log partition=nrss_event-0,
dir=/kafka/kafka/data] Deleting segment 18895822 (kafka.log.Log)
[2021-08-11 22:04:46,006] INFO Deleted log
/kafka/kafka/data/nrss_event-0/00000000000018895822.log.deleted.
(kafka.log.LogSegment)
[2021-08-11 22:04:46,006] INFO Deleted offset index
/kafka/kafka/data/nrss_event-0/00000000000018895822.index.deleted.
(kafka.log.LogSegment)
[2021-08-11 22:04:46,007] INFO Deleted time index
/kafka/kafka/data/nrss_event-0/00000000000018895822.timeindex.deleted.
(kafka.log.LogSegment)
{code}
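As a quick sanity check on the retention time reported in the server log, the following Python snippet converts 604800000 ms into hours and days. It is only a unit conversion of the value shown above, not a statement about how the broker evaluates retention.

```python
from datetime import timedelta

# Retention time copied from the "retention time 604800000ms breach" log line.
retention_ms = 604800000

retention = timedelta(milliseconds=retention_ms)
retention_hours = retention.total_seconds() / 3600
retention_days = retention.days

print(retention_hours, retention_days)
```

This gives 168 hours, or 7 days.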
We noticed that a similar issue about segment rolling, KAFKA-10313, was
reported before. We checked the Kafka consumer client's logs (logging level
INFO in logback.xml) but found no similar log entries; no offset reset happened
on the Kafka client.
We use the default segment rotation period (168 hrs) and size (1G), so we
expected only one segment in the log folder. But the records in my environment
are spread across two segments, and the records in the newer segment can't be
fetched by the consumer. This is quite confusing to me, and I wonder whether it
is a Kafka bug in my broker version.
Version info:
||component||version||
|Kafka broker|2.0.1|
|Kafka client|0.11.0.0|