[ https://issues.apache.org/jira/browse/KAFKA-16511 ]
Francois Visconte deleted comment on KAFKA-16511:
-------------------------------------------
was (Author: JIRAUSER288982):
>The issue might be due to the overlapping remote log segments after a new
>leader gets elected during rolling restart. Would you please upload the past
>10 segments remote-log-segment metadata events for
>5G8Ai8kBSwmQ3Ln4QRY5rA:topic1_3543-765 partition? Thanks!
Here are the remote-log-segment metadata events for that partition:
{code:java}
partition: 27, offset: 400504, value:
RemoteLogSegmentMetadataUpdate{remoteLogSegmentId=RemoteLogSegmentId{topicIdPartition=5G8Ai8kBSwmQ3Ln4QRY5rA:topic1-765,
id=v8QykWvQQryMxNoS1aHMCQ}, customMetadata=Optional.empty,
state=DELETE_SEGMENT_STARTED, eventTimestampMs=1712045518393, brokerId=10015}
partition: 27, offset: 400505, value:
RemoteLogSegmentMetadataUpdate{remoteLogSegmentId=RemoteLogSegmentId{topicIdPartition=5G8Ai8kBSwmQ3Ln4QRY5rA:topic1-765,
id=v8QykWvQQryMxNoS1aHMCQ}, customMetadata=Optional.empty,
state=DELETE_SEGMENT_FINISHED, eventTimestampMs=1712045518494, brokerId=10015}
partition: 27, offset: 400828, value:
RemoteLogSegmentMetadataUpdate{remoteLogSegmentId=RemoteLogSegmentId{topicIdPartition=5G8Ai8kBSwmQ3Ln4QRY5rA:topic1-765,
id=6K-bSNgxSnOypIrjl9OiGA}, customMetadata=Optional.empty,
state=DELETE_SEGMENT_STARTED, eventTimestampMs=1712047438507, brokerId=10015}
partition: 27, offset: 400829, value:
RemoteLogSegmentMetadataUpdate{remoteLogSegmentId=RemoteLogSegmentId{topicIdPartition=5G8Ai8kBSwmQ3Ln4QRY5rA:topic1-765,
id=6K-bSNgxSnOypIrjl9OiGA}, customMetadata=Optional.empty,
state=DELETE_SEGMENT_FINISHED, eventTimestampMs=1712047438609, brokerId=10015}
partition: 27, offset: 401145, value:
RemoteLogSegmentMetadataUpdate{remoteLogSegmentId=RemoteLogSegmentId{topicIdPartition=5G8Ai8kBSwmQ3Ln4QRY5rA:topic1-765,
id=ErPfaCVhRiS6EKx7RnL1iQ}, customMetadata=Optional.empty,
state=DELETE_SEGMENT_STARTED, eventTimestampMs=1712049418775, brokerId=10015}
partition: 27, offset: 401146, value:
RemoteLogSegmentMetadataUpdate{remoteLogSegmentId=RemoteLogSegmentId{topicIdPartition=5G8Ai8kBSwmQ3Ln4QRY5rA:topic1-765,
id=ErPfaCVhRiS6EKx7RnL1iQ}, customMetadata=Optional.empty,
state=DELETE_SEGMENT_FINISHED, eventTimestampMs=1712049418894, brokerId=10015}
partition: 27, offset: 401458, value:
RemoteLogSegmentMetadataUpdate{remoteLogSegmentId=RemoteLogSegmentId{topicIdPartition=5G8Ai8kBSwmQ3Ln4QRY5rA:topic1-765,
id=2yAT3R6tS3umrvU8iuMVfw}, customMetadata=Optional.empty,
state=DELETE_SEGMENT_STARTED, eventTimestampMs=1712051340571, brokerId=10015}
partition: 27, offset: 401459, value:
RemoteLogSegmentMetadataUpdate{remoteLogSegmentId=RemoteLogSegmentId{topicIdPartition=5G8Ai8kBSwmQ3Ln4QRY5rA:topic1-765,
id=2yAT3R6tS3umrvU8iuMVfw}, customMetadata=Optional.empty,
state=DELETE_SEGMENT_FINISHED, eventTimestampMs=1712051340719, brokerId=10015}
partition: 27, offset: 401758, value:
RemoteLogSegmentMetadataUpdate{remoteLogSegmentId=RemoteLogSegmentId{topicIdPartition=5G8Ai8kBSwmQ3Ln4QRY5rA:topic1-765,
id=KLtWI-SWS7eBML87LX63SA}, customMetadata=Optional.empty,
state=DELETE_SEGMENT_STARTED, eventTimestampMs=1712053320735, brokerId=10015}
partition: 27, offset: 401759, value:
RemoteLogSegmentMetadataUpdate{remoteLogSegmentId=RemoteLogSegmentId{topicIdPartition=5G8Ai8kBSwmQ3Ln4QRY5rA:topic1-765,
id=KLtWI-SWS7eBML87LX63SA}, customMetadata=Optional.empty,
state=DELETE_SEGMENT_FINISHED, eventTimestampMs=1712053320838, brokerId=10015}
partition: 27, offset: 539496, value:
RemoteLogSegmentMetadataUpdate{remoteLogSegmentId=RemoteLogSegmentId{topicIdPartition=5G8Ai8kBSwmQ3Ln4QRY5rA:topic1-765,
id=8dP13VDYSaiFlubl9SNBTQ}, customMetadata=Optional.empty,
state=DELETE_SEGMENT_STARTED, eventTimestampMs=1712909568625, brokerId=10015}
partition: 27, offset: 539497, value:
RemoteLogSegmentMetadataUpdate{remoteLogSegmentId=RemoteLogSegmentId{topicIdPartition=5G8Ai8kBSwmQ3Ln4QRY5rA:topic1-765,
id=8dP13VDYSaiFlubl9SNBTQ}, customMetadata=Optional.empty,
state=DELETE_SEGMENT_FINISHED, eventTimestampMs=1712909568846, brokerId=10015}
{code}
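For reference, these events come from the internal __remote_log_metadata topic. Below is a minimal sketch of how they can be dumped with a plain consumer, assuming the RemoteLogMetadataSerde class from the Kafka storage module is on the classpath (the exact class location/methods are my understanding, and the bootstrap address and metadata partition are placeholders):
{code:java}
// Sketch: consume the internal __remote_log_metadata topic and deserialize each
// record value with RemoteLogMetadataSerde (storage module). Bootstrap address and
// metadata partition number are placeholders for this environment.
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;
import org.apache.kafka.server.log.remote.metadata.storage.serialization.RemoteLogMetadataSerde;

public class DumpRemoteLogMetadata {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder
        props.put("key.deserializer", ByteArrayDeserializer.class.getName());
        props.put("value.deserializer", ByteArrayDeserializer.class.getName());

        RemoteLogMetadataSerde serde = new RemoteLogMetadataSerde();
        // Metadata partition 27 is the one quoted above for topic1-765.
        TopicPartition metadataPartition = new TopicPartition("__remote_log_metadata", 27);

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            consumer.assign(List.of(metadataPartition));
            consumer.seekToBeginning(List.of(metadataPartition));
            while (true) { // Ctrl-C to stop
                for (ConsumerRecord<byte[], byte[]> record : consumer.poll(Duration.ofSeconds(1))) {
                    // Each value is a serialized RemoteLogMetadata event (segment
                    // metadata, metadata update, or partition-delete marker).
                    System.out.printf("partition: %d, offset: %d, value: %s%n",
                            record.partition(), record.offset(), serde.deserialize(record.value()));
                }
            }
        }
    }
}
{code}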
> Leaking tiered segments
> -----------------------
>
> Key: KAFKA-16511
> URL: https://issues.apache.org/jira/browse/KAFKA-16511
> Project: Kafka
> Issue Type: Bug
> Components: Tiered-Storage
> Affects Versions: 3.7.0
> Reporter: Francois Visconte
> Assignee: Kamal Chandraprakash
> Priority: Major
> Labels: tiered-storage
>
> I have some topics that have not been written to for a few days (with a 12h
> retention) where some data remains on tiered storage (in our case S3) and is
> never deleted.
>
> Looking at the log history, it appears that we never even tried to delete
> these segments.
> When looking at one of the non-leaking segments, I get the following
> interesting messages:
>
> {code:java}
> "2024-04-02T10:30:45.265Z","""kafka""","""10039""","[RemoteLogManager=10039
> partition=5G8Ai8kBSwmQ3Ln4QRY5rA:topic1_3543-764] Deleted remote log segment
> RemoteLogSegmentId{topicIdPartition=5G8Ai8kBSwmQ3Ln4QRY5rA:topic1_3543-764,
> id=fqGng3UURCG3-v4lETeLKQ} due to leader-epoch-cache truncation. Current
> earliest-epoch-entry: EpochEntry(epoch=8, startOffset=2980106),
> segment-end-offset: 2976819 and segment-epochs: [5]"
> "2024-04-02T10:30:45.242Z","""kafka""","""10039""","Deleting log segment data
> for completed successfully
> RemoteLogSegmentMetadata{remoteLogSegmentId=RemoteLogSegmentId
> {topicIdPartition=5G8Ai8kBSwmQ3Ln4QRY5rA:topic1_3543-764,
> id=fqGng3UURCG3-v4lETeLKQ}
> , startOffset=2968418, endOffset=2976819, brokerId=10029,
> maxTimestampMs=1712009754536, eventTimestampMs=1712013411147,
> segmentLeaderEpochs={5=2968418}, segmentSizeInBytes=536351075,
> customMetadata=Optional.empty, state=COPY_SEGMENT_FINISHED}"
> "2024-04-02T10:30:45.144Z","""kafka""","""10039""","Deleting log segment data
> for RemoteLogSegmentMetadata{remoteLogSegmentId=RemoteLogSegmentId
> {topicIdPartition=5G8Ai8kBSwmQ3Ln4QRY5rA:topic1_3543-764,
> id=fqGng3UURCG3-v4lETeLKQ}
> , startOffset=2968418, endOffset=2976819, brokerId=10029,
> maxTimestampMs=1712009754536, eventTimestampMs=1712013411147,
> segmentLeaderEpochs={5=2968418}, segmentSizeInBytes=536351075,
> customMetadata=Optional.empty, state=COPY_SEGMENT_FINISHED}"
> "2024-04-01T23:16:51.157Z","""kafka""","""10029""","[RemoteLogManager=10029
> partition=5G8Ai8kBSwmQ3Ln4QRY5rA:topic1_3543-764] Copied
> 00000000000002968418.log to remote storage with segment-id:
> RemoteLogSegmentId{topicIdPartition=5G8Ai8kBSwmQ3Ln4QRY5rA:topic1_3543-764,
> id=fqGng3UURCG3-v4lETeLKQ}"
> "2024-04-01T23:16:51.147Z","""kafka""","""10029""","Copying log segment data
> completed successfully, metadata:
> RemoteLogSegmentMetadata{remoteLogSegmentId=RemoteLogSegmentId
> {topicIdPartition=5G8Ai8kBSwmQ3Ln4QRY5rA:topic1_3543-764,
> id=fqGng3UURCG3-v4lETeLKQ}
> , startOffset=2968418, endOffset=2976819, brokerId=10029,
> maxTimestampMs=1712009754536, eventTimestampMs=1712013397319,
> segmentLeaderEpochs={5=2968418}, segmentSizeInBytes=536351075,
> customMetadata=Optional.empty, state=COPY_SEGMENT_STARTED}"
> "2024-04-01T23:16:37.328Z","""kafka""","""10029""","Copying log segment data,
> metadata: RemoteLogSegmentMetadata{remoteLogSegmentId=RemoteLogSegmentId
> {topicIdPartition=5G8Ai8kBSwmQ3Ln4QRY5rA:topic1_3543-764,
> id=fqGng3UURCG3-v4lETeLKQ}
> , startOffset=2968418, endOffset=2976819, brokerId=10029,
> maxTimestampMs=1712009754536, eventTimestampMs=1712013397319,
> segmentLeaderEpochs={5=2968418}, segmentSizeInBytes=536351075,
> customMetadata=Optional.empty, state=COPY_SEGMENT_STARTED}"
> {code}
>
> This looks right, because we can see logs from both the plugin and the remote
> log manager indicating that the remote log segment was removed.
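> The deletion above was driven by the leader-epoch-cache truncation check: the
> segment's epochs ([5]) are all older than the current earliest epoch entry
> (epoch=8, startOffset=2980106) and the segment ends at 2976819, before that
> entry's start offset. A rough, hypothetical paraphrase of that predicate (names
> are illustrative, not the actual RemoteLogManager code):
> {code:java}
> // Hypothetical sketch of the "deleted due to leader-epoch-cache truncation" check
> // implied by the log line above; names and structure are illustrative only.
> import java.util.NavigableMap;
> import java.util.TreeMap;
>
> public class EpochTruncationCheck {
>     static boolean eligibleForDeletion(NavigableMap<Integer, Long> segmentLeaderEpochs,
>                                        int earliestEpochInCache,
>                                        long earliestEpochStartOffset,
>                                        long segmentEndOffset) {
>         // Every epoch the segment was written under has been truncated out of the
>         // leader epoch cache, and the segment ends before the earliest remaining
>         // epoch starts, so the segment is no longer reachable and gets deleted.
>         return segmentLeaderEpochs.lastKey() < earliestEpochInCache
>                 && segmentEndOffset < earliestEpochStartOffset;
>     }
>
>     public static void main(String[] args) {
>         NavigableMap<Integer, Long> epochs = new TreeMap<>();
>         epochs.put(5, 2968418L); // segmentLeaderEpochs of the non-leaked segment
>         // earliest-epoch-entry: epoch=8, startOffset=2980106; segment-end-offset: 2976819
>         System.out.println(eligibleForDeletion(epochs, 8, 2980106L, 2976819L)); // true
>     }
> }
> {code}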
> Now if I look at one of the leaked segments, here is what I see:
>
> {code:java}
> "2024-04-02T00:43:33.834Z","""kafka""","""10001""","[RemoteLogManager=10001
> partition=5G8Ai8kBSwmQ3Ln4QRY5rA:topic1_3543-765] Copied
> 00000000000002971163.log to remote storage with segment-id:
> RemoteLogSegmentId{topicIdPartition=5G8Ai8kBSwmQ3Ln4QRY5rA:topic1_3543-765,
> id=8dP13VDYSaiFlubl9SNBTQ}"
> "2024-04-02T00:43:33.822Z","""kafka""","""10001""","Copying log segment data
> completed successfully, metadata:
> RemoteLogSegmentMetadata{remoteLogSegmentId=RemoteLogSegmentId
> {topicIdPartition=5G8Ai8kBSwmQ3Ln4QRY5rA:topic1_3543-765,
> id=8dP13VDYSaiFlubl9SNBTQ}
> , startOffset=2971163, endOffset=2978396, brokerId=10001,
> maxTimestampMs=1712010648756, eventTimestampMs=1712018599981,
> segmentLeaderEpochs={7=2971163}, segmentSizeInBytes=459778940,
> customMetadata=Optional.empty, state=COPY_SEGMENT_STARTED}"
> "2024-04-02T00:43:20.003Z","""kafka""","""10001""","Copying log segment data,
> metadata: RemoteLogSegmentMetadata{remoteLogSegmentId=RemoteLogSegmentId
> {topicIdPartition=5G8Ai8kBSwmQ3Ln4QRY5rA:topic1_3543-765,
> id=8dP13VDYSaiFlubl9SNBTQ}
> , startOffset=2971163, endOffset=2978396, brokerId=10001,
> maxTimestampMs=1712010648756, eventTimestampMs=1712018599981,
> segmentLeaderEpochs={7=2971163}, segmentSizeInBytes=459778940,
> customMetadata=Optional.empty, state=COPY_SEGMENT_STARTED}"
>
> {code}
> I see no errors whatsoever indicating that the remote log deletion was ever
> triggered and failed.
> I tried rolling-restarting my cluster to see whether rebuilding the remote log
> metadata from a fresh state would help, but it did not. Removing the segments
> from tiered storage and then restarting the brokers did not help either.
> Now if I use the offset shell to get the earliest and latest offsets for both
> partitions:
>
> {code:java}
> $ /opt/kafka-3.7.0/bin/kafka-get-offsets.sh --bootstrap-server $KAFKA_CONNECT_STR --topic topic1 --partitions 764,765 --time -2
> raw_spans_datadog_3543:764:2980106
> raw_spans_datadog_3543:765:2976337
> $ /opt/kafka-3.7.0/bin/kafka-get-offsets.sh --bootstrap-server $KAFKA_CONNECT_STR --topic topic1 --partitions 764,765 --time -1
> raw_spans_datadog_3543:764:2980106
> raw_spans_datadog_3543:765:2978397
> {code}
> We can see that despite the retention period being largely exceeded, there are
> 2060 offsets on partition 765 (2978397 - 2976337) that never expire.
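> A quick way to spot affected partitions at scale is to compare earliest and
> latest offsets programmatically; a sketch using the Admin client (bootstrap
> address and topic name are placeholders):
> {code:java}
> // Sketch: flag partitions whose earliest offset has stopped advancing even though
> // the topic is idle and retention has long expired (earliest < latest).
> import java.util.HashMap;
> import java.util.List;
> import java.util.Map;
> import java.util.Properties;
> import org.apache.kafka.clients.admin.Admin;
> import org.apache.kafka.clients.admin.ListOffsetsResult;
> import org.apache.kafka.clients.admin.OffsetSpec;
> import org.apache.kafka.common.TopicPartition;
>
> public class FindStuckPartitions {
>     public static void main(String[] args) throws Exception {
>         Properties props = new Properties();
>         props.put("bootstrap.servers", "localhost:9092"); // placeholder
>         String topic = "topic1"; // placeholder
>         try (Admin admin = Admin.create(props)) {
>             int partitionCount = admin.describeTopics(List.of(topic))
>                     .allTopicNames().get().get(topic).partitions().size();
>
>             Map<TopicPartition, OffsetSpec> earliestSpecs = new HashMap<>();
>             Map<TopicPartition, OffsetSpec> latestSpecs = new HashMap<>();
>             for (int p = 0; p < partitionCount; p++) {
>                 TopicPartition tp = new TopicPartition(topic, p);
>                 earliestSpecs.put(tp, OffsetSpec.earliest());
>                 latestSpecs.put(tp, OffsetSpec.latest());
>             }
>             Map<TopicPartition, ListOffsetsResult.ListOffsetsResultInfo> earliest =
>                     admin.listOffsets(earliestSpecs).all().get();
>             Map<TopicPartition, ListOffsetsResult.ListOffsetsResultInfo> latest =
>                     admin.listOffsets(latestSpecs).all().get();
>
>             // On an idle topic whose retention has expired, earliest should have
>             // caught up with latest; any gap points at offsets that never expired.
>             earliest.forEach((tp, info) -> {
>                 long gap = latest.get(tp).offset() - info.offset();
>                 if (gap > 0) {
>                     System.out.println(tp + " has " + gap + " non-expiring offsets");
>                 }
>             });
>         }
>     }
> }
> {code}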
> One thing that might have triggered this bug is that some nodes were replaced
> with fresh disks between the creation of these segments and their expiration.
> The issue happened on 418 partitions out of 4513, ~9% of partitions.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)