[
https://issues.apache.org/jira/browse/KAFKA-15414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17764170#comment-17764170
]
Luke Chen commented on KAFKA-15414:
-----------------------------------
[~fvisconte], one thing to clarify: this time, the remote segments are now
deleted, right?
Could you provide the full log for investigation? If not, I'll try to reproduce
it in my env.
> remote logs get deleted after partition reassignment
> ----------------------------------------------------
>
> Key: KAFKA-15414
> URL: https://issues.apache.org/jira/browse/KAFKA-15414
> Project: Kafka
> Issue Type: Bug
> Reporter: Luke Chen
> Assignee: Kamal Chandraprakash
> Priority: Blocker
> Fix For: 3.6.0
>
> Attachments: Screenshot 2023-09-12 at 13.53.07.png,
> image-2023-08-29-11-12-58-875.png
>
>
> It seems I'm reaching that code path when running reassignments on my cluster,
> and segments are deleted from the remote store despite a huge retention (topic
> created a few hours ago with 1000h retention).
> It seems to happen consistently on some partitions when reassigning, but not
> on all partitions.
> My test:
> I have a test topic with 30 partitions configured with 1000h global retention
> and 2 minutes local retention.
> I have a load tester producing to all partitions evenly.
> I have a consumer load tester consuming that topic.
> I regularly reset offsets to earliest on my consumer to test backfilling from
> tiered storage.
> My consumer was catching up consuming the backlog and I wanted to upscale my
> cluster to speed up recovery: I upscaled my cluster from 3 to 12 brokers and
> reassigned my test topic to all available brokers to have an even
> leader/follower count per broker.
> When I triggered the reassignment, the consumer lag dropped on some of my
> topic partitions:
> (screenshot: image-2023-08-29-11-12-58-875.png)
> Later I tried to reassign my topic back to 3 brokers and the issue happened
> again.
> Both times in my logs, I've seen a bunch of logs like:
> [RemoteLogManager=10005 partition=uR3O_hk3QRqsn4mPXGFoOw:loadtest11-17]
> Deleted remote log segment RemoteLogSegmentId
> {topicIdPartition=uR3O_hk3QRqsn4mPXGFoOw:loadtest11-17,
> id=Mk0chBQrTyKETTawIulQog}
> due to leader epoch cache truncation. Current earliest epoch:
> EpochEntry(epoch=14, startOffset=46776780), segmentEndOffset: 46437796 and
> segmentEpochs: [10]
> Looking at my S3 bucket, the segments prior to my reassignment have indeed
> been deleted.
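
The log line quoted above suggests the segment was removed because its only
leader epoch (10) is older than the earliest epoch left in the leader epoch
cache (14). Below is a minimal Java sketch of a check of that shape, assuming
the rule is "every epoch recorded for the segment precedes the earliest cached
epoch". The class and method names are hypothetical and are not Kafka's API;
the empty-list guard is also an assumption.

```java
import java.util.List;

// Hypothetical helper, not part of Kafka: illustrates the condition implied by
// the "due to leader epoch cache truncation" log line.
final class EpochTruncationCheck {
    private EpochTruncationCheck() {}

    /**
     * Returns true when every leader epoch recorded for the segment is lower
     * than the earliest epoch still present in the leader epoch cache.
     * Assumption: a segment with no recorded epochs is never treated as deletable.
     */
    static boolean allEpochsBeforeEarliest(List<Integer> segmentEpochs, int earliestEpoch) {
        return !segmentEpochs.isEmpty()
                && segmentEpochs.stream().allMatch(epoch -> epoch < earliestEpoch);
    }

    public static void main(String[] args) {
        // Values taken from the log line: segmentEpochs: [10], earliest epoch 14.
        System.out.println(allEpochsBeforeEarliest(List.of(10), 14));
        // A segment that also spans the earliest cached epoch would be kept.
        System.out.println(allEpochsBeforeEarliest(List.of(10, 14), 14));
    }
}
```

With the values from the log (segmentEpochs [10], earliest epoch 14) this check
returns true, which is consistent with the deletion that was observed. If the
leader epoch cache on the newly assigned broker starts at a later epoch than the
remote segments were written with, a check like this would flag still-valid
segments regardless of the configured retention.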
--
This message was sent by Atlassian Jira
(v8.20.10#820010)