viktorsomogyi commented on PR #13421:
URL: https://github.com/apache/kafka/pull/13421#issuecomment-1582838059

   I have some context on the replica fetcher area (mostly from reading and 
debugging the code), so I hope I can help.
   
   First, since the conversation is a bit long, let me summarize what I 
understand:
   - The problem is that disk A is reaching its capacity limit
   - The solution is to move partition X-1 to disk B
   - During the reassignment, log cleaning is disabled on X-1 (which can 
therefore fill disk A)
   - The reassignment of X-1 fails: the copy on B is left in a failed state, 
while X-1 on A keeps growing
   Is this correct?
   
   If it is, we may need to separate the deletion and compaction cases. I think 
resuming deletion is safe; resuming compaction might not be, since compaction 
alters the log. If an operator somehow resumes B and lets replication 
continue, then the history of X-1 on A and on B might differ (I'm still 
working on a local test case that reproduces this). What do you think?
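
   To make the deletion vs. compaction distinction concrete, here is a toy 
sketch. It is not Kafka code: the `(offset, key, value)` tuples and the 
helpers `delete_old_segments` and `compact` are hypothetical and heavily 
simplified. It only shows that deletion trims a prefix of the log, whereas 
compaction removes records from the middle, so two copies cleaned up to 
different points can end up holding different sets of offsets.

```python
# Toy model of a partition log: a list of (offset, key, value) records.
# Not Kafka code; both helpers are hypothetical simplifications.

def delete_old_segments(log, log_start_offset):
    """Retention-style deletion: drop every record before log_start_offset."""
    return [r for r in log if r[0] >= log_start_offset]


def compact(log, clean_up_to):
    """Compaction-style cleaning: below clean_up_to keep only the last
    record per key; records at or above clean_up_to are left untouched."""
    latest = {}
    for offset, key, _value in log:
        if offset < clean_up_to:
            latest[key] = offset
    return [r for r in log if r[0] >= clean_up_to or latest[r[1]] == r[0]]


log = [(0, "k1", "a"), (1, "k2", "b"), (2, "k1", "c"), (3, "k2", "d")]

# Deletion only trims the head, so both copies keep the same suffix.
deleted_a = delete_old_segments(log, 2)
deleted_b = delete_old_segments(log, 2)

# Compaction that has progressed to different points leaves different records.
compacted_a = compact(log, 4)  # cleaned the whole log
compacted_b = compact(log, 2)  # cleaned only offsets 0 and 1
```

   In this sketch `deleted_a` and `deleted_b` are identical, while 
`compacted_a` has lost offsets 0 and 1 and `compacted_b` still has them. 
Whether such a difference is actually harmful once B is resumed is the part 
I cannot confirm yet.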


