[GitHub] [kafka] viktorsomogyi commented on pull request #13421: KAFKA-14824: ReplicaAlterLogDirsThread may cause serious disk growing in case of potential exception

2023-06-19 Thread via GitHub


viktorsomogyi commented on PR #13421:
URL: https://github.com/apache/kafka/pull/13421#issuecomment-1596994184

   > In addition, if we add integration tests, put them in this PR, or need to open another PR?
   Let's add them to this PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [kafka] viktorsomogyi commented on pull request #13421: KAFKA-14824: ReplicaAlterLogDirsThread may cause serious disk growing in case of potential exception

2023-06-15 Thread via GitHub


viktorsomogyi commented on PR #13421:
URL: https://github.com/apache/kafka/pull/13421#issuecomment-1593258910

   It seems like @clolov is right. I tested it in both quorum and ZK mode, and 
Kafka successfully reconciles the questionable case (when X-1 on B comes back 
after A has compacted the logs), so I think it's fine to merge this PR.
   
   I was also thinking of creating an integration test for this, but it's hard 
to simulate disk errors in Java and we can't make any assumptions about where 
the tests run, so I think that should be a separate task, as it's out of scope 
for this one. If you folks know a good fault injection framework, I'm all ears.
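
As a rough illustration of what in-process fault injection could look like without depending on the test host's disks, here is a minimal sketch. It is an assumption-laden toy, not something from this PR: `FailingOutputStreamSketch`, `FailAfterStream`, and `writeUntilFailure` are hypothetical names, and the sketch wraps a plain `OutputStream` rather than the I/O paths Kafka's log layer actually uses, so wiring anything like it into a broker test is not shown here.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

class FailingOutputStreamSketch {

    /** Hypothetical wrapper: throws IOException once `limit` bytes have been written. */
    static final class FailAfterStream extends FilterOutputStream {
        private long remaining;

        FailAfterStream(OutputStream out, long limit) {
            super(out);
            this.remaining = limit;
        }

        // FilterOutputStream routes write(byte[]) and write(byte[], int, int)
        // through this single-byte method, so one override covers all writes.
        @Override
        public void write(int b) throws IOException {
            if (remaining <= 0) {
                throw new IOException("Injected disk failure");
            }
            remaining--;
            out.write(b);
        }
    }

    /** Writes `data` through the failing wrapper and returns how many bytes reached the sink. */
    static int writeUntilFailure(byte[] data, long limit) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = new FailAfterStream(sink, limit)) {
            out.write(data);
        } catch (IOException e) {
            // Expected when data.length exceeds limit: the simulated "disk" failed mid-write.
        }
        return sink.size();
    }

    public static void main(String[] args) {
        System.out.println("Bytes written before failure: " + writeUntilFailure(new byte[10], 4));
    }
}
```

The point of the design is that the failure is triggered deterministically by a byte budget, so a test does not need a full or faulty disk on the machine where it runs.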
   
   I'll come back tomorrow for a last round of review and if I find everything 
fine, I'll merge this.





[GitHub] [kafka] viktorsomogyi commented on pull request #13421: KAFKA-14824: ReplicaAlterLogDirsThread may cause serious disk growing in case of potential exception

2023-06-08 Thread via GitHub


viktorsomogyi commented on PR #13421:
URL: https://github.com/apache/kafka/pull/13421#issuecomment-1582838059

   I have some context on the replica fetcher area (mostly from reading and 
debugging), so I hope I can help.
   
   First, since the conversation is a bit long, let me summarize what I 
understand:
   - The problem is that disk A reaches its capacity limit
   - The solution is to move partition X-1 to disk B
   - During the reassignment, log cleaning is disabled on X-1 (which can 
therefore fill disk A)
   - The reassignment of X-1 fails; the copy on B is left there in a failed 
state, while X-1 on A keeps growing
   Is this correct?
   
   If it is, we may need to separate the deletion and compaction cases. I think 
resuming deletion is safe; resuming compaction, however, might not be, since 
compaction alters the log. If an operator somehow resumes B and lets 
replication continue, then the history of X-1 on A and B might be different 
(I'm still working on a local test case that reproduces this). What do you 
think?
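
To make the concern concrete, here is a toy model of how compacting one copy while the other is stalled could leave the two with different histories. This is only a sketch under stated assumptions: `CompactionDivergenceSketch`, `Rec`, `compact`, and `offsets` are hypothetical names, the "compaction" simply keeps the latest record per key, and none of it reflects Kafka's actual log cleaner or how the broker reconciles replicas.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CompactionDivergenceSketch {

    /** Toy record; not Kafka's on-disk record format. */
    static final class Rec {
        final long offset;
        final String key;
        final String value;

        Rec(long offset, String key, String value) {
            this.offset = offset;
            this.key = key;
            this.value = value;
        }

        @Override
        public String toString() {
            return offset + ":" + key + "=" + value;
        }
    }

    /** Toy compaction: keep only the latest record per key, in offset order. */
    static List<Rec> compact(List<Rec> log) {
        Map<String, Rec> latest = new LinkedHashMap<>();
        for (Rec r : log) {
            latest.remove(r.key); // re-insert so iteration order follows the newest offset
            latest.put(r.key, r);
        }
        return new ArrayList<>(latest.values());
    }

    /** Returns the offsets present in a log, in order. */
    static List<Long> offsets(List<Rec> log) {
        List<Long> result = new ArrayList<>();
        for (Rec r : log) {
            result.add(r.offset);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Rec> onA = Arrays.asList(
                new Rec(0, "k1", "v1"),
                new Rec(1, "k2", "v1"),
                new Rec(2, "k1", "v2"));
        // Assumption: the copy on B stalled after copying offsets 0 and 1.
        List<Rec> onB = new ArrayList<>(onA.subList(0, 2));

        System.out.println("A after compaction: " + compact(onA));
        System.out.println("B (stalled copy):   " + onB);
    }
}
```

In this toy run, the record at offset 0 survives only on B, because A's compaction dropped it in favor of the newer value for the same key.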





[GitHub] [kafka] viktorsomogyi commented on pull request #13421: KAFKA-14824: ReplicaAlterLogDirsThread may cause serious disk growing in case of potential exception

2023-06-06 Thread via GitHub


viktorsomogyi commented on PR #13421:
URL: https://github.com/apache/kafka/pull/13421#issuecomment-1578862889

   @hudeqi I added myself as a reviewer. I may not have time to review this 
today, but I will get to it this week.

