[GitHub] [kafka] ccding opened a new pull request #11351: KAFKA-13315: log layer exception during shutdown that caused an unclean shutdown

GitBox Tue, 21 Sep 2021 14:34:37 -0700


ccding opened a new pull request #11351:
URL: https://github.com/apache/kafka/pull/11351

We have seen an exception caused by shutting down the scheduler before
shutting down LogManager.

While LogManager was closing partitions one by one, the scheduler ran the
retention task to delete old segments. However, LogManager may already have
closed those segments, which caused an exception and subsequently marked the
log directory as offline. As a result, the broker didn't flush the remaining
partitions and didn't write the clean shutdown marker. Ultimately, the broker
took hours to recover the log during restart.
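
To make the failure sequence above concrete, here is a minimal, single-threaded toy model. It is a sketch under stated assumptions, not Kafka code: `ShutdownOrderSketch`, `Segment`, `Result`, and `shutdown(...)` are hypothetical names, and the interleaving is forced deterministically instead of coming from a real scheduler thread. It only shows how a retention task that touches an already-closed segment can take the log directory offline and skip the clean shutdown marker, and how stopping that task first avoids it.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these classes are hypothetical stand-ins, not Kafka's
// LogManager, KafkaScheduler, or LogSegment.
class ShutdownOrderSketch {

    static class Segment {
        private boolean closed = false;

        void close() { closed = true; }

        // Deleting a closed segment fails, similar to using a closed file handle.
        void delete() {
            if (closed) throw new IllegalStateException("segment already closed");
        }
    }

    static class Result {
        boolean logDirOffline = false;
        int partitionsFlushed = 0;
        boolean cleanShutdownMarkerWritten = false;
    }

    // Simulates closing partitions one by one. If the retention task is still
    // active, it fires right after a segment is closed (the assumed bad interleaving).
    static Result shutdown(boolean stopRetentionFirst, int partitionCount) {
        List<Segment> oldSegments = new ArrayList<>();
        for (int i = 0; i < partitionCount; i++) oldSegments.add(new Segment());

        boolean retentionRunning = !stopRetentionFirst;
        Result result = new Result();

        for (Segment segment : oldSegments) {
            segment.close();              // flush and close this partition
            result.partitionsFlushed++;

            if (retentionRunning) {
                try {
                    segment.delete();     // retention hits the closed segment
                } catch (IllegalStateException e) {
                    result.logDirOffline = true;
                    break;                // remaining partitions are never flushed
                }
            }
        }

        // The marker is only written if every partition was closed cleanly.
        if (!result.logDirOffline) result.cleanShutdownMarkerWritten = true;
        return result;
    }

    public static void main(String[] args) {
        Result racy = shutdown(false, 3);
        System.out.println("retention still running: offline=" + racy.logDirOffline
                + " marker=" + racy.cleanShutdownMarkerWritten);
        Result ordered = shutdown(true, 3);
        System.out.println("retention stopped first: offline=" + ordered.logDirOffline
                + " marker=" + ordered.cleanShutdownMarkerWritten);
    }
}
```

In a real broker the interleaving depends on thread timing, so the failure is intermittent; the model just pins down the one ordering that matters.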

This PR essentially reverts https://github.com/apache/kafka/pull/10538

I believe the exception that https://github.com/apache/kafka/pull/10538 observed
is thrown at
https://github.com/apache/kafka/blob/5a6f19b2a1ff72c52ad627230ffdf464456104ee/core/src/main/scala/kafka/log/LocalLog.scala#L895-L903
which calls the scheduler and crashed the compaction thread. The effect of
this exception has been mitigated by https://github.com/apache/kafka/pull/10763

cc @rondagostino @ijuma @cmccabe @junrao @dhruvilshah3 as authors/reviewers
of the PRs mentioned above to make sure this change looks okay.

### Committer Checklist (excluded from commit message)
- [ ] Verify design and implementation
- [ ] Verify test coverage and CI build status
- [ ] Verify documentation (including upgrade notes)

