hangc0276 opened a new pull request, #3956: URL: https://github.com/apache/bookkeeper/pull/3956
### Motivation This PR is generated from https://github.com/apache/bookkeeper/pull/3437 ### Changes When we use `kill -9` command to stop one bookie pod, the journal file may be broken. If we try to start up the bookie pod again, the pod will startup to fail with the following exception. ``` 10:15:55.026 [main] ERROR org.apache.bookkeeper.bookie.Bookie - Exception while replaying journals, shutting down java.io.IOException: Invalid record found with negative length -448299468 at org.apache.bookkeeper.bookie.Journal.scanJournal(Journal.java:821) ~[org.apache.bookkeeper-bookkeeper-server-4.12.0.jar:4.12.0] at org.apache.bookkeeper.bookie.Bookie.replay(Bookie.java:945) ~[org.apache.bookkeeper-bookkeeper-server-4.12.0.jar:4.12.0] at org.apache.bookkeeper.bookie.Bookie.readJournal(Bookie.java:911) ~[org.apache.bookkeeper-bookkeeper-server-4.12.0.jar:4.12.0] at org.apache.bookkeeper.bookie.Bookie.start(Bookie.java:965) ~[org.apache.bookkeeper-bookkeeper-server-4.12.0.jar:4.12.0] at org.apache.bookkeeper.proto.BookieServer.start(BookieServer.java:156) ~[org.apache.bookkeeper-bookkeeper-server-4.12.0.jar:4.12.0] at org.apache.bookkeeper.server.service.BookieService.doStart(BookieService.java:68) ~[org.apache.bookkeeper-bookkeeper-server-4.12.0.jar:4.12.0] at org.apache.bookkeeper.common.component.AbstractLifecycleComponent.start(AbstractLifecycleComponent.java:83) ~[org.apache.bookkeeper-bookkeeper-common-4.12.0.jar:4.12.0] at org.apache.bookkeeper.common.component.LifecycleComponentStack.lambda$start$4(LifecycleComponentStack.java:144) ~[org.apache.bookkeeper-bookkeeper-common-4.12.0.jar:4.12.0] at com.google.common.collect.ImmutableList.forEach(ImmutableList.java:406) [com.google.guava-guava-30.1-jre.jar:?] at org.apache.bookkeeper.common.component.LifecycleComponentStack.start(LifecycleComponentStack.java:144) [org.apache.bookkeeper-bookkeeper-common-4.12.0.jar:4.12.0] at org.apache.bookkeeper.common.component.ComponentStarter.startComponent(ComponentStarter.java:85) [org.apache.bookkeeper-bookkeeper-common-4.12.0.jar:4.12.0] at org.apache.bookkeeper.server.Main.doMain(Main.java:234) [org.apache.bookkeeper-bookkeeper-server-4.12.0.jar:4.12.0] at org.apache.bookkeeper.server.Main.main(Main.java:208) [org.apache.bookkeeper-bookkeeper-server-4.12.0.jar:4.12.0] 10:15:55.134 [main] INFO org.apache.zookeeper.ZooKeeper - Session: 0x200064a14681be2 closed ``` We should have a way to make the bookie pod startup instead of decommissioning it. ### Changes 1. Add a configuration to allow skipping the invalid entries when journal replying to journal files. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
