RaulGracia opened a new issue #2485:
URL: https://github.com/apache/bookkeeper/issues/2485
**QUESTION**
We have run a set of experiments in which we restart a Kubernetes node with
Bookies running on it. In one of the experiments, one Bookie was
not able to restart and failed with the following error:
```
2020-11-11 07:43:42,429 - INFO - [main:JournalChannel@154] - Opening journal /bk/journal/j1/current/175b5dae741.txn
2020-11-11 07:43:42,475 - ERROR - [main:Bookie@924] - Exception while replaying journals, shutting down
java.io.IOException: Missing ledger signature while reading header for /bk/index/current/1/9/109.idx
    at org.apache.bookkeeper.bookie.FileInfo.readHeader(FileInfo.java:224)
    at org.apache.bookkeeper.bookie.FileInfo.checkOpen(FileInfo.java:310)
    at org.apache.bookkeeper.bookie.FileInfo.checkOpen(FileInfo.java:278)
    at org.apache.bookkeeper.bookie.FileInfo.size(FileInfo.java:388)
    at org.apache.bookkeeper.bookie.IndexPersistenceMgr.updatePage(IndexPersistenceMgr.java:643)
    at org.apache.bookkeeper.bookie.IndexInMemPageMgr.grabLedgerEntryPage(IndexInMemPageMgr.java:447)
    at org.apache.bookkeeper.bookie.IndexInMemPageMgr.getLedgerEntryPage(IndexInMemPageMgr.java:412)
    at org.apache.bookkeeper.bookie.IndexInMemPageMgr.putEntryOffset(IndexInMemPageMgr.java:571)
    at org.apache.bookkeeper.bookie.LedgerCacheImpl.putEntryOffset(LedgerCacheImpl.java:103)
    at org.apache.bookkeeper.bookie.InterleavedLedgerStorage.processEntry(InterleavedLedgerStorage.java:530)
    at org.apache.bookkeeper.bookie.InterleavedLedgerStorage.processEntry(InterleavedLedgerStorage.java:512)
    at org.apache.bookkeeper.bookie.InterleavedLedgerStorage.addEntry(InterleavedLedgerStorage.java:366)
    at org.apache.bookkeeper.bookie.LedgerDescriptorImpl.addEntry(LedgerDescriptorImpl.java:153)
    at org.apache.bookkeeper.bookie.Bookie$4.process(Bookie.java:888)
    at org.apache.bookkeeper.bookie.Journal.scanJournal(Journal.java:821)
    at org.apache.bookkeeper.bookie.Journal.replay(Journal.java:866)
    at org.apache.bookkeeper.bookie.Bookie.readJournal(Bookie.java:901)
    at org.apache.bookkeeper.bookie.Bookie.start(Bookie.java:922)
    at org.apache.bookkeeper.proto.BookieServer.start(BookieServer.java:141)
    at org.apache.bookkeeper.server.service.BookieService.doStart(BookieService.java:58)
    at org.apache.bookkeeper.common.component.AbstractLifecycleComponent.start(AbstractLifecycleComponent.java:78)
    at org.apache.bookkeeper.common.component.LifecycleComponentStack.lambda$start$2(LifecycleComponentStack.java:113)
    at com.google.common.collect.ImmutableList.forEach(ImmutableList.java:408)
    at org.apache.bookkeeper.common.component.LifecycleComponentStack.start(LifecycleComponentStack.java:113)
    at org.apache.bookkeeper.common.component.ComponentStarter.startComponent(ComponentStarter.java:80)
    at org.apache.bookkeeper.server.Main.doMain(Main.java:229)
    at org.apache.bookkeeper.server.Main.main(Main.java:203)
2020-11-11 07:43:42,489 - INFO - [main:ZooKeeper@693] - Session: 0x10007c8a7a803d8 closed
2020-11-11 07:43:42,490 - INFO - [main-EventThread:ClientCnxn$EventThread@522] - EventThread shut down for session: 0x10007c8a7a803d8
2020-11-11 07:43:42,546 - INFO - [vert.x-eventloop-thread-0:VertxHttpServer$2@79] - Starting Vertx HTTP server on port 8080
```
After this, the Bookie process stays up but is unable to do any IO, so it
also fails the readiness probes of the [Bookkeeper
Operator](https://github.com/pravega/bookkeeper-operator). My questions
about this problem are the following:
- Is this `Missing ledger signature` error expected when rebooting
the node on which a Bookie runs? It looks like a form of data corruption/loss,
but I would like confirmation from the Bookkeeper community.
- Once this error (or a similar one) occurs, what is the best
course of action? Should we decommission the Bookie and create a new one? I
would like to know the recommended approach for handling this situation.
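
To make the first question more concrete: judging by the stack trace, the exception comes from the header check in `FileInfo.readHeader`, which appears to expect each index file to begin with a fixed signature. Below is a minimal standalone sketch of that kind of check, which could be run against files such as `/bk/index/current/1/9/109.idx`. It assumes the signature is the four ASCII bytes `BKLE`, which is my reading of the BookKeeper source and should be verified against the version in use. The class and method names (`IndexHeaderCheck`, `hasLedgerSignature`) are hypothetical and not part of BookKeeper.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

// Hypothetical helper, not part of BookKeeper.
public class IndexHeaderCheck {
    // ASSUMPTION: index files start with the 4-byte ASCII magic "BKLE".
    static final byte[] SIGNATURE = "BKLE".getBytes(StandardCharsets.US_ASCII);

    // Returns true only if the file has at least 4 bytes and they match SIGNATURE.
    static boolean hasLedgerSignature(Path file) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(SIGNATURE.length);
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // Read until the buffer is full or end of file is reached.
            while (buf.hasRemaining() && ch.read(buf) >= 0) {
                // keep reading
            }
        }
        // A truncated (or empty) file leaves the buffer partially filled.
        return !buf.hasRemaining() && Arrays.equals(buf.array(), SIGNATURE);
    }

    public static void main(String[] args) throws IOException {
        // Usage: java IndexHeaderCheck <index-file> [<index-file> ...]
        for (String arg : args) {
            Path p = Paths.get(arg);
            String verdict = hasLedgerSignature(p) ? "signature present" : "signature MISSING";
            System.out.println(p + ": " + verdict);
        }
    }
}
```

If a check like this reports a missing signature for a zero-length or zero-filled file, that would suggest the header was never flushed to disk before the node went down, rather than that the file was corrupted later.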
Thanks in advance.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]