hamadodene commented on issue #2528:
URL: https://github.com/apache/bookkeeper/issues/2528#issuecomment-938549309
We encountered the same problem in production. We have seen several
exceptions such as:
```
21-09-29-02-11-13 Failed to compact entry log 8277 due to unexpected error
21-09-29-02-11-13 java.lang.IllegalArgumentException: Negative position
java.lang.IllegalArgumentException: Negative position
at java.base/sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:785)
at org.apache.bookkeeper.bookie.BufferedReadChannel.read(BufferedReadChannel.java:93)
at org.apache.bookkeeper.bookie.BufferedReadChannel.read(BufferedReadChannel.java:65)
at org.apache.bookkeeper.bookie.EntryLogger.readFromLogChannel(EntryLogger.java:418)
at org.apache.bookkeeper.bookie.EntryLogger.scanEntryLog(EntryLogger.java:996)
at org.apache.bookkeeper.bookie.EntryLogCompactor.compact(EntryLogCompactor.java:61)
at org.apache.bookkeeper.bookie.GarbageCollectorThread.compactEntryLog(GarbageCollectorThread.java:518)
at org.apache.bookkeeper.bookie.GarbageCollectorThread.doCompactEntryLogs(GarbageCollectorThread.java:455)
at org.apache.bookkeeper.bookie.GarbageCollectorThread.runWithFlags(GarbageCollectorThread.java:360)
at org.apache.bookkeeper.bookie.GarbageCollectorThread.safeRun(GarbageCollectorThread.java:309)
at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:834)
```
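For context, the `Negative position` message comes from the JDK rather than from BookKeeper: `FileChannel.read(ByteBuffer, long)` is documented to throw `IllegalArgumentException` when the position argument is negative. The sketch below reproduces that behaviour in isolation. The class name, the `readAt` helper, and the temp file are made up for illustration, and it does not show how BookKeeper ends up with a negative offset.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class, not part of BookKeeper.
public class NegativePositionDemo {

    /**
     * Performs a positional read on a small temp file.
     * Returns null if the read succeeds, or the exception message if the
     * JDK rejects the position with an IllegalArgumentException.
     */
    public static String readAt(long position) {
        try {
            Path tmp = Files.createTempFile("entrylog-demo", ".log");
            try (FileChannel ch = FileChannel.open(
                    tmp, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                // Write a few bytes so that a read at position 0 has data.
                ch.write(ByteBuffer.wrap(new byte[] {1, 2, 3, 4}));
                // Positional read; a negative position is rejected up front.
                ch.read(ByteBuffer.allocate(4), position);
                return null;
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("position 0  -> " + readAt(0));
        System.out.println("position -1 -> " + readAt(-1));
    }
}
```

If the entry log is corrupted, a bad length or offset read from the file is one plausible way to end up with a negative position, but that is an assumption and not something this snippet verifies.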
Several days later, the HerdDB service, which relies on BookKeeper for
replication, had to recover its tables.
The recovery failed with this exception:
```
21-10-07-16-24-17 herddb.core.DBManager Oct 07, 2021 4:24:17 PM herddb.core.DBManager manageTableSpaces
SEVERE: cannot handle tablespace q103
herddb.log.LogNotAvailableException: org.apache.bookkeeper.client.BKException$BKDigestMatchException: Entry digest does not match
at herddb.cluster.BookkeeperCommitLog.recovery(BookkeeperCommitLog.java:683)
at herddb.core.TableSpaceManager.recover(TableSpaceManager.java:325)
at herddb.core.TableSpaceManager.start(TableSpaceManager.java:250)
at herddb.core.DBManager.handleTableSpace(DBManager.java:571)
at herddb.core.DBManager.manageTableSpaces(DBManager.java:1226)
at herddb.core.DBManager.executeActivator(DBManager.java:1172)
at herddb.core.DBManager.access$500(DBManager.java:120)
at herddb.core.DBManager$Activator.run(DBManager.java:1115)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: org.apache.bookkeeper.client.BKException$BKDigestMatchException: Entry digest does not match
at org.apache.bookkeeper.client.BKException.create(BKException.java:70)
at org.apache.bookkeeper.client.PendingReadOp.submitCallback(PendingReadOp.java:640)
at org.apache.bookkeeper.client.PendingReadOp$LedgerEntryRequest.fail(PendingReadOp.java:171)
at org.apache.bookkeeper.client.PendingReadOp$SequenceReadRequest.sendNextRead(PendingReadOp.java:393)
at org.apache.bookkeeper.client.PendingReadOp$SequenceReadRequest.logErrorAndReattemptRead(PendingReadOp.java:436)
at org.apache.bookkeeper.client.PendingReadOp$LedgerEntryRequest.complete(PendingReadOp.java:142)
at org.apache.bookkeeper.client.PendingReadOp$SequenceReadRequest.complete(PendingReadOp.java:442)
at org.apache.bookkeeper.client.PendingReadOp.readEntryComplete(PendingReadOp.java:590)
at org.apache.bookkeeper.proto.PerChannelBookieClient$ReadCompletion$1.readEntryComplete(PerChannelBookieClient.java:1836)
at org.apache.bookkeeper.proto.PerChannelBookieClient$ReadCompletion.handleReadResponse(PerChannelBookieClient.java:1917)
at org.apache.bookkeeper.proto.PerChannelBookieClient$ReadCompletion.handleV3Response(PerChannelBookieClient.java:1892)
at org.apache.bookkeeper.proto.PerChannelBookieClient$3.safeRun(PerChannelBookieClient.java:1447)
at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
... 1 more
```
We think this is due to corruption of the entry log.
Do you have any idea how we can solve this problem?
cc @eolivelli