milleruntime commented on issue #1844:
URL: https://github.com/apache/accumulo/issues/1844#issuecomment-757990015
I saw this error happen, which I think is a similar situation. This error
does provide a message about the sort not being finished.
<pre>
2021-01-11 07:51:37,858 [tserver.TabletServer] WARN : exception trying to assign tablet !0;2ac755608c39162d;26dbd6f6192bcaee hdfs://localhost:8020/accumulo/tables/!0/t-00001z4
java.lang.RuntimeException: Error recovering tablet !0;2ac755608c39162d;26dbd6f6192bcaee from log files
    at org.apache.accumulo.tserver.tablet.Tablet.<init>(Tablet.java:499)
    at org.apache.accumulo.tserver.TabletServer$AssignmentHandler.run(TabletServer.java:2413)
    at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
    at org.apache.accumulo.tserver.ActiveAssignmentRunnable.run(ActiveAssignmentRunnable.java:64)
    at org.apache.htrace.wrappers.TraceRunnable.run(TraceRunnable.java:57)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
    at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.io.IOException: java.io.IOException: Sort "finished" flag not found in hdfs://localhost:8020/accumulo/recovery/5bce00d4-1920-416a-8bfb-41e4cf16e9f6
    at org.apache.accumulo.tserver.log.TabletServerLogger.recover(TabletServerLogger.java:608)
    at org.apache.accumulo.tserver.TabletServer.recover(TabletServer.java:3315)
    at org.apache.accumulo.tserver.tablet.Tablet.<init>(Tablet.java:437)
    ... 8 more
Caused by: java.io.IOException: Sort "finished" flag not found in hdfs://localhost:8020/accumulo/recovery/5bce00d4-1920-416a-8bfb-41e4cf16e9f6
    at org.apache.accumulo.tserver.log.RecoveryLogReader.<init>(RecoveryLogReader.java:138)
    at org.apache.accumulo.tserver.log.RecoveryLogsIterator.<init>(RecoveryLogsIterator.java:56)
    at org.apache.accumulo.tserver.log.SortedLogRecovery.findMaxTabletId(SortedLogRecovery.java:104)
    at org.apache.accumulo.tserver.log.SortedLogRecovery.findLogsThatDefineTablet(SortedLogRecovery.java:144)
    at org.apache.accumulo.tserver.log.SortedLogRecovery.recover(SortedLogRecovery.java:303)
    at org.apache.accumulo.tserver.log.TabletServerLogger.recover(TabletServerLogger.java:606)
</pre>
I believe the other error is being thrown in the same situation but only if
the finished flag doesn't exist here:
https://github.com/apache/accumulo/blob/8d6aae599821cb43e2d320ebee5dbb480e1d41a3/server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServer.java#L3303-L3312
I am not sure how the recovery can pass the first check for the finished
file but then fail later, saying it's not found. Since there isn't anything
removing the file, once the finished file exists it should remain until the
WAL is removed by the GC.
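
To make the puzzle concrete, here is a rough sketch of the check-then-read pattern described above. It is not Accumulo's code: the class and method names are hypothetical, the local filesystem stands in for HDFS, and the marker file name `finished` is assumed from the error message. It only shows that the second check can fail if something changes the directory between the two checks.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Sketch only: hypothetical names, local filesystem used in place of HDFS.
class FinishedFlagSketch {

  // Assumed marker name, taken from the "finished" flag in the error message.
  static final String FINISHED = "finished";

  // First check: every recovery directory must contain the marker file.
  static boolean allSorted(List<Path> recoveryDirs) {
    for (Path dir : recoveryDirs) {
      if (!Files.exists(dir.resolve(FINISHED))) {
        return false;
      }
    }
    return true;
  }

  // Second check: repeated later, when a directory is opened for reading.
  static void openForRead(Path recoveryDir) throws IOException {
    if (!Files.exists(recoveryDir.resolve(FINISHED))) {
      throw new IOException("Sort \"finished\" flag not found in " + recoveryDir);
    }
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("recovery");
    Files.createFile(dir.resolve(FINISHED));
    System.out.println("first check passed: " + allSorted(List.of(dir)));

    // Simulate the marker disappearing between the two checks.
    Files.delete(dir.resolve(FINISHED));
    try {
      openForRead(dir);
    } catch (IOException e) {
      System.out.println("second check failed: " + e.getMessage());
    }
  }
}
```

If nothing deletes the marker, the second check cannot fail in this sketch, so the real failure would need either a removal in between or the two checks looking at different paths.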