Michael Berman created ACCUMULO-1651:
----------------------------------------
Summary: GC removed WAL that master wasn't done with
Key: ACCUMULO-1651
URL: https://issues.apache.org/jira/browse/ACCUMULO-1651
Project: Accumulo
Issue Type: Bug
Components: gc, master
Affects Versions: 1.6.0
Reporter: Michael Berman
Assignee: Michael Berman
I have a master that's spinning trying to recover a walog that doesn't exist in
HDFS. It looks like the GC cleaned it up. I was stopping and starting my
cluster throughout this period, and for at least a few minutes every service
was talking SSL except the GC, so the GC couldn't receive Thrift messages from
other services. [~vines] says this shouldn't affect the GC's deletion behavior.
Here are some relevant logs. Note that the master thinks its logSet includes
that file straight through the time the GC removed it.
GC:
{code}
2013-08-09 11:58:14,835 [util.MetadataTableUtil] INFO : Returning logs [!!R<<
hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7
(1)] for extent !!R<<
2013-08-09 11:58:14,852 [gc.GarbageCollectWriteAheadLogs] DEBUG: Removing WAL
for offline server
hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7
2013-08-09 12:03:15,467 [util.MetadataTableUtil] INFO : Returning logs [!!R<<
hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7
(1)] for extent !!R<<
{code}
Master:
{code}
2013-08-09 11:57:45,235 [state.ZooTabletStateStore] DEBUG: root tablet logSet
[localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:45,238 [state.ZooTabletStateStore] DEBUG: root tablet logSet
[localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:45,286 [state.ZooTabletStateStore] DEBUG: root tablet logSet
[localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:45,324 [state.ZooTabletStateStore] DEBUG: root tablet logSet
[localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:45,939 [state.ZooTabletStateStore] DEBUG: root tablet logSet
[localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:45,942 [state.ZooTabletStateStore] DEBUG: root tablet logSet
[localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:45,975 [state.ZooTabletStateStore] DEBUG: root tablet logSet
[localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:55,612 [state.ZooTabletStateStore] DEBUG: root tablet logSet
[localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:55,679 [state.ZooTabletStateStore] DEBUG: root tablet logSet
[localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:55,739 [state.ZooTabletStateStore] DEBUG: root tablet logSet
[localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:55,764 [state.ZooTabletStateStore] DEBUG: root tablet logSet
[localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:55,784 [state.ZooTabletStateStore] DEBUG: root tablet logSet
[localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:56,031 [state.ZooTabletStateStore] DEBUG: root tablet logSet
[localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:56,046 [state.ZooTabletStateStore] DEBUG: root tablet logSet
[localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:58:56,051 [state.ZooTabletStateStore] DEBUG: root tablet logSet
[localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:59:56,057 [state.ZooTabletStateStore] DEBUG: root tablet logSet
[localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 12:00:56,062 [state.ZooTabletStateStore] DEBUG: root tablet logSet
[localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 12:01:56,066 [state.ZooTabletStateStore] DEBUG: root tablet logSet
[localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 12:02:56,071 [state.ZooTabletStateStore] DEBUG: root tablet logSet
[localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 12:08:56,103 [state.ZooTabletStateStore] DEBUG: root tablet logSet
[localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 12:09:56,108 [state.ZooTabletStateStore] DEBUG: root tablet logSet
[localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 12:10:56,113 [state.ZooTabletStateStore] DEBUG: root tablet logSet
[localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 12:11:56,118 [state.ZooTabletStateStore] DEBUG: root tablet logSet
[localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 12:13:19,883 [state.ZooTabletStateStore] DEBUG: root tablet logSet
[localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 12:14:19,887 [state.ZooTabletStateStore] DEBUG: root tablet logSet
[localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
<master was restarted here>
2013-08-09 12:15:44,459 [state.ZooTabletStateStore] DEBUG: root tablet logSet
[localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 12:15:44,467 [recovery.RecoveryManager] DEBUG: Recovering
hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7
to
hdfs://localhost:54310/otherAccumuloInstance/recovery/5a383792-c89b-41ed-bc22-0802e76638f7
2013-08-09 12:15:44,472 [recovery.RecoveryManager] INFO : Starting recovery of
hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7
(in : 10s) created for localhost+9997, tablet !!R<< holds a reference
2013-08-09 12:15:54,479 [recovery.RecoveryManager] DEBUG: Unable to initate log
sort for
hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7:
java.io.FileNotFoundException: java.io.FileNotFoundException: File not found
/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7
2013-08-09 12:16:44,487 [state.ZooTabletStateStore] DEBUG: root tablet logSet
[localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 12:16:44,488 [recovery.RecoveryManager] DEBUG: Recovering
hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7
to
hdfs://localhost:54310/otherAccumuloInstance/recovery/5a383792-c89b-41ed-bc22-0802e76638f7
2013-08-09 12:16:44,490 [recovery.RecoveryManager] INFO : Starting recovery of
hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7
(in : 20s) created for localhost+9997, tablet !!R<< holds a reference
2013-08-09 12:17:04,494 [recovery.RecoveryManager] DEBUG: Unable to initate log
sort for
hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7:
java.io.FileNotFoundException: java.io.FileNotFoundException: File not found
/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7
<repeating ad infinitum>
{code}
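The logs above point at the invariant that appears to have been violated: a WAL should only be removed once no tablet's logSet still references it. Below is a minimal sketch of that check. It is not Accumulo's actual GC code; the `WalGcGuard` class and `safeToDelete` method are hypothetical names, and the "server/path" entry shape is an assumption taken from the master's logSet output above.

```java
import java.util.List;
import java.util.Set;

/** Hypothetical guard for illustration only; not Accumulo's GarbageCollectWriteAheadLogs. */
final class WalGcGuard {

    /**
     * Returns true only if no logSet entry still points at walPath.
     * An entry is assumed to be either the bare path or "server/path",
     * as in the master's "root tablet logSet" debug lines.
     */
    static boolean safeToDelete(String walPath, List<Set<String>> tabletLogSets) {
        for (Set<String> logSet : tabletLogSets) {
            for (String entry : logSet) {
                if (entry.equals(walPath) || entry.endsWith("/" + walPath)) {
                    return false; // a tablet still needs this WAL for recovery
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String wal = "hdfs://localhost:54310/otherAccumuloInstance/wal/"
                + "localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7";
        // The root tablet's logSet as the master reported it.
        Set<String> rootLogSet = Set.of("localhost+9997/" + wal);

        System.out.println(safeToDelete(wal, List.of(rootLogSet))); // prints false
        Set<String> emptyLogSet = Set.of();
        System.out.println(safeToDelete(wal, List.of(emptyLogSet))); // prints true
    }
}
```

With the logSet from this report, the check refuses the delete, which is the behavior the master needed from the GC at 11:58:14.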