Michael Berman created ACCUMULO-1651:
----------------------------------------

             Summary: GC removed WAL that master wasn't done with
                 Key: ACCUMULO-1651
                 URL: https://issues.apache.org/jira/browse/ACCUMULO-1651
             Project: Accumulo
          Issue Type: Bug
          Components: gc, master
    Affects Versions: 1.6.0
            Reporter: Michael Berman
            Assignee: Michael Berman


I have a master that's spinning trying to recover a walog that doesn't exist in 
HDFS.  It looks like the GC cleaned it up.  I was stopping and starting my 
cluster throughout this period, and there were at least a few minutes in which 
every service was talking SSL except the GC, so the GC couldn't receive thrift 
messages from other services, but [~vines] says this shouldn't affect the GC's 
deletion behavior.


Here are some relevant logs.  Note that the master thinks its logSet includes 
that file straight through the time the GC removed it.

GC:
{code}
2013-08-09 11:58:14,835 [util.MetadataTableUtil] INFO : Returning logs [!!R<< hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7 (1)] for extent !!R<<
2013-08-09 11:58:14,852 [gc.GarbageCollectWriteAheadLogs] DEBUG: Removing WAL for offline server hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7
2013-08-09 12:03:15,467 [util.MetadataTableUtil] INFO : Returning logs [!!R<< hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7 (1)] for extent !!R<<
{code}

Master:
{code}
2013-08-09 11:57:45,235 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:45,238 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:45,286 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:45,324 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:45,939 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:45,942 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:45,975 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:55,612 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:55,679 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:55,739 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:55,764 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:55,784 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:56,031 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:57:56,046 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:58:56,051 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 11:59:56,057 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 12:00:56,062 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 12:01:56,066 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 12:02:56,071 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 12:08:56,103 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 12:09:56,108 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 12:10:56,113 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 12:11:56,118 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 12:13:19,883 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 12:14:19,887 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
<master was restarted here>
2013-08-09 12:15:44,459 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 12:15:44,467 [recovery.RecoveryManager] DEBUG: Recovering hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7 to hdfs://localhost:54310/otherAccumuloInstance/recovery/5a383792-c89b-41ed-bc22-0802e76638f7
2013-08-09 12:15:44,472 [recovery.RecoveryManager] INFO : Starting recovery of hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7 (in : 10s) created for localhost+9997, tablet !!R<< holds a reference
2013-08-09 12:15:54,479 [recovery.RecoveryManager] DEBUG: Unable to initate log sort for hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7: java.io.FileNotFoundException: java.io.FileNotFoundException: File not found /otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7
2013-08-09 12:16:44,487 [state.ZooTabletStateStore] DEBUG: root tablet logSet [localhost+9997/hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7]
2013-08-09 12:16:44,488 [recovery.RecoveryManager] DEBUG: Recovering hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7 to hdfs://localhost:54310/otherAccumuloInstance/recovery/5a383792-c89b-41ed-bc22-0802e76638f7
2013-08-09 12:16:44,490 [recovery.RecoveryManager] INFO : Starting recovery of hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7 (in : 20s) created for localhost+9997, tablet !!R<< holds a reference
2013-08-09 12:17:04,494 [recovery.RecoveryManager] DEBUG: Unable to initate log sort for hdfs://localhost:54310/otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7: java.io.FileNotFoundException: java.io.FileNotFoundException: File not found /otherAccumuloInstance/wal/localhost+9997/5a383792-c89b-41ed-bc22-0802e76638f7
<repeating ad infinitum>
{code}
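
The timeline in the two logs above (the GC removes a WAL while the master's root tablet logSet still lists it) can be cross-checked mechanically. The following is only a rough sketch, not part of Accumulo: the helper names `records` and `stale_references` are hypothetical, the regular expressions are inferred solely from the log lines quoted in this report, and it assumes that multiple logSet entries, if present, would be comma-separated.

```python
import re
from datetime import datetime

# Patterns inferred from the log excerpts in this report (assumptions).
TS = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) ")
REMOVED = re.compile(r"Removing WAL for offline server\s+(\S+)")
LOGSET = re.compile(r"root tablet logSet\s+\[([^\]]*)\]")


def _parse(record):
    """Split one joined log record into (timestamp, message)."""
    m = TS.match(record)
    ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S,%f")
    return ts, record[m.end():]


def records(text):
    """Yield (timestamp, message) pairs, re-joining hard-wrapped lines.

    A line that does not start with a timestamp is treated as a
    continuation of the previous record.
    """
    current = None
    for line in text.splitlines():
        if TS.match(line):
            if current:
                yield _parse(current)
            current = line.strip()
        elif current:
            current += " " + line.strip()
    if current:
        yield _parse(current)


def stale_references(gc_log, master_log):
    """Return (path, removed_at, seen_at) for every logSet entry that the
    master still reported after the GC logged removing that WAL."""
    removed = {}
    for ts, msg in records(gc_log):
        m = REMOVED.search(msg)
        if m:
            removed.setdefault(m.group(1), ts)  # keep the first removal time

    stale = []
    for ts, msg in records(master_log):
        m = LOGSET.search(msg)
        if not m:
            continue
        for entry in m.group(1).split(","):
            # Entries look like "<server>/<wal path>"; drop the server part.
            _, _, path = entry.strip().partition("/")
            if path in removed and ts > removed[path]:
                stale.append((path, removed[path], ts))
    return stale
```

Run against the excerpts above, every master logSet line stamped after 11:58:14,852 would be reported, which matches the observation that the master never dropped its reference to the deleted file.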
