Josh Elser created ACCUMULO-1840:
------------------------------------
Summary: Deleted WAL still referenced in !METADATA
Key: ACCUMULO-1840
URL: https://issues.apache.org/jira/browse/ACCUMULO-1840
Project: Accumulo
Issue Type: Bug
Components: gc, master, tserver
Reporter: Josh Elser
Priority: Critical
Fix For: 1.6.0
Running b0da55 locally with hadoop2.2.0.
I wrote some data to a number of tables, then killed the tabletserver. The
tabletserver failed to complete recovery because it ran out of memory, so
I increased the tserver's heap and restarted it. !METADATA and !!ROOT are online
and have all tablets assigned, but none of the other tables are coming online.
Each of the tables that aren't online has only one tablet, and there's a log
entry for each of them:
{noformat}
log:localhost+9997/hdfs://localhost:8020/accumulo1.6/wal/localhost+9997/3f941aa2-0cf8-42f5-8135-fb51a07c5c1a
{noformat}
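For anyone comparing these entries against the GC log, here is a minimal Python sketch that splits such an entry into its parts. The layout it assumes (`log:<tserver>/<WAL path>`) is inferred only from the single entry above, and the names `WalRef` and `parse_log_entry` are hypothetical, not part of Accumulo.

```python
from typing import NamedTuple


class WalRef(NamedTuple):
    """Hypothetical holder for the parts of one log entry."""
    tserver: str   # e.g. "localhost+9997"
    path: str      # full WAL path, e.g. "hdfs://.../wal/<tserver>/<uuid>"
    wal_id: str    # last path component (the WAL's UUID)


def parse_log_entry(entry: str) -> WalRef:
    """Split an entry of the assumed form 'log:<tserver>/<WAL path>'."""
    prefix = "log:"
    if not entry.startswith(prefix):
        raise ValueError("not a log entry: %r" % entry)
    # The tserver name ends at the first '/', everything after is the path.
    tserver, sep, path = entry[len(prefix):].partition("/")
    if not sep or not path:
        raise ValueError("missing WAL path in: %r" % entry)
    return WalRef(tserver, path, path.rsplit("/", 1)[-1])
```

The `wal_id` field is the string to search for in the GC's log.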
Looking for that WAL in the GC's log results in:
{noformat}
2013-10-31 21:03:26,805 [util.MetadataTableUtil] INFO : Returning logs [!!R<< hdfs://localhost:8020/accumulo1.6/wal/localhost+9997/56c7a37f-06bd-49ad-ad18-72dd787f8ae3 (1), !!R<< hdfs://localhost:8020/accumulo1.6/wal/localhost+9997/3f941aa2-0cf8-42f5-8135-fb51a07c5c1a (1)] for extent !!R<<
2013-10-31 21:03:26,815 [gc.GarbageCollectWriteAheadLogs] DEBUG: deleted [hdfs://localhost:8020/accumulo1.6/wal/localhost+9997/6f1facae-7d5f-4c74-8a41-8621921af85f, hdfs://localhost:8020/accumulo1.6/wal/localhost+9997/56c7a37f-06bd-49ad-ad18-72dd787f8ae3, hdfs://localhost:8020/accumulo1.6/wal/localhost+9997/3f941aa2-0cf8-42f5-8135-fb51a07c5c1a] from localhost+9997
2013-10-31 21:08:27,105 [util.MetadataTableUtil] INFO : Returning logs [!!R<< hdfs://localhost:8020/accumulo1.6/wal/localhost+9997/3f941aa2-0cf8-42f5-8135-fb51a07c5c1a (1)] for extent !!R<<
2013-10-31 21:08:27,115 [gc.GarbageCollectWriteAheadLogs] DEBUG: deleted [hdfs://localhost:8020/accumulo1.6/wal/localhost+9997/56c7a37f-06bd-49ad-ad18-72dd787f8ae3, hdfs://localhost:8020/accumulo1.6/wal/localhost+9997/3f941aa2-0cf8-42f5-8135-fb51a07c5c1a] from localhost+9997
2013-10-31 21:14:25,667 [gc.GarbageCollectWriteAheadLogs] DEBUG: deleted [hdfs://localhost:8020/accumulo1.6/wal/localhost+9997/20b1adb0-9c5f-4027-82d2-c8e1a1be9f87, hdfs://localhost:8020/accumulo1.6/wal/localhost+9997/3f941aa2-0cf8-42f5-8135-fb51a07c5c1a] from localhost+9997
2013-10-31 21:14:25,667 [gc.GarbageCollectWriteAheadLogs] DEBUG: Removing sorted WAL hdfs://localhost:8020/accumulo1.6/recovery/3f941aa2-0cf8-42f5-8135-fb51a07c5c1a
{noformat}