Ben Popp created ACCUMULO-2166:
----------------------------------

             Summary: tserver recovering tablet on startup fails with "Garbage 
collection may be interfering with lock keep-alive"
                 Key: ACCUMULO-2166
                 URL: https://issues.apache.org/jira/browse/ACCUMULO-2166
             Project: Accumulo
          Issue Type: Bug
            Reporter: Ben Popp


This happened on a locally built 1.6.0-SNAPSHOT based on SHA 
417902e218c566333b6ea5ac492186ae305e5e16 (Dec 31st), running alongside an 
Apache Hadoop 1.2.1 install. 

I had a single-node deployment on my laptop, which had probably been 
interrupted by the laptop going to sleep during an ingest.  

When I restarted the tserver, I got a "FATAL: Garbage collection may be 
interfering with lock keep-alive.  Halting." log message after some recovery 
attempts, and the tserver shut down.  

I restarted twice more and got similar results each time.  On the next attempt, 
[~vines] and I attached a debugger to the tserver on startup, and recovery 
completed without any failures!  I have restarted the tserver since then with 
no issues. 

The only notable thing about my workspace (since it affects the startup paths) 
is that I'm using some custom Authorizer and Authenticator plugins in 
Accumulo. 

Log excerpts: 

{noformat} 
==============
FIRST TIME 
==============

2014-01-09 12:42:58,241 [server.Accumulo] INFO : tserver starting
...snip...
2014-01-09 12:44:47,820 [tserver.Tablet] INFO : Starting Write-Ahead Log 
recovery for d<;7006d
2014-01-09 12:44:47,821 [tserver.TabletServer] INFO : Looking for 
hdfs://localhost:9000/accumulo160b/recovery/2dceae9f-98c2-4f0c-9c66-27f4ab52113a/finished
2014-01-09 12:44:47,822 [tserver.TabletServer] INFO : Looking for 
hdfs://localhost:9000/accumulo160b/recovery/61b9709a-c945-41ae-946c-fc6425f4c96a/finished
2014-01-09 12:44:47,823 [log.SortedLogRecovery] INFO : Looking at mutations 
from 
hdfs://localhost:9000/accumulo160b/recovery/2dceae9f-98c2-4f0c-9c66-27f4ab52113a
 for d<;7006d
2014-01-09 12:44:47,900 [log.SortedLogRecovery] INFO : Looking at mutations 
from 
hdfs://localhost:9000/accumulo160b/recovery/61b9709a-c945-41ae-946c-fc6425f4c96a
 for d<;7006d
2014-01-09 12:44:47,990 [log.SortedLogRecovery] INFO : Scanning for mutations 
starting at sequence number 2 for tid 110
2014-01-09 12:45:32,510 [util.Halt] FATAL: Garbage collection may be 
interfering with lock keep-alive.  Halting.

==============
SECOND TIME 
==============

2014-01-09 13:55:17,890 [server.Accumulo] INFO : tserver starting
...snip...
2014-01-09 13:55:33,687 [log.LogSorter] INFO : Copying 
hdfs://localhost:9000/accumulo160b/wal/localhost+9997/7e36fa46-95df-4b58-a1e4-6a29f43843a8
 to 
hdfs://localhost:9000/accumulo160b/recovery/7e36fa46-95df-4b58-a1e4-6a29f43843a8
2014-01-09 13:55:33,688 [log.LogSorter] INFO : Copying 
hdfs://localhost:9000/accumulo160b/wal/localhost+9997/e3428689-9011-4908-8990-9bd079978983
 to 
hdfs://localhost:9000/accumulo160b/recovery/e3428689-9011-4908-8990-9bd079978983
2014-01-09 13:56:31,628 [util.Halt] FATAL: Garbage collection may be 
interfering with lock keep-alive.  Halting.


==============
THIRD TIME 
==============

2014-01-09 14:04:16,315 [server.Accumulo] INFO : tserver starting
...snip...
2014-01-09 14:04:32,078 [log.LogSorter] INFO : Copying 
hdfs://localhost:9000/accumulo160b/wal/localhost+9997/e3428689-9011-4908-8990-9bd079978983
 to 
hdfs://localhost:9000/accumulo160b/recovery/e3428689-9011-4908-8990-9bd079978983
2014-01-09 14:04:32,079 [log.LogSorter] INFO : Copying 
hdfs://localhost:9000/accumulo160b/wal/localhost+9997/a7467054-ec80-4734-974d-7c55c201b332
 to 
hdfs://localhost:9000/accumulo160b/recovery/a7467054-ec80-4734-974d-7c55c201b332
2014-01-09 14:04:32,370 [util.NativeCodeLoader] WARN : Unable to load 
native-hadoop library for your platform... using builtin-java classes where 
applicable
2014-01-09 14:04:32,811 [log.LogSorter] INFO : Finished log sort 
a7467054-ec80-4734-974d-7c55c201b332 11543822 bytes 1 parts in 732ms
2014-01-09 14:04:32,817 [log.LogSorter] INFO : Copying 
hdfs://localhost:9000/accumulo160b/wal/localhost+9997/7e36fa46-95df-4b58-a1e4-6a29f43843a8
 to 
hdfs://localhost:9000/accumulo160b/recovery/7e36fa46-95df-4b58-a1e4-6a29f43843a8
2014-01-09 14:05:27,482 [util.Halt] FATAL: Garbage collection may be 
interfering with lock keep-alive.  Halting.
{noformat} 



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
