Ben Popp created ACCUMULO-2166:
----------------------------------
Summary: tserver recovering tablet on startup fails with "Garbage
collection may be interfering with lock keep-alive"
Key: ACCUMULO-2166
URL: https://issues.apache.org/jira/browse/ACCUMULO-2166
Project: Accumulo
Issue Type: Bug
Reporter: Ben Popp
Observed on a locally built 1.6.0-SNAPSHOT based on the SHA
417902e218c566333b6ea5ac492186ae305e5e16 (Dec 31st), alongside an Apache Hadoop
1.2.1 install.
I had a single-node deployment on my laptop, and an ingest had probably been
interrupted when the laptop went to sleep.
When I restarted the tserver, I got a "FATAL: Garbage collection may be
interfering with lock keep-alive. Halting." log message after some recovery
attempts, and the tserver shut down.
I restarted twice more with similar results. The next time, [~vines] and I
tried to attach a debugger to the tserver on startup, and recovery completed
without any failures! I have restarted the tserver since then with no issues.
The only notable thing about my workspace (mentioned because it affects the
startup paths) is that I'm using some custom Authorizer and Authenticator
plugins in Accumulo.
Log excerpts:
{noformat}
==============
FIRST TIME
==============
2014-01-09 12:42:58,241 [server.Accumulo] INFO : tserver starting
...snip...
2014-01-09 12:44:47,820 [tserver.Tablet] INFO : Starting Write-Ahead Log recovery for d<;7006d
2014-01-09 12:44:47,821 [tserver.TabletServer] INFO : Looking for hdfs://localhost:9000/accumulo160b/recovery/2dceae9f-98c2-4f0c-9c66-27f4ab52113a/finished
2014-01-09 12:44:47,822 [tserver.TabletServer] INFO : Looking for hdfs://localhost:9000/accumulo160b/recovery/61b9709a-c945-41ae-946c-fc6425f4c96a/finished
2014-01-09 12:44:47,823 [log.SortedLogRecovery] INFO : Looking at mutations from hdfs://localhost:9000/accumulo160b/recovery/2dceae9f-98c2-4f0c-9c66-27f4ab52113a for d<;7006d
2014-01-09 12:44:47,900 [log.SortedLogRecovery] INFO : Looking at mutations from hdfs://localhost:9000/accumulo160b/recovery/61b9709a-c945-41ae-946c-fc6425f4c96a for d<;7006d
2014-01-09 12:44:47,990 [log.SortedLogRecovery] INFO : Scanning for mutations starting at sequence number 2 for tid 110
2014-01-09 12:45:32,510 [util.Halt] FATAL: Garbage collection may be interfering with lock keep-alive. Halting.
==============
SECOND TIME
==============
2014-01-09 13:55:17,890 [server.Accumulo] INFO : tserver starting
...snip...
2014-01-09 13:55:33,687 [log.LogSorter] INFO : Copying hdfs://localhost:9000/accumulo160b/wal/localhost+9997/7e36fa46-95df-4b58-a1e4-6a29f43843a8 to hdfs://localhost:9000/accumulo160b/recovery/7e36fa46-95df-4b58-a1e4-6a29f43843a8
2014-01-09 13:55:33,688 [log.LogSorter] INFO : Copying hdfs://localhost:9000/accumulo160b/wal/localhost+9997/e3428689-9011-4908-8990-9bd079978983 to hdfs://localhost:9000/accumulo160b/recovery/e3428689-9011-4908-8990-9bd079978983
2014-01-09 13:56:31,628 [util.Halt] FATAL: Garbage collection may be interfering with lock keep-alive. Halting.
==============
THIRD TIME
==============
2014-01-09 14:04:16,315 [server.Accumulo] INFO : tserver starting
...snip...
2014-01-09 14:04:32,078 [log.LogSorter] INFO : Copying hdfs://localhost:9000/accumulo160b/wal/localhost+9997/e3428689-9011-4908-8990-9bd079978983 to hdfs://localhost:9000/accumulo160b/recovery/e3428689-9011-4908-8990-9bd079978983
2014-01-09 14:04:32,079 [log.LogSorter] INFO : Copying hdfs://localhost:9000/accumulo160b/wal/localhost+9997/a7467054-ec80-4734-974d-7c55c201b332 to hdfs://localhost:9000/accumulo160b/recovery/a7467054-ec80-4734-974d-7c55c201b332
2014-01-09 14:04:32,370 [util.NativeCodeLoader] WARN : Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2014-01-09 14:04:32,811 [log.LogSorter] INFO : Finished log sort a7467054-ec80-4734-974d-7c55c201b332 11543822 bytes 1 parts in 732ms
2014-01-09 14:04:32,817 [log.LogSorter] INFO : Copying hdfs://localhost:9000/accumulo160b/wal/localhost+9997/7e36fa46-95df-4b58-a1e4-6a29f43843a8 to hdfs://localhost:9000/accumulo160b/recovery/7e36fa46-95df-4b58-a1e4-6a29f43843a8
2014-01-09 14:05:27,482 [util.Halt] FATAL: Garbage collection may be interfering with lock keep-alive. Halting.
{noformat}
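For triage, it may help to quantify how long the tserver was silent between the last INFO line and the FATAL halt in each attempt. The following is a minimal sketch in Python, assuming the log4j-style timestamps shown in the excerpts above. The helper name gap_seconds and the attempts table are illustrative only and are not part of Accumulo.

```python
from datetime import datetime

# Timestamp layout used in the excerpts, e.g. "2014-01-09 12:44:47,990".
# %f accepts the three-digit millisecond field and right-pads it.
FMT = "%Y-%m-%d %H:%M:%S,%f"


def gap_seconds(last_info: str, fatal: str) -> float:
    """Return the seconds elapsed between two log timestamps."""
    start = datetime.strptime(last_info, FMT)
    end = datetime.strptime(fatal, FMT)
    return (end - start).total_seconds()


# (last INFO timestamp, FATAL timestamp) copied from the three excerpts.
attempts = {
    "first": ("2014-01-09 12:44:47,990", "2014-01-09 12:45:32,510"),
    "second": ("2014-01-09 13:55:33,688", "2014-01-09 13:56:31,628"),
    "third": ("2014-01-09 14:04:32,817", "2014-01-09 14:05:27,482"),
}

if __name__ == "__main__":
    for name, (last_info, fatal) in attempts.items():
        gap = gap_seconds(last_info, fatal)
        print(f"{name}: {gap:.3f} s with no log output before the halt")
```

For the three attempts this gives gaps of 44.520 s, 57.940 s, and 54.665 s, so in every case the tserver logged nothing for roughly 45 to 58 seconds before halting.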