[ https://issues.apache.org/jira/browse/ACCUMULO-1831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13821511#comment-13821511 ]
ASF subversion and git services commented on ACCUMULO-1831:
-----------------------------------------------------------
Commit 0e63755d5e3a9e441bae51d2f28402a3768820dc in branch refs/heads/master
from [~ecn]
[ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=0e63755 ]
ACCUMULO-1831 ACCUMULO-1888 use uuids to confirm WALog GC; remove WALog entries
in the !METADATA table with the correct mutation
> Write ahead logs from upgrade prematurely GCed
> ----------------------------------------------
>
> Key: ACCUMULO-1831
> URL: https://issues.apache.org/jira/browse/ACCUMULO-1831
> Project: Accumulo
> Issue Type: Sub-task
> Components: master, tserver
> Reporter: Keith Turner
> Assignee: Eric Newton
> Priority: Blocker
> Fix For: 1.6.0
>
>
> I was running {{test/system/upgrade_test.sh dirty}} and the test hung. On
> inspection, I found that the WALs from 1.5 had been deleted before all
> tablets were recovered.
>
> Some tablets from 1.5 recovered fine.
> {noformat}
> 2013-10-29 20:29:26,475 [log.SortedLogRecovery] INFO : Recovery complete for
> !!R<< using
> hdfs://nnhost:6093/rktl/accumulo-upt/recovery/754f171b-c260-42dd-b17e-bd15064608c7
> {noformat}
> Then the GC kicked in and deleted files before tablets were finished
> recovering.
> {noformat}
> 2013-10-29 20:29:30,421 [gc.GarbageCollectWriteAheadLogs] DEBUG: Removing WAL
> for offline server
> hdfs://nnhost:6093/rktl/accumulo-upt/wal/127.0.0.1+9997/754f171b-c260-42dd-b17e-bd15064608c7
> 2013-10-29 20:29:30,428 [gc.GarbageCollectWriteAheadLogs] DEBUG: Removing
> sorted WAL
> hdfs://nnhost:6093/rktl/accumulo-upt/recovery/754f171b-c260-42dd-b17e-bd15064608c7
> {noformat}
> As a result, a tablet failed to recover.
> {noformat}
> 2013-10-29 20:29:30,858 [tabletserver.TabletServer] WARN : exception trying
> to assign tablet 1<;row_0000180000 /default_tablet
> java.lang.RuntimeException: java.io.IOException: Unable to find recovery
> files for extent 1<;row_0000180000 logEntry: 1<;
> 754f171b-c260-42dd-b17e-bd15064608c7 (19)
> at
> org.apache.accumulo.server.tabletserver.Tablet.<init>(Tablet.java:1398)
> at
> org.apache.accumulo.server.tabletserver.Tablet.<init>(Tablet.java:1233)
> at
> org.apache.accumulo.server.tabletserver.Tablet.<init>(Tablet.java:1088)
> at
> org.apache.accumulo.server.tabletserver.Tablet.<init>(Tablet.java:1076)
> {noformat}
> I had set my GC delay to 30 seconds while testing another issue, and that's
> why I ran into this issue.
> Looking at the code, I do not think it's properly converting relative paths
> from 1.5 to absolute paths. I think the code should convert everything to
> relative paths (just UUIDs) to avoid problems caused by differing
> configurations.
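
The suggestion to compare logs by UUID alone can be illustrated with a short Java sketch. This is not the code from the commit: the class WalIds and its methods are hypothetical names, and the relative 1.5-style reference form ("server/uuid") used in the comments is an assumption. The sketch only relies on the fact, visible in the log output above, that the last path component of a WAL is its UUID.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.UUID;

/** Hypothetical helper for illustration; not part of Accumulo. */
final class WalIds {
    private WalIds() {}

    /**
     * Reduces a WAL reference to its UUID by taking the last path component.
     * Works for an absolute path (hdfs://.../wal/127.0.0.1+9997/<uuid>),
     * an assumed relative form (127.0.0.1+9997/<uuid>), or a bare <uuid>.
     * Throws IllegalArgumentException if that component is not a UUID.
     */
    static UUID uuidOf(String logReference) {
        String trimmed = logReference.trim();
        String name = trimmed.substring(trimmed.lastIndexOf('/') + 1);
        return UUID.fromString(name);
    }

    /**
     * Returns the candidate WAL paths whose UUID is not referenced by any
     * tablet log entry, however that entry happens to be written.
     */
    static List<String> safeToDelete(Collection<String> candidates,
                                     Collection<String> referenced) {
        Set<UUID> inUse = new HashSet<>();
        for (String ref : referenced) {
            inUse.add(uuidOf(ref));
        }
        List<String> result = new ArrayList<>();
        for (String candidate : candidates) {
            if (!inUse.contains(uuidOf(candidate))) {
                result.add(candidate);
            }
        }
        return result;
    }
}
```

With this kind of comparison, a tablet entry that still names 754f171b-c260-42dd-b17e-bd15064608c7 in relative form would keep the absolute path under .../wal/127.0.0.1+9997/ out of the deletion list, regardless of how the volume or instance directory is configured.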