[ https://issues.apache.org/jira/browse/ACCUMULO-4362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411182#comment-15411182 ]
Josh Elser commented on ACCUMULO-4362:
--------------------------------------
I was hoping this was just a dumb bug:
{noformat}
diff --git a/server/base/src/main/java/org/apache/accumulo/server/util/MetadataTableUtil.java b/server/base/src/main/java/org/apache/accumulo/server/util/MetadataTableUtil.java
index b38083f..8d9b234 100644
--- a/server/base/src/main/java/org/apache/accumulo/server/util/MetadataTableUtil.java
+++ b/server/base/src/main/java/org/apache/accumulo/server/util/MetadataTableUtil.java
@@ -744,13 +744,18 @@ public class MetadataTableUtil {
   @VisibleForTesting
   public static int checkClone(String tableName, String srcTableId, String tableId, Connector conn, BatchWriter bw) throws TableNotFoundException, MutationsRejectedException {
-    TabletIterator srcIter = new TabletIterator(createCloneScanner(tableName, srcTableId, conn), new KeyExtent(srcTableId, null, null).toMetadataRange(), true,
-        true);
+    TabletIterator srcIter;
+    if (srcTableId.equals(MetadataTable.ID))
+      srcIter = new TabletIterator(createCloneScanner(tableName, srcTableId, conn), new Range(), true, true);
+    else
+      srcIter = new TabletIterator(createCloneScanner(tableName, srcTableId, conn), new KeyExtent(srcTableId, null, null).toMetadataRange(), true, true);
     TabletIterator cloneIter = new TabletIterator(createCloneScanner(tableName, tableId, conn), new KeyExtent(tableId, null, null).toMetadataRange(), true,
         true);
-    if (!cloneIter.hasNext() || !srcIter.hasNext())
-      throw new RuntimeException(" table deleted during clone? srcTableId = " + srcTableId + " tableId=" + tableId);
+    if (!cloneIter.hasNext())
+      throw new RuntimeException("Destination table deleted during clone? tableId=" + tableId);
+    if (!srcIter.hasNext())
+      throw new RuntimeException("Source table deleted during clone? srcTableId = " + srcTableId);
     int rewrites = 0;
@@ -855,7 +860,7 @@ public class MetadataTableUtil {
         // delete what we have cloned and try again
         deleteTable(tableId, false, context, null);
-        log.debug("Tablets merged in table " + srcTableId + " while attempting to clone, trying again");
+        log.debug("Tablets merged in table " + srcTableId + " while attempting to clone, trying again", tde);
         sleepUninterruptibly(100, TimeUnit.MILLISECONDS);
       }
{noformat}
Turns out, this just led me to another issue. I had hoped to get to this
yesterday, but I had other FOSS work to do. With these changes, I'm now seeing
the following:
{noformat}
2016-08-07 22:07:52,026 [util.MetadataTableUtil] DEBUG: Tablets merged in table !0 while attempting to clone, trying again
org.apache.accumulo.server.util.TabletIterator$TabletDeletedException: Tablets deleted from src during clone : some split null
    at org.apache.accumulo.server.util.MetadataTableUtil.checkClone(MetadataTableUtil.java:786)
    at org.apache.accumulo.server.util.MetadataTableUtil.cloneTable(MetadataTableUtil.java:847)
    at org.apache.accumulo.master.tableOps.CloneMetadata.call(CloneMetadata.java:45)
    at org.apache.accumulo.master.tableOps.CloneMetadata.call(CloneMetadata.java:24)
    at org.apache.accumulo.master.tableOps.TraceRepo.call(TraceRepo.java:57)
    at org.apache.accumulo.fate.Fate$TransactionRunner.run(Fate.java:74)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
    at java.lang.Thread.run(Thread.java:745)
{noformat}
I need to step back and figure out how this is actually supposed to work; I
don't understand it well enough yet.
FYI [~kturner], you are probably more familiar with this code than I am.
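For anyone skimming the diff, the first hunk does two things: it scans an unbounded range when the source is the metadata table itself, and it splits the combined "table deleted during clone?" check so the message says which side was empty. Below is a minimal stand-alone sketch of that logic using plain Java iterators instead of the Accumulo classes. The class name, method names, and the hard-coded "!0" id (taken from the log output above) are assumptions for illustration, not part of MetadataTableUtil.

```java
import java.util.Collections;
import java.util.Iterator;

// Hypothetical stand-in for the checks in the patch; not Accumulo code.
public class CloneChecks {

  // Assumption: "!0" is the metadata table id, as seen in the logs above.
  static final String METADATA_TABLE_ID = "!0";

  // Mirrors the branch in the patch: an unbounded range is used only when
  // the source table is the metadata table; otherwise the range is derived
  // from the source table id.
  static boolean scansWholeTable(String srcTableId) {
    return METADATA_TABLE_ID.equals(srcTableId);
  }

  // Mirrors the split check: each empty iterator gets its own message, so
  // the log shows whether the source or the destination had no tablets.
  static void requireBothPresent(Iterator<?> cloneIter, Iterator<?> srcIter,
      String srcTableId, String tableId) {
    if (!cloneIter.hasNext())
      throw new RuntimeException("Destination table deleted during clone? tableId=" + tableId);
    if (!srcIter.hasNext())
      throw new RuntimeException("Source table deleted during clone? srcTableId = " + srcTableId);
  }

  public static void main(String[] args) {
    Iterator<String> clone = Collections.singletonList("tablet").iterator();
    Iterator<String> src = Collections.<String>emptyIterator();
    try {
      requireBothPresent(clone, src, "!0", "4");
    } catch (RuntimeException e) {
      // Only the source-side message is reported here.
      System.out.println(e.getMessage());
    }
  }
}
```

With the original combined check, the failure in the issue description could not be attributed to one side; with separate messages, an empty source scan and an empty destination scan are distinguishable in the Master log.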
> TabletStateChangeIteratorIT failure on cloning metadata table
> -------------------------------------------------------------
>
> Key: ACCUMULO-4362
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4362
> Project: Accumulo
> Issue Type: Bug
> Components: test
> Reporter: Josh Elser
> Assignee: Josh Elser
> Priority: Blocker
> Fix For: 1.8.0
>
>
> In the Master log:
> {noformat}
> 2016-07-09 16:22:15,858 [master.MasterClientServiceHandler] ERROR: table deleted during clone? srcTableId = !0 tableId=4
> java.lang.RuntimeException: table deleted during clone? srcTableId = !0 tableId=4
>     at org.apache.accumulo.server.util.MetadataTableUtil.checkClone(MetadataTableUtil.java:753)
>     at org.apache.accumulo.server.util.MetadataTableUtil.cloneTable(MetadataTableUtil.java:842)
>     at org.apache.accumulo.master.tableOps.CloneMetadata.call(CloneMetadata.java:45)
>     at org.apache.accumulo.master.tableOps.CloneMetadata.call(CloneMetadata.java:24)
>     at org.apache.accumulo.master.tableOps.TraceRepo.call(TraceRepo.java:57)
>     at org.apache.accumulo.fate.Fate$TransactionRunner.run(Fate.java:74)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>     at java.lang.Thread.run(Thread.java:745)
> 2016-07-09 16:22:15,859 [thrift.ProcessFunction] ERROR: Internal error processing waitForFateOperation
> org.apache.thrift.TException: table deleted during clone? srcTableId = !0 tableId=4
>     at org.apache.accumulo.server.rpc.RpcWrapper$1.invoke(RpcWrapper.java:81)
>     at com.sun.proxy.$Proxy10.waitForFateOperation(Unknown Source)
>     at org.apache.accumulo.core.master.thrift.FateService$Processor$waitForFateOperation.getResult(FateService.java:481)
>     at org.apache.accumulo.core.master.thrift.FateService$Processor$waitForFateOperation.getResult(FateService.java:465)
>     at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>     at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>     at org.apache.accumulo.server.rpc.TimedProcessor.process(TimedProcessor.java:63)
>     at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:518)
>     at org.apache.accumulo.server.rpc.CustomNonBlockingServer$CustomFrameBuffer.invoke(CustomNonBlockingServer.java:106)
>     at org.apache.thrift.server.Invocation.run(Invocation.java:18)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>     at java.lang.Thread.run(Thread.java:745)
> {noformat}
> The test case:
> {noformat}
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 9.544 sec <<< FAILURE! - in org.apache.accumulo.test.functional.TabletStateChangeIteratorIT
> test(org.apache.accumulo.test.functional.TabletStateChangeIteratorIT) Time elapsed: 7.402 sec <<< ERROR!
> org.apache.accumulo.core.client.AccumuloException: Internal error processing waitForFateOperation
>     at org.apache.accumulo.test.functional.TabletStateChangeIteratorIT.cloneMetadataTable(TabletStateChangeIteratorIT.java:201)
>     at org.apache.accumulo.test.functional.TabletStateChangeIteratorIT.test(TabletStateChangeIteratorIT.java:103)
> Caused by: org.apache.thrift.TApplicationException: Internal error processing waitForFateOperation
>     at org.apache.accumulo.test.functional.TabletStateChangeIteratorIT.cloneMetadataTable(TabletStateChangeIteratorIT.java:201)
>     at org.apache.accumulo.test.functional.TabletStateChangeIteratorIT.test(TabletStateChangeIteratorIT.java:103)
> {noformat}