Marton Greber has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/23358 )

Change subject: Deflake testRandomBackupAndRestore
......................................................................

Deflake testRandomBackupAndRestore

The backup/restore tests were intermittently failing. This wasn’t
obvious at first because dist-test archives contained empty
_ARCHIVE_TOO_BIG_-* files.

Root cause was a Base64 dependency mismatch:
java.lang.NoClassDefFoundError: org/apache/commons/net/util/Base64
Switching from org.apache.commons.net.util.Base64 to java.util.Base64
resolves the failure. The issue is reproduced by the new test
testBinaryColumnBackupReproducesBase64Issue.
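
As a rough illustration of the switch described above, the sketch below encodes and decodes binary values using only java.util.Base64 from the JDK, so no commons-net jar has to be on the runtime classpath. It is written in Java for brevity; the real change lives in TableMetadata.scala and may look different. The class name Base64Sketch and its methods are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper for illustration; not the actual TableMetadata.valueToString code.
class Base64Sketch {
    // Encodes raw bytes with the JDK encoder (no third-party dependency).
    static String encode(byte[] value) {
        return Base64.getEncoder().encodeToString(value);
    }

    // Decodes a string produced by encode() back into the original bytes.
    static byte[] decode(String encoded) {
        return Base64.getDecoder().decode(encoded);
    }

    public static void main(String[] args) {
        byte[] original = "kudu".getBytes(StandardCharsets.UTF_8);
        String encoded = encode(original);
        // Print the encoded form and the round-tripped text.
        System.out.println(encoded);
        System.out.println(new String(decode(encoded), StandardCharsets.UTF_8));
    }
}
```

Because java.util.Base64 ships with the JDK (since Java 8), the class is always loadable regardless of which transitive dependencies end up on the test classpath.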

Running the repro test without the fix results in this stacktrace:

13:45:05.130 [Test worker] ERROR org.apache.kudu.test.junit.RetryRule -- org.apache.kudu.backup.TestKuduBackup.testBinaryColumnBackupReproducesBase64Issue: failed attempt 1
java.lang.NoClassDefFoundError: org/apache/commons/net/util/Base64
        at org.apache.kudu.backup.TableMetadata$.valueToString(TableMetadata.scala:351)
        at org.apache.kudu.backup.TableMetadata$.$anonfun$getTableMetadata$1(TableMetadata.scala:78)
        at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
        at scala.collection.Iterator.foreach(Iterator.scala:943)
        at scala.collection.Iterator.foreach$(Iterator.scala:943)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
        at scala.collection.IterableLike.foreach(IterableLike.scala:74)
        at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
        at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
        at scala.collection.TraversableLike.map(TraversableLike.scala:286)
        at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
        at scala.collection.AbstractTraversable.map(Traversable.scala:108)
        at org.apache.kudu.backup.TableMetadata$.getTableMetadata(TableMetadata.scala:61)
        at org.apache.kudu.backup.KuduBackup$.doBackup(KuduBackup.scala:102)
        at org.apache.kudu.backup.KuduBackup$.$anonfun$run$6(KuduBackup.scala:146)
        at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
        at scala.util.Try$.apply(Try.scala:213)
        at org.apache.kudu.backup.KuduBackup$.$anonfun$run$5(KuduBackup.scala:146)
        at scala.collection.parallel.AugmentedIterableIterator.map2combiner(RemainsIterator.scala:116)
        at scala.collection.parallel.AugmentedIterableIterator.map2combiner$(RemainsIterator.scala:113)
        at scala.collection.parallel.immutable.ParVector$ParVectorIterator.map2combiner(ParVector.scala:66)
        at scala.collection.parallel.ParIterableLike$Map.leaf(ParIterableLike.scala:1064)
        at scala.collection.parallel.Task.$anonfun$tryLeaf$1(Tasks.scala:53)
        at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
        at scala.util.control.Breaks$$anon$1.catchBreak(Breaks.scala:67)
        at scala.collection.parallel.Task.tryLeaf(Tasks.scala:56)
        at scala.collection.parallel.Task.tryLeaf$(Tasks.scala:50)
        at scala.collection.parallel.ParIterableLike$Map.tryLeaf(ParIterableLike.scala:1061)
        at scala.collection.parallel.AdaptiveWorkStealingTasks$WrappedTask.compute(Tasks.scala:153)
        at scala.collection.parallel.AdaptiveWorkStealingTasks$WrappedTask.compute$(Tasks.scala:149)
        at scala.collection.parallel.AdaptiveWorkStealingForkJoinTasks$WrappedTask.compute(Tasks.scala:440)
        at java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:189)
        at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
        at java.util.concurrent.ForkJoinTask.doJoin(ForkJoinTask.java:389)
        at java.util.concurrent.ForkJoinTask.join(ForkJoinTask.java:719)
        at scala.collection.parallel.ForkJoinTasks$WrappedTask.sync(Tasks.scala:379)
        at scala.collection.parallel.ForkJoinTasks$WrappedTask.sync$(Tasks.scala:379)
        at scala.collection.parallel.AdaptiveWorkStealingForkJoinTasks$WrappedTask.sync(Tasks.scala:440)
        at scala.collection.parallel.ForkJoinTasks.executeAndWaitResult(Tasks.scala:423)
        at scala.collection.parallel.ForkJoinTasks.executeAndWaitResult$(Tasks.scala:416)
        at scala.collection.parallel.ForkJoinTaskSupport.executeAndWaitResult(TaskSupport.scala:60)
        at scala.collection.parallel.ParIterableLike$ResultMapping.leaf(ParIterableLike.scala:968)
        at scala.collection.parallel.Task.$anonfun$tryLeaf$1(Tasks.scala:53)
        at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
        at scala.util.control.Breaks$$anon$1.catchBreak(Breaks.scala:67)
        at scala.collection.parallel.Task.tryLeaf(Tasks.scala:56)
        at scala.collection.parallel.Task.tryLeaf$(Tasks.scala:50)
        at scala.collection.parallel.ParIterableLike$ResultMapping.tryLeaf(ParIterableLike.scala:963)
        at scala.collection.parallel.AdaptiveWorkStealingTasks$WrappedTask.compute(Tasks.scala:153)
        at scala.collection.parallel.AdaptiveWorkStealingTasks$WrappedTask.compute$(Tasks.scala:149)
        at scala.collection.parallel.AdaptiveWorkStealingForkJoinTasks$WrappedTask.compute(Tasks.scala:440)
        at java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:189)
        at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
        at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
        at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
        at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)
Caused by: java.lang.ClassNotFoundException: org.apache.commons.net.util.Base64
        at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
        ... 56 common frames omitted

During the investigation I found a second bug: BINARY range-partition
boundaries caused:
java.lang.ClassCastException: class java.nio.HeapByteBuffer cannot be
cast to class [B
This is fixed by the same valueToString change and is covered by
testBinaryPartitionBoundariesBase64Issue.
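
To make the failure mode concrete: a binary value may arrive either as a byte[] or as a java.nio.ByteBuffer, and a blind cast to byte[] ("[B") fails for the latter. The Java sketch below handles both representations before encoding. It is an assumption about the general shape of such a fix, not the actual Kudu code; BinaryValueSketch and binaryToBase64 are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.util.Base64;

// Hypothetical helper; illustrates the ClassCastException scenario, not the real fix.
class BinaryValueSketch {
    // Accepts either representation of a binary value and returns Base64 text.
    static String binaryToBase64(Object value) {
        if (value instanceof byte[]) {
            return Base64.getEncoder().encodeToString((byte[]) value);
        }
        if (value instanceof ByteBuffer) {
            // duplicate() shares the content but has its own position and limit,
            // so reading from the copy leaves the caller's buffer untouched.
            ByteBuffer copy = ((ByteBuffer) value).duplicate();
            byte[] bytes = new byte[copy.remaining()];
            copy.get(bytes);
            return Base64.getEncoder().encodeToString(bytes);
        }
        // Anything else (including null) is rejected explicitly instead of
        // surfacing later as a ClassCastException.
        throw new IllegalArgumentException("Unsupported binary value: " + value);
    }
}
```

Dispatching on the runtime type keeps the encoder independent of which representation a given code path happens to produce.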

Running the repro test without the fix results in this stacktrace:

13:39:47.053 [ForkJoinPool-2-worker-1] ERROR org.apache.kudu.backup.KuduBackup$ -- Failed to back up table binary-partition-base64-issue
java.lang.ClassCastException: java.nio.HeapByteBuffer cannot be cast to [B
        at org.apache.kudu.backup.TableMetadata$.valueToString(TableMetadata.scala:351)
        at org.apache.kudu.backup.TableMetadata$.$anonfun$getBoundValues$2(TableMetadata.scala:225)
        at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
        at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
        at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
        at scala.collection.TraversableLike.map(TraversableLike.scala:286)
        at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
        at scala.collection.AbstractTraversable.map(Traversable.scala:108)
        at org.apache.kudu.backup.TableMetadata$.getBoundValues(TableMetadata.scala:219)
        at org.apache.kudu.backup.TableMetadata$.$anonfun$getRangePartitionMetadata$2(TableMetadata.scala:196)
        at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
        at scala.collection.Iterator.foreach(Iterator.scala:943)
        at scala.collection.Iterator.foreach$(Iterator.scala:943)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
        at scala.collection.IterableLike.foreach(IterableLike.scala:74)
        at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
        at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
        at scala.collection.TraversableLike.map(TraversableLike.scala:286)
        at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
        at scala.collection.AbstractTraversable.map(Traversable.scala:108)
        at org.apache.kudu.backup.TableMetadata$.getRangePartitionMetadata(TableMetadata.scala:195)
        at org.apache.kudu.backup.TableMetadata$.getPartitionSchemaMetadata(TableMetadata.scala:125)
        at org.apache.kudu.backup.TableMetadata$.getTableMetadata(TableMetadata.scala:106)
        at org.apache.kudu.backup.KuduBackup$.doBackup(KuduBackup.scala:102)
        at org.apache.kudu.backup.KuduBackup$.$anonfun$run$6(KuduBackup.scala:146)
        at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
        at scala.util.Try$.apply(Try.scala:213)
        at org.apache.kudu.backup.KuduBackup$.$anonfun$run$5(KuduBackup.scala:146)
        at scala.collection.parallel.AugmentedIterableIterator.map2combiner(RemainsIterator.scala:116)
        at scala.collection.parallel.AugmentedIterableIterator.map2combiner$(RemainsIterator.scala:113)
        at scala.collection.parallel.immutable.ParVector$ParVectorIterator.map2combiner(ParVector.scala:66)
        at scala.collection.parallel.ParIterableLike$Map.leaf(ParIterableLike.scala:1064)
        at scala.collection.parallel.Task.$anonfun$tryLeaf$1(Tasks.scala:53)
        at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
        at scala.util.control.Breaks$$anon$1.catchBreak(Breaks.scala:67)
        at scala.collection.parallel.Task.tryLeaf(Tasks.scala:56)
        at scala.collection.parallel.Task.tryLeaf$(Tasks.scala:50)
        at scala.collection.parallel.ParIterableLike$Map.tryLeaf(ParIterableLike.scala:1061)
        at scala.collection.parallel.AdaptiveWorkStealingTasks$WrappedTask.compute(Tasks.scala:153)
        at scala.collection.parallel.AdaptiveWorkStealingTasks$WrappedTask.compute$(Tasks.scala:149)
        at scala.collection.parallel.AdaptiveWorkStealingForkJoinTasks$WrappedTask.compute(Tasks.scala:440)
        at java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:189)
        at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
        at java.util.concurrent.ForkJoinTask.doJoin(ForkJoinTask.java:389)
        at java.util.concurrent.ForkJoinTask.join(ForkJoinTask.java:719)
        at scala.collection.parallel.ForkJoinTasks$WrappedTask.sync(Tasks.scala:379)
        at scala.collection.parallel.ForkJoinTasks$WrappedTask.sync$(Tasks.scala:379)
        at scala.collection.parallel.AdaptiveWorkStealingForkJoinTasks$WrappedTask.sync(Tasks.scala:440)
        at scala.collection.parallel.ForkJoinTasks.executeAndWaitResult(Tasks.scala:423)
        at scala.collection.parallel.ForkJoinTasks.executeAndWaitResult$(Tasks.scala:416)
        at scala.collection.parallel.ForkJoinTaskSupport.executeAndWaitResult(TaskSupport.scala:60)
        at scala.collection.parallel.ParIterableLike$ResultMapping.leaf(ParIterableLike.scala:968)
        at scala.collection.parallel.Task.$anonfun$tryLeaf$1(Tasks.scala:53)
        at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
        at scala.util.control.Breaks$$anon$1.catchBreak(Breaks.scala:67)
        at scala.collection.parallel.Task.tryLeaf(Tasks.scala:56)
        at scala.collection.parallel.Task.tryLeaf$(Tasks.scala:50)
        at scala.collection.parallel.ParIterableLike$ResultMapping.tryLeaf(ParIterableLike.scala:963)
        at scala.collection.parallel.AdaptiveWorkStealingTasks$WrappedTask.compute(Tasks.scala:153)
        at scala.collection.parallel.AdaptiveWorkStealingTasks$WrappedTask.compute$(Tasks.scala:149)
        at scala.collection.parallel.AdaptiveWorkStealingForkJoinTasks$WrappedTask.compute(Tasks.scala:440)
        at java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:189)
        at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
        at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
        at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
        at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)

Before these fixes, running org.apache.kudu.backup.TestKuduBackup 100
times yielded 43 failures / 57 passes. After the fixes: 100/100 passes.
The failures were intermittent because the problematic test,
testRandomBackupAndRestore, uses a randomized schema and data.

This is not a full RCA; I wanted to deflake the test with a reasonable
solution. I’ve opened KUDU-3694 to track root-cause follow-ups and
additional action items.

Change-Id: Ibbf05b23cb6c922ee880f4f1dda262be5e97452b
Reviewed-on: http://gerrit.cloudera.org:8080/23358
Reviewed-by: Abhishek Chennaka <[email protected]>
Reviewed-by: Alexey Serbin <[email protected]>
Reviewed-by: Gabriella Lotz <[email protected]>
Tested-by: Marton Greber <[email protected]>
---
M java/kudu-backup-common/src/main/scala/org/apache/kudu/backup/TableMetadata.scala
M java/kudu-backup/src/test/scala/org/apache/kudu/backup/TestKuduBackup.scala
2 files changed, 153 insertions(+), 4 deletions(-)

Approvals:
  Abhishek Chennaka: Looks good to me, approved
  Alexey Serbin: Looks good to me, but someone else must approve
  Gabriella Lotz: Looks good to me, but someone else must approve
  Marton Greber: Verified

--
To view, visit http://gerrit.cloudera.org:8080/23358
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ibbf05b23cb6c922ee880f4f1dda262be5e97452b
Gerrit-Change-Number: 23358
Gerrit-PatchSet: 4
Gerrit-Owner: Marton Greber <[email protected]>
Gerrit-Reviewer: Abhishek Chennaka <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Gabriella Lotz <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Marton Greber <[email protected]>
Gerrit-Reviewer: Zoltan Chovan <[email protected]>
