Adar Dembo created KUDU-2668:
--------------------------------

             Summary: TestKuduClient.readYourWrites tests are flaky
                 Key: KUDU-2668
                 URL: https://issues.apache.org/jira/browse/KUDU-2668
             Project: Kudu
          Issue Type: Bug
          Components: java, test
    Affects Versions: 1.9.0
            Reporter: Adar Dembo


I looped TestKuduClient 1000 times in dist-test while working on another 
problem, and saw the following failures (failure count, then test name):
{noformat}
1 testReadYourWritesBatchLeaderReplica
14 testReadYourWritesSyncClosestReplica
15 testReadYourWritesSyncLeaderReplica
{noformat}

In all cases, the stack trace of the failure was effectively this:
{noformat}
java.util.concurrent.ExecutionException: java.lang.AssertionError
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:192)
        at org.apache.kudu.client.TestKuduClient.readYourWrites(TestKuduClient.java:1113)
        ...
Caused by: java.lang.AssertionError
        at org.junit.Assert.fail(Assert.java:86)
        at org.junit.Assert.assertTrue(Assert.java:41)
        at org.junit.Assert.assertTrue(Assert.java:52)
        at org.apache.kudu.client.TestKuduClient$2.call(TestKuduClient.java:1098)
        at org.apache.kudu.client.TestKuduClient$2.call(TestKuduClient.java:1055)
        ...
{noformat}

The offending lines:
{code}
              AsyncKuduScanner scanner = asyncClient.newScannerBuilder(table)
                      .readMode(AsyncKuduScanner.ReadMode.READ_YOUR_WRITES)
                      .replicaSelection(replicaSelection)
                      .build();
              KuduScanner syncScanner = new KuduScanner(scanner);
              long preTs = asyncClient.getLastPropagatedTimestamp();
              assertNotEquals(AsyncKuduClient.NO_TIMESTAMP,
                  asyncClient.getLastPropagatedTimestamp());

              long row_count = countRowsInScan(syncScanner);
              long expected_count = 100L * (i + 1);
              assertTrue(expected_count <= row_count);

              // After the scan, verify that the chosen snapshot timestamp is
              // returned from the server and it is larger than the previous
              // propagated timestamp.
              assertNotEquals(AsyncKuduClient.NO_TIMESTAMP,
                  scanner.getSnapshotTimestamp());
-->           assertTrue(preTs < scanner.getSnapshotTimestamp());
{code}
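
Because the failing assertTrue carries no message, the stack trace does not show the two timestamps involved. One way to make the next failure more informative might be a check that reports both values. The following is only a minimal sketch, not code from the Kudu tree: the class name, the method name, and the local NO_TIMESTAMP constant (a stand-in for AsyncKuduClient.NO_TIMESTAMP, whose value is assumed here) are all hypothetical.

```java
// Hypothetical diagnostic helper; none of these names exist in Kudu.
final class SnapshotTimestampCheck {
    // Assumed stand-in for AsyncKuduClient.NO_TIMESTAMP. The real value is
    // not shown in this report.
    static final long NO_TIMESTAMP = -1L;

    private SnapshotTimestampCheck() {}

    // Same invariant as the flagged line (preTs < snapshotTs), but the
    // failure message includes both timestamps and their difference.
    static void checkSnapshotAdvanced(long preTs, long snapshotTs) {
        if (snapshotTs == NO_TIMESTAMP) {
            throw new AssertionError("scanner returned no snapshot timestamp");
        }
        if (preTs >= snapshotTs) {
            throw new AssertionError(String.format(
                "expected snapshot timestamp > propagated timestamp, "
                    + "but preTs=%d, snapshotTs=%d (delta=%d)",
                preTs, snapshotTs, snapshotTs - preTs));
        }
    }

    public static void main(String[] args) {
        // Passes silently: the snapshot timestamp moved forward.
        checkSnapshotAdvanced(100L, 101L);
        try {
            // Fails: equal timestamps violate the strict inequality.
            checkSnapshotAdvanced(100L, 100L);
        } catch (AssertionError e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In the test, such a helper would be called with preTs and scanner.getSnapshotTimestamp() in place of the bare assertTrue, so that a dist-test loop records whether the two values were equal or the snapshot timestamp actually went backwards.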

It's possible that this is just test flakiness, but I'm setting a higher 
priority so we can understand whether that's the case, or whether there's 
something wrong with read-your-writes scans.



