yifan-c commented on code in PR #98:
URL: https://github.com/apache/cassandra-analytics/pull/98#discussion_r1947233197


##########
cassandra-analytics-core/src/test/java/org/apache/cassandra/spark/reader/CassandraBridgeUtilTests.java:
##########
@@ -169,6 +170,61 @@ public void testOverlaps()
         });
     }
 
+    @Test
+    public void testContains()
+    {
+        runTest((partitioner, bridge, schema, testDir) -> {
+            // write SSTable
+            Set<String> keys = IntStream.range(0, 25).mapToObj(i -> randomAlphanumeric()).collect(Collectors.toSet());
+            List<ByteBuffer> buffers = bridge.encodePartitionKeys(
+            partitioner,
+            schema.keyspace,
+            schema.createStatement,
+            keys.stream()
+                .map(Collections::singletonList)
+                .collect(Collectors.toList())
+            );
+            assertTrue(TestSSTable.allIn(testDir).isEmpty());
+            writeSSTable(partitioner, bridge, schema, testDir,
+                         (writer) -> keys.forEach(key -> writer.write(key, randomAlphanumeric(), randomAlphanumeric())));
+            assertEquals(1, TestSSTable.allIn(testDir).size());
+
+            TestSSTable ssTable = (TestSSTable) TestSSTable.firstIn(testDir);
+
+            // should return all positives for the keys contained in the SSTable
+            List<Boolean> result = bridge.maybeContains(partitioner, schema.keyspace, schema.table, ssTable, buffers);
+            assertEquals(result.size(), buffers.size());
+            assertTrue(result.stream().allMatch(boolValue -> boolValue));
+
+            // random keys should return some negatives for keys not contained in the SSTable
+            List<String> otherKeys = IntStream.range(0, DEFAULT_NUM_ROWS).mapToObj(i -> randomAlphanumeric(keys)).collect(Collectors.toList());
+            List<ByteBuffer> randomBuffers = bridge.encodePartitionKeys(
+            partitioner,
+            schema.keyspace,
+            schema.createStatement,
+            otherKeys.stream()
+                     .map(Collections::singletonList)
+                     .collect(Collectors.toList())
+            );
+            List<Boolean> randomResult = bridge.maybeContains(partitioner, schema.keyspace, schema.table, ssTable, randomBuffers);
+            assertEquals(randomResult.size(), otherKeys.size());
+            assertTrue(randomResult.stream().anyMatch(boolValue -> !boolValue));
+
+            // perform exact contains query to confirm expected keys exist and random keys don't
+            assertTrue(bridge.contains(partitioner, schema.keyspace, schema.table, ssTable, buffers).stream().allMatch(aBoolean -> aBoolean));
+            assertTrue(bridge.contains(partitioner, schema.keyspace, schema.table, ssTable, randomBuffers).stream().noneMatch(aBoolean -> aBoolean));
+            List<ByteBuffer> allKeys = new ArrayList<>();
+            allKeys.addAll(buffers);
+            allKeys.addAll(randomBuffers);
+            List<Boolean> exactResult = bridge.contains(partitioner, schema.keyspace, schema.table, ssTable, allKeys);
+            assertEquals(allKeys.size(), exactResult.size());
+            for (int i = 0; i < exactResult.size(); i++)
+            {
+                assertEquals(i < buffers.size(), exactResult.get(i));
+            }

Review Comment:
   nit: for this test, how about also asserting that, given the same list of
   keys, whenever `maybeContains` returns false for a key, `contains` returns
   false for that key as well?
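
   A minimal sketch of that invariant, assuming both calls return one Boolean
   per input key in the same order as the input list. The class and the
   `firstViolation` helper are hypothetical names used for illustration only;
   they are not part of the PR or of the bridge API.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the PR or of CassandraBridge.
final class FilterConsistency
{
    /**
     * Returns the index of the first key for which the approximate check said "absent"
     * while the exact check said "present", or -1 if no such key exists.
     */
    static int firstViolation(List<Boolean> maybeContains, List<Boolean> contains)
    {
        if (maybeContains.size() != contains.size())
        {
            throw new IllegalArgumentException("result lists must have the same size");
        }
        for (int i = 0; i < maybeContains.size(); i++)
        {
            // A negative from the approximate check must never be contradicted by the exact check.
            if (!maybeContains.get(i) && contains.get(i))
            {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args)
    {
        // A false positive from the approximate check (index 1) is allowed.
        System.out.println(firstViolation(Arrays.asList(true, true, false),
                                          Arrays.asList(true, false, false)));
        // A false negative (index 0) is not.
        System.out.println(firstViolation(Arrays.asList(false, true),
                                          Arrays.asList(true, true)));
    }
}
```

   In the test this could look like calling `bridge.maybeContains(partitioner,
   schema.keyspace, schema.table, ssTable, allKeys)`, comparing it against
   `exactResult`, and asserting that the helper returns -1.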



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

