stevenzwu commented on code in PR #7254:
URL: https://github.com/apache/iceberg/pull/7254#discussion_r1157537021


##########
flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkTableSource.java:
##########
@@ -542,14 +542,24 @@ public void testFilterPushDownLike() {
     Assert.assertEquals("Should create only one scan", 1, scanEventCount);
     Assert.assertEquals(
         "Should contain the push down filter", expectedFilter, 
lastScanEvent.filter().toString());
+
+    sqlLike = "SELECT * FROM  " + TABLE_NAME + "  WHERE data LIKE '%%' ";
+    resultLike = sql(sqlLike);
+    Assert.assertEquals("Should have 2 records", 2, resultLike.size());
+    List<Row> expectedRecords =
+        Lists.newArrayList(Row.of(1, "iceberg", 10.0), Row.of(2, "b", 20.0));
+    assertSameElements(expectedRecords, resultLike);
+    String expectedScan = "not_null(ref(name=\"data\"))";
+    Assert.assertEquals(
+        "Should contain the push down filter", expectedScan, 
lastScanEvent.filter().toString());
   }
 
   @Test
   public void testFilterNotPushDownLike() {
     Row expectRecord = Row.of(1, "iceberg", 10.0);
     String sqlNoPushDown = "SELECT * FROM " + TABLE_NAME + " WHERE data LIKE 
'%%i' ";
     List<Row> resultLike = sql(sqlNoPushDown);
-    Assert.assertEquals("Should have 1 record", 0, resultLike.size());
+    Assert.assertEquals("Should have 0 record", 0, resultLike.size());

Review Comment:
   it is a behavior change? 
   
   I thought `%%i` meant a string ending with char `i`. that is why it 
shouldn't match any rows. why does it match the row `iceberg` in 1.16. There is 
still sth I am not understanding here.



##########
flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkTableSource.java:
##########
@@ -445,8 +446,9 @@ public void testFilterPushDownNotInNull() {
     String sqlNotInNull = String.format("SELECT * FROM %s WHERE id NOT IN 
(1,2,NULL) ", TABLE_NAME);
     List<Row> resultGT = sql(sqlNotInNull);
     Assert.assertEquals("Should have 0 record", 0, resultGT.size());
-    Assert.assertEquals(
-        "Should not push down a filter", Expressions.alwaysTrue(), 
lastScanEvent.filter());
+    Assert.assertNull(
+        "As the filtering is pushed down, Flink does not generate data, so 
there is no ScanEvent",

Review Comment:
   nit: "As the predicate pushdown filter out all rows, Iceberg scan planning 
doesn't publish any ScanEvent`.
   
   On the other hand, is this correct? shouldn't `ScanEvent` always be notified 
since there is a table scan planning?
   ```
   /** Event sent to listeners when a table scan is planned. */
   public final class ScanEvent {
   ```



##########
flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkTableSource.java:
##########
@@ -542,14 +542,24 @@ public void testFilterPushDownLike() {
     Assert.assertEquals("Should create only one scan", 1, scanEventCount);
     Assert.assertEquals(
         "Should contain the push down filter", expectedFilter, 
lastScanEvent.filter().toString());
+
+    sqlLike = "SELECT * FROM  " + TABLE_NAME + "  WHERE data LIKE '%%' ";

Review Comment:
   sounds good to me, as it seems to match the existing style. originally I was 
thinking each test method covers one scenario. In this case, we can have 
`testFilterPushDownLikeMatchingPrefix` and 
`testFilterPushDownLikeMatchingAnyChars`



##########
flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkTableSource.java:
##########
@@ -422,8 +422,9 @@ public void testFilterPushDownInNull() {
     Assert.assertEquals("Should have 1 record", 1, result.size());
     Assert.assertEquals(
         "Should produce the expected record", Row.of(1, "iceberg", 10.0), 
result.get(0));
+    String expectedScan = "ref(name=\"data\") == \"iceberg\"";

Review Comment:
   you are saying Flink skips the `null` for predicate pushdown? if yes, please 
add it the comment. it will also be great we can link to a Flink jira (if there 
is) for future improvement on pushdown with `null` value.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to