kbendick commented on a change in pull request #2081:
URL: https://github.com/apache/iceberg/pull/2081#discussion_r556236631



##########
File path: data/src/test/java/org/apache/iceberg/data/TestLocalScan.java
##########
@@ -498,6 +499,35 @@ public void testFilterWithDateAndTimestamp() throws IOException {
     }
   }
 
+  @Test
+  public void testFilterWithEmptyStringColumn() throws IOException {
+    File tableLocation = temp.newFolder("filter_table_with_empty_string_column_value");
+    Assert.assertTrue(tableLocation.delete());
+
+    Table table = TABLES.create(
+            SCHEMA, PartitionSpec.unpartitioned(),
+            ImmutableMap.of(TableProperties.DEFAULT_FILE_FORMAT, format.name()),
+            tableLocation.getAbsolutePath());
+
+    List<Record> testRecords = ImmutableList.of(
+            genericRecord.copy(ImmutableMap.of("id", 1L, "data", "clammy")),
+            genericRecord.copy(ImmutableMap.of("id", 2L, "data", "evacuate")),
+            genericRecord.copy(ImmutableMap.of("id", 3L, "data", "tissue")),
+            genericRecord.copy(ImmutableMap.of("id", 4L, "data", ""))
+    );
+
+    Set<Record> expected = Sets.newHashSet(testRecords);
+
+    DataFile file = writeFile(tableLocation.toString(), format.addExtension("record-file"),
+            SCHEMA, testRecords);
+    table.newFastAppend().appendFile(file).commit();
+    Iterable<Record> results = IcebergGenerics.read(table).where(startsWith("data", "")).build();
+    Set<Record> actual = Sets.newHashSet(results);
+
+    Assert.assertEquals("Should produce correct number of records", expected.size(), actual.size());
+    Assert.assertEquals("Record set should match", expected, actual);
+  }

Review comment:
       I'm thinking of removing this test and instead updating the records in
the two Spark `TestFilteredScan` suites. Those suites already cover a range of
scenarios (including the one the original issue was written for), so the
change would not add any CI time.



