kbendick commented on a change in pull request #2081:
URL: https://github.com/apache/iceberg/pull/2081#discussion_r556236631
##########
File path: data/src/test/java/org/apache/iceberg/data/TestLocalScan.java
##########
@@ -498,6 +499,35 @@ public void testFilterWithDateAndTimestamp() throws IOException {
}
}
+ @Test
+ public void testFilterWithEmptyStringColumn() throws IOException {
+ File tableLocation = temp.newFolder("filter_table_with_empty_string_column_value");
+ Assert.assertTrue(tableLocation.delete());
+
+ Table table = TABLES.create(
+ SCHEMA, PartitionSpec.unpartitioned(),
+ ImmutableMap.of(TableProperties.DEFAULT_FILE_FORMAT, format.name()),
+ tableLocation.getAbsolutePath());
+
+ List<Record> testRecords = ImmutableList.of(
+ genericRecord.copy(ImmutableMap.of("id", 1L, "data", "clammy")),
+ genericRecord.copy(ImmutableMap.of("id", 2L, "data", "evacuate")),
+ genericRecord.copy(ImmutableMap.of("id", 3L, "data", "tissue")),
+ genericRecord.copy(ImmutableMap.of("id", 4L, "data", ""))
+ );
+
+ Set<Record> expected = Sets.newHashSet(testRecords);
+
+ DataFile file = writeFile(tableLocation.toString(), format.addExtension("record-file"),
+ SCHEMA, testRecords);
+ table.newFastAppend().appendFile(file).commit();
+ Iterable<Record> results = IcebergGenerics.read(table).where(startsWith("data", "")).build();
+ Set<Record> actual = Sets.newHashSet(results);
+
+ Assert.assertEquals("Should produce correct number of records", expected.size(), actual.size());
+ Assert.assertEquals("Record set should match", expected, actual);
+ }
Review comment:
I'm thinking of removing this test and just updating the records in the
two Spark `TestFilteredScan` suites, as they cover a range of possible
scenarios (including the one that the original issue was written for) and would
not add any additional CI time.
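
For reference, the expectation this test pins down is that an empty prefix
matches every value, including the empty string itself, which is why
`expected` is the full record set. A minimal plain-Java sketch of that
expectation follows. It is not Iceberg's evaluator, and `EmptyPrefixSketch`
and `filterByPrefix` are hypothetical names used only for illustration:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class EmptyPrefixSketch {
  // Hypothetical helper: keeps the values that start with the given prefix.
  static List<String> filterByPrefix(List<String> values, String prefix) {
    return values.stream()
        .filter(value -> value.startsWith(prefix))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // Same "data" values as the test records above, including the empty string.
    List<String> data = Arrays.asList("clammy", "evacuate", "tissue", "");

    // Every string starts with "", so nothing should be filtered out.
    List<String> matched = filterByPrefix(data, "");
    System.out.println(matched.size() == data.size()); // prints "true"
  }
}
```

Whichever suite ends up carrying the empty-string record, the assertion to
keep is the same: a `startsWith("data", "")` filter should return all rows.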