kbendick commented on code in PR #4734:
URL: https://github.com/apache/iceberg/pull/4734#discussion_r869515613
##########
data/src/test/java/org/apache/iceberg/data/TestMetricsRowGroupFilter.java:
##########
@@ -298,17 +298,32 @@ public void testNoNulls() {
@Test
public void testIsNaN() {
+ Assume.assumeTrue("Avro files do not have row group statistics", format !=
FileFormat.AVRO);
+
boolean shouldRead = shouldRead(isNaN("all_nans"));
- Assert.assertTrue("Should read: NaN counts are not tracked in Parquet
metrics", shouldRead);
+ Assert.assertTrue(String.format("Should read: NaN counts are not tracked
in %s metrics", format), shouldRead);
shouldRead = shouldRead(isNaN("some_nans"));
- Assert.assertTrue("Should read: NaN counts are not tracked in Parquet
metrics", shouldRead);
-
- shouldRead = shouldRead(isNaN("no_nans"));
- Assert.assertTrue("Should read: NaN counts are not tracked in Parquet
metrics", shouldRead);
+ Assert.assertTrue(String.format("Should read: NaN counts are not tracked
in %s metrics", format), shouldRead);
shouldRead = shouldRead(isNaN("all_nulls"));
Assert.assertFalse("Should skip: all null column will not contain nan
value", shouldRead);
+
+ if (format == FileFormat.ORC) {
+ testIsNanOrc();
+ } else {
+ testIsNanParquet();
+ }
+ }
+
+ private void testIsNanParquet() {
+ boolean shouldRead = shouldRead(isNaN("no_nans"));
+ Assert.assertTrue("Should read: NaN counts are not tracked in Parquet
metrics", shouldRead);
+ }
+
+ private void testIsNanOrc() {
+ boolean shouldRead = shouldRead(isNaN("no_nans"));
+ Assert.assertFalse("Should skip: no nans will not contain nan value",
shouldRead);
Review Comment:
For the ORC assertion message, it might be more clear if we're a little more
descriptive, given that we've stated that the `NaN counts are not tricked in
orc metrics` already.
Something like `Should skip: The existence of NaN in a column is tracked in
ORC metrics`. WDYT?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]