amogh-jahagirdar commented on code in PR #14081:
URL: https://github.com/apache/iceberg/pull/14081#discussion_r2363551105
##########
spark/v4.0/spark/src/test/java/org/apache/iceberg/spark/SparkTestHelperBase.java:
##########
@@ -79,6 +80,12 @@ protected void assertEquals(String context, Object[] expectedRow, Object[] actua
       } else {
         assertEquals(newContext, (Object[]) expectedValue, (Object[]) actualValue);
       }
+    } else if (expectedValue instanceof VariantVal && actualValue instanceof VariantVal) {
+      // Spark VariantVal comparison is based on raw byte[] comparison, which can fail
+      // if Spark uses trailing null bytes. so, we compare their JSON representation instead.
+      assertThat((actualValue).toString())
+          .as("%s contents should match (VariantVal JSON)", context)
+          .isEqualTo((expectedValue).toString());

Review Comment:
   Discussed offline, I think it's reasonable to compare the JSON representation to determine if the logical variant is equivalent.

##########
data/src/test/java/org/apache/iceberg/data/TestMetricsRowGroupFilter.java:
##########
@@ -988,6 +1004,67 @@ public void testTransformFilter() {
         .isTrue();
   }

+  @TestTemplate
+  public void testVariantFieldMixedValuesNotNull() throws IOException {
+    assumeThat(format).isEqualTo(FileFormat.PARQUET);
+
+    List<GenericRecord> records = Lists.newArrayList();
+    for (int i = 0; i < 10; i++) {
+      GenericRecord record = GenericRecord.create(VARIANT_SCHEMA);
+      record.setField("id", i);
+      if (i % 2 == 0) {
+        VariantMetadata metadata = Variants.metadata("field");
+        ShreddedObject obj = Variants.object(metadata);
+        obj.put("field", Variants.of("value" + i));
+        Variant variant = Variant.of(metadata, obj);
+        record.setField("variant_field", variant);
+      }
+      records.add(record);
+    }
+
+    File parquetFile = writeParquetFile("test-variant", VARIANT_SCHEMA, records);
+    InputFile inFile = Files.localInput(parquetFile);
+    try (ParquetFileReader reader = ParquetFileReader.open(parquetInputFile(inFile))) {
+      BlockMetaData blockMetaData = reader.getRowGroups().get(0);
+      MessageType fileSchema = reader.getFileMetaData().getSchema();

Review Comment:
   Non-blocking nit: I also feel like we can abstract reading the parquet block metadata/schema in a helper too.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
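The first review comment turns on the fact that raw `byte[]` equality can distinguish two buffers that represent the same logical value when one carries trailing null bytes. A minimal stand-alone sketch of that failure mode (the byte arrays and the `trimTrailingNulls` helper are made up for illustration; they are not the real variant encoding or Spark API):

```java
import java.util.Arrays;

public class TrailingNullDemo {
  // Hypothetical canonicalization: strip trailing null bytes.
  static byte[] trimTrailingNulls(byte[] bytes) {
    int end = bytes.length;
    while (end > 0 && bytes[end - 1] == 0) {
      end--;
    }
    return Arrays.copyOf(bytes, end);
  }

  public static void main(String[] args) {
    // Two buffers for the same logical value; one is padded with a trailing null byte.
    byte[] plain = {0x01, 0x02, 0x03};
    byte[] padded = {0x01, 0x02, 0x03, 0x00};

    // Raw byte comparison reports a difference...
    System.out.println(Arrays.equals(plain, padded)); // false

    // ...but comparing a canonical form does not. The PR achieves the same
    // effect by comparing the VariantVal JSON (toString) representations.
    System.out.println(Arrays.equals(plain, trimTrailingNulls(padded))); // true
  }
}
```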
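The second comment suggests abstracting the row-group metadata/schema reading into a helper. One possible shape for that helper, assuming the test class's existing `parquetInputFile` conversion from an Iceberg `InputFile` to a Parquet one (the helper name, record, and placement are hypothetical suggestions, not part of the PR):

```java
import java.io.IOException;
import org.apache.iceberg.io.InputFile;
import org.apache.parquet.hadoop.ParquetFileReader;
import org.apache.parquet.hadoop.metadata.BlockMetaData;
import org.apache.parquet.schema.MessageType;

// Hypothetical helper for TestMetricsRowGroupFilter and similar tests.
final class ParquetRowGroupTestUtil {

  // Bundles the first row group's metadata with the Parquet file schema.
  record FirstRowGroup(BlockMetaData blockMetaData, MessageType fileSchema) {}

  static FirstRowGroup readFirstRowGroup(InputFile inFile) throws IOException {
    // parquetInputFile(..) is the existing test helper that adapts the
    // Iceberg InputFile to Parquet's InputFile interface.
    try (ParquetFileReader reader = ParquetFileReader.open(parquetInputFile(inFile))) {
      // Row-group metadata and the schema stay valid after the reader closes.
      return new FirstRowGroup(
          reader.getRowGroups().get(0), reader.getFileMetaData().getSchema());
    }
  }

  private ParquetRowGroupTestUtil() {}
}
```

Each test body then shrinks to a single `readFirstRowGroup(inFile)` call instead of repeating the try-with-resources boilerplate.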