rdblue commented on code in PR #15087:
URL: https://github.com/apache/iceberg/pull/15087#discussion_r2709506747
##########
parquet/src/test/java/org/apache/iceberg/parquet/TestVariantWriters.java:
##########
@@ -280,4 +283,61 @@ private static ValueArray array(VariantValue... values) {
return arr;
}
+
+ @Test
+ public void testPartialShreddingWithShreddedObject() throws IOException {
+ // Test for issue #15086: partial shredding with ShreddedObject created using put()
+ // Create a ShreddedObject with multiple fields, then partially shred it
+ VariantMetadata metadata = Variants.metadata("id", "name", "city");
+
+ // Create objects using ShreddedObject.put() instead of serialized buffers
+ List<Record> records = Lists.newArrayList();
+ for (int i = 0; i < 3; i++) {
+ ShreddedObject obj = Variants.object(metadata);
+ obj.put("id", Variants.of(1000L + i));
+ obj.put("name", Variants.of("user_" + i));
+ obj.put("city", Variants.of("city_" + i));
+
+ Variant variant = Variant.of(metadata, obj);
+ Record record = RECORD.copy("id", i, "var", variant);
+ records.add(record);
+ }
+
+ // Shredding function that only shreds the "id" field
+ VariantShreddingFunction partialShredding =
+ (id, name) ->
+ org.apache.parquet.schema.Types.optionalGroup()
+ .addField(
+ org.apache.parquet.schema.Types.optionalGroup()
+ .addField(
+ org.apache.parquet.schema.Types.optional(
+ PrimitiveType.PrimitiveTypeName.BINARY)
+ .named("value"))
+ .addField(
+ org.apache.parquet.schema.Types.optional(
+ PrimitiveType.PrimitiveTypeName.INT64)
+ .named("typed_value"))
+ .named("id"))
+ .named("typed_value");
+
+ // Write and read back
+ List<Record> actual = writeAndRead(partialShredding, records);
Review Comment:
I think an end-to-end test should live in `TestVariantReaders` rather than
`TestVariantWriters`. The problem you're trying to recreate is in the read path,
when a partially shredded object is being reconstructed, so the tests should
exercise the read path as well.
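
To illustrate what the read path has to do here, below is a minimal, self-contained sketch of reconstructing a partially shredded object. It does not use the Iceberg or Parquet APIs: `PartialShreddingSketch` and `reconstruct` are hypothetical names, plain maps stand in for the shredded `typed_value` fields and the residual `value` fields, and letting a shredded value take precedence over a residual one is an assumption of this sketch.

```java
import java.util.Map;
import java.util.TreeMap;

public class PartialShreddingSketch {

  /**
   * Hypothetical helper (not an Iceberg API): rebuilds an object from the
   * fields that were shredded into typed columns plus the fields that were
   * left in the residual, unshredded value.
   */
  public static Map<String, Object> reconstruct(
      Map<String, Object> shreddedFields, Map<String, Object> residualFields) {
    // Start from the unshredded fields, kept in sorted key order.
    Map<String, Object> result = new TreeMap<>(residualFields);
    for (Map.Entry<String, Object> field : shreddedFields.entrySet()) {
      // A null here models a shredded column with no value for this row.
      if (field.getValue() != null) {
        // Assumption of this sketch: the shredded value wins on a key clash.
        result.put(field.getKey(), field.getValue());
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // Only "id" is shredded, mirroring the partial shredding function above.
    Map<String, Object> shredded = new TreeMap<>();
    shredded.put("id", 1000L);

    // "name" and "city" stay in the residual value.
    Map<String, Object> residual = new TreeMap<>();
    residual.put("name", "user_0");
    residual.put("city", "city_0");

    System.out.println(reconstruct(shredded, residual));
  }
}
```

A read-path test would assert that all three fields come back after reconstruction, which is the behavior a writer-side test cannot isolate.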
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]