mbutrovich commented on code in PR #2188:
URL: https://github.com/apache/iceberg-rust/pull/2188#discussion_r3326706646


##########
crates/iceberg/src/writer/file_writer/parquet_writer.rs:
##########
@@ -191,20 +207,11 @@ impl SchemaVisitor for IndexByParquetPathName {
     }
 
     fn primitive(&mut self, _p: &PrimitiveType) -> Result<Self::T> {
-        let full_name = self.field_names.iter().map(String::as_str).join(".");
-        let field_id = self.field_id;
-        if let Some(existing_field_id) = self.name_to_id.get(full_name.as_str()) {
-            return Err(Error::new(
-                ErrorKind::DataInvalid,
-                format!(
-                    "Invalid schema: multiple fields for name {full_name}: {field_id} and {existing_field_id}"
-                ),
-            ));
-        } else {
-            self.name_to_id.insert(full_name, field_id);
-        }
+        self.insert_current_path()
+    }
 
-        Ok(())
+    fn variant(&mut self, _v: &VariantType) -> Result<Self::T> {

Review Comment:
   Not really a comment on this line, but: iceberg-java's 
`TypeToMessageType#variant` writes the Parquet group with 
`LogicalTypeAnnotation.variantType(VARIANT_SPEC_VERSION)`. The Rust write path 
here doesn't add that annotation, so files written by iceberg-rust would carry 
a plain `Struct(Binary, Binary)` without the variant logical type marker. The 
integration tests only read Spark-written data, so they never exercise this 
write path and the gap isn't caught. Worth a tracking issue, or is this already 
on the roadmap?
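
To make the gap concrete, here is a minimal stdlib-only sketch of the difference between an annotated and an unannotated variant group. `Node`, `Annotation`, `variant_group`, and `is_variant` are hypothetical names invented for illustration; they are not the `parquet` crate's API or anything in this PR, and the spec version a caller passes is a placeholder rather than the actual value of `VARIANT_SPEC_VERSION`.

```rust
// Hypothetical model of a Parquet schema node, for illustration only.
// These types are NOT the `parquet` crate's API.

#[derive(Debug, Clone, PartialEq)]
enum Annotation {
    /// Variant logical type marker, carrying a spec version.
    Variant { spec_version: u8 },
}

#[derive(Debug, Clone, PartialEq)]
enum Node {
    Binary {
        name: String,
    },
    Group {
        name: String,
        annotation: Option<Annotation>,
        fields: Vec<Node>,
    },
}

/// Builds the two-binary-field group (`metadata`, `value`) for a variant
/// column, with or without the logical type annotation.
fn variant_group(name: &str, annotation: Option<Annotation>) -> Node {
    Node::Group {
        name: name.to_string(),
        annotation,
        fields: vec![
            Node::Binary { name: "metadata".to_string() },
            Node::Binary { name: "value".to_string() },
        ],
    }
}

/// Returns true only when the node is a group carrying the variant
/// annotation. A plain group with the same two binary fields returns false.
fn is_variant(node: &Node) -> bool {
    matches!(
        node,
        Node::Group { annotation: Some(Annotation::Variant { .. }), .. }
    )
}
```

In this model, `variant_group("v", None)` has the same shape as the annotated group but `is_variant` rejects it. That is the situation a reader keying off the logical type would likely face with files written by the current Rust path. A write-then-read round-trip test asserting the annotation would be one way to catch it.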



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

