jecsand838 commented on code in PR #8274:
URL: https://github.com/apache/arrow-rs/pull/8274#discussion_r2322622541


##########
arrow-avro/src/schema.rs:
##########
@@ -370,6 +371,49 @@ impl AvroSchema {
     pub fn fingerprint(&self) -> Result<Fingerprint, ArrowError> {
         generate_fingerprint_rabin(&self.schema()?)
     }
+
+    /// Build Avro JSON from an Arrow [`ArrowSchema`], applying the given null‑union order.
+    ///
+    /// If the input Arrow schema already contains Avro JSON in
+    /// [`SCHEMA_METADATA_KEY`], that JSON is returned verbatim to preserve
+    ///  the exact header encoding alignment; otherwise, a new JSON is generated
+    /// honoring `null_union_order` at **all nullable sites**.
+    pub fn from_arrow_with_options(
+        schema: &ArrowSchema,
+        null_union_order: Option<Nullability>,
+    ) -> Result<AvroSchema, ArrowError> {
+        if let Some(json) = schema.metadata.get(SCHEMA_METADATA_KEY) {
+            return Ok(AvroSchema::new(json.clone()));
+        }
+        let order = null_union_order.unwrap_or(Nullability::NullFirst);
+        let mut name_gen = NameGenerator::default();
+        let fields_json = schema
+            .fields()
+            .iter()
+            .map(|f| arrow_field_to_avro_with_order(f, &mut name_gen, order))
+            .collect::<Result<Vec<_>, _>>()?;
+        let record_name = schema
+            .metadata
+            .get(AVRO_NAME_METADATA_KEY)
+            .map_or("topLevelRecord", |s| s.as_str());

Review Comment:
   > aside: Is this a well-known default name? Or just an arbitrary naming choice by this package?
   
   `topLevelRecord` is not a default defined by the Avro spec, but it is a de‑facto default: several popular tools (notably Spark/Databricks) use the same name when they synthesize an Avro schema from a struct/row.
   
   > And does it actually matter in practice? (I guess if it mattered, the schema metadata would say so)?
   
   Avro requires every record to have a name to be valid. Also, because the record name is part of the Parsing Canonical Form, changing a record's name changes its fingerprint.
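
   To illustrate the fingerprint point, here is a minimal standalone sketch. It is not the arrow-avro implementation: `build_table` and `fingerprint64` are hypothetical helpers that follow the CRC-64-AVRO (Rabin) algorithm as described in the Avro specification, and the two canonical-form strings are written by hand as an assumption rather than generated by this crate.

```rust
/// Seed from the Avro specification; also the fingerprint of empty input.
const EMPTY: u64 = 0xc15d_213a_a4d7_a795;

/// Build the 256-entry lookup table for CRC-64-AVRO (hypothetical helper).
fn build_table() -> [u64; 256] {
    let mut table = [0u64; 256];
    for (i, slot) in table.iter_mut().enumerate() {
        let mut fp = i as u64;
        for _ in 0..8 {
            // XOR in EMPTY only when the low bit is set.
            fp = (fp >> 1) ^ (EMPTY & (fp & 1).wrapping_neg());
        }
        *slot = fp;
    }
    table
}

/// 64-bit Rabin fingerprint over the bytes of a canonical-form schema.
fn fingerprint64(data: &[u8]) -> u64 {
    let table = build_table();
    data.iter().fold(EMPTY, |fp, &b| {
        (fp >> 8) ^ table[((fp ^ u64::from(b)) & 0xff) as usize]
    })
}

fn main() {
    // Hand-written canonical forms (assumed); they differ only in the record name.
    let default_name =
        br#"{"name":"topLevelRecord","type":"record","fields":[{"name":"id","type":"long"}]}"#;
    let custom_name =
        br#"{"name":"MyRecord","type":"record","fields":[{"name":"id","type":"long"}]}"#;

    let a = fingerprint64(default_name);
    let b = fingerprint64(custom_name);
    println!("topLevelRecord: {a:016x}");
    println!("MyRecord:       {b:016x}");

    // Same fields, different record name, different fingerprint.
    assert_ne!(a, b);
}
```

   Since the name is hashed along with everything else in the canonical form, a writer and a reader that disagree only on the record name would still end up with different fingerprints, which is why the fallback name has to be stable.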



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
