nathaniel-d-ef commented on code in PR #8492:
URL: https://github.com/apache/arrow-rs/pull/8492#discussion_r2392752743


##########
arrow-avro/src/codec.rs:
##########
@@ -208,7 +209,15 @@ impl AvroDataType {
 
     /// Returns an arrow [`Field`] with the given name
     pub fn field_with_name(&self, name: &str) -> Field {

Review Comment:
   Pub(crate) here?



##########
arrow-avro/test/data/README.md:
##########
@@ -268,6 +268,91 @@ Options:
 **Source / Repro script:**
 `create_avro_union_file.py` (Gist): contains the full writer schema, record 
builders covering four rows, and the `fastavro.writer` call which emits 
`union_fields.avro`.
 
+## Comprehensive E2E Coverage File
+
+**Purpose:** A single OCF that exercises **all decoder paths** used by 
`arrow-avro` with both **nested and non‑nested** shapes, including **dense 
unions** (null‑first, null‑second, multi‑branch), **aliases** (type and field), 
**default values**, **docs** and **namespaces**, and combinations thereof. It’s 
intended to validate the final `Reader` implementation and to stress 
schema‑resolution behavior in the tests under `arrow-avro/src/reader/mod.rs`.

Review Comment:
   Epic 💪 



##########
arrow-avro/src/codec.rs:
##########
@@ -1011,25 +1018,48 @@ impl<'a> Resolver<'a> {
     }
 }
 
+fn full_name_set(name: &str, ns: Option<&str>, aliases: &[&str]) -> 
HashSet<String> {
+    let mut out = HashSet::with_capacity(1 + aliases.len());
+    let (full, _) = make_full_name(name, ns, None);
+    out.insert(full);
+    for a in aliases {
+        let (fa, _) = make_full_name(a, None, ns);
+        out.insert(fa);
+    }
+    out
+}
+
 fn names_match(
     writer_name: &str,
+    writer_namespace: Option<&str>,
     writer_aliases: &[&str],
     reader_name: &str,
+    reader_namespace: Option<&str>,
     reader_aliases: &[&str],
 ) -> bool {
-    writer_name == reader_name
-        || reader_aliases.contains(&writer_name)
-        || writer_aliases.contains(&reader_name)
+    let writer_set = full_name_set(writer_name, writer_namespace, 
writer_aliases);
+    let reader_set = full_name_set(reader_name, reader_namespace, 
reader_aliases);
+    // If the canonical full names match, or any alias matches cross-wise.
+    !writer_set.is_disjoint(&reader_set)
 }
 
 fn ensure_names_match(
     data_type: &str,
     writer_name: &str,

Review Comment:
   This could be a little DRYer where the name, namespace, alias are 
abstracted. Not a big deal though.



##########
arrow-avro/src/schema.rs:
##########
@@ -208,70 +201,81 @@ pub enum ComplexType<'a> {
 ///
 /// <https://avro.apache.org/docs/1.11.1/specification/#schema-record>
 #[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
-pub struct Record<'a> {
+pub(crate) struct Record<'a> {
     /// Name of the record
     #[serde(borrow)]
-    pub name: &'a str,
+    pub(crate) name: &'a str,
     /// Optional namespace for the record, provides a way to organize names
     #[serde(borrow, default)]
-    pub namespace: Option<&'a str>,
+    pub(crate) namespace: Option<&'a str>,
     /// Optional documentation string for the record
     #[serde(borrow, default)]
-    pub doc: Option<&'a str>,
+    pub(crate) doc: Option<Cow<'a, str>>,
     /// Alternative names for this record
     #[serde(borrow, default)]
-    pub aliases: Vec<&'a str>,
+    pub(crate) aliases: Vec<&'a str>,
     /// The fields contained in this record
     #[serde(borrow)]
-    pub fields: Vec<Field<'a>>,
+    pub(crate) fields: Vec<Field<'a>>,
     /// Additional attributes for this record
     #[serde(flatten)]
-    pub attributes: Attributes<'a>,
+    pub(crate) attributes: Attributes<'a>,
+}
+
+fn deserialize_default<'de, D>(deserializer: D) -> Result<Option<Value>, 
D::Error>
+where
+    D: serde::Deserializer<'de>,
+{
+    Value::deserialize(deserializer).map(Some)

Review Comment:
   This is a great candidate for a native method IMO, good solution



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to