scovich commented on code in PR #8349: URL: https://github.com/apache/arrow-rs/pull/8349#discussion_r2368634667
##########
arrow-avro/src/reader/record.rs:
##########
@@ -486,6 +734,70 @@ impl Decoder {
             Self::Decimal256(_, _, _, builder) => builder.append_value(i256::ZERO),
             Self::Enum(indices, _, _) => indices.push(0),
             Self::Duration(builder) => builder.append_null(),
+            Self::Union(fields, type_ids, offsets, encodings, encoding_counts, None) => {
+                let mut chosen = None;
+                for (i, ch) in encodings.iter().enumerate() {
+                    if matches!(ch, Decoder::Null(_)) {
+                        chosen = Some(i);
+                        break;
+                    }
+                }
+                let idx = chosen.unwrap_or(0);
+                let type_id = fields
+                    .iter()
+                    .nth(idx)
+                    .map(|(type_id, _)| type_id)
+                    .unwrap_or_else(|| i8::try_from(idx).unwrap_or(0));
+                type_ids.push(type_id);
+                offsets.push(encoding_counts[idx]);
+                encodings[idx].append_null();
+                encoding_counts[idx] += 1;
+            }
+            Self::Union(
+                fields,
+                type_ids,
+                offsets,
+                encodings,
+                encoding_counts,
+                Some(union_resolution),
+            ) => match &mut union_resolution.kind {
+                UnionResolvedKind::Both { .. } => {
+                    let mut chosen = None;
+                    for (i, ch) in encodings.iter().enumerate() {
+                        if matches!(ch, Decoder::Null(_)) {
+                            chosen = Some(i);
+                            break;
+                        }
+                    }
+                    let idx = chosen.unwrap_or(0);
+                    let type_id = fields
+                        .iter()
+                        .nth(idx)
+                        .map(|(type_id, _)| type_id)
+                        .unwrap_or_else(|| i8::try_from(idx).unwrap_or(0));

Review Comment:
   I agree we shouldn't conflate Avro branch index with Arrow union type code... but (how) did all this shake out in the latest revisions? I'm not confident I understand this bit.
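
   To make the concern concrete, here is a minimal sketch of the distinction, not the PR's actual code. It assumes a plain slice of `(type_code, name)` pairs stored in Avro branch order as a stand-in for the union's fields; both helper names are hypothetical. The first helper only ever returns a declared type code, while the second mirrors the fallback in the hunk above and can hand back the branch index itself.

```rust
/// Look up the Arrow union type code for an Avro branch index.
/// `fields` is a hypothetical stand-in for the union's fields: one
/// `(type_code, name)` pair per branch, stored in Avro branch order.
/// Returns `None` for an out-of-range index instead of guessing a code.
fn type_id_for_branch(fields: &[(i8, &str)], branch_idx: usize) -> Option<i8> {
    fields.get(branch_idx).map(|(type_id, _)| *type_id)
}

/// Mirrors the fallback pattern in the hunk above, for comparison only:
/// when the positional lookup fails, the branch index is reused as a type code.
fn type_id_with_index_fallback(fields: &[(i8, &str)], branch_idx: usize) -> i8 {
    fields
        .iter()
        .nth(branch_idx)
        .map(|(type_id, _)| *type_id)
        .unwrap_or_else(|| i8::try_from(branch_idx).unwrap_or(0))
}
```

   With declared type codes 5, 7, and 9, branch index 1 maps to code 7 in both helpers. An out-of-range index 3 gives `None` from the first helper but 3 from the second, and 3 is not a code the union declares. Whether that fallback path is reachable in practice is part of what I'm asking.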