alamb commented on code in PR #6307:
URL: https://github.com/apache/arrow-datafusion/pull/6307#discussion_r1189233014


##########
datafusion/core/src/avro_to_arrow/arrow_array_reader.rs:
##########
@@ -860,13 +862,14 @@ fn flatten_string_values(values: &[&Value]) -> Vec<Option<String>> {
 /// Reads an Avro value as a string, regardless of its type.
 /// This is useful if the expected datatype is a string, in which case we preserve
 /// all the values regardless of they type.
-fn resolve_string(v: &Value) -> ArrowResult<String> {
+fn resolve_string(v: &Value) -> ArrowResult<Option<String>> {
     let v = if let Value::Union(_, b) = v { b } else { v };
     match v {
-        Value::String(s) => Ok(s.clone()),
-        Value::Bytes(bytes) => {
-            String::from_utf8(bytes.to_vec()).map_err(AvroError::ConvertToUtf8)
-        }
+        Value::String(s) => Ok(Some(s.clone())),
+        Value::Bytes(bytes) => String::from_utf8(bytes.to_vec())
+            .map_err(AvroError::ConvertToUtf8)
+            .map(Some),
+        Value::Null => Ok(None),

Review Comment:
   Looks reasonable to me. 👍 
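
   For readers without the full file at hand, here is a minimal, std-only sketch of the pattern this hunk introduces: a null maps to `Ok(None)` rather than to an error or an empty string. The `Value` enum below is a hypothetical, reduced stand-in for the Avro value type, the error type is simplified to `String`, and the final catch-all arm is an assumption, since the quoted hunk is cut off after the `Value::Null` arm.

```rust
// Hypothetical stand-in for the Avro `Value` type, reduced to the
// variants needed for this illustration. Not the real library type.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    String(String),
    Bytes(Vec<u8>),
    Union(u32, Box<Value>),
    Int(i32),
}

// Sketch of the new behavior: string-like values become `Some(..)`,
// nulls become `None`. The error type is simplified to `String`.
fn resolve_string(v: &Value) -> Result<Option<String>, String> {
    // Unwrap one level of union so that nullable fields are handled too.
    let v = if let Value::Union(_, inner) = v { &**inner } else { v };
    match v {
        Value::String(s) => Ok(Some(s.clone())),
        Value::Bytes(bytes) => String::from_utf8(bytes.clone())
            .map(Some)
            .map_err(|e| format!("invalid UTF-8: {e}")),
        Value::Null => Ok(None),
        // Assumed fallback; the quoted hunk does not show this arm.
        other => Err(format!("expected a string-like value, got {other:?}")),
    }
}

fn main() {
    let inputs = vec![
        Value::String("abc".to_string()),
        Value::Bytes(vec![104, 105]),
        Value::Null,
        Value::Union(0, Box::new(Value::Null)),
        Value::Int(7),
    ];
    for v in &inputs {
        println!("{:?} -> {:?}", v, resolve_string(v));
    }
}
```

   Returning `Option<String>` lets the caller record a null in the output array instead of failing or silently substituting an empty string.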



##########
datafusion/core/src/datasource/file_format/avro.rs:
##########
@@ -350,6 +393,48 @@ mod tests {
         Ok(())
     }
 
+    #[tokio::test]
+    async fn read_null_binary_alltypes_plain_avro() -> Result<()> {
+        let session_ctx = SessionContext::new();
+        let state = session_ctx.state();
+        let task_ctx = state.task_ctx();
+        let projection = Some(vec![6]);
+        let exec =
+            get_exec(&state, "alltypes_nulls_plain.avro", projection, None).await?;
+
+        let batches = collect(exec, task_ctx).await?;

Review Comment:
   It might also be worth checking out the `assert_batches_eq!` macro 
(https://docs.rs/datafusion/latest/datafusion/macro.assert_batches_eq.html) 
to verify the rows and columns in a way that is easier to maintain.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
