jonmmease opened a new issue, #5095:
URL: https://github.com/apache/arrow-rs/issues/5095

   **Describe the bug**
   It looks like the `coerce_primitive` configuration option on `ReaderBuilder` 
is not honored when decoding from serde-compatible objects like 
`serde_json::Value`.
   
   **To Reproduce**
   Add the following test to `arrow-json/src/reader/mod.rs` to trigger the error:
   
   ```rust
       #[test]
       fn test_coercing_primitive_into_string_decoder() {
           let buf = r#"[
               {"a": 1, "b": "A", "c": "T"},
               {"a": 2, "b": "BB", "c": "F"},
               {"a": 3, "b": 123, "c": false}
           ]"#;
           let schema = Schema::new(vec![
               Field::new("a", DataType::Float64, true),
               Field::new("b", DataType::Utf8, true),
               Field::new("c", DataType::Utf8, true),
           ]);
           let json_array: Vec<serde_json::Value> = serde_json::from_str(buf).unwrap();
   
           let schema_ref = Arc::new(schema);
   
           // read record batches
           let reader = ReaderBuilder::new(schema_ref.clone()).with_coerce_primitive(true);
           let mut decoder = reader.build_decoder().unwrap();
           decoder.serialize(json_array.as_slice()).unwrap();
           let batch = decoder.flush().unwrap().unwrap();
           println!("{:?}", batch);
       }
   ```
   ```
   called `Result::unwrap()` on an `Err` value: JsonError("whilst decoding field 'b': expected string got 123")
   thread 'reader::tests::test_coercing_primitive_into_string_decoder' panicked at 'called `Result::unwrap()` on an `Err` value: JsonError("whilst decoding field 'b': expected string got 123")', arrow-json/src/reader/mod.rs:2284:37
   stack backtrace:
      0: rust_begin_unwind
                at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/std/src/panicking.rs:593:5
      1: core::panicking::panic_fmt
                at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/core/src/panicking.rs:67:14
      2: core::result::unwrap_failed
                at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/core/src/result.rs:1651:5
      3: core::result::Result<T,E>::unwrap
                at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/core/src/result.rs:1076:23
      4: arrow_json::reader::tests::test_coercing_primitive_into_string_decoder
                at ./src/reader/mod.rs:2284:21
      5: arrow_json::reader::tests::test_coercing_primitive_into_string_decoder::{{closure}}
                at ./src/reader/mod.rs:2264:54
      6: core::ops::function::FnOnce::call_once
                at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/core/src/ops/function.rs:250:5
      7: core::ops::function::FnOnce::call_once
                at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/core/src/ops/function.rs:250:5
   note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
   ```
   
   **Expected behavior**
   I expect the value `123` in column `b` to be converted to a string.
   
   **Additional context**
   This worked in version 47. I hit the error when updating to DataFusion 
33.0.0 with Arrow 48.0.1, but have also confirmed it on `main`.
   
   My hunch is that this was introduced with https://github.com/apache/arrow-rs/pull/4861, and that some additional logic is needed around here:
   
   https://github.com/apache/arrow-rs/blob/dc75a280b46149140eca8dd5e18d31cbadf04716/arrow-json/src/reader/string_array.rs#L45-L66
   
   to handle coercing the new `TapeElement` variants (`I64`, `I32`, `F64`, and `F32`).
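
A minimal, self-contained sketch of the coercion behavior described above. It does not use arrow-json's real types: `Element` and `to_utf8` are hypothetical names invented for illustration, and the real `TapeElement` is laid out differently, so this only shows the intended behavior (numeric and boolean elements become strings when `coerce_primitive` is set, and are rejected otherwise).

```rust
// Hypothetical, simplified stand-in for the decoder's tape elements.
// NOT the real arrow-json `TapeElement`; values are stored directly here
// purely to keep the illustration short.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Element {
    String(String),
    Number(String), // numeric literal as it appeared in JSON text
    I64(i64),       // integer produced by a serde-style serializer
    I32(i32),
    F64(f64),       // float produced by a serde-style serializer
    F32(f32),
    True,
    False,
    Null,
}

/// Convert one element into an optional string value.
/// Primitive (non-string) elements are accepted only if `coerce_primitive` is true.
fn to_utf8(elem: &Element, coerce_primitive: bool) -> Result<Option<String>, String> {
    match elem {
        Element::String(s) => Ok(Some(s.clone())),
        Element::Null => Ok(None),
        Element::True if coerce_primitive => Ok(Some("true".to_string())),
        Element::False if coerce_primitive => Ok(Some("false".to_string())),
        Element::Number(n) if coerce_primitive => Ok(Some(n.clone())),
        // The arms below correspond to the case this issue reports as unhandled.
        Element::I64(v) if coerce_primitive => Ok(Some(v.to_string())),
        Element::I32(v) if coerce_primitive => Ok(Some(v.to_string())),
        Element::F64(v) if coerce_primitive => Ok(Some(v.to_string())),
        Element::F32(v) if coerce_primitive => Ok(Some(v.to_string())),
        other => Err(format!("expected string got {:?}", other)),
    }
}

fn main() {
    // Column `b` from the reproduction: two strings, then the integer 123.
    let column_b = vec![
        Element::String("A".to_string()),
        Element::String("BB".to_string()),
        Element::I64(123),
    ];
    for elem in &column_b {
        println!("{:?}", to_utf8(elem, true));
    }
}
```

With `coerce_primitive` set to `false`, the same integer element falls through to the final arm and returns an error, which mirrors the failure in the reproduction above.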

