nicklan opened a new issue, #6391:
URL: https://github.com/apache/arrow-rs/issues/6391

   **Describe the bug**
   
   If you use an `arrow_json::ReaderBuilder` to read a JSON file and specify a schema that includes a map whose _values_ are not nullable, you can still read files that have nulls as values in the actual JSON map.
   
   **To Reproduce**
   ```rust
   use std::{fs::File, io::BufReader, sync::Arc};
   
   use arrow::datatypes::{DataType, Field, Schema};
   
   fn main() {
       let schema = Arc::new(Schema::new(vec![
           Field::new("str", DataType::Utf8, false),
           Field::new_map(
               "map",
               "entries",
               Field::new("key", DataType::Utf8, false),
               Field::new("value", DataType::Utf8, false), // value is not nullable
               false, // keys are not sorted
               false  // the map itself is not nullable
           )
       ]));
   
       let file = File::open("test.json").unwrap();
   
       let mut json = arrow_json::ReaderBuilder::new(schema).build(BufReader::new(file)).unwrap();
       let batch = json.next().unwrap().unwrap();
       println!("Batch: {batch:#?}");
   }
   ```
   
   And use this JSON file:
   ```json
   {
     "str": "s",
     "map":  {
       "key": null
     }
   }
   ```
   
   Running produces:
   ```
   Batch: RecordBatch {
       schema: Schema {
           fields: [
               Field {
                   name: "str",
                   data_type: Utf8,
                   nullable: false,
                   dict_id: 0,
                   dict_is_ordered: false,
                   metadata: {},
               },
               Field {
                   name: "map",
                   data_type: Map(
                       Field {
                           name: "entries",
                           data_type: Struct(
                               [
                                   Field {
                                       name: "key",
                                       data_type: Utf8,
                                       nullable: false,
                                       dict_id: 0,
                                       dict_is_ordered: false,
                                       metadata: {},
                                   },
                                   Field {
                                       name: "value",
                                       data_type: Utf8,
                                       nullable: false,
                                       dict_id: 0,
                                       dict_is_ordered: false,
                                       metadata: {},
                                   },
                               ],
                           ),
                           nullable: false,
                           dict_id: 0,
                           dict_is_ordered: false,
                           metadata: {},
                       },
                       false,
                   ),
                   nullable: false,
                   dict_id: 0,
                   dict_is_ordered: false,
                   metadata: {},
               },
           ],
           metadata: {},
       },
       columns: [
           StringArray
           [
             "s",
           ],
           MapArray
           [
             StructArray
           [
           -- child 0: "key" (Utf8)
           StringArray
           [
             "key",
           ]
           -- child 1: "value" (Utf8)
           StringArray
           [
             null,
           ]
           ],
           ],
       ],
       row_count: 1,
   }
   ```
   
   Note that I've included the `str` field so you can easily see that the right thing happens for a top-level non-nullable field. If you change your .json file to
   ```json
   {
     "str": null,
     "map":  {
       "key": null
     }
   }
   ```
   
   You will get:
   ```
   called `Result::unwrap()` on an `Err` value: JsonError("Encountered unmasked nulls in non-nullable StructArray child: Field { name: \"str\", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }")
   
   ```
   
   **Expected behavior**
   
   Reading should fail with an error similar to the one produced when the `str` field is set to null.
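
To make the expected behavior concrete, here is a minimal sketch of the check I would expect to run on every nested child, written as a toy model in plain Rust. The `Field`, `Data`, `utf8`, `map_field`, and `validate` names are hypothetical and are not arrow-rs APIs; the sketch only illustrates the idea that a non-nullable map value should be rejected the same way a non-nullable top-level field is.

```rust
// Toy model in plain Rust. `Field`, `Data`, and `validate` are hypothetical
// stand-ins for illustration; they are not arrow-rs types or functions.

/// A field with a name, a nullability flag, and its column data.
struct Field {
    name: String,
    nullable: bool,
    data: Data,
}

/// Column data: either a flat string column or a map with key/value children.
enum Data {
    Utf8(Vec<Option<String>>),
    Map { key: Box<Field>, value: Box<Field> },
}

/// Build a string field from a slice of optional string slices.
fn utf8(name: &str, nullable: bool, values: &[Option<&str>]) -> Field {
    Field {
        name: name.to_string(),
        nullable,
        data: Data::Utf8(values.iter().map(|v| v.map(|s| s.to_string())).collect()),
    }
}

/// Build a map field that is itself non-nullable, as in the schema above.
fn map_field(name: &str, key: Field, value: Field) -> Field {
    Field {
        name: name.to_string(),
        nullable: false,
        data: Data::Map { key: Box::new(key), value: Box::new(value) },
    }
}

/// Recursively reject nulls in any non-nullable field, including map children.
fn validate(field: &Field) -> Result<(), String> {
    match &field.data {
        Data::Utf8(values) => {
            if !field.nullable && values.iter().any(|v| v.is_none()) {
                return Err(format!("nulls in non-nullable field \"{}\"", field.name));
            }
            Ok(())
        }
        Data::Map { key, value } => {
            // Check both children; the first failure is returned.
            validate(key)?;
            validate(value)
        }
    }
}

fn main() {
    // Mirrors the JSON above: {"map": {"key": null}} with a non-nullable value.
    let map = map_field(
        "map",
        utf8("key", false, &[Some("key")]),
        utf8("value", false, &[None]),
    );
    println!("{:?}", validate(&map));
}
```

In this model the map from the reproduction is rejected because its `value` child contains a null, while the same data with a nullable `value` field would pass.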
   
   

