tustvold opened a new issue, #1681:
URL: https://github.com/apache/arrow-rs/issues/1681

   **Describe the bug**
   
   The schema inference logic in parquet does not infer the correct nullability for nested types.
   
   For example:
   
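   ```
   use std::sync::Arc;

   use parquet::arrow::parquet_to_arrow_schema;
   use parquet::schema::parser::parse_message_type;
   use parquet::schema::types::SchemaDescriptor;

   // Two nested REPEATED groups, neither of them annotated as LIST.
   let message_type = "
   message test_schema {
     OPTIONAL INT32 leaf1;
     REPEATED GROUP outerGroup {
       OPTIONAL INT32 leaf2;
       REPEATED GROUP innerGroup {
         OPTIONAL INT32 leaf3;
       }
     }
   }
   ";
   let parquet_group_type = parse_message_type(message_type).unwrap();
   let parquet_schema = SchemaDescriptor::new(Arc::new(parquet_group_type));
   // Convert to an Arrow schema without any key-value metadata.
   let converted_arrow_schema =
       parquet_to_arrow_schema(&parquet_schema, None).unwrap();
   ```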
   
   This infers `innerGroup` and `outerGroup` as nullable lists with nullable elements, when neither the lists nor their elements are nullable.
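
   As a rough illustration of the expected mapping, the sketch below follows the Parquet format's backward-compatibility rule that a repeated field which is neither LIST/MAP-annotated nor contained in a LIST/MAP-annotated group is a required list of required elements. It is a standalone model rather than the parquet crate's implementation: `ExpectedNullability` and `expected_nullability` are hypothetical names introduced only for this example.

   ```rust
   /// Parquet repetition, as written in the message type above.
   #[derive(Debug, Clone, Copy, PartialEq)]
   pub enum Repetition {
       Required,
       Optional,
       Repeated,
   }

   /// Hypothetical summary of the nullability the Arrow field should get.
   #[derive(Debug, PartialEq)]
   pub struct ExpectedNullability {
       /// Whether the field itself may be null.
       pub nullable: bool,
       /// `Some(n)` if the field maps to a list whose elements have nullability `n`.
       pub element_nullable: Option<bool>,
   }

   /// Expected nullability for a field that is not part of a LIST/MAP-annotated group.
   pub fn expected_nullability(repetition: Repetition) -> ExpectedNullability {
       match repetition {
           Repetition::Required => ExpectedNullability {
               nullable: false,
               element_nullable: None,
           },
           Repetition::Optional => ExpectedNullability {
               nullable: true,
               element_nullable: None,
           },
           // A bare repeated field is a required list of required elements.
           Repetition::Repeated => ExpectedNullability {
               nullable: false,
               element_nullable: Some(false),
           },
       }
   }
   ```

   Under this model, `outerGroup` and `innerGroup` (both `Repetition::Repeated`) map to non-nullable lists with non-nullable elements, while `leaf1`, `leaf2`, and `leaf3` (all `Repetition::Optional`) stay nullable.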
   
   **To Reproduce**
   
   See test
   
   **Expected behavior**
   
   The nullability should be inferred correctly: `innerGroup` and `outerGroup` should be non-nullable lists with non-nullable elements.
   
   **Additional context**
   
   This has likely been hidden by the lack of support for repeated fields: https://github.com/apache/arrow-rs/issues/1680
   


