[GitHub] [arrow-rs] kskalski commented on issue #4814: Null fields are omitted by `infer_json_schema_from_seekable`

via GitHub Wed, 13 Sep 2023 12:31:41 -0700


kskalski commented on issue #4814:
URL: https://github.com/apache/arrow-rs/issues/4814#issuecomment-1718202641


   I think `DataType::Null` is the correct data type in this situation, and it 
is consistent with other inference behavior, e.g.
   ```rust
       const DATA: &str = r#"{"a": 1, "b": "str", "c": null, "d": []}"#;
   ```
   will create a schema with a `d` field whose type is `List(Field { name: 
"item", data_type: Null, nullable: true, dict_id: 0, dict_is_ordered: false, 
metadata: {} })`.
   
   Creating data frames with such a schema is a somewhat different topic. I 
guess not all output formats support null columns; however, there is 
https://docs.rs/arrow/latest/arrow/array/struct.NullArray.html, so:
   * up to a point, the schema could still be used to create / process data
   * if the user wants to use the schema for writing to a format that does not 
allow null columns (e.g. parquet seems to be one, though "Parquet Logical Type 
Definitions" mentions a column of `UNKNOWN` type, which relates to the arrow 
null type), they could modify the schema to assign some other data type with 
the `nullable` option, and this should work fine when parsing data in strict 
mode
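
To illustrate the second point, here is a minimal sketch of the "modify the schema" step. It deliberately does not use the arrow-rs API: `Ty` and `Col` are hypothetical toy stand-ins for `DataType` and `Field`, and `replace_null`, `patch_schema` and `inferred_example` are names made up for this example. The example schema assumes that `c` is inferred as a null type, as proposed above.

```rust
// Toy stand-ins for arrow's `DataType` and `Field`.
// These are NOT the arrow-rs types; they only model the idea.
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Null,
    Int64,
    Utf8,
    List(Box<Ty>),
}

#[derive(Debug, Clone, PartialEq)]
struct Col {
    name: String,
    ty: Ty,
    nullable: bool,
}

/// Replace every `Null` type, including one nested inside a list, with `fallback`.
fn replace_null(ty: &Ty, fallback: &Ty) -> Ty {
    match ty {
        Ty::Null => fallback.clone(),
        Ty::List(item) => Ty::List(Box::new(replace_null(item, fallback))),
        other => other.clone(),
    }
}

/// Rewrite an inferred schema so that no column keeps a `Null` type.
/// A column whose type was changed is marked nullable, because the
/// inference only ever saw null or missing values for it.
fn patch_schema(cols: &[Col], fallback: &Ty) -> Vec<Col> {
    cols.iter()
        .map(|c| {
            let ty = replace_null(&c.ty, fallback);
            let changed = ty != c.ty;
            Col {
                name: c.name.clone(),
                ty,
                nullable: c.nullable || changed,
            }
        })
        .collect()
}

/// Assumed schema shape for `{"a": 1, "b": "str", "c": null, "d": []}`,
/// with `c` inferred as a null type.
fn inferred_example() -> Vec<Col> {
    let col = |name: &str, ty: Ty| Col {
        name: name.to_string(),
        ty,
        nullable: true,
    };
    vec![
        col("a", Ty::Int64),
        col("b", Ty::Utf8),
        col("c", Ty::Null),
        col("d", Ty::List(Box::new(Ty::Null))),
    ]
}
```

Calling `patch_schema(&inferred_example(), &Ty::Utf8)` would leave `a` and `b` untouched and turn `c` into a nullable string column and `d` into a list of strings. With the real crate, the same walk would presumably be done over the schema's fields, but the exact calls are not shown here.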


