Pavol Knapek created ARROW-11978:
------------------------------------
Summary: [Python] Dynamic casting during JSON schema inference
Key: ARROW-11978
URL: https://issues.apache.org/jira/browse/ARROW-11978
Project: Apache Arrow
Issue Type: Improvement
Components: Python
Affects Versions: 3.0.0
Reporter: Pavol Knapek
It would be useful to support opt-in dynamic casting during JSON schema
inference, so that a column whose values change type between rows is promoted
to a common super-type instead of raising an error.
Example input.json file:
{{{"col1": "1"}}}
{{{"col1": 1}}}
Example schema-inference invocation:
{{pyarrow.json.read_json('input.json')}}
Expected output:
{{pyarrow.Table with a schema of \{col1: string}}}
Actual output:
{{ArrowInvalid: JSON parse error: Column(/col1) changed from string to number
in row 1}}
This applies to every DataType that can be converted to a super-type, e.g.:
Integer -> String
Object -> String
Anything -> String
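
To make the requested promotion rule concrete, here is a minimal sketch in plain Python (standard library only). It is not pyarrow's implementation and not an existing pyarrow option; the function names `to_text` and `read_json_lines_with_promotion` are hypothetical. It assumes newline-delimited JSON with flat top-level columns, and it simplifies by promoting any column with more than one observed value type to string (including a mix of integers and floats).

```python
import json


def to_text(value):
    """Render a JSON value as a string; None and strings pass through unchanged."""
    if value is None or isinstance(value, str):
        return value
    # Numbers, booleans, objects and arrays are re-serialized as JSON text.
    return json.dumps(value)


def read_json_lines_with_promotion(lines):
    """Parse newline-delimited JSON into columns, promoting mixed-type columns to string.

    Hypothetical helper illustrating the proposal; not part of pyarrow.
    """
    rows = [json.loads(line) for line in lines if line.strip()]

    # Record which Python types were observed for each column (nulls are ignored).
    seen_types = {}
    for row in rows:
        for name, value in row.items():
            types = seen_types.setdefault(name, set())
            if value is not None:
                types.add(type(value))

    columns = {}
    for name, types in seen_types.items():
        # Rows that lack the column contribute None.
        values = [row.get(name) for row in rows]
        if len(types) > 1:
            # Conflicting types: fall back to the string super-type.
            values = [to_text(v) for v in values]
        columns[name] = values
    return columns


columns = read_json_lines_with_promotion(['{"col1": "1"}', '{"col1": 1}'])
print(columns)  # {'col1': ['1', '1']}
```

The resulting dict of lists could then be handed to {{pyarrow.table(columns)}} to build a table with a string column. This is only a pre-processing workaround, and the proposal itself is for the reader to do the promotion during inference.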