[GitHub] [arrow] nevi-me commented on a change in pull request #9412: ARROW-11491: [Rust] support JSON schema inference for nested list and struct

GitBox Sun, 07 Feb 2021 07:44:01 -0800


nevi-me commented on a change in pull request #9412:
URL: https://github.com/apache/arrow/pull/9412#discussion_r571636351




##########
File path: rust/arrow/test/data/mixed_arrays.json
##########
@@ -1,4 +1,4 @@
-{"a":1, "b":[2.0, 1.3, -6.1], "c":[false, true], "d":4.1}
+{"a":1, "b":[2.0, 1.3, -6.1], "c":[false, true], "d":["4.1"]}
 {"a":-10, "b":[2.0, 1.3, -6.1], "c":null, "d":null}
-{"a":2, "b":[2.0, null, -6.1], "c":[false, null], "d":"text"}
-{"a":3, "b":4, "c": true, "d":[1, false, "array", 2.4]}
+{"a":2, "b":[2.0, null, -6.1], "c":[false, null], "d":["text"]}
+{"a":3, "b":[], "c": [], "d":["array"]}

Review comment:
       @houqp @alamb one downside for now is that we don't have utilities to 
work with JSON data once it has been inferred as, and read into, strings 
(matching Spark's behaviour). For example, we have nothing that converts 
"[true, false]" back into an array of those values.
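
   To illustrate the kind of utility that is missing, here is a minimal 
sketch. `parse_bool_list` is a hypothetical helper, not an existing arrow 
function, and it assumes the string holds a flat JSON array of booleans and 
nulls only (no nesting, no other value types).

```rust
/// Hypothetical helper (NOT part of the arrow crate): parse a flat JSON
/// array of booleans and nulls, e.g. "[true, null, false]", back into values.
/// Returns `None` if the text is not such an array.
fn parse_bool_list(text: &str) -> Option<Vec<Option<bool>>> {
    // Strip the surrounding brackets; bail out if either one is missing.
    let inner = text.trim().strip_prefix('[')?.strip_suffix(']')?.trim();
    if inner.is_empty() {
        return Some(Vec::new());
    }
    // Collecting into `Option<Vec<_>>` fails as a whole on the first bad item.
    inner
        .split(',')
        .map(|item| match item.trim() {
            "true" => Some(Some(true)),
            "false" => Some(Some(false)),
            "null" => Some(None),
            _ => None,
        })
        .collect()
}
```

   A real utility would need a full JSON parser and would have to build 
arrow arrays rather than a `Vec`, which is why this is a cost of the 
string-based approach.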
   
   I suppose schema inference will always involve some compromises. I still 
prefer casting to list, since that loses the least data. If we go with 
different behaviour, we should document the decision on the relevant 
function, so that other users can change it in future if they can motivate 
some other behaviour.
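
   As a rough sketch of what "casting to list" could mean when merging the 
types seen for one field across records (this is not the code in this PR; the 
`Inferred` enum and `merge` function are made up for illustration, and the 
fallback to `Utf8` for conflicting scalars is an assumption):

```rust
/// Hypothetical, simplified type lattice (NOT arrow's `DataType`).
#[derive(Debug, Clone, PartialEq)]
enum Inferred {
    Null,
    Boolean,
    Number,
    Utf8,
    List(Box<Inferred>),
}

/// Merge the types observed for the same field in two records.
fn merge(a: Inferred, b: Inferred) -> Inferred {
    use Inferred::*;
    match (a, b) {
        // A null carries no type information, so keep the other side.
        (Null, other) | (other, Null) => other,
        // Two lists: merge their element types.
        (List(x), List(y)) => List(Box::new(merge(*x, *y))),
        // Scalar next to a list: treat the scalar as a one-element list.
        (List(x), scalar) | (scalar, List(x)) => List(Box::new(merge(*x, scalar))),
        // Identical scalars stay as they are.
        (x, y) if x == y => x,
        // Conflicting scalars: fall back to strings (assumed policy).
        _ => Utf8,
    }
}
```

   Under these assumptions, a field that is `4.1` in one record and 
`["text"]` in another merges to a list of strings, so the scalar is kept as 
an element instead of the whole column being flattened to a string.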




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]