houqp commented on a change in pull request #9412: URL: https://github.com/apache/arrow/pull/9412#discussion_r571374068
##########
File path: rust/arrow/test/data/mixed_arrays.json
##########
@@ -1,4 +1,4 @@
-{"a":1, "b":[2.0, 1.3, -6.1], "c":[false, true], "d":4.1}
+{"a":1, "b":[2.0, 1.3, -6.1], "c":[false, true], "d":["4.1"]}
 {"a":-10, "b":[2.0, 1.3, -6.1], "c":null, "d":null}
-{"a":2, "b":[2.0, null, -6.1], "c":[false, null], "d":"text"}
-{"a":3, "b":4, "c": true, "d":[1, false, "array", 2.4]}
+{"a":2, "b":[2.0, null, -6.1], "c":[false, null], "d":["text"]}
+{"a":3, "b":[], "c": [], "d":["array"]}

Review comment:
@nevi-me I just tested it with Spark. It looks like it does the conversion the other way: when a column contains both scalar and list values, the column is converted to string type. As a result, boolean lists are parsed into the string `"[true, false]"`. I don't have a strong opinion on this, but if we want to match Spark, then we should probably fall back to string when incompatible types are detected. What do you think? @nevi-me @jorgecarleitao @andygrove @alamb
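To make the proposed fallback concrete, here is a rough sketch of what "fall back to string when incompatible types are detected" could look like during schema inference. The `Inferred` enum and the `merge` function are hypothetical names made up for illustration; they are not the arrow crate's `DataType` or its JSON reader API, and the coercion rules are my reading of the behaviour described above, not Spark's actual algorithm.

```rust
/// Simplified stand-in for a column's inferred type.
/// Hypothetical: this is NOT arrow's `DataType`.
#[derive(Debug, Clone, PartialEq)]
enum Inferred {
    Null,
    Boolean,
    Int64,
    Float64,
    Utf8,
    List(Box<Inferred>),
}

/// Merge the types observed for one column in two different rows.
/// Assumed rule: any pair that cannot be reconciled becomes Utf8.
fn merge(a: &Inferred, b: &Inferred) -> Inferred {
    use Inferred::*;
    match (a, b) {
        // Identical types need no coercion.
        (x, y) if x == y => x.clone(),
        // A null value is compatible with any other type.
        (Null, other) | (other, Null) => other.clone(),
        // Integers widen to floats.
        (Int64, Float64) | (Float64, Int64) => Float64,
        // Two lists: reconcile their element types recursively.
        (List(x), List(y)) => List(Box::new(merge(x, y))),
        // Everything else, including scalar vs. list, falls back to string.
        _ => Utf8,
    }
}
```

With this rule, a column like `"b"` in the original test file (a float list in one row, the scalar `4` in another) would be inferred as a string column instead of being coerced to a list.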