shashbha14 opened a new pull request, #49177:
URL: https://github.com/apache/arrow/pull/49177

   Fixes #49158
   
   The issue: when you provide an explicit schema to the JSON parser, it errors 
if JSON types don't exactly match schema types, even when conversion is 
straightforward.
   
   For example, if you have:
   {"_id": "152934"}
   {"_id": 152934}
   
   And your schema says `_id` should be string, it fails on row 1 with 
"Column changed from string to number" instead of converting 152934 to 
"152934".
   
   I fixed this by making the parser attempt type conversion when an explicit 
schema is provided. Before erroring on a type mismatch, it checks if we have an 
explicit schema and tries to convert the value to match the expected type.
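
   The order of checks described above can be sketched roughly as follows. This is a simplified, self-contained illustration, not the actual Arrow code: `JsonKind`, `AppendResult`, `KindName` and `AppendScalarSketch` are hypothetical names, and the `convertible` flag stands in for whatever the real conversion helper decides.

```cpp
#include <string>

// Hypothetical stand-ins for illustration; these are not Arrow's real types.
enum class JsonKind { kString, kNumber, kBoolean };

struct AppendResult {
  bool ok;            // true when the value can be appended
  bool converted;     // true when a conversion was needed to append it
  std::string error;  // non-empty only when ok is false
};

const char* KindName(JsonKind kind) {
  switch (kind) {
    case JsonKind::kString: return "string";
    case JsonKind::kNumber: return "number";
    case JsonKind::kBoolean: return "boolean";
  }
  return "unknown";
}

// Exact match first, then a conversion attempt only when an explicit schema
// was supplied, and only then the type-mismatch error.
AppendResult AppendScalarSketch(JsonKind expected, JsonKind actual,
                                bool has_explicit_schema, bool convertible) {
  if (expected == actual) return {true, false, ""};
  if (has_explicit_schema && convertible) return {true, true, ""};
  return {false, false,
          std::string("Column changed from ") + KindName(expected) + " to " +
              KindName(actual)};
}
```

   Without an explicit schema the sketch falls through to the same error as before, which is the backward-compatibility property the change relies on.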
   
   Changes:
   - Store explicit_schema in HandlerBase so we can access it during parsing
   - Modified AppendScalar() to try conversion before erroring when explicit 
schema exists
   - Added TryConvertAndAppend() helper that handles the conversion logic
   - Updated Bool() handler to also support conversion
   - Added tests for number->string and string->number cases
   
   Conversions that work now:
   - Number -> String (152934 -> "152934")
   - String -> Number (when the string is numeric)
   - Boolean conversions to/from string and number
   - Number -> Boolean (0 is false, non-zero is true)
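
   As a rough illustration of the conversion rules listed above, here is a minimal standalone sketch (C++17). All function names are hypothetical and do not come from the PR. It handles integers only, and the "true"/"false" spellings for boolean strings are an assumption, since the description does not say which spellings are accepted.

```cpp
#include <charconv>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>

// Number -> String, e.g. 152934 -> "152934".
std::string NumberToString(std::int64_t value) { return std::to_string(value); }

// String -> Number: succeeds only when the whole string is an integer literal.
std::optional<std::int64_t> StringToNumber(std::string_view text) {
  std::int64_t value = 0;
  const char* first = text.data();
  const char* last = first + text.size();
  auto [ptr, ec] = std::from_chars(first, last, value);
  // Reject parse errors and trailing characters such as "15x".
  if (ec != std::errc() || ptr != last) return std::nullopt;
  return value;
}

// Number -> Boolean: 0 is false, non-zero is true.
bool NumberToBool(std::int64_t value) { return value != 0; }

// Boolean -> String; the spelling is an assumption for this sketch.
std::string BoolToString(bool value) { return value ? "true" : "false"; }

// String -> Boolean; accepts only the assumed spellings above.
std::optional<bool> StringToBool(std::string_view text) {
  if (text == "true") return true;
  if (text == "false") return false;
  return std::nullopt;
}
```

   Returning `std::optional` lets the caller fall back to the original type-mismatch error when a string is not numeric, matching the "when the string is numeric" condition.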
   
   This only happens when an explicit schema is provided, so it's backward 
compatible. All existing tests still pass.
   

