hvanhovell opened a new pull request, #38720:
URL: https://github.com/apache/spark/pull/38720

   ### What changes were proposed in this pull request?
   The Arrow collect code path for Spark Connect contains a bug that makes it 
always fall back to JSON. The cause was the assumption that `NonFatal(e)` does 
not match nulls; unfortunately, it does. This PR fixes the bug by adding explicit 
null checks and by reordering the checks in 
`SparkConnectStreamHandler.processAsArrowBatches`.
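
   To illustrate the pitfall, here is a minimal standalone sketch. It is not the code from `SparkConnectStreamHandler`; the object and the `buggyClassify` / `fixedClassify` helpers are hypothetical names used only for this example, and it assumes Scala 2.12/2.13 semantics. `NonFatal.unapply` is invoked even when the scrutinee is `null` and returns `Some(null)`, so a `NonFatal(e)` case placed before a null check captures the "no error" case as well.

```scala
import scala.util.control.NonFatal

object NonFatalNullDemo {
  // Hypothetical helper: the NonFatal case comes first, so a null
  // ("no error") value is wrongly reported as an error.
  def buggyClassify(error: Throwable): String = error match {
    case NonFatal(_) => "error" // also taken when error == null
    case null        => "ok"    // never reached for null
    case _           => "fatal"
  }

  // Hypothetical helper: the explicit null check runs first, which mirrors
  // the "explicit null checks and reordering" idea described above.
  def fixedClassify(error: Throwable): String = error match {
    case null        => "ok"
    case NonFatal(_) => "error"
    case _           => "fatal"
  }

  def main(args: Array[String]): Unit = {
    println(buggyClassify(null))                       // error
    println(fixedClassify(null))                       // ok
    println(fixedClassify(new RuntimeException("x")))  // error
    println(fixedClassify(new InterruptedException())) // fatal
  }
}
```

   In the buggy ordering a `null` error always takes the error branch, which is consistent with the "always falls back" symptom; checking for `null` first restores the intended behavior.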
   
   ### Why are the changes needed?
   The previous code had a bug and would always fall back to JSON.
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   I added a new test and re-enabled the Python test that was disabled in SPARK-41184.

