tianhanhu opened a new pull request #34167:
URL: https://github.com/apache/spark/pull/34167


   ### What changes were proposed in this pull request?
   While migrating a Spark application from 2.4.x to 3.1.x, we found a
difference in exception chaining behavior. When parsing a malformed CSV, the
root cause should be `Caused by: java.lang.RuntimeException: Malformed CSV
record`, but only the top-level exception is kept; all lower-level exceptions
and the root cause are lost. As a result, calling
`ExceptionUtils.getRootCause` on the exception returns the exception itself.
   The difference arises because the `RuntimeException` is wrapped in
`BadRecordException`, which has unserializable fields. When the exception is
serialized in a task and deserialized in the scheduler, the exception is lost.
   This PR makes the unserializable fields of `BadRecordException` transient,
so that the rest of the exception can be serialized and deserialized properly.
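
   As a rough illustration of why marking a field transient preserves the
cause chain, here is a minimal Java sketch. `BadRecordSketch`, `roundTrip`, and
`rootCause` are hypothetical names invented for this example; they are not
Spark's classes, and the real `BadRecordException` is defined in Scala with
different fields.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.function.Supplier;

public class TransientFieldSketch {

    // Hypothetical stand-in for BadRecordException (assumption, not Spark code).
    static class BadRecordSketch extends Exception {
        private static final long serialVersionUID = 1L;

        // The lambda passed in below is not Serializable. Marking the field
        // transient tells Java serialization to skip it, so the rest of the
        // exception (message, stack trace, cause) can still be written.
        final transient Supplier<String> record;

        BadRecordSketch(Supplier<String> record, Throwable cause) {
            super("bad record", cause);
            this.record = record;
        }
    }

    // Serialize to a byte array and read it back, roughly as happens when an
    // exception travels from one JVM to another.
    static Throwable roundTrip(Throwable t) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(t);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Throwable) in.readObject();
        }
    }

    // Walk the cause chain to its end.
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) throws Exception {
        Throwable original = new BadRecordSketch(
            () -> "a,b,\"c", new RuntimeException("Malformed CSV record"));
        Throwable copy = roundTrip(original);
        // The transient field is null after deserialization, but the cause survives.
        System.out.println(rootCause(copy).getMessage());
    }
}
```

   Without the `transient` modifier, `writeObject` in this sketch would throw
`java.io.NotSerializableException` for the lambda. The trade-off is that the
transient field is null on the receiving side.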
   
   ### Why are the changes needed?
   To make `BadRecordException` serializable, so that its cause chain survives
the trip from tasks to the scheduler.
   
   
   ### Does this PR introduce _any_ user-facing change?
   Yes. Users can now get the root cause of a `BadRecordException`.
   
   
   ### How was this patch tested?
   Unit tests.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


