Yicong-Huang commented on PR #54010:
URL: https://github.com/apache/spark/pull/54010#issuecomment-3808562097

   > @Yicong-Huang @fangchenli I am thinking whether we should use golden files 
instead. If we use golden files, we don't need to touch the test files too 
much, just need to regenerate or compare a new file with new conditions 
(`safe=False` in this PR)
   
   Hmm, I am really not a fan of golden files, especially for this kind of 
type-related test. To compare in a text-based format, all Python objects need 
to be serialized in some way (e.g., `str(object)` or `repr(object)`), which 
could cause edge cases to be missed. For example, a decimal128 and a decimal256 
may hold different digits internally but serialize to the same string (I am 
making this up as an example). We will need to make sure we don't introduce 
potential problems like this. This particular test for `pa.Array.cast` may be 
sensitive to an extra serialization step.
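
   A minimal sketch of the concern, using only the standard library as an 
analogy rather than pyarrow itself. `golden_line` and `typed_key` are 
hypothetical helpers made up for illustration and are not part of this PR. It 
shows two values of different types collapsing to the same golden-file text:

```python
from decimal import Decimal


def golden_line(value):
    # Hypothetical golden-file serializer: one text line per value.
    return str(value)


def typed_key(value):
    # Hypothetical in-memory comparison key that keeps the type name.
    return (type(value).__name__, value)


as_float = 1.5
as_decimal = Decimal("1.5")

# The text form cannot tell the two values apart.
same_text = golden_line(as_float) == golden_line(as_decimal)

# Comparing the objects together with their types still can.
same_typed = typed_key(as_float) == typed_key(as_decimal)

print(same_text, same_typed)
```

   A golden file built on `str()` would accept either value here, whereas an 
assertion on the object and its type would not. Using `repr()` would separate 
this particular pair, but the general risk of a lossy text form remains.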
   
   Maybe we can try the golden-file approach in future or other tests?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

