revans2 commented on PR #45294:
URL: https://github.com/apache/spark/pull/45294#issuecomment-2352962655

   I am conflicted on this. I agree with @cloud-fan that changing the behavior 
to be more like the interpreted code is not ideal in a bug fix release if we 
are worried about consistency for the user. But at the same time this is 
totally a corner case and I cannot think of a real world situation where the 
truncation is not better than overflowing. The overflow starts to happen 
sometime in the year 292278994, so effectively no legitimate timestamp should 
ever hit this. Yes, estimates for the heat death of the universe are after that.
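
   As a rough sanity check on that year, here is a small sketch of the 
arithmetic. It assumes the figure comes from a signed 64-bit count of 
milliseconds since 1970-01-01 and uses a mean Gregorian year of 365.2425 days; 
both are assumptions for illustration, not something taken from the PR, and 
`overflow_year` is a hypothetical helper name.

```python
# Sketch only: estimates the calendar year in which a signed 64-bit counter
# of time units since 1970-01-01 runs out of range.
INT64_MAX = 2**63 - 1
SECONDS_PER_MEAN_YEAR = 31_556_952  # 365.2425 days * 86400 s (assumed mean year)


def overflow_year(units_per_second: int) -> int:
    """Approximate year in which a 64-bit counter of the given unit overflows."""
    whole_seconds = INT64_MAX // units_per_second
    return 1970 + whole_seconds // SECONDS_PER_MEAN_YEAR


print(overflow_year(1_000))      # millisecond counter
print(overflow_year(1_000_000))  # microsecond counter
```

   Under those assumptions a millisecond counter gives the year quoted above, 
while a microsecond counter would run out far earlier.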
   
   So then I have to think about the cases where this could become a 
problem for a user. Perhaps detecting and filtering out bad dates/timestamps? I 
know that this can happen in practice. But the simplest way to do that is to 
compare the timestamp against an allowed range, and that does not involve 
casting the timestamp to seconds since the epoch. So it would have to be a case 
where a user wants seconds since the epoch and is doing filtering/cleanup after 
the conversion. But in this case the truncation makes it 100% guaranteed to 
catch all bad timestamps, whereas overflow could convert a bad timestamp back 
to a "valid" one. Unless your "valid" range is outside of the 
overflow/underflow range.
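
   To make that concrete, here is a minimal sketch. It reads "truncation" as 
clamping to the representable 64-bit range, which is an assumption, and it 
simulates the two behaviors in plain Python rather than calling Spark. The 
names `wrap_int64`, `saturate_int64`, `is_valid` and the 1970 to 2100 window 
are all hypothetical.

```python
# Sketch only: compares wraparound with clamping for a post-conversion filter.
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1

# Hypothetical "valid" window in seconds since the epoch: 1970-01-01 to 2100-01-01.
VALID_MIN, VALID_MAX = 0, 4_102_444_800


def wrap_int64(value: int) -> int:
    """Two's-complement wraparound, as a signed 64-bit overflow would behave."""
    return (value + 2**63) % 2**64 - 2**63


def saturate_int64(value: int) -> int:
    """Clamp to the signed 64-bit range instead of wrapping."""
    return max(INT64_MIN, min(INT64_MAX, value))


def is_valid(seconds: int) -> bool:
    return VALID_MIN <= seconds <= VALID_MAX


# A wildly out-of-range true value that happens to wrap onto a plausible one.
bad_seconds = 2**64 + 1_700_000_000

print(is_valid(wrap_int64(bad_seconds)))      # True: the bad value slips through
print(is_valid(saturate_int64(bad_seconds)))  # False: the clamped value is rejected
```

   With clamping, every out-of-range input lands on one of the two extremes, 
so a range check after the conversion rejects it as long as the extremes are 
not themselves inside the allowed window.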
   
   The only use case I can think of where the overflow is better is if someone 
is trying to detect the overflow to mark the conversion as bad (essentially 
look for a change in the sign of the value after the conversion). But that 
would also imply that the timestamp is good, which I really doubt. Perhaps 
there are some physics or sci-fi datasets out there where this really would be 
valid. I just don't know.
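
   A sketch of that sign-change check, again simulated in plain Python with 
hypothetical helper names and with "truncation" assumed to mean clamping. It 
also shows why the check is fragile: it only works when the value wraps an odd 
number of half-ranges.

```python
# Sketch only: detecting an overflow by looking for a sign flip after conversion.
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def wrap_int64(value: int) -> int:
    """Two's-complement wraparound, as a signed 64-bit overflow would behave."""
    return (value + 2**63) % 2**64 - 2**63


def saturate_int64(value: int) -> int:
    """Clamp to the signed 64-bit range instead of wrapping."""
    return max(INT64_MIN, min(INT64_MAX, value))


def sign_flipped(before: int, after: int) -> bool:
    """True when the conversion changed the sign of the value."""
    return (before >= 0) != (after >= 0)


just_past_max = INT64_MAX + 1   # wraps to INT64_MIN
far_past_max = 2**64 + 42       # wraps all the way back to +42

print(sign_flipped(just_past_max, wrap_int64(just_past_max)))      # True
print(sign_flipped(far_past_max, wrap_int64(far_past_max)))        # False: missed
print(sign_flipped(just_past_max, saturate_int64(just_past_max)))  # False: no flip when clamping
```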
   
   For me I would vote to keep this change as is, but perhaps I have too 
limited of a view on how this is being used.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
