srowen commented on issue #26548: [SPARK-29918][SQL] RecordBinaryComparator should check endianness when compared by long URL: https://github.com/apache/spark/pull/26548#issuecomment-554632392 Agree that mixing big- and little-endian probably breaks more than this. Mixing aligned/unaligned seems a bit more "valid" (again think mixing in some cheaper ARM instances, the "a" instances on AWS), though I still don't know whether it should be supported. It'd be nice I guess, but looks like it'd introduce a small perf hit. One argument for this kind of change is simply that it makes the code do what it appears to intend to do now, but doesn't, which makes me a little nervous. However if I read this right, this could come up even on the same hardware if it doesn't support unaligned access? if the two offsets can be aligned, it'll use the 'longs' code path for some of it. If it doesn't, then it uses the byte code path for all of it. This suggests those can give different answers. Even if unaligned access is supported everywhere, it seems like the same data could give different results depending on where the data are in memory, on the offsets mod 8. The code path again varies depending on whether they match or do not at the start. If so that's a bigger problem, but, why would we not observe it until now in testing? is it that the data it seems always happens to be aligned similarly, like always 0 or 4 mod 8 for internal reasons? just that this rarely ever actually causes a problem in practice because consistent results would only matter in rare cases? BTW @tcondie did your issue come up on mixed hardware?
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
