GitHub user carloea2 added a comment to the discussion: disallowing sorting on 
binary type

Yes, I think that is also unexpected for some users, and it may cause real 
confusion. For example:

A user has a CSV with a column that mixes scientific-notation numbers with 
plain floats. If they sort it in Python with the pandas defaults, pandas 
automatically parses the column as float, and the sort is numerically correct.

In Texera today, when the user reads the same CSV and sorts by that column, 
the values are sorted as strings, which gives a completely different order 
from the Python code. Since nothing in the UI hints that the column is a 
string, the user will have a bad experience.
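
To make the difference concrete, here is a minimal sketch in plain Python. It does not use pandas or Texera; the CSV content and the names `CSV_TEXT` and `read_column` are made up for illustration. It only shows how sorting the raw cells as strings differs from sorting them after a cast to float.

```python
import csv
import io

# Hypothetical CSV content: one column that mixes scientific notation
# with plain floats. Not taken from any real dataset.
CSV_TEXT = "value\n1e3\n2.5\n1.5e-2\n10\n300.0\n"


def read_column(text, name):
    """Return the raw string cells of one column, as a CSV reader sees them."""
    return [row[name] for row in csv.DictReader(io.StringIO(text))]


cells = read_column(CSV_TEXT, "value")

# Sorting the raw cells compares them character by character.
sorted_as_strings = sorted(cells)

# Casting each cell to float first sorts by numeric value instead.
sorted_as_floats = sorted(cells, key=float)

print(sorted_as_strings)  # ['1.5e-2', '10', '1e3', '2.5', '300.0']
print(sorted_as_floats)   # ['1.5e-2', '2.5', '10', '300.0', '1e3']
```

The string order puts `1e3` (1000) before `2.5`, which is the kind of surprise described above.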

//// Extra question
In Python, bytes are sequences of unsigned integers 0..255, and binary 
sequences compare lexicographically by those numeric byte values. 

On the JVM (Java/Scala), byte/Byte is signed 8-bit two’s-complement (-128..127).

Isn't this another source of problems when sorting bytes? I know similar 
issues can apply to strings, but handling bytes is more complex. So for users 
who need to sort bytes, we should provide more parameters for how to cast and 
interpret the bytes, right?
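
A small sketch of what I mean, in Python only. The helper `signed_key` is hypothetical: it emulates comparing each byte as a signed 8-bit value, which is what a naive element-wise comparison of JVM `byte` values would do. I am not claiming this is how Texera actually compares binary values.

```python
def signed_key(data):
    """Map each byte to the value a signed 8-bit type would hold (-128..127)."""
    return [b - 256 if b > 127 else b for b in data]


values = [b"\x80", b"\x7f", b"\x01", b"\xff"]

# Python compares bytes objects by unsigned byte values (0..255).
unsigned_order = sorted(values)

# Emulated signed comparison: 0x80 becomes -128 and 0xff becomes -1,
# so both sort before 0x01 and 0x7f.
signed_order = sorted(values, key=signed_key)

print(unsigned_order)  # [b'\x01', b'\x7f', b'\x80', b'\xff']
print(signed_order)    # [b'\x80', b'\xff', b'\x01', b'\x7f']
```

The same four values come out in a different order depending on the interpretation, so the choice would need to be explicit for users who sort bytes.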

GitHub link: 
https://github.com/apache/texera/discussions/4007#discussioncomment-14797552

