rtpsw commented on PR #13784:
URL: https://github.com/apache/arrow/pull/13784#issuecomment-1209861142

   I developed this PR's code while experimenting with designs for AsOfJoin; 
however, the design I chose to proceed with does not need this PR's code. So, 
I'm fine if this doesn't go through. I just figured I'd share the code to see 
what people think. I noticed that Arrow does not implement ordering for numeric 
Scalar types, but I wasn't aware of the considerations that @bkietz describes - 
perhaps this means the corresponding Jira, and not just this PR, is being 
questioned. The bit-twiddling idea for ordering floats crossed my mind, but I 
opted to start with something simpler.
   
   > If this is intended for use in AsofJoin then we should definitely not be 
using Scalars at all since extracting them from a join key column, boxing them 
in the type-erased Scalar, then unboxing them again to do type check and 
comparison would be wasteful compared to doing the comparisons in-place against 
elements of each column.
   
   I experimented with adding support for updating a Scalar in place, which 
could potentially have avoided the above costs in my code, but I ran into 
implementation problems (changes were needed in too many places across Arrow) 
and decided to give up on it.
   
   > Tangent: I don't know how applicable this will be, but at least for 
testing purposes it might be handy to use something like [Rust's 
float_ord](https://docs.rs/float-ord/latest/float_ord/) to cast floating point 
numbers into something less subtle to sort. For example: 
https://gist.github.com/bkietz/8e2ef182883b886e532ffde8e537f7a3
   
   These resources are interesting - thanks for sharing! I also found 
http://stereopsis.com/radix.html (linked from 
https://docs.rs/crate/float-ord/latest) interesting.
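
   For anyone following along, the bit-twiddling idea for ordering floats can be sketched roughly as below. This is only an illustration in Python, not code from this PR, from the gist, or from float-ord; the function name `float_sort_key` is hypothetical. It assumes IEEE 754 binary64 values and maps each one to an unsigned 64-bit integer whose natural integer order matches the numeric order of the floats.

```python
import struct

SIGN_BIT = 0x8000000000000000
ALL_BITS = 0xFFFFFFFFFFFFFFFF


def float_sort_key(x: float) -> int:
    """Map a 64-bit float to an unsigned integer that sorts in the same order.

    Hypothetical helper for illustration only. Assumes IEEE 754 binary64.
    """
    # Reinterpret the 8 bytes of the double as an unsigned 64-bit integer.
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    if bits & SIGN_BIT:
        # Negative values: flip every bit, so larger magnitudes sort first.
        return bits ^ ALL_BITS
    # Non-negative values: set the sign bit, so they sort after all negatives.
    return bits | SIGN_BIT


values = [3.5, -0.0, float("inf"), -2.0, 0.0, float("-inf"), -1e-300, 1e-300]
ordered = sorted(values, key=float_sort_key)
```

   With this mapping, -0.0 sorts strictly before +0.0, and a NaN whose sign bit is clear sorts after +infinity, so the keys form a total order even where the usual float comparison does not.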


