andygrove commented on issue #4500:
URL: https://github.com/apache/datafusion-comet/issues/4500#issuecomment-4835443276

   **Item #3 (hex/unhex collation) — investigated, no change needed.**
   
   `Hex` output is restricted to the fixed ASCII alphabet `0-9A-F`, so its 
values are collation-invariant: identical hex strings compare equal and group 
identically under every collation, and the small uppercase alphabet is not 
reordered by realistic locale collations. `Unhex` returns `BinaryType`, which 
carries no collation at all.
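
As a rough illustration of the invariance claim, the sketch below uses plain Python stand-ins rather than Spark or Comet. `spark_like_hex` and `lcase_key` are hypothetical helpers introduced here, and lowercasing is only an approximation of a case-insensitive collation such as `UTF8_LCASE`. It checks that, for every single-byte input, grouping and ordering of the hex strings come out the same with and without the case-insensitive key.

```python
def spark_like_hex(data: bytes) -> str:
    # Hypothetical stand-in for Hex: uppercase output over the alphabet 0-9A-F.
    return data.hex().upper()


def lcase_key(s: str) -> str:
    # Assumption: lowercasing approximates a case-insensitive collation key.
    return s.lower()


# Hex strings for every possible single-byte value ("00" through "FF").
hex_values = [spark_like_hex(bytes([b])) for b in range(256)]

# Grouping: the number of distinct groups is unchanged by the collation key.
same_groups = len(set(hex_values)) == len({lcase_key(h) for h in hex_values})

# Ordering: sorting by raw code points and by the collation key agree.
same_order = sorted(hex_values) == sorted(hex_values, key=lcase_key)

print(same_groups, same_order)
```

Both checks hold because the output never contains lowercase letters, so the case-insensitive key cannot merge or reorder any two distinct hex strings.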
   
   Collation can matter only when a collation-aware operation *consumes* the 
hex result (e.g. `WHERE hex(x) = 'ab'`, `ORDER BY hex(x)`). 
Comet already declines to evaluate collation-aware string comparisons/sorts 
natively (`QueryPlanSerde.scala:1051`), so those fall back to Spark and produce 
correct results. The hex projection still runs natively.
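
To make the consuming case concrete, here is a minimal Python sketch of why a predicate like `hex(x) = 'ab'` is collation-sensitive even though the hex value itself is not. This is not Comet or Spark code: the lowercase comparison is an assumed stand-in for a case-insensitive collation, and the variable names are illustrative only.

```python
# Hex-style output for the single byte 0xAB, uppercase as Hex would produce it.
hex_result = bytes([0xAB]).hex().upper()

# Binary (code point) comparison against the lowercase literal.
binary_match = hex_result == "ab"

# Assumed case-insensitive comparison: compare lowercased forms.
lcase_match = hex_result.lower() == "ab".lower()

print(hex_result, binary_match, lcase_match)
```

The projected value is the same in both cases; only the comparison changes its answer, which is why falling back for the comparison alone is sufficient.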
   
   Net: Comet cannot produce a collation-related wrong answer for `hex`/`unhex` 
today, so they correctly stay `Compatible`. The original audit flag was a 
blanket-policy concern, not a defect. (Contrast `concat`/`reverse`, which are 
gated for the same theoretical reason and are arguably over-gated.)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

