Jefffrey commented on PR #19369:
URL: https://github.com/apache/datafusion/pull/19369#issuecomment-3706945045

   > > Seeing all this logic introduced, I'm beginning to question whether
   > > there is actual benefit to having a native log implementation 🤔
   > > Perhaps we should just revert to casting it to float and accept the
   > > accuracy loss
   > > Thoughts @theirix ?
   > 
   > Fair enough, the logic becomes more convoluted.
   > 
   > The original idea was to introduce common decimal operations.
   > Scale-preserving operations like abs, round, gcd, etc., are easy to implement
   > and support. Some other operations with a natural mapping to decimals (like
   > log10, pow10) adjust scales and do not have a natural analogue in the arrow
   > buffer, leading to more complex logic. These operations are typical for data
   > analytics, and applications could benefit from them. So ten-based operations
   > can be calculated precisely, while for the rest and for more complicated
   > operations, of course, it is fine to lose precision using a native float
   > implementation.
   > 
   > First, we should reuse the arrow's foundational primitives as much as
   > possible. If there is an `OP_checked`, it's better to piggyback on it. A few
   > num traits were recently added to decimals in arrow-buffer, making it easier
   > for us.
   > 
   > Second, I believe more logic should be isolated in
   > `calculate_binary_decimal_math`, especially for handling different scales, to
   > shift responsibility from UDF implementers (like pow) to middleware. It is in
   > progress, and I'll submit it shortly.
   
   That makes sense. To reduce this complexity (and limit the performance
   impact), we could also do the following:
   
   - At function invocation time, use native decimal operations only when the
     exponent is a scalar
   - Otherwise, fall back to casting to float
   
   This can be done in follow-up PRs, of course, but it at least sets a roadmap
   for us.
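
As a rough illustration of the scalar-exponent dispatch described above, here is a minimal, self-contained sketch. It does not use the DataFusion or Arrow APIs: `Dec`, `Exponent`, `PowResult`, `pow_native`, and `pow_dispatch` are hypothetical names invented for this example, and falling back to float when the native path overflows is an extra assumption, not something decided in this thread.

```rust
/// Hypothetical stand-in for a decimal value: `unscaled / 10^scale`.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Dec {
    unscaled: i128,
    scale: u32,
}

/// Hypothetical exponent argument: one value for all rows, or one per row.
enum Exponent {
    Scalar(u32),
    Array(Vec<f64>),
}

/// Which path produced the result.
#[derive(Debug, PartialEq)]
enum PowResult {
    Decimal(Vec<Dec>),
    Float(Vec<f64>),
}

/// Lossy conversion used by the float fallback.
fn to_f64(d: Dec) -> f64 {
    d.unscaled as f64 / 10f64.powi(d.scale as i32)
}

/// Exact power: (u / 10^s)^e == u^e / 10^(s * e).
/// Returns `None` if either the value or the scale overflows.
fn pow_native(d: Dec, exp: u32) -> Option<Dec> {
    Some(Dec {
        unscaled: d.unscaled.checked_pow(exp)?,
        scale: d.scale.checked_mul(exp)?,
    })
}

/// Use the native decimal path only for a scalar exponent;
/// otherwise cast to float and accept the accuracy loss.
fn pow_dispatch(base: &[Dec], exp: &Exponent) -> PowResult {
    match exp {
        Exponent::Scalar(e) => {
            // Collecting into Option<Vec<_>> yields None on the first overflow.
            let native: Option<Vec<Dec>> =
                base.iter().map(|d| pow_native(*d, *e)).collect();
            match native {
                Some(values) => PowResult::Decimal(values),
                // Assumption: on overflow, fall back to float as well.
                None => PowResult::Float(
                    base.iter().map(|d| to_f64(*d).powf(*e as f64)).collect(),
                ),
            }
        }
        Exponent::Array(exps) => PowResult::Float(
            base.iter()
                .zip(exps)
                .map(|(d, e)| to_f64(*d).powf(*e))
                .collect(),
        ),
    }
}
```

With this shape, the branch is taken once per invocation rather than once per row, which is where the reduced performance impact would come from. Note that the sketch lets the result scale grow to `scale * exp`; a real implementation would need to decide how to rescale to the output type.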


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
