westonpace commented on PR #14434:
URL: https://github.com/apache/arrow/pull/14434#issuecomment-1281604803

   > I don't think DecodeOptionlessOverflowableArithmetic is suitable for these 
functions, nor for division or for any of the floating-point versions of the 
functions it's already used for. 
   
   At the moment we should focus on exposing the capability that we have, and 
describing it accurately.  For example, in Arrow we have "_checked" but "what 
exactly is checked?"  For example, in Arrow we have `divide` and 
`divide_checked`.  They seem to behave as follows (through trial and error):
   
   | Method | Data Type | Problem | Behavior |
   | --- | --- | --- | --- |
   | divide | integral | overflow | silent |
   | divide | integral | divide by zero | error |
   | divide | integral | domain error | error |
   | divide | floating | divide by zero | limit |
   | divide | floating | domain error | nan |
   | divide_checked | integral | overflow | error |
   | divide_checked | integral | divide by zero | error |
   | divide_checked | integral | domain error | error |
   | divide_checked | floating | divide by zero | error |
   | divide_checked | floating | domain error | error |
   
   Thus we should have the following behavior for substrait calls
   
   | Call | Data Type | Options | Arrow Result |
   | --- | --- | --- | --- |
   | divide | integral | overflow=SILENT | divide |
   | divide | integral | overflow=SATURATE | reject plan |
   | divide | integral | overflow=ERROR | divide_checked |
   | divide | floating | rounding=* | reject plan[1] |
   | divide | floating | on_domain_error=NAN | divide |
   | divide | floating | on_domain_error=ERROR | divide_checked |
   | divide | floating | on_division_by_zero=LIMIT | divide |
   | divide | floating | on_division_by_zero=NAN | reject plan |
   | divide | floating | on_division_by_zero=ERROR | divide_checked |
   | divide | floating | on_domain_error=NAN on_division_by_zero=ERROR | reject 
plan[2] |
   | divide | floating | on_domain_error=ERROR on_division_by_zero=LIMIT | 
reject plan[2] |
   
   1. I'm not sure what rounding behavior we have implemented in Acero and 
therefore the safest thing to do would be to reject all plans that are asking 
for a specific rounding behavior.
   2. We cannot satisfy these specific combinations between divide and 
divide_checked.
   
   Caveat / Needs clarification:
   
   IMHO, It is not clear, in Substrait, if division by zero for integral 
arguments counts as overflow or not.  Also, it is possible to have 0/0 in 
integer arithmetic, which I have been interpreting as "domain error" 
(technically division by zero requires non-zero/0) but have no idea what the 
integral behavior should be.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to