Light-City commented on issue #37765:
URL: https://github.com/apache/arrow/issues/37765#issuecomment-1734758320

   > This is fundamentally caused by Acero strictly evaluating argument 
expressions before calling a function on those arguments. Refactoring to 
support more intrusive/lazy evaluation semantics would be a significant change; 
certainly not one which should be handled as a special case in 
`ExecuteScalarExpression`.
   > 
   > I'd recommend looking at the expression simplification passes 
(`SimplifyWithGuarantee`). There's machinery there to pattern match and modify 
expressions. Currently it is only used to produce more efficient expressions 
using partition information and other guarantees, but it could also be used to 
rewrite expressions for safe evaluation:
   > 
   > ```
   > Expression unsafe = case_when({greater_than(field_ref("j"), literal(0))}, {
   >   call("divide", {field_ref("i"), field_ref("j")}),
   >   field_ref("i"),
   > });
   > //...
   > ARROW_ASSIGN_OR_RAISE(Expression safe, MakeSafe(unsafe));
   > assert(safe == call("divide", {
   >   field_ref("i"),
   >   call("max", {field_ref("j"), literal(1)}),
   > }));
   > ```
   > 
   > Another way (much less intensive) to approach this problem would be 
writing a new option for the divide compute functions which produces null or 
zero when dividing by zero instead of raising an error. This could then be used 
explicitly in situations where division by zero is otherwise inevitable.
   
   Yes, the underlying operation needs to be changed. Adding options is a 
relatively big change.
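
To make the two suggestions in the quote concrete, here is a minimal standalone sketch in plain C++ (standard library only, no Arrow APIs). All function names are hypothetical and exist only for illustration. `GuardedDivide` mimics the lazily evaluated `case_when` form, `ClampedDivide` mimics the rewritten `divide(i, max(j, 1))` form, and `DivideOrNull` mimics a divide variant that yields null on a zero divisor. The rewrite is assumed to target integer columns: for floating-point divisors between 0 and 1 the clamp would change the result, so the equivalence would not hold there.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the original expression
//   case_when(j > 0, i / j, i)
// with lazy semantics: the division only runs when j > 0.
int64_t GuardedDivide(int64_t i, int64_t j) {
  return j > 0 ? i / j : i;
}

// Hypothetical stand-in for the rewritten expression
//   divide(i, max(j, 1))
// The divisor is clamped to at least 1, so it is safe to evaluate eagerly.
int64_t ClampedDivide(int64_t i, int64_t j) {
  return i / std::max<int64_t>(j, 1);
}

// Hypothetical "null on zero" divide: returns std::nullopt instead of
// failing when the divisor is zero. Overflow (INT64_MIN / -1) is not
// handled in this sketch.
std::optional<int64_t> DivideOrNull(int64_t i, int64_t j) {
  if (j == 0) return std::nullopt;
  return i / j;
}

// Checks that the guarded and clamped forms agree on a small integer range.
void CheckRewriteAgrees() {
  for (int64_t i = -20; i <= 20; ++i) {
    for (int64_t j = -20; j <= 20; ++j) {
      assert(GuardedDivide(i, j) == ClampedDivide(i, j));
    }
  }
}
```

The clamped form needs no branch at evaluation time, which is why it fits a strictly evaluating engine. The `DivideOrNull` form instead changes the kernel's behavior, which is the "underlying operation" change mentioned above.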


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
