jayzhan211 commented on issue #9972:
URL: https://github.com/apache/arrow-datafusion/issues/9972#issuecomment-2056427365

   > Sorry for the late reply. I was on vacation and couldn't look at this sooner.
   > 
   > > Btw, why does `reverse_expr` in Avg, Sum, MinMax, and Count just return a 
clone? How is that different from returning `None`?
   > 
   > As an example use case, consider the query
   > 
   > ```
   > SELECT SUM(b), FIRST_VALUE(b ORDER BY c DESC)
   > FROM table
   > GROUP BY a
   > ```
   > 
   > where `table` is already ordered by `c ASC`. In this case, by taking the 
reverse of `SUM` (which is itself) and of `FIRST_VALUE`, we can convert the query 
above to its equivalent form below
   > 
   > ```
   > SELECT SUM(b), LAST_VALUE(b ORDER BY c ASC)
   > FROM table
   > GROUP BY a
   > ```
   > 
   > to align the ordering requirement with the existing ordering. Returning `None` 
from `fn reverse_expr()` indicates that, when the input data is iterated in 
reverse order, the result would not be the same as that of the existing version. 
However, for `SUM`, `AVG`, etc., the result is the same when the input data is 
iterated in reverse order. As a counterexample, consider the query
   > 
   > ```
   > SELECT ARRAY_AGG(b ORDER BY c DESC), FIRST_VALUE(b ORDER BY c DESC)
   > FROM table
   > GROUP BY a
   > ```
   > 
   > where `table` is ordered by `c ASC` as before. There is no way to produce 
the result of `ARRAY_AGG(b ORDER BY c DESC)` with the ordering `c ASC` at the 
input. Hence, for `ARRAY_AGG`, this implementation returns `None` to communicate 
this. In short, for order-insensitive aggregators we should implement 
`fn reverse_expr` by returning a clone of the existing aggregator, to 
communicate that the same result would be generated in reverse order (in fact, 
under any permutation of the input).
   
   I see. It makes sense to me.
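
To check my understanding, here is a rough sketch of that contract as a toy model. It is not DataFusion's actual trait: the `Agg` enum and the `reverse_expr`, `try_reverse_all`, `sorted_b`, and `eval` functions below are all hypothetical names made up for illustration. `eval` models only the SQL semantics over `(b, c)` rows of a single group, and `ARRAY_AGG` returns `None` simply because that is what the comment above describes.

```rust
// Toy model of the `reverse_expr` contract. All names are hypothetical;
// this is not DataFusion's API.
#[derive(Clone, Debug, PartialEq)]
enum Agg {
    Sum,
    FirstValue { descending: bool },
    LastValue { descending: bool },
    ArrayAgg { descending: bool },
}

impl Agg {
    /// Returns an aggregate that produces the same result once the ordering
    /// requirement is reversed, or `None` if no such aggregate exists.
    fn reverse_expr(&self) -> Option<Agg> {
        match self {
            // Order-insensitive: the reverse is the aggregate itself.
            Agg::Sum => Some(self.clone()),
            // FIRST_VALUE(.. ORDER BY c DESC) == LAST_VALUE(.. ORDER BY c ASC).
            Agg::FirstValue { descending } => Some(Agg::LastValue { descending: !*descending }),
            Agg::LastValue { descending } => Some(Agg::FirstValue { descending: !*descending }),
            // As described in the comment above: not reversible.
            Agg::ArrayAgg { .. } => None,
        }
    }
}

/// The rewrite is possible only if every aggregate in the query is reversible.
fn try_reverse_all(aggs: &[Agg]) -> Option<Vec<Agg>> {
    aggs.iter().map(Agg::reverse_expr).collect()
}

/// Values of `b` sorted by `c` in the requested direction.
fn sorted_b(rows: &[(i64, i64)], descending: bool) -> Vec<i64> {
    let mut v = rows.to_vec();
    v.sort_by_key(|&(_, c)| c);
    if descending {
        v.reverse();
    }
    v.into_iter().map(|(b, _)| b).collect()
}

/// Reference semantics of each aggregate over the `(b, c)` rows of one group.
fn eval(agg: &Agg, rows: &[(i64, i64)]) -> Vec<i64> {
    match agg {
        Agg::Sum => vec![rows.iter().map(|&(b, _)| b).sum::<i64>()],
        Agg::FirstValue { descending } => {
            sorted_b(rows, *descending).into_iter().take(1).collect()
        }
        Agg::LastValue { descending } => {
            sorted_b(rows, *descending).into_iter().rev().take(1).collect()
        }
        Agg::ArrayAgg { descending } => sorted_b(rows, *descending),
    }
}

fn main() {
    let rows: [(i64, i64); 3] = [(10, 1), (20, 2), (30, 3)];

    // SUM(b), FIRST_VALUE(b ORDER BY c DESC): every aggregate is reversible,
    // and each reversed aggregate evaluates to the same result.
    let query = [Agg::Sum, Agg::FirstValue { descending: true }];
    let reversed = try_reverse_all(&query).expect("all aggregates are reversible");
    for (original, rev) in query.iter().zip(&reversed) {
        assert_eq!(eval(original, &rows), eval(rev, &rows));
    }

    // ARRAY_AGG(b ORDER BY c DESC) blocks the rewrite for the whole query.
    let blocked = [Agg::ArrayAgg { descending: true }, Agg::FirstValue { descending: true }];
    assert!(try_reverse_all(&blocked).is_none());
}
```

In this sketch, returning a clone for `Sum` is what lets `try_reverse_all` succeed for the first query. If `Sum` returned `None` instead, the whole rewrite would be rejected even though `SUM` does not care about input order.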


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
