rednaxelafx commented on issue #26420: [SPARK-27986][SQL] Support ANSI SQL 
filter predicate for aggregate expression.
URL: https://github.com/apache/spark/pull/26420#issuecomment-552315759
 
 
   I'd like to propose a solution for the codegen part that'll augment this PR. 
The overall direction this PR is taking sounds good to me, although I haven't 
reviewed the full details yet (would like to do that some time this week).
   
   I'll prepare a separate PR for demo purposes to show how it'll augment the 
codegen part. It's actually fairly easy and could also serve as a bit of code 
cleanup for a lot of the declarative aggregate functions.
   
   The tl;dr is that I'd like to have explicit support for the user-specified 
filter clause in the infrastructure, instead of relying solely on a rewrite.
   A lot of aggregate functions are null-skipping by nature, e.g. `count()`, 
`sum()`, `avg()` etc. But that's not a property common to ALL possible 
aggregate functions. Some have more interesting semantics, like 
`first()`/`last()`, where you can configure whether nulls are eligible to be 
the result or are skipped so that only non-null values are taken.
   Having explicit support for the filter clause in the infrastructure ensures 
that we can support this feature properly, without relying on a logical 
rewrite that works for most aggregate functions but leaves a handful of 
exceptions to be implemented in really ugly ways.
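
   To make the concern above concrete, here is a minimal sketch in plain 
Python (not Spark code). It assumes the logical rewrite in question turns 
`agg(v) FILTER (WHERE p)` into `agg(CASE WHEN p THEN v END)`, i.e. rejected 
rows are fed to the aggregate as nulls; that form is an assumption for 
illustration, not something confirmed by the PR. The helper names 
(`sum_skip_nulls`, `first_keep_nulls`, `via_rewrite`, `via_explicit_filter`) 
are hypothetical.

```python
# Toy model of aggregate evaluation; None stands in for SQL NULL.

def sum_skip_nulls(values):
    """Null-skipping sum: None inputs are ignored; no non-null input gives None."""
    total = None
    for v in values:
        if v is not None:
            total = v if total is None else total + v
    return total


def first_keep_nulls(values):
    """first() configured NOT to ignore nulls: returns the first input, even None."""
    for v in values:
        return v
    return None


def via_rewrite(agg, rows, pred):
    # Assumed rewrite: rows rejected by the filter are replaced with None
    # and still reach the aggregate function.
    return agg(r["v"] if pred(r) else None for r in rows)


def via_explicit_filter(agg, rows, pred):
    # Explicit filter support: rejected rows never reach the aggregate function.
    return agg(r["v"] for r in rows if pred(r))


rows = [
    {"k": 1, "v": 10},    # rejected by the filter below
    {"k": 2, "v": 20},
    {"k": 3, "v": None},
    {"k": 4, "v": 40},
]


def keep(r):
    return r["k"] >= 2


# Null-skipping aggregate: both strategies agree.
sum_rewrite = via_rewrite(sum_skip_nulls, rows, keep)
sum_explicit = via_explicit_filter(sum_skip_nulls, rows, keep)

# Non-null-skipping aggregate: the rewrite picks up the None injected for
# the rejected first row, while the explicit filter sees 20 first.
first_rewrite = via_rewrite(first_keep_nulls, rows, keep)
first_explicit = via_explicit_filter(first_keep_nulls, rows, keep)

print(sum_rewrite, sum_explicit)
print(first_rewrite, first_explicit)
```

   In this toy model the two strategies agree for the null-skipping sum but 
diverge for `first_keep_nulls`, because the rewrite cannot tell a null that 
was in the data from a null that stands for a filtered-out row.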

