clintropolis commented on issue #10644:
URL: https://github.com/apache/druid/issues/10644#issuecomment-840116328


   Hi @damnMeddlingKid, @ericxiao251 (and sorry I missed the mailing list 
thread),
   
   I documented this behavior recently in #11188, which adds a new column 
listing the initial aggregator values in both modes: 
https://github.com/apache/druid/blob/master/docs/querying/sql.md#aggregation-functions
 
   
   Whether this is the most correct thing for min/max to be doing is, I 
think, fair game for debate. As already mentioned in this thread, in SQL 
compatible null handling mode these aggregators are initialized to the null 
value, so not aggregating any rows produces the expected `null` result. In my 
personal view that is the only correct result when the filters match nothing, 
but of course it doesn't work for default mode.
   
   As alluded to by @abhishekagarwal87, I don't think we could make the 
'default' mode min/max aggregators return `0` either without one of two 
changes. The first is storing some additional information to distinguish the 
case where no values were aggregated (so probably no longer using the 
primitive numeric base aggregators, or perhaps always using the nullable 
version and just coercing to 0 later?). The second is dropping the very end 
values: assuming Long.MIN_VALUE and Long.MAX_VALUE don't legitimately exist 
and translating them to 0 in a finalizer, which would allow the aggregators to 
keep using the numeric primitive aggregator base classes.
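
   To make the two options concrete, here is a minimal standalone sketch for 
the long max case. It is not Druid code: `MinMaxSketch`, `TrackingLongMax`, 
and `finalizeMax` are hypothetical names made up for illustration, and 
`OptionalLong` just stands in for a nullable result.

```java
import java.util.OptionalLong;

// Hypothetical illustration only; none of these types exist in Druid.
public class MinMaxSketch {

    // Option 1: store extra state recording whether any row was aggregated.
    static final class TrackingLongMax {
        private long max = Long.MIN_VALUE;
        private boolean seen = false;

        void aggregate(long value) {
            max = Math.max(max, value);
            seen = true;
        }

        // Default mode: coerce "nothing aggregated" to 0.
        long getDefaultMode() {
            return seen ? max : 0L;
        }

        // SQL compatible mode: "nothing aggregated" stays empty (null).
        OptionalLong getSqlCompatible() {
            return seen ? OptionalLong.of(max) : OptionalLong.empty();
        }
    }

    // Option 2: keep a bare primitive initialized to Long.MIN_VALUE and
    // translate that sentinel to 0 when finalizing. A real Long.MIN_VALUE
    // in the data becomes indistinguishable from "no rows".
    static long finalizeMax(long rawMax) {
        return rawMax == Long.MIN_VALUE ? 0L : rawMax;
    }

    public static void main(String[] args) {
        TrackingLongMax empty = new TrackingLongMax();
        System.out.println(empty.getDefaultMode());               // 0
        System.out.println(empty.getSqlCompatible().isPresent()); // false

        TrackingLongMax some = new TrackingLongMax();
        some.aggregate(-5L);
        some.aggregate(3L);
        System.out.println(some.getDefaultMode());                // 3

        System.out.println(finalizeMax(Long.MIN_VALUE));          // 0
        System.out.println(finalizeMax(7L));                      // 7
    }
}
```

   The first option costs an extra flag per aggregator; the second keeps the 
primitive representation but gives up the two extreme values.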
   
   "default" null handling mode is often unintuitive I think, especially in SQL 
queries, and for the most part SQL compatible null handling mode should have 
behavior that is more consistent with other databases and with what you would 
expect in SQL. Long term, I would much prefer to deprecate and eventually 
remove default value mode, so that this case of no matches always returns 
null, but it seems reasonable to discuss adjusting the initial aggregator 
values for default value mode until then.
   
   I don't have a strong opinion on the default value for min/max... what is 
the motivation for returning 0 instead of an unlikely value when nothing 
matches? Or is this just confusion because none of this was explicitly 
documented before, so the only reference was the documentation on the two 
null handling modes?
   
   

