ericxiao251 commented on issue #10644: URL: https://github.com/apache/druid/issues/10644#issuecomment-839956260
@abhishekagarwal87 do you know who the best person to reach out to about this issue is? Copying and pasting my email to the dev mailing list... I am not sure if this is actually a bug in the code or undocumented behaviour:

---

Hi Developers,

I am looking for some guidance on this situation described here: https://github.com/apache/druid/issues/10644.

I am trying to understand if the behavior is expected and the documentation for the runtime parameter `druid.generic.useDefaultValueForNull` is not updated to reflect the behavior OR this is an actual bug in the engine.

As described in the issue, there are a couple of scenarios that don’t seem right according to the documentation:

Example when results are `Long.MIN_VALUE`
1) perform a MAX aggregation function
2) on empty rows
3) `druid.generic.useDefaultValueForNull=true`

```sql
Query:
SELECT MAX(l1) FILTER(WHERE dim1 = 'non_existing') FROM druid.foo

Result:
-9223372036854775808
```

Reference to explicitly set value in source code: https://github.com/apache/druid/blob/master/processing/src/main/java/org/apache/druid/query/aggregation/LongMaxAggregatorFactory.java#L56-L60

Example when results are 0
1) perform a SUM aggregation function
2) on empty rows
3) `druid.generic.useDefaultValueForNull=true`

```sql
Query:
SELECT SUM(l1) FILTER(WHERE dim1 = 'non_existing') FROM druid.foo

Result:
0
```

Reference to explicitly set value in source code: https://github.com/apache/druid/blob/master/processing/src/main/java/org/apache/druid/query/aggregation/LongSumAggregatorFactory.java#L56-L60

From the Druid documentation link:

Property: `druid.generic.useDefaultValueForNull`
Description: When set to true, null values will be stored as `''` for string columns and 0 for numeric columns. Set to false to store and query data in SQL compatible mode.
Default: true

The documentation does not explicitly mention what the expected results of an aggregation query would be, but I expected the results from both of the above queries to be 0, not `Long.MIN_VALUE` in some instances.

Should the documentation be updated or should the code be fixed? I can put in either fix, but want to know which the community thinks is appropriate.
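To make the discrepancy concrete, here is a minimal Java sketch of how the two results could arise. It is not Druid's code: the class and method names are hypothetical, and it only assumes what the linked source lines suggest, namely that each aggregator starts from an explicitly set initial value (`Long.MIN_VALUE` for MAX, 0 for SUM) and returns that value unchanged when no rows match the filter.

```java
// Hypothetical illustration only; not taken from the Druid code base.
class EmptyAggregationSketch {

    // MAX folded from its identity element: with no input rows,
    // the initial value Long.MIN_VALUE is returned as-is.
    static long maxWithIdentity(long[] values) {
        long acc = Long.MIN_VALUE;
        for (long v : values) {
            acc = Math.max(acc, v);
        }
        return acc;
    }

    // SUM folded from its identity element: with no input rows,
    // the initial value 0 is returned as-is.
    static long sumWithIdentity(long[] values) {
        long acc = 0L;
        for (long v : values) {
            acc += v;
        }
        return acc;
    }

    public static void main(String[] args) {
        long[] noMatchingRows = new long[0]; // the filter matched nothing
        System.out.println(maxWithIdentity(noMatchingRows)); // -9223372036854775808
        System.out.println(sumWithIdentity(noMatchingRows)); // 0
    }
}
```

Under this assumption, SUM happens to agree with the documented default of 0 only because 0 is also the identity for addition, while MAX exposes its identity value instead.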
