[GitHub] [druid] maytasm3 commented on a change in pull request #9187: Implement ANY aggregator

GitBox Thu, 16 Jan 2020 12:07:23 -0800

maytasm3 commented on a change in pull request #9187: Implement ANY aggregator
URL: https://github.com/apache/druid/pull/9187#discussion_r367624288


 ##########
 File path: docs/querying/sql.md
 ##########
 @@ -203,6 +203,10 @@ Only the COUNT aggregation can accept DISTINCT.
 |`EARLIEST(expr, maxBytesPerString)`|Like `EARLIEST(expr)`, but for strings. 
The `maxBytesPerString` parameter determines how much aggregation space to 
allocate per string. Strings longer than this limit will be truncated. This 
parameter should be set as low as possible, since high values will lead to 
wasted memory.|
 |`LATEST(expr)`|Returns the latest non-null value of `expr`, which must be 
numeric. If `expr` comes from a relation with a timestamp column (like a Druid 
datasource) then "latest" is the value last encountered with the maximum 
overall timestamp of all values being aggregated. If `expr` does not come from 
a relation with a timestamp, then it is simply the last value encountered.|
 |`LATEST(expr, maxBytesPerString)`|Like `LATEST(expr)`, but for strings. The 
`maxBytesPerString` parameter determines how much aggregation space to allocate 
per string. Strings longer than this limit will be truncated. This parameter 
should be set as low as possible, since high values will lead to wasted memory.|
+|`ANY_VALUE(expr)`|Returns any value of `expr`, which must be numeric. If 
`druid.generic.useDefaultValueForNull=true`, this can return the default value 
for null and does not prefer "non-null" values over the default value for null. 
If `druid.generic.useDefaultValueForNull=false`, then this will return any 
non-null value of `expr`.|
+|`ANY_VALUE(expr, maxBytesPerString)`|Like `ANY_VALUE(expr)`, but for strings. 
The `maxBytesPerString` parameter determines how much aggregation space to 
allocate per string. Strings longer than this limit will be truncated. This 
parameter should be set as low as possible, since high values will lead to 
wasted memory.|
 
 Review comment:
   Currently, LATEST and EARLIEST (and ANY, since I based it on them) behave 
as follows: if you use the native JSON aggregators, maxStringBytes is optional 
and, if not present, defaults to 1024 (as per the docs in 
docs/querying/aggregations.md). 
   However, this does not work the same way if you issue the query through 
SQL. To use LATEST, EARLIEST (and ANY) on strings in SQL, you must give 
maxStringBytes as the second argument. If you do not, the column actually gets 
cast to double (super weird).  
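
   A minimal sketch of the two SQL forms described above. The datasource 
`wikipedia` and the string column `page` are hypothetical names chosen for 
illustration, and 1024 simply mirrors the native JSON default mentioned above:

```sql
-- Hypothetical datasource "wikipedia" with a string column "page".

-- String form: the byte limit is passed explicitly as the second argument.
SELECT LATEST(page, 1024) AS latest_page
FROM wikipedia;

-- One-argument form: documented as numeric-only. Per the comment above,
-- a string column passed this way ends up cast to double.
SELECT LATEST(page) AS latest_page
FROM wikipedia;
```

   The same pattern would apply to EARLIEST and to the proposed ANY_VALUE.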

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

