[ 
https://issues.apache.org/jira/browse/CALCITE-4661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17367879#comment-17367879
 ] 

duan xiong commented on CALCITE-4661:
-------------------------------------

[~julianhyde] This page [https://wiki.postgresql.org/wiki/Aggregate_Mode] 
explains why it is useful to require the input values of MODE to be sorted. To 
summarize:

1) The new built-in function ({{mode()}}, which requires sorted input) is 
_much faster_. 

If we don't require the values of MODE to be sorted, we have to gather every 
value into an array and, once all values have been collected, run a function 
that finds the most common value in that array. If we do require sorted input, 
we don't need to gather all the values in an array: equal values arrive next 
to each other, so we only need to keep the most common value seen so far.
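
To illustrate the sorted case, here is a minimal sketch of a single pass that 
only keeps the current run and the best run so far. The class and method names 
are made up for this comment and are not Calcite or PostgreSQL code, and the 
sketch assumes NULLs have already been filtered out of the input.

```java
import java.util.List;
import java.util.Objects;

// Hypothetical helper, for illustration only (not part of Calcite).
final class SortedModeSketch {
  private SortedModeSketch() {}

  /**
   * Returns the most frequent value of an already-sorted list,
   * or null if the list is empty. Assumes the list contains no nulls.
   */
  static <T> T modeOfSorted(List<T> sortedValues) {
    T best = null;          // most common value seen so far
    long bestCount = 0;     // length of its run
    T current = null;       // value of the run being scanned
    long currentCount = 0;  // length of the run being scanned

    for (T value : sortedValues) {
      if (currentCount > 0 && Objects.equals(value, current)) {
        currentCount++;     // same run continues
      } else {
        current = value;    // a new run starts
        currentCount = 1;
      }
      // Strict comparison: on a tie, the earlier run is kept.
      if (currentCount > bestCount) {
        best = current;
        bestCount = currentCount;
      }
    }
    return best;
  }
}
```

The state is constant in size no matter how many rows are aggregated, which is 
the point of requiring sorted input.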

And some implementation tips:
 * Like most built-in aggregate functions, {{mode()}} ignores NULL values. If 
the most common value is NULL, the built-in {{mode()}} returns the second most 
common value.
 * The built-in {{mode()}} does not return an error if the expression is NULL 
in all rows. It returns NULL instead.
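
The two NULL rules above could look like this in an accumulator-style 
aggregate. This is only a sketch under my own assumptions: the class name and 
the {{add}}/{{result}} methods are hypothetical and not Calcite's aggregate 
API, and it counts values in a map so that it does not depend on input order.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical accumulator, for illustration only (not Calcite's API).
final class NullAwareModeSketch<T> {
  // Insertion-ordered so that ties resolve to the value seen first.
  private final Map<T, Long> counts = new LinkedHashMap<>();

  /** Adds one input row; NULL values are ignored. */
  void add(T value) {
    if (value == null) {
      return;
    }
    counts.merge(value, 1L, Long::sum);
  }

  /** Returns the most frequent non-null value, or null if there was none. */
  T result() {
    T best = null;
    long bestCount = 0;
    for (Map.Entry<T, Long> entry : counts.entrySet()) {
      if (entry.getValue() > bestCount) {
        best = entry.getKey();
        bestCount = entry.getValue();
      }
    }
    return best;
  }
}
```

With the inputs NULL, NULL, NULL, 'b', 'a', 'b' the result is 'b' even though 
NULL occurs most often, and with only NULL inputs the result is NULL rather 
than an error.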

> Add MODE aggregate functions
> ----------------------------
>
>                 Key: CALCITE-4661
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4661
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>            Reporter: duan xiong
>            Assignee: duan xiong
>            Priority: Major
>
> Add MODE functions in the operator table which returns the most frequent 
> input value (arbitrarily choosing the first one if there are multiple 
> equally-frequent results)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
