[ 
https://issues.apache.org/jira/browse/FLINK-2148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14628651#comment-14628651
 ] 

ASF GitHub Bot commented on FLINK-2148:
---------------------------------------

Github user gyfora commented on the pull request:

    https://github.com/apache/flink/pull/910#issuecomment-121733115
  
    This is an interesting issue, precisely because of the current API for 
aggregations. I think the main purpose of returning the aggregated values 
inside the original data is to make them usable in grouped aggregations. 
There, this makes perfect sense.
    
    Maybe someone knows better, but I think we do this because we don't have 
explicit key-value pairs.
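
    To illustrate the point about grouped aggregations, here is a minimal 
plain-Java sketch. It is not Flink code: the names GroupedRollingMax, Reading 
and rollingMax are hypothetical and only simulate a rolling maximum per group. 
Because the record has no explicit key-value split, the aggregate is written 
back into a copy of the record, so each result still carries the field that 
identifies its group.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this does not use or mirror the Flink API.
final class GroupedRollingMax {

    // A record without an explicit key/value split: "sensor" acts as the
    // grouping field, "value" is the field being aggregated.
    static final class Reading {
        final String sensor;
        final int value;

        Reading(String sensor, int value) {
            this.sensor = sensor;
            this.value = value;
        }

        @Override
        public String toString() {
            return "(" + sensor + ", " + value + ")";
        }
    }

    // For every input record, emit a record of the same shape whose "value"
    // field holds the maximum seen so far within that record's group.
    static List<Reading> rollingMax(List<Reading> input) {
        Map<String, Integer> maxPerGroup = new HashMap<>();
        List<Reading> output = new ArrayList<>();
        for (Reading r : input) {
            int max = maxPerGroup.merge(r.sensor, r.value, Math::max);
            output.add(new Reading(r.sensor, max));
        }
        return output;
    }

    public static void main(String[] args) {
        List<Reading> input = Arrays.asList(
                new Reading("a", 3),
                new Reading("b", 5),
                new Reading("a", 1),
                new Reading("a", 7),
                new Reading("b", 2));
        for (Reading r : rollingMax(input)) {
            System.out.println(r);
        }
    }
}
```

    If the output were the bare aggregate (3, 5, 3, 7, 5), a consumer could 
not tell which group each value belongs to. For an ungrouped aggregation such 
as a distinct count over the whole stream, there is no group to identify, 
which is why the same convention feels awkward there.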


> Approximately calculate the number of distinct elements of a stream
> -------------------------------------------------------------------
>
>                 Key: FLINK-2148
>                 URL: https://issues.apache.org/jira/browse/FLINK-2148
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Streaming
>            Reporter: Gabor Gevay
>            Assignee: Gabor Gevay
>            Priority: Minor
>              Labels: statistics
>
> In the paper
> http://people.seas.harvard.edu/~minilek/papers/f0.pdf
> Kane et al. describe an optimal algorithm for estimating the number of 
> distinct elements in a data stream.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
