Paris Carbone created FLINK-1284:
------------------------------------

             Summary: Uniform random sampling operator over windows
                 Key: FLINK-1284
                 URL: https://issues.apache.org/jira/browse/FLINK-1284
             Project: Flink
          Issue Type: New Feature
          Components: Streaming
            Reporter: Paris Carbone
            Priority: Minor


Several use cases would benefit from a built-in uniform random sampling 
operator in the streaming API that operates on windows, for example online 
machine learning, evaluating heuristics, or continuous visualisation of 
representative values.

The operator could take a field and the number of random samples required, 
and would follow a window statement like this:

mystream.window(..).sample(fieldID,#samples)
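For illustration only, the per-window sampling step could look like the sketch 
below. It uses plain Java and reservoir sampling (Algorithm R), which is one 
possible technique and not something this issue prescribes. The class name 
WindowSampler and its sample method are hypothetical and are not part of the 
Flink API; the window is modelled as a plain Iterable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not a Flink class: samples the elements of one window.
final class WindowSampler {

    // Returns up to k elements chosen uniformly at random, without replacement,
    // in a single pass over the window (reservoir sampling, Algorithm R).
    static <T> List<T> sample(Iterable<T> window, int k, Random rng) {
        if (k < 0) {
            throw new IllegalArgumentException("k must be non-negative");
        }
        List<T> reservoir = new ArrayList<>(k);
        int seen = 0; // number of window elements read so far
        for (T value : window) {
            seen++;
            if (reservoir.size() < k) {
                // The first k elements fill the reservoir directly.
                reservoir.add(value);
            } else {
                // Keep the new element with probability k / seen by drawing
                // a slot in [0, seen) and replacing only if it is below k.
                int slot = rng.nextInt(seen);
                if (slot < k) {
                    reservoir.set(slot, value);
                }
            }
        }
        return reservoir;
    }
}
```

A window with fewer than k elements is returned whole, so the caller always 
gets min(k, window size) values.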

If pre-aggregation is enabled, this could perhaps be implemented as a binary 
reduce operator or a combinable groupreduce that pre-aggregates the empirical 
samples of that field.
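One way to read the binary reduce idea is as a merge of two partial samples, 
each of which also remembers how many elements it stands for. The sketch below 
is an assumption about how such a merge could work, written in plain Java; 
PartialSample, of, and merge are hypothetical names, not Flink classes. It 
assumes both inputs were built with the same k, so that each holds 
min(k, count) items.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical partial aggregate: a uniform sample plus the number of
// elements it represents. Invariant: items.size() == min(k, count).
final class PartialSample<T> {
    final List<T> items;
    final long count;

    PartialSample(List<T> items, long count) {
        this.items = items;
        this.count = count;
    }

    // Wraps a single stream element as a partial sample of size one.
    static <T> PartialSample<T> of(T value) {
        List<T> single = new ArrayList<>();
        single.add(value);
        return new PartialSample<>(single, 1);
    }

    // Binary reduce step: combines two partial samples into a uniform sample
    // of size min(k, a.count + b.count) over both underlying populations.
    static <T> PartialSample<T> merge(PartialSample<T> a, PartialSample<T> b,
                                      int k, Random rng) {
        List<T> left = new ArrayList<>(a.items);
        List<T> right = new ArrayList<>(b.items);
        long remA = a.count; // population elements of a not yet drawn
        long remB = b.count; // population elements of b not yet drawn
        List<T> out = new ArrayList<>();
        while (out.size() < k && remA + remB > 0) {
            // Draw from a side with probability proportional to its
            // remaining population, as if sampling the union directly.
            boolean fromA = remB == 0
                    || (remA > 0 && rng.nextDouble() * (remA + remB) < remA);
            List<T> src = fromA ? left : right;
            out.add(src.remove(rng.nextInt(src.size())));
            if (fromA) {
                remA--;
            } else {
                remB--;
            }
        }
        return new PartialSample<>(out, a.count + b.count);
    }
}
```

Carrying the count alongside the items is what lets the merge weight each side 
correctly. Merging the item lists alone would over-represent the smaller 
partial result.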





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
