[ https://issues.apache.org/jira/browse/FLINK-11943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kurt Young closed FLINK-11943.
------------------------------
Resolution: Duplicate
Fix Version/s: 1.9.0
> Support TopN feature for SQL
> ----------------------------
>
> Key: FLINK-11943
> URL: https://issues.apache.org/jira/browse/FLINK-11943
> Project: Flink
> Issue Type: New Feature
> Components: Table SQL / Runtime
> Reporter: Jark Wu
> Priority: Major
> Fix For: 1.9.0
>
>
> TopN is a frequently used feature in data analysis. We can use ORDER BY +
> LIMIT to easily express a TopN query, e.g. {{SELECT * FROM T ORDER BY amount
> DESC LIMIT 10}}.
> However, this is a global TopN, and there is strong demand for per-group
> TopN, for example the top 10 shops in each category. To avoid introducing
> new syntax for this, we would like to express it with traditional syntax:
> {{ROW_NUMBER}} over a window plus a {{FILTER}} that limits the number of rows
> per group.
> For example:
> SELECT *
> FROM (
>   SELECT category, shopId, sales,
>     [ROW_NUMBER()|RANK()|DENSE_RANK()] OVER
>       (PARTITION BY category ORDER BY sales ASC) as rownum
>   FROM shop_sales
> )
> WHERE rownum <= 10
> This issue aims to optimize this query into a {{Rank}} node instead of
> {{Over}} plus {{Calc}}, and to translate the {{Rank}} node into physical
> operators.
> Several optimizations of the rank operator are possible depending on the
> input of the {{Rank}}. We would like to implement the basic, one-size-fits-all
> implementation first and make the performance improvements later.
> Here is a brief design doc:
> https://docs.google.com/document/d/14JCV6X6hcpoA51loprgntZNxQ2NmnDLucxgGY8xVDuI/edit#
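
For readers who want to check what the quoted query is expected to return, here is a minimal Python sketch of the per-group TopN semantics for the {{ROW_NUMBER()}} variant. It is only an illustration of the result set, not the {{Rank}} operator proposed in the issue; the function name top_n_per_group and the ascending parameter are hypothetical, and rows are assumed to be (category, shopId, sales) tuples.

```python
import heapq
from collections import defaultdict


def top_n_per_group(rows, n, ascending=True):
    """Emulate ROW_NUMBER() OVER (PARTITION BY category ORDER BY sales)
    followed by WHERE rownum <= n.

    rows: iterable of (category, shop_id, sales) tuples (assumed layout).
    Returns a list of (category, shop_id, sales, rownum) tuples.
    """
    # PARTITION BY category
    groups = defaultdict(list)
    for category, shop_id, sales in rows:
        groups[category].append((category, shop_id, sales))

    # ORDER BY sales ASC keeps the n smallest, DESC keeps the n largest.
    pick = heapq.nsmallest if ascending else heapq.nlargest

    result = []
    for category in sorted(groups):  # sorted only to make the output deterministic
        ranked = pick(n, groups[category], key=lambda row: row[2])
        # ROW_NUMBER() starts at 1 inside each partition.
        for rownum, (cat, shop_id, sales) in enumerate(ranked, start=1):
            result.append((cat, shop_id, sales, rownum))
    return result
```

Because the quoted query orders by sales ASC, row number 1 is the shop with the lowest sales in its category; passing ascending=False corresponds to ORDER BY sales DESC. {{RANK()}} and {{DENSE_RANK()}} treat ties differently and are not covered by this sketch.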