[
https://issues.apache.org/jira/browse/FLINK-23751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Martijn Visser resolved FLINK-23751.
------------------------------------
Resolution: Resolved
I've verified the results using the following statements, followed by a manual
comparison of the raw data against the computed TopN results:
{code:sql}
CREATE TABLE Bid (
bidtime TIMESTAMP(3),
price DOUBLE,
item STRING,
supplier_id STRING,
WATERMARK FOR bidtime AS bidtime - INTERVAL '10' SECOND
) WITH (
'connector' = 'faker',
'fields.bidtime.expression' = '#{date.past ''5'',''SECONDS''}',
'fields.price.expression' = '#{Number.randomDouble ''2'',''1'',''150''}',
'fields.item.expression' = '#{Commerce.productName}',
'fields.supplier_id.expression' = '#{regexify ''(Alice|Bob|Carol|Alex|Joe|James|Jane|Jack)''}',
'rows-per-second' = '1'
);
CREATE TABLE RawBids (
bidtime TIMESTAMP(3),
price DOUBLE,
item STRING,
supplier_id STRING
) WITH (
'connector' = 'filesystem', -- required: specify the connector
'path' = 'file:////Users/martijnvisser/Downloads/rawdata', -- required: path to a directory
'format' = 'csv', -- required: the file system connector requires a format
'sink.rolling-policy.rollover-interval' = '5 minutes',
'partition.default-name' = 'RawData'
);
CREATE TABLE TopN (
bidtime TIMESTAMP(3),
price DOUBLE,
item STRING,
supplier_id STRING,
window_start TIMESTAMP(3),
window_end TIMESTAMP(3),
rownum BIGINT
) WITH (
'connector' = 'filesystem', -- required: specify the connector
'path' = 'file:////Users/martijnvisser/Downloads/topn', -- required: path to a directory
'format' = 'csv', -- required: the file system connector requires a format
'sink.rolling-policy.rollover-interval' = '5 minutes',
'partition.default-name' = 'TopN'
);
BEGIN STATEMENT SET;
INSERT INTO RawBids
SELECT * from Bid;
INSERT INTO TopN
SELECT *
FROM (
SELECT bidtime, price, item, supplier_id, window_start, window_end,
ROW_NUMBER() OVER (PARTITION BY window_start, window_end ORDER BY price DESC)
as rownum
FROM TABLE(
TUMBLE(TABLE Bid, DESCRIPTOR(bidtime), INTERVAL '1' MINUTE))
) WHERE rownum <= 3;
END;
{code}
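The manual comparison could also be scripted. The following is only a minimal Python sketch of that idea, not the procedure actually used for this verification: it recomputes the Top 3 per tumbling window from the sample rows in the issue description below (not from the generated CSV files), and the helper names {{window_start}} and {{top_n_per_window}} are hypothetical. Ties on price are broken by input order here, which is an assumption; Flink does not guarantee a particular order for equal prices.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Sample rows copied from the issue description: (bidtime, price, item, supplier_id).
BIDS = [
    ("2020-04-15 08:05", 4.00, "A", "supplier1"),
    ("2020-04-15 08:06", 4.00, "C", "supplier2"),
    ("2020-04-15 08:07", 2.00, "G", "supplier1"),
    ("2020-04-15 08:08", 2.00, "B", "supplier3"),
    ("2020-04-15 08:09", 5.00, "D", "supplier4"),
    ("2020-04-15 08:11", 2.00, "B", "supplier3"),
    ("2020-04-15 08:13", 1.00, "E", "supplier1"),
    ("2020-04-15 08:15", 3.00, "H", "supplier2"),
    ("2020-04-15 08:17", 6.00, "F", "supplier5"),
]

EPOCH = datetime(1970, 1, 1)


def window_start(ts, size):
    """Return the start of the epoch-aligned tumbling window containing ts."""
    return ts - (ts - EPOCH) % size


def top_n_per_window(rows, size=timedelta(minutes=10), n=3):
    """Group rows into tumbling windows and keep the n highest-priced items."""
    windows = defaultdict(list)
    for bidtime, price, item, supplier_id in rows:
        ts = datetime.strptime(bidtime, "%Y-%m-%d %H:%M")
        windows[window_start(ts, size)].append((price, item))

    result = {}
    for start, bids in sorted(windows.items()):
        # sorted() is stable, so rows with equal prices keep their input order
        # (an assumption made for this sketch only).
        ranked = sorted(bids, key=lambda bid: bid[0], reverse=True)[:n]
        result[start] = [item for _, item in ranked]
    return result


if __name__ == "__main__":
    for start, items in top_n_per_window(BIDS).items():
        print(start, items)
```

For the sample rows this yields D, A, C for the 08:00 window and F, H, B for the 08:10 window, which agrees with the rownum column in the expected output of the issue description.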
> Testing Window Top-N after Windowing TVF
> ----------------------------------------
>
> Key: FLINK-23751
> URL: https://issues.apache.org/jira/browse/FLINK-23751
> Project: Flink
> Issue Type: Improvement
> Components: Tests
> Reporter: JING ZHANG
> Assignee: Martijn Visser
> Priority: Blocker
> Labels: release-testing
> Fix For: 1.14.0
>
>
> Currently, Flink supports Window Top-N not only after [Window
> Aggregation|https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/table/sql/queries/window-agg/]
> but also directly after a [Windowing
> TVF|https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/table/sql/queries/window-tvf/].
> The following example shows how to calculate the top 3 items with the
> highest price for every 10-minute tumbling window.
> {code:sql}
> SELECT * FROM Bid;
> +------------------+-------+------+-------------+
> |          bidtime | price | item | supplier_id |
> +------------------+-------+------+-------------+
> | 2020-04-15 08:05 |  4.00 |    A |   supplier1 |
> | 2020-04-15 08:06 |  4.00 |    C |   supplier2 |
> | 2020-04-15 08:07 |  2.00 |    G |   supplier1 |
> | 2020-04-15 08:08 |  2.00 |    B |   supplier3 |
> | 2020-04-15 08:09 |  5.00 |    D |   supplier4 |
> | 2020-04-15 08:11 |  2.00 |    B |   supplier3 |
> | 2020-04-15 08:13 |  1.00 |    E |   supplier1 |
> | 2020-04-15 08:15 |  3.00 |    H |   supplier2 |
> | 2020-04-15 08:17 |  6.00 |    F |   supplier5 |
> +------------------+-------+------+-------------+
> Flink SQL> SELECT *
> FROM (
> SELECT *, ROW_NUMBER() OVER (PARTITION BY window_start, window_end ORDER BY
> price DESC) as rownum
> FROM TABLE(
> TUMBLE(TABLE Bid, DESCRIPTOR(bidtime), INTERVAL '10' MINUTES))
> ) WHERE rownum <= 3;
> +------------------+-------+------+-------------+------------------+------------------+--------+
> |          bidtime | price | item | supplier_id |     window_start |       window_end | rownum |
> +------------------+-------+------+-------------+------------------+------------------+--------+
> | 2020-04-15 08:05 |  4.00 |    A |   supplier1 | 2020-04-15 08:00 | 2020-04-15 08:10 |      2 |
> | 2020-04-15 08:06 |  4.00 |    C |   supplier2 | 2020-04-15 08:00 | 2020-04-15 08:10 |      3 |
> | 2020-04-15 08:09 |  5.00 |    D |   supplier4 | 2020-04-15 08:00 | 2020-04-15 08:10 |      1 |
> | 2020-04-15 08:11 |  2.00 |    B |   supplier3 | 2020-04-15 08:10 | 2020-04-15 08:20 |      3 |
> | 2020-04-15 08:15 |  3.00 |    H |   supplier2 | 2020-04-15 08:10 | 2020-04-15 08:20 |      2 |
> | 2020-04-15 08:17 |  6.00 |    F |   supplier5 | 2020-04-15 08:10 | 2020-04-15 08:20 |      1 |
> +------------------+-------+------+-------------+------------------+------------------+--------+
> {code}
> Note: Currently, Flink only supports Window Top-N after a Windowing TVF
> with Tumble, Hop, and Cumulate windows. Window Top-N after a Windowing TVF
> with Session windows will be supported in the near future.
>
>