[GitHub] [flink] beyond1920 commented on a change in pull request #17651: [FLINK-24656][docs] Add documentation for window deduplication

2021-11-08 Thread GitBox


beyond1920 commented on a change in pull request #17651:
URL: https://github.com/apache/flink/pull/17651#discussion_r745333651



##
File path: docs/content.zh/docs/dev/table/sql/queries/window-deduplication.md
##
@@ -0,0 +1,93 @@
+---
+title: "窗口去重"
+weight: 16
+type: docs
+---
+
+
+# Window Deduplication
+{{< label Batch >}} {{< label Streaming >}}
+
+Window Deduplication is a special [Deduplication]({{< ref "docs/dev/table/sql/queries/deduplication" >}}) which removes rows that duplicate over a set of columns, keeping only the first one or the last one for each window and other partitioned keys.
+
+For streaming queries, unlike regular Deduplicate on continuous tables, window Deduplication does not emit intermediate results but only a final result at the end of the window. Moreover, window Deduplication purges all intermediate state when no longer needed.
+Therefore, window Deduplication queries have better performance if users don't need results updated per record. Usually, Window Deduplication is used with [Windowing TVF]({{< ref "docs/dev/table/sql/queries/window-tvf" >}}) directly. Besides, Window Deduplication could be used with other operations based on [Windowing TVF]({{< ref "docs/dev/table/sql/queries/window-tvf" >}}), such as [Window Aggregation]({{< ref "docs/dev/table/sql/queries/window-agg" >}}), [Window TopN]({{< ref "docs/dev/table/sql/queries/window-topn">}}) and [Window Join]({{< ref "docs/dev/table/sql/queries/window-join">}}).
+
+Window Deduplication can be defined in the same syntax as regular Deduplication, see [Deduplication documentation]({{< ref "docs/dev/table/sql/queries/deduplication" >}}) for more information.
+Besides that, Window Deduplication requires the `PARTITION BY` clause contains `window_start` and `window_end` columns of the relation applied [Windowing TVF]({{< ref "docs/dev/table/sql/queries/window-tvf" >}}) or [Window Aggregation]({{< ref "docs/dev/table/sql/queries/window-agg" >}}).

Review comment:
   All operations based on Windowing TVF are allowed here, not only Windowing TVF and Window Aggregation.
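
   For readers following the thread, the rule under discussion (the `PARTITION BY` clause must contain `window_start` and `window_end`) can be illustrated with a small runnable sketch. This is not Flink code: it uses Python's built-in `sqlite3` module, and the window columns are emulated with integer arithmetic, whereas in Flink they would come from a Windowing TVF such as `TUMBLE`. The `Bid` table, its columns, the sample rows, and the 10-minute window size are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical bid events: (bidtime in minutes, price, item).
BIDS = [
    (5, 4.00, "C"),
    (7, 2.00, "A"),
    (9, 5.00, "D"),
    (11, 3.00, "B"),
    (13, 1.00, "E"),
    (17, 6.00, "F"),
]

WINDOW_SIZE = 10  # assumed tumbling window length in minutes

# The innermost SELECT emulates what a tumbling Windowing TVF would add:
# a window_start and window_end column for every row.
# The middle SELECT numbers rows per window, newest first.
# The outer WHERE keeps only the last row of each window.
QUERY = """
SELECT bidtime, price, item, window_start, window_end
FROM (
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY window_start, window_end
               ORDER BY bidtime DESC
           ) AS rownum
    FROM (
        SELECT bidtime, price, item,
               (bidtime / :size) * :size AS window_start,
               (bidtime / :size) * :size + :size AS window_end
        FROM Bid
    )
)
WHERE rownum <= 1
ORDER BY window_start
"""


def deduplicate_per_window(bids, size=WINDOW_SIZE):
    """Return the last bid of each emulated tumbling window."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE Bid (bidtime INTEGER, price REAL, item TEXT)")
        conn.executemany("INSERT INTO Bid VALUES (?, ?, ?)", bids)
        return conn.execute(QUERY, {"size": size}).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in deduplicate_per_window(BIDS):
        print(row)
```

   Changing `ORDER BY bidtime DESC` to `ASC` would keep the first row of each window instead of the last; adding further columns to `PARTITION BY` deduplicates per window and per key.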




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] beyond1920 commented on a change in pull request #17651: [FLINK-24656][docs] Add documentation for window deduplication

2021-11-08 Thread GitBox


beyond1920 commented on a change in pull request #17651:
URL: https://github.com/apache/flink/pull/17651#discussion_r745332586



##
File path: docs/content.zh/docs/dev/table/sql/queries/window-deduplication.md
##
@@ -0,0 +1,93 @@
+---
+title: "窗口去重"
+weight: 16
+type: docs
+---
+
+
+# Window Deduplication
+{{< label Batch >}} {{< label Streaming >}}
+
+Window Deduplication is a special [Deduplication]({{< ref "docs/dev/table/sql/queries/deduplication" >}}) which removes rows that duplicate over a set of columns, keeping only the first one or the last one for each window and other partitioned keys.
+
+For streaming queries, unlike regular Deduplicate on continuous tables, window Deduplication does not emit intermediate results but only a final result at the end of the window. Moreover, window Deduplication purges all intermediate state when no longer needed.
+Therefore, window Deduplication queries have better performance if users don't need results updated per record. Usually, Window Deduplication is used with [Windowing TVF]({{< ref "docs/dev/table/sql/queries/window-tvf" >}}) directly. Besides, Window Deduplication could be used with other operations based on [Windowing TVF]({{< ref "docs/dev/table/sql/queries/window-tvf" >}}), such as [Window Aggregation]({{< ref "docs/dev/table/sql/queries/window-agg" >}}), [Window TopN]({{< ref "docs/dev/table/sql/queries/window-topn">}}) and [Window Join]({{< ref "docs/dev/table/sql/queries/window-join">}}).

Review comment:
   Not yet supported.








[GitHub] [flink] beyond1920 commented on a change in pull request #17651: [FLINK-24656][docs] Add documentation for window deduplication

2021-11-08 Thread GitBox


beyond1920 commented on a change in pull request #17651:
URL: https://github.com/apache/flink/pull/17651#discussion_r745330897



##
File path: docs/content.zh/docs/dev/table/sql/queries/window-deduplication.md
##
@@ -0,0 +1,93 @@
+---
+title: "窗口去重"
+weight: 16
+type: docs
+---
+
+
+# Window Deduplication
+{{< label Batch >}} {{< label Streaming >}}

Review comment:
   Batch will be supported soon by [PR #17666](https://github.com/apache/flink/pull/17666) and [PR #17670](https://github.com/apache/flink/pull/17670).
   For now, I have marked it as streaming only.



