olehborysevych commented on code in PR #29507:
URL: https://github.com/apache/beam/pull/29507#discussion_r1411113948


##########
learning/prompts/documentation-lookup/08_basic_windowing.md:
##########
@@ -0,0 +1,28 @@
+Prompt:
+What is Windowing in Apache Beam?
+Response:
+Windowing is a key concept in stream processing, as it allows you to divide 
streams of data into logical units for efficient and correct parallel 
processing.
+With an unbounded data set, it is impossible to collect all of the elements, 
since new elements are constantly being added. In the Beam model, any 
PCollection (including unbounded PCollections) can be subdivided into [logical 
windows](https://beam.apache.org/documentation/programming-guide/#windowing-basics).
 Grouping transforms then consider each PCollection’s elements on a per-window 
basis.
+
+Because Beam's default windowing strategy assigns every element to a single 
global window, you must explicitly specify a [windowing 
function](https://beam.apache.org/documentation/programming-guide/#setting-your-pcollections-windowing-function)
 for your pipeline if you need any other windowing behavior.
+
+The following code snippet shows how to divide a PCollection into 60-second 
windows:
+```python
+import apache_beam as beam
+from apache_beam import window
+fixed_windowed_items = (
+    items | 'window' >> beam.WindowInto(window.FixedWindows(60)))
+```
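
To make the 60-second example concrete, here is a minimal plain-Python sketch of the arithmetic behind fixed-window assignment. It does not import Beam; the helper name `fixed_window_bounds` and its `offset` parameter are illustrative assumptions, and timestamps are treated as plain seconds.

```python
def fixed_window_bounds(timestamp, size=60, offset=0):
    """Return the half-open [start, end) bounds of the fixed window
    that contains `timestamp` (all values in seconds).

    Hypothetical helper for illustration only, not part of the Beam API.
    """
    # Round the timestamp down to the nearest window boundary.
    start = timestamp - (timestamp - offset) % size
    return (start, start + size)


if __name__ == "__main__":
    for ts in (0, 59, 60, 125):
        print(ts, fixed_window_bounds(ts))
```

Each timestamp falls into exactly one window, which is what distinguishes fixed windows from sliding windows.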
+
+Beam provides a number of [built-in windowing 
functions](https://beam.apache.org/documentation/programming-guide/#provided-windowing-functions)
 that you can use to subdivide your data into windows:
+- Fixed Time Windows
+- Sliding Time Windows
+- Per-Session Windows
+- Single Global Window
+- Calendar-based Windows (not supported by the Beam SDK for Python or Go)
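
As a rough illustration of how two of the window types listed above differ from fixed windows, the following plain-Python sketch computes sliding-window membership and merges timestamps into sessions. It does not use Beam; the function names are hypothetical, timestamps are plain seconds, and the session logic assumes all timestamps belong to a single key.

```python
def sliding_window_bounds(timestamp, size, period, offset=0):
    """Return every [start, end) sliding window that contains `timestamp`.

    Hypothetical helper for illustration only, not part of the Beam API.
    """
    # Latest window start that is not after the timestamp.
    last_start = timestamp - (timestamp - offset) % period
    # Step backwards by `period` while the window still covers the timestamp.
    starts = range(last_start, timestamp - size, -period)
    return [(s, s + size) for s in starts]


def session_windows(timestamps, gap):
    """Group timestamps into sessions separated by at least `gap` seconds.

    Each element starts as the window [ts, ts + gap); overlapping
    windows are merged into one session.
    """
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts < sessions[-1][1]:
            # The element falls inside the current session: extend it.
            sessions[-1] = (sessions[-1][0], ts + gap)
        else:
            sessions.append((ts, ts + gap))
    return sessions
```

With a size of 60 and a period of 30, each element lands in two overlapping windows, while session boundaries depend entirely on the gaps in the data.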
+
+You can also create your own custom windowing function by extending 
[`WindowFn`](https://github.com/apache/beam/blob/master/sdks/python/apache_beam/transforms/window.py).
+
+You can also specify a [triggering 
strategy](https://beam.apache.org/documentation/programming-guide/#triggers) to 
determine when to emit the results of your pipeline’s windowed computations. If 
you do not, Beam applies its default trigger.
+
+You can adjust the windowing strategy to allow for [late 
data](https://beam.apache.org/documentation/programming-guide/#watermarks-and-late-data),
 that is, data that arrives after the watermark has passed the end of the window. 
By default, Beam discards late data. Setting an allowed lateness lets a window 
keep accepting elements for that long after the watermark passes its end.
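
The relationship between the watermark, the end of a window, and an allowed-lateness budget can be pictured with a small plain-Python sketch. This is a simplified model under stated assumptions, not Beam's implementation: the function name `classify_arrival` is hypothetical, all values are plain seconds, and the exact boundary handling is a simplification.

```python
def classify_arrival(window_end, watermark, allowed_lateness=0):
    """Classify an element arriving for a window, given the current watermark.

    Hypothetical helper for illustration only, not part of the Beam API.
    """
    if watermark < window_end:
        # The watermark has not yet passed the end of the window.
        return "on time"
    if watermark < window_end + allowed_lateness:
        # Late, but still within the allowed-lateness budget.
        return "late, accepted"
    # Late and past the budget: the element is discarded.
    return "dropped"


if __name__ == "__main__":
    for wm in (30, 90, 200):
        print(wm, classify_arrival(window_end=60, watermark=wm, allowed_lateness=120))
```

With the default budget of zero in this sketch, every element that arrives after the watermark passes the window end is dropped.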

Review Comment:
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.