dariabezkorovaina commented on code in PR #30284:
URL: https://github.com/apache/beam/pull/30284#discussion_r1487812410


##########
learning/prompts/documentation-lookup/29_advanced_side_inputs.md:
##########
@@ -0,0 +1,34 @@
+Prompt:
+What are side inputs in Apache Beam?
+
+Response:
+In Apache Beam, side inputs are additional inputs that can be provided to a 
[`ParDo` 
transform](https://beam.apache.org/documentation/programming-guide/#pardo), in 
addition to the main input 
[`PCollection`](https://beam.apache.org/documentation/basics/#pcollection). 
These side inputs can be used by the `ParDo` transform's `DoFn` to access and 
read additional data during the processing of each element in the main input 
`PCollection`.
+Side inputs are particularly useful when a Beam pipeline needs to ingest 
additional data at runtime when processing each element, rather than relying 
on preset or hard-coded values. The side input values can be derived from the 
main `PCollection` data or from a different branch of your pipeline. For 
example, in a streaming analytics use case, you can use side inputs to enrich 
streaming events with additional data retrieved from a lookup table.
+
+In the Apache Beam Python SDK, you can provide side inputs as extra arguments 
in the `DoFn`'s `process` method or the `Map`/`FlatMap` transforms. The Python 
SDK supports optional, positional, and keyword arguments.
+
+```python
+class MyDoFn(beam.DoFn):
+  def process(self, element, side_input):
+    ...
+```
+
+In the Java SDK, side inputs are provided to the `ParDo` transform by using 
the `.withSideInputs` method. To access a side input from within a `DoFn`, the 
`DoFn.ProcessContext.sideInput` method is invoked.
+
+```java
+PCollection<Integer> input = ...;
+PCollectionView<Integer> sideInput = ...;
+PCollection<Integer> output =
+    input.apply(ParDo.of(new DoFn<Integer, Integer>() {
+      @ProcessElement
+      public void processElement(ProcessContext c) {
+        Integer sideInputValue = c.sideInput(sideInput);
+        ...
+      }
+    }).withSideInputs(sideInput));
+```
+
+Side inputs can also be used to process windowed data. Apache Beam uses the 
windows for the main input elements to look up the appropriate windows for the 
side input elements. Apache Beam projects the main input’s window to the side 
input's window set and uses the side input from the resulting window. Main and 
side inputs might have identical or different windowing strategies.
+For example, if the main input `PCollection` is windowed into ten-minute 
windows, and the side input is windowed into one-hour windows, Apache Beam 
projects each main input window against the side input window set and selects 
the value from the corresponding one-hour side input window. For more 
information, refer to [side inputs and 
windowing](https://beam.apache.org/documentation/programming-guide/#side-inputs-windowing).
+
+For common patterns and code samples in Apache Beam pipelines to implement 
side inputs, refer to [side input 
patterns](https://beam.apache.org/documentation/patterns/side-inputs/).

Review Comment:
   ```suggestion
   For common patterns and code samples to implement side inputs in Apache Beam 
pipelines, refer to [side input 
patterns](https://beam.apache.org/documentation/patterns/side-inputs/).
   ```
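
For reference on the windowing paragraph above, here is a rough, plain-Python sketch of how a ten-minute main input window could map to a one-hour side input window. It does not use the Beam SDK: `IntervalWindow`, `fixed_window_for`, and `project_to_side_window` are hypothetical helpers written for illustration, and the choice to project using the last timestamp inside the main window is an assumption about the default mapping for fixed windows, not a copy of Beam's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntervalWindow:
    """Hypothetical stand-in for a fixed window: [start, end) in seconds."""
    start: int
    end: int


def fixed_window_for(timestamp: int, size: int) -> IntervalWindow:
    """Return the fixed window of the given size that contains the timestamp."""
    start = timestamp - (timestamp % size)
    return IntervalWindow(start, start + size)


def project_to_side_window(main_window: IntervalWindow, side_size: int) -> IntervalWindow:
    """Pick the side input window that contains the main window's last timestamp.

    Assumption: the projection uses the last timestamp inside the main window.
    """
    return fixed_window_for(main_window.end - 1, side_size)


TEN_MINUTES = 10 * 60
ONE_HOUR = 60 * 60

# An element at 4500 s (01:15:00) falls into the ten-minute window [4200, 4800).
main_window = fixed_window_for(4500, TEN_MINUTES)

# That main window maps to the one-hour side input window [3600, 7200).
side_window = project_to_side_window(main_window, ONE_HOUR)
```

In this simplified model, every ten-minute main window between 01:00:00 and 02:00:00 resolves to the same one-hour side input window, which matches the behavior the paragraph describes.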



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
