olehborysevych commented on code in PR #29507:
URL: https://github.com/apache/beam/pull/29507#discussion_r1411018332
##########
learning/prompts/documentation-lookup/02_basic_pipelines.md:
##########
@@ -0,0 +1,20 @@
+Prompt:
+What is a Pipeline in Apache Beam?
+Response:
+A [Pipeline](https://beam.apache.org/documentation/pipelines/design-your-pipeline/) in Apache Beam serves as an abstraction that encapsulates the entirety of a data processing task, including all the data and each step of the process. Essentially, it's a [Directed Acyclic Graph](https://en.wikipedia.org/wiki/Directed_acyclic_graph) of transformations (known as `PTransform`s) applied to collections of data (`PCollection`s).
+
+The simplest pipelines in Apache Beam follow a linear flow of operations, typically adhering to a read-process-write pattern. However, pipelines can also be significantly more complex, featuring multiple input sources, multiple output sinks, and operations (`PTransform`s) that can both read from and output to multiple `PCollection`s.
+
+For more information on pipeline design and best practices, see the [Common Pipeline Patterns](https://beam.apache.org/documentation/patterns/overview/)
+
+To use Beam, your driver program must first create an instance of the Beam SDK class `Pipeline` (typically in the `main()` function).

Review Comment:
   Done

##########
learning/prompts/documentation-lookup/02_basic_pipelines.md:
##########
@@ -0,0 +1,20 @@
+Prompt:
+What is a Pipeline in Apache Beam?
+Response:
+A [Pipeline](https://beam.apache.org/documentation/pipelines/design-your-pipeline/) in Apache Beam serves as an abstraction that encapsulates the entirety of a data processing task, including all the data and each step of the process. Essentially, it's a [Directed Acyclic Graph](https://en.wikipedia.org/wiki/Directed_acyclic_graph) of transformations (known as `PTransform`s) applied to collections of data (`PCollection`s).
+
+The simplest pipelines in Apache Beam follow a linear flow of operations, typically adhering to a read-process-write pattern. However, pipelines can also be significantly more complex, featuring multiple input sources, multiple output sinks, and operations (`PTransform`s) that can both read from and output to multiple `PCollection`s.
+
+For more information on pipeline design and best practices, see the [Common Pipeline Patterns](https://beam.apache.org/documentation/patterns/overview/)
+
+To use Beam, your driver program must first create an instance of the Beam SDK class `Pipeline` (typically in the `main()` function).
+
+```python
+import apache_beam as beam
+
+with beam.Pipeline() as pipeline:
+  pass # build your pipeline here
+```
+
+When you create your `Pipeline`, you’ll also need to set some [configuration options](https://beam.apache.org/documentation/programming-guide/#configuring-pipeline-options). You can set your pipeline’s configuration options programmatically, but it’s often easier to set the options ahead of time (or read them from the command line) and pass them to the `Pipeline` object when you create the object.

Review Comment:
   Done
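As an illustration for readers of this thread (not part of the quoted file): the quoted text describes the typical linear read-process-write shape of a simple pipeline, while the quoted snippet only constructs an empty `Pipeline`. A minimal sketch of such a linear pipeline, using the standard `beam.Create` and `beam.Map` transforms, could look like this:

```python
import apache_beam as beam

# Minimal sketch of the linear read-process-write shape described above.
with beam.Pipeline() as pipeline:
    (
        pipeline
        | "Read" >> beam.Create(["hello", "world"])  # source: an in-memory PCollection
        | "Process" >> beam.Map(str.upper)           # one PTransform processing step
        | "Write" >> beam.Map(print)                 # sink stand-in: prints each element
    )
```

Each `|` application adds one `PTransform` to the pipeline's DAG, so the same pattern extends naturally to pipelines with multiple sources, sinks, and branches.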

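Similarly, the closing paragraph on configuration options could be illustrated with Beam's `PipelineOptions` class; a minimal sketch, with a placeholder flag value, might be:

```python
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# Sketch: build options from command-line style flags (in practice often taken
# from sys.argv) and pass them to the Pipeline at construction time.
options = PipelineOptions(["--runner=DirectRunner"])  # placeholder flag for illustration

with beam.Pipeline(options=options) as pipeline:
    pass  # build your pipeline here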