sirenbyte commented on code in PR #25301:
URL: https://github.com/apache/beam/pull/25301#discussion_r1153936812
##########
learning/tour-of-beam/learning-content/IO/text-io/text-io-gcs-write/description.md:
##########
@@ -80,4 +80,8 @@ It is important to note that in order to interact with **GCS** you will need to
```
--tempLocation=gs://my-bucket/temp
```
-{{end}}
\ No newline at end of file
+{{end}}
+
+### Playground exercise
+
+You can write a `PCollection` to a file. Use a `DoFn` to generate numbers, filter them, and write the result to a file.
Review Comment:
done
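
For reference, a minimal sketch of the playground exercise described in this change, assuming the Beam Python SDK; the `GenerateNumbers` class and the `numbers` output prefix are hypothetical names chosen for illustration:

```python
import apache_beam as beam


class GenerateNumbers(beam.DoFn):
    # Emit the integers 0..element-1 for each input element.
    def process(self, element):
        yield from range(element)


with beam.Pipeline() as pipeline:
    (pipeline
     | 'Seed' >> beam.Create([10])
     | 'Generate' >> beam.ParDo(GenerateNumbers())
     | 'KeepEven' >> beam.Filter(lambda n: n % 2 == 0)  # filter before writing
     | 'Format' >> beam.Map(str)
     | 'Write' >> beam.io.WriteToText('numbers', file_name_suffix='.txt'))
```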
##########
learning/tour-of-beam/learning-content/IO/kafka-io/kafka-read/description.md:
##########
@@ -0,0 +1,72 @@
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+### Reading from Kafka using KafkaIO
+
+`KafkaIO` is a part of the Apache Beam SDK that provides a way to read data from Apache Kafka and write data to it. It allows for the creation of Beam pipelines that can consume data from a Kafka topic, process the data, and write the processed data back to another Kafka topic. This makes it possible to build data processing pipelines using Apache Beam that can easily integrate with a Kafka-based data architecture.
+
+When reading data from Kafka topics using Apache Beam, developers can use the `ReadFromKafka` transform to create a `PCollection` of Kafka messages. This transform takes the following parameters:
+
+When the `ReadFromKafka` transform is executed, it creates a `PCollection` of Kafka messages, where each message is represented as a tuple containing the key and value fields. If the `with_metadata` flag is set to `True`, the metadata fields are included in the tuple as well.
Review Comment:
Done
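
For context on the description above, a minimal read sketch assuming the Beam Python SDK, a broker at `localhost:9092`, and a hypothetical topic name; note that `ReadFromKafka` is a cross-language transform, so a Java expansion service must be available at runtime:

```python
import apache_beam as beam
from apache_beam.io.kafka import ReadFromKafka
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(streaming=True)
with beam.Pipeline(options=options) as pipeline:
    (pipeline
     # By default each element is a (key, value) tuple of bytes; passing
     # with_metadata=True also exposes topic/partition/offset fields.
     | ReadFromKafka(
           consumer_config={'bootstrap.servers': 'localhost:9092',
                            'auto.offset.reset': 'earliest'},
           topics=['my-input-topic'])
     | beam.Map(lambda kv: kv[1].decode('utf-8'))
     | beam.Map(print))
```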
##########
learning/tour-of-beam/learning-content/IO/kafka-io/kafka-write/description.md:
##########
@@ -13,8 +13,17 @@ limitations under the License.
-->
### Writing to Kafka using KafkaIO
-`KafkaIO` is a part of the Apache Beam SDK that provides a way to read data from Apache Kafka and write data to it. It allows for the creation of Beam pipelines that can consume data from a Kafka topic, process the data and write the processed data back to another Kafka topic. This makes it possible to build data processing pipelines using Apache Beam that can easily integrate with a Kafka-based data architecture.
+When writing data processing pipelines using Apache Beam, developers can use the `WriteToKafka` transform to write data to Kafka topics. This transform takes a `PCollection` of data as input and writes the data to a specified Kafka topic using a Kafka producer.
+To use the `WriteToKafka` transform, developers need to provide the following parameters:
+
+* **producer_config**: a dictionary that contains the Kafka producer configuration properties, such as the Kafka broker addresses and the number of acknowledgments to wait for before considering a message as sent.
+* **bootstrap.servers**: a configuration property in Apache Kafka that specifies the list of bootstrap servers that Kafka clients should use to connect to the Kafka cluster.
+* **topic**: the name of the Kafka topic to write the data to.
+* **key**: a function that takes an element from the input `PCollection` and returns the key to use for the Kafka message. The key is optional and can be `None`.
+* **value**: a function that takes an element from the input `PCollection` and returns the value to use for the Kafka message.
+
+When writing data to Kafka using Apache Beam, it is important to ensure that the pipeline is fault-tolerant and can handle failures such as network errors, broker failures, or message serialization errors. Apache Beam provides features such as checkpointing, retries, and dead-letter queues to help developers build robust and reliable data processing pipelines that can handle these types of failures.
Review Comment:
Done
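
And a matching write sketch under the same assumptions (local broker, hypothetical topic name, Java expansion service available). With the default byte-array serializers, `WriteToKafka` in the Python SDK expects each input element to already be a `(key, value)` tuple of bytes, so the mapping step below performs that encoding explicitly:

```python
import apache_beam as beam
from apache_beam.io.kafka import WriteToKafka

with beam.Pipeline() as pipeline:
    (pipeline
     | beam.Create(['one', 'two', 'three'])
     # Encode each element into the (key, value) byte tuple that the
     # default byte-array serializers expect.
     | beam.Map(lambda s: (s.encode('utf-8'), s.encode('utf-8')))
     | WriteToKafka(
           producer_config={'bootstrap.servers': 'localhost:9092'},
           topic='my-output-topic'))
```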
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]