This is an automated email from the ASF dual-hosted git repository.
kamir pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/incubator-wayang-website.git
The following commit(s) were added to refs/heads/main by this push:
new 65cf35ac Update 2024-03-10-kafka-meets-wayang-3.md
65cf35ac is described below
commit 65cf35ac46b0b80fee9c578d4ee232f1a61b73b2
Author: Mirko Kämpf <[email protected]>
AuthorDate: Sat Sep 21 07:49:02 2024 +0200
Update 2024-03-10-kafka-meets-wayang-3.md
Updated the third blog article.
---
blog/2024-03-10-kafka-meets-wayang-3.md | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/blog/2024-03-10-kafka-meets-wayang-3.md
b/blog/2024-03-10-kafka-meets-wayang-3.md
index 3c512b0f..24e2409b 100644
--- a/blog/2024-03-10-kafka-meets-wayang-3.md
+++ b/blog/2024-03-10-kafka-meets-wayang-3.md
@@ -14,15 +14,20 @@ Let's see how it goes this time with Apache Spark.
## The goal of this implementation
We want to process data from Apache Kafka topics, which are hosted on
Confluent cloud.
-In our example scenario, the data is available in multiple different clusters,
in different regions and owned by different organizations.
+In our example scenario, the data is available in multiple clusters, in
different regions, and owned by different organizations.
+Each organization uses the ["Stream Sharing" feature](https://docs.confluent.io/cloud/current/stream-sharing/index.html)
provided by Confluent Cloud.
-We assume, that the operator of our job has been granted appropriate
permissions, and the topic owner already provided the configuration properties,
including access coordinates and credentials.
+This way, the operator of our central processing job has been granted the
appropriate permissions. The platform provided us with the necessary
configuration properties, including access coordinates and credentials, on
behalf of the topic owner.
+
+The following illustration has already been introduced in part one of the blog
series, but for clarity we repeat it here.

-This illustration has already been introduced in part one.
-We focus on **Job 4** in the image and start to implement it.
-This time we expect the processing load to be higher so that we want to
utilize the scalability capabilities of Apache Spark.
+Today, we focus on **Job 4** in the image. We implement a program that uses
data federation based on multiple sources.
+Each source allows us to read the data from a particular topic so that we
can process it in a different governance context.
+In this example, it is a public processing context in which data from
multiple private processing contexts is used together.
+
+This use case is already prepared for high processing loads. We can utilize
the scalability capabilities of Apache Spark, or simply use a Java program
for initial tests of the solution. Switching between the two takes a single
line of code in Apache Wayang.
Again, we start with a **WayangContext**, as shown by examples in the Wayang
code repository.
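As a sketch of that starting point, assuming the Wayang artifacts for the Java and Spark platforms are on the classpath (class and plugin names as used in the Wayang examples; `ContextSetup` is a hypothetical class name), creating the context and switching platforms might look like this:

```java
import org.apache.wayang.core.api.Configuration;
import org.apache.wayang.core.api.WayangContext;
import org.apache.wayang.java.Java;
import org.apache.wayang.spark.Spark;

public class ContextSetup {
    public static void main(String[] args) {
        Configuration configuration = new Configuration();

        // Start on the Java platform for initial local tests of the job ...
        WayangContext context = new WayangContext(configuration)
                .withPlugin(Java.basicPlugin());

        // ... and for higher processing loads, register the Spark plugin
        // instead -- the one-line change mentioned above:
        // WayangContext context = new WayangContext(configuration)
        //         .withPlugin(Spark.basicPlugin());
    }
}
```

The rest of the plan (sources, operators, sink) stays untouched; only the registered plugin decides whether Wayang executes on the local Java platform or on an Apache Spark cluster.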