Hello Everyone,

https://github.com/apache/beam/pull/25686 is approved and merged. Below I describe the two new Beam project assets it adds, both for those who know Kubernetes and Terraform and for those who don't.

*Short Version (For those who know Kubernetes and Terraform)*:

This PR provides:

1. An end-to-end Infrastructure-as-Code solution that uses Terraform to provision a Google Kubernetes Engine (GKE) cluster from scratch, starting from a custom network, service account, and IAM roles, all the way to the private Autopilot cluster and a bastion host to connect to it. See instructions: https://github.com/apache/beam/tree/master/.test-infra/terraform/google-cloud-platform/google-kubernetes-engine

2. A strimzi.io Kafka cluster on Kubernetes. I tailored it using kustomize <https://kubernetes.io/docs/tasks/manage-kubernetes-objects/kustomization/> to version control the Strimzi operator and allow for cloud-specific overlays. The kustomize solution uses an internal GKE TCP load balancer overlay. See instructions: https://github.com/apache/beam/tree/master/.test-infra/kafka/strimzi

*Long Version (For those not as familiar with Kubernetes and Terraform)*:

What does this have to do with Beam?

This email relates to Beam by providing a resource to help with our Beam I/O related work. Beam I/Os are tools in the SDK that allow pipelines to read from and write to various databases and API-dependent resources. You may have seen, for example, BigQueryIO, a class in the SDK that provides transforms for reading from and writing to BigQuery. In this email's context, we have KafkaIO, a class that provides transforms for reading from and writing to Kafka. Kafka is an event streaming platform (See https://kafka.apache.org/). Our existing Jenkins solution was designed for integration testing and shuts down some important resources after tests complete, so we needed a way to spin up a Kafka resource on our own, in our own environments.

What is Kubernetes?

Kubernetes is an open-source system for automating deployment, scaling and management of containerized workloads (See kubernetes.io).
This PR chooses Kubernetes because it is a scalable environment and one typically used by enterprise users of Beam and KafkaIO.

What is Infrastructure-as-Code and Terraform?

Infrastructure-as-Code (IaC) is a declarative approach to managing resources using code. Bash scripts, or our existing Groovy scripts that provision resources, are also code, but they are imperative and arguably less readable and more error-prone. Terraform is one established IaC solution for provisioning resources on the major cloud providers. This PR focuses on Google Cloud and particularly Google Kubernetes Engine (GKE). In order to provision GKE, the Terraform solution also needs to provision the service account, network and other related resources per security best practice.

What value does this PR provide?

This PR provides an end-to-end solution to provision a private GKE Autopilot cluster and a Strimzi Kafka cluster. It allows a Beam developer to deploy both in their own Google Cloud project. Additionally, there are instructions on how to use the solution in your KafkaIO-related pipeline. They are listed below in recommended order:

1) https://github.com/apache/beam/tree/master/.test-infra/terraform/google-cloud-platform/google-kubernetes-engine
2) https://github.com/apache/beam/tree/master/.test-infra/kafka/strimzi

Best,

Damon

On Wed, Mar 1, 2023 at 4:12 PM Damon Douglas <damondoug...@google.com> wrote:

> Hello Everyone,
>
> I created a PR to provide to the Beam community terraform code to
> provision a private Google Kubernetes Engine and kubernetes manifests to
> provision an internally TCP load balanced strimzi.io Kafka cluster. This
> solution helped me a lot when I needed a repeatable solution to spin up
> resources for reading from and writing to Kafka without having to scratch
> my head and remember steps.
>
> https://github.com/apache/beam/pull/25686
>
> This is *not* meant to replace our current test-infra kubernetes and
> kafka setup which is designed for our automated testing using jenkins.
>
> Best,
>
> Damon