PatrickRen commented on code in PR #3165: URL: https://github.com/apache/flink-cdc/pull/3165#discussion_r1529629379
########## docs/content/docs/deployment/kubernetes.md: ########## @@ -23,3 +23,132 @@ KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. --> + +# Introduction + +Kubernetes is a popular container-orchestration system for automating computer application deployment, scaling, and management. +Flink's native Kubernetes integration allows you to directly deploy Flink on a running Kubernetes cluster. +Moreover, Flink is able to dynamically allocate and de-allocate TaskManagers depending on the required resources because it can directly talk to Kubernetes. + +Apache Flink also provides a Kubernetes operator for managing Flink clusters on Kubernetes. It supports both standalone and native deployment mode and greatly simplifies deployment, configuration and the life cycle management of Flink resources on Kubernetes. + +For more information, please refer to the [Flink Kubernetes Operator documentation](https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/concepts/overview/) + +## Preparation + +The doc assumes a running Kubernetes cluster fulfilling the following requirements: + +- Kubernetes >= 1.9. +- KubeConfig, which has access to list, create, delete pods and services, configurable via `~/.kube/config`. You can verify permissions by running `kubectl auth can-i <list|create|edit|delete> pods`. +- Enabled Kubernetes DNS. +- `default` service account with [RBAC](https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/resource-providers/native_kubernetes/#rbac) permissions to create, delete pods. + +If you have problems setting up a Kubernetes cluster, then take a look at [how to setup a Kubernetes cluster](https://kubernetes.io/docs/setup/). Review Comment: ```suggestion If you have problems setting up a Kubernetes cluster, please take a look at [how to setup a Kubernetes cluster](https://kubernetes.io/docs/setup/). ``` ########## docs/content/docs/deployment/kubernetes.md: ########## @@ -23,3 +23,132 @@ KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. --> + +# Introduction + +Kubernetes is a popular container-orchestration system for automating computer application deployment, scaling, and management. +Flink's native Kubernetes integration allows you to directly deploy Flink on a running Kubernetes cluster. +Moreover, Flink is able to dynamically allocate and de-allocate TaskManagers depending on the required resources because it can directly talk to Kubernetes. + +Apache Flink also provides a Kubernetes operator for managing Flink clusters on Kubernetes. It supports both standalone and native deployment mode and greatly simplifies deployment, configuration and the life cycle management of Flink resources on Kubernetes. + +For more information, please refer to the [Flink Kubernetes Operator documentation](https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/concepts/overview/) + +## Preparation + +The doc assumes a running Kubernetes cluster fulfilling the following requirements: + +- Kubernetes >= 1.9. +- KubeConfig, which has access to list, create, delete pods and services, configurable via `~/.kube/config`. You can verify permissions by running `kubectl auth can-i <list|create|edit|delete> pods`. +- Enabled Kubernetes DNS. +- `default` service account with [RBAC](https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/resource-providers/native_kubernetes/#rbac) permissions to create, delete pods. + +If you have problems setting up a Kubernetes cluster, then take a look at [how to setup a Kubernetes cluster](https://kubernetes.io/docs/setup/). + +## Session Mode + +Flink runs on all UNIX-like environments, i.e. Linux, Mac OS X, and Cygwin (for Windows). +You can refer [overview](../connectors/pipeline-connectors/overview.md) to check supported versions and download [the binary release](https://flink.apache.org/downloads/) of Flink, +then extract the archive: + +```bash +tar -xzf flink-*.tgz +``` + +You should set `FLINK_HOME` environment variables like: + +```bash +export FLINK_HOME=/path/flink-* +``` + +### Start a session cluster + +To start a session cluster on k8s, run the bash script that comes with Flink: + +```bash +cd /path/flink-* +./bin/kubernetes-session.sh -Dkubernetes.cluster-id=my-first-flink-cluster +``` + +After successful startup, the return information is as follows: + +``` +org.apache.flink.kubernetes.utils.KubernetesUtils [] - Kubernetes deployment requires a fixed port. Configuration blob.server.port will be set to 6124 +org.apache.flink.kubernetes.utils.KubernetesUtils [] - Kubernetes deployment requires a fixed port. Configuration taskmanager.rpc.port will be set to 6122 +org.apache.flink.kubernetes.KubernetesClusterDescriptor [] - Please note that Flink client operations(e.g. cancel, list, stop, savepoint, etc.) won't work from outside the Kubernetes cluster since 'kubernetes.rest-service.exposed.type' has been set to ClusterIP. +org.apache.flink.kubernetes.KubernetesClusterDescriptor [] - Create flink session cluster my-first-flink-cluster successfully, JobManager Web Interface: http://my-first-flink-cluster-rest.default:8081 +``` + +**please refer to [Flink documentation](https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/resource-providers/native_kubernetes/#accessing-flinks-web-ui) to expose Flink’s Web UI and REST endpoint.** +**You should ensure that REST endpoint can be accessed by the node of your submission.** +Then, you need to add these two config to your flink-conf.yaml: + +```yaml +rest.bind-port: {{REST_PORT}} +rest.address: {{NODE_IP}} +``` + +{{REST_PORT}} and {{NODE_IP}} should be replaced by the actual values of your JobManager Web Interface. + +### Set up Flink CDC +Download the tar file of Flink CDC from [release page](https://github.com/apache/flink-cdc/releases), then extract the archive: + +```bash +tar -xzf flink-cdc-*.tar.gz +``` + +flink-cdc directory will contain four directories: `bin`,`lib`,`log`,`conf`. + +Download the connector package listed below and move it to the `lib` directory. +**Download links are available only for stable releases, SNAPSHOT dependencies need to be built based on master or release branches by yourself.** +- [MySQL pipeline connector 3.0.0](https://repo1.maven.org/maven2/org/apache/flink/flink-cdc-pipeline-connector-mysql/3.0.0/flink-cdc-pipeline-connector-mysql-3.0.0.jar) +- [Apache Doris pipeline connector 3.0.0](https://repo1.maven.org/maven2/org/apache/flink/flink-cdc-pipeline-connector-doris/3.0.0/flink-cdc-pipeline-connector-doris-3.0.0.jar) + +### Submit a Flink CDC Job +Here is an example file for synchronizing the entire database `mysql-to-doris.yaml`: + +```yaml +################################################################################ +# Description: Sync MySQL all tables to Doris +################################################################################ +source: + type: mysql + hostname: localhost + port: 3306 + username: root + password: 123456 + tables: app_db.\.* + server-id: 5400-5404 + server-time-zone: UTC + +sink: + type: doris + fenodes: 127.0.0.1:8030 + username: root + password: "" + +pipeline: + name: Sync MySQL Database to Doris + parallelism: 2 + +``` + +You need to modify the configuration file according to your needs. +Finally, submit job to Flink Standalone cluster using Cli. + +```bash +cd /path/flink-cdc-* +./bin/flink-cdc.sh mysql-to-doris.yaml +``` + +After successful submission, the return information is as follows: + +```bash +Pipeline has been submitted to cluster. +Job ID: ae30f4580f1918bebf16752d4963dc54 +Job Description: Sync MySQL Database to Doris +``` + +We can find a job named `Sync MySQL Database to Doris` is running through Flink Web UI. Review Comment: ```suggestion Then you can find a job named `Sync MySQL Database to Doris` running through Flink Web UI. ``` ########## docs/content/docs/deployment/kubernetes.md: ########## @@ -23,3 +23,132 @@ KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. --> + +# Introduction + +Kubernetes is a popular container-orchestration system for automating computer application deployment, scaling, and management. +Flink's native Kubernetes integration allows you to directly deploy Flink on a running Kubernetes cluster. +Moreover, Flink is able to dynamically allocate and de-allocate TaskManagers depending on the required resources because it can directly talk to Kubernetes. + +Apache Flink also provides a Kubernetes operator for managing Flink clusters on Kubernetes. It supports both standalone and native deployment mode and greatly simplifies deployment, configuration and the life cycle management of Flink resources on Kubernetes. + +For more information, please refer to the [Flink Kubernetes Operator documentation](https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/concepts/overview/) + +## Preparation + +The doc assumes a running Kubernetes cluster fulfilling the following requirements: + +- Kubernetes >= 1.9. +- KubeConfig, which has access to list, create, delete pods and services, configurable via `~/.kube/config`. You can verify permissions by running `kubectl auth can-i <list|create|edit|delete> pods`. +- Enabled Kubernetes DNS. +- `default` service account with [RBAC](https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/resource-providers/native_kubernetes/#rbac) permissions to create, delete pods. + +If you have problems setting up a Kubernetes cluster, then take a look at [how to setup a Kubernetes cluster](https://kubernetes.io/docs/setup/). + +## Session Mode + +Flink runs on all UNIX-like environments, i.e. Linux, Mac OS X, and Cygwin (for Windows). +You can refer [overview](../connectors/pipeline-connectors/overview.md) to check supported versions and download [the binary release](https://flink.apache.org/downloads/) of Flink, +then extract the archive: + +```bash +tar -xzf flink-*.tgz +``` + +You should set `FLINK_HOME` environment variables like: + +```bash +export FLINK_HOME=/path/flink-* +``` + +### Start a session cluster + +To start a session cluster on k8s, run the bash script that comes with Flink: + +```bash +cd /path/flink-* +./bin/kubernetes-session.sh -Dkubernetes.cluster-id=my-first-flink-cluster +``` + +After successful startup, the return information is as follows: + +``` +org.apache.flink.kubernetes.utils.KubernetesUtils [] - Kubernetes deployment requires a fixed port. Configuration blob.server.port will be set to 6124 +org.apache.flink.kubernetes.utils.KubernetesUtils [] - Kubernetes deployment requires a fixed port. Configuration taskmanager.rpc.port will be set to 6122 +org.apache.flink.kubernetes.KubernetesClusterDescriptor [] - Please note that Flink client operations(e.g. cancel, list, stop, savepoint, etc.) won't work from outside the Kubernetes cluster since 'kubernetes.rest-service.exposed.type' has been set to ClusterIP. +org.apache.flink.kubernetes.KubernetesClusterDescriptor [] - Create flink session cluster my-first-flink-cluster successfully, JobManager Web Interface: http://my-first-flink-cluster-rest.default:8081 +``` + +**please refer to [Flink documentation](https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/resource-providers/native_kubernetes/#accessing-flinks-web-ui) to expose Flink’s Web UI and REST endpoint.** +**You should ensure that REST endpoint can be accessed by the node of your submission.** +Then, you need to add these two config to your flink-conf.yaml: + +```yaml +rest.bind-port: {{REST_PORT}} +rest.address: {{NODE_IP}} +``` + +{{REST_PORT}} and {{NODE_IP}} should be replaced by the actual values of your JobManager Web Interface. + +### Set up Flink CDC +Download the tar file of Flink CDC from [release page](https://github.com/apache/flink-cdc/releases), then extract the archive: + +```bash +tar -xzf flink-cdc-*.tar.gz +``` + +flink-cdc directory will contain four directories: `bin`,`lib`,`log`,`conf`. Review Comment: ```suggestion Extracted `flink-cdc` contains four directories: `bin`,`lib`,`log` and `conf`. ``` ########## docs/content/docs/deployment/kubernetes.md: ########## @@ -23,3 +23,132 @@ KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. --> + +# Introduction + +Kubernetes is a popular container-orchestration system for automating computer application deployment, scaling, and management. +Flink's native Kubernetes integration allows you to directly deploy Flink on a running Kubernetes cluster. +Moreover, Flink is able to dynamically allocate and de-allocate TaskManagers depending on the required resources because it can directly talk to Kubernetes. + +Apache Flink also provides a Kubernetes operator for managing Flink clusters on Kubernetes. It supports both standalone and native deployment mode and greatly simplifies deployment, configuration and the life cycle management of Flink resources on Kubernetes. + +For more information, please refer to the [Flink Kubernetes Operator documentation](https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/concepts/overview/) Review Comment: ```suggestion For more information, please refer to the [Flink Kubernetes Operator documentation](https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/concepts/overview/). ``` ########## docs/content/docs/deployment/kubernetes.md: ########## @@ -23,3 +23,132 @@ KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. --> + +# Introduction + +Kubernetes is a popular container-orchestration system for automating computer application deployment, scaling, and management. +Flink's native Kubernetes integration allows you to directly deploy Flink on a running Kubernetes cluster. +Moreover, Flink is able to dynamically allocate and de-allocate TaskManagers depending on the required resources because it can directly talk to Kubernetes. + +Apache Flink also provides a Kubernetes operator for managing Flink clusters on Kubernetes. It supports both standalone and native deployment mode and greatly simplifies deployment, configuration and the life cycle management of Flink resources on Kubernetes. + +For more information, please refer to the [Flink Kubernetes Operator documentation](https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/concepts/overview/) + +## Preparation + +The doc assumes a running Kubernetes cluster fulfilling the following requirements: + +- Kubernetes >= 1.9. +- KubeConfig, which has access to list, create, delete pods and services, configurable via `~/.kube/config`. You can verify permissions by running `kubectl auth can-i <list|create|edit|delete> pods`. +- Enabled Kubernetes DNS. +- `default` service account with [RBAC](https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/resource-providers/native_kubernetes/#rbac) permissions to create, delete pods. + +If you have problems setting up a Kubernetes cluster, then take a look at [how to setup a Kubernetes cluster](https://kubernetes.io/docs/setup/). + +## Session Mode + +Flink runs on all UNIX-like environments, i.e. Linux, Mac OS X, and Cygwin (for Windows). +You can refer [overview](../connectors/pipeline-connectors/overview.md) to check supported versions and download [the binary release](https://flink.apache.org/downloads/) of Flink, +then extract the archive: + +```bash +tar -xzf flink-*.tgz +``` + +You should set `FLINK_HOME` environment variables like: + +```bash +export FLINK_HOME=/path/flink-* +``` + +### Start a session cluster + +To start a session cluster on k8s, run the bash script that comes with Flink: + +```bash +cd /path/flink-* +./bin/kubernetes-session.sh -Dkubernetes.cluster-id=my-first-flink-cluster +``` + +After successful startup, the return information is as follows: + +``` +org.apache.flink.kubernetes.utils.KubernetesUtils [] - Kubernetes deployment requires a fixed port. Configuration blob.server.port will be set to 6124 +org.apache.flink.kubernetes.utils.KubernetesUtils [] - Kubernetes deployment requires a fixed port. Configuration taskmanager.rpc.port will be set to 6122 +org.apache.flink.kubernetes.KubernetesClusterDescriptor [] - Please note that Flink client operations(e.g. cancel, list, stop, savepoint, etc.) won't work from outside the Kubernetes cluster since 'kubernetes.rest-service.exposed.type' has been set to ClusterIP. +org.apache.flink.kubernetes.KubernetesClusterDescriptor [] - Create flink session cluster my-first-flink-cluster successfully, JobManager Web Interface: http://my-first-flink-cluster-rest.default:8081 +``` + +**please refer to [Flink documentation](https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/resource-providers/native_kubernetes/#accessing-flinks-web-ui) to expose Flink’s Web UI and REST endpoint.** +**You should ensure that REST endpoint can be accessed by the node of your submission.** +Then, you need to add these two config to your flink-conf.yaml: + +```yaml +rest.bind-port: {{REST_PORT}} +rest.address: {{NODE_IP}} +``` + +{{REST_PORT}} and {{NODE_IP}} should be replaced by the actual values of your JobManager Web Interface. + +### Set up Flink CDC +Download the tar file of Flink CDC from [release page](https://github.com/apache/flink-cdc/releases), then extract the archive: + +```bash +tar -xzf flink-cdc-*.tar.gz +``` + +flink-cdc directory will contain four directories: `bin`,`lib`,`log`,`conf`. + +Download the connector package listed below and move it to the `lib` directory. +**Download links are available only for stable releases, SNAPSHOT dependencies need to be built based on master or release branches by yourself.** +- [MySQL pipeline connector 3.0.0](https://repo1.maven.org/maven2/org/apache/flink/flink-cdc-pipeline-connector-mysql/3.0.0/flink-cdc-pipeline-connector-mysql-3.0.0.jar) +- [Apache Doris pipeline connector 3.0.0](https://repo1.maven.org/maven2/org/apache/flink/flink-cdc-pipeline-connector-doris/3.0.0/flink-cdc-pipeline-connector-doris-3.0.0.jar) Review Comment: What about pointing to `Connectors` pages here? It's hard to list all connectors here as our ecosystem expands in the future. ########## docs/content/docs/deployment/kubernetes.md: ########## @@ -23,3 +23,132 @@ KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. --> + +# Introduction + +Kubernetes is a popular container-orchestration system for automating computer application deployment, scaling, and management. +Flink's native Kubernetes integration allows you to directly deploy Flink on a running Kubernetes cluster. +Moreover, Flink is able to dynamically allocate and de-allocate TaskManagers depending on the required resources because it can directly talk to Kubernetes. + +Apache Flink also provides a Kubernetes operator for managing Flink clusters on Kubernetes. It supports both standalone and native deployment mode and greatly simplifies deployment, configuration and the life cycle management of Flink resources on Kubernetes. + +For more information, please refer to the [Flink Kubernetes Operator documentation](https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/concepts/overview/) + +## Preparation + +The doc assumes a running Kubernetes cluster fulfilling the following requirements: + +- Kubernetes >= 1.9. +- KubeConfig, which has access to list, create, delete pods and services, configurable via `~/.kube/config`. You can verify permissions by running `kubectl auth can-i <list|create|edit|delete> pods`. +- Enabled Kubernetes DNS. +- `default` service account with [RBAC](https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/resource-providers/native_kubernetes/#rbac) permissions to create, delete pods. + +If you have problems setting up a Kubernetes cluster, then take a look at [how to setup a Kubernetes cluster](https://kubernetes.io/docs/setup/). + +## Session Mode + +Flink runs on all UNIX-like environments, i.e. Linux, Mac OS X, and Cygwin (for Windows). +You can refer [overview](../connectors/pipeline-connectors/overview.md) to check supported versions and download [the binary release](https://flink.apache.org/downloads/) of Flink, +then extract the archive: + +```bash +tar -xzf flink-*.tgz +``` + +You should set `FLINK_HOME` environment variables like: + +```bash +export FLINK_HOME=/path/flink-* +``` + +### Start a session cluster + +To start a session cluster on k8s, run the bash script that comes with Flink: + +```bash +cd /path/flink-* +./bin/kubernetes-session.sh -Dkubernetes.cluster-id=my-first-flink-cluster +``` + +After successful startup, the return information is as follows: + +``` +org.apache.flink.kubernetes.utils.KubernetesUtils [] - Kubernetes deployment requires a fixed port. Configuration blob.server.port will be set to 6124 +org.apache.flink.kubernetes.utils.KubernetesUtils [] - Kubernetes deployment requires a fixed port. Configuration taskmanager.rpc.port will be set to 6122 +org.apache.flink.kubernetes.KubernetesClusterDescriptor [] - Please note that Flink client operations(e.g. cancel, list, stop, savepoint, etc.) won't work from outside the Kubernetes cluster since 'kubernetes.rest-service.exposed.type' has been set to ClusterIP. +org.apache.flink.kubernetes.KubernetesClusterDescriptor [] - Create flink session cluster my-first-flink-cluster successfully, JobManager Web Interface: http://my-first-flink-cluster-rest.default:8081 +``` + +**please refer to [Flink documentation](https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/resource-providers/native_kubernetes/#accessing-flinks-web-ui) to expose Flink’s Web UI and REST endpoint.** +**You should ensure that REST endpoint can be accessed by the node of your submission.** Review Comment: What about using a better style for this kind of reminders? For example: https://github.com/apache/flink-connector-kafka/blob/15f2662eccf461d9d539ed87a78c9851cd17fa43/docs/content/docs/connectors/datastream/kafka.md?plain=1#L47-L50 which displays like the blue box as below: <img width="909" alt="image" src="https://github.com/apache/flink-cdc/assets/18453087/32b51ff3-96d4-4efa-b665-efc919f2892d"> -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
