lvyanquan commented on code in PR #3605:
URL: https://github.com/apache/flink-cdc/pull/3605#discussion_r1904874893

##########
README.md:
##########
@@ -25,14 +25,25 @@ and elegance of data integration via YAML to describe the data movement and tran
 The Flink CDC prioritizes efficient end-to-end data integration and offers enhanced functionalities such as full database synchronization, sharding table synchronization, schema evolution and data transformation.
 
-
+
+### Quickstart Guide
+Flink CDC provides a CdcUp CLI utility to start a playground environment and run Flink CDC jobs.
+You will need to have a working Docker and Docker compose environment to use it.
+
+1. Run `git clone https://github.com/apache/flink-cdc.git --depth=1` to retrieve a copy of Flink CDC source code.
+2. Run `cd tools/cdcup/ && ./cdcup.sh init` to use the CdcUp tool to start a playground environment.
+3. Run `./cdcup.sh up` to initialize docker containers, and `./cdcup.sh pipeline` to submit a pipeline job.

Review Comment:
   `./cdcup.sh pipeline` => `./cdcup.sh pipeline <yaml>`

##########
tools/cdcup/README.md:
##########
@@ -0,0 +1,30 @@
+# cdcup
+
+A `docker` (`compose`) environment on Linux / macOS is required to play with this. Ruby is **not** necessary.
+
+## `./cdcup.sh init`
+
+Initialize a playground environment, and generate configuration files.
+
+## `./cdcup.sh up`
+
+Start docker containers. Note that it may take a while before database is ready.
+
+## `./cdcup.sh pipeline <yaml>`
+
+Submit a YAML pipeline job. Before executing this, please ensure that:
+
+1. All container are running and ready for connections
+2. (For MySQL) You've created at least one database & tables to be captured
+
+## `./cdcup.sh flink`
+
+Print Flink Web dashboard URL.

Review Comment:
   What's more, could we provide a command to display the username, password, and URL of the MySQL source in Docker Compose, or a command to enter the MySQL container?

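A minimal sketch of what a command for entering the MySQL container could look like. Everything here is an assumption for illustration, not part of the PR: the function name `open_mysql_shell`, the compose service name `mysql`, the `root` user, and the `CDCUP_DOCKER` override (a hypothetical hook that lets the command be dry-run with `echo`). The real service name and credentials would have to come from the generated docker-compose.yaml.

```shell
#!/usr/bin/env bash

# Sketch only: open an interactive MySQL client inside a compose service.
# Assumptions (not from the PR): the service is called "mysql", the user is
# "root", and CDCUP_DOCKER is a hypothetical override for the docker binary.
open_mysql_shell() {
  local service="${1:-mysql}"
  local user="${2:-root}"
  # -p makes the mysql client prompt for the password instead of taking it
  # on the command line.
  ${CDCUP_DOCKER:-docker} compose exec "$service" mysql -u"$user" -p
}

# Possible wiring inside cdcup.sh (hypothetical subcommand name):
#   elif [ "$1" == 'mysql' ]; then
#     open_mysql_shell
```

Setting `CDCUP_DOCKER=echo` prints the command that would be run instead of executing it, which is handy for checking the arguments without a running playground.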
##########
README.md:
##########
@@ -25,14 +25,25 @@ and elegance of data integration via YAML to describe the data movement and tran
 The Flink CDC prioritizes efficient end-to-end data integration and offers enhanced functionalities such as full database synchronization, sharding table synchronization, schema evolution and data transformation.
 
-
+
+### Quickstart Guide
+Flink CDC provides a CdcUp CLI utility to start a playground environment and run Flink CDC jobs.
+You will need to have a working Docker and Docker compose environment to use it.
+
+1. Run `git clone https://github.com/apache/flink-cdc.git --depth=1` to retrieve a copy of Flink CDC source code.
+2. Run `cd tools/cdcup/ && ./cdcup.sh init` to use the CdcUp tool to start a playground environment.
+3. Run `./cdcup.sh up` to initialize docker containers, and `./cdcup.sh pipeline` to submit a pipeline job.

Review Comment:
   What about providing a default YAML file for a MySQL-to-Doris pipeline? That way we could check the result directly in the Doris web UI.

##########
tools/cdcup/cdcup.sh:
##########
@@ -0,0 +1,71 @@
+#!/usr/bin/env bash
+
+# Do not continue after error
+set -e
+
+display_help() {
+  echo "Usage: ./cdcup.sh { init | up | pipeline <yaml> | flink | stop | down | help }"
+  echo
+  echo "Commands:"
+  echo " * init:"
+  echo " Initialize a playground environment, and generate configuration files."
+  echo
+  echo " * up:"
+  echo " Start docker containers. This may take a while before database is ready."
+  echo
+  echo " * pipeline <yaml>:"
+  echo " Submit a YAML pipeline job."
+  echo
+  echo " * flink:"
+  echo " Print Flink Web dashboard URL."
+  echo
+  echo " * stop:"
+  echo " Stop all running playground containers."
+  echo
+  echo " * down:"
+  echo " Stop and remove containers, networks, and volumes."
+  echo
+  echo " * help:"
+  echo " Print this message."
+}
+
+if [ "$1" == 'init' ]; then
+  printf "🚩 Building bootstrap docker image...\n"
+  docker build -q -t cdcup/bootstrap .
+  rm -rf cdc && mkdir -p cdc
+  printf "🚩 Starting bootstrap wizard...\n"
+  docker run -it --rm -v "$(pwd)/cdc":/cdc cdcup/bootstrap
+  mv cdc/docker-compose.yaml ./docker-compose.yaml
+  mv cdc/pipeline-definition.yaml ./pipeline-definition.yaml
+elif [ "$1" == 'up' ]; then
+  printf "🚩 Starting playground...\n"
+  docker compose up -d
+  docker compose exec jobmanager bash -c 'rm -rf /opt/flink-cdc'
+  docker compose cp cdc jobmanager:/opt/flink-cdc
+elif [ "$1" == 'pipeline' ]; then
+  if [ -z "$2" ]; then
+    printf "Usage: ./cdcup.sh pipeline <pipeline-definition.yaml>\n"
+    exit 1
+  fi
+  printf "🚩 Submitting pipeline job...\n"
+  docker compose cp "$2" jobmanager:/opt/flink-cdc/pipeline-definition.yaml
+  startup_script="cd /opt/flink-cdc && ./bin/flink-cdc.sh ./pipeline-definition.yaml --flink-home /opt/flink"
+  if test -f ./cdc/lib/hadoop-uber.jar; then
+    startup_script="$startup_script --jar lib/hadoop-uber.jar"
+  fi
+  if test -f ./cdc/lib/mysql-connector-java.jar; then
+    startup_script="$startup_script --jar lib/mysql-connector-java.jar"

Review Comment:
   I still get the following error when testing the script with Flink CDC 3.2.0:
   `Caused by: java.lang.ClassNotFoundException: com.mysql.cj.jdbc.Driver
       at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(Unknown Source)
       at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(Unknown Source)
       at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
       ... 10 more`
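One way to narrow down the `ClassNotFoundException` above is to check whether the JAR that the script passes via `--jar` actually contains the driver class. The sketch below is a diagnostic aid only, not a fix: the helper name `jar_has_class` is hypothetical and it assumes `unzip` is installed on the host. A positive result would only rule out a missing class in the JAR; it says nothing about whether the JAR ends up on the classpath of the class loader that failed.

```shell
#!/usr/bin/env bash

# Diagnostic sketch: succeed if the given JAR lists the given class.
# Returns 2 when the JAR file does not exist, 1 when the class is absent.
# Assumes `unzip` is available; jar_has_class is a hypothetical helper name.
jar_has_class() {
  local jar="$1"
  # Turn a dotted class name into its path inside the archive,
  # e.g. com.mysql.cj.jdbc.Driver -> com/mysql/cj/jdbc/Driver.class
  local class_path="${2//.//}.class"
  [ -f "$jar" ] || return 2
  # -F treats the path as a fixed string rather than a regular expression.
  unzip -l "$jar" 2>/dev/null | grep -qF -- "$class_path"
}

# Example call, using the path and class name from the review thread:
#   jar_has_class ./cdc/lib/mysql-connector-java.jar com.mysql.cj.jdbc.Driver \
#     && echo "driver class found in jar" \
#     || echo "driver class missing, or jar not present"
```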
