Zkplo commented on code in PR #1089:
URL: https://github.com/apache/inlong-website/pull/1089#discussion_r1864059753

##########
docs/quick_start/offline_data_sync/airflow_pulsar_mysql_example.md:
##########
@@ -0,0 +1,120 @@
+---
+title: Example of Airflow Offline Synchronization
+sidebar_position: 3
+---
+In the following content, a complete example will be used to introduce how to create Airflow scheduling tasks using Apache InLong and complete offline data synchronization from Pulsar to MySQL.
+
+## Deployment
+### Install InLong
+
+Before we begin, we need to install InLong. Here we provide two ways:
+- [Docker Deployment](deployment/docker.md) (Recommended)
+- [Bare Metal Deployment](deployment/bare_metal.md)
+
+### Add Connectors
+
+Download the [connectors](https://inlong.apache.org/downloads/) corresponding to Flink version, and after decompression, place `sort-connector-jdbc-[version]-SNAPSHOT.jar` in `/inlong-sort/connectors/` directory.
+> Currently, Apache InLong's offline data synchronization capability only supports Flink-1.18, so please download the 1.18 version of connectors.
+
+## Create Clusters And Data Target
+
+### Create Cluster Label
+
+
+### Register Pulsar Cluster
+
+
+
+### Create Data Target
+
+
+
+Execute the following SQL statement:
+
+```mysql
+CREATE TABLE sink_table (
+    id INT AUTO_INCREMENT PRIMARY KEY,
+    name VARCHAR(255) NOT NULL,
+    create_time TIMESTAMP DEFAULT CURRENT_TIMESTAMP
+);
+```
+
+## Airflow Initialization
+
+### Create Initial DAG
+
+Place the DAG file in the Airflow default DAG directory and wait for a while. The Airflow scheduler will scan the directory and load the DAG:
+
+
+> Airflow does not provide an API for DAG creation, so two original DAGs are required. `dag_creator` is used to create offline tasks, and `dag_cleaner` is used to clean up offline tasks regularly. They can be obtained from [Inlong](https://github.com/apache/inlong).
+

Review Comment:
   thx~ done.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
