This is an automated email from the ASF dual-hosted git repository.
leonard pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink-cdc.git
The following commit(s) were added to refs/heads/master by this push:
new b5cfc0b12 [docs][minor] Improve the YAML example docs
b5cfc0b12 is described below
commit b5cfc0b123fa97b99450d64ac2bb309588927c6d
Author: Robin Moffatt <[email protected]>
AuthorDate: Fri Apr 25 08:32:57 2025 +0100
[docs][minor] Improve the YAML example docs
This closes #3772
Co-authored-by: Leonard Xu <[email protected]>
---
docs/content/_index.md | 4 +-
docs/content/docs/core-concept/data-pipeline.md | 51 ++++++++++---------------
2 files changed, 23 insertions(+), 32 deletions(-)
diff --git a/docs/content/_index.md b/docs/content/_index.md
index dd52c5863..188c80c95 100644
--- a/docs/content/_index.md
+++ b/docs/content/_index.md
@@ -117,7 +117,7 @@ under the License.
<div class="divider w-1/2 opacity-50"></div>
</div>
<p class="text-sm my-0 text-center md:text-left">
- Flink CDC will soon support data transform operations of ETL, including column projection, computed column, filter expression and classical scalar functions.
+ Flink CDC supports data transform operations of ETL, including column projection, computed column, filter expression and classical scalar functions.
</p>
</div>
<div class="w-full md:w-1/3 px-8 py-6 flex flex-col flex-grow flex-shrink">
@@ -183,4 +183,4 @@ under the License.
Flink CDC is developed under the umbrella of <a class="text-white" href="https://flink.apache.org">Apache Flink</a>.
</p>
</div>
-</div>
\ No newline at end of file
+</div>
diff --git a/docs/content/docs/core-concept/data-pipeline.md b/docs/content/docs/core-concept/data-pipeline.md
index d6086a667..79c448aba 100644
--- a/docs/content/docs/core-concept/data-pipeline.md
+++ b/docs/content/docs/core-concept/data-pipeline.md
@@ -43,6 +43,10 @@ the following parts are optional:
We could use following yaml file to define a concise Data Pipeline describing synchronize all tables under MySQL app_db database to Doris :
```yaml
+ pipeline:
+ name: Sync MySQL Database to Doris
+ parallelism: 2
+
source:
type: mysql
hostname: localhost
@@ -56,28 +60,6 @@ We could use following yaml file to define a concise Data Pipeline describing sy
fenodes: 127.0.0.1:8030
username: root
password: ""
-
- transform:
- - source-table: adb.web_order01
- projection: \*, UPPER(product_name) as product_name
- filter: id > 10 AND order_id > 100
- description: project fields and filter
- - source-table: adb.web_order02
- projection: \*, UPPER(product_name) as product_name
- filter: id > 20 AND order_id > 200
- description: project fields and filter
-
- route:
- - source-table: app_db.orders
- sink-table: ods_db.ods_orders
- - source-table: app_db.shipments
- sink-table: ods_db.ods_shipments
- - source-table: app_db.products
- sink-table: ods_db.ods_products
-
- pipeline:
- name: Sync MySQL Database to Doris
- parallelism: 2
```
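For readers skimming the email, the concise example reads roughly as follows once this commit is applied: the mandatory `pipeline` block now leads the file, and the `transform`/`route` sections have been dropped from the concise variant. This is a sketch assembled from the hunks above; the source's port, credentials, and table pattern are not visible in this diff, so those values are illustrative placeholders only:

```yaml
# Resulting concise example (sketch) -- pipeline block first, then source and sink
pipeline:
  name: Sync MySQL Database to Doris
  parallelism: 2

source:
  type: mysql
  hostname: localhost
  port: 3306             # assumed; not shown in this hunk
  username: root         # assumed; not shown in this hunk
  password: "123456"     # assumed; not shown in this hunk
  tables: app_db.\.*     # assumed pattern for "all tables under app_db"

sink:
  type: doris
  fenodes: 127.0.0.1:8030
  username: root
  password: ""
```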
## With optional
@@ -127,11 +109,20 @@ We could use following yaml file to define a complicated Data Pipeline describin
```
# Pipeline Configurations
-The following config options of Data Pipeline level are supported:
-
-| parameter               | meaning                                                                                                 | optional/required |
-|-------------------------|---------------------------------------------------------------------------------------------------------|-------------------|
-| name                    | The name of the pipeline, which will be submitted to the Flink cluster as the job name.                 | optional          |
-| parallelism             | The global parallelism of the pipeline. Defaults to 1.                                                  | optional          |
-| local-time-zone         | The local time zone defines current session time zone id.                                               | optional          |
-| execution.runtime-mode  | The runtime mode of the pipeline includes STREAMING and BATCH, with the default value being STREAMING.  | optional          |
\ No newline at end of file
+
+The following config options of Data Pipeline level are supported.
+Note that whilst the parameters are each individually optional, at least one of them must be specified. That is to say, the `pipeline` section is mandatory and cannot be empty.
+
+
+| parameter                     | meaning [...]
+|-------------------------------|---------------------------------------------------------------- [...]
+| `name`                        | The name of the pipeline, which will be submitted to the Flink cluster as the job name. [...]
+| `parallelism`                 | The global parallelism of the pipeline. Defaults to 1. [...]
+| `local-time-zone`             | The local time zone defines current session time zone id. [...]
+| `execution.runtime-mode`      | The runtime mode of the pipeline includes STREAMING and BATCH, with the default value being STREAMING. [...]
+| `schema.change.behavior`      | How to handle [changes in schema]({{< ref "docs/core-concept/schema-evolution" >}}). One of: [`exception`]({{< ref "docs/core-concept/schema-evolution" >}}#exception-mode), [`evolve`]({{< ref "docs/core-concept/schema-evolution" >}}#evolve-mode), [`try_evolve`]({{< ref "docs/core-concept/schema-evolution" >}}#tryevolve-mode), [`lenient`]({{< ref "docs/core-concept/schema-evolution" >}}#lenient-mode) (default) or [`ignore`]({{< ref "docs/core-concept/sche [...]
+| `schema.operator.uid`         | The unique ID for schema operator. This ID will be used for inter-operator communications and must be unique across operators. [...]
+| `schema-operator.rpc-timeout` | The timeout time for SchemaOperator to wait downstream SchemaChangeEvent applying finished, the default value is 3 minutes. [...]
+
+NOTE: Whilst the above parameters are each individually optional, at least one of them must be specified. The `pipeline` section is mandatory and cannot be empty.
+
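As a usage sketch of the options tabulated in the new docs, a `pipeline` section exercising them might look like the following. The values are illustrative only: `Asia/Shanghai` is just an example ZoneId, and the `3 min` duration syntax is assumed to follow the usual Flink duration-string convention rather than being confirmed by this diff:

```yaml
pipeline:
  name: Sync MySQL Database to Doris   # submitted to the Flink cluster as the job name
  parallelism: 2                       # global parallelism; defaults to 1
  local-time-zone: Asia/Shanghai       # session time zone id (illustrative value)
  execution.runtime-mode: STREAMING    # STREAMING (default) or BATCH
  schema.change.behavior: lenient      # exception | evolve | try_evolve | lenient (default) | ignore
  schema-operator.rpc-timeout: 3 min   # wait for downstream schema-change apply; default 3 minutes
```

Any one of these keys on its own would satisfy the "at least one option" requirement the note describes, since the `pipeline` section must be present but no individual key is mandatory.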