This is an automated email from the ASF dual-hosted git repository.
leonard pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink-cdc.git
The following commit(s) were added to refs/heads/master by this push:
new b5cfc0b12 [docs][minor] Improve the YAML example docs
b5cfc0b12 is described below
commit b5cfc0b123fa97b99450d64ac2bb309588927c6d
Author: Robin Moffatt <[email protected]>
AuthorDate: Fri Apr 25 08:32:57 2025 +0100
[docs][minor] Improve the YAML example docs
This closes #3772
Co-authored-by: Leonard Xu <[email protected]>
---
docs/content/_index.md | 4 +-
docs/content/docs/core-concept/data-pipeline.md | 51 ++++++++++---------------
2 files changed, 23 insertions(+), 32 deletions(-)
diff --git a/docs/content/_index.md b/docs/content/_index.md
index dd52c5863..188c80c95 100644
--- a/docs/content/_index.md
+++ b/docs/content/_index.md
@@ -117,7 +117,7 @@ under the License.
<div class="divider w-1/2 opacity-50"></div>
</div>
<p class="text-sm my-0 text-center md:text-left">
- Flink CDC will soon support data transform operations of ETL, including column projection, computed column, filter expression and classical scalar functions.
+ Flink CDC supports data transform operations of ETL, including column projection, computed column, filter expression and classical scalar functions.
</p>
</div>
<div class="w-full md:w-1/3 px-8 py-6 flex flex-col flex-grow flex-shrink">
@@ -183,4 +183,4 @@ under the License.
Flink CDC is developed under the umbrella of <a class="text-white" href="https://flink.apache.org">Apache Flink</a>.
</p>
</div>
-</div>
\ No newline at end of file
+</div>
diff --git a/docs/content/docs/core-concept/data-pipeline.md b/docs/content/docs/core-concept/data-pipeline.md
index d6086a667..79c448aba 100644
--- a/docs/content/docs/core-concept/data-pipeline.md
+++ b/docs/content/docs/core-concept/data-pipeline.md
@@ -43,6 +43,10 @@ the following parts are optional:
We could use following yaml file to define a concise Data Pipeline describing synchronize all tables under MySQL app_db database to Doris :
```yaml
+ pipeline:
+ name: Sync MySQL Database to Doris
+ parallelism: 2
+
source:
type: mysql
hostname: localhost
@@ -56,28 +60,6 @@ We could use following yaml file to define a concise Data Pipeline describing sy
fenodes: 127.0.0.1:8030
username: root
password: ""
-
- transform:
- - source-table: adb.web_order01
- projection: \*, UPPER(product_name) as product_name
- filter: id > 10 AND order_id > 100
- description: project fields and filter
- - source-table: adb.web_order02
- projection: \*, UPPER(product_name) as product_name
- filter: id > 20 AND order_id > 200
- description: project fields and filter
-
- route:
- - source-table: app_db.orders
- sink-table: ods_db.ods_orders
- - source-table: app_db.shipments
- sink-table: ods_db.ods_shipments
- - source-table: app_db.products
- sink-table: ods_db.ods_products
-
- pipeline:
- name: Sync MySQL Database to Doris
- parallelism: 2
```
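For readers skimming the email, the concise example reads roughly as follows once this commit is applied: the mandatory `pipeline` block now leads the file, and the `transform`/`route` sections have been dropped from the concise variant. This is a sketch assembled from the hunks above; the source's port, credentials, and table pattern are not visible in this diff, so those values are illustrative placeholders only:

```yaml
# Resulting concise example (sketch) -- pipeline block first, then source and sink
pipeline:
  name: Sync MySQL Database to Doris
  parallelism: 2

source:
  type: mysql
  hostname: localhost
  port: 3306             # assumed; not shown in this hunk
  username: root         # assumed; not shown in this hunk
  password: "123456"     # assumed; not shown in this hunk
  tables: app_db.\.*     # assumed pattern for "all tables under app_db"

sink:
  type: doris
  fenodes: 127.0.0.1:8030
  username: root
  password: ""
```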
## With optional
@@ -127,11 +109,20 @@ We could use following yaml file to define a complicated Data Pipeline describin
```
# Pipeline Configurations
-The following config options of Data Pipeline level are supported:
-
-| parameter               | meaning                                                                                                 | optional/required |
-|-------------------------|---------------------------------------------------------------------------------------------------------|-------------------|
-| name                    | The name of the pipeline, which will be submitted to the Flink cluster as the job name.                 | optional          |
-| parallelism             | The global parallelism of the pipeline. Defaults to 1.                                                  | optional          |
-| local-time-zone         | The local time zone defines current session time zone id.                                               | optional          |
-| execution.runtime-mode  | The runtime mode of the pipeline includes STREAMING and BATCH, with the default value being STREAMING.  | optional          |
\ No newline at end of file
+
+The following config options of Data Pipeline level are supported.
+Note that whilst the parameters are each individually optional, at least one of them must be specified. That is to say, the `pipeline` section is mandatory and cannot be empty.
+
+
+| parameter                     | meaning [...]
+|-------------------------------|---------------------------------------------------------------- [...]
+| `name`                        | The name of the pipeline, which will be submitted to the Flink cluster as the job name. [...]
+| `parallelism`                 | The global parallelism of the pipeline. Defaults to 1. [...]
+| `local-time-zone`             | The local time zone defines current session time zone id. [...]
+| `execution.runtime-mode`      | The runtime mode of the pipeline includes STREAMING and BATCH, with the default value being STREAMING. [...]
+| `schema.change.behavior`      | How to handle [changes in schema]({{< ref "docs/core-concept/schema-evolution" >}}). One of: [`exception`]({{< ref "docs/core-concept/schema-evolution" >}}#exception-mode), [`evolve`]({{< ref "docs/core-concept/schema-evolution" >}}#evolve-mode), [`try_evolve`]({{< ref "docs/core-concept/schema-evolution" >}}#tryevolve-mode), [`lenient`]({{< ref "docs/core-concept/schema-evolution" >}}#lenient-mode) (default) or [`ignore`]({{< ref "docs/core-concept/sche [...]
+| `schema.operator.uid`         | The unique ID for schema operator. This ID will be used for inter-operator communications and must be unique across operators. [...]
+| `schema-operator.rpc-timeout` | The timeout time for SchemaOperator to wait downstream SchemaChangeEvent applying finished, the default value is 3 minutes. [...]
+
+NOTE: Whilst the above parameters are each individually optional, at least one of them must be specified. The `pipeline` section is mandatory and cannot be empty.
+
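As a usage sketch of the options tabulated in the new docs, a `pipeline` section exercising them might look like the following. The values are illustrative only: `Asia/Shanghai` is just an example ZoneId, and the `3 min` duration syntax is assumed to follow the usual Flink duration-string convention rather than being confirmed by this diff:

```yaml
pipeline:
  name: Sync MySQL Database to Doris   # submitted to the Flink cluster as the job name
  parallelism: 2                       # global parallelism; defaults to 1
  local-time-zone: Asia/Shanghai       # session time zone id (illustrative value)
  execution.runtime-mode: STREAMING    # STREAMING (default) or BATCH
  schema.change.behavior: lenient      # exception | evolve | try_evolve | lenient (default) | ignore
  schema-operator.rpc-timeout: 3 min   # wait for downstream schema-change apply; default 3 minutes
```

Any one of these keys on its own would satisfy the "at least one option" requirement the note describes, since the `pipeline` section must be present but no individual key is mandatory.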