yuxiqian commented on code in PR #3844:
URL: https://github.com/apache/flink-cdc/pull/3844#discussion_r1910194160
##########
docs/content.zh/docs/connectors/pipeline-connectors/mysql.md:
##########
@@ -77,6 +77,32 @@ pipeline:
parallelism: 4
```
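+## Multiple Data Sources Example
+
+Single data source: a Pipeline that reads data from multiple MySQL instances and synchronizes it to Doris can be defined as follows: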
+
+```yaml
+source:
+ type: mysql_mutiple
Review Comment:
I do like @ChaomingZhangCN's proposed syntax for fully general multiple data
sources. It is intuitive and expressive, but it might be a chore if users just
want to connect to a MySQL cluster with multiple servers, as they would have to
copy all the identical configuration into every source definition.
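
To illustrate the duplication concern, here is a minimal sketch under a
hypothetical list-style `sources:` syntax. The `sources:` key is an assumption
made for illustration only: it is not an existing Flink CDC API and not
necessarily the syntax proposed in this PR. Host names and credentials are
placeholders.

```yaml
# Hypothetical syntax, for illustration only (not an existing Flink CDC API).
sources:
  - type: mysql
    hostname: mysql-node-1   # placeholder host
    port: 3306
    username: cdc_user       # placeholder credentials
    password: "******"
    tables: app_db.\.*
  - type: mysql
    hostname: mysql-node-2   # the only value that differs
    port: 3306               # everything below is copied verbatim
    username: cdc_user
    password: "******"
    tables: app_db.\.*
```

With N servers in the cluster, every shared option would be repeated N times,
and a change to any of them would have to be applied to each entry.
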

@linjianchang's solution, for now, seems MySQL-specific and geared towards
multi-host clusters. It could not be extended to heterogeneous sources (like
concatenating data from different DBMSs), or to cases where one wants to use
different configs for each node. These cases don't exist for now since all we
have is the MySQL source connector, but as we're modifying the composer and the
YAML API (instead of the MySQL connector itself), such possibilities should be
discussed more carefully.

As for multiple sources in a pipeline itself, I remember the idea was
informally discussed with @leonardBang and @PatrickRen a long time ago. The
conclusion was that running multiple sources in one single job actually makes
the pipeline more fragile, since any single-point failure would easily escalate
and cause a global failover. Things might have changed since then, but we still
need to hear from senior developers on this.