[GitHub] [incubator-inlong-website] dockerzhang commented on a diff in pull request #400: [INLONG-397][Doc] Add guide for extend data source in manager

GitBox Wed, 08 Jun 2022 04:58:22 -0700


dockerzhang commented on code in PR #400:
URL: 
https://github.com/apache/incubator-inlong-website/pull/400#discussion_r892258457



##########
docs/design_and_concept/how_to_extend_data_source.md:
##########
@@ -0,0 +1,141 @@
+---
+title: Extended Data Source

Review Comment:
   ->
   Data Node Plugin



##########
docs/design_and_concept/how_to_extend_data_source.md:
##########
@@ -0,0 +1,141 @@
+---
+title: Extended Data Source
+sidebar_position: 6
+---
+
+## Overview
+
+InLong aims to create dataflows between different data sources. It already supports several common data sources, such as **MySQL**, **Apache Kafka**, and **ClickHouse**, for input and output.
+Refer to [data_node](https://inlong.apache.org/docs/next/data_node/extract_node/auto_push) for details.
+We plan to support more data sources in the future, and this article is a development manual for extending data sources.

Review Comment:
   data sources
   ->
   data nodes



##########
docs/design_and_concept/how_to_extend_data_source.md:
##########
@@ -0,0 +1,141 @@
+---
+title: Extended Data Source
+sidebar_position: 6
+---
+
+## Overview
+
+InLong aims to create dataflows between different data sources. It already supports several common data sources, such as **MySQL**, **Apache Kafka**, and **ClickHouse**, for input and output.
+Refer to [data_node](https://inlong.apache.org/docs/next/data_node/extract_node/auto_push) for details.
+We plan to support more data sources in the future, and this article is a development manual for extending data sources.
+
+## Extend Data Extract Node

Review Comment:
   Extend Extract Node
   



##########
i18n/zh-CN/docusaurus-plugin-content-docs/current/design_and_concept/how_to_extend_data_source.md:
##########
@@ -0,0 +1,141 @@
+---
+title: Data Source Extension

Review Comment:
   Data Node Plugin



##########
docs/design_and_concept/how_to_extend_data_source.md:
##########
@@ -0,0 +1,141 @@
+---
+title: Extended Data Source
+sidebar_position: 6
+---
+
+## Overview
+
+InLong aims to create dataflows between different data sources. It already supports several common data sources, such as **MySQL**, **Apache Kafka**, and **ClickHouse**, for input and output.
+Refer to [data_node](https://inlong.apache.org/docs/next/data_node/extract_node/auto_push) for details.
+We plan to support more data sources in the future, and this article is a development manual for extending data sources.
+
+## Extend Data Extract Node
+
+This section shows how to extend an input data source, referred to as an **extract node** in InLong, taking **MySQL_BINLOG** as an example.
+
+- Develop the extract node plugin in Sort; refer to [how_to_write_plugin_sort](https://inlong.apache.org/docs/next/design_and_concept/how_to_write_plugin_sort)
+- Add **TaskType** in `org.apache.inlong.common.enums.TaskTypeEnum`
+```java
+public enum TaskTypeEnum {
+
+    DATABASE_MIGRATION(0),
+    SQL(1),
+    BINLOG(2),
+    FILE(3),
+    KAFKA(4),
+    PULSAR(5),
+    POSTGRES(6),
+    ORACLE(7),
+    SQLSERVER(8),
+    MONGODB(9),
+    ...
+```
+- Add **SourceType** in `org.apache.inlong.manager.common.enums.SourceType`
+```java
+public enum SourceType {
+
+    AUTO_PUSH("AUTO_PUSH", null),
+    FILE("FILE", TaskTypeEnum.FILE),
+    SQL("SQL", TaskTypeEnum.SQL),
+    BINLOG("BINLOG", TaskTypeEnum.BINLOG),
+    KAFKA("KAFKA", TaskTypeEnum.KAFKA),
+    PULSAR("PULSAR", TaskTypeEnum.PULSAR),
+    POSTGRES("POSTGRES", TaskTypeEnum.POSTGRES),
+    ORACLE("ORACLE", TaskTypeEnum.ORACLE),
+    SQLSERVER("SQLSERVER", TaskTypeEnum.SQLSERVER),
+    MONGODB("MONGO", TaskTypeEnum.MONGODB),
+    ...
+```
+- Create a new package under `org.apache.inlong.manager.common.pojo.source` and develop every entity class needed.
+  ![](img/Binlog_Entity_Class.png)
+- Create an Operation class for the new data source under the package `org.apache.inlong.manager.service.source`.
+  ![](img/Binlog_Operation.png)
+- Convert the data source to an **ExtractNode** supported by **Sort**
+```java
+public class ExtractNodeUtils {
+    
+    public static ExtractNode createExtractNode(StreamSource sourceInfo) {
+        SourceType sourceType = SourceType.forType(sourceInfo.getSourceType());
+        switch (sourceType) {
+            case BINLOG:
+                return createExtractNode((MySQLBinlogSource) sourceInfo);
+            case KAFKA:
+                return createExtractNode((KafkaSource) sourceInfo);
+            case PULSAR:
+                return createExtractNode((PulsarSource) sourceInfo);
+            case POSTGRES:
+                return createExtractNode((PostgresSource) sourceInfo);
+            case ORACLE:
+                return createExtractNode((OracleSource) sourceInfo);
+            case SQLSERVER:
+                return createExtractNode((SqlServerSource) sourceInfo);
+            case MONGODB:
+                return createExtractNode((MongoDBSource) sourceInfo);
+            default:
+                throw new IllegalArgumentException(
+                        String.format("Unsupported sourceType=%s to create extractNode", sourceType));
+        }
+    }
+    ...
+```
+## Extend Data Load Node

Review Comment:
   need one more line.
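
For readers following the enum steps quoted in the hunk above, here is a minimal sketch of how a `SourceType` constant can map a string label to a `TaskTypeEnum`. The two enums are simplified, hypothetical stand-ins rather than the real InLong classes. The accessor names and the lookup-by-label behaviour of `forType` are assumptions; the diff only shows that `forType` is called with `sourceInfo.getSourceType()`.

```java
// Hypothetical, simplified stand-ins for the two enums quoted in the diff.
// They are NOT the real InLong classes; only a few constants are shown.
enum TaskTypeEnum {
    BINLOG(2),
    KAFKA(4),
    MONGODB(9);

    private final int type;

    TaskTypeEnum(int type) {
        this.type = type;
    }

    int getType() {
        return type;
    }
}

enum SourceType {
    AUTO_PUSH("AUTO_PUSH", null), // no task type is mapped for this constant
    BINLOG("BINLOG", TaskTypeEnum.BINLOG),
    KAFKA("KAFKA", TaskTypeEnum.KAFKA),
    MONGODB("MONGO", TaskTypeEnum.MONGODB); // label differs from the constant name

    private final String type;
    private final TaskTypeEnum taskType;

    SourceType(String type, TaskTypeEnum taskType) {
        this.type = type;
        this.taskType = taskType;
    }

    String getType() {
        return type;
    }

    TaskTypeEnum getTaskType() {
        return taskType;
    }

    // Assumed behaviour: resolve by the string label, not by the constant name.
    static SourceType forType(String type) {
        for (SourceType sourceType : values()) {
            if (sourceType.type.equals(type)) {
                return sourceType;
            }
        }
        throw new IllegalArgumentException(
                String.format("Unsupported sourceType=%s", type));
    }
}
```

Under these assumptions, adding a new source means adding one constant to each enum, and a label such as `"MONGO"` resolves to `SourceType.MONGODB` even though the names differ.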



##########
i18n/zh-CN/docusaurus-plugin-content-docs/current/design_and_concept/how_to_extend_data_source.md:
##########
@@ -0,0 +1,141 @@
+---
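+title: Data Source Extension
+sidebar_position: 6
+---
+
+## Overview
+
+InLong was designed to create dataflows between different data sources. So far, InLong supports reading from and writing to many common data sources, such as **MySQL**, **Apache Kafka**, and **ClickHouse**.
+For details, see [Data Nodes](https://inlong.apache.org/zh-CN/docs/next/data_node/extract_node/auto_push).
+We expect to support more common data sources in the future, so this article briefly describes how to extend data sources within the existing framework.
+
+## Extend Extract Node
+
+Taking **MySQL_BINLOG** as an example, the following describes how to extend an extract node within the InLong framework.
+
+- First, support the data source in the Sort component; for details, see [Sort plugin](https://inlong.apache.org/zh-CN/docs/next/design_and_concept/how_to_write_plugin_sort)
+- Add the corresponding constant to the enum class `org.apache.inlong.common.enums.TaskTypeEnum`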
+```java
+public enum TaskTypeEnum {
+
+    DATABASE_MIGRATION(0),
+    SQL(1),
+    BINLOG(2),
+    FILE(3),
+    KAFKA(4),
+    PULSAR(5),
+    POSTGRES(6),
+    ORACLE(7),
+    SQLSERVER(8),
+    MONGODB(9),
+    ...
+```
+- Likewise, add the corresponding constant to the enum class `org.apache.inlong.manager.common.enums.SourceType`
+```java
+public enum SourceType {
+
+    AUTO_PUSH("AUTO_PUSH", null),
+    FILE("FILE", TaskTypeEnum.FILE),
+    SQL("SQL", TaskTypeEnum.SQL),
+    BINLOG("BINLOG", TaskTypeEnum.BINLOG),
+    KAFKA("KAFKA", TaskTypeEnum.KAFKA),
+    PULSAR("PULSAR", TaskTypeEnum.PULSAR),
+    POSTGRES("POSTGRES", TaskTypeEnum.POSTGRES),
+    ORACLE("ORACLE", TaskTypeEnum.ORACLE),
+    SQLSERVER("SQLSERVER", TaskTypeEnum.SQLSERVER),
+    MONGODB("MONGO", TaskTypeEnum.MONGODB),
+    ...
+```
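+- Create a package under `org.apache.inlong.manager.common.pojo.source` and create the corresponding entity classes.
+  ![](img/Binlog_Entity_Class.png)
+- Under `org.apache.inlong.manager.service.source`, create the corresponding utility classes.
+  ![](img/Binlog_Operation.png)
+- Support the conversion function from the data source to **ExtractNode**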
+```java
+public class ExtractNodeUtils {
+    
+    public static ExtractNode createExtractNode(StreamSource sourceInfo) {
+        SourceType sourceType = SourceType.forType(sourceInfo.getSourceType());
+        switch (sourceType) {
+            case BINLOG:
+                return createExtractNode((MySQLBinlogSource) sourceInfo);
+            case KAFKA:
+                return createExtractNode((KafkaSource) sourceInfo);
+            case PULSAR:
+                return createExtractNode((PulsarSource) sourceInfo);
+            case POSTGRES:
+                return createExtractNode((PostgresSource) sourceInfo);
+            case ORACLE:
+                return createExtractNode((OracleSource) sourceInfo);
+            case SQLSERVER:
+                return createExtractNode((SqlServerSource) sourceInfo);
+            case MONGODB:
+                return createExtractNode((MongoDBSource) sourceInfo);
+            default:
+                throw new IllegalArgumentException(
+                        String.format("Unsupported sourceType=%s to create extractNode", sourceType));
+        }
+    }
+    ...
+```
+## Extend Load Node

Review Comment:
   need one more line.
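
To illustrate the conversion step quoted in the hunk above, here is a minimal sketch of the dispatch pattern: a general `createExtractNode(StreamSource)` switches on the source type and casts to the concrete source class, so that a type-specific overload is selected. All classes below are hypothetical, much-reduced stand-ins for the InLong types named in the diff. The `name` field and the string-based switch are assumptions made to keep the sketch self-contained.

```java
// Hypothetical stand-ins for the types named in the diff; NOT the real InLong classes.
abstract class StreamSource {
    abstract String getSourceType();
}

class MySQLBinlogSource extends StreamSource {
    @Override
    String getSourceType() {
        return "BINLOG";
    }
}

class KafkaSource extends StreamSource {
    @Override
    String getSourceType() {
        return "KAFKA";
    }
}

class ExtractNode {
    final String name; // assumed field, used only to tell the nodes apart

    ExtractNode(String name) {
        this.name = name;
    }
}

class ExtractNodeSketch {

    // General entry point: picks the type-specific overload via an explicit cast.
    static ExtractNode createExtractNode(StreamSource sourceInfo) {
        String sourceType = sourceInfo.getSourceType();
        switch (sourceType) {
            case "BINLOG":
                return createExtractNode((MySQLBinlogSource) sourceInfo);
            case "KAFKA":
                return createExtractNode((KafkaSource) sourceInfo);
            default:
                throw new IllegalArgumentException(
                        String.format("Unsupported sourceType=%s to create extractNode", sourceType));
        }
    }

    // Type-specific overloads; a new source type adds one case above and one overload here.
    static ExtractNode createExtractNode(MySQLBinlogSource binlogSource) {
        return new ExtractNode("mysql-binlog");
    }

    static ExtractNode createExtractNode(KafkaSource kafkaSource) {
        return new ExtractNode("kafka");
    }
}
```

Because Java resolves overloads at compile time, the cast in each `case` is what routes the call to the matching overload. An unknown type reaches the `default` branch and fails fast instead of silently producing no node.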



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
