Jellal-HT commented on issue #5099: URL: https://github.com/apache/inlong/issues/5099#issuecomment-1198078300
# Motivation

The Sort module should support Apache Hudi, a popular streaming data lake platform.

# Design

The design follows the [Sort Plugin](https://inlong.apache.org/docs/design_and_concept/how_to_extend_data_node_for_sort) and [Manager Plugin](https://inlong.apache.org/zh-CN/docs/design_and_concept/how_to_extend_data_node_for_manager) documents:

1. Extend a new Extract Node for Apache Hudi
2. Extend a new Load Node for Apache Hudi
3. Implement the corresponding Flink connectors for Apache Hudi
4. Extend the Extract Node and Load Node in the Manager module for Apache Hudi

# Modification

## Load Node

1. Add the new class `HudiLoadNode`, which inherits the `LoadNode` class
2. Add the Load for Hudi to `JsonSubTypes` in `LoadNode` and `Node`

## Extract Node

1. Add the new class `HudiExtractNode`, which inherits the `ExtractNode` class
2. Add the Extract for Hudi to `JsonSubTypes` in `ExtractNode` and `Node`

## Flink Connector

Adding new classes:

- HudiTableSink
- HudiTableSource
- HudiTableFactory
- ConfigOptions
- HudiCatalog
- HudiCatalogFactory

(As Apache Hudi has already integrated Flink, this part will refer to the [implementation of flink connector](https://github.com/apache/hudi/tree/master/hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table) in Apache Hudi)

## Manager plugin

Follow the document [Manager Plugin](https://inlong.apache.org/zh-CN/docs/design_and_concept/how_to_extend_data_node_for_manager) to extend the extract node and load node.
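
To make the Load Node and Extract Node steps more concrete, here is a minimal, dependency-free sketch of the subtype registration idea. It is not the InLong code: `Node`, `LoadNode` and `ExtractNode` are simplified stand-ins, the type names `hudiLoad` and `hudiExtract` are hypothetical, and the hand-rolled `SUBTYPES` map only plays the role that the `JsonSubTypes` annotation would play in the real classes.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

public class HudiNodeSketch {

    // Simplified stand-ins for the Sort module's node hierarchy (assumption:
    // the real classes carry fields, properties and format information).
    interface Node {
        String typeName();
    }

    abstract static class LoadNode implements Node {
    }

    abstract static class ExtractNode implements Node {
    }

    // New class from the "Load Node" step; inherits LoadNode.
    static final class HudiLoadNode extends LoadNode {
        @Override
        public String typeName() {
            return "hudiLoad"; // hypothetical type name
        }
    }

    // New class from the "Extract Node" step; inherits ExtractNode.
    static final class HudiExtractNode extends ExtractNode {
        @Override
        public String typeName() {
            return "hudiExtract"; // hypothetical type name
        }
    }

    // Stand-in for the JsonSubTypes registration: maps a type name to the
    // concrete node class so a serialized type tag can be resolved.
    static final Map<String, Supplier<Node>> SUBTYPES = new LinkedHashMap<>();

    static {
        SUBTYPES.put("hudiLoad", HudiLoadNode::new);
        SUBTYPES.put("hudiExtract", HudiExtractNode::new);
    }

    // Resolves a type name to a new node instance; unknown names are rejected.
    static Node create(String type) {
        Supplier<Node> factory = SUBTYPES.get(type);
        if (factory == null) {
            throw new IllegalArgumentException("Unknown node type: " + type);
        }
        return factory.get();
    }

    public static void main(String[] args) {
        Node load = create("hudiLoad");
        Node extract = create("hudiExtract");
        System.out.println(load.getClass().getSimpleName());
        System.out.println(extract.getClass().getSimpleName());
    }
}
```

The point of the sketch is that a new node type needs two changes: the class itself, and an entry in every registry that resolves type names, which is why the proposal lists both `LoadNode`/`ExtractNode` and `Node`.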
