Hi dev,



Apache InLong contains four modules: Ingest, Converge, Cache, and Consume.
After looking into the project, we can see that several modules share a
similar structure, which we can call "source - channel - sink" (SCS for short).

- Ingest module: it has a customized "source - channel - sink" structure.

- DataProxy module: it depends on the Flume project, which is a typical SCS
structure.

- Consume module: Flink can also be regarded as an SCS structure.
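
To make the SCS idea concrete, here is a minimal sketch in plain Java. It is
not InLong, Flume, or Flink code: the names ScsSketch, Source, Sink, and END
are hypothetical and exist only for illustration. A source pushes records
into a bounded queue (the channel), and a sink drains them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class ScsSketch {
    /** Hypothetical source: pushes records into the channel. */
    interface Source {
        void produce(BlockingQueue<String> channel) throws InterruptedException;
    }

    /** Hypothetical sink: drains records from the channel. */
    interface Sink {
        void consume(BlockingQueue<String> channel) throws InterruptedException;
    }

    /** Sentinel record that tells the sink the source is finished. */
    static final String END = "\u0000END";

    static List<String> run(List<String> input) throws InterruptedException {
        // The channel is a bounded buffer that decouples source from sink.
        BlockingQueue<String> channel = new ArrayBlockingQueue<>(16);
        List<String> delivered = new ArrayList<>();

        Source source = ch -> {
            for (String record : input) {
                ch.put(record); // blocks while the channel is full
            }
            ch.put(END);
        };
        Sink sink = ch -> {
            // take() blocks while the channel is empty
            for (String r = ch.take(); !r.equals(END); r = ch.take()) {
                delivered.add(r);
            }
        };

        // Run the source on its own thread; the sink runs on the caller's thread.
        Thread producer = new Thread(() -> {
            try {
                source.produce(channel);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        sink.consume(channel);
        producer.join();
        return delivered;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(List.of("cdc-event", "file-line", "socket-msg")));
    }
}
```

In this sketch, adding a new kind of input only means writing another Source
implementation. That is the per-connector work the Ingest module takes on
today.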




Several disadvantages:

In the Ingest module, we may need to implement many sources ourselves, 
including both streaming and batch ingestion, such as CDC, file, socket...

In the DataProxy module, as we know, development of the Flume project is 
slowing down.




We can see that it is hard for us to maintain them: we need to learn the Flume 
API and implement many connectors ourselves. Apache Flink already provides a 
rich set of connectors, supports SQL as well as batch and streaming source 
connectors, and has many other excellent features that we can benefit from. 
So, here we can merge some functions of the Ingest and Converge modules and 
introduce Flink into the DataProxy module.




Here is the new architecture diagram, compared with the original.

- Original

  
https://user-images.githubusercontent.com/91316485/136642894-d7a7fd5c-45a7-46b3-ba72-00066839a176.png
 

- New

  
https://user-images.githubusercontent.com/91316485/136642903-89730b52-6b46-4a57-8791-890c72d2327c.png




Looking forward to your ideas, thanks.




Best,

Leo65535
