Hi @leo65535, thanks for the reply. I'm very glad to hear about the topic of decoupling connectors from the Flink version. That's a good direction.
Regarding SQL connectors management, I don't have a complete design in my head yet, but here are a few points worth discussing first:

1. Plugin style. For SQL, we don't need to write glue code to "register" connectors. All we need to do is put the connector jars on the classpath; the processes (JM/TM) can then find the connector classes, which implement the specified interfaces, through a mechanism like SPI (see the first sketch at the end of this mail). So maybe we can't manage the connectors as plugins; when we submit a job, we simply add the corresponding connector jars to it as resources.

2. Decoupling connectors from the Flink version. Decoupling is necessary. To my knowledge, Flink SQL uses a factory discovery mechanism to manage connectors, so as long as this interface is not refactored in the Flink core, we can expect it to stay backward compatible (see the second sketch below). Moreover, if it is version-related, we can manage the jars in a layout like `/plugins/<flink-version>/connectors/kafka`.

3. Dynamic loading of connectors. To reduce class conflicts between connectors, I propose loading connectors only when necessary, i.e. dynamically. As for how to find the connectors used in the SQL, I think we can either specify them in the config file or parse the SQL to extract the connector info (see the third sketch below).

Maybe we can discuss this topic later in a more detailed tech design doc.

Thanks,
Kelu.

On Fri, Apr 29, 2022 at 4:10 PM leo65535 <[email protected]> wrote:

> Hi @taokelu,
>
> This proposal is nice!!
> We also discussed another topic, "Decoupling connectors from compute
> engines" [1]. I have a question: how do we manage Flink SQL connectors?
>
> [1] https://lists.apache.org/thread/j99crn7nkfpwovng6ycbxhw65sxg9xn2
>
> Thanks,
> leo65535
>
> At 2022-04-29 11:06:42, "陶克路" <[email protected]> wrote:
>
> The background: https://github.com/apache/incubator-seatunnel/issues/1753
>
> Let me give a brief introduction to the background. I found that the
> Flink SQL support in SeaTunnel is very simple, so I want to make some
> improvements on this story.
>
> Also, SeaTunnel currently uses many deprecated DataStream APIs, which
> are encouraged to be replaced with SQL, such as
> `StreamTableEnvironment.connect`. Maybe SQL would be an alternative.
>
> Here are the improvement details:
> 1. Refactor start-seatunnel-sql.sh. start-seatunnel-flink.sh and
> start-seatunnel-spark.sh have already been refactored, with the main
> logic rewritten in Java. I think we can first keep them consistent.
> 2. Enrich the SQL config file. Currently the Flink SQL job config is
> very simple; it is all about the SQL script. I think we can add more
> sub-configs to it.
> 3. SQL connectors management. The Flink community supports a rich set
> of SQL connectors. Only with connectors can we run our jobs
> successfully end-to-end.
> 4. SQL-related logic, such as validation before the job runs, so
> errors are thrown as early as possible.
> 5. Catalog support. With a catalog, we can reuse the tables/UDFs
> defined in it.
> 6. Kubernetes native mode support. Actually, this is a universal
> feature, not just about SQL. In Flink, to run a job in Kubernetes
> native mode, we must bundle the main jar and dependency files into the
> Flink image, which is not user-friendly. The community supports a
> workaround for this, namely podTemplate.
> 7. ...
>
> This is a long-term plan. We can implement it step by step.
>
> What do you think about this PROPOSAL? Feel free to give any comment
> or suggestion.
>
> Thanks.
> Kelu.
> --
> Hello, Find me here: www.legendtkl.com.

--
Hello, Find me here: www.legendtkl.com.
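P.S. A few rough sketches for the three points above, just to make the discussion concrete; these are illustrations under stated assumptions, not a design.

For point 1, a minimal sketch of the SPI-style discovery: Flink SQL connectors register their factories under `META-INF/services/org.apache.flink.table.factories.Factory`, so once the jars are on the classpath, a plain `ServiceLoader` scan finds them without any glue code (`ListConnectors` is just a throwaway class name for the example):

```java
import java.util.ServiceLoader;

import org.apache.flink.table.factories.Factory;

// Scan the classpath for all connector factories registered via SPI.
// Each Flink SQL connector jar ships a META-INF/services entry for Factory.
public class ListConnectors {
    public static void main(String[] args) {
        for (Factory factory : ServiceLoader.load(Factory.class)) {
            // factoryIdentifier() is the name used in WITH ('connector' = '...').
            System.out.println(factory.factoryIdentifier());
        }
    }
}
```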
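For point 2, once the jars live under a version-matched directory like `/plugins/<flink-version>/connectors/kafka`, resolving a connector could go through Flink's factory discovery. A sketch, assuming `connectorClassLoader` is a classloader built over that directory's jars (see the next sketch for one way to build it):

```java
import org.apache.flink.table.factories.DynamicTableSourceFactory;
import org.apache.flink.table.factories.FactoryUtil;

public class ResolveConnector {
    // Resolve the factory for the "kafka" connector against the
    // version-matched jars only; connectorClassLoader is an assumption here.
    static DynamicTableSourceFactory resolveKafkaSource(ClassLoader connectorClassLoader) {
        return FactoryUtil.discoverFactory(
                connectorClassLoader, DynamicTableSourceFactory.class, "kafka");
    }
}
```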
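For point 3, a rough sketch of the "parse the SQL" option: pull the connector identifiers out of the `WITH` clauses, then build a `URLClassLoader` over only the jars those connectors need. The regex and the directory layout are assumptions for illustration; a real implementation should use the SQL parser rather than a regex:

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ConnectorLoading {

    // Naive extraction of the WITH ('connector' = '...') identifiers from the SQL.
    static Set<String> findConnectors(String sql) {
        Set<String> names = new HashSet<>();
        Matcher m = Pattern.compile("'connector'\\s*=\\s*'([^']+)'").matcher(sql);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    // Build a classloader over only the jars of the connectors the job uses,
    // assuming a layout like /plugins/<flink-version>/connectors/<name>/.
    static ClassLoader loadConnectors(Set<String> names, Path connectorRoot) throws Exception {
        List<URL> jars = new ArrayList<>();
        for (String name : names) {
            try (Stream<Path> files = Files.list(connectorRoot.resolve(name))) {
                for (Path jar : files.filter(p -> p.toString().endsWith(".jar"))
                        .collect(Collectors.toList())) {
                    jars.add(jar.toUri().toURL());
                }
            }
        }
        return new URLClassLoader(
                jars.toArray(new URL[0]), Thread.currentThread().getContextClassLoader());
    }
}
```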
