ruanwenjun commented on PR #1722: URL: https://github.com/apache/incubator-seatunnel/pull/1722#issuecomment-1108021844
> There is a problem now. @ruanwenjun @CalvinKirs Since Flink only provides client mode for now, we can let the driver identify which jars need to be uploaded to the cluster. But Spark also provides cluster mode, where our driver runs on the cluster, so we cannot configure the jars to be submitted through driver code. Spark provides the `--jars` parameter to configure the jars submitted to the cluster, so we need to identify which jars must be provided to the cluster according to the config, and that identification has to happen in `SparkStarter`. But currently, jar identification is done by loading each jar through a classloader and calling its `getPluginName` method to match the plugin. The `seatunnel-core-spark.jar` that contains `SparkStarter` does not include the Spark dependencies, so the plugin jars cannot be loaded successfully.
>
> Is there a good solution for this?
>
> 1. Add the Spark dependencies to `seatunnel-core-spark.jar`
> 2. Split the `SparkStarter` logic into a separate module and add the Spark dependencies there
> 3. Change the jar identification method

I suggest that we maintain a plugin mapping file in the `conf` directory. The content would look like this:

```properties
Clickhouse=seatunnel-connector-spark-clickhouse.jar
ClickhouseFile=seatunnel-connector-spark-clickhouse.jar
Console=seatunnel-connector-spark-console.jar
```

Then we can easily find out which plugin jar should be used; whenever we add a new plugin, we add a corresponding mapping entry to this file.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
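To illustrate the idea, here is a minimal sketch of how `SparkStarter` could resolve plugin names to connector jars from such a mapping file, without needing a classloader or the Spark dependencies. The class name `PluginJarResolver` and the inlined mapping are hypothetical; in practice the `Properties` would be loaded from the file in the `conf` directory:

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class PluginJarResolver {

    // Hypothetical inline copy of the proposed mapping file; a real
    // implementation would read it from the conf directory instead.
    private static final String MAPPING =
            "Clickhouse=seatunnel-connector-spark-clickhouse.jar\n"
          + "ClickhouseFile=seatunnel-connector-spark-clickhouse.jar\n"
          + "Console=seatunnel-connector-spark-console.jar\n";

    private final Properties mapping = new Properties();

    public PluginJarResolver() throws IOException {
        // Properties format matches the proposed file, so loading is trivial.
        mapping.load(new StringReader(MAPPING));
    }

    /** Returns the connector jar for the given plugin name from the config. */
    public String jarFor(String pluginName) {
        String jar = mapping.getProperty(pluginName);
        if (jar == null) {
            throw new IllegalArgumentException("No jar mapping for plugin: " + pluginName);
        }
        return jar;
    }

    public static void main(String[] args) throws IOException {
        PluginJarResolver resolver = new PluginJarResolver();
        // prints "seatunnel-connector-spark-console.jar"
        System.out.println(resolver.jarFor("Console"));
    }
}
```

The resolved jar names could then be joined with commas and passed straight to `spark-submit --jars`, keeping the lookup a plain string match with no class loading involved.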
