Hi Dian,

Thanks a lot for your input. That's a valid solution. We avoid fat jars in the Java API because they easily lead to class conflicts, but PyFlink is more like the SQL API: user-imported Java dependencies are comparatively rare, so a fat jar is a proper choice.
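For anyone following along, here is a minimal sketch of wiring the fat jar into a PyFlink 1.14 job (the local path below is just a placeholder; point it at wherever the jar was downloaded):

    from pyflink.datastream import StreamExecutionEnvironment
    from pyflink.table import EnvironmentSettings, TableEnvironment

    # Placeholder path to the downloaded connector fat jar.
    JAR = "file:///opt/flink/lib/flink-sql-connector-kafka_2.11-1.14.0.jar"

    # DataStream API: register the fat jar on the job's classpath.
    env = StreamExecutionEnvironment.get_execution_environment()
    env.add_jars(JAR)

    # Table API: the equivalent via the 'pipeline.jars' config option.
    t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())
    t_env.get_config().get_configuration().set_string("pipeline.jars", JAR)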
Best,
Paul Lam

> On 14 Dec 2021, at 19:26, Dian Fu <dian0511...@gmail.com> wrote:
>
> Hi Paul,
>
> For connectors (including Kafka), it's recommended to use the fat jar which contains the dependencies. For example, for Kafka, you could use
> https://repo.maven.apache.org/maven2/org/apache/flink/flink-sql-connector-kafka_2.11/1.14.0/flink-sql-connector-kafka_2.11-1.14.0.jar
>
> Regards,
> Dian
>
> On Tue, Dec 14, 2021 at 5:44 PM Paul Lam <paullin3...@gmail.com> wrote:
> Hi!
>
> I'm trying out PyFlink and looking for the best practice to manage Java dependencies.
>
> The docs recommend using the 'pipeline.jars' configuration or command-line options to specify jars for a PyFlink job. However, PyFlink users may not know which Java dependencies are required. For example, a user may import the Kafka connector without knowing that the Kafka client needs to be added to the classpath. I think the problem here is the lack of cross-language dependency management, so we have to do it manually.
>
> For now I work around the problem with a tool that extracts the required jars of the corresponding Java artifacts of the imported PyFlink modules via the Maven dependency plugin. But I wonder if there is some best practice to address the problem? Thanks a lot!
>
> Best,
> Paul Lam
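P.S. For anyone who hits the same problem: the workaround tool mentioned in my original mail essentially runs the Maven dependency plugin against a pom.xml that declares the Java counterparts of the imported PyFlink modules (e.g. flink-connector-kafka), roughly along the lines of (a sketch; the output directory is arbitrary):

    mvn dependency:copy-dependencies -DoutputDirectory=target/dependency

and then passes the collected jars to the job via 'pipeline.jars'.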