Hi Dian,

Thanks a lot for your input. That's a valid solution. We avoid fat jars in the Java API because they easily lead to class conflicts, but PyFlink is more like the SQL API: user-imported Java dependencies are comparatively rare, so a fat jar is a proper choice.
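For anyone following along, here is a minimal sketch of wiring the fat jar into a PyFlink 1.14 job (the local path below is just a placeholder; point it at wherever the jar was downloaded):

    from pyflink.datastream import StreamExecutionEnvironment
    from pyflink.table import EnvironmentSettings, TableEnvironment

    # Placeholder path to the downloaded connector fat jar.
    JAR = "file:///opt/flink/lib/flink-sql-connector-kafka_2.11-1.14.0.jar"

    # DataStream API: register the fat jar on the job's classpath.
    env = StreamExecutionEnvironment.get_execution_environment()
    env.add_jars(JAR)

    # Table API: the equivalent via the 'pipeline.jars' config option.
    t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())
    t_env.get_config().get_configuration().set_string("pipeline.jars", JAR)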
Best,
Paul Lam

> On 14 Dec 2021, at 19:26, Dian Fu <dian0511...@gmail.com> wrote:
>
> Hi Paul,
>
> For connectors (including Kafka), it's recommended to use the fat jar which contains the dependencies. For example, for Kafka, you could use
> https://repo.maven.apache.org/maven2/org/apache/flink/flink-sql-connector-kafka_2.11/1.14.0/flink-sql-connector-kafka_2.11-1.14.0.jar
>
> Regards,
> Dian
>
> On Tue, Dec 14, 2021 at 5:44 PM Paul Lam <paullin3...@gmail.com> wrote:
> Hi!
>
> I'm trying out PyFlink and looking for the best practice to manage Java dependencies.
>
> The docs recommend using the 'pipeline.jars' configuration or command-line options to specify jars for a PyFlink job. However, PyFlink users may not know which Java dependencies are required. For example, a user may import the Kafka connector without knowing that the Kafka client needs to be added to the classpath. I think the problem here is the lack of cross-language dependency management, so we have to do it manually.
>
> For now I work around the problem with a tool that extracts the required jars of the corresponding Java artifacts of the imported PyFlink modules via the Maven dependency plugin. But I wonder if there is some best practice to address the problem? Thanks a lot!
>
> Best,
> Paul Lam
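P.S. For anyone who hits the same problem: the workaround tool mentioned in my original mail essentially runs the Maven dependency plugin against a pom.xml that declares the Java counterparts of the imported PyFlink modules (e.g. flink-connector-kafka), roughly along the lines of (a sketch; the output directory is arbitrary):

    mvn dependency:copy-dependencies -DoutputDirectory=target/dependency

and then passes the collected jars to the job via 'pipeline.jars'.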