Hi! I’m trying out PyFlink and looking for the best practice to manage Java dependencies.
The docs recommend using the `pipeline.jars` configuration or command line options to specify jars for a PyFlink job. However, PyFlink users may not know which Java dependencies are required. For example, a user may import the Kafka connector without knowing that the Kafka client needs to be added to the classpath. I think the problem here is the lack of cross-language dependency management, so we have to do it manually. For now I work around the problem by providing a tool that extracts the required jars of the corresponding Java artifacts of the imported PyFlink modules via the Maven Dependency Plugin. But I wonder if there is some best practice to address the problem?

Thanks a lot!

Best,
Paul Lam
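
P.S. To illustrate what I mean by manual management, this is roughly the kind of configuration the docs point to (the jar paths and connector names below are just placeholders):

from pyflink.table import EnvironmentSettings, TableEnvironment

# Create a Table API environment (streaming mode here, but batch works the same way).
env_settings = EnvironmentSettings.in_streaming_mode()
table_env = TableEnvironment.create(env_settings)

# Manually point the job to the connector jar AND its transitive dependencies.
# The user has to know that e.g. the Kafka connector also needs the
# kafka-clients jar on the classpath; nothing resolves that automatically.
table_env.get_config().get_configuration().set_string(
    "pipeline.jars",
    "file:///path/to/flink-connector-kafka.jar;file:///path/to/kafka-clients.jar",
)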