[
https://issues.apache.org/jira/browse/FLINK-14306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16945324#comment-16945324
]
Hequn Cheng commented on FLINK-14306:
-------------------------------------
[~pnowojski] Yes, virtualenv requires some dependencies. We can address all
these dependencies automatically. For example, we can download all dependencies
and install them, similar to the behavior in {{lint-python.sh}} under
flink-python/dev. Currently, {{lint-python.sh}} is used to run the Python tests
and generate the Python docs.
There are probably two options for solving this problem:
1. Don't build {{flink-python}} by default. For example, only build python if
we use the command {{mvn clean install -Ppython}}. If {{-Ppython}} is provided,
we can use a script to download and install dependencies automatically. We can
use virtualenv to manage these dependencies. One thing to note is that the
build then has to run in an environment with network access.
2. Avoid the dependencies. Currently, the dependencies are introduced by
{{gen_protos.py}}, which generates Python files during the build. To
remove the dependencies, we can generate the Python files manually and check
the generated files into git. Currently, there is only one generated
file ({{flink_fn_execution_pb2.py}}, 493 lines of code in total), and it is
seldom altered. Furthermore, we will probably not need any more generated
Python files.
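As a rough illustration of option 1, the dependency handling could look like
the sketch below. It assumes Python 3 and uses the standard-library {{venv}}
module in place of virtualenv. The {{REQUIRED_MODULES}} mapping and all function
names are hypothetical; only {{pkg_resources}} (shipped with {{setuptools}}) is
confirmed by this ticket.

```python
import importlib.util
import os
import subprocess
import sys

# Hypothetical mapping of importable module -> pip package that provides it.
# Only pkg_resources/setuptools is known from the ticket.
REQUIRED_MODULES = {"pkg_resources": "setuptools"}


def find_missing(required):
    """Return the pip packages whose top-level modules cannot be imported."""
    return [
        package
        for module, package in required.items()
        if importlib.util.find_spec(module) is None
    ]


def venv_commands(venv_dir, packages, python=sys.executable):
    """Build the commands that create a virtual environment and install packages."""
    bin_dir = "Scripts" if os.name == "nt" else "bin"
    pip = os.path.join(venv_dir, bin_dir, "pip")
    return [
        [python, "-m", "venv", venv_dir],
        [pip, "install", *packages],
    ]


def ensure_dependencies(required, venv_dir, run=subprocess.check_call):
    """Install missing packages into venv_dir and return what was installed.

    The install step needs network access, which is the drawback of option 1.
    """
    missing = find_missing(required)
    if missing:
        for command in venv_commands(venv_dir, missing):
            run(command)
    return missing
```

A build script could call {{ensure_dependencies(REQUIRED_MODULES, "<some venv dir>")}}
before invoking {{gen_protos.py}}. The {{run}} parameter exists only so the
commands can be inspected without executing them.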
Personally, I prefer the second option: not only does it avoid these
dependencies, it also keeps the build simpler, i.e., no additional flags are
needed, so users cannot get them wrong. The second option also does not
require network access during the build.
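One risk with option 2 is that the checked-in file silently drifts from the
.proto definition. A possible guard, sketched below under my own assumptions,
is to stamp the generated file with a checksum of the .proto source and verify
it with the standard library only. The {{# proto-sha256:}} header and the
function names are hypothetical and not part of {{gen_protos.py}}.

```python
import hashlib
import re

# Hypothetical header line written at the top of the generated file.
HEADER_RE = re.compile(r"^# proto-sha256: ([0-9a-f]{64})$", re.MULTILINE)


def proto_digest(proto_text):
    """Return the SHA-256 hex digest of the .proto source text."""
    return hashlib.sha256(proto_text.encode("utf-8")).hexdigest()


def stamp(generated_text, proto_text):
    """Prepend the checksum header to freshly generated Python source."""
    return "# proto-sha256: %s\n%s" % (proto_digest(proto_text), generated_text)


def is_up_to_date(generated_text, proto_text):
    """Check that the stamped checksum matches the current .proto source."""
    match = HEADER_RE.search(generated_text)
    return match is not None and match.group(1) == proto_digest(proto_text)
```

Whoever regenerates the file manually would call {{stamp}} once, and a test
could call {{is_up_to_date}} without needing any of the generation
dependencies.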
Does it make sense to you? [~pnowojski] [~trohrmann] [~chesnay]
> flink-python build fails with No module named pkg_resources
> -----------------------------------------------------------
>
> Key: FLINK-14306
> URL: https://issues.apache.org/jira/browse/FLINK-14306
> Project: Flink
> Issue Type: Bug
> Components: API / Python, Build System
> Affects Versions: 1.10.0
> Reporter: Piotr Nowojski
> Priority: Critical
> Fix For: 1.10.0
>
>
> [Benchmark
> builds|http://codespeed.dak8s.net:8080/job/flink-master-benchmarks/4576/console]
> started to fail with
> {noformat}
> [INFO] Adding generated sources (java):
> /home/jenkins/workspace/flink-master-benchmarks/flink/flink-python/target/generated-sources
> [INFO]
> [INFO] --- exec-maven-plugin:1.5.0:exec (Protos Generation) @
> flink-python_2.11 ---
> Traceback (most recent call last):
> File
> "/home/jenkins/workspace/flink-master-benchmarks/flink/flink-python/pyflink/gen_protos.py",
> line 33, in <module>
> import pkg_resources
> ImportError: No module named pkg_resources
> [ERROR] Command execution failed.
> (...)
> [INFO] flink-state-processor-api .......................... SUCCESS [ 0.299
> s]
> [INFO] flink-python ....................................... FAILURE [ 0.434
> s]
> [INFO] flink-scala-shell .................................. SKIPPED
> {noformat}
> because of this ticket: https://issues.apache.org/jira/browse/FLINK-14018
> I think I can solve the benchmark builds failing quite easily by installing
> {{setuptools}} python package, so this ticket is not about this, but about
> deciding how should we treat such kind of external dependencies. I don't see
> this dependency being mentioned anywhere in the documentation ([for example
> here|https://ci.apache.org/projects/flink/flink-docs-stable/flinkDev/building.html]).
> Probably at the very least those external dependencies should be documented,
> but also I fear about such kind of manual steps to do before building the
> Flink can become a problem if grow out of control. Some questions:
> # Do we really need this dependency?
> # Could this dependency be resolve automatically? By installing into a local
> python virtual environment?
> # Should we document those dependencies somewhere?
> # Maybe we should not build flink-python by default?
> # Maybe we should add a pre-build script for flink-python to verify the
> dependencies and to throw an easy to understand error with hint how to fix it?
> CC [~hequn] [~dian.fu] [~trohrmann] [~jincheng]