Nathanael England created FLINK-31073:
-----------------------------------------
Summary: Pyflink testing library can't be used out of the box
Key: FLINK-31073
URL: https://issues.apache.org/jira/browse/FLINK-31073
Project: Flink
Issue Type: Bug
Environment: Ubuntu 20.04
Python 3.8.10
Pyflink 1.16.1
Reporter: Nathanael England
The pyflink distribution ships a `pyflink.testing.test_case_utils` module that
makes it appear that unit testing pyflink code is supported out of the box. In
practice it takes non-trivial effort to figure out which packages are needed to
run even a simple no-op test case that makes it through the class setup in that
module.
The user has to add the following jar files to their system in order to get
through the setup steps:
{code:bash}
flink-runtime-1.16.1-tests.jar
flink-test-utils-1.16.1.jar
hamcrest-core-1.3.jar
junit-4.13.2.jar
{code}
The first is needed because the lookup of `MiniClusterResourceConfiguration`
fails without it. The second is needed because it provides
`MiniClusterWithClientResource`. The last two jars are needed because JUnit is
a dependency of `MiniClusterWithClientResource`; without them the user is met
with a `ClassNotFoundError` for `org.junit.rules.ExternalResource` when trying
to set up the mini cluster resource.
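To make the requirement above easier to check, here is a minimal sketch that
reports which of the four jars are missing from a directory. The helper name
`missing_test_jars` and the idea of keeping the jars in a single directory are
assumptions for illustration; they are not part of pyflink.

```python
from pathlib import Path

# Jar names copied from the list in this report (PyFlink 1.16.1).
REQUIRED_JARS = (
    "flink-runtime-1.16.1-tests.jar",
    "flink-test-utils-1.16.1.jar",
    "hamcrest-core-1.3.jar",
    "junit-4.13.2.jar",
)


def missing_test_jars(jar_dir, required=REQUIRED_JARS):
    """Return the required jar names that are not present in jar_dir.

    Hypothetical helper: it only compares file names and does not inspect
    the contents of the jars.
    """
    present = {path.name for path in Path(jar_dir).glob("*.jar")}
    return [name for name in required if name not in present]
```

An empty list means all four file names were found in the directory.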
Further, these jars have to be placed where `construct_test_classpath` in
`pyflink_gateway_server.py` is set up to look. That function searches for a set
of patterns under the source root of the installation. For pyflink, this is
typically inside a virtual environment folder that a user should not be
modifying. The only alternative to putting the files inside the virtual
environment directories is to override that function with a custom function
that looks for jar files somewhere else.
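A minimal sketch of such an override follows. It assumes that
`construct_test_classpath` takes no arguments and returns a single classpath
string joined with `os.pathsep`; that should be verified against the installed
pyflink version. The names `classpath_from_dir` and
`install_classpath_override` are hypothetical.

```python
import glob
import os


def classpath_from_dir(jar_dir):
    """Join every jar found directly in jar_dir into one classpath string."""
    jars = sorted(glob.glob(os.path.join(jar_dir, "*.jar")))
    return os.pathsep.join(jars)


def install_classpath_override(jar_dir):
    """Point pyflink's test classpath lookup at jar_dir.

    Assumption: replacing the module attribute before the gateway is
    launched is enough for pyflink to pick up the custom function.
    The import is deferred so classpath_from_dir works without pyflink.
    """
    import pyflink.pyflink_gateway_server as gateway_server

    gateway_server.construct_test_classpath = (
        lambda: classpath_from_dir(jar_dir)
    )
```

The override would have to run before any test class from
`pyflink.testing.test_case_utils` starts the mini cluster, for example at the
top of a test module or in a shared test setup file.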
The available documentation has no Python unit testing examples.
Most of the motivation for this fix came from
https://github.com/dianfu/pyflink-faq/tree/main/testing .
--
This message was sent by Atlassian Jira
(v8.20.10#820010)