alnzng opened a new pull request, #526: URL: https://github.com/apache/flink-agents/pull/526
<!-- * Thank you very much for contributing to Flink Agents. * Please add the relevant components in the PR title. E.g., [api], [runtime], [java], [python], [hotfix], etc. --> <!-- Please link the PR to the relevant issue(s). Hotfix doesn't need this. --> Linked issue: #xxx ### Purpose of change Setuptools `v82.0.0` (released yesterday) removed pkg_resources as a top-level module. From the https://setuptools.pypa.io/en/stable/history.html: ``` pkg_resources has been removed from Setuptools. Most common uses of pkg_resources have been superseded by the [importlib.resources](https://docs.python.org/3/library/importlib.resources.html) and [importlib.metadata](https://docs.python.org/3/library/importlib.metadata.html) projects. Projects and environments relying on pkg_resources for namespace packages or other behavior should depend on older versions of setuptools. ([#3085](https://github.com/pypa/setuptools/issues/3085)) ``` `apache-beam`, a transitive dependency of `apache-flink` (via PyFlink), imports `pkg_resources` at module load time. Since Flink Agents declares `setuptools>=75.3` without an upper bound, pip resolves to the latest version `v82.0.0`, which no longer includes pkg_resources. This causes the Flink Python worker (Beam harness) to crash on startup with the following error: ``` 2026-02-09 14:26:47 java.lang.RuntimeException: Failed to create stage bundle factory! INFO:root:Initializing Python harness: /Users/alazhang/Workspaces/Python/linkedin_extend/flink-agent/.venv/lib/python3.11/site-packages/pyflink/fn_execution/beam/beam_boot.py --id=2-1 --provision_endpoint=localhost:62283 INFO:root:Starting up Python harness in a standalone process. Traceback (most recent call last): File "<frozen runpy>", line 198, in _run_module_as_main File "<frozen runpy>", line 88, in _run_code File "/Users/alazhang/Workspaces/Python/linkedin_extend/flink-agent/.venv/lib/python3.11/site-packages/pyflink/fn_execution/beam/beam_boot.py", line 121, in <module> from pyflink.fn_execution.beam import beam_sdk_worker_main File "/Users/alazhang/Workspaces/Python/linkedin_extend/flink-agent/.venv/lib/python3.11/site-packages/pyflink/fn_execution/beam/beam_sdk_worker_main.py", line 24, in <module> import pyflink.fn_execution.beam.beam_operations # noqa # pylint: disable=unused-import ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/alazhang/Workspaces/Python/linkedin_extend/flink-agent/.venv/lib/python3.11/site-packages/pyflink/fn_execution/beam/beam_operations.py", line 21, in <module> from apache_beam.runners.worker import bundle_processor, operation_specs File "/Users/alazhang/Workspaces/Python/linkedin_extend/flink-agent/.venv/lib/python3.11/site-packages/apache_beam/runners/worker/bundle_processor.py", line 52, in <module> import apache_beam as beam File "/Users/alazhang/Workspaces/Python/linkedin_extend/flink-agent/.venv/lib/python3.11/site-packages/apache_beam/__init__.py", line 88, in <module> from apache_beam import io File "/Users/alazhang/Workspaces/Python/linkedin_extend/flink-agent/.venv/lib/python3.11/site-packages/apache_beam/io/__init__.py", line 21, in <module> from apache_beam.io.avroio import * File "/Users/alazhang/Workspaces/Python/linkedin_extend/flink-agent/.venv/lib/python3.11/site-packages/apache_beam/io/avroio.py", line 53, in <module> from apache_beam.io import filebasedsink File "/Users/alazhang/Workspaces/Python/linkedin_extend/flink-agent/.venv/lib/python3.11/site-packages/apache_beam/io/filebasedsink.py", line 29, in <module> from apache_beam.io import iobase File "/Users/alazhang/Workspaces/Python/linkedin_extend/flink-agent/.venv/lib/python3.11/site-packages/apache_beam/io/iobase.py", line 54, in <module> from apache_beam.transforms import Impulse File "/Users/alazhang/Workspaces/Python/linkedin_extend/flink-agent/.venv/lib/python3.11/site-packages/apache_beam/transforms/__init__.py", line 24, in <module> from apache_beam.transforms.external import * File "/Users/alazhang/Workspaces/Python/linkedin_extend/flink-agent/.venv/lib/python3.11/site-packages/apache_beam/transforms/external.py", line 41, in <module> from apache_beam.runners import pipeline_context File "/Users/alazhang/Workspaces/Python/linkedin_extend/flink-agent/.venv/lib/python3.11/site-packages/apache_beam/runners/pipeline_context.py", line 48, in <module> from apache_beam.transforms import environments File "/Users/alazhang/Workspaces/Python/linkedin_extend/flink-agent/.venv/lib/python3.11/site-packages/apache_beam/transforms/environments.py", line 53, in <module> from apache_beam.runners.portability import stager File "/Users/alazhang/Workspaces/Python/linkedin_extend/flink-agent/.venv/lib/python3.11/site-packages/apache_beam/runners/portability/stager.py", line 63, in <module> import pkg_resources ModuleNotFoundError: No module named 'pkg_resources' at org.apache.flink.streaming.api.runners.python.beam.BeamPythonFunctionRunner.createStageBundleFactory(BeamPythonFunctionRunner.java:677) at org.apache.flink.streaming.api.runners.python.beam.BeamPythonFunctionRunner.open(BeamPythonFunctionRunner.java:290) at org.apache.flink.streaming.api.operators.python.process.AbstractExternalPythonFunctionOperator.open(AbstractExternalPythonFunctionOperator.java:57) at org.apache.flink.streaming.api.operators.python.process.AbstractExternalDataStreamPythonFunctionOperator.open(AbstractExternalDataStreamPythonFunctionOperator.java:85) at org.apache.flink.streaming.api.operators.python.process.AbstractExternalOneInputPythonFunctionOperator.open(AbstractExternalOneInputPythonFunctionOperator.java:117) at org.apache.flink.streaming.api.operators.python.process.ExternalPythonProcessOperator.open(ExternalPythonProcessOperator.java:64) at org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:107) at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreStateAndGates(StreamTask.java:858) at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$restoreInternal$5(StreamTask.java:812) at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55) at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:812) at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:771) at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:970) at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:939) at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:763) at org.apache.flink.runtime.taskmanager.Task.run(Task.java:575) at java.base/java.lang.Thread.run(Thread.java:834) Caused by: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: java.lang.IllegalStateException: Process died with exit code 0 at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2050) at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get(LocalCache.java:3952) at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974) at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958) at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4964) at org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:498) at org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:482) at org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:342) at org.apache.flink.streaming.api.runners.python.beam.BeamPythonFunctionRunner.createStageBundleFactory(BeamPythonFunctionRunner.java:675) ... 16 more Caused by: java.lang.IllegalStateException: Process died with exit code 0 at org.apache.beam.runners.fnexecution.environment.ProcessManager$RunningProcess.isAliveOrThrow(ProcessManager.java:75) at org.apache.beam.runners.fnexecution.environment.ProcessEnvironmentFactory.createEnvironment(ProcessEnvironmentFactory.java:110) at org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:284) at org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:240) at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3528) at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2277) at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2154) at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2044) ... 24 more ``` ### Tests <!-- How is this change verified? --> ### API <!-- Does this change touches any public APIs? --> ### Documentation <!-- Do not remove this section. Check the proper box only. --> - [ ] `doc-needed` <!-- Your PR changes impact docs --> - [X] `doc-not-needed` <!-- Your PR changes do not impact docs --> - [ ] `doc-included` <!-- Your PR already contains the necessary documentation updates --> -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
