Ox0400 opened a new issue, #35013: URL: https://github.com/apache/beam/issues/35013
### What happened?

The main points of my issue are as follows:

When a `DataflowPythonJobOp` task is submitted, the pipeline script is executed via `subprocess`. However, `apache_beam.runners.dataflow.internal.apiclient` does not adjust the default log level, and it uses `logging.info()` to print the Job ID and URL. With Python's default level (WARNING), those INFO messages never reach the subprocess output, so `create_python_job` fails to match the Job ID and the task fails.

It is possible to force the log level in the script:

```
logging.getLogger('apache_beam.runners.dataflow.internal.apiclient').setLevel(logging.INFO)
```

but this is not a very user-friendly solution.

Additional Context:

When running `DataflowPythonJobOp`, Vertex Pipeline needs to match the submitted Job URL in the subprocess output in order to call `create_python_job`. Because the module does not change the default log level, the Job URL is never displayed, which leads to a `RuntimeError`.

```
# run_pipeline.py
import time

from kfp import dsl
from google_cloud_pipeline_components.v1.dataflow import DataflowPythonJobOp
...

@dsl.pipeline(name='test dataflow')
def test_dataflow():
    dataflow_task = DataflowPythonJobOp(
        project=project,
        location=region,
        python_module_path=dataflow_clean_local_path,
        requirements_file_path=requirements_file_path,
        temp_location=temp_location,
        args=[
            "--project", project,
            "--region", region,
            "--temp_location", temp_location,
            "--job_name", f"dataflow-clean-{time.strftime('%Y%m%d-%H%M%S', time.gmtime())}",
            "--save_main_session",
            "--runner", "DataflowRunner",
        ],
    )
...
```

```
# data_clean.py
p = beam.Pipeline(options=options)
p | .....
result = p.run()
result.wait_until_finish()
return result
```

PR: https://github.com/apache/beam/pull/34952

Let me know if you'd like any refinements!
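To illustrate the failure mode described above, here is a minimal sketch that uses only the standard library and does not call Beam or Vertex. The logger name comes from this issue. The message text, the regex, and the helper `emit_and_scan` are assumptions made for illustration and are not the actual code of `apiclient` or `create_python_job`.

```python
import io
import logging
import re
from typing import Optional

# Logger name taken from the issue text.
APICLIENT_LOGGER = "apache_beam.runners.dataflow.internal.apiclient"
# Hypothetical pattern standing in for whatever the launcher scans for.
JOB_ID_PATTERN = re.compile(r"Created job with id: \[(?P<job_id>[^\]]+)\]")
# Made-up job ID used only for this demonstration.
FAKE_JOB_ID = "2025-01-01_00_00_00-123"


def emit_and_scan(set_info_level: bool) -> Optional[str]:
    """Log a fake job-ID line at INFO and return the ID a log scraper finds."""
    root = logging.getLogger()
    logger = logging.getLogger(APICLIENT_LOGGER)
    saved = (root.handlers[:], root.level, logger.level)

    stream = io.StringIO()
    root.handlers = [logging.StreamHandler(stream)]
    root.setLevel(logging.WARNING)  # Python's default root level
    # NOTSET makes the module logger inherit WARNING from the root logger.
    logger.setLevel(logging.INFO if set_info_level else logging.NOTSET)
    try:
        logger.info("Created job with id: [%s]", FAKE_JOB_ID)
        match = JOB_ID_PATTERN.search(stream.getvalue())
        return match.group("job_id") if match else None
    finally:
        # Restore global logging state so the sketch has no side effects.
        root.handlers, root.level = saved[0], saved[1]
        logger.setLevel(saved[2])


if __name__ == "__main__":
    print("default level:", emit_and_scan(set_info_level=False))
    print("INFO forced:  ", emit_and_scan(set_info_level=True))
```

With the default level the INFO record is dropped before any handler sees it, so the scan finds nothing. Setting the module logger to INFO lets the record through, which mirrors the workaround quoted above.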
### Issue Priority

Priority: 1 (data loss / total loss of function)

### Issue Components

- [x] Component: Python SDK
- [ ] Component: Java SDK
- [ ] Component: Go SDK
- [ ] Component: Typescript SDK
- [ ] Component: IO connector
- [ ] Component: Beam YAML
- [ ] Component: Beam examples
- [x] Component: Beam playground
- [ ] Component: Beam katas
- [ ] Component: Website
- [ ] Component: Infrastructure
- [ ] Component: Spark Runner
- [ ] Component: Flink Runner
- [ ] Component: Samza Runner
- [ ] Component: Twister2 Runner
- [ ] Component: Hazelcast Jet Runner
- [x] Component: Google Cloud Dataflow Runner

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org