tvalentyn commented on code in PR #28564:
URL: https://github.com/apache/beam/pull/28564#discussion_r1333326458

##########
sdks/python/apache_beam/runners/portability/stager.py:
##########
@@ -62,6 +62,7 @@
 from urllib.parse import urlparse
 from packaging import version
+from pip._internal.operations import freeze

Review Comment:
   Let's not use internal APIs. A future pip upgrade may break this API call in previously released versions of Beam; this has happened before: https://github.com/pypa/pip/issues/5243#issuecomment-381422449 . Running the `pip freeze` command line would be more reliable. This is from a quick search, but feel free to look further: https://stackoverflow.com/questions/49923671/are-there-any-function-replacement-for-pip-get-installed-distributions-in-pip


##########
sdks/python/apache_beam/runners/portability/stager.py:
##########
@@ -84,6 +85,8 @@
 WORKFLOW_TARBALL_FILE = 'workflow.tar.gz'
 REQUIREMENTS_FILE = 'requirements.txt'
 EXTRA_PACKAGES_FILE = 'extra_packages.txt'
+# Filename that stores the submission environment dependencies.
+SUBMISSION_ENV_DEPENDENCIES_FILENAME = 'submission_environment_dependencies.txt'

Review Comment:
   Consistency nit: the neighboring constants use the `_FILE` suffix.
   ```suggestion
   SUBMISSION_ENV_DEPENDENCIES_FILE = 'submission_environment_dependencies.txt'
   ```


##########
sdks/python/apache_beam/runners/portability/stager.py:
##########
@@ -365,6 +368,16 @@ def create_job_resources(options,  # type: PipelineOptions
           Stager._create_file_stage_to_artifact(
               pickled_session_file, names.PICKLED_MAIN_SESSION_FILE))
+    # stage the submission environment dependencies
+    local_dependency_file_path = os.path.join(
+        temp_dir, SUBMISSION_ENV_DEPENDENCIES_FILENAME)
+    dependencies = freeze.freeze()

Review Comment:
   Let's make this best effort: if this fails for whatever reason, don't fail the job submission. You could also consider moving this portion of the code into a helper, since this method keeps growing (no strong opinion).
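
For illustration only, here is a minimal sketch of how the comments above could fit together: shelling out to `pip freeze` instead of importing `pip._internal`, and treating any failure as non-fatal. The helper name `write_submission_env_dependencies`, the 60-second timeout, and the logger setup are assumptions made for this sketch, not code from the PR or from Beam.

```python
import logging
import os
import subprocess
import sys

_LOGGER = logging.getLogger(__name__)

# Constant name as proposed in the suggestion above.
SUBMISSION_ENV_DEPENDENCIES_FILE = 'submission_environment_dependencies.txt'


def write_submission_env_dependencies(temp_dir):
  """Hypothetical helper: best-effort dump of `pip freeze` into temp_dir.

  Returns the path of the written file, or None if anything went wrong.
  """
  try:
    # Run pip through the current interpreter so the listing matches the
    # submission environment, and avoid pip's internal Python APIs.
    output = subprocess.run(
        [sys.executable, '-m', 'pip', 'freeze'],
        check=True,
        capture_output=True,
        text=True,
        timeout=60).stdout  # The timeout value is an arbitrary assumption.
    path = os.path.join(temp_dir, SUBMISSION_ENV_DEPENDENCIES_FILE)
    with open(path, 'w') as f:
      f.write(output)
    return path
  except Exception as e:  # pylint: disable=broad-except
    # Best effort: log and continue rather than failing the job submission.
    _LOGGER.warning(
        'Could not capture submission environment dependencies: %s', e)
    return None
```

A caller in `create_job_resources` would then stage the file only when the helper returns a path, and skip the artifact otherwise.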
