[ https://issues.apache.org/jira/browse/BEAM-5683?focusedWorklogId=168064&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-168064 ]
ASF GitHub Bot logged work on BEAM-5683:
----------------------------------------
Author: ASF GitHub Bot
Created on: 21/Nov/18 00:29
Start Date: 21/Nov/18 00:29
Worklog Time Spent: 10m
Work Description: pabloem closed pull request #7043: [BEAM-5683] Add retry in pip download
URL: https://github.com/apache/beam/pull/7043
This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:
diff --git a/sdks/python/apache_beam/runners/portability/stager.py b/sdks/python/apache_beam/runners/portability/stager.py
index d5b7e005dff..285c2c26597 100644
--- a/sdks/python/apache_beam/runners/portability/stager.py
+++ b/sdks/python/apache_beam/runners/portability/stager.py
@@ -50,7 +50,6 @@
 import logging
 import os
 import shutil
-import subprocess
 import sys
 import tempfile
@@ -63,6 +62,7 @@
 # TODO(angoenka): Remove reference to dataflow internal names
 from apache_beam.runners.dataflow.internal import names
 from apache_beam.utils import processes
+from apache_beam.utils import retry
 # All constants are for internal use only; no backwards-compatibility
 # guarantees.
@@ -76,6 +76,13 @@
 BEAM_PACKAGE_NAME = 'apache-beam'
+def retry_on_non_zero_exit(exception):
+  if (isinstance(exception, processes.CalledProcessError) and
+      exception.returncode != 0):
+    return True
+  return False
+
+
 class Stager(object):
   """Abstract Stager identifies and copies the appropriate artifacts to the
   staging location.
@@ -395,6 +402,8 @@ def _get_python_executable():
     return python_bin
   @staticmethod
+  @retry.with_exponential_backoff(num_retries=4,
+                                  retry_filter=retry_on_non_zero_exit)
   def _populate_requirements_cache(requirements_file, cache_dir):
     # The 'pip download' command will not download again if it finds the
     # tarball with the proper version already present.
@@ -416,7 +425,7 @@ def _populate_requirements_cache(requirements_file, cache_dir):
         ':all:'
     ]
     logging.info('Executing command: %s', cmd_args)
-    processes.check_output(cmd_args)
+    processes.check_output(cmd_args, stderr=processes.STDOUT)
   @staticmethod
   def _build_setup_package(setup_file, temp_dir, build_setup_args=None):
@@ -558,7 +567,7 @@ def _download_pypi_sdk_package(temp_dir,
     logging.info('Executing command: %s', cmd_args)
     try:
       processes.check_output(cmd_args)
-    except subprocess.CalledProcessError as e:
+    except processes.CalledProcessError as e:
       raise RuntimeError(repr(e))
     for sdk_file in expected_files:
diff --git a/sdks/python/apache_beam/utils/processes.py b/sdks/python/apache_beam/utils/processes.py
index b0e8e3c8ba5..ccc0aa47742 100644
--- a/sdks/python/apache_beam/utils/processes.py
+++ b/sdks/python/apache_beam/utils/processes.py
@@ -32,6 +32,7 @@
 # We mimic the interface of the standard Python subprocess module.
 PIPE = subprocess.PIPE
 STDOUT = subprocess.STDOUT
+CalledProcessError = subprocess.CalledProcessError
 def call(*args, **kwargs):
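
For readers who want to see the retry behaviour in isolation, here is a minimal standalone sketch of the pattern the diff applies: retry a command only when it fails with a non-zero exit status, doubling the delay between attempts. It does not use Beam's apache_beam.utils.retry module. The with_backoff decorator, its initial_delay_secs and sleep parameters, and the flaky_download function are hypothetical names invented for this illustration, and the failure is simulated rather than produced by pip.

```python
import functools
import subprocess
import time


def retry_on_non_zero_exit(exception):
    """Return True only for subprocess failures with a non-zero exit code."""
    return (isinstance(exception, subprocess.CalledProcessError)
            and exception.returncode != 0)


def with_backoff(num_retries, retry_filter, initial_delay_secs=1.0,
                 sleep=time.sleep):
    """Hypothetical stand-in for a retry decorator (not Beam's implementation).

    Calls the wrapped function up to num_retries + 1 times, doubling the
    delay after every retryable failure.
    """
    def decorator(fun):
        @functools.wraps(fun)
        def wrapper(*args, **kwargs):
            delay = initial_delay_secs
            for attempt in range(num_retries + 1):
                try:
                    return fun(*args, **kwargs)
                except Exception as exc:
                    # Give up on the last attempt or on non-retryable errors.
                    if attempt == num_retries or not retry_filter(exc):
                        raise
                    sleep(delay)
                    delay *= 2
        return wrapper
    return decorator


attempts = []  # one entry per invocation of flaky_download
delays = []    # delays requested by the decorator (recorded, not slept)


@with_backoff(num_retries=4, retry_filter=retry_on_non_zero_exit,
              sleep=delays.append)
def flaky_download():
    """Simulates a command that exits with status 1 twice, then succeeds."""
    attempts.append(1)
    if len(attempts) < 3:
        raise subprocess.CalledProcessError(1, ['pip', 'download'])
    return 'ok'


result = flaky_download()
```

Passing delays.append as the sleep function keeps the sketch fast and makes the backoff schedule observable. An exception that the filter rejects, such as a ValueError, propagates on the first attempt without any retry.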
Issue Time Tracking
-------------------
Worklog Id: (was: 168064)
Time Spent: 1h 20m (was: 1h 10m)
> [beam_PostCommit_Py_VR_Dataflow] [test_multiple_empty_outputs] Fails due to
> pip download flake
> ----------------------------------------------------------------------------------------------
>
> Key: BEAM-5683
> URL: https://issues.apache.org/jira/browse/BEAM-5683
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-harness, test-failures
> Reporter: Scott Wegner
> Assignee: Ankur Goenka
> Priority: Major
> Labels: currently-failing
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> _Use this form to file an issue for test failure:_
> * [Jenkins
> Job|https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/1289/]
> * [Gradle Build
> Scan|https://scans.gradle.com/s/hjmzvh4ylhs6y/console-log?task=:beam-sdks-python:validatesRunnerBatchTests]
> * [Test source
> code|https://github.com/apache/beam/blob/303a4275eb0a323761e1a4dec6a22fde9863acf8/sdks/python/apache_beam/runners/portability/stager.py#L390]
> Initial investigation:
> Seems to be failing on pip download.
> ======================================================================
> ERROR: test_multiple_empty_outputs
> (apache_beam.transforms.ptransform_test.PTransformTest)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
> File
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/transforms/ptransform_test.py",
> line 277, in test_multiple_empty_outputs
> pipeline.run()
> File
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/testing/test_pipeline.py",
> line 104, in run
> result = super(TestPipeline, self).run(test_runner_api)
> File
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/pipeline.py",
> line 403, in run
> self.to_runner_api(), self.runner, self._options).run(False)
> File
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/pipeline.py",
> line 416, in run
> return self.runner.run_pipeline(self)
> File
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py",
> line 50, in run_pipeline
> self.result = super(TestDataflowRunner, self).run_pipeline(pipeline)
> File
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py",
> line 389, in run_pipeline
> self.dataflow_client.create_job(self.job), self)
> File
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/utils/retry.py",
> line 184, in wrapper
> return fun(*args, **kwargs)
> File
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/runners/dataflow/internal/apiclient.py",
> line 490, in create_job
> self.create_job_description(job)
> File
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/runners/dataflow/internal/apiclient.py",
> line 519, in create_job_description
> resources = self._stage_resources(job.options)
> File
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/runners/dataflow/internal/apiclient.py",
> line 452, in _stage_resources
> staging_location=google_cloud_options.staging_location)
> File
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/runners/portability/stager.py",
> line 161, in stage_job_resources
> requirements_cache_path)
> File
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/runners/portability/stager.py",
> line 411, in _populate_requirements_cache
> processes.check_call(cmd_args)
> File
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/apache_beam/utils/processes.py",
> line 46, in check_call
> return subprocess.check_call(*args, **kwargs)
> File "/usr/lib/python2.7/subprocess.py", line 541, in check_call
> raise CalledProcessError(retcode, cmd)
> CalledProcessError: Command
> '['/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/sdks/python/build/gradleenv/bin/python',
> '-m', 'pip', 'download', '--dest', '/tmp/dataflow-requirements-cache', '-r',
> 'postcommit_requirements.txt', '--exists-action', 'i', '--no-binary',
> ':all:']' returned non-zero exit status 1
> ----
> _After you've filled out the above details, please [assign the issue to an
> individual|https://beam.apache.org/contribute/postcommits-guides/index.html#find_specialist].
> Assignee should [treat test failures as
> high-priority|https://beam.apache.org/contribute/postcommits-policies/#assigned-failing-test],
> helping to fix the issue or find a more appropriate owner. See [Apache Beam
> Post-Commit
> Policies|https://beam.apache.org/contribute/postcommits-policies]._
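
The traceback quoted above reports only the exit status of the failed command, not what the command printed. As a rough sketch of why merging stderr into stdout helps diagnosis, the snippet below runs a child process that stands in for a failing download and reads its output back from the raised exception. The child command, its message, and the run_and_describe helper are made up for this illustration and are not part of Beam or pip.

```python
import subprocess
import sys

# A child process standing in for a failing download: it writes a
# diagnostic to stderr and exits with status 1. The message is invented.
failing_cmd = [
    sys.executable, '-c',
    "import sys; sys.stderr.write('simulated download failure\\n'); "
    "sys.exit(1)",
]


def run_and_describe(cmd_args):
    """Run a command; return (exit status, captured output as text)."""
    try:
        out = subprocess.check_output(cmd_args, stderr=subprocess.STDOUT)
    except subprocess.CalledProcessError as e:
        # With stderr=STDOUT, the child's error text is attached to the
        # exception as e.output instead of being lost to the console.
        return e.returncode, e.output.decode('utf-8', 'replace')
    return 0, out.decode('utf-8', 'replace')


returncode, output = run_and_describe(failing_cmd)
```

Without the stderr argument, e.output would hold only what the child wrote to stdout, which is empty for this command.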