Ben Kietzman created ARROW-8432:
-----------------------------------
Summary: [Python][CI] Failure to download Hadoop
Key: ARROW-8432
URL: https://issues.apache.org/jira/browse/ARROW-8432
Project: Apache Arrow
Issue Type: Bug
Components: Continuous Integration, Python
Affects Versions: 0.16.0
Reporter: Ben Kietzman
Assignee: Ben Kietzman
Fix For: 0.17.0
https://circleci.com/gh/ursa-labs/crossbow/11128?utm_campaign=vcs-integration-link&utm_medium=referral&utm_source=github-build-link
This build failure is caused by an HTTP request failure at:
https://github.com/apache/arrow/blob/master/ci/docker/conda-python-hdfs.dockerfile#L36
We should probably not rely on https://www.apache.org/dyn/mirrors/mirrors.cgi
to get tarballs. Currently three places do so:
{code}
ci/docker/conda-python-hdfs.dockerfile
36:RUN wget -q -O - "https://www.apache.org/dyn/mirrors/mirrors.cgi?action=download&filename=hadoop/common/hadoop-${hdfs}/hadoop-${hdfs}.tar.gz" | tar -xzf - -C /opt

ci/docker/linux-apt-docs.dockerfile
57:RUN wget -q -O - "https://www.apache.org/dyn/mirrors/mirrors.cgi?action=download&filename=maven/maven-3/${maven}/binaries/apache-maven-${maven}-bin.tar.gz" | tar -xzf - -C /opt

python/manylinux1/scripts/build_thrift.sh
22:  "https://www.apache.org/dyn/mirrors/mirrors.cgi?action=download&filename=${THRIFT_DOWNLOAD_PATH}" \
{code}
Factor these out into a reusable script for downloading Apache tarballs. It
should contain hard-coded Apache mirrors and retry when connections fail.
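
A minimal sketch of what such a script could look like, assuming POSIX sh and
wget. The function names (fetch_url, download_apache_tarball), the
APACHE_DOWNLOAD_ATTEMPTS and APACHE_DOWNLOAD_DELAY variables, and the choice
of mirrors are all placeholders for illustration, not something already in
the repository:

```shell
# Hypothetical helper: download an Apache tarball from a fixed mirror list.
# The mirror list below is an assumption; the project would pick its own.
APACHE_MIRRORS="https://downloads.apache.org https://archive.apache.org/dist"

# Fetch a single URL to stdout. Kept separate so it can be swapped out
# (for example with curl, or with a stub when testing the retry logic).
fetch_url() {
  wget -q -O - "$1"
}

# Usage: download_apache_tarball <path-under-mirror-root> <output-file>
# Tries each mirror in order, retrying a fixed number of times per mirror.
download_apache_tarball() {
  path="$1"
  out="$2"
  attempts="${APACHE_DOWNLOAD_ATTEMPTS:-3}"  # tries per mirror
  delay="${APACHE_DOWNLOAD_DELAY:-5}"        # seconds between tries

  for mirror in ${APACHE_MIRRORS}; do
    attempt=1
    while [ "${attempt}" -le "${attempts}" ]; do
      if fetch_url "${mirror}/${path}" > "${out}"; then
        return 0
      fi
      echo "attempt ${attempt}/${attempts} failed: ${mirror}/${path}" >&2
      sleep "${delay}"
      attempt=$((attempt + 1))
    done
  done

  # Every mirror failed; do not leave a partial file behind.
  rm -f "${out}"
  return 1
}
```

The Hadoop step could then call something like
download_apache_tarball "hadoop/common/hadoop-${hdfs}/hadoop-${hdfs}.tar.gz" /tmp/hadoop.tar.gz
followed by tar -xzf /tmp/hadoop.tar.gz -C /opt. Writing to a file first,
rather than piping straight into tar, means a failed or truncated download is
never extracted.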