[ 
https://issues.apache.org/jira/browse/FLINK-10481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16691952#comment-16691952
 ] 

ASF GitHub Bot commented on FLINK-10481:
----------------------------------------

dawidwys closed pull request #7074: [FLINK-10481][e2e] Added retry logic for building docker image
URL: https://github.com/apache/flink/pull/7074
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

diff --git a/flink-end-to-end-tests/test-scripts/common.sh b/flink-end-to-end-tests/test-scripts/common.sh
index 4e6254864c1..275d9c49f4b 100644
--- a/flink-end-to-end-tests/test-scripts/common.sh
+++ b/flink-end-to-end-tests/test-scripts/common.sh
@@ -663,3 +663,22 @@ function find_latest_completed_checkpoint {
     local checkpoint_meta_file=$(ls -d ${checkpoint_root_directory}/chk-[1-9]*/_metadata | sort -Vr | head -n1)
     echo "$(dirname "${checkpoint_meta_file}")"
 }
+
+function retry_times() {
+    local retriesNumber=$1
+    local backoff=$2
+    local command=${@:3}
+
+    for (( i = 0; i < ${retriesNumber}; i++ ))
+    do
+        if ${command}; then
+            return 0
+        fi
+
+        echo "Command: ${command} failed. Retrying..."
+        sleep ${backoff}
+    done
+
+    echo "Command: ${command} failed ${retriesNumber} times."
+    return 1
+}
diff --git a/flink-end-to-end-tests/test-scripts/test_docker_embedded_job.sh b/flink-end-to-end-tests/test-scripts/test_docker_embedded_job.sh
index 370ef052a4d..2d8aa4f47d4 100755
--- a/flink-end-to-end-tests/test-scripts/test_docker_embedded_job.sh
+++ b/flink-end-to-end-tests/test-scripts/test_docker_embedded_job.sh
@@ -21,6 +21,8 @@ source "$(dirname "$0")"/common.sh
 
 DOCKER_MODULE_DIR=${END_TO_END_DIR}/../flink-container/docker
 DOCKER_SCRIPTS=${END_TO_END_DIR}/test-scripts/container-scripts
+DOCKER_IMAGE_BUILD_RETRIES=3
+BUILD_BACKOFF_TIME=5
 
 export FLINK_JOB=org.apache.flink.examples.java.wordcount.WordCount
 export FLINK_DOCKER_IMAGE_NAME=test_docker_embedded_job
@@ -30,12 +32,19 @@ export INPUT_PATH=/data/test/input
 export OUTPUT_PATH=/data/test/output
 export FLINK_JOB_ARGUMENTS="--input ${INPUT_PATH}/words --output ${OUTPUT_PATH}/docker_wc_out"
 
-# user inside the container must be able to createto workaround in-container permissions
+build_image() {
+    ./build.sh --from-local-dist --job-jar ${FLINK_DIR}/examples/batch/WordCount.jar --image-name ${FLINK_DOCKER_IMAGE_NAME}
+}
+
+# user inside the container must be able to create files, this is a workaround in-container permissions
 mkdir -p $OUTPUT_VOLUME
 chmod 777 $OUTPUT_VOLUME
 
 cd "$DOCKER_MODULE_DIR"
-./build.sh --from-local-dist --job-jar ${FLINK_DIR}/examples/batch/WordCount.jar --image-name ${FLINK_DOCKER_IMAGE_NAME}
+if ! retry_times $DOCKER_IMAGE_BUILD_RETRIES ${BUILD_BACKOFF_TIME} build_image; then
+    echo "Failed to build docker image. Aborting..."
+    exit 1
+fi
 cd "$END_TO_END_DIR"
 
 docker-compose -f ${DOCKER_MODULE_DIR}/docker-compose.yml -f ${DOCKER_SCRIPTS}/docker-compose.test.yml up --abort-on-container-exit --exit-code-from job-cluster &> /dev/null
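
For readers who want to try the retry pattern from the diff outside the Flink test scripts, here is a minimal standalone sketch. It is not the merged code: retry_times_sketch and flaky_build are hypothetical names used only for illustration. The sketch assumes bash, keeps the command as an array so that arguments containing spaces are not re-split, declares the loop counter local, and skips the sleep after the final failed attempt.

```bash
#!/usr/bin/env bash

# Hypothetical variant of retry_times; NOT the code merged in the PR.
# Usage: retry_times_sketch <attempts> <backoff_seconds> <command> [args...]
retry_times_sketch() {
    local attempts=$1
    local backoff=$2
    shift 2
    local cmd=("$@")   # array keeps each argument intact, even with spaces
    local i            # local counter, so the caller's variables are untouched

    for (( i = 1; i <= attempts; i++ )); do
        if "${cmd[@]}"; then
            return 0
        fi
        echo "Command: ${cmd[*]} failed (attempt ${i} of ${attempts})."
        # Sleep only if another attempt follows.
        if (( i < attempts )); then
            sleep "${backoff}"
        fi
    done

    echo "Command: ${cmd[*]} failed ${attempts} times."
    return 1
}

# Stand-in for a flaky build step (hypothetical): succeeds on its third call.
calls=0
flaky_build() {
    calls=$(( calls + 1 ))
    (( calls >= 3 ))
}

if retry_times_sketch 3 0 flaky_build; then
    echo "build succeeded after ${calls} attempts"
fi
```

The function is called the same way as in the diff: number of attempts, backoff in seconds, then the command to run. A backoff of 0 is used here only so that the example finishes immediately.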


 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


> Wordcount end-to-end test in docker env unstable
> ------------------------------------------------
>
>                 Key: FLINK-10481
>                 URL: https://issues.apache.org/jira/browse/FLINK-10481
>             Project: Flink
>          Issue Type: Bug
>          Components: Tests
>    Affects Versions: 1.7.0
>            Reporter: Till Rohrmann
>            Assignee: Dawid Wysakowicz
>            Priority: Critical
>              Labels: pull-request-available, test-stability
>             Fix For: 1.5.6, 1.6.3, 1.8.0, 1.7.1
>
>
> The {{Wordcount end-to-end test in docker env}} fails sometimes on Travis 
> with the following problem:
> {code}
> Status: Downloaded newer image for java:8-jre-alpine
>  ---> fdc893b19a14
> Step 2/16 : RUN apk add --no-cache bash snappy
>  ---> [Warning] IPv4 forwarding is disabled. Networking will not work.
>  ---> Running in 4329ebcd8a77
> fetch http://dl-cdn.alpinelinux.org/alpine/v3.4/main/x86_64/APKINDEX.tar.gz
> WARNING: Ignoring http://dl-cdn.alpinelinux.org/alpine/v3.4/main/x86_64/APKINDEX.tar.gz: temporary error (try again later)
> fetch http://dl-cdn.alpinelinux.org/alpine/v3.4/community/x86_64/APKINDEX.tar.gz
> WARNING: Ignoring http://dl-cdn.alpinelinux.org/alpine/v3.4/community/x86_64/APKINDEX.tar.gz: temporary error (try again later)
> ERROR: unsatisfiable constraints:
>   bash (missing):
>     required by: world[bash]
>   snappy (missing):
>     required by: world[snappy]
> The command '/bin/sh -c apk add --no-cache bash snappy' returned a non-zero code: 2
> {code}
> https://api.travis-ci.org/v3/job/434909395/log.txt
> The failure appears to be related to
> https://github.com/gliderlabs/docker-alpine/issues/264 and
> https://github.com/gliderlabs/docker-alpine/issues/279.
> We might want to switch to a different base image to avoid these problems in
> the future.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
