[ https://issues.apache.org/jira/browse/FLINK-18842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17197110#comment-17197110 ]
Matthias commented on FLINK-18842:
----------------------------------
{quote}
Maybe we could harden the start of the fileserver; we currently just try
starting it but don't do any verification that it actually started successfully.
{quote}
I tried to reproduce the failure by simulating a file server that had not been
started, without luck. But when going through the logs of the failed builds
again, I realized that we can rule out a non-running file server as the cause:
each of the failed builds includes a line like {{127.0.0.1 - - [12/Aug/2020
09:24:06] "GET /flink.tgz HTTP/1.1" 200 -}}. Hence, the file server was running.
Next thing to try: timeout + retries.
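
A rough sketch of what the retry part could look like, assuming a small shell
helper wrapped around the existing download step. The {{retry}} function, the
attempt count and the delay are hypothetical and not taken from the current
test scripts; {{--timeout}} is a standard wget option, but the value shown is
only a guess.

```shell
# Hypothetical helper: run a command up to MAX_ATTEMPTS times,
# sleeping DELAY seconds between failed attempts.
# Usage: retry MAX_ATTEMPTS DELAY command [args...]
retry() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  # Re-run the command until it exits with status 0.
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "Command failed after ${attempt} attempts: $*" >&2
      return 1
    fi
    echo "Attempt ${attempt} failed, retrying in ${delay}s..." >&2
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Possible use for the download step (left commented out, values are assumptions):
# retry 5 2 wget -nv --timeout=30 -O flink.tgz localhost:9999/flink.tgz
```

wget also ships its own {{--tries}}, {{--waitretry}} and {{--retry-connrefused}}
options, which might make a wrapper unnecessary. Either way, retries alone would
not explain why the build hangs after a download that the log reports as
complete.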
> e2e test failed to download "localhost:9999/flink.tgz" in "Wordcount on
> Docker test"
> ------------------------------------------------------------------------------------
>
> Key: FLINK-18842
> URL: https://issues.apache.org/jira/browse/FLINK-18842
> Project: Flink
> Issue Type: Test
> Components: flink-docker, Test Infrastructure
> Affects Versions: 1.11.0
> Reporter: Dian Fu
> Assignee: Matthias
> Priority: Major
> Labels: test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=5260&view=logs&j=c88eea3b-64a0-564d-0031-9fdcd7b8abee&t=2b7514ee-e706-5046-657b-3430666e7bd9
> {code}
> 2020-08-06T20:51:31.2499580Z + wget -nv -O flink.tgz localhost:9999/flink.tgz
> 2020-08-06T20:51:31.2501498Z 127.0.0.1 - - [06/Aug/2020 20:51:31] "GET /flink.tgz HTTP/1.1" 200 -
> 2020-08-06T20:51:31.2502214Z failed: Connection refused.
> 2020-08-06T20:51:31.6885068Z 2020-08-06 20:51:31 URL:http://localhost:9999/flink.tgz [322693675/322693675] -> "flink.tgz" [1]
> 2020-08-06T20:51:31.6888547Z + [ false = true ]
> 2020-08-06T20:51:31.6889384Z + tar -xf flink.tgz --strip-components=1
> 2020-08-06T20:51:34.8125585Z + rm flink.tgz
> 2020-08-06T20:51:34.8699287Z + chown -R flink:flink .
> 2020-08-07T00:20:42.7919165Z
> 2020-08-07T00:20:43.0365895Z ##[error]The operation was canceled.
> {code}