[ 
https://issues.apache.org/jira/browse/BEAM-10567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17164017#comment-17164017
 ] 

Hannah Jiang edited comment on BEAM-10567 at 7/23/20, 11:07 PM:
----------------------------------------------------------------

I went through several recent failures and didn't see any that were related to 
the license-pull tasks.

One common pattern I observed is *[Execution failed for task 
':sdks:go:resolveBuildDependencies'|https://ci-beam.apache.org/job/beam_PreCommit_PythonDocker_Commit/1908/console].*
 
Execution failed for task ':sdks:go:resolveBuildDependencies'.
10:38:06 > Exception in resolution, message is:
10:38:06   Cannot resolve dependency:github.com/coreos/etcd: 
commit='11214aa33bf5a47d3d9d8dafe0f6b97237dfe921', 
urls=[https://github.com/coreos/etcd.git, git@github.com:coreos/etcd.git]
10:38:06   Resolution stack is:
10:38:06   +- github.com/apache/beam/sdks/go

Some other errors are 
1. [ProtocolError: ("Connection broken: error(104, 'Connection reset by 
peer')", error(104, 'Connection reset by peer'))
|https://ci-beam.apache.org/job/beam_PreCommit_PythonDocker_Commit/1746/consoleFull]
 when downloading tensorflow for Py2. 
2. [ERROR: Could not install packages due to an EnvironmentError: [Errno 28] 
No space left on device
|https://ci-beam.apache.org/job/beam_PreCommit_PythonDocker_Commit/1905/console]
 <- This tends to recur over several consecutive runs until disk space is 
freed.

I looked through the last 14 failed runs, and around 80% of them failed in 
the :sdks:go:resolveBuildDependencies task. To fix the flakiness, we 
should look into this task. None of the failures were caused by the license job.
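
The tally above was done by hand. A minimal sketch of how it could be scripted, assuming the console text of each failed run has already been saved locally; the function names and labels are hypothetical, and the signatures are just the error strings quoted in this comment:

```python
import re
from collections import Counter

# Signatures copied from the failures quoted in this comment. The labels are
# illustrative names only, not Jenkins or Gradle terminology.
SIGNATURES = [
    ("go-deps", re.compile(
        r"Execution failed for task ':sdks:go:resolveBuildDependencies'")),
    ("connection-reset", re.compile(r"Connection reset by peer")),
    ("disk-full", re.compile(r"No space left on device")),
]


def classify(console_text):
    """Return the label of the first signature found in the log, or 'other'."""
    for label, pattern in SIGNATURES:
        if pattern.search(console_text):
            return label
    return "other"


def summarize(logs):
    """Map each label to its share of the given console logs."""
    counts = Counter(classify(text) for text in logs)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}
```

Calling summarize() on the list of saved console texts gives the fraction of runs per failure type, which makes it easy to recheck the share of :sdks:go:resolveBuildDependencies failures as new runs come in.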

Another way to reduce the flakiness is to merge this job into PythonPrecommit, 
because some tasks are duplicated, like the one mentioned above. With the two 
merged, a flaky shared task would cause one failure instead of two.
The PythonDocker job was created as a separate job for two reasons:
1. PythonPrecommit is already large, and we don't want to add more tasks to it 
when it's reasonable to separate them. Docker images are not built as part of 
the PythonPrecommit tests.
2. The job should run with the --info option to print error logs from the 
Docker image build; otherwise, it's hard to debug. PythonPrecommit doesn't run 
with --info, so we would need to add it, but doing so prints too many logs.
Whether to merge is a trade-off.




  


was (Author: hannahjiang):
I went through several recent failures and don't think the failures are from 
license pull related tasks.


> PythonDocker precommit seems flaky
> ----------------------------------
>
>                 Key: BEAM-10567
>                 URL: https://issues.apache.org/jira/browse/BEAM-10567
>             Project: Beam
>          Issue Type: Bug
>          Components: test-failures
>            Reporter: Udi Meiri
>            Assignee: Hannah Jiang
>            Priority: P2
>
> I've been getting these failures recently:
> {code}
> 08:53:23 > Task :sdks:python:container:py2:docker FAILED
> 08:53:23 The command '/bin/sh -c pip install -r 
> /tmp/base_image_requirements.txt &&     python -c "from 
> google.protobuf.internal import api_implementation; assert 
> api_implementation._default_implementation_type == 'cpp'; print ('Verified 
> fast protobuf used.')" &&     rm -rf /root/.cache/pip' returned a non-zero 
> code: 2
> 08:53:23 :sdks:python:container:py2:docker (Thread[Execution worker for ':' 
> Thread 11,5,main]) completed. Took 2 mins 32.762 secs.
> {code}
> https://ci-beam.apache.org/job/beam_PreCommit_PythonDocker_Commit/1746/consoleFull



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
