[jira] [Updated] (YETUS-744) Report broken ASF nodes

Allen Wittenauer (JIRA) Fri, 21 Dec 2018 11:54:39 -0800


     [ https://issues.apache.org/jira/browse/YETUS-744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Allen Wittenauer updated YETUS-744:
-----------------------------------
    Description: 
The ASF build infrastructure is barely monitored and most of the jobs are 
pretty terrible. This means it isn't unusual for things such as process slots 
to drop to zero and cause problems.  For example, it isn't unusual for the 
relatively tiny Yetus project jobs to fail.  But they fail in such a way that 
Yetus doesn't really report the problem correctly.  Digging into the 
coprocessors log will show:

{code}
/testptch/patchprocess/precommit/core.d/00-yetuslib.sh: fork: retry: No child 
processes
/testptch/patchprocess/precommit/core.d/00-yetuslib.sh: fork: retry: No child 
processes
/testptch/patchprocess/precommit/core.d/00-yetuslib.sh: fork: retry: No child 
processes
/testptch/patchprocess/precommit/core.d/00-yetuslib.sh: fork: retry: No child 
processes
/testptch/patchprocess/precommit/core.d/00-yetuslib.sh: fork: Resource 
temporarily unavailable
{code}

test-patch should: 
* specifically look for this condition 
* bail out early rather than trying to continue on
* report exactly which node is broken, especially if it can be done prior to or 
after launching docker

  was:
The ASF build infrastructure is barely monitored and most of the jobs are 
pretty terrible. This means it isn't unusually for things such as process slots 
to drop to zero and cause problems.  For example, it isn't unusual for the 
relatively tiny Yetus project jobs to fail.  But they fail in such a way that 
Yetus doesn't really report the problem correctly.  Digging into the 
coprocessors log will show:

{code}
/testptch/patchprocess/precommit/core.d/00-yetuslib.sh: fork: retry: No child 
processes
/testptch/patchprocess/precommit/core.d/00-yetuslib.sh: fork: retry: No child 
processes
/testptch/patchprocess/precommit/core.d/00-yetuslib.sh: fork: retry: No child 
processes
/testptch/patchprocess/precommit/core.d/00-yetuslib.sh: fork: retry: No child 
processes
/testptch/patchprocess/precommit/core.d/00-yetuslib.sh: fork: Resource 
temporarily unavailable
{code}

test-patch should: 
* specifically look for this condition 
* bail out early rather than trying to continue on
* report exactly which node is broken, especially if it can be done prior to or 
after launching docker


> Report broken ASF nodes
> -----------------------
>
>                 Key: YETUS-744
>                 URL: https://issues.apache.org/jira/browse/YETUS-744
>             Project: Yetus
>          Issue Type: New Feature
>            Reporter: Allen Wittenauer
>            Priority: Major
>
> The ASF build infrastructure is barely monitored and most of the jobs are 
> pretty terrible. This means it isn't unusual for things such as process slots 
> to drop to zero and cause problems.  For example, it isn't unusual for the 
> relatively tiny Yetus project jobs to fail.  But they fail in such a way that 
> Yetus doesn't really report the problem correctly.  Digging into the 
> coprocessors log will show:
> {code}
> /testptch/patchprocess/precommit/core.d/00-yetuslib.sh: fork: retry: No child 
> processes
> /testptch/patchprocess/precommit/core.d/00-yetuslib.sh: fork: retry: No child 
> processes
> /testptch/patchprocess/precommit/core.d/00-yetuslib.sh: fork: retry: No child 
> processes
> /testptch/patchprocess/precommit/core.d/00-yetuslib.sh: fork: retry: No child 
> processes
> /testptch/patchprocess/precommit/core.d/00-yetuslib.sh: fork: Resource 
> temporarily unavailable
> {code}
> test-patch should: 
> * specifically look for this condition 
> * bail out early rather than trying to continue on
> * report exactly which node is broken, especially if it can be done prior to or 
> after launching docker
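
The three requests above could be sketched roughly as follows. This is only an illustration and not code from Yetus: the function names (log_has_fork_failure, report_broken_node, check_node_health) are hypothetical, the log file path is supplied by the caller, and the node name is assumed to come from the NODE_NAME variable that Jenkins exports to jobs, falling back to hostname when it is unset.

```shell
#!/usr/bin/env bash
# Hypothetical sketch only; none of these function names exist in Yetus.

# Return 0 if the given log contains the fork-exhaustion signature
# quoted in this issue ("fork: retry: No child processes" or
# "fork: Resource temporarily unavailable").
function log_has_fork_failure
{
  local logfile=$1
  grep -q -E 'fork: (retry: )?(No child processes|Resource temporarily unavailable)' "${logfile}"
}

# Print a one-line report naming the node.
# Assumption: NODE_NAME is set by Jenkins; otherwise use hostname.
function report_broken_node
{
  local node=${NODE_NAME:-$(hostname)}
  echo "ERROR: process slots exhausted on node ${node}; bailing out early"
}

# Check a log and, if the signature is present, report the node
# and return non-zero so the caller can stop instead of continuing.
function check_node_health
{
  local logfile=$1
  if log_has_fork_failure "${logfile}"; then
    report_broken_node
    return 1
  fi
  return 0
}
```

A caller might run check_node_health against the console log both before and after launching docker and exit when it returns non-zero, so the broken node is named in the first error rather than buried in a later failure.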



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
