[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16219353#comment-16219353
 ] 

Allen Wittenauer commented on HDFS-12711:
-----------------------------------------

Re-launched the agent using the Jenkins UI.  But now Jenkins doesn't appear to 
want to schedule *any* jobs.

Just shoot me.

... In other news, a bit of "inside baseball" that I bet a lot of people don't 
know.

When Yetus launches a docker container, Jenkins doesn't know how to kill it.  
It sends the equivalent of ctrl-c to the docker CLI, but the CLI doesn't seem 
to respond to it. So the Docker container *continues to run*. (This is why 
"timed out" tasks will still run if they are running in their docker armor.) 
In qbt mode, there is no JIRA to write to, so the output is actually handled 
by Jenkins.  In test-patch mode, Yetus has a JIRA to write output to.  What we 
are seeing is that Yetus continues to run, finishes, and then says "yeah, a 
bunch of stuff failed.  fix your code."  Meanwhile, outside the container, 
it's death and destruction and the loss of the Jenkins agent and probably 
other stuff.
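
To make the failure mode concrete, here is a minimal sketch of a wrapper that 
does not rely on the docker CLI reacting to a signal. It gives the container 
an explicit name and, on timeout, removes the container itself by that name. 
This is not what Yetus or Jenkins actually do; the function names, the "job-" 
name prefix, and the injectable `run` parameter are all hypothetical and exist 
only for illustration.

```python
import subprocess
import uuid


def build_run_command(image, command, name):
    """Return the argv for `docker run` with an explicit container name."""
    return ["docker", "run", "--rm", "--name", name, image] + list(command)


def build_kill_command(name):
    """Return the argv that force-removes the named container."""
    return ["docker", "rm", "-f", name]


def run_with_timeout(image, command, timeout_s, run=subprocess.run):
    """Run a container; on timeout, remove it explicitly by name.

    Stopping the `docker run` client process alone may leave the container
    running, so the cleanup step addresses the container itself.
    Returns the exit code, or None if the timeout was hit.
    """
    # Hypothetical naming scheme; any unique name would do.
    name = "job-" + uuid.uuid4().hex[:12]
    try:
        result = run(build_run_command(image, command, name), timeout=timeout_s)
        return result.returncode
    except subprocess.TimeoutExpired:
        # The client was killed by subprocess.run; now kill the container.
        run(build_kill_command(name), check=False)
        return None
```

The point of naming the container up front is that whatever enforces the 
timeout can then target the container directly instead of hoping the client 
forwards the interrupt.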

But this does mean that, at least in this run, it was *NOT* a kernel panic, 
because otherwise we would never have gotten any feedback at all.  That's 
fantastic news because it means there are likely some controls that can be 
put around it... it's just a matter of whether they are OS-level/infra or 
docker-related.
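
As one example of what a docker-related control might look like, the sketch 
below builds a `docker run` command line with memory and process-count caps. 
Which controls would actually help here is still an open question; the helper 
name and the default limit values are assumptions picked purely for 
illustration, not tuned recommendations.

```python
def limited_run_args(image, command, memory="4g", pids_limit=2048):
    """Build `docker run` argv with memory and process-count caps.

    The default limits are placeholder values, not recommendations.
    """
    return [
        "docker", "run", "--rm",
        "--memory", memory,               # cap on container memory
        "--pids-limit", str(pids_limit),  # cap on processes/threads
        image,
    ] + list(command)
```

With caps like these, a runaway test would hit the container's limits before 
it could exhaust the host, at least in principle.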

It's worth noting that from what I can tell, surefire will report OOM'd and/or 
otherwise externally killed tests as "timed out".  So there was still a lot of 
death and destruction inside the container as well.

> deadly hdfs test
> ----------------
>
>                 Key: HDFS-12711
>                 URL: https://issues.apache.org/jira/browse/HDFS-12711
>             Project: Hadoop HDFS
>          Issue Type: Test
>    Affects Versions: 2.9.0, 2.8.2
>            Reporter: Allen Wittenauer
>            Priority: Critical
>         Attachments: HDFS-12711.branch-2.00.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
