Thanks Ivan. We're not using SSH agents but Docker Cloud (the agents are 
provisioned on the fly as docker containers).

I was indeed looking for how to turn on some debugging on the agent side 
but I couldn't find anything. Also the agent docker container is removed 
once the job is finished so it seems even harder to get some info about 
what's going on.

What I wanted to know is whether what we're experiencing is normal 
Jenkins behavior or not. I'm asking because a lot of our jobs run fine 
every day, but we still have several that are killed in mid-air 
every day. For example, if I take agent 6 (a6) from 
https://up1.xwikisas.com/#vI0VAypIpe_tD9LrQRTdMA I can see it was 
terminated on 2020-02-10 at:
* 4:44
* 5:06
* 5:24
* 7:45
* 10:06
* 10:24
* etc

Now I don't think we have that many job failures every day. It's more like 
1 or 2 per day. So I'm not sure what to think of it. 

I was trying to investigate why we see the following regularly (every day) 
in our CI job logs:

Cannot contact Jenkins SSH Slave a6-009448n7sqon4: java.lang.InterruptedException
Agent Jenkins SSH Slave a6-009448n7sqon4 was deleted; cancelling node body
Could not connect to Jenkins SSH Slave a6-009448n7sqon4 to send interrupt signal to process

And then I discovered what I've pasted at 
https://up1.xwikisas.com/#vI0VAypIpe_tD9LrQRTdMA by looking at the Jenkins 
master log file, and I went "wow, how come there are so many disconnections".
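
To compare the number of job failures with the number of disconnections, the 
messages could be tallied per agent from saved job logs. Below is a rough 
sketch only: the function name is made up for illustration, and the pattern 
is inferred solely from the three log lines quoted above, so it may well need 
adjusting for other message variants.

```python
import re
from collections import Counter

# Matches the "Cannot contact ..." and "Agent ... was deleted" messages.
# ASSUMPTION: the agent-name shape (e.g. "a6-009448n7sqon4") is inferred
# only from the log lines quoted in this mail.
LOST_AGENT_RE = re.compile(
    r"(?:Cannot contact|Agent) Jenkins SSH Slave (?P<agent>[\w-]+)"
    r"(?:: java\.lang\.InterruptedException| was deleted)"
)


def count_lost_agents(log_text: str) -> Counter:
    """Count lost-agent messages per agent prefix (hypothetical helper)."""
    counts = Counter()
    for match in LOST_AGENT_RE.finditer(log_text):
        # "a6-009448n7sqon4" -> "a6"
        prefix = match.group("agent").split("-", 1)[0]
        counts[prefix] += 1
    return counts
```

Calling count_lost_agents() on the text of each failed job's console log gives 
one counter per job, which can then be set against the termination times seen 
in the master log.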

Any idea is most welcome!

Thanks a lot
-Vincent


On Friday, 14 February 2020 at 19:50:27 UTC+1, Ivan Fernandez Calvo wrote:
>
> The ping thread and some monitoring tasks run every 4 min. I think the 
> disconnections happen before that process runs, but because there is no 
> activity on these agents, they are not detected until the ping thread 
> passes. So I guess you have half-closed connections: the agent closes the 
> connection but the master does not receive the reset packet. If you are 
> using SSH agents, you can enable verbose mode on the sshd server to 
> monitor what happens; see 
> https://github.com/jenkinsci/ssh-slaves-plugin/blob/master/doc/TROUBLESHOOTING.md#common-info-needed-to-troubleshooting-a-bug
