Thanks Ivan. We're not using SSH agents but Docker Cloud (the agents are provisioned on the fly as Docker containers).
I was indeed looking for a way to turn on debugging on the agent side, but I couldn't find anything. Also, the agent Docker container is removed once the job finishes, which makes it even harder to get information about what's going on.

What I wanted to know is whether what we're experiencing is normal Jenkins behavior or not. I'm asking because a lot of our jobs run fine every day, but we still have several that are killed in mid-air every day.

For example, if I take agent 6 (a6) from https://up1.xwikisas.com/#vI0VAypIpe_tD9LrQRTdMA, I can see it was terminated on 2020-02-10 at:
* 4:44
* 5:06
* 5:24
* 7:45
* 10:06
* 10:24
* etc.

Now, I don't think we have that many job failures every day. It's more like 1 or 2 per day. So I'm not sure what to think of it.

I was trying to investigate why we regularly (every day) see the following in our CI job logs:

Cannot contact Jenkins SSH Slave a6-009448n7sqon4: java.lang.InterruptedException
Agent Jenkins SSH Slave a6-009448n7sqon4 was deleted; cancelling node body
Could not connect to Jenkins SSH Slave a6-009448n7sqon4 to send interrupt signal to process

Then I discovered what I've pasted at https://up1.xwikisas.com/#vI0VAypIpe_tD9LrQRTdMA by looking at the Jenkins master log file, and I went "wow, how come there are so many disconnections?".

Any idea is most welcome!

Thanks a lot
-Vincent

On Friday, 14 February 2020 at 19:50:27 UTC+1, Ivan Fernandez Calvo wrote:
> Pingthread and some monitoring stuff run every 4 min. I think the
> disconnections happen before that process runs, but because there is no
> activity on these agents, they are not detected until the pingthread
> passes. So I guess you have half-closed connections; I mean, the agent
> closes the connection but the master does not receive the reset packet.
> If you are using SSH agents, you can enable verbose mode on the sshd
> server to monitor what the heck happens; see
> https://github.com/jenkinsci/ssh-slaves-plugin/blob/master/doc/TROUBLESHOOTING.md#common-info-needed-to-troubleshooting-a-bug
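
One way to look at the spacing of the a6 terminations listed above is to compute the gaps between consecutive times. The following is only a minimal sketch in Python: it uses just the six times quoted in this message (the "etc" entries are not included), and the helper name gaps_in_minutes is made up for illustration. It says nothing about the cause of the disconnections.

```python
from datetime import datetime

# Termination times reported for agent a6 on 2020-02-10 (copied from the
# message; the entries hidden behind "etc" are not included).
times = ["4:44", "5:06", "5:24", "7:45", "10:06", "10:24"]


def gaps_in_minutes(stamps):
    """Return the minutes elapsed between consecutive H:MM timestamps."""
    parsed = [datetime.strptime(s, "%H:%M") for s in stamps]
    return [int((b - a).total_seconds() // 60) for a, b in zip(parsed, parsed[1:])]


gaps = gaps_in_minutes(times)
print(gaps)
```

The same helper could be fed the full list of times from the master log to see whether the terminations follow any regular interval.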
