Hey Jeff, this does indeed look like the 'standard' type of problem. Unfortunately, in
our network I do not have the privileges to do much of anything. Not that it
would help much, since I'm only a simple SW engineer, not a network
specialist. The tip to try another agent connection is a good one though. Will
try that.
thx again, David
On Tuesday, April 14, 2020, 07:52:18 PM GMT+2, Jeff Thompson
<[email protected]> wrote:
Unfortunately, it's really hard to say. Possibilities include resource
contention, such as CPU or networking, anything in the middle, such as load
balancers, firewalls, etc., network or system configuration. I heard of one a
while back that ended up being connected to IP table definition. Can't remember
if that was related to docker containers or full VMs. I've heard that there
have been some common problems in some VM environments, but I don't know what
environments or issues specifically. Maybe VMotion. Maybe the network gets
overloaded, especially between VMs. Or interactions between loads on different
VMs. I'm not as familiar with the current state, but in the past in other
environments I have seen more interference between VMs than expected.
It comes down to standard troubleshooting sorts of behavior. Try to catch the
problem. Gather information about different occurrences. Try to isolate any
commonalities. Isolate a system for reproduction.
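One way to look for commonalities is to tally the disconnects by weekday and hour from a saved copy of the log. The following is only a minimal sketch in Python. It assumes each relevant log line starts with a `YYYY-MM-DD HH:MM:SS` timestamp and contains the text `ChannelClosedException` on that same line; the real Jenkins log layout may differ, and the function name is made up for illustration.

```python
import re
from collections import Counter
from datetime import datetime

# Assumption: relevant lines begin with a timestamp such as "2020-04-11 03:17:42".
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")
# Text taken from the error message quoted in this thread.
MARKER = "ChannelClosedException"
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def disconnects_by_hour(lines):
    """Count lines mentioning the marker, keyed by (weekday, hour)."""
    counts = Counter()
    for line in lines:
        if MARKER not in line:
            continue
        match = TIMESTAMP.match(line)
        if not match:
            # Stack-trace lines without a leading timestamp are skipped.
            continue
        when = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        counts[(DAYS[when.weekday()], when.hour)] += 1
    return counts
```

Passing an open log file to `disconnects_by_hour` and printing `most_common()` on the result would show whether the drops cluster around particular times, for example a nightly backup or a weekend maintenance window.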
You could try a different type of agent, such as an SSH Agent. The behavior
might be different. I've heard recently that Microsoft's SSHD implementation
works well.
Good luck on troubleshooting
Jeff
On 4/14/20 8:31 AM, 'monger_39' via Jenkins Users wrote:
Hi Jeff, thx. Last week I disabled the ping-thread on master and slaves by
setting the interval to '-1'. Unfortunately, over the weekend one of the
slaves again went into 'offline' mode (even though the jobs kept on running). It
seems indeed that this does not solve the issue. Or, iow, I think it means that
the disconnect was not caused by the ping-thread(s) timing out.
Which leaves me with the challenge of figuring out what this 'external
something' that you mention could be that would break the remoting. And I
honestly have no idea how to tackle that yet.
The master and the slave are both Windows Server VMs running 6 executor
slots each. The tests we are running make heavy use of TCP communication.
Any idea how to tackle this?
thx, M.
On Thursday, April 9, 2020, 10:53:48 PM GMT+2, Jeff Thompson
<[email protected]> wrote:
On 4/7/20 11:46 PM, 'monger_39' via Jenkins Users wrote:
Hi,
in my Jenkins I am regularly facing master/slave connection drops with a
message like:
hudson.remoting.ChannelClosedException: Channel "unknown": Remote
call on JNLP4-connect connection from IP/IP:58344 failed.
The channel is closing down or has closed down.
Usually these are caused by something external to the Remoting communication
protocol. Most often by something in the system or networking environment.
Sometimes by some bad interaction between plugins that ends up impacting the
channel.
Your best approach is to figure out where these disconnects originate and
resolve the issue.
I have seen a lot of bug-reports on this. For most, a workaround is
advised by disabling the Ping-Thread through setting:
You should be cautious about changing the ping settings or disabling it
entirely. It can cause some weird and unexpected behaviors. If you do change
the settings, I recommend you change one thing at a time and evaluate the
results. If it doesn't make any difference, restore it to its default setting.
And, is there also a slave setting for the timeout value?
It depends on how you launch the agent. Remoting system properties are
described at
https://github.com/jenkinsci/remoting/blob/master/docs/configuration.md
(naming for all these settings does not look to be very consistent...)
Unfortunately, that's the case.
Jeff Thompson
--
You received this message because you are subscribed to the Google Groups
"Jenkins Users" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/jenkinsci-users/ab43b555-176c-4834-e125-fb66ff612f4d%40cloudbees.com
.