Re: help needed to manage s390x host for ci.debian.net

2023-02-12 Thread Philipp Kern

Hi,

On 12.02.23 22:38, Paul Gevers wrote:
> I have munin [1], but as said, I'm not a trained sysadmin. I don't know
> what I'm looking for if you ask "statistics on the network".


This is more of a software development / devops question than a sysadmin 
question, but alas. What I am interested in is *application-level* 
logging on reconnects. Presumably the connection to RabbitMQ is 
outbound? Is it tunneled? Does your application log somewhere when a 
reconnect happens? Does it say when it successfully connected?


I'd expect good software to log something like this:

[10:00:00] Connecting to broker "rabbitmq.debci.debian.net:12345"...
[10:00:05] Connected to broker "rabbitmq.debci.debian.net:12345".

And also:

[10:00:00] Connecting to broker "rabbitmq.debci.debian.net:12345"...
[10:00:01] Connection to broker "rabbitmq.debci.debian.net:12345" failed: Connection refused
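
A minimal sketch of how an application could emit log lines like these, in Python. It assumes nothing about the actual debci code: `connect_with_logging` and the `connect` callable passed to it are hypothetical names, and the retry count and delay are arbitrary illustration values.

```python
import logging
import time

logging.basicConfig(format="[%(asctime)s] %(message)s", datefmt="%H:%M:%S",
                    level=logging.INFO)
log = logging.getLogger("broker")


def connect_with_logging(connect, address, retries=3, delay=1.0):
    """Call connect(address), logging every attempt and its outcome.

    `connect` is a hypothetical callable standing in for whatever the
    application really uses to open its RabbitMQ connection.
    """
    for attempt in range(1, retries + 1):
        log.info('Connecting to broker "%s"...', address)
        started = time.monotonic()
        try:
            connection = connect(address)
        except OSError as exc:
            # Covers "Connection refused", timeouts and similar socket errors.
            log.warning('Connection to broker "%s" failed: %s', address, exc)
            if attempt < retries:
                time.sleep(delay)
            continue
        log.info('Connected to broker "%s" after %.1fs.', address,
                 time.monotonic() - started)
        return connection
    raise ConnectionError(f"giving up on {address} after {retries} attempts")
```

With lines like these in the worker's log, counting reconnects and seeing how long each one takes becomes a matter of grepping the log.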


Kind regards
Philipp Kern



RE: help needed to manage s390x host for ci.debian.net

2023-02-12 Thread Dipak Zope1
I am not a CI/networking expert, but I will be more than happy to assist.
I am in the UTC+05:30 time zone and widely available.

Thanks,
-Dipak Zope
Debian s390 porting team

On 13/02/23, 3:09 AM, "Paul Gevers" wrote:
Hi Phil and all others offering help,

On 12-02-2023 20:32, Philipp Kern wrote:
> On 11.02.23 18:18, Paul Gevers wrote:
>> * [suspect 1] network issues between the s390x and the main ci.d.n
>> server (the results (log files) of the autopkgtests are transferred to
>> the main server). Our ppc64el hosts are also located at Marist, so I
>> would expect commonality here, but also ppc64el isn't performing
>> great, so maybe part of the problem is common.
>
> Do you have any kind of statistics on the network connections? I.e. how
> often it reconnects and how long it takes to reconnect? The Marist
> network has a very weird firewall inbound (e.g. if I do too many SSH
> requests in a row, I'm blackholed) - so I would not be surprised if there
> is some weirdness there.

I have munin [1], but as said, I'm not a trained sysadmin. I don't know
what I'm looking for if you ask "statistics on the network".

Also, I have no experience with s390x except for deploying the Debian
software on the server set up by Phil. All the quirks of s390x are beyond me.

I can provide logging from the host, but I'll need detailed instructions
of what people find useful to look at. Recently Antonio taught me a
trick to provide temporary access to a lxc container on any of our
hosts, so if it helps to be on the host (but inside lxc) we can provide
for that.

Paul

[1]
https://ci.debian.net/munin/ci-worker-s390x-01/ci-worker-s390x-01/index.html



Re: help needed to manage s390x host for ci.debian.net

2023-02-12 Thread Paul Gevers

Hi Phil and all others offering help,

On 12-02-2023 20:32, Philipp Kern wrote:
> On 11.02.23 18:18, Paul Gevers wrote:
>> * [suspect 1] network issues between the s390x and the main ci.d.n
>> server (the results (log files) of the autopkgtests are transferred to
>> the main server). Our ppc64el hosts are also located at Marist, so I
>> would expect commonality here, but also ppc64el isn't performing
>> great, so maybe part of the problem is common.
>
> Do you have any kind of statistics on the network connections? I.e. how
> often it reconnects and how long it takes to reconnect? The Marist
> network has a very weird firewall inbound (e.g. if I do too many SSH
> requests in a row, I'm blackholed) - so I would not be surprised if there
> is some weirdness there.


I have munin [1], but as said, I'm not a trained sysadmin. I don't know 
what I'm looking for if you ask "statistics on the network".


Also, I have no experience with s390x except for deploying the Debian
software on the server set up by Phil. All the quirks of s390x are beyond me.


I can provide logging from the host, but I'll need detailed instructions 
of what people find useful to look at. Recently Antonio taught me a 
trick to provide temporary access to a lxc container on any of our 
hosts, so if it helps to be on the host (but inside lxc) we can provide 
for that.


Paul

[1] 
https://ci.debian.net/munin/ci-worker-s390x-01/ci-worker-s390x-01/index.html




Re: help needed to manage s390x host for ci.debian.net

2023-02-12 Thread Philipp Kern

On 11.02.23 18:18, Paul Gevers wrote:
> * [suspect 1] network issues between the s390x and the main ci.d.n
> server (the results (log files) of the autopkgtests are transferred to
> the main server). Our ppc64el hosts are also located at Marist, so I
> would expect commonality here, but also ppc64el isn't performing great,
> so maybe part of the problem is common.


Do you have any kind of statistics on the network connections? I.e. how 
often it reconnects and how long it takes to reconnect? The Marist 
network has a very weird firewall inbound (e.g. if I do too many SSH
requests in a row, I'm blackholed) - so I would not be surprised if there
is some weirdness there.
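
To make "how often" and "how long" concrete, here is a rough Python sketch that derives both numbers from a log file. It assumes the application writes timestamped lines of the form `[10:00:00] Connecting to broker ...` and `[10:00:05] Connected to broker ...`; that format is an assumption for illustration, not something the CI software is known to produce, and `reconnect_stats` is a hypothetical helper name. Timestamps without a date are assumed to fall on the same day.

```python
import re
from datetime import datetime

# Assumed line format: "[HH:MM:SS] Connecting to broker ..." / "Connected to broker ..."
LINE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\] (Connecting to|Connected to) broker")


def reconnect_stats(lines):
    """Return (attempts, successes, seconds_per_success) for the log lines."""
    attempts = 0
    durations = []
    started = None
    for line in lines:
        match = LINE.match(line)
        if not match:
            continue  # failure messages and unrelated lines are skipped
        stamp = datetime.strptime(match.group(1), "%H:%M:%S")
        if match.group(2) == "Connecting to":
            attempts += 1
            # Keep the first attempt of a streak so retries count toward the outage.
            if started is None:
                started = stamp
        elif started is not None:
            durations.append((stamp - started).total_seconds())
            started = None
    return attempts, len(durations), durations
```

Many attempts per success, or long durations, would point towards the network or the firewall rather than the worker itself.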


Kind regards
Philipp Kern