I think that looks OK. Forget my response.
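
For anyone who wants to repeat the lookup timing quoted below, here is a minimal sketch, not a definitive diagnostic. It assumes Python 3 is available on the node. The helper name time_lookup and its parameters are made up for illustration. socket.getaddrinfo goes through the system resolver configuration, so it should be closer to what getent sees than to nslookup, which queries DNS directly.

```python
import socket
import time


def time_lookup(host, resolver=socket.getaddrinfo, repeats=3):
    """Return wall-clock durations in seconds for resolving `host`.

    `resolver` is any callable taking (host, port); it defaults to
    socket.getaddrinfo, which uses the system resolver configuration.
    """
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        try:
            resolver(host, None)
        except OSError:
            # A failed lookup (socket.gaierror is an OSError) is still timed:
            # the time spent waiting is what matters here.
            pass
        timings.append(time.perf_counter() - start)
    return timings
```

Running time_lookup("ouga20") on g-vm03, and time_lookup("g-vm03") on a worker, gives a few samples per direction. If the first call is slow and later ones are fast, some caching layer is probably involved.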

On Tue, 13 May 2025 at 14:09, Tilman Hoffbauer via slurm-users
<slurm-users@lists.schedmd.com> wrote:

> Thank you for your response. nslookup on e.g. ouga20 is instant, getent
> hosts ouga20 takes about 1.6 seconds from g-vm03. It is about the same
> speed for ouga20 looking up g-vm03.
>
> Is this too slow?
>
> On 5/13/25 15:01, John Hearns wrote:
>
> Stupid response from me. A loooong time ago I had issues with slow
> response on PBS. The cause was name resolution.
>
> On your setup is name resolution OK? Can you look up host names without
> delays?
>
> On Tue, 13 May 2025 at 13:50, Tilman Hoffbauer via slurm-users
> <slurm-users@lists.schedmd.com> wrote:
>
>> Hello,
>>
>> we are running a SLURM-managed cluster with one control node (g-vm03) and
>> 26 worker nodes (ouga[03-28]) on Rocky 8. We recently updated from 20.11.9
>> through 23.02.8 to 24.11.0 and then 24.11.5. Since then, we are
>> experiencing performance issues: squeue and scontrol ping are slow to
>> react and sometimes deliver "timeout on send/recv" messages, even with only
>> very few parallel requests. We did not experience these issues with SLURM
>> 20.11.9; we did not check the intermediate version 23.02.8 in detail.
>> In the log of slurmctld, we can also find messages like
>>
>> slurmctld: error: slurm_send_node_msg: [socket:[1272743]]
>> slurm_bufs_sendto(msg_type=RESPONSE_JOB_INFO) failed: Unexpected missing
>> socket error
>>
>> We thus implemented all recommendations from the high throughput
>> documentation, and did achieve improvements with it (most notably by
>> increasing the maximum number of open files and increasing MessageTimeout
>> and TCPTimeout).
>>
>> For debugging, I attached the slurm.conf, the sdiag output (the server
>> thread count is almost always 1 and sometimes increases to 2), the
>> slurmctld log and the slurmdbd log from a time of high load.
>>
>> We would be very thankful for any input on how to restore the old
>> performance.
>>
>> Kind Regards,
>> Tilman Hoffbauer

-- slurm-users mailing list -- slurm-users@lists.schedmd.com To unsubscribe send an email to slurm-users-le...@lists.schedmd.com