[slurm-users] gres/gpu: count changed for node node002 from 0 to 1

2020-03-13 Thread Robert Kudyba
We're running slurm-17.11.12 on Bright Cluster 8.1 and our node002 keeps going into a draining state:
    sinfo -a
    PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
    defq*     up    infinite  1     drng  node002
    sinfo -N -o "%.20N %.15C %.10t %.10m %.15P %.15G %.35E"
    NODELIST

Re: [slurm-users] log rotation for slurmctld.

2020-03-13 Thread Bjørn-Helge Mevik
navin srivastava writes:
> can i move the log file to some other location and then restart.reload of
> slurm service will start a new log file.
Yes, restarting it will start a new log file if the old one is moved away. However, a reconfig will also do, and you can trigger that by sending the
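The move-then-reconfigure approach described in this reply could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper name `rotate_log`, the timestamp suffix, and the example log path are hypothetical, and the sketch uses `scontrol reconfigure` to request the reconfig rather than whatever mechanism the truncated sentence goes on to name.

```python
import os
import subprocess
import time


def rotate_log(log_path, reconfigure_cmd=("scontrol", "reconfigure"),
               run=subprocess.run):
    """Move the current log aside, then ask slurmctld to reconfigure.

    log_path is whatever SlurmctldLogFile points at on your system
    (an assumption here; check your slurm.conf). The `run` parameter is
    injectable so the function can be exercised without a live cluster.
    """
    # Rename within the same directory so the move is cheap even for a
    # very large file; the timestamp suffix is an arbitrary choice.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    rotated = f"{log_path}.{stamp}"
    os.rename(log_path, rotated)

    # Per the reply above, a reconfig makes slurmctld start a new log file.
    run(list(reconfigure_cmd), check=True)
    return rotated


# Example call (hypothetical path, needs permission to run scontrol):
# rotate_log("/var/log/slurmctld.log")
```

The rotated file can then be compressed or deleted separately; for a recurring setup, a logrotate rule doing the same two steps is probably the more usual choice.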

[slurm-users] log rotation for slurmctld.

2020-03-13 Thread navin srivastava
Hi, I wanted to understand how log rotation of slurmctld works. In my environment I don't have any log rotation for slurmctld.log, and the log file has now reached 125GB. Can I move the log file to some other location and then restart.reload of slurm service will start a new log file. i

Re: [slurm-users] Error upgrading slurmdbd from 19.05 to 20.02

2020-03-13 Thread Steininger, Herbert
Hi, I guess I found the problem. It seems to come from this file: src/plugins/accounting_storage/mysql/as_mysql_convert.c, in particular from here:
--- code ---
static int _convert_job_table_pre(mysql_conn_t *mysql_conn, char *cluster_name)
{
	int rc = SLURM_SUCCESS;
	char *query =

Re: [slurm-users] slurmd -C showing incorrect core count

2020-03-13 Thread Ryan Novosielski
From what I know of how this works, no, it’s not getting it from a local file or the master node. I don’t believe it even makes a network connection, nor requires a slurm.conf in order to run. If you can run it fresh on a node with no config and that’s what it comes up with, it’s probably