[slurm-users] gres/gpu: count changed for node node002 from 0 to 1

2020-03-13 Thread Robert Kudyba
We're running slurm-17.11.12 on Bright Cluster 8.1 and our node002 keeps
going into a draining state:
 sinfo -a
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
defq*up   infinite  1   drng node002

sinfo -N -o "%.20N %.15C %.10t %.10m %.15P %.15G %.35E"
NODELIST  CPUS(A/I/O/T)  STATE  MEMORY  PARTITION  GRES   REASON
node001   9/15/0/24      mix    191800  defq*      gpu:1  none
node002   1/0/23/24      drng   191800  defq*      gpu:1  gres/gpu count changed and jobs are
node003   1/23/0/24      mix    191800  defq*      gpu:1  none

None of the nodes has a separate slurm.conf file; it is all shared from the
head node. What else could be causing this?

[2020-03-13T07:14:28.590] gres/gpu: count changed for node node002 from 0 to 1
[2020-03-13T07:14:28.590] error: _slurm_rpc_node_registration node=node002: Invalid argument
[2020-03-13T07:14:28.590] error: Node node001 appears to have a different slurm.conf than the slurmctld. This could cause issues with communication and functionality. Please review both files and make sure they are the same. If this is expected ignore, and set DebugFlags=NO_CONF_HASH in your slurm.conf.
[2020-03-13T07:14:28.590] error: Node node003 appears to have a different slurm.conf than the slurmctld. This could cause issues with communication and functionality. Please review both files and make sure they are the same. If this is expected ignore, and set DebugFlags=NO_CONF_HASH in your slurm.conf.
[2020-03-13T07:47:48.787] error: Node node001 appears to have a different slurm.conf than the slurmctld. This could cause issues with communication and functionality. Please review both files and make sure they are the same. If this is expected ignore, and set DebugFlags=NO_CONF_HASH in your slurm.conf.
[2020-03-13T07:47:48.787] error: Node node003 appears to have a different slurm.conf than the slurmctld. This could cause issues with communication and functionality. Please review both files and make sure they are the same. If this is expected ignore, and set DebugFlags=NO_CONF_HASH in your slurm.conf.
[2020-03-13T07:47:48.788] gres/gpu: count changed for node node002 from 0 to 1
[2020-03-13T07:47:48.788] error: _slurm_rpc_node_registration node=node002: Invalid argument
[2020-03-13T08:21:08.057] error: Node node001 appears to have a different slurm.conf than the slurmctld. This could cause issues with communication and functionality. Please review both files and make sure they are the same. If this is expected ignore, and set DebugFlags=NO_CONF_HASH in your slurm.conf.
[2020-03-13T08:21:08.058] error: Node node003 appears to have a different slurm.conf than the slurmctld. This could cause issues with communication and functionality. Please review both files and make sure they are the same. If this is expected ignore, and set DebugFlags=NO_CONF_HASH in your slurm.conf.
[2020-03-13T08:21:08.058] gres/gpu: count changed for node node002 from 0 to 1
[2020-03-13T08:21:08.058] error: _slurm_rpc_node_registration node=node002: Invalid argument
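
For reference, these are the checks I plan to run next (a sketch only; the
config paths are assumptions for our install, adjust as needed):

# Compare the config the controller uses with what node002 actually reads
md5sum /etc/slurm/slurm.conf /etc/slurm/gres.conf
ssh node002 'md5sum /etc/slurm/slurm.conf /etc/slurm/gres.conf'

# What node002 would report at registration vs. what the controller believes
ssh node002 slurmd -C
scontrol show node node002 | grep -i -e gres -e reason

# Once the configs agree, clear the drain
scontrol update NodeName=node002 State=RESUME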


Re: [slurm-users] log rotation for slurmctld.

2020-03-13 Thread Bjørn-Helge Mevik
navin srivastava  writes:

> Can I move the log file to some other location, and will a restart/reload of
> the Slurm service then start a new log file?

Yes, restarting it will start a new log file if the old one is moved away.
However, a reconfig will also do, and you can trigger that by sending the
process a HUP signal.  That way you don't have to restart the daemon.  We
have this in our logrotate file:

postrotate
## Using the newer feature of reconfig when getting a SIGHUP.
kill -hup $(ps -C slurmctld h -o pid)
kill -hup $(ps -C slurmdbd h -o pid)
endscript

(That is for both slurmctld.log and slurmdbd.log.)
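
If you want to check a rotation config without touching the live logs, a dry
run along these lines works (the file name is just an example):

# Show what logrotate would do, without actually rotating anything
logrotate -d /etc/logrotate.d/slurm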

-- 
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo




[slurm-users] log rotation for slurmctld.

2020-03-13 Thread navin srivastava
Hi,

I wanted to understand how log rotation for slurmctld works. In my
environment I don't have any log rotation for slurmctld.log, and the log
file has now grown to 125 GB.

Can I move the log file to some other location, and will a restart/reload of
the Slurm service then start a new log file? I think this should work
without any issues. Am I right, or will it cause any problems?

I also need to set up logrotate. Will the config below work as is? I need
to do this on a production environment, so I want to make sure it works
without any issues.

/var/log/slurm/slurmctld.log {
weekly
missingok
notifempty
sharedscripts
create 0600 slurm slurm
rotate 8
compress
postrotate
  /bin/systemctl reload slurmctld.service > /dev/null 2>/dev/null || true
endscript
}
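
To be concrete about the first question, my plan for the existing 125 GB file
is roughly the following (a sketch only; it assumes slurmctld runs under
systemd as slurmctld.service and that its ExecReload sends a SIGHUP):

# Move the huge log aside; slurmctld keeps writing to the moved file until it reopens
mv /var/log/slurm/slurmctld.log /var/log/slurm/slurmctld.log.1

# Trigger a reconfigure so slurmctld reopens slurmctld.log at the original path
systemctl reload slurmctld.service

# Compress the old file once a new slurmctld.log shows up
gzip /var/log/slurm/slurmctld.log.1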



Regards
Navin.


Re: [slurm-users] Error upgrading slurmdbd from 19.05 to 20.02

2020-03-13 Thread Steininger, Herbert
Hi,

I guess I found the problem.

It seems to come from this file:
src/plugins/accounting_storage/mysql/as_mysql_convert.c
in particular from here:

--- code ---
static int _convert_job_table_pre(mysql_conn_t *mysql_conn, char *cluster_name)
{
    int rc = SLURM_SUCCESS;
    char *query = NULL;

    if (db_curr_ver < 8) {
        /*
         * Change the names pack_job_id and pack_job_offset to be het_*
         */
        query = xstrdup_printf(
            "alter table \"%s_%s\" "
            "change pack_job_id het_job_id int unsigned not null, "
            "change pack_job_offset het_job_offset "
            "int unsigned not null;",
            cluster_name, job_table);
    }

    if (query) {
        if (debug_flags & DEBUG_FLAG_DB_QUERY)
            DB_DEBUG(mysql_conn->conn, "query\n%s", query);

        rc = mysql_db_query(mysql_conn, query);
        xfree(query);
        if (rc != SLURM_SUCCESS)
            error("%s: Can't convert %s_%s info: %m",
                  __func__, cluster_name, job_table);
    }

    return rc;
}
--- code ---

It checks whether the version is below 8 and, if so, renames the pack_job_* columns to het_*.
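
With cluster_name being mpip-cluster (as in the table name below), the
generated ALTER TABLE ... CHANGE pack_job_id het_job_id ... statement can only
succeed if the old pack_job_* columns are still present. A quick way to check
from the shell (the exact mysql invocation and credentials will differ):

# List any columns still using the old pack_job_* names; an empty result
# means the rename this conversion step wants to perform has already happened.
mysql slurm_acct_db -e 'SHOW COLUMNS FROM `mpip-cluster_job_table` LIKE "pack_job%"'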

In the convert_version_table, the version is 7:

--- mysql ---
MariaDB [slurm_acct_db]> select * from convert_version_table;
++-+
| mod_time   | version |
++-+
| 1579853103 |   7 |
++-+
1 row in set (0.00 sec)
--- mysql ---


But my table already has the renamed columns:

--- table ---
MariaDB [slurm_acct_db]> show columns from `mpip-cluster_job_table`;
+---------------------+---------------------+------+-----+------------+----------------+
| Field               | Type                | Null | Key | Default    | Extra          |
+---------------------+---------------------+------+-----+------------+----------------+
| job_db_inx          | bigint(20) unsigned | NO   | PRI | NULL       | auto_increment |
| mod_time            | bigint(20) unsigned | NO   |     | 0          |                |
| deleted             | tinyint(4)          | NO   |     | 0          |                |
| account             | tinytext            | YES  |     | NULL       |                |
| admin_comment       | text                | YES  |     | NULL       |                |
| array_task_str      | text                | YES  |     | NULL       |                |
| array_max_tasks     | int(10) unsigned    | NO   |     | 0          |                |
| array_task_pending  | int(10) unsigned    | NO   |     | 0          |                |
| constraints         | text                | YES  |     | NULL       |                |
| cpus_req            | int(10) unsigned    | NO   |     | NULL       |                |
| derived_ec          | int(10) unsigned    | NO   |     | 0          |                |
| derived_es          | text                | YES  |     | NULL       |                |
| exit_code           | int(10) unsigned    | NO   |     | 0          |                |
| flags               | int(10) unsigned    | NO   |     | 0          |                |
| job_name            | tinytext            | NO   |     | NULL       |                |
| id_assoc            | int(10) unsigned    | NO   | MUL | NULL       |                |
| id_array_job        | int(10) unsigned    | NO   | MUL | 0          |                |
| id_array_task       | int(10) unsigned    | NO   |     | 4294967294 |                |
| id_block            | tinytext            | YES  |     | NULL       |                |
| id_job              | int(10) unsigned    | NO   | MUL | NULL       |                |
| id_qos              | int(10) unsigned    | NO   | MUL | 0          |                |
| id_resv             | int(10) unsigned    | NO   | MUL | NULL       |                |
| id_wckey            | int(10) unsigned    | NO   | MUL | NULL       |                |
| id_user             | int(10) unsigned    | NO   | MUL | NULL       |                |
| id_group            | int(10) unsigned    | NO   |     | NULL       |                |
| het_job_id          | int(10) unsigned    | NO   | MUL | NULL       |                |
| het_job_offset      | int(10) unsigned    | NO   |     | NULL       |                |
| kill_requid         | int(11)             | NO   |     | -1         |                |
| state_reason_prev   | int(10) unsigned    | NO   |     | NULL       |                |
| mcs_label           | tinytext            | YES  |     | NULL       |                |
| mem_req             | bigint(20) unsigned | NO   |     | 0          |                |
| nodelist            | text                | YES  |     | NULL       |                |
| nodes_alloc         | int(10) unsigned    | NO   | MUL | NULL       |                |
| node_inx            | text                | YES  |     |

Re: [slurm-users] slurmd -C showing incorrect core count

2020-03-13 Thread Ryan Novosielski
From what I know of how this works, no, it’s not getting it from a local file 
or the master node. I don’t believe it even makes a network connection, nor 
requires a slurm.conf in order to run. If you can run it fresh on a node with 
no config and that’s what it comes up with, it’s probably getting it from the 
VM somehow.
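
If you do end up tracing it, something along these lines should show every
file it opens while detecting the hardware (a sketch only; it assumes strace
is installed on the node):

# Trace file opens while slurmd prints the hardware it detects
strace -f -e trace=open,openat -o /tmp/slurmd-C.trace slurmd -C
# Then look for anything config-like in the trace
grep -E 'slurm|\.conf' /tmp/slurmd-C.trace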

--

|| \\UTGERS, |---*O*---
||_// the State  | Ryan Novosielski - novos...@rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
||  \\of NJ  | Office of Advanced Research Computing - MSB C630, Newark
 `'

> On Mar 11, 2020, at 10:26 AM, mike tie  wrote:
> 
> 
> Yep, slurmd -C is obviously getting the data from somewhere, either a local 
> file or from the master node. Hence my email to the group; I was hoping 
> that someone would just say: "yeah, modify file ". But oh well, I'll 
> start playing with strace and gdb later this week; looking through the 
> source might also be helpful.  
> 
> I'm not cloning existing virtual machines with slurm.  I have access to a 
> vmware system that from time to time isn't running at full capacity;  usage 
> is stable for blocks of a month or two at a time, so my thought/plan was to 
> spin up a slurm compute node  on it, and resize it appropriately every few 
> months (why not put it to work).  I started with 10 cores, and it looks like 
> I can up it to 16 cores for a while, and that's when I ran into the problem.
> 
> -mike
> 
> 
> 
> Michael Tie
> Technical Director
> Mathematics, Statistics, and Computer Science
> 
>  One North College Street   phn: 507-222-4067
>  Northfield, MN 55057        cel: 952-212-8933
>  m...@carleton.edu           fax: 507-222-4312
> 
> 
> 
> On Wed, Mar 11, 2020 at 1:15 AM Kirill 'kkm' Katsnelson  
> wrote:
> On Tue, Mar 10, 2020 at 1:41 PM mike tie  wrote:
> Here is the output of lstopo
> 
> $ lstopo -p
> Machine (63GB)
>   Package P#0 + L3 (16MB)
> L2 (4096KB) + L1d (32KB) + L1i (32KB) + Core P#0 + PU P#0
> L2 (4096KB) + L1d (32KB) + L1i (32KB) + Core P#1 + PU P#1
> L2 (4096KB) + L1d (32KB) + L1i (32KB) + Core P#2 + PU P#2
> L2 (4096KB) + L1d (32KB) + L1i (32KB) + Core P#3 + PU P#3
>   Package P#1 + L3 (16MB)
> L2 (4096KB) + L1d (32KB) + L1i (32KB) + Core P#0 + PU P#4
> L2 (4096KB) + L1d (32KB) + L1i (32KB) + Core P#1 + PU P#5
> L2 (4096KB) + L1d (32KB) + L1i (32KB) + Core P#2 + PU P#6
> L2 (4096KB) + L1d (32KB) + L1i (32KB) + Core P#3 + PU P#7
>   Package P#2 + L3 (16MB)
> L2 (4096KB) + L1d (32KB) + L1i (32KB) + Core P#0 + PU P#8
> L2 (4096KB) + L1d (32KB) + L1i (32KB) + Core P#1 + PU P#9
> L2 (4096KB) + L1d (32KB) + L1i (32KB) + Core P#2 + PU P#10
> L2 (4096KB) + L1d (32KB) + L1i (32KB) + Core P#3 + PU P#11
>   Package P#3 + L3 (16MB)
> L2 (4096KB) + L1d (32KB) + L1i (32KB) + Core P#0 + PU P#12
> L2 (4096KB) + L1d (32KB) + L1i (32KB) + Core P#1 + PU P#13
> L2 (4096KB) + L1d (32KB) + L1i (32KB) + Core P#2 + PU P#14
> L2 (4096KB) + L1d (32KB) + L1i (32KB) + Core P#3 + PU P#15
> 
> There is no sane way to derive the number 10 from this topology, obviously: 
> it has a prime factor of 5, but everything in the lstopo output is sized in 
> powers of 2 (4 packages, a.k.a. sockets, with 4 single-threaded CPU cores each). 
> 
> I responded yesterday but somehow managed to plop my signature into the 
> middle of it, so maybe you have missed inline replies?
> 
> It's very, very likely that the number is stored *somewhere*. First to 
> eliminate is the hypothesis that the number is acquired from the control 
> daemon. That's the simplest step and the largest landgrab in the 
> divide-and-conquer analysis plan. Then just look where it comes from on the 
> VM. strace(1) will reveal all files slurmd reads. 
> 
> You are not rolling out the VMs from an image, are you? I'm wondering why 
> you need to tweak an existing VM that is already in a weird state. Is simply 
> setting its snapshot aside and creating a new one from an image 
> hard/impossible? I have not touched VMware for more than 10 years, so I may be a 
> bit naive; on the platform I'm working with now (GCE), the create-use-drop pattern of 
> VM use is much more common and simpler than creating a VM and maintaining it either 
> *ad infinitum* or *ad nauseam*, whichever is reached first. But I don't know 
> anything about VMware; maybe it's not possible or feasible with it.
> 
>  -kkm
>