I looked at the code more closely this morning.

What I found is that a reconfigure invokes logic to clear the "group" usage counts (grp_used_cpus, grp_used_nodes, grp_used_cpu_run_secs, grp_used_jobs, and grp_used_submit_jobs) for a designated QOS, but does nothing to clear the "user" counts. In fact, submit_jobs is the only QOS "user" count that is adjusted during the reconfigure. None of the other QOS "user" counts (maxcpus, maxjobs, maxnodes) are adjusted by the reconfigure. I verified this by instrumenting the logic that increments and decrements the counts.

I have therefore written a simple new function, _clear_qos_job_submit_info, which is invoked from the _clear_used_qos_info function in src/common/assoc_mgr.c. The new function clears the submit_jobs count for each "user" found in the QOS usage->user_limit_list. The logic is isolated, so I am confident it will not introduce any regressions.

Attached is a copy of the patch for 2.3.0.

Best Regards,
Bill
Attachment: assoc_mgr.c.patch
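
For readers without the attachment, the shape of the change described above can be sketched roughly as follows. This is not the attached patch. The struct types below are simplified, hypothetical stand-ins written for illustration (a plain linked list in place of Slurm's own list type and iterators), and only the field and function names mentioned in the message are taken from it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-user usage record on a QOS. */
typedef struct used_limits {
    uint32_t uid;              /* user the counts belong to */
    uint32_t submit_jobs;      /* per-user submitted-job count */
    uint32_t jobs;             /* another per-user count, left untouched */
    struct used_limits *next;  /* assumption: simple singly linked list */
} used_limits_t;

/* Hypothetical stand-in for the QOS usage block. */
typedef struct {
    uint32_t grp_used_cpus;
    uint32_t grp_used_nodes;
    uint64_t grp_used_cpu_run_secs;
    uint32_t grp_used_jobs;
    uint32_t grp_used_submit_jobs;
    used_limits_t *user_limit_list;
} qos_usage_t;

/* Hypothetical stand-in for a QOS record. */
typedef struct {
    qos_usage_t *usage;
} qos_rec_t;

/* Zero the submit_jobs count of every user on the QOS user_limit_list. */
static void _clear_qos_job_submit_info(qos_rec_t *qos)
{
    used_limits_t *user;

    if (!qos || !qos->usage)
        return;
    for (user = qos->usage->user_limit_list; user; user = user->next)
        user->submit_jobs = 0;
}

/* Clear the "group" counts, then the per-user submit counts. */
static void _clear_used_qos_info(qos_rec_t *qos)
{
    if (!qos || !qos->usage)
        return;
    qos->usage->grp_used_cpus = 0;
    qos->usage->grp_used_nodes = 0;
    qos->usage->grp_used_cpu_run_secs = 0;
    qos->usage->grp_used_jobs = 0;
    qos->usage->grp_used_submit_jobs = 0;
    _clear_qos_job_submit_info(qos);
}
```

In this sketch only submit_jobs is reset per user; the other per-user field is deliberately left alone, matching the narrow scope described in the message.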