Re: [slurm-users] Node OverSubscribe even if set to no

2018-04-17 Thread Stéphane Larose
Hi all,

I found a way to avoid oversubscription: I had to comment out this configuration:

PreemptMode=Suspend,Gang
PreemptType=preempt/partition_prio

In my current configuration, all the partitions are at the same priority. At
times, I increase the priority of a partition and jobs in other partitions are
suspended. That works fine. But I still do not understand why oversubscription
occurs when preemption is activated. I would like to keep preemption by
suspending without getting oversubscription. If anyone has an idea of how to do
this, please let me know.
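For reference, the active preemption settings can be double-checked with something like this (a quick sketch):

    scontrol show config | egrep 'PreemptMode|PreemptType'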

Thank you!

Stéphane

-----Original Message-----
From: Stéphane Larose
Sent: 17 April 2018 10:02
To: 'Slurm User Community List'
Subject: RE: [slurm-users] Node OverSubscribe even if set to no

Hi Chris,

> You might want to double check the config is acting as expected with:
>
> scontrol show part | fgrep OverSubscribe

   PriorityJobFactor=10 PriorityTier=10 RootOnly=NO ReqResv=NO OverSubscribe=NO
   PriorityJobFactor=10 PriorityTier=10 RootOnly=NO ReqResv=NO OverSubscribe=NO
   PriorityJobFactor=10 PriorityTier=10 RootOnly=NO ReqResv=NO OverSubscribe=NO
   PriorityJobFactor=10 PriorityTier=10 RootOnly=NO ReqResv=NO OverSubscribe=NO
   PriorityJobFactor=10 PriorityTier=10 RootOnly=NO ReqResv=NO OverSubscribe=NO
   PriorityJobFactor=10 PriorityTier=10 RootOnly=NO ReqResv=NO OverSubscribe=NO
   PriorityJobFactor=10 PriorityTier=10 RootOnly=NO ReqResv=NO OverSubscribe=NO
   PriorityJobFactor=10 PriorityTier=10 RootOnly=NO ReqResv=NO OverSubscribe=NO
   PriorityJobFactor=10 PriorityTier=10 RootOnly=NO ReqResv=NO OverSubscribe=NO

> Also what does this say?
>
> scontrol show config | fgrep SelectTypeParameters

SelectTypeParameters= CR_CPU_MEMORY

From the doc, it seems that only CR_Memory implies OverSubscribe=YES :
All CR_s assume OverSubscribe=No or OverSubscribe=Force EXCEPT for CR_MEMORY 
which assumes OverSubscribe=Yes

When I do "scontrol list jobs", all jobs have OverSubscribe=OK (which is not 
Yes). Again from the docs it seems fine: "OK" otherwise (typically allocated 
dedicated CPUs)

Thanks again,

Stéphane

-----Original Message-----
From: slurm-users On Behalf Of Chris Samuel
Sent: 17 April 2018 04:29
To: slurm-users@lists.schedmd.com
Subject: Re: [slurm-users] Node OverSubscribe even if set to no

On Tuesday, 17 April 2018 5:26:26 AM AEST Stéphane Larose wrote:

> So some jobs are now sharing the same cores but I don’t understand why 
> since OverSubscribe is set to no.

You might want to double check the config is acting as expected with:

scontrol show part | fgrep OverSubscribe

Also what does this say?

scontrol show config | fgrep SelectTypeParameters

I note that if you've got CR_Memory then:

 CR_Memory
Memory  is  a  consumable  resource.   NOTE:  This
implies OverSubscribe=YES  or  OverSubscribe=FORCE
for  all  partitions.  Setting a value for DefMem‐
PerCPU is strongly recommended.

cheers,
Chris
--
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC




Re: [slurm-users] ulimit in sbatch script

2018-04-17 Thread Mahmood Naderan
Great. Thank you very much. It passed the problematic point.



On Tue, Apr 17, 2018, 19:24 Ole Holm Nielsen 
wrote:

> On 04/17/2018 04:38 PM, Mahmood Naderan wrote:
> > That parameter is used in slurm.conf. Should I modify that only on the
> > head node? Or all nodes? Then should I restart slurm processes?
>
> Yes, definitely!  I collected the detailed instructions here:
>
> https://wiki.fysik.dtu.dk/niflheim/Slurm_configuration#reconfiguration-of-slurm-conf
>
> /Ole
>
>


[slurm-users] What can cause a job to get killed?

2018-04-17 Thread Andy Riebs
I had a job running last night, with a 30 minute timeout. (It's a 
well-tested script that runs multiple times daily.)


On one run, in the middle of a set of runs for this job, I got this on the 
console after about 8 minutes:


srun: forcing job termination
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
slurmstepd: error: *** STEP 617845.0 ON node01 CANCELLED AT 
2018-04-17T00:36:58 ***

srun: error: node14 : tasks 3680,3682-3683: Killed
srun: Terminating job step 617845.0

Slurmctld duly reports that the job terminated with "WTERMSIG 9", and 
the slurmd logs also indicate "task  (Y) exited. Killed by 
signal 9."


Any thoughts about why a job would get cancelled without getting any 
more detail than this?
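For anyone digging into something similar, accounting may hold a little more context, assuming sacct is enabled on the cluster; a sketch:

    sacct -j 617845 --format=JobID,JobName,State,ExitCode,Elapsed,NodeList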


Andy

--
Andy Riebs
andy.ri...@hpe.com
Hewlett-Packard Enterprise
High Performance Computing Software Engineering
+1 404 648 9024
My opinions are not necessarily those of HPE
May the source be with you!




[slurm-users] configure --htmldir

2018-04-17 Thread Jason Bacon


FYI, I just discovered that doc/man/man1/Makefile does not respect 
configure's --htmldir flag:


[root@centosdev slurm]# fgrep '$ ./configure' work/slurm-17.11.5/config.log
  $ ./configure --bindir=/usr/pkg/bin 
--htmldir=/usr/pkg/share/doc/slurm-wlm/html --with-munge=/usr/pkg 
--with-hwloc=/usr/pkg --with-json=/usr/pkg --with-libssh2=/usr/pkg 
--prefix=/usr/pkg --build=x86_64-redhat-linux --host=x86_64-redhat-linux 
--mandir=/usr/pkg/man


[root@centosdev slurm]# grep ^htmldir work/slurm-17.11.5/Makefile 
work/slurm-17.11.5/doc/man/man1/Makefile

work/slurm-17.11.5/Makefile:htmldir = /usr/pkg/share/doc/slurm-wlm/html
work/slurm-17.11.5/doc/man/man1/Makefile:htmldir = 
${datadir}/doc/${PACKAGE}-${SLURM_VERSION_STRING}/html


Is this a bug or a feature?

Most of the generated Makefiles do respect --htmldir, but it looks like 
everything under doc does not:


[root@centosdev slurm]# find work/slurm-17.11.5/ -name Makefile -exec 
grep -H ^htmldir '{}' \; | fgrep SLURM_VERSION
work/slurm-17.11.5/doc/man/man5/Makefile:htmldir = 
${datadir}/doc/${PACKAGE}-${SLURM_VERSION_STRING}/html
work/slurm-17.11.5/doc/man/man8/Makefile:htmldir = 
${datadir}/doc/${PACKAGE}-${SLURM_VERSION_STRING}/html
work/slurm-17.11.5/doc/man/man1/Makefile:htmldir = 
${datadir}/doc/${PACKAGE}-${SLURM_VERSION_STRING}/html
work/slurm-17.11.5/doc/html/Makefile:htmldir = 
${datadir}/doc/${PACKAGE}-${SLURM_VERSION_STRING}/html


I can do a patch in the pkgsrc package to work around it, but maybe this 
needs to be fixed upstream...
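One possible shape for such a workaround, as an untested sketch that simply forces the configured path into the doc Makefiles after ./configure:

    find work/slurm-17.11.5/doc -name Makefile -exec \
        sed -i 's|^htmldir = .*|htmldir = /usr/pkg/share/doc/slurm-wlm/html|' '{}' \;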


Cheers,

    Jason

--
Earth is a beta site.




[slurm-users] SLURM Operator Role (to cancel SLURM Jobs)

2018-04-17 Thread Buckley, Ronan
Hi,

I have given four users the operator role, and they are all coordinators of the 
relevant accounts. However, when I su to these users, they get a 
permission-denied error when trying to cancel a job.
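For reference, this is roughly how the operator level is granted and checked with sacctmgr (a sketch with a hypothetical user name):

    sacctmgr modify user where name=jdoe set AdminLevel=Operator
    sacctmgr show user where name=jdoe format=User,AdminLevel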

What am I missing?

Ronan



Re: [slurm-users] ulimit in sbatch script

2018-04-17 Thread Ole Holm Nielsen

On 04/17/2018 04:38 PM, Mahmood Naderan wrote:

That parameter is used in slurm.conf. Should I modify that only on the
head node? Or all nodes? Then should I restart slurm processes?


Yes, definitely!  I collected the detailed instructions here:
https://wiki.fysik.dtu.dk/niflheim/Slurm_configuration#reconfiguration-of-slurm-conf
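In short: keep slurm.conf identical everywhere, then tell the daemons to re-read it. A rough sketch (assumes ClusterShell-style tooling; some parameters still require a daemon restart rather than a reconfigure):

    clush -a --copy /etc/slurm/slurm.conf --dest /etc/slurm/
    scontrol reconfigure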

/Ole



Re: [slurm-users] ulimit in sbatch script

2018-04-17 Thread Mahmood Naderan
That parameter is used in slurm.conf. Should I modify that only on the
head node? Or all nodes? Then should I restart slurm processes?

Regards,
Mahmood




On Tue, Apr 17, 2018 at 4:18 PM, Chris Samuel  wrote:
> On Tuesday, 17 April 2018 7:23:40 PM AEST Mahmood Naderan wrote:
>
>> [hamid@rocks7 case1_source2]$  scontrol show config | fgrep VSizeFactor
>> VSizeFactor = 110 percent
>
> Great, I think that's the cause of the limit you are seeing..
>
>VSizeFactor
>   Memory  specifications in job requests apply to real memory size
>   (also known as resident set size). It  is  possible  to  enforce
>   virtual  memory  limits  for both jobs and job steps by limiting
>   their virtual memory to some percentage  of  their  real  memory
>   allocation. The VSizeFactor parameter specifies the job's or job
>   step's virtual memory limit as a percentage of its  real  memory
>   limit.  For  example,  if a job's real memory limit is 500MB and
>   VSizeFactor is set to 101 then the job will  be  killed  if  its
>   real  memory  exceeds  500MB or its virtual memory exceeds 505MB
>   (101 percent of the real memory limit).  The default value is 0,
>   which  disables enforcement of virtual memory limits.  The value
>   may not exceed 65533 percent.
>
> Setting it to 0 should make that limit go away.
>
> All the best,
> Chris
> --
>  Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
>
>



[slurm-users] Recurring error

2018-04-17 Thread Valerio Bellizzomi
Hello,

I have a recurring error in the log of slurmctld:

[2018-04-10T19:32:40.145] error: _unpack_ret_list: message type 24949,
record 0 of 56214
[2018-04-10T19:32:40.145] error: invalid type trying to be freed 24949
[2018-04-10T19:32:40.145] error: unpacking header
[2018-04-10T19:32:40.145] error: destroy_forward: no init
[2018-04-10T19:32:40.145] error: slurm_receive_msgs: Message receive
failure
[2018-04-10T19:34:20.161] error: _unpack_ret_list: message type 24949,
record 0 of 56218
[2018-04-10T19:34:20.161] error: invalid type trying to be freed 24949
[2018-04-10T19:34:20.161] error: unpacking header
[2018-04-10T19:34:20.161] error: destroy_forward: no init
[2018-04-10T19:34:20.161] error: slurm_receive_msgs: Message receive
failure




This happened while different versions of Slurm were in use: 17.11 on the
compute nodes and 17.02 on the controller.

After updating everything to the latest version, the error disappeared.
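For anyone hitting the same messages, a quick way to spot such a version mismatch (a sketch; the %v field may not be available in very old sinfo versions):

    sinfo -h -o '%n %v' | sort | uniq -c
    scontrol show config | fgrep SLURM_VERSION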







Re: [slurm-users] Node OverSubscribe even if set to no

2018-04-17 Thread Stéphane Larose
Hi Chris,

> You might want to double check the config is acting as expected with:
>
> scontrol show part | fgrep OverSubscribe

   PriorityJobFactor=10 PriorityTier=10 RootOnly=NO ReqResv=NO OverSubscribe=NO
   PriorityJobFactor=10 PriorityTier=10 RootOnly=NO ReqResv=NO OverSubscribe=NO
   PriorityJobFactor=10 PriorityTier=10 RootOnly=NO ReqResv=NO OverSubscribe=NO
   PriorityJobFactor=10 PriorityTier=10 RootOnly=NO ReqResv=NO OverSubscribe=NO
   PriorityJobFactor=10 PriorityTier=10 RootOnly=NO ReqResv=NO OverSubscribe=NO
   PriorityJobFactor=10 PriorityTier=10 RootOnly=NO ReqResv=NO OverSubscribe=NO
   PriorityJobFactor=10 PriorityTier=10 RootOnly=NO ReqResv=NO OverSubscribe=NO
   PriorityJobFactor=10 PriorityTier=10 RootOnly=NO ReqResv=NO OverSubscribe=NO
   PriorityJobFactor=10 PriorityTier=10 RootOnly=NO ReqResv=NO OverSubscribe=NO

> Also what does this say?
>
> scontrol show config | fgrep SelectTypeParameters

SelectTypeParameters= CR_CPU_MEMORY

From the doc, it seems that only CR_Memory implies OverSubscribe=YES :
All CR_s assume OverSubscribe=No or OverSubscribe=Force EXCEPT for CR_MEMORY 
which assumes OverSubscribe=Yes

When I do "scontrol list jobs", all jobs have OverSubscribe=OK (which is not 
Yes). Again from the docs it seems fine: "OK" otherwise (typically allocated 
dedicated CPUs)
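For completeness, the per-job field can be pulled out the same way as the partition one:

    scontrol show jobs | fgrep OverSubscribe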

Thanks again,

Stéphane

-----Original Message-----
From: slurm-users On Behalf Of Chris Samuel
Sent: 17 April 2018 04:29
To: slurm-users@lists.schedmd.com
Subject: Re: [slurm-users] Node OverSubscribe even if set to no

On Tuesday, 17 April 2018 5:26:26 AM AEST Stéphane Larose wrote:

> So some jobs are now sharing the same cores but I don’t understand why 
> since OverSubscribe is set to no.

You might want to double check the config is acting as expected with:

scontrol show part | fgrep OverSubscribe

Also what does this say?

scontrol show config | fgrep SelectTypeParameters

I note that if you've got CR_Memory then:

 CR_Memory
Memory  is  a  consumable  resource.   NOTE:  This
implies OverSubscribe=YES  or  OverSubscribe=FORCE
for  all  partitions.  Setting a value for DefMem‐
PerCPU is strongly recommended.

cheers,
Chris
--
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC




Re: [slurm-users] Way MaxRSS should be interpreted

2018-04-17 Thread E.S. Rosenberg
Hi Gareth,
Your assessment is also what I would have thought: MaxRSS should be the
maximum, over samples, of the sum of all RSS values. Swap and shared memory do
complicate things, but I think most people expect jobs to be killed only if
their RSS exceeds their memory request.

That being said, as far as I understand the current Slurm reporting
mechanisms, there is actually no way to get the total MaxRSS of a job, only
that of whatever step/subjob/thread was largest in memory.
Thanks,
Eli

On Tue, Apr 17, 2018 at 4:03 PM,  wrote:

> I think the situation is likely to be a little different. Let’s consider a
> fortran program that statically or dynamically defines large arrays. This
> defines a virtual memory size – like declaring that this is the maximum
> amount of memory you might use if you fill the arrays. That amount of real
> memory + swap must be available for the program to run – after all, you
> might use that amount…  Speaking loosely, linux has a soft memory
> allocation policy so memory may not actually be allocated until it is used.
> If the program happens to read a smaller dataset and the arrays are not
> filled then the resident set size may be significantly smaller than the
> virtual memory size.  Further, memory swapped doesn’t count to the RSS so
> it might be even smaller. Effectively RSS for a process is the actual
> footprint in RAM. It will change over the life of the process/job and slurm
> will track the maximum (MaxRSS). I’d actually expect MaxRSS to be the
> maximum of the sum of RSS of known processes as sampled periodically
> through the job – but I’m guessing. This should apply reasonably to
> parallel jobs if the sum spans nodes (or it wouldn’t be the first batch
> system to only effectively account for the first allocated node). The whole
> linux memory tracking/accounting system has gotchas as shared memory (say
> for library code) has to be accounted for somewhere, but we can reasonably
> assume in HPC that memory use is dominated by unique computational working
> set data – so MaxRSS is a good estimate of how much RAM is needed to run a
> given job.
>
>
>
> Gareth
>
>
>
> *From:* slurm-users [mailto:slurm-users-boun...@lists.schedmd.com] *On
> Behalf Of *E.S. Rosenberg
> *Sent:* Tuesday, 17 April 2018 10:42 PM
> *To:* Slurm User Community List 
> *Subject:* Re: [slurm-users] Way MaxRSS should be interpreted
>
>
>
> Hi Loris,
>
> Thanks for your explanation!
>
> I would have interpreted as max(sum()).
>
>
>
> Is there a way to get max(sum()) or at least sum form of sum()? The
> assumption that all processes are peaking at the same value is not a valid
> one unless all threads have essentially the same workload...
>
> Thanks again!
>
> Eli
>
>
>
> On Tue, Apr 17, 2018 at 2:09 PM, Loris Bennett 
> wrote:
>
> Hi Eli,
>
> "E.S. Rosenberg"  writes:
>
> > Hi fellow slurm users,
> > We have been struggling for a while with understanding how MaxRSS is
> reported.
> >
> > This because jobs often die with MaxRSS not even approaching 10% of the
> requested memory sometimes.
> >
> > I just found the following document:
> > https://research.csc.fi/-/a
> >
> > It says:
> > "maxrss = maximum amount of memory used at any time by any process in
> that job. This applies directly for serial jobs. For parallel jobs you need
> to multiply with the number of cores (max 16 or 24 as this is
> > reported only for that node that used the most memory)"
> >
> > While 'man sacct' says:
> > "Maximum resident set size of all tasks in job."
> >
> > Which explanation is correct? How should I be interpreting MaxRSS?
>
> As far as I can tell, both explanations are correct, but the
> text in 'man acct' is confusing.
>
>   "Maximum resident set size of all tasks in job."
>
> is analogous to
>
>   "maximum height of all people in the room"
>
> rather than
>
>   "total height of all people in the room"
>
> More specifically it means
>
>   "Maximum individual resident set size out of the group of resident set
>   sizes associated with all tasks in job."
>
> It doesn't mean
>
>   "Sum of the resident set sizes of all the tasks"
>
> I'm a native English-speaker and I keep on stumbling over this in 'man
> sacct' and then remembering that I have already worked out how it was
> supposed to be interpreted.
>
> My suggestion for improving this would be
>
>   "Maximum individual resident set size of all resident set sizes
>   associated with the tasks in job."
>
> It's a little clunky, but I hope it is clearer.
>
> Cheers,
>
> Loris
>
> --
> Dr. Loris Bennett (Mr.)
> ZEDAT, Freie Universität Berlin Email loris.benn...@fu-berlin.de
>
>
>


Re: [slurm-users] "allocated+" status

2018-04-17 Thread Andy Riebs

Hmmm... the man page says of "reduce_completing_frag,"

"By default if a job is found completing then no jobs are scheduled. If 
this parameter is used the node in a completing job are taken out of 
consideration."


This feels like it's missing a word or two. The first sentence says 
that, by default, no jobs are scheduled if another job is found 
"completing." The second sentence suggests that this parameter must be 
set to prevent jobs from being scheduled in this situation.


BTW, the first sentence describes what I had expected to  be the case.

What am I missing?

Andy

On 04/16/2018 02:15 PM, Kilian Cavalotti wrote:

Hi Andy,

On Mon, Apr 16, 2018 at 8:43 AM, Andy Riebs  wrote:

I hadn't realized that jobs can be scheduled to run on a node that is still
in "completing" state from an earlier job. We occasionally use epilog
scripts that can take 30 seconds or longer, and we really don't want the
next job to start until the epilog scripts have completed.

Other than coding a little loop to wait until the desired nodes are "idle"
before scheduling a job, is there an automated way to say "don't start a job
on a node until it reaches 'idle' status?"

I'd recommend taking a look at the following options in slurm.conf:
* CompleteWait,
* reduce_completing_frag (in SchedulerParams).
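A minimal slurm.conf sketch along those lines (exact semantics per slurm.conf(5); the 35-second value assumes epilogs of roughly 30 seconds, and reduce_completing_frag needs a recent enough Slurm):

    CompleteWait=35
    SchedulerParameters=reduce_completing_frag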

Cheers,





Re: [slurm-users] Way MaxRSS should be interpreted

2018-04-17 Thread Gareth.Williams
I think the situation is likely to be a little different. Let’s consider a 
fortran program that statically or dynamically defines large arrays. This 
defines a virtual memory size – like declaring that this is the maximum amount 
of memory you might use if you fill the arrays. That amount of real memory + 
swap must be available for the program to run – after all, you might use that 
amount…  Speaking loosely, linux has a soft memory allocation policy so memory 
may not actually be allocated until it is used. If the program happens to read 
a smaller dataset and the arrays are not filled then the resident set size may 
be significantly smaller than the virtual memory size. Further, swapped-out memory 
doesn't count toward the RSS, so it might be even smaller. Effectively, RSS for a 
process is its actual footprint in RAM. It will change over the life of the 
process/job, and Slurm will track the maximum (MaxRSS). I'd actually expect 
MaxRSS to be the maximum of the sum of RSS of known processes as sampled 
periodically through the job – but I’m guessing. This should apply reasonably 
to parallel jobs if the sum spans nodes (or it wouldn’t be the first batch 
system to only effectively account for the first allocated node). The whole 
linux memory tracking/accounting system has gotchas as shared memory (say for 
library code) has to be accounted for somewhere, but we can reasonably assume 
in HPC that memory use is dominated by unique computational working set data – 
so MaxRSS is a good estimate of how much RAM is needed to run a given job.
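If you want to see which task and node produced the peak, sacct can break it down per step; a sketch (field names as listed by sacct --helpformat):

    sacct -j <jobid> --format=JobID,NTasks,MaxRSS,MaxRSSTask,MaxRSSNode,AveRSS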

Gareth

From: slurm-users [mailto:slurm-users-boun...@lists.schedmd.com] On Behalf Of 
E.S. Rosenberg
Sent: Tuesday, 17 April 2018 10:42 PM
To: Slurm User Community List 
Subject: Re: [slurm-users] Way MaxRSS should be interpreted

Hi Loris,
Thanks for your explanation!
I would have interpreted as max(sum()).

Is there a way to get max(sum()) or at least sum form of sum()? The assumption 
that all processes are peaking at the same value is not a valid one unless all 
threads have essentially the same workload...
Thanks again!
Eli

On Tue, Apr 17, 2018 at 2:09 PM, Loris Bennett <loris.benn...@fu-berlin.de> wrote:
Hi Eli,

"E.S. Rosenberg" 
mailto:esr%2bslurm-...@mail.hebrew.edu>> writes:

> Hi fellow slurm users,
> We have been struggling for a while with understanding how MaxRSS is reported.
>
> This because jobs often die with MaxRSS not even approaching 10% of the 
> requested memory sometimes.
>
> I just found the following document:
> https://research.csc.fi/-/a
>
> It says:
> "maxrss = maximum amount of memory used at any time by any process in that 
> job. This applies directly for serial jobs. For parallel jobs you need to 
> multiply with the number of cores (max 16 or 24 as this is
> reported only for that node that used the most memory)"
>
> While 'man sacct' says:
> "Maximum resident set size of all tasks in job."
>
> Which explanation is correct? How should I be interpreting MaxRSS?

As far as I can tell, both explanations are correct, but the
text in 'man acct' is confusing.

  "Maximum resident set size of all tasks in job."

is analogous to

  "maximum height of all people in the room"

rather than

  "total height of all people in the room"

More specifically it means

  "Maximum individual resident set size out of the group of resident set
  sizes associated with all tasks in job."

It doesn't mean

  "Sum of the resident set sizes of all the tasks"

I'm a native English-speaker and I keep on stumbling over this in 'man
sacct' and then remembering that I have already worked out how it was
supposed to be interpreted.

My suggestion for improving this would be

  "Maximum individual resident set size of all resident set sizes
  associated with the tasks in job."

It's a little clunky, but I hope it is clearer.

Cheers,

Loris

--
Dr. Loris Bennett (Mr.)
ZEDAT, Freie Universität Berlin Email 
loris.benn...@fu-berlin.de



Re: [slurm-users] Way MaxRSS should be interpreted

2018-04-17 Thread E.S. Rosenberg
Hi Loris,
Thanks for your explanation!
I would have interpreted it as max(sum()).

Is there a way to get max(sum()), or at least some form of sum()? The
assumption that all processes are peaking at the same value is not a valid
one unless all threads have essentially the same workload...
Thanks again!
Eli

On Tue, Apr 17, 2018 at 2:09 PM, Loris Bennett 
wrote:

> Hi Eli,
>
> "E.S. Rosenberg"  writes:
>
> > Hi fellow slurm users,
> > We have been struggling for a while with understanding how MaxRSS is
> reported.
> >
> > This because jobs often die with MaxRSS not even approaching 10% of the
> requested memory sometimes.
> >
> > I just found the following document:
> > https://research.csc.fi/-/a
> >
> > It says:
> > "maxrss = maximum amount of memory used at any time by any process in
> that job. This applies directly for serial jobs. For parallel jobs you need
> to multiply with the number of cores (max 16 or 24 as this is
> > reported only for that node that used the most memory)"
> >
> > While 'man sacct' says:
> > "Maximum resident set size of all tasks in job."
> >
> > Which explanation is correct? How should I be interpreting MaxRSS?
>
> As far as I can tell, both explanations are correct, but the
> text in 'man acct' is confusing.
>
>   "Maximum resident set size of all tasks in job."
>
> is analogous to
>
>   "maximum height of all people in the room"
>
> rather than
>
>   "total height of all people in the room"
>
> More specifically it means
>
>   "Maximum individual resident set size out of the group of resident set
>   sizes associated with all tasks in job."
>
> It doesn't mean
>
>   "Sum of the resident set sizes of all the tasks"
>
> I'm a native English-speaker and I keep on stumbling over this in 'man
> sacct' and then remembering that I have already worked out how it was
> supposed to be interpreted.
>
> My suggestion for improving this would be
>
>   "Maximum individual resident set size of all resident set sizes
>   associated with the tasks in job."
>
> It's a little clunky, but I hope it is clearer.
>
> Cheers,
>
> Loris
>
> --
> Dr. Loris Bennett (Mr.)
> ZEDAT, Freie Universität Berlin Email loris.benn...@fu-berlin.de
>
>


Re: [slurm-users] SLURM's reservations

2018-04-17 Thread Chris Samuel
On Tuesday, 17 April 2018 7:41:18 PM AEST De Giorgi Jean-Claude wrote:

> Thanks a lot for your help.
> Yes, I misunderstood the "format" part.
> 
> Thank you for your example.

My pleasure, glad it was useful!  We have a newer version of Slurm, which has 
different (and more) format options than yours, so I had to guess a little.

-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC




Re: [slurm-users] ulimit in sbatch script

2018-04-17 Thread Chris Samuel
On Tuesday, 17 April 2018 7:23:40 PM AEST Mahmood Naderan wrote:

> [hamid@rocks7 case1_source2]$  scontrol show config | fgrep VSizeFactor
> VSizeFactor = 110 percent

Great, I think that's the cause of the limit you are seeing..

   VSizeFactor
  Memory  specifications in job requests apply to real memory size
  (also known as resident set size). It  is  possible  to  enforce
  virtual  memory  limits  for both jobs and job steps by limiting
  their virtual memory to some percentage  of  their  real  memory
  allocation. The VSizeFactor parameter specifies the job's or job
  step's virtual memory limit as a percentage of its  real  memory
  limit.  For  example,  if a job's real memory limit is 500MB and
  VSizeFactor is set to 101 then the job will  be  killed  if  its
  real  memory  exceeds  500MB or its virtual memory exceeds 505MB
  (101 percent of the real memory limit).  The default value is 0,
  which  disables enforcement of virtual memory limits.  The value
  may not exceed 65533 percent.

Setting it to 0 should make that limit go away.
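That is, something like this in slurm.conf on the controller and all nodes, followed by a restart or reconfigure of the daemons (a sketch):

    VSizeFactor=0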

All the best,
Chris
-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC




Re: [slurm-users] Way MaxRSS should be interpreted

2018-04-17 Thread Loris Bennett
Hi Eli,

"E.S. Rosenberg"  writes:

> Hi fellow slurm users,
> We have been struggling for a while with understanding how MaxRSS is reported.
>
> This because jobs often die with MaxRSS not even approaching 10% of the 
> requested memory sometimes.
>
> I just found the following document:
> https://research.csc.fi/-/a
>
> It says:
> "maxrss = maximum amount of memory used at any time by any process in that 
> job. This applies directly for serial jobs. For parallel jobs you need to 
> multiply with the number of cores (max 16 or 24 as this is
> reported only for that node that used the most memory)"
>
> While 'man sacct' says:
> "Maximum resident set size of all tasks in job."
>
> Which explanation is correct? How should I be interpreting MaxRSS?

As far as I can tell, both explanations are correct, but the
text in 'man sacct' is confusing.

  "Maximum resident set size of all tasks in job."

is analogous to

  "maximum height of all people in the room"

rather than 

  "total height of all people in the room"

More specifically it means

  "Maximum individual resident set size out of the group of resident set
  sizes associated with all tasks in job."

It doesn't mean

  "Sum of the resident set sizes of all the tasks"

I'm a native English-speaker and I keep on stumbling over this in 'man
sacct' and then remembering that I have already worked out how it was
supposed to be interpreted.

My suggestion for improving this would be

  "Maximum individual resident set size of all resident set sizes
  associated with the tasks in job."

It's a little clunky, but I hope it is clearer.
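A concrete illustration: if a job's three tasks peak at 1 GB, 2 GB and 3 GB of resident memory, MaxRSS reports 3 GB, not the 6 GB total.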

Cheers,

Loris

-- 
Dr. Loris Bennett (Mr.)
ZEDAT, Freie Universität Berlin Email loris.benn...@fu-berlin.de



Re: [slurm-users] slurm jobs are pending but resources are available

2018-04-17 Thread Benjamin Redling
Hello,

On 16.04.2018 at 18:50, Michael Di Domenico wrote:
> On Mon, Apr 16, 2018 at 6:35 AM,   wrote:

> perhaps i missed something in the email, but it sounds like you have
> 56 cores, you have two running jobs that consume 52 cores, leaving you
> four free.  

No. From the original mail:
<--- %< --->
"scontrol show nodes cn_burebista" gives me the following:

NodeName=cn_burebista Arch=x86_64 CoresPerSocket=14
   CPUAlloc=32
<--- %< --->

Jobs 2356 and 2357 use 32 CPUs as long as the original poster gave the
right numbers.

Regards,
Benjamin
-- 
FSU Jena | JULIELab.de/Staff/Benjamin+Redling.html
☎ +49 3641 9 44323



[slurm-users] Way MaxRSS should be interpreted

2018-04-17 Thread E.S. Rosenberg
Hi fellow slurm users,
We have been struggling for a while with understanding how MaxRSS is
reported.

This is because jobs often die even though the reported MaxRSS sometimes does
not even approach 10% of the requested memory.

I just found the following document:
https://research.csc.fi/-/a

It says:
"*maxrss *= maximum amount of memory used at any time by any process in
that job. This applies directly for serial jobs. For parallel jobs you need
to multiply with the number of cores (max 16 or 24 as this is reported only
for that node that used the most memory)"

While 'man sacct' says:
"Maximum resident set size of all tasks in job."

Which explanation is correct? How should I be interpreting MaxRSS?

Thanks,
Eli


Re: [slurm-users] Python code for munging hostfiles

2018-04-17 Thread John Hearns
Loris, Ole, thank you so much.  That is the Python script I was thinking of.


On 17 April 2018 at 11:15, Ole Holm Nielsen 
wrote:

> On 04/17/2018 10:56 AM, John Hearns wrote:
>
>> Please can some kind soul remind me what the Python code for mangling
>> Slurm and PBS machinefiles is called please? We discussed it here about a
>> year ago, in the context of running Ansys.
>>
>> I have a Cunning Plan (TM) to recode it in Julia, for no real reason
>> other than curiosity.
>>
>
> I discuss hosts lists in my Wiki:
> https://wiki.fysik.dtu.dk/niflheim/SLURM#expanding-and-collapsing-host-lists
>
> Perhaps you were looking for this tool?
> https://www.nsc.liu.se/~kent/python-hostlist/
>
> /Ole
>
>


Re: [slurm-users] SLURM's reservations

2018-04-17 Thread De Giorgi Jean-Claude
Hi Chris,

Thanks a lot for your help.
Yes, I misunderstood the "format" part.

Thank you for your example.

Regards,
Jean-Claude



On 17.04.18, 05:43, "slurm-users on behalf of Chris Samuel" 
 wrote:

On Tuesday, 17 April 2018 12:52:04 AM AEST De Giorgi Jean-Claude wrote:

> According to the man page, I should get these headers:
> Allocated, Associations, Cluster, Count, CPUTime, End, Flags, Idle, Name,
> Nodes, ReservationId, Start, TotalTime

I suspect you're misreading the manual page, you're probably looking at the
allowed format options for the reservation utilisation report.

So you should be able to do this to see all that info:

sreport reservation utilization start=2018-02-10T10:00:00 
format=Allocated,Associations,Cluster,Count,CPUTime,End,Flags,Idle,Name,Nodes,ReservationId,Start,TotalTime

best of luck!
Chris
-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC






Re: [slurm-users] SLURM's reservations

2018-04-17 Thread De Giorgi Jean-Claude
Hello Daniel,

Thank you for your information.
That’s very helpful.

Regards,
Jean-Claude


From: slurm-users  on behalf of Daniel 
Grimwood 
Reply-To: Slurm User Community List 
Date: Tuesday, 17 April 2018 at 05:10
To: 'Slurm User Community List' 
Subject: Re: [slurm-users] SLURM's reservations

Hi Jean-Claude,

Within an ugly Perl script (since dictionaries are easy in Perl), I run:
sreport -n -p -M \"$cluster\" reservation Utilization start=$startdate 
end=$enddate -t hours 
Format=Name,ReservationID,Associations,TotalTime,Nodes,Allocated,Start,End

and then
sacct -n -a -M \"$cluster\" -S $startdate -E $enddate -p 
--format=jobid,user,CPUTimeRaw,AssocID,ReservationID -X -T

and then match up the ReservationID numbers, to see to what extent the 
reservations actually get used.

With regards,
Daniel.


From: slurm-users [mailto:slurm-users-boun...@lists.schedmd.com] On Behalf Of 
De Giorgi Jean-Claude
Sent: Monday, 16 April 2018 10:52 PM
To: Slurm User Community List 
Subject: Re: [slurm-users] SLURM's reservations

Here it is,

The command line to get the previous reservation is:
sreport reservation utilization start=2018-02-10T10:00:00

According to the man page, I should get these headers:
Allocated, Associations, Cluster, Count, CPUTime, End, Flags, Idle, Name, 
Nodes, ReservationId, Start, TotalTime
But I get only these ones:
Cluster  Name  Start  End  Allocated  Idle

I tried putting a "verbose" somewhere (or a "-v"), but it doesn't change anything.

My Slurm version is 17.02.7.

If anyone has more usage examples/explanations, very welcome.




Regards,
Jean-Claude




Re: [slurm-users] ulimit in sbatch script

2018-04-17 Thread Mahmood Naderan
See

[hamid@rocks7 case1_source2]$  scontrol show config | fgrep VSizeFactor
VSizeFactor = 110 percent
Regards,
Mahmood




On Tue, Apr 17, 2018 at 12:51 PM, Chris Samuel  wrote:
> On Tuesday, 17 April 2018 5:08:09 PM AEST Mahmood Naderan wrote:
>
>> So, UsePAM has not been set. So, slurm shouldn't limit anything. Is
>> that correct? however, I see that slurm limits the virtual memory size
>
> What does this say?
>
> scontrol show config | fgrep VSizeFactor
>
>
> --
>  Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
>
>



Re: [slurm-users] Python code for munging hostfiles

2018-04-17 Thread Ole Holm Nielsen

On 04/17/2018 10:56 AM, John Hearns wrote:
Please can some kind soul remind me what the Python code for mangling 
Slurm and PBS machinefiles is called please? We discussed it here about 
a year ago, in the context of running Ansys.


I have a Cunning Plan (TM) to recode it in Julia, for no real reason 
other than curiosity.


I discuss hosts lists in my Wiki: 
https://wiki.fysik.dtu.dk/niflheim/SLURM#expanding-and-collapsing-host-lists


Perhaps you were looking for this tool? 
https://www.nsc.liu.se/~kent/python-hostlist/


/Ole



Re: [slurm-users] Python code for munging hostfiles

2018-04-17 Thread Loris Bennett
Loris Bennett  writes:

> Hi John,
>
> John Hearns  writes:
>
>> Please can some kind soul remind me what the Python code for mangling
>> Slurm and PBS machinefiles is called please? We discussed it here
>> about a year ago, in the context of running Ansys.
>>
>> I have a Cunning Plan (TM) to recode it in Julia, for no real reason
>> other than curiosity.
>>
>
> Are you thinking of 'hostnames':
>
> $ scontrol show hostnames node[1-3]
> node1
> node2
> node3

Actually, you probably mean this:

  https://www.nsc.liu.se/~kent/python-hostlist/

Cheers,

Loris

-- 
Dr. Loris Bennett (Mr.)
ZEDAT, Freie Universität Berlin Email loris.benn...@fu-berlin.de



Re: [slurm-users] Python code for munging hostfiles

2018-04-17 Thread Loris Bennett
Hi John,

John Hearns  writes:

> Please can some kind soul remind me what the Python code for mangling
> Slurm and PBS machinefiles is called please? We discussed it here
> about a year ago, in the context of running Ansys.
>
> I have a Cunning Plan (TM) to recode it in Julia, for no real reason
> other than curiosity.
>

Are you thinking of 'hostnames':

$ scontrol show hostnames node[1-3]
node1
node2
node3

Good luck with the CP!

Cheers,

Loris

-- 
Dr. Loris Bennett (Mr.)
ZEDAT, Freie Universität Berlin Email loris.benn...@fu-berlin.de



Re: [slurm-users] What version I should install?

2018-04-17 Thread Ole Holm Nielsen

On 04/17/2018 09:14 AM, David Rodríguez wrote:

Thanks Chris!

Thanks Ole!

In fact, I followed your wiki. But I had many doubts about whether to use 
version 17.11 or 17.02, because I don't know the differences between them. 
Finally, I installed the latest one.


Always install the latest and greatest version :-)  Only the newest 2 
versions are supported by SchedMD.


There are a couple of steps that differ a little from the current version. For 
example, in slurm-17.11.5-1 the "slurm-plugins-$VER*rpm" package does not appear.


Thanks for pointing out the error in the Wiki documentation; I have now 
removed slurm-plugins from the Slurm 17.11 examples.


/Ole



[slurm-users] Python code for munging hostfiles

2018-04-17 Thread John Hearns
Please can some kind soul remind me what the Python code for mangling Slurm
and PBS machinefiles is called? We discussed it here about a year ago, in the
context of running Ansys.

I have a Cunning Plan (TM) to recode it in Julia, for no real reason other
than curiosity.


Re: [slurm-users] Node OverSubscribe even if set to no

2018-04-17 Thread Chris Samuel
On Tuesday, 17 April 2018 5:26:26 AM AEST Stéphane Larose wrote:

> So some jobs are now sharing the same cores but I don’t understand why since
> OverSubscribe is set to no.

You might want to double check the config is acting as expected with:

scontrol show part | fgrep OverSubscribe

Also what does this say?

scontrol show config | fgrep SelectTypeParameters

I note that if you've got CR_Memory then:

 CR_Memory
Memory  is  a  consumable  resource.   NOTE:  This
implies OverSubscribe=YES  or  OverSubscribe=FORCE
for  all  partitions.  Setting a value for DefMem‐
PerCPU is strongly recommended.

cheers,
Chris
-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC




Re: [slurm-users] Limit to "-N1" and "-n1" with job_submit.lua

2018-04-17 Thread Chris Samuel
On Monday, 16 April 2018 11:00:59 PM AEST Sysadmin CAOS wrote:

> However, this script is not working as I desire because logged value for
> job_desc.max_cpus return a value too big. I get "4294967294" for
> job_desc.max_cpus and "1" for job_desc.max_nodes...

That large number means the limit has not been specified by the user.

All the best,
Chris
-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC




Re: [slurm-users] ulimit in sbatch script

2018-04-17 Thread Chris Samuel
On Tuesday, 17 April 2018 5:08:09 PM AEST Mahmood Naderan wrote:

> So, UsePAM has not been set. So, slurm shouldn't limit anything. Is
> that correct? however, I see that slurm limits the virtual memory size

What does this say?

scontrol show config | fgrep VSizeFactor


-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC




Re: [slurm-users] What version I should install?

2018-04-17 Thread Chris Samuel
On Tuesday, 17 April 2018 5:14:39 PM AEST David Rodríguez wrote:

> There are two steps that change a little from actual version. For example,
> in slurm-17.11.5-1 does not appear "slurm-plugins-$VER*rpm"

Yes, that's a change documented in the RELEASE_NOTES for Slurm 17.11.x.

NOTE: The slurm.spec file used to build RPM packages has been
  aggressively refactored, and some package names may now be
  different. Notably, the three daemons (slurmctld,
  slurmd/slurmstepd, slurmdbd) each have their own separate
  package with the binary and the appropriate systemd service
  file, which will be installed automatically (but not enabled).
  The slurm-plugins, slurm-munge, and slurm-lua package has been
  removed, and the contents moved in to the main slurm package.
  The slurm-sql package has been removed, and merged in with the
  slurm (job_comp_mysql.so) and slurm-slurmdbd
  (accounting_storage_mysql) packages.

The example configuration files have been moved to slurm-example-configs.
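For reference, the RPM set for either series is normally built straight from the release tarball, along these lines (adjust the filename to the version you downloaded):

    rpmbuild -ta slurm-17.11.5.tar.bz2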

-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC




Re: [slurm-users] What version I should install?

2018-04-17 Thread David Rodríguez
Thanks Chris!

Thanks Ole!

In fact, I followed your wiki. But I had many doubts about whether to use
version 17.11 or 17.02, because I don't know the differences between them.
Finally, I installed the latest one.

There are a couple of steps that differ a little from the current version. For
example, in slurm-17.11.5-1 the "slurm-plugins-$VER*rpm" package does not appear.

Your wiki helped me a lot! Thanks!

David


2018-04-17 8:34 GMT+02:00 Ole Holm Nielsen :

> On 04/16/2018 08:20 PM, David Rodríguez Galiano wrote:
>
>> Dear Slurm community,
>>
>> I am a sysadmin who needs to make a fresh installation of Slurm.
>> When visiting the download website, I can see two different versions.
>> The first is 17.02.10 and the second one is 17.11.5. I have not found
>> information on what version to use.
>> The latest version fixes some errors, but when I try to generate the
>> rpms with rpmbuild, I do not see any reference to munge. However, the
>> previous version (17.02.10) contains it.
>>
>> I would be very grateful if you can help me to clarify which version I
>> should use. Or the differences between them. I would like to install the
>> latest version of slurm and munge in CentOS 7.
>>
>
> David, I have written a Wiki about installing Slurm on CentOS 7 systems:
> https://wiki.fysik.dtu.dk/niflheim/SLURM
>
> In https://wiki.fysik.dtu.dk/niflheim/Slurm_installation you can see how
> to install Munge on CentOS 7.
>
> /Ole
>
>


Re: [slurm-users] ulimit in sbatch script

2018-04-17 Thread Mahmood Naderan
Hi Bill,
Sorry for the late reply. As I grepped for pam_limits.so, I see:

[root@rocks7 ~]# grep -r pam_limits.so /etc/pam.d/
/etc/pam.d/sudo:sessionrequired pam_limits.so
/etc/pam.d/runuser:session  requiredpam_limits.so
/etc/pam.d/sudo-i:sessionrequired pam_limits.so
/etc/pam.d/system-auth-ac:session required  pam_limits.so
/etc/pam.d/fingerprint-auth-ac:session required  pam_limits.so
/etc/pam.d/smartcard-auth-ac:session required  pam_limits.so
/etc/pam.d/password-auth-ac:session required  pam_limits.so
[root@rocks7 ~]# grep -r UsePAM /etc/slurm/
/etc/slurm/slurm.conf:#UsePAM=



So UsePAM has not been set, and Slurm shouldn't limit anything. Is that
correct? However, I see that Slurm limits the virtual memory size:


[hamid@rocks7 case1_source2]$ cat slurm_script.sh
#!/bin/bash
#SBATCH --job-name=hvacSteadyFoam
#SBATCH --output=hvacSteadyFoam.log
#SBATCH --ntasks=32
#SBATCH --time=100:00:00
#SBATCH --mem=64000M
ulimit -a
mpirun hvacSteadyFoam -parallel

[hamid@rocks7 case1_source2]$ sbatch slurm_script.sh
Submitted batch job 55
[hamid@rocks7 case1_source2]$ ssh compute-0-3
Warning: untrusted X11 forwarding setup failed: xauth key data not generated
Last login: Sun Apr 15 23:11:15 2018 from rocks7.local
Rocks Compute Node
Rocks 7.0 (Manzanita)
Profile built 19:21 11-Apr-2018

Kickstarted 19:37 11-Apr-2018
[hamid@compute-0-3 ~]$ ulimit -a
core file size  (blocks, -c) 0
data seg size   (kbytes, -d) unlimited
scheduling priority (-e) 0
file size   (blocks, -f) unlimited
pending signals (-i) 256712
max locked memory   (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files  (-n) 1024
pipe size(512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority  (-r) 0
stack size  (kbytes, -s) 8192
cpu time   (seconds, -t) unlimited
max user processes  (-u) 4096
virtual memory  (kbytes, -v) unlimited
file locks  (-x) unlimited
[hamid@compute-0-3 ~]$ exit
logout
Connection to compute-0-3 closed.
[hamid@rocks7 case1_source2]$ cat hvacSteadyFoam.log
core file size  (blocks, -c) 0
data seg size   (kbytes, -d) unlimited
scheduling priority (-e) 0
file size   (blocks, -f) unlimited
pending signals (-i) 256712
max locked memory   (kbytes, -l) unlimited
max memory size (kbytes, -m) 65536000
open files  (-n) 1024
pipe size(512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority  (-r) 0
stack size  (kbytes, -s) 8192
cpu time   (seconds, -t) unlimited
max user processes  (-u) 4096
virtual memory  (kbytes, -v) 72089600
file locks  (-x) unlimited
[hamid@rocks7 case1_source2]$
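For what it's worth, the two limits above are consistent with VSizeFactor=110 (discussed elsewhere in this thread): 65536000 kB x 1.10 = 72089600 kB, which is exactly the virtual memory limit shown in the job's ulimit output.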


Regards,
Mahmood




On Mon, Apr 16, 2018 at 12:02 AM, Bill Barth  wrote:
> Specifying --mem to Slurm only tells it to find a node that has that much, 
> not to enforce a limit as far as I know. That node has that much so it finds 
> it. You probably want to enable UsePAM and setup the pam.d slurm files and 
> /etc/security/limits.conf to keep users under the 64000MB physical memory 
> that the node has (minus some padding for the OS, etc.). IS UsePAM enabled in 
> your slurm.conf, maybe that’s doing it.
>
> Best,
> Bill.
>
> --
> Bill Barth, Ph.D., Director, HPC
> bba...@tacc.utexas.edu|   Phone: (512) 232-7069
> Office: ROC 1.435|   Fax:   (512) 475-9445
>



[slurm-users] How to have an array job name include the array task ID

2018-04-17 Thread Alex Reynolds
Hello all,

I am submitting a job to a SLURM scheduler, which contains an array of
small jobs.

For example, here's a script that simply prints out the date and hostname
of the compute node from within a heredoc:

---
#!/bin/bash
...(variables)...
sbatch --parsable --partition=${jobPartition} --array=1-${jobArrayCount}
--job-name=${jobName}.%a --output=${jobName}.stdout.%a.%j
--error=${jobName}.stderr.%a.%j --mem-per-cpu=${jobMem} --export=ALL <<"EOF"
#!/bin/bash
stamp=`date && hostname`
echo -e "Child array job [${SLURM_ARRAY_TASK_ID}]:\n${stamp}"
EOF
exit 0
---

The filenames of the output and error logs from this job contain the
correct array task ID (1 through ${jobArrayCount}, represented with the %a
variable) and parent job ID (represented with the %j variable).

However, the job name (${jobName}.%a) only expands the ${jobName} variable,
and it prints the %a value as a string literal — that is, it is left
untranslated to the array task ID.

For example, if "jobName=foo", then the use of --job-name=${jobName}.%a
results in the scheduler using the job name "foo.%a", instead of "foo.1",
"foo.2", and so on, up to the number of child jobs in the array.

As output and error logs can use the %a array task ID variable, is there a
way to get the job name assignment to use this variable as well?

Another thing I tried was to move the job name assignment within the
heredoc block:

---
#!/bin/bash
...(variables)...
sbatch --parsable --partition=${jobPartition} --array=1-${jobArrayCount}
--output=${jobName}.stdout.%a.%j --error=${jobName}.stderr.%a.%j
--mem-per-cpu=${jobMem} --export=ALL <<"EOF"
#!/bin/bash
#SBATCH --job-name="${jobName}.${SLURM_ARRAY_TASK_ID}"
stamp=`date && hostname`
echo -e "Child array job [${SLURM_ARRAY_TASK_ID}]:\n${stamp}"
EOF
exit 0
---

In this case, the job name is rendered literally as the string
"${jobName}.${SLURM_ARRAY_TASK_ID}".

A third thing that I tried was to rename the job name via `scontrol`, after
the fact, which works but only if the job is in the scheduler and only if
it is running:

---
$ scontrol update JobId=${arrayJobId} JobName=${jobName}.${jobArrayTaskId}
---

The `sacct` program does not seem to have keywords that grant access to
array job and task IDs, e.g.:

---
$ sacct -j ${arrayJobId} --format=ArrayJobId,ArrayTaskId --noheader
--parsable2
sacct: error: Invalid field requested: "ArrayJobId"
---

(Keywords are listed here: https://slurm.schedmd.com/sacct.html)

However, it looks like I can use `scontrol` to get the array job and task
IDs, though it is a bit of a hack:

---
$ scontrol show job ${arrayJobId} | grep ArrayTaskId | awk '{i=split($0,a,"
"); j=split(a[3],b,"="); k=split(a[4],c,"="); print c[2]"."b[2]; }'
testArrayChild.1
---

There are a few problems with this approach:

1. I can't rename the array of jobs until they are in the scheduler
2. My method for getting the array task ID is a hack that seems fragile
3. I can't rename the job after it is finished

These issues seem to make this approach difficult to implement in a
reliable way.
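For what it's worth, squeue may offer a slightly less fragile way to get the same IDs while the job is still known to the scheduler (a sketch, assuming the %F and %K format specifiers exist in your Slurm version; %F is the array job ID and %K the task index):

---
$ squeue -h -j ${arrayJobId} -o '%F.%K'
---

This still shares problems 1 and 3, though: it only works while the job is queued or running.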

My question, ultimately, is: Is there an easier way to have an array job name
include the array task ID?

Regards,
Alex