Danny,

Thanks for responding so quickly. Earlier this afternoon I confirmed the
bug on SLURM 2.3.x (I can confirm the exact revision in the morning once
I'm back at the office) on a Cray XT system using straight MySQL. I believe
I have seen the same bug on 2.2.6 on an x86 cluster, but I am now unsure
and will check in the morning. My instinct tells me this is related to the
fact that, once a user is created, you can't modify his account assignment
without recreating the user (you can't do
"sacctmgr modify user fred account=newaccount1,newaccount2"). I would like
to try my hand at that problem later, but for now having to restart
slurmctld is a nuisance. Hopefully this is just a config issue, which is
why I was trying to figure out where/when the association data is sent to
the slurmctld. Here are the steps to recreate the issue on a Cray XT.

First off, I configure SLURM to use Associations, Limits and QoS as
enforcement rules. Then, I create a set of QoS's and parent accounts, and
assign users to them. At this point if a user tries to submit a job with
srun, an error is generated about not having a valid association. In order
to fix it, I need to restart the slurmctld. Once restarted, all is well.
Now if I want to add a user, I will need to restart slurmctld again for the
changes to take effect. As I said, I have definitely confirmed this on my
Cray test system, but I will recheck on my 2.2.6 x86 test system.
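
For reference, the enforcement setup in the first step corresponds roughly
to a slurm.conf fragment like the one below. This is only a sketch: the
storage type line assumes the direct MySQL plugin mentioned above, and the
exact values on the test systems may differ.

```
# slurm.conf (sketch; values are assumptions, not copied from the test system)
AccountingStorageType=accounting_storage/mysql
AccountingStorageEnforce=associations,limits,qos
```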

Many thanks!

Fred

2012/2/22 Danny Auble:

>
> What version of SLURM are you using?  This seems like a bug that would
> bite a lot of people.  I don't see it in 2.3 or 2.4. (with the SlurmDBD,
> I didn't test a direct mysql plugin, but that should work as well.)
>
> Danny
>
> On 02/22/12 16:29, Frédérick Lessard wrote:
> > Hello (I hope you have only received this email once...had problems
> > sending it to the list...),
> >
> > I'm running into issues where information is not available for the
> > slurmctld until the daemon is restarted following an update in sacctmgr
> > and I would like to try and fix it in the code (and submit a patch back
> > to you guys if I find it!). Can someone provide me with some direction
> > as to where to look into the code to understand when the information
> > does get propagated from the association database (because it does
> > appear there) to the slurmctld?
> >
> > A bit more background: Whenever I add a user, I need to restart the
> > slurmctld in order for that user to be allowed to use the association
> > created between the user and the account. Same thing applies for adding
> > more accounts to that user.... I'm using mysql as the accounting DB and
> > enforce Associations, Limits and QoS in the slurm.conf.
> >
> > Thanks for your help
> >
> > Fred.
> >
>
