Tim, this is a good find. Was there a problem with keeping the slurm
user, though? I would feel more comfortable leaving that check in.
Was this not an issue in 2.3?

Danny

On 07/03/12 14:33, Tim Wickberg wrote:
> Attached patch fixes a bug in the select/bluegene plugin for BG/L+P in 
> the 2.4 series.
>
> Without this applied, the userid assigned to a block will never be 
> updated in MMCS, preventing the user from launching a job with a 
> message like:
>
>> <Jul 03 11:50:56.264854> BE_MPI (ERROR): Current user is not the 
>> owner of the partition,
>> <Jul 03 11:50:56.264925> BE_MPI (ERROR):   and is not in the 
>> partition's user list - Aborting
>> <Jul 03 11:50:56.406071> FE_MPI (ERROR): Back-end failed while 
>> preparing partition with return code 31.
>> <Jul 03 11:50:56.477110> FE_MPI (ERROR): Failure list:
>> <Jul 03 11:50:56.477145> FE_MPI (ERROR):   - 1. A user does not have 
>> permission to run the job on specified partition (failure #31)
>
> The patch simplifies the logic a bit: it removes every user that isn't 
> the correct assigned user, including the slurm user account. (This 
> doesn't seem to affect operation on our 1-rack BG/L here at least, 
> although I can't guarantee that for BG/P.) To correct the bug itself, 
> it also makes sure the assigned user is added to the block.
>
> Before, if user_count was 0, the loop was skipped entirely; if it was 1 
> (most likely just slurm_user), the single pass hit a continue. Either 
> way, the correct user was never added to the block.
>
> - Tim
