Wow, well spotted. I came here to see if anyone had reported this same issue with environment modules, as I noticed several of my jobs failing on our cluster this morning. Turns out, I'm probably the only one who had failed jobs, as I have a long-running tmux session open on the head node, and therefore old bash. ;)
Other users wouldn't have noticed because we updated all of our infrastructure in one go using ansible[0] last Friday. In any case, glad to be in good company.

Cheers!

Alan

[0] http://mjanja.co.ke/2014/09/update-hosts-via-ansible-to-mitigate-bash-shellshock-vulnerability/

On 09/29/2014 08:27 AM, Christopher Samuel wrote:
> On 27/09/14 08:30, John Brunelle wrote:
>
>> This caused a bit of trouble for us when we patched some head nodes
>> before compute nodes.
> We did some testing to confirm that:
>
> A) If you update a login node before compute nodes jobs will fail as
> John describes.
>
> B) If you update a compute node when there are jobs queued under the
> previous bash then they will fail when they run there (also cannot find
> modules, even though a prologue of ours sets BASH_ENV to force the env
> vars to get set).
>
>
> Our way to (hopefully safely) upgrade our x86-64 clusters was:
>
> 0) Note that our slurmctld runs on the cluster management node which is
> separate to the login nodes and not accessible to users.
>
> 1) Kick all the users off the login nodes, update bash, reboot them
> (ours come back with nologin enabled to stop users getting back on
> before we're ready).
>
> 2) Set all partitions down to stop new jobs starting
>
> 3) Move all compute nodes to an "old" partition
>
> 4) Move all queued (pending) jobs to the "old" partition
>
> 5) Update bash on any idle nodes and move them back to our "main"
> (default) partition
>
> 6) Set an AllowGroups on the "old" partition so users can't submit jobs
> to it by accident.
>
> 7) Let users back onto the login nodes.
>
> 8) Set partitions back to "up" to start jobs going again.
>
>
> Hope this helps folks..
>
> cheers!
> Chris

-- 
Alan Orth
[email protected]
http://alaninkenya.org
http://mjanja.co.ke
"I have always wished for my computer to be as easy to use as my telephone; my wish has come true because I can no longer figure out how to use my telephone."
-Bjarne Stroustrup, inventor of C++
GPG public key ID: 0x8cb0d0acb5cd81ec209c6cdfbd1a0e09c2f836c0
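
P.S. For anyone wanting to script the Slurm side of the procedure quoted above (steps 2 to 6 and 8), here is a rough dry-run sketch in Python. It only builds the scontrol command lines and prints them; it runs nothing. The thread does not say which commands Chris actually used, so treat the mapping from steps to commands as my assumption. The function name upgrade_plan, the node names, the job IDs and the "admin" group are hypothetical; "main" and "old" are the partition names from the quoted mail, and the "old" partition is assumed to exist already. Pending job IDs could be collected with something like `squeue -h -t PD -o %i`.

```python
import shlex


def upgrade_plan(main, old, all_nodes, updated_nodes, pending_jobs, admin_group):
    """Return, in order, the scontrol commands for steps 2-6 and 8 as strings.

    Nothing is executed. Updating bash itself (steps 1 and 5) and letting
    users back on (step 7) are outside the scope of this sketch.
    """
    if not updated_nodes:
        # Assumption of this sketch: at least one idle node was patched,
        # so the "main" partition never ends up with an empty node list.
        raise ValueError("need at least one updated node for the main partition")

    cmds = []

    # Step 2: set all partitions down so no new jobs start.
    for part in (main, old):
        cmds.append(["scontrol", "update", f"PartitionName={part}", "State=DOWN"])

    # Step 3: put every compute node in the "old" partition.
    cmds.append(["scontrol", "update", f"PartitionName={old}",
                 "Nodes=" + ",".join(all_nodes)])

    # Step 4: move every pending job to the "old" partition.
    for job in pending_jobs:
        cmds.append(["scontrol", "update", f"JobId={job}", f"Partition={old}"])

    # Step 5: nodes with the new bash go back to "main"; the rest stay in "old".
    remaining = [n for n in all_nodes if n not in updated_nodes]
    cmds.append(["scontrol", "update", f"PartitionName={main}",
                 "Nodes=" + ",".join(updated_nodes)])
    if remaining:
        cmds.append(["scontrol", "update", f"PartitionName={old}",
                     "Nodes=" + ",".join(remaining)])

    # Step 6: restrict "old" to one group so users cannot submit to it by accident.
    cmds.append(["scontrol", "update", f"PartitionName={old}",
                 f"AllowGroups={admin_group}"])

    # Step 8: set the partitions back up.
    for part in (main, old):
        cmds.append(["scontrol", "update", f"PartitionName={part}", "State=UP"])

    return [shlex.join(c) for c in cmds]


# Hypothetical example: three nodes, one of them idle and already patched.
for line in upgrade_plan("main", "old", ["n1", "n2", "n3"], ["n2"],
                         ["101", "102"], "admin"):
    print(line)
```

Printing the plan first makes it easy to review the order before pasting anything into a root shell on the management node. As more nodes drain and get patched, the step 5 pair can be regenerated with a longer updated list.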
