I noticed something curious after migrating some nodes from 4.1 to 4.2, which is that mounts can now take foorrreeevverrr. It seems to boil down to the point in the mount process where getEFOptions is called.
To highlight the difference:

4.1:
# /usr/bin/time /usr/lpp/mmfs/bin/mmcommon getEFOptions dnb02 skipMountPointCheck >/dev/null
0.16user 0.04system 0:00.43elapsed 45%CPU (0avgtext+0avgdata 9108maxresident)k
0inputs+2768outputs (0major+15404minor)pagefaults 0swaps

4.2:
# /usr/bin/time /usr/lpp/mmfs/bin/mmcommon getEFOptions dnb02 skipMountPointCheck >/dev/null
9.75user 3.79system 0:23.35elapsed 58%CPU (0avgtext+0avgdata 10832maxresident)k
0inputs+38104outputs (0major+3135097minor)pagefaults 0swaps

That's uh...a 54x increase in elapsed time (0.43s to 23.35s). If you have 25+ filesystems and 3500 nodes, that time really starts to add up.

It looks like under 4.2 this getEFOptions function triggers a bunch of mmsdrfs parsing and node list generation, whereas on 4.1 that doesn't happen. Digging in a little deeper, it looks to me like the big difference is in gpfsClusterInit after the node fetches the "shadow" mmsdrfs file.

Here's a 4.1 node:

gpfsClusterInit:mmsdrfsdef.sh[2827]> loginPrefix=''
gpfsClusterInit:mmsdrfsdef.sh[2828]> [[ -n '' ]]
gpfsClusterInit:mmsdrfsdef.sh[2829]> /usr/bin/scp supersecrethost:/var/mmfs/gen/mmsdrfs /var/mmfs/gen/mmsdrfs.25326
gpfsClusterInit:mmsdrfsdef.sh[2830]> rc=0
gpfsClusterInit:mmsdrfsdef.sh[2831]> [[ 0 -ne 0 ]]
gpfsClusterInit:mmsdrfsdef.sh[2863]> [[ -f /var/mmfs/gen/mmsdrfs.25326 ]]
gpfsClusterInit:mmsdrfsdef.sh[2867]> /usr/bin/diff /var/mmfs/gen/mmsdrfs.25326 /var/mmfs/gen/mmsdrfs
gpfsClusterInit:mmsdrfsdef.sh[2867]> 1> /dev/null 2> /dev/null
gpfsClusterInit:mmsdrfsdef.sh[2868]> rc=0
gpfsClusterInit:mmsdrfsdef.sh[2869]> [[ 0 -ne 0 ]]
gpfsClusterInit:mmsdrfsdef.sh[2874]> sdrfsFile=/var/mmfs/gen/mmsdrfs
gpfsClusterInit:mmsdrfsdef.sh[2875]> /bin/rm -f /var/mmfs/gen/mmsdrfs.25326

Here's a 4.2 node:

gpfsClusterInit:mmsdrfsdef.sh[2938]> loginPrefix=''
gpfsClusterInit:mmsdrfsdef.sh[2939]> [[ -n '' ]]
gpfsClusterInit:mmsdrfsdef.sh[2940]> /usr/bin/scp supersecrethost:/var/mmfs/gen/mmsdrfs /var/mmfs/gen/mmsdrfs.8534
gpfsClusterInit:mmsdrfsdef.sh[2941]> rc=0
gpfsClusterInit:mmsdrfsdef.sh[2942]> [[ 0 -ne 0 ]]
gpfsClusterInit:mmsdrfsdef.sh[2974]> /bin/rm -f /var/mmfs/tmp/cmdTmpDir.mmcommon.8534/tmpsdrfs.gpfsClusterInit
gpfsClusterInit:mmsdrfsdef.sh[2975]> [[ -f /var/mmfs/gen/mmsdrfs.8534 ]]
gpfsClusterInit:mmsdrfsdef.sh[2979]> /usr/bin/diff /var/mmfs/gen/mmsdrfs.8534 /var/mmfs/gen/mmsdrfs
gpfsClusterInit:mmsdrfsdef.sh[2979]> 1> /dev/null 2> /dev/null
gpfsClusterInit:mmsdrfsdef.sh[2980]> rc=0
gpfsClusterInit:mmsdrfsdef.sh[2981]> [[ 0 -ne 0 ]]
gpfsClusterInit:mmsdrfsdef.sh[2986]> sdrfsFile=/var/mmfs/gen/mmsdrfs

It looks like the 4.1 code deletes the shadow mmsdrfs file if it's not different from what's locally on the node, whereas 4.2 does *not* do that. This seems to cause a problem when checkMmfsEnvironment is called, because it will return 1 if the shadow file exists, which according to the function comments indicates "something is not right", triggering the environment update where the slowdown is incurred.

On 4.1 checkMmfsEnvironment returned 0 because the shadow mmsdrfs file had been removed, whereas on 4.2 it returned 1 because the shadow mmsdrfs file still existed despite being identical to the mmsdrfs on the node.

I've looked at 4.2.3.6 (efix12) and it doesn't look like 4.2.3.7 has dropped yet, so it may be that this has been fixed there. Maybe it's time for a PMR...

-Aaron

--
Aaron Knister
NASA Center for Climate Simulation (Code 606.2)
Goddard Space Flight Center
(301) 286-2776
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
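
P.S. For anyone who wants to reason about this without poking at a live node, here is a toy script that mimics the behavior as I read it from the traces. To be clear, this is only a sketch and not the actual mmsdrfsdef.sh code: reconcile_41, reconcile_42, check_env and demo are names made up for illustration, the real checkMmfsEnvironment may well check more than this, and the script only touches throwaway files in a temp directory.

```shell
#!/bin/sh
# Hypothetical sketch only: NOT the real mmsdrfsdef.sh logic.
# It imitates what the traces appear to show, using throwaway files.

# 4.1-style: diff the shadow copy against the local file and
# remove the shadow copy when the two are identical.
reconcile_41() {
    local_file=$1
    shadow_file=$2
    if diff "$shadow_file" "$local_file" >/dev/null 2>&1; then
        rm -f "$shadow_file"
    fi
}

# 4.2-style as seen in the trace: same diff, but no rm afterwards,
# so an identical shadow copy is left behind.
reconcile_42() {
    local_file=$1
    shadow_file=$2
    diff "$shadow_file" "$local_file" >/dev/null 2>&1
    return 0
}

# Stand-in for checkMmfsEnvironment (assumed behavior): return 1
# ("something is not right") if a shadow file still exists, else 0.
check_env() {
    if [ -f "$1" ]; then
        return 1
    fi
    return 0
}

# Build an identical local/shadow pair, reconcile it in the given
# style (41 or 42), and report what the environment check returns.
demo() {
    style=$1
    tmp=$(mktemp -d) || return 2
    printf 'cluster data\n' > "$tmp/mmsdrfs"
    cp "$tmp/mmsdrfs" "$tmp/mmsdrfs.shadow"
    "reconcile_$style" "$tmp/mmsdrfs" "$tmp/mmsdrfs.shadow"
    check_env "$tmp/mmsdrfs.shadow"
    rc=$?
    rm -rf "$tmp"
    echo "$style: check_env rc=$rc"
}

demo 41
demo 42
```

Under these assumptions the 4.1-style path reports rc=0 and the 4.2-style path reports rc=1, which matches what I saw: the leftover (but identical) shadow file is enough to send the node down the slow environment-update path.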
