Hi Stephen,

Replies in-line below.

Thanks,
- Larry

On 3/3/15 11:49 AM, Stephen John Smoogen wrote:

On Mar 3, 2015 8:49 AM, "P. Larry Nelson" <[email protected]> wrote:
 >
 > I am seeing a bizarre bug where an SL6.x system hangs on either
 > shutdown or reboot at the point where it wants to shut down the
 > loopback interface.
 >
 > Let me start off by saying I'm running a mixed shop of SL5.x servers
 > (DNS, NIS, NTP, DHCP, NFS, etc.) along with a bunch of new cluster-esque
 > nodes running SL6.x.  All new SL6 nodes are Dell R410, R510, R710, for
 > whatever that's worth, but I don't believe they have anything to do
 > with the bug, per se.
 >
 > Since building these new SL6 nodes many weeks back, they have all
 > exhibited this extremely annoying habit of hanging on shutdown or
 > reboot at the shutdown of the loopback interface.
 > Eventually (for the most part) they stop spinning whatever wheels
 > they're spinning and do manage to complete either the shutdown or
 > reboot, but it takes upwards of 15, 20, or 30 minutes!  Usually
 > I can't wait that long and just do a power off/on of the node.
 >
 > No amount of trying to find out what they are doing has worked,
 > from trying to open another console window (Alt-F1, etc.) at
 > shutdown/reboot to having top running in one terminal window while
 > doing a 'service network restart' in another.  Everything just freezes!
 >
 > I tried any number of things over the past several weeks, including
 > ripping out NetworkManager knowing that it has had a history of mucking
 > things up.  No luck.  They still hang.
 >
 > On another front, I was having some UID/GID problems with the mix of
 > NFS v3 from my SL5.x file servers and NFS v4 on the SL6 nodes, so
 > I forced all mounts to use NFS v3.  I thought maybe that could be
 > the problem, but again, no luck - still hanging.
 >
 > Revisiting it again in earnest this weekend via Google, I came up
 > empty as all hits seemed to have something to do with scenarios that
 > just did not apply, including many hits about a problem with running
 > the iscsi daemon (and there was a patch for that).  But I'm not running
 > the iscsi daemon.  It's not even installed.
 >
 > One comment by someone who also had the same problem was that he, not
 > ever figuring out the cause, just commented out the line in
 > /etc/init.d/network that shuts down the loopback interface, saying it's
 > not a real device anyway, so what the hell.
 >
 > So yesterday I thought I'd try the tactic of commenting out the
 > loopback shutdown on a test system.  Sure enough, the reboot was
 > normal with no hangs.
 >
 > Ok, at least now I have a workaround, though that seems pretty kludgy.
 >
 > I decided to try and nail the culprit down with a fresh rebuild of
 > a test system and see just where in the build process the bug appears.
 >
 > After the basic install of SL6, the system reboots just fine.
 > Then do a 'yum update' with all its hundreds of patches.
 > It reboots just fine, as I expected.
 >
 > So the first "local" change was to configure NIS.
 > Try the reboot.  Reboots fine.
 >
 > [ok, here is where it becomes bizarre]
 > Modify /etc/nsswitch.conf to switch the order of "files nis" to
 > "nis files" for passwd, shadow, and group, as I've always done.
 > Reboot.  Boom!  It hangs at loopback interface shutdown!
 >
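
For anyone wanting to confirm which lookup order a node is actually using, here is a minimal sketch (not part of the original report) that parses nsswitch.conf-style text and reports the source order for passwd, shadow, and group. The function name lookup_order and the sample text are illustrative assumptions; the "hosts" line in the sample is hypothetical and is there only to show that other databases are ignored.

```python
def lookup_order(nsswitch_text, databases=("passwd", "shadow", "group")):
    """Return {database: [sources, ...]} for the requested databases."""
    order = {}
    for raw in nsswitch_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue  # blank or malformed line
        name, sources = line.split(":", 1)
        name = name.strip()
        if name in databases:
            order[name] = sources.split()
    return order


# Sample text mirroring the "nis files" order described above.
sample = """\
passwd: nis files
shadow: nis files
group:  nis files
hosts:  files dns   # hypothetical line, only to show filtering
"""

for db, sources in lookup_order(sample).items():
    note = "  <-- nis consulted before files" if sources[:1] == ["nis"] else ""
    print(f"{db}: {' '.join(sources)}{note}")
```

To check a real node, pass the contents of /etc/nsswitch.conf instead of the sample string.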

I want to thank you for giving all the details of your testing. I would
like to use it as a future example of how to be constructive and helpful
to other people needing help.

Thanks.  Yep, feel free to use this as an example.  I suppose it comes
from being in the biz for over 46 years and shaking my head at *SO* many
ill-conceived requests for help on listservs.

So have you looked at nscd at all? Does having nscd turned on or off alter
this problem?

Nay, I have not, and frankly, it didn't occur to me till you asked.
I will explore that when I get a chance and see if it alters the problem.

Also, what is in hosts, and is the NIS server listed? Thanks

I assume you're talking about /etc/hosts on the clients.
The SL6.x clients just have the following in hosts:

127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
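
As a quick way to answer the "is the NIS server listed" question across many nodes, here is a minimal sketch that collects every name defined in hosts-format text and tests for one of them. The function name names_in_hosts is illustrative, and nis1.example.org is a hypothetical stand-in, since the real NIS server name is not given in this thread.

```python
def names_in_hosts(hosts_text):
    """Return the set of host names and aliases defined in hosts-format text."""
    names = set()
    for raw in hosts_text.splitlines():
        fields = raw.split("#", 1)[0].split()  # drop comments, split on whitespace
        names.update(fields[1:])  # the first field is the address
    return names


# The two entries shown above for the SL6.x clients.
sample = """\
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
"""

nis_server = "nis1.example.org"  # hypothetical name, not from the thread
print(nis_server in names_in_hosts(sample))  # False: only localhost names are defined
```

To check a real client, pass the contents of /etc/hosts and the actual server name.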

 > I repeated this many times to be sure, and it happens the same on
 > every SL6.x node.
 >
 > Bug or feature?  I can't imagine it to be a feature nor can I
 > fathom what the order of "files" and "nis" in /etc/nsswitch.conf
 > has to do with the hanging of the loopback interface shutdown.
 > It's possible that an SL6.x NIS server might correct the situation,
 > but I have no time right now to spend a week on that not knowing
 > it would even work.
 >
 > Comments and suggestions are welcome.
 >
 > - Larry
 >
 > --
 > P. Larry Nelson (217-244-9855) | IT Administrator
 > 461 Loomis Lab                 | High Energy Physics Group
 > 1110 W. Green St., Urbana, IL  | Physics Dept., Univ. of Ill.
 > MailTo:[email protected]    | http://www.brf-llc.com/lnelson/
 > -------------------------------------------------------------------
 >  "Information without accountability is just noise."  - P.L. Nelson



--
P. Larry Nelson (217-244-9855) | IT Administrator
461 Loomis Lab                 | High Energy Physics Group
1110 W. Green St., Urbana, IL  | Physics Dept., Univ. of Ill.
MailTo:[email protected]    | http://www.brf-llc.com/lnelson/
-------------------------------------------------------------------
 "Information without accountability is just noise."  - P.L. Nelson
