Ordinarily I try to avoid entering these sorts of chains on the list. I'm frustrated; you're frustrated; and meaningful progress, while often born out of frustration, is often impolite in the process. So forgive me if I sound a bit annoyed, but this nfs-utils problem has been a thorn in my side for a while now. I have kept my replies as short as possible to avoid a more confrontational approach. Please consider the terse nature of my comments an attempt at etiquette: by limiting my words, I hopefully limit my opportunity to be aggressively confrontational. I had a longer response, but it was far too aggressive and forceful, lacking altogether the spirit of co-operation.

On 01/22/2012 10:35 AM, Chris Schanzle wrote:
On 01/21/2012 11:01 AM, Nico Kadel-Garcia wrote:
On Fri, Jan 20, 2012 at 11:14 PM, Chris Schanzle wrote:
On 01/20/2012 09:51 PM, Konstantin Olchanski wrote:

I feel obligated to vent about the ongoing mess-up of the nfs-utils
package.
I receive a few emails about this package per week these days. Some politely ask what is going on, some simply demand a fix, as though they think the error was deliberate. I have attempted to politely reply to each one.

In a nutshell, all of my SL6.1 machines are affected (not "both
machines", but dozens of machines; 24 at last count).

The "/" directory is filling up with 1 Mbyte core files from umount.nfs
at the rate of about 3 core dumps per minute.



Just wanted to put a "me too" out there. I admit to not keeping up with the
various nfs-utils versions and just recently joined this list.

Seemed that umount.nfs dumping core caused /etc/mtab to not get cleaned up,
so you had many duplicates in the output of say, 'df'.

We don't use kerberos, just NIS and the automounter, so it seemed like a lot
of the discussion didn't apply to us.  It didn't affect all our systems
either.

I feel the same frustration.  I have stopped rolling out EL6 and I'm
apologizing to my existing early adopter users.  With this issue and my
previously mentioned email about the inability to reboot successfully (due to umount issues) not generating any discussion, I'm preparing to hop back to the other prominent NA enterprise Linux derivative. It's great to have
choices.
I'm sorry to hear that your EL6 rollout has been paused. Excluding this particular issue, SL6 has generally been considered a stable and up-to-date release.

PS - I just noticed the mailing list doesn't add a Reply-To: field to direct
replies to the list.

Chris, I'm not sure you can blame SL for this one at all. Our favorite
upstream vendor occasionally publishes software with a bug, although
they're very good about testing and fixing any reported issues, which
is why some of us pay them for support licenses and others take
advantage of the goodness of free software. Is there a sign or pointer
that this was, in fact, an SL compilation generated bug?

Yes, I understand and agree we get upstream's occasional rare bugs. I'm pointing the finger (possibly!) at SL due to this thread which references the re-issuing of nfs-utils due to a build environment error: http://listserv.fnal.gov/scripts/wa.exe?A2=ind1201&L=scientific-linux-devel&T=0&P=77

There are other interesting threads including "Recent updates break autofs/ldap/krb5", but those may be upstream bugs.

The segfaults are a non-upstream bug, but not specific to SL. Another major rebuild had a similar problem with their nfs-utils package. The buildroot updates were forwarded on to them by one of our list members. A different major rebuild was contacted by us privately with the buildroot updates before their package was publicly released.

The package with the incremented version was built under the same conditions as the one which does not exhibit the segfault. My tests all passed without incident. It sat in testing until a number of users reported it was working fine. If necessary I can prove both; however, I have no intention of naming the users who graciously tested this package and, like me, could not make it segfault.


Are you mounting NFS directories at / ? That's usually a *REALLY* bad
idea, because if the NFS mount has any issues, it interferes with any
function that glances in / for permissions or other information. If
not, do you have any idea why it'd be dumping those files in / ?

No - I'm sorry if I wrote something that implied that. We use standard indirect maps (e.g., /home/<user>). I am guessing umount.nfs is dumping cores in "/" since that is its CWD.

I've got my CentOS 6.2 installation process ready and will switch my most troublesome user hopefully Monday. Unfortunately, it is not an apples-to-apples comparison with the current SL 6.1.

As an experiment, I installed 6rolling/testing/x86_64/nfs-utils/nfs-utils-1.2.3-15.el6.0.sl6.x86_64.rpm (I believe the latest upstream version, i.e., what is in 6.2) on that user's system and while the core dumps have not returned, /etc/mtab is still accumulating duplicates, viewable with duplicate counts via:

  sort /etc/mtab | uniq -dc
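
The one-liner above only counts lines that are identical in full. A minimal sketch of a looser check, which groups entries by device and mount point only (so stale entries that differ just in their option strings are still counted together), might look like the following. The function name `mtab_dups` and its optional file argument are illustrative assumptions, not part of nfs-utils or any SL package:

```shell
# Hypothetical helper: print "<count> <device> <mount point>" for every
# device/mount-point pair that appears more than once in an mtab-style file.
# Defaults to /etc/mtab when no file is given.
mtab_dups() {
  awk '{ count[$1 " " $2]++ }
       END { for (k in count) if (count[k] > 1) print count[k], k }' \
      "${1:-/etc/mtab}" | sort -k2
}
```

Running `mtab_dups` with no argument reads /etc/mtab; passing a saved copy of the file allows comparing before and after an nfs-utils update.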



--
Pat Riehecky
Scientific Linux Developer
