On Tue, 11 Feb 2014, Andrew Deason wrote:

> On Tue, 4 Feb 2014 09:38:27 -0500
> Dave Botsch <[email protected]> wrote:
>
>> kernel: 2.6.32-431.3.1.el6.x86_64
>
> Just a little more information about this. Apparently the
> problematic code was also introduced in the RHEL 6.4 kernel series
> (2.6.32-358*), but was quickly pulled out. Evidently from this thread it
> was added again in the 6.5 series (2.6.32-431*) and maybe isn't coming
> back out. I'm not entirely sure, since I can't find a changelog entry
> for this, and redpatch.git is currently not helpful since it hasn't
> caught up to the most recent versions yet.

One more data point -- we upgraded a client to 2.6.32-431.3.1.el6.x86_64
and began getting intermittent kernel Oopses when a daily script ran
'gzip -c bigfile | tee copyfile > /afs/some/path/in/afs'
Red Hat weren't interested because of the tainted stack trace, but when I
removed the AFS client and ran the same thing to local disk, 'gzip' turned
into an unkillable process consuming 100% CPU but no longer doing any
I/O.
We never generated an actual kernel crash without AFS.
The gzip binary is unchanged, and older kernels work fine. I couldn't
reproduce without the '| tee', so apparently that is needed to tickle
the bug.
Richard
> If anyone wants to try to ask Red Hat about it, you'd be asking about
> what versions include functionality related to the upstream Linux commit
> I mentioned earlier (7732a557b1342c6e6966efb5f07effcf99f56167). I assume
> they won't change anything for us, but it's always helpful to know what
> they're going to be doing with it, if they have a specific plan or
> timeline in mind.
--
Richard Brittain, Research Computing Group,
IT Services, 37 Dewey Field Road, HB6219
Dartmouth College, Hanover NH 03755
[email protected] 6-2085