On 7/14/2016 6:18 PM, Chad William Seys wrote:
> Hi Ben,
> 
> The Scientific Linux clients are using patched (by Red Hat) 2.6.32 and
> the Debian clients are using patched (by Debian) 3.2.78 and 3.16.7.
> 
> Do you suspect that a recent security patch, applied to all three
> kernels, could have broken the older AFS clients?
> 
> I could certainly test this idea if it appears promising.  I guess I'd
> start with the server's kernel though: One data point that argues
> against it being the client's kernel is that for the Scientific Linux
> box I booted up a machine which had not been updated for a long time
> (kernel dated Mar 22, 2016) and compiled openafs 1.6.15 (not functional)
> and 1.6.16 (functional).
> 
> Chad.

I doubt that the server's kernel version matters, since all of the
fileserver code runs in userland.

I believe the Debian and Scientific Linux issues are unrelated because
the symptoms are so different.

If you said that 1.6.18 was the first version of OpenAFS to work on
Debian I would correlate that with the Linux kernel changes to support
interrupting splice operations.  The OpenAFS client used splice for
StoreData RPCs to avoid an extra memory copy of every page that is
written to the fileserver.  The 1.6.18 release removed that use of splice.

One of the symptoms of the splice change on OpenAFS clients was "git"
operations failing in a way that caused the OpenAFS client to mark the
fileserver as "down".  When that happens the

  "Connection timed out"

error is logged regardless of the actual cause.

Since you indicate that 1.6.16 is the first version to work, something
else must be to blame on Debian.

For the Scientific Linux issue you should obtain a stack trace for the
hung "ls" process and collect cmdebug output for the affected cache manager.
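
A minimal sketch of how that information might be gathered on the affected
client, assuming a Linux kernel that exposes /proc/<pid>/stack and that
cmdebug is in root's PATH.  The function names and the DRY_RUN switch are
hypothetical and exist only for illustration; the pid is whatever "ps"
reports for the hung "ls".

```shell
#!/bin/sh
# Hypothetical helper: names and the DRY_RUN switch are illustrative only.
# Usage (as root on the affected client): collect_hang_info <pid-of-hung-ls>

run() {
    # With DRY_RUN=1, print the command instead of executing it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

collect_hang_info() {
    pid="$1"
    # Kernel-side stack of the hung process (requires root).
    run cat "/proc/$pid/stack"
    # Cache manager state of the local client; -long reports all cache
    # entries and lock statuses rather than only the locked ones.
    run cmdebug -servers localhost -long
}
```

Running it with DRY_RUN=1 first shows the two commands without executing
them, which is a cheap way to check the pid before collecting anything.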

Jeffrey Altman
