On Mon, 7 Mar 2016, Chas Williams wrote:

> On Mon, 2016-03-07 at 01:42 -0500, Benjamin Kaduk wrote:
> >
> > I am given to understand that the proximal trigger is Linux commit
> > http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=c725bfce7968009756ed2836a8cd7ba4dc163011,
> > which adds a path wherein -ERESTARTSYS can be returned from within VFS
> > library code.  (Maybe there are other such paths that we just didn't
> > notice before?)  This particular function, splice_from_pipe_next(),
> > ends up getting called from the low-level afs_linux_storeproc() routine.
>
> I haven't had time to look at this, but does this also happen with
> the memcache?

I believe so.

> > There are many call paths in the cache manager that end up at this
> > function, most of which are not prepared to properly handle an
> > ERESTARTSYS return.  Since this status can be returned after some data
> > has already been written, the correct behavior upon receiving it is far
> > from clear ... a path towards a client free of this vector for data
> > corruption may involve avoiding the dependence on
> > splice_from_pipe_next() in preference to adapting all call sites to
> > handle the ERESTARTSYS case.
>
> For the 1.6 release, this seems the best choice of action.  The "real"
> fix would likely be difficult to completely test in a timely fashion.

That only helps if we know what the replacement would be... I am not a
Linux VFS expert and do not have any ideas right now.
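
To make the partial-write problem concrete, here is a minimal userspace
sketch. It is not OpenAFS or kernel code and not a proposed fix: every
name in it (sketch_sink, sketch_sink_write, sketch_store,
SKETCH_ERESTARTSYS) is hypothetical. It only illustrates the convention
that the write(2) family follows when interrupted, namely that if some
bytes were already transferred the caller reports that count, and
surfaces the error only when nothing was written.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for the kernel-internal ERESTARTSYS value (512 in Linux). */
#define SKETCH_ERESTARTSYS 512

/* Hypothetical sink that "gets interrupted" once interrupt_at bytes
 * have been accepted, imitating a signal arriving mid-transfer. */
struct sketch_sink {
    char   buf[64];
    size_t used;
    size_t interrupt_at;
};

/* Accept up to len bytes; return the count accepted, or
 * -SKETCH_ERESTARTSYS once the interruption point has been reached. */
static long sketch_sink_write(struct sketch_sink *s, const char *p, size_t len)
{
    if (s->used >= s->interrupt_at)
        return -SKETCH_ERESTARTSYS;

    size_t room = s->interrupt_at - s->used;
    if (len > room)
        len = room;
    if (len > sizeof(s->buf) - s->used)
        len = sizeof(s->buf) - s->used;

    memcpy(s->buf + s->used, p, len);
    s->used += len;
    return (long)len;
}

/* Caller-side loop: write in chunks and, on error, report partial
 * progress rather than the raw error if any data already went out. */
static long sketch_store(struct sketch_sink *s, const char *p,
                         size_t len, size_t chunk)
{
    size_t done = 0;

    while (done < len) {
        size_t n = (len - done < chunk) ? len - done : chunk;
        long rc = sketch_sink_write(s, p + done, n);

        if (rc < 0)
            return done > 0 ? (long)done : rc;
        if (rc == 0)
            break;              /* sink is full; stop without error */
        done += (size_t)rc;
    }
    return (long)done;
}
```

A caller that ignores the distinction between "error before any data"
and "error after some data" is exactly the kind of call site described
above as unprepared for an ERESTARTSYS return.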

-Ben
