On Mon, 7 Mar 2016, Chas Williams wrote:

> On Mon, 2016-03-07 at 01:42 -0500, Benjamin Kaduk wrote:
> >
> > I am given to understand that the proximal trigger is Linux commit
> > http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=c725bfce7968009756ed2836a8cd7ba4dc163011,
> > which adds a path wherein -ERESTARTSYS can be returned from within
> > VFS library code. (Maybe there are other such paths, but maybe we
> > just didn't notice before?) This particular function,
> > splice_from_pipe_next(), ends up getting called from the low-level
> > afs_linux_storeproc() routine.
>
> I haven't had time to look at this, but does this also happen with
> the memcache?

I believe so.

> > There are many call paths in the cache manager that end up at this
> > function, most of which are not prepared to properly handle an
> > ERESTARTSYS return. Since this status can be returned after some
> > data has already been written, the correct behavior upon receiving
> > it is far from clear ... a path towards a client free of this
> > vector for data corruption may involve avoiding the dependence on
> > splice_from_pipe_next() in preference to adapting all call sites to
> > handle the ERESTARTSYS case.
>
> For the 1.6 release, this seems the best choice of action. The "real"
> fix would likely be difficult to completely test in a timely fashion.

That only helps if we know what the replacement would be... I am not a
Linux VFS expert and do not have any ideas right now.

-Ben
