On Mon, May 14, 2018 at 05:36:24PM +0200, Andreas Gruenbacher wrote:
> According to xfstest generic/240, applications seem to expect direct I/O
> writes to either complete as a whole or to fail; short direct I/O writes
> are apparently not appreciated.  This means that when only part of an
> asynchronous direct I/O write succeeds, we can either fail the entire
> write, or we can wait for the partial write to complete and retry
> the remaining write using buffered I/O.  The old __blockdev_direct_IO
> helper has code for waiting for partial writes to complete; the new
> iomap_dio_rw iomap helper does not.
> 
> The above-mentioned fallback mode is used by gfs2, which doesn't allow
> block allocations under direct I/O to avoid taking cluster-wide
> exclusive locks.  As a consequence, an asynchronous direct I/O write to
> a file range that ends in a hole will result in a short write.  When
> that happens, we want to retry the remaining write using buffered I/O.
> 
> To allow that, change iomap_dio_rw to wait for short direct I/O writes
> like __blockdev_direct_IO does instead of returning -EIOCBQUEUED.
> 
> This fixes xfstest generic/240 on gfs2.

The code looks pretty racy to me.  Why would gfs2 cause a short direct
I/O write to start with?  I suspect that is where the problem that needs
fixing is buried.
