[RFC PATCH 4/5] direct_IO: use iov_iter_rw() instead of rw everywhere
The rw parameter to direct_IO is redundant with iov_iter-type, and treated slightly differently just about everywhere it's used: some users do rw WRITE, and others do rw == WRITE where they should be doing a bitwise check. Simplify this with the new iov_iter_rw() helper, which always returns either READ or WRITE. Signed-off-by: Omar Sandoval osan...@osandov.com --- drivers/staging/lustre/lustre/llite/rw26.c | 18 +- fs/affs/file.c | 4 ++-- fs/btrfs/inode.c | 12 ++-- fs/ext2/inode.c| 2 +- fs/ext3/inode.c| 8 fs/ext4/ext4.h | 4 ++-- fs/ext4/indirect.c | 10 +- fs/ext4/inode.c| 20 ++-- fs/f2fs/data.c | 16 fs/fat/inode.c | 4 ++-- fs/fuse/file.c | 13 +++-- fs/gfs2/aops.c | 7 +++ fs/hfs/inode.c | 2 +- fs/hfsplus/inode.c | 2 +- fs/jfs/inode.c | 2 +- fs/nfs/direct.c| 2 +- fs/nilfs2/inode.c | 4 ++-- fs/ocfs2/aops.c| 2 +- fs/reiserfs/inode.c| 2 +- fs/udf/inode.c | 2 +- fs/xfs/xfs_aops.c | 2 +- 21 files changed, 69 insertions(+), 69 deletions(-) diff --git a/drivers/staging/lustre/lustre/llite/rw26.c b/drivers/staging/lustre/lustre/llite/rw26.c index 2f21304..3aa9de6 100644 --- a/drivers/staging/lustre/lustre/llite/rw26.c +++ b/drivers/staging/lustre/lustre/llite/rw26.c @@ -399,7 +399,7 @@ static ssize_t ll_direct_IO_26(int rw, struct kiocb *iocb, *size changing by concurrent truncates and writes. * 1. Need inode mutex to operate transient pages. */ - if (rw == READ) + if (iov_iter_rw(iter) == READ) mutex_lock(inode-i_mutex); LASSERT(obj-cob_transient_pages == 0); @@ -408,7 +408,7 @@ static ssize_t ll_direct_IO_26(int rw, struct kiocb *iocb, size_t offs; count = min_t(size_t, iov_iter_count(iter), size); - if (rw == READ) { + if (iov_iter_rw(iter) == READ) { if (file_offset = i_size_read(inode)) break; if (file_offset + count i_size_read(inode)) @@ -418,11 +418,11 @@ static ssize_t ll_direct_IO_26(int rw, struct kiocb *iocb, result = iov_iter_get_pages_alloc(iter, pages, count, offs); if (likely(result 0)) { int n = DIV_ROUND_UP(result + offs, PAGE_SIZE); - result = ll_direct_IO_26_seg(env, io, rw, inode, -file-f_mapping, -result, file_offset, -pages, n); - ll_free_user_pages(pages, n, rw==READ); + result = ll_direct_IO_26_seg(env, io, iov_iter_rw(iter), +inode, file-f_mapping, +result, file_offset, pages, +n); + ll_free_user_pages(pages, n, iov_iter_rw(iter) == READ); } if (unlikely(result = 0)) { /* If we can't allocate a large enough buffer @@ -449,11 +449,11 @@ static ssize_t ll_direct_IO_26(int rw, struct kiocb *iocb, } out: LASSERT(obj-cob_transient_pages == 0); - if (rw == READ) + if (iov_iter_rw(iter) == READ) mutex_unlock(inode-i_mutex); if (tot_bytes 0) { - if (rw == WRITE) { + if (iov_iter_rw(iter) == WRITE) { struct lov_stripe_md *lsm; lsm = ccc_inode_lsm_get(inode); diff --git a/fs/affs/file.c b/fs/affs/file.c index f3028aa..0314acd 100644 --- a/fs/affs/file.c +++ b/fs/affs/file.c @@ -398,7 +398,7 @@ affs_direct_IO(int rw, struct kiocb *iocb, struct iov_iter *iter, size_t count = iov_iter_count(iter); ssize_t ret; - if (rw == WRITE) { + if (iov_iter_rw(iter) == WRITE) { loff_t size = offset + count; if (AFFS_I(inode)-mmu_private size) @@ -406,7 +406,7 @@ affs_direct_IO(int rw, struct kiocb *iocb, struct iov_iter *iter, } ret = blockdev_direct_IO(iocb, inode, iter, offset, affs_get_block); - if (ret 0 (rw WRITE)) + if (ret 0 iov_iter_rw(iter) == WRITE) affs_write_failed(mapping, offset + count);
[RFC PATCH 0/5] Remove rw parameter from direct_IO()
Hi, Al, here's some cleanup that you mentioned back in December that I got around to (https://lkml.org/lkml/2014/12/15/28). In summary, the rw parameter to a_ops-direct_IO() is redundant with .type in struct iov_iter. Additionally, rw is inconsistently checked for being a WRITE; some filesystems do rw == WRITE, others do rw WRITE, and others do both within the same function :) The distinction is that swapout may OR in the ITER_BVEC flag in the rw passed to -direct_IO(), so the two are not equivalent (although this really only happens for swap-over-NFS, but it's scary nonetheless). After looking through all of these, it definitely looks like every check means for ANY write, not just non-kernel writes. So, the solution presented here is: - Add a helper, iov_iter_rw(), which always returns either READ or WRITE, no ITER_.* or REQ_.* nonsense mixed in. For consistency, the return value is always checked for equality - Get rid of all uses of rw in any implementations of direct_IO, starting with the generic code - Nuke the actual parameter and update the documentation I decided to squish all of the filesystems together in patch 4 to avoid inundating the mailing lists with 20+ mostly two-line patches, but I can split those out if that's any better. Additionally, patch 1 pulls fs.h into uio.h, which seems undesirable. These were mostly just compile tested, with a couple of direct I/O xfstests run on btrfs as quick sanity check, so getting some more eyes on is probably a good thing. They should apply on top of v4.0-rc4. Please comment away. Thank you, Omar Sandoval (5): new helper: iov_iter_rw() Remove rw from {,__,do_}blockdev_direct_IO() Remove rw from dax_{do_,}io() direct_IO: use iov_iter_rw() instead of rw everywhere direct_IO: remove rw from a_ops-direct_IO() Documentation/filesystems/Locking | 2 +- Documentation/filesystems/vfs.txt | 2 +- drivers/staging/lustre/lustre/llite/rw26.c | 22 - fs/9p/vfs_addr.c | 2 +- fs/affs/file.c | 9 +++ fs/block_dev.c | 8 +++--- fs/btrfs/inode.c | 24 +- fs/ceph/addr.c | 3 +-- fs/cifs/file.c | 3 +-- fs/dax.c | 27 ++--- fs/direct-io.c | 39 ++ fs/exofs/inode.c | 4 +-- fs/ext2/inode.c| 11 - fs/ext3/inode.c| 14 +-- fs/ext4/ext4.h | 4 +-- fs/ext4/indirect.c | 25 ++- fs/ext4/inode.c| 28 ++--- fs/f2fs/data.c | 22 - fs/fat/inode.c | 9 +++ fs/fuse/file.c | 16 ++-- fs/gfs2/aops.c | 16 ++-- fs/hfs/inode.c | 8 +++--- fs/hfsplus/inode.c | 9 +++ fs/jfs/inode.c | 8 +++--- fs/nfs/direct.c| 4 +-- fs/nilfs2/inode.c | 10 +++- fs/ocfs2/aops.c| 22 +++-- fs/reiserfs/inode.c| 8 +++--- fs/udf/file.c | 3 +-- fs/udf/inode.c | 7 +++--- fs/xfs/xfs_aops.c | 12 - include/linux/fs.h | 24 +- include/linux/nfs_fs.h | 2 +- include/linux/uio.h| 10 mm/filemap.c | 4 +-- mm/page_io.c | 4 +-- 36 files changed, 206 insertions(+), 219 deletions(-) -- 2.3.3 -- To unsubscribe from this list: send the line unsubscribe linux-nilfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH 1/5] new helper: iov_iter_rw()
On Mon, Mar 16, 2015 at 04:33:49AM -0700, Omar Sandoval wrote: Get either READ or WRITE out of iter-type. Umm... + * Get one of READ or WRITE out of iter-type without any other flags OR'd in + * with it. + */ +static inline int iov_iter_rw(const struct iov_iter *i) +{ + return i-type RW_MASK; +} TBH, I would turn that into a macro. Reason: indirect includes. How about #define iov_iter_rw(i) ((0 ? (struct iov_iter *)0 : (i))-type RW_MASK) Should do you all the type safety of inline function and avoids the need to include fs.h in uio.h; _users_ of iov_iter_rw() obviously still need fs.h, but such places always used to... -- To unsubscribe from this list: send the line unsubscribe linux-nilfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH 0/5] Remove rw parameter from direct_IO()
On Mon, Mar 16, 2015 at 04:33:48AM -0700, Omar Sandoval wrote: Hi, Al, here's some cleanup that you mentioned back in December that I got around to (https://lkml.org/lkml/2014/12/15/28). In summary, the rw parameter to a_ops-direct_IO() is redundant with .type in struct iov_iter. Additionally, rw is inconsistently checked for being a WRITE; some filesystems do rw == WRITE, others do rw WRITE, and others do both within the same function :) The distinction is that swapout may OR in the ITER_BVEC flag in the rw passed to -direct_IO(), so the two are not equivalent (although this really only happens for swap-over-NFS, but it's scary nonetheless). After looking through all of these, it definitely looks like every check means for ANY write, not just non-kernel writes. So, the solution presented here is: - Add a helper, iov_iter_rw(), which always returns either READ or WRITE, no ITER_.* or REQ_.* nonsense mixed in. For consistency, the return value is always checked for equality TBH, I'm not sure I like such calling conventions, but I guess we can live with that. I decided to squish all of the filesystems together in patch 4 to avoid inundating the mailing lists with 20+ mostly two-line patches, but I can split those out if that's any better. Additionally, patch 1 pulls fs.h into uio.h, which seems undesirable. ... and easily avoided if you use a macro instead of inline, without losing type safety or getting double evaluation, etc. Look: 0 ? (struct T *)0 : (x) always evaluates to x. Now look at 6.5.15p3 in C99: the second and the third arguments are both pointers, so we are left with p3.4 (both arguments are pointers to qualified or unqualified versions of compatible types), p3.5 (one operand is a pointer and another null pointer constant) and p3.6 (one operand is a pointer to an object or incomplete type, and the other is a pointer to qualified or unqualied version of void. The first variant means that x is a pointer to qualified or unqualified struct T; the type of result is, per 6.5.15p6, the same as that of x. The second variant means that x is a null pointer constant ((struct T *)0 isn't one) and result is a null pointer to T. The third one means that x is a pointer to qualified or unqualified void. The type of result is the same as that of x. Now note that your variant is no better wrt type safety; worse, actually, since it does accept any pointer to void. (0 ? (struct iov_iter *)0 : (x))-type will reject those. And we obviously don't have double evaluation here either. -- To unsubscribe from this list: send the line unsubscribe linux-nilfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[RFC PATCH 2/5] Remove rw from {,__,do_}blockdev_direct_IO()
Most filesystems call through to these at some point, so we'll start here. Signed-off-by: Omar Sandoval osan...@osandov.com --- fs/affs/file.c | 2 +- fs/block_dev.c | 5 ++--- fs/btrfs/inode.c| 8 fs/direct-io.c | 39 ++- fs/ext2/inode.c | 2 +- fs/ext3/inode.c | 2 +- fs/ext4/indirect.c | 11 ++- fs/ext4/inode.c | 2 +- fs/f2fs/data.c | 2 +- fs/fat/inode.c | 2 +- fs/gfs2/aops.c | 5 ++--- fs/hfs/inode.c | 2 +- fs/hfsplus/inode.c | 3 +-- fs/jfs/inode.c | 2 +- fs/nilfs2/inode.c | 3 +-- fs/ocfs2/aops.c | 16 +++- fs/reiserfs/inode.c | 2 +- fs/udf/inode.c | 2 +- fs/xfs/xfs_aops.c | 9 - include/linux/fs.h | 22 -- 20 files changed, 67 insertions(+), 74 deletions(-) diff --git a/fs/affs/file.c b/fs/affs/file.c index d2468bf..f3028aa 100644 --- a/fs/affs/file.c +++ b/fs/affs/file.c @@ -405,7 +405,7 @@ affs_direct_IO(int rw, struct kiocb *iocb, struct iov_iter *iter, return 0; } - ret = blockdev_direct_IO(rw, iocb, inode, iter, offset, affs_get_block); + ret = blockdev_direct_IO(iocb, inode, iter, offset, affs_get_block); if (ret 0 (rw WRITE)) affs_write_failed(mapping, offset + count); return ret; diff --git a/fs/block_dev.c b/fs/block_dev.c index 975266b..e3a3125 100644 --- a/fs/block_dev.c +++ b/fs/block_dev.c @@ -153,9 +153,8 @@ blkdev_direct_IO(int rw, struct kiocb *iocb, struct iov_iter *iter, struct file *file = iocb-ki_filp; struct inode *inode = file-f_mapping-host; - return __blockdev_direct_IO(rw, iocb, inode, I_BDEV(inode), iter, - offset, blkdev_get_block, - NULL, NULL, 0); + return __blockdev_direct_IO(iocb, inode, I_BDEV(inode), iter, offset, + blkdev_get_block, NULL, NULL, 0); } int __sync_blockdev(struct block_device *bdev, int wait) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index da828cf..d6413f4 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -8119,10 +8119,10 @@ static ssize_t btrfs_direct_IO(int rw, struct kiocb *iocb, wakeup = false; } - ret = __blockdev_direct_IO(rw, iocb, inode, - BTRFS_I(inode)-root-fs_info-fs_devices-latest_bdev, - iter, offset, btrfs_get_blocks_direct, NULL, - btrfs_submit_direct, flags); + ret = __blockdev_direct_IO(iocb, inode, + BTRFS_I(inode)-root-fs_info-fs_devices-latest_bdev, + iter, offset, btrfs_get_blocks_direct, NULL, + btrfs_submit_direct, flags); if (rw WRITE) { if (ret 0 ret != -EIOCBQUEUED) btrfs_delalloc_release_space(inode, count); diff --git a/fs/direct-io.c b/fs/direct-io.c index e181b6b..02a87d9 100644 --- a/fs/direct-io.c +++ b/fs/direct-io.c @@ -1094,10 +1094,10 @@ static inline int drop_refcount(struct dio *dio) * for the whole file. */ static inline ssize_t -do_blockdev_direct_IO(int rw, struct kiocb *iocb, struct inode *inode, - struct block_device *bdev, struct iov_iter *iter, loff_t offset, - get_block_t get_block, dio_iodone_t end_io, - dio_submit_t submit_io, int flags) +do_blockdev_direct_IO(struct kiocb *iocb, struct inode *inode, + struct block_device *bdev, struct iov_iter *iter, + loff_t offset, get_block_t get_block, dio_iodone_t end_io, + dio_submit_t submit_io, int flags) { unsigned i_blkbits = ACCESS_ONCE(inode-i_blkbits); unsigned blkbits = i_blkbits; @@ -,9 +,6 @@ do_blockdev_direct_IO(int rw, struct kiocb *iocb, struct inode *inode, struct blk_plug plug; unsigned long align = offset | iov_iter_alignment(iter); - if (rw WRITE) - rw = WRITE_ODIRECT; - /* * Avoid references to bdev if not absolutely needed to give * the early prefetch in the caller enough time. @@ -1128,7 +1125,7 @@ do_blockdev_direct_IO(int rw, struct kiocb *iocb, struct inode *inode, } /* watch out for a 0 len io from a tricksy fs */ - if (rw == READ !iov_iter_count(iter)) + if (iov_iter_rw(iter) == READ !iov_iter_count(iter)) return 0; dio = kmem_cache_alloc(dio_cache, GFP_KERNEL); @@ -1144,7 +1141,7 @@ do_blockdev_direct_IO(int rw, struct kiocb *iocb, struct inode *inode, dio-flags = flags; if (dio-flags DIO_LOCKING) { - if (rw == READ) { + if (iov_iter_rw(iter) == READ) { struct address_space *mapping = iocb-ki_filp-f_mapping; @@ -1170,19 +1167,19
[RFC PATCH 3/5] Remove rw from dax_{do_,}io()
And use iov_iter_rw() instead. Signed-off-by: Omar Sandoval osan...@osandov.com --- fs/dax.c | 27 +-- fs/ext2/inode.c| 4 ++-- fs/ext4/indirect.c | 4 ++-- fs/ext4/inode.c| 2 +- include/linux/fs.h | 4 ++-- 5 files changed, 20 insertions(+), 21 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index ed1619e..a278469 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -98,9 +98,9 @@ static bool buffer_size_valid(struct buffer_head *bh) return bh-b_state != 0; } -static ssize_t dax_io(int rw, struct inode *inode, struct iov_iter *iter, - loff_t start, loff_t end, get_block_t get_block, - struct buffer_head *bh) +static ssize_t dax_io(struct inode *inode, struct iov_iter *iter, + loff_t start, loff_t end, get_block_t get_block, + struct buffer_head *bh) { ssize_t retval = 0; loff_t pos = start; @@ -109,7 +109,7 @@ static ssize_t dax_io(int rw, struct inode *inode, struct iov_iter *iter, void *addr; bool hole = false; - if (rw != WRITE) + if (iov_iter_rw(iter) != WRITE) end = min(end, i_size_read(inode)); while (pos end) { @@ -124,7 +124,7 @@ static ssize_t dax_io(int rw, struct inode *inode, struct iov_iter *iter, bh-b_size = PAGE_ALIGN(end - pos); bh-b_state = 0; retval = get_block(inode, block, bh, - rw == WRITE); + iov_iter_rw(iter) == WRITE); if (retval) break; if (!buffer_size_valid(bh)) @@ -137,7 +137,7 @@ static ssize_t dax_io(int rw, struct inode *inode, struct iov_iter *iter, bh-b_size -= done; } - hole = (rw != WRITE) !buffer_written(bh); + hole = iov_iter_rw(iter) != WRITE !buffer_written(bh); if (hole) { addr = NULL; size = bh-b_size - first; @@ -154,7 +154,7 @@ static ssize_t dax_io(int rw, struct inode *inode, struct iov_iter *iter, max = min(pos + size, end); } - if (rw == WRITE) + if (iov_iter_rw(iter) == WRITE) len = copy_from_iter(addr, max - pos, iter); else if (!hole) len = copy_to_iter(addr, max - pos, iter); @@ -173,7 +173,6 @@ static ssize_t dax_io(int rw, struct inode *inode, struct iov_iter *iter, /** * dax_do_io - Perform I/O to a DAX file - * @rw: READ to read or WRITE to write * @iocb: The control block for this I/O * @inode: The file which the I/O is directed at * @iter: The addresses to do I/O from or to @@ -189,9 +188,9 @@ static ssize_t dax_io(int rw, struct inode *inode, struct iov_iter *iter, * As with do_blockdev_direct_IO(), we increment i_dio_count while the I/O * is in progress. */ -ssize_t dax_do_io(int rw, struct kiocb *iocb, struct inode *inode, - struct iov_iter *iter, loff_t pos, - get_block_t get_block, dio_iodone_t end_io, int flags) +ssize_t dax_do_io(struct kiocb *iocb, struct inode *inode, + struct iov_iter *iter, loff_t pos, get_block_t get_block, + dio_iodone_t end_io, int flags) { struct buffer_head bh; ssize_t retval = -EINVAL; @@ -199,7 +198,7 @@ ssize_t dax_do_io(int rw, struct kiocb *iocb, struct inode *inode, memset(bh, 0, sizeof(bh)); - if ((flags DIO_LOCKING) (rw == READ)) { + if ((flags DIO_LOCKING) iov_iter_rw(iter) == READ) { struct address_space *mapping = inode-i_mapping; mutex_lock(inode-i_mutex); retval = filemap_write_and_wait_range(mapping, pos, end - 1); @@ -212,9 +211,9 @@ ssize_t dax_do_io(int rw, struct kiocb *iocb, struct inode *inode, /* Protects against truncate */ atomic_inc(inode-i_dio_count); - retval = dax_io(rw, inode, iter, pos, end, get_block, bh); + retval = dax_io(inode, iter, pos, end, get_block, bh); - if ((flags DIO_LOCKING) (rw == READ)) + if ((flags DIO_LOCKING) iov_iter_rw(iter) == READ) mutex_unlock(inode-i_mutex); if ((retval 0) end_io) diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c index 00979a6..cf95bda 100644 --- a/fs/ext2/inode.c +++ b/fs/ext2/inode.c @@ -861,8 +861,8 @@ ext2_direct_IO(int rw, struct kiocb *iocb, struct iov_iter *iter, ssize_t ret; if (IS_DAX(inode)) - ret = dax_do_io(rw, iocb, inode, iter, offset, ext2_get_block, - NULL, DIO_LOCKING); +
[RFC PATCH 1/5] new helper: iov_iter_rw()
Get either READ or WRITE out of iter-type. Signed-off-by: Omar Sandoval osan...@osandov.com --- include/linux/uio.h | 10 ++ 1 file changed, 10 insertions(+) diff --git a/include/linux/uio.h b/include/linux/uio.h index 7188029..87a47b3 100644 --- a/include/linux/uio.h +++ b/include/linux/uio.h @@ -10,6 +10,7 @@ #define __LINUX_UIO_H #include linux/kernel.h +#include linux/fs.h #include uapi/linux/uio.h struct page; @@ -111,6 +112,15 @@ static inline bool iter_is_iovec(struct iov_iter *i) } /* + * Get one of READ or WRITE out of iter-type without any other flags OR'd in + * with it. + */ +static inline int iov_iter_rw(const struct iov_iter *i) +{ + return i-type RW_MASK; +} + +/* * Cap the iov_iter by given limit; note that the second argument is * *not* the new size - it's upper limit for such. Passing it a value * greater than the amount of data in iov_iter is fine - it'll just do -- 2.3.3 -- To unsubscribe from this list: send the line unsubscribe linux-nilfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html