Re: [PATCH 03/11] fs: add support for an inode to carry stream related data

2017-06-15 Thread Christoph Hellwig
On Wed, Jun 14, 2017 at 09:45:04PM -0600, Jens Axboe wrote: > No functional changes in this patch, just in preparation for > allowing applications to pass in hints about data life times > for writes. > > Pack the i_write_hint field into a 2-byte hole, so we don't grow > the size of the inode. A

Re: [RFC PATCH 0/4] Allow file systems to selectively bypass dm-crypt

2017-06-15 Thread Milan Broz
On 06/15/2017 01:40 AM, Michael Halcrow wrote: > Several file systems either have already implemented encryption or are > in the process of doing so. This addresses usability and storage > isolation requirements on mobile devices and in multi-tenant > environments. > > While distinct keys locked

Re: [PATCH 02/11] blk-mq: expose stream write stats through debugfs

2017-06-15 Thread Christoph Hellwig
On Wed, Jun 14, 2017 at 09:45:03PM -0600, Jens Axboe wrote: > Useful to verify that things are working the way they should. > Reading the file will return number of kb written to each > stream. Writing the file will reset the statistics. No care > is taken to ensure that we don't race on updates.

Re: [PATCH v6 12/20] fs: add a new fstype flag to indicate how writeback errors are tracked

2017-06-15 Thread Christoph Hellwig
On Wed, Jun 14, 2017 at 01:24:43PM -0400, Jeff Layton wrote: > In this smaller set, it's only really used for DAX. DAX only is implemented by three filesystems, please just fix them up in one go. > sync_file_range: ->fsync isn't called directly there, and I think we > probably want similar

Re: [PATCHSET v4] Add support for write life time hints

2017-06-15 Thread Christoph Hellwig
On Wed, Jun 14, 2017 at 09:45:01PM -0600, Jens Axboe wrote: > A new iteration of this patchset, previously known as write streams. > As before, this patchset aims at enabling applications to split up > writes into separate streams, based on the perceived life time > of the data written. This is

Re: [PATCH 04/11] fs: add support for allowing applications to pass in write life time hints

2017-06-15 Thread Christoph Hellwig
I think Darrick has a very valid concern here - using RWF_* flags to affect inode or fd-wide state is extremely counter productive. Combined with the fact that the streams need a special setup in NVMe I'm tempted to say that the interface really should be fadvise or similar, which would keep the

[PATCH 4/5] RFC: mmc: block: Convert RPMB to a character device

2017-06-15 Thread Linus Walleij
The RPMB partition on the eMMC devices is a special area used for storing cryptographically safe information signed by a special secret key. To write and read records from this special area, authentication is needed. The RPMB area is *only* and *exclusively* accessed using ioctl():s from

[PATCH 2/5] mmc: block: Refactor mmc_blk_part_switch()

2017-06-15 Thread Linus Walleij
Instead of passing a struct mmc_blk_data * to mmc_blk_part_switch() let's pass the actual partition type we want to switch to. This is necessary in order not to have a block device with a backing mmc_blk_data and request queue and all for every hardware partition, such as RPMB. Signed-off-by:

[PATCH 3/5] mmc: block: Reparametrize mmc_blk_ioctl_[multi]_cmd()

2017-06-15 Thread Linus Walleij
Instead of passing a block device to mmc_blk_ioctl[_multi]_cmd(), let's pass struct mmc_blk_data() so we operate ioctl()s on the MMC block device representation rather than the vanilla block device. This saves a little duplicated code and makes it possible to issue ioctl()s not targeted for a

[PATCH 5/5] mmc: block: Delete mmc_access_rpmb()

2017-06-15 Thread Linus Walleij
This function is used by the block layer queue to bail out of requests if the current request is an RPMB request. However this makes no sense: RPMB is only used from ioctl():s, there are no RPMB accesses coming from the block layer. An RPMB ioctl() always switches to the RPMB partition and then

[PATCH 1/5] mmc: block: Move duplicate check

2017-06-15 Thread Linus Walleij
mmc_blk_ioctl() calls either mmc_blk_ioctl_cmd() or mmc_blk_ioctl_multi_cmd() and each of these make the same check. Factor it into a new helper function, call it on both branches of the switch() statement and save a chunk of duplicate code. Cc: Shawn Lin Signed-off-by:

[PATCH 0/5] Convert RPMB block device to a character device

2017-06-15 Thread Linus Walleij
Looking for ways to get rid of the RPMB "block device" and the extra block queue. This is one approach, I don't know if it will stick, let's discuss it, especially the RFC patch. Patches 1,2,3 can be applied as cleanups unless they collide with something else. Patch 5 is a consequence of the

Re: [PATCH v6 12/20] fs: add a new fstype flag to indicate how writeback errors are tracked

2017-06-15 Thread Jeff Layton
On Thu, 2017-06-15 at 01:22 -0700, Christoph Hellwig wrote: > On Wed, Jun 14, 2017 at 01:24:43PM -0400, Jeff Layton wrote: > > In this smaller set, it's only really used for DAX. > > DAX only is implemented by three filesystems, please just fix them > up in one go. > Ok. > > sync_file_range:

Re: [PATCH 04/11] fs: add support for allowing applications to pass in write life time hints

2017-06-15 Thread Al Viro
On Wed, Jun 14, 2017 at 09:15:03PM -0700, Darrick J. Wong wrote: > > + */ > > +#define RWF_WRITE_LIFE_SHIFT 4 > > +#define RWF_WRITE_LIFE_MASK 0x00f0 /* 4 bits of stream > > ID */ > > +#define RWF_WRITE_LIFE_SHORT (1 << RWF_WRITE_LIFE_SHIFT) > >
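
The defines quoted above place a write life time hint in a 4-bit field of the RWF_* flags. A minimal Python sketch of that packing follows, for illustration only: the shift, mask, and SHORT value are taken from the quoted patch, while the helper names and any hint number other than 1 are hypothetical and not from the thread.

```python
# Userspace model of the quoted bit layout; this is not kernel code.
RWF_WRITE_LIFE_SHIFT = 4
RWF_WRITE_LIFE_MASK = 0x00F0  # 4 bits of stream ID
RWF_WRITE_LIFE_SHORT = 1 << RWF_WRITE_LIFE_SHIFT


def pack_write_life(flags: int, hint: int) -> int:
    """Return flags with the 4-bit hint field replaced (hypothetical helper)."""
    max_hint = RWF_WRITE_LIFE_MASK >> RWF_WRITE_LIFE_SHIFT
    if not 0 <= hint <= max_hint:
        raise ValueError("hint does not fit in the 4-bit field")
    # Clear the old field, then place the new hint at the shift position.
    return (flags & ~RWF_WRITE_LIFE_MASK) | (hint << RWF_WRITE_LIFE_SHIFT)


def unpack_write_life(flags: int) -> int:
    """Extract the 4-bit hint field from flags (hypothetical helper)."""
    return (flags & RWF_WRITE_LIFE_MASK) >> RWF_WRITE_LIFE_SHIFT
```

Packing hint 1 into empty flags yields RWF_WRITE_LIFE_SHORT, and bits outside the mask are left untouched.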

Re: [PATCH 0/10 v11] No wait AIO

2017-06-15 Thread Al Viro
On Mon, Jun 12, 2017 at 11:14:31PM -0700, Christoph Hellwig wrote: > On Mon, Jun 12, 2017 at 05:38:13PM -0500, Goldwyn Rodrigues wrote: > > We had FS_NOWAIT in filesystem type flags (in v3), but retracted it > > later in v4. > > A per-fs flag is wrong as file_operation may have different >

Re: [PATCH V2 12/12] block: use standard blktrace API to output cgroup info for debug notes

2017-06-15 Thread kbuild test robot
Hi Shaohua, [auto build test ERROR on linus/master] [also build test ERROR on v4.12-rc5] [cannot apply to driver-core/driver-core-testing block/for-next next-20170615] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com

Re: [PATCHSET v4] Add support for write life time hints

2017-06-15 Thread Jens Axboe
On 06/15/2017 02:12 AM, Christoph Hellwig wrote: > On Wed, Jun 14, 2017 at 09:45:01PM -0600, Jens Axboe wrote: >> A new iteration of this patchset, previously known as write streams. >> As before, this patchset aims at enabling applications to split up >> writes into separate streams, based on the

Re: [PATCH 03/11] fs: add support for an inode to carry stream related data

2017-06-15 Thread Jens Axboe
On 06/15/2017 02:17 AM, Christoph Hellwig wrote: > On Wed, Jun 14, 2017 at 09:45:04PM -0600, Jens Axboe wrote: >> No functional changes in this patch, just in preparation for >> allowing applications to pass in hints about data life times >> for writes. >> >> Pack the i_write_hint field into a

Re: [PATCH 02/11] blk-mq: expose stream write stats through debugfs

2017-06-15 Thread Jens Axboe
On 06/15/2017 02:16 AM, Christoph Hellwig wrote: > On Wed, Jun 14, 2017 at 09:45:03PM -0600, Jens Axboe wrote: >> Useful to verify that things are working the way they should. >> Reading the file will return number of kb written to each >> stream. Writing the file will reset the statistics. No

Re: [PATCH 04/11] fs: add support for allowing applications to pass in write life time hints

2017-06-15 Thread Jens Axboe
On 06/15/2017 02:19 AM, Christoph Hellwig wrote: > I think Darrick has a very valid concern here - using RWF_* flags > to affect inode or fd-wide state is extremely counter productive. > > Combined with the fact that the streams need a special setup in NVMe > I'm tempted to say that the interface

[PATCH 01/10] fs: Separate out kiocb flags setup based on RWF_* flags

2017-06-15 Thread Goldwyn Rodrigues
From: Goldwyn Rodrigues Reviewed-by: Christoph Hellwig Reviewed-by: Jan Kara Signed-off-by: Goldwyn Rodrigues --- fs/read_write.c | 12 +++- include/linux/fs.h | 14 ++ 2 files changed, 17 insertions(+),

[PATCH 07/10] block: return on congested block device

2017-06-15 Thread Goldwyn Rodrigues
From: Goldwyn Rodrigues A new bio operation flag REQ_NOWAIT is introduced to identify bios originating from iocb with IOCB_NOWAIT. This flag indicates to return immediately if a request cannot be made instead of retrying. Stacked devices such as md (the ones with

[PATCH 09/10] xfs: nowait aio support

2017-06-15 Thread Goldwyn Rodrigues
From: Goldwyn Rodrigues If IOCB_NOWAIT is set, bail if the i_rwsem is not lockable immediately. If IOMAP_NOWAIT is set, return EAGAIN in xfs_file_iomap_begin if it needs allocation either due to file extension, writing to a hole, or COW or waiting for other DIOs to finish.

[PATCH 04/10] fs: Introduce RWF_NOWAIT and FMODE_AIO_NOWAIT

2017-06-15 Thread Goldwyn Rodrigues
From: Goldwyn Rodrigues RWF_NOWAIT informs kernel to bail out if an AIO request will block for reasons such as file allocations, or a writeback triggered, or would block while allocating requests while performing direct I/O. RWF_NOWAIT is translated to IOCB_NOWAIT for

[PATCH 02/10] fs: Introduce filemap_range_has_page()

2017-06-15 Thread Goldwyn Rodrigues
From: Goldwyn Rodrigues filemap_range_has_page() returns true if the file's mapping has a page within the range mentioned. This function will be used to check if a write() call will cause a writeback of previous writes. Reviewed-by: Christoph Hellwig Reviewed-by:
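
The check described above can be modelled in userspace as a page-index overlap test. This Python sketch is only an illustration under stated assumptions: the page cache is represented as a plain collection of page indices, pages are 4 KiB, the byte range is treated as inclusive, and the function name range_has_page is hypothetical rather than the kernel implementation.

```python
from typing import Iterable

PAGE_SHIFT = 12  # assumption: 4 KiB pages


def range_has_page(cached_pages: Iterable[int], start_byte: int, end_byte: int) -> bool:
    """Return True if any cached page index overlaps [start_byte, end_byte]."""
    if end_byte < start_byte:
        return False
    # Convert the inclusive byte range to an inclusive page-index range.
    first = start_byte >> PAGE_SHIFT
    last = end_byte >> PAGE_SHIFT
    return any(first <= index <= last for index in cached_pages)
```

A caller modelling the nowait write path would treat a True result as "this write may have to wait for writeback".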

[PATCH 03/10] fs: Use RWF_* flags for AIO operations

2017-06-15 Thread Goldwyn Rodrigues
From: Goldwyn Rodrigues aio_rw_flags is introduced in struct iocb (using aio_reserved1) which will carry the RWF_* flags. We cannot use aio_flags because they are not checked for validity which may break existing applications. Note, the only place RWF_HIPRI comes into effect is

[PATCH 05/10] fs: return if direct write will trigger writeback

2017-06-15 Thread Goldwyn Rodrigues
From: Goldwyn Rodrigues Find out if the write will trigger a wait due to writeback. If yes, return -EAGAIN. Return -EINVAL for buffered AIO: there are multiple causes of delay such as page locks, dirty throttling logic, page loading from disk etc. which cannot be taken care

[PATCH 0/10 v12] No wait AIO

2017-06-15 Thread Goldwyn Rodrigues
This series adds a nonblocking feature to asynchronous I/O writes. io_submit() can be delayed for a number of reasons: - Block allocation for files - Data writebacks for direct I/O - Sleeping because of waiting to acquire i_rwsem - Congested block device The goal of the patch series is to

[PATCH 06/10] fs: Introduce IOMAP_NOWAIT

2017-06-15 Thread Goldwyn Rodrigues
From: Goldwyn Rodrigues IOCB_NOWAIT translates to IOMAP_NOWAIT for iomaps. This is used by XFS in the XFS patch. Reviewed-by: Christoph Hellwig Reviewed-by: Jan Kara Signed-off-by: Goldwyn Rodrigues --- fs/iomap.c|

[PATCH 08/10] ext4: nowait aio support

2017-06-15 Thread Goldwyn Rodrigues
From: Goldwyn Rodrigues Return EAGAIN if any of the following checks fail for direct I/O: + i_rwsem is lockable + Writing beyond end of file (will trigger allocation) + Blocks are not allocated at the write location Signed-off-by: Goldwyn Rodrigues

[PATCH 10/10] btrfs: nowait aio support

2017-06-15 Thread Goldwyn Rodrigues
From: Goldwyn Rodrigues Return EAGAIN if any of the following checks fail + i_rwsem is not lockable + NODATACOW or PREALLOC is not set + Cannot nocow at the desired location + Writing beyond end of file which is not allocated Acked-by: David Sterba

Re: [RFC PATCH 0/4] Allow file systems to selectively bypass dm-crypt

2017-06-15 Thread Milan Broz
On 06/15/2017 07:24 PM, Michael Halcrow wrote: ... >> If this is accepted, we basically allow attacker to trick system to >> write plaintext to media just by setting this flag. This must never >> ever happen with FDE - BY DESIGN. > > That's an important point. This expands the attack surface to

[GIT PULL] Block fix for 4.12-rc

2017-06-15 Thread Jens Axboe
Hi Linus, Just a single fix this week, fixing a regression introduced in this series. When we put the final reference to the queue, we may need to block. Ensure that we can safely do so. From Bart. Please pull! git://git.kernel.dk/linux-block.git for-linus

Re: [PATCH 0/10 v12] No wait AIO

2017-06-15 Thread Andrew Morton
On Thu, 15 Jun 2017 10:59:52 -0500 Goldwyn Rodrigues wrote: > This series adds a nonblocking feature to asynchronous I/O writes. > io_submit() can be delayed for a number of reasons: > - Block allocation for files > - Data writebacks for direct I/O > - Sleeping because

Re: [PATCH 02/10] fs: Introduce filemap_range_has_page()

2017-06-15 Thread Andrew Morton
On Thu, 15 Jun 2017 10:59:54 -0500 Goldwyn Rodrigues wrote: > From: Goldwyn Rodrigues > > filemap_range_has_page() returns true if the file's mapping has > a page within the range mentioned. This function will be used > to check if a write() call will cause

Re: [PATCH V2 05/12] kernfs: introduce kernfs_node_id

2017-06-15 Thread kbuild test robot
Hi Shaohua, [auto build test ERROR on linus/master] [also build test ERROR on v4.12-rc5 next-20170615] [cannot apply to driver-core/driver-core-testing block/for-next] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com

[PATCH V3 00/12] blktrace: output cgroup info

2017-06-15 Thread Shaohua Li
From: Shaohua Li Hi, Currently blktrace isn't cgroup aware. blktrace prints out task name of current context, but the task of current context isn't always in the cgroup where the BIO comes from. We can't use task name to find out IO cgroup. For example, Writeback BIOs always comes

[PATCH V3 01/12] kernfs: use idr instead of ida to manage inode number

2017-06-15 Thread Shaohua Li
From: Shaohua Li kernfs uses ida to manage inode number. The problem is we can't get kernfs_node from inode number with ida. Switching to use idr, next patch will add an API to get kernfs_node from inode number. Signed-off-by: Shaohua Li --- fs/kernfs/dir.c|

[PATCH V3 08/12] blktrace: export cgroup info in trace

2017-06-15 Thread Shaohua Li
From: Shaohua Li Currently blktrace isn't cgroup aware. blktrace prints out task name of current context, but the task of current context isn't always in the cgroup where the BIO comes from. We can't use task name to find out IO cgroup. For example, Writeback BIOs always comes from

[PATCH V3 10/12] block: call __bio_free in bio_endio

2017-06-15 Thread Shaohua Li
From: Shaohua Li bio_free isn't a good place to free cgroup/integrity info. There are a lot of cases bio is allocated in special way (for example, in stack) and never gets called by bio_put hence bio_free, we are leaking memory. This patch moves the free to bio endio, which should

[PATCH V3 06/12] kernfs: add exportfs operations

2017-06-15 Thread Shaohua Li
From: Shaohua Li Now we have the facilities to implement exportfs operations. The idea is cgroup can export the fhandle info to userspace, then userspace uses fhandle to find the cgroup name. Another example is userspace can get fhandle for a cgroup and BPF uses the fhandle to

Re: [RFC PATCH 0/4] Allow file systems to selectively bypass dm-crypt

2017-06-15 Thread Michael Halcrow
On Thu, Jun 15, 2017 at 09:33:39AM +0200, Milan Broz wrote: > On 06/15/2017 01:40 AM, Michael Halcrow wrote: > > Several file systems either have already implemented encryption or are > > in the process of doing so. This addresses usability and storage > > isolation requirements on mobile devices

[PATCH V3 03/12] kernfs: add an API to get kernfs node from inode number

2017-06-15 Thread Shaohua Li
From: Shaohua Li Add an API to get kernfs node from inode number. We will need this to implement exportfs operations. To make the API lock free, kernfs node is freed in RCU context. And we depend on kernfs_node count/ino number to filter stale kernfs nodes. Signed-off-by: Shaohua

[PATCH V3 12/12] block: use standard blktrace API to output cgroup info for debug notes

2017-06-15 Thread Shaohua Li
From: Shaohua Li Currently cfq/bfq/blk-throttle output cgroup info in trace in their own way. Now we have standard blktrace API for this, so convert them to use it. Note, this changes the behavior a little bit. cgroup info isn't output by default, we only do this with 'blk_cgroup'

[PATCH V3 04/12] kernfs: don't set dentry->d_fsdata

2017-06-15 Thread Shaohua Li
From: Shaohua Li When working on adding exportfs operations in kernfs, I found it's hard to initialize dentry->d_fsdata in the exportfs operations. Looks there is no way to do it without race condition. Look at the kernfs code closely, there is no point to set dentry->d_fsdata.

[PATCH V3 05/12] kernfs: introduce kernfs_node_id

2017-06-15 Thread Shaohua Li
From: Shaohua Li inode number and generation can identify a kernfs node. We are going to export the identification by exportfs operations, so put ino and generation into a separate structure. It's convenient when later patches use the identification. Signed-off-by: Shaohua Li

[PATCH V3 11/12] blktrace: add an option to allow displaying cgroup path

2017-06-15 Thread Shaohua Li
From: Shaohua Li By default we output cgroup id in blktrace. This adds an option to display cgroup path. Since getting the cgroup path is a relatively heavy operation, we don't enable it by default. With the option enabled, blktrace will output something like this: dd-1353 [007] d..2

[PATCH V3 07/12] cgroup: export fhandle info for a cgroup

2017-06-15 Thread Shaohua Li
From: Shaohua Li Add an API to export cgroup fhandle info. We don't export a full 'struct file_handle', since it contains info we don't need. Specifically, a cgroup is always a directory, so we don't need a 'FILEID_INO32_GEN_PARENT' type fhandle; we only need to export the inode number and

[PATCH V3 09/12] block: always attach cgroup info into bio

2017-06-15 Thread Shaohua Li
From: Shaohua Li blkcg_bio_issue_check() already gets blkcg for a BIO. bio_associate_blkcg() uses a percpu refcounter, so it's a very cheap operation. There is no point we don't attach the cgroup info into bio at blkcg_bio_issue_check. This also makes blktrace outputs correct cgroup

Re: [PATCH 0/10 v12] No wait AIO

2017-06-15 Thread Andrew Morton
On Thu, 15 Jun 2017 16:51:41 -0500 Goldwyn Rodrigues wrote: > > I have only minor quibbles - I'll grab the patch series for some -next > > testing (at least). > > > > I agree to the quibbles you have on patch 02/10. Should I send the > entire fixed series, just the 02/10

RFC: pwritev2 regression test for invalid flags

2017-06-15 Thread Adhemerval Zanella
After the issue with LO_HI_LONG definition on x86_64-linux-gnu, I planned to add this patch to check the above patch for correct checking of invalid flags (which would also have shown this issue with LO_HI_LONG being used on p{read,write}v2). However it seems to trigger what I think is a kernel

Re: RFC: pwritev2 regression test for invalid flags

2017-06-15 Thread Jon Derrick
Hi Zanella, On 06/15/2017 04:10 PM, Adhemerval Zanella wrote: > After the issue with LO_HI_LONG definition on x86_64-linux-gnu, I planned to > add > this patch to check the above patch for correct check for invalid flags (which > would also have shown this issue with LO_HI_LONG being used on >

Re: [PATCH 0/10 v12] No wait AIO

2017-06-15 Thread Goldwyn Rodrigues
On 06/15/2017 01:25 PM, Andrew Morton wrote: > On Thu, 15 Jun 2017 10:59:52 -0500 Goldwyn Rodrigues wrote: > >> This series adds a nonblocking feature to asynchronous I/O writes. >> io_submit() can be delayed for a number of reasons: >> - Block allocation for files >> -

Re: [PATCH 1/5] mmc: block: Move duplicate check

2017-06-15 Thread Shawn Lin
Hi, On 2017/6/15 20:12, Linus Walleij wrote: mmc_blk_ioctl() calls either mmc_blk_ioctl_cmd() or mmc_blk_ioctl_multi_cmd() and each of these make the same check. Factor it into a new helper function, call it on both branches of the switch() statement and save a chunk of duplicate code. Cc:

Re: [PATCH 0/10 v12] No wait AIO

2017-06-15 Thread Goldwyn Rodrigues
On 06/15/2017 05:01 PM, Andrew Morton wrote: > On Thu, 15 Jun 2017 16:51:41 -0500 Goldwyn Rodrigues wrote: > >>> I have only minor quibbles - I'll grab the patch series for some -next >>> testing (at least). >>> >> >> I agree to the quibbles you have on patch 02/10. Should I

[PATCH 1/2] loop: use filp_close() rather than fput()

2017-06-15 Thread NeilBrown
When a loop device is being shutdown the backing file is closed with fput(). This is different from how close(2) closes files - it uses filp_close(). The difference is important for filesystems which provide a ->flush file operation such as NFS. NFS assumes a flush will always be called on last

[PATCH 0/2] Two fixes for loop devices

2017-06-15 Thread NeilBrown
Hi Jens, one of these is a resend of a patch I sent a while back. The other is new - loop closes files differently from close() and in a way that can confuse NFS. Thanks, NeilBrown --- NeilBrown (2): loop: use filp_close() rather than fput() loop: Add PF_LESS_THROTTLE to

[PATCH 2/2] loop: Add PF_LESS_THROTTLE to block/loop device thread.

2017-06-15 Thread NeilBrown
When a filesystem is mounted from a loop device, writes are throttled by balance_dirty_pages() twice: once when writing to the filesystem and once when the loop_handle_cmd() writes to the backing file. This double-throttling can trigger positive feedback loops that create significant delays. The

Re: [PATCH 00/13] block: assorted cleanup for bio splitting and cloning.

2017-06-15 Thread NeilBrown
On Thu, May 11 2017, NeilBrown wrote: > On Tue, May 02 2017, NeilBrown wrote: > >> This is a revision of my series of patches working >> towards removing the bioset work queues. > > Hi Jens, > could I get some feed-back about your thoughts on this series? > Will you apply it? When? Do I need

Re: [PATCH v6 12/20] fs: add a new fstype flag to indicate how writeback errors are tracked

2017-06-15 Thread Christoph Hellwig
On Thu, Jun 15, 2017 at 06:42:12AM -0400, Jeff Layton wrote: > Correct. > > But if there is a data writeback error, should we report an error on all > open fds at that time (like we will for fsync)? We should in theory, but I don't see how to properly do it. In addition sync_file_range just

Re: [PATCH 04/11] fs: add support for allowing applications to pass in write life time hints

2017-06-15 Thread Jens Axboe
On 06/15/2017 08:21 AM, Jens Axboe wrote: > On 06/15/2017 02:19 AM, Christoph Hellwig wrote: >> I think Darrick has a very valid concern here - using RWF_* flags >> to affect inode or fd-wide state is extremely counter productive. >> >> Combined with the fact that the streams need a special setup

Re: [PATCH 0/10 v11] No wait AIO

2017-06-15 Thread Christoph Hellwig
On Thu, Jun 15, 2017 at 12:11:58PM +0100, Al Viro wrote: > Which flags are you talking about? aio ones? AFAICS, it's the same > kind of thing as "can we lseek?" or "can we read/pread?", etc. > What would that field look like? Note that some of those might depend > upon the flags passed to

Re: [PATCH v6 12/20] fs: add a new fstype flag to indicate how writeback errors are tracked

2017-06-15 Thread Jeff Layton
On Thu, 2017-06-15 at 07:57 -0700, Christoph Hellwig wrote: > On Thu, Jun 15, 2017 at 06:42:12AM -0400, Jeff Layton wrote: > > Correct. > > > > But if there is a data writeback error, should we report an error on all > > open fds at that time (like we will for fsync)? > > We should in theory,

[PATCH 02/12] blk-mq: expose stream write stats through debugfs

2017-06-15 Thread Jens Axboe
Useful to verify that things are working the way they should. Reading the file will return number of kb written to each stream. Writing the file will reset the statistics. No care is taken to ensure that we don't race on updates. Drivers will write to q->stream_writes[] if they handle a stream.

[PATCHSET v5] Add support for write life time hints

2017-06-15 Thread Jens Axboe
A new iteration of this patchset, previously known as write streams. As before, this patchset aims at enabling applications to split up writes into separate streams, based on the perceived life time of the data written. This is useful for a variety of reasons: - For NVMe, this feature is ratified

[PATCH 06/12] block: add helpers for setting/checking write hint validity

2017-06-15 Thread Jens Axboe
We map the WRITE_HINT_* life time hints to the internal flags. Drivers can then, in turn, map those flags to a suitable stream type. Signed-off-by: Jens Axboe --- block/bio.c | 16 include/linux/bio.h | 1 + include/linux/blk_types.h | 5

[PATCH 09/12] ext4: add support for passing in write hints for buffered writes

2017-06-15 Thread Jens Axboe
Reviewed-by: Andreas Dilger Signed-off-by: Jens Axboe --- fs/ext4/page-io.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c index 1a82138ba739..764bf0ddecd4 100644 --- a/fs/ext4/page-io.c +++ b/fs/ext4/page-io.c @@

[PATCH 08/12] fs: add support for buffered writeback to pass down write hints

2017-06-15 Thread Jens Axboe
Reviewed-by: Andreas Dilger Signed-off-by: Jens Axboe --- fs/buffer.c | 14 +- fs/mpage.c | 1 + 2 files changed, 10 insertions(+), 5 deletions(-) diff --git a/fs/buffer.c b/fs/buffer.c index 161be58c5cb0..3faf73a71d4b 100644 --- a/fs/buffer.c

[PATCH 11/12] btrfs: add support for passing in write hints for buffered writes

2017-06-15 Thread Jens Axboe
Reviewed-by: Andreas Dilger Signed-off-by: Chris Mason Signed-off-by: Jens Axboe --- fs/btrfs/extent_io.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index d3619e010005..2bc2dfca87c2 100644 ---

[PATCH 10/12] xfs: add support for passing in write hints for buffered writes

2017-06-15 Thread Jens Axboe
Reviewed-by: Andreas Dilger Signed-off-by: Jens Axboe --- fs/xfs/xfs_aops.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c index 09af0f7cd55e..fe11fe47d235 100644 --- a/fs/xfs/xfs_aops.c +++ b/fs/xfs/xfs_aops.c @@

[PATCH 03/12] fs: add support for an inode to carry write hint related data

2017-06-15 Thread Jens Axboe
No functional changes in this patch, just in preparation for allowing applications to pass in hints about data life times for writes. Set aside 3 bits for carrying hint information in the inode flags. Adds the public hints as well, which are: WRITE_HINT_NONE No hints about write life

[PATCH 05/12] fs: add fcntl() interface for setting/getting write life time hints

2017-06-15 Thread Jens Axboe
We have a pwritev2(2) interface based on passing in flags. Add an fcntl interface for querying these flags, and for setting them as well: F_GET_WRITE_LIFE Returns one of the valid types of write hints, like WRITE_HINT_MEDIUM. F_SET_WRITE_LIFE Pass in a

[PATCH 12/12] nvme: add support for streams and directives

2017-06-15 Thread Jens Axboe
This adds support for Directives in NVMe, particularly for the Streams directive. Support for Directives is a new feature in NVMe 1.3. It allows a user to pass in information about where to store the data, so that the device can do so most efficiently. If an application is managing and writing data

[PATCH 07/12] fs: add O_DIRECT support for sending down bio stream information

2017-06-15 Thread Jens Axboe
Reviewed-by: Andreas Dilger Signed-off-by: Jens Axboe --- fs/block_dev.c | 2 ++ fs/direct-io.c | 2 ++ fs/iomap.c | 1 + 3 files changed, 5 insertions(+) diff --git a/fs/block_dev.c b/fs/block_dev.c index 51959936..de4301168710 100644 ---

[PATCH 04/12] fs: add support for allowing applications to pass in write life time hints

2017-06-15 Thread Jens Axboe
Add four flags for the pwritev2(2) system call, allowing an application to give the kernel a hint about what on-media life times can be expected from a given write. The intent is for these values to be relative to each other, no absolute meaning should be attached to these flag names. Set aside

Re: [PATCH 07/10] block: return on congested block device

2017-06-15 Thread Jens Axboe
On 06/15/2017 09:59 AM, Goldwyn Rodrigues wrote: > From: Goldwyn Rodrigues > > A new bio operation flag REQ_NOWAIT is introduced to identify bios > originating from iocb with IOCB_NOWAIT. This flag indicates > to return immediately if a request cannot be made instead > of