Re: [lkp-robot] [scsi] ebc76736f2: fio.write_bw_MBps -4% regression

2017-06-19 Thread Ye Xiaolong
Hi, Christoph On 06/19, Christoph Hellwig wrote: >On Mon, Jun 19, 2017 at 04:52:36PM +0800, Ye Xiaolong wrote: >> >I've not seen a compile-time option for the MQ I/O scheduler (unlike >> >the legacy one), so the way to change it would be to echo the name to >> >/sys/block//queue/scheduler >> >>
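
The sysfs mechanism mentioned in the quoted reply can be sketched in user-space C. This is a minimal illustration, not code from the thread: the helper names (`scheduler_path`, `active_scheduler`, `set_scheduler`) are hypothetical, the device name is a caller-supplied assumption, and the parser relies on the sysfs convention of marking the active scheduler with square brackets.

```c
#include <stdio.h>
#include <string.h>

/* Build "/sys/block/<dev>/queue/scheduler" for a device name such as "sda".
 * Returns 0 on success, -1 if the buffer is too small. */
int scheduler_path(const char *dev, char *out, size_t outlen)
{
    int n = snprintf(out, outlen, "/sys/block/%s/queue/scheduler", dev);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/* The sysfs file lists the available schedulers and marks the active one
 * with square brackets, e.g. "mq-deadline [kyber] none".
 * Copies the active name into out; returns 0 on success, -1 otherwise. */
int active_scheduler(const char *line, char *out, size_t outlen)
{
    const char *open = strchr(line, '[');
    const char *close = open ? strchr(open, ']') : NULL;
    size_t len;

    if (!open || !close)
        return -1;
    len = (size_t)(close - open - 1);
    if (len + 1 > outlen)
        return -1;
    memcpy(out, open + 1, len);
    out[len] = '\0';
    return 0;
}

/* Select a scheduler by writing its name to the sysfs file (usually needs
 * root). Returns 0 on success, -1 on any failure. */
int set_scheduler(const char *dev, const char *name)
{
    char path[256];
    FILE *f;
    int rc;

    if (scheduler_path(dev, path, sizeof(path)) != 0)
        return -1;
    f = fopen(path, "w");
    if (!f)
        return -1;
    rc = (fputs(name, f) >= 0) ? 0 : -1;
    if (fclose(f) != 0)
        rc = -1;
    return rc;
}
```

A caller would typically read the file first, pass its contents to `active_scheduler()` to see what is in effect, and only then call `set_scheduler()` with one of the listed names.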

Re: [PATCH 04/11] fs: add support for allowing applications to pass in write life time hints

2017-06-19 Thread Jens Axboe
On 06/19/2017 02:33 PM, Jens Axboe wrote: > On 06/19/2017 01:10 PM, Jens Axboe wrote: >> On 06/19/2017 01:00 PM, Jens Axboe wrote: >>> On 06/19/2017 12:58 PM, Christoph Hellwig wrote: On Mon, Jun 19, 2017 at 10:02:09AM -0600, Jens Axboe wrote: > Actually, one good use case is O_DIRECT on

Re: [PATCH 0/10 v13] merge request: No wait AIO

2017-06-19 Thread Al Viro
On Mon, Jun 19, 2017 at 05:36:05PM -0600, Jens Axboe wrote: > On 06/19/2017 05:34 PM, Al Viro wrote: > > On Mon, Jun 19, 2017 at 05:15:16PM -0600, Jens Axboe wrote: > >> On 06/19/2017 10:33 AM, Goldwyn Rodrigues wrote: > >>> Jens, > >>> > >>> As Christoph suggested, I am sending the patches

[patch 55/55] genirq/affinity: Assign vectors to all present CPUs

2017-06-19 Thread Thomas Gleixner
From: Christoph Hellwig Currently the irq vector spread algorithm is restricted to online CPUs, which ties the IRQ mapping to the currently online devices and doesn't deal nicely with the fact that CPUs could come and go rapidly due to e.g. power management. Instead assign vectors

Re: [PATCH 0/10 v13] merge request: No wait AIO

2017-06-19 Thread Jens Axboe
On 06/19/2017 05:34 PM, Al Viro wrote: > On Mon, Jun 19, 2017 at 05:15:16PM -0600, Jens Axboe wrote: >> On 06/19/2017 10:33 AM, Goldwyn Rodrigues wrote: >>> Jens, >>> >>> As Christoph suggested, I am sending the patches against the block >>> tree for merge since the block layer changes had the

Re: [PATCH 0/10 v13] merge request: No wait AIO

2017-06-19 Thread Al Viro
On Mon, Jun 19, 2017 at 05:15:16PM -0600, Jens Axboe wrote: > On 06/19/2017 10:33 AM, Goldwyn Rodrigues wrote: > > Jens, > > > > As Christoph suggested, I am sending the patches against the block > > tree for merge since the block layer changes had the most conflicts. > > My tree is at

Re: [PATCH v7 00/22] fs: enhanced writeback error reporting with errseq_t (pile #1)

2017-06-19 Thread Stephen Rothwell
Hi Jeff, On Mon, 19 Jun 2017 12:23:46 -0400 Jeff Layton wrote: > > If there are no major objections to this set, I'd like to have > linux-next start picking it up to get some wider testing. What's the > right vehicle for this, given that it touches stuff all over the tree? >

Re: [PATCH 0/10 v13] merge request: No wait AIO

2017-06-19 Thread Goldwyn Rodrigues
On 06/19/2017 06:15 PM, Jens Axboe wrote: > On 06/19/2017 10:33 AM, Goldwyn Rodrigues wrote: >> Jens, >> >> As Christoph suggested, I am sending the patches against the block >> tree for merge since the block layer changes had the most conflicts. >> My tree is at

Re: [PATCH 0/10 v13] merge request: No wait AIO

2017-06-19 Thread Jens Axboe
On 06/19/2017 10:33 AM, Goldwyn Rodrigues wrote: > Jens, > > As Christoph suggested, I am sending the patches against the block > tree for merge since the block layer changes had the most conflicts. > My tree is at https://github.com/goldwynr/linux/tree/nowait-block I can merge it for 4.13, but

Re: [PATCH v4 12/12] blk-mq: Warn when attempting to run a hardware queue that is not mapped

2017-06-19 Thread Bart Van Assche
On Mon, 2017-06-19 at 17:06 -0600, Jens Axboe wrote: > On 06/19/2017 04:08 PM, Bart Van Assche wrote: > > --- a/block/blk-mq.c > > +++ b/block/blk-mq.c > > @@ -1140,8 +1140,9 @@ static int blk_mq_hctx_next_cpu(struct blk_mq_hw_ctx > > *hctx) > > static void __blk_mq_delay_run_hw_queue(struct

Re: [PATCH v4 12/12] blk-mq: Warn when attempting to run a hardware queue that is not mapped

2017-06-19 Thread Jens Axboe
On 06/19/2017 04:08 PM, Bart Van Assche wrote: > From: Bart Van Assche > > A queue must be frozen while the mapped state of a hardware queue > is changed. Additionally, any change of the mapped state is > followed by a call to blk_mq_map_swqueue() (see also >

Re: [PATCH v4 03/12] block: Introduce request_queue.initialize_rq_fn()

2017-06-19 Thread Jens Axboe
On 06/19/2017 04:07 PM, Bart Van Assche wrote: > From: Bart Van Assche > > Several block drivers need to initialize the driver-private request > data after having called blk_get_request() and before .prep_rq_fn() > is called, e.g. when submitting a REQ_OP_SCSI_*

[PATCH v4 08/12] block: Check locking assumptions at runtime

2017-06-19 Thread Bart Van Assche
From: Bart Van Assche Instead of documenting the locking assumptions of most block layer functions as a comment, use lockdep_assert_held() to verify locking assumptions at runtime. Signed-off-by: Bart Van Assche Reviewed-by: Christoph

[PATCH v4 12/12] blk-mq: Warn when attempting to run a hardware queue that is not mapped

2017-06-19 Thread Bart Van Assche
From: Bart Van Assche A queue must be frozen while the mapped state of a hardware queue is changed. Additionally, any change of the mapped state is followed by a call to blk_mq_map_swqueue() (see also blk_mq_init_allocated_queue() and blk_mq_update_nr_hw_queues()).

[PATCH v4 11/12] block: Constify disk_type

2017-06-19 Thread Bart Van Assche
From: Bart Van Assche The variable 'disk_type' is never modified so constify it. Signed-off-by: Bart Van Assche Reviewed-by: Christoph Hellwig Cc: Hannes Reinecke Cc: Omar Sandoval Cc: Ming

[PATCH v4 01/12] blk-mq: Reduce blk_mq_hw_ctx size

2017-06-19 Thread Bart Van Assche
From: Bart Van Assche Since the srcu structure is rather large (184 bytes on an x86-64 system with kernel debugging disabled), only allocate it if needed. Reported-by: Ming Lei Signed-off-by: Bart Van Assche

[PATCH v4 10/12] blk-mq: Document locking assumptions

2017-06-19 Thread Bart Van Assche
From: Bart Van Assche Document the locking assumptions in functions that modify blk_mq_ctx.rq_list to make it easier for humans to verify this code. Signed-off-by: Bart Van Assche Reviewed-by: Christoph Hellwig Cc: Hannes

[PATCH v4 07/12] block: Add a comment above queue_lockdep_assert_held()

2017-06-19 Thread Bart Van Assche
From: Bart Van Assche Add a comment above the queue_lockdep_assert_held() macro that explains the purpose of the q->queue_lock test. Signed-off-by: Bart Van Assche Reviewed-by: Christoph Hellwig Cc: Hannes Reinecke

[PATCH v4 03/12] block: Introduce request_queue.initialize_rq_fn()

2017-06-19 Thread Bart Van Assche
From: Bart Van Assche Several block drivers need to initialize the driver-private request data after having called blk_get_request() and before .prep_rq_fn() is called, e.g. when submitting a REQ_OP_SCSI_* request. Avoid that that initialization code has to be

[PATCH v4 04/12] block: Make most scsi_req_init() calls implicit

2017-06-19 Thread Bart Van Assche
From: Bart Van Assche Instead of explicitly calling scsi_req_init() after blk_get_request(), call that function from inside blk_get_request(). Add an .initialize_rq_fn() callback function to the block drivers that need it. Merge the IDE .init_rq_fn() function into

[PATCH v4 09/12] block: Document what queue type each function is intended for

2017-06-19 Thread Bart Van Assche
From: Bart Van Assche Some functions in block/blk-core.c must only be used on blk-sq queues while others are safe to use against any queue type. Document which functions are intended for blk-sq queues and issue a warning if the blk-sq API is misused. This does not

[PATCH v4 06/12] blk-mq: Initialize .rq_flags in blk_mq_rq_ctx_init()

2017-06-19 Thread Bart Van Assche
From: Bart Van Assche Initialization of blk-mq requests is a bit weird: blk_mq_rq_ctx_init() is called after a value has been assigned to .rq_flags and .rq_flags is initialized in __blk_mq_finish_request(). Initialize .rq_flags in blk_mq_rq_ctx_init() instead of

[PATCH v4 05/12] block: Change argument type of scsi_req_init()

2017-06-19 Thread Bart Van Assche
From: Bart Van Assche Since scsi_req_init() works on a struct scsi_request, change the argument type into struct scsi_request *. Signed-off-by: Bart Van Assche Reviewed-by: Christoph Hellwig Reviewed-by: Hannes Reinecke

[PATCH v4 00/12] More patches for kernel v4.13

2017-06-19 Thread Bart Van Assche
Hello Jens, This patch series contains one patch that reduces the size of struct blk_mq_hw_ctx, a few patches that simplify some of the block layer code and also patches that improve the block layer documentation. Please consider these patches for kernel v4.13. The basis for these patches is

[PATCH v4 02/12] block: Make request operation type argument declarations consistent

2017-06-19 Thread Bart Van Assche
From: Bart Van Assche Instead of declaring the second argument of blk_*_get_request() as int and passing it to functions that expect an unsigned int, declare that second argument as unsigned int. Also because of consistency, rename that second argument from 'rw' into

Re: [PATCH 4/5] RFC: mmc: block: Convert RPMB to a character device

2017-06-19 Thread Tomas Winkler
On Fri, Jun 16, 2017 at 10:22 AM, Avri Altman wrote: > Hi, > >> -Original Message- >> From: Linus Walleij [mailto:linus.wall...@linaro.org] >> Sent: Thursday, June 15, 2017 3:13 PM >> To: linux-...@vger.kernel.org; Ulf Hansson >> Cc:

Re: [PATCH 4/5] RFC: mmc: block: Convert RPMB to a character device

2017-06-19 Thread Tomas Winkler
>>> Currently the RPMB partition spawns a separate block device >>> named /dev/mmcblkNrpmb for each device with an RPMB partition, >>> including the creation of a block queue with its own kernel >>> thread and all overhead associated with this. On the Ux500 >>> HREFv60 platform, for example, the

Re: [PATCH 4/5] RFC: mmc: block: Convert RPMB to a character device

2017-06-19 Thread Tomas Winkler
> The RPMB partition on the eMMC devices is a special area used > for storing cryptographically safe information signed by a > special secret key. To write and read records from this special > area, authentication is needed. > > The RPMB area is *only* and *exclusively* accessed using > ioctl():s

Re: [PATCH 04/11] fs: add support for allowing applications to pass in write life time hints

2017-06-19 Thread Jens Axboe
On 06/19/2017 01:10 PM, Jens Axboe wrote: > On 06/19/2017 01:00 PM, Jens Axboe wrote: >> On 06/19/2017 12:58 PM, Christoph Hellwig wrote: >>> On Mon, Jun 19, 2017 at 10:02:09AM -0600, Jens Axboe wrote: Actually, one good use case is O_DIRECT on a block device. Since I'm not a huge fan of

Re: [PATCH 3/5] mmc: block: Reparametrize mmc_blk_ioctl_[multi]_cmd()

2017-06-19 Thread Tomas Winkler
> Instead of passing a block device to > mmc_blk_ioctl[_multi]_cmd(), let's pass struct mmc_blk_data() > so we operate ioctl()s on the MMC block device representation > rather than the vanilla block device. > > This saves a little duplicated code and makes it possible to > issue ioctl()s not

Re: [PATCH 2/5] mmc: block: Refactor mmc_blk_part_switch()

2017-06-19 Thread Tomas Winkler
On Thu, Jun 15, 2017 at 3:12 PM, Linus Walleij wrote: > > Instead of passing a struct mmc_blk_data * to mmc_blk_part_switch() > let's pass the actual partition type we want to switch to. This > is necessary in order not to have a block device with a backing >

Re: [PATCH v3 00/12] More patches for kernel v4.13

2017-06-19 Thread Bart Van Assche
On Sun, 2017-06-18 at 20:55 -0600, Jens Axboe wrote: > On 06/08/2017 11:33 AM, Bart Van Assche wrote: > > This patch series contains one patch that reduces the size of struct > > blk_mq_hw_ctx, a few patches that simplify some of the block layer code and > > also patches that improve the block

Re: [PATCH V3 11/12] blktrace: add an option to allow displying cgroup path

2017-06-19 Thread Tejun Heo
Hello, On Thu, Jun 15, 2017 at 11:17:19AM -0700, Shaohua Li wrote: > From: Shaohua Li > > By default we output cgroup id in blktrace. This adds an option to > display cgroup path. Since getting the cgroup path is a relatively heavy > operation, we don't enable it by default. > > with the

Re: [PATCH 04/11] fs: add support for allowing applications to pass in write life time hints

2017-06-19 Thread Jens Axboe
On 06/19/2017 01:00 PM, Jens Axboe wrote: > On 06/19/2017 12:58 PM, Christoph Hellwig wrote: >> On Mon, Jun 19, 2017 at 10:02:09AM -0600, Jens Axboe wrote: >>> Actually, one good use case is O_DIRECT on a block device. Since I'm >>> not a huge fan of having per-call hints that is only useful for a

Re: [PATCH V3 09/12] block: always attach cgroup info into bio

2017-06-19 Thread Tejun Heo
On Thu, Jun 15, 2017 at 11:17:17AM -0700, Shaohua Li wrote: > From: Shaohua Li > > blkcg_bio_issue_check() already gets blkcg for a BIO. > bio_associate_blkcg() uses a percpu refcounter, so it's a very cheap > operation. There is no point we don't attach the cgroup info into bio at

Re: [PATCH V3 06/12] kernfs: add exportfs operations

2017-06-19 Thread Tejun Heo
Hello, On Thu, Jun 15, 2017 at 11:17:14AM -0700, Shaohua Li wrote: > -static int kernfs_fill_super(struct super_block *sb, unsigned long magic) > +static int kernfs_fill_super(struct super_block *sb, unsigned long magic, > + bool enable_expop) Hmm... can't we make this a

Re: [PATCH 11/11] nvme: add support for streams and directives

2017-06-19 Thread Jens Axboe
On 06/19/2017 12:53 PM, Christoph Hellwig wrote: > On Mon, Jun 19, 2017 at 08:53:08AM -0600, Jens Axboe wrote: >> Looking at it a bit more closely - there's a difference between >> assigning X number of streams (allocating) for use by the subsystem or >> per-ns, and having to manually open them.

Re: [PATCH V3 05/12] kernfs: introduce kernfs_node_id

2017-06-19 Thread Tejun Heo
Hello, On Thu, Jun 15, 2017 at 11:17:13AM -0700, Shaohua Li wrote: > +/* represent a kernfs node */ > +struct kernfs_node_id { > + u32 ino; > + u32 generation; > +} __attribute__((packed)); Can we just make it a u64? kernfs cares about the details
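
One way to read the "just make it a u64" suggestion is to carry the inode number and generation in a single 64-bit value. The sketch below is a user-space illustration only: `struct node_id`, `node_id_pack()` and `node_id_unpack()` are hypothetical names, and the bit layout (generation in the upper half) is an assumption, not something the truncated reply specifies.

```c
#include <stdint.h>

/* Hypothetical user-space mirror of the two-field id quoted above. */
struct node_id {
    uint32_t ino;
    uint32_t generation;
};

/* Pack both fields into one 64-bit value: generation in the upper 32 bits,
 * inode number in the lower 32 bits (assumed layout). */
uint64_t node_id_pack(struct node_id id)
{
    return ((uint64_t)id.generation << 32) | id.ino;
}

/* Recover the two 32-bit fields from the packed value. */
struct node_id node_id_unpack(uint64_t v)
{
    struct node_id id;

    id.ino = (uint32_t)(v & 0xffffffffu);
    id.generation = (uint32_t)(v >> 32);
    return id;
}
```

With a fixed shift-based layout, callers that only need an opaque id can compare or hash the `uint64_t` directly, while code that cares about the details can still split it.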

Re: [PATCH V3 04/12] kernfs: don't set dentry->d_fsdata

2017-06-19 Thread Tejun Heo
On Thu, Jun 15, 2017 at 11:17:12AM -0700, Shaohua Li wrote: > From: Shaohua Li > > When working on adding exportfs operations in kernfs, I found it's hard > to initialize dentry->d_fsdata in the exportfs operations. Looks there > is no way to do it without race condition. Look at

Re: [PATCH 04/11] fs: add support for allowing applications to pass in write life time hints

2017-06-19 Thread Jens Axboe
On 06/19/2017 12:58 PM, Christoph Hellwig wrote: > On Mon, Jun 19, 2017 at 10:02:09AM -0600, Jens Axboe wrote: >> Actually, one good use case is O_DIRECT on a block device. Since I'm >> not a huge fan of having per-call hints that is only useful for a >> single case, how about we add the hints to

Re: [PATCH V3 03/12] kernfs: add an API to get kernfs node from inode number

2017-06-19 Thread Tejun Heo
Hello, On Thu, Jun 15, 2017 at 11:17:11AM -0700, Shaohua Li wrote: > diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c > index 33f711f..7a4f327 100644 > --- a/fs/kernfs/dir.c > +++ b/fs/kernfs/dir.c > @@ -508,6 +508,10 @@ void kernfs_put(struct kernfs_node *kn) > struct kernfs_node *parent; >

Re: [PATCH 04/11] fs: add support for allowing applications to pass in write life time hints

2017-06-19 Thread Christoph Hellwig
On Mon, Jun 19, 2017 at 10:02:09AM -0600, Jens Axboe wrote: > Actually, one good use case is O_DIRECT on a block device. Since I'm > not a huge fan of having per-call hints that is only useful for a > single case, how about we add the hints to the struct file as well? > For buffered IO, just grab

Re: [PATCH 11/11] nvme: add support for streams and directives

2017-06-19 Thread Christoph Hellwig
On Mon, Jun 19, 2017 at 08:53:08AM -0600, Jens Axboe wrote: > Looking at it a bit more closely - there's a difference between > assigning X number of streams (allocating) for use by the subsystem or > per-ns, and having to manually open them. So I don't necessarily think > there's a problem here,

Re: [PATCH V3 01/12] kernfs: use idr instead of ida to manage inode number

2017-06-19 Thread Tejun Heo
On Thu, Jun 15, 2017 at 11:17:09AM -0700, Shaohua Li wrote: > From: Shaohua Li > > kernfs uses ida to manage inode number. The problem is we can't get > kernfs_node from inode number with ida. Switching to use idr, next patch > will add an API to get kernfs_node from inode number. >

Re: [PATCH V3 02/12] kernfs: implement i_generation

2017-06-19 Thread Tejun Heo
On Thu, Jun 15, 2017 at 11:17:10AM -0700, Shaohua Li wrote: > From: Shaohua Li > > Set i_generation for kernfs inode. This is required to implement > exportfs operations. The generation is 32-bit, so it's possible the > generation wraps up and we find stale files. To reduce the

Re: [PATCH v7 15/22] dax: set errors in mapping when writeback fails

2017-06-19 Thread Ross Zwisler
On Sat, Jun 17, 2017 at 08:39:53AM -0400, Jeff Layton wrote: > On Fri, 2017-06-16 at 15:34 -0400, Jeff Layton wrote: > > Jan Kara's description for this patch is much better than mine, so I'm > > quoting it verbatim here: > > > > DAX currently doesn't set errors in the mapping when cache flushing

[PATCH 7/9] xfs: add support for passing in write hints for buffered writes

2017-06-19 Thread Jens Axboe
Reviewed-by: Andreas Dilger Signed-off-by: Jens Axboe --- fs/xfs/xfs_aops.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c index 76b6f988e2fa..e4d9d470402c 100644 --- a/fs/xfs/xfs_aops.c +++ b/fs/xfs/xfs_aops.c @@

[PATCH 2/9] block: add support for write hints in a bio

2017-06-19 Thread Jens Axboe
No functional changes in this patch, we just set aside 3 bits in the bio/request flags, which can be used to hold a WRITE_LIFE_* life time hint. Ensure that we don't merge requests that have different life time hints assigned to them. Signed-off-by: Jens Axboe ---
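
The idea of reserving three flag bits for a hint and refusing to merge requests with different hints can be sketched as follows. This is an illustration under stated assumptions: the `WRITE_LIFE_*` names follow the enum quoted elsewhere in this listing, while `HINT_SHIFT`, `HINT_MASK` and the three helper functions are hypothetical and do not reflect the actual bit positions used by the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hint values as in the enum quoted from the earlier posting. */
enum write_life {
    WRITE_LIFE_NONE = 0,
    WRITE_LIFE_SHORT,
    WRITE_LIFE_MEDIUM,
    WRITE_LIFE_LONG,
    WRITE_LIFE_EXTREME,
};

#define HINT_SHIFT 8u   /* hypothetical bit position inside the flags word */
#define HINT_BITS  3u   /* three bits hold the values 0..7 */
#define HINT_MASK  (((1u << HINT_BITS) - 1u) << HINT_SHIFT)

/* Store a hint in the reserved bits, leaving all other flags untouched. */
uint32_t flags_set_hint(uint32_t flags, enum write_life hint)
{
    return (flags & ~HINT_MASK) | (((uint32_t)hint << HINT_SHIFT) & HINT_MASK);
}

/* Read the hint back out of the reserved bits. */
enum write_life flags_get_hint(uint32_t flags)
{
    return (enum write_life)((flags & HINT_MASK) >> HINT_SHIFT);
}

/* Two requests may only be merged if they carry the same hint. */
bool hints_allow_merge(uint32_t a, uint32_t b)
{
    return (a & HINT_MASK) == (b & HINT_MASK);
}
```

Five hint values fit in three bits with room to spare, which is why a small fixed-width field is enough here.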

[PATCH 3/9] blk-mq: expose stream write hints through debugfs

2017-06-19 Thread Jens Axboe
Useful to verify that things are working the way they should. Reading the file will return number of kb written with each write hint. Writing the file will reset the statistics. No care is taken to ensure that we don't race on updates. Drivers will write to q->write_hints[] if they handle a given

[PATCH 6/9] ext4: add support for passing in write hints for buffered writes

2017-06-19 Thread Jens Axboe
Reviewed-by: Andreas Dilger Signed-off-by: Jens Axboe --- fs/ext4/page-io.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c index 930ca0fc9a0f..92834b702728 100644 --- a/fs/ext4/page-io.c +++ b/fs/ext4/page-io.c @@

[PATCH 5/9] fs: add support for buffered writeback to pass down write hints

2017-06-19 Thread Jens Axboe
Reviewed-by: Andreas Dilger Signed-off-by: Jens Axboe --- fs/buffer.c | 14 +- fs/mpage.c | 1 + 2 files changed, 10 insertions(+), 5 deletions(-) diff --git a/fs/buffer.c b/fs/buffer.c index 306b720f7383..1259524715c8 100644 --- a/fs/buffer.c

[PATCH 1/9] fs: add fcntl() interface for setting/getting write life time hints

2017-06-19 Thread Jens Axboe
Define a set of write life time hints: and add an fcntl interface for querying these flags, and also for setting them as well: F_GET_RW_HINT Returns the read/write hint set. F_SET_RW_HINT Pass one of the above write hints. The user passes in a 64-bit pointer to get/set
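
From user space, the fcntl interface described above might be used roughly as follows. This is a hedged sketch: the fallback constants are the values found in later mainline kernel headers (`F_LINUX_SPECIFIC_BASE + 11` and `+ 12`, `RWH_WRITE_LIFE_SHORT` = 2) and may not match this v8 posting, and `set_rw_hint()` / `get_rw_hint()` are hypothetical wrapper names. On kernels without the feature the calls simply fail with a negative errno.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>

/* Fallbacks for libc headers that lack these definitions. The values are
 * assumptions taken from later mainline uapi headers. */
#ifndef F_LINUX_SPECIFIC_BASE
#define F_LINUX_SPECIFIC_BASE 1024
#endif
#ifndef F_GET_RW_HINT
#define F_GET_RW_HINT (F_LINUX_SPECIFIC_BASE + 11)
#endif
#ifndef F_SET_RW_HINT
#define F_SET_RW_HINT (F_LINUX_SPECIFIC_BASE + 12)
#endif
#ifndef RWH_WRITE_LIFE_SHORT
#define RWH_WRITE_LIFE_SHORT 2
#endif

/* Set the write life time hint; the kernel reads a 64-bit value through the
 * pointer. Returns 0 on success or a negative errno value. */
int set_rw_hint(int fd, uint64_t hint)
{
    return fcntl(fd, F_SET_RW_HINT, &hint) == 0 ? 0 : -errno;
}

/* Query the current hint into *hint.
 * Returns 0 on success or a negative errno value. */
int get_rw_hint(int fd, uint64_t *hint)
{
    return fcntl(fd, F_GET_RW_HINT, hint) == 0 ? 0 : -errno;
}
```

An application would call `set_rw_hint(fd, RWH_WRITE_LIFE_SHORT)` once after opening a file and treat a negative return as "hints unsupported", continuing without them.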

[PATCH 4/9] fs: add O_DIRECT support for sending down write life time hints

2017-06-19 Thread Jens Axboe
Reviewed-by: Andreas Dilger Signed-off-by: Jens Axboe --- fs/block_dev.c | 2 ++ fs/direct-io.c | 2 ++ fs/iomap.c | 1 + 3 files changed, 5 insertions(+) diff --git a/fs/block_dev.c b/fs/block_dev.c index dd91c99e9ba0..30e1fb65c2fa 100644 ---

[PATCH 9/9] nvme: add support for streams and directives

2017-06-19 Thread Jens Axboe
This adds support for Directives in NVMe, particularly for the Streams directive. Support for Directives is a new feature in NVMe 1.3. It allows a user to pass in information about where to store the data, so that the device can do so most efficiently. If an application is managing and writing data

[PATCH 8/9] btrfs: add support for passing in write hints for buffered writes

2017-06-19 Thread Jens Axboe
Reviewed-by: Andreas Dilger Signed-off-by: Chris Mason Signed-off-by: Jens Axboe --- fs/btrfs/extent_io.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 19eedf2e630b..3e57cfaa6dd6 100644 ---

[PATCHSET v8] Add support for write life time hints

2017-06-19 Thread Jens Axboe
A new iteration of this patchset, previously known as write streams. As before, this patchset aims at enabling applications to split up writes into separate streams, based on the perceived life time of the data written. This is useful for a variety of reasons: - For NVMe, this feature is ratified

[PATCH 08/10] ext4: nowait aio support

2017-06-19 Thread Goldwyn Rodrigues
From: Goldwyn Rodrigues Return EAGAIN if any of the following checks fail for direct I/O: + i_rwsem is lockable + Writing beyond end of file (will trigger allocation) + Blocks are not allocated at the write location Signed-off-by: Goldwyn Rodrigues

[PATCH 09/10] xfs: nowait aio support

2017-06-19 Thread Goldwyn Rodrigues
From: Goldwyn Rodrigues If IOCB_NOWAIT is set, bail if the i_rwsem is not lockable immediately. If IOMAP_NOWAIT is set, return EAGAIN in xfs_file_iomap_begin if it needs allocation either due to file extension, writing to a hole, or COW or waiting for other DIOs to finish.

[PATCH 06/10] fs: Introduce IOMAP_NOWAIT

2017-06-19 Thread Goldwyn Rodrigues
From: Goldwyn Rodrigues IOCB_NOWAIT translates to IOMAP_NOWAIT for iomaps. This is used by XFS in the XFS patch. Reviewed-by: Christoph Hellwig Reviewed-by: Jan Kara Signed-off-by: Goldwyn Rodrigues --- fs/iomap.c|

[PATCH 10/10] btrfs: nowait aio support

2017-06-19 Thread Goldwyn Rodrigues
From: Goldwyn Rodrigues Return EAGAIN if any of the following checks fail + i_rwsem is not lockable + NODATACOW or PREALLOC is not set + Cannot nocow at the desired location + Writing beyond end of file which is not allocated Acked-by: David Sterba

[PATCH 04/10] fs: Introduce RWF_NOWAIT and FMODE_AIO_NOWAIT

2017-06-19 Thread Goldwyn Rodrigues
From: Goldwyn Rodrigues RWF_NOWAIT informs the kernel to bail out if an AIO request will block for reasons such as file allocations, or a writeback triggered, or would block while allocating requests while performing direct I/O. RWF_NOWAIT is translated to IOCB_NOWAIT for

[PATCH 02/10] fs: Introduce filemap_range_has_page()

2017-06-19 Thread Goldwyn Rodrigues
From: Goldwyn Rodrigues filemap_range_has_page() returns true if the file's mapping has a page within the range mentioned. This function will be used to check if a write() call will cause a writeback of previous writes. Reviewed-by: Christoph Hellwig Reviewed-by:

[PATCH 07/10] block: return on congested block device

2017-06-19 Thread Goldwyn Rodrigues
From: Goldwyn Rodrigues A new bio operation flag REQ_NOWAIT is introduced to identify bios originating from an iocb with IOCB_NOWAIT. This flag indicates to return immediately if a request cannot be made instead of retrying. Stacked devices such as md (the ones with

[PATCH 03/10] fs: Use RWF_* flags for AIO operations

2017-06-19 Thread Goldwyn Rodrigues
From: Goldwyn Rodrigues aio_rw_flags is introduced in struct iocb (using aio_reserved1) which will carry the RWF_* flags. We cannot use aio_flags because they are not checked for validity which may break existing applications. Note, the only place RWF_HIPRI comes in effect is

[PATCH 05/10] fs: return if direct I/O will trigger writeback

2017-06-19 Thread Goldwyn Rodrigues
From: Goldwyn Rodrigues Find out if the I/O will trigger a wait due to writeback. If yes, return -EAGAIN. Return -EINVAL for buffered AIO: there are multiple causes of delay such as page locks, dirty throttling logic, page loading from disk etc. which cannot be taken care of.

[PATCH 01/10] fs: Separate out kiocb flags setup based on RWF_* flags

2017-06-19 Thread Goldwyn Rodrigues
From: Goldwyn Rodrigues Also added RWF_SUPPORTED to encompass all flags. Reviewed-by: Christoph Hellwig Reviewed-by: Jan Kara Signed-off-by: Goldwyn Rodrigues --- fs/read_write.c | 12 +++- include/linux/fs.h

[PATCH 0/10 v13] merge request: No wait AIO

2017-06-19 Thread Goldwyn Rodrigues
Jens, As Christoph suggested, I am sending the patches against the block tree for merge since the block layer changes had the most conflicts. My tree is at https://github.com/goldwynr/linux/tree/nowait-block This series adds nonblocking feature to asynchronous I/O writes. io_submit() can be

Re: [PATCH rfc 25/30] nvme: move control plane handling to nvme core

2017-06-19 Thread Sagi Grimberg
+static void nvme_free_io_queues(struct nvme_ctrl *ctrl) +{ + int i; + + for (i = 1; i < ctrl->queue_count; i++) + ctrl->ops->free_hw_queue(ctrl, i); +} + +void nvme_stop_io_queues(struct nvme_ctrl *ctrl) +{ + int i; + + for (i = 1; i < ctrl->queue_count;

Re: [PATCH v7 00/22] fs: enhanced writeback error reporting with errseq_t (pile #1)

2017-06-19 Thread Jeff Layton
On Fri, 2017-06-16 at 15:34 -0400, Jeff Layton wrote: > v7: > === > This is the seventh posting of the patchset to revamp the way writeback > errors are tracked and reported. > > The main difference from the v6 posting is the removal of the > FS_WB_ERRSEQ flag. That requires a few other

Re: [PATCH rfc 28/30] nvme: update tagset nr_hw_queues when reallocating io queues

2017-06-19 Thread Sagi Grimberg
Signed-off-by: Sagi Grimberg Could use a changelog. Ming: does this solve your problem of not seeing the new queues after a qemu CPU hotplug + reset? The issue I observed is that there isn't NVMe reset triggered after CPU becomes online. This won't help with that. It

Re: [PATCH 04/11] fs: add support for allowing applications to pass in write life time hints

2017-06-19 Thread Jens Axboe
On 06/19/2017 08:56 AM, Jens Axboe wrote: > On 06/19/2017 12:27 AM, Christoph Hellwig wrote: >> On Sat, Jun 17, 2017 at 01:59:47PM -0600, Jens Axboe wrote: >>> Add four flags for the pwritev2(2) system call, allowing an application >>> to give the kernel a hint about what on-media life times can

Re: [PATCH 01/10] pktcdvd: remove the call to blk_queue_bounce

2017-06-19 Thread Ming Lei
On Mon, Jun 19, 2017 at 11:18 PM, Christoph Hellwig wrote: > On Mon, Jun 19, 2017 at 11:13:46PM +0800, Ming Lei wrote: >> On Mon, Jun 19, 2017 at 11:00 PM, Christoph Hellwig wrote: >> > On Mon, Jun 19, 2017 at 10:34:36PM +0800, Ming Lei wrote: >> >>

Re: [PATCH rfc 01/30] nvme: Add admin connect request queue

2017-06-19 Thread Hannes Reinecke
On 06/19/2017 09:49 AM, Sagi Grimberg wrote: > >>> In case we reconnect with inflight admin IO we >>> need to make sure that the connect comes before >>> the admin command. This can be only achieved by >>> using a separate request queue for admin connects. >> >> Use up a few more lines of the

Re: dm integrity tests crash kernel (4.12-rc5)

2017-06-19 Thread Mike Snitzer
On Mon, Jun 19 2017 at 11:16am -0400, Ondrej Kozina wrote: > On 06/19/2017 04:20 PM, Mike Snitzer wrote: > > > >Looks like submit_flush_bio() is disabling/enabling interrupts from > >interrupt context. Ondrej, does this patch fix the issue? > > I let it spin for 30 minutes

Re: [PATCH 01/10] pktcdvd: remove the call to blk_queue_bounce

2017-06-19 Thread Christoph Hellwig
On Mon, Jun 19, 2017 at 11:13:46PM +0800, Ming Lei wrote: > On Mon, Jun 19, 2017 at 11:00 PM, Christoph Hellwig wrote: > > On Mon, Jun 19, 2017 at 10:34:36PM +0800, Ming Lei wrote: > >> blk_queue_make_request() sets bounce for any highmem page for long time, > >> and in theory this

Re: [dm-devel] dm integrity tests crash kernel (4.12-rc5)

2017-06-19 Thread Ondrej Kozina
On 06/19/2017 04:20 PM, Mike Snitzer wrote: Looks like submit_flush_bio() is disabling/enabling interrupts from interrupt context. Ondrej, does this patch fix the issue? I let it spin for 30 minutes on patched dm-integrity and everything seems ok now. The moment I loaded back the old one it

Re: [PATCH 01/10] pktcdvd: remove the call to blk_queue_bounce

2017-06-19 Thread Ming Lei
On Mon, Jun 19, 2017 at 11:00 PM, Christoph Hellwig wrote: > On Mon, Jun 19, 2017 at 10:34:36PM +0800, Ming Lei wrote: >> blk_queue_make_request() sets bounce for any highmem page for long time, >> and in theory this patch might cause regression on 32bit arch, when >> the controller

Re: [PATCH 11/11] nvme: add support for streams and directives

2017-06-19 Thread Jens Axboe
On 06/19/2017 12:35 AM, Christoph Hellwig wrote: > Can you add linux-nvme for the next repost? > > As said before I think we should rely on implicit streams allocation, > as that will make the whole patch a lot simpler, and it solves the issue > that your current patch will take away your 4

Re: [PATCH 06/11] fs: add O_DIRECT support for sending down write life time hints

2017-06-19 Thread Jens Axboe
On 06/19/2017 12:28 AM, Christoph Hellwig wrote: > On Sat, Jun 17, 2017 at 01:59:49PM -0600, Jens Axboe wrote: >> Reviewed-by: Andreas Dilger >> Signed-off-by: Jens Axboe >> --- >> fs/block_dev.c | 2 ++ >> fs/direct-io.c | 2 ++ >> fs/iomap.c | 1 + >> 3

Re: [PATCH 05/11] fs: add fcntl() interface for setting/getting write life time hints

2017-06-19 Thread Jens Axboe
On 06/19/2017 12:28 AM, Christoph Hellwig wrote: > On Sat, Jun 17, 2017 at 01:59:48PM -0600, Jens Axboe wrote: >> We have a pwritev2(2) interface based on passing in flags. Add an >> fcntl interface for querying these flags, and also for setting them >> as well: >> >> F_GET_RW_HINT

Re: [PATCH 04/11] fs: add support for allowing applications to pass in write life time hints

2017-06-19 Thread Jens Axboe
On 06/19/2017 12:27 AM, Christoph Hellwig wrote: > On Sat, Jun 17, 2017 at 01:59:47PM -0600, Jens Axboe wrote: >> Add four flags for the pwritev2(2) system call, allowing an application >> to give the kernel a hint about what on-media life times can be >> expected from a given write. >> >> The

Re: [PATCH 01/11] fs: add support for an inode to carry write hint related data

2017-06-19 Thread Jens Axboe
On 06/19/2017 12:26 AM, Christoph Hellwig wrote: >> +/* >> + * Write life time hint values. >> + */ >> +enum rw_hint { >> +WRITE_LIFE_NONE = 0, >> +WRITE_LIFE_SHORT, >> +WRITE_LIFE_MEDIUM, >> +WRITE_LIFE_LONG, >> +WRITE_LIFE_EXTREME, >> +}; >> + >> +#define RW_HINT_MASK

Re: [PATCH 01/10] pktcdvd: remove the call to blk_queue_bounce

2017-06-19 Thread Ming Lei
On Mon, Jun 19, 2017 at 3:26 PM, Christoph Hellwig wrote: > pktcdvd is a make_request based stacking driver and thus doesn't have any > addressing limits on its own. It also doesn't use bio_data() or > page_address(), so it doesn't need a lowmem bounce either. > > Signed-off-by:

Re: [PATCH 11/11] nvme: add support for streams and directives

2017-06-19 Thread Jens Axboe
On 06/19/2017 12:25 AM, Christoph Hellwig wrote: > On Sat, Jun 17, 2017 at 09:11:30AM -0600, Jens Axboe wrote: >> I have two samples here, and I just tested, and both of them want it >> assigned with nsid=0x or they will fail the writes... So I'd say >> we're better off ensuring we do

Re: dm integrity tests crash kernel (4.12-rc5)

2017-06-19 Thread Mike Snitzer
On Mon, Jun 19 2017 at 8:11am -0400, Ondrej Kozina wrote: > same log again, hopefully prettier format. Sorry: > > [ 330.980914] DEBUG_LOCKS_WARN_ON(current->hardirq_context) > [ 330.980923] [ cut here ] > [ 330.982627] WARNING: CPU: 1 PID: 0 at

Re: [PATCH rfc 20/30] nvme: add err, reconnect and delete work items to nvme core

2017-06-19 Thread Sagi Grimberg
We intend for these handlers to become generic, thus, add them to the nvme core controller struct. Do you remember why we actually need all the different work items? I remember documenting it at some point, but either it got lost somewhere or I don't remember... We need err_work to

Re: [PATCH rfc 10/30] nvme: Add admin_tagset pointer to nvme_ctrl

2017-06-19 Thread Sagi Grimberg
Will be used when we centralize control flows. only rdma for now. Should we at some point move the tag_sets themselves to the generic ctrl instead of just pointers? We can easily do that, but the tagsets are heavily read in the hot path so I was careful not to completely move them to

Re: [PATCH 09/10] blk-mq-sched: unify request prepare methods

2017-06-19 Thread Paolo Valente
> On 16 Jun 2017, at 18:15, Christoph Hellwig wrote: > > This patch makes sure we always allocate requests in the core blk-mq > code and use a common prepare_request method to initialize them for > both mq I/O schedulers. For Kyber and additional limit_depth

Re: [PATCH 07/10] bfq-iosched: fix NULL ioc check in bfq_get_rq_private

2017-06-19 Thread Paolo Valente
> On 16 Jun 2017, at 18:15, Christoph Hellwig wrote: > > icq_to_bic is a container_of operation, so we need to check for NULL > before it. Also move the check outside the spinlock while we're at > it. > > Signed-off-by: Christoph Hellwig > --- >
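
The NULL-before-container_of point in the changelog above can be illustrated with a small userspace sketch. Everything here is an assumption made for illustration: `struct inner_cq`, `struct outer_ctx`, `icq_to_outer()` and `get_private()` are hypothetical stand-ins, not the real bfq or io_cq definitions, and the `container_of` macro is a simplified copy without the kernel's type checking. The idea is only that a pointer conversion which subtracts a member offset cannot be NULL-tested afterwards, so the test has to come first.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical structures, NOT the real bfq/io_cq layouts. */
struct inner_cq {
	int refs;
};

struct outer_ctx {
	long private_state;	/* puts 'icq' at a non-zero offset */
	struct inner_cq icq;
};

/*
 * Convert an embedded inner_cq back to its enclosing outer_ctx.
 * The NULL check happens before the offset subtraction, because
 * subtracting an offset from NULL would not yield NULL again.
 */
static struct outer_ctx *icq_to_outer(struct inner_cq *icq)
{
	if (!icq)
		return NULL;
	return container_of(icq, struct outer_ctx, icq);
}

/* Returns 1 if the context was found and updated, 0 if icq was NULL. */
static int get_private(struct inner_cq *icq)
{
	struct outer_ctx *ctx = icq_to_outer(icq);

	if (!ctx)
		return 0;
	ctx->private_state++;
	return 1;
}

/* Small self-check exercising both the NULL and non-NULL paths. */
static void self_check(void)
{
	struct outer_ctx c = { .private_state = 0, .icq = { .refs = 1 } };

	assert(icq_to_outer(&c.icq) == &c);
	assert(get_private(NULL) == 0);
	assert(get_private(&c.icq) == 1);
}
```

In this sketch the check lives inside the conversion helper; the patch under discussion instead checks in the caller before converting, which is equivalent as long as no caller skips the check.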

Re: [PATCH rfc 25/30] nvme: move control plane handling to nvme core

2017-06-19 Thread Christoph Hellwig
> +static void nvme_free_io_queues(struct nvme_ctrl *ctrl) > +{ > + int i; > + > + for (i = 1; i < ctrl->queue_count; i++) > + ctrl->ops->free_hw_queue(ctrl, i); > +} > + > +void nvme_stop_io_queues(struct nvme_ctrl *ctrl) > +{ > + int i; > + > + for (i = 1; i <

Re: [PATCH rfc 29/30] nvme: add sed-opal ctrl manipulation in admin configuration

2017-06-19 Thread Christoph Hellwig
On Mon, Jun 19, 2017 at 11:03:36AM +0300, Sagi Grimberg wrote: > >> The subject sounds odd and it could use a changelog. But I'd love to >> pick this change up ASAP as it's the right thing to do.. > > How? where would you place it? there is no nvme_configure_admin_queue in > nvme-core. Doh.

Re: [PATCH rfc 20/30] nvme: add err, reconnect and delete work items to nvme core

2017-06-19 Thread Christoph Hellwig
On Sun, Jun 18, 2017 at 06:21:54PM +0300, Sagi Grimberg wrote: > We intend for these handlers to become generic, thus, add them to > the nvme core controller struct. Do you remember why we actually need all the different work items? We need err_work to recover from RDMA QP-level errors. But how

Re: [PATCH rfc 17/30] nvme-rdma: move admin specific resources to alloc_queue

2017-06-19 Thread Christoph Hellwig
On Sun, Jun 18, 2017 at 06:21:51PM +0300, Sagi Grimberg wrote: > We're trying to make admin queue configuration generic, so > move the rdma specifics to the queue allocation (based on > the queue index passed). Needs at least a comment, and probably factoring into a little helper. And once we

Re: [PATCH rfc 16/30] nvme-rdma: move tagset allocation to a dedicated routine

2017-06-19 Thread Christoph Hellwig
Looks fine, but how about doing this early in the series? There's quite a bit of churn around this code.

Re: [PATCH rfc 15/30] nvme-rdma: don't check queue state for shutdown/disable

2017-06-19 Thread Christoph Hellwig
Looks fine, Reviewed-by: Christoph Hellwig

Re: [PATCH rfc 14/30] nvme-rdma: stop queues instead of simply flipping their state

2017-06-19 Thread Christoph Hellwig
Looks fine, Reviewed-by: Christoph Hellwig

Re: [PATCH rfc 13/30] nvme-rdma: move queue LIVE/DELETING flags settings to queue routines

2017-06-19 Thread Christoph Hellwig
Looks good, Reviewed-by: Christoph Hellwig

Re: [PATCH rfc 12/30] nvme-rdma: disable controller in reset instead of shutdown

2017-06-19 Thread Christoph Hellwig
Looks fine, Reviewed-by: Christoph Hellwig

Re: [PATCH rfc 11/30] nvme: move controller cap to struct nvme_ctrl

2017-06-19 Thread Christoph Hellwig
On Sun, Jun 18, 2017 at 06:21:45PM +0300, Sagi Grimberg wrote: > Will be used in centralized code later. only rdma > for now. It would be great to initialize it early on for all transports, and then just use the stored field instead of re-reading CAP in various places.
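
A rough userspace sketch of the suggestion above, reading CAP once at init and using the stored value afterwards, might look like the following. The names `demo_ctrl`, `demo_ctrl_ops`, `read_cap`, `fake_read_cap` and the helper functions are all hypothetical and are not the nvme core API; only the CAP bit layout (MQES in bits 15:0 as a zero-based value, TO in bits 31:24 in 500 ms units) is taken from the NVMe register definition.

```c
#include <stdint.h>

/* CAP field extraction following the NVMe register layout. */
#define CAP_MQES(cap)		((cap) & 0xffff)
#define CAP_TIMEOUT(cap)	(((cap) >> 24) & 0xff)

struct demo_ctrl;

/* Hypothetical per-transport operations; only the CAP read is modelled. */
struct demo_ctrl_ops {
	uint64_t (*read_cap)(struct demo_ctrl *ctrl);
};

/* Hypothetical generic controller carrying the cached CAP value. */
struct demo_ctrl {
	const struct demo_ctrl_ops *ops;
	uint64_t cap;			/* cached once at init */
	unsigned int reg_reads;		/* counts transport register reads */
};

/* Fake transport: reports TO = 30 (15 s) and MQES = 1023. */
static uint64_t fake_read_cap(struct demo_ctrl *ctrl)
{
	ctrl->reg_reads++;
	return ((uint64_t)30 << 24) | 1023;
}

static const struct demo_ctrl_ops fake_ops = { .read_cap = fake_read_cap };

/* Read CAP exactly once, early, regardless of transport. */
static void demo_ctrl_init(struct demo_ctrl *ctrl,
			   const struct demo_ctrl_ops *ops)
{
	ctrl->ops = ops;
	ctrl->reg_reads = 0;
	ctrl->cap = ops->read_cap(ctrl);
}

/* Later users consult the stored field instead of the register. */
static unsigned int demo_queue_depth(const struct demo_ctrl *ctrl)
{
	return (unsigned int)CAP_MQES(ctrl->cap) + 1;	/* MQES is zero-based */
}

static unsigned int demo_timeout_ms(const struct demo_ctrl *ctrl)
{
	return (unsigned int)CAP_TIMEOUT(ctrl->cap) * 500;	/* 500 ms units */
}
```

The read counter exists only so the sketch can show that repeated queries do not go back to the transport; a real implementation would have no need for it.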

Re: [PATCH rfc 10/30] nvme: Add admin_tagset pointer to nvme_ctrl

2017-06-19 Thread Christoph Hellwig
On Sun, Jun 18, 2017 at 06:21:44PM +0300, Sagi Grimberg wrote: > Will be used when we centralize control flows. only > rdma for now. Should we at some point move the tag_sets themselves to the generic ctrl instead of just pointers?
