Re: [PATCH] zram: set physical queue limits to avoid array out of bounds accesses

2017-03-06 Thread Hannes Reinecke
On 03/07/2017 06:22 AM, Minchan Kim wrote: > Hello Johannes, > > On Mon, Mar 06, 2017 at 11:23:35AM +0100, Johannes Thumshirn wrote: >> zram can handle at most SECTORS_PER_PAGE sectors in a bio's bvec. When using >> the NVMe over Fabrics loopback target which potentially sends a huge bulk of >>

Re: [PATCH RFC 04/14] block, bfq: modify the peak-rate estimator

2017-03-06 Thread Bart Van Assche
On Sat, 2017-03-04 at 17:01 +0100, Paolo Valente wrote: > +static sector_t get_sdist(sector_t last_pos, struct request *rq) > +{ > + sector_t sdist = 0; > + > + if (last_pos) { > + if (last_pos < blk_rq_pos(rq)) > + sdist = blk_rq_pos(rq) - last_pos; > +
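The quoted hunk is cut off mid-expression; a plausible completion of the helper, which computes the absolute seek distance between the last known position and the start of the new request (the else branch and the return are reconstructed here, not taken from the mail):

    static sector_t get_sdist(sector_t last_pos, struct request *rq)
    {
            sector_t sdist = 0;

            if (last_pos) {
                    if (last_pos < blk_rq_pos(rq))
                            sdist = blk_rq_pos(rq) - last_pos;
                    else
                            sdist = last_pos - blk_rq_pos(rq);
            }

            return sdist;
    }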

Re: [PATCH RFC 00/14] Add the BFQ I/O Scheduler to blk-mq

2017-03-06 Thread Bart Van Assche
On 03/04/2017 08:01 AM, Paolo Valente wrote: > Some patch generates WARNINGS with checkpatch.pl, but these WARNINGS > seem to be either unavoidable for the involved pieces of code (which > the patch just extends), or false positives. The code in this series looks reasonably clean from a code

Re: [PATCH RFC 00/14] Add the BFQ I/O Scheduler to blk-mq

2017-03-06 Thread Bart Van Assche
On Sat, 2017-03-04 at 17:01 +0100, Paolo Valente wrote: > Finally, a few details on the patchset. > > The first two patches introduce BFQ-v0, which is more or less the > first version of BFQ submitted a few years ago [1]. The remaining > patches turn progressively BFQ-v0 into BFQ-v8r8, the

Re: [PATCH RFC 13/14] block, bfq: boost the throughput with random I/O on NCQ-capable HDDs

2017-03-06 Thread Bart Van Assche
On Sat, 2017-03-04 at 17:01 +0100, Paolo Valente wrote: > @@ -8301,7 +8297,7 @@ static struct blkcg_policy blkcg_policy_bfq = { > static int __init bfq_init(void) > { > int ret; > - char msg[50] = "BFQ I/O-scheduler: v6"; > + char msg[50] = "BFQ I/O-scheduler: v7r3"; > > #ifdef

cfq-iosched: two questions about the hrtimer version of CFQ

2017-03-06 Thread Hou Tao
Hi Jan and list, When testing the hrtimer version of CFQ, we found a performance degradation problem which seems to be caused by commit 0b31c10 ("cfq-iosched: Charge at least 1 jiffie instead of 1 ns"). The following is the test process: * filesystem and block device * XFS + /dev/sda

Re: [PATCH 02/11] block: Fix race of bdev open with gendisk shutdown

2017-03-06 Thread Tejun Heo
Hello, On Mon, Mar 06, 2017 at 05:33:55PM +0100, Jan Kara wrote: > + disk->flags &= ~GENHD_FL_UP; > + /* > + * Make sure __blkdev_open() sees the disk is going away before > + * starting to unhash bdev inodes. > + */ > + smp_wmb(); But which rmb is this paired with?
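Tejun's question is about barrier pairing: the smp_wmb() orders the GENHD_FL_UP clear before the bdev inode unhashing, so a reader needs a matching smp_rmb() between observing the unhash and re-checking the flag. A purely illustrative sketch of that pairing (the variable and function names are stand-ins, not the actual patch):

    static int flag_up = 1;         /* stands in for GENHD_FL_UP           */
    static int inodes_unhashed;     /* stands in for the bdev inode unhash */

    static void shutdown_side(void)         /* del_gendisk()               */
    {
            flag_up = 0;
            smp_wmb();              /* order flag clear before the unhash  */
            inodes_unhashed = 1;
    }

    static int open_side(void)              /* __blkdev_get()              */
    {
            if (inodes_unhashed) {
                    smp_rmb();      /* pairs with the smp_wmb() above      */
                    if (!flag_up)
                            return -ENXIO;  /* disk is going away          */
            }
            return 0;
    }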

Re: [PATCH 07/11] bdi: Do not wait for cgwbs release in bdi_unregister()

2017-03-06 Thread Tejun Heo
On Mon, Mar 06, 2017 at 05:34:00PM +0100, Jan Kara wrote: > Currently we wait for all cgwbs to get released in cgwb_bdi_destroy() > (called from bdi_unregister()). That is however unnecessary now when > cgwb->bdi is a proper refcounted reference (thus bdi cannot get > released before all cgwbs are

Re: [PATCH 01/11] block: Fix bdi assignment to bdev inode when racing with disk delete

2017-03-06 Thread Tejun Heo
On Mon, Mar 06, 2017 at 05:33:54PM +0100, Jan Kara wrote: > When disk->fops->open() in __blkdev_get() returns -ERESTARTSYS, we > restart the process of opening the block device. However we forget to > switch bdev->bd_bdi back to noop_backing_dev_info and as a result bdev > inode will be pointing

Re: [PATCH 06/11] bdi: Shutdown writeback on all cgwbs in cgwb_bdi_destroy()

2017-03-06 Thread Tejun Heo
On Mon, Mar 06, 2017 at 05:33:59PM +0100, Jan Kara wrote: > Currently we waited for all cgwbs to get freed in cgwb_bdi_destroy() > which also means that writeback has been shutdown on them. Since this > wait is going away, directly shutdown writeback on cgwbs from > cgwb_bdi_destroy() to avoid

Re: [PATCH] zram: set physical queue limits to avoid array out of bounds accesses

2017-03-06 Thread Jens Axboe
On 03/06/2017 01:18 PM, Andrew Morton wrote: > On Mon, 6 Mar 2017 08:21:11 -0700 Jens Axboe wrote: > >> On 03/06/2017 03:23 AM, Johannes Thumshirn wrote: >>> zram can handle at most SECTORS_PER_PAGE sectors in a bio's bvec. When using >>> the NVMe over Fabrics loopback target which

Re: [PATCH] block/sed: Fix opal user range check and unused variables

2017-03-06 Thread Jens Axboe
On 03/06/2017 08:41 AM, Jon Derrick wrote: > Fixes check that the opal user is within the range, and cleans up unused > method variables. Applied, thanks. -- Jens Axboe

Re: [PATCH RFC 01/14] block, bfq: introduce the BFQ-v0 I/O scheduler as an extra scheduler

2017-03-06 Thread Bart Van Assche
On 03/04/2017 08:01 AM, Paolo Valente wrote: > BFQ is a proportional-share I/O scheduler, whose general structure, > plus a lot of code, are borrowed from CFQ. > [ ... ] This description is very useful. However, since it is identical to the description this patch adds to

Re: [bdi_unregister] 165a5e22fa INFO: task swapper:1 blocked for more than 120 seconds.

2017-03-06 Thread James Bottomley
On Mon, 2017-03-06 at 16:14 +0100, Jan Kara wrote: > On Mon 06-03-17 06:35:21, James Bottomley wrote: > > On Mon, 2017-03-06 at 13:01 +0100, Jan Kara wrote: > > > On Mon 06-03-17 11:27:33, Jan Kara wrote: > > > > Hi, > > > > > > > > On Sun 05-03-17 10:21:11, Wu Fengguang wrote: > > > > > FYI

Re: [bdi_unregister] 165a5e22fa INFO: task swapper:1 blocked for more than 120 seconds.

2017-03-06 Thread James Bottomley
On Mon, 2017-03-06 at 17:13 +0100, Jan Kara wrote: > On Mon 06-03-17 07:44:55, James Bottomley wrote: > > On Mon, 2017-03-06 at 16:14 +0100, Jan Kara wrote: > > > On Mon 06-03-17 06:35:21, James Bottomley wrote: > > > > On Mon, 2017-03-06 at 13:01 +0100, Jan Kara wrote: > > > > > On Mon 06-03-17

Re: [PATCH 0/8 v2] Non-blocking AIO

2017-03-06 Thread Avi Kivity
On 03/06/2017 08:27 PM, Jens Axboe wrote: On 03/06/2017 11:17 AM, Avi Kivity wrote: On 03/06/2017 07:06 PM, Jens Axboe wrote: On 03/06/2017 09:59 AM, Avi Kivity wrote: On 03/06/2017 06:08 PM, Jens Axboe wrote: On 03/06/2017 08:59 AM, Avi Kivity wrote: On 03/06/2017 05:38 PM, Jens Axboe

Re: [PATCH 0/8 v2] Non-blocking AIO

2017-03-06 Thread Jens Axboe
On 03/06/2017 09:59 AM, Avi Kivity wrote: > > > On 03/06/2017 06:08 PM, Jens Axboe wrote: >> On 03/06/2017 08:59 AM, Avi Kivity wrote: >>> On 03/06/2017 05:38 PM, Jens Axboe wrote: On 03/06/2017 08:29 AM, Avi Kivity wrote: > On 03/06/2017 05:19 PM, Jens Axboe wrote: >> On 03/06/2017

[PATCH 03/11] bdi: Mark congested->bdi as internal

2017-03-06 Thread Jan Kara
congested->bdi pointer is used only to be able to remove congested structure from bdi->cgwb_congested_tree on structure release. Moreover the pointer can become NULL when we unregister the bdi. Rename the field to __bdi and add a comment to make it more explicit this is internal stuff of memcg
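A sketch of what the renamed field looks like per the description above (other members elided; the comment text is paraphrased, not quoted from the patch):

    struct bdi_writeback_congested {
            unsigned long state;    /* WB_[a]sync_congested flags */
            /* ... other members elided ... */

            /* __bdi is internal to memcg writeback: it is only used to
             * remove this entry from bdi->cgwb_congested_tree and may
             * become NULL once the bdi is unregistered. */
            struct backing_dev_info *__bdi;
    };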

[PATCH 0/11 v3] block: Fix block device shutdown related races

2017-03-06 Thread Jan Kara
Hello, this is a series with the remaining patches (on top of 4.11-rc1) to fix several different races and issues I've found when testing device shutdown and reuse. The first two patches fix possible (theoretical) problems when opening of a block device races with shutdown of a gendisk structure.

[PATCH 11/11] block: Fix oops in scsi_disk_get()

2017-03-06 Thread Jan Kara
When device open races with device shutdown, we can get the following oops in scsi_disk_get(): [11863.044351] general protection fault: [#1] SMP [11863.045561] Modules linked in: scsi_debug xfs libcrc32c netconsole btrfs raid6_pq zlib_deflate lzo_compress xor [last unloaded: loop]

[PATCH 07/11] bdi: Do not wait for cgwbs release in bdi_unregister()

2017-03-06 Thread Jan Kara
Currently we wait for all cgwbs to get released in cgwb_bdi_destroy() (called from bdi_unregister()). That is however unnecessary now when cgwb->bdi is a proper refcounted reference (thus bdi cannot get released before all cgwbs are released) and when cgwb_bdi_destroy() shuts down writeback

[PATCH 06/11] bdi: Shutdown writeback on all cgwbs in cgwb_bdi_destroy()

2017-03-06 Thread Jan Kara
Currently we waited for all cgwbs to get freed in cgwb_bdi_destroy() which also means that writeback has been shutdown on them. Since this wait is going away, directly shutdown writeback on cgwbs from cgwb_bdi_destroy() to avoid live writeback structures after bdi_unregister() has finished. To
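A rough sketch of the shape of that change, assuming the cgwbs are reachable from the bdi's writeback list; locking and the exact iteration the patch uses are not visible in the snippet:

    /* Sketch only: shut down writeback on every cgwb before the bdi is
     * unregistered so no live writeback structures remain afterwards. */
    static void cgwb_bdi_destroy(struct backing_dev_info *bdi)
    {
            struct bdi_writeback *wb, *next;

            list_for_each_entry_safe(wb, next, &bdi->wb_list, bdi_node)
                    wb_shutdown(wb);
    }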

[PATCH 05/11] bdi: Move removal from bdi->wb_list into wb_shutdown()

2017-03-06 Thread Jan Kara
Currently the removal from bdi->wb_list happens directly in cgwb_release_workfn(). Move it to wb_shutdown() which is functionally equivalent and more logical (the list gets only used for distributing writeback works among bdi_writeback structures). It will also allow us to simplify writeback

[PATCH 04/11] bdi: Make wb->bdi a proper reference

2017-03-06 Thread Jan Kara
Make wb->bdi a proper refcounted reference to bdi for all bdi_writeback structures except for the one embedded inside struct backing_dev_info. That will allow us to simplify bdi unregistration. Acked-by: Tejun Heo Signed-off-by: Jan Kara --- mm/backing-dev.c | 13

[PATCH 02/11] block: Fix race of bdev open with gendisk shutdown

2017-03-06 Thread Jan Kara
blkdev_open() may race with gendisk shutdown in such a way that del_gendisk() has already unhashed block device inode (and thus bd_acquire() will end up creating new block device inode) however get_gendisk() will still return the gendisk that is being destroyed. This will result in the new bdev

[PATCH 10/11] kobject: Export kobject_get_unless_zero()

2017-03-06 Thread Jan Kara
Make the function available for outside use and fortify it against NULL kobject. CC: Greg Kroah-Hartman Reviewed-by: Bart Van Assche Acked-by: Tejun Heo Signed-off-by: Jan Kara --- include/linux/kobject.h
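For reference, a NULL-tolerant version of the helper would look roughly like this, built on the kref API (the exact body in the patch may differ):

    struct kobject * __must_check kobject_get_unless_zero(struct kobject *kobj)
    {
            if (!kobj)
                    return NULL;
            if (!kref_get_unless_zero(&kobj->kref))
                    kobj = NULL;    /* refcount already hit zero */
            return kobj;
    }
    EXPORT_SYMBOL(kobject_get_unless_zero);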

[PATCH 01/11] block: Fix bdi assignment to bdev inode when racing with disk delete

2017-03-06 Thread Jan Kara
When disk->fops->open() in __blkdev_get() returns -ERESTARTSYS, we restart the process of opening the block device. However we forget to switch bdev->bd_bdi back to noop_backing_dev_info and as a result bdev inode will be pointing to a stale bdi. Fix the problem by setting bdev->bd_bdi later when
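The description is truncated, but the idea is to keep bdev->bd_bdi pointing at noop_backing_dev_info across the -ERESTARTSYS restart and only switch it once the open can no longer restart. A sketch under that reading (placement inside __blkdev_get() and reference handling are assumptions):

    ret = disk->fops->open(bdev, mode);
    if (ret == -ERESTARTSYS)
            goto restart;   /* bd_bdi still points at noop_backing_dev_info */
    if (ret)
            goto out_clear;
    /* open can no longer restart: now it is safe to switch the bdi */
    if (bdev->bd_bdi == &noop_backing_dev_info)
            bdev->bd_bdi = disk->queue->backing_dev_info;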

Re: [PATCH] block/sed: Fix opal user range check and unused variables

2017-03-06 Thread Scott Bauer
On Mon, Mar 06, 2017 at 08:41:04AM -0700, Jon Derrick wrote: > Fixes check that the opal user is within the range, and cleans up unused > method variables. > > Signed-off-by: Jon Derrick Reviewed-by: Scott Bauer

Re: [PATCH 0/8 v2] Non-blocking AIO

2017-03-06 Thread Jens Axboe
On 03/06/2017 08:59 AM, Avi Kivity wrote: > On 03/06/2017 05:38 PM, Jens Axboe wrote: >> On 03/06/2017 08:29 AM, Avi Kivity wrote: >>> >>> On 03/06/2017 05:19 PM, Jens Axboe wrote: On 03/06/2017 01:25 AM, Jan Kara wrote: > On Sun 05-03-17 16:56:21, Avi Kivity wrote: >>> The goal of

Re: [bdi_unregister] 165a5e22fa INFO: task swapper:1 blocked for more than 120 seconds.

2017-03-06 Thread Jan Kara
On Mon 06-03-17 07:44:55, James Bottomley wrote: > On Mon, 2017-03-06 at 16:14 +0100, Jan Kara wrote: > > On Mon 06-03-17 06:35:21, James Bottomley wrote: > > > On Mon, 2017-03-06 at 13:01 +0100, Jan Kara wrote: > > > > On Mon 06-03-17 11:27:33, Jan Kara wrote: > > > > > Hi, > > > > > > > > > >

Re: [PATCH 0/8 v2] Non-blocking AIO

2017-03-06 Thread Avi Kivity
On 03/06/2017 05:38 PM, Jens Axboe wrote: On 03/06/2017 08:29 AM, Avi Kivity wrote: On 03/06/2017 05:19 PM, Jens Axboe wrote: On 03/06/2017 01:25 AM, Jan Kara wrote: On Sun 05-03-17 16:56:21, Avi Kivity wrote: The goal of the patch series is to return -EAGAIN/-EWOULDBLOCK if any of these

[PATCH] block/sed: Fix opal user range check and unused variables

2017-03-06 Thread Jon Derrick
Fixes check that the opal user is within the range, and cleans up unused method variables. Signed-off-by: Jon Derrick --- block/sed-opal.c | 10 ++ 1 file changed, 2 insertions(+), 8 deletions(-) diff --git a/block/sed-opal.c b/block/sed-opal.c index

Re: [PATCH 0/8 v2] Non-blocking AIO

2017-03-06 Thread Jens Axboe
On 03/06/2017 08:29 AM, Avi Kivity wrote: > > > On 03/06/2017 05:19 PM, Jens Axboe wrote: >> On 03/06/2017 01:25 AM, Jan Kara wrote: >>> On Sun 05-03-17 16:56:21, Avi Kivity wrote: > The goal of the patch series is to return -EAGAIN/-EWOULDBLOCK if > any of these conditions are met. This

Re: [PATCH 0/8 v2] Non-blocking AIO

2017-03-06 Thread Avi Kivity
On 03/06/2017 05:19 PM, Jens Axboe wrote: On 03/06/2017 01:25 AM, Jan Kara wrote: On Sun 05-03-17 16:56:21, Avi Kivity wrote: The goal of the patch series is to return -EAGAIN/-EWOULDBLOCK if any of these conditions are met. This way userspace can push most of the write()s to the kernel to

Re: [PATCH 0/8 v2] Non-blocking AIO

2017-03-06 Thread Jens Axboe
On 03/06/2017 01:25 AM, Jan Kara wrote: > On Sun 05-03-17 16:56:21, Avi Kivity wrote: >>> The goal of the patch series is to return -EAGAIN/-EWOULDBLOCK if >>> any of these conditions are met. This way userspace can push most >>> of the write()s to the kernel to the best of its ability to complete
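To make the userspace side of that model concrete, here is a small sketch of "try non-blocking first, fall back on EAGAIN". It uses the per-call RWF_NOWAIT flag that exists in current kernels; whether this particular series uses that exact flag is not visible in the quoted snippet:

    #define _GNU_SOURCE
    #include <errno.h>
    #include <sys/uio.h>

    /* Returns bytes written, or -1 if the write would have blocked and
     * should be handed to a slow path (e.g. a worker thread). */
    static ssize_t write_nowait(int fd, const void *buf, size_t len, off_t off)
    {
            struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
            ssize_t ret = pwritev2(fd, &iov, 1, off, RWF_NOWAIT);

            if (ret < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
                    return -1;      /* would block: defer to slow path */
            return ret;
    }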

Re: [bdi_unregister] 165a5e22fa INFO: task swapper:1 blocked for more than 120 seconds.

2017-03-06 Thread Jan Kara
On Mon 06-03-17 06:35:21, James Bottomley wrote: > On Mon, 2017-03-06 at 13:01 +0100, Jan Kara wrote: > > On Mon 06-03-17 11:27:33, Jan Kara wrote: > > > Hi, > > > > > > On Sun 05-03-17 10:21:11, Wu Fengguang wrote: > > > > FYI next-20170303 is good while mainline is bad with this error. > > > >

Re: [PATCH] zram: set physical queue limits to avoid array out of bounds accesses

2017-03-06 Thread Jens Axboe
On 03/06/2017 03:23 AM, Johannes Thumshirn wrote: > zram can handle at most SECTORS_PER_PAGE sectors in a bio's bvec. When using > the NVMe over Fabrics loopback target which potentially sends a huge bulk of > pages attached to the bio's bvec this results in a kernel panic because of > array out

Re: block/sed-opal.c: 2 * bad if tests ?

2017-03-06 Thread Jon Derrick
On 03/06/2017 05:00 AM, David Binderman wrote: > Hello there, > > 1. > > block/sed-opal.c:2136:40: warning: logical ‘and’ of mutually exclusive tests > is always false [-Wlogical-op] > > Source code is > > if (lk_unlk->session.who < OPAL_USER1 && > lk_unlk->session.who >
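The warning is right: with &&, a value can never be both below OPAL_USER1 and above OPAL_USER9, so the range check never fires. The fix Jon's patch describes amounts to using || instead (the error handling shown here is illustrative):

    if (lk_unlk->session.who < OPAL_USER1 ||
        lk_unlk->session.who > OPAL_USER9) {
            pr_err("Who should be between OPAL_USER1 and OPAL_USER9\n");
            return -EINVAL;
    }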

Re: [bdi_unregister] 165a5e22fa INFO: task swapper:1 blocked for more than 120 seconds.

2017-03-06 Thread James Bottomley
On Mon, 2017-03-06 at 13:01 +0100, Jan Kara wrote: > On Mon 06-03-17 11:27:33, Jan Kara wrote: > > Hi, > > > > On Sun 05-03-17 10:21:11, Wu Fengguang wrote: > > > FYI next-20170303 is good while mainline is bad with this error. > > > The attached reproduce-* may help reproduce the issue. > > > >

Re: [PATCH] cfq-iosched: fix the delay of cfq_group's vdisktime under iops mode

2017-03-06 Thread Vivek Goyal
On Mon, Mar 06, 2017 at 04:55:25PM +0800, Hou Tao wrote: > Hi Vivek, > > On 2017/3/4 3:53, Vivek Goyal wrote: > > On Fri, Mar 03, 2017 at 09:20:44PM +0800, Hou Tao wrote: > > > > [..] > >>> Frankly, vdisktime is in fixed-point precision shifted by > >>> CFQ_SERVICE_SHIFT so using CFQ_IDLE_DELAY

Re: [PATCH] zram: set physical queue limits to avoid array out of bounds accesses

2017-03-06 Thread Sergey Senozhatsky
On (03/06/17 11:23), Johannes Thumshirn wrote: > zram can handle at most SECTORS_PER_PAGE sectors in a bio's bvec. When using > the NVMe over Fabrics loopback target which potentially sends a huge bulk of > pages attached to the bio's bvec this results in a kernel panic because of > array out of

Re: [PATCH] zram: set physical queue limits to avoid array out of bounds accesses

2017-03-06 Thread Hannes Reinecke
On 03/06/2017 11:23 AM, Johannes Thumshirn wrote: > zram can handle at most SECTORS_PER_PAGE sectors in a bio's bvec. When using > the NVMe over Fabrics loopback target which potentially sends a huge bulk of > pages attached to the bio's bvec this results in a kernel panic because of > array out

[PATCH] zram: set physical queue limits to avoid array out of bounds accesses

2017-03-06 Thread Johannes Thumshirn
zram can handle at most SECTORS_PER_PAGE sectors in a bio's bvec. When using the NVMe over Fabrics loopback target which potentially sends a huge bulk of pages attached to the bio's bvec this results in a kernel panic because of array out of bounds accesses in zram_decompress_page().
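A minimal sketch of the kind of limit the patch title describes, assuming it is applied where zram already configures its request queue; the exact set of limits in the patch is not visible in the snippet:

    /* Never let a request grow beyond what zram_decompress_page() can
     * index: cap it at one page worth of sectors. */
    blk_queue_max_hw_sectors(zram->disk->queue, SECTORS_PER_PAGE);
    blk_queue_io_min(zram->disk->queue, PAGE_SIZE);
    blk_queue_io_opt(zram->disk->queue, PAGE_SIZE);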

Re: [PATCH] cfq-iosched: fix the delay of cfq_group's vdisktime under iops mode

2017-03-06 Thread Hou Tao
Hi Vivek, On 2017/3/4 3:53, Vivek Goyal wrote: > On Fri, Mar 03, 2017 at 09:20:44PM +0800, Hou Tao wrote: > > [..] >>> Frankly, vdisktime is in fixed-point precision shifted by >>> CFQ_SERVICE_SHIFT so using CFQ_IDLE_DELAY does not make much sense in any >>> case and just adding 1 to maximum

Re: WARN_ON() when booting with nvme loopback over zram

2017-03-06 Thread Johannes Thumshirn
On 03/03/2017 05:20 PM, Jens Axboe wrote: > On 03/03/2017 09:18 AM, Johannes Thumshirn wrote: >> Hi, >> >> I get the following WARN_ON when trying to establish a nvmf loopback device >> backed by zram. >> >> My topmost commit is c82be9d2244aacea9851c86f4fb74694c99cd874 > > It's fixed in my