Re: [PATCH] direct-io: don't introduce another read of inode->i_blkbits

2017-01-09 Thread Chandan Rajendra
On Monday, January 09, 2017 04:42:58 PM Jeff Moyer wrote: > Commit 20ce44d545844 ("do_direct_IO: Use inode->i_blkbits to compute > block count to be cleaned") introduced a regression: if the block size > of the block device is changed while a direct I/O request is being > setup, it can result in a

Re: [LSF/MM TOPIC][LSF/MM ATTEND] OCSSDs - SMR, Hierarchical Interface, and Vector I/Os

2017-01-09 Thread Theodore Ts'o
On Tue, Jan 10, 2017 at 10:42:45AM +0900, Damien Le Moal wrote: > Thank you for the information. I will check this out. Is it the > optimization that aggressively delay meta-data update by allowing > reading of meta-data blocks directly from the journal (for blocks that > are not yet updated in

Re: [PATCH] virtio_blk: fix panic in initialization error path

2017-01-09 Thread Michael S. Tsirkin
On Mon, Jan 09, 2017 at 11:44:12AM -0800, Omar Sandoval wrote: > From: Omar Sandoval > > If blk_mq_init_queue() returns an error, it gets assigned to > vblk->disk->queue. Then, when we call put_disk(), we end up calling > blk_put_queue() with the ERR_PTR, causing a bad
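
The failure mode described above can be modeled in userspace. This is only a sketch: the ERR_PTR helpers are simplified re-implementations, and `struct queue`, `struct disk`, `init_queue()` and `setup_disk()` are hypothetical stand-ins, not the virtio_blk code or the actual patch. It shows one common way to avoid the problem: check the return value in a local variable and store it in the disk only when it is a valid pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's ERR_PTR helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* Error codes live in the last MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct queue { int refs; };		/* hypothetical stand-in for a request queue */
struct disk { struct queue *queue; };	/* hypothetical stand-in for a gendisk */

static struct queue good_queue = { .refs = 1 };

/* Hypothetical allocator: returns an encoded errno instead of NULL on failure. */
static struct queue *init_queue(int fail)
{
	return fail ? ERR_PTR(-ENOMEM) : &good_queue;
}

/* Check the result in a local first; only publish a valid pointer. */
static int setup_disk(struct disk *d, int fail)
{
	struct queue *q = init_queue(fail);

	assert(d != NULL);
	if (IS_ERR(q))
		return (int)PTR_ERR(q);	/* d->queue is left untouched */
	d->queue = q;
	return 0;
}
```

With this shape, a cleanup path that only inspects `d->queue` never sees the encoded error value.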

Re: [PATCH] virtio_blk: fix panic in initialization error path

2017-01-09 Thread Jason Wang
On 2017年01月10日 03:44, Omar Sandoval wrote: From: Omar Sandoval If blk_mq_init_queue() returns an error, it gets assigned to vblk->disk->queue. Then, when we call put_disk(), we end up calling blk_put_queue() with the ERR_PTR, causing a bad dereference. Fix it by only

Re: [PATCH] block: fix blk_get_backing_dev_info() crash, use bdev->bd_queue

2017-01-09 Thread Dan Williams
On Sun, Jan 8, 2017 at 12:50 PM, Dan Williams wrote: > On Sun, Jan 8, 2017 at 11:46 AM, Jan Kara wrote: >> On Fri 06-01-17 09:45:45, Dan Williams wrote: >>> On Fri, Jan 6, 2017 at 2:23 AM, Jan Kara wrote: >>> > On Thu 05-01-17 17:17:55, Dan

Re: [LSF/MM TOPIC][LSF/MM ATTEND] OCSSDs - SMR, Hierarchical Interface, and Vector I/Os

2017-01-09 Thread Damien Le Moal
Ted, On 1/5/17 01:57, Theodore Ts'o wrote: > I agree with Damien, but I'd also add that in the future there may > very well be some new Zone types added to the ZBC model. So we > shouldn't assume that the ZBC model is a fixed one. And who knows? > Perhaps T10 standards body will come up with a

Re: [PATCH V5 00/17] blk-throttle: add .low limit

2017-01-09 Thread Shaohua Li
On Mon, Jan 09, 2017 at 04:46:35PM -0500, Tejun Heo wrote: > Hello, > > Sorry about the long delay. Generally looks good to me. Overall, > there are only a few things that I think should be addressed. Thanks for your time! > * Low limit should default to zero. I forgot to change it after

Re: [PATCH] direct-io: don't introduce another read of inode->i_blkbits

2017-01-09 Thread Jens Axboe
On 01/09/2017 02:42 PM, Jeff Moyer wrote: > Commit 20ce44d545844 ("do_direct_IO: Use inode->i_blkbits to compute > block count to be cleaned") introduced a regression: if the block size > of the block device is changed while a direct I/O request is being > setup, it can result in a panic. See

[PATCH] direct-io: don't introduce another read of inode->i_blkbits

2017-01-09 Thread Jeff Moyer
Commit 20ce44d545844 ("do_direct_IO: Use inode->i_blkbits to compute block count to be cleaned") introduced a regression: if the block size of the block device is changed while a direct I/O request is being setup, it can result in a panic. See commit ab73857e354ab ("direct-io: don't read

Re: [PATCH V5 16/17] blk-throttle: add a mechanism to estimate IO latency

2017-01-09 Thread Tejun Heo
Hello, On Thu, Dec 15, 2016 at 12:33:07PM -0800, Shaohua Li wrote: > User configures latency target, but the latency threshold for each > request size isn't fixed. For a SSD, the IO latency highly depends on > request size. To calculate latency threshold, we sample some data, eg, > average

Re: [PATCH V5 14/17] blk-throttle: add interface for per-cgroup target latency

2017-01-09 Thread Tejun Heo
Hello, On Thu, Dec 15, 2016 at 12:33:05PM -0800, Shaohua Li wrote: > @@ -438,6 +439,11 @@ static struct blkg_policy_data *throtl_pd_alloc(gfp_t > gfp, int node) > } > tg->idle_ttime_threshold = U64_MAX; > > + /* > + * target latency default 0, eg, latency threshold is 0,

Re: [PATCH V5 13/17] blk-throttle: ignore idle cgroup limit

2017-01-09 Thread Tejun Heo
On Thu, Dec 15, 2016 at 12:33:04PM -0800, Shaohua Li wrote: > Last patch introduces a way to detect idle cgroup. We use it to make > upgrade/downgrade decision. And the new algorithm can detect completely > idle cgroup too, so we can delete the corresponding code. Ah, okay, the slice based idle

Re: [PATCH V5 12/17] blk-throttle: add interface to configure idle time threshold

2017-01-09 Thread Tejun Heo
On Thu, Dec 15, 2016 at 12:33:03PM -0800, Shaohua Li wrote: > @@ -180,6 +180,8 @@ struct throtl_data > unsigned int limit_index; > bool limit_valid[LIMIT_CNT]; > > + u64 dft_idle_ttime_threshold; BTW, wouldn't idle_time be a better name for these? Currently, it's "idle

Re: [PATCH V5 11/17] blk-throttle: add a simple idle detection

2017-01-09 Thread Tejun Heo
On Thu, Dec 15, 2016 at 12:33:02PM -0800, Shaohua Li wrote: > /* Throttling is performed over 100ms slice and after that slice is renewed > */ > #define DFL_THROTL_SLICE (HZ / 10) > #define MAX_THROTL_SLICE (HZ / 5) > +#define DFL_IDLE_THRESHOLD_SSD (50 * 1000) /* 50 us */ > +#define

blk_queue_bounce_limit() broken for mask=0xffffffff on 64bit archs

2017-01-09 Thread Nikita Yushchenko
Hi There is a use case when architecture is 64-bit but hardware supports only DMA to lower 4G of address space. E.g. NVMe device on RCar PCIe host. For such cases, it looks proper to call blk_queue_bounce_limit() with mask set to 0xffffffff - thus making block layer to use bounce buffers for

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2017-01-09 Thread Johannes Weiner
On Mon, Jan 09, 2017 at 09:30:05PM +0100, Jan Kara wrote: > On Sat 07-01-17 21:02:00, Johannes Weiner wrote: > > On Tue, Jan 03, 2017 at 01:28:25PM +0100, Jan Kara wrote: > > > On Mon 02-01-17 16:11:36, Johannes Weiner wrote: > > > > On Fri, Dec 23, 2016 at 03:33:29AM -0500, Johannes Weiner wrote:

Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0

2017-01-09 Thread Jan Kara
On Sat 07-01-17 21:02:00, Johannes Weiner wrote: > On Tue, Jan 03, 2017 at 01:28:25PM +0100, Jan Kara wrote: > > On Mon 02-01-17 16:11:36, Johannes Weiner wrote: > > > On Fri, Dec 23, 2016 at 03:33:29AM -0500, Johannes Weiner wrote: > > > > On Fri, Dec 23, 2016 at 02:32:41AM -0500, Johannes Weiner

Re: [PATCH V5 10/17] blk-throttle: make bandwidth change smooth

2017-01-09 Thread Tejun Heo
Hello, On Thu, Dec 15, 2016 at 12:33:01PM -0800, Shaohua Li wrote: > static uint64_t tg_bps_limit(struct throtl_grp *tg, int rw) > { > struct blkcg_gq *blkg = tg_to_blkg(tg); > + struct throtl_data *td; > uint64_t ret; > > if (cgroup_subsys_on_dfl(io_cgrp_subsys) &&

Re: [patch] nbd: blk_mq_init_queue returns an error code on failure, not NULL

2017-01-09 Thread Omar Sandoval
On Mon, Jan 09, 2017 at 03:20:31PM -0500, Jeff Moyer wrote: > Additionally, don't assign directly to disk->queue, otherwise > blk_put_queue (called via put_disk) will choke (panic) on the errno > stored there. > > Bug found by code inspection after Omar found a similar issue in > virtio_blk.

[patch] nbd: blk_mq_init_queue returns an error code on failure, not NULL

2017-01-09 Thread Jeff Moyer
Additionally, don't assign directly to disk->queue, otherwise blk_put_queue (called via put_disk) will choke (panic) on the errno stored there. Bug found by code inspection after Omar found a similar issue in virtio_blk. Compile-tested only. Signed-off-by: Jeff Moyer diff

Re: [PATCH] do_direct_IO: Use inode->i_blkbits to compute block count to be cleaned

2017-01-09 Thread Jan Kara
On Sun 08-01-17 20:17:10, Chandan Rajendra wrote: > The code currently uses sdio->blkbits to compute the number of blocks to > be cleaned. However sdio->blkbits is derived from the logical block size > of the underlying block device (Refer to the definition of > do_blockdev_direct_IO()). Due to

Re: [PATCH V5 09/17] blk-throttle: detect completed idle cgroup

2017-01-09 Thread Tejun Heo
Hello, On Thu, Dec 15, 2016 at 12:33:00PM -0800, Shaohua Li wrote: > @@ -1660,6 +1671,11 @@ static bool throtl_tg_can_downgrade(struct throtl_grp > *tg) > struct throtl_data *td = tg->td; > unsigned long now = jiffies; > > + if (time_after_eq(now, tg->last_dispatch_time[READ] +

Re: [PATCH V5 07/17] blk-throttle: make sure expire time isn't too big

2017-01-09 Thread Tejun Heo
On Thu, Dec 15, 2016 at 12:32:58PM -0800, Shaohua Li wrote: > cgroup could be throttled to a limit but when all cgroups cross high > limit, queue enters a higher state and so the group should be throttled > to a higher limit. It's possible the cgroup is sleeping because of > throttle and other

Re: [PATCH V5 05/17] blk-throttle: add upgrade logic for LIMIT_LOW state

2017-01-09 Thread Tejun Heo
Hello, again. On Mon, Jan 09, 2017 at 01:40:53PM -0500, Tejun Heo wrote: > I think it'd be great to explain the above. It was a bit difficult > for me to follow. It's also interesting because we're tying state > transitions for both read and write together. blk-throtl has been > handling reads

Re: [PATCH 2/2] nvme: improve cmb sysfs reporting

2017-01-09 Thread Stephen Bates
> Minor nit below > > >> + >> +for (i = NVME_CMB_CAP_SQS; i <= NVME_CMB_CAP_WDS; i++) >> > I'd prefer seeing (i = 0; i < ARRAY_SIZE(..); i++) because it provides > automatic bounds checking against future code. > Thanks Jon, I will take a look at doing this in a V1. Stephen
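
The review suggestion above (loop to ARRAY_SIZE of the table rather than between two enum constants) can be sketched in userspace as follows. The `cap_names` table and `format_caps()` are hypothetical and only illustrate the loop shape; the table in the actual patch is not shown in this excerpt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical name table, indexed by capability bit number. */
static const char *const cap_names[] = { "SQS", "CQS", "LISTS", "RDS", "WDS" };

/*
 * Append the name of every set bit in caps to buf, space-separated.
 * The loop bound comes from the table itself, so adding or removing
 * an entry can never make the loop index past the end of the array.
 * Returns the number of characters written (excluding the NUL).
 */
static size_t format_caps(unsigned int caps, char *buf, size_t len)
{
	size_t i, used = 0;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	for (i = 0; i < ARRAY_SIZE(cap_names); i++) {
		int n;

		if (!(caps & (1u << i)))
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     used ? " " : "", cap_names[i]);
		if (n < 0 || (size_t)n >= len - used) {
			buf[used] = '\0';	/* drop the partial name */
			break;
		}
		used += (size_t)n;
	}
	return used;
}
```

A loop written as `i = FIRST; i <= LAST` stays correct only as long as FIRST and LAST keep matching the table; the ARRAY_SIZE form removes that coupling.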

Re: [PATCH 0/2] nvme: Improvements in sysfs entry for NVMe CMBs

2017-01-09 Thread Stephen Bates
> > I have added 1/2, since that one is a no-brainer. For 2/2, not so sure. > Generally we try to avoid having sysfs file that aren't single value > output. That isn't a super hard rule, but it is preferable. > > -- > Jens Axboe > Thanks Jens and sorry for the delay (extended vacation). Thanks

Re: [PATCH V5 05/17] blk-throttle: add upgrade logic for LIMIT_LOW state

2017-01-09 Thread Tejun Heo
Hello, Shaohua. On Thu, Dec 15, 2016 at 12:32:56PM -0800, Shaohua Li wrote: > For a cgroup hierarchy, there are two cases. Children has lower low > limit than parent. Parent's low limit is meaningless. If children's > bps/iops cross low limit, we can upgrade queue state. The other case is >

Re: [PATCH V5 04/17] blk-throttle: configure bps/iops limit for cgroup in low limit

2017-01-09 Thread Tejun Heo
Hello, On Thu, Dec 15, 2016 at 12:32:55PM -0800, Shaohua Li wrote: > each queue will have a state machine. Initially queue is in LIMIT_LOW > state, which means all cgroups will be throttled according to their low > limit. After all cgroups with low limit cross the limit, the queue state > gets

[LSF/MM TOPIC] [LSF/MM ATTEND] md raid general discussion

2017-01-09 Thread Coly Li
Hi Folks, I'd like to propose a general md raid discussion, it is quite necessary for most of active md raid developers sit together to discuss current challenge of Linux software raid and development trends. In the last years, we have many development activities in md raid, e.g. raid5 cache,

Re: [PATCH V5 03/17] blk-throttle: add .low interface

2017-01-09 Thread Tejun Heo
Happy new year, Shaohua. Sorry about the long delay. On Thu, Dec 15, 2016 at 12:32:54PM -0800, Shaohua Li wrote: > Add low limit for cgroup and corresponding cgroup interface. It'd be nice to explain why we're adding separate _conf fields. > +static void blk_throtl_update_valid_limit(struct

Re: [PATCH] virtio_blk: avoid DMA to stack for the sense buffer

2017-01-09 Thread Jens Axboe
On 01/09/2017 06:35 AM, Christoph Hellwig wrote: > Is someone going to pick the patch up and send it to Linus? I keep > running into all kinds of boot failures whenever I forget to cherry > pick it into my development trees.. I'll add it. -- Jens Axboe

Re: [RFC] blk: increase logical_block_size to unsigned int

2017-01-09 Thread Jerome Marchand
- Original Message - > From: "Sergey Senozhatsky" > To: "Minchan Kim" > Cc: "Jens Axboe" , "Hyeoncheol Lee" , > linux-block@vger.kernel.org, > linux-ker...@vger.kernel.org, "Andrew Morton"

Re: [LSF/MM TOPIC][LSF/MM ATTEND] OCSSDs - SMR, Hierarchical Interface, and Vector I/Os

2017-01-09 Thread Theodore Ts'o
So in the model where the Flash-side is tracking logical to physical zone mapping, and host is merely expecting the ZBC interface, one way it could work is as follows. 1) The flash signals that a particular zone should be reset soon. 2) If the host does not honor the request, eventually the

Re: [RFC] blk: increase logical_block_size to unsigned int

2017-01-09 Thread Sergey Senozhatsky
On (01/09/17 14:04), Minchan Kim wrote: > Mostly, zram is used as swap system on embedded world so it want to do IO > as PAGE_SIZE aligned/size IO unit. For that, one of the problem was > blk_queue_logical_block_size(zram->disk->queue, PAGE_SIZE) made overflow > in *64K page system* so [1] changed
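
The overflow mentioned above is easy to reproduce in userspace. A minimal sketch, assuming the block-size field is a 16-bit unsigned short (as the RFC title suggests) and using hypothetical struct names: a 64K page size (65536) does not fit and wraps to 0, while an unsigned int field holds it.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-ins for a queue-limits struct; only the field width matters. */
struct limits_narrow { unsigned short logical_block_size; };
struct limits_wide   { unsigned int   logical_block_size; };

/* Store size in the narrow field and read it back. */
static unsigned int store_narrow(unsigned int size)
{
	struct limits_narrow lim;

	assert(USHRT_MAX == 0xffff);	/* this sketch assumes a 16-bit short */
	lim.logical_block_size = (unsigned short)size;	/* reduced modulo 65536 */
	return lim.logical_block_size;
}

/* Store size in the wide field and read it back. */
static unsigned int store_wide(unsigned int size)
{
	struct limits_wide lim;

	lim.logical_block_size = size;
	return lim.logical_block_size;
}
```

Here `store_narrow(4096)` round-trips, but `store_narrow(65536)` comes back as 0, which is why a PAGE_SIZE block size breaks on a 64K-page system under this assumption.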

LSF/MM 2017: Call for Proposals closes January 15th

2017-01-09 Thread Jeff Layton
We initially sent this pretty early this year, so this is a resend in case anyone missed the first posting. The call for topics and attendance requests is open until January 15th, 2017. The original message follows: --8< The annual