Re: [PATCH 0/5][v2] blk-iolatency: Fixes and tweak the miss algo for ssds

2018-09-28 Thread Jens Axboe
On 9/28/18 11:45 AM, Josef Bacik wrote: > v1->v2: > - rebased onto a recent for-4.20/block branch > - dropped the changed variable cleanup. > > -- Original message -- > > Testing on ssd's with the current iolatency code wasn't working quite as well. > This is mostly because ssd's don't behave

[PATCH 5/5] blk-iolatency: keep track of previous windows stats

2018-09-28 Thread Josef Bacik
We apply a smoothing to the scale changes in order to keep sawtoothy behavior from occurring. However, our window for checking whether we've missed our target can sometimes be shorter than the smoothing interval (500ms), especially on faster drives like ssd's. In order to deal with this, keep track of

[PATCH 0/5][v2] blk-iolatency: Fixes and tweak the miss algo for ssds

2018-09-28 Thread Josef Bacik
v1->v2: - rebased onto a recent for-4.20/block branch - dropped the changed variable cleanup. -- Original message -- Testing on ssd's with the current iolatency code wasn't working quite as well. This is mostly because ssd's don't behave like rotational drives; they are more spiky, which means

[PATCH 1/5] blk-iolatency: use q->nr_requests directly

2018-09-28 Thread Josef Bacik
We were using blk_queue_depth() assuming that it would return nr_requests, but we hit a case in production on drives that had to have NCQ turned off in order for them to not shit the bed, which resulted in a qd of 1 even though the nr_requests was much larger. iolatency really only cares about
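A minimal userspace sketch of the mismatch the snippet describes, assuming (as a simplified model, not the kernel source) that blk_queue_depth() returns an explicitly configured queue_depth when one is set and only falls back to nr_requests otherwise:

/*
 * Simplified model: a drive with NCQ disabled reports queue_depth == 1,
 * so blk_queue_depth() collapses to 1 even though the request_queue still
 * has a much larger nr_requests for iolatency to scale against.
 */
#include <stdio.h>

struct request_queue {
	unsigned int queue_depth;   /* 0 if the driver never set one */
	unsigned int nr_requests;
};

static unsigned int blk_queue_depth(const struct request_queue *q)
{
	return q->queue_depth ? q->queue_depth : q->nr_requests;
}

int main(void)
{
	struct request_queue q = { .queue_depth = 1, .nr_requests = 256 };

	printf("blk_queue_depth() = %u\n", blk_queue_depth(&q)); /* 1   */
	printf("q->nr_requests    = %u\n", q.nr_requests);       /* 256 */
	return 0;
}

Using q->nr_requests directly sidesteps the artificially tiny depth on such devices.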

[PATCH 3/5] blk-iolatency: deal with small samples

2018-09-28 Thread Josef Bacik
There is logic to keep cgroups that haven't done a lot of IO in the most recent scale window from being punished for over-active higher priority groups. However, for things like ssd's, where the windows are pretty short, we'll end up with small numbers of samples, so 5% of the samples will come out to 0
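A short worked example of the integer-math problem; the floor-of-one clamp shown here is an assumed remedy for illustration, not a quote of the patch:

/*
 * With integer arithmetic, 5% of a small sample count truncates to zero,
 * so a "fewer than 5% of the samples" style check can never fire in a
 * short ssd window. Clamping to at least one sample (assumed fix) keeps
 * tiny windows meaningful.
 */
#include <stdio.h>

int main(void)
{
	unsigned long window_samples = 12;                  /* short ssd window */
	unsigned long five_pct = window_samples * 5 / 100;  /* 60 / 100 == 0    */
	unsigned long floored  = five_pct ? five_pct : 1;   /* at least 1       */

	printf("5%% of %lu samples: raw=%lu floored=%lu\n",
	       window_samples, five_pct, floored);
	return 0;
}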

[PATCH 4/5] blk-iolatency: use a percentile approach for ssd's

2018-09-28 Thread Josef Bacik
We use an average latency approach for determining if we're missing our latency target. This works well for rotational storage, where we have generally consistent latencies, but ssd's and other low-latency devices are spikier, which means we often won't throttle
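An illustrative sketch (not the patch's code) of why an average hides ssd spikes while a percentile-style miss check does not; the numbers are made up:

#include <stdio.h>

int main(void)
{
	unsigned int target = 500;   /* usec latency target */
	unsigned int n = 100, misses = 0;
	unsigned long long sum = 0;

	for (unsigned int i = 0; i < n; i++) {
		/* 98 fast completions plus 2 big spikes */
		unsigned int lat = (i < 98) ? 100 : 5000;

		sum += lat;
		if (lat > target)
			misses++;
	}

	printf("mean   = %llu usec (comfortably under %u)\n", sum / n, target);
	printf("misses = %u%% of requests blew past the target\n",
	       misses * 100 / n);
	return 0;
}

The mean (198 usec) says everything is fine, while 2% of requests took ten times the target, which is exactly what a percentile check is meant to catch.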

[PATCH 2/5] blk-iolatency: deal with nr_requests == 1

2018-09-28 Thread Josef Bacik
Hitting the case where blk_queue_depth() returned 1 uncovered the fact that iolatency doesn't actually handle this case properly, it simply doesn't scale down anybody. For this case we should go straight into applying the time delay, which we weren't doing. Since we already limit the floor at 1

Re: [PATCH] kyber: fix integer overflow of latency targets on 32-bit

2018-09-28 Thread Jens Axboe
On 9/28/18 10:22 AM, Omar Sandoval wrote: > From: Omar Sandoval > > NSEC_PER_SEC has type long, so 5 * NSEC_PER_SEC is calculated as a long. > However, 5 seconds is 5,000,000,000 nanoseconds, which overflows a > 32-bit long. Make sure all of the targets are calculated as 64-bit > values.

[PATCH] kyber: fix integer overflow of latency targets on 32-bit

2018-09-28 Thread Omar Sandoval
From: Omar Sandoval NSEC_PER_SEC has type long, so 5 * NSEC_PER_SEC is calculated as a long. However, 5 seconds is 5,000,000,000 nanoseconds, which overflows a 32-bit long. Make sure all of the targets are calculated as 64-bit values. Fixes: 6e25cb01ea20 ("kyber: implement improved heuristics")
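A standalone demonstration of the overflow, with int32_t standing in for a 32-bit long; the 5LL form is shown as the general remedy rather than the exact hunk from the patch:

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

#define NSEC_PER_SEC 1000000000L

int main(void)
{
	/* what 5 * NSEC_PER_SEC becomes when the result wraps at 32 bits
	 * (computed via unsigned truncation to keep the example well defined) */
	int32_t wrapped = (int32_t)(uint32_t)(5ULL * NSEC_PER_SEC);
	/* the intended target, computed as a 64-bit value */
	int64_t fixed = 5LL * NSEC_PER_SEC;

	printf("32-bit long arithmetic: %" PRId32 " ns (wrapped)\n", wrapped);
	printf("64-bit arithmetic:      %" PRId64 " ns\n", fixed);
	return 0;
}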

Re: [GIT PULL] nvme fixes for 4.19

2018-09-28 Thread Jens Axboe
On 9/28/18 9:40 AM, Christoph Hellwig wrote: > The following changes since commit b228ba1cb95afbaeeb86cf06cd9fd6f6369c3b14: > > null_blk: fix zoned support for non-rq based operation (2018-09-12 18:21:11 > -0600) > > are available in the Git repository at: > >

[GIT PULL] nvme fixes for 4.19

2018-09-28 Thread Christoph Hellwig
The following changes since commit b228ba1cb95afbaeeb86cf06cd9fd6f6369c3b14: null_blk: fix zoned support for non-rq based operation (2018-09-12 18:21:11 -0600) are available in the Git repository at: git://git.infradead.org/nvme.git nvme-4.19 for you to fetch changes up to

Re: [PATCH 2/5] nvme: register ns_id attributes as default sysfs groups

2018-09-28 Thread Christoph Hellwig
On Fri, Sep 28, 2018 at 08:17:20AM +0200, Hannes Reinecke wrote: > We should be registering the ns_id attribute as default sysfs > attribute groups, otherwise we have a race condition between > the uevent and the attributes appearing in sysfs. Looks good, Reviewed-by: Christoph Hellwig

Re: [PATCHv4 0/5] genhd: register default groups with device_add_disk()

2018-09-28 Thread Jens Axboe
On 9/28/18 12:17 AM, Hannes Reinecke wrote: > device_add_disk() doesn't allow to register with default sysfs groups, > which introduces a race with udev as these groups have to be created after > the uevent has been sent. > This patchset updates device_add_disk() to accept a 'groups' argument to >

Re: [PATCH 2/5] nvme: register ns_id attributes as default sysfs groups

2018-09-28 Thread Keith Busch
On Fri, Sep 28, 2018 at 08:17:20AM +0200, Hannes Reinecke wrote: > We should be registering the ns_id attribute as default sysfs > attribute groups, otherwise we have a race condition between > the uevent and the attributes appearing in sysfs. > > Suggested-by: Bart van Assche > Signed-off-by:

[PATCH V2] blk-mq: complete req in softirq context in case of single queue

2018-09-28 Thread Ming Lei
Lots of controllers may have only one irq vector for completing IO requests, and usually the affinity of that one irq vector covers all possible CPUs; however, on most architectures there may be only one specific CPU handling the interrupt. So if all IOs are completed in hardirq context, it is inevitable
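A conceptual sketch of the idea only, under the assumption (not taken from the patch) that single-hw-queue completions can be handed off to the existing block softirq path; the helper name is hypothetical:

#include <linux/blkdev.h>
#include <linux/blk-mq.h>

/* Hypothetical helper: when there is a single hw queue whose lone irq
 * vector lands on one CPU, defer the completion work to BLOCK_SOFTIRQ
 * instead of running ->complete() in the interrupt handler itself. */
static void single_queue_friendly_complete(struct request *rq)
{
	struct request_queue *q = rq->q;

	if (q->nr_hw_queues == 1) {
		/* routes the completion through the block softirq */
		__blk_complete_request(rq);
		return;
	}

	/* multiple hw queues already spread completion work around */
	q->mq_ops->complete(rq);
}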

Re: [PATCH] blk-mq: complete req in softirq context in case of single queue

2018-09-28 Thread Ming Lei
On Thu, Sep 27, 2018 at 11:30:19AM +0800, jianchao.wang wrote: > Hi Ming > > On 09/27/2018 12:08 AM, Ming Lei wrote: > > Lot of controllers may have only one irq vector for completing IO > > request. And usually affinity of the only irq vector is all possible > > CPUs, however, on most of ARCH,

[PATCH 2/5] nvme: register ns_id attributes as default sysfs groups

2018-09-28 Thread Hannes Reinecke
We should be registering the ns_id attribute as default sysfs attribute groups, otherwise we have a race condition between the uevent and the attributes appearing in sysfs. Suggested-by: Bart van Assche Signed-off-by: Hannes Reinecke --- drivers/nvme/host/core.c | 21 -

[PATCH 1/5] block: genhd: add 'groups' argument to device_add_disk

2018-09-28 Thread Hannes Reinecke
Update device_add_disk() to take a 'groups' argument so that individual drivers can register a device with additional sysfs attributes. This avoids the race condition the driver would otherwise have if these groups were to be created with sysfs_create_groups(). Signed-off-by: Martin Wilck
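A sketch of driver-side usage, assuming the extended signature ends up as device_add_disk(parent, disk, groups); the attribute and function names here are made up for illustration:

#include <linux/device.h>
#include <linux/genhd.h>

/* hypothetical read-only attribute */
static ssize_t serial_show(struct device *dev,
			   struct device_attribute *attr, char *buf)
{
	return sprintf(buf, "example-serial\n");
}
static DEVICE_ATTR_RO(serial);

static struct attribute *example_disk_attrs[] = {
	&dev_attr_serial.attr,
	NULL,
};

static const struct attribute_group example_disk_group = {
	.attrs = example_disk_attrs,
};

static const struct attribute_group *example_disk_groups[] = {
	&example_disk_group,
	NULL,
};

/* in the driver's probe path, once the gendisk is set up */
static void example_register(struct device *parent, struct gendisk *disk)
{
	/* groups are added before the KOBJ_ADD uevent fires, closing the race */
	device_add_disk(parent, disk, example_disk_groups);
}

Because the core creates the groups before emitting the add uevent, udev rules that read these attributes no longer race with the driver.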

[PATCH 4/5] zram: register default groups with device_add_disk()

2018-09-28 Thread Hannes Reinecke
Register default sysfs groups during device_add_disk() to avoid a race condition with udev during startup. Signed-off-by: Hannes Reinecke Cc: Minchan Kim Cc: Nitin Gupta Reviewed-by: Christoph Hellwig Reviewed-by: Bart Van Assche --- drivers/block/zram/zram_drv.c | 28

[PATCH 5/5] virtio-blk: modernize sysfs attribute creation

2018-09-28 Thread Hannes Reinecke
Use new-style DEVICE_ATTR_RO/DEVICE_ATTR_RW to create the sysfs attributes and register the disk with default sysfs attribute groups. Signed-off-by: Hannes Reinecke Reviewed-by: Christoph Hellwig Acked-by: Michael S. Tsirkin Reviewed-by: Bart Van Assche --- drivers/block/virtio_blk.c | 68

[PATCHv4 0/5] genhd: register default groups with device_add_disk()

2018-09-28 Thread Hannes Reinecke
device_add_disk() doesn't allow registering default sysfs groups, which introduces a race with udev, as these groups have to be created after the uevent has been sent. This patchset updates device_add_disk() to accept a 'groups' argument to avoid this race and updates the obvious drivers to

[PATCH 3/5] aoe: register default groups with device_add_disk()

2018-09-28 Thread Hannes Reinecke
Register default sysfs groups during device_add_disk() to avoid a race condition with udev during startup. Signed-off-by: Hannes Reinecke Reviewed-by: Christoph Hellwig Acked-by: Ed L. Cashin Reviewed-by: Bart Van Assche --- drivers/block/aoe/aoe.h | 1 - drivers/block/aoe/aoeblk.c | 21