Re: [PATCH v2] blk-throttle: fix race between blkcg_bio_issue_check and cgroup_rmdir

2018-02-27 Thread Joseph Qi
Hi Tejun, On 18/2/28 02:33, Tejun Heo wrote: > Hello, Joseph. > > On Sat, Feb 24, 2018 at 09:45:49AM +0800, Joseph Qi wrote: >>> IIRC, as long as the blkcg and the device are there, the blkgs aren't >>> gonna be destroyed. So, if you have a ref to the blkcg through >>> tryget, the blkg

Re: [bug report] Don't enter SCSI error handler on kernel 4.16-rc1

2018-02-27 Thread chenxiang (M)
On 2018/2/27 22:57, Bart Van Assche wrote: On Tue, 2018-02-27 at 15:09 +0800, chenxiang (M) wrote: On 2018/2/26 23:25, Bart Van Assche wrote: On Mon, 2018-02-26 at 17:37 +0800, chenxiang (M) wrote: When I ran a test on kernel 4.16-rc1, I found an issue: running IO on SATA disk, then disable the disk

Re: [PATCH v7 0/9] bcache: device failure handling improvement

2018-02-27 Thread Coly Li
On 28/02/2018 3:27 AM, Michael Lyle wrote: > On 02/27/2018 10:33 AM, Michael Lyle wrote: >> On 02/27/2018 10:29 AM, Michael Lyle wrote: >>> Hi Coly Li-- >>> >>> On 02/27/2018 08:55 AM, Coly Li wrote: Hi maintainers and folks, This patch set tries to improve bcache device failure

Re: [PATCH v7 9/9] bcache: stop bcache device when backing device is offline

2018-02-27 Thread Coly Li
On 28/02/2018 2:20 AM, Michael Lyle wrote: > Hi Coly Li-- > > Just a couple of questions. > > On 02/27/2018 08:55 AM, Coly Li wrote: >> +#define BACKING_DEV_OFFLINE_TIMEOUT 5 > Hi Mike, > I think you wanted this to be 30 (per commit message)-- was this turned > down for testing or deliberate?

Re: [PATCH] blk-mq: Make sure that the affected zone is unlocked if a request times out

2018-02-27 Thread Ming Lei
Hi Damien, On Wed, Feb 28, 2018 at 02:21:49AM +0000, Damien Le Moal wrote: > Ming, > > On 2018/02/27 17:35, Ming Lei wrote: > > On Tue, Feb 27, 2018 at 04:28:30PM -0800, Bart Van Assche wrote: > >> If a request times out the .completed_request() method is not called > > > > If BLK_EH_HANDLED is

Re: [PATCH] blk-mq: Make sure that the affected zone is unlocked if a request times out

2018-02-27 Thread Damien Le Moal
Ming, On 2018/02/27 17:35, Ming Lei wrote: > On Tue, Feb 27, 2018 at 04:28:30PM -0800, Bart Van Assche wrote: >> If a request times out the .completed_request() method is not called > > If BLK_EH_HANDLED is returned from .timeout(), __blk_mq_complete_request() > should have called

[PATCH] bcache: fix crashes in duplicate cache device register

2018-02-27 Thread tang . junhui
From: Tang Junhui Kernel crashed when registering a duplicate cache device, the call trace is below: [ 417.643790] CPU: 1 PID: 16886 Comm: bcache-register Tainted: G W OE4.15.5-amd64-preempt-sysrq-20171018 #2 [

Re: [PATCH 2/2] blk-mq-debugfs: Show zone locking information

2018-02-27 Thread Damien Le Moal
Bart, On 2018/02/27 16:32, Bart Van Assche wrote: > When debugging the ZBC code in the mq-deadline scheduler it is very > important to know which zones are locked and which zones are not > locked. Hence this patch that exports the zone locking information > through debugfs. > > Signed-off-by:

Re: [PATCH] blk-mq: Make sure that the affected zone is unlocked if a request times out

2018-02-27 Thread Ming Lei
On Tue, Feb 27, 2018 at 04:28:30PM -0800, Bart Van Assche wrote: > If a request times out the .completed_request() method is not called If BLK_EH_HANDLED is returned from .timeout(), __blk_mq_complete_request() should have called .completed_request(). Otherwise, somewhere may be wrong about

[PATCH 2/2] sbitmap: use test_and_set_bit_lock()/clear_bit_unlock()

2018-02-27 Thread Omar Sandoval
From: Omar Sandoval sbitmap_queue_get()/sbitmap_queue_clear() are used for allocating/freeing a resource, so they should provide acquire/release barrier semantics, respectively. sbitmap_get() currently contains a full barrier, which is unnecessary, so use test_and_set_bit_lock()
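The acquire/release pairing this patch describes can be illustrated outside the kernel. The following is a minimal userspace sketch using C11 atomics, not the kernel's sbitmap code: struct bitmap_word, bit_try_acquire() and bit_release() are hypothetical names, and the C11 memory orders are only assumed to be a rough analogue of test_and_set_bit_lock()/clear_bit_unlock().

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for one word of a bitmap allocator. */
struct bitmap_word {
	atomic_ulong bits;
};

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/*
 * Try to allocate one bit. The read-modify-write uses acquire ordering,
 * so accesses to the resource guarded by the bit cannot be reordered
 * before the allocation. Returns true if this caller set the bit.
 */
static bool bit_try_acquire(struct bitmap_word *w, unsigned int bit)
{
	assert(bit < BITS_PER_WORD);
	unsigned long mask = 1UL << bit;
	unsigned long old = atomic_fetch_or_explicit(&w->bits, mask,
						     memory_order_acquire);
	return (old & mask) == 0;
}

/*
 * Free one bit. Release ordering makes every earlier access to the
 * resource visible before the bit is seen as clear by the next owner.
 */
static void bit_release(struct bitmap_word *w, unsigned int bit)
{
	assert(bit < BITS_PER_WORD);
	atomic_fetch_and_explicit(&w->bits, ~(1UL << bit),
				  memory_order_release);
}
```

Neither operation is a full barrier here, which is the point the snippet makes: allocation only needs acquire semantics and freeing only needs release semantics.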

[PATCH 0/2] block: sbitmap tweaks

2018-02-27 Thread Omar Sandoval
From: Omar Sandoval Two fixlets inspired by Tejun's patch (https://patchwork.kernel.org/patch/10226749/). Patch 2 is what we discussed on that patch, patch 1 is a small preparation. Omar Sandoval (2): block: clear ctx pending bit under ctx lock sbitmap: use

[PATCH 1/2] block: clear ctx pending bit under ctx lock

2018-02-27 Thread Omar Sandoval
From: Omar Sandoval When we insert a request, we set the software queue pending bit while holding the software queue lock. However, we clear it outside of the lock, so it's possible that a concurrent insert could reset the bit after we clear it but before we empty the request
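The locking rule behind this fix can be sketched in userspace as follows. This is an illustration under stated assumptions, not the blk-mq code: struct sw_queue, sw_queue_insert() and sw_queue_flush() are hypothetical names, a counter stands in for the request list, a bool stands in for the pending bit, and a pthread mutex stands in for the software queue lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a software queue. */
struct sw_queue {
	pthread_mutex_t lock;
	int nr_queued;	/* stands in for the request list */
	bool pending;	/* stands in for the pending bit */
};

/* Insert: queue the request and set the pending flag under the lock. */
static void sw_queue_insert(struct sw_queue *q)
{
	assert(q != NULL);
	pthread_mutex_lock(&q->lock);
	q->nr_queued++;
	q->pending = true;
	pthread_mutex_unlock(&q->lock);
}

/*
 * Flush: empty the queue and clear the pending flag in the same
 * critical section. If the flag were cleared before taking the lock,
 * an insert could set it again in between, and the flush would then
 * leave the flag set on an empty queue.
 */
static int sw_queue_flush(struct sw_queue *q)
{
	assert(q != NULL);
	pthread_mutex_lock(&q->lock);
	int n = q->nr_queued;
	q->nr_queued = 0;
	q->pending = false;
	pthread_mutex_unlock(&q->lock);
	return n;
}
```

Because both paths touch the flag only while holding the lock, the flag and the queue contents cannot disagree once either function returns.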

[PATCH 1/2] blk-mq-debugfs: Reorder queue show and store methods

2018-02-27 Thread Bart Van Assche
Make sure that the queue show and store methods are contiguous and also that these appear in alphabetical order. Signed-off-by: Bart Van Assche Cc: Omar Sandoval Cc: Damien Le Moal Cc: Ming Lei Cc: Hannes

[PATCH 2/2] blk-mq-debugfs: Show zone locking information

2018-02-27 Thread Bart Van Assche
When debugging the ZBC code in the mq-deadline scheduler it is very important to know which zones are locked and which zones are not locked. Hence this patch that exports the zone locking information through debugfs. Signed-off-by: Bart Van Assche Cc: Omar Sandoval

[PATCH 0/2] Make the zone locking information available in debugfs

2018-02-27 Thread Bart Van Assche
Hello Jens, While analyzing the mq-deadline behavior for ZBC drives together with Damien we noticed the following: - That the request queue attribute methods are not contiguous in blk-mq-debugfs.c. - That the information about which zones are locked is not yet available in debugfs. Hence

[PATCH] blk-mq: Make sure that the affected zone is unlocked if a request times out

2018-02-27 Thread Bart Van Assche
If a request times out the .completed_request() method is not called and the .finish_request() method is only called if RQF_ELVPRIV has been set. Hence this patch that sets RQF_ELVPRIV and that adds a .finish_request() method. Without this patch, if a request times out the zone that request

Re: [PATCH v4 0/6] Fix races between blkcg code and request queue initialization and cleanup

2018-02-27 Thread Bart Van Assche
On Sat, 2018-02-24 at 20:44 +0800, Ming Lei wrote: > On Thu, Feb 22, 2018 at 05:08:02PM -0800, Bart Van Assche wrote: > > Hello Jens, > > > > Recently Joseph Qi identified races between the block cgroup code and > > request > > queue initialization and cleanup. This patch series address these

Re: [PATCH v4 0/6] Fix races between blkcg code and request queue initialization and cleanup

2018-02-27 Thread Bart Van Assche
On Thu, 2018-02-22 at 17:08 -0800, Bart Van Assche wrote: > Recently Joseph Qi identified races between the block cgroup code and request > queue initialization and cleanup. This patch series address these races. > Please > consider these patches for kernel v4.17. Hello Jens, Can you have a

Re: [PATCH] null_blk: add 'requeue' fault attribute

2018-02-27 Thread Bart Van Assche
On Tue, 2018-02-27 at 15:34 -0700, Jens Axboe wrote: > Similarly to the support we have for testing/faking timeouts for > null_blk, this adds support for triggering a requeue condition. > Considering the issues around restart we've been seeing, this should be > a useful addition to the testing

Re: [PATCH] nbd: fix return value in error handling path

2018-02-27 Thread Jens Axboe
On 2/12/18 10:14 AM, Gustavo A. R. Silva wrote: > It seems that the proper value to return in this particular case is the > one contained into variable new_index instead of ret. Thanks, applied. -- Jens Axboe

Re: [PATCH] nbd: fix return value in error handling path

2018-02-27 Thread Gustavo A. R. Silva
On 02/26/2018 05:01 PM, Omar Sandoval wrote: On Mon, Feb 12, 2018 at 11:14:55AM -0600, Gustavo A. R. Silva wrote: It seems that the proper value to return in this particular case is the one contained into variable new_index instead of ret. Addresses-Coverity-ID: 1465148 ("Copy-paste error")

[PATCH] null_blk: add 'requeue' fault attribute

2018-02-27 Thread Jens Axboe
Similarly to the support we have for testing/faking timeouts for null_blk, this adds support for triggering a requeue condition. Considering the issues around restart we've been seeing, this should be a useful addition to the testing arsenal to ensure that we are handling requeue conditions

Re: [PATCH RFC] sbitmap: Use lock/unlock atomic bitops

2018-02-27 Thread Omar Sandoval
On Tue, Feb 27, 2018 at 10:14:04AM -0800, Tejun Heo wrote: > Hello, Omar. > > On Mon, Feb 26, 2018 at 02:14:44PM -0800, Omar Sandoval wrote: > > > wake_index = atomic_read(&sbq->wake_index); > > > for (i = 0; i < SBQ_WAIT_QUEUES; i++) { > > > struct sbq_wait_state *ws = &sbq->ws[wake_index];

Re: [PATCH 17/19] lightnvm: pblk: implement get log report chunk

2018-02-27 Thread Javier González
> On 27 Feb 2018, at 19.46, Matias Bjørling wrote: > > On 02/27/2018 03:40 PM, Javier González wrote: >>> On 26 Feb 2018, at 20.04, Matias Bjørling wrote: > >>> Can you help me understand why you want to use The >>> NVM_CHK_ST_HOST_USE? Why would I care if

Re: [PATCH] lightnvm: simplify geometry structure.

2018-02-27 Thread Javier Gonzalez
> On 27 Feb 2018, at 19.23, Matias Bjørling wrote: > > On 02/27/2018 04:57 PM, Javier González wrote: >> Currently, the device geometry is stored redundantly in the nvm_id and >> nvm_geo structures at a device level. Moreover, when instantiating >> targets on a specific number

Re: [PATCH v7 0/9] bcache: device failure handling improvement

2018-02-27 Thread Michael Lyle
On 02/27/2018 10:33 AM, Michael Lyle wrote: > On 02/27/2018 10:29 AM, Michael Lyle wrote: >> Hi Coly Li-- >> >> On 02/27/2018 08:55 AM, Coly Li wrote: >>> Hi maintainers and folks, >>> >>> This patch set tries to improve bcache device failure handling, includes >>> cache device and backing device

Re: [PATCH v7 3/9] bcache: stop dc->writeback_rate_update properly

2018-02-27 Thread Michael Lyle
OK, I have convinced myself this is safe. Reviewed-by: Michael Lyle On 02/27/2018 08:55 AM, Coly Li wrote: > struct delayed_work writeback_rate_update in struct cache_dev is a delayed > worker to call function update_writeback_rate() in period (the interval is > defined by

Re: [PATCH 17/19] lightnvm: pblk: implement get log report chunk

2018-02-27 Thread Matias Bjørling
On 02/27/2018 03:40 PM, Javier González wrote: On 26 Feb 2018, at 20.04, Matias Bjørling wrote: Can you help me understand why you want to use The NVM_CHK_ST_HOST_USE? Why would I care if the chunk state is HOST_USE? A target instance should not be able to see states

Re: [PATCH v2] blk-throttle: fix race between blkcg_bio_issue_check and cgroup_rmdir

2018-02-27 Thread Tejun Heo
Hello, Joseph. On Sat, Feb 24, 2018 at 09:45:49AM +0800, Joseph Qi wrote: > > IIRC, as long as the blkcg and the device are there, the blkgs aren't > > gonna be destroyed. So, if you have a ref to the blkcg through > > tryget, the blkg shouldn't go away. > > > > Maybe we have misunderstanding

Re: [PATCH v7 0/9] bcache: device failure handling improvement

2018-02-27 Thread Michael Lyle
On 02/27/2018 10:29 AM, Michael Lyle wrote: > Hi Coly Li-- > > On 02/27/2018 08:55 AM, Coly Li wrote: >> Hi maintainers and folks, >> >> This patch set tries to improve bcache device failure handling, includes >> cache device and backing device failures. > > I have applied 1, 2, 4 & 6 from this

Re: [PATCH v7 0/9] bcache: device failure handling improvement

2018-02-27 Thread Michael Lyle
Hi Coly Li-- On 02/27/2018 08:55 AM, Coly Li wrote: > Hi maintainers and folks, > > This patch set tries to improve bcache device failure handling, includes > cache device and backing device failures. I have applied 1, 2, 4 & 6 from this series to my 4.17 bcache-for-next for testing. Mike

Re: [PATCH] lightnvm: simplify geometry structure.

2018-02-27 Thread Matias Bjørling
On 02/27/2018 04:57 PM, Javier González wrote: Currently, the device geometry is stored redundantly in the nvm_id and nvm_geo structures at a device level. Moreover, when instantiating targets on a specific number of LUNs, these structures are replicated and manually modified to fit the instance

Re: [PATCH v7 9/9] bcache: stop bcache device when backing device is offline

2018-02-27 Thread Michael Lyle
Hi Coly Li-- Just a couple of questions. On 02/27/2018 08:55 AM, Coly Li wrote: > +#define BACKING_DEV_OFFLINE_TIMEOUT 5 I think you wanted this to be 30 (per commit message)-- was this turned down for testing or deliberate? > +static int cached_dev_status_update(void *arg) > +{ > + struct

Re: [PATCH v7 4/9] bcache: add CACHE_SET_IO_DISABLE to struct cache_set flags

2018-02-27 Thread Michael Lyle
Hi Coly Li-- On 02/27/2018 08:55 AM, Coly Li wrote: > When too many I/Os failed on cache device, bch_cache_set_error() is called > in the error handling code path to retire whole problematic cache set. If > new I/O requests continue to come and take refcount dc->count, the cache > set won't be

Re: [PATCH v7 1/9] bcache: fix cached_dev->count usage for bch_cache_set_error()

2018-02-27 Thread Michael Lyle
Hi Coly Li--- Thanks for this. I've been uncomfortable with the interaction between the dirty status and the refcount (even aside from this issue), and I believe you've resolved it. I'm sorry for the slow review-- it's taken me some time to convince myself that this is safe. I'm getting closer

Re: [PATCH 0/2] Bcache fixes for 4.16

2018-02-27 Thread Jens Axboe
On 2/27/18 10:49 AM, Michael Lyle wrote: > Hi Jens, > > Please pick up these two critical fixes to bcache by Tang Junhui. > They're both one-liners and have been reviewed and tested. > > The first corrects a regression when flash-only volumes are present > that was introduced in 4.16-RC1. The

[PATCH 1/2] bcache: correct flash only vols (check all uuids)

2018-02-27 Thread Michael Lyle
From: Coly Li Commit 2831231d4c3f ("bcache: reduce cache_set devices iteration by devices_max_used") adds c->devices_max_used to reduce iteration of c->uuids elements, this value is updated in bcache_device_attach(). But for flash only volume, when calling flash_devs_run(), the

[PATCH 0/2] Bcache fixes for 4.16

2018-02-27 Thread Michael Lyle
Hi Jens, Please pick up these two critical fixes to bcache by Tang Junhui. They're both one-liners and have been reviewed and tested. The first corrects a regression when flash-only volumes are present that was introduced in 4.16-RC1. The second adjusts bio refcount and completion behavior to

[PATCH 2/2] bcache: fix kcrashes with fio in RAID5 backend dev

2018-02-27 Thread Michael Lyle
From: Tang Junhui Kernel crashed when running fio in a RAID5 backend bcache device, the call trace is below: [ 440.012034] kernel BUG at block/blk-ioc.c:146! [ 440.012696] invalid opcode: [#1] SMP NOPTI [ 440.026537] CPU: 2 PID: 2205 Comm: md127_raid5 Not tainted

[PATCH v7 7/9] bcache: add backing_request_endio() for bi_end_io of attached backing device I/O

2018-02-27 Thread Coly Li
In order to catch I/O error of backing device, a separate bi_end_io call back is required. Then a per backing device counter can record I/O errors number and retire the backing device if the counter reaches a per backing device I/O error limit. This patch adds backing_request_endio() to bcache

[PATCH v7 9/9] bcache: stop bcache device when backing device is offline

2018-02-27 Thread Coly Li
Currently bcache does not handle backing device failure: if the backing device is offline and disconnected from the system, its bcache device can still be accessible. If the bcache device is in writeback mode, I/O requests can even succeed if the requests hit on the cache device. That is to say, when and how

[PATCH v7 5/9] bcache: add stop_when_cache_set_failed option to backing device

2018-02-27 Thread Coly Li
When there are too many I/O errors on cache device, current bcache code will retire the whole cache set, and detach all bcache devices. But the detached bcache devices are not stopped, which is problematic when bcache is in writeback mode. If the retired cache set has dirty data of backing

[PATCH v7 6/9] bcache: fix inaccurate io state for detached bcache devices

2018-02-27 Thread Coly Li
From: Tang Junhui When we run IO in a detached device, and run iostat to show IO status, normally it will show like below (some fields omitted): Device: ... avgrq-sz avgqu-sz await r_await w_await svctm %util sdd ... 15.89 0.53 1.82 0.20 2.23

[PATCH v7 2/9] bcache: quit dc->writeback_thread when BCACHE_DEV_DETACHING is set

2018-02-27 Thread Coly Li
In patch "bcache: fix cached_dev->count usage for bch_cache_set_error()", cached_dev_get() is called when creating dc->writeback_thread, and cached_dev_put() is called when exiting dc->writeback_thread. This modification works well unless people detach the bcache device manually by 'echo 1 >

[PATCH v7 1/9] bcache: fix cached_dev->count usage for bch_cache_set_error()

2018-02-27 Thread Coly Li
When bcache metadata I/O fails, bcache will call bch_cache_set_error() to retire the whole cache set. The expected behavior to retire a cache set is to unregister the cache set, and unregister all backing devices attached to this cache set, then remove sysfs entries of the cache set and all

[PATCH v7 3/9] bcache: stop dc->writeback_rate_update properly

2018-02-27 Thread Coly Li
struct delayed_work writeback_rate_update in struct cache_dev is a delayed worker to call function update_writeback_rate() in period (the interval is defined by dc->writeback_rate_update_seconds). When a metadata I/O error happens on cache device, bcache error handling routine

[PATCH v7 4/9] bcache: add CACHE_SET_IO_DISABLE to struct cache_set flags

2018-02-27 Thread Coly Li
When too many I/Os failed on cache device, bch_cache_set_error() is called in the error handling code path to retire whole problematic cache set. If new I/O requests continue to come and take refcount dc->count, the cache set won't be retired immediately, this is a problem. Furthermore, there

[PATCH v7 0/9] bcache: device failure handling improvement

2018-02-27 Thread Coly Li
Hi maintainers and folks, This patch set tries to improve bcache device failure handling, includes cache device and backing device failures. The basic idea to handle failed cache device is, - Unregister cache set - Detach all backing devices which are attached to this cache set - Stop all the

Re: GPF in wb_congested due to null bdi_writeback

2018-02-27 Thread Nikolay Borisov
On 27.02.2018 18:05, Nikolay Borisov wrote: > Hello Tejun, > > So while running some fs tests I hit the following GPF. Btw the > warning taint flag was due to a debugging WARN_ON in btrfs 100 or so > tests ago so is unrelated to this gpf: > > [ 4255.628110] general protection fault:

Re: [PATCH] lightnvm: simplify geometry structure.

2018-02-27 Thread Javier Gonzalez
> On 27 Feb 2018, at 16.57, Javier González wrote: > > Currently, the device geometry is stored redundantly in the nvm_id and > nvm_geo structures at a device level. Moreover, when instantiating > targets on a specific number of LUNs, these structures are replicated > and

GPF in wb_congested due to null bdi_writeback

2018-02-27 Thread Nikolay Borisov
Hello Tejun, So while running some fs tests I hit the following GPF. Btw the warning taint flag was due to a debugging WARN_ON in btrfs 100 or so tests ago so is unrelated to this gpf: [ 4255.628110] general protection fault: [#1] SMP PTI [ 4255.628303] Modules linked in: [ 4255.628446]

[PATCH] lightnvm: simplify geometry structure.

2018-02-27 Thread Javier González
Currently, the device geometry is stored redundantly in the nvm_id and nvm_geo structures at a device level. Moreover, when instantiating targets on a specific number of LUNs, these structures are replicated and manually modified to fit the instance channel and LUN partitioning. Instead, create a

[PATCH V4] lightnvm: simplify geometry structure

2018-02-27 Thread Javier González
Sending this separately as it seems to be the controversial one. # Changes since V3 From Matias: - Remove nvm_common_geo - Do appropriate renames when having a single geometry for device and targets Javier Javier González (1): lightnvm: simplify geometry structure.

Re: [bug report] Don't enter SCSI error handler on kernel 4.16-rc1

2018-02-27 Thread Bart Van Assche
On Tue, 2018-02-27 at 15:09 +0800, chenxiang (M) wrote: > On 2018/2/26 23:25, Bart Van Assche wrote: > > On Mon, 2018-02-26 at 17:37 +0800, chenxiang (M) wrote: > > > When I ran a test on kernel 4.16-rc1, I found an issue: running IO on SATA > > > disk, then disable the disk through > > > sysfs

Re: [PATCH 17/19] lightnvm: pblk: implement get log report chunk

2018-02-27 Thread Javier González
> On 26 Feb 2018, at 20.04, Matias Bjørling wrote: > > On 02/26/2018 02:17 PM, Javier González wrote: >> From: Javier González >> In preparation of pblk supporting 2.0, implement the get log report >> chunk in pblk. >> This patch only replicates the bad

[PATCH 2/4] block: bio_check_eod() needs to consider partition

2018-02-27 Thread Jiufei Xue
bio_check_eod() should check partition size not the whole disk if bio->bi_partno is not zero. Fixes: 74d46992e0d9 ("block: replace bi_bdev with a gendisk pointer and partitions index") Signed-off-by: Jiufei Xue --- block/blk-core.c | 79
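The check this patch describes, comparing an I/O range against the partition size rather than the whole disk when the partition number is not zero, can be sketched as below. This is a hedged illustration, not the kernel's bio_check_eod(): io_beyond_end() and io_check_eod() are hypothetical names, and the sketch assumes the start sector is still relative to the partition (that is, not yet remapped to a whole-disk offset).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if [start, start + nr_sectors) extends past `limit`
 * sectors. Written so that start + nr_sectors is never computed,
 * which avoids unsigned overflow for very large start values.
 */
static bool io_beyond_end(uint64_t start, uint64_t nr_sectors, uint64_t limit)
{
	if (nr_sectors == 0)
		return false;
	return nr_sectors > limit || start > limit - nr_sectors;
}

/*
 * Pick the limit to check against: the partition size when partno is
 * not zero, otherwise the capacity of the whole disk.
 */
static bool io_check_eod(uint64_t start, uint64_t nr_sectors,
			 unsigned int partno, uint64_t part_sectors,
			 uint64_t disk_sectors)
{
	assert(part_sectors <= disk_sectors);
	uint64_t limit = partno ? part_sectors : disk_sectors;
	return io_beyond_end(start, nr_sectors, limit);
}
```

With a 100-sector partition on a 1000-sector disk, an 8-sector I/O starting at sector 96 passes a whole-disk check but is rejected once the partition size is used as the limit.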

[PATCH 3/4] block: display the correct diskname for bio

2018-02-27 Thread Jiufei Xue
bio_devname uses __bdevname to display the device name, and can only show the major and minor of the part0. Fix this by using disk_name to display the correct name. Fixes: 74d46992e0d9 ("block: replace bi_bdev with a gendisk pointer and partitions index") Signed-off-by: Jiufei Xue

[PATCH 4/4] block: fix a typo

2018-02-27 Thread Jiufei Xue
Fix a typo in pkt_start_recovery. Fixes: 74d46992e0d9 ("block: replace bi_bdev with a gendisk pointer and partitions index") Signed-off-by: Jiufei Xue --- drivers/block/pktcdvd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git

[PATCH 1/4] block: fix the count of PGPGOUT for WRITE_SAME

2018-02-27 Thread Jiufei Xue
The vm counters are counted in sectors, so we should do the conversion in submit_bio. Fixes: 74d46992e0d9 ("block: replace bi_bdev with a gendisk pointer and partitions index") Cc: sta...@vger.kernel.org Reviewed-by: Omar Sandoval Reviewed-by: Christoph Hellwig

[PATCH V2 0/4] fix a few problems in block layer

2018-02-27 Thread Jiufei Xue
I have found a few problems while reviewing the patch 74d46992e0d9 ("block: replace bi_bdev with a gendisk pointer and partitions index"), so fix them. Changes since v1: - add a Fixes tag in individual patch. - check end-of-device of bio in blk_partition_remap when the bi_partno is not zero to

Re: [PATCH V15 06/22] mmc: block: Add blk-mq support

2018-02-27 Thread Dmitry Osipenko
On 27.02.2018 11:57, Linus Walleij wrote: > On Mon, Feb 26, 2018 at 10:48 PM, Dmitry Osipenko wrote: >> On 22.02.2018 20:54, Dmitry Osipenko wrote: >>> On 22.02.2018 10:42, Adrian Hunter wrote: > SDIO (unless it is a combo card) should be unaffected by changes to the

Re: v4.16-rc2: virtio-block + ext4 lockdep splats / sleeping from invalid context

2018-02-27 Thread Mark Rutland
On Mon, Feb 26, 2018 at 01:44:55PM +0100, Jan Kara wrote: > On Mon 26-02-18 11:38:19, Mark Rutland wrote: > > That seems to be it! > > > > With the below patch applied, I can't trigger the bug after ~10 minutes, > > whereas prior to the patch I can trigger it in ~10 seconds. I'll leave > > that

[PATCH V3 8/8] scsi: megaraid: improve scsi_mq performance via .host_tagset

2018-02-27 Thread Ming Lei
It is observed on null_blk that IOPS can be improved much by simply making hw queue per NUMA node, so this patch applies the introduced .host_tagset for improving performance. In reality, .can_queue is quite big, and NUMA node number is often small, so each hw queue's depth should be high enough

[PATCH V3 7/8] scsi: hpsa: improve scsi_mq performance via .host_tagset

2018-02-27 Thread Ming Lei
It is observed that IOPS can be improved much by simply making hw queue per NUMA node on null_blk, so this patch applies the introduced .host_tagset for improving performance. In reality, .can_queue is quite big, and NUMA node number is often small, so each hw queue's depth should be high enough

[PATCH V3 5/8] scsi: Add template flag 'host_tagset'

2018-02-27 Thread Ming Lei
From: Hannes Reinecke Add a host template flag 'host_tagset' to enable the use of a global tagset for block-mq. Cc: Hannes Reinecke Cc: Arun Easi Cc: Omar Sandoval , Cc: "Martin K. Petersen" , Cc:

[PATCH V3 6/8] block: null_blk: introduce module parameter of 'g_host_tags'

2018-02-27 Thread Ming Lei
This patch introduces the parameter of 'g_host_tags' so that we can test this feature by null_blk easily. With host_tags when the whole hw depth is kept as same, it is observed that IOPS can be improved by ~50% on a dual socket(total 16 CPU cores) system: 1) no 'host_tags', each hw queue depth is

[PATCH V3 4/8] blk-mq: introduce BLK_MQ_F_HOST_TAGS

2018-02-27 Thread Ming Lei
This patch supports partitioning host-wide tags to multiple hw queues, so each hw queue related data structures(tags, hctx) can be accessed in NUMA locality way, for example, the hw queue can be per NUMA node. It is observed IOPS can be improved much in this way on null_blk test. Cc: Hannes

[PATCH V3 3/8] blk-mq: introduce 'start_tag' field to 'struct blk_mq_tags'

2018-02-27 Thread Ming Lei
This patch introduces 'start_tag' field to 'struct blk_mq_tags' so that host wide tagset can be supported easily in the following patches by partitioning host wide tags into multiple hw queues. No function change. Cc: Hannes Reinecke Cc: Arun Easi Cc: Omar

[PATCH V3 2/8] scsi: megaraid_sas: fix selection of reply queue

2018-02-27 Thread Ming Lei
From 84676c1f21 (genirq/affinity: assign vectors to all possible CPUs), one msix vector can be created without any online CPU mapped, then a command may be queued, and won't be notified after its completion. This patch sets up the mapping between cpu and reply queue according to irq affinity info

[PATCH V3 1/8] scsi: hpsa: fix selection of reply queue

2018-02-27 Thread Ming Lei
From 84676c1f21 (genirq/affinity: assign vectors to all possible CPUs), one msix vector can be created without any online CPU mapped, then one command's completion may not be notified. This patch sets up the mapping between cpu and reply queue according to irq affinity info retrieved by

[PATCH V3 0/8] blk-mq & scsi: fix reply queue selection and improve host wide tagset

2018-02-27 Thread Ming Lei
Hi All, The first two patches fix reply queue selection, and this issue has been reported and can cause IO hang during booting, please consider the two for V4.16. The following 6 patches try to improve hostwide tagset on hpsa and megaraid_sas by making hw queue per NUMA node. I don't have

Re: [PATCH V15 06/22] mmc: block: Add blk-mq support

2018-02-27 Thread Adrian Hunter
On 26/02/18 23:48, Dmitry Osipenko wrote: > On 22.02.2018 20:54, Dmitry Osipenko wrote: >> On 22.02.2018 10:42, Adrian Hunter wrote: >>> On 21/02/18 22:50, Dmitry Osipenko wrote: On 29.11.2017 16:41, Adrian Hunter wrote: > Define and use a blk-mq queue. Discards and flushes are processed

Re: [PATCH V15 06/22] mmc: block: Add blk-mq support

2018-02-27 Thread Linus Walleij
On Mon, Feb 26, 2018 at 10:48 PM, Dmitry Osipenko wrote: > On 22.02.2018 20:54, Dmitry Osipenko wrote: >> On 22.02.2018 10:42, Adrian Hunter wrote: >>> SDIO (unless it is a combo card) should be unaffected by changes to the >>> block driver. > > I don't know whether it's a