Re: [PATCH v2] bio: limit bio max size.

2021-01-21 Thread Ming Lei
On Thu, Jan 21, 2021 at 09:58:03AM +0900, Changheun Lee wrote: > bio size can grow up to 4GB when multi-page bvec is enabled. > but sometimes it would lead to inefficient behaviors. > in case of large chunk direct I/O, - 32MB chunk read in user space - > all pages for 32MB would be merged to a bio

Re: [PATCH v4 01/21] ibmvfc: add vhost fields and defaults for MQ enablement

2021-01-14 Thread Ming Lei
On Thu, Jan 14, 2021 at 11:24:35AM -0600, Brian King wrote: > On 1/13/21 7:27 PM, Ming Lei wrote: > > On Wed, Jan 13, 2021 at 11:13:07AM -0600, Brian King wrote: > >> On 1/12/21 6:33 PM, Tyrel Datwyler wrote: > >>> On 1/12/21 2:54 PM, Brian King wrote:

Re: [PATCH] bio: limit bio max size.

2021-01-13 Thread Ming Lei
On Wed, Jan 13, 2021 at 12:02:44PM +, Damien Le Moal wrote: > On 2021/01/13 20:48, Ming Lei wrote: > > On Wed, Jan 13, 2021 at 11:16:11AM +, Damien Le Moal wrote: > >> On 2021/01/13 19:25, Ming Lei wrote: > >>> On Wed, Jan 13, 2021 at 09:28:02AM +, Damien

Re: [PATCH v4 01/21] ibmvfc: add vhost fields and defaults for MQ enablement

2021-01-13 Thread Ming Lei
On Wed, Jan 13, 2021 at 11:13:07AM -0600, Brian King wrote: > On 1/12/21 6:33 PM, Tyrel Datwyler wrote: > > On 1/12/21 2:54 PM, Brian King wrote: > >> On 1/11/21 5:12 PM, Tyrel Datwyler wrote: > >>> Introduce several new vhost fields for managing MQ state of the adapter > >>> as well as initial

Re: [PATCH] bio: limit bio max size.

2021-01-13 Thread Ming Lei
On Wed, Jan 13, 2021 at 11:16:11AM +, Damien Le Moal wrote: > On 2021/01/13 19:25, Ming Lei wrote: > > On Wed, Jan 13, 2021 at 09:28:02AM +, Damien Le Moal wrote: > >> On 2021/01/13 18:19, Ming Lei wrote: > >>> On Wed, Jan 13, 2021 at 12:09

Re: [PATCH] bio: limit bio max size.

2021-01-13 Thread Ming Lei
On Wed, Jan 13, 2021 at 09:28:02AM +, Damien Le Moal wrote: > On 2021/01/13 18:19, Ming Lei wrote: > > On Wed, Jan 13, 2021 at 12:09 PM Changheun Lee > > wrote: > >> > >>> On 2021/01/12 21:14, Changheun Lee wrote: > >>>>> On 2021/01/12

Re: Re: [PATCH] bio: limit bio max size.

2021-01-13 Thread Ming Lei
So what is the actual total > >latency > >difference for the entire 32MB user IO ? That is I think what needs to be > >compared here. > > > >Also, what is your device max_sectors_kb and max queue depth ? > > > > 32MB total latency is about 19ms including merge time without this patch. > But with this patch, total latency is about 17ms including merge time too. 19ms looks too big just for preparing one 32MB sized bio, which isn't supposed to take so long. Can you investigate where the 19ms is taken just for preparing one 32MB sized bio? It might be iov_iter_get_pages() for handling page fault. If yes, one suggestion is to enable THP(Transparent HugePage Support) in your application. -- Ming Lei
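As a rough illustration of the THP suggestion above, an application could allocate its direct I/O buffer aligned to the huge page size and ask for huge page backing with madvise(). This is only a sketch: alloc_thp_buffer() and HUGE_PAGE_SIZE are hypothetical names, the 2 MiB size is an assumption (typical for x86-64, not universal), and whether huge pages are actually used depends on the system's THP configuration.

```c
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>

/* Assumed huge page size; 2 MiB is typical on x86-64 but not universal. */
#define HUGE_PAGE_SIZE ((size_t)2 << 20)

/*
 * Hypothetical helper: allocate a buffer aligned to HUGE_PAGE_SIZE and
 * hint that it should be backed by transparent huge pages.
 * Returns NULL on allocation failure. Release the buffer with free().
 */
void *alloc_thp_buffer(size_t len)
{
	void *buf = NULL;

	/* Round the length up to a whole number of huge pages. */
	len = (len + HUGE_PAGE_SIZE - 1) & ~(HUGE_PAGE_SIZE - 1);

	if (posix_memalign(&buf, HUGE_PAGE_SIZE, len) != 0)
		return NULL;

#ifdef MADV_HUGEPAGE
	/* Advisory only: this can fail when THP is disabled, so ignore errors. */
	(void)madvise(buf, len, MADV_HUGEPAGE);
#endif
	return buf;
}
```

With fewer, larger pages behind the buffer, pinning 32MB of user memory should need far fewer page faults, which is the cost suspected in the message above.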

Re: [percpu_ref] 2b0d3d3e4f: reaim.jobs_per_min -18.4% regression

2021-01-11 Thread Ming Lei
On Sun, Jan 10, 2021 at 10:32:47PM +0800, kernel test robot wrote: > > Greeting, > > FYI, we noticed a -18.4% regression of reaim.jobs_per_min due to commit: > > > commit: 2b0d3d3e4fcfb19d10f9a82910b8f0f05c56ee3e ("percpu_ref: reduce memory > footprint of percpu_ref in fast path") >

Re: [PATCH v3 7/7] bio: don't copy bvec for direct IO

2021-01-10 Thread Ming Lei
g(bio, BIO_WORKINGSET); > diff --git a/include/linux/bio.h b/include/linux/bio.h > index d8f9077c43ef..1d30572a8c53 100644 > --- a/include/linux/bio.h > +++ b/include/linux/bio.h > @@ -444,10 +444,13 @@ static inline void bio_wouldblock_error(struct bio *bio) > > /* > * Calculate number of bvec segments that should be allocated to fit data > - * pointed by @iter. > + * pointed by @iter. If @iter is backed by bvec it's going to be reused > + * instead of allocating a new one. > */ > static inline int bio_iov_vecs_to_alloc(struct iov_iter *iter, int max_segs) > { > + if (iov_iter_is_bvec(iter)) > + return 0; > return iov_iter_npages(iter, max_segs); > } > > -- > 2.24.0 > Reviewed-by: Ming Lei -- Ming
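The behaviour of the patched helper can be modelled in user space as below. This is a sketch, not kernel code: struct model_iter and both functions are hypothetical stand-ins for struct iov_iter, iov_iter_npages() and bio_iov_vecs_to_alloc(), keeping only the fields the logic needs.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct iov_iter: only what the sketch needs. */
struct model_iter {
	bool is_bvec;	/* iterator is already backed by a bvec array */
	int npages;	/* pages spanned by the remaining data */
};

/* Models iov_iter_npages(): page count, capped at max_segs. */
int model_iter_npages(const struct model_iter *iter, int max_segs)
{
	return iter->npages < max_segs ? iter->npages : max_segs;
}

/*
 * Models the patched helper: a bvec-backed iterator is reused as-is,
 * so no new bvec segments have to be allocated for it.
 */
int model_vecs_to_alloc(const struct model_iter *iter, int max_segs)
{
	if (iter->is_bvec)
		return 0;
	return model_iter_npages(iter, max_segs);
}
```

In this model a bvec-backed iterator always reports zero segments to allocate, while any other iterator falls back to the capped page count.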

Re: [PATCH v3 6/7] bio: add a helper calculating nr segments to alloc

2021-01-10 Thread Ming Lei
lk_types.h */ > #include > +#include > > #define BIO_DEBUG > > @@ -441,6 +442,15 @@ static inline void bio_wouldblock_error(struct bio *bio) > bio_endio(bio); > } > > +/* > + * Calculate number of bvec segments that should be allocated to fit data > + * pointed by @iter. > + */ > +static inline int bio_iov_vecs_to_alloc(struct iov_iter *iter, int max_segs) > +{ > + return iov_iter_npages(iter, max_segs); > +} > + > struct request_queue; > > extern int submit_bio_wait(struct bio *bio); > -- > 2.24.0 > Reviewed-by: Ming Lei -- Ming

Re: [PATCH v3 5/7] iov_iter: optimise bvec iov_iter_advance()

2021-01-10 Thread Ming Lei
v_iter_is_bvec(i)) { > + iov_iter_bvec_advance(i, size); > + return; > + } > iterate_and_advance(i, size, v, 0, 0, 0) > } > EXPORT_SYMBOL(iov_iter_advance); > -- > 2.24.0 > Reviewed-by: Ming Lei -- Ming

Re: [PATCH v3 4/7] target/file: allocate the bvec array as part of struct target_core_file_cmd

2021-01-10 Thread Ming Lei
r_bvec(&iter, is_write, aio_cmd->bvecs, sgl_nents, len); > > aio_cmd->cmd = cmd; > aio_cmd->len = len; > @@ -307,8 +301,6 @@ fd_execute_rw_aio(struct se_cmd *cmd, struct scatterlist > *sgl, u32 sgl_nents, > else > ret = call_read_iter(file, &aio_cmd->iocb, &iter); > > - kfree(bvec); > - > if (ret != -EIOCBQUEUED) > cmd_rw_aio_complete(&aio_cmd->iocb, ret, 0); > > -- > 2.24.0 > Reviewed-by: Ming Lei -- Ming

Re: [PATCH v3 3/7] block/psi: remove PSI annotations from direct IO

2021-01-10 Thread Ming Lei
ct-io.c > @@ -426,6 +426,8 @@ static inline void dio_bio_submit(struct dio *dio, struct > dio_submit *sdio) > unsigned long flags; > > bio->bi_private = dio; > + /* don't account direct I/O as memory stall */ > + bio_clear_flag(bio, BIO_WORKINGSET); > > spin_lock_irqsave(&dio->bio_lock, flags); > dio->refcount++; > -- > 2.24.0 > Reviewed-by: Ming Lei -- Ming

Re: [PATCH v3 2/7] bvec/iter: disallow zero-length segment bvecs

2021-01-10 Thread Ming Lei
ontinue; \ > (void)(STEP); \ > } \ > } > -- > 2.24.0 > Reviewed-by: Ming Lei -- Ming

Re: [PATCH v3 1/7] splice: don't generate zero-len segement bvecs

2021-01-10 Thread Ming Lei
ipe, buf); > if (unlikely(ret)) { > @@ -680,6 +682,7 @@ iter_file_splice_write(struct pipe_inode_info *pipe, > struct file *out, > array[n].bv_len = this_len; > array[n].bv_offset = buf->offset; > left

Re: [RFC PATCH] fs: block_dev: compute nr_vecs hint for improving writeback bvecs allocation

2021-01-08 Thread Ming Lei
On Thu, Jan 07, 2021 at 09:21:11AM +1100, Dave Chinner wrote: > On Wed, Jan 06, 2021 at 04:45:48PM +0800, Ming Lei wrote: > > On Tue, Jan 05, 2021 at 07:39:38PM +0100, Christoph Hellwig wrote: > > > At least for iomap I think this is the wrong approach. Betwe

Re: [RFC PATCH] fs: block_dev: compute nr_vecs hint for improving writeback bvecs allocation

2021-01-06 Thread Ming Lei
On Tue, Jan 05, 2021 at 07:39:38PM +0100, Christoph Hellwig wrote: > At least for iomap I think this is the wrong approach. Between the > iomap and writeback_control we know the maximum size of the writeback > request and can just use that. I think writeback_control can tell us nothing about max

[RFC PATCH] fs: block_dev: compute nr_vecs hint for improving writeback bvecs allocation

2021-01-05 Thread Ming Lei
b_vcnt.bt Cc: Alexander Viro Cc: Darrick J. Wong Cc: linux-...@vger.kernel.org Cc: linux-fsde...@vger.kernel.org Signed-off-by: Ming Lei --- fs/block_dev.c| 1 + fs/iomap/buffered-io.c| 13 + include/linux/bio.h | 2 -- include/linux/blk

Re: [PATCH] fs/buffer: try to submit writeback bio in unit of page

2021-01-04 Thread Ming Lei
On Mon, Jan 04, 2021 at 09:44:15AM +0100, Christoph Hellwig wrote: > On Wed, Dec 30, 2020 at 08:08:15AM +0800, Ming Lei wrote: > > It is observed that __block_write_full_page() always submit bio with size > > of block size, > > which is often 512 bytes. > > > >

[PATCH 1/6] block: manage bio slab cache by xarray

2020-12-29 Thread Ming Lei
Manage the bio slab cache via xarray, using the slab cache size as the xarray index and storing the 'struct bio_slab' instance in the xarray. This simplifies the code a lot, and it is more readable than before. Signed-off-by: Ming Lei --- block/bio.c | 104

[PATCH 5/6] block: move three bvec helpers declaration into private helper

2020-12-29 Thread Ming Lei
bvec_alloc(), bvec_free() and bvec_nr_vecs() are only used inside block layer core functions, no need to declare them in public header. Signed-off-by: Ming Lei --- block/blk.h | 4 include/linux/bio.h | 3 --- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/block

[PATCH 6/6] bcache: don't pass BIOSET_NEED_BVECS for the 'bio_set' embedded in 'cache_set'

2020-12-29 Thread Ming Lei
This bioset is just for allocating bio only from bio_next_split, and it needn't bvecs, so remove the flag. Cc: linux-bca...@vger.kernel.org Cc: Coly Li Signed-off-by: Ming Lei --- drivers/md/bcache/super.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/md/bcache

[PATCH 4/6] block: set .bi_max_vecs as actual allocated vector number

2020-12-29 Thread Ming Lei
bvec_alloc() may allocate more bio vectors than requested, so set .bi_max_vecs as actual allocated vector number, instead of the requested number. This way can help fs build bigger bio because new bio often won't be allocated until the current one becomes full. Signed-off-by: Ming Lei
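The idea can be sketched in user space as follows. Everything here is a model under stated assumptions: the size classes { 1, 4, 16, 64, 128, 256 } are illustrative and not taken from the patch, and model_bvec_alloc(), model_bio_init() and struct model_bio are hypothetical names standing in for the kernel's allocator and bio.

```c
#include <stddef.h>

/* Illustrative allocation size classes; assumed, not read from the patch. */
static const unsigned short model_bvec_classes[] = { 1, 4, 16, 64, 128, 256 };
#define MODEL_NR_CLASSES \
	(sizeof(model_bvec_classes) / sizeof(model_bvec_classes[0]))

/* Hypothetical stand-in for struct bio: only the capacity field. */
struct model_bio {
	unsigned short bi_max_vecs;
};

/*
 * Models an allocator that rounds a request up to the next size class.
 * Returns the number of vectors actually allocated, or 0 if too large.
 */
unsigned short model_bvec_alloc(unsigned short nr_requested)
{
	for (size_t i = 0; i < MODEL_NR_CLASSES; i++)
		if (nr_requested <= model_bvec_classes[i])
			return model_bvec_classes[i];
	return 0;
}

/* Record the real capacity rather than the requested one. */
int model_bio_init(struct model_bio *bio, unsigned short nr_requested)
{
	unsigned short actual = model_bvec_alloc(nr_requested);

	if (!actual)
		return -1;
	bio->bi_max_vecs = actual;
	return 0;
}
```

Recording the rounded-up count means a caller that asked for 5 vectors can keep adding pages until all 16 allocated slots are used before a new bio is needed.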

[PATCH 2/6] block: don't pass BIOSET_NEED_BVECS for q->bio_split

2020-12-29 Thread Ming Lei
q->bio_split is only used by bio_split() for fast cloning bio, and no need to allocate bvecs, so remove this flag. Signed-off-by: Ming Lei --- block/blk-core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block/blk-core.c b/block/blk-core.c index 96e5fcd7f

[PATCH 0/6] block: improvement on bioset & bvec allocation

2020-12-29 Thread Ming Lei
Hello, All are bioset / bvec improvement, and most of them are quite straightforward. Ming Lei (6): block: manage bio slab cache by xarray block: don't pass BIOSET_NEED_BVECS for q->bio_split block: don't allocate inline bvecs if this bioset needn't bvecs block: set .bi_max_v

[PATCH 3/6] block: don't allocate inline bvecs if this bioset needn't bvecs

2020-12-29 Thread Ming Lei
The inline bvecs won't be used if user needn't bvecs by not passing BIOSET_NEED_BVECS, so don't allocate bvecs in this situation. Signed-off-by: Ming Lei --- block/bio.c | 11 +++ include/linux/bio.h | 1 + 2 files changed, 8 insertions(+), 4 deletions(-) diff --git a/block

[PATCH] fs/buffer: try to submit writeback bio in unit of page

2020-12-29 Thread Ming Lei
Cc: Christoph Hellwig Cc: Jens Axboe Signed-off-by: Ming Lei --- fs/buffer.c | 112 +--- 1 file changed, 90 insertions(+), 22 deletions(-) diff --git a/fs/buffer.c b/fs/buffer.c index 32647d2011df..6bcf9ce5d7f8 100644 --- a/fs/buffer.c +++ b/fs/buffe

Re: [PATCH 1/3] blk-mq: allow hardware queue to get more tag while sharing a tag set

2020-12-28 Thread Ming Lei
On Mon, Dec 28, 2020 at 05:02:50PM +0800, yukuai (C) wrote: > Hi > > On 2020/12/28 16:28, Ming Lei wrote: > > Another candidate solution may be to always return true from > > hctx_may_queue() > > for this kind of queue because queue_depth has provided fair allocation

Re: [PATCH 1/3] blk-mq: allow hardware queue to get more tag while sharing a tag set

2020-12-28 Thread Ming Lei
On Mon, Dec 28, 2020 at 09:56:15AM +0800, yukuai (C) wrote: > Hi, > > On 2020/12/27 19:58, Ming Lei wrote: > > Hi Yu Kuai, > > > > On Sat, Dec 26, 2020 at 06:28:06PM +0800, Yu Kuai wrote: > > > When sharing a tag set, if most disks are issuing small amount of

Re: [PATCH 1/3] blk-mq: allow hardware queue to get more tag while sharing a tag set

2020-12-27 Thread Ming Lei
Hi Yu Kuai, On Sat, Dec 26, 2020 at 06:28:06PM +0800, Yu Kuai wrote: > When sharing a tag set, if most disks are issuing small amount of IO, and > only a few is issuing a large amount of IO. Current approach is to limit > the max amount of tags a disk can get equally to the average of total >

Re: [RFC PATCH v2 2/2] blk-mq: Lockout tagset iter when freeing rqs

2020-12-22 Thread Ming Lei
On Tue, Dec 22, 2020 at 11:22:19AM +, John Garry wrote: > Resend without p...@codeaurora.org, which bounces for me > > On 22/12/2020 02:13, Bart Van Assche wrote: > > On 12/21/20 10:47 AM, John Garry wrote: > >> Yes, I agree, and I'm not sure what I wrote to give that impression. > >> > >>

Re: [RFC PATCH v2 2/2] blk-mq: Lockout tagset iter when freeing rqs

2020-12-17 Thread Ming Lei
On Thu, Dec 17, 2020 at 07:07:53PM +0800, John Garry wrote: > References to old IO sched requests are currently cleared from the > tagset when freeing those requests; switching elevator or changing > request queue depth is such a scenario in which this occurs. > > However, this does not stop the

Re: [PATCH] blktrace: fix 'BUG: sleeping function called from invalid context' in case of PREEMPT_RT

2020-12-15 Thread Ming Lei
On Mon, Dec 14, 2020 at 10:24:22AM -0500, Steven Rostedt wrote: > On Mon, 14 Dec 2020 10:22:17 +0800 > Ming Lei wrote: > > > trace_note_tsk() is called by __blk_add_trace(), which is covered by RCU > > read lock. > > So in case of PREEMPT_RT, warning of 'BUG: sl

Re: [PATCH v1 0/6] no-copy bvec

2020-12-15 Thread Ming Lei
On Tue, Dec 15, 2020 at 11:14:20AM +, Pavel Begunkov wrote: > On 15/12/2020 01:41, Ming Lei wrote: > > On Tue, Dec 15, 2020 at 12:20:19AM +, Pavel Begunkov wrote: > >> Instead of creating a full copy of iter->bvec into bio in direct I/O, > >> the patchset

Re: [PATCH v1 0/6] no-copy bvec

2020-12-14 Thread Ming Lei
On Tue, Dec 15, 2020 at 12:20:19AM +, Pavel Begunkov wrote: > Instead of creating a full copy of iter->bvec into bio in direct I/O, > the patchset makes use of the one provided. It changes semantics and > obliges users of asynchronous kiocb to track bvec lifetime, and [1/6] > converts the only

[PATCH] blktrace: fix 'BUG: sleeping function called from invalid context' in case of PREEMPT_RT

2020-12-13 Thread Ming Lei
into raw_spin_lock(). Cc: Christoph Hellwig Cc: Steven Rostedt Cc: Ingo Molnar Cc: linux-kernel@vger.kernel.org Signed-off-by: Ming Lei --- kernel/trace/blktrace.c | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c index

Re: KASAN: use-after-free Read in disk_part_iter_next

2020-12-13 Thread Ming Lei
On Fri, Dec 11, 2020 at 01:03:11PM -0800, syzbot wrote: > Hello, > > syzbot found the following issue on: > > HEAD commit:15ac8fdb Add linux-next specific files for 20201207 > git tree: linux-next > console output: https://syzkaller.appspot.com/x/log.txt?x=15d8ad3750 > kernel

Re: [PATCH] x86/apic/vector: Fix ordering in vector assignment

2020-12-10 Thread Ming Lei
> return 0; > + > + if (node != NUMA_NO_NODE) { > + /* Try the node mask */ > + if (!assign_vector_locked(irqd, cpumask_of_node(node))) > + return 0; > + } > + > /* Try the full online mask */ > return assign_vector_locked(irqd, cpu_online_mask); > } > Reviewed-by: Ming Lei Thanks, Ming

Re: [RFC PATCH] blk-mq: Clean up references when freeing rqs

2020-12-10 Thread Ming Lei
On Thu, Dec 10, 2020 at 10:44:54AM +, John Garry wrote: > Hi Ming, > > On 10/12/2020 02:07, Ming Lei wrote: > > > Apart from this, my concern is that we come with for a solution, but it's > > > a > > > complicated solution and may not b

Re: [PATCH] blk-mq-tag: make blk_mq_tag_busy() return void

2020-12-09 Thread Ming Lei
> - return false; > + return; > > - return __blk_mq_tag_busy(hctx); > + __blk_mq_tag_busy(hctx); The above can be simplified as: if (hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED) __blk_mq_tag_busy(hctx); Otherwise, looks fine: Reviewed-by: Ming Lei Thanks, Ming
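The simplified form suggested in the review can be modelled in user space like this. It is a sketch only: struct model_hctx, the flag value and the call counter are hypothetical, chosen so the control flow can be exercised outside the kernel.

```c
/* Hypothetical flag value; the real BLK_MQ_F_TAG_QUEUE_SHARED may differ. */
#define MODEL_F_TAG_QUEUE_SHARED (1U << 1)

/* Hypothetical stand-in for struct blk_mq_hw_ctx. */
struct model_hctx {
	unsigned int flags;
	unsigned int busy_calls;	/* counts slow-path calls, for illustration */
};

/* Stand-in for __blk_mq_tag_busy(): just records that it was called. */
void model_slow_tag_busy(struct model_hctx *hctx)
{
	hctx->busy_calls++;
}

/*
 * With a void return type there is nothing to propagate, so a single
 * positive test replaces the early-return form.
 */
void model_tag_busy(struct model_hctx *hctx)
{
	if (hctx->flags & MODEL_F_TAG_QUEUE_SHARED)
		model_slow_tag_busy(hctx);
}
```

The positive test keeps the fast path (tag set not shared) as a single branch with no call.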

Re: [RFC PATCH] blk-mq: Clean up references when freeing rqs

2020-12-09 Thread Ming Lei
On Wed, Dec 09, 2020 at 09:55:30AM +, John Garry wrote: > On 09/12/2020 01:01, Ming Lei wrote: > > blk_mq_queue_tag_busy_iter() can be run on another request queue just > > between one driver tag is allocated and updating the request map, so one > > extra request reference

Re: [RFC PATCH] blk-mq: Clean up references when freeing rqs

2020-12-08 Thread Ming Lei
On Tue, Dec 08, 2020 at 11:36:58AM +, John Garry wrote: > On 03/12/2020 09:26, John Garry wrote: > > On 03/12/2020 00:55, Ming Lei wrote: > > > > Hi Ming, > > > > > > Yeah, so I said that was another problem which you mentioned > > > > t

Re: [PATCH V2 0/3] blk-mq/nvme-loop: use nvme-loop's lock class for addressing lockdep false positive warning

2020-12-07 Thread Ming Lei
On Thu, Dec 03, 2020 at 09:26:35AM +0800, Ming Lei wrote: > Hi, > > Qian reported there is hang during booting when shared host tagset is > introduced on megaraid sas. Sumit reported the whole SCSI probe takes > about ~45min in his test. > > Turns out it is caused by n

Re: [RFC PATCH] blk-mq: Clean up references when freeing rqs

2020-12-02 Thread Ming Lei
On Wed, Dec 02, 2020 at 11:18:31AM +, John Garry wrote: > On 02/12/2020 03:31, Ming Lei wrote: > > On Tue, Dec 01, 2020 at 09:02:18PM +0800, John Garry wrote: > > > It has been reported many times that a use-after-free can be > > > intermittently > > >

Re: [RFC PATCH] blk-mq: Clean up references when freeing rqs

2020-12-01 Thread Ming Lei
On Tue, Dec 01, 2020 at 09:02:18PM +0800, John Garry wrote: > It has been reported many times that a use-after-free can be intermittently > found when iterating busy requests: > > - > https://lore.kernel.org/linux-block/8376443a-ec1b-0cef-8244-ed584b96f...@huawei.com/ > - >

Re: [PATCH v2] blk-mq: Remove 'running from the wrong CPU' warning

2020-11-30 Thread Ming Lei
and the request is still processed > correctly, better remove the warning as this is the fast path. > > Suggested-by: Ming Lei > Signed-off-by: Daniel Wagner > --- > > v2: > - remove the warning as suggested by Ming > v1: > - initial version > > https:/

Re: [PATCH] blk-mq: Make running from the wrong CPU less scary

2020-11-26 Thread Ming Lei
On Thu, Nov 26, 2020 at 10:51:52AM +0100, Daniel Wagner wrote: > The current warning looks awfully like a proper crash. This is > confusing. There is not much information to be gained from the stack > trace anyway, let's drop it. > > While at it print the cpumask as there might be additional helpful >

Re: [PATCH v2 2/4] sbitmap: remove swap_lock

2020-11-26 Thread Ming Lei
On Thu, Nov 26, 2020 at 01:44:36PM +, Pavel Begunkov wrote: > On 26/11/2020 02:46, Ming Lei wrote: > > On Sun, Nov 22, 2020 at 03:35:46PM +, Pavel Begunkov wrote: > >> map->swap_lock protects map->cleared from concurrent modification, > >> however sb

Re: [PATCH v2 2/4] sbitmap: remove swap_lock

2020-11-25 Thread Ming Lei
On Sun, Nov 22, 2020 at 03:35:46PM +, Pavel Begunkov wrote: > map->swap_lock protects map->cleared from concurrent modification, > however sbitmap_deferred_clear() already atomically drains it, so > it's guaranteed to not lose bits on concurrent > sbitmap_deferred_clear(). > > A one

Re: [PATCH 5.11] block: optimise for_each_bvec() advance

2020-11-24 Thread Ming Lei
_advance((bio_vec), &(iter), \ > - (bvl).bv_len) : bvec_iter_skip_zero_bvec(&(iter))) > + bvec_iter_advance_single((bio_vec), &(iter), (bvl).bv_len)) > > /* for iterating one bio from start to end */ > #define BVEC_ITER_ALL_INIT (struct bvec_iter) > \ > -- > 2.24.0 > Looks fine, Reviewed-by: Ming Lei Thanks, Ming

Re: [PATCH v2 1/2] iov_iter: optimise iov_iter_npages for bvec

2020-11-19 Thread Ming Lei
On Fri, Nov 20, 2020 at 02:06:10AM +, Matthew Wilcox wrote: > On Fri, Nov 20, 2020 at 01:56:22AM +, Pavel Begunkov wrote: > > On 20/11/2020 01:49, Matthew Wilcox wrote: > > > On Fri, Nov 20, 2020 at 01:39:05AM +, Pavel Begunkov wrote: > > >> On 20/11/2020 01:20, Matthew Wilcox wrote: >

Re: [PATCH v2 1/2] iov_iter: optimise iov_iter_npages for bvec

2020-11-19 Thread Ming Lei
On Fri, Nov 20, 2020 at 01:39:05AM +, Pavel Begunkov wrote: > On 20/11/2020 01:20, Matthew Wilcox wrote: > > On Thu, Nov 19, 2020 at 11:24:38PM +, Pavel Begunkov wrote: > >> The block layer spends quite a while in iov_iter_npages(), but for the > >> bvec case the number of pages is already

Re: [PATCH] iosched: Add i10 I/O Scheduler

2020-11-16 Thread Ming Lei
On Fri, Nov 13, 2020 at 01:36:16PM -0800, Sagi Grimberg wrote: > > > > But if you think this has a better home, I'm assuming that the guys > > > will be open to that. > > > > Also see the reply from Ming. It's a balancing act - don't want to add > > extra overhead to the core, but also don't

Re: [PATCH] iosched: Add i10 I/O Scheduler

2020-11-13 Thread Ming Lei
Hello, On Thu, Nov 12, 2020 at 09:07:52AM -0500, Rachit Agarwal wrote: > From: Rachit Agarwal > > > Hi All, > > I/O batching is beneficial for optimizing IOPS and throughput for various > applications. For instance, several kernel block drivers would benefit from > batching, > including mmc

Re: [PATCH v8 17/18] scsi: megaraid_sas: Added support for shared host tagset for cpuhotplug

2020-11-11 Thread Ming Lei
On Wed, Nov 11, 2020 at 09:42:17AM -0500, Qian Cai wrote: > On Wed, 2020-11-11 at 17:27 +0800, Ming Lei wrote: > > Can this issue disappear by applying the following change? > > This makes the system boot again as well. OK, actually it isn't necessary to register one new lock ke

Re: [PATCH v8 17/18] scsi: megaraid_sas: Added support for shared host tagset for cpuhotplug

2020-11-11 Thread Ming Lei
On Wed, Nov 11, 2020 at 12:57:59PM +0530, Sumit Saxena wrote: > On Tue, Nov 10, 2020 at 11:12 PM John Garry wrote: > > > > On 09/11/2020 14:05, John Garry wrote: > > > On 09/11/2020 13:39, Qian Cai wrote: > > >>> I suppose I could try do this myself also, but an authentic version > > >>> would be

Re: 5.10 tree fails to build

2020-11-09 Thread Ming Lei
. GCC: gcc version 10.2.1 20200826 (Red Hat 10.2.1-3) (GCC) -- Ming Lei

Re: INFO: task can't die in nbd_ioctl

2020-11-02 Thread Ming Lei
50 > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=15bf75b850 I could not reproduce this issue with the above C reproducer and the kernel config after hours of running on Linus' tree. Thanks, Ming Lei

Re: [PATCH] blk-mq: move cancel of hctx->run_work to the front of blk_exit_queue

2020-10-09 Thread Ming Lei
+ if (queue_is_mq(q)) { > + struct blk_mq_hw_ctx *hctx; > + int i; > + > cancel_delayed_work_sync(&q->requeue_work); > > + queue_for_each_hw_ctx(q, hctx, i) > + cancel_delayed_work_sync(&hctx->run_work); > + } > + Looks fine: Reviewed-by: Ming Lei Thanks, Ming

Re: general protection fault in percpu_ref_exit

2020-10-08 Thread Ming Lei
On Thu, Oct 08, 2020 at 07:23:02PM -0600, Jens Axboe wrote: > On 10/8/20 2:28 PM, syzbot wrote: > > syzbot has bisected this issue to: > > > > commit 2b0d3d3e4fcfb19d10f9a82910b8f0f05c56ee3e > > Author: Ming Lei > > Date: Thu Oct 1 15:48:41 2020 + >

Re: [PATCH V7 0/2] percpu_ref & block: reduce memory footprint of percpu_ref in fast path

2020-10-06 Thread Ming Lei
On Thu, Oct 01, 2020 at 11:48:40PM +0800, Ming Lei wrote: > Hi, > > The 1st patch removes memory footprint of percpu_ref in fast path > from 7 words to 2 words, since it is often used in fast path and > embedded in user struct. > > The 2nd patch moves .q_usage_cou

[PATCH V7 2/2] block: move 'q_usage_counter' into front of 'request_queue'

2020-10-01 Thread Ming Lei
: Christoph Hellwig Cc: Jens Axboe Cc: Bart Van Assche Signed-off-by: Ming Lei --- include/linux/blkdev.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index d5a3e1a4c2f7..67935b3bef6c 100644 --- a/include/linux/blkdev.h

[PATCH V7 0/2] percpu_ref & block: reduce memory footprint of percpu_ref in fast path

2020-10-01 Thread Ming Lei
V2: - pass 'gfp' to kzalloc() for fixing block/027 failure reported by kernel test robot - protect percpu_ref_is_zero() with destroying percpu-refcount by spin lock Ming Lei (2): percpu_ref: reduce memory footprint of percpu_ref in fast path block: m

[PATCH V7 1/2] percpu_ref: reduce memory footprint of percpu_ref in fast path

2020-10-01 Thread Ming Lei
-by: Ming Lei --- drivers/infiniband/sw/rdmavt/mr.c | 2 +- include/linux/percpu-refcount.h | 52 ++-- lib/percpu-refcount.c | 131 ++ 3 files changed, 123 insertions(+), 62 deletions(-) diff --git a/drivers/infiniband/sw/rdmavt/mr.c b/drivers

Re: [PATCH V6 1/2] percpu_ref: reduce memory footprint of percpu_ref in fast path

2020-10-01 Thread Ming Lei
On Wed, Sep 30, 2020 at 12:00:15PM -0400, Tejun Heo wrote: > On Wed, Sep 30, 2020 at 04:26:56PM +0800, Ming Lei wrote: > > diff --git a/include/linux/percpu-refcount.h > > b/include/linux/percpu-refcount.h > > index 87d8a38bdea1..1d6ed9ca23dd 100644 > > --- a/incl

[PATCH V6 1/2] percpu_ref: reduce memory footprint of percpu_ref in fast path

2020-09-30 Thread Ming Lei
-by: Ming Lei --- drivers/infiniband/sw/rdmavt/mr.c | 2 +- include/linux/percpu-refcount.h | 45 -- lib/percpu-refcount.c | 131 ++ 3 files changed, 116 insertions(+), 62 deletions(-) diff --git a/drivers/infiniband/sw/rdmavt/mr.c b/drivers

[PATCH V6 2/2] block: move 'q_usage_counter' into front of 'request_queue'

2020-09-30 Thread Ming Lei
: Christoph Hellwig Cc: Jens Axboe Cc: Bart Van Assche Signed-off-by: Ming Lei --- include/linux/blkdev.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index d5a3e1a4c2f7..67935b3bef6c 100644 --- a/include/linux/blkdev.h

[PATCH V6 0/2] percpu_ref & block: reduce memory footprint of percpu_ref in fast path

2020-09-30 Thread Ming Lei
027 failure reported by kernel test robot - protect percpu_ref_is_zero() with destroying percpu-refcount by spin lock Ming Lei (2): percpu_ref: reduce memory footprint of percpu_ref in fast path block: move 'q_usage_counter' into front of 'request_queue' drivers/infinib

[PATCH V5 3/3] block: move 'q_usage_counter' into front of 'request_queue'

2020-09-27 Thread Ming Lei
: Christoph Hellwig Cc: Jens Axboe Cc: Bart Van Assche Signed-off-by: Ming Lei --- include/linux/blkdev.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index d5a3e1a4c2f7..67935b3bef6c 100644 --- a/include/linux/blkdev.h

[PATCH V5 2/3] percpu_ref: reduce memory footprint of percpu_ref in fast path

2020-09-27 Thread Ming Lei
-by: Ming Lei --- drivers/infiniband/sw/rdmavt/mr.c | 2 +- include/linux/percpu-refcount.h | 45 -- lib/percpu-refcount.c | 131 ++ 3 files changed, 116 insertions(+), 62 deletions(-) diff --git a/drivers/infiniband/sw/rdmavt/mr.c b/drivers

[PATCH V5 1/3] percpu_ref: add percpu_ref_is_initialized for MD

2020-09-27 Thread Ming Lei
...@vger.kernel.org Cc: Sagi Grimberg Cc: Tejun Heo Cc: Christoph Hellwig Cc: Jens Axboe Cc: Bart Van Assche Signed-off-by: Ming Lei --- drivers/md/md.c | 2 +- include/linux/percpu-refcount.h | 1 + lib/percpu-refcount.c | 6 ++ 3 files changed, 8 insertions(+), 1

[PATCH V5 0/3] percpu_ref & block: reduce memory footprint of percpu_ref in fast path

2020-09-27 Thread Ming Lei
unt by spin lock Ming Lei (3): percpu_ref: add percpu_ref_is_initialized for MD percpu_ref: reduce memory footprint of percpu_ref in fast path block: move 'q_usage_counter' into front of 'request_queue' drivers/infiniband/sw/rdmavt/mr.c | 2 +- drivers/md/md.c | 2 +- incl

Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-25 Thread Ming Lei
m_cache *cachep) > @@ -3402,9 +3406,9 @@ static void cache_flusharray(struct kmem_cache *cachep, > struct array_cache *ac) > } > #endif > spin_unlock(&n->list_lock); > - slabs_destroy(cachep, &list); > ac->avail -= batchcount; > memmove(ac->entry, &(ac->entry[batchcount]), sizeof(void *)*ac->avail); > + slabs_destroy(cachep, &list); > } The issue can't be reproduced after applying this patch: Tested-by: Ming Lei Thanks, Ming

Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-25 Thread Ming Lei
On Fri, Sep 25, 2020 at 03:31:45PM +0800, Ming Lei wrote: > On Thu, Sep 24, 2020 at 09:13:11PM -0400, Theodore Y. Ts'o wrote: > > On Thu, Sep 24, 2020 at 10:33:45AM -0400, Theodore Y. Ts'o wrote: > > > HOWEVER, thanks to a hint from a colleague at $WORK, and realizing > >

Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-25 Thread Ming Lei
On Thu, Sep 24, 2020 at 09:13:11PM -0400, Theodore Y. Ts'o wrote: > On Thu, Sep 24, 2020 at 10:33:45AM -0400, Theodore Y. Ts'o wrote: > > HOWEVER, thanks to a hint from a colleague at $WORK, and realizing > > that one of the stack traces had virtio balloon in the trace, I > > realized that when I

Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-24 Thread Ming Lei
On Fri, Sep 25, 2020 at 09:14:16AM +0800, Ming Lei wrote: > On Thu, Sep 24, 2020 at 10:33:45AM -0400, Theodore Y. Ts'o wrote: > > On Thu, Sep 24, 2020 at 08:59:01AM +0800, Ming Lei wrote: > > > > > > The list corruption issue can be reproduced on kvm/qemu guest too

Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-24 Thread Ming Lei
On Thu, Sep 24, 2020 at 10:33:45AM -0400, Theodore Y. Ts'o wrote: > On Thu, Sep 24, 2020 at 08:59:01AM +0800, Ming Lei wrote: > > > > The list corruption issue can be reproduced on kvm/qemu guest too when > > running xfstests(ext4) generic/038. > > >

Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-23 Thread Ming Lei
On Thu, Sep 17, 2020 at 10:30:12AM -0400, Theodore Y. Ts'o wrote: > On Thu, Sep 17, 2020 at 10:20:51AM +0800, Ming Lei wrote: > > > > Obviously there is other more serious issue, since 568f27006577 is > > completely reverted in your test, and you still see list corruption

Re: [PATCH] iomap: Fix the write_count in iomap_add_to_ioend().

2020-09-17 Thread Ming Lei
On Thu, Sep 17, 2020 at 09:04:55AM +0100, Christoph Hellwig wrote: > On Wed, Sep 16, 2020 at 09:07:14AM -0400, Brian Foster wrote: > > Dave described the main purpose earlier in this thread [1]. The initial > > motivation is that we've had downstream reports of soft lockup problems > > in

Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-17 Thread Ming Lei
On Thu, Sep 17, 2020 at 10:30:12AM -0400, Theodore Y. Ts'o wrote: > On Thu, Sep 17, 2020 at 10:20:51AM +0800, Ming Lei wrote: > > > > Obviously there is other more serious issue, since 568f27006577 is > > completely reverted in your test, and you still see list corruption

Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-16 Thread Ming Lei
On Wed, Sep 16, 2020 at 04:20:26PM -0400, Theodore Y. Ts'o wrote: > On Wed, Sep 16, 2020 at 07:09:41AM +0800, Ming Lei wrote: > > > The problem is it's a bit tricky to revert 568f27006577, since there > > > is a merge conflict in blk_kick_flush(). I attempted to do the bisec

Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-15 Thread Ming Lei
On Tue, Sep 15, 2020 at 06:45:41PM -0400, Theodore Y. Ts'o wrote: > On Tue, Sep 15, 2020 at 03:33:03PM +0800, Ming Lei wrote: > > Hi Theodore, > > > > On Tue, Sep 15, 2020 at 12:45:19AM -0400, Theodore Y. Ts'o wrote: > > > On Thu, Sep 03, 2020 at 11:55:28PM

Re: REGRESSION: 37f4a24c2469: blk-mq: centralise related handling into blk_mq_get_driver_tag

2020-09-15 Thread Ming Lei
Hi Theodore, On Tue, Sep 15, 2020 at 12:45:19AM -0400, Theodore Y. Ts'o wrote: > On Thu, Sep 03, 2020 at 11:55:28PM -0400, Theodore Y. Ts'o wrote: > > Worse, right now, -rc1 and -rc2 is causing random crashes in my > > gce-xfstests framework. Sometimes it happens before we've run even a > >

[PATCH V4 3/3] block: move 'q_usage_counter' into front of 'request_queue'

2020-09-09 Thread Ming Lei
: Christoph Hellwig Cc: Jens Axboe Cc: Bart Van Assche Signed-off-by: Ming Lei --- include/linux/blkdev.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 7d82959e7b86..7b1e53084799 100644 --- a/include/linux/blkdev.h

[PATCH V4 1/3] percpu_ref: add percpu_ref_is_initialized for MD

2020-09-09 Thread Ming Lei
...@vger.kernel.org Cc: Sagi Grimberg Cc: Tejun Heo Cc: Christoph Hellwig Cc: Jens Axboe Cc: Bart Van Assche Signed-off-by: Ming Lei --- drivers/md/md.c | 2 +- include/linux/percpu-refcount.h | 1 + lib/percpu-refcount.c | 6 ++ 3 files changed, 8 insertions(+), 1

[PATCH V4 0/3] percpu_ref & block: reduce memory footprint of percpu_ref in fast path

2020-09-09 Thread Ming Lei
Kabatova V2: - pass 'gfp' to kzalloc() for fixing block/027 failure reported by kernel test robot - protect percpu_ref_is_zero() with destroying percpu-refcount by spin lock Ming Lei (3): percpu_ref: add percpu_ref_is_initialized for MD percpu_ref: reduce

[PATCH V4 2/3] percpu_ref: reduce memory footprint of percpu_ref in fast path

2020-09-09 Thread Ming Lei
-by: Ming Lei --- drivers/infiniband/sw/rdmavt/mr.c | 2 +- include/linux/percpu-refcount.h | 45 -- lib/percpu-refcount.c | 131 ++ 3 files changed, 116 insertions(+), 62 deletions(-) diff --git a/drivers/infiniband/sw/rdmavt/mr.c b/drivers

Re: [RESEND PATCH 1/1] block: Set same_page to false in __bio_try_merge_page if ret is false

2020-09-08 Thread Ming Lei
*same_page = false; > return false; > + } > bv->bv_len += len; > bio->bi_iter.bi_size += len; > return true; Reviewed-by: Ming Lei -- Ming Lei

Re: [PATCH] Revert "block: revert back to synchronous request_queue removal"

2020-09-08 Thread Ming Lei
Hello Haifeng, On Wed, Sep 09, 2020 at 02:11:20AM +, Zhao, Haifeng wrote: > Ming, Christoph, > Could you point out the patch aimed to fix this issue ? I would like to > try it. This issue blocked my other PCI patch developing and verification > work, > I am not a BLOCK/NVMe expert,

Re: [PATCH] blk-mq: Fix refcounting leak in __blk_mq_register_dev()

2020-09-07 Thread Ming Lei
ueue *q) > > kobject_uevent(q->mq_kobj, KOBJ_REMOVE); > kobject_del(q->mq_kobj); > +out_kobj: > kobject_put(>kobj); > return ret; > } > -- > 2.28.0 > Looks good fix: Reviewed-by: Ming Lei -- Ming

[PATCH V2 2/2] block: move 'q_usage_counter' into front of 'request_queue'

2020-09-02 Thread Ming Lei
-off-by: Ming Lei --- include/linux/blkdev.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index d0d61bc81615..7575fa0aae6e 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -397,6 +397,8 @@ struct

[PATCH V2 1/2] percpu_ref: reduce memory footprint of percpu_ref in fast path

2020-09-02 Thread Ming Lei
(), then memory footprint of 'percpu_ref' in fast path is reduced a lot and becomes suitable to put into hot cacheline of user structure. Cc: Sagi Grimberg Cc: Tejun Heo Cc: Christoph Hellwig Cc: Jens Axboe Cc: Bart Van Assche Signed-off-by: Ming Lei --- drivers/infiniband/sw/rdmavt/mr.c | 2

[PATCH V2 0/2] percpu_ref & block: reduce memory footprint of percpu_ref in fast path

2020-09-02 Thread Ming Lei
(two threads per core) machine, dual socket/numa. V2: - pass 'gfp' to kzalloc() for fixing block/027 failure reported by kernel test robot - protect percpu_ref_is_zero() with destroying percpu-refcount by spin lock Ming Lei (2): percpu_ref: reduce memory footprint

Re: splice: infinite busy loop lockup bug

2020-08-31 Thread Ming Lei
left;\ > skip = __v.iov_len; \ > > and end up seeing overflows ("n" supposes to be less than PAGE_SIZE) before > the > soft-lockups and a dead system, > > [ 4300.249180][T470195] ITER_IOVEC left = 0, n = 48566423 > > Thoughts? Does the following patch make a difference for you? https://lore.kernel.org/linux-block/20200817100055.2495905-1-ming@redhat.com/ thanks, Ming Lei

Re: [PATCH] iomap: Fix the write_count in iomap_add_to_ioend().

2020-08-30 Thread Ming Lei
On Tue, Aug 25, 2020 at 10:49:17AM -0400, Brian Foster wrote: > cc Ming > > On Tue, Aug 25, 2020 at 10:42:03AM +1000, Dave Chinner wrote: > > On Mon, Aug 24, 2020 at 11:48:41AM -0400, Brian Foster wrote: > > > On Mon, Aug 24, 2020 at 04:04:17PM +0100, Christoph Hellwig wrote: > > > > On Mon, Aug

Re: [PATCH] [v2] blk-mq: use BLK_MQ_NO_TAG for no tag

2020-08-25 Thread Ming Lei
On Wed, Aug 26, 2020 at 10:06:51AM +0800, Xianting Tian wrote: > Replace various magic -1 constants for tags with BLK_MQ_NO_TAG. > And move the definition of BLK_MQ_NO_TAG from 'block/blk-mq-tag.h' > to 'include/linux/blk-mq.h' All three symbols are supposed for block core internal code only, so

[PATCH 2/2] block: move 'q_usage_counter' into front of 'request_queue'

2020-08-25 Thread Ming Lei
-off-by: Ming Lei --- include/linux/blkdev.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index bb5636cc17b9..d8dba550ecac 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -396,6 +396,8 @@ struct

[PATCH 0/2] percpu_ref & block: reduce memory footprint of percpu_ref in fast path

2020-08-25 Thread Ming Lei
(two threads per core) machine, dual socket/numa. Ming Lei (2): percpu_ref: reduce memory footprint of percpu_ref in fast path block: move 'q_usage_counter' into front of 'request_queue' drivers/infiniband/sw/rdmavt/mr.c | 2 +- include/linux/blkdev.h| 3 +- include/linux

[PATCH 1/2] percpu_ref: reduce memory footprint of percpu_ref in fast path

2020-08-25 Thread Ming Lei
(), then memory footprint of 'percpu_ref' in fast path is reduced a lot and becomes suitable to put into hot cacheline of user structure. Cc: Sagi Grimberg Cc: Tejun Heo Cc: Christoph Hellwig Cc: Jens Axboe Cc: Bart Van Assche Signed-off-by: Ming Lei --- drivers/infiniband/sw/rdmavt/mr.c | 2

Re: [PATCH] block: Fix page_is_mergeable() for compound pages

2020-08-17 Thread Ming Lei
me_page = ((vec_end_addr & PAGE_MASK) == page_addr); > - if (!*same_page && pfn_to_page(PFN_DOWN(vec_end_addr)) + 1 != page) > - return false; > - return true; > + if (*same_page) > + return true; > + return (bv->bv_page + bv_end / PAGE_SIZE) == (page + off / PAGE_SIZE); Looks this way is more straightforward, meantime can cover compound pages: Reviewed-by: Ming Lei Thanks, Ming
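
For readers following the compound-page discussion, here is a minimal user-space sketch of the old and new tail checks quoted in this reply. It is not the kernel code. Pages are modelled as plain page frame numbers in a linear memory map, so `struct vec_model`, `mergeable_old()` and `mergeable_new()` are hypothetical names. The physical-contiguity test at the top of each function is not visible in the quoted snippet and is assumed from the surrounding kernel function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 4096u
#define PAGE_MASK (~(uint64_t)(PAGE_SIZE - 1))

/* Hypothetical stand-in for struct bio_vec. A page is identified by its
 * page frame number, so "page + n" is just "pfn + n" and the physical
 * address of a page is pfn * PAGE_SIZE (linear model, an assumption). */
struct vec_model {
    uint64_t page;    /* pfn of the first page of the segment */
    uint32_t len;     /* segment length in bytes */
    uint32_t offset;  /* byte offset into the first page */
};

/* Old tail check: the page holding the last byte, plus one, must be the
 * page being added. Fails when 'page' is the head of a compound page. */
static bool mergeable_old(const struct vec_model *bv, uint64_t page,
                          uint32_t off, bool *same_page)
{
    uint64_t vec_end_addr = bv->page * PAGE_SIZE + bv->offset + bv->len - 1;
    uint64_t page_addr = page * PAGE_SIZE;

    assert(same_page);
    /* Assumed precondition: new data starts right after the segment. */
    if (vec_end_addr + 1 != page_addr + off)
        return false;
    *same_page = ((vec_end_addr & PAGE_MASK) == page_addr);
    if (!*same_page && vec_end_addr / PAGE_SIZE + 1 != page)
        return false;
    return true;
}

/* New tail check: compare the page reached from each side after
 * accounting for offsets that span more than one page. */
static bool mergeable_new(const struct vec_model *bv, uint64_t page,
                          uint32_t off, bool *same_page)
{
    uint64_t bv_end = (uint64_t)bv->offset + bv->len;
    uint64_t vec_end_addr = bv->page * PAGE_SIZE + bv_end - 1;
    uint64_t page_addr = page * PAGE_SIZE;

    assert(same_page);
    /* Assumed precondition: new data starts right after the segment. */
    if (vec_end_addr + 1 != page_addr + off)
        return false;
    *same_page = ((vec_end_addr & PAGE_MASK) == page_addr);
    if (*same_page)
        return true;
    return (bv->page + bv_end / PAGE_SIZE) == (page + off / PAGE_SIZE);
}
```

With an 8 KiB segment starting at page 0 and new data described as (page 0, offset 8192), which is how a compound page is addressed through its head page, the old check rejects the merge and the new one accepts it. In this linear model the final comparison always holds once the contiguity test passes. In the kernel, physically contiguous memory need not have adjacent `struct page` entries, which is presumably why the comparison is kept.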
