[PATCH RFC] btrfs: Simplify locking

2011-03-20 Thread Tejun Heo
-by: Tejun Heo t...@kernel.org --- fs/btrfs/Makefile|2 fs/btrfs/ctree.c | 16 +-- fs/btrfs/extent_io.c |3 fs/btrfs/extent_io.h | 12 -- fs/btrfs/locking.c | 233 --- fs/btrfs/locking.h | 43 +++-- 6 files changed, 48

Re: [PATCH RFC] btrfs: Simplify locking

2011-03-20 Thread Tejun Heo
a bit more cpu than SIMPLE but shows discernibly better throughput. I'm running SPIN again just in case but the result seems pretty consistent. Thanks. NOT-Signed-off-by: Tejun Heo t...@kernel.org --- fs/btrfs/locking.h|2 - include/linux/mutex.h |1 kernel/mutex.c| 58

Re: [PATCH RFC] btrfs: Simplify locking

2011-03-20 Thread Tejun Heo
On Sun, Mar 20, 2011 at 08:56:52PM +0100, Tejun Heo wrote: So, here's the patch to implement and use mutex_try_spin(), which applies the same owner spin logic to try locking. The result looks pretty good. I re-ran all three. DFL is the current custom locking. SIMPLE is with only

Re: [PATCH RFC] btrfs: Simplify locking

2011-03-21 Thread Tejun Heo
Hello, Chris. On Sun, Mar 20, 2011 at 08:10:51PM -0400, Chris Mason wrote: I went through a number of benchmarks with the explicit blocking/spinning code and back then it was still significantly faster than the adaptive spin. But, it is definitely worth doing these again, how many dbench

Re: [PATCH RFC] btrfs: Simplify locking

2011-03-21 Thread Tejun Heo
3092091 701.826 I'm running DFL again just in case but SIMPLE or SPIN seems to be a much better choice. Thanks. NOT-Signed-off-by: Tejun Heo t...@kernel.org --- fs/btrfs/locking.h |2 ++ 1 file changed, 2 insertions(+) Index: work/fs/btrfs/locking.h

Re: [PATCH RFC] btrfs: Simplify locking

2011-03-21 Thread Tejun Heo
On Mon, Mar 21, 2011 at 05:59:55PM +0100, Tejun Heo wrote: I'm running DFL again just in case but SIMPLE or SPIN seems to be a much better choice. Got 644.176 MB/sec, so yeah the custom locking is definitely worse than just using mutex. Thanks. -- tejun -- To unsubscribe from this list: send

Re: [PATCH RFC] btrfs: Simplify locking

2011-03-21 Thread Tejun Heo
Hello, On Mon, Mar 21, 2011 at 01:24:37PM -0400, Chris Mason wrote: Very interesting. Ok, I'll definitely rerun my benchmarks as well. I used dbench extensively during the initial tuning, but you're forcing the memory low in order to force IO. This case doesn't really hammer on the locks,

Re: [PATCH RFC] btrfs: Simplify locking

2011-03-23 Thread Tejun Heo
Hello, Chris. On Tue, Mar 22, 2011 at 07:13:09PM -0400, Chris Mason wrote: Ok, the impact of this is really interesting. If we have very short waits where there is no IO at all, this patch tends to lose. I ran with dbench 10 and got about 20% slower tput. But, if we do any IO at all it

[RFC PATCH] mutex: Apply adaptive spinning on mutex_trylock()

2011-03-23 Thread Tejun Heo
or not. Is this intended or the test got lost somehow? Thanks. NOT-Signed-off-by: Tejun Heo t...@kernel.org --- kernel/mutex.c | 98 +++-- 1 file changed, 61 insertions(+), 37 deletions(-) Index: work/kernel/mutex.c

Re: [RFC PATCH] mutex: Apply adaptive spinning on mutex_trylock()

2011-03-23 Thread Tejun Heo
On Wed, Mar 23, 2011 at 08:48:01AM -0700, Linus Torvalds wrote: On Wed, Mar 23, 2011 at 8:37 AM, Tejun Heo t...@kernel.org wrote: Currently, mutex_trylock() doesn't use adaptive spinning.  It tries just once.  I got curious whether using adaptive spinning on mutex_trylock() would

[PATCH 1/2] Subject: mutex: Separate out mutex_spin()

2011-03-24 Thread Tejun Heo
for using adaptive spinning in mutex_trylock() and doesn't cause any behavior change. Signed-off-by: Tejun Heo t...@kernel.org LKML-Reference: 20110323153727.gb12...@htj.dyndns.org Cc: Peter Zijlstra pet...@infradead.org Cc: Ingo Molnar mi...@redhat.com --- Here are split patches with SOB. Ingo

[PATCH 2/2] mutex: Apply adaptive spinning on mutex_trylock()

2011-03-24 Thread Tejun Heo
but it outperforms consistently. In general, using adaptive spinning on trylock makes sense as trylock failure usually leads to costly unlock-relock sequence. [1] http://article.gmane.org/gmane.comp.file-systems.btrfs/9658 Signed-off-by: Tejun Heo t...@kernel.org LKML-Reference: 20110323153727.gb12
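The idea discussed in this thread, a trylock that keeps retrying while the lock owner is still running instead of trying just once, can be sketched in user space as follows. This is only an illustrative model under stated assumptions, not the kernel patch: `struct toy_mutex`, `owner_on_cpu` and all `toy_*` helpers are hypothetical names, and the "owner is running" state is a flag that the caller maintains by hand.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space model; not the kernel's struct mutex. */
struct toy_mutex {
    atomic_bool locked;        /* true while some thread holds the lock */
    atomic_bool owner_on_cpu;  /* stand-in for "the owner is running" */
};

/* Plain trylock: a single attempt, no spinning. */
static bool toy_trylock_once(struct toy_mutex *m)
{
    bool expected = false;

    assert(m);
    if (!atomic_compare_exchange_strong(&m->locked, &expected, true))
        return false;
    atomic_store(&m->owner_on_cpu, true);
    return true;
}

/* Trylock that keeps retrying for as long as the owner appears to run. */
static bool toy_trylock_spin(struct toy_mutex *m)
{
    while (!toy_trylock_once(m)) {
        /* Owner is not running: it will not release soon, so give up. */
        if (!atomic_load(&m->owner_on_cpu))
            return false;
    }
    return true;
}

/* Models the owner being scheduled out while still holding the lock. */
static void toy_owner_sleeps(struct toy_mutex *m)
{
    atomic_store(&m->owner_on_cpu, false);
}

static void toy_unlock(struct toy_mutex *m)
{
    atomic_store(&m->owner_on_cpu, false);
    atomic_store(&m->locked, false);
}
```

In this model a failed `toy_trylock_spin()` means the owner has gone to sleep, which is the case where falling back to the slower unlock-relock sequence is presumably unavoidable anyway.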

Re: [PATCH 1/2] Subject: mutex: Separate out mutex_spin()

2011-03-24 Thread Tejun Heo
Ugh... Please drop the extra Subject: from subject before applying. Thanks. -- tejun -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html

[RFC PATCHSET] btrfs: Simplify extent_buffer locking

2011-03-24 Thread Tejun Heo
Hello, This is split patchset of the RFC patches[1] to simplify btrfs locking and contains the following three patches. 0001-btrfs-Cleanup-extent_buffer-lockdep-code.patch 0002-btrfs-Use-separate-lockdep-class-keys-for-different-.patch 0003-btrfs-Simplify-extent_buffer-locking.patch For more

[PATCH 2/3] btrfs: Use separate lockdep class keys for different roots

2011-03-24 Thread Tejun Heo
sets of keys according to the type of @root. Signed-off-by: Tejun Heo t...@kernel.org --- fs/btrfs/disk-io.c | 91 +-- fs/btrfs/disk-io.h | 10 -- fs/btrfs/extent-tree.c |2 +- fs/btrfs/volumes.c |2 +- 4 files changed, 73

[PATCH 1/3] btrfs: Cleanup extent_buffer lockdep code

2011-03-24 Thread Tejun Heo
btrfs_set_buffer_lockdep_class() should be dependent upon CONFIG_LOCKDEP instead of CONFIG_DEBUG_LOCK_ALLOC. Collect the related code into one place, use CONFIG_LOCKDEP instead and make some cosmetic changes. Signed-off-by: Tejun Heo t...@kernel.org --- fs/btrfs/disk-io.c | 22

[PATCH 3/3] btrfs: Simplify extent_buffer locking

2011-03-24 Thread Tejun Heo
. Signed-off-by: Tejun Heo t...@kernel.org Cc: Peter Zijlstra pet...@infradead.org Cc: Ingo Molnar mi...@redhat.com --- fs/btrfs/Makefile|2 +- fs/btrfs/ctree.c | 16 ++-- fs/btrfs/extent_io.c |3 +- fs/btrfs/extent_io.h | 12 +-- fs/btrfs/locking.c | 233

Re: [PATCH 2/2] mutex: Apply adaptive spinning on mutex_trylock()

2011-03-25 Thread Tejun Heo
Hello, Steven, Linus. On Thu, Mar 24, 2011 at 09:38:58PM -0700, Linus Torvalds wrote: On Thu, Mar 24, 2011 at 8:39 PM, Steven Rostedt rost...@goodmis.org wrote: But now, mutex_trylock(B) becomes a spinner too, and since the B's owner is running (spinning on A) it will spin as well waiting

Re: [PATCH 2/2] mutex: Apply adaptive spinning on mutex_trylock()

2011-03-25 Thread Tejun Heo
Hello, On Thu, Mar 24, 2011 at 10:41:51AM +0100, Tejun Heo wrote:

         USER   SYSTEM   SIRQ    CXTSW  THROUGHPUT
 SIMPLE 61107   354977    217  8099529  845.100 MB/sec
 SPIN   63140   364888    214  6840527  879.077 MB/sec

On various runs, the adaptive spinning trylock consistently posts

Re: [PATCH 2/2] mutex: Apply adaptive spinning on mutex_trylock()

2011-03-29 Thread Tejun Heo
Hello, guys. I've been running dbench 50 for a few days now and the result is, well, I don't know how to call it. The problem was that the original patch didn't do anything because x86 fastpath code didn't call into the generic slowpath at all. static inline int

Re: [PATCH 2/2] mutex: Apply adaptive spinning on mutex_trylock()

2011-03-29 Thread Tejun Heo
Here's the combined patch I was planning on testing but didn't get to (yet). It implements two things - hard limit on spin duration and early break if the owner also is spinning on a mutex. Thanks. Index: work1/include/linux/sched.h
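The two safeguards named in this message, a hard limit on spinning and an early break when the owner is itself spinning on a mutex, might look roughly like the user-space sketch below. Everything here is an assumption made for illustration: `struct toy_task`, `struct toy_owner_mutex`, the `toy_*` functions and the iteration-count limit (standing in for a limit on spin duration) are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a task; the flags are illustrative only. */
struct toy_task {
    atomic_bool on_cpu;    /* task is currently running */
    atomic_bool spinning;  /* task is busy-waiting on some mutex */
};

/* Hypothetical mutex that records its owner (NULL when unlocked). */
struct toy_owner_mutex {
    _Atomic(struct toy_task *) owner;
};

/*
 * Try to take the lock, retrying at most max_spins extra times.
 * Stops early if the owner is off-CPU or is itself spinning.
 * Assumes owner tasks outlive the call (no lifetime handling here).
 */
static bool toy_bounded_trylock(struct toy_owner_mutex *m,
                                struct toy_task *self,
                                unsigned int max_spins)
{
    bool acquired = false;

    assert(m && self);
    atomic_store(&self->spinning, true);
    for (unsigned int i = 0; i <= max_spins; i++) {
        struct toy_task *owner = NULL;

        if (atomic_compare_exchange_strong(&m->owner, &owner, self)) {
            acquired = true;
            break;
        }
        /* On failure, owner holds the current (non-NULL) holder. */
        if (!atomic_load(&owner->on_cpu) || atomic_load(&owner->spinning))
            break;
    }
    atomic_store(&self->spinning, false);
    return acquired;
}

static void toy_owner_unlock(struct toy_owner_mutex *m)
{
    atomic_store(&m->owner, (struct toy_task *)NULL);
}
```

The early break is what keeps two spinners from waiting on each other in this model: a task holding one lock while spinning on another is reported as spinning, so a third party gives up on it immediately instead of burning its whole spin budget.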

Re: [PATCH 2/2] mutex: Apply adaptive spinning on mutex_trylock()

2011-03-30 Thread Tejun Heo
Hey, Peter. On Tue, Mar 29, 2011 at 07:37:33PM +0200, Peter Zijlstra wrote: On Tue, 2011-03-29 at 19:09 +0200, Tejun Heo wrote: Here's the combined patch I was planning on testing but didn't get to (yet). It implements two things - hard limit on spin duration and early break if the owner

Re: [BUG REPORT] Kernel panic on 3.9.0-rc7-4-gbb33db7

2013-04-19 Thread Tejun Heo
On Thu, Apr 18, 2013 at 10:57:54PM -0700, Tejun Heo wrote: No wonder this thing crashes. Chris, can't the original bio carry bbio in bi_private and let end_bio_extent_readpage() free the bbio instead of abusing bi_bdev like this? BTW, I think it's a bit too late to fix this properly from

Re: [btrfs] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038

2014-02-08 Thread Tejun Heo
Hello, David, Fengguang, Chris. On Fri, Feb 07, 2014 at 01:13:06PM -0800, David Rientjes wrote: On Fri, 7 Feb 2014, Fengguang Wu wrote: On Fri, Feb 07, 2014 at 02:13:59AM -0800, David Rientjes wrote: On Fri, 7 Feb 2014, Fengguang Wu wrote: [1.625020] BTRFS: selftest: Running

Re: [PATCH 2/2] writeback: allow for dirty metadata accounting

2016-08-10 Thread Tejun Heo
Hello, Josef. On Tue, Aug 09, 2016 at 03:08:27PM -0400, Josef Bacik wrote: > Provide a mechanism for file systems to indicate how much dirty metadata they > are holding. This introduces a few things > > 1) Zone stats for dirty metadata, which is the same as the NR_FILE_DIRTY. > 2) WB stat for

Re: [PATCH 1/2] remove mapping from balance_dirty_pages*()

2016-08-10 Thread Tejun Heo
Josef Bacik <jba...@fb.com> Acked-by: Tejun Heo <t...@kernel.org> Thanks. -- tejun

Re: [PATCH 2/2] writeback: allow for dirty metadata accounting

2016-08-10 Thread Tejun Heo
Hello, Josef. On Wed, Aug 10, 2016 at 05:16:03PM -0400, Josef Bacik wrote: > > It bothers me a bit that sb's can actually be off bdi->sb_list while > > sb_list_lock is released. Can we make this explicit? e.g. keep > > separate bdi sb list for sb's pending metadata writeout (like b_dirty) > >

Re: GPF in __mark_inode_dirty due to locked_inode_to_wb_and_lock_list returning NULL

2016-07-13 Thread Tejun Heo
Hello, On Mon, Jul 04, 2016 at 04:15:35PM +0300, Nikolay Borisov wrote: > So the btrfs fs was created inside a loop device and mounted with -o loop. > Evidently from the oops it seems that this is the normal umount path, meaning > that no device hot plugging was in action. Unfortunately I don't

Re: GPF in __mark_inode_dirty due to locked_inode_to_wb_and_lock_list returning NULL

2016-07-01 Thread Tejun Heo
On Fri, Jul 01, 2016 at 12:00:50PM +0200, Jan Kara wrote: > Hello, > > On Thu 30-06-16 14:18:14, Nikolay Borisov wrote: > > In light of the discussion in https://patchwork.kernel.org/patch/9187411/ > > and > > the discussion at > > https://groups.google.com/forum/#!topic/syzkaller/XvxH3cBQ134

Re: [PATCH 3/5] writeback: add counters for metadata usage

2016-10-26 Thread Tejun Heo
Hello, Josef. On Wed, Oct 26, 2016 at 11:20:16AM -0400, Josef Bacik wrote: > > > @@ -3701,7 +3703,20 @@ static unsigned long > > > node_pagecache_reclaimable(struct pglist_data *pgdat) > > > if (unlikely(delta > nr_pagecache_reclaimable)) > > > delta = nr_pagecache_reclaimable; > > >

Re: [PATCH 4/5] writeback: introduce super_operations->write_metadata

2016-10-25 Thread Tejun Heo
into > their ->write_metadata callback. > > Signed-off-by: Josef Bacik <jba...@fb.com> > Reviewed-by: Jan Kara <j...@suse.cz> Reviewed-by: Tejun Heo <t...@kernel.org> > @@ -1491,6 +1516,7 @@ static long writeback_sb_inodes(struct super_block *sb, > unsig

Re: [PATCH 3/5] writeback: add counters for metadata usage

2016-10-25 Thread Tejun Heo
Hello, On Tue, Oct 25, 2016 at 02:41:42PM -0400, Josef Bacik wrote: > Btrfs has no bounds except memory on the amount of dirty memory that we have > in > use for metadata. Historically we have used a special inode so we could take > advantage of the balance_dirty_pages throttling that comes

Re: [PATCH 1/5] remove mapping from balance_dirty_pages*()

2016-10-25 Thread Tejun Heo
Josef Bacik <jba...@fb.com> > Reviewed-by: Jan Kara <j...@suse.cz> Acked-by: Tejun Heo <t...@kernel.org> Thanks. -- tejun

Re: [PATCH 2/5] writeback: convert WB_WRITTEN/WB_DIRITED counters to bytes

2016-10-25 Thread Tejun Heo
to count bytes written/dirtied, and allow the > metadata accounting stuff to change the counters as well. > > Signed-off-by: Josef Bacik <jba...@fb.com> Acked-by: Tejun Heo <t...@kernel.org> A small nit below. > @@ -2547,12 +2547,16 @@ void account_page_redirty

[PATCH 5/5] btrfs: ensure that metadata and flush are issued from the root cgroup

2017-10-10 Thread Tejun Heo
call the function during init; however, this serves as documentation and prevents possible future mistakes. If this isn't desirable, please feel free to drop the section. Signed-off-by: Tejun Heo <t...@kernel.org> Cc: Chris Mason <c...@fb.com> Cc: Josef Bacik <jba...@fb.com> -

[PATCH 1/5] blkcg: export blkcg_root_css

2017-10-10 Thread Tejun Heo
Export blkcg_root_css so that filesystem modules can use it. Signed-off-by: Tejun Heo <t...@kernel.org> --- block/blk-cgroup.c | 1 + 1 file changed, 1 insertion(+) diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c index d3f56ba..597a457 100644 --- a/block/blk-cgroup.c +++ b/blo

[PATCH 2/5] cgroup, writeback: replace SB_I_CGROUPWB with per-inode S_CGROUPWB

2017-10-10 Thread Tejun Heo
change is intended. v2: Use ext4_should_journal_data() as suggested by Jan. Signed-off-by: Tejun Heo <t...@kernel.org> Reviewed-by: Jan Kara <j...@suse.cz> Cc: Jens Axboe <ax...@kernel.dk> Cc: Chris Mason <c...@fb.com> Cc: Josef Bacik <jba...@fb.com> Cc: linux-btrfs

[PATCH 4/5] cgroup, buffer_head: implement submit_bh_blkcg_css()

2017-10-10 Thread Tejun Heo
Implement submit_bh_blkcg_css() which will be used to override cgroup membership on specific buffer_heads. v2: Reimplemented using create_bh_bio() as suggested by Jan. Signed-off-by: Tejun Heo <t...@kernel.org> Cc: Jan Kara <j...@suse.cz> Cc: Jens Axboe <ax...@kernel.dk>

[PATCHSET v2] cgroup, writeback, btrfs: make sure btrfs issues metadata IOs from the root cgroup

2017-10-10 Thread Tejun Heo
Hello, Changes from the last version are * blkcg_root_css exported to fix build breakage on modular btrfs. * Use ext4_should_journal_data() test instead of EXT4_MOUNT_JOURNAL_DATA. * Separated out create_bh_bio() and used it to implement submit_bh_blkcg_css() as suggested by Jan. btrfs

[PATCH v2 5/5] btrfs: ensure that metadata and flush are issued from the root cgroup

2017-10-10 Thread Tejun Heo
all. Signed-off-by: Tejun Heo <t...@kernel.org> Cc: Chris Mason <c...@fb.com> Cc: Josef Bacik <jba...@fb.com> --- fs/btrfs/check-integrity.c | 2 +- fs/btrfs/disk-io.c | 4 fs/btrfs/ioctl.c | 4 +++- 3 files changed, 8 insertions(+), 2 deletions(-) diff -

[PATCH 3/5] buffer_head: separate out create_bh_bio() from submit_bh_wbc()

2017-10-10 Thread Tejun Heo
handling into submit_bh_wbc() and similarly this will make adding more submit_bh variants straight-forward. This patch is pure refactoring and doesn't cause any functional changes. Signed-off-by: Tejun Heo <t...@kernel.org> Suggested-by: Jan Kara <j...@suse.cz> --- fs/b

[PATCH 3/3] btrfs: ensure that metadata and flush are issued from the root cgroup

2017-10-09 Thread Tejun Heo
call the function during init; however, this serves as documentation and prevents possible future mistakes. If this isn't desirable, please feel free to drop the section. Signed-off-by: Tejun Heo <t...@kernel.org> Cc: Chris Mason <c...@fb.com> Cc: Josef Bacik <jba...@fb.com> -

[PATCH 1/3] cgroup, writeback: replace SB_I_CGROUPWB with per-inode S_CGROUPWB

2017-10-09 Thread Tejun Heo
btree_inode which doesn't use btrfs_update_iflags() during initialization. This is an intended behavior change. Signed-off-by: Tejun Heo <t...@kernel.org> Cc: Jan Kara <j...@suse.cz> Cc: Jens Axboe <ax...@kernel.dk> Cc: Chris Mason <c...@fb.com> Cc: Josef Bacik <jba

[PATCH 2/3] cgroup, writeback: implement submit_bh_blkcg_css()

2017-10-09 Thread Tejun Heo
Add wbc->blkcg_css so that the blkcg_css association can be specified independently and implement submit_bh_blkcg_css() using it. This will be used to override cgroup membership on specific buffer_heads. Signed-off-by: Tejun Heo <t...@kernel.org> Cc: Jan Kara <j...@suse.cz> Cc:

[PATCHSET] cgroup, writeback, btrfs: make sure btrfs issues metadata IOs from the root cgroup

2017-10-09 Thread Tejun Heo
Hello, btrfs has different ways to issue metadata IOs and may end up issuing metadata or otherwise shared IOs from a non-root cgroup, which can lead to priority inversion and ineffective IO isolation. This patchset makes sure that btrfs issues all metadata and shared IOs from the root cgroup by

Re: [PATCH v2 5/5] btrfs: ensure that metadata and flush are issued from the root cgroup

2017-10-12 Thread Tejun Heo
On Wed, Oct 11, 2017 at 07:07:23PM +0200, David Sterba wrote: > The comment is useful, but the condition will be always true, so I don't > see the point. > > /* >* The btree_inode will be always in the root cgroup. The cgroup >* writeback can be enabled on regular inodes

[PATCH v3 5/5] btrfs: ensure that metadata and flush are issued from the root cgroup

2017-10-12 Thread Tejun Heo
. Signed-off-by: Tejun Heo <t...@kernel.org> Reviewed-by: Liu Bo <bo.li@oracle.com> Cc: David Sterba <dste...@suse.cz> Cc: Chris Mason <c...@fb.com> Cc: Josef Bacik <jba...@fb.com> --- fs/btrfs/check-integrity.c |2 +- fs/btrfs/disk-io.c

Re: [PATCHSET v2] cgroup, writeback, btrfs: make sure btrfs issues metadata IOs from the root cgroup

2017-11-29 Thread Tejun Heo
On Wed, Nov 29, 2017 at 09:03:30AM -0800, Tejun Heo wrote: > Hello, > > On Wed, Nov 29, 2017 at 05:56:08PM +0100, Jan Kara wrote: > > What has happened with this patch set? > > No idea. cc'ing Chris directly. Chris, if the patchset looks good, > can you please route the

Re: [PATCHSET v2] cgroup, writeback, btrfs: make sure btrfs issues metadata IOs from the root cgroup

2017-11-29 Thread Tejun Heo
Hello, On Wed, Nov 29, 2017 at 05:56:08PM +0100, Jan Kara wrote: > What has happened with this patch set? No idea. cc'ing Chris directly. Chris, if the patchset looks good, can you please route them through the btrfs tree? Thanks. -- tejun

[PATCH 3/7] blk-mq: use blk_mq_rq_state() instead of testing REQ_ATOM_COMPLETE

2017-12-16 Thread Tejun Heo
. Signed-off-by: Tejun Heo <t...@kernel.org> --- block/blk-mq.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index abd5d01..643a38d 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -95,8 +95,7 @@ static void blk_mq_check_in

[PATCH 1/7] blk-mq: protect completion path with RCU

2017-12-16 Thread Tejun Heo
Currently, blk-mq protects only the issue path with RCU. This patch puts the completion path under the same RCU protection. This will be used to synchronize issue/completion against timeout by later patches, which will also add the comments. Signed-off-by: Tejun Heo <t...@kernel.org> ---

[PATCH 4/7] blk-mq: make blk_abort_request() trigger timeout path

2017-12-16 Thread Tejun Heo
while, even when the caller owns the request. AFAICS, SCSI and ATA should be fine with that and I think mtip32xx and dasd should be safe but not completely sure. It'd be great if people who know the drivers take a look. Signed-off-by: Tejun Heo <t...@kernel.org> Cc: Asai Thambi SP <

[PATCH 6/7] blk-mq: remove REQ_ATOM_STARTED

2017-12-16 Thread Tejun Heo
. REQ_ATOM_STARTED no longer has any users left and is removed. Signed-off-by: Tejun Heo <t...@kernel.org> --- block/blk-mq-debugfs.c | 4 +--- block/blk-mq.c | 37 - block/blk-mq.h | 1 + block/blk.h| 1 - 4 files changed, 10 inse

[PATCH 5/7] blk-mq: remove REQ_ATOM_COMPLETE usages from blk-mq

2017-12-16 Thread Tejun Heo
timeout multiple times. This removes atomic bitops from hot paths too. v2: Removed blk_clear_rq_complete() from blk_mq_rq_timed_out(). v3: Added RQF_MQ_TIMEOUT_EXPIRED flag. Signed-off-by: Tejun Heo <t...@kernel.org> Cc: "jianchao.wang" <jianchao.w.w...@oracle.com>

[PATCH 7/7] blk-mq: rename blk_mq_hw_ctx->queue_rq_srcu to ->srcu

2017-12-16 Thread Tejun Heo
The RCU protection has been expanded to cover both queueing and completion paths making ->queue_rq_srcu a misnomer. Rename it to ->srcu as suggested by Bart. Signed-off-by: Tejun Heo <t...@kernel.org> Cc: Bart Van Assche <bart.vanass...@wdc.com> --- block/blk

[PATCH 2/7] blk-mq: replace timeout synchronization with a RCU and generation based scheme

2017-12-16 Thread Tejun Heo
Fixed. Signed-off-by: Tejun Heo <t...@kernel.org> Cc: "jianchao.wang" <jianchao.w.w...@oracle.com> Cc: Peter Zijlstra <pet...@infradead.org> --- block/blk-core.c | 2 + block/blk-mq.c | 220 + block/blk-mq.
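A generation-based scheme of the kind named in this patch title can be illustrated with the user-space sketch below: each issue of a request starts a new generation, and a timeout or completion only takes effect if the generation it observed is still current. This is a loose model under stated assumptions, not the patch itself: it uses a single compare-and-swap in place of the RCU-based synchronization the patchset describes, and `struct toy_request`, `gstate` as laid out here and the `toy_*` helpers are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

enum toy_state { TOY_IDLE = 0, TOY_IN_FLIGHT = 1, TOY_COMPLETE = 2 };
#define TOY_STATE_BITS 2

/* Hypothetical request: generation and state packed into one word. */
struct toy_request {
    _Atomic uint64_t gstate;  /* (generation << TOY_STATE_BITS) | state */
};

static uint64_t toy_pack(uint64_t gen, enum toy_state st)
{
    return (gen << TOY_STATE_BITS) | (uint64_t)st;
}

/* Issue path: start a new generation and mark the request in flight. */
static uint64_t toy_issue(struct toy_request *rq)
{
    uint64_t old, next;

    assert(rq);
    old = atomic_load(&rq->gstate);
    next = toy_pack((old >> TOY_STATE_BITS) + 1, TOY_IN_FLIGHT);
    atomic_store(&rq->gstate, next);
    return next;  /* snapshot for later completion or timeout */
}

/* Completion: wins only if the request is still in flight in 'seen'. */
static bool toy_complete(struct toy_request *rq, uint64_t seen)
{
    uint64_t done = toy_pack(seen >> TOY_STATE_BITS, TOY_COMPLETE);

    return atomic_compare_exchange_strong(&rq->gstate, &seen, done);
}

/* Timeout: claims the request only if its snapshot is still current. */
static bool toy_timeout(struct toy_request *rq, uint64_t seen)
{
    return toy_complete(rq, seen);
}
```

Because a recycled request carries a different generation, a timeout that fires late against an old snapshot simply fails in this model, and the same snapshot cannot be claimed twice.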

[PATCHSET v3] blk-mq: reimplement timeout handling

2017-12-16 Thread Tejun Heo
Hello, Changes from [v2] - Possible extended looping around seqcount and u64_stat_sync fixed. - Misplaced MQ_RQ_IDLE state setting fixed. - RQF_MQ_TIMEOUT_EXPIRED added to prevent firing the same timeout multiple times. - s/queue_rq_src/srcu/ patch added. - Other misc changes. Changes

Re: [PATCH 5/7] blk-mq: remove REQ_ATOM_COMPLETE usages from blk-mq

2018-01-08 Thread Tejun Heo
Hello, Jianchao. On Fri, Dec 22, 2017 at 12:02:20PM +0800, jianchao.wang wrote: > > On Thu, Dec 21, 2017 at 11:56:49AM +0800, jianchao.wang wrote: > >> It's worrying that even though the blk_mark_rq_complete() here is > >> intended to synchronize with timeout path, but it indeed give the > >>

[PATCH 5/8] blk-mq: make blk_abort_request() trigger timeout path

2018-01-08 Thread Tejun Heo
while, even when the caller owns the request. AFAICS, SCSI and ATA should be fine with that and I think mtip32xx and dasd should be safe but not completely sure. It'd be great if people who know the drivers take a look. Signed-off-by: Tejun Heo <t...@kernel.org> Cc: Asai Thambi SP <

[PATCHSET v4] blk-mq: reimplement timeout handling

2018-01-08 Thread Tejun Heo
Hello, Changes from [v3] - Rebased on top of for-4.16/block. - Integrated Jens's hctx_[un]lock() factoring patch and refreshed the patches accordingly. - Added comment explaining the use of hctx_lock() instead of rcu_read_lock() in completion path. Changes from [v2] - Possible extended

[PATCH 6/8] blk-mq: remove REQ_ATOM_COMPLETE usages from blk-mq

2018-01-08 Thread Tejun Heo
timeout multiple times. This removes atomic bitops from hot paths too. v2: Removed blk_clear_rq_complete() from blk_mq_rq_timed_out(). v3: Added RQF_MQ_TIMEOUT_EXPIRED flag. Signed-off-by: Tejun Heo <t...@kernel.org> Cc: "jianchao.wang" <jianchao.w.w...@oracle.com>

[PATCH 8/8] blk-mq: rename blk_mq_hw_ctx->queue_rq_srcu to ->srcu

2018-01-08 Thread Tejun Heo
The RCU protection has been expanded to cover both queueing and completion paths making ->queue_rq_srcu a misnomer. Rename it to ->srcu as suggested by Bart. Signed-off-by: Tejun Heo <t...@kernel.org> Cc: Bart Van Assche <bart.vanass...@wdc.com> --- block/blk

[PATCH 7/8] blk-mq: remove REQ_ATOM_STARTED

2018-01-08 Thread Tejun Heo
. REQ_ATOM_STARTED no longer has any users left and is removed. Signed-off-by: Tejun Heo <t...@kernel.org> --- block/blk-mq-debugfs.c | 4 +--- block/blk-mq.c | 37 - block/blk-mq.h | 1 + block/blk.h| 1 - 4 files changed, 10 inse

[PATCH 4/8] blk-mq: use blk_mq_rq_state() instead of testing REQ_ATOM_COMPLETE

2018-01-08 Thread Tejun Heo
. Signed-off-by: Tejun Heo <t...@kernel.org> --- block/blk-mq.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 6587f0c..41bfd27 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -95,8 +95,7 @@ static void blk_mq_check_in

[PATCH 2/8] blk-mq: protect completion path with RCU

2018-01-08 Thread Tejun Heo
Currently, blk-mq protects only the issue path with RCU. This patch puts the completion path under the same RCU protection. This will be used to synchronize issue/completion against timeout by later patches, which will also add the comments. Signed-off-by: Tejun Heo <t...@kernel.org> ---

[PATCH 3/8] blk-mq: replace timeout synchronization with a RCU and generation based scheme

2018-01-08 Thread Tejun Heo
Fixed. v4: - Rebased on top of hctx_lock() refactoring patch. - Added comment explaining the use of hctx_lock() in completion path. Signed-off-by: Tejun Heo <t...@kernel.org> Cc: "jianchao.wang" <jianchao.w.w...@oracle.com> Cc: Peter Zijlstra <pet...@infradead.org>

[PATCH 1/8] blk-mq: move hctx lock/unlock into a helper

2018-01-08 Thread Tejun Heo
ed-off-by: Jens Axboe <ax...@kernel.dk> Signed-off-by: Tejun Heo <t...@kernel.org> --- block/blk-mq.c | 66 -- 1 file changed, 32 insertions(+), 34 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 111e1aa..ddc9261 1006

Re: [PATCHSET v3] blk-mq: reimplement timeout handling

2018-01-08 Thread Tejun Heo
On Fri, Dec 29, 2017 at 02:02:39AM -0800, Christoph Hellwig wrote: > This seems to miss the linux-block list once again. Please include > it in the next resend. Sorry about that. Copy/pasted from the older thread without thinking. Thanks. -- tejun

Re: [PATCH 1/7] blk-mq: protect completion path with RCU

2018-01-08 Thread Tejun Heo
Hello, Christoph. On Fri, Dec 29, 2017 at 02:04:18AM -0800, Christoph Hellwig wrote: > Why do you need the srcu protection? The completion path can never > sleep. > > If there is a good reason to keep it please add a comment, and > make the srcu variant a separate function only used by drivers

Re: [PATCH 5/7] blk-mq: remove REQ_ATOM_COMPLETE usages from blk-mq

2018-01-08 Thread Tejun Heo
Hello, On Tue, Jan 09, 2018 at 11:08:04AM +0800, jianchao.wang wrote: > > But what'd prevent the completion reinitializing the request and then > > the actual completion path coming in and completing the request again? > > blk_mark_rq_complete() will gate and ensure there will be only one >

[PATCH 4/8] blk-mq: use blk_mq_rq_state() instead of testing REQ_ATOM_COMPLETE

2018-01-09 Thread Tejun Heo
. Signed-off-by: Tejun Heo <t...@kernel.org> --- block/blk-mq.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 052fee5..51e9704 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -95,8 +95,7 @@ static void blk_mq_check_in

[PATCH 6/8] blk-mq: remove REQ_ATOM_COMPLETE usages from blk-mq

2018-01-09 Thread Tejun Heo
timeout multiple times. This removes atomic bitops from hot paths too. v2: Removed blk_clear_rq_complete() from blk_mq_rq_timed_out(). v3: Added RQF_MQ_TIMEOUT_EXPIRED flag. Signed-off-by: Tejun Heo <t...@kernel.org> Cc: "jianchao.wang" <jianchao.w.w...@oracle.com>

[PATCH 7/8] blk-mq: remove REQ_ATOM_STARTED

2018-01-09 Thread Tejun Heo
. REQ_ATOM_STARTED no longer has any users left and is removed. Signed-off-by: Tejun Heo <t...@kernel.org> --- block/blk-mq-debugfs.c | 4 +--- block/blk-mq.c | 37 - block/blk-mq.h | 1 + block/blk.h| 1 - 4 files changed, 10 inse

[PATCH 5/8] blk-mq: make blk_abort_request() trigger timeout path

2018-01-09 Thread Tejun Heo
->deadline update as requested by Bart. Signed-off-by: Tejun Heo <t...@kernel.org> Cc: Asai Thambi SP <asamymuth...@micron.com> Cc: Stefan Haberland <s...@linux.vnet.ibm.com> Cc: Jan Hoeppner <hoepp...@linux.vnet.ibm.com> Cc: Bart Van Assche <bart.vanass...@wdc.com> --- b

[PATCH 3/8] blk-mq: replace timeout synchronization with a RCU and generation based scheme

2018-01-09 Thread Tejun Heo
he use of hctx_lock() in completion path. v5: - Added comments requested by Bart. - Note the addition of BLK_EH_RESET_TIMER race condition in the commit message. Signed-off-by: Tejun Heo <t...@kernel.org> Cc: "jianchao.wang" <jianchao.w.w...@oracle.com> Cc: Peter Zijlst

[PATCH 2/8] blk-mq: protect completion path with RCU

2018-01-09 Thread Tejun Heo
Currently, blk-mq protects only the issue path with RCU. This patch puts the completion path under the same RCU protection. This will be used to synchronize issue/completion against timeout by later patches, which will also add the comments. Signed-off-by: Tejun Heo <t...@kernel.org> ---

[PATCHSET v5] blk-mq: reimplement timeout handling

2018-01-09 Thread Tejun Heo
Hello, Changes from [v4] - Comments added. Patch description updated. Changes from [v3] - Rebased on top of for-4.16/block. - Integrated Jens's hctx_[un]lock() factoring patch and refreshed the patches accordingly. - Added comment explaining the use of hctx_lock() instead of

[PATCH 1/8] blk-mq: move hctx lock/unlock into a helper

2018-01-09 Thread Tejun Heo
ed-off-by: Jens Axboe <ax...@kernel.dk> Signed-off-by: Tejun Heo <t...@kernel.org> --- block/blk-mq.c | 66 -- 1 file changed, 32 insertions(+), 34 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 111e1aa..ddc9261 1006

Re: [PATCH 5/7] blk-mq: remove REQ_ATOM_COMPLETE usages from blk-mq

2017-12-21 Thread Tejun Heo
Hello, On Thu, Dec 21, 2017 at 11:56:49AM +0800, jianchao.wang wrote: > It's worrying that even though the blk_mark_rq_complete() here is intended to > synchronize with > timeout path, but it indeed give the blk_mq_complete_request() the capability > to exclude with > itself. Maybe this