Re: [PATCH V3] blk-mq: fix race between complete and BLK_EH_RESET_TIMER

2018-04-13 Thread Jens Axboe
On 4/12/18 5:59 AM, Ming Lei wrote: > The normal request completion can be done before or during handling > BLK_EH_RESET_TIMER, and this race may cause the request to never be > completed since driver's .timeout() may always return > BLK_EH_RESET_TIMER. > > This issue can't be fixed completely by

Re: [PATCH] nbd: do not update block size if file system is mounted

2018-04-13 Thread Josef Bacik
Yeah sorry I screwed that up again. I’m wondering if we can just drop this altogether and leave the zero setting in the config put that we already have. Does doing that fix it for you? Thanks, Josef Sent from my iPhone > On Apr 13, 2018, at 10:26 AM, Michael Tretter

Re: [PATCH v5] blk-mq: Avoid that a completion can be ignored for BLK_EH_RESET_TIMER

2018-04-13 Thread Ming Lei
On Tue, Apr 10, 2018 at 03:01:57PM -0600, Bart Van Assche wrote: > The blk-mq timeout handling code ignores completions that occur after > blk_mq_check_expired() has been called and before blk_mq_rq_timed_out() > has reset rq->aborted_gstate. If a block driver timeout handler always > returns

[PATCHSET 0/2] loop: dio fixup

2018-04-13 Thread Jens Axboe
Wanted to do a v2 of the loop dio short read fix, since I figured we should retain the zero fill for if we have to give up doing reads. Also noticed that we have cmd->rq for some reason, so killed that off as a cleanup. -- Jens Axboe

[PATCH 1/2] loop: remove cmd->rq member

2018-04-13 Thread Jens Axboe
We can always get at the request from the payload, no need to store a pointer to it. Signed-off-by: Jens Axboe --- drivers/block/loop.c | 36 +++- drivers/block/loop.h | 1 - 2 files changed, 19 insertions(+), 18 deletions(-) diff --git

[PATCH 2/2] loop: handle short DIO reads

2018-04-13 Thread Jens Axboe
We ran into an issue with loop and btrfs, where btrfs would complain about checksum errors. It turns out that is because we don't handle short reads at all, we just zero fill the remainder. Worse than that, we don't handle the filling properly, which results in loop trying to advance a single bio

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-04-13 Thread Stephen Bates
> I'll see if I can get our PCI SIG people to follow this through Hi Jonathan Can you let me know if this moves forward within PCI-SIG? I would like to track it. I can see this being doable between Root Ports that reside in the same Root Complex but might become more challenging to

[GIT PULL] Followup pull request for block

2018-04-13 Thread Jens Axboe
ission with device removal from Bart. - sr reference fix from me, fixing a case where disk change or getevents can race with device removal. - Set of nvme fixes by way of Keith, from various contributors. Please pull! git://git.kernel.dk/linux-block.git tags/for-linus-20

loop: handle short DIO reads

2018-04-13 Thread Jens Axboe
We ran into an issue with loop and btrfs, where btrfs would complain about checksum errors. It turns out that is because we don't handle short reads at all, we just zero fill the remainder. Worse than that, we don't handle the filling properly, which results in loop trying to advance a single bio

Re: [PATCH] loop: fix aio/dio end of request clearing

2018-04-13 Thread Jens Axboe
On 4/12/18 12:52 PM, Jens Axboe wrote: > If we read more than the user asked for, we zero fill the last > part. But the current code assumes that the request has just > one bio, and since that's not guaranteed to be true, we can > run into a situation where we attempt to advance a bio by > a

Re: [PATCH v2 0/2] tracing/events: block: bring more on a par with blktrace

2018-04-13 Thread Steven Rostedt
On Fri, 13 Apr 2018 18:36:20 +0200 Steffen Maier wrote: > I had the need to understand I/O request processing in detail. > But I also had the need to enrich block traces with other trace events > including my own dynamic kprobe events. So I preferred block trace events >

Re: [PATCH v2] block: ratelimite pr_err on IO path

2018-04-13 Thread Martin K. Petersen
Jinpu, [CC:ed the mpt3sas maintainers] The ratelimit patch is just an attempt to treat the symptom, not the cause. > Thanks for asking, we updated mpt3sas driver which enables DIX support > (prot_mask=0x7f), all disks are SATA SSDs, no DIF support. > After reboot, kernel reports the IO errors

Re: [PATCH] target: Fix Fortify_panic kernel exception

2018-04-13 Thread Christoph Hellwig
The patch looks fine, but in general I think descriptions of what you fixed in the code or more important than starting out with a backtrace. E.g. please explain what was wrong, how you fixed it and only after that mention how it was caught. (preferably without the whole trace)

[PATCH v2 2/2] tracing/events: block: dev_t via driver core for plug and unplug events

2018-04-13 Thread Steffen Maier
Complements v2.6.31 commit 55782138e47d ("tracing/events: convert block trace points to TRACE_EVENT()") to be equivalent to traditional blktrace output. Also this allows event filtering to not always get all (un)plug events. NB: The NULL pointer check for q->kobj.parent is certainly racy and I

[PATCH v2 1/2] tracing/events: block: track and print if unplug was explicit or schedule

2018-04-13 Thread Steffen Maier
Just like blktrace distinguishes explicit and schedule by means of BLK_TA_UNPLUG_IO and BLK_TA_UNPLUG_TIMER, actually make use of the existing argument "explicit" to distinguish the two cases in the one common tracepoint block_unplug. Complements v2.6.39 commit 49cac01e1fa7 ("block: make unplug

[PATCH v2 0/2] tracing/events: block: bring more on a par with blktrace

2018-04-13 Thread Steffen Maier
I had the need to understand I/O request processing in detail. But I also had the need to enrich block traces with other trace events including my own dynamic kprobe events. So I preferred block trace events over blktrace to get everything nicely sorted into one ftrace output. However, I missed

[RFC v2] tracing/events: block: also try to get dev_t via driver core for some events

2018-04-13 Thread Steffen Maier
Complements v2.6.31 commit 55782138e47d ("tracing/events: convert block trace points to TRACE_EVENT()") for cases where rq->rq_disk == NULL: block_rq_requeue, block_rq_insert, block_rq_issue; and for cases where bio == NULL: block_getrq, block_sleeprq. NB: The NULL pointer check for

Re: [PATCH 1/3] nbd: del_gendisk after we cleanup the queue

2018-04-13 Thread Bart Van Assche
On Fri, 2018-04-13 at 12:21 -0400, Josef Bacik wrote: > On Fri, Apr 13, 2018 at 04:16:18PM +, Bart Van Assche wrote: > > On Fri, 2018-04-13 at 12:03 -0400, Josef Bacik wrote: > > > This fixes a use after free bug, we need to do the del_gendisk after we > > > cleanup the queue on the device. >

Re: [PATCH 1/3] nbd: del_gendisk after we cleanup the queue

2018-04-13 Thread Josef Bacik
On Fri, Apr 13, 2018 at 04:16:18PM +, Bart Van Assche wrote: > On Fri, 2018-04-13 at 12:03 -0400, Josef Bacik wrote: > > This fixes a use after free bug, we need to do the del_gendisk after we > > cleanup the queue on the device. > > Hello Josef, > > Which use-after-free bug does this patch

Re: [PATCH 1/3] nbd: del_gendisk after we cleanup the queue

2018-04-13 Thread Bart Van Assche
On Fri, 2018-04-13 at 12:03 -0400, Josef Bacik wrote: > This fixes a use after free bug, we need to do the del_gendisk after we > cleanup the queue on the device. Hello Josef, Which use-after-free bug does this patch fix? Are you aware that most block drivers call blk_cleanup_queue() before

[PATCH 3/3] nbd: use bd_set_size when updating disk size

2018-04-13 Thread Josef Bacik
From: Josef Bacik When we stopped relying on the bdev everywhere I broke updating the block device size on the fly, which ceph relies on. We can't just do set_capacity, we also have to do bd_set_size so things like parted will notice the device size change. Fixes: 29eaadc ("nbd:

[PATCH 1/3] nbd: del_gendisk after we cleanup the queue

2018-04-13 Thread Josef Bacik
From: Josef Bacik This fixes a use after free bug, we need to do the del_gendisk after we cleanup the queue on the device. Fixes: c6a4759ea0c9 ("nbd: add device refcounting") cc: sta...@vger.kernel.org Signed-off-by: Josef Bacik --- drivers/block/nbd.c | 2 +- 1

[PATCH 2/3] nbd: update size when connected

2018-04-13 Thread Josef Bacik
From: Josef Bacik I messed up changing the size of an NBD device while it was connected by not actually updating the device or doing the uevent. Fix this by updating everything if we're connected and we change the size. cc: sta...@vger.kernel.org Fixes: 639812a ("nbd: don't set

Re: [PATCH] blk-mq: fix race between complete and BLK_EH_RESET_TIMER

2018-04-13 Thread Ming Lei
On Thu, Apr 12, 2018 at 06:57:12AM -0700, Tejun Heo wrote: > On Thu, Apr 12, 2018 at 07:05:13AM +0800, Ming Lei wrote: > > > Not really because aborted_gstate right now doesn't have any memory > > > barrier around it, so nothing ensures blk_add_timer() actually appears > > > before. We can either

Re: [PATCH] block: Ensure that a request queue is dissociated from the cgroup controller

2018-04-13 Thread t...@kernel.org
Hello, Bart. On Thu, Apr 12, 2018 at 10:40:52PM +, Bart Van Assche wrote: > Is something like the patch below perhaps what you had in mind? Yeah, exactly. It'd be really great to have the lockdep asserts add to the right places. Thanks. -- tejun

Re: [PATCH] nbd: do not update block size if file system is mounted

2018-04-13 Thread Michael Tretter
On Fri, 16 Mar 2018 09:36:45 -0700, Jens Axboe wrote: > On 3/15/18 3:00 AM, Michael Tretter wrote: > > On Wed, 28 Feb 2018 10:54:56 +0100, Michael Tretter wrote: > >> If a file system is mounted on the nbd during a disconnect, resetting > >> the size to 0, might change the block size and destroy

Re: [PATCH 1/2] tracing/events: block: track and print if unplug was explicit or schedule

2018-04-13 Thread Steven Rostedt
On Fri, 13 Apr 2018 15:07:17 +0200 Steffen Maier wrote: > Just like blktrace distinguishes explicit and schedule by means of > BLK_TA_UNPLUG_IO and BLK_TA_UNPLUG_TIMER, actually make use of the > existing argument "explicit" to distinguish the two cases in the one > common

[PATCH 2/2] tracing/events: block: dev_t via driver core for plug and unplug events

2018-04-13 Thread Steffen Maier
Complements v2.6.31 commit 55782138e47d ("tracing/events: convert block trace points to TRACE_EVENT()") to be equivalent to traditional blktrace output. Also this allows event filtering to not always get all (un)plug events. NB: The NULL pointer check for q->kobj.parent is certainly racy and I

[PATCH 1/2] tracing/events: block: track and print if unplug was explicit or schedule

2018-04-13 Thread Steffen Maier
Just like blktrace distinguishes explicit and schedule by means of BLK_TA_UNPLUG_IO and BLK_TA_UNPLUG_TIMER, actually make use of the existing argument "explicit" to distinguish the two cases in the one common tracepoint block_unplug. Complements v2.6.39 commit 49cac01e1fa7 ("block: make unplug

[PATCH 0/2] tracing/events: block: bring more on a par with blktrace

2018-04-13 Thread Steffen Maier
I had the need to understand I/O request processing in detail. But I also had the need to enrich block traces with other trace events including my own dynamic kprobe events. So I preferred block trace events over blktrace to get everything nicely sorted into one ftrace output. However, I missed

[RFC] tracing/events: block: also try to get dev_t via driver core for some events

2018-04-13 Thread Steffen Maier
Complements v2.6.31 commit 55782138e47d ("tracing/events: convert block trace points to TRACE_EVENT()") for cases where rq->rq_disk == NULL: block_rq_requeue, block_rq_insert, block_rq_issue; and for cases where bio == NULL: block_getrq, block_sleeprq. NB: The NULL pointer check for

Re: [PATCH 1/2] bcache: remove unncessary code in bch_btree_keys_init()

2018-04-13 Thread tang . junhui
Hi Coly, >Function bch_btree_keys_init() initializes b->set[].size and >b->set[].data to zero. As the code comments indicates, these code indeed >is unncessary, because both struct btree_keys and struct bset_tree are >nested embedded into struct btree, when struct btree is filled with 0 >bits by

Re: [PATCH 1/4] bcache: finish incremental GC

2018-04-13 Thread tang . junhui
Hi Coly, > Hi Junhui, > How about send out v2 patch and let me confirm it with a Reviewed-by ? Certainly it's OK. Thanks. Tang Junhui

Re: [PATCH 1/4] bcache: finish incremental GC

2018-04-13 Thread Coly Li
On 2018/4/13 4:37 PM, tang.jun...@zte.com.cn wrote: > Hi Coly, >>> >>> Hello Coly, >>> On 2018/4/12 2:38 PM, tang.jun...@zte.com.cn wrote: > From: Tang Junhui > > In GC thread, we record the latest GC key in gc_done, which is expected > to be used for

Re: [PATCH v2] block: ratelimite pr_err on IO path

2018-04-13 Thread Jinpu Wang
On Thu, Apr 12, 2018 at 11:20 PM, Martin K. Petersen wrote: > > Jack, > >> + pr_err_ratelimited("%s: ref tag error at >> location %llu (rcvd %u)\n", > > I'm a bit concerned about dropping records of potential data loss. > > Also, what are

Re: [PATCH 1/4] bcache: finish incremental GC

2018-04-13 Thread tang . junhui
Hi Coly, >> >> Hello Coly, >> >>> On 2018/4/12 2:38 PM, tang.jun...@zte.com.cn wrote: From: Tang Junhui In GC thread, we record the latest GC key in gc_done, which is expected to be used for incremental GC, but in currently code, we didn't realize

Re: [PATCH v2] block: do not use interruptible wait anywhere

2018-04-13 Thread Johannes Thumshirn
Hi Alan, On Thu, 2018-04-12 at 19:11 +0100, Alan Jenkins wrote: > # dd if=/dev/sda of=/dev/null iflag=direct & \ > while killall -SIGUSR1 dd; do sleep 0.1; done & \ > echo mem > /sys/power/state ; \ > sleep 5; killall dd # stop after 5 seconds Can you please also add a regression test to

Re: [PATCH 3/4] bcache: notify allocator to prepare for GC

2018-04-13 Thread tang . junhui
Hi Coly Thanks for you comments. > Hi Junhui, > > This method is complicated IMHO. The main idea of this patch is: GC thread allocator thread ==>triggered by sectors_to_gc set ca->prepare_gc to GC_PREPARING to notify