this command out, which would indicate that we've
completed it as well.
Signed-off-by: Josef Bacik
---
drivers/block/nbd.c | 6 ++
1 file changed, 6 insertions(+)
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 8fb8913074b8..e9f5d4e476e7 100644
--- a/drivers/block/nbd.c
+++
We noticed a problem where NBD sometimes double completes the same request when
things go wrong and we time out the request. If the other side goes out to
lunch but happens to reply just as we're timing out the requests, we can end up
with a double completion on the request.
We already keep track
se this is initiated by the user, so again is safe.
Signed-off-by: Josef Bacik
---
drivers/block/nbd.c | 12 +++-
1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index a8e3815295fe..8fb8913074b8 100644
--- a/drivers/block/nbd.c
+++ b/dri
ill make sure all the disk add/remove stuff are done
> by holding the nbd_index_mutex lock.
>
> Signed-off-by: Xiubo Li
> Reported-by: Mike Christie
Sorry, don't know how I missed this.
Reviewed-by: Josef Bacik
Thanks,
Josef
On Tue, Sep 17, 2019 at 02:44:09PM -0500, Mike Christie wrote:
> On 09/17/2019 01:40 PM, Josef Bacik wrote:
> >>> + nbd->destroy_complete = &destroy_complete;
> >>
> >> Also, without the mutex part of the v3 patch, we could race and
On Tue, Sep 17, 2019 at 01:31:05PM -0500, Mike Christie wrote:
> On 09/17/2019 06:56 AM, xiu...@redhat.com wrote:
> > From: Xiubo Li
> >
> > When the NBD_CFLAG_DESTROY_ON_DISCONNECT flag is set and the socket
> > is closed at the same time because the server daemon restarted,
> > just befo
fter free bug from Mike's comments
> - This has been tested for 3 days and works well.
>
>
You can add
Reviewed-by: Josef Bacik
to the series, Thanks,
Josef
> >>
> >>
> >>> Il giorno 16 ago 2019, alle ore 19:59, Josef Bacik
> >>> ha scritto:
> >>>
> >>> On Fri, Aug 16, 2019 at 07:52:40PM +0200, Paolo Valente wrote:
> >>>>
> >>>>
> >>
On Fri, Aug 16, 2019 at 07:52:40PM +0200, Paolo Valente wrote:
>
>
> > Il giorno 16 ago 2019, alle ore 15:21, Josef Bacik
> > ha scritto:
> >
> > On Fri, Aug 16, 2019 at 12:57:41PM +0200, Paolo Valente wrote:
> >> Hi,
> >> I happened to test
On Fri, Aug 16, 2019 at 12:57:41PM +0200, Paolo Valente wrote:
> Hi,
> I happened to test the io.latency controller, to make a comparison
> between this controller and BFQ. But io.latency seems not to work,
> i.e., not to reduce latency compared with what happens with no I/O
> control at all. Her
On Sun, Aug 04, 2019 at 02:10:06PM -0500, Mike Christie wrote:
> This fixes a bug added in 4.10 with commit:
>
> commit 9561a7ade0c205bc2ee035a2ac880478dcc1a024
> Author: Josef Bacik
> Date: Tue Nov 22 14:04:40 2016 -0500
>
> nbd: add multi-connection support
>
On Tue, Aug 13, 2019 at 10:45:55AM -0500, Mike Christie wrote:
> On 08/13/2019 08:13 AM, Josef Bacik wrote:
> > On Fri, Aug 09, 2019 at 04:26:10PM -0500, Mike Christie wrote:
> >> This fixes a regression added in 4.9 with commit:
> >>
> >> commit 0ead
On Fri, Aug 09, 2019 at 04:26:10PM -0500, Mike Christie wrote:
> This fixes a regression added in 4.9 with commit:
>
> commit 0eadf37afc2500e1162c9040ec26a705b9af8d47
> Author: Josef Bacik
> Date: Thu Sep 8 12:33:40 2016 -0700
>
> nbd: allow block mq to deal wit
On Fri, Aug 09, 2019 at 04:26:08PM -0500, Mike Christie wrote:
> This adds a helper function to convert a block req op to a nbd cmd type.
> It will be used in the last patch to log the type in the timeout
> handler.
>
> Signed-off-by: Mike Christie
Reviewed-by: Josef Bacik
Thanks,
Josef
On Fri, Aug 09, 2019 at 04:26:09PM -0500, Mike Christie wrote:
> Fix bug added with the patch:
>
> commit 8f3ea35929a0806ad1397db99a89ffee0140822a
> Author: Josef Bacik
> Date: Mon Jul 16 12:11:35 2018 -0400
>
> nbd: handle unexpected replies better
>
> where
On Fri, Aug 09, 2019 at 04:26:07PM -0500, Mike Christie wrote:
> Add a helper to set the cmd timeout. It does not really do a lot now,
> but will be more useful in the next patches.
>
> Signed-off-by: Mike Christie
Reviewed-by: Josef Bacik
Thanks,
Josef
bd2]
> ? remove_wait_queue+0x60/0x60
> kthread+0xf8/0x130
> ? commit_timeout+0x10/0x10 [jbd2]
> ? kthread_bind+0x10/0x10
> ret_from_fork+0x35/0x40
>
> With __invalidate_device(), I no longer hit the BUG_ON with sync or
> unmount on the disconnected device.
>
Jeeze I swear I see this same patch go by every 6 months or so, not sure what
happens to it. Anyway
Reviewed-by: Josef Bacik
Thanks,
Josef
sk_buff *skb, struct
> genl_info *info)
> }
>
> dev_list = nla_nest_start_noflag(reply, NBD_ATTR_DEVICE_LIST);
> +
No newline here; once you fix that nit you can add
Reviewed-by: Josef Bacik
Thanks,
Josef
Oleg noticed that our checking of data.got_token is unsafe in the
cleanup case, and should really use a memory barrier. Use a wmb() on the
write side and an rmb() on the read side. We don't need one in the main
loop since we're saved by set_current_state().
Signed-off-by: Josef Bacik
spurious wakeups we'd still want this to be the case. So set
has_sleepers to true if we went to sleep to make sure we're woken up the
proper way.
Signed-off-by: Josef Bacik
---
block/blk-rq-qos.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/blk-rq-qos.c b
In case we get a spurious wakeup we need to make sure to re-set
ourselves to TASK_UNINTERRUPTIBLE so we don't busy wait.
Signed-off-by: Josef Bacik
---
block/blk-rq-qos.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/block/blk-rq-qos.c b/block/blk-rq-qos.c
index 69a0f0b77795..c450b89
sleeper on the list
after we add ourselves, that way we have an uptodate view of the list.
Signed-off-by: Josef Bacik
---
block/blk-rq-qos.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/block/blk-rq-qos.c b/block/blk-rq-qos.c
index 659ccb8b693f..67a0a4c07060 100644
--- a/block/blk-rq-q
e are existing waiters locklessly we need to be able to update
our view of the waitqueue list after we've added ourselves to the
waitqueue. Accomplish this by adding this helper to see if there is
more than just ourselves on the list.
Signed-off-by: Josef Bacik
---
include/linux/w
This is the patch series to address the hang we saw in production because of
missed wakeups, and the other issues that Oleg noticed while reviewing the code.
v2->v3:
- apparently I don't understand what READ/WRITE_ONCE does
- set ourselves to TASK_UNINTERRUPTIBLE on wakeup just in case
- add a com
e are existing waiters locklessly we need to be able to update
our view of the waitqueue list after we've added ourselves to the
waitqueue. Accomplish this by adding this helper to see if there is
more than just ourselves on the list.
Signed-off-by: Josef Bacik
---
include/linux/w
ng
isn't needed.
Signed-off-by: Josef Bacik
---
block/blk-rq-qos.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/block/blk-rq-qos.c b/block/blk-rq-qos.c
index f4aa7b818cf5..35bc6f54d088 100644
--- a/block/blk-rq-qos.c
+++ b/block/blk-rq-qos.c
@@ -261,7 +261,6 @@ void rq_qos_wait(struc
sleeper on the list
after we add ourselves, that way we have an uptodate view of the list.
Signed-off-by: Josef Bacik
---
block/blk-rq-qos.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/block/blk-rq-qos.c b/block/blk-rq-qos.c
index 659ccb8b693f..67a0a4c07060 100644
--- a/block/blk-rq-q
Oleg noticed that our checking of data.got_token is unsafe in the
cleanup case, and should really use a memory barrier. Use the
READ_ONCE/WRITE_ONCE helpers on got_token so we can be sure we're always
safe.
Signed-off-by: Josef Bacik
---
block/blk-rq-qos.c | 6 +++---
1 file chang
This is the patch series to address the hang we saw in production because of
missed wakeups, and the other issues that Oleg noticed while reviewing the code.
v1->v2:
- rename wq_has_multiple_sleepers to wq_has_single_sleeper
- fix the check for has_sleepers in the missed wake-ups patch
- fix the b
On Thu, Jul 11, 2019 at 03:40:06PM +0200, Oleg Nesterov wrote:
> On 07/11, Oleg Nesterov wrote:
> >
> > Jens,
> >
> > I managed to convince myself I understand why 2/2 needs this change...
> > But rq_qos_wait() still looks suspicious to me. Why can't the main loop
> > "break" right after io_schedul
e are existing waiters locklessly we need to be able to update
our view of the waitqueue list after we've added ourselves to the
waitqueue. Accomplish this by adding this helper to see if there are
more than two waiters on the waitqueue.
Suggested-by: Jens Axboe
Signed-off-by: Josef Bacik
--
e sleepers after we add
ourselves to the list, that way we have an uptodate view of the list.
Signed-off-by: Josef Bacik
---
block/blk-rq-qos.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/block/blk-rq-qos.c b/block/blk-rq-qos.c
index 659ccb8b693f..b39b5f3fb01b 100644
--- a/block/blk-rq-q
(yes, yes, I know) in
order to get a real value for has_sleepers. This way we keep our
optimization in place and avoid hanging forever if there are no longer
any waiters on the list.
Signed-off-by: Josef Bacik
---
block/blk-rq-qos.c | 7 ++-
1 file changed, 6 insertions(+), 1 deletion(-)
di
ure the IO goes down from
> the correct cgroup.
>
> Signed-off-by: Chris Mason
Reviewed-by: Josef Bacik
Thanks,
Josef
async crcs, the bio already has the correct css, we just need to
> tell the block layer to use REQ_CGROUP_PUNT.
>
> Signed-off-by: Chris Mason
> Modified-and-reviewed-by: Tejun Heo
> ---
Reviewed-by: Josef Bacik
Thanks,
Josef
> [ 8308.623451] entry_SYSCALL_64_after_hwframe+0x42/0xb7
>
> The fix here is to make async_cow->locked_page NULL everywhere but the
> one async_cow struct that's allowed to do things to the locked page.
>
> Signed-off-by: Chris Mason
> Fixes: 771ed689d2cd ("Btrfs: Optimize compressed writeback and reads")
> ---
Reviewed-by: Josef Bacik
Thanks,
Josef
On Thu, Jun 13, 2019 at 05:33:47PM -0700, Tejun Heo wrote:
> From: Chris Mason
>
> Now that we're not using btrfs_schedule_bio() anymore, delete all the
> code that supported it.
>
> Signed-off-by: Chris Mason
Reviewed-by: Josef Bacik
Thanks,
Josef
switch during IO submission,
> and doesn't fit well with the modern blkmq IO stack. So, this commit stops
> using btrfs_schedule_bio(). We may need to adjust the number of async
> helper threads for crcs and compression, but long term it's a better
> path.
>
> Signed-off-by: Chris Mason
Reviewed-by: Josef Bacik
Thanks,
Josef
> Signed-off-by: Tejun Heo
> Cc: Chris Mason
Reviewed-by: Josef Bacik
Thanks,
Josef
On Thu, Jun 13, 2019 at 05:33:44PM -0700, Tejun Heo wrote:
> Add a helper to determine the target blkcg from wbc.
>
> Signed-off-by: Tejun Heo
Reviewed-by: Josef Bacik
Thanks,
Josef
allow disabling wbc accounting. This will be used
> to make btrfs compression work well with cgroup IO control.
>
> Signed-off-by: Tejun Heo
Reviewed-by: Josef Bacik
Thanks,
Josef
prevent recurrences.
>
>
> Pavel Begunkov (2):
> blk-iolatency: Fix zero mean in previous stats
> blk-stats: Introduce explicit stat staging buffers
>
I don't have a problem with this, but it's up to Jens I suppose
Acked-by: Josef Bacik
Thanks,
Josef
On Fri, Jun 14, 2019 at 12:33:43PM +0200, Wouter Verhelst wrote:
> On Thu, Jun 13, 2019 at 10:55:36AM -0400, Josef Bacik wrote:
> > Also I mean that there are a bunch of different nbd servers out there. We
> > have
> > our own here at Facebook, qemu has one, IIRC there'
On Thu, Jun 13, 2019 at 12:35:27PM -0500, Mike Christie wrote:
> On 06/13/2019 12:01 PM, Josef Bacik wrote:
> > On Wed, May 29, 2019 at 03:16:06PM -0500, Mike Christie wrote:
> >> If the device is setup with ioctl we can resize the device after the
> >> initial setup,
On Wed, May 29, 2019 at 03:16:06PM -0500, Mike Christie wrote:
> If the device is setup with ioctl we can resize the device after the
> initial setup, but if the device is setup with netlink we cannot use the
> resize related ioctls and there is no netlink reconfigure size ATTR
> handling code.
>
Christie
> ---
Sorry I missed this second go around
Reviewed-by: Josef Bacik
Thanks,
Josef
On Thu, Jun 13, 2019 at 07:21:43PM +0300, Roman Stratiienko wrote:
> > I don't doubt you have a good reason to want it, I'm just not clear on why
> > an
> > initramfs isn't an option? You have this special kernel with your special
> > option, and you manage to get these things to boot your specia
On Wed, Jun 12, 2019 at 07:31:44PM +0300, roman.stratiie...@globallogic.com
wrote:
> From: Roman Stratiienko
>
> Adding support to nbd to use it as a root device. This code essentially
> provides a minimal nbd-client implementation within the kernel. It opens
> a socket and makes the negotiation
On Thu, Jun 13, 2019 at 05:45:13PM +0300, Roman Stratiienko wrote:
> On Thu, Jun 13, 2019 at 4:52 PM Josef Bacik wrote:
> >
> > On Wed, Jun 12, 2019 at 07:31:44PM +0300, roman.stratiie...@globallogic.com
> > wrote:
> > > From: Roman Stratiienko
> > >
>
On Mon, May 27, 2019 at 01:44:38PM +0800, xiu...@redhat.com wrote:
> From: Xiubo Li
>
> This will allow the blksize to be set zero and then use 1024 as
> default.
>
> Signed-off-by: Xiubo Li
Hmm sorry I missed this somehow
Reviewed-by: Josef Bacik
Thanks,
Josef
On Wed, May 29, 2019 at 03:04:46AM +0800, Yao Liu wrote:
> On Tue, May 28, 2019 at 12:57:59PM -0400, Josef Bacik wrote:
> > On Tue, May 28, 2019 at 02:07:43AM +0800, Yao Liu wrote:
> > > On Fri, May 24, 2019 at 09:07:42AM -0400, Josef Bacik wrote:
> > > > On Fri, Ma
On Wed, May 29, 2019 at 04:08:36PM +0800, xiu...@redhat.com wrote:
> From: Xiubo Li
>
> There is one problem: when trying to check the nbd device
> NBD_CMD_STATUS while at the same time inserting the nbd.ko module,
> we can randomly get some of the 16 /dev/nbd{0~15} connected,
> but they are n
On Tue, May 28, 2019 at 02:07:43AM +0800, Yao Liu wrote:
> On Fri, May 24, 2019 at 09:07:42AM -0400, Josef Bacik wrote:
> > On Fri, May 24, 2019 at 05:43:54PM +0800, Yao Liu wrote:
> > > Some I/O requests that have been sent successfully but have not yet been
> > > r
On Tue, May 28, 2019 at 02:23:23AM +0800, Yao Liu wrote:
> On Fri, May 24, 2019 at 09:08:58AM -0400, Josef Bacik wrote:
> > On Fri, May 24, 2019 at 05:43:55PM +0800, Yao Liu wrote:
> > > Some nbd client implementations have a userland daemon, so we should
> > >
On Fri, May 24, 2019 at 05:43:56PM +0800, Yao Liu wrote:
> When the sock is dead, nbd_read_stat should return an ERR_PTR and then we should
> mark sock as dead and wait for a reconnection if the dead sock is the last
> one, because nbd_xmit_timeout won't resubmit while num_connections <= 1.
num_connection
On Fri, May 24, 2019 at 05:43:55PM +0800, Yao Liu wrote:
> Some nbd client implementations have a userland daemon, so we should
> inform client daemon to clean up and exit.
>
> Signed-off-by: Yao Liu
Except the nbd_disconnected() check is for the case that the client told us
specifically to di
On Fri, May 24, 2019 at 05:43:54PM +0800, Yao Liu wrote:
> Some I/O requests that have been sent successfully but have not yet been
> replied won't be resubmitted after reconnecting because of server restart,
> so we add a list to track them.
>
> Signed-off-by: Yao Liu
Nack, this is what the tim
On Tue, May 21, 2019 at 12:48:14PM -0400, Theodore Ts'o wrote:
> On Mon, May 20, 2019 at 11:15:58AM +0200, Jan Kara wrote:
> > But this makes priority-inversion problems with ext4 journal worse, doesn't
> > it? If we submit journal commit in blkio cgroup of some random process, it
> > may get throt
re. The only
> one use is for cgroup accounting on a split bio, so rename it
> as BIO_SPLITTED.
>
> Cc: Josef Bacik
> Cc: Christoph Hellwig
> Cc: Bart Van Assche
> Signed-off-by: Ming Lei
Reviewed-by: Josef Bacik
Thanks,
Josef
On Wed, Apr 03, 2019 at 12:13:53PM -0500, Adriana Kobylak wrote:
> Adding Josef (updated email address in the maintainers file).
>
> On 2018-12-13 08:21, Adriana Kobylak wrote:
> > On 2018-12-11 00:17, medadyo...@gmail.com wrote:
> > > From: Medad
> > >
> > > If we do NOT clear NBD_BOUND fla
issue.
Signed-off-by: Josef Bacik
---
drivers/block/nbd.c | 14 ++
include/uapi/linux/nbd.h | 2 ++
2 files changed, 16 insertions(+)
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 90ba9f4c03f3..53463217fbe9 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block
On Thu, Mar 07, 2019 at 07:08:31PM +0100, Andrea Righi wrote:
> = Problem =
>
> When sync() is executed from a high-priority cgroup, the process is forced to
> wait the completion of the entire outstanding writeback I/O, even the I/O that
> was originally generated by low-priority cgroups potentia
ng the
> previous behavior by default).
>
> When this flag is enabled any cgroup can write out only dirty pages that
> belong to the cgroup itself (except for the root cgroup that would still
> be able to write out all pages globally).
>
> Signed-off-by: Andrea Righi
Reviewed-by: Josef Bacik
Thanks,
Josef
On Thu, Mar 07, 2019 at 07:08:32PM +0100, Andrea Righi wrote:
> Prevent priority inversion problem when a high-priority blkcg issues a
> sync() and it is forced to wait the completion of all the writeback I/O
> generated by any other low-priority blkcg, causing massive latencies to
> processes that
On Thu, Mar 07, 2019 at 07:08:34PM +0100, Andrea Righi wrote:
> Keep track of the inodes that have been dirtied by each blkcg cgroup and
> make sure that a blkcg issuing a sync() can trigger the writeback + wait
> of only those pages that belong to the cgroup itself.
>
> This behavior is applied o
sted this with a nbd-server that dropped flush
requests to verify that it hung, and then tested with this patch to
verify I got the timeout as expected and the error handling kicked in.
Thanks,
Signed-off-by: Josef Bacik
---
block/blk-core.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/b
On Mon, Feb 11, 2019 at 09:40:29PM +0100, Andrea Righi wrote:
> On Mon, Feb 11, 2019 at 10:39:34AM -0500, Josef Bacik wrote:
> > On Sat, Feb 09, 2019 at 03:07:49PM +0100, Andrea Righi wrote:
> > > This is an attempt to mitigate the priority inversion problem of a
> > >
On Sat, Feb 09, 2019 at 03:07:49PM +0100, Andrea Righi wrote:
> This is an attempt to mitigate the priority inversion problem of a
> high-priority blkcg issuing a sync() and being forced to wait the
> completion of all the writeback I/O generated by any other low-priority
> blkcg, causing massive l
On Tue, Jan 29, 2019 at 07:39:38PM +0100, Andrea Righi wrote:
> On Mon, Jan 28, 2019 at 02:26:20PM -0500, Vivek Goyal wrote:
> > On Mon, Jan 28, 2019 at 06:41:29PM +0100, Andrea Righi wrote:
> > > Hi Vivek,
> > >
> > > sorry for the late reply.
> > >
> > > On Mon, Jan 21, 2019 at 04:47:15PM -0500
On Tue, Jan 22, 2019 at 11:52:06PM -0800, Liu Bo wrote:
> On Fri, Jan 18, 2019 at 5:51 PM Jens Axboe wrote:
> >
> > On 1/18/19 6:39 PM, Liu Bo wrote:
> > > On Fri, Jan 18, 2019 at 8:43 AM Josef Bacik wrote:
> > >>
> > >> On Fri, Jan 18, 2019 at 09:
On Fri, Jan 18, 2019 at 07:44:03PM +0100, Andrea Righi wrote:
> On Fri, Jan 18, 2019 at 11:35:31AM -0500, Josef Bacik wrote:
> > On Fri, Jan 18, 2019 at 11:31:24AM +0100, Andrea Righi wrote:
> > > This is a redesign of my old cgroup-io-throttle controller:
> > > http
On Fri, Jan 18, 2019 at 06:07:45PM +0100, Paolo Valente wrote:
>
>
> > Il giorno 18 gen 2019, alle ore 17:35, Josef Bacik
> > ha scritto:
> >
> > On Fri, Jan 18, 2019 at 11:31:24AM +0100, Andrea Righi wrote:
> >> This is a redesign of my old cgroup-io-th
On Fri, Jan 18, 2019 at 09:28:06AM -0700, Jens Axboe wrote:
> On 1/18/19 9:21 AM, Josef Bacik wrote:
> > On Fri, Jan 18, 2019 at 05:58:18AM -0700, Jens Axboe wrote:
> >> On 1/14/19 12:21 PM, Liu Bo wrote:
> >>> Our test reported the following stack, and vmcore showed
On Fri, Jan 18, 2019 at 11:31:24AM +0100, Andrea Righi wrote:
> This is a redesign of my old cgroup-io-throttle controller:
> https://lwn.net/Articles/330531/
>
> I'm resuming this old patch to point out a problem that I think is still
> not solved completely.
>
> = Problem =
>
> The io.max cont
On Fri, Jan 18, 2019 at 05:58:18AM -0700, Jens Axboe wrote:
> On 1/14/19 12:21 PM, Liu Bo wrote:
> > Our test reported the following stack, and vmcore showed that
> > ->inflight counter is -1.
> >
> > [c9003fcc38d0] __schedule at 8173d95d
> > [c9003fcc3958] schedule at 8173
On Wed, Jan 16, 2019 at 06:31:51PM -0800, Bart Van Assche wrote:
> On 1/16/19 5:40 PM, Omar Sandoval wrote:
> > On Tue, Jan 15, 2019 at 08:40:41AM -0800, Bart Van Assche wrote:
> > > On Tue, 2019-01-01 at 19:13 -0800, Bart Van Assche wrote:
> > > > On 12/4
On Thu, Dec 13, 2018 at 12:59:03PM -0700, Jens Axboe wrote:
> On 12/13/18 12:52 PM, Josef Bacik wrote:
> > On Thu, Dec 13, 2018 at 12:48:11PM -0700, Jens Axboe wrote:
> >> On 12/11/18 4:01 PM, Dennis Zhou wrote:
> >>> The blk-iolatency controller measures the time f
On Thu, Dec 13, 2018 at 12:48:11PM -0700, Jens Axboe wrote:
> On 12/11/18 4:01 PM, Dennis Zhou wrote:
> > The blk-iolatency controller measures the time from rq_qos_throttle() to
> > rq_qos_done_bio() and attributes this time to the first bio that needs
> > to create the request. This means if a bi
There have been a few issues with turning wbt on and off while IO is in
flight, so add a test that just does some random rw IO and has a
background thread that toggles wbt on and off.
Signed-off-by: Josef Bacik
---
tests/block/027 | 47 +++
tests
There are currently no tests to verify wbt is working properly; this patch
fixes that. Simply run a varied workload and measure the read latencies
with wbt off, and then turn it on and verify that the read latencies go
down.
Signed-off-by: Josef Bacik
---
common/fio | 8 +
c
I added the cgroup cleanup code to the main loop, but didn't run any
other tests, so I didn't notice it'll error out if we haven't included the
cgroup helpers. Do this so the cleanup can happen appropriately.
Signed-off-by: Josef Bacik
---
common/rc | 1 +
1 file changed,
.
Signed-off-by: Josef Bacik
---
tests/block/026 | 77 +++--
1 file changed, 47 insertions(+), 30 deletions(-)
diff --git a/tests/block/026 b/tests/block/026
index d56fabfcd880..88113a99bd28 100644
--- a/tests/block/026
+++ b/tests/block/026
ors by accounting time separately in a bio
> adding the field bi_start. If this field is set, the bio should be
> processed by blk-iolatency in rq_qos_done_bio().
>
> [1] https://lore.kernel.org/lkml/20181205171039.73066-1-den...@kernel.org/
>
> Signed-off-by: Dennis Zhou
slow group get throttled until the first cgroup is able
to finish, and then the slow cgroup will be allowed to finish.
Signed-off-by: Josef Bacik
---
tests/block/025 | 133
tests/block/025.out | 1 +
2 files changed, 134 insertions
is always in a clean state.
Signed-off-by: Josef Bacik
---
check | 2 ++
common/rc | 48
2 files changed, 50 insertions(+)
diff --git a/check b/check
index ebd87c097e25..1c9dbc518fa1 100755
--- a/check
+++ b/check
@@ -294,6 +294,8 @@ _cleanup
v1->v2:
- dropped my python library, TIL about jq.
- fixed the spelling mistakes in the test.
-- Original message --
This patchset is to add a test to verify io.latency is working properly, and to
add all the supporting code to run that test.
First is the cgroup2 infrastructure which is fairly s
On Tue, Dec 04, 2018 at 01:35:59PM -0500, Dennis Zhou wrote:
> Every bio is now associated with a blkg putting blkg_get, blkg_try_get,
> and blkg_put on the hot path. Switch over the refcnt in blkg to use
> percpu_ref.
>
> Signed-off-by: Dennis Zhou
> Acked-by: Tejun Heo
by: Dennis Zhou
> Acked-by: Tejun Heo
Reviewed-by: Josef Bacik
Thanks,
Josef
by: Dennis Zhou
> Acked-by: Tejun Heo
Reviewed-by: Josef Bacik
Thanks,
Josef
by: Dennis Zhou
> Acked-by: Tejun Heo
> Reviewed-by: Liu Bo
> ---
Reviewed-by: Josef Bacik
Thanks,
Josef
st_queue *q,
> struct bio *bio)
> {
> - struct blkcg *blkcg;
> struct blkcg_gq *blkg;
> bool throtl = false;
>
> - rcu_read_lock();
> + if (!bio->bi_blkg)
> + bio_associate_blkg(bio);
>
Should we maybe WARN_ON_ONCE() here since this really shouldn't happen?
Otherwise you can add
Reviewed-by: Josef Bacik
Thanks,
Josef
Originally when I wrote io-latency and the rq_qos code to provide a common base
between wbt and io-latency I left out the throttling part. These were basically
the same, but slightly different in both cases. The difference was large
enough, and the code simple enough, that I just copied it into
do their own thing
as appropriate.
Signed-off-by: Josef Bacik
---
block/blk-rq-qos.c | 86 ++
block/blk-rq-qos.h | 6
2 files changed, 92 insertions(+)
diff --git a/block/blk-rq-qos.c b/block/blk-rq-qos.c
index 80f603b76f61..e932ef9d2718
Now that we have rq_qos_wait() in place, convert wbt_wait() over to
using it with its specific callbacks.
Signed-off-by: Josef Bacik
---
block/blk-wbt.c | 65 ++---
1 file changed, 11 insertions(+), 54 deletions(-)
diff --git a/bloc
Now that we have this common helper, convert io-latency over to use it
as well.
Signed-off-by: Josef Bacik
---
block/blk-iolatency.c | 31 ---
1 file changed, 8 insertions(+), 23 deletions(-)
diff --git a/block/blk-iolatency.c b/block/blk-iolatency.c
index
is always in a clean state.
Signed-off-by: Josef Bacik
---
check | 2 ++
common/rc | 48
2 files changed, 50 insertions(+)
diff --git a/check b/check
index ebd87c097e25..1c9dbc518fa1 100755
--- a/check
+++ b/check
@@ -294,6 +294,8 @@ _cleanup
This patchset is to add a test to verify io.latency is working properly, and to
add all the supporting code to run that test.
First is the cgroup2 infrastructure which is fairly straightforward. Just
verifies we have cgroup2, and gives us the helpers to check and make sure we
have the right contr
slow group get throttled until the first cgroup is able
to finish, and then the slow cgroup will be allowed to finish.
Signed-off-by: Josef Bacik
---
tests/block/025 | 133
tests/block/025.out | 1 +
2 files changed, 134 insertions
sults data.
Signed-off-by: Josef Bacik
---
src/FioResultDecoder.py | 64 +
src/fio-key-value.py| 28 ++
2 files changed, 92 insertions(+)
create mode 100644 src/FioResultDecoder.py
create mode 100644 src/fio-key-value.py
On Mon, Nov 26, 2018 at 04:19:33PM -0500, Dennis Zhou wrote:
> Hi everyone,
>
> This is respin of v3 [1] with fixes for the errors reported in [2] and
> [3]. v3 was reverted in [4].
>
> The issue in [3] was that bio->bi_disk->queue and blkg->q were out
> of sync. So when I changed blk_get_rl() to