On Thu, 14 Nov 2019, Mike Snitzer wrote:

> On Thu, Nov 14 2019 at  9:10am -0500,
> Jeffle Xu <[email protected]> wrote:
> 
> > Single thread fio test (read, bs=4k, ioengine=libaio, iodepth=128,
> > numjobs=1) over dm-thin device has poor performance versus bare nvme
> > disk on v5.4.0-rc7.
> > 
> > Further investigation with perf indicates that queue_work_on() consumes
> > over 20% CPU time when doing IO over dm-thin device. The call stack is
> > as follows.
> > 
> > - 46.64% thin_map
> >    + 22.28% queue_work_on
> >    + 12.98% dm_thin_find_block
> >    + 5.07% cell_defer_no_holder
> >    + 2.42% bio_detain.isra.33
> >    + 0.95% inc_all_io_entry.isra.31.part.32
> >      0.80% remap.isra.41
> > 
> > In cell_defer_no_holder(), wakeup_worker() is always called, no matter 
> > whether
> > the cell->bios list is empty or not. In single thread IO model, cell->bios 
> > list
> > is most likely empty. So skip waking up worker thread if cell->bios list is
> > empty.
> > 
> > A significant IO performance of single thread can be seen with this patch.
> > The original IO performance is 445 MiB/s with the fio test previously
> > described, while it is 643 MiB/s after applying the patch, which is a
> > 44% performance improvement.
> > 
> > Signed-off-by: Jeffle Xu <[email protected]>
> 
> 
> Nice find, implementation detail questions inlined below.

Indeed!  

This cherry-picks clean into v4.19.y.  

Is there any reason this would be unsafe in 4.19?

--
Eric Wheeler


> 
> > ---
> >  drivers/md/dm-bio-prison-v1.c |  8 +++++++-
> >  drivers/md/dm-bio-prison-v1.h |  2 +-
> >  drivers/md/dm-thin.c          | 10 ++++++----
> >  3 files changed, 14 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/md/dm-bio-prison-v1.c b/drivers/md/dm-bio-prison-v1.c
> > index b538989..b2a9b8d 100644
> > --- a/drivers/md/dm-bio-prison-v1.c
> > +++ b/drivers/md/dm-bio-prison-v1.c
> > @@ -219,11 +219,17 @@ static void __cell_release_no_holder(struct 
> > dm_bio_prison *prison,
> >  
> >  void dm_cell_release_no_holder(struct dm_bio_prison *prison,
> >                            struct dm_bio_prison_cell *cell,
> > -                          struct bio_list *inmates)
> > +                          struct bio_list *inmates, int *empty)
> >  {
> >     unsigned long flags;
> >  
> >     spin_lock_irqsave(&prison->lock, flags);
> > +   /*
> > +    * The empty flag should represent the list state exactly
> > +    * when the list is merged into @inmates, thus get the
> > +    * list state when prison->lock is held.
> > +    */
> > +   *empty = bio_list_empty(&cell->bios);
> >     __cell_release_no_holder(prison, cell, inmates);
> >     spin_unlock_irqrestore(&prison->lock, flags);
> >  }
> > diff --git a/drivers/md/dm-bio-prison-v1.h b/drivers/md/dm-bio-prison-v1.h
> > index cec52ac..500edbc 100644
> > --- a/drivers/md/dm-bio-prison-v1.h
> > +++ b/drivers/md/dm-bio-prison-v1.h
> > @@ -89,7 +89,7 @@ void dm_cell_release(struct dm_bio_prison *prison,
> >                  struct bio_list *bios);
> >  void dm_cell_release_no_holder(struct dm_bio_prison *prison,
> >                            struct dm_bio_prison_cell *cell,
> > -                          struct bio_list *inmates);
> > +                          struct bio_list *inmates, int *empty);
> >  void dm_cell_error(struct dm_bio_prison *prison,
> >                struct dm_bio_prison_cell *cell, blk_status_t error);
> >  
> > diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
> > index fcd8877..51fd396 100644
> > --- a/drivers/md/dm-thin.c
> > +++ b/drivers/md/dm-thin.c
> > @@ -480,9 +480,9 @@ static void cell_visit_release(struct pool *pool,
> >  
> >  static void cell_release_no_holder(struct pool *pool,
> >                                struct dm_bio_prison_cell *cell,
> > -                              struct bio_list *bios)
> > +                              struct bio_list *bios, int *empty)
> >  {
> > -   dm_cell_release_no_holder(pool->prison, cell, bios);
> > +   dm_cell_release_no_holder(pool->prison, cell, bios, empty);
> >     dm_bio_prison_free_cell(pool->prison, cell);
> >  }
> >  
> > @@ -886,12 +886,14 @@ static void cell_defer_no_holder(struct thin_c *tc, 
> > struct dm_bio_prison_cell *c
> >  {
> >     struct pool *pool = tc->pool;
> >     unsigned long flags;
> > +   int empty;
> >  
> >     spin_lock_irqsave(&tc->lock, flags);
> > -   cell_release_no_holder(pool, cell, &tc->deferred_bio_list);
> > +   cell_release_no_holder(pool, cell, &tc->deferred_bio_list, &empty);
> >     spin_unlock_irqrestore(&tc->lock, flags);
> >  
> > -   wake_worker(pool);
> > +   if (!empty)
> > +           wake_worker(pool);
> >  }
> 
> Think your point is tc->deferred_bio_list may have bios already, before
> the call to cell_release_no_holder, so simply checking if it is empty
> after the cell_release_no_holder call isn't enough?
> 
> But if tc->deferred_bio_list isn't empty then there is work that needs
> to be done, regardless of whether this particular cell created work, so
> I'm left wondering if you first tried simply checking if
> tc->deferred_bio_list is empty after cell_release_no_holder() in
> cell_defer_no_holder?
> 
> Thanks,
> Mike
> 
> --
> dm-devel mailing list
> [email protected]
> https://www.redhat.com/mailman/listinfo/dm-devel
> 
> 


--
dm-devel mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/dm-devel

Reply via email to