On Fri, Feb 22 2019 at 10:22am -0500,
Mike Snitzer <[email protected]> wrote:

> On Fri, Feb 22 2019 at  5:59am -0500,
> Mikulas Patocka <[email protected]> wrote:
> 
> > Hi
> > 
> > I've found out that this patch causes a timeout in the lvm test 
> > shell/activate-missing-segment.sh and some others.
> 
> OK, I can reproduce with:
> make check_local T=shell/activate-missing-segment.sh
> 
> Interestingly, it isn't hung in the kernel; the test can be interrupted.
> And it is hanging at shell/activate-missing-segment.sh's: aux disable_dev "$dev1"
> 
> Other tests with "aux disable_dev" hang too
> (e.g. shell/lvcreate-repair.sh, shell/vgreduce-usage.sh)
> 
> continued below:
> 
> > On Wed, 20 Feb 2019, Mike Snitzer wrote:
> > 
> > > Leverage fact that dm_queue_split() enables use of noclone support even
> > > for targets that require additional splitting (via ti->max_io_len).
> > > 
> > > To achieve this move noclone processing from dm_make_request() to
> > > dm_process_bio() and check if bio->bi_end_io is already set to
> > > noclone_endio.  This also fixes dm_wq_work() to be able to support
> > > noclone bios (because it calls dm_process_bio()).
> > > 
> > > While at it, clean up ->map() return processing to be comparable to
> > > __map_bio() and add some code comments that explain why certain
> > > conditionals exist to prevent the use of noclone.
> > > 
> > > Signed-off-by: Mike Snitzer <[email protected]>
> > > ---
> > >  drivers/md/dm.c | 103 ++++++++++++++++++++++++++++++++++----------------------
> > >  1 file changed, 62 insertions(+), 41 deletions(-)
> > > 
> > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > > index 57919f211acc..d84735be6927 100644
> > > --- a/drivers/md/dm.c
> > > +++ b/drivers/md/dm.c
> > > @@ -1764,14 +1764,75 @@ static blk_qc_t dm_process_bio(struct 
> > > mapped_device *md,
> > >    * If in ->make_request_fn we need to use blk_queue_split(), otherwise
> > >    * queue_limits for abnormal requests (e.g. discard, writesame, etc)
> > >    * won't be imposed.
> > > +  *
> > > +  * For normal READ and WRITE requests we benefit from upfront splitting
> > > +  * via dm_queue_split() because we can then leverage noclone support below.
> > >    */
> > > - if (current->bio_list) {
> > > + if (current->bio_list && (bio->bi_end_io != noclone_endio)) {
> > > +         /*
> > > +          * It is fine to use {blk,dm}_queue_split() before opting to use noclone
> > > +          * because any ->bi_private and ->bi_end_io will get saved off below.
> > > +          * Once noclone_endio() is called the bio_chain's endio will kick in.
> > > +          * - __split_and_process_bio() can split further, but any targets that
> > > +          *   require late _DM_ initiated splitting (e.g. dm_accept_partial_bio()
> > > +          *   callers) shouldn't set ti->no_clone.
> > > +          */
> > >           if (is_abnormal_io(bio))
> > >                   blk_queue_split(md->queue, &bio);
> > >           else
> > >                   dm_queue_split(md, ti, &bio);
> > >   }
> > >  
> > > + if (dm_table_supports_noclone(map) &&
> > > +     (bio_op(bio) == REQ_OP_READ || bio_op(bio) == REQ_OP_WRITE) &&
> > > +     likely(!(bio->bi_opf & REQ_PREFLUSH)) &&
> > > +     likely(!bio_integrity(bio)) && /* integrity requires specialized processing */
> > > +     likely(!dm_stats_used(&md->stats))) { /* noclone doesn't support dm-stats */
> > > +         int r;
> > > +         struct dm_noclone *noclone;
> > > +
> > > +         /* Already been here if bio->bi_end_io is noclone_endio, reentered via dm_wq_work() */
> > > +         if (bio->bi_end_io != noclone_endio) {
> 
> This conditional is causing it ^
> 
> Very strange...

Added some debugging:

Feb 22 10:46:28 thegoat kernel: device-mapper: core: dm_process_bio: 253:2: bio=00000000c70f3c09 bio->bi_iter.bi_sector=0 len=256 bio->bi_end_io != noclone_endio
Feb 22 10:46:28 thegoat kernel: device-mapper: core: dm_process_bio: 253:0: bio=00000000c70f3c09 bio->bi_iter.bi_sector=2048 len=256 bio->bi_end_io = noclone_endio
...
Feb 22 10:46:28 thegoat kernel: device-mapper: core: dm_process_bio: 253:2: bio=00000000da9cadf3 bio->bi_iter.bi_sector=0 len=256 bio->bi_end_io != noclone_endio
Feb 22 10:46:28 thegoat kernel: device-mapper: core: dm_process_bio: 253:0: bio=00000000da9cadf3 bio->bi_iter.bi_sector=2048 len=256 bio->bi_end_io = noclone_endio

So the same noclone bio is entering multiple devices in the DM stack
(perfectly normal).

This fixes the issue:

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index a9856f1986df..8fbe4d5ec8e4 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1831,15 +1831,16 @@ static blk_qc_t dm_process_bio(struct mapped_device *md,
            likely(!bio_integrity(bio)) && /* integrity requires specialized processing */
            likely(!dm_stats_used(&md->stats))) { /* noclone doesn't support dm-stats */
                int r;
-               struct dm_noclone *noclone;
-
-               /* Already been here if bio->bi_end_io is noclone_endio, reentered via dm_wq_work() */
-               if (bio->bi_end_io != noclone_endio) {
+               /*
+                * Only allocate noclone if in ->make_request_fn, otherwise
+                * leak could occur due to reentering (e.g. from dm_wq_work)
+                */
+               if (current->bio_list) {
+                       struct dm_noclone *noclone;
                        noclone = kmalloc_node(sizeof(*noclone) + ti->per_io_data_size + sizeof(unsigned),
                                               GFP_NOWAIT, md->numa_node_id);
                        if (unlikely(!noclone))
                                goto no_fast_path;
-
                        noclone->md = md;
                        noclone->start_time = jiffies;
                        noclone->orig_bi_end_io = bio->bi_end_io;

That said, I've reasoned through other edge cases that need to be dealt
with and will layer other fixes in addition to this one.

--
dm-devel mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/dm-devel