Re: [dm-devel] DM Regression in 4.16-rc1 - read() returns data when it shouldn't

2018-02-15 Thread NeilBrown
On Thu, Feb 15 2018, Milan Broz wrote:

> On 02/15/2018 01:07 AM, NeilBrown wrote:
>> 
>> And looking again at the code, it doesn't seem so bad.  I was worried
>> about reassigning ci.io->orig_bio after calling
>> __split_and_process_non_flush(), but the important thing is to set it
>> before the last dec_pending() call, and the code gets that right.
>> 
>> I think I've found the problem though:
>> 
>>  /* done with normal IO or empty flush */
>>  bio->bi_status = io_error;
>> 
>> When we have chained bios, there are multiple sources for error which
>> need to be merged into bi_status:  all bios that were chained to this
>> one, and this bio itself.
>> 
>> __bio_chain_endio uses:
>> 
>>  if (!parent->bi_status)
>>  parent->bi_status = bio->bi_status;
>> 
>> so the new status won't over-ride the old status (unless there are
>> races), but dm doesn't do that - I guess it didn't have to worry
>> about chains before.
>> 
>> So this:
>> 
>> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
>> index d6de00f367ef..68136806d365 100644
>> --- a/drivers/md/dm.c
>> +++ b/drivers/md/dm.c
>> @@ -903,7 +903,8 @@ static void dec_pending(struct dm_io *io, blk_status_t error)
>>  queue_io(md, bio);
>>  } else {
>>  /* done with normal IO or empty flush */
>> -bio->bi_status = io_error;
>> +if (io_error)
>> +bio->bi_status = io_error;
>>  bio_endio(bio);
>>  }
>>  }
>> 
>
> Hi,
>
> well, it indeed fixes the reported problem, thanks.
> But I just tested the first (dm.c) change above.
>
> The important part is also that if the error hits the bio later, it will
> produce a short read (and not fail the whole operation); this can be easily
> tested if we switch the error and the zero targets in that reproducer:
>   //system("echo -e \'0 1000 error\'\\n\'1000 100 zero\' | dmsetup create test");
>   system("echo -e \'0 1000 zero\'\\n\'1000 100 error\' | dmsetup create test");
>
> So it seems your patch works.

Excellent - thanks.

>
> I am not sure about this part though:
>
>> should fix it.  And __bio_chain_endio should probably be
>> 
>> diff --git a/block/bio.c b/block/bio.c
>> index e1708db48258..ad77140edc6f 100644
>> --- a/block/bio.c
>> +++ b/block/bio.c
>> @@ -312,7 +312,7 @@ static struct bio *__bio_chain_endio(struct bio *bio)
>>  {
>>  struct bio *parent = bio->bi_private;
>>  
>> -if (!parent->bi_status)
>> +if (bio->bi_status)
>>  parent->bi_status = bio->bi_status;
>
> Shouldn't it be
>   if (!parent->bi_status && bio->bi_status)
>   parent->bi_status = bio->bi_status;
> ?

That wouldn't hurt, but I don't see how it would help.
It would still be racy: two chained bios could complete at
the same time, so ->bi_status might change between the test and the set.

>
> Maybe one rephrased question: what about priorities for errors (bio statuses)?
>
> For example, part of a bio fails with EILSEQ (data integrity check error,
> BLK_STS_PROTECTION), part with EIO (media error, BLK_STS_IOERR).
> What should be reported for the parent bio?
> (I would expect to always get EIO here because this is a hard, uncorrectable error.)

If there were a well-defined ordering, then we could easily write code to
ensure the highest-priority status wins, but I don't think there is any
such ordering.
Certainly dec_pending() in dm.c already faces the possibility that it will
be called multiple times with different errors, but it doesn't consider any
ordering (except relating to BLK_STS_DM_REQUEUE) when assigning a
status code to io->status.

>
> Anyway, thanks for the fix!

And thanks for the report.

NeilBrown

>
> Milan
>
>
>>  bio_put(bio);
>>  return parent;
>> 
>> to remove the chance that a race will hide a real error.
>> 
>> Milan: could you please check that the patch fixes the dm-crypt bug?
>> I've confirmed that it fixes your test case.
>> When you confirm, I'll send a proper patch.
>> 
>> I guess I should audit all code that assigns to ->bi_status and make
>> sure nothing ever assigns 0 to it.
>> 
>> Thanks,
>> NeilBrown
>> 
>
> --
> dm-devel mailing list
> dm-de...@redhat.com
> https://www.redhat.com/mailman/listinfo/dm-devel





Re: [dm-devel] DM Regression in 4.16-rc1 - read() returns data when it shouldn't

2018-02-14 Thread Milan Broz
On 02/15/2018 01:07 AM, NeilBrown wrote:
> 
> And looking again at the code, it doesn't seem so bad.  I was worried
> about reassigning ci.io->orig_bio after calling
> __split_and_process_non_flush(), but the important thing is to set it
> before the last dec_pending() call, and the code gets that right.
> 
> I think I've found the problem though:
> 
>   /* done with normal IO or empty flush */
>   bio->bi_status = io_error;
> 
> When we have chained bios, there are multiple sources for error which
> need to be merged into bi_status:  all bios that were chained to this
> one, and this bio itself.
> 
> __bio_chain_endio uses:
> 
>   if (!parent->bi_status)
>   parent->bi_status = bio->bi_status;
> 
> so the new status won't over-ride the old status (unless there are
> races), but dm doesn't do that - I guess it didn't have to worry
> about chains before.
> 
> So this:
> 
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index d6de00f367ef..68136806d365 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -903,7 +903,8 @@ static void dec_pending(struct dm_io *io, blk_status_t error)
>   queue_io(md, bio);
>   } else {
>   /* done with normal IO or empty flush */
> - bio->bi_status = io_error;
> + if (io_error)
> + bio->bi_status = io_error;
>   bio_endio(bio);
>   }
>   }
> 

Hi,

well, it indeed fixes the reported problem, thanks.
But I just tested the first (dm.c) change above.

The important part is also that if the error hits the bio later, it will produce
a short read (and not fail the whole operation); this can be easily tested if
we switch the error and the zero targets in that reproducer:
  //system("echo -e \'0 1000 error\'\\n\'1000 100 zero\' | dmsetup create test");
  system("echo -e \'0 1000 zero\'\\n\'1000 100 error\' | dmsetup create test");

So it seems your patch works.

I am not sure about this part though:

> should fix it.  And __bio_chain_endio should probably be
> 
> diff --git a/block/bio.c b/block/bio.c
> index e1708db48258..ad77140edc6f 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -312,7 +312,7 @@ static struct bio *__bio_chain_endio(struct bio *bio)
>  {
>   struct bio *parent = bio->bi_private;
>  
> - if (!parent->bi_status)
> + if (bio->bi_status)
>   parent->bi_status = bio->bi_status;

Shouldn't it be
if (!parent->bi_status && bio->bi_status)
parent->bi_status = bio->bi_status;
?

Maybe one rephrased question: what about priorities for errors (bio statuses)?

For example, part of a bio fails with EILSEQ (data integrity check error,
BLK_STS_PROTECTION), part with EIO (media error, BLK_STS_IOERR).
What should be reported for the parent bio?
(I would expect to always get EIO here because this is a hard, uncorrectable error.)

Anyway, thanks for the fix!

Milan


>   bio_put(bio);
>   return parent;
> 
> to remove the chance that a race will hide a real error.
> 
> Milan: could you please check that the patch fixes the dm-crypt bug?
> I've confirmed that it fixes your test case.
> When you confirm, I'll send a proper patch.
> 
> I guess I should audit all code that assigns to ->bi_status and make
> sure nothing ever assigns 0 to it.
> 
> Thanks,
> NeilBrown
> 



Re: [dm-devel] DM Regression in 4.16-rc1 - read() returns data when it shouldn't

2018-02-14 Thread NeilBrown
On Wed, Feb 14 2018, Mike Snitzer wrote:

> On Wed, Feb 14 2018 at  3:39pm -0500,
> NeilBrown  wrote:
>
>> On Wed, Feb 14 2018, Milan Broz wrote:
>> 
>> > Hi,
>> >
>> > the commit (found by bisect)
>> >
>> >   commit 18a25da84354c6bb655320de6072c00eda6eb602
>> >   Author: NeilBrown 
>> >   Date:   Wed Sep 6 09:43:28 2017 +1000
>> >
>> > dm: ensure bio submission follows a depth-first tree walk
>> >
>> > causes a serious regression when reading from a DM device.
>> >
>> > The reproducer is below; basically it tries to simulate a failure we see in
>> > the cryptsetup regression test: we have a DM device with error and zero
>> > targets and try to read a "passphrase" from it (it is a test for the 64-bit
>> > offset error path):
>> >
>> > Test device:
>> > # dmsetup table test
>> > 0 1000 error 
>> > 1000 100 zero 
>> >
>> > We try to run this operation:
>> >   lseek64(fd, 511988, SEEK_CUR); // this should seek to the error target sector
>> >   read(fd, buf, 13); // this should fail if we seek to the error part of the device
>> >
>> > While on 4.15 the read properly fails:
>> >   Seek returned 511988.
>> >   Read returned -1.
>> >
>> > for 4.16 it actually succeeds returning some random data
>> > (perhaps kernel memory, so this bug is even more dangerous):
>> >   Seek returned 511988.
>> >   Read returned 13.
>> >
>> > Full reproducer below:
>> >
>> > #define _GNU_SOURCE
>> > #define _LARGEFILE64_SOURCE
>> > #include <stdio.h>
>> > #include <stdlib.h>
>> > #include <stdint.h>
>> > #include <inttypes.h>
>> > #include <unistd.h>
>> > #include <sys/types.h>
>> > #include <fcntl.h>
>> >
>> > int main (int argc, char *argv[])
>> > {
>> > char buf[13];
>> > int fd;
>> > //uint64_t offset64 = 511999;
>> > uint64_t offset64 =   511988;
>> > off64_t r;
>> > ssize_t bytes;
>> >
>> > system("echo -e \'0 1000 error\'\\n\'1000 100 zero\' | dmsetup create test");
>> >
>> > fd = open("/dev/mapper/test", O_RDONLY);
>> > if (fd == -1) {
>> > printf("open fail\n");
>> > return 1;
>> > }
>> >
>> > r = lseek64(fd, offset64, SEEK_CUR);
>> > printf("Seek returned %" PRIu64 ".\n", r);
>> > if (r < 0) {
>> > printf("seek fail\n");
>> > close(fd);
>> > return 2;
>> > }
>> >
>> > bytes = read(fd, buf, 13);
>> > printf("Read returned %d.\n", (int)bytes);
>> >
>> > close(fd);
>> > return 0;
>> > }
>> >
>> >
>> > Please let me know if you need more info to reproduce it.
>> 
>> Thanks for the detailed report.  I haven't tried to reproduce, but the
>> code looks very weird.
>> The patch I posted "Date: Wed, 06 Sep 2017 09:43:28 +1000" had:
>>   +  struct bio *b = bio_clone_bioset(bio, GFP_NOIO,
>>   +                                   md->queue->bio_split);
>>   +  bio_advance(b, (bio_sectors(b) - ci.sector_count) << 9);
>>   +  bio_chain(b, bio);
>>   +  generic_make_request(b);
>>   +  break;
>> 
>> The code in Linux has:
>> 
>>  struct bio *b = bio_clone_bioset(bio, GFP_NOIO,
>>                                   md->queue->bio_split);
>>  ci.io->orig_bio = b;
>>  bio_advance(bio, (bio_sectors(bio) - ci.sector_count) << 9);
>>  bio_chain(b, bio);
>>  ret = generic_make_request(bio);
>>  break;
>> 
>> So the wrong bio is sent to generic_make_request().
>> Mike: do you remember how that change happened?  I think there were
>> discussions following the patch, but I cannot find anything about making
>> that change.
>
> Mikulas had this feedback:
> https://www.redhat.com/archives/dm-devel/2017-November/msg00159.html
>
> You replied with:
> https://www.redhat.com/archives/dm-devel/2017-November/msg00165.html
>
> Where you said "Yes, you are right something like that would be better."
>
> And you then provided a follow-up patch:
> https://www.redhat.com/archives/dm-devel/2017-November/msg00175.html
>
> That we discussed and I said I'd just fold it into the original, and you
> agreed and thanked me for checking with you ;)
> https://www.redhat.com/archives/dm-devel/2017-November/msg00208.html
>
> Anyway, I'm just about to switch to Daddy Daycare mode (need to get my
> daughter up from her nap, feed her dinner, etc) so: I'll circle back to
> this tomorrow morning.
>

Thanks for the reminder - it's coming back to me now.

And looking again at the code, it doesn't seem so bad.  I was worried
about reassigning ci.io->orig_bio after calling
__split_and_process_non_flush(), but the important thing is to set it
before the last dec_pending() call, and the code gets that right.
