Re: [dm-devel] [PATCH 18/19] dm-crypt: check if adding pages to clone bio fails
On 3/30/23 09:17, Yang Shi wrote:
> On Wed, Mar 29, 2023 at 4:49 PM Damien Le Moal wrote:
>>
>> On 3/30/23 02:06, Johannes Thumshirn wrote:
>>> Check if adding pages to clone bio fails and if bail out.
>>
>> Nope. The code retries with direct reclaim until it succeeds. Which is very
>> suspicious...
>
> It is not related to bio_add_page() failure. It is used to avoid a
> race condition when two processes are depleting the mempool
> simultaneously.
>
> IIUC I don't think page merge may happen for dm-crypt, so is
> __bio_add_page() good enough? I'm working on this code too, using
> __bio_add_page() would make my patch easier.

If the BIO was allocated with enough bvecs, we could use that function. But
page merging reduces overhead, so if it can happen, let's use it.

>>> This way we can mark bio_add_pages as __must_check.
>>>
>>> Signed-off-by: Johannes Thumshirn
>>
>> With the commit message fixed,
>>
>> Reviewed-by: Damien Le Moal

-- 
Damien Le Moal
Western Digital Research

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel
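For readers outside the thread, the semantic difference being debated — bio_add_page() can fail when the bvec table is full, while __bio_add_page() assumes the caller guaranteed a free slot — can be sketched as a toy userspace model. The `toy_*` names below are illustrative stand-ins, not the kernel API:

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a bio's bvec table: a capacity fixed at allocation time
 * (like bio_alloc(..., nr_vecs, ...)) and a count of bvecs in use. */
struct toy_bio {
	size_t max_vecs;
	size_t vcnt;
};

/* Models bio_add_page(): returns the length added, or 0 on failure,
 * so the caller must check the result. */
static size_t toy_add_page(struct toy_bio *bio, size_t len)
{
	if (bio->vcnt >= bio->max_vecs)
		return 0;	/* table full */
	bio->vcnt++;
	return len;
}

/* Models __bio_add_page(): no return value; a free slot is a
 * precondition the caller must already have established. */
static void toy_add_page_nofail(struct toy_bio *bio, size_t len)
{
	(void)len;
	assert(bio->vcnt < bio->max_vecs);
	bio->vcnt++;
}
```

In the single-page-to-fresh-bio case (capacity 1, nothing added yet) the add can never fail, which is why the series converts exactly those callers to __bio_add_page() and leaves the checked variant everywhere merging or multi-page adds can occur.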
Re: [dm-devel] [PATCH 18/19] dm-crypt: check if adding pages to clone bio fails
On Wed, Mar 29, 2023 at 4:49 PM Damien Le Moal wrote:
>
> On 3/30/23 02:06, Johannes Thumshirn wrote:
> > Check if adding pages to clone bio fails and if bail out.
>
> Nope. The code retries with direct reclaim until it succeeds. Which is very
> suspicious...

It is not related to bio_add_page() failure. It is used to avoid a
race condition when two processes are depleting the mempool
simultaneously.

IIUC I don't think page merge may happen for dm-crypt, so is
__bio_add_page() good enough? I'm working on this code too, using
__bio_add_page() would make my patch easier.

> > This way we can mark bio_add_pages as __must_check.
> >
> > Signed-off-by: Johannes Thumshirn
>
> With the commit message fixed,
>
> Reviewed-by: Damien Le Moal
Re: [dm-devel] [PATCH 19/19] block: mark bio_add_page as __must_check
On 3/30/23 02:06, Johannes Thumshirn wrote:
> Now that all users of bio_add_page check for the return value, mark
> bio_add_page as __must_check.
>
> Signed-off-by: Johannes Thumshirn

Reviewed-by: Damien Le Moal

-- 
Damien Le Moal
Western Digital Research
Re: [dm-devel] [PATCH 18/19] dm-crypt: check if adding pages to clone bio fails
On 3/30/23 02:06, Johannes Thumshirn wrote:
> Check if adding pages to clone bio fails and if bail out.

Nope. The code retries with direct reclaim until it succeeds. Which is very
suspicious...

> This way we can mark bio_add_pages as __must_check.
>
> Signed-off-by: Johannes Thumshirn

With the commit message fixed,

Reviewed-by: Damien Le Moal

-- 
Damien Le Moal
Western Digital Research
Re: [dm-devel] [PATCH 17/19] md: raid1: check if adding pages to resync bio fails
On 3/30/23 02:06, Johannes Thumshirn wrote:
> Check if adding pages to resync bio fails and if bail out.
>
> As the comment above suggests this cannot happen, WARN if it actually
> happens.
>
> This way we can mark bio_add_pages as __must_check.
>
> Signed-off-by: Johannes Thumshirn
> ---
>  drivers/md/raid1-10.c | 7 ++-
>  drivers/md/raid10.c   | 12 ++--
>  2 files changed, 16 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/md/raid1-10.c b/drivers/md/raid1-10.c
> index e61f6cad4e08..c21b6c168751 100644
> --- a/drivers/md/raid1-10.c
> +++ b/drivers/md/raid1-10.c
> @@ -105,7 +105,12 @@ static void md_bio_reset_resync_pages(struct bio *bio, struct resync_pages *rp,
>  		 * won't fail because the vec table is big
>  		 * enough to hold all these pages
>  		 */
> -		bio_add_page(bio, page, len, 0);
> +		if (WARN_ON(!bio_add_page(bio, page, len, 0))) {

Not sure we really need the WARN_ON here... Nevertheless,

Reviewed-by: Damien Le Moal

> +			bio->bi_status = BLK_STS_RESOURCE;
> +			bio_endio(bio);
> +			return;
> +		}
> +
>  		size -= len;
>  	} while (idx++ < RESYNC_PAGES && size > 0);
>  }
> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> index 6c66357f92f5..5682dba52fd3 100644
> --- a/drivers/md/raid10.c
> +++ b/drivers/md/raid10.c
> @@ -3808,7 +3808,11 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
>  			 * won't fail because the vec table is big enough
>  			 * to hold all these pages
>  			 */
> -			bio_add_page(bio, page, len, 0);
> +			if (WARN_ON(!bio_add_page(bio, page, len, 0))) {
> +				bio->bi_status = BLK_STS_RESOURCE;
> +				bio_endio(bio);
> +				goto giveup;
> +			}
>  		}
>  		nr_sectors += len>>9;
>  		sector_nr += len>>9;
> @@ -4989,7 +4993,11 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr,
>  			 * won't fail because the vec table is big enough
>  			 * to hold all these pages
>  			 */
> -			bio_add_page(bio, page, len, 0);
> +			if (WARN_ON(!bio_add_page(bio, page, len, 0))) {
> +				bio->bi_status = BLK_STS_RESOURCE;
> +				bio_endio(bio);
> +				return sectors_done; /* XXX: is this correct? */
> +			}
>  		}
>  		sector_nr += len >> 9;
>  		nr_sectors += len >> 9;

-- 
Damien Le Moal
Western Digital Research
Re: [dm-devel] [PATCH 16/19] md: raid1: use __bio_add_page for adding single page to bio
On 3/30/23 02:06, Johannes Thumshirn wrote:
> The sync request code uses bio_add_page() to add a page to a newly
> created bio. bio_add_page() can fail, but the return value is never
> checked.
>
> Use __bio_add_page() as adding a single page to a newly created bio is
> guaranteed to succeed.
>
> This brings us a step closer to marking bio_add_page() as __must_check.
>
> Signed-off-by: Johannes Thumshirn

Reviewed-by: Damien Le Moal

-- 
Damien Le Moal
Western Digital Research
Re: [dm-devel] [PATCH 15/19] md: check for failure when adding pages in alloc_behind_master_bio
On 3/30/23 02:06, Johannes Thumshirn wrote:
> alloc_behind_master_bio() can possibly add multiple pages to a bio, but it
> is not checking for the return value of bio_add_page() if adding really
> succeeded.
>
> Check if the page adding succeeded and if not bail out.
>
> Signed-off-by: Johannes Thumshirn

Reviewed-by: Damien Le Moal

-- 
Damien Le Moal
Western Digital Research
Re: [dm-devel] [PATCH 14/19] floppy: use __bio_add_page for adding single page to bio
On 3/30/23 02:06, Johannes Thumshirn wrote:
> The floppy code uses bio_add_page() to add a page to a newly created bio.
> bio_add_page() can fail, but the return value is never checked.
>
> Use __bio_add_page() as adding a single page to a newly created bio is
> guaranteed to succeed.
>
> This brings us a step closer to marking bio_add_page() as __must_check.
>
> Signed-off-by: Johannes Thumshirn

Reviewed-by: Damien Le Moal

-- 
Damien Le Moal
Western Digital Research
Re: [dm-devel] [PATCH 13/19] zram: use __bio_add_page for adding single page to bio
On 3/30/23 02:05, Johannes Thumshirn wrote:
> The zram writeback code uses bio_add_page() to add a page to a newly
> created bio. bio_add_page() can fail, but the return value is never
> checked.
>
> Use __bio_add_page() as adding a single page to a newly created bio is
> guaranteed to succeed.
>
> This brings us a step closer to marking bio_add_page() as __must_check.
>
> Signed-off-by: Johannes Thumshirn

Reviewed-by: Damien Le Moal

-- 
Damien Le Moal
Western Digital Research
Re: [dm-devel] [PATCH 12/19] zonefs: use __bio_add_page for adding single page to bio
On 3/30/23 02:05, Johannes Thumshirn wrote:
> The zonefs superblock reading code uses bio_add_page() to add a page to a
> newly created bio. bio_add_page() can fail, but the return value is
> never checked.
>
> Use __bio_add_page() as adding a single page to a newly created bio is
> guaranteed to succeed.
>
> This brings us a step closer to marking bio_add_page() as __must_check.
>
> Signed-off-by: Johannes Thumshirn

Acked-by: Damien Le Moal

-- 
Damien Le Moal
Western Digital Research
Re: [dm-devel] [PATCH 11/19] gfs: use __bio_add_page for adding single page to bio
On 3/30/23 02:05, Johannes Thumshirn wrote:
> The GFS superblock reading code uses bio_add_page() to add a page to a
> newly created bio. bio_add_page() can fail, but the return value is never
> checked.
>
> Use __bio_add_page() as adding a single page to a newly created bio is
> guaranteed to succeed.
>
> This brings us a step closer to marking bio_add_page() as __must_check.
>
> Signed-off-by: Johannes Thumshirn

Reviewed-by: Damien Le Moal

-- 
Damien Le Moal
Western Digital Research
Re: [dm-devel] [PATCH 10/19] jfs: logmgr: use __bio_add_page to add single page to bio
On 3/30/23 02:05, Johannes Thumshirn wrote:
> The JFS IO code uses bio_add_page() to add a page to a newly created bio.
> bio_add_page() can fail, but the return value is never checked.
>
> Use __bio_add_page() as adding a single page to a newly created bio is
> guaranteed to succeed.
>
> This brings us a step closer to marking bio_add_page() as __must_check.
>
> Signed-off-by: Johannes Thumshirn

Reviewed-by: Damien Le Moal

-- 
Damien Le Moal
Western Digital Research
Re: [dm-devel] [PATCH 09/19] btrfs: raid56: use __bio_add_page to add single page
On 3/30/23 02:05, Johannes Thumshirn wrote:
> The btrfs raid56 sector submission code uses bio_add_page() to add a page
> to a newly created bio. bio_add_page() can fail, but the return value is
> never checked.
>
> Use __bio_add_page() as adding a single page to a newly created bio is
> guaranteed to succeed.
>
> This brings us a step closer to marking bio_add_page() as __must_check.
>
> Signed-off-by: Johannes Thumshirn

Reviewed-by: Damien Le Moal

-- 
Damien Le Moal
Western Digital Research
Re: [dm-devel] [PATCH 08/19] btrfs: repair: use __bio_add_page for adding single page
On 3/30/23 02:05, Johannes Thumshirn wrote:
> The btrfs repair bio submission code uses bio_add_page() to add a page to
> a newly created bio. bio_add_page() can fail, but the return value is
> never checked.
>
> Use __bio_add_page() as adding a single page to a newly created bio is
> guaranteed to succeed.
>
> This brings us a step closer to marking bio_add_page() as __must_check.
>
> Signed-off-by: Johannes Thumshirn

Reviewed-by: Damien Le Moal

-- 
Damien Le Moal
Western Digital Research
Re: [dm-devel] [PATCH 07/19] md: raid5: use __bio_add_page to add single page to new bio
On 3/30/23 02:05, Johannes Thumshirn wrote:
> The raid5-ppl submission code uses bio_add_page() to add a page to a
> newly created bio. bio_add_page() can fail, but the return value is never
> checked. For adding consecutive pages, the return is actually checked and
> a new bio is allocated if adding the page fails.
>
> Use __bio_add_page() as adding a single page to a newly created bio is
> guaranteed to succeed.
>
> This brings us a step closer to marking bio_add_page() as __must_check.
>
> Signed-off-by: Johannes Thumshirn

Reviewed-by: Damien Le Moal

-- 
Damien Le Moal
Western Digital Research
Re: [dm-devel] [PATCH 05/19] md: use __bio_add_page to add single page
On 3/30/23 02:05, Johannes Thumshirn wrote:
> The md-raid superblock writing code uses bio_add_page() to add a page to a
> newly created bio. bio_add_page() can fail, but the return value is never
> checked.
>
> Use __bio_add_page() as adding a single page to a newly created bio is
> guaranteed to succeed.
>
> This brings us a step closer to marking bio_add_page() as __must_check.
>
> Signed-off-by: Johannes Thumshirn

Reviewed-by: Damien Le Moal

-- 
Damien Le Moal
Western Digital Research
Re: [dm-devel] [PATCH 06/19] md: raid5-log: use __bio_add_page to add single page
On 3/30/23 02:05, Johannes Thumshirn wrote:
> The raid5 log metadata submission code uses bio_add_page() to add a page
> to a newly created bio. bio_add_page() can fail, but the return value is
> never checked.
>
> Use __bio_add_page() as adding a single page to a newly created bio is
> guaranteed to succeed.
>
> This brings us a step closer to marking bio_add_page() as __must_check.
>
> Signed-off-by: Johannes Thumshirn

Reviewed-by: Damien Le Moal

-- 
Damien Le Moal
Western Digital Research
Re: [dm-devel] [PATCH 03/19] dm: dm-zoned: use __bio_add_page for adding single metadata page
On 3/30/23 02:05, Johannes Thumshirn wrote:
> dm-zoned uses bio_add_page() for adding a single page to a freshly created
> metadata bio.
>
> Use __bio_add_page() instead as adding a single page to a new bio is
> always guaranteed to succeed.
>
> This brings us a step closer to marking bio_add_page() __must_check.
>
> Signed-off-by: Johannes Thumshirn

Reviewed-by: Damien Le Moal

-- 
Damien Le Moal
Western Digital Research
Re: [dm-devel] [PATCH 04/19] fs: buffer: use __bio_add_page to add single page to bio
On 3/30/23 02:05, Johannes Thumshirn wrote:
> The buffer_head submission code uses bio_add_page() to add a page to a
> newly created bio. bio_add_page() can fail, but the return value is never
> checked.
>
> Use __bio_add_page() as adding a single page to a newly created bio is
> guaranteed to succeed.
>
> This brings us a step closer to marking bio_add_page() as __must_check.
>
> Signed-off-by: Johannes Thumshirn

Reviewed-by: Damien Le Moal

-- 
Damien Le Moal
Western Digital Research
Re: [dm-devel] [PATCH 02/19] drbd: use __bio_add_page to add page to bio
On 3/30/23 02:05, Johannes Thumshirn wrote:
> The drbd code only adds a single page to a newly created bio. So use
> __bio_add_page() to add the page which is guaranteed to succeed in this
> case.
>
> This brings us closer to marking bio_add_page() as __must_check.
>
> Signed-off-by: Johannes Thumshirn

With Matthew's comment addressed,

Reviewed-by: Damien Le Moal

-- 
Damien Le Moal
Western Digital Research
Re: [dm-devel] [PATCH 01/19] swap: use __bio_add_page to add page to bio
On 3/30/23 02:05, Johannes Thumshirn wrote:
> The swap code only adds a single page to a newly created bio. So use
> __bio_add_page() to add the page which is guaranteed to succeed in this
> case.
>
> This brings us closer to marking bio_add_page() as __must_check.
>
> Signed-off-by: Johannes Thumshirn

Reviewed-by: Damien Le Moal

-- 
Damien Le Moal
Western Digital Research
Re: [dm-devel] [PATCH v8 6/9] nvmet: add copy command support for bdev and file ns
Hi Anuj,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on axboe-block/for-next]
[also build test WARNING on linus/master v6.3-rc4 next-20230329]
[cannot apply to device-mapper-dm/for-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Anuj-Gupta/block-Add-copy-offload-support-infrastructure/20230329-162018
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
patch link:    https://lore.kernel.org/r/20230327084103.21601-7-anuj20.g%40samsung.com
patch subject: [PATCH v8 6/9] nvmet: add copy command support for bdev and file ns
config: arm64-randconfig-s041-20230329 (https://download.01.org/0day-ci/archive/20230330/202303300238.vmt9ne37-...@intel.com/config)
compiler: aarch64-linux-gcc (GCC) 12.1.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # apt-get install sparse
        # sparse version: v0.6.4-39-gce1a6720-dirty
        # https://github.com/intel-lab-lkp/linux/commit/f846a8ac40882d9d42532e9e2b43560650ef8510
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Anuj-Gupta/block-Add-copy-offload-support-infrastructure/20230329-162018
        git checkout f846a8ac40882d9d42532e9e2b43560650ef8510
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=arm64 olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=arm64 SHELL=/bin/bash drivers/nvme/target/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot
| Link: https://lore.kernel.org/oe-kbuild-all/202303300238.vmt9ne37-...@intel.com/

sparse warnings: (new ones prefixed by >>)
>> drivers/nvme/target/admin-cmd.c:539:29: sparse: sparse: cast from restricted __le16

vim +539 drivers/nvme/target/admin-cmd.c

   490
   491	static void nvmet_execute_identify_ns(struct nvmet_req *req)
   492	{
   493		struct nvme_id_ns *id;
   494		u16 status;
   495
   496		if (le32_to_cpu(req->cmd->identify.nsid) == NVME_NSID_ALL) {
   497			req->error_loc = offsetof(struct nvme_identify, nsid);
   498			status = NVME_SC_INVALID_NS | NVME_SC_DNR;
   499			goto out;
   500		}
   501
   502		id = kzalloc(sizeof(*id), GFP_KERNEL);
   503		if (!id) {
   504			status = NVME_SC_INTERNAL;
   505			goto out;
   506		}
   507
   508		/* return an all zeroed buffer if we can't find an active namespace */
   509		status = nvmet_req_find_ns(req);
   510		if (status) {
   511			status = 0;
   512			goto done;
   513		}
   514
   515		if (nvmet_ns_revalidate(req->ns)) {
   516			mutex_lock(&req->ns->subsys->lock);
   517			nvmet_ns_changed(req->ns->subsys, req->ns->nsid);
   518			mutex_unlock(&req->ns->subsys->lock);
   519		}
   520
   521		/*
   522		 * nuse = ncap = nsze isn't always true, but we have no way to find
   523		 * that out from the underlying device.
   524		 */
   525		id->ncap = id->nsze =
   526			cpu_to_le64(req->ns->size >> req->ns->blksize_shift);
   527		switch (req->port->ana_state[req->ns->anagrpid]) {
   528		case NVME_ANA_INACCESSIBLE:
   529		case NVME_ANA_PERSISTENT_LOSS:
   530			break;
   531		default:
   532			id->nuse = id->nsze;
   533			break;
   534		}
   535
   536		if (req->ns->bdev)
   537			nvmet_bdev_set_limits(req->ns->bdev, id);
   538		else {
 > 539			id->msrc = (u8)to0based(BIO_MAX_VECS - 1);
   540			id->mssrl = cpu_to_le16(BIO_MAX_VECS <<
   541					(PAGE_SHIFT - SECTOR_SHIFT));
   542			id->mcl = cpu_to_le32(le16_to_cpu(id->mssrl));
   543		}
   544
   545		/*
   546		 * We just provide a single LBA format that matches what the
   547		 * underlying device reports.
   548		 */
   549		id->nlbaf = 0;
   550		id->flbas = 0;
   551
   552		/*
   553
Re: [dm-devel] [PATCH 02/19] drbd: use __bio_add_page to add page to bio
On Wed, Mar 29, 2023 at 10:05:48AM -0700, Johannes Thumshirn wrote:
> +++ b/drivers/block/drbd/drbd_bitmap.c
> @@ -1043,9 +1043,11 @@ static void bm_page_io_async(struct drbd_bm_aio_ctx *ctx, int page_nr) __must_ho
>  	bio = bio_alloc_bioset(device->ldev->md_bdev, 1, op, GFP_NOIO,
>  			&drbd_md_io_bio_set);
>  	bio->bi_iter.bi_sector = on_disk_sector;
> -	/* bio_add_page of a single page to an empty bio will always succeed,
> -	 * according to api. Do we want to assert that? */
> -	bio_add_page(bio, page, len, 0);
> +	/*
> +	 * __bio_add_page of a single page to an empty bio will always succeed,
> +	 * according to api. Do we want to assert that?
> +	 */
> +	__bio_add_page(bio, page, len, 0);

Surely the comment should just be deleted? With no return value to check,
what would you assert?
Re: [dm-devel] dm-clone: Request option to send discard to source device during hydration
On 3/28/23 19:20, Mike Snitzer wrote:
> On Mon, Mar 27 2023 at 4:24P -0400, Gwendal Grignou wrote:
>> On ChromeOS, we are working on migrating file backed loopback devices
>> to thinpool logical volumes using dm-clone on the Chromebook local SSD.
>> Dm-clone hydration workflow is a great fit but the design of dm-clone
>> assumes a read-only source device. Data present in the backing file
>> will be copied to the new logical volume but can be safely deleted only
>> when the hydration process is complete. During migration, some data
>> will be duplicated and usage on the Chromebook SSD will unnecessarily
>> increase.
>>
>> Would it be reasonable to add a discard option when enabling the
>> hydration process to discard data as we go on the source device?
>> 2 implementations are possible:
>> a- add a state to the hydration state machine to ensure a region is
>>    discarded before considering another region.
>> b- a simpler implementation where the discard is sent asynchronously at
>>    the end of a region copy. It may not complete successfully (in case
>>    the device crashes during the hydration for instance), but will
>>    vastly reduce the amount of data left in the source device at the
>>    end of the hydration.
>> I prefer b) as it is easier to implement, but a) is cleaner from a
>> usage point of view.
>
> In general, discards may not complete for any number of reasons. So
> while a) gives you finer-grained potential for space being deallocated,
> b) would likely suffice given that a device crash is pretty unlikely
> (at least I would think). And in the case of file backed loopback
> devices, independent of dm-clone, you can just issue discard(s) to all
> free space after a crash?
>
> However you elect to do it, you'd do well to make it an optional
> "discard_rw_src" (or some better name) feature that is configured when
> you load the dm-clone target.

I agree with Mike, but I also want to note the following.

dm-clone commits its on-disk metadata periodically every second, and
every time a FLUSH or FUA bio is written. This is done to improve
performance. This means the dm-clone device behaves like a physical disk
that has a volatile write cache. If power is lost you may lose some
recent writes, _and_ dm-clone might need to rehydrate some regions.

So, you can't discard a region on the source device after the copy
operation has finished, because then the following scenario will result
in data corruption:

1. dm-clone hydrates a region
2. dm-clone discards the region on the source device, either
   synchronously (a) or asynchronously (b)
3. The system crashes before the metadata is committed
4. The system comes up, and dm-clone rehydrates the region, because it
   thinks it has not been hydrated yet
5. The source device might contain garbage for this region, since we
   discarded it previously
6. You have data corruption

So, you can only discard hydrated regions for which the metadata have
been committed on disk.

I think you could discard hydrated regions on the source device
periodically, right after committing the metadata. dm-clone keeps track
of the regions hydrated during each metadata transaction, so after
committing the metadata for the current transaction, you could also send
an asynchronous discard for these regions.

Nikos.
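The commit-then-discard ordering Nikos describes can be modeled with a tiny state machine: in-core hydration state, on-disk (committed) state, and source validity. This is only a sketch of the invariant, not dm-clone code; all names are invented for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NR_REGIONS 4

static bool hydrated[NR_REGIONS];	/* in-core state (volatile) */
static bool committed[NR_REGIONS];	/* on-disk metadata (survives a crash) */
static bool src_valid[NR_REGIONS] = { true, true, true, true };	/* source data present */

static void hydrate(int r)        { hydrated[r] = true; }
static void commit_metadata(void) { memcpy(committed, hydrated, sizeof(committed)); }

/* The safe policy from the thread: discard on the source only those
 * regions whose hydration has already been committed to disk. */
static void discard_committed_regions(void)
{
	for (int r = 0; r < NR_REGIONS; r++)
		if (committed[r])
			src_valid[r] = false;
}

/* A crash throws away in-core state; recovery reloads the last committed
 * metadata, and uncommitted regions must be rehydrated from the source. */
static void crash_and_recover(void)
{
	memcpy(hydrated, committed, sizeof(hydrated));
}
```

With this ordering, any region that must be rehydrated after a crash still has valid source data, which is exactly why discarding right after the copy (before the commit) would corrupt data in steps 1-6 above.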
[dm-devel] [PATCH 19/19] block: mark bio_add_page as __must_check
Now that all users of bio_add_page check for the return value, mark
bio_add_page as __must_check.

Signed-off-by: Johannes Thumshirn
---
 include/linux/bio.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/bio.h b/include/linux/bio.h
index d766be7152e1..0f8a8d7a6384 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -465,7 +465,7 @@ extern void bio_uninit(struct bio *);
 void bio_reset(struct bio *bio, struct block_device *bdev, blk_opf_t opf);
 void bio_chain(struct bio *, struct bio *);
 
-int bio_add_page(struct bio *, struct page *, unsigned len, unsigned off);
+int __must_check bio_add_page(struct bio *, struct page *, unsigned len, unsigned off);
 bool bio_add_folio(struct bio *, struct folio *, size_t len, size_t off);
 extern int bio_add_pc_page(struct request_queue *, struct bio *,
 		struct page *, unsigned int, unsigned int);
-- 
2.39.2
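For context, the kernel's __must_check expands to GCC/Clang's warn_unused_result function attribute, so after this patch any caller that drops bio_add_page()'s return value gets a compile-time warning. A minimal userspace illustration (the function names here are mine, not the kernel's):

```c
#include <assert.h>

/* Userspace stand-in for the kernel's __must_check annotation. */
#define __must_check __attribute__((warn_unused_result))

/* Shaped like bio_add_page(): returns the number of bytes added,
 * or 0 on failure. */
static __must_check int try_add(int have_room)
{
	return have_room ? 4096 : 0;
}

static int caller(void)
{
	/* try_add(1);  <- ignoring the result here would now trigger a
	 *               -Wunused-result warning at compile time */
	int added = try_add(1);	/* checking the result keeps the build clean */
	return added;
}
```

This is why the series first converts every caller (patches 1-18) and only then flips the annotation in patch 19: otherwise the tree would warn everywhere at once.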
[dm-devel] [PATCH 14/19] floppy: use __bio_add_page for adding single page to bio
The floppy code uses bio_add_page() to add a page to a newly created bio.
bio_add_page() can fail, but the return value is never checked.

Use __bio_add_page() as adding a single page to a newly created bio is
guaranteed to succeed.

This brings us a step closer to marking bio_add_page() as __must_check.

Signed-off-by: Johannes Thumshirn
---
 drivers/block/floppy.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c
index 487840e3564d..6f46a30f7c36 100644
--- a/drivers/block/floppy.c
+++ b/drivers/block/floppy.c
@@ -4147,7 +4147,7 @@ static int __floppy_read_block_0(struct block_device *bdev, int drive)
 	cbdata.drive = drive;
 
 	bio_init(&bio, bdev, &bio_vec, 1, REQ_OP_READ);
-	bio_add_page(&bio, page, block_size(bdev), 0);
+	__bio_add_page(&bio, page, block_size(bdev), 0);
 
 	bio.bi_iter.bi_sector = 0;
 	bio.bi_flags |= (1 << BIO_QUIET);
-- 
2.39.2
[dm-devel] [PATCH 15/19] md: check for failure when adding pages in alloc_behind_master_bio
alloc_behind_master_bio() can possibly add multiple pages to a bio, but it
is not checking for the return value of bio_add_page() if adding really
succeeded.

Check if the page adding succeeded and if not bail out.

Signed-off-by: Johannes Thumshirn
---
 drivers/md/raid1.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 68a9e2d9985b..bd7c339a84a1 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1147,7 +1147,8 @@ static void alloc_behind_master_bio(struct r1bio *r1_bio,
 		if (unlikely(!page))
 			goto free_pages;
 
-		bio_add_page(behind_bio, page, len, 0);
+		if (!bio_add_page(behind_bio, page, len, 0))
+			goto free_pages;
 
 		size -= len;
 		i++;
-- 
2.39.2
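The fix reuses the function's existing `free_pages` unwind label, so a failed page add is cleaned up the same way as a failed allocation. The shape of that pattern, as a self-contained userspace sketch (names and the `fail_at` knob are mine, purely to make the bail-out path testable):

```c
#include <assert.h>
#include <stdlib.h>

/* Alloc-loop/bail-out sketch: "add" pages one at a time and, on the
 * first failure, unwind everything allocated so far.  fail_at simulates
 * bio_add_page() returning 0 at a given index (-1 = no failure). */
static int build_pages(void *pages[], int n, int fail_at)
{
	int i;

	for (i = 0; i < n; i++) {
		pages[i] = malloc(4096);
		if (!pages[i])
			goto free_pages;
		if (i == fail_at)	/* models a failed bio_add_page() */
			goto free_pages;
	}
	return 0;

free_pages:
	while (i >= 0) {	/* free(NULL) is a no-op, so this is safe */
		free(pages[i]);
		pages[i] = NULL;
		i--;
	}
	return -1;
}
```

Keeping one label that frees every page allocated so far is what lets the new `if (!bio_add_page(...))` check stay a two-line change.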
[dm-devel] [PATCH 06/19] md: raid5-log: use __bio_add_page to add single page
The raid5 log metadata submission code uses bio_add_page() to add a page
to a newly created bio. bio_add_page() can fail, but the return value is
never checked.

Use __bio_add_page() as adding a single page to a newly created bio is
guaranteed to succeed.

This brings us a step closer to marking bio_add_page() as __must_check.

Signed-off-by: Johannes Thumshirn
---
 drivers/md/raid5-cache.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index 46182b955aef..852b265c5db4 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -792,7 +792,7 @@ static struct r5l_io_unit *r5l_new_meta(struct r5l_log *log)
 	io->current_bio = r5l_bio_alloc(log);
 	io->current_bio->bi_end_io = r5l_log_endio;
 	io->current_bio->bi_private = io;
-	bio_add_page(io->current_bio, io->meta_page, PAGE_SIZE, 0);
+	__bio_add_page(io->current_bio, io->meta_page, PAGE_SIZE, 0);
 
 	r5_reserve_log_entry(log, io);
-- 
2.39.2
[dm-devel] [PATCH 08/19] btrfs: repair: use __bio_add_page for adding single page
The btrfs repair bio submission code uses bio_add_page() to add a page to
a newly created bio. bio_add_page() can fail, but the return value is
never checked.

Use __bio_add_page() as adding a single page to a newly created bio is
guaranteed to succeed.

This brings us a step closer to marking bio_add_page() as __must_check.

Signed-off-by: Johannes Thumshirn
---
 fs/btrfs/bio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/bio.c b/fs/btrfs/bio.c
index 726592868e9c..73220a219c91 100644
--- a/fs/btrfs/bio.c
+++ b/fs/btrfs/bio.c
@@ -224,7 +224,7 @@ static struct btrfs_failed_bio *repair_one_sector(struct btrfs_bio *failed_bbio,
 	repair_bio = bio_alloc_bioset(NULL, 1, REQ_OP_READ, GFP_NOFS,
 				      &btrfs_repair_bioset);
 	repair_bio->bi_iter.bi_sector = failed_bbio->saved_iter.bi_sector;
-	bio_add_page(repair_bio, bv->bv_page, bv->bv_len, bv->bv_offset);
+	__bio_add_page(repair_bio, bv->bv_page, bv->bv_len, bv->bv_offset);
 
 	repair_bbio = btrfs_bio(repair_bio);
 	btrfs_bio_init(repair_bbio, failed_bbio->inode, NULL, fbio);
-- 
2.39.2
[dm-devel] [PATCH 13/19] zram: use __bio_add_page for adding single page to bio
The zram writeback code uses bio_add_page() to add a page to a newly
created bio. bio_add_page() can fail, but the return value is never
checked.

Use __bio_add_page() as adding a single page to a newly created bio is
guaranteed to succeed.

This brings us a step closer to marking bio_add_page() as __must_check.

Signed-off-by: Johannes Thumshirn
---
 drivers/block/zram/zram_drv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index aa490da3cef2..9179bd0f248c 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -760,7 +760,7 @@ static ssize_t writeback_store(struct device *dev,
 			 REQ_OP_WRITE | REQ_SYNC);
 		bio.bi_iter.bi_sector = blk_idx * (PAGE_SIZE >> 9);
 
-		bio_add_page(&bio, bvec.bv_page, bvec.bv_len,
+		__bio_add_page(&bio, bvec.bv_page, bvec.bv_len,
 			bvec.bv_offset);
 		/*
 		 * XXX: A single page IO would be inefficient for write
-- 
2.39.2
[dm-devel] [PATCH 16/19] md: raid1: use __bio_add_page for adding single page to bio
The sync request code uses bio_add_page() to add a page to a newly created
bio. bio_add_page() can fail, but the return value is never checked.

Use __bio_add_page() as adding a single page to a newly created bio is
guaranteed to succeed.

This brings us a step closer to marking bio_add_page() as __must_check.

Signed-off-by: Johannes Thumshirn
---
 drivers/md/raid1.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index bd7c339a84a1..c226d293992f 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -2915,7 +2915,7 @@ static sector_t raid1_sync_request(struct mddev *mddev, sector_t sector_nr,
 				 * won't fail because the vec table is big
 				 * enough to hold all these pages
 				 */
-				bio_add_page(bio, page, len, 0);
+				__bio_add_page(bio, page, len, 0);
 			}
 		}
 		nr_sectors += len>>9;
-- 
2.39.2
[dm-devel] [PATCH 17/19] md: raid1: check if adding pages to resync bio fails
Check if adding pages to resync bio fails and if bail out.

As the comment above suggests this cannot happen, WARN if it actually
happens.

This way we can mark bio_add_pages as __must_check.

Signed-off-by: Johannes Thumshirn
---
 drivers/md/raid1-10.c | 7 ++-
 drivers/md/raid10.c   | 12 ++--
 2 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/drivers/md/raid1-10.c b/drivers/md/raid1-10.c
index e61f6cad4e08..c21b6c168751 100644
--- a/drivers/md/raid1-10.c
+++ b/drivers/md/raid1-10.c
@@ -105,7 +105,12 @@ static void md_bio_reset_resync_pages(struct bio *bio, struct resync_pages *rp,
 		 * won't fail because the vec table is big
 		 * enough to hold all these pages
 		 */
-		bio_add_page(bio, page, len, 0);
+		if (WARN_ON(!bio_add_page(bio, page, len, 0))) {
+			bio->bi_status = BLK_STS_RESOURCE;
+			bio_endio(bio);
+			return;
+		}
+
 		size -= len;
 	} while (idx++ < RESYNC_PAGES && size > 0);
 }
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 6c66357f92f5..5682dba52fd3 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -3808,7 +3808,11 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
 			 * won't fail because the vec table is big enough
 			 * to hold all these pages
 			 */
-			bio_add_page(bio, page, len, 0);
+			if (WARN_ON(!bio_add_page(bio, page, len, 0))) {
+				bio->bi_status = BLK_STS_RESOURCE;
+				bio_endio(bio);
+				goto giveup;
+			}
 		}
 		nr_sectors += len>>9;
 		sector_nr += len>>9;
@@ -4989,7 +4993,11 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr,
 			 * won't fail because the vec table is big enough
 			 * to hold all these pages
 			 */
-			bio_add_page(bio, page, len, 0);
+			if (WARN_ON(!bio_add_page(bio, page, len, 0))) {
+				bio->bi_status = BLK_STS_RESOURCE;
+				bio_endio(bio);
+				return sectors_done; /* XXX: is this correct? */
+			}
 		}
 		sector_nr += len >> 9;
 		nr_sectors += len >> 9;
-- 
2.39.2
[dm-devel] [PATCH 02/19] drbd: use __bio_add_page to add page to bio
The drbd code only adds a single page to a newly created bio. So use __bio_add_page() to add the page which is guaranteed to succeed in this case. This brings us closer to marking bio_add_page() as __must_check. Signed-off-by: Johannes Thumshirn --- drivers/block/drbd/drbd_bitmap.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/block/drbd/drbd_bitmap.c b/drivers/block/drbd/drbd_bitmap.c index 289876ffbc31..c542dcf8c457 100644 --- a/drivers/block/drbd/drbd_bitmap.c +++ b/drivers/block/drbd/drbd_bitmap.c @@ -1043,9 +1043,11 @@ static void bm_page_io_async(struct drbd_bm_aio_ctx *ctx, int page_nr) __must_ho bio = bio_alloc_bioset(device->ldev->md_bdev, 1, op, GFP_NOIO, &drbd_md_io_bio_set); bio->bi_iter.bi_sector = on_disk_sector; - /* bio_add_page of a single page to an empty bio will always succeed, -* according to api. Do we want to assert that? */ - bio_add_page(bio, page, len, 0); + /* +* __bio_add_page of a single page to an empty bio will always succeed, +* according to api. Do we want to assert that? +*/ + __bio_add_page(bio, page, len, 0); bio->bi_private = ctx; bio->bi_end_io = drbd_bm_endio; -- 2.39.2
[dm-devel] [PATCH 03/19] dm: dm-zoned: use __bio_add_page for adding single metadata page
dm-zoned uses bio_add_page() for adding a single page to a freshly created metadata bio. Use __bio_add_page() instead as adding a single page to a new bio is always guaranteed to succeed. This brings us a step closer to marking bio_add_page() __must_check Signed-off-by: Johannes Thumshirn --- drivers/md/dm-zoned-metadata.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c index cf9402064aba..8dbe102ab271 100644 --- a/drivers/md/dm-zoned-metadata.c +++ b/drivers/md/dm-zoned-metadata.c @@ -577,7 +577,7 @@ static struct dmz_mblock *dmz_get_mblock_slow(struct dmz_metadata *zmd, bio->bi_iter.bi_sector = dmz_blk2sect(block); bio->bi_private = mblk; bio->bi_end_io = dmz_mblock_bio_end_io; - bio_add_page(bio, mblk->page, DMZ_BLOCK_SIZE, 0); + __bio_add_page(bio, mblk->page, DMZ_BLOCK_SIZE, 0); submit_bio(bio); return mblk; @@ -728,7 +728,7 @@ static int dmz_write_mblock(struct dmz_metadata *zmd, struct dmz_mblock *mblk, bio->bi_iter.bi_sector = dmz_blk2sect(block); bio->bi_private = mblk; bio->bi_end_io = dmz_mblock_bio_end_io; - bio_add_page(bio, mblk->page, DMZ_BLOCK_SIZE, 0); + __bio_add_page(bio, mblk->page, DMZ_BLOCK_SIZE, 0); submit_bio(bio); return 0; @@ -752,7 +752,7 @@ static int dmz_rdwr_block(struct dmz_dev *dev, enum req_op op, bio = bio_alloc(dev->bdev, 1, op | REQ_SYNC | REQ_META | REQ_PRIO, GFP_NOIO); bio->bi_iter.bi_sector = dmz_blk2sect(block); - bio_add_page(bio, page, DMZ_BLOCK_SIZE, 0); + __bio_add_page(bio, page, DMZ_BLOCK_SIZE, 0); ret = submit_bio_wait(bio); bio_put(bio); -- 2.39.2 -- dm-devel mailing list dm-devel@redhat.com https://listman.redhat.com/mailman/listinfo/dm-devel
[dm-devel] [PATCH 00/19] bio: check return values of bio_add_page
We have two functions for adding a page to a bio, __bio_add_page() which is used to add a single page to a freshly created bio and bio_add_page() which is used to add a page to an existing bio. While __bio_add_page() is expected to succeed, bio_add_page() can fail. This series converts the callers of bio_add_page() which can easily use __bio_add_page() to using it and checks the return of bio_add_page() for callers that don't work on a freshly created bio. Lastly it marks bio_add_page() as __must_check so we don't have to go again and audit all callers. Johannes Thumshirn (19): swap: use __bio_add_page to add page to bio drbd: use __bio_add_page to add page to bio dm: dm-zoned: use __bio_add_page for adding single metadata page fs: buffer: use __bio_add_page to add single page to bio md: use __bio_add_page to add single page md: raid5-log: use __bio_add_page to add single page md: raid5: use __bio_add_page to add single page to new bio btrfs: repair: use __bio_add_page for adding single page btrfs: raid56: use __bio_add_page to add single page jfs: logmgr: use __bio_add_page to add single page to bio gfs: use __bio_add_page for adding single page to bio zonefs: use __bio_add_page for adding single page to bio zram: use __bio_add_page for adding single page to bio floppy: use __bio_add_page for adding single page to bio md: check for failure when adding pages in alloc_behind_master_bio md: raid1: use __bio_add_page for adding single page to bio md: raid1: check if adding pages to resync bio fails dm-crypt: check if adding pages to clone bio fails block: mark bio_add_page as __must_check drivers/block/drbd/drbd_bitmap.c | 8 +--- drivers/block/floppy.c | 2 +- drivers/block/zram/zram_drv.c| 2 +- drivers/md/dm-crypt.c| 9 - drivers/md/dm-zoned-metadata.c | 6 +++--- drivers/md/md.c | 4 ++-- drivers/md/raid1-10.c| 7 ++- drivers/md/raid1.c | 5 +++-- drivers/md/raid10.c | 12 ++-- drivers/md/raid5-cache.c | 2 +- drivers/md/raid5-ppl.c | 4 ++-- fs/btrfs/bio.c | 2 +- 
fs/btrfs/raid56.c | 2 +- fs/buffer.c | 2 +- fs/gfs2/ops_fstype.c | 2 +- fs/jfs/jfs_logmgr.c | 4 ++-- fs/zonefs/super.c | 2 +- include/linux/bio.h | 2 +- mm/page_io.c | 8 ++++---- 19 files changed, 54 insertions(+), 31 deletions(-) -- 2.39.2
[dm-devel] [PATCH 04/19] fs: buffer: use __bio_add_page to add single page to bio
The buffer_head submission code uses bio_add_page() to add a page to a newly created bio. bio_add_page() can fail, but the return value is never checked. Use __bio_add_page() as adding a single page to a newly created bio is guaranteed to succeed. This brings us a step closer to marking bio_add_page() as __must_check. Signed-off-by: Johannes Thumshirn --- fs/buffer.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/buffer.c b/fs/buffer.c index 9e1e2add541e..855dc41fe162 100644 --- a/fs/buffer.c +++ b/fs/buffer.c @@ -2733,7 +2733,7 @@ static void submit_bh_wbc(blk_opf_t opf, struct buffer_head *bh, bio->bi_iter.bi_sector = bh->b_blocknr * (bh->b_size >> 9); - bio_add_page(bio, bh->b_page, bh->b_size, bh_offset(bh)); + __bio_add_page(bio, bh->b_page, bh->b_size, bh_offset(bh)); BUG_ON(bio->bi_iter.bi_size != bh->b_size); bio->bi_end_io = end_bio_bh_io_sync; -- 2.39.2 -- dm-devel mailing list dm-devel@redhat.com https://listman.redhat.com/mailman/listinfo/dm-devel
[dm-devel] [PATCH 09/19] btrfs: raid56: use __bio_add_page to add single page
The btrfs raid56 sector submission code uses bio_add_page() to add a page to a newly created bio. bio_add_page() can fail, but the return value is never checked. Use __bio_add_page() as adding a single page to a newly created bio is guaranteed to succeed. This brings us a step closer to marking bio_add_page() as __must_check. Signed-off-by: Johannes Thumshirn --- fs/btrfs/raid56.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c index 642828c1b299..c8173e003df6 100644 --- a/fs/btrfs/raid56.c +++ b/fs/btrfs/raid56.c @@ -1108,7 +1108,7 @@ static int rbio_add_io_sector(struct btrfs_raid_bio *rbio, bio->bi_iter.bi_sector = disk_start >> 9; bio->bi_private = rbio; - bio_add_page(bio, sector->page, sectorsize, sector->pgoff); + __bio_add_page(bio, sector->page, sectorsize, sector->pgoff); bio_list_add(bio_list, bio); return 0; } -- 2.39.2
[dm-devel] [PATCH 01/19] swap: use __bio_add_page to add page to bio
The swap code only adds a single page to a newly created bio. So use __bio_add_page() to add the page which is guaranteed to succeed in this case. This brings us closer to marking bio_add_page() as __must_check. Signed-off-by: Johannes Thumshirn --- mm/page_io.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/mm/page_io.c b/mm/page_io.c index 87b682d18850..684cd3c7b59b 100644 --- a/mm/page_io.c +++ b/mm/page_io.c @@ -338,7 +338,7 @@ static void swap_writepage_bdev_sync(struct page *page, bio_init(&bio, sis->bdev, &bv, 1, REQ_OP_WRITE | REQ_SWAP | wbc_to_write_flags(wbc)); bio.bi_iter.bi_sector = swap_page_sector(page); - bio_add_page(&bio, page, thp_size(page), 0); + __bio_add_page(&bio, page, thp_size(page), 0); bio_associate_blkg_from_page(&bio, page); count_swpout_vm_event(page); @@ -360,7 +360,7 @@ static void swap_writepage_bdev_async(struct page *page, GFP_NOIO); bio->bi_iter.bi_sector = swap_page_sector(page); bio->bi_end_io = end_swap_bio_write; - bio_add_page(bio, page, thp_size(page), 0); + __bio_add_page(bio, page, thp_size(page), 0); bio_associate_blkg_from_page(bio, page); count_swpout_vm_event(page); @@ -468,7 +468,7 @@ static void swap_readpage_bdev_sync(struct page *page, bio_init(&bio, sis->bdev, &bv, 1, REQ_OP_READ); bio.bi_iter.bi_sector = swap_page_sector(page); - bio_add_page(&bio, page, thp_size(page), 0); + __bio_add_page(&bio, page, thp_size(page), 0); /* * Keep this task valid during swap readpage because the oom killer may * attempt to access it in the page fault retry time check. @@ -488,7 +488,7 @@ static void swap_readpage_bdev_async(struct page *page, bio = bio_alloc(sis->bdev, 1, REQ_OP_READ, GFP_KERNEL); bio->bi_iter.bi_sector = swap_page_sector(page); bio->bi_end_io = end_swap_bio_read; - bio_add_page(bio, page, thp_size(page), 0); + __bio_add_page(bio, page, thp_size(page), 0); count_vm_event(PSWPIN); submit_bio(bio); } -- 2.39.2
[dm-devel] [PATCH 11/19] gfs: use __bio_add_page for adding single page to bio
The GFS superblock reading code uses bio_add_page() to add a page to a newly created bio. bio_add_page() can fail, but the return value is never checked. Use __bio_add_page() as adding a single page to a newly created bio is guaranteed to succeed. This brings us a step closer to marking bio_add_page() as __must_check. Signed-off-by: Johannes Thumshirn --- fs/gfs2/ops_fstype.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c index 6de901c3b89b..e0cd0d43b12f 100644 --- a/fs/gfs2/ops_fstype.c +++ b/fs/gfs2/ops_fstype.c @@ -254,7 +254,7 @@ static int gfs2_read_super(struct gfs2_sbd *sdp, sector_t sector, int silent) bio = bio_alloc(sb->s_bdev, 1, REQ_OP_READ | REQ_META, GFP_NOFS); bio->bi_iter.bi_sector = sector * (sb->s_blocksize >> 9); - bio_add_page(bio, page, PAGE_SIZE, 0); + __bio_add_page(bio, page, PAGE_SIZE, 0); bio->bi_end_io = end_bio_io_page; bio->bi_private = page; -- 2.39.2 -- dm-devel mailing list dm-devel@redhat.com https://listman.redhat.com/mailman/listinfo/dm-devel
[dm-devel] [PATCH 10/19] jfs: logmgr: use __bio_add_page to add single page to bio
The JFS IO code uses bio_add_page() to add a page to a newly created bio. bio_add_page() can fail, but the return value is never checked. Use __bio_add_page() as adding a single page to a newly created bio is guaranteed to succeed. This brings us a step closer to marking bio_add_page() as __must_check. Signed-off-by: Johannes Thumshirn --- fs/jfs/jfs_logmgr.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/jfs/jfs_logmgr.c b/fs/jfs/jfs_logmgr.c index 695415cbfe98..15c645827dec 100644 --- a/fs/jfs/jfs_logmgr.c +++ b/fs/jfs/jfs_logmgr.c @@ -1974,7 +1974,7 @@ static int lbmRead(struct jfs_log * log, int pn, struct lbuf ** bpp) bio = bio_alloc(log->bdev, 1, REQ_OP_READ, GFP_NOFS); bio->bi_iter.bi_sector = bp->l_blkno << (log->l2bsize - 9); - bio_add_page(bio, bp->l_page, LOGPSIZE, bp->l_offset); + __bio_add_page(bio, bp->l_page, LOGPSIZE, bp->l_offset); BUG_ON(bio->bi_iter.bi_size != LOGPSIZE); bio->bi_end_io = lbmIODone; @@ -2115,7 +2115,7 @@ static void lbmStartIO(struct lbuf * bp) bio = bio_alloc(log->bdev, 1, REQ_OP_WRITE | REQ_SYNC, GFP_NOFS); bio->bi_iter.bi_sector = bp->l_blkno << (log->l2bsize - 9); - bio_add_page(bio, bp->l_page, LOGPSIZE, bp->l_offset); + __bio_add_page(bio, bp->l_page, LOGPSIZE, bp->l_offset); BUG_ON(bio->bi_iter.bi_size != LOGPSIZE); bio->bi_end_io = lbmIODone; -- 2.39.2 -- dm-devel mailing list dm-devel@redhat.com https://listman.redhat.com/mailman/listinfo/dm-devel
[dm-devel] [PATCH 12/19] zonefs: use __bio_add_page for adding single page to bio
The zonefs superblock reading code uses bio_add_page() to add a page to a newly created bio. bio_add_page() can fail, but the return value is never checked. Use __bio_add_page() as adding a single page to a newly created bio is guaranteed to succeed. This brings us a step closer to marking bio_add_page() as __must_check. Signed-off-by: Johannes Thumshirn --- fs/zonefs/super.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c index 23b8b299c64e..9350221abfc5 100644 --- a/fs/zonefs/super.c +++ b/fs/zonefs/super.c @@ -1128,7 +1128,7 @@ static int zonefs_read_super(struct super_block *sb) bio_init(&bio, sb->s_bdev, &bio_vec, 1, REQ_OP_READ); bio.bi_iter.bi_sector = 0; - bio_add_page(&bio, page, PAGE_SIZE, 0); + __bio_add_page(&bio, page, PAGE_SIZE, 0); ret = submit_bio_wait(&bio); if (ret) -- 2.39.2
[dm-devel] [PATCH 05/19] md: use __bio_add_page to add single page
The md-raid superblock writing code uses bio_add_page() to add a page to a newly created bio. bio_add_page() can fail, but the return value is never checked. Use __bio_add_page() as adding a single page to a newly created bio is guaranteed to succeed. This brings us a step closer to marking bio_add_page() as __must_check. Signed-off-by: Johannes Thumshirn --- drivers/md/md.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/md/md.c b/drivers/md/md.c index 39e49e5d7182..e730c3627d00 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -958,7 +958,7 @@ void md_super_write(struct mddev *mddev, struct md_rdev *rdev, atomic_inc(&rdev->nr_pending); bio->bi_iter.bi_sector = sector; - bio_add_page(bio, page, size, 0); + __bio_add_page(bio, page, size, 0); bio->bi_private = rdev; bio->bi_end_io = super_written; @@ -999,7 +999,7 @@ int sync_page_io(struct md_rdev *rdev, sector_t sector, int size, bio.bi_iter.bi_sector = sector + rdev->new_data_offset; else bio.bi_iter.bi_sector = sector + rdev->data_offset; - bio_add_page(&bio, page, size, 0); + __bio_add_page(&bio, page, size, 0); submit_bio_wait(&bio); -- 2.39.2
[dm-devel] [PATCH 18/19] dm-crypt: check if adding pages to clone bio fails
Check if adding pages to the clone bio fails and, if it does, retry the allocation with __GFP_DIRECT_RECLAIM. This way we can mark bio_add_page() as __must_check. Signed-off-by: Johannes Thumshirn --- drivers/md/dm-crypt.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c index 3ba53dc3cc3f..19f7e087c6df 100644 --- a/drivers/md/dm-crypt.c +++ b/drivers/md/dm-crypt.c @@ -1693,7 +1693,14 @@ static struct bio *crypt_alloc_buffer(struct dm_crypt_io *io, unsigned int size) len = (remaining_size > PAGE_SIZE) ? PAGE_SIZE : remaining_size; - bio_add_page(clone, page, len, 0); + if (!bio_add_page(clone, page, len, 0)) { + mempool_free(page, &cc->page_pool); + crypt_free_buffer_pages(cc, clone); + bio_put(clone); + gfp_mask |= __GFP_DIRECT_RECLAIM; + goto retry; + + } remaining_size -= len; } -- 2.39.2
[dm-devel] [PATCH 07/19] md: raid5: use __bio_add_page to add single page to new bio
The raid5-ppl submission code uses bio_add_page() to add a page to a newly created bio. bio_add_page() can fail, but the return value is never checked. For adding consecutive pages, the return is actually checked and a new bio is allocated if adding the page fails. Use __bio_add_page() as adding a single page to a newly created bio is guaranteed to succeed. This brings us a step closer to marking bio_add_page() as __must_check. Signed-off-by: Johannes Thumshirn --- drivers/md/raid5-ppl.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/md/raid5-ppl.c b/drivers/md/raid5-ppl.c index e495939bb3e0..eaea57aee602 100644 --- a/drivers/md/raid5-ppl.c +++ b/drivers/md/raid5-ppl.c @@ -465,7 +465,7 @@ static void ppl_submit_iounit(struct ppl_io_unit *io) bio->bi_end_io = ppl_log_endio; bio->bi_iter.bi_sector = log->next_io_sector; - bio_add_page(bio, io->header_page, PAGE_SIZE, 0); + __bio_add_page(bio, io->header_page, PAGE_SIZE, 0); pr_debug("%s: log->current_io_sector: %llu\n", __func__, (unsigned long long)log->next_io_sector); @@ -496,7 +496,7 @@ static void ppl_submit_iounit(struct ppl_io_unit *io) prev->bi_opf, GFP_NOIO, &ppl_conf->bs); bio->bi_iter.bi_sector = bio_end_sector(prev); - bio_add_page(bio, sh->ppl_page, PAGE_SIZE, 0); + __bio_add_page(bio, sh->ppl_page, PAGE_SIZE, 0); bio_chain(bio, prev); ppl_submit_iounit_bio(io, prev); -- 2.39.2
Re: [dm-devel] [PATCH v8 4/9] fs, block: copy_file_range for def_blk_ops for direct block device.
Hi Anuj, Thank you for the patch! Yet something to improve: [auto build test ERROR on axboe-block/for-next] [also build test ERROR on device-mapper-dm/for-next linus/master v6.3-rc4 next-20230329] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Anuj-Gupta/block-Add-copy-offload-support-infrastructure/20230329-162018 base: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next patch link: https://lore.kernel.org/r/20230327084103.21601-5-anuj20.g%40samsung.com patch subject: [PATCH v8 4/9] fs, block: copy_file_range for def_blk_ops for direct block device. config: x86_64-randconfig-a013 (https://download.01.org/0day-ci/archive/20230329/202303292349.ed70fxdw-...@intel.com/config) compiler: gcc-11 (Debian 11.3.0-8) 11.3.0 reproduce (this is a W=1 build): # https://github.com/intel-lab-lkp/linux/commit/61819d260936954ddd6688548f074e7063dcf39e git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Anuj-Gupta/block-Add-copy-offload-support-infrastructure/20230329-162018 git checkout 61819d260936954ddd6688548f074e7063dcf39e # save the config file mkdir build_dir && cp config build_dir/.config make W=1 O=build_dir ARCH=x86_64 olddefconfig make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash If you fix the issue, kindly add following tag where applicable | Reported-by: kernel test robot | Link: https://lore.kernel.org/oe-kbuild-all/202303292349.ed70fxdw-...@intel.com/ All errors (new ones prefixed by >>): ld: vmlinux.o: in function `generic_copy_file_checks': >> fs/read_write.c:1453: undefined reference to `I_BDEV' vim +1453 fs/read_write.c 1398 1399 /* 1400 * Performs necessary checks before doing a file copy 1401 * 1402 * Can adjust amount of bytes to copy via @req_count argument. 
1403 * Returns appropriate error code that caller should return or 1404 * zero in case the copy should be allowed. 1405 */ 1406 static int generic_copy_file_checks(struct file *file_in, loff_t pos_in, 1407 struct file *file_out, loff_t pos_out, 1408 size_t *req_count, unsigned int flags) 1409 { 1410 struct inode *inode_in = file_inode(file_in); 1411 struct inode *inode_out = file_inode(file_out); 1412 uint64_t count = *req_count; 1413 loff_t size_in; 1414 int ret; 1415 1416 ret = generic_file_rw_checks(file_in, file_out); 1417 if (ret) 1418 return ret; 1419 1420 /* 1421 * We allow some filesystems to handle cross sb copy, but passing 1422 * a file of the wrong filesystem type to filesystem driver can result 1423 * in an attempt to dereference the wrong type of ->private_data, so 1424 * avoid doing that until we really have a good reason. 1425 * 1426 * nfs and cifs define several different file_system_type structures 1427 * and several different sets of file_operations, but they all end up 1428 * using the same ->copy_file_range() function pointer. 1429 */ 1430 if (flags & COPY_FILE_SPLICE) { 1431 /* cross sb splice is allowed */ 1432 } else if (file_out->f_op->copy_file_range) { 1433 if (file_in->f_op->copy_file_range != 1434 file_out->f_op->copy_file_range) 1435 return -EXDEV; 1436 } else if (file_inode(file_in)->i_sb != file_inode(file_out)->i_sb) { 1437 return -EXDEV; 1438 } 1439 1440 /* Don't touch certain kinds of inodes */ 1441 if (IS_IMMUTABLE(inode_out)) 1442 return -EPERM; 1443 1444 if (IS_SWAPFILE(inode_in) || IS_SWAPFILE(inode_out)) 1445 return -ETXTBSY; 1446 1447 /* Ensure offsets don't wrap. 
*/ 1448 if (pos_in + count < pos_in || pos_out + count < pos_out) 1449 return -EOVERFLOW; 1450 1451 /* Shorten the copy to EOF */ 1452 if (S_ISBLK(inode_in->i_mode)) > 1453 size_in = > bdev_nr_bytes(I_BDEV(file_in->f_mapping->host)); 1454 else 1455 size_in = i_size_read(inode_in); 1456 1457 if (pos_in >= size_in) 1458 count = 0; 1459 else 1460 count = min(count, size_in - (uint64_t)pos_in); 1461 1462 ret = generic_write_check_limits(fil
Re: [dm-devel] [PATCH v8 6/9] nvmet: add copy command support for bdev and file ns
Hi Anuj, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on axboe-block/for-next] [also build test WARNING on linus/master v6.3-rc4 next-20230329] [cannot apply to device-mapper-dm/for-next] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Anuj-Gupta/block-Add-copy-offload-support-infrastructure/20230329-162018 base: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next patch link: https://lore.kernel.org/r/20230327084103.21601-7-anuj20.g%40samsung.com patch subject: [PATCH v8 6/9] nvmet: add copy command support for bdev and file ns config: arm64-randconfig-s041-20230329 (https://download.01.org/0day-ci/archive/20230329/202303292148.pbx4mdps-...@intel.com/config) compiler: aarch64-linux-gcc (GCC) 12.1.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # apt-get install sparse # sparse version: v0.6.4-39-gce1a6720-dirty # https://github.com/intel-lab-lkp/linux/commit/f846a8ac40882d9d42532e9e2b43560650ef8510 git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Anuj-Gupta/block-Add-copy-offload-support-infrastructure/20230329-162018 git checkout f846a8ac40882d9d42532e9e2b43560650ef8510 # save the config file mkdir build_dir && cp config build_dir/.config COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=arm64 olddefconfig COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=arm64 SHELL=/bin/bash drivers/nvme/target/ If you fix the issue, kindly add following tag where applicable | Reported-by: kernel test robot | Link: 
https://lore.kernel.org/oe-kbuild-all/202303292148.pbx4mdps-...@intel.com/ sparse warnings: (new ones prefixed by >>) >> drivers/nvme/target/io-cmd-bdev.c:55:27: sparse: sparse: incorrect type in >> argument 1 (different base types) @@ expected unsigned int [usertype] >> val @@ got restricted __le16 [usertype] mssrl @@ drivers/nvme/target/io-cmd-bdev.c:55:27: sparse: expected unsigned int [usertype] val drivers/nvme/target/io-cmd-bdev.c:55:27: sparse: got restricted __le16 [usertype] mssrl >> drivers/nvme/target/io-cmd-bdev.c:55:27: sparse: sparse: cast from >> restricted __le16 >> drivers/nvme/target/io-cmd-bdev.c:55:27: sparse: sparse: cast from >> restricted __le16 >> drivers/nvme/target/io-cmd-bdev.c:55:27: sparse: sparse: cast from >> restricted __le16 >> drivers/nvme/target/io-cmd-bdev.c:55:27: sparse: sparse: cast from >> restricted __le16 drivers/nvme/target/io-cmd-bdev.c:57:29: sparse: sparse: cast from restricted __le16 drivers/nvme/target/io-cmd-bdev.c:60:27: sparse: sparse: incorrect type in argument 1 (different base types) @@ expected unsigned int [usertype] val @@ got restricted __le16 [usertype] mssrl @@ drivers/nvme/target/io-cmd-bdev.c:60:27: sparse: expected unsigned int [usertype] val drivers/nvme/target/io-cmd-bdev.c:60:27: sparse: got restricted __le16 [usertype] mssrl drivers/nvme/target/io-cmd-bdev.c:60:27: sparse: sparse: cast from restricted __le16 drivers/nvme/target/io-cmd-bdev.c:60:27: sparse: sparse: cast from restricted __le16 drivers/nvme/target/io-cmd-bdev.c:60:27: sparse: sparse: cast from restricted __le16 drivers/nvme/target/io-cmd-bdev.c:60:27: sparse: sparse: cast from restricted __le16 vim +55 drivers/nvme/target/io-cmd-bdev.c 12 13 void nvmet_bdev_set_limits(struct block_device *bdev, struct nvme_id_ns *id) 14 { 15 /* Logical blocks per physical block, 0's based. 
*/ 16 const __le16 lpp0b = to0based(bdev_physical_block_size(bdev) / 17bdev_logical_block_size(bdev)); 18 19 /* 20 * For NVMe 1.2 and later, bit 1 indicates that the fields NAWUN, 21 * NAWUPF, and NACWU are defined for this namespace and should be 22 * used by the host for this namespace instead of the AWUN, AWUPF, 23 * and ACWU fields in the Identify Controller data structure. If 24 * any of these fields are zero that means that the corresponding 25 * field from the identify controller data structure should be used. 26 */ 27 id->nsfeat |= 1 << 1; 28 id->nawun = lpp0b; 29 id->nawupf = lpp0b; 30 id
Re: [dm-devel] [PATCH v8 4/9] fs, block: copy_file_range for def_blk_ops for direct block device.
Hi Anuj, Thank you for the patch! Yet something to improve: [auto build test ERROR on axboe-block/for-next] [also build test ERROR on device-mapper-dm/for-next linus/master v6.3-rc4 next-20230329] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Anuj-Gupta/block-Add-copy-offload-support-infrastructure/20230329-162018 base: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next patch link: https://lore.kernel.org/r/20230327084103.21601-5-anuj20.g%40samsung.com patch subject: [PATCH v8 4/9] fs, block: copy_file_range for def_blk_ops for direct block device. config: loongarch-randconfig-r001-20230329 (https://download.01.org/0day-ci/archive/20230329/202303292151.7ddoucit-...@intel.com/config) compiler: loongarch64-linux-gcc (GCC) 12.1.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/intel-lab-lkp/linux/commit/61819d260936954ddd6688548f074e7063dcf39e git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Anuj-Gupta/block-Add-copy-offload-support-infrastructure/20230329-162018 git checkout 61819d260936954ddd6688548f074e7063dcf39e # save the config file mkdir build_dir && cp config build_dir/.config COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=loongarch olddefconfig COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=loongarch SHELL=/bin/bash If you fix the issue, kindly add following tag where applicable | Reported-by: kernel test robot | Link: https://lore.kernel.org/oe-kbuild-all/202303292151.7ddoucit-...@intel.com/ All errors (new ones prefixed by >>): loongarch64-linux-ld: fs/read_write.o: 
in function `.L633': >> read_write.c:(.text+0x42e0): undefined reference to `I_BDEV' -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests -- dm-devel mailing list dm-devel@redhat.com https://listman.redhat.com/mailman/listinfo/dm-devel
Re: [dm-devel] [PATCH v8 1/9] block: Introduce queue limits for copy-offload support
On 3/29/23 19:41, Nitesh Shetty wrote: >>> +What: /sys/block//queue/copy_max_bytes >>> +Date: November 2022 >>> +Contact: linux-bl...@vger.kernel.org >>> +Description: >>> + [RW] While 'copy_max_bytes_hw' is the hardware limit for the >>> + device, 'copy_max_bytes' setting is the software limit. >>> + Setting this value lower will make Linux issue smaller size >>> + copies from block layer. >> >> This is the maximum number of bytes that the block >> layer will allow for a copy request. Must be smaller than >> or equal to the maximum size allowed by the hardware >> indicated > > Looks good. Will update in next version. We took reference from discard. > >> by copy_max_bytes_hw. Write 0 to use the default kernel >> settings. >> > > Nack, writing 0 will not set it to default value. (default value is > copy_max_bytes = copy_max_bytes_hw) It is trivial to make it work that way, which would match how max_sectors_kb works. Write 0 to return copy_max_bytes being equal to the default copy_max_bytes_hw. The other possibility that is also interesting is "write 0 to disable copy offload and use emulation". This one may actually be more useful. > >>> + >>> + >>> +What: /sys/block//queue/copy_max_bytes_hw >>> +Date: November 2022 >>> +Contact: linux-bl...@vger.kernel.org >>> +Description: >>> + [RO] Devices that support offloading copy functionality may have >>> + internal limits on the number of bytes that can be offloaded >>> + in a single operation. The `copy_max_bytes_hw` >>> + parameter is set by the device driver to the maximum number of >>> + bytes that can be copied in a single operation. Copy >>> + requests issued to the device must not exceed this limit. >>> + A value of 0 means that the device does not >>> + support copy offload. >> >> [RO] This is the maximum number of kilobytes supported in a >> single data copy offload operation. A value of 0 means that >> the >> device does not support copy offload. >> > > Nack, value is in bytes. Same as discard. Typo. 
I meant Bytes. Your text is too long and too convoluted, so unclear. -- Damien Le Moal Western Digital Research
Re: [dm-devel] [PATCH v8 9/9] null_blk: add support for copy offload
On 3/27/23 17:40, Anuj Gupta wrote: > From: Nitesh Shetty > > Implementation is based on existing read and write infrastructure. > > Suggested-by: Damien Le Moal > Signed-off-by: Anuj Gupta > Signed-off-by: Nitesh Shetty > Signed-off-by: Vincent Fu > --- > drivers/block/null_blk/main.c | 94 +++ > drivers/block/null_blk/null_blk.h | 7 +++ > 2 files changed, 101 insertions(+) > > diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c > index 9e6b032c8ecc..84c5fbcd67a5 100644 > --- a/drivers/block/null_blk/main.c > +++ b/drivers/block/null_blk/main.c > @@ -1257,6 +1257,81 @@ static int null_transfer(struct nullb *nullb, struct > page *page, > return err; > } > > +static inline int nullb_setup_copy_read(struct nullb *nullb, > + struct bio *bio) > +{ > + struct nullb_copy_token *token = bvec_kmap_local(&bio->bi_io_vec[0]); > + > + memcpy(token->subsys, "nullb", 5); > + token->sector_in = bio->bi_iter.bi_sector; > + token->nullb = nullb; > + token->sectors = bio->bi_iter.bi_size >> SECTOR_SHIFT; > + > + return 0; > +} > + > +static inline int nullb_setup_copy_write(struct nullb *nullb, > + struct bio *bio, bool is_fua) > +{ > + struct nullb_copy_token *token = bvec_kmap_local(&bio->bi_io_vec[0]); > + sector_t sector_in, sector_out; > + void *in, *out; > + size_t rem, temp; > + unsigned long offset_in, offset_out; > + struct nullb_page *t_page_in, *t_page_out; > + int ret = -EIO; > + > + if (unlikely(memcmp(token->subsys, "nullb", 5))) > + return -EOPNOTSUPP; > + if (unlikely(token->nullb != nullb)) > + return -EOPNOTSUPP; > + if (WARN_ON(token->sectors != bio->bi_iter.bi_size >> SECTOR_SHIFT)) > + return -EOPNOTSUPP; EOPNOTSUPP is strange. These are EINVAL, no?
> + > + sector_in = token->sector_in; > + sector_out = bio->bi_iter.bi_sector; > + rem = token->sectors << SECTOR_SHIFT; > + > + spin_lock_irq(>lock); > + while (rem > 0) { > + temp = min_t(size_t, nullb->dev->blocksize, rem); > + offset_in = (sector_in & SECTOR_MASK) << SECTOR_SHIFT; > + offset_out = (sector_out & SECTOR_MASK) << SECTOR_SHIFT; > + > + if (null_cache_active(nullb) && !is_fua) > + null_make_cache_space(nullb, PAGE_SIZE); > + > + t_page_in = null_lookup_page(nullb, sector_in, false, > + !null_cache_active(nullb)); > + if (!t_page_in) > + goto err; > + t_page_out = null_insert_page(nullb, sector_out, > + !null_cache_active(nullb) || is_fua); > + if (!t_page_out) > + goto err; > + > + in = kmap_local_page(t_page_in->page); > + out = kmap_local_page(t_page_out->page); > + > + memcpy(out + offset_out, in + offset_in, temp); > + kunmap_local(out); > + kunmap_local(in); > + __set_bit(sector_out & SECTOR_MASK, t_page_out->bitmap); > + > + if (is_fua) > + null_free_sector(nullb, sector_out, true); > + > + rem -= temp; > + sector_in += temp >> SECTOR_SHIFT; > + sector_out += temp >> SECTOR_SHIFT; > + } > + > + ret = 0; > +err: > + spin_unlock_irq(>lock); > + return ret; > +} > + > static int null_handle_rq(struct nullb_cmd *cmd) > { > struct request *rq = cmd->rq; > @@ -1267,6 +1342,14 @@ static int null_handle_rq(struct nullb_cmd *cmd) > struct req_iterator iter; > struct bio_vec bvec; > > + if (rq->cmd_flags & REQ_COPY) { > + if (op_is_write(req_op(rq))) > + return nullb_setup_copy_write(nullb, rq->bio, > + rq->cmd_flags & REQ_FUA); > + else No need for this else. 
> + return nullb_setup_copy_read(nullb, rq->bio); > + } > + > spin_lock_irq(>lock); > rq_for_each_segment(bvec, rq, iter) { > len = bvec.bv_len; > @@ -1294,6 +1377,14 @@ static int null_handle_bio(struct nullb_cmd *cmd) > struct bio_vec bvec; > struct bvec_iter iter; > > + if (bio->bi_opf & REQ_COPY) { > + if (op_is_write(bio_op(bio))) > + return nullb_setup_copy_write(nullb, bio, > + bio->bi_opf & REQ_FUA); > + else No need for this else. > + return nullb_setup_copy_read(nullb, bio); > + } > + > spin_lock_irq(>lock); > bio_for_each_segment(bvec, bio, iter) { > len = bvec.bv_len; > @@ -2146,6 +2237,9 @@ static int null_add_dev(struct nullb_device *dev) > list_add_tail(>list, _list); > mutex_unlock(); > > + blk_queue_max_copy_sectors_hw(nullb->disk->queue, 1024); > +
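On the "no need for this else" review comments: when the first branch ends in a return, the second branch can simply fall through. A minimal sketch of the suggested shape, with setup_copy_write()/setup_copy_read() as hypothetical stand-ins for the null_blk helpers:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for nullb_setup_copy_write()/_read(). */
static int setup_copy_write(void) { return 1; }
static int setup_copy_read(void)  { return 2; }

/*
 * Shape suggested in review: once the write branch returns, the read
 * call needs no "else" -- it is the natural fall-through path.
 */
static int handle_copy(bool is_write)
{
	if (is_write)
		return setup_copy_write();
	return setup_copy_read();
}
```

The same flattening applies at both call sites, null_handle_rq() and null_handle_bio().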
Re: [dm-devel] [PATCH v8 7/9] dm: Add support for copy offload.
On 3/27/23 17:40, Anuj Gupta wrote: > From: Nitesh Shetty > Drop the period at the end of the patch title. > Before enabling copy for dm target, check if underlying devices and > dm target support copy. Avoid split happening inside dm target. > Fail early if the request needs split, currently splitting copy > request is not supported. > > Signed-off-by: Nitesh Shetty > --- > drivers/md/dm-table.c | 42 +++ > drivers/md/dm.c | 7 ++ > include/linux/device-mapper.h | 5 + > 3 files changed, 54 insertions(+) > > diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c > index 7899f5fb4c13..45e894b9e3be 100644 > --- a/drivers/md/dm-table.c > +++ b/drivers/md/dm-table.c > @@ -1863,6 +1863,39 @@ static bool dm_table_supports_nowait(struct dm_table > *t) > return true; > } > > +static int device_not_copy_capable(struct dm_target *ti, struct dm_dev *dev, > + sector_t start, sector_t len, void *data) > +{ > + struct request_queue *q = bdev_get_queue(dev->bdev); > + > + return !blk_queue_copy(q); > +} > + > +static bool dm_table_supports_copy(struct dm_table *t) > +{ > + struct dm_target *ti; > + unsigned int i; > + > + for (i = 0; i < t->num_targets; i++) { > + ti = dm_table_get_target(t, i); > + > + if (!ti->copy_offload_supported) > + return false; > + > + /* > + * target provides copy support (as implied by setting > + * 'copy_offload_supported') > + * and it relies on _all_ data devices having copy support. > + */ > + if (!ti->type->iterate_devices || > + ti->type->iterate_devices(ti, > + device_not_copy_capable, NULL)) > + return false; > + } > + > + return true; > +} > + > static int device_not_discard_capable(struct dm_target *ti, struct dm_dev > *dev, > sector_t start, sector_t len, void *data) > { > @@ -1945,6 +1978,15 @@ int dm_table_set_restrictions(struct dm_table *t, > struct request_queue *q, > q->limits.discard_misaligned = 0; > } > > + if (!dm_table_supports_copy(t)) { > + blk_queue_flag_clear(QUEUE_FLAG_COPY, q); > + /* Must also clear copy limits... 
*/ Not a useful comment. The code is clear. > + q->limits.max_copy_sectors = 0; > + q->limits.max_copy_sectors_hw = 0; > + } else { > + blk_queue_flag_set(QUEUE_FLAG_COPY, q); > + } > + > if (!dm_table_supports_secure_erase(t)) > q->limits.max_secure_erase_sectors = 0; > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c > index 2d0f934ba6e6..08ec51000af8 100644 > --- a/drivers/md/dm.c > +++ b/drivers/md/dm.c > @@ -1693,6 +1693,13 @@ static blk_status_t __split_and_process_bio(struct > clone_info *ci) > if (unlikely(ci->is_abnormal_io)) > return __process_abnormal_io(ci, ti); > > + if ((unlikely(op_is_copy(ci->bio->bi_opf)) && > + max_io_len(ti, ci->sector) < ci->sector_count)) { > + DMERR("Error, IO size(%u) > max target size(%llu)\n", > + ci->sector_count, max_io_len(ti, ci->sector)); > + return BLK_STS_IOERR; > + } > + > /* >* Only support bio polling for normal IO, and the target io is >* exactly inside the dm_io instance (verified in dm_poll_dm_io) > diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h > index 7975483816e4..44969a20295e 100644 > --- a/include/linux/device-mapper.h > +++ b/include/linux/device-mapper.h > @@ -380,6 +380,11 @@ struct dm_target { >* bio_set_dev(). NOTE: ideally a target should _not_ need this. >*/ > bool needs_bio_set_dev:1; > + > + /* > + * copy offload is supported > + */ > + bool copy_offload_supported:1; > }; > > void *dm_per_bio_data(struct bio *bio, size_t data_size); -- Damien Le Moal Western Digital Research -- dm-devel mailing list dm-devel@redhat.com https://listman.redhat.com/mailman/listinfo/dm-devel
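The dm_table_supports_copy() logic quoted above reduces to an all-or-nothing check: every target must have set copy_offload_supported and every underlying device must be copy capable, otherwise the queue flag and limits are cleared. A hypothetical userspace model of that check (table_supports_copy() and its boolean arrays are illustrative only):

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model of dm_table_supports_copy(): the table advertises
 * copy offload only if every target opted in and every underlying
 * device reports copy capability.
 */
static bool table_supports_copy(const bool *target_ok, const bool *dev_ok,
				size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!target_ok[i] || !dev_ok[i])
			return false;
	return true;
}
```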
Re: [dm-devel] [PATCH v8 2/9] block: Add copy offload support infrastructure
On 3/27/23 17:40, Anuj Gupta wrote: > From: Nitesh Shetty > > Introduce blkdev_issue_copy which takes similar arguments as > copy_file_range and performs copy offload between two bdevs. > Introduce REQ_COPY copy offload operation flag. Create a read-write > bio pair with a token as payload and submitted to the device in order. > Read request populates token with source specific information which > is then passed with write request. > This design is courtesy Mikulas Patocka's token based copy > > Larger copy will be divided, based on max_copy_sectors limit. > > Signed-off-by: Nitesh Shetty > Signed-off-by: Anuj Gupta > --- > block/blk-lib.c | 236 ++ > block/blk.h | 2 + > include/linux/blk_types.h | 25 > include/linux/blkdev.h| 3 + > 4 files changed, 266 insertions(+) > > diff --git a/block/blk-lib.c b/block/blk-lib.c > index e59c3069e835..cbc6882d1e7a 100644 > --- a/block/blk-lib.c > +++ b/block/blk-lib.c > @@ -115,6 +115,242 @@ int blkdev_issue_discard(struct block_device *bdev, > sector_t sector, > } > EXPORT_SYMBOL(blkdev_issue_discard); > > +/* > + * For synchronous copy offload/emulation, wait and process all in-flight > BIOs. > + * This must only be called once all bios have been issued so that the > refcount > + * can only decrease. This just waits for all bios to make it through > + * bio_copy_*_write_end_io. IO errors are propagated through cio->io_error. > + */ > +static int cio_await_completion(struct cio *cio) Why not simply cio_wait_completion() ? > +{ > + int ret = 0; No need for the initialization. > + > + if (cio->endio) > + return 0; > + > + if (atomic_read(>refcount)) { > + __set_current_state(TASK_UNINTERRUPTIBLE); > + blk_io_schedule(); > + } > + > + ret = cio->comp_len; What about cio->io_error as the top comment mentions ? Looking at struct cio, there is no io_error field. 
> + kfree(cio); > + > + return ret; > +} > + > +static void blk_copy_offload_write_end_io(struct bio *bio) > +{ > + struct copy_ctx *ctx = bio->bi_private; > + struct cio *cio = ctx->cio; > + sector_t clen; > + > + if (bio->bi_status) { > + clen = (bio->bi_iter.bi_sector << SECTOR_SHIFT) - cio->pos_out; > + cio->comp_len = min_t(sector_t, clen, cio->comp_len); I do not understand. You set the length only if there was an error ? What about the OK case ? Is this to handle partial completions ? If yes, then please add a comment. > + } > + __free_page(bio->bi_io_vec[0].bv_page); > + bio_put(bio); > + > + kfree(ctx); > + if (atomic_dec_and_test(>refcount)) { Reverse this condition and return early. > + if (cio->endio) { > + cio->endio(cio->private, cio->comp_len); > + kfree(cio); > + } else > + blk_wake_io_task(cio->waiter); Missing curly braces for the else. > + } > +} > + > +static void blk_copy_offload_read_end_io(struct bio *read_bio) > +{ > + struct copy_ctx *ctx = read_bio->bi_private; > + struct cio *cio = ctx->cio; > + sector_t clen; > + > + if (read_bio->bi_status) { > + clen = (read_bio->bi_iter.bi_sector << SECTOR_SHIFT) > + - cio->pos_in; > + cio->comp_len = min_t(sector_t, clen, cio->comp_len); > + __free_page(read_bio->bi_io_vec[0].bv_page); > + bio_put(ctx->write_bio); > + bio_put(read_bio); > + kfree(ctx); > + if (atomic_dec_and_test(>refcount)) { > + if (cio->endio) { > + cio->endio(cio->private, cio->comp_len); > + kfree(cio); > + } else > + blk_wake_io_task(cio->waiter); Missing curly braces for the else. > + } > + return; So you do all this only if there was an error ? What about an OK bio completion (bi_status == BLK_STS_OK) ? 
> + } > + > + schedule_work(>dispatch_work); > + bio_put(read_bio); > +} > + > +static void blk_copy_dispatch_work_fn(struct work_struct *work) > +{ > + struct copy_ctx *ctx = container_of(work, struct copy_ctx, > + dispatch_work); > + > + submit_bio(ctx->write_bio); > +} > + > +/* > + * __blk_copy_offload- Use device's native copy offload feature. > + * we perform copy operation by sending 2 bio. > + * 1. First we send a read bio with REQ_COPY flag along with a token and > source > + * and length. Once read bio reaches driver layer, device driver adds all the > + * source info to token and does a fake completion. > + * 2. Once read operation completes, we issue write with REQ_COPY flag with > same > + * token. In driver layer, token info is used to form a copy offload command. > + * > + * returns the length of bytes copied or negative error
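The review above asks to reverse the refcount condition and return early, and to put braces on the else. A hedged model of what the flattened completion step could look like (finish_copy() and its return codes are invented here; the real code would call cio->endio() or blk_wake_io_task() where the comments indicate):

```c
#include <stdbool.h>

/*
 * Hypothetical model of the write end_io completion logic, flattened
 * as suggested in review: early return while other references remain,
 * braces on both branches of the final if/else.
 * Returns 0 if not the last reference, 1 if the endio callback would
 * run, 2 if the waiting task would be woken.
 */
static int finish_copy(int refs_after_put, bool has_endio)
{
	if (refs_after_put > 0)
		return 0;		/* not last: nothing more to do */

	if (has_endio) {
		return 1;		/* cio->endio(cio->private, cio->comp_len) */
	} else {
		return 2;		/* blk_wake_io_task(cio->waiter) */
	}
}
```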
Re: [dm-devel] [PATCH v8 1/9] block: Introduce queue limits for copy-offload support
On 3/27/23 17:40, Anuj Gupta wrote: > From: Nitesh Shetty > > Add device limits as sysfs entries, > - copy_offload (RW) > - copy_max_bytes (RW) > - copy_max_bytes_hw (RO) > > Above limits help to split the copy payload in block layer. > copy_offload: used for setting copy offload(1) or emulation(0). > copy_max_bytes: maximum total length of copy in single payload. > copy_max_bytes_hw: Reflects the device supported maximum limit. > > Reviewed-by: Hannes Reinecke > Signed-off-by: Nitesh Shetty > Signed-off-by: Kanchan Joshi > Signed-off-by: Anuj Gupta > --- > Documentation/ABI/stable/sysfs-block | 36 > block/blk-settings.c | 24 +++ > block/blk-sysfs.c| 64 > include/linux/blkdev.h | 12 ++ > include/uapi/linux/fs.h | 3 ++ > 5 files changed, 139 insertions(+) > > diff --git a/Documentation/ABI/stable/sysfs-block > b/Documentation/ABI/stable/sysfs-block > index c57e5b7cb532..f5c56ad91ad6 100644 > --- a/Documentation/ABI/stable/sysfs-block > +++ b/Documentation/ABI/stable/sysfs-block > @@ -155,6 +155,42 @@ Description: > last zone of the device which may be smaller. > > > +What:/sys/block//queue/copy_offload > +Date:November 2022 > +Contact: linux-bl...@vger.kernel.org > +Description: > + [RW] When read, this file shows whether offloading copy to > + device is enabled (1) or disabled (0). Writing '0' to this ...to a device... > + file will disable offloading copies for this device. > + Writing any '1' value will enable this feature. If device If the device does... > + does not support offloading, then writing 1, will result in > + error. > + > + > +What:/sys/block//queue/copy_max_bytes > +Date:November 2022 > +Contact: linux-bl...@vger.kernel.org > +Description: > + [RW] While 'copy_max_bytes_hw' is the hardware limit for the > + device, 'copy_max_bytes' setting is the software limit. > + Setting this value lower will make Linux issue smaller size > + copies from block layer. This is the maximum number of bytes that the block layer will allow for a copy request. 
Must be smaller than or equal to the maximum size allowed by the hardware indicated by copy_max_bytes_hw. Write 0 to use the default kernel settings. > + > + > +What:/sys/block//queue/copy_max_bytes_hw > +Date:November 2022 > +Contact: linux-bl...@vger.kernel.org > +Description: > + [RO] Devices that support offloading copy functionality may have > + internal limits on the number of bytes that can be offloaded > + in a single operation. The `copy_max_bytes_hw` > + parameter is set by the device driver to the maximum number of > + bytes that can be copied in a single operation. Copy > + requests issued to the device must not exceed this limit. > + A value of 0 means that the device does not > + support copy offload. [RO] This is the maximum number of kilobytes supported in a single data copy offload operation. A value of 0 means that the device does not support copy offload. > + > + > What:/sys/block//queue/crypto/ > Date:February 2022 > Contact: linux-bl...@vger.kernel.org > diff --git a/block/blk-settings.c b/block/blk-settings.c > index 896b4654ab00..350f3584f691 100644 > --- a/block/blk-settings.c > +++ b/block/blk-settings.c > @@ -59,6 +59,8 @@ void blk_set_default_limits(struct queue_limits *lim) > lim->zoned = BLK_ZONED_NONE; > lim->zone_write_granularity = 0; > lim->dma_alignment = 511; > + lim->max_copy_sectors_hw = 0; > + lim->max_copy_sectors = 0; > } > > /** > @@ -82,6 +84,8 @@ void blk_set_stacking_limits(struct queue_limits *lim) > lim->max_dev_sectors = UINT_MAX; > lim->max_write_zeroes_sectors = UINT_MAX; > lim->max_zone_append_sectors = UINT_MAX; > + lim->max_copy_sectors_hw = ULONG_MAX; > + lim->max_copy_sectors = ULONG_MAX; > } > EXPORT_SYMBOL(blk_set_stacking_limits); > > @@ -183,6 +187,22 @@ void blk_queue_max_discard_sectors(struct request_queue > *q, > } > EXPORT_SYMBOL(blk_queue_max_discard_sectors); > > +/** > + * blk_queue_max_copy_sectors_hw - set max sectors for a single copy payload > + * @q: the request queue for the device > + * 
@max_copy_sectors: maximum number of sectors to copy > + **/ > +void blk_queue_max_copy_sectors_hw(struct request_queue *q, > + unsigned int max_copy_sectors) > +{ > + if (max_copy_sectors >= MAX_COPY_TOTAL_LENGTH) Confusing name as LENGTH may be
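If, as the truncated comment above seems to suggest, MAX_COPY_TOTAL_LENGTH is a byte count, then comparing it directly against a sector count is the source of the confusion, and the clamp would need a unit conversion. A hypothetical sketch of that shape (the macro value and helper name here are illustrative, not the patch's actual constants):

```c
#define SECTOR_SHIFT 9
#define MAX_COPY_TOTAL_BYTES (1U << 27)	/* hypothetical byte cap for illustration */

/*
 * Hypothetical model of the clamp in blk_queue_max_copy_sectors_hw():
 * convert the byte cap to sectors before comparing, so both sides of
 * the comparison share a unit.
 */
static unsigned int clamp_copy_sectors_hw(unsigned int max_copy_sectors)
{
	unsigned int cap = MAX_COPY_TOTAL_BYTES >> SECTOR_SHIFT;

	return max_copy_sectors > cap ? cap : max_copy_sectors;
}
```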
Re: [dm-devel] [PATCH v5 11/18] nvme: Add pr_ops read_keys support
On 3/28/23 2:11 AM, kernel test robot wrote: > Hi Mike, > > I love your patch! Perhaps something to improve: > > [auto build test WARNING on mkp-scsi/for-next] > [also build test WARNING on jejb-scsi/for-next axboe-block/for-next > linus/master v6.3-rc4 next-20230328] > [If your patch is applied to the wrong git tree, kindly drop us a note. > And when submitting patch, we suggest to use '--base' as documented in > https://git-scm.com/docs/git-format-patch#_base_tree_information] > > url: > https://github.com/intel-lab-lkp/linux/commits/Mike-Christie/block-Add-PR-callouts-for-read-keys-and-reservation/20230325-022314 > base: https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git for-next > patch link: > https://lore.kernel.org/r/20230324181741.13908-12-michael.christie%40oracle.com > patch subject: [PATCH v5 11/18] nvme: Add pr_ops read_keys support > config: i386-randconfig-s002 > (https://download.01.org/0day-ci/archive/20230328/202303281502.u47uzous-...@intel.com/config) > compiler: gcc-11 (Debian 11.3.0-8) 11.3.0 > reproduce: > # apt-get install sparse > # sparse version: v0.6.4-39-gce1a6720-dirty > # > https://github.com/intel-lab-lkp/linux/commit/fcd2233cf643c550ab3cea2b6102077b1d05b389 > git remote add linux-review https://github.com/intel-lab-lkp/linux > git fetch --no-tags linux-review > Mike-Christie/block-Add-PR-callouts-for-read-keys-and-reservation/20230325-022314 > git checkout fcd2233cf643c550ab3cea2b6102077b1d05b389 > # save the config file > mkdir build_dir && cp config build_dir/.config > make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir > ARCH=i386 olddefconfig > make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir > ARCH=i386 SHELL=/bin/bash drivers/nvme/host/ > > If you fix the issue, kindly add following tag where applicable > | Reported-by: kernel test robot > | Link: > https://lore.kernel.org/oe-kbuild-all/202303281502.u47uzous-...@intel.com/ > > sparse warnings: (new ones prefixed by >>) >>> 
drivers/nvme/host/pr.c:165:24: sparse: sparse: incorrect type in assignment >>> (different base types) @@ expected restricted __le32 [assigned] >>> [usertype] cdw11 @@ got int @@ >drivers/nvme/host/pr.c:165:24: sparse: expected restricted __le32 > [assigned] [usertype] cdw11 >drivers/nvme/host/pr.c:165:24: sparse: got int >>> drivers/nvme/host/pr.c:171:21: sparse: sparse: restricted __le32 degrades >>> to integer > > vim +165 drivers/nvme/host/pr.c > >156 >157static int nvme_pr_resv_report(struct block_device *bdev, void > *data, >158u32 data_len, bool *eds) >159{ >160struct nvme_command c = { }; >161int ret; >162 >163c.common.opcode = nvme_cmd_resv_report; >164c.common.cdw10 = > cpu_to_le32(nvme_bytes_to_numd(data_len)); > > 165c.common.cdw11 = NVME_EXTENDED_DATA_STRUCT; >166*eds = true; >167 >168retry: >169ret = nvme_send_pr_command(bdev, , data, data_len); >170if (ret == NVME_SC_HOST_ID_INCONSIST && > > 171c.common.cdw11 == NVME_EXTENDED_DATA_STRUCT) { Kernel test bot is correct. Will fix. -- dm-devel mailing list dm-devel@redhat.com https://listman.redhat.com/mailman/listinfo/dm-devel
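The sparse complaint is the classic __le32 mismatch: a CPU-endian constant is assigned to a little-endian field at line 165 and compared back in CPU order at line 171. The fix would wrap both sides, i.e. cpu_to_le32() on assignment and le32_to_cpu() on comparison. A userspace analogue with hand-rolled conversion helpers (the *_model() names and the NVME_EXT_DATA_STRUCT value are illustrative; kernel code uses the real cpu_to_le32()/le32_to_cpu()):

```c
#include <stdint.h>
#include <string.h>

#define NVME_EXT_DATA_STRUCT 1	/* hypothetical value for illustration */

/* Minimal userspace stand-in for cpu_to_le32(): emit the value's bytes
 * in little-endian order regardless of host endianness. */
static uint32_t cpu_to_le32_model(uint32_t v)
{
	uint8_t b[4] = { v & 0xff, (v >> 8) & 0xff,
			 (v >> 16) & 0xff, (v >> 24) & 0xff };
	uint32_t out;

	memcpy(&out, b, sizeof(out));
	return out;
}

/* Inverse stand-in for le32_to_cpu(). */
static uint32_t le32_to_cpu_model(uint32_t v)
{
	uint8_t b[4];

	memcpy(b, &v, sizeof(b));
	return b[0] | (uint32_t)b[1] << 8 |
	       (uint32_t)b[2] << 16 | (uint32_t)b[3] << 24;
}

/* The fixed pattern: convert on store, convert back before comparing. */
static int eds_roundtrip_ok(void)
{
	uint32_t cdw11 = cpu_to_le32_model(NVME_EXT_DATA_STRUCT); /* was: cdw11 = NVME_...; */

	return le32_to_cpu_model(cdw11) == NVME_EXT_DATA_STRUCT;  /* was: cdw11 == NVME_...; */
}
```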
Re: [dm-devel] [PATCH v5 01/18] block: Add PR callouts for read keys and reservation
On 3/28/23 11:36 AM, Mike Snitzer wrote: > On Fri, Mar 24 2023 at 2:17P -0400, > Mike Christie wrote: > >> Add callouts for reading keys and reservations. This allows LIO to support >> the READ_KEYS and READ_RESERVATION commands and will allow dm-multipath >> to optimize it's error handling so it can check if it's getting an error >> because there's an existing reservation or if we need to retry different >> paths. > > Not seeing anything in DM's path selectors that adds these > optimizations. Is there accompanying multipath-tools callouts to get > at this data to optimize path selection (via DM table reloads)? > You can ignore that comment. The comment was meant for the dm pr callouts and not for normal IO/path handling, and now I think I can fix in a different way. I originally added the comment about better dm error handling for something like __dm_pr_register getting a failure when it did: ret = ops->pr_register(dev->bdev, pr->old_key, pr->new_key, pr->flags); Right now, we fail the entire operation if just one call on one path fails. With the pr_read_keys/reservation callouts we could check if we got a failure because there was an existing reservation vs a retryable/ignorable error. I forgot I wrote that comment about dm in the mail and we actually don't need the pr_read_* callouts for that type of thing now, because I ended up changing the existing callouts to return common error codes last kernel. So I have another patchset that I'm working on, but am still debating about some issues like: ret = ops->pr_register(dev->bdev, pr->old_key, pr->new_key, pr->flags); switch (ret) { case PR_STS_RESERVATION_CONFLICT: pr->ret = ret; return -1; case PR_STS_RETRY_PATH_FAILURE: /* * We probably want to retry like how we do for the pg_init. */ case PR_STS_PATH_FAILED: case PR_STS_PATH_FAST_FAILED: /* * I'm still not sure what to do here, because if this is the last * host then we might want to try and register the rest of the paths * and limp on. 
It probably needs a user config option. */ -- dm-devel mailing list dm-devel@redhat.com https://listman.redhat.com/mailman/listinfo/dm-devel
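The error-handling policy being sketched in the switch statement above amounts to a status-to-action mapping. A hypothetical model of the behaviour under debate (all names here are invented for illustration; they are not the dm-multipath API):

```c
/* Hypothetical statuses and actions modelling the pr_register policy
 * discussion: a reservation conflict fails immediately, a retryable
 * path failure is retried (like pg_init), and a hard path failure
 * moves on to the remaining paths. */
enum pr_sts    { PR_OK, PR_CONFLICT, PR_RETRY_PATH, PR_PATH_FAILED };
enum pr_action { PR_DONE, PR_FAIL, PR_RETRY, PR_NEXT_PATH };

static enum pr_action pr_register_policy(enum pr_sts sts)
{
	switch (sts) {
	case PR_OK:
		return PR_DONE;
	case PR_CONFLICT:
		return PR_FAIL;		/* existing reservation: do not retry */
	case PR_RETRY_PATH:
		return PR_RETRY;	/* transient: retry like pg_init does */
	case PR_PATH_FAILED:
	default:
		return PR_NEXT_PATH;	/* try remaining paths and limp on */
	}
}
```

Whether PR_NEXT_PATH is the right default for the last remaining path is exactly the open question above, which likely needs a user config option.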