Re: [dm-devel] [PATCH 3/3] brd: implement write zeroes

2023-07-19 Thread Chaitanya Kulkarni
On 7/19/2023 1:21 PM, Mikulas Patocka wrote:
> This patch implements REQ_OP_WRITE_ZEROES on brd. Write zeroes will free
> the pages just like discard, but the difference is that it writes zeroes
> to the preceding and following page if the range is not aligned on page
> boundary.
> 
> Signed-off-by: Mikulas Patocka 
> 
> ---
>   drivers/block/brd.c |   22 ++
>   1 file changed, 18 insertions(+), 4 deletions(-)
> 
> Index: linux-2.6/drivers/block/brd.c
> ===================================================================
> --- linux-2.6.orig/drivers/block/brd.c
> +++ linux-2.6/drivers/block/brd.c
> @@ -272,7 +272,8 @@ out:
>   
>   void brd_do_discard(struct brd_device *brd, struct bio *bio)
>   {
> - sector_t sector, len, front_pad;
> + bool zero_padding = bio_op(bio) == REQ_OP_WRITE_ZEROES;
> + sector_t sector, len, front_pad, end_pad;
>   
>   if (unlikely(!discard)) {
>   bio->bi_status = BLK_STS_NOTSUPP;
> @@ -282,11 +283,22 @@ void brd_do_discard(struct brd_device *b
>   sector = bio->bi_iter.bi_sector;
>   len = bio_sectors(bio);
>   front_pad = -sector & (PAGE_SECTORS - 1);
> +
> + if (zero_padding && unlikely(front_pad != 0))
> + copy_to_brd(brd, page_address(ZERO_PAGE(0)),
> + sector, min(len, front_pad) << SECTOR_SHIFT);
> +

Is it possible to create a different interface for discard and write
zeroes ? I think this adds a lot of logic to a single function; if
everyone else is okay with it, please disregard my comment ..

>   sector += front_pad;
>   if (unlikely(len <= front_pad))
>   return;
>   len -= front_pad;
> - len = round_down(len, PAGE_SECTORS);
> +
> + end_pad = len & (PAGE_SECTORS - 1);
> + if (zero_padding && unlikely(end_pad != 0))
> + copy_to_brd(brd, page_address(ZERO_PAGE(0)),
> + sector + len - end_pad, end_pad << SECTOR_SHIFT);
> + len -= end_pad;
> +
>   while (len) {
>   brd_free_page(brd, sector);
>   sector += PAGE_SECTORS;
> @@ -302,7 +314,8 @@ static void brd_submit_bio(struct bio *b
>   struct bio_vec bvec;
>   struct bvec_iter iter;
>   
> - if (bio_op(bio) == REQ_OP_DISCARD) {
> + if (bio_op(bio) == REQ_OP_DISCARD ||
> + bio_op(bio) == REQ_OP_WRITE_ZEROES) {
>   brd_do_discard(brd, bio);
>   goto endio;
>   }

can we please use a switch here ? unless there is a strong need for the
if-chain that I have failed to understand ...

> @@ -355,7 +368,7 @@ MODULE_PARM_DESC(max_part, "Num Minors t
>   
>   static bool discard = false;
>   module_param(discard, bool, 0444);
> -MODULE_PARM_DESC(discard, "Support discard");
> +MODULE_PARM_DESC(discard, "Support discard and write zeroes");
>  

write-zeroes and discard are different req_ops, so they should have
separate module parameters ...


>   MODULE_LICENSE("GPL");
>   MODULE_ALIAS_BLOCKDEV_MAJOR(RAMDISK_MAJOR);
> @@ -425,6 +438,7 @@ static int brd_alloc(int i)
>   if (discard) {
>   disk->queue->limits.discard_granularity = PAGE_SIZE;

>   blk_queue_max_discard_sectors(disk->queue, round_down(UINT_MAX, 
> PAGE_SECTORS));
> + blk_queue_max_write_zeroes_sectors(disk->queue, 
> round_down(UINT_MAX, PAGE_SECTORS));

the above should be moved under a new module parameter (write_zeroes),
unless there is a reason to use one module parameter for both req_ops ...

>   }
>   
>   /* Tell the block layer that this is not a rotational device */
> 

-ck


--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel



Re: [dm-devel] [RFC PATCH] block: add meaningful macro for flush op flags

2023-05-12 Thread Chaitanya Kulkarni
On 5/12/23 06:00, Christoph Hellwig wrote:
> Hell no.  This is just obsfucation.  We can look into actually exposing
> REQ_OP_FLUSH at the bio level, but not something like this.
>

and that's why I made it an RFC, thanks. Can you please elaborate on
"exposing REQ_OP_FLUSH at the bio level" ?

I'd really like to work on that ...

-ck





[dm-devel] [RFC PATCH] block: add meaningful macro for flush op flags

2023-05-12 Thread Chaitanya Kulkarni
Flush requests are implemented as the REQ_OP_WRITE + REQ_PREFLUSH
combination and not as a REQ_OP_FLUSH + REQ_PREFLUSH combination.

This unclear nature has led to confusion and bugs in block driver code,
causing extra work for testing, reviews and fixes :-

1. https://lore.kernel.org/all/zfhgefwofvt24...@infradead.org/
2. https://marc.info/?l=linux-block&m=168386364026498&w=2

Add a macro (the name can be made more meaningful) with a clarifying
comment, and replace REQ_OP_WRITE | REQ_PREFLUSH with the new macro
name, which also avoids code repetition.

Signed-off-by: Chaitanya Kulkarni 
---
 block/blk-flush.c   | 2 +-
 drivers/md/bcache/request.c | 3 +--
 drivers/md/dm-bufio.c   | 2 +-
 drivers/md/dm-integrity.c   | 2 +-
 drivers/md/dm-log.c | 2 +-
 drivers/md/dm-raid1.c   | 2 +-
 drivers/md/dm-snap-persistent.c | 5 ++---
 drivers/md/dm-writecache.c  | 2 +-
 drivers/md/dm.c | 2 +-
 drivers/md/md.c | 3 +--
 drivers/md/raid5-cache.c| 3 +--
 drivers/md/raid5-ppl.c  | 3 +--
 drivers/nvme/target/io-cmd-bdev.c   | 2 +-
 drivers/target/target_core_iblock.c | 3 +--
 include/linux/blk_types.h   | 7 +++
 15 files changed, 22 insertions(+), 21 deletions(-)

diff --git a/block/blk-flush.c b/block/blk-flush.c
index 04698ed9bcd4..376f00257100 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -460,7 +460,7 @@ int blkdev_issue_flush(struct block_device *bdev)
 {
struct bio bio;
 
-   bio_init(&bio, bdev, NULL, 0, REQ_OP_WRITE | REQ_PREFLUSH);
+   bio_init(&bio, bdev, NULL, 0, REQ_FLUSH_OPF);
	return submit_bio_wait(&bio);
 }
 EXPORT_SYMBOL(blkdev_issue_flush);
diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
index 67a2e29e0b40..ab89897a36a2 100644
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -1023,8 +1023,7 @@ static void cached_dev_write(struct cached_dev *dc, 
struct search *s)
 */
struct bio *flush;
 
-   flush = bio_alloc_bioset(bio->bi_bdev, 0,
-REQ_OP_WRITE | REQ_PREFLUSH,
+   flush = bio_alloc_bioset(bio->bi_bdev, 0, REQ_FLUSH_OPF,
 GFP_NOIO, &dc->disk.bio_split);
if (!flush) {
s->iop.status = BLK_STS_RESOURCE;
diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
index eea977662e81..da815325842b 100644
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -2133,7 +2133,7 @@ EXPORT_SYMBOL_GPL(dm_bufio_write_dirty_buffers);
 int dm_bufio_issue_flush(struct dm_bufio_client *c)
 {
struct dm_io_request io_req = {
-   .bi_opf = REQ_OP_WRITE | REQ_PREFLUSH | REQ_SYNC,
+   .bi_opf = REQ_FLUSH_OPF | REQ_SYNC,
.mem.type = DM_IO_KMEM,
.mem.ptr.addr = NULL,
.client = c->dm_io,
diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c
index 31838b13ea54..2d90f8ad1ae5 100644
--- a/drivers/md/dm-integrity.c
+++ b/drivers/md/dm-integrity.c
@@ -1533,7 +1533,7 @@ static void dm_integrity_flush_buffers(struct 
dm_integrity_c *ic, bool flush_dat
if (!ic->meta_dev)
flush_data = false;
if (flush_data) {
-   fr.io_req.bi_opf = REQ_OP_WRITE | REQ_PREFLUSH | REQ_SYNC,
+   fr.io_req.bi_opf = REQ_FLUSH_OPF | REQ_SYNC,
fr.io_req.mem.type = DM_IO_KMEM,
fr.io_req.mem.ptr.addr = NULL,
fr.io_req.notify.fn = flush_notify,
diff --git a/drivers/md/dm-log.c b/drivers/md/dm-log.c
index f9f84236dfcd..2c40f865ef16 100644
--- a/drivers/md/dm-log.c
+++ b/drivers/md/dm-log.c
@@ -311,7 +311,7 @@ static int flush_header(struct log_c *lc)
.count = 0,
};
 
-   lc->io_req.bi_opf = REQ_OP_WRITE | REQ_PREFLUSH;
+   lc->io_req.bi_opf = REQ_FLUSH_OPF;
 
	return dm_io(&lc->io_req, 1, &null_location, NULL);
 }
diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c
index ddcb2bc4a617..7acb9a390b38 100644
--- a/drivers/md/dm-raid1.c
+++ b/drivers/md/dm-raid1.c
@@ -265,7 +265,7 @@ static int mirror_flush(struct dm_target *ti)
struct dm_io_region io[MAX_NR_MIRRORS];
struct mirror *m;
struct dm_io_request io_req = {
-   .bi_opf = REQ_OP_WRITE | REQ_PREFLUSH | REQ_SYNC,
+   .bi_opf = REQ_FLUSH_OPF | REQ_SYNC,
.mem.type = DM_IO_KMEM,
.mem.ptr.addr = NULL,
.client = ms->io_client,
diff --git a/drivers/md/dm-snap-persistent.c b/drivers/md/dm-snap-persistent.c
index 15649921f2a9..cfb7f1b92c5e 100644
--- a/drivers/md/dm-snap-persistent.c
+++ b/drivers/md/dm-snap-persistent.c
@@ -739,8 +739,7 @@ static void persistent_commit_exception(str

Re: [dm-devel] [PATCH blktests] dm: add a regression test

2023-04-26 Thread Chaitanya Kulkarni
On 4/25/23 10:24, Mike Snitzer wrote:
> On Tue, Apr 25 2023 at  8:15P -0400,
> Shin'ichiro Kawasaki  wrote:
>
>> On Apr 25, 2023 / 16:22, Yu Kuai wrote:
>>> Hi,
>>>
>>> 在 2023/01/12 9:05, Shinichiro Kawasaki 写道:
 Hello Yu, thanks for the patch. I think it is good to have this test case to
 avoid repeating the same regression. Please find my comments in line.

 CC+: Mike, dm-devel,

 Mike, could you take a look in this new test case? It adds "dm" test group to
 blktests. If you have thoughts on how to add device-mapper related test cases
 to blktests, please share (Or we may be able to discuss later at LSF 2023).
>>> Can we add "dm" test group to blktests? I'll send a new version if we
>>> can.
>> I suggest to wait for LSF discussion in May, which could be a good chance to
>> hear opinions of dm experts. I think your new test case is valuable, so IMO it
>> should be added to the new "dm" group, or at least to the existing "block"
>> group. Let's decide the target group after LSF.
>>
> It's obviously fine to add a new "dm" test group to blktests.
>
> But just so others are aware: more elaborate dm testing is currently
> spread across multiple testsuites (e.g. lvm2, cryptsetup, mptest,
> device-mapper-test-suite, etc).
>
> There is new effort to port device-mapper-test-suite tests (which use
> ruby) to a new python harness currently named "dmtest-python", Joe
> Thornber is leading this effort (with the assistance of
> ChatGPT.. apparently it has been wonderful in helping Joe glue python
> code together to accomplish anything he's needed):
> https://github.com/jthornber/dmtest-python
>
> (we've discussed renaming "dmtest-python" to "dmtests")
>
> I've also discussed with Joe the plan to wrap the other disparate
> testsuites with DM coverage in terms of the new dmtest-python.
> blktests can be made to be one of the testsuites we add support for
> (so that all blktests run on various types of DM targets).
>
> Really all we need is a means to:
> 1) list all tests in a testsuite
> 2) initiate running each test individually
>
> Mike

Thanks Mike for the detailed information. We did talk about DM test cases
at the last LSFMM; this is a really important piece of blktests that is
missing and needs to be discussed at this year's LSFMM, so that we can
integrate the above work into blktests. Right now we are not able to
establish complete stability due to the lack of dm tests, the way we are
doing it for the block layer code or nvme, for example.

-ck




Re: [dm-devel] [PATCH v9 6/9] nvmet: add copy command support for bdev and file ns

2023-04-25 Thread Chaitanya Kulkarni
On 4/11/23 01:10, Anuj Gupta wrote:
> From: Nitesh Shetty 
>
> Add support for handling target command on target.

what is a "target command" ?

the command that you have added is nvme_cmd_copy

> For bdev-ns we call into blkdev_issue_copy, which the block layer
> completes by a offloaded copy request to backend bdev or by emulating the
> request.
>
> For file-ns we call vfs_copy_file_range to service our request.
>
> Currently target always shows copy capability by setting
> NVME_CTRL_ONCS_COPY in controller ONCS.

there is nothing mentioned about target/loop.c in the commit log ?

> Signed-off-by: Nitesh Shetty 
> Signed-off-by: Anuj Gupta 
> ---
>   drivers/nvme/target/admin-cmd.c   |  9 +++--
>   drivers/nvme/target/io-cmd-bdev.c | 58 +++
>   drivers/nvme/target/io-cmd-file.c | 52 +++
>   drivers/nvme/target/loop.c|  6 
>   drivers/nvme/target/nvmet.h   |  1 +
>   5 files changed, 124 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/nvme/target/admin-cmd.c b/drivers/nvme/target/admin-cmd.c
> index 80099df37314..978786ec6a9e 100644
> --- a/drivers/nvme/target/admin-cmd.c
> +++ b/drivers/nvme/target/admin-cmd.c
> @@ -433,8 +433,7 @@ static void nvmet_execute_identify_ctrl(struct nvmet_req 
> *req)
>   id->nn = cpu_to_le32(NVMET_MAX_NAMESPACES);
>   id->mnan = cpu_to_le32(NVMET_MAX_NAMESPACES);
>   id->oncs = cpu_to_le16(NVME_CTRL_ONCS_DSM |
> - NVME_CTRL_ONCS_WRITE_ZEROES);
> -
> + NVME_CTRL_ONCS_WRITE_ZEROES | NVME_CTRL_ONCS_COPY);
>   /* XXX: don't report vwc if the underlying device is write through */
>   id->vwc = NVME_CTRL_VWC_PRESENT;
>   
> @@ -536,6 +535,12 @@ static void nvmet_execute_identify_ns(struct nvmet_req 
> *req)
>   
>   if (req->ns->bdev)
>   nvmet_bdev_set_limits(req->ns->bdev, id);
> + else {
> + id->msrc = (u8)to0based(BIO_MAX_VECS - 1);
> + id->mssrl = cpu_to_le16(BIO_MAX_VECS <<
> + (PAGE_SHIFT - SECTOR_SHIFT));
> + id->mcl = cpu_to_le32(le16_to_cpu(id->mssrl));
> + }
>   
>   /*
>* We just provide a single LBA format that matches what the
> diff --git a/drivers/nvme/target/io-cmd-bdev.c 
> b/drivers/nvme/target/io-cmd-bdev.c
> index c2d6cea0236b..0af273097aa4 100644
> --- a/drivers/nvme/target/io-cmd-bdev.c
> +++ b/drivers/nvme/target/io-cmd-bdev.c
> @@ -46,6 +46,19 @@ void nvmet_bdev_set_limits(struct block_device *bdev, 
> struct nvme_id_ns *id)
>   id->npda = id->npdg;
>   /* NOWS = Namespace Optimal Write Size */
>   id->nows = to0based(bdev_io_opt(bdev) / bdev_logical_block_size(bdev));
> +
> + /*Copy limits*/

the above comment doesn't make any sense ...

> + if (bdev_max_copy_sectors(bdev)) {
> + id->msrc = id->msrc;
> + id->mssrl = cpu_to_le16((bdev_max_copy_sectors(bdev) <<
> + SECTOR_SHIFT) / bdev_logical_block_size(bdev));
> + id->mcl = cpu_to_le32(id->mssrl);
> + } else {
> + id->msrc = (u8)to0based(BIO_MAX_VECS - 1);
> + id->mssrl = cpu_to_le16((BIO_MAX_VECS << PAGE_SHIFT) /
> + bdev_logical_block_size(bdev));
> + id->mcl = cpu_to_le32(id->mssrl);
> + }
>   }
>   
>   void nvmet_bdev_ns_disable(struct nvmet_ns *ns)
> @@ -184,6 +197,19 @@ static void nvmet_bio_done(struct bio *bio)
>   nvmet_req_bio_put(req, bio);
>   }
>   
> +static void nvmet_bdev_copy_end_io(void *private, int comp_len)
> +{
> + struct nvmet_req *req = (struct nvmet_req *)private;
> +
> + if (comp_len == req->copy_len) {
> + req->cqe->result.u32 = cpu_to_le32(1);
> + nvmet_req_complete(req, errno_to_nvme_status(req, 0));
> + } else {
> + req->cqe->result.u32 = cpu_to_le32(0);
> + nvmet_req_complete(req, blk_to_nvme_status(req, BLK_STS_IOERR));
> + }
> +}
> +

please reduce the number of nvmet_req_complete() calls, e.g. :-

+static void nvmet_bdev_copy_end_io(void *private, int comp_len)
+{
+   struct nvmet_req *req = (struct nvmet_req *)private;
+   u16 status;
+
+   if (comp_len == req->copy_len) {
+   req->cqe->result.u32 = cpu_to_le32(1);
+   status = errno_to_nvme_status(req, 0);
+   } else {
+   req->cqe->result.u32 = cpu_to_le32(0);
+   status = blk_to_nvme_status(req, BLK_STS_IOERR);
+   }
+   nvmet_req_complete(req, status);
+}
+

>   #ifdef CONFIG_BLK_DEV_INTEGRITY
>   static int nvmet_bdev_alloc_bip(struct nvmet_req *req, struct bio *bio,
>   struct sg_mapping_iter *miter)
> @@ -450,6 +476,34 @@ static void nvmet_bdev_execute_write_zeroes(struct 
> nvmet_req *req)
>   }
>   }
>   
> +/* At present we handle only one range entry */

please add an explanation of why ...

> +static void nvmet_bdev_execute_copy(struct nvmet_req *req)
> +{
> + struct nvme_copy_range 

Re: [dm-devel] [PATCH v9 9/9] null_blk: add support for copy offload

2023-04-13 Thread Chaitanya Kulkarni
On 4/11/23 01:10, Anuj Gupta wrote:
> From: Nitesh Shetty 
>
> Implementaion is based on existing read and write infrastructure.
> copy_max_bytes: A new configfs and module parameter is introduced, which
> can be used to set hardware/driver supported maximum copy limit.
>
> Suggested-by: Damien Le Moal 
> Signed-off-by: Anuj Gupta 
> Signed-off-by: Nitesh Shetty 
> Signed-off-by: Vincent Fu 
> ---
>   drivers/block/null_blk/main.c | 101 ++
>   drivers/block/null_blk/null_blk.h |   8 +++
>   2 files changed, 109 insertions(+)
>
> diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c
> index bc2c58724df3..e273e18ace74 100644
> --- a/drivers/block/null_blk/main.c
> +++ b/drivers/block/null_blk/main.c
> @@ -157,6 +157,10 @@ static int g_max_sectors;
>   module_param_named(max_sectors, g_max_sectors, int, 0444);
>   MODULE_PARM_DESC(max_sectors, "Maximum size of a command (in 512B 
> sectors)");
>   
> +static int g_copy_max_bytes = COPY_MAX_BYTES;

how about the following ? it matches the nullb_device->copy_max_bytes type ..

-static int g_copy_max_bytes = COPY_MAX_BYTES;
-module_param_named(copy_max_bytes, g_copy_max_bytes, int, 0444);
+static unsigned long g_copy_max_bytes = COPY_MAX_BYTES;
+module_param_named(copy_max_bytes, g_copy_max_bytes, ulong, 0444);

[...]

> @@ -631,6 +637,7 @@ static ssize_t memb_group_features_show(struct 
> config_item *item, char *page)
>   "badblocks,blocking,blocksize,cache_size,"
>   "completion_nsec,discard,home_node,hw_queue_depth,"
>   "irqmode,max_sectors,mbps,memory_backed,no_sched,"
> + "copy_max_bytes,"
>   "poll_queues,power,queue_mode,shared_tag_bitmap,size,"
>   "submit_queues,use_per_node_hctx,virt_boundary,zoned,"
>   "zone_capacity,zone_max_active,zone_max_open,"

why not do it this way instead ?

@@ -637,11 +637,12 @@ static ssize_t memb_group_features_show(struct 
config_item *item, char *page)
 "badblocks,blocking,blocksize,cache_size,"
 "completion_nsec,discard,home_node,hw_queue_depth,"
 "irqmode,max_sectors,mbps,memory_backed,no_sched,"
-   "copy_max_bytes,"
 "poll_queues,power,queue_mode,shared_tag_bitmap,size,"
 "submit_queues,use_per_node_hctx,virt_boundary,zoned,"
 "zone_capacity,zone_max_active,zone_max_open,"
-   "zone_nr_conv,zone_offline,zone_readonly,zone_size\n");
+   "zone_nr_conv,zone_offline,zone_readonly,zone_size,"
+   "copy_max_bytes\n");
  }
  
[...]
  
+static inline int nullb_setup_copy_read(struct nullb *nullb,
+   struct bio *bio)
+{
+   struct nullb_copy_token *token = bvec_kmap_local(&bio->bi_io_vec[0]);
+
+   memcpy(token->subsys, "nullb", 5);

do you really need to use memcpy here ? can token->subsys be a pointer
and be set with a plain assignment, token->subsys = "nullb" ?

+   token->sector_in = bio->bi_iter.bi_sector;
+   token->nullb = nullb;
+   token->sectors = bio->bi_iter.bi_size >> SECTOR_SHIFT;
+
+   return 0;
+}
+

there is no point in returning 0 here; use a local bool for fua instead
of repeating the expression, and there is no need to fold the line for
nullb_setup_copy_read(), which makes it easier to read and removes extra
lines and indentation, see below :-

-static inline int nullb_setup_copy_read(struct nullb *nullb,
-   struct bio *bio)
+static inline void nullb_setup_copy_read(struct nullb *nullb, struct bio *bio)
  {
 struct nullb_copy_token *token = bvec_kmap_local(&bio->bi_io_vec[0]);
  
-   memcpy(token->subsys, "nullb", 5);
+   token->subsys = "nullb";
 token->sector_in = bio->bi_iter.bi_sector;
 token->nullb = nullb;
 token->sectors = bio->bi_iter.bi_size >> SECTOR_SHIFT;
-
-   return 0;
  }
  
  static inline int nullb_setup_copy_write(struct nullb *nullb,
@@ -1334,20 +1331,21 @@ static int null_handle_rq(struct nullb_cmd *cmd)
 sector_t sector = blk_rq_pos(rq);
 struct req_iterator iter;
 struct bio_vec bvec;
+   bool fua = rq->cmd_flags & REQ_FUA;
  
 if (rq->cmd_flags & REQ_COPY) {
 if (op_is_write(req_op(rq)))
-   return nullb_setup_copy_write(nullb, rq->bio,
-   rq->cmd_flags & REQ_FUA);
-   return nullb_setup_copy_read(nullb, rq->bio);
+   return nullb_setup_copy_write(nullb, rq->bio, fua);
+
+   nullb_setup_copy_read(nullb, rq->bio);
+   return 0;
 }
  
 spin_lock_irq(&nullb->lock);
 rq_for_each_segment(bvec, rq, iter) {
 len = bvec.bv_len;
 err = null_transfer(nullb, bvec.bv_page, len, bvec.bv_offset,
-op_is_write(req_op(rq)), sector,
-   

Re: [dm-devel] [PATCH v6 11/18] nvme: Add pr_ops read_keys support

2023-04-08 Thread Chaitanya Kulkarni
On 4/7/23 13:05, Mike Christie wrote:
> This patch adds support for the pr_ops read_keys callout by calling the
> NVMe Reservation Report helper, then parsing that info to get the
> controller's registered keys. Because the callout is only used in the
> kernel where the callers, like LIO, do not know about controller/host IDs,
> the callout just returns the registered keys which is required by the SCSI
> PR in READ KEYS command.
>
> Signed-off-by: Mike Christie 
> ---
>   


Looks good.

Reviewed-by: Chaitanya Kulkarni 

-ck





Re: [dm-devel] [PATCH v6 10/18] nvme: Add helper to send pr command

2023-04-08 Thread Chaitanya Kulkarni
On 4/7/23 13:05, Mike Christie wrote:
> Move the code that checks for multipath support and sends the pr command
> to a new helper so it can be used by the reservation report support added
> in the next patches.
>
> Signed-off-by: Mike Christie 
> Reviewed-by: Christoph Hellwig 
> ---
>


Looks good.

Reviewed-by: Chaitanya Kulkarni 

-ck





Re: [dm-devel] [PATCH v6 02/18] block: Rename BLK_STS_NEXUS to BLK_STS_RESV_CONFLICT

2023-04-08 Thread Chaitanya Kulkarni
On 4/7/23 13:05, Mike Christie wrote:
> BLK_STS_NEXUS is used for NVMe/SCSI reservation conflicts and DASD's
> locking feature which works similar to NVMe/SCSI reservations where a
> host can get a lock on a device and when the lock is taken it will get
> failures.
>
> This patch renames BLK_STS_NEXUS so it better reflects this type of
> use.
>
> Signed-off-by: Mike Christie 
> Acked-by: Stefan Haberland 
> Reviewed-by: Bart Van Assche 
> ---
>

Looks good.

Reviewed-by: Chaitanya Kulkarni 

-ck





Re: [dm-devel] [PATCH v4 05/18] scsi: Add support for block PR read keys/reservation

2023-03-05 Thread Chaitanya Kulkarni
On 2/24/2023 9:44 AM, Mike Christie wrote:
> This adds support in sd.c for the block PR read keys and read reservation
> callouts, so upper layers like LIO can get the PR info that's been setup
> using the existing pr callouts and return it to initiators.
> 
> Signed-off-by: Mike Christie 
> ---

Looks good.

Reviewed-by: Chaitanya Kulkarni 

-ck




Re: [dm-devel] [PATCH v4 04/18] scsi: Move sd_pr_type to header to share

2023-03-05 Thread Chaitanya Kulkarni
On 2/24/2023 9:44 AM, Mike Christie wrote:
> LIO is going to want to do the same block to/from SCSI pr types as sd.c
> so this moves the sd_pr_type helper to a new file. The next patch will
> then also add a helper to go from the SCSI value to the block one for use
> with PERSISTENT_RESERVE_IN commands.
> 
> Signed-off-by: Mike Christie 
> Reviewed-by: Christoph Hellwig 

Looks good.

Reviewed-by: Chaitanya Kulkarni 

-ck




Re: [dm-devel] [PATCH v4 01/18] block: Add PR callouts for read keys and reservation

2023-03-05 Thread Chaitanya Kulkarni
On 2/24/2023 9:44 AM, Mike Christie wrote:
> Add callouts for reading keys and reservations. This allows LIO to support
> the READ_KEYS and READ_RESERVATION commands and will allow dm-multipath
> to optimize it's error handling so it can check if it's getting an error
> because there's an existing reservation or if we need to retry different
> paths.
> 
> Note: This only initially adds the struct definitions in the kernel as I'm
> not sure if we wanted to export the interface to userspace yet. read_keys
> and read_reservation are exactly what dm-multipath and LIO need, but for a
> userspace interface we may want something like SCSI's READ_FULL_STATUS and
> NVMe's report reservation commands. Those are overkill for dm/LIO and
> READ_FULL_STATUS is sometimes broken for SCSI devices.
> 
> Signed-off-by: Mike Christie 
> ---

Looks good.

Reviewed-by: Chaitanya Kulkarni 

-ck





Re: [dm-devel] [PATCH v4 13/18] nvme: Add pr_ops read_reservation support

2023-03-05 Thread Chaitanya Kulkarni
On 2/24/2023 9:44 AM, Mike Christie wrote:
> This patch adds support for the pr_ops read_reservation callout by
> calling the NVMe Reservation Report helper. It then parses that info to
> detect if there is a reservation and if there is then convert the
> returned info to a pr_ops pr_held_reservation struct.
> 
> Signed-off-by: Mike Christie 
> ---

Looks good.

Reviewed-by: Chaitanya Kulkarni 

-ck





Re: [dm-devel] [PATCH v4 12/18] nvme: Add a nvme_pr_type enum

2023-03-05 Thread Chaitanya Kulkarni
On 2/24/2023 9:44 AM, Mike Christie wrote:
> The next patch adds support to report the reservation type, so we need to
> be able to convert from the NVMe PR value we get from the device to the
> linux block layer PR value that will be returned to callers. To prepare
> for that, this patch adds a nvme_pr_type enum and renames the nvme_pr_type
> function.
> 
> Signed-off-by: Mike Christie 
> ---
> 
> Note for Chaitanya, Keith and Bart. For these patches where we convert
> between block and nvme pr values, it seemed like Chaitanya and Keith
> didn't have a strong preference. Bart had the suggestion to keep the
> switch and drop the default so the compiler can warn us if new types
> are added. This seemed like a nice benefit so I went that way.
> 

Looks good.

Reviewed-by: Chaitanya Kulkarni 

-ck





Re: [dm-devel] [PATCH v4 10/18] nvme: Add helper to send pr command

2023-03-05 Thread Chaitanya Kulkarni
On 2/24/2023 9:44 AM, Mike Christie wrote:
> Move the code that checks for multipath support and sends the pr command
> to a new helper so it can be used by the reservation report support added
> in the next patches.
> 
> Signed-off-by: Mike Christie 
> ---
>   drivers/nvme/host/pr.c | 23 ++-
>   1 file changed, 14 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/nvme/host/pr.c b/drivers/nvme/host/pr.c
> index 26ad25f7280b..7a1d93da4970 100644
> --- a/drivers/nvme/host/pr.c
> +++ b/drivers/nvme/host/pr.c
> @@ -27,7 +27,7 @@ static char nvme_pr_type(enum pr_type type)
>   }
>   
>   static int nvme_send_ns_head_pr_command(struct block_device *bdev,
> - struct nvme_command *c, u8 *data, unsigned int data_len)
> + struct nvme_command *c, void *data, unsigned int data_len)
>   {
>   struct nvme_ns_head *head = bdev->bd_disk->private_data;
>   int srcu_idx = srcu_read_lock(&head->srcu);
> @@ -43,7 +43,7 @@ static int nvme_send_ns_head_pr_command(struct block_device 
> *bdev,
>   }
>   
>   static int nvme_send_ns_pr_command(struct nvme_ns *ns, struct nvme_command 
> *c,
> - u8 *data, unsigned int data_len)
> + void *data, unsigned int data_len)
>   {
>   c->common.nsid = cpu_to_le32(ns->head->ns_id);
>   return nvme_submit_sync_cmd(ns->queue, c, data, data_len);
> @@ -71,6 +71,17 @@ static int nvme_sc_to_pr_err(int nvme_sc)
>   }
>   }
>   
> +static int nvme_send_pr_command(struct block_device *bdev,
> + struct nvme_command *c, void *data, unsigned int data_len)
> +{
> + if (IS_ENABLED(CONFIG_NVME_MULTIPATH) &&
> + bdev->bd_disk->fops == &nvme_ns_head_ops)
> + return nvme_send_ns_head_pr_command(bdev, c, data, data_len);

the else below is not needed after the above return ..

-ck





Re: [dm-devel] [PATCH v4 07/18] nvme: Fix reservation status related structs

2023-03-05 Thread Chaitanya Kulkarni
On 2/24/2023 9:44 AM, Mike Christie wrote:
> This fixes the following issues with the reservation status structs:
> 
> 1. resv10 is bytes 23:10 so it should be 14 bytes.
> 2. regctl_ds only supports 64 bit host IDs.
> 
> These are not currently used, but will be in this patchset which adds
> support for the reservation report command.
> 
> Signed-off-by: Mike Christie 
> Reviewed-by: Keith Busch 
> ---

Looks good.

Reviewed-by: Chaitanya Kulkarni 

-ck




Re: [dm-devel] [LSF/MM/BPF ATTEND][LSF/MM/BPF TOPIC] : blktests: status, an expansion plan for the storage stack test framework

2023-01-22 Thread Chaitanya Kulkarni
On 1/18/23 18:20, Shinichiro Kawasaki wrote:
> CC+: Mike, dm-devel,
> 
> Hi Chaitanya, thanks for bringing this up! I definitely want to join and learn
> from the discussions. Here I note my comments about them.
> 
> On Jan 18, 2023 / 23:52, Chaitanya Kulkarni wrote:
> [...]
>> For storage track, I would like to propose a session dedicated to
>> blktests. It is a great opportunity for the storage developers to gather
>> and have a discussion about:-
>>
>> 1. Current status of the blktests framework.
> 
> In the session, I can talk shortly about recent blktests improvements and
> failure cases.
> 
>> 2. Any new/missing features that we want to add in the blktests.
> 
> A feature I wish is to mark dangerous test cases which cause system crash, in
> similar way as fstests does. I think they should be skipped by default not to
> confuse new blktests users.
> 
> I remember that dmesg logging was discussed at the last LSFMMBPF, but it is not
> yet implemented. It may be worth revisiting.
> 
>> 3. Any new kernel features that could be used to make testing easier?
>> 4. DM/MD Testcases.
> 
> I took a liberty to add Mike and dm-devel to CC. Recently, a patch was posted to
> add 'dm' test category [1]. I hope it will help DM/MD developers to add more
> tests in the category. I would like to discuss if it is a good start, or if
> anything is missing in blktests to support DM/MD testing.
> 
> [1] https://lore.kernel.org/linux-block/20221230065424.19998-1-yuku...@huaweicloud.com/#t

we really need to sort out the dm test cases; without them we cannot
establish baseline correctness for the block layer. I've already
discussed that at the last LSF.

> 
>>
>> E.g. Implementing new features in the null_blk.c in order to have device
>> independent complete test coverage. (e.g. adding discard command for
>> null_blk or any other specific REQ_OP). Discussion about having any new
>> tracepoint events in the block layer.
>>
>> 4. Any new test cases/categories which are lacking in the blktests
>> framework.
> 
> One thing in my mind is character device. From recent discussions [2][3], it
> looks worth adding some basic test cases for NVME character devices and SG
> devices.
> 

Agree

-ck




Re: [dm-devel] Virtio-blk extended lifetime feature

2023-01-18 Thread Chaitanya Kulkarni
On 1/17/23 14:30, Enrico Granata wrote:
> Hi,
> I am going to add +Jahdiel Alvarez who is also looking into a similar
> issue, and also I would like to hear thoughts of people who may have
> worked with (embedded or otherwise) storage more recently than I have
> 
> One thought that Jahdiel and myself were pondering is whether we need
> "type_a" and "type_b" fields at all, or if there should simply be a
> "wear estimate" field, which for eMMC, it could be max(typ_a, typ_b)
> but it could generalize to any number of cell or other algorithm, as
> long as it produces one unique estimate of wear
> 
> Thanks,
> - Enrico
> 
> 
> On Sun, Jan 15, 2023 at 12:56 AM Alvaro Karsz
>  wrote:
>>
>> Hi guys,
>>
>> While trying to upstream the implementation of VIRTIO_BLK_F_LIFETIME
>> feature, many developers suggested that this feature should be
>> extended to include more cell types, since its current implementation
>> in virtio spec is relevant for MMC and UFS devices only.
>>
>> The VIRTIO_BLK_F_LIFETIME defines the following fields:
>>
>> - pre_eol_info:  the percentage of reserved blocks that are consumed.
>> - device_lifetime_est_typ_a: wear of SLC cells.
>> - device_lifetime_est_typ_b: wear of MLC cells.

A few immediate comments :-

It needs to cover all the flash cell types.

Using names like type_a/type_b in the spec and in the implementation
adds unnecessary confusion and requires extra documentation in the
code; that naming needs to change.

>>
>> (https://docs.oasis-open.org/virtio/virtio/v1.2/virtio-v1.2.html)
>>
>> Following Michael's suggestion, I'd like to add to the virtio spec
>> with a new, extended lifetime command.
>> Since I'm more familiar with embedded type storage devices, I'd like
>> to ask you guys what fields you think should be included in the
>> extended command.
>>
>> Thanks,
>>
>> Alvaro

-ck



Re: [dm-devel] [PATCH 3/7] nvmet: introduce bdev_zone_no helper

2023-01-08 Thread Chaitanya Kulkarni
On 1/6/23 00:33, Pankaj Raghav wrote:
> A generic bdev_zone_no() helper is added to calculate zone number for a
> given sector in a block device. This helper internally uses disk_zone_no()
> to find the zone number.
> 
> Use the helper bdev_zone_no() to calculate nr of zones. This lets us
> make modifications to the math if needed in one place.
> 
> Signed-off-by: Pankaj Raghav 

with Bart's comments fixed...

looks good

Reviewed-by: Chaitanya Kulkarni 

-ck





Re: [dm-devel] [PATCH 1/7] block: remove superfluous check for request queue in bdev_is_zoned

2023-01-08 Thread Chaitanya Kulkarni
On 1/6/23 00:33, Pankaj Raghav wrote:
> Remove the superfluous request queue check in bdev_is_zoned() as the
> bdev_get_queue() can never return NULL.
> 
> Signed-off-by: Pankaj Raghav 
> ---

Looks good.

Reviewed-by: Chaitanya Kulkarni 

-ck





Re: [dm-devel] [PATCH 2/7] block: add a new helper bdev_{is_zone_start, offset_from_zone_start}

2023-01-08 Thread Chaitanya Kulkarni
On 1/6/23 00:33, Pankaj Raghav wrote:
> Instead of open coding to check for zone start, add a helper to improve
> readability and store the logic in one place.
> 
> bdev_offset_from_zone_start() will be used later in the series.
> 
> Signed-off-by: Pankaj Raghav 


Looks good.

Reviewed-by: Chaitanya Kulkarni 

-ck





Re: [dm-devel] [PATCH v3] virtio_blk: add VIRTIO_BLK_F_LIFETIME feature support

2022-12-12 Thread Chaitanya Kulkarni
On 12/11/22 01:49, Alvaro Karsz wrote:
>> Alvaro could you pls explain the use-case? Christoph has doubts that
>> it's useful. Do you have a device implementing this?
> 
> Our HW exposes virtio devices, virtio block included.
> The block device backend can be an eMMC/uSD card.
> 
> The HW can report health values (power, temp, current, voltage) and I
> thought that it would be nice to be able to report lifetime values as
> well.

Why not use the block layer passthru interface to fetch this instead
of adding a non-generic IOCTL?

-ck




Re: [dm-devel] [PATCH v3] virtio_blk: add VIRTIO_BLK_F_LIFETIME feature support

2022-12-12 Thread Chaitanya Kulkarni
Michael,

On 12/7/22 12:28, Michael S. Tsirkin wrote:
> On Wed, Dec 07, 2022 at 08:31:28AM -0800, Christoph Hellwig wrote:
>> On Wed, Dec 07, 2022 at 05:21:48AM -0500, Michael S. Tsirkin wrote:
>>> Christoph you acked the spec patch adding this to virtio blk:
>>>
>>> Still not a fan of the encoding, but at least it is properly documented
>>> now:
>>>
>>> Acked-by: Christoph Hellwig 
>>>
>>> Did you change your mind then? Or do you prefer a different encoding for
>>> the ioctl then? could you help by suggesting what kind?
>>
>> Well, it is good enough documented for a spec.  I don't think it is
>> a useful feature for Linux where virtio-blk is our minimum viable
>> paravirtualized block driver.
> 
> No idea what this means, sorry.  Now that's in the spec I expect (some)
> devices to implement it and if they do I see no reason not to expose the
> data to userspace.
> 

Even if a device implements it, it can always use passthru commands.
See below for more info...

> Alvaro could you pls explain the use-case? Christoph has doubts that
> it's useful. Do you have a device implementing this?
> 

From what I know, virtio-blk should be kept minimal and should not
add any storage-specific IOCTLs or features that will end up losing
its generic nature.

The IOCTL we are trying to add is flash-storage specific, which
goes against the generic nature of virtio-blk and makes it
non-generic. If we approve this it will open the door for
non-generic code/IOCTLs in virtio-blk, and that needs to be avoided.

For any storage-specific features or IOCTLs (flash/HDD), the device
should use its own frontend such as virtio-scsi or e.g. nvme, and
not virtio-blk.

Hope this helps.

-ck




Re: [RFC for-6.2/block V2] block: Change the granularity of io ticks from ms to ns

2022-12-07 Thread Chaitanya Kulkarni
On 12/7/22 15:08, Jens Axboe wrote:
> On 12/7/22 3:32?PM, Gulam Mohamed wrote:
>> As per the review comment from Jens Axboe, I am re-sending this patch
>> against "for-6.2/block".
>>
>>
>> Use ktime to change the granularity of IO accounting in block layer from
>> milli-seconds to nano-seconds to get the proper latency values for the
>> devices whose latency is in micro-seconds. After changing the granularity
>> to nano-seconds the iostat command, which was showing incorrect values for
>> %util, is now showing correct values.
>>
>> We did not work on the patch to drop the logic for
>> STAT_PRECISE_TIMESTAMPS yet. Will do it if this patch is ok.
>>
>> The iostat command was run after starting the fio with following command
>> on an NVME disk. For the same fio command, the iostat %util was showing
>> ~100% for the disks whose latencies are in the range of microseconds.
>> With the kernel changes (granularity to nano-seconds), the %util was
>> showing correct values. Following are the details of the test and their
>> output:
> 
> My default peak testing runs at 122M IOPS. That's also the peak IOPS of
> the devices combined, and with iostats disabled. If I enabled iostats,
> then the performance drops to 112M IOPS. It's no longer device limited,
> that's a drop of about 8.2%.
> 

Wow, clearly not acceptable; that's exactly why I asked for perf
numbers :).

-ck



Re: [RFC for-6.2/block V2] block: Change the granularity of io ticks from ms to ns

2022-12-07 Thread Chaitanya Kulkarni
On 12/7/22 14:32, Gulam Mohamed wrote:
> As per the review comment from Jens Axboe, I am re-sending this patch
> against "for-6.2/block".
> 

Why is this marked as RFC? Are you waiting for something more to get
resolved before this can be merged?

> 
> Use ktime to change the granularity of IO accounting in block layer from
> milli-seconds to nano-seconds to get the proper latency values for the
> devices whose latency is in micro-seconds. After changing the granularity
> to nano-seconds the iostat command, which was showing incorrect values for
> %util, is now showing correct values.
> 
> We did not work on the patch to drop the logic for
> STAT_PRECISE_TIMESTAMPS yet. Will do it if this patch is ok.
> 
> The iostat command was run after starting the fio with following command
> on an NVME disk. For the same fio command, the iostat %util was showing
> ~100% for the disks whose latencies are in the range of microseconds.
> With the kernel changes (granularity to nano-seconds), the %util was
> showing correct values. Following are the details of the test and their
> output:
> 
> fio command
> ---
> [global]
> bs=128K
> iodepth=1
> direct=1
> ioengine=libaio
> group_reporting
> time_based
> runtime=90
> thinktime=1ms
> numjobs=1
> name=raw-write
> rw=randrw
> ignore_error=EIO:EIO
> [job1]
> filename=/dev/nvme0n1
> 
> Correct values after kernel changes:
> 
> iostat output
> -
> iostat -d /dev/nvme0n1 -x 1
> 
> Device   r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util
> nvme0n1     0.08    0.05   0.06   128.00   128.00   0.07   6.50
> 
> Device   r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util
> nvme0n1     0.08    0.06   0.06   128.00   128.00   0.07   6.30
> 
> Device   r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util
> nvme0n1     0.06    0.05   0.06   128.00   128.00   0.06   5.70
> 
> From fio
> 
> Read Latency: clat (usec): min=32, max=2335, avg=79.54, stdev=29.95
> Write Latency: clat (usec): min=38, max=130, avg=57.76, stdev= 3.25
> 
> Values before kernel changes
> 
> iostat output
> -
> 
> iostat -d /dev/nvme0n1 -x 1
> 
> Device   r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util
> nvme0n1     0.08    0.06   0.06   128.00   128.00   1.07  97.70
> 
> Device   r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util
> nvme0n1     0.08    0.06   0.06   128.00   128.00   1.08  98.80
> 
> Device   r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util
> nvme0n1     0.08    0.05   0.06   128.00   128.00   1.06  97.20
> 
> From fio
> 
> Read Latency: clat (usec): min=33, max=468, avg=79.56, stdev=28.04
> Write Latency: clat (usec): min=9, max=139, avg=57.10, stdev= 3.79
> 
> Changes in V2:
> 1. Changed the try_cmpxchg() to try_cmpxchg64() in function
> update_io_ticks()as the values being compared are u64 which was giving
> a build error on i386 and microblaze
> 
> Signed-off-by: Gulam Mohamed 
> ---

I believe it has no effect on the overall performance; if so, I'd
document that.

Based on the quantitative data present in the commit log this
looks good to me; I assume you did audit all the drivers and
places in the block layer.

Looks good.

Reviewed-by: Chaitanya Kulkarni 

-ck


Re: [dm-devel] [PATCH v3] virtio_blk: add VIRTIO_BLK_F_LIFETIME feature support

2022-12-06 Thread Chaitanya Kulkarni
Bart,

On 12/6/22 10:43, Bart Van Assche wrote:
> On 12/5/22 08:20, Alvaro Karsz wrote:
>> +/* Get lifetime information struct for each request */
>> +struct virtio_blk_lifetime {
>> +    /*
>> + * specifies the percentage of reserved blocks that are consumed.
>> + * optional values following virtio spec:
>> + * 0 - undefined
>> + * 1 - normal, < 80% of reserved blocks are consumed
>> + * 2 - warning, 80% of reserved blocks are consumed
>> + * 3 - urgent, 90% of reserved blocks are consumed
>> + */
>> +    __le16 pre_eol_info;
>> +    /*
>> + * this field refers to wear of SLC cells and is provided in
>> + * increments of 10%, with 1 meaning 0-10% device lifetime used,
>> + * and so on, thru to 11 meaning estimated lifetime exceeded. All
>> + * values above 11 are reserved
>> + */
>> +    __le16 device_lifetime_est_typ_a;
>> +    /*
>> + * this field refers to wear of MLC cells and is provided with 
>> the same semantics as
>> + * device_lifetime_est_typ_a
>> + */
>> +    __le16 device_lifetime_est_typ_b;
>> +};
> 
> Why does the above data structure only refer to SLC and MLC but not to
> TLC or QLC?
> 
> How will this data structure be extended without having to introduce a
> new ioctl?
> 
> Thanks,
> 
> Bart.
> 

We already discussed that see :-

https://lists.linuxfoundation.org/pipermail/virtualization/2022-November/063742.html

-ck





Re: [dm-devel] [PATCH v3] virtio_blk: add VIRTIO_BLK_F_LIFETIME feature support

2022-12-05 Thread Chaitanya Kulkarni
On 12/5/22 12:35, Enrico Granata wrote:
> The original definitions for these fields come from JESD84-B50, which
> is what eMMC storage uses. It has been a while, but I recall UFS doing
> something pretty similar.
> Systems that don't have a well defined notion of durability would just
> not expose the flag (e.g. a spinning disk), and going for what
> eMMC/UFS expose already would make implementations fairly seamless for
> a lot of common embedded scenarios.
> 

Has it been considered that there might be non-embedded deployments
where the virtio-blk frontend can also benefit from this feature, with
more flash cell types than are present in the approved spec?

> Of course, if you see room for improvement to the spec, I'd be very
> interested in hearing your thoughts
> 
> Thanks,
> - Enrico
> 
> 

From a quick look :-

The struct virtio_blk_lifetime is too narrow, supporting only two
types; at a minimum it needs to support the available flash cell
types, and the members need to be renamed to reflect the actual
type for more clarity.

hope this helps ...

-ck




Re: [dm-devel] [PATCH v3] virtio_blk: add VIRTIO_BLK_F_LIFETIME feature support

2022-12-05 Thread Chaitanya Kulkarni
On 12/5/22 10:24, Jens Axboe wrote:
> On 12/5/22 9:20 AM, Alvaro Karsz wrote:
>> Implement the VIRTIO_BLK_F_LIFETIME feature for VirtIO block devices.
>>
>> This commit introduces a new ioctl command, VBLK_LIFETIME.
>>
>> VBLK_LIFETIME ioctl asks for the block device to provide lifetime
>> information by sending a VIRTIO_BLK_T_GET_LIFETIME command to the device.
>>
>> lifetime information fields:
>>
>> - pre_eol_info: specifies the percentage of reserved blocks that are
>>  consumed.
>>  optional values following virtio spec:
>>  *) 0 - undefined.
>>  *) 1 - normal, < 80% of reserved blocks are consumed.
>>  *) 2 - warning, 80% of reserved blocks are consumed.
>>  *) 3 - urgent, 90% of reserved blocks are consumed.
>>
>> - device_lifetime_est_typ_a: this field refers to wear of SLC cells and
>>   is provided in increments of 10%, with 1
>>   meaning 0-10% used, and so on, thru to 11
>>   meaning estimated lifetime exceeded. All
>>   values above 11 are reserved.
>>
>> - device_lifetime_est_typ_b: this field refers to wear of MLC cells and is
>>   provided with the same semantics as
>>   device_lifetime_est_typ_a.
>>
>> The data received from the device will be sent as is to the user.
>> No data check/decode is done by virtblk.
> 
> Is this based on some spec? Because it looks pretty odd to me. There
> can be a pretty wide range of two/three/etc level cells with wildly
> different ranges of durability. And there's really not a lot of slc
> for generic devices these days, if any.
> 

This is exactly what I said on the initial version about new types ...

-ck



Re: [dm-devel] [PATCH v5 00/10] Implement copy offload support

2022-11-29 Thread Chaitanya Kulkarni
On 11/29/22 04:16, Nitesh Shetty wrote:
> On Wed, Nov 23, 2022 at 10:56:23PM +0000, Chaitanya Kulkarni wrote:
>> (+ Shinichiro)
>>
>> On 11/22/22 21:58, Nitesh Shetty wrote:
>>> The patch series covers the points discussed in November 2021 virtual
>>> call [LSF/MM/BFP TOPIC] Storage: Copy Offload [0].
>>> We have covered the initial agreed requirements in this patchset and
>>> further additional features suggested by community.
>>> Patchset borrows Mikulas's token based approach for 2 bdev
>>> implementation.
>>>
>>> This is on top of our previous patchset v4[1].
>>
>> Now that the series is converging: since the patch series touches
>> drivers and key components in the block layer, you need to accompany
>> it with blktests that cover the corner cases in the drivers which
>> support these operations, as I mentioned in the call last year.
>>
>> If you need any help with that, feel free to send an email to linux-block
>> and CC me or Shinichiro (added in CC)...
>>
>> -ck
>>
> 
> Yes, any help would be appreciated. I am not familiar with the blktests
> development/testing cycle. Do we need to add blktests along with the
> patch series, or after the patch series gets merged (to be merged)?
> 
> Thanks
> Nitesh
> 
> 

We have many testcases you can refer to as examples.
Your cover letter mentions that you have tested this code; just move
all those testcases to blktests.

More importantly, for a feature like this you should provide
outstanding testcases in your github tree when you post the
series; they should cover the critical parts of the block layer
and the drivers in question.

The objective here is to have blktests updated when the code
is upstream, so all the distros can test the code from the
upstream blktests repo. You can refer to what we have done
for NVMeOF in-band authentication (thanks to Hannes and Sagi)
in the linux-nvme email archives.

-ck




Re: [dm-devel] [PATCH v5 00/10] Implement copy offload support

2022-11-23 Thread Chaitanya Kulkarni
(+ Shinichiro)

On 11/22/22 21:58, Nitesh Shetty wrote:
> The patch series covers the points discussed in November 2021 virtual
> call [LSF/MM/BFP TOPIC] Storage: Copy Offload [0].
> We have covered the initial agreed requirements in this patchset and
> further additional features suggested by community.
> Patchset borrows Mikulas's token based approach for 2 bdev
> implementation.
> 
> This is on top of our previous patchset v4[1].

Now that the series is converging: since the patch series touches
drivers and key components in the block layer, you need to accompany
it with blktests that cover the corner cases in the drivers which
support these operations, as I mentioned in the call last year.

If you need any help with that, feel free to send an email to linux-block
and CC me or Shinichiro (added in CC)...

-ck




Re: [dm-devel] [PATCH v3 04/19] scsi: Add support for block PR read keys/reservation

2022-10-31 Thread Chaitanya Kulkarni


> +static inline enum pr_type scsi_pr_type_to_block(enum scsi_pr_type type)
> +{
> + switch (type) {
> + case SCSI_PR_WRITE_EXCLUSIVE:
> + return PR_WRITE_EXCLUSIVE;
> + case SCSI_PR_EXCLUSIVE_ACCESS:
> + return PR_EXCLUSIVE_ACCESS;
> + case SCSI_PR_WRITE_EXCLUSIVE_REG_ONLY:
> + return PR_WRITE_EXCLUSIVE_REG_ONLY;
> + case SCSI_PR_EXCLUSIVE_ACCESS_REG_ONLY:
> + return PR_EXCLUSIVE_ACCESS_REG_ONLY;
> + case SCSI_PR_WRITE_EXCLUSIVE_ALL_REGS:
> + return PR_WRITE_EXCLUSIVE_ALL_REGS;
> + case SCSI_PR_EXCLUSIVE_ACCESS_ALL_REGS:
> + return PR_EXCLUSIVE_ACCESS_ALL_REGS;
> + default:
> + return 0;
> + }
> +}
> +

Same here as the previous comment, unless there is a strong reason
not to do that ...

-ck




Re: [dm-devel] [PATCH v3 03/19] scsi: Move sd_pr_type to header to share

2022-10-31 Thread Chaitanya Kulkarni
On 10/26/22 16:19, Mike Christie wrote:
> LIO is going to want to do the same block to/from SCSI pr types as sd.c
> so this moves the sd_pr_type helper to a new file. The next patch will
> then also add a helper to go from the SCSI value to the block one for use
> with PERSISTENT_RESERVE_IN commands.
> 
> Signed-off-by: Mike Christie 
> Reviewed-by: Christoph Hellwig 
> ---
>   drivers/scsi/sd.c| 31 +++
>   include/scsi/scsi_block_pr.h | 36 
>   2 files changed, 43 insertions(+), 24 deletions(-)
>   create mode 100644 include/scsi/scsi_block_pr.h
> 
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index 4dc5c932fbd3..ad9374b07585 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -67,6 +67,7 @@
>   #include 
>   #include 
>   #include 
> +#include 
>   
>   #include "sd.h"
>   #include "scsi_priv.h"
> @@ -1694,28 +1695,8 @@ static int sd_get_unique_id(struct gendisk *disk, u8 
> id[16],
>   return ret;
>   }
>   
> -static char sd_pr_type(enum pr_type type)
> -{
> - switch (type) {
> - case PR_WRITE_EXCLUSIVE:
> - return 0x01;
> - case PR_EXCLUSIVE_ACCESS:
> - return 0x03;
> - case PR_WRITE_EXCLUSIVE_REG_ONLY:
> - return 0x05;
> - case PR_EXCLUSIVE_ACCESS_REG_ONLY:
> - return 0x06;
> - case PR_WRITE_EXCLUSIVE_ALL_REGS:
> - return 0x07;
> - case PR_EXCLUSIVE_ACCESS_ALL_REGS:
> - return 0x08;
> - default:
> - return 0;
> - }
> -};
> -
>   static int sd_pr_out_command(struct block_device *bdev, u8 sa,
> - u64 key, u64 sa_key, u8 type, u8 flags)
> + u64 key, u64 sa_key, enum scsi_pr_type type, u8 flags)
>   {
>   struct scsi_disk *sdkp = scsi_disk(bdev->bd_disk);
>   struct scsi_device *sdev = sdkp->device;
> @@ -1778,19 +1759,21 @@ static int sd_pr_reserve(struct block_device *bdev, 
> u64 key, enum pr_type type,
>   {
>   if (flags)
>   return -EOPNOTSUPP;
> - return sd_pr_out_command(bdev, 0x01, key, 0, sd_pr_type(type), 0);
> + return sd_pr_out_command(bdev, 0x01, key, 0,
> +  block_pr_type_to_scsi(type), 0);
>   }
>   
>   static int sd_pr_release(struct block_device *bdev, u64 key, enum pr_type 
> type)
>   {
> - return sd_pr_out_command(bdev, 0x02, key, 0, sd_pr_type(type), 0);
> + return sd_pr_out_command(bdev, 0x02, key, 0,
> +  block_pr_type_to_scsi(type), 0);
>   }
>   
>   static int sd_pr_preempt(struct block_device *bdev, u64 old_key, u64 
> new_key,
>   enum pr_type type, bool abort)
>   {
>   return sd_pr_out_command(bdev, abort ? 0x05 : 0x04, old_key, new_key,
> -  sd_pr_type(type), 0);
> +  block_pr_type_to_scsi(type), 0);
>   }
>   
>   static int sd_pr_clear(struct block_device *bdev, u64 key)
> diff --git a/include/scsi/scsi_block_pr.h b/include/scsi/scsi_block_pr.h
> new file mode 100644
> index ..6e99f844330d
> --- /dev/null
> +++ b/include/scsi/scsi_block_pr.h
> @@ -0,0 +1,36 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _SCSI_BLOCK_PR_H
> +#define _SCSI_BLOCK_PR_H
> +
> +#include 
> +
> +enum scsi_pr_type {
> + SCSI_PR_WRITE_EXCLUSIVE = 0x01,
> + SCSI_PR_EXCLUSIVE_ACCESS= 0x03,
> + SCSI_PR_WRITE_EXCLUSIVE_REG_ONLY= 0x05,
> + SCSI_PR_EXCLUSIVE_ACCESS_REG_ONLY   = 0x06,
> + SCSI_PR_WRITE_EXCLUSIVE_ALL_REGS= 0x07,
> + SCSI_PR_EXCLUSIVE_ACCESS_ALL_REGS   = 0x08,
> +};
> +
> +static inline enum scsi_pr_type block_pr_type_to_scsi(enum pr_type type)
> +{
> + switch (type) {
> + case PR_WRITE_EXCLUSIVE:
> + return SCSI_PR_WRITE_EXCLUSIVE;
> + case PR_EXCLUSIVE_ACCESS:
> + return SCSI_PR_EXCLUSIVE_ACCESS;
> + case PR_WRITE_EXCLUSIVE_REG_ONLY:
> + return SCSI_PR_WRITE_EXCLUSIVE_REG_ONLY;
> + case PR_EXCLUSIVE_ACCESS_REG_ONLY:
> + return SCSI_PR_EXCLUSIVE_ACCESS_REG_ONLY;
> + case PR_WRITE_EXCLUSIVE_ALL_REGS:
> + return SCSI_PR_WRITE_EXCLUSIVE_ALL_REGS;
> + case PR_EXCLUSIVE_ACCESS_ALL_REGS:
> + return SCSI_PR_EXCLUSIVE_ACCESS_ALL_REGS;
> + default:
> + return 0;
> + }
> +};


Do we need the above semicolon?

How about not using the switch-case pattern? Totally untested sketch below:

static inline enum scsi_pr_type block_pr_type_to_scsi(enum pr_type type)
{
	static const enum scsi_pr_type pr_to_scsi_pr[] = {
		[PR_WRITE_EXCLUSIVE]		= SCSI_PR_WRITE_EXCLUSIVE,
		[PR_EXCLUSIVE_ACCESS]		= SCSI_PR_EXCLUSIVE_ACCESS,
		[PR_WRITE_EXCLUSIVE_REG_ONLY]	= SCSI_PR_WRITE_EXCLUSIVE_REG_ONLY,
		[PR_EXCLUSIVE_ACCESS_REG_ONLY]	= SCSI_PR_EXCLUSIVE_ACCESS_REG_ONLY,
		[PR_WRITE_EXCLUSIVE_ALL_REGS]	= SCSI_PR_WRITE_EXCLUSIVE_ALL_REGS,
		[PR_EXCLUSIVE_ACCESS_ALL_REGS]	= SCSI_PR_EXCLUSIVE_ACCESS_ALL_REGS,
	};

	if (type < PR_WRITE_EXCLUSIVE || type > PR_EXCLUSIVE_ACCESS_ALL_REGS)
		return 0;
	return pr_to_scsi_pr[type];
}

Re: [dm-devel] [PATCH v3 02/19] scsi: Rename sd_pr_command

2022-10-31 Thread Chaitanya Kulkarni
On 10/26/22 16:19, Mike Christie wrote:
> Rename sd_pr_command to sd_pr_out_command to match a
> sd_pr_in_command helper added in the next patches.
> 
> Signed-off-by: Mike Christie 
> Reviewed-by: Christoph Hellwig 
> ---
>   drivers/scsi/sd.c | 12 ++--
>

Reviewed-by: Chaitanya Kulkarni 

-ck





Re: [dm-devel] [PATCH v3 07/19] nvme: Don't hardcode the data len for pr commands

2022-10-31 Thread Chaitanya Kulkarni
On 10/26/22 16:19, Mike Christie wrote:
> Reservation Report support needs to pass in a variable sized buffer, so
> this patch has the pr command helpers take a data length argument.
> 
> Signed-off-by: Mike Christie 
> Reviewed-by: Christoph Hellwig 
> ---


Reviewed-by: Chaitanya Kulkarni 

-ck




Re: [dm-devel] [PATCH v3 08/19] nvme: Move pr code to it's own file

2022-10-31 Thread Chaitanya Kulkarni
On 10/26/22 16:19, Mike Christie wrote:
> This patch moves the pr code to it's own file because I'm going to be
> adding more functions and core.c is getting bigger.
> 
> Signed-off-by: Mike Christie 
> ---
>

Thanks a lot for doing this ...

Reviewed-by: Chaitanya Kulkarni 

-ck




Re: [dm-devel] [PATCH v2 4/4] brd: implement secure erase and write zeroes

2022-09-20 Thread Chaitanya Kulkarni


>   /*
> @@ -300,23 +303,34 @@ out:
>   
>   void brd_do_discard(struct brd_device *brd, struct bio *bio)
>   {
> - sector_t sector, len, front_pad;
> + bool zero_padding;
> + sector_t sector, len, front_pad, end_pad;
>   
>   if (unlikely(!discard)) {
>   bio->bi_status = BLK_STS_NOTSUPP;
>   return;
>   }
>   
> + zero_padding = bio_op(bio) == REQ_OP_SECURE_ERASE || bio_op(bio) == REQ_OP_WRITE_ZEROES;
>   sector = bio->bi_iter.bi_sector;
>   len = bio_sectors(bio);
>   front_pad = -sector & (PAGE_SECTORS - 1);
> +
> + if (zero_padding && unlikely(front_pad != 0))
> + copy_to_brd(brd, page_address(ZERO_PAGE(0)), sector, min(len, front_pad) << SECTOR_SHIFT);
> +
>   sector += front_pad;
>   if (unlikely(len <= front_pad))
>   return;
>   len -= front_pad;
> - len = round_down(len, PAGE_SECTORS);
> +
> + end_pad = len & (PAGE_SECTORS - 1);
> + if (zero_padding && unlikely(end_pad != 0))
> + copy_to_brd(brd, page_address(ZERO_PAGE(0)), sector + len - end_pad, end_pad << SECTOR_SHIFT);
> + len -= end_pad;
> +
>
Is it possible to avoid these long lines?

-ck





Re: [dm-devel] [PATCH v2 1/4] brd: make brd_insert_page return bool

2022-09-20 Thread Chaitanya Kulkarni
On 9/20/22 10:53, Mikulas Patocka wrote:
> brd_insert_page returns a pointer to struct page, however the pointer is
> never used (it is only compared against NULL), so to clean-up the code, we
> make it return bool.
> 
> Signed-off-by: Mikulas Patocka 
> Reviewed-by: Christoph Hellwig 
> 
> ---

Reviewed-by: Chaitanya Kulkarni 




Re: [PATCH] checkpatch: Add kmap and kmap_atomic to the deprecated list

2022-08-13 Thread Chaitanya Kulkarni
On 8/13/22 15:00, ira.we...@intel.com wrote:
> From: Ira Weiny 
> 
> kmap() and kmap_atomic() are being deprecated in favor of
> kmap_local_page().
> 
> There are two main problems with kmap(): (1) It comes with an overhead
> as mapping space is restricted and protected by a global lock for
> synchronization and (2) it also requires global TLB invalidation when
> the kmap’s pool wraps and it might block when the mapping space is fully
> utilized until a slot becomes available.
> 
> kmap_local_page() is safe from any context and is therefore redundant
> with kmap_atomic() with the exception of any pagefault or preemption
> disable requirements.  However, using kmap_atomic() for these side
> effects makes the code less clear.  So any requirement for pagefault or
> preemption disable should be made explicitly.
> 
> With kmap_local_page() the mappings are per thread, CPU local, can take
> page faults, and can be called from any context (including interrupts).
> It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore,
> the tasks can be preempted and, when they are scheduled to run again,
> the kernel virtual addresses are restored.
> 
> Suggested-by: Thomas Gleixner 
> Suggested-by: Fabio M. De Francesco 
> Signed-off-by: Ira Weiny 
> 
> ---
> Suggested by credits.
>   Thomas: Idea to keep from growing more kmap/kmap_atomic calls.
>   Fabio: Stole some of his boiler plate commit message.
> 
> Notes on tree-wide conversions:
> 
> I've cc'ed mailing lists for subsystems which currently contains either kmap()
> or kmap_atomic() calls.  As some of you already know Fabio and I have been
> working through converting kmap() calls to kmap_local_page().  But there is a
> lot more work to be done.  Help from the community is always welcome,
> especially with kmap_atomic() conversions.  To keep from stepping on each
> others toes I've created a spreadsheet of the current calls[1].  Please let me
> or Fabio know if you plan on tacking one of the conversions so we can mark it
> off the list.
> 
> [1] 
> https://docs.google.com/spreadsheets/d/1i_ckZ10p90bH_CkxD2bYNi05S2Qz84E2OFPv8zq__0w/edit#gid=1679714357
> 

Looks good.

Reviewed-by: Chaitanya Kulkarni 




Re: [dm-devel] [PATCH 11/30] affs: use bdev_nr_sectors instead of open coding it

2021-10-18 Thread Chaitanya Kulkarni
On 10/15/21 06:26, Christoph Hellwig wrote:
> Use the proper helper to read the block device size.
> 
> Signed-off-by: Christoph Hellwig 
> Reviewed-by: Kees Cook 

Looks good.

Reviewed-by: Chaitanya Kulkarni 






Re: [dm-devel] [PATCH 03/30] bcache: remove bdev_sectors

2021-10-18 Thread Chaitanya Kulkarni
On 10/15/21 06:26, Christoph Hellwig wrote:
> Use the equivalent block layer helper instead.
> 
> Signed-off-by: Christoph Hellwig 
> Reviewed-by: Kees Cook 
> Acked-by: Coly Li 
> ---

Looks good.

Reviewed-by: Chaitanya Kulkarni 






Re: [dm-devel] [PATCH 25/30] block: add a sb_bdev_nr_blocks helper

2021-10-18 Thread Chaitanya Kulkarni
On 10/15/21 06:26, Christoph Hellwig wrote:
> Add a helper to return the size of sb->s_bdev in sb->s_blocksize_bits
> based units.  Note that SECTOR_SHIFT has to be open coded due to
> include dependency issues for now, but I have a plan to sort that out
> eventually.
> 
> Signed-off-by: Christoph Hellwig 

Looks good.

Reviewed-by: Chaitanya Kulkarni 






Re: [dm-devel] [PATCH 22/30] reiserfs: use bdev_nr_bytes instead of open coding it

2021-10-18 Thread Chaitanya Kulkarni
On 10/15/21 06:26, Christoph Hellwig wrote:
> Use the proper helper to read the block device size and remove two
> cargo culted checks that can't be false.
> 
> Signed-off-by: Christoph Hellwig 
> Reviewed-by: Jan Kara 

Looks good.

Reviewed-by: Chaitanya Kulkarni 






Re: [dm-devel] [PATCH 26/30] ext4: use sb_bdev_nr_blocks

2021-10-18 Thread Chaitanya Kulkarni
On 10/15/21 06:26, Christoph Hellwig wrote:
> Use the sb_bdev_nr_blocks helper instead of open coding it.
> 
> Signed-off-by: Christoph Hellwig 
> Reviewed-by: Kees Cook 
> Reviewed-by: Jan Kara 
> Acked-by: Theodore Ts'o 
>

Looks good.

Reviewed-by: Chaitanya Kulkarni 






Re: [dm-devel] [PATCH 01/30] block: move the SECTOR_SIZE related definitions to blk_types.h

2021-10-18 Thread Chaitanya Kulkarni
On 10/15/21 06:26, Christoph Hellwig wrote:
> Ensure these are always available for inlines in the various block layer
> headers.
> 
> Signed-off-by: Christoph Hellwig 

Looks good.

Reviewed-by: Chaitanya Kulkarni 






Re: [dm-devel] [PATCH 06/29] nvmet: use bdev_nr_sectors instead of open coding it

2021-10-14 Thread Chaitanya Kulkarni
On 10/12/2021 10:10 PM, Christoph Hellwig wrote:
> Use the proper helper to read the block device size.
> 
> Signed-off-by: Christoph Hellwig 
> ---

Looks good.

Reviewed-by: Chaitanya Kulkarni 




Re: [dm-devel] [PATCH 07/29] target/iblock: use bdev_nr_sectors instead of open coding it

2021-10-14 Thread Chaitanya Kulkarni
On 10/12/2021 10:10 PM, Christoph Hellwig wrote:
> Use the proper helper to read the block device size.
> 
> Signed-off-by: Christoph Hellwig 
> ---

Bodo's comment is good for code reliability.

Either way, looks good.

Reviewed-by: Chaitanya Kulkarni 






Re: [dm-devel] [PATCH 08/29] fs: use bdev_nr_sectors instead of open coding it in blkdev_max_block

2021-10-14 Thread Chaitanya Kulkarni
On 10/12/2021 10:10 PM, Christoph Hellwig wrote:
> Use the proper helper to read the block device size.
> 
> Signed-off-by: Christoph Hellwig 
> ---


Looks good.

Reviewed-by: Chaitanya Kulkarni 






Re: [dm-devel] [PATCH 11/29] btrfs: use bdev_nr_sectors instead of open coding it

2021-10-14 Thread Chaitanya Kulkarni
On 10/12/2021 10:10 PM, Christoph Hellwig wrote:
> Use the proper helper to read the block device size.
> 
> Signed-off-by: Christoph Hellwig 


Looks good.

Reviewed-by: Chaitanya Kulkarni 






Re: [dm-devel] [PATCH 02/29] drbd: use bdev_nr_sectors instead of open coding it

2021-10-14 Thread Chaitanya Kulkarni
On 10/12/2021 10:10 PM, Christoph Hellwig wrote:
> Use the proper helper to read the block device size.
> 
> Signed-off-by: Christoph Hellwig 

Looks good.

Reviewed-by: Chaitanya Kulkarni 




Re: [dm-devel] [PATCH 23/29] block: use bdev_nr_sectors instead of open coding it in blkdev_fallocate

2021-10-14 Thread Chaitanya Kulkarni
On 10/12/2021 10:10 PM, Christoph Hellwig wrote:
> Use the proper helper to read the block device size.
> 
> Signed-off-by: Christoph Hellwig 

Looks good.

Reviewed-by: Chaitanya Kulkarni 






Re: [dm-devel] [LSF/MM/BFP ATTEND] [LSF/MM/BFP TOPIC] Storage: Copy Offload

2021-10-01 Thread Chaitanya Kulkarni
Javier,

> 
> Hi all,
> 
> Since we are not going to be able to talk about this at LSF/MM, a few of
> us thought about holding a dedicated virtual discussion about Copy
> Offload. I believe we can use Chaitanya's thread as a start. Given the
> current state of the current patches, I would propose that we focus on
> the next step to get the minimal patchset that can go upstream so that
> we can build from there.
> 

I agree with having a call; I have been trying to have this discussion 
for two years now.

Before we set up a call, please summarize the following here :-

1. Exactly what work has been done so far.
2. What kind of feedback you got.
3. What are the exact blockers/objections.
4. Potential ways of moving forward.

Although all of this information is present in the mailing archives, it is 
scattered all over the place. Looking at the long CC list above, we need 
to get everyone on the same page in order to have a productive call.

Once we have had the above discussion, we can set up a precise agenda and 
assign time slots.

> Before we try to find a date and a time that fits most of us, who would
> be interested in participating?
> 
> Thanks,
> Javier

-ck




Re: [dm-devel] [PATCH v2 1/4] blk-crypto-fallback: properly prefix function and struct names

2021-09-17 Thread Chaitanya Kulkarni
On 9/16/21 10:22 AM, Eric Biggers wrote:
> 
> From: Eric Biggers 
> 
> For clarity, avoid using just the "blk_crypto_" prefix for functions and
> structs that are specific to blk-crypto-fallback.  Instead, use
> "blk_crypto_fallback_".  Some places already did this, but others
> didn't.
> 
> This is also a prerequisite for using "struct blk_crypto_keyslot" to
> mean a generic blk-crypto keyslot (which is what it sounds like).
> Rename the fallback one to "struct blk_crypto_fallback_keyslot".
> 
> Reviewed-by: Christoph Hellwig 
> Signed-off-by: Eric Biggers 

Looks good. It might be useful to mention that this patch doesn't change
any functionality, though that is not a hard requirement.

Reviewed-by: Chaitanya Kulkarni 






Re: [dm-devel] [PATCH 01/15] bvec: add a bvec_virt helper

2021-08-09 Thread Chaitanya Kulkarni




On 8/4/2021 2:56 AM, Christoph Hellwig wrote:

> Add a helper to get the virtual address for a bvec.  This avoids that
> all callers need to know about the page + offset representation.
>
> Signed-off-by: Christoph Hellwig


Looks good.

Reviewed-by: Chaitanya Kulkarni 

--
-ck




Re: [dm-devel] [PATCH 02/15] block: use bvec_virt in bio_integrity_{process, free}

2021-08-09 Thread Chaitanya Kulkarni




On 8/4/2021 2:56 AM, Christoph Hellwig wrote:

> Use the bvec_virt helper to clean up the bio integrity processing a
> little bit.
>
> Signed-off-by: Christoph Hellwig


Looks good.

Reviewed-by: Chaitanya Kulkarni 

--
-ck




Re: [dm-devel] [LSF/MM/BFP ATTEND] [LSF/MM/BFP TOPIC] Storage: Copy Offload

2021-06-11 Thread Chaitanya Kulkarni
On 5/10/21 17:15, Chaitanya Kulkarni wrote:
> Hi,
>
> * Background :-
> ---
>
> Copy offload is a feature that allows file-systems or storage devices
> to be instructed to copy files/logical blocks without requiring
> involvement of the local CPU.
>
> With reference to the RISC-V summit keynote [1] single threaded
> performance is limiting due to Denard scaling and multi-threaded
> performance is slowing down due Moore's law limitations. With the rise
> of SNIA Computation Technical Storage Working Group (TWG) [2],
> offloading computations to the device or over the fabrics is becoming
> popular as there are several solutions available [2]. One of the common
> operation which is popular in the kernel and is not merged yet is Copy
> offload over the fabrics or on to the device.
>
> * Problem :-
> ---
>
> The original work which is done by Martin is present here [3]. The
> latest work which is posted by Mikulas [4] is not merged yet. These two
> approaches are totally different from each other. Several storage
> vendors discourage mixing copy offload requests with regular READ/WRITE
> I/O. Also, the fact that the operation fails if a copy request ever
> needs to be split as it traverses the stack it has the unfortunate
> side-effect of preventing copy offload from working in pretty much
> every common deployment configuration out there.
>
> * Current state of the work :-
> ---
>
> With [3] being hard to handle arbitrary DM/MD stacking without
> splitting the command in two, one for copying IN and one for copying
> OUT. Which is then demonstrated by the [4] why [3] it is not a suitable
> candidate. Also, with [4] there is an unresolved problem with the
> two-command approach about how to handle changes to the DM layout
> between an IN and OUT operations.
>
> * Why Linux Kernel Storage System needs Copy Offload support now ?
> ---
>
> With the rise of the SNIA Computational Storage TWG and solutions [2],
> existing SCSI XCopy support in the protocol, recent advancement in the
> Linux Kernel File System for Zoned devices (Zonefs [5]), Peer to Peer
> DMA support in the Linux Kernel mainly for NVMe devices [7] and
> eventually NVMe Devices and subsystem (NVMe PCIe/NVMeOF) will benefit
> from Copy offload operation.
>
> With this background we have significant number of use-cases which are
> strong candidates waiting for outstanding Linux Kernel Block Layer Copy
> Offload support, so that Linux Kernel Storage subsystem can to address
> previously mentioned problems [1] and allow efficient offloading of the
> data related operations. (Such as move/copy etc.)
>
> For reference following is the list of the use-cases/candidates waiting
> for Copy Offload support :-
>
> 1. SCSI-attached storage arrays.
> 2. Stacking drivers supporting XCopy DM/MD.
> 3. Computational Storage solutions.
> 7. File systems :- Local, NFS and Zonefs.
> 4. Block devices :- Distributed, local, and Zoned devices.
> 5. Peer to Peer DMA support solutions.
> 6. Potentially NVMe subsystem both NVMe PCIe and NVMeOF.
>
> * What we will discuss in the proposed session ?
> ---
>
> I'd like to propose a session to go over this topic to understand :-
>
> 1. What are the blockers for Copy Offload implementation ?
> 2. Discussion about having a file system interface.
> 3. Discussion about having right system call for user-space.
> 4. What is the right way to move this work forward ?
> 5. How can we help to contribute and move this work forward ?
>
> * Required Participants :-
> ---
>
> I'd like to invite file system, block layer, and device drivers
> developers to:-
>
> 1. Share their opinion on the topic.
> 2. Share their experience and any other issues with [4].
> 3. Uncover additional details that are missing from this proposal.
>
> Required attendees :-
>
> Martin K. Petersen
> Jens Axboe
> Christoph Hellwig
> Bart Van Assche
> Zach Brown
> Roland Dreier
> Ric Wheeler
> Trond Myklebust
> Mike Snitzer
> Keith Busch
> Sagi Grimberg
> Hannes Reinecke
> Frederick Knight
> Mikulas Patocka
> Keith Busch
>
> Regards,
> Chaitanya
>
> [1]https://content.riscv.org/wp-content/uploads/2018/12/A-New-Golden-Age-for-Computer-Architecture-History-Challenges-and-Opportunities-David-Patterson-.pdf
> [2] https://www.snia.org/compu

Re: [dm-devel] [PATCH 13/16] block: use memcpy_from_bvec in bio_copy_kern_endio_read

2021-06-08 Thread Chaitanya Kulkarni
On 6/8/21 09:09, Christoph Hellwig wrote:
> Use memcpy_from_bvec instead of open coding the logic.
>
> Signed-off-by: Christoph Hellwig 

Looks good.

Reviewed-by: Chaitanya Kulkarni 







Re: [dm-devel] [PATCH 12/16] block: use memcpy_to_bvec in copy_to_high_bio_irq

2021-06-08 Thread Chaitanya Kulkarni
On 6/8/21 09:08, Christoph Hellwig wrote:
> Use memcpy_to_bvec instead of opencoding the logic.
>
> Signed-off-by: Christoph Hellwig 

Looks good.

Reviewed-by: Chaitanya Kulkarni 







Re: [dm-devel] [PATCH 05/16] bvec: add memcpy_{from, to}_bvec and memzero_bvec helper

2021-06-08 Thread Chaitanya Kulkarni
On 6/8/21 09:07, Christoph Hellwig wrote:
> Add helpers to perform common memory operation on a bvec.
>
> Signed-off-by: Christoph Hellwig 

Looks good.

Reviewed-by: Chaitanya Kulkarni 








Re: [dm-devel] [PATCH 06/16] block: use memzero_page in zero_fill_bio

2021-06-08 Thread Chaitanya Kulkarni
On 6/8/21 09:07, Christoph Hellwig wrote:
> Use memzero_bvec to zero each segment in the bio instead of manually
> mapping and zeroing the data.
>
> Signed-off-by: Christoph Hellwig 

Looks good.

Reviewed-by: Chaitanya Kulkarni 








Re: [dm-devel] [PATCH 04/16] bvec: add a bvec_kmap_local helper

2021-06-08 Thread Chaitanya Kulkarni
On 6/8/21 09:06, Christoph Hellwig wrote:
> Add a helper to call kmap_local_page on a bvec.  There is no need for
> an unmap helper given that kunmap_local accept any address in the mapped
> page.
>
> Signed-off-by: Christoph Hellwig 

Looks good.

Reviewed-by: Chaitanya Kulkarni 








Re: [dm-devel] [PATCH 03/16] bvec: fix the include guards for bvec.h

2021-06-08 Thread Chaitanya Kulkarni
On 6/8/21 09:06, Christoph Hellwig wrote:
> Fix the include guards to match the file naming.
>
> Signed-off-by: Christoph Hellwig 

Looks good.

Reviewed-by: Chaitanya Kulkarni 







Re: [dm-devel] [PATCH 01/16] mm: use kmap_local_page in memzero_page

2021-06-08 Thread Chaitanya Kulkarni
On 6/8/21 09:06, Christoph Hellwig wrote:
> No need for kmap_atomic here.
>
> Signed-off-by: Christoph Hellwig 

Looks good.

Reviewed-by: Chaitanya Kulkarni 









Re: [dm-devel] [PATCH 20/30] nullb: use blk_mq_alloc_disk

2021-06-02 Thread Chaitanya Kulkarni
On 6/1/21 23:56, Christoph Hellwig wrote:
> Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and
> request_queue allocation.
>
> Signed-off-by: Christoph Hellwig 
> ---
>  drivers/block/null_blk/main.c | 11 +--
>  1 file changed, 5 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c
> index d8e098f1e5b5..74fb2ec63219 100644
> --- a/drivers/block/null_blk/main.c
> +++ b/drivers/block/null_blk/main.c
> @@ -1851,13 +1851,12 @@ static int null_add_dev(struct nullb_device *dev)
>  
>   rv = -ENOMEM;

Is the above initialization needed?

>   nullb->tag_set->timeout = 5 * HZ;
> - nullb->q = blk_mq_init_queue_data(nullb->tag_set, nullb);
> - if (IS_ERR(nullb->q))
> - goto out_cleanup_tags;
> - nullb->disk = alloc_disk_node(1, nullb->dev->home_node);
> - if (!nullb->disk)
> + nullb->disk = blk_mq_alloc_disk(nullb->tag_set, nullb);
> + if (IS_ERR(nullb->disk)) {
> + rv = PTR_ERR(nullb->disk);
>   goto out_cleanup_disk;
> - nullb->disk->queue = nullb->q;
> + }
> + nullb->q = nullb->disk->queue;
>   } else if (dev->queue_mode == NULL_Q_BIO) {
>   rv = -ENOMEM;
>   nullb->disk = blk_alloc_disk(nullb->dev->home_node);







Re: [dm-devel] [PATCH 18/30] loop: use blk_mq_alloc_disk and blk_cleanup_disk

2021-06-02 Thread Chaitanya Kulkarni
On 6/1/21 23:56, Christoph Hellwig wrote:
> Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and
> request_queue allocation.
>
> Signed-off-by: Christoph Hellwig 

Looks good.

Reviewed-by: Chaitanya Kulkarni 









Re: [dm-devel] [PATCH 03/30] blk-mq: add the blk_mq_alloc_disk APIs

2021-06-02 Thread Chaitanya Kulkarni
On 6/1/21 23:54, Christoph Hellwig wrote:
> Add a new API to allocate a gendisk including the request_queue for use
> with blk-mq based drivers.  This is to avoid boilerplate code in drivers.
>
> Signed-off-by: Christoph Hellwig 

This would be a nice API to get rid of a couple of initialization
calls and the respective error handling in each blk-mq based driver.

Looks good.

Reviewed-by: Chaitanya Kulkarni 








Re: [dm-devel] [PATCH 15/30] blk-mq: remove blk_mq_init_sq_queue

2021-06-02 Thread Chaitanya Kulkarni
On 6/1/21 23:55, Christoph Hellwig wrote:
> All users are gone now.
>
> Signed-off-by: Christoph Hellwig 

Looks good.

Reviewed-by: Chaitanya Kulkarni 









Re: [dm-devel] [PATCH v4 01/11] block: improve handling of all zones reset operation

2021-05-25 Thread Chaitanya Kulkarni
On 5/24/21 7:25 PM, Damien Le Moal wrote:
> SCSI, ZNS and null_blk zoned devices support resetting all zones using
> a single command (REQ_OP_ZONE_RESET_ALL), as indicated using the device
> request queue flag QUEUE_FLAG_ZONE_RESETALL. This flag is not set for
> device mapper targets creating zoned devices. In this case, a user
> request for resetting all zones of a device is processed in
> blkdev_zone_mgmt() by issuing a REQ_OP_ZONE_RESET operation for each
> zone of the device. This leads to different behaviors of the
> BLKRESETZONE ioctl() depending on the target device support for the
> reset all operation. E.g.
>
> blkzone reset /dev/sdX
>
> will reset all zones of a SCSI device using a single command that will
> ignore conventional, read-only or offline zones.
>
> But a dm-linear device including conventional, read-only or offline
> zones cannot be reset in the same manner as some of the single zone
> reset operations issued by blkdev_zone_mgmt() will fail. E.g.:
>
> blkzone reset /dev/dm-Y
> blkzone: /dev/dm-0: BLKRESETZONE ioctl failed: Remote I/O error
>
> To simplify applications and tools development, unify the behavior of
> the all-zone reset operation by modifying blkdev_zone_mgmt() to not
> issue a zone reset operation for conventional, read-only and offline
> zones, thus mimicking what an actual reset-all device command does on a
> device supporting REQ_OP_ZONE_RESET_ALL. This emulation is done using
> the new function blkdev_zone_reset_all_emulated(). The zones needing a
> reset are identified using a bitmap that is initialized using a zone
> report. Since empty zones do not need a reset, also ignore these zones.
> The function blkdev_zone_reset_all() is introduced for block devices
> natively supporting reset all operations. blkdev_zone_mgmt() is modified
> to call either function to execute an all zone reset request.
>
> Signed-off-by: Damien Le Moal 
> [hch: split into multiple functions]
> Signed-off-by: Christoph Hellwig 

Apart from nit mentioned earlier, looks good.

Reviewed-by: Chaitanya Kulkarni 








Re: [dm-devel] [PATCH v4 01/11] block: improve handling of all zones reset operation

2021-05-25 Thread Chaitanya Kulkarni
On 5/24/21 7:25 PM, Damien Le Moal wrote:
> + sector_t sector =  0;

nit:- I think there is an extra space here after the '='.








Re: [dm-devel] [PATCH v3 03/11] block: introduce BIO_ZONE_WRITE_LOCKED bio flag

2021-05-23 Thread Chaitanya Kulkarni
On 5/20/21 20:01, Damien Le Moal wrote:
> Introduce the BIO flag BIO_ZONE_WRITE_LOCKED to indicate that a BIO owns
> the write lock of the zone it is targeting. This is the counterpart of
> the struct request flag RQF_ZONE_WRITE_LOCKED.
>
> This new BIO flag is reserved for now for zone write locking control
> for device mapper targets exposing a zoned block device. Since in this
> case, the lock flag must not be propagated to the struct request that
> will be used to process the BIO, a BIO private flag is used rather than
> changing the RQF_ZONE_WRITE_LOCKED request flag into a common REQ_XXX
> flag that could be used for both BIO and request. This avoids conflicts
> down the stack with the block IO scheduler zone write locking
> (in mq-deadline).
>
> Signed-off-by: Damien Le Moal 
> Reviewed-by: Christoph Hellwig 
> Reviewed-by: Hannes Reinecke 
> ---

Looks good.

Reviewed-by: Chaitanya Kulkarni 








Re: [dm-devel] [PATCH v3 02/11] block: introduce bio zone helpers

2021-05-23 Thread Chaitanya Kulkarni
On 5/20/21 20:01, Damien Le Moal wrote:
> Introduce the helper functions bio_zone_no() and bio_zone_is_seq().
> Both are the BIO counterparts of the request helpers blk_rq_zone_no()
> and blk_rq_zone_is_seq(), respectively returning the number of the
> target zone of a bio and true if the BIO target zone is sequential.
>
> Signed-off-by: Damien Le Moal 
> Reviewed-by: Hannes Reinecke 


Looks good.

Reviewed-by: Chaitanya Kulkarni 









Re: [dm-devel] [PATCH v3 01/11] block: improve handling of all zones reset operation

2021-05-23 Thread Chaitanya Kulkarni
On 5/20/21 20:01, Damien Le Moal wrote:
> +static int blkdev_zone_reset_all_emulated(struct block_device *bdev,
> +   gfp_t gfp_mask)
> +{
> + struct request_queue *q = bdev_get_queue(bdev);
> + sector_t capacity = get_capacity(bdev->bd_disk);
> + sector_t zone_sectors = blk_queue_zone_sectors(q);
> + unsigned long *need_reset;
> + struct bio *bio = NULL;
> + sector_t sector;
> + int ret;
> +
> + need_reset = blk_alloc_zone_bitmap(q->node, q->nr_zones);
> + if (!need_reset)
> + return -ENOMEM;
> +
> + ret = bdev->bd_disk->fops->report_zones(bdev->bd_disk, 0,
> + q->nr_zones, blk_zone_need_reset_cb,
> + need_reset);
> + if (ret < 0)
> + goto out_free_need_reset;
> +
> + ret = 0;
> + while (sector < capacity) {

The sector variable is never initialized before the loop, so the above
comparison reads a garbage value. Consider initializing it at the time
of declaration.

> + if (!test_bit(blk_queue_zone_no(q, sector), need_reset)) {
> + sector += zone_sectors;
> + continue;
> + }
> + bio = blk_next_bio(bio, 0, gfp_mask);
> + bio_set_dev(bio, bdev);
> + bio->bi_opf = REQ_OP_ZONE_RESET | REQ_SYNC;
> + bio->bi_iter.bi_sector = sector;
> + sector += zone_sectors;
> +
> + /* This may take a while, so be nice to others */
> + cond_resched();
> + }
> +
> + if (bio) {
> + ret = submit_bio_wait(bio);
> + bio_put(bio);
> + }
> +
> +out_free_need_reset:
> + kfree(need_reset);
> + return ret;
> +}







Re: [dm-devel] [PATCH 02/11] block: introduce bio zone helpers

2021-05-19 Thread Chaitanya Kulkarni


Sent from my iPhone

> On May 19, 2021, at 12:17 AM, Johannes Thumshirn  
> wrote:
> 
> On 19/05/2021 04:56, Damien Le Moal wrote:
>> +static inline unsigned int bio_zone_no(struct request_queue *q,
>> +   struct bio *bio)
>> +{
>> +return blk_queue_zone_no(q, bio->bi_iter.bi_sector);
>> +}
>> +
>> +static inline unsigned int bio_zone_is_seq(struct request_queue *q,
>> +   struct bio *bio)
>> +{
>> +return blk_queue_zone_is_seq(q, bio->bi_iter.bi_sector);
>> +}
>> +
> 
> Can't we derive the queue from the bio via bio->bi_bdev->bd_disk->queue
> or would this be too much pointer chasing for a small helper like this?

We have made similar changes elsewhere (e.g. the blktrace cleanup) to get 
rid of the separate q argument and derive it from the bio. Unless there is 
a strong reason (such as q not being available at this call site, which I 
have not looked into), we should avoid passing q as a separate argument.




[dm-devel] [LSF/MM/BFP ATTEND] [LSF/MM/BFP TOPIC] Storage: Copy Offload

2021-05-10 Thread Chaitanya Kulkarni
Hi,

* Background :-
---

Copy offload is a feature that allows file-systems or storage devices
to be instructed to copy files/logical blocks without requiring
involvement of the local CPU.

With reference to the RISC-V summit keynote [1], single-threaded
performance is limited by the end of Dennard scaling, and multi-threaded
performance is slowing down due to Moore's law limitations. With the rise
of the SNIA Computational Storage Technical Working Group (TWG) [2],
offloading computations to the device or over the fabrics is becoming
popular, as there are several solutions available [2]. One common
operation that is popular in the kernel but not merged yet is copy
offload over the fabrics or onto the device.

* Problem :-
---

The original work, done by Martin, is available here [3]. The latest
work, posted by Mikulas [4], is not merged yet. These two approaches are
totally different from each other. Several storage vendors discourage
mixing copy offload requests with regular READ/WRITE I/O. Also, the fact
that the operation fails if a copy request ever needs to be split as it
traverses the stack has the unfortunate side-effect of preventing copy
offload from working in pretty much every common deployment
configuration out there.

* Current state of the work :-
---

With [3] it is hard to handle arbitrary DM/MD stacking without
splitting the command in two, one for copying IN and one for copying
OUT; [4] demonstrates why this makes [3] an unsuitable candidate. Also,
with [4] there is an unresolved problem in the two-command approach:
how to handle changes to the DM layout between the IN and OUT
operations.

* Why Linux Kernel Storage System needs Copy Offload support now ?
---

With the rise of the SNIA Computational Storage TWG and its solutions [2],
the existing SCSI XCOPY support in the protocol, recent advancements in
Linux kernel file system support for zoned devices (zonefs [5]), and Peer
to Peer DMA support in the Linux kernel (mainly for NVMe devices) [7],
NVMe devices and subsystems (NVMe PCIe/NVMeOF) will eventually benefit
from the copy offload operation.

With this background, we have a significant number of use-cases that are
strong candidates waiting for the outstanding Linux kernel block layer
copy offload support, so that the Linux kernel storage subsystem can
address the previously mentioned problems [1] and allow efficient
offloading of data-related operations (such as move/copy etc.).

For reference, the following is the list of use-cases/candidates waiting
for Copy Offload support :-

1. SCSI-attached storage arrays.
2. Stacking drivers supporting XCopy DM/MD.
3. Computational Storage solutions.
4. File systems :- Local, NFS and Zonefs.
5. Block devices :- Distributed, local, and Zoned devices.
6. Peer to Peer DMA support solutions.
7. Potentially NVMe subsystem both NVMe PCIe and NVMeOF.

* What we will discuss in the proposed session ?
---

I'd like to propose a session to go over this topic to understand :-

1. What are the blockers for Copy Offload implementation ?
2. Discussion about having a file system interface.
3. Discussion about having right system call for user-space.
4. What is the right way to move this work forward ?
5. How can we help to contribute and move this work forward ?

* Required Participants :-
---

I'd like to invite file system, block layer, and device drivers
developers to:-

1. Share their opinion on the topic.
2. Share their experience and any other issues with [4].
3. Uncover additional details that are missing from this proposal.

Required attendees :-

Martin K. Petersen
Jens Axboe
Christoph Hellwig
Bart Van Assche
Zach Brown
Roland Dreier
Ric Wheeler
Trond Myklebust
Mike Snitzer
Keith Busch
Sagi Grimberg
Hannes Reinecke
Frederick Knight
Mikulas Patocka

Regards,
Chaitanya

[1]https://content.riscv.org/wp-content/uploads/2018/12/A-New-Golden-Age-for-Computer-Architecture-History-Challenges-and-Opportunities-David-Patterson-.pdf
[2] https://www.snia.org/computational
    https://www.napatech.com/support/resources/solution-descriptions/napatech-smartnic-solution-for-hardware-offload/
    https://www.eideticom.com/products.html
    https://www.xilinx.com/applications/data-center/computational-storage.html
[3] git://git.kernel.org/pub/scm/linux/kernel/git/mkp/linux.git xcopy
[4] https://www.spinics.net/lists/linux-block/msg00599.html
[5] https://lwn.net/Articles/793585/
[6] https://nvmexpress.org/new-nvmetm-specification-defines-zoned-
namespaces-zns-as-go-to-industry-technology/
[7] https://github.com/sbates130272/linux-p2pmem
[8] 

Re: [dm-devel] [PATCH v2 1/4] nvme: return BLK_STS_DO_NOT_RETRY if the DNR bit is set

2021-04-15 Thread Chaitanya Kulkarni
On 4/15/21 16:27, Mike Snitzer wrote:
> If the DNR bit is set we should not retry the command.
>
> We care about the retryable vs not retryable distinction at the block
> layer so propagate the equivalent of the DNR bit by introducing
> BLK_STS_DO_NOT_RETRY. Update blk_path_error() to _not_ retry if it
> is set.
>
> This change runs with the suggestion made here:
> https://lore.kernel.org/linux-nvme/20190813170144.ga10...@lst.de/
>
> Suggested-by: Christoph Hellwig 
> Signed-off-by: Mike Snitzer 

Looks good.

Reviewed-by: Chaitanya Kulkarni 








Re: [dm-devel] [RFC PATCH v5 0/4] add simple copy support

2021-04-13 Thread Chaitanya Kulkarni
On 4/13/21 11:26, Javier González wrote:
>>> I believe there is space for extensions to simple copy. But given the
>>> experience with XCOPY, I can imagine that changes will be incremental,
>>> based on very specific use cases.
>>>
>>> I think getting support upstream and bringing deployed cases is a very
>>> good start.
>> Copying data (files) within the controller/subsystem from ns_A to ns_B 
>> using NVMf will reduce network BW and memory BW in the host server.
>>
>> This feature is well known and the use case is well known.
> Definitely.
>

I have working code in nvmet for simple copy; I'm waiting for the host
interface for REQ_OP_COPY to be resolved so I can post it with this series.

Let me know if someone wants to collaborate offline on that.

IMHO we first need to sort out the host-side interface, which has been a
challenge for years; based on the history, it is not that easy to get
right.








Re: [dm-devel] [RFC PATCH v5 0/4] add simple copy support

2021-04-09 Thread Chaitanya Kulkarni
On 4/9/21 17:22, Max Gurtovoy wrote:
> On 2/19/2021 2:45 PM, SelvaKumar S wrote:
>> This patchset tries to add support for TP4065a ("Simple Copy Command"),
>> v2020.05.04 ("Ratified")
>>
>> The Specification can be found in following link.
>> https://nvmexpress.org/wp-content/uploads/NVM-Express-1.4-Ratified-TPs-1.zip
>>
>> Simple copy command is a copy offloading operation and is  used to copy
>> multiple contiguous ranges (source_ranges) of LBA's to a single destination
>> LBA within the device reducing traffic between host and device.
>>
>> This implementation doesn't add native copy offload support for stacked
>> devices rather copy offload is done through emulation. Possible use
>> cases are F2FS gc and BTRFS relocation/balance.
>>
>> *blkdev_issue_copy* takes source bdev, no of sources, array of source
>> ranges (in sectors), destination bdev and destination offset(in sectors).
>> If both source and destination block devices are same and copy_offload = 1,
>> then copy is done through native copy offloading. Copy emulation is used
>> in other cases.
>>
>> As SCSI XCOPY can take two different block devices and no of source range is
>> equal to 1, this interface can be extended in future to support SCSI XCOPY.
> Any idea why this TP wasn't designed for copy offload between 2 
> different namespaces in the same controller ?

Yes, since it was the first attempt, the goal was to keep it simple.

Further work is needed in an incremental TP so that we can also copy
between namespaces of the same controller (if we can't already) and to
namespaces that belong to a different controller.

> And a simple copy will be the case where the src_nsid == dst_nsid ?
>
> Also why there are multiple source ranges and only one dst range ? We 
> could add a bit to indicate if this range is src or dst..
>
>









Re: [dm-devel] [RFC PATCH 01/11] block: add helper of blk_queue_poll

2021-03-15 Thread Chaitanya Kulkarni
On 3/15/21 20:18, Ming Lei wrote:
> There has been 3 users, and will be more, so add one such helper.
>
> Signed-off-by: Ming Lei 

This looks good to me irrespective of RFC.

Reviewed-by: Chaitanya Kulkarni 







Re: [dm-devel] [RFC PATCH] blk-core: remove blk_put_request()

2021-02-24 Thread Chaitanya Kulkarni
On 2/24/21 10:56, Christoph Hellwig wrote:
> On Wed, Feb 24, 2021 at 09:48:21AM -0700, Jens Axboe wrote:
>> Would make sense to rename blk_get_request() to blk_mq_alloc_request()
>> and then we have API symmetry. The get/put don't make sense when there
>> are no references involved.
>>
>> But it's a lot of churn for very little reward, which is always kind
>> of annoying. Especially for the person that has to carry the patches.
> Let's do the following:
>
>  - move the initialize_rq_fn call from blk_get_request into
>blk_mq_alloc_request and make the former a trivial alias for the
>latter
>  - migrate to the blk_mq_* versions on a per-driver/subsystem basis.
>The scsi migration depends on the first item above, so it will have
>to go with that or wait for the next merge window
>  - don't migrate the legacy ide driver, as it is about to be removed and
>has a huge number of blk_get_request calls
>

Okay, thanks for the feedback, will try and get something together.






[dm-devel] [RFC PATCH] blk-core: remove blk_put_request()

2021-02-22 Thread Chaitanya Kulkarni
The function blk_put_request() is just a wrapper around
blk_mq_free_request(); remove the unnecessary wrapper.

Any feedback is welcome on this RFC.

Signed-off-by: Chaitanya Kulkarni 
---
 block/blk-core.c   |  6 --
 block/blk-merge.c  |  2 +-
 block/bsg-lib.c|  4 ++--
 block/bsg.c|  4 ++--
 block/scsi_ioctl.c |  6 +++---
 drivers/block/paride/pd.c  |  2 +-
 drivers/block/pktcdvd.c|  2 +-
 drivers/block/virtio_blk.c |  2 +-
 drivers/cdrom/cdrom.c  |  4 ++--
 drivers/ide/ide-atapi.c|  2 +-
 drivers/ide/ide-cd.c   |  4 ++--
 drivers/ide/ide-cd_ioctl.c |  2 +-
 drivers/ide/ide-devsets.c  |  2 +-
 drivers/ide/ide-disk.c |  2 +-
 drivers/ide/ide-ioctls.c   |  4 ++--
 drivers/ide/ide-park.c |  2 +-
 drivers/ide/ide-pm.c   |  4 ++--
 drivers/ide/ide-tape.c |  2 +-
 drivers/ide/ide-taskfile.c |  2 +-
 drivers/md/dm-mpath.c  |  2 +-
 drivers/mmc/core/block.c   | 10 +-
 drivers/scsi/scsi_error.c  |  2 +-
 drivers/scsi/scsi_lib.c|  2 +-
 drivers/scsi/sg.c  |  6 +++---
 drivers/scsi/st.c  |  4 ++--
 drivers/scsi/ufs/ufshcd.c  |  6 +++---
 drivers/target/target_core_pscsi.c |  4 ++--
 fs/nfsd/blocklayout.c  |  4 ++--
 include/linux/blkdev.h |  1 -
 29 files changed, 46 insertions(+), 53 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index fc60ff208497..1754f5e7cc80 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -642,12 +642,6 @@ struct request *blk_get_request(struct request_queue *q, 
unsigned int op,
 }
 EXPORT_SYMBOL(blk_get_request);
 
-void blk_put_request(struct request *req)
-{
-   blk_mq_free_request(req);
-}
-EXPORT_SYMBOL(blk_put_request);
-
 static void handle_bad_sector(struct bio *bio, sector_t maxsector)
 {
char b[BDEVNAME_SIZE];
diff --git a/block/blk-merge.c b/block/blk-merge.c
index ffb4aa0ea68b..f3493ee243fd 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -845,7 +845,7 @@ int blk_attempt_req_merge(struct request_queue *q, struct 
request *rq,
 
free = attempt_merge(q, rq, next);
if (free) {
-   blk_put_request(free);
+   blk_mq_free_request(free);
return 1;
}
 
diff --git a/block/bsg-lib.c b/block/bsg-lib.c
index 330fede77271..8ea2b33783fb 100644
--- a/block/bsg-lib.c
+++ b/block/bsg-lib.c
@@ -67,7 +67,7 @@ static int bsg_transport_fill_hdr(struct request *rq, struct 
sg_io_v4 *hdr,
 
 out_free_bidi_rq:
if (job->bidi_rq)
-   blk_put_request(job->bidi_rq);
+   blk_mq_free_request(job->bidi_rq);
 out:
kfree(job->request);
return ret;
@@ -128,7 +128,7 @@ static void bsg_transport_free_rq(struct request *rq)
 
if (job->bidi_rq) {
blk_rq_unmap_user(job->bidi_bio);
-   blk_put_request(job->bidi_rq);
+   blk_mq_free_request(job->bidi_rq);
}
 
kfree(job->request);
diff --git a/block/bsg.c b/block/bsg.c
index bd10922d5cbb..4ddc84aebecf 100644
--- a/block/bsg.c
+++ b/block/bsg.c
@@ -158,7 +158,7 @@ static int bsg_sg_io(struct request_queue *q, fmode_t mode, 
void __user *uarg)
 
ret = q->bsg_dev.ops->fill_hdr(rq, , mode);
if (ret) {
-   blk_put_request(rq);
+   blk_mq_free_request(rq);
return ret;
}
 
@@ -189,7 +189,7 @@ static int bsg_sg_io(struct request_queue *q, fmode_t mode, 
void __user *uarg)
 
 out_free_rq:
rq->q->bsg_dev.ops->free_rq(rq);
-   blk_put_request(rq);
+   blk_mq_free_request(rq);
if (!ret && copy_to_user(uarg, , sizeof(hdr)))
return -EFAULT;
return ret;
diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c
index 6599bac0a78c..52cd3fd924fc 100644
--- a/block/scsi_ioctl.c
+++ b/block/scsi_ioctl.c
@@ -366,7 +366,7 @@ static int sg_io(struct request_queue *q, struct gendisk 
*bd_disk,
 out_free_cdb:
scsi_req_free_cmd(req);
 out_put_request:
-   blk_put_request(rq);
+   blk_mq_free_request(rq);
return ret;
 }
 
@@ -509,7 +509,7 @@ int sg_scsi_ioctl(struct request_queue *q, struct gendisk 
*disk, fmode_t mode,
}

 error:
-   blk_put_request(rq);
+   blk_mq_free_request(rq);
 
 error_free_buffer:
kfree(buffer);
@@ -534,7 +534,7 @@ static int __blk_send_generic(struct request_queue *q, 
struct gendisk *bd_disk,
scsi_req(rq)->cmd_len = 6;
blk_execute_rq(bd_disk, rq, 0);
err = scsi_req(rq)->result ? -EIO : 0;
-   blk_put_request(rq);
+   blk_mq_free_request(rq);
 
return err;
 }
diff --git a/drivers/block/paride/pd.c b/drivers/block/paride/pd.c
index 897acda20ac8..381fc0cf0b35 10

[dm-devel] [RFC PATCH 22/34] fs/jfs/jfs_metapage.c: use bio_new in metapage_readpage

2021-01-28 Thread Chaitanya Kulkarni
Signed-off-by: Chaitanya Kulkarni 
---
 fs/jfs/jfs_metapage.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/fs/jfs/jfs_metapage.c b/fs/jfs/jfs_metapage.c
index 3fa09d9a0b94..c7be3a2773bf 100644
--- a/fs/jfs/jfs_metapage.c
+++ b/fs/jfs/jfs_metapage.c
@@ -495,13 +495,11 @@ static int metapage_readpage(struct file *fp, struct page 
*page)
if (bio)
submit_bio(bio);
 
-   bio = bio_alloc(GFP_NOFS, 1);
-   bio_set_dev(bio, inode->i_sb->s_bdev);
-   bio->bi_iter.bi_sector =
-   pblock << (inode->i_blkbits - 9);
+   bio = bio_new(inode->i_sb->s_bdev,
+   pblock << (inode->i_blkbits - 9),
+   REQ_OP_READ, 0, 1, GFP_NOFS);
bio->bi_end_io = metapage_read_end_io;
bio->bi_private = page;
-   bio_set_op_attrs(bio, REQ_OP_READ, 0);
len = xlen << inode->i_blkbits;
offset = block_offset << inode->i_blkbits;
if (bio_add_page(bio, page, len, offset) < len)
-- 
2.22.1




[dm-devel] [RFC PATCH 26/34] xfs: use bio_new in xfs_rw_bdev

2021-01-28 Thread Chaitanya Kulkarni
Signed-off-by: Chaitanya Kulkarni 
---
 fs/xfs/xfs_bio_io.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/fs/xfs/xfs_bio_io.c b/fs/xfs/xfs_bio_io.c
index e2148f2d5d6b..e4644f22ebe6 100644
--- a/fs/xfs/xfs_bio_io.c
+++ b/fs/xfs/xfs_bio_io.c
@@ -26,11 +26,8 @@ xfs_rw_bdev(
if (is_vmalloc && op == REQ_OP_WRITE)
flush_kernel_vmap_range(data, count);
 
-   bio = bio_alloc(GFP_KERNEL, bio_max_vecs(left));
-   bio_set_dev(bio, bdev);
-   bio->bi_iter.bi_sector = sector;
-   bio->bi_opf = op | REQ_META | REQ_SYNC;
-
+   bio = bio_new(bdev, sector, op, REQ_META | REQ_SYNC, bio_max_vecs(left),
+ GFP_KERNEL);
do {
struct page *page = kmem_to_page(data);
unsigned int off = offset_in_page(data);
-- 
2.22.1




[dm-devel] [RFC PATCH 17/34] iomap: use bio_new in iomap_dio_zero

2021-01-28 Thread Chaitanya Kulkarni
Signed-off-by: Chaitanya Kulkarni 
---
 fs/iomap/direct-io.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index ea1e8f696076..f6c557a1bd25 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -189,15 +189,13 @@ iomap_dio_zero(struct iomap_dio *dio, struct iomap 
*iomap, loff_t pos,
int flags = REQ_SYNC | REQ_IDLE;
struct bio *bio;
 
-   bio = bio_alloc(GFP_KERNEL, 1);
-   bio_set_dev(bio, iomap->bdev);
-   bio->bi_iter.bi_sector = iomap_sector(iomap, pos);
+   bio = bio_new(iomap->bdev, iomap_sector(iomap, pos), REQ_OP_WRITE,
+ flags, 1, GFP_KERNEL);
bio->bi_private = dio;
bio->bi_end_io = iomap_dio_bio_end_io;
 
get_page(page);
__bio_add_page(bio, page, len, 0);
-   bio_set_op_attrs(bio, REQ_OP_WRITE, flags);
iomap_dio_submit_bio(dio, iomap, bio, pos);
 }
 
-- 
2.22.1




[dm-devel] [RFC PATCH 13/34] block: use bio_new in __blkdev_direct_IO

2021-01-28 Thread Chaitanya Kulkarni
Signed-off-by: Chaitanya Kulkarni 
---
 fs/block_dev.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 9d4b1a884d76..f3e3247894d7 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -367,6 +367,8 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter 
*iter, int nr_pages)
return -EINVAL;
 
bio = bio_alloc_bioset(GFP_KERNEL, nr_pages, _dio_pool);
+   bio_set_dev(bio, bdev);
+   bio->bi_iter.bi_sector = pos >> 9;
 
dio = container_of(bio, struct blkdev_dio, bio);
dio->is_sync = is_sync = is_sync_kiocb(iocb);
@@ -389,8 +391,6 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter 
*iter, int nr_pages)
blk_start_plug();
 
for (;;) {
-   bio_set_dev(bio, bdev);
-   bio->bi_iter.bi_sector = pos >> 9;
bio->bi_write_hint = iocb->ki_hint;
bio->bi_private = dio;
bio->bi_end_io = blkdev_bio_end_io;
@@ -446,7 +446,7 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter 
*iter, int nr_pages)
}
 
submit_bio(bio);
-   bio = bio_alloc(GFP_KERNEL, nr_pages);
+   bio = bio_new(bdev, pos >> 9, 0, 0, nr_pages, GFP_KERNEL);
}
 
if (!is_poll)
-- 
2.22.1




[dm-devel] [RFC PATCH 09/34] dm-zoned: use bio_new in dmz_write_mblock

2021-01-28 Thread Chaitanya Kulkarni
Signed-off-by: Chaitanya Kulkarni 
---
 drivers/md/dm-zoned-metadata.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
index e6252f48a49c..fa0ee732c6e9 100644
--- a/drivers/md/dm-zoned-metadata.c
+++ b/drivers/md/dm-zoned-metadata.c
@@ -723,7 +723,8 @@ static int dmz_write_mblock(struct dmz_metadata *zmd, 
struct dmz_mblock *mblk,
if (dmz_bdev_is_dying(dev))
return -EIO;
 
-   bio = bio_alloc(GFP_NOIO, 1);
+   bio = bio_new(dev->bdev, dmz_blk2sect(block), REQ_OP_WRITE,
+ REQ_META | REQ_PRIO, 1, GFP_NOIO);
if (!bio) {
set_bit(DMZ_META_ERROR, >state);
return -ENOMEM;
@@ -731,11 +732,8 @@ static int dmz_write_mblock(struct dmz_metadata *zmd, 
struct dmz_mblock *mblk,
 
set_bit(DMZ_META_WRITING, >state);
 
-   bio->bi_iter.bi_sector = dmz_blk2sect(block);
-   bio_set_dev(bio, dev->bdev);
bio->bi_private = mblk;
bio->bi_end_io = dmz_mblock_bio_end_io;
-   bio_set_op_attrs(bio, REQ_OP_WRITE, REQ_META | REQ_PRIO);
bio_add_page(bio, mblk->page, DMZ_BLOCK_SIZE, 0);
submit_bio(bio);
 
-- 
2.22.1




[dm-devel] [RFC PATCH 33/34] mm: use bio_new in swap_readpage

2021-01-28 Thread Chaitanya Kulkarni
Signed-off-by: Chaitanya Kulkarni 
---
 mm/page_io.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/mm/page_io.c b/mm/page_io.c
index 25b321489703..7579485ccb5e 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -342,7 +342,7 @@ int __swap_writepage(struct page *page, struct 
writeback_control *wbc,
return 0;
}
 
-   bio = bio_alloc(sis->bdev, swap_page_sector(page), REQ_OP_WRITE,
+   bio = bio_new(sis->bdev, swap_page_sector(page), REQ_OP_WRITE,
REQ_SWAP | wbc_to_write_flags(wbc), 1, GFP_NOIO);
bio->bi_end_io = end_write_func;
bio_add_page(bio, page, thp_size(page), 0);
@@ -406,10 +406,8 @@ int swap_readpage(struct page *page, bool synchronous)
}
 
ret = 0;
-   bio = bio_alloc(GFP_KERNEL, 1);
-   bio_set_dev(bio, sis->bdev);
-   bio->bi_opf = REQ_OP_READ;
-   bio->bi_iter.bi_sector = swap_page_sector(page);
+   bio = bio_new(sis->bdev, swap_page_sector(page), REQ_OP_READ, 0, 1,
+   GFP_KERNEL);
bio->bi_end_io = end_swap_bio_read;
bio_add_page(bio, page, thp_size(page), 0);
 
-- 
2.22.1




[dm-devel] [RFC PATCH 20/34] fs/jfs/jfs_logmgr.c: use bio_new in lbmStartIO

2021-01-28 Thread Chaitanya Kulkarni
Signed-off-by: Chaitanya Kulkarni 
---
 fs/jfs/jfs_logmgr.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/fs/jfs/jfs_logmgr.c b/fs/jfs/jfs_logmgr.c
index 4481f3e33a3f..bb25737d52f6 100644
--- a/fs/jfs/jfs_logmgr.c
+++ b/fs/jfs/jfs_logmgr.c
@@ -2121,16 +2121,14 @@ static void lbmStartIO(struct lbuf * bp)
 
jfs_info("lbmStartIO");
 
-   bio = bio_alloc(GFP_NOFS, 1);
-   bio->bi_iter.bi_sector = bp->l_blkno << (log->l2bsize - 9);
-   bio_set_dev(bio, log->bdev);
+   bio = bio_new(log->bdev, bp->l_blkno << (log->l2bsize - 9),
+   REQ_OP_WRITE | REQ_SYNC, 0, 1, GFP_NOFS);
 
bio_add_page(bio, bp->l_page, LOGPSIZE, bp->l_offset);
BUG_ON(bio->bi_iter.bi_size != LOGPSIZE);
 
bio->bi_end_io = lbmIODone;
bio->bi_private = bp;
-   bio->bi_opf = REQ_OP_WRITE | REQ_SYNC;
 
/* check if journaling to disk has been disabled */
if (log->no_integrity) {
-- 
2.22.1




[dm-devel] [RFC PATCH 14/34] fs/buffer: use bio_new in submit_bh_wbc

2021-01-28 Thread Chaitanya Kulkarni
Signed-off-by: Chaitanya Kulkarni 
---
 fs/buffer.c | 16 +++-
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 32647d2011df..fcbea667fa04 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -3023,12 +3023,16 @@ static int submit_bh_wbc(int op, int op_flags, struct 
buffer_head *bh,
if (test_set_buffer_req(bh) && (op == REQ_OP_WRITE))
clear_buffer_write_io_error(bh);
 
-   bio = bio_alloc(GFP_NOIO, 1);
+   if (buffer_meta(bh))
+   op_flags |= REQ_META;
+   if (buffer_prio(bh))
+   op_flags |= REQ_PRIO;
+
+   bio = bio_new(bh->b_bdev,  bh->b_blocknr * (bh->b_size >> 9), op,
+ op_flags, GFP_NOIO, 1);
 
fscrypt_set_bio_crypt_ctx_bh(bio, bh, GFP_NOIO);
 
-   bio->bi_iter.bi_sector = bh->b_blocknr * (bh->b_size >> 9);
-   bio_set_dev(bio, bh->b_bdev);
bio->bi_write_hint = write_hint;
 
bio_add_page(bio, bh->b_page, bh->b_size, bh_offset(bh));
@@ -3037,12 +3041,6 @@ static int submit_bh_wbc(int op, int op_flags, struct 
buffer_head *bh,
bio->bi_end_io = end_bio_bh_io_sync;
bio->bi_private = bh;
 
-   if (buffer_meta(bh))
-   op_flags |= REQ_META;
-   if (buffer_prio(bh))
-   op_flags |= REQ_PRIO;
-   bio_set_op_attrs(bio, op, op_flags);
-
/* Take care of bh's that straddle the end of the device */
guard_bio_eod(bio);
 
-- 
2.22.1




[dm-devel] [RFC PATCH 10/34] dm-zoned: use bio_new in dmz_rdwr_block

2021-01-28 Thread Chaitanya Kulkarni
Signed-off-by: Chaitanya Kulkarni 
---
 drivers/md/dm-zoned-metadata.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
index fa0ee732c6e9..5b5ed5fce2ed 100644
--- a/drivers/md/dm-zoned-metadata.c
+++ b/drivers/md/dm-zoned-metadata.c
@@ -755,13 +755,11 @@ static int dmz_rdwr_block(struct dmz_dev *dev, int op,
if (dmz_bdev_is_dying(dev))
return -EIO;
 
-   bio = bio_alloc(GFP_NOIO, 1);
+   bio = bio_new(dev->bdev, dmz_blk2sect(block), op,
+ REQ_SYNC | REQ_META | REQ_PRIO, 1, GFP_NOIO);
if (!bio)
return -ENOMEM;
 
-   bio->bi_iter.bi_sector = dmz_blk2sect(block);
-   bio_set_dev(bio, dev->bdev);
-   bio_set_op_attrs(bio, op, REQ_SYNC | REQ_META | REQ_PRIO);
bio_add_page(bio, page, DMZ_BLOCK_SIZE, 0);
ret = submit_bio_wait(bio);
bio_put(bio);
-- 
2.22.1




[dm-devel] [RFC PATCH 03/34] drdb: use bio_new in drdb

2021-01-28 Thread Chaitanya Kulkarni
Signed-off-by: Chaitanya Kulkarni 
---
 drivers/block/drbd/drbd_receiver.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/block/drbd/drbd_receiver.c 
b/drivers/block/drbd/drbd_receiver.c
index 09c86ef3f0fd..e1cd3427b28b 100644
--- a/drivers/block/drbd/drbd_receiver.c
+++ b/drivers/block/drbd/drbd_receiver.c
@@ -1643,6 +1643,7 @@ int drbd_submit_peer_request(struct drbd_device *device,
struct bio *bio;
struct page *page = peer_req->pages;
sector_t sector = peer_req->i.sector;
+   struct block_device *bdev = device->ldev->backing_bdev;
unsigned data_size = peer_req->i.size;
unsigned n_bios = 0;
unsigned nr_pages = (data_size + PAGE_SIZE -1) >> PAGE_SHIFT;
@@ -1687,15 +1688,12 @@ int drbd_submit_peer_request(struct drbd_device *device,
 * generated bio, but a bio allocated on behalf of the peer.
 */
 next_bio:
-   bio = bio_alloc(GFP_NOIO, nr_pages);
+   bio = bio_new(bdev, sector, op, op_flags, GFP_NOIO, nr_pages);
if (!bio) {
drbd_err(device, "submit_ee: Allocation of a bio failed 
(nr_pages=%u)\n", nr_pages);
goto fail;
}
/* > peer_req->i.sector, unless this is the first bio */
-   bio->bi_iter.bi_sector = sector;
-   bio_set_dev(bio, device->ldev->backing_bdev);
-   bio_set_op_attrs(bio, op, op_flags);
bio->bi_private = peer_req;
bio->bi_end_io = drbd_peer_request_endio;
 
-- 
2.22.1




[dm-devel] [RFC PATCH 04/34] drdb: use bio_new() in submit_one_flush

2021-01-28 Thread Chaitanya Kulkarni
Signed-off-by: Chaitanya Kulkarni 
---
 drivers/block/drbd/drbd_receiver.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/block/drbd/drbd_receiver.c 
b/drivers/block/drbd/drbd_receiver.c
index e1cd3427b28b..b86bbf725cbd 100644
--- a/drivers/block/drbd/drbd_receiver.c
+++ b/drivers/block/drbd/drbd_receiver.c
@@ -1277,8 +1277,10 @@ static void one_flush_endio(struct bio *bio)
 
 static void submit_one_flush(struct drbd_device *device, struct 
issue_flush_context *ctx)
 {
-   struct bio *bio = bio_alloc(GFP_NOIO, 0);
+   struct block_device *bdev = device->ldev->backing_bdev;
+   struct bio *bio = bio_new(bdev, 0, REQ_OP_FLUSH, REQ_PREFLUSH, 0, 
GFP_NOIO);
struct one_flush_context *octx = kmalloc(sizeof(*octx), GFP_NOIO);
+
if (!bio || !octx) {
drbd_warn(device, "Could not allocate a bio, CANNOT ISSUE 
FLUSH\n");
/* FIXME: what else can I do now?  disconnecting or detaching
@@ -1296,10 +1298,8 @@ static void submit_one_flush(struct drbd_device *device, 
struct issue_flush_cont
 
octx->device = device;
octx->ctx = ctx;
-   bio_set_dev(bio, device->ldev->backing_bdev);
bio->bi_private = octx;
bio->bi_end_io = one_flush_endio;
-   bio->bi_opf = REQ_OP_FLUSH | REQ_PREFLUSH;
 
device->flush_jif = jiffies;
set_bit(FLUSH_PENDING, >flags);
-- 
2.22.1




[dm-devel] [RFC PATCH 01/34] block: move common code into blk_next_bio()

2021-01-28 Thread Chaitanya Kulkarni
blk_next_bio() is the central function which allocates the bios for
discard, write-same, write-zeroes and zone management. The initialization
of various bio members is duplicated in discard, write-same, and
write-zeroes. In this preparation patch we add bdev, sector, op, and opf
arguments to blk_next_bio() to reduce the duplication.

In the next patch we introduce bio_new(); this preparation patch allows
us to call it inside blk_next_bio().

Signed-off-by: Chaitanya Kulkarni 
---
 block/blk-lib.c   | 36 +++-
 block/blk-zoned.c |  4 +---
 block/blk.h   |  5 +++--
 3 files changed, 19 insertions(+), 26 deletions(-)

diff --git a/block/blk-lib.c b/block/blk-lib.c
index 752f9c722062..fb486a0bdb58 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -10,7 +10,9 @@
 
 #include "blk.h"
 
-struct bio *blk_next_bio(struct bio *bio, unsigned int nr_pages, gfp_t gfp)
+struct bio *blk_next_bio(struct bio *bio, struct block_device *bdev,
+   sector_t sect, unsigned op, unsigned opf,
+   unsigned int nr_pages, gfp_t gfp)
 {
struct bio *new = bio_alloc(gfp, nr_pages);
 
@@ -19,6 +21,10 @@ struct bio *blk_next_bio(struct bio *bio, unsigned int 
nr_pages, gfp_t gfp)
submit_bio(bio);
}
 
+   new->bi_iter.bi_sector = sect;
+   bio_set_dev(new, bdev);
+   bio_set_op_attrs(new, op, opf);
+
return new;
 }
 
@@ -94,11 +100,7 @@ int __blkdev_issue_discard(struct block_device *bdev, 
sector_t sector,
 
WARN_ON_ONCE((req_sects << 9) > UINT_MAX);
 
-   bio = blk_next_bio(bio, 0, gfp_mask);
-   bio->bi_iter.bi_sector = sector;
-   bio_set_dev(bio, bdev);
-   bio_set_op_attrs(bio, op, 0);
-
+   bio = blk_next_bio(bio, bdev, sector, op, 0, 0, gfp_mask);
bio->bi_iter.bi_size = req_sects << 9;
sector += req_sects;
nr_sects -= req_sects;
@@ -168,6 +170,7 @@ static int __blkdev_issue_write_same(struct block_device 
*bdev, sector_t sector,
 {
struct request_queue *q = bdev_get_queue(bdev);
unsigned int max_write_same_sectors;
+   unsigned int op = REQ_OP_WRITE_SAME;
struct bio *bio = *biop;
sector_t bs_mask;
 
@@ -188,14 +191,11 @@ static int __blkdev_issue_write_same(struct block_device 
*bdev, sector_t sector,
max_write_same_sectors = bio_allowed_max_sectors(q);
 
while (nr_sects) {
-   bio = blk_next_bio(bio, 1, gfp_mask);
-   bio->bi_iter.bi_sector = sector;
-   bio_set_dev(bio, bdev);
+   bio = blk_next_bio(bio, bdev, sector, op, 0, 1, gfp_mask);
bio->bi_vcnt = 1;
bio->bi_io_vec->bv_page = page;
bio->bi_io_vec->bv_offset = 0;
bio->bi_io_vec->bv_len = bdev_logical_block_size(bdev);
-   bio_set_op_attrs(bio, REQ_OP_WRITE_SAME, 0);
 
if (nr_sects > max_write_same_sectors) {
bio->bi_iter.bi_size = max_write_same_sectors << 9;
@@ -249,7 +249,9 @@ static int __blkdev_issue_write_zeroes(struct block_device 
*bdev,
 {
struct bio *bio = *biop;
unsigned int max_write_zeroes_sectors;
+   unsigned int op = REQ_OP_WRITE_ZEROES;
struct request_queue *q = bdev_get_queue(bdev);
+   unsigned int opf = flags & BLKDEV_ZERO_NOUNMAP ? REQ_NOUNMAP : 0;
 
if (!q)
return -ENXIO;
@@ -264,13 +266,7 @@ static int __blkdev_issue_write_zeroes(struct block_device 
*bdev,
return -EOPNOTSUPP;
 
while (nr_sects) {
-   bio = blk_next_bio(bio, 0, gfp_mask);
-   bio->bi_iter.bi_sector = sector;
-   bio_set_dev(bio, bdev);
-   bio->bi_opf = REQ_OP_WRITE_ZEROES;
-   if (flags & BLKDEV_ZERO_NOUNMAP)
-   bio->bi_opf |= REQ_NOUNMAP;
-
+   bio = blk_next_bio(bio, bdev, sector, op, opf, 0, gfp_mask);
if (nr_sects > max_write_zeroes_sectors) {
bio->bi_iter.bi_size = max_write_zeroes_sectors << 9;
nr_sects -= max_write_zeroes_sectors;
@@ -303,6 +299,7 @@ static int __blkdev_issue_zero_pages(struct block_device 
*bdev,
sector_t sector, sector_t nr_sects, gfp_t gfp_mask,
struct bio **biop)
 {
+   unsigned int nr_pages = __blkdev_sectors_to_bio_pages(nr_sects);
struct request_queue *q = bdev_get_queue(bdev);
struct bio *bio = *biop;
int bi_size = 0;
@@ -315,11 +312,8 @@ static int __blkdev_issue_zero_pages(struct block_device 
*bdev,
return -EPERM;
 
while (nr_sects != 0) {
-   bio = blk_next_bio(bio, __blkdev_sectors_to_bio_pages(nr_sects),
+   bio = blk_next_bio(bio, bdev, s
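
The consolidation this patch performs — common bio fields set once inside
the chaining helper instead of at every call site — can be modeled in a
few lines of userspace C. This is a rough sketch with toy names, not the
kernel API; chaining and submission are reduced to a flag.

```c
#include <stdint.h>
#include <stdlib.h>

/* Toy stand-in for struct bio: just the fields the helper initializes. */
struct toy_bio {
    uint64_t sector;
    unsigned int op;
    unsigned int opf;
    int submitted;
};

/*
 * Toy stand-in for blk_next_bio(): if a previous bio exists, "chain and
 * submit" it, then allocate the next one with the common fields already
 * filled in, so callers no longer duplicate that initialization.
 */
static struct toy_bio *toy_next_bio(struct toy_bio *prev, uint64_t sector,
                                    unsigned int op, unsigned int opf)
{
    struct toy_bio *new = calloc(1, sizeof(*new));

    if (!new)
        return NULL;
    if (prev)
        prev->submitted = 1;    /* stands in for bio_chain() + submit_bio() */

    new->sector = sector;
    new->op = op;
    new->opf = opf;
    return new;
}
```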

[dm-devel] [RFC PATCH 12/34] scsi: target/iblock: use bio_new

2021-01-28 Thread Chaitanya Kulkarni
Signed-off-by: Chaitanya Kulkarni 
---
 drivers/target/target_core_iblock.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/target/target_core_iblock.c 
b/drivers/target/target_core_iblock.c
index 8ed93fd205c7..f1264918aee1 100644
--- a/drivers/target/target_core_iblock.c
+++ b/drivers/target/target_core_iblock.c
@@ -379,10 +379,9 @@ iblock_execute_sync_cache(struct se_cmd *cmd)
if (immed)
target_complete_cmd(cmd, SAM_STAT_GOOD);
 
-   bio = bio_alloc(GFP_KERNEL, 0);
+   bio = bio_new(ib_dev->ibd_bd, 0, REQ_OP_WRITE, REQ_PREFLUSH, 0,
+ GFP_KERNEL);
bio->bi_end_io = iblock_end_io_flush;
-   bio_set_dev(bio, ib_dev->ibd_bd);
-   bio->bi_opf = REQ_OP_WRITE | REQ_PREFLUSH;
if (!immed)
bio->bi_private = cmd;
submit_bio(bio);
-- 
2.22.1




[dm-devel] [RFC PATCH 18/34] iomap: use bio_new in iomap_dio_bio_actor

2021-01-28 Thread Chaitanya Kulkarni
Signed-off-by: Chaitanya Kulkarni 
---
 fs/iomap/direct-io.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index f6c557a1bd25..0737192f7e5c 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -267,9 +267,8 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t 
length,
goto out;
}
 
-   bio = bio_alloc(GFP_KERNEL, nr_pages);
-   bio_set_dev(bio, iomap->bdev);
-   bio->bi_iter.bi_sector = iomap_sector(iomap, pos);
+   bio = bio_new(iomap->bdev, iomap_sector(iomap, pos), 0, 0,
+ nr_pages, GFP_KERNEL);
bio->bi_write_hint = dio->iocb->ki_hint;
bio->bi_ioprio = dio->iocb->ki_ioprio;
bio->bi_private = dio;
-- 
2.22.1




[dm-devel] [RFC PATCH 29/34] power/swap: use bio_new in hib_submit_io

2021-01-28 Thread Chaitanya Kulkarni
Signed-off-by: Chaitanya Kulkarni 
---
 kernel/power/swap.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/kernel/power/swap.c b/kernel/power/swap.c
index c73f2e295167..e92e36c053a6 100644
--- a/kernel/power/swap.c
+++ b/kernel/power/swap.c
@@ -271,13 +271,12 @@ static int hib_submit_io(int op, int op_flags, pgoff_t 
page_off, void *addr,
struct hib_bio_batch *hb)
 {
struct page *page = virt_to_page(addr);
+   sector_t sect = page_off * (PAGE_SIZE >> 9);
struct bio *bio;
int error = 0;
 
-   bio = bio_alloc(GFP_NOIO | __GFP_HIGH, 1);
-   bio->bi_iter.bi_sector = page_off * (PAGE_SIZE >> 9);
-   bio_set_dev(bio, hib_resume_bdev);
-   bio_set_op_attrs(bio, op, op_flags);
+   bio = bio_new(hib_resume_bdev, sect, op, op_flags, 1,
+ GFP_NOIO | __GFP_HIGH);
 
if (bio_add_page(bio, page, PAGE_SIZE, 0) < PAGE_SIZE) {
pr_err("Adding page to bio failed at %llu\n",
-- 
2.22.1




Re: [dm-devel] [RFC PATCH 02/34] block: introduce and use bio_new

2021-01-28 Thread Chaitanya Kulkarni
On 1/27/21 11:21 PM, Damien Le Moal wrote:

On 2021/01/28 16:12, Chaitanya Kulkarni wrote:


Introduce a bio_new() helper and use it in blk-lib.c to allocate and
initialize various non-optional or semi-optional members of the bio
along with the bio allocation done with bio_alloc(). Here we also clamp
max_bvecs for the bio with BIO_MAX_PAGES before passing it to bio_alloc().

Signed-off-by: Chaitanya Kulkarni 
<mailto:chaitanya.kulka...@wdc.com>
---
 block/blk-lib.c |  6 +-
 include/linux/bio.h | 25 +
 2 files changed, 26 insertions(+), 5 deletions(-)

diff --git a/block/blk-lib.c b/block/blk-lib.c
index fb486a0bdb58..ec29415f00dd 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -14,17 +14,13 @@ struct bio *blk_next_bio(struct bio *bio, struct 
block_device *bdev,
sector_t sect, unsigned op, unsigned opf,
unsigned int nr_pages, gfp_t gfp)
 {
-   struct bio *new = bio_alloc(gfp, nr_pages);
+   struct bio *new = bio_new(bdev, sect, op, opf, gfp, nr_pages);

if (bio) {
bio_chain(bio, new);
submit_bio(bio);
}

-   new->bi_iter.bi_sector = sect;
-   bio_set_dev(new, bdev);
-   bio_set_op_attrs(new, op, opf);
-
return new;
 }

diff --git a/include/linux/bio.h b/include/linux/bio.h
index c74857cf1252..2a09ba100546 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -826,5 +826,30 @@ static inline void bio_set_polled(struct bio *bio, struct 
kiocb *kiocb)
if (!is_sync_kiocb(kiocb))
bio->bi_opf |= REQ_NOWAIT;
 }
+/**
+ * bio_new -   allocate and initialize a new bio
+ * @bdev:  block device the bio will be issued to
+ * @sector:start sector
+ * @op:REQ_OP_XXX from enum req_opf
+ * @op_flags:  REQ_XXX from enum req_flag_bits
+ * @max_bvecs: maximum bvec to be allocated for this bio
+ * @gfp_mask:  memory allocation flags (for bio_alloc)
+ *
+ * Description:
+ *Allocates, initializes common members, and returns a new bio.
+ */
+static inline struct bio *bio_new(struct block_device *bdev, sector_t sector,
+ unsigned int op, unsigned int op_flags,
+ unsigned int max_bvecs, gfp_t gfp_mask)
+{
+   unsigned nr_bvec = clamp_t(unsigned int, max_bvecs, 0, BIO_MAX_PAGES);
+   struct bio *bio = bio_alloc(gfp_mask, nr_bvec);


I think that depending on the gfp_mask passed, bio can be NULL. So this should
be checked.


true, I'll add that check.




+
+   bio_set_dev(bio, bdev);
+   bio->bi_iter.bi_sector = sector;
+   bio_set_op_attrs(bio, op, op_flags);


This function is obsolete. Open code this.


true, will do.




+
+   return bio;
+}

 #endif /* __LINUX_BIO_H */



Thanks for the comments Damien.
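
The two fixes agreed on in this review — clamping the requested bvec
count and checking the allocation result — have a simple shape that can
be sketched in userspace. Toy names throughout; BIO_MAX_PAGES and a
fallible bio_alloc() are only mimicked here, not the real kernel code.

```c
#include <stdlib.h>

#define TOY_BIO_MAX_PAGES 256U      /* stand-in for BIO_MAX_PAGES */

struct toy_bio_alloc { unsigned int nr_bvecs; };

/*
 * Sketch of the reviewed helper: clamp the caller's max_bvecs to the
 * upper bound, and propagate allocation failure to the caller instead
 * of dereferencing a NULL result (the gfp_mask may allow failure).
 */
static struct toy_bio_alloc *toy_bio_new(unsigned int max_bvecs)
{
    unsigned int nr = max_bvecs > TOY_BIO_MAX_PAGES ?
                      TOY_BIO_MAX_PAGES : max_bvecs;
    struct toy_bio_alloc *bio = malloc(sizeof(*bio));

    if (!bio)
        return NULL;            /* caller must check, as the review notes */
    bio->nr_bvecs = nr;
    return bio;
}
```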

[dm-devel] [RFC PATCH 31/34] iomap: use bio_new in iomap_readpage_actor

2021-01-28 Thread Chaitanya Kulkarni
Signed-off-by: Chaitanya Kulkarni 
---
 fs/iomap/buffered-io.c | 12 +---
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 16a1e82e3aeb..08d119b62cf5 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -241,6 +241,9 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, 
loff_t length, void *data,
struct page *page = ctx->cur_page;
struct iomap_page *iop = iomap_page_create(inode, page);
bool same_page = false, is_contig = false;
+   struct block_device *bdev = iomap->bdev;
+   unsigned opf = ctx->rac ? REQ_RAHEAD : 0;
+   unsigned op = REQ_OP_READ;
loff_t orig_pos = pos;
unsigned poff, plen;
sector_t sector;
@@ -285,19 +288,14 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, 
loff_t length, void *data,
 
if (ctx->rac) /* same as readahead_gfp_mask */
gfp |= __GFP_NORETRY | __GFP_NOWARN;
-   ctx->bio = bio_alloc(gfp, min(BIO_MAX_PAGES, nr_vecs));
+   ctx->bio = bio_new(bdev, sector, op, opf, gfp, nr_vecs);
/*
 * If the bio_alloc fails, try it again for a single page to
 * avoid having to deal with partial page reads.  This emulates
 * what do_mpage_readpage does.
 */
if (!ctx->bio)
-   ctx->bio = bio_alloc(orig_gfp, 1);
-   ctx->bio->bi_opf = REQ_OP_READ;
-   if (ctx->rac)
-   ctx->bio->bi_opf |= REQ_RAHEAD;
-   ctx->bio->bi_iter.bi_sector = sector;
-   bio_set_dev(ctx->bio, iomap->bdev);
+   ctx->bio = bio_new(bdev, sector, op, opf, orig_gfp, 1);
ctx->bio->bi_end_io = iomap_read_end_io;
}
 
-- 
2.22.1




[dm-devel] [RFC PATCH 30/34] hfsplus: use bio_new in hfsplus_submit_bio()

2021-01-28 Thread Chaitanya Kulkarni
Signed-off-by: Chaitanya Kulkarni 
---
 fs/hfsplus/wrapper.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/fs/hfsplus/wrapper.c b/fs/hfsplus/wrapper.c
index 0350dc7821bf..8341ee6c9b31 100644
--- a/fs/hfsplus/wrapper.c
+++ b/fs/hfsplus/wrapper.c
@@ -64,10 +64,7 @@ int hfsplus_submit_bio(struct super_block *sb, sector_t 
sector,
offset = start & (io_size - 1);
sector &= ~((io_size >> HFSPLUS_SECTOR_SHIFT) - 1);
 
-   bio = bio_alloc(GFP_NOIO, 1);
-   bio->bi_iter.bi_sector = sector;
-   bio_set_dev(bio, sb->s_bdev);
-   bio_set_op_attrs(bio, op, op_flags);
+   bio = bio_new(sb->s_bdev, sector, op, op_flags, 1, GFP_NOIO);
 
if (op != WRITE && data)
*data = (u8 *)buf + offset;
-- 
2.22.1




[dm-devel] [RFC PATCH 28/34] zonefs: use bio_new

2021-01-28 Thread Chaitanya Kulkarni
Signed-off-by: Chaitanya Kulkarni 
---
 fs/zonefs/super.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c
index ab68e27bb322..620d67965a22 100644
--- a/fs/zonefs/super.c
+++ b/fs/zonefs/super.c
@@ -661,6 +661,7 @@ static const struct iomap_dio_ops zonefs_write_dio_ops = {
 
 static ssize_t zonefs_file_dio_append(struct kiocb *iocb, struct iov_iter 
*from)
 {
+   unsigned int op = REQ_OP_ZONE_APPEND | REQ_SYNC | REQ_IDLE;
struct inode *inode = file_inode(iocb->ki_filp);
struct zonefs_inode_info *zi = ZONEFS_I(inode);
struct block_device *bdev = inode->i_sb->s_bdev;
@@ -678,15 +679,12 @@ static ssize_t zonefs_file_dio_append(struct kiocb *iocb, 
struct iov_iter *from)
if (!nr_pages)
return 0;
 
-   bio = bio_alloc(GFP_NOFS, nr_pages);
+   bio = bio_new(bdev, zi->i_zsector, op, 0, GFP_NOFS, nr_pages);
if (!bio)
return -ENOMEM;
 
-   bio_set_dev(bio, bdev);
-   bio->bi_iter.bi_sector = zi->i_zsector;
bio->bi_write_hint = iocb->ki_hint;
bio->bi_ioprio = iocb->ki_ioprio;
-   bio->bi_opf = REQ_OP_ZONE_APPEND | REQ_SYNC | REQ_IDLE;
if (iocb->ki_flags & IOCB_DSYNC)
bio->bi_opf |= REQ_FUA;
 
-- 
2.22.1




[dm-devel] [RFC PATCH 27/34] xfs: use bio_new in xfs_buf_ioapply_map

2021-01-28 Thread Chaitanya Kulkarni
Signed-off-by: Chaitanya Kulkarni 
---
 fs/xfs/xfs_buf.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index f8400bbd6473..3ff6235e4f94 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -1507,12 +1507,10 @@ xfs_buf_ioapply_map(
 	atomic_inc(&bp->b_io_remaining);
nr_pages = min(total_nr_pages, BIO_MAX_PAGES);
 
-   bio = bio_alloc(GFP_NOIO, nr_pages);
-   bio_set_dev(bio, bp->b_target->bt_bdev);
-   bio->bi_iter.bi_sector = sector;
+   bio = bio_new(bp->b_target->bt_bdev, sector, op, 0, nr_pages,
+ GFP_NOIO);
bio->bi_end_io = xfs_buf_bio_end_io;
bio->bi_private = bp;
-   bio->bi_opf = op;
 
for (; size && nr_pages; nr_pages--, page_index++) {
int rbytes, nbytes = PAGE_SIZE - offset;
-- 
2.22.1




[dm-devel] [RFC PATCH 23/34] fs/mpage.c: use bio_new in mpage_alloc

2021-01-28 Thread Chaitanya Kulkarni
Signed-off-by: Chaitanya Kulkarni 
---
 fs/mpage.c | 18 ++
 1 file changed, 6 insertions(+), 12 deletions(-)

diff --git a/fs/mpage.c b/fs/mpage.c
index 830e6cc2a9e7..01725126e81f 100644
--- a/fs/mpage.c
+++ b/fs/mpage.c
@@ -68,25 +68,21 @@ static struct bio *mpage_bio_submit(int op, int op_flags, struct bio *bio)
 }
 
 static struct bio *
-mpage_alloc(struct block_device *bdev,
-   sector_t first_sector, int nr_vecs,
-   gfp_t gfp_flags)
+mpage_alloc(struct block_device *bdev, sector_t first_sector, int nr_vecs,
+   gfp_t gfp_flags)
 {
struct bio *bio;
 
/* Restrict the given (page cache) mask for slab allocations */
gfp_flags &= GFP_KERNEL;
-   bio = bio_alloc(gfp_flags, nr_vecs);
+   bio = bio_new(bdev, first_sector, 0, 0, nr_vecs, gfp_flags);
 
if (bio == NULL && (current->flags & PF_MEMALLOC)) {
while (!bio && (nr_vecs /= 2))
-   bio = bio_alloc(gfp_flags, nr_vecs);
+   bio = bio_new(bdev, first_sector, 0, 0, nr_vecs,
+   gfp_flags);
}
 
-   if (bio) {
-   bio_set_dev(bio, bdev);
-   bio->bi_iter.bi_sector = first_sector;
-   }
return bio;
 }
 
@@ -304,9 +300,7 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args)
goto out;
}
args->bio = mpage_alloc(bdev, blocks[0] << (blkbits - 9),
-   min_t(int, args->nr_pages,
- BIO_MAX_PAGES),
-   gfp);
+   args->nr_pages, gfp);
if (args->bio == NULL)
goto confused;
}
-- 
2.22.1



