Re: [PATCH v15 7/9] nvmet-passthru: Add passthru code to process commands

2020-07-22 Thread Sagi Grimberg




Thanks for the review Christoph. I think I should be able to make all
the requested changes in the next week or two.

On 2020-07-20 1:35 p.m., Sagi Grimberg wrote:



I'm still not so happy about having to look up the namespace and still
wonder if we should generalize the connect_q to a passthrough_q.  But
I guess we can do that later and then reduce some of the exports here..


That is a neat idea! should be easy to do (and we can then lose the host
xarray stuff). I don't mind having it on a later patch, but it should be
easy enough to do even before...



I sort of follow this. I can try to work something up but it will
probably take me a few iterations to get it to where you want it. So,
roughly, we'd create a passthrough_q in core with the controller's IO
tagset and then cleanup the fabrics hosts to use that instead of each
independently creating their connect_q?

Though, I don't understand how this relates to the host xarray stuff
that Sagi mentioned...


passthru commands are in essence REQ_OP_DRV_IN/REQ_OP_DRV_OUT, which
means that the driver shouldn't need the ns at all. So if you have a
dedicated request queue (mapped to the I/O tagset), you don't need the
ns->queue and we can lose the ns lookup altogether.

The only part is to check the effects, but that can probably be handled
when we setup the passthru controller or something...


Yes, I implemented the passthru_q (which was quite simple).


Nice..


But I'm not
sure how we are supposed to call nvme_command_effects() correctly
without the ns. You can't possibly do that during setup for every
possible opcode on every namespace. And even if we do, we'll still need
the same nvme_find_get_ns() and nvme_put_ns() exports and probably
another xarray to look up the information.

Also, we pass the namespace's disk in order to get proper block
accounting for the underlying disk (which is pretty important for
debugging). So we need to look up the namespace for this too.

Unless there are some other ideas to solve these issues, I don't think
this change will gain us anything.


Let's defer it to a followup set then.


Re: [PATCH v15 7/9] nvmet-passthru: Add passthru code to process commands

2020-07-22 Thread Logan Gunthorpe



On 2020-07-20 4:35 p.m., Sagi Grimberg wrote:
> 
>> Thanks for the review Christoph. I think I should be able to make all
>> the requested changes in the next week or two.
>>
>> On 2020-07-20 1:35 p.m., Sagi Grimberg wrote:
>>>
 I'm still not so happy about having to look up the namespace and still
 wonder if we should generalize the connect_q to a passthrough_q.  But
 I guess we can do that later and then reduce some of the exports here..
>>>
>>> That is a neat idea! should be easy to do (and we can then lose the host
>>> xarray stuff). I don't mind having it on a later patch, but it should be
>>> easy enough to do even before...
>>>
>>
>> I sort of follow this. I can try to work something up but it will
>> probably take me a few iterations to get it to where you want it. So,
>> roughly, we'd create a passthrough_q in core with the controller's IO
>> tagset and then cleanup the fabrics hosts to use that instead of each
>> independently creating their connect_q?
>>
>> Though, I don't understand how this relates to the host xarray stuff
>> that Sagi mentioned...
> 
> passthru commands are in essence REQ_OP_DRV_IN/REQ_OP_DRV_OUT, which
> means that the driver shouldn't need the ns at all. So if you have a
> dedicated request queue (mapped to the I/O tagset), you don't need the
> ns->queue and we can lose the ns lookup altogether.
> 
> The only part is to check the effects, but that can probably be handled
> when we setup the passthru controller or something...

Yes, I implemented the passthru_q (which was quite simple). But I'm not
sure how we are supposed to call nvme_command_effects() correctly
without the ns. You can't possibly do that during setup for every
possible opcode on every namespace. And even if we do, we'll still need
the same nvme_find_get_ns() and nvme_put_ns() exports and probably
another xarray to look up the information.

Also, we pass the namespace's disk in order to get proper block
accounting for the underlying disk (which is pretty important for
debugging). So we need to look up the namespace for this too.
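
For reference, the per-command path roughly needs something like this
(a sketch only, using the helpers mentioned above):

	ns = nvme_find_get_ns(ctrl, le32_to_cpu(req->cmd->common.nsid));
	if (!ns)
		return NVME_SC_INVALID_NS | NVME_SC_DNR;
	/* check effects, then build the request on ns->queue and
	 * account against ns->disk */
	effects = nvme_command_effects(ctrl, ns, req->cmd->common.opcode);
	...
	nvme_put_ns(ns);

so a controller-level passthru_q on its own doesn't remove that lookup.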

Unless there are some other ideas to solve these issues, I don't think
this change will gain us anything.

Logan


Re: [PATCH v15 7/9] nvmet-passthru: Add passthru code to process commands

2020-07-20 Thread Sagi Grimberg




On Mon, Jul 20, 2020 at 05:01:19PM -0600, Logan Gunthorpe wrote:

On 2020-07-20 4:35 p.m., Sagi Grimberg wrote:


passthru commands are in essence REQ_OP_DRV_IN/REQ_OP_DRV_OUT, which
means that the driver shouldn't need the ns at all. So if you have a
dedicated request queue (mapped to the I/O tagset), you don't need the
ns->queue and we can lose the ns lookup altogether.


We still need a request_queue to dispatch the command. I guess you could
make a generic one for the controller that isn't tied to a namespace,
but we lose the fair shared tag allocation.


What do you mean fair shared tag allocation?


See hctx_may_queue().


Still not following your point... this queue is yet another request
queue on the I/O tagset, e.g.

ctrl->passthru_q = blk_mq_init_queue(ctrl->tagset);
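
and the target would then just dispatch on it instead of ns->queue,
roughly (untested sketch):

	rq = nvme_alloc_request(ctrl->passthru_q, req->cmd,
				BLK_MQ_REQ_NOWAIT, NVME_QID_ANY);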


Re: [PATCH v15 7/9] nvmet-passthru: Add passthru code to process commands

2020-07-20 Thread Keith Busch
On Mon, Jul 20, 2020 at 04:28:26PM -0700, Sagi Grimberg wrote:
> On 7/20/20 4:17 PM, Keith Busch wrote:
> > On Mon, Jul 20, 2020 at 05:01:19PM -0600, Logan Gunthorpe wrote:
> > > On 2020-07-20 4:35 p.m., Sagi Grimberg wrote:
> > > 
> > > > passthru commands are in essence REQ_OP_DRV_IN/REQ_OP_DRV_OUT, which
> > > > means that the driver shouldn't need the ns at all. So if you have a
> > > > dedicated request queue (mapped to the I/O tagset), you don't need the
> > > > ns->queue and we can lose the ns lookup altogether.
> > 
> > We still need a request_queue to dispatch the command. I guess you could
> > make a generic one for the controller that isn't tied to a namespace,
> > but we lose the fair shared tag allocation.
> 
> What do you mean fair shared tag allocation?

See hctx_may_queue().
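
(Paraphrasing from memory, not the exact code: with BLK_MQ_F_TAG_SHARED
set, each active request queue on a shared tagset is limited to roughly
its fair fraction of the tags, along the lines of

	depth = max(tags_depth / active_queues, 4);
	may_queue = hctx->nr_active < depth;

so a single controller-wide passthru queue counts as one user of the
shared tags rather than one per namespace.)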


Re: [PATCH v15 7/9] nvmet-passthru: Add passthru code to process commands

2020-07-20 Thread Sagi Grimberg




On 7/20/20 4:17 PM, Keith Busch wrote:

On Mon, Jul 20, 2020 at 05:01:19PM -0600, Logan Gunthorpe wrote:

On 2020-07-20 4:35 p.m., Sagi Grimberg wrote:


passthru commands are in essence REQ_OP_DRV_IN/REQ_OP_DRV_OUT, which
means that the driver shouldn't need the ns at all. So if you have a
dedicated request queue (mapped to the I/O tagset), you don't need the
ns->queue and we can lose the ns lookup altogether.


We still need a request_queue to dispatch the command. I guess you could
make a generic one for the controller that isn't tied to a namespace,
but we lose the fair shared tag allocation.


What do you mean fair shared tag allocation?


Re: [PATCH v15 7/9] nvmet-passthru: Add passthru code to process commands

2020-07-20 Thread Keith Busch
On Mon, Jul 20, 2020 at 05:01:19PM -0600, Logan Gunthorpe wrote:
> On 2020-07-20 4:35 p.m., Sagi Grimberg wrote:
>
> > passthru commands are in essence REQ_OP_DRV_IN/REQ_OP_DRV_OUT, which
> > means that the driver shouldn't need the ns at all. So if you have a
> > dedicated request queue (mapped to the I/O tagset), you don't need the
> > ns->queue and we can lose the ns lookup altogether.

We still need a request_queue to dispatch the command. I guess you could
make a generic one for the controller that isn't tied to a namespace,
but we lose the fair shared tag allocation.
 
> Thanks, that helps clarify things a bit, but which xarray were you
> talking about?

That was something that was replacing the list lookup:

  http://lists.infradead.org/pipermail/linux-nvme/2020-July/018242.html


Re: [PATCH v15 7/9] nvmet-passthru: Add passthru code to process commands

2020-07-20 Thread Sagi Grimberg




Thanks for the review Christoph. I think I should be able to make all
the requested changes in the next week or two.

On 2020-07-20 1:35 p.m., Sagi Grimberg wrote:



I'm still not so happy about having to look up the namespace and still
wonder if we should generalize the connect_q to a passthrough_q.  But
I guess we can do that later and then reduce some of the exports here..


That is a neat idea! should be easy to do (and we can then lose the host
xarray stuff). I don't mind having it on a later patch, but it should be
easy enough to do even before...



I sort of follow this. I can try to work something up but it will
probably take me a few iterations to get it to where you want it. So,
roughly, we'd create a passthrough_q in core with the controller's IO
tagset and then cleanup the fabrics hosts to use that instead of each
independently creating their connect_q?

Though, I don't understand how this relates to the host xarray stuff
that Sagi mentioned...


passthru commands are in essence REQ_OP_DRV_IN/REQ_OP_DRV_OUT, which
means that the driver shouldn't need the ns at all. So if you have a
dedicated request queue (mapped to the I/O tagset), you don't need the
ns->queue and we can lose the ns lookup altogether.


Thanks, that helps clarify things a bit, but which xarray were you
talking about?


The patch from Chaitanya

See "[PATCH V4 2/2] nvme-core: use xarray for ctrl ns tracking"


Re: [PATCH v15 7/9] nvmet-passthru: Add passthru code to process commands

2020-07-20 Thread Logan Gunthorpe



On 2020-07-20 4:35 p.m., Sagi Grimberg wrote:
> 
>> Thanks for the review Christoph. I think I should be able to make all
>> the requested changes in the next week or two.
>>
>> On 2020-07-20 1:35 p.m., Sagi Grimberg wrote:
>>>
 I'm still not so happy about having to look up the namespace and still
 wonder if we should generalize the connect_q to a passthrough_q.  But
 I guess we can do that later and then reduce some of the exports here..
>>>
>>> That is a neat idea! should be easy to do (and we can then lose the host
>>> xarray stuff). I don't mind having it on a later patch, but it should be
>>> easy enough to do even before...
>>>
>>
>> I sort of follow this. I can try to work something up but it will
>> probably take me a few iterations to get it to where you want it. So,
>> roughly, we'd create a passthrough_q in core with the controller's IO
>> tagset and then cleanup the fabrics hosts to use that instead of each
>> independently creating their connect_q?
>>
>> Though, I don't understand how this relates to the host xarray stuff
>> that Sagi mentioned...
> 
> passthru commands are in essence REQ_OP_DRV_IN/REQ_OP_DRV_OUT, which
> means that the driver shouldn't need the ns at all. So if you have a
> dedicated request queue (mapped to the I/O tagset), you don't need the
> ns->queue and we can lose the ns lookup altogether.

Thanks, that helps clarify things a bit, but which xarray were you
talking about?

Logan


Re: [PATCH v15 7/9] nvmet-passthru: Add passthru code to process commands

2020-07-20 Thread Sagi Grimberg




Thanks for the review Christoph. I think I should be able to make all
the requested changes in the next week or two.

On 2020-07-20 1:35 p.m., Sagi Grimberg wrote:



I'm still not so happy about having to look up the namespace and still
wonder if we should generalize the connect_q to a passthrough_q.  But
I guess we can do that later and then reduce some of the exports here..


That is a neat idea! should be easy to do (and we can then lose the host
xarray stuff). I don't mind having it on a later patch, but it should be
easy enough to do even before...



I sort of follow this. I can try to work something up but it will
probably take me a few iterations to get it to where you want it. So,
roughly, we'd create a passthrough_q in core with the controller's IO
tagset and then cleanup the fabrics hosts to use that instead of each
independently creating their connect_q?

Though, I don't understand how this relates to the host xarray stuff
that Sagi mentioned...


passthru commands are in essence REQ_OP_DRV_IN/REQ_OP_DRV_OUT, which
means that the driver shouldn't need the ns at all. So if you have a
dedicated request queue (mapped to the I/O tagset), you don't need the
ns->queue and we can lose the ns lookup altogether.

The only part is to check the effects, but that can probably be handled
when we setup the passthru controller or something...
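
For example (just a sketch, and assuming the per-namespace effects don't
differ), the controller-wide commands-supported-and-effects log that the
host already caches could be consulted without an ns:

	if (ctrl->effects)
		effects = le32_to_cpu(ctrl->effects->iocs[opcode]);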


Re: [PATCH v15 7/9] nvmet-passthru: Add passthru code to process commands

2020-07-20 Thread Logan Gunthorpe
Thanks for the review Christoph. I think I should be able to make all
the requested changes in the next week or two.

On 2020-07-20 1:35 p.m., Sagi Grimberg wrote:
> 
>> I'm still not so happy about having to look up the namespace and still
>> wonder if we should generalize the connect_q to a passthrough_q.  But
>> I guess we can do that later and then reduce some of the exports here..
> 
> That is a neat idea! should be easy to do (and we can then lose the host
> xarray stuff). I don't mind having it on a later patch, but it should be
> easy enough to do even before...
> 

I sort of follow this. I can try to work something up but it will
probably take me a few iterations to get it to where you want it. So,
roughly, we'd create a passthrough_q in core with the controller's IO
tagset and then cleanup the fabrics hosts to use that instead of each
independently creating their connect_q?

Though, I don't understand how this relates to the host xarray stuff
that Sagi mentioned...

Logan


Re: [PATCH v15 7/9] nvmet-passthru: Add passthru code to process commands

2020-07-20 Thread Chaitanya Kulkarni
On 7/20/20 14:41, Logan Gunthorpe wrote:
>> That is a neat idea! should be easy to do (and we can then lose the host
>> xarray stuff). I don't mind having it on a later patch, but it should be
>> easy enough to do even before...
>>
> I sort of follow this. I can try to work something up but it will
> probably take me a few iterations to get it to where you want it. So,
> roughly, we'd create a passthrough_q in core with the controller's IO
> tagset and then cleanup the fabrics hosts to use that instead of each
> independently creating their connect_q?
> 
> Though, I don't understand how this relates to the host xarray stuff
> that Sagi mentioned...

I didn't understand either how having a passthru q can get rid of the
struct nvme_ns search when the nsid is embedded in the nvme_cmd in
the I/O path?

Sagi, can you please explain?


Re: [PATCH v15 7/9] nvmet-passthru: Add passthru code to process commands

2020-07-20 Thread Sagi Grimberg




Add passthru command handling capability for the NVMeOF target and
export passthru APIs which are used to integrate passthru
code with nvmet-core.

The new file passthru.c handles passthru cmd parsing and execution.
In the passthru mode, we create a block layer request from the nvmet
request and map the data on to the block layer request.

Admin commands and features are on a white list as there are a number
of each that don't make too much sense with passthrough. We use a
white list so that new commands can be considered before being blindly
passed through. In both cases, vendor specific commands are always
allowed.

We also blacklist reservation IO commands as the underlying device
cannot differentiate between multiple hosts behind a fabric.


I'm still not so happy about having to look up the namespace and still
wonder if we should generalize the connect_q to a passthrough_q.  But
I guess we can do that later and then reduce some of the exports here..


That is a neat idea! should be easy to do (and we can then lose the host
xarray stuff). I don't mind having it on a later patch, but it should be
easy enough to do even before...


Re: [PATCH v15 7/9] nvmet-passthru: Add passthru code to process commands

2020-07-20 Thread Christoph Hellwig
On Thu, Jul 16, 2020 at 02:33:17PM -0600, Logan Gunthorpe wrote:
> Add passthru command handling capability for the NVMeOF target and
> export passthru APIs which are used to integrate passthru
> code with nvmet-core.
> 
> The new file passthru.c handles passthru cmd parsing and execution.
> In the passthru mode, we create a block layer request from the nvmet
> request and map the data on to the block layer request.
> 
> Admin commands and features are on a white list as there are a number
> of each that don't make too much sense with passthrough. We use a
> white list so that new commands can be considered before being blindly
> passed through. In both cases, vendor specific commands are always
> allowed.
> 
> We also blacklist reservation IO commands as the underlying device
> cannot differentiate between multiple hosts behind a fabric.

I'm still not so happy about having to look up the namespace and still
wonder if we should generalize the connect_q to a passthrough_q.  But
I guess we can do that later and then reduce some of the exports here..

A few more comments below:

> + struct {
> + struct request  *rq;
> + struct work_struct  work;
> + u16 (*end_req)(struct nvmet_req *req);
> + } p;

Do we really need the callback for the two identify fixups, or
should we just hardcode them to avoid the indirect function calls?
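
E.g. something like this in the completion path instead of the indirect
call (function names made up for illustration):

	if (req->cmd->common.opcode == nvme_admin_identify) {
		switch (req->cmd->identify.cns) {
		case NVME_ID_CNS_CTRL:
			nvmet_passthru_override_id_ctrl(req);
			break;
		case NVME_ID_CNS_NS:
			nvmet_passthru_override_id_ns(req);
			break;
		}
	}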

> +++ b/drivers/nvme/target/passthru.c
> @@ -0,0 +1,457 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * NVMe Over Fabrics Target Passthrough command implementation.
> + *
> + * Copyright (c) 2017-2018 Western Digital Corporation or its
> + * affiliates.
> + */

I think you forgot to add your own copyrights here.

> +static int nvmet_passthru_map_sg(struct nvmet_req *req, struct request *rq)
> +{
> + int sg_cnt = req->sg_cnt;
> + struct scatterlist *sg;
> + int op_flags = 0;
> + struct bio *bio;
> + int i, ret;
> +
> + if (req->cmd->common.opcode == nvme_cmd_flush)
> + op_flags = REQ_FUA;
> + else if (nvme_is_write(req->cmd))
> + op_flags = REQ_SYNC | REQ_IDLE;
> +
> +

Double empty line here..

> + bio = bio_alloc(GFP_KERNEL, min(sg_cnt, BIO_MAX_PAGES));
> + bio->bi_end_io = bio_put;
> +
> + for_each_sg(req->sg, sg, req->sg_cnt, i) {
> + if (bio_add_page(bio, sg_page(sg), sg->length,
> +  sg->offset) != sg->length) {

bio_add_page is only for non-passthrough requests; this needs to use
bio_add_pc_page.
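
I.e. roughly (untested):

	if (bio_add_pc_page(rq->q, bio, sg_page(sg), sg->length,
			    sg->offset) != sg->length) {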

> + if (blk_rq_nr_phys_segments(rq) > queue_max_segments(rq->q)) {
> + status = NVME_SC_INVALID_FIELD;
> + goto fail_out;
> + }
> +
> + if ((blk_rq_payload_bytes(rq) >> 9) > queue_max_hw_sectors(rq->q)) {
> + status = NVME_SC_INVALID_FIELD;
> + goto fail_out;
> + }

Which should also take care of these checks.

> +static void nvmet_passthru_set_host_behaviour(struct nvmet_req *req)
> +{
> + struct nvme_ctrl *ctrl = nvmet_req_passthru_ctrl(req);
> + struct nvme_feat_host_behavior *host;
> + u16 status;
> + int ret;
> +
> + host = kzalloc(sizeof(*host) * 2, GFP_KERNEL);
> + ret = nvme_get_features(ctrl, NVME_FEAT_HOST_BEHAVIOR, 0,
> + host, sizeof(*host), NULL);

Missing kzalloc return check.
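
I.e. something along these lines (error handling and label name are just
illustrative):

	host = kzalloc(sizeof(*host) * 2, GFP_KERNEL);
	if (!host) {
		status = NVME_SC_INTERNAL;
		goto out_complete;
	}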


[PATCH v15 7/9] nvmet-passthru: Add passthru code to process commands

2020-07-16 Thread Logan Gunthorpe
Add passthru command handling capability for the NVMeOF target and
export passthru APIs which are used to integrate passthru
code with nvmet-core.

The new file passthru.c handles passthru cmd parsing and execution.
In the passthru mode, we create a block layer request from the nvmet
request and map the data on to the block layer request.

Admin commands and features are on a white list as there are a number
of each that don't make too much sense with passthrough. We use a
white list so that new commands can be considered before being blindly
passed through. In both cases, vendor specific commands are always
allowed.

We also blacklist reservation IO commands as the underlying device
cannot differentiate between multiple hosts behind a fabric.

Based-on-a-patch-by: Chaitanya Kulkarni 
Signed-off-by: Logan Gunthorpe 
Reviewed-by: Sagi Grimberg 
---
 drivers/nvme/target/Makefile|   1 +
 drivers/nvme/target/admin-cmd.c |   7 +-
 drivers/nvme/target/core.c  |   3 +
 drivers/nvme/target/nvmet.h |  39 +++
 drivers/nvme/target/passthru.c  | 457 
 include/linux/nvme.h|   4 +
 6 files changed, 509 insertions(+), 2 deletions(-)
 create mode 100644 drivers/nvme/target/passthru.c

diff --git a/drivers/nvme/target/Makefile b/drivers/nvme/target/Makefile
index 2b33836f3d3e..ebf91fc4c72e 100644
--- a/drivers/nvme/target/Makefile
+++ b/drivers/nvme/target/Makefile
@@ -11,6 +11,7 @@ obj-$(CONFIG_NVME_TARGET_TCP) += nvmet-tcp.o
 
 nvmet-y+= core.o configfs.o admin-cmd.o fabrics-cmd.o \
discovery.o io-cmd-file.o io-cmd-bdev.o
+nvmet-$(CONFIG_NVME_TARGET_PASSTHRU)	+= passthru.o
 nvme-loop-y+= loop.o
 nvmet-rdma-y   += rdma.o
 nvmet-fc-y += fc.o
diff --git a/drivers/nvme/target/admin-cmd.c b/drivers/nvme/target/admin-cmd.c
index 95bb3bc4e335..e8d134535db5 100644
--- a/drivers/nvme/target/admin-cmd.c
+++ b/drivers/nvme/target/admin-cmd.c
@@ -754,7 +754,7 @@ u16 nvmet_set_feat_async_event(struct nvmet_req *req, u32 mask)
return 0;
 }
 
-static void nvmet_execute_set_features(struct nvmet_req *req)
+void nvmet_execute_set_features(struct nvmet_req *req)
 {
struct nvmet_subsys *subsys = req->sq->ctrl->subsys;
u32 cdw10 = le32_to_cpu(req->cmd->common.cdw10);
@@ -829,7 +829,7 @@ void nvmet_get_feat_async_event(struct nvmet_req *req)
nvmet_set_result(req, READ_ONCE(req->sq->ctrl->aen_enabled));
 }
 
-static void nvmet_execute_get_features(struct nvmet_req *req)
+void nvmet_execute_get_features(struct nvmet_req *req)
 {
struct nvmet_subsys *subsys = req->sq->ctrl->subsys;
u32 cdw10 = le32_to_cpu(req->cmd->common.cdw10);
@@ -945,6 +945,9 @@ u16 nvmet_parse_admin_cmd(struct nvmet_req *req)
if (unlikely(ret))
return ret;
 
+   if (nvmet_req_passthru_ctrl(req))
+   return nvmet_parse_passthru_admin_cmd(req);
+
switch (cmd->common.opcode) {
case nvme_admin_get_log_page:
req->execute = nvmet_execute_get_log_page;
diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c
index 9cdc39c8b729..86fc913f481a 100644
--- a/drivers/nvme/target/core.c
+++ b/drivers/nvme/target/core.c
@@ -874,6 +874,9 @@ static u16 nvmet_parse_io_cmd(struct nvmet_req *req)
if (unlikely(ret))
return ret;
 
+   if (nvmet_req_passthru_ctrl(req))
+   return nvmet_parse_passthru_io_cmd(req);
+
req->ns = nvmet_find_namespace(req->sq->ctrl, cmd->rw.nsid);
if (unlikely(!req->ns)) {
req->error_loc = offsetof(struct nvme_common_command, nsid);
diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h
index 6f8bd6a93575..03a92dba24c3 100644
--- a/drivers/nvme/target/nvmet.h
+++ b/drivers/nvme/target/nvmet.h
@@ -243,6 +243,10 @@ struct nvmet_subsys {
struct config_group allowed_hosts_group;
 
struct nvmet_subsys_model   __rcu *model;
+
+#ifdef CONFIG_NVME_TARGET_PASSTHRU
+   struct nvme_ctrl*passthru_ctrl;
+#endif /* CONFIG_NVME_TARGET_PASSTHRU */
 };
 
 static inline struct nvmet_subsys *to_subsys(struct config_item *item)
@@ -322,6 +326,11 @@ struct nvmet_req {
struct bio_vec  *bvec;
struct work_struct  work;
} f;
+   struct {
+   struct request  *rq;
+   struct work_struct  work;
+   u16 (*end_req)(struct nvmet_req *req);
+   } p;
};
int sg_cnt;
int metadata_sg_cnt;
@@ -401,6 +410,8 @@ void nvmet_req_complete(struct nvmet_req *req, u16 status);
 int nvmet_req_alloc_sgls(struct nvmet_req *req);
 void nvmet_req_free_sgls(struct nvmet_req *req);
 
+void nvmet_execute_set_features(struct nvmet_req *req);
+void nvmet_execute_get_features(struct nvmet_req *req);
 void