Re: [PATCH 13/14] megaraid_sas: NVME passthru command support

2018-01-15 Thread Martin K. Petersen

Kashyap,

> This patch is not yet included because of the ongoing discussion.
>
> Chris H, Martin et al. - How are we moving forward with this patch?

Well, I guess I'll go ahead and queue it up even though I am no fan of
these dreadful driver-specific ioctl interfaces. I really wish the SCSI
Controller Commands had taken off so we had a standard interface for
dealing with RAID controllers and devices behind them.

-- 
Martin K. Petersen  Oracle Linux Engineering


RE: [PATCH 13/14] megaraid_sas: NVME passthru command support

2018-01-15 Thread Kashyap Desai
This patch is not yet included because of the ongoing discussion.

Chris H, Martin et al. - How are we moving forward with this patch?

Thanks, Kashyap

> -----Original Message-----
> From: Sathya Prakash Veerichetty [mailto:sathya.prak...@broadcom.com]
> Sent: Thursday, January 11, 2018 11:37 PM
> To: Keith Busch
> Cc: dgilb...@interlog.com; Bart Van Assche; h...@infradead.org; Kashyap
> Desai; Shivasharan Srikanteshwara; Sumit Saxena; linux-
> n...@lists.infradead.org; Peter Rivera; linux-scsi@vger.kernel.org
> Subject: RE: [PATCH 13/14] megaraid_sas: NVME passthru command support
>
> >>So even when used as a RAID member, there will be a device handle at
> /dev/sdX for each NVMe device the megaraid controller manages?
> In megaraid controller, you can expose bare NVMe drives and RAID volumes
> created out of NVMe drives, when the RAID volume is created underlying
> member drives will not have /dev/sdX entries associated with them, however
> for bare NVMe drives there will be associated /dev/sdX entries.


RE: [PATCH 13/14] megaraid_sas: NVME passthru command support

2018-01-11 Thread Sathya Prakash Veerichetty
>> So even when used as a RAID member, there will be a device handle at
>> /dev/sdX for each NVMe device the megaraid controller manages?

In a megaraid controller, you can expose bare NVMe drives as well as RAID
volumes created out of NVMe drives. When a RAID volume is created, the
underlying member drives will not have /dev/sdX entries associated with
them; bare NVMe drives, however, will have associated /dev/sdX entries.


Re: [PATCH 13/14] megaraid_sas: NVME passthru command support

2018-01-11 Thread Keith Busch
On Wed, Jan 10, 2018 at 03:14:40PM -0700, Sathya Prakash Veerichetty wrote:
> In the case of RAID controllers, all of those drives and RAID volumes
> are exposed to the OS as generic SCSI devices

So even when used as a RAID member, there will be a device handle at
/dev/sdX for each NVMe device the megaraid controller manages?


RE: [PATCH 13/14] megaraid_sas: NVME passthru command support

2018-01-10 Thread Sathya Prakash Veerichetty
Bart et al,
Broadcom's Tri-mode HBAs and MegaRAID controllers are capable of
connecting with SAS, SATA, NVMe drives,  SAS expanders and PCIe switches
(with NVMe drives connected behind that) and are capable of creating RAID
volumes on top of similar family of drives.  In the case of RAID
controllers, all of those drives and RAID volumes are exposed to the OS as
generic SCSI devices and in the case of HBA only for SAS and SATA the
topology is exposed to OS through SAS transport layer and NVMe drives are
exposed as generic SCSI devices.  The SCSI CDB to specific packet (SATA
frames, SSP frames or NVMe) translation occurs in the hardware/firmware.
For the OS driver, the interface to interact is common across all the type
of devices and it is MPI SCSI IO Request.
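
To illustrate the kind of SCSI-to-NVMe translation described above, here is a
minimal sketch that maps a SCSI READ(10) CDB onto the main fields of an NVMe
Read command. It is not Broadcom's firmware logic, which is not described in
this thread; the `NvmeRead` class and both function names are made up for
illustration, and only the READ(10) layout and the NVMe Read opcode and
zero-based block count are taken from the respective standards.

```python
import struct
from dataclasses import dataclass

NVME_CMD_READ = 0x02   # NVM command set opcode for Read
SCSI_READ_10 = 0x28    # SCSI READ(10) operation code


@dataclass
class NvmeRead:
    """The few NVMe Read fields this sketch fills in (hypothetical container)."""
    opcode: int
    nsid: int
    slba: int   # starting LBA (CDW10/CDW11)
    nlb: int    # number of logical blocks, zero-based (CDW12 bits 15:0)


def build_read10(lba: int, blocks: int) -> bytes:
    """Build a 10-byte READ(10) CDB: opcode, flags, LBA, group, length, control."""
    return struct.pack(">BBIBHB", SCSI_READ_10, 0, lba, 0, blocks, 0)


def translate_read10(cdb: bytes, nsid: int = 1):
    """Translate a READ(10) CDB to an NvmeRead, or None for a zero-length read."""
    opcode, _flags, lba, _group, blocks, _control = struct.unpack(">BBIBHB", cdb)
    if opcode != SCSI_READ_10:
        raise ValueError(f"not a READ(10) CDB: opcode {opcode:#x}")
    if blocks == 0:
        return None  # READ(10) with transfer length 0 moves no data
    # NVMe counts blocks zero-based, SCSI counts them one-based.
    return NvmeRead(NVME_CMD_READ, nsid, slba=lba, nlb=blocks - 1)
```

The off-by-one between the SCSI transfer length and the NVMe block count is
the sort of detail such a translation layer has to hide from the OS driver.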

The NVMe passthru support added in this patch is for management purposes
only. It lets Broadcom-specific management applications send some NVMe
commands directly to the hardware/firmware. For normal READ/WRITE I/O the
preferred path is to issue SCSI commands to our hardware/firmware and let
it translate them to NVMe.

Many architectural constraints prevent us from directly exposing NVMe
drives to the NVMe subsystem for normal I/O and management use, so we
prefer not to go down that path.

This patch is just addition of new feature for our management applications
(which are common across many OSes) to access a specific type of MPI
command to manage NVMe drives connected behind our HBAs (which are
non-standard) in a vendor specific way, hence we think this patch is valid
to be accepted to megaraid driver.  Please let us know if more details are
required on the tri-mode controllers.

Thanks
Sathya


-----Original Message-----
From: Linux-nvme [mailto:linux-nvme-boun...@lists.infradead.org] On Behalf
Of Douglas Gilbert
Sent: Wednesday, January 10, 2018 1:06 PM
To: Bart Van Assche; h...@infradead.org; kashyap.de...@broadcom.com;
shivasharan.srikanteshw...@broadcom.com
Cc: sumit.sax...@broadcom.com; peter.riv...@broadcom.com;
linux-n...@lists.infradead.org; linux-scsi@vger.kernel.org
Subject: Re: [PATCH 13/14] megaraid_sas: NVME passthru command support

On 2018-01-10 11:22 AM, Bart Van Assche wrote:
> On Tue, 2018-01-09 at 22:07 +0530, Kashyap Desai wrote:
>> Overall NVME support behind MR controller is really a SCSI device. On
>> top of that, for MegaRaid, NVME device can be part of Virtual Disk
>> and those drive will not be exposed to the driver. User application
>> may like to talk to hidden NVME devices (part of VDs). This patch
>> will extend the existing interface for megaraid product in the same
>> way it is currently supported for other protocols like SMP, SATA
>> pass-through.
>
> It seems to me like there is a contradiction in the above paragraph:
> if some NVMe devices are not exposed to the driver, how can a user
> space application ever send NVMe commands to it?

I think that he meant that the NVMe physical devices (e.g. SSDs) are not
exposed to the upper layers (e.g. the SCSI mid-layer and above). The SCSI
subsystem has a no_uld_attach device flag that lets a LLD attach physical
devices but the sd driver and hence the block layer do not "see" them. The
idea is that maintenance programs like smartmontools can use them via the
bsg or sg drivers. The Megaraid driver code does not seem to use
no_uld_attach. Does the NVMe subsystem have similar "generic" (i.e.
non-block) devices accessible to the user space?

> Anyway, has it been considered to implement the NVMe support as an
> NVMe transport driver? The upstream kernel already supports NVMe
> communication with NVMe PCI devices, NVMe over RDMA and NVMe over FC.
> If communication to the NVMe devices behind the MegaRaid controller
> would be implemented as an NVMe transport driver then all
> functionality of the Linux NVMe driver could be reused, including its
> sysfs entries.

Broadcom already sell "SAS" HBAs that have "tri-mode" phys. That is a phy
that can connect to a SAS device (e.g. a SAS expander), a SATA device or a
NVMe device. Now if I was Broadcom designing a 24 Gbps SAS-4 next
generation expander I would be thinking of using those tri-mode phys on
it. But then there is a problem, SAS currently supports 3 protocols: SSP
(for SCSI storage and enclosure management (SES)), STP (for SATA storage )
and SMP (for expander management). The problem is how those NVMe commands,
status and data cross the wire between the OS HBA (or MegaRaid type
controller) and an expander. Solving that might need some lateral
thinking.

On one hand the NVM Express folks seem to have shelved the idea of a SCSI
to NVMe Translation Layer (SNTL) and have not updated an old white paper
on the subject. Currently there is no SNTL on Linux (there was but it was
removed) or FreeBSD but there is one on Windows.

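On the other hand I'm informed that recently the same body accepted the
SES-3 standard pretty much as-is. That is done with the addition of SES
Send and SES Receive commands to NVME-MI. The library under sg_ses has
already been modified to use them (by implementing a specialized SNTL).

Doug Gilbert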

Re: [PATCH 13/14] megaraid_sas: NVME passthru command support

2018-01-10 Thread Douglas Gilbert

On 2018-01-10 11:22 AM, Bart Van Assche wrote:
> On Tue, 2018-01-09 at 22:07 +0530, Kashyap Desai wrote:
>> Overall NVME support behind MR controller is really a SCSI device. On top
>> of that, for MegaRaid, NVME device can be part of Virtual Disk and those
>> drive will not be exposed to the driver. User application may like to talk
>> to hidden NVME devices (part of VDs). This patch will extend the existing
>> interface for megaraid product in the same way it is currently supported
>> for other protocols like SMP, SATA pass-through.
>
> It seems to me like there is a contradiction in the above paragraph: if some
> NVMe devices are not exposed to the driver, how can a user space application
> ever send NVMe commands to it?


I think that he meant that the NVMe physical devices (e.g. SSDs) are not
exposed to the upper layers (e.g. the SCSI mid-layer and above). The
SCSI subsystem has a no_uld_attach device flag that lets a LLD attach
physical devices but the sd driver and hence the block layer do not
"see" them. The idea is that maintenance programs like smartmontools
can use them via the bsg or sg drivers. The Megaraid driver code does
not seem to use no_uld_attach. Does the NVMe subsystem have similar
"generic" (i.e. non-block) devices accessible to the user space?
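
As a rough illustration of the sg-based access mentioned above, the sketch
below sends a standard SCSI INQUIRY through the SG_IO ioctl. It is only a
sketch: the structure mirrors struct sg_io_hdr from <scsi/sg.h>, but the
function names and the device path in the docstring are hypothetical, and
this is not how smartmontools itself is implemented.

```python
import ctypes
import fcntl
import os

SG_IO = 0x2285             # ioctl number from <scsi/sg.h>
SG_DXFER_FROM_DEV = -3     # data flows from device to host


class SgIoHdr(ctypes.Structure):
    """Mirror of struct sg_io_hdr from <scsi/sg.h>."""
    _fields_ = [
        ("interface_id", ctypes.c_int),
        ("dxfer_direction", ctypes.c_int),
        ("cmd_len", ctypes.c_ubyte),
        ("mx_sb_len", ctypes.c_ubyte),
        ("iovec_count", ctypes.c_ushort),
        ("dxfer_len", ctypes.c_uint),
        ("dxferp", ctypes.c_void_p),
        ("cmdp", ctypes.c_void_p),
        ("sbp", ctypes.c_void_p),
        ("timeout", ctypes.c_uint),
        ("flags", ctypes.c_uint),
        ("pack_id", ctypes.c_int),
        ("usr_ptr", ctypes.c_void_p),
        ("status", ctypes.c_ubyte),
        ("masked_status", ctypes.c_ubyte),
        ("msg_status", ctypes.c_ubyte),
        ("sb_len_wr", ctypes.c_ubyte),
        ("host_status", ctypes.c_ushort),
        ("driver_status", ctypes.c_ushort),
        ("resid", ctypes.c_int),
        ("duration", ctypes.c_uint),
        ("info", ctypes.c_uint),
    ]


def build_inquiry(alloc_len: int = 96):
    """Return (header, cdb, data, sense) for a standard INQUIRY request."""
    # INQUIRY CDB: opcode 0x12, EVPD=0, page 0, 16-bit allocation length, control
    cdb = (ctypes.c_ubyte * 6)(0x12, 0, 0, alloc_len >> 8, alloc_len & 0xFF, 0)
    data = ctypes.create_string_buffer(alloc_len)
    sense = ctypes.create_string_buffer(32)
    hdr = SgIoHdr()
    hdr.interface_id = ord("S")
    hdr.dxfer_direction = SG_DXFER_FROM_DEV
    hdr.cmd_len = len(cdb)
    hdr.mx_sb_len = len(sense)
    hdr.dxfer_len = alloc_len
    hdr.dxferp = ctypes.addressof(data)
    hdr.cmdp = ctypes.addressof(cdb)
    hdr.sbp = ctypes.addressof(sense)
    hdr.timeout = 5000  # milliseconds
    return hdr, cdb, data, sense


def inquiry(path: str) -> bytes:
    """Send INQUIRY through an sg node (for example a hypothetical /dev/sg2)."""
    hdr, _cdb, data, _sense = build_inquiry()
    fd = os.open(path, os.O_RDWR)
    try:
        fcntl.ioctl(fd, SG_IO, hdr)   # kernel fills in status and data
    finally:
        os.close(fd)
    return data.raw[: hdr.dxfer_len - hdr.resid]
```

Because the request goes through the sg node rather than sd, it works even
for a device that the block layer never "sees".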


> Anyway, has it been considered to implement the NVMe support as an NVMe
> transport driver? The upstream kernel already supports NVMe communication
> with NVMe PCI devices, NVMe over RDMA and NVMe over FC. If communication to
> the NVMe devices behind the MegaRaid controller would be implemented as an
> NVMe transport driver then all functionality of the Linux NVMe driver could
> be reused, including its sysfs entries.


Broadcom already sell "SAS" HBAs that have "tri-mode" phys. That is a phy
that can connect to a SAS device (e.g. a SAS expander), a SATA device or a
NVMe device. Now if I was Broadcom designing a 24 Gbps SAS-4 next generation
expander I would be thinking of using those tri-mode phys on it. But then
there is a problem, SAS currently supports 3 protocols: SSP (for SCSI
storage and enclosure management (SES)), STP (for SATA storage ) and SMP
(for expander management). The problem is how those NVMe commands, status
and data cross the wire between the OS HBA (or MegaRaid type controller) and
an expander. Solving that might need some lateral thinking.

On one hand the NVM Express folks seem to have shelved the idea of a SCSI
to NVMe Translation Layer (SNTL) and have not updated an old white paper
on the subject. Currently there is no SNTL on Linux (there was but it was
removed) or FreeBSD but there is one on Windows.

On the other hand I'm informed that recently the same body accepted the
SES-3 standard pretty much as-is. That is done with the addition of SES
Send and SES Receive commands to NVME-MI. The library under sg_ses has
already been modified to use them (by implementing a specialized SNTL).

Doug Gilbert



Re: [PATCH 13/14] megaraid_sas: NVME passthru command support

2018-01-10 Thread Bart Van Assche
On Tue, 2018-01-09 at 22:07 +0530, Kashyap Desai wrote:
> Overall NVME support behind MR controller is really a SCSI device. On top
> of that, for MegaRaid, NVME device can be part of Virtual Disk and those
> drive will not be exposed to the driver. User application may like to talk
> to hidden NVME devices (part of VDs). This patch will extend the existing
> interface for megaraid product in the same way it is currently supported
> for other protocols like SMP, SATA pass-through.

It seems to me like there is a contradiction in the above paragraph: if some
NVMe devices are not exposed to the driver, how can a user space application
ever send NVMe commands to it?

Anyway, has it been considered to implement the NVMe support as an NVMe
transport driver? The upstream kernel already supports NVMe communication
with NVMe PCI devices, NVMe over RDMA and NVMe over FC. If communication to
the NVMe devices behind the MegaRaid controller would be implemented as an
NVMe transport driver then all functionality of the Linux NVMe driver could
be reused, including its sysfs entries.

Bart.

RE: [PATCH 13/14] megaraid_sas: NVME passthru command support

2018-01-10 Thread Kashyap Desai
> -----Original Message-----
> From: Douglas Gilbert [mailto:dgilb...@interlog.com]
> Sent: Wednesday, January 10, 2018 2:21 AM
> To: Christoph Hellwig; Kashyap Desai
> Cc: Shivasharan Srikanteshwara; linux-scsi@vger.kernel.org; Sumit Saxena;
> linux-n...@lists.infradead.org; Peter Rivera
> Subject: Re: [PATCH 13/14] megaraid_sas: NVME passthru command support
>
> On 2018-01-09 11:45 AM, Christoph Hellwig wrote:
> > On Tue, Jan 09, 2018 at 10:07:28PM +0530, Kashyap Desai wrote:
> >> Chris -
> >>
> >> Overall NVME support behind MR controller is really a SCSI device. On
> >> top of that, for MegaRaid, NVME device can be part of Virtual Disk
> >> and those drive will not be exposed to the driver. User application
> >> may like to talk to hidden NVME devices (part of VDs). This patch
> >> will extend the existing interface for megaraid product in the same
> >> way it is currently supported for other protocols like SMP, SATA
> >> pass-through.
> >>
> >> Example - Current smartmon is using megaraid.h (MFI headers) to send
> >> SATA pass-through.
> >>
> >> https://github.com/mirror/smartmontools/blob/master/megaraid.h
> >
> > And that is exactly the example of why we should have never allowed
> > megaraid any private passthrough ioctls to start with.
>
> Christoph,
> Have you tried to do any serious work with  and say
> compared it with FreeBSD and Microsoft's approach? No prize for guessing
> which one is worst (and least extensible). Looks like the Linux
> pass-through was at the end of a ToDo list and was "designed"
> at 5 a.m in the morning.
>
> RAID cards need a pass-through that allows them to address one of many
> physical disks behind the virtual disk presented to OS.
> Pass-throughs need to have uncommitted room for extra parameters that will
> be passed through as-is to the RAID LLD.

Doug - As you mentioned, I notice the same. This type of issue is common to
all RAID controller vendors.
What Christoph suggested (using the NVMe-type API) is possible, but it may
need extra work on the firmware side to convert the Linux NVME API to the
FW-specific one, or the driver would have to do that conversion.
It may come with its own pros/cons. It also may not fulfil the end goal: for
other platforms, we still have to depend upon specialized pass-through code.
So RAID firmware cannot rely on a single interface for pass-through; it has
to use specialized pass-through code.

The NVME-CLI interface is designed for NVME drives attached to the block
layer. The MegaRaid product is designed to keep the NVME protocol abstracted
(much like SATA drives behind a SAS controller) and attach those
drives/virtual disks to the SCSI layer.

>
> So until Christoph gives an example of how that can be done with
>  then I would like to see Christoph's objection
> ignored.
>
>
> And as a maintainer of smartmontools, I would like to point out that
> pretty well all supported RAIDs, on all platforms need specialized
> pass-through code.

If the upstream community would like an nvme-cli type interface in the
megaraid_sas driver, we may have to add one more layer to the
megaraid_sas driver to convert the NVME API to specialized pass-through code.
This does not fit easily into the existing design, because NVME-CLI and its
API assume an NVME drive associated with the nvme.ko module (/dev/nvmeX).
Also, we don't have many of the sysfs entries nvme-cli looks for on an NVME
device, and we have no way to talk to physical disks that are part of a VD.

It is better to extend the specialized pass-through code in applications
like smartmontools.

> Start by looking at os_linux.cpp and then at the other OSes. And now
> smartmontools supports NVMe on most platforms and at the pass-through
> level, it is just another one, and not a particularly clean one.
>
> IMO Intel had their chance on the pass-through front, and blew it.
> It is now too late to fix it and that job (impossible ?) should not
> fall to MegaRaid maintainers.
>
> Douglas Gilbert


RE: [PATCH 13/14] megaraid_sas: NVME passthru command support

2018-01-10 Thread Kashyap Desai
> -----Original Message-----
> From: Keith Busch [mailto:keith.bu...@intel.com]
> Sent: Wednesday, January 10, 2018 4:53 AM
> To: Douglas Gilbert
> Cc: Christoph Hellwig; Kashyap Desai; Shivasharan Srikanteshwara; Sumit
> Saxena; Peter Rivera; linux-n...@lists.infradead.org; linux-
> s...@vger.kernel.org
> Subject: Re: [PATCH 13/14] megaraid_sas: NVME passthru command support
>
> On Tue, Jan 09, 2018 at 03:50:44PM -0500, Douglas Gilbert wrote:
> > Have you tried to do any serious work with  and
> > say compared it with FreeBSD and Microsoft's approach? No prize for
> > guessing which one is worst (and least extensible). Looks like the
> > Linux pass-through was at the end of a ToDo list and was "designed"
> > at 5 a.m in the morning.
>
> What the heck are you talking about? FreeBSD's NVMe passthrough is near
> identical to Linux, and Linux's existed years prior.
>
> You're not even touching the nvme subsystem, so why are you copying the
> linux-nvme mailing list to help you with a non-NVMe device? Please take
> your ignorant and dubious claims elsewhere.

Keith -

When NVME support for the mpt3sas driver was discussed, there was a request
to add linux-n...@lists.infradead.org for NVME-related discussion.
https://marc.info/?l=linux-kernel&m=149874673729467&w=2

As you mentioned, we are not touching the NVME subsystem, so we can leave
the NVME mailing list off future submissions for NVME drives behind MR
(megaraid_sas) and HBA (mpt3sas).
All NVME drives behind a MegaRaid controller are SCSI devices irrespective
of transport.

Kashyap


Re: [PATCH 13/14] megaraid_sas: NVME passthru command support

2018-01-09 Thread Keith Busch
On Tue, Jan 09, 2018 at 03:50:44PM -0500, Douglas Gilbert wrote:
> Have you tried to do any serious work with  and
> say compared it with FreeBSD and Microsoft's approach? No prize for
> guessing which one is worst (and least extensible). Looks like the
> Linux pass-through was at the end of a ToDo list and was "designed"
> at 5 a.m in the morning.

What the heck are you talking about? FreeBSD's NVMe passthrough is near
identical to Linux, and Linux's existed years prior.
 
You're not even touching the nvme subsystem, so why are you copying the
linux-nvme mailing list to help you with a non-NVMe device? Please take
your ignorant and dubious claims elsewhere.


Re: [PATCH 13/14] megaraid_sas: NVME passthru command support

2018-01-09 Thread Douglas Gilbert

On 2018-01-09 11:45 AM, Christoph Hellwig wrote:
> On Tue, Jan 09, 2018 at 10:07:28PM +0530, Kashyap Desai wrote:
>> Chris -
>>
>> Overall NVME support behind MR controller is really a SCSI device. On top
>> of that, for MegaRaid, NVME device can be part of Virtual Disk and those
>> drive will not be exposed to the driver. User application may like to talk
>> to hidden NVME devices (part of VDs). This patch will extend the existing
>> interface for megaraid product in the same way it is currently supported
>> for other protocols like SMP, SATA pass-through.
>>
>> Example - Current smartmon is using megaraid.h (MFI headers) to send SATA
>> pass-through.
>>
>> https://github.com/mirror/smartmontools/blob/master/megaraid.h
>
> And that is exactly the example of why we should have never allowed
> megaraid any private passthrough ioctls to start with.


Christoph,
Have you tried to do any serious work with  and
say compared it with FreeBSD and Microsoft's approach? No prize for
guessing which one is worst (and least extensible). Looks like the
Linux pass-through was at the end of a ToDo list and was "designed"
at 5 a.m in the morning.

RAID cards need a pass-through that allows them to address one of
many physical disks behind the virtual disk presented to OS.
Pass-throughs need to have uncommitted room for extra parameters that
will be passed through as-is to the RAID LLD.
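
A minimal sketch of what such a pass-through envelope could look like: a
field selecting the physical disk behind the virtual disk, the command
itself, and reserved bytes that are handed to the LLD untouched. The layout,
the field sizes and every name here are hypothetical, invented purely to
illustrate the two requirements; this is not the ABI of any real driver.

```python
import struct

# Hypothetical wire layout, not any real driver's ABI:
#   u16 phys_disk_id | u8 protocol | u8 cmd_len | 64-byte command | 32 reserved bytes
ENVELOPE = struct.Struct("<HBB64s32s")

PROTO_SCSI, PROTO_ATA, PROTO_NVME = 0, 1, 2   # hypothetical protocol tags


def pack_passthru(phys_disk_id: int, protocol: int, cmd: bytes,
                  extra: bytes = b"") -> bytes:
    """Wrap a command for one physical disk; `extra` rides along as-is."""
    if len(cmd) > 64 or len(extra) > 32:
        raise ValueError("command or extra parameters too long")
    # struct's "s" format pads short byte strings with zero bytes.
    return ENVELOPE.pack(phys_disk_id, protocol, len(cmd), cmd, extra)


def unpack_passthru(buf: bytes):
    """Return (phys_disk_id, protocol, command, reserved bytes)."""
    disk, proto, cmd_len, cmd, extra = ENVELOPE.unpack(buf)
    return disk, proto, cmd[:cmd_len], extra
```

The reserved tail is the "uncommitted room": new parameters can be defined
later without changing the size of the envelope.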

So until Christoph gives an example of how that can be done with
 then I would like to see Christoph's objection
ignored.


And as a maintainer of smartmontools, I would like to point out that
pretty well all supported RAIDs, on all platforms need specialized
pass-through code. Start by looking at os_linux.cpp and then at the
other OSes. And now smartmontools supports NVMe on most platforms
and at the pass-through level, it is just another one, and not a
particularly clean one.

IMO Intel had their chance on the pass-through front, and blew it.
It is now too late to fix it and that job (impossible ?) should not
fall to MegaRaid maintainers.

Douglas Gilbert


Re: [PATCH 13/14] megaraid_sas: NVME passthru command support

2018-01-09 Thread Christoph Hellwig
On Tue, Jan 09, 2018 at 10:07:28PM +0530, Kashyap Desai wrote:
> Chris -
> 
> Overall NVME support behind MR controller is really a SCSI device. On top
> of that, for MegaRaid, NVME device can be part of Virtual Disk and those
> drive will not be exposed to the driver. User application may like to talk
> to hidden NVME devices (part of VDs). This patch will extend the existing
> interface for megaraid product in the same way it is currently supported
> for other protocols like SMP, SATA pass-through.
> 
> Example - Current smartmon is using megaraid.h (MFI headers) to send SATA
> pass-through.
> 
> https://github.com/mirror/smartmontools/blob/master/megaraid.h

And that is exactly the example of why we should have never allowed
megaraid any private passthrough ioctls to start with.



RE: [PATCH 13/14] megaraid_sas: NVME passthru command support

2018-01-09 Thread Kashyap Desai
Chris -

Overall NVME support behind MR controller is really a SCSI device. On top
of that, for MegaRaid, NVME device can be part of Virtual Disk and those
drive will not be exposed to the driver. User application may like to talk
to hidden NVME devices (part of VDs). This patch will extend the existing
interface for megaraid product in the same way it is currently supported
for other protocols like SMP, SATA pass-through.

Example - Current smartmon is using megaraid.h (MFI headers) to send SATA
pass-through.

https://github.com/mirror/smartmontools/blob/master/megaraid.h

Any open source application that is aware of the above interface can extend
similar support to NVME drives. I agree that the current nvme-cli type
interface is not going to be supported using this method.  In the current
patch, driver processing is very limited since most of the work is handled
in application + FW.

An NVME drive behind an MR controller is not really an NVME device to the
operating system at the block layer. Considering this, do you agree, or do
you still foresee any issues?

Kashyap

-----Original Message-----
From: Christoph Hellwig [mailto:h...@infradead.org]
Sent: Monday, January 8, 2018 3:36 PM
To: Shivasharan S
Cc: linux-scsi@vger.kernel.org; sumit.sax...@broadcom.com;
linux-n...@lists.infradead.org; kashyap.de...@broadcom.com
Subject: Re: [PATCH 13/14] megaraid_sas: NVME passthru command support

NAK.  Please implement the same ioctl interfaces as the nvme driver
instead of inventing your own incompatible one.


Re: [PATCH 13/14] megaraid_sas: NVME passthru command support

2018-01-08 Thread Christoph Hellwig
NAK.  Please implement the same ioctl interfaces as the nvme driver
instead of inventing your own incompatible one.
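
For reference, the nvme driver's admin passthrough ioctl is presumably the
kind of interface meant here. The sketch below packs the 72-byte
struct nvme_passthru_cmd (also known as nvme_admin_cmd) from
<linux/nvme_ioctl.h> and derives the NVME_IOCTL_ADMIN_CMD number. The helper
names are made up for illustration, and the ioctl encoding shown is the one
used on common architectures such as x86 and arm64.

```python
import struct

# struct nvme_passthru_cmd (alias nvme_admin_cmd) from <linux/nvme_ioctl.h>:
# opcode, flags, rsvd1, nsid, cdw2, cdw3, metadata, addr,
# metadata_len, data_len, cdw10..cdw15, timeout_ms, result
NVME_PASSTHRU_CMD = struct.Struct("=BBHIIIQQII6III")


def _iowr(type_char: str, nr: int, size: int) -> int:
    """Linux _IOWR() encoding as used on x86 and arm64."""
    return (3 << 30) | (size << 16) | (ord(type_char) << 8) | nr


NVME_IOCTL_ADMIN_CMD = _iowr("N", 0x41, NVME_PASSTHRU_CMD.size)


def identify_controller_cmd(buf_addr: int, buf_len: int = 4096) -> bytes:
    """Pack an Identify Controller admin command (opcode 0x06, CNS=1)."""
    return NVME_PASSTHRU_CMD.pack(
        0x06, 0, 0,        # opcode, flags, rsvd1
        0,                 # nsid (unused for Identify Controller)
        0, 0,              # cdw2, cdw3
        0, buf_addr,       # metadata, addr (user buffer for the 4 KiB result)
        0, buf_len,        # metadata_len, data_len
        1, 0, 0, 0, 0, 0,  # cdw10 (CNS=1), cdw11..cdw15
        0, 0,              # timeout_ms, result
    )
```

A tool such as nvme-cli issues this structure with ioctl() on a /dev/nvmeX
node, which is exactly the node a drive hidden behind a RAID controller does
not have.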