Re: [PATCH 1/1] nvme: Add quirk for LiteON CL1 devices running FW 220TQ,22001

2020-10-28 Thread Jongpil Jung
On Tue, Oct 27, 2020 at 01:23:34PM -0700, Keith Busch wrote:
> On Wed, Oct 28, 2020 at 12:54:38AM +0900, Jongpil Jung wrote:
> > suspend.
> > 
> > When NVMe device receive D3hot from host, NVMe firmware will do
> > garbage collection. While NVMe device do Garbage collection,
> > firmware has chance to going incorrect address.
> > In that case, NVMe storage device goes to no device available state.
> > Finally, host can't access the device any more.
> > 
> > Quirk devices will not use simple suspend even if HMB is enabled.
> > In case of poweroff scenario, NVMe receive "PME turn off".
> > So garbage collection will not be happening.
> > 
> > Liteon(SSSTC) will fix the issue, that's why quirk apply on specific
> > vendor id and firmware version.
> 
> This is a concerning quirk. We use the simple suspend when HMB is
> enabled because at least some platforms disable device DMA access in the
> runtime suspend state. Many devices continue to access their HMB while
> in low power, so we can't let both conditions occur concurrently.
> Unless you know for sure this device doesn't access host memory in
> low-power, there will be at least some platform combinations where this
> quirk will fail.
I will submit another patch to check platform.


Re: [PATCH 1/1] nvme: Add quirk for LiteON CL1 devices running FW 220TQ,22001

2020-10-27 Thread Keith Busch
On Wed, Oct 28, 2020 at 12:54:38AM +0900, Jongpil Jung wrote:
> suspend.
> 
> When NVMe device receive D3hot from host, NVMe firmware will do
> garbage collection. While NVMe device do Garbage collection,
> firmware has chance to going incorrect address.
> In that case, NVMe storage device goes to no device available state.
> Finally, host can't access the device any more.
> 
> Quirk devices will not use simple suspend even if HMB is enabled.
> In case of poweroff scenario, NVMe receive "PME turn off".
> So garbage collection will not be happening.
> 
> Liteon(SSSTC) will fix the issue, that's why quirk apply on specific
> vendor id and firmware version.

This is a concerning quirk. We use the simple suspend when HMB is
enabled because at least some platforms disable device DMA access in the
runtime suspend state. Many devices continue to access their HMB while
in low power, so we can't let both conditions occur concurrently.
Unless you know for sure this device doesn't access host memory in
low-power, there will be at least some platform combinations where this
quirk will fail.


[PATCH 1/1] nvme: Add quirk for LiteON CL1 devices running FW 220TQ,22001

2020-10-27 Thread Jongpil Jung
LiteON(SSSTC) CL1 device running FW 220TQ,22001 has bugs with simple
suspend.

When NVMe device receive D3hot from host, NVMe firmware will do
garbage collection. While NVMe device do Garbage collection,
firmware has chance to going incorrect address.
In that case, NVMe storage device goes to no device available state.
Finally, host can't access the device any more.

Quirk devices will not use simple suspend even if HMB is enabled.
In case of poweroff scenario, NVMe receive "PME turn off".
So garbage collection will not be happening.

Liteon(SSSTC) will fix the issue, that's why quirk apply on specific
vendor id and firmware version.

Signed-off-by: Jongpil Jung 
---
 drivers/nvme/host/core.c | 30 ++
 drivers/nvme/host/nvme.h |  4 
 drivers/nvme/host/pci.c  |  6 +-
 3 files changed, 39 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 95ef4943d8bd..9ee520aa0100 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2637,6 +2637,36 @@ static const struct nvme_core_quirk_entry core_quirks[] 
= {
.vid = 0x14a4,
.fr = "2230",
.quirks = NVME_QUIRK_SIMPLE_SUSPEND,
+   },
+   {
+   /*
+* This LiteON CL1-3D256 CR22001 firmware version has some
+* issue in simple suspend.
+* Simple Suspend issue will be fixed in future firmware
+*/
+   .vid = 0x14a4,
+   .fr = "CR22001",
+   .quirks = NVME_QUIRK_NORMAL_SUSPEND_HMB,
+   },
+   {
+   /*
+* This LiteON CL1-3D256 CR220TQ firmware version has some
+* issue in simple suspend.
+* Simple Suspend issue will be fixed in future firmware
+*/
+   .vid = 0x14a4,
+   .fr = "CR220TQ",
+   .quirks = NVME_QUIRK_NORMAL_SUSPEND_HMB,
+   },
+   {
+   /*
+* This SSSTC CL1-3D256 CR22001 firmware version has some
+* issue in simple suspend.
+* Simple Suspend issue will be fixed in future firmware
+*/
+   .vid = 0x1e95,
+   .fr = "CR22001",
+   .quirks = NVME_QUIRK_NORMAL_SUSPEND_HMB,
}
 };
 
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index cc36a981..2fde019dad8e 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -144,6 +144,10 @@ enum nvme_quirks {
 * NVMe 1.3 compliance.
 */
NVME_QUIRK_NO_NS_DESC_LIST  = (1 << 15),
+   /*
+* Force noraml suspend/resume path for HMB enabled devices.
+*/
+   NVME_QUIRK_NORMAL_SUSPEND_HMB   = (1 << 16),
 };
 
 /*
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index df8f3612107f..1b1221cfb257 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -3014,10 +3014,14 @@ static int nvme_suspend(struct device *dev)
 * specification allows the device to access the host memory buffer in
 * host DRAM from all power states, but hosts will fail access to DRAM
 * during S3.
+*
+* If NVME_QUIRK_NORMAL_SUSPEND_HMB is enabled,
+* do not use SIMPLE_SUSPEND for HMB device.
 */
if (pm_suspend_via_firmware() || !ctrl->npss ||
!pcie_aspm_enabled(pdev) ||
-   ndev->nr_host_mem_descs ||
+   (ndev->nr_host_mem_descs &&
+!(ndev->ctrl.quirks & NVME_QUIRK_NORMAL_SUSPEND_HMB)) ||
(ndev->ctrl.quirks & NVME_QUIRK_SIMPLE_SUSPEND))
return nvme_disable_prepare_reset(ndev, true);
 
-- 
2.25.1