Re: [PATCH V2 3/3] Adding device_dma_parameters->offset_preserve_mask to NVMe driver.

2021-02-03 Thread Jianxiong Gao via iommu
>
> Please try with this extra patch:
>
I have tried with the extra patch and it still fails to boot.
I have attached dmesg output for the error:

-dmesg starts here-
[6.357755] XFS (nvme0n1p2): Mounting V5 Filesystem
[6.430092] XFS (nvme0n1p2): Torn write (CRC failure) detected at
log block 0x20d0. Truncating head block from 0x20e0.
[6.442828] XFS (nvme0n1p2): Starting recovery (logdev: internal)
[7.272456] XFS (nvme0n1p2): Ending recovery (logdev: internal)
[7.610250] systemd-journald[434]: Received SIGTERM from PID 1 (systemd).
...
[   10.054121] systemd[755]:
/usr/lib/systemd/system-generators/systemd-rc-local-generator
terminated by signal ABRT.
[   10.726122] audit: type=1400 audit(1612370261.090:4): avc:  denied
{ search } for  pid=783 comm="systemd-sysctl" name="crash"
dev="nvme0n1p2" ino=50789805
scontext=system_u:system_r:systemd_sysctl_t:s0
tcontext=system_u:object_r:kdump_crash_t:s0 tclass=dir permissive=0
[   10.751764] audit: type=1400 audit(1612370261.090:5): avc:  denied
{ search } for  pid=783 comm="systemd-sysctl" name="crash"
dev="nvme0n1p2" ino=50789805
scontext=system_u:system_r:systemd_sysctl_t:s0
tcontext=system_u:object_r:kdump_crash_t:s0 tclass=dir permissive=0
[   12.366607] xfs filesystem being remounted at / supports timestamps
until 2038 (0x7fff)
[   12.376379] audit: type=1400 audit(1612370262.740:6): avc:  denied
{ write } for  pid=788 comm="systemd-remount" name="crash"
dev="nvme0n1p2" ino=50789805 scontext=system_u:system_r:init_t:s0
tcontext=system_u:object_r:kdump_crash_t:s0 tclass=dir permissive=0
[   12.413155] systemd-journald[781]: Received request to flush
runtime journal from PID 1
[   12.428917] audit: type=1400 audit(1612370262.793:7): avc:  denied
{ write } for  pid=815 comm="journalctl" name="crash" dev="nvme0n1p2"
ino=50789805 scontext=system_u:system_r:init_t:s0
tcontext=system_u:object_r:kdump_crash_t:s0 tclass=dir permissive=0
[   12.453006] audit: type=1400 audit(1612370262.799:8): avc:  denied
{ write } for  pid=817 comm="systemd-random-" name="crash"
dev="nvme0n1p2" ino=50789805 scontext=system_u:system_r:init_t:s0
tcontext=system_u:object_r:kdump_crash_t:s0 tclass=dir permissive=0
[   13.306030] plymouth[853]: segfault at 0 ip 7f2eabca8090 sp
7fffe94d8c08 error 6 in libc-2.28.so[7f2eabbcd000+1b9000]
[   13.318772] Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 <00> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00
[   78.782418] sed[911]: segfault at 0 ip 7fc3540da090 sp
7ffdb4373ff8 error 6 in libc-2.28.so[7fc353fff000+1b9000]
[   78.794007] Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 <00> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00
--dmesg ends here---

-- 
Jianxiong Gao
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH V2 3/3] Adding device_dma_parameters->offset_preserve_mask to NVMe driver.

2021-02-03 Thread Christoph Hellwig
Please try with this extra patch:

---
>From 212764c3c15ce859e6f55d2146f450ea4ca6fdb9 Mon Sep 17 00:00:00 2001
From: Christoph Hellwig 
Date: Wed, 3 Feb 2021 14:27:13 +0100
Subject: nvme-pci: fix 2nd PRP setup in nvme_setup_prp_simple

Use the dma address instead of the bio_vec offset for the arithmetics
to find the address for the 2nd PRP.

Fixes: dff824b2aadb ("nvme-pci: optimize mapping of small single segment 
requests")
Reported-by: Jianxiong Gao 
Signed-off-by: Christoph Hellwig 
---
 drivers/nvme/host/pci.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 30e45f7e0f750c..4ae51735d72f4a 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -808,8 +808,7 @@ static blk_status_t nvme_setup_prp_simple(struct nvme_dev 
*dev,
struct bio_vec *bv)
 {
struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
-   unsigned int offset = bv->bv_offset & (NVME_CTRL_PAGE_SIZE - 1);
-   unsigned int first_prp_len = NVME_CTRL_PAGE_SIZE - offset;
+   dma_addr_t next_prp;
 
iod->first_dma = dma_map_bvec(dev->dev, bv, rq_dma_dir(req), 0);
if (dma_mapping_error(dev->dev, iod->first_dma))
@@ -817,8 +816,9 @@ static blk_status_t nvme_setup_prp_simple(struct nvme_dev 
*dev,
iod->dma_len = bv->bv_len;
 
cmnd->dptr.prp1 = cpu_to_le64(iod->first_dma);
-   if (bv->bv_len > first_prp_len)
-   cmnd->dptr.prp2 = cpu_to_le64(iod->first_dma + first_prp_len);
+   next_prp = round_down(iod->first_dma + bv->bv_len, NVME_CTRL_PAGE_SIZE);
+   if (next_prp > iod->first_dma)
+   cmnd->dptr.prp2 = cpu_to_le64(next_prp);
return BLK_STS_OK;
 }
 
-- 
2.29.2

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH V2 3/3] Adding device_dma_parameters->offset_preserve_mask to NVMe driver.

2021-02-02 Thread Robin Murphy

On 2021-02-02 11:21, Andy Shevchenko wrote:

On Mon, Feb 01, 2021 at 04:25:55PM -0800, Jianxiong Gao wrote:


+   if (dma_set_min_align_mask(dev->dev, NVME_CTRL_PAGE_SIZE - 1))


Side note: we have DMA_BIT_MASK(), please use it.


FWIW I'd actually disagree on that point. Conceptually, this is a very 
different thing from dev->{coherent_}dma_mask. It does not need to 
handle everything up to 2^64-1 correctly without overflow issues, and 
data alignments typically *are* defined in terms of sizes rather than 
numbers of bits.


In fact, it might be neat to just have callers pass a size directly to a 
dma_set_min_align() interface which asserts that it is a power of two 
and stores it as a mask internally.


Robin.




+   dev_warn(dev->dev, "dma_set_min_align_mask failed to
set offset\n");



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH V2 3/3] Adding device_dma_parameters->offset_preserve_mask to NVMe driver.

2021-02-02 Thread Andy Shevchenko
On Mon, Feb 01, 2021 at 04:25:55PM -0800, Jianxiong Gao wrote:

> +   if (dma_set_min_align_mask(dev->dev, NVME_CTRL_PAGE_SIZE - 1))

Side note: we have DMA_BIT_MASK(), please use it.

> +   dev_warn(dev->dev, "dma_set_min_align_mask failed to
> set offset\n");

-- 
With Best Regards,
Andy Shevchenko


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH V2 3/3] Adding device_dma_parameters->offset_preserve_mask to NVMe driver.

2021-02-01 Thread Chaitanya Kulkarni
On 2/1/21 13:27, Jianxiong Gao wrote:
>> Why is this setting being done and undone on each IO? Wouldn't it be
>> more efficient to set it once during device initialization?
>>
>> And more importantly, this isn't thread safe: one CPU may be setting the
>> device's dma alignment mask to 0 while another CPU is expecting it to be
>> NVME_CTRL_PAGE_SIZE - 1.
>  I was having trouble getting the OS booted when setting it once during
>  initialization. However when I rebased to the latest rc6 this morning it
>  seems to be working with setting the mask on probe. I am still testing out
>  and will appreciate any idea what may cause the nvme driver to fail
>  with a single mask.
Based on the Keith's comment it needs to be completely avoided in hot path.

Did you get a chance to bisect the problem in the rc6 ? We need to know the
root cause otherwise it might happen again after we add this patch.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH V2 3/3] Adding device_dma_parameters->offset_preserve_mask to NVMe driver.

2021-02-01 Thread Jianxiong Gao via iommu
> Why is this setting being done and undone on each IO? Wouldn't it be
> more efficient to set it once during device initialization?

I agree that setting it once is the right way of doing it.

So I have changed the patch to enable the mask once in nvme_probe.

 drivers/nvme/host/pci.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 81e6389b2042..4ce78373f98d 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2630,6 +2630,9 @@ static void nvme_reset_work(struct work_struct *work)
 */
dma_set_max_seg_size(dev->dev, 0x);

+   if (dma_set_min_align_mask(dev->dev, NVME_CTRL_PAGE_SIZE - 1))
+   dev_warn(dev->dev, "dma_set_min_align_mask failed to
set offset\n");
+
mutex_unlock(>shutdown_lock);

/*

However on boot of the system, the following error happens occasionally.
The error seems related to Journal service. Whenever "Stopping Journal
Service..."
appears, the boot succeeds. Otherwise boot fails with the following message.

log start here--
[  OK  ] Started Journal Service.
[   10.774545] xfs filesystem being remounted at / supports timestamps
until 2038 (0x7fff)
[  OK  ] Started Remount Root and Kernel File Systems.
 Starting Flush Journal to Persistent Storage...
 Starting Load/Save Random Seed...
 Starting Create Static [   10.804340] systemd-journald[780]:
Received request to flush runtime journal from PID 1
Device Nodes in /dev...
[  OK  ] Started Load/Save Random Seed.
[  OK  ] Started Flush Journal to Persistent Storage.
[  OK  ] Started Create Static Device Nodes in /dev.
 Starting udev Kernel Device Manager...
[  OK  ] Reached target Local File Systems (Pre).
 Starting File System Check on /dev/disk/by-uuid/7281-17FC...
[  OK  ] Started File System Check on /dev/disk/by-uuid/7281-17FC.
 Mounting /boot/efi...
[  OK  ] Mounted /boo[   11.203461] systemd[1]: segfault at 2e0 ip
55b08607cc24 sp 7ffe13809090 error 4 in
systemd[55b08600+14]
t/efi.
[   11.216088] Code: 02 c7 44 24 10 fe ff ff ff 49 89 e4 89 06 48 8d
6c 24 08 48 8d 5c 24 10 48 c7 44 24 18 00 00 00 00 eb 10 0f 1f 00 48
8b 3c 24 <44> 39 b7 e0 02 00 00 74 3b 49 8b 7d 00 4c 89 e1 48 89 ea 48
89 de
---log ends here---

> Based on the Keith's comment it needs to be completely avoided in hot path.
>
> Did you get a chance to bisect the problem in the rc6 ? We need to know the
> root cause otherwise it might happen again after we add this patch.

I am now trying to bisect the problem.
I am not sure how the mapping offset could have caused the error.
Any suggestions are appreciated.

--
Jianxiong Gao
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH V2 3/3] Adding device_dma_parameters->offset_preserve_mask to NVMe driver.

2021-02-01 Thread Jianxiong Gao via iommu
> Why is this setting being done and undone on each IO? Wouldn't it be
> more efficient to set it once during device initialization?
>
> And more importantly, this isn't thread safe: one CPU may be setting the
> device's dma alignment mask to 0 while another CPU is expecting it to be
> NVME_CTRL_PAGE_SIZE - 1.

 I was having trouble getting the OS booted when setting it once during
 initialization. However when I rebased to the latest rc6 this morning it
 seems to be working with setting the mask on probe. I am still testing out
 and will appreciate any idea what may cause the nvme driver to fail
 with a single mask.
-- 
Jianxiong Gao
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH V2 3/3] Adding device_dma_parameters->offset_preserve_mask to NVMe driver.

2021-02-01 Thread Jianxiong Gao via iommu
>
> On Mon, Feb 01, 2021 at 10:30:17AM -0800, Jianxiong Gao wrote:
> > @@ -868,12 +871,24 @@ static blk_status_t nvme_map_data(struct nvme_dev
> *dev, struct request *req,
> >   if (!iod->nents)
> >   goto out_free_sg;
> >
> > + offset_ret = dma_set_min_align_mask(dev->dev, NVME_CTRL_PAGE_SIZE
> - 1);
> > + if (offset_ret) {
> > + dev_warn(dev->dev, "dma_set_min_align_mask failed to set
> offset\n");
> > + goto out_free_sg;
> > + }
> > +
> >   if (is_pci_p2pdma_page(sg_page(iod->sg)))
> >   nr_mapped = pci_p2pdma_map_sg_attrs(dev->dev, iod->sg,
> >   iod->nents, rq_dma_dir(req),
> DMA_ATTR_NO_WARN);
> >   else
> >   nr_mapped = dma_map_sg_attrs(dev->dev, iod->sg, iod->nents,
> >rq_dma_dir(req),
> DMA_ATTR_NO_WARN);
> > +
> > + offset_ret = dma_set_min_align_mask(dev->dev, 0);
> > + if (offset_ret) {
> > + dev_warn(dev->dev, "dma_set_min_align_mask failed to reset
> offset\n");
> > + goto out_free_sg;
> > + }
> >   if (!nr_mapped)
> >   goto out_free_sg;
>
> Why is this setting being done and undone on each IO? Wouldn't it be
> more efficient to set it once during device initialization?
>
> And more importantly, this isn't thread safe: one CPU may be setting the
> device's dma alignment mask to 0 while another CPU is expecting it to be
> NVME_CTRL_PAGE_SIZE - 1.
>

I was having trouble getting the OS booted when setting it once during
initialization. However when I rebased to the latest rc6 this morning it
seems to be working with setting the mask on probe. I am still testing out
and will appreciate any idea what may cause the nvme driver to fail
with a single mask.

-- 
Jianxiong Gao
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH V2 3/3] Adding device_dma_parameters->offset_preserve_mask to NVMe driver.

2021-02-01 Thread Keith Busch
On Mon, Feb 01, 2021 at 10:30:17AM -0800, Jianxiong Gao wrote:
> @@ -868,12 +871,24 @@ static blk_status_t nvme_map_data(struct nvme_dev *dev, 
> struct request *req,
>   if (!iod->nents)
>   goto out_free_sg;
>  
> + offset_ret = dma_set_min_align_mask(dev->dev, NVME_CTRL_PAGE_SIZE - 1);
> + if (offset_ret) {
> + dev_warn(dev->dev, "dma_set_min_align_mask failed to set 
> offset\n");
> + goto out_free_sg;
> + }
> +
>   if (is_pci_p2pdma_page(sg_page(iod->sg)))
>   nr_mapped = pci_p2pdma_map_sg_attrs(dev->dev, iod->sg,
>   iod->nents, rq_dma_dir(req), DMA_ATTR_NO_WARN);
>   else
>   nr_mapped = dma_map_sg_attrs(dev->dev, iod->sg, iod->nents,
>rq_dma_dir(req), DMA_ATTR_NO_WARN);
> +
> + offset_ret = dma_set_min_align_mask(dev->dev, 0);
> + if (offset_ret) {
> + dev_warn(dev->dev, "dma_set_min_align_mask failed to reset 
> offset\n");
> + goto out_free_sg;
> + }
>   if (!nr_mapped)
>   goto out_free_sg;

Why is this setting being done and undone on each IO? Wouldn't it be
more efficient to set it once during device initialization?

And more importantly, this isn't thread safe: one CPU may be setting the
device's dma alignment mask to 0 while another CPU is expecting it to be
NVME_CTRL_PAGE_SIZE - 1.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH V2 3/3] Adding device_dma_parameters->offset_preserve_mask to NVMe driver.

2021-02-01 Thread Jianxiong Gao via iommu
On Mon, Feb 1, 2021 at 10:56 AM Andy Shevchenko
 wrote:
>
> On Mon, Feb 01, 2021 at 10:30:17AM -0800, Jianxiong Gao wrote:
> > NVMe driver relies on the address offset to function properly.
> > This patch adds the offset preserve mask to NVMe driver when mapping
> > via dma_map_sg_attrs and unmapping via nvme_unmap_sg. The mask
> > depends on the page size defined by CC.MPS register of NVMe
> > controller.
>
> ...
>
> >   if (is_pci_p2pdma_page(sg_page(iod->sg)))
> >   nr_mapped = pci_p2pdma_map_sg_attrs(dev->dev, iod->sg,
> >   iod->nents, rq_dma_dir(req), 
> > DMA_ATTR_NO_WARN);
> >   else
> >   nr_mapped = dma_map_sg_attrs(dev->dev, iod->sg, iod->nents,
> >rq_dma_dir(req), 
> > DMA_ATTR_NO_WARN);
> > +
> > + offset_ret = dma_set_min_align_mask(dev->dev, 0);
> > + if (offset_ret) {
> > + dev_warn(dev->dev, "dma_set_min_align_mask failed to reset 
> > offset\n");
> > + goto out_free_sg;
> > + }
>
> Seems like rebasing effect which makes empty line goes in the middle of some
> other group of code lines.
>
> >   if (!nr_mapped)
> >   goto out_free_sg;
>
> Perhaps it should be here?
Yes you are correct, it should be

 else
  nr_mapped = dma_map_sg_attrs(dev->dev, iod->sg, iod->nents,
   rq_dma_dir(req), DMA_ATTR_NO_WARN);
  if (!nr_mapped)
  goto out_free_sg;
+
+ offset_ret = dma_set_min_align_mask(dev->dev, 0);
+ if (offset_ret) {
+ dev_warn(dev->dev, "dma_set_min_align_mask failed to
reset offset\n");
+ goto out_free_sg;
+ }

Thanks for pointing it out.
>
> --
> With Best Regards,
> Andy Shevchenko
>
>


-- 
Jianxiong Gao
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH V2 3/3] Adding device_dma_parameters->offset_preserve_mask to NVMe driver.

2021-02-01 Thread Andy Shevchenko
On Mon, Feb 01, 2021 at 10:30:17AM -0800, Jianxiong Gao wrote:
> NVMe driver relies on the address offset to function properly.
> This patch adds the offset preserve mask to NVMe driver when mapping
> via dma_map_sg_attrs and unmapping via nvme_unmap_sg. The mask
> depends on the page size defined by CC.MPS register of NVMe
> controller.

...

>   if (is_pci_p2pdma_page(sg_page(iod->sg)))
>   nr_mapped = pci_p2pdma_map_sg_attrs(dev->dev, iod->sg,
>   iod->nents, rq_dma_dir(req), DMA_ATTR_NO_WARN);
>   else
>   nr_mapped = dma_map_sg_attrs(dev->dev, iod->sg, iod->nents,
>rq_dma_dir(req), DMA_ATTR_NO_WARN);
> +
> + offset_ret = dma_set_min_align_mask(dev->dev, 0);
> + if (offset_ret) {
> + dev_warn(dev->dev, "dma_set_min_align_mask failed to reset 
> offset\n");
> + goto out_free_sg;
> + }

Seems like rebasing effect which makes empty line goes in the middle of some
other group of code lines.

>   if (!nr_mapped)
>   goto out_free_sg;

Perhaps it should be here?

-- 
With Best Regards,
Andy Shevchenko


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu