Tried on kernel 4.18.20 and this issue is not seen.
[root@localhost ~]# ./test-ibv_reg 10.8.8.133 /dev/dax0.0 3
Creating RDMA event channel.
Creating RDMA communication identifier.
RDMA bind address to 10.8.8.133
RDMA start listen
Register memory region.
Unregister memory region.
Pool unmapped.
Pool handler closed.
Pool closed.
De-allocated PD.
Destroyed RDMA communication identifier.
Destroyed RDMA event channel.
[root@localhost ~]# ndctl create-namespace -fe namespace0.0 -a 4k
{
"dev":"namespace0.0",
"mode":"devdax",
"map":"dev",
"size":"7.87 GiB (8.45 GB)",
"uuid":"743ec485-6c77-4323-90ca-5ad864a00e72",
"daxregion":{
"id":0,
"size":"7.87 GiB (8.45 GB)",
"align":4096,
"devices":[
{
"chardev":"dax0.0",
"size":"7.87 GiB (8.45 GB)"
}
]
},
"numa_node":0
}
[root@localhost ~]# uname -a
Linux localhost.localdomain 4.18.20 #1 SMP Mon Jun 17 06:43:19 EDT 2019 x86_64 x86_64 x86_64 GNU/Linux
Thanks,
Jacky
-----Original Message-----
From: Jacky Wu
Sent: Monday, June 17, 2019 4:58 PM
To: Yue Li <[email protected]>; Dan Williams <[email protected]>
Cc: Scargall, Steve <[email protected]>; [email protected]
Subject: RE: ndctl hangs after memory deregistration
Hi Dan,
I wrote a small program to simulate our use case and tested three cases:
(1) no register or unregister, (2) register only, with no unregister, and
(3) both register and unregister. The ndctl command hung in the latter two
cases. I'm attaching the source code for your reference.
I will try the latest kernel next.
Thanks,
Jacky
-----Original Message-----
From: Yue Li <[email protected]>
Sent: Friday, June 14, 2019 7:10 AM
To: Dan Williams <[email protected]>
Cc: Scargall, Steve <[email protected]>; Jacky Wu <[email protected]>; [email protected]
Subject: Re: ndctl hangs after memory deregistration
Thanks Dan for the reply!
On 6/14/19, 3:06 AM, "Dan Williams" <[email protected]> wrote:
On Wed, Jun 12, 2019 at 9:08 PM Yue Li <[email protected]> wrote:
>
> hi Dan and Steve,
>
>
Hi,
I just happened to see this by luck. Please use my Intel address, and
copy the libnvdimm mailing list on issues like this
([email protected]).
OK.
> We recently ran into a strange issue where ndctl command hangs on dev dax
> after our software uses it.
The last thing that device-dax teardown does is wait for any pinned
pages to be released before allowing the exit to proceed.
OK.
> Inside our application, we basically will first RDMA register the whole
> device, then deregister, and exit.
Is this just using simple ibverbs to unregister, or something specific
to this driver?
There was a bug upstream that addressed cases where device teardown
proceeded when it shouldn't, but the sequence you describe is the
opposite: the page pins should be torn down before the device
reconfiguration.
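As a rough illustration of the teardown rule described above, the following C
sketch models a device whose teardown can finish only once every pinned page
has been released. It is a toy model under stated assumptions, not kernel
code: toy_dax_dev, toy_pin, toy_unpin, and toy_teardown_can_complete are
hypothetical names invented for this example, and the real teardown waits for
the pins to drop rather than checking a counter once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of a device-dax instance; not a kernel structure. */
struct toy_dax_dev {
    size_t pinned_pages; /* pages currently pinned, e.g. by a memory registration */
};

/* Models a memory registration pinning npages pages of the device. */
void toy_pin(struct toy_dax_dev *dev, size_t npages)
{
    dev->pinned_pages += npages;
}

/* Models a deregistration releasing npages previously pinned pages. */
void toy_unpin(struct toy_dax_dev *dev, size_t npages)
{
    assert(npages <= dev->pinned_pages); /* cannot release more than was pinned */
    dev->pinned_pages -= npages;
}

/*
 * Teardown may complete only when nothing is pinned. While this returns
 * false, a real teardown would keep waiting, which an observer sees as a hang.
 */
bool toy_teardown_can_complete(const struct toy_dax_dev *dev)
{
    return dev->pinned_pages == 0;
}
```

In this model, a run that pins pages and never unpins them leaves the count
non-zero, so teardown cannot complete. A run that unpins everything it pinned
returns the count to zero, which is the outcome expected from a register
followed by a matching deregister.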
> However, if we remove the registration and deregistration code, ndctl
> works correctly without hanging. The problem occurs both on DRAM emulated dax
> as well as real PMEM backed dax.
>
> Here is our system information:
>
>
>
> CentOS 7.6
>
> Vanilla kernel 3.10.0-957.el7.x86_64
Are you familiar with rebuilding the kernel? I'd ask you to try to
reproduce with the latest development kernel that includes these
fixes:
4422ee8476f0 mm/devm_memremap_pages: fix final page put race
771f0714d0dc PCI/P2PDMA: track pgmap references per resource, not globally
af37085de906 lib/genalloc: introduce chunk owners
e0047ff8aa77 PCI/P2PDMA: fix the gen_pool_add_virt() failure path
0315d47d6ae9 mm/devm_memremap_pages: introduce devm_memunmap_pages
216475c7eaa8 drivers/base/devres: introduce devm_release_action()
...but it sounds like you may be hitting a different issue.
Thanks for the suggestion, we will download the upstream kernel and try it
again. Will post the results soon.
Best,
Yue
_______________________________________________
Linux-nvdimm mailing list
[email protected]
https://lists.01.org/mailman/listinfo/linux-nvdimm