On Tue, Dec 8, 2020 at 4:20 PM Jane Chu <[email protected]> wrote:
>
> Hi,
>
> I actually just ran into the NULL deref issue that is fixed here.
>
> Bu I have a question for the experts:
> what might cause libndctl to run into the NULL deref like below ?
>
> Program terminated with signal 11, Segmentation fault.
> #0  ndctl_pfn_get_bus (pfn=pfn@entry=0x0) at libndctl.c:5540
> 5540            return pfn->region->bus;
>
> (gdb) print pfn
> $1 = (struct ndctl_pfn *) 0x0
> (gdb) frame 4
> #4  0x000000000040ca70 in setup_namespace (region=region@entry=0x109d910,
>      ndns=ndns@entry=0x10a7d40, p=p@entry=0x7ffd8ff73b90) at namespace.c:570
> 570                     try(ndctl_dax, set_uuid, dax, uuid);
> (gdb) info locals
> __rc = <optimized out>
> dax = 0x0
>
> What I did was to let 2 threads run "create-namespace all" in a tight
> loop, and 2 other threads run "destroy-namespace all" in a tight loop,

This will definitely cause libndctl to get confused the expectation is
that only one ndctl instance is operating on one region at a time.
What likely happened is TOCTOU in ndctl_region_get_pfn_seed() where
the seed device name is destroyed by the time it tries to convert it
to an ndctl object. The reason libndctl does not perform its own
locking is to keep the library stateless and allow locking to imposed
from a higher level. Two ndctl instances must not be allowed to
operate in the same region otherwise the 2 libndctl instances will get
out of sync.

This is no different than 2 fdisk processes running at the same time,
they are going to invalidate each other's view of the cached partition
state. The fix is not for fdisk to implement locking internally
instead it requires the admin to arrange for only one fdisk to run
against one disk at a time.

> while chasing an year old issue that randomly resurfaces -
> "nd_region region1: allocation underrun: 0x0 of 0x40000000 bytes"

It would be interesting to get more data on the sequence of
allocations that lead up to this event, and a dump of the resource
tree when this happens:

for (i = 0; i < nd_region->ndr_mappings; i++)
    ndd = to_ndd(&nd_region->mapping[i]);
    for_each_dpa_resource(...)
        nd_dbg_dpa(...)


>
> In addition, there are kmemleaks,
> # cat /sys/kernel/debug/kmemleak
> [..]
> unreferenced object 0xffff976bd46f6240 (size 64):
>    comm "ndctl", pid 23556, jiffies 4299514316 (age 5406.733s)
>    hex dump (first 32 bytes):
>      00 00 00 00 00 00 00 00 00 00 20 c3 37 00 00 00  .......... .7...
>      ff ff ff 7f 38 00 00 00 00 00 00 00 00 00 00 00  ....8...........
>    backtrace:
>      [<00000000064003cf>] __kmalloc_track_caller+0x136/0x379
>      [<00000000d85e3c52>] krealloc+0x67/0x92
>      [<00000000d7d3ba8a>] __alloc_dev_dax_range+0x73/0x25c
>      [<0000000027d58626>] devm_create_dev_dax+0x27d/0x416
>      [<00000000434abd43>] __dax_pmem_probe+0x1c9/0x1000 [dax_pmem_core]
>      [<0000000083726c1c>] dax_pmem_probe+0x10/0x1f [dax_pmem]
>      [<00000000b5f2319c>] nvdimm_bus_probe+0x9d/0x340 [libnvdimm]
>      [<00000000c055e544>] really_probe+0x230/0x48d
>      [<000000006cabd38e>] driver_probe_device+0x122/0x13b
>      [<0000000029c7b95a>] device_driver_attach+0x5b/0x60
>      [<0000000053e5659b>] bind_store+0xb7/0xc3
>      [<00000000d3bdaadc>] drv_attr_store+0x27/0x31
>      [<00000000949069c5>] sysfs_kf_write+0x4a/0x57
>      [<000000004a8b5adf>] kernfs_fop_write+0x150/0x1e5
>      [<00000000bded60f0>] __vfs_write+0x1b/0x34
>      [<00000000b92900f0>] vfs_write+0xd8/0x1d1

Hmm... maybe this?

diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
index 27513d311242..506549235e03 100644
--- a/drivers/dax/bus.c
+++ b/drivers/dax/bus.c
@@ -1393,6 +1393,7 @@ struct dev_dax *devm_create_dev_dax(struct
dev_dax_data *data)
 err_pgmap:
        free_dev_dax_ranges(dev_dax);
 err_range:
+       kfree(dev_dax->ranges);
        free_dev_dax_id(dev_dax);
 err_id:
        kfree(dev_dax);
_______________________________________________
Linux-nvdimm mailing list -- [email protected]
To unsubscribe send an email to [email protected]

Reply via email to