On 8/15/2015 5:14 PM, Doug Ledford wrote:
On 08/06/2015 01:14 PM, Matan Barak wrote:
Separate the cleanup and release IB cache functions.
The cleanup function delete all GIDs and unregister the event channel,
while the release function frees all memory. The cleanup function is
called after all clients were removed (in the device un-registration
process).

The release function is called after the device was put.
This is in order to avoid use-after-free errors if ther vendor
driver's teardown code uses IB cache.

Squash-into: 'IB/core: Add RoCE GID table management'
Signed-off-by: Matan Barak <[email protected]>

I had originally used the v1 of this patch instead of v2.  In order to
make sure I didn't miss anything, I hand compared the final results of
the v1 patch to what this patch would have created if I had used it.  I
found a few things worth nothing, comments inline below.

@@ -903,20 +924,28 @@ int ib_cache_setup_one(struct ib_device *device)
                                          (rdma_end_port(device) -
                                           rdma_start_port(device) + 1),
                                          GFP_KERNEL);
-       err = gid_table_setup_one(device);
-
-       if (!device->cache.pkey_cache || !device->cache.gid_cache ||
+       if (!device->cache.pkey_cache ||
            !device->cache.lmc_cache) {
                printk(KERN_WARNING "Couldn't allocate cache "
                       "for %s\n", device->name);
-               err = -ENOMEM;
-               goto err;
+               /* pkey_cache is freed here because we later access it's
+                * elements.
+                */
+               kfree(device->cache.pkey_cache);
+               device->cache.pkey_cache = NULL;
+               return -ENOMEM;

This function is unnecessarily convoluted IMO.  A simple change to using
kzalloc() for the pkey_cache alloc eliminates part of the issue.  As a
result, I fixed this function up to be different than either patch.
Please look over my results and make sure you agree with my changes.

        }

-       for (p = 0; p <= rdma_end_port(device) - rdma_start_port(device); ++p) {
+       for (p = 0; p <= rdma_end_port(device) - rdma_start_port(device); ++p)
                device->cache.pkey_cache[p] = NULL;

For instance this goes away entirely by using kzalloc().

The rest looked ok.  I have to fixup the ordering changes between sysfs
and cache to make the v1 patch match the v2 patch.



Thanks for the squashing and re-ordering these patches.
I think you're missing if (device->cache.pkey_cache) over the for pkey free loop in ib_cache_release_one. Otherwise, the allocation of device->cache.pkey_cache might has failed and you dereference an invalid address device->cache.pkey_cache[p].

In addition, ib_device_release calls ib_cache_release_one with device instead of dev (which has the wrong type).
ib_register_device calls ib_cache_clean_one that I can't really find :)


Matan
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to