Yossi Etigin wrote:
Because ipoib_workqueue is not flushed when the ipoib interface is brought
down, ipoib_mcast_join() may trigger a join to the broadcast group after
priv->broadcast has been set to NULL (during cleanup). This causes ipoib to
be joined to the broadcast group while the interface is down.
As a side effect, this breaks the optimization of setting the qkey only when
joining the broadcast group.

Signed-off-by: Yossi Etigin <[EMAIL PROTECTED]>

--

Fix bugzilla 1370.

Index: b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
===================================================================
--- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c 2008-11-19 21:33:54.000000000 +0200
+++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c 2008-11-19 21:40:12.000000000 +0200
@@ -565,7 +565,8 @@ void ipoib_mcast_join_task(struct work_s
            ipoib_warn(priv, "ib_query_port failed\n");
    }

-    if (!priv->broadcast) {
+    rtnl_lock();
+    if (test_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags) && !priv->broadcast) {
        struct ipoib_mcast *broadcast;

        broadcast = ipoib_mcast_alloc(dev, 1);
@@ -576,6 +577,7 @@ void ipoib_mcast_join_task(struct work_s
                queue_delayed_work(ipoib_workqueue,
                           &priv->mcast_join_task, HZ);
            mutex_unlock(&mcast_mutex);
+            rtnl_unlock();
            return;
        }

@@ -587,6 +589,7 @@ void ipoib_mcast_join_task(struct work_s
        __ipoib_mcast_add(dev, priv->broadcast);
        spin_unlock_irq(&priv->lock);
    }
+    rtnl_unlock();

    if (!test_bit(IPOIB_MCAST_FLAG_ATTACHED, &priv->broadcast->flags)) {
        if (!test_bit(IPOIB_MCAST_FLAG_BUSY, &priv->broadcast->flags))

Hi Yossi,
I got the following kernel oops on SLES 10 (2.6.16.21-0.8-smp) using the patch above.

To reproduce, run:
rmmod ib_ipoib


Unable to handle kernel NULL pointer dereference at virtual address 00000068
printing eip:
f8c5e3c4
*pde = 7a0e8067
Oops: 0000 [#1]
SMP
last sysfs file: /class/infiniband/mthca0/node_desc
Modules linked in: ib_ipoib ib_cm ib_sa ib_uverbs ib_umad mlx4_ib mlx4_core ib_mthca ib_mad ib_core memtrack autofs4 nfs lockd nfs_acl sunrpc ipv6 af_packe
CPU:    0
EIP:    0060:[<f8c5e3c4>]    Tainted: G     U VLI
EFLAGS: 00010202   (2.6.16.21-0.8-smp #1)
EIP is at ipoib_mcast_join_task+0x134/0x24d [ib_ipoib]
eax: 00000000   ebx: f6a2c3e8   ecx: 00000000   edx: 00000000
esi: f6a2c56c   edi: f6a2c12c   ebp: f6a2c380   esp: f6a2bf0c
ds: 007b   es: 007b   ss: 0068
Process ipoib (pid: 7858, threadinfo=f6a2a000 task=f7e3c0f0)
Stack: <0>f6a2c000 00000004 00000004 00000004 00000020 02510a68 80000000 00000000 00000000 00020040 0400000f 02001200 00000501 f6a2c3e8 f6a2c3ec f73447c0 00000292 c012d85e f8c5e290 f6a2c3e8 f73447cc f73447c0 f73447d4 c012e052
Call Trace:
[<c012d85e>] run_workqueue+0x7f/0xba
[<f8c5e290>] ipoib_mcast_join_task+0x0/0x24d [ib_ipoib]
[<c012e052>] worker_thread+0x0/0x11e
[<c012e13f>] worker_thread+0xed/0x11e
[<c011a067>] default_wake_function+0x0/0xc
[<c0130895>] kthread+0x9d/0xc9
[<c01307f8>] kthread+0x0/0xc9
[<c0102005>] kernel_thread_helper+0x5/0xb
Code: 21 63 c7 8b 75 04 81 c6 3c 01 00 00 a5 a5 a5 a5 89 5d 28 8b 04 24 89 da e8 b3 f5 ff ff b0 01 86 45 00 fb e8 62 92 5e c7 8b 55 28 <8b> 42 68 a8 08 75
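
A possible reading of the trace (an interpretation, not something stated in the report): the faulting bytes <8b> 42 68 decode to a load from offset 0x68 off %edx, and edx is 00000000 in the register dump, which matches the fault address 00000068. That would be consistent with the unconditional priv->broadcast->flags test that follows the patched block running while priv->broadcast is still NULL, for example when the join task runs after the interface has gone down during rmmod. The sketch below is only a user-space model of that control flow. All model_* names are hypothetical stand-ins rather than the driver's types, and guarded_flow() shows one possible guard, not necessarily the fix that was adopted.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures (not the real ipoib types). */
struct model_mcast {
    unsigned long flags;
};

struct model_priv {
    bool admin_up;                  /* models IPOIB_FLAG_ADMIN_UP */
    struct model_mcast *broadcast;  /* models priv->broadcast */
};

enum model_result {
    MODEL_OK,          /* broadcast is valid, ->flags can be tested */
    MODEL_WOULD_OOPS,  /* the driver would dereference NULL here */
    MODEL_SKIPPED      /* the task bailed out before touching ->flags */
};

static struct model_mcast model_bcast;

/* Control flow as it appears in the patch: allocate only when admin-up. */
static enum model_result patched_flow(struct model_priv *priv)
{
    if (priv->admin_up && !priv->broadcast)
        priv->broadcast = &model_bcast;  /* stands in for alloc + add */

    /* The driver tests priv->broadcast->flags here unconditionally;
     * the model reports the NULL case instead of crashing. */
    if (!priv->broadcast)
        return MODEL_WOULD_OOPS;
    return MODEL_OK;
}

/* Assumed guard: return early when there is still no broadcast group. */
static enum model_result guarded_flow(struct model_priv *priv)
{
    if (priv->admin_up && !priv->broadcast)
        priv->broadcast = &model_bcast;

    if (!priv->broadcast)
        return MODEL_SKIPPED;  /* never reaches the ->flags test */
    return MODEL_OK;
}
```

In this model, a priv with admin_up false and broadcast NULL reaches the dereference in patched_flow() but not in guarded_flow(). With admin_up true, both variants behave the same.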

Regards,
Vladimir