Hi.

I am running a standalone TIPC 1.7.2 node inside a 4-CPU QEMU instance.
When opening a TIPC socket, I get this:

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.20-gentoo-r3 #9
-------------------------------------------------------
netsend/1453 is trying to acquire lock:
 (ref_table_lock){-+..}, at: [<c029c5d1>] tipc_ref_discard+0x31/0x140

 but task is already holding lock:
 (&table[i].lock){-+..}, at: [<c029c579>] tipc_ref_lock+0x39/0x60

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&table[i].lock){-+..}:
       [<c0136f96>] __lock_acquire+0xcd6/0xdc0
       [<c01370d7>] lock_acquire+0x57/0x70
       [<c02a3371>] _spin_lock_bh+0x31/0x40
       [<c029c75c>] tipc_ref_acquire+0x7c/0x110
       [<c029acc2>] tipc_createport_raw+0x32/0x1a0
       [<c029b866>] tipc_createport+0x46/0xf0
       [<c02960dd>] tipc_subscr_start+0xbd/0x130
       [<c028e736>] process_signal_queue+0x56/0x90
       [<c012062a>] tasklet_action+0x5a/0xe0
       [<c01200c7>] __do_softirq+0x87/0x100
       [<c0120197>] do_softirq+0x57/0x60
       [<c012046e>] local_bh_enable_ip+0xae/0x100
       [<c02a3235>] _spin_unlock_bh+0x25/0x30
       [<c028e6b2>] tipc_k_signal+0xc2/0xf0
       [<c028e3e8>] tipc_core_start+0x98/0xc0
       [<c0353093>] tipc_init+0x83/0xdb
       [<c01004d0>] init+0x110/0x320
       [<c0103c43>] kernel_thread_helper+0x7/0x14
       [<ffffffff>] 0xffffffff

-> #0 (ref_table_lock){-+..}:
       [<c0136e0a>] __lock_acquire+0xb4a/0xdc0
       [<c01370d7>] lock_acquire+0x57/0x70
       [<c02a33f1>] _write_lock_bh+0x31/0x40
       [<c029c5d1>] tipc_ref_discard+0x31/0x140
       [<c029b453>] tipc_deleteport+0x33/0x140
       [<c029e825>] release+0xa5/0x130
       [<c023a6d3>] sock_release+0x13/0x70
       [<c023a8a1>] sock_close+0x21/0x40
       [<c0159828>] __fput+0x58/0x100
       [<c0159939>] fput+0x19/0x20
       [<c0156f67>] filp_close+0x47/0x70
       [<c011d034>] put_files_struct+0xa4/0xb0
       [<c011e14e>] do_exit+0x12e/0x7d0
       [<c011e819>] do_group_exit+0x29/0x70
       [<c011e86f>] sys_exit_group+0xf/0x20
       [<c0103018>] syscall_call+0x7/0xb
       [<ffffffff>] 0xffffffff
other info that might help us debug this:

2 locks held by netsend/1453:
 #0:  (sk_lock-AF_TIPC){--..}, at: [<c029e7b1>] release+0x31/0x130
 #1:  (&table[i].lock){-+..}, at: [<c029c579>] tipc_ref_lock+0x39/0x60

stack backtrace:
 [<c010402a>] show_trace_log_lvl+0x1a/0x30
 [<c0104712>] show_trace+0x12/0x20
 [<c01047c6>] dump_stack+0x16/0x20
 [<c01350ef>] print_circular_bug_tail+0x6f/0x80
 [<c0136e0a>] __lock_acquire+0xb4a/0xdc0
 [<c01370d7>] lock_acquire+0x57/0x70
 [<c02a33f1>] _write_lock_bh+0x31/0x40
 [<c029c5d1>] tipc_ref_discard+0x31/0x140
 [<c029b453>] tipc_deleteport+0x33/0x140
 [<c029e825>] release+0xa5/0x130
 [<c023a6d3>] sock_release+0x13/0x70
 [<c023a8a1>] sock_close+0x21/0x40
 [<c0159828>] __fput+0x58/0x100
 [<c0159939>] fput+0x19/0x20
 [<c0156f67>] filp_close+0x47/0x70
 [<c011d034>] put_files_struct+0xa4/0xb0
 [<c011e14e>] do_exit+0x12e/0x7d0
 [<c011e819>] do_group_exit+0x29/0x70
 [<c011e86f>] sys_exit_group+0xf/0x20
 [<c0103018>] syscall_call+0x7/0xb
 =======================


I think the warning is correct and that this is indeed a possible deadlock:

tipc_deleteport():
 calls tipc_port_lock() (which is the same as tipc_ref_lock());
    tipc_ref_lock() acquires the per-entry spinlock:
      struct reference *r =
              &tipc_ref_table.entries[ref & tipc_ref_table.index_mask];
      spin_lock_bh(&r->lock);
      if (likely(r->data.reference == ref))
             return r->object;
 next, tipc_deleteport() calls tipc_ref_discard(), which takes ref_table_lock
 while the entry lock is still held.

tipc_ref_acquire():
locks ref_table_lock. Then the following code is executed:
  if (tipc_ref_table.first_free) {
       index = tipc_ref_table.first_free;
       entry = &(tipc_ref_table.entries[index]);
       index_mask = tipc_ref_table.index_mask;
       /* take lock in case a previous user of entry still holds it */
       spin_lock_bh(&entry->lock);
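
The two code paths above can be reduced to a small user-space model that only
records which lock class is taken while which other one is held. This is a
minimal sketch, not how lockdep is implemented: it catches only a direct
two-lock inversion, and the names record_order(), model_ref_acquire() and
model_deleteport() are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* The two lock classes named in the report; the numbering is arbitrary. */
enum lock_class { REF_TABLE_LOCK, ENTRY_LOCK, NUM_CLASSES };

/* seen[a][b] becomes true once some path took b while holding a. */
static bool seen[NUM_CLASSES][NUM_CLASSES];

/*
 * Record "acquire `next` while holding `held`".
 * Returns true if the opposite order was recorded earlier, which means
 * the two orders together form an ABBA pattern that could deadlock.
 */
static bool record_order(enum lock_class held, enum lock_class next)
{
    assert(held < NUM_CLASSES && next < NUM_CLASSES);
    seen[held][next] = true;
    return seen[next][held];
}

/* Order seen in tipc_ref_acquire(): table lock, then entry lock. */
static bool model_ref_acquire(void)
{
    return record_order(REF_TABLE_LOCK, ENTRY_LOCK);
}

/* Order seen in tipc_deleteport(): entry lock, then table lock. */
static bool model_deleteport(void)
{
    return record_order(ENTRY_LOCK, REF_TABLE_LOCK);
}
```

Calling model_ref_acquire() first and model_deleteport() second mirrors the
order of events in the trace: the first call reports nothing, the second one
reports the inversion.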

And indeed: the two paths acquire these locks in opposite order 8-/
Any ideas of how this can be fixed?
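
One generic remedy for an inversion like this is to make every path take the
locks in the same order. The following is a minimal user-space sketch of that
idea with pthread mutexes, assuming the discard path is allowed to take the
table lock before the entry lock. Whether that holds for TIPC depends on code
not shown here, so this is not a patch. struct slot, slot_acquire() and
slot_discard() are hypothetical stand-ins, not TIPC functions.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NSLOTS 4

/* Hypothetical stand-in for one reference table entry. */
struct slot {
    pthread_mutex_t lock;   /* plays the role of table[i].lock */
    void *object;           /* NULL while the slot is free */
};

/* Plays the role of ref_table_lock. */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

static struct slot slots[NSLOTS] = {
    { PTHREAD_MUTEX_INITIALIZER, NULL },
    { PTHREAD_MUTEX_INITIALIZER, NULL },
    { PTHREAD_MUTEX_INITIALIZER, NULL },
    { PTHREAD_MUTEX_INITIALIZER, NULL },
};

/* Claim the first free slot: table lock first, then the slot lock.
 * Returns the slot index, or -1 if every slot is in use. */
static int slot_acquire(void *object)
{
    int found = -1;

    assert(object != NULL);
    pthread_mutex_lock(&table_lock);
    for (int i = 0; i < NSLOTS && found < 0; i++) {
        pthread_mutex_lock(&slots[i].lock);
        if (slots[i].object == NULL) {
            slots[i].object = object;
            found = i;
        }
        pthread_mutex_unlock(&slots[i].lock);
    }
    pthread_mutex_unlock(&table_lock);
    return found;
}

/* Free a slot using the same order: table lock first, then the slot lock.
 * Returns 0 on success, -1 for a bad index or an already free slot. */
static int slot_discard(int index)
{
    int rc = -1;

    if (index < 0 || index >= NSLOTS)
        return -1;

    pthread_mutex_lock(&table_lock);
    pthread_mutex_lock(&slots[index].lock);
    if (slots[index].object != NULL) {
        slots[index].object = NULL;
        rc = 0;
    }
    pthread_mutex_unlock(&slots[index].lock);
    pthread_mutex_unlock(&table_lock);
    return rc;
}
```

Because both functions take table_lock before any slot lock, no thread can
hold a slot lock while waiting for table_lock, so the cycle reported above
cannot form between these two paths.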

Thanks,
        Florian

_______________________________________________
tipc-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/tipc-discussion
