From: Sowmini Varadhan <>
Date: Wed,  8 Aug 2018 13:57:13 -0700

> The following deadlock, reported by syzbot, can occur if CPU0 is in
> rds_send_remove_from_sock() while CPU1 is in rds_clear_recv_queue():
>        CPU0                    CPU1
>        ----                    ----
>   lock(&(&rm->m_rs_lock)->rlock);
>                                lock(&rs->rs_recv_lock);
>                                lock(&(&rm->m_rs_lock)->rlock);
>   lock(&rs->rs_recv_lock);
> The deadlock should be avoided by moving the messages from the
> rs_recv_queue into a tmp_list in rds_clear_recv_queue() under
> the rs_recv_lock, and then dropping the refcnt on the messages
> in the tmp_list (potentially resulting in rds_message_purge())
> after dropping the rs_recv_lock.
> The same lock hierarchy violation also exists in rds_still_queued()
> and should be avoided in a similar manner.
> Signed-off-by: Sowmini Varadhan <>
> Reported-by:
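
For reference, the approach described in the quoted commit message can be sketched in user space roughly as follows. This is only an illustration under stated assumptions, not the RDS code: pthread mutexes stand in for the kernel spinlocks, and struct msg, struct sock, msg_put(), queue_msg(), sock_init() and the purged counter are hypothetical names invented for the sketch. The point it shows is that the receive-queue lock is released before any per-message lock is taken, so the two locks are never nested in the order recv_lock -> m_lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the RDS structures named in the commit message. */
struct msg {
	struct msg *next;
	pthread_mutex_t m_lock;		/* plays the role of rm->m_rs_lock */
	int refcnt;
};

struct sock {
	pthread_mutex_t recv_lock;	/* plays the role of rs->rs_recv_lock */
	struct msg *recv_queue;
};

/* Number of messages freed so far (only touched single-threaded in this sketch). */
static int purged;

static void sock_init(struct sock *s)
{
	pthread_mutex_init(&s->recv_lock, NULL);
	s->recv_queue = NULL;
}

/* Queue one message holding a single reference owned by the queue. */
static void queue_msg(struct sock *s)
{
	struct msg *m = malloc(sizeof(*m));

	assert(m != NULL);
	pthread_mutex_init(&m->m_lock, NULL);
	m->refcnt = 1;

	pthread_mutex_lock(&s->recv_lock);
	m->next = s->recv_queue;
	s->recv_queue = m;
	pthread_mutex_unlock(&s->recv_lock);
}

/* Drop a reference; the final put takes the per-message lock and frees. */
static void msg_put(struct msg *m)
{
	int last;

	pthread_mutex_lock(&m->m_lock);
	assert(m->refcnt > 0);
	last = (--m->refcnt == 0);
	pthread_mutex_unlock(&m->m_lock);

	if (last) {
		pthread_mutex_destroy(&m->m_lock);
		free(m);
		purged++;
	}
}

/* Detach the whole queue under recv_lock, then drop refs with no lock held. */
static void clear_recv_queue(struct sock *s)
{
	struct msg *tmp_list, *m;

	pthread_mutex_lock(&s->recv_lock);
	tmp_list = s->recv_queue;	/* move everything to a private list */
	s->recv_queue = NULL;
	pthread_mutex_unlock(&s->recv_lock);

	/* recv_lock is no longer held, so taking m_lock here cannot invert. */
	while ((m = tmp_list) != NULL) {
		tmp_list = m->next;
		msg_put(m);
	}
}
```

A caller would set up the socket with sock_init(), add messages with queue_msg(), and then call clear_recv_queue(); afterwards the queue is empty and each message whose last reference was held by the queue has been freed.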

I'm putting this in deferred state for now.

Sowmini, once you and Santosh agree on what exactly to do, please

Thank you.

