The commit is pushed to "branch-rh7-3.10.0-693.21.1.vz7.50.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git after rh7-3.10.0-693.21.1.vz7.50.4

------>
commit 1bb9d8c41e7079ac798184ab1fe539c8e28761b6
Author: Kirill Tkhai <ktk...@virtuozzo.com>
Date:   Mon Jun 4 23:24:16 2018 +0300
    fuse kio: Fix deadlock during change of CS address

    We have noticed the following deadlock:

    map_truncate_tail()                 pcs_cs_find_create()
      spin_lock(&m->lock)                 spin_lock(&cs->lock)
      map_drop_cslist()                   pcs_map_notify_addr_change()
        cslist_destroy()                    spin_lock(&m->lock)
          spin_lock(&cs->lock)

    To fix it, this patch makes pcs_map_notify_addr_change() unlock
    cs->lock before taking m->lock. This is possible because cs_list
    can't be unlinked from the list after the unlock: we take a cs_list
    reference to keep it alive, and increment cs->use_count to prohibit
    cs isolation. Thus, after we take the lock again, cs_link is
    guaranteed to still be in place, and we continue the iteration.

    Signed-off-by: Kirill Tkhai <ktk...@virtuozzo.com>

=====================
Patchset description:

Fix deadlock during change of CS address

This is not a complete patchset, but I met a situation where it's
necessary to change the original logic in a small way, so this is a
request for comments.

[1-5/7] are mostly preparations and fixes, so my question is about
[6-7/7].

1) Patch 6 changes the order of actions: pcs_map_notify_addr_change()
is called after assigning the rpc addr. Can we do that? As I understand
it, this results in new maps being created with the new address, while
in pcs_map_notify_addr_change() we invalidate the old ones. So, it
seems to me there is no problem. This is needed to make it possible to
unlock cs->lock in pcs_map_notify_addr_change(). Theoretically, two
pcs_cs_find_create() calls may happen in parallel, so we want them to
assign the addr in the order they happen. Otherwise, the first one,
carrying the old addr_serno, may overwrite the addr.

2) Patch 7 uses the preparations from the previous patches and makes
pcs_map_notify_addr_change() unlock cs->lock for a while. New elements
are added to the head of cs->map_list, so we skip them during the
iteration. But it seems they must be correct, because we have already
updated the rpc addr in pcs_cs_find_create(). Is there a reason we
can't do this?
Kirill Tkhai (7):
  fuse kio: Introduce pcs_cs_list_of_cs_link()
  fuse kio: Fix potential use after free
  fuse kio: Fix possible use after free in cslist_destroy()
  fuse kio: Introduce pcs_cs::use_count instead of ::is_probing
  fuse kio: Wait till cs is unused in pcs_csset_fini()
  fuse kio: Change order around pcs_map_notify_addr_change()
  fuse kio: Fix deadlock during change of CS address

---
 fs/fuse/kio/pcs/pcs_map.c | 25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/fs/fuse/kio/pcs/pcs_map.c b/fs/fuse/kio/pcs/pcs_map.c
index 49af8b961478..650da306b055 100644
--- a/fs/fuse/kio/pcs/pcs_map.c
+++ b/fs/fuse/kio/pcs/pcs_map.c
@@ -696,12 +696,14 @@ static void map_remote_error(struct pcs_map_entry *m , int error, u64 offender)
 
 void pcs_map_notify_addr_change(struct pcs_cs * cs)
 {
+	struct pcs_cs_list *cs_list, *prev_cs_list = NULL;
 	struct pcs_cs_link * csl;
 
 	assert_spin_locked(&cs->lock);
+	cs->use_count++; /* Prohibit to isolate cs */
+
 	rcu_read_lock();
 	list_for_each_entry(csl, &cs->map_list, link) {
-		struct pcs_cs_list *cs_list;
 		struct pcs_map_entry *m;
 
 		if (csl->addr_serno == cs->addr_serno)
@@ -710,6 +712,18 @@ void pcs_map_notify_addr_change(struct pcs_cs * cs)
 		m = rcu_dereference(cs_list->map);
 		if (!m)
 			continue;
+		/*
+		 * Get cs_list to prevent its destruction and unlinking from cs.
+		 * Thus, csl stays on the place in the list. New elements may be
+		 * added to head of cs->map_list, so our caller must care, they
+		 * will contain correct rpc addr.
+		 */
+		cslist_get(cs_list);
+		spin_unlock(&cs->lock);
+
+		if (prev_cs_list)
+			cslist_put(prev_cs_list);
+		prev_cs_list = cs_list;
 
 		spin_lock(&m->lock);
 		if ((m->state & PCS_MAP_DEAD) || m->cs_list != cs_list)
@@ -724,8 +738,17 @@ void pcs_map_notify_addr_change(struct pcs_cs * cs)
 		map_remote_error_nolock(m, PCS_ERR_CSD_STALE_MAP, cs->id.val);
 unlock:
 		spin_unlock(&m->lock);
+		spin_lock(&cs->lock);
+	}
+
+	if (prev_cs_list) {
+		spin_unlock(&cs->lock);
+		cslist_put(prev_cs_list);
+		spin_lock(&cs->lock);
 	}
 	rcu_read_unlock();
+	cs->use_count--;
+	BUG_ON(cs->is_dead);
 }
 
 noinline static void pcs_ireq_queue_fail(struct list_head *queue, int error)
_______________________________________________
Devel mailing list
Devel@openvz.org
https://lists.openvz.org/mailman/listinfo/devel