Please, consider to prepare a ReadyKernel patch for it.
https://readykernel.com/ -- Best regards, Konstantin Khorenko, Virtuozzo Linux Kernel Team On 08/08/2017 05:21 AM, Maxim Patlasov wrote:
Doing fast-path, ploop needs to know iblock and corresponding delta for io. iblock is stored in page_address(m->page))[idx] while delta lookup will use m->levels[idx]. Hence both reads and writes must access these bits atomically. Otherwise, a reader might use older page[idx] and newer levels[idx] or vice-versa. Both would be mistake. Fast-path is already acquires plo->lock, so the patch adds lock/unlock only to writer: map_wb_complete. Thanks to Evgenii Shatokhin for discovery and analysis: Report: [rh] Detected a data race on the the memory block at ffff880080f7a956, size 1: [rh] ----- Thread #1, comm: ploop55307 ----- [rh] IP: [<ffffffffa052b259>] ploop_index_wb_complete+0x2d9/0x850 [ploop] [<ffffffffa0525140>] ploop_req_state_process+0x4c0/0xc30 [ploop] [<ffffffffa0525af0>] ploop_thread+0x240/0x560 [ploop] [<ffffffff810b27df>] kthread+0xcf/0xe0 [<ffffffff81693218>] ret_from_fork+0x58/0x90 [rh] ----- Thread #2, comm: gcc ----- [rh] IP: right before [<ffffffffa0529cc5>] ploop_fastmap+0x125/0x160 [ploop] <handling of the hardware BP, skipped> [<ffffffffa051e738>] ploop_fast_lookup+0x48/0xf0 [ploop] [<ffffffffa0522bfd>] ploop_make_request+0x83d/0xc90 [ploop] [<ffffffff812e93e9>] generic_make_request+0x109/0x1e0 [<ffffffff812e9537>] submit_bio+0x77/0x1c0 [<ffffffffa034a605>] ext4_io_submit+0x25/0x50 [ext4] [<ffffffffa03467f3>] ext4_writepages+0x8a3/0xd60 [ext4] [<ffffffff8119edee>] do_writepages+0x1e/0x40 [<ffffffff81192382>] __filemap_fdatawrite_range+0x62/0x80 [<ffffffff8119246c>] filemap_flush+0x1c/0x20 [<ffffffffa0343d0c>] ext4_alloc_da_blocks+0x2c/0x70 [ext4] [<ffffffffa03525aa>] ext4_rename+0x62a/0x870 [ext4] [<ffffffffa035280d>] ext4_rename2+0x1d/0x40 [ext4] [<ffffffff812250b0>] vfs_rename+0x210/0x850 [<ffffffff81229933>] SYSC_renameat2+0x4d3/0x570 [<ffffffff8122a78e>] SyS_renameat2+0xe/0x10 [<ffffffff8122a7ce>] SyS_rename+0x1e/0x20 [<ffffffff816932c9>] system_call_fastpath+0x16/0x1b The first thread, "ploop55307", was executing map_wb_complete() function, drivers/block/ploop/map.c:1152, inlined into ploop_index_wb_complete(): 1258 if (m->levels) { 1259 m->levels[idx] = top_delta->level; // data are written here The second thread, "gcc", was executing ploop_fastmap() function, drivers/block/ploop/map.c:239: 238 if (m->levels) 239 return m->levels[idx]; // data are read here Race on m->levels[idx], I suppose. Signed-off-by: Maxim Patlasov <mpatla...@virtuozzo.com> --- drivers/block/ploop/map.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/drivers/block/ploop/map.c b/drivers/block/ploop/map.c index b6f2243..78ae341 100644 --- a/drivers/block/ploop/map.c +++ b/drivers/block/ploop/map.c @@ -1142,6 +1142,7 @@ static void map_wb_complete(struct map_node * m, int err) idx = (preq->req_cluster + PLOOP_MAP_OFFSET) & (INDEX_PER_PAGE - 1); if (!err) { struct ploop_request *pr = preq; + int do_levels_update = 0; if (unlikely(test_bit(PLOOP_REQ_ZERO, &preq->state))) { BUG_ON (list_empty(&preq->delay_list)); @@ -1150,6 +1151,11 @@ static void map_wb_complete(struct map_node * m, int err) list); } + if (m->levels && m->levels[idx] != top_delta->level) { + spin_lock_irq(&plo->lock); + do_levels_update = 1; + } + if (unlikely(test_bit(PLOOP_REQ_RELOC_A, &preq->state) || test_bit(PLOOP_REQ_ZERO, &preq->state))) map_idx_swap(m, idx, &pr->iblock, @@ -1159,7 +1165,10 @@ static void map_wb_complete(struct map_node * m, int err) pr->iblock << ploop_map_log(plo); if (m->levels) { - m->levels[idx] = top_delta->level; + if (do_levels_update) { + m->levels[idx] = top_delta->level; + spin_unlock_irq(&plo->lock); + } } else { BUG_ON(MAP_LEVEL(m) != top_delta->level); } .
_______________________________________________ Devel mailing list Devel@openvz.org https://lists.openvz.org/mailman/listinfo/devel