On 25.04.25 13:27, Pavel Tikhomirov wrote:
On 4/25/25 18:12, Alexander Atanasov wrote:
On 25.04.25 13:05, Pavel Tikhomirov wrote:
On 4/25/25 13:35, Alexander Atanasov wrote:
Rework ploop_advance_local_after_bat_wb so that if we have delayed
update to the same page is really executed and don't endup in
wait_list.
^ missing "it"
Due to releasing and taking locks there was a window that a new update
could go into the wait_list. Both due to WRITEBACK bit cleared after
^ "both" does not sound right, maybe
you meant "also"?
We have two cases that a pio can get there for the same reason so both
not also.
What cases when? It's not clear from the commit message.
Case 1:
>>>> Rework ploop_advance_local_after_bat_wb so that if we have delayed
>>>> update to the same page is really executed and don't endup in
>>>> wait_list.
Case 2:
>>>> Due to releasing and taking locks there was a window that a new update
>>>> could go into the wait_list. Both due to WRITEBACK bit cleared after
And yes it must be "Both are" to be grammar compliant.
retaking the lock.
To fix this take all pending pios and clear the bit under lock,
using the same schema as in other code that checks the bit.
After releasing the lock displatch delayed pios.
^ excess 'l'
Reviewed-by: Pavel Tikhomirov <ptikhomi...@virtuozzo.com>
I would like explanation more specific:
Only place we can add to md_page->wait_llist is in
ploop_delay_if_md_busy if md_page->md_status has MD_UPDATING or
MD_WRITEBACK bits (under both md_lock and bat_lock). Previously we
first acquired elements from wait_llist and only then cleared
MD_WRITEBACK bit, thus having a race window where wait_llist can
receive new delayed elements after acquiring elements from it and
before clearing MD_WRITEBACK and those new elements will never be
processed (until the next writeback to the same metadata page).
I explain what problem we faced and how we fixed it, Why do i have to
explain how code works - it is there to be read if one needs the details?
Putting bit-clear and elements-acquire under both locks will
guarantee it is atomicity relative to ploop_delay_if_md_busy.
One alternative case is when we delay due to MD_UPDATING bit set, and
there is no MD_WRITEBACK at the time. This is also covered though
more trickily. In this case the one who set MD_UPDATING bit, will
call ploop_add_dirty_for_wb, which will be processed later in
do_ploop_run_work -> ploop_submit_metadata_writeback and will
eventually lead to MD_WRITEBACK bit set and writeback submitted via
ploop_index_wb_submit. So it is guaranteed that wait_llist will
always be handled in this case.
This is again explaining how code works not what the patch does.
Strongly disagree. I believe the whole locking scheme should be
explained here, as it is the thing that we are fixing. There is no
description of it anywhere. But we should be able to verify that what is
written in code corresponds to how we intend it to work.
There is a description in the previous commits how locking works.
Here it is about how we fix a specific place. No need to repeat it.
Documentation yes, there is no documentation we have to use the source
until there is documentation.
https://virtuozzo.atlassian.net/browse/VSTOR-102677
https://virtuozzo.atlassian.net/browse/VSTOR-103768
Signed-off-by: Alexander Atanasov <alexander.atana...@virtuozzo.com>
---
drivers/md/dm-ploop-map.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/md/dm-ploop-map.c b/drivers/md/dm-ploop-map.c
index 76544cdb476e..8ef0426e7468 100644
--- a/drivers/md/dm-ploop-map.c
+++ b/drivers/md/dm-ploop-map.c
@@ -884,11 +884,15 @@ static void
ploop_advance_local_after_bat_wb(struct ploop *ploop,
#endif
WARN_ON_ONCE(!test_bit(MD_WRITEBACK, &md->status));
+
+ wait_llist_pending = llist_del_all(&md->wait_llist);
+
+ /* Clear flag at the end when all processing related is done */
+ clear_bit(MD_WRITEBACK, &md->status);
spin_unlock(&md->md_lock);
spin_unlock_irqrestore(&ploop->bat_lock, flags);
kunmap_local(dst_clu);
- wait_llist_pending = llist_del_all(&md->wait_llist);
if (wait_llist_pending) {
llist_for_each_safe(pos, t, wait_llist_pending) {
pio = list_entry((struct list_head *)pos,
typeof(*pio), list);
@@ -900,10 +904,6 @@ static void
ploop_advance_local_after_bat_wb(struct ploop *ploop,
if (!list_empty(&list))
ploop_dispatch_pios(ploop, NULL, &list);
- spin_lock_irqsave(&ploop->bat_lock, flags);
- /* Clear flag at the end when all processing related is done */
- clear_bit(MD_WRITEBACK, &md->status);
- spin_unlock_irqrestore(&ploop->bat_lock, flags);
}
static void ploop_free_piwb(struct ploop_index_wb *piwb)
--
Regards,
Alexander Atanasov
_______________________________________________
Devel mailing list
Devel@openvz.org
https://lists.openvz.org/mailman/listinfo/devel