Re: [dm-devel] [PATCH -next v3 00/25] md: synchronize io with array reconfiguration

2023-10-06 Thread Yu Kuai
Hi, On 2023/10/07 10:40, Song Liu wrote: Can you take a look at this new cover letter? I don't have time right now to look into all the details, but it looks great at first glance. We can still edit it a little bit when applying the patchset, but that may not be necessary. Yeah, it's not

Re: [dm-devel] [PATCH -next v3 00/25] md: synchronize io with array reconfiguration

2023-10-06 Thread Yu Kuai
Hi, On 2023/10/05 11:55, Song Liu wrote: On Wed, Oct 4, 2023 at 8:42 PM Yu Kuai wrote: Hi, On 2023/09/29 3:15, Song Liu wrote: Hi Kuai, Thanks for the patchset! A few high level questions/suggestions: Thanks a lot for these! 1. This is a big change that needs a lot of explanation. While you

Re: [dm-devel] [PATCH -next v3 00/25] md: synchronize io with array reconfiguration

2023-10-04 Thread Yu Kuai
, merge patch 10-12 for md/raid5-cache, and 13-16 for md/raid5). Thanks, Kuai Thanks again for your hard work into this! Song On Wed, Sep 27, 2023 at 11:22 PM Yu Kuai wrote: From: Yu Kuai [...]

Re: [dm-devel] [PATCH -next v3 03/25] md: add new helpers to suspend/resume array

2023-10-04 Thread Yu Kuai
Hi, On 2023/09/29 2:45, Song Liu wrote: On Wed, Sep 27, 2023 at 11:22 PM Yu Kuai wrote: From: Yu Kuai Advantages for new apis: - reconfig_mutex is not required; - the weird logic that suspend array hold 'reconfig_mutex' for mddev_check_recovery() to update superblock is not needed

[dm-devel] [PATCH -next v3 07/25] md: use new apis to suspend array for serialize_policy_store()

2023-09-28 Thread Yu Kuai
From: Yu Kuai Convert to use new apis, the old apis will be removed eventually. This is not a hot path, so performance is not a concern. Signed-off-by: Yu Kuai --- drivers/md/md.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/md/md.c b/drivers/md/md.c index

[dm-devel] [PATCH -next v3 08/25] md/dm-raid: use new apis to suspend array

2023-09-28 Thread Yu Kuai
From: Yu Kuai Convert to use new apis, the old apis will be removed eventually. These are not hot paths, so performance is not a concern. Signed-off-by: Yu Kuai --- drivers/md/dm-raid.c | 12 1 file changed, 4 insertions(+), 8 deletions(-) diff --git a/drivers/md/dm-raid.c b

[dm-devel] [PATCH -next v3 04/25] md: add new helpers to suspend/resume and lock/unlock array

2023-09-28 Thread Yu Kuai
From: Yu Kuai The new helpers suspend the array first and then lock the array. Prepare to refactor from: mddev_lock/lock_nointr mddev_suspend ... mddev_resume mddev_unlock With: mddev_suspend_and_lock/lock_nointr ... mddev_unlock_and_resume After all the use cases are refactored, mddev_suspend
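
The ordering described in this patch (suspend first, then lock, and release in the reverse order) can be illustrated with a small userspace sketch. Everything here is hypothetical: `struct toy_array` and the `toy_*` functions are stand-ins invented for illustration and are not the kernel's `struct mddev` or its helpers; the real helpers wait for in-flight io and take a real mutex.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an md array; not the kernel's struct mddev. */
struct toy_array {
    int suspended;  /* nesting count of suspend requests */
    bool locked;    /* models 'reconfig_mutex' being held */
};

/* Toy suspend: only bumps a counter (the real one blocks new io). */
static void toy_suspend(struct toy_array *a)
{
    a->suspended++;
}

static void toy_resume(struct toy_array *a)
{
    assert(a->suspended > 0);
    a->suspended--;
}

/* Suspend first, then lock; undo the suspend if the lock cannot be taken. */
static int toy_suspend_and_lock(struct toy_array *a)
{
    toy_suspend(a);
    if (a->locked) {        /* toy model of a failed lock attempt */
        toy_resume(a);
        return -1;
    }
    a->locked = true;
    return 0;
}

/* Release in reverse order: unlock, then resume. */
static void toy_unlock_and_resume(struct toy_array *a)
{
    assert(a->locked);
    a->locked = false;
    toy_resume(a);
}
```

The point of the combined helper in this sketch is that a caller cannot end up holding the lock on an array that is not suspended, and a failed lock attempt leaves the suspend count unchanged.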

[dm-devel] [PATCH -next v3 24/25] md: remove old apis to suspend the array

2023-09-28 Thread Yu Kuai
From: Yu Kuai Now that mddev_suspend() and mddev_resume() are not used anywhere, remove them, and remove 'MD_ALLOW_SB_UPDATE' and 'MD_UPDATING_SB' as well. Signed-off-by: Yu Kuai --- drivers/md/md.c | 82 ++--- drivers/md/md.h | 8 - 2 files

[dm-devel] [PATCH -next v3 20/25] md: use new apis to suspend array before mddev_create/destroy_serial_pool

2023-09-28 Thread Yu Kuai
From: Yu Kuai mddev_create/destroy_serial_pool() will be called from several places where mddev_suspend() will be called later. Prepare to remove the mddev_suspend() from mddev_create/destroy_serial_pool(). Signed-off-by: Yu Kuai --- drivers/md/md-autodetect.c | 4 ++-- drivers/md/md

[dm-devel] [PATCH -next v3 05/25] md: use new apis to suspend array for suspend_lo/hi_store()

2023-09-28 Thread Yu Kuai
From: Yu Kuai Convert to use new apis, the old apis will be removed eventually. These are not hot paths, so performance is not a concern. Signed-off-by: Yu Kuai --- drivers/md/md.c | 12 1 file changed, 4 insertions(+), 8 deletions(-) diff --git a/drivers/md/md.c b/drivers/md

[dm-devel] [PATCH -next v3 03/25] md: add new helpers to suspend/resume array

2023-09-28 Thread Yu Kuai
From: Yu Kuai Advantages for new apis: - reconfig_mutex is not required; - the weird logic that suspend array hold 'reconfig_mutex' for mddev_check_recovery() to update superblock is not needed; - the special handling, 'pers->prepare_suspend', for raid456 is not needed; - It's s

[dm-devel] [PATCH -next v3 02/25] md: replace is_md_suspended() with 'mddev->suspended' in md_check_recovery()

2023-09-28 Thread Yu Kuai
From: Yu Kuai Prepare to cleanup pers->prepare_suspend(), which is used to fix a deadlock in raid456 by returning error for io that is waiting for reshape to make progress in mddev_suspend(). This change will allow reshape to make progress while waiting for io to be done in mddev_susp

[dm-devel] [PATCH -next v3 06/25] md: use new apis to suspend array for level_store()

2023-09-28 Thread Yu Kuai
From: Yu Kuai Convert to use new apis, the old apis will be removed eventually. This is not a hot path, so performance is not a concern. Signed-off-by: Yu Kuai --- drivers/md/md.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/md/md.c b/drivers/md/md.c index

[dm-devel] [PATCH -next v3 17/25] md/raid5: replace suspend with quiesce() callback

2023-09-28 Thread Yu Kuai
From: Yu Kuai raid5 is the only personality to suspend array in check_reshape() and start_reshape() callback, suspend and quiesce() callback can both wait for all normal io to be done, and prevent new io to be dispatched, the difference is that suspend is implemented in common layer, and quiesce

[dm-devel] [PATCH -next v3 11/25] md/raid5-cache: use new apis to suspend array for r5c_disable_writeback_async()

2023-09-28 Thread Yu Kuai
From: Yu Kuai Convert to use new apis, the old apis will be removed eventually. Signed-off-by: Yu Kuai --- drivers/md/raid5-cache.c | 13 ++--- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c index 889bba60d6ff

[dm-devel] [PATCH -next v3 10/25] md/raid5-cache: use READ_ONCE/WRITE_ONCE for 'conf->log'

2023-09-28 Thread Yu Kuai
From: Yu Kuai 'conf->log' is set with 'reconfig_mutex' grabbed, however, readers are not protected, hence use READ_ONCE/WRITE_ONCE to prevent reading abnormal value. Signed-off-by: Yu Kuai --- drivers/md/raid5-cache.c | 47 +--- 1 file changed, 25 inserti

[dm-devel] [PATCH -next v3 09/25] md/md-bitmap: use new apis to suspend array for location_store()

2023-09-28 Thread Yu Kuai
From: Yu Kuai Convert to use new apis, the old apis will be removed eventually. This is not a hot path, so performance is not a concern. Signed-off-by: Yu Kuai --- drivers/md/md-bitmap.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/md/md-bitmap.c b/drivers

[dm-devel] [PATCH -next v3 14/25] md/raid5: use new apis to suspend array for raid5_store_skip_copy()

2023-09-28 Thread Yu Kuai
From: Yu Kuai Convert to use new apis, the old apis will be removed eventually. This is not a hot path, so performance is not a concern. Signed-off-by: Yu Kuai --- drivers/md/raid5.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/md/raid5.c b/drivers/md/raid5

[dm-devel] [PATCH -next v3 01/25] md: use READ_ONCE/WRITE_ONCE for 'suspend_lo' and 'suspend_hi'

2023-09-28 Thread Yu Kuai
From: Yu Kuai Because reading 'suspend_lo' and 'suspend_hi' from md_handle_request() is not protected, use READ_ONCE/WRITE_ONCE to prevent reading abnormal value. Signed-off-by: Yu Kuai --- drivers/md/md.c | 16 +--- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git
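
As a rough illustration of why lockless readers use READ_ONCE/WRITE_ONCE, here is a userspace sketch. The macros below are simplified approximations written with the GNU C `__typeof__` extension, not the kernel's definitions, and `struct toy_mddev`, `toy_set_suspend_range()` and `toy_range_suspended()` are hypothetical names invented for this example; the overlap test is only loosely modeled on a suspended-range check.

```c
#include <stdbool.h>

/* Simplified approximations; the kernel's own macros are more elaborate. */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

typedef unsigned long long sector_t;  /* assumption: 64-bit sector numbers */

/* Hypothetical stand-in holding only the two fields named in the patch. */
struct toy_mddev {
    sector_t suspend_lo;
    sector_t suspend_hi;
};

/* Writer side: each bound is stored with a single volatile access. */
static void toy_set_suspend_range(struct toy_mddev *m, sector_t lo, sector_t hi)
{
    WRITE_ONCE(m->suspend_lo, lo);
    WRITE_ONCE(m->suspend_hi, hi);
}

/*
 * Reader side: load each bound exactly once into a local, so the compiler
 * cannot re-read the field and observe two different values within one check.
 */
static bool toy_range_suspended(struct toy_mddev *m, sector_t start, sector_t end)
{
    sector_t lo = READ_ONCE(m->suspend_lo);
    sector_t hi = READ_ONCE(m->suspend_hi);

    return start < hi && end > lo;  /* [start, end) overlaps [lo, hi) */
}
```

Note that this only constrains how the compiler accesses each field individually; it does not make the pair of bounds update atomically as a unit.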

[dm-devel] [PATCH -next v3 25/25] md: rename __mddev_suspend/resume() back to mddev_suspend/resume()

2023-09-28 Thread Yu Kuai
From: Yu Kuai Now that the old apis are removed, __mddev_suspend/resume() can be renamed to their original names. This is done by: sed -i "s/__mddev_suspend/mddev_suspend/g" *.[ch] sed -i "s/__mddev_resume/mddev_resume/g" *.[ch] Signed-off-by: Yu Kuai --- drivers/m

[dm-devel] [PATCH -next v3 22/25] md/md-linear: cleanup linear_add()

2023-09-28 Thread Yu Kuai
From: Yu Kuai Now that the caller already suspends the array, there is no need to suspend the array in linear_add(). Note that mddev_suspend/resume() is not used anymore. Signed-off-by: Yu Kuai --- drivers/md/md-linear.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/md/md-linear.c b

[dm-devel] [PATCH -next v3 19/25] md: use new apis to suspend array for adding/removing rdev from state_store()

2023-09-28 Thread Yu Kuai
From: Yu Kuai User can write 'remove' and 're-add' to trigger array reconfiguration through sysfs, suspend array in this case so that io won't run concurrently with array reconfiguration. And now that all the callers of add_bound_rdev() already suspend the array, remove mddev_suspend/resume() from

[dm-devel] [PATCH -next v3 16/25] md/raid5: use new apis to suspend array for raid5_change_consistency_policy()

2023-09-28 Thread Yu Kuai
From: Yu Kuai Convert to use new apis, the old apis will be removed eventually. This is not a hot path, so performance is not a concern. Signed-off-by: Yu Kuai --- drivers/md/raid5.c | 19 ++- 1 file changed, 6 insertions(+), 13 deletions(-) diff --git a/drivers/md/raid5.c b

[dm-devel] [PATCH -next v3 00/25] md: synchronize io with array reconfiguration

2023-09-28 Thread Yu Kuai
From: Yu Kuai Changes in v3: - rebase with latest md-next; - remove patch 2 from v2, and replace it with a new patch; - fix a null-ptr-dereference in rdev_attr_store() that mddev is used before checking; - merge patch 20-22 from v1 into one patch; - mddev_lock() used to be called first

[dm-devel] [PATCH -next v3 15/25] md/raid5: use new apis to suspend array for raid5_store_group_thread_cnt()

2023-09-28 Thread Yu Kuai
From: Yu Kuai Convert to use new apis, the old apis will be removed eventually. This is not a hot path, so performance is not a concern. Signed-off-by: Yu Kuai --- drivers/md/raid5.c | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/drivers/md/raid5.c b/drivers/md

[dm-devel] [PATCH -next v3 18/25] md: use new apis to suspend array for ioctls involed array reconfiguration

2023-09-28 Thread Yu Kuai
From: Yu Kuai 'reconfig_mutex' will be grabbed before these ioctls, suspend array before holding the lock, so that io won't run concurrently with array reconfiguration through ioctls. This is not a hot path, so performance is not a concern. Signed-off-by: Yu Kuai --- drivers/md/md.c | 30

[dm-devel] [PATCH -next v3 13/25] md/raid5: use new apis to suspend array for raid5_store_stripe_size()

2023-09-28 Thread Yu Kuai
From: Yu Kuai Convert to use new apis, the old apis will be removed eventually. This is not a hot path, so performance is not a concern. Signed-off-by: Yu Kuai --- drivers/md/raid5.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/md/raid5.c b/drivers/md/raid5

[dm-devel] [PATCH -next v3 21/25] md: cleanup mddev_create/destroy_serial_pool()

2023-09-28 Thread Yu Kuai
From: Yu Kuai Now that except for stopping the array, all the callers already suspend the array, there is no need to suspend anymore, hence remove the second parameter. Signed-off-by: Yu Kuai --- drivers/md/md-bitmap.c | 8 drivers/md/md.c| 33

[dm-devel] [PATCH -next v3 12/25] md/raid5-cache: use new apis to suspend array for r5c_journal_mode_store()

2023-09-28 Thread Yu Kuai
From: Yu Kuai r5c_journal_mode_set() will suspend the array and it has only 2 callers; the other caller, raid_ctl(), already suspends the array with the new apis. This is not a hot path, so performance is not a concern. Signed-off-by: Yu Kuai --- drivers/md/raid5-cache.c | 6 ++ 1 file changed, 2

[dm-devel] [PATCH -next v3 23/25] md: suspend array in md_start_sync() if array need reconfiguration

2023-09-28 Thread Yu Kuai
From: Yu Kuai So that io won't run concurrently with array reconfiguration, and it's safe to suspend the array directly because normal io won't rely on md_start_sync(). Signed-off-by: Yu Kuai --- drivers/md/md.c | 11 --- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/drivers

Re: [dm-devel] fstrim on raid1 LV with writemostly PV leads to system freeze

2023-09-26 Thread Yu Kuai
Hi, On 2023/09/26 21:12, Kirill Kirilenko wrote: On 26.09.2023 06:28 +0300, Yu Kuai wrote: I still don't quite understand what you mean by 'kernel freeze'; this patch indeed fixes a problem where a discard bio is treated as a normal write bio and gets split. Can you explain more by how do you judge

Re: [dm-devel] fstrim on raid1 LV with writemostly PV leads to system freeze

2023-09-25 Thread Yu Kuai
Hi, On 2023/09/26 7:59, Kirill Kirilenko wrote: On 25.09.2023 05:58 +0300, Yu Kuai wrote: Roman and Kirill, can you test the following patch? Thanks, Kuai diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c index 4b30a1742162..4963f864ef99 100644 --- a/drivers/md/raid1.c +++ b/drivers/md/raid1

Re: [dm-devel] [PATCH -next v2 00/28] md: synchronize io with array reconfiguration

2023-09-25 Thread Yu Kuai
00 DR6: fffe0ff0 DR7: 0400 [ 173.250395] Kernel panic - not syncing: Fatal exception [ 173.251612] Kernel Offset: disabled [ 173.252133] ---[ end Kernel panic - not syncing: Fatal exception ]--- On Sun, Aug 27, 2023 at 7:04 PM Yu Kuai wrote: From: Yu Kuai Changes in v2

Re: [dm-devel] fstrim on raid1 LV with writemostly PV leads to system freeze

2023-09-24 Thread Yu Kuai
Hi, On 2023/09/22 6:03, Roman Mamedov wrote: On Thu, 21 Sep 2023 17:45:24 -0400 Mike Snitzer wrote: I just verified that 6.5.0 does have this DM core fix (needed to prevent excessive splitting of discard IO.. which could cause fstrim to take longer for a DM device), but again 6.5.0 has this fix

Re: [dm-devel] [PATCH -next v2 02/28] md: use 'mddev->suspended' for is_md_suspended()

2023-09-24 Thread Yu Kuai
Hi, On 2023/09/20 16:46, Xiao Ni wrote: On Mon, Aug 28, 2023 at 10:04 AM Yu Kuai wrote: From: Yu Kuai 'pers->prepare_suspend' is introduced to prevent a deadlock for raid456, this change prepares to clean this up in later patches while refactoring mddev_suspend(). Specifically allow resh

Re: [dm-devel] [PATCH -next v2 01/28] md: use READ_ONCE/WRITE_ONCE for 'suspend_lo' and 'suspend_hi'

2023-09-24 Thread Yu Kuai
Hi, On 2023/09/14 10:53, Xiao Ni wrote: On Mon, Aug 28, 2023 at 10:04 AM Yu Kuai wrote: From: Yu Kuai Because reading 'suspend_lo' and 'suspend_hi' from md_handle_request() is not protected, use READ_ONCE/WRITE_ONCE to prevent reading abnormal value. Hi Kuai If we don't use READ_ONCE

[dm-devel] [PATCH -next v2 24/28] md: suspend array in md_start_sync() if array need reconfiguration

2023-08-27 Thread Yu Kuai
From: Yu Kuai So that io won't run concurrently with array reconfiguration, and it's safe to suspend the array directly because normal io won't rely on md_start_sync(). Signed-off-by: Yu Kuai --- drivers/md/md.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/drivers/md/md.c b

[dm-devel] [PATCH -next v2 23/28] md: use new apis to suspend array in backlog_store()

2023-08-27 Thread Yu Kuai
From: Yu Kuai mddev_create/destroy_serial_pool() will be called from backlog_store(), and mddev_suspend() will be called later. Prepare to remove the mddev_suspend() from mddev_create/destroy_serial_pool(). Signed-off-by: Yu Kuai --- drivers/md/md-bitmap.c | 8 1 file changed, 4

[dm-devel] [PATCH -next v2 25/28] md: cleanup mddev_create/destroy_serial_pool()

2023-08-27 Thread Yu Kuai
From: Yu Kuai Now that except for stopping the array, all the callers already suspend the array, there is no need to suspend anymore, hence remove the second parameter. Signed-off-by: Yu Kuai --- drivers/md/md-bitmap.c | 8 drivers/md/md.c| 33

[dm-devel] [PATCH -next v2 28/28] md: rename __mddev_suspend/resume() back to mddev_suspend/resume()

2023-08-27 Thread Yu Kuai
From: Yu Kuai Now that the old apis are removed, __mddev_suspend/resume() can be renamed to their original names. This is done by: sed -i "s/__mddev_suspend/mddev_suspend/g" *.[ch] sed -i "s/__mddev_resume/mddev_resume/g" *.[ch] Signed-off-by: Yu Kuai --- drivers/m

[dm-devel] [PATCH -next v2 26/28] md/md-linear: cleanup linear_add()

2023-08-27 Thread Yu Kuai
From: Yu Kuai Now that the caller already suspends the array, there is no need to suspend the array in linear_add(). Note that mddev_suspend/resume() is not used anymore. Signed-off-by: Yu Kuai --- drivers/md/md-linear.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/md/md-linear.c b

[dm-devel] [PATCH -next v2 05/28] md: use new apis to suspend array for suspend_lo/hi_store()

2023-08-27 Thread Yu Kuai
From: Yu Kuai Convert to use new apis, the old apis will be removed eventually. Signed-off-by: Yu Kuai --- drivers/md/md.c | 18 -- 1 file changed, 4 insertions(+), 14 deletions(-) diff --git a/drivers/md/md.c b/drivers/md/md.c index 6236e2e395c1..84d077110174 100644

[dm-devel] [PATCH -next v2 20/28] md: use new apis to suspend array for adding/removing rdev from state_store()

2023-08-27 Thread Yu Kuai
From: Yu Kuai User can write 'remove' and 're-add' to trigger array reconfiguration through sysfs, suspend array in this case so that io won't run concurrently with array reconfiguration. Signed-off-by: Yu Kuai --- drivers/md/md.c | 18 -- 1 file changed, 12 insertions(+), 6

[dm-devel] [PATCH -next v2 19/28] md: use new apis to suspend array for ioctls involed array reconfiguration

2023-08-27 Thread Yu Kuai
From: Yu Kuai 'reconfig_mutex' will be grabbed before these ioctls, suspend array before holding the lock, so that io won't run concurrently with array reconfiguration through ioctls. This is not a hot path, so performance is not a concern. Signed-off-by: Yu Kuai --- drivers/md/md.c | 29

[dm-devel] [PATCH -next v2 15/28] md/raid5: use new apis to suspend array for raid5_store_group_thread_cnt()

2023-08-27 Thread Yu Kuai
From: Yu Kuai Convert to use new apis, the old apis will be removed eventually. This is not a hot path, so performance is not a concern. Signed-off-by: Yu Kuai --- drivers/md/raid5.c | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/drivers/md/raid5.c b/drivers/md

[dm-devel] [PATCH -next v2 21/28] md: use new apis to suspend array for bind_rdev_to_array()

2023-08-27 Thread Yu Kuai
From: Yu Kuai mddev_create_serial_pool() will be called from bind_rdev_to_array(), and mddev_suspend() will be called if serial pool is used. Prepare to remove the mddev_suspend() from mddev_create_serial_pool(). Signed-off-by: Yu Kuai --- drivers/md/md-autodetect.c | 4 ++-- drivers/md

[dm-devel] [PATCH -next v2 27/28] md: remove old apis to suspend the array

2023-08-27 Thread Yu Kuai
From: Yu Kuai Now that mddev_suspend() and mddev_resume() are not used anywhere, remove them, and remove 'MD_ALLOW_SB_UPDATE' and 'MD_UPDATING_SB' as well. Signed-off-by: Yu Kuai --- drivers/md/md.c | 82 ++--- drivers/md/md.h | 8 - 2 files

[dm-devel] [PATCH -next v2 17/28] md/raid5: replace suspend with quiesce() callback

2023-08-27 Thread Yu Kuai
From: Yu Kuai raid5 is the only personality to suspend array in check_reshape() and start_reshape() callback, suspend and quiesce() callback can both wait for all normal io to be done, and prevent new io to be dispatched, the difference is that suspend is implemented in common layer, and quiesce

[dm-devel] [PATCH -next v2 12/28] md/raid5-cache: use new apis to suspend array for r5c_journal_mode_store()

2023-08-27 Thread Yu Kuai
From: Yu Kuai r5c_journal_mode_set() will suspend the array and it has only 2 callers; the other caller, raid_ctl(), already suspends the array with the new apis. This is not a hot path, so performance is not a concern. Signed-off-by: Yu Kuai --- drivers/md/raid5-cache.c | 6 ++ 1 file changed, 2

[dm-devel] [PATCH -next v2 22/28] md: use new apis to suspend array related to serial pool in state_store()

2023-08-27 Thread Yu Kuai
From: Yu Kuai mddev_create/destroy_serial_pool() will be called from state_store() if user write 'writemostly'/'-writemostly', and mddev_suspend() will be called later. Prepare to remove the mddev_suspend() from mddev_create/destroy_serial_pool(). Signed-off-by: Yu Kuai --- drivers/md/md.c

[dm-devel] [PATCH -next v2 18/28] md: quiesce before md_kick_rdev_from_array() for md-cluster

2023-08-27 Thread Yu Kuai
From: Yu Kuai md_kick_rdev_from_array() can be called from md_check_recovery() and md_reload_sb() for md-cluster, it's very complicated to use new apis to suspend the array before holding 'reconfig_mutex' in this case. Fortunately, md-cluster is only supported for raid1 and raid10

[dm-devel] [PATCH -next v2 13/28] md/raid5: use new apis to suspend array for raid5_store_stripe_size()

2023-08-27 Thread Yu Kuai
From: Yu Kuai Convert to use new apis, the old apis will be removed eventually. This is not a hot path, so performance is not a concern. Signed-off-by: Yu Kuai --- drivers/md/raid5.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/md/raid5.c b/drivers/md/raid5

[dm-devel] [PATCH -next v2 16/28] md/raid5: use new apis to suspend array for raid5_change_consistency_policy()

2023-08-27 Thread Yu Kuai
From: Yu Kuai Convert to use new apis, the old apis will be removed eventually. This is not a hot path, so performance is not a concern. Signed-off-by: Yu Kuai --- drivers/md/raid5.c | 19 ++- 1 file changed, 6 insertions(+), 13 deletions(-) diff --git a/drivers/md/raid5.c b

[dm-devel] [PATCH -next v2 04/28] md: add new helpers to suspend/resume and lock/unlock array

2023-08-27 Thread Yu Kuai
From: Yu Kuai The new helpers suspend the array first and then lock the array. Prepare to refactor from: mddev_lock/trylock/lock_nointr mddev_suspend ... mddev_resume mddev_unlock With: mddev_suspend_and_lock/trylock/lock_nointr ... mddev_unlock_and_resume After all the use cases

[dm-devel] [PATCH -next v2 14/28] md/raid5: use new apis to suspend array for raid5_store_skip_copy()

2023-08-27 Thread Yu Kuai
From: Yu Kuai Convert to use new apis, the old apis will be removed eventually. This is not a hot path, so performance is not a concern. Signed-off-by: Yu Kuai --- drivers/md/raid5.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/md/raid5.c b/drivers/md/raid5

[dm-devel] [PATCH -next v2 11/28] md/raid5-cache: use new apis to suspend array for r5c_disable_writeback_async()

2023-08-27 Thread Yu Kuai
From: Yu Kuai Convert to use new apis, the old apis will be removed eventually. Signed-off-by: Yu Kuai --- drivers/md/raid5-cache.c | 12 +--- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c index 889bba60d6ff

[dm-devel] [PATCH -next v2 10/28] md/raid5-cache: use READ_ONCE/WRITE_ONCE for 'conf->log'

2023-08-27 Thread Yu Kuai
From: Yu Kuai 'conf->log' is set with 'reconfig_mutex' grabbed, however, readers are not protected, hence use READ_ONCE/WRITE_ONCE to prevent reading abnormal value. Signed-off-by: Yu Kuai --- drivers/md/raid5-cache.c | 47 +--- 1 file changed, 25 inserti

[dm-devel] [PATCH -next v2 02/28] md: use 'mddev->suspended' for is_md_suspended()

2023-08-27 Thread Yu Kuai
From: Yu Kuai 'pers->prepare_suspend' is introduced to prevent a deadlock for raid456, this change prepares to clean this up in later patches while refactoring mddev_suspend(). Specifically allow reshape to make progress while waiting for 'active_io' to be 0. Signed-off-by: Yu K

[dm-devel] [PATCH -next v2 09/28] md/md-bitmap: use new apis to suspend array for location_store()

2023-08-27 Thread Yu Kuai
From: Yu Kuai Convert to use new apis, the old apis will be removed eventually. This is not a hot path, so performance is not a concern. Signed-off-by: Yu Kuai --- drivers/md/md-bitmap.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/md/md-bitmap.c b/drivers

[dm-devel] [PATCH -next v2 08/28] md/dm-raid: use new apis to suspend array

2023-08-27 Thread Yu Kuai
From: Yu Kuai Convert to use new apis, the old apis will be removed eventually. These are not hot paths, so performance is not a concern. Signed-off-by: Yu Kuai --- drivers/md/dm-raid.c | 12 1 file changed, 4 insertions(+), 8 deletions(-) diff --git a/drivers/md/dm-raid.c b

[dm-devel] [PATCH -next v2 07/28] md: use new apis to suspend array for serialize_policy_store()

2023-08-27 Thread Yu Kuai
From: Yu Kuai Convert to use new apis, the old apis will be removed eventually. This is not a hot path, so performance is not a concern. Signed-off-by: Yu Kuai --- drivers/md/md.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/md/md.c b/drivers/md/md.c index

[dm-devel] [PATCH -next v2 06/28] md: use new apis to suspend array for level_store()

2023-08-27 Thread Yu Kuai
From: Yu Kuai Convert to use new apis, the old apis will be removed eventually. This is not a hot path, so performance is not a concern. Signed-off-by: Yu Kuai --- drivers/md/md.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/md/md.c b/drivers/md/md.c index

[dm-devel] [PATCH -next v2 03/28] md: add new helpers to suspend/resume array

2023-08-27 Thread Yu Kuai
From: Yu Kuai Advantages for new apis: - reconfig_mutex is not required; - the weird logic that suspend array hold 'reconfig_mutex' for mddev_check_recovery() to update superblock is not needed; - the special handling, 'pers->prepare_suspend', for raid456 is not needed; - It's s

[dm-devel] [PATCH -next v2 01/28] md: use READ_ONCE/WRITE_ONCE for 'suspend_lo' and 'suspend_hi'

2023-08-27 Thread Yu Kuai
From: Yu Kuai Because reading 'suspend_lo' and 'suspend_hi' from md_handle_request() is not protected, use READ_ONCE/WRITE_ONCE to prevent reading abnormal value. Signed-off-by: Yu Kuai --- drivers/md/md.c | 16 +--- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git

[dm-devel] [PATCH -next v2 00/28] md: synchronize io with array reconfiguration

2023-08-27 Thread Yu Kuai
From: Yu Kuai Changes in v2: - rebase with latest md-next - remove some follow up cleanup patches, these patches will be sent later after this patchset. After the previous four patchsets of preparatory work, this patchset implements a new version of mddev_suspend(), the new apis

[dm-devel] [PATCH -next v2 3/7] md: don't rely on 'mddev->pers' to be set in mddev_suspend()

2023-08-24 Thread Yu Kuai
From: Yu Kuai 'active_io' used to be initialized while the array is running, and 'mddev->pers' is set while the array is running as well. Hence caller must hold 'reconfig_mutex' and guarantee 'mddev->pers' is set before calling mddev_suspend(). Now that 'active_io' is initialized when

[dm-devel] [PATCH -next v2 6/7] md: don't check 'mddev->pers' from suspend_hi_store()

2023-08-24 Thread Yu Kuai
From: Yu Kuai Now that mddev_suspend() doesn't rely on 'mddev->pers' to be set, it's safe to remove such checking. This will also allow the array to be suspended even before the array is run. Signed-off-by: Yu Kuai --- drivers/md/md.c | 7 +-- 1 file changed, 1 insertion(+), 6 deleti

[dm-devel] [PATCH -next v2 4/7] md-bitmap: remove the checking of 'pers->quiesce' from location_store()

2023-08-24 Thread Yu Kuai
From: Yu Kuai After commit 4d27e927344a ("md: don't quiesce in mddev_suspend()"), there is no need to check 'pers->quiesce' before calling mddev_suspend(). Signed-off-by: Yu Kuai --- drivers/md/md-bitmap.c | 4 1 file changed, 4 deletions(-) diff --git a/drivers/md/md-bitma

[dm-devel] [PATCH -next v2 5/7] md-bitmap: suspend array earlier in location_store()

2023-08-24 Thread Yu Kuai
From: Yu Kuai Now that mddev_suspend() doesn't rely on 'mddev->pers' to be set, it's safe to call mddev_suspend() earlier. This will also help to refactor mddev_suspend() later. Signed-off-by: Yu Kuai --- drivers/md/md-bitmap.c | 43 -- 1 f

[dm-devel] [PATCH -next v2 7/7] md: don't check 'mddev->pers' and 'pers->quiesce' from suspend_lo_store()

2023-08-24 Thread Yu Kuai
From: Yu Kuai Now that mddev_suspend() doesn't rely on 'mddev->pers' to be set, it's safe to remove such checking. This will also allow the array to be suspended even before the array is run. Signed-off-by: Yu Kuai --- drivers/md/md.c | 9 ++--- 1 file changed, 2 insertions(+)

[dm-devel] [PATCH -next v2 1/7] md: initialize 'active_io' while allocating mddev

2023-08-24 Thread Yu Kuai
From: Yu Kuai 'active_io' is used for mddev_suspend() and it's initialized in md_run(), which means that 'reconfig_mutex' must be held and "mddev->pers" must be set before calling mddev_suspend(). Initialize 'active_io' early so that mddev_suspend() is safe to call once mddev

[dm-devel] [PATCH -next v2 2/7] md: initialize 'writes_pending' while allocating mddev

2023-08-24 Thread Yu Kuai
From: Yu Kuai Currently 'writes_pending' is initialized in pers->run for raid1/5/10, and it's freed while deleting mddev, instead of pers->free. pers->run can be called multiple times before mddev is deleted, and a helper mddev_init_writes_pending() is used to prevent 'write

[dm-devel] [PATCH -next v2 0/7] md: initialize 'active_io' while allocating mddev

2023-08-24 Thread Yu Kuai
From: Yu Kuai Changes in v2: - rebase for md-next; - update commit message for patch 3; This is the 4th patchset to do some preparatory work to synchronize io with array reconfiguration. 1) The first patchset refactor 'active_io', make sure that mddev_suspend() will wait for io to be done

Re: [dm-devel] Processes hung in "D" state in ext4, mm, md and dmcrypt

2023-07-26 Thread Yu Kuai
Hi, On 2023/07/26 18:02, David Howells wrote: Hi, With 6.5-rc2 (6.5.0-0.rc2.20230721gitf7e3a1bafdea.20.fc39.x86_64), I'm seeing a bunch of processes getting stuck in the D state on my desktop after a few hours of reading email and compiling stuff. It's happened every day this week so far and I

[dm-devel] [PATCH -next v2 2/3] md/dm-raid: clean up four equivalent goto tags in raid_ctr()

2023-07-08 Thread Yu Kuai
From: Yu Kuai There are four equivalent goto tags in raid_ctr(), clean them up to use just one. There is no functional change; this is a preparation to fix an unprotected md_stop(). Signed-off-by: Yu Kuai --- drivers/md/dm-raid.c | 27 +-- 1 file changed, 9 insertions

[dm-devel] [PATCH -next v2 0/3] dm-raid: minor fixes

2023-07-08 Thread Yu Kuai
From: Yu Kuai Changes in v2: - improve title and commit message for patch 2 This patchset fixes two straightforward and easy problems that were found by code review, please consider it for the next merge window. Yu Kuai (3): md/dm-raid: fix that 'reconfig_mutex' is not released from error path

[dm-devel] [PATCH -next v2 1/3] md/dm-raid: fix that 'reconfig_mutex' is not released from error path in raid_ctr()

2023-07-08 Thread Yu Kuai
From: Yu Kuai In the error path 'bad_stripe_cache' and 'bad_check_reshape', 'reconfig_mutex' is still held after raid_ctr() returns. Fixes: 9dbd1aa3a81c ("dm raid: add reshaping support to the target") Signed-off-by: Yu Kuai --- drivers/md/dm-raid.c | 9 +++-- 1 file changed, 7
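
The class of bug fixed here (a lock still held when a constructor returns through an error label) and the single-label cleanup style from patch 2/3 can be sketched in plain C. This is a toy under stated assumptions: `struct toy_lock`, `toy_lock_acquire()`, `toy_lock_release()` and `toy_ctr()` are hypothetical names, the boolean flag stands in for a real mutex, and the two failure switches merely mimic the two error paths named in the commit message.

```c
#include <stdbool.h>

/* Hypothetical lock: a flag standing in for 'reconfig_mutex'. */
struct toy_lock {
    bool held;
};

static void toy_lock_acquire(struct toy_lock *l) { l->held = true; }
static void toy_lock_release(struct toy_lock *l) { l->held = false; }

/*
 * Toy constructor: both failure points jump to one shared label that drops
 * the lock, so no error path can return with the lock still held.
 */
static int toy_ctr(struct toy_lock *l, bool fail_stripe_cache,
                   bool fail_check_reshape)
{
    int ret;

    toy_lock_acquire(l);

    if (fail_stripe_cache) {
        ret = -1;
        goto bad_unlock;
    }
    if (fail_check_reshape) {
        ret = -2;
        goto bad_unlock;
    }

    toy_lock_release(l);
    return 0;

bad_unlock:
    toy_lock_release(l);
    return ret;
}
```

Funnelling every failure through one label keeps the unlock in a single place, which is what makes a missed release easy to spot in review.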

[dm-devel] [PATCH -next v2 3/3] md/dm-raid: protect md_stop() with 'reconfig_mutex'

2023-07-08 Thread Yu Kuai
From: Yu Kuai __md_stop_writes() and __md_stop() will modify many fields that are protected by 'reconfig_mutex', and all the callers will grab 'reconfig_mutex' except for md_stop(). Fixes: 9d09e663d550 ("dm: raid456 basic support") Signed-off-by: Yu Kuai --- drivers/md/dm-

Re: [dm-devel] [PATCH -next 2/3] md/dm-raid: cleanup multiple equivalent goto tags from raid_ctr()

2023-07-06 Thread Yu Kuai
Hi, On 2023/07/06 21:01, Paul Menzel wrote: Dear Yu, Thank you for your patch. Some minor nits, if you are interested. On 06.07.23 at 09:16, Yu Kuai wrote: From: Yu Kuai There are four equivalent goto tags in raid_ctr(), clean them up to use just one, there are no functional change

[dm-devel] [PATCH -next 1/3] md/dm-raid: fix that 'reconfig_mutex' is not released from error path in raid_ctr()

2023-07-06 Thread Yu Kuai
From: Yu Kuai In the error path 'bad_stripe_cache' and 'bad_check_reshape', 'reconfig_mutex' is still held after raid_ctr() returns. Fixes: 9dbd1aa3a81c ("dm raid: add reshaping support to the target") Signed-off-by: Yu Kuai --- drivers/md/dm-raid.c | 9 +++-- 1 file changed, 7

[dm-devel] [PATCH -next 0/3] dm-raid: minor fixes

2023-07-06 Thread Yu Kuai
From: Yu Kuai This patchset fixes two straightforward problems found by code review; please consider it for the next merge window. Yu Kuai (3): md/dm-raid: fix that 'reconfig_mutex' is not released from error path in raid_ctr() md/dm-raid: cleanup multiple equivalent goto

[dm-devel] [PATCH -next 3/3] md/dm-raid: protect md_stop() with 'reconfig_mutex'

2023-07-06 Thread Yu Kuai
From: Yu Kuai __md_stop_writes() and __md_stop() will modify many fields that are protected by 'reconfig_mutex', and all the callers will grab 'reconfig_mutex' except for md_stop(). Fixes: 9d09e663d550 ("dm: raid456 basic support") Signed-off-by: Yu Kuai --- drivers/md/dm-

[dm-devel] [PATCH -next 2/3] md/dm-raid: cleanup multiple equivalent goto tags from raid_ctr()

2023-07-06 Thread Yu Kuai
From: Yu Kuai There are four equivalent goto tags in raid_ctr(); clean them up to use just one. There is no functional change; this prepares for fixing that md_stop() is not protected. Signed-off-by: Yu Kuai --- drivers/md/dm-raid.c | 27 +-- 1 file changed, 9 insertions

Re: [dm-devel] [PATCH -next v2 4/6] md: refactor idle/frozen_sync_thread() to fix deadlock

2023-06-15 Thread Yu Kuai
Hi, On 2023/06/15 16:17, Xiao Ni wrote: Thanks for the example. I can understand the usage of it. It's the side effect that removes the mutex protection for idle_sync_thread. There is a problem. New sync thread is started in md_check_recovery. After your patch, md_reap_sync_thread is called in

Re: [dm-devel] [PATCH -next v2 4/6] md: refactor idle/frozen_sync_thread() to fix deadlock

2023-06-14 Thread Yu Kuai
Hi, On 2023/06/14 17:08, Xiao Ni wrote: On Wed, Jun 14, 2023 at 4:29 PM Yu Kuai wrote: Hi, On 2023/06/14 15:57, Xiao Ni wrote: On Wed, Jun 14, 2023 at 3:38 PM Yu Kuai wrote: Hi, On 2023/06/14 15:12, Xiao Ni wrote: On Wed, Jun 14, 2023 at 10:04 AM Yu Kuai wrote: Hi, On 2023/06/14 9:48, Yu Kuai

Re: [dm-devel] [PATCH -next v2 4/6] md: refactor idle/frozen_sync_thread() to fix deadlock

2023-06-14 Thread Yu Kuai
Hi, On 2023/06/14 15:57, Xiao Ni wrote: On Wed, Jun 14, 2023 at 3:38 PM Yu Kuai wrote: Hi, On 2023/06/14 15:12, Xiao Ni wrote: On Wed, Jun 14, 2023 at 10:04 AM Yu Kuai wrote: Hi, On 2023/06/14 9:48, Yu Kuai wrote: In the patch, sync_seq is added in md_reap_sync_thread. In idle_sync_thread

Re: [dm-devel] [PATCH -next v2 4/6] md: refactor idle/frozen_sync_thread() to fix deadlock

2023-06-14 Thread Yu Kuai
Hi, On 2023/06/14 15:12, Xiao Ni wrote: On Wed, Jun 14, 2023 at 10:04 AM Yu Kuai wrote: Hi, On 2023/06/14 9:48, Yu Kuai wrote: In the patch, sync_seq is added in md_reap_sync_thread. In idle_sync_thread, if sync_seq isn't equal to mddev->sync_seq, it should mean there is someone that st

Re: [dm-devel] [PATCH -next v2 4/6] md: refactor idle/frozen_sync_thread() to fix deadlock

2023-06-14 Thread Yu Kuai
Hi, On 2023/06/14 11:47, Xiao Ni wrote: On Wed, Jun 14, 2023 at 9:48 AM Yu Kuai wrote: Hi, On 2023/06/13 22:50, Xiao Ni wrote: On 2023/5/29 9:20 PM, Yu Kuai wrote: From: Yu Kuai Our test found the following deadlock in raid10: 1) Issue a normal write, and such write failed

Re: [dm-devel] [PATCH -next v2 4/6] md: refactor idle/frozen_sync_thread() to fix deadlock

2023-06-13 Thread Yu Kuai
Hi, On 2023/06/14 9:48, Yu Kuai wrote: In the patch, sync_seq is added in md_reap_sync_thread. In idle_sync_thread, if sync_seq isn't equal to mddev->sync_seq, it should mean there is someone that stops the sync thread already, right? Why do you say 'new started sync thread' h

Re: [dm-devel] [PATCH -next v2 4/6] md: refactor idle/frozen_sync_thread() to fix deadlock

2023-06-13 Thread Yu Kuai
Hi, On 2023/06/13 22:50, Xiao Ni wrote: On 2023/5/29 9:20 PM, Yu Kuai wrote: From: Yu Kuai Our test found the following deadlock in raid10: 1) Issue a normal write, and such write failed: raid10_end_write_request set_bit(R10BIO_WriteError, _bio->state) one_write_done reschedule_re

Re: [dm-devel] [PATCH -next v2 3/6] md: add a mutex to synchronize idle and frozen in action_store()

2023-06-13 Thread Yu Kuai
Hi, On 2023/06/13 22:43, Xiao Ni wrote: On 2023/5/29 9:20 PM, Yu Kuai wrote: From: Yu Kuai Currently, for idle and frozen, action_store will hold 'reconfig_mutex' and call md_reap_sync_thread() to stop sync thread, however, this will cause deadlock (explained in the next patch). In order to fix

Re: [dm-devel] [PATCH -next v2 2/6] md: refactor action_store() for 'idle' and 'frozen'

2023-06-13 Thread Yu Kuai
Hi, On 2023/06/13 20:25, Xiao Ni wrote: On Tue, Jun 13, 2023 at 8:00 PM Yu Kuai wrote: Hi, On 2023/06/13 16:02, Xiao Ni wrote: On 2023/5/29 9:20 PM, Yu Kuai wrote: From: Yu Kuai Prepare to handle 'idle' and 'frozen' differently to fix a deadlock, there are no functional changes except

Re: [dm-devel] [PATCH -next v2 2/6] md: refactor action_store() for 'idle' and 'frozen'

2023-06-13 Thread Yu Kuai
Hi, On 2023/06/13 16:02, Xiao Ni wrote: On 2023/5/29 9:20 PM, Yu Kuai wrote: From: Yu Kuai Prepare to handle 'idle' and 'frozen' differently to fix a deadlock, there are no functional changes except that MD_RECOVERY_RUNNING is checked again after 'reconfig_mutex' is held. Can you explain more

Re: [dm-devel] [PATCH -next v2 1/6] Revert "md: unlock mddev before reap sync_thread in action_store"

2023-06-13 Thread Yu Kuai
Hi, On 2023/06/13 14:25, Xiao Ni wrote: Thanks for the patch and the explanation in V1. In version 1, it took me a long time to understand the problem. Maybe we can use the problem itself as the subject. Something like "Don't allow two sync processes running at the same time"? And could you add

Re: [dm-devel] [PATCH -next v2 0/6] md: fix that MD_RECOVERY_RUNNING can be cleared while sync_thread is still running

2023-06-07 Thread Yu Kuai
Hi, On 2023/05/29 21:20, Yu Kuai wrote: From: Yu Kuai Changes in v2: - rebase for the latest md-next Patch 1 reverts the commit because it will cause MD_RECOVERY_RUNNING to be cleared while sync_thread is still running. The deadlock this patch tries to fix will be fixed by patch 2-5. Patch 6

[dm-devel] [PATCH -next v2 5/6] md: wake up 'resync_wait' at last in md_reap_sync_thread()

2023-05-29 Thread Yu Kuai
From: Yu Kuai md_reap_sync_thread() is just replaced with wait_event(resync_wait, ...) from action_store(), just make sure action_store() will still wait for everything to be done in md_reap_sync_thread(). Signed-off-by: Yu Kuai --- drivers/md/md.c | 2 +- 1 file changed, 1 insertion(+), 1

[dm-devel] [PATCH -next v2 6/6] md: enhance checking in md_check_recovery()

2023-05-29 Thread Yu Kuai
From: Yu Kuai For md_check_recovery(): 1) if 'MD_RECOVERY_RUNNING' is not set, register new sync_thread. 2) if 'MD_RECOVERY_RUNNING' is set: a) if 'MD_RECOVERY_DONE' is not set, don't do anything, wait for md_do_sync() to be done. b) if 'MD_RECOVERY_DONE' is set, unregister sync_thread
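The three cases listed above amount to a small decision table over two flags. A minimal sketch in C, assuming the flags are reduced to two booleans; the enum and check_recovery() are hypothetical names for illustration and are not the kernel's md_check_recovery().

```c
#include <stdbool.h>

/* Hypothetical action names; the real code acts on mddev state directly. */
enum recovery_action {
	REGISTER_SYNC_THREAD,	/* case 1: RUNNING not set */
	WAIT_FOR_SYNC,		/* case 2a: RUNNING set, DONE not set */
	UNREGISTER_SYNC_THREAD,	/* case 2b: RUNNING set, DONE set */
};

/* Map the two flag values to the action described in the snippet. */
static enum recovery_action check_recovery(bool running, bool done)
{
	if (!running)
		return REGISTER_SYNC_THREAD;
	if (!done)
		return WAIT_FOR_SYNC;
	return UNREGISTER_SYNC_THREAD;
}
```

Note that when the running flag is clear, the done flag is not consulted at all, which matches case 1 as written.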

[dm-devel] [PATCH -next v2 1/6] Revert "md: unlock mddev before reap sync_thread in action_store"

2023-05-29 Thread Yu Kuai
From: Yu Kuai This reverts commit 9dfbdafda3b34e262e43e786077bab8e476a89d1. Because it will introduce a defect that sync_thread can be running while MD_RECOVERY_RUNNING is cleared, which will cause some unexpected problems, for example: list_add corruption. prev->next should be n

[dm-devel] [PATCH -next v2 4/6] md: refactor idle/frozen_sync_thread() to fix deadlock

2023-05-29 Thread Yu Kuai
From: Yu Kuai Our test found the following deadlock in raid10: 1) Issue a normal write, and such write failed: raid10_end_write_request set_bit(R10BIO_WriteError, _bio->state) one_write_done reschedule_retry // later from md thread raid10d handle_write_completed list_

[dm-devel] [PATCH -next v2 0/6] md: fix that MD_RECOVERY_RUNNING can be cleared while sync_thread is still running

2023-05-29 Thread Yu Kuai
From: Yu Kuai Changes in v2: - rebase for the latest md-next Patch 1 reverts the commit because it will cause MD_RECOVERY_RUNNING to be cleared while sync_thread is still running. The deadlock this patch tries to fix will be fixed by patch 2-5. Patch 6 enhances checking to prevent
