This patch mostly reverts commit b92a4e3f86b1 ("fs: dlm: change posix
lock sigint handling"), with two exceptions: it keeps the check, under
ops_lock, of whether op->done became true after the wait was
interrupted, and it changes the "no op" messages to debug printouts.

There are currently problems with cleaning up pending operations. The
main idea of commit b92a4e3f86b1 ("fs: dlm: change posix lock sigint
handling") was to wait for a reply and, if the operation had been
interrupted, to run the cleanup routine, e.g. list_del() and
do_unlock_close(), at that point.

This requires that every dlm op request gets an answer back in
dev_write(). The cleanup routine do_unlock_close() does not operate on a
per-request basis in the dlm user space software; it cleans up
everything that matches certain plock op fields. As a result, we no
longer get a result back for every request. This leaves leftover
entries in the dlm plock recv_list which will never be deleted.
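
As a rough illustration only (a user space toy model, not kernel or
dlm_controld code), the sketch below shows how a cleanup that matches on
fields rather than on a single request can leave a pending entry behind.
All names are hypothetical, and matching on (fsid, number, owner) is an
assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a plock request; not the kernel struct. */
struct toy_op {
	unsigned int fsid;
	unsigned long long number;
	unsigned long long owner;
	bool pending;	/* kernel side: still on the toy "recv_list" */
	bool queued;	/* user space side: a reply will still be sent */
};

/* Field-based cleanup: drops every queued request that matches, without replying. */
static size_t toy_close_matching(struct toy_op *ops, size_t n, unsigned int fsid,
				 unsigned long long number, unsigned long long owner)
{
	size_t dropped = 0;

	for (size_t i = 0; i < n; i++) {
		if (ops[i].queued && ops[i].fsid == fsid &&
		    ops[i].number == number && ops[i].owner == owner) {
			ops[i].queued = false;
			dropped++;
		}
	}
	return dropped;
}

/* Each reply that user space still sends removes one kernel side entry. */
static void toy_deliver_replies(struct toy_op *ops, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (ops[i].queued) {
			ops[i].queued = false;
			ops[i].pending = false;
		}
	}
}

/* Count entries that are still pending on the kernel side. */
static size_t toy_count_leftovers(const struct toy_op *ops, size_t n)
{
	size_t left = 0;

	for (size_t i = 0; i < n; i++)
		if (ops[i].pending)
			left++;
	return left;
}

/* Two requests from owner 1, one from owner 2; the first one is interrupted. */
static size_t toy_scenario(void)
{
	struct toy_op ops[3] = {
		{ 1, 10, 1, true, true },
		{ 1, 10, 1, true, true },
		{ 1, 10, 2, true, true },
	};
	size_t dropped;

	ops[0].pending = false;		/* interrupted waiter removes its own op */
	dropped = toy_close_matching(ops, 3, 1, 10, 1);
	assert(dropped == 2);		/* both owner 1 requests lose their reply */
	toy_deliver_replies(ops, 3);	/* only owner 2 is answered */
	return toy_count_leftovers(ops, 3);
}
```

In this toy run the second request from owner 1 is never answered, so it
stays pending, which mirrors the leftover recv_list entries described
above.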

This was confirmed with a new debugfs entry that shows whether some
plock lists still have entries left when there is no longer any posix
lock activity (checked with dlm_tool plocks $LS). In the specific test
case, the following command was executed on a gfs2 mountpoint:

stress-ng --fcntl 32

and the stress-ng program was killed after a certain time.

do_unlock_close() cleans up more than just one specific operation, and
the dlm operation has already been removed by list_del(). This
list_del() can operate on either send_list or recv_list. If it hits
recv_list, answers can still come back for an ongoing operation, and
do_unlock_close() is not synchronized with the list_del(). This ends in
a "no op ..." log_print(). To avoid confusing the user about such
issues, which seem to be there by design, we move this logging to
pr_debug(), as those are expected log messages.
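
A minimal user space sketch of why a late reply finds nothing, assuming
a simplified list model with hypothetical names (it is not the
dev_write() implementation):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified pending-operation record; not struct plock_op. */
struct pending_op {
	unsigned int fsid;
	unsigned long long number;
	unsigned long long owner;
	bool on_list;	/* true until the waiter removes the entry */
};

/* Mimics a reply handler lookup: first listed entry that matches the reply. */
static struct pending_op *find_pending(struct pending_op *ops, size_t n,
				       const struct pending_op *reply)
{
	for (size_t i = 0; i < n; i++) {
		if (ops[i].on_list && ops[i].fsid == reply->fsid &&
		    ops[i].number == reply->number &&
		    ops[i].owner == reply->owner)
			return &ops[i];
	}
	return NULL;
}

/* Returns true when a reply arrives for an op the waiter already removed. */
static bool reply_hits_no_op(void)
{
	struct pending_op ops[1] = { { 1, 10, 7, true } };
	struct pending_op reply = { 1, 10, 7, false };

	/* While the op is listed, the reply would find it. */
	assert(find_pending(ops, 1, &reply) != NULL);

	ops[0].on_list = false;	/* interrupted waiter: list_del() equivalent */

	/* The late reply no longer matches anything: the "no op" case. */
	return find_pending(ops, 1, &reply) == NULL;
}
```

Here reply_hits_no_op() returns true, which corresponds to the expected
"no op" message that this patch demotes to pr_debug().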

Cc: sta...@vger.kernel.org
Fixes: b92a4e3f86b1 ("fs: dlm: change posix lock sigint handling")
Signed-off-by: Alexander Aring <aahri...@redhat.com>
---
 fs/dlm/plock.c | 25 ++++++-------------------
 1 file changed, 6 insertions(+), 19 deletions(-)

diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
index ff364901f22b..fea2157fac5b 100644
--- a/fs/dlm/plock.c
+++ b/fs/dlm/plock.c
@@ -30,8 +30,6 @@ struct plock_async_data {
 struct plock_op {
        struct list_head list;
        int done;
-       /* if lock op got interrupted while waiting dlm_controld reply */
-       bool sigint;
        struct dlm_plock_info info;
        /* if set indicates async handling */
        struct plock_async_data *data;
@@ -167,12 +165,14 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
                        spin_unlock(&ops_lock);
                        goto do_lock_wait;
                }
-
-               op->sigint = true;
+               list_del(&op->list);
                spin_unlock(&ops_lock);
+
                log_debug(ls, "%s: wait interrupted %x %llx pid %d",
                          __func__, ls->ls_global_id,
                          (unsigned long long)number, op->info.pid);
+               do_unlock_close(&op->info);
+               dlm_release_plock_op(op);
                goto out;
        }
 
@@ -434,19 +434,6 @@ static ssize_t dev_write(struct file *file, const char __user *u, size_t count,
                if (iter->info.fsid == info.fsid &&
                    iter->info.number == info.number &&
                    iter->info.owner == info.owner) {
-                       if (iter->sigint) {
-                               list_del(&iter->list);
-                               spin_unlock(&ops_lock);
-
-                               pr_debug("%s: sigint cleanup %x %llx pid %d",
-                                         __func__, iter->info.fsid,
-                                         (unsigned long long)iter->info.number,
-                                         iter->info.pid);
-                               do_unlock_close(&iter->info);
-                               memcpy(&iter->info, &info, sizeof(info));
-                               dlm_release_plock_op(iter);
-                               return count;
-                       }
                        list_del_init(&iter->list);
                        memcpy(&iter->info, &info, sizeof(info));
                        if (iter->data)
@@ -465,8 +452,8 @@ static ssize_t dev_write(struct file *file, const char __user *u, size_t count,
                else
                        wake_up(&recv_wq);
        } else
-               log_print("%s: no op %x %llx", __func__,
-                         info.fsid, (unsigned long long)info.number);
+               pr_debug("%s: no op %x %llx", __func__,
+                        info.fsid, (unsigned long long)info.number);
        return count;
 }
 
-- 
2.31.1
