On Thu, Aug 12, 2021 at 01:44:53PM +0800, Gang He wrote:
> In fact, I can reproduce this problem stably.
> I want to know if this error happen is by our expectation? since there is
> not any extreme pressure test.
> Second, how should we handle these error cases? call dlm_lock function
> again? maybe the function will fails again, that will lead to kernel
> soft-lockup after multiple re-tries.

What's probably happening is that ocfs2 calls dlm_unlock(CANCEL) to cancel
an in-progress dlm_lock() request. Before the cancel completes (or the
original request completes), ocfs2 calls dlm_lock() again on the same
resource. This dlm_lock() returns -EBUSY because the previous request has
not completed, either normally or by cancellation. This is expected.

A couple options to try: wait for the original request to complete
(normally or by cancellation) before calling dlm_lock() again, or retry
dlm_lock() on -EBUSY.

Dave
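
For illustration only, here is a rough user-space sketch of the second
option (retry on -EBUSY). It is not ocfs2 or DLM code: fake_lock_request()
and lock_with_retry() are hypothetical names, and the stand-in does not
use the real dlm_lock() signature. The only point is that the retry loop
is bounded, so a persistent -EBUSY goes back to the caller instead of
being retried forever, which is the soft-lockup concern raised above.

```c
#include <errno.h>

/* Hypothetical stand-in for a lock request; NOT the real dlm_lock() API.
 * It returns -EBUSY while an earlier request (or its cancel) is pending. */
struct fake_lock {
    int pending_completions; /* calls left until the earlier request completes */
};

static int fake_lock_request(struct fake_lock *lk)
{
    if (lk->pending_completions > 0) {
        lk->pending_completions--;
        return -EBUSY;
    }
    return 0;
}

/* Retry on -EBUSY, but only up to max_attempts times. Real kernel code
 * would also sleep or wait for the completion callback between attempts
 * rather than spinning; that part is left out of this sketch. */
static int lock_with_retry(struct fake_lock *lk, int max_attempts)
{
    int ret = -EBUSY;

    for (int attempt = 0; attempt < max_attempts; attempt++) {
        ret = fake_lock_request(lk);
        if (ret != -EBUSY)
            return ret; /* success, or an error that retrying will not fix */
    }
    return ret; /* still -EBUSY: let the caller decide what to do */
}
```

With a stand-in that stays busy for three calls, lock_with_retry(&lk, 5)
succeeds on the fourth attempt; with one that stays busy longer than the
attempt limit, the caller gets -EBUSY back and can fall back to the first
option of waiting for the original request to complete.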