On Tue, Sep 12, 2017 at 09:01:31AM +, tsutomu@toshiba.co.jp wrote:
> When the DLM_LKF_NODLCKWT flag was set, even if conversion deadlock
> was detected, the caller of can_be_granted() was unknown.
> We change the behavior of can_be_granted() and change it to detect
> conversion deadlock reg
In the current implementation, we think that exclusion control
for othercon in tcp_accept_from_sock() and sctp_accept_from_sock()
is insufficient. We fix this.
Signed-off-by: Tadashi Miyauchi
Signed-off-by: Tsutomu Owa
---
fs/dlm/lowcomms.c | 6 ++
1 file changed, 6 insertions(+)
diff --git
When an error occurs in kernel_recvmsg or kernel_sendpage,
close_connection is called, and if receive work is already scheduled,
that receive work is canceled. In that case, the receive work will never
be scheduled again after reconnection, because the CF_READ_PENDING
flag remains set.
Signed-off-by: Tad
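
The stale-flag problem described above can be illustrated outside the kernel. The following is only a userspace sketch using C11 atomics, not the code from the patch: struct conn, READ_PENDING, schedule_read() and cancel_read() are hypothetical names standing in for the connection structure, CF_READ_PENDING, and the queue/cancel paths.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the CF_READ_PENDING bit. */
#define READ_PENDING 0x1u

/* Hypothetical stand-in for the connection structure. */
struct conn {
    atomic_uint flags;
    int queued;          /* number of times work was actually queued */
};

/* Queue receive work only if the pending flag was not already set. */
static bool schedule_read(struct conn *c)
{
    assert(c != NULL);
    unsigned int old = atomic_fetch_or(&c->flags, READ_PENDING);
    if (old & READ_PENDING)
        return false;    /* flag set: work is assumed to be queued already */
    c->queued++;
    return true;
}

/*
 * Cancel queued work. With clear_flag == false the flag stays set after
 * the cancel, which is the failure mode described above: every later
 * schedule_read() call believes work is still pending.
 */
static void cancel_read(struct conn *c, bool clear_flag)
{
    assert(c != NULL);
    if (clear_flag)
        atomic_fetch_and(&c->flags, ~READ_PENDING);
}
```

In this sketch, calling cancel_read(c, false) after a successful schedule_read(c) makes every following schedule_read(c) return false, while cancel_read(c, true) lets the work be queued again.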
The CF_WRITE_PENDING flag has been restored so that dlm_send stops properly
when dlm_lowcomms_stop is run.
Signed-off-by: Tadashi Miyauchi
Signed-off-by: Tsutomu Owa
---
fs/dlm/lowcomms.c | 9 -
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcom
If a node sends a DLM_RCOM_STATUS command and an error occurs on the
receiving side, the DLM_RCOM_STATUS_REPLY response may not be returned.
We retransmit the DLM_RCOM_STATUS command so that we do not wait
forever for a response.
Signed-off-by: Tadashi Miyauchi
Signed-off-by: Tsutomu Owa
--
---
This version implements Steve Whitehouse's suggestion to put
cond_resched() after the queue_work in function send_to_sock.
This was just a thinko; it does make more sense to do it afterward.
Before this patch the CF_WRITE_PENDING flag was used to indicate
when writes to the socket were pending
When kernel_sendpage (in send_to_sock) and kernel_recvmsg
(in receive_from_sock) return an error, close_connection may run in both
paths at the same time. In that case, they may wait for each other in
cancel_work_sync.
Signed-off-by: Tadashi Miyauchi
Signed-off-by: Tsutomu Owa
---
fs/dlm/lowcomms.c | 8 ++-
When the DLM_LKF_NODLCKWT flag was set, even if conversion deadlock
was detected, the caller of can_be_granted() was unknown.
We change the behavior of can_be_granted() and change it to detect
conversion deadlock regardless of whether the DLM_LKF_NODLCKWT flag
is set or not. And depending on whethe
The sk member of the socket generated by sock_create_kern() is overwritten
by ops->accept(). So the previous sk will not be released.
We use kernel_accept() instead of sock_create_kern() and ops->accept().
Signed-off-by: Tadashi Miyauchi
Signed-off-by: Tsutomu Owa
---
fs/dlm/lowcomms.c | 21 +++
The writequeue and writequeue_lock members of othercon were not initialized.
If lowcomms_state_change() is called from the network layer, othercon->swork
may be scheduled. In this case, send_to_sock() will cause a NULL pointer
dereference. We avoid this problem by correctly initializing writequeue and
w
dlm_lowcomms_stop() was not functioning properly. To stop correctly, we
have to wait until all processing on send_workqueue and recv_workqueue
has finished.
This problem causes the following issue. The scenario is:
1. dlm_send thread:
send_to_sock refers con->writequeue
2. main thread:
dlm_lowcomms_sto
Before this patch, there was a flag in the con structure that was
used to determine whether or not a connect was needed. The bit was
set here and there, and cleared here and there, so it left some
race conditions: the bit was set, work was queued, then the worker
cleared the bit, allowing someone e
In a previous patch I noted that accept() often copies the struct
sock (sk) which overwrites the sock callbacks. However, in testing
we discovered that the dlm connection structures (con) are sometimes
deleted and recreated as connections come and go, and since they're
zeroed out by kmem_cache_zall
When dlm_recoverd_stop() is called between kthread_should_stop() and
set_task_state(TASK_INTERRUPTIBLE), dlm_recoverd will not wake up.
Signed-off-by: Tadashi Miyauchi
Signed-off-by: Tsutomu Owa
---
fs/dlm/recoverd.c | 11 ++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git
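
The lost wake-up described above happens when the stop request arrives after the stop check but before the thread goes to sleep. The diff itself is truncated here, so the following is not the kernel fix. It is only a userspace analogy using POSIX threads, with hypothetical names (struct worker, worker_request_stop(), worker_wait_for_stop()), showing the general rule that the check and the sleep must not leave a window between them.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a stoppable worker thread. */
struct worker {
    pthread_mutex_t lock;
    pthread_cond_t  wake;
    bool            stop;
};

static void worker_init(struct worker *w)
{
    assert(w != NULL);
    pthread_mutex_init(&w->lock, NULL);
    pthread_cond_init(&w->wake, NULL);
    w->stop = false;
}

/* Request a stop: set the flag and signal while holding the lock. */
static void worker_request_stop(struct worker *w)
{
    pthread_mutex_lock(&w->lock);
    w->stop = true;
    pthread_cond_signal(&w->wake);
    pthread_mutex_unlock(&w->lock);
}

/*
 * Sleep until a stop is requested. The flag is checked while holding the
 * same lock that pthread_cond_wait() releases atomically, so a stop
 * request cannot slip in between the check and the sleep.
 */
static bool worker_wait_for_stop(struct worker *w)
{
    bool stopped;

    pthread_mutex_lock(&w->lock);
    while (!w->stop)
        pthread_cond_wait(&w->wake, &w->lock);
    stopped = w->stop;
    pthread_mutex_unlock(&w->lock);
    return stopped;
}
```

If worker_request_stop() runs before worker_wait_for_stop(), the waiter sees the flag already set and returns without sleeping, so the request is never lost.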
In the current implementation, we think that exclusion control between
the code that sets the callback functions on the connection structure and
the code that refers to the connection structure from those callbacks
is insufficient. We fix this.
Signed-off-by: Tadashi Miyauchi
Signed-off-by: Tsu
The save_cb argument is not used. We remove it.
Signed-off-by: Tadashi Miyauchi
Signed-off-by: Tsutomu Owa
---
fs/dlm/lowcomms.c | 16
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index f5eea54..585a327 100644
--- a/fs/dlm/lo
If an error occurs during sending or receiving and othercon exists,
sending or receiving through othercon may also result in an error.
We fix this by closing othercon in advance as well.
Signed-off-by: Tadashi Miyauchi
Signed-off-by: Tsutomu Owa
---
fs/dlm/lowcomms.c | 4 ++--
1 file changed,
dlm_cb_seq is 64 bits. If dlm_cb_seq overflows and returns to 0,
dlm_rem_lkb_callback() will not work properly.
Signed-off-by: Tadashi Miyauchi
Signed-off-by: Tsutomu Owa
---
fs/dlm/ast.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/dlm/ast.c b/fs/dlm/ast.c
index 07fed83..562fa8c 10
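
As an illustration of the wrap-around concern above, here is a minimal userspace sketch, not the patch itself. It assumes that a sequence value of 0 is reserved to mean "no callback" (an assumption, since the diff is truncated here), so the counter skips 0 when it wraps. cb_seq and next_cb_seq() are hypothetical names standing in for dlm_cb_seq and its increment.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the 64-bit dlm_cb_seq counter. */
static uint64_t cb_seq;

/*
 * Return the next sequence number, skipping 0 on wrap-around.
 * Assumption: 0 is reserved as "no sequence", so it is never handed out.
 */
static uint64_t next_cb_seq(void)
{
    uint64_t seq = ++cb_seq;

    if (seq == 0)            /* wrapped from UINT64_MAX back to 0 */
        seq = ++cb_seq;

    assert(seq != 0);
    return seq;
}
```

With cb_seq at UINT64_MAX, the next call returns 1 instead of 0.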
If reconnection fails while executing dlm_lowcomms_stop,
dlm_send will not stop.
Signed-off-by: Tadashi Miyauchi
Signed-off-by: Tsutomu Owa
---
fs/dlm/lowcomms.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index 4b33614..e2067a6 100644
--- a/fs/dlm/
Hi,
This series of patches (second version, following the previous review in
August) fixes various bugs. The patch set is against the mainline kernel.
We'd like it reviewed to make sure these changes are fine.
Patches 01, 02, and 03 were posted on this mailing list.
However, I've modified 03 to correct a bu