Re: [Cluster-devel] [PATCH] dlm: prompt the user SCTP is experimental

2018-04-09 Thread Gang He
Hi Steven and David,


>>> 
> Hi,
> 
> 
> On 09/04/18 06:02, Gang He wrote:
>> Hello David,
>>
>> If the user sets "protocol=tcp" in the configuration file /etc/dlm/dlm.conf 
> under two-rings cluster environment,
>> DLM kernel module will not work with the below error message,
>> [   43.696924] DLM installed
>> [  149.552039] ocfs2: Registered cluster interface user
>> [  149.559579] dlm: TCP protocol can't handle multi-homed hosts, try SCTP  
> <<== here, failed
>> [  149.559589] dlm: cannot start dlm lowcomms -22
>> [  149.559612] (mount.ocfs2,2593,3):ocfs2_dlm_init:3120 ERROR: status = -22
>> [  149.559629] (mount.ocfs2,2593,3):ocfs2_mount_volume:1845 ERROR: status = 
> -22
>>
>> Then, could we modify the code, let this case still work via only using one 
> ring address? or the code is written by purpose.
>> in lowcomms.c
>> 1358 static int tcp_listen_for_all(void)
>> 1359 {
>> 1360 struct socket *sock = NULL;
>> 1361 struct connection *con = nodeid2con(0, GFP_NOFS);
>> 1362 int result = -EINVAL;
>> 1363
>> 1364 if (!con)
>> 1365 return -ENOMEM;
>> 1366
>> 1367 /* We don't support multi-homed hosts */
>> 1368 if (dlm_local_addr[1] != NULL) {   <<== here, could we get ride 
> of this limitation?
>> 1369 log_print("TCP protocol can't handle multi-homed hosts, 
> "
>> 1370   "try SCTP");
>> 1371 return -EINVAL;
>> 1372 }
>> 1373
>> 1374 log_print("Using TCP for communications");
>> 1375
>> 1376 sock = tcp_create_listen_sock(con, dlm_local_addr[0]);
>> 1377 if (sock) {
>> 1378 add_sock(sock, con);
>> 1379 result = 0;
>> 1380 }
>> 1381 else {
>> 1382 result = -EADDRINUSE;
>> 1383 }
>>
>>
>> Thanks
>> Gang
>>
> There is already a patch set to allow multi-homing for TCP. Mark and 
> Dave can comment on the current status and how far from merging it 
> currently is,
Thanks for your update on this problem, hopefully we can the related patches in 
the Linus git tree soon.

Thanks
Gang

> 
> Steve.




[Cluster-devel] [PATCH] fs: dls: lowcomms: Replace GFP_ATOMIC with GFP_KERNEL in receive_from_sock

2018-04-09 Thread Jia-Ju Bai
receive_from_sock() is never called in atomic context.

The call chain ending up at receive_from_sock() is:
[1] receive_from_sock() <- process_recv_sockets()
process_recv_sockets() is only set as a parameter of INIT_WORK()
And receive_from_sock() also calls mutex_lock(), which indicates
this function is not called in atomic context.

Despite never getting called from atomic context,
receive_from_sock() calls alloc_page() with GFP_ATOMIC,
which waits busily for allocation.
GFP_ATOMIC is not necessary and can be replaced with GFP_KERNEL,
to avoid busy waiting and improve the possibility of sucessful allocation.

This is found by a static analysis tool named DCNS written by myself.
And I also manually check it.

Signed-off-by: Jia-Ju Bai 
---
 fs/dlm/lowcomms.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index 4813d0e..2d4e230 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -633,7 +633,7 @@ static int receive_from_sock(struct connection *con)
 * This doesn't need to be atomic, but I think it should
 * improve performance if it is.
 */
-   con->rx_page = alloc_page(GFP_ATOMIC);
+   con->rx_page = alloc_page(GFP_KERNEL);
if (con->rx_page == NULL)
goto out_resched;
cbuf_init(>cb, PAGE_SIZE);
-- 
1.9.1



Re: [Cluster-devel] [PATCH] dlm: prompt the user SCTP is experimental

2018-04-09 Thread Steven Whitehouse

Hi,


On 09/04/18 06:02, Gang He wrote:

Hello David,

If the user sets "protocol=tcp" in the configuration file /etc/dlm/dlm.conf 
under two-rings cluster environment,
DLM kernel module will not work with the below error message,
[   43.696924] DLM installed
[  149.552039] ocfs2: Registered cluster interface user
[  149.559579] dlm: TCP protocol can't handle multi-homed hosts, try SCTP  <<== 
here, failed
[  149.559589] dlm: cannot start dlm lowcomms -22
[  149.559612] (mount.ocfs2,2593,3):ocfs2_dlm_init:3120 ERROR: status = -22
[  149.559629] (mount.ocfs2,2593,3):ocfs2_mount_volume:1845 ERROR: status = -22

Then, could we modify the code, let this case still work via only using one 
ring address? or the code is written by purpose.
in lowcomms.c
1358 static int tcp_listen_for_all(void)
1359 {
1360 struct socket *sock = NULL;
1361 struct connection *con = nodeid2con(0, GFP_NOFS);
1362 int result = -EINVAL;
1363
1364 if (!con)
1365 return -ENOMEM;
1366
1367 /* We don't support multi-homed hosts */
1368 if (dlm_local_addr[1] != NULL) {   <<== here, could we get ride of 
this limitation?
1369 log_print("TCP protocol can't handle multi-homed hosts, "
1370   "try SCTP");
1371 return -EINVAL;
1372 }
1373
1374 log_print("Using TCP for communications");
1375
1376 sock = tcp_create_listen_sock(con, dlm_local_addr[0]);
1377 if (sock) {
1378 add_sock(sock, con);
1379 result = 0;
1380 }
1381 else {
1382 result = -EADDRINUSE;
1383 }


Thanks
Gang

There is already a patch set to allow multi-homing for TCP. Mark and 
Dave can comment on the current status and how far from merging it 
currently is,


Steve.