[Cluster-devel] [PATCHv3 dlm/next 20/20] fs: dlm: check for invalid namelen

2021-01-04 Thread Alexander Aring
This patch adds a check to the dlm user-space locking path that the namelen isn't above the maximum allowed dlm resource name length. If the namelen is above the maximum allowed, we have an invalid state and an out-of-bounds buffer access can occur. Cut off the namelen attribute
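The check described above can be sketched in user-space C. This is only an illustration under stated assumptions: `check_namelen()` and `copy_resname()` are hypothetical helpers, not the functions the patch touches, and `RESNAME_MAXLEN` is a local stand-in for the kernel's `DLM_RESNAME_MAXLEN` (64). The excerpt is cut short, so how the patch itself handles an oversized length is not shown; this sketch simply rejects it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Local stand-in for the kernel's DLM_RESNAME_MAXLEN so the sketch is self-contained. */
#define RESNAME_MAXLEN 64

/* Hypothetical helper: 0 if namelen fits the resource name buffer, -EINVAL otherwise. */
static int check_namelen(size_t namelen)
{
    return namelen > RESNAME_MAXLEN ? -EINVAL : 0;
}

/* Hypothetical helper: copy a user-supplied name only after the length check passed. */
static int copy_resname(char dst[RESNAME_MAXLEN], const char *src, size_t namelen)
{
    int err;

    assert(dst != NULL && src != NULL);

    err = check_namelen(namelen);
    if (err)
        return err; /* never touch dst with an oversized length */

    memcpy(dst, src, namelen);
    return 0;
}
```

Validating the length before the copy is what keeps a bad user-supplied value from ever reaching the fixed-size buffer.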

[Cluster-devel] [PATCHv3 dlm/next 19/20] fs: dlm: remove obsolete code and comment

2021-01-04 Thread Alexander Aring
This patch removes obsolete macros and a comment. The macro was there to temporarily copy a dlm message into a stack buffer; nowadays we operate directly on the receive buffer which was given to the socket receive API. Signed-off-by: Alexander Aring --- fs/dlm/dlm_internal.h | 6 -- 1 file

[Cluster-devel] [PATCHv3 dlm/next 16/20] fs: dlm: add per node receive flush

2021-01-04 Thread Alexander Aring
This patch adds functionality to flush pending dlm messages from the receiving workqueue. Upcoming patches will use it to make sure that nothing can be received e.g. after a node gets removed from the nodes hash. Receiving messages results in a node lookup which might create a node for

[Cluster-devel] [PATCHv3 dlm/next 17/20] fs: dlm: add reliable connection if reconnect

2021-01-04 Thread Alexander Aring
This patch makes a tcp lowcomms connection reliable even if reconnects occur. This is done by application-layer re-transmission handling and sequence numbers in dlm protocols. There are three new dlm commands: DLM_OPTS: This will encapsulate an existing dlm message (and rcom
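The general idea of sequence-number-based reliability can be sketched as follows. This is a generic illustration, not the wire format or state machine of the patch: `struct rx_state`, `rx_accept()` and `rx_ack()` are hypothetical names, and the new dlm commands mentioned above are not modelled. The receiver delivers only the message it expects next and drops everything else, relying on the sender to re-transmit whatever has not been acknowledged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-node receive state: the next sequence number we expect. */
struct rx_state {
    uint32_t next_seq;
};

/*
 * Returns true if the message with this sequence number should be delivered.
 * Duplicates (seq already seen) and messages after a gap are dropped; the
 * sender is expected to re-transmit anything that was not acknowledged.
 */
static bool rx_accept(struct rx_state *rx, uint32_t seq)
{
    assert(rx != NULL);

    if (seq != rx->next_seq)
        return false;

    rx->next_seq++;
    return true;
}

/* Cumulative acknowledgement: every sequence number below this was delivered. */
static uint32_t rx_ack(const struct rx_state *rx)
{
    return rx->next_seq;
}
```

With this scheme a reconnect cannot cause a message to be processed twice or skipped, because delivery depends only on the sequence number and not on the underlying TCP connection.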

[Cluster-devel] [PATCHv3 dlm/next 18/20] fs: dlm: don't allow half transmitted messages

2021-01-04 Thread Alexander Aring
This patch cleans a dirty page buffer if a reconnect occurs. If a page buffer was half transmitted, we cannot start in the middle of a dlm message when a node connects again. I observed invalid length reception errors and guessed that this behaviour occurs; after this patch I never saw

[Cluster-devel] [PATCHv3 dlm/next 11/20] fs: dlm: make new buffer handling softirq ready

2021-01-04 Thread Alexander Aring
This patch makes the writequeue and message handling ready to be called from a softirq by using spinlock handling to stop software interrupts on the local cpu while they are held. The coming midcomms re-transmit handling will introduce a timer which uses this functionality when the timer expires.

[Cluster-devel] [PATCHv3 dlm/next 14/20] fs: dlm: remove unaligned memory access handling

2021-01-04 Thread Alexander Aring
This patch removes unaligned memory access handling for receiving midcomms messages. This handling will not fix the unaligned memory access in general. All messages should be length-aligned to 8 bytes, but there exist cases where this isn't so. It's part of the sending handling to not send such

[Cluster-devel] [PATCHv3 dlm/next 15/20] fs: dlm: add union in dlm header for lockspace id

2021-01-04 Thread Alexander Aring
This patch adds a union for the lockspace id in the dlm header so that the field can also be used for a different dlm command. Signed-off-by: Alexander Aring --- fs/dlm/dlm_internal.h | 5 - fs/dlm/lock.c | 8 fs/dlm/midcomms.c | 1 - fs/dlm/rcom.c | 4 ++-- fs/dlm/util.c

[Cluster-devel] [PATCHv3 dlm/next 13/20] fs: dlm: move out some hash functionality

2021-01-04 Thread Alexander Aring
This patch moves some lowcomms hash functionality into the lowcomms header to make it available to other layers such as midcomms as well. Signed-off-by: Alexander Aring --- fs/dlm/lowcomms.c | 9 - fs/dlm/lowcomms.h | 10 ++ 2 files changed, 10 insertions(+), 9 deletions(-) diff --git

[Cluster-devel] [PATCHv3 dlm/next 10/20] fs: dlm: make buffer handling per msg

2021-01-04 Thread Alexander Aring
This patch makes the void pointer handle for lowcomms functionality per message and not per page allocation entry. Refcount handling for the handle was added to keep the message alive until the user no longer needs it. There is now a per-message callback which will be called when

[Cluster-devel] [PATCHv3 dlm/next 12/20] fs: dlm: add functionality to re-transmit a message

2021-01-04 Thread Alexander Aring
This patch introduces irq-safe retransmit functionality for a lowcomms message handle. It just allocates a new buffer and transmits it again. To avoid another connection lookup, some refactoring was done to allow a new buffer allocation with a preexisting connection pointer. Signed-off-by: Alexander

[Cluster-devel] [PATCHv3 dlm/next 09/20] fs: dlm: add more midcomms hooks

2021-01-04 Thread Alexander Aring
This patch prepares hooks to redirect to the midcomms layer which will be used by the midcomms re-transmit handling. There is a new concept of stateless buffer allocation and commits. This can be used to bypass the midcomms re-transmit handling. It is used by RCOM_STATUS and RCOM_NAMES

[Cluster-devel] [PATCHv3 dlm/next 08/20] fs: dlm: simplify writequeue handling

2021-01-04 Thread Alexander Aring
This patch cleans up the current dlm sending allocator handling by using some named macros, list functionality and removes some goto statements. Signed-off-by: Alexander Aring --- fs/dlm/lowcomms.c | 83 --- 1 file changed, 43 insertions(+), 40

[Cluster-devel] [PATCHv3 dlm/next 07/20] fs: dlm: use GFP_ZERO for page buffer

2021-01-04 Thread Alexander Aring
This patch uses GFP_ZERO to allocate a page for the internal dlm sending buffer allocator instead of calling memset zero after every allocation. Already allocated space will never be reused. Signed-off-by: Alexander Aring --- fs/dlm/lock.c | 2 -- fs/dlm/lowcomms.c | 2 +-

[Cluster-devel] [PATCHv3 dlm/next 05/20] fs: dlm: change allocation limits

2021-01-04 Thread Alexander Aring
While running tcpkill I experienced invalid header length values while receiving. To check that a node doesn't try to send an invalid dlm message, we also check the application's minimum allocation limit. Also use DEFAULT_BUFFER_SIZE as maximum allocation limit. The define LOWCOMMS_MAX_TX_BUFFER_LEN is

[Cluster-devel] [PATCHv3 dlm/next 04/20] fs: dlm: add check if dlm is currently running

2021-01-04 Thread Alexander Aring
This patch adds checks for dlm config attributes regarding protocol parameters, as it only makes sense to change them when dlm is not running. It also adds a check for valid protocol specifiers and returns invalid argument if they are not supported. Signed-off-by: Alexander Aring ---

[Cluster-devel] [PATCHv3 dlm/next 06/20] fs: dlm: public header in out utility

2021-01-04 Thread Alexander Aring
This patch allows header_out() and header_in() to be used outside of the dlm util functionality. Signed-off-by: Alexander Aring --- fs/dlm/util.c | 4 ++-- fs/dlm/util.h | 2 ++ 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/fs/dlm/util.c b/fs/dlm/util.c index cfd0d00b19ae..74a8c5bfe9b5

[Cluster-devel] [PATCHv3 dlm/next 03/20] fs: dlm: add errno handling to check callback

2021-01-04 Thread Alexander Aring
This allows returning individual errno values from the config attribute check callback instead of returning invalid argument only. Signed-off-by: Alexander Aring --- fs/dlm/config.c | 23 --- 1 file changed, 16 insertions(+), 7 deletions(-) diff --git a/fs/dlm/config.c

[Cluster-devel] [PATCHv3 dlm/next 00/20] fs: dlm: introduce dlm re-transmission layer

2021-01-04 Thread Alexander Aring
Hi, this is the final patch-series to make dlm reliable when re-connection occurs. You can easily generate a couple of re-connections by running: tcpkill -9 -i $IFACE port 21064 on your own to test these patches. At some time dlm will detect message drops and will re-transmit messages if

[Cluster-devel] [PATCHv3 dlm/next 01/20] fs: dlm: set connected bit after accept

2021-01-04 Thread Alexander Aring
This patch sets the CF_CONNECTED bit when dlm accepts a connection from another node. If we don't set this bit, the next time the connection socket becomes writable it will be taken as an event that the connection was successfully connected. However, that is only the case when the connection did a connect.

[Cluster-devel] [PATCHv3 dlm/next 02/20] fs: dlm: set subclass for othercon sock_mutex

2021-01-04 Thread Alexander Aring
This patch sets the lockdep subclass for the othercon socket mutex. In various places the connection socket mutex is held while locking the othercon socket mutex. This patch will remove lockdep warnings when such a case occurs. Signed-off-by: Alexander Aring --- fs/dlm/lowcomms.c | 3 ++- 1 file

Re: [Cluster-devel] [PATCH] fs: amend SLAB_RECLAIM_ACCOUNT on gfs2_quotad_cachep

2021-01-04 Thread Bob Peterson
- Original Message - > From: Zhaoyang Huang > > As gfs2_quotad_cachep has registered a shrinker, add > SLAB_RECLAIM_ACCOUNT when creating gfs2_quotad_cachep, which > makes the slab accounting precise. > > Signed-off-by: Zhaoyang Huang > --- > fs/gfs2/main.c | 2 +- > 1 file

Re: [Cluster-devel] [GFS2 PATCH] gfs2: make recovery workqueue operate on a gfs2 mount point, not journal

2021-01-04 Thread Bob Peterson
Hi, - Original Message - > Hi, > > On 22/12/2020 20:38, Bob Peterson wrote: > > Hi, > > > > Before this patch, journal recovery was done by a workqueue function that > > operated on a per-journal basis. The problem is, these could run > > simultaneously > > which meant that they could

[Cluster-devel] [PATCH] fs: amend SLAB_RECLAIM_ACCOUNT on gfs2_quotad_cachep

2021-01-04 Thread Huangzhaoyang
From: Zhaoyang Huang As gfs2_quotad_cachep has registered a shrinker, add SLAB_RECLAIM_ACCOUNT when creating gfs2_quotad_cachep, which makes the slab accounting precise. Signed-off-by: Zhaoyang Huang --- fs/gfs2/main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git

[Cluster-devel] [PATCH -next] dlm: use DEFINE_MUTEX (and mutex_init() had been too late)

2021-01-04 Thread Zheng Yongjun
Signed-off-by: Zheng Yongjun --- fs/dlm/lockspace.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/fs/dlm/lockspace.c b/fs/dlm/lockspace.c index 624617c12250..2b3c32f2d29d 100644 --- a/fs/dlm/lockspace.c +++ b/fs/dlm/lockspace.c @@ -26,7 +26,7 @@ #include "ast.h"

[Cluster-devel] [PATCH -next] dlm: debug_fs: use DEFINE_MUTEX (and mutex_init() had been too late)

2021-01-04 Thread Zheng Yongjun
Signed-off-by: Zheng Yongjun --- fs/dlm/debug_fs.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/fs/dlm/debug_fs.c b/fs/dlm/debug_fs.c index d6bbccb0ed15..7a6fa8ac6f50 100644 --- a/fs/dlm/debug_fs.c +++ b/fs/dlm/debug_fs.c @@ -20,7 +20,7 @@ #define DLM_DEBUG_BUF_LEN

[Cluster-devel] [PATCH v2 -next] dlm: debug_fs: use DEFINE_MUTEX() for mutex lock

2021-01-04 Thread Zheng Yongjun
mutex lock can be initialized automatically with DEFINE_MUTEX() rather than explicitly calling mutex_init() Signed-off-by: Zheng Yongjun --- fs/dlm/debug_fs.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/fs/dlm/debug_fs.c b/fs/dlm/debug_fs.c index

[Cluster-devel] [PATCH v2 -next] dlm: use DEFINE_MUTEX() for mutex lock

2021-01-04 Thread Zheng Yongjun
mutex lock can be initialized automatically with DEFINE_MUTEX() rather than explicitly calling mutex_init(). Signed-off-by: Zheng Yongjun --- fs/dlm/lockspace.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/fs/dlm/lockspace.c b/fs/dlm/lockspace.c index
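The difference between static and run-time mutex initialization can be illustrated with a user-space analogue. This is not kernel code: it assumes POSIX threads, where `PTHREAD_MUTEX_INITIALIZER` plays a role comparable to the kernel's `DEFINE_MUTEX()` and `pthread_mutex_init()` is comparable to `mutex_init()`. The names `counter_lock`, `counter` and `bump_counter()` are made up for the sketch.

```c
#include <assert.h>
#include <pthread.h>

/*
 * Statically initialized lock: it is valid from program start, so no code
 * path can reach it before an init call has run (the "too late" problem
 * that an explicit run-time init can have).
 */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static int counter;

/* Increment the shared counter under the lock and return the new value. */
static int bump_counter(void)
{
    int rc, value;

    rc = pthread_mutex_lock(&counter_lock);
    assert(rc == 0);

    value = ++counter;

    rc = pthread_mutex_unlock(&counter_lock);
    assert(rc == 0);

    return value;
}
```

Because the lock is initialized by the compiler rather than by an init function, the ordering between initialization and first use no longer matters.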

Re: [Cluster-devel] [GFS2 PATCH] gfs2: make recovery workqueue operate on a gfs2 mount point, not journal

2021-01-04 Thread Steven Whitehouse
Hi, On 22/12/2020 20:38, Bob Peterson wrote: Hi, Before this patch, journal recovery was done by a workqueue function that operated on a per-journal basis. The problem is, these could run simultaneously which meant that they could all use the same bio, sd_log_bio, to do their writing to all