Re: [Ocfs2-devel] [PATCH] ocfs2: dlmglue: fix false deadlock caused by clearing UPCONVERT_FINISHING too early

2016-01-21 Thread Junxiao Bi
On 01/21/2016 04:10 PM, Eric Ren wrote: > Hi Junxiao, > > On Thu, Jan 21, 2016 at 03:10:20PM +0800, Junxiao Bi wrote: >> Hi Eric, >> >> This patch should fix your issue. >> "NFS hangs in __ocfs2_cluster_lock due to race with ocfs2_unblock_lock" > > Thanks a lot for bringing up this patch! It

Re: [Ocfs2-devel] ocfs2: o2hb: not fence self if storage down

2016-01-21 Thread rwxybh
Hi, Junxiao! We can't find the correct fencing log after a node fences itself. We know there is a log message such as the following in the source code: printk(KERN_ERR "*** ocfs2 is very sorry to be fencing this " "system by restarting ***\n"); But we NEVER found this message in /var/log/message or last

Re: [Ocfs2-devel] ocfs2: o2hb: not fence self if storage down

2016-01-21 Thread Junxiao Bi
On 01/21/2016 04:34 PM, rwxybh wrote: > Hi, junxiao! > > > We can't find correct fencing log after a node fencing itself. > We know there is log such as following in source code: > > printk(KERN_ERR "*** ocfs2 is very sorry to be fencing this " > "system by restarting ***\n"); > > But

Re: [Ocfs2-devel] [PATCH] ocfs2: dlmglue: fix false deadlock caused by clearing UPCONVERT_FINISHING too early

2016-01-21 Thread Eric Ren
Hi Junxiao, On Thu, Jan 21, 2016 at 03:10:20PM +0800, Junxiao Bi wrote: > Hi Eric, > > This patch should fix your issue. > "NFS hangs in __ocfs2_cluster_lock due to race with ocfs2_unblock_lock" Thanks a lot for bringing up this patch! It hasn't been merged into mainline (at least 4.4), right?

Re: [Ocfs2-devel] OCFS2 causing system instability

2016-01-21 Thread Guy 2212112
Hi, First, I'm well aware that OCFS2 is not a distributed file system, but a shared, clustered file system. This is the main reason that I want to use it - access the same filesystem from multiple nodes. I've checked the latest Kernel 4.4 release that includes the "errors=continue" option and

Re: [Ocfs2-devel] [PATCH 1/6] ocfs2: o2hb: add negotiate timer

2016-01-21 Thread Andrew Morton
On Wed, 20 Jan 2016 11:13:34 +0800 Junxiao Bi wrote: > When storage down, all nodes will fence self due to write timeout. > The negotiate timer is designed to avoid this, with it node will > wait until storage up again. > > Negotiate timer working in the following way: >

Re: [Ocfs2-devel] [PATCH] ocfs2: dlmglue: fix false deadlock caused by clearing UPCONVERT_FINISHING too early

2016-01-21 Thread Andrew Morton
On Thu, 21 Jan 2016 16:18:38 +0800 Junxiao Bi wrote: > On 01/21/2016 04:10 PM, Eric Ren wrote: > > Hi Junxiao, > > > > On Thu, Jan 21, 2016 at 03:10:20PM +0800, Junxiao Bi wrote: > >> Hi Eric, > >> > >> This patch should fix your issue. > >> "NFS hangs in

Re: [Ocfs2-devel] [PATCH 2/6] ocfs2: o2hb: add NEGO_TIMEOUT message

2016-01-21 Thread Andrew Morton
On Wed, 20 Jan 2016 11:13:35 +0800 Junxiao Bi wrote: > This message is sent to master node when non-master nodes's > negotiate timer expired. Master node records these nodes in > a bitmap which is used to do write timeout timer re-queue > decision. > > ... > > +static int

Re: [Ocfs2-devel] ocfs2: o2hb: not fence self if storage down

2016-01-21 Thread Joseph Qi
Hi Junxiao, On 2016/1/21 9:48, Junxiao Bi wrote: > On 01/21/2016 08:46 AM, Joseph Qi wrote: >> Hi Junxiao, >> So you mean the negotiation you added only happens if all nodes storage >> link down? > Negotiation happened when one node found its storage link down, but > success when all nodes

Re: [Ocfs2-devel] [PATCH 1/6] ocfs2: o2hb: add negotiate timer

2016-01-21 Thread Joseph Qi
Hi Junxiao, On 2016/1/20 11:13, Junxiao Bi wrote: > When storage down, all nodes will fence self due to write timeout. > The negotiate timer is designed to avoid this, with it node will > wait until storage up again. > > Negotiate timer working in the following way: > > 1. The timer expires

Re: [Ocfs2-devel] [PATCH 1/6] ocfs2: o2hb: add negotiate timer

2016-01-21 Thread Junxiao Bi
Hi Andrew, On 01/22/2016 07:42 AM, Andrew Morton wrote: > On Wed, 20 Jan 2016 11:13:34 +0800 Junxiao Bi wrote: > >> When storage down, all nodes will fence self due to write timeout. >> The negotiate timer is designed to avoid this, with it node will >> wait until storage

Re: [Ocfs2-devel] [PATCH] ocfs2: dlmglue: fix false deadlock caused by clearing UPCONVERT_FINISHING too early

2016-01-21 Thread Eric Ren
Hi all, On Thu, Jan 21, 2016 at 03:05:58PM -0800, Andrew Morton wrote: > On Thu, 21 Jan 2016 16:18:38 +0800 Junxiao Bi wrote: > > > On 01/21/2016 04:10 PM, Eric Ren wrote: > > > Hi Junxiao, > > > > > > On Thu, Jan 21, 2016 at 03:10:20PM +0800, Junxiao Bi wrote: > > >>

Re: [Ocfs2-devel] OCFS2 causing system instability

2016-01-21 Thread Junxiao Bi
Hi Guy, On 01/22/2016 01:46 AM, Guy 2212112 wrote: > Hi, > First, I'm well aware that OCFS2 is not a distributed file system, but a > shared, clustered file system. This is the main reason that I want to > use it - access the same filesystem from multiple nodes. Glad to hear you are interested in

Re: [Ocfs2-devel] ocfs2: o2hb: not fence self if storage down

2016-01-21 Thread Junxiao Bi
Hi Joseph, On 01/22/2016 12:25 PM, Joseph Qi wrote: > Hi Junxiao, > > On 2016/1/21 9:48, Junxiao Bi wrote: >> On 01/21/2016 08:46 AM, Joseph Qi wrote: >>> Hi Junxiao, >>> So you mean the negotiation you added only happens if all nodes storage >>> link down? >> Negotiation happened when one node

Re: [Ocfs2-devel] [PATCH 2/6] ocfs2: o2hb: add NEGO_TIMEOUT message

2016-01-21 Thread Junxiao Bi
On 01/22/2016 01:45 PM, Andrew Morton wrote: > On Fri, 22 Jan 2016 13:12:26 +0800 Junxiao Bi wrote: > >> On 01/22/2016 07:47 AM, Andrew Morton wrote: >>> On Wed, 20 Jan 2016 11:13:35 +0800 Junxiao Bi wrote: >>> This message is sent to master

Re: [Ocfs2-devel] [PATCH 2/6] ocfs2: o2hb: add NEGO_TIMEOUT message

2016-01-21 Thread Junxiao Bi
On 01/22/2016 07:47 AM, Andrew Morton wrote: > On Wed, 20 Jan 2016 11:13:35 +0800 Junxiao Bi wrote: > >> This message is sent to master node when non-master nodes's >> negotiate timer expired. Master node records these nodes in >> a bitmap which is used to do write timeout

Re: [Ocfs2-devel] [PATCH 2/6] ocfs2: o2hb: add NEGO_TIMEOUT message

2016-01-21 Thread Andrew Morton
On Fri, 22 Jan 2016 13:12:26 +0800 Junxiao Bi wrote: > On 01/22/2016 07:47 AM, Andrew Morton wrote: > > On Wed, 20 Jan 2016 11:13:35 +0800 Junxiao Bi wrote: > > > >> This message is sent to master node when non-master nodes's > >> negotiate timer