Re: [Cluster-devel] [PATCH 5/5] gfs2: dlm based recovery coordination

2012-01-09 Thread Steven Whitehouse
Hi, On Mon, 2012-01-09 at 12:00 -0500, David Teigland wrote: > On Mon, Jan 09, 2012 at 11:46:26AM -0500, David Teigland wrote: > > On Mon, Jan 09, 2012 at 04:36:30PM +0000, Steven Whitehouse wrote: > > > On Thu, 2012-01-05 at 10:46 -0600, David Teigland wrote: > > > > This new method of managing r

Re: [Cluster-devel] [PATCH 5/5] gfs2: dlm based recovery coordination

2012-01-09 Thread Steven Whitehouse
Hi, On Mon, 2012-01-09 at 11:46 -0500, David Teigland wrote: > On Mon, Jan 09, 2012 at 04:36:30PM +0000, Steven Whitehouse wrote: > > On Thu, 2012-01-05 at 10:46 -0600, David Teigland wrote: > > > This new method of managing recovery is an alternative to > > > the previous approach of using the us

Re: [Cluster-devel] [PATCH 5/5] gfs2: dlm based recovery coordination

2012-01-09 Thread David Teigland
On Mon, Jan 09, 2012 at 11:46:26AM -0500, David Teigland wrote: > On Mon, Jan 09, 2012 at 04:36:30PM +0000, Steven Whitehouse wrote: > > On Thu, 2012-01-05 at 10:46 -0600, David Teigland wrote: > > > This new method of managing recovery is an alternative to > > > the previous approach of using the

Re: [Cluster-devel] [PATCH 5/5] gfs2: dlm based recovery coordination

2012-01-09 Thread David Teigland
On Mon, Jan 09, 2012 at 04:36:30PM +0000, Steven Whitehouse wrote: > On Thu, 2012-01-05 at 10:46 -0600, David Teigland wrote: > > This new method of managing recovery is an alternative to > > the previous approach of using the userland gfs_controld. > > > > - use dlm slot numbers to assign journal

Re: [Cluster-devel] [PATCH 5/5] gfs2: dlm based recovery coordination

2012-01-09 Thread Steven Whitehouse
Hi, On Thu, 2012-01-05 at 10:46 -0600, David Teigland wrote: > This new method of managing recovery is an alternative to > the previous approach of using the userland gfs_controld. > > - use dlm slot numbers to assign journal id's > - use dlm recovery callbacks to initiate journal recovery > - us

Re: [Cluster-devel] [PATCH 5/5] gfs2: dlm based recovery coordination

2012-01-05 Thread David Teigland
On Thu, Jan 05, 2012 at 04:58:22PM +0000, Steven Whitehouse wrote: > > + clear_bit(SDF_NOJOURNALID, &sdp->sd_flags); > > + smp_mb__after_clear_bit(); > > + wake_up_bit(&sdp->sd_flags, SDF_NOJOURNALID); > > + ls->ls_first = !!test_bit(DFL_FIRST_MOUNT, &ls->ls_recover_flags); > > + return 0

Re: [Cluster-devel] [PATCH 5/5] gfs2: dlm based recovery coordination

2012-01-05 Thread Steven Whitehouse
Hi, On Thu, 2012-01-05 at 10:46 -0600, David Teigland wrote: [snip] > > +static int gdlm_mount(struct gfs2_sbd *sdp, const char *table) > +{ > + struct lm_lockstruct *ls = &sdp->sd_lockstruct; > + char cluster[GFS2_LOCKNAME_LEN]; > + const char *fsname; > + uint32_t flags; > +

[Cluster-devel] [PATCH 5/5] gfs2: dlm based recovery coordination

2012-01-05 Thread David Teigland
This new method of managing recovery is an alternative to the previous approach of using the userland gfs_controld. - use dlm slot numbers to assign journal id's - use dlm recovery callbacks to initiate journal recovery - use a dlm lock to determine the first node to mount fs - use a dlm lock to t

Re: [Cluster-devel] [PATCH 5/5] gfs2: dlm based recovery coordination

2012-01-05 Thread Bob Peterson
- Original Message - | This new method of managing recovery is an alternative to | the previous approach of using the userland gfs_controld. | | - use dlm slot numbers to assign journal id's | - use dlm recovery callbacks to initiate journal recovery | - use a dlm lock to determine the fir

Re: [Cluster-devel] [PATCH 5/5] gfs2: dlm based recovery coordination

2012-01-05 Thread David Teigland
On Thu, Jan 05, 2012 at 03:40:09PM +0000, Steven Whitehouse wrote: > I think it would be a good plan to not send this last patch for the > current merge window and let it settle for a bit longer. Running things > so fine with the timing makes me nervous bearing in mind the number of > changes, To

Re: [Cluster-devel] [PATCH 5/5] gfs2: dlm based recovery coordination

2012-01-05 Thread Steven Whitehouse
Hi, On Thu, 2012-01-05 at 10:21 -0500, David Teigland wrote: > On Thu, Jan 05, 2012 at 10:08:15AM -0500, Bob Peterson wrote: > > - Original Message - > > | This new method of managing recovery is an alternative to > > | the previous approach of using the userland gfs_controld. > > | > > |

Re: [Cluster-devel] [PATCH 5/5] gfs2: dlm based recovery coordination

2012-01-05 Thread David Teigland
On Thu, Jan 05, 2012 at 10:08:15AM -0500, Bob Peterson wrote: > - Original Message - > | This new method of managing recovery is an alternative to > | the previous approach of using the userland gfs_controld. > | > | - use dlm slot numbers to assign journal id's > | - use dlm recovery call

Re: [Cluster-devel] [PATCH 5/5] gfs2: dlm based recovery coordination

2012-01-05 Thread Bob Peterson
- Original Message - | This new method of managing recovery is an alternative to | the previous approach of using the userland gfs_controld. | | - use dlm slot numbers to assign journal id's | - use dlm recovery callbacks to initiate journal recovery | - use a dlm lock to determine the fir

Re: [Cluster-devel] [PATCH 5/5] gfs2: dlm based recovery coordination

2011-12-23 Thread Steven Whitehouse
Hi, On Thu, 2011-12-22 at 16:23 -0500, David Teigland wrote: > On Mon, Dec 19, 2011 at 12:47:38PM -0500, David Teigland wrote: > > On Mon, Dec 19, 2011 at 01:07:38PM +0000, Steven Whitehouse wrote: > > > > struct lm_lockstruct { > > > > int ls_jid; > > > > unsigned int ls_first; >

Re: [Cluster-devel] [PATCH 5/5] gfs2: dlm based recovery coordination

2011-12-22 Thread David Teigland
On Mon, Dec 19, 2011 at 12:47:38PM -0500, David Teigland wrote: > On Mon, Dec 19, 2011 at 01:07:38PM +0000, Steven Whitehouse wrote: > > > struct lm_lockstruct { > > > int ls_jid; > > > unsigned int ls_first; > > > - unsigned int ls_first_done; > > > unsigned int ls_nodir; > > Since ls_flags

Re: [Cluster-devel] [PATCH 5/5] gfs2: dlm based recovery coordination

2011-12-21 Thread David Teigland
On Wed, Dec 21, 2011 at 10:45:21AM +0000, Steven Whitehouse wrote: > I don't think I understand what's going on in that case. What I thought > should be happening was this: > > - Try to get mounter lock in EX > - If successful, then we are the first mounter so recover all > journals > -
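
The first step quoted in the preview above (try for the mounter lock in EX; if granted, this node is the first mounter and recovers all journals) can be sketched roughly as follows. This is only an illustration under stated assumptions: `struct mounter_lock`, `try_lock_ex()` and `is_first_mounter()` are hypothetical names for a single-process stand-in, not the dlm API or the code in the patch, and the steps that follow in the truncated message are not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a cluster-wide lock; NOT the real dlm API. */
enum lock_mode { MODE_NL, MODE_EX };

struct mounter_lock {
    int holders;          /* number of nodes currently holding the lock */
    enum lock_mode mode;  /* mode granted to the current holder */
};

/*
 * Non-queuing exclusive request: it is either granted immediately or
 * refused immediately; the caller never waits.
 */
static bool try_lock_ex(struct mounter_lock *lk)
{
    assert(lk != NULL);
    if (lk->holders > 0)
        return false;     /* someone else already holds the lock */
    lk->holders = 1;
    lk->mode = MODE_EX;
    return true;
}

/*
 * A node that wins the exclusive request is treated as the first
 * mounter, which in the quoted description means it recovers all
 * journals. What a losing node does next is not shown here.
 */
static bool is_first_mounter(struct mounter_lock *lk)
{
    return try_lock_ex(lk);
}
```

With a freshly initialised lock, the first caller of `is_first_mounter()` is granted EX and any later caller is refused while the lock is held.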

Re: [Cluster-devel] [PATCH 5/5] gfs2: dlm based recovery coordination

2011-12-21 Thread Steven Whitehouse
Hi, On Tue, 2011-12-20 at 16:04 -0500, David Teigland wrote: > On Tue, Dec 20, 2011 at 02:16:43PM -0500, David Teigland wrote: > > On Tue, Dec 20, 2011 at 10:39:08AM +0000, Steven Whitehouse wrote: > > > > I dislike arbitrary delays also, so I'm hesitant to add them. > > > > The choices here are:

Re: [Cluster-devel] [PATCH 5/5] gfs2: dlm based recovery coordination

2011-12-20 Thread David Teigland
On Tue, Dec 20, 2011 at 02:16:43PM -0500, David Teigland wrote: > On Tue, Dec 20, 2011 at 10:39:08AM +0000, Steven Whitehouse wrote: > > > I dislike arbitrary delays also, so I'm hesitant to add them. > > > The choices here are: > > > - removing NOQUEUE from the requests below, but with NOQUEUE you

Re: [Cluster-devel] [PATCH 5/5] gfs2: dlm based recovery coordination

2011-12-20 Thread David Teigland
On Tue, Dec 20, 2011 at 10:39:08AM +0000, Steven Whitehouse wrote: > > I dislike arbitrary delays also, so I'm hesitant to add them. > > The choices here are: > > - removing NOQUEUE from the requests below, but with NOQUEUE you have a > > much better chance of killing a mount command, which is a

Re: [Cluster-devel] [PATCH 5/5] gfs2: dlm based recovery coordination

2011-12-20 Thread Steven Whitehouse
Hi, On Mon, 2011-12-19 at 12:47 -0500, David Teigland wrote: > On Mon, Dec 19, 2011 at 01:07:38PM +0000, Steven Whitehouse wrote: > > > struct lm_lockstruct { > > > int ls_jid; > > > unsigned int ls_first; > > > - unsigned int ls_first_done; > > > unsigned int ls_nodir; > > Since ls_flags a

Re: [Cluster-devel] [PATCH 5/5] gfs2: dlm based recovery coordination

2011-12-19 Thread David Teigland
On Mon, Dec 19, 2011 at 01:07:38PM +0000, Steven Whitehouse wrote: > > struct lm_lockstruct { > > int ls_jid; > > unsigned int ls_first; > > - unsigned int ls_first_done; > > unsigned int ls_nodir; > Since ls_flags and ls_first are also only boolean flags, they could > potentially b

Re: [Cluster-devel] [PATCH 5/5] gfs2: dlm based recovery coordination

2011-12-19 Thread Steven Whitehouse
Hi, On Fri, 2011-12-16 at 16:03 -0600, David Teigland wrote: > This new method of managing recovery is an alternative to > the previous approach of using the userland gfs_controld. > > - use dlm slot numbers to assign journal id's > - use dlm recovery callbacks to initiate journal recovery > - us

Re: [Cluster-devel] [PATCH 5/5] gfs2: dlm based recovery coordination

2011-12-19 Thread Steven Whitehouse
On Fri, 2011-12-16 at 16:03 -0600, David Teigland wrote: > This new method of managing recovery is an alternative to > the previous approach of using the userland gfs_controld. > > - use dlm slot numbers to assign journal id's > - use dlm recovery callbacks to initiate journal recovery > - use a d

[Cluster-devel] [PATCH 5/5] gfs2: dlm based recovery coordination

2011-12-16 Thread David Teigland
This new method of managing recovery is an alternative to the previous approach of using the userland gfs_controld. - use dlm slot numbers to assign journal id's - use dlm recovery callbacks to initiate journal recovery - use a dlm lock to determine the first node to mount fs - use a dlm lock to t
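
To illustrate the first bullet of the patch description above ("use dlm slot numbers to assign journal id's"), here is a minimal sketch of mapping a node's slot to a journal id. It assumes slots are numbered from 1 and journal ids from 0, so that jid = slot - 1; that mapping, the `struct node_slot` type and the `jid_for_node()` helper are assumptions made for this example and are not taken from the truncated patch text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical (nodeid, slot) pair; not the kernel's own structure. */
struct node_slot {
    int nodeid;  /* cluster node identifier */
    int slot;    /* slot number assigned to that node, assumed >= 1 */
};

/*
 * Look up the slot for my_nodeid and derive a journal id from it.
 * Assumption: slots start at 1 and journal ids at 0, so jid = slot - 1.
 * Returns -1 if the node has no entry in the table.
 */
static int jid_for_node(const struct node_slot *slots, size_t n, int my_nodeid)
{
    assert(slots != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (slots[i].nodeid == my_nodeid) {
            assert(slots[i].slot >= 1);
            return slots[i].slot - 1;
        }
    }
    return -1;
}
```

Because each node holds a distinct slot, each node derives a distinct journal id without any userland daemon handing ids out, which is the point of the bullet.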