Re: [Ocfs2-devel] [PATCH V2] Fix the nested PR lock calling issue

2010-07-23 Thread Tiger Yang
Hi, Sunil,

I think putting them in ocfs2_check_acl() is better.
First, checking the mount option in ocfs2_check_acl() is clearer than
checking it in _ocfs2_get_acl().
Second, we already have ocfs2_get_acl and ocfs2_get_acl_nolock, so
_ocfs2_get_acl seems redundant and could confuse people reading the
code.

Regards,
tiger

On 07/21/2010 02:26 PM, Sunil Mushran wrote:
 Why not add _ocfs2_get_acl() that does the same without
 taking the cluster locks?



___
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-devel


[Ocfs2-devel] [PATCH] ocfs2/dlm: avoid incorrect bit set in refmap on recovery master

2010-07-23 Thread Wengang Wang
In the following situation, an incorrect bit remains in the refmap on the
recovery master. The recovery master will eventually fail to purge the lockres
because of that incorrect bit.

1) node A no longer has any interest in lockres A, so it is purging it.
2) the owner of lockres A is node B, so node A is sending a de-ref message
to node B.
3) at this time, node B crashes. node C becomes the recovery master. It recovers
lockres A (because its master is the dead node B).
4) node A migrates lockres A to node C with a refbit there.
5) node A fails to send the de-ref message to node B because node B has crashed.
The failure is ignored, and no further action is taken for lockres A.

Normally, re-sending the deref message to the recovery master would fix this.
However, ignoring the failure of the deref to the original master and not
recovering the lockres to the recovery master has the same effect, and the
latter is simpler.

Signed-off-by: Wengang Wang wen.gang.w...@oracle.com
---
 fs/ocfs2/dlm/dlmrecovery.c |   17 +++--
 fs/ocfs2/dlm/dlmthread.c   |   28 +---
 2 files changed, 32 insertions(+), 13 deletions(-)

diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
index 9dfaac7..06640f6 100644
--- a/fs/ocfs2/dlm/dlmrecovery.c
+++ b/fs/ocfs2/dlm/dlmrecovery.c
@@ -1997,6 +1997,8 @@ void dlm_move_lockres_to_recovery_list(struct dlm_ctxt *dlm,
 	struct list_head *queue;
 	struct dlm_lock *lock, *next;
 
+	assert_spin_locked(&dlm->spinlock);
+	assert_spin_locked(&res->spinlock);
 	res->state |= DLM_LOCK_RES_RECOVERING;
 	if (!list_empty(&res->recovering)) {
 		mlog(0,
@@ -2336,9 +2338,20 @@ static void dlm_do_local_recovery_cleanup(struct dlm_ctxt *dlm, u8 dead_node)
 
 				/* the wake_up for this will happen when the
 				 * RECOVERING flag is dropped later */
-				res->state &= ~DLM_LOCK_RES_DROPPING_REF;
+				if (res->state & DLM_LOCK_RES_DROPPING_REF) {
+					/*
+					 * don't migrate a lockres which is in
+					 * progress of dropping ref
+					 */
+					mlog(ML_NOTICE, "%.*s ignored for "
+					     "migration\n", res->lockname.len,
+					     res->lockname.name);
+					res->state &=
+						~DLM_LOCK_RES_DROPPING_REF;
+				} else
+					dlm_move_lockres_to_recovery_list(dlm,
+									  res);
 
-				dlm_move_lockres_to_recovery_list(dlm, res);
 			} else if (res->owner == dlm->node_num) {
 				dlm_free_dead_locks(dlm, res, dead_node);
 				__dlm_lockres_calc_usage(dlm, res);
diff --git a/fs/ocfs2/dlm/dlmthread.c b/fs/ocfs2/dlm/dlmthread.c
index dd78ca3..47420ce 100644
--- a/fs/ocfs2/dlm/dlmthread.c
+++ b/fs/ocfs2/dlm/dlmthread.c
@@ -92,17 +92,23 @@ int __dlm_lockres_has_locks(struct dlm_lock_resource *res)
  * truly ready to be freed. */
 int __dlm_lockres_unused(struct dlm_lock_resource *res)
 {
-	if (!__dlm_lockres_has_locks(res) &&
-	    (list_empty(&res->dirty) && !(res->state & DLM_LOCK_RES_DIRTY))) {
-		/* try not to scan the bitmap unless the first two
-		 * conditions are already true */
-		int bit = find_next_bit(res->refmap, O2NM_MAX_NODES, 0);
-		if (bit >= O2NM_MAX_NODES) {
-			/* since the bit for dlm->node_num is not
-			 * set, inflight_locks better be zero */
-			BUG_ON(res->inflight_locks != 0);
-			return 1;
-		}
+	int bit;
+
+	if (__dlm_lockres_has_locks(res))
+		return 0;
+
+	if (!list_empty(&res->dirty) || res->state & DLM_LOCK_RES_DIRTY)
+		return 0;
+
+	if (res->state & DLM_LOCK_RES_RECOVERING)
+		return 0;
+
+	bit = find_next_bit(res->refmap, O2NM_MAX_NODES, 0);
+	if (bit >= O2NM_MAX_NODES) {
+		/* since the bit for dlm->node_num is not
+		 * set, inflight_locks better be zero */
+		BUG_ON(res->inflight_locks != 0);
+		return 1;
 	}
 	return 0;
 }
-- 
1.7.1.1
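
A minimal user-space sketch of the purge-eligibility test that the dlmthread.c hunk above introduces, for readers who want to follow the order of the checks. Everything here is a stand-in rather than the kernel's definitions: struct lockres_model, the flag values, MAX_NODES and the helper names are hypothetical, and assert() plays the role of BUG_ON().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_NODES      255      /* stand-in for O2NM_MAX_NODES */
#define RES_DIRTY      0x0001u  /* arbitrary stand-in for DLM_LOCK_RES_DIRTY */
#define RES_RECOVERING 0x0002u  /* arbitrary stand-in for DLM_LOCK_RES_RECOVERING */

/* Hypothetical, simplified model of a lock resource. */
struct lockres_model {
	unsigned int state;
	int nr_locks;                 /* locks on any queue */
	bool on_dirty_list;
	unsigned int inflight_locks;
	uint8_t refmap[(MAX_NODES + 7) / 8];
};

/* Record that a node holds a reference on the resource. */
static void refmap_set(struct lockres_model *res, unsigned int node)
{
	res->refmap[node / 8] |= (uint8_t)(1u << (node % 8));
}

/* True when no node has a reference bit set. */
static bool refmap_empty(const struct lockres_model *res)
{
	for (size_t i = 0; i < sizeof(res->refmap); i++)
		if (res->refmap[i])
			return false;
	return true;
}

/* Mirrors the order of the early returns in the patched function. */
static bool lockres_unused(const struct lockres_model *res)
{
	if (res->nr_locks > 0)
		return false;
	if (res->on_dirty_list || (res->state & RES_DIRTY))
		return false;
	if (res->state & RES_RECOVERING)   /* the check the patch adds */
		return false;
	if (!refmap_empty(res))
		return false;
	/* no reference bits are set, so nothing may be in flight */
	assert(res->inflight_locks == 0);
	return true;
}
```

In this model a resource that is otherwise idle stops counting as unused as soon as RES_RECOVERING is set or any refmap bit is present, which is the behaviour the patch description relies on.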




Re: [Ocfs2-devel] [PATCH] ocfs2/dlm: avoid incorrect bit set in refmap on recovery master

2010-07-23 Thread Srinivas Eeda
Thanks for making this patch. It looks good; I have just a few minor
comments.

On 7/23/2010 5:15 AM, Wengang Wang wrote:
 In the following situation, there remains an incorrect bit in refmap on the
 recovery master. Finally the recovery master will fail at purging the lockres
 due to the incorrect bit in refmap.

 1) node A has no interest on lockres A any longer, so it is purging it.
 2) the owner of lockres A is node B, so node A is sending de-ref message
 to node B.
 3) at this time, node B crashed. node C becomes the recovery master. it 
 recovers
 lockres A(because the master is the dead node B).
 4) node A migrated lockres A to node C with a refbit there.
 5) node A failed to send de-ref message to node B because it crashed. The 
 failure
 is ignored. no other action is done for lockres A any more.

 For mormal, re-send the deref message to it to recovery master can fix it. 
 Well,
 ignoring the failure of deref to the original master and not recovering the 
 lockres
 to recovery master has the same effect. And the later is simpler.

 Signed-off-by: Wengang Wang wen.gang.w...@oracle.com
 ---
  fs/ocfs2/dlm/dlmrecovery.c |   17 +++--
  fs/ocfs2/dlm/dlmthread.c   |   28 +---
  2 files changed, 32 insertions(+), 13 deletions(-)

 diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
 index 9dfaac7..06640f6 100644
 --- a/fs/ocfs2/dlm/dlmrecovery.c
 +++ b/fs/ocfs2/dlm/dlmrecovery.c
 @@ -1997,6 +1997,8 @@ void dlm_move_lockres_to_recovery_list(struct dlm_ctxt 
 *dlm,
   struct list_head *queue;
   struct dlm_lock *lock, *next;
  
 + assert_spin_locked(dlm-spinlock);
 + assert_spin_locked(res-spinlock);
   res-state |= DLM_LOCK_RES_RECOVERING;
   if (!list_empty(res-recovering)) {
   mlog(0,
 @@ -2336,9 +2338,20 @@ static void dlm_do_local_recovery_cleanup(struct dlm_ctxt *dlm, u8 dead_node)
  
  				/* the wake_up for this will happen when the
  				 * RECOVERING flag is dropped later */

Remove the above comment, as it no longer seems relevant.

 -				res->state &= ~DLM_LOCK_RES_DROPPING_REF;
 +				if (res->state & DLM_LOCK_RES_DROPPING_REF) {
 +					/*
 +					 * don't migrate a lockres which is in
 +					 * progress of dropping ref
 +					 */

Move this comment to before the if condition.

 +					mlog(ML_NOTICE, "%.*s ignored for "
 +					     "migration\n", res->lockname.len,
 +					     res->lockname.name);

This information only helps us diagnose related issues and is not
helpful in normal cases, so the log level should be 0 instead of ML_NOTICE.

 +					res->state &=
 +						~DLM_LOCK_RES_DROPPING_REF;

We don't need to clear this state, as dlm_purge_lockres removes it.

 +				} else
 +					dlm_move_lockres_to_recovery_list(dlm,
 +									  res);
  
 -				dlm_move_lockres_to_recovery_list(dlm, res);
  			} else if (res->owner == dlm->node_num) {
  				dlm_free_dead_locks(dlm, res, dead_node);
  				__dlm_lockres_calc_usage(dlm, res);
 diff --git a/fs/ocfs2/dlm/dlmthread.c b/fs/ocfs2/dlm/dlmthread.c
 index dd78ca3..47420ce 100644
 --- a/fs/ocfs2/dlm/dlmthread.c
 +++ b/fs/ocfs2/dlm/dlmthread.c
 @@ -92,17 +92,23 @@ int __dlm_lockres_has_locks(struct dlm_lock_resource *res)
   * truly ready to be freed. */
  int __dlm_lockres_unused(struct dlm_lock_resource *res)
  {
 - if (!__dlm_lockres_has_locks(res) 
 - (list_empty(res-dirty)  !(res-state  DLM_LOCK_RES_DIRTY))) {
 - /* try not to scan the bitmap unless the first two
 -  * conditions are already true */
 - int bit = find_next_bit(res-refmap, O2NM_MAX_NODES, 0);
 - if (bit = O2NM_MAX_NODES) {
 - /* since the bit for dlm-node_num is not
 -  * set, inflight_locks better be zero */
 - BUG_ON(res-inflight_locks != 0);
 - return 1;
 - }
 + int bit;
 +
 + if (__dlm_lockres_has_locks(res))
 + return 0;
 +
 + if (!list_empty(res-dirty) || res-state  DLM_LOCK_RES_DIRTY)
 + return 0;
 +
 + if (res-state  DLM_LOCK_RES_RECOVERING)
 + return 0;
 +
 + bit = find_next_bit(res-refmap, O2NM_MAX_NODES, 0);
 + if (bit = O2NM_MAX_NODES) {
 + /* since the bit for dlm-node_num is not
 +  * set, inflight_locks better be zero */
 + 

Re: [Ocfs2-devel] [PATCH V2] Fix the nested PR lock calling issue

2010-07-23 Thread Sunil Mushran
ok. Then review Jiaju's earlier patch.

On 07/23/2010 03:42 PM, Tiger Yang wrote:
 Hi, Sunil,

 I think put them in ocfs2_check_acl() is better.
 First, check mount option in ocfs2_check_acl()  is more clear than in 
 _ocfs2_get_acl().
 Second, we already have ocfs2_get_acl and ocfs2_get_acl_nolock, so it 
 seems _ocfs2_get_acl is redundant and could cause confusing for 
 reading the code.

 Regards,
 tiger

 On 07/21/2010 02:26 PM, Sunil Mushran wrote:
 Why not add _ocfs2_get_acl() that does the same without
 taking the cluster locks?





[Ocfs2-devel] [PATCH 7/8] ocfs2: Print message if user mounts without starting global heartbeat

2010-07-23 Thread Sunil Mushran
In global heartbeat mode, the heartbeat is started by the user. This patch
prints an error if the user attempts to mount a volume without starting the
heartbeat.

Signed-off-by: Sunil Mushran sunil.mush...@oracle.com
---
 fs/ocfs2/stack_o2cb.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/fs/ocfs2/stack_o2cb.c b/fs/ocfs2/stack_o2cb.c
index 7020e12..4b14b5b 100644
--- a/fs/ocfs2/stack_o2cb.c
+++ b/fs/ocfs2/stack_o2cb.c
@@ -282,6 +282,8 @@ static int o2cb_cluster_connect(struct ocfs2_cluster_connection *conn)
 	/* for now we only have one cluster/node, make sure we see it
 	 * in the heartbeat universe */
 	if (!o2hb_check_local_node_heartbeating()) {
+		if (o2hb_global_heartbeat_active())
+			mlog(ML_ERROR, "Global heartbeat not started\n");
 		rc = -EINVAL;
 		goto out;
 	}
-- 
1.7.0.4




[Ocfs2-devel] [PATCH 3/8] ocfs2: Add support for heartbeat=global mount option

2010-07-23 Thread Sunil Mushran
Adds support for heartbeat=global mount option.

Signed-off-by: Sunil Mushran sunil.mush...@oracle.com
---
 fs/ocfs2/ocfs2.h|4 ++-
 fs/ocfs2/ocfs2_fs.h |1 +
 fs/ocfs2/super.c|   55 ++-
 3 files changed, 45 insertions(+), 15 deletions(-)

diff --git a/fs/ocfs2/ocfs2.h b/fs/ocfs2/ocfs2.h
index 259015a..db96bbd 100644
--- a/fs/ocfs2/ocfs2.h
+++ b/fs/ocfs2/ocfs2.h
@@ -249,7 +249,7 @@ enum ocfs2_local_alloc_state
 
 enum ocfs2_mount_options
 {
-	OCFS2_MOUNT_HB_LOCAL   = 1 << 0, /* Heartbeat started in local mode */
+	OCFS2_MOUNT_HB_LOCAL = 1 << 0, /* Local heartbeat */
 	OCFS2_MOUNT_BARRIER = 1 << 1,   /* Use block barriers */
 	OCFS2_MOUNT_NOINTR  = 1 << 2,   /* Don't catch signals */
 	OCFS2_MOUNT_ERRORS_PANIC = 1 << 3, /* Panic on errors */
@@ -262,6 +262,8 @@ enum ocfs2_mount_options
 						   control lists */
 	OCFS2_MOUNT_USRQUOTA = 1 << 10, /* We support user quotas */
 	OCFS2_MOUNT_GRPQUOTA = 1 << 11, /* We support group quotas */
+	OCFS2_MOUNT_HB_NONE = 1 << 12, /* No heartbeat */
+	OCFS2_MOUNT_HB_GLOBAL = 1 << 13, /* Global heartbeat */
 };
 
 #define OCFS2_OSB_SOFT_RO  0x0001
diff --git a/fs/ocfs2/ocfs2_fs.h b/fs/ocfs2/ocfs2_fs.h
index c936cf0..e5507d5 100644
--- a/fs/ocfs2/ocfs2_fs.h
+++ b/fs/ocfs2/ocfs2_fs.h
@@ -367,6 +367,7 @@ static struct ocfs2_system_inode_info ocfs2_system_inodes[NUM_SYSTEM_INODES] = {
 /* Parameter passed from mount.ocfs2 to module */
 #define OCFS2_HB_NONE			"heartbeat=none"
 #define OCFS2_HB_LOCAL			"heartbeat=local"
+#define OCFS2_HB_GLOBAL			"heartbeat=global"
 
 /*
  * OCFS2 directory file types.  Only the low 3 bits are used.  The
diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index 6ecdc07..1e280eb 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -169,6 +169,7 @@ enum {
Opt_nointr,
Opt_hb_none,
Opt_hb_local,
+   Opt_hb_global,
Opt_data_ordered,
Opt_data_writeback,
Opt_atime_quantum,
@@ -202,6 +203,7 @@ static const match_table_t tokens = {
 	{Opt_nointr, "nointr"},
 	{Opt_hb_none, OCFS2_HB_NONE},
 	{Opt_hb_local, OCFS2_HB_LOCAL},
+	{Opt_hb_global, OCFS2_HB_GLOBAL},
 	{Opt_data_ordered, "data=ordered"},
 	{Opt_data_writeback, "data=writeback"},
 	{Opt_atime_quantum, "atime_quantum=%u"},
@@ -621,6 +623,7 @@ static int ocfs2_remount(struct super_block *sb, int *flags, char *data)
 	int ret = 0;
 	struct mount_options parsed_options;
 	struct ocfs2_super *osb = OCFS2_SB(sb);
+	u32 tmp;
 
 	lock_kernel();
 
@@ -630,8 +633,9 @@ static int ocfs2_remount(struct super_block *sb, int *flags, char *data)
 		goto out;
 	}
 
-	if ((osb->s_mount_opt & OCFS2_MOUNT_HB_LOCAL) !=
-	    (parsed_options.mount_opt & OCFS2_MOUNT_HB_LOCAL)) {
+	tmp = OCFS2_MOUNT_HB_LOCAL | OCFS2_MOUNT_HB_GLOBAL |
+		OCFS2_MOUNT_HB_NONE;
+	if ((osb->s_mount_opt & tmp) != (parsed_options.mount_opt & tmp)) {
 		ret = -EINVAL;
 		mlog(ML_ERROR, "Cannot change heartbeat mode on remount\n");
 		goto out;
@@ -824,23 +828,29 @@ bail:
 
 static int ocfs2_verify_heartbeat(struct ocfs2_super *osb)
 {
-	if (ocfs2_mount_local(osb)) {
-		if (osb->s_mount_opt & OCFS2_MOUNT_HB_LOCAL) {
+	u32 hb_enabled = OCFS2_MOUNT_HB_LOCAL | OCFS2_MOUNT_HB_GLOBAL;
+
+	if (osb->s_mount_opt & hb_enabled) {
+		if (ocfs2_mount_local(osb)) {
 			mlog(ML_ERROR, "Cannot heartbeat on a locally "
 			     "mounted device.\n");
 			return -EINVAL;
 		}
-	}
-
-	if (ocfs2_userspace_stack(osb)) {
-		if (osb->s_mount_opt & OCFS2_MOUNT_HB_LOCAL) {
+		if (ocfs2_userspace_stack(osb)) {
 			mlog(ML_ERROR, "Userspace stack expected, but "
 			     "o2cb heartbeat arguments passed to mount\n");
 			return -EINVAL;
 		}
+		if (((osb->s_mount_opt & OCFS2_MOUNT_HB_GLOBAL) &&
+		     !ocfs2_cluster_o2cb_global_heartbeat(osb)) ||
+		    ((osb->s_mount_opt & OCFS2_MOUNT_HB_LOCAL) &&
+		     ocfs2_cluster_o2cb_global_heartbeat(osb))) {
+			mlog(ML_ERROR, "Mismatching o2cb heartbeat modes\n");
+			return -EINVAL;
+		}
 	}
 
-	if (!(osb->s_mount_opt & OCFS2_MOUNT_HB_LOCAL)) {
+	if (!(osb->s_mount_opt & hb_enabled)) {
 		if (!ocfs2_mount_local(osb) && !ocfs2_is_hard_readonly(osb) &&
 		    !ocfs2_userspace_stack(osb)) {
 			mlog(ML_ERROR, "Heartbeat has to be started to mount "
@@ -1319,6 +1329,7 @@ static int ocfs2_parse_options(struct super_block *sb,
 {
int status;
char *p;
+  
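
The two heartbeat-mode checks in this patch (the remount comparison and the mount-time mismatch test) can be sketched in user space as follows. This is an illustration only: the macro and function names are hypothetical, the bit positions follow the patch's enum ocfs2_mount_options, and the cluster_is_global parameter stands in for ocfs2_cluster_o2cb_global_heartbeat(osb).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Bit positions as listed in the patch's enum ocfs2_mount_options. */
#define MOUNT_HB_LOCAL  (1u << 0)
#define MOUNT_HB_NONE   (1u << 12)
#define MOUNT_HB_GLOBAL (1u << 13)
#define MOUNT_HB_MASK   (MOUNT_HB_LOCAL | MOUNT_HB_NONE | MOUNT_HB_GLOBAL)

/* A remount must leave all three heartbeat bits unchanged. */
static int check_remount(unsigned int cur_opts, unsigned int new_opts)
{
	if ((cur_opts & MOUNT_HB_MASK) != (new_opts & MOUNT_HB_MASK))
		return -EINVAL;
	return 0;
}

/* The mount option and the on-disk cluster mode must agree. */
static int check_hb_mode(unsigned int opts, bool cluster_is_global)
{
	bool want_global = (opts & MOUNT_HB_GLOBAL) != 0;
	bool want_local = (opts & MOUNT_HB_LOCAL) != 0;

	if ((want_global && !cluster_is_global) ||
	    (want_local && cluster_is_global))
		return -EINVAL;
	return 0;
}

/* Usage: a few cases showing which combinations are rejected. */
static void demo_checks(void)
{
	assert(check_remount(MOUNT_HB_LOCAL, MOUNT_HB_LOCAL) == 0);
	assert(check_remount(MOUNT_HB_LOCAL, MOUNT_HB_GLOBAL) == -EINVAL);
	assert(check_hb_mode(MOUNT_HB_GLOBAL, true) == 0);
	assert(check_hb_mode(MOUNT_HB_LOCAL, true) == -EINVAL);
}
```

Comparing the masked option words in one step, as the patch does with tmp, rejects any change among local, global and none without enumerating the pairs.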

[Ocfs2-devel] [PATCH 2/8] ocfs2: Add an incompat feature flag OCFS2_FEATURE_INCOMPAT_CLUSTERINFO

2010-07-23 Thread Sunil Mushran
OCFS2_FEATURE_INCOMPAT_CLUSTERINFO allows us to use sb->s_cluster_info for
both userspace and o2cb cluster stacks. It also allows us to extend cluster
info to include stack flags.

This patch also adds stackflags to sb->s_cluster_info. It also introduces a
clusterinfo flag, OCFS2_CLUSTER_O2CB_GLOBAL_HEARTBEAT, to denote that global
heartbeat mode is enabled.

This incompat flag can be set/cleared using tunefs.ocfs2 --fs-features. The
clusterinfo flag is set/cleared using tunefs.ocfs2 --update-cluster-stack.

Signed-off-by: Sunil Mushran sunil.mush...@oracle.com
---
 fs/ocfs2/ocfs2.h|   31 +--
 fs/ocfs2/ocfs2_fs.h |   40 ++--
 fs/ocfs2/super.c|4 +++-
 3 files changed, 66 insertions(+), 9 deletions(-)

diff --git a/fs/ocfs2/ocfs2.h b/fs/ocfs2/ocfs2.h
index 5a3d08d..259015a 100644
--- a/fs/ocfs2/ocfs2.h
+++ b/fs/ocfs2/ocfs2.h
@@ -366,6 +366,8 @@ struct ocfs2_super
struct ocfs2_alloc_stats alloc_stats;
char dev_str[20];   /* major,minor of the device */
 
+   u8 osb_stackflags;
+
char osb_cluster_stack[OCFS2_STACK_LABEL_LEN + 1];
struct ocfs2_cluster_connection *cconn;
struct ocfs2_lock_res osb_super_lockres;
@@ -592,10 +594,35 @@ static inline int ocfs2_is_soft_readonly(struct ocfs2_super *osb)
 	return ret;
 }
 
-static inline int ocfs2_userspace_stack(struct ocfs2_super *osb)
+static inline int ocfs2_clusterinfo_valid(struct ocfs2_super *osb)
 {
 	return (osb->s_feature_incompat &
-		OCFS2_FEATURE_INCOMPAT_USERSPACE_STACK);
+		(OCFS2_FEATURE_INCOMPAT_USERSPACE_STACK |
+		 OCFS2_FEATURE_INCOMPAT_CLUSTERINFO));
+}
+
+static inline int ocfs2_userspace_stack(struct ocfs2_super *osb)
+{
+	if (ocfs2_clusterinfo_valid(osb) &&
+	    memcmp(osb->osb_cluster_stack, OCFS2_CLASSIC_CLUSTER_STACK,
+		   OCFS2_STACK_LABEL_LEN))
+		return 1;
+	return 0;
+}
+
+static inline int ocfs2_o2cb_stack(struct ocfs2_super *osb)
+{
+	if (ocfs2_clusterinfo_valid(osb) &&
+	    !memcmp(osb->osb_cluster_stack, OCFS2_CLASSIC_CLUSTER_STACK,
+		   OCFS2_STACK_LABEL_LEN))
+		return 1;
+	return 0;
+}
+
+static inline int ocfs2_cluster_o2cb_global_heartbeat(struct ocfs2_super *osb)
+{
+	return ocfs2_o2cb_stack(osb) &&
+		(osb->osb_stackflags & OCFS2_CLUSTER_O2CB_GLOBAL_HEARTBEAT);
 }
 
 static inline int ocfs2_mount_local(struct ocfs2_super *osb)
diff --git a/fs/ocfs2/ocfs2_fs.h b/fs/ocfs2/ocfs2_fs.h
index bb37218..c936cf0 100644
--- a/fs/ocfs2/ocfs2_fs.h
+++ b/fs/ocfs2/ocfs2_fs.h
@@ -100,7 +100,8 @@
 					 | OCFS2_FEATURE_INCOMPAT_XATTR \
 					 | OCFS2_FEATURE_INCOMPAT_META_ECC \
 					 | OCFS2_FEATURE_INCOMPAT_INDEXED_DIRS \
-					 | OCFS2_FEATURE_INCOMPAT_REFCOUNT_TREE)
+					 | OCFS2_FEATURE_INCOMPAT_REFCOUNT_TREE \
+					 | OCFS2_FEATURE_INCOMPAT_CLUSTERINFO)
 #define OCFS2_FEATURE_RO_COMPAT_SUPP	(OCFS2_FEATURE_RO_COMPAT_UNWRITTEN \
 					 | OCFS2_FEATURE_RO_COMPAT_USRQUOTA \
 					 | OCFS2_FEATURE_RO_COMPAT_GRPQUOTA)
@@ -166,6 +167,13 @@
 #define OCFS2_FEATURE_INCOMPAT_REFCOUNT_TREE	0x1000
 
 /*
+ * Incompat bit to indicate useable clusterinfo with stackflags for all
+ * cluster stacks (userspace and o2cb). If this bit is set,
+ * INCOMPAT_USERSPACE_STACK becomes superfluous and thus should not be set.
+ */
+#define OCFS2_FEATURE_INCOMPAT_CLUSTERINFO	0x2000
+
+/*
  * backup superblock flag is used to indicate that this volume
  * has backup superblocks.
  */
@@ -275,10 +283,13 @@
 #define OCFS2_VOL_UUID_LEN		16
 #define OCFS2_MAX_VOL_LABEL_LEN		64
 
-/* The alternate, userspace stack fields */
+/* The cluster stack fields */
 #define OCFS2_STACK_LABEL_LEN		4
 #define OCFS2_CLUSTER_NAME_LEN		16
 
+/* Classic (historically speaking) cluster stack */
+#define OCFS2_CLASSIC_CLUSTER_STACK	"o2cb"
+
 /* Journal limits (in bytes) */
 #define OCFS2_MIN_JOURNAL_SIZE		(4 * 1024 * 1024)
 
@@ -296,6 +307,11 @@
  */
 #define OCFS2_MIN_XATTR_INLINE_SIZE     256
 
+/*
+ * Cluster info flags (ocfs2_cluster_info.ci_stackflags)
+ */
+#define OCFS2_CLUSTER_O2CB_GLOBAL_HEARTBEAT	(0x01)
+
 struct ocfs2_system_inode_info {
 	char	*si_name;
 	int	si_iflags;
@@ -554,9 +570,21 @@ struct ocfs2_slot_map_extended {
  */
 };
 
+/*
+ * ci_stackflags is only valid if the incompat bit
+ * OCFS2_FEATURE_INCOMPAT_CLUSTERINFO is set.
+ */
 struct ocfs2_cluster_info {
 /*00*/ __u8   ci_stack[OCFS2_STACK_LABEL_LEN];
-   __le32 ci_reserved;
+   union {
+   __le32 ci_reserved;
+   struct {
+   __u8 ci_reserved1;
+ 
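
A small user-space sketch of how the helpers in this patch classify a volume's cluster stack. It is only an illustration: struct sb_model and the function names are hypothetical, the value used for the USERSPACE_STACK bit is an assumption (the patch does not show it), and the other constants follow the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define INCOMPAT_USERSPACE_STACK 0x0080u /* assumed value, not shown in the patch */
#define INCOMPAT_CLUSTERINFO     0x2000u /* value from the patch */
#define STACK_LABEL_LEN          4
#define CLASSIC_CLUSTER_STACK    "o2cb"
#define O2CB_GLOBAL_HEARTBEAT    0x01u

/* Hypothetical stand-in for the fields the helpers read from ocfs2_super. */
struct sb_model {
	unsigned int feature_incompat;
	char cluster_stack[STACK_LABEL_LEN + 1];
	unsigned char stackflags;
};

/* Cluster info is usable if either incompat bit is set. */
static bool clusterinfo_valid(const struct sb_model *sb)
{
	return (sb->feature_incompat &
		(INCOMPAT_USERSPACE_STACK | INCOMPAT_CLUSTERINFO)) != 0;
}

/* Valid cluster info whose label is "o2cb". */
static bool is_o2cb_stack(const struct sb_model *sb)
{
	return clusterinfo_valid(sb) &&
	       memcmp(sb->cluster_stack, CLASSIC_CLUSTER_STACK,
		      STACK_LABEL_LEN) == 0;
}

/* Valid cluster info whose label is anything other than "o2cb". */
static bool is_userspace_stack(const struct sb_model *sb)
{
	return clusterinfo_valid(sb) &&
	       memcmp(sb->cluster_stack, CLASSIC_CLUSTER_STACK,
		      STACK_LABEL_LEN) != 0;
}

/* Global heartbeat only applies to the o2cb stack. */
static bool o2cb_global_heartbeat(const struct sb_model *sb)
{
	return is_o2cb_stack(sb) &&
	       (sb->stackflags & O2CB_GLOBAL_HEARTBEAT) != 0;
}

/* Usage: an o2cb volume with the flag set, and a userspace-stack volume. */
static void demo_checks(void)
{
	struct sb_model o2cb = { INCOMPAT_CLUSTERINFO, "o2cb",
				 O2CB_GLOBAL_HEARTBEAT };
	struct sb_model pcmk = { INCOMPAT_USERSPACE_STACK, "pcmk", 0 };

	assert(is_o2cb_stack(&o2cb) && o2cb_global_heartbeat(&o2cb));
	assert(is_userspace_stack(&pcmk) && !o2cb_global_heartbeat(&pcmk));
}
```

Note that in this model, as in the patch's helpers, a volume with neither incompat bit set is reported as neither an o2cb stack nor a userspace stack.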

[Ocfs2-devel] [PATCH 5/8] ocfs2/cluster: Get all heartbeat regions

2010-07-23 Thread Sunil Mushran
Export function in o2hb to get a list of heartbeat regions.

Signed-off-by: Sunil Mushran sunil.mush...@oracle.com
---
 fs/ocfs2/cluster/heartbeat.c |   34 ++
 fs/ocfs2/cluster/heartbeat.h |4 
 2 files changed, 38 insertions(+), 0 deletions(-)

diff --git a/fs/ocfs2/cluster/heartbeat.c b/fs/ocfs2/cluster/heartbeat.c
index 1107629..00a7fd6 100644
--- a/fs/ocfs2/cluster/heartbeat.c
+++ b/fs/ocfs2/cluster/heartbeat.c
@@ -1629,6 +1629,9 @@ static struct config_item *o2hb_heartbeat_group_make_item(struct config_group *g
 	if (reg == NULL)
 		return ERR_PTR(-ENOMEM);
 
+	if (strlen(name) > O2HB_MAX_REGION_NAME_LEN)
+		return ERR_PTR(-ENAMETOOLONG);
+
 	config_item_init_type_name(&reg->hr_item, name, &o2hb_region_type);
 
 	spin_lock(&o2hb_live_lock);
@@ -2039,3 +2042,34 @@ void o2hb_stop_all_regions(void)
 	spin_unlock(&o2hb_live_lock);
 }
 EXPORT_SYMBOL_GPL(o2hb_stop_all_regions);
+
+int o2hb_get_all_regions(char *region_uuids, u8 max_regions)
+{
+	struct o2hb_region *reg;
+	int numregs = 0;
+	char *p;
+
+	spin_lock(&o2hb_live_lock);
+
+	p = region_uuids;
+	list_for_each_entry(reg, &o2hb_all_regions, hr_all_item) {
+		mlog(ML_NOTICE, "Region: %s\n", config_item_name(&reg->hr_item));
+		if (numregs < max_regions) {
+			memcpy(p, config_item_name(&reg->hr_item),
+			       O2HB_MAX_REGION_NAME_LEN);
+			p += O2HB_MAX_REGION_NAME_LEN;
+		}
+		numregs++;
+	}
+
+	spin_unlock(&o2hb_live_lock);
+
+	return numregs;
+}
+EXPORT_SYMBOL_GPL(o2hb_get_all_regions);
+
+int o2hb_global_heartbeat_active(void)
+{
+	return (o2hb_heartbeat_mode == O2HB_HEARTBEAT_GLOBAL);
+}
+EXPORT_SYMBOL(o2hb_global_heartbeat_active);
diff --git a/fs/ocfs2/cluster/heartbeat.h b/fs/ocfs2/cluster/heartbeat.h
index 2f16492..00ad8e8 100644
--- a/fs/ocfs2/cluster/heartbeat.h
+++ b/fs/ocfs2/cluster/heartbeat.h
@@ -31,6 +31,8 @@
 
 #define O2HB_REGION_TIMEOUT_MS 2000
 
+#define O2HB_MAX_REGION_NAME_LEN   32
+
 /* number of changes to be seen as live */
 #define O2HB_LIVE_THRESHOLD   2
 /* number of equal samples to be seen as dead */
@@ -81,5 +83,7 @@ int o2hb_check_node_heartbeating(u8 node_num);
 int o2hb_check_node_heartbeating_from_callback(u8 node_num);
 int o2hb_check_local_node_heartbeating(void);
 void o2hb_stop_all_regions(void);
+int o2hb_get_all_regions(char *region_uuids, u8 numregions);
+int o2hb_global_heartbeat_active(void);
 
 #endif /* O2CLUSTER_HEARTBEAT_H */
-- 
1.7.0.4
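
To illustrate the buffer contract that o2hb_get_all_regions() implies for its caller (max_regions fixed-width slots of O2HB_MAX_REGION_NAME_LEN bytes, with the return value counting every region even when not all fit), here is a user-space sketch. It is not the kernel function: the region list is passed in as a plain array, the names are hypothetical, and unlike the patch's memcpy it copies only the bytes of each name and zero-fills the rest of the slot.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_REGION_NAME_LEN 32 /* same width as O2HB_MAX_REGION_NAME_LEN */

/*
 * Copy up to max_regions names into fixed-width slots of region_uuids.
 * Returns the total number of regions, which may exceed max_regions.
 */
static int get_all_regions(const char *const *names, size_t nnames,
			   char *region_uuids, unsigned char max_regions)
{
	int numregs = 0;
	char *p = region_uuids;

	for (size_t i = 0; i < nnames; i++) {
		if (numregs < max_regions) {
			size_t len = strlen(names[i]);

			if (len > MAX_REGION_NAME_LEN)
				len = MAX_REGION_NAME_LEN;
			memset(p, 0, MAX_REGION_NAME_LEN);
			memcpy(p, names[i], len);
			p += MAX_REGION_NAME_LEN;
		}
		numregs++;
	}
	return numregs;
}

/* Usage: three regions exist, but the caller has room for only two. */
static void demo_checks(void)
{
	const char *names[] = { "REGION_A", "REGION_B", "REGION_C" };
	char buf[2 * MAX_REGION_NAME_LEN];

	assert(get_all_regions(names, 3, buf, 2) == 3);
	assert(strcmp(buf, "REGION_A") == 0);
	assert(strcmp(buf + MAX_REGION_NAME_LEN, "REGION_B") == 0);
}
```

A caller can compare the return value with max_regions to detect that the buffer was too small to hold every region.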




[Ocfs2-devel] [PATCH 8/8] ocfs2/dlm: Add message DLM_QUERY_NODEINFO

2010-07-23 Thread Sunil Mushran
Signed-off-by: Sunil Mushran sunil.mush...@oracle.com
---
 fs/ocfs2/dlm/dlmcommon.h |   17 
 fs/ocfs2/dlm/dlmdomain.c |  188 +-
 2 files changed, 204 insertions(+), 1 deletions(-)

diff --git a/fs/ocfs2/dlm/dlmcommon.h b/fs/ocfs2/dlm/dlmcommon.h
index 2c05138..34d9cd8 100644
--- a/fs/ocfs2/dlm/dlmcommon.h
+++ b/fs/ocfs2/dlm/dlmcommon.h
@@ -447,6 +447,7 @@ enum {
DLM_BEGIN_RECO_MSG,  /* 517 */
DLM_FINALIZE_RECO_MSG,   /* 518 */
DLM_QUERY_HBREGION,  /* 519 */
+   DLM_QUERY_NODEINFO,  /* 520 */
 };
 
 struct dlm_reco_node_data
@@ -737,6 +738,22 @@ struct dlm_query_hbregion {
u8 qhb_hbregions[O2HB_MAX_REGION_NAME_LEN * O2NM_MAX_HBREGIONS];
 };
 
+struct dlm_node_info {
+   u8 ni_nodenum;
+   u8 pad1;
+   u16 ni_ipv4_port;
+   u32 ni_ipv4_address;
+};
+
+struct dlm_query_nodeinfo {
+   u8 qn_nodenum;
+   u8 qn_numnodes;
+   u8 qn_namelen;
+   u8 pad1;
+   u8 qn_domain[O2NM_MAX_NAME_LEN];
+   struct dlm_node_info qn_nodes[O2NM_MAX_NODES];
+};
+
 struct dlm_exit_domain
 {
u8 node_idx;
diff --git a/fs/ocfs2/dlm/dlmdomain.c b/fs/ocfs2/dlm/dlmdomain.c
index 3521a00..2325087 100644
--- a/fs/ocfs2/dlm/dlmdomain.c
+++ b/fs/ocfs2/dlm/dlmdomain.c
@@ -131,6 +131,7 @@ static DECLARE_WAIT_QUEUE_HEAD(dlm_domain_events);
  *
  * New in version 1.1:
  * - Message DLM_QUERY_HBREGION added to support global heartbeat
+ * - Message DLM_QUERY_NODEINFO added to allow online node removes
  */
 static const struct dlm_protocol_version dlm_protocol = {
.pv_major = 1,
@@ -1122,6 +1123,179 @@ bail:
return status;
 }
 
+static int dlm_match_nodes(struct dlm_ctxt *dlm, struct dlm_query_nodeinfo *qn)
+{
+	struct o2nm_node *local;
+	struct dlm_node_info *remote;
+	int i, j;
+	int status = 0;
+
+	for (j = 0; j < qn->qn_numnodes; ++j)
+		mlog(ML_NOTICE, "Node %3d, %u.%u.%u.%u:%u\n",
+		     qn->qn_nodes[j].ni_nodenum,
+		     NIPQUAD(qn->qn_nodes[j].ni_ipv4_address),
+		     ntohs(qn->qn_nodes[j].ni_ipv4_port));
+
+	for (i = 0; i < O2NM_MAX_NODES && !status; ++i) {
+		local = o2nm_get_node_by_num(i);
+		remote = NULL;
+		for (j = 0; j < qn->qn_numnodes; ++j) {
+			if (qn->qn_nodes[j].ni_nodenum == i) {
+				remote = &(qn->qn_nodes[j]);
+				break;
+			}
+		}
+
+		if (!local && !remote)
+			continue;
+
+		if ((local && !remote) || (!local && remote))
+			status = -EINVAL;
+
+		if (!status &&
+		    ((remote->ni_nodenum != local->nd_num) ||
+		     (remote->ni_ipv4_port != local->nd_ipv4_port) ||
+		     (remote->ni_ipv4_address != local->nd_ipv4_address)))
+			status = -EINVAL;
+
+		if (status) {
+			if (remote && !local)
+				mlog(ML_ERROR, "Domain %s: Node %d "
+				     "(%u.%u.%u.%u:%u) registered in joining "
+				     "node %d but not in local node %d\n",
+				     qn->qn_domain, remote->ni_nodenum,
+				     NIPQUAD(remote->ni_ipv4_address),
+				     ntohs(remote->ni_ipv4_port),
+				     qn->qn_nodenum, dlm->node_num);
+			if (local && !remote)
+				mlog(ML_ERROR, "Domain %s: Node %d "
+				     "(%u.%u.%u.%u:%u) registered in local "
+				     "node %d but not in joining node %d\n",
+				     qn->qn_domain, local->nd_num,
+				     NIPQUAD(local->nd_ipv4_address),
+				     ntohs(local->nd_ipv4_port),
+				     dlm->node_num, qn->qn_nodenum);
+			BUG_ON((!local && !remote));
+		}
+
+		if (local)
+			o2nm_node_put(local);
+	}
+
+	return status;
+}
+
+static int dlm_send_nodeinfo(struct dlm_ctxt *dlm, unsigned long *node_map)
+{
+	struct dlm_query_nodeinfo *qn = NULL;
+	struct o2nm_node *node;
+	int ret = 0, status, count, i;
+
+	if (find_next_bit(node_map, O2NM_MAX_NODES, 0) >= O2NM_MAX_NODES)
+		goto bail;
+
+	qn = kmalloc(sizeof(struct dlm_query_nodeinfo), GFP_KERNEL);
+	if (!qn) {
+		ret = -ENOMEM;
+		mlog_errno(ret);
+		goto bail;
+	}
+
+	memset(qn, 0, sizeof(struct dlm_query_nodeinfo));
+
+	for (i = 0, count = 0; i < O2NM_MAX_NODES; ++i) {
+		node = o2nm_get_node_by_num(i);
+		if (!node)
+			continue;
+ 
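
The comparison that dlm_match_nodes() performs can be sketched in user space as follows: for every possible node number, the node must be configured on both sides with the same port and address, or on neither side. This is a simplified illustration; struct node_info and the function names are hypothetical, and both configurations are passed in as plain arrays rather than looked up through o2nm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one configured node. */
struct node_info {
	uint8_t  num;
	uint16_t port;
	uint32_t addr;
};

/* Look up a node by number; NULL when it is not configured. */
static const struct node_info *find_node(const struct node_info *v,
					 size_t n, uint8_t num)
{
	for (size_t i = 0; i < n; i++)
		if (v[i].num == num)
			return &v[i];
	return NULL;
}

/* Returns 0 when both node lists describe the same cluster, else -1. */
static int match_nodes(const struct node_info *local, size_t nlocal,
		       const struct node_info *remote, size_t nremote,
		       int max_nodes)
{
	for (int i = 0; i < max_nodes; i++) {
		const struct node_info *l = find_node(local, nlocal, (uint8_t)i);
		const struct node_info *r = find_node(remote, nremote, (uint8_t)i);

		if (!l && !r)
			continue;	/* unused node number on both sides */
		if (!l || !r)
			return -1;	/* configured on one side only */
		if (l->port != r->port || l->addr != r->addr)
			return -1;	/* same number, different endpoint */
	}
	return 0;
}

/* Usage: identical lists match; a list missing one node does not. */
static void demo_checks(void)
{
	const struct node_info a[] = { { 0, 7777, 0x0a000001 },
				       { 1, 7777, 0x0a000002 } };
	const struct node_info b[] = { { 0, 7777, 0x0a000001 } };

	assert(match_nodes(a, 2, a, 2, 255) == 0);
	assert(match_nodes(a, 2, b, 1, 255) == -1);
}
```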

[Ocfs2-devel] Global heartbeat - drop#1

2010-07-23 Thread Sunil Mushran

This is the first drop of the global heartbeat patches for ocfs2/kernel.

The first few patches add support for heartbeat mode in configfs, the new
incompat clusterinfo flag and the new mount option heartbeat=global.

0001-ocfs2-cluster-Add-heartbeat-mode-configfs-parameter.patch
0002-ocfs2-Add-an-incompat-feature-flag-OCFS2_FEATURE_INC.patch
0003-ocfs2-Add-support-for-heartbeat-global-mount-option.patch
0004-ocfs2-dlm-Expose-dlm_protocol-in-dlm_state.patch

The next few patches enhance the join domain protocol to get the list
of configured nodes and heartbeating regions to ensure that all nodes
in the cluster have the same view of the cluster.

0005-ocfs2-cluster-Get-all-heartbeat-regions.patch
0006-ocfs2-dlm-Add-message-DLM_QUERY_HBREGION.patch
0007-ocfs2-Print-message-if-user-mounts-without-starting-.patch
0008-ocfs2-dlm-Add-message-DLM_QUERY_NODEINFO.patch

The one known missing bit concerns quorum calculation. I am still
working on it.

http://oss.oracle.com/osswiki/OCFS2/DesignDocs/NewGlobalHeartbeat

Thanks
Sunil



[Ocfs2-devel] [PATCH 1/8] ocfs2/cluster: Add heartbeat mode configfs parameter

2010-07-23 Thread Sunil Mushran
Add heartbeat mode parameter to the configfs tree. This will be used
to set/show the heartbeat mode.

Signed-off-by: Sunil Mushran sunil.mush...@oracle.com
---
 fs/ocfs2/cluster/heartbeat.c |   70 ++
 1 files changed, 70 insertions(+), 0 deletions(-)

diff --git a/fs/ocfs2/cluster/heartbeat.c b/fs/ocfs2/cluster/heartbeat.c
index d191f45..1107629 100644
--- a/fs/ocfs2/cluster/heartbeat.c
+++ b/fs/ocfs2/cluster/heartbeat.c
@@ -76,7 +76,19 @@ static struct o2hb_callback *hbcall_from_type(enum o2hb_callback_type type);
 
 #define O2HB_DEFAULT_BLOCK_BITS       9
 
+enum o2hb_heartbeat_modes {
+	O2HB_HEARTBEAT_LOCAL		= 0,
+	O2HB_HEARTBEAT_GLOBAL,
+	O2HB_HEARTBEAT_NUM_MODES,
+};
+
+char *o2hb_heartbeat_mode_desc[O2HB_HEARTBEAT_NUM_MODES] = {
+		"local",	/* O2HB_HEARTBEAT_LOCAL */
+		"global",	/* O2HB_HEARTBEAT_GLOBAL */
+};
+
 unsigned int o2hb_dead_threshold = O2HB_DEFAULT_DEAD_THRESHOLD;
+unsigned int o2hb_heartbeat_mode = O2HB_HEARTBEAT_LOCAL;
 
 /* Only sets a new threshold if there are no active regions.
  *
@@ -93,6 +105,22 @@ static void o2hb_dead_threshold_set(unsigned int threshold)
 	}
 }
 
+static int o2hb_global_hearbeat_mode_set(unsigned int hb_mode)
+{
+	int ret = -1;
+
+	if (hb_mode < O2HB_HEARTBEAT_NUM_MODES) {
+		spin_lock(&o2hb_live_lock);
+		if (list_empty(&o2hb_all_regions)) {
+			o2hb_heartbeat_mode = hb_mode;
+			ret = 0;
+		}
+		spin_unlock(&o2hb_live_lock);
+	}
+
+	return ret;
+}
+
 struct o2hb_node_event {
 	struct list_head	hn_item;
 	enum o2hb_callback_type hn_event_type;
@@ -1694,6 +1722,39 @@ static ssize_t o2hb_heartbeat_group_threshold_store(struct o2hb_heartbeat_group
 	return count;
 }
 
+static
+ssize_t o2hb_heartbeat_group_mode_show(struct o2hb_heartbeat_group *group,
+				       char *page)
+{
+	return sprintf(page, "%s\n",
+		       o2hb_heartbeat_mode_desc[o2hb_heartbeat_mode]);
+}
+
+static
+ssize_t o2hb_heartbeat_group_mode_store(struct o2hb_heartbeat_group *group,
+					const char *page, size_t count)
+{
+	unsigned int i;
+	int ret;
+	size_t len;
+
+	len = (page[count - 1] == '\n') ? count - 1 : count;
+
+	for (i = 0; i < O2HB_HEARTBEAT_NUM_MODES; ++i) {
+		if (strnicmp(page, o2hb_heartbeat_mode_desc[i], len))
+			continue;
+
+		ret = o2hb_global_hearbeat_mode_set(i);
+		if (!ret)
+			printk(KERN_INFO "ocfs2: Heartbeat mode set to %s\n",
+			       o2hb_heartbeat_mode_desc[i]);
+		return count;
+	}
+
+	return -EINVAL;
+
+}
+
 static struct o2hb_heartbeat_group_attribute o2hb_heartbeat_group_attr_threshold = {
 	.attr	= { .ca_owner = THIS_MODULE,
 		    .ca_name = "dead_threshold",
@@ -1702,8 +1763,17 @@ static struct o2hb_heartbeat_group_attribute o2hb_heartbeat_group_attr_threshold
 	.store	= o2hb_heartbeat_group_threshold_store,
 };
 
+static struct o2hb_heartbeat_group_attribute o2hb_heartbeat_group_attr_mode = {
+	.attr   = { .ca_owner = THIS_MODULE,
+		    .ca_name = "mode",
+		    .ca_mode = S_IRUGO | S_IWUSR },
+	.show   = o2hb_heartbeat_group_mode_show,
+	.store  = o2hb_heartbeat_group_mode_store,
+};
+
 static struct configfs_attribute *o2hb_heartbeat_group_attrs[] = {
 	&o2hb_heartbeat_group_attr_threshold.attr,
+	&o2hb_heartbeat_group_attr_mode.attr,
 	NULL,
 };
 
 
-- 
1.7.0.4
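
A user-space sketch of parsing the string written to the configfs "mode" attribute, for readers following o2hb_heartbeat_group_mode_store(). It is a stricter variant rather than a copy of the patch's logic: it requires the input length to equal the mode name's length (so a prefix such as "glo" is rejected) and it rejects empty input, whereas the patch compares only the first len characters. The names are hypothetical, and POSIX strncasecmp() stands in for the kernel's strnicmp().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h> /* strncasecmp (POSIX) */

enum hb_mode { HB_LOCAL = 0, HB_GLOBAL, HB_NUM_MODES };

static const char *const hb_mode_desc[HB_NUM_MODES] = { "local", "global" };

/*
 * Parse count bytes at page, ignoring one trailing newline.
 * Returns the matching mode index, or -1 if nothing matches.
 */
static int parse_hb_mode(const char *page, size_t count)
{
	size_t len;

	if (count == 0)
		return -1;
	len = (page[count - 1] == '\n') ? count - 1 : count;

	for (int i = 0; i < HB_NUM_MODES; i++) {
		if (len != strlen(hb_mode_desc[i]))
			continue;	/* exact length only, no prefixes */
		if (strncasecmp(page, hb_mode_desc[i], len) == 0)
			return i;
	}
	return -1;
}

/* Usage: what "echo global > mode" would hand to the store callback. */
static void demo_checks(void)
{
	assert(parse_hb_mode("global\n", 7) == HB_GLOBAL);
	assert(parse_hb_mode("LOCAL", 5) == HB_LOCAL);
	assert(parse_hb_mode("glo", 3) == -1);
}
```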




[Ocfs2-devel] [PATCH 4/8] ocfs2/dlm: Expose dlm_protocol in dlm_state

2010-07-23 Thread Sunil Mushran
Add dlm_protocol to the list of info shown by the debugfs file, dlm_state.

Signed-off-by: Sunil Mushran sunil.mush...@oracle.com
---
 fs/ocfs2/dlm/dlmdebug.c |4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/fs/ocfs2/dlm/dlmdebug.c b/fs/ocfs2/dlm/dlmdebug.c
index 75efd45..cf27d81 100644
--- a/fs/ocfs2/dlm/dlmdebug.c
+++ b/fs/ocfs2/dlm/dlmdebug.c
@@ -779,7 +779,9 @@ static int debug_state_print(struct dlm_ctxt *dlm, struct debug_buffer *db)
 
 	/* Domain: xx  Key: 0xdfbac769 */
 	out += snprintf(db->buf + out, db->len - out,
-			"Domain: %s  Key: 0x%08x\n", dlm->name, dlm->key);
+			"Domain: %s  Key: 0x%08x  Protocol: %d.%d\n",
+			dlm->name, dlm->key, dlm->dlm_locking_proto.pv_major,
+			dlm->dlm_locking_proto.pv_minor);
 
 	/* Thread Pid: xxx  Node: xxx  State: x */
 	out += snprintf(db->buf + out, db->len - out,
-- 
1.7.0.4

