Re: [Ocfs2-devel] [PATCH V2] Fix the nested PR lock calling issue
Hi, Sunil,

I think putting them in ocfs2_check_acl() is better. First, checking the mount option in ocfs2_check_acl() is clearer than doing it in _ocfs2_get_acl(). Second, we already have ocfs2_get_acl and ocfs2_get_acl_nolock, so _ocfs2_get_acl seems redundant and could be confusing when reading the code.

Regards,
tiger

On 07/21/2010 02:26 PM, Sunil Mushran wrote:
> Why not add _ocfs2_get_acl() that does the same without taking the cluster locks?

___
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-devel
[Ocfs2-devel] [PATCH] ocfs2/dlm: avoid incorrect bit set in refmap on recovery master
In the following situation, an incorrect bit remains in the refmap on the recovery master. The recovery master will eventually fail to purge the lockres because of that bit.

1) node A has no interest in lockres A any longer, so it is purging it.
2) the owner of lockres A is node B, so node A is sending a de-ref message to node B.
3) at this time, node B crashes. node C becomes the recovery master. it recovers lockres A (because the master is the dead node B).
4) node A migrates lockres A to node C with a refbit there.
5) node A fails to send the de-ref message to node B because it crashed. The failure is ignored. no other action is taken for lockres A any more.

Normally, re-sending the deref message to the recovery master would fix it. However, ignoring the failure of the deref to the original master and not recovering the lockres to the recovery master has the same effect, and the latter is simpler.

Signed-off-by: Wengang Wang <wen.gang.w...@oracle.com>
---
 fs/ocfs2/dlm/dlmrecovery.c |   17 +++--
 fs/ocfs2/dlm/dlmthread.c   |   28 +---
 2 files changed, 32 insertions(+), 13 deletions(-)

diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
index 9dfaac7..06640f6 100644
--- a/fs/ocfs2/dlm/dlmrecovery.c
+++ b/fs/ocfs2/dlm/dlmrecovery.c
@@ -1997,6 +1997,8 @@ void dlm_move_lockres_to_recovery_list(struct dlm_ctxt *dlm,
 	struct list_head *queue;
 	struct dlm_lock *lock, *next;
 
+	assert_spin_locked(&dlm->spinlock);
+	assert_spin_locked(&res->spinlock);
 	res->state |= DLM_LOCK_RES_RECOVERING;
 	if (!list_empty(&res->recovering)) {
 		mlog(0,
@@ -2336,9 +2338,20 @@ static void dlm_do_local_recovery_cleanup(struct dlm_ctxt *dlm, u8 dead_node)
 
 				/* the wake_up for this will happen when the
 				 * RECOVERING flag is dropped later */
-				res->state &= ~DLM_LOCK_RES_DROPPING_REF;
+				if (res->state & DLM_LOCK_RES_DROPPING_REF) {
+					/*
+					 * don't migrate a lockres which is in
+					 * progress of dropping ref
+					 */
+					mlog(ML_NOTICE, "%.*s ignored for "
+					     "migration\n", res->lockname.len,
+					     res->lockname.name);
+					res->state &=
+						~DLM_LOCK_RES_DROPPING_REF;
+				} else
+					dlm_move_lockres_to_recovery_list(dlm,
+									  res);
 
-				dlm_move_lockres_to_recovery_list(dlm, res);
 			} else if (res->owner == dlm->node_num) {
 				dlm_free_dead_locks(dlm, res, dead_node);
 				__dlm_lockres_calc_usage(dlm, res);
diff --git a/fs/ocfs2/dlm/dlmthread.c b/fs/ocfs2/dlm/dlmthread.c
index dd78ca3..47420ce 100644
--- a/fs/ocfs2/dlm/dlmthread.c
+++ b/fs/ocfs2/dlm/dlmthread.c
@@ -92,17 +92,23 @@ int __dlm_lockres_has_locks(struct dlm_lock_resource *res)
  * truly ready to be freed. */
 int __dlm_lockres_unused(struct dlm_lock_resource *res)
 {
-	if (!__dlm_lockres_has_locks(res) &&
-	    (list_empty(&res->dirty) && !(res->state & DLM_LOCK_RES_DIRTY))) {
-		/* try not to scan the bitmap unless the first two
-		 * conditions are already true */
-		int bit = find_next_bit(res->refmap, O2NM_MAX_NODES, 0);
-		if (bit >= O2NM_MAX_NODES) {
-			/* since the bit for dlm->node_num is not
-			 * set, inflight_locks better be zero */
-			BUG_ON(res->inflight_locks != 0);
-			return 1;
-		}
+	int bit;
+
+	if (__dlm_lockres_has_locks(res))
+		return 0;
+
+	if (!list_empty(&res->dirty) || res->state & DLM_LOCK_RES_DIRTY)
+		return 0;
+
+	if (res->state & DLM_LOCK_RES_RECOVERING)
+		return 0;
+
+	bit = find_next_bit(res->refmap, O2NM_MAX_NODES, 0);
+	if (bit >= O2NM_MAX_NODES) {
+		/* since the bit for dlm->node_num is not
+		 * set, inflight_locks better be zero */
+		BUG_ON(res->inflight_locks != 0);
+		return 1;
 	}
 	return 0;
 }
--
1.7.1.1
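For readers following the dlmthread.c hunk, here is a minimal userspace sketch of the purge decision after the patch. It is only a model, not the kernel code: struct model_lockres, the MODEL_* flag values and model_lockres_unused() are hypothetical stand-ins, and the refmap is reduced to a single 64-bit word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for DLM_LOCK_RES_DIRTY / DLM_LOCK_RES_RECOVERING;
 * the numeric values are illustrative only. */
#define MODEL_RES_DIRTY      0x01u
#define MODEL_RES_RECOVERING 0x02u

struct model_lockres {
	unsigned int state;   /* flag bits, like res->state */
	bool has_locks;       /* models __dlm_lockres_has_locks(res) */
	bool on_dirty_list;   /* models !list_empty(&res->dirty) */
	uint64_t refmap;      /* one bit per node that holds a reference */
	int inflight_locks;
};

/* Returns 1 if the lockres could be purged, 0 otherwise. The checks run in
 * the same order as in the patched function: locks, dirty, recovering, and
 * only then the reference map. */
static int model_lockres_unused(const struct model_lockres *res)
{
	if (res->has_locks)
		return 0;

	if (res->on_dirty_list || (res->state & MODEL_RES_DIRTY))
		return 0;

	/* New in the patch: a lockres under recovery is never "unused". */
	if (res->state & MODEL_RES_RECOVERING)
		return 0;

	if (res->refmap != 0)
		return 0;

	/* No node holds a reference, so nothing may be in flight. */
	assert(res->inflight_locks == 0);
	return 1;
}
```

A lockres with a stale bit left in refmap never passes the last check, which is why the stale refbit described in the commit message blocks purging on the recovery master.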
Re: [Ocfs2-devel] [PATCH] ocfs2/dlm: avoid incorrect bit set in refmap on recovery master
Thanks for making this patch. It looks good; just a few minor changes to the comments.

On 7/23/2010 5:15 AM, Wengang Wang wrote:
> In the following situation, an incorrect bit remains in the refmap on the recovery master. The recovery master will eventually fail to purge the lockres because of that bit.
>
> 1) node A has no interest in lockres A any longer, so it is purging it.
> 2) the owner of lockres A is node B, so node A is sending a de-ref message to node B.
> 3) at this time, node B crashes. node C becomes the recovery master. it recovers lockres A (because the master is the dead node B).
> 4) node A migrates lockres A to node C with a refbit there.
> 5) node A fails to send the de-ref message to node B because it crashed. The failure is ignored. no other action is taken for lockres A any more.
>
> Normally, re-sending the deref message to the recovery master would fix it. However, ignoring the failure of the deref to the original master and not recovering the lockres to the recovery master has the same effect, and the latter is simpler.
>
> Signed-off-by: Wengang Wang <wen.gang.w...@oracle.com>
> ---
>  fs/ocfs2/dlm/dlmrecovery.c |   17 +++--
>  fs/ocfs2/dlm/dlmthread.c   |   28 +---
>  2 files changed, 32 insertions(+), 13 deletions(-)
>
> diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
> index 9dfaac7..06640f6 100644
> --- a/fs/ocfs2/dlm/dlmrecovery.c
> +++ b/fs/ocfs2/dlm/dlmrecovery.c
> @@ -1997,6 +1997,8 @@ void dlm_move_lockres_to_recovery_list(struct dlm_ctxt *dlm,
>  	struct list_head *queue;
>  	struct dlm_lock *lock, *next;
>  
> +	assert_spin_locked(&dlm->spinlock);
> +	assert_spin_locked(&res->spinlock);
>  	res->state |= DLM_LOCK_RES_RECOVERING;
>  	if (!list_empty(&res->recovering)) {
>  		mlog(0,
> @@ -2336,9 +2338,20 @@ static void dlm_do_local_recovery_cleanup(struct dlm_ctxt *dlm, u8 dead_node)
>  
>  				/* the wake_up for this will happen when the
>  				 * RECOVERING flag is dropped later */

Remove the above comment, as it doesn't seem to be relevant anymore.

> -				res->state &= ~DLM_LOCK_RES_DROPPING_REF;
> +				if (res->state & DLM_LOCK_RES_DROPPING_REF) {
> +					/*
> +					 * don't migrate a lockres which is in
> +					 * progress of dropping ref
> +					 */

Move this comment to before the if condition.

> +					mlog(ML_NOTICE, "%.*s ignored for "
> +					     "migration\n", res->lockname.len,
> +					     res->lockname.name);

This information only helps us in diagnosing a related issue and is not helpful in normal cases. So it should be 0 instead of ML_NOTICE.

> +					res->state &=
> +						~DLM_LOCK_RES_DROPPING_REF;

We don't need to clear this state, as dlm_purge_lockres removes it.

> +				} else
> +					dlm_move_lockres_to_recovery_list(dlm,
> +									  res);
>  
> -				dlm_move_lockres_to_recovery_list(dlm, res);
>  			} else if (res->owner == dlm->node_num) {
>  				dlm_free_dead_locks(dlm, res, dead_node);
>  				__dlm_lockres_calc_usage(dlm, res);
> diff --git a/fs/ocfs2/dlm/dlmthread.c b/fs/ocfs2/dlm/dlmthread.c
> index dd78ca3..47420ce 100644
> --- a/fs/ocfs2/dlm/dlmthread.c
> +++ b/fs/ocfs2/dlm/dlmthread.c
> @@ -92,17 +92,23 @@ int __dlm_lockres_has_locks(struct dlm_lock_resource *res)
>   * truly ready to be freed. */
>  int __dlm_lockres_unused(struct dlm_lock_resource *res)
>  {
> -	if (!__dlm_lockres_has_locks(res) &&
> -	    (list_empty(&res->dirty) && !(res->state & DLM_LOCK_RES_DIRTY))) {
> -		/* try not to scan the bitmap unless the first two
> -		 * conditions are already true */
> -		int bit = find_next_bit(res->refmap, O2NM_MAX_NODES, 0);
> -		if (bit >= O2NM_MAX_NODES) {
> -			/* since the bit for dlm->node_num is not
> -			 * set, inflight_locks better be zero */
> -			BUG_ON(res->inflight_locks != 0);
> -			return 1;
> -		}
> +	int bit;
> +
> +	if (__dlm_lockres_has_locks(res))
> +		return 0;
> +
> +	if (!list_empty(&res->dirty) || res->state & DLM_LOCK_RES_DIRTY)
> +		return 0;
> +
> +	if (res->state & DLM_LOCK_RES_RECOVERING)
> +		return 0;
> +
> +	bit = find_next_bit(res->refmap, O2NM_MAX_NODES, 0);
> +	if (bit >= O2NM_MAX_NODES) {
> +		/* since the bit for dlm->node_num is not
> +		 * set, inflight_locks better be zero */
> +
Re: [Ocfs2-devel] [PATCH V2] Fix the nested PR lock calling issue
ok. Then review Jiaju's earlier patch.

On 07/23/2010 03:42 PM, Tiger Yang wrote:
> Hi, Sunil,
>
> I think putting them in ocfs2_check_acl() is better. First, checking the mount option in ocfs2_check_acl() is clearer than doing it in _ocfs2_get_acl(). Second, we already have ocfs2_get_acl and ocfs2_get_acl_nolock, so _ocfs2_get_acl seems redundant and could be confusing when reading the code.
>
> Regards,
> tiger
>
> On 07/21/2010 02:26 PM, Sunil Mushran wrote:
>> Why not add _ocfs2_get_acl() that does the same without taking the cluster locks?
[Ocfs2-devel] [PATCH 7/8] ocfs2: Print message if user mounts without starting global heartbeat
In global heartbeat mode, the heartbeat is started by the user. This patch prints an error if the user attempts to mount a volume without starting the heartbeat.

Signed-off-by: Sunil Mushran <sunil.mush...@oracle.com>
---
 fs/ocfs2/stack_o2cb.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/fs/ocfs2/stack_o2cb.c b/fs/ocfs2/stack_o2cb.c
index 7020e12..4b14b5b 100644
--- a/fs/ocfs2/stack_o2cb.c
+++ b/fs/ocfs2/stack_o2cb.c
@@ -282,6 +282,8 @@ static int o2cb_cluster_connect(struct ocfs2_cluster_connection *conn)
 	/* for now we only have one cluster/node, make sure we see it
 	 * in the heartbeat universe */
 	if (!o2hb_check_local_node_heartbeating()) {
+		if (o2hb_global_heartbeat_active())
+			mlog(ML_ERROR, "Global heartbeat not started\n");
 		rc = -EINVAL;
 		goto out;
 	}
--
1.7.0.4
[Ocfs2-devel] [PATCH 3/8] ocfs2: Add support for heartbeat=global mount option
Adds support for heartbeat=global mount option.

Signed-off-by: Sunil Mushran <sunil.mush...@oracle.com>
---
 fs/ocfs2/ocfs2.h    |    4 ++-
 fs/ocfs2/ocfs2_fs.h |    1 +
 fs/ocfs2/super.c    |   55 ++-
 3 files changed, 45 insertions(+), 15 deletions(-)

diff --git a/fs/ocfs2/ocfs2.h b/fs/ocfs2/ocfs2.h
index 259015a..db96bbd 100644
--- a/fs/ocfs2/ocfs2.h
+++ b/fs/ocfs2/ocfs2.h
@@ -249,7 +249,7 @@ enum ocfs2_local_alloc_state
 
 enum ocfs2_mount_options
 {
-	OCFS2_MOUNT_HB_LOCAL = 1 << 0, /* Heartbeat started in local mode */
+	OCFS2_MOUNT_HB_LOCAL = 1 << 0, /* Local heartbeat */
 	OCFS2_MOUNT_BARRIER = 1 << 1, /* Use block barriers */
 	OCFS2_MOUNT_NOINTR = 1 << 2, /* Don't catch signals */
 	OCFS2_MOUNT_ERRORS_PANIC = 1 << 3, /* Panic on errors */
@@ -262,6 +262,8 @@ enum ocfs2_mount_options
 						   control lists */
 	OCFS2_MOUNT_USRQUOTA = 1 << 10, /* We support user quotas */
 	OCFS2_MOUNT_GRPQUOTA = 1 << 11, /* We support group quotas */
+	OCFS2_MOUNT_HB_NONE = 1 << 12, /* No heartbeat */
+	OCFS2_MOUNT_HB_GLOBAL = 1 << 13, /* Global heartbeat */
 };
 
 #define OCFS2_OSB_SOFT_RO	0x0001
diff --git a/fs/ocfs2/ocfs2_fs.h b/fs/ocfs2/ocfs2_fs.h
index c936cf0..e5507d5 100644
--- a/fs/ocfs2/ocfs2_fs.h
+++ b/fs/ocfs2/ocfs2_fs.h
@@ -367,6 +367,7 @@ static struct ocfs2_system_inode_info ocfs2_system_inodes[NUM_SYSTEM_INODES] = {
 /* Parameter passed from mount.ocfs2 to module */
 #define OCFS2_HB_NONE			"heartbeat=none"
 #define OCFS2_HB_LOCAL			"heartbeat=local"
+#define OCFS2_HB_GLOBAL			"heartbeat=global"
 
 /*
  * OCFS2 directory file types. Only the low 3 bits are used. The
diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index 6ecdc07..1e280eb 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -169,6 +169,7 @@ enum {
 	Opt_nointr,
 	Opt_hb_none,
 	Opt_hb_local,
+	Opt_hb_global,
 	Opt_data_ordered,
 	Opt_data_writeback,
 	Opt_atime_quantum,
@@ -202,6 +203,7 @@ static const match_table_t tokens = {
 	{Opt_nointr, "nointr"},
 	{Opt_hb_none, OCFS2_HB_NONE},
 	{Opt_hb_local, OCFS2_HB_LOCAL},
+	{Opt_hb_global, OCFS2_HB_GLOBAL},
 	{Opt_data_ordered, "data=ordered"},
 	{Opt_data_writeback, "data=writeback"},
 	{Opt_atime_quantum, "atime_quantum=%u"},
@@ -621,6 +623,7 @@ static int ocfs2_remount(struct super_block *sb, int *flags, char *data)
 	int ret = 0;
 	struct mount_options parsed_options;
 	struct ocfs2_super *osb = OCFS2_SB(sb);
+	u32 tmp;
 
 	lock_kernel();
 
@@ -630,8 +633,9 @@ static int ocfs2_remount(struct super_block *sb, int *flags, char *data)
 		goto out;
 	}
 
-	if ((osb->s_mount_opt & OCFS2_MOUNT_HB_LOCAL) !=
-	    (parsed_options.mount_opt & OCFS2_MOUNT_HB_LOCAL)) {
+	tmp = OCFS2_MOUNT_HB_LOCAL | OCFS2_MOUNT_HB_GLOBAL |
+		OCFS2_MOUNT_HB_NONE;
+	if ((osb->s_mount_opt & tmp) != (parsed_options.mount_opt & tmp)) {
 		ret = -EINVAL;
 		mlog(ML_ERROR, "Cannot change heartbeat mode on remount\n");
 		goto out;
@@ -824,23 +828,29 @@ bail:
 
 static int ocfs2_verify_heartbeat(struct ocfs2_super *osb)
 {
-	if (ocfs2_mount_local(osb)) {
-		if (osb->s_mount_opt & OCFS2_MOUNT_HB_LOCAL) {
+	u32 hb_enabled = OCFS2_MOUNT_HB_LOCAL | OCFS2_MOUNT_HB_GLOBAL;
+
+	if (osb->s_mount_opt & hb_enabled) {
+		if (ocfs2_mount_local(osb)) {
 			mlog(ML_ERROR, "Cannot heartbeat on a locally "
 			     "mounted device.\n");
 			return -EINVAL;
 		}
-	}
-
-	if (ocfs2_userspace_stack(osb)) {
-		if (osb->s_mount_opt & OCFS2_MOUNT_HB_LOCAL) {
+		if (ocfs2_userspace_stack(osb)) {
 			mlog(ML_ERROR, "Userspace stack expected, but "
 			     "o2cb heartbeat arguments passed to mount\n");
 			return -EINVAL;
 		}
+		if (((osb->s_mount_opt & OCFS2_MOUNT_HB_GLOBAL) &&
+		     !ocfs2_cluster_o2cb_global_heartbeat(osb)) ||
+		    ((osb->s_mount_opt & OCFS2_MOUNT_HB_LOCAL) &&
+		     ocfs2_cluster_o2cb_global_heartbeat(osb))) {
+			mlog(ML_ERROR, "Mismatching o2cb heartbeat modes\n");
+			return -EINVAL;
+		}
 	}
 
-	if (!(osb->s_mount_opt & OCFS2_MOUNT_HB_LOCAL)) {
+	if (!(osb->s_mount_opt & hb_enabled)) {
 		if (!ocfs2_mount_local(osb) && !ocfs2_is_hard_readonly(osb) &&
 		    !ocfs2_userspace_stack(osb)) {
 			mlog(ML_ERROR, "Heartbeat has to be started to mount "
@@ -1319,6 +1329,7 @@ static int ocfs2_parse_options(struct super_block *sb,
 {
 	int status;
 	char *p;
+
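The two heartbeat-mode checks in the super.c hunks above can be sketched in plain C as follows. This is an illustrative userspace model only: the MODEL_* names and both helper functions are hypothetical, the bit positions are copied from the ocfs2.h hunk, and the on-disk global-heartbeat flag is reduced to an int parameter.

```c
#include <assert.h>
#include <errno.h>

/* Bit positions taken from the enum ocfs2_mount_options hunk. */
#define MODEL_MOUNT_HB_LOCAL  (1u << 0)
#define MODEL_MOUNT_HB_NONE   (1u << 12)
#define MODEL_MOUNT_HB_GLOBAL (1u << 13)

/* Models the "Mismatching o2cb heartbeat modes" test: the mount option
 * has to agree with the heartbeat mode recorded for the volume.
 * disk_global is nonzero when the volume is flagged for global heartbeat. */
static int model_check_hb_mode(unsigned int mount_opt, int disk_global)
{
	if ((mount_opt & MODEL_MOUNT_HB_GLOBAL) && !disk_global)
		return -EINVAL;
	if ((mount_opt & MODEL_MOUNT_HB_LOCAL) && disk_global)
		return -EINVAL;
	return 0;
}

/* Models the remount test: none of the three heartbeat bits may change. */
static int model_remount_allowed(unsigned int old_opt, unsigned int new_opt)
{
	unsigned int mask = MODEL_MOUNT_HB_LOCAL | MODEL_MOUNT_HB_GLOBAL |
			    MODEL_MOUNT_HB_NONE;

	return (old_opt & mask) == (new_opt & mask);
}
```

Comparing all three bits through one mask is what lets the remount path reject a switch from local to global (or to none), which the old single-bit comparison could not detect.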
[Ocfs2-devel] [PATCH 2/8] ocfs2: Add an incompat feature flag OCFS2_FEATURE_INCOMPAT_CLUSTERINFO
OCFS2_FEATURE_INCOMPAT_CLUSTERINFO allows us to use sb->s_cluster_info for both userspace and o2cb cluster stacks. It also allows us to extend cluster info to include stack flags.

This patch also adds stackflags to sb->s_cluster_info. It also introduces a clusterinfo flag OCFS2_CLUSTER_O2CB_GLOBAL_HEARTBEAT to denote the enabled global heartbeat mode.

This incompat flag can be set/cleared using tunefs.ocfs2 --fs-features. The clusterinfo flag is set/cleared using tunefs.ocfs2 --update-cluster-stack.

Signed-off-by: Sunil Mushran <sunil.mush...@oracle.com>
---
 fs/ocfs2/ocfs2.h    |   31 +--
 fs/ocfs2/ocfs2_fs.h |   40 ++--
 fs/ocfs2/super.c    |    4 +++-
 3 files changed, 66 insertions(+), 9 deletions(-)

diff --git a/fs/ocfs2/ocfs2.h b/fs/ocfs2/ocfs2.h
index 5a3d08d..259015a 100644
--- a/fs/ocfs2/ocfs2.h
+++ b/fs/ocfs2/ocfs2.h
@@ -366,6 +366,8 @@ struct ocfs2_super
 	struct ocfs2_alloc_stats alloc_stats;
 	char dev_str[20];		/* "major,minor" of the device */
 
+	u8 osb_stackflags;
+
 	char osb_cluster_stack[OCFS2_STACK_LABEL_LEN + 1];
 	struct ocfs2_cluster_connection *cconn;
 	struct ocfs2_lock_res osb_super_lockres;
@@ -592,10 +594,35 @@ static inline int ocfs2_is_soft_readonly(struct ocfs2_super *osb)
 	return ret;
 }
 
-static inline int ocfs2_userspace_stack(struct ocfs2_super *osb)
+static inline int ocfs2_clusterinfo_valid(struct ocfs2_super *osb)
 {
 	return (osb->s_feature_incompat &
-		OCFS2_FEATURE_INCOMPAT_USERSPACE_STACK);
+		(OCFS2_FEATURE_INCOMPAT_USERSPACE_STACK |
+		 OCFS2_FEATURE_INCOMPAT_CLUSTERINFO));
+}
+
+static inline int ocfs2_userspace_stack(struct ocfs2_super *osb)
+{
+	if (ocfs2_clusterinfo_valid(osb) &&
+	    memcmp(osb->osb_cluster_stack, OCFS2_CLASSIC_CLUSTER_STACK,
+		   OCFS2_STACK_LABEL_LEN))
+		return 1;
+	return 0;
+}
+
+static inline int ocfs2_o2cb_stack(struct ocfs2_super *osb)
+{
+	if (ocfs2_clusterinfo_valid(osb) &&
+	    !memcmp(osb->osb_cluster_stack, OCFS2_CLASSIC_CLUSTER_STACK,
+		   OCFS2_STACK_LABEL_LEN))
+		return 1;
+	return 0;
+}
+
+static inline int ocfs2_cluster_o2cb_global_heartbeat(struct ocfs2_super *osb)
+{
+	return ocfs2_o2cb_stack(osb) &&
+		(osb->osb_stackflags & OCFS2_CLUSTER_O2CB_GLOBAL_HEARTBEAT);
 }
 
 static inline int ocfs2_mount_local(struct ocfs2_super *osb)
diff --git a/fs/ocfs2/ocfs2_fs.h b/fs/ocfs2/ocfs2_fs.h
index bb37218..c936cf0 100644
--- a/fs/ocfs2/ocfs2_fs.h
+++ b/fs/ocfs2/ocfs2_fs.h
@@ -100,7 +100,8 @@
 					 | OCFS2_FEATURE_INCOMPAT_XATTR \
 					 | OCFS2_FEATURE_INCOMPAT_META_ECC \
 					 | OCFS2_FEATURE_INCOMPAT_INDEXED_DIRS \
-					 | OCFS2_FEATURE_INCOMPAT_REFCOUNT_TREE)
+					 | OCFS2_FEATURE_INCOMPAT_REFCOUNT_TREE \
+					 | OCFS2_FEATURE_INCOMPAT_CLUSTERINFO)
 #define OCFS2_FEATURE_RO_COMPAT_SUPP	(OCFS2_FEATURE_RO_COMPAT_UNWRITTEN \
 					 | OCFS2_FEATURE_RO_COMPAT_USRQUOTA \
 					 | OCFS2_FEATURE_RO_COMPAT_GRPQUOTA)
@@ -166,6 +167,13 @@
 #define OCFS2_FEATURE_INCOMPAT_REFCOUNT_TREE	0x1000
 
 /*
+ * Incompat bit to indicate useable clusterinfo with stackflags for all
+ * cluster stacks (userspace and o2cb). If this bit is set,
+ * INCOMPAT_USERSPACE_STACK becomes superfluous and thus should not be set.
+ */
+#define OCFS2_FEATURE_INCOMPAT_CLUSTERINFO	0x2000
+
+/*
  * backup superblock flag is used to indicate that this volume
  * has backup superblocks.
  */
@@ -275,10 +283,13 @@
 #define OCFS2_VOL_UUID_LEN		16
 #define OCFS2_MAX_VOL_LABEL_LEN		64
 
-/* The alternate, userspace stack fields */
+/* The cluster stack fields */
 #define OCFS2_STACK_LABEL_LEN		4
 #define OCFS2_CLUSTER_NAME_LEN		16
 
+/* Classic (historically speaking) cluster stack */
+#define OCFS2_CLASSIC_CLUSTER_STACK	"o2cb"
+
 /* Journal limits (in bytes) */
 #define OCFS2_MIN_JOURNAL_SIZE		(4 * 1024 * 1024)
 
@@ -296,6 +307,11 @@
  */
 #define OCFS2_MIN_XATTR_INLINE_SIZE	256
 
+/*
+ * Cluster info flags (ocfs2_cluster_info.ci_stackflags)
+ */
+#define OCFS2_CLUSTER_O2CB_GLOBAL_HEARTBEAT	(0x01)
+
 struct ocfs2_system_inode_info {
 	char	*si_name;
 	int	si_iflags;
@@ -554,9 +570,21 @@ struct ocfs2_slot_map_extended {
  */
 };
 
+/*
+ * ci_stackflags is only valid if the incompat bit
+ * OCFS2_FEATURE_INCOMPAT_CLUSTERINFO is set.
+ */
 struct ocfs2_cluster_info {
 /*00*/	__u8   ci_stack[OCFS2_STACK_LABEL_LEN];
-	__le32 ci_reserved;
+	union {
+		__le32 ci_reserved;
+		struct {
+			__u8 ci_reserved1;
+
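To illustrate how the new helpers in the ocfs2.h hunk combine, here is a small userspace sketch. It is not the kernel code: struct model_osb and the model_* functions are hypothetical, the 0x2000 and 0x01 values come from the patch, and the value used for the userspace-stack incompat bit is an assumption made only so the example compiles.

```c
#include <assert.h>
#include <string.h>

#define MODEL_STACK_LABEL_LEN          4        /* OCFS2_STACK_LABEL_LEN */
#define MODEL_CLASSIC_CLUSTER_STACK    "o2cb"   /* from the patch */
#define MODEL_INCOMPAT_USERSPACE_STACK 0x0080u  /* assumed value, illustrative */
#define MODEL_INCOMPAT_CLUSTERINFO     0x2000u  /* from the patch */
#define MODEL_O2CB_GLOBAL_HEARTBEAT    0x01u    /* from the patch */

struct model_osb {
	unsigned int feature_incompat;
	unsigned char stackflags;
	char cluster_stack[MODEL_STACK_LABEL_LEN + 1];
};

/* Cluster info is usable when either incompat bit is set. */
static int model_clusterinfo_valid(const struct model_osb *osb)
{
	return (osb->feature_incompat &
		(MODEL_INCOMPAT_USERSPACE_STACK |
		 MODEL_INCOMPAT_CLUSTERINFO)) != 0;
}

/* o2cb stack: valid cluster info whose label equals "o2cb". */
static int model_o2cb_stack(const struct model_osb *osb)
{
	return model_clusterinfo_valid(osb) &&
	       memcmp(osb->cluster_stack, MODEL_CLASSIC_CLUSTER_STACK,
		      MODEL_STACK_LABEL_LEN) == 0;
}

/* Userspace stack: valid cluster info with any other label. */
static int model_userspace_stack(const struct model_osb *osb)
{
	return model_clusterinfo_valid(osb) &&
	       memcmp(osb->cluster_stack, MODEL_CLASSIC_CLUSTER_STACK,
		      MODEL_STACK_LABEL_LEN) != 0;
}

/* Global heartbeat applies only to the o2cb stack with the flag set. */
static int model_global_heartbeat(const struct model_osb *osb)
{
	return model_o2cb_stack(osb) &&
	       (osb->stackflags & MODEL_O2CB_GLOBAL_HEARTBEAT) != 0;
}
```

In this model a volume with neither incompat bit set reports neither stack, which is consistent with the patch's helpers returning 0 in both cases when the cluster info is not valid.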
[Ocfs2-devel] [PATCH 5/8] ocfs2/cluster: Get all heartbeat regions
Export function in o2hb to get a list of heartbeat regions.

Signed-off-by: Sunil Mushran <sunil.mush...@oracle.com>
---
 fs/ocfs2/cluster/heartbeat.c |   34 ++
 fs/ocfs2/cluster/heartbeat.h |    4
 2 files changed, 38 insertions(+), 0 deletions(-)

diff --git a/fs/ocfs2/cluster/heartbeat.c b/fs/ocfs2/cluster/heartbeat.c
index 1107629..00a7fd6 100644
--- a/fs/ocfs2/cluster/heartbeat.c
+++ b/fs/ocfs2/cluster/heartbeat.c
@@ -1629,6 +1629,9 @@ static struct config_item *o2hb_heartbeat_group_make_item(struct config_group *g
 	if (reg == NULL)
 		return ERR_PTR(-ENOMEM);
 
+	if (strlen(name) > O2HB_MAX_REGION_NAME_LEN)
+		return ERR_PTR(-ENAMETOOLONG);
+
 	config_item_init_type_name(&reg->hr_item, name, &o2hb_region_type);
 
 	spin_lock(&o2hb_live_lock);
@@ -2039,3 +2042,34 @@ void o2hb_stop_all_regions(void)
 	spin_unlock(&o2hb_live_lock);
 }
 EXPORT_SYMBOL_GPL(o2hb_stop_all_regions);
+
+int o2hb_get_all_regions(char *region_uuids, u8 max_regions)
+{
+	struct o2hb_region *reg;
+	int numregs = 0;
+	char *p;
+
+	spin_lock(&o2hb_live_lock);
+
+	p = region_uuids;
+	list_for_each_entry(reg, &o2hb_all_regions, hr_all_item) {
+		mlog(ML_NOTICE, "Region: %s\n", config_item_name(&reg->hr_item));
+		if (numregs < max_regions) {
+			memcpy(p, config_item_name(&reg->hr_item),
+			       O2HB_MAX_REGION_NAME_LEN);
+			p += O2HB_MAX_REGION_NAME_LEN;
+		}
+		numregs++;
+	}
+
+	spin_unlock(&o2hb_live_lock);
+
+	return numregs;
+}
+EXPORT_SYMBOL_GPL(o2hb_get_all_regions);
+
+int o2hb_global_heartbeat_active(void)
+{
+	return (o2hb_heartbeat_mode == O2HB_HEARTBEAT_GLOBAL);
+}
+EXPORT_SYMBOL(o2hb_global_heartbeat_active);
diff --git a/fs/ocfs2/cluster/heartbeat.h b/fs/ocfs2/cluster/heartbeat.h
index 2f16492..00ad8e8 100644
--- a/fs/ocfs2/cluster/heartbeat.h
+++ b/fs/ocfs2/cluster/heartbeat.h
@@ -31,6 +31,8 @@
 
 #define O2HB_REGION_TIMEOUT_MS		2000
 
+#define O2HB_MAX_REGION_NAME_LEN	32
+
 /* number of changes to be seen as live */
 #define O2HB_LIVE_THRESHOLD	   2
 /* number of equal samples to be seen as dead */
@@ -81,5 +83,7 @@ int o2hb_check_node_heartbeating(u8 node_num);
 int o2hb_check_node_heartbeating_from_callback(u8 node_num);
 int o2hb_check_local_node_heartbeating(void);
 void o2hb_stop_all_regions(void);
+int o2hb_get_all_regions(char *region_uuids, u8 numregions);
+int o2hb_global_heartbeat_active(void);
 
 #endif /* O2CLUSTER_HEARTBEAT_H */
--
1.7.0.4
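The calling convention of o2hb_get_all_regions() (copy at most max_regions fixed-width names, but return the total number of regions) can be sketched in userspace as below. This is an assumption-laden model, not the kernel function: the region list is replaced by a plain array of strings, there is no locking, and model_get_all_regions() is a hypothetical name. Unlike the patch, the sketch copies only the bytes of each name and zero-pads the rest of the slot.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_REGION_NAME_LEN 32 /* mirrors O2HB_MAX_REGION_NAME_LEN */

/* Copies up to max_regions names into out, one fixed-width slot each, and
 * returns how many regions exist in total. A return value larger than
 * max_regions tells the caller that the output was truncated. */
static int model_get_all_regions(const char *const *names, int num_names,
				 char *out, unsigned char max_regions)
{
	int numregs = 0;
	char *p = out;

	for (int i = 0; i < num_names; i++) {
		if (numregs < max_regions) {
			size_t n = strlen(names[i]);

			if (n > MODEL_REGION_NAME_LEN)
				n = MODEL_REGION_NAME_LEN;
			memset(p, 0, MODEL_REGION_NAME_LEN);
			memcpy(p, names[i], n);
			p += MODEL_REGION_NAME_LEN;
		}
		numregs++;
	}

	return numregs;
}
```

A caller would size the buffer as max_regions * MODEL_REGION_NAME_LEN bytes and compare the return value against max_regions to detect overflow.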
[Ocfs2-devel] [PATCH 8/8] ocfs2/dlm: Add message DLM_QUERY_NODEINFO
Signed-off-by: Sunil Mushran <sunil.mush...@oracle.com>
---
 fs/ocfs2/dlm/dlmcommon.h |   17
 fs/ocfs2/dlm/dlmdomain.c |  188 +-
 2 files changed, 204 insertions(+), 1 deletions(-)

diff --git a/fs/ocfs2/dlm/dlmcommon.h b/fs/ocfs2/dlm/dlmcommon.h
index 2c05138..34d9cd8 100644
--- a/fs/ocfs2/dlm/dlmcommon.h
+++ b/fs/ocfs2/dlm/dlmcommon.h
@@ -447,6 +447,7 @@ enum {
 	DLM_BEGIN_RECO_MSG,		/* 517 */
 	DLM_FINALIZE_RECO_MSG,		/* 518 */
 	DLM_QUERY_HBREGION,		/* 519 */
+	DLM_QUERY_NODEINFO,		/* 520 */
 };
 
 struct dlm_reco_node_data
@@ -737,6 +738,22 @@ struct dlm_query_hbregion {
 	u8 qhb_hbregions[O2HB_MAX_REGION_NAME_LEN * O2NM_MAX_HBREGIONS];
 };
 
+struct dlm_node_info {
+	u8 ni_nodenum;
+	u8 pad1;
+	u16 ni_ipv4_port;
+	u32 ni_ipv4_address;
+};
+
+struct dlm_query_nodeinfo {
+	u8 qn_nodenum;
+	u8 qn_numnodes;
+	u8 qn_namelen;
+	u8 pad1;
+	u8 qn_domain[O2NM_MAX_NAME_LEN];
+	struct dlm_node_info qn_nodes[O2NM_MAX_NODES];
+};
+
 struct dlm_exit_domain
 {
 	u8 node_idx;
diff --git a/fs/ocfs2/dlm/dlmdomain.c b/fs/ocfs2/dlm/dlmdomain.c
index 3521a00..2325087 100644
--- a/fs/ocfs2/dlm/dlmdomain.c
+++ b/fs/ocfs2/dlm/dlmdomain.c
@@ -131,6 +131,7 @@ static DECLARE_WAIT_QUEUE_HEAD(dlm_domain_events);
  *
  * New in version 1.1:
  *	- Message DLM_QUERY_HBREGION added to support global heartbeat
+ *	- Message DLM_QUERY_NODEINFO added to allow online node removes
  */
 static const struct dlm_protocol_version dlm_protocol = {
 	.pv_major = 1,
@@ -1122,6 +1123,179 @@ bail:
 	return status;
 }
 
+static int dlm_match_nodes(struct dlm_ctxt *dlm, struct dlm_query_nodeinfo *qn)
+{
+	struct o2nm_node *local;
+	struct dlm_node_info *remote;
+	int i, j;
+	int status = 0;
+
+	for (j = 0; j < qn->qn_numnodes; ++j)
+		mlog(ML_NOTICE, "Node %3d, %u.%u.%u.%u:%u\n",
+		     qn->qn_nodes[j].ni_nodenum,
+		     NIPQUAD(qn->qn_nodes[j].ni_ipv4_address),
+		     ntohs(qn->qn_nodes[j].ni_ipv4_port));
+
+	for (i = 0; i < O2NM_MAX_NODES && !status; ++i) {
+		local = o2nm_get_node_by_num(i);
+		remote = NULL;
+		for (j = 0; j < qn->qn_numnodes; ++j) {
+			if (qn->qn_nodes[j].ni_nodenum == i) {
+				remote = &(qn->qn_nodes[j]);
+				break;
+			}
+		}
+
+		if (!local && !remote)
+			continue;
+
+		if ((local && !remote) || (!local && remote))
+			status = -EINVAL;
+
+		if (!status &&
+		    ((remote->ni_nodenum != local->nd_num) ||
+		     (remote->ni_ipv4_port != local->nd_ipv4_port) ||
+		     (remote->ni_ipv4_address != local->nd_ipv4_address)))
+			status = -EINVAL;
+
+		if (status) {
+			if (remote && !local)
+				mlog(ML_ERROR, "Domain %s: Node %d "
+				     "(%u.%u.%u.%u:%u) registered in joining "
+				     "node %d but not in local node %d\n",
+				     qn->qn_domain, remote->ni_nodenum,
+				     NIPQUAD(remote->ni_ipv4_address),
+				     ntohs(remote->ni_ipv4_port),
+				     qn->qn_nodenum, dlm->node_num);
+			if (local && !remote)
+				mlog(ML_ERROR, "Domain %s: Node %d "
+				     "(%u.%u.%u.%u:%u) registered in local "
+				     "node %d but not in joining node %d\n",
+				     qn->qn_domain, local->nd_num,
+				     NIPQUAD(local->nd_ipv4_address),
+				     ntohs(local->nd_ipv4_port),
+				     dlm->node_num, qn->qn_nodenum);
+			BUG_ON((!local && !remote));
+		}
+
+		if (local)
+			o2nm_node_put(local);
+	}
+
+	return status;
+}
+
+static int dlm_send_nodeinfo(struct dlm_ctxt *dlm, unsigned long *node_map)
+{
+	struct dlm_query_nodeinfo *qn = NULL;
+	struct o2nm_node *node;
+	int ret = 0, status, count, i;
+
+	if (find_next_bit(node_map, O2NM_MAX_NODES, 0) >= O2NM_MAX_NODES)
+		goto bail;
+
+	qn = kmalloc(sizeof(struct dlm_query_nodeinfo), GFP_KERNEL);
+	if (!qn) {
+		ret = -ENOMEM;
+		mlog_errno(ret);
+		goto bail;
+	}
+
+	memset(qn, 0, sizeof(struct dlm_query_nodeinfo));
+
+	for (i = 0, count = 0; i < O2NM_MAX_NODES; ++i) {
+		node = o2nm_get_node_by_num(i);
+		if (!node)
+			continue;
+
[Ocfs2-devel] Global heartbeat - drop#1
This is the first drop of the global heartbeat patches for ocfs2/kernel.

The first few patches add support for heartbeat mode in configfs, the new incompat clusterinfo flag and the new mount option heartbeat=global.

0001-ocfs2-cluster-Add-heartbeat-mode-configfs-parameter.patch
0002-ocfs2-Add-an-incompat-feature-flag-OCFS2_FEATURE_INC.patch
0003-ocfs2-Add-support-for-heartbeat-global-mount-option.patch
0004-ocfs2-dlm-Expose-dlm_protocol-in-dlm_state.patch

The next few patches enhance the join domain protocol to get the list of configured nodes and heartbeating regions to ensure that all nodes in the cluster have the same view of the cluster.

0005-ocfs2-cluster-Get-all-heartbeat-regions.patch
0006-ocfs2-dlm-Add-message-DLM_QUERY_HBREGION.patch
0007-ocfs2-Print-message-if-user-mounts-without-starting-.patch
0008-ocfs2-dlm-Add-message-DLM_QUERY_NODEINFO.patch

The one known missing bit concerns quorum calculation. I am still working on it.

http://oss.oracle.com/osswiki/OCFS2/DesignDocs/NewGlobalHeartbeat

Thanks
Sunil
[Ocfs2-devel] [PATCH 1/8] ocfs2/cluster: Add heartbeat mode configfs parameter
Add heartbeat mode parameter to the configfs tree. This will be used to set/show the heartbeat mode.

Signed-off-by: Sunil Mushran <sunil.mush...@oracle.com>
---
 fs/ocfs2/cluster/heartbeat.c |   70 ++
 1 files changed, 70 insertions(+), 0 deletions(-)

diff --git a/fs/ocfs2/cluster/heartbeat.c b/fs/ocfs2/cluster/heartbeat.c
index d191f45..1107629 100644
--- a/fs/ocfs2/cluster/heartbeat.c
+++ b/fs/ocfs2/cluster/heartbeat.c
@@ -76,7 +76,19 @@ static struct o2hb_callback *hbcall_from_type(enum o2hb_callback_type type);
 
 #define O2HB_DEFAULT_BLOCK_BITS       9
 
+enum o2hb_heartbeat_modes {
+	O2HB_HEARTBEAT_LOCAL		= 0,
+	O2HB_HEARTBEAT_GLOBAL,
+	O2HB_HEARTBEAT_NUM_MODES,
+};
+
+char *o2hb_heartbeat_mode_desc[O2HB_HEARTBEAT_NUM_MODES] = {
+		"local",	/* O2HB_HEARTBEAT_LOCAL */
+		"global",	/* O2HB_HEARTBEAT_GLOBAL */
+};
+
 unsigned int o2hb_dead_threshold = O2HB_DEFAULT_DEAD_THRESHOLD;
+unsigned int o2hb_heartbeat_mode = O2HB_HEARTBEAT_LOCAL;
 
 /* Only sets a new threshold if there are no active regions.
  *
@@ -93,6 +105,22 @@ static void o2hb_dead_threshold_set(unsigned int threshold)
 	}
 }
 
+static int o2hb_global_hearbeat_mode_set(unsigned int hb_mode)
+{
+	int ret = -1;
+
+	if (hb_mode < O2HB_HEARTBEAT_NUM_MODES) {
+		spin_lock(&o2hb_live_lock);
+		if (list_empty(&o2hb_all_regions)) {
+			o2hb_heartbeat_mode = hb_mode;
+			ret = 0;
+		}
+		spin_unlock(&o2hb_live_lock);
+	}
+
+	return ret;
+}
+
 struct o2hb_node_event {
 	struct list_head        hn_item;
 	enum o2hb_callback_type hn_event_type;
@@ -1694,6 +1722,39 @@ static ssize_t o2hb_heartbeat_group_threshold_store(struct o2hb_heartbeat_group
 	return count;
 }
 
+static
+ssize_t o2hb_heartbeat_group_mode_show(struct o2hb_heartbeat_group *group,
+				       char *page)
+{
+	return sprintf(page, "%s\n",
+		       o2hb_heartbeat_mode_desc[o2hb_heartbeat_mode]);
+}
+
+static
+ssize_t o2hb_heartbeat_group_mode_store(struct o2hb_heartbeat_group *group,
+					const char *page, size_t count)
+{
+	unsigned int i;
+	int ret;
+	size_t len;
+
+	len = (page[count - 1] == '\n') ? count - 1 : count;
+
+	for (i = 0; i < O2HB_HEARTBEAT_NUM_MODES; ++i) {
+		if (strnicmp(page, o2hb_heartbeat_mode_desc[i], len))
+			continue;
+
+		ret = o2hb_global_hearbeat_mode_set(i);
+		if (!ret)
+			printk(KERN_INFO "ocfs2: Heartbeat mode set to %s\n",
+			       o2hb_heartbeat_mode_desc[i]);
+		return count;
+	}
+
+	return -EINVAL;
+
+}
+
 static struct o2hb_heartbeat_group_attribute o2hb_heartbeat_group_attr_threshold = {
 	.attr	= { .ca_owner = THIS_MODULE,
 		    .ca_name = "dead_threshold",
@@ -1702,8 +1763,17 @@ static struct o2hb_heartbeat_group_attribute o2hb_heartbeat_group_attr_threshold
 	.store	= o2hb_heartbeat_group_threshold_store,
 };
 
+static struct o2hb_heartbeat_group_attribute o2hb_heartbeat_group_attr_mode = {
+	.attr	= { .ca_owner = THIS_MODULE,
+		    .ca_name = "mode",
+		    .ca_mode = S_IRUGO | S_IWUSR },
+	.show	= o2hb_heartbeat_group_mode_show,
+	.store	= o2hb_heartbeat_group_mode_store,
+};
+
 static struct configfs_attribute *o2hb_heartbeat_group_attrs[] = {
 	&o2hb_heartbeat_group_attr_threshold.attr,
+	&o2hb_heartbeat_group_attr_mode.attr,
 	NULL,
 };
 
--
1.7.0.4
[Ocfs2-devel] [PATCH 4/8] ocfs2/dlm: Expose dlm_protocol in dlm_state
Add dlm_protocol to the list of info shown by the debugfs file, dlm_state.

Signed-off-by: Sunil Mushran <sunil.mush...@oracle.com>
---
 fs/ocfs2/dlm/dlmdebug.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/fs/ocfs2/dlm/dlmdebug.c b/fs/ocfs2/dlm/dlmdebug.c
index 75efd45..cf27d81 100644
--- a/fs/ocfs2/dlm/dlmdebug.c
+++ b/fs/ocfs2/dlm/dlmdebug.c
@@ -779,7 +779,9 @@ static int debug_state_print(struct dlm_ctxt *dlm, struct debug_buffer *db)
 
 	/* Domain: xx Key: 0xdfbac769 */
 	out += snprintf(db->buf + out, db->len - out,
-			"Domain: %s Key: 0x%08x\n", dlm->name, dlm->key);
+			"Domain: %s Key: 0x%08x Protocol: %d.%d\n",
+			dlm->name, dlm->key, dlm->dlm_locking_proto.pv_major,
+			dlm->dlm_locking_proto.pv_minor);
 
 	/* Thread Pid: xxx Node: xxx State: x */
 	out += snprintf(db->buf + out, db->len - out,
--
1.7.0.4