The branch, master has been updated
via 0858b11 ctdb-tests: Use ctdb_node_list_to_map() in tool stubs
via 1ef1cfd ctdb-common: Move ctdb_node_list_to_map() to utilities
via dd52d82 ctdb-daemon: Factor out new function ctdb_node_list_to_map()
via ffbe0a6 ctdb-tools: Drop the recovery from "reloadnodes"
via d340f30 ctdb-daemon: Don't delay reloading the nodes file
via 85bd9a3 ctdb-recoverd: Avoid nodemap-related checks when recoveries are disabled
via 13dc4a9 ctdb-tool: Update "reloadnodes" to disable recoveries
via ee9619c ctdb-recoverd: New message ID CTDB_SRVID_DISABLE_RECOVERIES
via 2ca484c ctdb-recoverd: Simplify disable_ip_check_handler() using ctdb_op_disable()
via 108db33 ctdb-recoverd: Add slightly more abstraction for disabling takeover runs
via ec32d9b ctdb-recoverd: Reimplement ReRecoveryTimeout using ctdb_op_disable()
via 281f7e8 ctdb-recoverd: Use a goto for do_recovery() failures
via a2044c6 ctdb-recoverd: Reimplement disabling takeover runs using ctdb_op_disable()
via 55b2461 ctdb-recoverd: Add a new abstraction ctdb_op_disable()
via ae9cd037 ctdb-daemon: Pass on consistent flag information to recovery daemon
via 4b972bb ctdb-tests: Add "ctdb reloadnodes" test for "node remains deleted"
via 181658f ctdb-tools: Fix spurious messages about deleted nodes being disconnected
from b57c778 rpc_server: Coverity fix for CID 1273079
https://git.samba.org/?p=samba.git;a=shortlog;h=master
- Log -----------------------------------------------------------------
commit 0858b11ff735b535bfeded346c87a0c245d902c7
Author: Martin Schwenke <[email protected]>
Date: Sun Feb 22 06:37:41 2015 +1100
ctdb-tests: Use ctdb_node_list_to_map() in tool stubs
Drop copy of old ctdb_control_nodemap().
Signed-off-by: Martin Schwenke <[email protected]>
Reviewed-by: Amitay Isaacs <[email protected]>
Autobuild-User(master): Amitay Isaacs <[email protected]>
Autobuild-Date(master): Tue Apr 7 10:20:41 CEST 2015 on sn-devel-104
commit 1ef1cfdc4d6b923357630451177fdcde1d616e87
Author: Martin Schwenke <[email protected]>
Date: Fri Feb 20 12:34:25 2015 +1100
ctdb-common: Move ctdb_node_list_to_map() to utilities
Signed-off-by: Martin Schwenke <[email protected]>
Reviewed-by: Amitay Isaacs <[email protected]>
commit dd52d82c73b26a3fed6dfd4aaf7d51f576d019d9
Author: Martin Schwenke <[email protected]>
Date: Fri Feb 20 12:31:37 2015 +1100
ctdb-daemon: Factor out new function ctdb_node_list_to_map()
Change ctdb_control_getnodemap() to use this.
Signed-off-by: Martin Schwenke <[email protected]>
Reviewed-by: Amitay Isaacs <[email protected]>
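The sizing idiom behind ctdb_node_list_to_map() in commit dd52d82 (a header followed by a variable number of entries, sized with offsetof) can be illustrated in isolation. This is only a minimal sketch: the demo_* structures and functions are hypothetical stand-ins, the address field is omitted, and calloc() replaces talloc_zero_size() so that the example builds without the Samba tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the CTDB structures; only the fields
 * needed for the illustration are modelled. */
struct demo_node {
	uint32_t pnn;
	uint32_t flags;
};

struct demo_node_and_flags {
	uint32_t pnn;
	uint32_t flags;
};

struct demo_node_map {
	uint32_t num;
	struct demo_node_and_flags nodes[]; /* flexible array member */
};

/* Bytes needed for a map holding num_nodes entries. */
static size_t demo_node_map_size(uint32_t num_nodes)
{
	return offsetof(struct demo_node_map, nodes) +
		(size_t)num_nodes * sizeof(struct demo_node_and_flags);
}

/* Build a map from an array of node pointers.  Returns NULL on
 * allocation failure; the caller releases the result with free(). */
static struct demo_node_map *
demo_node_list_to_map(struct demo_node **nodes, uint32_t num_nodes)
{
	uint32_t i;
	struct demo_node_map *map;

	assert(nodes != NULL || num_nodes == 0);

	/* Zeroed allocation, as talloc_zero_size() would provide */
	map = calloc(1, demo_node_map_size(num_nodes));
	if (map == NULL) {
		return NULL;
	}

	map->num = num_nodes;
	for (i = 0; i < num_nodes; i++) {
		map->nodes[i].pnn = nodes[i]->pnn;
		map->nodes[i].flags = nodes[i]->flags;
	}

	return map;
}
```

Returning NULL instead of exiting leaves the error policy to the caller, which matches the direction of the change to ctdb_control_getnodemap() in the diff further down.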
commit ffbe0a6def236f5d0b03d089a7fc3f060eb0e392
Author: Martin Schwenke <[email protected]>
Date: Wed Feb 4 12:06:56 2015 +1100
ctdb-tools: Drop the recovery from "reloadnodes"
A recovery is not required: when deleting a node it should already be
disconnected and when adding a node it will also be disconnected. The
new sanity checks in "reloadnodes" ensure that these assumptions are
met.
Signed-off-by: Martin Schwenke <[email protected]>
Reviewed-by: Amitay Isaacs <[email protected]>
commit d340f308e76af53b04ae9b5432c4f6c84315303a
Author: Martin Schwenke <[email protected]>
Date: Tue Feb 10 15:43:03 2015 +1100
ctdb-daemon: Don't delay reloading the nodes file
Presumably the delay was added to minimise the chance of a recovery
occurring while the nodemaps are inconsistent across nodes.
Another theory is that the forced recovery in
ctdb.c:control_reload_nodes_file() stops another recovery occurring
for ReRecoveryTimeout seconds, so the delay causes the reloads to
occur during that period.
This is no longer necessary because recoveries are now explicitly
disabled while node files are reloaded.
Signed-off-by: Martin Schwenke <[email protected]>
Reviewed-by: Amitay Isaacs <[email protected]>
commit 85bd9a33eb65d6fd03ad85aeedf141a2813c2bb8
Author: Martin Schwenke <[email protected]>
Date: Fri Feb 6 20:59:11 2015 +1100
ctdb-recoverd: Avoid nodemap-related checks when recoveries are disabled
The potential resulting recovery won't run anyway. Also, recoveries
may have been disabled by "reloadnodes"; if the nodemaps are then
inconsistent between nodes, skipping these checks avoids triggering
an unnecessary recovery.
Signed-off-by: Martin Schwenke <[email protected]>
Reviewed-by: Amitay Isaacs <[email protected]>
commit 13dc4a98426b30e7226015b1d8a86ec2e80d6228
Author: Martin Schwenke <[email protected]>
Date: Mon Feb 9 20:20:44 2015 +1100
ctdb-tool: Update "reloadnodes" to disable recoveries
If a recovery occurs when some nodes have reloaded and others haven't
then the nodemaps will be inconsistent so bad things will happen.
Signed-off-by: Martin Schwenke <[email protected]>
Reviewed-by: Amitay Isaacs <[email protected]>
commit ee9619c28b594b7fec8093b522ac205e5d4eb0ea
Author: Martin Schwenke <[email protected]>
Date: Fri Feb 6 15:06:44 2015 +1100
ctdb-recoverd: New message ID CTDB_SRVID_DISABLE_RECOVERIES
Also add test stub support.
Signed-off-by: Martin Schwenke <[email protected]>
Reviewed-by: Amitay Isaacs <[email protected]>
commit 2ca484cd50c2655c59802cae6c81982b42bf61eb
Author: Martin Schwenke <[email protected]>
Date: Fri Feb 6 15:03:03 2015 +1100
ctdb-recoverd: Simplify disable_ip_check_handler() using ctdb_op_disable()
Signed-off-by: Martin Schwenke <[email protected]>
Reviewed-by: Amitay Isaacs <[email protected]>
commit 108db3396f71a35ef1690a5b483d2728223803df
Author: Martin Schwenke <[email protected]>
Date: Fri Feb 6 13:05:12 2015 +1100
ctdb-recoverd: Add slightly more abstraction for disabling takeover runs
Factor out new function srvid_disable_and_reply(), which can be
re-used.
Signed-off-by: Martin Schwenke <[email protected]>
Reviewed-by: Amitay Isaacs <[email protected]>
commit ec32d9bea8993778cd6b0fc63bfde492ee21d830
Author: Martin Schwenke <[email protected]>
Date: Fri Feb 6 14:47:33 2015 +1100
ctdb-recoverd: Reimplement ReRecoveryTimeout using ctdb_op_disable()
Signed-off-by: Martin Schwenke <[email protected]>
Reviewed-by: Amitay Isaacs <[email protected]>
commit 281f7e8152e01a15e9df946ee293156ded8b2857
Author: Martin Schwenke <[email protected]>
Date: Fri Feb 6 14:32:08 2015 +1100
ctdb-recoverd: Use a goto for do_recovery() failures
This will allow extra things to be done on failure.
Signed-off-by: Martin Schwenke <[email protected]>
Reviewed-by: Amitay Isaacs <[email protected]>
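The motivation in commit 281f7e8 (one exit path so that extra work can be done on failure) follows a common C idiom. The sketch below is hypothetical: demo_state, demo_do_recovery() and the failure counter are invented for illustration, and this truncated changeset does not show what the real "fail" label in do_recovery() does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state; stands in for whatever must be tidied up when
 * an operation fails part-way through. */
struct demo_state {
	bool in_progress;
	int failures;
};

/* Run three steps in order; steps[i] == false simulates a failure
 * of step i.  Returns 0 on success, -1 on failure. */
static int demo_do_recovery(struct demo_state *st, const bool steps[3])
{
	int i;

	assert(st != NULL && steps != NULL);

	st->in_progress = true;

	for (i = 0; i < 3; i++) {
		if (!steps[i]) {
			/* Every failure funnels through one label
			 * instead of returning -1 directly */
			goto fail;
		}
	}

	st->in_progress = false;
	return 0;

fail:
	/* Single place for the extra things done on failure */
	st->in_progress = false;
	st->failures++;
	return -1;
}
```

With scattered "return -1" statements, each early exit would need its own copy of the cleanup; with the label, a new cleanup step is added once.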
commit a2044c65bc669e7240bd4ffc4b6935f57f493535
Author: Martin Schwenke <[email protected]>
Date: Sun Feb 8 20:52:12 2015 +1100
ctdb-recoverd: Reimplement disabling takeover runs using ctdb_op_disable()
Signed-off-by: Martin Schwenke <[email protected]>
Reviewed-by: Amitay Isaacs <[email protected]>
commit 55b246195b282175022ea2ae239ebcd5d4970d3f
Author: Martin Schwenke <[email protected]>
Date: Sun Feb 8 20:50:38 2015 +1100
ctdb-recoverd: Add a new abstraction ctdb_op_disable()
This can be used to disable and re-enable an operation, and do all the
relevant sanity checking.
Most of this is from existing functions
disable_takeover_runs_handler(), clear_takeover_runs_disable() and
reenable_takeover_runs().
Signed-off-by: Martin Schwenke <[email protected]>
Reviewed-by: Amitay Isaacs <[email protected]>
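The disable/re-enable semantics described for ctdb_op_disable() in commit 55b2461 can be modelled without an event loop. This is a simplified sketch under stated assumptions: the demo_* names are hypothetical, and an explicit "now" timestamp with a stored deadline replaces the tevent timer used by the real code in the diff further down.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the per-operation state.  The real code keeps
 * a tevent timer; a deadline in seconds is used here instead. */
struct demo_op_state {
	uint64_t disabled_until; /* 0 means not disabled */
	bool in_progress;
};

static bool demo_op_is_disabled(const struct demo_op_state *s, uint64_t now)
{
	return s->disabled_until != 0 && now < s->disabled_until;
}

/* Start the operation unless it is currently disabled. */
static bool demo_op_begin(struct demo_op_state *s, uint64_t now)
{
	if (demo_op_is_disabled(s, now)) {
		return false;
	}
	s->in_progress = true;
	return true;
}

static void demo_op_end(struct demo_op_state *s)
{
	s->in_progress = false;
}

/* Disable the operation for "timeout" seconds.  A timeout of 0
 * re-enables it immediately.  Disabling is refused with -EAGAIN while
 * the operation is in progress. */
static int demo_op_disable(struct demo_op_state *s, uint64_t now,
			   uint32_t timeout)
{
	assert(s != NULL);

	if (timeout == 0) {
		s->disabled_until = 0;
		return 0;
	}

	if (s->in_progress) {
		return -EAGAIN;
	}

	/* A new request replaces any earlier deadline */
	s->disabled_until = now + timeout;
	return 0;
}
```

The key sanity check is the -EAGAIN case: a caller such as "reloadnodes" learns that the operation is already running and can retry, rather than believing it has been disabled.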
commit ae9cd037ee96c000b11aaa7d171463b00fe4850c
Author: Martin Schwenke <[email protected]>
Date: Wed Feb 4 17:18:12 2015 +1100
ctdb-daemon: Pass on consistent flag information to recovery daemon
Signed-off-by: Martin Schwenke <[email protected]>
Pair-programmed-with: Amitay Isaacs <[email protected]>
Reviewed-by: Amitay Isaacs <[email protected]>
commit 4b972bbdb3e2d3f35fad3c47dc6e84f0fee513c4
Author: Martin Schwenke <[email protected]>
Date: Wed Apr 1 18:00:04 2015 +1100
ctdb-tests: Add "ctdb reloadnodes" test for "node remains deleted"
Signed-off-by: Martin Schwenke <[email protected]>
Reviewed-by: Amitay Isaacs <[email protected]>
commit 181658f5bb180c48f88504a703ed3a3758ac3b5b
Author: Martin Schwenke <[email protected]>
Date: Wed Apr 1 17:10:46 2015 +1100
ctdb-tools: Fix spurious messages about deleted nodes being disconnected
The code was too "clever". The 4 different cases should be separate.
The "node remains deleted" case doesn't need the IP address comparison
(always 0.0.0.0) or the disconnected check.
Signed-off-by: Martin Schwenke <[email protected]>
Reviewed-by: Amitay Isaacs <[email protected]>
-----------------------------------------------------------------------
Summary of changes:
ctdb/common/ctdb_util.c | 27 ++
ctdb/include/ctdb_private.h | 3 +
ctdb/include/ctdb_protocol.h | 3 +
ctdb/server/ctdb_monitor.c | 1 +
ctdb/server/ctdb_recover.c | 47 +---
ctdb/server/ctdb_recoverd.c | 293 +++++++++++++--------
ctdb/tests/src/ctdb_test_stubs.c | 50 +---
...eloadnodes.001.sh => stubby.reloadnodes.024.sh} | 9 +-
ctdb/tools/ctdb.c | 31 ++-
9 files changed, 263 insertions(+), 201 deletions(-)
copy ctdb/tests/tool/{stubby.reloadnodes.001.sh => stubby.reloadnodes.024.sh} (72%)
Changeset truncated at 500 lines:
diff --git a/ctdb/common/ctdb_util.c b/ctdb/common/ctdb_util.c
index 76fb06d..8e2e430 100644
--- a/ctdb/common/ctdb_util.c
+++ b/ctdb/common/ctdb_util.c
@@ -579,6 +579,33 @@ struct ctdb_node_map *ctdb_read_nodes_file(TALLOC_CTX
*mem_ctx,
return ret;
}
+struct ctdb_node_map *
+ctdb_node_list_to_map(struct ctdb_node **nodes, uint32_t num_nodes,
+ TALLOC_CTX *mem_ctx)
+{
+ uint32_t i;
+ size_t size;
+ struct ctdb_node_map *node_map;
+
+ size = offsetof(struct ctdb_node_map, nodes) +
+ num_nodes * sizeof(struct ctdb_node_and_flags);
+ node_map = (struct ctdb_node_map *)talloc_zero_size(mem_ctx, size);
+ if (node_map == NULL) {
+ DEBUG(DEBUG_ERR,
+ (__location__ " Failed to allocate nodemap array\n"));
+ return NULL;
+ }
+
+ node_map->num = num_nodes;
+ for (i=0; i<num_nodes; i++) {
+ node_map->nodes[i].addr = nodes[i]->address;
+ node_map->nodes[i].pnn = nodes[i]->pnn;
+ node_map->nodes[i].flags = nodes[i]->flags;
+ }
+
+ return node_map;
+}
+
const char *ctdb_eventscript_call_names[] = {
"init",
"setup",
diff --git a/ctdb/include/ctdb_private.h b/ctdb/include/ctdb_private.h
index b37d5bb..532f859 100644
--- a/ctdb/include/ctdb_private.h
+++ b/ctdb/include/ctdb_private.h
@@ -1388,6 +1388,9 @@ int ctdb_client_async_control(struct ctdb_context *ctdb,
client_async_callback fail_callback,
void *callback_data);
+struct ctdb_node_map *
+ctdb_node_list_to_map(struct ctdb_node **nodes, uint32_t num_nodes,
+ TALLOC_CTX *mem_ctx);
struct ctdb_node_map *ctdb_read_nodes_file(TALLOC_CTX *mem_ctx,
const char *nlist);
void ctdb_load_nodes_file(struct ctdb_context *ctdb);
diff --git a/ctdb/include/ctdb_protocol.h b/ctdb/include/ctdb_protocol.h
index c828c01..4dea56b 100644
--- a/ctdb/include/ctdb_protocol.h
+++ b/ctdb/include/ctdb_protocol.h
@@ -156,6 +156,9 @@ struct ctdb_call_info {
/* A message handler ID to stop takeover runs from occurring */
#define CTDB_SRVID_DISABLE_TAKEOVER_RUNS 0xFB03000000000000LL
+/* A message handler ID to stop recoveries from occurring */
+#define CTDB_SRVID_DISABLE_RECOVERIES 0xFB04000000000000LL
+
/* A message id to ask the recovery daemon to temporarily disable the
public ip checks
*/
diff --git a/ctdb/server/ctdb_monitor.c b/ctdb/server/ctdb_monitor.c
index 9b8df6d..5c0c055 100644
--- a/ctdb/server/ctdb_monitor.c
+++ b/ctdb/server/ctdb_monitor.c
@@ -497,6 +497,7 @@ int32_t ctdb_control_modflags(struct ctdb_context *ctdb,
TDB_DATA indata)
}
/* tell the recovery daemon something has changed */
+ c->new_flags = node->flags;
ctdb_daemon_send_message(ctdb, ctdb->pnn,
CTDB_SRVID_SET_NODE_FLAGS, indata);
diff --git a/ctdb/server/ctdb_recover.c b/ctdb/server/ctdb_recover.c
index eb3f46d..7a684d5 100644
--- a/ctdb/server/ctdb_recover.c
+++ b/ctdb/server/ctdb_recover.c
@@ -118,30 +118,19 @@ ctdb_control_getdbmap(struct ctdb_context *ctdb, uint32_t
opcode, TDB_DATA indat
return 0;
}
-int
+int
ctdb_control_getnodemap(struct ctdb_context *ctdb, uint32_t opcode, TDB_DATA
indata, TDB_DATA *outdata)
{
- uint32_t i, num_nodes;
- struct ctdb_node_map *node_map;
-
CHECK_CONTROL_DATA_SIZE(0);
- num_nodes = ctdb->num_nodes;
-
- outdata->dsize = offsetof(struct ctdb_node_map, nodes) +
num_nodes*sizeof(struct ctdb_node_and_flags);
- outdata->dptr = (unsigned char *)talloc_zero_size(outdata,
outdata->dsize);
- if (!outdata->dptr) {
- DEBUG(DEBUG_ALERT, (__location__ " Failed to allocate nodemap
array\n"));
- exit(1);
+ outdata->dptr = (unsigned char *)ctdb_node_list_to_map(ctdb->nodes,
+ ctdb->num_nodes,
+ outdata);
+ if (outdata->dptr == NULL) {
+ return -1;
}
- node_map = (struct ctdb_node_map *)outdata->dptr;
- node_map->num = num_nodes;
- for (i=0; i<num_nodes; i++) {
- node_map->nodes[i].addr = ctdb->nodes[i]->address;
- node_map->nodes[i].pnn = ctdb->nodes[i]->pnn;
- node_map->nodes[i].flags = ctdb->nodes[i]->flags;
- }
+ outdata->dsize = talloc_get_size(outdata->dptr);
return 0;
}
@@ -177,14 +166,15 @@ ctdb_control_getnodemapv4(struct ctdb_context *ctdb,
uint32_t opcode, TDB_DATA i
return 0;
}
-static void
-ctdb_reload_nodes_event(struct event_context *ev, struct timed_event *te,
- struct timeval t, void *private_data)
+/*
+ reload the nodes file
+*/
+int
+ctdb_control_reload_nodes_file(struct ctdb_context *ctdb, uint32_t opcode)
{
int i, num_nodes;
- struct ctdb_context *ctdb = talloc_get_type(private_data, struct
ctdb_context);
TALLOC_CTX *tmp_ctx;
- struct ctdb_node **nodes;
+ struct ctdb_node **nodes;
tmp_ctx = talloc_new(ctdb);
@@ -225,17 +215,6 @@ ctdb_reload_nodes_event(struct event_context *ev, struct
timed_event *te,
ctdb_daemon_send_message(ctdb, ctdb->pnn, CTDB_SRVID_RELOAD_NODES,
tdb_null);
talloc_free(tmp_ctx);
- return;
-}
-
-/*
- reload the nodes file after a short delay (so that we can send the response
- back first
-*/
-int
-ctdb_control_reload_nodes_file(struct ctdb_context *ctdb, uint32_t opcode)
-{
- event_add_timed(ctdb->ev, ctdb, timeval_current_ofs(1,0),
ctdb_reload_nodes_event, ctdb);
return 0;
}
diff --git a/ctdb/server/ctdb_recoverd.c b/ctdb/server/ctdb_recoverd.c
index 99018be..673075a 100644
--- a/ctdb/server/ctdb_recoverd.c
+++ b/ctdb/server/ctdb_recoverd.c
@@ -117,6 +117,103 @@ nomem:
srvid_request_reply(ctdb, request, result);
}
+/* An abstraction to allow an operation (takeover runs, recoveries,
+ * ...) to be disabled for a given timeout */
+struct ctdb_op_state {
+ struct tevent_timer *timer;
+ bool in_progress;
+ const char *name;
+};
+
+static struct ctdb_op_state *ctdb_op_init(TALLOC_CTX *mem_ctx, const char
*name)
+{
+ struct ctdb_op_state *state = talloc_zero(mem_ctx, struct
ctdb_op_state);
+
+ if (state != NULL) {
+ state->in_progress = false;
+ state->name = name;
+ }
+
+ return state;
+}
+
+static bool ctdb_op_is_disabled(struct ctdb_op_state *state)
+{
+ return state->timer != NULL;
+}
+
+static bool ctdb_op_begin(struct ctdb_op_state *state)
+{
+ if (ctdb_op_is_disabled(state)) {
+ DEBUG(DEBUG_NOTICE,
+ ("Unable to begin - %s are disabled\n", state->name));
+ return false;
+ }
+
+ state->in_progress = true;
+ return true;
+}
+
+static bool ctdb_op_end(struct ctdb_op_state *state)
+{
+ return state->in_progress = false;
+}
+
+static bool ctdb_op_is_in_progress(struct ctdb_op_state *state)
+{
+ return state->in_progress;
+}
+
+static void ctdb_op_enable(struct ctdb_op_state *state)
+{
+ TALLOC_FREE(state->timer);
+}
+
+static void ctdb_op_timeout_handler(struct event_context *ev,
+ struct timed_event *te,
+ struct timeval yt, void *p)
+{
+ struct ctdb_op_state *state =
+ talloc_get_type(p, struct ctdb_op_state);
+
+ DEBUG(DEBUG_NOTICE,("Reenabling %s after timeout\n", state->name));
+ ctdb_op_enable(state);
+}
+
+static int ctdb_op_disable(struct ctdb_op_state *state,
+ struct tevent_context *ev,
+ uint32_t timeout)
+{
+ if (timeout == 0) {
+ DEBUG(DEBUG_NOTICE,("Reenabling %s\n", state->name));
+ ctdb_op_enable(state);
+ return 0;
+ }
+
+ if (state->in_progress) {
+ DEBUG(DEBUG_ERR,
+ ("Unable to disable %s - in progress\n", state->name));
+ return -EAGAIN;
+ }
+
+ DEBUG(DEBUG_NOTICE,("Disabling %s for %u seconds\n",
+ state->name, timeout));
+
+ /* Clear any old timers */
+ talloc_free(state->timer);
+
+ /* Arrange for the timeout to occur */
+ state->timer = tevent_add_timer(ev, state,
+ timeval_current_ofs(timeout, 0),
+ ctdb_op_timeout_handler, state);
+ if (state->timer == NULL) {
+ DEBUG(DEBUG_ERR,(__location__ " Unable to setup timer\n"));
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
struct ctdb_banning_state {
uint32_t count;
struct timeval last_reported_time;
@@ -141,8 +238,8 @@ struct ctdb_recoverd {
struct timed_event *election_timeout;
struct vacuum_info *vacuum_info;
struct srvid_requests *reallocate_requests;
- bool takeover_run_in_progress;
- TALLOC_CTX *takeover_runs_disable_ctx;
+ struct ctdb_op_state *takeover_run;
+ struct ctdb_op_state *recovery;
struct ctdb_control_get_ifaces *ifaces;
uint32_t *force_rebalance_nodes;
};
@@ -1566,7 +1663,7 @@ static int ctdb_reload_remote_public_ips(struct
ctdb_context *ctdb,
}
if (ctdb->do_checkpublicip &&
- rec->takeover_runs_disable_ctx == NULL &&
+ !ctdb_op_is_disabled(rec->takeover_run) &&
verify_remote_ip_allocation(ctdb,
node->known_public_ips,
node->pnn)) {
@@ -1691,19 +1788,14 @@ static bool do_takeover_run(struct ctdb_recoverd *rec,
DEBUG(DEBUG_NOTICE, ("Takeover run starting\n"));
- if (rec->takeover_run_in_progress) {
+ if (ctdb_op_is_in_progress(rec->takeover_run)) {
DEBUG(DEBUG_ERR, (__location__
" takeover run already in progress \n"));
ok = false;
goto done;
}
- rec->takeover_run_in_progress = true;
-
- /* If takeover runs are in disabled then fail... */
- if (rec->takeover_runs_disable_ctx != NULL) {
- DEBUG(DEBUG_ERR,
- ("Takeover runs are disabled so refusing to run one\n"));
+ if (!ctdb_op_begin(rec->takeover_run)) {
ok = false;
goto done;
}
@@ -1767,7 +1859,7 @@ static bool do_takeover_run(struct ctdb_recoverd *rec,
done:
rec->need_takeover_run = !ok;
talloc_free(nodes);
- rec->takeover_run_in_progress = false;
+ ctdb_op_end(rec->takeover_run);
DEBUG(DEBUG_NOTICE, ("Takeover run %s\n", ok ? "completed successfully"
: "unsuccessful"));
return ok;
@@ -1796,16 +1888,20 @@ static int do_recovery(struct ctdb_recoverd *rec,
/* if recovery fails, force it again */
rec->need_recovery = true;
+ if (!ctdb_op_begin(rec->recovery)) {
+ return -1;
+ }
+
if (rec->election_timeout) {
/* an election is in progress */
DEBUG(DEBUG_ERR, ("do_recovery called while election in
progress - try again later\n"));
- return -1;
+ goto fail;
}
ban_misbehaving_nodes(rec, &self_ban);
if (self_ban) {
DEBUG(DEBUG_NOTICE, ("This node was banned, aborting
recovery\n"));
- return -1;
+ goto fail;
}
if (ctdb->recovery_lock_file != NULL) {
@@ -1823,14 +1919,14 @@ static int do_recovery(struct ctdb_recoverd *rec,
*/
DEBUG(DEBUG_ERR, ("Unable to get
recovery lock"
" - retrying
recovery\n"));
- return -1;
+ goto fail;
}
DEBUG(DEBUG_ERR,("Unable to get recovery lock -
aborting recovery "
"and ban ourself for %u
seconds\n",
ctdb->tunable.recovery_ban_period));
ctdb_ban_node(rec, pnn,
ctdb->tunable.recovery_ban_period);
- return -1;
+ goto fail;
}
ctdb_ctrl_report_recd_lock_latency(ctdb,
CONTROL_TIMEOUT(),
@@ -1846,7 +1942,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
ret = ctdb_ctrl_getdbmap(ctdb, CONTROL_TIMEOUT(), pnn, mem_ctx, &dbmap);
if (ret != 0) {
DEBUG(DEBUG_ERR, (__location__ " Unable to get dbids from node
:%u\n", pnn));
- return -1;
+ goto fail;
}
/* we do the db creation before we set the recovery mode, so the freeze
happens
@@ -1856,14 +1952,14 @@ static int do_recovery(struct ctdb_recoverd *rec,
ret = create_missing_local_databases(ctdb, nodemap, pnn, &dbmap,
mem_ctx);
if (ret != 0) {
DEBUG(DEBUG_ERR, (__location__ " Unable to create missing local
databases\n"));
- return -1;
+ goto fail;
}
/* verify that all other nodes have all our databases */
ret = create_missing_remote_databases(ctdb, nodemap, pnn, dbmap,
mem_ctx);
if (ret != 0) {
DEBUG(DEBUG_ERR, (__location__ " Unable to create missing
remote databases\n"));
- return -1;
+ goto fail;
}
DEBUG(DEBUG_NOTICE, (__location__ " Recovery - created remote
databases\n"));
@@ -1884,14 +1980,14 @@ static int do_recovery(struct ctdb_recoverd *rec,
ret = set_recovery_mode(ctdb, rec, nodemap, CTDB_RECOVERY_ACTIVE);
if (ret != 0) {
DEBUG(DEBUG_ERR, (__location__ " Unable to set recovery mode to
active on cluster\n"));
- return -1;
+ goto fail;
}
/* execute the "startrecovery" event script on all nodes */
ret = run_startrecovery_eventscript(rec, nodemap);
if (ret!=0) {
DEBUG(DEBUG_ERR, (__location__ " Unable to run the
'startrecovery' event on cluster\n"));
- return -1;
+ goto fail;
}
/*
@@ -1908,7 +2004,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
DEBUG(DEBUG_WARNING, (__location__ "Unable to
update flags on inactive node %d\n", i));
} else {
DEBUG(DEBUG_ERR, (__location__ " Unable to
update flags on all nodes for node %d\n", i));
- return -1;
+ goto fail;
}
}
}
@@ -1932,7 +2028,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
ret = ctdb_ctrl_setvnnmap(ctdb, CONTROL_TIMEOUT(), pnn, mem_ctx,
vnnmap);
if (ret != 0) {
DEBUG(DEBUG_ERR, (__location__ " Unable to set vnnmap for node
%u\n", pnn));
- return -1;
+ goto fail;
}
data.dptr = (void *)&generation;
@@ -1954,7 +2050,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
NULL) != 0) {
DEBUG(DEBUG_ERR,("Failed to cancel recovery
transaction\n"));
}
- return -1;
+ goto fail;
}
DEBUG(DEBUG_NOTICE,(__location__ " started transactions on all
nodes\n"));
@@ -1966,7 +2062,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
pnn, nodemap, generation);
if (ret != 0) {
DEBUG(DEBUG_ERR, (__location__ " Failed to recover
database 0x%x\n", dbmap->dbs[i].dbid));
- return -1;
+ goto fail;
}
}
@@ -1979,7 +2075,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
NULL, NULL,
NULL) != 0) {
DEBUG(DEBUG_ERR, (__location__ " Unable to commit recovery
changes. Recovery failed.\n"));
- return -1;
+ goto fail;
}
DEBUG(DEBUG_NOTICE, (__location__ " Recovery - committed databases\n"));
@@ -1989,7 +2085,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
ret = update_capabilities(ctdb, nodemap);
if (ret!=0) {
DEBUG(DEBUG_ERR, (__location__ " Unable to update node
capabilities.\n"));
- return -1;
+ goto fail;
}
/* build a new vnn map with all the currently active and
@@ -2029,7 +2125,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
ret = update_vnnmap_on_all_nodes(ctdb, nodemap, pnn, vnnmap, mem_ctx);
if (ret != 0) {
DEBUG(DEBUG_ERR, (__location__ " Unable to update vnnmap on all
nodes\n"));
- return -1;
+ goto fail;
}
DEBUG(DEBUG_NOTICE, (__location__ " Recovery - updated vnnmap\n"));
@@ -2038,7 +2134,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
ret = set_recovery_master(ctdb, nodemap, pnn);
if (ret!=0) {
DEBUG(DEBUG_ERR, (__location__ " Unable to set recovery
master\n"));
- return -1;
+ goto fail;
}
DEBUG(DEBUG_NOTICE, (__location__ " Recovery - updated recmaster\n"));
@@ -2047,7 +2143,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
ret = set_recovery_mode(ctdb, rec, nodemap, CTDB_RECOVERY_NORMAL);
if (ret != 0) {
DEBUG(DEBUG_ERR, (__location__ " Unable to set recovery mode to
normal on cluster\n"));
- return -1;
+ goto fail;
}
DEBUG(DEBUG_NOTICE, (__location__ " Recovery - disabled recovery
mode\n"));
@@ -2058,7 +2154,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
DEBUG(DEBUG_ERR,("Failed to read public ips from remote node
%d\n",
culprit));
rec->need_takeover_run = true;
- return -1;
+ goto fail;
}
do_takeover_run(rec, nodemap, false);
@@ -2067,7 +2163,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
ret = run_recovered_eventscript(rec, nodemap, "do_recovery");
if (ret!=0) {
DEBUG(DEBUG_ERR, (__location__ " Unable to run the 'recovered'
event on cluster. Recovery process failed.\n"));
- return -1;
+ goto fail;
}
DEBUG(DEBUG_NOTICE, (__location__ " Recovery - finished the recovered
event\n"));
--
Samba Shared Repository