[SCM] Samba Shared Repository - branch master updated

Amitay Isaacs Tue, 07 Apr 2015 01:22:35 -0700

The branch, master has been updated
       via  0858b11 ctdb-tests: Use ctdb_node_list_to_map() in tool stubs
       via  1ef1cfd ctdb-common: Move ctdb_node_list_to_map() to utilities
       via  dd52d82 ctdb-daemon: Factor out new function ctdb_node_list_to_map()
       via  ffbe0a6 ctdb-tools: Drop the recovery from "reloadnodes"
       via  d340f30 ctdb-daemon: Don't delay reloading the nodes file
       via  85bd9a3 ctdb-recoverd: Avoid nodemap-related checks when recoveries are disabled
       via  13dc4a9 ctdb-tool: Update "reloadnodes" to disable recoveries
       via  ee9619c ctdb-recoverd: New message ID CTDB_SRVID_DISABLE_RECOVERIES
       via  2ca484c ctdb-recoverd: Simplify disable_ip_check_handler() using ctdb_op_disable()
       via  108db33 ctdb-recoverd: Add slightly more abstraction for disabling takeover runs
       via  ec32d9b ctdb-recoverd: Reimplement ReRecoveryTimeout using ctdb_op_disable()
       via  281f7e8 ctdb-recoverd: Use a goto for do_recovery() failures
       via  a2044c6 ctdb-recoverd: Reimplement disabling takeover runs using ctdb_op_disable()
       via  55b2461 ctdb-recoverd: Add a new abstraction ctdb_op_disable()
       via  ae9cd037 ctdb-daemon: Pass on consistent flag information to recovery daemon
       via  4b972bb ctdb-tests: Add "ctdb reloadnodes" test for "node remains deleted"
       via  181658f ctdb-tools: Fix spurious messages about deleted nodes being disconnected
      from  b57c778 rpc_server: Coverity fix for CID 1273079


https://git.samba.org/?p=samba.git;a=shortlog;h=master


- Log -----------------------------------------------------------------
commit 0858b11ff735b535bfeded346c87a0c245d902c7
Author: Martin Schwenke <[email protected]>
Date:   Sun Feb 22 06:37:41 2015 +1100

    ctdb-tests: Use ctdb_node_list_to_map() in tool stubs
    
    Drop copy of old ctdb_control_nodemap().
    
    Signed-off-by: Martin Schwenke <[email protected]>
    Reviewed-by: Amitay Isaacs <[email protected]>
    
    Autobuild-User(master): Amitay Isaacs <[email protected]>
    Autobuild-Date(master): Tue Apr  7 10:20:41 CEST 2015 on sn-devel-104

commit 1ef1cfdc4d6b923357630451177fdcde1d616e87
Author: Martin Schwenke <[email protected]>
Date:   Fri Feb 20 12:34:25 2015 +1100

    ctdb-common: Move ctdb_node_list_to_map() to utilities
    
    Signed-off-by: Martin Schwenke <[email protected]>
    Reviewed-by: Amitay Isaacs <[email protected]>

commit dd52d82c73b26a3fed6dfd4aaf7d51f576d019d9
Author: Martin Schwenke <[email protected]>
Date:   Fri Feb 20 12:31:37 2015 +1100

    ctdb-daemon: Factor out new function ctdb_node_list_to_map()
    
    Change ctdb_control_getnodemap() to use this.
    
    Signed-off-by: Martin Schwenke <[email protected]>
    Reviewed-by: Amitay Isaacs <[email protected]>
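
The sizing idiom behind a node-list-to-map conversion can be sketched in isolation. This is a minimal sketch, not the CTDB code: the struct and function names are hypothetical simplifications, the address field is omitted, and calloc() stands in for talloc_zero_size().

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified, hypothetical stand-ins for the CTDB structures. */
struct node {
	uint32_t pnn;
	uint32_t flags;
};

struct node_and_flags {
	uint32_t pnn;
	uint32_t flags;
};

struct node_map {
	uint32_t num;
	struct node_and_flags nodes[];  /* flexible array member */
};

/* Returns a calloc()ed map that the caller must free(), or NULL on
 * allocation failure.  The size is the header plus one entry per node. */
static struct node_map *node_list_to_map(struct node **nodes,
					 uint32_t num_nodes)
{
	uint32_t i;
	size_t size = offsetof(struct node_map, nodes) +
		num_nodes * sizeof(struct node_and_flags);
	struct node_map *map = calloc(1, size);

	if (map == NULL) {
		return NULL;
	}

	map->num = num_nodes;
	for (i = 0; i < num_nodes; i++) {
		map->nodes[i].pnn = nodes[i]->pnn;
		map->nodes[i].flags = nodes[i]->flags;
	}

	return map;
}
```

Returning NULL instead of exiting lets the caller decide how to report the failure.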

commit ffbe0a6def236f5d0b03d089a7fc3f060eb0e392
Author: Martin Schwenke <[email protected]>
Date:   Wed Feb 4 12:06:56 2015 +1100

    ctdb-tools: Drop the recovery from "reloadnodes"
    
    A recovery is not required: when deleting a node it should already be
    disconnected and when adding a node it will also be disconnected.  The
    new sanity checks in "reloadnodes" ensure that these assumptions are
    met.
    
    Signed-off-by: Martin Schwenke <[email protected]>
    Reviewed-by: Amitay Isaacs <[email protected]>

commit d340f308e76af53b04ae9b5432c4f6c84315303a
Author: Martin Schwenke <[email protected]>
Date:   Tue Feb 10 15:43:03 2015 +1100

    ctdb-daemon: Don't delay reloading the nodes file
    
    Presumably this was done to minimise the chance of a recovery
    occurring while the nodemaps are inconsistent across nodes.
    
    Another potential theory is that the forced recovery in the
    ctdb.c:control_reload_nodes_file() stops another recovery occurring
    for ReRecoveryTimeout seconds, so this delay causes the reloads to
    occur during that period.
    
    This is no longer necessary because recoveries are now explicitly
    disabled while node files are reloaded.
    
    Signed-off-by: Martin Schwenke <[email protected]>
    Reviewed-by: Amitay Isaacs <[email protected]>

commit 85bd9a33eb65d6fd03ad85aeedf141a2813c2bb8
Author: Martin Schwenke <[email protected]>
Date:   Fri Feb 6 20:59:11 2015 +1100

    ctdb-recoverd: Avoid nodemap-related checks when recoveries are disabled
    
    The potential resulting recovery won't run anyway.  Also recoveries
    may have been disabled by "reloadnodes" and if the nodemaps are
    inconsistent between nodes then avoid triggering an unnecessary
    recovery.
    
    Signed-off-by: Martin Schwenke <[email protected]>
    Reviewed-by: Amitay Isaacs <[email protected]>

commit 13dc4a98426b30e7226015b1d8a86ec2e80d6228
Author: Martin Schwenke <[email protected]>
Date:   Mon Feb 9 20:20:44 2015 +1100

    ctdb-tool: Update "reloadnodes" to disable recoveries
    
    If a recovery occurs when some nodes have reloaded and others haven't
    then the nodemaps will be inconsistent so bad things will happen.
    
    Signed-off-by: Martin Schwenke <[email protected]>
    Reviewed-by: Amitay Isaacs <[email protected]>

commit ee9619c28b594b7fec8093b522ac205e5d4eb0ea
Author: Martin Schwenke <[email protected]>
Date:   Fri Feb 6 15:06:44 2015 +1100

    ctdb-recoverd: New message ID CTDB_SRVID_DISABLE_RECOVERIES
    
    Also add test stub support.
    
    Signed-off-by: Martin Schwenke <[email protected]>
    Reviewed-by: Amitay Isaacs <[email protected]>

commit 2ca484cd50c2655c59802cae6c81982b42bf61eb
Author: Martin Schwenke <[email protected]>
Date:   Fri Feb 6 15:03:03 2015 +1100

    ctdb-recoverd: Simplify disable_ip_check_handler() using ctdb_op_disable()
    
    Signed-off-by: Martin Schwenke <[email protected]>
    Reviewed-by: Amitay Isaacs <[email protected]>

commit 108db3396f71a35ef1690a5b483d2728223803df
Author: Martin Schwenke <[email protected]>
Date:   Fri Feb 6 13:05:12 2015 +1100

    ctdb-recoverd: Add slightly more abstraction for disabling takeover runs
    
    Factor out new function srvid_disable_and_reply(), which can be
    re-used.
    
    Signed-off-by: Martin Schwenke <[email protected]>
    Reviewed-by: Amitay Isaacs <[email protected]>

commit ec32d9bea8993778cd6b0fc63bfde492ee21d830
Author: Martin Schwenke <[email protected]>
Date:   Fri Feb 6 14:47:33 2015 +1100

    ctdb-recoverd: Reimplement ReRecoveryTimeout using ctdb_op_disable()
    
    Signed-off-by: Martin Schwenke <[email protected]>
    Reviewed-by: Amitay Isaacs <[email protected]>

commit 281f7e8152e01a15e9df946ee293156ded8b2857
Author: Martin Schwenke <[email protected]>
Date:   Fri Feb 6 14:32:08 2015 +1100

    ctdb-recoverd: Use a goto for do_recovery() failures
    
    This will allow extra things to be done on failure.
    
    Signed-off-by: Martin Schwenke <[email protected]>
    Reviewed-by: Amitay Isaacs <[email protected]>
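
The goto-on-failure pattern can be sketched roughly as follows. This is an illustrative sketch only: recovery_state, run_recovery() and the step_ok array are hypothetical names, and the real do_recovery() operates on struct ctdb_recoverd with many more steps.

```c
#include <stdbool.h>

/* Hypothetical state; the real code uses struct ctdb_recoverd. */
struct recovery_state {
	bool in_progress;
	int failures;
};

/* step_ok[] stands in for the results of successive recovery steps. */
static int run_recovery(struct recovery_state *st,
			const bool *step_ok, int num_steps)
{
	int i;

	st->in_progress = true;

	for (i = 0; i < num_steps; i++) {
		if (!step_ok[i]) {
			/* Every failing step jumps to the same label */
			goto fail;
		}
	}

	st->in_progress = false;
	return 0;

fail:
	/* Single place for any extra work needed on failure */
	st->in_progress = false;
	st->failures++;
	return -1;
}
```

With one exit label, later commits can add failure handling in a single spot rather than at every early return.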

commit a2044c65bc669e7240bd4ffc4b6935f57f493535
Author: Martin Schwenke <[email protected]>
Date:   Sun Feb 8 20:52:12 2015 +1100

    ctdb-recoverd: Reimplement disabling takeover runs using ctdb_op_disable()
    
    Signed-off-by: Martin Schwenke <[email protected]>
    Reviewed-by: Amitay Isaacs <[email protected]>

commit 55b246195b282175022ea2ae239ebcd5d4970d3f
Author: Martin Schwenke <[email protected]>
Date:   Sun Feb 8 20:50:38 2015 +1100

    ctdb-recoverd: Add a new abstraction ctdb_op_disable()
    
    This can be used to disable and re-enable an operation, and do all the
    relevant sanity checking.
    
    Most of this is from existing functions
    disable_takeover_runs_handler(), clear_takeover_runs_disable() and
    reenable_takeover_runs().
    
    Signed-off-by: Martin Schwenke <[email protected]>
    Reviewed-by: Amitay Isaacs <[email protected]>
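
The disable/re-enable behaviour described above can be sketched without an event loop. This is a simplified sketch under stated assumptions: the op_* names are hypothetical, and a stored deadline compared against a caller-supplied "now" replaces the tevent timer used by the real ctdb_op_disable().

```c
#include <errno.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for ctdb_op_state: a deadline replaces the timer. */
struct op_state {
	time_t disabled_until;  /* 0 means "not disabled" */
	bool in_progress;
	const char *name;
};

static void op_init(struct op_state *s, const char *name)
{
	s->disabled_until = 0;
	s->in_progress = false;
	s->name = name;
}

static bool op_is_disabled(const struct op_state *s, time_t now)
{
	return s->disabled_until != 0 && now < s->disabled_until;
}

/* A timeout of 0 re-enables the operation.  Otherwise refuse with
 * -EAGAIN while the operation is running, else (re)set the deadline. */
static int op_disable(struct op_state *s, time_t now, unsigned int timeout)
{
	if (timeout == 0) {
		s->disabled_until = 0;
		return 0;
	}

	if (s->in_progress) {
		return -EAGAIN;
	}

	s->disabled_until = now + (time_t)timeout;
	return 0;
}

/* Begin the operation unless it is currently disabled. */
static bool op_begin(struct op_state *s, time_t now)
{
	if (op_is_disabled(s, now)) {
		return false;
	}

	s->in_progress = true;
	return true;
}

static void op_end(struct op_state *s)
{
	s->in_progress = false;
}
```

One state object per operation (takeover runs, recoveries) keeps the sanity checks in one place.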

commit ae9cd037ee96c000b11aaa7d171463b00fe4850c
Author: Martin Schwenke <[email protected]>
Date:   Wed Feb 4 17:18:12 2015 +1100

    ctdb-daemon: Pass on consistent flag information to recovery daemon
    
    Signed-off-by: Martin Schwenke <[email protected]>
    Pair-programmed-with: Amitay Isaacs <[email protected]>
    Reviewed-by: Amitay Isaacs <[email protected]>

commit 4b972bbdb3e2d3f35fad3c47dc6e84f0fee513c4
Author: Martin Schwenke <[email protected]>
Date:   Wed Apr 1 18:00:04 2015 +1100

    ctdb-tests: Add "ctdb reloadnodes" test for "node remains deleted"
    
    Signed-off-by: Martin Schwenke <[email protected]>
    Reviewed-by: Amitay Isaacs <[email protected]>

commit 181658f5bb180c48f88504a703ed3a3758ac3b5b
Author: Martin Schwenke <[email protected]>
Date:   Wed Apr 1 17:10:46 2015 +1100

    ctdb-tools: Fix spurious messages about deleted nodes being disconnected
    
    The code was too "clever".  The 4 different cases should be separate.
    The "node remains deleted" case doesn't need the IP address comparison
    (always 0.0.0.0) or the disconnected check.
    
    Signed-off-by: Martin Schwenke <[email protected]>
    Reviewed-by: Amitay Isaacs <[email protected]>

-----------------------------------------------------------------------

Summary of changes:
 ctdb/common/ctdb_util.c                            |  27 ++
 ctdb/include/ctdb_private.h                        |   3 +
 ctdb/include/ctdb_protocol.h                       |   3 +
 ctdb/server/ctdb_monitor.c                         |   1 +
 ctdb/server/ctdb_recover.c                         |  47 +---
 ctdb/server/ctdb_recoverd.c                        | 293 +++++++++++++--------
 ctdb/tests/src/ctdb_test_stubs.c                   |  50 +---
 ...eloadnodes.001.sh => stubby.reloadnodes.024.sh} |   9 +-
 ctdb/tools/ctdb.c                                  |  31 ++-
 9 files changed, 263 insertions(+), 201 deletions(-)
 copy ctdb/tests/tool/{stubby.reloadnodes.001.sh => stubby.reloadnodes.024.sh} (72%)


Changeset truncated at 500 lines:

diff --git a/ctdb/common/ctdb_util.c b/ctdb/common/ctdb_util.c
index 76fb06d..8e2e430 100644
--- a/ctdb/common/ctdb_util.c
+++ b/ctdb/common/ctdb_util.c
@@ -579,6 +579,33 @@ struct ctdb_node_map *ctdb_read_nodes_file(TALLOC_CTX *mem_ctx,
        return ret;
 }
 
+struct ctdb_node_map *
+ctdb_node_list_to_map(struct ctdb_node **nodes, uint32_t num_nodes,
+                     TALLOC_CTX *mem_ctx)
+{
+       uint32_t i;
+       size_t size;
+       struct ctdb_node_map *node_map;
+
+       size = offsetof(struct ctdb_node_map, nodes) +
+               num_nodes * sizeof(struct ctdb_node_and_flags);
+       node_map  = (struct ctdb_node_map *)talloc_zero_size(mem_ctx, size);
+       if (node_map == NULL) {
+               DEBUG(DEBUG_ERR,
+                     (__location__ " Failed to allocate nodemap array\n"));
+               return NULL;
+       }
+
+       node_map->num = num_nodes;
+       for (i=0; i<num_nodes; i++) {
+               node_map->nodes[i].addr  = nodes[i]->address;
+               node_map->nodes[i].pnn   = nodes[i]->pnn;
+               node_map->nodes[i].flags = nodes[i]->flags;
+       }
+
+       return node_map;
+}
+
 const char *ctdb_eventscript_call_names[] = {
        "init",
        "setup",
diff --git a/ctdb/include/ctdb_private.h b/ctdb/include/ctdb_private.h
index b37d5bb..532f859 100644
--- a/ctdb/include/ctdb_private.h
+++ b/ctdb/include/ctdb_private.h
@@ -1388,6 +1388,9 @@ int ctdb_client_async_control(struct ctdb_context *ctdb,
                                client_async_callback fail_callback,
                                void *callback_data);
 
+struct ctdb_node_map *
+ctdb_node_list_to_map(struct ctdb_node **nodes, uint32_t num_nodes,
+                     TALLOC_CTX *mem_ctx);
 struct ctdb_node_map *ctdb_read_nodes_file(TALLOC_CTX *mem_ctx,
                                           const char *nlist);
 void ctdb_load_nodes_file(struct ctdb_context *ctdb);
diff --git a/ctdb/include/ctdb_protocol.h b/ctdb/include/ctdb_protocol.h
index c828c01..4dea56b 100644
--- a/ctdb/include/ctdb_protocol.h
+++ b/ctdb/include/ctdb_protocol.h
@@ -156,6 +156,9 @@ struct ctdb_call_info {
 /* A message handler ID to stop takeover runs from occurring */
 #define CTDB_SRVID_DISABLE_TAKEOVER_RUNS 0xFB03000000000000LL
 
+/* A message handler ID to stop recoveries from occurring */
+#define CTDB_SRVID_DISABLE_RECOVERIES 0xFB04000000000000LL
+
 /* A message id to ask the recovery daemon to temporarily disable the
    public ip checks
 */
diff --git a/ctdb/server/ctdb_monitor.c b/ctdb/server/ctdb_monitor.c
index 9b8df6d..5c0c055 100644
--- a/ctdb/server/ctdb_monitor.c
+++ b/ctdb/server/ctdb_monitor.c
@@ -497,6 +497,7 @@ int32_t ctdb_control_modflags(struct ctdb_context *ctdb, TDB_DATA indata)
        }
 
        /* tell the recovery daemon something has changed */
+       c->new_flags = node->flags;
        ctdb_daemon_send_message(ctdb, ctdb->pnn,
                                 CTDB_SRVID_SET_NODE_FLAGS, indata);
 
diff --git a/ctdb/server/ctdb_recover.c b/ctdb/server/ctdb_recover.c
index eb3f46d..7a684d5 100644
--- a/ctdb/server/ctdb_recover.c
+++ b/ctdb/server/ctdb_recover.c
@@ -118,30 +118,19 @@ ctdb_control_getdbmap(struct ctdb_context *ctdb, uint32_t opcode, TDB_DATA indat
        return 0;
 }
 
-int 
+int
 ctdb_control_getnodemap(struct ctdb_context *ctdb, uint32_t opcode, TDB_DATA indata, TDB_DATA *outdata)
 {
-       uint32_t i, num_nodes;
-       struct ctdb_node_map *node_map;
-
        CHECK_CONTROL_DATA_SIZE(0);
 
-       num_nodes = ctdb->num_nodes;
-
-       outdata->dsize = offsetof(struct ctdb_node_map, nodes) + num_nodes*sizeof(struct ctdb_node_and_flags);
-       outdata->dptr  = (unsigned char *)talloc_zero_size(outdata, outdata->dsize);
-       if (!outdata->dptr) {
-               DEBUG(DEBUG_ALERT, (__location__ " Failed to allocate nodemap array\n"));
-               exit(1);
+       outdata->dptr  = (unsigned char *)ctdb_node_list_to_map(ctdb->nodes,
+                                                               ctdb->num_nodes,
+                                                               outdata);
+       if (outdata->dptr == NULL) {
+               return -1;
        }
 
-       node_map = (struct ctdb_node_map *)outdata->dptr;
-       node_map->num = num_nodes;
-       for (i=0; i<num_nodes; i++) {
-               node_map->nodes[i].addr = ctdb->nodes[i]->address;
-               node_map->nodes[i].pnn   = ctdb->nodes[i]->pnn;
-               node_map->nodes[i].flags = ctdb->nodes[i]->flags;
-       }
+       outdata->dsize = talloc_get_size(outdata->dptr);
 
        return 0;
 }
@@ -177,14 +166,15 @@ ctdb_control_getnodemapv4(struct ctdb_context *ctdb, uint32_t opcode, TDB_DATA i
        return 0;
 }
 
-static void
-ctdb_reload_nodes_event(struct event_context *ev, struct timed_event *te, 
-                              struct timeval t, void *private_data)
+/*
+  reload the nodes file
+*/
+int
+ctdb_control_reload_nodes_file(struct ctdb_context *ctdb, uint32_t opcode)
 {
        int i, num_nodes;
-       struct ctdb_context *ctdb = talloc_get_type(private_data, struct ctdb_context);
        TALLOC_CTX *tmp_ctx;
-       struct ctdb_node **nodes;       
+       struct ctdb_node **nodes;
 
        tmp_ctx = talloc_new(ctdb);
 
@@ -225,17 +215,6 @@ ctdb_reload_nodes_event(struct event_context *ev, struct timed_event *te,
        ctdb_daemon_send_message(ctdb, ctdb->pnn, CTDB_SRVID_RELOAD_NODES, tdb_null);
 
        talloc_free(tmp_ctx);
-       return;
-}
-
-/*
-  reload the nodes file after a short delay (so that we can send the response
-  back first
-*/
-int 
-ctdb_control_reload_nodes_file(struct ctdb_context *ctdb, uint32_t opcode)
-{
-       event_add_timed(ctdb->ev, ctdb, timeval_current_ofs(1,0), ctdb_reload_nodes_event, ctdb);
 
        return 0;
 }
diff --git a/ctdb/server/ctdb_recoverd.c b/ctdb/server/ctdb_recoverd.c
index 99018be..673075a 100644
--- a/ctdb/server/ctdb_recoverd.c
+++ b/ctdb/server/ctdb_recoverd.c
@@ -117,6 +117,103 @@ nomem:
        srvid_request_reply(ctdb, request, result);
 }
 
+/* An abstraction to allow an operation (takeover runs, recoveries,
+ * ...) to be disabled for a given timeout */
+struct ctdb_op_state {
+       struct tevent_timer *timer;
+       bool in_progress;
+       const char *name;
+};
+
+static struct ctdb_op_state *ctdb_op_init(TALLOC_CTX *mem_ctx, const char *name)
+{
+       struct ctdb_op_state *state = talloc_zero(mem_ctx, struct ctdb_op_state);
+
+       if (state != NULL) {
+               state->in_progress = false;
+               state->name = name;
+       }
+
+       return state;
+}
+
+static bool ctdb_op_is_disabled(struct ctdb_op_state *state)
+{
+       return state->timer != NULL;
+}
+
+static bool ctdb_op_begin(struct ctdb_op_state *state)
+{
+       if (ctdb_op_is_disabled(state)) {
+               DEBUG(DEBUG_NOTICE,
+                     ("Unable to begin - %s are disabled\n", state->name));
+               return false;
+       }
+
+       state->in_progress = true;
+       return true;
+}
+
+static bool ctdb_op_end(struct ctdb_op_state *state)
+{
+       return state->in_progress = false;
+}
+
+static bool ctdb_op_is_in_progress(struct ctdb_op_state *state)
+{
+       return state->in_progress;
+}
+
+static void ctdb_op_enable(struct ctdb_op_state *state)
+{
+       TALLOC_FREE(state->timer);
+}
+
+static void ctdb_op_timeout_handler(struct event_context *ev,
+                                   struct timed_event *te,
+                                   struct timeval yt, void *p)
+{
+       struct ctdb_op_state *state =
+               talloc_get_type(p, struct ctdb_op_state);
+
+       DEBUG(DEBUG_NOTICE,("Reenabling %s after timeout\n", state->name));
+       ctdb_op_enable(state);
+}
+
+static int ctdb_op_disable(struct ctdb_op_state *state,
+                          struct tevent_context *ev,
+                          uint32_t timeout)
+{
+       if (timeout == 0) {
+               DEBUG(DEBUG_NOTICE,("Reenabling %s\n", state->name));
+               ctdb_op_enable(state);
+               return 0;
+       }
+
+       if (state->in_progress) {
+               DEBUG(DEBUG_ERR,
+                     ("Unable to disable %s - in progress\n", state->name));
+               return -EAGAIN;
+       }
+
+       DEBUG(DEBUG_NOTICE,("Disabling %s for %u seconds\n",
+                           state->name, timeout));
+
+       /* Clear any old timers */
+       talloc_free(state->timer);
+
+       /* Arrange for the timeout to occur */
+       state->timer = tevent_add_timer(ev, state,
+                                       timeval_current_ofs(timeout, 0),
+                                       ctdb_op_timeout_handler, state);
+       if (state->timer == NULL) {
+               DEBUG(DEBUG_ERR,(__location__ " Unable to setup timer\n"));
+               return -ENOMEM;
+       }
+
+       return 0;
+}
+
 struct ctdb_banning_state {
        uint32_t count;
        struct timeval last_reported_time;
@@ -141,8 +238,8 @@ struct ctdb_recoverd {
        struct timed_event *election_timeout;
        struct vacuum_info *vacuum_info;
        struct srvid_requests *reallocate_requests;
-       bool takeover_run_in_progress;
-       TALLOC_CTX *takeover_runs_disable_ctx;
+       struct ctdb_op_state *takeover_run;
+       struct ctdb_op_state *recovery;
        struct ctdb_control_get_ifaces *ifaces;
        uint32_t *force_rebalance_nodes;
 };
@@ -1566,7 +1663,7 @@ static int ctdb_reload_remote_public_ips(struct ctdb_context *ctdb,
                }
 
                if (ctdb->do_checkpublicip &&
-                   rec->takeover_runs_disable_ctx == NULL &&
+                   !ctdb_op_is_disabled(rec->takeover_run) &&
                    verify_remote_ip_allocation(ctdb,
                                                 node->known_public_ips,
                                                 node->pnn)) {
@@ -1691,19 +1788,14 @@ static bool do_takeover_run(struct ctdb_recoverd *rec,
 
        DEBUG(DEBUG_NOTICE, ("Takeover run starting\n"));
 
-       if (rec->takeover_run_in_progress) {
+       if (ctdb_op_is_in_progress(rec->takeover_run)) {
                DEBUG(DEBUG_ERR, (__location__
                                  " takeover run already in progress \n"));
                ok = false;
                goto done;
        }
 
-       rec->takeover_run_in_progress = true;
-
-       /* If takeover runs are in disabled then fail... */
-       if (rec->takeover_runs_disable_ctx != NULL) {
-               DEBUG(DEBUG_ERR,
-                     ("Takeover runs are disabled so refusing to run one\n"));
+       if (!ctdb_op_begin(rec->takeover_run)) {
                ok = false;
                goto done;
        }
@@ -1767,7 +1859,7 @@ static bool do_takeover_run(struct ctdb_recoverd *rec,
 done:
        rec->need_takeover_run = !ok;
        talloc_free(nodes);
-       rec->takeover_run_in_progress = false;
+       ctdb_op_end(rec->takeover_run);
 
        DEBUG(DEBUG_NOTICE, ("Takeover run %s\n", ok ? "completed successfully" : "unsuccessful"));
        return ok;
@@ -1796,16 +1888,20 @@ static int do_recovery(struct ctdb_recoverd *rec,
        /* if recovery fails, force it again */
        rec->need_recovery = true;
 
+       if (!ctdb_op_begin(rec->recovery)) {
+               return -1;
+       }
+
        if (rec->election_timeout) {
                /* an election is in progress */
                DEBUG(DEBUG_ERR, ("do_recovery called while election in progress - try again later\n"));
-               return -1;
+               goto fail;
        }
 
        ban_misbehaving_nodes(rec, &self_ban);
        if (self_ban) {
                DEBUG(DEBUG_NOTICE, ("This node was banned, aborting recovery\n"));
-               return -1;
+               goto fail;
        }
 
         if (ctdb->recovery_lock_file != NULL) {
@@ -1823,14 +1919,14 @@ static int do_recovery(struct ctdb_recoverd *rec,
                                         */
                                        DEBUG(DEBUG_ERR, ("Unable to get recovery lock"
                                                          " - retrying recovery\n"));
-                                       return -1;
+                                       goto fail;
                                }
 
                                DEBUG(DEBUG_ERR,("Unable to get recovery lock - aborting recovery "
                                                 "and ban ourself for %u seconds\n",
                                                 ctdb->tunable.recovery_ban_period));
                                ctdb_ban_node(rec, pnn, ctdb->tunable.recovery_ban_period);
-                               return -1;
+                               goto fail;
                        }
                        ctdb_ctrl_report_recd_lock_latency(ctdb,
                                                           CONTROL_TIMEOUT(),
@@ -1846,7 +1942,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
        ret = ctdb_ctrl_getdbmap(ctdb, CONTROL_TIMEOUT(), pnn, mem_ctx, &dbmap);
        if (ret != 0) {
                DEBUG(DEBUG_ERR, (__location__ " Unable to get dbids from node :%u\n", pnn));
-               return -1;
+               goto fail;
        }
 
        /* we do the db creation before we set the recovery mode, so the freeze happens
@@ -1856,14 +1952,14 @@ static int do_recovery(struct ctdb_recoverd *rec,
        ret = create_missing_local_databases(ctdb, nodemap, pnn, &dbmap, mem_ctx);
        if (ret != 0) {
                DEBUG(DEBUG_ERR, (__location__ " Unable to create missing local databases\n"));
-               return -1;
+               goto fail;
        }
 
        /* verify that all other nodes have all our databases */
        ret = create_missing_remote_databases(ctdb, nodemap, pnn, dbmap, mem_ctx);
        if (ret != 0) {
                DEBUG(DEBUG_ERR, (__location__ " Unable to create missing remote databases\n"));
-               return -1;
+               goto fail;
        }
        DEBUG(DEBUG_NOTICE, (__location__ " Recovery - created remote databases\n"));
 
@@ -1884,14 +1980,14 @@ static int do_recovery(struct ctdb_recoverd *rec,
        ret = set_recovery_mode(ctdb, rec, nodemap, CTDB_RECOVERY_ACTIVE);
        if (ret != 0) {
                DEBUG(DEBUG_ERR, (__location__ " Unable to set recovery mode to active on cluster\n"));
-               return -1;
+               goto fail;
        }
 
        /* execute the "startrecovery" event script on all nodes */
        ret = run_startrecovery_eventscript(rec, nodemap);
        if (ret!=0) {
                DEBUG(DEBUG_ERR, (__location__ " Unable to run the 'startrecovery' event on cluster\n"));
-               return -1;
+               goto fail;
        }
 
        /*
@@ -1908,7 +2004,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
                                DEBUG(DEBUG_WARNING, (__location__ "Unable to update flags on inactive node %d\n", i));
                        } else {
                                DEBUG(DEBUG_ERR, (__location__ " Unable to update flags on all nodes for node %d\n", i));
-                               return -1;
+                               goto fail;
                        }
                }
        }
@@ -1932,7 +2028,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
        ret = ctdb_ctrl_setvnnmap(ctdb, CONTROL_TIMEOUT(), pnn, mem_ctx, vnnmap);
        if (ret != 0) {
                DEBUG(DEBUG_ERR, (__location__ " Unable to set vnnmap for node %u\n", pnn));
-               return -1;
+               goto fail;
        }
 
        data.dptr = (void *)&generation;
@@ -1954,7 +2050,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
                                        NULL) != 0) {
                        DEBUG(DEBUG_ERR,("Failed to cancel recovery transaction\n"));
                }
-               return -1;
+               goto fail;
        }
 
        DEBUG(DEBUG_NOTICE,(__location__ " started transactions on all nodes\n"));
@@ -1966,7 +2062,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
                                       pnn, nodemap, generation);
                if (ret != 0) {
                        DEBUG(DEBUG_ERR, (__location__ " Failed to recover database 0x%x\n", dbmap->dbs[i].dbid));
-                       return -1;
+                       goto fail;
                }
        }
 
@@ -1979,7 +2075,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
                                        NULL, NULL,
                                        NULL) != 0) {
                DEBUG(DEBUG_ERR, (__location__ " Unable to commit recovery changes. Recovery failed.\n"));
-               return -1;
+               goto fail;
        }
 
        DEBUG(DEBUG_NOTICE, (__location__ " Recovery - committed databases\n"));
@@ -1989,7 +2085,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
        ret = update_capabilities(ctdb, nodemap);
        if (ret!=0) {
                DEBUG(DEBUG_ERR, (__location__ " Unable to update node capabilities.\n"));
-               return -1;
+               goto fail;
        }
 
        /* build a new vnn map with all the currently active and
@@ -2029,7 +2125,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
        ret = update_vnnmap_on_all_nodes(ctdb, nodemap, pnn, vnnmap, mem_ctx);
        if (ret != 0) {
                DEBUG(DEBUG_ERR, (__location__ " Unable to update vnnmap on all nodes\n"));
-               return -1;
+               goto fail;
        }
 
        DEBUG(DEBUG_NOTICE, (__location__ " Recovery - updated vnnmap\n"));
@@ -2038,7 +2134,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
        ret = set_recovery_master(ctdb, nodemap, pnn);
        if (ret!=0) {
                DEBUG(DEBUG_ERR, (__location__ " Unable to set recovery master\n"));
-               return -1;
+               goto fail;
        }
 
        DEBUG(DEBUG_NOTICE, (__location__ " Recovery - updated recmaster\n"));
@@ -2047,7 +2143,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
        ret = set_recovery_mode(ctdb, rec, nodemap, CTDB_RECOVERY_NORMAL);
        if (ret != 0) {
                DEBUG(DEBUG_ERR, (__location__ " Unable to set recovery mode to normal on cluster\n"));
-               return -1;
+               goto fail;
        }
 
        DEBUG(DEBUG_NOTICE, (__location__ " Recovery - disabled recovery mode\n"));
@@ -2058,7 +2154,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
                DEBUG(DEBUG_ERR,("Failed to read public ips from remote node %d\n",
                                 culprit));
                rec->need_takeover_run = true;
-               return -1;
+               goto fail;
        }
 
        do_takeover_run(rec, nodemap, false);
@@ -2067,7 +2163,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
        ret = run_recovered_eventscript(rec, nodemap, "do_recovery");
        if (ret!=0) {
                DEBUG(DEBUG_ERR, (__location__ " Unable to run the 'recovered' event on cluster. Recovery process failed.\n"));
-               return -1;
+               goto fail;
        }
 
        DEBUG(DEBUG_NOTICE, (__location__ " Recovery - finished the recovered event\n"));


-- 
Samba Shared Repository
