The branch, 2.5 has been updated via 93deb93164a0522448bdafa60ecb3f1892aad8f1 (commit) via 04e5184ba6fd1a6e190f822f2b0ae7c1ecb542a8 (commit) via 14427b6ff35af8fd20471239aa146e3a1d28e89d (commit) via fdb899d7c91606dee41d3284ad66bb9bca2573f9 (commit) via 097342f5abd2278bd6facf1ed9e2bd12cd0852d4 (commit) via 84905282afa56a08c273b7fe76b8fd00917ac1ea (commit) via 938a6cae2fc8eb40716584e2998cab0b838786ab (commit) via e7447459742a491b31bafb94555db4c17340622f (commit) via 53e21c2aafa9c4912fd2bd90ef16570e4ec772a3 (commit) via 9d2ba86dc9ec8bebc7d28c62d729c5dbe2aca1ec (commit) via 5471fe7a68549ddac6cf6e9a235477adb01c4086 (commit) from 163852e4a796b9f3ea4a3097c4e139209a9c51f9 (commit)
https://git.samba.org/?p=ctdb.git;a=shortlog;h=2.5 - Log ----------------------------------------------------------------- commit 93deb93164a0522448bdafa60ecb3f1892aad8f1 Author: Michael Adam <ob...@samba.org> Date: Fri Jun 12 10:59:54 2015 +0200 vacuum: revert "Do not delete VACUUM MIGRATED records immediately" This reverts commit 257311e337065f089df688cbf261d2577949203d. That commit was due to a misunderstanding, and it does not fix what it was supposed to fix. Signed-off-by: Michael Adam <ob...@samba.org> Reviewed-by: Amitay Isaacs <ami...@gmail.com> (Imported from commit 1898200481f64676e596e52dc177c8d70ca1a00c) commit 04e5184ba6fd1a6e190f822f2b0ae7c1ecb542a8 Author: Stefan Metzmacher <me...@samba.org> Date: Fri Jun 5 10:30:39 2015 +0200 ib: make sure the tevent_fd is removed before the fd is closed BUG: https://bugzilla.samba.org/show_bug.cgi?id=11316 Signed-off-by: Stefan Metzmacher <me...@samba.org> Reviewed-by: Volker Lendecke <v...@samba.org> (Imported from commit 53ff3e4f31f3debd98f9293171c023a0a406858d) commit 14427b6ff35af8fd20471239aa146e3a1d28e89d Author: Stefan Metzmacher <me...@samba.org> Date: Tue Jun 2 12:43:17 2015 +0200 locking: move all auto_mark logic into process_callbacks() The caller should not dereference lock_ctx after invoking process_callbacks(), it might be destroyed already. BUG: https://bugzilla.samba.org/show_bug.cgi?id=11293 Signed-off-by: Stefan Metzmacher <me...@samba.org> Reviewed-by: Amitay Isaacs <ami...@gmail.com> Autobuild-User(master): Amitay Isaacs <ami...@samba.org> Autobuild-Date(master): Fri Jun 12 15:28:57 CEST 2015 on sn-devel-104 (Imported from commit b3a18d66c00dba73a3f56a6f95781b4d34db1fe2) commit fdb899d7c91606dee41d3284ad66bb9bca2573f9 Author: Stefan Metzmacher <me...@samba.org> Date: Tue Jun 2 12:39:17 2015 +0200 locking: make process_callbacks() more robust We should not dereference lock_ctx after invoking the callback in the auto_mark == false case. The callback could have destroyed it. BUG: https://bugzilla.samba.org/show_bug.cgi?id=11293 Signed-off-by: Stefan Metzmacher <me...@samba.org> Reviewed-by: Amitay Isaacs <ami...@gmail.com> (Imported from commit a2690bc3f4e28a2ed50ccb47cb404fc8570fde6d) commit 097342f5abd2278bd6facf1ed9e2bd12cd0852d4 Author: Amitay Isaacs <ami...@gmail.com> Date: Tue Jun 2 13:15:37 2015 +1000 locking: Add a comment to explain auto_mark usage BUG: https://bugzilla.samba.org/show_bug.cgi?id=11293 Signed-off-by: Amitay Isaacs <ami...@gmail.com> Reviewed-by: Stefan Metzmacher <me...@samba.org> (Imported from commit 89849c4d31c0bb0c47864e11abc89efe7d812d87) commit 84905282afa56a08c273b7fe76b8fd00917ac1ea Author: Amitay Isaacs <ami...@gmail.com> Date: Tue Jun 2 11:25:44 2015 +1000 locking: Avoid resetting talloc destructor Let ctdb_lock_request_destructor() take care of the proper cleanup. If the request if freed from the callback function, then the lock context should not be freed. Setting request->lctx to NULL takes care of that in the destructor. BUG: https://bugzilla.samba.org/show_bug.cgi?id=11293 Signed-off-by: Amitay Isaacs <ami...@gmail.com> Reviewed-by: Stefan Metzmacher <me...@samba.org> (Imported from commit bc747030d435447e62262541cf2e74be4c4229d8) commit 938a6cae2fc8eb40716584e2998cab0b838786ab Author: Amitay Isaacs <ami...@gmail.com> Date: Tue Jun 2 11:15:11 2015 +1000 locking: Avoid memory leak in the failure case BUG: https://bugzilla.samba.org/show_bug.cgi?id=11293 Signed-off-by: Amitay Isaacs <ami...@gmail.com> Reviewed-by: Stefan Metzmacher <me...@samba.org> (Imported from commit 2b352ff20597b9e34b3777d35deca1bf56209f8a) commit e7447459742a491b31bafb94555db4c17340622f Author: Amitay Isaacs <ami...@gmail.com> Date: Tue Jun 2 00:22:07 2015 +1000 locking: Set destructor when lock_context is created There is already code in the destructor to correctly remove it from the pending or the active queue. This also ensures that when lock context is in pending queue and if the lock request gets freed, the lock context is correctly removed from the pending queue. Thanks to Stefan Metzmacher for noticing this and suggesting the fix. BUG: https://bugzilla.samba.org/show_bug.cgi?id=11293 Signed-off-by: Amitay Isaacs <ami...@gmail.com> Reviewed-by: Stefan Metzmacher <me...@samba.org> (Imported from commit 5ae6a8f2fff5b5f4d46f496fd83f555be4b3d448) commit 53e21c2aafa9c4912fd2bd90ef16570e4ec772a3 Author: Stefan Metzmacher <me...@samba.org> Date: Tue Jun 2 00:15:11 2015 +1000 locking: Set the lock_ctx->request to NULL when request is freed The code was added to ctdb_lock_context_destructor() to ensure that the if a lock_ctx gets freed first, the lock_request does not have a dangling pointer. However, the reverse is also true. When a lock_request is freed, then lock_ctx should not have a dangling pointer. In commit 374cbc7b0ff68e04ee4e395935509c7df817b3c0, the code for second condition was dropped causing a regression. BUG: https://bugzilla.samba.org/show_bug.cgi?id=11293 Signed-off-by: Stefan Metzmacher <me...@samba.org> Reviewed-by: Amitay Isaacs <ami...@gmail.com> (Imported from commit 752ec31bcbbfe9f5b3b1c5dde4179d69f41cb53c) commit 9d2ba86dc9ec8bebc7d28c62d729c5dbe2aca1ec Author: Stefan Metzmacher <me...@samba.org> Date: Tue May 26 16:45:34 2015 +0200 locking: Avoid memory corruption in ctdb_lock_context_destructor If the lock request is freed from within the callback, then setting lock_ctx->request to NULL in ctdb_lock_context_destructor will end up corrupting memory. In this case, lock_ctx->request could be reallocated and pointing to something else. This may cause unexpected abort trying to dereference a NULL pointer. So, set lock_ctx->request to NULL before processing callbacks. This avoids the following valgrind problem. ==3636== Invalid write of size 8 ==3636== at 0x151F3D: ctdb_lock_context_destructor (ctdb_lock.c:276) ==3636== by 0x58B3618: _talloc_free_internal (talloc.c:993) ==3636== by 0x58AD692: _talloc_free_children_internal (talloc.c:1472) ==3636== by 0x58AD692: _talloc_free_internal (talloc.c:1019) ==3636== by 0x58AD692: _talloc_free (talloc.c:1594) ==3636== by 0x15292E: ctdb_lock_handler (ctdb_lock.c:471) ==3636== by 0x56A535A: epoll_event_loop (tevent_epoll.c:728) ==3636== by 0x56A535A: epoll_event_loop_once (tevent_epoll.c:926) ==3636== by 0x56A3826: std_event_loop_once (tevent_standard.c:114) ==3636== by 0x569FFFC: _tevent_loop_once (tevent.c:533) ==3636== by 0x56A019A: tevent_common_loop_wait (tevent.c:637) ==3636== by 0x56A37C6: std_event_loop_wait (tevent_standard.c:140) ==3636== by 0x11E03A: ctdb_start_daemon (ctdb_daemon.c:1320) ==3636== by 0x118557: main (ctdbd.c:321) ==3636== Address 0x9c5b660 is 96 bytes inside a block of size 120 free'd ==3636== at 0x4C29D17: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) ==3636== by 0x58B32D3: _talloc_free_internal (talloc.c:1063) ==3636== by 0x58B3232: _talloc_free_children_internal (talloc.c:1472) ==3636== by 0x58B3232: _talloc_free_internal (talloc.c:1019) ==3636== by 0x58B3232: _talloc_free_children_internal (talloc.c:1472) ==3636== by 0x58B3232: _talloc_free_internal (talloc.c:1019) ==3636== by 0x58AD692: _talloc_free_children_internal (talloc.c:1472) ==3636== by 0x58AD692: _talloc_free_internal (talloc.c:1019) ==3636== by 0x58AD692: _talloc_free (talloc.c:1594) ==3636== by 0x11EC30: daemon_incoming_packet (ctdb_daemon.c:844) ==3636== by 0x136F4A: lock_fetch_callback (ctdb_ltdb_server.c:268) ==3636== by 0x152489: process_callbacks (ctdb_lock.c:353) ==3636== by 0x152489: ctdb_lock_handler (ctdb_lock.c:468) ==3636== by 0x56A535A: epoll_event_loop (tevent_epoll.c:728) ==3636== by 0x56A535A: epoll_event_loop_once (tevent_epoll.c:926) ==3636== by 0x56A3826: std_event_loop_once (tevent_standard.c:114) ==3636== by 0x569FFFC: _tevent_loop_once (tevent.c:533) ==3636== by 0x56A019A: tevent_common_loop_wait (tevent.c:637) ==3636== by 0x56A37C6: std_event_loop_wait (tevent_standard.c:140) ==3636== by 0x11E03A: ctdb_start_daemon (ctdb_daemon.c:1320) ==3636== by 0x118557: main (ctdbd.c:321) BUG: https://bugzilla.samba.org/show_bug.cgi?id=11293 Signed-off-by: Stefan Metzmacher <me...@samba.org> Reviewed-by: Amitay Isaacs <ami...@gmail.com> (Imported from commit ee02e40e869fd46f113d016122dd5384b7887228) commit 5471fe7a68549ddac6cf6e9a235477adb01c4086 Author: Martin Schwenke <mar...@meltin.net> Date: Tue Apr 28 13:51:00 2015 +1000 scripts: Add alternative network family monitoring for NFS For example, adding a file called nfs-rpc-checks.d/20.nfsd@udp.check will cause NFS to be checked on UDP as well, using a separate counter. Signed-off-by: Martin Schwenke <mar...@meltin.net> Reviewed-by: Amitay Isaacs <ami...@gmail.com> Autobuild-User(master): Amitay Isaacs <ami...@samba.org> Autobuild-Date(master): Thu Apr 30 09:24:12 CEST 2015 on sn-devel-104 (Imported from commit e359d826a42656bb02ca2ab85f0fa886a046cb58) ----------------------------------------------------------------------- Summary of changes: config/functions | 19 ++++++++++++++---- ib/ibwrapper.c | 21 ++++++++++---------- server/ctdb_lock.c | 49 +++++++++++++++++++++++++++++------------------ server/ctdb_ltdb_server.c | 5 ----- 4 files changed, 55 insertions(+), 39 deletions(-) Changeset truncated at 500 lines: diff --git a/config/functions b/config/functions index 33a3a3a..80639c4 100755 --- a/config/functions +++ b/config/functions @@ -271,7 +271,16 @@ nfs_check_rpc_services () _t="${_f%.check}" _prog_name="${_t##*/[0-9][0-9].}" - if _nfs_check_rpc_common "$_prog_name" ; then + # If $_prog_name contains '@' then the bit after it is the + # address family. + _family="${_prog_name#*@}" + if [ "$_family" = "$_prog_name" ] ; then + _family="" + else + _prog_name="${_prog_name%@*}" + fi + + if _nfs_check_rpc_common "$_prog_name" "$_family" ; then # This RPC service is up, check next service... continue fi @@ -295,6 +304,7 @@ nfs_check_rpc_services () _nfs_check_rpc_common () { _prog_name="$1" + _family="$2" # Some platforms don't have separate programs for all services. case "$_prog_name" in @@ -328,9 +338,9 @@ _nfs_check_rpc_common () exit 1 esac - _service_name="nfs_${_prog_name}" + _service_name="nfs_${_prog_name}${_family:+_}${_family}" - if ctdb_check_rpc "$_rpc_prog" $_version >/dev/null ; then + if ctdb_check_rpc "$_rpc_prog" "$_version" "$_family" >/dev/null ; then ctdb_counter_init "$_service_name" return 0 fi @@ -432,10 +442,11 @@ ctdb_check_rpc () { progname="$1" version="$2" + _family="${3:-tcp}" _localhost="${CTDB_RPCINFO_LOCALHOST:-127.0.0.1}" - if ! ctdb_check_rpc_out=$(rpcinfo -T tcp $_localhost $progname $version 2>&1) ; then + if ! ctdb_check_rpc_out=$(rpcinfo -T $_family $_localhost $progname $version 2>&1) ; then ctdb_check_rpc_out="ERROR: $progname failed RPC check: $ctdb_check_rpc_out" echo "$ctdb_check_rpc_out" diff --git a/ib/ibwrapper.c b/ib/ibwrapper.c index 3daab3e..51d39da 100644 --- a/ib/ibwrapper.c +++ b/ib/ibwrapper.c @@ -134,16 +134,16 @@ static int ibw_ctx_priv_destruct(struct ibw_ctx_priv *pctx) { DEBUG(DEBUG_DEBUG, ("ibw_ctx_priv_destruct(%p)\n", pctx)); + /* + * tevent_fd must be removed before the fd is closed + */ + TALLOC_FREE(pctx->cm_channel_event); + /* destroy cm */ if (pctx->cm_channel) { rdma_destroy_event_channel(pctx->cm_channel); pctx->cm_channel = NULL; } - if (pctx->cm_channel_event) { - /* TODO: do we have to do this here? */ - talloc_free(pctx->cm_channel_event); - pctx->cm_channel_event = NULL; - } if (pctx->cm_id) { rdma_destroy_id(pctx->cm_id); pctx->cm_id = NULL; @@ -166,6 +166,11 @@ static int ibw_conn_priv_destruct(struct ibw_conn_priv *pconn) /* pconn->wr_index is freed by talloc */ /* pconn->wr_index[i] are freed by talloc */ + /* + * tevent_fd must be removed before the fd is closed + */ + TALLOC_FREE(pconn->verbs_channel_event); + /* destroy verbs */ if (pconn->cm_id!=NULL && pconn->cm_id->qp!=NULL) { rdma_destroy_qp(pconn->cm_id); @@ -182,12 +187,6 @@ static int ibw_conn_priv_destruct(struct ibw_conn_priv *pconn) pconn->verbs_channel = NULL; } - /* must be freed here because its order is important */ - if (pconn->verbs_channel_event) { - talloc_free(pconn->verbs_channel_event); - pconn->verbs_channel_event = NULL; - } - /* free memory regions */ ibw_free_mr(&pconn->buf_send, &pconn->mr_send); ibw_free_mr(&pconn->buf_recv, &pconn->mr_recv); diff --git a/server/ctdb_lock.c b/server/ctdb_lock.c index fd1b4db..4440178 100644 --- a/server/ctdb_lock.c +++ b/server/ctdb_lock.c @@ -41,6 +41,10 @@ * ctdb_lock_alldb() - get a lock on all DBs * * auto_mark - whether to mark/unmark DBs in before/after callback + * = false is used for freezing databases for + * recovery since the recovery cannot start till + * databases are locked on all the nodes. + * = true is used for record locks. */ enum lock_type { @@ -312,7 +316,13 @@ static int ctdb_lock_context_destructor(struct lock_context *lock_ctx) */ static int ctdb_lock_request_destructor(struct lock_request *lock_request) { + if (lock_request->lctx == NULL) { + return 0; + } + + lock_request->lctx->request = NULL; TALLOC_FREE(lock_request->lctx); + return 0; } @@ -324,8 +334,9 @@ static int ctdb_lock_request_destructor(struct lock_request *lock_request) static void process_callbacks(struct lock_context *lock_ctx, bool locked) { struct lock_request *request; + bool auto_mark = lock_ctx->auto_mark; - if (lock_ctx->auto_mark && locked) { + if (auto_mark && locked) { switch (lock_ctx->type) { case LOCK_RECORD: tdb_chainlock_mark(lock_ctx->ctdb_db->ltdb->tdb, lock_ctx->key); @@ -346,13 +357,23 @@ static void process_callbacks(struct lock_context *lock_ctx, bool locked) } request = lock_ctx->request; - if (lock_ctx->auto_mark) { - /* Reset the destructor, so request is not removed from the list */ - talloc_set_destructor(request, NULL); + if (auto_mark) { + /* Since request may be freed in the callback, unset the lock + * context, so request destructor will not free lock context. + */ + request->lctx = NULL; } + + /* Since request may be freed in the callback, unset the request */ + lock_ctx->request = NULL; + request->callback(request->private_data, locked); - if (lock_ctx->auto_mark && locked) { + if (!auto_mark) { + return; + } + + if (locked) { switch (lock_ctx->type) { case LOCK_RECORD: tdb_chainlock_unmark(lock_ctx->ctdb_db->ltdb->tdb, lock_ctx->key); @@ -371,6 +392,8 @@ static void process_callbacks(struct lock_context *lock_ctx, bool locked) break; } } + + talloc_free(lock_ctx); } @@ -416,7 +439,6 @@ static void ctdb_lock_handler(struct tevent_context *ev, void *private_data) { struct lock_context *lock_ctx; - TALLOC_CTX *tmp_ctx = NULL; char c; bool locked; double t; @@ -430,11 +452,6 @@ static void ctdb_lock_handler(struct tevent_context *ev, t = timeval_elapsed(&lock_ctx->start_time); id = lock_bucket_id(t); - if (lock_ctx->auto_mark) { - tmp_ctx = talloc_new(ev); - talloc_steal(tmp_ctx, lock_ctx); - } - /* Read the status from the child process */ if (sys_read(lock_ctx->fd[0], &c, 1) != 1) { locked = false; @@ -466,10 +483,6 @@ static void ctdb_lock_handler(struct tevent_context *ev, } process_callbacks(lock_ctx, locked); - - if (lock_ctx->auto_mark) { - talloc_free(tmp_ctx); - } } @@ -817,8 +830,6 @@ static void ctdb_lock_schedule(struct ctdb_context *ctdb) /* Parent process */ close(lock_ctx->fd[1]); - talloc_set_destructor(lock_ctx, ctdb_lock_context_destructor); - talloc_free(tmp_ctx); /* Set up timeout handler */ @@ -830,7 +841,6 @@ static void ctdb_lock_schedule(struct ctdb_context *ctdb) if (lock_ctx->ttimer == NULL) { ctdb_kill(ctdb, lock_ctx->child, SIGKILL); lock_ctx->child = -1; - talloc_set_destructor(lock_ctx, NULL); close(lock_ctx->fd[0]); return; } @@ -846,7 +856,6 @@ static void ctdb_lock_schedule(struct ctdb_context *ctdb) TALLOC_FREE(lock_ctx->ttimer); ctdb_kill(ctdb, lock_ctx->child, SIGKILL); lock_ctx->child = -1; - talloc_set_destructor(lock_ctx, NULL); close(lock_ctx->fd[0]); return; } @@ -911,6 +920,7 @@ static struct lock_request *ctdb_lock_internal(TALLOC_CTX *mem_ctx, if (lock_ctx->key.dptr == NULL) { DEBUG(DEBUG_ERR, (__location__ "Memory allocation error\n")); talloc_free(lock_ctx); + talloc_free(request); return NULL; } lock_ctx->key_hash = ctdb_hash(&key); @@ -944,6 +954,7 @@ static struct lock_request *ctdb_lock_internal(TALLOC_CTX *mem_ctx, request->private_data = private_data; talloc_set_destructor(request, ctdb_lock_request_destructor); + talloc_set_destructor(lock_ctx, ctdb_lock_context_destructor); ctdb_lock_schedule(ctdb); diff --git a/server/ctdb_ltdb_server.c b/server/ctdb_ltdb_server.c index 6a11851..3b71918 100644 --- a/server/ctdb_ltdb_server.c +++ b/server/ctdb_ltdb_server.c @@ -115,11 +115,6 @@ static int ctdb_ltdb_store_server(struct ctdb_db_context *ctdb_db, * fails. So storing the empty record makes sure that we do not * need to change the client code. */ - if ((header->flags & CTDB_REC_FLAG_VACUUM_MIGRATED) && - (ctdb_db->ctdb->pnn == header->dmaster)) { - keep = true; - schedule_for_deletion = true; - } if (!(header->flags & CTDB_REC_FLAG_VACUUM_MIGRATED)) { keep = true; } else if (ctdb_db->ctdb->pnn != header->dmaster) { -- CTDB repository