Re: [Openais] detecting cpg joiners

2009-04-09 Thread Chrissie Caulfield
Joel Becker wrote: Steve, Dave, etc, Someone told me a while back that a node joining a cpg group would be by its lonesome in the join message. That is, when the node gets its first confchg, it will be the only node in the list of joins. I've been using this to detect the first joiner

Re: [Openais] howto distribute data accross all nodes?

2009-04-09 Thread Dietmar Maurer
What I recommend here is to place your local node id in the message contents (retrieved via cpg_local_get) and then compare that nodeid to incoming messages Why do you include the local node id into the message? I can compare the local node id with the sending node id without that, for

Re: [Openais] howto distribute data accross all nodes?

2009-04-09 Thread Steven Dake
On Thu, 2009-04-09 at 10:19 +0200, Dietmar Maurer wrote: What I recommend here is to place your local node id in the message contents (retrieved via cpg_local_get) and then compare that nodeid to incoming messages Why do you include the local node id into the message? I can compare the

[Openais] [PATCH 4/9] Propagate the above into vsf_quorum.c.

2009-04-09 Thread Jim Meyering
From: Jim Meyering meyer...@redhat.com * exec/vsf_quorum.c: add const to msg param. --- exec/vsf_quorum.c | 19 --- 1 files changed, 12 insertions(+), 7 deletions(-) diff --git a/exec/vsf_quorum.c b/exec/vsf_quorum.c index dc05458..45f537e 100644 --- a/exec/vsf_quorum.c +++

[Openais] [PATCH 2/9] coroapi.h: change lib_handler_fn's *msg to be const

2009-04-09 Thread Jim Meyering
From: Jim Meyering meyer...@redhat.com Make a tiny type change and watch it propagate. * include/corosync/engine/coroapi.h (struct corosync_lib_handler) [lib_handler_fn]: Change type of 2nd parameter: s/void *msg/const void *msg/. --- include/corosync/engine/coroapi.h |2 +- 1 files changed,

[Openais] [PATCH 6/9] * services/pload.c: Likewise

2009-04-09 Thread Jim Meyering
From: Jim Meyering meyer...@redhat.com --- services/pload.c |6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/services/pload.c b/services/pload.c index 2dbe974..424abe6 100644 --- a/services/pload.c +++ b/services/pload.c @@ -91,7 +91,7 @@ static void

[Openais] [PATCH 1/9] testevsth.c: const+size_t: evs_deliver_fn, evs_confchg_fn

2009-04-09 Thread Jim Meyering
From: Jim Meyering meyer...@redhat.com --- test/testevsth.c |8 1 files changed, 4 insertions(+), 4 deletions(-) diff --git a/test/testevsth.c b/test/testevsth.c index 1d74f7d..9a1635d 100644 --- a/test/testevsth.c +++ b/test/testevsth.c @@ -47,7 +47,7 @@ char *delivery_string;

[Openais] [PATCH 7/9] coroipcs.h: update signature of coroipcs_handler_fn_lvalue to match

2009-04-09 Thread Jim Meyering
From: Jim Meyering meyer...@redhat.com * exec/coroipcs.h: signature of coroipcs_handler_fn_lvalue must match that of lib_handler_fn; noted via main.c. --- exec/coroipcs.h |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/exec/coroipcs.h b/exec/coroipcs.h index

[Openais] [PATCH 8/9] * services/cpg.c: Likewise.

2009-04-09 Thread Jim Meyering
From: Jim Meyering meyer...@redhat.com --- services/cpg.c | 56 +--- 1 files changed, 33 insertions(+), 23 deletions(-) diff --git a/services/cpg.c b/services/cpg.c index 7bae0fd..f60427a 100644 --- a/services/cpg.c +++ b/services/cpg.c @@

[Openais] [PATCH 3/9] Propagate the above into cfg.c and votequorum.c.

2009-04-09 Thread Jim Meyering
From: Jim Meyering meyer...@redhat.com * services/cfg.c (message_handler_req_lib_cfg_get_node_addrs): Constification exposed a bug in this function whereby it mistakenly modified storage through its now-const *msg parameter. Since it did that solely to store a temporary result, we've changed it

Re: [Openais] detecting cpg joiners

2009-04-09 Thread Andrew Beekhof
On Thu, Apr 9, 2009 at 09:37, Chrissie Caulfield ccaul...@redhat.com wrote: Joel Becker wrote: Steve, Dave, etc, Someone told me a while back that a node joining a cpg group would be by its lonesome in the join message. That is, when the node gets its first confchg, it will be the only

Re: [Openais] detecting cpg joiners

2009-04-09 Thread Robert Wipfel
On 4/9/2009 at 5:50 AM, in message 26ef5e70904090450s40e92dcfgea0fc34826360...@mail.gmail.com, Andrew Beekhof beek...@gmail.com wrote: On Thu, Apr 9, 2009 at 09:37, Chrissie Caulfield ccaul...@redhat.com wrote: Joel Becker wrote: Steve, Dave, etc, Someone told me a while back that a

[Openais] [PATCH 5/9] propagate to evc.c

2009-04-09 Thread Jim Meyering
From: Jim Meyering meyer...@redhat.com * services/evs.c: add const to msg param --- services/evs.c | 46 +++--- 1 files changed, 23 insertions(+), 23 deletions(-) diff --git a/services/evs.c b/services/evs.c index 389af98..24fff4d 100644 ---

Re: [Openais] another big batch of API changes

2009-04-09 Thread Jim Meyering
Jim Meyering wrote: Here is a tiny API change, along with the many changes it induces. 0001 is just something I saw along the way. 0002 is the tiny change that induces the adjustments in all of the following. Here's that tiny diff: diff --git a/include/corosync/engine/coroapi.h ... ...

[Openais] [PATCH]: openais/trunk: Fix handle leak in saMsg service

2009-04-09 Thread Ryan O'Hara
This is exactly the same fix as I found in the checkpoint service and Steve fixed. When dispatch_avail == -1, we must call saHandleInstancePut. Index: lib/msg.c === --- lib/msg.c (revision 1786) +++ lib/msg.c (working copy) @@

Re: [Openais] howto distribute data accross all nodes?

2009-04-09 Thread Dietmar Maurer
need for locks. An example of why not is creation of a resource called datasetA. 3 nodes: node A sends create datasetA node B sends create datasetA node C sends create datasetA Only one of those nodes create dataset will arrive first. The remainder will arrive
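The "first create to arrive wins" idea in the snippet above can be sketched as follows. This is only an illustration under the assumption that every node sees the create messages in the same order; `struct registry` and `registry_create` are hypothetical names, not part of cpg or openais.

```c
#include <stddef.h>
#include <string.h>

#define MAX_DATASETS 16
#define NAME_LEN 32

/* Hypothetical per-node table of created resources. */
struct registry {
    char names[MAX_DATASETS][NAME_LEN];
    unsigned int creator[MAX_DATASETS];  /* nodeid whose create arrived first */
    size_t count;
};

/*
 * Apply one delivered "create" message.
 * Returns 1 if this message created the resource, 0 if an earlier
 * message already did, and -1 if the name is too long or the table is full.
 */
int registry_create(struct registry *reg, const char *name,
                    unsigned int sender_nodeid)
{
    size_t i;

    if (strlen(name) >= NAME_LEN)
        return -1;
    for (i = 0; i < reg->count; i++) {
        if (strcmp(reg->names[i], name) == 0)
            return 0;   /* a create that arrived earlier already won */
    }
    if (reg->count == MAX_DATASETS)
        return -1;
    strcpy(reg->names[reg->count], name);
    reg->creator[reg->count] = sender_nodeid;
    reg->count++;
    return 1;
}
```

If each node calls `registry_create` from its delivery callback, and messages are delivered in the same order everywhere, all nodes should end up agreeing on which node's create succeeded without taking a lock.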

[Openais] [PATCH]: openais/trunk: Fix saAmf dispatch code

2009-04-09 Thread Ryan O'Hara
This patch fixed what I believe to be a few problems with the saAmf service's dispatch code. The few things I changed in this patch seem to be fallout from the ipc changes that went in a while back. First, we only need to hold a lock on the dispatch_mutex while calling coroipcc_dispatch_recv.

Re: [Openais] detecting cpg joiners

2009-04-09 Thread David Teigland
On Thu, Apr 09, 2009 at 01:50:18PM +0200, Andrew Beekhof wrote: For added fun, a node that restarts quickly enough (think a VM) won't even appear to have left (or rejoined) the cluster. At the next totem confchg event, it will simply just be there again with no indication that anything

Re: [Openais] detecting cpg joiners

2009-04-09 Thread Joel Becker
On Thu, Apr 09, 2009 at 08:37:00AM +0100, Chrissie Caulfield wrote: 1) If member_count == join_count, then it's a safe bet that they are all new nodes, and yes, it is true that all nodes should see the same confchg messages 2) if join_count > 0 then leave_count will always be zero. That's a
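The two rules quoted above could be turned into a small classifier along these lines. This is a sketch, not code from cpg: `classify_confchg` and the enum are hypothetical, and the three counts are assumed to be the list sizes a confchg callback receives. Rule 2 is encoded as an assertion only because the thread states it; treat it as a claim, not a guarantee.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of a single confchg event. */
enum confchg_kind {
    CONFCHG_ALL_NEW,  /* every current member is in the join list */
    CONFCHG_JOIN,     /* some members joined an existing group */
    CONFCHG_LEAVE,    /* some members left */
    CONFCHG_NONE      /* neither joins nor leaves were reported */
};

enum confchg_kind classify_confchg(size_t member_count,
                                   size_t left_count,
                                   size_t joined_count)
{
    if (joined_count > 0) {
        /* Rule 2 as claimed in the thread: joins and leaves never mix. */
        assert(left_count == 0);
        /* Rule 1: if everyone present just joined, the group is new. */
        return member_count == joined_count ? CONFCHG_ALL_NEW
                                            : CONFCHG_JOIN;
    }
    return left_count > 0 ? CONFCHG_LEAVE : CONFCHG_NONE;
}
```

A first joiner would then be detected as `CONFCHG_ALL_NEW` with a member count of one, subject to the caveats about fast restarts raised elsewhere in the thread.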

Re: [Openais] detecting cpg joiners

2009-04-09 Thread Joel Becker
On Thu, Apr 09, 2009 at 01:50:18PM +0200, Andrew Beekhof wrote: For added fun, a node that restarts quickly enough (think a VM) won't even appear to have left (or rejoined) the cluster. At the next totem confchg event, it will simply just be there again with no indication that anything

Re: [Openais] howto distribute data accross all nodes?

2009-04-09 Thread Dietmar Maurer
1. Have an old cpg member (e.g. the one with the lowest nodeid) send messages containing the state to the new node after it's joined. These sync messages are separate from the messages used to read/write the replicated state during normal operation. This is not bullet proof. State can

Re: [Openais] howto distribute data accross all nodes?

2009-04-09 Thread David Teigland
On Thu, Apr 09, 2009 at 09:00:08PM +0200, Dietmar Maurer wrote: If new, normal read/write messages to the replicated state continue while the new node is syncing the pre-existing state, the new node needs to save those operations to apply after it's synced. Ah, that probably works. But
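The "save operations and apply them after sync" approach described above might look roughly like this. It is a sketch under a strong assumption: the snapshot reflects the state exactly at the point where the new node's join was delivered, so every operation the new node has buffered is missing from the snapshot. The replicated state is reduced to a single counter, and all names (`struct replica`, `replica_deliver`, `replica_install_snapshot`) are hypothetical.

```c
#include <stddef.h>

#define MAX_PENDING 64

struct replica {
    int synced;                 /* 0 until the state snapshot has arrived */
    long value;                 /* replicated state: one counter here */
    long pending[MAX_PENDING];  /* deltas delivered while still syncing */
    size_t npending;
};

/*
 * Normal read/write message: apply it now if synced, otherwise save it.
 * Returns 0 on success, -1 if the pending buffer is full.
 */
int replica_deliver(struct replica *r, long delta)
{
    if (r->synced) {
        r->value += delta;
        return 0;
    }
    if (r->npending == MAX_PENDING)
        return -1;
    r->pending[r->npending++] = delta;
    return 0;
}

/* Sync message from an existing member: install, then replay in order. */
void replica_install_snapshot(struct replica *r, long snapshot)
{
    size_t i;

    r->value = snapshot;
    for (i = 0; i < r->npending; i++)
        r->value += r->pending[i];
    r->npending = 0;
    r->synced = 1;
}
```

The fixed-size buffer makes the memory cost of syncing under load explicit: once it fills, the application has to either block normal activity or restart the sync.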

Re: [Openais] [PATCH]: openais/trunk: Fix saAmf dispatch code

2009-04-09 Thread Steven Dake
good for merge On Thu, 2009-04-09 at 11:23 -0500, Ryan O'Hara wrote: This patch fixed what I believe to be a few problems with the saAmf service's dispatch code. The few things I changed in this patch seem to be fallout from the ipc changes that went in a while back. First, we only need to

Re: [Openais] howto distribute data accross all nodes?

2009-04-09 Thread Dietmar Maurer
Ah, that probably works. But can lead to very high memory usage if traffic is high. If that's a problem you could block normal activity during the sync period. Wow, that 'virtual synchrony' sounds nice at first, but gets incredibly complex soon ;-) Is somebody really using that? If so,

Re: [Openais] detecting cpg joiners

2009-04-09 Thread Andrew Beekhof
On Thu, Apr 9, 2009 at 20:49, Joel Becker joel.bec...@oracle.com wrote: On Thu, Apr 09, 2009 at 01:50:18PM +0200, Andrew Beekhof wrote: For added fun, a node that restarts quickly enough (think a VM) won't even appear to have left (or rejoined) the cluster. At the next totem confchg event, it

Re: [Openais] detecting cpg joiners

2009-04-09 Thread Andrew Beekhof
On Thu, Apr 9, 2009 at 19:15, David Teigland teigl...@redhat.com wrote: On Thu, Apr 09, 2009 at 01:50:18PM +0200, Andrew Beekhof wrote: For added fun, a node that restarts quickly enough (think a VM) won't even appear to have left (or rejoined) the cluster. At the next totem confchg event, it

[Openais] [PATCH]: openais/trunk: Fix saEvt dispatch code

2009-04-09 Thread Ryan O'Hara
This patch fixes a few issues with the event service dispatch code. It is similar to the patch I posted earlier for the saAmf service. Fixes include: * Change checking of dispatchFlags to be same as other services. * Wrap coroipcc_dispatch_recv with pthread_mutex_lock/unlock. As a result,

Re: [Openais] detecting cpg joiners

2009-04-09 Thread David Teigland
On Thu, Apr 09, 2009 at 10:12:43PM +0200, Andrew Beekhof wrote: On Thu, Apr 9, 2009 at 20:49, Joel Becker joel.bec...@oracle.com wrote: On Thu, Apr 09, 2009 at 01:50:18PM +0200, Andrew Beekhof wrote: For added fun, a node that restarts quickly enough (think a VM) won't even appear to have

Re: [Openais] API change vestiges

2009-04-09 Thread Jim Meyering
Jim Meyering wrote: Here are a few more. From 052f43f2c3ec1a1a7d6a2e9038ee1fb0e7d222e9 Mon Sep 17 00:00:00 2001 From: Jim Meyering meyer...@redhat.com Date: Thu, 2 Apr 2009 22:18:51 +0200 Subject: [PATCH 1/2] cpg.h, objdb.h, coroaph.h: more const/size_t * include/corosync/cpg.h

Re: [Openais] [PATCH]: openais/trunk: Fix saEvt dispatch code

2009-04-09 Thread Ryan O'Hara
Also, this patch changes how/when timeout is set to 0. Existing code would set timeout=0 when dispatchFlags was set to SA_DISPATCH_ALL -or- SA_DISPATCH_ONE. This is not correct. We should only set timeout=0 when dispatchFlags == SA_DISPATCH_ALL. This is worth noting, and I failed to mention it
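The timeout rule described above can be sketched as below. This is not the patch itself: the enum is a hypothetical stand-in for the SA Forum dispatch flags, and the sketch assumes the poll() convention that a timeout of -1 waits indefinitely while 0 returns immediately.

```c
/* Hypothetical stand-ins for SA_DISPATCH_ONE / _ALL / _BLOCKING. */
typedef enum {
    DISPATCH_ONE = 1,
    DISPATCH_ALL = 2,
    DISPATCH_BLOCKING = 3
} dispatch_flags_t;

/*
 * Pick the receive timeout for a dispatch call.
 * Only DISPATCH_ALL polls without waiting (timeout 0), so that it can
 * drain whatever is queued and then return. Every other mode waits
 * (-1, assumed here to mean "block until a message is available").
 */
int dispatch_timeout(dispatch_flags_t flags)
{
    return flags == DISPATCH_ALL ? 0 : -1;
}
```

With the behaviour the message calls incorrect, `DISPATCH_ONE` would also have mapped to 0, so a caller asking for one message could return without having dispatched anything.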

[Openais] [PATCH]: openais/trunk: Fix saTmr dispatch code

2009-04-09 Thread Ryan O'Hara
Same as other patches for service dispatch code. Hold lock on dispatch_mutex only during coroipcc_dispatch_recv. Handle finalize/dispatch_avail correctly. Ryan Index: lib/tmr.c === --- lib/tmr.c (revision 1787) +++ lib/tmr.c

[Openais] [PATCH]: openais/trunk: Fix saLck dispatch code

2009-04-09 Thread Ryan O'Hara
Same as other patches for service dispatch code. Hold lock on dispatch_mutex only during coroipcc_dispatch_recv. Handle finalize/dispatch_avail correctly. Ryan Index: lib/lck.c === --- lib/lck.c (revision 1787) +++ lib/lck.c

Re: [Openais] detecting cpg joiners

2009-04-09 Thread Joel Becker
On Thu, Apr 09, 2009 at 03:17:47PM -0700, Steven Dake wrote: A proper system using this model doesn't care - it synchronizes every time regardless of who left or joined based upon whether it has state to sync that is unique. Dave, If we're going to use cpg for our membership, we need

Re: [Openais] detecting cpg joiners

2009-04-09 Thread Joel Becker
On Thu, Apr 09, 2009 at 10:12:43PM +0200, Andrew Beekhof wrote: On Thu, Apr 9, 2009 at 20:49, Joel Becker joel.bec...@oracle.com wrote: On Thu, Apr 09, 2009 at 01:50:18PM +0200, Andrew Beekhof wrote: For added fun, a node that restarts quickly enough (think a VM) won't even appear to have

Re: [Openais] detecting cpg joiners

2009-04-09 Thread Steven Dake
On Thu, 2009-04-09 at 17:17 -0700, Joel Becker wrote: On Thu, Apr 09, 2009 at 04:09:18PM -0500, David Teigland wrote: On Thu, Apr 09, 2009 at 03:50:08PM -0500, David Teigland wrote: On Thu, Apr 09, 2009 at 10:12:43PM +0200, Andrew Beekhof wrote: On Thu, Apr 9, 2009 at 20:49, Joel Becker

[Openais] [corosync trunk] use spin locks in hdb api

2009-04-09 Thread Steven Dake
The hdb api is a perfect candidate for spinlocks to protect the critical sections in the database. This patch adds them (if available on the os) to the hdb API. It also adds functions to declare different kinds of hdb databases. performance increase = 25% msgs/sec for evsbench with small

[Openais] [corosync trunk] remove warnings in wthread.c

2009-04-09 Thread Steven Dake
see subject Index: exec/wthread.c === --- exec/wthread.c (revision 2050) +++ exec/wthread.c (working copy) @@ -65,8 +65,7 @@ struct thread_data thread_data; }; -void *worker_thread (void *thread_data_in)

[Openais] remove warning from keygen and report error condition properly

2009-04-09 Thread Steven Dake
Regards -steve Index: tools/corosync-keygen.c === --- tools/corosync-keygen.c (revision 2050) +++ tools/corosync-keygen.c (working copy) @@ -83,9 +83,13 @@ exit (1); } /* - * Set security of authorization key to uid = 0 uid =

[Openais] [corosync trunk] cast coroipcs iovec entry from const to non const

2009-04-09 Thread Steven Dake
see subject Index: exec/coroipcs.c === --- exec/coroipcs.c (revision 2050) +++ exec/coroipcs.c (working copy) @@ -869,7 +869,7 @@ { struct iovec iov; - iov.iov_base = msg; + iov.iov_base = (void *)msg; iov.iov_len = mlen;

[Openais] [PATCH]: openais/trunk: Create timer_hdb with DECLARE_HDB_DATABASE

2009-04-09 Thread Ryan O'Hara
This patch uses the new DECLARE_HDB_DATABASE macro to create timer_hdb. Replaces old method of declaring and initializing the handle database. Index: services/tmr.c === --- services/tmr.c (revision 1789) +++ services/tmr.c

Re: [Openais] detecting cpg joiners

2009-04-09 Thread Dietmar Maurer
guarantees you seek, and if it doesn't, it is defective. The only exception might be if the new process reuses the same PID since the pid/nodeid/group are the uniqifiers and if pid is the same, there is no way to detect the new process (and remove the old one). PID reuse happens more often

[Openais] [PATCH]: openais/trunk: Fix confchg_fn for message service

2009-04-09 Thread Ryan O'Hara
This patch fixes the issues that Steve mentioned yesterday regarding the message service confchg_fn. Most importantly, it renames the global member_list such that there is no collision with the member_list passed as an argument to confchg_fn. Also, the entire confchg_fn code was reworked to

Re: [Openais] detecting cpg joiners

2009-04-09 Thread Joel Becker
On Thu, Apr 09, 2009 at 06:06:13PM -0700, Steven Dake wrote: I'd like to clear up that when Andrew talks about the membership not generating a leave event for totem processes in this scenario (which he integrates directly with), this is true. But cpg should generate a leave event.