Re: [Openais] totem token timeout increase
On 08/26/2014 07:17 AM, Vasil Valchev wrote:

Hello all,

I have a RHEL 5 (openais) cluster with intermittent issues on the heartbeat network, and I was thinking of increasing the totem token value to 90s (it is currently 30s). Are there any negative effects from this change, apart from the cluster taking longer to detect that a node has failed? Can it cause data corruption, for example?

BR,
Vasil Valchev

Vasil,

I doubt going from 30s to 90s would make a difference with the health checking performed. You may be better off increasing token_retransmits_before_loss_const. Also make sure you're running the latest z-stream.

Regards,
-steve

___
Openais mailing list
Openais@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/openais
Re: [Openais] totem token timeout increase
On 08/27/2014 02:20 PM, Vasil Valchev wrote:

Hello Steve,

If I just increase token_retransmits_before_loss_const without increasing token, won't it just send more tokens during the same time? For example, 8 in 30s instead of 4?

The last few times, the network interruption wasn't longer than a minute. Last time the cluster was even going to re-form, but fencing had already been initiated by fenced. I want to allow a bit more time in which the nodes can resume communication, and I thought increasing the token timeout should do it. Do you mean I should also increase token_retransmits_before_loss_const proportionally?

Vasil,

I am not sure why you would have a network disruption for 4 lost tokens, but transmitting 8 gives a better chance that they will get through. UDP (the transport used) can lose those retransmitted tokens. Increasing the token timer will allow more time for whatever you're doing on the network that takes it out of service to be repaired.

Regards,
-steve

BR,
Vasil
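The trade-off between the two settings can be checked with a little arithmetic. This is a minimal sketch in Python, assuming the retransmit interval is derived as token / (token_retransmits_before_loss_const + 0.2). That formula is an assumption based on the defaults corosync documents (token 1000 ms with 4 retransmits gives token_retransmit 238 ms); the openais 0.80 shipped with RHEL 5 may behave differently. The function name is hypothetical.

```python
def retransmit_interval_ms(token_ms: int, retransmits_before_loss: int) -> int:
    """Approximate gap between token retransmits, in milliseconds.

    Assumption: interval = token / (retransmits + 0.2), truncated to an
    integer. Verify against the version actually installed.
    """
    return int(token_ms / (retransmits_before_loss + 0.2))


if __name__ == "__main__":
    # Compare the current setting with the two changes discussed in the thread.
    for token_ms, retransmits in [(30000, 4), (30000, 8), (90000, 4)]:
        gap = retransmit_interval_ms(token_ms, retransmits)
        print(f"token={token_ms} ms, retransmits={retransmits}: "
              f"one retransmit roughly every {gap} ms")
```

Under this assumption, raising only the retransmit count shortens the gap between retransmits inside the same 30s window, while raising token lengthens the window itself.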
Re: [Openais] Request of information about rrp mode passive versus rrp mode active
On 11/27/2013 08:20 AM, Moullé Alain wrote:

Hi,

The man page of corosync.conf says:

"Active replication offers slightly lower latency from transmit to delivery in faulty network environments but with less performance. Passive replication may nearly double the speed of the totem protocol if the protocol doesn’t become cpu bound."

OK, but knowing that, could someone give the pros and cons of passive mode and of active mode, and/or explain how to choose the better mode for an HA cluster?

If you care about latency, use active; if you care about throughput, use passive. From a functional perspective they both deliver the same functionality. I have seen on this list that something with active may be broken at the moment, though, so it is best to check with Jan Friesse.

Regards
-steve

Thanks a lot
Alain
Re: [Openais] question stack Pacemaker/corosync on SLES11
On 03/08/2012 03:10 AM, Tim Serong wrote:

On 03/08/2012 07:23 PM, alain.mou...@bull.net wrote:

Hi Darren,

And thanks. I did find that the stack is started with the openais service: there are no more 'corosync' or 'pacemaker' LSB scripts. But I am surprised, because I seem to recall that one or two years ago it was said that corosync was a sort of 'extract' of openais, made to isolate the code needed for the Pacemaker/corosync stack to work, and now it seems that everything is managed by openais again. So I don't completely understand the evolution of the architecture, but perhaps I am wrong? Could you clarify the history for me?

openais 0.80.x (before the creation of corosync) shipped for SLES 11. You started it by running /etc/init.d/openais start.

corosync 1.2.x + openais 1.1.x shipped for SLES 11 SP1. corosync is the core messaging layer, and openais just includes some extra magic for OCFS2 etc. But (on SLES at least) the /etc/init.d/openais init script remained, even though that init script now starts corosync.

The same is true now for SLES 11 SP2 (albeit with corosync 1.4.1); corosync is what's running, but you use the openais init script to start it.

So your history of project splits and whatnot is correct, you're just being misled by the name of an init script :)

Regards,
Tim

From: Darren Thompson darr...@akurit.com.au
To: alain.mou...@bull.net
Date: 07/03/2012 21:03
Subject: Re: [Openais] question stack Pacemaker/corosync on SLES11

Alain,

With SLES you also need to install the OpenAIS stack, as that is where the init.d service comes from.

Darren

On Mar 8, 2012 2:14 AM, alain.mou...@bull.net wrote:

Hi,

The corosync-1.4.1-4 rpm on RHEL installs /etc/rc.d/init.d/corosync, but the corosync-1.4.1-0.11.29 rpm on SLES 11 does not install any init.d service, even in /etc/init.d; I checked the rpm and there is no /etc/rc.d/init.d/corosync. The same goes for pacemaker: the pacemaker-1.1.6-1.25.1 rpm on SLES does not install the pacemaker LSB script that the pacemaker-1.1.6-3 rpm on RHEL does.

Could someone tell me how to start the Pacemaker/corosync stack with pacemaker-1.1.6-1.25.1 and corosync-1.4.1-0.11.29 on SLES 11?

Another thing to keep in mind is that a particular company's product trails upstream by some time interval (which is good: upstream has bugs, product should not :)

Regards
-steve

Many thanks
Alain
[Openais] change to commit policy
Russell pointed out a problem with his recent patch for mutexes: it is only applicable to the 1.4/1.3 branches, not to master.

Currently our policy is that all patches go into master, and one person is responsible for backports to other branches. This leaves out the important case above that Russell ran into. As a result, if a patch is not suitable for master because of our de-threading of the software, please commit it to flatiron-1.4 and then git cherry-pick it to the flatiron-1.3 branch.

Regards
-steve
Re: [Openais] Installing corosync from source
On 09/06/2011 06:05 PM, Nick Khamis wrote:

Hello Everyone,

We are moving everything over from heartbeat, after the last update brought the cluster to its knees... What we are interested in is using corosync and pacemaker to LVS mysql, and asterisk. We have not looked into asterisk yet, and we don't know if it's even possible (i.e. whether an OCF agent has already been created). Regardless, our attempt to install corosync from source using the directions found at http://www.clusterlabs.org/wiki/Install seemed to go OK; however, nothing was created. We had to manually copy:

cp /usr/etc/corosync/corosync.conf.example /etc/corosync/corosync.conf
cp /usr/etc/init.d/corosync /etc/init.d/corosync

We have a long way to go; your help is greatly appreciated.

simple startup conf

totem {
	version: 2
	secauth: off
	threads: 0
	interface {
		ringnumber: 0
		bindnetaddr: 192.168.1.1
		mcastaddr: 226.94.1.1
		mcastport: 5405
		ttl: 1
	}
}

logging {
	fileline: off
	to_stderr: no
	to_logfile: yes
	to_syslog: yes
	logfile: /var/log/cluster/corosync.log
	debug: off
	timestamp: on
	logger_subsys {
		subsys: AMF
		debug: off
	}
}

amf {
	mode: disabled
}

Is there a script that I am supposed to be running that will create everything? Do I need to install OpenAIS as well? We downloaded the latest versions of resource agents, cluster glue, corosync, and pacemaker. I know we can install everything from the gentoo source tree, but we are trying to avoid that...

There is no script to build everything. Compiling all that code from source is sure to be painful. Corosync is pretty straightforward (autogen - configure - make install), but getting everything else operational may be challenging. You shouldn't need the openais package.

Regards
-steve

Your help is greatly appreciated,
Nick.
Re: [Openais] Configuration Hash Table - API proposal
On 09/01/2011 05:17 AM, Jan Friesse wrote:

Included is an API proposal for a replacement of the objdb/confdb API. It should keep all the good things there (triggers, ...), remove the hard-to-use bits (like the whole object idea), and improve existing things (like typing).

As I wrote before, the configuration file will also need to change. The proposed change is

ht_key value
ht_key. {
	ht_subkey value2
}

We absolutely can't change the config file - it will cause massive confusion in the user base. Changing the internal representation in whatever way is necessary seems fine, though. If the parsing code has to be suboptimal, that is preferable to confusing the user base.

which is (internally) converted to

ht_key value
ht_key.ht_subkey value2

Also, value should become typed, so

value ~= ^-?[0-9]+$ = integer, 32 bits, with modifiers like l, ll, ...
value ~= ^-?[0-9]*.[0-9]*$ = float (or double) (should also handle all variants with E .. basically the C format)
value = [:alpha:]* = string
value = bin:base64 encoded binary data

Regards,
Honza

The API looks really solid. I don't totally like the error returns in the cht_get and set calls, but I understand the need for the programmer to be able to determine what went wrong with the API call. If we didn't have typing we wouldn't need error codes, but I am pretty certain we need typing in corosync (though perhaps not in the underlying libqb). A typical map doesn't need an error code because it doesn't care about errors (worst-case error: malloc fails, and the whole system is going to blow up anyway). The only other option is asserting in libs, which is evil, so we should rule that out.

On the topic of prefix, this is a great feature, but it doesn't fit in well with a hash table. Another option is to use direct integration with the skiplist in libqb to implement this.

Since what you are delivering is not really a hash table, but more like a map table, you may consider a rename to cs_map or similar.

Really great work.

Feed missing requirements for libqb into Angus's work on libqb when you start progressing with the implementation.

Regards
-steve
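The flattening and typing rules discussed in this thread can be illustrated with a short sketch. It is written in Python purely for illustration and is not the proposed C API: the names flatten and classify are hypothetical, the float pattern is deliberately stricter than the one quoted in the proposal, and the l/ll integer modifiers are not handled.

```python
import base64
import re

# Assumed patterns, modeled on (but not identical to) the ones in the proposal.
INT_RE = re.compile(r"-?[0-9]+")
FLOAT_RE = re.compile(r"-?(?:[0-9]+\.[0-9]*|\.[0-9]+)(?:[eE][-+]?[0-9]+)?")


def flatten(tree, prefix=""):
    """Turn nested sections into flat dotted keys (ht_key.ht_subkey)."""
    flat = {}
    for key, value in tree.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


def classify(raw):
    """Return (type_name, typed_value) for one raw configuration string."""
    if raw.startswith("bin:"):
        # Everything after the "bin:" marker is treated as base64 data.
        return "binary", base64.b64decode(raw[4:])
    if INT_RE.fullmatch(raw):
        return "int", int(raw)
    if FLOAT_RE.fullmatch(raw):
        return "float", float(raw)
    return "string", raw
```

For example, flatten({"totem": {"interface": {"mcastport": "5405"}}}) yields the single key totem.interface.mcastport, and classify("5405") reports an integer. This keeps the user-visible nested file format while the internal representation stays flat, which is the split Steve asks for.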
[Openais] [PATCH] Allow nss building conditionally with rpmbuild operation
Signed-off-by: Steven Dake sd...@redhat.com
---
 corosync.spec.in |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/corosync.spec.in b/corosync.spec.in
index 74ab851..5c651aa 100644
--- a/corosync.spec.in
+++ b/corosync.spec.in
@@ -11,6 +11,7 @@
 %bcond_with snmp
 %bcond_with dbus
 %bcond_with rdma
+%bcond_with nss

 Name: corosync
 Summary: The Corosync Cluster Engine and Application Programming Interfaces
@@ -36,7 +37,9 @@ Conflicts: openais <= 0.89, openais-devel <= 0.89
 %if %{buildtrunk}
 BuildRequires: autoconf automake
 %endif
+%if %{with nss}
 BuildRequires: nss-devel
+%endif
 %if %{with rdma}
 BuildRequires: libibverbs-devel librdmacm-devel
 %endif
@@ -83,6 +86,11 @@ export rdmacm_LIBS=-lrdmacm \
 %if %{with rdma}
 	--enable-rdma \
 %endif
+%if %{with nss}
+	--enable-nss \
+%else
+	--disable-nss \
+%endif
 	--with-initddir=%{_initrddir}

 make %{_smp_mflags}
-- 
1.7.6
Re: [Openais] cpg behavior on transitional membership change
On 09/02/2011 12:59 AM, Vladislav Bogdanov wrote:

Hi all,

I'm trying to further investigate the problem I described at https://www.redhat.com/archives/cluster-devel/2011-August/msg00133.html

The main problem for me there is that pacemaker first sees a transitional membership with nodes that left, then it sees a stable membership with those nodes returned, and does nothing about that. On the other hand, dlm_controld sees CPG_REASON_NODEDOWN events on the CPGs related to all its lockspaces (at the same time as the transitional membership change) and stops the kernel part of each lockspace until the whole cluster is rebooted (or until some other recovery procedure, which unfortunately does not happen

I believe fenced should reboot the node, but only if there is quorum. It is possible your cluster has lost quorum during this series of events. I have copied Dave for his feedback on this point.

:( ). It neither requests fencing of the node that left nor recovers when the node returns in the next stable membership.

Could anyone please help me understand what the correct CPG behavior on a membership change is? From what I see, CPG emits a CPG_REASON_NODEDOWN event on both transitional and stable memberships if there is a node which left the cluster. Am I correct here? And is that the right thing if I am?

Line numbers where this happens?

If yes, is there a way to detect the membership change type (transitional or stable) through the CPG API?

A transitional membership will always contain a subset of the previous regular membership. This means it will always contain 0 or more left members. A transitional membership means "the membership of nodes transitioning from the previous regular membership to the new regular membership". A regular configuration is where members are added to the configuration when detected. A transitional membership never has nodes added to it.

Hoping for an answer,

It would be nice if cpg and totem had a direct relationship in how their transitional and regular configurations were generated, but this doesn't happen currently. I am not sure if there is a good reason for this.

Regards
-steve

Best regards,
Vladislav
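The relationship between transitional and regular memberships described in this thread can be modeled with plain sets. This is only a sketch under stated assumptions: memberships are sets of node ids, the transitional membership is taken to be the part of the previous regular membership that is still reachable, and the function name is hypothetical. It does not reproduce how cpg or totem actually compute their configurations.

```python
def membership_change(previous_regular, reachable_now):
    """Model one membership change as (transitional, left, new_regular, joined).

    Assumption: memberships are plain sets of node ids, and the transitional
    membership is the part of the previous regular membership that is still
    reachable.
    """
    previous = set(previous_regular)
    new_regular = set(reachable_now)
    transitional = previous & new_regular   # a subset of previous, never gains nodes
    left = previous - transitional          # zero or more departed members
    joined = new_regular - transitional     # nodes appear only in the regular config
    return transitional, left, new_regular, joined
```

With a previous regular membership of {1, 2, 3} and nodes {1, 2, 4} reachable afterwards, the transitional membership is {1, 2}, node 3 is reported as left, and node 4 shows up only in the new regular membership.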
[Openais] [PATCH] Ignore memb_join messages during flush operations
A memb_join operation that occurs during flushing can result in an entry into the GATHER state from the RECOVERY state. This results in the regular sort queue being used instead of the recovery sort queue, resulting in a segfault.

Signed-off-by: Steven Dake sd...@redhat.com
---
 exec/totemudp.c |   13 +++++++++++++
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/exec/totemudp.c b/exec/totemudp.c
index 96849b7..0c12b56 100644
--- a/exec/totemudp.c
+++ b/exec/totemudp.c
@@ -90,6 +90,8 @@
 #define BIND_STATE_REGULAR	1
 #define BIND_STATE_LOOPBACK	2

+#define MESSAGE_TYPE_MCAST	1
+
 #define HMAC_HASH_SIZE 20
 struct security_header {
 	unsigned char hash_digest[HMAC_HASH_SIZE]; /* The hash *MUST* be first in the data structure */
@@ -1172,6 +1174,7 @@ static int net_deliver_fn (
 	int res = 0;
 	unsigned char *msg_offset;
 	unsigned int size_delv;
+	char *message_type;

 	if (instance->flushing == 1) {
 		iovec = &instance->totemudp_iov_recv_flush;
@@ -1234,6 +1237,16 @@ static int net_deliver_fn (
 	}

 	/*
+	 * Drop all non-mcast messages (more specifically join
+	 * messages should be dropped)
+	 */
+	message_type = (char *)msg_offset;
+	if (instance->flushing == 1 && *message_type != MESSAGE_TYPE_MCAST) {
+		iovec->iov_len = FRAME_SIZE_MAX;
+		return (0);
+	}
+
+	/*
 	 * Handle incoming message
 	 */
 	instance->totemudp_deliver_fn (
-- 
1.7.6
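The filtering rule the patch above adds can be restated as a small model. This is a sketch in Python rather than the C of the patch, and the function name is hypothetical; the only details taken from the patch are that the first byte of the message is read as its type and that MESSAGE_TYPE_MCAST is 1. The packet is assumed to be non-empty.

```python
MESSAGE_TYPE_MCAST = 1  # value taken from the patch above


def deliver_during_receive(flushing: bool, packet: bytes) -> bool:
    """Return True if the packet should be handed to the deliver callback."""
    message_type = packet[0]  # the patch reads the first byte as the type
    if flushing and message_type != MESSAGE_TYPE_MCAST:
        return False  # e.g. a memb_join arriving mid-flush is ignored
    return True
```

Outside a flush every message type is still delivered, so the rule only narrows what is accepted while the flush is in progress.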
Re: [Openais] Corosync 2.0 Feature Request: Notification of the status change for each rings via SNMP
On 08/31/2011 12:37 AM, Keisuke MORI wrote:

Hi,

We would like to be notified when a ring goes down or comes up, so that administrators know when they need to check and repair the network interfaces. The notification should be sent via SNMP traps to cooperate with various kinds of NMSs.

A proposed implementation was included in the patch posted in March 2010 by Sato Yuki-san:
https://lists.linux-foundation.org/pipermail/openais/2010-March/014036.html

The following MIB item in the patch is related to this.

+corosyncNoticeIfaceEntry ::= SEQUENCE {
+	corosyncNoticeIfaceIndex	INTEGER,
+	corosyncNoticeIface		OCTET STRING,
+	corosyncNoticeIfaceStatus	OCTET STRING
+}

The notification should be sent:
1) when one ring detects a failure (enters the FAULTY state)
2) when the failed ring recovers
3) when the second (last) ring also detects a failure and is no longer usable to communicate with others
4) when the second ring recovers

It is also preferable that the current status can be checked by a command-line tool or a user-customized service plug-in. The proposed patch above tried to store the status in the objdb to achieve this, but the implementation details do not matter.

We would be glad if you would consider it.

Yup, this seems reasonable. The way that data comes out now is through corosync-notifyd rather than a direct SNMP integration into the corosync process. As a result, the linked patch needs to be changed to match the new model.

Regards
-steve

Regards,
Keisuke MORI

2011/7/22 Steven Dake sd...@redhat.com:

The Corosync flatiron 1.y series had many more features added than I would have liked, but the development team feels the 1.y series addresses any major gaps users of the software have had. As a result, we are freezing any future feature development of the flatiron branch permanently. We will continue to maintain z-stream (1.4.z) bug fixes for many years to come in a robust and aggressive fashion.

Now that the flatiron chapter of Corosync is finished, we can move on to new R&D work around Corosync 2.0. There are a few RFEs floating around in bugzilla and the TODO list. This is your chance to provide feedback about feature development you would like to see in Corosync. The overall theme for Corosync 2.0 is focused on trimming the fat and simplifying the implementation without major performance regressions.

The developers will take feature submission suggestions until Aug 31, at which point we will prioritize features for 2.0 and close feature submission requests.

Regards
-steve
[Openais] Sorry for noise on mailing list
The mailing list server had a short outage. Apologies for noise on the mailing list. Regards -steve
Re: [Openais] aisexec crashes with SIGABRT
On 08/29/2011 06:26 PM, Christopher A. Kirke wrote:

Steve,

The core file didn't have debuginfo installed when it was analyzed. The package you want is something like openais-debuginfo. You may have to enable the debuginfo yum repo if you have not already.

Regards
-steve

our setup is a 2-node cluster:
* mstel21 (172.24.100.10) - attached gdb output from here
* mstel22 (172.24.100.12)

log file from 172.24.100.12:

Aug 29 08:04:49 172.24.100.12 openais[4656]: [TOTEM] The token was lost in the OPERATIONAL state.
Aug 29 08:04:50 172.24.100.12 openais[4656]: [TOTEM] Receive multicast socket recv buffer size (262142 bytes).
Aug 29 08:04:50 172.24.100.12 openais[4656]: [TOTEM] Transmit multicast socket send buffer size (262142 bytes).
Aug 29 08:04:50 172.24.100.12 openais[4656]: [TOTEM] entering GATHER state from 2.
Aug 29 08:04:50 172.24.100.12 openais[4656]: [TOTEM] entering GATHER state from 0.
Aug 29 08:04:50 172.24.100.12 openais[4656]: [TOTEM] Creating commit token because I am the rep.
Aug 29 08:04:50 172.24.100.12 openais[4656]: [TOTEM] Storing new sequence id for ring ff4
Aug 29 08:04:50 172.24.100.12 openais[4656]: [TOTEM] entering COMMIT state.
Aug 29 08:04:50 172.24.100.12 openais[4656]: [TOTEM] entering RECOVERY state.
Aug 29 08:04:50 172.24.100.12 openais[4656]: [TOTEM] position [0] member 172.24.100.12:
Aug 29 08:04:50 172.24.100.12 openais[4656]: [TOTEM] previous ring seq 4080 rep 172.24.100.10
Aug 29 08:04:50 172.24.100.12 openais[4656]: [TOTEM] aru 425c1 high delivered 425c1 received flag 1
Aug 29 08:04:50 172.24.100.12 openais[4656]: [TOTEM] Did not need to originate any messages in recovery.
Aug 29 08:04:50 172.24.100.12 openais[4656]: [TOTEM] Sending initial ORF token
Aug 29 08:04:50 172.24.100.12 openais[4656]: [CLM ] CLM CONFIGURATION CHANGE
Aug 29 08:04:50 172.24.100.12 openais[4656]: [CLM ] New Configuration:
Aug 29 08:04:50 172.24.100.12 openais[4656]: [CLM ] r(0) ip(172.24.100.12)
Aug 29 08:04:50 172.24.100.12 openais[4656]: [CLM ] Members Left:
Aug 29 08:04:50 172.24.100.12 openais[4656]: [CLM ] r(0) ip(172.24.100.10)
Aug 29 08:04:50 172.24.100.12 openais[4656]: [CLM ] Members Joined:
Aug 29 08:04:50 172.24.100.12 openais[4656]: [CLM ] CLM CONFIGURATION CHANGE
Aug 29 08:04:50 172.24.100.12 openais[4656]: [CLM ] New Configuration:
Aug 29 08:04:50 172.24.100.12 openais[4656]: [CLM ] r(0) ip(172.24.100.12)
Aug 29 08:04:50 172.24.100.12 openais[4656]: [CLM ] Members Left:
Aug 29 08:04:50 172.24.100.12 openais[4656]: [CLM ] Members Joined:
Aug 29 08:04:50 172.24.100.12 openais[4656]: [SYNC ] This node is within the primary component and will provide service.
Aug 29 08:04:50 172.24.100.12 openais[4656]: [TOTEM] entering OPERATIONAL state.
Aug 29 08:04:50 172.24.100.12 openais[4656]: [CLM ] got nodejoin message 172.24.100.12
Aug 29 08:04:50 172.24.100.12 openais[4656]: [EVT ] Channel device_state, total 1, local 1
Aug 29 08:04:50 172.24.100.12 openais[4656]: [EVT ] Node r(0) ip(172.24.100.12) , count 1
Aug 29 08:04:50 172.24.100.12 openais[4656]: [EVT ] Channel mwi, total 1, local 1
Aug 29 08:04:50 172.24.100.12 openais[4656]: [EVT ] Node r(0) ip(172.24.100.12) , count 1

Thanks,
--
Chris Kirke
Director - Systems Architecture
Multi Service Corporation
www.multiservice.com
+1.913.663.9483 (direct)
+1.816.718.0468 (mobile)
+1.913.217.9318 (fax)

On Tue, Aug 23, 2011 at 11:45, Christopher A. Kirke caki...@multiservice.com wrote:

Steve,

appreciate the quick response, only happened to be running strace during one of the crashes.

i've updated /etc/init.d/openais to enable core dump and changed openais.conf to run as root:asterisk instead of asterisk:asterisk so /var/lib/openais is available for writing.

will post gdb output from the next aisexec crash.

Thanks,
--
Chris Kirke

On Mon, Aug 22, 2011 at 12:13, Steven Dake sd...@redhat.com wrote:

On 08/22/2011 09:58 AM, Christopher A. Kirke wrote:

currently using the RHEL5-provided package on two nodes on a local Cisco-switched LAN:

openais.x86_64 0.80.6-28.el5_6.1 installed

with the following configuration:

# Please read the openais.conf.5 manual page
aisexec {
	user: asterisk
	group: asterisk
}
totem
Re: [Openais] aisexec crashes with SIGABRT
On 08/30/2011 06:39 AM, Christopher A. Kirke wrote:

Steve,

i sometimes need to be smacked by the obvious :^)

updated analysis attached ...

The core happens because the event service from openais dereferences an object with a reference count of 0. This shouldn't happen, but I can't explain why it does. We stopped supporting the SA Forum services in this project last year, but with the understanding that we may not be able to fix the problem, if you could provide more details of the exact scenario that triggers it, that might help in reproducing the issue.

Regards
-steve

Thanks,
--
Chris Kirke
Director - Systems Architecture
Multi Service Corporation
www.multiservice.com
+1.913.663.9483 (direct)
+1.816.718.0468 (mobile)
+1.913.217.9318 (fax)
[Openais] [PATCH] Remove hdb.h header includes from unnecessary files
The files in this patch do not use the hdb.h header.

Signed-off-by: Steven Dake sd...@redhat.com
---
 exec/totemrrp.c  |    1 -
 exec/totemsrp.c  |    1 -
 exec/totemudp.c  |    1 -
 exec/totemudp.h  |    1 -
 exec/totemudpu.c |    1 -
 exec/totemudpu.h |    1 -
 6 files changed, 0 insertions(+), 6 deletions(-)

diff --git a/exec/totemrrp.c b/exec/totemrrp.c
index 8fe3ef7..73cb996 100644
--- a/exec/totemrrp.c
+++ b/exec/totemrrp.c
@@ -60,7 +60,6 @@
 #include <corosync/sq.h>
 #include <corosync/list.h>
-#include <corosync/hdb.h>
 #include <corosync/swab.h>
 #include <qb/qbdefs.h>
 #include <qb/qbloop.h>
diff --git a/exec/totemsrp.c b/exec/totemsrp.c
index 71ccd59..861c75b 100644
--- a/exec/totemsrp.c
+++ b/exec/totemsrp.c
@@ -81,7 +81,6 @@
 #include <corosync/swab.h>
 #include <corosync/sq.h>
 #include <corosync/list.h>
-#include <corosync/hdb.h>
 #define LOGSYS_UTILS_ONLY 1
 #include <corosync/engine/logsys.h>
diff --git a/exec/totemudp.c b/exec/totemudp.c
index ed2f03c..740e246 100644
--- a/exec/totemudp.c
+++ b/exec/totemudp.c
@@ -61,7 +61,6 @@
 #include <corosync/sq.h>
 #include <corosync/swab.h>
 #include <corosync/list.h>
-#include <corosync/hdb.h>
 #include <qb/qbdefs.h>
 #include <qb/qbloop.h>
 #define LOGSYS_UTILS_ONLY 1
diff --git a/exec/totemudp.h b/exec/totemudp.h
index 6d509c1..de39c81 100644
--- a/exec/totemudp.h
+++ b/exec/totemudp.h
@@ -37,7 +37,6 @@
 #include <sys/types.h>
 #include <sys/socket.h>
-#include <corosync/hdb.h>
 #include <qb/qbloop.h>
 #include <corosync/totem/totem.h>
diff --git a/exec/totemudpu.c b/exec/totemudpu.c
index 529c362..21e57c7 100644
--- a/exec/totemudpu.c
+++ b/exec/totemudpu.c
@@ -62,7 +62,6 @@
 #include <corosync/sq.h>
 #include <corosync/list.h>
-#include <corosync/hdb.h>
 #include <corosync/swab.h>
 #define LOGSYS_UTILS_ONLY 1
 #include <corosync/engine/logsys.h>
diff --git a/exec/totemudpu.h b/exec/totemudpu.h
index 977148f..93b31a0 100644
--- a/exec/totemudpu.h
+++ b/exec/totemudpu.h
@@ -37,7 +37,6 @@
 #include <sys/types.h>
 #include <sys/socket.h>
-#include <corosync/hdb.h>
 #include <qb/qbloop.h>
 #include <corosync/totem/totem.h>
-- 
1.7.6
[Openais] [PATCH] Get rid of hdb usage in totempg.h interface
hdb has some expense and is not necessary in the totempg.so runtime. This patch removes the dependence on hdb and instead uses a direct pointer.

Signed-off-by: Steven Dake sd...@redhat.com
---
 exec/main.c                      |    2 +-
 exec/sync.c                      |    2 +-
 exec/syncv2.c                    |    2 +-
 exec/totempg.c                   |  213 +-
 include/corosync/totem/totempg.h |   20 ++--
 5 files changed, 83 insertions(+), 156 deletions(-)

diff --git a/exec/main.c b/exec/main.c
index fde77da..582f1e2 100644
--- a/exec/main.c
+++ b/exec/main.c
@@ -244,7 +244,7 @@ static void sigabrt_handler (int num)

 #define LOCALHOST_IP inet_addr("127.0.0.1")

-static hdb_handle_t corosync_group_handle;
+static void *corosync_group_handle;

 static struct totempg_group corosync_group = {
 	.group = "a",
diff --git a/exec/sync.c b/exec/sync.c
index b9cc84a..ce99129 100644
--- a/exec/sync.c
+++ b/exec/sync.c
@@ -142,7 +142,7 @@ static struct totempg_group sync_group = {
 	.group_len = 4
 };

-static hdb_handle_t sync_group_handle;
+static void *sync_group_handle;

 struct req_exec_sync_barrier_start {
 	struct qb_ipc_request_header header;
diff --git a/exec/syncv2.c b/exec/syncv2.c
index f9eebac..8a96615 100644
--- a/exec/syncv2.c
+++ b/exec/syncv2.c
@@ -167,7 +167,7 @@ static struct totempg_group sync_group = {
 	.group_len = 6
 };

-static hdb_handle_t sync_group_handle;
+static void *sync_group_handle;

 int sync_v2_init (
 	int (*sync_callbacks_retrieve) (
diff --git a/exec/totempg.c b/exec/totempg.c
index c5ba01c..a3eee15 100644
--- a/exec/totempg.c
+++ b/exec/totempg.c
@@ -98,7 +98,6 @@
 #include <limits.h>

 #include <corosync/swab.h>
-#include <corosync/hdb.h>
 #include <corosync/list.h>
 #include <qb/qbloop.h>
 #include <qb/qbipcs.h>
@@ -212,6 +211,8 @@ DECLARE_LIST_INIT(assembly_list_inuse);

 DECLARE_LIST_INIT(assembly_list_free);

+DECLARE_LIST_INIT(totempg_groups_list);
+
 /*
  * Staging buffer for packed messages. Messages are staged in this buffer
  * before sending. Multiple messages may fit which cuts down on the
@@ -230,8 +231,6 @@ static int fragment_continuation = 0;

 static struct iovec iov_delv;

-static unsigned int totempg_max_handle = 0;
-
 struct totempg_group_instance {
 	void (*deliver_fn) (
 		unsigned int nodeid,
@@ -250,6 +249,8 @@ struct totempg_group_instance {

 	int groups_cnt;
 	int32_t q_level;
+
+	struct list_head list;
 };

 DECLARE_HDB_DATABASE (totempg_groups_instance_database,NULL);
@@ -342,7 +343,7 @@ static inline void app_confchg_fn (
 	int i;
 	struct totempg_group_instance *instance;
 	struct assembly *assembly;
-	unsigned int res;
+	struct list_head *list;

 	/*
 	 * For every leaving processor, add to free list
@@ -354,25 +355,23 @@ static inline void app_confchg_fn (
 		list_del (&assembly->list);
 		list_add (&assembly->list, &assembly_list_free);
 	}
-	for (i = 0; i <= totempg_max_handle; i++) {
-		res = hdb_handle_get (&totempg_groups_instance_database,
-			hdb_nocheck_convert (i), (void *)&instance);
-
-		if (res == 0) {
-			if (instance->confchg_fn) {
-				instance->confchg_fn (
-					configuration_type,
-					member_list,
-					member_list_entries,
-					left_list,
-					left_list_entries,
-					joined_list,
-					joined_list_entries,
-					ring_id);
-			}
-			hdb_handle_put (&totempg_groups_instance_database,
-				hdb_nocheck_convert (i));
+	for (list = totempg_groups_list.next;
+		list != &totempg_groups_list;
+		list = list->next) {
+
+		instance = list_entry (list, struct totempg_group_instance, list);
+
+		if (instance->confchg_fn) {
+			instance->confchg_fn (
+				configuration_type,
+				member_list,
+				member_list_entries,
+				left_list,
+				left_list_entries,
+				joined_list,
+				joined_list_entries,
+				ring_id);
 		}
 	}
 }
@@ -474,12 +473,11 @@ static inline void app_deliver_fn (
 	unsigned int msg_len,
 	int endian_conversion_required)
 {
-	int i;
 	struct totempg_group_instance *instance;
 	struct iovec stripped_iovec;
 	unsigned int adjust_iovec;
-	unsigned int res
Re: [Openais] aisexec crashes with SIGABRT
On 08/22/2011 09:58 AM, Christopher A. Kirke wrote: currently using the REL5-provided package on two nodes on local Cisco-switched LAN: openais.x86_64 0.80.6-28.el5_6.1 installed with following configuration: # Please read the openais.conf.5 manual page aisexec { user: asterisk group: asterisk } totem { version: 2 secauth: off threads: 0 interface { ringnumber: 0 bindnetaddr: 172.24.100.0 mcastaddr: 239.255.4.1 mcastport: 5405 } } logging { debug: off syslog_facility: local1 syslog_priority: info timestamp: off to_file: no to_syslog: yes } amf { mode: disabled } to enable Asterisk distributed device state. we see cases where aisexec crashes, both with Asterisk running and stopped - strace output below: 0.73 poll([{fd=1, events=POLLIN}, {fd=3, events=POLLIN}, {fd=4, events=POLLIN|POLLNVAL}], 3, 10) = 0 (Timeout) 0.009994 0.010031 poll([{fd=1, events=POLLIN}, {fd=3, events=POLLIN}, {fd=4, events=POLLIN|POLLNVAL}], 3, 237) = 1 ([{fd=1, revents=POLLIN}]) 0.07 0.37 recvmsg(1, {msg_name(16)={sa_family=AF_INET, sin_port=htons(5149), sin_addr=inet_addr(172.24.100.10)}, msg_iov(1)=[{\2\0\\377\254\30d\n\254\30d\n\2\0\254\30d\n\10\0\2\0\254\30d\n\10\0\4\0\0\0..., 1}], msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 82 0.10 0.72 poll([{fd=1, events=POLLIN}, {fd=3, events=POLLIN}, {fd=4, events=POLLIN|POLLNVAL}], 3, 237) = 1 ([{fd=3, revents=POLLIN}]) 0.180257 0.180339 recvmsg(3, {msg_name(16)={sa_family=AF_INET, sin_port=htons(5149), sin_addr=inet_addr(172.24.100.10)}, msg_iov(1)=[{\0\0\\377\254\30d\fN\1\0\0004/\0\0N\1\0\0\0\0\0\0\254\30d\n\2\0\254\30..., 1}], msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 70 0.22 0.000104 sendmsg(2, {msg_name(16)={sa_family=AF_INET, sin_port=htons(5405), sin_addr=inet_addr(172.24.100.10)}, msg_iov(1)=[{\0\0\\377\254\30d\fN\1\0\0005/\0\0N\1\0\0\0\0\0\0\254\30d\n\2\0\254\30..., 70}], msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 70 0.14 0.72 poll([{fd=1, events=POLLIN}, {fd=3, events=POLLIN}, {fd=4, 
events=POLLIN|POLLNVAL}], 3, 209) = 1 ([{fd=4, revents=POLLIN}]) 0.037614 0.037682 accept(4, {sa_family=AF_FILE, path=@}, [4294967298]) = 5 0.23 0.96 fcntl(5, F_SETFL, O_RDONLY|O_NONBLOCK) = 0 0.12 0.40 setsockopt(5, SOL_SOCKET, SO_PASSCRED, [1], 4) = 0 0.06 0.70 poll([{fd=1, events=POLLIN}, {fd=3, events=POLLIN}, {fd=4, events=POLLIN|POLLNVAL}, {fd=5, events=POLLIN|POLLNVAL}], 4, 172) = 1 ([{fd=5, revents=POLLIN}]) 0.000158 0.000207 setsockopt(5, SOL_SOCKET, SO_PASSCRED, [1], 4) = 0 0.06 0.29 recvmsg(5, {msg_name(0)=NULL, msg_iov(1)=[{\1\0\0\0\252*\0\0\224\343TE\0\0\0\0\270\355\362+\0\0\0\0, 24}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=819, uid=301, gid=301}}, msg_flags=0}, MSG_NOSIGNAL) = 24 0.14 0.84 setsockopt(5, SOL_SOCKET, SO_PASSCRED, [0], 4) = 0 0.06 0.28 sendto(5, \1\0\0\0\0\0\0\0, 8, MSG_WAITALL, NULL, 0) = 8 0.07 0.31 shmget(0x4554e394, 308, 0600) = 5144599 0.000118 0.000198 shmat(5144599, 0, 0) = ? 0.002286 0.002332 semget(0x2bf2edb8, 3, 0600) = 1081360 0.000108 0.000155 mmap(NULL, 20, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_32BIT, -1, 0) = 0x41781000 0.000212 0.000262 mprotect(0x41781000, 4096, PROT_NONE) = 0 0.20 0.51 clone(child_stack=0x417b0f90, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x417b1710, tls=0x417b1680, child_tidptr=0x417b1710) = 859 0.46 0.000109 poll([{fd=1, events=POLLIN}, {fd=3, events=POLLIN}, {fd=4, events=POLLIN|POLLNVAL}, {fd=5, events=POLLIN|POLLNVAL}], 4, 168) = 1 ([{fd=4, revents=POLLIN}]) 0.000924 0.000980 accept(4, {sa_family=AF_FILE, path=@}, [4294967298]) = 6 0.11 0.63 fcntl(6, F_SETFL, O_RDONLY|O_NONBLOCK) = 0 0.06 0.49 setsockopt(6, SOL_SOCKET, SO_PASSCRED, [1], 4) = 0 0.06 0.33 poll([{fd=1, events=POLLIN}, {fd=3, events=POLLIN}, {fd=4, events=POLLIN|POLLNVAL}, {fd=5, events=POLLIN|POLLNVAL}, {fd=6, events=POLLIN|POLLNVAL}], 5, 167) = 1 ([{fd=6, 
revents=POLLIN}]) 0.07 0.44 setsockopt(6, SOL_SOCKET, SO_PASSCRED, [1], 4) = 0 0.06 0.42 recvmsg(6, {msg_name(0)=NULL, msg_iov(1)=[{\4\0\0\0\252*\0\0Rv-b\0\0\0\0A\246\10B\0\0\0\0, 24}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=819, uid=301, gid=301}}, msg_flags=0}, MSG_NOSIGNAL) = 24 0.10 0.72 setsockopt(6, SOL_SOCKET, SO_PASSCRED, [0], 4) = 0 0.06 0.41 sendto(6, \1\0\0\0\0\0\0\0, 8, MSG_WAITALL, NULL, 0) = 8 0.07 0.37 shmget(0x622d7652, 308, 0600) = 5177368 0.06
[Openais] [PATCH] Move cs_queue.h from include directory to exec directory
This file is only used by totemsrp.c. Move out of general include directory. Signed-off-by: Steven Dake sd...@redhat.com --- exec/Makefile.am|2 +- exec/cs_queue.h | 229 +++ exec/totemsrp.c |2 +- include/Makefile.am |2 +- include/corosync/cs_queue.h | 227 -- 5 files changed, 232 insertions(+), 230 deletions(-) create mode 100644 exec/cs_queue.h delete mode 100644 include/corosync/cs_queue.h diff --git a/exec/Makefile.am b/exec/Makefile.am index 49e9f5a..8514afa 100644 --- a/exec/Makefile.am +++ b/exec/Makefile.am @@ -37,7 +37,7 @@ INCLUDES = -I$(top_builddir)/include -I$(top_srcdir)/include $(nss_CFLAGS) $(rd TOTEM_SRC = totemip.c totemnet.c totemudp.c \ totemudpu.c totemrrp.c totemsrp.c totemmrp.c \ - totempg.c crypto.c + totempg.c crypto.c cs_queue.h if BUILD_RDMA TOTEM_SRC += totemiba.c endif diff --git a/exec/cs_queue.h b/exec/cs_queue.h new file mode 100644 index 000..1e8439f --- /dev/null +++ b/exec/cs_queue.h @@ -0,0 +1,229 @@ +/* + * Copyright (c) 2002-2004 MontaVista Software, Inc. + * + * All rights reserved. + * + * Author: Steven Dake (sd...@redhat.com) + * + * This software licensed under BSD license, the text of which follows: + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * - Redistributions of source code must retain the above copyright notice, + * this list of conditions and the following disclaimer. + * - Redistributions in binary form must reproduce the above copyright notice, + * this list of conditions and the following disclaimer in the documentation + * and/or other materials provided with the distribution. + * - Neither the name of the MontaVista Software, Inc. nor the names of its + * contributors may be used to endorse or promote products derived from this + * software without specific prior written permission. 
+ * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS AS IS + * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS + * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN + * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF + * THE POSSIBILITY OF SUCH DAMAGE. + */ +#ifndef CS_QUEUE_H_DEFINED +#define CS_QUEUE_H_DEFINED + +#include string.h +#include stdlib.h +#include pthread.h +#include errno.h +#include assert.h + +struct cs_queue { + int head; + int tail; + int used; + int usedhw; + int size; + void *items; + int size_per_item; + int iterator; + pthread_mutex_t mutex; +}; + +static inline int cs_queue_init (struct cs_queue *cs_queue, int cs_queue_items, int size_per_item) { + cs_queue-head = 0; + cs_queue-tail = cs_queue_items - 1; + cs_queue-used = 0; + cs_queue-usedhw = 0; + cs_queue-size = cs_queue_items; + cs_queue-size_per_item = size_per_item; + + cs_queue-items = malloc (cs_queue_items * size_per_item); + if (cs_queue-items == 0) { + return (-ENOMEM); + } + memset (cs_queue-items, 0, cs_queue_items * size_per_item); + pthread_mutex_init (cs_queue-mutex, NULL); + return (0); +} + +static inline int cs_queue_reinit (struct cs_queue *cs_queue) +{ + pthread_mutex_lock (cs_queue-mutex); + cs_queue-head = 0; + cs_queue-tail = cs_queue-size - 1; + cs_queue-used = 0; + cs_queue-usedhw = 0; + + memset (cs_queue-items, 0, cs_queue-size * cs_queue-size_per_item); + pthread_mutex_unlock (cs_queue-mutex); + return (0); +} + +static 
inline void cs_queue_free (struct cs_queue *cs_queue) { + pthread_mutex_destroy (cs_queue-mutex); + free (cs_queue-items); +} + +static inline int cs_queue_is_full (struct cs_queue *cs_queue) { + int full; + + pthread_mutex_lock (cs_queue-mutex); + full = ((cs_queue-size - 1) == cs_queue-used); + pthread_mutex_unlock (cs_queue-mutex); + return (full); +} + +static inline int cs_queue_is_empty (struct cs_queue *cs_queue) { + int empty; + + pthread_mutex_lock (cs_queue-mutex); + empty = (cs_queue-used == 0); + pthread_mutex_unlock (cs_queue-mutex); + return (empty
Re: [Openais] Problems forming cluster on corosync startup
On 08/14/2011 01:34 PM, Tim Beale wrote: Hi Steve, I repeated the test with fail_recv_const=5000. I can see the CPG client hung for ~4 minutes without dispatching any CPG events (i.e. node join). Unfortunately, one of our healthchecking mechanisms kicked in at this point, detected the CPG client as locked up and rebooted the units. It definitely rules out #2. I can repeat the test with healthchecking disabled to narrow down if #1 or #3 will occur. Regards, Tim On Thu, Aug 11, 2011 at 4:21 AM, Steven Dake sd...@redhat.com wrote: On 08/09/2011 09:56 PM, Tim Beale wrote: Hi Steve, Thanks for your patch. 1. I don't see the initial CLM leave events. But I still see the FAILED TO RECEIVE hit on node-3. A couple of nodes don't enter operational on ring 20, then after the ring next reforms (ring 24), the FAILED TO RECEIVE happens. Attached is the latest debug. Keep in mind there are two problems here - (1) clm membership is wrong and (2) fail to recv problem. They are independent issues. I definitely want to look into this failed to receive issue. Can you try changing fail_recv_const on all the nodes to some large value, such as 5000? One of 3 things should happen: 1. the protocol blocks forever 2. the protocol enters operational after some short period 3. fail to recv is printed after a long period of time (1-10 minutes). Please report back which one happens with this tuning. Given that #1/#3 are basically what are occurring, I would love to have a blackbox few seconds after config time and then couple minutes in. Apparently something is wrong with the recovery in this test case. Regards -steve I think the problem is some nodes end up missing a message/sequence-number, although I'm not sure exactly why. E.g. the token sequence starts off at one when they enter operational, but not all nodes receive this. 
2011 Aug 9 10:07:18 daemon.debug node-3 corosync[1575]: [TOTEM ] totemsrp.c:3785 retrans flag count 4 token aru 0 install seq 0 aru 0 1 The nodes that were still in recovery will be using different values for old_ring_state_high_seq_received and my_old_ring_id. It seems these nodes receive msg seq #1, but the others don't and hit the FAILED TO RECEIVE. The debug attached has your first memb-list patch popped off, but I've seen the same problem happen with it applied too. 2. Note that I don't see any CLM leave events at all now, even though after the FAILED TO RECEIVE, node-3 kicks all other nodes out of its ring. I think this is due to the logic: diff = my_new_memb_list - my_memb_list This isn't how the difference operation works. It produces a list of nodes that are not both in my_new_memb_list and my_memb_list, therefore, the current and logic should be correct. I wrote the patch at 2am and was quite tired, so I'll double check it is correct. Regards -steve The diff doesn't include any nodes that are in my_memb_list but not in my_new_memb_list, i.e. left nodes. I guess you could get all the differences by doing the following: memb_set_subtract( diff1, my_new_memb_list, my_memb_list ) memb_set_subtract( diff2, my_memb_list, my_new_memb_list ) memb_set_and( diff1, diff2, diff ) Thanks, Tim On Mon, Aug 8, 2011 at 9:45 PM, Steven Dake sd...@redhat.com wrote: On 08/08/2011 12:10 AM, Tim Beale wrote: Hi Steve, Thanks for your help. I tried out your patch but the problem still occurs. The problem looks to me due to the ring-IDs used when forming the transitional memb-list, rather than with the memb-list itself. The ring-ID of the nodes still in Recovery is older than the rest of the nodes who have already shifted to Operational. Attached is my attempt at fixing the problem. The idea is to delay the nodes processing a Memb-Join immediately after shifting to Operational, until the token has rotated the ring once. It doesn't quite work either though. 
The nodes are still re-entering gather before all have left recovery. This time it's due to processing a Merge-Detect message. One node has just started up and set itself to the rep, and sends out a Merge-Detect which triggers the other nodes to enter gather and reform the ring. Let me know if you have any other advice. the problem is clear from the blackbox - 8 nodes enter operational while 1 in recovery is interrupted by a join message. this interrupted node then proceeds with a transitional membership of 1 node (which is correct). The joined and left lists use the transitional list to determine their contents, which is not correct. This results in incorrect data delivered to clm. Try the follow-up patch which should correctly calculate the joined and left lists. Thanks, Tim On Mon, Aug 8, 2011 at 6:08 AM, Steven Dake sd...@redhat.com wrote: On 08/03/2011 10:32 PM, Tim Beale wrote: Hi, It looks to me that the way the transition from Recovery to Operational works, we can't guarantee that all nodes in the ring have entered
Re: [Openais] Problems forming cluster on corosync startup
On 08/09/2011 09:56 PM, Tim Beale wrote: Hi Steve, Thanks for your patch. 1. I don't see the initial CLM leave events. But I still see the FAILED TO RECEIVE hit on node-3. A couple of nodes don't enter operational on ring 20, then after the ring next reforms (ring 24), the FAILED TO RECEIVE happens. Attached is the latest debug. Keep in mind there are two problems here - (1) clm membership is wrong and (2) fail to recv problem. They are independent issues. I definitely want to look into this failed to receive issue. Can you try changing fail_recv_const on all the nodes to some large value, such as 5000? One of 3 things should happen: 1. the protocol blocks forever 2. the protocol enters operational after some short period 3. fail to recv is printed after a long period of time (1-10 minutes). Please report back which one happens with this tuning. I think the problem is some nodes end up missing a message/sequence-number, although I'm not sure exactly why. E.g. the token sequence starts off at one when they enter operational, but not all nodes receive this. 2011 Aug 9 10:07:18 daemon.debug node-3 corosync[1575]: [TOTEM ] totemsrp.c:3785 retrans flag count 4 token aru 0 install seq 0 aru 0 1 The nodes that were still in recovery will be using different values for old_ring_state_high_seq_received and my_old_ring_id. It seems these nodes receive msg seq #1, but the others don't and hit the FAILED TO RECEIVE. The debug attached has your first memb-list patch popped off, but I've seen the same problem happen with it applied too. 2. Note that I don't see any CLM leave events at all now, even though after the FAILED TO RECEIVE, node-3 kicks all other nodes out of its ring. I think this is due to the logic: diff = my_new_memb_list - my_memb_list This isn't how the difference operation works. It produces a list of nodes that are not both in my_new_memb_list and my_memb_list, therefore, the current and logic should be correct. 
I wrote the patch at 2am and was quite tired, so I'll double check it is correct. Regards -steve The diff doesn't include any nodes that are in my_memb_list but not in my_new_memb_list, i.e. left nodes. I guess you could get all the differences by doing the following: memb_set_subtract( diff1, my_new_memb_list, my_memb_list ) memb_set_subtract( diff2, my_memb_list, my_new_memb_list ) memb_set_and( diff1, diff2, diff ) Thanks, Tim On Mon, Aug 8, 2011 at 9:45 PM, Steven Dake sd...@redhat.com wrote: On 08/08/2011 12:10 AM, Tim Beale wrote: Hi Steve, Thanks for your help. I tried out your patch but the problem still occurs. The problem looks to me due to the ring-IDs used when forming the transitional memb-list, rather than with the memb-list itself. The ring-ID of the nodes still in Recovery is older than the rest of the nodes who have already shifted to Operational. Attached is my attempt at fixing the problem. The idea is to delay the nodes processing a Memb-Join immediately after shifting to Operational, until the token has rotated the ring once. It doesn't quite work either though. The nodes are still re-entering gather before all have left recovery. This time it's due to processing a Merge-Detect message. One node has just started up and set itself to the rep, and sends out a Merge-Detect which triggers the other nodes to enter gather and reform the ring. Let me know if you have any other advice. the problem is clear from the blackbox - 8 nodes enter operational while 1 in recovery is interrupted by a join message. this interrupted node then proceeds with a transitional membership of 1 node (which is correct). The joined and left lists use the transitional list to determine their contents, which is not correct. This results in incorrect data delivered to clm. Try the follow-up patch which should correctly calculate the joined and left lists. 
Thanks, Tim On Mon, Aug 8, 2011 at 6:08 AM, Steven Dake sd...@redhat.com wrote: On 08/03/2011 10:32 PM, Tim Beale wrote: Hi, It looks to me that the way the transition from Recovery to Operational works, we can't guarantee that all nodes in the ring have entered Operational before a node processes another Memb-Join message from a new node. E.g. we can't guarantee the token has rotated right the way around the ring. When this happens, the nodes still in Recovery will still use the older ring ID. So they won't get added to the transitional membership, and CLM will report leave events for these nodes. (Plus there might be other side-effects, like the FAILED TO RECEIVE problem - I haven't quite worked out why that's happening). Thanks for the pointer here - patch on ml. We are currently using CLM to check the health of a node, i.e. so we can detect if it locks up. My questions are: i) Are there config settings we could change to improve this, like increasing the 'join' timeout? ii) Should I try to make a code change
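Editorial sketch of the set arithmetic discussed in this thread. Tim's point is that a single subtraction only works in one direction: new minus old yields the joined nodes, and the left nodes need the opposite subtraction, old minus new. The code below is not the totemsrp implementation. It uses plain node ids instead of struct srp_addr, and set_subtract is a hypothetical helper written only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from totemsrp.c): writes every element of a
 * that does not appear in b into out, and returns how many were written.
 * out must have room for a_n entries.
 */
static size_t set_subtract(const unsigned *a, size_t a_n,
                           const unsigned *b, size_t b_n,
                           unsigned *out)
{
    size_t n = 0;

    for (size_t i = 0; i < a_n; i++) {
        int found = 0;

        for (size_t j = 0; j < b_n; j++) {
            if (a[i] == b[j]) {
                found = 1;
                break;
            }
        }
        if (!found) {
            out[n++] = a[i];
        }
    }
    return n;
}
```

With an old membership of {1, 2, 3} and a new membership of {2, 3, 4}, subtracting old from new gives the joined list {4}, and subtracting new from old gives the left list {1}. Neither call alone produces both lists.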
Re: [Openais] [PATCH 1/2] cpg: Handle errors from totem_mcast
On second consideration this patch is Reviewed-by: Steven Dake sd...@redhat.com On 08/08/2011 09:11 AM, Steven Dake wrote: On 07/28/2011 07:20 AM, Jan Friesse wrote: totem_mcast function can return -1 if corosync is overloaded. Sadly in many calls of this functions was error code ether not handled at all, or handled by assert. Commit changes behaviour to ether return CS_ERR_TRY_AGAIN or put error code to later layers to handle it. Signed-off-by: Jan Friesse jfrie...@redhat.com --- services/cpg.c | 31 ++- 1 files changed, 26 insertions(+), 5 deletions(-) diff --git a/services/cpg.c b/services/cpg.c index 6669fbd..18767bd 100644 --- a/services/cpg.c +++ b/services/cpg.c @@ -865,12 +865,19 @@ static void cpg_pd_finalize (struct cpg_pd *cpd) static int cpg_lib_exit_fn (void *conn) { struct cpg_pd *cpd = (struct cpg_pd *)api-ipc_private_data_get (conn); +int result; log_printf(LOGSYS_LEVEL_DEBUG, exit_fn for conn=%p\n, conn); if (cpd-group_name.length 0) { -cpg_node_joinleave_send (cpd-pid, cpd-group_name, +result = cpg_node_joinleave_send (cpd-pid, cpd-group_name, MESSAGE_REQ_EXEC_CPG_PROCLEAVE, CONFCHG_CPG_REASON_PROCDOWN); +if (result == -1) { +/* + * Call this function again later + */ +return (result); +} } this is correct cpg_pd_finalize (cpd); @@ -1289,6 +1296,7 @@ static void message_handler_req_lib_cpg_join (void *conn, const void *message) struct res_lib_cpg_join res_lib_cpg_join; cs_error_t error = CPG_OK; struct list_head *iter; +int result; /* Test, if we don't have same pid and group name joined */ for (iter = cpg_pd_list_head.next; iter != cpg_pd_list_head; iter = iter-next) { @@ -1327,9 +1335,15 @@ static void message_handler_req_lib_cpg_join (void *conn, const void *message) memcpy (cpd-group_name, req_lib_cpg_join-group_name, sizeof (cpd-group_name)); -cpg_node_joinleave_send (req_lib_cpg_join-pid, +result = cpg_node_joinleave_send (req_lib_cpg_join-pid, req_lib_cpg_join-group_name, MESSAGE_REQ_EXEC_CPG_PROCJOIN, CONFCHG_CPG_REASON_JOIN); + +if (result 
== -1) { +error = CPG_ERR_TRY_AGAIN; +cpd-cpd_state = CPD_STATE_UNJOINED; +goto response_send; +} break; case CPD_STATE_LEAVE_STARTED: error = CPG_ERR_BUSY; the remainder of patch is not. the ipc layer ensures room is available in the totem queue to handle new totem messages. If that part isn't working as expected (ie: you see a failure in this part of the code) you should fix the totem pending queue rather then hack it here. @@ -1356,6 +1370,7 @@ static void message_handler_req_lib_cpg_leave (void *conn, const void *message) cs_error_t error = CPG_OK; struct req_lib_cpg_leave *req_lib_cpg_leave = (struct req_lib_cpg_leave *)message; struct cpg_pd *cpd = (struct cpg_pd *)api-ipc_private_data_get (conn); +int result; log_printf(LOGSYS_LEVEL_DEBUG, got leave request on %p\n, conn); @@ -1372,10 +1387,14 @@ static void message_handler_req_lib_cpg_leave (void *conn, const void *message) case CPD_STATE_JOIN_COMPLETED: error = CPG_OK; cpd-cpd_state = CPD_STATE_LEAVE_STARTED; -cpg_node_joinleave_send (req_lib_cpg_leave-pid, +result = cpg_node_joinleave_send (req_lib_cpg_leave-pid, req_lib_cpg_leave-group_name, MESSAGE_REQ_EXEC_CPG_PROCLEAVE, CONFCHG_CPG_REASON_LEAVE); +if (result == -1) { +error = CPG_ERR_TRY_AGAIN; +cpd-cpd_state = CPD_STATE_JOIN_COMPLETED; +} break; } @@ -1458,8 +1477,10 @@ static void message_handler_req_lib_cpg_mcast (void *conn, const void *message) req_exec_cpg_iovec[1].iov_base = (char *)req_lib_cpg_mcast-message; req_exec_cpg_iovec[1].iov_len = msglen; -result = api-totem_mcast (req_exec_cpg_iovec, 2, TOTEM_AGREED); -assert(result == 0); +result = api-totem_mcast (req_exec_cpg_iovec, 2, TOTEM_AGREED); +if (result == -1) { +error = CPG_ERR_TRY_AGAIN; +} } res_lib_cpg_mcast.header.size = sizeof(res_lib_cpg_mcast); ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais ___ Openais mailing list Openais@lists.linux-foundation.org
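Editorial sketch of how a caller might react once a request can come back with a try-again error, as the patch above proposes. Everything here is hypothetical: the demo_result codes, send_with_retry, and the fake sender busy_then_ok are stand-ins for illustration and are not part of the corosync or CPG API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes; they only mirror the idea of CPG_ERR_TRY_AGAIN. */
enum demo_result { DEMO_OK = 0, DEMO_ERR_TRY_AGAIN = 1 };

typedef enum demo_result (*demo_send_fn)(void *ctx);

/*
 * Calls send until it returns something other than DEMO_ERR_TRY_AGAIN or
 * until max_attempts calls have been made. Stores the number of calls in
 * *attempts_used when that pointer is not NULL.
 */
static enum demo_result send_with_retry(demo_send_fn send, void *ctx,
                                        int max_attempts, int *attempts_used)
{
    enum demo_result res = DEMO_ERR_TRY_AGAIN;
    int attempts = 0;

    while (attempts < max_attempts) {
        attempts++;
        res = send(ctx);
        if (res != DEMO_ERR_TRY_AGAIN) {
            break;
        }
        /* A real caller would sleep or return to its event loop here. */
    }
    if (attempts_used != NULL) {
        *attempts_used = attempts;
    }
    return res;
}

/* Fake sender: reports "busy" for the first *ctx calls, then succeeds. */
static enum demo_result busy_then_ok(void *ctx)
{
    int *remaining_busy = ctx;

    if (*remaining_busy > 0) {
        (*remaining_busy)--;
        return DEMO_ERR_TRY_AGAIN;
    }
    return DEMO_OK;
}
```

Bounding the number of attempts keeps a client from spinning forever if the daemon stays overloaded. The caller still has to decide what to do when the last attempt also reports try-again.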
Re: [Openais] [PATCH 2/2] cfg: Handle errors from totem_mcast
On second consideration this patch is Reviewed-by: Steven Dake sd...@redhat.com On 08/08/2011 09:15 AM, Steven Dake wrote: Before accepting an IPC message, ipc checks that the totem queue has available room for new messages. As a result this patch is either not necessary or fixes the wrong thing. See coroipcs.c:697 send_ok = api-sending_allowed (conn_info-service, header-id, header, conn_info-sending_allowed_private_data); On 07/28/2011 07:20 AM, Jan Friesse wrote: totem_mcast function can return -1 if corosync is overloaded. Sadly in many calls of this functions was error code ether not handled at all, or handled by assert. Commit changes behaviour to ether return CS_ERR_TRY_AGAIN or put error code to later layers to handle it. Signed-off-by: Jan Friesse jfrie...@redhat.com --- services/cfg.c | 77 ++- 1 files changed, 59 insertions(+), 18 deletions(-) diff --git a/services/cfg.c b/services/cfg.c index b7aa63b..24f19f2 100644 --- a/services/cfg.c +++ b/services/cfg.c @@ -379,6 +379,7 @@ static int send_shutdown(void) { struct req_exec_cfg_shutdown req_exec_cfg_shutdown; struct iovec iovec; +int result; ENTER(); req_exec_cfg_shutdown.header.size = @@ -389,10 +390,10 @@ static int send_shutdown(void) iovec.iov_base = (char *)req_exec_cfg_shutdown; iovec.iov_len = sizeof (struct req_exec_cfg_shutdown); -assert (api-totem_mcast (iovec, 1, TOTEM_SAFE) == 0); +result = api-totem_mcast (iovec, 1, TOTEM_SAFE); LEAVE(); -return 0; +return (result); } static void send_test_shutdown(void *only_conn, void *exclude_conn, int status) @@ -426,6 +427,9 @@ static void send_test_shutdown(void *only_conn, void *exclude_conn, int status) static void check_shutdown_status(void) { +int result; +cs_error_t error = CS_OK; + ENTER(); /* @@ -448,9 +452,17 @@ static void check_shutdown_status(void) shutdown_flags == CFG_SHUTDOWN_FLAG_REGARDLESS) { TRACE1(shutdown confirmed); +/* + * Tell other nodes we are going down + */ +result = send_shutdown(); +if (result == -1) { +error = 
CS_ERR_TRY_AGAIN; +} + res_lib_cfg_tryshutdown.header.size = sizeof(struct res_lib_cfg_tryshutdown); res_lib_cfg_tryshutdown.header.id = MESSAGE_RES_CFG_TRYSHUTDOWN; -res_lib_cfg_tryshutdown.header.error = CS_OK; +res_lib_cfg_tryshutdown.header.error = error; /* * Tell originator that shutdown was confirmed @@ -459,10 +471,6 @@ static void check_shutdown_status(void) sizeof(res_lib_cfg_tryshutdown)); shutdown_con = NULL; -/* - * Tell other nodes we are going down - */ -send_shutdown(); } else { @@ -698,7 +706,9 @@ static void message_handler_req_lib_cfg_ringreenable ( const void *msg) { struct req_exec_cfg_ringreenable req_exec_cfg_ringreenable; +struct res_lib_cfg_ringreenable res_lib_cfg_ringreenable; struct iovec iovec; +int result; ENTER(); req_exec_cfg_ringreenable.header.size = @@ -711,7 +721,19 @@ static void message_handler_req_lib_cfg_ringreenable ( iovec.iov_base = (char *)req_exec_cfg_ringreenable; iovec.iov_len = sizeof (struct req_exec_cfg_ringreenable); -assert (api-totem_mcast (iovec, 1, TOTEM_SAFE) == 0); +result = api-totem_mcast (iovec, 1, TOTEM_SAFE); + +if (result == -1) { +res_lib_cfg_ringreenable.header.id = MESSAGE_RES_CFG_RINGREENABLE; +res_lib_cfg_ringreenable.header.size = sizeof (struct res_lib_cfg_ringreenable); +res_lib_cfg_ringreenable.header.error = CS_ERR_TRY_AGAIN; +api-ipc_response_send ( +conn, +res_lib_cfg_ringreenable, +sizeof (struct res_lib_cfg_ringreenable)); + +api-ipc_refcnt_dec(conn); +} LEAVE(); } @@ -836,6 +858,8 @@ static void message_handler_req_lib_cfg_killnode ( struct res_lib_cfg_killnode res_lib_cfg_killnode; struct req_exec_cfg_killnode req_exec_cfg_killnode; struct iovec iovec; +int result; +cs_error_t error = CS_OK; ENTER(); req_exec_cfg_killnode.header.size = @@ -848,11 +872,14 @@ static void message_handler_req_lib_cfg_killnode ( iovec.iov_base = (char *)req_exec_cfg_killnode; iovec.iov_len = sizeof
[Openais] [PATCH 1/4] Fix problem in totemiba where incorrect define is used (and also not defined)
Signed-off-by: Steven Dake <sd...@redhat.com>
---
 exec/totemiba.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/exec/totemiba.c b/exec/totemiba.c
index 2d8c690..a16f88a 100644
--- a/exec/totemiba.c
+++ b/exec/totemiba.c
@@ -70,6 +70,8 @@
 #include <corosync/list.h>
 #include <corosync/hdb.h>
 #include <corosync/swab.h>
+
+#include <qb/qbdefs.h>
 #include <qb/qbloop.h>
 #define LOGSYS_UTILS_ONLY 1
 #include <corosync/engine/logsys.h>
@@ -1316,7 +1318,7 @@ int totemiba_initialize (
 	qb_loop_timer_add (instance->totemiba_poll_handle,
 		QB_LOOP_MED,
-		100*QB_TIME_NS_IN_NSEC,
+		100*QB_TIME_NS_IN_MSEC,
 		(void *)instance,
 		timer_function_netif_check_timeout,
 		&instance->timer_netif_check_timeout);
-- 
1.7.6
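Editorial sketch of the unit conversion behind this fix. To my understanding qb_loop_timer_add takes its duration in nanoseconds, so a 100 ms interval has to be multiplied by the number of nanoseconds per millisecond. The constant below is a local stand-in, assumed to match QB_TIME_NS_IN_MSEC from qb/qbdefs.h, and msec_to_nsec is a hypothetical helper that does not exist in libqb or corosync.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in; assumed to equal libqb's QB_TIME_NS_IN_MSEC. */
#define DEMO_NS_IN_MSEC 1000000ULL

/* Hypothetical helper: converts a millisecond interval to nanoseconds. */
static uint64_t msec_to_nsec(uint64_t msec)
{
    return msec * DEMO_NS_IN_MSEC;
}
```

Under that assumption, msec_to_nsec(100) is 100000000, which is the value the corrected expression 100*QB_TIME_NS_IN_MSEC is meant to pass to the timer.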
[Openais] [PATCH 2/4] Define totemiba_log_printf properly
Signed-off-by: Steven Dake <sd...@redhat.com>
---
 exec/totemiba.c |    8 +++++---
 1 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/exec/totemiba.c b/exec/totemiba.c
index a16f88a..008018a 100644
--- a/exec/totemiba.c
+++ b/exec/totemiba.c
@@ -187,13 +187,15 @@ struct totemiba_instance {
 	struct ibv_cq *send_token_recv_cq;
-	void (*totemiba_log_printf) (
-		unsigned int rec_ident,
+void (*totemiba_log_printf) (
+		int level,
+		int subsys,
 		const char *function,
 		const char *file,
 		int line,
 		const char *format,
-		...)__attribute__((format(printf, 5, 6)));
+		...)__attribute__((format(printf, 6, 7)));
+
 	int totemiba_subsys_id;
-- 
1.7.6
[Openais] [PATCH 4/4] Remove -lcoroipcc from tools/Makefile.am notifyd
Signed-off-by: Steven Dake <sd...@redhat.com>
---
 tools/Makefile.am |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/tools/Makefile.am b/tools/Makefile.am
index f88e741..2699519 100644
--- a/tools/Makefile.am
+++ b/tools/Makefile.am
@@ -55,7 +55,7 @@ corosync_quorumtool_LDADD = -lconfdb -lcfg -lquorum \
 	-lvotequorum ../lcr/liblcr.a $(LIBQB_LIBS)
 corosync_quorumtool_LDFLAGS = -L../lib
-corosync_notifyd_LDADD = -lcfg -lconfdb ../lcr/liblcr.a -lcoroipcc \
+corosync_notifyd_LDADD = -lcfg -lconfdb ../lcr/liblcr.a \
 	$(LIBQB_LIBS) $(DBUS_LIBS) $(SNMPLIBS) \
 	-lquorum
 corosync_notifyd_LDFLAGS = -L../lib
-- 
1.7.6
[Openais] [PATCH 3/4] properly define rec_token_cq_send_event_fn
Signed-off-by: Steven Dake <sd...@redhat.com>
---
 exec/totemiba.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/exec/totemiba.c b/exec/totemiba.c
index 008018a..ffcfceb 100644
--- a/exec/totemiba.c
+++ b/exec/totemiba.c
@@ -562,7 +562,10 @@ static int mcast_rdma_event_fn (int events, int suck, void *context)
 	return (0);
 }
 
-static int recv_token_cq_send_event_fn (hdb_handle_t poll_handle, int events, int suck, void *context)
+static int recv_token_cq_send_event_fn (
+	int fd,
+	int revents,
+	void *context)
 {
 	struct totemiba_instance *instance = (struct totemiba_instance *)context;
 	struct ibv_wc wc[32];
-- 
1.7.6
[Openais] [PATCH] Make joined and left lists deliver correct results
Signed-off-by: Steven Dake sd...@redhat.com --- exec/totemsrp.c | 47 ++- 1 files changed, 42 insertions(+), 5 deletions(-) diff --git a/exec/totemsrp.c b/exec/totemsrp.c index 4a299a0..a97ed49 100644 --- a/exec/totemsrp.c +++ b/exec/totemsrp.c @@ -1349,6 +1349,35 @@ static void memb_set_and_with_ring_id ( return; } +static void memb_set_and ( + struct srp_addr *set1, + int set1_entries, + struct srp_addr *set2, + int set2_entries, + struct srp_addr *and, + int *and_entries) +{ + int i; + int j; + int found = 0; + + *and_entries = 0; + + for (i = 0; i set2_entries; i++) { + for (j = 0; j set1_entries; j++) { + if (srp_addr_equal (set1[j], set2[i])) { + found = 1; + break; + } + } + if (found) { + srp_addr_copy (and[*and_entries], set1[j]); + *and_entries = *and_entries + 1; + } + found = 0; + } + return; +} #ifdef CODE_COVERAGE static void memb_set_print ( char *string, @@ -1718,6 +1747,8 @@ static void memb_state_operational_enter (struct totemsrp_instance *instance) unsigned int trans_memb_list_totemip[PROCESSOR_COUNT_MAX]; unsigned int new_memb_list_totemip[PROCESSOR_COUNT_MAX]; unsigned int left_list[PROCESSOR_COUNT_MAX]; + struct srp_addr difference_list[PROCESSOR_COUNT_MAX]; + int difference_list_entries = 0; unsigned int i; unsigned int res; @@ -1739,14 +1770,20 @@ static void memb_state_operational_enter (struct totemsrp_instance *instance) /* * Calculate joined and left list */ - memb_set_subtract (instance-my_left_memb_list, - instance-my_left_memb_entries, + memb_set_subtract (difference_list, + difference_list_entries, + instance-my_new_memb_list, instance-my_new_memb_entries, + instance-my_memb_list, instance-my_memb_entries); + + memb_set_and ( + difference_list, difference_list_entries, instance-my_memb_list, instance-my_memb_entries, - instance-my_trans_memb_list, instance-my_trans_memb_entries); + instance-my_left_memb_list, instance-my_left_memb_entries); - memb_set_subtract (joined_list, joined_list_entries, + memb_set_and ( + difference_list, 
difference_list_entries, instance-my_new_memb_list, instance-my_new_memb_entries, - instance-my_trans_memb_list, instance-my_trans_memb_entries); + joined_list, joined_list_entries); /* * Install new membership -- 1.7.6 ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
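For readers following the joined/left calculation, here is a minimal sketch of the two set helpers the patch relies on (subtract and intersect). It uses plain int node IDs rather than struct srp_addr, and the names set_subtract and set_and are hypothetical stand-ins, not the totemsrp.c functions. It assumes the textbook definitions (joined = new minus old, left = old minus new) and does not claim to reproduce the exact call sequence of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* out = a \ b: entries of a that do not appear in b. Returns the entry count. */
size_t set_subtract(const int *a, size_t a_n,
                    const int *b, size_t b_n, int *out)
{
    size_t n = 0;

    assert(a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < a_n; i++) {
        int found = 0;
        for (size_t j = 0; j < b_n; j++) {
            if (a[i] == b[j]) {
                found = 1;
                break;
            }
        }
        if (!found) {
            out[n++] = a[i];
        }
    }
    return n;
}

/* out = a AND b: entries present in both sets. Returns the entry count. */
size_t set_and(const int *a, size_t a_n,
               const int *b, size_t b_n, int *out)
{
    size_t n = 0;

    assert(a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < a_n; i++) {
        for (size_t j = 0; j < b_n; j++) {
            if (a[i] == b[j]) {
                out[n++] = a[i];
                break;
            }
        }
    }
    return n;
}
```

With an old membership of {1, 2, 3, 12} and a new membership of {1, 3, 4, 12}, subtracting old from new gives the joined node 4 and subtracting new from old gives the left node 2. The point of the fix is that neither list should be derived from the transitional membership.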
Re: [Openais] Problems forming cluster on corosync startup
On 08/08/2011 12:10 AM, Tim Beale wrote: Hi Steve, Thanks for your help. I tried out your patch but the problem still occurs. The problem looks to me due to the ring-IDs used when forming the transitional memb-list, rather than with the memb-list itself. The ring-ID of the nodes still in Recovery is older than the rest of the nodes who have already shifted to Operational. Attached is my attempt at fixing the problem. The idea is to delay the nodes processing a Memb-Join immediately after shifting to Operational, until the token has rotated the ring once. It doesn't quite work either though. The nodes are still re-entering gather before all have left recovery. This time it's due to processing a Merge-Detect message. One node has just started up and set itself to the rep, and sends out a Merge-Detect which triggers the other nodes to enter gather and reform the ring. Let me know if you have any other advice. the problem is clear from the blackbox - 8 nodes enter operational while 1 in recovery is interrupted by a join message. this interrupted node then proceeds with a transitional membership of 1 node (which is correct). The joined and left lists use the transitional list to determine their contents, which is not correct. This results in incorrect data delivered to clm. Try the follow-up patch which should correctly calculate the joined and left lists. Thanks, Tim On Mon, Aug 8, 2011 at 6:08 AM, Steven Dake sd...@redhat.com wrote: On 08/03/2011 10:32 PM, Tim Beale wrote: Hi, It looks to me that the way the transition from Recovery to Operational works, we can't guarantee that all nodes in the ring have entered Operational before a node processes another Memb-Join message from a new node. E.g. we can't guarantee the token has rotated right the way around the ring. When this happens, the nodes still in Recovery will still use the older ring ID. So they won't get added to the transitional membership, and CLM will report leave events for these nodes. 
(Plus there might be other side-effects, like the FAILED TO RECEIVE problem - I haven't quite worked out why that's happening). Thanks for the pointer here - patch on ml. We are currently using CLM to check the health of a node, i.e. so we can detect if it locks up. My questions are: i) Are there config settings we could change to improve this, like increasing the 'join' timeout? ii) Should I try to make a code change to fix the problem? E.g. delay processing the Memb-Join message if the node's only just entered operational. iii) Should we not be using CLM like this? I.e. should we just learn to live with CLM/CPG sometimes reporting nodes as leaving when they're perfectly healthy. Thanks for your help. Tim Tim please try the patch I have recently posted: [PATCH] Set my_new_memb_list in recovery enter First and foremost, let me know if it resolves your 10 node startup case which fails 10% of the time. Then let me know if it treats other symptoms. Regards -steve On Wed, Aug 3, 2011 at 3:28 PM, Tim Beale tlbe...@gmail.com wrote: Hi, We're booting up a 10-node cluster (with all nodes starting corosync at roughly the same time) and approx 1 in 10 times we see some problems: a) CLM is reporting nodes as leaving and then immediately rejoining (not sure if this is valid behaviour?) b) Probably an unrelated oddity, but we're getting flow control enabled on a client daemon using CLM that's only sending one request (saClmClusterTrack()). c) A node is hitting the FAILED TO RECEIVE case d) After c) there seems to be a lot of churn as the cluster tries to reform e) During the processing of node leave events, the CPG client can sometimes get broken so it no longer processes *any* CPG events Corosync debug is attached (I commented out some of the noisier debug around message delivery). We don't really know enough about corosync to tell what exactly is incorrect behaviour and what should be fixed. But here's what we've noticed: 1). Node-4 joins soon after node-1. 
When this happens all nodes except node-12 have entered operational state (see node-12.txt line 235). It looks like maybe node-12 hasn't received enough rotations of the token to enter operational yet. Node-12's resulting transitional config consists of just itself. All nodes then report node-1 and node-12 as leaving and immediately rejoining. 2) After this config change, node-3 eventually hits the FAILED TO RECEIVE case (node-3.txt line 380). At this point node-1 and node-12 have an ARU matching the high_seq_received, all other nodes have an ARU of zero. 3) Node-3 entering gather seems to result in a lot of config change churn across the cluster. 4) While processing the config changes on node-3, the CPG downlist it uses contains itself. When node-3 sends leave events for the nodes in the downlist (including itself), it sets its own cpd state to CPD_STATE_UNJOINED
[Openais] feature proposal: take 2 of quorum
On 08/08/2011 12:25 AM, Fabio M. Di Nitto wrote: On 8/7/2011 6:57 PM, Steven Dake wrote: Believe many in community are on vacation during our proposal window. As a result, I'm extending until Aug 30th. topic-quorum? As we discussed recently on IRC, in order to replace cman. Can you write a full proposal that captures our conversation on IRC? Thanks -steve Fabio
Re: [Openais] [PATCH] Add systemd unit files for corosync and corosync-notifyd
Reviewed-by: Steven Dake sd...@redhat.com Regards -steve On 08/08/2011 04:04 AM, Angus Salkeld wrote: Signed-off-by: Angus Salkeld asalk...@redhat.com --- configure.ac |8 corosync.spec.in | 12 init/.gitignore |2 ++ init/Makefile.am | 15 +++ init/corosync-notifyd.service.in | 11 +++ init/corosync.service.in | 12 6 files changed, 56 insertions(+), 4 deletions(-) create mode 100644 init/corosync-notifyd.service.in create mode 100644 init/corosync.service.in diff --git a/configure.ac b/configure.ac index e00edeb..563f799 100644 --- a/configure.ac +++ b/configure.ac @@ -279,6 +279,11 @@ AC_ARG_ENABLE([augeas], [ enable_augeas=no ]) AM_CONDITIONAL(INSTALL_AUGEAS, test x$enable_augeas = xyes) +AC_ARG_ENABLE([systemd], + [ --enable-systemd : Install systemd service files],, + [ enable_systemd=no ]) +AM_CONDITIONAL(INSTALL_SYSTEMD, test x$enable_systemd = xyes) + AC_ARG_WITH([initddir], [ --with-initddir=DIR : path to init script directory. ], [ INITDDIR=$withval ], @@ -448,6 +453,9 @@ fi if test x${enable_augeas} = xyes; then PACKAGE_FEATURES=$PACKAGE_FEATURES augeas fi +if test x${enable_systemd} = xyes; then + PACKAGE_FEATURES=$PACKAGE_FEATURES systemd +fi if test x${enable_snmp} = xyes; then SNMPCONFIG= diff --git a/corosync.spec.in b/corosync.spec.in index 5eba3bc..b864087 100644 --- a/corosync.spec.in +++ b/corosync.spec.in @@ -11,6 +11,7 @@ %bcond_with snmp %bcond_with dbus %bcond_with rdma +%bcond_with systemd Name: corosync Summary: The Corosync Cluster Engine and Application Programming Interfaces @@ -46,6 +47,9 @@ BuildRequires: net-snmp-devel %if %{with dbus} BuildRequires: dbus-devel %endif +%if %{with systemd} +BuildRequires: systemd-units +%endif BuildRoot: %(mktemp -ud %{_tmppath}/%{name}-%{version}-%{release}-XX) @@ -83,6 +87,9 @@ export rdmacm_LIBS=-lrdmacm \ %if %{with rdma} --enable-rdma \ %endif +%if %{with systemd} + --enable-systemd \ +%endif --with-initddir=%{_initrddir} make %{_smp_mflags} @@ -146,8 +153,13 @@ fi %if %{with snmp} 
%{_datadir}/snmp/mibs/COROSYNC-MIB.txt %endif +%if %{with systemd} +%{_unitdir}/corosync.service +%{_unitdir}/corosync-notifyd.service +%else %{_initrddir}/corosync %{_initrddir}/corosync-notifyd +%endif %dir %{_libexecdir}/lcrso %{_libexecdir}/lcrso/coroparse.lcrso %{_libexecdir}/lcrso/objdb.lcrso diff --git a/init/.gitignore b/init/.gitignore index 0a75c32..34e4cb8 100644 --- a/init/.gitignore +++ b/init/.gitignore @@ -1,2 +1,4 @@ generic notifyd +corosync.service +corosync-notifyd.service diff --git a/init/Makefile.am b/init/Makefile.am index 0ca9ee9..90d49c4 100644 --- a/init/Makefile.am +++ b/init/Makefile.am @@ -34,9 +34,14 @@ MAINTAINERCLEANFILES = Makefile.in -EXTRA_DIST = generic.in notifyd.in +EXTRA_DIST = generic.in notifyd.in corosync.service.in corosync-notifyd.service.in +if INSTALL_SYSTEMD +systemdconfdir = /lib/systemd/system +systemdconf_DATA = corosync.service corosync-notifyd.service +else target_INIT = generic notifyd +endif %: %.in Makefile rm -f $@-t $@ @@ -46,14 +51,15 @@ target_INIT = generic notifyd -e 's#@''INITDDIR@#$(INITDDIR)#g' \ -e 's#@''LOCALSTATEDIR@#$(localstatedir)#g' \ $ $@-t - chmod 0755 $@-t mv $@-t $@ -all-local: $(target_INIT) +all-local: $(target_INIT) $(systemdconf_DATA) clean-local: - rm -rf $(target_INIT) + rm -rf $(target_INIT) $(systemdconf_DATA) +if INSTALL_SYSTEMD +else install-exec-local: $(INSTALL) -d $(DESTDIR)/$(INITDDIR) $(INSTALL) -m 755 generic $(DESTDIR)/$(INITDDIR)/corosync @@ -62,3 +68,4 @@ install-exec-local: uninstall-local: cd $(DESTDIR)/$(INITDDIR) \ rm -f corosync corosync-notifyd +endif diff --git a/init/corosync-notifyd.service.in b/init/corosync-notifyd.service.in new file mode 100644 index 000..26a278a --- /dev/null +++ b/init/corosync-notifyd.service.in @@ -0,0 +1,11 @@ +[Unit] +Description=Corosync Dbus and snmp notifier +Wants=corosync.service + +[Service] +EnvironmentFile=@SYSCONFIGDIR@/corosync-notifyd +ExecStart=@SBINDIR@/corosync-notifyd -f $OPTIONS +Type=simple + +[Install] 
+WantedBy=multi-user.target diff --git a/init/corosync.service.in b/init/corosync.service.in new file mode 100644 index 000..8cc692b --- /dev/null +++ b/init/corosync.service.in @@ -0,0 +1,12 @@ +[Unit] +Description=Corosync Cluster Engine +ConditionKernelCommandLine=!nocluster +#Conflicts=cman.service + +[Service] +ExecStart=@SBINDIR@/corosync +Type=forking +#RestartSec=90s + +[Install] +WantedBy
Re: [Openais] [PATCH] Revert totemsrp: Remove recv_flush code
Reviewed-by: Steven Dake sd...@redhat.com On 07/27/2011 05:49 AM, Jan Friesse wrote: This reverts commit 2167 Reversion is needed to remove overflow of receive buffers and dropping messages. Signed-off-by: Jan Friesse jfrie...@redhat.com --- branches/whitetank/exec/totemnet.c | 45 - branches/whitetank/exec/totemnet.h |2 + branches/whitetank/exec/totemrrp.c | 65 branches/whitetank/exec/totemrrp.h |2 + branches/whitetank/exec/totemsrp.c |2 + 5 files changed, 115 insertions(+), 1 deletions(-) diff --git a/branches/whitetank/exec/totemnet.c b/branches/whitetank/exec/totemnet.c index b5c4293..154aa4f 100644 --- a/branches/whitetank/exec/totemnet.c +++ b/branches/whitetank/exec/totemnet.c @@ -148,6 +148,8 @@ struct totemnet_instance { struct iovec totemnet_iov_recv; + struct iovec totemnet_iov_recv_flush; + struct totemnet_socket totemnet_sockets; struct totem_ip_address mcast_address; @@ -215,6 +217,9 @@ static void totemnet_instance_initialize (struct totemnet_instance *instance) instance-totemnet_iov_recv.iov_base = instance-iov_buffer; instance-totemnet_iov_recv.iov_len = FRAME_SIZE_MAX; //sizeof (instance-iov_buffer); + instance-totemnet_iov_recv_flush.iov_base = instance-iov_buffer_flush; + + instance-totemnet_iov_recv_flush.iov_len = FRAME_SIZE_MAX; //sizeof (instance-iov_buffer); /* * There is always atleast 1 processor @@ -629,7 +634,11 @@ static int net_deliver_fn ( unsigned char *msg_offset; unsigned int size_delv; - iovec = instance-totemnet_iov_recv; + if (instance-flushing == 1) { + iovec = instance-totemnet_iov_recv_flush; + } else { + iovec = instance-totemnet_iov_recv; + } /* * Receive datagram @@ -1310,6 +1319,40 @@ error_exit: return (res); } +int totemnet_recv_flush (totemnet_handle handle) +{ + struct totemnet_instance *instance; + struct pollfd ufd; + int nfds; + int res = 0; + + res = hdb_handle_get (totemnet_instance_database, handle, + (void *)instance); + if (res != 0) { + res = ENOENT; + goto error_exit; + } + + instance-flushing = 1; + + do { 
+ ufd.fd = instance-totemnet_sockets.mcast_recv; + ufd.events = POLLIN; + nfds = poll (ufd, 1, 0); + if (nfds == 1 ufd.revents POLLIN) { + net_deliver_fn (0, instance-totemnet_sockets.mcast_recv, + ufd.revents, instance); + } + } while (nfds == 1); + + instance-flushing = 0; + + hdb_handle_put (totemnet_instance_database, handle); + +error_exit: + return (res); +} + int totemnet_send_flush (totemnet_handle handle) { struct totemnet_instance *instance; diff --git a/branches/whitetank/exec/totemnet.h b/branches/whitetank/exec/totemnet.h index 521743a..f4788ab 100644 --- a/branches/whitetank/exec/totemnet.h +++ b/branches/whitetank/exec/totemnet.h @@ -88,6 +88,8 @@ extern int totemnet_mcast_noflush_send ( struct iovec *iovec, unsigned int iov_len); +extern int totemnet_recv_flush (totemnet_handle handle); + extern int totemnet_send_flush (totemnet_handle handle); extern int totemnet_iface_check (totemnet_handle handle); diff --git a/branches/whitetank/exec/totemrrp.c b/branches/whitetank/exec/totemrrp.c index 9864a88..f471c5b 100644 --- a/branches/whitetank/exec/totemrrp.c +++ b/branches/whitetank/exec/totemrrp.c @@ -131,6 +131,9 @@ struct rrp_algo { struct iovec *iovec, unsigned int iov_len); + void (*recv_flush) ( + struct totemrrp_instance *instance); + void (*send_flush) ( struct totemrrp_instance *instance); @@ -241,6 +244,9 @@ static void none_token_send ( struct iovec *iovec, unsigned int iov_len); +static void none_recv_flush ( + struct totemrrp_instance *instance); + static void none_send_flush ( struct totemrrp_instance *instance); @@ -296,6 +302,9 @@ static void passive_token_send ( struct iovec *iovec, unsigned int iov_len); +static void passive_recv_flush ( + struct totemrrp_instance *instance); + static void passive_send_flush ( struct totemrrp_instance *instance); @@ -351,6 +360,9 @@ static void active_token_send ( struct iovec *iovec, unsigned int iov_len); +static void active_recv_flush ( + struct totemrrp_instance *instance); + static void 
active_send_flush ( struct totemrrp_instance *instance); @@ -389,6 +401,7 @@ struct rrp_algo none_algo = { .mcast_flush_send = none_mcast_flush_send, .token_recv = none_token_recv, .token_send
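The core of the restored totemnet_recv_flush is a "drain whatever is already queued, but never block" loop built on poll with a zero timeout. A minimal standalone sketch of that idea follows. The function name drain_fd is hypothetical, it reads into a caller-supplied buffer instead of calling net_deliver_fn, and it exits on POLLERR/POLLHUP without data, which is an addition of this sketch rather than something the patch does.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read everything that is already queued on fd without ever blocking.
 * Returns the number of successful read() calls.
 */
int drain_fd(int fd, char *buf, size_t buf_len)
{
    struct pollfd ufd;
    int reads = 0;

    assert(fd >= 0 && buf != NULL && buf_len > 0);

    for (;;) {
        ssize_t got;
        int nfds;

        ufd.fd = fd;
        ufd.events = POLLIN;
        ufd.revents = 0;

        /* Timeout 0: report readiness immediately, do not wait. */
        nfds = poll(&ufd, 1, 0);
        if (nfds != 1 || !(ufd.revents & POLLIN)) {
            break;      /* nothing queued, or an error/hangup with no data */
        }

        got = read(fd, buf, buf_len);
        if (got <= 0) {
            break;      /* end of stream or read error */
        }
        reads++;
    }
    return reads;
}
```

The patch also sets a flushing flag around the loop so that the delivery function uses a separate receive buffer while draining. That part is omitted here.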
Re: [Openais] [PATCH] coroipcc: use malloc for path in service_connect
Reviewed-by: Steven Dake sd...@redhat.com On 07/27/2011 08:31 AM, Jan Friesse wrote: Coroipcc appropriately uses PATH_MAX sized variables for various data structures handling files in the initialization of the client. Due to the use of 12 of these structures declared as stack variables, the application stack balloons to over 12*4k. This is especially problematic if threads are used by long running daemons to restart the connection to corosync so as to be resilient in the face of system services restarting (service corosync restart). A simple alternative is to allocate temporary memory to avoid requirements of large thread stacks. Original patch by Dan Clark 2cla...@gmail.com Signed-off-by: Jan Friesse jfrie...@redhat.com --- lib/coroipcc.c | 67 +-- 1 files changed, 40 insertions(+), 27 deletions(-) diff --git a/lib/coroipcc.c b/lib/coroipcc.c index 14860e2..54d9aa7 100644 --- a/lib/coroipcc.c +++ b/lib/coroipcc.c @@ -86,6 +86,15 @@ struct ipc_instance { pthread_mutex_t mutex; }; +struct ipc_path_data { + mar_req_setup_t req_setup; + mar_res_setup_t res_setup; + char control_map_path[PATH_MAX]; + char request_map_path[PATH_MAX]; + char response_map_path[PATH_MAX]; + char dispatch_map_path[PATH_MAX]; +}; + void ipc_hdb_destructor (void *context); DECLARE_HDB_DATABASE(ipc_hdb,ipc_hdb_destructor); @@ -579,12 +588,7 @@ coroipcc_service_connect ( union semun semun; #endif int sys_res; - mar_req_setup_t req_setup; - mar_res_setup_t res_setup; - char control_map_path[PATH_MAX]; - char request_map_path[PATH_MAX]; - char response_map_path[PATH_MAX]; - char dispatch_map_path[PATH_MAX]; + struct ipc_path_data *path_data; res = hdb_error_to_cs (hdb_handle_create (ipc_hdb, sizeof (struct ipc_instance), handle)); @@ -597,8 +601,6 @@ coroipcc_service_connect ( return (res); } - res_setup.error = CS_ERR_LIBRARY; - #if defined(COROSYNC_SOLARIS) request_fd = socket (PF_UNIX, SOCK_STREAM, 0); #else @@ -611,6 +613,14 @@ coroipcc_service_connect ( socket_nosigpipe (request_fd); #endif + 
path_data = malloc (sizeof(*path_data)); + if (path_data == NULL) { + goto error_connect; + } + memset(path_data, 0, sizeof(*path_data)); + + path_data-res_setup.error = CS_ERR_LIBRARY; + memset (address, 0, sizeof (struct sockaddr_un)); address.sun_family = AF_UNIX; #if defined(COROSYNC_BSD) || defined(COROSYNC_DARWIN) @@ -630,7 +640,7 @@ coroipcc_service_connect ( } sys_res = memory_map ( - control_map_path, + path_data-control_map_path, control_buffer-XX, (void *)ipc_instance-control_buffer, 8192); @@ -640,7 +650,7 @@ coroipcc_service_connect ( } sys_res = memory_map ( - request_map_path, + path_data-request_map_path, request_buffer-XX, (void *)ipc_instance-request_buffer, request_size); @@ -650,7 +660,7 @@ coroipcc_service_connect ( } sys_res = memory_map ( - response_map_path, + path_data-response_map_path, response_buffer-XX, (void *)ipc_instance-response_buffer, response_size); @@ -660,7 +670,7 @@ coroipcc_service_connect ( } sys_res = circular_memory_map ( - dispatch_map_path, + path_data-dispatch_map_path, dispatch_buffer-XX, (void *)ipc_instance-dispatch_buffer, dispatch_size); @@ -715,33 +725,33 @@ coroipcc_service_connect ( /* * Initialize IPC setup message */ - req_setup.service = service; - strcpy (req_setup.control_file, control_map_path); - strcpy (req_setup.request_file, request_map_path); - strcpy (req_setup.response_file, response_map_path); - strcpy (req_setup.dispatch_file, dispatch_map_path); - req_setup.control_size = 8192; - req_setup.request_size = request_size; - req_setup.response_size = response_size; - req_setup.dispatch_size = dispatch_size; + path_data-req_setup.service = service; + strcpy (path_data-req_setup.control_file, path_data-control_map_path); + strcpy (path_data-req_setup.request_file, path_data-request_map_path); + strcpy (path_data-req_setup.response_file, path_data-response_map_path); + strcpy (path_data-req_setup.dispatch_file, path_data-dispatch_map_path); + path_data-req_setup.control_size = 8192; + 
path_data-req_setup.request_size = request_size; + path_data-req_setup.response_size = response_size; + path_data-req_setup.dispatch_size = dispatch_size; #if _POSIX_THREAD_PROCESS_SHARED 1 -
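The idea of the patch (group the PATH_MAX sized scratch buffers into one struct and take it from the heap rather than the thread stack) can be sketched in isolation as below. The names path_scratch and build_control_path are hypothetical and much simpler than coroipcc_service_connect. The fallback definition of PATH_MAX is also an assumption, for platforms where limits.h does not provide it.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#ifndef PATH_MAX
#define PATH_MAX 4096   /* assumed fallback; normally provided by <limits.h> */
#endif

/* Four PATH_MAX buffers: roughly 16 KiB that no longer lives on the stack. */
struct path_scratch {
    char control[PATH_MAX];
    char request[PATH_MAX];
    char response[PATH_MAX];
    char dispatch[PATH_MAX];
};

/*
 * Build "<dir>/control_buffer" in heap scratch space and copy it to out.
 * Returns 0 on success, -1 on allocation failure or truncation.
 */
int build_control_path(const char *dir, char *out, size_t out_len)
{
    struct path_scratch *ps;
    int res = -1;
    int n;

    assert(dir != NULL && out != NULL && out_len > 0);

    ps = calloc(1, sizeof(*ps));    /* zeroed, like malloc + memset */
    if (ps == NULL) {
        return -1;
    }

    n = snprintf(ps->control, sizeof(ps->control), "%s/control_buffer", dir);
    if (n < 0 || (size_t)n >= sizeof(ps->control)) {
        goto done;                  /* formatting error or truncated path */
    }
    if (strlen(ps->control) >= out_len) {
        goto done;                  /* caller's buffer is too small */
    }
    memcpy(out, ps->control, strlen(ps->control) + 1);
    res = 0;

done:
    free(ps);                       /* single exit path releases the scratch */
    return res;
}
```

The design cost is that every exit path must now free the scratch struct. A single cleanup label keeps that manageable.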
Re: [Openais] [PATCH 1/2] cpg: Handle errors from totem_mcast
On 07/28/2011 07:20 AM, Jan Friesse wrote: The totem_mcast function can return -1 if corosync is overloaded. Sadly, in many calls of this function the error code was either not handled at all or handled by assert. This commit changes the behaviour to either return CS_ERR_TRY_AGAIN or pass the error code to later layers to handle it. Signed-off-by: Jan Friesse jfrie...@redhat.com --- services/cpg.c | 31 ++- 1 files changed, 26 insertions(+), 5 deletions(-) diff --git a/services/cpg.c b/services/cpg.c index 6669fbd..18767bd 100644 --- a/services/cpg.c +++ b/services/cpg.c @@ -865,12 +865,19 @@ static void cpg_pd_finalize (struct cpg_pd *cpd) static int cpg_lib_exit_fn (void *conn) { struct cpg_pd *cpd = (struct cpg_pd *)api-ipc_private_data_get (conn); + int result; log_printf(LOGSYS_LEVEL_DEBUG, exit_fn for conn=%p\n, conn); if (cpd-group_name.length 0) { - cpg_node_joinleave_send (cpd-pid, cpd-group_name, + result = cpg_node_joinleave_send (cpd-pid, cpd-group_name, MESSAGE_REQ_EXEC_CPG_PROCLEAVE, CONFCHG_CPG_REASON_PROCDOWN); + if (result == -1) { + /* + * Call this function again later + */ + return (result); + } } this is correct cpg_pd_finalize (cpd); @@ -1289,6 +1296,7 @@ static void message_handler_req_lib_cpg_join (void *conn, const void *message) struct res_lib_cpg_join res_lib_cpg_join; cs_error_t error = CPG_OK; struct list_head *iter; + int result; /* Test, if we don't have same pid and group name joined */ for (iter = cpg_pd_list_head.next; iter != cpg_pd_list_head; iter = iter-next) { @@ -1327,9 +1335,15 @@ static void message_handler_req_lib_cpg_join (void *conn, const void *message) memcpy (cpd-group_name, req_lib_cpg_join-group_name, sizeof (cpd-group_name)); - cpg_node_joinleave_send (req_lib_cpg_join-pid, + result = cpg_node_joinleave_send (req_lib_cpg_join-pid, req_lib_cpg_join-group_name, MESSAGE_REQ_EXEC_CPG_PROCJOIN, CONFCHG_CPG_REASON_JOIN); + + if (result == -1) { + error = CPG_ERR_TRY_AGAIN; + cpd-cpd_state = CPD_STATE_UNJOINED; + goto response_send; + } break; 
case CPD_STATE_LEAVE_STARTED: error = CPG_ERR_BUSY; the remainder of patch is not. the ipc layer ensures room is available in the totem queue to handle new totem messages. If that part isn't working as expected (ie: you see a failure in this part of the code) you should fix the totem pending queue rather then hack it here. @@ -1356,6 +1370,7 @@ static void message_handler_req_lib_cpg_leave (void *conn, const void *message) cs_error_t error = CPG_OK; struct req_lib_cpg_leave *req_lib_cpg_leave = (struct req_lib_cpg_leave *)message; struct cpg_pd *cpd = (struct cpg_pd *)api-ipc_private_data_get (conn); + int result; log_printf(LOGSYS_LEVEL_DEBUG, got leave request on %p\n, conn); @@ -1372,10 +1387,14 @@ static void message_handler_req_lib_cpg_leave (void *conn, const void *message) case CPD_STATE_JOIN_COMPLETED: error = CPG_OK; cpd-cpd_state = CPD_STATE_LEAVE_STARTED; - cpg_node_joinleave_send (req_lib_cpg_leave-pid, + result = cpg_node_joinleave_send (req_lib_cpg_leave-pid, req_lib_cpg_leave-group_name, MESSAGE_REQ_EXEC_CPG_PROCLEAVE, CONFCHG_CPG_REASON_LEAVE); + if (result == -1) { + error = CPG_ERR_TRY_AGAIN; + cpd-cpd_state = CPD_STATE_JOIN_COMPLETED; + } break; } @@ -1458,8 +1477,10 @@ static void message_handler_req_lib_cpg_mcast (void *conn, const void *message) req_exec_cpg_iovec[1].iov_base = (char *)req_lib_cpg_mcast-message; req_exec_cpg_iovec[1].iov_len = msglen; - result = api-totem_mcast (req_exec_cpg_iovec, 2, TOTEM_AGREED); - assert(result == 0); + result = api-totem_mcast (req_exec_cpg_iovec, 2, TOTEM_AGREED); + if (result == -1) { + error = CPG_ERR_TRY_AGAIN; + } } res_lib_cpg_mcast.header.size = sizeof(res_lib_cpg_mcast); ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
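The hunk marked "this is correct" above follows a simple pattern: if the leave message cannot be queued, the exit hook returns -1 without tearing anything down, so the caller can invoke it again later. A minimal sketch of that pattern is below. struct conn, leave_send, conn_exit and the queue_free counter are all hypothetical stand-ins for illustration, not corosync code.

```c
#include <assert.h>
#include <stddef.h>

struct conn {
    int joined;     /* non-zero while the client is still a group member */
    int finalized;  /* set once the per-connection data has been torn down */
};

/* Hypothetical model of the send queue: number of free slots. */
int queue_free = 0;

/* Stand-in for the multicast call: fails with -1 while the queue is full. */
static int leave_send(void)
{
    if (queue_free == 0) {
        return -1;
    }
    queue_free--;
    return 0;
}

/*
 * Exit hook. Returns 0 once teardown has completed, or -1 if the leave
 * message could not be sent and the hook must be called again later.
 */
int conn_exit(struct conn *c)
{
    assert(c != NULL);

    if (c->joined) {
        if (leave_send() == -1) {
            return -1;      /* state untouched, safe to retry */
        }
        c->joined = 0;
    }
    c->finalized = 1;
    return 0;
}
```

The important property is that a failed attempt leaves the connection exactly as it was, so retrying is idempotent.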
Re: [Openais] [PATCH 2/2] cfg: Handle errors from totem_mcast
Before accepting an IPC message, ipc checks that the totem queue has available room for new messages. As a result this patch is either not necessary or fixes the wrong thing. See coroipcs.c:697 send_ok = api-sending_allowed (conn_info-service, header-id, header, conn_info-sending_allowed_private_data); On 07/28/2011 07:20 AM, Jan Friesse wrote: The totem_mcast function can return -1 if corosync is overloaded. Sadly, in many calls of this function the error code was either not handled at all or handled by assert. This commit changes the behaviour to either return CS_ERR_TRY_AGAIN or pass the error code to later layers to handle it. Signed-off-by: Jan Friesse jfrie...@redhat.com --- services/cfg.c | 77 ++- 1 files changed, 59 insertions(+), 18 deletions(-) diff --git a/services/cfg.c b/services/cfg.c index b7aa63b..24f19f2 100644 --- a/services/cfg.c +++ b/services/cfg.c @@ -379,6 +379,7 @@ static int send_shutdown(void) { struct req_exec_cfg_shutdown req_exec_cfg_shutdown; struct iovec iovec; + int result; ENTER(); req_exec_cfg_shutdown.header.size = @@ -389,10 +390,10 @@ static int send_shutdown(void) iovec.iov_base = (char *)req_exec_cfg_shutdown; iovec.iov_len = sizeof (struct req_exec_cfg_shutdown); - assert (api-totem_mcast (iovec, 1, TOTEM_SAFE) == 0); + result = api-totem_mcast (iovec, 1, TOTEM_SAFE); LEAVE(); - return 0; + return (result); } static void send_test_shutdown(void *only_conn, void *exclude_conn, int status) @@ -426,6 +427,9 @@ static void send_test_shutdown(void *only_conn, void *exclude_conn, int status) static void check_shutdown_status(void) { + int result; + cs_error_t error = CS_OK; + ENTER(); /* @@ -448,9 +452,17 @@ static void check_shutdown_status(void) shutdown_flags == CFG_SHUTDOWN_FLAG_REGARDLESS) { TRACE1(shutdown confirmed); + /* + * Tell other nodes we are going down + */ + result = send_shutdown(); + if (result == -1) { + error = CS_ERR_TRY_AGAIN; + } + res_lib_cfg_tryshutdown.header.size = sizeof(struct res_lib_cfg_tryshutdown); 
res_lib_cfg_tryshutdown.header.id = MESSAGE_RES_CFG_TRYSHUTDOWN; - res_lib_cfg_tryshutdown.header.error = CS_OK; + res_lib_cfg_tryshutdown.header.error = error; /* * Tell originator that shutdown was confirmed @@ -459,10 +471,6 @@ static void check_shutdown_status(void) sizeof(res_lib_cfg_tryshutdown)); shutdown_con = NULL; - /* - * Tell other nodes we are going down - */ - send_shutdown(); } else { @@ -698,7 +706,9 @@ static void message_handler_req_lib_cfg_ringreenable ( const void *msg) { struct req_exec_cfg_ringreenable req_exec_cfg_ringreenable; + struct res_lib_cfg_ringreenable res_lib_cfg_ringreenable; struct iovec iovec; + int result; ENTER(); req_exec_cfg_ringreenable.header.size = @@ -711,7 +721,19 @@ static void message_handler_req_lib_cfg_ringreenable ( iovec.iov_base = (char *)req_exec_cfg_ringreenable; iovec.iov_len = sizeof (struct req_exec_cfg_ringreenable); - assert (api-totem_mcast (iovec, 1, TOTEM_SAFE) == 0); + result = api-totem_mcast (iovec, 1, TOTEM_SAFE); + + if (result == -1) { + res_lib_cfg_ringreenable.header.id = MESSAGE_RES_CFG_RINGREENABLE; + res_lib_cfg_ringreenable.header.size = sizeof (struct res_lib_cfg_ringreenable); + res_lib_cfg_ringreenable.header.error = CS_ERR_TRY_AGAIN; + api-ipc_response_send ( + conn, + res_lib_cfg_ringreenable, + sizeof (struct res_lib_cfg_ringreenable)); + + api-ipc_refcnt_dec(conn); + } LEAVE(); } @@ -836,6 +858,8 @@ static void message_handler_req_lib_cfg_killnode ( struct res_lib_cfg_killnode res_lib_cfg_killnode; struct req_exec_cfg_killnode req_exec_cfg_killnode; struct iovec iovec; + int result; + cs_error_t error = CS_OK; ENTER(); req_exec_cfg_killnode.header.size = @@ -848,11 +872,14 @@ static void message_handler_req_lib_cfg_killnode ( iovec.iov_base = (char *)req_exec_cfg_killnode; iovec.iov_len = sizeof (struct req_exec_cfg_killnode); - (void)api-totem_mcast
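The review comment at the top of this message describes an admission check: room in the queue is verified before a request is accepted, so the later enqueue should not fail. A minimal sketch of that design is below. QUEUE_MAX, struct send_queue, queue_has_room and accept_request are hypothetical names and a deliberately simplified model. The real check is the sending_allowed call quoted from coroipcs.c, whose behaviour is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_MAX 4     /* assumed capacity, for illustration only */

enum result { RES_OK = 0, RES_TRY_AGAIN = 1 };

struct send_queue {
    size_t used;        /* slots currently occupied */
};

/* Admission check: is there room for `needed` more entries? */
int queue_has_room(const struct send_queue *q, size_t needed)
{
    assert(q != NULL);
    return q->used + needed <= QUEUE_MAX;
}

/*
 * Accept a request only if the queue can take it. The caller is told to
 * retry otherwise, so the enqueue step below never has to report failure.
 */
enum result accept_request(struct send_queue *q, size_t needed)
{
    assert(q != NULL);

    if (!queue_has_room(q, needed)) {
        return RES_TRY_AGAIN;
    }
    q->used += needed;  /* room was verified by the check above */
    return RES_OK;
}
```

Under this design, a send failure inside a request handler points to a bug in the admission check or the queue accounting, which is why the review suggests fixing the pending queue rather than handling the error in each service.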
Re: [Openais] [PATCH] Make realtime scheduling optional not the default.
Good work Reviewed-by: Steven Dake sd...@redhat.com On 08/07/2011 05:40 AM, Angus Salkeld wrote: Signed-off-by: Angus Salkeld asalk...@redhat.com --- configure.ac |6 ++ exec/main.c| 21 +++-- man/corosync.8 |7 +-- 3 files changed, 26 insertions(+), 8 deletions(-) diff --git a/configure.ac b/configure.ac index 35e3cfb..e00edeb 100644 --- a/configure.ac +++ b/configure.ac @@ -73,6 +73,12 @@ AC_CHECK_LIB([socket], [socket]) AC_CHECK_LIB([nsl], [t_open]) AC_CHECK_LIB([rt], [sched_getscheduler]) PKG_CHECK_MODULES([LIBQB], [libqb]) +AC_CHECK_LIB([qb], [qb_log_thread_priority_set], \ + have_qb_log_thread_priority_set=yes, \ + have_qb_log_thread_priority_set=no) +if test x${have_qb_log_thread_priority_set} = xyes; then + AC_DEFINE_UNQUOTED([HAVE_QB_LOG_THREAD_PRIORITY_SET], 1, [have qb_log_thread_priority_set]) +fi # Checks for header files. AC_FUNC_ALLOCA diff --git a/exec/main.c b/exec/main.c index 9b2c941..a822120 100644 --- a/exec/main.c +++ b/exec/main.c @@ -980,13 +980,19 @@ static void corosync_setscheduler (void) global_sched_param.sched_priority); global_sched_param.sched_priority = 0; - logsys_thread_priority_set (SCHED_OTHER, NULL, 1); +#ifdef HAVE_QB_LOG_THREAD_PRIORITY_SET + qb_log_thread_priority_set (SCHED_OTHER, 0); +#endif } else { /* * Turn on SCHED_RR in logsys system */ - res = logsys_thread_priority_set (SCHED_RR, global_sched_param, 10); +#ifdef HAVE_QB_LOG_THREAD_PRIORITY_SET + res = qb_log_thread_priority_set (SCHED_RR, sched_priority); +#else + res = -1; +#endif if (res == -1) { log_printf (LOGSYS_LEVEL_ERROR, Could not set logsys thread priority. 
@@ -1238,9 +1244,9 @@ int main (int argc, char **argv, char **envp) /* default configuration */ background = 1; - setprio = 1; + setprio = 0; - while ((ch = getopt (argc, argv, fpv)) != EOF) { + while ((ch = getopt (argc, argv, fprv)) != EOF) { switch (ch) { case 'f': @@ -1248,7 +1254,9 @@ int main (int argc, char **argv, char **envp) logsys_config_mode_set (NULL, LOGSYS_MODE_OUTPUT_STDERR|LOGSYS_MODE_THREADED|LOGSYS_MODE_FORK); break; case 'p': - setprio = 0; + break; + case 'r': + setprio = 1; break; case 'v': printf (Corosync Cluster Engine, version '%s'\n, VERSION); @@ -1260,7 +1268,8 @@ int main (int argc, char **argv, char **envp) fprintf(stderr, \ usage:\n\ -f : Start application in foreground.\n\ - -p : Do not set process priority.\n\ + -p : Does nothing.\n\ + -r : Set round robin realtime scheduling \n\ -v : Display version and SVN revision of Corosync and exit.\n); return EXIT_FAILURE; } diff --git a/man/corosync.8 b/man/corosync.8 index c45cc56..016c053 100644 --- a/man/corosync.8 +++ b/man/corosync.8 @@ -35,7 +35,7 @@ .SH NAME corosync \- The Corosync Cluster Engine. .SH SYNOPSIS -.B corosync [\-f] [\-p] [\-v] +.B corosync [\-f] [\-p] [\-r] [\-v] .SH DESCRIPTION .B corosync Corosync provides clustering infracture such as membership, messaging and quorum. @@ -45,7 +45,10 @@ Corosync provides clustering infracture such as membership, messaging and quorum Start application in foreground. .TP .B -p -Do not set process priority. +Does nothing (was: Do not set process priority - this is now the default). +.TP +.B -r +Set round robin realtime scheduling. .TP .B -v Display version and SVN revision of Corosync and exit. ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
Re: [Openais] [GIT PULL] Changes to example configuration file (corosync.conf.example)
Florian,

This has been processed. Apologies for the delay; it has been a very busy week.

Regards
-steve

On 08/01/2011 07:17 AM, Florian Haas wrote:

Steve, please consider pulling the following changes since commit d4fb83e971b6fa9af0447ce0a70345fb20064dc1:

main: let poll really stop before totempg_finalize (2011-07-26 10:07:08 +0200)

from the git repository at:

git://github.com/fghaas/corosync master

All changes have undergone review on the list. Thanks to Dan Frincu and Jan Friesse for their valuable feedback. A patch-by-patch summary and diffstat are below, as usual.

Cheers, Florian

Florian Haas (4):
corosync.conf.example: change bindnetaddr
corosync.conf.example: change mcastaddr
corosync.conf.example: include comments
corosync.conf.example: add note about host addresses in bindnetaddr

conf/corosync.conf.example | 54 +--
1 files changed, 51 insertions(+), 3 deletions(-)
[Openais] Extending call for Corosync RFEs until Aug 30th
I believe many in the community are on vacation during our proposal window. As a result, I'm extending the call for RFEs until Aug 30th.

Regards
-steve
[Openais] Corosync 2.0 Feature Request: Experiment with rdma support without using librdmacm
The librdmacm library assumes a connection-oriented mechanism, whereas totem assumes connectionless operation. RDMA can be exposed through ibverbs alone, without librdmacm. The advantage is improved reliability on RDMA networks. In the TODO file this is the topic: topic-rdmaud
[Openais] Corosync 2.0 Feature Request: Use zero-copy operation with RDMA networks
Totem currently copies each packet into the network layer, which results in an extra copy on RDMA networks. To reduce CPU utilization and improve performance, allocate these packets from the totem network layer before building and sending them. This removes the extra memory copy on RDMA networks. This is the topic-netmalloc topic in the TODO file.

Regards
-steve
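As a rough illustration of the idea, the sketch below lets the protocol layer build its message directly inside a buffer owned by the network layer, so the transport only has to prepend its header. All names here (net_buf, NET_HEADER_ROOM and the three functions) are hypothetical and are not taken from the corosync source; a real RDMA transport would also need to register the memory with the adapter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define NET_HEADER_ROOM 64	/* hypothetical space reserved for transport headers */

struct net_buf {
	size_t payload_len;
	unsigned char frame[];	/* header room followed by the payload */
};

/* Allocate a frame owned by the network layer. */
struct net_buf *net_buf_alloc(size_t payload_len)
{
	struct net_buf *buf;

	buf = malloc(sizeof(*buf) + NET_HEADER_ROOM + payload_len);
	if (buf == NULL)
		return NULL;
	buf->payload_len = payload_len;
	return buf;
}

/* Where the protocol layer builds its message, in place. */
unsigned char *net_buf_payload(struct net_buf *buf)
{
	assert(buf != NULL);
	return buf->frame + NET_HEADER_ROOM;
}

/*
 * The transport writes its header directly in front of the payload.
 * The payload itself is never copied. Returns the number of bytes
 * that would go on the wire.
 */
size_t net_buf_finish(struct net_buf *buf, const void *hdr, size_t hdr_len)
{
	assert(buf != NULL && hdr_len <= NET_HEADER_ROOM);
	memcpy(buf->frame + NET_HEADER_ROOM - hdr_len, hdr, hdr_len);
	return hdr_len + buf->payload_len;
}
```

The design choice is simply to reserve header room up front, so that header and payload end up contiguous without a second buffer.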
[Openais] Corosync 2.0 Feature Request: Centralize the encryption/decryption into one file
Each network driver has its own encryption code. Centralize that code in one file so that it can be maintained in one place rather than in 3 separate drivers. This is the topic-onecrypt topic in the TODO file.

Regards
-steve
[Openais] [PATCH] Set my_new_memb_list in recovery enter
Currently my_new_memb_list is set in commit_enter, resulting in join messages being accepted during commit/recovery phases which are not appropriate to maintain protocol guarantees.

Signed-off-by: Steven Dake sd...@redhat.com
---
 exec/totemsrp.c | 10 +-
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/exec/totemsrp.c b/exec/totemsrp.c
index 4a299a0..44623d8 100644
--- a/exec/totemsrp.c
+++ b/exec/totemsrp.c
@@ -1991,6 +1991,11 @@ static void memb_state_recovery_enter (
 	log_printf (instance->totemsrp_log_level_debug,
 		"entering RECOVERY state.\n");
 
+	memcpy (instance->my_new_memb_list, addr,
+		sizeof (struct srp_addr) * instance->commit_token->addr_entries);
+
+	instance->my_new_memb_entries = instance->commit_token->addr_entries;
+
 	instance->orf_token_discard = 0;
 
 	instance->my_high_ring_delivered = 0;
@@ -2766,11 +2771,6 @@ static void memb_state_commit_token_update (
 	addr = (struct srp_addr *)instance->commit_token->end_of_commit_token;
 	memb_list = (struct memb_commit_token_memb_entry *)(addr + instance->commit_token->addr_entries);
 
-	memcpy (instance->my_new_memb_list, addr,
-		sizeof (struct srp_addr) * instance->commit_token->addr_entries);
-
-	instance->my_new_memb_entries = instance->commit_token->addr_entries;
-
 	memcpy (&memb_list[instance->commit_token->memb_index].ring_id,
 		&instance->my_old_ring_id, sizeof (struct memb_ring_id));
-- 
1.7.6
Re: [Openais] Problems forming cluster on corosync startup
On 08/03/2011 10:32 PM, Tim Beale wrote: Hi, It looks to me that the way the transition from Recovery to Operational works, we can't guarantee that all nodes in the ring have entered Operational before a node processes another Memb-Join message from a new node. E.g. we can't guarantee the token has rotated right the way around the ring. When this happens, the nodes still in Recovery will still use the older ring ID. So they won't get added to the transitional membership, and CLM will report leave events for these nodes. (Plus there might be other side-effects, like the FAILED TO RECEIVE problem - I haven't quite worked out why that's happening). Thanks for the pointer here - patch on ml. We are currently using CLM to check the health of a node, i.e. so we can detect if it locks up. My questions are: i) Are there config settings we could change to improve this, like increasing the 'join' timeout? ii) Should I try to make a code change to fix the problem? E.g. delay processing the Memb-Join message if the node's only just entered operational. iii) Should we not be using CLM like this? I.e. should we just learn to live with CLM/CPG sometimes reporting nodes as leaving when they're perfectly healthy. Thanks for your help. Tim Tim please try the patch I have recently posted: [PATCH] Set my_new_memb_list in recovery enter First and foremost, let me know if it resolves your 10 node startup case which fails 10% of the time. Then let me know if it treats other symptoms. Regards -steve On Wed, Aug 3, 2011 at 3:28 PM, Tim Beale tlbe...@gmail.com wrote: Hi, We're booting up a 10-node cluster (with all nodes starting corosync at roughly the same time) and approx 1 in 10 times we see some problems: a) CLM is reporting nodes as leaving and then immediately rejoining (not sure if this is valid behaviour?) b) Probably an unrelated oddity, but we're getting flow control enabled on a client daemon using CLM that's only sending one request (saClmClusterTrack()). 
c) A node is hitting the FAILED TO RECEIVE case d) After c) there seems to be a lot of churn as the cluster tries to reform e) During the processing of node leave events, the CPG client can sometimes get broken so it no longer processes *any* CPG events Corosync debug is attached (I commented out some of the noisier debug around message delivery). We don't really know enough about corosync to tell what exactly is incorrect behaviour and what should be fixed. But here's what we've noticed: 1). Node-4 joins soon after node-1. When this happens all nodes except node-12 have entered operational state (see node-12.txt line 235). It looks like maybe node-12 hasn't received enough rotations of the token to enter operational yet. Node-12's resulting transitional config consists of just itself. All nodes then report node-1 and node-12 as leaving and immediately rejoining. 2) After this config change, node-3 eventually hits the FAILED TO RECEIVE case (node-3.txt line 380). At this point node-1 and node-12 have an ARU matching the high_seq_received, all other nodes have an ARU of zero. 3) Node-3 entering gather seems to result in a lot of config change churn across the cluster. 4) While processing the config changes on node-3, the CPG downlist it uses contains itself. When node-3 sends leave events for the nodes in the downlist (including itself), it sets its own cpd state to CPD_STATE_UNJOINED and clears the cpd-group_name. This means it no longer sends any CPG events to the CPG client. We tried cherry-picking this commit to fix the problem (#4) with the CPG client. http://www.corosync.org/git/?p=corosync.git;a=commit;h=956a1dcb4236acbba37c07e2ac0b6c9ffcb32577 It helped a bit, but didn't fix it completely. We've made an interim change (attached) to avoid this problem. We're using corosync v1.3.1 on an embedded linux system (with a low-spec CPU). Corosync is running over a basic ethernet interface (no hubs/routers/etc). Any help would be appreciated. 
Let me know if there's any other debug I can provide.

Thanks, Tim
Re: [Openais] [PATCH 1/6] Remove scheduling
I believe a better approach would be to default to standard scheduling and add a new flag --realtime which enables realtime scheduling. Regards -steve On 08/05/2011 12:09 AM, Angus Salkeld wrote: Signed-off-by: Angus Salkeld asalk...@redhat.com --- exec/main.c | 55 +-- 1 files changed, 1 insertions(+), 54 deletions(-) diff --git a/exec/main.c b/exec/main.c index b03d33e..006f846 100644 --- a/exec/main.c +++ b/exec/main.c @@ -144,8 +144,6 @@ LOGSYS_DECLARE_SUBSYS (MAIN); #define SERVER_BACKLOG 5 -static int sched_priority = 0; - static unsigned int service_count = 32; static struct totem_logging_configuration totem_logging_configuration; @@ -972,46 +970,6 @@ void message_source_set ( source-conn = conn; } -static void corosync_setscheduler (void) -{ -#if defined(HAVE_PTHREAD_SETSCHEDPARAM) defined(HAVE_SCHED_GET_PRIORITY_MAX) defined(HAVE_SCHED_SETSCHEDULER) - int res; - - sched_priority = sched_get_priority_max (SCHED_RR); - if (sched_priority != -1) { - global_sched_param.sched_priority = sched_priority; - res = sched_setscheduler (0, SCHED_RR, global_sched_param); - if (res == -1) { - LOGSYS_PERROR(errno, LOGSYS_LEVEL_WARNING, - Could not set SCHED_RR at priority %d, - global_sched_param.sched_priority); - - global_sched_param.sched_priority = 0; - logsys_thread_priority_set (SCHED_OTHER, NULL, 1); - } else { - - /* - * Turn on SCHED_RR in logsys system - */ - res = logsys_thread_priority_set (SCHED_RR, global_sched_param, 10); - if (res == -1) { - log_printf (LOGSYS_LEVEL_ERROR, - Could not set logsys thread priority. - Can't continue because of priority inversions.); - corosync_exit_error (AIS_DONE_LOGSETUP); - } - } - } else { - LOGSYS_PERROR (errno, LOGSYS_LEVEL_WARNING, - Could not get maximum scheduler priority); - sched_priority = 0; - } -#else - log_printf(LOGSYS_LEVEL_WARNING, - The Platform is missing process priority setting features. 
Leaving at default.); -#endif -} - static void fplay_key_change_notify_fn ( object_change_type_t change_type, hdb_handle_t parent_object_handle, @@ -1203,7 +1161,7 @@ int main (int argc, char **argv, char **envp) char *iface; char *strtok_save_pt; int res, ch; - int background, setprio; + int background; struct stat stat_out; char corosync_lib_dir[PATH_MAX]; hdb_handle_t object_runtime_handle; @@ -1212,7 +1170,6 @@ int main (int argc, char **argv, char **envp) /* default configuration */ background = 1; - setprio = 1; while ((ch = getopt (argc, argv, fpv)) != EOF) { @@ -1222,7 +1179,6 @@ int main (int argc, char **argv, char **envp) logsys_config_mode_set (NULL, LOGSYS_MODE_OUTPUT_STDERR|LOGSYS_MODE_THREADED|LOGSYS_MODE_FORK); break; case 'p': - setprio = 0; break; case 'v': printf (Corosync Cluster Engine, version '%s'\n, VERSION); @@ -1240,15 +1196,6 @@ int main (int argc, char **argv, char **envp) } } - /* - * Set round robin realtime scheduling with priority 99 - * Lock all memory to avoid page faults which may interrupt - * application healthchecking - */ - if (setprio) { - corosync_setscheduler (); - } - corosync_mlockall (); log_printf (LOGSYS_LEVEL_NOTICE, Corosync Cluster Engine ('%s'): started and ready to provide service.\n, VERSION); ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
Re: [Openais] [PATCH 3/6] Fix some compiler warnings
Reviewed-by: Steven Dake sd...@redhat.com On 08/05/2011 12:09 AM, Angus Salkeld wrote: Signed-off-by: Angus Salkeld asalk...@redhat.com --- configure.ac |4 +- exec/crypto.c |2 - exec/main.c |3 -- exec/objdb.c | 76 lib/confdb.c |3 ++ services/confdb.c |7 +++- services/cpg.c|2 - 7 files changed, 51 insertions(+), 46 deletions(-) diff --git a/configure.ac b/configure.ac index 92aed9e..35e3cfb 100644 --- a/configure.ac +++ b/configure.ac @@ -173,9 +173,9 @@ LIB_MSG_RESULT(m4_shift(m4_shift($@)))dnl ## helper for CC stuff cc_supports_flag() { - local CFLAGS=$@ + local CPPFLAGS=$CPPFLAGS $@ AC_MSG_CHECKING([whether $CC supports $@]) - AC_COMPILE_IFELSE([int main(){return 0;}] , + AC_PREPROC_IFELSE([AC_LANG_PROGRAM([])], [RC=0; AC_MSG_RESULT([yes])], [RC=1; AC_MSG_RESULT([no])]) return $RC diff --git a/exec/crypto.c b/exec/crypto.c index 901797a..14fb807 100644 --- a/exec/crypto.c +++ b/exec/crypto.c @@ -1140,12 +1140,10 @@ int sha1_done(hash_state * md, unsigned char *hash) int hmac_init(hmac_state *hmac, int hash, const unsigned char *key, unsigned long keylen) { unsigned char buf[128]; -unsigned long hashsize; unsigned long i; int err; hmac-hash = hash; -hashsize = hash_descriptor[hash]-hashsize; /* valid key length? 
*/ assert (keylen 0); diff --git a/exec/main.c b/exec/main.c index 006f846..e33a397 100644 --- a/exec/main.c +++ b/exec/main.c @@ -807,16 +807,13 @@ static void deliver_fn ( int32_t service; int32_t fn_id; uint32_t id; - uint32_t size; uint32_t key_incr_dummy; header = msg; if (endian_conversion_required) { id = swab32 (header-id); - size = swab32 (header-size); } else { id = header-id; - size = header-size; } /* diff --git a/exec/objdb.c b/exec/objdb.c index 99e20ec..999db61 100644 --- a/exec/objdb.c +++ b/exec/objdb.c @@ -112,7 +112,7 @@ static int objdb_init (void) { hdb_handle_t handle; struct object_instance *instance; - unsigned int res; + int res; res = hdb_handle_create (object_instance_database, sizeof (struct object_instance), handle); @@ -192,11 +192,12 @@ static void object_created_notification( struct object_instance * obj_pt; struct object_tracker * tracker_pt; hdb_handle_t obj_handle = object_handle; - unsigned int res; do { - res = hdb_handle_get (object_instance_database, - obj_handle, (void *)obj_pt); + if (hdb_handle_get (object_instance_database, + obj_handle, (void *)obj_pt) != 0) { + return; + } for (list = obj_pt-track_head.next; list != obj_pt-track_head; list = list-next) { @@ -226,11 +227,12 @@ static void object_pre_deletion_notification(hdb_handle_t object_handle, struct object_instance * obj_pt; struct object_tracker * tracker_pt; hdb_handle_t obj_handle = object_handle; - unsigned int res; do { - res = hdb_handle_get (object_instance_database, - obj_handle, (void *)obj_pt); + if (hdb_handle_get (object_instance_database, + obj_handle, (void *)obj_pt) != 0) { + return; + } for (list = obj_pt-track_head.next; list != obj_pt-track_head; list = list-next) { @@ -265,11 +267,12 @@ static void object_key_changed_notification(hdb_handle_t object_handle, struct object_instance * owner_pt = NULL; struct object_tracker * tracker_pt; hdb_handle_t obj_handle = object_handle; - unsigned int res; do { - res = hdb_handle_get (object_instance_database, 
- obj_handle, (void *)obj_pt); + if (hdb_handle_get (object_instance_database, + obj_handle, (void *)obj_pt) != 0) { + return; + } if (owner_pt == NULL) owner_pt = obj_pt; @@ -302,10 +305,11 @@ static void object_reload_notification(int startstop, int flush) struct object_instance * obj_pt; struct object_tracker * tracker_pt; struct object_tracker * tmptracker_pt; - unsigned int res; - res = hdb_handle_get (object_instance_database, - OBJECT_PARENT_HANDLE, (void *)obj_pt); + if (hdb_handle_get (object_instance_database, + OBJECT_PARENT_HANDLE, (void *)obj_pt) != 0) { + return; + } /* * Make a copy of the list @@ -350,7 +354,7
Re: [Openais] [PATCH 4/6] libqb: Add libqb dependency in the rpm pc file
Reviewed-by: Steven Dake sd...@redhat.com

On 08/05/2011 12:09 AM, Angus Salkeld wrote:

Signed-off-by: Angus Salkeld asalk...@redhat.com
---
 corosync.spec.in         |2 +-
 pkgconfig/corosync.pc.in |2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/corosync.spec.in b/corosync.spec.in
index 58c4b0d..d50b72c 100644
--- a/corosync.spec.in
+++ b/corosync.spec.in
@@ -36,7 +36,7 @@ Conflicts: openais = 0.89, openais-devel = 0.89
 %if %{buildtrunk}
 BuildRequires: autoconf automake
 %endif
-BuildRequires: nss-devel
+BuildRequires: nss-devel libqb-devel
 %if %{with rdma}
 BuildRequires: libibverbs-devel librdmacm-devel
 %endif
diff --git a/pkgconfig/corosync.pc.in b/pkgconfig/corosync.pc.in
index 820c607..31b354a 100644
--- a/pkgconfig/corosync.pc.in
+++ b/pkgconfig/corosync.pc.in
@@ -8,5 +8,5 @@ socketdir=@COROSOCKETDIR@
 Name: corosync
 Version: @LIBVERSION@
 Description: corosync
-Requires:
+Requires: libqb
 Cflags: -I${includedir}
Re: [Openais] [PATCH 6/6] Update TODOs
Reviewed-by: Steven Dake sd...@redhat.com On 08/05/2011 12:09 AM, Angus Salkeld wrote: Signed-off-by: Angus Salkeld asalk...@redhat.com --- TODO | 73 + 1 files changed, 19 insertions(+), 54 deletions(-) diff --git a/TODO b/TODO index 9a2db8f..fa30e36 100644 --- a/TODO +++ b/TODO @@ -3,69 +3,34 @@ The Corosync Cluster Engine Topic Branches -- -- -Last Updated: October 2010 +Last Updated: August 2011 -- -We use topic branches in our git repository to develop new disruptive features -that define our future roadmap. This file describes the topic branches -the developers have interest in investigating further. - -targets can be: whitetank, needle, or future (3.0+). -Finished can be: percentage or date merged to master. - -- -topic-libqb +master -- -Main Developer: Angus Salkeld -Started: September 2010 -Finished: 60% -target: needle -Description: -The libqb project is our effort to remove the core infrastructure required for -client server operations of corosync from the corosync code base and place -inside a separate project. +1) exec/totempg.c in check_q_level() + Remove hardcoded values. + Chat to Steve about correcting the queue length calculation. -The main purpose of this topic is to investigate integrating corosync with the -libqb package that has been refactored. Part of this effort also involves -investigation into single threaded operation of the IPC layer without -peformance penalties. +2) check max message size restrictions. --- -topic-rr --- -Main Developer: Steven Dake -Started: Not Started -Finished: 0% -target: needle -Description: -Redundant ring may have quality problems near boundary conditions for sequence -numbers. This effort involves qualifying and hardening redundant ring around -these boundary numbers. A further stretch goal of this topic is to -automatically reenable a redundant ring when it has been back in service. +3) is this https://github.com/asalkeld/libqb/issues/1 still an issue? 
--- -topic-snmp --- -Main Developer: Angus Salkeld -Started: Not Started -Finished: 100% -target: needle -Description: -This topic involves investigation of adding SNMP support into Corosync. +4) remove old stuff from the man pages (logging/IPC). +5) new blackbox size might be too small (exec/logsys.c:311) --- -topic-udpu --- -Main Developer: Steven Dake -Started: October -Finished: 80% -target: needle -Description: -The UDPU transport mode offers a mechanism for Corosync to operate in network -environments where multicast or broadcast are prohibited. The main mechanism -it uses to do this is to UDP unicast to each of the target node IP addresses -listed in the configuation. +6) extend the logging config to make better use of the tracing capabilities. + + + +We use topic branches in our git repository to develop new disruptive features +that define our future roadmap. This file describes the topic branches +the developers have interest in investigating further. + +targets can be: whitetank, needle, or future (3.0+). +Finished can be: percentage or date merged to master. -- topic-onecrypt ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
[Openais] Live demo of Pacemaker Cloud on Fedora: Friday August 5th at 8am PST
I am extending a general invitation to the high availability communities and other cloud community contributors to participate in a live demo I am giving on Friday, August 5th, at 8am PST (GMT-7). The demo portion of the session is 15 minutes and will be given first, followed by more details of our approach to high availability.

I will use Elluminate to show the demo from my desktop machine. To make Elluminate work, you will need icedtea-web installed on your system, which is not typically installed by default. You will also need a conference number and bridge code.

Elluminate link: https://sas.elluminate.com/m.jnlp?sid=819password=M.13AB020AEBE358D265FD925A07335F

Bridge code: Please contact me off list with your location and I'll respond with dial-in information (hopefully a toll-free conference number) and the bridge code.
Re: [Openais] Corosync fails to start under cman
On 08/03/2011 04:06 PM, David wrote:

I have a 3 node RHCS cluster and prior to a VLAN change (moved the cluster communications into its own VLAN) all three nodes were working. Post VLAN migration, 2 of the 3 nodes joined the cluster but the third is failing when I start cman:

Starting cluster:
Checking Network Manager... [ OK ]
Global setup... [ OK ]
Loading kernel modules... [ OK ]
Mounting configfs... [ OK ]
Starting cman...
Aug 03 22:58:26 corosync [MAIN ] Corosync Cluster Engine ('1.2.3'): started and ready to provide service.
Aug 03 22:58:26 corosync [MAIN ] Corosync built-in features: nss rdma
Aug 03 22:58:26 corosync [MAIN ] Successfully read config from /etc/cluster/cluster.conf
Aug 03 22:58:26 corosync [MAIN ] Successfully parsed cman config
Aug 03 22:58:26 corosync [TOTEM ] Token Timeout (1 ms) retransmit timeout (2380 ms)
Aug 03 22:58:26 corosync [TOTEM ] token hold (1894 ms) retransmits before loss (4 retrans)
Aug 03 22:58:26 corosync [TOTEM ] join (60 ms) send_join (0 ms) consensus (12000 ms) merge (200 ms)
Aug 03 22:58:26 corosync [TOTEM ] downcheck (1000 ms) fail to recv const (2500 msgs)
Aug 03 22:58:26 corosync [TOTEM ] seqno unchanged const (30 rotations) Maximum network MTU 1402
Aug 03 22:58:26 corosync [TOTEM ] window size per rotation (50 messages) maximum messages per rotation (17 messages)
Aug 03 22:58:26 corosync [TOTEM ] missed count const (5 messages)
Aug 03 22:58:26 corosync [TOTEM ] send threads (0 threads)
Aug 03 22:58:26 corosync [TOTEM ] RRP token expired timeout (2380 ms)
Aug 03 22:58:26 corosync [TOTEM ] RRP token problem counter (2000 ms)
Aug 03 22:58:26 corosync [TOTEM ] RRP threshold (10 problem count)
Aug 03 22:58:26 corosync [TOTEM ] RRP mode set to none.
Aug 03 22:58:26 corosync [TOTEM ] heartbeat_failures_allowed (0)
Aug 03 22:58:26 corosync [TOTEM ] max_network_delay (50 ms)
Aug 03 22:58:26 corosync [TOTEM ] HeartBeat is Disabled. To enable set heartbeat_failures_allowed > 0
Aug 03 22:58:26 corosync [TOTEM ] Initializing transport (UDP/IP).
Aug 03 22:58:26 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Aug 03 22:58:26 corosync [IPC ] you are using ipc api v2
Aug 03 22:58:26 corosync [TOTEM ] Receive multicast socket recv buffer size (262142 bytes).
Aug 03 22:58:26 corosync [TOTEM ] Transmit multicast socket send buffer size (262142 bytes).
corosync: totemsrp.c:3091: memb_ring_id_create_or_load: Assertion `res == sizeof (unsigned long long)' failed.
Aug 03 22:58:26 corosync [TOTEM ] The network interface [10.50.3.70] is now up.
corosync died with signal: 6
Check cluster logs for details [FAILED]

I haven't been able to find information that identifies the issue or how to correct it. I am hoping someone from this group may be able to shed some light.

This happens because the ring id file is 0 bytes. We have fixed this problem in later versions of corosync. To rectify this problem, run:

rm -f /var/lib/corosync/ringid*

Regards
-steve

Thanks! David
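The failure mode described here (a state file that exists but holds fewer bytes than expected) can be handled without an assertion by treating a short read as "no saved state". The sketch below only illustrates that idea under stated assumptions: the function name ring_seq_load_or_create is hypothetical, and this is not the fix that went into later corosync versions.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Load a sequence number from path. A missing, empty or truncated file
 * is treated as "no saved state" and rewritten with 0 instead of
 * aborting. Returns 0 on success and -1 on an I/O error.
 */
int ring_seq_load_or_create(const char *path, unsigned long long *seq)
{
	ssize_t res;
	int fd;

	assert(path != NULL && seq != NULL);

	fd = open(path, O_RDONLY);
	if (fd >= 0) {
		res = read(fd, seq, sizeof(*seq));
		close(fd);
		if (res == (ssize_t)sizeof(*seq))
			return 0;	/* complete record found */
	}

	/* Short read or no file: start over with a fresh record. */
	*seq = 0;
	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;
	res = write(fd, seq, sizeof(*seq));
	close(fd);
	return (res == (ssize_t)sizeof(*seq)) ? 0 : -1;
}
```

With a 0-byte file on disk, this returns 0, sets the sequence to 0 and leaves a full-size record behind, which is the same end state as removing the file by hand and restarting.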
Re: [Openais] Corosync (version 1.23 on rhel6) crashes when packets are dropped
On 08/02/2011 04:47 PM, Stanley, Ephrim wrote:

Hi, I’m evaluating the Qpid messaging broker, which uses Corosync for clustering. As part of my cluster break tests, I ran into a problem where Corosync dies without producing any core files or error messages. Is this expected? Also, what are some best practices for testing packet loss with Corosync?

Steps to reproduce:

1. Compile Corosync 1.2.3 after enabling the #defines for packet loss (in totemsrp.c line 129). I did not change the drop percentages; I left them as is:
#define TEST_DROP_ORF_TOKEN_PERCENTAGE 30
#define TEST_DROP_COMMIT_TOKEN_PERCENTAGE 30
#define TEST_DROP_MCAST_PERCENTAGE 50
#define TEST_RECOVERY_MSG_COUNT 300
2. Start a qpid cluster with three nodes: NODE1, NODE2, NODE3.
3. Nodes NODE2 and NODE3 are run with the Corosync that does not drop packets.
4. Start the qpid process on nodes NODE2 and NODE3.
5. After both processes are up, corosync-cpgtool reports the cluster membership correctly.
6. On NODE1, start Corosync (the build that drops packets).
7. Corosync starts and packet drops can be observed in the Corosync log (I added some debug log statements).
8. Start a qpid process on NODE1.
9. Now Corosync crashes on NODE1. No core files are produced.

I have attached the output of corosync-fplay on NODE1 and a diff of the changes I made to totemsrp.c.

Thanks, Ephrim.

Ephrim,

Could you be more specific about which version of Red Hat's build of corosync you are using? Redundant ring is not supported in 1.2.3 by upstream or by Red Hat.

Looking at existing bugs that have not hit z streams yet, it may be this issue:

https://bugzilla.redhat.com/show_bug.cgi?id=722522

To get a core file, set ulimit -c unlimited before running corosync. A core file would verify whether this is a known fixed problem or a new issue.

Thanks
-steve
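For readers unfamiliar with percentage-based fault injection of the kind the TEST_DROP_* defines suggest, one plausible shape is shown below. This is a generic sketch and is not copied from totemsrp.c; the function names are hypothetical and the random source is plain rand().

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper: returns 1 if a packet should be discarded,
 * given a drop percentage between 0 and 100.
 */
int test_should_drop(int drop_percentage)
{
	assert(drop_percentage >= 0 && drop_percentage <= 100);
	return (rand() % 100) < drop_percentage;
}

/* Count how many of n simulated packets survive the drop filter. */
int test_count_delivered(int n, int drop_percentage)
{
	int i;
	int delivered = 0;

	for (i = 0; i < n; i++) {
		if (!test_should_drop(drop_percentage))
			delivered++;
	}
	return delivered;
}
```

A receive path would call such a check before processing each token or multicast message and silently return when it fires, which is why a node built this way behaves as if it sat behind a lossy link.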
Re: [Openais] [PATCH] corosync.conf.example: add note about host addresses in bindnetaddr
These patches look good.

Reviewed-by: Steven Dake sd...@redhat.com

Regards
-steve

On 07/31/2011 11:56 PM, Florian Haas wrote:

https://lists.linux-foundation.org/pipermail/openais/2011-July/016563.html

Jan Friesse pointed out that bindnetaddr should be set to a host address (as opposed to a network address) on hosts where multiple NICs live on the same subnet. Add a comment to that effect to the example configuration file.

Signed-off-by: Florian Haas florian.h...@linbit.com
---
 conf/corosync.conf.example | 16
 1 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/conf/corosync.conf.example b/conf/corosync.conf.example
index c849dba..ac1718f 100644
--- a/conf/corosync.conf.example
+++ b/conf/corosync.conf.example
@@ -17,11 +17,19 @@ totem {
 	interface {
 		# Rings must be consecutively numbered, starting at 0.
 		ringnumber: 0
-		# This is the *network* address of the interface to
-		# bind to. This ensures that you can use identical
-		# instances of this configuration file across all your
-		# cluster nodes, without having to modify this option.
+		# This is normally the *network* address of the
+		# interface to bind to. This ensures that you can use
+		# identical instances of this configuration file
+		# across all your cluster nodes, without having to
+		# modify this option.
 		bindnetaddr: 192.168.1.0
+		# However, if you have multiple physical network
+		# interfaces configured for the same subnet, then the
+		# network address alone is not sufficient to identify
+		# the interface Corosync should bind to. In that case,
+		# configure the *host* address of the interface
+		# instead:
+		# bindnetaddr: 192.168.1.1
 		# When selecting a multicast address, consider RFC
 		# 2365 (which, among other things, specifies that
 		# 239.255.x.x addresses are left to the discretion of
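To make the distinction between the two bindnetaddr forms concrete: the network address is the host address with the netmask applied, so 192.168.1.1 with a /24 mask yields 192.168.1.0. The helper below is a small sketch of that calculation using the standard inet_pton and inet_ntop calls; the function name ipv4_network_address is hypothetical, and the sketch makes no claim about how Corosync itself matches interfaces.

```c
#define _POSIX_C_SOURCE 200809L
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Compute the IPv4 network address for a host address and netmask,
 * both given in dotted-quad notation. Returns 0 on success and -1 if
 * either string cannot be parsed or the output buffer is too small.
 */
int ipv4_network_address(const char *host, const char *netmask,
	char *out, size_t out_len)
{
	struct in_addr addr, mask, net;

	assert(host != NULL && netmask != NULL && out != NULL);

	if (inet_pton(AF_INET, host, &addr) != 1 ||
	    inet_pton(AF_INET, netmask, &mask) != 1)
		return -1;

	/* Both values are in network byte order, so a plain AND works. */
	net.s_addr = addr.s_addr & mask.s_addr;

	return inet_ntop(AF_INET, &net, out, (socklen_t)out_len) != NULL ? 0 : -1;
}
```

Two NICs at 192.168.1.1 and 192.168.1.2 with the same netmask both reduce to 192.168.1.0, which is why the network address alone cannot identify one of them.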
Re: [Openais] corosync didn't do what I expected
On 07/29/2011 12:36 PM, Keith Stevens wrote:

I have the following configuration on two servers, netbox1 and netbox2:

crm(live)configure# show
node netbox1 \
	attributes standby=off
node netbox2
primitive failover-ip ocf:heartbeat:IPaddr \
	params ip=216.105.20.43 \
	op monitor interval=10s
location cli-prefer-failover-ip failover-ip \
	rule $id=cli-prefer-rule-failover-ip inf: #uname eq netbox1
property $id=cib-bootstrap-options \
	dc-version=1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b \
	cluster-infrastructure=openais \
	expected-quorum-votes=2 \
	stonith-enabled=false

If I put netbox1 on standby, the IP address migrates to netbox2, and back to netbox1 when I bring it back online. The IP address was on netbox1 when I powered down netbox2 to move it into a cabinet. To my surprise, netbox1 lost the IP address and didn't get it back until I booted netbox2. Apparently I have a huge conceptual hole in my understanding: I expected netbox1 to keep the IP address. Why didn't it?

Thanks, -Keith

Keith,

Your email is better suited for the pacemaker list.

Regards
-steve
Re: [Openais] Corosync Compatibility
On 07/26/2011 08:28 PM, manish.gu...@ionidea.com wrote:

Thank you Steve. We are currently using corosync-1.2.1 and pacemaker 1.0.10. Can we use the same version of pacemaker with corosync-1.4?

Yes, although redundant ring is not on-wire compatible, meaning you will have to restart your cluster.

Regards
-steve

On Tue, July 26, 2011 7:12 pm, Steven Dake wrote:

On 07/26/2011 01:52 AM, manish.gu...@ionidea.com wrote:

Hi, I am facing a problem with the redundant communication channel. I am using Corosync 1.2, in which automatic failback of the redundant channel is not supported, but 1.4 provides support. Which versions of pacemaker is Corosync 1.4 compatible with?

corosync 1.4 should work with all versions of pacemaker. What version of pm are you using?

Regards
-steve
Re: [Openais] Corosync Compatibility
On 07/26/2011 01:52 AM, manish.gu...@ionidea.com wrote:

Hi, I am facing a problem with the redundant communication channel. I am using Corosync 1.2, in which automatic failback of the redundant channel is not supported, but 1.4 provides support. Which versions of pacemaker is Corosync 1.4 compatible with?

corosync 1.4 should work with all versions of pacemaker. What version of pm are you using?

Regards
-steve
Re: [Openais] vsftype - which one?
On 07/26/2011 04:07 AM, Proskurin Kirill wrote:

Hello all. I do not fully understand what vsftype really is. Could someone explain it? I plan to build a ~50 node cluster with about ~50 resources via pacemaker. All nodes are in our local network with 1 Gbit/s NICs. What type should I choose? Do I need to recompile corosync with something special (e.g. with HAVE_SMALL_MEMORY_FOOTPRINT=0)? Everything runs on corosync-1.4.1 and pacemaker-1.1.5.

Don't use vsftype, i.e. set vsftype: none

Regards
-steve
Re: [Openais] [PATCH] main: let poll really stop before totempg_finalize
Reviewed-by: Steven Dake sd...@redhat.com On 07/25/2011 06:23 AM, Jan Friesse wrote: Signed-off-by: Jan Friesse jfrie...@redhat.com --- exec/main.c | 24 +++- 1 files changed, 15 insertions(+), 9 deletions(-) diff --git a/exec/main.c b/exec/main.c index be9e118..1c4fb37 100644 --- a/exec/main.c +++ b/exec/main.c @@ -184,6 +184,8 @@ static int32_t corosync_not_enough_fds_left = 0; static void serialize_unlock (void); +static void serialize_lock (void); + hdb_handle_t corosync_poll_handle_get (void) { return (corosync_poll_handle); @@ -211,14 +213,7 @@ static void unlink_all_completed (void) serialize_unlock (); api-timer_delete (corosync_stats_timer_handle); poll_stop (corosync_poll_handle); - totempg_finalize (); - - /* - * Remove pid lock file - */ - unlink (corosync_lock_file); - - corosync_exit_error (AIS_DONE_EXIT); + serialize_lock (); } void corosync_shutdown_request (void) @@ -1887,6 +1882,17 @@ int main (int argc, char **argv, char **envp) */ poll_run (corosync_poll_handle); + /* + * Exit was requested + */ + totempg_finalize (); + + /* + * Remove pid lock file + */ + unlink (corosync_lock_file); + + corosync_exit_error (AIS_DONE_EXIT); + return EXIT_SUCCESS; } - ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
Re: [Openais] [PATCH] totemsrp: fix buffer overflows for large clusters ( 100 nodes)
Thanks for the submission. Reviewed-by; Steven Dake sd...@redhat.com On 07/24/2011 02:58 AM, MORITA Kazutaka wrote: Signed-off-by: MORITA Kazutaka morita.kazut...@lab.ntt.co.jp --- exec/totemsrp.c |6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/exec/totemsrp.c b/exec/totemsrp.c index 16de74d..e34da1a 100644 --- a/exec/totemsrp.c +++ b/exec/totemsrp.c @@ -508,7 +508,7 @@ struct totemsrp_instance { void * token_recv_event_handle; void * token_sent_event_handle; - char commit_token_storage[9000]; + char commit_token_storage[4]; }; struct message_handlers { @@ -2976,7 +2976,7 @@ static void memb_state_commit_token_create ( static void memb_join_message_send (struct totemsrp_instance *instance) { - char memb_join_data[1]; + char memb_join_data[4]; struct memb_join *memb_join = (struct memb_join *)memb_join_data; char *addr; unsigned int addr_idx; @@ -3028,7 +3028,7 @@ static void memb_join_message_send (struct totemsrp_instance *instance) static void memb_leave_message_send (struct totemsrp_instance *instance) { - char memb_join_data[1]; + char memb_join_data[4]; struct memb_join *memb_join = (struct memb_join *)memb_join_data; char *addr; unsigned int addr_idx; ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
Re: [Openais] Corosync sends unicast but should multycast
Tokens are always sent unicast - this is how the protocol works. thanks -steve On 07/22/2011 07:22 AM, Proskurin Kirill wrote: Hi all. I found an odd thing: some of my nodes send unicast, others send multicast, and others send both unicast and multicast, all with the same configuration, and they all work. It sounds a little confusing, I know. corosync-1.4.0 Config attached. I have 3 nodes: my108.i has address 10.3.1.108 my107.i has address 10.3.1.107 my105.i has address 10.6.1.155 I use tcpdump to look at the traffic and see things like this: IP (tos 0x0, ttl 62, id 0, offset 0, flags [DF], proto: UDP (17), length: 98) 10.3.1.108.5404 > 10.6.1.155.5405: UDP, length 70 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto: UDP (17), length: 98) 10.6.1.155.5404 > 10.6.1.156.5405: UDP, length 70 IP (tos 0x0, ttl 29, id 0, offset 0, flags [DF], proto: UDP (17), length: 110) 10.3.1.107.5404 > 239.255.1.1.5405: UDP, length 82 IP (tos 0x0, ttl 62, id 0, offset 0, flags [DF], proto: UDP (17), length: 98) 10.3.1.108.5404 > 10.6.1.155.5405: UDP, length 70 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto: UDP (17), length: 98) 10.6.1.155.5404 > 10.6.1.156.5405: UDP, length 70 The nodes see each other and everything seems to work, but as I understand it they should communicate by multicast. Or not? ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
Re: [Openais] Corosync sends unicast but should multycast
On 07/22/2011 08:01 AM, Proskurin Kirill wrote: On 07/22/2011 06:46 PM, Steven Dake wrote: Tokens are always sent unicast - this is how the protocol works. Thanks for the reply. One more thing: when and for what is multicast sent? We are running some tests with the network team and trying to understand all the communication logic of corosync. read http://www.google.com/url?sa=tsource=webcd=1ved=0CBUQFjAAurl=http%3A%2F%2Fciteseer.ist.psu.edu%2Fviewdoc%2Fdownload%3Bjsessionid%3D863760AB04B004AF5DF7285D032E6595%3Fdoi%3D10.1.1.37.767%26rep%3Drep1%26type%3Dpsrct=jq=totem%20single%20ring%20protocolei=Z5EpTo6sMsmbtwfe36nXAgusg=AFQjCNFSyIM94w0Xm2VCfOGJS4kKyaMjmg (The Totem Single Ring Protocol, in case that link didn't come through). The multicast packets carry the actual message data, which is transmitted to all nodes at the same time. Regards -steve ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
Re: [Openais] Ip addr auto detection
On 07/21/2011 02:42 AM, Proskurin Kirill wrote: Hello all. The man page for corosync.conf suggests setting bindnetaddr not to the node's current IP address but to its network: For example, if the local interface is 192.168.5.92 with netmask 255.255.255.0, set bindnetaddr to 192.168.5.0. OK, that's cool. But what if I have a bunch of aliases in the same network on the same node? How will it determine which IP to use? Or what if I have two NICs with two IPs in the same network? Specify the exact ip address in this case. Regards -steve ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
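The network address that the man page example refers to is simply the interface address ANDed with its netmask. A minimal sketch of that arithmetic follows; `network_address` is a hypothetical helper written for illustration and is not part of corosync.

```c
#include <arpa/inet.h>
#include <stddef.h>

/*
 * Hypothetical helper (not corosync code): derive the network address
 * for an IPv4 address/netmask pair, e.g. 192.168.5.92 with
 * 255.255.255.0 gives 192.168.5.0.
 * Returns 0 on success, -1 if either input cannot be parsed or the
 * output buffer is too small.
 */
static int network_address(const char *ip, const char *netmask,
                           char *out, size_t out_len)
{
	struct in_addr addr, mask, net;

	if (inet_pton(AF_INET, ip, &addr) != 1 ||
	    inet_pton(AF_INET, netmask, &mask) != 1) {
		return -1;
	}

	/* Both values are in network byte order, so a plain AND is enough. */
	net.s_addr = addr.s_addr & mask.s_addr;

	if (inet_ntop(AF_INET, &net, out, (socklen_t)out_len) == NULL) {
		return -1;
	}
	return 0;
}
```

Two aliases in the same subnet reduce to the same network address under this calculation, which is why the network form cannot distinguish between them and the exact IP address has to be given instead.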
Re: [Openais] Multycast unicast as fall back
On 07/21/2011 02:30 AM, Proskurin Kirill wrote: Hello all. Is it possible to use multicast as the primary means of communication in the cluster but fall back to unicast transport if multicast fails? Different rings with different transports? We have some problems with our network switches where multicast just stops working, so I started to think about this feature. Just use udpu entirely. This feature is supported in 1.3.2+. Regards -steve ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
Re: [Openais] About TODO file
On 07/21/2011 04:15 AM, Yingliang Yang wrote: Hi, I have downloaded the corosync-1.4.0 package. There is a TODO file in the release, but it was last updated in October 2010. I would like to know whether there is any plan for the future. Also, there is an option (enable_watchdog) in the configure file. Will this feature be released in a future version? We have a fairly concrete 2.0 plan which is called Needle. Most features are described in our TODO in the master branch. Regards -steve Best Regards, Yingliang Yang ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
Re: [Openais] Multycast unicast as fall back
On 07/21/2011 06:27 AM, Proskurin Kirill wrote: On 07/21/2011 05:11 PM, Steven Dake wrote: On 07/21/2011 02:30 AM, Proskurin Kirill wrote: Hello all. Is it possible to use multicast as the primary means of communication in the cluster but fall back to unicast transport if multicast fails? Different rings with different transports? We have some problems with our network switches where multicast just stops working, so I started to think about this feature. Just use udpu entirely. This feature is supported in 1.3.2+. I'm on 1.4.0 now, but I do not wish to use unicast as the production base, only if problems with multicast occur. There is no fallback. You can specify one transport or the other. Thinking for a moment about how to implement this type of feature, it could not be reasonably implemented. What type of app are you running on top of corosync? The advantages of multicast are automatic growth (you don't have to know the node addresses ahead of time) and more throughput with less cpu utilization under high cpg message throughput. The disadvantage is that multicast is generally poorly implemented by switch vendors. Regards -steve ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
[Openais] New bugzilla method
Hi, We have new bugzilla tracking in place via bugzilla.redhat.com. When filing bugs, please file under Community-Corosync Cluster Engine rather than rawhide or a specific fedora version. If the issue is fedora specific, continue to file under fedora. For other distro-specific problems (such as a defect caused by the distro shipping software that is not the latest supported z stream), please file bugs with the respective distribution's bug tracking system. ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
[Openais] Corosync 2.0 (needle) Call for RFEs
The Corosync flatiron 1.y series had many more features added than I would have liked, but the development team feels the 1.y series addresses any major gaps users of the software have had. As a result, we are freezing any future feature development of the flatiron branch permanently. We will continue to maintain z streams (1.4.z) with bug fixes for many years to come in a robust and aggressive fashion. Now that the flatiron chapter of Corosync is finished, we can move on to new R&D work around Corosync 2.0. There are a few RFEs floating around in bugzilla and the TODO list. This is your chance to provide feedback about feature development you would like to see in Corosync. The overall theme for Corosync 2.0 is focused around trimming the fat and simplifying the implementation without major performance regressions. The developers will take feature submission suggestions until Aug 31, at which point we will prioritize features for 2.0 and close feature submission requests. Regards -steve ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
Re: [Openais] FAILED TO RECEIVE followed by cluster failure
On 07/21/2011 12:19 PM, Jed Smith wrote: Steve, Thank you again for all of the information. I labbed an in-place upgrade and the Corosync 1.4.0 compile brought down the 1.2.1-4ubuntu1 box. All I did was deploy from scratch, create a cluster with 1.2.1-4ubuntu1 and Pacemaker 1.0.10-4ubuntu3, then compiled Corosync 1.4.0 and Pacemaker 1.0.11 and introduced them to the cluster, and Corosync disappeared with no output. I don't mind building a new oblivious cluster and failing my resources over the hard way -- I did that many times, including a transition from Heartbeat to Corosync during development -- I'm just curious if there's something I'm doing that's preventing the 1.2.1 box from staying up. I restarted Corosync on the 1.2.1 side, and it crashed immediately. Logs: http://pastie.org/private/e9ktdolkdesf3eeq5d5gnq Again, I don't mind doing an oblivious cluster rebuild. It's not ideal, but it's also not a big deal -- you just mentioned that, in theory, 1.2.1 should talk to 1.4.0 fine. A correction is in order. We test rolling upgrades from 1.2.latest z to 1.3.0 and 1.3.latest z to 1.4.0. updating from 1.2.1 may not roll properly. I expect rolling upgrades of redundant ring don't work well with 1.4.0 because of protocol changes to support automatic redundant ring recovery, which hopefully nobody was using until 1.4.0 where we added it to the list of things we really want to work well :) Regards -steve ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
Re: [Openais] [PATCH 2/3] specfile: use _datadir as var expansion not exec
On 07/20/2011 12:48 AM, Jan Friesse wrote: Steven Dake wrote: On 07/19/2011 08:01 AM, Jan Friesse wrote: Signed-off-by: Jan Friesse jfrie...@redhat.com --- corosync.spec.in |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/corosync.spec.in b/corosync.spec.in index 37e53ed..823ad3d 100644 --- a/corosync.spec.in +++ b/corosync.spec.in @@ -138,7 +138,7 @@ fi %{_sysconfdir}/dbus-1/system.d/corosync-signals.conf %endif %if %{with snmp} -%(_datadir)/snmp/mibs/COROSYNC-MIB.txt +%{_datadir}/snmp/mibs/COROSYNC-MIB.txt %endif does this patch change anything? Ya, but it's very hard to spot (especially with small/bad fonts, it took me a while to notice it too). It changes round brackets ( ) to curly bracket { }. First means execute in shell (we really don't want to execute _datadir command) and second expand variable value (this is what we want). %{_initrddir}/corosync %{_initrddir}/corosync-notifyd Reviewed-by: Steven Dake sd...@redhat.com ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
Re: [Openais] Multi State active resource each instance start\Stop
On 07/19/2011 03:21 AM, manish.gu...@ionidea.com wrote: Hi, I have configured a multi-state (clone) resource, a floating IP (IP). It is running on all the configured nodes. I am trying to stop it using the crm_resource command: crm_resource -r IP:0 -p target-role -v stopped I am getting this error: Error performing operation : The object/attribute does not exist. Can anybody please help me? How can I stop a single instance using a command? If I manually bring down a single instance on one node and then clean up the instance, it comes up again, i.e. it is restarted: ifconfig eth0:1 down crm_resource -C -r IP:0 -H NodeName It is working properly. Cluster stack corosync-1.2 pacemaker-1.10 wrong ml. Try the pacemaker ml. Regards Manish ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
Re: [Openais] [PATCH 1/3] specfile: Correct URL and source0
On 07/19/2011 08:01 AM, Jan Friesse wrote: Signed-off-by: Jan Friesse jfrie...@redhat.com --- corosync.spec.in |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/corosync.spec.in b/corosync.spec.in index e1dcf19..37e53ed 100644 --- a/corosync.spec.in +++ b/corosync.spec.in @@ -18,8 +18,8 @@ Version: @version@ Release: 1%{?numcomm:.%{numcomm}}%{?alphatag:.%{alphatag}}%{?dirty:.%{dirty}}%{?dist} License: BSD Group: System Environment/Base -URL: http://www.openais.org -Source0: http://developer.osdl.org/dev/openais/downloads/%{name}-%{version}/%{name}-%{version}%{?numcomm:.%{numcomm}}%{?alphatag:-%{alphatag}}%{?dirty:-%{dirty}}.tar.gz +URL: http://ftp.corosync.org +Source0: ftp://ftp:u...@ftp.corosync.org/downloads/%{name}-%{version}/%{name}-%{version}%{?numcomm:.%{numcomm}}%{?alphatag:-%{alphatag}}%{?dirty:-%{dirty}}.tar.gz # Runtime bits Requires: corosynclib = %{version}-%{release} Reviewed-by: Steven Dake sd...@redhat.com ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
Re: [Openais] [PATCH 2/3] specfile: use _datadir as var expansion not exec
On 07/19/2011 08:01 AM, Jan Friesse wrote: Signed-off-by: Jan Friesse jfrie...@redhat.com --- corosync.spec.in |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/corosync.spec.in b/corosync.spec.in index 37e53ed..823ad3d 100644 --- a/corosync.spec.in +++ b/corosync.spec.in @@ -138,7 +138,7 @@ fi %{_sysconfdir}/dbus-1/system.d/corosync-signals.conf %endif %if %{with snmp} -%(_datadir)/snmp/mibs/COROSYNC-MIB.txt +%{_datadir}/snmp/mibs/COROSYNC-MIB.txt %endif does this patch change anything? %{_initrddir}/corosync %{_initrddir}/corosync-notifyd ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
Re: [Openais] [PATCH 3/3] specfile: Install corosync-signals.conf for dbus
On 07/19/2011 08:01 AM, Jan Friesse wrote: Signed-off-by: Jan Friesse jfrie...@redhat.com --- corosync.spec.in |5 + 1 files changed, 5 insertions(+), 0 deletions(-) diff --git a/corosync.spec.in b/corosync.spec.in index 823ad3d..74ab851 100644 --- a/corosync.spec.in +++ b/corosync.spec.in @@ -92,6 +92,11 @@ rm -rf %{buildroot} make install DESTDIR=%{buildroot} +%if %{with dbus} +mkdir -p -m 0700 %{buildroot}/%{_sysconfdir}/dbus-1/system.d +install -m 644 %{_builddir}/%{name}-%{version}/conf/corosync-signals.conf %{buildroot}/%{_sysconfdir}/dbus-1/system.d/corosync-signals.conf +%endif + ## tree fixup # drop static libs rm -f %{buildroot}%{_libdir}/*.a Reviewed-by: Steven Dake sd...@redhat.com ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
Re: [Openais] Some messages still leaked in recovery code
On 07/18/2011 07:55 PM, Tim Beale wrote: Hi, I think there is still a slight memory leak when recovery is entered repeatedly. The recovery messages usually get freed when the operational state is entered. However, if recovery is entered several times without entering the operational state, then some messages can be leaked. Attached is a patch that fixes the problem for me. I tested it on v1.3.1, but the patch should apply to trunk. Let me know if I've misunderstood anything, or if any of the patch needs fixing up. Cheers, Tim Tim, Thanks for the patch. I have briefly looked over it, and it is a big change. I want to give it due review but I am swamped at least until the end of the month. I'll provide review then. Thanks -steve ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
Re: [Openais] Add a few more stats for debugging
On 07/18/2011 09:14 PM, Tim Beale wrote: Hi, Attached is a patch that adds a few more stats (the code was actually written by Angus). We find these stats useful - hopefully others will too. Cheers, Tim Great work Reviewed-by: Steven Dake sd...@redhat.com ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
Re: [Openais] Announcing Corosync 1.4.0
On 07/18/2011 08:29 AM, Digimer wrote: On 07/18/2011 10:37 AM, Jan Friesse wrote: Corosync 1.4.0 is available for immediate download from our website. This version brings many enhancements to the software but most visible change is redundant ring auto recovery functionality. Please retrieve the latest sources from our website: http://www.corosync.org Regards Honza This is a question I think I already know the answer to, but what the heck, I'll ask anyway. Will the RRP recovery feature be back-ported to EL5? Having this option on existing RHCS2 clusters would be fantastic! We are providing bug fixes for RHEL5 only - no new feature development. ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
Re: [Openais] Announcing Corosync 1.4.0
On 07/18/2011 07:37 AM, Jan Friesse wrote: Corosync 1.4.0 is available for immediate download from our website. This version brings many enhancements to the software, but the most visible change is the redundant ring auto recovery functionality. Please retrieve the latest sources from our website: http://www.corosync.org Regards Honza The other nice feature we have spent a lot of time on is SNMP support and integration with foghorn (a DBUS to SNMP connector). Regards -steve ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
Re: [Openais] FAILED TO RECEIVE followed by cluster failure
On 07/18/2011 10:38 AM, Jed Smith wrote: Thank you for your reply. On Mon, Jul 18, 2011 at 1:18 PM, Digimer li...@alteeve.com wrote: Is it possible that the switch dropped the multicast group, and didn't reform it fast enough to prevent the cluster from partitioning? Our network guy says that the switches do not look at multicast traffic, they merely broadcast it in our environment. unlikely. I expect what is happening is that your switch is delaying multicast packets compared to the unicast token. This causes retransmits. There is a bug in older versions of our totem implementation that increases the fail to recv counter incorrectly. In newer versions we have worked around this flaw in the original totem specification (which expects that multicast can be flushed before a token receipt, which is an invalid assertion). My recommendation to you is to update to a 1.3 or 1.4 series. Both of these have very tight maintenance rules around what goes in (i.e. it's not tip development work). Once you have a version that doesn't have known bugs, I'd recommend increasing fail_recv_const to some large value, such as 5000. See: http://www.mail-archive.com/openais@lists.linux-foundation.org/msg05924.html It would be nice if the debian maintainers would update their packages to the latest upstream. We release z streams for a reason (usually the reason being that someone has had a field failure resulting in a complete cluster outage). Y stream releases are a bit more liberal in terms of additional features. File a bug with your distro and ask them to use an upstream release which is recent and supported upstream (1.2.y upstream support fell off once we released 1.4.y - we support 2 y streams). Thanks -steve Thanks, ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
Re: [Openais] FAILED TO RECEIVE followed by cluster failure
On 07/18/2011 07:55 PM, Keisuke MORI wrote: Hi, 2011/7/19 Steven Dake sd...@redhat.com: On 07/18/2011 10:38 AM, Jed Smith wrote: Thank you for your reply. On Mon, Jul 18, 2011 at 1:18 PM, Digimer li...@alteeve.com wrote: Is it possible that the switch dropped the multicast group, and didn't reform it fast enough to prevent the cluster from partitioning? Our network guy says that the switches do not look at multicast traffic, they merely broadcast it in our environment. unlikely. I expect what is happening is that your switch is delaying multicast packets compared to the unicast token. This causes retransmits. There is a bug in older versions of our totem implementation that increases the fail to recv counter incorrectly. In newer versions we have worked around this flaw in the original totem specification (which expects that multicast can be flushed before a token receipt, which is an invalid assertion). My recommendation to you is to update to a 1.3 or 1.4 series. Both of these have very tight maintenance rules around what goes in (i.e. it's not tip development work). Once you have a version that doesn't have known bugs, I'd recommend increasing fail_recv_const to some large value, such as 5000. See: http://www.mail-archive.com/openais@lists.linux-foundation.org/msg05924.html We discovered that the issue in that report was caused by a misbehavior of the IGMP snooping feature in the bridge interface: http://www.spinics.net/lists/netdev/msg166960.html Because of this, the bridge interface sometimes fails to handle IGMP packets properly, and multicast traffic may not be forwarded for a while even though unicast traffic goes through fine, which confuses corosync. RHEL6.0 at least is affected, but RHEL5 is not, because the RHEL5 kernel does not implement IGMP snooping yet. You can work around it by either: 1) disabling the IGMP snooping feature, e.g.
echo 0 > /sys/class/net/br0/bridge/multicast_snooping 2) not using a bridge interface for corosync multicast traffic. When we encountered this issue, we had assigned the multicast address to a bridge interface on top of a bonding interface. Assigning the IP address to the bonding interface instead did solve it. Increasing fail_recv_const did not actually solve it; it just delayed the occurrence. Hope it helps. Thanks for the report. I believe our workarounds for delayed multicast packets will mask that kernel oddness, but can't guarantee it. I'm certain someone will find that information of value. Regards -steve ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
Re: [Openais] [PATCH 2/2] rrp: Handle rollower in passive rrp properly
Great work Reviewed-by: Steven Dake sd...@redhat.com On 07/15/2011 06:31 AM, Jan Friesse wrote: Signed-off-by: Jan Friesse jfrie...@redhat.com --- exec/totemrrp.c | 175 +++ 1 files changed, 112 insertions(+), 63 deletions(-) diff --git a/exec/totemrrp.c b/exec/totemrrp.c index 0445be2..6bfacd9 100644 --- a/exec/totemrrp.c +++ b/exec/totemrrp.c @@ -335,6 +335,11 @@ static void passive_mcast_flush_send ( const void *msg, unsigned int msg_len); +static void passive_monitor ( + struct totemrrp_instance *rrp_instance, + unsigned int iface_no, + int is_token_recv_count); + static void passive_token_recv ( struct totemrrp_instance *instance, unsigned int iface_no, @@ -484,6 +489,14 @@ static void active_timer_problem_decrementer_cancel ( * #define ARR_SEQNO_START_MSG 0xfe00 */ +/* + * Threshold value when recv_count for passive rrp should be adjusted. + * Set this value to some smaller for testing of adjusting proper + * functionality. Also keep in mind that this value must be smaller + * then rrp_problem_count_threshold + */ +#define PASSIVE_RECV_COUNT_THRESHOLD (INT_MAX / 2) + struct message_header { char type; char encapsulated; @@ -841,50 +854,92 @@ static void passive_timer_problem_decrementer_cancel ( } */ - -static void passive_mcast_recv ( +/* + * Monitor function implementation from rrp paper. 
+ * rrp_instance is passive rrp instance, iface_no is interface with received messgae/token and + * is_token_recv_count is boolean variable which donates if message is token (1) or regular + * message (= 0) + */ +static void passive_monitor ( struct totemrrp_instance *rrp_instance, unsigned int iface_no, - void *context, - const void *msg, - unsigned int msg_len) + int is_token_recv_count) { struct passive_instance *passive_instance = (struct passive_instance *)rrp_instance-rrp_algo_instance; + unsigned int *recv_count; unsigned int max; unsigned int i; - - rrp_instance-totemrrp_deliver_fn ( - context, - msg, - msg_len); - - if (rrp_instance-totemrrp_msgs_missing() == 0 - passive_instance-timer_expired_token) { - /* - * Delivers the last token - */ - rrp_instance-totemrrp_deliver_fn ( - passive_instance-totemrrp_context, - passive_instance-token, - passive_instance-token_len); - passive_timer_expired_token_cancel (passive_instance); - } + unsigned int min_all, min_active; /* * Monitor for failures - * TODO doesn't handle wrap-around of the mcast recv count */ - passive_instance-mcast_recv_count[iface_no] += 1; + if (is_token_recv_count) { + recv_count = passive_instance-token_recv_count; + } else { + recv_count = passive_instance-mcast_recv_count; + } + + recv_count[iface_no] += 1; + max = 0; for (i = 0; i rrp_instance-interface_count; i++) { - if (max passive_instance-mcast_recv_count[i]) { - max = passive_instance-mcast_recv_count[i]; + if (max recv_count[i]) { + max = recv_count[i]; + } + } + + /* + * Max is larger then threshold - start adjusting process + */ + if (max PASSIVE_RECV_COUNT_THRESHOLD) { + min_all = min_active = recv_count[iface_no]; + + for (i = 0; i rrp_instance-interface_count; i++) { + if (recv_count[i] min_all) { + min_all = recv_count[i]; + } + + if (passive_instance-faulty[i] == 0 + recv_count[i] min_active) { + min_active = recv_count[i]; + } + } + + if (min_all 0) { + /* + * There is one or more faulty device with recv_count 0 + */ + for 
(i = 0; i rrp_instance-interface_count; i++) { + recv_count[i] -= min_all; + } + } else { + /* + * No faulty device with recv_count 0, adjust only active + * devices + */ + for (i = 0; i rrp_instance-interface_count; i++) { + if (passive_instance-faulty[i] == 0) { + recv_count[i] -= min_active; + } + } + } + + /* + * Find again max
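The adjustment idea in the patch (once the largest receive counter passes a threshold, subtract a common minimum from the counters so that their differences are preserved and they never wrap) can be sketched in isolation. This is a simplified illustration only: `normalize_counts` and `RECV_COUNT_THRESHOLD` are hypothetical names, and the sketch deliberately ignores the separate handling of faulty interfaces that the patch performs.

```c
#include <limits.h>

/* Hypothetical threshold, mirroring the INT_MAX / 2 value used in the patch. */
#define RECV_COUNT_THRESHOLD ((unsigned int)(INT_MAX / 2))

/*
 * Simplified sketch (not the corosync implementation): if the largest
 * counter exceeds the threshold, subtract the smallest counter from all
 * of them. The differences between interfaces stay the same, but the
 * absolute values move away from the overflow point.
 */
static void normalize_counts(unsigned int *counts, unsigned int n)
{
	unsigned int i;
	unsigned int max = 0;
	unsigned int min;

	for (i = 0; i < n; i++) {
		if (counts[i] > max) {
			max = counts[i];
		}
	}
	if (max <= RECV_COUNT_THRESHOLD) {
		return; /* nothing to adjust yet (also covers n == 0) */
	}

	min = counts[0];
	for (i = 1; i < n; i++) {
		if (counts[i] < min) {
			min = counts[i];
		}
	}
	for (i = 0; i < n; i++) {
		counts[i] -= min;
	}
}
```

Because only differences between interfaces matter when deciding whether one of them is lagging, shifting every counter by the same amount should not change the monitoring decision.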
Re: [Openais] [PATCH] totemconfig: Change default FAIL_TO_RECV_CONST
Reviewed-by: Steven Dake sd...@redhat.com On 07/15/2011 09:21 AM, Jan Friesse wrote: Previous default (50) was too low for most modern switch hardware. This may trigger abort because the aru doesn't increase for 50 token rotations combined with a defect in how failed to recv conditions are handled. By increasing this tunable, the condition should no longer trigger the errant code. Signed-off-by: Jan Friesse jfrie...@redhat.com --- exec/totemconfig.c |2 +- man/corosync.conf.5 |2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/exec/totemconfig.c b/exec/totemconfig.c index 5135672..80ca182 100644 --- a/exec/totemconfig.c +++ b/exec/totemconfig.c @@ -73,7 +73,7 @@ #define JOIN_TIMEOUT 50 #define MERGE_TIMEOUT200 #define DOWNCHECK_TIMEOUT1000 -#define FAIL_TO_RECV_CONST 50 +#define FAIL_TO_RECV_CONST 2500 #define SEQNO_UNCHANGED_CONST 30 #define MINIMUM_TIMEOUT (int)(1000/HZ)*3 #define MAX_NETWORK_DELAY50 diff --git a/man/corosync.conf.5 b/man/corosync.conf.5 index d092064..3f8e90e 100644 --- a/man/corosync.conf.5 +++ b/man/corosync.conf.5 @@ -380,7 +380,7 @@ This constant specifies how many rotations of the token without receiving any of the messages when messages should be received may occur before a new configuration is formed. -The default is 50 failures to receive a message. +The default is 2500 failures to receive a message. .TP seqno_unchanged_const ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
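To make the role of the constant concrete, here is a minimal sketch of a counter that trips after a given number of token rotations in which the aru has not advanced. It is an illustration under assumptions, not the totemsrp code: `struct recv_monitor` and `token_rotation` are hypothetical names, and the real condition in corosync involves more state than shown here.

```c
/*
 * Hypothetical monitor state (not a corosync structure): the last aru
 * value seen and the number of consecutive rotations without progress.
 */
struct recv_monitor {
	unsigned int last_aru;
	unsigned int unchanged_rotations;
};

/*
 * Called once per token rotation. Returns 1 when the aru has stayed
 * unchanged for fail_to_recv_const consecutive rotations, 0 otherwise.
 */
static int token_rotation(struct recv_monitor *m, unsigned int aru,
                          unsigned int fail_to_recv_const)
{
	if (aru != m->last_aru) {
		/* Progress was made, so start counting again. */
		m->last_aru = aru;
		m->unchanged_rotations = 0;
		return 0;
	}
	m->unchanged_rotations++;
	return m->unchanged_rotations >= fail_to_recv_const;
}
```

With a limit of 50, a switch that delays multicast for only 50 rotations is enough to trip the condition; a limit of 2500 tolerates a far longer stall before a new configuration is formed.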
Re: [Openais] [PATCH] rrp: handle rollover in active rrp properly
Reviewed-by: Steven Dake sd...@redhat.com On 07/15/2011 09:31 AM, Jan Friesse wrote: Signed-off-by: Jan Friesse jfrie...@redhat.com --- exec/totemrrp.c | 24 +++- 1 files changed, 23 insertions(+), 1 deletions(-) diff --git a/exec/totemrrp.c b/exec/totemrrp.c index 6fb5772..eb9b788 100644 --- a/exec/totemrrp.c +++ b/exec/totemrrp.c @@ -468,6 +468,22 @@ static void active_timer_problem_decrementer_cancel ( #define ENDIAN_LOCAL 0xff22 +/* + * Rollover handling: + * + * ARR_SEQNO_START_TOKEN is the starting sequence number of last seen sequence + * for a token for active redundand ring. This should remain zero, unless testing + * overflow in which case 07f00 or 0xff00 are good starting values. + * It should be same as on defined in totemsrp.c + */ + +#define ARR_SEQNO_START_TOKEN 0x0 + +/* + * These can be used ot test different rollover points + * #define ARR_SEQNO_START_MSG 0xfe00 + */ + struct message_header { char type; char encapsulated; @@ -1154,6 +1170,8 @@ void *active_instance_initialize ( instance-rrp_instance = rrp_instance; + instance-last_token_seq = ARR_SEQNO_START_TOKEN - 1; + error_exit: return ((void *)instance); } @@ -1342,7 +1360,7 @@ static void active_token_recv ( struct active_instance *active_instance = (struct active_instance *)rrp_instance-rrp_algo_instance; active_instance-totemrrp_context = context; - if (token_seq active_instance-last_token_seq) { + if (sq_lt_compare (active_instance-last_token_seq, token_seq)) { memcpy (active_instance-token, msg, msg_len); active_instance-token_len = msg_len; for (i = 0; i rrp_instance-interface_count; i++) { @@ -1353,6 +1371,10 @@ static void active_token_recv ( active_timer_expired_token_start (active_instance); } + /* + * This doesn't follow spec because the spec assumes we will know + * when token resets occur. 
+ */ active_instance-last_token_seq = token_seq; if (token_seq == active_instance-last_token_seq) { ___ Openais mailing list Openais@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/openais
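The reason a plain greater-than test on sequence numbers breaks at rollover, and how a wraparound-aware comparison avoids it, can be shown with a small sketch. Note the assumption: `seq_before` below is a hypothetical helper using generic serial-number arithmetic on 32-bit values; it is not corosync's sq_lt_compare, whose implementation may differ.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not corosync's sq_lt_compare): returns 1 if
 * sequence number a comes before b in modulo-2^32 order, i.e. b is
 * ahead of a by less than half of the number space, and 0 otherwise.
 */
static int seq_before(uint32_t a, uint32_t b)
{
	/* Unsigned subtraction wraps, so the distance stays small across rollover. */
	return a != b && (uint32_t)(b - a) < 0x80000000u;
}
```

With a plain comparison, a token sequence of 0x10 that follows 0xffffff00 would look older and be dropped; under the modular comparison it is correctly treated as newer.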
Re: [Openais] [TOTEM ] Process pause detected for XXX ms, flushing membership messages.
On 07/08/2011 02:03 AM, Vladislav Bogdanov wrote:

I checked the archives and found a patch from some time ago that was never merged. It wasn't verified to resolve the pause timeout problem, but it could indeed solve it. It wasn't merged because we lacked verification that it resolved the problem.

Great, I'll try it in the next few days. The good news is that the problem should be easily reproducible.

Hmm... Not so easily... I applied that patch to all physical hosts and have not seen that message for two days, regardless of the number of RX buffers in the adapter. But I also do not see it if I downgrade to the previous image (without that patch) :( Although I did not test that for long, only several hours. I didn't apply the patch to the VM, and do not see that message there either.

What I also did:
* Rescheduled the VM to a higher CPU priority (actually real-time)
* Assigned a higher blkio priority to that VM
* Assigned a low blkio priority to bulk resources on the node where that VM runs.

So the original problem seems to have different causes for the bare-metal and VM cases. For the former case the patch seems to be helpful.

It should help for the VM case too.

There were lots of '[TOTEM ] Retransmit List:' messages on bare-metal hosts until I returned the eth RX ring size back to 256 buffers (from 4096). After some thinking, this is probably correct, because more buffers add some latency, which is bad for corosync. Not sure why that would affect the NAPI polling rate, though. I'll try to upgrade the igb driver (the newer version has a tuning parameter, InterruptThrottleRate) and play again with ring buffers and that rate. Again, the driver version I currently have may have some bugs when operating with big buffer rings, which lead to 500ms blocking under high load.

BTW, are those Retransmit List: messages harmful?

These are only warning messages. They result in a duplicate message being retransmitted when it may not have needed to be. We are working to sort out how to remove these on some hardware environments.

Regards
-steve

Best,
Vladislav
Re: [Openais] [PATCH] totemiba: free send_buf on ibv_reg_mr failure
On 07/07/2011 02:06 AM, Jan Friesse wrote:

Signed-off-by: Jan Friesse jfrie...@redhat.com
---
 exec/totemiba.c | 2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/exec/totemiba.c b/exec/totemiba.c
index ec4ccfc..0b2d2ca 100644
--- a/exec/totemiba.c
+++ b/exec/totemiba.c
@@ -271,6 +271,7 @@ static inline struct send_buf *mcast_send_buf_get (
 		2048, IBV_ACCESS_LOCAL_WRITE);
 	if (send_buf->mr == NULL) {
 		log_printf (LOGSYS_LEVEL_ERROR, "couldn't register memory range\n");
+		free (send_buf);
 		return (NULL);
 	}
 	list_init (send_buf->list_all);
@@ -307,6 +308,7 @@ static inline struct send_buf *token_send_buf_get (
 		2048, IBV_ACCESS_LOCAL_WRITE);
 	if (send_buf->mr == NULL) {
 		log_printf (LOGSYS_LEVEL_ERROR, "couldn't register memory range\n");
+		free (send_buf);
 		return (NULL);
 	}
 	list_init (send_buf->list_all);

Reviewed-by: Steven Dake sd...@redhat.com

Thanks!
-steve
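The leak this patch closes is the usual "allocate, then fail a later step" pattern. A minimal sketch of the corrected error path follows; `fake_register` is a hypothetical stand-in for the memory registration call and the struct is a toy, so none of this is the totemiba code itself.

```c
#include <stdlib.h>

/* Toy buffer; the real struct send_buf in totemiba.c has more fields. */
struct send_buf {
	void *mr;
	char buffer[2048];
};

/* Hypothetical stand-in for memory registration; returns NULL on failure. */
static void *fake_register(void *addr, int fail)
{
	return fail ? NULL : addr;
}

static struct send_buf *send_buf_get(int fail_registration)
{
	struct send_buf *send_buf = malloc(sizeof(*send_buf));

	if (send_buf == NULL) {
		return NULL;
	}
	send_buf->mr = fake_register(send_buf->buffer, fail_registration);
	if (send_buf->mr == NULL) {
		/* Without this free, every failed registration leaks the buffer. */
		free(send_buf);
		return NULL;
	}
	return send_buf;
}
```

The caller only ever sees either a fully initialised buffer or NULL, so it never has to clean up a half-built object.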
[Openais] [PATCH] Speculatory patch that may correct tlbe...@gmail.com's reported problem
May not work at all or correct problem - would appreciate feedback

Signed-off-by: Steven Dake sd...@redhat.com
---
 exec/totemsrp.c | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/exec/totemsrp.c b/exec/totemsrp.c
index 3dcc05e..5a3bfaa 100644
--- a/exec/totemsrp.c
+++ b/exec/totemsrp.c
@@ -1809,7 +1809,7 @@ static void memb_state_operational_enter (struct totemsrp_instance *instance)
 		sizeof (struct srp_addr) * instance->my_memb_entries);
 
 	instance->my_failed_list_entries = 0;
-	instance->my_high_delivered = instance->my_aru;
+	instance->my_high_delivered = instance->my_high_received;
 
 	for (i = 0; i <= instance->my_high_delivered; i++) {
 		void *ptr;
--
1.7.4.4
[Openais] [PATCH] take 2 Speculatory patch that may correct tlbe...@gmail.com's reported problem
May not work at all or correct problem - would appreciate feedback

Signed-off-by: Steven Dake sd...@redhat.com
---
 exec/totemsrp.c | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/exec/totemsrp.c b/exec/totemsrp.c
index 3dcc05e..5a3bfaa 100644
--- a/exec/totemsrp.c
+++ b/exec/totemsrp.c
@@ -1809,7 +1809,7 @@ static void memb_state_operational_enter (struct totemsrp_instance *instance)
 		sizeof (struct srp_addr) * instance->my_memb_entries);
 
 	instance->my_failed_list_entries = 0;
-	instance->my_high_delivered = instance->my_aru;
+	instance->my_high_delivered = instance->my_high_received;
 
 	for (i = 0; i <= instance->my_high_delivered; i++) {
 		void *ptr;
--
1.7.4.4
Re: [Openais] Question about recovery code
On 07/07/2011 03:07 PM, Tim Beale wrote:

Hi Steve,

Thanks for your help. When we upgraded to v1.3.1 we picked up commit 8603ff6e9a270ecec194f4e13780927ebeb9f5b2:

totemsrp: free messages originated in recovery rather then rely on messages_free

Which is why I was retesting this issue. But I still see the problem even with the above change. The recovery code seems to work most of the time, but occasionally it doesn't free all of the recovery messages on the queue. It seems there are recovery messages left with seq numbers higher than instance->my_high_delivered / instance->my_aru. In the last crash I saw, there were 12 messages on the recovery queue but only 5 of them got freed by the above patch/code. I think a node leave event usually occurs at the same time.

I speculate there are gaps in the recovery queue. For example, my_aru = 5, but there are messages at 7 and 8, and my_high_seq_received = 8, which results in data slots being taken up in the new message queue. What should really happen is that these last messages are delivered after a transitional configuration to maintain SAFE agreement. We don't have support for SAFE at the moment, so it is probably safe just to throw these messages away. Could you test my speculatory patch against your test case?

Thanks!
-steve

I can reproduce the problem reasonably reliably in a 2-node cluster with:

#define TEST_DROP_ORF_TOKEN_PERCENTAGE 40
#define TEST_DROP_MCAST_PERCENTAGE 20

But I suspect it's reliant on timing/messaging specific to my system. Let me know if there's any debug or anything you want me to try out.

Thanks,
Tim

On Thu, Jul 7, 2011 at 3:47 PM, Steven Dake sd...@redhat.com wrote:

On 07/06/2011 05:24 PM, Tim Beale wrote:

Hi,

We've hit a problem in the recovery code and I'm struggling to understand why we do the following:

/*
 * The recovery sort queue now becomes the regular
 * sort queue. It is necessary to copy the state
 * into the regular sort queue.
 */
sq_copy (instance->regular_sort_queue, instance->recovery_sort_queue);

The problem we're seeing is that sometimes an encapsulated message from the recovery queue gets copied onto the regular queue, and corosync then crashes trying to process the message. (When it strips off the totemsrp header it gets another totemsrp header rather than the totempg header it expects.)

The problem seems to happen when we only do the sq_items_release() for a subset of the recovery messages, e.g. there are 12 messages on the recovery queue and we only free/release 5 of them. The remaining encapsulated recovery messages get left on the regular queue and corosync crashes trying to deliver them.

It looks to me like deliver_messages_from_recovery_to_regular() handles the encapsulation correctly, stripping the extra header and adding the recovery messages to the regular queue. But then the sq_copy() just seems to overwrite the regular queue.

We've avoided the crash in the past by just reiniting both queues, but I don't think this is the best solution.

I would expect this solution would lead to message loss or lockup of the protocol.

Any advice would be appreciated.

Thanks,
Tim

A proper fix should be in commit
master: 7d5e588931e4393c06790995a995ea69e6724c54
flatiron-1.3: 8603ff6e9a270ecec194f4e13780927ebeb9f5b2

A new flatiron-1.3 release is in the works. There are other totem bugs you may wish to backport in the meantime. Let us know if that commit fixes the problem you encountered.

Regards
-steve
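The gap scenario described in this thread (my_aru = 5, stray messages at 7 and 8) can be modelled with a toy queue. This is only a sketch under stated assumptions: `toy_queue`, `toy_release_through`, `toy_count` and `toy_example` are hypothetical names, and the array of flags stands in for the real sort queue, whose API is not reproduced here.

```c
#include <stddef.h>

#define TOY_QUEUE_SIZE 16

/* Toy model: one flag per sequence number, 1 = slot still holds a message. */
struct toy_queue {
	int in_use[TOY_QUEUE_SIZE];
};

/* Release every slot with sequence number 0..limit inclusive. */
static void toy_release_through(struct toy_queue *q, unsigned int limit)
{
	unsigned int i;

	for (i = 0; i <= limit && i < TOY_QUEUE_SIZE; i++) {
		q->in_use[i] = 0;
	}
}

/* Count the slots that still hold a message. */
static unsigned int toy_count(const struct toy_queue *q)
{
	unsigned int i, n = 0;

	for (i = 0; i < TOY_QUEUE_SIZE; i++) {
		n += q->in_use[i] ? 1u : 0u;
	}
	return n;
}

/* Messages 0..5 are contiguous (aru = 5); 6 is missing; 7 and 8 arrived. */
static struct toy_queue toy_example(void)
{
	struct toy_queue q = { { 0 } };
	unsigned int i;

	for (i = 0; i <= 5; i++) {
		q.in_use[i] = 1;
	}
	q.in_use[7] = 1;
	q.in_use[8] = 1;
	return q;
}
```

Releasing through 5 (the aru) leaves the two messages past the gap behind, while releasing through 8 (the highest sequence received) empties the queue, which is the difference the speculative patch is probing.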
[Openais] [PATCH] take 3 Speculatory patch that may correct tlbe...@gmail.com's reported problem
May not work at all or correct problem - would appreciate feedback

Signed-off-by: Steven Dake sd...@redhat.com
---
 exec/totemsrp.c | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/exec/totemsrp.c b/exec/totemsrp.c
index 3dcc05e..16de74d 100644
--- a/exec/totemsrp.c
+++ b/exec/totemsrp.c
@@ -1809,7 +1809,7 @@ static void memb_state_operational_enter (struct totemsrp_instance *instance)
 		sizeof (struct srp_addr) * instance->my_memb_entries);
 
 	instance->my_failed_list_entries = 0;
-	instance->my_high_delivered = instance->my_aru;
+	instance->my_high_delivered = instance->my_high_seq_received;
 
 	for (i = 0; i <= instance->my_high_delivered; i++) {
 		void *ptr;
--
1.7.4.4
Re: [Openais] [PATCH] Add more pause timeout resets
On 07/05/2011 04:51 PM, Russell Bryant wrote:

On Tue, Jul 5, 2011 at 2:14 PM, Steven Dake sd...@redhat.com wrote:

Signed-off-by: Steven Dake sd...@redhat.com
---
 exec/totemsrp.c | 14 ++
 1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/exec/totemsrp.c b/exec/totemsrp.c
index 3dcc05e..0194a7c 100644
--- a/exec/totemsrp.c
+++ b/exec/totemsrp.c
@@ -3501,6 +3501,8 @@ static int message_handler_orf_token (
 		cancel_heartbeat_timeout(instance);
 	}
 
+	timer_function_pause_timeout (instance);
+	timer_function_pause_timeout (instance);
 	return (0); /* discard token */
 }

Is this duplicate on purpose?

No, but it won't cause harm.
Re: [Openais] startup error - getpwnam_r() returns ERANGE for some systems
On 07/05/2011 07:20 PM, Tim Beale wrote:

Hi,

We've just upgraded to corosync v1.3.1 and struck a problem with corosync failing to start up. The problem is that the getpwnam_r()/getgrnam_r() calls return ERANGE on our system, meaning insufficient buffer space was supplied (the expected buffer length is 256, rather than 250).

I don't know much about this code, but judging by the man page for getpwnam_r, the correct way to determine the buffer size on any given system is to use sysconf(). Attached is a patch that does this.

Cheers,
Tim

Thanks for the work - it is appreciated. I have reviewed and merged your patch.

Regards
-steve
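The attached patch is not reproduced in this archive, so the following is only a sketch of the approach the message describes: ask sysconf() for a buffer size hint instead of hard-coding one. The helper name `lookup_uid`, the 1024-byte fallback, the retry-on-ERANGE loop and the 1 MiB cap are assumptions added for illustration, not details taken from the patch.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <pwd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: look up the uid for a user name.
 * Returns 0 on success, -1 if the user is unknown or on error.
 */
static int lookup_uid(const char *name, uid_t *uid_out)
{
	/* sysconf() may return -1 when the system has no fixed limit. */
	long hint = sysconf(_SC_GETPW_R_SIZE_MAX);
	size_t len = (hint > 0) ? (size_t)hint : 1024; /* fallback is an assumption */

	for (;;) {
		struct passwd pwd;
		struct passwd *result = NULL;
		char *buf = malloc(len);
		int rc;

		if (buf == NULL) {
			return -1;
		}
		rc = getpwnam_r(name, &pwd, buf, len, &result);
		if (rc == 0 && result != NULL) {
			*uid_out = pwd.pw_uid;
			free(buf);
			return 0;
		}
		free(buf);
		if (rc != ERANGE || len > (size_t)1 << 20) {
			return -1; /* not found, a real error, or buffer cap reached */
		}
		len *= 2; /* buffer too small: grow and retry */
	}
}
```

The same shape applies to getgrnam_r() with `_SC_GETGR_R_SIZE_MAX`.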
Re: [Openais] Fix compile/runtime issues for _POSIX_THREAD_PROCESS_SHARED 1
On 07/05/2011 07:22 PM, Tim Beale wrote:

Hi,

Another issue we found when upgrading was that the code doesn't compile when _POSIX_THREAD_PROCESS_SHARED is 1. When it does compile, it crashes on our system: our version of uClibc seems to always expect a 4th arg. The man page suggests the 4th arg is optional, but does say: 'For greater portability it is best to always call semctl() with four arguments'. The attached patch does this.

Cheers,
Tim

Tim,

Thanks for the work. Is this only a uClibc problem on Linux?

Reviewed-by: Steven Dake sd...@redhat.com

Thanks
-steve
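For readers unfamiliar with the four-argument form, here is a minimal sketch that always passes a `union semun` to semctl(), including for commands that ignore it. It assumes Linux, where the caller must define `union semun` itself (some other systems define it in the header, in which case the definition below would clash). The helper name `sem_roundtrip` is hypothetical and this is not the attached patch.

```c
#define _XOPEN_SOURCE 700

#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* On Linux the calling program has to define this union itself. */
union semun {
	int val;
	struct semid_ds *buf;
	unsigned short *array;
};

/*
 * Hypothetical helper: create a private semaphore, set it to `value`,
 * read it back and remove it again. Returns the value read, or -1 on error.
 */
static int sem_roundtrip(int value)
{
	union semun arg;
	int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
	int got;

	if (semid == -1) {
		return -1;
	}
	arg.val = value;
	if (semctl(semid, 0, SETVAL, arg) == -1) {
		got = -1;
	} else {
		arg.val = 0;
		got = semctl(semid, 0, GETVAL, arg); /* 4th arg passed although unused */
	}
	arg.val = 0;
	semctl(semid, 0, IPC_RMID, arg); /* likewise for removal */
	return got;
}
```

Passing the fourth argument everywhere costs nothing on libcs that treat it as optional and avoids reading garbage on those that always fetch it.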
[Openais] [PATCH 2/4] build: make RDMA support an RPM build conditional
From: Florian Haas florian.h...@linbit.com

Enable RDMA in RPM builds by default to maintain the previous behavior (which always included --enable-rdma in the %configure invocation).
---
 corosync.spec.in | 5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/corosync.spec.in b/corosync.spec.in
index aec13c6..d5bdeb6 100644
--- a/corosync.spec.in
+++ b/corosync.spec.in
@@ -10,6 +10,7 @@
 %bcond_with monitoring
 %bcond_with snmp
 %bcond_with dbus
+%bcond_without rdma
 
 Name: corosync
 Summary: The Corosync Cluster Engine and Application Programming Interfaces
@@ -36,7 +37,9 @@ Conflicts: openais <= 0.89, openais-devel <= 0.89
 BuildRequires: autoconf automake
 %endif
 BuildRequires: nss-devel
+%if %{with rdma}
 BuildRequires: libibverbs-devel librdmacm-devel
+%endif
 %if %{with snmp}
 BuildRequires: net-snmp-devel
 %endif
@@ -75,7 +78,9 @@ export rdmacm_LIBS=-lrdmacm \
 %if %{with dbus}
 	--enable-dbus \
 %endif
+%if %{with rdma}
 	--enable-rdma \
+%endif
 	--with-initddir=%{_initrddir}
 
 make %{_smp_mflags}
--
1.7.4.4
[Openais] [PATCH 3/4] build: set RDMA related _LIBS and _CFLAGS only if building with RDMA support
From: Florian Haas florian.h...@linbit.com

Having to force {ibverbs,rdmacm}_{LIBS,CFLAGS} looks positively odd, so this may warrant further review. However, they are definitely not needed if building without RDMA support.
---
 corosync.spec.in | 2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/corosync.spec.in b/corosync.spec.in
index d5bdeb6..34e1658 100644
--- a/corosync.spec.in
+++ b/corosync.spec.in
@@ -57,10 +57,12 @@ BuildRoot: %(mktemp -ud %{_tmppath}/%{name}-%{version}-%{release}-XX)
 ./autogen.sh
 %endif
 
+%if %{with rdma}
 export ibverbs_CFLAGS=-I/usr/include/infiniband \
 export ibverbs_LIBS=-libverbs \
 export rdmacm_CFLAGS=-I/usr/include/rdma \
 export rdmacm_LIBS=-lrdmacm \
+%endif
 %{configure} \
 	--enable-nss \
 %if %{with testagents}
--
1.7.4.4
[Openais] [PATCH 4/4] build: disable RDMA support in RPMs by default
From: Florian Haas florian.h...@linbit.com

Rather than curiously disable RDMA support by default in configure and enable it by default in RPM builds, streamline the default configuration to always turn RDMA support off. It can be enabled in RPM builds with --with rdma.
---
 corosync.spec.in | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/corosync.spec.in b/corosync.spec.in
index 34e1658..9585831 100644
--- a/corosync.spec.in
+++ b/corosync.spec.in
@@ -10,7 +10,7 @@
 %bcond_with monitoring
 %bcond_with snmp
 %bcond_with dbus
-%bcond_without rdma
+%bcond_with rdma
 
 Name: corosync
 Summary: The Corosync Cluster Engine and Application Programming Interfaces
--
1.7.4.4
Re: [Openais] [PATCH 4/4] build: disable RDMA support in RPMs by default
On 07/06/2011 06:52 AM, Steven Dake wrote:

From: Florian Haas florian.h...@linbit.com

Rather than curiously disable RDMA support by default in configure and enable it by default in RPM builds, streamline the default configuration to always turn RDMA support off. It can be enabled in RPM builds with --with rdma.
---
 corosync.spec.in | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

Reviewed-by: Steven Dake sd...@redhat.com

diff --git a/corosync.spec.in b/corosync.spec.in
index 34e1658..9585831 100644
--- a/corosync.spec.in
+++ b/corosync.spec.in
@@ -10,7 +10,7 @@
 %bcond_with monitoring
 %bcond_with snmp
 %bcond_with dbus
-%bcond_without rdma
+%bcond_with rdma
 
 Name: corosync
 Summary: The Corosync Cluster Engine and Application Programming Interfaces
Re: [Openais] [GIT PULL] Minor fixes for RPM builds
On 07/06/2011 07:08 AM, Florian Haas wrote:

On 2011-07-06 15:59, Steven Dake wrote:

On 07/06/2011 06:56 AM, Florian Haas wrote:

On 2011-07-06 15:49, Steven Dake wrote:

Florian,

I'll take improvements however I can get them, but sending patches to the list is preferred; that way multiple people can look at them.

Arguably that counts for github too, as my repo happens to be quite public. :)

The way I generally do this is

git send-email --to=open...@lists.osdl.org --smtp-server=server -3

where -3 is the last 3 patches. The to and smtp server can be set in gitconfig as well.

Fair enough, but do you actually prefer to git am each patch by hand? Wouldn't it make more sense to post the patches first, then, once they are reviewed and acknowledged, fix up the git tree so you can merge easily, and then send a pull request?

I do like git am, however, I am open to changes. I am not sure how to amend a commit in a patch set to include a Reviewed-by line. Git am lets me amend per patch. Any tips here?

Hmmm. You can merge from my repo into yours, then use git rebase -i base rev to edit commit messages and add your Reviewed-By lines. But the downside is that this creates new changesets in place of mine, and then I have to reset my tree to match yours after you've pushed your changes.

I think what's most often done is that the contributor posts patches first, gets review and testing feedback, then the _contributor_ adds Reviewed-By, Tested-By, etc., and issues a pull request. The maintainer then pulls, and no further changes to the commits are necessary.

Does that sound workable?

Florian

Yup, that works for me if you prefer to work that way.

Regards
-steve
Re: [Openais] [PATCH 1/4] build: force LC_ALL=C correctly for dates
Thanks for the patch

Reviewed-by: Steven Dake sd...@redhat.com

On 07/06/2011 06:52 AM, Steven Dake wrote:

From: Florian Haas florian.h...@linbit.com

Failure to force C dates will have RPM et al. complain about invalid dates and timestamps.
---
 Makefile.am | 4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/Makefile.am b/Makefile.am
index 0929ca7..252caf1 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -123,7 +123,7 @@ clean-generic:
 
 $(SPEC): $(SPEC).in
 	rm -f $@-t $@
-	LC_ALL=C date=$(shell date +%a %b %d %Y) \
+	date=$(shell LC_ALL=C date +%a %b %d %Y) \
 	if [ -f .tarball-version ]; then \
 		gitver=$(shell cat .tarball-version) \
 		rpmver=$$gitver \
@@ -190,7 +190,7 @@ gen_start_date = 2000-01-01
 .PHONY: gen-ChangeLog
 gen-ChangeLog:
 	if test -d .git; then \
-		$(top_srcdir)/build-aux/gitlog-to-changelog \
+		LC_ALL=C $(top_srcdir)/build-aux/gitlog-to-changelog \
 			--since=$(gen_start_date) $(distdir)/cl-t;\
 		rm -f $(distdir)/ChangeLog; \
 		mv $(distdir)/cl-t $(distdir)/ChangeLog;\
Re: [Openais] [PATCH 2/4] build: make RDMA support an RPM build conditional
On 07/06/2011 01:02 PM, Florian Haas wrote:

On 07/06/2011 03:52 PM, Steven Dake wrote:

From: Florian Haas florian.h...@linbit.com

Enable RDMA in RPM builds by default to maintain the previous behavior (which always included --enable-rdma in the %configure invocation).

Steve, seeing that you acked all the others, any objections to this one? I didn't get your Reviewed-by here. Should I leave this one out when I fix up my tree for you to pull?

Cheers,
Florian

Hmm. I did push it - I have a lot of email open in the morning :)

Reviewed-by: Steven Dake sd...@redhat.com

Thanks
-steve
Re: [Openais] Question about recovery code
On 07/06/2011 05:24 PM, Tim Beale wrote:

Hi,

We've hit a problem in the recovery code and I'm struggling to understand why we do the following:

/*
 * The recovery sort queue now becomes the regular
 * sort queue. It is necessary to copy the state
 * into the regular sort queue.
 */
sq_copy (instance->regular_sort_queue, instance->recovery_sort_queue);

The problem we're seeing is that sometimes an encapsulated message from the recovery queue gets copied onto the regular queue, and corosync then crashes trying to process the message. (When it strips off the totemsrp header it gets another totemsrp header rather than the totempg header it expects.)

The problem seems to happen when we only do the sq_items_release() for a subset of the recovery messages, e.g. there are 12 messages on the recovery queue and we only free/release 5 of them. The remaining encapsulated recovery messages get left on the regular queue and corosync crashes trying to deliver them.

It looks to me like deliver_messages_from_recovery_to_regular() handles the encapsulation correctly, stripping the extra header and adding the recovery messages to the regular queue. But then the sq_copy() just seems to overwrite the regular queue.

We've avoided the crash in the past by just reiniting both queues, but I don't think this is the best solution.

I would expect this solution would lead to message loss or lockup of the protocol.

Any advice would be appreciated.

Thanks,
Tim

A proper fix should be in commit
master: 7d5e588931e4393c06790995a995ea69e6724c54
flatiron-1.3: 8603ff6e9a270ecec194f4e13780927ebeb9f5b2

A new flatiron-1.3 release is in the works. There are other totem bugs you may wish to backport in the meantime. Let us know if that commit fixes the problem you encountered.

Regards
-steve
Re: [Openais] [TOTEM ] Process pause detected for XXX ms, flushing membership messages.
On 07/05/2011 07:26 AM, Vladislav Bogdanov wrote:

Hi all,

For the last few days I have been seeing the following messages in the logs:

[TOTEM ] Process pause detected for XXX ms, flushing membership messages.

After that the ring is quickly re-established. DLM/clvmd notices this and switches to kern_stop, waiting for fencing to be done. What dlm_tool ls reports is really strange, though: flags and members differ between nodes. I have dumps of what has been happening in dlm, and there are messages that fencing was done! On the other hand, pacemaker does not notice anything, so fencing is not done. This is rather strange, but that is for another list.

Can anybody please explain what exactly that message means and what the correct reaction of upper services should be? Can it be caused solely by network problems? Can the number of buffers in the RX ring of the ethernet card influence this (I did some tuning there some time ago)?

corosync 1.3.1, UDPU transport.
pacemaker-1.1-devel
dlm_controld.pcmk from 3.0.17
clvmd 2.02.85
clusterlib-3.1.1

This indicates that the kernel has paused scheduling of corosync, or that corosync has blocked, for the time value printed in the message. Corosync is non-blocking. Are you running inside a VM? Increasing token is probably a necessity when running inside a VM on a heavily loaded host, because kvm does not schedule as fairly as bare metal. Please provide feedback on whether this is bare metal or a VM.

Regards
-steve

Best,
Vladislav
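To make the "process pause" idea concrete: a process that expects to run at regular intervals can compare a monotonic clock against the time it last ran, and treat an unexpectedly large gap as evidence that it was descheduled or blocked. The sketch below shows that general technique only. The names `pause_detector`, `pause_check` and `now_ms` are hypothetical, and this is not the totemsrp implementation or its threshold.

```c
#define _POSIX_C_SOURCE 200809L

#include <stdint.h>
#include <time.h>

/* Hypothetical detector state: last time we ran, and the allowed gap. */
struct pause_detector {
	uint64_t last_ms;
	uint64_t threshold_ms;
};

/* Current monotonic time in milliseconds (unaffected by wall-clock changes). */
static uint64_t now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000u + (uint64_t)ts.tv_nsec / 1000000u;
}

/*
 * Record that we ran at time `now`. Returns the gap in milliseconds when it
 * exceeds the threshold (a suspected pause), or 0 when the gap is acceptable.
 */
static uint64_t pause_check(struct pause_detector *d, uint64_t now)
{
	uint64_t gap = now - d->last_ms;

	d->last_ms = now;
	return (gap > d->threshold_ms) ? gap : 0;
}
```

In a main loop one would call `pause_check(&d, now_ms())` on every iteration and log a non-zero result; taking the timestamp as a parameter keeps the check easy to exercise with synthetic times.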