[Openais] Announcing Corosync 1.4.0

2011-07-18 Thread Jan Friesse
Corosync 1.4.0 is available for immediate download from our website. This version brings many enhancements to the software, but the most visible change is redundant ring auto-recovery functionality. Please retrieve the latest sources from our website: http://www.corosync.org Regards, Honza

Re: [Openais] Announcing Corosync 1.4.0

2011-07-18 Thread Jan Friesse
Proskurin Kirill wrote: On 07/18/2011 06:37 PM, Jan Friesse wrote: Corosync 1.4.0 is available for immediate download from our website. Great news! Should we consider it stable, ready for production use and easy to update? :-) Yep, definitely. It's simply a continuation of our

[Openais] [PATCH 3/3] specfile: Install corosync-signals.conf for dbus

2011-07-19 Thread Jan Friesse
Signed-off-by: Jan Friesse jfrie...@redhat.com --- corosync.spec.in |5 + 1 files changed, 5 insertions(+), 0 deletions(-) diff --git a/corosync.spec.in b/corosync.spec.in index 823ad3d..74ab851 100644 --- a/corosync.spec.in +++ b/corosync.spec.in @@ -92,6 +92,11 @@ rm -rf %{buildroot

[Openais] [PATCH 1/3] specfile: Correct URL and source0

2011-07-19 Thread Jan Friesse
Signed-off-by: Jan Friesse jfrie...@redhat.com --- corosync.spec.in |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/corosync.spec.in b/corosync.spec.in index e1dcf19..37e53ed 100644 --- a/corosync.spec.in +++ b/corosync.spec.in @@ -18,8 +18,8 @@ Version: @version

Re: [Openais] [PATCH 2/3] specfile: use _datadir as var expansion not exec

2011-07-20 Thread Jan Friesse
Steven Dake wrote: On 07/19/2011 08:01 AM, Jan Friesse wrote: Signed-off-by: Jan Friesse jfrie...@redhat.com --- corosync.spec.in |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/corosync.spec.in b/corosync.spec.in index 37e53ed..823ad3d 100644

[Openais] [PATCH] main: let poll really stop before totempg_finalize

2011-07-25 Thread Jan Friesse
Signed-off-by: Jan Friesse jfrie...@redhat.com --- exec/main.c | 24 +++- 1 files changed, 15 insertions(+), 9 deletions(-) diff --git a/exec/main.c b/exec/main.c index be9e118..1c4fb37 100644 --- a/exec/main.c +++ b/exec/main.c @@ -184,6 +184,8 @@ static int32_t

[Openais] Announcing Corosync 1.4.1 and 1.3.3 available at ftp.corosync.org!

2011-07-26 Thread Jan Friesse
I am pleased to announce the latest maintenance releases of Corosync, 1.3.3 and 1.4.1, available immediately from our website at http://www.corosync.org. This release mainly fixes a problem with Retransmit list errors occurring even when the network is perfectly OK. The bug appears only under high CPU load or on a weak CPU and

Re: [Openais] Corosync crash at startup - (Type of received message is wrong)

2011-07-27 Thread Jan Friesse
This is probably because one node uses secauth (the one with the "invalid digest ..." messages) and the second node doesn't (the one with "Type of received message is wrong"). Proskurin Kirill wrote: Hello all. Just caught a fully reproducible crash of corosync 1.4.1 OS: CentOS 5.3 i386 RPMS: Corosync-1.4.1

[Openais] [PATCH] Revert totemsrp: Remove recv_flush code

2011-07-27 Thread Jan Friesse
This reverts commit 2167. The reversion is needed to eliminate overflow of receive buffers and dropped messages. Signed-off-by: Jan Friesse jfrie...@redhat.com --- branches/whitetank/exec/totemnet.c | 45 - branches/whitetank/exec/totemnet.h |2 + branches/whitetank/exec

[Openais] [PATCH] coroipcc: use malloc for path in service_connect

2011-07-27 Thread Jan Friesse
...@gmail.com Signed-off-by: Jan Friesse jfrie...@redhat.com --- lib/coroipcc.c | 67 +-- 1 files changed, 40 insertions(+), 27 deletions(-) diff --git a/lib/coroipcc.c b/lib/coroipcc.c index 14860e2..54d9aa7 100644 --- a/lib/coroipcc.c +++ b/lib

[Openais] [PATCH 2/2] cfg: Handle errors from totem_mcast

2011-07-28 Thread Jan Friesse
The totem_mcast function can return -1 if corosync is overloaded. Sadly, in many calls of this function the error code was either not handled at all or handled by assert. This commit changes the behaviour to either return CS_ERR_TRY_AGAIN or pass the error code to later layers to handle it. Signed-off-by: Jan Friesse

[Openais] [PATCH 1/2] cpg: Handle errors from totem_mcast

2011-07-28 Thread Jan Friesse
The totem_mcast function can return -1 if corosync is overloaded. Sadly, in many calls of this function the error code was either not handled at all or handled by assert. This commit changes the behaviour to either return CS_ERR_TRY_AGAIN or pass the error code to later layers to handle it. Signed-off-by: Jan Friesse

[Openais] [PATCH] cpg: Handle errors from totem_mcast

2011-07-29 Thread Jan Friesse
The totem_mcast function can return -1 if corosync is overloaded. Sadly, in many calls of this function the error code was either not handled at all or handled by assert. This commit changes the behaviour to either return CS_ERR_TRY_AGAIN or pass the error code to later layers to handle it. Signed-off-by: Jan Friesse
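
From the client's point of view, a CS_ERR_TRY_AGAIN result usually means "retry a bit later". The following is only a sketch of such a retry loop, not code from the patch: `send_with_retry` and `make_overloaded_sender` are hypothetical helpers, and the numeric values are assumed to mirror corosync's cs_error_t enum (check corotypes.h before relying on them).

```python
import time

# Assumed to match corosync's cs_error_t enum; verify against corotypes.h.
CS_OK = 1
CS_ERR_TRY_AGAIN = 6


def send_with_retry(send, payload, attempts=5, delay=0.01):
    """Call send(payload) until it stops returning CS_ERR_TRY_AGAIN.

    `send` is a hypothetical stand-in for a library call such as a CPG
    multicast; it must return a cs_error_t-style integer.
    """
    result = CS_ERR_TRY_AGAIN
    for _ in range(attempts):
        result = send(payload)
        if result != CS_ERR_TRY_AGAIN:
            break
        time.sleep(delay)  # back off briefly while the daemon is overloaded
    return result


def make_overloaded_sender(busy_calls):
    """Build a fake sender that reports overload for the first `busy_calls` calls."""
    state = {"calls": 0}

    def send(payload):
        state["calls"] += 1
        return CS_ERR_TRY_AGAIN if state["calls"] <= busy_calls else CS_OK

    return send
```

A bounded number of attempts keeps the caller from spinning forever if the daemon stays overloaded; the last error code is returned so the caller can decide what to do next.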

Re: [Openais] [PATCH 3/3] corosync.conf.example: include comments

2011-07-29 Thread Jan Friesse
than that, Reviewed-by: Jan Friesse jfrie...@redhat.com (on all 3 of them) Florian Haas wrote: It's nice to say people should read the man page. It's also naive to assume that they always do. Include comments in the example config file itself. Signed-off-by: Florian Haas florian.h...@linbit.com

[Openais] Corosync 2.0 Feature Request: Replace objdb/confdb with something easier to use

2011-08-08 Thread Jan Friesse
The current objdb/confdb is really hard to use because of all the iteration, ... It would be nice to replace it with a hash table, so that a simple get item or set item needs no iteration. But iteration functionality should still be there somehow, to allow the user to select, for example, all totem.*
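
To make the idea concrete, here is a minimal sketch of a flat key/value store with direct get/set and prefix iteration. It only illustrates the request above; `ConfigMap` and its method names are hypothetical and are not the API that was eventually proposed.

```python
class ConfigMap:
    """Flat key/value store; keys are dotted strings such as 'totem.token'."""

    def __init__(self):
        self._items = {}

    def set(self, key, value):
        # Direct set: no walking of nested objects is needed.
        self._items[key] = value

    def get(self, key, default=None):
        # Direct get: a single hash lookup.
        return self._items.get(key, default)

    def iter_prefix(self, prefix):
        """Yield (key, value) pairs whose key starts with `prefix`, in sorted key order."""
        for key in sorted(self._items):
            if key.startswith(prefix):
                yield key, self._items[key]
```

For example, after setting `totem.token` and `totem.rrp_mode`, `iter_prefix("totem.")` yields both entries while a plain `get("totem.token")` needs no iteration at all.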

[Openais] [PATCH] Allow compile master on RHEL 6

2011-08-09 Thread Jan Friesse
corosync_timer_handle_t is now conditionally defined to prevent a double definition causing a compile failure on RHEL 6 systems. Signed-off-by: Jan Friesse jfrie...@redhat.com --- exec/timer.h |3 +++ include/corosync/engine/coroapi.h |4 2 files changed, 7

Re: [Openais] CPG client can lockup if the local node is in the downlist

2011-08-17 Thread Jan Friesse
Angus Salkeld wrote: On Wed, Aug 17, 2011 at 01:19:53PM +1200, Tim Beale wrote: Hi, I'm resending this patch in a separate thread because I think this part of the cluster formation problems I'm seeing has been overlooked. The patch attached is one way of addressing the problem, but I'm

[Openais] [PATCH 3/3] whitetank ipc: Don't deadlock in ipc_disconnect

2011-08-17 Thread Jan Friesse
The ipc_disconnect function is converted into a call to a new function with the locked parameter set to 0. Signed-off-by: Jan Friesse jfrie...@redhat.com --- branches/whitetank/exec/ipc.c | 20 +--- 1 files changed, 17 insertions(+), 3 deletions(-) diff --git a/branches/whitetank/exec/ipc.c b

[Openais] [PATCH] totemconfig: Check interfaces address integrity

2011-08-19 Thread Jan Friesse
Two interfaces (in RRP mode) shouldn't have equal unicast or multicast addresses. Signed-off-by: Jan Friesse jfrie...@redhat.com --- exec/totemconfig.c | 24 +++- 1 files changed, 23 insertions(+), 1 deletions(-) diff --git a/exec/totemconfig.c b/exec/totemconfig.c index
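
The kind of check described here can be sketched roughly as follows. This is an illustration only, not the code from the patch: `find_address_conflicts` is a hypothetical helper, and treating `bindnetaddr` as the unicast address of a ring is an assumption.

```python
import ipaddress


def find_address_conflicts(interfaces):
    """Return (i, j, field) for every pair of interfaces sharing an address.

    `interfaces` is a list of dicts with 'bindnetaddr' and 'mcastaddr'
    string values, one dict per ring.
    """
    conflicts = []
    for i, first in enumerate(interfaces):
        for j in range(i + 1, len(interfaces)):
            second = interfaces[j]
            for field in ("bindnetaddr", "mcastaddr"):
                # Parse before comparing so equivalent spellings match.
                a = ipaddress.ip_address(first[field])
                b = ipaddress.ip_address(second[field])
                if a == b:
                    conflicts.append((i, j, field))
    return conflicts
```

An empty result means no two rings share a unicast or multicast address; any entry names the two interface indexes and the clashing field.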

Re: [Openais] Corosync 2.0 Feature Request: Replace objdb/confdb with something easier to use

2011-08-25 Thread Jan Friesse
Fabio M. Di Nitto wrote: On 08/25/2011 06:31 AM, Angus Salkeld wrote: On Thu, Aug 25, 2011 at 05:16:20AM +0200, Fabio M. Di Nitto wrote: On 08/25/2011 04:56 AM, Angus Salkeld wrote: Possible Solutions == 1] API We really just want to get/set values do we really need a

[Openais] [PATCH 1/2] rrp: Handle endless loop if all ifaces are faulty

2011-08-29 Thread Jan Friesse
If all interfaces were faulty, passive_mcast_flush_send and related functions ended in an endless loop. This is now handled: if there is no live interface, the message is dropped. Signed-off-by: Jan Friesse jfrie...@redhat.com --- exec/totemrrp.c | 29 - 1 files changed
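
The fix can be pictured with a small sketch: bound the search for a live interface to one pass over all interfaces, and drop the message when the pass finds nothing. This is a simplified illustration, not the totemrrp.c code; `next_live_interface` and `passive_send` are hypothetical names.

```python
def next_live_interface(faulty, current):
    """Return the index of the next non-faulty interface after `current`, or None.

    `faulty` is a list of booleans, one per interface. At most len(faulty)
    candidates are examined, so the search terminates even when every
    interface is faulty.
    """
    count = len(faulty)
    for step in range(1, count + 1):
        candidate = (current + step) % count
        if not faulty[candidate]:
            return candidate
    return None


def passive_send(faulty, current, message, transmit):
    """Send via the next live interface; drop the message if there is none.

    Returns the interface index used, or None if the message was dropped.
    """
    target = next_live_interface(faulty, current)
    if target is None:
        return None  # no live interface: drop instead of looping forever
    transmit(target, message)
    return target
```

The important design point is the bounded loop: an unbounded "skip faulty, try next" scan never exits once every interface is marked faulty.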

[Openais] [PATCH 2/2] rrp: Higher threshold in passive mode for mcast

2011-08-29 Thread Jan Friesse
. Variable is unused in active mode. Signed-off-by: Jan Friesse jfrie...@redhat.com --- exec/totemconfig.c | 11 +++ exec/totemrrp.c|6 -- exec/totemsrp.c|3 +++ include/corosync/totem/totem.h |2 ++ man/corosync.conf.5

[Openais] [PATCH] wt: Ignore memb_join messages during flush operations

2011-09-01 Thread Jan Friesse
Backport of corosync patch: a memb_join operation that occurs during flushing can result in an entry into the GATHER state from the RECOVERY state. This results in the regular sort queue being used instead of the recovery sort queue, resulting in a segfault. Signed-off-by: Jan Friesse jfrie

[Openais] [PATCH] wt: Ignore memb_join messages during flush operations

2011-09-01 Thread Jan Friesse
Backport of corosync patch: a memb_join operation that occurs during flushing can result in an entry into the GATHER state from the RECOVERY state. This results in the regular sort queue being used instead of the recovery sort queue, resulting in a segfault. Signed-off-by: Jan Friesse jfrie

[Openais] Configuration Hash Table - API proposal

2011-09-01 Thread Jan Friesse
Included is an API proposal for a replacement of the objdb/confdb API. It should keep all the good things there (triggers, ...), remove hard-to-use bits (like the whole object idea) and improve existing things (like typing). As I wrote before, the configuration file will also need to change. The proposed change is

Re: [Openais] [PATCH] Ignore memb_join messages during flush operations

2011-09-02 Thread Jan Friesse
Reviewed-by: Jan Friesse jfrie...@redhat.com Steven Dake wrote: a memb_join operation that occurs during flushing can result in an entry into the GATHER state from the RECOVERY state. This results in the regular sort queue being used instead of the recovery sort queue, resulting

[Openais] [PATCH] totemconfig: change minimum RRP threshold

2011-09-08 Thread Jan Friesse
The RRP threshold can be a value lower than 5. Signed-off-by: Jan Friesse jfrie...@redhat.com --- exec/totemconfig.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/exec/totemconfig.c b/exec/totemconfig.c index f767f69..a475bb3 100644 --- a/exec/totemconfig.c +++ b/exec

Re: [Openais] [PATCH] Resolve a deadlock between the timer and serialize locks.

2011-09-09 Thread Jan Friesse
Reviewed-by: Jan Friesse jfrie...@redhat.com Russell Bryant wrote: This patch resolves a deadlock between the serialize lock (in exec/main.c) and the timer lock (in exec/timer.c). I observed this deadlock happening fairly quickly on a cluster using the EVT service from OpenAIS. (OpenAIS

Re: [Openais] Can't get udpu to work with basic 2-node Corosync cluster.

2013-01-02 Thread Jan Friesse
Rosser, actually, the problem is much simpler to solve than you would expect. If you look closely at your config, there is: member { mamberaddr: 10.198.156.47 ^ } member { memberaddr: 10.198.156.48 } You have mAmberaddr instead of mEmberaddr ;)
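
A typo like this is easy to miss by eye, so a quick scan of the config can help. The sketch below is a hypothetical helper, not a corosync tool: it uses a simple regular expression rather than the real config parser, and only looks for `member { ... }` blocks that lack a `memberaddr` key.

```python
import re

# Matches a flat `member { ... }` block; nested braces are not handled.
MEMBER_BLOCK = re.compile(r"\bmember\s*\{([^{}]*)\}")
# Matches `key:` names inside a block.
KEY = re.compile(r"([A-Za-z_]+)\s*:")


def members_missing_memberaddr(config_text):
    """Return the key lists of member blocks that have no `memberaddr` key."""
    bad = []
    for block in MEMBER_BLOCK.findall(config_text):
        keys = KEY.findall(block)
        if "memberaddr" not in keys:
            bad.append(keys)
    return bad
```

Run against the two member blocks quoted above, it reports one offending block whose only key is `mamberaddr`.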

Re: [Openais] Hawk 0.5.2 Debian packages

2013-02-26 Thread Jan Friesse
Great news! Regards, Honza Charles Williams wrote: Hey all, I recently got a chance to finally build Debian packages for the 0.5.2 version of ClusterLabs Hawk GUI. These are Squeeze packages ATM (Wheezy to come next week dependent upon testing of the current packages) and I am

Re: [Openais] binding to corosync

2013-03-28 Thread Jan Friesse
Hi, corosync functions never return errno errors. They return errors like CS_ERR_*. What return value did you get from cmap_initialize? Regards, Honza eXeC001er wrote: Hello. I tried to create an application that uses corosync via its libraries, but it seems something is wrong.

Re: [Openais] Question about corosync mcastaddr setting

2013-04-25 Thread Jan Friesse
Moullé Alain wrote: Hi, corosync-1.4.1-7 with two rings in corosync.conf , and rrp mode active, is it recommended to have two distinct mcastaddr ? You can choose to have either two distinct mcastaddr values or distinct ports (don't use port +- 1). (and if so, where can I find this
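
Read literally, the rule in this reply is: two rings are fine if their multicast addresses differ, or if their ports differ by more than 1. A minimal sketch of that reading follows; `rings_conflict` is a hypothetical helper and the interpretation of "port +- 1" as "ports within 1 of each other" is an assumption.

```python
def rings_conflict(ring_a, ring_b):
    """Return True when two rings would clash under the rule above.

    Each ring is a (mcastaddr, mcastport) tuple. A clash needs BOTH the
    same multicast address AND ports that are equal or adjacent.
    """
    same_addr = ring_a[0] == ring_b[0]
    ports_too_close = abs(ring_a[1] - ring_b[1]) <= 1
    return same_addr and ports_too_close
```

So the same address with ports 5405 and 5406 is flagged, while the same address with ports 5405 and 5407, or different addresses with the same port, is not.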

Re: [Openais] Question about corosync mcastaddr setting

2013-04-25 Thread Jan Friesse
Moullé Alain wrote: Hi, you can choose ... meaning that it is not mandatory ? and my configuration is correct anyway ? No, your configuration is not correct. You can choose... means binary OR. So (table) same_mcast_addr | same_port +- 1 | works 0

Re: [Openais] Question about corosync mcastaddr setting

2013-04-26 Thread Jan Friesse
. Alain On 25/04/2013 17:33, Jan Friesse wrote: Moullé Alain wrote: Hi, you can choose ... meaning that it is not mandatory ? and my configuration is correct anyway ? No, your configuration is not correct. You can choose... means binary OR. So (table) same_mcast_addr | same_port +- 1

Re: [Openais] Question about corosync mcastaddr setting

2013-04-29 Thread Jan Friesse
eth IF ? Thanks for all information. Alain On 25/04/2013 17:33, Jan Friesse wrote: Moullé Alain wrote: Hi, you can choose ... meaning that it is not mandatory ? and my configuration is correct anyway ? No, your configuration is not correct. You can choose... means binary OR. So

Re: [Openais] Question about corosync mcastaddr setting

2013-05-21 Thread Jan Friesse
;-) : which is the last official release available of the corosync rpm ? and where can I get this last release ? (can't find it on clusterlabs site) Thanks Alain On 17/05/2013 07:55, Jan Friesse wrote: Moullé Alain wrote: Hi Jan, thanks for all your accurate answers. Again some questions

Re: [Openais] About corosync and libibverbs-devel librdmacm-devel dependancies

2013-05-29 Thread Jan Friesse
Alain, Moullé Alain wrote: Hi Jan, Just for my information, I would like only to know the dependencies between corosync and both IB libs libibverbs-devel librdmacm-devel : Yes in which configuration corosync needs functions from these both IB libs ? only if there is a heartbeat

Re: [Openais] About corosync and libibverbs-devel librdmacm-devel dependancies

2013-05-29 Thread Jan Friesse
are compiling from source and don't pass the configure parameter --enable-rdma, corosync is built without RDMA code and without dependencies on the IB libraries. Honza Thanks again Alain On 29/05/2013 15:53, Jan Friesse wrote: Alain, Moullé Alain wrote: Hi Jan, Just for my information, I would

Re: [Openais] [Corosync] seg fault when corosync starts

2013-09-19 Thread Jan Friesse
Aarti, can you please try to install the debug information and include a backtrace from the coredump? Regards, Honza Aarti Sawant wrote: hello, I am trying to setup HA on centos6.4 lxc container. i have installed pacemaker and corosync on this container. my settings of

Re: [Openais] [Corosync] seg fault when corosync starts

2013-09-23 Thread Jan Friesse
in ?? () #6 0x00410d55 in __libc_csu_init () #7 0x76dfcc70 in __libc_start_main () from /lib64/libc.so.6 #8 0x00404539 in _start () Thanks, Aarti Sawant NTT DATA OSS Center Pune On Thu, Sep 19, 2013 at 7:32 PM, Jan Friesse jfrie...@redhat.com wrote: Aarti

Re: [Openais] Problem and Question about corosync

2013-11-18 Thread Jan Friesse
Moullé Alain wrote: Hi, with corosync.1.2.3-36 (with Pacemaker) on a 4 nodes HA cluster, we got 1.2.3-36 is the problem. This was the last release WITHOUT official support for RRP. a strange and random problem : For some reason that we can't identify in the syslog, one node (let's say

Re: [Openais] Problem and Question about corosync

2013-11-19 Thread Jan Friesse
recommend you (if possible) to really consider updating to the latest RHEL (probably wait for 6.5 where again A HUGE amount of fixes are available). Another possibility may be to not use RRP and consider bonding. Regards, Honza Thanks a lot for your help. Alain Moullé On 18/11/2013 15:50, Jan

Re: [Openais] Problem and Question about corosync

2013-11-19 Thread Jan Friesse
problems. If you really need to use it, please consider bonding. 1.4.1 - Only passive mode is fully supported. Honza Alain On 19/11/2013 09:54, Jan Friesse wrote: Alain, Moullé Alain wrote: Hi Jan, If you don't mind , I need more details about 1.2.3-36: I just checked again the man

Re: [Openais] Request of information about rrp mode passive versus rrp mode active

2013-11-27 Thread Jan Friesse
Alain, passive mode is much better tested. Another big plus of passive mode is that if one network becomes faulty, passive still makes progress (one packet is sent through the active device, another via the faulty one; that one is not delivered but is resent via the active device, ...). Active RRP waits until enough failures and

Re: [Openais] Request of information about rrp mode passive versus rrp mode active

2013-12-02 Thread Jan Friesse
we re-start HA stack on only one node of the HA cluster, does it also switch back to first ring , or does it remain on the ring currently used by the other(s) node(s) Thanks Alain Regards, Honza On 27/11/2013 16:54, Jan Friesse wrote: Alain, passive mode is much better tested

Re: [Openais] Error: - Need help! cib: [1539]: WARN: cib_peer_callback: Discarding cib_modify message (3) from lxds05: not in our membership

2014-05-19 Thread Jan Friesse
Stefan, On 16 May 2014, at 11:11 pm, Senftleben, Stefan (itsc) stefan.senftle...@itsc.de wrote: Hello, I hope that someone can help me… I have a two node pacemaker cluster, with two corosync rings. Ubuntu 10.04, 64 bit. Pacemaker 1.0.8+hg15494-2ubuntu2, corosync 1.2.0-0ubuntu1. It

Re: [Openais] Newbie clustering questions

2014-06-05 Thread Jan Friesse
Per, it looks like none of your questions is really corosync related (so I'm CC'ing linux clustering linux-clus...@redhat.com; this is really a better list), but I will try to answer at least some of your questions. Hi all I have redhat clustering running on a 3 VMware vm's 2 nodes and 1

Re: [Openais] unmanaged resource failed - how to get back?

2014-06-30 Thread Jan Friesse
Stefan, sending to the Pacemaker list because your question seems not to be Corosync related. Regards, Honza Senftleben, Stefan (itsc) wrote: Hello, I set the cluster in a maintenance mode with: crm configure property maintenance-mode=true . Afterwards I did stop one resource manually,

Re: [Openais] Meaning of ais failure exit reason code 254

2016-02-12 Thread Jan Friesse
Ajit, Ajit Singh1 via Openais wrote: Hi All, I am facing an issue in my application, It is unable to wake up due to ais failure. I am getting following error in my OS(system log). == *ais.service: main process exited, code=exited, status=254* *Unit ais.service entered failed state*

Re: [Openais] Openais / Corosync Question

2016-02-12 Thread Jan Friesse
Michael, Michael Weiner wrote: Hello all, Just to start we are running a custom script at boot to change our corosync.conf configuration files on boot based on a IP ping of the gateway. The corosync script being run after boot should be run simultaneously if configurations are changing,

Re: [Openais] Unable to start cluster (Pacemaker/Corosync)

2017-05-09 Thread Jan Friesse
Hi there, I am currently trying to configure Pacemaker/Corosync. I managed to install the required packages for the cluster configuration, however I could not start the cluster service. Based on the log file, there was an issue with the directory /var/lib/pacemaker/. I have tried some
