[Bug 1318441] Re: Precise corosync dies if failed_to_recv is set

2014-07-03 Thread Launchpad Bug Tracker
This bug was fixed in the package corosync - 1.4.2-2ubuntu0.2

---
corosync (1.4.2-2ubuntu0.2) precise; urgency=medium

  * Fixed consensus being empty in case failed_to_recv is set (LP: #1318441)
 -- Rafael David Tinoco rafael.tin...@canonical.com   Mon, 12 May 2014 09:37:06 -0500

** Changed in: corosync (Ubuntu Precise)
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to corosync in Ubuntu.
https://bugs.launchpad.net/bugs/1318441

Title:
  Precise corosync dies if failed_to_recv is set

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/corosync/+bug/1318441/+subscriptions

-- 
Ubuntu-server-bugs mailing list
Ubuntu-server-bugs@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/ubuntu-server-bugs


[Bug 1318441] Re: Precise corosync dies if failed_to_recv is set

2014-07-03 Thread Rafael David Tinoco
** Changed in: corosync (Ubuntu Precise)
 Assignee: Rafael David Tinoco (inaddy) => (unassigned)



[Bug 1318441] Re: Precise corosync dies if failed_to_recv is set

2014-07-01 Thread Rafael David Tinoco
Brian,

I've run several tests on this and everything works as expected.
Changing the tag.

Thanks

** Tags removed: verification-needed
** Tags added: verification-done



[Bug 1318441] Re: Precise corosync dies if failed_to_recv is set

2014-05-15 Thread Brian Murray
Hello Rafael, or anyone else affected,

Accepted corosync into precise-proposed. The package will build now and
be available at
http://launchpad.net/ubuntu/+source/corosync/1.4.2-2ubuntu0.2 in a few
hours, and then in the -proposed repository.

Please help us by testing this new package.  See
https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how
to enable and use -proposed.  Your feedback will aid us in getting this
update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested, and change the tag
from verification-needed to verification-done. If it does not fix the
bug for you, please add a comment stating that, and change the tag to
verification-failed.  In either case, details of your testing will help
us make a better decision.

Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification .  Thank you in
advance!

** Changed in: corosync (Ubuntu Precise)
   Status: In Progress => Fix Committed

** Tags added: verification-needed



[Bug 1318441] Re: Precise corosync dies if failed_to_recv is set

2014-05-15 Thread Launchpad Bug Tracker
** Branch linked: lp:ubuntu/precise-proposed/corosync



[Bug 1318441] Re: Precise corosync dies if failed_to_recv is set

2014-05-12 Thread Rafael David Tinoco
Attaching patch.

** Patch added: corosync_1.4.2-2ubuntu0.2.diff
   https://bugs.launchpad.net/ubuntu/+source/corosync/+bug/1318441/+attachment/4110673/+files/corosync_1.4.2-2ubuntu0.2.diff

** Description changed:

  [Impact]
  
-  * On certain conditions corosync daemon may quit if it detects itself as not
-being able to receive messages. The logic asserts the existence of at least
-one functional node but the node is marking itself as a failed node (not
-following the specification). It is safe not to assert this if failed_to_recv
-is set.
+  * On certain conditions *precise* corosync daemon may quit if it detects itself
+as not being able to receive messages. The logic asserts the existence of
+at least one functional node but the node is marking itself as a failed node
+(not following the specification). It is safe not to assert this if
+failed_to_recv is set.
  
  [Test Case]
  
-  * Using corosync test suite on precise-test machine:
+  * Using corosync test suite on precise-test machine:
  
-- Make sure to set ssh keys so precise-test can access precise-cluster-{01,02}.
-- Make sure only failed-to-receive-crash.sh is executable on tests dir.
-- Make sure precise-cluster-{01,02} nodes have build-dep for corosync installed.
-- sudo ./run-tests.sh -c flatiron -n precise-cluster-01 precise-cluster-02
-- Check corosync log messages to see precise-cluster-01 corosync dying.
+    - Make sure to set ssh keys so precise-test can access precise-cluster-{01,02}.
+    - Make sure only failed-to-receive-crash.sh is executable on tests dir.
+    - Make sure precise-cluster-{01,02} nodes have build-dep for corosync installed.
+    - sudo ./run-tests.sh -c flatiron -n precise-cluster-01 precise-cluster-02
+    - Check corosync log messages to see precise-cluster-01 corosync dying.
  
  [Regression Potential]
  
-  * We are not asserting the existence of at least 1 node in corosync cluster
-anymore. Since there is always 1 node in the cluster (the node itself) it
-is very unlikely this change alters corosync logic for membership. If it 
-does it is likely corosync will recover from the error and reestablish new 
-membership (with 1 or more nodes).
+  * We are not asserting the existence of at least 1 node in corosync cluster
+    anymore. Since there is always 1 node in the cluster (the node itself) it
+    is very unlikely this change alters corosync logic for membership. If it
+    does it is likely corosync will recover from the error and reestablish new
+    membership (with 1 or more nodes).
  
  [Other Info]
  
-  * n/a
+  * n/a
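
The condition described under [Impact] can be illustrated with a small model. This is only a hedged sketch, not corosync's code: the function name `consensus_agreed`, its parameters, and the `patched` switch are hypothetical, and Python sets stand in for the daemon's membership lists. It shows why asserting "at least one functional node" fails once a node has put itself on the failed list, and how skipping the assertion when failed_to_recv is set avoids the crash.

```python
def consensus_agreed(proc_list, failed_list, consensus, failed_to_recv, patched=True):
    """Simplified model of a membership consensus check.

    Hypothetical illustration only; names and structure are assumptions,
    not corosync's real API.
    """
    # Nodes still considered functional: known processors minus failed ones.
    token_members = set(proc_list) - set(failed_list)

    # Agreement holds only if every functional node has reported consensus.
    # With an empty functional set this is trivially True.
    agreed = all(consensus.get(node, False) for node in token_members)

    if patched and agreed and failed_to_recv:
        # The local node declared itself failed, so the functional set may be
        # empty. Skip the "at least one node" assertion in that case.
        return agreed

    # Unpatched behaviour: an empty functional set trips the assertion,
    # which in a daemon means the process aborts.
    assert len(token_members) >= 1, "no functional node left"
    return agreed


if __name__ == "__main__":
    # A node that failed to receive and marked itself as failed.
    print(consensus_agreed({"node1"}, {"node1"}, {}, failed_to_recv=True))
```

With `patched=False` the same call raises `AssertionError`, which mirrors the daemon dying in the "before the patch" test.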



[Bug 1318441] Re: Precise corosync dies if failed_to_recv is set

2014-05-12 Thread Rafael David Tinoco
 Tests before the patch:

#
# NODE 1
#

--- MARKER --- ./failed-to-receive-crash.sh at 2014-05-09-17:33:04 --- MARKER ---
May 09 17:33:04 corosync [MAIN]:  ] Corosync Cluster Engine ('1.4.2'): started and ready to provide service.
May 09 17:33:04 corosync [MAIN]:  ] Corosync built-in features: nss
May 09 17:33:04 corosync [MAIN]:  ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
May 09 17:33:04 corosync [TOTEM]: ] Initializing transport (UDP/IP Multicast).
May 09 17:33:04 corosync [TOTEM]: ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
May 09 17:33:04 corosync [TOTEM]: ] The network interface [192.168.168.1] is now up.
May 09 17:33:04 corosync [SERV]:  ] Service engine loaded: openais checkpoint service B.01.01
May 09 17:33:04 corosync [SERV]:  ] Service engine loaded: corosync extended virtual synchrony service
May 09 17:33:04 corosync [SERV]:  ] Service engine loaded: corosync configuration service
May 09 17:33:04 corosync [SERV]:  ] Service engine loaded: corosync cluster closed process group service v1.01
May 09 17:33:04 corosync [SERV]:  ] Service engine loaded: corosync cluster config database access v1.01
May 09 17:33:04 corosync [SERV]:  ] Service engine loaded: corosync profile loading service
May 09 17:33:04 corosync [SERV]:  ] Service engine loaded: corosync cluster quorum service v0.1
May 09 17:33:04 corosync [MAIN]:  ] Compatibility mode set to whitetank.  Using V1 and V2 of the synchronization engine.
May 09 17:33:04 corosync [TOTEM]: ] A processor joined or left the membership and a new membership was formed.
May 09 17:33:04 corosync [CPG]:   ] chosen downlist: sender r(0) ip(192.168.168.1) ; members(old:0 left:0)
May 09 17:33:04 corosync [MAIN]:  ] Completed service synchronization, ready to provide service.
May 09 17:33:05 corosync [TOTEM]: ] A processor joined or left the membership and a new membership was formed.
May 09 17:33:05 corosync [CPG]:   ] chosen downlist: sender r(0) ip(192.168.168.1) ; members(old:1 left:0)
May 09 17:33:05 corosync [MAIN]:  ] Completed service synchronization, ready to provide service.
May 09 17:33:10 corosync [TOTEM]: ] FAILED TO RECEIVE

# COROSYNC HAS DIED BEFORE TEST CASE TRIES TO STOP IT

root@precise-cluster-01:~# ps -ef | grep corosync
root  1414  1306  0 17:31 pts/0    00:00:00 tail -f /var/log/cluster/corosync.log
root  4712  1306  0 17:33 pts/0    00:00:00 grep --color=auto corosync

 Tests after the patch:

May 11 22:27:48 corosync [MAIN]:  ] Corosync Cluster Engine ('1.4.2'): started and ready to provide service.
May 11 22:27:48 corosync [MAIN]:  ] Corosync built-in features: nss
May 11 22:27:48 corosync [MAIN]:  ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
May 11 22:27:48 corosync [TOTEM]: ] Initializing transport (UDP/IP Multicast).
May 11 22:27:48 corosync [TOTEM]: ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
May 11 22:27:48 corosync [TOTEM]: ] The network interface [192.168.168.1] is now up.
May 11 22:27:48 corosync [SERV]:  ] Service engine loaded: openais checkpoint service B.01.01
May 11 22:27:48 corosync [SERV]:  ] Service engine loaded: corosync extended virtual synchrony service
May 11 22:27:48 corosync [SERV]:  ] Service engine loaded: corosync configuration service
May 11 22:27:48 corosync [SERV]:  ] Service engine loaded: corosync cluster closed process group service v1.01
May 11 22:27:48 corosync [SERV]:  ] Service engine loaded: corosync cluster config database access v1.01
May 11 22:27:49 corosync [SERV]:  ] Service engine loaded: corosync profile loading service
May 11 22:27:49 corosync [SERV]:  ] Service engine loaded: corosync cluster quorum service v0.1
May 11 22:27:49 corosync [MAIN]:  ] Compatibility mode set to whitetank.  Using V1 and V2 of the synchronization engine.
May 11 22:27:49 corosync [TOTEM]: ] A processor joined or left the membership and a new membership was formed.
May 11 22:27:49 corosync [CPG]:   ] chosen downlist: sender r(0) ip(192.168.168.1) ; members(old:0 left:0)
May 11 22:27:49 corosync [MAIN]:  ] Completed service synchronization, ready to provide service.
May 11 22:27:49 corosync [TOTEM]: ] A processor joined or left the membership and a new membership was formed.
May 11 22:27:49 corosync [CPG]:   ] chosen downlist: sender r(0) ip(192.168.168.1) ; members(old:1 left:0)
May 11 22:27:49 corosync [MAIN]:  ] Completed service synchronization, ready to provide service.
May 11 22:27:54 corosync [TOTEM]: ] FAILED TO RECEIVE
May 11 22:27:55 corosync [TOTEM]: ] A processor joined or left the membership and a new membership was formed.
May 11 22:27:55 corosync [CPG]:   ] chosen downlist: sender r(0) ip(192.168.168.1) ; members(old:2 left:1)
May 11 22:27:55 corosync [MAIN]:  ] Completed service synchronization, ready to provide service.
May 11 22:27:57 corosync [TOTEM]: ] A
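
The difference between the two logs above is whether a new membership is formed after the FAILED TO RECEIVE line. A minimal sketch of how that check might be automated follows. The helper `survived_failed_to_receive` is hypothetical and not part of the corosync test suite. It assumes the log text uses the message wording shown above.

```python
def survived_failed_to_receive(log_text):
    """Report whether a new membership was formed after FAILED TO RECEIVE.

    Hypothetical helper for illustration. Returns None when the log has no
    FAILED TO RECEIVE entry, True when a membership message follows it,
    and False when the log ends without one.
    """
    # Collapse all whitespace so that wrapped log entries are matched too.
    flat = " ".join(log_text.split())

    marker = "FAILED TO RECEIVE"
    position = flat.find(marker)
    if position == -1:
        return None

    # Only look at what was logged after the failure marker.
    tail = flat[position + len(marker):]
    return "a new membership was formed" in tail


if __name__ == "__main__":
    sample = (
        "May 11 22:27:54 corosync [TOTEM]: ] FAILED TO RECEIVE \n"
        "May 11 22:27:55 corosync [TOTEM]: ] A processor joined or left the membership \n"
        "and a new membership was formed. \n"
    )
    print(survived_failed_to_receive(sample))
```

A `False` result is consistent with the daemon having died, but it should still be confirmed with `ps`, as in the "before the patch" test.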

[Bug 1318441] Re: Precise corosync dies if failed_to_recv is set

2014-05-12 Thread Chris J Arges
** Also affects: corosync (Ubuntu Precise)
   Importance: Undecided
   Status: New

** Changed in: corosync (Ubuntu Precise)
 Assignee: (unassigned) => Rafael David Tinoco (inaddy)

** Changed in: corosync (Ubuntu)
   Status: In Progress => Fix Released

** Changed in: corosync (Ubuntu Precise)
   Status: New => In Progress

** Changed in: corosync (Ubuntu Precise)
   Importance: Undecided => Medium

** Changed in: corosync (Ubuntu)
 Assignee: Rafael David Tinoco (inaddy) => (unassigned)



[Bug 1318441] Re: Precise corosync dies if failed_to_recv is set

2014-05-12 Thread Chris J Arges
Sponsored for Precise.



[Bug 1318441] Re: Precise corosync dies if failed_to_recv is set

2014-05-11 Thread Rafael David Tinoco
** Description changed:

- If node detects itself not able to receive message it asserts the number of failed members considering itself and dies.
- I'll write more information (and the fix) in a few minutes.
+ If node detects itself not able to receive message it asserts the number
+ of failed members considering itself and dies.
+ 
+ - Testing bugfix. To be released soon.
