Launchpad has imported 4 comments from the remote bug at
https://bugzilla.redhat.com/show_bug.cgi?id=586752.

If you reply to an imported comment from within Launchpad, your comment
will be sent to the remote bug automatically. Read more about
Launchpad's inter-bugtracker facilities at
https://help.launchpad.net/InterBugTracking.

------------------------------------------------------------------------
On 2010-04-28T10:12:15+00:00 Oliver wrote:

Created attachment 409748
Andrew Beekhof's patch to fix this issue

Description of problem:
dlm_controld.pcmk segfaults on startup if network uses vlan, bonding or 
bridging and corosync/pacemaker is invoked too early

Version-Release number of selected component (if applicable):
bug and patch tested on 3.0.7 ubuntu lucid packages

How reproducible:
Configure any of the above on top of the raw interface and start corosync 
before the network settles.

Additional info:
The issue is discussed here 
http://oss.clusterlabs.org/pipermail/pacemaker/2010-April/005954.html
 
Andrew Beekhof <[email protected]> posted the attached patch that fixes this 
issue.


gdb output is:
Core was generated by `dlm_controld.pcmk -q 0'.
Program terminated with signal 11, Segmentation fault.
#0  __strlen_sse2 () at ../sysdeps/x86_64/multiarch/../strlen.S:31
        in ../sysdeps/x86_64/multiarch/../strlen.S
#0  __strlen_sse2 () at ../sysdeps/x86_64/multiarch/../strlen.S:31
#1  0x00007f499565cd46 in *__GI___strdup (s=0x0) at strdup.c:42
#2  0x0000000000403f0c in dlm_process_node (key=<value optimized out>, 
value=0x1864a30, user_data=0x62a4f8) at 
/usr/src/packages/redhat-cluster/3.0.7/redhat-cluster-3.0.7/group/dlm_controld/pacemaker.c:136
#3  0x00007f4995cdbd73 in IA__g_hash_table_foreach (hash_table=0x1866050, 
func=0x403e40 <dlm_process_node>, user_data=0x62a4f8) at 
/build/buildd/glib2.0-2.24.0/glib/ghash.c:1325
#4  0x0000000000403c9e in update_cluster () at 
/usr/src/packages/redhat-cluster/3.0.7/redhat-cluster-3.0.7/group/dlm_controld/pacemaker.c:82
#5  0x0000000000415a4a in loop () at 
/usr/src/packages/redhat-cluster/3.0.7/redhat-cluster-3.0.7/group/dlm_controld/main.c:986
#6  0x000000000041659c in main (argc=<value optimized out>, argv=<value 
optimized out>) at 
/usr/src/packages/redhat-cluster/3.0.7/redhat-cluster-3.0.7/group/dlm_controld/main.c:1295


hth,
Oliver

Reply at: https://bugs.launchpad.net/ubuntu/+source/redhat-cluster/+bug/571612/comments/0

------------------------------------------------------------------------
On 2010-04-28T12:08:13+00:00 Andrew wrote:

Patch fa24b46 resolving this issue has been committed in cluster.git
   
http://git.fedorahosted.org/git/?p=cluster.git;a=commitdiff;h=fa24b460c51aa0c47d0842703feea8bca0ed66b7

Essentially, the dlm was trying to create a configfs entry for a node with no 
address.
This led to a NULL pointer being dereferenced and the dlm crashing.

The above-mentioned patch now checks for a valid address before
continuing.
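
For illustration only, here is a minimal sketch of that kind of guard. It is
not the committed patch: struct example_node, example_copy_node_addr() and
example_process_node() are hypothetical names invented for this sketch. It
assumes only what the backtrace shows, namely that strdup() was reached with
a NULL string for a node that had no address yet.

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup() under strict -std= modes */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a cluster node record; not the real dlm type. */
struct example_node {
    int id;
    const char *addr;  /* NULL until the node has an address */
};

/*
 * Return a heap-allocated copy of the node's address, or NULL if the node
 * has no address yet. The caller frees the result.
 */
char *example_copy_node_addr(const struct example_node *node)
{
    if (node == NULL || node->addr == NULL) {
        /* Passing NULL to strdup() is undefined behavior, so bail out. */
        return NULL;
    }
    return strdup(node->addr);
}

/* Return 1 if the node was processed, 0 if it was skipped. */
int example_process_node(const struct example_node *node)
{
    char *addr = example_copy_node_addr(node);

    if (addr == NULL) {
        fprintf(stderr, "node %d has no address yet, skipping\n",
                node ? node->id : -1);
        return 0;
    }

    /* Real code would create the configfs entry here. */
    printf("node %d -> %s\n", node->id, addr);
    free(addr);
    return 1;
}
```

With this guard, a node whose addr is still NULL is skipped and can be picked
up on a later membership update instead of crashing the daemon.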

Reply at: https://bugs.launchpad.net/ubuntu/+source/redhat-cluster/+bug/571612/comments/1

------------------------------------------------------------------------
On 2010-04-29T13:22:44+00:00 Andrew wrote:

Sorry, set the wrong status.

Reply at: https://bugs.launchpad.net/ubuntu/+source/redhat-cluster/+bug/571612/comments/3

------------------------------------------------------------------------
On 2010-07-30T11:29:34+00:00 Bug wrote:


This bug appears to have been reported against 'rawhide' during the Fedora 14 
development cycle.
Changing version to '14'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Reply at: https://bugs.launchpad.net/ubuntu/+source/redhat-cluster/+bug/571612/comments/4


** Changed in: redhatcluster
       Status: Unknown => Fix Released

** Changed in: redhatcluster
   Importance: Unknown => Medium

-- 
You received this bug notification because you are a member of Ubuntu
High Availability Team, which is subscribed to redhat-cluster in Ubuntu.
https://bugs.launchpad.net/bugs/571612

Title:
  dlm_controld.pcmk segfault

Status in Red Hat Cluster:
  Fix Released
Status in redhat-cluster package in Ubuntu:
  Invalid

Bug description:
  Anyone who uses link aggregation (me), bridging, or VLANs is affected
  because of the time required to bring up the network after reboot.
  Corosync comes up and dlm segfaults. This has been fixed upstream, and
  the fix is included in Maverick+.

  Upstream bug report and patch [1]. Patch committed upstream [2].
  Discussion about the issue [3].

  [1]: https://bugzilla.redhat.com/show_bug.cgi?id=586752
  [2]: http://git.fedorahosted.org/git/?p=cluster.git;a=commitdiff;h=fa24b460c51aa0c47d0842703feea8bca0ed66b7
  [3]: http://oss.clusterlabs.org/pipermail/pacemaker/2010-April/005954.html

To manage notifications about this bug go to:
https://bugs.launchpad.net/redhatcluster/+bug/571612/+subscriptions

_______________________________________________
Mailing list: https://launchpad.net/~ubuntu-ha
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~ubuntu-ha
More help   : https://help.launchpad.net/ListHelp
