** Description changed:

+ [SRU justification]
+ Without this patch, multipathd may crash with SIGSEGV when trying to add a
+ map that already exists.
+ 
+ [Impact]
+ multipathd crashes with SIGSEGV.
+ A typical trace of this situation is a message similar to the following in
+ /var/log/syslog:
+ 
+ multipathd: 360060160164034004cd59cfdb22ce611: failed in domap for
+ addition of new path sdr
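
To check whether a host has hit this condition, the log can be scanned for the message above. The following is a minimal sketch, assuming the message sits on a single line in syslog (the wrap above comes from the mail formatting). The function name and the timestamp and host name in the sample line are made up for illustration.

```python
import re

# Assumption: optional "[pid]" after the daemon name, message on one line.
DOMAP_FAILURE = re.compile(
    r"multipathd(?:\[\d+\])?: (?P<wwid>\S+): failed in domap for "
    r"addition of new path (?P<path>\S+)"
)


def find_domap_failures(lines):
    """Return (wwid, path) pairs for every line matching the failure message."""
    hits = []
    for line in lines:
        match = DOMAP_FAILURE.search(line)
        if match:
            hits.append((match.group("wwid"), match.group("path")))
    return hits


# Illustrative line: only the part after the host name is from the report.
sample = [
    "Jan 18 10:00:00 host multipathd: 360060160164034004cd59cfdb22ce611: "
    "failed in domap for addition of new path sdr"
]
print(find_domap_failures(sample))
```

On a real system the same function can be fed the open file object for /var/log/syslog instead of the sample list.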
+ 
+ [Fix]
+ Check if the map already exists and do a RELOAD in domap() instead of failing.
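
The decision described above can be modelled as follows. This is only an illustration of the stated behaviour, not the C code from the patch: the enum, the function name and the set of existing maps are hypothetical stand-ins for what domap() and device-mapper do.

```python
from enum import Enum


class Action(Enum):
    CREATE = "create"
    RELOAD = "reload"


def choose_action(requested, alias, existing_maps):
    """Turn a CREATE into a RELOAD when a map with this alias already exists.

    Hypothetical model: existing_maps stands in for a device-mapper lookup.
    """
    if requested is Action.CREATE and alias in existing_maps:
        return Action.RELOAD
    return requested


# A CREATE for a map that is already present becomes a RELOAD.
print(choose_action(Action.CREATE, "mpatha", {"mpatha"}))
# A CREATE for a new map stays a CREATE.
print(choose_action(Action.CREATE, "mpathb", {"mpatha"}))
```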
+ 
+ [Test Case]
+ The problem was encountered in a complex OpenStack test environment, where a
+ test tool does the following:
+ - First, it boots a number of virtual machines.
+ - Then it creates a number of threads; each thread creates volumes, takes
+   snapshots of the volumes, and attaches the volumes to the initially booted
+   virtual machines. After a short while the volumes are detached, and the
+   snapshots and volumes are deleted.
+ 
+ Running this tool overnight normally results in the multipathd SEGV
+ situation.
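
The shape of such a test tool can be sketched as below. The actual tool is not part of this report, so everything here is an assumption: FakeCloud is an in-memory stand-in for the cloud API, and all class, method and function names are hypothetical. A real run would replace FakeCloud with calls into the cloud under test and add the short wait between attach and detach.

```python
import threading


class FakeCloud:
    """In-memory stand-in for the cloud API; every method is hypothetical."""

    def __init__(self):
        self._lock = threading.Lock()
        self._next_id = 0
        self.volumes = {}    # volume id -> attached VM name, or None
        self.snapshots = {}  # snapshot id -> volume id

    def _new_id(self, prefix):
        with self._lock:
            self._next_id += 1
            return f"{prefix}-{self._next_id}"

    def create_volume(self):
        vol = self._new_id("vol")
        with self._lock:
            self.volumes[vol] = None
        return vol

    def create_snapshot(self, vol):
        snap = self._new_id("snap")
        with self._lock:
            self.snapshots[snap] = vol
        return snap

    def attach(self, vol, vm):
        with self._lock:
            self.volumes[vol] = vm

    def detach(self, vol):
        with self._lock:
            self.volumes[vol] = None

    def delete_snapshot(self, snap):
        with self._lock:
            del self.snapshots[snap]

    def delete_volume(self, vol):
        with self._lock:
            del self.volumes[vol]


def worker(cloud, vm, iterations):
    """One thread: create, snapshot, attach, detach, then delete, repeatedly."""
    for _ in range(iterations):
        vol = cloud.create_volume()
        snap = cloud.create_snapshot(vol)
        cloud.attach(vol, vm)
        cloud.detach(vol)
        cloud.delete_snapshot(snap)
        cloud.delete_volume(vol)


def run_stress(vm_count, thread_count, iterations):
    """Boot the (fake) VMs first, then run the worker threads against them."""
    cloud = FakeCloud()
    vms = [f"vm-{i}" for i in range(vm_count)]
    threads = [
        threading.Thread(target=worker, args=(cloud, vms[i % vm_count], iterations))
        for i in range(thread_count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return cloud
```

The attach and detach steps are the ones that add and remove paths on the host, which is where multipathd is exercised.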
+ 
+ [Regression]
+ This is a straight backport of the code used in 0.5.0. No regression is
+ expected.
+ 
+ It is important to note that the reproducer in the original description
+ did not lead to such a problem.
+ 
+ [Original description of the problem]
+ 
  We have a problem on multipath-tools.
  
  Usually after a path removal and a re-scan, the multipathd process dies.
  
  I created 2 hosts:
  
  iscsi-server
  iscsi-client
  
  With 4 NICs in between them and with a simple multibus multipath. With
  that I was able to check that there is a regression in multipath-tools.
  
  It looks like the patches brought from upstream:
  
  0017-multipath-get-right-sysfs-value-for-checker_timeout.patch
  0018-multipath-handle-offlined-paths.patch
  #
  # from here
  #
  0019-multipath-fix-scsi-timeout-code.patch
  0020-multipath-make-tgt_node_name-work-for-iscsi-devices.patch
  0021-multipath-cleanup-dev_loss_tmo-issues.patch
  0022-Fix-for-setting-0-to-fast_io_fail.patch
  0023-Fix-fast_io_fail-capping.patch
  0024-multipath-enable-getting-uevents-through-libudev.patch
  0025-Use-devpath-as-argument-for-sysfs-functions.patch
  0026-multipathd-remove-references-to-sysfs_device.patch
  0027-multipathd-use-struct-path-as-argument-for-event-pro.patch
  0028-Add-global-udev-reference-pointer-to-config.patch
  0029-Use-udev-enumeration-during-discovery.patch
  0030-use-struct-udev_device-during-discovery.patch
  0031-More-debugging-output-when-synchronizing-path-states.patch
  0032-Use-struct-udev_device-instead-of-sysdev.patch
  0033-discovery-Fixup-cciss-discovery.patch
  0035-Use-udev-devices-during-discovery.patch
  0036-Remove-all-references-to-hand-craftes-sysfs-code.patch
  #
  # to here
  #
  # 0037-multipath-libudev-cleanup-and-bugfixes.patch
  # 0038-multipath-check-if-a-device-belongs-to-multipath.patch
  # 0039-multipath-and-wwids_file-multipath.conf-option.patch
  # 0040-multipath-Check-blacklists-as-soon-as-possible.patch
  # 0041-add-wwids-file-cleanup-options.patch
  # 0042-add-find_multipaths-option.patch
  # 0043-alloc-keywords.patch
  # lp1503305_libmultipath_info_on_1st_path_down_dbd131e.patch
  
  The patches in the range 19-36 caused a regression.
  
  Whenever I generate the package (for trusty) including those patches I'm
  able to generate a core dump indicating a possible double-free or null-
  dereference related to a path removal (that is why I can reproduce with
  the test case). Unfortunately it usually explodes inside malloc() or
  somewhere in glibc.
  
  Using valgrind I was able to verify some free() errors:
  
  ==30415== Invalid free() / delete / delete[] / realloc()
  ==30415==    at 0x4C2BDEC: free (vg_replace_malloc.c:473)
  ==30415==    by 0x54E243C: vector_del_slot (vector.c:95)
  ==30415==    by 0x550A516: _remove_map (structs_vec.c:139)
  ==30415==    by 0x550A5C3: _remove_maps (structs_vec.c:170)
  ==30415==    by 0x550A64B: remove_maps (structs_vec.c:181)
  ==30415==    by 0x40713F: configure (main.c:1153)
  ==30415==    by 0x407A74: child (main.c:1419)
  ==30415==    by 0x40837D: main (main.c:1618)
  
  And they are exactly aligned to a core dump (multipathd) I got from
  another user. (wrong free was coming from _remove_map).

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1535898

Title:
  Trusty & Vivid multipath-tools (multipathd) seg-fault core dump
