[Bug 1020436] Re: Cannot read superblock after FC multipath failover

2012-07-27 Thread Steve Fisher
The root cause was in fact the lvm.conf filter, but explicitly not for the reason you'd think. The issue is that if I added a|.*| into the regex array, it was ignoring my 'sd[b-z]', loop and ram exclusions, both singly and in combination. It seems to be an obscure issue with the use of square
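For readers following the filter discussion: lvm.conf documents that filter patterns are tried in order and the first pattern that matches a device decides whether it is accepted or rejected. The sketch below only illustrates that ordering rule, using Python's `re` module as a stand-in for LVM's own regex engine. It does not attempt to reproduce the square-bracket problem mentioned above, and the helper name `lvm_filter_decision` is hypothetical.

```python
import re


def lvm_filter_decision(filter_entries, device_path):
    """Return 'a' (accept) or 'r' (reject) for device_path.

    Entries look like 'a|.*|' or 'r|/dev/sd[b-z]|'. Patterns are tried in
    order and the first one that matches decides. A device that matches no
    pattern is accepted.

    Assumption: Python regex syntax is close enough to LVM's for these
    simple patterns. This is an illustration, not LVM's implementation.
    """
    for entry in filter_entries:
        action, delimiter = entry[0], entry[1]
        pattern = entry[2:entry.rindex(delimiter)]
        if re.search(pattern, device_path):
            return action
    return "a"


# Accept-all listed first: every later reject pattern is unreachable.
accept_first = ["a|.*|", "r|/dev/sd[b-z]|", "r|loop|", "r|ram|"]
# Reject patterns listed first: raw sd paths are filtered out.
reject_first = ["r|/dev/sd[b-z]|", "r|loop|", "r|ram|", "a|.*|"]

for path in ("/dev/sda", "/dev/sdb", "/dev/mapper/mpath0"):
    print(path,
          lvm_filter_decision(accept_first, path),
          lvm_filter_decision(reject_first, path))
```

With the accept-all pattern in front, /dev/sdb is accepted; with the reject patterns in front, /dev/sdb is rejected while /dev/sda and /dev/mapper/mpath0 are still accepted.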

[Bug 1020436] Re: Cannot read superblock after FC multipath failover

2012-07-27 Thread Peter Petrakis
** Changed in: multipath-tools (Ubuntu) Status: Incomplete => Invalid -- You received this bug notification because you are a member of Ubuntu Server Team, which is subscribed to multipath-tools in Ubuntu. https://bugs.launchpad.net/bugs/1020436 Title: Cannot read superblock after FC

[Bug 1020436] Re: Cannot read superblock after FC multipath failover

2012-07-24 Thread Steve Fisher
Output from before and after commands is attached. I'm pretty sure you're right about the LVM device filter; I figured that setting the scsi-timeout to 90 seconds (being way longer than the TUR path checker, which is scheduled every 20 seconds) should be enough to handle the device failover.

[Bug 1020436] Re: Cannot read superblock after FC multipath failover

2012-07-24 Thread Peter Petrakis
It doesn't look like your lvm.conf filter is working. (pvdisplay)
  /dev/sdb: block size is 4096 bytes
  /dev/sdb: lvm2 label detected
  Ignoring duplicate PV RLOhdDURbD7uK2La3MDK2olkP0BF2Tu7 on /dev/sdb - using dm /dev/mapper/mpath0
  Closed /dev/sdb
  Opened /dev/sdc

[Bug 1020436] Re: Cannot read superblock after FC multipath failover

2012-07-23 Thread Steve Fisher
It looks like all the /dev/mapper/mpathX targets, and the remaining 'sd' unique paths (with one fabric disabled) were all readable after the path shutdown, but the underlying LVs were somehow still deactivated when the paths disappeared. root@hostname-03:/srv/mysql# dmesg snip [1039812.298161]

[Bug 1020436] Re: Cannot read superblock after FC multipath failover

2012-07-23 Thread Peter Petrakis
Hmm, we've got some counter indicators here. lvs claims that the volumes are active. But the probe itself is showing problems reading the volumes. XFS is telling us that it cannot write its journal to disk. [1039812.311433] Filesystem dm-4: Log I/O Error Detected. Shutting down filesystem:

[Bug 1020436] Re: Cannot read superblock after FC multipath failover

2012-07-23 Thread Peter Petrakis
Scratch the ext3 test, I just re-read the description. I've replicated this behaviour using both xfs and ext4 filesystems, on multiple different LUNs presented to the server.

[Bug 1020436] Re: Cannot read superblock after FC multipath failover

2012-07-18 Thread Peter Petrakis
Alrighty.
[OK] lvm.conf device blacklist
[OK] multipath.conf, sans fix I already pointed out
You can ignore that FW blurb, it's just a script that shouldn't be so noisy.
failover logs *look good*
bug-1020436$ grep fail_path multipathd_stdout.log | sort -u
Jul 10 15:08:25 | libdevmapper:

[Bug 1020436] Re: Cannot read superblock after FC multipath failover

2012-07-16 Thread Steve Fisher
To date, the issue hasn't been observed on the two physical hosts we have running 10.04.1 LTS with the same multipath-tools version, which certainly raises a flag. They are being used as mission critical / production database servers so I'm scheduling a window in which we will be able to confirm

[Bug 1020436] Re: Cannot read superblock after FC multipath failover

2012-07-13 Thread Peter Petrakis
** Changed in: multipath-tools (Ubuntu) Status: Incomplete => Confirmed

[Bug 1020436] Re: Cannot read superblock after FC multipath failover

2012-07-13 Thread Peter Petrakis
Thanks for all the feedback, there's quite a bit of information here to sort through. In the meantime, it would be an interesting data point to see how well this array functions under multipath 0.4.9, which is what Precise, the new LTS, uses. This can be accomplished by using apt pinning.

[Bug 1020436] Re: Cannot read superblock after FC multipath failover

2012-07-10 Thread Steve Fisher
** Attachment removed: strace_mpathd.log https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1020436/+attachment/3218020/+files/strace_mpathd.log ** Attachment added: strace_mpathd.log

[Bug 1020436] Re: Cannot read superblock after FC multipath failover

2012-07-10 Thread Steve Fisher
-- lvm.conf blacklist should be fine, searches /dev but filters all /dev/sd.* I'll attach those files momentarily. I didn't see any kpartx processes running at all; there wasn't anything in an uninterruptible sleep state ('D') either. Listing all kernel and some additional processes here

[Bug 1020436] Re: Cannot read superblock after FC multipath failover

2012-07-10 Thread Peter Petrakis
Your blacklist definition is wrong. You provided:
blacklist {
    wwid 3600605b0039afe20ff54052e7d38
    vendor SMART
    product SMART
}
defaults {
    user_friendly_names yes
}
wwid cites an entire device, vendor product cite a different device and must be included

[Bug 1020436] Re: Cannot read superblock after FC multipath failover

2012-07-10 Thread Steve Fisher
Storage devices:
$ sudo lsscsi
[0:0:0:0]  cd/dvd  TSSTcorp  CDDVDW TS-L633F  IT03  /dev/sr0
[4:2:0:0]  disk    IBM       ServeRAID M5015  2.12  /dev/sda
[5:0:0:0]  disk    HITACHI   OPEN-V           6008  /dev/sdb
[5:0:0:1]  disk    HITACHI   OPEN-V           6008  /dev/sdc
[5:0:0:2]  disk

[Bug 1020436] Re: Cannot read superblock after FC multipath failover

2012-07-10 Thread Steve Fisher
I note this segfault has appeared in dmesg on the last couple of boots. [ 304.214560] multipathd[3083]: segfault at a ip 7f5eb838798a sp 7fffb225ebb0 error 4 in libc-2.11.1.so[7f5eb831+17a000] After adjusting the blacklist, updating initrd and rebooting, the same behaviour is

[Bug 1020436] Re: Cannot read superblock after FC multipath failover

2012-07-10 Thread Steve Fisher
** Attachment added: strace_mpathd_110712.log https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1020436/+attachment/3219323/+files/strace_mpathd_110712.log

[Bug 1020436] Re: Cannot read superblock after FC multipath failover

2012-07-10 Thread Steve Fisher
** Attachment added: multipathd_stdout_110712.log https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1020436/+attachment/3219324/+files/multipathd_stdout_110712.log

[Bug 1020436] Re: Cannot read superblock after FC multipath failover

2012-07-09 Thread Peter Petrakis
To assist in further debug, stop the multipath daemon, then run it in the foreground with max verbosity, while running it under strace. Log strace to file and the foreground stdout and stderr to another file. Post the results here. Also include your multipath.conf, lvm.conf, output of dmsetup
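A minimal sketch of how the requested debug run could be assembled. The log file names match the attachments posted elsewhere in this thread. Everything else is an assumption: `-d` (foreground) and `-v` (verbosity) are the multipathd options assumed to apply here, the verbosity level 3 is a guess at "max verbosity", `strace -f` (follow child processes and threads) is added on the assumption that the daemon is multi-threaded, and the helper name `build_debug_command` is hypothetical. The script only prints the shell command; it does not run it.

```python
import shlex


def build_debug_command(verbosity=3,
                        strace_log="strace_mpathd.log",
                        stdout_log="multipathd_stdout.log",
                        stderr_log="multipath_stderr.log"):
    """Build a shell command that runs multipathd in the foreground under strace.

    strace writes its trace to strace_log; the daemon's stdout and stderr
    are redirected to separate files. All option choices are assumptions,
    see the text above.
    """
    strace_part = ["strace", "-f", "-o", strace_log]
    daemon_part = ["multipathd", "-d", "-v", str(verbosity)]
    command = " ".join(shlex.quote(arg) for arg in strace_part + daemon_part)
    return f"{command} > {shlex.quote(stdout_log)} 2> {shlex.quote(stderr_log)}"


print(build_debug_command())
```

Stop the running daemon first, then run the printed command as root and reproduce the failover while it is active.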

[Bug 1020436] Re: Cannot read superblock after FC multipath failover

2012-07-09 Thread Steve Fisher
Hi Peter, Thanks for picking this up. Note: I ran an 'update-initramfs -u -k all' and rebooted just for good measure before proceeding. There's some output regarding a missing firmware file, I'm not sure it's relevant: root@rgrprod-pmdh-proc-03:/etc# update-initramfs -u -k all

[Bug 1020436] Re: Cannot read superblock after FC multipath failover

2012-07-09 Thread Steve Fisher
note: multipath_stderr.log exists, but is empty. ** Attachment added: multipathd_stdout.log https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1020436/+attachment/3218021/+files/multipathd_stdout.log

[Bug 1020436] Re: Cannot read superblock after FC multipath failover

2012-07-09 Thread Steve Fisher
user@hostname:~$ sudo dmsetup table
mpath2: 0 8388608 multipath 1 queue_if_no_path 0 1 1 round-robin 0 4 1 8:48 1000 8:96 1000 8:144 1000 8:192 1000
mpath1: 0 8388608 multipath 1 queue_if_no_path 0 1 1 round-robin 0 4 1 8:128 1000 8:32 1000 8:80 1000 8:176 1000
mpath0: 0 1048576000 multipath 1
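For reference, a complete multipath line from the `dmsetup table` output above can be decoded roughly as follows. This is a sketch based on the device-mapper multipath table layout (feature count and features, hardware handler count, path group count, then per group a selector, its arguments, and the paths). The function names are hypothetical, and the major:minor to sdX mapping assumes the conventional Linux numbering where major 8 holds sda through sdp with 16 minors per disk.

```python
def parse_multipath_table(line):
    """Parse one 'dmsetup table' line for a multipath target."""
    name, _, rest = line.partition(":")
    tokens = rest.split()
    if len(tokens) < 3 or tokens[2] != "multipath":
        raise ValueError("not a multipath table line")
    pos = 3
    num_features = int(tokens[pos]); pos += 1
    features = tokens[pos:pos + num_features]; pos += num_features
    num_hw_args = int(tokens[pos]); pos += 1
    pos += num_hw_args                      # skip hardware handler arguments
    num_groups = int(tokens[pos]); pos += 1
    pos += 1                                # skip "next path group to try"
    groups = []
    for _ in range(num_groups):
        selector = tokens[pos]; pos += 1
        num_selector_args = int(tokens[pos]); pos += 1
        pos += num_selector_args
        num_paths = int(tokens[pos]); pos += 1
        num_path_args = int(tokens[pos]); pos += 1
        paths = []
        for _ in range(num_paths):
            paths.append(tokens[pos])       # major:minor of the path device
            pos += 1 + num_path_args        # skip per-path selector arguments
        groups.append({"selector": selector, "paths": paths})
    return {"name": name.strip(), "sectors": int(tokens[1]),
            "features": features, "groups": groups}


def sd_name(dev):
    """Map 'major:minor' to an sdX name, or None if the major is not 8."""
    major, minor = (int(part) for part in dev.split(":"))
    if major != 8:
        return None
    return "sd" + chr(ord("a") + minor // 16)


line = ("mpath2: 0 8388608 multipath 1 queue_if_no_path 0 1 1 "
        "round-robin 0 4 1 8:48 1000 8:96 1000 8:144 1000 8:192 1000")
table = parse_multipath_table(line)
print(table["features"])
print([sd_name(dev) for dev in table["groups"][0]["paths"]])
```

For mpath2 this reports the single feature queue_if_no_path and one round-robin path group whose four paths correspond to sdd, sdg, sdj and sdm.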

[Bug 1020436] Re: Cannot read superblock after FC multipath failover

2012-07-09 Thread Steve Fisher
** Attachment added: multipath.conf https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1020436/+attachment/3218022/+files/multipath.conf

[Bug 1020436] Re: Cannot read superblock after FC multipath failover

2012-07-09 Thread Steve Fisher
** Attachment added: lvm.conf https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1020436/+attachment/3218023/+files/lvm.conf

[Bug 1020436] Re: Cannot read superblock after FC multipath failover

2012-07-09 Thread Steve Fisher
note: /dev/sda is a raid volume used for the root vg; the filter is actually r|/dev/sd[b-z]|