Re: Reboot hangs on failing multipath devices
Mike Christie wrote:
> On 03/23/2010 10:13 AM, James Hammer wrote:
>> Mike Christie wrote:
>>> Are there file systems mounted on the multipath device?
>>
>> As far as I can tell, there are *no* file systems mounted on the multipath device.
>
> What you (or the Debian scripts) want to do is shut down multipath first, so that the higher-level queues have flushed their data out. Then shut down iscsi. Or do something to flush the multipath queues and shut that down, then shut down iscsi.

The multipath daemon was already being shut down before iscsi in the init scripts. However, the multipath queues were not being flushed and the multipath devices were not being shut down.

I added a script that runs 'multipath -F' between the multipath and iscsi shutdown scripts. That seems to flush the queues and shut down the multipath devices OK. The server no longer hangs on reboot and I see no glaring errors.

Thanks for your help.

--
James Hammer
jham...@callone.com
312-681-5052

--
You received this message because you are subscribed to the Google Groups open-iscsi group.
To post to this group, send email to open-is...@googlegroups.com.
To unsubscribe from this group, send email to open-iscsi+unsubscr...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/open-iscsi?hl=en.
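The script itself was not posted. A minimal sketch of what such a helper might look like follows; it assumes a SysV-style init script that is called with "stop" at shutdown, and the file name and the MULTIPATH override variable are hypothetical. Only 'multipath -F' comes from the message above.

```sh
#!/bin/sh
# Hypothetical helper, e.g. /etc/init.d/multipath-flush (the name is an assumption).
# Meant to run at shutdown after multipathd has stopped and before the
# iSCSI sessions are logged out.

# The binary can be overridden for a dry run, e.g. MULTIPATH=echo.
MULTIPATH="${MULTIPATH:-multipath}"

flush_multipath_maps() {
    # -F asks multipath to flush all unused multipath device maps.
    if "$MULTIPATH" -F; then
        echo "multipath maps flushed"
    else
        echo "multipath -F failed; some maps may still be in use" >&2
        return 1
    fi
}

# Only act on "stop", as an init script would at shutdown.
case "${1:-}" in
    stop) flush_multipath_maps ;;
esac
```

Where the script lands in the shutdown order depends on how it is linked into /etc/rc0.d and /etc/rc6.d; it would need to sort between the multipath and iscsi entries.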
RE: Reboot hangs on failing multipath devices
This is a reported bug in device-mapper on Debian. A patch is available in Debian's bug tracker, but as far as I remember it was refused by the upstream developers.

We are also running an open-iscsi/dm-multipath/lvm/clvm stack on virtualization hosts. Because of this behavior, one important point is to never, ever let multipath lose all paths. Try adding

    features "1 queue_if_no_path"

to the relevant device section of your multipath.conf.

Regards,
Stephan

-----Original Message-----
From: open-iscsi@googlegroups.com [mailto:open-is...@googlegroups.com] On Behalf Of James Hammer
Sent: Tuesday, March 23, 2010 4:13 PM
To: open-iscsi@googlegroups.com
Subject: Re: Reboot hangs on failing multipath devices

> So the key seems to be the *no_path_retry* setting. In my tests, things go much better with *no_path_retry* set to *queue* when the connection to the iSCSI server is interrupted.
>
> So, is it possible to get those paths to fail with *no_path_retry* set to *queue* so the reboot can continue?
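For readers unfamiliar with the layout of multipath.conf, the fragment below sketches where such a line would sit. The vendor and product strings are placeholders, not values from this thread; they would have to match what the storage array reports. As far as I understand it, this feature asks for the same queue-when-no-path behavior that 'no_path_retry queue' requests, so treat it as an illustration of the syntax rather than a confirmed fix for the hang.

```
devices {
    device {
        # Placeholder strings (assumptions); use the values your array reports.
        vendor   "EXAMPLE_VENDOR"
        product  "EXAMPLE_PRODUCT"
        # Queue I/O instead of failing it when no path is available.
        features "1 queue_if_no_path"
    }
}
```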
Re: Reboot hangs on failing multipath devices
On 03/23/2010 10:13 AM, James Hammer wrote:
> So the key seems to be the *no_path_retry* setting. In my tests, things go much better with *no_path_retry* set to *queue* when the connection to the iSCSI server is interrupted.
>
> So, is it possible to get those paths to fail with *no_path_retry* set to *queue* so the reboot can continue?

I do not know if you can easily do this, and I am not sure it is safe in your case. It does look, from the first iscsi messages:

    Disconnecting iSCSI targets:Logging out of session [sid: 1,
    Logging out of session [sid: 2,
    Logging out of session [sid: 3,
    sd 8:0:0:0: [sde] Synchronizing SCSI cache
    sd 9:0:0:0: [sdd] Synchronizing SCSI cache
    sd 10:0:0:0: [sdf] Synchronizing SCSI cache
    connection2:0: detected conn error (1020)
    connection1:0: detected conn error (1020)
    connection3:0: detected conn error (1020)
    Logout of [sid: 1...successful
    Logout of [sid: 2...successful
    Stopping iSCSI initiator server:.

as though the iscsi layer has logged out of the sessions and cleaned up at its layer, so at this point no IO is going to get executed.

The problem, and the reason I do not think it is safe to rerun with no_path_retry 0, is that there is still IO somewhere in the multipath/block layer queues. When you see:

    /dev/dm-9: read failed after 0 of 2048 at 0: Input/output error
    end_request: I/O error, dev dm-9, sector 20971776

it means some IO that was in that queue failed. If it was a write to some disk, it means that you lost data.

What you (or the Debian scripts) want to do is shut down multipath first, so that the higher-level queues have flushed their data out. Then shut down iscsi. Or do something to flush the multipath queues and shut that down, then shut down iscsi.
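One way to check the ordering described above is to look at how the shutdown scripts sort in the rc directory. The sketch below assumes that scripts run in the byte-wise sorted order of their link names (my understanding of traditional SysV rc, where K* entries sort ahead of S* entries); the function names are hypothetical.

```sh
#!/bin/sh
# Sketch: check that one shutdown script is ordered before another in an rc
# directory. Assumption: scripts run in C-locale sorted order of their names.

# position_of DIR NAME: print the 1-based position of the first entry in DIR
# whose name ends in NAME.
position_of() {
    LC_ALL=C ls "$1" | grep -n -- "$2\$" | head -n 1 | cut -d: -f1
}

# runs_before DIR FIRST SECOND: succeed only if both entries are present and
# FIRST sorts ahead of SECOND.
runs_before() {
    first_pos=$(position_of "$1" "$2")
    second_pos=$(position_of "$1" "$3")
    [ -n "$first_pos" ] && [ -n "$second_pos" ] && [ "$first_pos" -lt "$second_pos" ]
}
```

A call might look like `runs_before /etc/rc0.d multipath-tools open-iscsi && echo "order OK"`. The two script names there are assumptions; substitute whatever your distribution installs.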
Re: Reboot hangs on failing multipath devices
Mike Christie wrote:
> Are there file systems mounted on the multipath device?

As far as I can tell, there are *no* file systems mounted on the multipath device. This multipath device is used by a virtual machine. The virtual machine is turned off at that point. The 'mount' command on the physical host does not list the multipath device as being mounted.

This is what I have found. I ran the whole shutdown sequence manually, i.e. running each script in /etc/rc0.d in order (with *no_path_retry* set to *queue*). Between each shutdown script, I ran '*multipath -f mpath5*' to try to remove the multipath device manually. Each time I got this result:

    mpath5: map in use

That continued all the way down until I got to the last 3 scripts:

    S50lvm2 -> ../init.d/lvm2
    S60umountroot -> ../init.d/umountroot
    S90halt -> ../init.d/halt

When that lvm2 script gets run to shut down lvm2, I again get the "multipath: Failing path" results:

    Shutting down LVM Volume Groupsdevice-mapper: multipath: Failing path 8:48.
    device-mapper: uevent: dm_send_uevents: kobject_uevent_env failed
    device-mapper: multipath: Failing path 8:80.
    device-mapper: uevent: dm_send_uevents: kobject_uevent_env failed
    device-mapper: multipath: Failing path 8:64.
    device-mapper: uevent: dm_send_uevents: kobject_uevent_env failed

That hangs indefinitely.

Now, if I do the same thing with *no_path_retry* set to *fail*, the sequence goes similarly, except that when I run */etc/init.d/lvm2 stop* I get the same as above followed by a few of these lines:

    /dev/dm-9: read failed after 0 of 2048 at 0: Input/output error
    end_request: I/O error, dev dm-9, sector 20971776

Then the script finishes and the reboot can proceed.

So the key seems to be the *no_path_retry* setting. In my tests, things go much better with *no_path_retry* set to *queue* when the connection to the iSCSI server is interrupted.

So, is it possible to get those paths to fail with *no_path_retry* set to *queue* so the reboot can continue?

Thanks!

--
James
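The "map in use" result suggests that something still has the multipath device open. One check that is not part of the thread, offered only as a sketch: sysfs exposes a holders directory per block device that lists other block devices stacked on top of it (for example LVM volumes). It does not show processes that have the device open. The function name and the optional sysfs-root argument below are hypothetical, and the kernel name of the map (dm-N) has to be looked up separately.

```sh
#!/bin/sh
# Sketch: list block devices stacked on top of a given device, using the
# sysfs "holders" directory.

# list_holders DEV [SYSROOT]: DEV is a kernel block device name such as dm-3;
# SYSROOT defaults to /sys and is only a parameter so the function can be
# tried against a scratch directory.
list_holders() {
    dev=$1
    sysroot=${2:-/sys}
    dir="$sysroot/block/$dev/holders"
    if [ ! -d "$dir" ]; then
        echo "no holders directory for $dev under $sysroot" >&2
        return 1
    fi
    ls "$dir"
}
```

If the output names other dm devices, those would need to be deactivated before 'multipath -f' can remove the map.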
Re: Reboot hangs on failing multipath devices
James Hammer wrote:
> Every time I reboot my server it hangs on the multipath devices. The server is Debian based. I've had this problem with all kernels I've tried (2.6.18, 2.6.24, 2.6.32).
>
> In /etc/multipath.conf, no_path_retry is set to queue

I found that if I set no_path_retry to its default value of 0, then the server reboots immediately.

Is it possible to get this working with no_path_retry set to queue?

--
James
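For reference, the settings being compared would look roughly like this in multipath.conf. Placing the option in the defaults section is an assumption, since the message does not say which section is used, and the numeric value is purely illustrative.

```
defaults {
    # Queue I/O indefinitely while no path is usable (the setting in use here).
    no_path_retry queue

    # Alternatives mentioned in this thread:
    # no_path_retry fail   (fail I/O at once; described above as the default of 0)
    #
    # A bounded form, as I understand the option: retry for N checker
    # intervals, then fail. The value 12 is an arbitrary example.
    # no_path_retry 12
}
```

Any setting that eventually fails I/O can lose writes that are still sitting in the queue, so none of these is a free choice.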
Re: Reboot hangs on failing multipath devices
On 03/22/2010 03:38 PM, James Hammer wrote:
> Every time I reboot my server it hangs on the multipath devices. The server is Debian based. I've had this problem with all kernels I've tried (2.6.18, 2.6.24, 2.6.32).
>
> In /etc/multipath.conf, no_path_retry is set to queue
>
> Here are snippets from the reboot log:
>
> snip
> Stopping multipath daemon: multipathd.
> ...
> Saving the system clock.
> Unmounting iscsi-backed filesystems: /umount: /? device is busy
> umount: /: device is busy
> ...
> Disconnecting iSCSI targets:Logging out of session [sid: 1,
> Logging out of session [sid: 2,
> Logging out of session [sid: 3,
> sd 8:0:0:0: [sde] Synchronizing SCSI cache
> sd 9:0:0:0: [sdd] Synchronizing SCSI cache
> sd 10:0:0:0: [sdf] Synchronizing SCSI cache
> connection2:0: detected conn error (1020)
> connection1:0: detected conn error (1020)
> connection3:0: detected conn error (1020)
> Logout of [sid: 1...successful
> Logout of [sid: 2...successful
> Stopping iSCSI initiator server:.
> Cleaning up ifupdown
> Deactivating swap...done.
> Shutting down LVM Volume Groupsdevice-mapper: multipath: Failing path 8:64.
> device-mapper: uevent: dm_send_uevents: kobject_uevent_env failed
> device-mapper: multipath: Failing path 8:48.
> device-mapper: uevent: dm_send_uevents: kobject_uevent_env failed
> device-mapper: multipath: Failing path 8:80.
> device-mapper: uevent: dm_send_uevents: kobject_uevent_env failed
> /snip

Are there file systems mounted on the multipath device?