Re: Reboot hangs on failing multipath devices

2010-03-31 Thread James Hammer

Mike Christie wrote:

On 03/23/2010 10:13 AM, James Hammer wrote:

Mike Christie wrote:

On 03/22/2010 03:38 PM, James Hammer wrote:

Every time I reboot my server it hangs on the multipath devices.

The server is Debian based. I've had this problem with all kernels I've
tried (2.6.18, 2.6.24, 2.6.32). In /etc/multipath.conf, no_path_retry is
set to queue

Here are snippets from the reboot log:

snip
Stopping multipath daemon: multipathd.
...
Shutting down LVM Volume Groupsdevice-mapper: multipath: Failing path 8:64.
device-mapper: uevent: dm_send_uevents: kobject_uevent_env failed
device-mapper: multipath: Failing path 8:48.
device-mapper: uevent: dm_send_uevents: kobject_uevent_env failed
device-mapper: multipath: Failing path 8:80. mult


Are there file systems mounted on the multipath device?



As far as I can tell, there are *no* file systems mounted on the
multipath device.



What you (or the Debian scripts) want to do is shut down multipath first, 
so the higher-level queues have flushed their data out. Then shut down 
iscsi.


Or do something to flush the multipath queues and shut that down, then 
shut down iscsi.


The multipath daemon was being shut down before iscsi in the init 
scripts.  However, the multipath queues were not being flushed and 
shut down.  I added a script to run 'multipath -F' between the multipath 
and iscsi shutdown scripts.  That seems to flush the queues and shut down 
the multipath devices OK.  The server no longer hangs on reboot and I see 
no glaring errors.
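
A minimal sketch of what such a script could look like, assuming a SysV-style init layout like the one in this thread. The script name, its exact position between the multipath and iscsi stop links, and the MULTIPATH_BIN override (present only so the logic can be tried without real devices) are assumptions for illustration, not the script actually used here. `multipath -F` flushes all unused multipath device maps.

```shell
#!/bin/sh
# Hypothetical /etc/init.d/multipath-flush: flush unused multipath maps at
# shutdown, after multipathd has stopped and before the iSCSI sessions are
# logged out.

# Assumption: MULTIPATH_BIN can be overridden for dry runs.
MULTIPATH_BIN="${MULTIPATH_BIN:-multipath}"

flush_multipath_maps() {
    # -F flushes all unused multipath device maps; maps still in use stay.
    if "$MULTIPATH_BIN" -F; then
        echo "multipath maps flushed"
        return 0
    fi
    echo "multipath -F reported an error; some maps may still be in use" >&2
    return 1
}

case "${1:-}" in
    stop)
        flush_multipath_maps
        ;;
    start|restart|force-reload)
        # Nothing to do at boot; multipathd creates the maps.
        ;;
esac
```

Where the stop link goes depends on the local runlevel numbering; the only requirement taken from this thread is that it runs after the multipath daemon stops and before the iscsi sessions are logged out.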


Thanks for your help.

--
James Hammer
jham...@callone.com
312-681-5052

--
You received this message because you are subscribed to the Google Groups 
open-iscsi group.
To post to this group, send email to open-is...@googlegroups.com.
To unsubscribe from this group, send email to 
open-iscsi+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/open-iscsi?hl=en.



RE: Reboot hangs on failing multipath devices

2010-03-26 Thread netz-haut - stephan seitz
This is a reported bug in the device-mapper on Debian.
A patch is available in Debian's bug tracker but, as far as I remember,
it has been refused by the upstream developers.

We're also running an open-iscsi/dm-multipath/lvm/clvm stack on virtualization
hosts. Because of this behavior, one important rule is to never let multipath
lose all paths.
Try adding
features "1 queue_if_no_path"
to the relevant device section of your multipath.conf.

Regards,
Stephan






Re: Reboot hangs on failing multipath devices

2010-03-25 Thread Mike Christie

On 03/23/2010 10:13 AM, James Hammer wrote:

Mike Christie wrote:

On 03/22/2010 03:38 PM, James Hammer wrote:

Every time I reboot my server it hangs on the multipath devices.

The server is Debian based. I've had this problem with all kernels I've
tried (2.6.18, 2.6.24, 2.6.32). In /etc/multipath.conf, no_path_retry is
set to queue

Here are snippets from the reboot log:

snip
Stopping multipath daemon: multipathd.
...
Shutting down LVM Volume Groupsdevice-mapper: multipath: Failing path 8:64.
device-mapper: uevent: dm_send_uevents: kobject_uevent_env failed
device-mapper: multipath: Failing path 8:48.
device-mapper: uevent: dm_send_uevents: kobject_uevent_env failed
device-mapper: multipath: Failing path 8:80. mult


Are there file systems mounted on the multipath device?



As far as I can tell, there are *no* file systems mounted on the
multipath device. This multipath device is used by a virtual machine.
The virtual machine is turned off at that point. The 'mount' command on
the physical host does not list the multipath device as being mounted.

This is what I have found...I ran the whole shutdown sequence manually,
i.e. running each script in /etc/rc0.d manually in order (with
*no_path_retry* set to *queue*). Between each shutdown script, I ran
'*multipath -f mpath5*' to try and remove the multipath device manually.
Each time I got this result:

mpath5: map in use

All the way down until I got to the last 3 scripts:

S50lvm2 - ../init.d/lvm2
S60umountroot - ../init.d/umountroot
S90halt - ../init.d/halt



When that lvm2 script gets run to shutdown lvm2, I again get the
multipath: Failing path results:

Shutting down LVM Volume Groupsdevice-mapper: multipath: Failing path 8:48.
device-mapper: uevent: dm_send_uevents: kobject_uevent_env failed
device-mapper: multipath: Failing path 8:80.
device-mapper: uevent: dm_send_uevents: kobject_uevent_env failed
device-mapper: multipath: Failing path 8:64.
device-mapper: uevent: dm_send_uevents: kobject_uevent_env failed

That hangs indefinitely.

Now, if I do the same thing with *no_path_retry* set to *fail* the
sequence goes similarly, except that when I run */etc/init.d/lvm2 stop*
I get the same as above followed by a few of these lines:

/dev/dm-9: read failed after 0 of 2048 at 0: Input/output error
end_request: I/O error, dev dm-9, sector 20971776

Then the script finishes and the reboot can proceed.

So the key seems to be the *no_path_retry* setting.

 From my tests, things seem to go so much better if *no_path_retry* is
set to *queue* and the connection to the iSCSI server is interrupted.

So, is it possible to get those paths to fail with *no_path_retry* set
to *queue* so the reboot can continue?



I do not know if you can easily do this, and I am not sure if it is safe 
in your case. It does seem, though, from the first iscsi messages:


Disconnecting iSCSI targets:Logging out of session [sid: 1,
Logging out of session [sid: 2,
Logging out of session [sid: 3,
sd 8:0:0:0: [sde] Synchronizing SCSI cache
sd 9:0:0:0: [sdd] Synchronizing SCSI cache
sd 10:0:0:0: [sdf] Synchronizing SCSI cache
 connection2:0: detected conn error (1020)
 connection1:0: detected conn error (1020)
 connection3:0: detected conn error (1020)
Logout of [sid: 1...successful
Logout of [sid: 2...successful
Stopping iSCSI initiator server:.

that the iscsi layer has logged out of the sessions and cleaned up at 
its layer, so at this point no IO is going to get executed.


The problem, and the reason I do not think it is safe to rerun with 
no_path_retry 0, is that there is still IO somewhere in the 
multipath/block layer queues. When you see:


 /dev/dm-9: read failed after 0 of 2048 at 0: Input/output error
 end_request: I/O error, dev dm-9, sector 20971776

it means some IO that was in that queue failed. If it was a write to 
some disk, it means that you lost data.


What you (or the Debian scripts) want to do is shut down multipath first, so 
the higher-level queues have flushed their data out. Then shut down iscsi.


Or do something to flush the multipath queues and shut that down, then 
shut down iscsi.





Re: Reboot hangs on failing multipath devices

2010-03-23 Thread James Hammer

Mike Christie wrote:

On 03/22/2010 03:38 PM, James Hammer wrote:

Every time I reboot my server it hangs on the multipath devices.

The server is Debian based. I've had this problem with all kernels I've
tried (2.6.18, 2.6.24, 2.6.32). In /etc/multipath.conf, no_path_retry is
set to queue

Here are snippets from the reboot log:

snip
Stopping multipath daemon: multipathd.
...
Shutting down LVM Volume Groupsdevice-mapper: multipath: Failing path 8:64.
device-mapper: uevent: dm_send_uevents: kobject_uevent_env failed
device-mapper: multipath: Failing path 8:48.
device-mapper: uevent: dm_send_uevents: kobject_uevent_env failed
device-mapper: multipath: Failing path 8:80. mult


Are there file systems mounted on the multipath device?



As far as I can tell, there are *no* file systems mounted on the 
multipath device.  This multipath device is used by a virtual machine.  
The virtual machine is turned off at that point.  The 'mount' command on 
the physical host does not list the multipath device as being mounted.
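
As a side note on this check: `mount` only lists mounted file systems, while a device-mapper map can also be held open by another device stacked on top of it (an LVM volume, for example). That this is what happens here is an assumption, not something confirmed in this thread. A small sketch for listing maps with a non-zero open count follows; the function name is made up for illustration.

```shell
#!/bin/sh
# Sketch: list device-mapper maps whose open count is non-zero.

# Reads "name:open_count" lines on stdin and prints the names still open.
list_open_maps() {
    # Strip padding so "name: 2" and "name:2" are treated the same.
    tr -d ' \t' | while IFS=: read -r name open_count; do
        [ -n "$name" ] || continue
        if [ "${open_count:-0}" -gt 0 ] 2>/dev/null; then
            echo "$name (open count $open_count)"
        fi
    done
}

# On a real host (needs root), feed it from dmsetup:
#   dmsetup info -c --noheadings --separator : -o name,open | list_open_maps
```

A map that shows up here with nothing mounted on it is being held by something other than a file system, which would be consistent with a "map in use" answer from `multipath -f`.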


This is what I have found...I ran the whole shutdown sequence manually, 
i.e. running each script in /etc/rc0.d manually in order (with 
*no_path_retry* set to *queue*). Between each shutdown script, I ran 
'*multipath -f mpath5*' to try and remove the multipath device manually. 
Each time I got this result:


 mpath5: map in use

All the way down until I got to the last 3 scripts:

 S50lvm2 - ../init.d/lvm2
 S60umountroot - ../init.d/umountroot
 S90halt - ../init.d/halt
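
The manual procedure described above can be sketched as a loop. This is an illustration only: the directory and map name come from this message, the PROBE_CMD override is an assumption added so the loop can be tried without real devices, and on a live host it really does stop services.

```shell
#!/bin/sh
# Sketch: run each stop script in a runlevel directory in order and, after
# each one, try to remove a multipath map to see when it stops being in use.

# Assumptions: RC_DIR and MAP_NAME match the setup described in this thread;
# PROBE_CMD exists only so the loop can be exercised with a stand-in command.
RC_DIR="${RC_DIR:-/etc/rc0.d}"
MAP_NAME="${MAP_NAME:-mpath5}"
PROBE_CMD="${PROBE_CMD:-multipath}"

walk_shutdown_scripts() {
    # The glob expands in sorted order, so K links come before S links.
    for script in "$RC_DIR"/[KS]*; do
        [ -x "$script" ] || continue
        echo "== running $script stop"
        "$script" stop
        # -f flushes (removes) one named map if it is unused.
        if "$PROBE_CMD" -f "$MAP_NAME"; then
            echo "$MAP_NAME removed after $(basename "$script")"
            return 0
        fi
    done
    echo "$MAP_NAME still present after all scripts" >&2
    return 1
}
```

Calling `walk_shutdown_scripts` as root stops at the first script after which the map can be removed, which narrows down which service is holding it.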



When that lvm2 script gets run to shutdown lvm2, I again get the 
multipath: Failing path results:


 Shutting down LVM Volume Groupsdevice-mapper: multipath: Failing path 8:48.
 device-mapper: uevent: dm_send_uevents: kobject_uevent_env failed
 device-mapper: multipath: Failing path 8:80.
 device-mapper: uevent: dm_send_uevents: kobject_uevent_env failed
 device-mapper: multipath: Failing path 8:64.
 device-mapper: uevent: dm_send_uevents: kobject_uevent_env failed

That hangs indefinitely.

Now, if I do the same thing with *no_path_retry* set to *fail* the 
sequence goes similarly, except that when I run */etc/init.d/lvm2 stop* 
I get the same as above followed by a few of these lines:


 /dev/dm-9: read failed after 0 of 2048 at 0: Input/output error
 end_request: I/O error, dev dm-9, sector 20971776

Then the script finishes and the reboot can proceed.

So the key seems to be the *no_path_retry* setting.

From my tests, things seem to go so much better if *no_path_retry* is 
set to *queue* and the connection to the iSCSI server is interrupted.


So, is it possible to get those paths to fail with *no_path_retry* set 
to *queue* so the reboot can continue?


Thanks!

-- James




Re: Reboot hangs on failing multipath devices

2010-03-22 Thread James Hammer

James Hammer wrote:

Every time I reboot my server it hangs on the multipath devices.

The server is Debian based.  I've had this problem with all kernels 
I've tried (2.6.18, 2.6.24, 2.6.32).  In /etc/multipath.conf, 
no_path_retry is set to queue




I found that if I set no_path_retry to its default value of 0, then the 
server reboots immediately.  Is it possible to get this working with 
no_path_retry set to queue?


-- James




Re: Reboot hangs on failing multipath devices

2010-03-22 Thread Mike Christie

On 03/22/2010 03:38 PM, James Hammer wrote:

Every time I reboot my server it hangs on the multipath devices.

The server is Debian based. I've had this problem with all kernels I've
tried (2.6.18, 2.6.24, 2.6.32). In /etc/multipath.conf, no_path_retry is
set to queue

Here are snippets from the reboot log:

snip
Stopping multipath daemon: multipathd.
...
Saving the system clock.
Unmounting iscsi-backed filesystems: /umount: /? device is busy
umount: /: device is busy
...
Disconnecting iSCSI targets:Logging out of session [sid: 1,
Logging out of session [sid: 2,
Logging out of session [sid: 3,
sd 8:0:0:0: [sde] Synchronizing SCSI cache
sd 9:0:0:0: [sdd] Synchronizing SCSI cache
sd 10:0:0:0: [sdf] Synchronizing SCSI cache
connection2:0: detected conn error (1020)
connection1:0: detected conn error (1020)
connection3:0: detected conn error (1020)
Logout of [sid: 1...successful
Logout of [sid: 2...successful
Stopping iSCSI initiator server:.

Cleaning up ifupdown
Deactivating swap...done.
Shutting down LVM Volume Groupsdevice-mapper: multipath: Failing path 8:64.
device-mapper: uevent: dm_send_uevents: kobject_uevent_env failed
device-mapper: multipath: Failing path 8:48.
device-mapper: uevent: dm_send_uevents: kobject_uevent_env failed
device-mapper: multipath: Failing path 8:80.
device-mapper: uevent: dm_send_uevents: kobject_uevent_env failed
/snip



Are there file systems mounted on the multipath device?
