Re: [libvirt] udevadm settle can take too long

2012-04-26 Thread Jim Paris
Osier Yang wrote:
> Though I don't have a good idea to fix it either, I guess this
> change could cause lvremove to fail again for the udev race.
>
> See BZs:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=702260
> https://bugzilla.redhat.com/show_bug.cgi?id=570359

It seems that those bugs were caused by something like
 
1. open(lv, O_RDWR)
2. close(lv)
3. system(lvremove ...)

where udev would fire off a command between 2 and 3 that caused 3 to
fail.  Adding udevadm settle as step 2.5 is a good way to wait for
that command to finish, but:

- it doesn't necessarily fix the issue; something could easily re-open
  the device between 2.5 and 3 and cause the same failure.

- the race condition sounds like it was a short window, and sometimes
  the original sequence would still work even without the settle.
  That would suggest to me that a timeout of 10s is still plenty long.

A few thoughts:

- For lvremove: can we try a short timeout (3 seconds), then if the
  lvremove still fails, try again with the default udevadm timeout
  (120 seconds)?

- Even in that case, we need to fix libvirtd to not kill the
  connection after 30 seconds when it's libvirtd's fault that the
  connection is blocked for so long anyway.

- When connecting with virt-manager, is the udevadm settle really
  necessary?  We're not calling lvremove.
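
The first thought above (settle briefly, and only fall back to the long
timeout if lvremove still fails) might look roughly like the sketch below.
Everything named here is hypothetical and not libvirt API: the two callbacks
stand in for running "udevadm settle" with a timeout and for running
lvremove, and the fake_* helpers exist only to exercise the control flow
without touching udev or LVM.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback types.  settle_fn stands in for running
 * "udevadm settle" with the given timeout; remove_fn stands in for
 * running lvremove and returns 0 on success. */
typedef void (*settle_fn)(int timeout_seconds, void *opaque);
typedef int (*remove_fn)(void *opaque);

enum {
    SHORT_SETTLE_SECONDS = 3,     /* first attempt, as suggested above */
    DEFAULT_SETTLE_SECONDS = 120  /* default timeout quoted in the thread */
};

/* Settle briefly and try the removal; only if that fails, settle again
 * with the long timeout and retry once.  Returns the last attempt's result. */
int
remove_with_settle_retry(settle_fn settle, remove_fn do_remove, void *opaque)
{
    assert(settle != NULL && do_remove != NULL);

    settle(SHORT_SETTLE_SECONDS, opaque);
    if (do_remove(opaque) == 0)
        return 0;

    settle(DEFAULT_SETTLE_SECONDS, opaque);
    return do_remove(opaque);
}

/* Stand-ins that only record what happened. */
struct fake_lv {
    int failures_left;  /* how many removal attempts should still fail */
    int settle_calls;   /* number of settle invocations seen */
    int last_timeout;   /* timeout passed to the most recent settle */
};

void
fake_settle(int timeout_seconds, void *opaque)
{
    struct fake_lv *lv = opaque;

    lv->settle_calls++;
    lv->last_timeout = timeout_seconds;
}

int
fake_remove(void *opaque)
{
    struct fake_lv *lv = opaque;

    if (lv->failures_left > 0) {
        lv->failures_left--;
        return -1;
    }
    return 0;
}
```

In the common case this costs at most the short settle; the long wait is
paid only when the removal has already failed once.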

Thanks,
-jim

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] udevadm settle can take too long

2012-04-26 Thread Osier Yang


[ CC to Cole ]




> It seems that those bugs were caused by something like
>
> 1. open(lv, O_RDWR)
> 2. close(lv)
> 3. system(lvremove ...)
>
> where udev would fire off a command between 2 and 3 that caused 3 to
> fail.  Adding udevadm settle as step 2.5 is a good way to wait for
> that command to finish, but:
>
> - it doesn't necessarily fix the issue; something could easily re-open
>   the device between 2.5 and 3 and cause the same failure.


Right.



> - the race condition sounds like it was a short window, and sometimes
>   the original sequence would still work even without the settle.
>   That would suggest to me that a timeout of 10s is still plenty long.
>
> A few thoughts:
>
> - For lvremove: can we try a short timeout (3 seconds), then if the
>   lvremove still fails, try again with the default udevadm timeout
>   (120 seconds)?
>
> - Even in that case, we need to fix libvirtd to not kill the
>   connection after 30 seconds when it's libvirtd's fault that the
>   connection is blocked for so long anyway.


Perhaps we need a timeout property for the client connection,
rather than hardcoding it to 30s.



> - When connecting with virt-manager, is the udevadm settle really
>   necessary?  We're not calling lvremove.


virt-manager's hang should be caused by the pool refresh, which
uses udevadm settle to wait for new devices to show up, so
it isn't related to lvremove.

Besides logical storage, the disk, scsi, and mpath storage
types use udevadm settle too, as does the node device driver.

Generally the pool refresh is invoked when libvirtd starts,
and of course it can also be invoked explicitly. :-) I.e.
virt-manager can't hang if it doesn't intend to refresh the
pool. So I guess the situation will be much worse if disk,
logical, scsi, and mpath pools all exist together.

I'm wondering whether virt-manager tries to refresh the pools when
it starts, or only when the user explicitly asks to check storage
(e.g. by clicking some button). If it's the first case, IMHO it
should be improved: letting the user get the connection first and
refreshing the pools only when necessary would be better.

I agree that introducing a timeout argument for udevadm
settle would be better, but hardcoding a timeout in
virFileWaitForDevices is not good; as we can see, it's used
many 

Re: [libvirt] udevadm settle can take too long

2012-04-25 Thread Osier Yang

On 2012-04-24 03:47, Guido Günther wrote:
> Hi,
> On Sun, Apr 22, 2012 at 02:41:54PM -0400, Jim Paris wrote:

>> diff --git a/src/util/util.c b/src/util/util.c
>> index 6e041d6..dfe458e 100644
>> --- a/src/util/util.c
>> +++ b/src/util/util.c
>> @@ -2593,9 +2593,9 @@ virFileFindMountPoint(const char *type ATTRIBUTE_UNUSED)
>>  void virFileWaitForDevices(void)
>>  {
>>  # ifdef UDEVADM
>> -    const char *const settleprog[] = { UDEVADM, "settle", NULL };
>> +    const char *const settleprog[] = { UDEVADM, "settle", "--timeout", "10", NULL };


Though I don't have a good idea to fix it either, I guess this
change could cause lvremove to fail again for the udev race.

See BZs:

https://bugzilla.redhat.com/show_bug.cgi?id=702260
https://bugzilla.redhat.com/show_bug.cgi?id=570359

Regards,
Osier


Re: [libvirt] udevadm settle can take too long

2012-04-23 Thread Guido Günther
Hi,
On Sun, Apr 22, 2012 at 02:41:54PM -0400, Jim Paris wrote:
> Hi,
>
> http://bugs.debian.org/663931 is a bug I'm hitting, where virt-manager
> times out on the initial connection to libvirt.

I reassigned the bug back to libvirt. I still wonder what triggers this
though for some users but not for others?
Cheers,
 -- Guido

 


Re: [libvirt] udevadm settle can take too long

2012-04-23 Thread Jim Paris
Guido Günther wrote:
> Hi,
> On Sun, Apr 22, 2012 at 02:41:54PM -0400, Jim Paris wrote:
>> Hi,
>>
>> http://bugs.debian.org/663931 is a bug I'm hitting, where virt-manager
>> times out on the initial connection to libvirt.
>
> I reassigned the bug back to libvirt. I still wonder what triggers this
> though for some users but not for others?
> Cheers,
>  -- Guido

On all of my machines, virt-manager hangs if udevadm settle hangs.
You can use the program I posted at that bug report to trigger the
udevadm problem (it can be undone by restarting udev).

Libvirtd only triggers the udevadm problem at startup, through its use
of network namespaces while probing lxc.  If anything else generates
uevents after that point, then the udevadm problem usually goes away.
For example, any module loads, hardware events (ejecting a CD, closing
a laptop lid, etc), or bringing up or down network interfaces (which
libvirt would typically do by itself when starting a new domain).
So most users might just avoid it through luck.  But if you manually
restart libvirtd right before trying virt-manager, you'll probably see
it too.

Thanks,
-jim



[libvirt] udevadm settle can take too long

2012-04-22 Thread Jim Paris
Hi,

http://bugs.debian.org/663931 is a bug I'm hitting, where virt-manager
times out on the initial connection to libvirt.

The basic problem is that, while checking storage volumes,
virt-manager causes libvirt to call udevadm settle.  There's an
interaction where libvirt's earlier use of network namespaces (to probe
LXC features) had caused some uevents to be sent that get filtered out
before they reach udev.  This confuses udevadm settle a bit, and so
it sits there waiting for a 2-3 minute built-in timeout before returning.
Eventually libvirtd prints:
  2012-04-22 18:22:18.678+: 30503: warning : virKeepAliveTimer:182 : No response from client 0x7feec4003630 after 5 keepalive messages in 30 seconds
and virt-manager prints:
  2012-04-22 18:22:18.931+: 30647: warning : virKeepAliveSend:128 : Failed to send keepalive response to client 0x25004e0
and the connection gets dropped.
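
The "5 keepalive messages in 30 seconds" figure in the warning can be
reproduced with a little arithmetic.  The sketch below assumes a probe every
5 seconds, a tolerance of 5 unanswered probes, and a connection that is
closed one interval after the last probe; those values and both function
names are assumptions chosen to match the log line, not something read out
of the libvirt sources.

```c
#include <assert.h>

/* Seconds of silence after which the peer is given up on, assuming one
 * keepalive probe every interval_s seconds and count unanswered probes
 * tolerated, with the connection closed one interval after the last probe. */
int
keepalive_deadline_seconds(int interval_s, int count)
{
    assert(interval_s > 0 && count >= 0);
    return interval_s * (count + 1);
}

/* Nonzero if a call that blocks for call_seconds would reach or outlast
 * the keepalive deadline. */
int
call_outlasts_keepalive(int call_seconds, int interval_s, int count)
{
    return call_seconds >= keepalive_deadline_seconds(interval_s, count);
}
```

With the assumed values, keepalive_deadline_seconds(5, 5) is 30, so a settle
that blocks for 2-3 minutes outlasts the deadline, while a 10 second settle
does not.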

One workaround could be to specify a shorter timeout when doing the
settle.  The patch appended below allows virt-manager to work,
although the connection still has to wait for the 10 second timeout
before it succeeds.  I don't know what a better solution would be,
though.  It seems the udevadm behavior might not be considered a bug
from the udev/kernel point of view:
  https://lkml.org/lkml/2012/4/22/60

I'm using Linux 3.2.14 with libvirt 0.9.11.  You can trigger the
udevadm issue using a program I posted at the Debian bug report link
above.

-jim

From 17e5b9ebab76acb0d711e8bc308023372fbc4180 Mon Sep 17 00:00:00 2001
From: Jim Paris j...@jtan.com
Date: Sun, 22 Apr 2012 14:35:47 -0400
Subject: [PATCH] shorten udevadmin settle timeout

Otherwise, udevadmin settle can take so long that connections from
e.g. virt-manager will get closed.
---
 src/util/util.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/util/util.c b/src/util/util.c
index 6e041d6..dfe458e 100644
--- a/src/util/util.c
+++ b/src/util/util.c
@@ -2593,9 +2593,9 @@ virFileFindMountPoint(const char *type ATTRIBUTE_UNUSED)
 void virFileWaitForDevices(void)
 {
 # ifdef UDEVADM
-    const char *const settleprog[] = { UDEVADM, "settle", NULL };
+    const char *const settleprog[] = { UDEVADM, "settle", "--timeout", "10", NULL };
 # else
-    const char *const settleprog[] = { UDEVSETTLE, NULL };
+    const char *const settleprog[] = { UDEVSETTLE, "--timeout", "10", NULL };
 # endif
     int exitstatus;
 
-- 
1.7.7
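
A reply in this thread objects to hardcoding the timeout inside
virFileWaitForDevices and would rather see it passed in as an argument.  As
a rough sketch of that idea only (not code from libvirt), the helper below
builds the settle command line from a caller-supplied timeout.
build_settle_argv and SETTLE_ARGV_MAX are hypothetical names, the path
argument stands in for the UDEVADM macro, and the option is written in its
single-argument "--timeout=N" form instead of the two arguments used in the
patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define SETTLE_ARGV_MAX 4  /* program, "settle", optional "--timeout=N", NULL */

/* Hypothetical helper, not part of libvirt.  Fills argv for "udevadm settle".
 * A negative timeout_seconds means "use udevadm's built-in default" and
 * omits the option.  optbuf receives the formatted option string and must
 * outlive argv.  Returns 0 on success, -1 if optbuf is too small. */
int
build_settle_argv(const char *udevadm_path, int timeout_seconds,
                  char *optbuf, size_t optbuf_len,
                  const char *argv[SETTLE_ARGV_MAX])
{
    size_t n = 0;

    assert(udevadm_path != NULL && optbuf != NULL && argv != NULL);

    argv[n++] = udevadm_path;
    argv[n++] = "settle";
    if (timeout_seconds >= 0) {
        int len = snprintf(optbuf, optbuf_len, "--timeout=%d", timeout_seconds);

        if (len < 0 || (size_t)len >= optbuf_len)
            return -1;  /* option string was truncated */
        argv[n++] = optbuf;
    }
    argv[n] = NULL;
    return 0;
}
```

A caller that must not block a client connection could pass a small value,
while a caller about to run lvremove could pass a negative value to keep the
default wait.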

