Re: [libvirt] problems with remote authentication with policykit
On Thu, Jun 18, 2009 at 12:20:40PM -0400, Jim Paris wrote:
> Daniel P. Berrange wrote:
> > We close the socket to the 'nc' process here, so in theory it should be getting a HUP event from poll or EOF from read, etc. and then exiting. Ominously though, I see several patches to Fedora's 'nc' RPM, at least one of which is related to nc hanging forever after getting HUP back from poll(). What distro are you using?
> >
> > http://cvs.fedoraproject.org/viewvc/rpms/nc/F-11/
>
> I'm using Debian. I've already had to switch from the netcat-traditional package to the netcat-openbsd package. Debian does already include that patch, but what a mess...
>
> Since we already know libvirtd is installed on the remote host, would it make sense to just add a new set of options:
>
>   libvirtd --socket-connect
>   libvirtd --socket-connect-ro
>
> that do the same thing as nc -U on the appropriate socket? Then we know it would work everywhere, and have the added benefit that the client wouldn't need to know the location of the socket.

Last time I checked, Debian's nc wouldn't properly close the socket on EOF, so I added this to virt-manager:

-argv += [ server, "nc", vncaddr, str(vncport) ]
+argv += [ server, "nc", "-q", "0", vncaddr, str(vncport) ]

The problem you're seeing suggests that we need this in virsh too. Could you check if this helps?

Cheers,
 -- Guido

--
Libvir-list mailing list
Libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] problems with remote authentication with policykit
On Wed, Jun 17, 2009 at 06:36:16PM -0400, Jim Paris wrote:
> Daniel P. Berrange wrote:
> > On Wed, Jun 17, 2009 at 05:51:27PM -0400, Jim Paris wrote:
> > > Daniel P. Berrange wrote:
> > > 17:34:59.360: debug : call:6947 : Doing call 70 (nil)
> > > 17:34:59.360: debug : call:7017 : We have the buck 70 0xbccef0 0xbccef0
> > > 17:34:59.433: debug : processCallRecvLen:6605 : Got length, now need 128 total (124 more)
> > > 17:34:59.434: debug : processCalls:6873 : Giving up the buck 70 0xbccef0 (nil)
> > > 17:34:59.434: debug : call:7048 : All done with our call 70 (nil) 0xbccef0
> > > 17:34:59.434: error : server_error:7231 : authentication failed
> > > 17:35:13.585: debug : do_open:999 : driver 4 remote returned ERROR
> > > 17:35:13.585: debug : virUnrefConnect:232 : unref connection 0xbc6a60 1
> > > 17:35:13.585: debug : virReleaseConnect:191 : release connection 0xbc6a60
> > >
> > > If I kill the libvirtd process on the server, the client then finally prints:
> > >
> > >   error: authentication failed
> > >   error: failed to connect to the hypervisor
> > >
> > > and the client then exits.
> >
> > Ok, this bit definitely sounds like a server side bug, unless perhaps there is some buffering taking place in ssh or nc causing the error reply packet to not be sent back promptly.
>
> I'll try to get some better traces of what's going on here.
>
> > > The hang aside, it seems libvirtd should be using org.libvirt.unix.monitor for the readonly connection?
> >
> > In this case the problem is that the remote client end is using netcat on the wrong UNIX socket.
>
> Thanks, that's it. With the attached patch on the client side, virsh --readonly and virt-viewer work fine over qemu+ssh://.
>
> -jim
>
> --- libvirt-0.6.4-orig/src/remote_internal.c 2009-05-29 10:55:26.0 -0400
> +++ libvirt-0.6.4/src/remote_internal.c 2009-06-17 18:21:34.0 -0400
> @@ -700,7 +700,10 @@
>          cmd_argv[j++] = strdup (priv->hostname);
>          cmd_argv[j++] = strdup (netcat ? netcat : "nc");
>          cmd_argv[j++] = strdup ("-U");
> -        cmd_argv[j++] = strdup (sockname ? sockname : LIBVIRTD_PRIV_UNIX_SOCKET);
> +        cmd_argv[j++] = strdup (sockname ? sockname :
> +                                (flags & VIR_CONNECT_RO
> +                                 ? LIBVIRTD_PRIV_UNIX_SOCKET_RO
> +                                 : LIBVIRTD_PRIV_UNIX_SOCKET));
>          cmd_argv[j++] = 0;
>          assert (j == nr_args);
>          for (j = 0; j < (nr_args-1); j++)

Ok, I've committed this change.

Daniel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
Re: [libvirt] problems with remote authentication with policykit
On Wed, Jun 17, 2009 at 07:27:22PM -0400, Jim Paris wrote:
> I wrote:
> > > Ok, this bit definitely sounds like a server side bug, unless perhaps there is some buffering taking place in ssh or nc causing the error reply packet to not be sent back promptly.
> >
> > I'll try to get some better traces of what's going on here.
>
> The error is getting back to the client. On the client, remoteAuthenticate does fail and return -1. The client then ends up blocked in the waitpid at remote_internal.c:877:
>
> 865  failed:
> 866     /* Close the socket if we failed. */
> 867     if (priv->sock >= 0) {
> 868         if (priv->uses_tls && priv->session) {
> 869             gnutls_bye (priv->session, GNUTLS_SHUT_RDWR);
> 870             gnutls_deinit (priv->session);
> 871         }
> 872         close (priv->sock);
> 873 #ifndef WIN32
> 874         if (priv->pid > 0) {
> 875             pid_t reap;
> 876             do {
> 877                 reap = waitpid(priv->pid, NULL, 0);
> 878                 if (reap == -1 && errno == EINTR)
> 879                     continue;
> 880             } while (reap != -1 && reap != priv->pid);
> 881         }
> 882 #endif
> 883     }
>
> Nothing gets printed up until this point, which is why there's no output. I guess the client is waiting for SSH to die, which isn't happening for some reason. That must be a bug on the server side, although the client should also probably be more robust in this case.

We close the socket to the 'nc' process here, so in theory it should be getting a HUP event from poll or EOF from read, etc. and then exiting. Ominously though, I see several patches to Fedora's 'nc' RPM, at least one of which is related to nc hanging forever after getting HUP back from poll(). What distro are you using?

http://cvs.fedoraproject.org/viewvc/rpms/nc/F-11/

Daniel
Re: [libvirt] problems with remote authentication with policykit
Daniel P. Berrange wrote:
> We close the socket to the 'nc' process here, so in theory it should be getting a HUP event from poll or EOF from read, etc. and then exiting. Ominously though, I see several patches to Fedora's 'nc' RPM, at least one of which is related to nc hanging forever after getting HUP back from poll(). What distro are you using?
>
> http://cvs.fedoraproject.org/viewvc/rpms/nc/F-11/

I'm using Debian. I've already had to switch from the netcat-traditional package to the netcat-openbsd package. Debian does already include that patch, but what a mess...

Since we already know libvirtd is installed on the remote host, would it make sense to just add a new set of options:

  libvirtd --socket-connect
  libvirtd --socket-connect-ro

that do the same thing as nc -U on the appropriate socket? Then we know it would work everywhere, and have the added benefit that the client wouldn't need to know the location of the socket.

-jim
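For readers following the proposal above: the core of what `nc -U` does, and what a built-in `--socket-connect` mode would have to do first, is open a stream connection to a UNIX domain socket. The following is only a minimal sketch of that step using the standard sockets API. The function name `connect_unix` is hypothetical and is not part of libvirt, and the relay loop between stdin/stdout and the socket is left out.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper (not a libvirt function): connect to the UNIX
 * domain stream socket at 'path'. Returns a connected file descriptor,
 * or -1 on failure. */
static int connect_unix(const char *path)
{
    struct sockaddr_un addr;
    int fd;

    /* sun_path is a fixed-size buffer; reject paths that do not fit. */
    if (strlen(path) >= sizeof(addr.sun_path))
        return -1;

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A real implementation would then copy bytes in both directions between the socket and the process's stdin/stdout until either side reaches EOF, which is exactly the part where the various netcat builds discussed in this thread differ.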
Re: [libvirt] problems with remote authentication with policykit
On Thu, Jun 18, 2009 at 12:20:40PM -0400, Jim Paris wrote:
> Daniel P. Berrange wrote:
> > We close the socket to the 'nc' process here, so in theory it should be getting a HUP event from poll or EOF from read, etc. and then exiting. Ominously though, I see several patches to Fedora's 'nc' RPM, at least one of which is related to nc hanging forever after getting HUP back from poll(). What distro are you using?
> >
> > http://cvs.fedoraproject.org/viewvc/rpms/nc/F-11/
>
> I'm using Debian. I've already had to switch from the netcat-traditional package to the netcat-openbsd package. Debian does already include that patch, but what a mess...

I know the reason why it gets stuck on the server end too: after an auth failure, the server won't kick off the client. The connection just remains in an unauthenticated state. This allows the client to (in theory) retry the authentication step, and gives us a little more flexibility for any future protocol changes we might need to make.

I think the best way to solve the problem of 'nc' potentially not quitting promptly is to simply have the remote client kill() the SSH client pid, rather than simply closing the socket and doing waitpid() on the SSH client. This would ensure the waitpid promptly cleans up.

> Since we already know libvirtd is installed on the remote host, would it make sense to just add a new set of options:
>
>   libvirtd --socket-connect
>   libvirtd --socket-connect-ro
>
> that do the same thing as nc -U on the appropriate socket? Then we know it would work everywhere, and have the added benefit that the client wouldn't need to know the location of the socket.

If we'd thought of this originally, I would certainly have done it this way, but if we did this now, it would break compatibility, i.e. new libvirt clients would be trying to run a binary that does not exist with old server deployments.

Regards,
Daniel
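The kill()-then-waitpid() idea from the message above can be sketched roughly as follows. This is an illustration only, not the change that went into libvirt: the helper name `kill_and_reap` and the choice of SIGTERM are assumptions made for the example.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not libvirt code): signal the tunnel child, e.g.
 * the ssh process, and then reap it, instead of only closing the socket
 * and hoping the far end notices.
 * Returns 0 once the child has been reaped, -1 on error. */
static int kill_and_reap(pid_t pid)
{
    pid_t reap;

    if (pid <= 0)
        return -1;

    /* ESRCH means no such process exists; fall through to waitpid(),
     * which will report the error if there is nothing to reap. */
    if (kill(pid, SIGTERM) == -1 && errno != ESRCH)
        return -1;

    /* Retry only when waitpid() is interrupted by a signal. */
    do {
        reap = waitpid(pid, NULL, 0);
    } while (reap == -1 && errno == EINTR);

    return reap == pid ? 0 : -1;
}
```

Because the child is signalled before the blocking waitpid(), the wait no longer depends on the remote nc noticing EOF or HUP, which is the failure mode described earlier in the thread.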
Re: [libvirt] problems with remote authentication with policykit
Daniel P. Berrange wrote:
> On Thu, Jun 18, 2009 at 12:20:40PM -0400, Jim Paris wrote:
> > I'm using Debian. I've already had to switch from the netcat-traditional package to the netcat-openbsd package. Debian does already include that patch, but what a mess...
>
> I know the reason why it gets stuck on the server end too: after an auth failure, the server won't kick off the client. The connection just remains in an unauthenticated state. This allows the client to (in theory) retry the authentication step, and gives us a little more flexibility for any future protocol changes we might need to make.

Makes sense -- it would be nice for the client to be able to retry with read-only authentication when read-write fails, without having to reopen the SSH connection. Or is that not possible, since it would require opening a different socket?

> I think the best way to solve the problem of 'nc' potentially not quitting promptly is to simply have the remote client kill() the SSH client pid, rather than simply closing the socket and doing waitpid() on the SSH client. This would ensure the waitpid promptly cleans up.

Yeah, that should fix the hang.

> > Since we already know libvirtd is installed on the remote host, would it make sense to just add a new set of options:
> >
> >   libvirtd --socket-connect
> >   libvirtd --socket-connect-ro
> >
> > that do the same thing as nc -U on the appropriate socket? Then we know it would work everywhere, and have the added benefit that the client wouldn't need to know the location of the socket.
>
> If we'd thought of this originally, I would certainly have done it this way, but if we did this now, it would break compatibility, i.e. new libvirt clients would be trying to run a binary that does not exist with old server deployments.

It could still be done in a backwards-compatible way. Something like:

  ssh server libvirtd --socket-connect || nc -U /socket

Or, if you really wanted to be nice to us Debian folks,

  ssh server libvirtd --socket-connect || nc.openbsd -U /socket || nc -U /socket

(while the Debian libvirt package does depend on netcat-openbsd, there's nothing that forces the local nc symlink to point to the openbsd version over the traditional version, if both are installed).

It's definitely messy, but it would really be nice to remove the need for the client to know which netcat to use, or where sockets are located, etc.

Hmm, as I think about it more, I guess netcat is also used for VNC connections? I wonder if that could be implemented as a dynamic port forward on the existing SSH connection, which would also eliminate the need for a second connection (and having to enter the password a second time)...

-jim
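As a rough illustration of the fallback chain suggested above, the client could assemble the remote command string along these lines. This is a sketch under stated assumptions: `build_remote_cmd` is a hypothetical helper, `libvirtd --socket-connect` is the option proposed in this thread rather than an existing flag, and the socket path is inserted without any shell quoting.

```c
#include <stdio.h>

/* Hypothetical helper: write the fallback chain from the message above
 * into 'buf'. "libvirtd --socket-connect" is the option proposed in the
 * thread, not an existing libvirtd flag. 'sock_path' is inserted as-is,
 * so it must not contain shell metacharacters.
 * Returns 0 on success, -1 if the command does not fit in 'buf'. */
static int build_remote_cmd(char *buf, size_t len, const char *sock_path)
{
    int n = snprintf(buf, len,
                     "libvirtd --socket-connect"
                     " || nc.openbsd -U %s"
                     " || nc -U %s",
                     sock_path, sock_path);

    /* snprintf returns the length it wanted to write; treat truncation
     * as an error rather than running a partial command. */
    return (n >= 0 && (size_t)n < len) ? 0 : -1;
}
```

One caveat with any `||` chain of this kind: the shell moves on to the next command whenever the previous one exits non-zero, whether that is because the binary is missing or because an established session later ended with an error.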
Re: [libvirt] problems with remote authentication with policykit
Daniel P. Berrange wrote:
> > But when accessing remotely, I get no useful error, and a hang:
> >
> >   $ virsh -c qemu+ssh://j...@server/system
> >   libvir: Remote error : authentication failed
> >   (process hangs here)
> >
> >   $ virsh --readonly -c qemu+ssh://j...@server/system
> >   libvir: Remote error : authentication failed
> >   (process hangs here)
> >
> > Furthermore, on the server, this leaves nc processes running, and eventually there are enough that libvirtd stops accepting new connections.
>
> The hang is really odd. That suggests something is not closing the socket connection properly. If you had been using 0.6.1/.2/.3 I would have said it was one of the libvirtd bugs, but 0.6.4 has all event handling bugs fixed. Perhaps the libvirtd client is not killing the SSH session / process when it closes the connection after auth failure.

I was using 0.4.6 on the client side. I upgraded that to 0.6.4, but I still get the hang. Virsh prints nothing; the LIBVIRT_DEBUG output is:

17:34:58.524: debug : doRemoteOpen:505 : proceeding with name = qemu:///system
17:34:58.525: debug : virExecWithHook:573 : ssh server nc -U /var/run/libvirt/libvirt-sock
17:34:58.526: debug : call:6947 : Doing call 66 (nil)
17:34:58.527: debug : call:7017 : We have the buck 66 0x7fba56729010 0x7fba56729010
17:34:59.359: debug : processCallRecvLen:6605 : Got length, now need 36 total (32 more)
17:34:59.360: debug : processCalls:6873 : Giving up the buck 66 0x7fba56729010 (nil)
17:34:59.360: debug : call:7048 : All done with our call 66 (nil) 0x7fba56729010
17:34:59.360: debug : remoteAuthPolkit:6114 : Client initialize PolicyKit authentication
17:34:59.360: debug : call:6947 : Doing call 70 (nil)
17:34:59.360: debug : call:7017 : We have the buck 70 0xbccef0 0xbccef0
17:34:59.433: debug : processCallRecvLen:6605 : Got length, now need 128 total (124 more)
17:34:59.434: debug : processCalls:6873 : Giving up the buck 70 0xbccef0 (nil)
17:34:59.434: debug : call:7048 : All done with our call 70 (nil) 0xbccef0
17:34:59.434: error : server_error:7231 : authentication failed
17:35:13.585: debug : do_open:999 : driver 4 remote returned ERROR
17:35:13.585: debug : virUnrefConnect:232 : unref connection 0xbc6a60 1
17:35:13.585: debug : virReleaseConnect:191 : release connection 0xbc6a60

If I kill the libvirtd process on the server, the client then finally prints:

  error: authentication failed
  error: failed to connect to the hypervisor

and the client then exits.

On the server side, the libvirtd output is:

17:34:59.378: debug : remoteDispatchAuthPolkit:3385 : Start PolicyKit auth 25
17:34:59.378: info : remoteDispatchAuthPolkit:3396 : Checking PID 7551 running as 1000
17:34:59.379: debug : virEventRunOnce:567 : Poll got 1 event
17:34:59.379: debug : virEventDispatchHandles:450 : Dispatch n=2 f=9 w=3 e=1 0x1a72790
17:34:59.379: debug : nodeDeviceLock:52 : LOCK node 0x1a748e0
17:34:59.379: debug : nodeDeviceUnlock:57 : UNLOCK node 0x1a748e0
17:34:59.426: error : remoteDispatchAuthPolkit:3451 : Policy kit denied action org.libvirt.unix.manage from pid 7551, uid 1000, result: auth_admin_keep_session

The hang aside, it seems libvirtd should be using org.libvirt.unix.monitor for the readonly connection?

> > Is policykit authentication supposed to work over qemu+ssh?
>
> Yes, but only if you ssh as root such that policykit is a no-op. The problem you are seeing is because you SSH as non-root. PolicyKit relies on ConsoleKit to determine who is authorized, and SSH does not register ConsoleKit sessions.

As I mentioned, I've modified the PolicyKit libvirtd configuration to not require a session:

  <match action="org.libvirt.unix.manage">
    <return result="auth_admin_keep_session"/>
  </match>

so I was hoping that wouldn't be a problem. With this configuration, I think even using libpam-ck-connector wouldn't change things?

> > I was hoping it would at least not break the --readonly case.
>
> That all said, --readonly is intended to work at all times. Our default policy file includes a rule
>
>   <allow_any>yes</allow_any>
>
> which is telling policykit to allow access even if the client is not associated with any ConsoleKit session. So this should have allowed it to work for you with --readonly.

Right, it seems libvirtd is missing readonly somehow?

-jim
Re: [libvirt] problems with remote authentication with policykit
On Wed, Jun 17, 2009 at 05:51:27PM -0400, Jim Paris wrote:
> Daniel P. Berrange wrote:
> 17:34:59.360: debug : call:6947 : Doing call 70 (nil)
> 17:34:59.360: debug : call:7017 : We have the buck 70 0xbccef0 0xbccef0
> 17:34:59.433: debug : processCallRecvLen:6605 : Got length, now need 128 total (124 more)
> 17:34:59.434: debug : processCalls:6873 : Giving up the buck 70 0xbccef0 (nil)
> 17:34:59.434: debug : call:7048 : All done with our call 70 (nil) 0xbccef0
> 17:34:59.434: error : server_error:7231 : authentication failed
> 17:35:13.585: debug : do_open:999 : driver 4 remote returned ERROR
> 17:35:13.585: debug : virUnrefConnect:232 : unref connection 0xbc6a60 1
> 17:35:13.585: debug : virReleaseConnect:191 : release connection 0xbc6a60
>
> If I kill the libvirtd process on the server, the client then finally prints:
>
>   error: authentication failed
>   error: failed to connect to the hypervisor
>
> and the client then exits.

Ok, this bit definitely sounds like a server side bug, unless perhaps there is some buffering taking place in ssh or nc causing the error reply packet to not be sent back promptly.

> On the server side, the libvirtd output is:
>
> 17:34:59.378: debug : remoteDispatchAuthPolkit:3385 : Start PolicyKit auth 25
> 17:34:59.378: info : remoteDispatchAuthPolkit:3396 : Checking PID 7551 running as 1000
> 17:34:59.379: debug : virEventRunOnce:567 : Poll got 1 event
> 17:34:59.379: debug : virEventDispatchHandles:450 : Dispatch n=2 f=9 w=3 e=1 0x1a72790
> 17:34:59.379: debug : nodeDeviceLock:52 : LOCK node 0x1a748e0
> 17:34:59.379: debug : nodeDeviceUnlock:57 : UNLOCK node 0x1a748e0
> 17:34:59.426: error : remoteDispatchAuthPolkit:3451 : Policy kit denied action org.libvirt.unix.manage from pid 7551, uid 1000, result: auth_admin_keep_session
>
> The hang aside, it seems libvirtd should be using org.libvirt.unix.monitor for the readonly connection?

In this case the problem is that the remote client end is using netcat on the wrong UNIX socket. In remote_internal.c it does

  cmd_argv[j++] = strdup (sockname ? sockname : LIBVIRTD_PRIV_UNIX_SOCKET);

when it should be doing

  cmd_argv[j++] = strdup (sockname ? sockname :
                          (flags & VIR_CONNECT_RO
                           ? LIBVIRTD_PRIV_UNIX_SOCKET_RO
                           : LIBVIRTD_PRIV_UNIX_SOCKET));

That would make libvirtd use the correct permission check.

Daniel
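The selection logic described above can be tried in isolation with a small stand-alone sketch. The macro values below are stand-ins chosen for the example, not copied from the libvirt headers: the read-write path is the one shown in the debug log earlier in this thread, the read-only path and the flag value are assumptions, and `pick_socket` is a hypothetical name.

```c
#include <stddef.h>

/* Stand-ins for the libvirt definitions; the values are assumptions
 * made for this example. */
#define VIR_CONNECT_RO 1
#define LIBVIRTD_PRIV_UNIX_SOCKET    "/var/run/libvirt/libvirt-sock"
#define LIBVIRTD_PRIV_UNIX_SOCKET_RO "/var/run/libvirt/libvirt-sock-ro"

/* Hypothetical helper: choose the socket that netcat should connect to.
 * An explicit 'sockname' always wins; otherwise a read-only connection
 * gets the read-only socket. */
static const char *pick_socket(const char *sockname, unsigned int flags)
{
    if (sockname != NULL)
        return sockname;

    return (flags & VIR_CONNECT_RO) ? LIBVIRTD_PRIV_UNIX_SOCKET_RO
                                    : LIBVIRTD_PRIV_UNIX_SOCKET;
}
```

With this in place, a read-only connection is directed at the read-only socket, so the server is no longer asked to check org.libvirt.unix.manage for it, as it was in the log above.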
Re: [libvirt] problems with remote authentication with policykit
Daniel P. Berrange wrote:
> On Wed, Jun 17, 2009 at 05:51:27PM -0400, Jim Paris wrote:
> > Daniel P. Berrange wrote:
> > 17:34:59.360: debug : call:6947 : Doing call 70 (nil)
> > 17:34:59.360: debug : call:7017 : We have the buck 70 0xbccef0 0xbccef0
> > 17:34:59.433: debug : processCallRecvLen:6605 : Got length, now need 128 total (124 more)
> > 17:34:59.434: debug : processCalls:6873 : Giving up the buck 70 0xbccef0 (nil)
> > 17:34:59.434: debug : call:7048 : All done with our call 70 (nil) 0xbccef0
> > 17:34:59.434: error : server_error:7231 : authentication failed
> > 17:35:13.585: debug : do_open:999 : driver 4 remote returned ERROR
> > 17:35:13.585: debug : virUnrefConnect:232 : unref connection 0xbc6a60 1
> > 17:35:13.585: debug : virReleaseConnect:191 : release connection 0xbc6a60
> >
> > If I kill the libvirtd process on the server, the client then finally prints:
> >
> >   error: authentication failed
> >   error: failed to connect to the hypervisor
> >
> > and the client then exits.
>
> Ok, this bit definitely sounds like a server side bug, unless perhaps there is some buffering taking place in ssh or nc causing the error reply packet to not be sent back promptly.

I'll try to get some better traces of what's going on here.

> > The hang aside, it seems libvirtd should be using org.libvirt.unix.monitor for the readonly connection?
>
> In this case the problem is that the remote client end is using netcat on the wrong UNIX socket.

Thanks, that's it. With the attached patch on the client side, virsh --readonly and virt-viewer work fine over qemu+ssh://.

-jim

--- libvirt-0.6.4-orig/src/remote_internal.c 2009-05-29 10:55:26.0 -0400
+++ libvirt-0.6.4/src/remote_internal.c 2009-06-17 18:21:34.0 -0400
@@ -700,7 +700,10 @@
         cmd_argv[j++] = strdup (priv->hostname);
         cmd_argv[j++] = strdup (netcat ? netcat : "nc");
         cmd_argv[j++] = strdup ("-U");
-        cmd_argv[j++] = strdup (sockname ? sockname : LIBVIRTD_PRIV_UNIX_SOCKET);
+        cmd_argv[j++] = strdup (sockname ? sockname :
+                                (flags & VIR_CONNECT_RO
+                                 ? LIBVIRTD_PRIV_UNIX_SOCKET_RO
+                                 : LIBVIRTD_PRIV_UNIX_SOCKET));
         cmd_argv[j++] = 0;
         assert (j == nr_args);
         for (j = 0; j < (nr_args-1); j++)
[libvirt] problems with remote authentication with policykit
Hi,

I have libvirt 0.6.4 running kvm instances on a headless server. I'm using virt-manager 0.7.0 to manage them. In the past, I would SSH in and run virt-manager as root. Since running GTK apps as root is no good, I've switched to policykit authentication.

By default, the libvirt policy only allows management if the user is in the active host session, which isn't the case with my SSH logins. Therefore I've added an override in /etc/PolicyKit/PolicyKit.conf:

  <match action="org.libvirt.unix.manage">
    <return result="auth_admin_keep_session"/>
  </match>

Now things generally work fine when SSHed in:

- as root, virsh gives ro and rw access with no password
- as jim, virsh gives ro access with no password, but requests a password for rw
- as jim, virsh asks for a password for rw access

But when accessing remotely, I get no useful error, and a hang:

  $ virsh -c qemu+ssh://j...@server/system
  libvir: Remote error : authentication failed
  (process hangs here)

  $ virsh --readonly -c qemu+ssh://j...@server/system
  libvir: Remote error : authentication failed
  (process hangs here)

Furthermore, on the server, this leaves nc processes running, and eventually there are enough that libvirtd stops accepting new connections.

I was also getting strange errors including:

  polkit-grant-helper: given auth type (8 - yes) is bogus

but now I can't reproduce that for the life of me, and I have no idea what changed.

Is policykit authentication supposed to work over qemu+ssh? I was hoping it would at least not break the --readonly case.

-jim