[libvirt] [PATCH] virsh: fix invalid free
* tools/virsh.c (cmdDetachDisk): fix invalid free due to using an
  uninitialised value.

* How to reproduce?

  # virsh detach-disk a b
  error: failed to get domain 'a'
  *** glibc detected *** virsh: double free or corruption (out): 0x7fff410ed1a0 ***
  === Backtrace: =========
  /lib64/libc.so.6[0x39cf0750c6]
  /usr/lib/libvirt.so.0(virFree+0x39)[0x7f045938a239]
  virsh[0x41c768]
  virsh[0x415075]
  virsh[0x425d64]
  /lib64/libc.so.6(__libc_start_main+0xfd)[0x39cf01ecdd]
  virsh[0x40a419]
  === Memory map: ========
  0040-0044e000 r-xp 08:0e 760441 /usr/bin/virsh
  ...
  Aborted (core dumped)

Signed-off-by: Alex Jia <a...@redhat.com>
---
 tools/virsh.c | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/tools/virsh.c b/tools/virsh.c
index 630b77f..d45a4c9 100644
--- a/tools/virsh.c
+++ b/tools/virsh.c
@@ -14826,7 +14826,7 @@ cmdDetachDisk(vshControl *ctl, const vshCmd *cmd)
     char *disk_xml = NULL;
     virDomainPtr dom = NULL;
     const char *target = NULL;
-    char *doc;
+    char *doc = NULL;
     int ret;
     bool functionReturn = false;
     unsigned int flags;
-- 
1.7.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [Resending][PATCH v2 2/2] x86: Allow sysinfo to fall back on /proc/cpuinfo if demidecode is absent
On Thu, Mar 15, 2012 at 11:21:09AM +0530, Prerna Saxena wrote:
> Hi Daniel,
> Thank you for taking a look.
>
> On 03/14/2012 02:29 PM, Daniel Veillard wrote:
>> On Tue, Mar 13, 2012 at 11:10:16AM +0530, Prerna Saxena wrote:
>>> From: Prerna Saxena <pre...@linux.vnet.ibm.com>
>>> Date: Tue, 13 Mar 2012 15:33:43 +0530
>>> Subject: [PATCH 2/2] Sysinfo: Allow x86 to fetch sysinfo from
>>>  /proc/cpuinfo in the event 'dmidecode' is absent in the system.
>>>
>>> Until now, libvirt on x86 flags an error message if dmidecode is not
>>> found. With this patch, the following is a sample output on x86 when
>>> dmidecode is absent:
>>>
>>> virsh # sysinfo
>>> <sysinfo type='smbios'>
>>>   <processor>
>>>     <entry name='socket_destination'>0</entry>
>>>     <entry name='type'>Intel(R) Xeon(R) CPU X5570 @ 2.93GHz</entry>
>>>     <entry name='family'>6</entry>
>>>     <entry name='manufacturer'>GenuineIntel</entry>
>>>   </processor>
>>>   <processor>
>>>     <entry name='socket_destination'>1</entry>
>>>     <entry name='type'>Intel(R) Xeon(R) CPU X5570 @ 2.93GHz</entry>
>>>     <entry name='family'>6</entry>
>>>     <entry name='manufacturer'>GenuineIntel</entry>
>>>   </processor>
>>>   ... (listing for all online CPUs)
>>> </sysinfo>
>>>
>>> Based on a suggestion from Eric:
>>> (http://www.redhat.com/archives/libvir-list/2012-February/msg00509.html)
>>>
>>> Acked-by: Daniel P Berrange <berra...@redhat.com>
>>> Signed-off-by: Prerna Saxena <pre...@linux.vnet.ibm.com>
>>> ---
>>>  src/util/sysinfo.c | 97 +++++++++++++++++++++++++++++++++++++++-----
>>>  1 files changed, 93 insertions(+), 4 deletions(-)
>>> ... [snip] ...
>>
>> Hi Prerna,
>>
>> that sounds like a good idea, and the patch seems to work, but I have
>> doubts about its usefulness in its current form. Let me explain: with
>> dmidecode available on my system I get:
>>
>>   ...
>>   <processor>
>>     <entry name='socket_destination'>Socket 775</entry>
>>     <entry name='type'>Central Processor</entry>
>>     <entry name='family'>Other</entry>
>>     <entry name='manufacturer'>Intel</entry>
>>     <entry name='signature'>Type 0, Family 6, Model 15, Stepping 11</entry>
>>     <entry name='version'>Intel(R) Core(TM)2 Duo CPU E6550 @ 2.33GHz</entry>
>>     <entry name='external_clock'>333 MHz</entry>
>>     <entry name='max_speed'>4000 MHz</entry>
>>     <entry name='status'>Populated, Enabled</entry>
>>   </processor>
>>   ...
>> without dmidecode and your patch plugged in I get:
>>
>>   <processor>
>>     <entry name='socket_destination'>0</entry>
>>     <entry name='type'>Intel(R) Core(TM)2 Duo CPU E6550 @ 2.33GHz</entry>
>>     <entry name='family'>6</entry>
>>     <entry name='manufacturer'>GenuineIntel</entry>
>>   </processor>
>>   <processor>
>>     <entry name='socket_destination'>1</entry>
>>     <entry name='type'>Intel(R) Core(TM)2 Duo CPU E6550 @ 2.33GHz</entry>
>>     <entry name='family'>6</entry>
>>     <entry name='manufacturer'>GenuineIntel</entry>
>>   </processor>
>>
>> So basically we get information, some of it available in both cases but
>> presented differently, and worse, in the fallback case we get two
>> physical processor entries (I have only one), which is of course
>> different from the single processor we get with dmidecode. So:
>> 1/ it seems to me the fallback data can't be parsed programmatically as
>>    a replacement for the original data
>> 2/ the data may be misunderstood and lead to erroneous decisions; for
>>    example, a scheduler may start to stack twice as much load on my
>>    machine based on the difference in the report.
>> So I'm a bit worried about applying it as-is; I'm afraid we need to
>> reconcile the output between both cases (as much as possible,
>> considering there is less data).
>
> Thanks for pointing this out. I investigated this discrepancy, and
> discovered that 'dmidecode' presents a listing of processor *cores*.
> However, in /proc/cpuinfo, all hardware threads in a processor show up
> as independent processors. So, while dmidecode correctly reads that my
> system has a single core, /proc/cpuinfo reports the two hardware
> threads in the core as two independent logical CPUs.
>
> To sort this out, one alternative would be to parse the physical_id in
> /proc/cpuinfo -- this would be identical for all logical processors in
> a given core, and thus can be used to report the number of cores in the
> system. Will send a modified patch asap.
>
>> That said, I think patch 1/2 looks fine to me, and could probably be
>> applied as-is,
>
> Thanks! Would you want to apply it as-is, or shall I send a rebased
> version?
Well, if you're fixing 2/2 before the end of next week, I suggest
applying both together and hence rebasing 1/2 when you submit 2/2 v3 :)

Daniel

-- 
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
dan...@veillard.com  | Rpmfind RPM search engine      http://rpmfind.net/
http://veillard.com/ | virtualization library         http://libvirt.org/
Re: [libvirt] [PATCH] virsh: fix invalid free
On 03/15/2012 02:06 PM, Alex Jia wrote:
> * tools/virsh.c (cmdDetachDisk): fix invalid free due to using an
>   uninitialised value.
> * How to reproduce?
>   # virsh detach-disk a b
>   error: failed to get domain 'a'
>   *** glibc detected *** virsh: double free or corruption (out): 0x7fff410ed1a0 ***

Good catch, ACK.
Re: [libvirt] [Resending][PATCH v2 2/2] x86: Allow sysinfo to fall back on /proc/cpuinfo if demidecode is absent
On 03/15/2012 11:37 AM, Daniel Veillard wrote:
> On Thu, Mar 15, 2012 at 11:21:09AM +0530, Prerna Saxena wrote:
>> Hi Daniel,
>> Thank you for taking a look.
>> [... the quoted patch description, the dmidecode and /proc/cpuinfo
>> sample outputs, and the discussion of the discrepancy between them are
>> quoted unchanged from the previous message in this thread ...]
>> To sort this out, one alternative would be to parse the physical_id in
>> /proc/cpuinfo -- this would be identical for all logical processors in
>> a given core, and thus can be used to report the number of cores in
>> the system. Will send a modified patch asap.

Hi Daniel,
I just realised a correction to the explanation above -- dmidecode only
gives a per-socket listing, while /proc/cpuinfo lists all the physical
cores within a socket. So if dmidecode lists a single entry for a
processor, it can be inferred that the machine in question has one
socket. /proc/cpuinfo will have listings for each physical core in that
socket, plus the hardware threads available.

As mentioned earlier, I will be happy to spin a patch that uses
'physical_id' (constant for all cores in a socket) to provide
socket-level information. This will attain parity with the 'dmidecode'
output and will report *one socket* for such a machine under the
'processor' XML tag (which could be a little misleading).

However, I am curious -- what benefit would the number of sockets be to
a libvirt user? I expect users would mostly care about the number of
available CPU cores to make scheduling decisions.
Re: [libvirt] [PATCH] virsh: fix invalid free
Thanks and pushed now.

----- Original Message -----
From: Osier Yang <jy...@redhat.com>
To: Alex Jia <a...@redhat.com>
Cc: libvir-list@redhat.com
Sent: Thursday, March 15, 2012 3:16:48 PM
Subject: Re: [libvirt] [PATCH] virsh: fix invalid free

On 03/15/2012 02:06 PM, Alex Jia wrote:
> * tools/virsh.c (cmdDetachDisk): fix invalid free due to using an
>   uninitialised value.
> * How to reproduce?
>   # virsh detach-disk a b
>   error: failed to get domain 'a'
>   *** glibc detected *** virsh: double free or corruption (out): 0x7fff410ed1a0 ***

Good catch, ACK.
Re: [libvirt] [Resending][PATCH v2 2/2] x86: Allow sysinfo to fall back on /proc/cpuinfo if demidecode is absent
On Thu, Mar 15, 2012 at 12:28:05PM +0530, Prerna Saxena wrote:
> On 03/15/2012 11:37 AM, Daniel Veillard wrote:
>> On Thu, Mar 15, 2012 at 11:21:09AM +0530, Prerna Saxena wrote:
>>> Hi Daniel,
>>> Thank you for taking a look.
>>> [...]
>>> Thanks for pointing this out. I investigated this discrepancy, and
>>> discovered that 'dmidecode' presents a listing of processor *cores*.
>>> However, in /proc/cpuinfo, all hardware threads in a processor show
>>> up as independent processors. So, while dmidecode correctly reads
>>> that my system has a single core, /proc/cpuinfo reports the two
>>> hardware threads in the core as two independent logical CPUs.
>>>
>>> To sort this out, one alternative would be to parse the physical_id
>>> in /proc/cpuinfo -- this would be identical for all logical
>>> processors in a given core, and thus can be used to report the
>>> number of cores in the system. Will send a modified patch asap.
>
> Hi Daniel,
> I just realised a correction to the explanation above -- dmidecode
> only gives a per-socket listing, while /proc/cpuinfo lists all the
> physical cores within a socket. So if dmidecode lists a single entry
> for a processor, it can be inferred that the machine in question has
> one socket. /proc/cpuinfo will have listings for each physical core in
> that socket, plus the hardware threads available.
>
> As mentioned earlier, I will be happy to spin a patch that uses
> 'physical_id' (constant for all cores in a socket) to provide
> socket-level information. This will attain parity with the 'dmidecode'
> output and will report *one socket* for such a machine under the
> 'processor' XML tag (which could be a little misleading).
>
> However, I am curious -- what benefit would the number of sockets be
> to a libvirt user? I expect users would mostly care about the number
> of available CPU cores to make scheduling decisions. Am I missing a
> use-case for an exclusive need of socket-level information?

The question at this point is not whether the information is better or
not; it must be the same in the fallback case.
The information won't be missing; it can be gathered via 'virsh
nodeinfo' and the equivalent API. The point of patch 2/2 is to provide
identical information in the case where dmidecode is missing.

Daniel
Re: [libvirt] [PATCH] Increased upper limit on lists of pool names
On 03/15/2012 09:42 AM, Jesse J. Cook wrote:
> 256 (8 bits) is insufficient for large scale deployments. 65536 (16
> bits) is a more appropriate limit and should be sufficient. You are
> more likely to run into other system limitations first, such as the
> 31998 inode link limit on ext3.
> ---
>  src/remote/remote_protocol.x | 2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/src/remote/remote_protocol.x b/src/remote/remote_protocol.x
> index 59774b2..58f0871 100644
> --- a/src/remote/remote_protocol.x
> +++ b/src/remote/remote_protocol.x
> @@ -103,7 +103,7 @@ const REMOTE_INTERFACE_NAME_LIST_MAX = 256;
>  const REMOTE_DEFINED_INTERFACE_NAME_LIST_MAX = 256;
>
>  /* Upper limit on lists of storage pool names. */
> -const REMOTE_STORAGE_POOL_NAME_LIST_MAX = 256;
> +const REMOTE_STORAGE_POOL_NAME_LIST_MAX = 65536;

It seems we have recurring problems with the array length limits for the
RPC calls. There is a similar problem with VOL_NAME_LIST_MAX:

https://bugzilla.redhat.com/show_bug.cgi?id=802357

Osier
[libvirt] [PATCH] virsh: A bit smarter attach-disk
Detects the file type of the source path if no --sourcetype or driver is
specified, instead of always setting the disk type as block. And since
the previous virCommandOptString ensures source is not NULL, remove the
conditional check.
---
 tools/virsh.c | 14 +++++++++-----
 1 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/tools/virsh.c b/tools/virsh.c
index d45a4c9..3b845ac 100644
--- a/tools/virsh.c
+++ b/tools/virsh.c
@@ -14428,6 +14428,7 @@ cmdAttachDisk(vshControl *ctl, const vshCmd *cmd)
     const char *stype = NULL;
     virBuffer buf = VIR_BUFFER_INITIALIZER;
     char *xml;
+    struct stat st;
 
     if (!vshConnectionUsability(ctl, ctl->conn))
         goto cleanup;
@@ -14458,8 +14459,12 @@ cmdAttachDisk(vshControl *ctl, const vshCmd *cmd)
     }
 
     if (!stype) {
-        if (driver && (STREQ(driver, "file") || STREQ(driver, "tap")))
+        if (driver && (STREQ(driver, "file") || STREQ(driver, "tap"))) {
             isFile = true;
+        } else {
+            if (!stat(source, &st))
+                isFile = S_ISREG(st.st_mode) ? true : false;
+        }
     } else if (STREQ(stype, "file")) {
         isFile = true;
     } else if (STRNEQ(stype, "block")) {
@@ -14497,10 +14502,9 @@ cmdAttachDisk(vshControl *ctl, const vshCmd *cmd)
     if (driver || subdriver || cache)
         virBufferAddLit(&buf, "/>\n");
 
-    if (source)
-        virBufferAsprintf(&buf, "<source %s='%s'/>\n",
-                          (isFile) ? "file" : "dev",
-                          source);
+    virBufferAsprintf(&buf, "<source %s='%s'/>\n",
+                      (isFile) ? "file" : "dev",
+                      source);
     virBufferAsprintf(&buf, "<target dev='%s'/>\n", target);
     if (mode)
         virBufferAsprintf(&buf, "<%s/>\n", mode);
-- 
1.7.1
Re: [libvirt] [Resending][PATCH v2 2/2] x86: Allow sysinfo to fall back on /proc/cpuinfo if demidecode is absent
On 03/15/2012 12:40 PM, Daniel Veillard wrote:
> On Thu, Mar 15, 2012 at 12:28:05PM +0530, Prerna Saxena wrote:
>> [... the earlier discussion of dmidecode vs /proc/cpuinfo and the
>> physical_id proposal is quoted unchanged from the previous messages in
>> this thread ...]
>
> The question at this point is not whether the information is better or
> not; it must be the same in the fallback case. The information won't
> be missing; it can be gathered via 'virsh nodeinfo' and the equivalent
> API. The point of patch 2/2 is to provide identical information in the
> case where dmidecode is missing.

Okay. I'll send out a v3 of my patch series incorporating the changes
asap :)

Regards,
-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India
Re: [libvirt] [Qemu-trivial] do_spice_init error on Ubuntu11.10
On Thu, Mar 08, 2012 at 11:29:48AM +0800, suyi wang wrote:
> Hi all:

Hi Suyi,
qemu-trivial is a mailing list for QEMU patches; questions should be
directed at the regular QEMU mailing list. I have CCed qemu-devel and
libvirt-list so your question will go to the QEMU and libvirt
communities.

Are you sure your libvirt configuration allows the QEMU process to
access /dev/shm? You may have SELinux enabled.

Stefan

> I tried kvm on my ubuntu with the libvirt.xml file as follows:
>
> <domain type='kvm'>
>   <name>instance-0011</name>
>   <memory>2097152</memory>
>   <os>
>     <type>hvm</type>
>     <boot dev="hd" />
>   </os>
>   <features>
>     <acpi/>
>   </features>
>   <vcpu>1</vcpu>
>   <devices>
>     <sound model='ac97'/>
>     <input type='tablet' bus='usb'/>
>     <disk type='file'>
>       <driver type='qcow2'/>
>       <source file='/opt/stack/nova/nova/..//instances/instance-0011/disk'/>
>       <target dev='vda' bus='ide'/>
>     </disk>
>     <disk type='file'>
>       <driver type='qcow2'/>
>       <source file='/opt/stack/nova/nova/..//instances/instance-0011/disk.local'/>
>       <target dev='vdb' bus='ide'/>
>     </disk>
>     <interface type='bridge'>
>       <source bridge='br100'/>
>       <mac address='02:16:3e:44:a1:dd'/>
>       <filterref filter="nova-instance-instance-0011-02163e44a1dd">
>         <parameter name="IP" value="10.0.0.2" />
>         <parameter name="DHCPSERVER" value="10.0.0.1" />
>       </filterref>
>     </interface>
>     <!-- The order is significant here. File must be defined first -->
>     <serial type="file">
>       <source path='/opt/stack/nova/nova/..//instances/instance-0011/console.log'/>
>       <target port='1'/>
>     </serial>
>     <console type='pty' tty='/dev/pts/2'>
>       <source path='/dev/pts/2'/>
>       <target port='0'/>
>     </console>
>     <serial type='pty'>
>       <source path='/dev/pts/2'/>
>       <target port='0'/>
>     </serial>
>     <graphics type='vnc' port='-1' autoport='yes' keymap='en-us' listen='0.0.0.0'/>
>   </devices>
> </domain>
>
> So it works well.
> However, I want to change the remote access method to spice, so I
> simply changed the libvirt.xml as follows:
>
> <domain type='kvm'>
>   <name>instance-0011</name>
>   [... identical to the XML above, except for the graphics element ...]
>     <graphics type='spice' port='-1' autoport='yes' keymap='en-us' listen='0.0.0.0'/>
>   </devices>
> </domain>
>
> As you can see, I just changed
>   <graphics type='vnc' port='-1' autoport='yes' keymap='en-us' listen='0.0.0.0'/>
> to
>   <graphics type='spice' port='-1' autoport='yes' keymap='en-us' listen='0.0.0.0'/>
> but it raised the following error:
>
> libvirtError: internal error Process exited while reading console log
> output: char device redirected to /dev/pts/12
> TRACE: do_spice_init: starting 0.8.1
> TRACE: do_spice_init: statistics shm_open failed, Permission denied
>
> Then I also tried `ls -ld /dev/shm`; yes, it is already 777. Then my
> problem remained:
>
> libvirtError: internal error Process exited while reading console log
> output: char device
Re: [libvirt] [PATCH] virsh: A bit smarter attach-disk
On 03/15/2012 05:12 PM, Michal Privoznik wrote:
> On 15.03.2012 10:13, Osier Yang wrote:
>> Detects the file type of the source path if no --sourcetype or driver
>> is specified, instead of always setting the disk type as block. And
>> since the previous virCommandOptString ensures source is not NULL,
>> remove the conditional check.
>> [...]
>> +        } else {
>> +            if (!stat(source, &st))
>> +                isFile = S_ISREG(st.st_mode) ? true : false;
>
> I think S_ISREG() would be sufficient here. But that's just cosmetic.
>
>> [...]
>> -    if (source)
>> -        virBufferAsprintf(&buf, "<source %s='%s'/>\n",
>> -                          (isFile) ? "file" : "dev",
>> -                          source);
>> +    virBufferAsprintf(&buf, "<source %s='%s'/>\n",
>> +                      (isFile) ? "file" : "dev",
>> +                      source);
>
> However this looks bad. As written in the comment just below
> virCommandOptString(..., &source):
>
>     /* Allow empty string as a placeholder that implies no source, for
>      * use in adding a cdrom drive with no disk. */
>     if (!*source)
>         source = NULL;

Oh, yes, I didn't notice the comments.

Osier
[libvirt] [PATCH v2] virsh: A bit smarter attach-disk
Detects the file type of the source path if no --sourcetype or driver is
specified, instead of always setting the disk type as block.
---
 tools/virsh.c | 7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/tools/virsh.c b/tools/virsh.c
index d45a4c9..19f9bbe 100644
--- a/tools/virsh.c
+++ b/tools/virsh.c
@@ -14428,6 +14428,7 @@ cmdAttachDisk(vshControl *ctl, const vshCmd *cmd)
     const char *stype = NULL;
     virBuffer buf = VIR_BUFFER_INITIALIZER;
     char *xml;
+    struct stat st;
 
     if (!vshConnectionUsability(ctl, ctl->conn))
         goto cleanup;
@@ -14458,8 +14459,12 @@ cmdAttachDisk(vshControl *ctl, const vshCmd *cmd)
     }
 
     if (!stype) {
-        if (driver && (STREQ(driver, "file") || STREQ(driver, "tap")))
+        if (driver && (STREQ(driver, "file") || STREQ(driver, "tap"))) {
             isFile = true;
+        } else {
+            if (source && !stat(source, &st))
+                isFile = S_ISREG(st.st_mode) ? true : false;
+        }
     } else if (STREQ(stype, "file")) {
         isFile = true;
     } else if (STRNEQ(stype, "block")) {
-- 
1.7.1
Re: [libvirt] [PATCH v2] virsh: A bit smarter attach-disk
On 15.03.2012 11:17, Osier Yang wrote:
> Detects the file type of the source path if no --sourcetype or driver
> is specified, instead of always setting the disk type as block.
> ---
>  tools/virsh.c | 7 ++++++-
>  1 files changed, 6 insertions(+), 1 deletions(-)

ACK

Michal
Re: [libvirt] [PATCH] Add migration APIs for libxl driver
2012/3/15 Jim Fehlig <jfeh...@suse.com>:
> While testing this patch, I noticed some strange problems wrt
> concurrent operations in the driver. E.g. if I start a migration and
> then query dominfo on the migrating domain, it kills the migration:
>
>   xen134: # virsh migrate --live sles11sp1-pv xen+ssh://xen142
>   error: internal error Failed to save domain '7' with libxenlight

Strange. I found that any operation that needs to open a connection to
the hypervisor (even `virsh version`) causes the same issue for live
migration: while such an operation calls do_open on the connection,
xc_shadow_control() (in xc_domain_save) will fail. Non-live migration
is OK.

Thanks!
Jim
Re: [libvirt] [PATCH v2] virsh: A bit smarter attach-disk
On 03/15/2012 05:32 PM, Michal Privoznik wrote:
> On 15.03.2012 11:17, Osier Yang wrote:
>> Detects the file type of the source path if no --sourcetype or driver
>> is specified, instead of always setting the disk type as block.
>> ---
>>  tools/virsh.c | 7 ++++++-
>>  1 files changed, 6 insertions(+), 1 deletions(-)
>
> ACK

Thanks, pushed.

Osier
Re: [libvirt] [PATCH] Increased upper limit on lists of pool names
On Wed, Mar 14, 2012 at 08:42:35PM -0500, Jesse J. Cook wrote:
> 256 (8 bits) is insufficient for large scale deployments. 65536 (16
> bits) is a more appropriate limit and should be sufficient. You are
> more likely to run into other system limitations first, such as the
> 31998 inode link limit on ext3.
> ---
>  src/remote/remote_protocol.x | 2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/src/remote/remote_protocol.x b/src/remote/remote_protocol.x
> index 59774b2..58f0871 100644
> --- a/src/remote/remote_protocol.x
> +++ b/src/remote/remote_protocol.x
> @@ -103,7 +103,7 @@ const REMOTE_INTERFACE_NAME_LIST_MAX = 256;
>  const REMOTE_DEFINED_INTERFACE_NAME_LIST_MAX = 256;
>
>  /* Upper limit on lists of storage pool names. */
> -const REMOTE_STORAGE_POOL_NAME_LIST_MAX = 256;
> +const REMOTE_STORAGE_POOL_NAME_LIST_MAX = 65536;
>
>  /* Upper limit on lists of storage vol names. */
>  const REMOTE_STORAGE_VOL_NAME_LIST_MAX = 1024;

We have to think about the compatibility implications of this kind of
change, e.g. what the behaviour is when an old client talks to a new
server, and vice versa. It might be fine, but I'd like someone to
enumerate the before/after behaviour in all combinations.

Regards,
Daniel

-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|
Re: [libvirt] [PATCH] virsh: A bit smarter attach-disk
On 15.03.2012 10:13, Osier Yang wrote:
> Detects the file type of the source path if no --sourcetype or driver
> is specified, instead of always setting the disk type as block. And
> since the previous virCommandOptString ensures source is not NULL,
> remove the conditional check.
> [...]
>      if (!stype) {
> -        if (driver && (STREQ(driver, "file") || STREQ(driver, "tap")))
> +        if (driver && (STREQ(driver, "file") || STREQ(driver, "tap"))) {
>              isFile = true;
> +        } else {
> +            if (!stat(source, &st))
> +                isFile = S_ISREG(st.st_mode) ? true : false;

I think S_ISREG() would be sufficient here. But that's just cosmetic.

> +        }
>      } else if (STREQ(stype, "file")) {
>          isFile = true;
>      } else if (STRNEQ(stype, "block")) {
> [...]
> -    if (source)
> -        virBufferAsprintf(&buf, "<source %s='%s'/>\n",
> -                          (isFile) ? "file" : "dev",
> -                          source);
> +    virBufferAsprintf(&buf, "<source %s='%s'/>\n",
> +                      (isFile) ? "file" : "dev",
> +                      source);

However this looks bad. As written in the comment just below
virCommandOptString(..., &source):

    /* Allow empty string as a placeholder that implies no source, for
     * use in adding a cdrom drive with no disk. */
    if (!*source)
        source = NULL;

That means an empty --source argument to attach-disk makes source NULL.
Therefore you want to check source != NULL in the first chunk too
(okay, not as strict as here, since passing NULL to stat() makes it
fail, but it's clean coding style, which matters too).

Michal
[libvirt] [PATCH v3 2/2] x86: Allow sysinfo to fall back on /proc/cpuinfo if demidecode is absent.
From: Prerna Saxena <pre...@linux.vnet.ibm.com>
Date: Thu, 16 Feb 2012 15:33:43 +0530
Subject: [PATCH 2/2] Sysinfo: Allow x86 to fetch sysinfo from /proc/cpuinfo in the event 'dmidecode' is absent in the system.

Until now, libvirt on x86 flags an error message if dmidecode is not found. With this patch, the following is a sample output on x86 when dmidecode is absent:

virsh # sysinfo
<sysinfo type='smbios'>
  <processor>
    <entry name='socket_destination'>0</entry>
    <entry name='type'>Intel(R) Xeon(R) CPU X5570 @ 2.93GHz</entry>
    <entry name='family'>6</entry>
    <entry name='manufacturer'>GenuineIntel</entry>
  </processor>
  <processor>
    <entry name='socket_destination'>1</entry>
    <entry name='type'>Intel(R) Xeon(R) CPU X5570 @ 2.93GHz</entry>
    <entry name='family'>6</entry>
    <entry name='manufacturer'>GenuineIntel</entry>
  </processor>
  ... (listing for all online CPUs)
</sysinfo>

Based on a suggestion from Eric Blake:
(http://www.redhat.com/archives/libvir-list/2012-February/msg00509.html)

Also updated to display only socket-level information under the 'processor' XML tags, to bring it analogous to 'dmidecode' output.
(Observed by Daniel Veillard: http://www.spinics.net/linux/fedora/libvir/msg53922.html)

Acked-by: Daniel P Berrange <berra...@redhat.com>
Signed-off-by: Prerna Saxena <pre...@linux.vnet.ibm.com>
---
 src/util/sysinfo.c | 110 ++--
 1 files changed, 106 insertions(+), 4 deletions(-)

diff --git a/src/util/sysinfo.c b/src/util/sysinfo.c
index 78efc32..220338f 100644
--- a/src/util/sysinfo.c
+++ b/src/util/sysinfo.c
@@ -598,6 +598,111 @@ no_memory:
     return -1;
 }

+/* If a call to 'dmidecode' fails,
+ * extract basic sysinfo from /proc/cpuinfo */
+
+static int
+virSysinfoParseCPUInfoProcessor(const char *base, virSysinfoDefPtr ret)
+{
+    const char *cur;
+    char *eol, *tmp_base, *physical_id = NULL, *tmp_physical_id;
+    int physical_id_len;
+    virSysinfoProcessorDefPtr processor;
+
+    while ((tmp_base = strstr(base, "processor")) != NULL) {
+        base = tmp_base;
+        eol = strchr(base, '\n');
+        cur = strchr(base, ':') + 1;
+
+        tmp_physical_id = strchr(strstr(base, "physical id"), ':') + 1;
+        physical_id_len = strchr(tmp_physical_id, '\n') - tmp_physical_id;
+
+        if (!physical_id) {
+            physical_id = tmp_physical_id;
+        } else if (!strncmp(physical_id, tmp_physical_id, physical_id_len)) {
+            base = cur;
+            continue;
+        }
+
+        if (VIR_EXPAND_N(ret->processor, ret->nprocessor, 1) < 0) {
+            goto no_memory;
+        }
+
+        processor = &ret->processor[ret->nprocessor - 1];
+        virSkipSpaces(&cur);
+        if (eol &&
+            (processor->processor_socket_destination =
+                 strndup(cur, eol - cur)) == NULL)
+            goto no_memory;
+
+        if ((cur = strstr(base, "vendor_id")) != NULL) {
+            cur = strchr(cur, ':') + 1;
+            eol = strchr(cur, '\n');
+            virSkipSpaces(&cur);
+            if (eol &&
+                ((processor->processor_manufacturer =
+                      strndup(cur, eol - cur)) == NULL))
+                goto no_memory;
+        }
+
+        if ((cur = strstr(base, "cpu family")) != NULL) {
+            cur = strchr(cur, ':') + 1;
+            eol = strchr(cur, '\n');
+            virSkipSpaces(&cur);
+            if (eol &&
+                ((processor->processor_family = strndup(cur, eol - cur))
+                     == NULL))
+                goto no_memory;
+        }
+
+        if ((cur = strstr(base, "model name")) != NULL) {
+            cur = strchr(cur, ':') + 1;
+            eol = strchr(cur, '\n');
+            virSkipSpaces(&cur);
+            if (eol &&
+                ((processor->processor_type = strndup(cur, eol - cur))
+                     == NULL))
+                goto no_memory;
+        }
+
+        base = cur;
+        physical_id = tmp_physical_id;
+    }
+
+    return 0;
+
+no_memory:
+    return -1;
+}
+
+static virSysinfoDefPtr
+virCPUInfoSysinfoRead(void) {
+    virSysinfoDefPtr ret = NULL;
+    char *outbuf = NULL;
+
+    if (VIR_ALLOC(ret) < 0)
+        goto no_memory;
+
+    if (virFileReadAll(CPUINFO, 2, &outbuf) < 0) {
+        virSmbiosReportError(VIR_ERR_INTERNAL_ERROR,
+                             _("Failed to open %s"), CPUINFO);
+        return NULL;
+    }
+
+    ret->nprocessor = 0;
+    ret->processor = NULL;
+    if (virSysinfoParseCPUInfoProcessor(outbuf, ret) < 0)
+        goto no_memory;
+
+    return ret;
+
+no_memory:
+    VIR_FREE(outbuf);
+    return NULL;
+}
+
 virSysinfoDefPtr
 virSysinfoRead(void) {
     char *path;
@@ -607,10 +712,7 @@ virSysinfoRead(void) {

     path = virFindFileInPath(SYSINFO_SMBIOS_DECODER);
     if (path == NULL) {
-        virSmbiosReportError(VIR_ERR_INTERNAL_ERROR,
-                             _("Failed to find path for %s binary"),
-
[libvirt] [PATCH] qemuDomainDetachPciDiskDevice: Free allocated cgroup
This function potentially allocates a new virCgroup but never frees it.
---
 src/qemu/qemu_hotplug.c | 1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c
index e088a49..f1c1283 100644
--- a/src/qemu/qemu_hotplug.c
+++ b/src/qemu/qemu_hotplug.c
@@ -1621,6 +1621,7 @@ int qemuDomainDetachPciDiskDevice(struct qemud_driver *driver,
     ret = 0;

 cleanup:
+    virCgroupFree(&cgroup);
     VIR_FREE(drivestr);
     return ret;
 }
--
1.7.8.5
[libvirt] [PATCH] hold the reference when calling the monitor's callback
When qemu cannot start up, we call the EOF notify callback. Unfortunately, there is a bug in libvirt that causes mon->ref to be 0 while we call the EOF notify callback, and this can deadlock libvirt. We call the other callbacks while holding a reference, so I think we should also hold the reference here.

Note: this patch does not fix the underlying bug. Libvirt will deadlock in qemuMonitorUnwatch() because the monitor is freed unexpectedly. I am still investigating this. If I set a breakpoint in qemuMonitorUnref(), the deadlock does not happen; if I remove the breakpoint, it happens sometimes (the probability is about 20%).

The steps to reproduce this bug:
1. use the newest qemu-kvm
2. add a host PCI device to the guest config file
3. start the guest

qemu will fail with the following error message:
Failed to assign irq for "hostdev0": Input/output error
Perhaps you are assigning a device that shares an IRQ with another device?

I have reported qemu's problem to the qemu mailing list.
---
 src/qemu/qemu_monitor.c | 16 
 1 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c
index 78eb492..20eb9ea 100644
--- a/src/qemu/qemu_monitor.c
+++ b/src/qemu/qemu_monitor.c
@@ -668,10 +668,14 @@ qemuMonitorIO(int watch, int fd, int events, void *opaque) {

         /* Make sure anyone waiting wakes up now */
         virCondSignal(&mon->notify);
-        if (qemuMonitorUnref(mon) > 0)
-            qemuMonitorUnlock(mon);
+
+        /* hold the reference when calling the monitor's callback to
+         * avoid deadlock */
+        qemuMonitorUnlock(mon);
         VIR_DEBUG("Triggering EOF callback");
         (eofNotify)(mon, vm);
+        qemuMonitorLock(mon);
+        if (qemuMonitorUnref(mon) > 0)
+            qemuMonitorUnlock(mon);
     } else if (error) {
         void (*errorNotify)(qemuMonitorPtr, virDomainObjPtr)
             = mon->cb->errorNotify;
@@ -679,10 +683,14 @@ qemuMonitorIO(int watch, int fd, int events, void *opaque) {

         /* Make sure anyone waiting wakes up now */
         virCondSignal(&mon->notify);
-        if (qemuMonitorUnref(mon) > 0)
-            qemuMonitorUnlock(mon);
+
+        /* hold the reference when calling the monitor's callback to
+         * avoid deadlock */
+        qemuMonitorUnlock(mon);
         VIR_DEBUG("Triggering error callback");
         (errorNotify)(mon, vm);
+        qemuMonitorLock(mon);
+        if (qemuMonitorUnref(mon) > 0)
+            qemuMonitorUnlock(mon);
     } else {
         if (qemuMonitorUnref(mon) > 0)
             qemuMonitorUnlock(mon);
--
1.7.1
[libvirt] [PATCH v3 0/3] Sysinfo improvements for PowerPC & x86
This series of patches implements the following:

Patch 1/2: Implement sysinfo for PowerPC. On PowerPC, sysinfo is obtained by reading /proc/cpuinfo, since dmidecode is not available.

Patch 2/2: Based on Eric's suggestion (http://www.redhat.com/archives/libvir-list/2012-February/msg00509.html), allow x86 to read the host system's processor information from /proc/cpuinfo in the event dmidecode is not available.

Changelog:
---
v1->v2:
* Rebased and cleanups done.
v2->v3:
* Patch 1/2: rebased.
* Patch 2/2: As reported by Daniel, it was found that 'dmidecode' lists socket (and not core-level) information under the processor XML tag. The /proc/cpuinfo fallback is accordingly modified to display a listing of the available sockets under the 'processor' XML tag under sysinfo. (http://www.spinics.net/linux/fedora/libvir/msg53922.html)

Thanks,
--
Prerna Saxena
Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India
[libvirt] [PATCH v3 1/2] PowerPC : Implement sysinfo on PowerPC.
From: Prerna Saxena <pre...@linux.vnet.ibm.com>
Date: Thu, 15 Mar 2012 16:05:26 +0530
Subject: [PATCH 1/2] Implement sysinfo on PowerPC.

Libvirt on x86 parses 'dmidecode' to gather characteristics of the host system. On PowerPC, this is now implemented by reading /proc/cpuinfo.
NOTE: memory-DIMM information is not presently implemented.

Acked-by: Daniel Veillard <veill...@redhat.com>
Acked-by: Daniel P Berrange <berra...@redhat.com>
Signed-off-by: Prerna Saxena <pre...@linux.vnet.ibm.com>
---
 src/util/sysinfo.c | 133 +++-
 1 files changed, 131 insertions(+), 2 deletions(-)

diff --git a/src/util/sysinfo.c b/src/util/sysinfo.c
index 39c7581..78efc32 100644
--- a/src/util/sysinfo.c
+++ b/src/util/sysinfo.c
@@ -44,6 +44,7 @@
                          __FUNCTION__, __LINE__, __VA_ARGS__)

 #define SYSINFO_SMBIOS_DECODER "dmidecode"
+#define CPUINFO "/proc/cpuinfo"

 VIR_ENUM_IMPL(virSysinfo, VIR_SYSINFO_LAST, "smbios");

@@ -113,10 +114,138 @@ void virSysinfoDefFree(virSysinfoDefPtr def)
  *
  * Returns: a filled up sysinfo structure or NULL in case of error
  */
-#if defined(WIN32) || \
+
+#if defined(__powerpc__)
+static int
+virSysinfoParseSystem(const char *base, virSysinfoDefPtr ret)
+{
+    char *eol = NULL;
+    const char *cur;
+
+    if ((cur = strstr(base, "platform")) == NULL)
+        return 0;
+
+    base = cur;
+    /* Account for format 'platform    : ' */
+    cur = strchr(cur, ':') + 1;
+    eol = strchr(cur, '\n');
+    virSkipSpaces(&cur);
+    if (eol &&
+        ((ret->system_family = strndup(cur, eol - cur)) == NULL))
+        goto no_memory;
+
+    if ((cur = strstr(base, "model")) != NULL) {
+        cur = strchr(cur, ':') + 1;
+        eol = strchr(cur, '\n');
+        virSkipSpaces(&cur);
+        if (eol && ((ret->system_serial = strndup(cur, eol - cur))
+                        == NULL))
+            goto no_memory;
+    }
+
+    if ((cur = strstr(base, "machine")) != NULL) {
+        cur = strchr(cur, ':') + 1;
+        eol = strchr(cur, '\n');
+        virSkipSpaces(&cur);
+        if (eol && ((ret->system_version = strndup(cur, eol - cur))
+                        == NULL))
+            goto no_memory;
+    }
+
+    return 0;
+
+no_memory:
+    return -1;
+}
+
+static int
+virSysinfoParseProcessor(const char *base, virSysinfoDefPtr ret)
+{
+    const char *cur;
+    char *eol, *tmp_base;
+    virSysinfoProcessorDefPtr processor;
+
+    while ((tmp_base = strstr(base, "processor")) != NULL) {
+        base = tmp_base;
+        eol = strchr(base, '\n');
+        cur = strchr(base, ':') + 1;
+
+        if (VIR_EXPAND_N(ret->processor, ret->nprocessor, 1) < 0) {
+            goto no_memory;
+        }
+        processor = &ret->processor[ret->nprocessor - 1];
+
+        virSkipSpaces(&cur);
+        if (eol &&
+            ((processor->processor_socket_destination =
+                  strndup(cur, eol - cur)) == NULL))
+            goto no_memory;
+
+        if ((cur = strstr(base, "cpu")) != NULL) {
+            cur = strchr(cur, ':') + 1;
+            eol = strchr(cur, '\n');
+            virSkipSpaces(&cur);
+            if (eol &&
+                ((processor->processor_type = strndup(cur, eol - cur))
+                     == NULL))
+                goto no_memory;
+        }
+
+        if ((cur = strstr(base, "revision")) != NULL) {
+            cur = strchr(cur, ':') + 1;
+            eol = strchr(cur, '\n');
+            virSkipSpaces(&cur);
+            if (eol &&
+                ((processor->processor_version = strndup(cur, eol - cur))
+                     == NULL))
+                goto no_memory;
+        }
+
+        base = cur;
+    }
+
+    return 0;
+
+no_memory:
+    return -1;
+}
+
+/* virSysinfoRead for PowerPC
+ * Gathers sysinfo data from /proc/cpuinfo */
+virSysinfoDefPtr
+virSysinfoRead(void) {
+    virSysinfoDefPtr ret = NULL;
+    char *outbuf = NULL;
+
+    if (VIR_ALLOC(ret) < 0)
+        goto no_memory;
+
+    if (virFileReadAll(CPUINFO, 2048, &outbuf) < 0) {
+        virSmbiosReportError(VIR_ERR_INTERNAL_ERROR,
+                             _("Failed to open %s"), CPUINFO);
+        return NULL;
+    }
+
+    ret->nprocessor = 0;
+    ret->processor = NULL;
+    if (virSysinfoParseProcessor(outbuf, ret) < 0)
+        goto no_memory;
+
+    if (virSysinfoParseSystem(outbuf, ret) < 0)
+        goto no_memory;
+
+    return ret;
+
+no_memory:
+    VIR_FREE(outbuf);
+    return NULL;
+}
+
+#elif defined(WIN32) || \
     !(defined(__x86_64__) || \
       defined(__i386__) || \
-      defined(__amd64__))
+      defined(__amd64__) || \
+      defined(__powerpc__))
 virSysinfoDefPtr
 virSysinfoRead(void) {
     /*
--
1.7.7.6

--
Prerna Saxena
Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India
[libvirt] [PATCH] numad: Fix typo and warning
src/libvirt_private.syms: s/virDomainCpuPlacement/virDomainCpuPlacementMode/
src/qemu/qemu_process.c: def->mem.cur_balloon expects %llu

Pushed under the build-breaker rule.
---
 src/libvirt_private.syms | 4 ++--
 src/qemu/qemu_process.c  | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index d58082a..e40c80e 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -271,8 +271,8 @@ virDomainControllerModelSCSITypeToString;
 virDomainControllerModelUSBTypeFromString;
 virDomainControllerModelUSBTypeToString;
 virDomainControllerTypeToString;
-virDomainCpuPlacementTypeFromString;
-virDomainCpuPlacementTypeToString;
+virDomainCpuPlacementModeTypeFromString;
+virDomainCpuPlacementModeTypeToString;
 virDomainCpuSetFormat;
 virDomainCpuSetParse;
 virDomainDefAddImplicitControllers;
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 56cb531..0af3751 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -1641,7 +1641,7 @@ qemuGetNumadAdvice(virDomainDefPtr def)
     char *args = NULL;
     char *output = NULL;

-    if (virAsprintf(&args, "%d:%lu", def->vcpus, def->mem.cur_balloon) < 0) {
+    if (virAsprintf(&args, "%d:%llu", def->vcpus, def->mem.cur_balloon) < 0) {
         virReportOOMError();
         goto out;
     }
--
1.7.1
Re: [libvirt] [PATCH] Increased upper limit on lists of pool names
On Thu, Mar 15, 2012 at 06:22:38AM -0500, Jesse Cook wrote:
> Just to clarify, you would like to see:
>
> v0.9.10 pre-patch client talk to v0.9.10 patch server
> v0.9.10 patch client talk to v0.9.10 pre-patch server

And for comparison:

  v0.9.10 pre-patch client talk to v0.9.10 pre-patch server

Obviously for "v0.9.10 patch client talk to v0.9.10 patch server" I'd expect "just works" :-)

> Would the test code I used in my cover letter be sufficient? If so, I
> could probably test this fairly easily and quickly today.

Yep.

Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
Re: [libvirt] [PATCH] qemuDomainDetachPciDiskDevice: Free allocated cgroup
On 03/15/2012 11:49 AM, Michal Privoznik wrote:
> This function potentially allocates a new virCgroup but never frees it.
> ---

ACK.

Peter
Re: [libvirt] [PATCH] Increased upper limit on lists of pool names
On Thu, Mar 15, 2012 at 3:42 AM, Osier Yang <jy...@redhat.com> wrote:
> On 03/15/2012 09:42 AM, Jesse J. Cook wrote:
>> 256 (8 bits) is insufficient for large scale deployments. 65536 (16 bits)
>> is a more appropriate limit and should be sufficient. You are more likely
>> to run into other system limitations first, such as the 31998 inode link
>> limit on ext3.
>> ---
>>  src/remote/remote_protocol.x | 2 +-
>>  1 files changed, 1 insertions(+), 1 deletions(-)
>>
>> diff --git a/src/remote/remote_protocol.x b/src/remote/remote_protocol.x
>> index 59774b2..58f0871 100644
>> --- a/src/remote/remote_protocol.x
>> +++ b/src/remote/remote_protocol.x
>> @@ -103,7 +103,7 @@ const REMOTE_INTERFACE_NAME_LIST_MAX = 256;
>>  const REMOTE_DEFINED_INTERFACE_NAME_LIST_MAX = 256;
>>
>>  /* Upper limit on lists of storage pool names. */
>> -const REMOTE_STORAGE_POOL_NAME_LIST_MAX = 256;
>> +const REMOTE_STORAGE_POOL_NAME_LIST_MAX = 65536;
>
> It seems we have a broader problem with array lengths for the RPC calls.
> A similar problem exists with VOL_NAME_LIST_MAX:
> https://bugzilla.redhat.com/show_bug.cgi?id=802357
>
> Osier

I will need this change as well. I can patch and test.

--
Jesse
Re: [libvirt] [PATCH] Increased upper limit on lists of pool names
On Thu, Mar 15, 2012 at 5:14 AM, Daniel P. Berrange <berra...@redhat.com> wrote:
> On Wed, Mar 14, 2012 at 08:42:35PM -0500, Jesse J. Cook wrote:
>> 256 (8 bits) is insufficient for large scale deployments. 65536 (16 bits)
>> is a more appropriate limit and should be sufficient. You are more likely
>> to run into other system limitations first, such as the 31998 inode link
>> limit on ext3.
>> ---
>>  src/remote/remote_protocol.x | 2 +-
>>  1 files changed, 1 insertions(+), 1 deletions(-)
>>
>> diff --git a/src/remote/remote_protocol.x b/src/remote/remote_protocol.x
>> index 59774b2..58f0871 100644
>> --- a/src/remote/remote_protocol.x
>> +++ b/src/remote/remote_protocol.x
>> @@ -103,7 +103,7 @@ const REMOTE_INTERFACE_NAME_LIST_MAX = 256;
>>  const REMOTE_DEFINED_INTERFACE_NAME_LIST_MAX = 256;
>>
>>  /* Upper limit on lists of storage pool names. */
>> -const REMOTE_STORAGE_POOL_NAME_LIST_MAX = 256;
>> +const REMOTE_STORAGE_POOL_NAME_LIST_MAX = 65536;
>>
>>  /* Upper limit on lists of storage vol names. */
>>  const REMOTE_STORAGE_VOL_NAME_LIST_MAX = 1024;
>
> We have to think about what the compatibility implications are for this
> kind of change, e.g. what the behaviour is when an old client talks to a
> new server, and vice versa. It might be fine, but I'd like someone to
> enumerate the before & after behaviour in all combinations.
>
> Regards,
> Daniel

Sorry, I accidentally top-posted my reply originally. Just to clarify, you would like to see:

v0.9.10 pre-patch client talk to v0.9.10 patch server
v0.9.10 patch client talk to v0.9.10 pre-patch server

Would the test code I used in my cover letter be sufficient? If so, I could probably test this fairly easily and quickly today.

--
Jesse
[libvirt] [PATCH] qemu: Split scsi-disk into scsi-hd and scsi-cd
A scsi-disk device can be either a hard disk or a CD-ROM: if there is ",media=cdrom" specified for the backend, it's a CD-ROM, otherwise it's a hard disk.

But upstream qemu split scsi-disk into scsi-hd and scsi-cd as of commit b443ae, and ",media=cdrom" is no longer required for scsi-cd. scsi-disk is still supported for backwards compatibility, but no doubt we should go forward.
---
 src/qemu/qemu_capabilities.c                       |  3 +
 src/qemu/qemu_capabilities.h                       |  1 +
 src/qemu/qemu_command.c                            | 39 ++
 tests/qemuhelptest.c                               |  6 ++-
 .../qemuxml2argv-disk-scsi-disk-split.args         | 16 +++
 .../qemuxml2argv-disk-scsi-disk-split.xml          | 44 
 tests/qemuxml2argvtest.c                           |  3 +
 7 files changed, 102 insertions(+), 10 deletions(-)
 create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-disk-scsi-disk-split.args
 create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-disk-scsi-disk-split.xml

diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index ace5011..c4b04ae 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -155,6 +155,7 @@ VIR_ENUM_IMPL(qemuCaps, QEMU_CAPS_LAST,
               "system_wakeup",
               "scsi-disk.channel",
               "scsi-block",
+              "scsi-cd",
     );

 struct qemu_feature_flags {
@@ -1447,6 +1448,8 @@ qemuCapsParseDeviceStr(const char *str, virBitmapPtr flags)
         qemuCapsSet(flags, QEMU_CAPS_SCSI_DISK_CHANNEL);
     if (strstr(str, "scsi-block"))
         qemuCapsSet(flags, QEMU_CAPS_SCSI_BLOCK);
+    if (strstr(str, "scsi-cd"))
+        qemuCapsSet(flags, QEMU_CAPS_SCSI_CD);

     return 0;
 }
diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h
index 62b4270..5ebffdc 100644
--- a/src/qemu/qemu_capabilities.h
+++ b/src/qemu/qemu_capabilities.h
@@ -123,6 +123,7 @@ enum qemuCapsFlags {
     QEMU_CAPS_WAKEUP            = 86, /* system_wakeup monitor command */
     QEMU_CAPS_SCSI_DISK_CHANNEL = 87, /* Is scsi-disk.channel available? */
     QEMU_CAPS_SCSI_BLOCK        = 88, /* -device scsi-block */
+    QEMU_CAPS_SCSI_CD           = 89, /* -device scsi-cd */

     QEMU_CAPS_LAST,                   /* this must always be the last item */
 };
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index 3b9eec8..afe5db8 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -1967,8 +1967,14 @@ qemuBuildDriveStr(virConnectPtr conn ATTRIBUTE_UNUSED,
     else
         virBufferAsprintf(&opt, "if=%s", bus);

-    if (disk->device == VIR_DOMAIN_DISK_DEVICE_CDROM)
-        virBufferAddLit(&opt, ",media=cdrom");
+    if (disk->device == VIR_DOMAIN_DISK_DEVICE_CDROM) {
+        if ((disk->bus == VIR_DOMAIN_DISK_BUS_SCSI)) {
+            if (!qemuCapsGet(qemuCaps, QEMU_CAPS_SCSI_CD))
+                virBufferAddLit(&opt, ",media=cdrom");
+        } else {
+            virBufferAddLit(&opt, ",media=cdrom");
+        }
+    }

     if (qemuCapsGet(qemuCaps, QEMU_CAPS_DEVICE)) {
         virBufferAsprintf(&opt, ",id=%s%s", QEMU_DRIVE_HOST_PREFIX,
                           disk->info.alias);
@@ -2218,10 +2224,19 @@ qemuBuildDriveDevStr(virDomainDefPtr def,
             goto error;
         }

-        if (disk->device == VIR_DOMAIN_DISK_DEVICE_LUN)
+        if (disk->device == VIR_DOMAIN_DISK_DEVICE_LUN) {
             virBufferAddLit(&opt, "scsi-block");
-        else
-            virBufferAddLit(&opt, "scsi-disk");
+        } else {
+            if (qemuCapsGet(qemuCaps, QEMU_CAPS_SCSI_CD)) {
+                if (disk->device == VIR_DOMAIN_DISK_DEVICE_CDROM)
+                    virBufferAddLit(&opt, "scsi-cd");
+                else
+                    virBufferAddLit(&opt, "scsi-hd");
+            } else {
+                virBufferAddLit(&opt, "scsi-disk");
+            }
+        }

         virBufferAsprintf(&opt, ",bus=scsi%d.%d,scsi-id=%d",
                           disk->info.addr.drive.controller,
                           disk->info.addr.drive.bus,
@@ -2244,10 +2259,18 @@ qemuBuildDriveDevStr(virDomainDefPtr def,
             }
         }

-        if (disk->device != VIR_DOMAIN_DISK_DEVICE_LUN)
-            virBufferAddLit(&opt, "scsi-disk");
-        else
+        if (disk->device != VIR_DOMAIN_DISK_DEVICE_LUN) {
+            if (qemuCapsGet(qemuCaps, QEMU_CAPS_SCSI_CD)) {
+                if (disk->device == VIR_DOMAIN_DISK_DEVICE_CDROM)
+                    virBufferAddLit(&opt, "scsi-cd");
+                else
+                    virBufferAddLit(&opt, "scsi-hd");
+            } else {
+                virBufferAddLit(&opt, "scsi-disk");
+            }
+        } else {
             virBufferAddLit(&opt, "scsi-block");
+        }

         virBufferAsprintf(&opt, ",bus=scsi%d.0,channel=%d,scsi-id=%d,lun=%d,"
[libvirt] [PATCH] qemu: Split ide-drive into ide-cd and ide-hd
An ide-drive device can be either a hard disk or a CD-ROM: if there is ",media=cdrom" specified for the backend, it's a CD-ROM, otherwise it's a hard disk.

Upstream qemu split ide-drive into ide-hd and ide-cd as of commit 1f56e32, and ",media=cdrom" is no longer required for ide-cd. ide-drive is still supported for backwards compatibility, but no doubt we should go forward.
---
 src/qemu/qemu_capabilities.c                       |  4 ++
 src/qemu/qemu_capabilities.h                       |  1 +
 src/qemu/qemu_command.c                            | 26 +++
 tests/qemuhelptest.c                               |  3 +-
 .../qemuxml2argv-disk-ide-drive-split.args         |  8 +
 .../qemuxml2argv-disk-ide-drive-split.xml          | 32 
 tests/qemuxml2argvtest.c                           |  3 ++
 7 files changed, 74 insertions(+), 3 deletions(-)
 create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-disk-ide-drive-split.args
 create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-disk-ide-drive-split.xml

diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index c4b04ae..1f999a7 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -156,6 +156,8 @@ VIR_ENUM_IMPL(qemuCaps, QEMU_CAPS_LAST,
               "scsi-disk.channel",
               "scsi-block",
               "scsi-cd",
+
+              "ide-cd", /* 90 */
     );

 struct qemu_feature_flags {
@@ -1450,6 +1452,8 @@ qemuCapsParseDeviceStr(const char *str, virBitmapPtr flags)
         qemuCapsSet(flags, QEMU_CAPS_SCSI_BLOCK);
     if (strstr(str, "scsi-cd"))
         qemuCapsSet(flags, QEMU_CAPS_SCSI_CD);
+    if (strstr(str, "ide-cd"))
+        qemuCapsSet(flags, QEMU_CAPS_IDE_CD);

     return 0;
 }
diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h
index 5ebffdc..e664f29 100644
--- a/src/qemu/qemu_capabilities.h
+++ b/src/qemu/qemu_capabilities.h
@@ -124,6 +124,7 @@ enum qemuCapsFlags {
     QEMU_CAPS_SCSI_DISK_CHANNEL = 87, /* Is scsi-disk.channel available? */
     QEMU_CAPS_SCSI_BLOCK        = 88, /* -device scsi-block */
     QEMU_CAPS_SCSI_CD           = 89, /* -device scsi-cd */
+    QEMU_CAPS_IDE_CD            = 90, /* -device ide-cd */

     QEMU_CAPS_LAST,                   /* this must always be the last item */
 };
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index afe5db8..9a02a80 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -1971,6 +1971,9 @@
         if ((disk->bus == VIR_DOMAIN_DISK_BUS_SCSI)) {
             if (!qemuCapsGet(qemuCaps, QEMU_CAPS_SCSI_CD))
                 virBufferAddLit(&opt, ",media=cdrom");
+        } else if (disk->bus == VIR_DOMAIN_DISK_BUS_IDE) {
+            if (!qemuCapsGet(qemuCaps, QEMU_CAPS_IDE_CD))
+                virBufferAddLit(&opt, ",media=cdrom");
         } else {
             virBufferAddLit(&opt, ",media=cdrom");
         }
@@ -2194,7 +2197,16 @@ qemuBuildDriveDevStr(virDomainDefPtr def,
                             _("target must be 0 for ide controller"));
             goto error;
         }
-        virBufferAddLit(&opt, "ide-drive");
+
+        if (qemuCapsGet(qemuCaps, QEMU_CAPS_IDE_CD)) {
+            if (disk->device == VIR_DOMAIN_DISK_DEVICE_CDROM)
+                virBufferAddLit(&opt, "ide-cd");
+            else
+                virBufferAddLit(&opt, "ide-hd");
+        } else {
+            virBufferAddLit(&opt, "ide-drive");
+        }
+
         virBufferAsprintf(&opt, ",bus=ide.%d,unit=%d",
                           disk->info.addr.drive.bus,
                           disk->info.addr.drive.unit);
@@ -2285,12 +2297,22 @@ qemuBuildDriveDevStr(virDomainDefPtr def,
                             _("bus must be 0 for ide controller"));
             goto error;
         }

         if (disk->info.addr.drive.target != 0) {
             qemuReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
                             _("target must be 0 for ide controller"));
             goto error;
         }
-        virBufferAddLit(&opt, "ide-drive");
+
+        if (qemuCapsGet(qemuCaps, QEMU_CAPS_IDE_CD)) {
+            if (disk->device == VIR_DOMAIN_DISK_DEVICE_CDROM)
+                virBufferAddLit(&opt, "ide-cd");
+            else
+                virBufferAddLit(&opt, "ide-hd");
+        } else {
+            virBufferAddLit(&opt, "ide-drive");
+        }
+
         virBufferAsprintf(&opt, ",bus=ahci%d.%d",
                           disk->info.addr.drive.controller,
                           disk->info.addr.drive.unit);
diff --git a/tests/qemuhelptest.c b/tests/qemuhelptest.c
index a01c389..d23b35a 100644
--- a/tests/qemuhelptest.c
+++ b/tests/qemuhelptest.c
@@ -676,7 +676,8 @@ mymain(void)
                             QEMU_CAPS_CPU_HOST,
                             QEMU_CAPS_FSDEV_WRITEOUT,
                             QEMU_CAPS_SCSI_BLOCK,
-                            QEMU_CAPS_SCSI_CD);
+                            QEMU_CAPS_SCSI_CD,
+                            QEMU_CAPS_IDE_CD);

     return ret == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
 }
Re: [libvirt] [PATCH] Increased upper limit on lists of pool names
On 03/14/2012 07:42 PM, Jesse J. Cook wrote:
> 256 (8 bits) is insufficient for large scale deployments. 65536 (16 bits)
> is a more appropriate limit and should be sufficient. You are more likely
> to run into other system limitations first, such as the 31998 inode link
> limit on ext3.
> ---
>  src/remote/remote_protocol.x | 2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/src/remote/remote_protocol.x b/src/remote/remote_protocol.x
> index 59774b2..58f0871 100644
> --- a/src/remote/remote_protocol.x
> +++ b/src/remote/remote_protocol.x
> @@ -103,7 +103,7 @@ const REMOTE_INTERFACE_NAME_LIST_MAX = 256;
>  const REMOTE_DEFINED_INTERFACE_NAME_LIST_MAX = 256;
>
>  /* Upper limit on lists of storage pool names. */
> -const REMOTE_STORAGE_POOL_NAME_LIST_MAX = 256;
> +const REMOTE_STORAGE_POOL_NAME_LIST_MAX = 65536;

While I agree that it would be nice to avoid capping the number of pools that can actually exist, I have to say NACK to the approach in this patch.

The reason we originally chose a limit of 256 is because each storage pool name is an arbitrary-length string, and when you pile multiple strings in a row, you can generate a rather lengthy RPC message for which the receiver has to pre-allocate memory in order to receive correctly. See this comment:

    /* Maximum length of a memory peek buffer message.
     * Note applications need to be aware of this limit and issue multiple
     * requests for large amounts of data. */
    const REMOTE_DOMAIN_MEMORY_PEEK_BUFFER_MAX = 65536;

That is, you cannot send any RPC message carrying more than 64k of data. With a cap of 256 pool names, each pool name can be (on average) about 256 bytes before you hit the RPC cap. Bumping REMOTE_STORAGE_POOL_NAME_LIST_MAX to 65536 is pointless; each remote_nonnull_string already occupies several bytes towards the 64k cap. I could perhaps see bumping things to 512, which lowers your average pool name to about 128 bytes before you hit the RPC cap, or even a cap of 1024 names averaging 64 bytes each.

But the fact remains that you will hit a hard wall on RPC sizing, with diminishing returns if you change the pool name list maximum, which means that the _real_ solution to this has to come at a more fundamental level. We need a new API which lets you specify a starting point and range, in order to query only part of the pool list. For example, see how the recent virDomainGetCPUStats has start_cpu and ncpus parameters, as well as a cap of only 128 cpus per RPC call. If you have a beefy machine with 256 cpus, you have to make two separate RPC calls in order to collect all of the statistics.

In other words, I think we need:

    virConnectListStoragePoolsRange(virConnectPtr conn,
                                    char **const names,
                                    int start,
                                    int maxnames,
                                    unsigned int flags);

where flags controls whether we are listing active or defined pools (virConnectListStoragePools and virConnectListDefinedStoragePools were added back in the days before we realized how useful a flags argument would be), where 'maxnames' can be at most 256 (per the RPC limits), and where 'start' lets you choose the point in the array at which to start your enumeration, so that multiple calls let you see the entire array.

It may also be worth considering the addition of _just_ a new ranged RPC call, but no new public API, by making the public API smart enough to automatically call the ranged RPC multiple times to build up a single return to the user. After all, if we exposed the ranged call to the user, then each client would have to add the logic to make the multiple calls; whereas our current limitation is _only_ in the RPC aspect (once the data is across the RPC and into the target process' memory space, there's no longer a 64k cap to worry about). The same need for a new ranged API or RPC applies to any situation where the current RPC limits are starting to feel too small compared to the amount of data available to transmit in a single call.

There's also an issue of concurrency: a single API call sees consistent data, but back-to-back calls over subsets of the array could race with another client adding or removing a pool in the meantime. I'm not sure what can be done to prevent issues on that front; you already have TOCTTOU races between grabbing the list of pools and acting on that list, even without a ranged command. Maybe we could invent a new API for batching several commands into one transaction, with multiple RPC calls but a guarantee that no RPCs from any other connection alter the state between the batched calls; but then we get into issues with what happens if a network connection goes down in the middle of operating on a batch. So it may be something that we declare not worth trying to solve.

--
Eric Blake   ebl...@redhat.com   +1-919-301-3266
Libvirt virtualization library http://libvirt.org
Re: [libvirt] [PATCH 1/1] conf, util: on restart of libvirt restart vepa callbacks
On 03/14/2012 05:00 AM, D. Herrendoerfer wrote: From: D. Herrendoerfer d.herrendoer...@herrendoerfer.name When libvirtd is restarted, also restart the netlink event message callbacks for existing VEPA connections and send a message to lldpad for these existing links, so it learns the new libvirtd pid. Note: This Patch provides fixes to a number of scenarios where lldpad, switch, and libvirt would loose syncronization. 1. Crash/Reboot of a switch. 2. lldpad crash. 3. Clearing of the local switch database. (cleanvms) 4. libvirtd crash or restart. In any of the above events restarting libvirtd will restore normal operations of the VMs on their virtual network. Signed-off-by: D. Herrendoerfer d.herrendoer...@herrendoerfer.name --- src/conf/domain_conf.c | 18 ++ src/util/virnetdevmacvlan.c | 128 +- src/util/virnetdevmacvlan.h |9 +++ 3 files changed, 127 insertions(+), 28 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index b994718..2ea3ea7 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -52,6 +52,7 @@ #include secret_conf.h #include netdev_vport_profile_conf.h #include netdev_bandwidth_conf.h +#include virnetdevmacvlan.h #define VIR_FROM_THIS VIR_FROM_DOMAIN @@ -4545,6 +4546,23 @@ virDomainNetDefParseXML(virCapsPtr caps, if ((flags VIR_DOMAIN_XML_INACTIVE)) VIR_FREE(ifname); +else { +char *tmp; +VIR_DEBUG(Device already active on %s in VEPA mode.,ifname); + +tmp = virXPathString(string(../../uuid[1]), ctxt); + +if (tmp) { +VIR_DEBUG(Active VM UUID: %s. Restarting association and callback handler.,tmp); +ignore_value(virNetDevMacVLanRestartWithVPortProfile(ifname, + def-mac, + def-data.direct.linkdev, + tmp, + def-data.direct.virtPortProfile, + VIR_NETDEV_VPORT_PROFILE_OP_CREATE)); +VIR_FREE(tmp); +} +} This is not the correct place to be taking action on the results of a parse - the parse functions should just parse the xml into objects, and not make assumptions about what will be done with them. 
Instead you need to add code to each hypervisor driver's xxxReload() function just after it has called virDomainLoadAllConfigs to reload all the running or status configs. For example, for the qemu driver, you could add something like the following to qemudStartup() just after the call to qemuProcessReconnectAll() (around src/qemu/qemu_driver.c:679): virHashForEach(qemu_driver-domains.objs, qemuDomainNetsRestart, NULL); (and somewhere above that:) static void qemuDomainNetsReload(void *payload, const void *name ATTRIBUTE_UNUSED, void *data) { int i; virDomainObjPtr vm = (virDomainObjPtr)payload; virDomainDefPtr def = vm-def; virDomainObjLock(vm); for (i = 0; i def-nnets; i++) { virDomainNetDefPtr net = def-nets[i]; if (this net needs action taken) { take action } } cleanup: virDomainObjUnlock(vm); } You'll need to do something similar for all the other hypervisor types that support macvtap interfaces and port profiles (again, look for the xxxReload() function of that hypervisor, most easily found by a tag search for virDomainLoadAllConfigs). Beyond this, I don't see any explicit action to remove the stale registration from lldpad. Is there a way to do this? Or will lldpad possibly remove it by itself after the first time it gets no reponse? break; diff --git a/src/util/virnetdevmacvlan.c b/src/util/virnetdevmacvlan.c index 647679f..19daf57 100644 --- a/src/util/virnetdevmacvlan.c +++ b/src/util/virnetdevmacvlan.c @@ -769,6 +769,49 @@ virNetDevMacVLanVPortProfileDestroyCallback(int watch ATTRIBUTE_UNUSED, virNetlinkCallbackDataFree((virNetlinkCallbackDataPtr)opaque); } +static int +virNetDevMacVLanVPortProfileRegisterCallback(const char *ifname, + const unsigned char *macaddress, + const char *linkdev, + const unsigned char *vmuuid, + virNetDevVPortProfilePtr virtPortProfile, + enum virNetDevVPortProfileOp vmOp) +{ +virNetlinkCallbackDataPtr calld = NULL; + +if (virtPortProfile virNetlinkEventServiceIsRunning()) { +if
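The suggested reload hook boils down to: iterate every domain object the driver reloaded, and for each one walk its interfaces and re-associate the ones that need it. A cut-down, self-contained model of that shape follows; all type and field names here are illustrative stand-ins, not the real domain_conf.h structures or the virHashForEach API.

```c
#include <stddef.h>

/* Minimal stand-ins for the libvirt structures involved. */
typedef struct {
    int needs_restart;        /* "is a VEPA/macvtap interface" */
    int restarted;
} FakeNetDef;

typedef struct {
    size_t nnets;
    FakeNetDef *nets;
} FakeDomainDef;

typedef struct {
    FakeDomainDef def;
} FakeDomainObj;

/* The per-domain callback: walk every interface of one domain and
 * take action on the ones that need it.  Real code would lock the
 * domain object around this loop and unlock it afterwards. */
static void domainNetsReload(FakeDomainObj *vm)
{
    for (size_t i = 0; i < vm->def.nnets; i++) {
        FakeNetDef *net = &vm->def.nets[i];
        if (net->needs_restart)
            net->restarted = 1;   /* stands in for the re-association */
    }
}

/* Stand-in for virHashForEach over driver->domains.objs: apply the
 * callback to each element of an array. */
static void domainsForEach(FakeDomainObj *doms, size_t ndoms,
                           void (*cb)(FakeDomainObj *))
{
    for (size_t i = 0; i < ndoms; i++)
        cb(&doms[i]);
}
```

The real callback would additionally have the virHashForEach signature (payload, name, opaque data) and cast payload back to virDomainObjPtr, but the traversal-plus-per-net-check structure is the same.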
Re: [libvirt] [PATCH] Increased upper limit on lists of pool names
On Thu, Mar 15, 2012 at 08:31:32AM -0600, Eric Blake wrote: On 03/14/2012 07:42 PM, Jesse J. Cook wrote: 256 (8 bits) is insufficient for large scale deployments. 65536 (16 bits) is a more appropriate limit and should be sufficient. You are more likely to run into other system limitations first, such as the 31998 inode link limit on ext3. correctly. See this comment: /* Maximum length of a memory peek buffer message. * Note applications need to be aware of this limit and issue multiple * requests for large amounts of data. */ const REMOTE_DOMAIN_MEMORY_PEEK_BUFFER_MAX = 65536; That is, you cannot send any RPC with more than 64k data, because . With a cap of 256 pool names, that means each pool name can be (on average) about 256 bytes before you hit the RPC cap. Not quite right, you meant to look at virnetrpcprotocol.x: /* Size of message payload */ const VIR_NET_MESSAGE_PAYLOAD_MAX = 262120; Regards, Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] Increased upper limit on lists of pool names
On 03/15/2012 05:51 AM, Daniel P. Berrange wrote:
On Thu, Mar 15, 2012 at 06:22:38AM -0500, Jesse Cook wrote:
Just to clarify, you would like to see:

I have not actually tried this, but based on my understanding of the RPC framework, I'd expect:

v0.9.10 pre-patch client talking to a v0.9.10 patched server:
If there are 256 or fewer names, this works as it always did. If there are more than 256 names, then the server tries to send back more array elements than the client is prepared to receive - the client will treat the response as an invalid RPC, and my recollection is that under these circumstances, the client then forcefully disconnects the session under the assumption that the server is broken. To avoid breaking such clients, you'd probably need some way for clients to warn the server that they are prepared to receive more than the old max (similar to how we had to add VIR_TYPED_PARAM_STRING_OKAY as a way for clients to tell the server that they were prepared to handle typed strings being returned). Basically, any time you change the RPC to allow more data to be returned than was previously possible, the server should not return that additional data unless the client has negotiated that it is aware of the additional data.

v0.9.10 patched client talking to a v0.9.10 pre-patch server:
No problem for the client. The server will never successfully return more than 256 names. However, if the client requests more than 256 names, I'm not sure whether the server will silently truncate to 256 or whether it will return an RPC error because it was unable to provide the reply.

And for comparison, a v0.9.10 pre-patch client talking to a v0.9.10 pre-patch server:
Should be the same behavior as a v0.9.10 patched client talking to a pre-patch server - you can't successfully send more than 256 names, but I'm not sure how the server will react if the client requests too many names. Actually, it might be slightly different - the client might reject the request up front without even contacting the server.
Obviously for v0.9.10 patch client talk to v0.9.10 patch server I'd expect just works :-) Of course, if both sides are prepared to handle the new RPC protocol, it should just work. -- Eric Blake ebl...@redhat.com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] Increased upper limit on lists of pool names
On 03/15/2012 08:41 AM, Daniel P. Berrange wrote: On Thu, Mar 15, 2012 at 08:31:32AM -0600, Eric Blake wrote: On 03/14/2012 07:42 PM, Jesse J. Cook wrote: 256 (8 bits) is insufficient for large scale deployments. 65536 (16 bits) is a more appropriate limit and should be sufficient. You are more likely to run into other system limitations first, such as the 31998 inode link limit on ext3. correctly. See this comment: /* Maximum length of a memory peek buffer message. * Note applications need to be aware of this limit and issue multiple * requests for large amounts of data. */ const REMOTE_DOMAIN_MEMORY_PEEK_BUFFER_MAX = 65536; That is, you cannot send any RPC with more than 64k data, because . With a cap of 256 pool names, that means each pool name can be (on average) about 256 bytes before you hit the RPC cap. Not quite right, you meant to look at virnetrpcprotocol.x: /* Size of message payload */ const VIR_NET_MESSAGE_PAYLOAD_MAX = 262120; Okay, I read the wrong constant, so I'm off by a factor of 4. That means increasing the pool limit from 256 to 1024 would reduce the average pool name length from 1024 down to 256, which is still pretty reasonable; but increasing the pool limit to 64k reduces the average pool name length limit down to 4 bytes if all names are provided at once (and those four bytes are probably all consumed in the overhead, since RPC sends strings by listing length before contents). For a case study on bumping limits, I only found one commit where we have done it in the past, using 'git log -p src/remote/remote_protocol.x' and grepping for '^-.*MAX.*=': commit 8654175 changed REMOTE_MIGRATE_COOKIE_MAX from 256 to 16k But there, we were also adding migrate v3, where v2 (and therefore old servers never sent enough data) to pass the max, so feature negotiation is built in - a client will only get more than 256 bytes if it requests migrate v3. 
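The averages being traded back and forth follow directly from how XDR encodes a string: a 4-byte length word followed by the bytes, padded up to a 4-byte boundary. A small illustrative calculation (PAYLOAD_MAX mirrors the VIR_NET_MESSAGE_PAYLOAD_MAX value quoted above; the helper names are made up, and framing overhead beyond the per-string cost is ignored):

```c
#define PAYLOAD_MAX 262120   /* VIR_NET_MESSAGE_PAYLOAD_MAX */

/* Wire cost of one XDR string: 4-byte length word plus the payload
 * rounded up to a multiple of 4. */
static int xdr_string_size(int len)
{
    return 4 + ((len + 3) & ~3);
}

/* Roughly how many names of a given average length fit in one
 * message, ignoring other framing overhead. */
static int names_per_message(int avg_len)
{
    return PAYLOAD_MAX / xdr_string_size(avg_len);
}
```

Plugging in the numbers from the thread: at 256-byte average names, a bit over 1000 fit per message, and at roughly 1 KB names, about 256 fit - which is the trade-off behind suggesting a 1024-name cap rather than 64k.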
My point remains that bumping limits only postpones hitting them and also requires some client feature negotiation; the only way to remove the limit entirely is to break the listing into multiple RPC calls.

-- 
Eric Blake ebl...@redhat.com +1-919-301-3266
Libvirt virtualization library http://libvirt.org

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] qemuDomainDetachPciDiskDevice: Free allocated cgroup
On 15.03.2012 13:04, Peter Krempa wrote: On 03/15/2012 11:49 AM, Michal Privoznik wrote: This function potentially allocates new virCgroup but never frees it. --- ACK. Peter Thanks, pushed. Michal -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCHv2] Emit graphics events when a SPICE client connects/disconnects
On 03/15/2012 12:19 AM, Osier Yang wrote: On 03/14/2012 11:05 PM, Laine Stump wrote: Wire up the domain graphics event notifications for SPICE. Adapted from a RHEL-only patch written by Dan Berrange that used custom __com.redhat_SPICE events - equivalent events are now available in upstream QEMU (including a SPICE_CONNECTED event, which was missing in the __COM.redhat_SPICE version). * src/qemu/qemu_monitor_json.c: Wire up SPICE graphics events --- V2: fix copy-paste typos checking for wrong strings. Not adding any interpretation of parameters, as this patch is just achieving parity with RHEL-specific build. ACK then. Okay. I pushed it. Thanks! -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH] lib: Don't access configuration if none is present
Commit e457d5ef2015e6106094b85f8bbd1582002edc4d adds the ability to pass the default URI using the client configuration file. If the file is not present, it still accesses the NULL config object, causing a segfault. Caught running make check.
---
 src/libvirt.c | 3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/src/libvirt.c b/src/libvirt.c
index 1c0cdf7..d0b1b28 100644
--- a/src/libvirt.c
+++ b/src/libvirt.c
@@ -1085,7 +1085,8 @@ virConnectOpenResolveURIAlias(virConfPtr conf,
     *uri = NULL;

-    if ((value = virConfGetValue(conf, "uri_aliases")))
+    if (conf &&
+        (value = virConfGetValue(conf, "uri_aliases")))
         ret = virConnectOpenFindURIAliasMatch(value, alias, uri);
     else
         ret = 0;
-- 
1.7.3.4

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH] sanlock: Enhance error message to point to possible problem with selinux
If the connection to the sanlock daemon is forbidden by selinux the error message was not clear enough. This patch adds a check if proper configuration for selinux is used while trying to connect to sanlock. *src/locking/lock_driver_sanlock.c: - add macro virLockSystemError that checks for selinux and reports an improved error message - modify calls of virReportSystemError to the new macro in apropriate places Background: https://bugzilla.redhat.com/show_bug.cgi?id=770488 --- src/locking/lock_driver_sanlock.c | 83 +++-- 1 files changed, 52 insertions(+), 31 deletions(-) diff --git a/src/locking/lock_driver_sanlock.c b/src/locking/lock_driver_sanlock.c index d344d6a..d5634f9 100644 --- a/src/locking/lock_driver_sanlock.c +++ b/src/locking/lock_driver_sanlock.c @@ -35,6 +35,10 @@ #include sanlock_resource.h #include sanlock_admin.h +#if HAVE_SELINUX +# include selinux/selinux.h +#endif + #include lock_driver.h #include logging.h #include virterror_internal.h @@ -51,7 +55,23 @@ #define virLockError(code, ...) \ virReportErrorHelper(VIR_FROM_THIS, code, __FILE__, \ __FUNCTION__, __LINE__, __VA_ARGS__) - +#if HAVE_SELINUX +# define virLockSystemError(theerrno, format, ...) \ +do { \ +if ((theerrno)==EACCES \ +security_get_boolean_active(virt_use_sanlock) == 0) { \ +char errbuff[1024]; \ +snprintf(errbuff, sizeof(errbuff), %s %s, (format), \ +_((Consider setting virt_use_sanlock selinux variable)));\ +virReportSystemError((theerrno), errbuff, __VA_ARGS__); \ +} else { \ +virReportSystemError((theerrno), (format), __VA_ARGS__); \ +} \ +} while(0); +#else +# define virLockSystemError(...) 
\ +virReportSystemError(__VA_ARGS__); +#endif #define VIR_LOCK_MANAGER_SANLOCK_AUTO_DISK_LOCKSPACE __LIBVIRT__DISKS__ @@ -186,9 +206,9 @@ static int virLockManagerSanlockSetupLockspace(void) _(Unable to query sector size %s: error %d), path, rv); else -virReportSystemError(-rv, - _(Unable to query sector size %s), - path); +virLockSystemError(-rv, + _(Unable to query sector size %s), + path); goto error_unlink; } @@ -215,9 +235,9 @@ static int virLockManagerSanlockSetupLockspace(void) _(Unable to initialize lockspace %s: error %d), path, rv); else -virReportSystemError(-rv, - _(Unable to initialize lockspace %s), - path); +virLockSystemError(-rv, + _(Unable to initialize lockspace %s), + path); goto error_unlink; } VIR_DEBUG(Lockspace %s has been initialized, path); @@ -236,9 +256,9 @@ static int virLockManagerSanlockSetupLockspace(void) _(Unable to add lockspace %s: error %d), path, rv); else -virReportSystemError(-rv, - _(Unable to add lockspace %s), - path); +virLockSystemError(-rv, + _(Unable to add lockspace %s), + path); goto error_unlink; } else { VIR_DEBUG(Lockspace %s is already registered, path); @@ -559,9 +579,9 @@ static int virLockManagerSanlockCreateLease(struct sanlk_resource *res) _(Unable to query sector size %s: error %d), res-disks[0].path, rv); else -virReportSystemError(-rv, - _(Unable to query sector size %s), - res-disks[0].path); +virLockSystemError(-rv, + _(Unable to query sector size %s), + res-disks[0].path); goto error_unlink; } @@ -588,9 +608,9 @@
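The idea in the patch above - report the plain system error normally, but append a hint when the failure is EACCES and the relevant SELinux boolean is off - can be modeled without libselinux. In this sketch, fake_selinux_bool_active is an assumed stand-in for security_get_boolean_active, and lockSystemError formats into a caller-supplied buffer instead of calling virReportSystemError:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Stub for security_get_boolean_active("virt_use_sanlock"): pretend
 * the boolean is disabled, the case the patch wants to flag.  This is
 * an assumption standing in for the real libselinux call. */
static int fake_selinux_bool_active(const char *name)
{
    (void)name;
    return 0;   /* 0 = boolean disabled */
}

/* Model of the patch's virLockSystemError: format the base error and,
 * only when the failure is EACCES with the tunable off, append a hint. */
static void lockSystemError(int err, const char *base,
                            char *buf, size_t buflen)
{
    if (err == EACCES &&
        fake_selinux_bool_active("virt_use_sanlock") == 0)
        snprintf(buf, buflen,
                 "%s: %s (consider the virt_use_sanlock selinux boolean)",
                 base, strerror(err));
    else
        snprintf(buf, buflen, "%s: %s", base, strerror(err));
}
```

Note the guard: the hint is only appended on the specific errno that SELinux denial produces, so unrelated failures keep their ordinary message.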
Re: [libvirt] [PATCH] sanlock: Enhance error message to point to possible problem with selinux
On Thu, Mar 15, 2012 at 05:23:09PM +0100, Peter Krempa wrote:

If the connection to the sanlock daemon is forbidden by selinux, the error message was not clear enough. This patch adds a check whether a proper selinux configuration is in use while trying to connect to sanlock.

*src/locking/lock_driver_sanlock.c:
- add macro virLockSystemError that checks for selinux and reports an improved error message
- modify calls of virReportSystemError to the new macro in appropriate places

Background: https://bugzilla.redhat.com/show_bug.cgi?id=770488

IMHO this is not something we should do here. You're outputting the message regardless of whether there is even an NFS volume involved, and hardcoding details of the SELinux policy. Finally, I don't think we should blindly tell people to change SELinux tunables without explaining the implications, which is not practical in an error message.

So, IMHO, this belongs in documentation, not in the error messages here.

Daniel
-- 
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] lib: Don't access configuration if none is present
On Thu, Mar 15, 2012 at 05:22:28PM +0100, Peter Krempa wrote:

Commit e457d5ef2015e6106094b85f8bbd1582002edc4d adds the ability to pass the default URI using the client configuration file. If the file is not present, it still accesses the NULL config object, causing a segfault. Caught running make check.
---
 src/libvirt.c | 3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/src/libvirt.c b/src/libvirt.c
index 1c0cdf7..d0b1b28 100644
--- a/src/libvirt.c
+++ b/src/libvirt.c
@@ -1085,7 +1085,8 @@ virConnectOpenResolveURIAlias(virConfPtr conf,
     *uri = NULL;

-    if ((value = virConfGetValue(conf, "uri_aliases")))
+    if (conf &&
+        (value = virConfGetValue(conf, "uri_aliases")))
         ret = virConnectOpenFindURIAliasMatch(value, alias, uri);
     else
         ret = 0;

ACK

Daniel
-- 
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] lib: Don't access configuration if none is present
On 03/15/2012 05:37 PM, Daniel P. Berrange wrote: On Thu, Mar 15, 2012 at 05:22:28PM +0100, Peter Krempa wrote: Commit e457d5ef2015e6106094b85f8bbd1582002edc4d adds ability to pass the default URI using the client configuration file. If the file is not present, it still accesses the NULL config object causing a segfault. Caught running make check. --- Thanks, pushed. Peter -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] Increased upper limit on lists of pool names
On 03/15/2012 08:31 AM, Eric Blake wrote: It may also be worth considering the addition of _just_ a new ranged RPC call, but no new public API, by making the public API smart enough to automatically call the ranged RPC multiple times to build up a single return to the user. After all, if we exposed the ranged call to the user, then each client will have to add the logic to make the multiple calls; whereas our current limitation is _only_ in the RPC aspect (once the data is across the RPC and into the target process' memory space, there's no longer a 64k cap to worry about). Thinking more about this idea: Suppose we have the following in remote_protocol.x: Existing: struct remote_storage_pool_list_volumes_args { remote_nonnull_storage_pool pool; int maxnames; }; struct remote_storage_pool_list_volumes_ret { remote_nonnull_string namesREMOTE_STORAGE_VOL_NAME_LIST_MAX; /* insert@1 */ }; ... REMOTE_PROC_STORAGE_POOL_LIST_VOLUMES = 92, /* autogen skipgen priority:high */ (note the change from autogen to skipgen for src/remote) New: struct remote_storage_pool_list_volumes_get_transaction_args { remote_nonnull_storage_pool pool; int maxnames; }; struct remote_storage_pool_list_volumes_get_transaction_ret { int transaction; }; struct remote_storage_pool_list_volumes_transaction_args { remote_nonnull_storage_pool pool; int transaction; int start; }; struct remote_storage_pool_list_volumes_ret { remote_nonnull_string namesREMOTE_STORAGE_VOL_NAME_LIST_MAX; /* insert@1 */ }; ... REMOTE_PROC_STORAGE_POOL_LIST_VOLUMES_GET_TRANSACTION = ... /* skipgen skipgen priority:high */ REMOTE_PROC_STORAGE_POOL_LIST_VOLUMES_TRANSACTION = ... 
/* skipgen skipgen priority:high */ Then, in pseudocode, we change the currently-generated src/remote handler for storagePoolListVolumes to do: if maxnames = REMOTE_STORAGE_VOL_NAME_LIST_MAX use REMOTE_PROC_STORAGE_POOL_LIST_VOLUMES /* existing code */ else { int transaction, i = 0; pass pool and maxnames through REMOTE_PROC_STORAGE_POOL_LIST_VOLUMES_GET_TRANSACTION to set transaction while (i maxnames) { pass pool and transaction through REMOTE_PROC_STORAGE_POOL_LIST_VOLUMES_TRANSACTION to get 128 more names i += REMOTE_STORAGE_VOL_NAME_LIST_MAX; } } Then, in daemon/remote.c, we leave the existing handler (which calls back into virStoragePoolListVolumes with 256 or less entries) untouched, and add two new hand-written RPC handlers: the handler for REMOTE_PROC_STORAGE_POOL_LIST_VOLUMES_GET_TRANSACTION: register that a sequenced transaction has been started allocate an array of maxnames and position and call virStoragePoolListVolumes, and tie that result to the sequence return the sequence transaction number the handler for REMOTE_PROC_STORAGE_POOL_LIST_VOLUMES_TRANSACTION: validates that the sequence number is valid, and that the start value matches the recorded position prepares the return RPC with the next 128 names from the array tied to the sequence, and updates the position if the position reached the end, free the array and end the sequenced transaction we also update virnetserver.c to support sequenced commands - the idea here is that any time a transaction will need more than one RPC message between client and server, then a transaction number can be generated which reserves memory tied to the transaction; as long as that transaction number exists, then further commands sent with reference to that transaction can reuse that memory; the transaction is then closed if the server and client complete the transaction, but also closed (and frees the reserved memory) if the connection disappears. 
To avoid a misbehaving client from DoS'ing a server, we could also require that while a transaction is open, the next RPC call from the client must reference that transaction, and must come within a reasonable time limit. To some extent, we already have a 'continue' status for a message; right now, a 'call' can only be replied by 'ok' or 'error', but this would be a case where 'call' is now replied by 'continue' to state that a transaction is in effect, and where we could reuse the 'serial' for as long as the transaction is in effect. At any rate, I don't have code to back this up, but it's certainly food for thought on whether it makes sense to try and enhance our use of RPC to support transactions without also having to add new public API. -- Eric Blake ebl...@redhat.com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
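The server-side bookkeeping outlined above - snapshot the full result once, then dole it out in fixed-size chunks while validating the client's cursor - can be sketched as a small state machine. The names (ListTxn, txnBegin, txnFetch, txnDone) and the per-chunk size are invented for illustration; real code would key transactions by a sequence number on the virNetServer connection and reap them on disconnect or timeout:

```c
#include <stdlib.h>

#define CHUNK 128   /* names returned per continuation call */

/* One open "transaction": a snapshot of the name list plus a cursor. */
typedef struct {
    char **names;
    int total;
    int pos;
} ListTxn;

static ListTxn *txnBegin(char **snapshot, int total)
{
    ListTxn *t = calloc(1, sizeof(*t));
    if (!t)
        return NULL;
    t->names = snapshot;   /* server snapshots the list once */
    t->total = total;
    return t;
}

/* Continuation call: hand back up to CHUNK names, enforcing that the
 * client's 'start' matches the recorded cursor so out-of-sequence
 * requests fail instead of silently skipping or repeating names. */
static int txnFetch(ListTxn *t, int start, char **out)
{
    if (start != t->pos)
        return -1;
    int n = 0;
    while (n < CHUNK && t->pos < t->total)
        out[n++] = t->names[t->pos++];
    return n;
}

static int txnDone(const ListTxn *t)
{
    return t->pos >= t->total;
}
```

Because the snapshot is taken inside txnBegin, every chunk the client fetches is consistent with the others - the TOCTTOU window moves from between RPC calls to before the transaction starts.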
Re: [libvirt] [PATCH] sanlock: Enhance error message to point to possible problem with selinux
On 03/15/2012 10:35 AM, Daniel P. Berrange wrote: On Thu, Mar 15, 2012 at 05:23:09PM +0100, Peter Krempa wrote: If the connection to the sanlock daemon is forbidden by selinux the error message was not clear enough. This patch adds a check if proper configuration for selinux is used while trying to connect to sanlock. *src/locking/lock_driver_sanlock.c: - add macro virLockSystemError that checks for selinux and reports an improved error message - modify calls of virReportSystemError to the new macro in apropriate places Background: https://bugzilla.redhat.com/show_bug.cgi?id=770488 IMHO this is not something we should do here. You're outputing the message regardless of whether there is even an NFS volume involved, and harcoding details of the SELinux policy. Finally I don't think we should blindly tell people to change SELinux tunables without explaining the implications, which is not practical in an error message. We've done this sort of targeted error message before; but there we were careful to _only_ issue the message after checking that we were indeed dealing with NFS; see commit 1888363d. [Hmm - should that patch be revisited, to mention virt_use_samba if it was samba rather than nfs that caused the SELinux denial, since those are different bools?] So, IMHO, this belongs in documentation, not in the error messages here. I think both are appropriate - we definitely need to mention virt_use_sanlock in locking.html.in, with full implications, but I would also like to see the error message, provided that you can be sure that the error message only mentions virt_use_sanlock in the actual case where SELinux is enforcing and the error is confirmed to be due to NFS causing us to need the bool. -- Eric Blake ebl...@redhat.com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH] Add support for forcing a private network namespace for LXC guests
From: Daniel P. Berrange berra...@redhat.com If no interface elements are included in an LXC guest XML description, then the LXC guest will just see the host's network interfaces. It is desirable to be able to hide the host interfaces, without having to define any guest interfaces. This patch introduces a new feature flag privnet/ to allow forcing of a private network namespace for LXC. In the future I also anticipate that we will add privuser/ to force a private user ID namespace. * src/conf/domain_conf.c, src/conf/domain_conf.h: Add support for privnet/ feature. Auto-set privnet if any interface devices are defined * src/lxc/lxc_container.c: Honour request for private network namespace NB: this was ACKed way back in January, but I forgot to push it then. Pushing it now - this re-post is just to remind folks where it came from https://www.redhat.com/archives/libvir-list/2012-January/msg01018.html --- docs/formatdomain.html.in |7 +++ docs/schemas/domaincommon.rng |5 + src/conf/domain_conf.c|3 ++- src/conf/domain_conf.h|1 + src/lxc/lxc_container.c | 12 5 files changed, 23 insertions(+), 5 deletions(-) diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index 624c6b2..4edada3 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -897,6 +897,7 @@ lt;acpi/gt; lt;apic/gt; lt;hap/gt; +lt;privnet/gt; lt;/featuresgt; .../pre @@ -924,6 +925,12 @@ ddEnable Viridian hypervisor extensions for paravirtualizing guest operating systems /dd + dtcodeprivnet/code/dt + ddAlways create a private network namespace. This is +automatically set if any interface devices are defined. +This feature is only relevant for container based +virtualization drivers, such as LXC. 
+ /dd /dl h3a name=elementsTimeTime keeping/a/h3 diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index b804a70..5b3e5fa 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -2632,6 +2632,11 @@ empty/ /element /optional + optional +element name=privnet + empty/ +/element + /optional /interleave /element /optional diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index f6f8b8c..e6d0f4b 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -106,7 +106,8 @@ VIR_ENUM_IMPL(virDomainFeature, VIR_DOMAIN_FEATURE_LAST, apic, pae, hap, - viridian) + viridian, + privnet) VIR_ENUM_IMPL(virDomainLifecycle, VIR_DOMAIN_LIFECYCLE_LAST, destroy, diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 0ab3b81..f471e35 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -1298,6 +1298,7 @@ enum virDomainFeature { VIR_DOMAIN_FEATURE_PAE, VIR_DOMAIN_FEATURE_HAP, VIR_DOMAIN_FEATURE_VIRIDIAN, +VIR_DOMAIN_FEATURE_PRIVNET, VIR_DOMAIN_FEATURE_LAST }; diff --git a/src/lxc/lxc_container.c b/src/lxc/lxc_container.c index d827b35..267fbfb 100644 --- a/src/lxc/lxc_container.c +++ b/src/lxc/lxc_container.c @@ -261,7 +261,8 @@ int lxcContainerWaitForContinue(int control) * * Returns 0 on success or nonzero in case of error */ -static int lxcContainerRenameAndEnableInterfaces(unsigned int nveths, +static int lxcContainerRenameAndEnableInterfaces(bool privNet, + unsigned int nveths, char **veths) { int rc = 0; @@ -289,7 +290,7 @@ static int lxcContainerRenameAndEnableInterfaces(unsigned int nveths, } /* enable lo device only if there were other net devices */ -if (veths) +if (veths || privNet) rc = virNetDevSetOnline(lo, true); error_out: @@ -1343,7 +1344,9 @@ static int lxcContainerChild( void *data ) VIR_DEBUG(Received container continue message); /* rename and enable interfaces */ -if (lxcContainerRenameAndEnableInterfaces(argv-nveths, +if 
(lxcContainerRenameAndEnableInterfaces(!!(vmDef-features + (1 VIR_DOMAIN_FEATURE_PRIVNET)), + argv-nveths, argv-veths) 0) { goto cleanup; } @@ -1458,7 +1461,8 @@ int lxcContainerStart(virDomainDefPtr def, cflags |= CLONE_NEWUSER; } -if (def-nets != NULL) { +if (def-nets != NULL || +(def-features (1 VIR_DOMAIN_FEATURE_PRIVNET))) { VIR_DEBUG(Enable network namespaces); cflags |= CLONE_NEWNET; } -- 1.7.7.6 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
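The run-time decision the patch makes - enable the new network namespace (CLONE_NEWNET) when the domain either defines interfaces or sets the privnet feature bit - reduces to a simple bitmask test. A minimal model, where the enum ordering and struct layout are illustrative rather than the real virDomainFeature/virDomainDef definitions:

```c
/* Illustrative feature bits; the real enum lives in domain_conf.h. */
enum { FEATURE_ACPI, FEATURE_PRIVNET };

typedef struct {
    unsigned int features;   /* bitmask, one bit per feature */
    int nnets;               /* number of configured interfaces */
} MiniDomainDef;

/* Mirrors the condition in lxcContainerStart: a private network
 * namespace is wanted if any interface is defined OR the privnet
 * feature flag is set. */
static int wantsPrivateNetNamespace(const MiniDomainDef *def)
{
    return def->nnets > 0 ||
           (def->features & (1u << FEATURE_PRIVNET)) != 0;
}
```

This also shows why the patch auto-sets the bit during parsing when interfaces exist: either trigger alone is enough, so a guest with no interfaces but an explicit privnet element still gets an isolated namespace containing only its loopback device.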
Re: [libvirt] [PATCH] sanlock: Enhance error message to point to possible problem with selinux
On Thu, Mar 15, 2012 at 10:55:20AM -0600, Eric Blake wrote: On 03/15/2012 10:35 AM, Daniel P. Berrange wrote: On Thu, Mar 15, 2012 at 05:23:09PM +0100, Peter Krempa wrote: If the connection to the sanlock daemon is forbidden by selinux the error message was not clear enough. This patch adds a check if proper configuration for selinux is used while trying to connect to sanlock. *src/locking/lock_driver_sanlock.c: - add macro virLockSystemError that checks for selinux and reports an improved error message - modify calls of virReportSystemError to the new macro in apropriate places Background: https://bugzilla.redhat.com/show_bug.cgi?id=770488 IMHO this is not something we should do here. You're outputing the message regardless of whether there is even an NFS volume involved, and harcoding details of the SELinux policy. Finally I don't think we should blindly tell people to change SELinux tunables without explaining the implications, which is not practical in an error message. We've done this sort of targeted error message before; but there we were careful to _only_ issue the message after checking that we were indeed dealing with NFS; see commit 1888363d. [Hmm - should that patch be revisited, to mention virt_use_samba if it was samba rather than nfs that caused the SELinux denial, since those are different bools?] I don't really agree with that previous commit either for the same reasons. So, IMHO, this belongs in documentation, not in the error messages here. I think both are appropriate - we definitely need to mention virt_use_sanlock in locking.html.in, with full implications, but I would also like to see the error message, provided that you can be sure that the error message only mentions virt_use_sanlock in the actual case where SELinux is enforcing and the error is confirmed to be due to NFS causing us to need the bool. 
I still don't think it belongs in the error message - with this kind of message, they'll just see the instruction and blindly follow it. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
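Commit 1888363d, referenced above, only emits its virt_use_nfs hint after confirming the image really sits on NFS. A rough, hypothetical sketch of that gating idea follows — this is not libvirt's actual implementation (which uses `statfs()` in C); `looks_like_nfs` and the `/proc/mounts`-parsing approach are illustrative assumptions only:

```python
def looks_like_nfs(path, mounts_text):
    # Decide whether `path` lives on an NFS mount, given the text of
    # /proc/mounts. Longest-prefix match on mount points picks the most
    # specific mount, then the filesystem type is checked.
    best_mnt, best_fstype = "", ""
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mnt, fstype = fields[1], fields[2]
        if path.startswith(mnt) and len(mnt) > len(best_mnt):
            best_mnt, best_fstype = mnt, fstype
    return best_fstype.startswith("nfs")
```

Only when such a check succeeds (and SELinux is enforcing) would the extended "consider virt_use_nfs" style message be justified, which is exactly the condition Eric asks for above.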
Re: [libvirt] [PATCH] Increased upper limit on lists of pool names
On Thu, Mar 15, 2012 at 11:47 AM, Eric Blake ebl...@redhat.com wrote: On 03/15/2012 08:31 AM, Eric Blake wrote: It may also be worth considering the addition of _just_ a new ranged RPC call, but no new public API, by making the public API smart enough to automatically call the ranged RPC multiple times to build up a single return to the user. After all, if we exposed the ranged call to the user, then each client will have to add the logic to make the multiple calls; whereas our current limitation is _only_ in the RPC aspect (once the data is across the RPC and into the target process' memory space, there's no longer a 64k cap to worry about). Thinking more about this idea: Suppose we have the following in remote_protocol.x: Existing: struct remote_storage_pool_list_volumes_args { remote_nonnull_storage_pool pool; int maxnames; }; struct remote_storage_pool_list_volumes_ret { remote_nonnull_string names<REMOTE_STORAGE_VOL_NAME_LIST_MAX>; /* insert@1 */ }; ... REMOTE_PROC_STORAGE_POOL_LIST_VOLUMES = 92, /* autogen skipgen priority:high */ (note the change from autogen to skipgen for src/remote) New: struct remote_storage_pool_list_volumes_get_transaction_args { remote_nonnull_storage_pool pool; int maxnames; }; struct remote_storage_pool_list_volumes_get_transaction_ret { int transaction; }; struct remote_storage_pool_list_volumes_transaction_args { remote_nonnull_storage_pool pool; int transaction; int start; }; struct remote_storage_pool_list_volumes_ret { remote_nonnull_string names<REMOTE_STORAGE_VOL_NAME_LIST_MAX>; /* insert@1 */ }; ... REMOTE_PROC_STORAGE_POOL_LIST_VOLUMES_GET_TRANSACTION = ... /* skipgen skipgen priority:high */ REMOTE_PROC_STORAGE_POOL_LIST_VOLUMES_TRANSACTION = ...
/* skipgen skipgen priority:high */ Then, in pseudocode, we change the currently-generated src/remote handler for storagePoolListVolumes to do: if maxnames <= REMOTE_STORAGE_VOL_NAME_LIST_MAX use REMOTE_PROC_STORAGE_POOL_LIST_VOLUMES /* existing code */ else { int transaction, i = 0; pass pool and maxnames through REMOTE_PROC_STORAGE_POOL_LIST_VOLUMES_GET_TRANSACTION to set transaction while (i < maxnames) { pass pool and transaction through REMOTE_PROC_STORAGE_POOL_LIST_VOLUMES_TRANSACTION to get 128 more names i += REMOTE_STORAGE_VOL_NAME_LIST_MAX; } } Then, in daemon/remote.c, we leave the existing handler (which calls back into virStoragePoolListVolumes with 256 or fewer entries) untouched, and add two new hand-written RPC handlers: the handler for REMOTE_PROC_STORAGE_POOL_LIST_VOLUMES_GET_TRANSACTION: register that a sequenced transaction has been started allocate an array of maxnames and position and call virStoragePoolListVolumes, and tie that result to the sequence return the sequence transaction number the handler for REMOTE_PROC_STORAGE_POOL_LIST_VOLUMES_TRANSACTION: validates that the sequence number is valid, and that the start value matches the recorded position prepares the return RPC with the next 128 names from the array tied to the sequence, and updates the position if the position reached the end, free the array and end the sequenced transaction we also update virnetserver.c to support sequenced commands - the idea here is that any time a transaction will need more than one RPC message between client and server, then a transaction number can be generated which reserves memory tied to the transaction; as long as that transaction number exists, then further commands sent with reference to that transaction can reuse that memory; the transaction is then closed if the server and client complete the transaction, but also closed (and frees the reserved memory) if the connection disappears.
To avoid a misbehaving client from DoS'ing a server, we could also require that while a transaction is open, the next RPC call from the client must reference that transaction, and must come within a reasonable time limit. To some extent, we already have a 'continue' status for a message; right now, a 'call' can only be replied by 'ok' or 'error', but this would be a case where 'call' is now replied by 'continue' to state that a transaction is in effect, and where we could reuse the 'serial' for as long as the transaction is in effect. At any rate, I don't have code to back this up, but it's certainly food for thought on whether it makes sense to try and enhance our use of RPC to support transactions without also having to add new public API. -- Eric Blake ebl...@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org Your solution seems reasonable, though I'd have to really think about it to see if there are any major holes in it. Could you leverage compression to eliminate the need for a more sophisticated solution? -- libvir-list mailing list libvir-list@redhat.com
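The client-side loop from the pseudocode above can be sketched end-to-end. Everything here is illustrative — `FakeServer`, `fetch_all`, and the chunk size of 128 are assumptions simulating the proposed (never-implemented) transaction RPCs in-process, not real libvirt APIs:

```python
CHUNK = 128  # the per-reply name count from the proposal

class FakeServer:
    """Stands in for the daemon side of the proposed transaction RPCs."""

    def __init__(self, names):
        self._names = names
        self._transactions = {}
        self._next_id = 1

    def list_volumes_get_transaction(self, maxnames):
        # Server allocates the full result once and ties it to a transaction id,
        # mirroring the GET_TRANSACTION handler described above.
        tid = self._next_id
        self._next_id += 1
        self._transactions[tid] = self._names[:maxnames]
        return tid

    def list_volumes_transaction(self, tid, start):
        # Each call returns at most CHUNK names; once the position reaches the
        # end, the reserved memory is freed and the transaction is closed.
        names = self._transactions[tid]
        chunk = names[start:start + CHUNK]
        if start + CHUNK >= len(names):
            del self._transactions[tid]
        return chunk

def fetch_all(server, maxnames):
    # The generated-client loop: open a transaction, then page until done.
    tid = server.list_volumes_get_transaction(maxnames)
    out = []
    i = 0
    while i < maxnames:
        part = server.list_volumes_transaction(tid, i)
        if not part:
            break
        out.extend(part)
        i += CHUNK
    return out
```

The key property is the one Eric describes: the public API caller sees one flat list, while each wire message stays bounded at `CHUNK` entries.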
Re: [libvirt] [PATCH] Increased upper limit on lists of pool names
On 03/15/2012 11:30 AM, Jesse Cook wrote: At any rate, I don't have code to back this up, but it's certainly food for thought on whether it makes sense to try and enhance our use of RPC to support transactions without also having to add new public API. Your solution seems reasonable, though I'd have to really think about it to see if there are any major holes in it. Could you leverage compression to eliminate the need for a more sophisticated solution? Compression is indeed another option, but it still implies a new RPC, where if maxnames is <= limit, we call the old RPC, and if it is greater, then we call the new RPC which generates the list, compresses it to less than 256k (the rpc limit), and passes the compressed data as an opaque blob, then the client decompresses the blob to reconstruct the list. Simpler than having to do transactions, if you can assume the compression is always likely to work, but more complex in that we have to decide what form of compression to use. (All of this really boils down to a protocol contract between client and server, where both sides agree on how to represent data so that any one message is not disproportionately large). Back to my earlier comments about cross-compatibility, how a server must not reply with more data than an old client was expecting so as not to crash the old client, and therefore new clients must indicate that they are newer: adding a new RPC is indeed a form of feature negotiation (if you are new enough to call the new RPC, then you can deal with the larger return of the new RPC). -- Eric Blake ebl...@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
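The compression alternative discussed above is easy to sketch: serialize the name list, compress it into one opaque blob that fits the message cap, and reverse on the client. This is only an illustration of the idea (the serialization format and use of zlib are assumptions, not anything libvirt's RPC actually does):

```python
import json
import zlib

def pack_names(names):
    # Serialize the name list and compress it into a single opaque blob,
    # as the hypothetical "new RPC" would do before checking the 256k cap.
    return zlib.compress(json.dumps(names).encode("utf-8"))

def unpack_names(blob):
    # Client side: decompress and reconstruct the original list.
    return json.loads(zlib.decompress(blob).decode("utf-8"))
```

Volume names tend to be highly repetitive, so the compressed blob is typically far smaller than the raw XDR string array — but as Eric notes, the scheme only works "if you can assume the compression is always likely to work", which adversarial name sets can defeat.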
Re: [libvirt] [PATCH] Increased upper limit on lists of pool names
On Thu, Mar 15, 2012 at 10:47:25AM -0600, Eric Blake wrote: On 03/15/2012 08:31 AM, Eric Blake wrote: It may also be worth considering the addition of _just_ a new ranged RPC call, but no new public API, by making the public API smart enough to automatically call the ranged RPC multiple times to build up a single return to the user. After all, if we exposed the ranged call to the user, then each client will have to add the logic to make the multiple calls; whereas our current limitation is _only_ in the RPC aspect (once the data is across the RPC and into the target process' memory space, there's no longer a 64k cap to worry about). Thinking more about this idea: Suppose we have the following in remote_protocol.x: Existing: struct remote_storage_pool_list_volumes_args { remote_nonnull_storage_pool pool; int maxnames; }; struct remote_storage_pool_list_volumes_ret { remote_nonnull_string names<REMOTE_STORAGE_VOL_NAME_LIST_MAX>; /* insert@1 */ }; ... REMOTE_PROC_STORAGE_POOL_LIST_VOLUMES = 92, /* autogen skipgen priority:high */ (note the change from autogen to skipgen for src/remote) New: struct remote_storage_pool_list_volumes_get_transaction_args { remote_nonnull_storage_pool pool; int maxnames; }; struct remote_storage_pool_list_volumes_get_transaction_ret { int transaction; }; struct remote_storage_pool_list_volumes_transaction_args { remote_nonnull_storage_pool pool; int transaction; int start; }; struct remote_storage_pool_list_volumes_ret { remote_nonnull_string names<REMOTE_STORAGE_VOL_NAME_LIST_MAX>; /* insert@1 */ }; ... REMOTE_PROC_STORAGE_POOL_LIST_VOLUMES_GET_TRANSACTION = ... /* skipgen skipgen priority:high */ REMOTE_PROC_STORAGE_POOL_LIST_VOLUMES_TRANSACTION = ...
/* skipgen skipgen priority:high */ Then, in pseudocode, we change the currently-generated src/remote handler for storagePoolListVolumes to do: if maxnames <= REMOTE_STORAGE_VOL_NAME_LIST_MAX use REMOTE_PROC_STORAGE_POOL_LIST_VOLUMES /* existing code */ else { int transaction, i = 0; pass pool and maxnames through REMOTE_PROC_STORAGE_POOL_LIST_VOLUMES_GET_TRANSACTION to set transaction while (i < maxnames) { pass pool and transaction through REMOTE_PROC_STORAGE_POOL_LIST_VOLUMES_TRANSACTION to get 128 more names i += REMOTE_STORAGE_VOL_NAME_LIST_MAX; } } Then, in daemon/remote.c, we leave the existing handler (which calls back into virStoragePoolListVolumes with 256 or fewer entries) untouched, and add two new hand-written RPC handlers: the handler for REMOTE_PROC_STORAGE_POOL_LIST_VOLUMES_GET_TRANSACTION: register that a sequenced transaction has been started allocate an array of maxnames and position and call virStoragePoolListVolumes, and tie that result to the sequence return the sequence transaction number the handler for REMOTE_PROC_STORAGE_POOL_LIST_VOLUMES_TRANSACTION: validates that the sequence number is valid, and that the start value matches the recorded position prepares the return RPC with the next 128 names from the array tied to the sequence, and updates the position if the position reached the end, free the array and end the sequenced transaction we also update virnetserver.c to support sequenced commands - the idea here is that any time a transaction will need more than one RPC message between client and server, then a transaction number can be generated which reserves memory tied to the transaction; as long as that transaction number exists, then further commands sent with reference to that transaction can reuse that memory; the transaction is then closed if the server and client complete the transaction, but also closed (and frees the reserved memory) if the connection disappears.
To avoid a misbehaving client from DoS'ing a server, we could also require that while a transaction is open, the next RPC call from the client must reference that transaction, and must come within a reasonable time limit. To some extent, we already have a 'continue' status for a message; right now, a 'call' can only be replied by 'ok' or 'error', but this would be a case where 'call' is now replied by 'continue' to state that a transaction is in effect, and where we could reuse the 'serial' for as long as the transaction is in effect. To me this all feels rather complicated to deal with. The reason for the RPC limits was to prevent memory usage DOS. The actual RPC message size is only half of the story though - the actual storage pool API still requires memory to give you back the string list. So if the list of volume names consumes 1 MB of memory, we're going to consume that just by invoking the API. Now copying the strings into an RPC message will double that to 2 MB. Using the sequence of RPC calls to iterate over this is only addressing that second part of memory usage. So I'm not really feeling
Re: [libvirt] [PATCH] Increased upper limit on lists of pool names
On 03/15/2012 11:48 AM, Daniel P. Berrange wrote: Using the sequence of RPC calls to iterate over this is only addressing that second part of memory usage. So I'm not really feeling that's a huge win, given the complexity it introduces. I'm inclined to say, that if people are creating setups with 1000's of volume guests, then they can probably spare the extra memory for us to increase the main RPC message payload limit somewhat. Our own docs/internals/rpc.html.in states: Although the protocol itself defines many arbitrary sized data values in the payloads, to avoid denial of service attack there are a number of size limit checks prior to encoding or decoding data. There is a limit on the maximum size of a single RPC message, limit on the maximum string length, and limits on any other parameter which uses a variable length array. These limits can be raised, subject to agreement between client/server, without otherwise breaking compatibility of the RPC data on the wire. Increasing limits makes sense, as long as we have a sane way to do it while ensuring that on version mismatch, a large packet from the sender doesn't crash an older receiver expecting the smaller limit. In our XDR, should we be converting some of our 'type name<LIST_MAX>' over to 'type name<>' notations, to state that they are effectively unlimited within the larger bounds of the overall packet size? Is 256k for overall packet size still okay for the problem at hand? -- Eric Blake ebl...@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
Re: [libvirt] [PATCH] Increased upper limit on lists of pool names
On Thu, Mar 15, 2012 at 1:10 PM, Eric Blake ebl...@redhat.com wrote: On 03/15/2012 11:48 AM, Daniel P. Berrange wrote: Using the sequence of RPC calls to iterate over this is only addressing that second part of memory usage. So I'm not really feeling that's a huge win, given the complexity it introduces. I'm inclined to say, that if people are creating setups with 1000's of volume guests, then they can probably spare the extra memory for us to increase the main RPC message payload limit somewhat. Our own docs/internals/rpc.html.in states: Although the protocol itself defines many arbitrary sized data values in the payloads, to avoid denial of service attack there are a number of size limit checks prior to encoding or decoding data. There is a limit on the maximum size of a single RPC message, limit on the maximum string length, and limits on any other parameter which uses a variable length array. These limits can be raised, subject to agreement between client/server, without otherwise breaking compatibility of the RPC data on the wire. Increasing limits makes sense, as long as we have a sane way to do it while ensuring that on version mismatch, a large packet from the sender doesn't crash an older receiver expecting the smaller limit. In our XDR, should we be converting some of our 'type name<LIST_MAX>' over to 'type name<>' notations, to state that they are effectively unlimited within the larger bounds of the overall packet size? Is 256k for overall packet size still okay for the problem at hand? -- Eric Blake ebl...@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org Using the previously posted snippet of code (I'm guessing that the missing 24 bytes is for RPC header info): /* Size of message payload */ const VIR_NET_MESSAGE_PAYLOAD_MAX = 262120; We would currently be able to squeeze 7084 things into the RPC message at 37 bytes per pool. We could easily reduce this to 33 bytes giving us 7943 things.
Now I don't know about other users, but this would give us some room to grow. Therefore, removing the list limitation reduces the priority of this from my perspective and gives us the opportunity to test other limitations that we might run into with libvirt or otherwise. When we start moving into the neighborhood of tens of thousands of things the priority of removing the 256k limit will be much higher from our perspective, but I'm not sure when that will be. I certainly can't speak from the perspective of other users. -- Jesse -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
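The byte arithmetic in the message above can be checked directly. This is just illustrative capacity math using the quoted `VIR_NET_MESSAGE_PAYLOAD_MAX` constant, not libvirt code:

```python
PAYLOAD_MAX = 262120  # VIR_NET_MESSAGE_PAYLOAD_MAX from the quoted snippet

def names_per_message(bytes_per_name):
    # How many fixed-size name entries fit into one RPC message payload.
    return PAYLOAD_MAX // bytes_per_name

# 37 bytes per pool name vs. a trimmed 33 bytes, as discussed above.
print(names_per_message(37), names_per_message(33))  # 7084 7943
```

So the "7084 things" and "7943 things" figures in the message are consistent with the 262120-byte payload cap.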
Re: [libvirt] Modified version of the libvirt-test-api wrapper
On 03/14/2012 11:56 AM, Guannan Ren wrote: On 03/14/2012 06:45 PM, Daniel Veillard wrote: On Tue, Mar 13, 2012 at 11:05:31PM -0300, Lucas Meneghel Rodrigues wrote: Hi Guannan: I've worked on your first version of the libvirt-test-api wrapper for autotest. Could you please check if you like the modified version? https://github.com/autotest/autotest/pull/230 If you do think it's fine, you can ack it, or you might take it, modify and resend it. On a git branch, you can do the following: curl https://github.com/autotest/autotest/pull/230.patch | git am And then modify and resend. All this looks fine to me. Maybe the libvirt-test-API.tar.gz should be renamed to stamp it, we should make sure it is recent, and possibly make a release (that could be as simple as copying one of ftp://libvirt.org/libvirt/libvirt-test-API/libvirt-test-API-git-snapshot.tar.gz and renaming it as libvirt-test-API-0.1.0.tar.gz) and push that to autotest git instead, Daniel I agree with this; for follow-up work, we could submit patches to the autotest mailing list. Guannan Ok, so I've committed the wrapper to the upstream repo: https://github.com/autotest/autotest/commit/c915fa8aea2c6a0542f2d798ad6a0fbaf2738ef3 When you want to follow up with this, let me know. Cheers, Lucas
Re: [libvirt] [PATCH v3] qemu: Support numad
On 03/08/2012 06:36 AM, Osier Yang wrote: numad is a user-level daemon that monitors NUMA topology and processes resource consumption to facilitate good NUMA resource alignment of applications/virtual machines to improve performance and minimize cost of remote memory latencies. It provides a pre-placement advisory interface, so significant processes can be pre-bound to nodes with sufficient available resources. Better late than never, but here are a few things you might want to tweak: +#if defined(NUMAD) +static char * +qemuGetNumadAdvice(virDomainDefPtr def) +{ +virCommandPtr cmd = NULL; +char *args = NULL; +char *output = NULL; + +if (virAsprintf(&args, "%d:%lu", def->vcpus, def->mem.cur_balloon) < 0) { +virReportOOMError(); +goto out; +} This prints into a temporary, +cmd = virCommandNewArgList(NUMAD, "-w", args, NULL); only to then copy it to cmd and have to free the temporary. It's fewer lines of code, and less malloc pressure, to avoid the temporary by printing directly into cmd: cmd = virCommandNewArgList(NUMAD, "-w", NULL); virCommandAddArgFormat(cmd, "%d:%llu", def->vcpus, def->mem.cur_balloon); +if (vm->def->cpumask) { +/* XXX why don't we keep 'cpumask' in the libvirt cpumap + * format to start with ?!?! */ +for (i = 0 ; i < maxcpu && i < vm->def->cpumasklen ; i++) +if (vm->def->cpumask[i]) +VIR_USE_CPU(cpumap, i); This comment just verbalizes what has already been bothering me. It would be really nice if someone could convert all of our internal uses of cpumask (with its inefficient one-byte-per-cpu representation) into virBitmask(), as well as making virBitmask provide convenience functions for converting to and from our stringized format as well as from the public api cpumap (8 cpus per byte). -- Eric Blake ebl...@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
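The two representations Eric contrasts — one byte per CPU internally versus the public-API cpumap packing of 8 CPUs per byte — convert like this. The function names below are illustrative sketches, not libvirt's actual helpers:

```python
def pack_cpumask(mask):
    # Convert a one-byte-per-CPU mask (a list of 0/1 values, in the spirit
    # of libvirt's internal 'cpumask') into the public-API cpumap packing
    # of 8 CPUs per byte, LSB first within each byte.
    out = bytearray((len(mask) + 7) // 8)
    for cpu, used in enumerate(mask):
        if used:
            out[cpu // 8] |= 1 << (cpu % 8)
    return bytes(out)

def cpu_usable(cpumap, cpu):
    # Test a single CPU bit in the packed map (VIR_CPU_USABLE-style check).
    return bool(cpumap[cpu // 8] & (1 << (cpu % 8)))
```

The packed form is 8x denser, which is exactly why the review suggests migrating the internal representation to a bitmap type with conversion helpers.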
Re: [libvirt] [PATCH v2 2/3] snapshots: Introduce VIR_DOMAIN_SNAPSHOT_CREATE_QUIESCE flag
On 01/24/2012 01:21 PM, Michal Privoznik wrote: With this flag, virDomainSnapshotCreate will use fs-freeze and fs-thaw guest agent commands to quiesce guest's disks. --- include/libvirt/libvirt.h.in | 4 ++ src/libvirt.c | 6 +++ src/qemu/qemu_driver.c | 87 ++ 3 files changed, 89 insertions(+), 8 deletions(-) Late review, but I just noticed this since I'm now touching the same area of code: +++ b/src/libvirt.c @@ -16602,6 +16602,12 @@ virDomainSnapshotGetConnect(virDomainSnapshotPtr snapshot) * inconsistent (as if power had been pulled), and specifying this * with the VIR_DOMAIN_SNAPSHOT_CREATE_HALT flag risks data loss. * + * If @flags includes VIR_DOMAIN_SNAPSHOT_CREATE_QUIESCE, then the + * libvirt will attempt to use guest agent to freeze and thaw all + * file systems in use within domain OS. However, if the guest agent + * is not present, an error is thrown. Moreover, this flag requires + * VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY to be passed as well. [note to self - if I ever get disk-only snapshots of offline guests to work by using qemu-img, then quiesce can be ignored, since an offline guest is by definition quiesced] +static int +qemuDomainSnapshotFSThaw(struct qemud_driver *driver, + virDomainObjPtr vm) { +qemuDomainObjPrivatePtr priv = vm->privateData; +int thawed; + +if (priv->agentError) { +qemuReportError(VIR_ERR_INTERNAL_ERROR, "%s", +_("QEMU guest agent is not + available due to an error")); +return -1; +} +if (!priv->agent) { +qemuReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s", +_("QEMU guest agent is not configured")); +return -1; +} + +qemuDomainObjEnterAgent(driver, vm); +thawed = qemuAgentFSThaw(priv->agent); Does qemuDomainObjEnterAgent and/or the common code that actually tries to pass a message over the agent socket ensure that the guest is still running? If the guest is online, but paused, then we know for a fact that the guest agent will not respond in a timely manner.
Rather than make all callers check for this situation, I'm thinking that it would make sense to check that the vm is running prior to issuing a guest-agent command, and to issue an early error, rather than waiting for a timeout when we finally realize the guest never responded. This is particularly important for the quiesce command. Consider what happens if we don't check for paused first: - guest is paused - we issue the quiesce command which gets buffered up, but it times out without being delivered - we fail the command - the guest is resumed, and now consumes the data that was pending on the monitor - the guest goes into quiesce, but our command is no longer around to thaw the guest. I think we need to ensure that on any failure path where we issued a quiesce command, _even if the quiesce command failed due to a timeout_, we _also_ issue a thaw command. If the quiesce command failed, we also expect the thaw command to fail, but if the failure was due to the guest being paused, now when the guest wakes up, it will have two pending commands on the bus, and will temporarily quiesce but then thaw right back. Of course, a paused guest isn't the only reason for a timeout, so we need to make sure of paired calls even if we also teach qemuDomainObjEnterAgent to make both freeze and thaw be instant fails on a paused guest. @@ -9773,6 +9823,12 @@ qemuDomainSnapshotCreateDiskActive(virConnectPtr conn, if (virDomainObjGetState(vm, NULL) == VIR_DOMAIN_RUNNING) { +if ((flags & VIR_DOMAIN_SNAPSHOT_CREATE_QUIESCE) && +(qemuDomainSnapshotFSFreeze(driver, vm) < 0)) { +/* helper reported the error */ +goto endjob; +} Note that if the guest is paused instead of running, and the QUIESCE flag is present, we fail to quiesce the disk. That means the snapshot we took is not stable.
Okay, we only documented the QUIESCE flag as a best-effort (and anything that relies on the guest agent is inherently best-effort, as we can't trust the guest), but I think it makes more sense to _always_ attempt the FSFreeze on an active guest, and abort the snapshot if quiesce fails (including but not limited to the case where the guest is paused), rather than only attempting the quiesce on a running guest. + /* In qemu, snapshot_blkdev on a single disk will pause cpus, * but this confuses libvirt since notifications are not given * when qemu resumes. And for multiple disks, libvirt must @@ -9840,13 +9896,20 @@ qemuDomainSnapshotCreateDiskActive(virConnectPtr conn, } cleanup: -if (resume && virDomainObjIsActive(vm) && -qemuProcessStartCPUs(driver, vm, conn, - VIR_DOMAIN_RUNNING_UNPAUSED, -
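The pairing rule Eric argues for — once a freeze has been *sent*, a thaw must be sent too, even if the freeze itself timed out — maps naturally onto a try/finally shape. This is a language-neutral sketch of the control flow only; `snapshot_with_quiesce` and `AgentTimeout` are hypothetical names, not libvirt's C implementation:

```python
class AgentTimeout(Exception):
    """Stands in for a guest-agent command timing out (e.g. paused guest)."""

def snapshot_with_quiesce(freeze, thaw, take_snapshot):
    # Once the freeze command has been sent, a thaw MUST also be sent,
    # even on timeout: a paused guest may consume the buffered freeze
    # later, and the queued thaw then thaws it right back.
    freeze_sent = False
    try:
        freeze_sent = True
        freeze()
        return take_snapshot()
    finally:
        if freeze_sent:
            try:
                thaw()
            except AgentTimeout:
                pass  # best effort: the thaw is queued alongside the freeze
```

This mirrors the "two pending commands on the bus" outcome described above: a guest that wakes up sees freeze followed immediately by thaw, instead of being left frozen.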
Re: [libvirt] Memory leak on list_domains
On 01/25/2012 03:20 AM, Carlos Rodrigues wrote: Hi, I have some problems on my application with memory leak when call list_domains method. I'm using libvirt 0.8.3 and Sys::Virt 0.2.4 Perl Module. Does anyone have any idea what's the problem? Sorry for not noticing this mail sooner. Without seeing the actual code, it's tough to know whether the leak is due to a bug in your code, in the perl bindings, or even if it is a bug in older libvirt that has since been patched. Can you try again with libvirt 0.9.10, or even libvirt.git? Or can you provide more evidence of a leak, such as a valgrind backtrace of the memory leak? -- Eric Blake ebl...@redhat.com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] Using Libvirt to change the bridge a virtual network card of a running vm is connected to
On 01/26/2012 04:10 AM, Daniel P. Berrange wrote: On Wed, Jan 25, 2012 at 04:09:22PM +0100, Hendrik Schwartke wrote: I wrote a patch to change the mapping between a virtual bridge interface and the host bridge while the host is up. It's based on commit 6fba577e505611e6c25c68e322942eab7754de7e. The host and the interface definition I used for testing are also attached. I would be glad if the patch could be added to the repo. Great ! ... Various whitespace fixes - For assignment, we prefer 'foo = bar' instead of 'foo=bar' - For conditionals, 'if (' instead of 'if(' - For tests, 'foo > 0' instead of 'foo>0' So all in all I think I'd write this as int qemuDomainChangeNetBridge(virDomainNetDefPtr olddev, virDomainNetDefPtr newdev) { ... Are you planning on resubmitting this patch with the noted improvements? -- Eric Blake ebl...@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
Re: [libvirt] [PATCH v2 4/5] qemu: Wire up virDomainSuspendForDuration API
On 01/26/2012 12:59 PM, Michal Privoznik wrote: This makes use of QEMU guest agent to implement the virDomainSuspendForDuration API. --- src/qemu/qemu_driver.c | 93 1 files changed, 93 insertions(+), 0 deletions(-) + +if (!virDomainObjIsActive(vm)) { +qemuReportError(VIR_ERR_OPERATION_INVALID, +"%s", _("domain is not running")); +goto cleanup; +} Same question as for quiesce: putting the guest into S3 will only work if the agent can respond, so checking merely for active is not enough. If the guest is active but paused, then we can't talk to the agent to issue the request. Having the common guest agent code check for this condition will prevent the scenario of: guest is paused; we issue the pm suspend, but it times out; guest is resumed; guest finally acts on the command; although it is always possible that a guest will suspend itself even without action on our part. At least this isn't as bad as a stale quiesce leaving the guest in a bad state, since on suspend, the host will receive a qemu event about the change in guest state (there's no event for freeze/thaw, since that is an aspect internal to the guest's use of disks and not something inherently visible to qemu to generate an event from). -- Eric Blake ebl...@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
Re: [libvirt] [PATCH] Increased upper limit on lists of pool names
On Thu, Mar 15, 2012 at 9:43 AM, Eric Blake ebl...@redhat.com wrote: On 03/15/2012 05:51 AM, Daniel P. Berrange wrote: On Thu, Mar 15, 2012 at 06:22:38AM -0500, Jesse Cook wrote: Just to clarify, you would like to see: I have not actually tried this, but based on my understanding of the RPC famework, I'd expect: v0.9.10 pre-patch client talk to v0.9.10 patch server If there are 256 or fewer names, this works as it always did. If there are more than 256 names, then the server tries to send back more array elements than the client is prepared to receive - the client will treat the response as an invalid RPC, and my recollection is that under these circumstances, the client then forcefully disconnects the session under the assumption that the server is broken. To avoid breaking such clients, you'd probably need some way for clients to warn the server if they are prepared to receive more than the old max (similar to how we had to add VIR_TYPED_PARAM_STRING_OKAY as a way for clients to tell the server that they were prepared to handle typed strings being returned). Basically, any time you change the RPC to allow more data to be returned than what was previously possible, the server should not return that additional data unless the client has negotiated that it is aware of the additional data. v0.9.10 patch client talk to v0.9.10 pre-patch server No problem for the client. The server will never successfully return more than 256 names. However, if the client requests more than 256 names, I'm not sure whether the server will silently truncate to 256 or whether it will return an RPC error because it was unable to provide the reply. And for comparison v0.9.10 pre-patch client talk to v0.9.10 pre-patch server Should be the same behavior as a v0.9.10-patch client talking to a pre-patch server - you can't successfully send more than 256 names, but I'm not sure how the server will react if the client requests too many names. 
Actually, it might be slightly different - the client might reject the request up front without even contacting the server. Obviously for v0.9.10 patch client talk to v0.9.10 patch server I'd expect just works :-) Of course, if both sides are prepared to handle the new RPC protocol, it should just work. -- Eric Blake ebl...@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org v0.9.10 client did not want to play nicely with the v0.9.10 server via qemu+ssh. I got frustrated and just tried running the test from a client running an older version of libvirt. When connecting to v0.9.10, it behaved the same way the pre-patched did in my cover letter. I don't have full test results because of the communication errors I was getting. I'll try to either figure it out tomorrow or just use the older version as the client (pre-patch and patch). -- Jesse -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2] python: add virDomainGetCPUStats python binding API
Is there anyone who would like to review this :) On 03/14/2012 09:03 PM, Guannan Ren wrote: dom.getCPUStats(True, 0) [{'cpu_time': 92913537401L, 'system_time': 547000L, 'user_time': 31000L}] dom.getCPUStats(False, 0) [{'cpu_time': 39476858499L}, {'cpu_time': 10627048370L}, {'cpu_time': 21270945682L}, {'cpu_time': 21556420641L}] *generator.py Add a new naming rule *libvirt-override-api.xml The API function description *libvirt-override.c Implement it. --- python/generator.py | 5 +- python/libvirt-override-api.xml | 10 +++ python/libvirt-override.c | 164 +++ 3 files changed, 178 insertions(+), 1 deletions(-) diff --git a/python/generator.py b/python/generator.py index 98072f0..4f95cc9 100755 --- a/python/generator.py +++ b/python/generator.py @@ -423,7 +423,7 @@ skip_impl = ( 'virDomainGetBlockIoTune', 'virDomainSetInterfaceParameters', 'virDomainGetInterfaceParameters', -'virDomainGetCPUStats', # not implemented now. +'virDomainGetCPUStats', 'virDomainGetDiskErrors', ) @@ -966,6 +966,9 @@ def nameFixup(name, classe, type, file): elif name[0:19] == "virStorageVolLookup": func = name[3:] func = string.lower(func[0:1]) + func[1:] +elif name[0:20] == "virDomainGetCPUStats": +func = name[9:] +func = string.lower(func[0:1]) + func[1:] elif name[0:12] == "virDomainGet": func = name[12:] func = string.lower(func[0:1]) + func[1:] diff --git a/python/libvirt-override-api.xml b/python/libvirt-override-api.xml index ab8f33a..c906cc3 100644 --- a/python/libvirt-override-api.xml +++ b/python/libvirt-override-api.xml @@ -149,6 +149,16 @@ <arg name='path' type='char *' info='the path for the block device'/> <arg name='flags' type='int' info='flags (unused; pass 0)'/> </function> +<function name='virDomainGetCPUStats' file='python'> +<info>Extracts CPU statistics for a running domain, On success it will return a list of data of dictionary type. + If boolean total is True, the first element of the list refers to CPU0 on the host, second element is CPU1, and so on.
+ The format of data struct is like [{cpu_time:xxx},{cpu_time:xxx}, ...] + If it is False, it returns total domain CPU statistics like [{cpu_time:xxx, user_time:xxx, system_time:xxx}]/info +return type='str *' info='returns a list of dictionary in case of success, None in case of error'/ +arg name='domain' type='virDomainPtr' info='pointer to domain object'/ +arg name='total' type='bool' info='on true, return total domain CPU statistics, false return per-cpu info'/ +arg name='flags' type='int' info='flags (unused; pass 0)'/ +/function function name='virDomainInterfaceStats' file='python' infoExtracts interface device statistics for a domain/info return type='virDomainInterfaceStats' info='a tuple of statistics'/ diff --git a/python/libvirt-override.c b/python/libvirt-override.c index 792cfa3..eb24c60 100644 --- a/python/libvirt-override.c +++ b/python/libvirt-override.c @@ -378,6 +378,169 @@ cleanup: } static PyObject * +libvirt_virDomainGetCPUStats(PyObject *self ATTRIBUTE_UNUSED, PyObject *args) +{ +virDomainPtr domain; +PyObject *pyobj_domain, *totalbool; +PyObject *ret = NULL; +PyObject *cpu, *total; +PyObject *error; +int i, i_retval; +int ncpus = -1; +int sumparams, nparams = -1; +unsigned int flags; +virTypedParameterPtr params, cpuparams; + +if (!PyArg_ParseTuple(args, (char *)OOi:virDomainGetNumaParameters, +pyobj_domain,totalbool,flags)) +return NULL; +domain = (virDomainPtr) PyvirDomain_Get(pyobj_domain); + +if (!PyBool_Check(totalbool)) { +PyErr_Format(PyExc_TypeError, +The \total\ attribute must be bool); +return NULL; +} + +if ((ret = PyList_New(0)) == NULL) +return NULL; + +if (totalbool == Py_False) { +LIBVIRT_BEGIN_ALLOW_THREADS; +ncpus = virDomainGetCPUStats(domain, NULL, 0, 0, 0, flags); +LIBVIRT_END_ALLOW_THREADS; + +if (ncpus 0) { +error = VIR_PY_NONE; +goto failed; +} + +LIBVIRT_BEGIN_ALLOW_THREADS; +nparams = virDomainGetCPUStats(domain, NULL, 0, 0, 1, flags); +LIBVIRT_END_ALLOW_THREADS; + +if (nparams 0) { +error = VIR_PY_NONE; +goto failed; 
+} + +if (!nparams) { +if ((cpu = PyDict_New()) == NULL) { +error = NULL; +goto failed; +} + +if (PyList_Append(ret, cpu) 0) { +Py_DECREF(cpu); +error = NULL; +goto failed; +} + +Py_DECREF(cpu); +return ret; +} +sumparams = nparams * ncpus; + +if (VIR_ALLOC_N(params, sumparams) 0) { +error =
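The two return shapes quoted at the top of this thread can be illustrated without a live hypervisor. The sketch below (plain Python, no libvirt required) sums a per-CPU list of the kind returned by getCPUStats(False, 0) into the single-entry total form; aggregate_cpu_stats is a hypothetical helper name used here for illustration, not part of the proposed binding.

```python
# Sample per-CPU output, as shown in the posting above: one dict per
# host CPU, each carrying a 'cpu_time' value in nanoseconds.
per_cpu = [{'cpu_time': 39476858499}, {'cpu_time': 10627048370},
           {'cpu_time': 21270945682}, {'cpu_time': 21556420641}]

def aggregate_cpu_stats(stats):
    """Sum a list of per-CPU stat dicts into one total dict
    (hypothetical helper, for illustration only)."""
    total = {}
    for entry in stats:
        for key, value in entry.items():
            total[key] = total.get(key, 0) + value
    return total

total = aggregate_cpu_stats(per_cpu)
print(total['cpu_time'])  # → 92931273192
```

Note the sum does not exactly match the 'cpu_time' shown for getCPUStats(True, 0) in the example, since the two calls sample the domain at different moments.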
Re: [libvirt] [PATCH v3] qemu: Support numad
On 03/16/2012 06:12 AM, Eric Blake wrote:
On 03/08/2012 06:36 AM, Osier Yang wrote:

numad is a user-level daemon that monitors NUMA topology and process
resource consumption to facilitate good NUMA resource alignment of
applications/virtual machines, improving performance and minimizing the
cost of remote memory latencies. It provides a pre-placement advisory
interface, so significant processes can be pre-bound to nodes with
sufficient available resources.

Better late than never, but here are a few things you might want to tweak:

+#if defined(NUMAD)
+static char *
+qemuGetNumadAdvice(virDomainDefPtr def)
+{
+    virCommandPtr cmd = NULL;
+    char *args = NULL;
+    char *output = NULL;
+
+    if (virAsprintf(&args, "%d:%lu", def->vcpus, def->mem.cur_balloon) < 0) {
+        virReportOOMError();
+        goto out;
+    }

This prints into a temporary,

+    cmd = virCommandNewArgList(NUMAD, "-w", args, NULL);

only to then copy it to cmd and have to free the temporary. It's fewer
lines of code, and less malloc pressure, to avoid the temporary by
printing directly into cmd:

    cmd = virCommandNewArgList(NUMAD, "-w", NULL);
    virCommandAddArgFormat(cmd, "%d:%llu", def->vcpus, def->mem.cur_balloon);

+    if (vm->def->cpumask) {
+        /* XXX why don't we keep 'cpumask' in the libvirt cpumap
+         * format to start with ?!?! */
+        for (i = 0 ; i < maxcpu && i < vm->def->cpumasklen ; i++)
+            if (vm->def->cpumask[i])
+                VIR_USE_CPU(cpumap, i);

This comment just verbalizes what has already been bothering me. It
would be really nice if someone could convert all of our internal uses
of cpumask (with its inefficient one-byte-per-cpu representation) into
virBitmask(), as well as making virBitmask provide convenience functions
for converting to and from our stringized format as well as from the
public API cpumap (8 cpus per byte).

I will do this work in separate patches.

Thanks,
Osier
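The conversion Eric asks for — between the internal one-byte-per-cpu cpumask and the public-API cpumap that packs 8 CPUs per byte — can be sketched in a few lines. This is an illustrative model of the packing only; pack_cpumap is a made-up name and not a libvirt or virBitmap function.

```python
def pack_cpumap(cpumask):
    """Pack a one-byte-per-cpu mask (a list of 0/1 flags, one per CPU)
    into a cpumap holding 8 CPUs per byte, CPU i in bit (i % 8) of
    byte (i // 8) — the layout the libvirt public API uses."""
    maplen = (len(cpumask) + 7) // 8
    cpumap = bytearray(maplen)
    for i, used in enumerate(cpumask):
        if used:
            cpumap[i // 8] |= 1 << (i % 8)
    return bytes(cpumap)

# CPUs 0, 1 and 9 enabled out of 12: byte 0 gets bits 0-1, byte 1 gets bit 1.
print(pack_cpumap([1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0]).hex())  # → 0302
```

The inverse direction (unpacking a cpumap back into per-CPU flags) is the mirror image, testing bit (i % 8) of byte (i // 8) for each CPU i.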