Re: [libvirt] Overhead for a default cpu cg placement scheme
On Thu, Jun 11, 2015 at 01:50:24PM +0300, Andrey Korolyov wrote: Hi Daniel, would it be possible to adopt an optional tunable for the virCgroup mechanism that disables nested (per-thread) cgroup creation? These cgroups bring visible overhead for many-threaded guest workloads, almost 5% in a non-congested host CPU state, primarily because the host scheduler has to make many more decisions with them than without them. We also experienced a lot of host lockups with the currently used cgroup placement and disabled nested behavior a couple of years ago. Though the current patch simply carves out the mentioned behavior, leaving only top-level per-machine cgroups, it could serve as a basis for an upstream change after some adaptation; that's why I'm asking about the chance of its acceptance. This message is a kind of 'feature request': it can be accepted or dropped from our side, or someone may lend a hand and redo it from scratch. The detailed benchmarks are for a 3.10.y host kernel; if anyone is interested in numbers for the latest stable, I can update them. When you say nested cgroup creation, are you referring to the modern libvirt hierarchy, or the legacy hierarchy, as described here: http://libvirt.org/cgroups.html ? The current libvirt setup, used for a year or so now, is much shallower than previously, to the extent that we'd consider performance problems with it to be the job of the kernel to fix. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 0/8] logical memory hotplug via guest agent
On Thu, Jun 11, 2015 at 09:38:24 +0800, zhang bo wrote: On 2015/6/10 17:31, Daniel P. Berrange wrote: On Wed, Jun 10, 2015 at 10:28:08AM +0100, Daniel P. Berrange wrote: On Wed, Jun 10, 2015 at 05:24:50PM +0800, zhang bo wrote: On 2015/6/10 16:39, Vasiliy Tolstov wrote: 2015-06-10 11:37 GMT+03:00 Daniel P. Berrange berra...@redhat.com: The udev rules are really something the OS vendor should set up, so that it just works. I think so; vcpu hotplug is also covered by udev. Maybe we need something to hot-remove memory and cpu, because in the guest we need to offline them first. In fact, we also have the --guest option for the 'virsh setvcpus' command, which also uses qga commands to do the logical hotplug/unplug jobs, although udev rules seem to cover the vcpu logical hotplug issue. virsh # help setvcpus . --guest modify cpu state in the guest BTW: we didn't see OSes with udev rules for memory-hotplug events set by vendors, and adding such rules means that we have to *interfere within the guest*. It seems not a good option. I was suggesting that an RFE be filed with any vendor who doesn't do it to add this capability, not that we add udev rules ourselves. Or actually, it is probably sufficient to just send a patch to the upstream systemd project to add the desired rule to udev. That way all Linux distros will inherit the feature when they update to a new udev. Then, here comes the question: how to deal with the guests that are already in use? I think it's better to operate on them at the host side without getting into the guest. That's the advantage of qemu-guest-agent; why not take advantage of it? Such guests would need an updated qemu-guest-agent anyway. And installing a new version of qemu-guest-agent is not any easier than installing an updated udev or a new udev rule. That is, I don't think the qemu-guest-agent way has any benefits over a udev rule. It's rather the opposite.
Jirka
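For concreteness, the kind of udev rule discussed in this thread (automatically onlining hot-added memory inside the guest, so no agent call is needed) could look like the following sketch. The file name is illustrative, and whether a given distro ships such a rule by default varies:

```
# /etc/udev/rules.d/80-hotplug-mem.rules (illustrative file name)
# Online memory blocks as soon as they are hot-added to the guest
SUBSYSTEM=="memory", ACTION=="add", ATTR{state}=="offline", ATTR{state}="online"
```

After dropping such a rule in place and running `udevadm control --reload`, newly hot-plugged memory sections should come online without any guest-agent involvement.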
Re: [libvirt] [PATCH] add some more missing libvirt functions
On 10 June 2015 at 16:00, Michal Privoznik mpriv...@redhat.com wrote: On 10.06.2015 10:18, Vasiliy Tolstov wrote:
* libvirt_connect_get_all_domain_stats
* libvirt_domain_block_resize
* libvirt_domain_block_job_abort
* libvirt_domain_block_job_set_speed

Signed-off-by: Vasiliy Tolstov v.tols...@selfip.ru
---
 src/libvirt-php.c | 177 +-
 src/libvirt-php.h | 4 ++
 2 files changed, 180 insertions(+), 1 deletion(-)

From the e-mail header: Content-Type: text/plain; charset=yes
I didn't know there's such a charset as yes :)

diff --git a/src/libvirt-php.c b/src/libvirt-php.c
index e9b9657..f9096ef 100644
--- a/src/libvirt-php.c
+++ b/src/libvirt-php.c
@@ -91,6 +91,7 @@ static zend_function_entry libvirt_functions[] = {
     PHP_FE(libvirt_connect_get_maxvcpus, NULL)
     PHP_FE(libvirt_connect_get_encrypted, NULL)
     PHP_FE(libvirt_connect_get_secure, NULL)
+    PHP_FE(libvirt_connect_get_all_domain_stats, NULL)
     /* Stream functions */
     PHP_FE(libvirt_stream_create, NULL)
     PHP_FE(libvirt_stream_close, NULL)
@@ -136,6 +137,10 @@ static zend_function_entry libvirt_functions[] = {
     PHP_FE(libvirt_domain_memory_peek, NULL)
     PHP_FE(libvirt_domain_memory_stats, NULL)
     PHP_FE(libvirt_domain_block_stats, NULL)
+    PHP_FE(libvirt_domain_block_resize, NULL)
+    // PHP_FE(libvirt_domain_block_copy, NULL)

Just drop this line.
+    PHP_FE(libvirt_domain_block_job_abort, NULL)
+    PHP_FE(libvirt_domain_block_job_set_speed, NULL)
     PHP_FE(libvirt_domain_interface_stats, NULL)
     PHP_FE(libvirt_domain_get_connect, NULL)
     PHP_FE(libvirt_domain_migrate, NULL)
@@ -1332,6 +1337,11 @@ PHP_MINIT_FUNCTION(libvirt)
     /* Job was aborted but it's not cleaned up yet */
     REGISTER_LONG_CONSTANT("VIR_DOMAIN_JOB_CANCELLED", 5, CONST_CS | CONST_PERSISTENT);
+    REGISTER_LONG_CONSTANT("VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC", VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC, CONST_CS | CONST_PERSISTENT);
+    REGISTER_LONG_CONSTANT("VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT", VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT, CONST_CS | CONST_PERSISTENT);
+
+    REGISTER_LONG_CONSTANT("VIR_DOMAIN_BLOCK_JOB_SPEED_BANDWIDTH_BYTES", VIR_DOMAIN_BLOCK_JOB_SPEED_BANDWIDTH_BYTES, CONST_CS | CONST_PERSISTENT);
+
     /* Migration constants */
     REGISTER_LONG_CONSTANT("VIR_MIGRATE_LIVE", 1, CONST_CS | CONST_PERSISTENT); /* direct source -> dest host control channel */
     /* Note the less-common spelling that we're stuck with: */
@@ -1374,7 +1384,7 @@ PHP_MINIT_FUNCTION(libvirt)
     REGISTER_LONG_CONSTANT("VIR_DOMAIN_FLAG_TEST_LOCAL_VNC", DOMAIN_FLAG_TEST_LOCAL_VNC, CONST_CS | CONST_PERSISTENT);
     REGISTER_LONG_CONSTANT("VIR_DOMAIN_FLAG_SOUND_AC97", DOMAIN_FLAG_SOUND_AC97, CONST_CS | CONST_PERSISTENT);
     REGISTER_LONG_CONSTANT("VIR_DOMAIN_DISK_FILE", DOMAIN_DISK_FILE, CONST_CS | CONST_PERSISTENT);
-    REGISTER_LONG_CONSTANT("VIR_DOMAIN_DISK_BLOCK", DOMAIN_DISK_BLOCK, CONST_CS | CONST_PERSISTENT);
+    REGISTER_LONG_CONSTANT("VIR_DOMAIN_DISK_BLOCK", DOMAIN_DISK_BLOCK, CONST_CS | CONST_PERSISTENT);

This looks like a spurious change.
     REGISTER_LONG_CONSTANT("VIR_DOMAIN_DISK_ACCESS_ALL", DOMAIN_DISK_ACCESS_ALL, CONST_CS | CONST_PERSISTENT);

     /* Domain metadata constants */
@@ -1385,6 +1395,24 @@ PHP_MINIT_FUNCTION(libvirt)
     REGISTER_LONG_CONSTANT("VIR_DOMAIN_AFFECT_LIVE", 1, CONST_CS | CONST_PERSISTENT);
     REGISTER_LONG_CONSTANT("VIR_DOMAIN_AFFECT_CONFIG", 2, CONST_CS | CONST_PERSISTENT);
+    REGISTER_LONG_CONSTANT("VIR_DOMAIN_STATS_STATE", VIR_DOMAIN_STATS_STATE, CONST_CS | CONST_PERSISTENT);
+    REGISTER_LONG_CONSTANT("VIR_DOMAIN_STATS_CPU_TOTAL", VIR_DOMAIN_STATS_CPU_TOTAL, CONST_CS | CONST_PERSISTENT);
+    REGISTER_LONG_CONSTANT("VIR_DOMAIN_STATS_BALLOON", VIR_DOMAIN_STATS_BALLOON, CONST_CS | CONST_PERSISTENT);
+    REGISTER_LONG_CONSTANT("VIR_DOMAIN_STATS_VCPU", VIR_DOMAIN_STATS_VCPU, CONST_CS | CONST_PERSISTENT);
+    REGISTER_LONG_CONSTANT("VIR_DOMAIN_STATS_INTERFACE", VIR_DOMAIN_STATS_INTERFACE, CONST_CS | CONST_PERSISTENT);
+    REGISTER_LONG_CONSTANT("VIR_DOMAIN_STATS_BLOCK", VIR_DOMAIN_STATS_BLOCK, CONST_CS | CONST_PERSISTENT);
+
+    REGISTER_LONG_CONSTANT("VIR_CONNECT_GET_ALL_DOMAINS_STATS_ACTIVE", VIR_CONNECT_GET_ALL_DOMAINS_STATS_ACTIVE, CONST_CS | CONST_PERSISTENT);
+    REGISTER_LONG_CONSTANT("VIR_CONNECT_GET_ALL_DOMAINS_STATS_INACTIVE", VIR_CONNECT_GET_ALL_DOMAINS_STATS_INACTIVE, CONST_CS | CONST_PERSISTENT);
+    REGISTER_LONG_CONSTANT("VIR_CONNECT_GET_ALL_DOMAINS_STATS_OTHER", VIR_CONNECT_GET_ALL_DOMAINS_STATS_OTHER, CONST_CS | CONST_PERSISTENT);
+    REGISTER_LONG_CONSTANT("VIR_CONNECT_GET_ALL_DOMAINS_STATS_PAUSED", VIR_CONNECT_GET_ALL_DOMAINS_STATS_PAUSED, CONST_CS | CONST_PERSISTENT);
+    REGISTER_LONG_CONSTANT("VIR_CONNECT_GET_ALL_DOMAINS_STATS_PERSISTENT",
[libvirt] [PATCH] virfile: Report useful error when fork approach to create NFS mount point fails
Commit 92d9114e tweaked the way we handle child errors when using the fork approach to set specific permissions. The same logic should be used when creating directories with specified permissions as well; otherwise the parent process doesn't report any useful error (unknown cause) while still returning a negative errcode.

https://bugzilla.redhat.com/show_bug.cgi?id=1230137
---
 src/util/virfile.c | 48 +++-
 1 file changed, 39 insertions(+), 9 deletions(-)

diff --git a/src/util/virfile.c b/src/util/virfile.c
index 5ff4668..7675eeb 100644
--- a/src/util/virfile.c
+++ b/src/util/virfile.c
@@ -2376,6 +2376,7 @@ virDirCreate(const char *path,
     if (pid) { /* parent */
         /* wait for child to complete, and retrieve its exit code */
         VIR_FREE(groups);
+
         while ((waitret = waitpid(pid, &status, 0)) == -1 && errno == EINTR);
         if (waitret == -1) {
             ret = -errno;
@@ -2384,11 +2385,33 @@ virDirCreate(const char *path,
                                  path);
             goto parenterror;
         }
-        if (!WIFEXITED(status) || (ret = -WEXITSTATUS(status)) == -EACCES) {
-            /* fall back to the simpler method, which works better in
-             * some cases */
-            return virDirCreateNoFork(path, mode, uid, gid, flags);
+
+        /*
+         * If waitpid succeeded, but the child exited abnormally or
+         * reported non-zero status, report failure, except for EACCES where
+         * we try to fall back to the non-fork method as in the original logic.
+         */
+        if (!WIFEXITED(status) || (WEXITSTATUS(status)) != 0) {
+            if (WEXITSTATUS(status) == EACCES)
+                return virDirCreateNoFork(path, mode, uid, gid, flags);
+            char *msg = virProcessTranslateStatus(status);
+            virReportError(VIR_ERR_INTERNAL_ERROR,
+                           _("child failed to create '%s': %s"),
+                           path, msg);
+            VIR_FREE(msg);
+            /* Use the child exit status if possible; otherwise,
+             * just use -EACCES, since our original failure in
+             * the non fork+setuid path would have been EACCES or
+             * EPERM by definition (see qemuOpenFileAs after the
+             * first virFileOpenAs failure), but EACCES is close enough.
+             * Besides, -EPERM is like returning fd == -1.
+             */
+            if (WIFEXITED(status))
+                ret = -WEXITSTATUS(status);
+            else
+                ret = -EACCES;
         }
+
 parenterror:
     return ret;
 }
@@ -2400,15 +2423,14 @@ virDirCreate(const char *path,
         ret = -errno;
         goto childerror;
     }
+
     if (mkdir(path, mode) < 0) {
         ret = -errno;
-        if (ret != -EACCES) {
-            /* in case of EACCES, the parent will retry */
-            virReportSystemError(errno, _("child failed to create directory '%s'"),
-                                 path);
-        }
+        virReportSystemError(errno, _("child failed to create directory '%s'"),
+                             path);
         goto childerror;
     }
+
     /* check if group was set properly by creating after
      * setgid. If not, try doing it with chown */
     if (stat(path, &st) == -1) {
@@ -2417,6 +2439,7 @@ virDirCreate(const char *path,
                              _("stat of '%s' failed"), path);
         goto childerror;
     }
+
     if ((st.st_gid != gid) && (chown(path, (uid_t) -1, gid) < 0)) {
         ret = -errno;
         virReportSystemError(errno,
@@ -2424,13 +2447,20 @@
                              path, (unsigned int) gid);
         goto childerror;
     }
+
     if (mode != (mode_t) -1 && chmod(path, mode) < 0) {
         virReportSystemError(errno,
                              _("cannot set mode of '%s' to %04o"),
                              path, mode);
         goto childerror;
     }
+
 childerror:
+    ret = -ret;
+    if ((ret & 0xff) != ret) {
+        VIR_WARN("unable to pass desired return value %d", ret);
+        ret = 0xff;
+    }
     _exit(ret);
 }
--
1.9.3
Re: [libvirt] [PATCH 1/2] apibuild: Generate macro/@string attribute
On Mon, Jun 08, 2015 at 11:34:35 +0200, Jiri Denemark wrote: If a macro has a string value, the @string attribute will contain the value. Otherwise the @string attribute will be missing. For example, the following macro definition from libvirt-domain.h:

/**
 * VIR_MIGRATE_PARAM_URI:
 * ...
 */
# define VIR_MIGRATE_PARAM_URI "migrate_uri"

will result in

<macro name='VIR_MIGRATE_PARAM_URI' file='libvirt-domain' string='migrate_uri'>
  <info><![CDATA[...]]></info>
</macro>

https://bugzilla.redhat.com/show_bug.cgi?id=1229199 Signed-off-by: Jiri Denemark jdene...@redhat.com --- docs/apibuild.py | 47 --- 1 file changed, 28 insertions(+), 19 deletions(-) ACK, Peter
Re: [libvirt] [python PATCH 2/2] Provide symbolic names for typed parameters
On Mon, Jun 08, 2015 at 11:34:36 +0200, Jiri Denemark wrote: https://bugzilla.redhat.com/show_bug.cgi?id=1222795 Signed-off-by: Jiri Denemark jdene...@redhat.com --- generator.py | 8 1 file changed, 8 insertions(+) ACK, Peter
[libvirt] Overhead for a default cpu cg placement scheme
Hi Daniel, would it be possible to adopt an optional tunable for the virCgroup mechanism that disables nested (per-thread) cgroup creation? These cgroups bring visible overhead for many-threaded guest workloads, almost 5% in a non-congested host CPU state, primarily because the host scheduler has to make many more decisions with them than without them. We also experienced a lot of host lockups with the currently used cgroup placement and disabled nested behavior a couple of years ago. Though the current patch simply carves out the mentioned behavior, leaving only top-level per-machine cgroups, it could serve as a basis for an upstream change after some adaptation; that's why I'm asking about the chance of its acceptance. This message is a kind of 'feature request': it can be accepted or dropped from our side, or someone may lend a hand and redo it from scratch. The detailed benchmarks are for a 3.10.y host kernel; if anyone is interested in numbers for the latest stable, I can update them. Thanks!
Re: [libvirt] Overhead for a default cpu cg placement scheme
On Thu, Jun 11, 2015 at 2:09 PM, Daniel P. Berrange berra...@redhat.com wrote: On Thu, Jun 11, 2015 at 01:50:24PM +0300, Andrey Korolyov wrote: Hi Daniel, would it be possible to adopt an optional tunable for the virCgroup mechanism that disables nested (per-thread) cgroup creation? These cgroups bring visible overhead for many-threaded guest workloads, almost 5% in a non-congested host CPU state, primarily because the host scheduler has to make many more decisions with them than without them. We also experienced a lot of host lockups with the currently used cgroup placement and disabled nested behavior a couple of years ago. Though the current patch simply carves out the mentioned behavior, leaving only top-level per-machine cgroups, it could serve as a basis for an upstream change after some adaptation; that's why I'm asking about the chance of its acceptance. This message is a kind of 'feature request': it can be accepted or dropped from our side, or someone may lend a hand and redo it from scratch. The detailed benchmarks are for a 3.10.y host kernel; if anyone is interested in numbers for the latest stable, I can update them. When you say nested cgroup creation, are you referring to the modern libvirt hierarchy, or the legacy hierarchy, as described here: http://libvirt.org/cgroups.html ? The current libvirt setup, used for a year or so now, is much shallower than previously, to the extent that we'd consider performance problems with it to be the job of the kernel to fix. Regards, Daniel -- Thanks, I'm referring to the 'new nested' hierarchy for the overhead mentioned above. The host crashes I mentioned happened with the old hierarchy a while back; forgot to mention this.
Despite the flattening of the topology in the current scheme, it should be possible to disable fine-grained group creation for the VM threads for users who don't need per-vCPU cpu pinning/accounting (the overhead is caused by the placement in the cpu cgroup, not by the accounting/pinning ones; I'm assuming equal distribution with such disablement for all nested-aware cgroup types); that's the point for now.
Re: [libvirt] [PATCH] util: Fix coverity warnings in virProcessGetAffinity
On Wed, Jun 10, 2015 at 15:18:15 -0600, Eric Blake wrote: On 06/10/2015 09:27 AM, John Ferlan wrote: So there are basically three options: 1) Silence the coverity warning. So on the line just prior to CPU_ISSET_S either: sa_assert(sizeof(unsigned long int) == sizeof(cpu_set_t)); Is that true even on 32-bit platforms? This is never true. cpu_set_t is a structure that contains an array of unsigned longs and is 1024 bits wide. or /* coverity[overrun-local] */ Might be safer, even though I generally prefer sa_assert() over /* magic comment */. I'd go with this one in this case, since the sa_assert one above probably silences the warning in a bogus way.
Re: [libvirt] [RFC] get guest OS infos
On Thu, Jun 11, 2015 at 09:17:30AM +0100, Daniel P. Berrange wrote: On Thu, Jun 11, 2015 at 01:51:33PM +0800, zhang bo wrote: Different OSes have different capabilities and behaviors sometimes. We have to distinguish them then. For example, our clients want to send NMI interrupts to certain guests (e.g. Linux distributions), but not others (e.g. Windows guests). They want to acquire the list below: guest1: RHEL 7 guest2: RHEL 7 guest3: Ubuntu 12 guest4: Ubuntu 13 guest5: Windows 7 .. AFAIK, neither libvirt nor openstack, nor qemu, has such a capability of showing this guest OS info. Libvirt now supports showing host capabilities and driver capabilities, but not an individual guest OS's capability. We may refer to http://libvirt.org/formatdomaincaps.html for more information. So, what's your opinion on adding such a feature to libvirt and qemu? This is normally something the higher level management app will remember and record. For example, RHEV/oVirt stores a record of the OS when the guest is first provisioned. In OpenStack we are going to permit the user to set an image property flag to specify the guest OS, using libosinfo terminology http://specs.openstack.org/openstack/nova-specs/specs/liberty/approved/libvirt-hardware-policy-from-libosinfo.html One thing I could see us do is to define an official libosinfo metadata schema, e.g. so there is a standardized way to use the libvirt metadata element to record the libosinfo operating system for a guest, giving interoperability across different apps. That doesn't really require any coding - just an update to the libosinfo website with some docs about recommendations. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
Re: [libvirt] question about virConnectGetAllDomainStats and virTypedParameter
On 10 June 2015 at 17:15, Michal Privoznik mpriv...@redhat.com wrote: On 10.06.2015 08:52, Vasiliy Tolstov wrote: I'm trying to implement virConnectGetAllDomainStats for the php binding api, but have one issue with VIR_TYPED_PARAM_STRING. Code part:

retval = virConnectGetAllDomainStats(conn->conn, stats, &retstats, flags);
for (i = 0; i < retval; i++) {
    zval *arr2;
    ALLOC_INIT_ZVAL(arr2);
    array_init(arr2);
    for (j = 0; j < retstats[i]->nparams; j++) {
        params = retstats[i]->params[j];
        switch (params.type) {
        case VIR_TYPED_PARAM_INT:
            add_assoc_long(arr2, params.field, params.value.i);
        case VIR_TYPED_PARAM_UINT:
            add_assoc_long(arr2, params.field, params.value.ui);
        case VIR_TYPED_PARAM_LLONG:
            add_assoc_long(arr2, params.field, params.value.l);
        case VIR_TYPED_PARAM_ULLONG:
            add_assoc_long(arr2, params.field, params.value.ul);
        case VIR_TYPED_PARAM_DOUBLE:
            add_assoc_double(arr2, params.field, params.value.d);
        case VIR_TYPED_PARAM_BOOLEAN:
            add_assoc_bool(arr2, params.field, params.value.b);
        case VIR_TYPED_PARAM_STRING:
            add_assoc_string_ex(arr2, params.field, strlen(params.field)+1,
                                strdup(params.value.s), strlen(params.value.s)+1); // SEGFAULT HAPPENING
        }
    }

gdb shows:

return_value_used=<optimized out>) at libvirt-php.c:2505
arr2 = 0x77fd72b8
conn = <optimized out>
zconn = 0x77fd7140
retval = <optimized out>
flags = <optimized out>
stats = <optimized out>
name = <optimized out>
i = <optimized out>
j = <optimized out>
params = {field = "state.state", '\000' <repeats 68 times>, type = 1,
          value = {i = 5, ui = 5, l = 5, ul = 5, d = 2.4703282292062327e-323,
                   b = 5 '\005', s = 0x5 <Address 0x5 out of bounds>}}
retstats = 0x101d870

What am I doing wrong? The switch() items need to end with break; otherwise add_assoc_*() calls will be made that do not correspond to the type. As in your example - the type is INT, and you are seeing the error in strdup(). Unfortunately, my mind was too slow when reviewing your patch, so I've pushed it without spotting it. I'm pushing the obvious fix right now.
Michal. Thanks, after golang I forgot about breaks in switches.
Re: [libvirt] [PATCH 0/8] logical memory hotplug via guest agent
2015-06-11 11:42 GMT+03:00 Jiri Denemark jdene...@redhat.com: Such guests would need an updated qemu-guest-agent anyway. And installing a new version of qemu-guest-agent is not any easier than installing an updated udev or a new udev rule. That is, I don't think the qemu-guest-agent way has any benefits over a udev rule. It's rather the opposite. Maybe, as a workaround, install udev rules for cpu/memory hotplug along with qemu-ga (if the OS is old)? Then we have udev rules that do all the work, and packagers can enable/disable installing the rules? -- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru
[libvirt] [RFC PATCH] qemu: Use heads parameter for QXL driver
Allow specifying the maximum number of heads for the QXL driver. Signed-off-by: Frediano Ziglio fzig...@redhat.com --- src/qemu/qemu_capabilities.c | 2 ++ src/qemu/qemu_capabilities.h | 1 + src/qemu/qemu_command.c | 11 +++ tests/qemucapabilitiesdata/caps_1.2.2-1.caps | 1 + tests/qemucapabilitiesdata/caps_1.2.2-1.replies | 8 tests/qemucapabilitiesdata/caps_1.3.1-1.caps | 1 + tests/qemucapabilitiesdata/caps_1.3.1-1.replies | 8 tests/qemucapabilitiesdata/caps_1.4.2-1.caps | 1 + tests/qemucapabilitiesdata/caps_1.4.2-1.replies | 8 tests/qemucapabilitiesdata/caps_1.5.3-1.caps | 1 + tests/qemucapabilitiesdata/caps_1.5.3-1.replies | 8 tests/qemucapabilitiesdata/caps_1.6.0-1.caps | 1 + tests/qemucapabilitiesdata/caps_1.6.0-1.replies | 8 tests/qemucapabilitiesdata/caps_1.6.50-1.caps | 1 + tests/qemucapabilitiesdata/caps_1.6.50-1.replies | 8 tests/qemucapabilitiesdata/caps_2.1.1-1.caps | 1 + tests/qemucapabilitiesdata/caps_2.1.1-1.replies | 8 17 files changed, 77 insertions(+) The patch to support max_outputs in QEMU is still not merged, but I got agreement on the name of the argument. Actually, there can be a compatibility problem, as heads in the XML configuration was set by default to '1'.
diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index ca7a7c2..cdc2575 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -285,6 +285,7 @@ VIR_ENUM_IMPL(virQEMUCaps, QEMU_CAPS_LAST,
               "dea-key-wrap",
               "pci-serial",
               "aarch64-off",
+              "qxl-vga.max_outputs",
     );
@@ -1643,6 +1644,7 @@ static struct virQEMUCapsStringFlags virQEMUCapsObjectPropsQxl[] = {
 static struct virQEMUCapsStringFlags virQEMUCapsObjectPropsQxlVga[] = {
     { "vgamem_mb", QEMU_CAPS_QXL_VGA_VGAMEM },
+    { "max_outputs", QEMU_CAPS_QXL_VGA_MAX_OUTPUTS },
 };
 struct virQEMUCapsObjectTypeProps {
diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h
index b5a7770..a2ea84b 100644
--- a/src/qemu/qemu_capabilities.h
+++ b/src/qemu/qemu_capabilities.h
@@ -229,6 +229,7 @@ typedef enum {
     QEMU_CAPS_DEA_KEY_WRAP = 187, /* -machine dea_key_wrap */
     QEMU_CAPS_DEVICE_PCI_SERIAL = 188, /* -device pci-serial */
     QEMU_CAPS_CPU_AARCH64_OFF = 189, /* -cpu ...,aarch64=off */
+    QEMU_CAPS_QXL_VGA_MAX_OUTPUTS = 190, /* qxl-vga.max_outputs */
     QEMU_CAPS_LAST, /* this must always be the last item */
 } virQEMUCapsFlags;
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index 0a6d92f..2bd63e1 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -5610,6 +5610,11 @@ qemuBuildDeviceVideoStr(virDomainDefPtr def,
             /* QEMU accepts mebibytes for vgamem_mb. */
             virBufferAsprintf(buf, ",vgamem_mb=%u", video->vgamem / 1024);
         }
+
+        if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_QXL_VGA_MAX_OUTPUTS) &&
+            video->heads > 0) {
+            virBufferAsprintf(buf, ",max_outputs=%u", video->heads);
+        }
     } else if (video->vram &&
         ((video->type == VIR_DOMAIN_VIDEO_TYPE_VGA &&
           virQEMUCapsGet(qemuCaps, QEMU_CAPS_VGA_VGAMEM)) ||
@@ -10234,6 +10239,7 @@ qemuBuildCommandLine(virConnectPtr conn,
             unsigned int ram = def->videos[0]->ram;
             unsigned int vram = def->videos[0]->vram;
             unsigned int vgamem = def->videos[0]->vgamem;
+            unsigned int heads = def->videos[0]->heads;
             if (vram > (UINT_MAX / 1024)) {
                 virReportError(VIR_ERR_OVERFLOW,
@@ -10264,6 +10270,11 @@ qemuBuildCommandLine(virConnectPtr conn,
                 virCommandAddArgFormat(cmd, "%s.vgamem_mb=%u", dev, vgamem / 1024);
             }
+
+            if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_QXL_VGA_MAX_OUTPUTS) && heads > 0) {
+                virCommandAddArg(cmd, "-global");
+                virCommandAddArgFormat(cmd, "%s.max_outputs=%u",
+                                       dev, heads);
+            }
         }
         if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE) &&
diff --git a/tests/qemucapabilitiesdata/caps_1.2.2-1.caps b/tests/qemucapabilitiesdata/caps_1.2.2-1.caps
index 30239df..7791e42 100644
--- a/tests/qemucapabilitiesdata/caps_1.2.2-1.caps
+++ b/tests/qemucapabilitiesdata/caps_1.2.2-1.caps
@@ -120,4 +120,5 @@
   <flag name='vmware-svga.vgamem_mb'/>
   <flag name='qxl.vgamem_mb'/>
   <flag name='qxl-vga.vgamem_mb'/>
+  <flag name='qxl-vga.max_outputs'/>
 </qemuCaps>
diff --git a/tests/qemucapabilitiesdata/caps_1.2.2-1.replies b/tests/qemucapabilitiesdata/caps_1.2.2-1.replies
index f501218..aa1d3f9 100644
--- a/tests/qemucapabilitiesdata/caps_1.2.2-1.replies
+++
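For illustration, with this patch the existing heads attribute of the domain XML video element would be forwarded to QEMU as the QXL max_outputs property. A hypothetical guest config (all values illustrative) might look like:

```xml
<video>
  <model type='qxl' ram='65536' vram='65536' vgamem='16384' heads='4'/>
</video>
```

which, on a QEMU new enough to expose the qxl-vga.max_outputs property, would add `-global qxl-vga.max_outputs=4` to the command line per the hunks above. The compatibility concern Frediano raises is that existing XML defaults heads to '1', which would suddenly start limiting guests to a single output.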
Re: [libvirt] Problem with setting up KVM guests to use HugePages
On Thu, Jun 11, 2015 at 10:27:05AM +0200, Michal Privoznik wrote: On 11.06.2015 10:13, Daniel P. Berrange wrote: On Wed, Jun 10, 2015 at 09:20:40PM +, Vivi L wrote: Michal Privoznik mprivozn at redhat.com writes: On 10.06.2015 01:05, Vivi L wrote: Kashyap Chamarthy kchamart at redhat.com writes: You might want to re-test by explicitly setting the 'page' element and 'size' attribute. From my test, I had something like this:

$ virsh dumpxml f21-vm | grep hugepages -B3 -A2
  <memory unit='KiB'>2000896</memory>
  <currentMemory unit='KiB'>200</currentMemory>
  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB' nodeset='0'/>
    </hugepages>
  </memoryBacking>
  <vcpu placement='static'>8</vcpu>

I haven't tested this exhaustively, but some basic test notes are here: https://kashyapc.fedorapeople.org/virt/test-hugepages-with-libvirt.txt

Current QEMU does not support setting the page element. Could it be the cause of my aforementioned problem? unsupported configuration: huge pages per NUMA node are not supported with this QEMU. So this is the explanation why the memory for your guest is not backed by hugepages. I thought setting hugepages per NUMA node was a nice-to-have feature. Is it required to enable the use of hugepages for the guest? No, it should not be mandatory. You should be able to use

<memoryBacking>
  <hugepages/>
</memoryBacking>

with pretty much any KVM/QEMU version that exists. If that's broken then it's a libvirt bug. Unless hugepages are requested for guest NUMA nodes. In that case a memory-backend-file object is required. From my investigation, this seems to be the case. memory-backend-file should only be required if trying to set up different hugepage configs for each guest NUMA node, or if trying to pin each guest NUMA node to a different host node.
If they just want hugepages across the whole VM and no pinning, shouldn't the traditional setup work? Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
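To restate Daniel's point as a config sketch: the whole-VM hugepages case that should work with essentially any KVM/QEMU is the plain element, with no per-node page element and no NUMA pinning anywhere in the domain XML:

```xml
<memoryBacking>
  <hugepages/>
</memoryBacking>
```

Only once a page element with a nodeset, or NUMA placement, enters the picture does libvirt need the newer QEMU memory-backend-file machinery.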
Re: [libvirt] Problem with setting up KVM guests to use HugePages
On 11.06.2015 10:48, Daniel P. Berrange wrote: On Thu, Jun 11, 2015 at 10:27:05AM +0200, Michal Privoznik wrote: On 11.06.2015 10:13, Daniel P. Berrange wrote: On Wed, Jun 10, 2015 at 09:20:40PM +, Vivi L wrote: Michal Privoznik mprivozn at redhat.com writes: On 10.06.2015 01:05, Vivi L wrote: Kashyap Chamarthy kchamart at redhat.com writes: You might want to re-test by explicitly setting the 'page' element and 'size' attribute. From my test, I had something like this:

$ virsh dumpxml f21-vm | grep hugepages -B3 -A2
  <memory unit='KiB'>2000896</memory>
  <currentMemory unit='KiB'>200</currentMemory>
  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB' nodeset='0'/>
    </hugepages>
  </memoryBacking>
  <vcpu placement='static'>8</vcpu>

I haven't tested this exhaustively, but some basic test notes are here: https://kashyapc.fedorapeople.org/virt/test-hugepages-with-libvirt.txt

Current QEMU does not support setting the page element. Could it be the cause of my aforementioned problem? unsupported configuration: huge pages per NUMA node are not supported with this QEMU. So this is the explanation why the memory for your guest is not backed by hugepages. I thought setting hugepages per NUMA node was a nice-to-have feature. Is it required to enable the use of hugepages for the guest? No, it should not be mandatory. You should be able to use

<memoryBacking>
  <hugepages/>
</memoryBacking>

with pretty much any KVM/QEMU version that exists. If that's broken then it's a libvirt bug. Unless hugepages are requested for guest NUMA nodes. In that case a memory-backend-file object is required. From my investigation, this seems to be the case. memory-backend-file should only be required if trying to set up different hugepage configs for each guest NUMA node, or if trying to pin each guest NUMA node to a different host node. If they just want hugepages across the whole VM and no pinning, shouldn't the traditional setup work? Vivi L, now you see why you should never ever drop the list from CC.
He has sent me additional info with this snippet in the domain:

<numatune>
  <memory mode='strict' nodeset='0-1'/>
</numatune>

Therefore the memory object is required. We can't guarantee the placement without it. Michal
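Combining the snippets from this thread: the pairing that pulls in the memory-backend-file requirement (and thus the "not supported with this QEMU" error on older QEMU) is per-node hugepages together with strict NUMA placement, for example:

```xml
<memoryBacking>
  <hugepages>
    <page size='2048' unit='KiB' nodeset='0'/>
  </hugepages>
</memoryBacking>
<numatune>
  <memory mode='strict' nodeset='0-1'/>
</numatune>
```

Dropping both the nodeset-qualified page element and the numatune block falls back to the traditional whole-VM hugepages path that works with any QEMU.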
Re: [libvirt] Overhead for a default cpu cg placement scheme
On Thu, Jun 11, 2015 at 02:16:50PM +0300, Andrey Korolyov wrote: On Thu, Jun 11, 2015 at 2:09 PM, Daniel P. Berrange berra...@redhat.com wrote: On Thu, Jun 11, 2015 at 01:50:24PM +0300, Andrey Korolyov wrote: Hi Daniel, would it be possible to adopt an optional tunable for the virCgroup mechanism that disables nested (per-thread) cgroup creation? These cgroups bring visible overhead for many-threaded guest workloads, almost 5% in a non-congested host CPU state, primarily because the host scheduler has to make many more decisions with them than without them. We also experienced a lot of host lockups with the currently used cgroup placement and disabled nested behavior a couple of years ago. Though the current patch simply carves out the mentioned behavior, leaving only top-level per-machine cgroups, it could serve as a basis for an upstream change after some adaptation; that's why I'm asking about the chance of its acceptance. This message is a kind of 'feature request': it can be accepted or dropped from our side, or someone may lend a hand and redo it from scratch. The detailed benchmarks are for a 3.10.y host kernel; if anyone is interested in numbers for the latest stable, I can update them. When you say nested cgroup creation, are you referring to the modern libvirt hierarchy, or the legacy hierarchy, as described here: http://libvirt.org/cgroups.html ? The current libvirt setup, used for a year or so now, is much shallower than previously, to the extent that we'd consider performance problems with it to be the job of the kernel to fix. Thanks, I'm referring to the 'new nested' hierarchy for the overhead mentioned above. The host crashes I mentioned happened with the old hierarchy a while back; forgot to mention this.
Despite the flattening of the topology in the current scheme, it should be possible to disable fine-grained group creation for the VM threads for users who don't need per-vCPU CPU pinning/accounting (the overhead is caused by placement in the cpu cgroup, not by the accounting/pinning ones; I'm assuming equal distribution with such disablement for all nested-aware cgroup types) - that's the point for now.

Ok, so the per-vCPU cgroups are used for a couple of things:

 - Setting scheduler tunables - period/quota/shares/etc
 - Setting CPU pinning
 - Setting NUMA memory pinning

In addition to the per-vCPU cgroup, we have one cgroup for each I/O thread, and also one more for general QEMU emulator threads. In the case of CPU pinning we already have automatic fallback to sched_setaffinity if the CPUSET controller isn't available. We could in theory start off without the per-vCPU/emulator/I/O cgroups and only create them as and when the feature is actually used. The concern I would have though is that changing the cgroups layout on the fly may cause unexpected side effects in the behaviour of the VM. More critically, there would be a lot of places in the code where we would need to deal with this, which could hurt maintainability. How confident are you that the performance problems you see are inherent to the actual use of the cgroups, and not instead a result of some particularly bad choice of default parameters we might have left in the cgroups? In general I'd have a desire to try to eliminate the perf impact before we consider the complexity of disabling this feature.

Regards, Daniel
Re: [libvirt] [PATCH v2 0/2] lxc: properly clean up qemu-nbd
On Wed, Jun 10, 2015 at 04:08:41PM -0400, John Ferlan wrote: On 06/01/2015 09:01 AM, Cédric Bosdonnat wrote: Hi all, Here is the very same patch, but split in two patches. Well, I also moved two comments around between v1 and v2.

Cédric Bosdonnat (2):
  Add virProcessGetPids to get all tasks of a process
  lxc: properly clean up qemu-nbd

 src/libvirt_private.syms | 1 +
 src/lxc/lxc_controller.c | 56
 src/util/virprocess.c    | 47
 src/util/virprocess.h    | 2 ++
 4 files changed, 106 insertions(+)

Never saw the 1/2 and 2/2 show up in my inbox and I don't see them in the archive - just your 0/2. Same here - I only received the cover letter it appears.

Regards, Daniel
Re: [libvirt] [Spice-devel] [RFC PATCH] qemu: Use heads parameter for QXL driver
Hey,

On Thu, Jun 11, 2015 at 12:39:50PM +0100, Frediano Ziglio wrote: Allow specifying the maximum number of heads to the QXL driver. I've tested this with an older qemu without qxl-vga.max_outputs, and with a newer one with support for it, and in both cases this is doing the right thing. Signed-off-by: Frediano Ziglio fzig...@redhat.com
---
 src/qemu/qemu_capabilities.c                     |  2 ++
 src/qemu/qemu_capabilities.h                     |  1 +
 src/qemu/qemu_command.c                          | 11 +++
 tests/qemucapabilitiesdata/caps_1.2.2-1.caps     |  1 +
 tests/qemucapabilitiesdata/caps_1.2.2-1.replies  |  8
 tests/qemucapabilitiesdata/caps_1.3.1-1.caps     |  1 +
 tests/qemucapabilitiesdata/caps_1.3.1-1.replies  |  8
 tests/qemucapabilitiesdata/caps_1.4.2-1.caps     |  1 +
 tests/qemucapabilitiesdata/caps_1.4.2-1.replies  |  8
 tests/qemucapabilitiesdata/caps_1.5.3-1.caps     |  1 +
 tests/qemucapabilitiesdata/caps_1.5.3-1.replies  |  8
 tests/qemucapabilitiesdata/caps_1.6.0-1.caps     |  1 +
 tests/qemucapabilitiesdata/caps_1.6.0-1.replies  |  8
 tests/qemucapabilitiesdata/caps_1.6.50-1.caps    |  1 +
 tests/qemucapabilitiesdata/caps_1.6.50-1.replies |  8
 tests/qemucapabilitiesdata/caps_2.1.1-1.caps     |  1 +
 tests/qemucapabilitiesdata/caps_2.1.1-1.replies  |  8
 17 files changed, 77 insertions(+)

The patch to support max_outputs in QEMU is still not merged, but I got agreement on the name of the argument. Actually, there can be a compatibility problem, as heads in the XML configuration was set to '1' by default.
diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index ca7a7c2..cdc2575 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -285,6 +285,7 @@ VIR_ENUM_IMPL(virQEMUCaps, QEMU_CAPS_LAST,
               "dea-key-wrap",
               "pci-serial",
               "aarch64-off",
+              "qxl-vga.max_outputs",
     );

In order to be consistent with the rest of the file, this should be

+
+              "qxl-vga.max_outputs", /* 190 */

@@ -1643,6 +1644,7 @@ static struct virQEMUCapsStringFlags virQEMUCapsObjectPropsQxl[] = {
 static struct virQEMUCapsStringFlags virQEMUCapsObjectPropsQxlVga[] = {
     { "vgamem_mb", QEMU_CAPS_QXL_VGA_VGAMEM },
+    { "max_outputs", QEMU_CAPS_QXL_VGA_MAX_OUTPUTS },
 };

 struct virQEMUCapsObjectTypeProps {
diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h
index b5a7770..a2ea84b 100644
--- a/src/qemu/qemu_capabilities.h
+++ b/src/qemu/qemu_capabilities.h
@@ -229,6 +229,7 @@ typedef enum {
     QEMU_CAPS_DEA_KEY_WRAP = 187, /* -machine dea_key_wrap */
     QEMU_CAPS_DEVICE_PCI_SERIAL = 188, /* -device pci-serial */
     QEMU_CAPS_CPU_AARCH64_OFF = 189, /* -cpu ...,aarch64=off */
+    QEMU_CAPS_QXL_VGA_MAX_OUTPUTS = 190, /* qxl-vga.max_outputs */

     QEMU_CAPS_LAST, /* this must always be the last item */
 } virQEMUCapsFlags;
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index 0a6d92f..2bd63e1 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -5610,6 +5610,11 @@ qemuBuildDeviceVideoStr(virDomainDefPtr def,
             /* QEMU accepts mebibytes for vgamem_mb. */
             virBufferAsprintf(buf, ",vgamem_mb=%u", video->vgamem / 1024);
         }
+
+        if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_QXL_VGA_MAX_OUTPUTS) &&
+            video->heads > 0) {
+            virBufferAsprintf(buf, ",max_outputs=%u", video->heads);
+        }
     } else if (video->vram &&
         ((video->type == VIR_DOMAIN_VIDEO_TYPE_VGA &&
           virQEMUCapsGet(qemuCaps, QEMU_CAPS_VGA_VGAMEM)) ||
@@ -10234,6 +10239,7 @@ qemuBuildCommandLine(virConnectPtr conn,
             unsigned int ram = def->videos[0]->ram;
             unsigned int vram = def->videos[0]->vram;
             unsigned int vgamem = def->videos[0]->vgamem;
+            unsigned int heads = def->videos[0]->heads;

             if (vram > (UINT_MAX / 1024)) {
                 virReportError(VIR_ERR_OVERFLOW,
@@ -10264,6 +10270,11 @@ qemuBuildCommandLine(virConnectPtr conn,
                 virCommandAddArgFormat(cmd, "%s.vgamem_mb=%u",
                                        dev, vgamem / 1024);
             }
+            if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_QXL_VGA_MAX_OUTPUTS) &&
+                heads > 0) {
+                virCommandAddArg(cmd, "-global");
+                virCommandAddArgFormat(cmd, "%s.max_outputs=%u",
+                                       dev, heads);
+            }

This part of the code is a fallback for QEMU not supporting -device. As the max_outputs option is new, I'm not sure this will ever be triggered.

 }

 if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE)
diff --git
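For reference, the domain XML that this code path consumes would look roughly like the following - with the patch, the heads value is forwarded to qxl-vga.max_outputs (the memory sizes here are illustrative, not from the patch):

```xml
<video>
  <model type='qxl' ram='65536' vram='65536' vgamem='16384' heads='4'/>
</video>
```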
Re: [libvirt] [Spice-devel] [RFC PATCH] qemu: Use heads parameter for QXL driver
On Thu, Jun 11, 2015 at 12:39:50PM +0100, Frediano Ziglio wrote: Actually, there can be a compatibility problem, as heads in the XML configuration was set to '1' by default.

Yes, this bit is worrying. The old behaviour could be considered buggy, as the XML contained '1' but the number of heads was not enforced. Suddenly enforcing heads='1' (which libvirt will add by default to domain definitions which don't have it) will cause a change of behaviour for old guests though. Something like the patch below changes libvirt so that we don't always append heads='1' to the domain XML, but I don't know if this interacts correctly with parallels and vmx, which force it to be 1 (this probably should not be an issue, but maybe there are latent bugs). Also this is part of the things which are checked in virDomainVideoDefCheckABIStability(), so I suspect we'll need to be extra careful there too :(

diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 36de844..43067e9 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -11421,7 +11421,7 @@ virDomainVideoDefParseXML(xmlNodePtr node,
             goto error;
         }
     } else {
-        def->heads = 1;
+        def->heads = 0;
     }

     if (virDomainDeviceInfoParseXML(node, NULL, &def->info, flags) < 0)
@@ -15507,7 +15507,6 @@ virDomainDefParseXML(xmlDocPtr xml,
             goto error;
         }
         video->vram = virDomainVideoDefaultRAM(def, video->type);
-        video->heads = 1;

         if (VIR_ALLOC_N(def->videos, 1) < 0) {
             virDomainVideoDefFree(video);
             goto error;
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index 2bd63e1..03a0458 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -13411,7 +13411,6 @@ qemuParseCommandLine(virCapsPtr qemuCaps,
                 vid->ram = 0;
                 vid->vgamem = 0;
             }
-            vid->heads = 1;

             if (VIR_APPEND_ELEMENT(def->videos, &def->nvideos, vid) < 0) {
                 virDomainVideoDefFree(vid);
Re: [libvirt] [PATCH] schema: use arch list from basictypes for os arch attribute
On 08.06.2015 15:42, James Cowgill wrote: I see no reason to duplicate this list of architectures. This also allows more guest architectures to be used with libvirt (like the mips64el qemu machine I am trying to run). Signed-off-by: James Cowgill james...@cowgill.org.uk
---
 docs/schemas/domaincommon.rng | 26 ++
 1 file changed, 6 insertions(+), 20 deletions(-)

diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
index 7c6fa5c..fc28fb3 100644
--- a/docs/schemas/domaincommon.rng
+++ b/docs/schemas/domaincommon.rng
@@ -331,7 +331,9 @@
   <define name="ostypehvm">
     <element name="type">
       <optional>
-        <ref name="archList"/>
+        <attribute name="arch">
+          <ref name="archnames"/>
+        </attribute>
       </optional>
       <optional>
         <attribute name="machine">
@@ -344,29 +346,13 @@
     </element>
   </define>

-  <define name="archList">
-    <attribute name="arch">
-      <choice>
-        <value>armv7l</value>
-        <value>aarch64</value>
-        <value>i686</value>
-        <value>x86_64</value>
-        <value>mips</value>
-        <value>ppc</value>
-        <value>ppc64</value>
-        <value>ppc64le</value>
-        <value>s390</value>
-        <value>s390x</value>
-        <value>sparc</value>
-      </choice>
-    </attribute>
-  </define>
-
   <define name="osexe">
     <element name="os">
       <element name="type">
         <optional>
-          <ref name="archList"/>
+          <attribute name="arch">
+            <ref name="archnames"/>
+          </attribute>
         </optional>
         <value>exe</value>
       </element>

The patch looks good to me. ACKed and pushed. Although during testing I've found 2 small bugs (for which I'm going to propose patches in a while) and one big issue that I'm not sure how to fix. The problem is, imagine you have some system-wide qemus installed. Say for x86_64 and ppc. Then, you have qemu.git where you have all arches built. Therefore, in order to use them, you put something like this into the XML:

<emulator>/path/to/qemu.git/mips64el-softmmu/qemu-system-mips64el</emulator>

But defining such a domain fails, since the emulator is not in the capabilities (virsh capabilities) - it's not a system-wide emulator installed under $PATH. Sigh. We need those capabilities in order to check whether the emulator supports the desired architecture from the XML. However, the capabilities construction is driver dependent - caps for qemu binaries are constructed differently than for VBOX or XEN server, right? Frankly, I don't have any bright idea how to fix this. If anybody has, please enlighten me.

Michal
Re: [libvirt] Problem with setting up KVM guests to use HugePages
On Thu, Jun 11, 2015 at 12:00 AM, Michal Privoznik mpriv...@redhat.com wrote: [please keep the list CC'ed] On 10.06.2015 20:09, Clarylin L wrote: Hi Michal, Thanks a lot. If 100 hugepages are pre-allocated, the guest can start without decreasing the number of hugepages. Since the guest requires 128 hugepages, it's kind of expected that the guest would not take memory from hugepages.

Before guest start:

[root@local ~]# cat /proc/meminfo | grep uge
AnonHugePages:         0 kB
HugePages_Total:     100
HugePages_Free:      100
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:    1048576 kB

After:

[root@local ~]# cat /proc/meminfo | grep uge
AnonHugePages:  134254592 kB
HugePages_Total:     100
HugePages_Free:      100
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:    1048576 kB

There are no -mem-prealloc and -mem-path options in the qemu command. And there can't be. From the command line below, you are defining 2 NUMA nodes for your guest. In order to instruct qemu to back their memory by huge pages you need it to support the memory-backend-file object, which was introduced in qemu-2.1.0. The other option you have is to not use guest NUMA nodes, in which case the global -mem-path can be used. Michal, you were correct. My current version is qemu-1.5.3. After I updated it to 2.1.2, I was able to use the page element under hugepages and the problem was solved. It seems the page element was required -- without it, starting the guest would report an error.
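For the record, the per-node hugepage backing that worked here after the upgrade is configured with page elements under hugepages in the domain XML; a sketch along these lines (the nodeset refers to guest NUMA cells, and the values are illustrative, using the 1 GiB page size from the thread):

```xml
<memoryBacking>
  <hugepages>
    <page size='1048576' unit='KiB' nodeset='0-1'/>
  </hugepages>
</memoryBacking>
```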
[root@local ~]# ps -ef | grep qemu
qemu 3403 1 99 17:42 ? 00:36:42 /usr/libexec/qemu-kvm -name qvpc-di-03-sf -S -machine pc-i440fx-rhel7.0.0,accel=kvm,usb=off -cpu host -m 131072 -realtime mlock=off -smp 32,sockets=2,cores=16,threads=1 -numa node,nodeid=0,cpus=0-15,mem=65536 -numa node,nodeid=1,cpus=16-31,mem=65536 -uuid e1b72349-4a0b-4b91-aedc-fd34e92251e4 -smbios type=1,serial=SCALE-SLOT-03 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/qvpc-di-03-sf.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/var/lib/libvirt/images/asr5700/qvpc-di-03-sf-hda.img,if=none,id=drive-ide0-0-0,format=qcow2 -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=2 -drive if=none,id=drive-ide0-1-0,readonly=on,format=raw -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1 -vnc 127.0.0.1:0 -vga cirrus -device i6300esb,id=watchdog0,bus=pci.0,addr=0x3 -watchdog-action reset -device vfio-pci,host=08:00.0,id=hostdev0,bus=pci.0,addr=0x5 -device vfio-pci,host=09:00.0,id=hostdev1,bus=pci.0,addr=0x6 -device vfio-pci,host=0a:00.0,id=hostdev2,bus=pci.0,addr=0x7 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 -msg timestamp=on

If 140 hugepages are preallocated, the guest cannot start and it complained about not enough memory. The libvirt version is shown as follows:

virsh # version
Compiled against library: libvirt 1.2.8
Using library: libvirt 1.2.8
Using API: QEMU 1.2.8
Running hypervisor: QEMU 1.5.3

Also the guest configuration contains a numa section. The hugepages are uniformly distributed to two nodes. In this case, do I need to make additional configurations to enable usage of hugepages?
<numatune>
  <memory mode='strict' nodeset='0-1'/>
</numatune>

This says that all the memory for your guest should be pinned onto host nodes 0-1. If you want to be more specific, you can explicitly wire guest NUMA nodes onto host NUMA nodes in a 1:N relationship (where N can even be 1, in which case you get 1:1), e.g.:

<memnode cellid='0' mode='preferred' nodeset='0'/>

Michal
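Putting the two suggestions together, an explicit 1:1 wiring of two guest cells onto the two host nodes might look like this (illustrative values, not from the thread):

```xml
<numatune>
  <memory mode='strict' nodeset='0-1'/>
  <memnode cellid='0' mode='strict' nodeset='0'/>
  <memnode cellid='1' mode='strict' nodeset='1'/>
</numatune>
```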
Re: [libvirt] Problem with setting up KVM guests to use HugePages
Hi Michal,

I also tried the other option you mentioned ("The other option you have is to not use guest NUMA nodes, in which case the global -mem-path can be used") by removing the following from the XML:

<numatune>
  <memory mode='strict' nodeset='0-1'/>
</numatune>

while keeping the older qemu-1.5.3, which does not support the page element. It still complained about not enough memory when starting the guest. Did I miss something?

On Thu, Jun 11, 2015 at 12:00 AM, Michal Privoznik mpriv...@redhat.com wrote: [please keep the list CC'ed] On 10.06.2015 20:09, Clarylin L wrote: Hi Michal, Thanks a lot. If 100 hugepages are pre-allocated, the guest can start without decreasing the number of hugepages. Since the guest requires 128 hugepages, it's kind of expected that the guest would not take memory from hugepages.

Before guest start:

[root@local ~]# cat /proc/meminfo | grep uge
AnonHugePages:         0 kB
HugePages_Total:     100
HugePages_Free:      100
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:    1048576 kB

After:

[root@local ~]# cat /proc/meminfo | grep uge
AnonHugePages:  134254592 kB
HugePages_Total:     100
HugePages_Free:      100
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:    1048576 kB

There are no -mem-prealloc and -mem-path options in the qemu command. And there can't be. From the command line below, you are defining 2 NUMA nodes for your guest. In order to instruct qemu to back their memory by huge pages you need it to support the memory-backend-file object, which was introduced in qemu-2.1.0. The other option you have is to not use guest NUMA nodes, in which case the global -mem-path can be used.
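For the non-NUMA route quoted above (global -mem-path on pre-2.1 QEMU) to take effect, libvirt also needs a mounted hugetlbfs it knows about. A sketch of the pieces involved - the mount point is illustrative, not from this thread:

```
# mount a 1G-page hugetlbfs instance
mount -t hugetlbfs -o pagesize=1G hugetlbfs /dev/hugepages1G

# /etc/libvirt/qemu.conf
hugetlbfs_mount = "/dev/hugepages1G"
```

with the domain keeping a plain <memoryBacking><hugepages/></memoryBacking> and no guest <numa> cells, so qemu is started with -mem-path pointing into that mount.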
Re: [libvirt] [PATCH] network: add an option to make dns public
On 06/10/2015 03:56 PM, John Ferlan wrote: On 06/01/2015 07:54 AM, Cédric Bosdonnat wrote: In some use cases we don't want the virtual network's DNS to only listen to the vnet interface. Adding a publiclyAccessible attribute :-) Really, that name was only intended as a placeholder! I was hoping you (or someone else) would be able to find something shorter/simpler. Lacking that, I guess this is a reasonable name though. to the dns element in the configuration allows the DNS to listen to all interfaces. It simply disables the bind-dynamic option of dnsmasq for the network.
---
 docs/formatnetwork.html.in                           | 11 +++
 docs/schemas/network.rng                             | 15 ++-
 src/conf/network_conf.c                              |  6 ++
 src/conf/network_conf.h                              |  1 +
 src/network/bridge_driver.c                          |  4 +++-
 tests/networkxml2confdata/nat-network-dns-hosts.conf |  1 -
 tests/networkxml2confdata/nat-network-dns-hosts.xml  |  2 +-
 7 files changed, 32 insertions(+), 8 deletions(-)

diff --git a/docs/formatnetwork.html.in b/docs/formatnetwork.html.in
index 6abed8f..8e43658 100644
--- a/docs/formatnetwork.html.in
+++ b/docs/formatnetwork.html.in
@@ -851,6 +851,17 @@
         DNS server.
       </p>
+      <p>
+        The <code>dns</code> element
+        can have an optional <code>publiclyAccessible</code>
+        attribute <span class="since">Since 1.2.17</span>.
+        If <code>publiclyAccessible</code> is yes, then the DNS server
+        will handle requests for all interfaces.
+        If <code>publiclyAccessible</code> is not set or no, the DNS
+        server will only handle requests for the interface of the virtual
+        network.
+      </p>
+
       Currently supported sub-elements of <code>&lt;dns&gt;</code> are:
       <dl>
         <dt><code>forwarder</code></dt>
diff --git a/docs/schemas/network.rng b/docs/schemas/network.rng
index 4edb6eb..f989625 100644
--- a/docs/schemas/network.rng
+++ b/docs/schemas/network.rng
@@ -244,12 +244,17 @@
            and other features in the dns element -->
       <optional>
         <element name="dns">
-          <optional>
-            <attribute name="forwardPlainNames">
-              <ref name="virYesNo"/>
-            </attribute>
-          </optional>
           <interleave>
+            <optional>
+              <attribute name="forwardPlainNames">
+                <ref name="virYesNo"/>
+              </attribute>
+            </optional>
+            <optional>
+              <attribute name="publiclyAccessible">
+                <ref name="virYesNo"/>
+              </attribute>
+            </optional>

Moving the attributes inside the interleave had me looking through other .rng's... I'm no expert, but had thought they really only mattered for elements. I'm not an expert either, but you are correct :-)

             <zeroOrMore>
               <element name="forwarder">
                 <attribute name="addr"><ref name="ipAddr"/></attribute>
diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c
index f4a9df0..99bac6d 100644
--- a/src/conf/network_conf.c
+++ b/src/conf/network_conf.c
@@ -1309,9 +1309,14 @@ virNetworkDNSDefParseXML(const char *networkName,
     size_t i;
     int ret = -1;
     xmlNodePtr save = ctxt->node;
+    char *publiclyAccessible = NULL;

     ctxt->node = node;

+    publiclyAccessible = virXPathString("string(./@publiclyAccessible)", ctxt);
+    if (publiclyAccessible)
+        def->publiclyAccessible = virTristateBoolTypeFromString(publiclyAccessible);
+
     forwardPlainNames = virXPathString("string(./@forwardPlainNames)", ctxt);
     if (forwardPlainNames) {
         def->forwardPlainNames = virTristateBoolTypeFromString(forwardPlainNames);
@@ -1410,6 +1415,7 @@ virNetworkDNSDefParseXML(const char *networkName,
     ret = 0;

  cleanup:
+    VIR_FREE(publiclyAccessible);
     VIR_FREE(forwardPlainNames);
     VIR_FREE(fwdNodes);
     VIR_FREE(hostNodes);
diff --git a/src/conf/network_conf.h b/src/conf/network_conf.h
index f69d999..f555b6b 100644
--- a/src/conf/network_conf.h
+++ b/src/conf/network_conf.h
@@ -136,6 +136,7 @@ struct _virNetworkDNSDef {
     virNetworkDNSSrvDefPtr srvs;
     size_t nfwds;
     char **forwarders;
+    int publiclyAccessible; /* enum virTristateBool */
 };

 typedef struct _virNetworkIpDef virNetworkIpDef;
diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c
index d195085..c39b1a5 100644
--- a/src/network/bridge_driver.c
+++ b/src/network/bridge_driver.c
@@ -996,8 +996,10 @@ networkDnsmasqConfContents(virNetworkObjPtr network,
      * other than one of the virtual guests connected directly to
      * this network). This was added in response to CVE 2012-3411. */
+    if (network->def->dns.publiclyAccessible != VIR_TRISTATE_BOOL_YES)
+
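For illustration, a network definition using the attribute as proposed in this patch might look like the following (the surrounding elements are generic examples, not taken from the patch):

```xml
<network>
  <name>open-dns</name>
  <forward mode='nat'/>
  <bridge name='virbr1'/>
  <ip address='192.168.100.1' netmask='255.255.255.0'/>
  <dns publiclyAccessible='yes'/>
</network>
```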
Re: [libvirt] [PATCH] lxc / docker containers gives option to inherit the namespaces. Example lxc-start has option [ --share-[net|ipc|uts] name|pid ] where --share-net name|pid means Inherit a network
On 21.05.2015 19:43, ik.nitk wrote: This patch tries to add a similar option to libvirt lxc. So to inherit namespaces from the named container c2, add this into the XML:

<lxc:namespace>
  <sharenet type='name' value='c2'/>
</lxc:namespace>

And to inherit namespaces from a pid, add this into the XML:

<lxc:namespace>
  <sharenet type='pid' value='10245'/>
</lxc:namespace>

And to inherit namespaces from a netns, add this into the XML:

<lxc:namespace>
  <sharenet type='netns' value='red'/>
</lxc:namespace>

Similar options exist for ipc/uts: <shareipc/>, <shareuts/>. The reason an lxc XML namespace is added is that this feature is very specific to lxc; therefore I wanted to keep it separated from the actual libvirt domain XML.

-imran
---

The subject line is just too long. Look at git log to see the style we use to write commit messages.

 src/Makefile.am         |   5 +-
 src/lxc/lxc_conf.c      |   2 +-
 src/lxc/lxc_conf.h      |  23 +
 src/lxc/lxc_container.c | 191 ++--
 src/lxc/lxc_domain.c    | 254 +++-
 src/lxc/lxc_domain.h    |   1 +
 6 files changed, 463 insertions(+), 13 deletions(-)

You are introducing new elements and a namespace to the XML. This must always go hand in hand with an RNG schema adjustment and a test case or two under tests/. I NACK every patch that does not comply with this rule. But let me review the rest.

diff --git a/src/Makefile.am b/src/Makefile.am
index 579421d..1a78fde 100644
--- a/src/Makefile.am
+++ b/src/Makefile.am
@@ -1293,7 +1293,8 @@ libvirt_driver_lxc_impl_la_CFLAGS = \
 	-I$(srcdir)/access \
 	-I$(srcdir)/conf \
 	$(AM_CFLAGS)
-libvirt_driver_lxc_impl_la_LIBADD = $(CAPNG_LIBS) $(LIBNL_LIBS) $(FUSE_LIBS)
+libvirt_driver_lxc_impl_la_LIBADD = $(CAPNG_LIBS) $(LIBNL_LIBS) $(LIBXML_LIBS) $(FUSE_LIBS)
+libvirt_driver_lxc_impl_la_LDFLAGS = libvirt-lxc.la

This won't fly. If you need libvirt-lxc.la to be added, you must put it into LIBADD. Otherwise automake will fail to see the dependency tree. It happened to me when I was building this with -j5. Although, this won't be needed at all IMO, but more on that later.

 if WITH_BLKID
 libvirt_driver_lxc_impl_la_CFLAGS += $(BLKID_CFLAGS)
 libvirt_driver_lxc_impl_la_LIBADD += $(BLKID_LIBS)
@@ -2652,6 +2653,8 @@ libvirt_lxc_LDADD = \
 	libvirt-net-rpc.la \
 	libvirt_security_manager.la \
 	libvirt_conf.la \
+	libvirt.la \
+	libvirt-lxc.la \
 	libvirt_util.la \
 	../gnulib/lib/libgnu.la
 if WITH_DTRACE_PROBES
diff --git a/src/lxc/lxc_conf.c b/src/lxc/lxc_conf.c
index c393cb5..96a0f47 100644
--- a/src/lxc/lxc_conf.c
+++ b/src/lxc/lxc_conf.c
@@ -213,7 +213,7 @@ lxcDomainXMLConfInit(void)
 {
     return virDomainXMLOptionNew(&virLXCDriverDomainDefParserConfig,
                                  &virLXCDriverPrivateDataCallbacks,
-                                 NULL);
+                                 &virLXCDriverDomainXMLNamespace);
 }
diff --git a/src/lxc/lxc_conf.h b/src/lxc/lxc_conf.h
index 8340b1f..59002e5 100644
--- a/src/lxc/lxc_conf.h
+++ b/src/lxc/lxc_conf.h
@@ -67,6 +67,29 @@ struct _virLXCDriverConfig {
     bool securityRequireConfined;
 };

+
+typedef enum {
+    VIR_DOMAIN_NAMESPACE_SHARENET = 0,
+    VIR_DOMAIN_NAMESPACE_SHAREIPC,
+    VIR_DOMAIN_NAMESPACE_SHAREUTS,
+    VIR_DOMAIN_NAMESPACE_LAST,
+} virDomainNamespace;
+
+struct ns_info {
+    const char *proc_name;
+    int clone_flag;
+};
+
+extern const struct ns_info ns_info[VIR_DOMAIN_NAMESPACE_LAST];
+
+typedef struct _lxcDomainDef lxcDomainDef;
+typedef lxcDomainDef *lxcDomainDefPtr;
+struct _lxcDomainDef {
+    int ns_inherit_fd[VIR_DOMAIN_NAMESPACE_LAST];
+    char *ns_type[VIR_DOMAIN_NAMESPACE_LAST];
+    char *ns_val[VIR_DOMAIN_NAMESPACE_LAST];
+};
+
 struct _virLXCDriver {
     virMutex lock;
diff --git a/src/lxc/lxc_container.c b/src/lxc/lxc_container.c
index 9a9ae5c..a9a7ba0 100644
--- a/src/lxc/lxc_container.c
+++ b/src/lxc/lxc_container.c
@@ -25,8 +25,8 @@
  */

 #include <config.h>
-

No, we like the extra space here. config.h has a special status.

 #include <fcntl.h>
+#include <sched.h>
 #include <limits.h>
 #include <stdlib.h>
 #include <stdio.h>
@@ -38,7 +38,6 @@
 #include <mntent.h>
 #include <sys/reboot.h>
 #include <linux/reboot.h>
-
 /* Yes, we want linux private one, for _syscall2() macro */
 #include <linux/unistd.h>
@@ -99,6 +98,50 @@ VIR_LOG_INIT("lxc.lxc_container");

 typedef char lxc_message_t;
 #define LXC_CONTINUE_MSG 'c'

+#ifdef __linux__
+/*
+ * Workaround older glibc. While kernel may support the setns
+ * syscall, the glibc wrapper might not exist. If that's the
+ * case, use our own.
+ */
+# ifndef __NR_setns
+#  if defined(__x86_64__)
+#   define __NR_setns 308
[libvirt] [PATCH] parallels: implement attach/detach network.
Support nova commands interface-attach and interface-detach. For containers only. I use memcmp() to compare MAC addresses, because PrlVmDevNet_GetMacAddress() returns the MAC as a UTF-8 encoded, null-terminated string.
---
 src/parallels/parallels_driver.c |  16
 src/parallels/parallels_sdk.c    | 144 +-
 src/parallels/parallels_sdk.h    |   4 +
 3 files changed, 161 insertions(+), 3 deletions(-)

diff --git a/src/parallels/parallels_driver.c b/src/parallels/parallels_driver.c
index 706229d..0009127 100644
--- a/src/parallels/parallels_driver.c
+++ b/src/parallels/parallels_driver.c
@@ -1117,6 +1117,14 @@ static int parallelsDomainAttachDeviceFlags(virDomainPtr dom, const char *xml,
             goto cleanup;
         }
         break;
+    case VIR_DOMAIN_DEVICE_NET:
+        ret = prlsdkAttachNet(privdom, privconn, dev->data.net);
+        if (ret) {
+            virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                           _("network attach failed"));
+            goto cleanup;
+        }
+        break;
     default:
         virReportError(VIR_ERR_OPERATION_UNSUPPORTED,
                        _("device type '%s' cannot be attached"),
@@ -1186,6 +1194,14 @@ static int parallelsDomainDetachDeviceFlags(virDomainPtr dom, const char *xml,
             goto cleanup;
         }
         break;
+    case VIR_DOMAIN_DEVICE_NET:
+        ret = prlsdkDetachNet(privdom, privconn, dev->data.net);
+        if (ret) {
+            virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                           _("network detach failed"));
+            goto cleanup;
+        }
+        break;
     default:
         virReportError(VIR_ERR_OPERATION_UNSUPPORTED,
                        _("device type '%s' cannot be detached"),
diff --git a/src/parallels/parallels_sdk.c b/src/parallels/parallels_sdk.c
index 104c905..306f5e3 100644
--- a/src/parallels/parallels_sdk.c
+++ b/src/parallels/parallels_sdk.c
@@ -2815,6 +2815,12 @@ static int prlsdkAddNet(PRL_HANDLE sdkdom,
     pret = PrlVmDevNet_SetMacAddress(sdknet, macstr);
     prlsdkCheckRetGoto(pret, cleanup);

+    pret = PrlVmDevNet_SetConfigureWithDhcp(sdknet, true);
+    prlsdkCheckRetGoto(pret, cleanup);
+
+    pret = PrlVmDevNet_SetAutoApply(sdknet, true);
+    prlsdkCheckRetGoto(pret, cleanup);
+
     if (isCt) {
         if (net->model)
             VIR_WARN("Setting network adapter for containers is not
@@ -2885,14 +2891,15 @@ static int prlsdkAddNet(PRL_HANDLE sdkdom,
     return ret;
 }

-static void prlsdkDelNet(parallelsConnPtr privconn, virDomainNetDefPtr net)
+static int prlsdkDelNet(parallelsConnPtr privconn, virDomainNetDefPtr net)
 {
+    int ret = -1;
     PRL_RESULT pret;
     PRL_HANDLE vnet = PRL_INVALID_HANDLE;
     PRL_HANDLE job = PRL_INVALID_HANDLE;

     if (net->type != VIR_DOMAIN_NET_TYPE_BRIDGE)
-        return;
+        return 0;

     pret = PrlVirtNet_Create(&vnet);
     prlsdkCheckRetGoto(pret, cleanup);
@@ -2900,12 +2907,142 @@ static void prlsdkDelNet(parallelsConnPtr privconn, virDomainNetDefPtr net)
     pret = PrlVirtNet_SetNetworkId(vnet, net->data.network.name);
     prlsdkCheckRetGoto(pret, cleanup);

-    PrlSrv_DeleteVirtualNetwork(privconn->server, vnet, 0);
+    job = PrlSrv_DeleteVirtualNetwork(privconn->server, vnet, 0);
     if (PRL_FAILED(pret = waitJob(job)))
         goto cleanup;

+    ret = 0;
+
  cleanup:
     PrlHandle_Free(vnet);
+    return ret;
+}
+
+int prlsdkAttachNet(virDomainObjPtr dom, parallelsConnPtr privconn, virDomainNetDefPtr net)
+{
+    int ret = -1;
+    parallelsDomObjPtr privdom = dom->privateData;
+    PRL_HANDLE job = PRL_INVALID_HANDLE;
+
+    if (!IS_CT(dom->def)) {
+        virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s",
+                       _("network device cannot be attached"));
+        goto cleanup;
+    }
+
+    job = PrlVm_BeginEdit(privdom->sdkdom);
+    if (PRL_FAILED(waitJob(job)))
+        goto cleanup;
+
+    ret = prlsdkAddNet(privdom->sdkdom, privconn, net, IS_CT(dom->def));
+    if (ret == 0) {
+        job = PrlVm_CommitEx(privdom->sdkdom, PVCF_DETACH_HDD_BUNDLE);
+        if (PRL_FAILED(waitJob(job))) {
+            ret = -1;
+            goto cleanup;
+        }
+    }
+
+ cleanup:
+    return ret;
+}
+
+static int
+prlsdkGetNetIndex(PRL_HANDLE sdkdom, virDomainNetDefPtr net)
+{
+    int idx = -1;
+    PRL_RESULT pret;
+    PRL_UINT32 netCount;
+    PRL_UINT32 i;
+    PRL_HANDLE adapter = PRL_INVALID_HANDLE;
+    PRL_UINT32 len;
+    char adapterMac[PRL_MAC_STRING_BUFNAME];
+    char netMac[PRL_MAC_STRING_BUFNAME];
+
+    prlsdkFormatMac(&net->mac, netMac);
+    pret = PrlVmCfg_GetNetAdaptersCount(sdkdom, &netCount);
+    prlsdkCheckRetGoto(pret, cleanup);
+
+    for (i = 0; i < netCount; ++i) {
+
+        pret = PrlVmCfg_GetNetAdapter(sdkdom, i, &adapter);
+        prlsdkCheckRetGoto(pret, cleanup);
+
+        len = sizeof(adapterMac);
+        memset(adapterMac, 0, sizeof(adapterMac));
+        pret = PrlVmDevNet_GetMacAddress(adapter,
Re: [libvirt] [PATCH v2 3/4] qemu: Add capability for vhost-user multiqueue
On 06/04/2015 01:04 PM, Martin Kletzander wrote:
> The support for this was added in QEMU with commit
> 830d70db692e374b5f4407f96a1ceefdcc97. Unfortunately we have to do
> another ugly version-based capability check. The other option would be
> not to check for the capability at all and leave that to qemu as it's
> done with multiqueue tap devices.
>
> Signed-off-by: Martin Kletzander <mklet...@redhat.com>
> ---
>  src/qemu/qemu_capabilities.c | 6 ++
>  src/qemu/qemu_capabilities.h | 1 +
>  2 files changed, 7 insertions(+)

This patch will need some updates because of commit id '29ce1693' which
now takes 189. It's obvious what needs to be done though and doesn't
require sending a new series.

> diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
> index 960afa4ac0db..f102ed80f15e 100644
> --- a/src/qemu/qemu_capabilities.c
> +++ b/src/qemu/qemu_capabilities.c
> @@ -284,6 +284,7 @@ VIR_ENUM_IMPL(virQEMUCaps, QEMU_CAPS_LAST,
>                "aes-key-wrap",
>                "dea-key-wrap",
>                "pci-serial",
> +              "vhost-user-multiq",

This one will add the /* 190 */

John

>     );
>
> @@ -3283,6 +3284,11 @@ virQEMUCapsInitQMPMonitor(virQEMUCapsPtr qemuCaps,
>      if (qemuCaps->version >= 2002000)
>          virQEMUCapsSet(qemuCaps, QEMU_CAPS_MACHINE_VMPORT_OPT);
>
> +    /* vhost-user supports multi-queue from v2.4.0 onwards,
> +     * but there is no way to query for that capability */
> +    if (qemuCaps->version >= 2004000)
> +        virQEMUCapsSet(qemuCaps, QEMU_CAPS_VHOSTUSER_MULTIQ);
> +
>      if (virQEMUCapsProbeQMPCommands(qemuCaps, mon) < 0)
>          goto cleanup;
>      if (virQEMUCapsProbeQMPEvents(qemuCaps, mon) < 0)
>
> diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h
> index 9c956f3007be..3dbd767f2516 100644
> --- a/src/qemu/qemu_capabilities.h
> +++ b/src/qemu/qemu_capabilities.h
> @@ -228,6 +228,7 @@ typedef enum {
>      QEMU_CAPS_AES_KEY_WRAP       = 186, /* -machine aes_key_wrap */
>      QEMU_CAPS_DEA_KEY_WRAP       = 187, /* -machine dea_key_wrap */
>      QEMU_CAPS_DEVICE_PCI_SERIAL  = 188, /* -device pci-serial */
> +    QEMU_CAPS_VHOSTUSER_MULTIQ   = 189, /* vhost-user with -netdev queues= */
>
>      QEMU_CAPS_LAST,                   /* this must always be the last item */
>  } virQEMUCapsFlags;

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvirt-list
Re: [libvirt] [RFC] get guest OS infos
On Thu, Jun 11, 2015 at 08:47:12AM -0500, Dennis Jenkins wrote: On Thu, Jun 11, 2015 at 3:51 AM, Daniel P. Berrange berra...@redhat.com wrote: On Thu, Jun 11, 2015 at 09:17:30AM +0100, Daniel P. Berrange wrote: On Thu, Jun 11, 2015 at 01:51:33PM +0800, zhang bo wrote: Different OSes have different capabilities and behaviors sometimes. We have to distinguish them then. For example, our clients want to send NMI interrupts to certain guests (eg. Linux distributions), but not others (eg. Windows guests). They want to acquire the list below: guest1: RHEL 7 guest2: RHEL 7 guest3: Ubuntu 12 guest4: Ubuntu 13 guest5: Windows 7 .. AFAIK, neither libvirt, nor openstack, nor qemu, has such a capability of showing these guest OS infos. Libvirt now supports showing host capabilities and driver capabilities, but not an individual guest OS's capability. We may refer to http://libvirt.org/formatdomaincaps.html for more information.

Hello. I wrote a utility a few years ago to detect which OS is running in each qemu VM under libvirt via memory probing. I have not touched the code in a few years. YMMV. http://pastebin.com/m0mfcK8G

FWIW, you can also use libguestfs to analyse the disk of a libvirt guest while it is running, if you use libguestfs' readonly mode.

Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
Re: [libvirt] [PATCH v2 4/4] qemu: add multiqueue vhost-user support
On 06/04/2015 01:04 PM, Martin Kletzander wrote:
> From: Maxime Leroy <maxime.le...@6wind.com>
>
> This patch adds the support of the queues attribute of the driver
> element for vhost-user interface type. Example:
>
>     <interface type='vhostuser'>
>       <mac address='52:54:00:ee:96:6d'/>
>       <source type='unix' path='/tmp/vhost2.sock' mode='client'/>
>       <model type='virtio'/>
>       <driver queues='4'/>
>     </interface>
>
> Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1207692
>
> Signed-off-by: Maxime Leroy <maxime.le...@6wind.com>
> Signed-off-by: Martin Kletzander <mklet...@redhat.com>
> ---
>  docs/formatdomain.html.in                                  | 11 +--
>  src/qemu/qemu_command.c                                    | 14 +-
>  ...ostuser.args => qemuxml2argv-net-vhostuser-multiq.args} |  6 +-
>  ...vhostuser.xml => qemuxml2argv-net-vhostuser-multiq.xml} |  6 ++
>  tests/qemuxml2argvtest.c                                   |  3 +++
>  5 files changed, 36 insertions(+), 4 deletions(-)
>  copy tests/qemuxml2argvdata/{qemuxml2argv-net-vhostuser.args => qemuxml2argv-net-vhostuser-multiq.args} (75%)
>  copy tests/qemuxml2argvdata/{qemuxml2argv-net-vhostuser.xml => qemuxml2argv-net-vhostuser-multiq.xml} (87%)
>
> diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
> index 72ad54cee188..85238a16af8d 100644
> --- a/docs/formatdomain.html.in
> +++ b/docs/formatdomain.html.in
> @@ -4260,7 +4260,8 @@ qemu-kvm -net nic,model=? /dev/null
>          type='virtio'/&gt;</code>, multiple packet processing queues can
>          be created; each queue will potentially be handled by a different
>          processor, resulting in much higher throughput.
> -        <span class="since">Since 1.0.6 (QEMU and KVM only)</span>
> +        <span class="since">Since 1.0.6 (QEMU and KVM only) and for vhost-user
> +        since 1.2.17</span>

This read a bit strangely - so I went looking at the context which
describes the feature for the Multiqueue virtio-net feature which has a
Service Temporarily Unavailable link.

So rather than just adding a mostly cryptic conjunction, how about:

    The optional queues attribute controls the number of queues to be used
    for either <a href="http://www.linux-kvm.org/page/Multiqueue">Multiqueue
    virtio-net</a> or <a href="#elementVhostuser">vhost-user</a> network
    interfaces. Use of multiple packet processing queues requires the
    interface having the <model type='virtio'/> element. Each queue will
    potentially be handled by a different processor, resulting in much
    higher throughput.
    <span class="since">virtio-net since 1.0.6 (QEMU and KVM only).</span>
    <span class="since">vhost-user since 1.2.17.</span>

John

>        </dd>
>        <dt><code>host</code> offloading options</dt>
>        <dd>
> @@ -4581,9 +4582,15 @@ qemu-kvm -net nic,model=? /dev/null
>  &lt;devices&gt;
>    &lt;interface type='vhostuser'&gt;
>      &lt;mac address='52:54:00:3b:83:1a'/&gt;
> -    &lt;source type='unix' path='/tmp/vhost.sock' mode='server'/&gt;
> +    &lt;source type='unix' path='/tmp/vhost1.sock' mode='server'/&gt;
>      &lt;model type='virtio'/&gt;
>    &lt;/interface&gt;
> +  &lt;interface type='vhostuser'&gt;
> +    &lt;mac address='52:54:00:3b:83:1b'/&gt;
> +    &lt;source type='unix' path='/tmp/vhost2.sock' mode='client'/&gt;
> +    &lt;model type='virtio'/&gt;
> +    &lt;driver queues='5'/&gt;
> +  &lt;/interface&gt;
>  &lt;/devices&gt;
>  ...</pre>
>
> diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
> index 61faa576e11b..862729f01352 100644
> --- a/src/qemu/qemu_command.c
> +++ b/src/qemu/qemu_command.c
> @@ -8089,6 +8089,7 @@ qemuBuildVhostuserCommandLine(virCommandPtr cmd,
>  {
>      virBuffer chardev_buf = VIR_BUFFER_INITIALIZER;
>      virBuffer netdev_buf = VIR_BUFFER_INITIALIZER;
> +    unsigned int queues = net->driver.virtio.queues;
>      char *nic = NULL;
>
>      if (!qemuDomainSupportsNetdev(def, qemuCaps, net)) {
> @@ -8126,13 +8127,24 @@ qemuBuildVhostuserCommandLine(virCommandPtr cmd,
>      virBufferAsprintf(&netdev_buf, "type=vhost-user,id=host%s,chardev=char%s",
>                        net->info.alias, net->info.alias);
>
> +    if (queues > 1) {
> +        if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_VHOSTUSER_MULTIQ)) {
> +            virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
> +                           _("multi-queue is not supported for vhost-user "
> +                             "with this QEMU binary"));
> +            goto error;
> +        }
> +        virBufferAsprintf(&netdev_buf, ",queues=%u", queues);
> +    }
> +
>      virCommandAddArg(cmd, "-chardev");
>      virCommandAddArgBuffer(cmd, &chardev_buf);
>      virCommandAddArg(cmd, "-netdev");
>      virCommandAddArgBuffer(cmd, &netdev_buf);
>
> -    if (!(nic = qemuBuildNicDevStr(def, net, -1, bootindex, 0, qemuCaps))) {
> +    if (!(nic = qemuBuildNicDevStr(def, net, -1, bootindex,
> +                                   queues, qemuCaps))) {
>          virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
>                         _("Error generating NIC -device string"));
>          goto error;
> diff --git
Re: [libvirt] Overhead for a default cpu cg placement scheme
On Thu, Jun 11, 2015 at 2:33 PM, Daniel P. Berrange berra...@redhat.com wrote: On Thu, Jun 11, 2015 at 02:16:50PM +0300, Andrey Korolyov wrote: On Thu, Jun 11, 2015 at 2:09 PM, Daniel P. Berrange berra...@redhat.com wrote: On Thu, Jun 11, 2015 at 01:50:24PM +0300, Andrey Korolyov wrote: Hi Daniel, would it possible to adopt an optional tunable for a virCgroup mechanism which targets to a disablement of a nested (per-thread) cgroup creation? Those are bringing visible overhead for many-threaded guest workloads, almost 5% in non-congested host CPU state, primarily because the host scheduler should make a much more decisions with those cgroups than without them. We also experienced a lot of host lockups with currently exploited cgroup placement and disabled nested behavior a couple of years ago. Though the current patch is simply carves out the mentioned behavior, leaving only top-level per-machine cgroups, it can serve for an upstream after some adaptation, that`s why I`m asking about a chance of its acceptance. This message is a kind of 'request of a feature', it either can be accepted/dropped from our side or someone may give a hand and redo it from scratch. The detailed benchmarks are related to a host 3.10.y, if anyone is interested in the numbers for latest stable, I can update those. When you say nested cgroup creation, as you referring to the modern libvirt hierarchy, or the legacy hierarchy - as described here: http://libvirt.org/cgroups.html The current libvirt setup used for a year or so now is much shallower than previously, to the extent that we'd consider performance problems with it to be the job of the kernel to fix. Thanks, I`m referring to a 'new nested' hiearchy for an overhead mentioned above. The host crashes I mentioned happened with old hierarchy back ago, forgot to mention this. 
Despite the flattening of the topology in the current scheme, it should be possible to disable fine-grained group creation for the VM threads for those users who don't need per-vCPU CPU pinning/accounting (the overhead is caused by the placement for the cpu cgroup, not by the accounting/pinning ones; I'm assuming equal distribution across all nested-aware cgroup types with such a disablement), that's the point for now.

Ok, so the per-vCPU cgroups are used for a couple of things

 - Setting scheduler tunables - period/quota/shares/etc
 - Setting CPU pinning
 - Setting NUMA memory pinning

In addition to the per-vCPU cgroup, we have one cgroup for each I/O thread, and also one more for general QEMU emulator threads. In the case of CPU pinning we already have automatic fallback to sched_setaffinity if the CPUSET controller isn't available. We could in theory start off without the per-vCPU/emulator/I/O cgroups and only create them as and when the feature is actually used. The concern I would have though is that changing the cgroups layout on the fly may cause unexpected side effects in the behaviour of the VM. More critically, there would be a lot of places in the code where we would need to deal with this, which could hurt maintainability.

How confident are you that the performance problems you see are inherent to the actual use of the cgroups, and not instead the result of some particular bad choice of default parameters we might have left in the cgroups? In general I'd have a desire to try to work to eliminate the perf impact before we consider the complexity of disabling this feature.

Regards, Daniel

Hm, what are you proposing to begin with in testing terms? By my understanding, the excessive cgroup usage along with small scheduler quanta *will* lead to some overhead anyway.
Let's look at the numbers, which I will bring tomorrow. The mentioned five percent was caught on a guest 'perf numa xxx' run for different kinds of mappings and host behavior (post-3.8): memory automigration on/off, a kind of 'NUMA passthrough' (grouping vCPU threads according to the host and emulated guest NUMA topologies), and totally scattered, unpinned threads within a single NUMA node and across multiple NUMA nodes. As the result for 3.10.y, there was a five-percent difference between the best-performing case with thread-level cpu cgroups and the 'totally scattered' case on a simple mid-range two-headed node. If you think the choice of an emulated workload is wrong, please let me know; I was afraid that a non-synthetic workload in the guest might suffer from a range of side factors, and therefore chose perf for this task.
Re: [libvirt] Overhead for a default cpu cg placement scheme
On Thu, Jun 11, 2015 at 04:06:59PM +0300, Andrey Korolyov wrote:
[snip - full quote of the earlier exchange]

Benchmarking isn't my area of expertise, but you should be able to just disable the CPUSET controller entirely in qemu.conf. If we got some comparative results for with & without CPUSET, that'd be an interesting place to start. If it shows a clear difference, I might be able to get some of the Red Hat performance team to
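Daniel's suggestion above, running with the CPUSET controller disabled, maps to the `cgroup_controllers` setting in /etc/libvirt/qemu.conf. A sketch of such a configuration for the comparison run (the exact controller list varies by setup; treat this as illustrative, not a recommended default):

```
# /etc/libvirt/qemu.conf
# Leave "cpuset" out of the list so libvirt falls back to
# sched_setaffinity() for CPU pinning instead of cpuset cgroups.
cgroup_controllers = [ "cpu", "devices", "memory", "blkio", "cpuacct" ]
```

Restart libvirtd (and the guests) after the change so domains are re-placed under the reduced controller set, then rerun the same perf workload for the with/without comparison.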
Re: [libvirt] [(python) PATCH 0/2] Provide symbolic names for typed parameters
On Mon, Jun 08, 2015 at 11:34:34 +0200, Jiri Denemark wrote:
> libvirt:
> Jiri Denemark (1):
>   apibuild: Generate macro/@string attribute
>
>  docs/apibuild.py | 47 ---
>  1 file changed, 28 insertions(+), 19 deletions(-)
>
> libvirt-python:
> Jiri Denemark (1):
>   Provide symbolic names for typed parameters
>
>  generator.py | 8
>  1 file changed, 8 insertions(+)

I pushed this series and the previous apibuild patch. Thanks for the reviews.

Jirka
Re: [libvirt] [sandbox PATCH 1/3] Add a utility function for guessing filetype from file extension
On Wed, Jun 10, 2015 at 01:40:08PM +0200, Eren Yagdiran wrote:
> Consider the file name extension as the image type, except for .img
> that are usually RAW images
> ---
>  libvirt-sandbox/Makefile.am            |  1 +
>  libvirt-sandbox/libvirt-sandbox-util.c | 79 ++
>  libvirt-sandbox/libvirt-sandbox-util.h |  6 +++
>  3 files changed, 86 insertions(+)
>  create mode 100644 libvirt-sandbox/libvirt-sandbox-util.c
>
> diff --git a/libvirt-sandbox/Makefile.am b/libvirt-sandbox/Makefile.am
> index 96302cb..6917f04 100644
> --- a/libvirt-sandbox/Makefile.am
> +++ b/libvirt-sandbox/Makefile.am
> @@ -84,6 +84,7 @@ SANDBOX_HEADER_FILES = \
>         $(NULL)
>  SANDBOX_SOURCE_FILES = \
>         libvirt-sandbox-main.c \
> +       libvirt-sandbox-util.c \
>         libvirt-sandbox-config.c \
>         libvirt-sandbox-config-network.c \
>         libvirt-sandbox-config-network-address.c \
> diff --git a/libvirt-sandbox/libvirt-sandbox-util.c b/libvirt-sandbox/libvirt-sandbox-util.c
> new file mode 100644
> index 000..0ab4fac
> --- /dev/null
> +++ b/libvirt-sandbox/libvirt-sandbox-util.c
> @@ -0,0 +1,79 @@
> +/*
> + * libvirt-sandbox-util.c: libvirt sandbox util functions
> + *
> + * Copyright (C) 2015 Universitat Politècnica de Catalunya.
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + *
> + * Author: Eren Yagdiran <erenyagdi...@gmail.com>
> + */
> +
> +#include <config.h>
> +#include <string.h>
> +
> +#include <libvirt-sandbox/libvirt-sandbox.h>
> +
> +/* This array contains string values for GVirConfigDomainDiskFormat,
> + * order is important.*/
> +static const gchar *FORMATS_STRINGS[] = {
> +    "raw",
> +    "dir",
> +    "bochs",
> +    "cloop",
> +    "cow",
> +    "dmg",
> +    "iso",
> +    "qcow",
> +    "qcow2",
> +    "qed",
> +    "vmdk",
> +    "vpc",
> +    "fat",
> +    "vhd",
> +    NULL
> +};

I'm not convinced we actually need this lookup table. The libvirt-gconfig
library defines a formal gobject enum type for GVirConfigDomainDiskFormat
which lets you do value <-> string lookups / conversions. eg

    GEnumClass *klass = g_type_class_ref(GVIR_CONFIG_DOMAIN_DISK_FORMAT);
    GEnumValue *value = g_enum_get_value(klass,
                                         GVIR_CONFIG_DOMAIN_DISK_FORMAT_QCOW2);

value->value_nick now contains the string 'qcow2'

> +
> +gint gvir_sandbox_util_guess_image_format(const gchar *path){

We ought to have a GError ** parameter here to return an error message.

> +
> +    gchar *tmp;
> +
> +    if ((tmp = strchr(path, '.')) == NULL) {
> +        return -1;
> +    }
> +    tmp = tmp + 1;
> +
> +    if (strcmp(tmp, "img") == 0) {
> +        return GVIR_CONFIG_DOMAIN_DISK_FORMAT_RAW;
> +    }
> +
> +    return gvir_sandbox_util_disk_format_from_str(tmp);
> +}
> +
> +gint gvir_sandbox_util_disk_format_from_str(const gchar *value)

Same here with GError **

> +{
> +    gint i = 0;
> +
> +    while (FORMATS_STRINGS[i] != NULL) {
> +        if (strcmp(FORMATS_STRINGS[i], value) == 0)
> +            return i;
> +        i++;
> +    }
> +    return -1;
> +}
> +
> +const gchar *gvir_sandbox_util_disk_format_to_str(GVirConfigDomainDiskFormat format)
> +{
> +    return FORMATS_STRINGS[format];
> +}

This is redundant - the g_enum apis already let callers do this
conversion.

Regards, Daniel
Re: [libvirt] [PATCH v2 0/4] Add support for vhost-user with multi-queue
On 06/04/2015 01:04 PM, Martin Kletzander wrote:
> Also some tiny clean-up.
>
> Martin Kletzander (2):
>   conf: Ignore multiqueue with one queue.
>   qemu: Add capability for vhost-user multiqueue
>
> Maxime Leroy (2):
>   docs: Clarify that attribute name is not used for vhostuser
>   qemu: add multiqueue vhost-user support
>
>  docs/formatdomain.html.in                                  | 16 ++--
>  src/conf/domain_conf.c                                     |  3 ++-
>  src/qemu/qemu_capabilities.c                               |  6 ++
>  src/qemu/qemu_capabilities.h                               |  1 +
>  src/qemu/qemu_command.c                                    | 14 +-
>  ...tuser.args => qemuxml2argv-net-vhostuser-multiq.args}   |  6 +-
>  ...ostuser.xml => qemuxml2argv-net-vhostuser-multiq.xml}   |  6 ++
>  .../qemuxml2argv-tap-vhost-incorrect.xml                   |  6 ++
>  tests/qemuxml2argvtest.c                                   |  3 +++
>  .../qemuxml2xmlout-tap-vhost-incorrect.xml                 |  6 ++
>  10 files changed, 62 insertions(+), 5 deletions(-)
>  copy tests/qemuxml2argvdata/{qemuxml2argv-net-vhostuser.args => qemuxml2argv-net-vhostuser-multiq.args} (75%)
>  copy tests/qemuxml2argvdata/{qemuxml2argv-net-vhostuser.xml => qemuxml2argv-net-vhostuser-multiq.xml} (87%)
>
> --
> 2.4.2

ACK series with noted adjustments made.

John
[libvirt] [PATCH] maint: Remove control characters from LGPL license file
--- COPYING.LESSER | 18 +- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/COPYING.LESSER b/COPYING.LESSER index 4362b49..e5ab03e 100644 --- a/COPYING.LESSER +++ b/COPYING.LESSER @@ -55,7 +55,7 @@ modified by someone else and passed on, the recipients should know that what they have is not the original version, so that the original author's reputation will not be affected by problems that might be introduced by others. - + Finally, software patents pose a constant threat to the existence of any free program. We wish to make sure that a company cannot effectively restrict the users of a free program by obtaining a @@ -111,7 +111,7 @@ modification follow. Pay close attention to the difference between a work based on the library and a work that uses the library. The former contains code derived from the library, whereas the latter must be combined with the library in order to run. - + GNU LESSER GENERAL PUBLIC LICENSE TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION @@ -158,7 +158,7 @@ Library. You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee. - + 2. You may modify your copy or copies of the Library or any portion of it, thus forming a work based on the Library, and copy and distribute such modifications or work under the terms of Section 1 @@ -216,7 +216,7 @@ instead of to this License. (If a newer version than version 2 of the ordinary GNU General Public License has appeared, then you can specify that version instead if you wish.) Do not make any other change in these notices. - + Once this change is made in a given copy, it is irreversible for that copy, so the ordinary GNU General Public License applies to all subsequent copies and derivative works made from that copy. @@ -267,7 +267,7 @@ Library will still fall under Section 6.) distribute the object code for the work under the terms of Section 6. 
Any executables containing that work also fall under Section 6, whether or not they are linked directly with the Library itself. - + 6. As an exception to the Sections above, you may also combine or link a work that uses the Library with the Library to produce a work containing portions of the Library, and distribute that work @@ -329,7 +329,7 @@ restrictions of other proprietary libraries that do not normally accompany the operating system. Such a contradiction means you cannot use both them and the Library together in an executable that you distribute. - + 7. You may place library facilities that are a work based on the Library side-by-side in a single library together with other library facilities not covered by this License, and distribute such a combined @@ -370,7 +370,7 @@ subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. You are not responsible for enforcing compliance by third parties with this License. - + 11. If, as a consequence of a court judgment or allegation of patent infringement or for any other reason (not limited to patent issues), conditions are imposed on you (whether by court order, agreement or @@ -422,7 +422,7 @@ conditions either of that version or of any later version published by the Free Software Foundation. If the Library does not specify a license version number, you may choose any version ever published by the Free Software Foundation. - + 14. If you wish to incorporate parts of the Library into other free programs whose distribution conditions are incompatible with these, write to the author to ask for permission. For software which is @@ -456,7 +456,7 @@ SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. 
END OF TERMS AND CONDITIONS - + How to Apply These Terms to Your New Libraries If you develop a new library, and you want it to be of the greatest -- 2.4.2 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] Overhead for a default cpu cg placement scheme
On Thu, Jun 11, 2015 at 4:13 PM, Daniel P. Berrange berra...@redhat.com wrote: On Thu, Jun 11, 2015 at 04:06:59PM +0300, Andrey Korolyov wrote: On Thu, Jun 11, 2015 at 2:33 PM, Daniel P. Berrange berra...@redhat.com wrote: On Thu, Jun 11, 2015 at 02:16:50PM +0300, Andrey Korolyov wrote: On Thu, Jun 11, 2015 at 2:09 PM, Daniel P. Berrange berra...@redhat.com wrote: On Thu, Jun 11, 2015 at 01:50:24PM +0300, Andrey Korolyov wrote:
Hi Daniel, would it be possible to adopt an optional tunable for the virCgroup mechanism that disables nested (per-thread) cgroup creation? Those bring visible overhead for many-threaded guest workloads, almost 5% in a non-congested host CPU state, primarily because the host scheduler has to make many more decisions with those cgroups than without them. We also experienced a lot of host lockups with the currently used cgroup placement and disabled nested behavior a couple of years ago. Though the current patch simply carves out the mentioned behavior, leaving only top-level per-machine cgroups, it can serve as a base for an upstream change after some adaptation; that's why I'm asking about the chance of its acceptance. This message is a kind of 'request for a feature': it can either be accepted/dropped from our side, or someone may give a hand and redo it from scratch. The detailed benchmarks relate to a 3.10.y host; if anyone is interested in the numbers for the latest stable, I can update those.
When you say nested cgroup creation, are you referring to the modern libvirt hierarchy, or the legacy hierarchy - as described here: http://libvirt.org/cgroups.html The current libvirt setup used for a year or so now is much shallower than previously, to the extent that we'd consider performance problems with it to be the job of the kernel to fix.
Thanks, I'm referring to the 'new nested' hierarchy for the overhead mentioned above. The host crashes I mentioned happened with the old hierarchy a while back; I forgot to mention that.
Despite the flattening of the topology in the current scheme, it should be possible to disable fine-grained group creation for the VM threads for users who don't need per-vcpu CPU pinning/accounting (the overhead is caused by the placement of the cpu cgroup, not by the accounting/pinning ones; I'm assuming equal distribution across all nested-aware cgroup types with such disablement) - that's the point for now.
Ok, so the per-vCPU cgroups are used for a couple of things:
- Setting scheduler tunables - period/quota/shares/etc
- Setting CPU pinning
- Setting NUMA memory pinning
In addition to the per-vCPU cgroup, we have one cgroup for each I/O thread, and also one more for general QEMU emulator threads. In the case of CPU pinning we already have an automatic fallback to sched_setaffinity if the CPUSET controller isn't available. We could in theory start off without the per-vCPU/emulator/I/O cgroups and only create them when the feature is actually used. The concern I would have though is that changing the cgroups layout on the fly may cause unexpected side effects in the behaviour of the VM. More critically, there would be a lot of places in the code where we would need to deal with this, which could hurt maintainability. How confident are you that the performance problems you see are inherent to the actual use of the cgroups, and not instead the result of some particular bad choice of default parameters we might have left in the cgroups? In general I'd want to try to eliminate the perf impact before we consider the complexity of disabling this feature. Regards, Daniel
Hm, what are you proposing to begin with in testing terms? By my understanding, the excessive cgroup usage along with small scheduler quanta *will* lead to some overhead anyway.
Let's look at the numbers, which I will bring tomorrow; the mentioned five percent was caught on a guest 'perf numa xxx' for different kinds of mappings and host behavior (post-3.8): memory automigration on/off, a kind of 'numa passthrough' (grouping vcpu threads according to the host and emulated guest NUMA topologies), and totally scattered and unpinned threads within a single and within multiple NUMA nodes. As the result for 3.10.y, there was a five-percent difference between the best-performing case with thread-level cpu cgroups and a 'totally scattered' case on a simple mid-range two-headed node. If you think that the choice of an emulated workload is wrong, please let me know; I was afraid that a non-synthetic workload in the guest might suffer from a range of side factors and therefore chose perf for this task.
Benchmarking isn't my area of expertise, but you should be able to just disable the CPUSET controller entirely in qemu.conf. If we got some comparative results with and without CPUSET, that'd be an interesting place to start. If it
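For a concrete starting point on such a comparison: the set of controllers libvirt is allowed to use is configurable in /etc/libvirt/qemu.conf. A sketch - the exact list of controllers to keep is a judgment call, not something dictated by this thread:

```
# /etc/libvirt/qemu.conf
# Leaving "cpuset" out of this list stops libvirt from using the cpuset
# controller; CPU pinning then falls back to sched_setaffinity() as
# Daniel describes above.
cgroup_controllers = [ "cpu", "devices", "memory", "blkio", "cpuacct" ]
```

libvirtd needs a restart for the change to take effect, and already-running guests keep their existing cgroup layout until restarted.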
Re: [libvirt] [sandbox PATCH 2/3] Add configuration object for disk support
On Wed, Jun 10, 2015 at 01:40:09PM +0200, Eren Yagdiran wrote: Add the config gobject, functions to store and load the new configuration fragments, and a test. This will allow creating sandboxes with an attached disk, with a parameter formatted like file:hda=/source/file.qcow2,format=qcow2
+/**
+ * gvir_sandbox_config_add_disk_strv:
+ * @config: (transfer none): the sandbox config
+ * @disks: (transfer none)(array zero-terminated=1): the list of disks
+ *
+ * Parses @disks whose elements are in the format TYPE:TARGET=SOURCE,FORMAT=FORMAT
+ * creating #GVirSandboxConfigMount instances for each element. For
+ * example
+ *
+ * - file:hda=/var/lib/sandbox/demo/tmp.qcow2,format=qcow2
+ */
One of the goals of the libvirt sandbox code is to insulate apps from needing to know hypervisor-specific differences. The guest-side disk device name is one such big difference. Many hypervisors, including KVM, will not even honour requested names - you just get whatever name the guest decides to give you. Essentially the only thing that libvirt guarantees is the disk ordering, ie if you configure two disks, one with hda and one with hdb, libvirt will ensure hda appears before hdb on the bus or controller. So I don't think we should include the target device name in our configuration syntax here. We should just document that disks will be added to the guest in the same order that you supply them on the virt-sandbox command line. The actual device names will vary according to the hypervisor and guest OS.
+gboolean gvir_sandbox_config_add_disk_strv(GVirSandboxConfig *config,
+                                           gchar **disks,
+                                           GError **error)
+{
+    gsize i = 0;
+    while (disks && disks[i]) {
+        if (!gvir_sandbox_config_add_disk_opts(config,
+                                               disks[i],
+                                               error))
+            return FALSE;
+        i++;
+    }
+    return TRUE;
+}
Regards, Daniel
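The option syntax under review splits into three useful fields once the guest-side target name is dropped, as Daniel suggests. The following is purely a hypothetical sketch - `parse_disk_spec` and its buffer layout are invented for illustration and are not the real `gvir_sandbox_config_add_disk_opts()`; it ignores the `hda=` portion and keeps only type, source, and format:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical parser for "TYPE:TARGET=SOURCE,format=FORMAT".
 * Per the review, the TARGET name is parsed past but discarded:
 * only disk ordering is guaranteed by libvirt, not device names. */
static int
parse_disk_spec(const char *spec, char *type, char *source, char *format,
                size_t len)
{
    char buf[256];
    char *colon, *eq, *comma;

    if (strlen(spec) >= sizeof(buf))
        return -1;
    strcpy(buf, spec);

    if (!(colon = strchr(buf, ':')))
        return -1;                     /* no TYPE prefix */
    *colon = '\0';
    snprintf(type, len, "%s", buf);

    /* Skip the guest-side target name, e.g. "hda=" */
    if (!(eq = strchr(colon + 1, '=')))
        return -1;

    if ((comma = strstr(eq + 1, ",format="))) {
        *comma = '\0';                 /* terminate SOURCE */
        snprintf(format, len, "%s", comma + strlen(",format="));
    } else {
        snprintf(format, len, "raw"); /* assumed default when omitted */
    }
    snprintf(source, len, "%s", eq + 1);
    return 0;
}
```

A real implementation would also validate the TYPE keyword and reject empty fields; this sketch only shows how the fields separate.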
[libvirt] [PATCH] scsi: Need to translate disk source pool in config attach path
https://bugzilla.redhat.com/show_bug.cgi?id=1228007
When attaching a scsi volume lun via the attach-device --config or --persistent options, there was no translation of the source pool like there was for the live path; thus the attempt to modify the config would fail, since not enough was known about the disk.
Signed-off-by: John Ferlan jfer...@redhat.com
---
 src/qemu/qemu_driver.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 34e5581..6bb8549 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -8016,7 +8016,8 @@ qemuDomainUpdateDeviceLive(virConnectPtr conn,
 static int
 qemuDomainAttachDeviceConfig(virQEMUCapsPtr qemuCaps,
                              virDomainDefPtr vmdef,
-                             virDomainDeviceDefPtr dev)
+                             virDomainDeviceDefPtr dev,
+                             virConnectPtr conn)
 {
     virDomainDiskDefPtr disk;
     virDomainNetDefPtr net;
@@ -8033,6 +8034,8 @@ qemuDomainAttachDeviceConfig(virQEMUCapsPtr qemuCaps,
                            _("target %s already exists"), disk->dst);
             return -1;
         }
+        if (virStorageTranslateDiskSourcePool(conn, disk) < 0)
+            return -1;
         if (qemuCheckDiskConfig(disk) < 0)
             return -1;
         if (virDomainDiskInsert(vmdef, disk))
@@ -8501,7 +8504,8 @@ static int qemuDomainAttachDeviceFlags(virDomainPtr dom, const char *xml,
                                  VIR_DOMAIN_DEVICE_ACTION_ATTACH) < 0)
             goto endjob;

-        if ((ret = qemuDomainAttachDeviceConfig(qemuCaps, vmdef, dev)) < 0)
+        if ((ret = qemuDomainAttachDeviceConfig(qemuCaps, vmdef, dev,
+                                                dom->conn)) < 0)
             goto endjob;
     }
--
2.1.0
[libvirt] Accessing libvirtd remotely as non-root user
I manage libvirtd on a few remote machines, and my security policies require me to disable root login via SSH. Up to this point, I've been using root due to the systems being in staging, but this is the final step before they're moved to production. What is the currently prescribed method of connecting virt-manager or virsh to a remote system with a non-root account? I keep getting "authentication failed: no agent is available to authenticate" with a user that is in the kvm and qemu groups on the systems I've tried using the ssh transport. Thanks in advance. Dan -- Dan Mossor, RHCSA Systems Engineer Fedora Server WG | Fedora KDE WG | Fedora QA Team Fedora Infrastructure Apprentice FAS: dmossor IRC: danofsatx San Antonio, Texas, USA
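For reference, on polkit-based distros the usual non-root setup is to grant the libvirt management action to a dedicated group rather than relying on kvm/qemu group membership. A sketch, assuming a group named "libvirt" and a rules-file path that may differ per distro:

```js
// /etc/polkit-1/rules.d/50-libvirt.rules (path and group name are
// assumptions) - members of the "libvirt" group may manage VMs over
// the libvirtd UNIX socket without authenticating as root.
polkit.addRule(function(action, subject) {
    if (action.id == "org.libvirt.unix.manage" &&
        subject.isInGroup("libvirt")) {
        return polkit.Result.YES;
    }
});
```

With a rule like that in place, a non-root connection looks like `virsh -c qemu+ssh://user@host/system list`. The "no agent is available to authenticate" error itself usually means polkit wanted interactive authorization and no polkit agent was running in the non-graphical SSH session.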
Re: [libvirt] Problem with setting up KVM guests to use HugePages
With the older qemu-1.5.3, I can start the guest by removing the following lines from the XML config file, and hugepages are correctly used:
<numa>
  <cell id='0' cpus='0-15' memory='67108864'/>
  <cell id='1' cpus='16-31' memory='67108864'/>
</numa>
I believe these lines were used to create NUMA nodes for the guest. Removing them means no NUMA nodes for the guest, I guess. There is one thing I don't quite understand though. In the XML file, I kept
<numatune>
  <memory mode='strict' nodeset='0-1'/>
</numatune>
Since no NUMA nodes were created, how come this section can pass? Then what is the nodeset here?
On Thu, Jun 11, 2015 at 10:27 AM, Clarylin L clear...@gmail.com wrote: Hi Michal, I also tried the other option you mentioned ("The other option you have is to not use guest NUMA nodes, in which case global -mem-path can be used.") by removing from the XML
<numatune>
  <memory mode='strict' nodeset='0-1'/>
</numatune>
while keeping the older qemu-1.5.3, which does not support the <page> element; it still complained about not enough memory when starting the guest. Did I miss something?
On Thu, Jun 11, 2015 at 12:00 AM, Michal Privoznik mpriv...@redhat.com wrote: [please keep the list CC'ed] On 10.06.2015 20:09, Clarylin L wrote: Hi Michal, Thanks a lot. If 100 hugepages are pre-allocated, the guest can start without decreasing the number of hugepages. Since the guest requires 128 hugepages, it's kind of expected that the guest would not take memory from hugepages. Before guest start:
[root@local ~]# cat /proc/meminfo | grep uge
AnonHugePages:         0 kB
HugePages_Total:     100
HugePages_Free:      100
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:    1048576 kB
After:
[root@local ~]# cat /proc/meminfo | grep uge
AnonHugePages:  134254592 kB
HugePages_Total:     100
HugePages_Free:      100
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:    1048576 kB
There is no -mem-prealloc or -mem-path option in the qemu command.
And there can't be. From the command line below, you are defining 2 NUMA nodes for your guest.
In order to instruct qemu to back their memory by huge pages you need it to support the memory-backend-file object, which was introduced in qemu-2.1.0. The other option you have is to not use guest NUMA nodes, in which case the global -mem-path can be used.
[root@local ~]# ps -ef | grep qemu
qemu      3403     1 99 17:42 ?        00:36:42 /usr/libexec/qemu-kvm -name qvpc-di-03-sf -S -machine pc-i440fx-rhel7.0.0,accel=kvm,usb=off -cpu host -m 131072 -realtime mlock=off -smp 32,sockets=2,cores=16,threads=1 -numa node,nodeid=0,cpus=0-15,mem=65536 -numa node,nodeid=1,cpus=16-31,mem=65536 -uuid e1b72349-4a0b-4b91-aedc-fd34e92251e4 -smbios type=1,serial=SCALE-SLOT-03 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/qvpc-di-03-sf.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/var/lib/libvirt/images/asr5700/qvpc-di-03-sf-hda.img,if=none,id=drive-ide0-0-0,format=qcow2 -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=2 -drive if=none,id=drive-ide0-1-0,readonly=on,format=raw -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1 -vnc 127.0.0.1:0 -vga cirrus -device i6300esb,id=watchdog0,bus=pci.0,addr=0x3 -watchdog-action reset -device vfio-pci,host=08:00.0,id=hostdev0,bus=pci.0,addr=0x5 -device vfio-pci,host=09:00.0,id=hostdev1,bus=pci.0,addr=0x6 -device vfio-pci,host=0a:00.0,id=hostdev2,bus=pci.0,addr=0x7 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 -msg timestamp=on
If 140 hugepages are preallocated, the guest cannot start and it complains about not enough memory.
The libvirt version is shown as follows:
virsh # version
Compiled against library: libvirt 1.2.8
Using library: libvirt 1.2.8
Using API: QEMU 1.2.8
Running hypervisor: QEMU 1.5.3
Also, the guest configuration contains a numa section. The hugepages are uniformly distributed to two nodes. In this case, do I need to make additional configurations to enable the usage of hugepages?
<numatune>
  <memory mode='strict' nodeset='0-1'/>
</numatune>
This says that all the memory for your guests should be pinned onto host nodes 0-1. If you want to be more specific, you can explicitly wire guest NUMA nodes onto host NUMA nodes in a 1:N relationship (where N can even be 1, in which case you will get 1:1), e.g.:
<memnode cellid='0' mode='preferred' nodeset='0'/>
Michal
[libvirt] [PATCH v2 2/2] lxc: properly clean up qemu-nbd
Add the qemu-nbd tasks to the container cgroup to make sure they will be killed when the container is stopped. In order to reliably get the qemu-nbd tasks' PIDs, we use /sys/devices/virtual/block/DEV/pid, since qemu-nbd daemonizes itself.
---
 src/lxc/lxc_controller.c | 56 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 56 insertions(+)

diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c
index efbe71f..9b6f0c8 100644
--- a/src/lxc/lxc_controller.c
+++ b/src/lxc/lxc_controller.c
@@ -107,6 +107,9 @@ struct _virLXCController {
     pid_t initpid;

+    size_t nnbdpids;
+    pid_t *nbdpids;
+
     size_t nveths;
     char **veths;

@@ -283,6 +286,8 @@ static void virLXCControllerFree(virLXCControllerPtr ctrl)
     virObjectUnref(ctrl->server);
     virLXCControllerFreeFuse(ctrl);

+    VIR_FREE(ctrl->nbdpids);
+
     virCgroupFree(&ctrl->cgroup);

     /* This must always be the last thing to be closed */
@@ -525,6 +530,38 @@ static int virLXCControllerSetupNBDDeviceDisk(virDomainDiskDefPtr disk)
     return 0;
 }

+static int virLXCControllerAppendNBDPids(virLXCControllerPtr ctrl,
+                                         const char *dev)
+{
+    char *pidpath = NULL;
+    pid_t *pids = NULL;
+    size_t npids = 0;
+    size_t i;
+    int ret = -1;
+    pid_t pid;
+
+    if (!STRPREFIX(dev, "/dev/") ||
+        virAsprintf(&pidpath, "/sys/devices/virtual/block/%s/pid", dev + 5) < 0)
+        goto cleanup;
+
+    if (virPidFileReadPath(pidpath, &pid) < 0)
+        goto cleanup;
+
+    if (virProcessGetPids(pid, &npids, &pids) < 0)
+        goto cleanup;
+
+    for (i = 0; i < npids; i++) {
+        if (VIR_APPEND_ELEMENT(ctrl->nbdpids, ctrl->nnbdpids, pids[i]) < 0)
+            goto cleanup;
+    }
+
+    ret = 0;
+
+ cleanup:
+    VIR_FREE(pids);
+    VIR_FREE(pidpath);
+    return ret;
+}

 static int virLXCControllerSetupLoopDevices(virLXCControllerPtr ctrl)
 {
@@ -570,6 +607,12 @@ static int virLXCControllerSetupLoopDevices(virLXCControllerPtr ctrl)
         } else if (fs->fsdriver == VIR_DOMAIN_FS_DRIVER_TYPE_NBD) {
             if (virLXCControllerSetupNBDDeviceFS(fs) < 0)
                 goto cleanup;

+            /* The NBD device will be cleaned up when the cgroup ends.
+             * For this we need to remember the qemu-nbd pid and add it to
+             * the cgroup. */
+            if (virLXCControllerAppendNBDPids(ctrl, fs->src) < 0)
+                goto cleanup;
         } else {
             virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
                            _("fs driver %s is not supported"),
@@ -629,6 +672,12 @@ static int virLXCControllerSetupLoopDevices(virLXCControllerPtr ctrl)
             }
             if (virLXCControllerSetupNBDDeviceDisk(disk) < 0)
                 goto cleanup;

+            /* The NBD device will be cleaned up when the cgroup ends.
+             * For this we need to remember the qemu-nbd pid and add it to
+             * the cgroup. */
+            if (virLXCControllerAppendNBDPids(ctrl, virDomainDiskGetSource(disk)) < 0)
+                goto cleanup;
         } else {
             virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
                            _("disk driver %s is not supported"),
@@ -781,6 +830,7 @@ static int virLXCControllerSetupCgroupLimits(virLXCControllerPtr ctrl)
     virBitmapPtr auto_nodeset = NULL;
     int ret = -1;
     virBitmapPtr nodeset = NULL;
+    size_t i;

     VIR_DEBUG("Setting up cgroup resource limits");
@@ -798,6 +848,12 @@ static int virLXCControllerSetupCgroupLimits(virLXCControllerPtr ctrl)
     if (virCgroupAddTask(ctrl->cgroup, getpid()) < 0)
         goto cleanup;

+    /* Add all qemu-nbd tasks to the cgroup */
+    for (i = 0; i < ctrl->nnbdpids; i++) {
+        if (virCgroupAddTask(ctrl->cgroup, ctrl->nbdpids[i]) < 0)
+            goto cleanup;
+    }
+
     if (virLXCCgroupSetup(ctrl->def, ctrl->cgroup, nodeset) < 0)
         goto cleanup;
--
2.1.4
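The sysfs lookup the patch depends on is simple to demonstrate in isolation: once the daemonized qemu-nbd has attached /dev/nbd0, the kernel exposes its PID in /sys/devices/virtual/block/nbd0/pid. The helper below is a simplified, invented stand-in for libvirt's virPidFileReadPath(), shown only to illustrate the mechanism:

```c
#include <assert.h>
#include <stdio.h>

/* Read a decimal PID from a sysfs-style pid file.
 * Returns 0 on success, -1 on open/parse failure. */
static int
read_pid_file(const char *path, long *pid)
{
    FILE *fp = fopen(path, "r");
    int ok;

    if (!fp)
        return -1;
    ok = (fscanf(fp, "%ld", pid) == 1);
    fclose(fp);
    return ok ? 0 : -1;
}
```

In the patch this PID then seeds virProcessGetPids(), since qemu-nbd may have spawned further threads that all need to land in the container cgroup.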
[libvirt] [PATCH v2 1/2] Introduce QEMU_CAPS_ARM_VIRT_PCI
This capability specifies that the "virt" machine on ARM has a PCI controller. It is enabled when the QEMU version is at least 2.3.0.
Signed-off-by: Pavel Fedin p.fe...@samsung.com
---
 src/qemu/qemu_capabilities.c | 5 +++++
 src/qemu/qemu_capabilities.h | 1 +
 2 files changed, 6 insertions(+)

diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index ca7a7c2..2eccc97 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -285,6 +285,7 @@ VIR_ENUM_IMPL(virQEMUCaps, QEMU_CAPS_LAST,
               "dea-key-wrap",
               "pci-serial",
               "aarch64-off",
+              "arm-virt-pci",
     );
@@ -1330,6 +1331,10 @@ virQEMUCapsComputeCmdFlags(const char *help,
         virQEMUCapsSet(qemuCaps, QEMU_CAPS_VNC_SHARE_POLICY);
     }

+    if (version >= 2003000) {
+        virQEMUCapsSet(qemuCaps, QEMU_CAPS_ARM_VIRT_PCI);
+    }
+
     return 0;
 }
diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h
index b5a7770..3c1a8b9 100644
--- a/src/qemu/qemu_capabilities.h
+++ b/src/qemu/qemu_capabilities.h
@@ -229,6 +229,7 @@ typedef enum {
     QEMU_CAPS_DEA_KEY_WRAP = 187, /* -machine dea_key_wrap */
     QEMU_CAPS_DEVICE_PCI_SERIAL = 188, /* -device pci-serial */
     QEMU_CAPS_CPU_AARCH64_OFF = 189, /* -cpu ...,aarch64=off */
+    QEMU_CAPS_ARM_VIRT_PCI = 190, /* ARM 'virt' machine has PCI bus */

     QEMU_CAPS_LAST, /* this must always be the last item */
 } virQEMUCapsFlags;
--
1.9.5.msysgit.0
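The magic constant in `version >= 2003000` follows libvirt's usual encoding of release numbers as major * 1,000,000 + minor * 1,000 + micro, so the check reads as "QEMU >= 2.3.0". A minimal sketch of that encoding (the helper name is invented for illustration):

```c
#include <assert.h>

/* Encode a three-part version the way libvirt's capability checks
 * compare QEMU versions: major * 1000000 + minor * 1000 + micro. */
static unsigned long
version_encode(unsigned int major, unsigned int minor, unsigned int micro)
{
    return major * 1000000UL + minor * 1000UL + micro;
}
```

One consequence, noted in later review feedback on patches like this, is that version-number gating misses backports; probing for the actual device is generally preferred when possible.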
[libvirt] [PATCH v2 0/2] Allow PCI virtio on ARM virt machine
The virt machine in qemu since v2.3.0 has a generic PCI host controller and can use PCI devices. This provides a performance improvement, as well as vhost-net with irqfd support for virtio-net. However, libvirt still insists that virtio devices attached to the virt machine have MMIO bindings. This patch allows using both. If the user doesn't specify <address type='virtio-mmio'/>, PCI will be used by default.
Changes since v1:
- Added capability based on qemu version number
- Recognize also the "virt-" prefix
Pavel Fedin (2): Introduce QEMU_CAPS_ARM_VIRT_PCI Allow PCI virtio on ARM virt machine
 src/qemu/qemu_capabilities.c |  5 +++++
 src/qemu/qemu_capabilities.h |  1 +
 src/qemu/qemu_command.c      | 15 ++++++++++++---
 3 files changed, 18 insertions(+), 3 deletions(-)
--
1.9.5.msysgit.0
[libvirt] [PATCH v2 2/2] Allow PCI virtio on ARM virt machine
Signed-off-by: Pavel Fedin p.fe...@samsung.com
---
 src/qemu/qemu_command.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index 0a6d92f..2acdc6a 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -457,7 +457,7 @@ qemuDomainSupportsNicdev(virDomainDefPtr def,
     /* non-virtio ARM nics require legacy -net nic */
     if (((def->os.arch == VIR_ARCH_ARMV7L) ||
          (def->os.arch == VIR_ARCH_AARCH64)) &&
-        net->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_MMIO)
+        strcmp(net->model, "virtio"))
         return false;

     return true;
@@ -1375,8 +1375,9 @@ qemuDomainAssignARMVirtioMMIOAddresses(virDomainDefPtr def,
     if (((def->os.arch == VIR_ARCH_ARMV7L) ||
          (def->os.arch == VIR_ARCH_AARCH64)) &&
         (STRPREFIX(def->os.machine, "vexpress-") ||
-         STREQ(def->os.machine, "virt") ||
-         STRPREFIX(def->os.machine, "virt-")) &&
+         (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_ARM_VIRT_PCI) &&
+          (STREQ(def->os.machine, "virt") ||
+           STRPREFIX(def->os.machine, "virt-")))) &&
         virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE_VIRTIO_MMIO)) {
         qemuDomainPrimeVirtioDeviceAddresses(
             def, VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_MMIO);
@@ -2498,6 +2499,14 @@ qemuAssignDevicePCISlots(virDomainDefPtr def,
             VIR_DOMAIN_DEVICE_ADDRESS_TYPE_CCW)
             continue;

+        /* ARM virt machine can also have virtio-mmio devices */
+        if (((def->os.arch == VIR_ARCH_ARMV7L) ||
+             (def->os.arch == VIR_ARCH_AARCH64)) &&
+            (STREQ(def->os.machine, "virt") ||
+             STRPREFIX(def->os.machine, "virt-")) &&
+            def->disks[i]->info.type == VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_MMIO)
+            continue;
+
         if (def->disks[i]->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE) {
             virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
                            _("virtio disk cannot have an address of type '%s'"),
--
1.9.5.msysgit.0
Re: [libvirt] [PATCH v3 00/24] Add support for migration events
On Wed, Jun 10, 2015 at 11:16:29 -0400, John Ferlan wrote: On 06/10/2015 11:06 AM, Jiri Denemark wrote: On Wed, Jun 10, 2015 at 10:27:11 -0400, John Ferlan wrote: On 06/10/2015 09:42 AM, Jiri Denemark wrote: QEMU will soon (patches are available on qemu-devel) get support for migration events, which will finally allow us to get rid of polling query-migrate every 50ms. However, we first need to be able to wait for all events related to migration (migration status changes, block job events, async abort requests) at once. This series prepares the infrastructure and uses it to switch all polling loops in migration code to pthread_cond_wait. https://bugzilla.redhat.com/show_bug.cgi?id=1212077
Version 3 (see individual patches for details):
- most of the series has been ACKed in v2
- "qemu: Use domain condition for synchronous block jobs" was split in 3 patches for easier review
- minor changes requested in v2 review
Version 2 (see individual patches for details):
- rewritten using per-domain condition variable
- enhanced to fully support the migration events
Just ran this through my Coverity checker - only one issue: RESOURCE_LEAK in qemuMigrationRun:
4235     if (qemuMigrationCheckJobStatus(driver, vm,
4236                                     QEMU_ASYNC_JOB_MIGRATION_OUT) < 0)
(4) Event if_end: End of if statement
4237         goto cancel;
4238
(5) Event open_fn: Returning handle opened by accept. (6) Event var_assign: Assigning: fd = handle returned from accept(spec->dest.unix_socket.sock, __SOCKADDR_ARG({ .__sockaddr__ = NULL}), NULL). (7) Event cond_false: Condition (fd = accept(spec->dest.unix_socket.sock, __SOCKADDR_ARG({ .__sockaddr__ = NULL}), NULL)) < 0, taking false branch. Also see events: [leaked_handle]
4239     while ((fd = accept(spec->dest.unix_socket.sock, NULL, NULL)) < 0) {
4240         if (errno == EAGAIN || errno == EINTR)
4241             continue;
Hmm, what an old and unused (except for some ancient QEMU versions) code path :-) However, this code is only executed if spec->destType == MIGRATION_DEST_UNIX, which only happens in the tunnelled migration path, which also sets spec.fwdType = MIGRATION_FWD_STREAM. Placing sa_assert(spec->fwdType == MIGRATION_FWD_STREAM); above the while loop makes Coverity happy. Feel free to push the sa_assert; it's completely unrelated to this series and has been there for ages. Jirka
[libvirt] [PATCH v2 1/2] Add virProcessGetPids to get all tasks of a process
This function gets all the PIDs listed in /proc/PID/task. This will be needed at least to move all qemu-nbd tasks to the container cgroup.
---
 src/libvirt_private.syms |  1 +
 src/util/virprocess.c    | 47 +++++++++++++++++++++++++++++++++++++++
 src/util/virprocess.h    |  2 ++
 3 files changed, 50 insertions(+)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index 6a95fb9..780cfbb 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -1986,6 +1986,7 @@ virProcessAbort;
 virProcessExitWithStatus;
 virProcessGetAffinity;
 virProcessGetNamespaces;
+virProcessGetPids;
 virProcessGetStartTime;
 virProcessKill;
 virProcessKillPainfully;
diff --git a/src/util/virprocess.c b/src/util/virprocess.c
index 7a79970..ce5e106 100644
--- a/src/util/virprocess.c
+++ b/src/util/virprocess.c
@@ -608,6 +608,53 @@ int virProcessGetAffinity(pid_t pid ATTRIBUTE_UNUSED,
 #endif /* HAVE_SCHED_GETAFFINITY */

+int virProcessGetPids(pid_t pid, size_t *npids, pid_t **pids)
+{
+    int ret = -1;
+    char *taskPath = NULL;
+    DIR *dir = NULL;
+    int value;
+    struct dirent *ent;
+
+    *npids = 0;
+    *pids = NULL;
+
+    if (virAsprintf(&taskPath, "/proc/%llu/task",
+                    (unsigned long long)pid) < 0)
+        goto cleanup;
+
+    if (!(dir = opendir(taskPath)))
+        goto cleanup;
+
+    while ((value = virDirRead(dir, &ent, taskPath)) > 0) {
+        pid_t tmp_pid;
+
+        /* Skip . and .. */
+        if (STRPREFIX(ent->d_name, "."))
+            continue;
+
+        if (virStrToLong_i(ent->d_name, NULL, 10, &tmp_pid) < 0)
+            goto cleanup;
+
+        if (VIR_APPEND_ELEMENT(*pids, *npids, tmp_pid) < 0)
+            goto cleanup;
+    }
+
+    if (value < 0)
+        goto cleanup;
+
+    ret = 0;
+
+ cleanup:
+    if (dir)
+        closedir(dir);
+    VIR_FREE(taskPath);
+    if (ret < 0)
+        VIR_FREE(*pids);
+    return ret;
+}
+
+
 int virProcessGetNamespaces(pid_t pid,
                             size_t *nfdlist,
                             int **fdlist)
diff --git a/src/util/virprocess.h b/src/util/virprocess.h
index c812882..86a633d 100644
--- a/src/util/virprocess.h
+++ b/src/util/virprocess.h
@@ -62,6 +62,8 @@ int virProcessGetAffinity(pid_t pid,
                           virBitmapPtr *map,
                           int maxcpu);

+int virProcessGetPids(pid_t pid, size_t *npids, pid_t **pids);
+
 int virProcessGetStartTime(pid_t pid,
                            unsigned long long *timestamp);
--
2.1.4
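The /proc/PID/task enumeration that virProcessGetPids relies on is easy to demonstrate outside libvirt. A standalone sketch (`count_tasks` is an invented name, not part of libvirt; Linux-only, since it depends on procfs):

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Count the thread IDs of a process by listing /proc/PID/task.
 * Returns the number of task entries, or -1 if the directory cannot
 * be opened (e.g. the process does not exist). */
static int
count_tasks(pid_t pid)
{
    char path[64];
    DIR *dir;
    struct dirent *ent;
    int count = 0;

    snprintf(path, sizeof(path), "/proc/%llu/task",
             (unsigned long long)pid);
    if (!(dir = opendir(path)))
        return -1;

    while ((ent = readdir(dir)) != NULL) {
        if (ent->d_name[0] == '.')   /* skip "." and ".." */
            continue;
        count++;
    }
    closedir(dir);
    return count;
}
```

Each entry name is a decimal thread ID; the patch parses them with virStrToLong_i and collects them into the returned array instead of merely counting.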
[libvirt] [PATCH v2 0/2] qemu-nbd cleanup, resent
Hi all, Note: I'm just resending this as the patches somehow never landed on the mailing list. Here is the very same patch, but split into two. Well, I also moved two comments around between v1 and v2.
Cédric Bosdonnat (2): Add virProcessGetPids to get all tasks of a process lxc: properly clean up qemu-nbd
 src/libvirt_private.syms |  1 +
 src/lxc/lxc_controller.c | 56 ++++++++++++++++++++++++++++++++++++++++
 src/util/virprocess.c    | 47 +++++++++++++++++++++++++++++++
 src/util/virprocess.h    |  2 ++
 4 files changed, 106 insertions(+)
--
2.1.4
Re: [libvirt] Problem with setting up KVM guests to use HugePages
[please keep the list CC'ed] On 10.06.2015 20:09, Clarylin L wrote: Hi Michal, Thanks a lot. If 100 hugepages are pre-allocated, the guest can start without decreasing the number of hugepages. Since the guest requires 128 hugepages, it's kind of expected that the guest would not take memory from hugepages. Before guest start:
[root@local ~]# cat /proc/meminfo | grep uge
AnonHugePages:         0 kB
HugePages_Total:     100
HugePages_Free:      100
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:    1048576 kB
After:
[root@local ~]# cat /proc/meminfo | grep uge
AnonHugePages:  134254592 kB
HugePages_Total:     100
HugePages_Free:      100
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:    1048576 kB
There is no -mem-prealloc or -mem-path option in the qemu command.
And there can't be. From the command line below, you are defining 2 NUMA nodes for your guest. In order to instruct qemu to back their memory by huge pages you need it to support the memory-backend-file object, which was introduced in qemu-2.1.0. The other option you have is to not use guest NUMA nodes, in which case the global -mem-path can be used.
[root@local ~]# ps -ef | grep qemu
qemu      3403     1 99 17:42 ?        00:36:42 /usr/libexec/qemu-kvm -name qvpc-di-03-sf -S -machine pc-i440fx-rhel7.0.0,accel=kvm,usb=off -cpu host -m 131072 -realtime mlock=off -smp 32,sockets=2,cores=16,threads=1 -numa node,nodeid=0,cpus=0-15,mem=65536 -numa node,nodeid=1,cpus=16-31,mem=65536 -uuid e1b72349-4a0b-4b91-aedc-fd34e92251e4 -smbios type=1,serial=SCALE-SLOT-03 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/qvpc-di-03-sf.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/var/lib/libvirt/images/asr5700/qvpc-di-03-sf-hda.img,if=none,id=drive-ide0-0-0,format=qcow2 -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=2 -drive if=none,id=drive-ide0-1-0,readonly=on,format=raw -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1 -vnc 127.0.0.1:0 -vga cirrus -device i6300esb,id=watchdog0,bus=pci.0,addr=0x3 -watchdog-action reset -device vfio-pci,host=08:00.0,id=hostdev0,bus=pci.0,addr=0x5 -device vfio-pci,host=09:00.0,id=hostdev1,bus=pci.0,addr=0x6 -device vfio-pci,host=0a:00.0,id=hostdev2,bus=pci.0,addr=0x7 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 -msg timestamp=on
If 140 hugepages are preallocated, the guest cannot start and it complains about not enough memory. The libvirt version is shown as follows:
virsh # version
Compiled against library: libvirt 1.2.8
Using library: libvirt 1.2.8
Using API: QEMU 1.2.8
Running hypervisor: QEMU 1.5.3
Also, the guest configuration contains a numa section. The hugepages are uniformly distributed to two nodes. In this case, do I need to make additional configurations to enable the usage of hugepages?
<numatune>
  <memory mode='strict' nodeset='0-1'/>
</numatune>
This says that all the memory for your guests should be pinned onto host nodes 0-1. If you want to be more specific, you can explicitly wire guest NUMA nodes onto host NUMA nodes in a 1:N relationship (where N can even be 1, in which case you will get 1:1), e.g.:
<memnode cellid='0' mode='preferred' nodeset='0'/>
Michal
Re: [libvirt] [PATCH] network: add an option to make dns public
On Wed, 2015-06-10 at 15:56 -0400, John Ferlan wrote: On 06/01/2015 07:54 AM, Cédric Bosdonnat wrote: In some use cases we don't want the virtual network's DNS to only listen on the vnet interface. Adding a publiclyAccessible attribute to the dns element in the configuration allows the DNS to listen on all interfaces. It simply disables the bind-dynamic option of dnsmasq for the network.
---
 docs/formatnetwork.html.in                           | 11 +++++++
 docs/schemas/network.rng                             | 15 ++++++---
 src/conf/network_conf.c                              |  6 ++++
 src/conf/network_conf.h                              |  1 +
 src/network/bridge_driver.c                          |  4 ++-
 tests/networkxml2confdata/nat-network-dns-hosts.conf |  1 -
 tests/networkxml2confdata/nat-network-dns-hosts.xml  |  2 +-
 7 files changed, 32 insertions(+), 8 deletions(-)

diff --git a/docs/formatnetwork.html.in b/docs/formatnetwork.html.in
index 6abed8f..8e43658 100644
--- a/docs/formatnetwork.html.in
+++ b/docs/formatnetwork.html.in
@@ -851,6 +851,17 @@
         DNS server.
       </p>

+      <p>
+        The dns element
+        can have an optional <code>publiclyAccessible</code>
+        attribute <span class="since">Since 1.2.17</span>.
+        If <code>publiclyAccessible</code> is yes, then the DNS server
+        will handle requests for all interfaces.
+        If <code>publiclyAccessible</code> is not set or no, the DNS
+        server will only handle requests for the interface of the virtual
+        network.
+      </p>
+
       Currently supported sub-elements of <code>&lt;dns&gt;</code> are:
       <dl>
         <dt><code>forwarder</code></dt>
diff --git a/docs/schemas/network.rng b/docs/schemas/network.rng
index 4edb6eb..f989625 100644
--- a/docs/schemas/network.rng
+++ b/docs/schemas/network.rng
@@ -244,12 +244,17 @@
            and other features in the dns element -->
       <optional>
         <element name="dns">
-          <optional>
-            <attribute name="forwardPlainNames">
-              <ref name="virYesNo"/>
-            </attribute>
-          </optional>
           <interleave>
+            <optional>
+              <attribute name="forwardPlainNames">
+                <ref name="virYesNo"/>
+              </attribute>
+            </optional>
+            <optional>
+              <attribute name="publiclyAccessible">
+                <ref name="virYesNo"/>
+              </attribute>
+            </optional>
Moving the attributes inside the interleave had me looking through other .rng's...
I'm no expert, but had thought they really only mattered for element's Hum, I'll try without moving it. I'm obviously no RNG expert either ;) zeroOrMore element name=forwarder attribute name=addrref name=ipAddr//attribute diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c index f4a9df0..99bac6d 100644 --- a/src/conf/network_conf.c +++ b/src/conf/network_conf.c @@ -1309,9 +1309,14 @@ virNetworkDNSDefParseXML(const char *networkName, size_t i; int ret = -1; xmlNodePtr save = ctxt-node; +char *publiclyAccessible = NULL; ctxt-node = node; +publiclyAccessible = virXPathString(string(./@publiclyAccessible), ctxt); +if (publiclyAccessible) +def-publiclyAccessible = virTristateBoolTypeFromString(publiclyAccessible); + forwardPlainNames = virXPathString(string(./@forwardPlainNames), ctxt); if (forwardPlainNames) { def-forwardPlainNames = virTristateBoolTypeFromString(forwardPlainNames); @@ -1410,6 +1415,7 @@ virNetworkDNSDefParseXML(const char *networkName, ret = 0; cleanup: +VIR_FREE(publiclyAccessible); VIR_FREE(forwardPlainNames); VIR_FREE(fwdNodes); VIR_FREE(hostNodes); diff --git a/src/conf/network_conf.h b/src/conf/network_conf.h index f69d999..f555b6b 100644 --- a/src/conf/network_conf.h +++ b/src/conf/network_conf.h @@ -136,6 +136,7 @@ struct _virNetworkDNSDef { virNetworkDNSSrvDefPtr srvs; size_t nfwds; char **forwarders; +int publiclyAccessible; /* enum virTristateBool */ }; typedef struct _virNetworkIpDef virNetworkIpDef; diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index d195085..c39b1a5 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -996,8 +996,10 @@ networkDnsmasqConfContents(virNetworkObjPtr network, * other than one of the virtual guests connected directly to * this network). This was added in response to CVE 2012-3411. */ +if (network-def-dns.publiclyAccessible != VIR_TRISTATE_BOOL_YES) +virBufferAddLit(configbuf, +
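To make the effect of the patch concrete, a network definition using the proposed attribute might look like the sketch below. Note that the publiclyAccessible attribute exists only in this patch, not in released libvirt at this point, and the forwarder address is purely illustrative:

```xml
<network>
  <name>default</name>
  <forward mode='nat'/>
  <!-- 'publiclyAccessible' is the attribute proposed by this patch;
       with 'yes', dnsmasq is started without bind-dynamic, so its DNS
       answers queries arriving on any host interface -->
  <dns publiclyAccessible='yes'>
    <forwarder addr='8.8.8.8'/>
  </dns>
  <ip address='192.168.122.1' netmask='255.255.255.0'/>
</network>
```

With publiclyAccessible absent or set to no, behavior is unchanged: the generated dnsmasq config keeps bind-dynamic and the DNS only serves the virtual network's own interface.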
Re: [libvirt] Problem with setting up KVM guests to use HugePages
On Wed, Jun 10, 2015 at 09:20:40PM +, Vivi L wrote:
Michal Privoznik mprivozn at redhat.com writes:
On 10.06.2015 01:05, Vivi L wrote:
Kashyap Chamarthy kchamart at redhat.com writes:

You might want to re-test by explicitly setting the 'page' element and
'size' attribute? From my test, I had something like this:

$ virsh dumpxml f21-vm | grep hugepages -B3 -A2
  <memory unit='KiB'>2000896</memory>
  <currentMemory unit='KiB'>200</currentMemory>
  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB' nodeset='0'/>
    </hugepages>
  </memoryBacking>
  <vcpu placement='static'>8</vcpu>

I haven't tested this exhaustively, but some basic test notes are here:
https://kashyapc.fedorapeople.org/virt/test-hugepages-with-libvirt.txt

Current QEMU does not support setting the page element. Could it be the
cause of my aforementioned problem?

unsupported configuration: huge pages per NUMA node are not supported
with this QEMU

So this explains why the memory for your guest is not backed by
hugepages.

I thought setting hugepages per NUMA node is a nice-to-have feature.
Is it required to enable the use of hugepages for the guest?

No, it should not be mandatory. You should be able to use

  <memoryBacking>
    <hugepages/>
  </memoryBacking>

with pretty much any KVM/QEMU version that exists. If that's broken
then it's a libvirt bug.

Regards,
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
Re: [libvirt] [RFC] get guest OS infos
On Thu, Jun 11, 2015 at 01:51:33PM +0800, zhang bo wrote:

Different OSes sometimes have different capabilities and behaviors, so
we have to distinguish between them. For example, our clients want to
send NMI interrupts to certain guests (e.g. Linux distributions), but
not others (e.g. Windows guests). They want to acquire a list like the
one below:
guest1: RHEL 7
guest2: RHEL 7
guest3: Ubuntu 12
guest4: Ubuntu 13
guest5: Windows 7
..
AFAIK, neither libvirt, nor OpenStack, nor QEMU has such a capability
of showing this guest OS info. Libvirt now supports showing host
capabilities and driver capabilities, but not an individual guest OS's
capabilities. We may refer to http://libvirt.org/formatdomaincaps.html
for more information. So, what's your opinion on adding such a feature
to libvirt and QEMU?

This is normally something the higher level management app will
remember and record. For example, RHEV/oVirt stores a record of the OS
when the guest is first provisioned. In OpenStack we are going to
permit the user to set an image property flag to specify the guest OS,
using libosinfo terminology:
http://specs.openstack.org/openstack/nova-specs/specs/liberty/approved/libvirt-hardware-policy-from-libosinfo.html

Regards,
Daniel
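As a sketch of how an upper management layer can record the guest OS without libvirt itself detecting it, tools in the virt-manager family store libosinfo metadata in the domain XML's metadata element. The namespace URI and OS id below follow the libosinfo convention but should be treated as illustrative rather than authoritative:

```xml
<domain type='kvm'>
  <name>guest1</name>
  <!-- illustrative: the management layer, not libvirt, records the OS
       identity here using the libosinfo metadata namespace; libvirt
       preserves the element but does not interpret it -->
  <metadata>
    <libosinfo:libosinfo xmlns:libosinfo="http://libosinfo.org/xmlns/libvirt/domain/1.0">
      <libosinfo:os id="http://redhat.com/rhel/7.0"/>
    </libosinfo:libosinfo>
  </metadata>
</domain>
```

A management app can then read this back (e.g. via virsh dumpxml) to decide per-guest policy such as whether sending an NMI is appropriate.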
Re: [libvirt] Problem with setting up KVM guests to use HugePages
On 11.06.2015 10:13, Daniel P. Berrange wrote:
On Wed, Jun 10, 2015 at 09:20:40PM +, Vivi L wrote:
Michal Privoznik mprivozn at redhat.com writes:
On 10.06.2015 01:05, Vivi L wrote:
Kashyap Chamarthy kchamart at redhat.com writes:

You might want to re-test by explicitly setting the 'page' element and
'size' attribute? From my test, I had something like this:

$ virsh dumpxml f21-vm | grep hugepages -B3 -A2
  <memory unit='KiB'>2000896</memory>
  <currentMemory unit='KiB'>200</currentMemory>
  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB' nodeset='0'/>
    </hugepages>
  </memoryBacking>
  <vcpu placement='static'>8</vcpu>

I haven't tested this exhaustively, but some basic test notes are here:
https://kashyapc.fedorapeople.org/virt/test-hugepages-with-libvirt.txt

Current QEMU does not support setting the page element. Could it be the
cause of my aforementioned problem?

unsupported configuration: huge pages per NUMA node are not supported
with this QEMU

So this explains why the memory for your guest is not backed by
hugepages.

I thought setting hugepages per NUMA node is a nice-to-have feature.
Is it required to enable the use of hugepages for the guest?

No, it should not be mandatory. You should be able to use

  <memoryBacking>
    <hugepages/>
  </memoryBacking>

with pretty much any KVM/QEMU version that exists. If that's broken
then it's a libvirt bug.

Unless hugepages are requested for guest NUMA nodes. In that case the
memory-backend-file object is required. From my investigation, this
seems to be the case here.

Michal
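For reference, the per-guest-NUMA-node case Michal describes arises roughly from a domain XML combination like the following sketch (memory sizes and CPU ranges are illustrative); binding a hugepage size to specific guest NUMA nodes is what forces libvirt to use QEMU's memory-backend-file object:

```xml
<!-- illustrative fragment of a domain definition -->
<memoryBacking>
  <hugepages>
    <!-- 'nodeset' ties the 2 MiB page size to guest NUMA node 0;
         this is the form that needs memory-backend-file in QEMU -->
    <page size='2048' unit='KiB' nodeset='0'/>
  </hugepages>
</memoryBacking>
<cpu>
  <numa>
    <cell id='0' cpus='0-3' memory='1048576'/>
  </numa>
</cpu>
```

Dropping the page element (leaving a bare hugepages element) avoids the per-node binding and should work with older QEMU, matching Daniel's point above.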
Re: [libvirt] [RFC] get guest OS infos
On 11.06.2015 07:51, zhang bo wrote:

Different OSes sometimes have different capabilities and behaviors, so
we have to distinguish between them. For example, our clients want to
send NMI interrupts to certain guests (e.g. Linux distributions), but
not others (e.g. Windows guests). They want to acquire a list like the
one below:
guest1: RHEL 7
guest2: RHEL 7
guest3: Ubuntu 12
guest4: Ubuntu 13
guest5: Windows 7
..
AFAIK, neither libvirt, nor OpenStack, nor QEMU has such a capability
of showing this guest OS info. Libvirt now supports showing host
capabilities and driver capabilities, but not an individual guest OS's
capabilities. We may refer to http://libvirt.org/formatdomaincaps.html
for more information. So, what's your opinion on adding such a feature
to libvirt and QEMU?
---
The solution I can see is:
1. Add a new qga command to the qemu guest agent, 'guest-get-osinfo',
   which gets the OS info via qemu-guest-agent inside the guest.

   { 'command': 'guest-get-osinfo',
     'returns': ['GuestOSInfo'] }

   { 'struct': 'GuestOSInfo',
     'data': {'distribution': 'GuestOSDistribution',
              'version': 'int',
              'arch': 'GuestOSArchType'} }

   An example JSON result:
   {"return": {"distribution": "RHEL", "version": 7, "arch": "x86_64"}}

2. Add new helper APIs for that qga command in libvirt:
   qemuAgentGetOSInfo()
3. When the guest starts up and its qemu-agent is running, call
   qemuAgentGetOSInfo() in libvirt.
4. Store the OS info obtained at step 3 in the guest's status and
   config file:

   <domainCapabilities>
     <path>/usr/bin/qemu-system-x86_64</path>
     <domain>kvm</domain>
     <machine>pc-i440fx-2.1</machine>
     <arch>x86_64</arch>
     <distribution>
       <type>RHEL</type>
       <version>7</version>
     </distribution>
     ...
   </domainCapabilities>

This is not going to fly. Firstly, domain capabilities are not on a
per-domain basis. So while one domain may run RHEL-7, which supports
NMI, another (running on the same hypervisor) may run a different OS
which does not support NMI. Secondly, domain capabilities are designed
to be consulted prior to constructing the domain XML, to expose which
devices are supported.
Thirdly, guest OS detection falls outside of libvirt's scope. It's an
upper layer that creates and installs new guests, and therefore that
layer is responsible for making such decisions.

Michal