Re: [libvirt] [PATCH] libxl: add support for soft reset
On 10/29/18 7:26 PM, Jim Fehlig wrote: The pvops Linux kernel implements machine_ops.crash_shutdown as static void xen_hvm_crash_shutdown(struct pt_regs *regs) { native_machine_crash_shutdown(regs); xen_reboot(SHUTDOWN_soft_reset); } but currently the libxl driver does not handle the soft reset shutdown event. As a result, the guest domain never proceeds past xen_reboot(), making it impossible for HVM domains to save a crash dump using kexec. This patch adds support for handling the soft reset event by calling libxl_domain_soft_reset() and re-enabling domain death events, which is similar to the xl tool handling of soft reset shutdown event. Signed-off-by: Jim Fehlig --- src/libxl/libxl_domain.c | 30 ++ 1 file changed, 30 insertions(+) diff --git a/src/libxl/libxl_domain.c b/src/libxl/libxl_domain.c index 0032b9dd11..295ec04a85 100644 --- a/src/libxl/libxl_domain.c +++ b/src/libxl/libxl_domain.c @@ -447,8 +447,10 @@ libxlDomainShutdownThread(void *opaque) virObjectEventPtr dom_event = NULL; libxl_shutdown_reason xl_reason = ev->u.domain_shutdown.shutdown_reason; libxlDriverConfigPtr cfg; +libxl_domain_config d_config; cfg = libxlDriverConfigGet(driver); +libxl_domain_config_init(_config); vm = virDomainObjListFindByID(driver->domains, ev->domid); if (!vm) { @@ -532,6 +534,33 @@ libxlDomainShutdownThread(void *opaque) * after calling libxl_domain_suspend() are handled by it's callers. */ goto endjob; +} else if (xl_reason == LIBXL_SHUTDOWN_REASON_SOFT_RESET) { +libxlDomainObjPrivatePtr priv = vm->privateData; +int res; I'm not sure why I have this useless local. I've squashed the below diff into my local branch. Regards, Jim diff --git a/src/libxl/libxl_domain.c b/src/libxl/libxl_domain.c index 295ec04a85..af5127bc2f 100644 --- a/src/libxl/libxl_domain.c +++ b/src/libxl/libxl_domain.c @@ -536,7 +536,6 @@ libxlDomainShutdownThread(void *opaque) goto endjob; } else if (xl_reason == LIBXL_SHUTDOWN_REASON_SOFT_RESET) { libxlDomainObjPrivatePtr priv = vm->privateData; -int res; if (libxl_retrieve_domain_configuration(cfg->ctx, vm->def->id, _config) != 0) { @@ -551,9 +550,8 @@ libxlDomainShutdownThread(void *opaque) priv->deathW = NULL; } -res = libxl_domain_soft_reset(cfg->ctx, _config, vm->def->id, - NULL, NULL); -if (res != 0) { +if (libxl_domain_soft_reset(cfg->ctx, _config, vm->def->id, +NULL, NULL) != 0) { VIR_ERROR(_("Failed to soft reset VM '%s'. Destroying VM"), vm->def->name); goto destroy; -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH] libxl: add support for soft reset
The pvops Linux kernel implements machine_ops.crash_shutdown as static void xen_hvm_crash_shutdown(struct pt_regs *regs) { native_machine_crash_shutdown(regs); xen_reboot(SHUTDOWN_soft_reset); } but currently the libxl driver does not handle the soft reset shutdown event. As a result, the guest domain never proceeds past xen_reboot(), making it impossible for HVM domains to save a crash dump using kexec. This patch adds support for handling the soft reset event by calling libxl_domain_soft_reset() and re-enabling domain death events, which is similar to the xl tool handling of soft reset shutdown event. Signed-off-by: Jim Fehlig --- src/libxl/libxl_domain.c | 30 ++ 1 file changed, 30 insertions(+) diff --git a/src/libxl/libxl_domain.c b/src/libxl/libxl_domain.c index 0032b9dd11..295ec04a85 100644 --- a/src/libxl/libxl_domain.c +++ b/src/libxl/libxl_domain.c @@ -447,8 +447,10 @@ libxlDomainShutdownThread(void *opaque) virObjectEventPtr dom_event = NULL; libxl_shutdown_reason xl_reason = ev->u.domain_shutdown.shutdown_reason; libxlDriverConfigPtr cfg; +libxl_domain_config d_config; cfg = libxlDriverConfigGet(driver); +libxl_domain_config_init(_config); vm = virDomainObjListFindByID(driver->domains, ev->domid); if (!vm) { @@ -532,6 +534,33 @@ libxlDomainShutdownThread(void *opaque) * after calling libxl_domain_suspend() are handled by it's callers. */ goto endjob; +} else if (xl_reason == LIBXL_SHUTDOWN_REASON_SOFT_RESET) { +libxlDomainObjPrivatePtr priv = vm->privateData; +int res; + +if (libxl_retrieve_domain_configuration(cfg->ctx, vm->def->id, +_config) != 0) { +VIR_ERROR(_("Failed to retrieve config for VM '%s'. " +"Unable to perform soft reset. Destroying VM"), + vm->def->name); +goto destroy; +} + +if (priv->deathW) { +libxl_evdisable_domain_death(cfg->ctx, priv->deathW); +priv->deathW = NULL; +} + +res = libxl_domain_soft_reset(cfg->ctx, _config, vm->def->id, + NULL, NULL); +if (res != 0) { +VIR_ERROR(_("Failed to soft reset VM '%s'. Destroying VM"), + vm->def->name); +goto destroy; +} +libxl_evenable_domain_death(cfg->ctx, vm->def->id, 0, >deathW); +libxl_domain_unpause(cfg->ctx, vm->def->id); +goto endjob; } else { VIR_INFO("Unhandled shutdown_reason %d", xl_reason); goto endjob; @@ -565,6 +594,7 @@ libxlDomainShutdownThread(void *opaque) virObjectEventStateQueue(driver->domainEventState, dom_event); libxl_event_free(cfg->ctx, ev); VIR_FREE(shutdown_info); +libxl_domain_config_dispose(_config); virObjectUnref(cfg); } -- 2.18.0 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 2/2] Revert "qemu: Forbid pinning vCPUs for TCG domain"
On 10/17/18 10:15 AM, Daniel P. Berrangé wrote: > This reverts commit 8b035c84d8a7362a87a95e6114b8e7f959685ed9. > > The MTTCG impl in QEMU does allow pinning vCPUs. > > When the guest is running we already check if pinning is > possible in the qemuDomainPinVcpuLive method, so this > check was adding no benefit. > > When the guest is not running, we cannot know whether the > subsequent launch will use MTTCG or TCG, so we must allow > the pinning request. If the guest does use TCG on the next > launch it will fail, but this is no worse than if the user > had done a virDomainDefineXML with an XML doc specifying > vCPU pinning. > > Signed-off-by: Daniel P. Berrangé > --- > src/qemu/qemu_driver.c | 7 --- > 1 file changed, 7 deletions(-) > While looking at another "== VIR_DOMAIN_VIRT_QEMU" check in virQEMUCapsAddCPUDefinitions, I wonder why/when/if it really mattered if qemuCaps->tcgCPUModels in this code... No matter, I agree let's remove it. Reviewed-by: John Ferlan John -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 1/2] qemu: fix recording of vCPU pids for MTTCG
On 10/17/18 10:15 AM, Daniel P. Berrangé wrote: > MTTCG is the new multi-threaded impl of TCG which follows > KVM in having one host OS thread per vCPU. Historically > we have discarded all PIDs reported for TCG guests, but > we must now selectively honour this data. > > We don't have anything in the domain XML that indicates > whether a guest is using TCG or MTTCG. While QEMU does > have an option (-accel tcg,thread=single|multi), it is > not desirable to expose this in libvirt. QEMU will > automatically use MTTCG when the host/guest architecture > pairing is known to be safe. Only developers of QEMU TCG > have a strong reason to override this logic. > > Thus we use two sanity checks to decide if the vCPU > PID information is usable. First we see if the PID > duplicates the main emulator PID, and second we see > if the PID duplicates any other vCPUs. > > Signed-off-by: Daniel P. Berrangé > --- > src/qemu/qemu_domain.c | 81 ++ > 1 file changed, 51 insertions(+), 30 deletions(-) > > diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c > index f00f1b3fdb..c7a0c03e3f 100644 > --- a/src/qemu/qemu_domain.c > +++ b/src/qemu/qemu_domain.c > @@ -10326,9 +10326,10 @@ qemuDomainRefreshVcpuInfo(virQEMUDriverPtr driver, > qemuDomainVcpuPrivatePtr vcpupriv; > qemuMonitorCPUInfoPtr info = NULL; > size_t maxvcpus = virDomainDefGetVcpusMax(vm->def); > -size_t i; > +size_t i, j; > bool hotplug; > bool fast; > +bool validTIDs = true; > int rc; > int ret = -1; > > @@ -10336,6 +10337,8 @@ qemuDomainRefreshVcpuInfo(virQEMUDriverPtr driver, > fast = virQEMUCapsGet(QEMU_DOMAIN_PRIVATE(vm)->qemuCaps, >QEMU_CAPS_QUERY_CPUS_FAST); > > +VIR_DEBUG("Maxvcpus %zu hotplug %d fast query %d", maxvcpus, hotplug, > fast); > + > if (qemuDomainObjEnterMonitorAsync(driver, vm, asyncJob) < 0) > return -1; > > @@ -10348,39 +10351,57 @@ qemuDomainRefreshVcpuInfo(virQEMUDriverPtr driver, > if (rc < 0) > goto cleanup; > > +/* > + * The query-cpus[-fast] commands return information > + * about the vCPUs, including the OS level PID that > + * is executing the vCPU. > + * > + * For KVM there is always a 1-1 mapping between > + * vCPUs and host OS PIDs. > + * > + * For TCG things are a little more complicated. > + * > + * - In some cases the vCPUs will all have the same > + *PID as the main emulator thread. > + * - In some cases the first vCPU will have a distinct > + *PID, but other vCPUs will share the emulator thread > + * > + * For MTTCG, things work the same as KVM, with each > + * vCPU getting its own PID. > + * > + * We use the Host OS PIDs for doing vCPU pinning > + * and reporting. The TCG data reporting will result > + * in bad behaviour such as pinning the wrong PID. > + * We must thus detect and discard bogus PID info > + * from TCG, while still honouring the modern MTTCG > + * impl which we can support. > + */ > +for (i = 0; i < maxvcpus && validTIDs; i++) { > +if (info[i].tid == vm->pid) { > +VIR_DEBUG("vCPU[%zu] PID %llu duplicates process", > + i, (unsigned long long)info[i].tid); > +validTIDs = false; > +} > + If !validTIDs does the next loop matter? IOW: Should the above section add a "continue;" since the loop exit would force the exit? Beyond that the logic and comments look reasonable. I assume since domain XML doesn't care whether MTTCG or TCG is used and things are handled under the covers by QEMU that means there's no migration or save/restore issues. Of course you have a much deeper understanding of the QEMU code than I do! The one other question I'd have is should validTIDs setting be done just once and saved perhaps in the domain private block? There is more than 1 caller (and *Launch can call twice). It's not like it's going to change, right? So doing the same loop from a hotplug path won't matter nor would subsequent reconnects or attaches. So perhaps the validTIDs should be a tristate that only needs to be checked when the value is UNKNOWN. It's not like the loop is that expensive since it's only numeric comparisons, so it doesn't matter. I suppose I can be easily convinced taking this route would be fine, but figured I'd ask. John > +for (j = 0; j < i; j++) { > +if (info[i].tid == info[j].tid) { > +VIR_DEBUG("vCPU[%zu] PID %llu duplicates vCPU[%zu]", > + i, (unsigned long long)info[i].tid, j); > +validTIDs = false; > +} > +} > + > +if (validTIDs) > +VIR_DEBUG("vCPU[%zu] PID %llu is valid", > + i, (unsigned long long)info[i].tid); > +} > + > +VIR_DEBUG("Extracting vCPU information validTIDs=%d", validTIDs); > for (i =
Re: [libvirt] [PATCH] nwfilter: intantiate filters on loading driver
On 10/15/18 4:26 AM, Nikolay Shirokovskiy wrote: > Before using filters binding filters instantiation was done by hypervisors > drivers initialization code (qemu was the only such hypervisor). Now qemu > reconnection done supposes it should be done by nwfilter driver probably. > Let's add this missing step. > > Signed-off-by: Nikolay Shirokovskiy > --- > src/nwfilter/nwfilter_driver.c | 3 +++ > 1 file changed, 3 insertions(+) > If there's research you've done where the instantiation was done before introduction of the nwfilter bindings that would be really helpful... I found that virNWFilterBuildAll was introduced in commit 3df907bfff. There were 2 callers for it: 1. virNWFilterTriggerRebuildImpl 2. nwfilterStateReload The former called as part of the virNWFilterConfLayerInit callback during nwfilterStateInitialize (about 50 lines earlier). So how does calling this now w/ @false help things during the state initialize processing? John > diff --git a/src/nwfilter/nwfilter_driver.c b/src/nwfilter/nwfilter_driver.c > index 1ee5162..1ab906f 100644 > --- a/src/nwfilter/nwfilter_driver.c > +++ b/src/nwfilter/nwfilter_driver.c > @@ -264,6 +264,9 @@ nwfilterStateInitialize(bool privileged, > if (virNWFilterBindingObjListLoadAllConfigs(driver->bindings, > driver->bindingDir) < 0) > goto error; > > +if (virNWFilterBuildAll(driver, false) < 0) > +goto error; > + > nwfilterDriverUnlock(); > > return 0; > -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH v2] qemu: Introduce caching whether /dev/kvm is accessible
Introduce caching whether /dev/kvm is usable as the QEMU user:QEMU group. This reduces the overhead of the QEMU capabilities cache lookup. Before this patch there were many fork() calls used for checking whether /dev/kvm is accessible. Now we store the result whether /dev/kvm is accessible or not and we only need to re-run the virFileAccessibleAs check if the ctime of /dev/kvm has changed. Suggested-by: Daniel P. Berrangé Signed-off-by: Marc Hartmayer --- src/qemu/qemu_capabilities.c | 54 ++-- 1 file changed, 52 insertions(+), 2 deletions(-) diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index e228f52ec0bb..85516954149b 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -3238,6 +3238,10 @@ struct _virQEMUCapsCachePriv { virArch hostArch; unsigned int microcodeVersion; char *kernelVersion; + +/* cache whether /dev/kvm is usable as runUid:runGuid */ +virTristateBool kvmUsable; +time_t kvmCtime; }; typedef struct _virQEMUCapsCachePriv virQEMUCapsCachePriv; typedef virQEMUCapsCachePriv *virQEMUCapsCachePrivPtr; @@ -3824,6 +3828,52 @@ virQEMUCapsSaveFile(void *data, } +/* Determine whether '/dev/kvm' is usable as QEMU user:QEMU group. */ +static bool +virQEMUCapsKVMUsable(virQEMUCapsCachePrivPtr priv) +{ +struct stat sb; +static const char *kvm_device = "/dev/kvm"; +virTristateBool value; +virTristateBool cached_value = priv->kvmUsable; +time_t kvm_ctime; +time_t cached_kvm_ctime = priv->kvmCtime; + +if (stat(kvm_device, ) < 0) { +virReportSystemError(errno, + _("Failed to stat %s"), kvm_device); +return false; +} +kvm_ctime = sb.st_ctime; + +if (kvm_ctime != cached_kvm_ctime) { +VIR_DEBUG("%s has changed (%lld vs %lld)", kvm_device, + (long long)kvm_ctime, (long long)cached_kvm_ctime); +cached_value = VIR_TRISTATE_BOOL_ABSENT; +} + +if (cached_value != VIR_TRISTATE_BOOL_ABSENT) +return cached_value == VIR_TRISTATE_BOOL_YES; + +if (virFileAccessibleAs(kvm_device, R_OK | W_OK, +priv->runUid, priv->runGid) == 0) { +value = VIR_TRISTATE_BOOL_YES; +} else { +value = VIR_TRISTATE_BOOL_NO; +} + +/* There is a race window between 'stat' and + * 'virFileAccessibleAs'. However, since we're only interested in + * detecting changes *after* the virFileAccessibleAs check, we can + * neglect this here. + */ +priv->kvmCtime = kvm_ctime; +priv->kvmUsable = value; + +return value == VIR_TRISTATE_BOOL_YES; +} + + static bool virQEMUCapsIsValid(void *data, void *privData) @@ -3872,8 +3922,7 @@ virQEMUCapsIsValid(void *data, return true; } -kvmUsable = virFileAccessibleAs("/dev/kvm", R_OK | W_OK, -priv->runUid, priv->runGid) == 0; +kvmUsable = virQEMUCapsKVMUsable(priv); if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_KVM) && kvmUsable) { @@ -4684,6 +4733,7 @@ virQEMUCapsCacheNew(const char *libDir, priv->runUid = runUid; priv->runGid = runGid; priv->microcodeVersion = microcodeVersion; +priv->kvmUsable = VIR_TRISTATE_BOOL_ABSENT; if (uname() == 0 && virAsprintf(>kernelVersion, "%s %s", uts.release, uts.version) < 0) -- 2.17.0 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [snmp PATCH 09/20] src: Fix header file defines
On Thursday, 18 October 2018 14:26:47 CET Michal Privoznik wrote: > To avoid multiple includes, header files start with: > > #ifndef __SOMETHING__ > # define __SOMETHING__ > > Well, SOMETHING should be the file name, and there should be a > space after hash and before 'define'. Note that identifier start that with two underscores (or even with two underscores in their name, IIRC) are reserved for the system. I know the likelyhood of a name conflict with a system identifier with LIBVIRT in the name is very low, still I saw people reporting these "reserved names used" issues in the past. I'd simply use LIBVIRTSNMP_ (or something like that) as prefix. -- Pino Toscano signature.asc Description: This is a digitally signed message part. -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] qemu: Introduce caching whether /dev/kvm is accessible
On Mon, Oct 29, 2018 at 05:53:58PM +0100, Marc Hartmayer wrote: > Introduce caching whether /dev/kvm is usable as the QEMU user:QEMU > group. This reduces the overhead of the QEMU capabilities cache > lookup. Before this patch there were many fork() calls used for > checking whether /dev/kvm is accessible. Now we store the result > whether /dev/kvm is accessible or not and we only need to re-run the > virFileAccessibleAs check if the ctime of /dev/kvm has changed. > > Suggested-by: Daniel P. Berrangé > Signed-off-by: Marc Hartmayer > --- > src/qemu/qemu_capabilities.c | 56 ++-- > 1 file changed, 54 insertions(+), 2 deletions(-) > > diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c > index e228f52ec0bb..ea95915f0f71 100644 > --- a/src/qemu/qemu_capabilities.c > +++ b/src/qemu/qemu_capabilities.c > @@ -3238,6 +3238,10 @@ struct _virQEMUCapsCachePriv { > virArch hostArch; > unsigned int microcodeVersion; > char *kernelVersion; > + > +/* cache whether /dev/kvm is usable as runUid:runGuid */ > +virTristateBool kvmUsable; > +time_t kvmCtime; > }; > typedef struct _virQEMUCapsCachePriv virQEMUCapsCachePriv; > typedef virQEMUCapsCachePriv *virQEMUCapsCachePrivPtr; > @@ -3824,6 +3828,54 @@ virQEMUCapsSaveFile(void *data, > } > > > +/* Determine whether '/dev/kvm' is usable as QEMU user:QEMU group. */ > +static bool > +virQEMUCapsKVMUsable(virQEMUCapsCachePrivPtr priv) > +{ > +struct stat sb; > +static const char *kvm_device = "/dev/kvm"; > +virTristateBool value; > +virTristateBool cached_value = priv->kvmUsable; > +time_t kvm_ctime; > +time_t cached_kvm_ctime = priv->kvmCtime; > + > +if (stat(kvm_device, ) < 0) { > +virReportSystemError(errno, > + _("Failed to stat %s"), kvm_device); > +return false; > +} > +kvm_ctime = sb.st_ctime; > + > +if (cached_value != VIR_TRISTATE_BOOL_ABSENT) { > +if (kvm_ctime != cached_kvm_ctime) { > +VIR_DEBUG("%s has changed (%lld vs %lld)", kvm_device, > + (long long)kvm_ctime, (long long)cached_kvm_ctime); > +goto update; > +} > + > +return cached_value == VIR_TRISTATE_BOOL_YES; > +} > + > + update: I don't like this use of goto's. It could be simplified thus to give a linear flow: if (kvm_ctime != cached_kvm_ctime) { VIR_DEBUG("%s has changed (%lld vs %lld)", kvm_device, (long long)kvm_ctime, (long long)cached_kvm_ctime); cached_value = VIR_TRISTATE_BOOL_ABSENT; } if (cached_value != VIR_TRISTATE_BOOL_ABSENT) { return cached_value == VIR_TRISTATE_BOOL_YES; } > +if (virFileAccessibleAs(kvm_device, R_OK | W_OK, > +priv->runUid, priv->runGid) == 0) { > +value = VIR_TRISTATE_BOOL_YES; > +} else { > +value = VIR_TRISTATE_BOOL_NO; > +} > + > +/* There is a race window between 'stat' and > + * 'virFileAccessibleAs'. However, since we're only interested in > + * detecting changes *after* the virFileAccessibleAs check, we can > + * neglect this here. > + */ > +priv->kvmCtime = kvm_ctime; > +priv->kvmUsable = value; > + > +return value == VIR_TRISTATE_BOOL_YES; > +} Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :| -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 0/2] Augment access denied error message and adjust polkit docs
ping? Thoughts at all - it's more doc than code and I'd really like to be sure the words are proper and/or make sense. Tks - John On 10/15/18 10:26 AM, John Ferlan wrote: > Details in the patches > > John Ferlan (2): > access: Modify the VIR_ERR_ACCESS_DENIED to include driverName > docs: Enhance polkit documentation to describe secondary connection > > docs/aclpolkit.html.in| 117 ++ > docs/libvirt.css | 1 + > src/access/viraccessmanager.c | 25 > src/rpc/gendispatch.pl| 2 +- > src/util/virerror.c | 4 +- > 5 files changed, 134 insertions(+), 15 deletions(-) > -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH] qemu: Introduce caching whether /dev/kvm is accessible
Introduce caching whether /dev/kvm is usable as the QEMU user:QEMU group. This reduces the overhead of the QEMU capabilities cache lookup. Before this patch there were many fork() calls used for checking whether /dev/kvm is accessible. Now we store the result whether /dev/kvm is accessible or not and we only need to re-run the virFileAccessibleAs check if the ctime of /dev/kvm has changed. Suggested-by: Daniel P. Berrangé Signed-off-by: Marc Hartmayer --- src/qemu/qemu_capabilities.c | 56 ++-- 1 file changed, 54 insertions(+), 2 deletions(-) diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index e228f52ec0bb..ea95915f0f71 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -3238,6 +3238,10 @@ struct _virQEMUCapsCachePriv { virArch hostArch; unsigned int microcodeVersion; char *kernelVersion; + +/* cache whether /dev/kvm is usable as runUid:runGuid */ +virTristateBool kvmUsable; +time_t kvmCtime; }; typedef struct _virQEMUCapsCachePriv virQEMUCapsCachePriv; typedef virQEMUCapsCachePriv *virQEMUCapsCachePrivPtr; @@ -3824,6 +3828,54 @@ virQEMUCapsSaveFile(void *data, } +/* Determine whether '/dev/kvm' is usable as QEMU user:QEMU group. */ +static bool +virQEMUCapsKVMUsable(virQEMUCapsCachePrivPtr priv) +{ +struct stat sb; +static const char *kvm_device = "/dev/kvm"; +virTristateBool value; +virTristateBool cached_value = priv->kvmUsable; +time_t kvm_ctime; +time_t cached_kvm_ctime = priv->kvmCtime; + +if (stat(kvm_device, ) < 0) { +virReportSystemError(errno, + _("Failed to stat %s"), kvm_device); +return false; +} +kvm_ctime = sb.st_ctime; + +if (cached_value != VIR_TRISTATE_BOOL_ABSENT) { +if (kvm_ctime != cached_kvm_ctime) { +VIR_DEBUG("%s has changed (%lld vs %lld)", kvm_device, + (long long)kvm_ctime, (long long)cached_kvm_ctime); +goto update; +} + +return cached_value == VIR_TRISTATE_BOOL_YES; +} + + update: +if (virFileAccessibleAs(kvm_device, R_OK | W_OK, +priv->runUid, priv->runGid) == 0) { +value = VIR_TRISTATE_BOOL_YES; +} else { +value = VIR_TRISTATE_BOOL_NO; +} + +/* There is a race window between 'stat' and + * 'virFileAccessibleAs'. However, since we're only interested in + * detecting changes *after* the virFileAccessibleAs check, we can + * neglect this here. + */ +priv->kvmCtime = kvm_ctime; +priv->kvmUsable = value; + +return value == VIR_TRISTATE_BOOL_YES; +} + + static bool virQEMUCapsIsValid(void *data, void *privData) @@ -3872,8 +3924,7 @@ virQEMUCapsIsValid(void *data, return true; } -kvmUsable = virFileAccessibleAs("/dev/kvm", R_OK | W_OK, -priv->runUid, priv->runGid) == 0; +kvmUsable = virQEMUCapsKVMUsable(priv); if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_KVM) && kvmUsable) { @@ -4684,6 +4735,7 @@ virQEMUCapsCacheNew(const char *libDir, priv->runUid = runUid; priv->runGid = runGid; priv->microcodeVersion = microcodeVersion; +priv->kvmUsable = VIR_TRISTATE_BOOL_ABSENT; if (uname() == 0 && virAsprintf(>kernelVersion, "%s %s", uts.release, uts.version) < 0) -- 2.17.0 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2 0/3] qemu: guest dedicated crypto adapters
On 10/18/18 10:54 AM, Boris Fiuczynski wrote: > v2: > + added singleton check for vfio-ap mediated device > > This patch series introduces initial libvirt support for guest > dedicated crypto adapters on S390. > It allows to specify a vfio-ap mediated device in a domain. > Extensive documentation about AP is available in patch 6 of > the QEMU patch series. > I modified patch2 to use "supported" and pushed the series just now. Tks - John > KVM/kernel: guest dedicated crypto adapters > https://lkml.org/lkml/2018/9/26/25 > > QEMU: s390x: vfio-ap: guest dedicated crypto adapters > https://lists.gnu.org/archive/html/qemu-devel/2018-10/msg01993.html > > The qemu patch series has been included into master. > https://git.qemu.org/?p=qemu.git;a=commit;h=69ac8c4cb93f2685839ff7b857cef306b388ff3c > > Boris Fiuczynski (3): > qemu: add vfio-ap capability > qemu: vfio-ap device support > news: Update news for vfio-ap support > > docs/formatdomain.html.in | 3 ++- > docs/news.xml | 9 + > docs/schemas/domaincommon.rng | 1 + > src/conf/domain_conf.c | 28 > src/qemu/qemu_capabilities.c | 2 ++ > src/qemu/qemu_capabilities.h | 1 + > src/qemu/qemu_command.c| 8 > src/qemu/qemu_domain_address.c | 4 > src/util/virmdev.c | 3 ++- > src/util/virmdev.h | 1 + > 10 files changed, 58 insertions(+), 2 deletions(-) > -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2 2/3] qemu: vfio-ap device support
On 10/29/18 11:54 AM, Boris Fiuczynski wrote: > Hi Erik, > do you think it is possible that this series gets into v4.9.0? > > Cheers, > Boris > Not sure if Erik is "online" today... The Red Hat portion of the team has just returned from KVM Forum and today we have some well larger news to digest too ;-) In any case, I had it on my radar too... When I last looked patch2 didn't have Thomas' explicit R-By, but it has others so that's fine. Just have to go through the "motions" and will push before release... John > On 10/24/18 12:18 AM, Erik Skultety wrote: >> On Mon, Oct 22, 2018 at 10:10:39AM +0200, Boris Fiuczynski wrote: >>> On 10/19/18 1:56 PM, Thomas Huth wrote: On 2018-10-18 16:54, Boris Fiuczynski wrote: > Adjusting domain format documentation, adding device address > support and adding command line generation for vfio-ap. > Since only one mediated hostdev with model vfio-ap is supported a > check > disallows to define domains with more than one such hostdev device. > > Signed-off-by: Boris Fiuczynski > Reviewed-by: Bjoern Walk [...] > +static int > +virDomainDefPostParseHostdev(virDomainDefPtr def) > +{ > + size_t i; > + bool vfioap_found = false; > + > + /* verify settings of hostdevs vfio-ap */ > + for (i = 0; i < def->nhostdevs; i++) { > + virDomainHostdevDefPtr hostdev = def->hostdevs[i]; > + > + if (virHostdevIsMdevDevice(hostdev) && > + hostdev->source.subsys.u.mdev.model == > VIR_MDEV_MODEL_TYPE_VFIO_AP) { > + if (vfioap_found) { > + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", > + _("Only one hostdev of model > vfio-ap is " > + "support")); s/support/supported/ ? >>> It should be "supported"... :-) >>> >>> I hope whoever is going to push this series can fix it before >>> pushing. If >>> not please let me know and I am going to send a v3. >> >> Hi Boris, I'm planning on having a look at the series too (sorry for >> not taking >> a look earlier, I was on PTO and the whole team is now attending KVM >> forum). >> Anyhow, if it turns out the typo is the only issue, that will be fixed >> before >> merging, no brainer :). >> >> Erik >> >> -- >> libvir-list mailing list >> libvir-list@redhat.com >> https://www.redhat.com/mailman/listinfo/libvir-list >> > > -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2 2/3] qemu: vfio-ap device support
Hi Erik, do you think it is possible that this series gets into v4.9.0? Cheers, Boris On 10/24/18 12:18 AM, Erik Skultety wrote: On Mon, Oct 22, 2018 at 10:10:39AM +0200, Boris Fiuczynski wrote: On 10/19/18 1:56 PM, Thomas Huth wrote: On 2018-10-18 16:54, Boris Fiuczynski wrote: Adjusting domain format documentation, adding device address support and adding command line generation for vfio-ap. Since only one mediated hostdev with model vfio-ap is supported a check disallows to define domains with more than one such hostdev device. Signed-off-by: Boris Fiuczynski Reviewed-by: Bjoern Walk [...] +static int +virDomainDefPostParseHostdev(virDomainDefPtr def) +{ +size_t i; +bool vfioap_found = false; + +/* verify settings of hostdevs vfio-ap */ +for (i = 0; i < def->nhostdevs; i++) { +virDomainHostdevDefPtr hostdev = def->hostdevs[i]; + +if (virHostdevIsMdevDevice(hostdev) && +hostdev->source.subsys.u.mdev.model == VIR_MDEV_MODEL_TYPE_VFIO_AP) { +if (vfioap_found) { +virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("Only one hostdev of model vfio-ap is " + "support")); s/support/supported/ ? It should be "supported"... :-) I hope whoever is going to push this series can fix it before pushing. If not please let me know and I am going to send a v3. Hi Boris, I'm planning on having a look at the series too (sorry for not taking a look earlier, I was on PTO and the whole team is now attending KVM forum). Anyhow, if it turns out the typo is the only issue, that will be fixed before merging, no brainer :). Erik -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list -- Mit freundlichen Grüßen/Kind regards Boris Fiuczynski IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Martina Köderitz Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] Plan for next release
I'm afraid I'm a bit behind. I suggest to enter freeze tomorrow morning then pushing rc2 on Thursday and if everything goes well push the GA during the w.e., a bit late but not too late. I hope this works for everybody, thanks, Daniel -- Daniel Veillard | Red Hat Developers Tools http://developer.redhat.com/ veill...@redhat.com | libxml Gnome XML XSLT toolkit http://xmlsoft.org/ http://veillard.com/ | virtualization library http://libvirt.org/ -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] number of graphic device in vm xml is limited to only 1
On Thu, Oct 25, 2018 at 10:11:34AM +, Huangyong wrote: > Hello, > > In qemu_command, vnc graphic device can’t be more than 1, as the > following code show in qemuBuildCommandLineValidate(): > If(sdl > 1 || vnc > 1 || spice > 1) { >virReportError(….); >return -1; > } > This check originally introduced by commit: 6fe9025eb. > Qemu: Error on unsupported graphics config > > qemu can support multi-vnc-configuration in qemu process, but libvirt limit > vnc graphics in only 1. > Is there any considerations in libvirt? can remove this restriction? Yes, this restriction dates from the time QEMU was restricted to 1 display backend. We could certainly remove this restriction now. Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :| -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] number of graphic device in vm xml is limited to only 1
Hello, In qemu_command, vnc graphic device can’t be more than 1, as the following code show in qemuBuildCommandLineValidate(): If(sdl > 1 || vnc > 1 || spice > 1) { virReportError(….); return -1; } This check originally introduced by commit: 6fe9025eb. Qemu: Error on unsupported graphics config qemu can support multi-vnc-configuration in qemu process, but libvirt limit vnc graphics in only 1. Is there any considerations in libvirt? can remove this restriction? - 本邮件及其附件含有新华三集团的保密信息,仅限于发送给上面地址中列出 的个人或群组。禁止任何其他人以任何形式使用(包括但不限于全部或部分地泄露、复制、 或散发)本邮件中的信息。如果您错收了本邮件,请您立即电话或邮件通知发件人并删除本 邮件! This e-mail and its attachments contain confidential information from New H3C, which is intended only for the person or entity whose address is listed above. Any use of the information contained herein in any way (including, but not limited to, total or partial disclosure, reproduction, or dissemination) by persons other than the intended recipient(s) is prohibited. If you receive this e-mail in error, please notify the sender by phone or email immediately and delete it! -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH] qemu: snapshot: better error for active external readonly disk
External snapshot of readonly disk of active domain is impossible but error message [1] is cryptic (it's source is qemu). Let's make error message more user friendly. The problem is libvirtd precreates snapshot image with no write permission which is not expected by qemu. [1] current error message error: internal error: unable to execute QEMU command 'transaction': Could not create file: Permission denied Signed-off-by: Nikolay Shirokovskiy --- By the way if domain is not active then snapshot is possible. However top image will have write permissions after snapshot. We can make snapshot work for active domain I guess if we precreate writable snapshot image, but then we end up running readonly disk on writable image. Also making snapshot of readonly disk is not that practical so let's just fix error message for now. src/qemu/qemu_driver.c | 8 1 file changed, 8 insertions(+) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index a52e249..e75931e 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -14683,6 +14683,14 @@ qemuDomainSnapshotPrepare(virDomainObjPtr vm, active, reuse) < 0) goto cleanup; +if (dom_disk->src->readonly && active) { +virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("external snapshot for readonly disk %s " + "of active domain is not supported"), + disk->name); +goto cleanup; +} + external++; break; -- 1.8.3.1 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [Qemu-devel] [PATCH 1/3] qdev: add deprecation_reason to DeviceClass
On Thu, Oct 25, 2018 at 01:12:15PM +0200, Philippe Mathieu-Daudé wrote: > Hi Gerd, > > On 25/10/18 10:52, Gerd Hoffmann wrote: > > Simliar to deprecated machine types. > > "Similar" > > > Print a warning when creating a deprecated device. > > Add deprecation notice to -device help. > > > > TODO: add to intospection. > > "introspection" > > Do we want the TODO in the git history? No. It's RfC because of the missing introspection bits ;) cheers, Gerd -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [Qemu-devel] [PATCH 3/3] cirrus: mark as deprecated
On Thu, Oct 25, 2018 at 09:37:58PM +0100, Daniel P. Berrangé wrote: > On Thu, Oct 25, 2018 at 10:52:56AM +0200, Gerd Hoffmann wrote: > > While being at it deprecate cirrus too. > > > > Reason (short version): use stdvga instead. > > Verbose version: > > https://www.kraxel.org/blog/2014/10/qemu-using-cirrus-considered-harmful > > Every single one of my guests is using cirrus. This wasn't an explicit > choice on my part, so I believe it is being used as the default in > virt-install. > > I don't debate the points in the blog post above that stdvga is a > better choice, but I don't think that's enough to justify deprecating > cirrus at this point in time, because when it then gets deleted it > will break way too many existing deployments. > > We need to socialize info in that blog post above more widely and > especially ensure that apps are not using that by default. I don't > see it being viable to formally deprecate it in QEMU any time soon > though given existing usage. Well, getting that message to the users is the point of deprecating it. I think in this specific case we'll need a longer (maybe much longer) deprecation period than only two releases. But I think it still makes sense to deprecate it now. In qemu it isn't the default any more since release 2.2. libvirt switched the default not too long after that. Not fully sure how things look like higher up the stack. cheers, Gerd -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [Qemu-devel] [PATCH 2/3] adlib: mark as insecure and deprecated.
On Fri, Oct 26, 2018 at 05:23:37PM +0530, P J P wrote: > +-- On Fri, 26 Oct 2018, Paolo Bonzini wrote --+ > | Oh, thanks! I said I was dumb. :) So the fix is just this: > | > | diff --git a/hw/audio/fmopl.h b/hw/audio/fmopl.h > | index e7e578a48e..7199afaa3c 100644 > | --- a/hw/audio/fmopl.h > | +++ b/hw/audio/fmopl.h > | @@ -72,8 +72,8 @@ typedef struct fm_opl_f { > | /* Rhythm sention */ > | uint8_t rhythm; /* Rhythm mode , key flag */ > | /* time tables */ > | - int32_t AR_TABLE[75]; /* atttack rate tables */ > | - int32_t DR_TABLE[75]; /* decay rate tables */ > | + int32_t AR_TABLE[76]; /* atttack rate tables */ > | + int32_t DR_TABLE[76]; /* decay rate tables */ > | uint32_t FN_TABLE[1024]; /* fnumber -> increment counter */ > | /* LFO */ > | int32_t *ams_table; > | > | and init_timetables will just fill it with the right value? (I checked > | against another implementation at http://opl3.cozendey.com/). > > Gerd has proposed to a patch to deprecate adlib, as it's not used as much. > IMO > deprecation is better option. But if that is not happening, above seems good. I think we can actually do both. cheers, Gerd -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [RFC 0/3] update NVDIMM support
polite ping On 2018/10/17 上午10:21, Luyao Zhong wrote: Hi libvirt experts, This is the RFC for updating NVDIMM support in libvirt. QEMU has supported four more properties which libvirt has not introduced yet, including 'align', 'pmem', 'nvdimm-persistences' and 'unarmed'. The 'align' property allows users to specify the proper alignment. The previous alignment can only be 4K because QEMU use pagesize as alignment. But some backends may require alignments different from the pagesize. The 'pmem' property allows users to specify whether the backend storage of memory-backend-file is a real persistent memory. Then QEMU will know if it needs to guarrantee the write persistence to the vNVDIMM backend. The 'nvdimm-persistence' property allows users to set platform-supported features about NVDIMM data persistence of a guest. The 'unarmed' property allows users to mark vNVDIMM read-only. Only the device DAX on the real NVDIMM can guarantee the guest write persistence, so it's suggested to set 'unarmed' option to 'on' and then vNVDIMM device will be marked as read-only. Libvirt introduces 'alignsize', 'pmem', 'persistence' and 'unarmed' config elements into xml corresponding to 'align', 'pmem', 'nvdimm-persistence' and 'unarmed' properties in QEMU, and update xml parsing, formating and qemu command-line generating process for NVDIMM. Thanks, Zhong, Luyao Luyao Zhong (3): xml: introduce more config elements for NVDIMM memory xml: update xml parsing and formating about NVDIMM memory qemu: update qemu command-line generating for NVDIMM memory docs/formatdomain.html.in | 98 +++--- docs/schemas/domaincommon.rng | 31 +- src/conf/domain_conf.c | 115 +++-- src/conf/domain_conf.h | 14 +++ src/libvirt_private.syms | 2 + src/qemu/qemu_command.c| 25 + .../memory-hotplug-nvdimm-align.args | 31 ++ .../memory-hotplug-nvdimm-align.xml| 58 +++ .../memory-hotplug-nvdimm-persistence.args | 31 ++ .../memory-hotplug-nvdimm-persistence.xml | 58 +++ .../memory-hotplug-nvdimm-pmem.args| 31 ++ .../memory-hotplug-nvdimm-pmem.xml | 58 +++ .../memory-hotplug-nvdimm-unarmed.args | 31 ++ .../memory-hotplug-nvdimm-unarmed.xml | 58 +++ tests/qemuxml2argvtest.c | 12 +++ .../memory-hotplug-nvdimm-align.xml| 1 + .../memory-hotplug-nvdimm-persistence.xml | 1 + .../memory-hotplug-nvdimm-pmem.xml | 1 + .../memory-hotplug-nvdimm-unarmed.xml | 1 + tests/qemuxml2xmltest.c| 4 + 20 files changed, 636 insertions(+), 25 deletions(-) create mode 100644 tests/qemuxml2argvdata/memory-hotplug-nvdimm-align.args create mode 100644 tests/qemuxml2argvdata/memory-hotplug-nvdimm-align.xml create mode 100644 tests/qemuxml2argvdata/memory-hotplug-nvdimm-persistence.args create mode 100644 tests/qemuxml2argvdata/memory-hotplug-nvdimm-persistence.xml create mode 100644 tests/qemuxml2argvdata/memory-hotplug-nvdimm-pmem.args create mode 100644 tests/qemuxml2argvdata/memory-hotplug-nvdimm-pmem.xml create mode 100644 tests/qemuxml2argvdata/memory-hotplug-nvdimm-unarmed.args create mode 100644 tests/qemuxml2argvdata/memory-hotplug-nvdimm-unarmed.xml create mode 12 tests/qemuxml2xmloutdata/memory-hotplug-nvdimm-align.xml create mode 12 tests/qemuxml2xmloutdata/memory-hotplug-nvdimm-persistence.xml create mode 12 tests/qemuxml2xmloutdata/memory-hotplug-nvdimm-pmem.xml create mode 12 tests/qemuxml2xmloutdata/memory-hotplug-nvdimm-unarmed.xml -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH v2 2/2] nwfilter: don't reinstantiate rules if there is no need to
This patch adds caching to rules instantiating in nwfilter. This instantiating can take much time as described in [1] so as we are lacking option to instantiate rules faster right now we can at least don't instantiate when we don't need to. And this is typical that between libvirtd restarts rules don't change in firewall so we don't need to reapply them. The most straightforward approach is just to compare rules to be instantiated against rules in firewall but this comparison is complicated (see issues below). So let's instead dump both rules we are going to instantiate and rules after instantiation every time we apply rules. So next time we are going to instantiate rules we can check whether rules in firewall changed or not and whether we are going to reinstantiate same rules or different. A few words on dumping rules we are going to instantiate. After filling firewall obj with rules we have several transactions some of them can have rollbacks. Among all these rules those that are in rollbacks and those that are aiming on preparation cleanup (-X, -F, -D and -L which helps to traverse tree on cleanup) are not significant and ommited in dump. * Comparison issues * - different rules order - rules arguments order can differ. For example: in: -A libvirt-out -m physdev --physdev-is-bridged --physdev-out vme001c42fd388c -g FO-vme001c42fd388c out: -A libvirt-out -m physdev --physdev-out vme001c42fd388c --physdev-is-bridged -g FO-vme001c42fd388c - result rule can have extra arguments. For example (here it looks like just explicit extension): in: -A FI-vme001c42fd388c -p tcp --dport 1000:3000 -j RETURN out: -A FI-vme001c42fd388c -p tcp -m tcp --dport 1000:3000 -j RETURN * References * [1] [RFC] Faster libvirtd restart with nwfilter rules https://www.redhat.com/archives/libvir-list/2018-September/msg01206.html Signed-off-by: Nikolay Shirokovskiy --- src/nwfilter/nwfilter_ebiptables_driver.c | 387 +- 1 file changed, 385 insertions(+), 2 deletions(-) diff --git a/src/nwfilter/nwfilter_ebiptables_driver.c b/src/nwfilter/nwfilter_ebiptables_driver.c index 5be1c9b..4cd6b54 100644 --- a/src/nwfilter/nwfilter_ebiptables_driver.c +++ b/src/nwfilter/nwfilter_ebiptables_driver.c @@ -96,6 +96,8 @@ static bool newMatchState; #define MATCH_PHYSDEV_OUT_FW "-m", "physdev", "--physdev-is-bridged", "--physdev-out" #define MATCH_PHYSDEV_OUT_OLD_FW "-m", "physdev", "--physdev-out" +#define EBIPTABLES_CACHE_DIR (LOCALSTATEDIR "/run/libvirt/nwfilter-rules") + static int ebtablesRemoveBasicRules(const char *ifname); static int ebiptablesDriverInit(bool privileged); static void ebiptablesDriverShutdown(void); @@ -150,6 +152,10 @@ static char chainprefixes_host_temp[3] = { 0 }; +static virHashTablePtr ebiptables_cache_hits; + +static int ebiptablesTearNewRules(const char *ifname); + static int printVar(virNWFilterVarCombIterPtr vars, char *buf, int bufsize, @@ -3393,6 +3399,339 @@ ebtablesGetSubChainInsts(virHashTablePtr chains, } + +static int +ebiptablesCacheApplyNew(const char *ifname, +const char *dumpDriver) +{ +char *fileFirewall = NULL; +char *fileDriver = NULL; +char *fileDriverTmp = NULL; +int ret = -1; + +/* for tests */ +if (!ebiptables_cache_hits) +return 0; + +if (virAsprintf(, "%s/%s.firewall", +EBIPTABLES_CACHE_DIR, ifname) < 0) +return -1; + +if (virAsprintf(, "%s/%s.driver", +EBIPTABLES_CACHE_DIR, ifname) < 0) +goto cleanup; + +if (unlink(fileFirewall) < 0 && errno != ENOENT) { +virReportSystemError(errno, _("can not unlink file '%s'"), fileFirewall); +goto cleanup; +} + +if (unlink(fileDriver) < 0 && errno != ENOENT) { +virReportSystemError(errno, _("can not unlink file '%s'"), fileDriver); +goto cleanup; +} + +ret = 0; + +if (!dumpDriver) +goto cleanup; + +if (virAsprintfQuiet(, "%s/%s.driver.tmp", + EBIPTABLES_CACHE_DIR, ifname) < 0) +goto cleanup; + +if (virFileWriteStr(fileDriverTmp, dumpDriver, 0600) < 0) { +unlink(fileDriverTmp); +goto cleanup; +} + +rename(fileDriverTmp, fileDriver); + + cleanup: +VIR_FREE(fileFirewall); +VIR_FREE(fileDriver); +VIR_FREE(fileDriverTmp); + +return ret; +} + + +struct ebiptablesDumpData { +char *dump; +const char *ifname; +virFirewallLayer layer; +}; + + +static int +ebiptablesAppendInstalledRules(virFirewallPtr fw ATTRIBUTE_UNUSED, + const char *const *lines, + void *opaque) +{ +virBuffer buf = VIR_BUFFER_INITIALIZER; +struct ebiptablesDumpData *data = opaque; + +virBufferAdd(, data->dump, -1); + +while (*lines) { +if (strstr(*lines, data->ifname)) { +/* iptables/ip6tables do not add binary and table in front of rules +
[libvirt] [PATCH v2 1/2] firewall: add dump function
We are going to save this dump on applying rules so that on next applying we can compare dump of current rules with dump of previous rules. Which in turn will help us to decide wether to reinstantiate rules or not. Signed-off-by: Nikolay Shirokovskiy --- src/libvirt_private.syms | 1 + src/util/virfirewall.c | 111 +++ src/util/virfirewall.h | 3 ++ 3 files changed, 115 insertions(+) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 335210c..612ad41 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1892,6 +1892,7 @@ virFileCacheSetPriv; # util/virfirewall.h virFirewallAddRuleFull; virFirewallApply; +virFirewallDumpRules; virFirewallFree; virFirewallNew; virFirewallRemoveRule; diff --git a/src/util/virfirewall.c b/src/util/virfirewall.c index c786d76..ed7f9e6 100644 --- a/src/util/virfirewall.c +++ b/src/util/virfirewall.c @@ -949,3 +949,114 @@ virFirewallApply(virFirewallPtr firewall) virMutexUnlock(); return ret; } + + +static void +virFirewallDumpRule(virBufferPtr buf, +const char *ifname, +virFirewallRulePtr rule) +{ +size_t i = 0; +size_t j, idx; +const char *concurrent = NULL; + +switch (rule->layer) { +case VIR_FIREWALL_LAYER_ETHERNET: +concurrent = "--concurrent"; +break; + +case VIR_FIREWALL_LAYER_IPV4: +case VIR_FIREWALL_LAYER_IPV6: +concurrent = "-w"; +break; + +case VIR_FIREWALL_LAYER_LAST: +break; +} + +/* don't dump concurrent option */ +if (concurrent && rule->argsLen > 0 && STREQ(rule->args[0], concurrent)) +i++; + +/* calculate index of 'command' argument like -N or -A etc */ +idx = i; +if (i < rule->argsLen && STREQ(rule->args[i], "-t")) +idx += 2; + +/* Whitelist commands that can go to dump. These are -N and -A. + * Commands D, L, X, F are used only for cleanup purpuses and do + * not present result rules list. -I is only used by + * iptablesCreateBaseChainsFW to [re]insert common non binding specific + * rules. + */ +if (idx >= rule->argsLen || +(STRNEQ(rule->args[idx], "-N") && STRNEQ(rule->args[idx], "-A"))) + return; + +for (j = i; j < rule->argsLen; j++) +if (strstr(rule->args[j], ifname)) +break; + +/* We also need to filter common non binding specific rules originated + * from iptablesCreateBaseChainsFW - these are -N for libvirt-in etc. */ +if (j == rule->argsLen) +return; + +switch (rule->layer) { +case VIR_FIREWALL_LAYER_ETHERNET: +virBufferAddLit(buf, "ebtables "); +break; + +case VIR_FIREWALL_LAYER_IPV4: +virBufferAddLit(buf, "iptables "); +break; + +case VIR_FIREWALL_LAYER_IPV6: +virBufferAddLit(buf, "ip6tables "); +break; + +case VIR_FIREWALL_LAYER_LAST: +break; +} + +/* dump default table explicitly */ +if (i < rule->argsLen && STRNEQ(rule->args[i], "-t")) +virBufferAddLit(buf, "-t filter "); + +for (; i < rule->argsLen; i++) { +virBufferAddStr(buf, rule->args[i]); + +if (i < rule->argsLen - 1) +virBufferAddLit(buf, " "); +} + +virBufferAddLit(buf, "\n"); +} + + +/* + * Dump rules for all transactions. Transaction rollbacks are not dumped. + * The order of rules in dump follows from order of transactions and + * order of rules in transactions. Only creation rules (-N, -A) are dumped. + * Example: + * + * ebtables -t nat -N libvirt-P-vme001c42fd388c + * ebtables -t nat -A libvirt-P-vme001c42fd388c -d 00:1c:42:fd:38:9d -j RETURN + * .. + */ +char* +virFirewallDumpRules(virFirewallPtr firewall, + const char *ifname) +{ +size_t i, j; +virBuffer buf = VIR_BUFFER_INITIALIZER; + +if (!firewall || firewall->err) +return NULL; + +for (i = 0; i < firewall->ngroups; i++) +for (j = 0; j < firewall->groups[i]->naction; j++) +virFirewallDumpRule(, ifname, firewall->groups[i]->action[j]); + +return virBufferContentAndReset(); +} diff --git a/src/util/virfirewall.h b/src/util/virfirewall.h index e024e88..bbbc561 100644 --- a/src/util/virfirewall.h +++ b/src/util/virfirewall.h @@ -115,6 +115,9 @@ void virFirewallStartRollback(virFirewallPtr firewall, int virFirewallApply(virFirewallPtr firewall); +char* virFirewallDumpRules(virFirewallPtr firewall, + const char *ifname); + void virFirewallSetLockOverride(bool avoid); VIR_DEFINE_AUTOPTR_FUNC(virFirewall, virFirewallFree) -- 1.8.3.1 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH v2 0/2] nwfilter: don't reinstantiate rules if there is no need to
Applied on top of [1] that restores reinstantiating filters on daemon reload. Note the fragile issue mentioned in ebiptablesDumpIstalledRules in respect to list of firewall tables we are using. I wonder can we instead of caching instantiate rules faster in the end? There is iptables-restore if we instantiate directly. And in case of firewalld mode why we instantiate filters via firewalld dbus interface after all? We use passthrough interface so looks like firewalld don't account our rules in any way. May be all we need is reloading rules on firewalld reload and always instantiate thru binaries? Then we can do things fast. Diff from v1 [2]: Approach is changed. Instead of checking whether applied filters changed or not (so we can miss firewall changes from outside) let's check that don't change both - rules we are going to apply and rules in firewall in comparion to previous instantiation. [1] [PATCH] nwfilter: intantiate filters on loading driver https://www.redhat.com/archives/libvir-list/2018-October/msg00787.html [2] [PATCH RFC 0/4] nwfilter: don't reinstantiate filters if they are not changed https://www.redhat.com/archives/libvir-list/2018-October/msg00904.html [3] [RFC] Faster libvirtd restart with nwfilter rules https://www.redhat.com/archives/libvir-list/2018-September/msg01206.html which continues in https://www.redhat.com/archives/libvir-list/2018-October/msg00657.html Nikolay Shirokovskiy (2): firewall: add dump function nwfilter: don't reinstantiate rules if there is no need to src/libvirt_private.syms | 1 + src/nwfilter/nwfilter_ebiptables_driver.c | 387 +- src/util/virfirewall.c| 111 + src/util/virfirewall.h| 3 + 4 files changed, 500 insertions(+), 2 deletions(-) -- 1.8.3.1 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] conf: More clear error msg for incomplete coalesce xml
On Thu, Oct 18, 2018 at 5:28 PM Martin Kletzander wrote: > On Thu, Oct 18, 2018 at 02:12:36PM +0800, Han Han wrote: > >On Wed, Oct 17, 2018 at 4:46 PM Martin Kletzander > >wrote: > > > >> On Tue, Oct 16, 2018 at 07:19:41PM -0400, John Ferlan wrote: > >> > > >> > > >> >On 10/14/18 10:26 AM, Han Han wrote: > >> >> https://bugzilla.redhat.com/show_bug.cgi?id=1535930 > >> >> > >> >> Report more clear err msg instead of unknown error when coalesce > >> >> settings is incomplete. > >> >> > >> > >> Incomplete is not an error. It's request for removal of the missing > >> setting. > >> There is no problem if it is incomplete. > >> > >> Well, I think incomplete xml is the problem because I have reproduce it > on > >an inactive > >VM as https://bugzilla.redhat.com/show_bug.cgi?id=1535930#c4 . > > > >I am not sure it will be something wrong when update the device lively, > but > >we can fix > >this first. > > > > So I looked at the code and yes, there is inconsistency in the parsing. > If the > function does not parse anything it just returns NULL without an error set > and > the caller treats that as an error. > > What we should do is pass the dev into the function, make the function > return > int (0 for OK, -1 for error) and update the coalesce in the passed > parameter. > That way empty (incomplete) coalesce will not error out, but will just set > nothing in the device. > > This is not the whole fix, though. What is missing is the fact that if > such > device is being updated, then we need to see if everything was reset to > zero and > reset the pointer and free the coalesce struct from memory (ideally). > Although > there shouldn't be a code that will fail if the struct is allocated but > everything is zero. > > Would you mind having a look at that in v2? > > OK. I will try to make a whole fix in v2. > >> If you look at the BZ the problem is not the incomplete XML, but the > fact > >> that > >> the code that should update the device fails somewhere without setting > an > >> error. > >> Actually, there should not be an error, it should work. So this just > works > >> around the actual issue. > >> > >> Having said that maybe the problem is somewhere in the parsing part, but > >> this is > >> not the solution. We need to "go deeper" to find out why the updating > code > >> fails and then figure out what to fix from there, not put a sheet over > the > >> problem. > >> > >> >> Signed-off-by: Han Han > >> >> --- > >> >> src/conf/domain_conf.c | 6 +- > >> >> 1 file changed, 5 insertions(+), 1 deletion(-) > >> >> > >> >> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c > >> >> index 9911d56130..e755f45d3d 100644 > >> >> --- a/src/conf/domain_conf.c > >> >> +++ b/src/conf/domain_conf.c > >> >> @@ -7804,8 +7804,12 @@ virDomainNetDefCoalesceParseXML(xmlNodePtr > node, > >> >> ctxt->node = node; > >> >> > >> >> str = virXPathString("string(./rx/frames/@max)", ctxt); > >> >> -if (!str) > >> >> +if (!str) { > >> >> +virReportError(VIR_ERR_XML_DETAIL, > >> >> + "%s", > >> > > >> >This can be put on the previous line > >> > > >> >> + _("incomplete coalesce settings in interface > >> xml")); > >> > > >> >and specifically this could be is missing rx frames max attributes > >> > > >> >However, according to the RNG from commit 523c9960, it seems the 'rx' > is > >> >optional as is the '@max' value. Maybe Martin should provide a comment > >> >on this series since he added it. > >> > > >> >Of course that would cause the whole to disappear on Format. > >> >It would also cause problems because def->coalesce would have something > >> >that's empty. > >> > > >> >So perhaps the best thing to do is pass the @def into here, then only > if > >> >we get beyond the initial !str comparison do we allocate and fill it > in; > >> >otherwise, we return 0 if rx/frames/@max is not there. Prepares us for > >> >the future. > >> > > >> >I guess I'm not 100% clear if max frames == 0 what would happen. Maybe > >> >Martin knows (I've CC'd him). > >> > > >> > >> `rx-frames=0` turns that option off. It's like not having the parameter > >> there > >> at all (it also works like this with ethtool). > >> > >Set `rx-frames=0` is a better solution compared with report a error to > >incomplete > >the incomplete coalesce setting. > > > >> > >> >John > >> > > >> >> goto cleanup; > >> >> +} > >> >> > >> >> if (VIR_ALLOC(ret) < 0) > >> >> goto cleanup; > >> >> > >> > > > > > >-- > >Best regards, > >--- > >Han Han > >Quality Engineer > >Redhat. > > > >Email: h...@redhat.com > >Phone: +861065339333 > -- Best regards, --- Han Han Quality Engineer Redhat. Email: h...@redhat.com Phone: +861065339333 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list