Re: [pve-devel] [PATCH qemu-server] fix bug #615: Windows guests suddenly hang after a couple of migrations
No, this is the real solution. The first patch was only a workaround and will not go into the code.

On 09/20/2016 07:37 AM, Alexandre DERUMIER wrote:
> Do we need this patch, with the new qemu-kvm patch?
>
> ----- Original message -----
> From: "Wolfgang Link" To: "pve-devel"
> Sent: Friday, 16 September 2016 13:14:51
> Subject: [pve-devel] [PATCH qemu-server] fix bug #615: Windows guests suddenly hang after a couple of migrations
>
> Windows has a clock/tick problem when it is live-migrated.
> This problem ends in a pseudo-freeze state.
> The solution is to stop the clock when we suspend the VM for live migration.
>
> See man kvm, -rtc clock=vm.
>
> The drawback is that the VM will lose a little time on every live migration,
> so with this setting it is recommended to run a time-sync client in the VM
> to keep the time.
> ---
>  PVE/QemuServer.pm | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
> index dbd85a0..d3eeac5 100644
> --- a/PVE/QemuServer.pm
> +++ b/PVE/QemuServer.pm
> @@ -3002,6 +3002,7 @@ sub config_to_command {
>  	       $ostype eq 'wvista') {
>  	    push @$globalFlags, 'kvm-pit.lost_tick_policy=discard';
>  	    push @$cmd, '-no-hpet';
> +	    push @$rtcFlags, 'clock=vm';
>  	    if (qemu_machine_feature_enabled ($machine_type, $kvmver, 2, 3)) {
>  		push @$cpuFlags , 'hv_spinlocks=0x1fff' if !$nokvm;
>  		push @$cpuFlags , 'hv_vapic' if !$nokvm;

___ pve-devel mailing list pve-devel@pve.proxmox.com http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
Re: [pve-devel] question/idea: managing big proxmox clusters (100 nodes), get rid of corosync?
One thing that I think could be great is to have unique VMIDs across different Proxmox clusters, maybe with a letter prefix, for example (cluster1: vmid a100, cluster2: vmid b100). That way it would be possible to migrate VMs across clusters (offline, or maybe even online), and also to share storage across clusters.

It could be great for Ceph too, if you want to share a pool between different clusters. (We can create multiple pools in Ceph, but too many pools increase the number of PGs, which is bad for Ceph.)

----- Original message -----
From: "aderumier" To: "dietmar" Cc: "pve-devel"
Sent: Monday, 19 September 2016 08:56:36
Subject: Re: [pve-devel] question/idea: managing big proxmox clusters (100 nodes), get rid of corosync?

>> It has become difficult to manage, as we can't easily migrate VMs across clusters
> But is this difficult because there is no shared storage in that case?

I have local DC storage, but shared across DCs (mainly live migration, then storage migration). But in the future (>3-5 years), I'm looking to implement Ceph across 3 DCs.

----- Original message -----
From: "dietmar" To: "aderumier", "pve-devel"
Sent: Monday, 19 September 2016 06:23:58
Subject: Re: [pve-devel] question/idea: managing big proxmox clusters (100 nodes), get rid of corosync?

> It has become difficult to manage, as we can't easily migrate VMs across clusters

But is this difficult because there is no shared storage in that case?
Re: [pve-devel] [PATCH qemu-server] fix bug #615: Windows guests suddenly hang after a couple of migrations
Do we need this patch, with the new qemu-kvm patch?

----- Original message -----
From: "Wolfgang Link" To: "pve-devel"
Sent: Friday, 16 September 2016 13:14:51
Subject: [pve-devel] [PATCH qemu-server] fix bug #615: Windows guests suddenly hang after a couple of migrations

Windows has a clock/tick problem when it is live-migrated.
This problem ends in a pseudo-freeze state.
The solution is to stop the clock when we suspend the VM for live migration.

See man kvm, -rtc clock=vm.

The drawback is that the VM will lose a little time on every live migration,
so with this setting it is recommended to run a time-sync client in the VM
to keep the time.
---
 PVE/QemuServer.pm | 1 +
 1 file changed, 1 insertion(+)

diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index dbd85a0..d3eeac5 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -3002,6 +3002,7 @@ sub config_to_command {
 	       $ostype eq 'wvista') {
 	    push @$globalFlags, 'kvm-pit.lost_tick_policy=discard';
 	    push @$cmd, '-no-hpet';
+	    push @$rtcFlags, 'clock=vm';
 	    if (qemu_machine_feature_enabled ($machine_type, $kvmver, 2, 3)) {
 		push @$cpuFlags , 'hv_spinlocks=0x1fff' if !$nokvm;
 		push @$cpuFlags , 'hv_vapic' if !$nokvm;
--
2.1.4
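For Windows ostypes the patch appends 'clock=vm' to the RTC flags, which config_to_command later joins into a single -rtc option on the kvm command line. A minimal Python sketch of that flag assembly (the exact ostype list and any pre-existing flags are assumptions for illustration; only the 'wvista' check and the join-with-commas behaviour are visible in the patch itself):

```python
def rtc_option(ostype, rtc_flags=None):
    """Collect RTC flags the way config_to_command does for Windows guests."""
    rtc_flags = list(rtc_flags or [])
    if ostype in ('wxp', 'w2k', 'w2k3', 'w2k8', 'wvista', 'win7', 'win8'):
        # stop the guest clock while the VM is suspended for live migration
        rtc_flags.append('clock=vm')
    # the Perl code joins @$rtcFlags with ',' into one '-rtc' argument
    return ['-rtc', ','.join(rtc_flags)] if rtc_flags else []

print(rtc_option('win7'))                      # Windows guest gets the vm clock
print(rtc_option('l26'))                       # Linux guests keep the default
print(rtc_option('w2k8', ['driftfix=slew']))   # flags accumulate into one option
```

With this in place a Windows guest's command line carries `-rtc clock=vm` (possibly combined with other RTC flags), while non-Windows guests are untouched.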
Re: [pve-devel] intel pstate: wrong cpu frequency with performance governor
Ok, I have tested the patched kernel, and I still see the same behavior: the frequency still goes up and down. (But it's not stuck anymore, like the bug I saw last week.)

I think this is the normal behavior of pstate, as there is a lower limit. But for virtualisation, I think it's really bad to have a changing frequency (clock problems, for example).

Also for corosync, that means a loaded node will run at a faster frequency and a non-loaded node at a low frequency. This can give us retransmits, because the low-frequency node takes more time to process the corosync message than the fastest node:
https://www.hastexo.com/resources/hints-and-kinks/whats-totem-retransmit-list-all-about-corosync/

Result, without any load:

root@kvm1:/etc/pve# cat /proc/cpuinfo | grep MHz
cpu MHz : 2025.292
cpu MHz : 1520.816
cpu MHz : 3261.175
cpu MHz : 1875.742
cpu MHz : 2923.445
cpu MHz : 1935.078
cpu MHz : 2860.597
cpu MHz : 1671.820
cpu MHz : 1200.039
cpu MHz : 1653.656
cpu MHz : 1602.433
cpu MHz : 1935.320
cpu MHz : 2042.972
cpu MHz : 1359.761
cpu MHz : 3460.253
cpu MHz : 1200.039
cpu MHz : 2163.097
cpu MHz : 1710.328
cpu MHz : 2249.316
cpu MHz : 1199.675
cpu MHz : 2473.945
cpu MHz : 1731.398
cpu MHz : 2541.273
cpu MHz : 1658.863
cpu MHz : 2528.800
cpu MHz : 1680.660
cpu MHz : 1922.847
cpu MHz : 1369.570
cpu MHz : 1940.890
cpu MHz : 1526.507
cpu MHz : 1952.878
cpu MHz : 1452.761
cpu MHz : 1788.675
cpu MHz : 2137.910
cpu MHz : 1942.828
cpu MHz : 1707.664
cpu MHz : 1438.957
cpu MHz : 1642.757
cpu MHz : 1561.382
cpu MHz : 2104.730

Running a cpu benchmark:

root@kvm1:/etc/pve# cat /proc/cpuinfo | grep MHz
cpu MHz : 3199.902
cpu MHz : 3199.902
cpu MHz : 3199.902
cpu MHz : 3199.902
cpu MHz : 3199.902
cpu MHz : 3199.902
cpu MHz : 3199.902
cpu MHz : 3199.902
cpu MHz : 3199.902
cpu MHz : 3199.902
cpu MHz : 3199.902
cpu MHz : 3199.902
cpu MHz : 3199.902
cpu MHz : 3199.902
cpu MHz : 3199.902
cpu MHz : 3199.902
cpu MHz : 3199.902
cpu MHz : 3199.902
cpu MHz : 3199.902
cpu MHz : 3199.902
cpu MHz : 3199.902
cpu MHz : 3199.902
cpu MHz : 3199.902
cpu MHz : 3199.902
cpu MHz : 3199.902
cpu MHz : 3199.902
cpu MHz : 3199.902
cpu MHz : 3199.902
cpu MHz : 3199.902
cpu MHz : 3199.902
cpu MHz : 3199.902
cpu MHz : 3199.902
cpu MHz : 3199.902
cpu MHz : 3199.902
cpu MHz : 3199.902
cpu MHz : 3199.902
cpu MHz : 3199.902
cpu MHz : 3199.902
cpu MHz : 3199.902
cpu MHz : 3199.902

----- Original message -----
From: "aderumier" To: "pve-devel"
Sent: Monday, 19 September 2016 10:08:45
Subject: Re: [pve-devel] intel pstate: wrong cpu frequency with performance governor

>> And it's being changed based on cpu load, like the actual governor is ondemand.

From what I read, the intel pstate "performance" governor has a min_freq/max_freq range. So it seems to be different from the cpufreq "performance" governor, which is "max performance". (min_freq can be changed manually through sysfs, but I don't think there is a kernel option to fix it to max at boot.)

I'm currently compiling the kernel with the patches from Stefan; I'll make a report this afternoon.

----- Original message -----
From: "Dmitry Petuhov" To: "pve-devel"
Sent: Monday, 19 September 2016 09:47:21
Subject: Re: [pve-devel] intel pstate: wrong cpu frequency with performance governor

19.09.2016 01:29, Alexandre DERUMIER wrote:
> Hi,
>
> I have had some strange behaviour on some hosts last week (cpu performance degrading),
> and I have found that 3 hosts of my 15-host cluster have a wrong cpu frequency.
>
> All nodes are Dell R630, with Xeon v3 3.1GHz (all with the latest bios/microcode updates and the latest proxmox kernel).
>
> On the 3 hosts, the frequency was stuck at 800MHz instead of 3.1GHz. (Note that on the other hosts, the frequency was not stable, going up and down between 3.09 and 3.2GHz.)
>
> The cpu governor is correctly set to max performance in the bios and in linux.
>
> It seems to be a problem with the intel pstate driver.
>
> I have disabled it with intel_pstate=disable in grub (it can also be disabled with CONFIG_X86_INTEL_PSTATE=n),
> and now the frequency is super stable at 3.1GHz.
>
> Has somebody already seen this on Xeon v3?

Can confirm that on SandyBridge:

model name : Intel(R) Xeon(R) CPU E5-2665 0 @ 2.40GHz

root@msv-spb-pve01:/usr/share/perl5/PVE/Storage# cat
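To quantify the frequency spread in reports like the one above, the `cpu MHz` lines from /proc/cpuinfo can be summarized with a few lines of Python (the sample values here are a subset of the idle run above):

```python
import re

def mhz_stats(cpuinfo_text):
    """Extract 'cpu MHz' values from /proc/cpuinfo output and summarize them."""
    values = [float(m) for m in re.findall(r'cpu MHz\s*:\s*([\d.]+)', cpuinfo_text)]
    return min(values), max(values), sum(values) / len(values)

sample = """
cpu MHz : 2025.292
cpu MHz : 1520.816
cpu MHz : 3261.175
cpu MHz : 1200.039
"""
lo, hi, avg = mhz_stats(sample)
print(f"min={lo:.0f} MHz, max={hi:.0f} MHz, avg={avg:.0f} MHz")
```

On a healthy host pinned to the performance governor, min and max should be close together; a wide spread like the idle run above is the symptom being discussed.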
[pve-devel] [PATCH pve-qemu-kvm] fix bug #615: Windows guests suddenly hang after a couple of migrations
From: "Dr. David Alan Gilbert"

Load the LAPIC state during post_load (rather than when the CPU starts).

This allows an interrupt to be delivered from the ioapic to the lapic prior to cpu loading, in particular the RTC that starts ticking as soon as we load its state.

Partially fixes a case where Windows hangs after migration due to RTC interrupts disappearing; it survived ~30 iterations of my test where as
---
 debian/patches/series                              |   1 +
 .../x86-lapic-Load-LAPIC-state-at-post_load.patch  | 134 +
 2 files changed, 135 insertions(+)
 create mode 100644 debian/patches/x86-lapic-Load-LAPIC-state-at-post_load.patch

diff --git a/debian/patches/series b/debian/patches/series
index d1470ba..4d50eef 100644
--- a/debian/patches/series
+++ b/debian/patches/series
@@ -74,3 +74,4 @@ extra/0003-9pfs-handle-walk-of-.-in-the-root-directory.patch
 extra/CVE-2016-7155-scsi-check-page-count-while-initialising-descriptor-.patch
 extra/CVE-2016-7156-scsi-pvscsi-avoid-infinite-loop-while-building-SG-li.patch
 extra/CVE-2016-7157-scsi-mptconfig-fix-an-assert-expression.patch
+x86-lapic-Load-LAPIC-state-at-post_load.patch
diff --git a/debian/patches/x86-lapic-Load-LAPIC-state-at-post_load.patch b/debian/patches/x86-lapic-Load-LAPIC-state-at-post_load.patch
new file mode 100644
index 000..b7fe250
--- /dev/null
+++ b/debian/patches/x86-lapic-Load-LAPIC-state-at-post_load.patch
@@ -0,0 +1,134 @@
+From 78d6a05d2f69cbfa6e95f0a4a24a2c934969913b Mon Sep 17 00:00:00 2001
+From: "Dr. David Alan Gilbert"
+Date: Mon, 12 Sep 2016 18:18:35 +0100
+Subject: [PATCH] x86/lapic: Load LAPIC state at post_load
+
+Load the LAPIC state during post_load (rather than when the CPU
+starts).
+
+This allows an interrupt to be delivered from the ioapic to
+the lapic prior to cpu loading, in particular the RTC that starts
+ticking as soon as we load its state.
+
+Fixes a case where Windows hangs after migration due to RTC interrupts
+disappearing.
+
+Signed-off-by: Dr. David Alan Gilbert
+Suggested-by: Paolo Bonzini
+Signed-off-by: Paolo Bonzini
+---
+ hw/i386/kvm/apic.c   | 27 +--
+ include/sysemu/kvm.h |  1 -
+ target-i386/kvm.c    | 17 -
+ 3 files changed, 25 insertions(+), 20 deletions(-)
+
+diff --git a/hw/i386/kvm/apic.c b/hw/i386/kvm/apic.c
+index 2bd0de8..5d140b9 100644
+--- a/hw/i386/kvm/apic.c
++++ b/hw/i386/kvm/apic.c
+@@ -28,9 +28,8 @@ static inline uint32_t kvm_apic_get_reg(struct kvm_lapic_state *kapic,
+     return *((uint32_t *)(kapic->regs + (reg_id << 4)));
+ }
+ 
+-void kvm_put_apic_state(DeviceState *dev, struct kvm_lapic_state *kapic)
++static void kvm_put_apic_state(APICCommonState *s, struct kvm_lapic_state *kapic)
+ {
+-    APICCommonState *s = APIC_COMMON(dev);
+     int i;
+ 
+     memset(kapic, 0, sizeof(*kapic));
+@@ -125,6 +124,27 @@ static void kvm_apic_vapic_base_update(APICCommonState *s)
+     }
+ }
+ 
++static void kvm_apic_put(void *data)
++{
++    APICCommonState *s = data;
++    struct kvm_lapic_state kapic;
++    int ret;
++
++    kvm_put_apic_state(s, &kapic);
++
++    ret = kvm_vcpu_ioctl(CPU(s->cpu), KVM_SET_LAPIC, &kapic);
++    if (ret < 0) {
++        fprintf(stderr, "KVM_SET_LAPIC failed: %s\n", strerror(ret));
++        abort();
++    }
++}
++
++static void kvm_apic_post_load(APICCommonState *s)
++{
++    fprintf(stderr, "%s: Yeh\n", __func__);
++    run_on_cpu(CPU(s->cpu), kvm_apic_put, s);
++}
++
+ static void do_inject_external_nmi(void *data)
+ {
+     APICCommonState *s = data;
+@@ -178,6 +198,8 @@ static void kvm_apic_reset(APICCommonState *s)
+ {
+     /* Not used by KVM, which uses the CPU mp_state instead. */
+     s->wait_for_sipi = 0;
++
++    run_on_cpu(CPU(s->cpu), kvm_apic_put, s);
+ }
+ 
+ static void kvm_apic_realize(DeviceState *dev, Error **errp)
+@@ -206,6 +228,7 @@ static void kvm_apic_class_init(ObjectClass *klass, void *data)
+     k->set_base = kvm_apic_set_base;
+     k->set_tpr = kvm_apic_set_tpr;
+     k->get_tpr = kvm_apic_get_tpr;
++    k->post_load = kvm_apic_post_load;
+     k->enable_tpr_reporting = kvm_apic_enable_tpr_reporting;
+     k->vapic_base_update = kvm_apic_vapic_base_update;
+     k->external_nmi = kvm_apic_external_nmi;
+diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
+index 4938f65..f2a7b3b 100644
+--- a/include/sysemu/kvm.h
++++ b/include/sysemu/kvm.h
+@@ -371,7 +371,6 @@ int kvm_irqchip_send_msi(KVMState *s, MSIMessage msg);
+ 
+ void kvm_irqchip_add_irq_route(KVMState *s, int gsi, int irqchip, int pin);
+ 
+-void kvm_put_apic_state(DeviceState *d, struct kvm_lapic_state *kapic);
+ void kvm_get_apic_state(DeviceState *d, struct kvm_lapic_state *kapic);
+ 
+ struct kvm_guest_debug;
+diff --git a/target-i386/kvm.c b/target-i386/kvm.c
+index d1a25c5..f1ad805 100644
+--- a/target-i386/kvm.c
++++ b/target-i386/kvm.c
+@@ -2416,19 +2416,6 @@ static int
[pve-devel] [PATCH pve-storage] Add support of subvol format capability to NFS plugin
Because it actually supports subvols; it just was not checked.

Signed-off-by: Dmitry Petuhov
---
 PVE/Storage/NFSPlugin.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/PVE/Storage/NFSPlugin.pm b/PVE/Storage/NFSPlugin.pm
index df00f37..bfc2356 100644
--- a/PVE/Storage/NFSPlugin.pm
+++ b/PVE/Storage/NFSPlugin.pm
@@ -53,7 +53,7 @@ sub plugindata {
     return {
 	content => [ { images => 1, rootdir => 1, vztmpl => 1, iso => 1, backup => 1},
 		     { images => 1 }],
-	format => [ { raw => 1, qcow2 => 1, vmdk => 1 } , 'raw' ],
+	format => [ { raw => 1, qcow2 => 1, vmdk => 1, subvol => 1 } , 'raw' ],
     };
 }
--
2.1.4
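The plugindata() format entry is a pair of [supported-formats-hash, default-format], and a storage supports subvols exactly when 'subvol' appears in that hash. A rough Python sketch of the capability lookup (the data structure mirrors the Perl return value above; the function names echo PVE::Storage helpers but are simplified stand-ins, not the real API):

```python
# [supported formats, default format] — mirrors the Perl plugindata() return value
PLUGINDATA = {
    'nfs': {'format': [{'raw': 1, 'qcow2': 1, 'vmdk': 1, 'subvol': 1}, 'raw']},
    'lvm': {'format': [{'raw': 1}, 'raw']},
}

def storage_default_format(storage_type):
    """Return (default_format, valid_formats) roughly like PVE::Storage does."""
    formats, default = PLUGINDATA[storage_type]['format']
    return default, sorted(formats)

def supports_subvol(storage_type):
    _, valid = storage_default_format(storage_type)
    return 'subvol' in valid

print(supports_subvol('nfs'))   # with this patch applied
print(supports_subvol('lvm'))
```

This is the check the companion pve-container patch relies on instead of hard-coded plugin type names.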
[pve-devel] v2 Make LXC code independent of storage plugin names
This patch series simplifies LXC volume creation and makes it independent of storage plugin names. It will allow using LXC with custom plugins.

This is a fixed version, following the review comments.
[pve-devel] [PATCH pve-container v2] Make volume create and mount code independent of storage plugin types
Instead, rely on the plugin-defined supported formats to decide if it supports subvols.

Signed-off-by: Dmitry Petuhov
---
This still works around the 'rbd' plugin, which cannot be used for LXC without krbd. Maybe a better way is to pass $scfg into the plugin's plugindata() in the storage code? Then a plugin could report the instance's capabilities instead of the whole plugin's capabilities, which may or may not be available in a given instance.

It also works around the special case of 'zfspool' storage, where we always want to use subvols with LXC. Maybe signal that the plugin wants such behaviour via plugindata() too? Like setting the default format to subvol instead of raw in zfspool?

 src/PVE/LXC.pm | 45 +++--
 1 file changed, 15 insertions(+), 30 deletions(-)

diff --git a/src/PVE/LXC.pm b/src/PVE/LXC.pm
index 35ce796..70d3b50 100644
--- a/src/PVE/LXC.pm
+++ b/src/PVE/LXC.pm
@@ -1238,12 +1238,9 @@ sub mountpoint_mount {
 	if ($scfg->{path}) {
 	    $mounted_dev = run_with_loopdev($domount, $path);
 	    $use_loopdev = 1;
-	} elsif ($scfg->{type} eq 'drbd' || $scfg->{type} eq 'lvm' ||
-		 $scfg->{type} eq 'rbd' || $scfg->{type} eq 'lvmthin') {
+	} else {
 	    $mounted_dev = $path;
 	    &$domount($path);
-	} else {
-	    die "unsupported storage type '$scfg->{type}'\n";
 	}
 	return wantarray ? ($path, $use_loopdev, $mounted_dev) : $path;
     } else {
@@ -1331,43 +1328,31 @@ sub create_disks {
 	    my $size_kb = int(${size_gb}*1024) * 1024;
 
 	    my $scfg = PVE::Storage::storage_config($storecfg, $storage);
+	    die "krbd option must be enabled on storage type 'rbd'\n" if ($scfg->{type} eq 'rbd') && !$scfg->{krbd};
 
 	    # fixme: use better naming ct-$vmid-disk-X.raw?
-	    if ($scfg->{type} eq 'dir' || $scfg->{type} eq 'nfs') {
-		if ($size_kb > 0) {
-		    $volid = PVE::Storage::vdisk_alloc($storecfg, $storage, $vmid, 'raw',
-						       undef, $size_kb);
-		    format_disk($storecfg, $volid, $rootuid, $rootgid);
-		} else {
+	    if ($size_kb > 0 && $scfg->{type} ne 'zfspool') {
+		$volid = PVE::Storage::vdisk_alloc($storecfg, $storage, $vmid, 'raw',
+						   undef, $size_kb);
+		format_disk($storecfg, $volid, $rootuid, $rootgid);
+	    } else {
+		my (undef, $valid_formats) = PVE::Storage::storage_default_format($storecfg, $storage);
+		if (grep { $_ eq 'subvol' } @$valid_formats) {
 		    $volid = PVE::Storage::vdisk_alloc($storecfg, $storage, $vmid, 'subvol',
-						       undef, 0);
+						       undef, $size_kb);
 		    push @$chown_vollist, $volid;
+		} else {
+		    die "Selected storage does not support subvols. Please, specify image size or select another storage";
 		}
-	    } elsif ($scfg->{type} eq 'zfspool') {
-
-		$volid = PVE::Storage::vdisk_alloc($storecfg, $storage, $vmid, 'subvol',
-						   undef, $size_kb);
-		push @$chown_vollist, $volid;
-	    } elsif ($scfg->{type} eq 'drbd' || $scfg->{type} eq 'lvm' || $scfg->{type} eq 'lvmthin') {
-
-		$volid = PVE::Storage::vdisk_alloc($storecfg, $storage, $vmid, 'raw', undef, $size_kb);
-		format_disk($storecfg, $volid, $rootuid, $rootgid);
-
-	    } elsif ($scfg->{type} eq 'rbd') {
-
-		die "krbd option must be enabled on storage type '$scfg->{type}'\n" if !$scfg->{krbd};
-		$volid = PVE::Storage::vdisk_alloc($storecfg, $storage, $vmid, 'raw', undef, $size_kb);
-		format_disk($storecfg, $volid, $rootuid, $rootgid);
-	    } else {
-		die "unable to create containers on storage type '$scfg->{type}'\n";
 	    }
+
 	    push @$vollist, $volid;
 	    $mountpoint->{volume} = $volid;
 	    $mountpoint->{size} = $size_kb * 1024;
 	    $conf->{$ms} = PVE::LXC::Config->print_ct_mountpoint($mountpoint, $ms eq 'rootfs');
 	} else {
 	    # use specified/existing volid/dir/device
 	    $conf->{$ms} = PVE::LXC::Config->print_ct_mountpoint($mountpoint, $ms eq 'rootfs');
 	}
     });
--
2.1.4
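The restructured create_disks() above boils down to a three-way decision. A hedged Python sketch of that control flow (the storage types and the subvol-capability list are simplified stand-ins for the PVE::Storage calls, not the real API):

```python
def choose_volume(size_kb, storage_type, valid_formats):
    """Mirror the patched create_disks() decision: raw image, subvol, or error."""
    if size_kb > 0 and storage_type != 'zfspool':
        return 'raw'       # allocate a raw image of size_kb and mkfs it
    if 'subvol' in valid_formats:
        return 'subvol'    # zfspool always; other storages only when size == 0
    raise ValueError("Selected storage does not support subvols. "
                     "Please specify an image size or select another storage.")

print(choose_volume(4 * 1024 * 1024, 'nfs', ['raw', 'qcow2', 'subvol']))
print(choose_volume(0, 'nfs', ['raw', 'qcow2', 'subvol']))
print(choose_volume(4 * 1024 * 1024, 'zfspool', ['subvol']))
```

Note how this encodes the zfspool special case Fabian objects to elsewhere in the thread: with a non-zero size on zfspool the sketch still picks a subvol, whereas on other storages a non-zero size forces a raw image.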
Re: [pve-devel] [PATCH manager 1/2] virtio-scsi-pci as default SCSI controller for new VMs (fix #1106)
On 09/19/2016 11:23 AM, Emmanuel Kasper wrote:
> On 09/19/2016 10:38 AM, Caspar Smit wrote:
>> Ok, but since the scsihw: 'virtio-scsi-pci' is set in the generic OS-defaults
>> template and the w2k OS-defaults have a generic parent, doesn't that inherit
>> all settings from generic? Why else does it need a parent?
>>
>> As I read the code, the 'w2k' OS defaults are:
>>
>> busType: 'ide' (from generic parent)
>> scsihw: 'virtio-scsi-pci' (from generic parent)
>> networkCard: 'rtl8139' (from w2k template, overriding e1000 from generic)
>>
>> So creating a w2k VM will use IDE as the default for disks, so everything works,
>> BUT when you change IDE to SCSI you have to change the SCSI controller too
>> (virtio-scsi-pci is not compatible with w2k). Wouldn't it be better to have
>> a 'sane' default SCSI controller for w2k?
>
> I am wondering how often people install w2k from scratch these days
> and then decide they want to switch from IDE to SCSI, but since the
> change is trivial, I will propose a patch.

Can you also submit a bugzilla entry so we can properly track the issue? (https://bugzilla.proxmox.com/)
Re: [pve-devel] [PATCH manager 1/2] virtio-scsi-pci as default SCSI controller for new VMs (fix #1106)
On 09/19/2016 10:38 AM, Caspar Smit wrote:
> Ok, but since the scsihw: 'virtio-scsi-pci' is set in the generic OS-defaults
> template and the w2k OS-defaults have a generic parent, doesn't that inherit
> all settings from generic? Why else does it need a parent?
>
> As I read the code, the 'w2k' OS defaults are:
>
> busType: 'ide' (from generic parent)
> scsihw: 'virtio-scsi-pci' (from generic parent)
> networkCard: 'rtl8139' (from w2k template, overriding e1000 from generic)
>
> So creating a w2k VM will use IDE as the default for disks, so everything works,
> BUT when you change IDE to SCSI you have to change the SCSI controller too
> (virtio-scsi-pci is not compatible with w2k). Wouldn't it be better to have
> a 'sane' default SCSI controller for w2k?

I am wondering how often people install w2k from scratch these days and then decide they want to switch from IDE to SCSI, but since the change is trivial, I will propose a patch.

BTW, w2k3 would also be a candidate for keeping lsi here, as:
* it does not support any kind of VirtIO (i.e. there is nothing for w2k3 in the latest VirtIO iso)
* when we researched it, we found out w2k3 also includes the LSI drivers
Re: [pve-devel] [PATCH manager 1/2] virtio-scsi-pci as default SCSI controller for new VMs (fix #1106)
Ok, but since the scsihw: 'virtio-scsi-pci' is set in the generic OS-defaults template and the w2k OS-defaults have a generic parent, doesn't that inherit all settings from generic? Why else does it need a parent?

As I read the code, the 'w2k' OS defaults are:

busType: 'ide' (from generic parent)
scsihw: 'virtio-scsi-pci' (from generic parent)
networkCard: 'rtl8139' (from w2k template, overriding e1000 from generic)

So creating a w2k VM will use IDE as the default for disks, so everything works, BUT when you change IDE to SCSI you have to change the SCSI controller too (virtio-scsi-pci is not compatible with w2k). Wouldn't it be better to have a 'sane' default SCSI controller for w2k?

Kind regards,
Caspar Smit

2016-09-19 10:18 GMT+02:00 Emmanuel Kasper:
> On 09/16/2016 03:11 PM, Caspar Smit wrote:
>> Hi,
>>
>> I'm assuming this commit will break the 'w2k' pveOS default (because the
>> scsihw will be inherited from generic):
>
> not really, because presetting a different kind of SCSI controller will
> not impact the _default_ controller, which will still be IDE
> for all new VMs except l26
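The parent/child resolution Caspar describes can be sketched as a simple dict merge, with the child's overrides winning over the generic parent (the template values come from the thread itself; the ChainMap-style lookup is an illustration of the inheritance, not the actual ExtJS code):

```python
from collections import ChainMap

# values taken from the discussion above
generic = {'busType': 'ide', 'scsihw': 'virtio-scsi-pci', 'networkCard': 'e1000'}
w2k_overrides = {'networkCard': 'rtl8139'}

# child settings win; everything else is inherited from the generic parent
w2k = dict(ChainMap(w2k_overrides, generic))

print(w2k['busType'])      # inherited from generic
print(w2k['scsihw'])       # inherited from generic — the value under discussion
print(w2k['networkCard'])  # overridden by the w2k template
```

The question in the thread is whether w2k should add its own scsihw override (a controller W2K has drivers for) instead of inheriting virtio-scsi-pci.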
Re: [pve-devel] Make LXC code independent of storage plugin names
> Also, why can't I set the volume size to zero on volume creation via the web UI
> and so use subvols? Is it a bug?

Simply using directories is a hack, because there is no disk quota in that case. That is why I disabled it in the GUI.

Or is it a feature?
Re: [pve-devel] Make LXC code independent of storage plugin names
On Mon, Sep 19, 2016 at 10:37:01AM +0300, Dmitry Petuhov wrote:
> 19.09.2016 08:51, Fabian Grünbichler wrote:
>> general remark, rest of comments inline:
>>
>> the indentation is all messed up. I know our codebase is not very clean
>> in this regard anyway, but for new / touched code we try to keep / make
>> it clean. our indentation for perl code is a mix of tabs and spaces,
>> with tabs being 8 spaces long.
>>
>> the first indentation level is indented with 4 spaces, the second with
>> 1 tab, the third with 1 tab and 4 spaces, the fourth with 2 tabs, and
>> so on. if you use vim, I recommend enabling list mode to easily
>> differentiate between the two.
>
> Ok, configured my mcedit. Maybe it would be a good idea to add a coding
> style memo to https://pve.proxmox.com/wiki/Developer_Documentation ?

yes, that would probably be a good idea :)

>> you drop the $mounted_dev assignment in the else branch. after a quick
>> search of the calling code, this will probably at least break quota
>> support for such volumes?
>
> Actually it does not. mountpoint_mount() is being called in list context only
> once (in PVE::API2::LXC) and only $use_loopdev is being used there. In other
> places it is being called in scalar context, or its return values are
> ignored entirely. So currently it breaks nothing. But OK, I can keep this
> assignment for the future.

our LXC prestart hook uses it to set up the devices for quota support, see
src/lxc-pve-prestart-hook, lines 79-108:

	my $devlist_file = "/var/lib/lxc/$vmid/devices";
	unlink $devlist_file;
	my $devices = [];

	my $setup_mountpoint = sub {
	    my ($ms, $mountpoint) = @_;

	    #return if $ms eq 'rootfs';
	    my (undef, undef, $dev) = PVE::LXC::mountpoint_mount($mountpoint, $rootdir, $storage_cfg);
	    push @$devices, $dev if $dev && $mountpoint->{quota};
	};

	PVE::LXC::Config->foreach_mountpoint($conf, $setup_mountpoint);

	my $lxc_setup = PVE::LXC::Setup->new($conf, $rootdir);
	$lxc_setup->pre_start_hook();

	if (@$devices) {
	    my $devlist = '';
	    foreach my $dev (@$devices) {
		my ($mode, $rdev) = (stat($dev))[2,6];
		next if !$mode || !S_ISBLK($mode) || !$rdev;
		my $major = int($rdev / 0x100);
		my $minor = $rdev % 0x100;
		$devlist .= "b:$major:$minor:$dev\n";
	    }
	    PVE::Tools::file_set_contents($devlist_file, $devlist);
	}
	return undef;
    }});

>> subvols are not only for size == 0, e.g. ZFS supports subvols with and
>> without a size (limit). ZFS is a bit tricky unfortunately, as for
>> containers we always want subvols (regular ZFS filesystems), and for
>> VMs we always want raw volumes ("zvols" in ZFS speak). so changing the
>> default format does not really work, and we still need a special case
>> for ZFS (and future storages with the same problem) here?
>
> But why are we limiting zfspool to subvols only? Following this logic, we
> should forbid raw images for LXC on any other filesystem-based storage and
> use only subvols there, like it was with OpenVZ. Or we can spread the
> generic behaviour to zfspool, like I did.
> Maybe as a compromise we could add a configurable option to plugins to let
> the user decide?

It's actually the opposite: we are limited for KVM/Qemu, where we can only use zvols. ZFS datasets are vastly superior to zvols for container usage, so that is the default (and only option) for them.

In theory we could allow zvols as well, but there is no reason to do so (they only have disadvantages, and checking all the places where subvols are implicitly assumed right now is tedious work for basically no gain).

All of this is orthogonal to the issue at hand though, which is that your patch would break a currently existing feature: support for ZFS subvols for containers WITH and without a size (limit). This is a feature which we definitely don't want to drop ;)

> Also, why can't I set the volume size to zero on volume creation via the web
> UI and so use subvols? Is it a bug?

It's intentional. Some advanced features are not available in the GUI, either to keep the GUI simple, or because they are only available to the root user for security reasons. This one belongs in the first category (IIRC, it's only available to allow setups similar to legacy OpenVZ ones, but I may be wrong about that).
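The prestart hook quoted in this message derives each device's major/minor numbers from st_rdev with the classic 8-bit split (`int($rdev / 0x100)`, `$rdev % 0x100`). A quick Python check of that arithmetic, using os.makedev to build a sample rdev (note this legacy split only matches os.major/os.minor for small device numbers; real hosts with large minors would need the os functions instead):

```python
import os

def split_rdev(rdev):
    """Same arithmetic as the Perl hook: int($rdev / 0x100), $rdev % 0x100."""
    return rdev // 0x100, rdev % 0x100

rdev = os.makedev(8, 1)        # block device 8:1 (historically /dev/sda1)
major, minor = split_rdev(rdev)
print(f"b:{major}:{minor}")    # the devlist entry prefix used by the hook
```

Each qualifying block device ends up as a `b:<major>:<minor>:<path>` line in /var/lib/lxc/$vmid/devices, which is why dropping the $mounted_dev return value would break quota setup.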
Re: [pve-devel] [PATCH manager 1/2] virtio-scsi-pci as default SCSI controller for new VMs (fix #1106)
On 09/16/2016 03:11 PM, Caspar Smit wrote:
> Hi,
>
> I'm assuming this commit will break the 'w2k' pveOS default (because the
> scsihw will be inherited from generic):

not really, because presetting a different kind of SCSI controller will not impact the _default_ controller, which will still be IDE for all new VMs except l26
Re: [pve-devel] intel pstate: wrong cpu frequency with performance governor
>> And it's being changed based on cpu load, like the actual governor is ondemand.

From what I read, the intel pstate "performance" governor has a min_freq/max_freq range. So it seems to be different from the cpufreq "performance" governor, which is "max performance". (min_freq can be changed manually through sysfs, but I don't think there is a kernel option to fix it to max at boot.)

I'm currently compiling the kernel with the patches from Stefan; I'll make a report this afternoon.

----- Original message -----
From: "Dmitry Petuhov" To: "pve-devel"
Sent: Monday, 19 September 2016 09:47:21
Subject: Re: [pve-devel] intel pstate: wrong cpu frequency with performance governor

19.09.2016 01:29, Alexandre DERUMIER wrote:
> Hi,
>
> I have had some strange behaviour on some hosts last week (cpu performance degrading),
> and I have found that 3 hosts of my 15-host cluster have a wrong cpu frequency.
>
> All nodes are Dell R630, with Xeon v3 3.1GHz (all with the latest bios/microcode updates and the latest proxmox kernel).
>
> On the 3 hosts, the frequency was stuck at 800MHz instead of 3.1GHz. (Note that on the other hosts, the frequency was not stable, going up and down between 3.09 and 3.2GHz.)
>
> The cpu governor is correctly set to max performance in the bios and in linux.
>
> It seems to be a problem with the intel pstate driver.
>
> I have disabled it with intel_pstate=disable in grub (it can also be disabled with CONFIG_X86_INTEL_PSTATE=n),
> and now the frequency is super stable at 3.1GHz.
>
> Has somebody already seen this on Xeon v3?

Can confirm that on SandyBridge:

model name : Intel(R) Xeon(R) CPU E5-2665 0 @ 2.40GHz

root@msv-spb-pve01:/usr/share/perl5/PVE/Storage# cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq
1199906

And it's being changed based on cpu load, like the actual governor is ondemand.
[pve-devel] [PATCH kvm] various CVE fixes
CVE-2016-7170: vmsvga: correct bitmap and pixmap size checks
CVE-2016-7421: scsi: pvscsi: limit process IO loop to ring size
CVE-2016-7423: scsi: mptsas: use g_new0 to allocate MPTSASRequest object
---
 ...vga-correct-bitmap-and-pixmap-size-checks.patch | 45 ++
 ...pvscsi-limit-process-IO-loop-to-ring-size.patch | 38 ++
 ...-use-g_new0-to-allocate-MPTSASRequest-obj.patch | 35 +
 debian/patches/series                              |  3 ++
 4 files changed, 121 insertions(+)
 create mode 100644 debian/patches/extra/CVE-2016-7170-vmsvga-correct-bitmap-and-pixmap-size-checks.patch
 create mode 100644 debian/patches/extra/CVE-2016-7421-scsi-pvscsi-limit-process-IO-loop-to-ring-size.patch
 create mode 100644 debian/patches/extra/CVE-2016-7423-scsi-mptsas-use-g_new0-to-allocate-MPTSASRequest-obj.patch

diff --git a/debian/patches/extra/CVE-2016-7170-vmsvga-correct-bitmap-and-pixmap-size-checks.patch b/debian/patches/extra/CVE-2016-7170-vmsvga-correct-bitmap-and-pixmap-size-checks.patch
new file mode 100644
index 000..732f679
--- /dev/null
+++ b/debian/patches/extra/CVE-2016-7170-vmsvga-correct-bitmap-and-pixmap-size-checks.patch
@@ -0,0 +1,45 @@
+From 167d97a3def77ee2dbf6e908b0ecbfe2103977db Mon Sep 17 00:00:00 2001
+From: Prasad J Pandit
+Date: Thu, 8 Sep 2016 18:15:54 +0530
+Subject: [PATCH] vmsvga: correct bitmap and pixmap size checks
+
+When processing svga command DEFINE_CURSOR in vmsvga_fifo_run,
+the computed BITMAP and PIXMAP size are checked against the
+'cursor.mask[]' and 'cursor.image[]' array sizes in bytes.
+Correct these checks to avoid OOB memory access.
+
+Reported-by: Qinghao Tang
+Reported-by: Li Qiang
+Signed-off-by: Prasad J Pandit
+Message-id: 1473338754-15430-1-git-send-email-ppan...@redhat.com
+Signed-off-by: Gerd Hoffmann
+---
+ hw/display/vmware_vga.c | 12 +++-
+ 1 file changed, 7 insertions(+), 5 deletions(-)
+
+diff --git a/hw/display/vmware_vga.c b/hw/display/vmware_vga.c
+index e51a05e..6599cf0 100644
+--- a/hw/display/vmware_vga.c
++++ b/hw/display/vmware_vga.c
+@@ -676,11 +676,13 @@ static void vmsvga_fifo_run(struct vmsvga_state_s *s)
+         cursor.bpp = vmsvga_fifo_read(s);
+
+         args = SVGA_BITMAP_SIZE(x, y) + SVGA_PIXMAP_SIZE(x, y, cursor.bpp);
+-        if (cursor.width > 256 ||
+-            cursor.height > 256 ||
+-            cursor.bpp > 32 ||
+-            SVGA_BITMAP_SIZE(x, y) > sizeof cursor.mask ||
+-            SVGA_PIXMAP_SIZE(x, y, cursor.bpp) > sizeof cursor.image) {
++        if (cursor.width > 256
++            || cursor.height > 256
++            || cursor.bpp > 32
++            || SVGA_BITMAP_SIZE(x, y)
++                > sizeof(cursor.mask) / sizeof(cursor.mask[0])
++            || SVGA_PIXMAP_SIZE(x, y, cursor.bpp)
++                > sizeof(cursor.image) / sizeof(cursor.image[0])) {
+             goto badcmd;
+         }
+
+--
+2.1.4
+
diff --git a/debian/patches/extra/CVE-2016-7421-scsi-pvscsi-limit-process-IO-loop-to-ring-size.patch b/debian/patches/extra/CVE-2016-7421-scsi-pvscsi-limit-process-IO-loop-to-ring-size.patch
new file mode 100644
index 000..05ab4a5
--- /dev/null
+++ b/debian/patches/extra/CVE-2016-7421-scsi-pvscsi-limit-process-IO-loop-to-ring-size.patch
@@ -0,0 +1,38 @@
+From d251157ac1928191af851d199a9ff255d330bec9 Mon Sep 17 00:00:00 2001
+From: Prasad J Pandit
+Date: Wed, 14 Sep 2016 15:09:12 +0530
+Subject: [PATCH] scsi: pvscsi: limit process IO loop to ring size
+
+Vmware Paravirtual SCSI emulator while processing IO requests
+could run into an infinite loop if 'pvscsi_ring_pop_req_descr'
+always returned positive value. Limit IO loop to the ring size.
+
+Cc: qemu-sta...@nongnu.org
+Reported-by: Li Qiang
+Signed-off-by: Prasad J Pandit
+Message-Id: <1473845952-30785-1-git-send-email-ppan...@redhat.com>
+Signed-off-by: Paolo Bonzini
+---
+ hw/scsi/vmw_pvscsi.c | 5 -
+ 1 file changed, 4 insertions(+), 1 deletion(-)
+
+diff --git a/hw/scsi/vmw_pvscsi.c b/hw/scsi/vmw_pvscsi.c
+index babac5a..a5ce7de 100644
+--- a/hw/scsi/vmw_pvscsi.c
++++ b/hw/scsi/vmw_pvscsi.c
+@@ -247,8 +247,11 @@ static hwaddr
+ pvscsi_ring_pop_req_descr(PVSCSIRingInfo *mgr)
+ {
+     uint32_t ready_ptr = RS_GET_FIELD(mgr, reqProdIdx);
++    uint32_t ring_size = PVSCSI_MAX_NUM_PAGES_REQ_RING
++                         * PVSCSI_MAX_NUM_REQ_ENTRIES_PER_PAGE;
+
+-    if (ready_ptr != mgr->consumed_ptr) {
++    if (ready_ptr != mgr->consumed_ptr
++        && ready_ptr - mgr->consumed_ptr < ring_size) {
+         uint32_t next_ready_ptr =
+             mgr->consumed_ptr++ & mgr->txr_len_mask;
+         uint32_t next_ready_page =
+--
+2.1.4
+
diff --git
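A side note on the pvscsi fix: the added bound works because the producer index (from guest-shared memory) and the consumer index are free-running 32-bit counters, so `produced - consumed` under unsigned wraparound yields the number of pending entries, and any difference of a full ring or more signals a corrupt or malicious producer value. A minimal standalone sketch of that check (the names `Ring`, `ring_pop`, `drain`, and `RING_SIZE` are illustrative stand-ins, not QEMU's):

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for PVSCSI_MAX_NUM_PAGES_REQ_RING * entries-per-page. */
#define RING_SIZE 128u

typedef struct {
    uint32_t consumed;   /* free-running consumer counter */
} Ring;

/* Pop one request if the producer index is within one ring's worth
 * of the consumer -- the bound the patch adds. A bogus "produced"
 * value can no longer keep this returning 1 forever. */
static int ring_pop(Ring *r, uint32_t produced)
{
    if (produced != r->consumed && produced - r->consumed < RING_SIZE) {
        r->consumed++;
        return 1;
    }
    return 0;
}

/* Drain all pending entries; returns how many were processed. */
static uint32_t drain(Ring *r, uint32_t produced)
{
    uint32_t n = 0;
    while (ring_pop(r, produced)) {
        n++;
    }
    return n;
}
```

The subtraction also stays correct when the counters wrap around zero, which is why the patch compares the difference rather than the raw indices.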
Re: [pve-devel] intel pstate: wrong cpu frequency with performance governor
19.09.2016 01:29, Alexandre DERUMIER wrote:
> Hi,
>
> I have had some strange behaviour on some hosts last week (cpu performance
> degrading), and I have found that 3 hosts of my 15-host cluster have a wrong
> cpu frequency.
>
> All nodes are Dell R630, with Xeon v3 3.1GHz (all with the latest
> bios/microcode updates and the latest proxmox kernel).
>
> On the 3 hosts, the frequency was stuck at 800MHz instead of 3.1GHz.
> (note that on the other hosts, the frequency was not stable either, going up
> and down between 3.09 and 3.2GHz)
>
> The cpu governor is correctly set to max performance in bios + linux.
>
> It seems to be a problem with the intel pstate driver. I have disabled it
> with intel_pstate=disable in grub (it can also be disabled with
> CONFIG_X86_INTEL_PSTATE=n), and now the frequency is super stable at 3.1GHz.
>
> Has somebody already seen this on Xeon v3?

I can confirm that on SandyBridge:

model name : Intel(R) Xeon(R) CPU E5-2665 0 @ 2.40GHz

root@msv-spb-pve01:/usr/share/perl5/PVE/Storage# cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq
1199906

And it is being changed based on cpu load, as if the actual governor were ondemand.
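For reference, the grub-based disabling mentioned above is just a kernel command-line parameter; a minimal sketch of what the edit typically looks like on a Debian-based system such as Proxmox VE (the exact file and the existing options in it may differ on your install):

```shell
# /etc/default/grub -- append intel_pstate=disable to the existing options
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_pstate=disable"

# then regenerate the grub config and reboot
update-grub
```

After the reboot, /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver should typically report acpi-cpufreq instead of intel_pstate.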
Re: [pve-devel] Make LXC code independent of storage plugin names
19.09.2016 08:51, Fabian Grünbichler wrote:
> I sent you some feedback on patch #1 yesterday. Since it breaks stuff, it
> can't be merged (yet). Feel free to send an updated v2 (or if you feel the
> feedback requires discussion, feel free to respond). Thanks!

Sorry, I did not receive it. Maybe Google's spam filters munched it.

> General remark, rest of comments inline: the indentation is all messed up.
> I know our codebase is not very clean in this regard anyway, but for new /
> touched code we try to keep / make it clean. Our indentation for perl code
> is a mix of tabs and spaces, with tabs being 8 spaces long. The first
> indentation level is indented with 4 spaces, the second with 1 tab, the
> third with 1 tab and 4 spaces, the fourth with 2 tabs, and so on. If you
> use vim, I recommend enabling list mode to easily differentiate between
> the two.

Ok, I configured my mcedit. Maybe it would be a good idea to add a coding
style memo to https://pve.proxmox.com/wiki/Developer_Documentation ?

> You drop the $mounted_dev assignment in the else branch. After a quick
> search of the calling code, this will probably at least break quota
> support for such volumes?

Actually it does not. mountpoint_mount() is called in list context only once
(in PVE::API2::LXC), and only $use_loopdev is used there. In other places it
is called in scalar context, or its return values are ignored entirely. So
currently it breaks nothing. But OK, I can keep this assignment for the
future.

> Subvols are not only for size == 0, e.g. ZFS supports subvols with and
> without a size (limit). ZFS is a bit tricky unfortunately, as for
> containers we always want subvols (regular ZFS filesystems), and for VMs
> we always want raw volumes ("zvols" in ZFS speak). So changing the default
> format does not really work, and we still need a special case for ZFS (and
> future storages with the same problem) here?

But why are we limiting zfspool to subvols only? Following this logic, we
should forbid raw images for LXC on any other filesystem-based storage and
use only subvols there, like it was with OpenVZ. Or we can extend the
generic behaviour to zfspool, like I did. Maybe as a compromise we could add
a configurable option to plugins to let the user decide?

Also, why can't I set the volume size to zero on volume creation via the
web UI and so use subvols? Is that a bug?
Re: [pve-devel] intel pstate: wrong cpu frequency with performance governor
Am 19.09.2016 um 09:01 schrieb Alexandre DERUMIER:
>>> @alexandre: please can you test if that solves your problem?
>
> I'll try. Do I need to apply the whole patch series?

Normally not. Just apply the whole cpufreq series. If it does not compile,
please report to me and I'll point you to the missing piece.

Stefan

> - Original message -
> From: "dietmar"
> To: "Stefan Priebe, Profihost AG" , "pve-devel"
> Cc: "aderumier"
> Sent: Monday, 19 September 2016 07:36:57
> Subject: Re: [pve-devel] intel pstate: wrong cpu frequency with performance governor
>
>> The cpufreq and intel pstate driver were somewhat broken in 4.4; there were a
>> lot of changes in 4.5 or 4.6 (can't remember). I'm using around 20 cpufreq
>> (also a lot of optimizations) patches in 4.4.
>>
>> I grabbed those from Mr Hoffstaette, who has his own repo of 4.4 patches and
>> backports.
>>
>> See here and look for the prefix cpufreq:
>> https://github.com/hhoffstaette/kernel-patches/tree/master/4.4.21
>
> @alexandre: please can you test if that solves your problem?
>
> @stefan: Do you also use the btrfs patches from hhoffstaette/... ?
Re: [pve-devel] intel pstate: wrong cpu frequency with performance governor
>> @alexandre: please can you test if that solves your problem?

I'll try. Do I need to apply the whole patch series?

- Original message -
From: "dietmar"
To: "Stefan Priebe, Profihost AG" , "pve-devel"
Cc: "aderumier"
Sent: Monday, 19 September 2016 07:36:57
Subject: Re: [pve-devel] intel pstate: wrong cpu frequency with performance governor

> The cpufreq and intel pstate driver were somewhat broken in 4.4; there were a
> lot of changes in 4.5 or 4.6 (can't remember). I'm using around 20 cpufreq
> (also a lot of optimizations) patches in 4.4.
>
> I grabbed those from Mr Hoffstaette, who has his own repo of 4.4 patches and
> backports.
>
> See here and look for the prefix cpufreq:
> https://github.com/hhoffstaette/kernel-patches/tree/master/4.4.21

@alexandre: please can you test if that solves your problem?

@stefan: Do you also use the btrfs patches from hhoffstaette/... ?
Re: [pve-devel] question/idea : managing big proxmox cluster (100nodes), get rid of corosync ?
> It's become difficult to manage, as we can't easily migrate vms across
> clusters

>> But this is difficult because there is no shared storage in that case?

I have local dc storage, but shared across dc (mainly doing live migration,
then storage migration).

But in the future (>3-5 years), I'm looking to implement ceph across 3 dc.

- Original message -
From: "dietmar"
To: "aderumier" , "pve-devel"
Sent: Monday, 19 September 2016 06:23:58
Subject: Re: [pve-devel] question/idea : managing big proxmox cluster (100nodes), get rid of corosync ?

> It's become difficult to manage, as we can't easily migrate vms across
> clusters

But this is difficult because there is no shared storage in that case?
Re: [pve-devel] question/idea : managing big proxmox cluster (100nodes), get rid of corosync ?
>> What problem do you want to solve exactly? Only more nodes? Or nodes
>> between physically distant data centers?

Mainly more nodes; multi-datacenter is a plus. (I mainly use
multi-datacenter to do live migration, then storage migration onto local dc
storage.)

>> An alternative plan would be to write a management tool which
>> can deal with multiple (corosync) clusters. I always wanted such a
>> tool, and I guess it is not really hard to write one.
>>
>> Things like HA failover would still be restricted to a single
>> corosync cluster. What do you think about this idea?

Oh yes, that could be perfect!

- Original message -
From: "dietmar"
To: "aderumier" , "pve-devel"
Sent: Monday, 19 September 2016 06:18:00
Subject: Re: [pve-devel] question/idea : managing big proxmox cluster (100nodes), get rid of corosync ?

> I'm not an expert in cluster messaging, but I have found some projects which
> seem interesting:
>
> serf: https://www.serf.io/intro/index.html
> consul: https://www.consul.io/

What problem do you want to solve exactly? Only more nodes? Or nodes
between physically distant data centers?

An alternative plan would be to write a management tool which can deal with
multiple (corosync) clusters. I always wanted such a tool, and I guess it is
not really hard to write one.

Things like HA failover would still be restricted to a single corosync
cluster. What do you think about this idea?
Re: [pve-devel] intel pstate: wrong cpu frequency with performance governor
Am 19.09.2016 um 07:36 schrieb Dietmar Maurer:
>> The cpufreq and intel pstate driver were somewhat broken in 4.4; there were a
>> lot of changes in 4.5 or 4.6 (can't remember). I'm using around 20 cpufreq
>> (also a lot of optimizations) patches in 4.4.
>>
>> I grabbed those from Mr Hoffstaette, who has his own repo of 4.4 patches and
>> backports.
>>
>> See here and look for the prefix cpufreq:
>> https://github.com/hhoffstaette/kernel-patches/tree/master/4.4.21
>
> @alexandre: please can you test if that solves your problem?
>
> @stefan: Do you also use the btrfs patches from hhoffstaette/... ?

Yes.

Stefan