[BUG] pv on hvm, possibly xen net, current master versions.
I follow the bleeding edge on dom0 kernels and Xen pretty closely, and for the past six months I have been encountering increasingly frequent hangs in the most heavily used domUs. I have tried downgrading both the dom0 kernel and Xen, but can't seem to find any combination where the bug does not materialize. The final domU crash varies, but always involves skb. If I disable xen_platform_pci in the domU, I do not get these crashes. My most heavily used domUs all involve PCIe cards passed through to the guest, plus various insanitary practices to ignore the cards' stepping where they should not, but this feels like something else. Currently I mostly run Xen 4.15 (stable-4.15 from git) and linux-5.11.14 (gentoo-sources, latest testing), but I have dom0 kernels back to linux-5.4.8 able to boot and run; not, alas, the exact ones I used to run back then. I have recently started using f2fs in dom0, and I don't know which version of gcc I ran at the time linux-5.4.8 was current, but I am fairly sure that this did not happen back then; either because of dumb luck, or because something running on the system is stressing it more now.
My easiest way to reproduce is to start a world build in the domU, wait for the load to reach 3, and fire up thunderbird and "firefox http://twitter.com/". Logs show:

--- dom0, times in UTC ---
(XEN) [2021-04-19 07:08:10] grant_table.c:1861:d7v2 Expanding d7 grant table from 6 to 7 frames
(XEN) [2021-04-19 07:21:24] grant_table.c:803:d0v6 Bad flags (0) or dom (0); expected d0
    [the above line repeats roughly 28 times within the same second, almost all on v6, with one hit each on v3 and v5]
(XEN) [2021-04-19 07:21:25] grant_table.c:803:d0v6 Bad flags (0) or dom (0); expected d0
(XEN) [2021-04-19 07:21:26] grant_table.c:803:d0v6 Bad flags (0) or dom (0); expected d0
(XEN) [2021-04-19 07:21:27] grant_table.c:803:d0v6 Bad flags (0) or dom (0); expected d0
(XEN) [2021-04-19 07:21:31] grant_table.c:803:d0v2 Bad flags (0) or dom (0); expected d0
(XEN) [2021-04-19 07:21:37] grant_table.c:803:d0v6 Bad flags (0) or dom (0); expected d0
(XEN) [2021-04-19 07:21:51] grant_table.c:803:d0v6 Bad flags (0) or dom (0); expected d0

--- domU, times in UTC+2 ---
april 19 09:21:24 gt kernel: net eth0: rx->offset: 0, size: -1
    [the above line repeats at least six times at the same timestamp]
Re: [Xen-devel] [PATCH V3 2/2] Xen/PCIback: Implement PCI flr/slot/bus reset with 'reset' SysFS attribute
Den 19.10.2020 17:16, skrev Håkon Alstadheim, replying to the thread below (oldest message first):

On Mon, Sep 17, 2018 at 02:06:02PM -0400, Boris Ostrovsky wrote:
What about the toolstack changes? Have they been accepted? I vaguely recall there was a discussion about those changes but don't remember how it ended.

On Sep 18, 2018, at 8:15 AM, Pasi Kärkkäinen wrote:
Hi, I don't think the toolstack/libxl patch has been applied yet either.
"[PATCH V1 0/1] Xen/Tools: PCI reset using 'reset' SysFS attribute": https://lists.xen.org/archives/html/xen-devel/2017-12/msg00664.html
"[PATCH V1 1/1] Xen/libxl: Perform PCI reset using 'reset' SysFS attribute": https://lists.xen.org/archives/html/xen-devel/2017-12/msg00663.html

On 9/18/18 5:32 AM, George Dunlap wrote:
Will this patch work for *BSD? Roger?

On Wed, Sep 19, 2018 at 11:05:26AM +0200, Roger Pau Monné wrote:
At least FreeBSD doesn't support pci-passthrough, so none of this works ATM. There's no sysfs on BSD, so much of what's in libxl_pci.c will have to be moved to libxl_linux.c when BSD support is added.

On 10/3/18 11:51 AM, Pasi Kärkkäinen wrote:
Ok. That sounds like it's OK for the initial pci 'reset' implementation in xl/libxl to be linux-only..

On Mon, Oct 08, 2018 at 10:32:45AM -0400, Boris Ostrovsky wrote:
Are these two patches still needed? ISTR they were written originally to deal with a guest trying to use a device that was previously assigned to another guest. But pcistub_put_pci_dev() calls __pci_reset_function_locked(), which first tries FLR, and it looks like that was added relatively recently.

On Aug 26, 2019, at 17:08, Pasi Kärkkäinen wrote:
Replying to an old thread.. I only now realized I forgot to reply to this message earlier. Afaik these patches are still needed. Håkon (CC'd) wrote to me in private that he gets a (dom0) Linux kernel crash if he doesn't have these patches applied. Here are the links to both the linux kernel and libxl patches:
"[Xen-devel] [PATCH V3 0/2] Xen/PCIback: PCI reset using 'reset' SysFS attribute": https://lists.xen.org/archives/html/xen-devel/2017-12/msg00659.html
[Note that PATCH V3 1/2 "Drivers/PCI: Export pcie_has_flr() interface" is already applied in the upstream linux kernel, so it's not needed anymore]
"[Xen-devel] [PATCH V3 2/2] Xen/PCIback: Implement PCI flr/slot/bus reset with 'reset' SysFS attribute": https://lists.xen.org/archives/html/xen-devel/2017-12/msg00661.html
"[Xen-devel] [PATCH V1 0/1] Xen/Tools: PCI reset using 'reset' SysFS attribute": https://lists.xen.org/archives/html/xen-devel/2017-12/msg00664.html
"[Xen-devel] [PATCH V1 1/1] Xen/libxl: Perform PCI reset using 'reset' SysFS attribute": https://lists.xen.org/archives/html/xen-devel/2017-12/msg00663.html

On Fri, Jan 17, 2020 at 02:13:04PM -0500, Rich Persaud wrote:
[dropping Linux mailing lists] What is required to get the Xen patches merged? Rebasing against Xen master? OpenXT has been carrying a similar patch for many years, and we would like to move to an upstream implementation. Xen users of PCI passthrough would benefit from more reliable device reset. Rebase and resend?

On Jan 31, 2020, at 3:33 PM, Wei Liu wrote:
Skimming that thread, I think the major concern was backward compatibility. That seemed to have been addressed. Unfortunately I don't have the time to dig into Linux to see if the claim there is true or not. It would be helpful to write a concise paragraph to say why backward compatibility is not required.

Den 19.10.2020 13:00, skrev George Dunlap:
Just going through my old "make sure something happens with this" mails. Did anything ever happen with this? Who has the ball here / who is this stuck on?

Den 19.10.2020 17:16, skrev Håkon Alstadheim:
We're waiting for "somebody" to testify that fixing this will not adversely affect anyone. I'm not qualified, but my strong belief is that since "reset" or "do_flr" in the linux kernel is not currently implemented/used in any official distribution, it should be OK.
Patches still work in current staging-4.14 btw. Just for the record, attached are the patches I am running on top of linux gentoo-sources-5.9.1 and xen-staging-4.14, respectively. (I am also running with the patch to mark populated reserved memory that contains ACPI tables as "ACPI NVS", not attached here.)

--- a/drivers/xen/xen-pciback/pci_stub.c	2020-03-30 21:08:39.406994339 +0200
+++ b/drivers/xen/xen-pciback/pci_stub.c	2020-03-30 20:56:18.225810279 +0200
@@ -245,6 +245,90 @@
 	return found_dev;
 }
 
+struct pcistub_args {
+	struct pci_dev *dev;
+	unsigned int dcount;
+};
+
+static int pcistub_search_dev(struct pci_dev *dev, void *data)
+{
+	struct pcistub_device *psdev;
+	struct pcistub_args *arg = data;
+	bool found_dev = false;
+	unsigned long flags;
+
+	spin_lock_irqsave(&pcistub_devices_lock, flags);
+
+	list_for_each_entry(psdev, &pcistub_devices, dev_list) {
+		if
Re: [Xen-devel] [BUG] Invalid guest state in Xen master, dom0 linux-5.3.6, domU windows 10
Den 05.11.2019 02:15, skrev Andrew Cooper:
On 05/11/2019 00:25, Andrew Cooper wrote:
On 04/11/2019 23:42, Andrew Cooper wrote:
On 04/11/2019 23:20, Håkon Alstadheim wrote:

(XEN) RFLAGS=0x0193 (0x0193) DR7 = 0x0400
(XEN) *** Insn bytes from b8868f61d69a: 44 00 00 8c d0 9c 81 0c 24 00 01 00 00 9d 8e d0 f1 9c 81 24 24 ff fe ff ff 9d c3 cc cc cc cc cc

Ok. One question answered, several more WTF.

<.data>:
   0:	44 00 00              add    %r8b,(%rax)
   3:	8c d0                 mov    %ss,%eax
   5:	9c                    pushfq
   6:	81 0c 24 00 01 00 00  orl    $0x100,(%rsp)
   d:	9d                    popfq
   e:	8e d0                 mov    %eax,%ss
  10:	f1                    icebp
  11:	9c                    pushfq
  12:	81 24 24 ff fe ff ff  andl   $0xfffffeff,(%rsp)
  19:	9d                    popfq
  1a:	c3                    retq
  1b:	cc                    int3
  1c:	cc                    int3
  1d:	cc                    int3
  1e:	cc                    int3
  1f:	cc                    int3

This is a serious exercising of architectural corner cases, by layering a single-step exception on top of a MovSS-deferred ICEBP.

Now I've looked closer, this isn't a CVE-2018-8897 exploit, as no breakpoints are configured in %dr7, so I'm going to revise my guess to some new debugger-detection in DRM software. I've reproduced the VMEntry failure you were seeing. Now to figure out if there is sufficient control available to provide architectural behaviour to the guest, because I'm not entirely certain there is, given some of ICEBP's extra magic properties.

So, this is just another case of an issue I've already tried fixing once and haven't had time to fix in a way which doesn't break other pieces of functionality. I clearly need to dust off that series and get it working properly. In the short term, this will work around your problem.

diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index f86af09898..10daaa6f33 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -522,8 +522,7 @@ static inline void hvm_invlpg(struct vcpu *v, unsigned long linear)
         (X86_CR4_VMXE | X86_CR4_PAE | X86_CR4_MCE))
 
 /* These exceptions must always be intercepted. */
-#define HVM_TRAP_MASK ((1U << TRAP_debug) |           \
-                       (1U << TRAP_alignment_check) | \
+#define HVM_TRAP_MASK ((1U << TRAP_alignment_check) | \
                        (1U << TRAP_machine_check))
 
 static inline int hvm_cpu_up(void)

However, be aware that it will reintroduce http://xenbits.xen.org/xsa/advisory-156.html so isn't recommended for general use.

Thank you kindly. Ever the optimist, I'll apply the patch. Seeing as this looks to be some DRM software, it isn't likely to mount an attack like that, as it would livelock a native system just as badly as it livelocks a virtualised system. I'm slightly relieved the malware running on my system is courtesy of big media rather than some Romanian consultant for the RNC.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [BUG] Invalid guest state in Xen master, dom0 linux-5.3.6, domU windows 10
Den 04.11.2019 14:31, skrev Andrew Cooper:
On 03/11/2019 10:23, Håkon Alstadheim wrote:

(XEN) [2019-11-02 14:09:46] d2v0 vmentry failure (reason 0x8021): Invalid guest state (0)
(XEN) [2019-11-02 14:09:46] *** VMCS Area ***
(XEN) [2019-11-02 14:09:46] *** Guest State ***
(XEN) [2019-11-02 14:09:46] CR0: actual=0x80050031, shadow=0x80050031, gh_mask=
(XEN) [2019-11-02 14:09:46] CR4: actual=0x00172678, shadow=0x00170678, gh_mask=ffe8f860
(XEN) [2019-11-02 14:09:46] CR3 = 0x001aa002
(XEN) [2019-11-02 14:09:46] RSP = 0x8c0f4dd71e98 (0x8c0f4dd71e98) RIP = 0xd18a040bb75e (0xd18a040bb75e)
(XEN) [2019-11-02 14:09:46] RFLAGS=0x0187 (0x0187) DR7 = 0x0400
(XEN) [2019-11-02 14:09:46] Sysenter RSP= CS:RIP=:
(XEN) [2019-11-02 14:09:46]        sel  attr  limit  base
(XEN) [2019-11-02 14:09:46]   CS: 0010 0209b
(XEN) [2019-11-02 14:09:46]   DS: 002b 0c0f3
(XEN) [2019-11-02 14:09:46]   SS: 0018 04093
(XEN) [2019-11-02 14:09:46]   ES: 002b 0c0f3
(XEN) [2019-11-02 14:09:46]   FS: 0053 040f3 3c00
(XEN) [2019-11-02 14:09:46]   GS: 002b 0c0f3 f8044ff8
(XEN) [2019-11-02 14:09:46] GDTR:            0057 f80459c61fb0
(XEN) [2019-11-02 14:09:46] LDTR: 1c000
(XEN) [2019-11-02 14:09:46] IDTR:            012f d18a014a0980
(XEN) [2019-11-02 14:09:46]   TR: 0040 0008b 0067 f80459c6
(XEN) [2019-11-02 14:09:46] EFER(VMCS) = 0x0d01  PAT = 0x0007010600070106
(XEN) [2019-11-02 14:09:46] PreemptionTimer = 0x  SM Base = 0x
(XEN) [2019-11-02 14:09:46] DebugCtl = 0x  DebugExceptions = 0x
(XEN) [2019-11-02 14:09:46] Interruptibility = 0002  ActivityState =
(XEN) [2019-11-02 14:09:46] InterruptStatus =
(XEN) [2019-11-02 14:09:46] *** Host State ***
(XEN) [2019-11-02 14:09:46] RIP = 0x82d080341950 (vmx_asm_vmexit_handler)  RSP = 0x83083ff0ff70
(XEN) [2019-11-02 14:09:46] CS=e008 SS= DS= ES= FS= GS= TR=e040
(XEN) [2019-11-02 14:09:46] FSBase= GSBase= TRBase=83083ff14000
(XEN) [2019-11-02 14:09:46] GDTBase=83083ff03000 IDTBase=83083ff07000
(XEN) [2019-11-02 14:09:46] CR0=80050033 CR3=00054dbea000 CR4=001526e0
(XEN) [2019-11-02 14:09:46] Sysenter RSP=83083ff0ffa0 CS:RIP=e008:82d080395440
(XEN) [2019-11-02 14:09:46] EFER = 0x0d01  PAT = 0x050100070406
(XEN) [2019-11-02 14:09:46] *** Control State ***
(XEN) [2019-11-02 14:09:46] PinBased=00bf CPUBased=b62065fa SecondaryExec=17eb
(XEN) [2019-11-02 14:09:46] EntryControls=d3ff ExitControls=002fefff
(XEN) [2019-11-02 14:09:46] ExceptionBitmap=00060002 PFECmask= PFECmatch=
(XEN) [2019-11-02 14:09:46] VMEntry: intr_info=8501 errcode= ilen=0001
(XEN) [2019-11-02 14:09:46] VMExit: intr_info=8501 errcode= ilen=0001
(XEN) [2019-11-02 14:09:46]         reason=8021 qualification=
(XEN) [2019-11-02 14:09:46] IDTVectoring: info= errcode=
(XEN) [2019-11-02 14:09:46] TSC Offset = 0xf45ded46dd57  TSC Multiplier = 0x
(XEN) [2019-11-02 14:09:46] TPR Threshold = 0x00  PostedIntrVec = 0xf5
(XEN) [2019-11-02 14:09:46] EPT pointer = 0x00054e3a701e  EPTP index = 0x
(XEN) [2019-11-02 14:09:46] PLE Gap=0080 Window=1000
(XEN) [2019-11-02 14:09:46] Virtual processor ID = 0x5a02  VMfunc controls =
(XEN) [2019-11-02 14:09:46] **
(XEN) [2019-11-02 14:09:46] domain_crash called from vmx.c:3335
(XEN) [2019-11-02 14:09:46] Domain 2 (vcpu#0) crashed on cpu#2:

Interruptibility = 0002 (Blocked by Mov SS) and VMEntry: intr_info=8501 (ICEBP)

Dare I ask what you're running in your windows guest? Unless it is a vulnerability test suite, I'm rather concerned.

Because I have pulled out all stops? Well, no particular reason. I've asked my kids nicely not to poke any /more/ holes in the security on the system. Probably should tighten up the ship. I have some conflict going on between the hardware pci USB cards in the machine, so I thought I'd see what would happen if I gave ASUS and whatever no-name Taiwanese card I have in there free rein. Nothing gained as far as I can see, so I'll see about closing some of the more gaping holes. At least as far as getting rid of deprecation warnings goes :-) . I hope "they" never get serious about requiring a license to own a computer with Internet access.
:-)
[Xen-devel] [BUG] Invalid guest state in Xen master, dom0 linux-5.3.6, domU windows 10
Got this just now, as my windows domU died:

(XEN) [2019-10-15 21:23:44] d7v0 vmentry failure (reason 0x8021): Invalid guest state (0)
(XEN) [2019-10-15 21:23:44] *** VMCS Area ***
(XEN) [2019-10-15 21:23:44] *** Guest State ***
(XEN) [2019-10-15 21:23:44] CR0: actual=0x80050031, shadow=0x80050031, gh_mask=
(XEN) [2019-10-15 21:23:44] CR4: actual=0x00172678, shadow=0x00170678, gh_mask=ffe8f860
(XEN) [2019-10-15 21:23:44] CR3 = 0x001aa002
(XEN) [2019-10-15 21:23:44] RSP = 0x908e440fae68 (0x908e440fae68) RIP = 0x9581e15d560b (0x9581e15d560b)
(XEN) [2019-10-15 21:23:44] RFLAGS=0x0197 (0x0197) DR7 = 0x0400
(XEN) [2019-10-15 21:23:44] Sysenter RSP= CS:RIP=:
(XEN) [2019-10-15 21:23:44]        sel  attr  limit  base
(XEN) [2019-10-15 21:23:56] printk: 52 messages suppressed.
(XEN) [2019-10-15 21:23:56] [VT-D]d7:PCIe: unmap :81:00.0
(XEN) [2019-10-15 21:23:56] [VT-D]d0:PCIe: map :81:00.0
(XEN) [2019-10-15 21:23:59] printk: 2 messages suppressed.
(XEN) [2019-10-15 21:23:59] [VT-D]d7:PCIe: unmap :02:00.0

# xl info
host                 : gentoo
release              : 5.3.6-gentoo-r1
version              : #2 SMP Sat Oct 12 13:48:00 CEST 2019
machine              : x86_64
nr_cpus              : 12
max_cpu_id           : 11
nr_nodes             : 2
cores_per_socket     : 6
threads_per_core     : 1
cpu_mhz              : 2471.973
hw_caps              : bfebfbff:77fef3ff:2c100800:0021:0001:37ab::0100
virt_caps            : pv hvm hvm_directio pv_directio hap shadow iommu_hap_pt_share
total_memory         : 65376
free_memory          : 12996
sharing_freed_memory : 0
sharing_used_memory  : 0
outstanding_claims   : 0
free_cpus            : 0
xen_major            : 4
xen_minor            : 13
xen_extra            : -unstable
xen_version          : 4.13-unstable
xen_caps             : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler        : credit2
xen_pagesize         : 4096
platform_params      : virt_start=0x8000
xen_changeset        :
xen_commandline      : xen.cfg xen-marker-215 console_timestamps=date iommu=1,intpost,verbose,debug iommu_inclusive_mapping=1 com1=57600,8n1 com2=57600,8n1 console=vga,com2 dom0_max_vcpus=8 dom0_mem=8G,max:8G cpufreq=xen:performance,verbose smt=0 maxcpus=12 core_parking=performance nmi=dom0 gnttab_max_frames=256 gnttab_max_maptrack_frames=1024 vcpu_migration_delay=2000 tickle_one_idle_cpu=1 spec-ctrl=no-xen sched=credit2 max_cstate=2 clocksource=tsc tsc=stable:socket timer_slop=5000 loglvl=none/warning guest_loglvl=none/warning
cc_compiler          : gcc (Gentoo 8.3.0-r1 p1.1) 8.3.0
cc_compile_by        : hakon
cc_compile_domain    : alstadheim.priv.no
cc_compile_date      : Sun Oct 13 16:18:01 CEST 2019
build_id             : c67e3aeeb910fcd06dfe7bd31a9eb820
xend_config_format   : 4
Re: [Xen-devel] [PATCH v2 0/2] xen-block: fix sector size confusion
Den 01.04.2019 11:34, skrev Kevin Wolf:
Yes, for 512-byte accesses on native 4k disks with O_DIRECT, the QEMU block layer performs the necessary RMW. Of course, it still comes with a performance penalty, so you want to avoid such setups, but they do work.

I suspect that the roughly 1/10th disk speed I see in domU compared to dom0 (e.g. 28389 vs 258719 K/sec sequential output) must be due to this. I have dom0 and domU drives as partitions on an md raid6. The performance is sub-optimal, to put it mildly. Here is one example bonnie++ run on a guest:

Version  1.97       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   3     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
gt-credit2-on-p 16G  1062  90 28389   2 20233   2  3950  82 199633  10  1195  17
Latency             19003us     140ms     652ms   20811us     602ms   33189us
Version  1.97       ------Sequential Create------ --------Random Create--------
gt-credit2-on-pt    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                127 34445  26 +++++ +++ 42424  27 67418  51 +++++ +++ 52590  35
Latency             12979us     329us    2903us     304us      38us     668us

1.97,1.97,gt-credit2-on-pt,3,1550928114,16G,,1062,90,28389,2,20233,2,3950,82,199633,10,1195,17,127,34445,26,+++++,+++,42424,27,67418,51,+++++,+++,52590,35,19003us,140ms,652ms,20811us,602ms,33189us,12979us,329us,2903us,304us,38us,668us

Same on dom0:

Version  1.97       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   3     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
XEN-4.12-DOM0   16G   475  91 258719  20 151508  19   585  99 324725  22 392.3  14
Latency             16093us    1097ms     406ms   18136us     160ms     155ms
Version  1.97       ------Sequential Create------ --------Random Create--------
XEN-4.12-DOM0       -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                  8  6563  11 +++++ +++ 13522  20 10969  18 +++++ +++  9560  13
Latency               104us      68us   18179us     120us      20us      63us

1.97,1.97,XEN-4.12-DOM0,3,1547961396,16G,,475,91,258719,20,151508,19,585,99,324725,22,392.3,14,8,6563,11,+++++,+++,13522,20,10969,18,+++++,+++,9560,13,16093us,1097ms,406ms,18136us,160ms,155ms,104us,68us,18179us,120us,20us,63us
[Xen-devel] [PATCH cargo-cult-version] For linux-4.19.x. Xen/PCIback: Implement PCI flr/slot/bus reset with 'reset' SysFS attribute
Den 23.10.2018 20:40, skrev Håkon Alstadheim:

Den 08. okt. 2018 16:32, skrev Boris Ostrovsky:
Are these two patches still needed? ISTR they were written originally to deal with a guest trying to use a device that was previously assigned to another guest. But pcistub_put_pci_dev() calls __pci_reset_function_locked(), which first tries FLR, and it looks like that was added relatively recently.

Sorry for the late reply, but I just now booted xen staging-4.11 (94fba9f438a2c36ad9bf3a481a6013ddc7cf8cd9), with gentoo-sources-4.19.0 as dom0. I shut down and started again a VM that has a secondary GPU passed through, and the whole machine hung. I haven't had time to look more closely into this, other than noting that my old "do_flr" patch no longer applies to gentoo-sources (i.e. the linux kernel sources). "do_flr" worked for me on linux-4.18.?, with the appropriate patch to the linux kernel. Without some kind of fix, my whole server (dom0) goes down whenever a domU with pci passed through is re-started.

NOTE: I am not a programmer. I have no idea what I am doing. The patch I have as a starting point does not compile when applied to kernel version 4.19.x: I get implicit declarations of pci_try_reset_slot() and pci_try_reset_bus(). Replacing those with pci_reset_bus(dev) gives the attached patch, which applies cleanly to gentoo-sources-4.19.2, compiles without warnings, and works to let me restart a domU with pci-passthrough (modulo changing do_flr to reset in xen libxl). I hope a dev will adopt these and give them proper care :-) .
--- a/drivers/xen/xen-pciback/pci_stub.c	2018-10-22 08:37:37.0 +0200
+++ b/drivers/xen/xen-pciback/pci_stub.c	2018-11-14 12:45:21.926468126 +0100
@@ -244,6 +244,90 @@
 	return found_dev;
 }
 
+struct pcistub_args {
+	struct pci_dev *dev;
+	unsigned int dcount;
+};
+
+static int pcistub_search_dev(struct pci_dev *dev, void *data)
+{
+	struct pcistub_device *psdev;
+	struct pcistub_args *arg = data;
+	bool found_dev = false;
+	unsigned long flags;
+
+	spin_lock_irqsave(&pcistub_devices_lock, flags);
+
+	list_for_each_entry(psdev, &pcistub_devices, dev_list) {
+		if (psdev->dev == dev) {
+			found_dev = true;
+			arg->dcount++;
+			break;
+		}
+	}
+
+	spin_unlock_irqrestore(&pcistub_devices_lock, flags);
+
+	/* Device not owned by pcistub, someone owns it. Abort the walk */
+	if (!found_dev)
+		arg->dev = dev;
+
+	return found_dev ? 0 : 1;
+}
+
+static int pcistub_reset_dev(struct pci_dev *dev)
+{
+	struct xen_pcibk_dev_data *dev_data;
+	bool slot = false, bus = false;
+	struct pcistub_args arg = {};
+
+	if (!dev)
+		return -EINVAL;
+
+	dev_dbg(&dev->dev, "[%s]\n", __func__);
+
+	if (!pci_probe_reset_slot(dev->slot))
+		slot = true;
+	else if ((!pci_probe_reset_bus(dev->bus)) &&
+		 (!pci_is_root_bus(dev->bus)))
+		bus = true;
+
+	if (!bus && !slot)
+		return -EOPNOTSUPP;
+
+	/*
+	 * Make sure all devices on this bus are owned by the
+	 * PCI backend so that we can safely reset the whole bus.
+	 */
+	pci_walk_bus(dev->bus, pcistub_search_dev, &arg);
+
+	/* All devices under the bus should be part of pcistub! */
+	if (arg.dev) {
+		dev_err(&dev->dev, "%s device on bus 0x%x is not owned by pcistub\n",
+			pci_name(arg.dev), dev->bus->number);
+
+		return -EBUSY;
+	}
+
+	dev_dbg(&dev->dev, "pcistub owns %d devices on bus 0x%x\n",
+		arg.dcount, dev->bus->number);
+
+	dev_data = pci_get_drvdata(dev);
+	if (!pci_load_saved_state(dev, dev_data->pci_saved_state))
+		pci_restore_state(dev);
+
+	/* This disables the device. */
+	xen_pcibk_reset_device(dev);
+
+	/* Cleanup up any emulated fields */
+	xen_pcibk_config_reset_dev(dev);
+
+	dev_dbg(&dev->dev, "resetting %s device using %s reset\n",
+		pci_name(dev), slot ? "slot" : "bus");
+
+	return pci_reset_bus(dev);
+}
+
 /*
  * Called when:
  * - XenBus state has been reconfigure (pci unplug). See xen_pcibk_remove_device
@@ -1430,6 +1514,33 @@
 }
 static DRIVER_ATTR_RW(permissive);
 
+static ssize_t reset_store(struct device_driver *drv, const char *buf,
+			   size_t count)
+{
+	struct pcistub_device *psdev;
+	int domain, bus, slot, func;
+	int err;
+
+	err = str_to_slot(buf, &domain, &bus, &slot, &func);
+	if (err)
+		return err;
+
+	psdev = pcistub_device_find(domain, bus, slot, func);
+	if (psdev) {
+		err = pcistub_reset_dev(psdev->dev);
+		pcistub_device_put(psdev);
+	} else {
+		err = -ENODEV;
+	}
+
+	if (!err)
+		err = count;
+
+	return err;
+}
+
+static DRIVER_ATTR_WO(reset);
+
 static void pcistub_exit(void)
 {
 	driver_remove_file(&xen_pcibk_pci_driver.driver, &driver_attr_new_slot);
@@ -1443,6 +1554,8 @@
 			   &driver_attr_irq_handlers);
 	driver_remove_file(&xen_pcibk_pci_driver.driver,
 			   &driver_attr_irq_handler_state);
+	driver_remove_file(&xen_pcibk_pci_driver.driver,
+			   &driver_attr_reset);
 	pci_unregister_driver(&xen_pcibk_pci_driver);
 }
 
@@ -1536,6 +1649,11 @@
 	if (!err)
 		err = driver_create_file(&xen_pcibk_pci_driver.driver,
Re: [Xen-devel] [Xen-users] xen_pt_region_update: Error: create new mem mapping failed! (err: 22)
On 25 Jan 2018 12:29, Anthony PERARD wrote:
> On Thu, Jan 25, 2018 at 10:28:14AM +, George Dunlap wrote:
>> On Wed, Jan 24, 2018 at 9:59 PM, Håkon Alstadheim
>> <ha...@alstadheim.priv.no> wrote:
>>> I'm trying, and failing, to launch a vm with bios = 'ovmf' under xen 4.10.
>>>
>>> The domain launches OK as long as I do not pass any pci devices through,
>>> but with pci devices passed through,
>>
>> Anthony,
>>
>> Does OVMF support PCI pass-through yet?
>
> I don't think OVMF cares if a PCI device is pass-through or not. Does
> the VM works with bios=seabios ?

Yes, it does (i.e. without a bios= line, which amounts to the same thing?). I'd like to get OVMF working with these devices, primarily to see if maybe OVMF might play nicer with Windows and my PCIe USB card than it does at present. I'm doing my tests on another VM that works well without OVMF, just to be sure that I have not messed up something obvious.

Here is a diff between a working (steam.hvm.bios) and a non-working (steam.hvm.ovmf) domU config:

# diff -u steam.hvm.bios steam.hvm.ovmf
--- steam.hvm.bios      2017-12-02 15:34:58.673709262 +0100
+++ steam.hvm.ovmf      2018-01-25 16:32:57.375284572 +0100
@@ -1,7 +1,7 @@
 name = "steam.hvm"
 builder = "hvm"
 nestedhvm = 1
-xen_platform_pci = '1'
+#xen_platform_pci = '1'
 pvh=1
 vcpus = 8
 cpu_weight=5120
@@ -10,13 +10,13 @@
 memory = 7680
 mmio_hole = 3072
 no_migrate = 1
-timer_mode = "one_missed_tick_pending"
+#timer_mode = "one_missed_tick_pending"
 #timer_mode = "no_missed_ticks_pending"
 soundhw="hda"
 #soundhw="ac97"
 device_model_version="qemu-xen"
-boot = 'n'
-
+boot = 'cd'
+bios = 'ovmf'
 disk = [ 'vdev=xvda, format=raw, no-discard, target=/dev/system/steam-efi',
        'vdev=xvdb, format=raw, no-discard, target=/dev/bcache/by-label/steam-b',
        'vdev=xvdc, format=raw, no-discard, target=/dev/system/steam-swap',
@@ -33,10 +33,14 @@
 keymap="en-us"
 spice=0
 sdl = '0'
-vnc = '1'
+vnc = '0'
 serial = 'pty'
-usbctrl=["version=1"]
+#usbctrl=["version=1"]
+device_model_args_hvm = [
+    '-chardev', 'file,id=debugcon,mux=on,path=/tmp/OVMF.logs,',
+    '-device', 'isa-debugcon,iobase=0x402,chardev=debugcon',
+]
---

The .ovmf one boots as long as I do not add a pci=... option to the xl create command line. The arg I add is:

pci=["82:00.0,rdm_policy=relaxed,permissive=1,msitranslate=1","81:00.0","81:00.1"]

or some subset. All failing for OVMF, all working OK with seabios.

> There is maybe somethings wrong with the way OVMF handles PCI devices
> that doesn't work with pass-through.
>
> Håkon, could you add the following in the VM config? With that, we could
> get some logs from OVMF:
> device_model_args_hvm = [
>     '-chardev', 'file,id=debugcon,mux=on,path=/tmp/OVMF.logs,',
>     '-device', 'isa-debugcon,iobase=0x402,chardev=debugcon',
> ]

I just did that, and /tmp/OVMF.logs gets created, but it is empty. I happened to look at xl dmesg, and this is what I get from starting the vm:

(XEN) [2018-01-25 17:05:52] HVM6 save: CPU
(XEN) [2018-01-25 17:05:52] HVM6 save: PIC
(XEN) [2018-01-25 17:05:52] HVM6 save: IOAPIC
(XEN) [2018-01-25 17:05:52] HVM6 save: LAPIC
(XEN) [2018-01-25 17:05:52] HVM6 save: LAPIC_REGS
(XEN) [2018-01-25 17:05:52] HVM6 save: PCI_IRQ
(XEN) [2018-01-25 17:05:52] HVM6 save: ISA_IRQ
(XEN) [2018-01-25 17:05:52] HVM6 save: PCI_LINK
(XEN) [2018-01-25 17:05:52] HVM6 save: PIT
(XEN) [2018-01-25 17:05:52] HVM6 save: RTC
(XEN) [2018-01-25 17:05:52] HVM6 save: HPET
(XEN) [2018-01-25 17:05:52] HVM6 save: PMTIMER
(XEN) [2018-01-25 17:05:52] HVM6 save: MTRR
(XEN) [2018-01-25 17:05:52] HVM6 save: VIRIDIAN_DOMAIN
(XEN) [2018-01-25 17:05:52] HVM6 save: CPU_XSAVE
(XEN) [2018-01-25 17:05:52] HVM6 save: VIRIDIAN_VCPU
(XEN) [2018-01-25 17:05:52] HVM6 save: VMCE_VCPU
(XEN) [2018-01-25 17:05:52] HVM6 save: TSC_ADJUST
(XEN) [2018-01-25 17:05:52] HVM6 save: CPU_MSR
(XEN) [2018-01-25 17:05:52] HVM6 restore: CPU 0
(XEN) [2018-01-25 17:05:55] d6: bind: m_gsi=56 g_gsi=40 dev=00.00.6 intx=0
(XEN) [2018-01-25 17:05:55] [VT-D]d0:PCIe: unmap :81:00.0
(XEN) [2018-01-25 17:05:55] [VT-D]d6:PCIe: map :81:00.0
(XEN) [2018-01-25 17:05:56] d6: bind: m_gsi=60 g_gsi=45 dev=00.00.7 intx=1
(XEN) [2018-01-25 17:05:56] [VT-D]d0:PCIe: unmap :81:00.1
(XEN) [2018-01-25 17:05:56] [VT-D]d6:PCIe: map :81:00.1
(XEN) [2018-01-25 17:05:57] d6: bind: m_gsi=64 g_gsi=17 dev=00.01.0 intx=0
(XEN) [2018-01-25 17:05:57] [VT-D] It's risky to assign :82:00.0 with shared RMRR at 7db85000 for Dom6.
(XEN) [2018-01-25 17:05:57] [VT-D]d0:PCIe: unmap :82:00.0
(XEN) [2018-01-25 17:05:57] [VT-D]d6:PCIe: map :82:00.0
(d6) [2018-01-25 17:05:58] HVM Loader
(d6) [2018-01-25 17:05:58] Detected Xen v4.10.0
(d6) [2018-01-25 17:05:58] X
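As an aside, before running xl create with a pci=... argument it can help to confirm that each BDF is actually bound to pciback in dom0. A minimal sketch (the BDFs are the ones from this thread; the SYSFS_PCI override is mine, only so the helper can be exercised outside a real dom0):

```shell
#!/bin/sh
# Sketch: report which driver each passthrough candidate is bound to.
# For passthrough, each device should show "pciback" (or "xen-pciback",
# depending on kernel version).
SYSFS_PCI=${SYSFS_PCI:-/sys/bus/pci/devices}

show_driver() {
    bdf=$1
    link="$SYSFS_PCI/$bdf/driver"
    if [ -e "$link" ]; then
        # The 'driver' node is a symlink into /sys/bus/pci/drivers/<name>.
        echo "$bdf -> $(basename "$(readlink -f "$link")")"
    else
        echo "$bdf -> no driver bound"
    fi
}

# BDFs taken from the pci= line above.
for bdf in 0000:82:00.0 0000:81:00.0 0000:81:00.1; do
    show_driver "$bdf"
done
```

If a device shows some other driver, rebind it (e.g. via xl pci-assignable-add) before starting the domain.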
Re: [Xen-devel] [PATCH V1 1/1] Xen/libxl: Perform PCI reset using 'reset' SysFS attribute
On 18 Dec 2017 19:33, Govinda Tatti wrote:
>>>> Are you saying do_flr doesn't exist at all in any version of Linux,
>>>> and as such the line you're removing is currently pointless?
>>> Yes, that's correct. In other-words, it will not break any existing code
>>> or functionality.
>> Except for people, like me, running unofficial patches to linux. It
>> should be OK to assume they are watching this thread.
> Do we need to account for unofficial patches or usage of do_flr? If yes,
> we need to maintain backward compatibility for the do_flr attribute.

When the final, official change to the linux backend driver goes in, local patches will no longer apply. The cause will be obvious to anybody who patches the linux sources on their own. Running newer kernels as dom0 on old Xen would just mean those people need a different (simpler) patch.

As a convenience, I'd like a heads-up before the interface on the xen/libxl side changes, so I can make sure to run a compatible dom0 kernel. An entry in the changelog should suffice.

A wishlist item might be to make the libxl error message as fool-proof as possible. "Failed to access pciback path %s" could perhaps be improved upon; perhaps add "make sure pciback version is compatible with libxl version". Adding backwards-compatibility code just for this would be unnecessary cruft as far as I can see.

I'll go back to lurking now.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
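For anyone following along: the interface under discussion is the generic per-device 'reset' attribute the kernel already exposes under /sys/bus/pci/devices/<BDF>/, rather than pciback's do_flr. A rough sketch of what the libxl side amounts to from userspace (the helper name and the SYSFS_PCI override are mine, for illustration; this is not the actual libxl code):

```shell
#!/bin/sh
# Sketch: reset one PCI function via the generic 'reset' sysfs attribute
# instead of pciback's do_flr. The kernel picks the best available reset
# method for the device (FLR, secondary-bus reset, D3hot, ...).
SYSFS_PCI=${SYSFS_PCI:-/sys/bus/pci/devices}

pci_reset() {
    node="$SYSFS_PCI/$1/reset"
    if [ -w "$node" ]; then
        echo 1 > "$node" && echo "$1: reset issued"
    else
        # Roughly the failure mode discussed in this thread: no reset
        # method exposed for the device, or an incompatible kernel.
        echo "$1: no reset attribute" >&2
        return 1
    fi
}
```

Usage would be along the lines of `pci_reset 0000:81:00.0` before handing the function to a new domain.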
Re: [Xen-devel] [PATCH V1 1/1] Xen/libxl: Perform PCI reset using 'reset' SysFS attribute
On 14 Dec 2017 21:22, Govinda Tatti wrote:
>>>> In which case, xl needs to be backwards-compatible with kernels that
>>>> don't have your new feature: it will have to check for %s/reset, and
>>>> if it's not there, then try %s/do_flr.
>>> I think this fix was planned more than a year back and even we pushed
>>> libxl fix ("do_flr" SysFS attribute) but linux kernel fix was not
>>> integrated for some reason. Now, we are revisiting both linux kernel
>>> and libxl changes. In other-words, "do_flr" change is not being used
>>> today since we don't have required code changes in the linux kernel.
>> Are you saying do_flr doesn't exist at all in any version of Linux,
>> and as such the line you're removing is currently pointless?
> Yes, that's correct. In other-words, it will not break any existing code
> or functionality.

Except for people, like me, running unofficial patches to linux. It should be OK to assume they are watching this thread.

> Cheers
> GOVINDA