On 10/31/15 18:50, Laszlo Ersek wrote:
> I'm very sorry, but I don't think I can spend time on this, unless
> someone gives me ssh and/or console access to a host that readily
> reproduces the bug, with the latest kvm/master, qemu, and edk2
> builds.

We just got lucky: the problem reproduces on my oldie workstation (HP Z400, (R) Xeon(R) CPU W3550 @ 3.07GHz, family/model/stepping/microcode = 6/26/5/0x19).

OVMF works fine with the most recent Fedora 21 host kernel, "kernel-4.1.10-100.fc21.x86_64", and it breaks on the crispy new upstream Linux 4.3 release.

The OVMF log contains a number of failed ASSERT()s, printed by the APs, in a way that is very characteristic of multi-threading bugs:

--*--
Detect CPU count: 2
AASSSSEERRTT //hhoomem/el/alcaocso/ss/rscr/cu/pusptsrteraema/me/dekd2k-2g-igti-ts-vsnv/nM/dMedPekPgk/gL/iLbirbarrayr/yB/aBsaesSeySnycnhcrhornoiznatiizoantLiiobn/LSiybn/cShyrnocnhirzoantiizoantGicocn.Gcc(c1.9c8()1:9 8L)o:c kLVoaclkuVea l=u=e (=(=U I(N(TUNI)N T2N)) |2|) L|o|c kLVoaclkuVea l=u=e (=(=U I(N(TUNI)N T1N))^M 1)
ASSERT /home/lacos/src/upstream/edk2-git-svn/MdePkg/Library/BaseSynchronizationLib/SynchronizationGcc.c(198): LockValue == ((UINTN) 2) || LockValue == ((UINTN) 1)
--*--

(Note that the above is for an 8 VCPU guest. Cf. "Detect CPU count: 2".)

Maybe important, maybe not: my workstation's IOMMU does *not* have Snoop Control:

[ 3.043297] DMAR: dmar0: reg_base_addr fed90000 ver 1:0 cap c90780106f0462 ecap f02076

Where 0xf02076 & 0x80 == 0.

Whereas my laptop *does* have Snoop Control:

[ 0.030486] DMAR: dmar0: reg_base_addr fed90000 ver 1:0 cap c0000020660462 ecap f0101a
[ 0.030491] DMAR: dmar1: reg_base_addr fed91000 ver 1:0 cap d2008020660462 ecap f010da

The first IOMMU doesn't have Snoop Control, but its scope only extends to the integrated Intel Graphics Device. The second IOMMU, which covers the SD card reader I used for testing (and all other PCI devices), *does* have Snoop Control: 0xf010da & 0x80 == 0x80.

However, SC support should not matter, due to kernel commit fb279950ba02e3210a16b11ecfa8871f3ee0ca49, so I think we might be looking at an independent issue. (Possibly in CpuDxe, only exposed by the new host kernel.)

I'll try to track this down.
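
For anyone who wants to repeat the Snoop Control check on their own host, here is a minimal Python sketch. It assumes the dmesg line format shown above and that SC is bit 7 (mask 0x80) of the ecap value, as used in the arithmetic above. The names snoop_control and SC_MASK are made up for illustration, not taken from any existing tool.

```python
import re

# Assumption: Snoop Control is bit 7 of the VT-d Extended Capability value,
# matching the "ecap & 0x80" arithmetic used in this message.
SC_MASK = 0x80

def snoop_control(line):
    """Return (unit, has_sc) for a 'DMAR: dmarN: ... ecap <hex>' line, else None."""
    m = re.search(r"DMAR: (dmar\d+): .*\becap ([0-9a-fA-F]+)", line)
    if m is None:
        return None  # not a DMAR capability line
    unit = m.group(1)
    ecap = int(m.group(2), 16)
    return unit, bool(ecap & SC_MASK)

if __name__ == "__main__":
    lines = [
        "[ 3.043297] DMAR: dmar0: reg_base_addr fed90000 ver 1:0 cap c90780106f0462 ecap f02076",
        "[ 0.030486] DMAR: dmar0: reg_base_addr fed90000 ver 1:0 cap c0000020660462 ecap f0101a",
        "[ 0.030491] DMAR: dmar1: reg_base_addr fed91000 ver 1:0 cap d2008020660462 ecap f010da",
    ]
    for line in lines:
        print(snoop_control(line))
```

Feeding it the output of "dmesg | grep DMAR" line by line should give one (unit, has_sc) pair per IOMMU; lines that don't match the assumed format yield None.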

Thanks
Laszlo

