On 10/31/15 18:50, Laszlo Ersek wrote:

> I'm very sorry, but I don't think I can spend time on this, unless
> someone gives me ssh and/or console access to a host that readily
> reproduces the bug, with the latest kvm/master, qemu, and edk2
> builds.

We just got lucky: the problem reproduces on my oldie workstation (HP Z400, 
Intel(R) Xeon(R) CPU W3550 @ 3.07GHz, family/model/stepping/microcode = 
6/26/5/0x19).

OVMF works fine with the most recent Fedora 21 host kernel, 
"kernel-4.1.10-100.fc21.x86_64", and it breaks on the crispy new upstream Linux 
4.3 release.

The OVMF log contains a number of failed ASSERT()s, printed by the APs, in a 
way that is very characteristic of multi-threading bugs:

--*--

Detect CPU count: 2
AASSSSEERRTT  
//hhoomem/el/alcaocso/ss/rscr/cu/pusptsrteraema/me/dekd2k-2g-igti-ts-vsnv/nM/dMedPekPgk/gL/iLbirbarrayr/yB/aBsaesSeySnycnhcrhornoiznatiizoantLiiobn/LSiybn/cShyrnocnhirzoantiizoantGicocn.Gcc(c1.9c8()1:9
 8L)o:c kLVoaclkuVea l=u=e  (=(=U I(N(TUNI)N T2N))  |2|)  L|o|c kLVoaclkuVea 
l=u=e  (=(=U I(N(TUNI)N T1N))^M 
1)
ASSERT 
/home/lacos/src/upstream/edk2-git-svn/MdePkg/Library/BaseSynchronizationLib/SynchronizationGcc.c(198):
 LockValue == ((UINTN) 2) || LockValue == ((UINTN) 1)

--*--

(Note that the above is for an 8 VCPU guest. Cf. "Detect CPU count: 2".)
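
To illustrate why the doubled characters point at concurrency: if two writers 
emit the same message one character at a time and happen to take strict turns, 
the result looks exactly like the first garbled line. This is only a toy model 
under that assumption, not how the firmware's debug output is implemented; the 
function name interleave() is made up for the illustration.

```python
from itertools import zip_longest


def interleave(a: str, b: str) -> str:
    """Merge two messages character by character, as if two
    unsynchronized writers took strict turns on a shared output."""
    out = []
    for ca, cb in zip_longest(a, b, fillvalue=""):
        out.append(ca)
        out.append(cb)
    return "".join(out)


# Two writers printing the same word at the same time.
print(interleave("ASSERT", "ASSERT"))
```

In the real log the alternation is not perfectly regular (see the path on the 
following lines), which is what one would expect when the writers are not 
synchronized at all.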


Maybe important, maybe not: my workstation's IOMMU does *not* have snoop 
control:

[    3.043297] DMAR: dmar0: reg_base_addr fed90000 ver 1:0 cap c90780106f0462 
ecap f02076

Where 0xf02076 & 0x80 == 0.

Whereas my laptop *does* have Snoop Control:

[    0.030486] DMAR: dmar0: reg_base_addr fed90000 ver 1:0 cap c0000020660462 
ecap f0101a
[    0.030491] DMAR: dmar1: reg_base_addr fed91000 ver 1:0 cap d2008020660462 
ecap f010da

The first IOMMU doesn't have Snoop Control, but its scope only extends to the 
integrated Intel Graphics Device. The second IOMMU, which covers the SD card 
reader I used for testing (and all other PCI devices), *does* have Snoop 
Control. 0xf010da & 0x80 == 0x80.
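
The checks above boil down to testing the 0x80 bit (bit 7) of the value printed 
after "ecap" in the DMAR lines. A minimal sketch that repeats them for the three 
values quoted in this mail; the helper name has_snoop_control() and the labels 
are made up for illustration.

```python
ECAP_SC_BIT = 7  # Snoop Control bit, i.e. mask 0x80


def has_snoop_control(ecap: int) -> bool:
    """Return True if the Snoop Control bit is set in an ecap value."""
    return bool((ecap >> ECAP_SC_BIT) & 1)


# ecap values copied from the dmesg lines quoted above
ecaps = [
    ("workstation dmar0", 0xF02076),
    ("laptop dmar0", 0xF0101A),
    ("laptop dmar1", 0xF010DA),
]

for name, ecap in ecaps:
    print(f"{name}: ecap {ecap:#x}, SC = {has_snoop_control(ecap)}")
```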

However, SC support should not matter, due to kernel commit 
fb279950ba02e3210a16b11ecfa8871f3ee0ca49, so I think we might be looking at an 
independent issue. (Possibly in CpuDxe, only exposed by the new host kernel.)

I'll try to track this down.

Thanks
Laszlo
_______________________________________________
edk2-devel mailing list
[email protected]
https://lists.01.org/mailman/listinfo/edk2-devel
