Bugs item #2019053, was opened at 2008-07-15 18:10
Message generated for change (Comment added) made by alex_williamson
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2019053&group_id=180599

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: amd
Group: None
>Status: Closed
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: Alex Williamson (alex_williamson)
Assigned to: Nobody/Anonymous (nobody)
Summary: tbench fails on guest when AMD NPT enabled

Initial Comment:
Running on a dual-socket system with AMD 2356 quad-core processors (8 total
cores), 32GB RAM, Ubuntu Hardy 2.6.24-19-generic (64bit) with kvm-71
userspace and kernel modules.  With no module options, dmesg confirms:
kvm: Nested Paging enabled

Start guest with:

/usr/local/kvm/bin/qemu-system-x86_64 -hda /dev/VM/Ubuntu64 -m 1024 \
    -net nic,model=e1000,mac=de:ad:be:ef:00:01 \
    -net tap,script=/root/bin/br0-ifup -smp 8 -vnc :0

Guest VM is also Ubuntu Hardy 64bit.  On the guest run
'tbench 16 <tbench server>'.  System running tbench_srv is a different
system in my case.

The tbench client will fail randomly, often quietly with "Child failed with
status 1", but sometimes more harshly with a glibc double free error.

If I unload the modules and reload w/o npt:

modprobe -r kvm-amd
modprobe -r kvm
modprobe kvm-amd npt=0

dmesg confirms: kvm: Nested Paging disabled
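
For anyone scripting the check above, here is a minimal sketch (not part of
the original report) that reads kernel log text on stdin and prints the most
recent nested paging state.  The function name npt_from_log is made up for
illustration, and the match relies only on the two dmesg messages quoted in
this report; other kvm versions may word them differently.

```shell
# Hypothetical helper: report the latest "Nested Paging" state seen in a log.
npt_from_log() {
    awk '
        # Remember the state from every matching line; the last one wins.
        /Nested Paging (enabled|disabled)/ {
            state = ($0 ~ /Nested Paging enabled/) ? "enabled" : "disabled"
        }
        END {
            if (state == "") exit 1   # no matching message in the input
            print state
        }'
}
```

Usage would be something like "dmesg | npt_from_log" after reloading the
modules; reading dmesg may require root on some systems.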

The tbench test now runs over and over successfully.  The test also runs fine 
on an Intel E5450 (no EPT).
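
Because the failure is intermittent, a single passing run says little.  A
rough sketch of a repeat-and-count loop follows.  It is not from the original
report: count_failures is a made-up name, and it assumes the command under
test exits non-zero when it fails, which has not been verified for tbench.

```shell
# Hypothetical helper: run a command N times and count non-zero exits.
count_failures() {
    # $1: number of runs; remaining arguments: the command to run each time
    runs=$1
    shift
    failed=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Discard the command's output; only its exit status matters here.
        if ! "$@" >/dev/null 2>&1; then
            failed=$((failed + 1))
        fi
        i=$((i + 1))
    done
    echo "$failed/$runs runs failed"
}
```

On the guest this could be invoked as
'count_failures 30 tbench 16 <tbench server>', once with NPT enabled on the
host and once with npt=0, to compare the failure counts.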

----------------------------------------------------------------------

>Comment By: Alex Williamson (alex_williamson)
Date: 2008-09-16 12:16

Message:
Fixed in KVM-75 by kvm.git commit 6eaa802ce5187e8508b07293633af00d8ccc911b

----------------------------------------------------------------------

Comment By: Alex Williamson (alex_williamson)
Date: 2008-09-02 10:36

Message:
Logged In: YES 
user_id=333914
Originator: YES

vendor_id       : AuthenticAMD
cpu family      : 16
model           : 2
model name      : Quad-Core AMD Opteron(tm) Processor 2356
stepping        : 3
cpu MHz         : 2300.080
cache size      : 512 KB
physical id     : 1
siblings        : 4
core id         : 3
cpu cores       : 4
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb
rdtscp lm 3dnowext 3dnow constant_tsc rep_good pni cx16 popcnt lahf_lm
cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs
bogomips        : 4601.00
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate


----------------------------------------------------------------------

Comment By: Joerg Roedel (jroedel)
Date: 2008-09-02 07:27

Message:
Logged In: YES 
user_id=2019182
Originator: NO

Hi Alex,

can you post the model and stepping from /proc/cpuinfo of the host on which
you reproduced the issue? This would be very helpful.

Thanks,

Joerg
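
A small sketch of one way to pull just the requested fields out of
/proc/cpuinfo.  This is an illustration rather than anything posted in the
thread: the function name cpu_model_stepping is made up, and it assumes the
usual "key : value" layout of /proc/cpuinfo on x86.

```shell
# Hypothetical helper: print cpu family, model and stepping once each.
cpu_model_stepping() {
    # $1: cpuinfo-style file (defaults to /proc/cpuinfo)
    awk -F':[ \t]*' '
        # "model name" is skipped: only padding may sit between key and colon.
        /^(cpu family|model|stepping)[ \t]*:/ {
            key = $1
            sub(/[ \t]+$/, "", key)   # trim padding before the colon
            print key "=" $2
        }' "${1:-/proc/cpuinfo}" | sort -u   # collapse identical per-CPU entries
}
```

Run without arguments on the host, it prints one line per field; hosts with
mixed CPU steppings would show more than one value for the same key.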

----------------------------------------------------------------------

Comment By: Alexander Graf (awwy)
Date: 2008-07-25 03:30

Message:
Logged In: YES 
user_id=376328
Originator: NO

I just tried my test on openSUSE 10.3 again and it fails now. This bug
really only shows itself rarely: out of 30 kernel builds, usually about 5
fail. So scratch the 2.6.22 idea. We're back at square one.

----------------------------------------------------------------------

Comment By: Alex Williamson (alex_williamson)
Date: 2008-07-24 11:10

Message:
Logged In: YES 
user_id=333914
Originator: YES

I tried the Ubuntu Gutsy 2.6.22-15-generic kernel on the host, but I still
see the issue.  I'll install openSUSE 10.3 and see what happens.

----------------------------------------------------------------------

Comment By: Alexander Graf (awwy)
Date: 2008-07-23 23:45

Message:
Logged In: YES 
user_id=376328
Originator: NO

I'm seeing random segfaults when using NPT on a host kernel >= 2.6.23. So
far I have not been able to reproduce my test case breakages with an
openSUSE 10.3 kernel though, so could you please test that and verify if
tbench works for you on openSUSE 10.3? It does break with 11.0.

I have the feeling that we're seeing the same problem here.

----------------------------------------------------------------------

Comment By: Alex Williamson (alex_williamson)
Date: 2008-07-16 09:18

Message:
Logged In: YES 
user_id=333914
Originator: YES

No, I added mlockall(MCL_CURRENT | MCL_FUTURE) to qemu/vl.c:main() and it
makes no difference.  I'm only starting a 1G guest on an otherwise idle 32G
host, so host memory pressure is pretty light.

----------------------------------------------------------------------

Comment By: Avi Kivity (avik)
Date: 2008-07-16 08:19

Message:
Logged In: YES 
user_id=539971
Originator: NO

Strange.  If you add an mlockall() to qemu startup, does the test pass?

----------------------------------------------------------------------
