Re: Strange problem with 8-stable, VMWare vSphere 4 AMD CPUs (unexpected shutdowns)

2010-02-23 Thread Alan Cox

Alan Cox wrote:
> The next public revision guide from AMD will contain an errata (383)
> that documents the bug.  However, it doesn't really tell us anything
> that we didn't already know.


Could someone on this list please test the attached patch in an amd64 
FreeBSD 8 guest running on vSphere 4 with an AMD Family 10h processor 
underneath?  Before testing the patch, remove the manual setting of 
vm.pmap.pg_ps_enabled=0 from /boot/loader.conf.  After booting the 
virtual machine, please run sysctl vm.pmap.pg_ps_enabled to verify 
that superpage promotion has been automatically disabled.


Thanks,
Alan

Index: amd64/amd64/pmap.c
===================================================================
--- amd64/amd64/pmap.c	(revision 204175)
+++ amd64/amd64/pmap.c	(working copy)
@@ -686,6 +686,15 @@ pmap_init(void)
 	pv_entry_high_water = 9 * (pv_entry_max / 10);
 
 	/*
+	 * Disable large page mappings by default if the kernel is running in
+	 * a virtual machine on an AMD Family 10h processor.  This is a work-
+	 * around for Erratum 383.
+	 */
+	if (vm_guest == VM_GUEST_VM && cpu_vendor_id == CPU_VENDOR_AMD &&
+	    CPUID_TO_FAMILY(cpu_id) == 0x10)
+		pg_ps_enabled = 0;
+
+	/*
 	 * Are large page mappings enabled?
 	 */
 	TUNABLE_INT_FETCH("vm.pmap.pg_ps_enabled", &pg_ps_enabled);
Index: kern/subr_param.c
===================================================================
--- kern/subr_param.c	(revision 204175)
+++ kern/subr_param.c	(working copy)
@@ -74,10 +74,6 @@ __FBSDID("$FreeBSD$");
 #define	MAXFILES (maxproc * 2)
 #endif
 
-/* Values of enum VM_GUEST members are used as indices in 
- * vm_guest_sysctl_names */
-enum VM_GUEST { VM_GUEST_NO = 0, VM_GUEST_VM, VM_GUEST_XEN };
-
 static int sysctl_kern_vm_guest(SYSCTL_HANDLER_ARGS);
 
 int	hz;
Index: sys/systm.h
===================================================================
--- sys/systm.h	(revision 204175)
+++ sys/systm.h	(working copy)
@@ -45,6 +45,10 @@
 #include <sys/queue.h>
 #include <sys/stdint.h>		/* for people using printf mainly */
 
+/* Values of enum VM_GUEST members are used as indices in 
+ * vm_guest_sysctl_names */
+enum VM_GUEST { VM_GUEST_NO = 0, VM_GUEST_VM, VM_GUEST_XEN };
+
 extern int cold;		/* nonzero if we are doing a cold boot */
 extern int rebooting;		/* boot() has been called. */
 extern const char *panicstr;	/* panic message */
@@ -63,6 +67,7 @@ extern int bootverbose;	/* nonzero to print verbo
 
 extern int maxusers;		/* system tune hint */
 extern int ngroups_max;		/* max # of supplemental groups */
+extern int vm_guest;		/* Running as virtual machine guest? */
 
 #ifdef INVARIANTS		/* The option is always available */
 #define	KASSERT(exp,msg) do {						\
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
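
The condition the patch adds to pmap_init() can be tried out in user space. The following is only a sketch under stated assumptions: cpuid_display_family(), superpages_default_off() and enum vendor are illustrative names, not kernel identifiers, and the family decode follows the usual CPUID function 1 layout (base family in EAX bits 11:8, extended family in bits 27:20, added only when the base family is 0xF) rather than the kernel's CPUID_TO_FAMILY macro itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's cpu_vendor_id values. */
enum vendor { VENDOR_OTHER, VENDOR_INTEL, VENDOR_AMD };

/*
 * Decode the display family from CPUID function 1 EAX.
 * Bits 11:8 hold the base family and bits 27:20 the extended family;
 * the extended family is added only when the base family is 0xF.
 */
unsigned
cpuid_display_family(uint32_t eax)
{
	unsigned base = (eax >> 8) & 0xf;
	unsigned ext = (eax >> 20) & 0xff;

	return (base == 0xf ? base + ext : base);
}

/*
 * Same shape as the patch's test: superpages default to off only for
 * a virtual machine guest on an AMD Family 10h processor.
 */
bool
superpages_default_off(bool is_vm_guest, enum vendor v, uint32_t eax)
{
	return (is_vm_guest && v == VENDOR_AMD &&
	    cpuid_display_family(eax) == 0x10);
}
```

Because the patch sets the default before TUNABLE_INT_FETCH runs, an explicit vm.pmap.pg_ps_enabled setting in loader.conf still overrides it, which is why the test instructions ask for the manual setting to be removed first.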

Re: Strange problem with 8-stable, VMWare vSphere 4 AMD CPUs (unexpected shutdowns)

2010-02-11 Thread John Baldwin
On Wednesday 10 February 2010 1:38:37 pm Ivan Voras wrote:
> On 10 February 2010 19:35, Andriy Gapon <a...@icyb.net.ua> wrote:
>> on 10/02/2010 20:26 Ivan Voras said the following:
>>> On 10 February 2010 19:10, Andriy Gapon <a...@icyb.net.ua> wrote:
>>>> on 10/02/2010 20:03 Ivan Voras said the following:
>>>>> When you say "very unique" is it in the "it is not Linux or Windows"
>>>>> sense or do we do something nonstandard?
>>>> The former - neither Linux, Windows or OpenSolaris seem to have what we
>>>> have.
>>>
>>> I can't find the exact documents but I think both Windows
>>> MegaUltimateServer (the highest priced version of Windows Server,
>>> whatever it's called today) and Linux (though disabled and marked
>>> Experimental) have it, or have some kind of support for large pages
>>> that might not be as pervasive (maybe they use it for kernel only?). I
>>> have no idea about (Open)Solaris.
>>
>> I haven't said that those OSes do not use large pages.
>> I've said what I've said :-)
>
> Ok :)
>
> Is there a difference between large pages as they are commonly known
> and superpages as in FreeBSD ? In other words - are you referencing
> some specific mechanism, like automatic promotion / demotion of the
> large pages or maybe something else?

Yes, the automatic promotion / demotion.  That is a far-less common feature.  
FreeBSD/i386 has used large pages for the kernel text as far back as at least 
4.x, but that is not the same as superpages.  Linux does not have automatic 
promotion / demotion to my knowledge.  I do not know about other OS's.

-- 
John Baldwin
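
To make "automatic promotion" a little more concrete, here is a much-simplified sketch of the kind of test a kernel has to pass before it can replace 512 small mappings with one 2 MB mapping on amd64. It is not FreeBSD's pmap code: can_promote() and the PTE_* constants are names chosen for illustration, and the real implementation handles accessed/dirty bits, reservations and locking that are ignored here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_4K	4096ULL
#define NPTE_PER_2M	512			/* 2 MB / 4 KB on amd64 */
#define PTE_VALID	0x1ULL
#define PTE_FRAME_MASK	0x000ffffffffff000ULL	/* physical frame bits */

/*
 * Return true if the 512 page table entries could be replaced by one
 * 2 MB mapping: all valid, physically contiguous, starting on a 2 MB
 * boundary, and carrying identical attribute bits.
 */
bool
can_promote(const uint64_t pte[NPTE_PER_2M])
{
	uint64_t first_pa = pte[0] & PTE_FRAME_MASK;
	uint64_t attrs = pte[0] & ~PTE_FRAME_MASK;

	if ((pte[0] & PTE_VALID) == 0)
		return (false);
	/* The first frame must sit on a 2 MB boundary. */
	if (first_pa % (NPTE_PER_2M * PAGE_SIZE_4K) != 0)
		return (false);
	for (size_t i = 1; i < NPTE_PER_2M; i++) {
		/* Each frame must follow the previous one physically. */
		if ((pte[i] & PTE_FRAME_MASK) != first_pa + i * PAGE_SIZE_4K)
			return (false);
		/* Attribute bits (including the valid bit) must match. */
		if ((pte[i] & ~PTE_FRAME_MASK) != attrs)
			return (false);
	}
	return (true);
}
```

Demotion is the reverse step: when one 4 KB page inside the 2 MB region needs different attributes, the large mapping is split back into 512 small ones.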


Re: Strange problem with 8-stable, VMWare vSphere 4 AMD CPUs (unexpected shutdowns)

2010-02-11 Thread Andriy Gapon
on 11/02/2010 06:37 Alan Cox said the following:
> Here is what I know.  Several of us, myself included, have been able to
> reproduce either lockups or machine check exceptions when BOTH the machine
> check driver and superpages are enabled on AMD family 10h processors.  There
> have been no reports of this problem on either Intel or earlier AMD
> processors.  Moreover, there is no evidence of instability in AMD family 10h
> processors until the machine check driver is enabled.  By default, FreeBSD
> 8.0 enables superpages but disables the machine check driver.  So, running
> natively, i.e., without virtualization, you shouldn't experience a problem,
> unless you explicitly enable the machine check driver.  However, running on
> top of a hypervisor, like vSphere 4, you might face a problem because the
> hypervisor might enable machine check exceptions, regardless of what the
> FreeBSD guest does.  I really don't know whether vSphere 4 enables machine
> check exception or not.  If it does, then either you disable the use of
> superpages in the FreeBSD guest, or you find a way to disable the machine
> check driver in the hypervisor.

I'd like to mention another possibility, just in case: machine check might be
enabled/done by firmware (e.g. BIOS).  This typically could be the case for
high-end-ish/server systems.

-- 
Andriy Gapon


Re: Strange problem with 8-stable, VMWare vSphere 4 AMD CPUs (unexpected shutdowns)

2010-02-11 Thread Alan Cox
On Thu, Feb 11, 2010 at 7:13 AM, John Baldwin <j...@freebsd.org> wrote:

> On Wednesday 10 February 2010 1:38:37 pm Ivan Voras wrote:
>> On 10 February 2010 19:35, Andriy Gapon <a...@icyb.net.ua> wrote:
>>> on 10/02/2010 20:26 Ivan Voras said the following:
>>>> On 10 February 2010 19:10, Andriy Gapon <a...@icyb.net.ua> wrote:
>>>>> on 10/02/2010 20:03 Ivan Voras said the following:
>>>>>> When you say "very unique" is it in the "it is not Linux or Windows"
>>>>>> sense or do we do something nonstandard?
>>>>> The former - neither Linux, Windows or OpenSolaris seem to have what
>>>>> we have.
>>>>
>>>> I can't find the exact documents but I think both Windows
>>>> MegaUltimateServer (the highest priced version of Windows Server,
>>>> whatever it's called today) and Linux (though disabled and marked
>>>> Experimental) have it, or have some kind of support for large pages
>>>> that might not be as pervasive (maybe they use it for kernel only?). I
>>>> have no idea about (Open)Solaris.
>>>
>>> I haven't said that those OSes do not use large pages.
>>> I've said what I've said :-)
>>
>> Ok :)
>>
>> Is there a difference between large pages as they are commonly known
>> and superpages as in FreeBSD ? In other words - are you referencing
>> some specific mechanism, like automatic promotion / demotion of the
>> large pages or maybe something else?
>
> Yes, the automatic promotion / demotion.  That is a far-less common
> feature.  FreeBSD/i386 has used large pages for the kernel text as far
> back as at least 4.x, but that is not the same as superpages.  Linux does
> not have automatic promotion / demotion to my knowledge.  I do not know
> about other OS's.


A comparison of current large page support among Unix-like and Windows
operating systems has two dimensions: (1) whether or not the creation of
large pages for applications is automatic and (2) whether or not the machine
administrator has to statically partition the machine's physical memory
between large and small pages at boot time.

For FreeBSD, large pages are created automatically and there is not a static
partitioning of physical memory.  In contrast, Linux does not create large
pages automatically and does require a static partitioning.  Specifically,
Linux requires the administrator to explicitly and statically partition the
machine's physical memory at boot time into two parts, one that is dedicated
to large pages and another for general use.  To utilize large pages an
application has to explicitly request memory from the dedicated large pages
pool.  However, to make this somewhat easier, but not automatic, there do
exist re-implementations of malloc that you can explicitly link with your
application.

In Solaris, the application has to explicitly request the use of large
pages, either via explicit kernel calls in the program or from the command
line with support from a library.  However, there is not a static
partitioning of physical memory.  So, for example, when you run the Sun jdk
on Solaris, it explicitly requests large pages for much of its data, and
this works without administrator having to configure the machine for large
page usage.

To the best of my knowledge, Windows is just like Solaris.

Alan
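
The two dimensions of that comparison fit in a small lookup table. This is only a restatement of the message above as data, reflecting the situation as described in February 2010; the struct, field and function names are made up for illustration, and the Windows row carries the same "to the best of my knowledge" caveat as the text.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct large_page_policy {
	const char *os;
	bool automatic;		/* kernel creates large pages on its own */
	bool static_partition;	/* memory must be split at boot time */
};

/* Values taken from the message above; not independently verified. */
static const struct large_page_policy policies[] = {
	{ "FreeBSD", true,  false },
	{ "Linux",   false, true  },
	{ "Solaris", false, false },
	{ "Windows", false, false },	/* "just like Solaris", per the author */
};

/* Return the entry for the named OS, or NULL if it is not listed. */
const struct large_page_policy *
lookup_policy(const char *os)
{
	for (size_t i = 0; i < sizeof(policies) / sizeof(policies[0]); i++)
		if (strcmp(policies[i].os, os) == 0)
			return (&policies[i]);
	return (NULL);
}
```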


Re: Strange problem with 8-stable, VMWare vSphere 4 AMD CPUs (unexpected shutdowns)

2010-02-11 Thread Alan Cox
The next public revision guide from AMD will contain an errata (383) 
that documents the bug.  However, it doesn't really tell us anything 
that we didn't already know.


Alan



Re: Strange problem with 8-stable, VMWare vSphere 4 AMD CPUs (unexpected shutdowns)

2010-02-11 Thread Andriy Gapon
on 11/02/2010 20:38 Alan Cox said the following:
> The next public revision guide from AMD will contain an errata (383)
> that documents the bug.  However, it doesn't really tell us anything
> that we didn't already know.

Pity.  I sort of hoped for more, like a workaround, some magic MSR.

-- 
Andriy Gapon


Re: Strange problem with 8-stable, VMWare vSphere 4 AMD CPUs (unexpected shutdowns)

2010-02-10 Thread Sean C. Farley

On Wed, 10 Feb 2010, Ivan Voras wrote:

> It looks like I've stumbled upon a bug in vSphere 4 (recent update)
> with FreeBSD/amd64 8.0/8-stable (but not 7.x) guests on Opteron(s). In
> this combination, everything works fine until a moderate load is
> started - a buildworld is enough. About five minutes after the load
> starts, the vSphere client starts getting timeouts while talking with
> the host and soon after the guest VM is forcibly shut down without any
> trace of a reason in various logs.  The same VM runs fine on hosts
> with Xeon CPUs. The shutdown happens regardless if there is a vSphere
> client connected.
>
> This is very repeatable, on Sun Fire X4140 hosts.
>
> With 7.x/7.stable guests everything works fine.
>
> I'm posting this for future reference and to see if anyone has
> encountered something like that, or has an idea why this happens.

Is it related to this thread:
http://lists.freebsd.org/pipermail/freebsd-stable/2010-February/054755.html

I have been fighting other issues (mainly countless "Command WRITE(10)
took X.XYZ seconds" in the VM's vmware.log file under moderate I/O) with
VMware Workstation 7 on a Linux host with an AMD Phenom(tm) II X4 945
Processor, but I still have more testing to see if I can work through
it.  I also do not want to take over this thread.


Sean
--
s...@freebsd.org


Re: Strange problem with 8-stable, VMWare vSphere 4 AMD CPUs (unexpected shutdowns)

2010-02-10 Thread Andriy Gapon
on 10/02/2010 17:36 Ivan Voras said the following:
> It looks like I've stumbled upon a bug in vSphere 4 (recent update) with
> FreeBSD/amd64 8.0/8-stable (but not 7.x) guests on Opteron(s). In this
> combination, everything works fine until a moderate load is started - a
> buildworld is enough. About five minutes after the load starts, the
> vSphere client starts getting timeouts while talking with the host and
> soon after the guest VM is forcibly shut down without any trace of a
> reason in various logs. The same VM runs fine on hosts with Xeon CPUs.
> The shutdown happens regardless if there is a vSphere client connected.
>
> This is very repeatable, on Sun Fire X4140 hosts.
>
> With 7.x/7.stable guests everything works fine.
>
> I'm posting this for future reference and to see if anyone has
> encountered something like that, or has an idea why this happens.

Wild guess - try disabling superpages in the guests.

-- 
Andriy Gapon


Re: Strange problem with 8-stable, VMWare vSphere 4 AMD CPUs (unexpected shutdowns)

2010-02-10 Thread Ivan Voras

On 02/10/10 17:05, Andriy Gapon wrote:

> on 10/02/2010 17:36 Ivan Voras said the following:
>> It looks like I've stumbled upon a bug in vSphere 4 (recent update) with
>> FreeBSD/amd64 8.0/8-stable (but not 7.x) guests on Opteron(s). In this
>> combination, everything works fine until a moderate load is started - a
>> buildworld is enough. About five minutes after the load starts, the
>> vSphere client starts getting timeouts while talking with the host and
>> soon after the guest VM is forcibly shut down without any trace of a
>> reason in various logs. The same VM runs fine on hosts with Xeon CPUs.
>> The shutdown happens regardless if there is a vSphere client connected.
>>
>> This is very repeatable, on Sun Fire X4140 hosts.
>>
>> With 7.x/7.stable guests everything works fine.
>>
>> I'm posting this for future reference and to see if anyone has
>> encountered something like that, or has an idea why this happens.
>
> Wild guess - try disabling superpages in the guests.

It looks like your guess is perfectly correct :) The guest has been 
doing buildworlds for an hour and it works fine. Thanks!


It's strange how this doesn't affect the Xeons...




Re: Strange problem with 8-stable, VMWare vSphere 4 AMD CPUs (unexpected shutdowns)

2010-02-10 Thread Andriy Gapon
on 10/02/2010 19:05 Ivan Voras said the following:
> On 02/10/10 17:05, Andriy Gapon wrote:
>> on 10/02/2010 17:36 Ivan Voras said the following:
>>> It looks like I've stumbled upon a bug in vSphere 4 (recent update) with
>>> FreeBSD/amd64 8.0/8-stable (but not 7.x) guests on Opteron(s). In this
>>> combination, everything works fine until a moderate load is started - a
>>> buildworld is enough. About five minutes after the load starts, the
>>> vSphere client starts getting timeouts while talking with the host and
>>> soon after the guest VM is forcibly shut down without any trace of a
>>> reason in various logs. The same VM runs fine on hosts with Xeon CPUs.
>>> The shutdown happens regardless if there is a vSphere client connected.
>>>
>>> This is very repeatable, on Sun Fire X4140 hosts.
>>>
>>> With 7.x/7.stable guests everything works fine.
>>>
>>> I'm posting this for future reference and to see if anyone has
>>> encountered something like that, or has an idea why this happens.
>>
>> Wild guess - try disabling superpages in the guests.
>
> It looks like your guess is perfectly correct :) The guest has been
> doing buildworlds for an hour and it works fine. Thanks!
>
> It's strange how this doesn't affect the Xeons...

I really can not tell more but there seems to be an issue between our
implementation of superpages (very unique) and AMD processors from 10h family.
I'd recommend not using superpages feature with those processors for time being.

-- 
Andriy Gapon


Re: Strange problem with 8-stable, VMWare vSphere 4 AMD CPUs (unexpected shutdowns)

2010-02-10 Thread Ivan Voras
On 10 February 2010 18:13, Andriy Gapon <a...@icyb.net.ua> wrote:
> on 10/02/2010 19:05 Ivan Voras said the following:
>> On 02/10/10 17:05, Andriy Gapon wrote:
>>
>>> Wild guess - try disabling superpages in the guests.
>>
>> It looks like your guess is perfectly correct :) The guest has been
>> doing buildworlds for an hour and it works fine. Thanks!
>>
>> It's strange how this doesn't affect the Xeons...
>
> I really can not tell more but there seems to be an issue between our
> implementation of superpages (very unique) and AMD processors from 10h family.
> I'd recommend not using superpages feature with those processors for time
> being.

When you say "very unique" is it in the "it is not Linux or Windows"
sense or do we do something nonstandard?


Re: Strange problem with 8-stable, VMWare vSphere 4 AMD CPUs (unexpected shutdowns)

2010-02-10 Thread Andriy Gapon
on 10/02/2010 20:03 Ivan Voras said the following:
> When you say "very unique" is it in the "it is not Linux or Windows"
> sense or do we do something nonstandard?

The former - neither Linux, Windows or OpenSolaris seem to have what we have.
So we might be the first testers for certain processor features.
We don't do anything that strays from specifications.

-- 
Andriy Gapon


Re: Strange problem with 8-stable, VMWare vSphere 4 AMD CPUs (unexpected shutdowns)

2010-02-10 Thread Ivan Voras
On 10 February 2010 19:10, Andriy Gapon <a...@icyb.net.ua> wrote:
> on 10/02/2010 20:03 Ivan Voras said the following:
>> When you say "very unique" is it in the "it is not Linux or Windows"
>> sense or do we do something nonstandard?
>
> The former - neither Linux, Windows or OpenSolaris seem to have what we have.

I can't find the exact documents but I think both Windows
MegaUltimateServer (the highest priced version of Windows Server,
whatever it's called today) and Linux (though disabled and marked
Experimental) have it, or have some kind of support for large pages
that might not be as pervasive (maybe they use it for kernel only?). I
have no idea about (Open)Solaris.


Re: Strange problem with 8-stable, VMWare vSphere 4 AMD CPUs (unexpected shutdowns)

2010-02-10 Thread Ivan Voras
On 10 February 2010 19:26, Ivan Voras <ivo...@freebsd.org> wrote:
> On 10 February 2010 19:10, Andriy Gapon <a...@icyb.net.ua> wrote:
>> on 10/02/2010 20:03 Ivan Voras said the following:
>>> When you say "very unique" is it in the "it is not Linux or Windows"
>>> sense or do we do something nonstandard?
>>
>> The former - neither Linux, Windows or OpenSolaris seem to have what we have.
>
> I can't find the exact documents but I think both Windows
> MegaUltimateServer (the highest priced version of Windows Server,
> whatever it's called today) and Linux (though disabled and marked
> Experimental) have it, or have some kind of support for large pages
> that might not be as pervasive (maybe they use it for kernel only?). I
> have no idea about (Open)Solaris.

VMWare documentation about large pages:

http://www.vmware.com/files/pdf/large_pg_performance.pdf

I think I remember reading that on Windows, the application must use a
special syscall to allocate an area with large pages, but I can't find
the document.


Re: Strange problem with 8-stable, VMWare vSphere 4 AMD CPUs (unexpected shutdowns)

2010-02-10 Thread Andriy Gapon
on 10/02/2010 20:26 Ivan Voras said the following:
> On 10 February 2010 19:10, Andriy Gapon <a...@icyb.net.ua> wrote:
>> on 10/02/2010 20:03 Ivan Voras said the following:
>>> When you say "very unique" is it in the "it is not Linux or Windows"
>>> sense or do we do something nonstandard?
>> The former - neither Linux, Windows or OpenSolaris seem to have what we have.
>
> I can't find the exact documents but I think both Windows
> MegaUltimateServer (the highest priced version of Windows Server,
> whatever it's called today) and Linux (though disabled and marked
> Experimental) have it, or have some kind of support for large pages
> that might not be as pervasive (maybe they use it for kernel only?). I
> have no idea about (Open)Solaris.

I haven't said that those OSes do not use large pages.
I've said what I've said :-)

-- 
Andriy Gapon


Re: Strange problem with 8-stable, VMWare vSphere 4 AMD CPUs (unexpected shutdowns)

2010-02-10 Thread Ivan Voras
On 10 February 2010 19:35, Andriy Gapon <a...@icyb.net.ua> wrote:
> on 10/02/2010 20:26 Ivan Voras said the following:
>> On 10 February 2010 19:10, Andriy Gapon <a...@icyb.net.ua> wrote:
>>> on 10/02/2010 20:03 Ivan Voras said the following:
>>>> When you say "very unique" is it in the "it is not Linux or Windows"
>>>> sense or do we do something nonstandard?
>>> The former - neither Linux, Windows or OpenSolaris seem to have what we
>>> have.
>>
>> I can't find the exact documents but I think both Windows
>> MegaUltimateServer (the highest priced version of Windows Server,
>> whatever it's called today) and Linux (though disabled and marked
>> Experimental) have it, or have some kind of support for large pages
>> that might not be as pervasive (maybe they use it for kernel only?). I
>> have no idea about (Open)Solaris.
>
> I haven't said that those OSes do not use large pages.
> I've said what I've said :-)

Ok :)

Is there a difference between large pages as they are commonly known
and superpages as in FreeBSD ? In other words - are you referencing
some specific mechanism, like automatic promotion / demotion of the
large pages or maybe something else?


Re: Strange problem with 8-stable, VMWare vSphere 4 AMD CPUs (unexpected shutdowns)

2010-02-10 Thread Jeremy Chadwick
On Wed, Feb 10, 2010 at 07:38:37PM +0100, Ivan Voras wrote:
> On 10 February 2010 19:35, Andriy Gapon <a...@icyb.net.ua> wrote:
>> on 10/02/2010 20:26 Ivan Voras said the following:
>>> On 10 February 2010 19:10, Andriy Gapon <a...@icyb.net.ua> wrote:
>>>> on 10/02/2010 20:03 Ivan Voras said the following:
>>>>> When you say "very unique" is it in the "it is not Linux or Windows"
>>>>> sense or do we do something nonstandard?
>>>> The former - neither Linux, Windows or OpenSolaris seem to have what we
>>>> have.
>>>
>>> I can't find the exact documents but I think both Windows
>>> MegaUltimateServer (the highest priced version of Windows Server,
>>> whatever it's called today) and Linux (though disabled and marked
>>> Experimental) have it, or have some kind of support for large pages
>>> that might not be as pervasive (maybe they use it for kernel only?). I
>>> have no idea about (Open)Solaris.
>>
>> I haven't said that those OSes do not use large pages.
>> I've said what I've said :-)
>
> Ok :)
>
> Is there a difference between large pages as they are commonly known
> and superpages as in FreeBSD ? In other words - are you referencing
> some specific mechanism, like automatic promotion / demotion of the
> large pages or maybe something else?

I read what Andriy wrote to mean that the way FreeBSD utilises 4MB TLB
on certain models of AMD processors is broken/quirky, and on those CPUs,
users should stick to vm.pmap.pg_ps_enabled=0 (loader.conf).

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |
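
For anyone auditing a set of guests for the workaround mentioned above, a check for the tunable in a copy of /boot/loader.conf could look roughly like this. It is a sketch, not a full loader.conf parser: find_tunable() is a hypothetical helper, it assumes one name=value or name="value" assignment per line, and it does not handle include files or other loader features.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * Scan loader.conf-style text for a "name=value" line and copy the
 * value, without surrounding double quotes, into out.  A later
 * assignment overrides an earlier one.  Lines starting with '#'
 * never match.  Returns true if the name was found.
 */
bool
find_tunable(const char *text, const char *name, char *out, size_t outlen)
{
	size_t namelen = strlen(name);
	bool found = false;

	if (outlen == 0)
		return (false);
	while (*text != '\0') {
		const char *eol = strchr(text, '\n');
		size_t linelen = eol ? (size_t)(eol - text) : strlen(text);
		const char *p = text;
		const char *end = text + linelen;

		/* Skip leading whitespace on the line. */
		while (p < end && isspace((unsigned char)*p))
			p++;
		if ((size_t)(end - p) > namelen &&
		    strncmp(p, name, namelen) == 0 && p[namelen] == '=') {
			const char *v = p + namelen + 1;
			size_t n = 0;

			if (v < end && *v == '"')
				v++;
			/* The value ends at a quote, whitespace or '#'. */
			while (v + n < end && v[n] != '"' && v[n] != '#' &&
			    !isspace((unsigned char)v[n]))
				n++;
			if (n >= outlen)
				n = outlen - 1;
			memcpy(out, v, n);
			out[n] = '\0';
			found = true;
		}
		text += linelen;
		if (*text == '\n')
			text++;
	}
	return (found);
}
```

On a running system the effective value is what sysctl vm.pmap.pg_ps_enabled reports; the file check only shows whether a manual setting is present.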



Re: Strange problem with 8-stable, VMWare vSphere 4 AMD CPUs (unexpected shutdowns)

2010-02-10 Thread Alan Cox
On Wed, Feb 10, 2010 at 12:46 PM, Jeremy Chadwick
<free...@jdc.parodius.com> wrote:
[snip]

> I read what Andriy wrote to mean that the way FreeBSD utilises 4MB TLB
> on certain models of AMD processors is broken/quirky, and on those CPUs,
> users should stick to vm.pmap.pg_ps_enabled=0 (loader.conf).

No.  He said, "We don't do anything that strays from specifications."  So,
he is not saying that FreeBSD is doing anything broken.

Here is what I know.  Several of us, myself included, have been able to
reproduce either lockups or machine check exceptions when BOTH the machine
check driver and superpages are enabled on AMD family 10h processors.  There
have been no reports of this problem on either Intel or earlier AMD
processors.  Moreover, there is no evidence of instability in AMD family 10h
processors until the machine check driver is enabled.  By default, FreeBSD
8.0 enables superpages but disables the machine check driver.  So, running
natively, i.e., without virtualization, you shouldn't experience a problem,
unless you explicitly enable the machine check driver.  However, running on
top of a hypervisor, like vSphere 4, you might face a problem because the
hypervisor might enable machine check exceptions, regardless of what the
FreeBSD guest does.  I really don't know whether vSphere 4 enables machine
check exception or not.  If it does, then either you disable the use of
superpages in the FreeBSD guest, or you find a way to disable the machine
check driver in the hypervisor.

Both Andriy and I have reported this problem to people at AMD, but we
haven't yet received AMD's analysis.  These things take time.

Regards,
Alan
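
The combinations described above can be written down as a small decision function. This only encodes the observations reported in this message, as of February 2010; it is not an AMD statement about the erratum, and the enum and function names are invented for the sketch. MCE_UNKNOWN stands for the hypervisor case, where the guest cannot tell whether machine checks are enabled underneath it.

```c
#include <stdbool.h>

enum mce_state { MCE_OFF, MCE_ON, MCE_UNKNOWN };
enum exposure { NO_REPORTS, AT_RISK, POSSIBLY_AT_RISK };

/*
 * Classify a configuration according to the reports in the thread:
 * problems were seen only on AMD family 10h with BOTH superpages and
 * machine checks enabled.
 */
enum exposure
classify_exposure(unsigned family, bool superpages_on, enum mce_state mce)
{
	if (family != 0x10 || !superpages_on)
		return (NO_REPORTS);
	switch (mce) {
	case MCE_ON:
		return (AT_RISK);
	case MCE_UNKNOWN:
		return (POSSIBLY_AT_RISK);
	default:
		return (NO_REPORTS);
	}
}
```

A native FreeBSD 8.0 install with default settings corresponds to classify_exposure(0x10, true, MCE_OFF), and a guest under vSphere 4 to the MCE_UNKNOWN case, which is why disabling superpages in the guest is the safe choice there.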