Re: Strange problem with 8-stable, VMWare vSphere 4 & AMD CPUs (unexpected shutdowns)

2010-02-23 Thread Alan Cox

Alan Cox wrote:
> The next public revision guide from AMD will contain an erratum (383)
> that documents the bug.  However, it doesn't really tell us anything
> that we didn't already know.


Could someone on this list please test the attached patch in an amd64 
FreeBSD 8 guest running on vSphere 4 with an AMD Family 10h processor 
underneath?  Before testing the patch, remove the manual setting of 
vm.pmap.pg_ps_enabled="0" from /boot/loader.conf.  After booting the 
virtual machine, please run "sysctl vm.pmap.pg_ps_enabled" to verify 
that superpage promotion has been automatically disabled.
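For anyone who prefers to script the verification step, here is a minimal sketch of the same check done from C rather than with the sysctl(8) command. It is not part of the patch. The helper name read_pg_ps_enabled is made up for illustration, and the non-FreeBSD branch exists only so the file compiles elsewhere.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#ifdef __FreeBSD__
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

/*
 * Hypothetical helper: read vm.pmap.pg_ps_enabled into *value.
 * Returns 0 on success, -1 on failure with errno set.
 */
int
read_pg_ps_enabled(int *value)
{
	assert(value != NULL);
#ifdef __FreeBSD__
	size_t len = sizeof(*value);

	/* Read-only query: no new value is supplied. */
	return (sysctlbyname("vm.pmap.pg_ps_enabled", value, &len, NULL, 0));
#else
	/* The sysctl only exists on FreeBSD; report that it is unsupported. */
	errno = ENOTSUP;
	return (-1);
#endif
}
```

On a patched guest running on a Family 10h host, with the loader.conf override removed, the value read back should be 0.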


Thanks,
Alan

Index: amd64/amd64/pmap.c
===================================================================
--- amd64/amd64/pmap.c	(revision 204175)
+++ amd64/amd64/pmap.c	(working copy)
@@ -686,6 +686,15 @@ pmap_init(void)
 	pv_entry_high_water = 9 * (pv_entry_max / 10);
 
 	/*
+	 * Disable large page mappings by default if the kernel is running in
+	 * a virtual machine on an AMD Family 10h processor.  This is a work-
+	 * around for Erratum 383.
+	 */
+	if (vm_guest == VM_GUEST_VM && cpu_vendor_id == CPU_VENDOR_AMD &&
+	    CPUID_TO_FAMILY(cpu_id) == 0x10)
+		pg_ps_enabled = 0;
+
+	/*
 	 * Are large page mappings enabled?
 	 */
 	TUNABLE_INT_FETCH("vm.pmap.pg_ps_enabled", &pg_ps_enabled);
Index: kern/subr_param.c
===================================================================
--- kern/subr_param.c	(revision 204175)
+++ kern/subr_param.c	(working copy)
@@ -74,10 +74,6 @@ __FBSDID("$FreeBSD$");
 #define	MAXFILES (maxproc * 2)
 #endif
 
-/* Values of enum VM_GUEST members are used as indices in 
- * vm_guest_sysctl_names */
-enum VM_GUEST { VM_GUEST_NO = 0, VM_GUEST_VM, VM_GUEST_XEN };
-
 static int sysctl_kern_vm_guest(SYSCTL_HANDLER_ARGS);
 
 int	hz;
Index: sys/systm.h
===================================================================
--- sys/systm.h	(revision 204175)
+++ sys/systm.h	(working copy)
@@ -45,6 +45,10 @@
 #include 
 #include /* for people using printf mainly */
 
+/* Values of enum VM_GUEST members are used as indices in 
+ * vm_guest_sysctl_names */
+enum VM_GUEST { VM_GUEST_NO = 0, VM_GUEST_VM, VM_GUEST_XEN };
+
 extern int cold;		/* nonzero if we are doing a cold boot */
 extern int rebooting;		/* boot() has been called. */
 extern const char *panicstr;	/* panic message */
@@ -63,6 +67,7 @@ extern int bootverbose;	/* nonzero to print verbo
 
 extern int maxusers;		/* system tune hint */
 extern int ngroups_max;		/* max # of supplemental groups */
+extern int vm_guest;		/* Running as virtual machine guest? */
 
 #ifdef INVARIANTS		/* The option is always available */
 #define	KASSERT(exp,msg) do {						\
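
The decision the pmap.c hunk makes can be sketched outside the kernel as follows. This is an illustration under stated assumptions, not the kernel code: the enum and function names are stand-ins invented here, and cpuid_family() follows the usual CPUID convention (base family in bits 11:8, extended family in bits 27:20 added when the base family is 0xF) rather than reproducing the kernel's CPUID_TO_FAMILY macro. The last function models the ordering in the patch: the default is chosen first, and a vm.pmap.pg_ps_enabled tunable, if set, is applied afterwards and therefore still wins.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-ins for the kernel's vm_guest and cpu_vendor_id values. */
enum guest_kind { GUEST_NO = 0, GUEST_VM, GUEST_XEN };
enum cpu_vendor { VENDOR_OTHER = 0, VENDOR_INTEL, VENDOR_AMD };

/* Family from CPUID signature: base (bits 11:8), plus extended (bits 27:20) if base is 0xF. */
unsigned
cpuid_family(uint32_t cpu_id)
{
	unsigned base = (cpu_id >> 8) & 0xf;
	unsigned ext = (cpu_id >> 20) & 0xff;

	return (base == 0xf ? base + ext : base);
}

/* Default for pg_ps_enabled: off only for a VM guest on AMD Family 10h. */
int
default_pg_ps_enabled(enum guest_kind guest, enum cpu_vendor vendor,
    uint32_t cpu_id)
{
	if (guest == GUEST_VM && vendor == VENDOR_AMD &&
	    cpuid_family(cpu_id) == 0x10)
		return (0);
	return (1);
}

/* A tunable from loader.conf, when present, overrides the default. */
int
effective_pg_ps_enabled(int dflt, const int *tunable)
{
	assert(dflt == 0 || dflt == 1);
	return (tunable != NULL ? *tunable : dflt);
}
```

For example, a CPUID signature of 0x00100f42 decodes to family 0x10, so a VM guest on that processor gets a default of 0 unless the tunable sets it back to 1. As in the patch, only VM_GUEST_VM is matched; a Xen guest keeps the normal default.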
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"

Re: Strange problem with 8-stable, VMWare vSphere 4 & AMD CPUs (unexpected shutdowns)

2010-02-11 Thread Andriy Gapon
on 11/02/2010 20:38 Alan Cox said the following:
> The next public revision guide from AMD will contain an erratum (383)
> that documents the bug.  However, it doesn't really tell us anything
> that we didn't already know.

Pity.  I sort of hoped for more, like a workaround, some magic MSR.

-- 
Andriy Gapon


Re: Strange problem with 8-stable, VMWare vSphere 4 & AMD CPUs (unexpected shutdowns)

2010-02-11 Thread Alan Cox
The next public revision guide from AMD will contain an erratum (383)
that documents the bug.  However, it doesn't really tell us anything
that we didn't already know.


Alan



Re: Strange problem with 8-stable, VMWare vSphere 4 & AMD CPUs (unexpected shutdowns)

2010-02-11 Thread Alan Cox
On Thu, Feb 11, 2010 at 7:13 AM, John Baldwin  wrote:

> On Wednesday 10 February 2010 1:38:37 pm Ivan Voras wrote:
> > On 10 February 2010 19:35, Andriy Gapon  wrote:
> > > on 10/02/2010 20:26 Ivan Voras said the following:
> > >> On 10 February 2010 19:10, Andriy Gapon  wrote:
> > >>> on 10/02/2010 20:03 Ivan Voras said the following:
> > >>>> When you say "very unique" is it in the "it is not Linux or Windows"
> > >>>> sense or do we do something nonstandard?
> > >>> The former - neither Linux, Windows or OpenSolaris seem to have what we
> > >>> have.
> > >>
> > >> I can't find the exact documents but I think both Windows
> > >> MegaUltimateServer (the highest priced version of Windows Server,
> > >> whatever it's called today) and Linux (though disabled and marked
> > >> Experimental) have it, or have some kind of support for large pages
> > >> that might not be as pervasive (maybe they use it for kernel only?). I
> > >> have no idea about (Open)Solaris.
> > >
> > > I haven't said that those OSes do not use large pages.
> > > I've said what I've said :-)
> >
> > Ok :)
> >
> > Is there a difference between "large pages" as they are commonly known
> > and "superpages" as in FreeBSD ? In other words - are you referencing
> > some specific mechanism, like automatic promotion / demotion of the
> > large pages or maybe something else?
>
> Yes, the automatic promotion / demotion.  That is a far-less common feature.
> FreeBSD/i386 has used large pages for the kernel text as far back as at least
> 4.x, but that is not the same as superpages.  Linux does not have automatic
> promotion / demotion to my knowledge.  I do not know about other OS's.
>
A comparison of current large page support among Unix-like and Windows
operating systems has two dimensions: (1) whether or not the creation of
large pages for applications is automatic and (2) whether or not the machine
administrator has to statically partition the machine's physical memory
between large and small pages at boot time.

For FreeBSD, large pages are created automatically and there is not a static
partitioning of physical memory.  In contrast, Linux does not create large
pages automatically and does require a static partitioning.  Specifically,
Linux requires the administrator to explicitly and statically partition the
machine's physical memory at boot time into two parts, one that is dedicated
to large pages and another for general use.  To utilize large pages an
application has to explicitly request memory from the dedicated large pages
pool.  However, to make this somewhat easier, but not automatic, there do
exist re-implementations of malloc that you can explicitly link with your
application.
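
To make "explicitly request" concrete, here is a minimal sketch of what such a request can look like on Linux kernels that provide the MAP_HUGETLB flag to mmap(2). That flag is not mentioned in this thread and is only one of several interfaces, so treat it as an assumption. The helper name alloc_prefer_huge and the 2 MB size are also illustrative. The request fails unless the administrator has reserved huge pages beforehand, in which case the sketch falls back to ordinary pages.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Assumed huge page size; 2 MB is common on amd64 but not guaranteed. */
#define ASSUMED_HUGE_PAGE_SIZE	(2UL * 1024 * 1024)

/*
 * Hypothetical helper: map len bytes of anonymous memory, asking for huge
 * pages first.  Sets *used_huge to 1 if the huge page request succeeded and
 * to 0 if ordinary pages were used.  Returns NULL if both attempts fail.
 * len should be a multiple of the huge page size so that munmap() works.
 */
void *
alloc_prefer_huge(size_t len, int *used_huge)
{
	void *p;

	assert(used_huge != NULL);
	*used_huge = 0;
#ifdef MAP_HUGETLB
	/* Explicit request against the reserved huge page pool. */
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
	    MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
	if (p != MAP_FAILED) {
		*used_huge = 1;
		return (p);
	}
#endif
	/* Pool empty or flag unavailable: fall back to ordinary pages. */
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
	    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	return (p == MAP_FAILED ? NULL : p);
}
```

The contrast with FreeBSD is that nothing like this is needed there: an ordinary allocation can be promoted to a superpage by the kernel without the application asking.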

In Solaris, the application has to explicitly request the use of large
pages, either via explicit kernel calls in the program or from the command
line with support from a library.  However, there is not a static
partitioning of physical memory.  So, for example, when you run the Sun jdk
on Solaris, it explicitly requests large pages for much of its data, and
this works without the administrator having to configure the machine for
large page usage.

To the best of my knowledge, Windows is just like Solaris.

Alan


Re: Strange problem with 8-stable, VMWare vSphere 4 & AMD CPUs (unexpected shutdowns)

2010-02-11 Thread Andriy Gapon
on 11/02/2010 06:37 Alan Cox said the following:
> Here is what I know.  Several of us, myself included, have been able to
> reproduce either lockups or machine check exceptions when BOTH the machine
> check driver and superpages are enabled on AMD family 10h processors.  There
> have been no reports of this problem on either Intel or earlier AMD
> processors.  Moreover, there is no evidence of instability in AMD family 10h
> processors until the machine check driver is enabled.  By default, FreeBSD
> 8.0 enables superpages but disables the machine check driver.  So, running
> natively, i.e., without virtualization, you shouldn't experience a problem,
> unless you explicitly enable the machine check driver.  However, running on
> top of a hypervisor, like vSphere 4, you might face a problem because the
> hypervisor might enable machine check exceptions, regardless of what the
> FreeBSD guest does.  I really don't know whether vSphere 4 enables machine
> check exceptions or not.  If it does, then either you disable the use of
> superpages in the FreeBSD guest, or you find a way to disable the machine
> check driver in the hypervisor.

I'd like to mention another possibility, just in case: machine check might be
enabled/done by firmware (e.g. BIOS).  This typically could be the case for
high-end-ish/server systems.

-- 
Andriy Gapon


Re: Strange problem with 8-stable, VMWare vSphere 4 & AMD CPUs (unexpected shutdowns)

2010-02-11 Thread John Baldwin
On Wednesday 10 February 2010 1:38:37 pm Ivan Voras wrote:
> On 10 February 2010 19:35, Andriy Gapon  wrote:
> > on 10/02/2010 20:26 Ivan Voras said the following:
> >> On 10 February 2010 19:10, Andriy Gapon  wrote:
> >>> on 10/02/2010 20:03 Ivan Voras said the following:
> >>>> When you say "very unique" is it in the "it is not Linux or Windows"
> >>>> sense or do we do something nonstandard?
> >>> The former - neither Linux, Windows or OpenSolaris seem to have what we
> >>> have.
> >>
> >> I can't find the exact documents but I think both Windows
> >> MegaUltimateServer (the highest priced version of Windows Server,
> >> whatever it's called today) and Linux (though disabled and marked
> >> Experimental) have it, or have some kind of support for large pages
> >> that might not be as pervasive (maybe they use it for kernel only?). I
> >> have no idea about (Open)Solaris.
> >
> > I haven't said that those OSes do not use large pages.
> > I've said what I've said :-)
> 
> Ok :)
> 
> Is there a difference between "large pages" as they are commonly known
> and "superpages" as in FreeBSD ? In other words - are you referencing
> some specific mechanism, like automatic promotion / demotion of the
> large pages or maybe something else?

Yes, the automatic promotion / demotion.  That is a far-less common feature.  
FreeBSD/i386 has used large pages for the kernel text as far back as at least 
4.x, but that is not the same as superpages.  Linux does not have automatic 
promotion / demotion to my knowledge.  I do not know about other OS's.

-- 
John Baldwin


Re: Strange problem with 8-stable, VMWare vSphere 4 & AMD CPUs (unexpected shutdowns)

2010-02-10 Thread Alan Cox
On Wed, Feb 10, 2010 at 12:46 PM, Jeremy Chadwick wrote:
[snip]

>
> I read what Andriy wrote to mean that the way FreeBSD utilises 4MB TLB
> on certain models of AMD processors is broken/quirky, and on those CPUs,
> users should stick to vm.pmap.pg_ps_enabled="0" (loader.conf).
>
>
No.  He said, "We don't do anything that strays from specifications."  So,
he is not saying that FreeBSD is doing anything broken.

Here is what I know.  Several of us, myself included, have been able to
reproduce either lockups or machine check exceptions when BOTH the machine
check driver and superpages are enabled on AMD family 10h processors.  There
have been no reports of this problem on either Intel or earlier AMD
processors.  Moreover, there is no evidence of instability in AMD family 10h
processors until the machine check driver is enabled.  By default, FreeBSD
8.0 enables superpages but disables the machine check driver.  So, running
natively, i.e., without virtualization, you shouldn't experience a problem,
unless you explicitly enable the machine check driver.  However, running on
top of a hypervisor, like vSphere 4, you might face a problem because the
hypervisor might enable machine check exceptions, regardless of what the
FreeBSD guest does.  I really don't know whether vSphere 4 enables machine
check exceptions or not.  If it does, then either you disable the use of
superpages in the FreeBSD guest, or you find a way to disable the machine
check driver in the hypervisor.

Both Andriy and I have reported this problem to people at AMD, but we
haven't yet received AMD's analysis.  These things take time.

Regards,
Alan


Re: Strange problem with 8-stable, VMWare vSphere 4 & AMD CPUs (unexpected shutdowns)

2010-02-10 Thread Jeremy Chadwick
On Wed, Feb 10, 2010 at 07:38:37PM +0100, Ivan Voras wrote:
> On 10 February 2010 19:35, Andriy Gapon  wrote:
> > on 10/02/2010 20:26 Ivan Voras said the following:
> >> On 10 February 2010 19:10, Andriy Gapon  wrote:
> >>> on 10/02/2010 20:03 Ivan Voras said the following:
> >>>> When you say "very unique" is it in the "it is not Linux or Windows"
> >>>> sense or do we do something nonstandard?
> >>> The former - neither Linux, Windows or OpenSolaris seem to have what we 
> >>> have.
> >>
> >> I can't find the exact documents but I think both Windows
> >> MegaUltimateServer (the highest priced version of Windows Server,
> >> whatever it's called today) and Linux (though disabled and marked
> >> Experimental) have it, or have some kind of support for large pages
> >> that might not be as pervasive (maybe they use it for kernel only?). I
> >> have no idea about (Open)Solaris.
> >
> > I haven't said that those OSes do not use large pages.
> > I've said what I've said :-)
> 
> Ok :)
> 
> Is there a difference between "large pages" as they are commonly known
> and "superpages" as in FreeBSD ? In other words - are you referencing
> some specific mechanism, like automatic promotion / demotion of the
> large pages or maybe something else?

I read what Andriy wrote to mean that the way FreeBSD utilises 4MB TLB
on certain models of AMD processors is broken/quirky, and on those CPUs,
users should stick to vm.pmap.pg_ps_enabled="0" (loader.conf).

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



Re: Strange problem with 8-stable, VMWare vSphere 4 & AMD CPUs (unexpected shutdowns)

2010-02-10 Thread Ivan Voras
On 10 February 2010 19:35, Andriy Gapon  wrote:
> on 10/02/2010 20:26 Ivan Voras said the following:
>> On 10 February 2010 19:10, Andriy Gapon  wrote:
>>> on 10/02/2010 20:03 Ivan Voras said the following:
>>>> When you say "very unique" is it in the "it is not Linux or Windows"
>>>> sense or do we do something nonstandard?
>>> The former - neither Linux, Windows or OpenSolaris seem to have what we 
>>> have.
>>
>> I can't find the exact documents but I think both Windows
>> MegaUltimateServer (the highest priced version of Windows Server,
>> whatever it's called today) and Linux (though disabled and marked
>> Experimental) have it, or have some kind of support for large pages
>> that might not be as pervasive (maybe they use it for kernel only?). I
>> have no idea about (Open)Solaris.
>
> I haven't said that those OSes do not use large pages.
> I've said what I've said :-)

Ok :)

Is there a difference between "large pages" as they are commonly known
and "superpages" as in FreeBSD ? In other words - are you referencing
some specific mechanism, like automatic promotion / demotion of the
large pages or maybe something else?


Re: Strange problem with 8-stable, VMWare vSphere 4 & AMD CPUs (unexpected shutdowns)

2010-02-10 Thread Andriy Gapon
on 10/02/2010 20:26 Ivan Voras said the following:
> On 10 February 2010 19:10, Andriy Gapon  wrote:
>> on 10/02/2010 20:03 Ivan Voras said the following:
>>> When you say "very unique" is it in the "it is not Linux or Windows"
>>> sense or do we do something nonstandard?
>> The former - neither Linux, Windows or OpenSolaris seem to have what we have.
> 
> I can't find the exact documents but I think both Windows
> MegaUltimateServer (the highest priced version of Windows Server,
> whatever it's called today) and Linux (though disabled and marked
> Experimental) have it, or have some kind of support for large pages
> that might not be as pervasive (maybe they use it for kernel only?). I
> have no idea about (Open)Solaris.

I haven't said that those OSes do not use large pages.
I've said what I've said :-)

-- 
Andriy Gapon


Re: Strange problem with 8-stable, VMWare vSphere 4 & AMD CPUs (unexpected shutdowns)

2010-02-10 Thread Ivan Voras
On 10 February 2010 19:26, Ivan Voras  wrote:
> On 10 February 2010 19:10, Andriy Gapon  wrote:
>> on 10/02/2010 20:03 Ivan Voras said the following:
>>> When you say "very unique" is it in the "it is not Linux or Windows"
>>> sense or do we do something nonstandard?
>>
>> The former - neither Linux, Windows or OpenSolaris seem to have what we have.
>
> I can't find the exact documents but I think both Windows
> MegaUltimateServer (the highest priced version of Windows Server,
> whatever it's called today) and Linux (though disabled and marked
> Experimental) have it, or have some kind of support for large pages
> that might not be as pervasive (maybe they use it for kernel only?). I
> have no idea about (Open)Solaris.

VMWare documentation about large pages:

http://www.vmware.com/files/pdf/large_pg_performance.pdf

I think I remember reading that on Windows, the application must use a
special syscall to allocate an area with large pages, but I can't find
the document.


Re: Strange problem with 8-stable, VMWare vSphere 4 & AMD CPUs (unexpected shutdowns)

2010-02-10 Thread Ivan Voras
On 10 February 2010 19:10, Andriy Gapon  wrote:
> on 10/02/2010 20:03 Ivan Voras said the following:
>> When you say "very unique" is it in the "it is not Linux or Windows"
>> sense or do we do something nonstandard?
>
> The former - neither Linux, Windows or OpenSolaris seem to have what we have.

I can't find the exact documents but I think both Windows
MegaUltimateServer (the highest priced version of Windows Server,
whatever it's called today) and Linux (though disabled and marked
Experimental) have it, or have some kind of support for large pages
that might not be as pervasive (maybe they use it for kernel only?). I
have no idea about (Open)Solaris.


Re: Strange problem with 8-stable, VMWare vSphere 4 & AMD CPUs (unexpected shutdowns)

2010-02-10 Thread Andriy Gapon
on 10/02/2010 20:03 Ivan Voras said the following:
> When you say "very unique" is it in the "it is not Linux or Windows"
> sense or do we do something nonstandard?

The former - neither Linux, Windows or OpenSolaris seem to have what we have.
So we might be the first testers for certain processor features.
We don't do anything that strays from specifications.

-- 
Andriy Gapon


Re: Strange problem with 8-stable, VMWare vSphere 4 & AMD CPUs (unexpected shutdowns)

2010-02-10 Thread Ivan Voras
On 10 February 2010 18:13, Andriy Gapon  wrote:
> on 10/02/2010 19:05 Ivan Voras said the following:
>> On 02/10/10 17:05, Andriy Gapon wrote:

>>> Wild guess - try disabling superpages in the guests.
>>
>> It looks like your guess is perfectly correct :) The guest has been
>> doing buildworlds for an hour and it works fine. Thanks!
>>
>> It's strange how this doesn't affect the Xeons...
>
> I really can not tell more but there seems to be an issue between our
> implementation of superpages (very unique) and AMD processors from 10h family.
> I'd recommend not using the superpages feature with those processors for the
> time being.

When you say "very unique" is it in the "it is not Linux or Windows"
sense or do we do something nonstandard?


Re: Strange problem with 8-stable, VMWare vSphere 4 & AMD CPUs (unexpected shutdowns)

2010-02-10 Thread Andriy Gapon
on 10/02/2010 19:05 Ivan Voras said the following:
> On 02/10/10 17:05, Andriy Gapon wrote:
>> on 10/02/2010 17:36 Ivan Voras said the following:
>>> It looks like I've stumbled upon a bug in vSphere 4 (recent update) with
>>> FreeBSD/amd64 8.0/8-stable (but not 7.x) guests on Opteron(s). In this
>>> combination, everything works fine until a moderate load is started - a
>>> buildworld is enough. About five minutes after the load starts, the
>>> vSphere client starts getting timeouts while talking with the host and
>>> soon after the guest VM is forcibly shut down without any trace of a
>>> reason in various logs. The same VM runs fine on hosts with Xeon CPUs.
>>> The shutdown happens regardless if there is a vSphere client connected.
>>>
>>> This is very repeatable, on Sun Fire X4140 hosts.
>>>
>>> With 7.x/7.stable guests everything works fine.
>>>
>>> I'm posting this for future reference and to see if anyone has
>>> encountered something like that, or has an idea why this happens.
>>
>> Wild guess - try disabling superpages in the guests.
> 
> It looks like your guess is perfectly correct :) The guest has been
> doing buildworlds for an hour and it works fine. Thanks!
> 
> It's strange how this doesn't affect the Xeons...

I really can not tell more but there seems to be an issue between our
implementation of superpages (very unique) and AMD processors from 10h family.
I'd recommend not using the superpages feature with those processors for the
time being.

-- 
Andriy Gapon


Re: Strange problem with 8-stable, VMWare vSphere 4 & AMD CPUs (unexpected shutdowns)

2010-02-10 Thread Ivan Voras

On 02/10/10 17:05, Andriy Gapon wrote:
> on 10/02/2010 17:36 Ivan Voras said the following:
>> It looks like I've stumbled upon a bug in vSphere 4 (recent update) with
>> FreeBSD/amd64 8.0/8-stable (but not 7.x) guests on Opteron(s). In this
>> combination, everything works fine until a moderate load is started - a
>> buildworld is enough. About five minutes after the load starts, the
>> vSphere client starts getting timeouts while talking with the host and
>> soon after the guest VM is forcibly shut down without any trace of a
>> reason in various logs. The same VM runs fine on hosts with Xeon CPUs.
>> The shutdown happens regardless if there is a vSphere client connected.
>>
>> This is very repeatable, on Sun Fire X4140 hosts.
>>
>> With 7.x/7.stable guests everything works fine.
>>
>> I'm posting this for future reference and to see if anyone has
>> encountered something like that, or has an idea why this happens.
>
> Wild guess - try disabling superpages in the guests.

It looks like your guess is perfectly correct :) The guest has been
doing buildworlds for an hour and it works fine. Thanks!

It's strange how this doesn't affect the Xeons...




Re: Strange problem with 8-stable, VMWare vSphere 4 & AMD CPUs (unexpected shutdowns)

2010-02-10 Thread Andriy Gapon
on 10/02/2010 17:36 Ivan Voras said the following:
> It looks like I've stumbled upon a bug in vSphere 4 (recent update) with
> FreeBSD/amd64 8.0/8-stable (but not 7.x) guests on Opteron(s). In this
> combination, everything works fine until a moderate load is started - a
> buildworld is enough. About five minutes after the load starts, the
> vSphere client starts getting timeouts while talking with the host and
> soon after the guest VM is forcibly shut down without any trace of a
> reason in various logs. The same VM runs fine on hosts with Xeon CPUs.
> The shutdown happens regardless if there is a vSphere client connected.
> 
> This is very repeatable, on Sun Fire X4140 hosts.
> 
> With 7.x/7.stable guests everything works fine.
> 
> I'm posting this for future reference and to see if anyone has
> encountered something like that, or has an idea why this happens.

Wild guess - try disabling superpages in the guests.

-- 
Andriy Gapon


Re: Strange problem with 8-stable, VMWare vSphere 4 & AMD CPUs (unexpected shutdowns)

2010-02-10 Thread Sean C. Farley

On Wed, 10 Feb 2010, Ivan Voras wrote:

> It looks like I've stumbled upon a bug in vSphere 4 (recent update)
> with FreeBSD/amd64 8.0/8-stable (but not 7.x) guests on Opteron(s). In
> this combination, everything works fine until a moderate load is
> started - a buildworld is enough. About five minutes after the load
> starts, the vSphere client starts getting timeouts while talking with
> the host and soon after the guest VM is forcibly shut down without any
> trace of a reason in various logs.  The same VM runs fine on hosts
> with Xeon CPUs. The shutdown happens regardless if there is a vSphere
> client connected.
>
> This is very repeatable, on Sun Fire X4140 hosts.
>
> With 7.x/7.stable guests everything works fine.
>
> I'm posting this for future reference and to see if anyone has
> encountered something like that, or has an idea why this happens.


Is it related to this thread:
http://lists.freebsd.org/pipermail/freebsd-stable/2010-February/054755.html

I have been fighting other issues (mainly countless "Command WRITE(10) 
took X.XYZ seconds" in the VM's vmware.log file under moderate I/O) with 
VMware Workstation 7 on a Linux host with an AMD Phenom(tm) II X4 945 
Processor, but I still have more testing to see if I can work through 
it.  I also do not want to take over this thread.


Sean
--
s...@freebsd.org