Launchpad has imported 23 comments from the remote bug at
https://bugzilla.redhat.com/show_bug.cgi?id=448588.

If you reply to an imported comment from within Launchpad, your comment
will be sent to the remote bug automatically. Read more about
Launchpad's inter-bugtracker facilities at
https://help.launchpad.net/InterBugTracking.

------------------------------------------------------------------------
On 2008-05-27T18:19:14+00:00 Alok wrote:

Description of problem:

There have been a series of patches committed to the mainline kernel
recently that address a performance issue for gettimeofday when running
on hypervisors that enable hardware assisted virtualization.  The
non-ideal performance occurs because a CPUID instruction is used to
serialize the pipeline before RDTSC, and when using hardware
virtualization, CPUID always exits to the hypervisor.

The code in question also exists in the RHEL 5.2 64-bit kernel (see
get_cycles_sync in include/asm-x86_64/timex.h).

The fix is to use MFENCE/LFENCE instead of CPUID.  Here are links to
relevant patches by Andi Kleen which are now in git:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=de4218634e3df6d73a3e6cdfdf3a17fa3bc7e013
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=707fa8ed923b1b6a3d7af0d386b0b3abad28ed19
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=fde1b3fa947c2512e3715962ebb1d3a6a9b9bb7d
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=6d63de8dbcda98511206897562ecfcdacf18f523
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=f06e4ec1c15691b0cfd2397ae32214fa36c90d71
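
To make the difference concrete, here is a minimal user-space sketch of the
two serialization styles. It is not the kernel patches themselves. It assumes
GCC or Clang inline assembly on x86-64, and the function names
(tsc_after_cpuid, tsc_after_lfence, tsc_after_mfence, avg_cycles_per_read)
are made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Old style: CPUID serializes the pipeline before RDTSC, but under
 * hardware-assisted virtualization every CPUID exits to the hypervisor. */
uint64_t tsc_after_cpuid(void)
{
    uint32_t lo, hi;
    __asm__ __volatile__("cpuid\n\t"
                         "rdtsc"
                         : "=a"(lo), "=d"(hi)
                         : "0"(0u)              /* CPUID leaf 0 in EAX */
                         : "ebx", "ecx", "memory");
    return ((uint64_t)hi << 32) | lo;
}

/* New style: LFENCE orders RDTSC without leaving the guest. */
uint64_t tsc_after_lfence(void)
{
    uint32_t lo, hi;
    __asm__ __volatile__("lfence\n\t"
                         "rdtsc"
                         : "=a"(lo), "=d"(hi)
                         :
                         : "memory");
    return ((uint64_t)hi << 32) | lo;
}

/* New style, MFENCE variant. */
uint64_t tsc_after_mfence(void)
{
    uint32_t lo, hi;
    __asm__ __volatile__("mfence\n\t"
                         "rdtsc"
                         : "=a"(lo), "=d"(hi)
                         :
                         : "memory");
    return ((uint64_t)hi << 32) | lo;
}

/* Rough average cost, in TSC ticks, of one call to read_tsc. */
uint64_t avg_cycles_per_read(uint64_t (*read_tsc)(void), unsigned n)
{
    uint64_t start;
    unsigned i;

    assert(n > 0);
    start = read_tsc();
    for (i = 0; i < n; i++)
        (void)read_tsc();
    return (read_tsc() - start) / n;
}
```

Comparing avg_cycles_per_read(tsc_after_cpuid, 100000) with
avg_cycles_per_read(tsc_after_lfence, 100000) inside a hardware-virtualized
guest should show the gap described above; the actual numbers depend on the
hypervisor and the CPU.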

Would you be able to make a similar change to the RHEL 5.2 kernel to
address this issue?

Thanks,
Alok

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/248509/comments/0

------------------------------------------------------------------------
On 2008-07-31T10:14:11+00:00 C. wrote:

(In reply to comment #0)

We're also seeing this problem - a little test script (doing 90,000,000
gettimeofday() calls) takes two minutes on our RH VMs... and thirty seconds on
my desktop. 

Applying that kernel patch to the RH kernel isn't trivial - there have been a
*lot* of changes since 2.6.18 was cut. I'm giving it a go here, but I'm not
really a kernel coder.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/248509/comments/4

------------------------------------------------------------------------
On 2008-08-27T10:55:10+00:00 Jon wrote:

This currently appears to be an issue on 64-bit architectures, in how the
Xen kernel inside the VM implements these syscalls: they all exit the VM to
the hypervisor. Is this related to not taking advantage of segmentation
protection?
 
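#include <stdio.h>
#include <sys/time.h>

int main(int argc, char **argv) {
    struct timeval tv;
    int i;

    /* Call gettimeofday() 90 million times; run under time(1). */
    for (i = 0; i < 90000000; i++) {
        gettimeofday(&tv, NULL);
    }

    return 0;
}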
 
 
32 bit VM:
time ./time
 
real 1m42.460s
user 0m8.565s
sys  1m33.834s
 
64-bit VM:
time ./time_64
 
real 6m3.259s
user 0m26.750s
sys  5m36.501s  
 
I'm not 100% sure why gettimeofday needs to exit the VM completely. Is it
not possible to take advantage of the generic 'vsyscalls' implementation
which is in the 2.6.x branches?
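
For anyone who wants a per-call figure instead of a wall-clock total from
time(1), here is a small sketch in the same spirit as the loop in this
comment. It assumes a POSIX system with clock_gettime(); the function name
gettimeofday_ns_per_call is made up for illustration:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>
#include <time.h>

/* Average cost of one gettimeofday() call in nanoseconds, measured
 * with the monotonic clock over `iterations` back-to-back calls. */
double gettimeofday_ns_per_call(long iterations)
{
    struct timespec start, end;
    struct timeval tv;
    double elapsed_ns;
    long i;

    assert(iterations > 0);

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (i = 0; i < iterations; i++)
        gettimeofday(&tv, NULL);
    clock_gettime(CLOCK_MONOTONIC, &end);

    elapsed_ns = (double)(end.tv_sec - start.tv_sec) * 1e9
               + (double)(end.tv_nsec - start.tv_nsec);
    return elapsed_ns / (double)iterations;
}
```

Running this with the same iteration count in the 32-bit and 64-bit guests
should make the difference easier to compare across machines, since the
result does not depend on how long the loop itself was asked to run.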

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/248509/comments/5

------------------------------------------------------------------------
On 2008-09-03T11:03:33+00:00 Prarit wrote:

*** Bug 460983 has been marked as a duplicate of this bug. ***

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/248509/comments/6

------------------------------------------------------------------------
On 2008-09-03T15:33:33+00:00 Chris wrote:

Well, I think there are actually two things at play in this BZ.  The
original request is to shift from using CPUID to serialize gettimeofday
to using MFENCE/LFENCE for serializing.  This should be faster under
*full* virtualization, so a backport to RHEL-5 might be desirable.

Comment #3, however, talks about something different.  In particular,
it's talking about the Xen kernel, which does indeed have the vsyscall
stuff off in 64-bit.  I'm not entirely sure why; I don't think
segmentation protection should have anything to do with it.  In a quick
test, I turned it on, and your little benchmark there went from 2m17s on
this box to about 30s.  I'm going to make a new BZ about that issue,
since it doesn't really belong here.

Chris Lalancette

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/248509/comments/7

------------------------------------------------------------------------
On 2008-10-22T16:57:36+00:00 Alok wrote:

(In reply to comment #4)
> Well, I think there are actually two things at play in this BZ.  The original
> request is to shift from using CPUID to serialize gettimeofday to using
> MFENCE/LFENCE for serializing.  This should be faster under *full*
> virtualization, so a backport to RHEL-5 might be desirable.
> 

<ping>
Has anybody been working on these patches? Do we have an ETA for which
release will include this fix?

> Comment #3, however, talks about something different.  In particular, it's
> talking about the Xen kernel, which does indeed have the vsyscall stuff off in
> 64-bit.  I'm not entirely sure why; I don't think segmentation protection
> should have anything to do with it.  In a quick test, I turned it on, and your
> little benchmark there went from 2m17s on this box to about 30s.  I'm going to
> make a new BZ about that issue, since it doesn't really belong here.

Can you please cc me on this BZ.

Thanks,
Alok

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/248509/comments/9

------------------------------------------------------------------------
On 2008-10-28T15:09:18+00:00 Bill wrote:

There is no ETA for these, but the earliest we can evaluate it for would
be RHEL 5.4 at this point.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/248509/comments/10

------------------------------------------------------------------------
On 2008-11-21T05:55:53+00:00 Alok wrote:

[Changed the component category to "kernel": this is a generic kernel
problem, although the performance impact is greater for kernels
running under hypervisors.]

I have cooked up some patches which use the mfence/lfence instead of cpuid.
Please have a look and let me know if you have any comments. Will upload them 
shortly.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/248509/comments/11

------------------------------------------------------------------------
On 2008-11-21T05:58:33+00:00 Alok wrote:

Created attachment 324275
x86: Implement support to synchronize RDTSC with LFENCE on Intel CPUs

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/248509/comments/12

------------------------------------------------------------------------
On 2008-11-21T05:59:41+00:00 Alok wrote:

Created attachment 324276
x86: implement support to synchronize RDTSC through MFENCE on AMD CPUs

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/248509/comments/13

------------------------------------------------------------------------
On 2008-11-21T06:00:49+00:00 Alok wrote:

Created attachment 324277
x86: introduce rdtsc_barrier()

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/248509/comments/14

------------------------------------------------------------------------
On 2008-11-21T23:20:20+00:00 Andrew wrote:

*** Bug 468459 has been marked as a duplicate of this bug. ***

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/248509/comments/15

------------------------------------------------------------------------
On 2009-01-16T15:08:04+00:00 Chris wrote:

I took a quick look at these patches, and they look entirely reasonable.
The only question I have concerns the
set_bit(X86_FEATURE_{L,M}FENCE_RDTSC, &c->x86_capability); don't we have
to protect that by first checking if sse2 is enabled?  Upstream that's
done with the "cpu_has_xmm2" check, but since RHEL-5 doesn't have that,
we'd have to do something a little more primitive.  Or am I missing
something?

Chris Lalancette

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/248509/comments/16

------------------------------------------------------------------------
On 2009-01-16T19:36:23+00:00 Alok wrote:

(In reply to comment #20)
> I took a quick look at these patches, and they look entirely reasonable.  The
> only question I have concerns the set_bit(X86_FEATURE_{L,M}FENCE_RDTSC,
> &c->x86_capability); don't we have to protect that by first checking if sse2
> is enabled?

These barrier changes are done only for 64-bit code. All 64-bit machines
have SSE2 enabled, at least that's what include/asm-x86_64/cpufeature.h
says:

#define cpu_has_xmm2           1

So I don't think we need the xmm2 check for RHEL5, since the 32-bit and
64-bit code is still separate.
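
For reference, a runtime version of that check can be sketched in user space
as follows. It assumes GCC or Clang on x86 with the compiler-provided
<cpuid.h> header; cpu_has_sse2 and can_use_fenced_rdtsc are hypothetical
names, not kernel functions:

```c
#include <assert.h>
#include <cpuid.h>

/* Returns 1 if CPUID leaf 1 reports SSE2 (EDX bit 26), else 0. */
int cpu_has_sse2(void)
{
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;               /* leaf 1 not available */
    return (edx & bit_SSE2) != 0;
}

/* LFENCE and MFENCE were introduced with SSE2, so that is the gate. */
int can_use_fenced_rdtsc(void)
{
    int sse2 = cpu_has_sse2();

#ifdef __x86_64__
    /* SSE2 is part of the x86-64 baseline, so this always holds here. */
    assert(sse2);
#endif
    return sse2;
}
```

On a 64-bit build the check is redundant, which matches the hard-coded
define above; it only matters where 32-bit and 64-bit code share a source
tree.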

Thanks,
Alok

> Upstream that's done with the "cpu_has_xmm2" check, but since RHEL-5
> doesn't have that, we'd have to do something a little more primitive.  Or am I
> missing something?
> 
> Chris Lalancette

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/248509/comments/17

------------------------------------------------------------------------
On 2009-01-19T07:52:28+00:00 Chris wrote:

Ah, of course, silly me.  Upstream has the combined 32/64 bit code,
which is why it needs the protection.  OK, great, thanks a lot!

Chris Lalancette

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/248509/comments/18

------------------------------------------------------------------------
On 2009-01-23T10:24:32+00:00 Chris wrote:

I've uploaded a test kernel that contains this fix (along with several others)
to this location:

http://people.redhat.com/clalance/virttest

Could the original reporter try out the test kernels there, and report back if
it fixes the problem?

Thanks,
Chris Lalancette

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/248509/comments/19

------------------------------------------------------------------------
On 2009-01-27T02:07:03+00:00 Alok wrote:

(In reply to comment #23)
> I've uploaded a test kernel that contains this fix (along with several others)
> to this location:
> 
> http://people.redhat.com/clalance/virttest
> 
> Could the original reporter try out the test kernels there, and report back if
> it fixes the problem?
> 

Yep this kernel does fix the performance problems for me, thanks for
picking up the patches.

Alok

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/248509/comments/20

------------------------------------------------------------------------
On 2009-02-02T07:42:50+00:00 Chris wrote:

Great, thanks for the testing!

Chris Lalancette

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/248509/comments/21

------------------------------------------------------------------------
On 2009-02-16T15:04:52+00:00 RHEL wrote:

Updating PM score.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/248509/comments/22

------------------------------------------------------------------------
On 2009-06-29T16:52:42+00:00 Evan wrote:

The fix for this is included in the latest RHEL5.4 beta kernels,
available at:

http://people.redhat.com/dzickus/el5

Alok (or anyone else hitting this bug), can you please test this kernel
when possible? Thanks!

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/248509/comments/23

------------------------------------------------------------------------
On 2009-07-03T18:03:10+00:00 Chris wrote:

~~ Attention - RHEL 5.4 Beta Released! ~~

RHEL 5.4 Beta has been released! There should be a fix present in the
Beta release that addresses this particular request. Please test and
report back results here, at your earliest convenience. RHEL 5.4 General
Availability release is just around the corner!

If you encounter any issues while testing Beta, please describe the
issues you have encountered and set the bug into NEED_INFO. If you
encounter new issues, please clone this bug to open a new issue and
request it be reviewed for inclusion in RHEL 5.4 or a later update, if
it is not of urgent severity.

Please do not flip the bug status to VERIFIED. Only post your
verification results, and if available, update Verified field with the
appropriate value.

Questions can be posted to this bug or your customer or partner
representative.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/248509/comments/24

------------------------------------------------------------------------
On 2009-07-21T22:19:19+00:00 Alok wrote:

RHEL5.4 looks okay WRT these patches too. Thanks.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/248509/comments/25

------------------------------------------------------------------------
On 2009-09-02T08:18:45+00:00 errata-xmlrpc wrote:

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1243.html

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/248509/comments/26


** Changed in: linux (Fedora)
   Importance: Unknown => Medium

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/248509

Title:
  Hardy CPU physical hot plugging is broken

Status in linux package in Ubuntu:
  Invalid
Status in linux source package in Hardy:
  Fix Released
Status in linux source package in Intrepid:
  Invalid
Status in linux package in Fedora:
  Fix Released

Bug description:
  In the mainline kernel there are some deadlock issues when hot removing a
  processor. These issues were discussed in detail at:

  https://bugzilla.redhat.com/show_bug.cgi?id=448588

  These 7 commits fix CPU hotplug:

  ba62b077871a5255e271f4fdae57167651839277 - acpi: fix "buggy BIOS check" when CPUs are hot removed
  63d38198a0f57dca87e6cb79931c7bedbb7ab069 - x86: fix paranoia about using BIOS quickboot mechanism.
  2f67a0695dc389247c05041b05d2a2b06fc102a3 - flush kacpi_notify_wq before removing notify handler
  087803d18fb8259cb844c075a35fb27c2d80792e - fix a deadlock issue when poking "eject" file
  3d5ed99657e93cd0453a187c478e663e6b6a3a8b - force offline the processor during hot-removal
  89d675d0f987534139d330eb2689ec53fab9404e - create sysfs link from acpi device to sysdev for cpu
  ad7f0d9feee6980a3ab3ea806854f56817d1da8e - ACPI: fix checkpatch.pl complaints in scan.c

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/248509/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
