Re: [Xen-devel] [PATCH v3 2/2] x86/Intel: virtualize support for cpuid faulting

2016-10-24 Thread Boris Ostrovsky
On 10/24/2016 03:22 PM, Kyle Huey wrote:
> On Mon, Oct 24, 2016 at 8:05 AM, Boris Ostrovsky
>  wrote:
>> On 10/24/2016 12:18 AM, Kyle Huey wrote:
>>> The anomalies we see appear to be related to, or at least triggerable
>>> by, the performance monitoring interrupt.  The following program runs
>>> a loop of roughly 2^25 conditional branches.  It takes one argument,
>>> the number of conditional branches to program the PMI to trigger on.
>>> The default is 50,000, and if you run the program with that it'll
>>> produce the same value every time.  If you drop it to 5000 or so
>>> you'll probably see occasional off-by-one discrepancies.  If you drop
>>> it to 500 the performance counter values fluctuate wildly.
>> Yes, it does change but I also see the difference on baremetal (although
>> not as big as it is in an HVM guest):
>> ostr@workbase> ./pmu 500
>> Period is 500
>> Counted 5950003 conditional branches
>> ostr@workbase> ./pmu 500
>> Period is 500
>> Counted 5850003 conditional branches
>> ostr@workbase> ./pmu 500
>> Period is 500
>> Counted 7530107 conditional branches
>> ostr@workbase>
> Yeah, you're right.  I simplified the testcase too far.  I have
> included a better one.  This testcase is stable on bare metal (down to
> an interrupt every 10 branches, I didn't try below that) and more
> accurately represents what our software actually does. 

When I run this program in a loop the first iteration is always off:

ostr@workbase> while [ true ]; do taskset -c 0 ./pmu 500|grep -v Period; done
Counted 33554556 conditional branches
Counted 33554729 conditional branches
Counted 33554729 conditional branches
...

but then it indeed is stable. Could it be "priming" the branch
predictor? Does this counter count mis-predicted branches? Probably not
since the first number is smaller than the rest.


>  rr acts as a
> ptrace supervisor to the process being recorded, and it seems that
> context switching between the supervisor and tracee processes
> stabilizes the performance counter values somehow.

Not sure I understand what you mean by this. The supervising thread is
presumably sitting in kernel (in waitpid()) so nothing should be counted
for it.

-boris



___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 2/2] x86/Intel: virtualize support for cpuid faulting

2016-10-24 Thread Kyle Huey
On Mon, Oct 24, 2016 at 8:05 AM, Boris Ostrovsky
 wrote:
> On 10/24/2016 12:18 AM, Kyle Huey wrote:
>>
>> The anomalies we see appear to be related to, or at least triggerable
>> by, the performance monitoring interrupt.  The following program runs
>> a loop of roughly 2^25 conditional branches.  It takes one argument,
>> the number of conditional branches to program the PMI to trigger on.
>> The default is 50,000, and if you run the program with that it'll
>> produce the same value every time.  If you drop it to 5000 or so
>> you'll probably see occasional off-by-one discrepancies.  If you drop
>> it to 500 the performance counter values fluctuate wildly.
>
> Yes, it does change but I also see the difference on baremetal (although
> not as big as it is in an HVM guest):
> ostr@workbase> ./pmu 500
> Period is 500
> Counted 5950003 conditional branches
> ostr@workbase> ./pmu 500
> Period is 500
> Counted 5850003 conditional branches
> ostr@workbase> ./pmu 500
> Period is 500
> Counted 7530107 conditional branches
> ostr@workbase>

Yeah, you're right.  I simplified the testcase too far.  I have
included a better one.  This testcase is stable on bare metal (down to
an interrupt every 10 branches, I didn't try below that) and more
accurately represents what our software actually does.  rr acts as a
ptrace supervisor to the process being recorded, and it seems that
context switching between the supervisor and tracee processes
stabilizes the performance counter values somehow.

>> I'm not yet sure if this is specifically related to the PMI, or if it
>> can be caused by any interrupt and it's only how frequently the
>> interrupts occur that matters.
>
> I have never used the file interface to performance counters, but what are
> we reporting here (in read_counter()) --- total number of events or
> number of events since last sample? It is also curious to me that the
> counter is non-zero after PERF_EVENT_IOC_RESET (but again, I don't have
> any experience with these interfaces).

It should be number of events since the last time the counter was
reset (or overflowed, I guess).  On my machine the counter value is
zero both before and after the PERF_EVENT_IOC_RESET ioctl.

> Also, exclude_guest doesn't appear to make any difference, I don't know
> if there are any bits in Intel counters that allow you to distinguish
> guest from host (unlike AMD, where there is a bit for that).

exclude_guest is a Linux specific thing for excluding KVM guests.
There is no hardware support involved; it's handled entirely in the
perf events infrastructure in the kernel.

- Kyle

#define _GNU_SOURCE 1

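#include <assert.h>
#include <fcntl.h>
#include <linux/perf_event.h>
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/ptrace.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>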

static struct perf_event_attr rcb_attr;
static uint64_t period;
static int fd;

void counter_on(uint64_t ticks)
{
  int ret = ioctl(fd, PERF_EVENT_IOC_RESET, 0);
  assert(!ret);
  ret = ioctl(fd, PERF_EVENT_IOC_PERIOD, &ticks);
  assert(!ret);
  ret = ioctl(fd, PERF_EVENT_IOC_ENABLE, 1);
  assert(!ret);
}

void counter_off()
{
  int ret = ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
  assert(!ret);
}

int64_t read_counter()
{
  int64_t val;
  ssize_t nread = read(fd, &val, sizeof(val));
  assert(nread == sizeof(val));
  return val;
}

void do_test()
{
  int i, dummy;

  for (i = 0; i < (1 << 25); i++) {
    dummy += i % (1 << 10);
    dummy += i % (79 * (1 << 10));
  }
}

int main(int argc, const char* argv[])
{
  int pid;
  memset(&rcb_attr, 0, sizeof(rcb_attr));
  rcb_attr.size = sizeof(rcb_attr);
  rcb_attr.type = PERF_TYPE_RAW;
  /* Intel retired conditional branches counter, ring 3 only */
  rcb_attr.config = 0x5101c4;
  rcb_attr.exclude_kernel = 1;
  rcb_attr.exclude_guest = 1;
  /* We'll change this later */
  rcb_attr.sample_period = 0x;

  signal(SIGALRM, SIG_IGN);
  pid = fork();
  if (pid == 0) {
    /* Wait for the parent */
    kill(getpid(), SIGSTOP);
    do_test();
    return 0;
  }

  /* start the counter */
  fd = syscall(__NR_perf_event_open, &rcb_attr, pid, -1, -1, 0);
  if (fd < 0) {
    printf("Failed to initialize counter\n");
    return -1;
  }

  counter_off();

  struct f_owner_ex own;
  own.type = F_OWNER_PID;
  own.pid = pid;
  if (fcntl(fd, F_SETOWN_EX, &own) ||
      fcntl(fd, F_SETFL, O_ASYNC) ||
      fcntl(fd, F_SETSIG, SIGALRM)) {
    printf("Failed to make counter async\n");
    return -1;
  }

  period = 50000;
  if (argc > 1) {
    sscanf(argv[1], "%ld", &period);
  }

  printf("Period is %ld\n", period);

  counter_on(period);
  ptrace(PTRACE_SEIZE, pid, NULL, 0);
  ptrace(PTRACE_CONT, pid, NULL, SIGCONT);

  int status = 0;
  while (1) {
    waitpid(pid, &status, 0);
    if (WIFEXITED(status)) {
      break;
    }
    if (WIFSIGNALED(status)) {
      assert(0);
      continue;
    }
    if (WIFSTOPPED(status)) {
      if (WSTOPSIG(status) == SIGALRM ||
          WSTOPSIG(status) == SIGSTOP) {
        ptrace(PTRACE_CONT, pid, NULL, WSTOPSIG(status));
        continue;
      }
    }
assert(0 

Re: [Xen-devel] [PATCH v3 2/2] x86/Intel: virtualize support for cpuid faulting

2016-10-24 Thread Boris Ostrovsky
On 10/24/2016 12:18 AM, Kyle Huey wrote:
>
> The anomalies we see appear to be related to, or at least triggerable
> by, the performance monitoring interrupt.  The following program runs
> a loop of roughly 2^25 conditional branches.  It takes one argument,
> the number of conditional branches to program the PMI to trigger on.
> The default is 50,000, and if you run the program with that it'll
> produce the same value every time.  If you drop it to 5000 or so
> you'll probably see occasional off-by-one discrepancies.  If you drop
> it to 500 the performance counter values fluctuate wildly.

Yes, it does change but I also see the difference on baremetal (although
not as big as it is in an HVM guest):
ostr@workbase> ./pmu 500
Period is 500
Counted 5950003 conditional branches
ostr@workbase> ./pmu 500
Period is 500
Counted 5850003 conditional branches
ostr@workbase> ./pmu 500
Period is 500
Counted 7530107 conditional branches
ostr@workbase>



>
> I'm not yet sure if this is specifically related to the PMI, or if it
> can be caused by any interrupt and it's only how frequently the
> interrupts occur that matters.

I have never used the file interface to performance counters, but what are
we reporting here (in read_counter()) --- total number of events or
number of events since last sample? It is also curious to me that the
counter is non-zero after PERF_EVENT_IOC_RESET (but again, I don't have
any experience with these interfaces).

Also, exclude_guest doesn't appear to make any difference, I don't know
if there are any bits in Intel counters that allow you to distinguish
guest from host (unlike AMD, where there is a bit for that).


-boris



Re: [Xen-devel] [PATCH v3 2/2] x86/Intel: virtualize support for cpuid faulting

2016-10-23 Thread Kyle Huey
On Fri, Oct 21, 2016 at 8:52 AM, Kyle Huey  wrote:
> On Thu, Oct 20, 2016 at 7:40 AM, Boris Ostrovsky
>  wrote:
>> On 10/20/2016 10:11 AM, Andrew Cooper wrote:
>>> On 20/10/16 14:55, Kyle Huey wrote:
>>>>>> That said, rr currently does not work in Xen guests due to some PMU
>>>>>> issues that we haven't tracked down yet.
>>>>> Is this RR trying to use vPMU and it not functioning, or not
>>>>> specifically trying to use PMU facilities and getting stuck anyway?
>>>> The latter.  rr relies on the values returned by the PMU (the retired
>>>> conditional branches counter in particular) being exactly the same
>>>> during the recording and replay phases.  This is true when running on
>>>> bare metal, and when running inside a KVM guest, but when running in a
>>>> Xen HVM guest we see values that are off by a branch or two on a small
>>>> fraction of our tests.  Since it works in KVM I suspect this is some
>>>> sort of issue with how Xen multiplexes the real PMU and events are
>>>> "leaking" between guests (or perhaps from Xen itself, though I don't
>>>> think the Xen kernel executes any ring 3 code).  Even if that's
>>>> correct we're a long way from tracking it down and patching it though.
>>> Hmm.  That is unfortunate, and does point towards a bug in Xen.  Are
>>> these tests which notice the problem easy to run?
>>>
>>> Boris (CC'd) is the maintainer of that code.  It has undergone quite a
>>> few changes recently.
>>
>> I am actually not the maintainer, I just break this code more often than
>> others.
>>
>> But yes, having a test case would make it much easier to understand what
>> is not working and why.
>>
>> Would something like
>>
>> wrmsr(PERFCTR,0);
>> wrmsr(EVNTSEL, XXX); //enable counter
>> // do something simple, with branches
>> wrmsr(EVTSEL,YYY); // disable counter
>>
>> demonstrate the problem? (I assume we are talking about HVM guest)
>>
>> -boris
>>
>
> That is a good question.  I'll see if I can reduce the problem down
> from "run Linux and run our tests inside it".

The anomalies we see appear to be related to, or at least triggerable
by, the performance monitoring interrupt.  The following program runs
a loop of roughly 2^25 conditional branches.  It takes one argument,
the number of conditional branches to program the PMI to trigger on.
The default is 50,000, and if you run the program with that it'll
produce the same value every time.  If you drop it to 5000 or so
you'll probably see occasional off-by-one discrepancies.  If you drop
it to 500 the performance counter values fluctuate wildly.

I'm not yet sure if this is specifically related to the PMI, or if it
can be caused by any interrupt and it's only how frequently the
interrupts occur that matters.
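
To put the three periods mentioned above in perspective, the following is a rough sketch of how many PMIs each one implies over the test loop. It assumes the loop retires about 2^25 conditional branches (the figure given above) and that one interrupt fires per full period; `LOOP_BRANCHES` and `expected_pmis` are names invented for this illustration and do not appear in the test program.

```c
#include <stdint.h>

/* Approximate number of conditional branches retired by the test loop. */
#define LOOP_BRANCHES (UINT64_C(1) << 25)

/*
 * Hypothetical helper: estimate how many performance monitoring
 * interrupts fire during the loop if the counter overflows once
 * every `period` conditional branches. Returns 0 for a zero period.
 */
static uint64_t expected_pmis(uint64_t period)
{
    return period ? LOOP_BRANCHES / period : 0;
}
```

Under those assumptions, going from a period of 50,000 to 500 raises the interrupt count by a factor of about 100, which is consistent with the discrepancies becoming more visible as the period shrinks.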

- Kyle

#define _GNU_SOURCE 1

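#include <assert.h>
#include <fcntl.h>
#include <linux/perf_event.h>
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>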

static struct perf_event_attr rcb_attr;
static uint64_t period;
static int fd;

void counter_on(uint64_t ticks)
{
  int ret = ioctl(fd, PERF_EVENT_IOC_RESET, 0);
  assert(!ret);
  ret = ioctl(fd, PERF_EVENT_IOC_PERIOD, &ticks);
  assert(!ret);
  ret = ioctl(fd, PERF_EVENT_IOC_ENABLE, 1);
  assert(!ret);
}

void counter_off()
{
  int ret = ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
  assert(!ret);
}

int64_t read_counter()
{
  int64_t val;
  ssize_t nread = read(fd, &val, sizeof(val));
  assert(nread == sizeof(val));
  return val;
}

void do_test()
{
  int64_t counts;
  int i, dummy;

  counter_on(period);
  for (i = 0; i < (1 << 25); i++) {
    dummy += i % (1 << 10);
    dummy += i % (79 * (1 << 10));
  }

  counter_off();
  counts = read_counter();
  printf("Counted %ld conditional branches\n", counts);
}

int main(int argc, const char* argv[])
{
  memset(&rcb_attr, 0, sizeof(rcb_attr));
  rcb_attr.size = sizeof(rcb_attr);
  rcb_attr.type = PERF_TYPE_RAW;
  /* Intel retired conditional branches counter, ring 3 only */
  rcb_attr.config = 0x5101c4;
  rcb_attr.exclude_kernel = 1;
  rcb_attr.exclude_guest = 1;
  /* We'll change this later */
  rcb_attr.sample_period = 0x;

  /* start the counter */
  fd = syscall(__NR_perf_event_open, &rcb_attr, 0, -1, -1, 0);
  if (fd < 0) {
    printf("Failed to initialize counter\n");
    return -1;
  }

  signal(SIGALRM, SIG_IGN);

  if (fcntl(fd, F_SETFL, O_ASYNC) || fcntl(fd, F_SETSIG, SIGALRM)) {
    printf("Failed to make counter async\n");
    return -1;
  }

  counter_off();

  period = 50000;
  if (argc > 1) {
    sscanf(argv[1], "%ld", &period);
  }

  printf("Period is %ld\n", period);

  do_test();

  return 0;
}



Re: [Xen-devel] [PATCH v3 2/2] x86/Intel: virtualize support for cpuid faulting

2016-10-21 Thread Kyle Huey
On Thu, Oct 20, 2016 at 7:40 AM, Boris Ostrovsky
 wrote:
> On 10/20/2016 10:11 AM, Andrew Cooper wrote:
>> On 20/10/16 14:55, Kyle Huey wrote:
>>>>> That said, rr currently does not work in Xen guests due to some PMU
>>>>> issues that we haven't tracked down yet.
>>>> Is this RR trying to use vPMU and it not functioning, or not
>>>> specifically trying to use PMU facilities and getting stuck anyway?
>>> The latter.  rr relies on the values returned by the PMU (the retired
>>> conditional branches counter in particular) being exactly the same
>>> during the recording and replay phases.  This is true when running on
>>> bare metal, and when running inside a KVM guest, but when running in a
>>> Xen HVM guest we see values that are off by a branch or two on a small
>>> fraction of our tests.  Since it works in KVM I suspect this is some
>>> sort of issue with how Xen multiplexes the real PMU and events are
>>> "leaking" between guests (or perhaps from Xen itself, though I don't
>>> think the Xen kernel executes any ring 3 code).  Even if that's
>>> correct we're a long way from tracking it down and patching it though.
>> Hmm.  That is unfortunate, and does point towards a bug in Xen.  Are
>> these tests which notice the problem easy to run?
>>
>> Boris (CC'd) is the maintainer of that code.  It has undergone quite a
>> few changes recently.
>
> I am actually not the maintainer, I just break this code more often than
> others.
>
> But yes, having a test case would make it much easier to understand what
> is not working and why.
>
> Would something like
>
> wrmsr(PERFCTR,0);
> wrmsr(EVNTSEL, XXX); //enable counter
> // do something simple, with branches
> wrmsr(EVTSEL,YYY); // disable counter
>
> demonstrate the problem? (I assume we are talking about HVM guest)
>
> -boris
>

That is a good question.  I'll see if I can reduce the problem down
from "run Linux and run our tests inside it".

- Kyle



Re: [Xen-devel] [PATCH v3 2/2] x86/Intel: virtualize support for cpuid faulting

2016-10-20 Thread Boris Ostrovsky
On 10/20/2016 10:11 AM, Andrew Cooper wrote:
> On 20/10/16 14:55, Kyle Huey wrote:
>>>> That said, rr currently does not work in Xen guests due to some PMU
>>>> issues that we haven't tracked down yet.
>>> Is this RR trying to use vPMU and it not functioning, or not
>>> specifically trying to use PMU facilities and getting stuck anyway?
>> The latter.  rr relies on the values returned by the PMU (the retired
>> conditional branches counter in particular) being exactly the same
>> during the recording and replay phases.  This is true when running on
>> bare metal, and when running inside a KVM guest, but when running in a
>> Xen HVM guest we see values that are off by a branch or two on a small
>> fraction of our tests.  Since it works in KVM I suspect this is some
>> sort of issue with how Xen multiplexes the real PMU and events are
>> "leaking" between guests (or perhaps from Xen itself, though I don't
>> think the Xen kernel executes any ring 3 code).  Even if that's
>> correct we're a long way from tracking it down and patching it though.
> Hmm.  That is unfortunate, and does point towards a bug in Xen.  Are
> these tests which notice the problem easy to run?
>
> Boris (CC'd) is the maintainer of that code.  It has undergone quite a
> few changes recently.

I am actually not the maintainer, I just break this code more often than
others.

But yes, having a test case would make it much easier to understand what
is not working and why.

Would something like

wrmsr(PERFCTR,0);
wrmsr(EVNTSEL, XXX); //enable counter
// do something simple, with branches
wrmsr(EVNTSEL, YYY); // disable counter
   
demonstrate the problem? (I assume we are talking about HVM guest)
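
As one possible reading of the XXX and YYY placeholders above, here is a minimal sketch that composes an event-select value from the architectural IA32_PERFEVTSELx fields (event select in bits 0-7, unit mask in bits 8-15, USR at bit 16, OS at bit 17, INT at bit 20, EN at bit 22). The helper names `make_evtsel` and `evtsel_disabled` are hypothetical, and using the retired-conditional-branches encoding (event 0xc4, umask 0x01) here is an assumption borrowed from the raw config 0x5101c4 in the perf-based test program elsewhere in this thread.

```c
#include <stdint.h>

/* Selected flag bits of an Intel IA32_PERFEVTSELx register. */
#define EVTSEL_USR UINT32_C(0x00010000) /* bit 16: count at CPL > 0 */
#define EVTSEL_OS  UINT32_C(0x00020000) /* bit 17: count at CPL 0 */
#define EVTSEL_INT UINT32_C(0x00100000) /* bit 20: raise a PMI on overflow */
#define EVTSEL_EN  UINT32_C(0x00400000) /* bit 22: enable the counter */

/* Hypothetical helper: build an event-select value from its fields. */
static uint32_t make_evtsel(uint8_t event, uint8_t umask, uint32_t flags)
{
    return (uint32_t)event | ((uint32_t)umask << 8) | flags;
}

/* Hypothetical helper: the same selection with the enable bit cleared. */
static uint32_t evtsel_disabled(uint32_t evtsel)
{
    return evtsel & ~EVTSEL_EN;
}
```

With these definitions, `make_evtsel(0xc4, 0x01, EVTSEL_USR | EVTSEL_INT | EVTSEL_EN)` reproduces 0x5101c4, so XXX could be that value and YYY the result of passing it to `evtsel_disabled`.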

-boris



Re: [Xen-devel] [PATCH v3 2/2] x86/Intel: virtualize support for cpuid faulting

2016-10-20 Thread Andrew Cooper
On 20/10/16 14:55, Kyle Huey wrote:
>
>>> That said, rr currently does not work in Xen guests due to some PMU
>>> issues that we haven't tracked down yet.
>> Is this RR trying to use vPMU and it not functioning, or not
>> specifically trying to use PMU facilities and getting stuck anyway?
> The latter.  rr relies on the values returned by the PMU (the retired
> conditional branches counter in particular) being exactly the same
> during the recording and replay phases.  This is true when running on
> bare metal, and when running inside a KVM guest, but when running in a
> Xen HVM guest we see values that are off by a branch or two on a small
> fraction of our tests.  Since it works in KVM I suspect this is some
> sort of issue with how Xen multiplexes the real PMU and events are
> "leaking" between guests (or perhaps from Xen itself, though I don't
> think the Xen kernel executes any ring 3 code).  Even if that's
> correct we're a long way from tracking it down and patching it though.

Hmm.  That is unfortunate, and does point towards a bug in Xen.  Are
these tests which notice the problem easy to run?

Boris (CC'd) is the maintainer of that code.  It has undergone quite a
few changes recently.

~Andrew



Re: [Xen-devel] [PATCH v3 2/2] x86/Intel: virtualize support for cpuid faulting

2016-10-20 Thread Kyle Huey
On Thu, Oct 20, 2016 at 12:56 AM, Andrew Cooper
 wrote:
> On 20/10/2016 06:10, Kyle Huey wrote:
>> On Mon, Oct 17, 2016 at 5:32 AM, Wei Liu  wrote:
>>> On Fri, Oct 14, 2016 at 12:47:36PM -0700, Kyle Huey wrote:
>>>> On HVM guests, the cpuid triggers a vm exit, so we can check the emulated
>>>> faulting state in vmx_do_cpuid and inject a GP(0) if CPL > 0. Notably no
>>>> hardware support for faulting on cpuid is necessary to emulate support with an
>>>> HVM guest.
>>>>
>>>> On PV guests, hardware support is required so that userspace cpuid will trap
>>>> to xen. Xen already enables cpuid faulting on supported CPUs for pv guests (that
>>>> aren't the control domain, see the comment in intel_ctxt_switch_levelling).
>>>> Every PV guest cpuid will trap via a GP(0) to emulate_privileged_op (via
>>>> do_general_protection). Once there we simply decline to emulate cpuid if the
>>>> CPL > 0 and faulting is enabled, leaving the GP(0) for the guest kernel to
>>>> handle.
>>>>
>>>> Signed-off-by: Kyle Huey
>>> Andrew expressed the desire of taking this patch into 4.8. After reading
>>> the description and code in detail, I think this patch falls into the
>>> "nice-to-have" category.
>>>
>>> The main risk here is this patch doesn't have architecturally correct
>>> behaviour. I would like to see an ack or review from VT maintainers to
>>> make this patch eligible for acceptance.
>>>
>>> Another thing to consider is timing. We plan to cut RC3 before Friday
>>> this week, so if this patch can be acked and becomes part of RC3 I'm
>>> fine with applying it. If not, we shall revisit the situation when it is
>>> acked.
>> Kevin Tian reviewed the patch yesterday, so I think we're just waiting
>> for a final review from Andrew here.
>
> Ah - I am just waiting for your final respin with the comments so far
> addressed.

Oh.  I thought I already did that ... though apparently I didn't.  I
must have forgotten to remove --dry-run or something.  Anyways, it's
sent now.

>> That said, rr currently does not work in Xen guests due to some PMU
>> issues that we haven't tracked down yet.
>
> Is this RR trying to use vPMU and it not functioning, or not
> specifically trying to use PMU facilities and getting stuck anyway?

The latter.  rr relies on the values returned by the PMU (the retired
conditional branches counter in particular) being exactly the same
during the recording and replay phases.  This is true when running on
bare metal, and when running inside a KVM guest, but when running in a
Xen HVM guest we see values that are off by a branch or two on a small
fraction of our tests.  Since it works in KVM I suspect this is some
sort of issue with how Xen multiplexes the real PMU and events are
"leaking" between guests (or perhaps from Xen itself, though I don't
think the Xen kernel executes any ring 3 code).  Even if that's
correct we're a long way from tracking it down and patching it though.

>>   So for us it's not a big
>> deal if this feature does not make it into 4.8.  I won't be
>> disappointed if you cut it from 4.8 to reduce technical risk.
>
> From my point of view, it's a small feature with working code and a
> comprehensive test case ready to go straight into regression testing.
> This makes it the least risky feature going.

- Kyle



Re: [Xen-devel] [PATCH v3 2/2] x86/Intel: virtualize support for cpuid faulting

2016-10-20 Thread Andrew Cooper
On 20/10/2016 06:10, Kyle Huey wrote:
> On Mon, Oct 17, 2016 at 5:32 AM, Wei Liu  wrote:
>> On Fri, Oct 14, 2016 at 12:47:36PM -0700, Kyle Huey wrote:
>>> On HVM guests, the cpuid triggers a vm exit, so we can check the emulated
>>> faulting state in vmx_do_cpuid and inject a GP(0) if CPL > 0. Notably no
>>> hardware support for faulting on cpuid is necessary to emulate support with an
>>> HVM guest.
>>>
>>> On PV guests, hardware support is required so that userspace cpuid will trap
>>> to xen. Xen already enables cpuid faulting on supported CPUs for pv guests (that
>>> aren't the control domain, see the comment in intel_ctxt_switch_levelling).
>>> Every PV guest cpuid will trap via a GP(0) to emulate_privileged_op (via
>>> do_general_protection). Once there we simply decline to emulate cpuid if the
>>> CPL > 0 and faulting is enabled, leaving the GP(0) for the guest kernel to
>>> handle.
>>>
>>> Signed-off-by: Kyle Huey 
>> Andrew expressed the desire of taking this patch into 4.8. After reading
>> the description and code in detail, I think this patch falls into the
>> "nice-to-have" category.
>>
>> The main risk here is this patch doesn't have architecturally correct
>> behaviour. I would like to see an ack or review from VT maintainers to
>> make this patch eligible for acceptance.
>>
>> Another thing to consider is timing. We plan to cut RC3 before Friday
>> this week, so if this patch can be acked and becomes part of RC3 I'm
>> fine with applying it. If not, we shall revisit the situation when it is
>> acked.
> Kevin Tian reviewed the patch yesterday, so I think we're just waiting
> for a final review from Andrew here.

Ah - I am just waiting for your final respin with the comments so far
addressed.

>
> That said, rr currently does not work in Xen guests due to some PMU
> issues that we haven't tracked down yet.

Is this RR trying to use vPMU and it not functioning, or not
specifically trying to use PMU facilities and getting stuck anyway?

>   So for us it's not a big
> deal if this feature does not make it into 4.8.  I won't be
> disappointed if you cut it from 4.8 to reduce technical risk.

From my point of view, it's a small feature with working code and a
comprehensive test case ready to go straight into regression testing. 
This makes it the least risky feature going.

~Andrew



Re: [Xen-devel] [PATCH v3 2/2] x86/Intel: virtualize support for cpuid faulting

2016-10-19 Thread Kyle Huey
On Mon, Oct 17, 2016 at 5:32 AM, Wei Liu  wrote:
> On Fri, Oct 14, 2016 at 12:47:36PM -0700, Kyle Huey wrote:
>> On HVM guests, the cpuid triggers a vm exit, so we can check the emulated
>> faulting state in vmx_do_cpuid and inject a GP(0) if CPL > 0. Notably no
>> hardware support for faulting on cpuid is necessary to emulate support with an
>> HVM guest.
>>
>> On PV guests, hardware support is required so that userspace cpuid will trap
>> to xen. Xen already enables cpuid faulting on supported CPUs for pv guests (that
>> aren't the control domain, see the comment in intel_ctxt_switch_levelling).
>> Every PV guest cpuid will trap via a GP(0) to emulate_privileged_op (via
>> do_general_protection). Once there we simply decline to emulate cpuid if the
>> CPL > 0 and faulting is enabled, leaving the GP(0) for the guest kernel to
>> handle.
>>
>> Signed-off-by: Kyle Huey 
>
> Andrew expressed the desire of taking this patch into 4.8. After reading
> the description and code in detail, I think this patch falls into the
> "nice-to-have" category.
>
> The main risk here is this patch doesn't have architecturally correct
> behaviour. I would like to see an ack or review from VT maintainers to
> make this patch eligible for acceptance.
>
> Another thing to consider is timing. We plan to cut RC3 before Friday
> this week, so if this patch can be acked and becomes part of RC3 I'm
> fine with applying it. If not, we shall revisit the situation when it is
> acked.

Kevin Tian reviewed the patch yesterday, so I think we're just waiting
for a final review from Andrew here.

That said, rr currently does not work in Xen guests due to some PMU
issues that we haven't tracked down yet.  So for us it's not a big
deal if this feature does not make it into 4.8.  I won't be
disappointed if you cut it from 4.8 to reduce technical risk.

- Kyle



Re: [Xen-devel] [PATCH v3 2/2] x86/Intel: virtualize support for cpuid faulting

2016-10-17 Thread Andrew Cooper
On 14/10/16 20:47, Kyle Huey wrote:
> On HVM guests, the cpuid triggers a vm exit, so we can check the emulated
> faulting state in vmx_do_cpuid and inject a GP(0) if CPL > 0. Notably no
> hardware support for faulting on cpuid is necessary to emulate support with an
> HVM guest.
>
> On PV guests, hardware support is required so that userspace cpuid will trap
> to xen. Xen already enables cpuid faulting on supported CPUs for pv guests (that

to Xen.

> aren't the control domain, see the comment in intel_ctxt_switch_levelling).
> Every PV guest cpuid will trap via a GP(0) to emulate_privileged_op (via
> do_general_protection). Once there we simply decline to emulate cpuid if the
> CPL > 0 and faulting is enabled, leaving the GP(0) for the guest kernel to
> handle.
>
> Signed-off-by: Kyle Huey 
> ---
>  xen/arch/x86/hvm/vmx/vmx.c   | 24 ++--
>  xen/arch/x86/traps.c | 34 ++
>  xen/include/asm-x86/domain.h |  3 +++
>  3 files changed, 59 insertions(+), 2 deletions(-)
>
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index b9102ce..c038393 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -2427,16 +2427,25 @@ static void vmx_cpuid_intercept(
>  
>  HVMTRACE_5D (CPUID, input, *eax, *ebx, *ecx, *edx);
>  }
>  
>  static int vmx_do_cpuid(struct cpu_user_regs *regs)
>  {
>      unsigned int eax, ebx, ecx, edx;
>      unsigned int leaf, subleaf;
> +    struct segment_register sreg;
> +    struct vcpu *v = current;
> +
> +    hvm_get_segment_register(v, x86_seg_ss, &sreg);
> +    if ( v->arch.cpuid_fault && sreg.attr.fields.dpl > 0 )
> +    {
> +        hvm_inject_hw_exception(TRAP_gp_fault, 0);
> +        return 1; /* Don't advance the guest IP! */
> +    }

Thinking about it, the segment register query can be skipped in the
likely case that faulting isn't enabled.  Could this be re-arranged to:

if ( v->arch.cpuid_fault )
{
    struct segment_register sreg;

    hvm_get_segment_register(v, x86_seg_ss, &sreg);
    if ( sreg.attr.fields.dpl > 0 )
    {
        hvm_inject_hw_exception(TRAP_gp_fault, 0);
        return 1; /* Don't advance the guest IP! */
    }
}
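
For anyone wanting to reason about the rearranged check outside of Xen, it can be modelled as a small standalone predicate. This is only a sketch: `struct vcpu_model` and `cpuid_should_fault` are hypothetical stand-ins rather than Xen types or functions, and the model takes the DPL of SS as the current privilege level, as the snippet above does.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the vCPU state consulted by the check. */
struct vcpu_model {
    bool cpuid_fault;    /* guest has enabled CPUID faulting */
    unsigned int ss_dpl; /* DPL of SS, used as the current CPL */
};

/*
 * Returns true when a guest CPUID should be answered with #GP(0)
 * instead of being emulated. The privilege level is only examined
 * once faulting is known to be enabled, mirroring the early-out above.
 */
static bool cpuid_should_fault(const struct vcpu_model *v)
{
    if (!v->cpuid_fault)
        return false;

    return v->ss_dpl > 0;
}
```

The outcome is the same as the single combined condition in the patch; the only difference is that the common case (faulting disabled) never looks at the segment state.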

With these two minor issues taken care of, Reviewed-by: Andrew Cooper




Re: [Xen-devel] [PATCH v3 2/2] x86/Intel: virtualize support for cpuid faulting

2016-10-17 Thread Wei Liu
On Fri, Oct 14, 2016 at 12:47:36PM -0700, Kyle Huey wrote:
> On HVM guests, the cpuid triggers a vm exit, so we can check the emulated
> faulting state in vmx_do_cpuid and inject a GP(0) if CPL > 0. Notably no
> hardware support for faulting on cpuid is necessary to emulate support with an
> HVM guest.
> 
> On PV guests, hardware support is required so that userspace cpuid will trap
> to xen. Xen already enables cpuid faulting on supported CPUs for pv guests (that
> aren't the control domain, see the comment in intel_ctxt_switch_levelling).
> Every PV guest cpuid will trap via a GP(0) to emulate_privileged_op (via
> do_general_protection). Once there we simply decline to emulate cpuid if the
> CPL > 0 and faulting is enabled, leaving the GP(0) for the guest kernel to
> handle.
> 
> Signed-off-by: Kyle Huey 

Andrew expressed the desire of taking this patch into 4.8. After reading
the description and code in detail, I think this patch falls into the
"nice-to-have" category.

The main risk here is this patch doesn't have architecturally correct
behaviour. I would like to see an ack or review from VT maintainers to
make this patch eligible for acceptance.

Another thing to consider is timing. We plan to cut RC3 before Friday
this week, so if this patch can be acked and becomes part of RC3 I'm
fine with applying it. If not, we shall revisit the situation when it is
acked.

Wei.
