Re: [Xen-devel] Problem with Xen 4.5 failing XTF tests on old AMD cpus ?

2016-09-26 Thread Ian Jackson
Ian Jackson writes ("Re: Problem with Xen 4.5 failing XTF tests on old AMD cpus 
?"):
> Andrew Cooper writes ("Re: Problem with Xen 4.5 failing XTF tests on old AMD 
> cpus ?"):
> > It will be because of Gen1 SVM which doesn't have NRIP support. This 
> > case requires emulation of the invlpg instruction, rather than just 
> > using the information provided by the intercept.
> 
> So it seems that the xtf test is not effective at detecting the Xen
> bug except on old hardware ?  Is there some way it could be improved ?
> 
> It's obviously not desirable that we should have tests which pass in
> the production colo and fail in the ancient Citrix Cambridge instance.

Andrew and I discussed this IRL.  I thought it worth writing down what
was said so that we can refer to it later.

This test failure is due to genuine bug(s) in Xen 4.5, in that it
doesn't have various fixes (see the rest of the thread).

The bugs are only exposed on old hardware, which uses different
codepaths in Xen.  On new hardware Xen takes a different approach.
This is why the test failure appears in the Citrix (Cambridge)
osstest but not in the Xen Project (Massachusetts) instance.

Xen decides which approach to take based on hardware features.  There
is not currently any way to tell Xen not to use these hardware
features (at least, not in this case - the AMD SVM NextRIP feature) if
they are available.  Andrew has a long-term plan to add more of such a
facility - but that is not going to be available any time soon.
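
For reference, a rough sketch of how one might check whether a given host has the NextRIP feature at all. This is not Xen code. It assumes the feature is reported as bit 3 of EDX in CPUID leaf 0x8000000A (my reading of the AMD manual) and that Linux lists it as `nrip_save` in /proc/cpuinfo; the function names are made up for illustration, and the flag may not be visible inside a guest or dom0.

```python
# Illustrative only: this is not Xen code.
SVM_FEATURE_LEAF = 0x8000000A  # CPUID leaf reporting SVM sub-features (assumed)
NRIPS_BIT = 3                  # EDX bit for NextRIP save (assumed)


def edx_has_nrips(edx: int) -> bool:
    """Return True if an EDX value read from the SVM feature leaf has NRIPS set."""
    return bool((edx >> NRIPS_BIT) & 1)


def cpuinfo_has_nrips(cpuinfo_text: str) -> bool:
    """Return True if any 'flags' line of /proc/cpuinfo text lists nrip_save."""
    for line in cpuinfo_text.splitlines():
        name, _, value = line.partition(":")
        if name.strip() == "flags" and "nrip_save" in value.split():
            return True
    return False
```

On a Linux host this could be called as `cpuinfo_has_nrips(open("/proc/cpuinfo").read())`.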

In this particular case, the old hardware uses the Xen instruction
emulator where newer hardware uses hardware support.  (Andrew tells me
that without NextRIP support, Xen must use the instruction emulator
when handling `invlpg` instructions on behalf of the guest, to
calculate how many bytes to move the instruction pointer forward by.
And it is the emulator which has the bug here.)
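
To make the difference concrete, here is a minimal sketch of the two ways of moving the instruction pointer past an intercepted `invlpg`. It is not Xen's emulator: it only handles the `0F 01 /7` encoding with 32-bit addressing, and `invlpg_length` and `advance_rip` are hypothetical names. With NextRIP the hardware supplies the answer; without it, the instruction bytes have to be decoded.

```python
# Legacy prefixes that may precede the opcode (segment overrides, operand
# size, lock/rep). The 0x67 address-size prefix is rejected below so the
# sketch only has to deal with 32-bit addressing.
LEGACY_PREFIXES = {0x26, 0x2E, 0x36, 0x3E, 0x64, 0x65, 0x66, 0xF0, 0xF2, 0xF3}


def invlpg_length(code: bytes) -> int:
    """Return the byte length of an invlpg instruction at the start of code."""
    i = 0
    while i < len(code) and code[i] in LEGACY_PREFIXES:
        i += 1
    if i < len(code) and code[i] == 0x67:
        raise ValueError("16-bit addressing is not handled in this sketch")
    if code[i:i + 2] != b"\x0f\x01":
        raise ValueError("not an 0F 01 group instruction")
    i += 2
    if i >= len(code):
        raise ValueError("truncated before ModRM")
    modrm = code[i]
    i += 1
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if reg != 7 or mod == 3:
        raise ValueError("not invlpg with a memory operand")
    if rm == 4:
        # A SIB byte follows; base == 5 with mod == 0 means disp32, no base.
        if i >= len(code):
            raise ValueError("truncated before SIB")
        sib = code[i]
        i += 1
        if mod == 0 and (sib & 7) == 5:
            i += 4
    elif mod == 0 and rm == 5:
        i += 4  # absolute disp32
    if mod == 1:
        i += 1  # disp8
    elif mod == 2:
        i += 4  # disp32
    if i > len(code):
        raise ValueError("truncated displacement")
    return i


def advance_rip(rip, next_rip, code):
    """Use the hardware-provided next_rip if present, else decode the bytes."""
    if next_rip is not None:
        return next_rip
    return rip + invlpg_length(code)
```

For example, `advance_rip(0x1000, None, b"\x0f\x01\x38")` decodes `invlpg (%eax)` and returns 0x1003, whereas passing a non-None `next_rip` skips the decode entirely. Wrap-around of the instruction pointer is ignored here.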

So FEP (the Force Emulation Prefix) could be used to cause the bug to
manifest even on new hardware; indeed, where FEP is available, XTF
does use it to run exactly the same set of tests.  However, FEP is not
available in Xen 4.5 and there are good reasons for not backporting it
there.
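
As a rough illustration of what "using FEP" means at the byte level: my understanding (an assumption, not something stated in this thread) is that the prefix is a `ud2a` instruction followed by the ASCII bytes "xen", and that a Xen with FEP enabled sends the instruction following the prefix through the emulator. The helper names below are hypothetical.

```python
# Assumption: the prefix is `ud2a` (0F 0B) followed by the ASCII bytes "xen".
XEN_FEP = b"\x0f\x0b" + b"xen"


def with_fep(insn: bytes) -> bytes:
    """Return insn with the force-emulation prefix prepended."""
    return XEN_FEP + insn


def strip_fep(code: bytes):
    """Return (had_prefix, remaining_bytes) for a byte sequence."""
    if code.startswith(XEN_FEP):
        return True, code[len(XEN_FEP):]
    return False, code
```

So the same `invlpg` test body can be emitted twice, once bare and once via `with_fep(...)`, which matches the description of XTF running the same set of tests both ways.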

It would be possible to backport the bugfixes to Xen 4.5.  However,
the fixes address only very rare problems.  Andrew thinks the bugs
are, insofar as they are bugs which might cause lossage, more likely to
be roughly "crashes obscure or very oddly-behaved guests" than
"crashes commonly used guests but only with very low probability".
The latter kind of bug would be worth a backport; the former much
less so (especially in a very old stable release, and especially
when the fixes involve behavioural changes).

The fixes would also provide an unquantified performance improvement
on AMD hardware, due to avoiding extraneous TLB flushes, but Andrew
says he doubts that's worth caring about.

We discussed host stickiness, host-specific bug detection, and
regression detection, in osstest.  I reassured Andrew that I think
the current osstest algorithms will deal with this situation
tolerably well (if not perfectly).

The conclusion is that there is nothing to be done, at least in the
short term.  There are good reasons for the bug to persist in 4.5 and
good reasons for it being hard to detect on newer hardware.

Ian.

(Thanks to Andrew for the IRL explanation and for review of this
email.)

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Problem with Xen 4.5 failing XTF tests on old AMD cpus ?

2016-09-22 Thread Ian Jackson
Andrew Cooper writes ("Re: Problem with Xen 4.5 failing XTF tests on old AMD 
cpus ?"):
> This is the 4.5 tree missing some fixes:
> 
> * 31d961f - x86/hvm: Fix invalidation for emulated invlpg instructions 
> (4 months ago) 
> * eee511d - x86/svm: Don't unconditionally use a new ASID in 
> svm_invlpg_intercept() (4 months ago) 
> * a373db2 - x86/hvm: Correct the emulated interaction of invlpg with 
> segments (4 months ago) 
> * a94b35d - x86/hvm: Raise #SS faults for %ss-based segmentation 
> violations (4 months ago) 
> * 6093515 - x86/hvm: Always return the linear address from 
> hvm_virtual_to_linear_addr() (4 months ago) 
> 
> In this case, XTF is complaining that `invlpg`, as used in a shadow 
> guest, is not behaving in the architecturally specified way.

I think it is probably not sensible to backport these kinds of
changes (and especially not to 4.5).

Ian.



Re: [Xen-devel] Problem with Xen 4.5 failing XTF tests on old AMD cpus ?

2016-09-22 Thread Ian Jackson
Andrew Cooper writes ("Re: Problem with Xen 4.5 failing XTF tests on old AMD 
cpus ?"):
> It will be because of Gen1 SVM which doesn't have NRIP support. This 
> case requires emulation of the invlpg instruction, rather than just 
> using the information provided by the intercept.

So it seems that the xtf test is not effective at detecting the Xen
bug except on old hardware ?  Is there some way it could be improved ?

It's obviously not desirable that we should have tests which pass in
the production colo and fail in the ancient Citrix Cambridge instance.

Ian.



Re: [Xen-devel] Problem with Xen 4.5 failing XTF tests on old AMD cpus ?

2016-09-21 Thread Andrew Cooper

On 21/09/16 19:06, Wei Liu wrote:
> On Wed, Sep 21, 2016 at 02:00:50PM -0400, Boris Ostrovsky wrote:
>> On 09/21/2016 01:36 PM, Wei Liu wrote:
>>> On Wed, Sep 21, 2016 at 12:42:51PM -0400, Boris Ostrovsky wrote:
>>>> On 09/21/2016 12:13 PM, Ian Jackson wrote:
>>>>> Platform Team regression test user writes ("[xen-4.5-testing baseline-only
>>>>> test] 67737: regressions - FAIL"):
>>>>>> test-xtf-amd64-amd64-1 19 xtf/test-hvm32-invlpg~shadow fail REGR. vs. 67706
>>>>>> test-xtf-amd64-amd64-1 26 xtf/test-hvm32pae-invlpg~shadow fail REGR. vs. 67706
>>>>> Several of these, 32bit and 64bit HVM.  This is in the Citrix
>>>>> Cambridge osstest instance.  The Xen Project colo instance is
>>>>> unaffected (flight 101045 there passed with the same revisions of
>>>>> everything)
>>>>>
>>>>> This is with:
>>>>>
>>>>>   xen e4ae4b03d35babc9624b7286f1ea4c6749bad84b
>>>>>   xtf b5c5332de4268d33a6f8eadc1d17c7b9cf0e7dc3
>>>>>   linux b65f2f457c49b2cfd7967c34b7a0b04c25587f13
>>>>>   linux-firmware c530a75c1e6a472b0eb9558310b518f0dfcd8860
>>>> I can't get these commits neither for Xen nor for Linux. Are these from
>>>> Citrix trees?
>>> No. They are all upstream commits.
>> Apparently xen commit *just* made it to the tree, after I checked it.
> Yes, it passed in Mass COLO. We suspected it can be related to
> generations of AMD cpus (hence the "old" in email title).


It will be because of Gen1 SVM which doesn't have NRIP support. This 
case requires emulation of the invlpg instruction, rather than just 
using the information provided by the intercept.


~Andrew



Re: [Xen-devel] Problem with Xen 4.5 failing XTF tests on old AMD cpus ?

2016-09-21 Thread Wei Liu
On Wed, Sep 21, 2016 at 02:00:50PM -0400, Boris Ostrovsky wrote:
> On 09/21/2016 01:36 PM, Wei Liu wrote:
> > On Wed, Sep 21, 2016 at 12:42:51PM -0400, Boris Ostrovsky wrote:
> >> On 09/21/2016 12:13 PM, Ian Jackson wrote:
> >>> Platform Team regression test user writes ("[xen-4.5-testing 
> >>> baseline-only test] 67737: regressions - FAIL"):
> >>>> test-xtf-amd64-amd64-1 19 xtf/test-hvm32-invlpg~shadow fail REGR. vs. 67706
> >>>> test-xtf-amd64-amd64-1 26 xtf/test-hvm32pae-invlpg~shadow fail REGR. vs. 67706
> >>> Several of these, 32bit and 64bit HVM.  This is in the Citrix
> >>> Cambridge osstest instance.  The Xen Project colo instance is
> >>> unaffected (flight 101045 there passed with the same revisions of
> >>> everything)
> >>>
> >>> This is with:
> >>>
> >>>   xen e4ae4b03d35babc9624b7286f1ea4c6749bad84b
> >>>   xtf b5c5332de4268d33a6f8eadc1d17c7b9cf0e7dc3
> >>>   linux b65f2f457c49b2cfd7967c34b7a0b04c25587f13
> >>>   linux-firmware c530a75c1e6a472b0eb9558310b518f0dfcd8860
> >> I can't get these commits neither for Xen nor for Linux. Are these from
> >> Citrix trees?
> > No. They are all upstream commits.
> 
> Apparently xen commit *just* made it to the tree, after I checked it.

Yes, it passed in Mass COLO. We suspected it can be related to
generations of AMD cpus (hence the "old" in email title).

> And I still don't see Linux one.
> 

It should be upstream commit, too. But I don't think it matters that
much anyway.

> I ran a quick test (not xtf, our internal one) with 32-bit shadow guest
> and didn't see anything. But then Andrew seems to have pointed out what

That's probably because your guest is a well-behaved guest.

> the problem is.
> 
> -boris
> 



Re: [Xen-devel] Problem with Xen 4.5 failing XTF tests on old AMD cpus ?

2016-09-21 Thread Boris Ostrovsky
On 09/21/2016 01:36 PM, Wei Liu wrote:
> On Wed, Sep 21, 2016 at 12:42:51PM -0400, Boris Ostrovsky wrote:
>> On 09/21/2016 12:13 PM, Ian Jackson wrote:
>>> Platform Team regression test user writes ("[xen-4.5-testing baseline-only 
>>> test] 67737: regressions - FAIL"):
>>>> test-xtf-amd64-amd64-1 19 xtf/test-hvm32-invlpg~shadow fail REGR. vs. 67706
>>>> test-xtf-amd64-amd64-1 26 xtf/test-hvm32pae-invlpg~shadow fail REGR. vs. 67706
>>> Several of these, 32bit and 64bit HVM.  This is in the Citrix
>>> Cambridge osstest instance.  The Xen Project colo instance is
>>> unaffected (flight 101045 there passed with the same revisions of
>>> everything)
>>>
>>> This is with:
>>>
>>>   xen e4ae4b03d35babc9624b7286f1ea4c6749bad84b
>>>   xtf b5c5332de4268d33a6f8eadc1d17c7b9cf0e7dc3
>>>   linux b65f2f457c49b2cfd7967c34b7a0b04c25587f13
>>>   linux-firmware c530a75c1e6a472b0eb9558310b518f0dfcd8860
>> I can't get these commits neither for Xen nor for Linux. Are these from
>> Citrix trees?
> No. They are all upstream commits.

Apparently xen commit *just* made it to the tree, after I checked it.
And I still don't see Linux one.

I ran a quick test (not xtf, our internal one) with 32-bit shadow guest
and didn't see anything. But then Andrew seems to have pointed out what
the problem is.

-boris




Re: [Xen-devel] Problem with Xen 4.5 failing XTF tests on old AMD cpus ?

2016-09-21 Thread Wei Liu
On Wed, Sep 21, 2016 at 12:42:51PM -0400, Boris Ostrovsky wrote:
> On 09/21/2016 12:13 PM, Ian Jackson wrote:
> > Platform Team regression test user writes ("[xen-4.5-testing baseline-only 
> > test] 67737: regressions - FAIL"):
> >> test-xtf-amd64-amd64-1 19 xtf/test-hvm32-invlpg~shadow fail REGR. vs. 67706
> >> test-xtf-amd64-amd64-1 26 xtf/test-hvm32pae-invlpg~shadow fail REGR. vs. 
> >> 67706
> > Several of these, 32bit and 64bit HVM.  This is in the Citrix
> > Cambridge osstest instance.  The Xen Project colo instance is
> > unaffected (flight 101045 there passed with the same revisions of
> > everything)
> >
> > This is with:
> >
> >   xen e4ae4b03d35babc9624b7286f1ea4c6749bad84b
> >   xtf b5c5332de4268d33a6f8eadc1d17c7b9cf0e7dc3
> >   linux b65f2f457c49b2cfd7967c34b7a0b04c25587f13
> >   linux-firmware c530a75c1e6a472b0eb9558310b518f0dfcd8860
> 
> I can't get these commits neither for Xen nor for Linux. Are these from
> Citrix trees?

No. They are all upstream commits.

> 
> >
> > The log says this:
> >
> >   2016-09-21 06:07:36 Z -- substep 19 xtf/test-hvm32-invlpg~shadow 
> > running -- 
> >   2016-09-21 06:07:36 Z executing ssh ... root@10.80.228.78 
> > /home/xtf/xtf-runner -m logfile test-hvm32-invlpg~shadow 1>&2; echo $?
> >
> >   Using logfile '/var/log/xen/console/guest-test-hvm32-invlpg~shadow.log'
> >   Executing 'xl create -F tests/invlpg/test-hvm32-invlpg~shadow.cfg'
> >   --- Xen Test Framework ---
> >   Environment: HVM 32bit (No paging)
> >   Testing 'invlpg' in normally-faulting conditions
> > Test: Mapped address
> > Test: Unmapped address
> > Test: NULL segment override
> > Test: Past segment limit
> >   Fail: Unexpected #GP[]
> > Test: Before expand-down segment limit
> >   Fail: Unexpected #GP[]
> >   Test result: FAILURE
> >
> >   Combined test results:
> >   test-hvm32-invlpg~shadow FAILURE
> >
> > Sadly we haven't yet managed to make the logs from this instance
> > public.
> 
> I looked at the logs you posted but I can't find guest config file. Are
> they part of the logs?

The guest config file is not stored because it is part of xtf. So ...

> 
> >
> > Do you have any idea what might be causing this ?  Is there a real
> > problem with the Xen 4.5 branch ?  The Citrix Cambridge instance has
> > old hardware.
> 
> We run 4.5 every week or so here but we don't run without HAP (which,
> based on the name, I assume is what this guest is).
> 
> I'll give it a try.
> 

... the best way to try is to build and run xtf.

After building:

./xtf-runner --host --functional
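
If you want to compare runs across hosts, the summary at the end of the runner output can be collected with a few lines of script. This is only a sketch: it assumes the output ends with a "Combined test results:" section formatted as in the log quoted above, and `parse_combined_results` is a name I made up.

```python
def parse_combined_results(output: str) -> dict:
    """Map test name to result from the 'Combined test results:' section."""
    results = {}
    in_summary = False
    for line in output.splitlines():
        text = line.strip()
        if text == "Combined test results:":
            in_summary = True
            continue
        if in_summary and text:
            parts = text.split()
            # Each summary line is expected to be "<test name> <RESULT>".
            if len(parts) == 2:
                results[parts[0]] = parts[1]
    return results
```

Feeding it the output captured on the old and the new hardware and diffing the two dictionaries shows which tests are host-dependent.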

> -boris



Re: [Xen-devel] Problem with Xen 4.5 failing XTF tests on old AMD cpus ?

2016-09-21 Thread Andrew Cooper

On 21/09/16 17:13, Ian Jackson wrote:
> Platform Team regression test user writes ("[xen-4.5-testing baseline-only
> test] 67737: regressions - FAIL"):
>> test-xtf-amd64-amd64-1 19 xtf/test-hvm32-invlpg~shadow fail REGR. vs. 67706
>> test-xtf-amd64-amd64-1 26 xtf/test-hvm32pae-invlpg~shadow fail REGR. vs. 67706
> Several of these, 32bit and 64bit HVM.  This is in the Citrix
> Cambridge osstest instance.  The Xen Project colo instance is
> unaffected (flight 101045 there passed with the same revisions of
> everything)
>
> This is with:
>
>   xen e4ae4b03d35babc9624b7286f1ea4c6749bad84b
>   xtf b5c5332de4268d33a6f8eadc1d17c7b9cf0e7dc3
>   linux b65f2f457c49b2cfd7967c34b7a0b04c25587f13
>   linux-firmware c530a75c1e6a472b0eb9558310b518f0dfcd8860
>
> The log says this:
>
>   2016-09-21 06:07:36 Z -- substep 19 xtf/test-hvm32-invlpg~shadow
> running --
>   2016-09-21 06:07:36 Z executing ssh ... root@10.80.228.78
> /home/xtf/xtf-runner -m logfile test-hvm32-invlpg~shadow 1>&2; echo $?
>
>   Using logfile '/var/log/xen/console/guest-test-hvm32-invlpg~shadow.log'
>   Executing 'xl create -F tests/invlpg/test-hvm32-invlpg~shadow.cfg'
>   --- Xen Test Framework ---
>   Environment: HVM 32bit (No paging)
>   Testing 'invlpg' in normally-faulting conditions
> Test: Mapped address
> Test: Unmapped address
> Test: NULL segment override
> Test: Past segment limit
>   Fail: Unexpected #GP[]
> Test: Before expand-down segment limit
>   Fail: Unexpected #GP[]
>   Test result: FAILURE
>
>   Combined test results:
>   test-hvm32-invlpg~shadow FAILURE
>
> Sadly we haven't yet managed to make the logs from this instance
> public.
>
> Do you have any idea what might be causing this ?  Is there a real
> problem with the Xen 4.5 branch ?  The Citrix Cambridge instance has
> old hardware.


This is the 4.5 tree missing some fixes:

* 31d961f - x86/hvm: Fix invalidation for emulated invlpg instructions 
(4 months ago) 
* eee511d - x86/svm: Don't unconditionally use a new ASID in 
svm_invlpg_intercept() (4 months ago) 
* a373db2 - x86/hvm: Correct the emulated interaction of invlpg with 
segments (4 months ago) 
* a94b35d - x86/hvm: Raise #SS faults for %ss-based segmentation 
violations (4 months ago) 
* 6093515 - x86/hvm: Always return the linear address from 
hvm_virtual_to_linear_addr() (4 months ago) 


In this case, XTF is complaining that `invlpg`, as used in a shadow 
guest, is not behaving in the architecturally specified way.
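
A small model of what the failing subtests check may help. It is a sketch, not the emulator's code: `access_within_limit` and `expected_fault` are hypothetical names, and the limit rules are the usual ones for expand-up and expand-down data segments. The point taken from the log and the fix titles above is that an ordinary access outside the limit faults (#SS for %ss, #GP otherwise), while `invlpg` with the same operand is expected not to fault.

```python
def access_within_limit(offset, size, limit, expand_down=False, big=True):
    """Return True if [offset, offset + size) lies inside the segment limit."""
    last = offset + size - 1
    if expand_down:
        # Valid offsets are those strictly above the limit, up to the segment's
        # upper bound (4GiB - 1 for a 'big' segment, else 64KiB - 1).
        upper = 0xFFFFFFFF if big else 0xFFFF
        return offset > limit and last <= upper
    return last <= limit


def expected_fault(is_invlpg, seg, offset, size, limit, expand_down=False):
    """Return the fault a guest should see, or None if there should be none."""
    if is_invlpg:
        # The tests expect no segmentation fault from invlpg itself.
        return None
    if access_within_limit(offset, size, limit, expand_down):
        return None
    return "#SS" if seg == "ss" else "#GP"
```

With a limit of 0xfff, `expected_fault(False, "ds", 0x2000, 1, 0xfff)` gives "#GP" for a normal access, but the `is_invlpg=True` variant gives None, which is the case where the 4.5 emulator reports "Unexpected #GP".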


~Andrew



Re: [Xen-devel] Problem with Xen 4.5 failing XTF tests on old AMD cpus ?

2016-09-21 Thread Boris Ostrovsky
On 09/21/2016 12:13 PM, Ian Jackson wrote:
> Platform Team regression test user writes ("[xen-4.5-testing baseline-only 
> test] 67737: regressions - FAIL"):
>> test-xtf-amd64-amd64-1 19 xtf/test-hvm32-invlpg~shadow fail REGR. vs. 67706
>> test-xtf-amd64-amd64-1 26 xtf/test-hvm32pae-invlpg~shadow fail REGR. vs. 
>> 67706
> Several of these, 32bit and 64bit HVM.  This is in the Citrix
> Cambridge osstest instance.  The Xen Project colo instance is
> unaffected (flight 101045 there passed with the same revisions of
> everything)
>
> This is with:
>
>   xen e4ae4b03d35babc9624b7286f1ea4c6749bad84b
>   xtf b5c5332de4268d33a6f8eadc1d17c7b9cf0e7dc3
>   linux b65f2f457c49b2cfd7967c34b7a0b04c25587f13
>   linux-firmware c530a75c1e6a472b0eb9558310b518f0dfcd8860

I can't get these commits neither for Xen nor for Linux. Are these from
Citrix trees?

>
> The log says this:
>
>   2016-09-21 06:07:36 Z -- substep 19 xtf/test-hvm32-invlpg~shadow 
> running -- 
>   2016-09-21 06:07:36 Z executing ssh ... root@10.80.228.78 
> /home/xtf/xtf-runner -m logfile test-hvm32-invlpg~shadow 1>&2; echo $?
>
>   Using logfile '/var/log/xen/console/guest-test-hvm32-invlpg~shadow.log'
>   Executing 'xl create -F tests/invlpg/test-hvm32-invlpg~shadow.cfg'
>   --- Xen Test Framework ---
>   Environment: HVM 32bit (No paging)
>   Testing 'invlpg' in normally-faulting conditions
> Test: Mapped address
> Test: Unmapped address
> Test: NULL segment override
> Test: Past segment limit
>   Fail: Unexpected #GP[]
> Test: Before expand-down segment limit
>   Fail: Unexpected #GP[]
>   Test result: FAILURE
>
>   Combined test results:
>   test-hvm32-invlpg~shadow FAILURE
>
> Sadly we haven't yet managed to make the logs from this instance
> public.

I looked at the logs you posted but I can't find guest config file. Are
they part of the logs?

>
> Do you have any idea what might be causing this ?  Is there a real
> problem with the Xen 4.5 branch ?  The Citrix Cambridge instance has
> old hardware.

We run 4.5 every week or so here but we don't run without HAP (which,
based on the name, I assume is what this guest is).

I'll give it a try.

-boris



Re: [Xen-devel] Problem with Xen 4.5 failing XTF tests on old AMD cpus ?

2016-09-21 Thread Ian Jackson
Ian Jackson writes ("Problem with Xen 4.5 failing XTF tests on old AMD cpus ?"):
> Sadly we haven't yet managed to make the logs from this instance
> public.

I have copied the logs from this one test job to here:

  http://xenbits.xen.org/people/iwj/2016/67737/test-xtf-amd64-amd64-1/info.html

Ian.
