Re: [Xen-devel] Problem with Xen 4.5 failing XTF tests on old AMD cpus ?
Ian Jackson writes ("Re: Problem with Xen 4.5 failing XTF tests on old AMD cpus ?"):
> Andrew Cooper writes ("Re: Problem with Xen 4.5 failing XTF tests on old AMD cpus ?"):
> > It will be because of Gen1 SVM which doesn't have NRIP support.  This
> > case requires emulation of the invlpg instruction, rather than just
> > using the information provided by the intercept.
>
> So it seems that the xtf test is not effective at detecting the Xen
> bug except on old hardware ?  Is there some way it could be improved ?
>
> It's obviously not desirable that we should have tests which pass in
> the production colo and fail in the ancient Citrix Cambridge instance.

Andrew and I discussed this IRL.  I thought it worth writing down what
was said so that we can refer to it later.

This test failure is due to genuine bug(s) in Xen 4.5, in that it
doesn't have various fixes (see the rest of the thread).

The bugs are only exposed on old hardware, which uses different
codepaths in Xen.  On new hardware Xen takes a different approach.
This is why the test failure appears in the Citrix (Cambridge) osstest
but not in the Xen Project (Massachusetts) instance.

Xen decides which approach to take based on hardware features.  There
is not currently any way to tell Xen not to use these hardware
features (at least, not in this case - the AMD SVM NextRIP feature) if
they are available.  Andrew has a long-term plan to add more of such a
facility - but that is not going to be available any time soon.

In this particular case, the old hardware uses the Xen instruction
emulator where newer hardware uses hardware support.  (Andrew tells me
that without NextRIP support, Xen must use the instruction emulator
when handling `invlpg` instructions on behalf of the guest, to
calculate how many bytes to move the instruction pointer forward by.
And it is the emulator which has the bug here.)
So FEP (the Force Emulation Prefix) could be used to cause the bug to
manifest even on new hardware, and indeed where FEP is available, XTF
does then use FEP to run exactly the same set of tests.  However, FEP
is not available in Xen 4.5 and there are good reasons for not
backporting it there.

It would be possible to backport the bugfixes to Xen 4.5.  However,
the fixes address only very rare problems.  Andrew thinks the bugs are,
insofar as they are bugs which might cause lossage, more likely to be
roughly "crashes obscure or very oddly-behaved guests" than "crashes
commonly used guests but only with very low probability".  The latter
kind of bug would be worth a backport; the former much less so
(especially in a very old stable release, and especially when the
fixes involve behavioural changes).

The fixes would also provide an unquantified performance improvement
on AMD hardware, due to avoiding extraneous TLB flushes, but Andrew
says he doubts that's worth caring about.

We discussed host stickiness, host-specific bug detection, and
regression detection, in osstest.  I reassured Andrew that I think the
current osstest algorithms will deal with this situation tolerably
well (if not perfectly).

The conclusion is that there is nothing to be done, at least in the
short term.  There are good reasons for the bug to persist in 4.5 and
good reasons for it being hard to detect on newer hardware.

Ian.

(Thanks to Andrew for the IRL explanation and for review of this
email.)

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
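For anyone wanting to predict which of the two codepaths a given test host will take, a rough check of whether the CPU advertises NextRIP can be sketched as below. This is only a sketch under assumptions: it relies on Linux reporting the feature as the `nrip_save` flag in /proc/cpuinfo, it should be run on bare-metal Linux (flags seen from inside a Xen domain may be filtered), and the function names `cpu_flags` and `has_nextrip` are made up for illustration. It does not change what Xen does; it only reports what the hardware offers.

```python
def cpu_flags(cpuinfo_text):
    """Return the flag set from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_nextrip(cpuinfo_text):
    """True if the CPU advertises both SVM and NextRIP save.

    Assumption: Linux names these flags 'svm' and 'nrip_save'.
    """
    flags = cpu_flags(cpuinfo_text)
    return "svm" in flags and "nrip_save" in flags


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""  # not Linux, or file unreadable: report False
    print("NextRIP advertised:", has_nextrip(text))
```

A host for which this prints False would be expected to go through the instruction emulator for `invlpg` intercepts, which is where the 4.5 bug shows up.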
Re: [Xen-devel] Problem with Xen 4.5 failing XTF tests on old AMD cpus ?
Andrew Cooper writes ("Re: Problem with Xen 4.5 failing XTF tests on old AMD cpus ?"):
> This is the 4.5 tree missing some fixes:
>
> * 31d961f - x86/hvm: Fix invalidation for emulated invlpg instructions
>   (4 months ago)
> * eee511d - x86/svm: Don't unconditionally use a new ASID in
>   svm_invlpg_intercept() (4 months ago)
> * a373db2 - x86/hvm: Correct the emulated interaction of invlpg with
>   segments (4 months ago)
> * a94b35d - x86/hvm: Raise #SS faults for %ss-based segmentation
>   violations (4 months ago)
> * 6093515 - x86/hvm: Always return the linear address from
>   hvm_virtual_to_linear_addr() (4 months ago)
>
> In this case, XTF is complaining that `invlpg`, as used in a shadow
> guest, is not behaving in the architecturally specified way.

I think it is probably not sensible to backport this kind of change
(and especially not to 4.5).

Ian.
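To see which of the fixes Andrew lists are absent from a given branch, one rough approach is to compare commit subjects rather than hashes, since a backport would get a new hash. The sketch below assumes the subjects are carried over unchanged on backport, which may not hold; `missing_fixes` and `FIX_SUBJECTS` are names invented for this example, and the input is expected to be the lines printed by `git log --format=%s` on the branch in question.

```python
# Subjects copied from the list of fixes quoted in this thread.
FIX_SUBJECTS = [
    "x86/hvm: Fix invalidation for emulated invlpg instructions",
    "x86/svm: Don't unconditionally use a new ASID in svm_invlpg_intercept()",
    "x86/hvm: Correct the emulated interaction of invlpg with segments",
    "x86/hvm: Raise #SS faults for %ss-based segmentation violations",
    "x86/hvm: Always return the linear address from hvm_virtual_to_linear_addr()",
]


def missing_fixes(branch_subjects, wanted=FIX_SUBJECTS):
    """Return the wanted subjects that do not appear in branch_subjects.

    branch_subjects: iterable of commit subject lines for the branch.
    Matching is exact (after stripping whitespace), so a retitled
    backport would be reported as missing.
    """
    present = {s.strip() for s in branch_subjects}
    return [s for s in wanted if s not in present]
```

An empty result would suggest all five fixes are present; anything else is a starting point for a manual look, not proof that a fix is absent.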
Re: [Xen-devel] Problem with Xen 4.5 failing XTF tests on old AMD cpus ?
Andrew Cooper writes ("Re: Problem with Xen 4.5 failing XTF tests on old AMD cpus ?"):
> It will be because of Gen1 SVM which doesn't have NRIP support.  This
> case requires emulation of the invlpg instruction, rather than just
> using the information provided by the intercept.

So it seems that the xtf test is not effective at detecting the Xen
bug except on old hardware ?  Is there some way it could be improved ?

It's obviously not desirable that we should have tests which pass in
the production colo and fail in the ancient Citrix Cambridge instance.

Ian.
Re: [Xen-devel] Problem with Xen 4.5 failing XTF tests on old AMD cpus ?
On 21/09/16 19:06, Wei Liu wrote:
> On Wed, Sep 21, 2016 at 02:00:50PM -0400, Boris Ostrovsky wrote:
>> On 09/21/2016 01:36 PM, Wei Liu wrote:
>>> On Wed, Sep 21, 2016 at 12:42:51PM -0400, Boris Ostrovsky wrote:
>>>> On 09/21/2016 12:13 PM, Ian Jackson wrote:
>>>>> Platform Team regression test user writes ("[xen-4.5-testing
>>>>> baseline-only test] 67737: regressions - FAIL"):
>>>>>> test-xtf-amd64-amd64-1 19 xtf/test-hvm32-invlpg~shadow fail REGR. vs. 67706
>>>>>> test-xtf-amd64-amd64-1 26 xtf/test-hvm32pae-invlpg~shadow fail REGR. vs. 67706
>>>>> Several of these, 32bit and 64bit HVM.  This is in the Citrix
>>>>> Cambridge osstest instance.  The Xen Project colo instance is
>>>>> unaffected (flight 101045 there passed with the same revisions of
>>>>> everything)
>>>>>
>>>>> This is with:
>>>>>
>>>>>   xen e4ae4b03d35babc9624b7286f1ea4c6749bad84b
>>>>>   xtf b5c5332de4268d33a6f8eadc1d17c7b9cf0e7dc3
>>>>>   linux b65f2f457c49b2cfd7967c34b7a0b04c25587f13
>>>>>   linux-firmware c530a75c1e6a472b0eb9558310b518f0dfcd8860
>>>> I can't get these commits neither for Xen nor for Linux. Are these from
>>>> Citrix trees?
>>> No. They are all upstream commits.
>> Apparently xen commit *just* made it to the tree, after I checked it.
> Yes, it passed in Mass COLO. We suspected it can be related to
> generations of AMD cpus (hence the "old" in email title).

It will be because of Gen1 SVM which doesn't have NRIP support.  This
case requires emulation of the invlpg instruction, rather than just
using the information provided by the intercept.

~Andrew
Re: [Xen-devel] Problem with Xen 4.5 failing XTF tests on old AMD cpus ?
On Wed, Sep 21, 2016 at 02:00:50PM -0400, Boris Ostrovsky wrote:
> On 09/21/2016 01:36 PM, Wei Liu wrote:
> > On Wed, Sep 21, 2016 at 12:42:51PM -0400, Boris Ostrovsky wrote:
> >> On 09/21/2016 12:13 PM, Ian Jackson wrote:
> >>> Platform Team regression test user writes ("[xen-4.5-testing
> >>> baseline-only test] 67737: regressions - FAIL"):
> >>>> test-xtf-amd64-amd64-1 19 xtf/test-hvm32-invlpg~shadow fail REGR. vs. 67706
> >>>> test-xtf-amd64-amd64-1 26 xtf/test-hvm32pae-invlpg~shadow fail REGR. vs. 67706
> >>> Several of these, 32bit and 64bit HVM.  This is in the Citrix
> >>> Cambridge osstest instance.  The Xen Project colo instance is
> >>> unaffected (flight 101045 there passed with the same revisions of
> >>> everything)
> >>>
> >>> This is with:
> >>>
> >>>   xen e4ae4b03d35babc9624b7286f1ea4c6749bad84b
> >>>   xtf b5c5332de4268d33a6f8eadc1d17c7b9cf0e7dc3
> >>>   linux b65f2f457c49b2cfd7967c34b7a0b04c25587f13
> >>>   linux-firmware c530a75c1e6a472b0eb9558310b518f0dfcd8860
> >> I can't get these commits neither for Xen nor for Linux. Are these from
> >> Citrix trees?
> > No. They are all upstream commits.
>
> Apparently xen commit *just* made it to the tree, after I checked it.

Yes, it passed in Mass COLO. We suspected it can be related to
generations of AMD cpus (hence the "old" in email title).

> And I still don't see Linux one.
>

It should be upstream commit, too. But I don't think it matters that
much anyway.

> I ran a quick test (not xtf, our internal one) with 32-bit shadow guest
> and didn't see anything. But then Andrew seems to have pointed out what

That's probably because your guest is a well-behaved guest.

> the problem is.
>
> -boris
>
Re: [Xen-devel] Problem with Xen 4.5 failing XTF tests on old AMD cpus ?
On 09/21/2016 01:36 PM, Wei Liu wrote:
> On Wed, Sep 21, 2016 at 12:42:51PM -0400, Boris Ostrovsky wrote:
>> On 09/21/2016 12:13 PM, Ian Jackson wrote:
>>> Platform Team regression test user writes ("[xen-4.5-testing baseline-only
>>> test] 67737: regressions - FAIL"):
>>>> test-xtf-amd64-amd64-1 19 xtf/test-hvm32-invlpg~shadow fail REGR. vs. 67706
>>>> test-xtf-amd64-amd64-1 26 xtf/test-hvm32pae-invlpg~shadow fail REGR. vs. 67706
>>> Several of these, 32bit and 64bit HVM.  This is in the Citrix
>>> Cambridge osstest instance.  The Xen Project colo instance is
>>> unaffected (flight 101045 there passed with the same revisions of
>>> everything)
>>>
>>> This is with:
>>>
>>>   xen e4ae4b03d35babc9624b7286f1ea4c6749bad84b
>>>   xtf b5c5332de4268d33a6f8eadc1d17c7b9cf0e7dc3
>>>   linux b65f2f457c49b2cfd7967c34b7a0b04c25587f13
>>>   linux-firmware c530a75c1e6a472b0eb9558310b518f0dfcd8860
>> I can't get these commits neither for Xen nor for Linux. Are these from
>> Citrix trees?
> No. They are all upstream commits.

Apparently xen commit *just* made it to the tree, after I checked it.
And I still don't see Linux one.

I ran a quick test (not xtf, our internal one) with 32-bit shadow guest
and didn't see anything. But then Andrew seems to have pointed out what
the problem is.

-boris
Re: [Xen-devel] Problem with Xen 4.5 failing XTF tests on old AMD cpus ?
On Wed, Sep 21, 2016 at 12:42:51PM -0400, Boris Ostrovsky wrote:
> On 09/21/2016 12:13 PM, Ian Jackson wrote:
> > Platform Team regression test user writes ("[xen-4.5-testing baseline-only
> > test] 67737: regressions - FAIL"):
> >> test-xtf-amd64-amd64-1 19 xtf/test-hvm32-invlpg~shadow fail REGR. vs. 67706
> >> test-xtf-amd64-amd64-1 26 xtf/test-hvm32pae-invlpg~shadow fail REGR. vs. 67706
> > Several of these, 32bit and 64bit HVM.  This is in the Citrix
> > Cambridge osstest instance.  The Xen Project colo instance is
> > unaffected (flight 101045 there passed with the same revisions of
> > everything)
> >
> > This is with:
> >
> >   xen e4ae4b03d35babc9624b7286f1ea4c6749bad84b
> >   xtf b5c5332de4268d33a6f8eadc1d17c7b9cf0e7dc3
> >   linux b65f2f457c49b2cfd7967c34b7a0b04c25587f13
> >   linux-firmware c530a75c1e6a472b0eb9558310b518f0dfcd8860
>
> I can't get these commits neither for Xen nor for Linux. Are these from
> Citrix trees?

No. They are all upstream commits.

> >
> > The log says this:
> >
> >   2016-09-21 06:07:36 Z -- substep 19 xtf/test-hvm32-invlpg~shadow running --
> >   2016-09-21 06:07:36 Z executing ssh ... root@10.80.228.78 /home/xtf/xtf-runner -m logfile test-hvm32-invlpg~shadow 1>&2; echo $?
> >
> >   Using logfile '/var/log/xen/console/guest-test-hvm32-invlpg~shadow.log'
> >   Executing 'xl create -F tests/invlpg/test-hvm32-invlpg~shadow.cfg'
> >   --- Xen Test Framework ---
> >   Environment: HVM 32bit (No paging)
> >   Testing 'invlpg' in normally-faulting conditions
> >   Test: Mapped address
> >   Test: Unmapped address
> >   Test: NULL segment override
> >   Test: Past segment limit
> >   Fail: Unexpected #GP[]
> >   Test: Before expand-down segment limit
> >   Fail: Unexpected #GP[]
> >   Test result: FAILURE
> >
> >   Combined test results:
> >   test-hvm32-invlpg~shadow FAILURE
> >
> > Sadly we haven't yet managed to make the logs from this instance
> > public.
>
> I looked at the logs you posted but I can't find guest config file. Are
> they part of the logs?

The guest config file is not stored because it is part of xtf. So ...

> >
> > Do you have any idea what might be causing this ?  Is there a real
> > problem with the Xen 4.5 branch ?  The Citrix Cambridge instance has
> > old hardware.
>
> We run 4.5 every week or so here but we don't run without HAP (which,
> based on the name, I assume is what this guest is).
>
> I'll give it a try.
>

... the best way to try is to build and run xtf. After building:

  ./xtf-runner --host --functional

> -boris
Re: [Xen-devel] Problem with Xen 4.5 failing XTF tests on old AMD cpus ?
On 21/09/16 17:13, Ian Jackson wrote:
> Platform Team regression test user writes ("[xen-4.5-testing baseline-only
> test] 67737: regressions - FAIL"):
>> test-xtf-amd64-amd64-1 19 xtf/test-hvm32-invlpg~shadow fail REGR. vs. 67706
>> test-xtf-amd64-amd64-1 26 xtf/test-hvm32pae-invlpg~shadow fail REGR. vs. 67706
> Several of these, 32bit and 64bit HVM.  This is in the Citrix
> Cambridge osstest instance.  The Xen Project colo instance is
> unaffected (flight 101045 there passed with the same revisions of
> everything)
>
> This is with:
>
>   xen e4ae4b03d35babc9624b7286f1ea4c6749bad84b
>   xtf b5c5332de4268d33a6f8eadc1d17c7b9cf0e7dc3
>   linux b65f2f457c49b2cfd7967c34b7a0b04c25587f13
>   linux-firmware c530a75c1e6a472b0eb9558310b518f0dfcd8860
>
> The log says this:
>
>   2016-09-21 06:07:36 Z -- substep 19 xtf/test-hvm32-invlpg~shadow running --
>   2016-09-21 06:07:36 Z executing ssh ... root@10.80.228.78 /home/xtf/xtf-runner -m logfile test-hvm32-invlpg~shadow 1>&2; echo $?
>
>   Using logfile '/var/log/xen/console/guest-test-hvm32-invlpg~shadow.log'
>   Executing 'xl create -F tests/invlpg/test-hvm32-invlpg~shadow.cfg'
>   --- Xen Test Framework ---
>   Environment: HVM 32bit (No paging)
>   Testing 'invlpg' in normally-faulting conditions
>   Test: Mapped address
>   Test: Unmapped address
>   Test: NULL segment override
>   Test: Past segment limit
>   Fail: Unexpected #GP[]
>   Test: Before expand-down segment limit
>   Fail: Unexpected #GP[]
>   Test result: FAILURE
>
>   Combined test results:
>   test-hvm32-invlpg~shadow FAILURE
>
> Sadly we haven't yet managed to make the logs from this instance
> public.
>
> Do you have any idea what might be causing this ?  Is there a real
> problem with the Xen 4.5 branch ?  The Citrix Cambridge instance has
> old hardware.

This is the 4.5 tree missing some fixes:

* 31d961f - x86/hvm: Fix invalidation for emulated invlpg instructions
  (4 months ago)
* eee511d - x86/svm: Don't unconditionally use a new ASID in
  svm_invlpg_intercept() (4 months ago)
* a373db2 - x86/hvm: Correct the emulated interaction of invlpg with
  segments (4 months ago)
* a94b35d - x86/hvm: Raise #SS faults for %ss-based segmentation
  violations (4 months ago)
* 6093515 - x86/hvm: Always return the linear address from
  hvm_virtual_to_linear_addr() (4 months ago)

In this case, XTF is complaining that `invlpg`, as used in a shadow
guest, is not behaving in the architecturally specified way.

~Andrew
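When comparing runs across hosts it can help to pull the failing subtests out of the XTF console log mechanically. The following is a minimal sketch that assumes only the line shapes visible in the log quoted above ("Test: ...", "Fail: ...", "Test result: ..."); the function names are invented for this example and are not part of XTF or osstest.

```python
def failing_subtests(log_text):
    """Pair each 'Fail:' line with the most recent 'Test:' line before it.

    Returns a list of (subtest_name, failure_message) tuples.
    """
    failures = []
    current = None
    for raw in log_text.splitlines():
        line = raw.strip()
        if line.startswith("Test:"):
            current = line[len("Test:"):].strip()
        elif line.startswith("Fail:"):
            failures.append((current, line[len("Fail:"):].strip()))
    return failures


def overall_result(log_text):
    """Return the text after 'Test result:', or None if no such line exists."""
    for raw in log_text.splitlines():
        line = raw.strip()
        if line.startswith("Test result:"):
            return line[len("Test result:"):].strip()
    return None


SAMPLE = """\
--- Xen Test Framework ---
Environment: HVM 32bit (No paging)
Testing 'invlpg' in normally-faulting conditions
Test: Mapped address
Test: Unmapped address
Test: NULL segment override
Test: Past segment limit
Fail: Unexpected #GP[]
Test: Before expand-down segment limit
Fail: Unexpected #GP[]
Test result: FAILURE
"""

if __name__ == "__main__":
    for name, message in failing_subtests(SAMPLE):
        print(f"{name}: {message}")
    print("overall:", overall_result(SAMPLE))
```

On the log from this flight, that singles out the two segment-limit subtests, both failing with an unexpected #GP.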
Re: [Xen-devel] Problem with Xen 4.5 failing XTF tests on old AMD cpus ?
On 09/21/2016 12:13 PM, Ian Jackson wrote:
> Platform Team regression test user writes ("[xen-4.5-testing baseline-only
> test] 67737: regressions - FAIL"):
>> test-xtf-amd64-amd64-1 19 xtf/test-hvm32-invlpg~shadow fail REGR. vs. 67706
>> test-xtf-amd64-amd64-1 26 xtf/test-hvm32pae-invlpg~shadow fail REGR. vs. 67706
> Several of these, 32bit and 64bit HVM.  This is in the Citrix
> Cambridge osstest instance.  The Xen Project colo instance is
> unaffected (flight 101045 there passed with the same revisions of
> everything)
>
> This is with:
>
>   xen e4ae4b03d35babc9624b7286f1ea4c6749bad84b
>   xtf b5c5332de4268d33a6f8eadc1d17c7b9cf0e7dc3
>   linux b65f2f457c49b2cfd7967c34b7a0b04c25587f13
>   linux-firmware c530a75c1e6a472b0eb9558310b518f0dfcd8860

I can't get these commits neither for Xen nor for Linux. Are these from
Citrix trees?

>
> The log says this:
>
>   2016-09-21 06:07:36 Z -- substep 19 xtf/test-hvm32-invlpg~shadow running --
>   2016-09-21 06:07:36 Z executing ssh ... root@10.80.228.78 /home/xtf/xtf-runner -m logfile test-hvm32-invlpg~shadow 1>&2; echo $?
>
>   Using logfile '/var/log/xen/console/guest-test-hvm32-invlpg~shadow.log'
>   Executing 'xl create -F tests/invlpg/test-hvm32-invlpg~shadow.cfg'
>   --- Xen Test Framework ---
>   Environment: HVM 32bit (No paging)
>   Testing 'invlpg' in normally-faulting conditions
>   Test: Mapped address
>   Test: Unmapped address
>   Test: NULL segment override
>   Test: Past segment limit
>   Fail: Unexpected #GP[]
>   Test: Before expand-down segment limit
>   Fail: Unexpected #GP[]
>   Test result: FAILURE
>
>   Combined test results:
>   test-hvm32-invlpg~shadow FAILURE
>
> Sadly we haven't yet managed to make the logs from this instance
> public.

I looked at the logs you posted but I can't find guest config file. Are
they part of the logs?

>
> Do you have any idea what might be causing this ?  Is there a real
> problem with the Xen 4.5 branch ?  The Citrix Cambridge instance has
> old hardware.

We run 4.5 every week or so here but we don't run without HAP (which,
based on the name, I assume is what this guest is).

I'll give it a try.

-boris
Re: [Xen-devel] Problem with Xen 4.5 failing XTF tests on old AMD cpus ?
Ian Jackson writes ("Problem with Xen 4.5 failing XTF tests on old AMD cpus ?"):
> Sadly we haven't yet managed to make the logs from this instance
> public.

I have copied the logs from this one test job to here:
  http://xenbits.xen.org/people/iwj/2016/67737/test-xtf-amd64-amd64-1/info.html

Ian.