Re: [Qemu-devel] [PATCH v2 0/5] s390x: vfio-ap: guest dedicated crypto adapters
On 03/07/2018 11:22 AM, Pierre Morel wrote:
> On 06/03/2018 18:10, David Hildenbrand wrote:
>>> If L2 forwards devices to L3 through SIE ECA.28 but no bit is set in
>>> the CRYCB of L2,
>>> L3 will not see any device.
>> Exactly and this is the problem: How should L2 know that these devices
>> are special and cannot be forwarded.
>> This is what we call the nightmare of nested virtualization (see x86),
>> because we have to emulate L3 instructions in L1 - but even worse, not
>> even in L1 kernel space but in L1 user space.
>>> As soon as one level begins to virtualize, all levels under it
>>> must virtualize too so that L3 instructions will be handled in L2
>>> which will issue instructions that will be handled in L1.
>> By virtualize I assume you mean emulate? If so, yes.
>> So we could never provide the AP feature reliably with the SIE feature.
>>> I think we should change this sentence a little to:
>>> We can not provide SIE interpretation to a guest from which
>>> any guest level N-1 does not use SIE interpretation.
>> Exactly, and as said, there is no way to tell a guest that it has AP but
>> cannot use AP interpretation but has to intercept and handle manually.
>
>
> vSIE must clear ECA28 during running of the guest if the host itself does
> not have ECA28 set.
> Since ECA28 set for the host means AP instructions available for the host
> then we can sum it up by: vSIE should never set ECA28 in the shadow SIE
> if no AP instructions are available.

To say it differently: architecturally, ECA28 is an effective control, so we
might put the burden on the guest2 by saying that even if you set eca.28 you
might still get exits for NQAP, PQAP and DQAP and have to handle them
appropriately.

>
> Pierre
>
>
>>
>>> Nothing bad will occur for the host, the hardware or other guests,
>>> but the guest will just not get any device.
>>> We want to avoid interdependence between CPU features. (because
>>> everything else makes CPU feature detection ugly - CMMA is a good
>>> example and the only exception so far)
>>> Long story even shorter: No emulated AP devices with KVM.
>>> I agree with: KVM should never set bits in CRYCB for emulated devices.
>> I think this is stronger: emulated AP devices should not be used with
>> KVM because it can potentially lead to architectural (v)SIE conflicts.
>>
>> But the details are buried in some AP documentation not accessible to me.
>>
>> Anyhow, if the scenario I described cannot be worked around via:
>>
>> a) telling a guest that AP virtualization cannot be used - which doesn't
>> seem to be possible
>> b) provoking for selected devices a SIE exit when an AP instruction is
>> executed on these devices - and this is totally fine with the documented
>> AP architecture
>>
>> I assume we would have to live with !emulated AP devices.
>>
>
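
The rule quoted above ("vSIE should never set ECA28 in the shadow SIE if no AP instructions are available") can be sketched as a small model. This is only an illustration, not the KVM code: the names ECA_28 and shadow_eca are hypothetical, and the only assumption about the hardware is the s390 convention of counting bits from the most significant end of a 32-bit field.

```python
# Hypothetical constant: bit 28 of a 32-bit ECA field, with bit 0 being the
# most significant bit (s390 bit numbering).
ECA_28 = 1 << (31 - 28)


def shadow_eca(guest_eca: int, host_eca: int) -> int:
    """Return the ECA value to place in the shadow SIE block.

    ECA.28 requested by the guest survives only if the host itself runs
    with ECA.28 set; all other bits are passed through unchanged here.
    """
    if host_eca & ECA_28:
        return guest_eca
    return guest_eca & ~ECA_28


if __name__ == "__main__":
    # Guest asks for AP interpretation, host does not have it: bit is cleared.
    print(hex(shadow_eca(ECA_28, 0)))
    # Guest asks for AP interpretation, host has it: bit is kept.
    print(hex(shadow_eca(ECA_28, ECA_28)))
```

With such a rule the guest hypervisor may still see intercepts for AP instructions even though it set ECA.28 itself, which is the burden on guest2 mentioned in the reply.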
Re: [Qemu-devel] [PATCH v2 0/5] s390x: vfio-ap: guest dedicated crypto adapters
On 06/03/2018 18:10, David Hildenbrand wrote:
>> If L2 forwards devices to L3 through SIE ECA.28 but no bit is set in
>> the CRYCB of L2,
>> L3 will not see any device.
>
> Exactly and this is the problem: How should L2 know that these devices
> are special and cannot be forwarded.
>
>>> This is what we call the nightmare of nested virtualization (see x86),
>>> because we have to emulate L3 instructions in L1 - but even worse, not
>>> even in L1 kernel space but in L1 user space.
>>
>> As soon as one level begins to virtualize, all levels under it
>> must virtualize too so that L3 instructions will be handled in L2
>> which will issue instructions that will be handled in L1.
>
> By virtualize I assume you mean emulate? If so, yes.
>
>>> So we could never provide the AP feature reliably with the SIE feature.
>>
>> I think we should change this sentence a little to:
>> We can not provide SIE interpretation to a guest from which
>> any guest level N-1 does not use SIE interpretation.
>
> Exactly, and as said, there is no way to tell a guest that it has AP but
> cannot use AP interpretation but has to intercept and handle manually.

vSIE must clear ECA28 during running of the guest if the host itself does
not have ECA28 set.
Since ECA28 set for the host means AP instructions available for the host
then we can sum it up by: vSIE should never set ECA28 in the shadow SIE
if no AP instructions are available.

Pierre

>> Nothing bad will occur for the host, the hardware or other guests,
>> but the guest will just not get any device.
>>
>>> We want to avoid interdependence between CPU features. (because
>>> everything else makes CPU feature detection ugly - CMMA is a good
>>> example and the only exception so far)
>>>
>>> Long story even shorter:
>>>
>>> No emulated AP devices with KVM.
>>
>> I agree with: KVM should never set bits in CRYCB for emulated devices.
>
> I think this is stronger: emulated AP devices should not be used with
> KVM because it can potentially lead to architectural (v)SIE conflicts.
>
> But the details are buried in some AP documentation not accessible to me.
>
> Anyhow, if the scenario I described cannot be worked around via:
>
> a) telling a guest that AP virtualization cannot be used - which doesn't
> seem to be possible
> b) provoking for selected devices a SIE exit when an AP instruction is
> executed on these devices - and this is totally fine with the documented
> AP architecture
>
> I assume we would have to live with !emulated AP devices.

--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany
Re: [Qemu-devel] [PATCH v2 0/5] s390x: vfio-ap: guest dedicated crypto adapters
> If L2 forwards devices to L3 through SIE ECA.28 but no bit is set in
> the CRYCB of L2,
> L3 will not see any device.

Exactly and this is the problem: How should L2 know that these devices
are special and cannot be forwarded.

>
>>
>> This is what we call the nightmare of nested virtualization (see x86),
>> because we have to emulate L3 instructions in L1 - but even worse, not
>> even in L1 kernel space but in L1 user space.
>
> As soon as one level begins to virtualize, all levels under it
> must virtualize too so that L3 instructions will be handled in L2
> which will issue instructions that will be handled in L1.

By virtualize I assume you mean emulate? If so, yes.

>>
>> So we could never provide the AP feature reliably with the SIE feature.
>
> I think we should change this sentence a little to:
> We can not provide SIE interpretation to a guest from which
> any guest level N-1 does not use SIE interpretation.

Exactly, and as said, there is no way to tell a guest that it has AP but
cannot use AP interpretation but has to intercept and handle manually.

>
> Nothing bad will occur for the host, the hardware or other guests,
> but the guest will just not get any device.
>
>> We want to avoid interdependence between CPU features. (because
>> everything else makes CPU feature detection ugly - CMMA is a good
>> example and the only exception so far)
>>
>>
>> Long story even shorter:
>>
>> No emulated AP devices with KVM.
>>
> I agree with: KVM should never set bits in CRYCB for emulated devices.

I think this is stronger: emulated AP devices should not be used with
KVM because it can potentially lead to architectural (v)SIE conflicts.

But the details are buried in some AP documentation not accessible to me.

Anyhow, if the scenario I described cannot be worked around via:

a) telling a guest that AP virtualization cannot be used - which doesn't
seem to be possible
b) provoking for selected devices a SIE exit when an AP instruction is
executed on these devices - and this is totally fine with the documented
AP architecture

I assume we would have to live with !emulated AP devices.

--

Thanks,

David / dhildenb
Re: [Qemu-devel] [PATCH v2 0/5] s390x: vfio-ap: guest dedicated crypto adapters
On 06/03/2018 11:01, David Hildenbrand wrote:
> On 27.02.2018 16:44, Tony Krowiak wrote:
>> This patch series is the QEMU counterpart to the KVM/kernel support for
>> guest dedicated crypto adapters. The KVM/kernel model is built on the
>> VFIO mediated device framework and provides the infrastructure for
>> granting exclusive guest access to crypto devices installed on the linux
>> host. This patch series introduces a new QEMU command line option, QEMU
>> object model and CPU model features to exploit the KVM/kernel model.
>>
>> See the detailed specifications for AP virtualization provided by this
>> patch set in docs/vfio-ap.txt for a more complete discussion of the
>> design introduced by this patch series.
>>
>> v1 -> v2 Change log:
>> ===
>> * Removed unnecessary S390APMatrixDevice, S390APMatrixDeviceClass
>> * Removed ioctl to configure the AP matrix for the guest: letting the
>>   vfio_ap device driver's 'open' callback configure the AP matrix
>>   for the guest
>> * Removed masks from object model: Unnecessary at this point because they
>>   are not currently used
>> * Renamed:
>>   * VFIOAPMatrixDevice to VFIOAPDevice
>>   * VFIOAPMatrixDeviceClass to VFIOAPDeviceClass
>>   * APMatrixDevice to APDevice
>>   * APMatrixDeviceClass to APDeviceClass
>>   * ap-matrix.c to ap.c (in hw/vfio)
>>   * ap-matrix-device.c to ap-device.c (in hw/s390x)
>>   * ap-matrix-device.h to ap-device.h (in include/hw/s390x)
>> * Added CPU model feature for AP facilities installed on guest and
>>   facilities features for QCI Instructions Available (STFLE.12) and AP
>>   Facilities Test facility installed (STFLE.15).
>>
>> Tony Krowiak (5):
>>   s390x/ap: base Adjunct Processor (AP) object
>>   s390x/vfio: ap: VFIO: linux header updates
>>   s390x/vfio: ap: Introduce VFIO AP device
>>   s390x/cpumodel: Set up CPU model for AP device support
>>   s390: doc: detailed specifications for AP virtualization
>>
>
> I'm going to highlight an issue that stems from bad HW design:
>
> The lack of an AP interpretation facility (indication). We e.g. have
> something like that for zPCI (and all other I/O besides AP as far as I
> remember).
>
> Let's assume L1 provides AP to L2. Let's assume L2 provides AP to L3. L2
> can blindly forward APs to L3 because it sees the AP facility. This
> requires AP vSIE support. We have no separate way of indicating that
> support, it comes with the AP feature.
>
> So let's assume L2 does not emulate devices but uses interpretation for
> L3. Everything is fine as long as L1 does not emulate AP
> devices/instructions for L2. All instructions are interpreted by HW.

If L1 emulates AP, there is no need for it to set any bit in the L2 SIE
CRYCB. In fact we had better not set any bit in the CRYCB.

> But what happens if L1 emulates AP devices for L2? Interpretation is
> disabled. QEMU handles it. However L2 can simply forward AP devices to
> L3. At this point, we must also intercept and emulate AP instructions
> issued by L3 in _L1_.

If L2 forwards devices to L3 through SIE ECA.28 but no bit is set in
the CRYCB of L2,
L3 will not see any device.

> This is what we call the nightmare of nested virtualization (see x86),
> because we have to emulate L3 instructions in L1 - but even worse, not
> even in L1 kernel space but in L1 user space.

As soon as one level begins to virtualize, all levels under it
must virtualize too so that L3 instructions will be handled in L2
which will issue instructions that will be handled in L1.

> Long story short:
>
> Making this scenario work would require a _huge_ effort (going to user
> space with nested guest state - or communicating with the user space
> part using some other mechanism).

A funny game with big overhead but same virtualization whatever the level is.

> So we could never provide the AP feature reliably with the SIE feature.

I think we should change this sentence a little to:
We can not provide SIE interpretation to a guest from which
any guest level N-1 does not use SIE interpretation.

Nothing bad will occur for the host, the hardware or other guests,
but the guest will just not get any device.

> We want to avoid interdependence between CPU features. (because
> everything else makes CPU feature detection ugly - CMMA is a good
> example and the only exception so far)
>
> Long story even shorter:
>
> No emulated AP devices with KVM.

I agree with: KVM should never set bits in CRYCB for emulated devices.

--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany
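
The statement that L3 sees no device when no bit is set in the CRYCB of L2 can be illustrated with a toy model. It rests on an assumption that is not spelled out in the thread: that the adapter mask effective for a nested guest is the bitwise AND of the masks granted at every level above it. The function name effective_mask and the 4-bit masks are hypothetical.

```python
from functools import reduce
from operator import and_

# Hypothetical 4-adapter system; every bit set means "all adapters".
ALL_ADAPTERS = 0b1111


def effective_mask(masks_per_level):
    """AND together the adapter masks granted at each level, outermost first.

    An emulating level sets no bits in the CRYCB it gives to its guest,
    so every level below it ends up with an empty mask.
    """
    return reduce(and_, masks_per_level, ALL_ADAPTERS)


if __name__ == "__main__":
    # L1 interprets and grants two adapters to L2; L2 forwards one to L3.
    print(bin(effective_mask([0b1100, 0b0100])))
    # L1 emulates (empty CRYCB for L2); L2 forwards blindly: L3 sees nothing.
    print(bin(effective_mask([0b0000, 0b1111])))
```

In this model nothing bad happens to the host or to other guests when a level emulates; the nested guest simply ends up without devices.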
Re: [Qemu-devel] [PATCH v2 0/5] s390x: vfio-ap: guest dedicated crypto adapters
On 27.02.2018 16:44, Tony Krowiak wrote:
> This patch series is the QEMU counterpart to the KVM/kernel support for
> guest dedicated crypto adapters. The KVM/kernel model is built on the
> VFIO mediated device framework and provides the infrastructure for
> granting exclusive guest access to crypto devices installed on the linux
> host. This patch series introduces a new QEMU command line option, QEMU
> object model and CPU model features to exploit the KVM/kernel model.
>
> See the detailed specifications for AP virtualization provided by this
> patch set in docs/vfio-ap.txt for a more complete discussion of the
> design introduced by this patch series.
>
> v1 -> v2 Change log:
> ===
> * Removed unnecessary S390APMatrixDevice, S390APMatrixDeviceClass
> * Removed ioctl to configure the AP matrix for the guest: letting the
>   vfio_ap device driver's 'open' callback configure the AP matrix
>   for the guest
> * Removed masks from object model: Unnecessary at this point because they
>   are not currently used
> * Renamed:
>   * VFIOAPMatrixDevice to VFIOAPDevice
>   * VFIOAPMatrixDeviceClass to VFIOAPDeviceClass
>   * APMatrixDevice to APDevice
>   * APMatrixDeviceClass to APDeviceClass
>   * ap-matrix.c to ap.c (in hw/vfio)
>   * ap-matrix-device.c to ap-device.c (in hw/s390x)
>   * ap-matrix-device.h to ap-device.h (in include/hw/s390x)
> * Added CPU model feature for AP facilities installed on guest and
>   facilities features for QCI Instructions Available (STFLE.12) and AP
>   Facilities Test facility installed (STFLE.15).
>
> Tony Krowiak (5):
>   s390x/ap: base Adjunct Processor (AP) object
>   s390x/vfio: ap: VFIO: linux header updates
>   s390x/vfio: ap: Introduce VFIO AP device
>   s390x/cpumodel: Set up CPU model for AP device support
>   s390: doc: detailed specifications for AP virtualization
>

I'm going to highlight an issue that stems from bad HW design:

The lack of an AP interpretation facility (indication). We e.g. have
something like that for zPCI (and all other I/O besides AP as far as I
remember).

Let's assume L1 provides AP to L2. Let's assume L2 provides AP to L3. L2
can blindly forward APs to L3 because it sees the AP facility. This
requires AP vSIE support. We have no separate way of indicating that
support, it comes with the AP feature.

So let's assume L2 does not emulate devices but uses interpretation for
L3. Everything is fine as long as L1 does not emulate AP
devices/instructions for L2. All instructions are interpreted by HW.

But what happens if L1 emulates AP devices for L2? Interpretation is
disabled. QEMU handles it. However L2 can simply forward AP devices to
L3. At this point, we must also intercept and emulate AP instructions
issued by L3 in _L1_.

This is what we call the nightmare of nested virtualization (see x86),
because we have to emulate L3 instructions in L1 - but even worse, not
even in L1 kernel space but in L1 user space.

Long story short:

Making this scenario work would require a _huge_ effort (going to user
space with nested guest state - or communicating with the user space
part using some other mechanism).

So we could never provide the AP feature reliably with the SIE feature.

We want to avoid interdependence between CPU features. (because
everything else makes CPU feature detection ugly - CMMA is a good
example and the only exception so far)

Long story even shorter:

No emulated AP devices with KVM.

--

Thanks,

David / dhildenb
Re: [Qemu-devel] [PATCH v2 0/5] s390x: vfio-ap: guest dedicated crypto adapters
Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 1519746259-27710-1-git-send-email-akrow...@linux.vnet.ibm.com
Subject: [Qemu-devel] [PATCH v2 0/5] s390x: vfio-ap: guest dedicated crypto adapters

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
    echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
    if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
        failed=1
        echo
    fi
    n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag] patchew/1519746259-27710-1-git-send-email-akrow...@linux.vnet.ibm.com -> patchew/1519746259-27710-1-git-send-email-akrow...@linux.vnet.ibm.com
Switched to a new branch 'test'
e9e1d68b87 s390x/cpumodel: Set up CPU model for AP device support
2fabd0f576 s390x/vfio: ap: Introduce VFIO AP device
bb505ee5d6 s390x/vfio: ap: VFIO: linux header updates
4ea89ebf38 s390x/ap: base Adjunct Processor (AP) object
4fc31e63ea s390: doc: detailed specifications for AP virtualization

=== OUTPUT BEGIN ===
Checking PATCH 1/5: s390: doc: detailed specifications for AP virtualization...
Checking PATCH 2/5: s390x/ap: base Adjunct Processor (AP) object...
Checking PATCH 3/5: s390x/vfio: ap: VFIO: linux header updates...
Checking PATCH 4/5: s390x/vfio: ap: Introduce VFIO AP device...
Checking PATCH 5/5: s390x/cpumodel: Set up CPU model for AP device support...
WARNING: line over 80 characters
#86: FILE: target/s390x/cpu_features.c:39:
+FEAT_INIT("qci", S390_FEAT_TYPE_STFL, 12, "Query AP Configuration facility"),

ERROR: line over 90 characters
#89: FILE: target/s390x/cpu_features.c:42:
+FEAT_INIT("apft", S390_FEAT_TYPE_STFL, 15, "Adjunct Processor Facilities Test facility"),

total: 1 errors, 1 warnings, 113 lines checked

Your patch has style problems, please review. If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===

Test command exited with code: 1

---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@freelists.org
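
The two thresholds visible in the report above (a warning for lines over 80 characters, an error for lines over 90) can be reproduced with a small sketch. This is not how scripts/checkpatch.pl is implemented; classify_line is a hypothetical helper that only mirrors the two messages shown in the output.

```python
def classify_line(line: str) -> str:
    """Classify an added source line by length, using the limits in the report."""
    length = len(line.rstrip("\n"))
    if length > 90:
        return "ERROR: line over 90 characters"
    if length > 80:
        return "WARNING: line over 80 characters"
    return "ok"


if __name__ == "__main__":
    # Synthetic lines of 80, 81 and 91 characters.
    for width in (80, 81, 91):
        print(width, classify_line("x" * width))
```

Wrapping the description string of each FEAT_INIT() entry onto a continuation line would be one way to bring both reported lines under the limit.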