On 11.04.2010, at 23:07, Andre Przywara wrote:
> On SVM we set the instruction length of skipped instructions
> to hard-coded, well known values, which could be wrong when (bogus,
> but valid) prefixes (REX, segment override) are used.
> Newer AMD processors (Fam10h 45nm and better, aka. PhenomII or
> AthlonII) have an explicit NEXTRIP field in the VMCB containing the
> desired information.
> Since it is cheap to do so, we use this field to override the guessed
> value on newer processors.
> A fix for older CPUs would be rather expensive, as it would require
> to fetch and partially decode the instruction. As the problem is not
> a security issue and needs special, handcrafted code to trigger
> (no compiler will ever generate such code), I omit a fix for older
> CPUs.
> If someone is interested, I have both a patch for these CPUs as well as
> demo code triggering this issue: It segfaults under KVM, but runs
> perfectly on native Linux.
>
> Signed-off-by: Andre Przywara <[email protected]>
> ---
> arch/x86/include/asm/svm.h | 4 +++-
> arch/x86/kvm/svm.c | 13 ++++++++-----
> 2 files changed, 11 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
> index b26a38d..1d91d05 100644
> --- a/arch/x86/include/asm/svm.h
> +++ b/arch/x86/include/asm/svm.h
> @@ -81,7 +81,9 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
> u32 event_inj_err;
> u64 nested_cr3;
> u64 lbr_ctl;
> - u8 reserved_5[832];
> + u64 reserved_5;
> + u64 next_rip;
> + u8 reserved_6[816];
> };
>
>
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index d04c7ad..7fff56c 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -43,11 +43,11 @@ MODULE_LICENSE("GPL");
> #define SEG_TYPE_LDT 2
> #define SEG_TYPE_BUSY_TSS16 3
>
> -#define SVM_FEATURE_NPT (1 << 0)
> -#define SVM_FEATURE_LBRV (1 << 1)
> -#define SVM_FEATURE_SVML (1 << 2)
> -#define SVM_FEATURE_NRIP (1 << 3)
> -#define SVM_FEATURE_PAUSE_FILTER (1 << 10)
> +#define SVM_FEATURE_NPT (1 << 0)
> +#define SVM_FEATURE_LBRV (1 << 1)
> +#define SVM_FEATURE_SVML (1 << 2)
> +#define SVM_FEATURE_NRIP (1 << 3)
> +#define SVM_FEATURE_PAUSE_FILTER (1 << 10)
This is pure indent fixing, right? Sounds like a separate patch to me.
>
> #define NESTED_EXIT_HOST 0 /* Exit handled on host level */
> #define NESTED_EXIT_DONE 1 /* Exit caused nested vmexit */
> @@ -319,6 +319,9 @@ static void skip_emulated_instruction(struct kvm_vcpu
> *vcpu)
> {
> struct vcpu_svm *svm = to_svm(vcpu);
>
> + if (svm->vmcb->control.next_rip != 0)
> + svm->next_rip = svm->vmcb->control.next_rip;
> +
Wouldn't it be more obvious to create a wrapper function to set next_rip in all
the callers instead of setting it manually and then have magic override the
value you just set?
Let me illustrate what I mean. Currently we have:
static int halt_interception(struct vcpu_svm *svm)
{
svm->next_rip = kvm_rip_read(&svm->vcpu) + 1;
skip_emulated_instruction(&svm->vcpu);
return kvm_emulate_halt(&svm->vcpu);
}
skip_emulated_instruction sets pc = next_rip. All fine and obvious. With your
patch it would simply ignore next_rip, rendering the line before void without
obvious indication of that behavior.
So what I'd prefer is something like:
/* Either adds offset n to the instruction counter or takes the next
instruction pointer from the vmcb if the CPU supports it */
static u64 svm_next_rip(struct kvm_vcpu *vcpu, int add)
{
if (svm->vmcb->control.next_rip != 0)
return svm->vmcb->control.next_rip;
return kvm_rip_read(&svm->vcpu) + add;
}
static int halt_interception(struct vcpu_svm *svm)
{
svm->next_rip = svm_next_rip(&svm->vcpu, 1);
skip_emulated_instruction(&svm->vcpu);
return kvm_emulate_halt(&svm->vcpu);
}
That is more readable IMHO and makes the code flow a lot more obvious.
Alex--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html