Re: [Qemu-devel] [PATCH 3/6] x86_iommu/amd: Add interrupt remap support when VAPIC is not enabled

2018-09-12 Thread Peter Xu
On Wed, Sep 12, 2018 at 01:50:34PM -0500, Brijesh Singh wrote:

[...]

> > > + */
> > > +if (sid == X86_IOMMU_SID_INVALID) {
> > > +sid = AMDVI_SB_IOAPIC_ID;
> > > +}
> > > +
> > > +amdvi_get_dte(iommu, sid, dte);
> > 
> > Mind to check the return value?
> > 
> > After all it's provided, and we have the fault path to handle it in
> > this function.
> > 
> 
> The amdvi_get_dte(..) does not check the IR bit. It only checks for the
> address-translation enabled bit. I can extend the function to check for
> IR flag so that we can check the error status of this function.

It seems to me that amdvi_get_dte() is checking against things like:
general read errors, reserved bits error, and proper set of
AMDVI_DEV_VALID.  These seem still make sense to IR.

> 
> 
> > > +
> > > +/* interrupt remapping is disabled */
> > > +if (!(dte[2] & AMDVI_IR_REMAP_ENABLE)) {
> > > +memcpy(translated, origin, sizeof(*origin));
> > > +goto out;
> > > +}
> > > +
> > > +/* deliver_mode and dest_mode from MSI data */
> > > +dest_mode = (origin->data >> 11) & 1;   /* Bit 11 */
> > > +delivery_mode = (origin->data >> 7) & 7;/* Bits 10:8 */
> > 
> > Nit: The comments are already nice, though IMHO some mask macros would
> > be nicer, like AMDVI_IR_PHYS_ADDR_MASK.
> > 
> > Also, could you hint me where are these things defined in the spec?
> > I'm looking at Figure 14, but there MSI data bits 15:11 are defined as
> > "Xb", and bits 10:8 seems to be part of the interrupt table offset.
> > 
> 
> Since we are not emulating the Hyper Transport hence we need to look at
> the encoding of the delivery mode of MSI data, which  is the same as
> APIC/IOAPIC delivery mode encoding. See
> 
> * For MSI Data, 
> https://pdfs.semanticscholar.org/presentation/9420/c279e942eca568157711ef5c92b800c40a79.pdf
>   (page 5)

[1]

> * For IOAPIC, https://wiki.osdev.org/APIC
> 
> They are similar to Hyper Transport MT encoding, but not quite the same.
> 
> enum ioapic_irq_destination_types {
> dest_Fixed  = 0,
> dest_LowestPrio = 1,
> dest_SMI= 2,
> dest__reserved_1= 3,
> dest_NMI= 4,
> dest_INIT   = 5,
> dest__reserved_2= 6,
> dest_ExtINT = 7
> };
> 
> I will add the comments in the code and also provide the link

I just misunderstood since you did this before the switch.  Maybe you
could consider move this after the switch then since these dest_mode
things are not valid for all the cases.

Meanwhile, you are refering to general MSI/IOAPIC definitions, then I
don't see why dest_mode is bit 11 of MSI data; what I see is that it
should be bit 2 of MSI address.  That's exactly on the slides you
provided [1].  Would you help double confirm?

[...]

> > > +/* Generic IRQ entry information */
> > > +struct AMDVIIrq {
> > > +/* Used by both IOAPIC/MSI interrupt remapping */
> > > +uint8_t trigger_mode;
> > > +uint8_t vector;
> > > +uint8_t delivery_mode;
> > > +uint32_t dest;
> > > +uint8_t dest_mode;
> > > +
> > > +/* only used by MSI interrupt remapping */
> > > +uint8_t redir_hint;
> > > +uint8_t msi_addr_last_bits;
> > > +};
> > 
> > This is VTDIrq.
> > 
> > We're having some similar codes between the vt-d and amd-vi ir
> > implementations.  I'm thinking to what extend we could share the code.
> > I don't think it would worth it if we need to spend too much extra
> > time, but things like this one might still be good candidates that we
> > can share them at the very beginning since it won't be that hard (like
> > what you did in patch 1).
> > 
> > (maybe also things like amdvi_generate_msi_message but I'm not sure)
> > 
> 
> I will see what can be done to move the VTDIrq struct and msi message
> generation in common file.

That'll be nice.  Thank you.

-- 
Peter Xu



Re: [Qemu-devel] [PATCH 3/6] x86_iommu/amd: Add interrupt remap support when VAPIC is not enabled

2018-09-12 Thread Brijesh Singh



Thanks for the quick review feedback.

On 09/11/2018 10:37 PM, Peter Xu wrote:

On Tue, Sep 11, 2018 at 11:49:46AM -0500, Brijesh Singh wrote:

Emulate the interrupt remapping support when guest virtual APIC is
not enabled.

See IOMMU spec: https://support.amd.com/TechDocs/48882_IOMMU.pdf
(section 2.2.5.1) for details information.

When VAPIC is not enabled, it uses interrupt remapping as defined in
Table 20 and Figure 15 from IOMMU spec.

Cc: "Michael S. Tsirkin" 
Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Eduardo Habkost 
Cc: Marcel Apfelbaum 
Cc: Tom Lendacky 
Cc: Suravee Suthikulpanit 
Signed-off-by: Brijesh Singh 
---
  hw/i386/amd_iommu.c  | 187 +++
  hw/i386/amd_iommu.h  |  60 -
  hw/i386/trace-events |   7 ++
  3 files changed, 253 insertions(+), 1 deletion(-)

diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
index 572ba0a..5ac19df 100644
--- a/hw/i386/amd_iommu.c
+++ b/hw/i386/amd_iommu.c
@@ -28,6 +28,8 @@
  #include "qemu/error-report.h"
  #include "hw/i386/apic_internal.h"
  #include "trace.h"
+#include "cpu.h"
+#include "hw/i386/apic-msidef.h"
  
  /* used AMD-Vi MMIO registers */

  const char *amdvi_mmio_low[] = {
@@ -1027,6 +1029,119 @@ static IOMMUTLBEntry amdvi_translate(IOMMUMemoryRegion 
*iommu, hwaddr addr,
  return ret;
  }
  
+static int amdvi_get_irte(AMDVIState *s, MSIMessage *origin, uint64_t *dte,

+  union irte *irte, uint16_t devid)
+{
+uint64_t irte_root, offset;
+
+irte_root = dte[2] & AMDVI_IR_PHYS_ADDR_MASK;
+offset = (origin->data & AMDVI_IRTE_OFFSET) << 2;
+
+trace_amdvi_ir_irte(irte_root, offset);
+
+if (dma_memory_read(_space_memory, irte_root + offset,
+irte, sizeof(*irte))) {
+trace_amdvi_ir_err("failed to get irte");
+return -AMDVI_IR_GET_IRTE;
+}
+
+trace_amdvi_ir_irte_val(irte->val);
+
+return 0;
+}
+
+static void amdvi_generate_msi_message(struct AMDVIIrq *irq, MSIMessage *out)
+{
+out->address = APIC_DEFAULT_ADDRESS | \
+(irq->dest_mode << MSI_ADDR_DEST_MODE_SHIFT) | \
+(irq->redir_hint << MSI_ADDR_REDIRECTION_SHIFT) | \
+(irq->dest << MSI_ADDR_DEST_ID_SHIFT);
+
+out->data = (irq->vector << MSI_DATA_VECTOR_SHIFT) | \
+(irq->delivery_mode << MSI_DATA_DELIVERY_MODE_SHIFT);
+
+trace_amdvi_ir_generate_msi_message(irq->vector, irq->delivery_mode,
+irq->dest_mode, irq->dest, irq->redir_hint);
+}
+
+static int amdvi_int_remap_legacy(AMDVIState *iommu,
+  MSIMessage *origin,
+  MSIMessage *translated,
+  uint64_t *dte,
+  struct AMDVIIrq *irq,
+  uint16_t sid)
+{
+int ret;
+union irte irte;
+
+/* get interrupt remapping table */


... get interrupt remapping table "entry"? :)

I see similar wordings in your spec, e.g., Table 20 is named as
"Interrupt Remapping Table Fields - Basic Format", but actually AFAICT
it's for the entry fields.  I'm confused a bit with them.



I was too much in spec hence used the same wording as spec. But, I agree
with you that we should use "... interrupt remapping table entry".



+ret = amdvi_get_irte(iommu, origin, dte, , sid);
+if (ret < 0) {
+return ret;
+}
+
+if (!irte.fields.valid) {
+trace_amdvi_ir_target_abort("RemapEn is disabled");
+return -AMDVI_IR_TARGET_ABORT;
+}
+
+if (irte.fields.guest_mode) {
+trace_amdvi_ir_target_abort("guest mode is not zero");
+return -AMDVI_IR_TARGET_ABORT;
+}
+
+if (irte.fields.int_type > AMDVI_IOAPIC_INT_TYPE_ARBITRATED) {
+trace_amdvi_ir_target_abort("reserved int_type");
+return -AMDVI_IR_TARGET_ABORT;
+}
+
+irq->delivery_mode = irte.fields.int_type;
+irq->vector = irte.fields.vector;
+irq->dest_mode = irte.fields.dm;
+irq->redir_hint = irte.fields.rq_eoi;
+irq->dest = irte.fields.destination;
+
+return 0;
+}
+
+static int __amdvi_int_remap_msi(AMDVIState *iommu,
+ MSIMessage *origin,
+ MSIMessage *translated,
+ uint64_t *dte,
+ uint16_t sid)
+{
+int ret;
+uint8_t int_ctl;
+struct AMDVIIrq irq = { 0 };
+
+int_ctl = (dte[2] >> AMDVI_IR_INTCTL_SHIFT) & 3;
+trace_amdvi_ir_intctl(int_ctl);
+
+switch (int_ctl) {
+case AMDVI_IR_INTCTL_PASS:
+memcpy(translated, origin, sizeof(*origin));
+return 0;
+case AMDVI_IR_INTCTL_REMAP:
+break;
+case AMDVI_IR_INTCTL_ABORT:
+trace_amdvi_ir_target_abort("int_ctl abort");
+return -AMDVI_IR_TARGET_ABORT;
+default:
+trace_amdvi_ir_target_abort("int_ctl reserved");
+return -AMDVI_IR_TARGET_ABORT;
+}
+
+ret = amdvi_int_remap_legacy(iommu, 

Re: [Qemu-devel] [PATCH 3/6] x86_iommu/amd: Add interrupt remap support when VAPIC is not enabled

2018-09-11 Thread Peter Xu
On Tue, Sep 11, 2018 at 11:49:46AM -0500, Brijesh Singh wrote:
> Emulate the interrupt remapping support when guest virtual APIC is
> not enabled.
> 
> See IOMMU spec: https://support.amd.com/TechDocs/48882_IOMMU.pdf
> (section 2.2.5.1) for details information.
> 
> When VAPIC is not enabled, it uses interrupt remapping as defined in
> Table 20 and Figure 15 from IOMMU spec.
> 
> Cc: "Michael S. Tsirkin" 
> Cc: Paolo Bonzini 
> Cc: Richard Henderson 
> Cc: Eduardo Habkost 
> Cc: Marcel Apfelbaum 
> Cc: Tom Lendacky 
> Cc: Suravee Suthikulpanit 
> Signed-off-by: Brijesh Singh 
> ---
>  hw/i386/amd_iommu.c  | 187 
> +++
>  hw/i386/amd_iommu.h  |  60 -
>  hw/i386/trace-events |   7 ++
>  3 files changed, 253 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
> index 572ba0a..5ac19df 100644
> --- a/hw/i386/amd_iommu.c
> +++ b/hw/i386/amd_iommu.c
> @@ -28,6 +28,8 @@
>  #include "qemu/error-report.h"
>  #include "hw/i386/apic_internal.h"
>  #include "trace.h"
> +#include "cpu.h"
> +#include "hw/i386/apic-msidef.h"
>  
>  /* used AMD-Vi MMIO registers */
>  const char *amdvi_mmio_low[] = {
> @@ -1027,6 +1029,119 @@ static IOMMUTLBEntry 
> amdvi_translate(IOMMUMemoryRegion *iommu, hwaddr addr,
>  return ret;
>  }
>  
> +static int amdvi_get_irte(AMDVIState *s, MSIMessage *origin, uint64_t *dte,
> +  union irte *irte, uint16_t devid)
> +{
> +uint64_t irte_root, offset;
> +
> +irte_root = dte[2] & AMDVI_IR_PHYS_ADDR_MASK;
> +offset = (origin->data & AMDVI_IRTE_OFFSET) << 2;
> +
> +trace_amdvi_ir_irte(irte_root, offset);
> +
> +if (dma_memory_read(_space_memory, irte_root + offset,
> +irte, sizeof(*irte))) {
> +trace_amdvi_ir_err("failed to get irte");
> +return -AMDVI_IR_GET_IRTE;
> +}
> +
> +trace_amdvi_ir_irte_val(irte->val);
> +
> +return 0;
> +}
> +
> +static void amdvi_generate_msi_message(struct AMDVIIrq *irq, MSIMessage *out)
> +{
> +out->address = APIC_DEFAULT_ADDRESS | \
> +(irq->dest_mode << MSI_ADDR_DEST_MODE_SHIFT) | \
> +(irq->redir_hint << MSI_ADDR_REDIRECTION_SHIFT) | \
> +(irq->dest << MSI_ADDR_DEST_ID_SHIFT);
> +
> +out->data = (irq->vector << MSI_DATA_VECTOR_SHIFT) | \
> +(irq->delivery_mode << MSI_DATA_DELIVERY_MODE_SHIFT);
> +
> +trace_amdvi_ir_generate_msi_message(irq->vector, irq->delivery_mode,
> +irq->dest_mode, irq->dest, irq->redir_hint);
> +}
> +
> +static int amdvi_int_remap_legacy(AMDVIState *iommu,
> +  MSIMessage *origin,
> +  MSIMessage *translated,
> +  uint64_t *dte,
> +  struct AMDVIIrq *irq,
> +  uint16_t sid)
> +{
> +int ret;
> +union irte irte;
> +
> +/* get interrupt remapping table */

... get interrupt remapping table "entry"? :)

I see similar wordings in your spec, e.g., Table 20 is named as
"Interrupt Remapping Table Fields - Basic Format", but actually AFAICT
it's for the entry fields.  I'm confused a bit with them.

> +ret = amdvi_get_irte(iommu, origin, dte, , sid);
> +if (ret < 0) {
> +return ret;
> +}
> +
> +if (!irte.fields.valid) {
> +trace_amdvi_ir_target_abort("RemapEn is disabled");
> +return -AMDVI_IR_TARGET_ABORT;
> +}
> +
> +if (irte.fields.guest_mode) {
> +trace_amdvi_ir_target_abort("guest mode is not zero");
> +return -AMDVI_IR_TARGET_ABORT;
> +}
> +
> +if (irte.fields.int_type > AMDVI_IOAPIC_INT_TYPE_ARBITRATED) {
> +trace_amdvi_ir_target_abort("reserved int_type");
> +return -AMDVI_IR_TARGET_ABORT;
> +}
> +
> +irq->delivery_mode = irte.fields.int_type;
> +irq->vector = irte.fields.vector;
> +irq->dest_mode = irte.fields.dm;
> +irq->redir_hint = irte.fields.rq_eoi;
> +irq->dest = irte.fields.destination;
> +
> +return 0;
> +}
> +
> +static int __amdvi_int_remap_msi(AMDVIState *iommu,
> + MSIMessage *origin,
> + MSIMessage *translated,
> + uint64_t *dte,
> + uint16_t sid)
> +{
> +int ret;
> +uint8_t int_ctl;
> +struct AMDVIIrq irq = { 0 };
> +
> +int_ctl = (dte[2] >> AMDVI_IR_INTCTL_SHIFT) & 3;
> +trace_amdvi_ir_intctl(int_ctl);
> +
> +switch (int_ctl) {
> +case AMDVI_IR_INTCTL_PASS:
> +memcpy(translated, origin, sizeof(*origin));
> +return 0;
> +case AMDVI_IR_INTCTL_REMAP:
> +break;
> +case AMDVI_IR_INTCTL_ABORT:
> +trace_amdvi_ir_target_abort("int_ctl abort");
> +return -AMDVI_IR_TARGET_ABORT;
> +default:
> +trace_amdvi_ir_target_abort("int_ctl reserved");
> +return -AMDVI_IR_TARGET_ABORT;
> +}

[Qemu-devel] [PATCH 3/6] x86_iommu/amd: Add interrupt remap support when VAPIC is not enabled

2018-09-11 Thread Brijesh Singh
Emulate the interrupt remapping support when guest virtual APIC is
not enabled.

See IOMMU spec: https://support.amd.com/TechDocs/48882_IOMMU.pdf
(section 2.2.5.1) for details information.

When VAPIC is not enabled, it uses interrupt remapping as defined in
Table 20 and Figure 15 from IOMMU spec.

Cc: "Michael S. Tsirkin" 
Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Eduardo Habkost 
Cc: Marcel Apfelbaum 
Cc: Tom Lendacky 
Cc: Suravee Suthikulpanit 
Signed-off-by: Brijesh Singh 
---
 hw/i386/amd_iommu.c  | 187 +++
 hw/i386/amd_iommu.h  |  60 -
 hw/i386/trace-events |   7 ++
 3 files changed, 253 insertions(+), 1 deletion(-)

diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
index 572ba0a..5ac19df 100644
--- a/hw/i386/amd_iommu.c
+++ b/hw/i386/amd_iommu.c
@@ -28,6 +28,8 @@
 #include "qemu/error-report.h"
 #include "hw/i386/apic_internal.h"
 #include "trace.h"
+#include "cpu.h"
+#include "hw/i386/apic-msidef.h"
 
 /* used AMD-Vi MMIO registers */
 const char *amdvi_mmio_low[] = {
@@ -1027,6 +1029,119 @@ static IOMMUTLBEntry amdvi_translate(IOMMUMemoryRegion 
*iommu, hwaddr addr,
 return ret;
 }
 
+static int amdvi_get_irte(AMDVIState *s, MSIMessage *origin, uint64_t *dte,
+  union irte *irte, uint16_t devid)
+{
+uint64_t irte_root, offset;
+
+irte_root = dte[2] & AMDVI_IR_PHYS_ADDR_MASK;
+offset = (origin->data & AMDVI_IRTE_OFFSET) << 2;
+
+trace_amdvi_ir_irte(irte_root, offset);
+
+if (dma_memory_read(_space_memory, irte_root + offset,
+irte, sizeof(*irte))) {
+trace_amdvi_ir_err("failed to get irte");
+return -AMDVI_IR_GET_IRTE;
+}
+
+trace_amdvi_ir_irte_val(irte->val);
+
+return 0;
+}
+
+static void amdvi_generate_msi_message(struct AMDVIIrq *irq, MSIMessage *out)
+{
+out->address = APIC_DEFAULT_ADDRESS | \
+(irq->dest_mode << MSI_ADDR_DEST_MODE_SHIFT) | \
+(irq->redir_hint << MSI_ADDR_REDIRECTION_SHIFT) | \
+(irq->dest << MSI_ADDR_DEST_ID_SHIFT);
+
+out->data = (irq->vector << MSI_DATA_VECTOR_SHIFT) | \
+(irq->delivery_mode << MSI_DATA_DELIVERY_MODE_SHIFT);
+
+trace_amdvi_ir_generate_msi_message(irq->vector, irq->delivery_mode,
+irq->dest_mode, irq->dest, irq->redir_hint);
+}
+
+static int amdvi_int_remap_legacy(AMDVIState *iommu,
+  MSIMessage *origin,
+  MSIMessage *translated,
+  uint64_t *dte,
+  struct AMDVIIrq *irq,
+  uint16_t sid)
+{
+int ret;
+union irte irte;
+
+/* get interrupt remapping table */
+ret = amdvi_get_irte(iommu, origin, dte, , sid);
+if (ret < 0) {
+return ret;
+}
+
+if (!irte.fields.valid) {
+trace_amdvi_ir_target_abort("RemapEn is disabled");
+return -AMDVI_IR_TARGET_ABORT;
+}
+
+if (irte.fields.guest_mode) {
+trace_amdvi_ir_target_abort("guest mode is not zero");
+return -AMDVI_IR_TARGET_ABORT;
+}
+
+if (irte.fields.int_type > AMDVI_IOAPIC_INT_TYPE_ARBITRATED) {
+trace_amdvi_ir_target_abort("reserved int_type");
+return -AMDVI_IR_TARGET_ABORT;
+}
+
+irq->delivery_mode = irte.fields.int_type;
+irq->vector = irte.fields.vector;
+irq->dest_mode = irte.fields.dm;
+irq->redir_hint = irte.fields.rq_eoi;
+irq->dest = irte.fields.destination;
+
+return 0;
+}
+
+static int __amdvi_int_remap_msi(AMDVIState *iommu,
+ MSIMessage *origin,
+ MSIMessage *translated,
+ uint64_t *dte,
+ uint16_t sid)
+{
+int ret;
+uint8_t int_ctl;
+struct AMDVIIrq irq = { 0 };
+
+int_ctl = (dte[2] >> AMDVI_IR_INTCTL_SHIFT) & 3;
+trace_amdvi_ir_intctl(int_ctl);
+
+switch (int_ctl) {
+case AMDVI_IR_INTCTL_PASS:
+memcpy(translated, origin, sizeof(*origin));
+return 0;
+case AMDVI_IR_INTCTL_REMAP:
+break;
+case AMDVI_IR_INTCTL_ABORT:
+trace_amdvi_ir_target_abort("int_ctl abort");
+return -AMDVI_IR_TARGET_ABORT;
+default:
+trace_amdvi_ir_target_abort("int_ctl reserved");
+return -AMDVI_IR_TARGET_ABORT;
+}
+
+ret = amdvi_int_remap_legacy(iommu, origin, translated, dte, , sid);
+if (ret < 0) {
+return ret;
+}
+
+/* Translate AMDVIIrq to MSI message */
+amdvi_generate_msi_message(, translated);
+
+return 0;
+}
+
 /* Interrupt remapping for MSI/MSI-X entry */
 static int amdvi_int_remap_msi(AMDVIState *iommu,
MSIMessage *origin,
@@ -1034,6 +1149,9 @@ static int amdvi_int_remap_msi(AMDVIState *iommu,
uint16_t sid)
 {
 int ret;
+uint64_t pass = 0;
+uint64_t dte[4] = { 0 };
+uint8_t