Re: [Xen-devel] [PATCH] x86/svm: Improve segment register printing in svm_vmcb_dump()

2016-11-16 Thread Suravee Suthikulanit

On 11/16/2016 10:48 AM, Boris Ostrovsky wrote:

On 11/16/2016 08:43 AM, Andrew Cooper wrote:

On 31/10/16 13:48, Andrew Cooper wrote:

This makes it more succinct and easier to read.

Before:
  (XEN) H_CR3 = 0x00042a0ec000 CleanBits = 0
  (XEN) CS: sel=0x0008, attr=0x029b, limit=0x, base=0x
  (XEN) DS: sel=0x0033, attr=0x0cf3, limit=0x, base=0x
  (XEN) SS: sel=0x0018, attr=0x0c93, limit=0x, base=0x
  (XEN) ES: sel=0x0033, attr=0x0cf3, limit=0x, base=0x
  (XEN) FS: sel=0x0033, attr=0x0cf3, limit=0x, base=0x
  (XEN) GS: sel=0x0033, attr=0x0cf3, limit=0x, base=0x
  (XEN) GDTR: sel=0x, attr=0x, limit=0x0067, base=0x0010d100
  (XEN) LDTR: sel=0x, attr=0x, limit=0x, base=0x
  (XEN) IDTR: sel=0x, attr=0x, limit=0x0fff, base=0x00110900
  (XEN) TR: sel=0x0038, attr=0x0089, limit=0x0067, base=0x0010d020

After:
  (XEN) H_CR3 = 0x00042a0ec000 CleanBits = 0
  (XEN)sel  attr  limit   base
  (XEN)   CS: 0008 0029b  
  (XEN)   DS: 0033 00cf3  
  (XEN)   SS: 0018 00c93  
  (XEN)   ES: 0033 00cf3  
  (XEN)   FS: 0033 00cf3  
  (XEN)   GS: 0033 00cf3  
  (XEN) GDTR:  0 0067 0010d100
  (XEN) LDTR:  0  
  (XEN) IDTR:  0 0fff 00110900
  (XEN)   TR: 0038 00089 0067 0010d020

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Boris Ostrovsky 
CC: Suravee Suthikulpanit 

AMD Ping?


Sorry, I thought you were going to respin with the updated attribute
format, but I should have responded anyway.

Reviewed-by: Boris Ostrovsky 




Reviewed-by: Suravee Suthikulpanit 

Thanks,
S



Re: [Xen-devel] [PATCH] AMD IOMMU: correctly propagate errors from amd_iommu_init()

2016-06-15 Thread Suravee Suthikulanit

On 6/14/2016 4:09 AM, Andrew Cooper wrote:

On 14/06/16 10:03, Jan Beulich wrote:

... instead of using -ENODEV for any kind of error. It in particular
addresses Coverity ID 1362694 (introduced by commit eb48587210 ["AMD
IOMMU: introduce support for IVHD block type 11h"]).

Coverity ID: 1362694

Signed-off-by: Jan Beulich 


Reviewed-by: Andrew Cooper 




Reviewed-by: Suravee Suthikulpanit 
Tested-by: Suravee Suthikulpanit 

Thank you for the fix up.

Suravee



Re: [Xen-devel] [PATCH v7 03/11] IOMMU: propagate IOMMU Device-TLB flush error up to IOMMU unmapping (top level ones)

2016-06-09 Thread Suravee Suthikulanit

On 6/8/2016 9:54 AM, Jan Beulich wrote:

On 08.06.16 at 10:58,  wrote:

> From: Quan Xu 
>
> Signed-off-by: Quan Xu 
> Acked-by: Kevin Tian 
>
> CC: Stefano Stabellini 
> CC: Julien Grall 
> CC: Kevin Tian 
> CC: Feng Wu 
> CC: Jan Beulich 
> CC: Andrew Cooper 

CC: Suravee Suthikulpanit 



Acked-by: Suravee Suthikulpanit 

Thanks,
Suravee



Re: [Xen-devel] [PATCH v7 07/11] IOMMU: propagate IOMMU Device-TLB flush error up to IOMMU suspending (top level ones)

2016-06-09 Thread Suravee Suthikulanit

On 6/8/2016 3:59 AM, Xu, Quan wrote:

From: Quan Xu 

Signed-off-by: Quan Xu 

CC: Jan Beulich 
CC: Liu Jinsong 
CC: Keir Fraser 
CC: Andrew Cooper 
CC: Suravee Suthikulpanit 
CC: Stefano Stabellini 
CC: Julien Grall 
CC: Kevin Tian 
CC: Feng Wu 

v7:
  1. return SAVED_ALL at the bottom of device_power_down(), instead
     of SAVED_NONE.
  2. drop the 'if ( error > 0 )', calling device_power_up(error)
     without any if().
  3. for vtd_suspend():
     - drop pointless initializer.
     - return 0 at the bottom to make obvious that no error path
       comes there.


Shouldn't the change log for v7 go ...

---

... HERE instead, so that it doesn't end up in the commit log?


For the AMD part,
Acked-by: Suravee Suthikulpanit 

Thanks,
Suravee



Re: [Xen-devel] [PATCH v7 04/11] IOMMU: propagate IOMMU Device-TLB flush error up to IOMMU mapping (top level ones)

2016-06-09 Thread Suravee Suthikulanit

On 6/8/2016 9:43 AM, Jan Beulich wrote:

On 08.06.16 at 10:58,  wrote:

> From: Quan Xu 
>
> Signed-off-by: Quan Xu 
> Acked-by: Kevin Tian 

Reviewed-by: Jan Beulich 


Acked-by: Suravee Suthikulpanit 

Thanks,
Suravee




Re: [Xen-devel] [PATCH v3 2/3] svm: iommu: Only call guest_iommu_init() after initialized HVM domain

2016-05-31 Thread Suravee Suthikulanit

On 5/26/2016 10:44 AM, Jan Beulich wrote:

Suravee Suthikulanit <suravee.suthikulpa...@amd.com> 05/25/16 9:01 PM >>>

On 5/23/2016 6:54 AM, Jan Beulich wrote:

On 22.05.16 at 01:42, <suravee.suthikulpa...@amd.com> wrote:

From: Suravee Suthikulpanit <suravee.suthikulpa...@amd.com>

The guest_iommu_init() is currently called by the following code path:

arch/x86/domain.c: arch_domain_create()
  |- drivers/passthrough/iommu.c: iommu_domain_init()
      |- drivers/passthrough/amd/pci_amd_iommu.c: amd_iommu_domain_init()
          |- drivers/passthrough/amd/iommu_guest.c: guest_iommu_init()

At this point, hvm_domain_initialise() has not been called yet,
so register_mmio_handler() in guest_iommu_init() silently fails.
This patch moves iommu_domain_init() to a later point, after
hvm_domain_initialise(), instead.


That's one possible approach, which I continue to be not really
happy with. guest_iommu_init() really is HVM-specific, so maybe
no longer calling it from amd_iommu_domain_init() would be the
better solution (instead calling it from hvm_domain_initialise()
would then seem to be the better option). Thoughts?


Then, this goes back to the approach I proposed in v1 of this patch
series, where I call guest_iommu_init/destroy() in
svm_domain_initialise/destroy().

However, I'm still not quite clear on why iommu_domain_init() is
needed before hvm_domain_initialise().


I think the two things are only lightly related. Changing the order of calls is
generally fine, but recognizing that guest_iommu_init() really would better be
called elsewhere makes that re-ordering simply unnecessary.

Jan


So, let's discuss these two things separately. I would propose:

1. Let's just remove guest_iommu_init() for now, since it's not
functioning and, according to Jan, it isn't being called from the
proper place. We can revisit this when we re-introduce and fully
test the feature.


2. As for the ordering of iommu_domain_init() and
hvm_domain_initialise(), let's continue the discussion to find the
proper ordering, if it needs changing.


Let me know what you guys think.
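
For reference, the v1-style approach (calling the guest IOMMU hooks from
the SVM domain setup/teardown paths) would look roughly like the untested
sketch below; the exact placement inside svm_domain_initialise() is an
assumption on my part, and error handling is omitted:

/* Rough, untested sketch of the v1 approach: have SVM call the guest
 * IOMMU hooks instead of amd_iommu_domain_init() doing it. */
static int svm_domain_initialise(struct domain *d)
{
    /* ... existing SVM setup ... */

    /* Assumption: by the time this hook runs, the HVM I/O handler
     * infrastructure is set up, so register_mmio_handler() inside
     * guest_iommu_init() no longer fails silently. */
    guest_iommu_init(d);

    return 0;
}

static void svm_domain_destroy(struct domain *d)
{
    guest_iommu_destroy(d);

    /* ... existing SVM teardown ... */
}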

Thanks,
Suravee



Re: [Xen-devel] [PATCH v3 1/3] x86/hvm: Add check when register io handler

2016-05-25 Thread Suravee Suthikulanit

On 5/23/2016 6:46 AM, Jan Beulich wrote:

On 22.05.16 at 01:42,  wrote:

--- a/xen/arch/x86/hvm/intercept.c
+++ b/xen/arch/x86/hvm/intercept.c
@@ -258,6 +258,8 @@ struct hvm_io_handler *hvm_next_io_handler(struct domain *d)
 {
     unsigned int i = d->arch.hvm_domain.io_handler_count++;
 
+    ASSERT( d->arch.hvm_domain.io_handler );


Am I wrong in understanding that without patch 2 this ASSERT() will
actually trigger? In which case the patch order would be wrong (but
that could be taken care of during commit).

Jan



Right, I can fix this in V4 if we decide to change the 
iommu_domain_initialise() ordering.


Suravee



Re: [Xen-devel] [PATCH v3 2/3] svm: iommu: Only call guest_iommu_init() after initialized HVM domain

2016-05-25 Thread Suravee Suthikulanit

On 5/23/2016 6:54 AM, Jan Beulich wrote:

On 22.05.16 at 01:42,  wrote:

From: Suravee Suthikulpanit 

The guest_iommu_init() is currently called by the following code path:

arch/x86/domain.c: arch_domain_create()
  |- drivers/passthrough/iommu.c: iommu_domain_init()
      |- drivers/passthrough/amd/pci_amd_iommu.c: amd_iommu_domain_init()
          |- drivers/passthrough/amd/iommu_guest.c: guest_iommu_init()

At this point, hvm_domain_initialise() has not been called yet,
so register_mmio_handler() in guest_iommu_init() silently fails.
This patch moves iommu_domain_init() to a later point, after
hvm_domain_initialise(), instead.


That's one possible approach, which I continue to be not really
happy with. guest_iommu_init() really is HVM-specific, so maybe
no longer calling it from amd_iommu_domain_init() would be the
better solution (instead calling it from hvm_domain_initialise()
would then seem to be the better option). Thoughts?


Then, this goes back to the approach I proposed in v1 of this patch
series, where I call guest_iommu_init/destroy() in
svm_domain_initialise/destroy().


However, I'm still not quite clear on why iommu_domain_init() is
needed before hvm_domain_initialise().




In any event, the choice of ...


@@ -675,6 +675,9 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags,
 
     return 0;
 
+ fail_1:
+    if ( has_hvm_container_domain(d) )
+        hvm_domain_destroy(d);
  fail:
     d->is_dying = DOMDYING_dead;
     psr_domain_free(d);


... the new label name is sub-optimal. Please pick something more
descriptive, e.g. "iommu_fail", if the current approach is to be
retained.

Jan



If we go with this approach, I will make this change.

Thanks,
Suravee



Re: [Xen-devel] [PATCH v3 3/3] AMD IOMMU: Check io_handler before registering mmio handler

2016-05-25 Thread Suravee Suthikulanit

On 5/23/2016 3:23 AM, Paul Durrant wrote:

-Original Message-
> From: suravee.suthikulpa...@amd.com
> [mailto:suravee.suthikulpa...@amd.com]
> Sent: 22 May 2016 00:43
> To: xen-devel@lists.xen.org; Paul Durrant; jbeul...@suse.com; George
> Dunlap
> Cc: Keir (Xen.org); Suravee Suthikulpanit; Suravee Suthikulapanit
> Subject: [PATCH v3 3/3] AMD IOMMU: Check io_handler before registering
> mmio handler
>
> From: Suravee Suthikulpanit 
>
> guest_iommu_init tries to register mmio handler before HVM domain
> is initialized. This cause registration to silently failing.
> This patch adds a sanitiy check and puts out error message.
>
> Signed-off-by: Suravee Suthikulapanit 

This patch is now defunct, isn't it?

  Paul



It is no longer required, but I think this is a good sanity check in
case something changes in the future and causes this to be called
before the HVM I/O handler is initialized.
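
Something along these lines inside guest_iommu_init() (illustrative
sketch only; the error value is picked arbitrarily here):

/* Illustrative sketch of the check under discussion: bail out early if
 * the HVM I/O handler array has not been set up yet, and say so,
 * instead of failing silently. */
if ( !d->arch.hvm_domain.io_handler )
{
    AMD_IOMMU_DEBUG("Error: HVM I/O handlers not initialised; "
                    "cannot register guest IOMMU MMIO handler\n");
    return -ENODEV;
}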


Thanks,
Suravee



Re: [Xen-devel] [PATCH v2] AMD IOMMU: Introduce support for IVHD block type 11h

2016-05-25 Thread Suravee Suthikulanit

Thanks for these review comments. I'll update this patch and send out v3.

Suravee



Re: [Xen-devel] [PATCH] IOMMU/x86: avoid pages without GFN in page table creation/updating

2015-05-13 Thread Suravee Suthikulanit

Acked-by: Suravee Suthikulpanit <suravee.suthikulpa...@amd.com>

Thanks,

Suravee

On 5/7/2015 7:59 AM, Jan Beulich wrote:

Handing INVALID_GFN to functions like hd->platform_ops->map_page()
just can't do any good, and the ioreq server code results in such pages
being on the list of ones owned by a guest.

While - as suggested by Tim - we should use get_gfn()/put_gfn() there
to eliminate races, we really can't due to holding the domain's page
alloc lock. Ultimately arch_iommu_populate_page_table() may need to be
switched to be GFN based. Here is what Tim said in this regard:
  "Ideally this loop would be iterating over all gfns in the p2m rather
  than over all owned MFNs.  As long as needs_iommu gets set first,
  such a loop could safely be paused and restarted without worrying
  about concurrent updates.  The code could even stay in this file,
  though exposing an iterator from the p2m code would be a lot more
  efficient."

Original by Andrew Cooper <andrew.coop...@citrix.com>, using further
suggestions from Tim Deegan <t...@xen.org>.

Reported-by: Sander Eikelenboom <li...@eikelenboom.it>
Signed-off-by: Jan Beulich <jbeul...@suse.com>
Tested-by: Sander Eikelenboom <li...@eikelenboom.it>
Acked-by: Tim Deegan <t...@xen.org>

--- a/xen/drivers/passthrough/amd/iommu_map.c
+++ b/xen/drivers/passthrough/amd/iommu_map.c
@@ -557,6 +557,10 @@ static int update_paging_mode(struct dom
     unsigned long old_root_mfn;
     struct hvm_iommu *hd = domain_hvm_iommu(d);
 
+    if ( gfn == INVALID_MFN )
+        return -EADDRNOTAVAIL;
+    ASSERT(!(gfn >> DEFAULT_DOMAIN_ADDRESS_WIDTH));
+
     level = hd->arch.paging_mode;
     old_root = hd->arch.root_table;
     offset = gfn >> (PTE_PER_TABLE_SHIFT * (level - 1));
@@ -729,12 +733,15 @@ int amd_iommu_unmap_page(struct domain *
      * we might need a deeper page table for lager gfn now */
     if ( is_hvm_domain(d) )
     {
-        if ( update_paging_mode(d, gfn) )
+        int rc = update_paging_mode(d, gfn);
+
+        if ( rc )
         {
             spin_unlock(&hd->arch.mapping_lock);
             AMD_IOMMU_DEBUG("Update page mode failed gfn = %lx\n", gfn);
-            domain_crash(d);
-            return -EFAULT;
+            if ( rc != -EADDRNOTAVAIL )
+                domain_crash(d);
+            return rc;
         }
     }
 
--- a/xen/drivers/passthrough/vtd/iommu.h
+++ b/xen/drivers/passthrough/vtd/iommu.h
@@ -482,7 +482,6 @@ struct qinval_entry {
 #define VTD_PAGE_TABLE_LEVEL_3  3
 #define VTD_PAGE_TABLE_LEVEL_4  4
 
-#define DEFAULT_DOMAIN_ADDRESS_WIDTH 48
 #define MAX_IOMMU_REGS 0xc0
 
 extern struct list_head acpi_drhd_units;
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -59,10 +59,17 @@ int arch_iommu_populate_page_table(struc
         if ( has_hvm_container_domain(d) ||
             (page->u.inuse.type_info & PGT_type_mask) == PGT_writable_page )
         {
-            BUG_ON(SHARED_M2P(mfn_to_gmfn(d, page_to_mfn(page))));
-            rc = hd->platform_ops->map_page(
-                d, mfn_to_gmfn(d, page_to_mfn(page)), page_to_mfn(page),
-                IOMMUF_readable|IOMMUF_writable);
+            unsigned long mfn = page_to_mfn(page);
+            unsigned long gfn = mfn_to_gmfn(d, mfn);
+
+            if ( gfn != INVALID_MFN )
+            {
+                ASSERT(!(gfn >> DEFAULT_DOMAIN_ADDRESS_WIDTH));
+                BUG_ON(SHARED_M2P(gfn));
+                rc = hd->platform_ops->map_page(d, gfn, mfn,
+                                                IOMMUF_readable |
+                                                IOMMUF_writable);
+            }
             if ( rc )
             {
                 page_list_add(page, &d->page_list);
--- a/xen/include/asm-x86/hvm/iommu.h
+++ b/xen/include/asm-x86/hvm/iommu.h
@@ -46,6 +46,8 @@ struct g2m_ioport {
     unsigned int np;
 };
 
+#define DEFAULT_DOMAIN_ADDRESS_WIDTH 48
+
 struct arch_hvm_iommu
 {
     u64 pgd_maddr; /* io page directory machine address */
--- a/xen/include/asm-x86/hvm/svm/amd-iommu-defs.h
+++ b/xen/include/asm-x86/hvm/svm/amd-iommu-defs.h
@@ -464,8 +464,6 @@
 #define IOMMU_CONTROL_DISABLED  0
 #define IOMMU_CONTROL_ENABLED   1
 
-#define DEFAULT_DOMAIN_ADDRESS_WIDTH    48
-
 /* interrupt remapping table */
 #define INT_REMAP_ENTRY_REMAPEN_MASK    0x0001
 #define INT_REMAP_ENTRY_REMAPEN_SHIFT   0
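
(For reference, the GFN-based loop Tim describes might look roughly like
the following untested sketch, rather than the MFN-based walk this patch
touches; the p2m iterator he mentions does not exist yet, so this simply
walks the gfn space and would need preemption/restart handling:)

/* Hypothetical, untested sketch only -- not part of this patch. */
static int populate_iommu_by_gfn(struct domain *d)
{
    struct hvm_iommu *hd = domain_hvm_iommu(d);
    unsigned long gfn, max_gfn = p2m_get_hostp2m(d)->max_mapped_pfn;
    int rc = 0;

    for ( gfn = 0; gfn <= max_gfn && !rc; gfn++ )
    {
        p2m_type_t t;
        mfn_t mfn = get_gfn_query(d, gfn, &t);

        if ( p2m_is_ram(t) && mfn_x(mfn) != INVALID_MFN )
            rc = hd->platform_ops->map_page(d, gfn, mfn_x(mfn),
                                            IOMMUF_readable |
                                            IOMMUF_writable);
        put_gfn(d, gfn);
        /* A real implementation would need to be pausable/restartable
         * here, as the quoted text explains. */
    }

    return rc;
}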








Re: [Xen-devel] [PATCH 5/5] AMD IOMMU: widen NUMA nodes to be allocated from

2015-03-09 Thread Suravee Suthikulanit

On 3/6/2015 6:15 AM, Andrew Cooper wrote:

On 06/03/2015 07:50, Jan Beulich wrote:

On 05.03.15 at 18:30, andrew.coop...@citrix.com wrote:

On 26/02/15 13:56, Jan Beulich wrote:

--- a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
+++ b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
@@ -158,12 +158,12 @@ static inline unsigned long region_to_pa
     return (PAGE_ALIGN(addr + size) - (addr & PAGE_MASK)) >> PAGE_SHIFT;
 }
 
-static inline struct page_info* alloc_amd_iommu_pgtable(void)
+static inline struct page_info *alloc_amd_iommu_pgtable(struct domain *d)
 {
     struct page_info *pg;
     void *vaddr;
 
-    pg = alloc_domheap_page(NULL, 0);
+    pg = alloc_domheap_page(d, MEMF_no_owner);

Same comment as with the VT-d side of things.  This should be based on
the proximity information of the IOMMU, not of the owning domain.

I think I buy this argument on the VT-d side (under the assumption
that there's going to be at least one IOMMU per node), but I'm not
sure here: The most modern AMD box I have has just a single
IOMMU for 4 nodes it reports.


It is not possible for an IOMMU to cover multiple NUMA nodes' worth of
IO, because of the position it has to sit in relative to the IO root ports
and QPI/HT links.

In AMD systems, the IOMMUs live in the northbridges, meaning one per
NUMA node (as it is the northbridges which contain the HyperTransport links).

The BIOS/firmware will only report IOMMUs from northbridges which have
IO connected to their IO hypertransport link (most systems in the wild
have all IO hanging off one or two Numa nodes).  On the other hand, I
have an AMD system with 8 IOMMUs in use.



Actually, a single IOMMU can handle multiple nodes. For example, in a
multi-chip-module (MCM) setup, there could be at least 2-4 nodes sharing
one IOMMU, depending on how the platform vendor configures the system.
In server platforms, the IOMMU is in the AMD northbridge chipsets (e.g.
SR56xx). This website has an example of such a system configuration
(http://www.qdpma.com/systemarchitecture/SystemArchitecture_Opteron.html).


For the AMD IOMMU, the IVRS table specifies the PCI bus/device ranges
handled by each IOMMU. This should probably be taken into account here.




In Intel systems, there is one IOMMU for each socket (to cover the
on-chip root ports and GPU if applicable) and one IOMMU in the IOH/PCH
(depending on generation) to cover the legacy IO.


In all cases, the IOMMUs are local to a single NUMA node, and would
benefit from having the control pages and pagetables allocated in local RAM.



As stated above, this is not the case for the AMD IOMMU.

Thanks,

Suravee

~Andrew







Re: [Xen-devel] [PATCH 5/5] AMD IOMMU: widen NUMA nodes to be allocated from

2015-03-09 Thread Suravee Suthikulanit

On 3/9/2015 12:26 PM, Andrew Cooper wrote:

On 09/03/15 15:42, Suravee Suthikulanit wrote:

On 3/6/2015 6:15 AM, Andrew Cooper wrote:

On 06/03/2015 07:50, Jan Beulich wrote:

On 05.03.15 at 18:30, andrew.coop...@citrix.com wrote:

On 26/02/15 13:56, Jan Beulich wrote:

--- a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
+++ b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
@@ -158,12 +158,12 @@ static inline unsigned long region_to_pa
     return (PAGE_ALIGN(addr + size) - (addr & PAGE_MASK)) >> PAGE_SHIFT;
 }
 
-static inline struct page_info* alloc_amd_iommu_pgtable(void)
+static inline struct page_info *alloc_amd_iommu_pgtable(struct domain *d)
 {
     struct page_info *pg;
     void *vaddr;
 
-    pg = alloc_domheap_page(NULL, 0);
+    pg = alloc_domheap_page(d, MEMF_no_owner);

Same comment as with the VT-d side of things.  This should be based on
the proximity information of the IOMMU, not of the owning domain.

I think I buy this argument on the VT-d side (under the assumption
that there's going to be at least one IOMMU per node), but I'm not
sure here: The most modern AMD box I have has just a single
IOMMU for 4 nodes it reports.


It is not possible for an IOMMU to cover multiple NUMA nodes' worth of
IO, because of the position it has to sit in relative to the IO root ports
and QPI/HT links.

In AMD systems, the IOMMUs live in the northbridges, meaning one per
NUMA node (as it is the northbridges which contain the HyperTransport
links).

The BIOS/firmware will only report IOMMUs from northbridges which have
IO connected to their IO hypertransport link (most systems in the wild
have all IO hanging off one or two Numa nodes).  On the other hand, I
have an AMD system with 8 IOMMUs in use.



Actually, a single IOMMU can handle multiple nodes. For example, in a
multi-chip-module (MCM) setup, there could be at least 2-4 nodes
sharing one IOMMU, depending on how the platform vendor configures the
system. In server platforms, the IOMMU is in the AMD northbridge
chipsets (e.g. SR56xx). This website has an example of such a system
configuration
(http://www.qdpma.com/systemarchitecture/SystemArchitecture_Opteron.html).


Ok - I was basing my example on the last layout I had the manual for,
which I believe was Bulldozer.

However, my point still stands that there is an IOMMU between any IO and
RAM.  An individual IOMMU will always benefit from having its
IO page tables on the local NUMA node, rather than on the NUMA node(s)
on which the domain owning the device is running.



I agree that having the IO page tables on the NUMA node that is closest 
to the IOMMU would be beneficial.  However, I am not sure at the moment 
that this information could be easily determined. I think ACPI _PXM for 
devices should be able to provide this information, but this is optional 
and often not available.
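
If that proximity information were available, the allocation could in
principle be steered like the rough sketch below; note that the 'node'
field is hypothetical (IVRS does not provide it today), and the page
clearing/mapping done by the real alloc_amd_iommu_pgtable() is omitted:

/* Rough sketch only: prefer the IOMMU's own NUMA node, if known. */
static struct page_info *alloc_iommu_local_pgtable(
    const struct amd_iommu *iommu)
{
    /* 'iommu->node' would have to come from e.g. an ACPI _PXM entry. */
    nodeid_t node = iommu ? iommu->node : NUMA_NO_NODE;

    return alloc_domheap_page(NULL, MEMF_node(node));
}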




For the AMD IOMMU, the IVRS table specifies the PCI bus/device ranges
handled by each IOMMU. This should probably be taken into account here.


Presumably a PCI transaction must never get onto the HT bus without
having already undergone translation, or there can be no guarantee that
it would be routed via the IOMMU?  Or are you saying that there are
cases where a transaction will enter the HT bus, route sideways to an
IOMMU, undergo translation, then route back onto the HT bus to the
target RAM/processor?

~Andrew



The IOMMU sits between the PCI devices (downstream) and HT (upstream), so
all DMA transactions from downstream must go through the IOMMU. On the
other hand, the I/O page translation is handled by the IOMMU itself, and
that is separate traffic from the downstream device DMA transactions.


Suravee




Re: [Xen-devel] [PATCH for-4.6 0/5] xen: arm: Parse PCI DT nodes' ranges and interrupt-map

2015-02-18 Thread Suravee Suthikulanit

On 2/18/2015 6:48 AM, Julien Grall wrote:

Hi Suravee,

On 18/02/2015 05:28, Suravee Suthikulanit wrote:


Actually, that seems to be more related to PCI pass-through devices.
Haven't the Cavium guys already done that work to support their PCI
device pass-through?


They were working on it, but so far there is no patch series on the ML.
It would be nice to come up with a common solution (i.e. between GICv2m
and GICv3 ITS) for MSI.


Agreed, although for supporting Dom0 MSI I don't see anything that could
be shared, since these would be two totally separate drivers/interfaces.





Anyway, at this point I am able to generate a Dom0 device tree with a
correct v2m node, and I can see the Dom0 gicv2m driver probing and
initializing correctly as it would on bare metal.

# Snippet from /sys/firmware/fdt showing dom0 GIC node
    interrupt-controller {
        compatible = "arm,gic-400", "arm,cortex-a15-gic";
        #interrupt-cells = <0x3>;
        interrupt-controller;
        reg = <0x0 0xe111 0x0 0x1000 0x0 0xe112f000 0x0 0x2000>;
        phandle = <0x1>;
        #address-cells = <0x2>;
        #size-cells = <0x2>;
        ranges = <0x0 0x0 0x0 0xe110 0x0 0x10>;

        v2m {
            compatible = "arm,gic-v2m-frame";
            msi-controller;
            arm,msi-base-spi = <0x40>;
            arm,msi-num-spis = <0x100>;
            phandle = <0x5>;
            reg = <0x0 0x8 0x0 0x1000>;
        };
    };

linux:~ # dmesg | grep v2m
[0.00] GICv2m: Overriding V2M MSI_TYPER (base:64, num:256)
[0.00] GICv2m: Node v2m: range[0xe118:0xe1180fff],
SPI[64:320]

So, while setting up v2m in the hypervisor, I also call
route_irq_to_guest() for all the SPIs used for MSI (i.e. 64-320 on
Seattle), which forces the MSIs to Dom0. However, we would need to
figure out how to detach and re-route certain interrupts to a specific
DomU when passing through PCI devices in the future.


Who decides to assign MSI n to SPI x? Dom0 or Xen?


For v2m, each MSI is tied to a specific SPI. The range of SPIs is
specified in the GIC V2M_MSI_TYPER register. In Xen, we need to make
sure that these are routed to Dom0 initially, since the Dom0 GICv2m
driver is the one handling all MSI assignments.
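
Roughly, the v2m setup boils down to something like the illustrative
sketch below (the register masks follow the GICv2m frame layout as used
by the Linux driver, and the route_irq_to_guest() signature differs
between Xen versions, so treat this as a sketch only):

/* Illustrative sketch only: read the fixed MSI->SPI window from the
 * v2m frame and route every SPI in it to Dom0 for now. */
#define V2M_MSI_TYPER           0x008
#define V2M_MSI_TYPER_BASE(t)   (((t) >> 16) & 0x3ff)
#define V2M_MSI_TYPER_NUM(t)    ((t) & 0x3ff)

static void gicv2m_route_spis_to_dom0(struct domain *d, void __iomem *base)
{
    uint32_t typer = readl(base + V2M_MSI_TYPER);
    unsigned int spi = V2M_MSI_TYPER_BASE(typer);
    unsigned int end = spi + V2M_MSI_TYPER_NUM(typer);

    /* e.g. SPIs 64..320 on Seattle. */
    for ( ; spi < end; spi++ )
        route_irq_to_guest(d, spi, "gicv2m"); /* signature varies by version */
}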



Wouldn't it be possible to route the SPI dynamically when the domain
decides to use MSI n? We would need to implement PHYSDEVOP_map_pirq
for MSI.


Enabling MSI is done by each endpoint PCI device driver in the guest.
In Linux, this would mean that when the driver tries to allocate an MSI
interrupt, it would need to communicate back to Xen (possibly via a
hypercall, as you pointed out) to get the next available SPI. That is not
necessary for now. I am planning to revisit this when we try to
implement pass-through support. Let me know if you think this should be
handled differently.


Cheers,

Suravee




Re: [Xen-devel] [PATCH for-4.6 0/5] xen: arm: Parse PCI DT nodes' ranges and interrupt-map

2015-02-17 Thread Suravee Suthikulanit

On 2/17/2015 4:35 PM, Suravee Suthikulanit wrote:

On 2/17/2015 7:50 AM, Andrew Cooper wrote:

On 17/02/15 13:43, Julien Grall wrote:

(CC Jan and Andrew)

Hi Suravee,

On 17/02/15 03:04, Suravee Suthikulanit wrote:

By the way, looking at the output of xl dmesg, I saw the following
message:

(XEN) DOM0: PCI host bridge /smb/pcie@f000 ranges:
(XEN) DOM0:IO 0xefff..0xefff - 0x
(XEN) DOM0:   MEM 0x4000..0xbfff - 0x4000
(XEN) DOM0:   MEM 0x1..0x7f - 0x1
(XEN) DOM0: pci-host-generic f000.pcie: PCI host bridge to bus
:00
(XEN) DOM0: pci_bus :00: root bus resource [bus 00-7f]
(XEN) DOM0: pci_bus :00: root bus resource [io  0x-0x]
(XEN) DOM0: pci_bus :00: root bus resource [mem
0x4000-0xbfff]
(XEN) DOM0: pci_bus :00: root bus resource [mem
0x1-0x7f]
(XEN) DOM0: pci :00:00.0: of_irq_parse_pci() failed with rc=-19
(XEN) do_physdev_op 16 cmd=25: not implemented yet
(XEN) do_physdev_op 16 cmd=15: not implemented yet
(XEN) DOM0: pci :00:00.0: Failed to add - passthrough or MSI/MSI-X
might fail!
(XEN) DOM0: pci :00:02.0: of_irq_parse_pci() failed with rc=-19
(XEN) do_physdev_op 16 cmd=15: not implemented yet
(XEN) DOM0: pci :00:02.0: Failed to add - passthrough or MSI/MSI-X
might fail!
(XEN) do_physdev_op 16 cmd=15: not implemented yet
(XEN) DOM0: pci :00:02.1: Failed to add - passthrough or MSI/MSI-X
might fail!
(XEN) do_physdev_op 16 cmd=15: not implemented yet
(XEN) DOM0: pci :01:00.0: Failed to add - passthrough or MSI/MSI-X
might fail!
(XEN) DOM0: pci :01:00.1: of_irq_parse_pci() failed with rc=-22
(XEN) do_physdev_op 16 cmd=15: not implemented yet
(XEN) DOM0: pci :01:00.1: Failed to add - passthrough or MSI/MSI-X
might fail!

IIUC, this is because xen_add_device() failed, and it seems to be
related to some hypercall not being implemented. Not sure what cmd=15 is.
Any ideas?

There are 2 commands not implemented in the log:
* cmd 15: PHYSDEVOP_manage_pci_add
* cmd 25: PHYSDEVOP_pci_device_add

Linux falls back on the former because the latter is not implemented.

AFAICT, PHYSDEVOP_manage_pci_add should not be implemented for ARM
because it doesn't support segments. I suspect that it's kept for legacy
reasons on older Xen x86. Maybe Jan or Andrew have more input on this?


It needs to be kept for backwards compatibility in x86.

All new code should use PHYSDEVOP_pci_device_add.

~Andrew



Ok, now that I look at the arch/arm/physdev.c, I don't think the code
for supporting any of the PHYSDEVOP_xxx is there.  That's probably why
Xen complains. In contrast, arch/x86/physdev.c has most PHYSDEVOP_xxx
already supported.

My question is, are we supposed to be adding code to put the support in
here?

Thanks,

Suravee.


My guess is yes, and that would mean we need to enable building
drivers/pci.c when building arm code, which then opens up a can of worms
with refactoring the MSI support code from x86, etc.
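
To make the scope concrete, the ARM side would need something along the
lines of the hypothetical sketch below in its do_physdev_op() switch; it
assumes the common drivers/passthrough/pci.c code (including
pci_add_device()) gets built for arm, which is exactly the refactoring
mentioned above, and the argument list is simplified:

/* Hypothetical sketch only -- 'ret' and 'arg' come from the enclosing
 * do_physdev_op() as on x86. */
case PHYSDEVOP_pci_device_add:
{
    struct physdev_pci_device_add add;

    ret = -EFAULT;
    if ( copy_from_guest(&add, arg, 1) )
        break;

    /* pci_add_device() lives in the common pass-through code; extra
     * arguments (device info, node) are omitted/simplified here. */
    ret = pci_add_device(add.seg, add.bus, add.devfn, NULL);
    break;
}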


Suravee




Re: [Xen-devel] [PATCH for-4.6 0/5] xen: arm: Parse PCI DT nodes' ranges and interrupt-map

2015-02-17 Thread Suravee Suthikulanit

On 2/17/2015 6:31 PM, Suravee Suthikulanit wrote:

On 2/17/2015 4:35 PM, Suravee Suthikulanit wrote:

On 2/17/2015 7:50 AM, Andrew Cooper wrote:

On 17/02/15 13:43, Julien Grall wrote:

(CC Jan and Andrew)

Hi Suravee,

On 17/02/15 03:04, Suravee Suthikulanit wrote:

By the way, looking at the output of xl dmesg, I saw the following
message:

(XEN) DOM0: PCI host bridge /smb/pcie@f000 ranges:
(XEN) DOM0:IO 0xefff..0xefff - 0x
(XEN) DOM0:   MEM 0x4000..0xbfff - 0x4000
(XEN) DOM0:   MEM 0x1..0x7f - 0x1
(XEN) DOM0: pci-host-generic f000.pcie: PCI host bridge to bus
:00
(XEN) DOM0: pci_bus :00: root bus resource [bus 00-7f]
(XEN) DOM0: pci_bus :00: root bus resource [io  0x-0x]
(XEN) DOM0: pci_bus :00: root bus resource [mem
0x4000-0xbfff]
(XEN) DOM0: pci_bus :00: root bus resource [mem
0x1-0x7f]
(XEN) DOM0: pci :00:00.0: of_irq_parse_pci() failed with rc=-19
(XEN) do_physdev_op 16 cmd=25: not implemented yet
(XEN) do_physdev_op 16 cmd=15: not implemented yet
(XEN) DOM0: pci :00:00.0: Failed to add - passthrough or MSI/MSI-X
might fail!
(XEN) DOM0: pci :00:02.0: of_irq_parse_pci() failed with rc=-19
(XEN) do_physdev_op 16 cmd=15: not implemented yet
(XEN) DOM0: pci :00:02.0: Failed to add - passthrough or MSI/MSI-X
might fail!
(XEN) do_physdev_op 16 cmd=15: not implemented yet
(XEN) DOM0: pci :00:02.1: Failed to add - passthrough or MSI/MSI-X
might fail!
(XEN) do_physdev_op 16 cmd=15: not implemented yet
(XEN) DOM0: pci :01:00.0: Failed to add - passthrough or MSI/MSI-X
might fail!
(XEN) DOM0: pci :01:00.1: of_irq_parse_pci() failed with rc=-22
(XEN) do_physdev_op 16 cmd=15: not implemented yet
(XEN) DOM0: pci :01:00.1: Failed to add - passthrough or MSI/MSI-X
might fail!

IIUC, this is because xen_add_device() failed, and it seems to be
related to some hypercall not being implemented. Not sure what cmd=15 is.
Any ideas?

There are 2 commands not implemented in the log:
* cmd 15: PHYSDEVOP_manage_pci_add
* cmd 25: PHYSDEVOP_pci_device_add

Linux falls back on the former because the latter is not implemented.

AFAICT, PHYSDEVOP_manage_pci_add should not be implemented for ARM
because it doesn't support segments. I suspect that it's kept for legacy
reasons on older Xen x86. Maybe Jan or Andrew have more input on this?


It needs to be kept for backwards compatibility in x86.

All new code should use PHYSDEVOP_pci_device_add.

~Andrew



Ok, now that I look at the arch/arm/physdev.c, I don't think the code
for supporting any of the PHYSDEVOP_xxx is there.  That's probably why
Xen complains. In contrast, arch/x86/physdev.c has most PHYSDEVOP_xxx
already supported.

My question is, are we supposed to be adding code to put the support in
here?

Thanks,

Suravee.


My guess is yes, and that would mean we need to enable building
drivers/pci.c when building arm code, which then opens up a can of worms
with refactoring the MSI support code from x86, etc.

Suravee


Actually, that seems to be more related to PCI pass-through devices.
Haven't the Cavium guys already done that work to support their PCI
device pass-through?


Anyway, at this point I am able to generate a Dom0 device tree with a
correct v2m node, and I can see the Dom0 gicv2m driver probing and
initializing correctly as it would on bare metal.


# Snippet from /sys/firmware/fdt showing dom0 GIC node
interrupt-controller {
    compatible = "arm,gic-400", "arm,cortex-a15-gic";
    #interrupt-cells = <0x3>;
    interrupt-controller;
    reg = <0x0 0xe111 0x0 0x1000 0x0 0xe112f000 0x0 0x2000>;
    phandle = <0x1>;
    #address-cells = <0x2>;
    #size-cells = <0x2>;
    ranges = <0x0 0x0 0x0 0xe110 0x0 0x10>;

    v2m {
        compatible = "arm,gic-v2m-frame";
        msi-controller;
        arm,msi-base-spi = <0x40>;
        arm,msi-num-spis = <0x100>;
        phandle = <0x5>;
        reg = <0x0 0x8 0x0 0x1000>;
    };
};

linux:~ # dmesg | grep v2m
[0.00] GICv2m: Overriding V2M MSI_TYPER (base:64, num:256)
[0.00] GICv2m: Node v2m: range[0xe118:0xe1180fff], SPI[64:320]

So, while setting up v2m in the hypervisor, I also call
route_irq_to_guest() for all the SPIs used for MSI (i.e. 64-320 on
Seattle), which forces the MSIs to Dom0. However, we would need to
figure out how to detach and re-route certain interrupts to a specific
DomU when passing through PCI devices in the future.


So, here's what I got:

linux:~ # cat /proc/interrupts
            CPU0   CPU1   CPU2   CPU3   CPU4   CPU5
  0:           0      0      0      0      0      0   xen-dyn-event  xenbus
  3:       19872  19872  19870  19861  19866  20034

Re: [Xen-devel] [PATCH for-4.6 0/5] xen: arm: Parse PCI DT nodes' ranges and interrupt-map

2015-02-17 Thread Suravee Suthikulanit

On 2/17/2015 7:50 AM, Andrew Cooper wrote:

On 17/02/15 13:43, Julien Grall wrote:

(CC Jan and Andrew)

Hi Suravee,

On 17/02/15 03:04, Suravee Suthikulanit wrote:

By the way, looking at the output of xl dmesg, I saw the following
message:

(XEN) DOM0: PCI host bridge /smb/pcie@f000 ranges:
(XEN) DOM0:IO 0xefff..0xefff - 0x
(XEN) DOM0:   MEM 0x4000..0xbfff - 0x4000
(XEN) DOM0:   MEM 0x1..0x7f - 0x1
(XEN) DOM0: pci-host-generic f000.pcie: PCI host bridge to bus :00
(XEN) DOM0: pci_bus :00: root bus resource [bus 00-7f]
(XEN) DOM0: pci_bus :00: root bus resource [io  0x-0x]
(XEN) DOM0: pci_bus :00: root bus resource [mem 0x4000-0xbfff]
(XEN) DOM0: pci_bus :00: root bus resource [mem
0x1-0x7f]
(XEN) DOM0: pci :00:00.0: of_irq_parse_pci() failed with rc=-19
(XEN) do_physdev_op 16 cmd=25: not implemented yet
(XEN) do_physdev_op 16 cmd=15: not implemented yet
(XEN) DOM0: pci :00:00.0: Failed to add - passthrough or MSI/MSI-X
might fail!
(XEN) DOM0: pci :00:02.0: of_irq_parse_pci() failed with rc=-19
(XEN) do_physdev_op 16 cmd=15: not implemented yet
(XEN) DOM0: pci :00:02.0: Failed to add - passthrough or MSI/MSI-X
might fail!
(XEN) do_physdev_op 16 cmd=15: not implemented yet
(XEN) DOM0: pci :00:02.1: Failed to add - passthrough or MSI/MSI-X
might fail!
(XEN) do_physdev_op 16 cmd=15: not implemented yet
(XEN) DOM0: pci :01:00.0: Failed to add - passthrough or MSI/MSI-X
might fail!
(XEN) DOM0: pci :01:00.1: of_irq_parse_pci() failed with rc=-22
(XEN) do_physdev_op 16 cmd=15: not implemented yet
(XEN) DOM0: pci :01:00.1: Failed to add - passthrough or MSI/MSI-X
might fail!

IIUC, this is because xen_add_device() failed, and it seems to be
related to some hypercall not being implemented. Not sure what cmd=15 is.
Any ideas?

There are 2 commands not implemented in the log:
* cmd 15: PHYSDEVOP_manage_pci_add
* cmd 25: PHYSDEVOP_pci_device_add

Linux falls back on the former because the latter is not implemented.

AFAICT, PHYSDEVOP_manage_pci_add should not be implemented for ARM
because it doesn't support segments. I suspect that it's kept for legacy
reasons on older Xen x86. Maybe Jan or Andrew have more input on this?


It needs to be kept for backwards compatibility in x86.

All new code should use PHYSDEVOP_pci_device_add.

~Andrew



Ok, now that I look at the arch/arm/physdev.c, I don't think the code 
for supporting any of the PHYSDEVOP_xxx is there.  That's probably why 
Xen complains. In contrast, arch/x86/physdev.c has most PHYSDEVOP_xxx 
already supported.


My question is, are we supposed to be adding code to put the support in 
here?


Thanks,

Suravee.

