Re: [Xen-devel] [PATCH] x86/svm: Improve segment register printing in svm_vmcb_dump()
On 11/16/2016 10:48 AM, Boris Ostrovsky wrote:
> On 11/16/2016 08:43 AM, Andrew Cooper wrote:
>> On 31/10/16 13:48, Andrew Cooper wrote:
>>> This makes it more succinct and easier to read.
>>>
>>> Before:
>>>   (XEN)  H_CR3 = 0x00042a0ec000 CleanBits = 0
>>>   (XEN)    CS: sel=0x0008, attr=0x029b, limit=0x, base=0x
>>>   (XEN)    DS: sel=0x0033, attr=0x0cf3, limit=0x, base=0x
>>>   (XEN)    SS: sel=0x0018, attr=0x0c93, limit=0x, base=0x
>>>   (XEN)    ES: sel=0x0033, attr=0x0cf3, limit=0x, base=0x
>>>   (XEN)    FS: sel=0x0033, attr=0x0cf3, limit=0x, base=0x
>>>   (XEN)    GS: sel=0x0033, attr=0x0cf3, limit=0x, base=0x
>>>   (XEN)  GDTR: sel=0x, attr=0x, limit=0x0067, base=0x0010d100
>>>   (XEN)  LDTR: sel=0x, attr=0x, limit=0x, base=0x
>>>   (XEN)  IDTR: sel=0x, attr=0x, limit=0x0fff, base=0x00110900
>>>   (XEN)    TR: sel=0x0038, attr=0x0089, limit=0x0067, base=0x0010d020
>>>
>>> After:
>>>   (XEN)  H_CR3 = 0x00042a0ec000 CleanBits = 0
>>>   (XEN)        sel  attr   limit  base
>>>   (XEN)    CS: 0008 0029b
>>>   (XEN)    DS: 0033 00cf3
>>>   (XEN)    SS: 0018 00c93
>>>   (XEN)    ES: 0033 00cf3
>>>   (XEN)    FS: 0033 00cf3
>>>   (XEN)    GS: 0033 00cf3
>>>   (XEN)  GDTR: 0          0067   0010d100
>>>   (XEN)  LDTR: 0
>>>   (XEN)  IDTR: 0          0fff   00110900
>>>   (XEN)    TR: 0038 00089 0067   0010d020
>>>
>>> Signed-off-by: Andrew Cooper
>>> ---
>>> CC: Jan Beulich
>>> CC: Boris Ostrovsky
>>> CC: Suravee Suthikulpanit
>>
>> Ping?
>
> Sorry, I thought you were going to respin with an updated attribute
> format, but I should have responded anyway.
>
> Reviewed-by: Boris Ostrovsky

Reviewed-by: Suravee Suthikulpanit

Thanks,
S
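The change itself is mostly mechanical: print the column header once, then one fixed-width row per segment register. A minimal sketch of the shape (not the actual patch; the segment_register field names and the vmcb accessors below are assumptions):

    static void svm_dump_sel(const char *name,
                             const struct segment_register *s)
    {
        /* One row per register: selector, attributes, limit, base. */
        printk("  %5s: %04x %05x %08x %016"PRIx64"\n",
               name, s->sel, s->attr, s->limit, s->base);
    }

    /* Print the header once, then a row per register: */
    printk("        sel  attr  limit    base\n");
    svm_dump_sel("CS", &vmcb->cs);
    svm_dump_sel("DS", &vmcb->ds);
    /* ... and likewise for SS, ES, FS, GS, GDTR, LDTR, IDTR, TR. */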
Re: [Xen-devel] [PATCH] AMD IOMMU: correctly propagate errors from amd_iommu_init()
On 6/14/2016 4:09 AM, Andrew Cooper wrote:
> On 14/06/16 10:03, Jan Beulich wrote:
>> ... instead of using -ENODEV for any kind of error. In particular,
>> this addresses Coverity ID 1362694 (introduced by commit eb48587210
>> ["AMD IOMMU: introduce support for IVHD block type 11h"]).
>>
>> Coverity ID: 1362694
>> Signed-off-by: Jan Beulich
>
> Reviewed-by: Andrew Cooper

Reviewed-by: Suravee Suthikulpanit
Tested-by: Suravee Suthikulpanit

Thank you for the fix-up.

Suravee
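The fix is the standard error-propagation idiom - keep the callee's return value instead of flattening every failure to one errno. An illustrative sketch (the caller name is an assumption; only amd_iommu_init() is from the patch):

    static int iommu_hardware_setup(void)   /* assumed caller name */
    {
        int rc = amd_iommu_init();

        if ( rc )
            return rc;   /* propagate -ENOMEM, -EFAULT, ... unchanged, */
                         /* instead of collapsing them all to -ENODEV,  */
                         /* which discards the real cause of failure    */
        return 0;
    }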
Re: [Xen-devel] [PATCH v7 03/11] IOMMU: propagate IOMMU Device-TLB flush error up to IOMMU unmapping (top level ones)
On 6/8/2016 9:54 AM, Jan Beulich wrote:
> On 08.06.16 at 10:58, Quan Xu wrote:
>> From: Quan Xu
>>
>> Signed-off-by: Quan Xu
>> Acked-by: Kevin Tian
>>
>> CC: Stefano Stabellini
>> CC: Julien Grall
>> CC: Kevin Tian
>> CC: Feng Wu
>> CC: Jan Beulich
>> CC: Andrew Cooper
>> CC: Suravee Suthikulpanit

Acked-by: Suravee Suthikulpanit

Thanks,
Suravee
Re: [Xen-devel] [PATCH v7 07/11] IOMMU: propagate IOMMU Device-TLB flush error up to IOMMU suspending (top level ones)
On 6/8/2016 3:59 AM, Xu, Quan wrote:
> From: Quan Xu
>
> Signed-off-by: Quan Xu
>
> CC: Jan Beulich
> CC: Liu Jinsong
> CC: Keir Fraser
> CC: Andrew Cooper
> CC: Suravee Suthikulpanit
> CC: Stefano Stabellini
> CC: Julien Grall
> CC: Kevin Tian
> CC: Feng Wu
>
> v7:
>   1. Return SAVED_ALL at the bottom of device_power_down(), instead of
>      SAVED_NONE.
>   2. Drop the 'if ( error > 0 )', calling device_power_up(error)
>      without any if().
>   3. For vtd_suspend():
>      - drop pointless initializer.
>      - return 0 at the bottom to make obvious that no error path comes
>        there.

Shouldn't the changes log for v7 probably go ...

> ---

... HERE instead, so that we don't get this in the commit log?

For the AMD part:

Acked-by: Suravee Suthikulpanit

Thanks,
Suravee
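For context, the SAVED_* scheme being reworked pairs a staged power-down with a power-up that unwinds exactly the stages that succeeded. A sketch of the pattern only (the real stage list in xen/arch/x86/acpi/power.c is longer; the two stages and helper names here are illustrative assumptions):

    enum dev_power_saved { SAVED_NONE, SAVED_CONSOLE, SAVED_ALL };

    static enum dev_power_saved device_power_down(void)
    {
        if ( console_suspend() )
            return SAVED_NONE;      /* nothing to unwind */
        if ( time_suspend() )
            return SAVED_CONSOLE;   /* unwind the console stage only */
        return SAVED_ALL;           /* v7: reached the bottom => all saved */
    }

    static void device_power_up(enum dev_power_saved saved)
    {
        switch ( saved )            /* deliberate fall-through: unwind in */
        {                           /* reverse order of suspension        */
        case SAVED_ALL:
            time_resume();
            /* fall through */
        case SAVED_CONSOLE:
            console_resume();
            /* fall through */
        case SAVED_NONE:
            break;
        }
    }

This is also why the caller can simply invoke device_power_up(error) with no if(): passing SAVED_NONE makes the unwind a no-op.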
Re: [Xen-devel] [PATCH v7 04/11] IOMMU: propagate IOMMU Device-TLB flush error up to IOMMU mapping (top level ones)
On 6/8/2016 9:43 AM, Jan Beulich wrote:
> On 08.06.16 at 10:58, Quan Xu wrote:
>> From: Quan Xu
>>
>> Signed-off-by: Quan Xu
>> Acked-by: Kevin Tian
>
> Reviewed-by: Jan Beulich

Acked-by: Suravee Suthikulpanit

Thanks,
Suravee
Re: [Xen-devel] [PATCH v3 2/3] svm: iommu: Only call guest_iommu_init() after initialized HVM domain
On 5/26/2016 10:44 AM, Jan Beulich wrote:
> Suravee Suthikulpanit <suravee.suthikulpa...@amd.com> wrote on
> 05/25/16 9:01 PM:
>> On 5/23/2016 6:54 AM, Jan Beulich wrote:
>>> On 22.05.16 at 01:42, <suravee.suthikulpa...@amd.com> wrote:
>>>> From: Suravee Suthikulpanit <suravee.suthikulpa...@amd.com>
>>>>
>>>> guest_iommu_init() is currently called via the following code path:
>>>>
>>>>   arch/x86/domain.c: arch_domain_create()
>>>>     |- drivers/passthrough/iommu.c: iommu_domain_init()
>>>>        |- drivers/passthrough/amd/pci_amd_iommu.c: amd_iommu_domain_init()
>>>>           |- drivers/passthrough/amd/iommu_guest.c: guest_iommu_init()
>>>>
>>>> At this point, hvm_domain_initialise() has not been called, so
>>>> register_mmio_handler() in guest_iommu_init() silently fails. This
>>>> patch moves iommu_domain_init() to a later point, after
>>>> hvm_domain_initialise().
>>>
>>> That's one possible approach, which I continue to be not really
>>> happy with. guest_iommu_init() really is HVM-specific, so maybe no
>>> longer calling it from amd_iommu_domain_init() would be the better
>>> solution (instead calling it from hvm_domain_initialise() would
>>> then seem to be the better option). Thoughts?
>>
>> Then this goes back to the approach I proposed in v1 of this patch
>> series, where I call guest_iommu_init/destroy() in
>> svm_domain_initialise/destroy(). However, I'm still not quite clear
>> on why iommu_domain_init() is needed before hvm_domain_initialise().
>
> I think the two things are only lightly related. Changing the order of
> calls is generally fine, but recognizing that guest_iommu_init() really
> would better be called elsewhere makes that re-ordering simply
> unnecessary.
>
> Jan

So, let's discuss these two things separately. I would propose to:

1. Remove guest_iommu_init() for now, since it is not functioning and
   it seems to not be called at a proper place according to Jan. We
   will revisit this when we re-introduce and fully test out the
   feature.

2. As for the ordering of iommu_domain_init() and hvm_domain_init(),
   let's continue to discuss to find the proper ordering if it needs
   changing.

Let me know what you guys think.

Thanks,
Suravee
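Jan's alternative would amount to something like the following (a sketch under the assumption that guest_iommu_init() needs nothing beyond the HVM MMIO-handler infrastructure; not an actual patch):

    /* xen/arch/x86/hvm/hvm.c */
    int hvm_domain_initialise(struct domain *d)
    {
        /* ... existing HVM setup, including the MMIO/port-I/O handler
         * registration infrastructure ... */

        /*
         * Register the emulated AMD IOMMU only once the MMIO handler
         * machinery exists, so register_mmio_handler() can no longer
         * fail silently.
         */
        guest_iommu_init(d);

        return 0;
    }

with the corresponding call removed from amd_iommu_domain_init(), leaving the iommu_domain_init() ordering untouched.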
Re: [Xen-devel] [PATCH v3 1/3] x86/hvm: Add check when register io handler
On 5/23/2016 6:46 AM, Jan Beulich wrote:
> On 22.05.16 at 01:42, Suravee Suthikulpanit wrote:
>> --- a/xen/arch/x86/hvm/intercept.c
>> +++ b/xen/arch/x86/hvm/intercept.c
>> @@ -258,6 +258,8 @@ struct hvm_io_handler *hvm_next_io_handler(struct domain *d)
>>  {
>>      unsigned int i = d->arch.hvm_domain.io_handler_count++;
>>
>> +    ASSERT( d->arch.hvm_domain.io_handler );
>> +
>
> Am I wrong in understanding that without patch 2 this ASSERT() will
> actually trigger? In which case the patch order would be wrong (but
> that could be taken care of during commit).
>
> Jan

Right, I can fix this in v4 if we decide to change the
iommu_domain_initialise() ordering.

Suravee
Re: [Xen-devel] [PATCH v3 2/3] svm: iommu: Only call guest_iommu_init() after initialized HVM domain
On 5/23/2016 6:54 AM, Jan Beulich wrote:
> On 22.05.16 at 01:42, Suravee Suthikulpanit wrote:
>> From: Suravee Suthikulpanit
>>
>> guest_iommu_init() is currently called via the following code path:
>>
>>   arch/x86/domain.c: arch_domain_create()
>>     |- drivers/passthrough/iommu.c: iommu_domain_init()
>>        |- drivers/passthrough/amd/pci_amd_iommu.c: amd_iommu_domain_init()
>>           |- drivers/passthrough/amd/iommu_guest.c: guest_iommu_init()
>>
>> At this point, hvm_domain_initialise() has not been called, so
>> register_mmio_handler() in guest_iommu_init() silently fails. This
>> patch moves iommu_domain_init() to a later point, after
>> hvm_domain_initialise().
>
> That's one possible approach, which I continue to be not really happy
> with. guest_iommu_init() really is HVM-specific, so maybe no longer
> calling it from amd_iommu_domain_init() would be the better solution
> (instead calling it from hvm_domain_initialise() would then seem to be
> the better option). Thoughts?

Then this goes back to the approach I proposed in v1 of this patch
series, where I call guest_iommu_init/destroy() in
svm_domain_initialise/destroy(). However, I'm still not quite clear on
why iommu_domain_init() is needed before hvm_domain_initialise().

> In any event, the choice of ...
>
>> @@ -675,6 +675,9 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags,
>>      return 0;
>>
>> + fail_1:
>> +    if ( has_hvm_container_domain(d) )
>> +        hvm_domain_destroy(d);
>>   fail:
>>      d->is_dying = DOMDYING_dead;
>>      psr_domain_free(d);
>
> ... the new label name is sub-optimal. Please pick something more
> descriptive, e.g. "iommu_fail", if the current approach is to be
> retained.
>
> Jan

In case we are going with this approach, I will make this change.

Thanks,
Suravee
Re: [Xen-devel] [PATCH v3 3/3] AMD IOMMU: Check io_handler before registering mmio handler
On 5/23/2016 3:23 AM, Paul Durrant wrote:
>> -----Original Message-----
>> From: suravee.suthikulpa...@amd.com
>> [mailto:suravee.suthikulpa...@amd.com]
>> Sent: 22 May 2016 00:43
>> To: xen-devel@lists.xen.org; Paul Durrant; jbeul...@suse.com; George
>> Dunlap
>> Cc: Keir (Xen.org); Suravee Suthikulpanit
>> Subject: [PATCH v3 3/3] AMD IOMMU: Check io_handler before registering
>> mmio handler
>>
>> From: Suravee Suthikulpanit
>>
>> guest_iommu_init() tries to register an MMIO handler before the HVM
>> domain is initialized. This causes the registration to fail silently.
>> This patch adds a sanity check and puts out an error message.
>>
>> Signed-off-by: Suravee Suthikulpanit
>
> This patch is now defunct, isn't it?
>
> Paul

It is no longer required, but I think it is a good sanity check in case
something changes in the future and causes this to be called before the
HVM I/O handler is initialized.

Thanks,
Suravee
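The check being discussed amounts to bailing out loudly if the HVM I/O handler array is not yet set up. Roughly (a sketch; the io_handler field is the one discussed in patch 1 of this series, while the error path details and the handler-ops name are assumptions):

    int guest_iommu_init(struct domain *d)
    {
        if ( !d->arch.hvm_domain.io_handler )
        {
            AMD_IOMMU_DEBUG("Error: HVM I/O handlers not initialised, "
                            "cannot register guest IOMMU MMIO handler\n");
            return -ENODEV;
        }

        /* ... proceed to register_mmio_handler(d, &iommu_mmio_ops) ... */
        return 0;
    }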
Re: [Xen-devel] [PATCH v2] AMD IOMMU: Introduce support for IVHD block type 11h
Thanks for these review comments. I'll update this patch and send out
v3.

Suravee
Re: [Xen-devel] [PATCH] IOMMU/x86: avoid pages without GFN in page table creation/updating
Acked-by: Suravee Suthikulpanit <suravee.suthikulpa...@amd.com>

Thanks,
Suravee

On 5/7/2015 7:59 AM, Jan Beulich wrote:
> Handing INVALID_GFN to functions like hd->platform_ops->map_page()
> just can't do any good, and the ioreq server code results in such
> pages being on the list of ones owned by a guest. While - as suggested
> by Tim - we should use get_gfn()/put_gfn() there to eliminate races,
> we really can't, due to holding the domain's page alloc lock.
> Ultimately arch_iommu_populate_page_table() may need to be switched to
> be GFN based. Here is what Tim said in this regard: "Ideally this loop
> would be iterating over all gfns in the p2m rather than over all owned
> MFNs. As long as needs_iommu gets set first, such a loop could safely
> be paused and restarted without worrying about concurrent updates. The
> code could even stay in this file, though exposing an iterator from
> the p2m code would be a lot more efficient."
>
> Original by Andrew Cooper <andrew.coop...@citrix.com>, using further
> suggestions from Tim Deegan <t...@xen.org>.
>
> Reported-by: Sander Eikelenboom <li...@eikelenboom.it>
> Signed-off-by: Jan Beulich <jbeul...@suse.com>
> Tested-by: Sander Eikelenboom <li...@eikelenboom.it>
> Acked-by: Tim Deegan <t...@xen.org>
>
> --- a/xen/drivers/passthrough/amd/iommu_map.c
> +++ b/xen/drivers/passthrough/amd/iommu_map.c
> @@ -557,6 +557,10 @@ static int update_paging_mode(struct dom
>      unsigned long old_root_mfn;
>      struct hvm_iommu *hd = domain_hvm_iommu(d);
>
> +    if ( gfn == INVALID_MFN )
> +        return -EADDRNOTAVAIL;
> +    ASSERT(!(gfn >> DEFAULT_DOMAIN_ADDRESS_WIDTH));
> +
>      level = hd->arch.paging_mode;
>      old_root = hd->arch.root_table;
>      offset = gfn >> (PTE_PER_TABLE_SHIFT * (level - 1));
> @@ -729,12 +733,15 @@ int amd_iommu_unmap_page(struct domain *
>       * we might need a deeper page table for lager gfn now */
>      if ( is_hvm_domain(d) )
>      {
> -        if ( update_paging_mode(d, gfn) )
> +        int rc = update_paging_mode(d, gfn);
> +
> +        if ( rc )
>          {
>              spin_unlock(&hd->arch.mapping_lock);
>              AMD_IOMMU_DEBUG("Update page mode failed gfn = %lx\n", gfn);
> -            domain_crash(d);
> -            return -EFAULT;
> +            if ( rc != -EADDRNOTAVAIL )
> +                domain_crash(d);
> +            return rc;
>          }
>      }
> --- a/xen/drivers/passthrough/vtd/iommu.h
> +++ b/xen/drivers/passthrough/vtd/iommu.h
> @@ -482,7 +482,6 @@ struct qinval_entry {
>  #define VTD_PAGE_TABLE_LEVEL_3  3
>  #define VTD_PAGE_TABLE_LEVEL_4  4
>
> -#define DEFAULT_DOMAIN_ADDRESS_WIDTH 48
>  #define MAX_IOMMU_REGS 0xc0
>
>  extern struct list_head acpi_drhd_units;
> --- a/xen/drivers/passthrough/x86/iommu.c
> +++ b/xen/drivers/passthrough/x86/iommu.c
> @@ -59,10 +59,17 @@ int arch_iommu_populate_page_table(struc
>          if ( has_hvm_container_domain(d) ||
>              (page->u.inuse.type_info & PGT_type_mask) == PGT_writable_page )
>          {
> -            BUG_ON(SHARED_M2P(mfn_to_gmfn(d, page_to_mfn(page))));
> -            rc = hd->platform_ops->map_page(
> -                d, mfn_to_gmfn(d, page_to_mfn(page)), page_to_mfn(page),
> -                IOMMUF_readable|IOMMUF_writable);
> +            unsigned long mfn = page_to_mfn(page);
> +            unsigned long gfn = mfn_to_gmfn(d, mfn);
> +
> +            if ( gfn != INVALID_MFN )
> +            {
> +                ASSERT(!(gfn >> DEFAULT_DOMAIN_ADDRESS_WIDTH));
> +                BUG_ON(SHARED_M2P(gfn));
> +                rc = hd->platform_ops->map_page(d, gfn, mfn,
> +                                                IOMMUF_readable |
> +                                                IOMMUF_writable);
> +            }
>              if ( rc )
>              {
>                  page_list_add(page, d->page_list);
> --- a/xen/include/asm-x86/hvm/iommu.h
> +++ b/xen/include/asm-x86/hvm/iommu.h
> @@ -46,6 +46,8 @@ struct g2m_ioport {
>      unsigned int np;
>  };
>
> +#define DEFAULT_DOMAIN_ADDRESS_WIDTH 48
> +
>  struct arch_hvm_iommu
>  {
>      u64 pgd_maddr;    /* io page directory machine address */
> --- a/xen/include/asm-x86/hvm/svm/amd-iommu-defs.h
> +++ b/xen/include/asm-x86/hvm/svm/amd-iommu-defs.h
> @@ -464,8 +464,6 @@
>  #define IOMMU_CONTROL_DISABLED  0
>  #define IOMMU_CONTROL_ENABLED   1
>
> -#define DEFAULT_DOMAIN_ADDRESS_WIDTH 48
> -
>  /* interrupt remapping table */
>  #define INT_REMAP_ENTRY_REMAPEN_MASK   0x0001
>  #define INT_REMAP_ENTRY_REMAPEN_SHIFT  0
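To make Tim's suggestion concrete, a GFN-based walk would be shaped roughly like this. This is purely illustrative: p2m_first_gfn()/p2m_next_gfn() do not exist in Xen and stand in for the "iterator exposed from the p2m code", and gfn_to_mfn_query() is likewise an assumption:

    int arch_iommu_populate_page_table(struct domain *d)
    {
        struct hvm_iommu *hd = domain_hvm_iommu(d);
        unsigned long gfn;
        int rc = 0;

        /* d->need_iommu is already set at this point, so the walk can
         * be paused and restarted without missing concurrent p2m
         * updates (per Tim's remark). */
        for ( gfn = p2m_first_gfn(d); gfn != INVALID_GFN && !rc;
              gfn = p2m_next_gfn(d, gfn) )
        {
            unsigned long mfn = gfn_to_mfn_query(d, gfn);

            rc = hd->platform_ops->map_page(d, gfn, mfn,
                                            IOMMUF_readable |
                                            IOMMUF_writable);
            if ( !rc && hypercall_preempt_check() )
                rc = -ERESTART;   /* caller re-invokes to continue */
        }
        return rc;
    }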
Re: [Xen-devel] [PATCH 5/5] AMD IOMMU: widen NUMA nodes to be allocated from
On 3/6/2015 6:15 AM, Andrew Cooper wrote:
> On 06/03/2015 07:50, Jan Beulich wrote:
>> On 05.03.15 at 18:30, andrew.coop...@citrix.com wrote:
>>> On 26/02/15 13:56, Jan Beulich wrote:
>>>> --- a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
>>>> +++ b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
>>>> @@ -158,12 +158,12 @@ static inline unsigned long region_to_pa
>>>>      return (PAGE_ALIGN(addr + size) - (addr & PAGE_MASK)) >> PAGE_SHIFT;
>>>>  }
>>>>
>>>> -static inline struct page_info* alloc_amd_iommu_pgtable(void)
>>>> +static inline struct page_info *alloc_amd_iommu_pgtable(struct domain *d)
>>>>  {
>>>>      struct page_info *pg;
>>>>      void *vaddr;
>>>>
>>>> -    pg = alloc_domheap_page(NULL, 0);
>>>> +    pg = alloc_domheap_page(d, MEMF_no_owner);
>>>
>>> Same comment as with the VT-d side of things. This should be based
>>> on the proximity information of the IOMMU, not of the owning domain.
>>
>> I think I buy this argument on the VT-d side (under the assumption
>> that there's going to be at least one IOMMU per node), but I'm not
>> sure here: the most modern AMD box I have has just a single IOMMU
>> for the 4 nodes it reports.
>
> It is not possible for an IOMMU to cover multiple NUMA nodes' worth of
> IO, because of the position it has to sit at relative to the IO root
> ports and QPI/HT links.
>
> In AMD systems, the IOMMU lives in the northbridge, meaning one per
> NUMA node (as it is the northbridges which contain the HyperTransport
> links). The BIOS/firmware will only report IOMMUs from northbridges
> which have IO connected to their IO HyperTransport link (most systems
> in the wild have all IO hanging off one or two NUMA nodes). On the
> other hand, I have an AMD system with 8 IOMMUs in use.

Actually, a single IOMMU could handle multiple nodes. For example, in a
multi-chip-module (MCM) setup, there could be 2-4 nodes sharing one
IOMMU, depending on how the platform vendor configures the system. In
server platforms, the IOMMU is in an AMD northbridge chipset (e.g.
SR56xx). This website has an example of such a system configuration:
http://www.qdpma.com/systemarchitecture/SystemArchitecture_Opteron.html

For the AMD IOMMU, the IVRS table specifies the PCI bus/device ranges
to be handled by each IOMMU. This should probably be considered here.

> In Intel systems, there is one IOMMU for each socket (to cover the
> on-chip root ports and GPU if applicable) and one IOMMU in the IOH/PCH
> (depending on generation) to cover the legacy IO. In all cases, the
> IOMMUs are local to a single NUMA node, and would benefit from having
> the control pages and pagetables allocated in local RAM.

As stated above, this is not the case for the AMD IOMMU.

Thanks,
Suravee

> ~Andrew
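The proximity-based allocation Andrew is asking for already has a mechanism in Xen's allocator, MEMF_node(), so the change would look roughly like this (a sketch; passing the IOMMU down to the helper and the iommu->node field are assumptions):

    static inline struct page_info *
    alloc_amd_iommu_pgtable(const struct amd_iommu *iommu)
    {
        /* Allocate on the node the IOMMU itself lives on, rather than
         * wherever the owning domain's memory happens to come from. */
        return alloc_domheap_page(NULL, MEMF_node(iommu->node));
    }

The open question in this sub-thread is where that node number comes from on AMD systems, given that IVRS describes bus/device ranges rather than proximity domains.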
Re: [Xen-devel] [PATCH 5/5] AMD IOMMU: widen NUMA nodes to be allocated from
On 3/9/2015 12:26 PM, Andrew Cooper wrote:
> On 09/03/15 15:42, Suravee Suthikulpanit wrote:
>> On 3/6/2015 6:15 AM, Andrew Cooper wrote:
>>> On 06/03/2015 07:50, Jan Beulich wrote:
>>>> On 05.03.15 at 18:30, andrew.coop...@citrix.com wrote:
>>>>> On 26/02/15 13:56, Jan Beulich wrote:
>>>>>> -    pg = alloc_domheap_page(NULL, 0);
>>>>>> +    pg = alloc_domheap_page(d, MEMF_no_owner);
>>>>> Same comment as with the VT-d side of things. This should be based
>>>>> on the proximity information of the IOMMU, not of the owning
>>>>> domain.
>>>> I think I buy this argument on the VT-d side (under the assumption
>>>> that there's going to be at least one IOMMU per node), but I'm not
>>>> sure here: the most modern AMD box I have has just a single IOMMU
>>>> for the 4 nodes it reports.
>>> It is not possible for an IOMMU to cover multiple NUMA nodes' worth
>>> of IO, because of the position it has to sit at relative to the IO
>>> root ports and QPI/HT links. In AMD systems, the IOMMU lives in the
>>> northbridge, meaning one per NUMA node (as it is the northbridges
>>> which contain the HyperTransport links). The BIOS/firmware will only
>>> report IOMMUs from northbridges which have IO connected to their IO
>>> HyperTransport link (most systems in the wild have all IO hanging
>>> off one or two NUMA nodes). On the other hand, I have an AMD system
>>> with 8 IOMMUs in use.
>> Actually, a single IOMMU could handle multiple nodes. For example, in
>> a multi-chip-module (MCM) setup, there could be 2-4 nodes sharing one
>> IOMMU, depending on how the platform vendor configures the system. In
>> server platforms, the IOMMU is in an AMD northbridge chipset (e.g.
>> SR56xx). This website has an example of such a system configuration:
>> http://www.qdpma.com/systemarchitecture/SystemArchitecture_Opteron.html
>
> Ok - I was basing my example on the last layout I had the manual for,
> which I believe was Bulldozer. However, my point still stands that
> there is an IOMMU between any IO and RAM. An individual IOMMU will
> always benefit from having its iopagetables on the local NUMA node,
> rather than the NUMA node(s) which the domain owning the device is
> running on.

I agree that having the IO page tables on the NUMA node that is closest
to the IOMMU would be beneficial. However, I am not sure at the moment
that this information can be easily determined. I think the ACPI _PXM
method for devices should be able to provide it, but it is optional and
often not available.

>> For the AMD IOMMU, the IVRS table specifies the PCI bus/device ranges
>> to be handled by each IOMMU. This should probably be considered here.
>
> Presumably a PCI transaction must never get onto the HT bus without
> having already undergone translation, or there can be no guarantee
> that it would be routed via the IOMMU? Or are you saying that there
> are cases where a transaction will enter the HT bus, route sideways to
> an IOMMU, undergo translation, then route back onto the HT bus to the
> target RAM/processor?
>
> ~Andrew

The IOMMU sits between the PCI devices (downstream) and HT (upstream),
so all DMA transactions from downstream must go through the IOMMU. On
the other hand, the I/O page translation is handled by the IOMMU
itself, and that is separate traffic from the downstream device DMA
transactions.

Suravee
Re: [Xen-devel] [PATCH for-4.6 0/5] xen: arm: Parse PCI DT nodes' ranges and interrupt-map
On 2/18/2015 6:48 AM, Julien Grall wrote:
> Hi Suravee,
>
> On 18/02/2015 05:28, Suravee Suthikulpanit wrote:
>> Actually, that seems to be more related to PCI pass-through devices.
>> Haven't the Cavium guys already done that work to support their PCI
>> device pass-through?
>
> They were working on it, but so far there is no patch series on the
> ML. It would be nice to come up with a common solution (i.e. between
> GICv2m and GICv3 ITS) for MSI.

Agreed, although for supporting Dom0 MSI I don't see anything that
could be shared, since these would be two totally separate
drivers/interfaces.

>> Anyway, at this point I am able to generate a Dom0 device tree with a
>> correct v2m node, and I can see the Dom0 GICv2m driver probing and
>> initializing correctly, as it would on bare metal.
>>
>> [v2m device-tree snippet and dmesg output snipped - quoted in full in
>> the message below in this thread]
>>
>> So, while setting up v2m in the hypervisor, I also call
>> route_irq_to_guest() for all the SPIs used for MSI (i.e. 64-320 on
>> Seattle), which forces the MSIs to Dom0. However, we would need to
>> figure out how to detach and re-route certain interrupts to a
>> specific DomU in case of passing through PCI devices in the future.
>
> Who decides to assign MSI n to SPI x? Dom0 or Xen?

For v2m, each MSI is tied to a specific SPI. The range of SPIs is
specified in the GIC V2M_MSI_TYPER register. In Xen, we need to make
sure that these are routed to Dom0 initially, since the Dom0 GICv2m
driver is the one handling all MSI assignments.

> Wouldn't it be possible to route the SPI dynamically when the domain
> decides to use MSI n? We would need to implement PHYSDEVOP_map_pirq
> for MSI.

Enabling MSI is done by each end-point PCI device driver in the guest.
In Linux, this would mean that when a driver tries to allocate an MSI
interrupt, it would need to communicate back to Xen (possibly via a
hypercall, as you pointed out) to get the next available SPI. That is
not necessary for now; I am planning to revisit it when we try to
implement pass-through support.

Let me know if you think this should be handled differently.

Cheers,
Suravee
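The routing step described above reduces to a loop over the v2m SPI window at setup time (a sketch; route_irq_to_guest()'s exact signature has varied across Xen versions, and the 64-320 window is the Seattle value read from V2M_MSI_TYPER):

    unsigned int spi;

    /* Hand the whole GICv2m MSI SPI window to dom0, whose gicv2m
     * driver owns MSI allocation. */
    for ( spi = 64; spi < 320; spi++ )
    {
        int rc = route_irq_to_guest(dom0, spi, spi, "v2m");

        if ( rc )
            printk(XENLOG_ERR
                   "v2m: failed to route SPI %u to dom0 (rc=%d)\n",
                   spi, rc);
    }

Per-DomU passthrough would later require tearing individual routes out of this window and re-routing them, which is the open problem noted above.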
Re: [Xen-devel] [PATCH for-4.6 0/5] xen: arm: Parse PCI DT nodes' ranges and interrupt-map
On 2/17/2015 4:35 PM, Suravee Suthikulanit wrote:
> On 2/17/2015 7:50 AM, Andrew Cooper wrote:
>> On 17/02/15 13:43, Julien Grall wrote:
>>> (CC Jan and Andrew)
>>>
>>> Hi Suravee,
>>>
>>> On 17/02/15 03:04, Suravee Suthikulanit wrote:
>>>> By the way, looking at the output of xl dmesg, I saw the following
>>>> messages:
>>>>
>>>> (XEN) DOM0: PCI host bridge /smb/pcie@f000 ranges:
>>>> (XEN) DOM0:  IO 0xefff..0xefff -> 0x
>>>> (XEN) DOM0: MEM 0x4000..0xbfff -> 0x4000
>>>> (XEN) DOM0: MEM 0x1..0x7f -> 0x1
>>>> (XEN) DOM0: pci-host-generic f000.pcie: PCI host bridge to bus :00
>>>> (XEN) DOM0: pci_bus :00: root bus resource [bus 00-7f]
>>>> (XEN) DOM0: pci_bus :00: root bus resource [io 0x-0x]
>>>> (XEN) DOM0: pci_bus :00: root bus resource [mem 0x4000-0xbfff]
>>>> (XEN) DOM0: pci_bus :00: root bus resource [mem 0x1-0x7f]
>>>> (XEN) DOM0: pci :00:00.0: of_irq_parse_pci() failed with rc=-19
>>>> (XEN) do_physdev_op 16 cmd=25: not implemented yet
>>>> (XEN) do_physdev_op 16 cmd=15: not implemented yet
>>>> (XEN) DOM0: pci :00:00.0: Failed to add - passthrough or MSI/MSI-X might fail!
>>>> (XEN) DOM0: pci :00:02.0: of_irq_parse_pci() failed with rc=-19
>>>> (XEN) do_physdev_op 16 cmd=15: not implemented yet
>>>> (XEN) DOM0: pci :00:02.0: Failed to add - passthrough or MSI/MSI-X might fail!
>>>> (XEN) do_physdev_op 16 cmd=15: not implemented yet
>>>> (XEN) DOM0: pci :00:02.1: Failed to add - passthrough or MSI/MSI-X might fail!
>>>> (XEN) do_physdev_op 16 cmd=15: not implemented yet
>>>> (XEN) DOM0: pci :01:00.0: Failed to add - passthrough or MSI/MSI-X might fail!
>>>> (XEN) DOM0: pci :01:00.1: of_irq_parse_pci() failed with rc=-22
>>>> (XEN) do_physdev_op 16 cmd=15: not implemented yet
>>>> (XEN) DOM0: pci :01:00.1: Failed to add - passthrough or MSI/MSI-X might fail!
>>>>
>>>> IIUC, this is because xen_add_device() failed, and it seems to be
>>>> related to some hypercall not being implemented. Not sure what
>>>> cmd=15 is. Any ideas?
>>>
>>> There are 2 commands not implemented in the log:
>>>   * cmd 15: PHYSDEVOP_manage_pci_add
>>>   * cmd 25: PHYSDEVOP_pci_device_add
>>>
>>> Linux falls back on the former because the latter is not
>>> implemented. AFAICT, PHYSDEVOP_manage_pci_add should not be
>>> implemented for ARM because it doesn't support segments. I suspect
>>> it's kept for legacy on older Xen x86. Maybe Jan or Andrew have more
>>> input on this?
>>
>> It needs to be kept for backwards compatibility in x86. All new code
>> should use PHYSDEVOP_pci_device_add.
>>
>> ~Andrew
>
> Ok, now that I look at arch/arm/physdev.c, I don't think the code for
> supporting any of the PHYSDEVOP_xxx ops is there. That's probably why
> Xen complains. In contrast, arch/x86/physdev.c already supports most
> of the PHYSDEVOP_xxx ops.
>
> My question is: are we supposed to be adding code to put the support
> in here?
>
> Thanks,
> Suravee

My guess is yes, and that would mean we need to enable building
drivers/pci.c when building ARM code, which then opens up a can of
worms with re-factoring the MSI support code from x86, etc.

Suravee
Re: [Xen-devel] [PATCH for-4.6 0/5] xen: arm: Parse PCI DT nodes' ranges and interrupt-map
On 2/17/2015 6:31 PM, Suravee Suthikulanit wrote:
> On 2/17/2015 4:35 PM, Suravee Suthikulanit wrote:
>> [quoted xl dmesg log and PHYSDEVOP_manage_pci_add /
>> PHYSDEVOP_pci_device_add discussion snipped - quoted in full in the
>> first message of this thread above]
>>
>> My question is: are we supposed to be adding code to put the support
>> in here?
>
> My guess is yes, and that would mean we need to enable building
> drivers/pci.c when building ARM code, which then opens up a can of
> worms with re-factoring the MSI support code from x86, etc.

Actually, that seems to be more related to PCI pass-through devices.
Haven't the Cavium guys already done that work to support their PCI
device pass-through?

Anyway, at this point I am able to generate a Dom0 device tree with a
correct v2m node, and I can see the Dom0 GICv2m driver probing and
initializing correctly, as it would on bare metal.

# Snippet from /sys/firmware/fdt showing the Dom0 GIC node
interrupt-controller {
    compatible = "arm,gic-400", "arm,cortex-a15-gic";
    #interrupt-cells = <0x3>;
    interrupt-controller;
    reg = <0x0 0xe111 0x0 0x1000 0x0 0xe112f000 0x0 0x2000>;
    phandle = <0x1>;
    #address-cells = <0x2>;
    #size-cells = <0x2>;
    ranges = <0x0 0x0 0x0 0xe110 0x0 0x10>;

    v2m {
        compatible = "arm,gic-v2m-frame";
        msi-controller;
        arm,msi-base-spi = <0x40>;
        arm,msi-num-spis = <0x100>;
        phandle = <0x5>;
        reg = <0x0 0x8 0x0 0x1000>;
    };
};

linux:~ # dmesg | grep v2m
[0.00] GICv2m: Overriding V2M MSI_TYPER (base:64, num:256)
[0.00] GICv2m: Node v2m: range[0xe118:0xe1180fff], SPI[64:320]

So, while setting up v2m in the hypervisor, I also call
route_irq_to_guest() for all the SPIs used for MSI (i.e. 64-320 on
Seattle), which forces the MSIs to Dom0. However, we would need to
figure out how to detach and re-route certain interrupts to a specific
DomU in case of passing through PCI devices in the future.

So, here's what I got:

linux:~ # cat /proc/interrupts
           CPU0   CPU1   CPU2   CPU3   CPU4   CPU5
  0:          0      0      0      0      0      0  xen-dyn-event  xenbus
  3:      19872  19872  19870  19861  19866  20034
Re: [Xen-devel] [PATCH for-4.6 0/5] xen: arm: Parse PCI DT nodes' ranges and interrupt-map
On 2/17/2015 7:50 AM, Andrew Cooper wrote:
> On 17/02/15 13:43, Julien Grall wrote:
>> (CC Jan and Andrew)
>>
>> Hi Suravee,
>>
>> On 17/02/15 03:04, Suravee Suthikulanit wrote:
>>> By the way, looking at the output of xl dmesg, I saw the following
>>> messages:
>>>
>>> [xl dmesg log snipped - identical to the log quoted in full in the
>>> first message of this thread above: cmd=15 and cmd=25 reported as
>>> "not implemented yet", and "Failed to add - passthrough or MSI/MSI-X
>>> might fail!" for each PCI device]
>>>
>>> IIUC, this is because xen_add_device() failed, and it seems to be
>>> related to some hypercall not being implemented. Not sure what
>>> cmd=15 is. Any ideas?
>>
>> There are 2 commands not implemented in the log:
>>   * cmd 15: PHYSDEVOP_manage_pci_add
>>   * cmd 25: PHYSDEVOP_pci_device_add
>>
>> Linux falls back on the former because the latter is not implemented.
>> AFAICT, PHYSDEVOP_manage_pci_add should not be implemented for ARM
>> because it doesn't support segments. I suspect it's kept for legacy
>> on older Xen x86. Maybe Jan or Andrew have more input on this?
>
> It needs to be kept for backwards compatibility in x86. All new code
> should use PHYSDEVOP_pci_device_add.
>
> ~Andrew

Ok, now that I look at arch/arm/physdev.c, I don't think the code for
supporting any of the PHYSDEVOP_xxx ops is there. That's probably why
Xen complains. In contrast, arch/x86/physdev.c already supports most of
the PHYSDEVOP_xxx ops.

My question is: are we supposed to be adding code to put the support in
here?

Thanks,
Suravee
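For reference, an ARM-side implementation of the missing hypercall would presumably take this shape (illustrative only: it assumes struct physdev_pci_device_add as declared in xen/include/public/physdev.h and the common pci_add_device() helper; the dispatch below and its error handling are not actual Xen code):

    /* xen/arch/arm/physdev.c -- sketch */
    int do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
    {
        int ret;

        switch ( cmd )
        {
        case PHYSDEVOP_pci_device_add:
        {
            struct physdev_pci_device_add add;

            ret = -EFAULT;
            if ( copy_from_guest(&add, arg, 1) )
                break;

            /* Make the device known to the common PCI layer, so that
             * later passthrough/MSI setup can find it. */
            ret = pci_add_device(add.seg, add.bus, add.devfn, NULL);
            break;
        }
        default:
            gdprintk(XENLOG_DEBUG,
                     "do_physdev_op cmd=%d: not implemented yet\n", cmd);
            ret = -ENOSYS;
            break;
        }

        return ret;
    }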