On 13/10/2016 01:57, David Gibson wrote:
> Currently, the MMIO space for accessing PCI on pseries guests begins at
> 1 TiB in guest address space. Each PCI host bridge (PHB) has a 64 GiB
> chunk of address space in which it places its outbound PIO and 32-bit and
> 64-bit MMIO windows.
> This scheme as several problems:
> - It limits guest RAM to 1 TiB (though we have a limited fix for this
> - It limits the total MMIO window to 64 GiB. This is not always enough
> for some of the large nVidia GPGPU cards
> - Putting all the windows into a single 64 GiB area means that naturally
> aligning things within there will waste more address space.
> In addition there was a miscalculation in some of the defaults, which meant
> that the MMIO windows for each PHB actually slightly overran the 64 GiB
> region for that PHB. We got away without nasty consequences because
> the overrun fit within an unused area at the beginning of the next PHB's
> region, but it's not pretty.
> This patch implements a new scheme which addresses those problems, and is
> also closer to what bare metal hardware and pHyp guests generally use.
> Because some guest versions (including most current distro kernels) can't
> access PCI MMIO above 64 TiB, we put all the PCI windows between 32 TiB and
> 64 TiB. This is broken into 1 TiB chunks. The 1 TiB contains the PIO
> (64 kiB) and 32-bit MMIO (2 GiB) windows for all of the PHBs. Each
> subsequent TiB chunk contains a naturally aligned 64-bit MMIO window for
> one PHB each.
> This reduces the number of allowed PHBs (without full manual configuration
> of all the windows) from 256 to 31, but this should still be plenty in
> We also change some of the default window sizes for manually configured
> PHBs to saner values.
> Finally we adjust some tests and libqos so that it correctly uses the new
> default locations. Ideally it would parse the device tree given to the
> guest, but that's a more complex problem for another time.
> Signed-off-by: David Gibson <da...@gibson.dropbear.id.au>
> hw/ppc/spapr.c | 126
> hw/ppc/spapr_pci.c | 5 +-
> include/hw/pci-host/spapr.h | 8 ++-
> tests/endianness-test.c | 3 +-
> tests/libqos/pci-spapr.c | 9 ++--
> tests/spapr-phb-test.c | 2 +-
> 6 files changed, 113 insertions(+), 40 deletions(-)
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 8db3657..2d952a8 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -2375,31 +2375,42 @@ static void spapr_phb_placement(sPAPRMachineState
> *spapr, uint32_t index,
> hwaddr *mmio32, hwaddr *mmio64,
> unsigned n_dma, uint32_t *liobns, Error
> + /*
> + * New-style PHB window placement.
> + *
> + * Goals: Gives large (1TiB), naturally aligned 64-bit MMIO window
> + * for each PHB, in addition to 2GiB 32-bit MMIO and 64kiB PIO
> + * windows.
> + *
> + * Some guest kernels can't work with MMIO windows above 1<<46
> + * (64TiB), so we place up to 31 PHBs in the area 32TiB..64TiB
> + *
> + * 32TiB..33TiB contains the PIO and 32-bit MMIO windows for all
> + * PHBs. 33..34TiB has the 64-bit MMIO window for PHB0, 34..35
> + * has the 64-bit window for PHB1 and so forth.
> + */
> const uint64_t base_buid = 0x800000020000000ULL;
> - const hwaddr phb_spacing = 0x1000000000ULL; /* 64 GiB */
> - const hwaddr mmio_offset = 0xa0000000; /* 2 GiB + 512 MiB */
> - const hwaddr pio_offset = 0x80000000; /* 2 GiB */
> - const uint32_t max_index = 255;
> - const hwaddr phb0_alignment = 0x10000000000ULL; /* 1 TiB */
> - uint64_t ram_top = MACHINE(spapr)->ram_size;
> - hwaddr phb0_base, phb_base;
> + int max_phbs =
> + (SPAPR_PCI_LIMIT - SPAPR_PCI_BASE) / SPAPR_PCI_MEM64_WIN_SIZE - 1;
> + hwaddr mmio32_base = SPAPR_PCI_BASE + SPAPR_PCI_MEM32_WIN_SIZE;
The result is the same but I would add SPAPR_PCI_MEM_WIN_BUS_OFFSET
instead of SPAPR_PCI_MEM32_WIN_SIZE.
As SPAPR_PCI_MEM32_WIN_SIZE is defined as "((1ULL << 32) -
SPAPR_PCI_MEM_WIN_BUS_OFFSET)", I guess 0..SPAPR_PCI_MEM_WIN_BUS_OFFSET
is for PIO and SPAPR_PCI_MEM_WIN_BUS_OFFSET..(1<<32) is for MMIO.
This is what we have below with:
*pio = SPAPR_PCI_BASE + index * SPAPR_PCI_IO_WIN_SIZE;
*mmio32 = mmio32_base + index * SPAPR_PCI_MEM32_WIN_SIZE;
Perhaps we can see *mmio32 as
SPAPR_PCI_BASE + (index + 1) * SPAPR_PCI_MEM32_WIN_SIZE
But in this case you should directly defined SPAPR_PCI_MEM32_WIN_SIZE as
0x80000000, not as "SIZE - OFFSET". It's confusing...
If it is only a misunderstanding from my side, you can add my:
Reviewed-by: Laurent Vivier <lviv...@redhat.com>