Re: [Qemu-devel] [PATCH] ppc/pnv: Add model for Power8 PHB3 PCIe Host bridge

2018-07-02 Thread David Gibson
On Thu, Jun 28, 2018 at 10:14:53PM +1000, Benjamin Herrenschmidt wrote:
> On Thu, 2018-06-28 at 10:00 +0200, Andrea Bolognani wrote:
> > On Thu, 2018-06-28 at 13:59 +1000, David Gibson wrote:
> > > On Wed, Jun 27, 2018 at 12:22:31PM +0200, Andrea Bolognani wrote:
> > > > On Tue, 2018-06-26 at 19:02 +0200, Cédric Le Goater wrote:
> > > > > I didn't follow that discussion but this is "another" kind of PHB.
> > > > > This one models the baremetal controller as found on OpenPOWER and
> > > > > IBM Power machines. pSeries has a virtual PHB.
> > > > 
> > > > I understand that, and of course libvirt will need to learn about
> > > > this new type of PHB and make sure both pSeries and PowerNV guests
> > > > get the correct one assigned to them.
> > > 
> > > Hmm.. does it?  I would have thought pnv could act more like x86, in
> > > that libvirt doesn't attempt to create PHBs at all and just use the
> > > ones that are built in.
> > 
> > AFAIK x86 guests have a single PHB and additional ones cannot be
> > created in any way, which means we don't have to do any additional
> > second-guessing when assigning IDs to additional PCI controllers.
> 
> That's a surprising limitation. A single PHB only supports a limited
> number of MSIs no ? And only 256 bus numbers...

I think it depends exactly what you call a "PHB".  AIUI, on modern x86
systems, multiple PCI domains are supported, but you access them all
through the same IO ports, using a 'domain' field in some register to
distinguish which you're operating on

Wheter you want to call that multiple PHBs with a register multiplexer
in front of them, or a single PHB off which hang multiple domains is
kind of arbitrary (at least from the guest PoV).

> > > Though, come to that, I wouldn't think pnv support for libvirt would
> > > be much of a priority anyway.  The machine type is still very much in
> > > flux, and it's designed primarily for testing and development, not
> > > "real world" usage.
> > 
> > Can you *guarantee* that someone won't ask for PowerNV support in
> > libvirt at some point? Because if you can't (and I don't think you
> > can ;) then this is still a valuable conversation to have.
> 
> It's rather unlikely for now as there is no KVM suport for it (it's
> tricky, our chips aren't designed for full virtualization). That might
> change in the future but not soon.

KVM support isn't really a prerequisite for libvirt support.  More
relevant is that the qemu level machine is still changing a lot.  I
don't believe we're really maintaining version to version option
compatibility at this point, we're certainly not attempting to support
cross version migration for it.

> > > > What I meant is that pSeries guests get a single PHB by default,
> > > > with additional ones being instantiable through -device; this is
> > > > also consistent with how PCI controllers are added to other guest
> > > > types including pc, q35 and aarch64/virt, so it would be really
> > > > nice if PowerNV behaved the same way.
> > > 
> > > Well.. sure.. but it doesn't.  pSeries is a virtual platform, so we
> > > have a reasonable amount of flexibility to define it as we want.
> > > PowerNV is an emulation of existing hardware which has a specific
> > > behaviour which we need to match.
> > 
> > Sure, that's something to keep in mind.
> > 
> > But the thing is, you still need to have *some* flexibility in
> > the number of PHBs, since there is variation among real Power8
> > and Power9 chips; in the current incarnation, that flexibility
> > is provided by the num_phbs parameter, which is an entirely new
> > interface that's exclusive to PowerNV.
> > 
> > What I'm suggesting is that the same amount of flexibility is
> > offered through a standard interface, namely -device, instead.
> 
> But that's harder internally to qemu to properly "bind" to the chip
> where the PHB resides etc... 
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH] ppc/pnv: Add model for Power8 PHB3 PCIe Host bridge

2018-06-28 Thread Cédric Le Goater
On 06/28/2018 01:40 PM, Andrea Bolognani wrote:
> On Thu, 2018-06-28 at 12:04 +0200, Cédric Le Goater wrote:
>> On 06/28/2018 10:00 AM, Andrea Bolognani wrote:
>>> On Thu, 2018-06-28 at 13:59 +1000, David Gibson wrote:
 Well.. sure.. but it doesn't.  pSeries is a virtual platform, so we
 have a reasonable amount of flexibility to define it as we want.
 PowerNV is an emulation of existing hardware which has a specific
 behaviour which we need to match.
>>>
>>> Sure, that's something to keep in mind.
>>>
>>> But the thing is, you still need to have *some* flexibility in
>>> the number of PHBs, since there is variation among real Power8
>>> and Power9 chips; in the current incarnation, that flexibility
>>> is provided by the num_phbs parameter, which is an entirely new
>>> interface that's exclusive to PowerNV.
>>>
>>> What I'm suggesting is that the same amount of flexibility is
>>> offered through a standard interface, namely -device, instead.
>>
>> Yes. I don't know to be honest. Adding support for -device is not 
>> complex.
>>
>> v2 proposes to initialize a fixed set of PHBs 2, 3, 4 depending on 
>> the CPU. I think this is the best modeling option to fit the HW.
> 
> That approach would require even more hacks in libvirt if we ever
> wanted to support PowerNV - basing the PCI address allocation on
> the CPU model is not something that's really ever happened before.

It's even more complex than that but let's keep it "simple" and
consider that some CPU flavors can have more or less PHBs.

May be, we could agree on a common number of PHBs per chip which 
would be statically initialized. *two* PHBs should cover all CPU 
flavors. Extra PHBs created by the user with -device would be 
allowed but capped by a per CPU flavor limit. How's that ? 

C.   

> 
> To make it somewhat reasonable, information about the number of
> PHBs created for each CPU model would have to be exposed through
> QMP. And I wonder what a multi-chip guest would look like...
> 
> Plus, as soon as you try something like
> 
>   $ qemu-system-ppc64 \
> -nodefaults -display none \
> -machine powernv -cpu POWER8E \
> -device pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1 \
> -device megasas,id=scsi0,bus=pci.1,addr=0x1
> 
> very interesting things will start happening :)
> 




Re: [Qemu-devel] [PATCH] ppc/pnv: Add model for Power8 PHB3 PCIe Host bridge

2018-06-28 Thread Benjamin Herrenschmidt
On Thu, 2018-06-28 at 10:00 +0200, Andrea Bolognani wrote:
> On Thu, 2018-06-28 at 13:59 +1000, David Gibson wrote:
> > On Wed, Jun 27, 2018 at 12:22:31PM +0200, Andrea Bolognani wrote:
> > > On Tue, 2018-06-26 at 19:02 +0200, Cédric Le Goater wrote:
> > > > I didn't follow that discussion but this is "another" kind of PHB.
> > > > This one models the baremetal controller as found on OpenPOWER and
> > > > IBM Power machines. pSeries has a virtual PHB.
> > > 
> > > I understand that, and of course libvirt will need to learn about
> > > this new type of PHB and make sure both pSeries and PowerNV guests
> > > get the correct one assigned to them.
> > 
> > Hmm.. does it?  I would have thought pnv could act more like x86, in
> > that libvirt doesn't attempt to create PHBs at all and just use the
> > ones that are built in.
> 
> AFAIK x86 guests have a single PHB and additional ones cannot be
> created in any way, which means we don't have to do any additional
> second-guessing when assigning IDs to additional PCI controllers.

That's a surprising limitation. A single PHB only supports a limited
number of MSIs no ? And only 256 bus numbers...

> > Though, come to that, I wouldn't think pnv support for libvirt would
> > be much of a priority anyway.  The machine type is still very much in
> > flux, and it's designed primarily for testing and development, not
> > "real world" usage.
> 
> Can you *guarantee* that someone won't ask for PowerNV support in
> libvirt at some point? Because if you can't (and I don't think you
> can ;) then this is still a valuable conversation to have.

It's rather unlikely for now as there is no KVM suport for it (it's
tricky, our chips aren't designed for full virtualization). That might
change in the future but not soon.

> > > What I meant is that pSeries guests get a single PHB by default,
> > > with additional ones being instantiable through -device; this is
> > > also consistent with how PCI controllers are added to other guest
> > > types including pc, q35 and aarch64/virt, so it would be really
> > > nice if PowerNV behaved the same way.
> > 
> > Well.. sure.. but it doesn't.  pSeries is a virtual platform, so we
> > have a reasonable amount of flexibility to define it as we want.
> > PowerNV is an emulation of existing hardware which has a specific
> > behaviour which we need to match.
> 
> Sure, that's something to keep in mind.
> 
> But the thing is, you still need to have *some* flexibility in
> the number of PHBs, since there is variation among real Power8
> and Power9 chips; in the current incarnation, that flexibility
> is provided by the num_phbs parameter, which is an entirely new
> interface that's exclusive to PowerNV.
> 
> What I'm suggesting is that the same amount of flexibility is
> offered through a standard interface, namely -device, instead.

But that's harder internally to qemu to properly "bind" to the chip
where the PHB resides etc... 

Cheers,
Ben.




Re: [Qemu-devel] [PATCH] ppc/pnv: Add model for Power8 PHB3 PCIe Host bridge

2018-06-28 Thread Cédric Le Goater
On 06/28/2018 01:40 PM, Andrea Bolognani wrote:
> On Thu, 2018-06-28 at 12:04 +0200, Cédric Le Goater wrote:
>> On 06/28/2018 10:00 AM, Andrea Bolognani wrote:
>>> On Thu, 2018-06-28 at 13:59 +1000, David Gibson wrote:
 Well.. sure.. but it doesn't.  pSeries is a virtual platform, so we
 have a reasonable amount of flexibility to define it as we want.
 PowerNV is an emulation of existing hardware which has a specific
 behaviour which we need to match.
>>>
>>> Sure, that's something to keep in mind.
>>>
>>> But the thing is, you still need to have *some* flexibility in
>>> the number of PHBs, since there is variation among real Power8
>>> and Power9 chips; in the current incarnation, that flexibility
>>> is provided by the num_phbs parameter, which is an entirely new
>>> interface that's exclusive to PowerNV.
>>>
>>> What I'm suggesting is that the same amount of flexibility is
>>> offered through a standard interface, namely -device, instead.
>>
>> Yes. I don't know to be honest. Adding support for -device is not 
>> complex.
>>
>> v2 proposes to initialize a fixed set of PHBs 2, 3, 4 depending on 
>> the CPU. I think this is the best modeling option to fit the HW.
> 
> That approach would require even more hacks in libvirt if we ever
> wanted to support PowerNV - basing the PCI address allocation on
> the CPU model is not something that's really ever happened before.

ah. hmm, so that's another +1 for the -device approach.

> To make it somewhat reasonable, information about the number of
> PHBs created for each CPU model would have to be exposed through
> QMP. And I wonder what a multi-chip guest would look like...
> 
> Plus, as soon as you try something like
> 
>   $ qemu-system-ppc64 \
> -nodefaults -display none \
> -machine powernv -cpu POWER8E \
> -device pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1 \
> -device megasas,id=scsi0,bus=pci.1,addr=0x1
> 
> very interesting things will start happening :)

Well it boots :)


/ # lspci 
:00:00.0 PCI bridge: IBM POWER8 Host Bridge (PHB3)
:01:00.0 PCI bridge: Red Hat, Inc. QEMU PCI-PCI bridge
:02:01.0 PCI bridge: Red Hat, Inc. QEMU PCI-PCI bridge
:02:02.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet 
Controller (rev 03)
:02:07.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host Controller 
(rev 03)
0001:00:00.0 PCI bridge: IBM POWER8 Host Bridge (PHB3)
0001:01:00.0 PCI bridge: Red Hat, Inc. QEMU PCI-PCI bridge
0001:02:01.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 1078


C.



Re: [Qemu-devel] [PATCH] ppc/pnv: Add model for Power8 PHB3 PCIe Host bridge

2018-06-28 Thread Andrea Bolognani
On Thu, 2018-06-28 at 12:04 +0200, Cédric Le Goater wrote:
> On 06/28/2018 10:00 AM, Andrea Bolognani wrote:
> > On Thu, 2018-06-28 at 13:59 +1000, David Gibson wrote:
> > > Well.. sure.. but it doesn't.  pSeries is a virtual platform, so we
> > > have a reasonable amount of flexibility to define it as we want.
> > > PowerNV is an emulation of existing hardware which has a specific
> > > behaviour which we need to match.
> > 
> > Sure, that's something to keep in mind.
> > 
> > But the thing is, you still need to have *some* flexibility in
> > the number of PHBs, since there is variation among real Power8
> > and Power9 chips; in the current incarnation, that flexibility
> > is provided by the num_phbs parameter, which is an entirely new
> > interface that's exclusive to PowerNV.
> > 
> > What I'm suggesting is that the same amount of flexibility is
> > offered through a standard interface, namely -device, instead.
> 
> Yes. I don't know to be honest. Adding support for -device is not 
> complex.
> 
> v2 proposes to initialize a fixed set of PHBs 2, 3, 4 depending on 
> the CPU. I think this is the best modeling option to fit the HW.

That approach would require even more hacks in libvirt if we ever
wanted to support PowerNV - basing the PCI address allocation on
the CPU model is not something that's really ever happened before.

To make it somewhat reasonable, information about the number of
PHBs created for each CPU model would have to be exposed through
QMP. And I wonder what a multi-chip guest would look like...

Plus, as soon as you try something like

  $ qemu-system-ppc64 \
-nodefaults -display none \
-machine powernv -cpu POWER8E \
-device pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1 \
-device megasas,id=scsi0,bus=pci.1,addr=0x1

very interesting things will start happening :)

-- 
Andrea Bolognani / Red Hat / Virtualization



Re: [Qemu-devel] [PATCH] ppc/pnv: Add model for Power8 PHB3 PCIe Host bridge

2018-06-28 Thread Cédric Le Goater
On 06/28/2018 10:00 AM, Andrea Bolognani wrote:
> On Thu, 2018-06-28 at 13:59 +1000, David Gibson wrote:
>> On Wed, Jun 27, 2018 at 12:22:31PM +0200, Andrea Bolognani wrote:
>>> On Tue, 2018-06-26 at 19:02 +0200, Cédric Le Goater wrote:
 I didn't follow that discussion but this is "another" kind of PHB.
 This one models the baremetal controller as found on OpenPOWER and
 IBM Power machines. pSeries has a virtual PHB.
>>>
>>> I understand that, and of course libvirt will need to learn about
>>> this new type of PHB and make sure both pSeries and PowerNV guests
>>> get the correct one assigned to them.
>>
>> Hmm.. does it?  I would have thought pnv could act more like x86, in
>> that libvirt doesn't attempt to create PHBs at all and just use the
>> ones that are built in.
> 
> AFAIK x86 guests have a single PHB and additional ones cannot be
> created in any way, which means we don't have to do any additional
> second-guessing when assigning IDs to additional PCI controllers.
> 
>> Though, come to that, I wouldn't think pnv support for libvirt would
>> be much of a priority anyway.  The machine type is still very much in
>> flux, and it's designed primarily for testing and development, not
>> "real world" usage.
> 
> Can you *guarantee* that someone won't ask for PowerNV support in
> libvirt at some point? Because if you can't (and I don't think you
> can ;) then this is still a valuable conversation to have.
> 
>>> What I meant is that pSeries guests get a single PHB by default,
>>> with additional ones being instantiable through -device; this is
>>> also consistent with how PCI controllers are added to other guest
>>> types including pc, q35 and aarch64/virt, so it would be really
>>> nice if PowerNV behaved the same way.
>>
>> Well.. sure.. but it doesn't.  pSeries is a virtual platform, so we
>> have a reasonable amount of flexibility to define it as we want.
>> PowerNV is an emulation of existing hardware which has a specific
>> behaviour which we need to match.
> 
> Sure, that's something to keep in mind.
> 
> But the thing is, you still need to have *some* flexibility in
> the number of PHBs, since there is variation among real Power8
> and Power9 chips; in the current incarnation, that flexibility
> is provided by the num_phbs parameter, which is an entirely new
> interface that's exclusive to PowerNV.
> 
> What I'm suggesting is that the same amount of flexibility is
> offered through a standard interface, namely -device, instead.

Yes. I don't know to be honest. Adding support for -device is not 
complex.

v2 proposes to initialize a fixed set of PHBs 2, 3, 4 depending on 
the CPU. I think this is the best modeling option to fit the HW.

Thanks,

C.




Re: [Qemu-devel] [PATCH] ppc/pnv: Add model for Power8 PHB3 PCIe Host bridge

2018-06-28 Thread Andrea Bolognani
On Thu, 2018-06-28 at 13:59 +1000, David Gibson wrote:
> On Wed, Jun 27, 2018 at 12:22:31PM +0200, Andrea Bolognani wrote:
> > On Tue, 2018-06-26 at 19:02 +0200, Cédric Le Goater wrote:
> > > I didn't follow that discussion but this is "another" kind of PHB.
> > > This one models the baremetal controller as found on OpenPOWER and
> > > IBM Power machines. pSeries has a virtual PHB.
> > 
> > I understand that, and of course libvirt will need to learn about
> > this new type of PHB and make sure both pSeries and PowerNV guests
> > get the correct one assigned to them.
> 
> Hmm.. does it?  I would have thought pnv could act more like x86, in
> that libvirt doesn't attempt to create PHBs at all and just use the
> ones that are built in.

AFAIK x86 guests have a single PHB and additional ones cannot be
created in any way, which means we don't have to do any additional
second-guessing when assigning IDs to additional PCI controllers.

> Though, come to that, I wouldn't think pnv support for libvirt would
> be much of a priority anyway.  The machine type is still very much in
> flux, and it's designed primarily for testing and development, not
> "real world" usage.

Can you *guarantee* that someone won't ask for PowerNV support in
libvirt at some point? Because if you can't (and I don't think you
can ;) then this is still a valuable conversation to have.

> > What I meant is that pSeries guests get a single PHB by default,
> > with additional ones being instantiable through -device; this is
> > also consistent with how PCI controllers are added to other guest
> > types including pc, q35 and aarch64/virt, so it would be really
> > nice if PowerNV behaved the same way.
> 
> Well.. sure.. but it doesn't.  pSeries is a virtual platform, so we
> have a reasonable amount of flexibility to define it as we want.
> PowerNV is an emulation of existing hardware which has a specific
> behaviour which we need to match.

Sure, that's something to keep in mind.

But the thing is, you still need to have *some* flexibility in
the number of PHBs, since there is variation among real Power8
and Power9 chips; in the current incarnation, that flexibility
is provided by the num_phbs parameter, which is an entirely new
interface that's exclusive to PowerNV.

What I'm suggesting is that the same amount of flexibility is
offered through a standard interface, namely -device, instead.

-- 
Andrea Bolognani / Red Hat / Virtualization



Re: [Qemu-devel] [PATCH] ppc/pnv: Add model for Power8 PHB3 PCIe Host bridge

2018-06-27 Thread David Gibson
On Wed, Jun 27, 2018 at 12:22:31PM +0200, Andrea Bolognani wrote:
> On Tue, 2018-06-26 at 19:02 +0200, Cédric Le Goater wrote:
> > On 06/26/2018 05:57 PM, Andrea Bolognani wrote:
> > > On Tue, 2018-06-26 at 15:59 +0200, Cédric Le Goater wrote:
> > > > This is a model of the PCIe host bridge found on Power8 chips,
> > > > including PowerBus logic interface, IOMMU support, PCIe root complex,
> > > > XICS MSI and LSI interrupt sources.
> > > > 
> > > > 4 PHBs are provisioned under the Power8 chip model to fit hardware but
> > > > only one is currently initialized.
> > > 
> > > What's the advantage in creating 4 PHBs instead of a single one,
> > 
> > The Power8 chip comes in different flavors: Venice, Murano, Naple, 
> > each having a different number of PHBs. We don't need to initialize 
> > them all to plug only a couple of devices (net, storage, usbs) 
> > 
> > When time comes, we might want to test some more complex configurations
> > or extend the modeling with CAPI support. That's why we have a :
> > 
> > #define PNV_MAX_CHIP_PHB 4
> > PnvPHB3  phbs[PNV_MAX_CHIP_PHB];
> > 
> > under the chip, and a 'num_phbs' attribute to increase the number
> > of controllers. It still needs to be tested but that's the goal.
[snip]
> > I didn't follow that discussion but this is "another" kind of PHB.
> > This one models the baremetal controller as found on OpenPOWER and
> > IBM Power machines. pSeries has a virtual PHB.
> 
> I understand that, and of course libvirt will need to learn about
> this new type of PHB and make sure both pSeries and PowerNV guests
> get the correct one assigned to them.

Hmm.. does it?  I would have thought pnv could act more like x86, in
that libvirt doesn't attempt to create PHBs at all and just use the
ones that are built in.

Though, come to that, I wouldn't think pnv support for libvirt would
be much of a priority anyway.  The machine type is still very much in
flux, and it's designed primarily for testing and development, not
"real world" usage.

> What I meant is that pSeries guests get a single PHB by default,
> with additional ones being instantiable through -device; this is
> also consistent with how PCI controllers are added to other guest
> types including pc, q35 and aarch64/virt, so it would be really
> nice if PowerNV behaved the same way.

Well.. sure.. but it doesn't.  pSeries is a virtual platform, so we
have a reasonable amount of flexibility to define it as we want.
PowerNV is an emulation of existing hardware which has a specific
behaviour which we need to match.

> > > As it is, this will confuse the heck out of libvirt's PCI address > 
> > > allocation algorithm :)
> > 
> > The pci bus name should be directly related to the PHB index. But
> > I agree we need to be careful. That's why you are in cc: :)
> 
> Thanks, I really appreciate you keeping me in the loop :)
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH] ppc/pnv: Add model for Power8 PHB3 PCIe Host bridge

2018-06-27 Thread Cédric Le Goater
On 06/27/2018 02:18 PM, Cédric Le Goater wrote:
> On 06/27/2018 12:22 PM, Andrea Bolognani wrote:
>> On Tue, 2018-06-26 at 19:02 +0200, Cédric Le Goater wrote:
>>> On 06/26/2018 05:57 PM, Andrea Bolognani wrote:
 On Tue, 2018-06-26 at 15:59 +0200, Cédric Le Goater wrote:
> This is a model of the PCIe host bridge found on Power8 chips,
> including PowerBus logic interface, IOMMU support, PCIe root complex,
> XICS MSI and LSI interrupt sources.
>
> 4 PHBs are provisioned under the Power8 chip model to fit hardware but
> only one is currently initialized.

 What's the advantage in creating 4 PHBs instead of a single one,
>>>
>>> The Power8 chip comes in different flavors: Venice, Murano, Naple, 
>>> each having a different number of PHBs. We don't need to initialize 
>>> them all to plug only a couple of devices (net, storage, usbs) 
>>>
>>> When time comes, we might want to test some more complex configurations
>>> or extend the modeling with CAPI support. That's why we have a :
>>>
>>> #define PNV_MAX_CHIP_PHB 4
>>> PnvPHB3  phbs[PNV_MAX_CHIP_PHB];
>>>
>>> under the chip, and a 'num_phbs' attribute to increase the number
>>> of controllers. It still needs to be tested but that's the goal.
>>
>> I was a bit confused about the difference between "provisioning"
>> and "initializing" a PHB, I guess.
> 
> objects have different states : allocated, initialized, realized
> I guess "provision (memory)" is not the best choice for the first
> state.
> 
>> Now that I've looked at the code, my (very uneducated) reading is
>> that you're allocating memory for 4 PHBs along with the chip, but
>> only actually initializing num_phbs of them, with 1 being the
>> default.
> 
> Yes. I had the idea to increase the number of PHBs with a machine
> option later on if needed.
> 
>> I have no idea what this implies when it comes to adding PCI
>> controller to the guest, though. If I run a guest with num_phbs=1,
>> are ids pci.1..pci.3 reserved even though the corresponding PHBs
>> have not been initialized? 
> 
> They don't exist on the PowerNV machine if you don't have a PHB3
> device backing them.
> 
>> Is num_phbs the only way you can control how many PHBs a PowerNV 
>> guest will have?
> 
> yes. Unless we make them user creatable but that's a discussion in
> progress. I am not sure user creatable PHB3s are needed. We will
> see.
>  
>> I would play around and try to figure out all the kinks on my own,
>> but I can't actually compile QEMU with this patch applied:
> 
> you need to use David's branch :
> 
>   https://github.com/dgibson/qemu/tree/ppc-for-3.0

I have pushed a tree here :

https://github.com/legoater/qemu/tree/phb3-3.0

with two patches, the one we have been discussing about, with some
cleanups and fixes, and one adding user creatable PHB3 devices so 
that extra phbs and devices can be defined on the command line :

-device pnv-phb3,chip-id=1,index=1,id=phb-1.1 -device 
nec-usb-xhci,bus=pci.1,addr=0x7

The PHB identifier is named 'index' now, to be compatible with the 
libvirt attribute. 

Cheers,

C. 



Re: [Qemu-devel] [PATCH] ppc/pnv: Add model for Power8 PHB3 PCIe Host bridge

2018-06-27 Thread Cédric Le Goater
On 06/27/2018 12:40 PM, Andrea Bolognani wrote:
> On Wed, 2018-06-27 at 18:41 +1000, Benjamin Herrenschmidt wrote:
>> On Wed, 2018-06-27 at 09:46 +0200, Cédric Le Goater wrote:
>>> So the "IBM PHB3 PCIE Root Port" is already user createable.
>>>
>>> I can take a look at user createable PHB3s. I think this is OK from a model
>>> perspective. The object is rather standalone, it needs the machine for 
>>> the XICS fabric and a couple of ids, phb id and chip id. These can come
>>> from the command line.
>>>
>>> We want at least one PHB3 per socket/chip though. 
>>
>> We don't want the user to specify the SCOM addresses though (for the
>> MMIO windows we should get skiboot to assign them).
>>
>> If the user gets to specify a thing it would be which of the 3 or 4 HW
>> PHBs of the chip it is, the SCOM addresses gets deduced.
> 
> For pSeries guests libvirt will either automatically create, or
> allow users to configure manually, PHBs with something like
> 
>   
> 
>   
> 
> which is ultimately converted to
> 
>   -device spapr-pci-host-bridge,index=1,id=pci.1
> 
> Ideally the interface for PowerNV guests can be made to be similar
> if not identical at the libvirt level, without having to add too
> many hacks... It would certainly help a lot if the QEMU interface
> for PowerNV PHBs didn't stray too far from the above.
> 

I think that we will need an extra attribute to specify the chip, but 
only in the case of a multichip system, which is not the common scenario.

So the 'index' attribute should work fine. 

Thanks,

C. 



Re: [Qemu-devel] [PATCH] ppc/pnv: Add model for Power8 PHB3 PCIe Host bridge

2018-06-27 Thread Cédric Le Goater
On 06/27/2018 12:22 PM, Andrea Bolognani wrote:
> On Tue, 2018-06-26 at 19:02 +0200, Cédric Le Goater wrote:
>> On 06/26/2018 05:57 PM, Andrea Bolognani wrote:
>>> On Tue, 2018-06-26 at 15:59 +0200, Cédric Le Goater wrote:
 This is a model of the PCIe host bridge found on Power8 chips,
 including PowerBus logic interface, IOMMU support, PCIe root complex,
 XICS MSI and LSI interrupt sources.

 4 PHBs are provisioned under the Power8 chip model to fit hardware but
 only one is currently initialized.
>>>
>>> What's the advantage in creating 4 PHBs instead of a single one,
>>
>> The Power8 chip comes in different flavors: Venice, Murano, Naple, 
>> each having a different number of PHBs. We don't need to initialize 
>> them all to plug only a couple of devices (net, storage, usbs) 
>>
>> When time comes, we might want to test some more complex configurations
>> or extend the modeling with CAPI support. That's why we have a :
>>
>>  #define PNV_MAX_CHIP_PHB 4
>>  PnvPHB3  phbs[PNV_MAX_CHIP_PHB];
>>
>> under the chip, and a 'num_phbs' attribute to increase the number
>> of controllers. It still needs to be tested but that's the goal.
> 
> I was a bit confused about the difference between "provisioning"
> and "initializing" a PHB, I guess.

objects have different states : allocated, initialized, realized
I guess "provision (memory)" is not the best choice for the first
state.

> Now that I've looked at the code, my (very uneducated) reading is
> that you're allocating memory for 4 PHBs along with the chip, but
> only actually initializing num_phbs of them, with 1 being the
> default.

Yes. I had the idea to increase the number of PHBs with a machine
option later on if needed.

> I have no idea what this implies when it comes to adding PCI
> controller to the guest, though. If I run a guest with num_phbs=1,
> are ids pci.1..pci.3 reserved even though the corresponding PHBs
> have not been initialized? 

They don't exist on the PowerNV machine if you don't have a PHB3
device backing them.

> Is num_phbs the only way you can control how many PHBs a PowerNV 
> guest will have?

yes. Unless we make them user creatable but that's a discussion in
progress. I am not sure user creatable PHB3s are needed. We will
see.
 
> I would play around and try to figure out all the kinks on my own,
> but I can't actually compile QEMU with this patch applied:

you need to use David's branch :

https://github.com/dgibson/qemu/tree/ppc-for-3.0


>   hw/pci-host/pnv_phb3_msi.c: In function ‘phb3_msi_reset’:
>   hw/pci-host/pnv_phb3_msi.c:229:11: error: ‘ICSStateClass’ {aka ‘struct 
> ICSStateClass’} has no member named ‘parent_reset’; did you mean 
> ‘parent_class’?
>icsc->parent_reset(dev);
>  ^~~~
>  parent_class
>   hw/pci-host/pnv_phb3_msi.c: In function ‘phb3_msi_realize’:
>   hw/pci-host/pnv_phb3_msi.c:260:11: error: ‘ICSStateClass’ {aka ‘struct 
> ICSStateClass’} has no member named ‘parent_realize’; did you mean 
> ‘parent_class’?
>icsc->parent_realize(dev, _err);
>  ^~
>  parent_class
>   hw/pci-host/pnv_phb3_msi.c: In function ‘phb3_msi_class_init’:
>   hw/pci-host/pnv_phb3_msi.c:293:43: error: ‘ICSStateClass’ {aka ‘struct 
> ICSStateClass’} has no member named ‘parent_realize’; did you mean 
> ‘parent_class’?
>>parent_realize);
>  ^~
>  parent_class
>   hw/pci-host/pnv_phb3_msi.c:295:41: error: ‘ICSStateClass’ {aka ‘struct 
> ICSStateClass’} has no member named ‘parent_reset’; did you mean 
> ‘parent_class’?
>  >parent_reset);
>^~~~
>parent_class
> 
>>> like we already do for pSeries guests? 
>>
>> I didn't follow that discussion but this is "another" kind of PHB.
>> This one models the baremetal controller as found on OpenPOWER and
>> IBM Power machines. pSeries has a virtual PHB.
> 
> I understand that, and of course libvirt will need to learn about
> this new type of PHB and make sure both pSeries and PowerNV guests
> get the correct one assigned to them.

yes. we don't want to change libvirt too much so this is a requirement 
to take into account.

> What I meant is that pSeries guests get a single PHB by default,

yes

> with additional ones being instantiable through -device; 

yes.

> this is
> also consistent with how PCI controllers are added to other guest
> types including pc, q35 and aarch64/virt, so it would be really
> nice if PowerNV behaved the same way.

OK. So that's a +1 in favor of user creatable PHB3s devices 

>>> As it is, this will confuse the heck out of libvirt's PCI address > 
>>> allocation algorithm :)
>>
>> The pci bus name should be directly related to the PHB index. But
>> I agree we need to be 

Re: [Qemu-devel] [PATCH] ppc/pnv: Add model for Power8 PHB3 PCIe Host bridge

2018-06-27 Thread Cédric Le Goater
On 06/27/2018 10:41 AM, Benjamin Herrenschmidt wrote:
> On Wed, 2018-06-27 at 09:46 +0200, Cédric Le Goater wrote:
>> So the "IBM PHB3 PCIE Root Port" is already user createable.
>>
>> I can take a look at user createable PHB3s. I think this is OK from a model
>> perspective. The object is rather standalone, it needs the machine for 
>> the XICS fabric and a couple of ids, phb id and chip id. These can come
>> from the command line.
>>
>> We want at least one PHB3 per socket/chip though. 
> 
> We don't want the user to specify the SCOM addresses though

We don't need to.

>  (for the MMIO windows we should get skiboot to assign them).

yes. That's a small fix yet to be done. It can come later.

> If the user gets to specify a thing it would be which of the 3 or 4 HW
> PHBs of the chip it is, the SCOM addresses gets deduced.


So I modified a bit the model to support user creatable PHB3 devices and
could start a QEMU powernv machine with these options :
  
Raid controller on PHB0 :

  -device megasas,id=scsi0,bus=pci.0,addr=0x1
  -drive file=$file,if=none,id=drive-scsi0-0-0-0,format=qcow2,cache=none
  -device 
scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1

Ethernet controller on PHB0 :

  -device e1000,netdev=net0,mac=C0:FF:EE:00:00:02,bus=pci.0,addr=0x2
  -netdev 
bridge,id=net0,helper=/usr/libexec/qemu-bridge-helper,br=virbr0,id=net0

USB controller on PHB0 :

  -device nec-usb-xhci,bus=pci.0,addr=0x7

Extra PHB1 on chip 0 :

  -device pnv-phb3,chip-id=0,phb-id=1,id=phb0.1

USB controller on PHB1 :

  -device nec-usb-xhci,bus=pci.1,addr=0x7 

Most of the code changes are fine and even fix a few problems.  

But, user created PHB3 devices are not parented to the PnvChip and 
this means that we need to : 

 1 - scan the full object hierarchy to populate the device tree 
 2 - look for the owning chip in the PHP3 realize routine 
 3 - grab the pnv machine to get the XICS fabric

2 & 3 are somewhat violation of the QOM models but we can live with 
that. 1 is annoying. I think.

So may be we should forget about the -device possibility and use fixed
values for the number of PHBs. The value would depend on the chip flavor.
 

Cheers,

C. 



Re: [Qemu-devel] [PATCH] ppc/pnv: Add model for Power8 PHB3 PCIe Host bridge

2018-06-27 Thread Andrea Bolognani
On Wed, 2018-06-27 at 18:41 +1000, Benjamin Herrenschmidt wrote:
> On Wed, 2018-06-27 at 09:46 +0200, Cédric Le Goater wrote:
> > So the "IBM PHB3 PCIE Root Port" is already user createable.
> > 
> > I can take a look at user createable PHB3s. I think this is OK from a model
> > perspective. The object is rather standalone, it needs the machine for 
> > the XICS fabric and a couple of ids, phb id and chip id. These can come
> > from the command line.
> > 
> > We want at least one PHB3 per socket/chip though. 
> 
> We don't want the user to specify the SCOM addresses though (for the
> MMIO windows we should get skiboot to assign them).
> 
> If the user gets to specify a thing it would be which of the 3 or 4 HW
> PHBs of the chip it is, the SCOM addresses gets deduced.

For pSeries guests libvirt will either automatically create, or
allow users to configure manually, PHBs with something like

  

  

which is ultimately converted to

  -device spapr-pci-host-bridge,index=1,id=pci.1

Ideally the interface for PowerNV guests can be made to be similar
if not identical at the libvirt level, without having to add too
many hacks... It would certainly help a lot if the QEMU interface
for PowerNV PHBs didn't stray too far from the above.

-- 
Andrea Bolognani / Red Hat / Virtualization



Re: [Qemu-devel] [PATCH] ppc/pnv: Add model for Power8 PHB3 PCIe Host bridge

2018-06-27 Thread Andrea Bolognani
On Tue, 2018-06-26 at 19:02 +0200, Cédric Le Goater wrote:
> On 06/26/2018 05:57 PM, Andrea Bolognani wrote:
> > On Tue, 2018-06-26 at 15:59 +0200, Cédric Le Goater wrote:
> > > This is a model of the PCIe host bridge found on Power8 chips,
> > > including PowerBus logic interface, IOMMU support, PCIe root complex,
> > > XICS MSI and LSI interrupt sources.
> > > 
> > > 4 PHBs are provisioned under the Power8 chip model to fit hardware but
> > > only one is currently initialized.
> > 
> > What's the advantage in creating 4 PHBs instead of a single one,
> 
> The Power8 chip comes in different flavors: Venice, Murano, Naple, 
> each having a different number of PHBs. We don't need to initialize 
> them all to plug only a couple of devices (net, storage, usbs) 
> 
> When time comes, we might want to test some more complex configurations
> or extend the modeling with CAPI support. That's why we have a :
> 
>   #define PNV_MAX_CHIP_PHB 4
>   PnvPHB3  phbs[PNV_MAX_CHIP_PHB];
> 
> under the chip, and a 'num_phbs' attribute to increase the number
> of controllers. It still needs to be tested but that's the goal.

I was a bit confused about the difference between "provisioning"
and "initializing" a PHB, I guess.

Now that I've looked at the code, my (very uneducated) reading is
that you're allocating memory for 4 PHBs along with the chip, but
only actually initializing num_phbs of them, with 1 being the
default.

I have no idea what this implies when it comes to adding PCI
controller to the guest, though. If I run a guest with num_phbs=1,
are ids pci.1..pci.3 reserved even though the corresponding PHBs
have not been initialized? Is num_phbs the only way you can control
how many PHBs a PowerNV guest will have?

I would play around and try to figure out all the kinks on my own,
but I can't actually compile QEMU with this patch applied:

  hw/pci-host/pnv_phb3_msi.c: In function ‘phb3_msi_reset’:
  hw/pci-host/pnv_phb3_msi.c:229:11: error: ‘ICSStateClass’ {aka ‘struct 
ICSStateClass’} has no member named ‘parent_reset’; did you mean ‘parent_class’?
   icsc->parent_reset(dev);
 ^~~~
 parent_class
  hw/pci-host/pnv_phb3_msi.c: In function ‘phb3_msi_realize’:
  hw/pci-host/pnv_phb3_msi.c:260:11: error: ‘ICSStateClass’ {aka ‘struct 
ICSStateClass’} has no member named ‘parent_realize’; did you mean 
‘parent_class’?
   icsc->parent_realize(dev, _err);
 ^~
 parent_class
  hw/pci-host/pnv_phb3_msi.c: In function ‘phb3_msi_class_init’:
  hw/pci-host/pnv_phb3_msi.c:293:43: error: ‘ICSStateClass’ {aka ‘struct 
ICSStateClass’} has no member named ‘parent_realize’; did you mean 
‘parent_class’?
   >parent_realize);
 ^~
 parent_class
  hw/pci-host/pnv_phb3_msi.c:295:41: error: ‘ICSStateClass’ {aka ‘struct 
ICSStateClass’} has no member named ‘parent_reset’; did you mean ‘parent_class’?
 >parent_reset);
   ^~~~
   parent_class

> > like we already do for pSeries guests? 
> 
> I didn't follow that discussion but this is "another" kind of PHB.
> This one models the baremetal controller as found on OpenPOWER and
> IBM Power machines. pSeries has a virtual PHB.

I understand that, and of course libvirt will need to learn about
this new type of PHB and make sure both pSeries and PowerNV guests
get the correct one assigned to them.

What I meant is that pSeries guests get a single PHB by default,
with additional ones being instantiable through -device; this is
also consistent with how PCI controllers are added to other guest
types including pc, q35 and aarch64/virt, so it would be really
nice if PowerNV behaved the same way.

> > As it is, this will confuse the heck out of libvirt's PCI address > 
> > allocation algorithm :)
> 
> The pci bus name should be directly related to the PHB index. But
> I agree we need to be careful. That's why you are in cc: :)

Thanks, I really appreciate you keeping me in the loop :)

-- 
Andrea Bolognani / Red Hat / Virtualization



Re: [Qemu-devel] [PATCH] ppc/pnv: Add model for Power8 PHB3 PCIe Host bridge

2018-06-27 Thread Benjamin Herrenschmidt
On Wed, 2018-06-27 at 09:46 +0200, Cédric Le Goater wrote:
> So the "IBM PHB3 PCIE Root Port" is already user createable.
> 
> I can take a look at user createable PHB3s. I think this is OK from a model
> perspective. The object is rather standalone, it needs the machine for 
> the XICS fabric and a couple of ids, phb id and chip id. These can come
> from the command line.
> 
> We want at least one PHB3 per socket/chip though. 

We don't want the user to specify the SCOM addresses though (for the
MMIO windows we should get skiboot to assign them).

If the user gets to specify a thing it would be which of the 3 or 4 HW
PHBs of the chip it is, the SCOM addresses gets deduced.

Cheers,
Ben.



Re: [Qemu-devel] [PATCH] ppc/pnv: Add model for Power8 PHB3 PCIe Host bridge

2018-06-27 Thread Cédric Le Goater
On 06/27/2018 09:28 AM, David Gibson wrote:
> On Wed, Jun 27, 2018 at 11:38:17AM +1000, Benjamin Herrenschmidt wrote:
>> On Wed, 2018-06-27 at 03:35 +0300, Michael S. Tsirkin wrote:
>>>
 +
 +/* Extract field fname from val */
 +#define GETFIELD(fname, val)\
 +(((val) & fname##_MASK) >> fname##_LSH)
 +
 +/* Set field fname of oval to fval
 + * NOTE: oval isn't modified, the combined result is returned
 + */
 +#define SETFIELD(fname, oval, fval) \
 +(((oval) & ~fname##_MASK) | \
 + typeof(oval))(fval)) << fname##_LSH) & fname##_MASK))
 +
>>>
>>> Pls don't make up macros like these. We can't have each device do it.
>>
>> So what ? We move the macros in a generic place ? These are MUCH better
>> than open-coding the masks & shifts and much less error prone.
> 
> There are already deposit32 and deposit64() functions which I think
> would cover this.

OK. I understand but we can't include target/ppc/cpu.h in this file
either and we want to use these specific macros to keep the register 
definition file similar to the one in skiboot.

Is their a common area for such needs ? 

>>
 @@ -1031,6 +1110,7 @@ static Property pnv_chip_properties[] = {
  DEFINE_PROP_UINT64("ram-size", PnvChip, ram_size, 0),
  DEFINE_PROP_UINT32("nr-cores", PnvChip, nr_cores, 1),
  DEFINE_PROP_UINT64("cores-mask", PnvChip, cores_mask, 0x0),
 +DEFINE_PROP_UINT32("num-phbs", PnvChip, num_phbs, 1),
  DEFINE_PROP_END_OF_LIST(),
  };
>>>
>>> How about instanciating each extra phb using -device?
>>> Seems better than teaching everyone about platform-specific
>>> options.
>>
>> It's about which PHBs are enabled, not which are instanciated, if I
>> understand Cedric changes ...
>>
>> This aims are implementing the POWER8 chip correctly, it has a fixed
>> number of PHBs per-chip at very specific XSCOM addresses, that the
>> firwmare knows about.
> 
> Yeah, this is a recurring design conflict, which I'm not sure how to
> address.  Usually instantiating things with -device works better, but
> when do we do when that allows more flexibility with hardware
> arrangement than is actually possible in the hardware.
> 

So the "IBM PHB3 PCIE Root Port" is already user createable.

I can take a look at user createable PHB3s. I think this is OK from a model
perspective. The object is rather standalone, it needs the machine for 
the XICS fabric and a couple of ids, phb id and chip id. These can come
from the command line.

We want at least one PHB3 per socket/chip though. 

C. 




Re: [Qemu-devel] [PATCH] ppc/pnv: Add model for Power8 PHB3 PCIe Host bridge

2018-06-27 Thread David Gibson
On Wed, Jun 27, 2018 at 11:38:17AM +1000, Benjamin Herrenschmidt wrote:
> On Wed, 2018-06-27 at 03:35 +0300, Michael S. Tsirkin wrote:
> > 
> > > +
> > > +/* Extract field fname from val */
> > > +#define GETFIELD(fname, val)\
> > > +(((val) & fname##_MASK) >> fname##_LSH)
> > > +
> > > +/* Set field fname of oval to fval
> > > + * NOTE: oval isn't modified, the combined result is returned
> > > + */
> > > +#define SETFIELD(fname, oval, fval) \
> > > +(((oval) & ~fname##_MASK) | \
> > > + typeof(oval))(fval)) << fname##_LSH) & fname##_MASK))
> > > +
> > 
> > Pls don't make up macros like these. We can't have each device do it.
> 
> So what ? We move the macros in a generic place ? These are MUCH better
> than open-coding the masks & shifts and much less error prone.

There are already deposit32 and deposit64() functions which I think
would cover this.

> 
> > > @@ -1031,6 +1110,7 @@ static Property pnv_chip_properties[] = {
> > >  DEFINE_PROP_UINT64("ram-size", PnvChip, ram_size, 0),
> > >  DEFINE_PROP_UINT32("nr-cores", PnvChip, nr_cores, 1),
> > >  DEFINE_PROP_UINT64("cores-mask", PnvChip, cores_mask, 0x0),
> > > +DEFINE_PROP_UINT32("num-phbs", PnvChip, num_phbs, 1),
> > >  DEFINE_PROP_END_OF_LIST(),
> > >  };
> > 
> > How about instanciating each extra phb using -device?
> > Seems better than teaching everyone about platform-specific
> > options.
> 
> It's about which PHBs are enabled, not which are instanciated, if I
> understand Cedric changes ...
> 
> This aims are implementing the POWER8 chip correctly, it has a fixed
> number of PHBs per-chip at very specific XSCOM addresses, that the
> firwmare knows about.

Yeah, this is a recurring design conflict, which I'm not sure how to
address.  Usually instantiating things with -device works better, but
when do we do when that allows more flexibility with hardware
arrangement than is actually possible in the hardware.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH] ppc/pnv: Add model for Power8 PHB3 PCIe Host bridge

2018-06-26 Thread Michael S. Tsirkin
On Wed, Jun 27, 2018 at 11:38:17AM +1000, Benjamin Herrenschmidt wrote:
> On Wed, 2018-06-27 at 03:35 +0300, Michael S. Tsirkin wrote:
> > 
> > > +
> > > +/* Extract field fname from val */
> > > +#define GETFIELD(fname, val)\
> > > +(((val) & fname##_MASK) >> fname##_LSH)
> > > +
> > > +/* Set field fname of oval to fval
> > > + * NOTE: oval isn't modified, the combined result is returned
> > > + */
> > > +#define SETFIELD(fname, oval, fval) \
> > > +(((oval) & ~fname##_MASK) | \
> > > + typeof(oval))(fval)) << fname##_LSH) & fname##_MASK))
> > > +
> > 
> > Pls don't make up macros like these. We can't have each device do it.
> 
> So what ? We move the macros in a generic place ? These are MUCH better
> than open-coding the masks & shifts and much less error prone.

include/qemu/bitops.h has a ton of handy macros.

> > > @@ -1031,6 +1110,7 @@ static Property pnv_chip_properties[] = {
> > >  DEFINE_PROP_UINT64("ram-size", PnvChip, ram_size, 0),
> > >  DEFINE_PROP_UINT32("nr-cores", PnvChip, nr_cores, 1),
> > >  DEFINE_PROP_UINT64("cores-mask", PnvChip, cores_mask, 0x0),
> > > +DEFINE_PROP_UINT32("num-phbs", PnvChip, num_phbs, 1),
> > >  DEFINE_PROP_END_OF_LIST(),
> > >  };
> > 
> > How about instanciating each extra phb using -device?
> > Seems better than teaching everyone about platform-specific
> > options.
> 
> It's about which PHBs are enabled, not which are instanciated, if I
> understand Cedric changes ...
> 
> This aims are implementing the POWER8 chip correctly, it has a fixed
> number of PHBs per-chip at very specific XSCOM addresses, that the
> firwmare knows about.
> 
> Cheers,
> Ben.



Re: [Qemu-devel] [PATCH] ppc/pnv: Add model for Power8 PHB3 PCIe Host bridge

2018-06-26 Thread Benjamin Herrenschmidt
On Wed, 2018-06-27 at 03:35 +0300, Michael S. Tsirkin wrote:
> 
> > +
> > +/* Extract field fname from val */
> > +#define GETFIELD(fname, val)\
> > +(((val) & fname##_MASK) >> fname##_LSH)
> > +
> > +/* Set field fname of oval to fval
> > + * NOTE: oval isn't modified, the combined result is returned
> > + */
> > +#define SETFIELD(fname, oval, fval) \
> > +(((oval) & ~fname##_MASK) | \
> > + typeof(oval))(fval)) << fname##_LSH) & fname##_MASK))
> > +
> 
> Pls don't make up macros like these. We can't have each device do it.

So what ? We move the macros in a generic place ? These are MUCH better
than open-coding the masks & shifts and much less error prone.

> > @@ -1031,6 +1110,7 @@ static Property pnv_chip_properties[] = {
> >  DEFINE_PROP_UINT64("ram-size", PnvChip, ram_size, 0),
> >  DEFINE_PROP_UINT32("nr-cores", PnvChip, nr_cores, 1),
> >  DEFINE_PROP_UINT64("cores-mask", PnvChip, cores_mask, 0x0),
> > +DEFINE_PROP_UINT32("num-phbs", PnvChip, num_phbs, 1),
> >  DEFINE_PROP_END_OF_LIST(),
> >  };
> 
> How about instanciating each extra phb using -device?
> Seems better than teaching everyone about platform-specific
> options.

It's about which PHBs are enabled, not which are instanciated, if I
understand Cedric changes ...

This aims are implementing the POWER8 chip correctly, it has a fixed
number of PHBs per-chip at very specific XSCOM addresses, that the
firwmare knows about.

Cheers,
Ben.




Re: [Qemu-devel] [PATCH] ppc/pnv: Add model for Power8 PHB3 PCIe Host bridge

2018-06-26 Thread Michael S. Tsirkin
On Tue, Jun 26, 2018 at 03:59:28PM +0200, Cédric Le Goater wrote:
> diff --git a/include/hw/pci-host/pnv_phb3_regs.h 
> b/include/hw/pci-host/pnv_phb3_regs.h
> new file mode 100644
> index ..a1672726b908
> --- /dev/null
> +++ b/include/hw/pci-host/pnv_phb3_regs.h
> @@ -0,0 +1,510 @@
> +/* Copyright (c) 2013-2018, IBM Corporation.
> + *
> + * Licensed under the Apache License, Version 2.0 (the "License");
> + * you may not use this file except in compliance with the License.
> + * You may obtain a copy of the License at
> + *
> + *  http://www.apache.org/licenses/LICENSE-2.0
> + *
> + * Unless required by applicable law or agreed to in writing, software
> + * distributed under the License is distributed on an "AS IS" BASIS,
> + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
> + * implied.
> + *
> + * See the License for the specific language governing permissions and
> + * limitations under the License.
> + */
> +
> +#ifndef PCI_HOST_PNV_PHB3_REGS_H
> +#define PCI_HOST_PNV_PHB3_REGS_H
> +
> +/*
> + * Duplicated from target/ppc/cpu.h
> + */
> +#define PPC_BIT(bit)(0x8000UL >> (bit))
> +#define PPC_BIT32(bit)  (0x8000UL >> (bit))
> +#define PPC_BIT8(bit)   (0x80UL >> (bit))
> +#define PPC_BITMASK(bs, be) ((PPC_BIT(bs) - PPC_BIT(be)) | PPC_BIT(bs))
> +#define PPC_BITMASK32(bs, be)   ((PPC_BIT32(bs) - PPC_BIT32(be)) | \
> + PPC_BIT32(bs))
> +#define PPC_BITLSHIFT(be)   (63 - (be))
> +#define PPC_BITLSHIFT32(be) (31 - (be))



> +
> +/* Extract field fname from val */
> +#define GETFIELD(fname, val)\
> +(((val) & fname##_MASK) >> fname##_LSH)
> +
> +/* Set field fname of oval to fval
> + * NOTE: oval isn't modified, the combined result is returned
> + */
> +#define SETFIELD(fname, oval, fval) \
> +(((oval) & ~fname##_MASK) | \
> + typeof(oval))(fval)) << fname##_LSH) & fname##_MASK))
> +

Pls don't make up macros like these. We can't have each device do it.


> @@ -1031,6 +1110,7 @@ static Property pnv_chip_properties[] = {
>  DEFINE_PROP_UINT64("ram-size", PnvChip, ram_size, 0),
>  DEFINE_PROP_UINT32("nr-cores", PnvChip, nr_cores, 1),
>  DEFINE_PROP_UINT64("cores-mask", PnvChip, cores_mask, 0x0),
> +DEFINE_PROP_UINT32("num-phbs", PnvChip, num_phbs, 1),
>  DEFINE_PROP_END_OF_LIST(),
>  };

How about instanciating each extra phb using -device?
Seems better than teaching everyone about platform-specific
options.

-- 
MST



Re: [Qemu-devel] [PATCH] ppc/pnv: Add model for Power8 PHB3 PCIe Host bridge

2018-06-26 Thread Benjamin Herrenschmidt
On Tue, 2018-06-26 at 17:57 +0200, Andrea Bolognani wrote:
> On Tue, 2018-06-26 at 15:59 +0200, Cédric Le Goater wrote:
> > This is a model of the PCIe host bridge found on Power8 chips,
> > including PowerBus logic interface, IOMMU support, PCIe root complex,
> > XICS MSI and LSI interrupt sources.
> > 
> > 4 PHBs are provisioned under the Power8 chip model to fit hardware but
> > only one is currently initialized.
> 
> What's the advantage in creating 4 PHBs instead of a single one,
> like we already do for pSeries guests? As it is, this will confuse
> the heck out of libvirt's PCI address allocation algorithm :)

This matches the actual HW. POWER9 will have 6 per chip :-)

The goal of the "powernv" platform in qemu is to closely match the
actual HW.

Note that pseries guests can (and will under some cirscumstances) have
multiple PHBs as well.

Cheers,
Ben.




Re: [Qemu-devel] [PATCH] ppc/pnv: Add model for Power8 PHB3 PCIe Host bridge

2018-06-26 Thread Cédric Le Goater
On 06/26/2018 05:57 PM, Andrea Bolognani wrote:
> On Tue, 2018-06-26 at 15:59 +0200, Cédric Le Goater wrote:
>> This is a model of the PCIe host bridge found on Power8 chips,
>> including PowerBus logic interface, IOMMU support, PCIe root complex,
>> XICS MSI and LSI interrupt sources.
>>
>> 4 PHBs are provisioned under the Power8 chip model to fit hardware but
>> only one is currently initialized.
> 
> What's the advantage in creating 4 PHBs instead of a single one,

The Power8 chip comes in different flavors: Venice, Murano, Naple, 
each having a different number of PHBs. We don't need to initialize 
them all to plug only a couple of devices (net, storage, usbs) 

When time comes, we might want to test some more complex configurations
or extend the modeling with CAPI support. That's why we have a :

#define PNV_MAX_CHIP_PHB 4
PnvPHB3  phbs[PNV_MAX_CHIP_PHB];

under the chip, and a 'num_phbs' attribute to increase the number
of controllers. It still needs to be tested but that's the goal.

> like we already do for pSeries guests? 

I didn't follow that discussion but this is "another" kind of PHB.
This one models the baremetal controller as found on OpenPOWER and
IBM Power machines. pSeries has a virtual PHB.  

> As it is, this will confuse the heck out of libvirt's PCI address > 
> allocation algorithm :)

The pci bus name should be directly related to the PHB index. But
I agree we need to be careful. That's why you are in cc: :)

Thanks,

C.  




Re: [Qemu-devel] [PATCH] ppc/pnv: Add model for Power8 PHB3 PCIe Host bridge

2018-06-26 Thread Andrea Bolognani
On Tue, 2018-06-26 at 15:59 +0200, Cédric Le Goater wrote:
> This is a model of the PCIe host bridge found on Power8 chips,
> including PowerBus logic interface, IOMMU support, PCIe root complex,
> XICS MSI and LSI interrupt sources.
> 
> 4 PHBs are provisioned under the Power8 chip model to fit hardware but
> only one is currently initialized.

What's the advantage in creating 4 PHBs instead of a single one,
like we already do for pSeries guests? As it is, this will confuse
the heck out of libvirt's PCI address allocation algorithm :)

-- 
Andrea Bolognani / Red Hat / Virtualization



[Qemu-devel] [PATCH] ppc/pnv: Add model for Power8 PHB3 PCIe Host bridge

2018-06-26 Thread Cédric Le Goater
From: Benjamin Herrenschmidt 

This is a model of the PCIe host bridge found on Power8 chips,
including PowerBus logic interface, IOMMU support, PCIe root complex,
XICS MSI and LSI interrupt sources.

4 PHBs are provisioned under the Power8 chip model to fit hardware but
only one is currently initialized. No default device layout is
provided and devices should be added on the pci.0 bus using command
line options such as :

  -device e1000,netdev=net0,mac=C0:FF:EE:00:00:02,bus=pci.0,addr=0x2
  -netdev 
bridge,id=net0,helper=/usr/libexec/qemu-bridge-helper,br=virbr0,id=hostnet0

  -device megasas,id=scsi0,bus=pci.0,addr=0x1
  -drive file=$disk,if=none,id=drive-scsi0-0-0-0,format=qcow2,cache=none
  -device 
scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=2

This implementation doesn't emulate the EEH error handling (and may
never do).

Signed-off-by: Benjamin Herrenschmidt 
[clg: rewrote the QOM models
  misc fixes
  lots of love and care]
Signed-off-by: Cédric Le Goater 
---
 default-configs/ppc64-softmmu.mak   |1 +
 include/hw/pci-host/pnv_phb3.h  |  158 +
 include/hw/pci-host/pnv_phb3_regs.h |  510 
 include/hw/ppc/pnv.h|6 +
 include/hw/ppc/pnv_xscom.h  |9 +
 include/hw/ppc/xics.h   |1 +
 hw/intc/xics.c  |2 +-
 hw/pci-host/pnv_phb3.c  | 1089 +++
 hw/pci-host/pnv_phb3_msi.c  |  315 ++
 hw/pci-host/pnv_phb3_pbcq.c |  350 +++
 hw/ppc/pnv.c|  106 +++-
 hw/ppc/pnv_xscom.c  |6 +-
 hw/pci-host/Makefile.objs   |1 +
 13 files changed, 2549 insertions(+), 5 deletions(-)
 create mode 100644 include/hw/pci-host/pnv_phb3.h
 create mode 100644 include/hw/pci-host/pnv_phb3_regs.h
 create mode 100644 hw/pci-host/pnv_phb3.c
 create mode 100644 hw/pci-host/pnv_phb3_msi.c
 create mode 100644 hw/pci-host/pnv_phb3_pbcq.c

diff --git a/default-configs/ppc64-softmmu.mak 
b/default-configs/ppc64-softmmu.mak
index b94af6c7c62a..deebba2b044a 100644
--- a/default-configs/ppc64-softmmu.mak
+++ b/default-configs/ppc64-softmmu.mak
@@ -9,6 +9,7 @@ CONFIG_IPMI=y
 CONFIG_IPMI_LOCAL=y
 CONFIG_IPMI_EXTERN=y
 CONFIG_ISA_IPMI_BT=y
+CONFIG_PCIE_PORT=y
 
 # For pSeries
 CONFIG_PSERIES=y
diff --git a/include/hw/pci-host/pnv_phb3.h b/include/hw/pci-host/pnv_phb3.h
new file mode 100644
index ..795e9b3af2c0
--- /dev/null
+++ b/include/hw/pci-host/pnv_phb3.h
@@ -0,0 +1,158 @@
+/*
+ * QEMU PowerPC PowerNV PHB3 model
+ *
+ * Copyright (c) 2014-2018, IBM Corporation.
+ *
+ * This code is licensed under the GPL version 2 or later. See the
+ * COPYING file in the top-level directory.
+ */
+
+#ifndef PCI_HOST_PNV_PHB3_H
+#define PCI_HOST_PNV_PHB3_H
+
+#include "hw/pci/pci_host.h"
+#include "hw/ppc/xics.h"
+
+typedef struct PnvPHB3 PnvPHB3;
+
+/*
+ * PHB3 XICS Source for MSIs
+ */
+#define TYPE_PHB3_MSI "phb3-msi"
+#define PHB3_MSI(obj) OBJECT_CHECK(Phb3MsiState, (obj), TYPE_PHB3_MSI)
+
+#define PHB3_MAX_MSI 2048
+
+typedef struct Phb3MsiState {
+ICSState ics;
+
+PnvPHB3 *phb;
+uint64_t rba[PHB3_MAX_MSI / 64];
+uint32_t rba_sum;
+} Phb3MsiState;
+
+void pnv_phb3_msi_update_config(Phb3MsiState *msis, uint32_t base,
+uint32_t count);
+void pnv_phb3_msi_send(Phb3MsiState *msis, uint64_t addr, uint16_t data,
+   int32_t dev_pe);
+void pnv_phb3_msi_ffi(Phb3MsiState *msis, uint64_t val);
+
+
+/* We have one such address space wrapper per possible device
+ * under the PHB since they need to be assigned statically at
+ * qemu device creation time. The relationship to a PE is done
+ * later dynamically. This means we can potentially create a lot
+ * of these guys. Q35 stores them as some kind of radix tree but
+ * we never really need to do fast lookups so instead we simply
+ * keep a QLIST of them for now, we can add the radix if needed
+ * later on.
+ *
+ * We do cache the PE number to speed things up a bit though.
+ */
+typedef struct PnvPhb3DMASpace {
+PCIBus *bus;
+uint8_t devfn;
+int pe_num; /* Cached PE number */
+#define PHB_INVALID_PE (-1)
+PnvPHB3 *phb;
+AddressSpace dma_as;
+IOMMUMemoryRegion dma_mr;
+MemoryRegion msi32_mr;
+MemoryRegion msi64_mr;
+bool msi32_mapped;
+bool msi64_mapped;
+QLIST_ENTRY(PnvPhb3DMASpace) list;
+} PnvPhb3DMASpace;
+
+/*
+ * PHB3 Power Bus Common Queue
+ */
+#define TYPE_PNV_PBCQ "pnv-pbcq"
+#define PNV_PBCQ(obj) OBJECT_CHECK(PnvPBCQState, (obj), TYPE_PNV_PBCQ)
+
+typedef struct PnvPBCQState {
+DeviceState parent;
+
+uint32_t chip_id;
+uint32_t phb_id;
+uint32_t nest_xbase;
+uint32_t spci_xbase;
+uint32_t pci_xbase;
+#define PBCQ_NEST_REGS_COUNT0x46
+#define PBCQ_PCI_REGS_COUNT 0x15
+#define PBCQ_SPCI_REGS_COUNT0x5
+
+uint64_t nest_regs[PBCQ_NEST_REGS_COUNT];
+uint64_t