On Wed, Aug 09, 2017 at 10:48:48AM +0200, Cédric Le Goater wrote:
> On 08/09/2017 05:53 AM, David Gibson wrote:
> > On Tue, Aug 08, 2017 at 10:56:12AM +0200, Cédric Le Goater wrote:
> >> This is the framework for using XIVE in a PowerVM guest. The support
> >> is very similar to the native one in a much simpler form.
> >>
> >> Instead of OPAL calls, a set of Hypervisors call are used to configure
> >> the interrupt sources and the event/notification queues of the guest:
> >>
> >>  - H_INT_GET_SOURCE_INFO
> >>
> >>    used to obtain the address of the MMIO page of the Event State
> >>    Buffer (PQ bits) entry associated with the source.
> >>
> >>  - H_INT_SET_SOURCE_CONFIG
> >>
> >>    assigns a source to a "target".
> >>
> >>  - H_INT_GET_SOURCE_CONFIG
> >>
> >>    determines to which "target" and "priority" is assigned to a source
> >>
> >>  - H_INT_GET_QUEUE_INFO
> >>
> >>    returns the address of the notification management page associated
> >>    with the specified "target" and "priority".
> >>
> >>  - H_INT_SET_QUEUE_CONFIG
> >>
> >>    sets or resets the event queue for a given "target" and "priority".
> >>    It is also used to set the notification config associated with the
> >>    queue, only unconditional notification for the moment.  Reset is
> >>    performed with a queue size of 0 and queueing is disabled in that
> >>    case.
> >>
> >>  - H_INT_GET_QUEUE_CONFIG
> >>
> >>    returns the queue settings for a given "target" and "priority".
> >>
> >>  - H_INT_RESET
> >>
> >>    resets all of the partition's interrupt exploitation structures to
> >>    their initial state, losing all configuration set via the hcalls
> >>    H_INT_SET_SOURCE_CONFIG and H_INT_SET_QUEUE_CONFIG.
> >>
> >>  - H_INT_SYNC
> >>
> >>    issue a synchronisation on a source to make sure sure all
> >>    notifications have reached their queue.
> >>
> >> As for XICS, the XIVE interface for the guest is described in the
> >> device tree under the interrupt controller node. A couple of new
> >> properties are specific to XIVE :
> >>
> >>  - "reg"
> >>
> >>    contains the base address and size of the thread interrupt
> >>    managnement areas (TIMA) for the user level for the OS level. Only
> >>    the OS level is taken into account.
> >>
> >>  - "ibm,xive-eq-sizes"
> >>
> >>    the size of the event queues.
> >>
> >>  - "ibm,xive-lisn-ranges"
> >>
> >>    the interrupt numbers ranges assigned to the guest. These are
> >>    allocated using a simple bitmap.
> >>
> >> Tested with a QEMU XIVE model for pseries and with the Power
> >> hypervisor
> >>
> >> Signed-off-by: Cédric Le Goater <c...@kaod.org>
> > 
> > I don't know XIVE well enough to review in detail, but I've made some
> > comments based on general knowledge.
> 
> Thanks for taking a look.

np

[snip]
> >> +/* Cause IPI as setup by the interrupt controller (xics or xive) */
> >> +static void (*ic_cause_ipi)(int cpu);
> >> +
> >>  static void smp_pseries_cause_ipi(int cpu)
> >>  {
> >> -  /* POWER9 should not use this handler */
> >>    if (doorbell_try_core_ipi(cpu))
> >>            return;
> >>  
> >> -  icp_ops->cause_ipi(cpu);
> >> +  ic_cause_ipi(cpu);
> > 
> > Wouldn't it make more sense to change smp_ops->cause_ipi, rather than
> > having a double indirection through smp_ops, then the ic_cause_ipi
> > global?
> 
> we need to retain the original setting of smp_ops->cause_ipi 
> somewhere as it can be set from xive_smp_probe() to : 
> 
>       icp_ops->cause_ipi 
> 
> or from xics_smp_probe() to :
> 
>       xive_cause_ipi()
> 
> I am not sure we can do any better ? 

I don't see why not.  You basically have two bits of options xics vs
xive, and doorbell vs no doorbells.  At worst that gives you 4
different versions of ->cause_ipi, and you can work out the right one
in smp_probe().  If the number of combinations got too much larger you
might indeed need some more indirection.  But with 4 there's a little
code duplication, but small enough that I think it's preferable to an
extra global and an extra indirection.

Also, will POWER9 always have doorbells?  In which case you could
reduce it to 3 options.

[snip]
> >> +static void xive_spapr_update_pending(struct xive_cpu *xc)
> >> +{
> >> +  u8 nsr, cppr;
> >> +  u16 ack;
> >> +
> >> +  /* Perform the acknowledge hypervisor to register cycle */
> >> +  ack = be16_to_cpu(__raw_readw(xive_tima + TM_SPC_ACK_OS_REG));
> > 
> > Why do you need the raw_readw() + be16_to_cpu + mb, rather than one of
> > the higher level IO helpers?
> 
> This is one of the many ways to do MMIOs on the TIMA. This memory 
> region defines a set of offsets and sizes for which loads and 
> stores have different effects. 
> 
> See the arch/powerpc/include/asm/xive-regs.h file and 
> arch/powerpc/kvm/book3s_xive_template.c for some more usage.

Sure, much like any IO region.  My point is, why do you want this kind
of complex combo, rather than say an in_be16() or readw_be().

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

Attachment: signature.asc
Description: PGP signature

Reply via email to