Re: [PATCH] irqchip: Add support for IDT 79rc3243x interrupt controller

2021-04-20 Thread Marc Zyngier

On 2021-04-20 13:34, Thomas Bogendoerfer wrote:

IDT 79rc3243x SoCs have rather simple interrupt controllers connected
to the MIPS CPU interrupt lines. Each of them has room for up to
32 interrupts.

Signed-off-by: Thomas Bogendoerfer 


Is there a DT binding for this irqchip? The code looks fine, but
it'd be good if the binding was merged at the same time as the driver.

Thanks,

M.
--
Jazz is not dead. It just smells funny...


Re: [PATCH] ptp: Don't print an error if ptp_kvm is not supported

2021-04-20 Thread Marc Zyngier
On Tue, 20 Apr 2021 14:24:19 +0100, Jon Hunter wrote:
> Commit 300bb1fe7671 ("ptp: arm/arm64: Enable ptp_kvm for arm/arm64")
> enable ptp_kvm support for ARM platforms and for any ARM platform that
> does not support this, the following error message is displayed ...
> 
>  ERR KERN fail to initialize ptp_kvm
> 
> For platforms that do not support ptp_kvm this error is a bit misleading
> and so fix this by only printing this message if the error returned by
> kvm_arch_ptp_init() is not -EOPNOTSUPP. Note that -EOPNOTSUPP is only
> returned by ARM platforms today if ptp_kvm is not supported.

Applied to kvm-arm64/ptp, thanks!

[1/1] ptp: Don't print an error if ptp_kvm is not supported
  commit: a86ed2cfa13c5175eb082c50a644f6bf29ac65cc

Cheers,

M.
-- 
Without deviation from the norm, progress is not possible.




[PATCH] PCI: tegra: Restore MSI enable state on resume

2021-04-20 Thread Marc Zyngier
When going into suspend, the Tegra MSI controller loses its
state. Restore the MSI allocation on resume so that PCI devices
are usable again.

Reported-by: Jon Hunter 
Tested-by: Jon Hunter 
Fixes: 973a28677e39 ("PCI: tegra: Convert to MSI domains")
Signed-off-by: Marc Zyngier 
Cc: Lorenzo Pieralisi 
Cc: Bjorn Helgaas 
Cc: Thierry Reding 
---
 drivers/pci/controller/pci-tegra.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/controller/pci-tegra.c 
b/drivers/pci/controller/pci-tegra.c
index eaba7b2fab4a..507b23d43ad1 100644
--- a/drivers/pci/controller/pci-tegra.c
+++ b/drivers/pci/controller/pci-tegra.c
@@ -1802,13 +1802,19 @@ static void tegra_pcie_enable_msi(struct tegra_pcie 
*pcie)
 {
const struct tegra_pcie_soc *soc = pcie->soc;
struct tegra_msi *msi = >msi;
-   u32 reg;
+   u32 reg, msi_state[INT_PCI_MSI_NR / 32];
+   int i;
 
afi_writel(pcie, msi->phys >> soc->msi_base_shift, AFI_MSI_FPCI_BAR_ST);
afi_writel(pcie, msi->phys, AFI_MSI_AXI_BAR_ST);
/* this register is in 4K increments */
afi_writel(pcie, 1, AFI_MSI_BAR_SZ);
 
+   /* Restore the MSI allocation state */
+   bitmap_to_arr32(msi_state, msi->used, INT_PCI_MSI_NR);
+   for (i = 0; i < ARRAY_SIZE(msi_state); i++)
+   afi_writel(pcie, msi_state[i], AFI_MSI_EN_VEC(i));
+
/* and unmask the MSI interrupt */
reg = afi_readl(pcie, AFI_INTR_MASK);
reg |= AFI_INTR_MASK_MSI_MASK;
-- 
2.30.2



Re: [PATCH v10 5/7] PCI: mediatek-gen3: Add MSI support

2021-04-20 Thread Marc Zyngier
On Tue, 20 Apr 2021 10:44:02 +0100,
Pali Rohár  wrote:
> 
> Hello!
> 
> On Tuesday 20 April 2021 14:17:21 Jianjun Wang wrote:
> > +static void mtk_pcie_enable_msi(struct mtk_pcie_port *port)
> > +{
> > +   int i;
> > +   u32 val;
> > +
> > +   for (i = 0; i < PCIE_MSI_SET_NUM; i++) {
> > +   struct mtk_msi_set *msi_set = >msi_sets[i];
> > +
> > +   msi_set->base = port->base + PCIE_MSI_SET_BASE_REG +
> > +   i * PCIE_MSI_SET_OFFSET;
> > +   msi_set->msg_addr = port->reg_base + PCIE_MSI_SET_BASE_REG +
> > +   i * PCIE_MSI_SET_OFFSET;
> > +
> > +   /* Configure the MSI capture address */
> > +   writel_relaxed(lower_32_bits(msi_set->msg_addr), msi_set->base);
> > +   writel_relaxed(upper_32_bits(msi_set->msg_addr),
> > +  port->base + PCIE_MSI_SET_ADDR_HI_BASE +
> > +  i * PCIE_MSI_SET_ADDR_HI_OFFSET);
> 
> This looks like as setting MSI doorbell address to MSI doorbell address.
> 
> > +static void mtk_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
> > +{
> > +   struct mtk_msi_set *msi_set = irq_data_get_irq_chip_data(data);
> > +   struct mtk_pcie_port *port = data->domain->host_data;
> > +   unsigned long hwirq;
> > +
> > +   hwirq = data->hwirq % PCIE_MSI_IRQS_PER_SET;
> > +
> > +   msg->address_hi = upper_32_bits(msi_set->msg_addr);
> > +   msg->address_lo = lower_32_bits(msi_set->msg_addr);
> > +   msg->data = hwirq;
> > +   dev_dbg(port->dev, "msi#%#lx address_hi %#x address_lo %#x data %d\n",
> > +   hwirq, msg->address_hi, msg->address_lo, msg->data);
> 
> ... which is later used in compose_msi_msg().
> 
> Marc in some other patches for other pci controller drivers changed this
> address to just main "port" structure. It simplified implementations and
> also avoided need to declare additional member "msg_addr".
> 
> Marc, would it be possible to simplify it also for this driver and just
> set msg_addr to virt_to_phys(port)?

Maybe. It really depends on what range the HW accepts, and the sole
requirement is to use an address that the endpoint cannot DMA
to. Here, the driver seems to be using something based on the port
base address, which is good enough as far as I am concerned (the thing
I usually object to is the allocation of memory just for the sake of
getting a capture address).

If you want to further simplify it, you could simply use port.reg_base
as the MSI address for all sets, as I don't think they have to be
distinct. But someone with access to the TRM for this should go and
check it.

I don't believe this should gate the merging od this driver though.

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH] genirq/irqaction: Move dev_id and percpu_dev_id in a union

2021-04-20 Thread Marc Zyngier
On Mon, 19 Apr 2021 00:44:14 +0100,
Thomas Gleixner  wrote:
> 
> On Sat, Apr 10 2021 at 13:00, Marc Zyngier wrote:
> > dev_id and percpu_dev_id are mutually exclusive in struct irqaction,
> > as they conceptually represent the same thing, only in a per-cpu
> > fashion.
> >
> > Move them into an anonymous union, saving a few bytes on the way.
> 
> The reason why they are not in an anomymous union is that any misuse of
> interfaces will result in an instantaneous explosion while with your
> variant it will cause hard to diagnose side effects.
> 
> I rather waste the extra 4/8 bytes unless there is a compelling reason
> not to do so.

Fair enough. I just happened to spot it while doing some vaguely
related rework (turning the irqaction.irq field into an irqdesc).

Happy to leave it as is.

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH v2] irqchip/xilinx: Expose Kconfig option for Zynq/ZynqMP

2021-04-20 Thread Marc Zyngier
On Mon, 19 Apr 2021 20:42:45 +0100,
Robert Hancock  wrote:
> 
> Previously the XILINX_INTC config option was hidden and only
> auto-selected on the MicroBlaze platform. However, this IP can also be
> used on the Zynq and ZynqMP platforms as a secondary cascaded
> controller. Allow this option to be user-enabled on those platforms.
> 
> Signed-off-by: Robert Hancock 
> ---
>  drivers/irqchip/Kconfig | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> index 15536e321df5..1020cc5a7800 100644
> --- a/drivers/irqchip/Kconfig
> +++ b/drivers/irqchip/Kconfig
> @@ -279,8 +279,13 @@ config XTENSA_MX
>   select GENERIC_IRQ_EFFECTIVE_AFF_MASK
>  
>  config XILINX_INTC
> - bool
> + bool "Xilinx Interrupt Controller IP"
> + depends on MICROBLAZE || ARCH_ZYNQ || ARCH_ZYNQMP || COMPILE_TEST

Please drop COMPILE_TEST. As reported by the kernel test bot, there
are architectures it can't be compiled on.

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH v19 6/7] KVM: arm64: Add support for the KVM PTP service

2021-04-17 Thread Marc Zyngier
On Sat, 17 Apr 2021 09:59:39 +0100,
Zenghui Yu  wrote:
> 
> On 2021/3/30 22:54, Marc Zyngier wrote:
> > +PTP_KVM support for arm/arm64
> > +=
> > +
> > +PTP_KVM is used for high precision time sync between host and guests.
> > +It relies on transferring the wall clock and counter value from the
> > +host to the guest using a KVM-specific hypercall.
> > +
> > +* ARM_SMCCC_HYP_KVM_PTP_FUNC_ID: 0x8601
> 
> Per include/linux/arm-smccc.h, this should be
> ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID.

Well spotted. Care to send a patch on top of my kvm-arm64/ptp branch?

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH v19 7/7] ptp: arm/arm64: Enable ptp_kvm for arm/arm64

2021-04-17 Thread Marc Zyngier
On Sat, 17 Apr 2021 09:42:37 +0100,
Zenghui Yu  wrote:
> 
> On 2021/3/30 22:54, Marc Zyngier wrote:
> > +int kvm_arch_ptp_init(void)
> > +{
> > +   int ret;
> > +
> > +   ret = kvm_arm_hyp_service_available(ARM_SMCCC_KVM_FUNC_PTP);
> > +   if (ret <= 0)
> 
> kvm_arm_hyp_service_available() returns boolean. Maybe write as ?
> 
>   bool ret;
> 
>   ret = kvm_arm_hyp_service_available();
>   if (!ret)
>   return -ENODEV;

Fixed in 300bb1fe7671, as previously reported by Dan Carpenter in [1].

Thanks,

M.

https://lore.kernel.org/r/20210331043704.GG2065@kadam

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH] irqchip/xilinx: Expose Kconfig option

2021-04-16 Thread Marc Zyngier
On Fri, 16 Apr 2021 17:05:49 +0100,
Robert Hancock  wrote:
> 
> On Fri, 2021-04-16 at 14:41 +0100, Marc Zyngier wrote:
> > On Fri, 16 Apr 2021 00:32:50 +0100,
> > Robert Hancock  wrote:
> > > Previously the XILINX_INTC config option was hidden and only
> > > auto-selected on the MicroBlaze platform. However, this IP can also be
> > > used on other platforms. Allow this option to be user-enabled.
> > > 
> > > Signed-off-by: Robert Hancock 
> > 
> > I don't think this is a good idea. In general, people have no idea
> > which interrupt controller they need to select. So you either end-up
> > with a missing interrupt controller, or a bunch you really don't need.
> > 
> > This is essentially a platform constraint, and this should directly be
> > selected by the platform if you have this IP in your system.
> > 
> > Thanks,
> > 
> > M.
> 
> The problem is essentially that at the platform level, we don't know, other
> than in the MicroBlaze case where we know it will be used as the platform's
> primary interrupt controller. In our case, we are using this IP core on the
> ZynqMP platform as a secondary cascaded interrupt controller in the FPGA
> portion of the device.
> But many ZynqMP configurations wouldn't have this device present, it
> depends on what the user instantiates in the programmable logic.
> Also, this core could just as easily be instantiated in standalone
> Xilinx FPGAs which could be connected to many different platforms
> over a PCIe, AXI, etc.  bus.

Not compiling it for some users is great if you happen to *know* what
you have to select, which is probably a single digit percentage of the
people that build their own kernel. At least having it to depend on
ZYNQMP (or some other FPGA platform) would narrow it down.

And if you have some other HW in your FPGA, you can make the config
fragment for this HW select the right interrupt controller. But I'm
definitely not keen on making this a universally user-selectable
driver.

> So I don't think having this as a platform constraint makes sense.

I don't think imposing this on *everyone*, across all supported
architectures and platforms makes any sense. Surely, people who build
their own HW (because that's what we are talking about here) can be
bothered to add the small Kconfig fragment that is required to their
kernel build.

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [RFC] ITS fails to allocate on rk3568/rk3566

2021-04-16 Thread Marc Zyngier
On Fri, 16 Apr 2021 02:13:38 +0100,
Kever Yang  wrote:
> 
> Hi Marc,
> 
> On 2021/4/15 下午4:11, Marc Zyngier wrote:
> > Hi Kever,
> > 
> > On Thu, 15 Apr 2021 08:24:33 +0100,
> > Kever Yang  wrote:
> >> Hi Marc, Peter,
> >> 
> >>      RK356x GIC has two issues:
> >> 
> >> 1. GIC only support 32bit address while rk356x supports 8GB DDR SDRAM,
> >> so we use ZONE_DMA32 to fix this issue;
> > What transactions does this affect exactly?
> The GIC on rk356x is a 32bit master, which means all the space its
> logic need to access should be in the 4GB range.

Well, at least this is consistently broken.

> > Only some ITS tables? Or
> > all of them, including the command queue? What about the configuration
> > and pending tables associated with the redistributors?
> > 
> >> 2. GIC version is r1p6-00rel0, RK356x interconnect does not support
> >> GIC and CPU snoop to each other, hence the GIC does not support the
> >> shareability feature.  The read of register value for shareability
> >> feature does not return as expect in GICR and GITS, so we have to
> >> workaround for it.
> > How about the cacheability attribute? Can you please provide the exact
> > set of attributes that this system actually supports for each of the
> > ITS and redistributor base registers?
> 
> The shareability attributes in GICR_PENDBASEER, GICR_PROPBASER,
> GITS_BASERn, GITS_CBASER default value is 0b00, when we set 0b01 then
> read returns 0b01.

And I claim that this is a perfectly broken behaviour. How do you
expect software to find about the gory details of the integration?
That's the only way for SW to find out what the HW is capable of...

> Since there is no ACE coherency interface for this GIC controller, all
> the cacheability in the GIC is not support in hardware.
> 
> > 
> > Also, please provide errata numbers for these two issues so that we
> > can properly document them and track the workarounds.
> 
> What kind of errata do you need, could you please share any kind of
> example close to this case?

I would like something that says:

"ROCKCHIP_ERRATUM_123456: The GIC600 integration in RK356x doesn't
 support any of the shareability or cacheability attributes, and
 requires both values to be set to 0b00 for all the ITS and
 Redistributor tables."

This is pretty similar to the bug affecting ThunderX with its "erratum
24313" (covered by CONFIG_CAVIUM_ERRATUM_22375), where the tables have
to be flagged as non-cacheable. The Rockchip one is just worse.

We need an official erratum number so that we can refer to it in the
source, commit log and documentation, as well as cross-reference it
with the TRM. This number will be part of a configuration symbol that
will make the compilation conditional so that people don't have to
carry the extra burden generated by this bug if they don't need to.

Same thing goes for the 32bit bug.

> 
> We consider this as a SoC implement design instead of a bug, so we
> will add document in RK356X  TRM to describe the GIC design, but no
> idea how to provide the errata.
> 
> Here is the shareabily attribute from ARM GIC architecture specification:
> Shareability, bits [11:10] (from GITS_CBASER)
> Indicates the Shareability attributes of accesses to the command
> queue. The possible values of this field are:
> 0b00 Non-shareable.
> 0b01 Inner Shareable.
> 0b10 Outer Shareable.
> 0b11 Reserved. Treated as 0b00.
> It is IMPLEMENTATION DEFINED whether this field has a fixed value or
> can be programmed by software. Implementing this field with a fixed
> value is deprecated.
> On a Warm reset, this field resets to an architecturally UNKNOWN value
> 
> As you can see, "Implementing this field with a fixed value is
> deprecated", so software should program this field to '0b00
> Non-shareable' if the SoC design does not support the cache
> shareability.

[I really feel special when people quote the GIC spec at me]

That isn't what it says. Hardcoding the field with a fixed value is
indeed deprecated, but that doesn't mean this field should accept
values that the HW cannot support. If anything, what this says is "try
and implement the options that SW is going to use".

But you need to give SW an indication of what is usable, because there
is no other way to *discover* what the SoC is capable of at runtime.
Otherwise, we would need to carry a per-SoC list of what the HW
supports. I don't think that's the right thing to do (and you're about
8 years too late anyway).

Other GIC600 integrations got it perfectly right, by the way. Same for
other GIC implementations, with the notable exception of Cavium and
their first GIC in ThunderX, as described above.

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH v4 2/2] kvm/arm64: Try stage2 block mapping for host device MMIO

2021-04-16 Thread Marc Zyngier
On Thu, 15 Apr 2021 15:08:09 +0100,
Keqian Zhu  wrote:
> 
> Hi Marc,
> 
> On 2021/4/15 22:03, Keqian Zhu wrote:
> > The MMIO region of a device maybe huge (GB level), try to use
> > block mapping in stage2 to speedup both map and unmap.
> > 
> > Compared to normal memory mapping, we should consider two more
> > points when try block mapping for MMIO region:
> > 
> > 1. For normal memory mapping, the PA(host physical address) and
> > HVA have same alignment within PUD_SIZE or PMD_SIZE when we use
> > the HVA to request hugepage, so we don't need to consider PA
> > alignment when verifing block mapping. But for device memory
> > mapping, the PA and HVA may have different alignment.
> > 
> > 2. For normal memory mapping, we are sure hugepage size properly
> > fit into vma, so we don't check whether the mapping size exceeds
> > the boundary of vma. But for device memory mapping, we should pay
> > attention to this.
> > 
> > This adds get_vma_page_shift() to get page shift for both normal
> > memory and device MMIO region, and check these two points when
> > selecting block mapping size for MMIO region.
> > 
> > Signed-off-by: Keqian Zhu 
> > ---
> >  arch/arm64/kvm/mmu.c | 61 
> >  1 file changed, 51 insertions(+), 10 deletions(-)
> > 
> > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> > index c59af5ca01b0..5a1cc7751e6d 100644
> > --- a/arch/arm64/kvm/mmu.c
> > +++ b/arch/arm64/kvm/mmu.c
> > @@ -738,6 +738,35 @@ transparent_hugepage_adjust(struct kvm_memory_slot 
> > *memslot,
> > return PAGE_SIZE;
> >  }
> >  
> > +static int get_vma_page_shift(struct vm_area_struct *vma, unsigned long 
> > hva)
> > +{
> > +   unsigned long pa;
> > +
> > +   if (is_vm_hugetlb_page(vma) && !(vma->vm_flags & VM_PFNMAP))
> > +   return huge_page_shift(hstate_vma(vma));
> > +
> > +   if (!(vma->vm_flags & VM_PFNMAP))
> > +   return PAGE_SHIFT;
> > +
> > +   VM_BUG_ON(is_vm_hugetlb_page(vma));
> > +
> > +   pa = (vma->vm_pgoff << PAGE_SHIFT) + (hva - vma->vm_start);
> > +
> > +#ifndef __PAGETABLE_PMD_FOLDED
> > +   if ((hva & (PUD_SIZE - 1)) == (pa & (PUD_SIZE - 1)) &&
> > +   ALIGN_DOWN(hva, PUD_SIZE) >= vma->vm_start &&
> > +   ALIGN(hva, PUD_SIZE) <= vma->vm_end)
> > +   return PUD_SHIFT;
> > +#endif
> > +
> > +   if ((hva & (PMD_SIZE - 1)) == (pa & (PMD_SIZE - 1)) &&
> > +   ALIGN_DOWN(hva, PMD_SIZE) >= vma->vm_start &&
> > +   ALIGN(hva, PMD_SIZE) <= vma->vm_end)
> > +   return PMD_SHIFT;
> > +
> > +   return PAGE_SHIFT;
> > +}
> > +
> >  static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >   struct kvm_memory_slot *memslot, unsigned long hva,
> >   unsigned long fault_status)
> > @@ -769,7 +798,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
> > phys_addr_t fault_ipa,
> > return -EFAULT;
> > }
> >  
> > -   /* Let's check if we will get back a huge page backed by hugetlbfs */
> > +   /*
> > +* Let's check if we will get back a huge page backed by hugetlbfs, or
> > +* get block mapping for device MMIO region.
> > +*/
> > mmap_read_lock(current->mm);
> > vma = find_vma_intersection(current->mm, hva, hva + 1);
> > if (unlikely(!vma)) {
> > @@ -778,15 +810,15 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
> > phys_addr_t fault_ipa,
> > return -EFAULT;
> > }
> >  
> > -   if (is_vm_hugetlb_page(vma))
> > -   vma_shift = huge_page_shift(hstate_vma(vma));
> > -   else
> > -   vma_shift = PAGE_SHIFT;
> > -
> > -   if (logging_active ||
> > -   (vma->vm_flags & VM_PFNMAP)) {
> > +   /*
> > +* logging_active is guaranteed to never be true for VM_PFNMAP
> > +* memslots.
> > +*/
> > +   if (logging_active) {
> > force_pte = true;
> > vma_shift = PAGE_SHIFT;
> > +   } else {
> > +   vma_shift = get_vma_page_shift(vma, hva);
> > }
> I use a if/else manner in v4, please check that. Thanks very much!

That's fine. However, it is getting a bit late for 5.13, and we don't
have much time to left it simmer in -next. I'll probably wait until
after the merge window to pick it up.

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [RFC PATCH] irqchip/gic-v3: Do not enable irqs when handling spurious interrups

2021-04-16 Thread Marc Zyngier
[+ Mark]

On Fri, 16 Apr 2021 07:22:17 +0100,
He Ying  wrote:
> 
> We found this problem in our kernel src tree:
> 
> [   14.816231] [ cut here ]
> [   14.816231] kernel BUG at irq.c:99!
> [   14.816232] Internal error: Oops - BUG: 0 [#1] SMP
> [   14.816232] Process swapper/0 (pid: 0, stack limit = 0x(ptrval))
> [   14.816233] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G   O  
> 4.19.95-1.h1.AOS2.0.aarch64 #14
> [   14.816233] Hardware name: evb (DT)
> [   14.816234] pstate: 80400085 (Nzcv daIf +PAN -UAO)
> [   14.816234] pc : asm_nmi_enter+0x94/0x98
> [   14.816235] lr : asm_nmi_enter+0x18/0x98
> [   14.816235] sp : 08003c50
> [   14.816235] pmr_save: 0070
> [   14.816237] x29: 08003c50 x28: 095f56c0
> [   14.816238] x27:  x26: 08004000
> [   14.816239] x25: 015e x24: 8008fb916000
> [   14.816240] x23: 2045 x22: 080817cc
> [   14.816241] x21: 08003da0 x20: 0060
> [   14.816242] x19: 03ff x18: 
> [   14.816243] x17: 0008 x16: 003d0900
> [   14.816244] x15: 095ea6c8 x14: 8008fff5ab40
> [   14.816244] x13: 8008fff58b9d x12: 
> [   14.816245] x11: 08c8a200 x10: 8e31fca5
> [   14.816246] x9 : 08c8a208 x8 : 000f
> [   14.816247] x7 : 0004 x6 : 8008fff58b9e
> [   14.816248] x5 :  x4 : 8000
> [   14.816249] x3 :  x2 : 8000
> [   14.816250] x1 : 0012 x0 : 095f56c0
> [   14.816251] Call trace:
> [   14.816251]  asm_nmi_enter+0x94/0x98
> [   14.816251]  el1_irq+0x8c/0x180
> [   14.816252]  gic_handle_irq+0xbc/0x2e4
> [   14.816252]  el1_irq+0xcc/0x180
> [   14.816253]  arch_timer_handler_virt+0x38/0x58
> [   14.816253]  handle_percpu_devid_irq+0x90/0x240
> [   14.816253]  generic_handle_irq+0x34/0x50
> [   14.816254]  __handle_domain_irq+0x68/0xc0
> [   14.816254]  gic_handle_irq+0xf8/0x2e4
> [   14.816255]  el1_irq+0xcc/0x180
> [   14.816255]  arch_cpu_idle+0x34/0x1c8
> [   14.816255]  default_idle_call+0x24/0x44
> [   14.816256]  do_idle+0x1d0/0x2c8
> [   14.816256]  cpu_startup_entry+0x28/0x30
> [   14.816256]  rest_init+0xb8/0xc8
> [   14.816257]  start_kernel+0x4c8/0x4f4
> [   14.816257] Code: 940587f1 d5384100 b9401001 36a7fd01 (d421)
> [   14.816258] Modules linked in: start_dp(O) smeth(O)
> [   15.103092] ---[ end trace 701753956cb14aa8 ]---
> [   15.103093] Kernel panic - not syncing: Fatal exception in interrupt
> [   15.103099] SMP: stopping secondary CPUs
> [   15.103100] Kernel Offset: disabled
> [   15.103100] CPU features: 0x36,a2400218
> [   15.103100] Memory Limit: none

Urgh...

> Our kernel src tree is based on 4.19.95 and backports arm64 pseudo-NMI
> patches but doesn't support nested NMI. Its top relative commit is
> commit 17ce302f3117 ("arm64: Fix interrupt tracing in the presence of NMIs").

Can you please reproduce it with mainline and without any backport?
It is hard to reason about something that isn't a vanilla kernel.

> I look into this issue and find that it's caused by 'BUG_ON(in_nmi())'
> in nmi_enter(). From the call trace, we find two 'el1_irqs' which
> means an interrupt preempts the other one and the new one is an NMI.
> Furthermore, by adding some prints, we find the first irq also calls
> nmi_enter(), but its priority is not GICD_INT_NMI_PRI and its irq number
> is 1023. It enables irq by calling gic_arch_enable_irqs() in
> gic_handle_irq(). At this moment, the second irq preempts the first irq
> and it's an NMI but current context is already in nmi. So that may be
> the problem.

I'm not sure I get it. From the stack trace, I see this:

[   14.816251]  asm_nmi_enter+0x94/0x98
[   14.816251]  el1_irq+0x8c/0x180  (C)
[   14.816252]  gic_handle_irq+0xbc/0x2e4
[   14.816252]  el1_irq+0xcc/0x180  (B)
[   14.816253]  arch_timer_handler_virt+0x38/0x58
[   14.816253]  handle_percpu_devid_irq+0x90/0x240
[   14.816253]  generic_handle_irq+0x34/0x50
[   14.816254]  __handle_domain_irq+0x68/0xc0
[   14.816254]  gic_handle_irq+0xf8/0x2e4
[   14.816255]  el1_irq+0xcc/0x180  (A)

which indicates that we preempted a timer interrupt (A) with another
IRQ (B), itself immediately preempted by another IRQ (C)? That's
indeed at least one too many.

Can you please describe for each of (A), (B) and (C) whether they are
spurious or not, what their priorities are if they aren't spurious?

> In my opinion, when handling spurious interrupts, we shouldn't enable irqs.
> My reason is that for spurious interrupts we may enter nmi context in
> el1_irq() because current PMR may be GIC_PRIO_IRQOFF. If we enable irqs
> at this time, another NMI may happen and preempt this spurious interrupt
> but the context is already in nmi. That causes a bug on if nested NMI is
> not supported. Even for nested 

Re: [PATCH] irqchip/xilinx: Expose Kconfig option

2021-04-16 Thread Marc Zyngier
On Fri, 16 Apr 2021 00:32:50 +0100,
Robert Hancock  wrote:
> 
> Previously the XILINX_INTC config option was hidden and only
> auto-selected on the MicroBlaze platform. However, this IP can also be
> used on other platforms. Allow this option to be user-enabled.
> 
> Signed-off-by: Robert Hancock 

I don't think this is a good idea. In general, people have no idea
which interrupt controller they need to select. So you either end-up
with a missing interrupt controller, or a bunch you really don't need.

This is essentially a platform constraint, and this should directly be
selected by the platform if you have this IP in your system.

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH v3 2/2] kvm/arm64: Try stage2 block mapping for host device MMIO

2021-04-15 Thread Marc Zyngier
On Thu, 15 Apr 2021 12:26:26 +0100,
Keqian Zhu  wrote:
> 
> Hi Marc,
> 
> On 2021/4/15 18:23, Marc Zyngier wrote:
> > On Thu, 15 Apr 2021 03:20:52 +0100,
> > Keqian Zhu  wrote:
> >>
> >> Hi Marc,
> >>
> >> On 2021/4/14 17:05, Marc Zyngier wrote:
> >>> + Santosh, who found some interesting bugs in that area before.
> >>>
> >>> On Wed, 14 Apr 2021 07:51:09 +0100,
> >>> Keqian Zhu  wrote:
> >>>>
> >>>> The MMIO region of a device maybe huge (GB level), try to use
> >>>> block mapping in stage2 to speedup both map and unmap.
> >>>>
> >>>> Compared to normal memory mapping, we should consider two more
> >>>> points when try block mapping for MMIO region:
> >>>>
> >>>> 1. For normal memory mapping, the PA(host physical address) and
> >>>> HVA have same alignment within PUD_SIZE or PMD_SIZE when we use
> >>>> the HVA to request hugepage, so we don't need to consider PA
> >>>> alignment when verifing block mapping. But for device memory
> >>>> mapping, the PA and HVA may have different alignment.
> >>>>
> >>>> 2. For normal memory mapping, we are sure hugepage size properly
> >>>> fit into vma, so we don't check whether the mapping size exceeds
> >>>> the boundary of vma. But for device memory mapping, we should pay
> >>>> attention to this.
> >>>>
> >>>> This adds device_rough_page_shift() to check these two points when
> >>>> selecting block mapping size.
> >>>>
> >>>> Signed-off-by: Keqian Zhu 
> >>>> ---
> >>>>  arch/arm64/kvm/mmu.c | 37 +
> >>>>  1 file changed, 33 insertions(+), 4 deletions(-)
> >>>>
> >>>> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> >>>> index c59af5ca01b0..1a6d96169d60 100644
> >>>> --- a/arch/arm64/kvm/mmu.c
> >>>> +++ b/arch/arm64/kvm/mmu.c
> >>>> @@ -624,6 +624,31 @@ static void kvm_send_hwpoison_signal(unsigned long 
> >>>> address, short lsb)
> >>>>  send_sig_mceerr(BUS_MCEERR_AR, (void __user *)address, lsb, 
> >>>> current);
> >>>>  }
> >>>>  
> >>>> +/*
> >>>> + * Find a max mapping size that properly insides the vma. And hva and 
> >>>> pa must
> >>>> + * have the same alignment to this mapping size. It's rough as there 
> >>>> are still
> >>>> + * other restrictions, will be checked by 
> >>>> fault_supports_stage2_huge_mapping().
> >>>> + */
> >>>> +static short device_rough_page_shift(struct vm_area_struct *vma,
> >>>> + unsigned long hva)
> >>>
> >>> My earlier question still stands. Under which circumstances would this
> >>> function return something that is *not* the final mapping size? I
> >>> really don't see a reason why this would not return the final mapping
> >>> size.
> >>
> >> IIUC, all the restrictions are about alignment and area boundary.
> >>
> >> That's to say, HVA, IPA and PA must have same alignment within the
> >> mapping size.  And the areas are memslot and vma, which means the
> >> mapping size must properly fit into the memslot and vma.
> >>
> >> In this function, we just checked the alignment of HVA and PA, and
> >> the boundary of vma.  So we still need to check the alignment of HVA
> >> and IPA, and the boundary of memslot.  These will be checked by
> >> fault_supports_stage2_huge_mapping().
> > 
> > But that's no different from what we do with normal memory, is it? So
> > it really feels like we should have *one* function that deals with
> > establishing the basic mapping size from the VMA (see below for what I
> > have in mind).
> Right. And it looks better.
> 
> > 
> >>
> >>>
> >>>> +{
> >>>> +phys_addr_t pa = (vma->vm_pgoff << PAGE_SHIFT) + (hva - 
> >>>> vma->vm_start);
> >>>> +
> >>>> +#ifndef __PAGETABLE_PMD_FOLDED
> >>>> +if ((hva & (PUD_SIZE - 1)) == (pa & (PUD_SIZE - 1)) &&
> >>>> +ALIGN_DOWN(hva, PUD_SIZE) >= vma->vm_start &&
> >>>> + 

Re: [PATCH 3/5] s390: Get rid of oprofile leftovers

2021-04-15 Thread Marc Zyngier
On Thu, 15 Apr 2021 11:38:52 +0100,
Heiko Carstens  wrote:
> 
> On Wed, Apr 14, 2021 at 02:44:07PM +0100, Marc Zyngier wrote:
> > perf_pmu_name() and perf_num_counters() are unused. Drop them.
> > 
> > Signed-off-by: Marc Zyngier 
> > ---
> >  arch/s390/kernel/perf_event.c | 21 -
> >  1 file changed, 21 deletions(-)
> 
> Acked-by: Heiko Carstens 
> 
> ...or do you want me to pick this up and route via the s390 tree(?).

Either way work for me, but I just want to make sure the last patch
doesn't get applied before the previous ones.

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH v3 2/2] kvm/arm64: Try stage2 block mapping for host device MMIO

2021-04-15 Thread Marc Zyngier
On Thu, 15 Apr 2021 03:20:52 +0100,
Keqian Zhu  wrote:
> 
> Hi Marc,
> 
> On 2021/4/14 17:05, Marc Zyngier wrote:
> > + Santosh, who found some interesting bugs in that area before.
> > 
> > On Wed, 14 Apr 2021 07:51:09 +0100,
> > Keqian Zhu  wrote:
> >>
> >> The MMIO region of a device maybe huge (GB level), try to use
> >> block mapping in stage2 to speedup both map and unmap.
> >>
> >> Compared to normal memory mapping, we should consider two more
> >> points when try block mapping for MMIO region:
> >>
> >> 1. For normal memory mapping, the PA(host physical address) and
> >> HVA have same alignment within PUD_SIZE or PMD_SIZE when we use
> >> the HVA to request hugepage, so we don't need to consider PA
> >> alignment when verifing block mapping. But for device memory
> >> mapping, the PA and HVA may have different alignment.
> >>
> >> 2. For normal memory mapping, we are sure hugepage size properly
> >> fit into vma, so we don't check whether the mapping size exceeds
> >> the boundary of vma. But for device memory mapping, we should pay
> >> attention to this.
> >>
> >> This adds device_rough_page_shift() to check these two points when
> >> selecting block mapping size.
> >>
> >> Signed-off-by: Keqian Zhu 
> >> ---
> >>  arch/arm64/kvm/mmu.c | 37 +
> >>  1 file changed, 33 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> >> index c59af5ca01b0..1a6d96169d60 100644
> >> --- a/arch/arm64/kvm/mmu.c
> >> +++ b/arch/arm64/kvm/mmu.c
> >> @@ -624,6 +624,31 @@ static void kvm_send_hwpoison_signal(unsigned long 
> >> address, short lsb)
> >>send_sig_mceerr(BUS_MCEERR_AR, (void __user *)address, lsb, current);
> >>  }
> >>  
> >> +/*
> >> + * Find a max mapping size that properly insides the vma. And hva and pa 
> >> must
> >> + * have the same alignment to this mapping size. It's rough as there are 
> >> still
> >> + * other restrictions, will be checked by 
> >> fault_supports_stage2_huge_mapping().
> >> + */
> >> +static short device_rough_page_shift(struct vm_area_struct *vma,
> >> +   unsigned long hva)
> > 
> > My earlier question still stands. Under which circumstances would this
> > function return something that is *not* the final mapping size? I
> > really don't see a reason why this would not return the final mapping
> > size.
> 
> IIUC, all the restrictions are about alignment and area boundary.
> 
> That's to say, HVA, IPA and PA must have same alignment within the
> mapping size.  And the areas are memslot and vma, which means the
> mapping size must properly fit into the memslot and vma.
> 
> In this function, we just checked the alignment of HVA and PA, and
> the boundary of vma.  So we still need to check the alignment of HVA
> and IPA, and the boundary of memslot.  These will be checked by
> fault_supports_stage2_huge_mapping().

But that's no different from what we do with normal memory, is it? So
it really feels like we should have *one* function that deals with
establishing the basic mapping size from the VMA (see below for what I
have in mind).

> 
> > 
> >> +{
> >> +  phys_addr_t pa = (vma->vm_pgoff << PAGE_SHIFT) + (hva - vma->vm_start);
> >> +
> >> +#ifndef __PAGETABLE_PMD_FOLDED
> >> +  if ((hva & (PUD_SIZE - 1)) == (pa & (PUD_SIZE - 1)) &&
> >> +  ALIGN_DOWN(hva, PUD_SIZE) >= vma->vm_start &&
> >> +  ALIGN(hva, PUD_SIZE) <= vma->vm_end)
> >> +  return PUD_SHIFT;
> >> +#endif
> >> +
> >> +  if ((hva & (PMD_SIZE - 1)) == (pa & (PMD_SIZE - 1)) &&
> >> +  ALIGN_DOWN(hva, PMD_SIZE) >= vma->vm_start &&
> >> +  ALIGN(hva, PMD_SIZE) <= vma->vm_end)
> >> +  return PMD_SHIFT;
> >> +
> >> +  return PAGE_SHIFT;
> >> +}
> >> +
> >>  static bool fault_supports_stage2_huge_mapping(struct kvm_memory_slot 
> >> *memslot,
> >>   unsigned long hva,
> >>   unsigned long map_size)
> >> @@ -769,7 +794,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
> >> phys_addr_t fault_ipa,
> >>return -EFAULT;
> >>}
> >>  
> >> -  /* Let'

Re: [RFC] ITS fails to allocate on rk3568/rk3566

2021-04-15 Thread Marc Zyngier
Hi Kever,

On Thu, 15 Apr 2021 08:24:33 +0100,
Kever Yang  wrote:
> 
> Hi Marc, Peter,
> 
>     RK356x GIC has two issues:
> 
> 1. GIC only support 32bit address while rk356x supports 8GB DDR SDRAM,
> so we use ZONE_DMA32 to fix this issue;

What transactions does this affect exactly? Only some ITS tables? Or
all of them, including the command queue? What about the configuration
and pending tables associated with the redistributors?

> 2. GIC version is r1p6-00rel0, RK356x interconnect does not support
> GIC and CPU snoop to each other, hence the GIC does not support the
> shareability feature.  The read of register value for shareability 
> feature does not return as expect in GICR and GITS, so we have to
> workaround for it.

How about the cacheability attribute? Can you please provide the exact
set of attributes that this system actually supports for each of the
ITS and redistributor base registers?

Also, please provide errata numbers for these two issues so that we
can properly document them and track the workarounds.

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.


[PATCH 5/5] perf: Get rid of oprofile leftovers

2021-04-14 Thread Marc Zyngier
perf_pmu_name() and perf_num_counters() are unused. Drop them.

Signed-off-by: Marc Zyngier 
---
 include/linux/perf_event.h | 2 --
 kernel/events/core.c   | 5 -
 2 files changed, 7 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 3f7f89ea5e51..51154ed9a206 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -951,8 +951,6 @@ extern void perf_event_itrace_started(struct perf_event 
*event);
 extern int perf_pmu_register(struct pmu *pmu, const char *name, int type);
 extern void perf_pmu_unregister(struct pmu *pmu);
 
-extern int perf_num_counters(void);
-extern const char *perf_pmu_name(void);
 extern void __perf_event_task_sched_in(struct task_struct *prev,
   struct task_struct *task);
 extern void __perf_event_task_sched_out(struct task_struct *prev,
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 03db40f6cba9..88cb0ba5690b 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -580,11 +580,6 @@ static u64 perf_event_time(struct perf_event *event);
 
 void __weak perf_event_print_debug(void)   { }
 
-extern __weak const char *perf_pmu_name(void)
-{
-   return "pmu";
-}
-
 static inline u64 perf_clock(void)
 {
return local_clock();
-- 
2.29.2



[PATCH 2/5] arm64: Get rid of oprofile leftovers

2021-04-14 Thread Marc Zyngier
perf_pmu_name() and perf_num_counters() are now unused. Drop them.

Signed-off-by: Marc Zyngier 
---
 drivers/perf/arm_pmu.c | 30 --
 1 file changed, 30 deletions(-)

diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 2d10d84fb79c..d4f7f1f9cc77 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -581,33 +581,6 @@ static const struct attribute_group 
armpmu_common_attr_group = {
.attrs = armpmu_common_attrs,
 };
 
-/* Set at runtime when we know what CPU type we are. */
-static struct arm_pmu *__oprofile_cpu_pmu;
-
-/*
- * Despite the names, these two functions are CPU-specific and are used
- * by the OProfile/perf code.
- */
-const char *perf_pmu_name(void)
-{
-   if (!__oprofile_cpu_pmu)
-   return NULL;
-
-   return __oprofile_cpu_pmu->name;
-}
-EXPORT_SYMBOL_GPL(perf_pmu_name);
-
-int perf_num_counters(void)
-{
-   int max_events = 0;
-
-   if (__oprofile_cpu_pmu != NULL)
-   max_events = __oprofile_cpu_pmu->num_events;
-
-   return max_events;
-}
-EXPORT_SYMBOL_GPL(perf_num_counters);
-
 static int armpmu_count_irq_users(const int irq)
 {
int cpu, count = 0;
@@ -979,9 +952,6 @@ int armpmu_register(struct arm_pmu *pmu)
if (ret)
goto out_destroy;
 
-   if (!__oprofile_cpu_pmu)
-   __oprofile_cpu_pmu = pmu;
-
pr_info("enabled with %s PMU driver, %d counters available%s\n",
pmu->name, pmu->num_events,
has_nmi ? ", using NMIs" : "");
-- 
2.29.2



[PATCH 4/5] sh: Get rid of oprofile leftovers

2021-04-14 Thread Marc Zyngier
perf_pmu_name() and perf_num_counters() are unused. Drop them.

Signed-off-by: Marc Zyngier 
---
 arch/sh/kernel/perf_event.c | 18 --
 1 file changed, 18 deletions(-)

diff --git a/arch/sh/kernel/perf_event.c b/arch/sh/kernel/perf_event.c
index 445e3ece4c23..1d2507f22437 100644
--- a/arch/sh/kernel/perf_event.c
+++ b/arch/sh/kernel/perf_event.c
@@ -57,24 +57,6 @@ static inline int sh_pmu_initialized(void)
return !!sh_pmu;
 }
 
-const char *perf_pmu_name(void)
-{
-   if (!sh_pmu)
-   return NULL;
-
-   return sh_pmu->name;
-}
-EXPORT_SYMBOL_GPL(perf_pmu_name);
-
-int perf_num_counters(void)
-{
-   if (!sh_pmu)
-   return 0;
-
-   return sh_pmu->num_events;
-}
-EXPORT_SYMBOL_GPL(perf_num_counters);
-
 /*
  * Release the PMU if this is the last perf_event.
  */
-- 
2.29.2



[PATCH 3/5] s390: Get rid of oprofile leftovers

2021-04-14 Thread Marc Zyngier
perf_pmu_name() and perf_num_counters() are unused. Drop them.

Signed-off-by: Marc Zyngier 
---
 arch/s390/kernel/perf_event.c | 21 -
 1 file changed, 21 deletions(-)

diff --git a/arch/s390/kernel/perf_event.c b/arch/s390/kernel/perf_event.c
index 1e75cc983546..ea7729bebaa0 100644
--- a/arch/s390/kernel/perf_event.c
+++ b/arch/s390/kernel/perf_event.c
@@ -23,27 +23,6 @@
 #include 
 #include 
 
-const char *perf_pmu_name(void)
-{
-   if (cpum_cf_avail() || cpum_sf_avail())
-   return "CPU-Measurement Facilities (CPU-MF)";
-   return "pmu";
-}
-EXPORT_SYMBOL(perf_pmu_name);
-
-int perf_num_counters(void)
-{
-   int num = 0;
-
-   if (cpum_cf_avail())
-   num += PERF_CPUM_CF_MAX_CTR;
-   if (cpum_sf_avail())
-   num += PERF_CPUM_SF_MAX_CTR;
-
-   return num;
-}
-EXPORT_SYMBOL(perf_num_counters);
-
 static struct kvm_s390_sie_block *sie_block(struct pt_regs *regs)
 {
struct stack_frame *stack = (struct stack_frame *) regs->gprs[15];
-- 
2.29.2



[PATCH 1/5] KVM: arm64: Divorce the perf code from oprofile helpers

2021-04-14 Thread Marc Zyngier
KVM/arm64 is the sole user of perf_num_counters(), and really
could do without it. Stop using the obsolete API by relying on
the existing probing code.

Signed-off-by: Marc Zyngier 
---
 arch/arm64/kvm/perf.c | 7 +--
 arch/arm64/kvm/pmu-emul.c | 2 +-
 include/kvm/arm_pmu.h | 4 
 3 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/kvm/perf.c b/arch/arm64/kvm/perf.c
index 739164324afe..b8b398670ef2 100644
--- a/arch/arm64/kvm/perf.c
+++ b/arch/arm64/kvm/perf.c
@@ -50,12 +50,7 @@ static struct perf_guest_info_callbacks kvm_guest_cbs = {
 
 int kvm_perf_init(void)
 {
-   /*
-* Check if HW_PERF_EVENTS are supported by checking the number of
-* hardware performance counters. This could ensure the presence of
-* a physical PMU and CONFIG_PERF_EVENT is selected.
-*/
-   if (IS_ENABLED(CONFIG_ARM_PMU) && perf_num_counters() > 0)
+   if (kvm_pmu_probe_pmuver() != 0xf)
static_branch_enable(_arm_pmu_available);
 
return perf_register_guest_info_callbacks(_guest_cbs);
diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index e32c6e139a09..fd167d4f4215 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -739,7 +739,7 @@ void kvm_pmu_set_counter_event_type(struct kvm_vcpu *vcpu, 
u64 data,
kvm_pmu_create_perf_event(vcpu, select_idx);
 }
 
-static int kvm_pmu_probe_pmuver(void)
+int kvm_pmu_probe_pmuver(void)
 {
struct perf_event_attr attr = { };
struct perf_event *event;
diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
index 6fd3cda608e4..864b9997efb2 100644
--- a/include/kvm/arm_pmu.h
+++ b/include/kvm/arm_pmu.h
@@ -61,6 +61,7 @@ int kvm_arm_pmu_v3_get_attr(struct kvm_vcpu *vcpu,
 int kvm_arm_pmu_v3_has_attr(struct kvm_vcpu *vcpu,
struct kvm_device_attr *attr);
 int kvm_arm_pmu_v3_enable(struct kvm_vcpu *vcpu);
+int kvm_pmu_probe_pmuver(void);
 #else
 struct kvm_pmu {
 };
@@ -116,6 +117,9 @@ static inline u64 kvm_pmu_get_pmceid(struct kvm_vcpu *vcpu, 
bool pmceid1)
 {
return 0;
 }
+
+static inline int kvm_pmu_probe_pmuver(void) { return 0xf; }
+
 #endif
 
 #endif
-- 
2.29.2



[PATCH 0/5] perf: oprofile spring cleanup

2021-04-14 Thread Marc Zyngier
This small series builds on top of the work that was started with [1].

It recently became apparent that KVM/arm64 is the last bit of the
kernel that still uses perf_num_counters().

As I went ahead to address this, it became obvious that all traces of
oprofile had been eradicated from all architectures but arm64, s390
and sh (plus a bit of cruft in the core perf code). With KVM fixed,
perf_num_counters() and perf_pmu_name() are finally gone.

Thanks,

M.

[1] https://lore.kernel.org/lkml/20210215050618.hgftdmfmslbdrg3j@vireshk-i7

Marc Zyngier (5):
  KVM: arm64: Divorce the perf code from oprofile helpers
  arm64: Get rid of oprofile leftovers
  s390: Get rid of oprofile leftovers
  sh: Get rid of oprofile leftovers
  perf: Get rid of oprofile leftovers

 arch/arm64/kvm/perf.c |  7 +--
 arch/arm64/kvm/pmu-emul.c |  2 +-
 arch/s390/kernel/perf_event.c | 21 -
 arch/sh/kernel/perf_event.c   | 18 --
 drivers/perf/arm_pmu.c| 30 --
 include/kvm/arm_pmu.h |  4 
 include/linux/perf_event.h|  2 --
 kernel/events/core.c  |  5 -
 8 files changed, 6 insertions(+), 83 deletions(-)

-- 
2.29.2


Re: [RFC] ITS fails to allocate on rk3568/rk3566

2021-04-14 Thread Marc Zyngier
On Wed, 14 Apr 2021 12:41:20 +0100,
Peter Geis  wrote:
> 
> On Tue, Apr 13, 2021 at 11:51 AM Marc Zyngier  wrote:
> >
> > On Tue, 13 Apr 2021 16:03:51 +0100,
> > Peter Geis  wrote:
> > >
> > > On Tue, Apr 13, 2021 at 10:01 AM Marc Zyngier  wrote:
> >
> > [...]
> >
> > > > What happens if you hack all the allocations to happen in the low 4GB
> > > > of the PA space?
> > >
> > > It seems to work correctly.
> > > The downstream hacks used GFP_DMA32 which gets discarded by
> > > kmalloc_fix_flags on certain allocations.
> > > Switching to GFP_DMA seems to have satisfied it, but it feels wrong
> > > using this code.
> > > Need to check the corner cases to make sure I'm not missing something.
> >
> > The problem is that GFP_DMA doesn't always mean the same thing.
> > Overall, we need to hear from Rockchip about the exact nature of the
> > problem, and then we *may* be able to work something out.
> 
> From what I've read, GFP_DMA allocates as low as possible, while
> GFP_DMA32 ensures it's in the 32 bit address range, am I understanding
> this correctly?

ZONE_DMA{,32} aren't necessarily selected, and can vary in size (some
equally broken systems can only DMA over 30bits...).

> Is there a reason GFP_DMA is permitted while GFP_DMA32 is not, aside
> from backwards compatibility?  (I saw the notes about how we aren't
> really supposed to rely on these flags)

They are completely independent, and they can either be selected or
not. And plenty of systems do not have any memory in the low
4GB. FWIW, one of my main machines has its first byte of RAM at 1TB.

Which means that supporting this system is going to require some very
specific handling.

> I've also confirmed that their disabling shareability and caching is
> necessary.

Confirmed how? For which tables? We really cannot guess this kind of
thing.

> > I'd also like to understand whether it is broken because you happen to
> > have pre-release silicon that will never make it into the wild, or if
> > this is the real thing that is going to ship on millions of devices.
> 
> My understanding is these chips are samples prior to the full
> production run, but we are waiting on official comment from Rockchip
> about this particular errata.

OK. Please let me know once you get a full description of the problem
from Rockchip. We will also need an official erratum number for this
if this is to be worked around in mainline.

M.

-- 
Without deviation from the norm, progress is not possible.


Re: arch/arm64/kvm/perf.c:58:36: error: implicit declaration of function 'perf_num_counters'

2021-04-14 Thread Marc Zyngier
On Tue, 13 Apr 2021 21:00:57 +0100,
Nathan Chancellor  wrote:

[...]

> I just ran into this again. It is not a clang specific issue, it
> reproduces quite easily with arm64 defconfig minus CONFIG_PERF_EVENTS
> and gcc 10.3.0:
> 
> arch/arm64/kvm/perf.c: In function 'kvm_perf_init':
> arch/arm64/kvm/perf.c:58:36: error: implicit declaration of function
> 'perf_num_counters'; did you mean 'dec_mm_counter'?
> [-Werror=implicit-function-declaration]
>58 |  if (IS_ENABLED(CONFIG_ARM_PMU) && perf_num_counters() > 0)
>   |^
>   |dec_mm_counter
> cc1: some warnings being treated as errors
> 
> I am not sure what the cleanest solution would be for providing a static
> inline version of perf_num_counters() would be, as only arm64 actually
> uses it (sh and s390 define it but it does not appear to be used) but it
> is only available through CONFIG_ARM_PMU instead of just
> CONFIG_PERF_EVENTS like the other two architectures mentioned above.

As you point out, KVM/arm64 is the only user of perf_num_counters()
across the whole kernel. The whole oprofile subsystem has been
removed, so maybe a a bigger cleanup is in order.

I'll post something shortly.

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH v3 2/2] kvm/arm64: Try stage2 block mapping for host device MMIO

2021-04-14 Thread Marc Zyngier
+ Santosh, who found some interesting bugs in that area before.

On Wed, 14 Apr 2021 07:51:09 +0100,
Keqian Zhu  wrote:
> 
> The MMIO region of a device maybe huge (GB level), try to use
> block mapping in stage2 to speedup both map and unmap.
> 
> Compared to normal memory mapping, we should consider two more
> points when try block mapping for MMIO region:
> 
> 1. For normal memory mapping, the PA(host physical address) and
> HVA have same alignment within PUD_SIZE or PMD_SIZE when we use
> the HVA to request hugepage, so we don't need to consider PA
> alignment when verifing block mapping. But for device memory
> mapping, the PA and HVA may have different alignment.
> 
> 2. For normal memory mapping, we are sure hugepage size properly
> fit into vma, so we don't check whether the mapping size exceeds
> the boundary of vma. But for device memory mapping, we should pay
> attention to this.
> 
> This adds device_rough_page_shift() to check these two points when
> selecting block mapping size.
> 
> Signed-off-by: Keqian Zhu 
> ---
>  arch/arm64/kvm/mmu.c | 37 +
>  1 file changed, 33 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index c59af5ca01b0..1a6d96169d60 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -624,6 +624,31 @@ static void kvm_send_hwpoison_signal(unsigned long 
> address, short lsb)
>   send_sig_mceerr(BUS_MCEERR_AR, (void __user *)address, lsb, current);
>  }
>  
> +/*
> + * Find a max mapping size that properly insides the vma. And hva and pa must
> + * have the same alignment to this mapping size. It's rough as there are 
> still
> + * other restrictions, will be checked by 
> fault_supports_stage2_huge_mapping().
> + */
> +static short device_rough_page_shift(struct vm_area_struct *vma,
> +  unsigned long hva)

My earlier question still stands. Under which circumstances would this
function return something that is *not* the final mapping size? I
really don't see a reason why this would not return the final mapping
size.

> +{
> + phys_addr_t pa = (vma->vm_pgoff << PAGE_SHIFT) + (hva - vma->vm_start);
> +
> +#ifndef __PAGETABLE_PMD_FOLDED
> + if ((hva & (PUD_SIZE - 1)) == (pa & (PUD_SIZE - 1)) &&
> + ALIGN_DOWN(hva, PUD_SIZE) >= vma->vm_start &&
> + ALIGN(hva, PUD_SIZE) <= vma->vm_end)
> + return PUD_SHIFT;
> +#endif
> +
> + if ((hva & (PMD_SIZE - 1)) == (pa & (PMD_SIZE - 1)) &&
> + ALIGN_DOWN(hva, PMD_SIZE) >= vma->vm_start &&
> + ALIGN(hva, PMD_SIZE) <= vma->vm_end)
> + return PMD_SHIFT;
> +
> + return PAGE_SHIFT;
> +}
> +
>  static bool fault_supports_stage2_huge_mapping(struct kvm_memory_slot 
> *memslot,
>  unsigned long hva,
>  unsigned long map_size)
> @@ -769,7 +794,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
> phys_addr_t fault_ipa,
>   return -EFAULT;
>   }
>  
> - /* Let's check if we will get back a huge page backed by hugetlbfs */
> + /*
> +  * Let's check if we will get back a huge page backed by hugetlbfs, or
> +  * get block mapping for device MMIO region.
> +  */
>   mmap_read_lock(current->mm);
>   vma = find_vma_intersection(current->mm, hva, hva + 1);
>   if (unlikely(!vma)) {
> @@ -780,11 +808,12 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
> phys_addr_t fault_ipa,
>  
>   if (is_vm_hugetlb_page(vma))
>   vma_shift = huge_page_shift(hstate_vma(vma));
> + else if (vma->vm_flags & VM_PFNMAP)
> + vma_shift = device_rough_page_shift(vma, hva);

What prevents a VMA from having both VM_HUGETLB and VM_PFNMAP? This is
pretty unlikely, but I'd like to see this case catered for.

>   else
>   vma_shift = PAGE_SHIFT;
>  
> - if (logging_active ||
> - (vma->vm_flags & VM_PFNMAP)) {
> + if (logging_active) {
>   force_pte = true;
>   vma_shift = PAGE_SHIFT;
>   }
> @@ -855,7 +884,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
> phys_addr_t fault_ipa,
>  
>   if (kvm_is_device_pfn(pfn)) {
>   device = true;
> - force_pte = true;
> + force_pte = (vma_pagesize == PAGE_SIZE);

Why do we need to set force_pte if we are already dealing with
PAGE_SIZE? I guess you are doing this for the sake of avoiding the
call to transparent_hugepage_adjust(), right?

I'd rather you simply don't try to upgrade a device mapping by
explicitly checking for this and keep force_pte for *memory*
exclusively.

Santosh, can you please take a look at this series and try to see if
the problem you fixed in [1] (which ended up as commit 91a2c34b7d6f)
is still OK with this series?

>   } else if (logging_active && !write_fault) {
>   /*
>* Only actually map the page as writable if 

Re: [GIT PULL] coresight: Fixes for ETE and TRBE

2021-04-14 Thread Marc Zyngier
On Tue, 13 Apr 2021 20:45:00 +0100,
Mathieu Poirier  wrote:
> 
> On Tue, 13 Apr 2021 at 10:52, Marc Zyngier  wrote:
> >
> > Hi Mathieu,
> >
> > On Tue, 13 Apr 2021 17:19:52 +0100,
> > Mathieu Poirier  wrote:
> > >
> > > The following changes since commit 
> > > 4fb13790417a7bf726f3867a5d2b9723efde488b:
> > >
> > >   dts: bindings: Document device tree bindings for Arm TRBE (2021-04-06 
> > > 16:05:38 -0600)
> > >
> > > are available in the Git repository at:
> > >
> > >   g...@gitolite.kernel.org:pub/scm/linux/kernel/git/coresight/linux.git 
> > > next-ETE-TRBE
> > >
> > > for you to fetch changes up to 68d400c079978f649e7f63aba966d219743edd64:
> > >
> > >   coresight: trbe: Fix return value check in 
> > > arm_trbe_register_coresight_cpu() (2021-04-13 09:46:27 -0600)
> > >
> > > 
> > > Hi Marc,
> > >
> > > Please consider these two patches, they are ETE/TRBE fixes found by bots.
> > >
> > > Let me know if you want me to rebase on your next branch and send the
> > > pull request from that.
> >
> > I've now pulled this into kvmarm/next. If you have additional fixes,
> > just stick them on top of your next-ETE-TRBE branch like you did with
> > these two patches.
> >
> 
> Much appreciated - I owe you a beer.

One day, my friend. One day...

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [GIT PULL] coresight: Fixes for ETE and TRBE

2021-04-13 Thread Marc Zyngier
Hi Mathieu,

On Tue, 13 Apr 2021 17:19:52 +0100,
Mathieu Poirier  wrote:
> 
> The following changes since commit 4fb13790417a7bf726f3867a5d2b9723efde488b:
> 
>   dts: bindings: Document device tree bindings for Arm TRBE (2021-04-06 
> 16:05:38 -0600)
> 
> are available in the Git repository at:
> 
>   g...@gitolite.kernel.org:pub/scm/linux/kernel/git/coresight/linux.git 
> next-ETE-TRBE
> 
> for you to fetch changes up to 68d400c079978f649e7f63aba966d219743edd64:
> 
>   coresight: trbe: Fix return value check in 
> arm_trbe_register_coresight_cpu() (2021-04-13 09:46:27 -0600)
> 
> 
> Hi Marc,
> 
> Please consider these two patches, they are ETE/TRBE fixes found by bots.
> 
> Let me know if you want me to rebase on your next branch and send the
> pull request from that.

I've now pulled this into kvmarm/next. If you have additional fixes,
just stick them on top of your next-ETE-TRBE branch like you did with
these two patches.

The kvmarm/next branch gets rebuilt every other day, so it isn't a
stable branch on its own. Only the non-merge commits are stable, which
is why I keep everything on topic branches.

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [RFC] ITS fails to allocate on rk3568/rk3566

2021-04-13 Thread Marc Zyngier
On Tue, 13 Apr 2021 16:03:51 +0100,
Peter Geis  wrote:
> 
> On Tue, Apr 13, 2021 at 10:01 AM Marc Zyngier  wrote:

[...]

> > What happens if you hack all the allocations to happen in the low 4GB
> > of the PA space?
> 
> It seems to work correctly.
> The downstream hacks used GFP_DMA32 which gets discarded by
> kmalloc_fix_flags on certain allocations.
> Switching to GFP_DMA seems to have satisfied it, but it feels wrong
> using this code.
> Need to check the corner cases to make sure I'm not missing something.

The problem is that GFP_DMA doesn't always mean the same thing.
Overall, we need to hear from Rockchip about the exact nature of the
problem, and then we *may* be able to work something out.

I'd also like to understand whether it is broken because you happen to
have pre-release silicon that will never make it into the wild, or if
this is the real thing that is going to ship on millions of devices.

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH] KVM: arm/arm64: Fix KVM_VGIC_V3_ADDR_TYPE_REDIST read

2021-04-13 Thread Marc Zyngier
On Mon, 12 Apr 2021 17:00:34 +0200, Eric Auger wrote:
> When reading the base address of the a REDIST region
> through KVM_VGIC_V3_ADDR_TYPE_REDIST we expect the
> redistributor region list to be populated with a single
> element.
> 
> However list_first_entry() expects the list to be non empty.
> Instead we should use list_first_entry_or_null which effectively
> returns NULL if the list is empty.

Applied to kvm-arm64/vgic-5.13, thanks!

[1/1] KVM: arm/arm64: Fix KVM_VGIC_V3_ADDR_TYPE_REDIST read
  commit: 94ac0835391efc1a30feda6fc908913ec012951e

Cheers,

M.
-- 
Without deviation from the norm, progress is not possible.




Re: [PATCH] Fixed: ARM64 GIC ITS could not resume from suspend

2021-04-13 Thread Marc Zyngier
On Tue, 13 Apr 2021 07:40:10 +0100,
414777...@qq.com wrote:
> 
> From: Mengguang Peng 
> 
> - After ITS suspend, in the ATF(arm-trusted-firmware),
>   gicv3_rdistif_init_restore() just restore GICR_CTLR.Enable_LPIs bit
>   of boot cpu.
> 
> - In its_cpu_init_lpis() of kernel, gic_data_rdist()->lpi_enable
>   will block setting GICR_CTLR_ENABLE_LPIS bit of the other CPUs
>   when ITS resume after suspend.
> 
> Signed-off-by: Mengguang Peng 
> ---
>  drivers/irqchip/irq-gic-v3-its.c | 18 +-
>  1 file changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/irqchip/irq-gic-v3-its.c 
> b/drivers/irqchip/irq-gic-v3-its.c
> index ed46e60..8167397 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -4777,7 +4777,7 @@ static int its_save_disable(void)
>  static void its_restore_enable(void)
>  {
>   struct its_node *its;
> - int ret;
> + int ret, cpu;
>  
>   raw_spin_lock(_lock);
>   list_for_each_entry(its, _nodes, entry) {
> @@ -4831,6 +4831,22 @@ static void its_restore_enable(void)
>   GITS_TYPER_HCC(gic_read_typer(base + GITS_TYPER)))
>   its_cpu_init_collection(its);
>   }
> +
> + /*
> +  * Enable LPIs: firmware just restore GICR_CTLR.Enable_LPIs
> +  * of boot cpu, the other CPUs also should be restore.
> +  */
> + for_each_online_cpu(cpu) {
> + void __iomem *rbase = gic_data_rdist_cpu(cpu)->rd_base;
> + u32 val;
> +
> + /* Enable LPIs */
> + val = readl_relaxed(rbase + GICR_CTLR);
> + if (val)
> + continue;
> + val |= GICR_CTLR_ENABLE_LPIS;
> + writel_relaxed(val, rbase + GICR_CTLR);
> + }
>   raw_spin_unlock(_lock);
>  }

This looks completely flawed. If GICR_CTLR.Enable_LPIs is clear on
resume, how can you trust the rest of the bits in the register? How
can you trust *any* register in the redistributor?

And what makes you think it is valid or safe to blindly enable LPIs
without even checking whether they were enabled the first place?

I will reiterate my take on this: if the firmware messes with the RD
on suspend, please address the problem in the firmware so that the RDs
are correctly restored on each CPU.

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [RFC] ITS fails to allocate on rk3568/rk3566

2021-04-13 Thread Marc Zyngier
Hi Peter,

On Mon, 12 Apr 2021 21:49:59 +0100,
Peter Geis  wrote:
> 
> Good Afternoon,
> 
> I am assisting with early bringup of the rk3566 based quartz64
> development board for mainline linux.
> I've encountered a few issues with allocating ITS on their version of
> the GIC-V3.
> The first issue is the ITS controller can only use 32bit addresses.
> This leads to the following error:
> [0.00] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
> [0.00] GICv3: GIC: Using split EOI/Deactivate mode
> [0.00] GICv3: 320 SPIs implemented
> [0.00] GICv3: 0 Extended SPIs implemented
> [0.00] GICv3: Distributor has no Range Selector support
> [0.00] Root IRQ handler: gic_handle_irq
> [0.00] GICv3: 16 PPIs implemented
> [0.00] GICv3: CPU0: found redistributor 0 region 0:0xfd46
> [0.00] ITS [mem 0xfd44-0xfd45]
> [0.00] ITS@0xfd44: Devices doesn't stick:
> f907000100190600 f90700190600

Ouch. That looks pretty bad. Bit 32 of the register doesn't stick, and
that's right in the middle of the address. The register should be
fully writable as far as the address field is concerned.

Please dump the distributor and ITS IIDR registers so that I can find
the TRM for the exact IP.

> [0.00] ITS@0xfd44: failed probing (-6)
> [0.00] ITS: No ITS available, not enabling LPIs
> 
> Downstream fixed this by adding the GFP_DMA32 flag to the memory
> allocations.

Urgh... this really looks like broken silicon to me.

> They also force clear the GITS_BASER_SHAREABILITY_MASK.

Why? Does this also apply to the command queue? Are they forcing cache
flushing?

> Unfortunately while this allowed ITS to allocate on downstream, as
> soon as MSIs attempted to use it all interrupts would time out.
> 
> On upstream, we observe this during allocation:
> [0.00] ITS [mem 0xfd44-0xfd45]
> [0.00] ITS@0xfd44: allocated 8192 Devices @381
> (indirect, esz 8, psz 64K, shr 1)
> [0.00] ITS@0xfd44: allocated 32768 Interrupt
> Collections @382 (flat, esz 2, psz 64K, shr 1)
> [0.00] GICv3: using LPI property table @0x0383
> [0.00] GICv3: CPU0: using allocated LPI pending table
> @0x0384
> [0.00] ITS queue timeout (64 1)
> [0.00] ITS cmd its_build_mapc_cmd failed
> [0.00] ITS queue timeout (96 1)
> [0.00] ITS cmd its_build_invall_cmd failed
> 

So the command queue is not making forward progress. Either because
the ITS cannot access the commands, or because it cannot use the
memory it has been allocated. Please dump GITS_CBASER (and the value
that has been written to it), just in case it shows the same
brokenness as the GITS_BASER registers...

[...]

> Any assistance you can provide would be greatly appreciated.

I'm not sure there is much we can do without a lot more details about
the HW. We need to know the exact GIC implementation they are using
(ARM has two versions of the GICv3 IP), and we also need to find out
*how* this has been integrated. Only Rockchip can tell you that.

Once we know which version they are using, and how it is wired, we can
start looking at some IMPDEF debug registers.

The Linux driver assumes that the ITS is able to access the whole
memory. If there are restrictions on what memory ranges can be used,
it is going to be a pain to support :-(.

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [RFC v4 net-next 2/4] net: dsa: mt7530: add interrupt support

2021-04-13 Thread Marc Zyngier
On Tue, 13 Apr 2021 01:07:23 +0100,
Andrew Lunn  wrote:
> 
> > > > +static void
> > > > +mt7530_setup_mdio_irq(struct mt7530_priv *priv)
> > > > +{
> > > > +   struct dsa_switch *ds = priv->ds;
> > > > +   int p;
> > > > +
> > > > +   for (p = 0; p < MT7530_NUM_PHYS; p++) {
> > > > +   if (BIT(p) & ds->phys_mii_mask) {
> > > > +   unsigned int irq;
> > > > +
> > > > +   irq = irq_create_mapping(priv->irq_domain, p);
> > > 
> > > This seems odd. Why aren't the MDIO IRQs allocated on demand as
> > > endpoint attached to this interrupt controller are being probed
> > > individually? In general, doing this allocation upfront is an
> > > indication that there is some missing information in the DT to perform
> > > the discovery.
> > 
> > This is what Andrew's mv88e6xxx does, actually. In addition, I also check
> > the phys_mii_mask to avoid creating mappings for unused ports.
> 
> It can be done via DT, using the standard interrupt property, so long
> as you use of_mdiobus_register(np).
> 
> But when you have an 7 port switch, and a nice simple mapping, port 0
> PHY using interrupt 0, you can save a lot of device tree boilerplate
> by doing it in code. And when you have 4 of these switches, it gets
> very boring adding all the DT to just wire up the interrupts 28
> interrupts.

I guess this is depends whether the most usual case is to have all
these interrupts being actively in use or not. Most interrupts only
use a limited portion of their interrupt space at any given time.
Allocating all interrupts and creating mappings upfront is a waste of
memory.

If the use case here is that all these interrupts will be wired and
used in most cases, then upfront allocation is probably not a problem.

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH v2 00/10] KVM: Consolidate and optimize MMU notifiers

2021-04-12 Thread Marc Zyngier
On Fri, 02 Apr 2021 13:17:45 +0100,
Paolo Bonzini  wrote:
> 
> On 02/04/21 02:56, Sean Christopherson wrote:
> > The end goal of this series is to optimize the MMU notifiers to take
> > mmu_lock if and only if the notification is relevant to KVM, i.e. the hva
> > range overlaps a memslot.   Large VMs (hundreds of vCPUs) are very
> > sensitive to mmu_lock being taken for write at inopportune times, and
> > such VMs also tend to be "static", e.g. backed by HugeTLB with minimal
> > page shenanigans.  The vast majority of notifications for these VMs will
> > be spurious (for KVM), and eliding mmu_lock for spurious notifications
> > avoids an otherwise unacceptable disruption to the guest.
> > 
> > To get there without potentially degrading performance, e.g. due to
> > multiple memslot lookups, especially on non-x86 where the use cases are
> > largely unknown (from my perspective), first consolidate the MMU notifier
> > logic by moving the hva->gfn lookups into common KVM.
> > 
> > Based on kvm/queue, commit 5f986f748438 ("KVM: x86: dump_vmcs should
> > include the autoload/autostore MSR lists").
> > 
> > Well tested on Intel and AMD.  Compile tested for arm64, MIPS, PPC,
> > PPC e500, and s390.  Absolutely needs to be tested for real on non-x86,
> > I give it even odds that I introduced an off-by-one bug somewhere.
> > 
> > v2:
> >   - Drop the patches that have already been pushed to kvm/queue.
> >   - Drop two selftest changes that had snuck in via "git commit -a".
> >   - Add a patch to assert that mmu_notifier_count is elevated when
> > .change_pte() runs. [Paolo]
> >   - Split out moving KVM_MMU_(UN)LOCK() to __kvm_handle_hva_range() to a
> > separate patch.  Opted not to squash it with the introduction of the
> > common hva walkers (patch 02), as that prevented sharing code between
> > the old and new APIs. [Paolo]
> >   - Tweak the comment in kvm_vm_destroy() above the smashing of the new
> > slots lock. [Paolo]
> >   - Make mmu_notifier_slots_lock unconditional to avoid #ifdefs. [Paolo]
> > 
> > v1:
> >   - https://lkml.kernel.org/r/20210326021957.1424875-1-sea...@google.com
> > 
> > Sean Christopherson (10):
> >KVM: Assert that notifier count is elevated in .change_pte()
> >KVM: Move x86's MMU notifier memslot walkers to generic code
> >KVM: arm64: Convert to the gfn-based MMU notifier callbacks
> >KVM: MIPS/MMU: Convert to the gfn-based MMU notifier callbacks
> >KVM: PPC: Convert to the gfn-based MMU notifier callbacks
> >KVM: Kill off the old hva-based MMU notifier callbacks
> >KVM: Move MMU notifier's mmu_lock acquisition into common helper
> >KVM: Take mmu_lock when handling MMU notifier iff the hva hits a
> >  memslot
> >KVM: Don't take mmu_lock for range invalidation unless necessary
> >KVM: x86/mmu: Allow yielding during MMU notifier unmap/zap, if
> >  possible
> > 
> >   arch/arm64/kvm/mmu.c   | 117 +++--
> >   arch/mips/kvm/mmu.c|  97 ++--
> >   arch/powerpc/include/asm/kvm_book3s.h  |  12 +-
> >   arch/powerpc/include/asm/kvm_ppc.h |   9 +-
> >   arch/powerpc/kvm/book3s.c  |  18 +-
> >   arch/powerpc/kvm/book3s.h  |  10 +-
> >   arch/powerpc/kvm/book3s_64_mmu_hv.c|  98 ++--
> >   arch/powerpc/kvm/book3s_64_mmu_radix.c |  25 +-
> >   arch/powerpc/kvm/book3s_hv.c   |  12 +-
> >   arch/powerpc/kvm/book3s_pr.c   |  56 ++---
> >   arch/powerpc/kvm/e500_mmu_host.c   |  27 +-
> >   arch/x86/kvm/mmu/mmu.c | 127 --
> >   arch/x86/kvm/mmu/tdp_mmu.c | 245 +++
> >   arch/x86/kvm/mmu/tdp_mmu.h |  14 +-
> >   include/linux/kvm_host.h   |  22 +-
> >   virt/kvm/kvm_main.c| 325 +++--
> >   16 files changed, 552 insertions(+), 662 deletions(-)
> > 
> 
> For MIPS, I am going to post a series that simplifies TLB flushing
> further.  I applied it, and rebased this one on top, to
> kvm/mmu-notifier-queue.
> 
> Architecture maintainers, please look at the branch and
> review/test/ack your parts.

I've given this a reasonably good beating on arm64 for both VHE and
nVHE HW, and nothing caught fire, although I was left with a conflict
in the x86 code after merging with linux/master.

Feel free to add a

Tested-by: Marc Zyngier 

for the arm64 side.

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH v2 03/10] KVM: arm64: Convert to the gfn-based MMU notifier callbacks

2021-04-12 Thread Marc Zyngier
On Fri, 02 Apr 2021 01:56:51 +0100,
Sean Christopherson  wrote:
> 
> Move arm64 to the gfn-base MMU notifier APIs, which do the hva->gfn
> lookup in common code.
> 
> No meaningful functional change intended, though the exact order of
> operations is slightly different since the memslot lookups occur before
> calling into arch code.
> 
> Signed-off-by: Sean Christopherson 

Reviewed-by: Marc Zyngier 

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [RFC v4 net-next 2/4] net: dsa: mt7530: add interrupt support

2021-04-12 Thread Marc Zyngier
On Mon, 12 Apr 2021 04:42:35 +0100,
DENG Qingfang  wrote:
> 
> Add support for MT7530 interrupt controller to handle internal PHYs.
> In order to assign an IRQ number to each PHY, the registration of MDIO bus
> is also done in this driver.
> 
> Signed-off-by: DENG Qingfang 
> ---
> RFC v3 -> RFC v4:
> - No changes.
> 
>  drivers/net/dsa/Kconfig  |   1 +
>  drivers/net/dsa/mt7530.c | 266 +++
>  drivers/net/dsa/mt7530.h |  20 ++-
>  3 files changed, 258 insertions(+), 29 deletions(-)
> 
> diff --git a/drivers/net/dsa/Kconfig b/drivers/net/dsa/Kconfig
> index a5f1aa911fe2..264384449f09 100644
> --- a/drivers/net/dsa/Kconfig
> +++ b/drivers/net/dsa/Kconfig
> @@ -36,6 +36,7 @@ config NET_DSA_LANTIQ_GSWIP
>  config NET_DSA_MT7530
>   tristate "MediaTek MT753x and MT7621 Ethernet switch support"
>   select NET_DSA_TAG_MTK
> + select MEDIATEK_PHY
>   help
> This enables support for the MediaTek MT7530, MT7531, and MT7621
> Ethernet switch chips.
> diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c
> index 2bd1bab71497..da033004a74d 100644
> --- a/drivers/net/dsa/mt7530.c
> +++ b/drivers/net/dsa/mt7530.c
> @@ -10,6 +10,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -596,18 +597,14 @@ mt7530_mib_reset(struct dsa_switch *ds)
>   mt7530_write(priv, MT7530_MIB_CCR, CCR_MIB_ACTIVATE);
>  }
>  
> -static int mt7530_phy_read(struct dsa_switch *ds, int port, int regnum)
> +static int mt7530_phy_read(struct mt7530_priv *priv, int port, int regnum)
>  {
> - struct mt7530_priv *priv = ds->priv;
> -
>   return mdiobus_read_nested(priv->bus, port, regnum);
>  }
>  
> -static int mt7530_phy_write(struct dsa_switch *ds, int port, int regnum,
> +static int mt7530_phy_write(struct mt7530_priv *priv, int port, int regnum,
>   u16 val)
>  {
> - struct mt7530_priv *priv = ds->priv;
> -
>   return mdiobus_write_nested(priv->bus, port, regnum, val);
>  }
>  
> @@ -785,9 +782,8 @@ mt7531_ind_c22_phy_write(struct mt7530_priv *priv, int 
> port, int regnum,
>  }
>  
>  static int
> -mt7531_ind_phy_read(struct dsa_switch *ds, int port, int regnum)
> +mt7531_ind_phy_read(struct mt7530_priv *priv, int port, int regnum)
>  {
> - struct mt7530_priv *priv = ds->priv;
>   int devad;
>   int ret;
>  
> @@ -803,10 +799,9 @@ mt7531_ind_phy_read(struct dsa_switch *ds, int port, int 
> regnum)
>  }
>  
>  static int
> -mt7531_ind_phy_write(struct dsa_switch *ds, int port, int regnum,
> +mt7531_ind_phy_write(struct mt7530_priv *priv, int port, int regnum,
>u16 data)
>  {
> - struct mt7530_priv *priv = ds->priv;
>   int devad;
>   int ret;
>  
> @@ -822,6 +817,22 @@ mt7531_ind_phy_write(struct dsa_switch *ds, int port, 
> int regnum,
>   return ret;
>  }
>  
> +static int
> +mt753x_phy_read(struct mii_bus *bus, int port, int regnum)
> +{
> + struct mt7530_priv *priv = bus->priv;
> +
> + return priv->info->phy_read(priv, port, regnum);
> +}
> +
> +static int
> +mt753x_phy_write(struct mii_bus *bus, int port, int regnum, u16 val)
> +{
> + struct mt7530_priv *priv = bus->priv;
> +
> + return priv->info->phy_write(priv, port, regnum, val);
> +}
> +
>  static void
>  mt7530_get_strings(struct dsa_switch *ds, int port, u32 stringset,
>  uint8_t *data)
> @@ -1828,6 +1839,211 @@ mt7530_setup_gpio(struct mt7530_priv *priv)
>  }
>  #endif /* CONFIG_GPIOLIB */
>  
> +static irqreturn_t
> +mt7530_irq_thread_fn(int irq, void *dev_id)
> +{
> + struct mt7530_priv *priv = dev_id;
> + bool handled = false;
> + u32 val;
> + int p;
> +
> + mutex_lock_nested(>bus->mdio_lock, MDIO_MUTEX_NESTED);
> + val = mt7530_mii_read(priv, MT7530_SYS_INT_STS);
> + mt7530_mii_write(priv, MT7530_SYS_INT_STS, val);
> + mutex_unlock(>bus->mdio_lock);
> +
> + for (p = 0; p < MT7530_NUM_PHYS; p++) {
> + if (BIT(p) & val) {
> + unsigned int irq;
> +
> + irq = irq_find_mapping(priv->irq_domain, p);
> + handle_nested_irq(irq);
> + handled = true;
> + }
> + }
> +
> + return IRQ_RETVAL(handled);
> +}
> +
> +static void
> +mt7530_irq_mask(struct irq_data *d)
> +{
> + struct mt7530_priv *priv = irq_data_get_irq_chip_data(d);
> +
> + priv->irq_enable &= ~BIT(d->hwirq);
> +}
> +
> +static void
> +mt7530_irq_unmask(struct irq_data *d)
> +{
> + struct mt7530_priv *priv = irq_data_get_irq_chip_data(d);
> +
> + priv->irq_enable |= BIT(d->hwirq);
> +}
> +
> +static void
> +mt7530_irq_bus_lock(struct irq_data *d)
> +{
> + struct mt7530_priv *priv = irq_data_get_irq_chip_data(d);
> +
> + mutex_lock_nested(>bus->mdio_lock, MDIO_MUTEX_NESTED);
> +}
> +
> +static void
> +mt7530_irq_bus_sync_unlock(struct irq_data *d)
> +{
> + struct mt7530_priv *priv = irq_data_get_irq_chip_data(d);

[PATCH] genirq/irqaction: Move dev_id and percpu_dev_id in a union

2021-04-10 Thread Marc Zyngier
dev_id and percpu_dev_id are mutually exclusive in struct irqaction,
as they conceptually represent the same thing, only in a per-cpu
fashion.

Move them into an anonymous union, saving a few bytes on the way.

Signed-off-by: Marc Zyngier 
---
 include/linux/interrupt.h | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 967e25767153..4383ee033acf 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -109,8 +109,10 @@ typedef irqreturn_t (*irq_handler_t)(int, void *);
  */
 struct irqaction {
irq_handler_t   handler;
-   void*dev_id;
-   void __percpu   *percpu_dev_id;
+   union {
+   void*dev_id;
+   void __percpu   *percpu_dev_id;
+   };
struct irqaction*next;
irq_handler_t   thread_fn;
struct task_struct  *thread;
-- 
2.30.2



Re: [PATCH] irqchip: fixed S3 could not resume after suspend

2021-04-09 Thread Marc Zyngier
On Fri, 09 Apr 2021 11:10:09 +0100,
414777...@qq.com wrote:
> 
> From: Mengguang Peng 
> 
> On arm64 platform, found that the machine could not wake up after suspend,
> this patch updates the its suspend and resume handling code.
> 
> - Add a variable named ctlr_save in struct rdists.
> - When suspend, save the value of GICR_CTLR to memmory
>   in its_save_disable().
> - When resume, write the value of memory saved to GICR_CTLR
>   in its_restore_enable().

This really is the kind of things that must be handled by firmware.
How comes yours doesn't handle it?

That is what you should be fixing. See the ATF code for a (working)
reference implementation

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH v4 1/2] KVM: arm64: Move CMOs from user_mem_abort to the fault handlers

2021-04-09 Thread Marc Zyngier
On Fri, 09 Apr 2021 09:08:11 +0100,
Quentin Perret  wrote:
> 
> Hi Yanan,
> 
> On Friday 09 Apr 2021 at 11:36:51 (+0800), Yanan Wang wrote:
> > diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> > +static void stage2_invalidate_icache(void *addr, u64 size)
> > +{
> > +   if (icache_is_aliasing()) {
> > +   /* Flush any kind of VIPT icache */
> > +   __flush_icache_all();
> > +   } else if (is_kernel_in_hyp_mode() || !icache_is_vpipt()) {
> > +   /* PIPT or VPIPT at EL2 */
> > +   invalidate_icache_range((unsigned long)addr,
> > +   (unsigned long)addr + size);
> > +   }
> > +}
> > +
> 
> I would recommend to try and rebase this patch on kvmarm/next because
> we've made a few changes in pgtable.c recently. It is now linked into
> the EL2 NVHE code which means there are constraints on what can be used
> from there -- you'll need a bit of extra work to make some of these
> functions available to EL2.

That's an interesting point.

I wonder whether we are missing something on the i-side for VPITP +
host stage-2 due to switching HCR_EL2.VM. We haven't changed the VMID
(still 0), but I can't bring myself to be sure it doesn't affect the
icache in this case...

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH v13 03/18] arm64: hyp-stub: Move el1_sync into the vectors

2021-04-08 Thread Marc Zyngier
On Thu, 08 Apr 2021 15:45:18 +0100,
Pavel Tatashin  wrote:
> 
> On Thu, Apr 8, 2021 at 6:24 AM Marc Zyngier  wrote:
> >
> > On 2021-04-08 05:05, Pavel Tatashin wrote:
> > > From: James Morse 
> > >
> > > The hyp-stub's el1_sync code doesn't do very much, this can easily fit
> > > in the vectors.
> > >
> > > With this, all of the hyp-stubs behaviour is contained in its vectors.
> > > This lets kexec and hibernate copy the hyp-stub when they need its
> > > behaviour, instead of re-implementing it.
> > >
> > > Signed-off-by: James Morse 
> > >
> > > [Fixed merging issues]
> >
> > That's a pretty odd fix IMO.
> >
> > >
> > > Signed-off-by: Pavel Tatashin 
> > > ---
> > >  arch/arm64/kernel/hyp-stub.S | 59 ++--
> > >  1 file changed, 29 insertions(+), 30 deletions(-)
> > >
> > > diff --git a/arch/arm64/kernel/hyp-stub.S
> > > b/arch/arm64/kernel/hyp-stub.S
> > > index ff329c5c074d..d1a73d0f74e0 100644
> > > --- a/arch/arm64/kernel/hyp-stub.S
> > > +++ b/arch/arm64/kernel/hyp-stub.S
> > > @@ -21,6 +21,34 @@ SYM_CODE_START_LOCAL(\label)
> > >   .align 7
> > >   b   \label
> > >  SYM_CODE_END(\label)
> > > +.endm
> > > +
> > > +.macro hyp_stub_el1_sync
> > > +SYM_CODE_START_LOCAL(hyp_stub_el1_sync)
> > > + .align 7
> > > + cmp x0, #HVC_SET_VECTORS
> > > + b.ne2f
> > > + msr vbar_el2, x1
> > > + b   9f
> > > +
> > > +2:   cmp x0, #HVC_SOFT_RESTART
> > > + b.ne3f
> > > + mov x0, x2
> > > + mov x2, x4
> > > + mov x4, x1
> > > + mov x1, x3
> > > + br  x4  // no return
> > > +
> > > +3:   cmp x0, #HVC_RESET_VECTORS
> > > + beq 9f  // Nothing to reset!
> > > +
> > > + /* Someone called kvm_call_hyp() against the hyp-stub... */
> > > + mov_q   x0, HVC_STUB_ERR
> > > + eret
> > > +
> > > +9:   mov x0, xzr
> > > + eret
> > > +SYM_CODE_END(hyp_stub_el1_sync)
> >
> > You said you tested this on a TX2. I guess you don't care whether
> > it runs VHE or not...
> 
> Hi Marc,
> 
> Thank you for noticing this. Not sure how this missmerge happened. I
> have added the missing case, and VHE is initialized correctly during
> boot.
> [   14.698175] kvm [1]: VHE mode initialized successfully
> 
> During normal boot, kexec reboot, and kdump reboot. I will respin the
> series and send the version 14 soon.

Please give people a chance to review this lot first. This isn't code
that is easy to digest, and immediate re-spinning does more harm than
good (this isn't targeting 5.13, I would assume).

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.


Re: linux-next: build warning after merge of the kvm-arm tree

2021-04-08 Thread Marc Zyngier
On Thu, 08 Apr 2021 12:44:00 +0100,
Stephen Rothwell  wrote:
> 
> [1  ]
> Hi all,
> 
> After merging the kvm-arm tree, today's linux-next build (htmldocs)
> produced this warning:
> 
> /home/sfr/next/next/Documentation/virt/kvm/arm/ptp_kvm.rst:19: WARNING: 
> Malformed table.
> Text in column margin in table line 5.
> 
> =====
> Function ID: (uint32)  0x8601
> Arguments:   (uint32)  KVM_PTP_VIRT_COUNTER(0)
>KVM_PTP_PHYS_COUNTER(1)
> Return Values:   (int32)   NOT_SUPPORTED(-1) on error, or
>  (uint32)  Upper 32 bits of wall clock time (r0)
>  (uint32)  Lower 32 bits of wall clock time (r1)
>  (uint32)  Upper 32 bits of counter (r2)
>  (uint32)  Lower 32 bits of counter (r3)
> Endianness:No Restrictions.
> =====
> 
> Introduced by commit
> 
>   3bf725699bf6 ("KVM: arm64: Add support for the KVM PTP service")

Now fixed, thanks.

M.

-- 
Without deviation from the norm, progress is not possible.


[irqchip: irq/irqchip-next] irqchip/wpcm450: Drop COMPILE_TEST

2021-04-08 Thread irqchip-bot for Marc Zyngier
The following commit has been merged into the irq/irqchip-next branch of 
irqchip:

Commit-ID: 94bc94209a66f05532c065279f4a719058d447e4
Gitweb:
https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms/94bc94209a66f05532c065279f4a719058d447e4
Author:Marc Zyngier 
AuthorDate:Thu, 08 Apr 2021 08:56:27 +01:00
Committer: Marc Zyngier 
CommitterDate: Thu, 08 Apr 2021 11:37:14 +01:00

irqchip/wpcm450: Drop COMPILE_TEST

This driver is (for now) ARM specific, and currently doesn't
build with a variety of architectures (ia64, RISC-V, x86_64
at the very least).

Drop COMPILE_TEST from Kconfig until it gets sorted out.

Reviewed-by: Jonathan Neuschäfer 
Reported-by: Stephen Rothwell 
Signed-off-by: Marc Zyngier 
---
 drivers/irqchip/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
index 38ad9dc..715eb43 100644
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -579,7 +579,7 @@ config MST_IRQ
 
 config WPCM450_AIC
bool "Nuvoton WPCM450 Advanced Interrupt Controller"
-   depends on ARCH_WPCM450 || COMPILE_TEST
+   depends on ARCH_WPCM450
help
  Support for the interrupt controller in the Nuvoton WPCM450 BMC SoC.
 


Re: [PATCH v13 03/18] arm64: hyp-stub: Move el1_sync into the vectors

2021-04-08 Thread Marc Zyngier

On 2021-04-08 05:05, Pavel Tatashin wrote:

From: James Morse 

The hyp-stub's el1_sync code doesn't do very much, this can easily fit
in the vectors.

With this, all of the hyp-stubs behaviour is contained in its vectors.
This lets kexec and hibernate copy the hyp-stub when they need its
behaviour, instead of re-implementing it.

Signed-off-by: James Morse 

[Fixed merging issues]


That's a pretty odd fix IMO.



Signed-off-by: Pavel Tatashin 
---
 arch/arm64/kernel/hyp-stub.S | 59 ++--
 1 file changed, 29 insertions(+), 30 deletions(-)

diff --git a/arch/arm64/kernel/hyp-stub.S 
b/arch/arm64/kernel/hyp-stub.S

index ff329c5c074d..d1a73d0f74e0 100644
--- a/arch/arm64/kernel/hyp-stub.S
+++ b/arch/arm64/kernel/hyp-stub.S
@@ -21,6 +21,34 @@ SYM_CODE_START_LOCAL(\label)
.align 7
b   \label
 SYM_CODE_END(\label)
+.endm
+
+.macro hyp_stub_el1_sync
+SYM_CODE_START_LOCAL(hyp_stub_el1_sync)
+   .align 7
+   cmp x0, #HVC_SET_VECTORS
+   b.ne2f
+   msr vbar_el2, x1
+   b   9f
+
+2: cmp x0, #HVC_SOFT_RESTART
+   b.ne3f
+   mov x0, x2
+   mov x2, x4
+   mov x4, x1
+   mov x1, x3
+   br  x4  // no return
+
+3: cmp x0, #HVC_RESET_VECTORS
+   beq 9f  // Nothing to reset!
+
+   /* Someone called kvm_call_hyp() against the hyp-stub... */
+   mov_q   x0, HVC_STUB_ERR
+   eret
+
+9: mov x0, xzr
+   eret
+SYM_CODE_END(hyp_stub_el1_sync)


You said you tested this on a TX2. I guess you don't care whether
it runs VHE or not...

M.


 .endm

.text
@@ -39,7 +67,7 @@ SYM_CODE_START(__hyp_stub_vectors)
invalid_vector  hyp_stub_el2h_fiq_invalid   // FIQ EL2h
invalid_vector  hyp_stub_el2h_error_invalid // Error EL2h

-   ventry  el1_sync// Synchronous 64-bit EL1
+   hyp_stub_el1_sync   // Synchronous 64-bit 
EL1
invalid_vector  hyp_stub_el1_irq_invalid// IRQ 64-bit EL1
invalid_vector  hyp_stub_el1_fiq_invalid// FIQ 64-bit EL1
invalid_vector  hyp_stub_el1_error_invalid  // Error 64-bit EL1
@@ -55,35 +83,6 @@ SYM_CODE_END(__hyp_stub_vectors)
 # Check the __hyp_stub_vectors didn't overflow
 .org . - (__hyp_stub_vectors_end - __hyp_stub_vectors) + SZ_2K

-
-SYM_CODE_START_LOCAL(el1_sync)
-   cmp x0, #HVC_SET_VECTORS
-   b.ne1f
-   msr vbar_el2, x1
-   b   9f
-
-1: cmp x0, #HVC_VHE_RESTART
-   b.eqmutate_to_vhe
-
-2: cmp x0, #HVC_SOFT_RESTART
-   b.ne3f
-   mov x0, x2
-   mov x2, x4
-   mov x4, x1
-   mov x1, x3
-   br  x4  // no return
-
-3: cmp x0, #HVC_RESET_VECTORS
-   beq 9f  // Nothing to reset!
-
-   /* Someone called kvm_call_hyp() against the hyp-stub... */
-   mov_q   x0, HVC_STUB_ERR
-   eret
-
-9: mov x0, xzr
-   eret
-SYM_CODE_END(el1_sync)
-
 // nVHE? No way! Give me the real thing!
 SYM_CODE_START_LOCAL(mutate_to_vhe)
// Sanity check: MMU *must* be off


--
Jazz is not dead. It just smells funny...


[irqchip: irq/irqchip-next] irqchip/wpcm450: Drop COMPILE_TEST

2021-04-08 Thread irqchip-bot for Marc Zyngier
The following commit has been merged into the irq/irqchip-next branch of 
irqchip:

Commit-ID: 384cf046e474b40db4773e9358241a5de11ed8a7
Gitweb:
https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms/384cf046e474b40db4773e9358241a5de11ed8a7
Author:Marc Zyngier 
AuthorDate:Thu, 08 Apr 2021 08:56:27 +01:00
Committer: Marc Zyngier 
CommitterDate: Thu, 08 Apr 2021 08:56:27 +01:00

irqchip/wpcm450: Drop COMPILE_TEST

This driver is (for now) ARM specific, and currently doesn't
build with a variety of architectures (ia64, RISC-V, x86_64
at the very least).

Drop COMPILE_TEST from Kconfig until it gets sorted out.

Cc: Jonathan Neuschäfer 
Reported-by: Stephen Rothwell 
Signed-off-by: Marc Zyngier 
---
 drivers/irqchip/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
index 38ad9dc..715eb43 100644
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -579,7 +579,7 @@ config MST_IRQ
 
 config WPCM450_AIC
bool "Nuvoton WPCM450 Advanced Interrupt Controller"
-   depends on ARCH_WPCM450 || COMPILE_TEST
+   depends on ARCH_WPCM450
help
  Support for the interrupt controller in the Nuvoton WPCM450 BMC SoC.
 


Re: linux-next: build failure after merge of the irqchip tree

2021-04-08 Thread Marc Zyngier

Hi Stephen,

On 2021-04-08 07:35, Stephen Rothwell wrote:

Hi all,

After merging the irqchip tree, today's linux-next build (x86_64
allmodconfig) failed like this:

drivers/irqchip/irq-wpcm450-aic.c:9:10: fatal error: asm/exception.h:
No such file or directory
9 | #include 
  |  ^

Caused by commit

  fead4dd49663 ("irqchip: Add driver for WPCM450 interrupt controller")

I have used the irqchip tree from next-20210407 for today.


Thanks for the heads up. I guess that's the effect of COMPILE_TEST
which was apparently not very well tested... I'll drop it from Kconfig.

Jonathan, feel free to submit something re-enabling COMPILE_TEST once
you've worked out the missing dependencies.

Thanks,

M.
--
Jazz is not dead. It just smells funny...


Re: [PATCH v19 3/7] ptp: Reorganize ptp_kvm.c to make it arch-independent

2021-04-07 Thread Marc Zyngier
On Wed, 07 Apr 2021 16:13:34 +0100,
Richard Cochran  wrote:
> 
> On Wed, Apr 07, 2021 at 10:28:44AM +0100, Marc Zyngier wrote:
> > On Tue, 30 Mar 2021 15:54:26 +0100,
> > Marc Zyngier  wrote:
> > > 
> > > From: Jianyong Wu 
> > > 
> > > Currently, the ptp_kvm module contains a lot of x86-specific code.
> > > Let's move this code into a new arch-specific file in the same directory,
> > > and rename the arch-independent file to ptp_kvm_common.c.
> > > 
> > > Reviewed-by: Andre Przywara 
> > > Signed-off-by: Jianyong Wu 
> > > Signed-off-by: Marc Zyngier 
> > > Link: 
> > > https://lore.kernel.org/r/20201209060932.212364-4-jianyong...@arm.com
> > 
> > Richard, Paolo,
> > 
> > Can I get an Ack on this and patch #7? We're getting pretty close to
> > the next merge window, and this series has been going on for a couple
> > of years now...
> 
> For both patches:
> 
> Acked-by: Richard Cochran 

Cheers Richard, much appreciated.

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH] KVM: selftests: vgic_init kvm selftests fixup

2021-04-07 Thread Marc Zyngier
On Wed, 7 Apr 2021 15:59:37 +0200, Eric Auger wrote:
> Bring some improvements/rationalization over the first version
> of the vgic_init selftests:
> 
> - ucall_init is moved in run_cpu()
> - vcpu_args_set is not called as not needed
> - whenever a helper is supposed to succeed, call the non "_" version
> - helpers do not return -errno, instead errno is checked by the caller
> - vm_gic struct is used whenever possible, as well as vm_gic_destroy
> - _kvm_create_device takes an addition fd parameter

Applied to kvm-arm64/vgic-5.13, thanks!

[1/1] KVM: selftests: vgic_init kvm selftests fixup
  commit: 4cffb2df4260ed38c7ae4105f6913ad2d71a16ec

Cheers,

M.
-- 
Without deviation from the norm, progress is not possible.




Re: [PATCH v2 0/3] KVM: arm64: Minor page fault handler improvement

2021-04-07 Thread Marc Zyngier
On Tue, 16 Mar 2021 12:11:23 +0800, Gavin Shan wrote:
> The series includes several minior improvements to stage-2 page fault
> handler: PATCH[1/2] are cleaning up the code. PATCH[3] don't retrieve
> the memory slot again in the page fault handler to save a bit CPU cycles.
> 
> Changelog
> =
> v2:
>* Rebased to 5.12.rc3 and include r-bs from Keqian  (Gavin)
>* Drop patch to fix IPA limit boundary issue(Keqian)
>* Comments on why we use __gfn_to_pfn_memslot() (Keqian)
> 
> [...]

Applied to kvm-arm64/memslot-fixes, thanks!

[1/3] KVM: arm64: Hide kvm_mmu_wp_memory_region()
  commit: eab62148478d339a37c7a6b37d34182ccf5056ad
[2/3] KVM: arm64: Use find_vma_intersection()
  commit: c728fd4ce75e9c342ea96facc5a2fe5ddb976a67
[3/3] KVM: arm64: Don't retrieve memory slot again in page fault handler
  commit: 10ba2d17d2972926c60e01dace6d7a3f8d968c4f

Cheers,

M.
-- 
Without deviation from the norm, progress is not possible.




Re: [RFC PATCH v2 2/2] kvm/arm64: Try stage2 block mapping for host device MMIO

2021-04-07 Thread Marc Zyngier
On Tue, 16 Mar 2021 13:43:38 +,
Keqian Zhu  wrote:
> 
> The MMIO region of a device maybe huge (GB level), try to use
> block mapping in stage2 to speedup both map and unmap.
> 
> Compared to normal memory mapping, we should consider two more
> points when try block mapping for MMIO region:
> 
> 1. For normal memory mapping, the PA(host physical address) and
> HVA have same alignment within PUD_SIZE or PMD_SIZE when we use
> the HVA to request hugepage, so we don't need to consider PA
> alignment when verifing block mapping. But for device memory
> mapping, the PA and HVA may have different alignment.
> 
> 2. For normal memory mapping, we are sure hugepage size properly
> fit into vma, so we don't check whether the mapping size exceeds
> the boundary of vma. But for device memory mapping, we should pay
> attention to this.
> 
> This adds device_rough_page_shift() to check these two points when
> selecting block mapping size.
> 
> Signed-off-by: Keqian Zhu 
> ---
> 
> Mainly for RFC, not fully tested. I will fully test it when the
> code logic is well accepted.
> 
> ---
>  arch/arm64/kvm/mmu.c | 42 ++
>  1 file changed, 38 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index c59af5ca01b0..224aa15eb4d9 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -624,6 +624,36 @@ static void kvm_send_hwpoison_signal(unsigned long 
> address, short lsb)
>   send_sig_mceerr(BUS_MCEERR_AR, (void __user *)address, lsb, current);
>  }
>  
> +/*
> + * Find a mapping size that properly insides the intersection of vma and
> + * memslot. And hva and pa have the same alignment to this mapping size.
> + * It's rough because there are still other restrictions, which will be
> + * checked by the following fault_supports_stage2_huge_mapping().

I don't think these restrictions make complete sense to me. If this is
a PFNMAP VMA, we should use the biggest mapping size that covers the
VMA, and not more than the VMA.

> + */
> +static short device_rough_page_shift(struct kvm_memory_slot *memslot,
> +  struct vm_area_struct *vma,
> +  unsigned long hva)
> +{
> + size_t size = memslot->npages * PAGE_SIZE;
> + hva_t sec_start = max(memslot->userspace_addr, vma->vm_start);
> + hva_t sec_end = min(memslot->userspace_addr + size, vma->vm_end);
> + phys_addr_t pa = (vma->vm_pgoff << PAGE_SHIFT) + (hva - vma->vm_start);
> +
> +#ifndef __PAGETABLE_PMD_FOLDED
> + if ((hva & (PUD_SIZE - 1)) == (pa & (PUD_SIZE - 1)) &&
> + ALIGN_DOWN(hva, PUD_SIZE) >= sec_start &&
> + ALIGN(hva, PUD_SIZE) <= sec_end)
> + return PUD_SHIFT;
> +#endif
> +
> + if ((hva & (PMD_SIZE - 1)) == (pa & (PMD_SIZE - 1)) &&
> + ALIGN_DOWN(hva, PMD_SIZE) >= sec_start &&
> + ALIGN(hva, PMD_SIZE) <= sec_end)
> + return PMD_SHIFT;
> +
> + return PAGE_SHIFT;
> +}
> +
>  static bool fault_supports_stage2_huge_mapping(struct kvm_memory_slot 
> *memslot,
>  unsigned long hva,
>  unsigned long map_size)
> @@ -769,7 +799,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
> phys_addr_t fault_ipa,
>   return -EFAULT;
>   }
>  
> - /* Let's check if we will get back a huge page backed by hugetlbfs */
> + /*
> +  * Let's check if we will get back a huge page backed by hugetlbfs, or
> +  * get block mapping for device MMIO region.
> +  */
>   mmap_read_lock(current->mm);
>   vma = find_vma_intersection(current->mm, hva, hva + 1);
>   if (unlikely(!vma)) {
> @@ -780,11 +813,12 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
> phys_addr_t fault_ipa,
>  
>   if (is_vm_hugetlb_page(vma))
>   vma_shift = huge_page_shift(hstate_vma(vma));
> + else if (vma->vm_flags & VM_PFNMAP)
> + vma_shift = device_rough_page_shift(memslot, vma, hva);
>   else
>   vma_shift = PAGE_SHIFT;
>  
> - if (logging_active ||
> - (vma->vm_flags & VM_PFNMAP)) {
> + if (logging_active) {
>   force_pte = true;
>   vma_shift = PAGE_SHIFT;

But why should we downgrade to page-size mappings if logging? This is
a device, and you aren't moving the device around, are you? Or is your
device actually memory with a device mapping that you are trying to
migrate?

>   }
> @@ -855,7 +889,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
> phys_addr_t fault_ipa,
>  
>   if (kvm_is_device_pfn(pfn)) {
>   device = true;
> - force_pte = true;
> + force_pte = (vma_pagesize == PAGE_SIZE);
>   } else if (logging_active && !write_fault) {
>   /*
>* Only actually map the page as writable if this was a write
> -- 
> 2.19.1
> 
> 

Thanks,

M.

-- 
Without deviation from 

[irqchip: irq/irqchip-next] sh: intc: Drop the use of irq_create_identity_mapping()

2021-04-07 Thread irqchip-bot for Marc Zyngier
The following commit has been merged into the irq/irqchip-next branch of 
irqchip:

Commit-ID: eef56c3a0492e4c1bc2a081da8f402a26d882489
Gitweb:
https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms/eef56c3a0492e4c1bc2a081da8f402a26d882489
Author:Marc Zyngier 
AuthorDate:Fri, 02 Apr 2021 15:58:21 +01:00
Committer: Marc Zyngier 
CommitterDate: Wed, 07 Apr 2021 12:12:52 +01:00

sh: intc: Drop the use of irq_create_identity_mapping()

Instead of playing games with using irq_create_identity_mapping()
and irq_domain_associate(), drop the use of the former and only
use the latter, together with the allocation of the irq_desc
as needed.

It doesn't make the code less awful, but at least the intent
is clearer.

Tested-by: Geert Uytterhoeven 
Signed-off-by: Marc Zyngier 
---
 drivers/sh/intc/core.c | 49 +++--
 1 file changed, 19 insertions(+), 30 deletions(-)

diff --git a/drivers/sh/intc/core.c b/drivers/sh/intc/core.c
index a14684f..ca4f4ca 100644
--- a/drivers/sh/intc/core.c
+++ b/drivers/sh/intc/core.c
@@ -179,6 +179,21 @@ static unsigned int __init save_reg(struct intc_desc_int 
*d,
return 0;
 }
 
+static bool __init intc_map(struct irq_domain *domain, int irq)
+{
+   if (!irq_to_desc(irq) && irq_alloc_desc_at(irq, NUMA_NO_NODE) != irq) {
+   pr_err("uname to allocate IRQ %d\n", irq);
+   return false;
+   }
+
+   if (irq_domain_associate(domain, irq, irq)) {
+   pr_err("domain association failure\n");
+   return false;
+   }
+
+   return true;
+}
+
 int __init register_intc_controller(struct intc_desc *desc)
 {
unsigned int i, k, smp;
@@ -311,24 +326,12 @@ int __init register_intc_controller(struct intc_desc 
*desc)
for (i = 0; i < hw->nr_vectors; i++) {
struct intc_vect *vect = hw->vectors + i;
unsigned int irq = evt2irq(vect->vect);
-   int res;
 
if (!vect->enum_id)
continue;
 
-   res = irq_create_identity_mapping(d->domain, irq);
-   if (unlikely(res)) {
-   if (res == -EEXIST) {
-   res = irq_domain_associate(d->domain, irq, irq);
-   if (unlikely(res)) {
-   pr_err("domain association failure\n");
-   continue;
-   }
-   } else {
-   pr_err("can't identity map IRQ %d\n", irq);
-   continue;
-   }
-   }
+   if (!intc_map(d->domain, irq))
+   continue;
 
intc_irq_xlate_set(irq, vect->enum_id, d);
intc_register_irq(desc, d, vect->enum_id, irq);
@@ -345,22 +348,8 @@ int __init register_intc_controller(struct intc_desc *desc)
 * IRQ support, each vector still needs to have
 * its own backing irq_desc.
 */
-   res = irq_create_identity_mapping(d->domain, irq2);
-   if (unlikely(res)) {
-   if (res == -EEXIST) {
-   res = irq_domain_associate(d->domain,
-  irq2, irq2);
-   if (unlikely(res)) {
-   pr_err("domain association "
-  "failure\n");
-   continue;
-   }
-   } else {
-   pr_err("can't identity map IRQ %d\n",
-  irq);
-   continue;
-   }
-   }
+   if (!intc_map(d->domain, irq2))
+   continue;
 
vect2->enum_id = 0;
 


[irqchip: irq/irqchip-next] irqdomain: Get rid of irq_create_identity_mapping()

2021-04-07 Thread irqchip-bot for Marc Zyngier
The following commit has been merged into the irq/irqchip-next branch of 
irqchip:

Commit-ID: 4a35d6a03744ded782c9301f5f5d78ad68ce680f
Gitweb:
https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms/4a35d6a03744ded782c9301f5f5d78ad68ce680f
Author:Marc Zyngier 
AuthorDate:Wed, 07 Apr 2021 13:17:10 +01:00
Committer: Marc Zyngier 
CommitterDate: Wed, 07 Apr 2021 13:25:52 +01:00

irqdomain: Get rid of irq_create_identity_mapping()

The sole user of irq_create_identity_mapping() having been converted,
get rid of the unused helper.

Signed-off-by: Marc Zyngier 
---
 include/linux/irqdomain.h | 6 --
 kernel/irq/irqdomain.c| 3 ---
 2 files changed, 9 deletions(-)

diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
index 33cacc8..d2c61de 100644
--- a/include/linux/irqdomain.h
+++ b/include/linux/irqdomain.h
@@ -419,12 +419,6 @@ extern int irq_create_strict_mappings(struct irq_domain 
*domain,
  unsigned int irq_base,
  irq_hw_number_t hwirq_base, int count);
 
-static inline int irq_create_identity_mapping(struct irq_domain *host,
- irq_hw_number_t hwirq)
-{
-   return irq_create_strict_mappings(host, hwirq, hwirq, 1);
-}
-
 extern const struct irq_domain_ops irq_domain_simple_ops;
 
 /* stock xlate functions */
diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
index d10ab1d..35c5a99 100644
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -715,9 +715,6 @@ EXPORT_SYMBOL_GPL(irq_create_mapping_affinity);
  * locations. For use by controllers that already have static mappings
  * to insert in to the domain.
  *
- * Non-linear users can use irq_create_identity_mapping() for IRQ-at-a-time
- * domain insertion.
- *
  * 0 is returned upon success, while any failure to establish a static
  * mapping is treated as an error.
  */


[irqchip: irq/irqchip-next] mips: netlogic: Use irq_domain_simple_ops for XLP PIC

2021-04-07 Thread irqchip-bot for Marc Zyngier
The following commit has been merged into the irq/irqchip-next branch of 
irqchip:

Commit-ID: bd781ae53fac31acea9dec594d62a1424952dd4c
Gitweb:
https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms/bd781ae53fac31acea9dec594d62a1424952dd4c
Author:Marc Zyngier 
AuthorDate:Fri, 02 Apr 2021 18:09:49 +01:00
Committer: Marc Zyngier 
CommitterDate: Wed, 07 Apr 2021 13:25:52 +01:00

mips: netlogic: Use irq_domain_simple_ops for XLP PIC

Use the generic irq_domain_simple_ops structure instead of
a home-grown one.

Acked-by: Thomas Bogendoerfer 
Signed-off-by: Marc Zyngier 
---
 arch/mips/netlogic/common/irq.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/arch/mips/netlogic/common/irq.c b/arch/mips/netlogic/common/irq.c
index cf33dd8..c25a2ce 100644
--- a/arch/mips/netlogic/common/irq.c
+++ b/arch/mips/netlogic/common/irq.c
@@ -276,10 +276,6 @@ asmlinkage void plat_irq_dispatch(void)
 }
 
 #ifdef CONFIG_CPU_XLP
-static const struct irq_domain_ops xlp_pic_irq_domain_ops = {
-   .xlate = irq_domain_xlate_onetwocell,
-};
-
 static int __init xlp_of_pic_init(struct device_node *node,
struct device_node *parent)
 {
@@ -324,7 +320,7 @@ static int __init xlp_of_pic_init(struct device_node *node,
 
xlp_pic_domain = irq_domain_add_legacy(node, n_picirqs,
nlm_irq_to_xirq(socid, PIC_IRQ_BASE), PIC_IRQ_BASE,
-   _pic_irq_domain_ops, NULL);
+   _domain_simple_ops, NULL);
if (xlp_pic_domain == NULL) {
pr_err("PIC %pOFn: Creating legacy domain failed!\n", node);
return -EINVAL;


Re: [PATCH] dt-bindings: qcom,pdc: Add compatible for sc7280

2021-04-07 Thread Marc Zyngier
On Mon, 15 Mar 2021 11:29:06 +0530, Rajendra Nayak wrote:
> Add the compatible string for sc7280 SoC from Qualcomm

Applied to irq/irqchip-next, thanks!

[1/1] dt-bindings: qcom,pdc: Add compatible for sc7280
  commit: 5deaa1d7c49151988b0bf919eeea6ad5535a29a2

Cheers,

M.
-- 
Without deviation from the norm, progress is not possible.




Re: [PATCH v4] irqchip/irq-mst: Support polarity configuration

2021-04-07 Thread Marc Zyngier
On Mon, 15 Mar 2021 21:18:48 +0800, Mark-PK Tsai wrote:
> Support irq polarity configuration and save and restore the config
> when system suspend and resume.

Applied to irq/irqchip-next, thanks!

[1/1] irqchip/irq-mst: Support polarity configuration
  commit: ea4aeaa5c88906eb3ca3d7d3d17a45605d2dd0de

Cheers,

M.
-- 
Without deviation from the norm, progress is not possible.




Re: (subset) [PATCH 0/5] stm32 usart wakeup rework

2021-04-07 Thread Marc Zyngier
On Fri, 19 Mar 2021 19:42:48 +0100, Erwan Le Ray wrote:
> This series reworks stm32 usart wakeup management.
> 
> Alexandre Torgue (1):
>   serial: stm32: update wakeup IRQ management
> 
> Erwan Le Ray (4):
>   serial: stm32: rework wakeup management
>   serial: stm32: clean wakeup handling in serial_suspend
>   irqchip/stm32: add usart instances exti direct event support
>   ARM: dts: stm32: Add wakeup management on stm32mp15x UART nodes
> 
> [...]

Applied to irq/irqchip-next, thanks!

[3/5] irqchip/stm32: add usart instances exti direct event support
  commit: e12c455055e9abc7403ce532616c0124a9d85ee7

Cheers,

M.
-- 
Without deviation from the norm, progress is not possible.




Re: [PATCH] irqchip/gic-v3: fix OF_BAD_ADDR error handling

2021-04-07 Thread Marc Zyngier
On Tue, 23 Mar 2021 14:18:35 +0100, Arnd Bergmann wrote:
> When building with extra warnings enabled, clang points out a
> mistake in the error handling:
> 
> drivers/irqchip/irq-gic-v3-mbi.c:306:21: error: result of comparison of 
> constant 18446744073709551615 with expression of type 'phys_addr_t' (aka 
> 'unsigned int') is always false 
> [-Werror,-Wtautological-constant-out-of-range-compare]
> if (mbi_phys_base == OF_BAD_ADDR) {
> 
> Truncate the constant to the same type as the variable it gets compared
> to, to shut make the check work and void the warning.

Applied to irq/irqchip-next, thanks!

[1/1] irqchip/gic-v3: fix OF_BAD_ADDR error handling
  commit: 8e13d96670a4c050d4883e6743a9e9858e5cfe10

Cheers,

M.
-- 
Without deviation from the norm, progress is not possible.




Re: [PATCH] irqchip/sifive-plic: Mark two global variables __ro_after_init

2021-04-07 Thread Marc Zyngier
On Tue, 30 Mar 2021 02:09:11 +0800, Jisheng Zhang wrote:
> All of these two are never modified after init, so they can be
>  __ro_after_init.

Applied to irq/irqchip-next, thanks!

[1/1] irqchip/sifive-plic: Mark two global variables __ro_after_init
  commit: e03b7c1bcbfad6f27b4682f638b98627c4e416ba

Cheers,

M.
-- 
Without deviation from the norm, progress is not possible.




Re: [PATCH] irqchip/hisi: Use the correct HiSilicon copyright

2021-04-07 Thread Marc Zyngier
On Tue, 30 Mar 2021 14:46:20 +0800, Hao Fang wrote:
> s/Hisilicon/HiSilicon/
> It should use capital S, according to
> https://www.hisilicon.com/en/terms-of-use.

Applied to irq/irqchip-next, thanks!

[1/1] irqchip/hisi: Use the correct HiSilicon copyright
  commit: 64ec2ad3b84d43926e618bb515f2382c266535ee

Cheers,

M.
-- 
Without deviation from the norm, progress is not possible.




Re: [PATCH v2 00/10] Initial support for Nuvoton WPCM450 BMC SoC

2021-04-07 Thread Marc Zyngier
On Tue, 6 Apr 2021 14:09:11 +0200, Jonathan Neuschäfer wrote:
> This series adds basic support for the Nuvoton WPCM450 BMC SoC. It's an older
> SoC but still commonly found on eBay, mostly in Supermicro X9 server boards.
> 
> Third-party documentation is available at: 
> https://github.com/neuschaefer/wpcm450/wiki
> 
> Patches 1-4 add devicetree bindings for the WPCM450 SoC and its various parts.
> Patches 5-7 add arch and driver support. Patches 8 and 9 add a devicetree for
> the SoC and a board based on it. Patch 10 finally updates the MAINTAINERS 
> file.
> 
> [...]

Applied to irq/irqchip-next, thanks!

[03/10] dt-bindings: interrupt-controller: Add nuvoton, wpcm450-aic
commit: 7c18715546203a09f859dac2fe3ea8aceec5f235
[06/10] irqchip: Add driver for WPCM450 interrupt controller
commit: fead4dd496631707549f414b4059afb86ea8fb80

Cheers,

M.
-- 
Without deviation from the norm, progress is not possible.




Re: [PATCH v4] irqchip/irq-mst: Support polarity configuration

2021-04-07 Thread Marc Zyngier
On Mon, 15 Mar 2021 13:18:48 +,
Mark-PK Tsai  wrote:
> 
> Support irq polarity configuration and save and restore the config
> when system suspend and resume.
> 
> Signed-off-by: Mark-PK Tsai 
> ---
>  drivers/irqchip/irq-mst-intc.c | 94 --
>  1 file changed, 91 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/irqchip/irq-mst-intc.c b/drivers/irqchip/irq-mst-intc.c
> index 143657b0cf28..a2ab3f837b96 100644
> --- a/drivers/irqchip/irq-mst-intc.c
> +++ b/drivers/irqchip/irq-mst-intc.c
> @@ -13,15 +13,27 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
> -#define INTC_MASK0x0
> -#define INTC_EOI 0x20
> +#define MST_INTC_MAX_IRQS64
> +
> +#define INTC_MASK0x0
> +#define INTC_REV_POLARITY0x10
> +#define INTC_EOI 0x20
> +
> +#ifdef CONFIG_PM_SLEEP
> +static LIST_HEAD(mst_intc_list);
> +#endif
>  
>  struct mst_intc_chip_data {
>   raw_spinlock_t  lock;
>   unsigned intirq_start, nr_irqs;
>   void __iomem*base;
>   boolno_eoi;
> +#ifdef CONFIG_PM_SLEEP
> + struct list_head entry;
> + u16 saved_polarity_conf[DIV_ROUND_UP(MST_INTC_MAX_IRQS, 16)];
> +#endif
>  };
>  
>  static void mst_set_irq(struct irq_data *d, u32 offset)
> @@ -78,6 +90,20 @@ static void mst_intc_eoi_irq(struct irq_data *d)
>   irq_chip_eoi_parent(d);
>  }
>  
> +static int mst_irq_chip_set_type(struct irq_data *data, unsigned int type)
> +{
> + if (type != IRQ_TYPE_EDGE_RISING && type != IRQ_TYPE_EDGE_FALLING &&
> + type != IRQ_TYPE_LEVEL_HIGH && type != IRQ_TYPE_LEVEL_LOW)
> + return -EINVAL;

All of this amounts to checking for IRQ_TYPE_{NONE,EDGE_BOTH}. Maybe
that's a better option.

> +
> + if (type == IRQ_TYPE_LEVEL_LOW || type == IRQ_TYPE_EDGE_FALLING)
> + mst_set_irq(data, INTC_REV_POLARITY);

What happens if the interrupt was set to LEVEL_LOW, then switched to
LEVEL_HIGH? You end up with a stuck reversed polarity.

> +
> + type = IRQ_TYPE_LEVEL_HIGH;
> +
> + return irq_chip_set_type_parent(data, type);

I'm going to fix this as below. Shout if you disagree.

Thanks,

M.

diff --git a/drivers/irqchip/irq-mst-intc.c b/drivers/irqchip/irq-mst-intc.c
index a2ab3f837b96..f6133ae28155 100644
--- a/drivers/irqchip/irq-mst-intc.c
+++ b/drivers/irqchip/irq-mst-intc.c
@@ -92,16 +92,20 @@ static void mst_intc_eoi_irq(struct irq_data *d)
 
 static int mst_irq_chip_set_type(struct irq_data *data, unsigned int type)
 {
-   if (type != IRQ_TYPE_EDGE_RISING && type != IRQ_TYPE_EDGE_FALLING &&
-   type != IRQ_TYPE_LEVEL_HIGH && type != IRQ_TYPE_LEVEL_LOW)
-   return -EINVAL;
-
-   if (type == IRQ_TYPE_LEVEL_LOW || type == IRQ_TYPE_EDGE_FALLING)
+   switch (type) {
+   case IRQ_TYPE_LEVEL_LOW:
+   case IRQ_TYPE_EDGE_FALLING:
mst_set_irq(data, INTC_REV_POLARITY);
+   break;
+   case IRQ_TYPE_LEVEL_HIGH:
+   case IRQ_TYPE_EDGE_RISING:
+   mst_clear_irq(data, INTC_REV_POLARITY);
+   break;
+   default:
+   return -EINVAL;
+   }
 
-   type = IRQ_TYPE_LEVEL_HIGH;
-
-   return irq_chip_set_type_parent(data, type);
+   return irq_chip_set_type_parent(data, IRQ_TYPE_LEVEL_HIGH);
 }
 
 static struct irq_chip mst_intc_chip = {

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH v2 06/10] irqchip: Add driver for WPCM450 interrupt controller

2021-04-07 Thread Marc Zyngier
On Tue, 06 Apr 2021 13:09:17 +0100,
Jonathan Neuschäfer  wrote:
> 
> The WPCM450 AIC ("Advanced Interrupt Controller") is the interrupt
> controller found in the Nuvoton WPCM450 SoC and other Winbond/Nuvoton
> SoCs.
> 
> The list of registers if based on the AMI vendor kernel and the
> Nuvoton W90N745 datasheet.
> 
> Although the hardware supports other interrupt modes, the driver only
> supports high-level interrupts at the moment, because other modes could
> not be tested so far.
> 
> Signed-off-by: Jonathan Neuschäfer 
> ---
> 
> v2:
> - Rename IRQS macro to AIC_NUM_IRQS
> - Fix IRQ range check
> - Use linux/printk.h header instead of linux/console.h
> - Add AIC_SCR_PRIORITY_MASK constant
> - Add missing register descriptions
> - Remove superfluous printk about IRQ flow type mismatch
> - Use BIT() macro
> - Rename _ack function to _eoi for accuracy, and use handle_fasteoi_irq
> ---
>  arch/arm/mach-npcm/Kconfig|   1 +
>  drivers/irqchip/Kconfig   |   6 ++
>  drivers/irqchip/Makefile  |   1 +
>  drivers/irqchip/irq-wpcm450-aic.c | 161 ++
>  4 files changed, 169 insertions(+)
>  create mode 100644 drivers/irqchip/irq-wpcm450-aic.c
> 
> diff --git a/arch/arm/mach-npcm/Kconfig b/arch/arm/mach-npcm/Kconfig
> index 658c8efb4ca14..a71cf1d189ae5 100644
> --- a/arch/arm/mach-npcm/Kconfig
> +++ b/arch/arm/mach-npcm/Kconfig
> @@ -10,6 +10,7 @@ config ARCH_WPCM450
>   bool "Support for WPCM450 BMC (Hermon)"
>   depends on ARCH_MULTI_V5
>   select CPU_ARM926T
> + select WPCM450_AIC
>   select NPCM7XX_TIMER
>   help
> General support for WPCM450 BMC (Hermon).

I can't take this patch with this particular hunk, as I don't have
this file in my tree. I can either drop this line, or delay the
merging of this patch to a later point in time.

The driver otherwise looks ready.

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH v19 3/7] ptp: Reorganize ptp_kvm.c to make it arch-independent

2021-04-07 Thread Marc Zyngier
On Tue, 30 Mar 2021 15:54:26 +0100,
Marc Zyngier  wrote:
> 
> From: Jianyong Wu 
> 
> Currently, the ptp_kvm module contains a lot of x86-specific code.
> Let's move this code into a new arch-specific file in the same directory,
> and rename the arch-independent file to ptp_kvm_common.c.
> 
> Reviewed-by: Andre Przywara 
> Signed-off-by: Jianyong Wu 
> Signed-off-by: Marc Zyngier 
> Link: https://lore.kernel.org/r/20201209060932.212364-4-jianyong...@arm.com

Richard, Paolo,

Can I get an Ack on this and patch #7? We're getting pretty close to
the next merge window, and this series has been going on for a couple
of years now...

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH v4 15/18] irqchip/apple-aic: Add support for the Apple Interrupt Controller

2021-04-06 Thread Marc Zyngier
Hi Hector,

On Fri, 02 Apr 2021 10:05:39 +0100,
Hector Martin  wrote:
> 
> This is the root interrupt controller used on Apple ARM SoCs such as the
> M1. This irqchip driver performs multiple functions:
> 
> * Handles both IRQs and FIQs
> 
> * Drives the AIC peripheral itself (which handles IRQs)
> 
> * Dispatches FIQs to downstream hard-wired clients (currently the ARM
>   timer).
> 
> * Implements a virtual IPI multiplexer to funnel multiple Linux IPIs
>   into a single hardware IPI
> 
> Signed-off-by: Hector Martin 
> ---
>  MAINTAINERS |   2 +
>  drivers/irqchip/Kconfig |   8 +
>  drivers/irqchip/Makefile|   1 +
>  drivers/irqchip/irq-apple-aic.c | 837 
>  include/linux/cpuhotplug.h  |   1 +
>  5 files changed, 849 insertions(+)
>  create mode 100644 drivers/irqchip/irq-apple-aic.c

[...]

> +static int aic_irq_domain_translate(struct irq_domain *id,
> + struct irq_fwspec *fwspec,
> + unsigned long *hwirq,
> + unsigned int *type)
> +{
> + struct aic_irq_chip *ic = id->host_data;
> +
> + if (fwspec->param_count != 3 || !is_of_node(fwspec->fwnode))
> + return -EINVAL;
> +
> + switch (fwspec->param[0]) {
> + case AIC_IRQ:
> + if (fwspec->param[1] >= ic->nr_hw)
> + return -EINVAL;
> + *hwirq = fwspec->param[1];
> + break;
> + case AIC_FIQ:
> + if (fwspec->param[1] >= AIC_NR_FIQ)
> + return -EINVAL;
> + *hwirq = ic->nr_hw + fwspec->param[1];
> +
> + /*
> +  * In EL1 the non-redirected registers are the guest's,
> +  * not EL2's, so remap the hwirqs to match.
> +  */
> + if (!is_kernel_in_hyp_mode()) {
> + switch (fwspec->param[1]) {
> + case AIC_TMR_GUEST_PHYS:
> + *hwirq = ic->nr_hw + AIC_TMR_HV_PHYS;
> + break;
> + case AIC_TMR_GUEST_VIRT:
> + *hwirq = ic->nr_hw + AIC_TMR_HV_VIRT;
> + break;
> + case AIC_TMR_HV_PHYS:
> + case AIC_TMR_HV_VIRT:
> + return -ENOENT;
> + default:
> + break;
> + }
> + }

Urgh, this is nasty. You are internally remapping the hwirq from one
timer to another in order to avoid accessing the enable register
which happens to be an EL2 only register?

The way we normally deal with this kind of things is by providing a
different set of irq_chip callbacks. In this particular case, this
would leave mask/unmask as empty stubs. Or you could move the FIQ
handling to use handle_simple_irq(), because there isn't any callback
that is actually applicable.

It isn't a big deal for now, but that's something we should consider
addressing in the future. With that in mind:

Reviewed-by: Marc Zyngier 

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH v6 9/9] KVM: selftests: aarch64/vgic-v3 init sequence tests

2021-04-06 Thread Marc Zyngier
On Tue, 06 Apr 2021 16:09:16 +0100,
Andrew Jones  wrote:
> 
> 
> Hi Eric,
> 
> It looks like Marc already picked this patch up, but, FWIW, here's
> a few more comments you may consider.

FWIW, I'm happy to drop this patch from the queue (the series is on
its own branch for this reason), or take reworks on top.

But given that we're getting close to release time and this series has
some userspace visibile impacts, I want to let it simmer in -next for
a few days before sending the pull request.

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH v6 0/9] KVM/ARM: Some vgic fixes and init sequence KVM selftests

2021-04-06 Thread Marc Zyngier
On Mon, 5 Apr 2021 18:39:32 +0200, Eric Auger wrote:
> While writting vgic v3 init sequence KVM selftests I noticed some
> relatively minor issues. This was also the opportunity to try to
> fix the issue laterly reported by Zenghui, related to the RDIST_TYPER
> last bit emulation. The final patch is a first batch of VGIC init
> sequence selftests. Of course they can be augmented with a lot more
> register access tests, but let's try to move forward incrementally ...
> 
> [...]

Applied to kvm-arm64/vgic-5.13, thanks!

[1/9] KVM: arm64: vgic-v3: Fix some error codes when setting RDIST base
  commit: d9b201e99c616001b4a51627820377b293479384
[2/9] KVM: arm64: Fix KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION read
  commit: 53b16dd6ba5cf64ed147ac3523ec34651d553cb0
[3/9] KVM: arm64: vgic-v3: Fix error handling in vgic_v3_set_redist_base()
  commit: 8542a8f95a67ff6b76d6868ec0af58c464bdf041
[4/9] KVM: arm/arm64: vgic: Reset base address on kvm_vgic_dist_destroy()
  commit: 3a5211612764fa3948e5db5254734168e9e763de
[5/9] docs: kvm: devices/arm-vgic-v3: enhance KVM_DEV_ARM_VGIC_CTRL_INIT doc
  commit: 298c41b8fa1e02c5a35e2263d138583220ab6094
[6/9] KVM: arm64: Simplify argument passing to vgic_uaccess_[read|write]
  commit: da3853097679022e14a2d125983f11a67fd2f96a
[7/9] kvm: arm64: vgic-v3: Introduce vgic_v3_free_redist_region()
  commit: e5a35635464bc5304674b84ea42615a3fd0bd949
[8/9] KVM: arm64: vgic-v3: Expose GICR_TYPER.Last for userspace
  commit: 28e9d4bce3be9b8fec6c854f87923db99c8fb874
[9/9] KVM: selftests: aarch64/vgic-v3 init sequence tests
  commit: dc0e058eef42f61effe9fd4f0fa4b0c793cc1f14

Cheers,

M.
-- 
Without deviation from the norm, progress is not possible.




Re: [PATCH -next] KVM: arm64: Fix error return code in init_hyp_mode()

2021-04-06 Thread Marc Zyngier
On Tue, 6 Apr 2021 12:17:59 +, Wang Wensheng wrote:
> Fix to return a negative error code from the error handling
> case instead of 0, as done elsewhere in this function.

Applied to kvm-arm64/misc-5.13, thanks!

[1/1] KVM: arm64: Fix error return code in init_hyp_mode()
  commit: 52b9e265d22bccc5843e167da76ab119874e2883

Cheers,

M.
-- 
Without deviation from the norm, progress is not possible.




Re: [PATCH 4/9] sh: intc: Drop the use of irq_create_identity_mapping()

2021-04-06 Thread Marc Zyngier
On Tue, 06 Apr 2021 11:32:13 +0100,
Geert Uytterhoeven  wrote:
> 
> Hi Marc,
> 
> On Tue, Apr 6, 2021 at 11:44 AM Marc Zyngier  wrote:
> > Instead of playing games with using irq_create_identity_mapping()
> > and irq_domain_associate(), drop the use of the former and only
> > use the latter, together with the allocation of the irq_desc
> > as needed.
> >
> > It doesn't make the code less awful, but at least the intent
> > is clearer.
> >
> > Signed-off-by: Marc Zyngier 
> 
> Thanks for your patch!
> 
> > --- a/drivers/sh/intc/core.c
> > +++ b/drivers/sh/intc/core.c
> > @@ -179,6 +179,23 @@ static unsigned int __init save_reg(struct 
> > intc_desc_int *d,
> > return 0;
> >  }
> >
> > +static bool __init intc_map(struct irq_domain *domain, int irq)
> > +{
> > +   int res;
> 
> warning: unused variable ‘res’ [-Wunused-variable]
> 
> > +
> > +   if (!irq_to_desc(irq) && irq_alloc_desc_at(irq, NUMA_NO_NODE) != 
> > irq) {
> > +   pr_err("uname to allocate IRQ %d\n", irq);
> > +   return false;
> > +   }
> > +
> > +   if (irq_domain_associate(domain, irq, irq)) {
> > +   pr_err("domain association failure\n");
> > +   return false;
> > +   }
> > +
> > +   return true;
> > +}
> > +
> >  int __init register_intc_controller(struct intc_desc *desc)
> >  {
> > unsigned int i, k, smp;
> > @@ -316,19 +333,8 @@ int __init register_intc_controller(struct intc_desc 
> > *desc)
> 
> warning: unused variable ‘res’ [-Wunused-variable]

Ah, thanks for spotting these.

> 
> > if (!vect->enum_id)
> > continue;
> >
> > -   res = irq_create_identity_mapping(d->domain, irq);
> 
> 
> > -   if (unlikely(res)) {
> > -   if (res == -EEXIST) {
> > -   res = irq_domain_associate(d->domain, irq, 
> > irq);
> > -   if (unlikely(res)) {
> > -   pr_err("domain association 
> > failure\n");
> > -   continue;
> > -   }
> > -   } else {
> > -   pr_err("can't identity map IRQ %d\n", irq);
> > -   continue;
> > -   }
> > -   }
> > +   if (!intc_map(d->domain, irq))
> > +   continue;
> >
> > intc_irq_xlate_set(irq, vect->enum_id, d);
> > intc_register_irq(desc, d, vect->enum_id, irq);
> 
> Otherwise this seems to work fine on real hardware (landisk) and qemu
> (rts7751r2d).  I did verify that the new function intc_map() is called.
> 
> Tested-by: Geert Uytterhoeven 

Awesome, thanks Geert.

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH 1/9] irqdomain: Reimplement irq_linear_revmap() with irq_find_mapping()

2021-04-06 Thread Marc Zyngier
Christophe,

On Tue, 06 Apr 2021 12:21:33 +0100,
Christophe Leroy  wrote:
> 
> 
> 
> Le 06/04/2021 à 11:35, Marc Zyngier a écrit :
> > irq_linear_revmap() is supposed to be a fast path for domain
> > lookups, but it only exposes low-level details of the irqdomain
> > implementation, details which are better kept private.
> 
> Can you elaborate with more details ?

Things like directly picking into the revmap are positively awful, and
doesn't work if the domain has been constructed using the radix
tree. Which on its own is totally broken, because things like
irq_domain_create_hierarchy() will pick an implementation or the
other.

> 
> > 
> > The *overhead* between the two is only a function call and
> > a couple of tests, so it is likely that noone can show any
> > meaningful difference compared to the cost of taking an
> > interrupt.
> 
> Do you have any measurement ?

I did measure things on arm64, and couldn't come up with any
difference other than noise.

> Can you make the "likely" a certitude ?

Of course not. You can always come up with an artificial CPU
implementation that has a very small exception entry overhead, and a
ridiculously slow memory subsystem. Do I care about these? No.

If you can come up with realistic platforms that show a regression
with this patch, I'm all ears.

> 
> > 
> > Reimplement irq_linear_revmap() with irq_find_mapping()
> > in order to preserve source code compatibility, and
> > rename the internal field for a measure.
> 
> This is in complete contradiction with commit 
> https://github.com/torvalds/linux/commit/d3dcb436
> 
> At that time, irq_linear_revmap() was less complex than what
> irq_find_mapping() is today, and nevertheless it was considered worth
> restoring in as a fast path. What has changed since then ?

Over 8 years? Plenty. The use of irqdomains has been generalised, we
have domain hierarchies, and if anything, this commit introduces the
buggy behaviour I was mentioning above. I also don't see any mention
of actual performance in that commit.

And if we're worried about a fast path, being able to directly cache
the irq_data in the revmap, hence skipping the irq_desc lookup that
inevitable follows, is a much more interesting prospect than the "get
useless data quick" that irq_linear_revmap() implements.

This latter optimisation is what I am after.

> Can you also explain the reason for the renaming of "linear_revmap"
> into "revmap" ? What is that "measure" ?

To catch the potential direct use of the reverse map field.

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.


[PATCH 9/9] irqdomain: Kill irq_domain_add_legacy_isa

2021-04-06 Thread Marc Zyngier
This helper doesn't have a user anymore, let's remove it.

Signed-off-by: Marc Zyngier 
---
 Documentation/core-api/irq/irq-domain.rst |  1 -
 include/linux/irqdomain.h | 11 ---
 2 files changed, 12 deletions(-)

diff --git a/Documentation/core-api/irq/irq-domain.rst 
b/Documentation/core-api/irq/irq-domain.rst
index a77c24c27f7b..84e561db468f 100644
--- a/Documentation/core-api/irq/irq-domain.rst
+++ b/Documentation/core-api/irq/irq-domain.rst
@@ -146,7 +146,6 @@ Legacy
 
irq_domain_add_simple()
irq_domain_add_legacy()
-   irq_domain_add_legacy_isa()
irq_domain_create_legacy()
 
 The Legacy mapping is a special case for drivers that already have a
diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
index 3997ed9e4d7d..2a7ecf08d56e 100644
--- a/include/linux/irqdomain.h
+++ b/include/linux/irqdomain.h
@@ -45,9 +45,6 @@ struct cpumask;
 struct seq_file;
 struct irq_affinity_desc;
 
-/* Number of irqs reserved for a legacy isa controller */
-#define NUM_ISA_INTERRUPTS 16
-
 #define IRQ_DOMAIN_IRQ_SPEC_PARAMS 16
 
 /**
@@ -346,14 +343,6 @@ static inline struct irq_domain 
*irq_domain_add_nomap(struct device_node *of_nod
 {
return __irq_domain_add(of_node_to_fwnode(of_node), 0, max_irq, 
max_irq, ops, host_data);
 }
-static inline struct irq_domain *irq_domain_add_legacy_isa(
-   struct device_node *of_node,
-   const struct irq_domain_ops *ops,
-   void *host_data)
-{
-   return irq_domain_add_legacy(of_node, NUM_ISA_INTERRUPTS, 0, 0, ops,
-host_data);
-}
 static inline struct irq_domain *irq_domain_add_tree(struct device_node 
*of_node,
 const struct irq_domain_ops *ops,
 void *host_data)
-- 
2.29.2



[PATCH 8/9] powerpc: Convert irq_domain_add_legacy_isa use to irq_domain_add_legacy

2021-04-06 Thread Marc Zyngier
irq_domain_add_legacy_isa is a pain. It only exists for the benefit of
two PPC-specific drivers, and creates an ugly dependency between asm/irq.h
and linux/irqdomain.h

Instead, let's convert these two drivers to irq_domain_add_legacy(),
stop using NUM_ISA_INTERRUPTS by directly setting NR_IRQS_LEGACY.

The dependency cannot be broken yet as there is a lot of PPC-related
code that depends on it, but that's the first step towards it.

A followup patch will remove irq_domain_add_legacy_isa.

Signed-off-by: Marc Zyngier 
---
 arch/powerpc/include/asm/irq.h | 4 ++--
 arch/powerpc/platforms/ps3/interrupt.c | 4 ++--
 arch/powerpc/sysdev/i8259.c| 3 ++-
 arch/powerpc/sysdev/mpic.c | 2 +-
 arch/powerpc/sysdev/tsi108_pci.c   | 3 ++-
 arch/powerpc/sysdev/xics/xics-common.c | 2 +-
 6 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/irq.h b/arch/powerpc/include/asm/irq.h
index f3f264e441a7..aeb209144c68 100644
--- a/arch/powerpc/include/asm/irq.h
+++ b/arch/powerpc/include/asm/irq.h
@@ -23,8 +23,8 @@ extern atomic_t ppc_n_lost_interrupts;
 /* Total number of virq in the platform */
 #define NR_IRQSCONFIG_NR_IRQS
 
-/* Same thing, used by the generic IRQ code */
-#define NR_IRQS_LEGACY NUM_ISA_INTERRUPTS
+/* Number of irqs reserved for a legacy isa controller */
+#define NR_IRQS_LEGACY 16
 
 extern irq_hw_number_t virq_to_hw(unsigned int virq);
 
diff --git a/arch/powerpc/platforms/ps3/interrupt.c 
b/arch/powerpc/platforms/ps3/interrupt.c
index 78f2339ed5cb..93e367a00452 100644
--- a/arch/powerpc/platforms/ps3/interrupt.c
+++ b/arch/powerpc/platforms/ps3/interrupt.c
@@ -45,7 +45,7 @@
  * implementation equates HV plug value to Linux virq value, constrains each
  * interrupt to have a system wide unique plug number, and limits the range
  * of the plug values to map into the first dword of the bitmaps.  This
- * gives a usable range of plug values of  {NUM_ISA_INTERRUPTS..63}.  Note
+ * gives a usable range of plug values of  {NR_IRQS_LEGACY..63}.  Note
  * that there is no constraint on how many in this set an individual thread
  * can acquire.
  *
@@ -721,7 +721,7 @@ static unsigned int ps3_get_irq(void)
}
 
 #if defined(DEBUG)
-   if (unlikely(plug < NUM_ISA_INTERRUPTS || plug > PS3_PLUG_MAX)) {
+   if (unlikely(plug < NR_IRQS_LEGACY || plug > PS3_PLUG_MAX)) {
dump_bmp(_cpu(ps3_private, 0));
dump_bmp(_cpu(ps3_private, 1));
BUG();
diff --git a/arch/powerpc/sysdev/i8259.c b/arch/powerpc/sysdev/i8259.c
index c1d76c344351..dc1a151c63d7 100644
--- a/arch/powerpc/sysdev/i8259.c
+++ b/arch/powerpc/sysdev/i8259.c
@@ -260,7 +260,8 @@ void i8259_init(struct device_node *node, unsigned long 
intack_addr)
raw_spin_unlock_irqrestore(_lock, flags);
 
/* create a legacy host */
-   i8259_host = irq_domain_add_legacy_isa(node, _host_ops, NULL);
+   i8259_host = irq_domain_add_legacy(node, NR_IRQS_LEGACY, 0, 0,
+  _host_ops, NULL);
if (i8259_host == NULL) {
printk(KERN_ERR "i8259: failed to allocate irq host !\n");
return;
diff --git a/arch/powerpc/sysdev/mpic.c b/arch/powerpc/sysdev/mpic.c
index b0426f28946a..995fb2ada507 100644
--- a/arch/powerpc/sysdev/mpic.c
+++ b/arch/powerpc/sysdev/mpic.c
@@ -602,7 +602,7 @@ static void __init mpic_scan_ht_pics(struct mpic *mpic)
 /* Find an mpic associated with a given linux interrupt */
 static struct mpic *mpic_find(unsigned int irq)
 {
-   if (irq < NUM_ISA_INTERRUPTS)
+   if (irq < NR_IRQS_LEGACY)
return NULL;
 
return irq_get_chip_data(irq);
diff --git a/arch/powerpc/sysdev/tsi108_pci.c b/arch/powerpc/sysdev/tsi108_pci.c
index 49f9541954f8..042bb38fa5c2 100644
--- a/arch/powerpc/sysdev/tsi108_pci.c
+++ b/arch/powerpc/sysdev/tsi108_pci.c
@@ -404,7 +404,8 @@ void __init tsi108_pci_int_init(struct device_node *node)
 {
DBG("Tsi108_pci_int_init: initializing PCI interrupts\n");
 
-   pci_irq_host = irq_domain_add_legacy_isa(node, _irq_domain_ops, 
NULL);
+   pci_irq_host = irq_domain_add_legacy(node, NR_IRQS_LEGACY, 0, 0,
+_irq_domain_ops, NULL);
if (pci_irq_host == NULL) {
printk(KERN_ERR "pci_irq_host: failed to allocate irq 
domain!\n");
return;
diff --git a/arch/powerpc/sysdev/xics/xics-common.c 
b/arch/powerpc/sysdev/xics/xics-common.c
index 7e4305c01bac..fdf8dbb6 100644
--- a/arch/powerpc/sysdev/xics/xics-common.c
+++ b/arch/powerpc/sysdev/xics/xics-common.c
@@ -201,7 +201,7 @@ void xics_migrate_irqs_away(void)
struct ics *ics;
 
/* We can't set affinity on ISA interrupts */
-   if (virq < NUM_ISA_INTERRUPTS)
+   if (virq < NR_IRQS_LEGACY)
continue;
 

[PATCH 5/9] irqdomain: Kill irq_create_strict_mappings()/irq_create_identity_mapping()

2021-04-06 Thread Marc Zyngier
No user of these APIs are left, remove them.

Signed-off-by: Marc Zyngier 
---
 include/linux/irqdomain.h |  9 -
 kernel/irq/irqdomain.c| 35 ---
 2 files changed, 44 deletions(-)

diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
index b9600f24878a..3997ed9e4d7d 100644
--- a/include/linux/irqdomain.h
+++ b/include/linux/irqdomain.h
@@ -411,15 +411,6 @@ static inline unsigned int irq_linear_revmap(struct 
irq_domain *domain,
 }
 
 extern unsigned int irq_create_direct_mapping(struct irq_domain *host);
-extern int irq_create_strict_mappings(struct irq_domain *domain,
- unsigned int irq_base,
- irq_hw_number_t hwirq_base, int count);
-
-static inline int irq_create_identity_mapping(struct irq_domain *host,
- irq_hw_number_t hwirq)
-{
-   return irq_create_strict_mappings(host, hwirq, hwirq, 1);
-}
 
 extern const struct irq_domain_ops irq_domain_simple_ops;
 
diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
index dfa716305ea9..0ba761e6b0a8 100644
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -703,41 +703,6 @@ unsigned int irq_create_mapping_affinity(struct irq_domain 
*domain,
 }
 EXPORT_SYMBOL_GPL(irq_create_mapping_affinity);
 
-/**
- * irq_create_strict_mappings() - Map a range of hw irqs to fixed linux irqs
- * @domain: domain owning the interrupt range
- * @irq_base: beginning of linux IRQ range
- * @hwirq_base: beginning of hardware IRQ range
- * @count: Number of interrupts to map
- *
- * This routine is used for allocating and mapping a range of hardware
- * irqs to linux irqs where the linux irq numbers are at pre-defined
- * locations. For use by controllers that already have static mappings
- * to insert in to the domain.
- *
- * Non-linear users can use irq_create_identity_mapping() for IRQ-at-a-time
- * domain insertion.
- *
- * 0 is returned upon success, while any failure to establish a static
- * mapping is treated as an error.
- */
-int irq_create_strict_mappings(struct irq_domain *domain, unsigned int 
irq_base,
-  irq_hw_number_t hwirq_base, int count)
-{
-   struct device_node *of_node;
-   int ret;
-
-   of_node = irq_domain_get_of_node(domain);
-   ret = irq_alloc_descs(irq_base, irq_base, count,
- of_node_to_nid(of_node));
-   if (unlikely(ret < 0))
-   return ret;
-
-   irq_domain_associate_many(domain, irq_base, hwirq_base, count);
-   return 0;
-}
-EXPORT_SYMBOL_GPL(irq_create_strict_mappings);
-
 static int irq_domain_translate(struct irq_domain *d,
struct irq_fwspec *fwspec,
irq_hw_number_t *hwirq, unsigned int *type)
-- 
2.29.2



[PATCH 4/9] sh: intc: Drop the use of irq_create_identity_mapping()

2021-04-06 Thread Marc Zyngier
Instead of playing games with using irq_create_identity_mapping()
and irq_domain_associate(), drop the use of the former and only
use the latter, together with the allocation of the irq_desc
as needed.

It doesn't make the code less awful, but at least the intent
is clearer.

Signed-off-by: Marc Zyngier 
---
 drivers/sh/intc/core.c | 50 ++
 1 file changed, 21 insertions(+), 29 deletions(-)

diff --git a/drivers/sh/intc/core.c b/drivers/sh/intc/core.c
index a14684ffe4c1..6c57ee1ce6c4 100644
--- a/drivers/sh/intc/core.c
+++ b/drivers/sh/intc/core.c
@@ -179,6 +179,23 @@ static unsigned int __init save_reg(struct intc_desc_int 
*d,
return 0;
 }
 
+static bool __init intc_map(struct irq_domain *domain, int irq)
+{
+   int res;
+
+   if (!irq_to_desc(irq) && irq_alloc_desc_at(irq, NUMA_NO_NODE) != irq) {
+   pr_err("uname to allocate IRQ %d\n", irq);
+   return false;
+   }
+
+   if (irq_domain_associate(domain, irq, irq)) {
+   pr_err("domain association failure\n");
+   return false;
+   }
+
+   return true;
+}
+
 int __init register_intc_controller(struct intc_desc *desc)
 {
unsigned int i, k, smp;
@@ -316,19 +333,8 @@ int __init register_intc_controller(struct intc_desc *desc)
if (!vect->enum_id)
continue;
 
-   res = irq_create_identity_mapping(d->domain, irq);
-   if (unlikely(res)) {
-   if (res == -EEXIST) {
-   res = irq_domain_associate(d->domain, irq, irq);
-   if (unlikely(res)) {
-   pr_err("domain association failure\n");
-   continue;
-   }
-   } else {
-   pr_err("can't identity map IRQ %d\n", irq);
-   continue;
-   }
-   }
+   if (!intc_map(d->domain, irq))
+   continue;
 
intc_irq_xlate_set(irq, vect->enum_id, d);
intc_register_irq(desc, d, vect->enum_id, irq);
@@ -345,22 +351,8 @@ int __init register_intc_controller(struct intc_desc *desc)
 * IRQ support, each vector still needs to have
 * its own backing irq_desc.
 */
-   res = irq_create_identity_mapping(d->domain, irq2);
-   if (unlikely(res)) {
-   if (res == -EEXIST) {
-   res = irq_domain_associate(d->domain,
-  irq2, irq2);
-   if (unlikely(res)) {
-   pr_err("domain association "
-  "failure\n");
-   continue;
-   }
-   } else {
-   pr_err("can't identity map IRQ %d\n",
-  irq);
-   continue;
-   }
-   }
+   if (!intc_map(d->domain, irq2))
+   continue;
 
vect2->enum_id = 0;
 
-- 
2.29.2



[PATCH 7/9] irqdomain: Drop references to recusive irqdomain setup

2021-04-06 Thread Marc Zyngier
It was never completely implemented, and was removed a long time
ago. Adjust the documentation to reflect this.

Signed-off-by: Marc Zyngier 
---
 kernel/irq/irqdomain.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
index 0ba761e6b0a8..89474aa88220 100644
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -1659,12 +1659,10 @@ void irq_domain_free_irqs(unsigned int virq, unsigned 
int nr_irqs)
 
 /**
  * irq_domain_alloc_irqs_parent - Allocate interrupts from parent domain
+ * @domain:Domain below which interrupts must be allocated
  * @irq_base:  Base IRQ number
  * @nr_irqs:   Number of IRQs to allocate
  * @arg:   Allocation data (arch/domain specific)
- *
- * Check whether the domain has been setup recursive. If not allocate
- * through the parent domain.
  */
 int irq_domain_alloc_irqs_parent(struct irq_domain *domain,
 unsigned int irq_base, unsigned int nr_irqs,
@@ -1680,11 +1678,9 @@ EXPORT_SYMBOL_GPL(irq_domain_alloc_irqs_parent);
 
 /**
  * irq_domain_free_irqs_parent - Free interrupts from parent domain
+ * @domain:Domain below which interrupts must be freed
  * @irq_base:  Base IRQ number
  * @nr_irqs:   Number of IRQs to free
- *
- * Check whether the domain has been setup recursive. If not free
- * through the parent domain.
  */
 void irq_domain_free_irqs_parent(struct irq_domain *domain,
 unsigned int irq_base, unsigned int nr_irqs)
-- 
2.29.2



[PATCH 6/9] mips: netlogic: Use irq_domain_simple_ops for XLP PIC

2021-04-06 Thread Marc Zyngier
Use the generic irq_domain_simple_ops structure instead of
a home-grown one.

Signed-off-by: Marc Zyngier 
---
 arch/mips/netlogic/common/irq.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/arch/mips/netlogic/common/irq.c b/arch/mips/netlogic/common/irq.c
index cf33dd8a487e..c25a2ce5e29f 100644
--- a/arch/mips/netlogic/common/irq.c
+++ b/arch/mips/netlogic/common/irq.c
@@ -276,10 +276,6 @@ asmlinkage void plat_irq_dispatch(void)
 }
 
 #ifdef CONFIG_CPU_XLP
-static const struct irq_domain_ops xlp_pic_irq_domain_ops = {
-   .xlate = irq_domain_xlate_onetwocell,
-};
-
 static int __init xlp_of_pic_init(struct device_node *node,
struct device_node *parent)
 {
@@ -324,7 +320,7 @@ static int __init xlp_of_pic_init(struct device_node *node,
 
xlp_pic_domain = irq_domain_add_legacy(node, n_picirqs,
nlm_irq_to_xirq(socid, PIC_IRQ_BASE), PIC_IRQ_BASE,
-   _pic_irq_domain_ops, NULL);
+   _domain_simple_ops, NULL);
if (xlp_pic_domain == NULL) {
pr_err("PIC %pOFn: Creating legacy domain failed!\n", node);
return -EINVAL;
-- 
2.29.2



[PATCH 3/9] irqchip/jcore-aic: Kill use of irq_create_strict_mappings()

2021-04-06 Thread Marc Zyngier
irq_create_strict_mappings() is a poor way to allow the use of
a linear IRQ domain as a legacy one. Let's be upfront about it.

Signed-off-by: Marc Zyngier 
---
 drivers/irqchip/irq-jcore-aic.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/irqchip/irq-jcore-aic.c b/drivers/irqchip/irq-jcore-aic.c
index 033bccb41455..5f47d8ee4ae3 100644
--- a/drivers/irqchip/irq-jcore-aic.c
+++ b/drivers/irqchip/irq-jcore-aic.c
@@ -100,11 +100,11 @@ static int __init aic_irq_of_init(struct device_node 
*node,
jcore_aic.irq_unmask = noop;
jcore_aic.name = "AIC";
 
-   domain = irq_domain_add_linear(node, dom_sz, _aic_irqdomain_ops,
+   domain = irq_domain_add_legacy(node, dom_sz - min_irq, min_irq, min_irq,
+  _aic_irqdomain_ops,
   _aic);
if (!domain)
return -ENOMEM;
-   irq_create_strict_mappings(domain, min_irq, min_irq, dom_sz - min_irq);
 
return 0;
 }
-- 
2.29.2



[PATCH 2/9] ARM: PXA: Kill use of irq_create_strict_mappings()

2021-04-06 Thread Marc Zyngier
irq_create_strict_mappings() is a poor way to allow the use of
a linear IRQ domain as a legacy one. Let's be upfront about
it and use a legacy domain when appropriate.

Signed-off-by: Marc Zyngier 
---
 arch/arm/mach-pxa/pxa_cplds_irqs.c | 24 +++-
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/arch/arm/mach-pxa/pxa_cplds_irqs.c 
b/arch/arm/mach-pxa/pxa_cplds_irqs.c
index 45c19ca96f7a..ec0d9b094744 100644
--- a/arch/arm/mach-pxa/pxa_cplds_irqs.c
+++ b/arch/arm/mach-pxa/pxa_cplds_irqs.c
@@ -147,22 +147,20 @@ static int cplds_probe(struct platform_device *pdev)
}
 
irq_set_irq_wake(fpga->irq, 1);
-   fpga->irqdomain = irq_domain_add_linear(pdev->dev.of_node,
-  CPLDS_NB_IRQ,
-  _irq_domain_ops, fpga);
+   if (base_irq)
+   fpga->irqdomain = irq_domain_add_legacy(pdev->dev.of_node,
+   CPLDS_NB_IRQ,
+   base_irq, 0,
+   _irq_domain_ops,
+   fpga);
+   else
+   fpga->irqdomain = irq_domain_add_linear(pdev->dev.of_node,
+   CPLDS_NB_IRQ,
+   _irq_domain_ops,
+   fpga);
if (!fpga->irqdomain)
return -ENODEV;
 
-   if (base_irq) {
-   ret = irq_create_strict_mappings(fpga->irqdomain, base_irq, 0,
-CPLDS_NB_IRQ);
-   if (ret) {
-   dev_err(>dev, "couldn't create the irq mapping 
%d..%d\n",
-   base_irq, base_irq + CPLDS_NB_IRQ);
-   return ret;
-   }
-   }
-
return 0;
 }
 
-- 
2.29.2



[PATCH 0/9] Cleaning up some of the irqdomain features

2021-04-06 Thread Marc Zyngier
The irqdomain subsystem has grown quite a lot over the years, and some
of its features are either oddly used or just pretty useless. Some
other helpers expose internals that are likely to change soon.

Here are the bits that I'm trying to get rid of:

- irq_linear_revmap exposes the internals of the domains, and only
  works for linear domains. The supposed speed improvement really
  isn't an argument, as it gets in the way of more significant
  optimisations. Reimplemented in terms of irq_find_mapping, which
  always works, and will eventually go at some point.

- irq_create_strict_mappings is just a way to constraint the
  allocation of irqdescs into a given range, which is better served by
  creating a legacy irqdomain, and shows that the platform really
  needs to catch up with the 21st century.

- irq_create_identity mapping is just a variation on the above, with
  irq==hwirq, although the way it is used is a gross hack in the SH
  code that needs to go.

- irq_domain_add_legacy_isa is, as the names shows, a variation on
  irq_domain_add_legacy with a reservation of 16 interrupts. This is
  only used in the PPC code.

The patches address all of the above, touching some of the ARM, PPC,
and SH code that is affected. Another couple of patches address a MIPS
platform that could use the generic code, and clean some of the
comments in the irqdomain code.

Unless anyone shouts, I'd like to take this into 5.13, as it is the
basis of some further (and deeper) changes in the way irqdomains work.

M.

Marc Zyngier (9):
  irqdomain: Reimplement irq_linear_revmap() with irq_find_mapping()
  ARM: PXA: Kill use of irq_create_strict_mappings()
  irqchip/jcore-aic: Kill use of irq_create_strict_mappings()
  sh: intc: Drop the use of irq_create_identity_mapping()
  irqdomain: Kill
irq_create_strict_mappings()/irq_create_identity_mapping()
  mips: netlogic: Use irq_domain_simple_ops for XLP PIC
  irqdomain: Drop references to recusive irqdomain setup
  powerpc: Convert irq_domain_add_legacy_isa use to
irq_domain_add_legacy
  irqdomain: Kill irq_domain_add_legacy_isa

 Documentation/core-api/irq/irq-domain.rst |  1 -
 arch/arm/mach-pxa/pxa_cplds_irqs.c| 24 +--
 arch/mips/netlogic/common/irq.c   |  6 +--
 arch/powerpc/include/asm/irq.h|  4 +-
 arch/powerpc/platforms/ps3/interrupt.c|  4 +-
 arch/powerpc/sysdev/i8259.c   |  3 +-
 arch/powerpc/sysdev/mpic.c|  2 +-
 arch/powerpc/sysdev/tsi108_pci.c  |  3 +-
 arch/powerpc/sysdev/xics/xics-common.c|  2 +-
 drivers/irqchip/irq-jcore-aic.c   |  4 +-
 drivers/sh/intc/core.c| 50 ++-
 include/linux/irqdomain.h | 42 ---
 kernel/irq/irqdomain.c| 49 +++---
 13 files changed, 59 insertions(+), 135 deletions(-)

-- 
2.29.2



[PATCH 1/9] irqdomain: Reimplement irq_linear_revmap() with irq_find_mapping()

2021-04-06 Thread Marc Zyngier
irq_linear_revmap() is supposed to be a fast path for domain
lookups, but it only exposes low-level details of the irqdomain
implementation, details which are better kept private.

The *overhead* between the two is only a function call and
a couple of tests, so it is likely that noone can show any
meaningful difference compared to the cost of taking an
interrupt.

Reimplement irq_linear_revmap() with irq_find_mapping()
in order to preserve source code compatibility, and
rename the internal field for a measure.

Signed-off-by: Marc Zyngier 
---
 include/linux/irqdomain.h | 22 +-
 kernel/irq/irqdomain.c|  6 +++---
 2 files changed, 12 insertions(+), 16 deletions(-)

diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
index 33cacc8af26d..b9600f24878a 100644
--- a/include/linux/irqdomain.h
+++ b/include/linux/irqdomain.h
@@ -154,9 +154,9 @@ struct irq_domain_chip_generic;
  * Revmap data, used internally by irq_domain
  * @revmap_direct_max_irq: The largest hwirq that can be set for controllers 
that
  * support direct mapping
- * @revmap_size: Size of the linear map table @linear_revmap[]
+ * @revmap_size: Size of the linear map table @revmap[]
  * @revmap_tree: Radix map tree for hwirqs that don't fit in the linear map
- * @linear_revmap: Linear table of hwirq->virq reverse mappings
+ * @revmap: Linear table of hwirq->virq reverse mappings
  */
 struct irq_domain {
struct list_head link;
@@ -180,7 +180,7 @@ struct irq_domain {
unsigned int revmap_size;
struct radix_tree_root revmap_tree;
struct mutex revmap_tree_mutex;
-   unsigned int linear_revmap[];
+   unsigned int revmap[];
 };
 
 /* Irq domain flags */
@@ -396,24 +396,20 @@ static inline unsigned int irq_create_mapping(struct 
irq_domain *host,
return irq_create_mapping_affinity(host, hwirq, NULL);
 }
 
-
 /**
- * irq_linear_revmap() - Find a linux irq from a hw irq number.
+ * irq_find_mapping() - Find a linux irq from a hw irq number.
  * @domain: domain owning this hardware interrupt
  * @hwirq: hardware irq number in that domain space
- *
- * This is a fast path alternative to irq_find_mapping() that can be
- * called directly by irq controller code to save a handful of
- * instructions. It is always safe to call, but won't find irqs mapped
- * using the radix tree.
  */
+extern unsigned int irq_find_mapping(struct irq_domain *host,
+irq_hw_number_t hwirq);
+
 static inline unsigned int irq_linear_revmap(struct irq_domain *domain,
 irq_hw_number_t hwirq)
 {
-   return hwirq < domain->revmap_size ? domain->linear_revmap[hwirq] : 0;
+   return irq_find_mapping(domain, hwirq);
 }
-extern unsigned int irq_find_mapping(struct irq_domain *host,
-irq_hw_number_t hwirq);
+
 extern unsigned int irq_create_direct_mapping(struct irq_domain *host);
 extern int irq_create_strict_mappings(struct irq_domain *domain,
  unsigned int irq_base,
diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
index d10ab1d689d5..dfa716305ea9 100644
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -486,7 +486,7 @@ static void irq_domain_clear_mapping(struct irq_domain 
*domain,
 irq_hw_number_t hwirq)
 {
if (hwirq < domain->revmap_size) {
-   domain->linear_revmap[hwirq] = 0;
+   domain->revmap[hwirq] = 0;
} else {
mutex_lock(>revmap_tree_mutex);
radix_tree_delete(>revmap_tree, hwirq);
@@ -499,7 +499,7 @@ static void irq_domain_set_mapping(struct irq_domain 
*domain,
   struct irq_data *irq_data)
 {
if (hwirq < domain->revmap_size) {
-   domain->linear_revmap[hwirq] = irq_data->irq;
+   domain->revmap[hwirq] = irq_data->irq;
} else {
mutex_lock(>revmap_tree_mutex);
radix_tree_insert(>revmap_tree, hwirq, irq_data);
@@ -920,7 +920,7 @@ unsigned int irq_find_mapping(struct irq_domain *domain,
 
/* Check if the hwirq is in the linear revmap. */
if (hwirq < domain->revmap_size)
-   return domain->linear_revmap[hwirq];
+   return domain->revmap[hwirq];
 
rcu_read_lock();
data = radix_tree_lookup(>revmap_tree, hwirq);
-- 
2.29.2



Re: [GIT PULL] coresight: Add support for ETE and TRBE

2021-04-06 Thread Marc Zyngier
Hi Mathieu,

On Mon, 05 Apr 2021 20:17:57 +0100,
Mathieu Poirier  wrote:
> 
> The following changes since commit a354a64d91eec3e0f8ef0eed575b480fd75b999c:
> 
>   KVM: arm64: Disable guest access to trace filter controls (2021-03-24 
> 17:26:38 +)
> 
> are available in the Git repository at:
> 
>   g...@gitolite.kernel.org:pub/scm/linux/kernel/git/coresight/linux.git 
> next-ETE-TRBE
> 
> for you to fetch changes up to 7885b4e43231048654c5a80c0a18844ce3185e64:
> 
>   dts: bindings: Document device tree bindings for Arm TRBE (2021-04-05 
> 11:38:04 -0600)
> 
> 
> Hi Marc,
> 
> As previously agreed, here are the changes to support CoreSight
> ETE and TRBE components submitted here[1].
> 
> I draw your attention to these:
> 
> [PATCH v6 05/20] kvm: arm64: Handle access to TRFCR_EL1
> [PATCH v6 06/20] kvm: arm64: Move SPE availability check to VCPU load
> [PATCH v6 07/20] arm64: kvm: Enable access to TRBE support for host
> 
> They are KVM specific and will need an SoB tag.

There seem to be a disconnect here, because it works the other way
around.

If I pull this, I obviously cannot add anything to the patches that
are merged (changing stuff would result in changing the commit IDs,
which is exactly the opposite of what we are trying to achieve).

This isn't a problem, as the act of pulling the branch means that I am
happy with that, and the git merge makes it traceable.

However, some of the patches (the KVM ones) do not carry your own SoB,
which is a problem (if you are picking stuff off the list, you need to
add your own SoB). So for the couple of KVM patches, please add my

Acked-by: Marc Zyngier 

together with your SoB, resend the PR and I'll gladly merge it.

And if you can make sure the subject lines are formatted as:

"KVM: arm64: Super Duper feature enablement"

that'd be absolutely awesome (but that's just me being annoying...).

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH v5 0/8] KVM/ARM: Some vgic fixes and init sequence KVM selftests

2021-04-05 Thread Marc Zyngier
Hi Eric,

On Sun, 04 Apr 2021 18:22:35 +0100,
Eric Auger  wrote:
> 
> While writting vgic v3 init sequence KVM selftests I noticed some
> relatively minor issues. This was also the opportunity to try to
> fix the issue laterly reported by Zenghui, related to the RDIST_TYPER
> last bit emulation. The final patch is a first batch of VGIC init
> sequence selftests. Of course they can be augmented with a lot more
> register access tests, but let's try to move forward incrementally ...
> 
> Best Regards
> 
> Eric
> 
> This series can be found at:
> https://github.com/eauger/linux/tree/vgic_kvmselftests_v5
> 
> History:
> v4 -> v5:
> - rewrite the last bit detection according to Marc's
>   interpretation of the spec and modify the kvm selftests
>   accordingly

Have you dropped v4's patch #1? It did seem to fix an actual issue,
didn't it?

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH v5 7/8] KVM: arm64: vgic-v3: Expose GICR_TYPER.Last for userspace

2021-04-05 Thread Marc Zyngier
On Sun, 04 Apr 2021 18:22:42 +0100,
Eric Auger  wrote:
> 
> Commit 23bde34771f1 ("KVM: arm64: vgic-v3: Drop the
> reporting of GICR_TYPER.Last for userspace") temporarily fixed
> a bug identified when attempting to access the GICR_TYPER
> register before the redistributor region setting, but dropped
> the support of the LAST bit.
> 
> Emulating the GICR_TYPER.Last bit still makes sense for
> architecture compliance though. This patch restores its support
> (if the redistributor region was set) while keeping the code safe.
> 
> We introduce a new helper, vgic_mmio_vcpu_rdist_is_last() which
> computes whether a redistributor is the highest one of a series
> of redistributor contributor pages.
> 
> With this new implementation we do not need to have a uaccess
> read accessor anymore.
> 
> Signed-off-by: Eric Auger 
> 
> ---
> 
> v4 -> v5:
> - redist region list now is sorted by @base
> - change the implementation according to Marc's understanding of
>   the spec
> ---
>  arch/arm64/kvm/vgic/vgic-mmio-v3.c | 58 +-
>  include/kvm/arm_vgic.h |  1 +
>  2 files changed, 34 insertions(+), 25 deletions(-)
> 
> diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v3.c 
> b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> index e1ed0c5a8eaa..03a253785700 100644
> --- a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> +++ b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> @@ -251,45 +251,52 @@ static void vgic_mmio_write_v3r_ctlr(struct kvm_vcpu 
> *vcpu,
>   vgic_enable_lpis(vcpu);
>  }
>  
> +static bool vgic_mmio_vcpu_rdist_is_last(struct kvm_vcpu *vcpu)
> +{
> + struct vgic_dist *vgic = >kvm->arch.vgic;
> + struct vgic_cpu *vgic_cpu = >arch.vgic_cpu;
> + struct vgic_redist_region *iter, *rdreg = vgic_cpu->rdreg;
> +
> + if (!rdreg)
> + return false;
> +
> + if (vgic_cpu->rdreg_index < rdreg->free_index - 1) {
> + return false;
> + } else if (rdreg->count && vgic_cpu->rdreg_index == (rdreg->count - 1)) 
> {
> + struct list_head *rd_regions = >rd_regions;
> + gpa_t end = rdreg->base + rdreg->count * 
> KVM_VGIC_V3_REDIST_SIZE;
> +
> + /*
> +  * the rdist is the last one of the redist region,
> +  * check whether there is no other contiguous rdist region
> +  */
> + list_for_each_entry(iter, rd_regions, list) {
> + if (iter->base == end && iter->free_index > 0)
> + return false;
> + }

In the above notes, you state that the region list is now sorted by
base address, but I really can't see what sorts that list. And the
lines above indicate that you are still iterating over the whole RD
regions.

It's not a big deal (the code is now simple enough), but that's just
to confirm that I understand what is going on here.

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH 3/4] PCI: j721e: Add PCIe support for j7200

2021-04-02 Thread Marc Zyngier
On Thu, 25 Mar 2021 09:09:35 +,
Kishon Vijay Abraham I  wrote:
> 
> J7200 has the same PCIe IP as in J721E with minor changes in the
> wrapper. Add PCIe support for j7200 accounting for the wrapper
> changes in pci-j721e.c
> Changes from J721E:
>  *) Allows byte access of bridge configuration space registers
>  *) Changes in legacy interrupt register map
> 
> Signed-off-by: Kishon Vijay Abraham I 
> ---
>  drivers/pci/controller/cadence/pci-j721e.c | 111 ++---
>  1 file changed, 99 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/pci/controller/cadence/pci-j721e.c 
> b/drivers/pci/controller/cadence/pci-j721e.c
> index 17db86a51ca8..f175f116abf6 100644
> --- a/drivers/pci/controller/cadence/pci-j721e.c
> +++ b/drivers/pci/controller/cadence/pci-j721e.c
> @@ -27,6 +27,7 @@
>  #define STATUS_REG_SYS_2 0x508
>  #define STATUS_CLR_REG_SYS_2 0x708
>  #define LINK_DOWNBIT(1)
> +#define J7200_LINK_DOWN  BIT(10)
>  
>  #define EOI_REG  0x10
>  
> @@ -35,6 +36,10 @@
>  #define STATUS_CLR_REG_SYS_0 0x700
>  #define INTx_EN(num) (1 << (num))
>  
> +#define ENABLE_REG_SYS_1 0x104
> +#define STATUS_REG_SYS_1 0x504
> +#define SYS1_INTx_EN(num)(1 << (22 + (num)))
> +
>  #define J721E_PCIE_USER_CMD_STATUS   0x4
>  #define LINK_TRAINING_ENABLE BIT(0)
>  
> @@ -48,6 +53,14 @@ enum link_status {
>   LINK_UP_DL_COMPLETED,
>  };
>  
> +#define USER_EOI_REG 0xC8
> +enum eoi_reg {
> + EOI_DOWNSTREAM_INTERRUPT,
> + EOI_FLR_INTERRUPT,
> + EOI_LEGACY_INTERRUPT,
> + EOI_POWER_STATE_INTERRUPT,
> +};
> +
>  #define J721E_MODE_RCBIT(7)
>  #define LANE_COUNT_MASK  BIT(8)
>  #define LANE_COUNT(n)((n) << 8)
> @@ -65,6 +78,8 @@ struct j721e_pcie {
>   void __iomem*user_cfg_base;
>   void __iomem*intd_cfg_base;
>   struct irq_domain   *legacy_irq_domain;
> + boolis_intc_v1;
> + u32 link_irq_reg_field;
>  };
>  
>  enum j721e_pcie_mode {
> @@ -75,6 +90,9 @@ enum j721e_pcie_mode {
>  struct j721e_pcie_data {
>   enum j721e_pcie_modemode;
>   bool quirk_retrain_flag;
> + boolis_intc_v1;
> + boolbyte_access_allowed;
> + const struct cdns_pcie_ops *ops;
>  };
>  
>  static inline u32 j721e_pcie_user_readl(struct j721e_pcie *pcie, u32 offset)
> @@ -106,12 +124,12 @@ static irqreturn_t j721e_pcie_link_irq_handler(int irq, 
> void *priv)
>   u32 reg;
>  
>   reg = j721e_pcie_intd_readl(pcie, STATUS_REG_SYS_2);
> - if (!(reg & LINK_DOWN))
> + if (!(reg & pcie->link_irq_reg_field))
>   return IRQ_NONE;
>  
>   dev_err(dev, "LINK DOWN!\n");
>  
> - j721e_pcie_intd_writel(pcie, STATUS_CLR_REG_SYS_2, LINK_DOWN);
> + j721e_pcie_intd_writel(pcie, STATUS_CLR_REG_SYS_2, 
> pcie->link_irq_reg_field);
>   return IRQ_HANDLED;
>  }
>  
> @@ -119,12 +137,40 @@ static void j721e_pcie_config_link_irq(struct 
> j721e_pcie *pcie)
>  {
>   u32 reg;
>  
> + pcie->link_irq_reg_field = J7200_LINK_DOWN;
> + if (pcie->is_intc_v1)
> + pcie->link_irq_reg_field = LINK_DOWN;
> +
>   reg = j721e_pcie_intd_readl(pcie, ENABLE_REG_SYS_2);
> - reg |= LINK_DOWN;
> + reg |= pcie->link_irq_reg_field;
>   j721e_pcie_intd_writel(pcie, ENABLE_REG_SYS_2, reg);
>  }
>  
>  static void j721e_pcie_legacy_irq_handler(struct irq_desc *desc)
> +{
> + struct j721e_pcie *pcie = irq_desc_get_handler_data(desc);
> + struct irq_chip *chip = irq_desc_get_chip(desc);
> + int virq;
> + u32 reg;
> + int i;
> +
> + chained_irq_enter(chip, desc);
> +
> + for (i = 0; i < PCI_NUM_INTX; i++) {
> + reg = j721e_pcie_intd_readl(pcie, STATUS_REG_SYS_1);
> + if (!(reg & SYS1_INTx_EN(i)))
> + continue;
> +
> + virq = irq_find_mapping(pcie->legacy_irq_domain, i);
> + generic_handle_irq(virq);
> + j721e_pcie_user_writel(pcie, USER_EOI_REG,
> +EOI_LEGACY_INTERRUPT);

Exact same comment as in the previous patch: this EOI (which I assume
is used to regenerate the GIC edge at after handling the INTx level
interrupt) must be placed in a irq_eoi() callback.

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH 5/6] PCI: keystone: Add PCI legacy interrupt support for AM654

2021-04-02 Thread Marc Zyngier
On Thu, 25 Mar 2021 09:00:25 +,
Kishon Vijay Abraham I  wrote:
> 
> Add PCI legacy interrupt support for AM654. AM654 has a single HW
> interrupt line for all the four legacy interrupts INTA/INTB/INTC/INTD.
> The HW interrupt line connected to GIC is a pulse interrupt whereas
> the legacy interrupts by definition is level interrupt. In order to
> provide level interrupt functionality to edge interrupt line, PCIe
> in AM654 has provided IRQ_EOI register. When the SW writes to IRQ_EOI
> register after handling the interrupt, the IP checks the state of
> legacy interrupt and re-triggers pulse interrupt invoking the handler
> again.
> 
> Signed-off-by: Kishon Vijay Abraham I 
> ---
>  drivers/pci/controller/dwc/pci-keystone.c | 87 +--
>  1 file changed, 82 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/pci/controller/dwc/pci-keystone.c 
> b/drivers/pci/controller/dwc/pci-keystone.c
> index dfa9a7fcf9b7..84a25207d0d3 100644
> --- a/drivers/pci/controller/dwc/pci-keystone.c
> +++ b/drivers/pci/controller/dwc/pci-keystone.c
> @@ -118,6 +118,7 @@ struct keystone_pcie {
>   /* PCI Device ID */
>   u32 device_id;
>   struct  device_node *legacy_intc_np;
> + struct irq_domain   *legacy_irq_domain;
>  
>   int msi_host_irq;
>   int num_lanes;
> @@ -289,6 +290,29 @@ static irqreturn_t ks_pcie_handle_error_irq(struct 
> keystone_pcie *ks_pcie)
>   return IRQ_HANDLED;
>  }
>  
> +static void ks_pcie_am654_legacy_irq_handler(struct irq_desc *desc)
> +{
> + struct keystone_pcie *ks_pcie = irq_desc_get_handler_data(desc);
> + struct irq_chip *chip = irq_desc_get_chip(desc);
> + int virq, i;
> + u32 reg;
> +
> + chained_irq_enter(chip, desc);
> +
> + for (i = 0; i < PCI_NUM_INTX; i++) {
> + reg = ks_pcie_app_readl(ks_pcie, IRQ_STATUS(i));
> + if (!(reg & INTx_EN))
> + continue;
> +
> + virq = irq_linear_revmap(ks_pcie->legacy_irq_domain, i);
> + generic_handle_irq(virq);

I'm on my way to kill irq_linear_revmap(), so I'd rather you didn't
add more instances. Consider using irq_find_mapping() instead.

> + ks_pcie_app_writel(ks_pcie, IRQ_STATUS(i), INTx_EN);
> + ks_pcie_app_writel(ks_pcie, IRQ_EOI, i);

What are these writes for? The first one feels like an Ack, and the
second one has EOI written over it.

If that's what they are, llease move these to a proper irq_chip
structure and use the appropriate flow handler, instead of
dummy_irq_chip and handle_simple_irq.

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH v4 7/8] KVM: arm64: vgic-v3: Expose GICR_TYPER.Last for userspace

2021-04-02 Thread Marc Zyngier
Hi Eric,

On Thu, 01 Apr 2021 20:16:53 +0100,
Auger Eric  wrote:
> 
> Hi Marc,
> 
> On 4/1/21 7:30 PM, Marc Zyngier wrote:
> > On Thu, 01 Apr 2021 18:03:25 +0100,
> > Auger Eric  wrote:
> >>
> >> Hi Marc,
> >>
> >> On 4/1/21 3:42 PM, Marc Zyngier wrote:
> >>> Hi Eric,
> >>>
> >>> On Thu, 01 Apr 2021 09:52:37 +0100,
> >>> Eric Auger  wrote:
> >>>>
> >>>> Commit 23bde34771f1 ("KVM: arm64: vgic-v3: Drop the
> >>>> reporting of GICR_TYPER.Last for userspace") temporarily fixed
> >>>> a bug identified when attempting to access the GICR_TYPER
> >>>> register before the redistributor region setting, but dropped
> >>>> the support of the LAST bit.
> >>>>
> >>>> Emulating the GICR_TYPER.Last bit still makes sense for
> >>>> architecture compliance though. This patch restores its support
> >>>> (if the redistributor region was set) while keeping the code safe.
> >>>>
> >>>> We introduce a new helper, vgic_mmio_vcpu_rdist_is_last() which
> >>>> computes whether a redistributor is the highest one of a series
> >>>> of redistributor contributor pages.
> >>>>
> >>>> The spec says "Indicates whether this Redistributor is the
> >>>> highest-numbered Redistributor in a series of contiguous
> >>>> Redistributor pages."
> >>>>
> >>>> The code is a bit convulated since there is no guarantee
> >>>
> >>> nit: convoluted
> >>>
> >>>> redistributors are added in a given reditributor region in
> >>>> ascending order. In that case the current implementation was
> >>>> wrong. Also redistributor regions can be contiguous
> >>>> and registered in non increasing base address order.
> >>>>
> >>>> So the index of redistributors are stored in an array within
> >>>> the redistributor region structure.
> >>>>
> >>>> With this new implementation we do not need to have a uaccess
> >>>> read accessor anymore.
> >>>>
> >>>> Signed-off-by: Eric Auger 
> >>>
> >>> This patch also hurt my head, a lot more than the first one.  See
> >>> below.
> >>>
> >>>> ---
> >>>>  arch/arm64/kvm/vgic/vgic-init.c|  7 +--
> >>>>  arch/arm64/kvm/vgic/vgic-mmio-v3.c | 97 --
> >>>>  arch/arm64/kvm/vgic/vgic.h |  1 +
> >>>>  include/kvm/arm_vgic.h |  3 +
> >>>>  4 files changed, 73 insertions(+), 35 deletions(-)
> >>>>
> >>>> diff --git a/arch/arm64/kvm/vgic/vgic-init.c 
> >>>> b/arch/arm64/kvm/vgic/vgic-init.c
> >>>> index cf6faa0aeddb2..61150c34c268c 100644
> >>>> --- a/arch/arm64/kvm/vgic/vgic-init.c
> >>>> +++ b/arch/arm64/kvm/vgic/vgic-init.c
> >>>> @@ -190,6 +190,7 @@ int kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu)
> >>>>  int i;
> >>>>  
> >>>>  vgic_cpu->rd_iodev.base_addr = VGIC_ADDR_UNDEF;
> >>>> +vgic_cpu->index = vcpu->vcpu_id;
> >>>
> >>> Is it so that vgic_cpu->index is always equal to vcpu_id? If so, why
> >>> do we need another field? We can always get to the vcpu using a
> >>> container_of().
> >>>
> >>>>  
> >>>>  INIT_LIST_HEAD(_cpu->ap_list_head);
> >>>>  raw_spin_lock_init(_cpu->ap_list_lock);
> >>>> @@ -338,10 +339,8 @@ static void kvm_vgic_dist_destroy(struct kvm *kvm)
> >>>>  dist->vgic_dist_base = VGIC_ADDR_UNDEF;
> >>>>  
> >>>>  if (dist->vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3) {
> >>>> -list_for_each_entry_safe(rdreg, next, 
> >>>> >rd_regions, list) {
> >>>> -list_del(>list);
> >>>> -kfree(rdreg);
> >>>> -}
> >>>> +list_for_each_entry_safe(rdreg, next, 
> >>>> >rd_regions, list)
> >>>> +vgic_v3_free_redist_region(rdreg);
> >>>
> >>> Consider moving the introduction of vgic_v3_free_redist_region() into
&g

Re: [PATCH v4 7/8] KVM: arm64: vgic-v3: Expose GICR_TYPER.Last for userspace

2021-04-01 Thread Marc Zyngier
Hi Eric,

On Thu, 01 Apr 2021 09:52:37 +0100,
Eric Auger  wrote:
> 
> Commit 23bde34771f1 ("KVM: arm64: vgic-v3: Drop the
> reporting of GICR_TYPER.Last for userspace") temporarily fixed
> a bug identified when attempting to access the GICR_TYPER
> register before the redistributor region setting, but dropped
> the support of the LAST bit.
> 
> Emulating the GICR_TYPER.Last bit still makes sense for
> architecture compliance though. This patch restores its support
> (if the redistributor region was set) while keeping the code safe.
> 
> We introduce a new helper, vgic_mmio_vcpu_rdist_is_last() which
> computes whether a redistributor is the highest one of a series
> of redistributor contributor pages.
> 
> The spec says "Indicates whether this Redistributor is the
> highest-numbered Redistributor in a series of contiguous
> Redistributor pages."
> 
> The code is a bit convulated since there is no guarantee

nit: convoluted

> redistributors are added in a given reditributor region in
> ascending order. In that case the current implementation was
> wrong. Also redistributor regions can be contiguous
> and registered in non increasing base address order.
> 
> So the index of redistributors are stored in an array within
> the redistributor region structure.
> 
> With this new implementation we do not need to have a uaccess
> read accessor anymore.
> 
> Signed-off-by: Eric Auger 

This patch also hurt my head, a lot more than the first one.  See
below.

> ---
>  arch/arm64/kvm/vgic/vgic-init.c|  7 +--
>  arch/arm64/kvm/vgic/vgic-mmio-v3.c | 97 --
>  arch/arm64/kvm/vgic/vgic.h |  1 +
>  include/kvm/arm_vgic.h |  3 +
>  4 files changed, 73 insertions(+), 35 deletions(-)
> 
> diff --git a/arch/arm64/kvm/vgic/vgic-init.c b/arch/arm64/kvm/vgic/vgic-init.c
> index cf6faa0aeddb2..61150c34c268c 100644
> --- a/arch/arm64/kvm/vgic/vgic-init.c
> +++ b/arch/arm64/kvm/vgic/vgic-init.c
> @@ -190,6 +190,7 @@ int kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu)
>   int i;
>  
>   vgic_cpu->rd_iodev.base_addr = VGIC_ADDR_UNDEF;
> + vgic_cpu->index = vcpu->vcpu_id;

Is it so that vgic_cpu->index is always equal to vcpu_id? If so, why
do we need another field? We can always get to the vcpu using a
container_of().

>  
>   INIT_LIST_HEAD(_cpu->ap_list_head);
>   raw_spin_lock_init(_cpu->ap_list_lock);
> @@ -338,10 +339,8 @@ static void kvm_vgic_dist_destroy(struct kvm *kvm)
>   dist->vgic_dist_base = VGIC_ADDR_UNDEF;
>  
>   if (dist->vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3) {
> - list_for_each_entry_safe(rdreg, next, >rd_regions, list) {
> - list_del(>list);
> - kfree(rdreg);
> - }
> + list_for_each_entry_safe(rdreg, next, >rd_regions, list)
> + vgic_v3_free_redist_region(rdreg);

Consider moving the introduction of vgic_v3_free_redist_region() into
a separate patch. On its own, that's a good readability improvement.

>   INIT_LIST_HEAD(>rd_regions);
>   } else {
>   dist->vgic_cpu_base = VGIC_ADDR_UNDEF;
> diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v3.c 
> b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> index 987e366c80008..f6a7eed1d6adb 100644
> --- a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> +++ b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> @@ -251,45 +251,57 @@ static void vgic_mmio_write_v3r_ctlr(struct kvm_vcpu 
> *vcpu,
>   vgic_enable_lpis(vcpu);
>  }
>  
> +static bool vgic_mmio_vcpu_rdist_is_last(struct kvm_vcpu *vcpu)
> +{
> + struct vgic_dist *vgic = >kvm->arch.vgic;
> + struct vgic_cpu *vgic_cpu = >arch.vgic_cpu;
> + struct vgic_redist_region *rdreg = vgic_cpu->rdreg;
> +
> + if (!rdreg)
> + return false;
> +
> + if (rdreg->count && vgic_cpu->rdreg_index == (rdreg->count - 1)) {
> + /* check whether there is no other contiguous rdist region */
> + struct list_head *rd_regions = >rd_regions;
> + struct vgic_redist_region *iter;
> +
> + list_for_each_entry(iter, rd_regions, list) {
> + if (iter->base == rdreg->base + rdreg->count * 
> KVM_VGIC_V3_REDIST_SIZE &&
> + iter->free_index > 0) {
> + /* check the first rdist index of this region, if any */
> + if (vgic_cpu->index < iter->rdist_indices[0])
> + return false;

rdist_indices[] contains the vcpu_id of the vcpu associated with a
given RD in the region. At this stage, you have established that there
is another region that is contiguous with the one associated with our
vcpu. You also know that this adjacent region has a vcpu mapped in
(free_index isn't 0). Isn't that enough to declare that our vcpu isn't
last?  I definitely don't understand what the index comparison does
here.

It also seem to me that some of the complexity could be eliminated if
the regions were kept 

Re: [PATCH v3 00/14] PCI/MSI: Getting rid of msi_controller, and other cleanups

2021-04-01 Thread Marc Zyngier
On Thu, 01 Apr 2021 12:27:42 +0100,
Lorenzo Pieralisi  wrote:
> 
> On Tue, 30 Mar 2021 16:11:31 +0100, Marc Zyngier wrote:
> > This is a respin of the series described at [1].
> > 
> > * From v2 [2]:
> >   - Fixed the Xilinx driver, thanks to Bharat for testing it
> >   - Dropped the no_msi attribute, and solely rely on msi_domain, which
> > has the same effect for the only platform that was using it.
> >   - Fixed compilation on architectures that do not select the generic
> > MSI support
> > 
> > [...]
> 
> I have applied it to pci/msi and should be moved into -next shortly
> for further testing/visibility, thanks a lot for putting it together.

Thanks Lorenzo.

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH v4 7/8] KVM: arm64: vgic-v3: Expose GICR_TYPER.Last for userspace

2021-04-01 Thread Marc Zyngier
On Thu, 01 Apr 2021 18:03:25 +0100,
Auger Eric  wrote:
> 
> Hi Marc,
> 
> On 4/1/21 3:42 PM, Marc Zyngier wrote:
> > Hi Eric,
> > 
> > On Thu, 01 Apr 2021 09:52:37 +0100,
> > Eric Auger  wrote:
> >>
> >> Commit 23bde34771f1 ("KVM: arm64: vgic-v3: Drop the
> >> reporting of GICR_TYPER.Last for userspace") temporarily fixed
> >> a bug identified when attempting to access the GICR_TYPER
> >> register before the redistributor region setting, but dropped
> >> the support of the LAST bit.
> >>
> >> Emulating the GICR_TYPER.Last bit still makes sense for
> >> architecture compliance though. This patch restores its support
> >> (if the redistributor region was set) while keeping the code safe.
> >>
> >> We introduce a new helper, vgic_mmio_vcpu_rdist_is_last() which
> >> computes whether a redistributor is the highest one of a series
> >> of redistributor contributor pages.
> >>
> >> The spec says "Indicates whether this Redistributor is the
> >> highest-numbered Redistributor in a series of contiguous
> >> Redistributor pages."
> >>
> >> The code is a bit convulated since there is no guarantee
> > 
> > nit: convoluted
> > 
> >> redistributors are added in a given reditributor region in
> >> ascending order. In that case the current implementation was
> >> wrong. Also redistributor regions can be contiguous
> >> and registered in non increasing base address order.
> >>
> >> So the index of redistributors are stored in an array within
> >> the redistributor region structure.
> >>
> >> With this new implementation we do not need to have a uaccess
> >> read accessor anymore.
> >>
> >> Signed-off-by: Eric Auger 
> > 
> > This patch also hurt my head, a lot more than the first one.  See
> > below.
> > 
> >> ---
> >>  arch/arm64/kvm/vgic/vgic-init.c|  7 +--
> >>  arch/arm64/kvm/vgic/vgic-mmio-v3.c | 97 --
> >>  arch/arm64/kvm/vgic/vgic.h |  1 +
> >>  include/kvm/arm_vgic.h |  3 +
> >>  4 files changed, 73 insertions(+), 35 deletions(-)
> >>
> >> diff --git a/arch/arm64/kvm/vgic/vgic-init.c 
> >> b/arch/arm64/kvm/vgic/vgic-init.c
> >> index cf6faa0aeddb2..61150c34c268c 100644
> >> --- a/arch/arm64/kvm/vgic/vgic-init.c
> >> +++ b/arch/arm64/kvm/vgic/vgic-init.c
> >> @@ -190,6 +190,7 @@ int kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu)
> >>int i;
> >>  
> >>vgic_cpu->rd_iodev.base_addr = VGIC_ADDR_UNDEF;
> >> +  vgic_cpu->index = vcpu->vcpu_id;
> > 
> > Is it so that vgic_cpu->index is always equal to vcpu_id? If so, why
> > do we need another field? We can always get to the vcpu using a
> > container_of().
> > 
> >>  
> >>INIT_LIST_HEAD(_cpu->ap_list_head);
> >>raw_spin_lock_init(_cpu->ap_list_lock);
> >> @@ -338,10 +339,8 @@ static void kvm_vgic_dist_destroy(struct kvm *kvm)
> >>dist->vgic_dist_base = VGIC_ADDR_UNDEF;
> >>  
> >>if (dist->vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3) {
> >> -  list_for_each_entry_safe(rdreg, next, >rd_regions, list) {
> >> -  list_del(>list);
> >> -  kfree(rdreg);
> >> -  }
> >> +  list_for_each_entry_safe(rdreg, next, >rd_regions, list)
> >> +  vgic_v3_free_redist_region(rdreg);
> > 
> > Consider moving the introduction of vgic_v3_free_redist_region() into
> > a separate patch. On its own, that's a good readability improvement.
> > 
> >>INIT_LIST_HEAD(>rd_regions);
> >>} else {
> >>dist->vgic_cpu_base = VGIC_ADDR_UNDEF;
> >> diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v3.c 
> >> b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> >> index 987e366c80008..f6a7eed1d6adb 100644
> >> --- a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> >> +++ b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> >> @@ -251,45 +251,57 @@ static void vgic_mmio_write_v3r_ctlr(struct kvm_vcpu 
> >> *vcpu,
> >>vgic_enable_lpis(vcpu);
> >>  }
> >>  
> >> +static bool vgic_mmio_vcpu_rdist_is_last(struct kvm_vcpu *vcpu)
> >> +{
> >> +  struct vgic_dist *vgic = >kvm->arch.vgic;
> >> +  struct vgic_cpu *vgic_cpu = >arch.vgic_cpu;
> >> +  struct vgic_redist_region *rdreg = vgic

Re: [PATCH v4 1/8] KVM: arm64: vgic-v3: Fix some error codes when setting RDIST base

2021-04-01 Thread Marc Zyngier
Hi Eric,

On Thu, 01 Apr 2021 09:52:31 +0100,
Eric Auger  wrote:
> 
> KVM_DEV_ARM_VGIC_GRP_ADDR group doc says we should return
> -EEXIST in case the base address of the redist is already set.
> We currently return -EINVAL.
> 
> However we need to return -EINVAL in case a legacy REDIST address
> is attempted to be set while REDIST_REGIONS were set. This case
> is discriminated by looking at the count field.
> 
> Signed-off-by: Eric Auger 
> 
> ---
> 
> v1 -> v2:
> - simplify the check sequence
> ---
>  arch/arm64/kvm/vgic/vgic-mmio-v3.c | 15 +++
>  1 file changed, 7 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v3.c 
> b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> index 15a6c98ee92f0..013b737b658f8 100644
> --- a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> +++ b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> @@ -791,10 +791,6 @@ static int vgic_v3_insert_redist_region(struct kvm *kvm, 
> uint32_t index,
>   size_t size = count * KVM_VGIC_V3_REDIST_SIZE;
>   int ret;
>  
> - /* single rdist region already set ?*/
> - if (!count && !list_empty(rd_regions))
> - return -EINVAL;
> -
>   /* cross the end of memory ? */
>   if (base + size < base)
>   return -EINVAL;
> @@ -805,11 +801,14 @@ static int vgic_v3_insert_redist_region(struct kvm 
> *kvm, uint32_t index,
>   } else {
>   rdreg = list_last_entry(rd_regions,
>   struct vgic_redist_region, list);
> - if (index != rdreg->index + 1)
> - return -EINVAL;
>  
> - /* Cannot add an explicitly sized regions after legacy region */
> - if (!rdreg->count)
> + if ((!count) != (!rdreg->count))
> + return -EINVAL; /* Mix REDIST and REDIST_REGION */

Urgh... The triple negation killed me. Can we come up with a more
intuitive expression? Something like:

/* Don't mix single region and discrete redist regions */
if (!count && rdreg->count)
return -EINVAL;

Does it capture what you want to express?

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH v3 03/14] PCI: rcar: Convert to MSI domains

2021-04-01 Thread Marc Zyngier
On Thu, 01 Apr 2021 11:19:57 +0100,
Lorenzo Pieralisi  wrote:
> 
> On Tue, Mar 30, 2021 at 04:11:34PM +0100, Marc Zyngier wrote:
> 
> [...]
> 
> > +static void rcar_compose_msi_msg(struct irq_data *data, struct msi_msg 
> > *msg)
> > +{
> > +   struct rcar_msi *msi = irq_data_get_irq_chip_data(data);
> > +   unsigned long pa = virt_to_phys(msi);
> >  
> > -   hwirq = rcar_msi_alloc_region(msi, nvec);
> > -   if (hwirq < 0)
> > -   return -ENOSPC;
> > +   /* Use the msi structure as the PA for the MSI doorbell */
> > +   msg->address_lo = lower_32_bits(pa);
> > +   msg->address_hi = upper_32_bits(pa);
> 
> I don't think this change is aligned with the previous patch (is it ?),
> the PA address we are using here is different from the one programmed
> into the controller registers - either that or I am missing something,
> please let me know.

Err. You are right. This looks like a bad case of broken conflict
resolution on my part.

The following snippet should fix it. Let me know if you want me to
resend the whole thing or whether you are OK with applying this by
hand.

Thanks,

M.

diff --git a/drivers/pci/controller/pcie-rcar-host.c 
b/drivers/pci/controller/pcie-rcar-host.c
index f7331ad0d6dc..765cf2b45e24 100644
--- a/drivers/pci/controller/pcie-rcar-host.c
+++ b/drivers/pci/controller/pcie-rcar-host.c
@@ -573,11 +573,10 @@ static int rcar_msi_set_affinity(struct irq_data *d, 
const struct cpumask *mask,
 static void rcar_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
 {
struct rcar_msi *msi = irq_data_get_irq_chip_data(data);
-   unsigned long pa = virt_to_phys(msi);
+   struct rcar_pcie *pcie = _to_host(msi)->pcie;
 
-   /* Use the msi structure as the PA for the MSI doorbell */
-   msg->address_lo = lower_32_bits(pa);
-   msg->address_hi = upper_32_bits(pa);
+   msg->address_lo = rcar_pci_read_reg(pcie, PCIEMSIALR) & ~MSIFE;
+   msg->address_hi = rcar_pci_read_reg(pcie, PCIEMSIAUR);
msg->data = data->hwirq;
 }
 

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH] clocksource/arm_arch_timer: add __ro_after_init and __init

2021-04-01 Thread Marc Zyngier
On Tue, 30 Mar 2021 07:04:44 +0100,
Jisheng Zhang  wrote:
> 
> Some functions are not needed after booting, so mark them as __init
> to move them to the .init section.
> 
> Some global variables are never modified after init, so can be
> __ro_after_init.
> 
> Signed-off-by: Jisheng Zhang 

Acked-by: Marc Zyngier 

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH v5 07/19] arm64: kvm: Enable access to TRBE support for host

2021-03-31 Thread Marc Zyngier
On Wed, 31 Mar 2021 16:28:46 +0100,
Alexandru Elisei  wrote:
> 
> Hello,
> 
> On 3/30/21 12:12 PM, Suzuki K Poulose wrote:
> > Hi Marc
> >
> > On 30/03/2021 11:12, Marc Zyngier wrote:
> >> Hi Suzuki,
> >>
> >> [+ Alex]
> >>
> >> On Tue, 23 Mar 2021 12:06:35 +,
> >> Suzuki K Poulose  wrote:
> >>> [..]
> >>
> >>>   #define MDCR_EL2_TTRF    (1 << 19)
> >>>   #define MDCR_EL2_TPMS    (1 << 14)
> >>>   #define MDCR_EL2_E2PB_MASK    (UL(0x3))
> >>> diff --git a/arch/arm64/include/asm/kvm_host.h
> >>> b/arch/arm64/include/asm/kvm_host.h
> >>> index 3d10e6527f7d..80d0a1a82a4c 100644
> >>> --- a/arch/arm64/include/asm/kvm_host.h
> >>> +++ b/arch/arm64/include/asm/kvm_host.h
> >>> @@ -315,6 +315,8 @@ struct kvm_vcpu_arch {
> >>>   struct kvm_guest_debug_arch regs;
> >>>   /* Statistical profiling extension */
> >>>   u64 pmscr_el1;
> >>> +    /* Self-hosted trace */
> >>> +    u64 trfcr_el1;
> >>>   } host_debug_state;
> >>>     /* VGIC state */
> >>> diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S
> >>> index 5eccbd62fec8..05d25e645b46 100644
> >>> --- a/arch/arm64/kernel/hyp-stub.S
> >>> +++ b/arch/arm64/kernel/hyp-stub.S
> >>> @@ -115,9 +115,10 @@ SYM_CODE_START_LOCAL(mutate_to_vhe)
> >>>   mrs_s    x0, SYS_VBAR_EL12
> >>>   msr    vbar_el1, x0
> >>>   -    // Use EL2 translations for SPE and disable access from EL1
> >>> +    // Use EL2 translations for SPE & TRBE and disable access from EL1
> >>>   mrs    x0, mdcr_el2
> >>>   bic    x0, x0, #(MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT)
> >>> +    bic    x0, x0, #(MDCR_EL2_E2TB_MASK << MDCR_EL2_E2TB_SHIFT)
> >>>   msr    mdcr_el2, x0
> >>>     // Transfer the MM state from EL1 to EL2
> >>> diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
> >>> index dbc890511631..7b16f42d39f4 100644
> >>> --- a/arch/arm64/kvm/debug.c
> >>> +++ b/arch/arm64/kvm/debug.c
> >>> @@ -89,7 +89,7 @@ void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu)
> >>>    *  - Debug ROM Address (MDCR_EL2_TDRA)
> >>>    *  - OS related registers (MDCR_EL2_TDOSA)
> >>>    *  - Statistical profiler (MDCR_EL2_TPMS/MDCR_EL2_E2PB)
> >>> - *  - Self-hosted Trace Filter controls (MDCR_EL2_TTRF)
> >>> + *  - Self-hosted Trace (MDCR_EL2_TTRF/MDCR_EL2_E2TB)
> >>
> >> For the record, this is likely to conflict with [1], although that
> >> patch still has some issues.
> >
> > Thanks for the heads up. I think that patch will also conflict with my fixes
> > that is queued in kvmarm/fixes.
> 
> How should I proceed with the patch [1]? For the next iteration, should I 
> rebase
> it on top of kvmarm/fixes or on top of kvmarm/fixes + this series?

On top of kvmarm/fixes, please. I'll merge that branch into 5.13
anyway, as it is taking some time to land upstream.

I'll take care of the conflicts, and shout if I need help!

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH] KVM: arm64: Support PREL/PLT relocs in EL2 code

2021-03-31 Thread Marc Zyngier
On Wed, 31 Mar 2021 13:30:48 +, David Brazdil wrote:
> gen-hyprel tool parses object files of the EL2 portion of KVM
> and generates runtime relocation data. While only filtering for
> R_AARCH64_ABS64 relocations in the input object files, it has an
> allow-list of relocation types that are used for relative
> addressing. Other, unexpected, relocation types are rejected and
> cause the build to fail.
> 
> [...]

Applied to kvm-arm64/misc-5.13, thanks!

[1/1] KVM: arm64: Support PREL/PLT relocs in EL2 code
  commit: 77e06b300161d41d65950be9c77a785c142b381d

Cheers,

M.
-- 
Without deviation from the norm, progress is not possible.




Re: [PATCH] arm: uprobes: Don't hook on thumb instructions

2021-03-31 Thread Marc Zyngier
Hi Fredrik,

On  Mon, 18 May 2020, Fredrik Strupe wrote:
> Since uprobes is not supported for thumb, check that the thumb bit is
> not set when matching the uprobes instruction hooks.
>
> The Arm UDF instructions used for uprobes triggering
> (UPROBE_SWBP_ARM_INSN and UPROBE_SS_ARM_INSN) coincidentally share the
> same encoding as a pair of unallocated 32-bit thumb instructions (not
> UDF) when the condition code is 0b (0xf). This in effect makes it
> possible to trigger the uprobes functionality from thumb, and at that
> using two unallocated instructions which are not permanently
> undefined.
>
> Signed-off-by: Fredrik Strupe  Fixes: c7edc9e326d5 ("ARM: add uprobes support")

It looks like we dropped the ball on this patch. Could you please add
it to Russell's patch system, together with a Cc: stable?

Otherwise, just say the word and I'll do it for you.

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH RFC for-next 1/1] arm64: sve: Fix some compile errors about sve_cond_update_zcr_vq

2021-03-31 Thread Marc Zyngier
On Wed, 31 Mar 2021 17:44:39 +0800, Xiaofei Tan wrote:
> There are some compile errors of ARM64 KVM when ARM64_SVE is not
> selected, in the locations of sve_cond_update_zcr_vq used.
> One error log is as following:
> 
>   CC  arch/arm64/kvm/sys_regs.o
> In file included from arch/arm64/kvm/hyp/vhe/switch.c:8:0:
> ./arch/arm64/kvm/hyp/include/hyp/switch.h: In function 
> ‘__hyp_sve_restore_guest’:
> ./arch/arm64/kvm/hyp/include/hyp/switch.h:220:2: error: implicit declaration 
> of
> function ‘sve_cond_update_zcr_vq’; did you mean
> ‘kvm_pmu_update_run’? [-Werror=implicit-function-declaration]
>   sve_cond_update_zcr_vq(vcpu_sve_max_vq(vcpu) - 1, SYS_ZCR_EL2);
>   ^~
>   kvm_pmu_update_run
> 
> [...]

Applied to kvm-arm64/nvhe-sve, thanks!

[1/1] arm64: sve: Fix some compile errors about sve_cond_update_zcr_vq
  commit: a9f8696d4be5228de9d1d4f0e9f027b64d77dab6

Cheers,

M.
-- 
Without deviation from the norm, progress is not possible.




Re: [PATCH 00/18] KVM: Consolidate and optimize MMU notifiers

2021-03-31 Thread Marc Zyngier

On 2021-03-31 08:57, Paolo Bonzini wrote:


Queued and 1-9 and 18, thanks.  There's a small issue in patch 10 that
prevented me from committing 10-15, but they mostly look good.


Can you please push the resulting merge somewhere?

I'm concerned that it will conflict in interesting way with other stuff
that is on its way on the arm64 side, not to mentiobn that this hasn't
been tested at all on anything but x86 (and given the series was posted
on Friday, that's a bit of a short notice).

Thanks,

M.
--
Jazz is not dead. It just smells funny...


Re: [PATCH -next] KVM: arm64: Make symbol '_kvm_host_prot_finalize' static

2021-03-31 Thread Marc Zyngier
On Wed, 31 Mar 2021 15:36:19 +0800, Xu Jia wrote:
> The sparse tool complains as follows:
> 
> arch/arm64/kvm/arm.c:1900:6: warning:
>  symbol '_kvm_host_prot_finalize' was not declared. Should it be static?
> 
> This symbol is not used outside of arm.c, so this
> commit marks it static.

Applied to kvm-arm64/host-stage2, thanks!

[1/1] KVM: arm64: Make symbol '_kvm_host_prot_finalize' static
  commit: b1306fef1f48c0af1d659c18c53cf275fdcc94be

Cheers,

M.
-- 
Without deviation from the norm, progress is not possible.




Re: [PATCH v5 07/19] arm64: kvm: Enable access to TRBE support for host

2021-03-30 Thread Marc Zyngier
On Tue, 30 Mar 2021 16:23:14 +0100,
Mathieu Poirier  wrote:
> 
> On Tue, Mar 30, 2021 at 11:38:18AM +0100, Suzuki K Poulose wrote:
> > On 26/03/2021 16:55, Mathieu Poirier wrote:
> > > On Tue, Mar 23, 2021 at 12:06:35PM +, Suzuki K Poulose wrote:
> > > > For a nvhe host, the EL2 must allow the EL1&0 translation
> > > > regime for TraceBuffer (MDCR_EL2.E2TB == 0b11). This must
> > > > be saved/restored over a trip to the guest. Also, before
> > > > entering the guest, we must flush any trace data if the
> > > > TRBE was enabled. And we must prohibit the generation
> > > > of trace while we are in EL1 by clearing the TRFCR_EL1.
> > > > 
> > > > For vhe, the EL2 must prevent the EL1 access to the Trace
> > > > Buffer.
> > > > 
> > > > Cc: Will Deacon 
> > > > Cc: Catalin Marinas 
> > > > Cc: Marc Zyngier 
> > > > Cc: Mark Rutland 
> > > > Cc: Anshuman Khandual 
> > > > Acked-by: Mathieu Poirier 
> > > > Signed-off-by: Suzuki K Poulose 
> > > > ---
> > > >   arch/arm64/include/asm/el2_setup.h | 13 +
> > > >   arch/arm64/include/asm/kvm_arm.h   |  2 ++
> > > >   arch/arm64/include/asm/kvm_host.h  |  2 ++
> > > >   arch/arm64/kernel/hyp-stub.S   |  3 ++-
> > > >   arch/arm64/kvm/debug.c |  6 ++---
> > > >   arch/arm64/kvm/hyp/nvhe/debug-sr.c | 42 ++
> > > >   arch/arm64/kvm/hyp/nvhe/switch.c   |  1 +
> > > >   7 files changed, 65 insertions(+), 4 deletions(-)
> > > > 
> > > 
> > > Marc - do you want me to pick up this one?
> > 
> > I think the kvmarm tree is the best route for this patch, given the amount
> > of changes the tree is going through, in the areas this patch
> > touches. Or else there would be conflicts with merging. And this patch
> > depends on the patches from this series that were queued.
> > 
> > Here is the depency tree :
> > 
> > a) kvm-arm fixes for debug (Patch 1, 2) & SPE save-restore fix (queued in
> > v5.12-rc3)
> > 
> > b) TRBE defintions and Trace synchronization barrier (Patches 5 & 6)
> > 
> > c) kvm-arm TRBE host support (Patch 7)
> > 
> > d) TRBE driver support (and the ETE changes)
> > 
> > 
> > (c) code merge depends on -> (a) + (b)
> > (d) build (no conflicts) depends on -> (b)
> > 
> > 
> > Now (d) has an indirect dependency on (c) for operational correctness at
> > runtime.
> > So, if :
> > 
> > kvmarm tree picks up : b + c
> > coresight tree picksup : b + d
> > 
> > and if we could ensure the merge order of the trees are in
> > kvmarm
> > greg-kh (device-misc tree) (coresight goes via this tree)
> >
> 
> Greg's char-misc tree is based on the rc releases rather than next.  As such 
> it
> is a while before other branches like kvmarm get merged, causing all sort of
> compilation breakage.
>  
> > we should be fine.
> > 
> > Additionally, we could rip out the Kconfig changes from the TRBE patch
> > and add it only at the rc1, once we verify both the trees are in to make
> > sure the runtime operation dependency is not triggered.
> >
> 
> We could also do that but Greg might frown at the tactic, and
> rightly so.

We do that all the times. Otherwise, it is hardly possible to build an
infrastructure that spans across multiple subsystems *and* involves
userspace. I really wouldn't worry about that.

M.

-- 
Without deviation from the norm, progress is not possible.


[PATCH v3 11/14] PCI/MSI: Make pci_host_common_probe() declare its reliance on MSI domains

2021-03-30 Thread Marc Zyngier
The generic PCI host driver relies on MSI domains for MSIs to
be provided to its end-points. Make this dependency explicit.

This cures the warnings occuring on arm/arm64 VMs when booted
with PCI virtio devices and no MSI controller (no GICv3 ITS,
for example).

It is likely that other drivers will need to express the same
dependency.

Acked-by: Bjorn Helgaas 
Signed-off-by: Marc Zyngier 
---
 drivers/pci/controller/pci-host-common.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/pci/controller/pci-host-common.c 
b/drivers/pci/controller/pci-host-common.c
index 6ab694f8d283..d3924a44db02 100644
--- a/drivers/pci/controller/pci-host-common.c
+++ b/drivers/pci/controller/pci-host-common.c
@@ -79,6 +79,7 @@ int pci_host_common_probe(struct platform_device *pdev)
 
bridge->sysdata = cfg;
bridge->ops = (struct pci_ops *)>pci_ops;
+   bridge->msi_domain = true;
 
return pci_host_probe(bridge);
 }
-- 
2.29.2



[PATCH v3 13/14] PCI/MSI: Document the various ways of ending up with NO_MSI

2021-03-30 Thread Marc Zyngier
We have now three ways of ending up with NO_MSI being set.
Document them.

Acked-by: Bjorn Helgaas 
Signed-off-by: Marc Zyngier 
---
 drivers/pci/msi.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index d9c73c173c14..217dc9f0231f 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -871,8 +871,15 @@ static int pci_msi_supported(struct pci_dev *dev, int nvec)
 * Any bridge which does NOT route MSI transactions from its
 * secondary bus to its primary bus must set NO_MSI flag on
 * the secondary pci_bus.
-* We expect only arch-specific PCI host bus controller driver
-* or quirks for specific PCI bridges to be setting NO_MSI.
+*
+* The NO_MSI flag can either be set directly by:
+* - arch-specific PCI host bus controller drivers (deprecated)
+* - quirks for specific PCI bridges
+*
+* or indirectly by platform-specific PCI host bridge drivers by
+* advertising the 'msi_domain' property, which results in
+* the NO_MSI flag when no MSI domain is found for this bridge
+* at probe time.
 */
for (bus = dev->bus; bus; bus = bus->parent)
if (bus->bus_flags & PCI_BUS_FLAGS_NO_MSI)
-- 
2.29.2



[PATCH v3 14/14] PCI: Refactor HT advertising of NO_MSI flag

2021-03-30 Thread Marc Zyngier
The few quirks that deal with NO_MSI tend to be copy-paste heavy.
Refactor them so that the hierarchy of conditions is slightly
cleaner.

Acked-by: Bjorn Helgaas 
Signed-off-by: Marc Zyngier 
---
 drivers/pci/quirks.c | 15 ---
 1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 653660e3ba9e..972bb0f9f994 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -2585,10 +2585,8 @@ static int msi_ht_cap_enabled(struct pci_dev *dev)
 /* Check the HyperTransport MSI mapping to know whether MSI is enabled or not 
*/
 static void quirk_msi_ht_cap(struct pci_dev *dev)
 {
-   if (dev->subordinate && !msi_ht_cap_enabled(dev)) {
-   pci_warn(dev, "MSI quirk detected; subordinate MSI disabled\n");
-   dev->subordinate->bus_flags |= PCI_BUS_FLAGS_NO_MSI;
-   }
+   if (!msi_ht_cap_enabled(dev))
+   quirk_disable_msi(dev);
 }
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_SERVERWORKS, 
PCI_DEVICE_ID_SERVERWORKS_HT2000_PCIE,
quirk_msi_ht_cap);
@@ -2601,9 +2599,6 @@ static void quirk_nvidia_ck804_msi_ht_cap(struct pci_dev 
*dev)
 {
struct pci_dev *pdev;
 
-   if (!dev->subordinate)
-   return;
-
/*
 * Check HT MSI cap on this chipset and the root one.  A single one
 * having MSI is enough to be sure that MSI is supported.
@@ -2611,10 +2606,8 @@ static void quirk_nvidia_ck804_msi_ht_cap(struct pci_dev 
*dev)
pdev = pci_get_slot(dev->bus, 0);
if (!pdev)
return;
-   if (!msi_ht_cap_enabled(dev) && !msi_ht_cap_enabled(pdev)) {
-   pci_warn(dev, "MSI quirk detected; subordinate MSI disabled\n");
-   dev->subordinate->bus_flags |= PCI_BUS_FLAGS_NO_MSI;
-   }
+   if (!msi_ht_cap_enabled(pdev))
+   quirk_msi_ht_cap(dev);
pci_dev_put(pdev);
 }
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_CK804_PCIE,
-- 
2.29.2



  1   2   3   4   5   6   7   8   9   10   >