[PATCH v2] selftests/powerpc: Fix L1D flushing tests for Power10

2021-02-22 Thread Russell Currey
The rfi_flush and entry_flush selftests work by using the PM_LD_MISS_L1
perf event to count L1D misses.  The value of this event has changed
over time:

- Power7 uses 0x400f0
- Power8 and Power9 use both 0x400f0 and 0x3e054
- Power10 uses only 0x3e054

Rather than relying on raw values, configure perf to count L1D read
misses in the most explicit way available.
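
For reference, the same generic cache event can be requested through the raw
perf_event_open() syscall roughly as follows (a minimal sketch, not the
selftest's perf_event_open_counter() helper; error handling omitted):

#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

static int open_l1d_read_miss_counter(void)
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HW_CACHE;
	attr.config = PERF_COUNT_HW_CACHE_L1D |
		      (PERF_COUNT_HW_CACHE_OP_READ << 8) |
		      (PERF_COUNT_HW_CACHE_RESULT_MISS << 16);
	attr.disabled = 1;

	/* count for the calling thread, on any CPU */
	return syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0);
}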

This fixes the selftests so they work on systems where 0x400f0 is not
PM_LD_MISS_L1, and should not change behaviour on systems where the tests
already worked.

The only potential downside is that referring to a specific perf event
requires PMU support implemented in the kernel for that platform.

Signed-off-by: Russell Currey 
---
v2: Move away from raw events as suggested by mpe

 tools/testing/selftests/powerpc/security/entry_flush.c | 2 +-
 tools/testing/selftests/powerpc/security/flush_utils.h | 4 ++++
 tools/testing/selftests/powerpc/security/rfi_flush.c   | 2 +-
 3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/powerpc/security/entry_flush.c b/tools/testing/selftests/powerpc/security/entry_flush.c
index 78cf914fa321..68ce377b205e 100644
--- a/tools/testing/selftests/powerpc/security/entry_flush.c
+++ b/tools/testing/selftests/powerpc/security/entry_flush.c
@@ -53,7 +53,7 @@ int entry_flush_test(void)
 
entry_flush = entry_flush_orig;
 
-   fd = perf_event_open_counter(PERF_TYPE_RAW, /* L1d miss */ 0x400f0, -1);
+   fd = perf_event_open_counter(PERF_TYPE_HW_CACHE, PERF_L1D_READ_MISS_CONFIG, -1);
FAIL_IF(fd < 0);
 
p = (char *)memalign(zero_size, CACHELINE_SIZE);
diff --git a/tools/testing/selftests/powerpc/security/flush_utils.h b/tools/testing/selftests/powerpc/security/flush_utils.h
index 07a5eb301466..7a3d60292916 100644
--- a/tools/testing/selftests/powerpc/security/flush_utils.h
+++ b/tools/testing/selftests/powerpc/security/flush_utils.h
@@ -9,6 +9,10 @@
 
 #define CACHELINE_SIZE 128
 
+#define PERF_L1D_READ_MISS_CONFIG  ((PERF_COUNT_HW_CACHE_L1D) |		\
+   (PERF_COUNT_HW_CACHE_OP_READ << 8) |		\
+   (PERF_COUNT_HW_CACHE_RESULT_MISS << 16))
+
 void syscall_loop(char *p, unsigned long iterations,
  unsigned long zero_size);
 
diff --git a/tools/testing/selftests/powerpc/security/rfi_flush.c b/tools/testing/selftests/powerpc/security/rfi_flush.c
index 7565fd786640..f73484a6470f 100644
--- a/tools/testing/selftests/powerpc/security/rfi_flush.c
+++ b/tools/testing/selftests/powerpc/security/rfi_flush.c
@@ -54,7 +54,7 @@ int rfi_flush_test(void)
 
rfi_flush = rfi_flush_orig;
 
-   fd = perf_event_open_counter(PERF_TYPE_RAW, /* L1d miss */ 0x400f0, -1);
+   fd = perf_event_open_counter(PERF_TYPE_HW_CACHE, PERF_L1D_READ_MISS_CONFIG, -1);
FAIL_IF(fd < 0);
 
p = (char *)memalign(zero_size, CACHELINE_SIZE);
-- 
2.30.1



[PATCH] powerpc/perf: Fix handling of privilege level checks in perf interrupt context

2021-02-22 Thread Athira Rajeev
Running "perf mem record" in powerpc platforms with selinux enabled
resulted in soft lockup's. Below call-trace was seen in the logs:

CPU: 58 PID: 3751 Comm: sssd_nss Not tainted 5.11.0-rc7+ #2
NIP:  c0dff3d4 LR: c0dff3d0 CTR: 
REGS: c07fffab7d60 TRAP: 0100   Not tainted  (5.11.0-rc7+)
<<>>
NIP [c0dff3d4] _raw_spin_lock_irqsave+0x94/0x120
LR [c0dff3d0] _raw_spin_lock_irqsave+0x90/0x120
Call Trace:
[cfd471a0] [cfd47260] 0xcfd47260 (unreliable)
[cfd471e0] [c0b5fbbc] skb_queue_tail+0x3c/0x90
[cfd47220] [c0296edc] audit_log_end+0x6c/0x180
[cfd47260] [c06a3f20] common_lsm_audit+0xb0/0xe0
[cfd472a0] [c066c664] slow_avc_audit+0xa4/0x110
[cfd47320] [c066cff4] avc_has_perm+0x1c4/0x260
[cfd47430] [c066e064] selinux_perf_event_open+0x74/0xd0
[cfd47450] [c0669888] security_perf_event_open+0x68/0xc0
[cfd47490] [c013d788] record_and_restart+0x6e8/0x7f0
[cfd476c0] [c013dabc] perf_event_interrupt+0x22c/0x560
[cfd477d0] [c002d0fc] performance_monitor_exception+0x4c/0x60
[cfd477f0] [c000b378] performance_monitor_common_virt+0x1c8/0x1d0
interrupt: f00 at _raw_spin_lock_irqsave+0x38/0x120
NIP:  c0dff378 LR: c0b5fbbc CTR: c07d47f0
REGS: cfd47860 TRAP: 0f00   Not tainted  (5.11.0-rc7+)
<<>>
NIP [c0dff378] _raw_spin_lock_irqsave+0x38/0x120
LR [c0b5fbbc] skb_queue_tail+0x3c/0x90
interrupt: f00
[cfd47b00] [0038] 0x38 (unreliable)
[cfd47b40] [caae6200] 0xcaae6200
[cfd47b80] [c0296edc] audit_log_end+0x6c/0x180
[cfd47bc0] [c029f494] audit_log_exit+0x344/0xf80
[cfd47d10] [c02a2b00] __audit_syscall_exit+0x2c0/0x320
[cfd47d60] [c0032878] do_syscall_trace_leave+0x148/0x200
[cfd47da0] [c003d5b4] syscall_exit_prepare+0x324/0x390
[cfd47e10] [c000d76c] system_call_common+0xfc/0x27c

The above trace shows that while the CPU was handling a performance
monitor exception, there was a call to "security_perf_event_open"
function. In powerpc core-book3s, this function is called from
'perf_allow_kernel' check during recording of data address in the sample
via perf_get_data_addr().

Commit da97e18458fb ("perf_event: Add support for LSM and SELinux checks")
introduced security enhancements to perf. As part of this commit, a new
security hook for perf_event_open was added in all places where the perf
paranoid check was previously used. The powerpc core-book3s code originally
had paranoid checks in 'perf_get_data_addr' and 'power_pmu_bhrb_read', so
the 'perf_paranoid_kernel' checks there were replaced with
'perf_allow_kernel' as well.

The intention of the paranoid checks in core-book3s is to verify privilege
before capturing some of the sample data. In addition to the paranoid
check, 'perf_allow_kernel' also calls 'security_perf_event_open'. Since
these functions are reached while recording a sample, we end up calling
selinux_perf_event_open in PMI context. Some of the security functions,
such as sidtab_sid2str_put(), take a spinlock. If a perf interrupt hits
while such a spinlock is held and the selinux hook functions are then
called from the PMI handler, this can deadlock.

Since the purpose of this security hook is to control access to
perf_event_open, it is not right to call it in interrupt context. But in
the case of the powerpc PMU, the privilege checks are still needed for
specific samples from the branch history ring buffer and for the sampling
register values.
Reference commits:
Commit cd1231d7035f ("powerpc/perf: Prevent kernel address leak via
perf_get_data_addr()")
Commit bb19af816025 ("powerpc/perf: Prevent kernel address leak to
userspace via BHRB buffer")

As a fix, this patch caches the 'perf_allow_kernel' value at event_init
time in the 'pmu_private' field of the perf_event. The cached value is
then used in the PMI code path.
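
Conceptually the change looks roughly like the sketch below (the event_init
hunk is truncated in this mail, so the exact field handling may differ):

/* sketch only: in power_pmu_event_init(), cache the privilege check result
 * so the PMI path never needs to call the LSM hook */
event->pmu_private = (void *)(unsigned long)(perf_allow_kernel(&event->attr) == 0);

/* sketch only: in the PMI path (e.g. perf_get_data_addr()), consult the
 * cached value instead of calling perf_allow_kernel() */
if (is_kernel_addr(mfspr(SPRN_SDAR)) && !event_allow_kernel(event))
	*addrp = 0;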

Suggested-by: Michael Ellerman 
Signed-off-by: Athira Rajeev 
---
 arch/powerpc/perf/core-book3s.c | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 4b4319d8..9e9f67f 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -189,6 +189,11 @@ static inline unsigned long perf_ip_adjust(struct pt_regs *regs)
return 0;
 }
 
+static bool event_allow_kernel(struct perf_event *event)
+{
+   return (bool)event->pmu_private;
+}
+
 /*
  * The user wants a data address recorded.
  * If we're not doing instruction sampling, give them the SDAR
@@ -222,7 +227,7 @@ static inline void perf_get_data_addr(struct perf_event *event, struct pt_regs *
if (!(mmcra & MMCRA_SAMPLE_ENABLE) || sdar_valid)
*addrp = mfspr(SPRN_SDAR);
 
-   if (is_kernel_addr(mfspr(SPRN_SDAR)) && 

Re: [PATCH kernel] powerpc/iommu: Annotate nested lock for lockdep

2021-02-22 Thread Alexey Kardashevskiy




On 18/02/2021 23:59, Frederic Barrat wrote:



On 16/02/2021 04:20, Alexey Kardashevskiy wrote:

The IOMMU table is divided into pools for concurrent mappings and each
pool has a separate spinlock. When taking the ownership of an IOMMU group
to pass through a device to a VM, we lock these spinlocks which triggers
a false negative warning in lockdep (below).

This fixes it by annotating the large pool's spinlock as a nest lock.

===
WARNING: possible recursive locking detected
5.11.0-le_syzkaller_a+fstn1 #100 Not tainted

qemu-system-ppc/4129 is trying to acquire lock:
c000119bddb0 (&(p->lock)/1){}-{2:2}, at: 
iommu_take_ownership+0xac/0x1e0


but task is already holding lock:
c000119bdd30 (&(p->lock)/1){}-{2:2}, at: 
iommu_take_ownership+0xac/0x1e0


other info that might help us debug this:
  Possible unsafe locking scenario:

    CPU0
    
   lock(&(p->lock)/1);
   lock(&(p->lock)/1);
===

Signed-off-by: Alexey Kardashevskiy 
---
  arch/powerpc/kernel/iommu.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index 557a09dd5b2f..2ee642a6731a 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -1089,7 +1089,7 @@ int iommu_take_ownership(struct iommu_table *tbl)
  spin_lock_irqsave(&tbl->large_pool.lock, flags);
  for (i = 0; i < tbl->nr_pools; i++)
-    spin_lock(&tbl->pools[i].lock);
+    spin_lock_nest_lock(&tbl->pools[i].lock, &tbl->large_pool.lock);



We have the same pattern and therefore should have the same problem in 
iommu_release_ownership().


But as I understand, we're hacking our way around lockdep here, since 
conceptually, those locks are independent. I was wondering why it seems 
to fix it by worrying only about the large pool lock.


This is the other way around - we are telling lockdep not to worry about
the small pool locks if the nest lock (== the large pool lock) is held.
The warning is printed when a nested lock is detected, and lockdep then
checks in check_deadlock() whether a nest lock is registered for it.
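
In other words, the idiom being annotated is roughly (a sketch of the
pattern, not a new hunk):

/* take the "nest" lock first ... */
spin_lock_irqsave(&tbl->large_pool.lock, flags);
/* ... then declare each per-pool lock as only ever nesting under it, so
 * lockdep does not report the identical lock classes as recursion */
for (i = 0; i < tbl->nr_pools; i++)
	spin_lock_nest_lock(&tbl->pools[i].lock, &tbl->large_pool.lock);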



That loop can take 
many locks (up to 4 with current config). However, if the dma window is 
less than 1GB, we would only have one, so it would make sense for 
lockdep to stop complaining.


Why would it stop if the large pool is always there?

Is it what happened? In which case, this 
patch doesn't really fix it. Or I'm missing something :-)


I tried with 1 or 2 small pools, no difference at all. I might also be 
missing something here too :)





   Fred




  iommu_table_release_pages(tbl);



--
Alexey


Re: [PATCH v4 2/3] KVM: PPC: Book3S HV: Add support for H_RPT_INVALIDATE

2021-02-22 Thread David Gibson
On Mon, Feb 22, 2021 at 12:16:08PM +0530, Bharata B Rao wrote:
> On Wed, Feb 17, 2021 at 11:38:07AM +1100, David Gibson wrote:
> > On Mon, Feb 15, 2021 at 12:05:41PM +0530, Bharata B Rao wrote:
> > > Implement H_RPT_INVALIDATE hcall and add KVM capability
> > > KVM_CAP_PPC_RPT_INVALIDATE to indicate the support for the same.
> > > 
> > > This hcall does two types of TLB invalidations:
> > > 
> > > 1. Process-scoped invalidations for guests with LPCR[GTSE]=0.
> > >This is currently not used in KVM as GTSE is not usually
> > >disabled in KVM.
> > > 2. Partition-scoped invalidations that an L1 hypervisor does on
> > >behalf of an L2 guest. This replaces the uses of the existing
> > >hcall H_TLB_INVALIDATE.
> > > 
> > > In order to handle process scoped invalidations of L2, we
> > > intercept the nested exit handling code in L0 only to handle
> > > H_TLB_INVALIDATE hcall.
> > > 
> > > Signed-off-by: Bharata B Rao 
> > > ---
> > >  Documentation/virt/kvm/api.rst | 17 +
> > >  arch/powerpc/include/asm/kvm_book3s.h  |  3 +
> > >  arch/powerpc/include/asm/mmu_context.h | 11 +++
> > >  arch/powerpc/kvm/book3s_hv.c   | 91 
> > >  arch/powerpc/kvm/book3s_hv_nested.c| 96 ++
> > >  arch/powerpc/kvm/powerpc.c |  3 +
> > >  arch/powerpc/mm/book3s64/radix_tlb.c   | 25 +++
> > >  include/uapi/linux/kvm.h   |  1 +
> > >  8 files changed, 247 insertions(+)
> > > 
> > > diff --git a/Documentation/virt/kvm/api.rst 
> > > b/Documentation/virt/kvm/api.rst
> > > index 99ceb978c8b0..416c36aa35d4 100644
> > > --- a/Documentation/virt/kvm/api.rst
> > > +++ b/Documentation/virt/kvm/api.rst
> > > @@ -6038,6 +6038,23 @@ KVM_EXIT_X86_RDMSR and KVM_EXIT_X86_WRMSR exit 
> > > notifications which user space
> > >  can then handle to implement model specific MSR handling and/or user 
> > > notifications
> > >  to inform a user that an MSR was not handled.
> > >  
> > > +7.22 KVM_CAP_PPC_RPT_INVALIDATE
> > > +--
> > > +
> > > +:Capability: KVM_CAP_PPC_RPT_INVALIDATE
> > > +:Architectures: ppc
> > > +:Type: vm
> > > +
> > > +This capability indicates that the kernel is capable of handling
> > > +H_RPT_INVALIDATE hcall.
> > > +
> > > +In order to enable the use of H_RPT_INVALIDATE in the guest,
> > > +user space might have to advertise it for the guest. For example,
> > > +IBM pSeries (sPAPR) guest starts using it if "hcall-rpt-invalidate" is
> > > +present in the "ibm,hypertas-functions" device-tree property.
> > > +
> > > +This capability is always enabled.
> > 
> > I guess that means it's always enabled when it's available - I'm
> > pretty sure it won't be enabled on POWER8 or on PR KVM.
> 
> Correct, will reword this and restrict this to POWER9, radix etc
> 
> > 
> > > +
> > >  8. Other capabilities.
> > >  ==
> > >  
> > > diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
> > > b/arch/powerpc/include/asm/kvm_book3s.h
> > > index d32ec9ae73bd..0f1c5fa6e8ce 100644
> > > --- a/arch/powerpc/include/asm/kvm_book3s.h
> > > +++ b/arch/powerpc/include/asm/kvm_book3s.h
> > > @@ -298,6 +298,9 @@ void kvmhv_set_ptbl_entry(unsigned int lpid, u64 dw0, 
> > > u64 dw1);
> > >  void kvmhv_release_all_nested(struct kvm *kvm);
> > >  long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu);
> > >  long kvmhv_do_nested_tlbie(struct kvm_vcpu *vcpu);
> > > +long kvmhv_h_rpti_nested(struct kvm_vcpu *vcpu, unsigned long lpid,
> > > +  unsigned long type, unsigned long pg_sizes,
> > > +  unsigned long start, unsigned long end);
> > >  int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu,
> > > u64 time_limit, unsigned long lpcr);
> > >  void kvmhv_save_hv_regs(struct kvm_vcpu *vcpu, struct hv_guest_state 
> > > *hr);
> > > diff --git a/arch/powerpc/include/asm/mmu_context.h 
> > > b/arch/powerpc/include/asm/mmu_context.h
> > > index d5821834dba9..fbf3b5b45fe9 100644
> > > --- a/arch/powerpc/include/asm/mmu_context.h
> > > +++ b/arch/powerpc/include/asm/mmu_context.h
> > > @@ -124,8 +124,19 @@ static inline bool need_extra_context(struct 
> > > mm_struct *mm, unsigned long ea)
> > >  
> > >  #if defined(CONFIG_KVM_BOOK3S_HV_POSSIBLE) && 
> > > defined(CONFIG_PPC_RADIX_MMU)
> > >  extern void radix_kvm_prefetch_workaround(struct mm_struct *mm);
> > > +void do_h_rpt_invalidate(unsigned long pid, unsigned long lpid,
> > > +  unsigned long type, unsigned long page_size,
> > > +  unsigned long psize, unsigned long start,
> > > +  unsigned long end);
> > >  #else
> > >  static inline void radix_kvm_prefetch_workaround(struct mm_struct *mm) { 
> > > }
> > > +static inline void do_h_rpt_invalidate(unsigned long pid,
> > > +unsigned long lpid,
> > > +unsigned long type,
> > > +unsigned long page_size,
> > > + 

Re: [PATCH RFC v1 5/6] xen-swiotlb: convert variables to arrays

2021-02-22 Thread Stefano Stabellini
On Fri, 19 Feb 2021, Konrad Rzeszutek Wilk wrote:
> On Sun, Feb 07, 2021 at 04:56:01PM +0100, Christoph Hellwig wrote:
> > On Thu, Feb 04, 2021 at 09:40:23AM +0100, Christoph Hellwig wrote:
> > > So one thing that has been on my mind for a while:  I'd really like
> > > to kill the separate dma ops in Xen swiotlb.  If we compare xen-swiotlb
> > > to swiotlb the main difference seems to be:
> > > 
> > >  - additional reasons to bounce I/O vs the plain DMA capable
> > >  - the possibility to do a hypercall on arm/arm64
> > >  - an extra translation layer before doing the phys_to_dma and vice
> > >versa
> > >  - a special memory allocator
> > > 
> > > I wonder if inbetween a few jump labels or other no overhead enablement
> > > options and possibly better use of the dma_range_map we could kill
> > > off most of swiotlb-xen instead of maintaining all this code duplication?
> > 
> > So I looked at this a bit more.
> > 
> > For x86 with XENFEAT_auto_translated_physmap (how common is that?)
> 
> Juergen, Boris please correct me if I am wrong, but that 
> XENFEAT_auto_translated_physmap
> only works for PVH guests?

ARM is always XENFEAT_auto_translated_physmap


> > pfn_to_gfn is a nop, so plain phys_to_dma/dma_to_phys do work as-is.
> > 
> > xen_arch_need_swiotlb always returns true for x86, and
> > range_straddles_page_boundary should never be true for the
> > XENFEAT_auto_translated_physmap case.
> 
> Correct. The kernel should have no clue of what the real MFNs are
> for PFNs.

On ARM, Linux knows the MFNs because for local pages MFN == PFN and for
foreign pages it keeps track in arch/arm/xen/p2m.c. More on this below.

xen_arch_need_swiotlb only returns true on ARM in rare situations where
bouncing on swiotlb buffers is required. Today it only happens on old
versions of Xen that don't support the cache flushing hypercall but
there could be more cases in the future.


> > 
> > So as far as I can tell the mapping fast path for the
> > XENFEAT_auto_translated_physmap can be trivially reused from swiotlb.
> > 
> > That leaves us with the next more complicated case, x86 or fully cache
> > coherent arm{,64} without XENFEAT_auto_translated_physmap.  In that case
> > we need to patch in a phys_to_dma/dma_to_phys that performs the MFN
> > lookup, which could be done using alternatives or jump labels.
> > I think if that is done right we should also be able to let that cover
> > the foreign pages in is_xen_swiotlb_buffer/is_swiotlb_buffer, but
> > in that worst case that would need another alternative / jump label.
> > 
> > For non-coherent arm{,64} we'd also need to use alternatives or jump
> > labels for the cache maintenance ops, but that isn't a hard problem
> > either.

With the caveat that ARM is always XENFEAT_auto_translated_physmap, what
you wrote looks correct. I am writing down a brief explanation on how
swiotlb-xen is used on ARM.


pfn: address as seen by the guest, pseudo-physical address in ARM terminology
mfn (or bfn): real address, physical address in ARM terminology


On ARM dom0 is auto_translated (so Xen sets up the stage2 translation
in the MMU) and the translation is 1:1. So pfn == mfn for Dom0.

However, when another domain shares a page with Dom0, that page is not
1:1. Swiotlb-xen is used to retrieve the mfn for the foreign page at
xen_swiotlb_map_page. It does that with xen_phys_to_bus -> pfn_to_bfn.
It is implemented with a rbtree in arch/arm/xen/p2m.c.

In addition, swiotlb-xen is also used to cache-flush the page via
hypercall at xen_swiotlb_unmap_page. That is done because dev_addr is
really the mfn at unmap_page and we don't know the pfn for it. We can do
pfn-to-mfn but we cannot do mfn-to-pfn (there are good reasons for it
unfortunately). The only way to cache-flush by mfn is by issuing a
hypercall. The hypercall is implemented in arch/arm/xen/mm.c.

The pfn != bfn and pfn_valid() checks are used to detect if the page is
local (of dom0) or foreign; they work thanks to the fact that Dom0 is
1:1 mapped.
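
A rough sketch of that check (illustrative only, not the exact kernel
code; pfn_to_bfn() is the p2m lookup mentioned above):

/* a page is foreign if its pfn does not map 1:1 to the bus frame number,
 * or if it is not a valid local RAM pfn of dom0 */
static bool page_is_foreign(unsigned long pfn)
{
	return pfn_to_bfn(pfn) != pfn || !pfn_valid(pfn);
}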


Getting back to what you wrote, yes if we had a way to do MFN lookups in
phys_to_dma, and a way to call the hypercall at unmap_page if the page
is foreign (e.g. if it fails a pfn_valid check) then I think we would be
good from an ARM perspective. The only exception is when
xen_arch_need_swiotlb returns true, in which case we need to actually
bounce on swiotlb buffers.


Re: [GIT PULL] Please pull powerpc/linux.git powerpc-5.12-1 tag

2021-02-22 Thread Michael Ellerman
Rob Herring  writes:
> On Mon, Feb 22, 2021 at 6:05 AM Michael Ellerman  wrote:
>>
>> -BEGIN PGP SIGNED MESSAGE-
>> Hash: SHA256
>>
>> Hi Linus,
>>
>> Please pull powerpc updates for 5.12.
>>
>> There will be a conflict with the devicetree tree. It's OK to just take their
>> side of the conflict, we'll fix up the minor behaviour change that causes in 
>> a
>> follow-up patch.
>
> The issues turned out to be worse than just this, so I've dropped the
> conflicting change for 5.12.

OK, no worries.

cheers


Re: [GIT PULL] Please pull powerpc/linux.git powerpc-5.12-1 tag

2021-02-22 Thread Michael Ellerman
"Oliver O'Halloran"  writes:

> On Tue, Feb 23, 2021 at 9:44 AM Linus Torvalds
>  wrote:
>>
>> On Mon, Feb 22, 2021 at 4:06 AM Michael Ellerman  wrote:
>> >
>> > Please pull powerpc updates for 5.12.
>>
>> Pulled. However:
>>
>> >  mode change 100755 => 100644 
>> > tools/testing/selftests/powerpc/eeh/eeh-functions.sh
>> >  create mode 100755 tools/testing/selftests/powerpc/eeh/eeh-vf-aware.sh
>> >  create mode 100755 tools/testing/selftests/powerpc/eeh/eeh-vf-unaware.sh
>>
>> Somebody is being confused.
>>
>> Why create two new shell scripts with the proper executable bit, and
>> then remove the executable bit from an existing one?
>>
>> That just seems very inconsistent.
>
> eeh-function.sh just provides some helper functions for the other
> scripts and doesn't do anything when executed directly. I thought
> making it non-executable made sense.

Yeah I think it does make sense. It just looks a bit odd in the diffstat
like this. Maybe if we called it lib.sh it would be more obvious?

cheers


Re: linux-next: manual merge of the spi tree with the powerpc tree

2021-02-22 Thread Stephen Rothwell
Hi Stephen,

On Fri, 12 Feb 2021 15:31:42 +1100 Stephen Rothwell  
wrote:
>
> Hi all,
> 
> Today's linux-next merge of the spi tree got a conflict in:
> 
>   drivers/spi/spi-mpc52xx.c
> 
> between commit:
> 
>   e10656114d32 ("spi: mpc52xx: Avoid using get_tbl()")
> 
> from the powerpc tree and commit:
> 
>   258ea99fe25a ("spi: spi-mpc52xx: Use new structure for SPI transfer delays")
> 
> from the spi tree.
> 
> I fixed it up (see below) and can carry the fix as necessary. This
> is now fixed as far as linux-next is concerned, but any non trivial
> conflicts should be mentioned to your upstream maintainer when your tree
> is submitted for merging.  You may also want to consider cooperating
> with the maintainer of the conflicting tree to minimise any particularly
> complex conflicts.
> 
> diff --cc drivers/spi/spi-mpc52xx.c
> index e6a30f232370,36f941500676..
> --- a/drivers/spi/spi-mpc52xx.c
> +++ b/drivers/spi/spi-mpc52xx.c
> @@@ -247,8 -247,10 +247,10 @@@ static int mpc52xx_spi_fsmstate_transfe
>   /* Is the transfer complete? */
>   ms->len--;
>   if (ms->len == 0) {
>  -ms->timestamp = get_tbl();
>  +ms->timestamp = mftb();
> - ms->timestamp += ms->transfer->delay_usecs * tb_ticks_per_usec;
> + if (ms->transfer->delay.unit == SPI_DELAY_UNIT_USECS)
> + ms->timestamp += ms->transfer->delay.value *
> +  tb_ticks_per_usec;
>   ms->state = mpc52xx_spi_fsmstate_wait;
>   return FSM_CONTINUE;
>   }

This is now a conflict between the powerpc tree and Linus' tree.

-- 
Cheers,
Stephen Rothwell




[powerpc:fixes-test] BUILD SUCCESS a5c2f7d40511976f30de38b4374b8da2b39a073c

2021-02-22 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git 
fixes-test
branch HEAD: a5c2f7d40511976f30de38b4374b8da2b39a073c  powerpc/4xx: Fix build 
errors from mfdcr()

elapsed time: 724m

configs tested: 101
configs skipped: 88

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
arm defconfig
arm64allyesconfig
arm64   defconfig
arm  allyesconfig
arm  allmodconfig
mips   mtx1_defconfig
mipsmalta_qemu_32r6_defconfig
powerpcfsp2_defconfig
h8300alldefconfig
powerpcicon_defconfig
sh   se7343_defconfig
ia64zx1_defconfig
arm shannon_defconfig
armmvebu_v5_defconfig
powerpc kmeter1_defconfig
sh ecovec24_defconfig
sh  polaris_defconfig
powerpc  g5_defconfig
mips  malta_defconfig
arcnsim_700_defconfig
powerpc akebono_defconfig
powerpc ppa8548_defconfig
powerpc mpc5200_defconfig
arm davinci_all_defconfig
mips tb0219_defconfig
armkeystone_defconfig
sh  sh7785lcr_32bit_defconfig
powerpc  makalu_defconfig
armrealview_defconfig
powerpc taishan_defconfig
arm  pxa168_defconfig
arm  simpad_defconfig
mips   ci20_defconfig
powerpc mpc8560_ads_defconfig
powerpc   lite5200b_defconfig
powerpc  ppc6xx_defconfig
ia64 allmodconfig
ia64defconfig
ia64 allyesconfig
m68k allmodconfig
m68kdefconfig
m68k allyesconfig
nios2   defconfig
arc  allyesconfig
nds32 allnoconfig
c6x  allyesconfig
nds32   defconfig
nios2allyesconfig
cskydefconfig
alpha   defconfig
alphaallyesconfig
xtensa   allyesconfig
h8300allyesconfig
arc defconfig
sh   allmodconfig
parisc  defconfig
s390 allyesconfig
s390 allmodconfig
parisc   allyesconfig
s390defconfig
i386 allyesconfig
sparcallyesconfig
sparc   defconfig
i386   tinyconfig
i386defconfig
mips allyesconfig
mips allmodconfig
powerpc  allyesconfig
powerpc  allmodconfig
powerpc   allnoconfig
x86_64   randconfig-a001-20210222
x86_64   randconfig-a002-20210222
x86_64   randconfig-a003-20210222
x86_64   randconfig-a005-20210222
x86_64   randconfig-a006-20210222
x86_64   randconfig-a004-20210222
i386 randconfig-a013-20210222
i386 randconfig-a012-20210222
i386 randconfig-a011-20210222
i386 randconfig-a014-20210222
i386 randconfig-a016-20210222
i386 randconfig-a015-20210222
riscvnommu_k210_defconfig
riscvallyesconfig
riscvnommu_virt_defconfig
riscv allnoconfig
riscv   defconfig
riscv  rv32_defconfig
riscvallmodconfig
x86_64   allyesconfig
x86_64rhel-7.6-kselftests
x86_64  defconfig
x86_64   rhel-8.3
x86_64  rhel-8.3-kbuiltin
x86_64  kexec

clang tested configs:
x86_64   randconfig-a015-20210222
x86_64   randconfig-a011-20210222
x86_64   randconfig-a012-20210222
x86_64   randconfig-a016-20210222
x86_64   randconfig-a014-20210222

[powerpc:merge] BUILD SUCCESS b267c8c58643460da9159ee69f46b3945cfd9de6

2021-02-22 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git 
merge
branch HEAD: b267c8c58643460da9159ee69f46b3945cfd9de6  Automatic merge of 
'master' into merge (2021-02-22 21:30)

elapsed time: 723m

configs tested: 123
configs skipped: 2

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
arm defconfig
arm64allyesconfig
arm64   defconfig
arm  allyesconfig
arm  allmodconfig
mips   mtx1_defconfig
mipsmalta_qemu_32r6_defconfig
powerpcfsp2_defconfig
h8300alldefconfig
powerpcicon_defconfig
sh   se7343_defconfig
ia64zx1_defconfig
arm shannon_defconfig
armmvebu_v5_defconfig
powerpc kmeter1_defconfig
sh ecovec24_defconfig
powerpc redwood_defconfig
sh apsh4a3a_defconfig
powerpc  ppc44x_defconfig
sh  polaris_defconfig
powerpc  g5_defconfig
mips  malta_defconfig
arcnsim_700_defconfig
sh   se7705_defconfig
powerpc  chrp32_defconfig
ia64generic_defconfig
powerpc  bamboo_defconfig
arm  simpad_defconfig
arc nsimosci_hs_defconfig
arm   netwinder_defconfig
arm   spitz_defconfig
m68k   sun3_defconfig
powerpc mpc85xx_cds_defconfig
powerpc akebono_defconfig
mipsar7_defconfig
sparc   sparc32_defconfig
ia64  gensparse_defconfig
powerpc ppa8548_defconfig
powerpc mpc5200_defconfig
arm davinci_all_defconfig
mips tb0219_defconfig
armkeystone_defconfig
sh  sh7785lcr_32bit_defconfig
powerpc  makalu_defconfig
powerpc  walnut_defconfig
s390 allyesconfig
mipsjmr3927_defconfig
mips  maltasmvp_defconfig
m68k  multi_defconfig
armrealview_defconfig
powerpc taishan_defconfig
arm  pxa168_defconfig
mips   ci20_defconfig
ia64 allmodconfig
ia64defconfig
ia64 allyesconfig
m68k allmodconfig
m68kdefconfig
m68k allyesconfig
nios2   defconfig
arc  allyesconfig
nds32 allnoconfig
c6x  allyesconfig
nds32   defconfig
nios2allyesconfig
cskydefconfig
alpha   defconfig
alphaallyesconfig
xtensa   allyesconfig
h8300allyesconfig
arc defconfig
sh   allmodconfig
parisc  defconfig
s390 allmodconfig
parisc   allyesconfig
s390defconfig
i386 allyesconfig
sparcallyesconfig
sparc   defconfig
i386   tinyconfig
i386defconfig
mips allyesconfig
mips allmodconfig
powerpc  allyesconfig
powerpc  allmodconfig
powerpc   allnoconfig
x86_64   randconfig-a001-20210222
x86_64   randconfig-a002-20210222
x86_64   randconfig-a003-20210222
x86_64   randconfig-a005-20210222
x86_64   randconfig-a006-20210222
x86_64   randconfig-a004-20210222
i386 randconfig-a005-20210222
i386 randconfig-a006-20210222
i386 randconfig-a004-20210222
i386 randconfig-a003-20210222
i386 randconfig-a001-20210222
i386 randconfig-a002-20210222
i386 randconfig-a013-20210222
i386 randconfig-a012-20210222
i386

Re: [GIT PULL] Please pull powerpc/linux.git powerpc-5.12-1 tag

2021-02-22 Thread Oliver O'Halloran
On Tue, Feb 23, 2021 at 9:44 AM Linus Torvalds
 wrote:
>
> On Mon, Feb 22, 2021 at 4:06 AM Michael Ellerman  wrote:
> >
> > Please pull powerpc updates for 5.12.
>
> Pulled. However:
>
> >  mode change 100755 => 100644 
> > tools/testing/selftests/powerpc/eeh/eeh-functions.sh
> >  create mode 100755 tools/testing/selftests/powerpc/eeh/eeh-vf-aware.sh
> >  create mode 100755 tools/testing/selftests/powerpc/eeh/eeh-vf-unaware.sh
>
> Somebody is being confused.
>
> Why create two new shell scripts with the proper executable bit, and
> then remove the executable bit from an existing one?
>
> That just seems very inconsistent.

eeh-function.sh just provides some helper functions for the other
scripts and doesn't do anything when executed directly. I thought
making it non-executable made sense.

>
>  Linus


Re: [GIT PULL] Please pull powerpc/linux.git powerpc-5.12-1 tag

2021-02-22 Thread Linus Torvalds
On Mon, Feb 22, 2021 at 4:06 AM Michael Ellerman  wrote:
>
> Please pull powerpc updates for 5.12.

Pulled. However:

>  mode change 100755 => 100644 
> tools/testing/selftests/powerpc/eeh/eeh-functions.sh
>  create mode 100755 tools/testing/selftests/powerpc/eeh/eeh-vf-aware.sh
>  create mode 100755 tools/testing/selftests/powerpc/eeh/eeh-vf-unaware.sh

Somebody is being confused.

Why create two new shell scripts with the proper executable bit, and
then remove the executable bit from an existing one?

That just seems very inconsistent.

 Linus


Re: [PATCH 06/13] KVM: PPC: Book3S 64: Move GUEST_MODE_SKIP test into KVM

2021-02-22 Thread Fabiano Rosas
Nicholas Piggin  writes:

> Move the GUEST_MODE_SKIP logic into KVM code. This is quite a KVM
> internal detail that has no real need to be in common handlers.
>
> Also add a comment explaining why this this thing exists.

this this

>
> Signed-off-by: Nicholas Piggin 

Reviewed-by: Fabiano Rosas 

> ---
>  arch/powerpc/kernel/exceptions-64s.S | 60 --
>  arch/powerpc/kvm/book3s_64_entry.S   | 64 
>  2 files changed, 56 insertions(+), 68 deletions(-)
>
> diff --git a/arch/powerpc/kernel/exceptions-64s.S 
> b/arch/powerpc/kernel/exceptions-64s.S
> index a1640d6ea65d..96f22c582213 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -133,7 +133,6 @@ name:
>  #define IBRANCH_TO_COMMON.L_IBRANCH_TO_COMMON_\name\() /* ENTRY branch 
> to common */
>  #define IREALMODE_COMMON .L_IREALMODE_COMMON_\name\() /* Common runs in 
> realmode */
>  #define IMASK.L_IMASK_\name\()   /* IRQ soft-mask bit */
> -#define IKVM_SKIP.L_IKVM_SKIP_\name\()   /* Generate KVM skip handler */
>  #define IKVM_REAL.L_IKVM_REAL_\name\()   /* Real entry tests KVM */
>  #define __IKVM_REAL(name).L_IKVM_REAL_ ## name
>  #define IKVM_VIRT.L_IKVM_VIRT_\name\()   /* Virt entry tests KVM */
> @@ -191,9 +190,6 @@ do_define_int n
>   .ifndef IMASK
>   IMASK=0
>   .endif
> - .ifndef IKVM_SKIP
> - IKVM_SKIP=0
> - .endif
>   .ifndef IKVM_REAL
>   IKVM_REAL=0
>   .endif
> @@ -254,15 +250,10 @@ do_define_int n
>   .balign IFETCH_ALIGN_BYTES
>  \name\()_kvm:
>
> - .if IKVM_SKIP
> - cmpwi   r10,KVM_GUEST_MODE_SKIP
> - beq 89f
> - .else
>  BEGIN_FTR_SECTION
>   ld  r10,IAREA+EX_CFAR(r13)
>   std r10,HSTATE_CFAR(r13)
>  END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
> - .endif
>
>   ld  r10,IAREA+EX_CTR(r13)
>   mtctr   r10
> @@ -289,27 +280,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
>   ori r12,r12,(IVEC)
>   .endif
>   b   kvmppc_interrupt
> -
> - .if IKVM_SKIP
> -89:  mtocrf  0x80,r9
> - ld  r10,IAREA+EX_CTR(r13)
> - mtctr   r10
> - ld  r9,IAREA+EX_R9(r13)
> - ld  r10,IAREA+EX_R10(r13)
> - ld  r11,IAREA+EX_R11(r13)
> - ld  r12,IAREA+EX_R12(r13)
> - .if IHSRR_IF_HVMODE
> - BEGIN_FTR_SECTION
> - b   kvmppc_skip_Hinterrupt
> - FTR_SECTION_ELSE
> - b   kvmppc_skip_interrupt
> - ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206)
> - .elseif IHSRR
> - b   kvmppc_skip_Hinterrupt
> - .else
> - b   kvmppc_skip_interrupt
> - .endif
> - .endif
>  .endm
>
>  #else
> @@ -1128,7 +1098,6 @@ INT_DEFINE_BEGIN(machine_check)
>   ISET_RI=0
>   IDAR=1
>   IDSISR=1
> - IKVM_SKIP=1
>   IKVM_REAL=1
>  INT_DEFINE_END(machine_check)
>
> @@ -1419,7 +1388,6 @@ INT_DEFINE_BEGIN(data_access)
>   IVEC=0x300
>   IDAR=1
>   IDSISR=1
> - IKVM_SKIP=1
>   IKVM_REAL=1
>  INT_DEFINE_END(data_access)
>
> @@ -1465,7 +1433,6 @@ INT_DEFINE_BEGIN(data_access_slb)
>   IVEC=0x380
>   IRECONCILE=0
>   IDAR=1
> - IKVM_SKIP=1
>   IKVM_REAL=1
>  INT_DEFINE_END(data_access_slb)
>
> @@ -2111,7 +2078,6 @@ INT_DEFINE_BEGIN(h_data_storage)
>   IHSRR=1
>   IDAR=1
>   IDSISR=1
> - IKVM_SKIP=1
>   IKVM_REAL=1
>   IKVM_VIRT=1
>  INT_DEFINE_END(h_data_storage)
> @@ -3088,32 +3054,6 @@ EXPORT_SYMBOL(do_uaccess_flush)
>  MASKED_INTERRUPT
>  MASKED_INTERRUPT hsrr=1
>
> -#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
> -kvmppc_skip_interrupt:
> - /*
> -  * Here all GPRs are unchanged from when the interrupt happened
> -  * except for r13, which is saved in SPRG_SCRATCH0.
> -  */
> - mfspr   r13, SPRN_SRR0
> - addir13, r13, 4
> - mtspr   SPRN_SRR0, r13
> - GET_SCRATCH0(r13)
> - RFI_TO_KERNEL
> - b   .
> -
> -kvmppc_skip_Hinterrupt:
> - /*
> -  * Here all GPRs are unchanged from when the interrupt happened
> -  * except for r13, which is saved in SPRG_SCRATCH0.
> -  */
> - mfspr   r13, SPRN_HSRR0
> - addir13, r13, 4
> - mtspr   SPRN_HSRR0, r13
> - GET_SCRATCH0(r13)
> - HRFI_TO_KERNEL
> - b   .
> -#endif
> -
>   /*
>* Relocation-on interrupts: A subset of the interrupts can be delivered
>* with IR=1/DR=1, if AIL==2 and MSR.HV won't be changed by delivering
> diff --git a/arch/powerpc/kvm/book3s_64_entry.S 
> b/arch/powerpc/kvm/book3s_64_entry.S
> index 147ebf1c3c1f..820d103e5f50 100644
> --- a/arch/powerpc/kvm/book3s_64_entry.S
> +++ b/arch/powerpc/kvm/book3s_64_entry.S
> @@ -1,9 +1,10 @@
> +#include 
>  #include 
> -#include 
> +#include 
>  #include 
> -#include 
> -#include 
>  #include 
> +#include 
> +#include 
>
>  /*
>   * This is branched to from interrupt handlers in exception-64s.S which set
> @@ -19,17 +20,64 @@ kvmppc_interrupt:
> 

Re: [GIT PULL] Please pull powerpc/linux.git powerpc-5.12-1 tag

2021-02-22 Thread pr-tracker-bot
The pull request you sent on Mon, 22 Feb 2021 23:05:37 +1100:

> https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git 
> tags/powerpc-5.12-1

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/b12b47249688915e987a9a2a393b522f86f6b7ab

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


Re: [PATCH 03/13] KVM: PPC: Book3S HV: Ensure MSR[ME] is always set in guest MSR

2021-02-22 Thread Fabiano Rosas
Nicholas Piggin  writes:

> Rather than add the ME bit to the MSR when the guest is entered, make
> it clear that the hypervisor does not allow the guest to clear the bit.
>
> The ME addition is kept in the code for now, but a future patch will
> warn if it's not present.
>
> Signed-off-by: Nicholas Piggin 

Reviewed-by: Fabiano Rosas 

> ---
>  arch/powerpc/kvm/book3s_hv_builtin.c | 7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c 
> b/arch/powerpc/kvm/book3s_hv_builtin.c
> index dad118760a4e..ae8f291c5c48 100644
> --- a/arch/powerpc/kvm/book3s_hv_builtin.c
> +++ b/arch/powerpc/kvm/book3s_hv_builtin.c
> @@ -661,6 +661,13 @@ static void kvmppc_end_cede(struct kvm_vcpu *vcpu)
>
>  void kvmppc_set_msr_hv(struct kvm_vcpu *vcpu, u64 msr)
>  {
> + /*
> +  * Guest must always run with machine check interrupt
> +  * enabled.
> +  */
> + if (!(msr & MSR_ME))
> + msr |= MSR_ME;
> +
>   /*
>* Check for illegal transactional state bit combination
>* and if we find it, force the TS field to a safe state.


Re: [PATCH 02/13] powerpc/64s: remove KVM SKIP test from instruction breakpoint handler

2021-02-22 Thread Fabiano Rosas
Nicholas Piggin  writes:

> The code being executed in KVM_GUEST_MODE_SKIP is hypervisor code with
> MSR[IR]=0, so the faults of concern are the d-side ones caused by access
> to guest context by the hypervisor.
>
> Instruction breakpoint interrupts are not a concern here. It's unlikely
> any good would come of causing breaks in this code, but skipping the
> instruction that caused it won't help matters (e.g., skip the mtmsr that
> sets MSR[DR]=0 or clears KVM_GUEST_MODE_SKIP).
>
> Signed-off-by: Nicholas Piggin 

Reviewed-by: Fabiano Rosas 

> ---
>  arch/powerpc/kernel/exceptions-64s.S | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/arch/powerpc/kernel/exceptions-64s.S 
> b/arch/powerpc/kernel/exceptions-64s.S
> index 5d0ad3b38e90..5bc689a546ae 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -2597,7 +2597,6 @@ EXC_VIRT_NONE(0x5200, 0x100)
>  INT_DEFINE_BEGIN(instruction_breakpoint)
>   IVEC=0x1300
>  #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
> - IKVM_SKIP=1
>   IKVM_REAL=1
>  #endif
>  INT_DEFINE_END(instruction_breakpoint)


Re: [PATCH 01/13] powerpc/64s: Remove KVM handler support from CBE_RAS interrupts

2021-02-22 Thread Fabiano Rosas
Nicholas Piggin  writes:

> Cell does not support KVM.
>
> Signed-off-by: Nicholas Piggin 

Reviewed-by: Fabiano Rosas 

> ---
>  arch/powerpc/kernel/exceptions-64s.S | 6 --
>  1 file changed, 6 deletions(-)
>
> diff --git a/arch/powerpc/kernel/exceptions-64s.S 
> b/arch/powerpc/kernel/exceptions-64s.S
> index 39cbea495154..5d0ad3b38e90 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -2574,8 +2574,6 @@ EXC_VIRT_NONE(0x5100, 0x100)
>  INT_DEFINE_BEGIN(cbe_system_error)
>   IVEC=0x1200
>   IHSRR=1
> - IKVM_SKIP=1
> - IKVM_REAL=1
>  INT_DEFINE_END(cbe_system_error)
>
>  EXC_REAL_BEGIN(cbe_system_error, 0x1200, 0x100)
> @@ -2745,8 +2743,6 @@ EXC_COMMON_BEGIN(denorm_exception_common)
>  INT_DEFINE_BEGIN(cbe_maintenance)
>   IVEC=0x1600
>   IHSRR=1
> - IKVM_SKIP=1
> - IKVM_REAL=1
>  INT_DEFINE_END(cbe_maintenance)
>
>  EXC_REAL_BEGIN(cbe_maintenance, 0x1600, 0x100)
> @@ -2798,8 +2794,6 @@ EXC_COMMON_BEGIN(altivec_assist_common)
>  INT_DEFINE_BEGIN(cbe_thermal)
>   IVEC=0x1800
>   IHSRR=1
> - IKVM_SKIP=1
> - IKVM_REAL=1
>  INT_DEFINE_END(cbe_thermal)
>
>  EXC_REAL_BEGIN(cbe_thermal, 0x1800, 0x100)


Re: [PATCH kernel 2/2] powerpc/iommu: Do not immediately panic when failed IOMMU table allocation

2021-02-22 Thread Leonardo Bras
On Mon, 2021-02-22 at 16:24 +1100, Alexey Kardashevskiy wrote:
> 
> On 18/02/2021 06:32, Leonardo Bras wrote:
> > On Tue, 2021-02-16 at 14:33 +1100, Alexey Kardashevskiy wrote:
> > > Most platforms allocate IOMMU table structures (specifically it_map)
> > > at the boot time and when this fails - it is a valid reason for panic().
> > > 
> > > However the powernv platform allocates it_map after a device is returned
> > > to the host OS after being passed through and this happens long after
> > > the host OS booted. It is quite possible to trigger the it_map allocation
> > > panic() and kill the host even though it is not necessary - the host OS
> > > can still use the DMA bypass mode (requires a tiny fraction of it_map's
> > > memory) and even if that fails, the host OS is runnable as it was without
> > > the device for which allocating it_map causes the panic.
> > > 
> > > Instead of immediately crashing in a powernv/ioda2 system, this prints
> > > an error and continues. All other platforms still call panic().
> > > 
> > > Signed-off-by: Alexey Kardashevskiy 
> > 
> > Hello Alexey,
> > 
> > This looks like a good change, that passes panic() decision to platform
> > code. Everything looks pretty straightforward, but I have a question
> > regarding this:
> > 
> > > @@ -1930,16 +1931,16 @@ static long 
> > > pnv_pci_ioda2_setup_default_config(struct pnv_ioda_pe *pe)
> > >   res_start = pe->phb->ioda.m32_pci_base >> 
> > > tbl->it_page_shift;
> > >   res_end = min(window_size, SZ_4G) >> tbl->it_page_shift;
> > >   }
> > > - iommu_init_table(tbl, pe->phb->hose->node, res_start, res_end);
> > > - rc = pnv_pci_ioda2_set_window(&pe->table_group, 0, tbl);
> > > 
> > > + if (iommu_init_table(tbl, pe->phb->hose->node, res_start, res_end))
> > > + rc = pnv_pci_ioda2_set_window(&pe->table_group, 0, tbl);
> > > + else
> > > + rc = -ENOMEM;
> > >   if (rc) {
> > > - pe_err(pe, "Failed to configure 32-bit TCE table, err %ld\n",
> > > - rc);
> > > + pe_err(pe, "Failed to configure 32-bit TCE table, err %ld\n", 
> > > rc);
> > >   iommu_tce_table_put(tbl);
> > > - return rc;
> > > + tbl = NULL; /* This clears iommu_table_base below */
> > >   }
> > > -
> > >   if (!pnv_iommu_bypass_disabled)
> > >   pnv_pci_ioda2_set_bypass(pe, true);
> > >   
> > > 
> > > 
> > > 
> > > 
> > 
> > If I could understand correctly, previously if iommu_init_table() did
> > not panic(), and pnv_pci_ioda2_set_window() returned something other
> > than 0, it would return rc in the if (rc) clause, but now it does not
> > happen anymore, going through if (!pnv_iommu_bypass_disabled) onwards.
> > 
> > Is that desired?
> 
> 
> Yes. A PE (==device, pretty much) has 2 DMA windows:
> - the default one which requires some RAM to operate
> - a bypass mode which tells the hardware that PCI addresses are 
> statically mapped to RAM 1:1.
> 
> This bypass mode does not require extra memory to work and is used in
> most cases on bare metal, as long as the device supports 64bit DMA,
> which is everything except GPUs. Since it is cheap to enable and it is
> what we prefer anyway, there is no urge to fail.
> 
> 
> > As far as I could see, returning rc there seems a good procedure after
> > iommu_init_table returning -ENOMEM.
> 
> This change is intentional and yes it could be done by a separate patch 
> but I figured there is no that much value in splitting.

Ok then, thanks for clarifying.
FWIW:

Reviewed-by: Leonardo Bras 




Re: [GIT PULL] Please pull powerpc/linux.git powerpc-5.12-1 tag

2021-02-22 Thread Rob Herring
On Mon, Feb 22, 2021 at 6:05 AM Michael Ellerman  wrote:
>
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA256
>
> Hi Linus,
>
> Please pull powerpc updates for 5.12.
>
> There will be a conflict with the devicetree tree. It's OK to just take their
> side of the conflict, we'll fix up the minor behaviour change that causes in a
> follow-up patch.

The issues turned out to be worse than just this, so I've dropped the
conflicting change for 5.12.

Rob


[GIT PULL] Please pull powerpc/linux.git powerpc-5.12-1 tag

2021-02-22 Thread Michael Ellerman
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Hi Linus,

Please pull powerpc updates for 5.12.

There will be a conflict with the devicetree tree. It's OK to just take their
side of the conflict, we'll fix up the minor behaviour change that causes in a
follow-up patch.

There's also a trivial conflict with the spi tree.

cheers


The following changes since commit e71ba9452f0b5b2e8dc8aa5445198cd9214a6a62:

  Linux 5.11-rc2 (2021-01-03 15:55:30 -0800)

are available in the git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git 
tags/powerpc-5.12-1

for you to fetch changes up to 82d2c16b350f72aa21ac2a6860c542aa4b43a51e:

  powerpc/perf: Adds support for programming of Thresholding in P10 (2021-02-11 
23:35:36 +1100)

- --
powerpc updates for 5.12

A large series adding wrappers for our interrupt handlers, so that irq/nmi/user
tracking can be isolated in the wrappers rather than spread in each handler.

Conversion of the 32-bit syscall handling into C.

A series from Nick to streamline our TLB flushing when using the Radix MMU.

Switch to using queued spinlocks by default for 64-bit server CPUs.

A rework of our PCI probing so that it happens later in boot, when more generic
infrastructure is available.

Two small fixes to allow 32-bit little-endian processes to run on 64-bit
kernels.

Other smaller features, fixes & cleanups.

Thanks to:
  Alexey Kardashevskiy, Ananth N Mavinakayanahalli, Aneesh Kumar K.V, Athira
  Rajeev, Bhaskar Chowdhury, Cédric Le Goater, Chengyang Fan, Christophe Leroy,
  Christopher M. Riedl, Fabiano Rosas, Florian Fainelli, Frederic Barrat, Ganesh
  Goudar, Hari Bathini, Jiapeng Chong, Joseph J Allen, Kajol Jain, Markus
  Elfring, Michal Suchanek, Nathan Lynch, Naveen N. Rao, Nicholas Piggin, Oliver
  O'Halloran, Pingfan Liu, Po-Hsu Lin, Qian Cai, Ram Pai, Randy Dunlap, Sandipan
  Das, Stephen Rothwell, Tyrel Datwyler, Will Springer, Yury Norov, Zheng
  Yongjun.

- --
Alexey Kardashevskiy (3):
  powerpc/iommu/debug: Add debugfs entries for IOMMU tables
  powerpc/uaccess: Avoid might_fault() when user access is enabled
  powerpc/kuap: Restore AMR after replaying soft interrupts

Ananth N Mavinakayanahalli (2):
  powerpc/sstep: Check instruction validity against ISA version before 
emulation
  powerpc/sstep: Fix incorrect return from analyze_instr()

Aneesh Kumar K.V (3):
  powerpc/mm: Enable compound page check for both THP and HugeTLB
  powerpc/mm: Add PG_dcache_clean to indicate dcache clean state
  powerpc/mm: Remove dcache flush from memory remove.

Athira Rajeev (3):
  powerpc/perf: Include PMCs as part of per-cpu cpuhw_events struct
  powerpc/perf: Expose Performance Monitor Counter SPR's as part of 
extended regs
  powerpc/perf: Record counter overflow always if SAMPLE_IP is unset

Bhaskar Chowdhury (1):
  powerpc/44x: Fix a spelling mismach to mismatch in head_44x.S

Chengyang Fan (1):
  powerpc: remove unneeded semicolons

Christophe Leroy (38):
  powerpc/kvm: Force selection of CONFIG_PPC_FPU
  powerpc/47x: Disable 256k page size
  powerpc/44x: Remove STDBINUTILS kconfig option
  powerpc/32s: Only build hash code when CONFIG_PPC_BOOK3S_604 is selected
  powerpc/xmon: Enable breakpoints on 8xx
  powerpc/xmon: Select CONSOLE_POLL for the 8xx
  powerpc/32s: move DABR match out of handle_page_fault
  powerpc/8xx: Fix software emulation interrupt
  powerpc/uaccess: Perform barrier_nospec() in KUAP allowance helpers
  powerpc/32s: Change mfsrin() into a static inline function
  powerpc/32s: mfsrin()/mtsrin() become mfsr()/mtsr()
  powerpc/32s: Allow constant folding in mtsr()/mfsr()
  powerpc/32: Preserve cr1 in exception prolog stack check to fix build 
error
  powerpc/32s: Add missing call to kuep_lock on syscall entry
  powerpc/32: Always enable data translation on syscall entry
  powerpc/32: On syscall entry, enable instruction translation at the same 
time as data
  powerpc/32: Reorder instructions to avoid using CTR in syscall entry
  powerpc/irq: Add helper to set regs->softe
  powerpc/irq: Rework helpers that manipulate MSR[EE/RI]
  powerpc/irq: Add stub irq_soft_mask_return() for PPC32
  powerpc/syscall: Rename syscall_64.c into interrupt.c
  powerpc/syscall: Make interrupt.c buildable on PPC32
  powerpc/syscall: Use is_compat_task()
  powerpc/syscall: Save r3 in regs->orig_r3
  powerpc/syscall: Change condition to check MSR_RI
  powerpc/32: Always save non volatile GPRs at syscall entry
  powerpc/syscall: implement system call entry/exit logic in C for PPC32
  powerpc/32: Remove verification of MSR_PR on syscall in the ASM entry
  powerpc/32: Remove the counter in global_dbcr0
  powerpc/syscall: Do not check unsupported scv vector on PPC32
 

Re: [PATCH kernel] powerpc/iommu: Annotate nested lock for lockdep

2021-02-22 Thread Alexey Kardashevskiy




On 20/02/2021 14:49, Alexey Kardashevskiy wrote:



On 18/02/2021 23:59, Frederic Barrat wrote:



On 16/02/2021 04:20, Alexey Kardashevskiy wrote:

The IOMMU table is divided into pools for concurrent mappings and each
pool has a separate spinlock. When taking the ownership of an IOMMU 
group

to pass through a device to a VM, we lock these spinlocks which triggers
a false negative warning in lockdep (below).

This fixes it by annotating the large pool's spinlock as a nest lock.

===
WARNING: possible recursive locking detected
5.11.0-le_syzkaller_a+fstn1 #100 Not tainted

qemu-system-ppc/4129 is trying to acquire lock:
c000119bddb0 (&(p->lock)/1){}-{2:2}, at: 
iommu_take_ownership+0xac/0x1e0


but task is already holding lock:
c000119bdd30 (&(p->lock)/1){}-{2:2}, at: 
iommu_take_ownership+0xac/0x1e0


other info that might help us debug this:
  Possible unsafe locking scenario:

    CPU0
    
   lock(&(p->lock)/1);
   lock(&(p->lock)/1);
===

Signed-off-by: Alexey Kardashevskiy 
---
  arch/powerpc/kernel/iommu.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index 557a09dd5b2f..2ee642a6731a 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -1089,7 +1089,7 @@ int iommu_take_ownership(struct iommu_table *tbl)
  spin_lock_irqsave(&tbl->large_pool.lock, flags);
  for (i = 0; i < tbl->nr_pools; i++)
-    spin_lock(&tbl->pools[i].lock);
+    spin_lock_nest_lock(&tbl->pools[i].lock, &tbl->large_pool.lock);



We have the same pattern and therefore should have the same problem in 
iommu_release_ownership().


But as I understand, we're hacking our way around lockdep here, since 
conceptually, those locks are independent. I was wondering why it 
seems to fix it by worrying only about the large pool lock. That loop 
can take many locks (up to 4 with current config). However, if the dma 
window is less than 1GB, we would only have one, so it would make 
sense for lockdep to stop complaining. Is it what happened? In which 
case, this patch doesn't really fix it. Or I'm missing something :-)



My rough understanding is that when spin_lock_nest_lock is called the first 
time, it does some magic with lockdep classes somewhere in 
__lock_acquire()/register_lock_class(), and right after that the nested 
lock is not the same as before - it is annotated so we cannot take the 
nested locks without locking the nest lock first, and no (re)annotation 
is needed. I'll try to poke this code once again and see; it was just 
easier with p9/nested, which is gone for now because of a little snow in 
one of the southern states :)



Turns out I have good imagination and in fact it does print this huge 
warning in the release hook as well so v2 is coming. Thanks,








   Fred




  iommu_table_release_pages(tbl);





--
Alexey


[PATCH] ibmveth: Switch to using the new API kobj_to_dev()

2021-02-22 Thread Yang Li
Fix the following coccicheck warning:
./drivers/net/ethernet/ibm/ibmveth.c:1805:51-52: WARNING opportunity for
kobj_to_dev()
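
For reference, kobj_to_dev() is essentially the obvious container_of()
wrapper, roughly:

static inline struct device *kobj_to_dev(struct kobject *kobj)
{
	return container_of(kobj, struct device, kobj);
}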

Reported-by: Abaci Robot 
Signed-off-by: Yang Li 
---
 drivers/net/ethernet/ibm/ibmveth.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmveth.c b/drivers/net/ethernet/ibm/ibmveth.c
index c3ec9ce..6e9572c 100644
--- a/drivers/net/ethernet/ibm/ibmveth.c
+++ b/drivers/net/ethernet/ibm/ibmveth.c
@@ -1801,8 +1801,7 @@ static ssize_t veth_pool_store(struct kobject *kobj, struct attribute *attr,
struct ibmveth_buff_pool *pool = container_of(kobj,
  struct ibmveth_buff_pool,
  kobj);
-   struct net_device *netdev = dev_get_drvdata(
-   container_of(kobj->parent, struct device, kobj));
+   struct net_device *netdev = dev_get_drvdata(kobj_to_dev(kobj->parent));
struct ibmveth_adapter *adapter = netdev_priv(netdev);
long value = simple_strtol(buf, NULL, 10);
long rc;
-- 
1.8.3.1