Re: [PATCH v3 28/34] s390/mm: Define KMSAN metadata for vmalloc and modules

2024-01-04 Thread Heiko Carstens
On Thu, Jan 04, 2024 at 11:03:42AM +0100, Alexander Gordeev wrote:
> On Tue, Jan 02, 2024 at 04:05:31PM +0100, Heiko Carstens wrote:
> Hi Heiko,
> ...
> > > @@ -253,9 +253,17 @@ static unsigned long setup_kernel_memory_layout(void)
> > >   MODULES_END = round_down(__abs_lowcore, _SEGMENT_SIZE);
> > >   MODULES_VADDR = MODULES_END - MODULES_LEN;
> > >   VMALLOC_END = MODULES_VADDR;
> > > +#ifdef CONFIG_KMSAN
> > > + VMALLOC_END -= MODULES_LEN * 2;
> > > +#endif
> > >  
> > >   /* allow vmalloc area to occupy up to about 1/2 of the rest virtual space left */
> > >   vmalloc_size = min(vmalloc_size, round_down(VMALLOC_END / 2, _REGION3_SIZE));
> > > +#ifdef CONFIG_KMSAN
> > > + /* take 2/3 of vmalloc area for KMSAN shadow and origins */
> > > + vmalloc_size = round_down(vmalloc_size / 3, _REGION3_SIZE);
> > > + VMALLOC_END -= vmalloc_size * 2;
> > > +#endif
> > 
> > Please use
> > 
> > if (IS_ENABLED(CONFIG_KMSAN))
> > 
> > above, since this way we get more compile time checks.
> 
> This way we will get a mixture of CONFIG_KASAN and CONFIG_KMSAN
> #ifdef vs IS_ENABLED() checks within one function. I guess, we
> would rather address it with a separate cleanup?

I don't think so, since you can't convert the CONFIG_KASAN ifdef to
IS_ENABLED() here: it won't compile.

But IS_ENABLED(CONFIG_KMSAN) should work. I highly prefer IS_ENABLED() over
ifdef since it allows for better compile time checks, and you won't be
surprised by code that doesn't compile if you just change a config option.
We've seen that way too often.
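
For readers unfamiliar with the pattern, the difference can be sketched in
plain userspace C. This is only an illustration under stated assumptions:
IS_ENABLED_SKETCH() and the SKETCH_* helpers are stand-ins modelled on the
macro trick in include/linux/kconfig.h (built-in options only), and
CONFIG_DEMO_ON / CONFIG_DEMO_OFF are made-up options, not real Kconfig symbols.

```c
/*
 * Userspace stand-in for IS_ENABLED(): expands to 1 if the option is
 * defined to 1, and to 0 if it is not defined at all.
 */
#define SKETCH_PLACEHOLDER_1 0,
#define SKETCH_SECOND(ignored, val, ...) val
#define SKETCH_IS_DEFINED3(arg1_or_junk) SKETCH_SECOND(arg1_or_junk 1, 0, 0)
#define SKETCH_IS_DEFINED2(val) SKETCH_IS_DEFINED3(SKETCH_PLACEHOLDER_##val)
#define SKETCH_IS_DEFINED(x) SKETCH_IS_DEFINED2(x)
#define IS_ENABLED_SKETCH(option) SKETCH_IS_DEFINED(option)

/* Hypothetical options: one enabled, one deliberately left undefined. */
#define CONFIG_DEMO_ON 1

static unsigned long demo_vmalloc_end(unsigned long vmalloc_end,
				      unsigned long modules_len)
{
	/*
	 * Both branches are parsed and type-checked by the compiler; the
	 * one whose condition is the constant 0 is simply never executed.
	 */
	if (IS_ENABLED_SKETCH(CONFIG_DEMO_ON))
		vmalloc_end -= modules_len * 2;
	if (IS_ENABLED_SKETCH(CONFIG_DEMO_OFF))
		vmalloc_end -= modules_len * 4;
	return vmalloc_end;
}
```

With these definitions demo_vmalloc_end(1000, 100) yields 800. The
CONFIG_DEMO_OFF branch is still compiled even though it never runs, which is
the compile-time coverage an #ifdef block would not get.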



Re: [PATCH v3 34/34] kmsan: Enable on s390

2024-01-02 Thread Heiko Carstens
On Thu, Dec 14, 2023 at 12:24:54AM +0100, Ilya Leoshkevich wrote:
> Now that everything else is in place, enable KMSAN in Kconfig.
> 
> Signed-off-by: Ilya Leoshkevich 
> ---
>  arch/s390/Kconfig | 1 +
>  1 file changed, 1 insertion(+)

Acked-by: Heiko Carstens 



Re: [PATCH v3 33/34] s390: Implement the architecture-specific kmsan functions

2024-01-02 Thread Heiko Carstens
On Thu, Dec 14, 2023 at 12:24:53AM +0100, Ilya Leoshkevich wrote:
> arch_kmsan_get_meta_or_null() finds the lowcore shadow by querying the
> prefix and calling kmsan_get_metadata() again.
> 
> kmsan_virt_addr_valid() delegates to virt_addr_valid().
> 
> Signed-off-by: Ilya Leoshkevich 
> ---
>  arch/s390/include/asm/kmsan.h | 43 +++
>  1 file changed, 43 insertions(+)

Acked-by: Heiko Carstens 



Re: [PATCH v3 29/34] s390/string: Add KMSAN support

2024-01-02 Thread Heiko Carstens
On Thu, Dec 14, 2023 at 12:24:49AM +0100, Ilya Leoshkevich wrote:
> Add KMSAN support for the s390 implementations of the string functions.
> Do this similar to how it's already done for KASAN, except that the
> optimized memset{16,32,64}() functions need to be disabled: it's
> important for KMSAN to know that they initialized something.
> 
> The way boot code is built with regard to string functions is
> problematic, since most files think it's configured with sanitizers,
> but boot/string.c doesn't. This creates various problems with the
> memset64() definitions, depending on whether the code is built with
> sanitizers or fortify. This should probably be streamlined, but in the
> meantime resolve the issues by introducing the IN_BOOT_STRING_C macro,
> similar to the existing IN_ARCH_STRING_C macro.
> 
> Reviewed-by: Alexander Potapenko 
> Signed-off-by: Ilya Leoshkevich 
> ---
>  arch/s390/boot/string.c| 16 
>  arch/s390/include/asm/string.h | 20 +++-
>  2 files changed, 31 insertions(+), 5 deletions(-)

Acked-by: Heiko Carstens 



Re: [PATCH v3 32/34] s390/unwind: Disable KMSAN checks

2024-01-02 Thread Heiko Carstens
On Thu, Dec 14, 2023 at 12:24:52AM +0100, Ilya Leoshkevich wrote:
> The unwind code can read uninitialized frames. Furthermore, even in
> the good case, KMSAN does not emit shadow for backchains. Therefore
> disable it for the unwinding functions.
> 
> Reviewed-by: Alexander Potapenko 
> Signed-off-by: Ilya Leoshkevich 
> ---
>  arch/s390/kernel/unwind_bc.c | 4 
>  1 file changed, 4 insertions(+)

Acked-by: Heiko Carstens 



Re: [PATCH v3 30/34] s390/traps: Unpoison the kernel_stack_overflow()'s pt_regs

2024-01-02 Thread Heiko Carstens
On Thu, Dec 14, 2023 at 12:24:50AM +0100, Ilya Leoshkevich wrote:
> This is normally done by the generic entry code, but the
> kernel_stack_overflow() flow bypasses it.
> 
> Reviewed-by: Alexander Potapenko 
> Signed-off-by: Ilya Leoshkevich 
> ---
>  arch/s390/kernel/traps.c | 6 ++
>  1 file changed, 6 insertions(+)

Acked-by: Heiko Carstens 



Re: [PATCH v3 28/34] s390/mm: Define KMSAN metadata for vmalloc and modules

2024-01-02 Thread Heiko Carstens
On Thu, Dec 14, 2023 at 12:24:48AM +0100, Ilya Leoshkevich wrote:
> The pages for the KMSAN metadata associated with most kernel mappings
> are taken from memblock by the common code. However, vmalloc and module
> metadata needs to be defined by the architectures.
> 
> Be a little bit more careful than x86: allocate exactly MODULES_LEN
> for the module shadow and origins, and then take 2/3 of vmalloc for
> the vmalloc shadow and origins. This ensures that users passing small
> vmalloc= values on the command line do not cause module metadata
> collisions.
> 
> Reviewed-by: Alexander Potapenko 
> Signed-off-by: Ilya Leoshkevich 
> ---
>  arch/s390/boot/startup.c|  8 
>  arch/s390/include/asm/pgtable.h | 10 ++
>  2 files changed, 18 insertions(+)
> 
> diff --git a/arch/s390/boot/startup.c b/arch/s390/boot/startup.c
> index 8104e0e3d188..e37e7ffda430 100644
> --- a/arch/s390/boot/startup.c
> +++ b/arch/s390/boot/startup.c
> @@ -253,9 +253,17 @@ static unsigned long setup_kernel_memory_layout(void)
>   MODULES_END = round_down(__abs_lowcore, _SEGMENT_SIZE);
>   MODULES_VADDR = MODULES_END - MODULES_LEN;
>   VMALLOC_END = MODULES_VADDR;
> +#ifdef CONFIG_KMSAN
> + VMALLOC_END -= MODULES_LEN * 2;
> +#endif
>  
>   /* allow vmalloc area to occupy up to about 1/2 of the rest virtual space left */
>   vmalloc_size = min(vmalloc_size, round_down(VMALLOC_END / 2, _REGION3_SIZE));
> +#ifdef CONFIG_KMSAN
> + /* take 2/3 of vmalloc area for KMSAN shadow and origins */
> + vmalloc_size = round_down(vmalloc_size / 3, _REGION3_SIZE);
> + VMALLOC_END -= vmalloc_size * 2;
> +#endif

Please use

if (IS_ENABLED(CONFIG_KMSAN))

above, since this way we get more compile time checks.

> +#ifdef CONFIG_KMSAN
> +#define KMSAN_VMALLOC_SIZE (VMALLOC_END - VMALLOC_START)
> +#define KMSAN_VMALLOC_SHADOW_START VMALLOC_END
> +#define KMSAN_VMALLOC_ORIGIN_START (KMSAN_VMALLOC_SHADOW_START + \
> + KMSAN_VMALLOC_SIZE)
> +#define KMSAN_MODULES_SHADOW_START (KMSAN_VMALLOC_ORIGIN_START + \
> +     KMSAN_VMALLOC_SIZE)

Long single lines for these, please :)

With that, and Alexander Gordeev's comments addressed:
Acked-by: Heiko Carstens 
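
The layout arithmetic from the quoted hunks can be sketched as a small
standalone function. This is a hedged illustration, not the patch itself: the
sketch_* names are made up, VMALLOC_START is assumed to be
VMALLOC_END - vmalloc_size, and the module origin area is assumed to follow
the module shadow area directly (that part of the header is not quoted above).

```c
#include <stdint.h>

/* Round x down to a multiple of align (align must be a power of two). */
#define SKETCH_ROUND_DOWN(x, align) ((x) & ~((uint64_t)(align) - 1))

struct sketch_layout {
	uint64_t vmalloc_start;
	uint64_t vmalloc_end;
	uint64_t vmalloc_shadow_start;
	uint64_t vmalloc_origin_start;
	uint64_t modules_shadow_start;
	uint64_t modules_origin_start;
};

static struct sketch_layout sketch_kmsan_layout(uint64_t modules_vaddr,
						 uint64_t modules_len,
						 uint64_t vmalloc_size,
						 uint64_t region3_size)
{
	struct sketch_layout l;
	uint64_t vmalloc_end = modules_vaddr;
	uint64_t cap;

	/* Module shadow and origins: exactly 2 * MODULES_LEN below the modules. */
	vmalloc_end -= modules_len * 2;

	/* Cap the requested vmalloc size at about half of the remaining space. */
	cap = SKETCH_ROUND_DOWN(vmalloc_end / 2, region3_size);
	if (vmalloc_size > cap)
		vmalloc_size = cap;

	/* Keep 1/3 for vmalloc itself, hand 2/3 to its shadow and origins. */
	vmalloc_size = SKETCH_ROUND_DOWN(vmalloc_size / 3, region3_size);
	vmalloc_end -= vmalloc_size * 2;

	l.vmalloc_end = vmalloc_end;
	l.vmalloc_start = vmalloc_end - vmalloc_size;	/* assumption */
	l.vmalloc_shadow_start = l.vmalloc_end;
	l.vmalloc_origin_start = l.vmalloc_shadow_start + vmalloc_size;
	l.modules_shadow_start = l.vmalloc_origin_start + vmalloc_size;
	l.modules_origin_start = l.modules_shadow_start + modules_len;	/* assumption */
	return l;
}
```

Under these assumptions the module origin area ends exactly at
modules_vaddr regardless of the vmalloc= value, which matches the commit
message's point that small vmalloc sizes cannot make module metadata collide.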



Re: [PATCH v3 27/34] s390/irqflags: Do not instrument arch_local_irq_*() with KMSAN

2024-01-02 Thread Heiko Carstens
On Thu, Dec 14, 2023 at 12:24:47AM +0100, Ilya Leoshkevich wrote:
> KMSAN generates the following false positives on s390x:
> 
> [6.063666] DEBUG_LOCKS_WARN_ON(lockdep_hardirqs_enabled())
> [ ...]
> [6.577050] Call Trace:
> [6.619637]  [<0690d2de>] check_flags+0x1fe/0x210
> [6.665411] ([<0690d2da>] check_flags+0x1fa/0x210)
> [6.707478]  [<006cec1a>] lock_acquire+0x2ca/0xce0
> [6.749959]  [<069820ea>] _raw_spin_lock_irqsave+0xea/0x190
> [6.794912]  [<041fc988>] __stack_depot_save+0x218/0x5b0
> [6.838420]  [<0197affe>] __msan_poison_alloca+0xfe/0x1a0
> [6.882985]  [<07c5827c>] start_kernel+0x70c/0xd50
> [6.927454]  [<00100036>] startup_continue+0x36/0x40
> 
> Between trace_hardirqs_on() and `stosm __mask, 3` lockdep thinks that
> interrupts are on, but on the CPU they are still off. KMSAN
> instrumentation takes spinlocks, giving lockdep a chance to see and
> complain about this discrepancy.
> 
> KMSAN instrumentation is inserted in order to poison the __mask
> variable. Disable instrumentation in the respective functions. They are
> very small and it's easy to see that no important metadata updates are
> lost because of this.
> 
> Signed-off-by: Ilya Leoshkevich 
> ---
>  arch/s390/include/asm/irqflags.h | 18 +++---
>  drivers/s390/char/sclp.c |  2 +-
>  2 files changed, 16 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/s390/include/asm/irqflags.h b/arch/s390/include/asm/irqflags.h
> index 02427b205c11..7353a88b2ae2 100644
> --- a/arch/s390/include/asm/irqflags.h
> +++ b/arch/s390/include/asm/irqflags.h
> @@ -37,12 +37,19 @@ static __always_inline void __arch_local_irq_ssm(unsigned long flags)
>   asm volatile("ssm   %0" : : "Q" (flags) : "memory");
>  }
>  
> -static __always_inline unsigned long arch_local_save_flags(void)
> +#ifdef CONFIG_KMSAN
> +#define ARCH_LOCAL_IRQ_ATTRIBUTES \
> + noinline notrace __no_sanitize_memory __maybe_unused
> +#else
> +#define ARCH_LOCAL_IRQ_ATTRIBUTES __always_inline
> +#endif
> +
> +static ARCH_LOCAL_IRQ_ATTRIBUTES unsigned long arch_local_save_flags(void)
>  {

Please change this to lower case and long single lines, so it matches the
more common patterns:

#ifdef CONFIG_KMSAN
#define __arch_local_irq_attributes noinline notrace __no_sanitize_memory __maybe_unused
#else
#define __arch_local_irq_attributes __always_inline
#endif

static __arch_local_irq_attributes unsigned long arch_local_save_flags(void)

...
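
The same shape can be tried outside the kernel. The following is a rough
userspace sketch, assuming GCC or Clang attribute syntax; SKETCH_SANITIZER,
sketch_irq_attributes and sketch_save_flags() are invented for illustration,
and the kernel-only notrace and __no_sanitize_memory attributes are left out.

```c
/* Hypothetical build switch standing in for CONFIG_KMSAN. */
#define SKETCH_SANITIZER 1

#if SKETCH_SANITIZER
/* Instrumented build: keep the helper out of line. */
#define sketch_irq_attributes __attribute__((noinline, unused))
#else
/* Normal build: force inlining, as before. */
#define sketch_irq_attributes inline __attribute__((always_inline))
#endif

/* The definition itself stays the same in both configurations. */
static sketch_irq_attributes unsigned long sketch_save_flags(void)
{
	return 0x0300UL;	/* placeholder value, not a real PSW mask */
}
```

Only the one-line macro changes between configurations, so every function
that uses it keeps a single definition.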



Re: [PATCH v3 25/34] s390/diag: Unpoison diag224() output buffer

2024-01-02 Thread Heiko Carstens
On Thu, Dec 14, 2023 at 12:24:45AM +0100, Ilya Leoshkevich wrote:
> Diagnose 224 stores 4k bytes, which cannot be deduced from the inline
> assembly constraints. This leads to KMSAN false positives.
> 
> Unpoison the output buffer manually with kmsan_unpoison_memory().
> 
> Signed-off-by: Ilya Leoshkevich 
> ---
>  arch/s390/kernel/diag.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/s390/kernel/diag.c b/arch/s390/kernel/diag.c
> index 92fdc35f028c..fb83a21014d0 100644
> --- a/arch/s390/kernel/diag.c
> +++ b/arch/s390/kernel/diag.c
> @@ -9,6 +9,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -255,6 +256,7 @@ int diag224(void *ptr)
>   "1:\n"
>   EX_TABLE(0b,1b)
>   : "+d" (rc) :"d" (0), "d" (addr) : "memory");
> + kmsan_unpoison_memory(ptr, PAGE_SIZE);

Wouldn't it be better to adjust the inline assembly instead?
Something like this:

diff --git a/arch/s390/kernel/diag.c b/arch/s390/kernel/diag.c
index 92fdc35f028c..b1b0acda50c6 100644
--- a/arch/s390/kernel/diag.c
+++ b/arch/s390/kernel/diag.c
@@ -247,14 +247,18 @@ int diag224(void *ptr)
 {
unsigned long addr = __pa(ptr);
int rc = -EOPNOTSUPP;
+   struct _d {
+   char _d[4096];
+   };
 
diag_stat_inc(DIAG_STAT_X224);
-   asm volatile(
-   "   diag    %1,%2,0x224\n"
-   "0: lhi %0,0x0\n"
+   asm volatile("\n"
+   "   diag    %[type],%[addr],0x224\n"
+   "0: lhi %[rc],0\n"
"1:\n"
EX_TABLE(0b,1b)
-   : "+d" (rc) :"d" (0), "d" (addr) : "memory");
+   : [rc] "+d" (rc), "=m" (*(struct _d *)ptr)
+   : [type] "d" (0), [addr] "d" (addr));
return rc;
 }
 EXPORT_SYMBOL(diag224);



Re: [PATCH v3 26/34] s390/ftrace: Unpoison ftrace_regs in kprobe_ftrace_handler()

2024-01-02 Thread Heiko Carstens
On Thu, Dec 14, 2023 at 12:24:46AM +0100, Ilya Leoshkevich wrote:
> s390 uses assembly code to initialize ftrace_regs and call
> kprobe_ftrace_handler(). Therefore, from the KMSAN's point of view,
> ftrace_regs is poisoned on kprobe_ftrace_handler() entry. This causes
> KMSAN warnings when running the ftrace testsuite.
> 
> Fix by trusting the assembly code and always unpoisoning ftrace_regs in
> kprobe_ftrace_handler().
> 
> Reviewed-by: Alexander Potapenko 
> Signed-off-by: Ilya Leoshkevich 
> ---
>  arch/s390/kernel/ftrace.c | 2 ++
>  1 file changed, 2 insertions(+)

Acked-by: Heiko Carstens 



Re: [PATCH v3 24/34] s390/cpumf: Unpoison STCCTM output buffer

2024-01-02 Thread Heiko Carstens
On Thu, Dec 14, 2023 at 12:24:44AM +0100, Ilya Leoshkevich wrote:
> stcctm() uses the "Q" constraint for dest, therefore KMSAN does not
> understand that it fills multiple doublewords pointed to by dest, not
> just one. This results in false positives.
> 
> Unpoison the whole dest manually with kmsan_unpoison_memory().
> 
> Reported-by: Alexander Gordeev 
> Signed-off-by: Ilya Leoshkevich 
> ---
>  arch/s390/include/asm/cpu_mf.h | 6 ++
>  1 file changed, 6 insertions(+)

Acked-by: Heiko Carstens 



Re: [PATCH v3 23/34] s390/cpacf: Unpoison the results of cpacf_trng()

2024-01-02 Thread Heiko Carstens
On Thu, Dec 14, 2023 at 12:24:43AM +0100, Ilya Leoshkevich wrote:
> Prevent KMSAN from complaining about buffers filled by cpacf_trng()
> being uninitialized.
> 
> Tested-by: Alexander Gordeev 
> Reviewed-by: Alexander Potapenko 
> Signed-off-by: Ilya Leoshkevich 
> ---
>  arch/s390/include/asm/cpacf.h | 3 +++
>  1 file changed, 3 insertions(+)

Acked-by: Heiko Carstens 



Re: [PATCH v5] eventfs: Remove eventfs_file and just use eventfs_inode

2023-11-23 Thread Heiko Carstens
On Thu, Nov 23, 2023 at 10:23:49AM -0500, Steven Rostedt wrote:
> On Thu, 23 Nov 2023 12:25:48 +0100
> Heiko Carstens  wrote:
> 
> > So, if it helps (this still happens with Linus' master branch):
> > 
> > create_dir_dentry() is called with a "struct eventfs_inode *ei" (second
> > parameter), which points to a data structure where "is_freed" is 1. Then it
> > looks like create_dir() returned "-EEXIST". And looking at the code this
> > combination then must lead to d_invalidate() incorrectly being called with
> > "-EEXIST" as dentry pointer.
> 
> I haven't looked too much at the error codes, let me do that on Monday
> (it's currently Turkey weekend here in the US).
> 
> But could you test this branch:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git trace/core
> 
> I have a bunch of fixes in that branch that may fix your issue. I just
> finished testing it and plan on pushing it to Linus before the next rc
> release.

This is not that easy to reproduce, however your branch contains commit
71cade82f2b5 ("eventfs: Do not invalidate dentry in create_file/dir_dentry()")
which removes the d_invalidate() call.
The crash I reported cannot happen anymore with that commit. I'll consider
this fixed, and report again if this (or something else) still causes
problems.

Thanks!



Re: [PATCH v5] eventfs: Remove eventfs_file and just use eventfs_inode

2023-11-23 Thread Heiko Carstens
On Fri, Nov 17, 2023 at 03:38:29PM +0100, Heiko Carstens wrote:
> On Fri, Nov 17, 2023 at 03:23:35PM +0100, Heiko Carstens wrote:
> > I think this patch causes from time to time crashes when running ftrace
> > selftests. In particular I guess there is a bug wrt error handling in this
> > function (see below for call trace):
> > 
> > > +static struct dentry *
> > > +create_file_dentry(struct eventfs_inode *ei, struct dentry **e_dentry,
> > > > +struct dentry *parent, const char *name, umode_t mode, void *data,
> > > +const struct file_operations *fops, bool lookup)
> > > +{
> ...
> > Note that the compare and swap instruction within d_invalidate() generates
> > a specification exception because it operates on an invalid address
> > (0xffef), which happens to be -EEXIST. So my assumption is that
> > create_dir_dentry() has incorrect error handling and passes -EEXIST instead
> > of a valid dentry pointer to d_invalidate().
> > 
> > But I leave it up to you to figure this out :)
> 
> Ok, wrong function quoted of course. But the rest of my statement
> should be correct.

So, if it helps (this still happens with Linus' master branch):

create_dir_dentry() is called with a "struct eventfs_inode *ei" (second
parameter), which points to a data structure where "is_freed" is 1. Then it
looks like create_dir() returned "-EEXIST". And looking at the code this
combination then must lead to d_invalidate() incorrectly being called with
"-EEXIST" as dentry pointer.

Now, I have no idea how the code should work, but it is quite obvious that
something is broken :)

Here the dump of the struct eventfs_inode that was passed to
create_file_dentry() when the crash happened:

crash> struct eventfs_inode eada7680
struct eventfs_inode {
  list = {
next = 0x10f802da0,
prev = 0x122
  },
  entries = 0x12c031328 ,
  name = 0x12b90bbac <__tpstrtab_xfs_alloc_vextent_exact_bno> "xfs_alloc_vextent_exact_bno",
  children = {
next = 0xeada76a0,
prev = 0xeada76a0
  },
  dentry = 0x0,
  d_parent = 0x107c75d40,
  d_children = 0xeada5700,
  entry_attrs = 0x0,
  attr = {
mode = 0,
uid = {
  val = 0
},
gid = {
  val = 0
}
  },
  data = 0xeada6660,
  {
llist = {
  next = 0xeada7668
},
rcu = {
  next = 0xeada7668,
  func = 0x12ad2a5b8 
}
  },
  is_freed = 1,
  nr_entries = 6
}



Re: [PATCH v5] eventfs: Remove eventfs_file and just use eventfs_inode

2023-11-17 Thread Heiko Carstens
On Fri, Nov 17, 2023 at 03:23:35PM +0100, Heiko Carstens wrote:
> I think this patch causes from time to time crashes when running ftrace
> selftests. In particular I guess there is a bug wrt error handling in this
> function (see below for call trace):
> 
> > +static struct dentry *
> > +create_file_dentry(struct eventfs_inode *ei, struct dentry **e_dentry,
> > +  struct dentry *parent, const char *name, umode_t mode, void *data,
> > +  const struct file_operations *fops, bool lookup)
> > +{
...
> Note that the compare and swap instruction within d_invalidate() generates
> a specification exception because it operates on an invalid address
> (0xffef), which happens to be -EEXIST. So my assumption is that
> create_dir_dentry() has incorrect error handling and passes -EEXIST instead
> of a valid dentry pointer to d_invalidate().
> 
> But I leave it up to you to figure this out :)

Ok, wrong function quoted of course. But the rest of my statement
should be correct.
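
Why an address ending in ffef points at -EEXIST can be checked with a small
userspace sketch of the error-pointer convention. The sketch_* helpers below
are simplified stand-ins modelled on the kernel's ERR_PTR()/IS_ERR() idea, not
the kernel definitions, and the check shown is not the fix that was eventually
applied to eventfs.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static void *sketch_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True if the pointer value lies in the top SKETCH_MAX_ERRNO addresses. */
static int sketch_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)(-(intptr_t)SKETCH_MAX_ERRNO);
}

/* A pointer must pass this check before it is treated as a real object. */
static int sketch_is_err_or_null(const void *ptr)
{
	return ptr == NULL || sketch_is_err(ptr);
}

/* Low 16 bits of the pointer value, to compare with a faulting address. */
static unsigned int sketch_low16(const void *ptr)
{
	return (unsigned int)((uintptr_t)ptr & 0xffff);
}
```

EEXIST is 17 (0x11), so -EEXIST stored in a pointer ends in ...ffef, and
sketch_low16(sketch_err_ptr(-EEXIST)) is 0xffef. Any consumer that
dereferences such a value without an error check first will fault on it.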



Re: [PATCH v5] eventfs: Remove eventfs_file and just use eventfs_inode

2023-11-17 Thread Heiko Carstens
Hi Steven,

On Wed, Oct 04, 2023 at 04:50:07PM -0400, Steven Rostedt wrote:
> From: "Steven Rostedt (Google)" 
> 
> Instead of having a descriptor for every file represented in the eventfs
> directory, only have the directory itself represented. Change the API to
> send in a list of entries that represent all the files in the directory
> (but not other directories). The entry list contains a name and a callback
> function that will be used to create the files when they are accessed.
...
> Cc: Masami Hiramatsu 
> Cc: Mark Rutland 
> Cc: Andrew Morton 
> Cc: Ajay Kaher 
> Signed-off-by: Steven Rostedt (Google) 
> ---
> Changes since v4: 
> https://lore.kernel.org/linux-trace-kernel/20231003184059.49244...@gandalf.local.home/
> 
>  - Get the ei->dentry within the eventfs_mutex to keep consistency during the lookup.
> 
>  fs/tracefs/event_inode.c | 847 ++-
>  fs/tracefs/inode.c   |   2 +-
>  fs/tracefs/internal.h|  37 +-
>  include/linux/trace_events.h |   2 +-
>  include/linux/tracefs.h  |  29 +-
>  kernel/trace/trace.c |   7 +-
>  kernel/trace/trace.h |   4 +-
>  kernel/trace/trace_events.c  | 313 +
>  8 files changed, 705 insertions(+), 536 deletions(-)

I think this patch causes from time to time crashes when running ftrace
selftests. In particular I guess there is a bug wrt error handling in this
function (see below for call trace):

> +static struct dentry *
> +create_file_dentry(struct eventfs_inode *ei, struct dentry **e_dentry,
> +struct dentry *parent, const char *name, umode_t mode, void *data,
> +const struct file_operations *fops, bool lookup)
> +{
> + struct dentry *dentry;
> + bool invalidate = false;
> +
> + mutex_lock(&eventfs_mutex);
> + /* If the e_dentry already has a dentry, use it */
> + if (*e_dentry) {
> + /* lookup does not need to up the ref count */
> + if (!lookup)
> + dget(*e_dentry);
> + mutex_unlock(&eventfs_mutex);
> + return *e_dentry;
> + }
> + mutex_unlock(&eventfs_mutex);
> +
> + /* The lookup already has the parent->d_inode locked */
> + if (!lookup)
> + inode_lock(parent->d_inode);
> +
> + dentry = create_file(name, mode, parent, data, fops);
> +
> + if (!lookup)
> + inode_unlock(parent->d_inode);
> +
> + mutex_lock(&eventfs_mutex);
> +
> + if (IS_ERR_OR_NULL(dentry)) {
> + /*
> +  * When the mutex was released, something else could have
> +  * created the dentry for this e_dentry. In which case
> +  * use that one.
> +  *
> +  * Note, with the mutex held, the e_dentry cannot have content
> +  * and the ei->is_freed be true at the same time.
> +  */
> + WARN_ON_ONCE(ei->is_freed);
> + dentry = *e_dentry;
> + /* The lookup does not need to up the dentry refcount */
> + if (dentry && !lookup)
> + dget(dentry);
> + mutex_unlock(&eventfs_mutex);
> + return dentry;
> + }
> +
> + if (!*e_dentry && !ei->is_freed) {
> + *e_dentry = dentry;
> + dentry->d_fsdata = ei;
> + } else {
> + /*
> +  * Should never happen unless we get here due to being freed.
> +  * Otherwise it means two dentries exist with the same name.
> +  */
> + WARN_ON_ONCE(!ei->is_freed);
> + invalidate = true;
> + }
> + mutex_unlock(&eventfs_mutex);
> +
> + if (invalidate)
> + d_invalidate(dentry);
> +
> + if (lookup || invalidate)
> + dput(dentry);
> +
> + return invalidate ? NULL : dentry;
> +}

We sometimes see crashes like this:

specification exception: 0006 ilc:2 [#1] SMP 
CPU: 6 PID: 38815 Comm: ls Not tainted 6.7.0-20231116.rc1.git1.a7e756a5bb26.300.vr.fc38.s390x #1
Hardware name: IBM 3906 M04 704 (z/VM 7.1.0)
Krnl PSW : 0704c0018000 01682304bb00 (d_invalidate+0x30/0x110)
   R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
Krnl GPRS:  00e2 0047 00e20007
    ff7c197bf000 00e2f13b0b20 00e25bfae180
   00e2f2536000 ffef  ffef
   03ff95cacf98 00e2f29323f0 00e827c1fa18 00e827c1f9d0
Krnl Code: 01682304baf4: a718lhi %r1,0
   01682304baf8: 583003acl   %r3,940
  #01682304bafc: ba13b058cs  %r1,%r3,88(%r11)
  >01682304bb00: ec16006b007ecij %r1,0,6,01682304bbd6
   01682304bb06: e310b012ltg %r1,16(%r11)
   01682304bb0c: a784004ebrc 8,01682304bba8
   01682304bb10: b904002blgr %r2,%r11
  

Re: consolidate the flock uapi definitions

2021-04-15 Thread Heiko Carstens
On Mon, Apr 12, 2021 at 10:55:40AM +0200, Christoph Hellwig wrote:
> Hi all,
> 
> currently we deal with the slight differences in the various architecture
> variants of the flock and flock64 structures in a very crufty way.  This
> series switches to just use small arch hooks and define the rest in
> asm-generic and linux/compat.h instead.
> 
> Diffstat:
>  arch/arm64/include/asm/compat.h|   20 
>  arch/mips/include/asm/compat.h |   23 ++-
>  arch/mips/include/uapi/asm/fcntl.h |   28 +++-
>  arch/parisc/include/asm/compat.h   |   16 
>  arch/powerpc/include/asm/compat.h  |   20 
>  arch/s390/include/asm/compat.h |   20 
>  arch/sparc/include/asm/compat.h|   22 +-
>  arch/x86/include/asm/compat.h  |   24 +++-
>  include/linux/compat.h |   31 +++
>  include/uapi/asm-generic/fcntl.h   |   21 +++--
>  tools/include/uapi/asm-generic/fcntl.h |   21 +++--
>  11 files changed, 54 insertions(+), 192 deletions(-)

for the s390 bits:
Acked-by: Heiko Carstens 


Re: [PATCH 3/5] s390: Get rid of oprofile leftovers

2021-04-15 Thread Heiko Carstens
On Thu, Apr 15, 2021 at 11:47:26AM +0100, Marc Zyngier wrote:
> On Thu, 15 Apr 2021 11:38:52 +0100,
> Heiko Carstens  wrote:
> > 
> > On Wed, Apr 14, 2021 at 02:44:07PM +0100, Marc Zyngier wrote:
> > > perf_pmu_name() and perf_num_counters() are unused. Drop them.
> > > 
> > > Signed-off-by: Marc Zyngier 
> > > ---
> > >  arch/s390/kernel/perf_event.c | 21 -
> > >  1 file changed, 21 deletions(-)
> > 
> > Acked-by: Heiko Carstens 
> > 
> > ...or do you want me to pick this up and route via the s390 tree(?).
> 
> Either way works for me, but I just want to make sure the last patch
> doesn't get applied before the previous ones.

Ok, I applied this one to the s390 tree. Thanks!


Re: [PATCH 3/5] s390: Get rid of oprofile leftovers

2021-04-15 Thread Heiko Carstens
On Wed, Apr 14, 2021 at 02:44:07PM +0100, Marc Zyngier wrote:
> perf_pmu_name() and perf_num_counters() are unused. Drop them.
> 
> Signed-off-by: Marc Zyngier 
> ---
>  arch/s390/kernel/perf_event.c | 21 -
>  1 file changed, 21 deletions(-)

Acked-by: Heiko Carstens 

...or do you want me to pick this up and route via the s390 tree(?).


[GIT PULL] s390 updates for 5.12-rc8 / 5.12

2021-04-14 Thread Heiko Carstens
Hi Linus,

please pull two small s390 patches. This is also supposed to be the
last s390 pull request for 5.12. There are no known bugs left.

Thanks,
Heiko

The following changes since commit ad31a8c05196a3dc5283b193e9c74a72022d3c65:

  s390/setup: use memblock_free_late() to free old stack (2021-04-07 14:37:28 +0200)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git tags/s390-5.12-7

for you to fetch changes up to a994eddb947ea9ebb7b14d9a1267001699f0a136:

  s390/entry: save the caller of psw_idle (2021-04-12 12:44:31 +0200)


s390 updates

- setup stack backchain properly in external and i/o interrupt handler
  to fix stack unwinding. This broke when converting to generic entry.

- save caller address of psw_idle to get a sane stacktrace.


Vasily Gorbik (2):
  s390/entry: avoid setting up backchain in ext|io handlers
  s390/entry: save the caller of psw_idle

 arch/s390/kernel/entry.S | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S
index c10b9f31eef7..12de7a9c85b3 100644
--- a/arch/s390/kernel/entry.S
+++ b/arch/s390/kernel/entry.S
@@ -401,15 +401,13 @@ ENTRY(\name)
brasl   %r14,.Lcleanup_sie_int
 #endif
 0: CHECK_STACK __LC_SAVE_AREA_ASYNC
-   lgr %r11,%r15
aghi%r15,-(STACK_FRAME_OVERHEAD + __PT_SIZE)
-   stg %r11,__SF_BACKCHAIN(%r15)
j   2f
 1: BPENTER __TI_flags(%r12),_TIF_ISOLATE_BP
lctlg   %c1,%c1,__LC_KERNEL_ASCE
lg  %r15,__LC_KERNEL_STACK
-   xc  __SF_BACKCHAIN(8,%r15),__SF_BACKCHAIN(%r15)
-2: la  %r11,STACK_FRAME_OVERHEAD(%r15)
+2: xc  __SF_BACKCHAIN(8,%r15),__SF_BACKCHAIN(%r15)
+   la  %r11,STACK_FRAME_OVERHEAD(%r15)
stmg%r0,%r7,__PT_R0(%r11)
# clear user controlled registers to prevent speculative use
xgr %r0,%r0
@@ -445,6 +443,7 @@ INT_HANDLER io_int_handler,__LC_IO_OLD_PSW,do_io_irq
  * Load idle PSW.
  */
 ENTRY(psw_idle)
+   stg %r14,(__SF_GPRS+8*8)(%r15)
stg %r3,__SF_EMPTY(%r15)
larl%r1,psw_idle_exit
stg %r1,__SF_EMPTY+8(%r15)


[GIT PULL] s390 updates for 5.12-rc7

2021-04-08 Thread Heiko Carstens
Hi Linus,

please pull a couple of s390 fixes for 5.12-rc7.

Thank you,
Heiko

The following changes since commit 84d572e634e28827d105746c922d8ada425e2d8b:

  MAINTAINERS: add backups for s390 vfio drivers (2021-03-28 20:20:33 +0200)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git tags/s390-5.12-6

for you to fetch changes up to ad31a8c05196a3dc5283b193e9c74a72022d3c65:

  s390/setup: use memblock_free_late() to free old stack (2021-04-07 14:37:28 +0200)


s390 updates for 5.12-rc7

- fix incorrect dereference of the ext_params2 external interrupt parameter,
  which leads to an instant kernel crash if a pfault interrupt occurs.

- add forgotten stack unwinder support, and fix memory leak for the new
  machine check handler stack.

- fix inline assembly register clobbering due to KASAN code instrumentation.


Alexander Gordeev (1):
  s390/cpcmd: fix inline assembly register clobbering

Heiko Carstens (2):
  s390/irq: fix reading of ext_params2 field from lowcore
  s390/setup: use memblock_free_late() to free old stack

Vasily Gorbik (1):
  s390/unwind: add machine check handler stack

 arch/s390/include/asm/stacktrace.h |  1 +
 arch/s390/kernel/cpcmd.c   |  6 --
 arch/s390/kernel/dumpstack.c   | 12 +++-
 arch/s390/kernel/irq.c |  2 +-
 arch/s390/kernel/setup.c   |  2 +-
 5 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/arch/s390/include/asm/stacktrace.h b/arch/s390/include/asm/stacktrace.h
index ee056f4a4fa3..2b543163d90a 100644
--- a/arch/s390/include/asm/stacktrace.h
+++ b/arch/s390/include/asm/stacktrace.h
@@ -12,6 +12,7 @@ enum stack_type {
STACK_TYPE_IRQ,
STACK_TYPE_NODAT,
STACK_TYPE_RESTART,
+   STACK_TYPE_MCCK,
 };
 
 struct stack_info {
diff --git a/arch/s390/kernel/cpcmd.c b/arch/s390/kernel/cpcmd.c
index af013b4244d3..2da027359798 100644
--- a/arch/s390/kernel/cpcmd.c
+++ b/arch/s390/kernel/cpcmd.c
@@ -37,10 +37,12 @@ static int diag8_noresponse(int cmdlen)
 
 static int diag8_response(int cmdlen, char *response, int *rlen)
 {
+   unsigned long _cmdlen = cmdlen | 0x4000L;
+   unsigned long _rlen = *rlen;
register unsigned long reg2 asm ("2") = (addr_t) cpcmd_buf;
register unsigned long reg3 asm ("3") = (addr_t) response;
-   register unsigned long reg4 asm ("4") = cmdlen | 0x4000L;
-   register unsigned long reg5 asm ("5") = *rlen;
+   register unsigned long reg4 asm ("4") = _cmdlen;
+   register unsigned long reg5 asm ("5") = _rlen;
 
asm volatile(
"   diag%2,%0,0x8\n"
diff --git a/arch/s390/kernel/dumpstack.c b/arch/s390/kernel/dumpstack.c
index 0dc4b258b98d..db1bc00229ca 100644
--- a/arch/s390/kernel/dumpstack.c
+++ b/arch/s390/kernel/dumpstack.c
@@ -79,6 +79,15 @@ static bool in_nodat_stack(unsigned long sp, struct stack_info *info)
return in_stack(sp, info, STACK_TYPE_NODAT, top - THREAD_SIZE, top);
 }
 
+static bool in_mcck_stack(unsigned long sp, struct stack_info *info)
+{
+   unsigned long frame_size, top;
+
+   frame_size = STACK_FRAME_OVERHEAD + sizeof(struct pt_regs);
+   top = S390_lowcore.mcck_stack + frame_size;
+   return in_stack(sp, info, STACK_TYPE_MCCK, top - THREAD_SIZE, top);
+}
+
 static bool in_restart_stack(unsigned long sp, struct stack_info *info)
 {
unsigned long frame_size, top;
@@ -108,7 +117,8 @@ int get_stack_info(unsigned long sp, struct task_struct *task,
/* Check per-cpu stacks */
if (!in_irq_stack(sp, info) &&
!in_nodat_stack(sp, info) &&
-   !in_restart_stack(sp, info))
+   !in_restart_stack(sp, info) &&
+   !in_mcck_stack(sp, info))
goto unknown;
 
 recursion_check:
diff --git a/arch/s390/kernel/irq.c b/arch/s390/kernel/irq.c
index 601c21791338..714269e10eec 100644
--- a/arch/s390/kernel/irq.c
+++ b/arch/s390/kernel/irq.c
@@ -174,7 +174,7 @@ void noinstr do_ext_irq(struct pt_regs *regs)
 
memcpy(&regs->int_code, &S390_lowcore.ext_cpu_addr, 4);
regs->int_parm = S390_lowcore.ext_params;
-   regs->int_parm_long = *(unsigned long *)S390_lowcore.ext_params2;
+   regs->int_parm_long = S390_lowcore.ext_params2;
 
from_idle = !user_mode(regs) && regs->psw.addr == (unsigned long)psw_idle_exit;
if (from_idle)
diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c
index 60da976eee6f..72134f9f6ff5 100644
--- a/arch/s390/kernel/setup.c
+++ b/arch/s390/kernel/setup.c
@@ -354,7 +354,7 @@ static int __init stack_realloc(void)
if (!new)
panic("Couldn't allocate machine check stack");
WRITE_ONCE(S390_low

Re: [PATCH 17/20] kbuild: s390: use common install script

2021-04-07 Thread Heiko Carstens
On Wed, Apr 07, 2021 at 07:34:16AM +0200, Greg Kroah-Hartman wrote:
> The common scripts/install.sh script will now work for s390, no changes
> needed.  So call that instead and delete the s390-only install script.
> 
> Cc: Heiko Carstens 
> Cc: Vasily Gorbik 
> Cc: Christian Borntraeger 
> Cc: linux-s...@vger.kernel.org
> Signed-off-by: Greg Kroah-Hartman 
> ---
>  arch/s390/boot/Makefile   |  2 +-
>  arch/s390/boot/install.sh | 30 --
>  2 files changed, 1 insertion(+), 31 deletions(-)
>  delete mode 100644 arch/s390/boot/install.sh

Acked-by: Heiko Carstens 


[GIT PULL] s390 updates for 5.12-rc6

2021-03-30 Thread Heiko Carstens
Hi Linus,

please pull a couple of small updates for s390.

Thanks,
Heiko

The following changes since commit 0d02ec6b3136c73c09e7859f0d0e4e2c4c07b49b:

  Linux 5.12-rc4 (2021-03-21 14:56:43 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git tags/s390-5.12-5

for you to fetch changes up to 84d572e634e28827d105746c922d8ada425e2d8b:

  MAINTAINERS: add backups for s390 vfio drivers (2021-03-28 20:20:33 +0200)


s390 updates for 5.12-rc6

- fix incorrect initialization and update of vdso data pages, which
  results in incorrect tod clock steering, and that
  clock_gettime(CLOCK_MONOTONIC_RAW, ...) returns incorrect values.

- update MAINTAINERS for s390 vfio drivers


Heiko Carstens (3):
  s390/vdso: copy tod_steering_delta value to vdso_data page
  s390/vdso: fix tod_steering_delta type
  s390/vdso: fix initializing and updating of vdso_data

Matthew Rosato (1):
  MAINTAINERS: add backups for s390 vfio drivers

 MAINTAINERS   |  4 +++-
 arch/s390/include/asm/vdso/data.h |  2 +-
 arch/s390/kernel/time.c   | 10 --
 3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 9e876927c60d..68a562374b85 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -15634,8 +15634,8 @@ F:  Documentation/s390/pci.rst
 
 S390 VFIO AP DRIVER
 M: Tony Krowiak 
-M: Pierre Morel 
 M: Halil Pasic 
+M: Jason Herne 
 L: linux-s...@vger.kernel.org
 S: Supported
 W: http://www.ibm.com/developerworks/linux/linux390/
@@ -15647,6 +15647,7 @@ F:  drivers/s390/crypto/vfio_ap_private.h
 S390 VFIO-CCW DRIVER
 M: Cornelia Huck 
 M: Eric Farman 
+M: Matthew Rosato 
 R: Halil Pasic 
 L: linux-s...@vger.kernel.org
 L: k...@vger.kernel.org
@@ -15657,6 +15658,7 @@ F:  include/uapi/linux/vfio_ccw.h
 
 S390 VFIO-PCI DRIVER
 M: Matthew Rosato 
+M: Eric Farman 
 L: linux-s...@vger.kernel.org
 L: k...@vger.kernel.org
 S: Supported
diff --git a/arch/s390/include/asm/vdso/data.h b/arch/s390/include/asm/vdso/data.h
index 7b3cdb4a5f48..73ee89142666 100644
--- a/arch/s390/include/asm/vdso/data.h
+++ b/arch/s390/include/asm/vdso/data.h
@@ -6,7 +6,7 @@
 #include 
 
 struct arch_vdso_data {
-   __u64 tod_steering_delta;
+   __s64 tod_steering_delta;
__u64 tod_steering_end;
 };
 
diff --git a/arch/s390/kernel/time.c b/arch/s390/kernel/time.c
index 165da961f901..326cb8f75f58 100644
--- a/arch/s390/kernel/time.c
+++ b/arch/s390/kernel/time.c
@@ -80,10 +80,12 @@ void __init time_early_init(void)
 {
struct ptff_qto qto;
struct ptff_qui qui;
+   int cs;
 
/* Initialize TOD steering parameters */
tod_steering_end = tod_clock_base.tod;
-   vdso_data->arch_data.tod_steering_end = tod_steering_end;
+   for (cs = 0; cs < CS_BASES; cs++)
+   vdso_data[cs].arch_data.tod_steering_end = tod_steering_end;
 
if (!test_facility(28))
return;
@@ -366,6 +368,7 @@ static void clock_sync_global(unsigned long delta)
 {
unsigned long now, adj;
struct ptff_qto qto;
+   int cs;
 
/* Fixup the monotonic sched clock. */
tod_clock_base.eitod += delta;
@@ -381,7 +384,10 @@ static void clock_sync_global(unsigned long delta)
panic("TOD clock sync offset %li is too large to drift\n",
  tod_steering_delta);
tod_steering_end = now + (abs(tod_steering_delta) << 15);
-   vdso_data->arch_data.tod_steering_end = tod_steering_end;
+   for (cs = 0; cs < CS_BASES; cs++) {
+   vdso_data[cs].arch_data.tod_steering_end = tod_steering_end;
+   vdso_data[cs].arch_data.tod_steering_delta = tod_steering_delta;
+   }
 
/* Update LPAR offset. */
if (ptff_query(PTFF_QTO) && ptff(&qto, sizeof(qto), PTFF_QTO) == 0)


Re: [PATCH 0/3] s390 vdso fixes

2021-03-25 Thread Heiko Carstens
On Thu, Mar 25, 2021 at 04:56:18PM +0800, Li Wang wrote:
> Hi Heiko,
> 
> On Wed, Mar 24, 2021 at 5:58 AM Heiko Carstens  wrote:
> 
> > Li Wang reported that clock_gettime(CLOCK_MONOTONIC_RAW, ...) does not
> > work correctly on s390 via vdso. Debugging this also revealed an
> > unrelated bug (first patch).
> >
> > The second patch fixes the problem: the tod clock steering parameters
> > required by __arch_get_hw_counter() are only present within the first
> > element of the _vdso_data array and not at all within the _timens_data
> > array.
> >
> > Instead of working around this simply provide an s390 specific vdso
> > data page which contains the tod clock steering parameters.
> >
> > This also allows ARCH_HAS_VDSO_DATA to be removed again.
> >
> > Heiko Carstens (3):
> >   s390/vdso: fix tod clock steering
> >   s390/vdso: fix arch_data access for __arch_get_hw_counter()
> >   lib/vdso: remove struct arch_vdso_data from vdso data struct
> >
> 
> Thanks for the quick fix! I confirmed these patches work for me.
> (tested with latest mainline kernel v5.12-rc4)
> 
> Tested-by: Li Wang 

Thanks a lot for confirming! However I decided to go with the simple variant:
https://lore.kernel.org/linux-s390/YFnxr1ZlMIOIqjfq@osiris/T/#m26f94fd8ac048421a4a8870e7259a09f97840a3e

May I add your Tested-by there as well?


Re: [PATCH 1/3] s390/vdso: fix tod clock steering

2021-03-24 Thread Heiko Carstens
On Tue, Mar 23, 2021 at 10:58:17PM +0100, Heiko Carstens wrote:
> The s390 specific vdso function __arch_get_hw_counter() is supposed to
> consider tod clock steering.
> 
> If a tod clock steering event happens and the tod clock is set to a
> new value __arch_get_hw_counter() will not return the real tod clock
> value but slowly drift it from the old delta until the returned value
> finally matches the real tod clock value again.
> 
> When converting the assembler code to C it was forgotten to tell user
> space in which direction the clock has to be adjusted.
> 
> Worst case is now that instead of drifting the clock slowly it will
> jump into the opposite direction by a factor of two.
> 
> Fix this by simply providing the missing value to user space.
> 
> Fixes: 4bff8cb54502 ("s390: convert to GENERIC_VDSO")
> Cc:  # 5.10
> Signed-off-by: Heiko Carstens 
> ---
>  arch/s390/kernel/time.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/s390/kernel/time.c b/arch/s390/kernel/time.c
> index 165da961f901..e37285a5101b 100644
> --- a/arch/s390/kernel/time.c
> +++ b/arch/s390/kernel/time.c
> @@ -382,6 +382,7 @@ static void clock_sync_global(unsigned long delta)
> tod_steering_delta);
>   tod_steering_end = now + (abs(tod_steering_delta) << 15);
>   vdso_data->arch_data.tod_steering_end = tod_steering_end;
> + vdso_data->arch_data.tod_steering_delta = tod_steering_delta;

..and yet another bug: __arch_get_hw_counter() tests if
tod_steering_delta is negative.
It makes sense to give tod_steering_delta the correct type:

diff --git a/arch/s390/include/asm/vdso/data.h b/arch/s390/include/asm/vdso/data.h
index 7b3cdb4a5f48..73ee89142666 100644
--- a/arch/s390/include/asm/vdso/data.h
+++ b/arch/s390/include/asm/vdso/data.h
@@ -6,7 +6,7 @@
 #include 
 
 struct arch_vdso_data {
-   __u64 tod_steering_delta;
+   __s64 tod_steering_delta;
__u64 tod_steering_end;
 };
 
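
For illustration only, here is a small user-space model of why the type
matters. It is a sketch built from the description in this thread, not
the kernel code: steer(), negative_as_s64() and negative_as_u64() are
hypothetical names, and the ">> 15" drift step is assumed from the
"<< 15" used when tod_steering_end is computed. With an unsigned field
the "is the delta negative" test can never be true, so a negative delta
steers in the wrong direction.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of the steering adjustment: while "now" is before
 * steering_end, move the returned value by (remaining >> 15) in the
 * direction selected by the sign of the steering delta.
 */
static inline uint64_t steer(uint64_t now, uint64_t steering_end,
			     int delta_negative)
{
	uint64_t adj = steering_end - now;

	if ((int64_t)adj > 0)
		now += delta_negative ? (adj >> 15) : -(adj >> 15);
	return now;
}

/* Sign test as seen through a signed 64-bit field (__s64). */
static inline int negative_as_s64(int64_t delta)
{
	return delta < 0;
}

/* Same bits seen through an unsigned field (__u64). */
static inline int negative_as_u64(uint64_t delta)
{
	(void)delta;
	return 0;	/* "delta < 0" is always false for an unsigned type */
}
```

With a negative delta the signed variant adds the correction, while the
unsigned variant subtracts it, so the two results end up twice the
correction apart.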


Re: [PATCH 2/3] s390/vdso: fix arch_data access for __arch_get_hw_counter()

2021-03-23 Thread Heiko Carstens
On Tue, Mar 23, 2021 at 10:58:18PM +0100, Heiko Carstens wrote:
> Li Wang reported that clock_gettime(CLOCK_MONOTONIC_RAW, ...) returns
> incorrect values when time is provided via vdso instead of system call:
> 
> vdso_ts_nsec = 4484351380985507, vdso_ts.tv_sec = 4484351, vdso_ts.tv_nsec = 380985507
> sys_ts_nsec  = 1446923235377, sys_ts.tv_sec  = 1446, sys_ts.tv_nsec  = 923235377
> 
> Within the s390 specific vdso function __arch_get_hw_counter() tries
> to read tod clock steering values from the arch_data member of the
> passed in vdso_data structure.
> However only the arch_data member of the first clock source base
> (CS_HRES_COARSE) is initialized. For CS_RAW arch_data is not at all
> initialized, which explains the incorrect returned values.
> 
> It is a bit odd to provide the required tod clock steering parameters
> only within the first element of the _vdso_data array. However for
> time namespaces even no member of the _timens_data array contains the
> required data, which would make fixing __arch_get_hw_counter() quite
> complicated.
> 
> Therefore simply add an s390 specific vdso data page which contains
> the tod clock steering parameters. Everything else seems to be
> unnecessarily complex.
> 
> Reported-by: Li Wang 
> Fixes: 1ba2d6c0fd4e ("s390/vdso: simplify __arch_get_hw_counter()")
> Fixes: eeab78b05d20 ("s390/vdso: implement generic vdso time namespace 
> support")
> Link: https://lore.kernel.org/linux-s390/YFnxr1ZlMIOIqjfq@osiris
> Signed-off-by: Heiko Carstens 
> ---
>  arch/s390/Kconfig |  1 -
>  arch/s390/include/asm/vdso.h  |  4 +++-
>  arch/s390/include/asm/vdso/data.h | 13 
>  arch/s390/include/asm/vdso/datapage.h | 17 +++
>  arch/s390/include/asm/vdso/gettimeofday.h | 11 --
>  arch/s390/kernel/time.c   |  6 +++---
>  arch/s390/kernel/vdso.c   | 25 ---
>  arch/s390/kernel/vdso64/vdso64.lds.S  |  3 ++-
>  8 files changed, 56 insertions(+), 24 deletions(-)
>  delete mode 100644 arch/s390/include/asm/vdso/data.h
>  create mode 100644 arch/s390/include/asm/vdso/datapage.h

FWIW, alternatively to this and the third patch we could also do the
much shorter and simpler variant below. What I personally don't like
is that data is duplicated.
But on the other hand it is much shorter, and the more I think of it
this seems to be the way to go.
Opinions?

diff --git a/arch/s390/kernel/time.c b/arch/s390/kernel/time.c
index e37285a5101b..fa095ecf0349 100644
--- a/arch/s390/kernel/time.c
+++ b/arch/s390/kernel/time.c
@@ -80,10 +80,12 @@ void __init time_early_init(void)
 {
struct ptff_qto qto;
struct ptff_qui qui;
+   int i;
 
/* Initialize TOD steering parameters */
tod_steering_end = tod_clock_base.tod;
-   vdso_data->arch_data.tod_steering_end = tod_steering_end;
+   for (i = 0; i < CS_BASES; i++)
+   vdso_data[i].arch_data.tod_steering_end = tod_steering_end;
 
if (!test_facility(28))
return;
@@ -366,6 +368,7 @@ static void clock_sync_global(unsigned long delta)
 {
unsigned long now, adj;
struct ptff_qto qto;
+   int i;
 
/* Fixup the monotonic sched clock. */
tod_clock_base.eitod += delta;
@@ -381,8 +384,10 @@ static void clock_sync_global(unsigned long delta)
panic("TOD clock sync offset %li is too large to drift\n",
  tod_steering_delta);
tod_steering_end = now + (abs(tod_steering_delta) << 15);
-   vdso_data->arch_data.tod_steering_end = tod_steering_end;
-   vdso_data->arch_data.tod_steering_delta = tod_steering_delta;
+   for (i = 0; i < CS_BASES; i++) {
+   vdso_data[i].arch_data.tod_steering_end = tod_steering_end;
+   vdso_data[i].arch_data.tod_steering_delta = tod_steering_delta;
+   }
 
/* Update LPAR offset. */
if (ptff_query(PTFF_QTO) && ptff(&qto, sizeof(qto), PTFF_QTO) == 0)


[PATCH 2/3] s390/vdso: fix arch_data access for __arch_get_hw_counter()

2021-03-23 Thread Heiko Carstens
Li Wang reported that clock_gettime(CLOCK_MONOTONIC_RAW, ...) returns
incorrect values when time is provided via vdso instead of system call:

vdso_ts_nsec = 4484351380985507, vdso_ts.tv_sec = 4484351, vdso_ts.tv_nsec = 380985507
sys_ts_nsec  = 1446923235377, sys_ts.tv_sec  = 1446, sys_ts.tv_nsec  = 923235377

Within the s390 specific vdso function __arch_get_hw_counter() tries
to read tod clock steering values from the arch_data member of the
passed in vdso_data structure.
However only the arch_data member of the first clock source base
(CS_HRES_COARSE) is initialized. For CS_RAW arch_data is not at all
initialized, which explains the incorrect returned values.

It is a bit odd to provide the required tod clock steering parameters
only within the first element of the _vdso_data array. However for
time namespaces even no member of the _timens_data array contains the
required data, which would make fixing __arch_get_hw_counter() quite
complicated.

Therefore simply add an s390 specific vdso data page which contains
the tod clock steering parameters. Everything else seems to be
unnecessarily complex.

Reported-by: Li Wang 
Fixes: 1ba2d6c0fd4e ("s390/vdso: simplify __arch_get_hw_counter()")
Fixes: eeab78b05d20 ("s390/vdso: implement generic vdso time namespace support")
Link: https://lore.kernel.org/linux-s390/YFnxr1ZlMIOIqjfq@osiris
Signed-off-by: Heiko Carstens 
---
 arch/s390/Kconfig |  1 -
 arch/s390/include/asm/vdso.h  |  4 +++-
 arch/s390/include/asm/vdso/data.h | 13 
 arch/s390/include/asm/vdso/datapage.h | 17 +++
 arch/s390/include/asm/vdso/gettimeofday.h | 11 --
 arch/s390/kernel/time.c   |  6 +++---
 arch/s390/kernel/vdso.c   | 25 ---
 arch/s390/kernel/vdso64/vdso64.lds.S  |  3 ++-
 8 files changed, 56 insertions(+), 24 deletions(-)
 delete mode 100644 arch/s390/include/asm/vdso/data.h
 create mode 100644 arch/s390/include/asm/vdso/datapage.h

diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index c1ff874e6c2e..532ce0fcc659 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -77,7 +77,6 @@ config S390
select ARCH_HAS_STRICT_MODULE_RWX
select ARCH_HAS_SYSCALL_WRAPPER
select ARCH_HAS_UBSAN_SANITIZE_ALL
-   select ARCH_HAS_VDSO_DATA
select ARCH_HAVE_NMI_SAFE_CMPXCHG
select ARCH_INLINE_READ_LOCK
select ARCH_INLINE_READ_LOCK_BH
diff --git a/arch/s390/include/asm/vdso.h b/arch/s390/include/asm/vdso.h
index b45e32c2..0d047f519df6 100644
--- a/arch/s390/include/asm/vdso.h
+++ b/arch/s390/include/asm/vdso.h
@@ -3,17 +3,19 @@
 #define __S390_VDSO_H__
 
 #include 
+#include 
 
 /* Default link address for the vDSO */
 #define VDSO64_LBASE   0
 
-#define __VVAR_PAGES   2
+#define __VVAR_PAGES   3
 
 #define VDSO_VERSION_STRING    LINUX_2.6.29
 
 #ifndef __ASSEMBLY__
 
 extern struct vdso_data *vdso_data;
+extern struct s390_vdso_data *s390_vdso_data;
 
 int vdso_getcpu_init(void);
 
diff --git a/arch/s390/include/asm/vdso/data.h b/arch/s390/include/asm/vdso/data.h
deleted file mode 100644
index 7b3cdb4a5f48..
--- a/arch/s390/include/asm/vdso/data.h
+++ /dev/null
@@ -1,13 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef __S390_ASM_VDSO_DATA_H
-#define __S390_ASM_VDSO_DATA_H
-
-#include 
-#include 
-
-struct arch_vdso_data {
-   __u64 tod_steering_delta;
-   __u64 tod_steering_end;
-};
-
-#endif /* __S390_ASM_VDSO_DATA_H */
diff --git a/arch/s390/include/asm/vdso/datapage.h b/arch/s390/include/asm/vdso/datapage.h
new file mode 100644
index ..bfae78d814af
--- /dev/null
+++ b/arch/s390/include/asm/vdso/datapage.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __S390_ASM_VDSO_DATAPAGE_H
+#define __S390_ASM_VDSO_DATAPAGE_H
+
+#include 
+
+#ifndef __ASSEMBLY__
+
+struct s390_vdso_data {
+   __u64 tod_steering_delta;
+   __u64 tod_steering_end;
+};
+
+extern struct s390_vdso_data _s390_data __attribute__((visibility("hidden")));
+
+#endif /* __ASSEMBLY__ */
+#endif /* __S390_ASM_VDSO_DATAPAGE_H */
diff --git a/arch/s390/include/asm/vdso/gettimeofday.h b/arch/s390/include/asm/vdso/gettimeofday.h
index ed89ef742530..bbd6da6b1651 100644
--- a/arch/s390/include/asm/vdso/gettimeofday.h
+++ b/arch/s390/include/asm/vdso/gettimeofday.h
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #define vdso_calc_delta __arch_vdso_calc_delta
@@ -22,14 +23,20 @@ static __always_inline const struct vdso_data 
*__arch_get_vdso_data(void)
return _vdso_data;
 }
 
+static __always_inline const struct s390_vdso_data *__get_s390_vdso_data(void)
+{
+   return &_s390_data;
+}
+
 static inline u64 __arch_get_hw_counter(s32 clock_mode, const struct vdso_data 
*vd)
 {
+   const struct s390_vdso_data *svd = __get_s390_vdso_data();
u64 adj, now;
 
now = get_tod_clock();
-   adj = vd->arch_dat

[PATCH 1/3] s390/vdso: fix tod clock steering

2021-03-23 Thread Heiko Carstens
The s390 specific vdso function __arch_get_hw_counter() is supposed to
consider tod clock steering.

If a tod clock steering event happens and the tod clock is set to a
new value __arch_get_hw_counter() will not return the real tod clock
value but slowly drift it from the old delta until the returned value
finally matches the real tod clock value again.

When converting the assembler code to C it was forgotten to tell user
space in which direction the clock has to be adjusted.

Worst case is now that instead of drifting the clock slowly it will
jump into the opposite direction by a factor of two.

Fix this by simply providing the missing value to user space.

Fixes: 4bff8cb54502 ("s390: convert to GENERIC_VDSO")
Cc:  # 5.10
Signed-off-by: Heiko Carstens 
---
 arch/s390/kernel/time.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/s390/kernel/time.c b/arch/s390/kernel/time.c
index 165da961f901..e37285a5101b 100644
--- a/arch/s390/kernel/time.c
+++ b/arch/s390/kernel/time.c
@@ -382,6 +382,7 @@ static void clock_sync_global(unsigned long delta)
  tod_steering_delta);
tod_steering_end = now + (abs(tod_steering_delta) << 15);
vdso_data->arch_data.tod_steering_end = tod_steering_end;
+   vdso_data->arch_data.tod_steering_delta = tod_steering_delta;
 
/* Update LPAR offset. */
if (ptff_query(PTFF_QTO) && ptff(&qto, sizeof(qto), PTFF_QTO) == 0)
-- 
2.25.1



[PATCH 0/3] s390 vdso fixes

2021-03-23 Thread Heiko Carstens
Li Wang reported that clock_gettime(CLOCK_MONOTONIC_RAW, ...) does not
work correctly on s390 via vdso. Debugging this also revealed an
unrelated bug (first patch).

The second patch fixes the problem: the tod clock steering parameters
required by __arch_get_hw_counter() are only present within the first
element of the _vdso_data array and not at all within the _timens_data
array.

Instead of working around this simply provide an s390 specific vdso
data page which contains the tod clock steering parameters.

This also allows ARCH_HAS_VDSO_DATA to be removed again.

Heiko Carstens (3):
  s390/vdso: fix tod clock steering
  s390/vdso: fix arch_data access for __arch_get_hw_counter()
  lib/vdso: remove struct arch_vdso_data from vdso data struct

 arch/Kconfig  |  3 ---
 arch/s390/Kconfig |  1 -
 arch/s390/include/asm/vdso.h  |  4 +++-
 arch/s390/include/asm/vdso/data.h | 13 
 arch/s390/include/asm/vdso/datapage.h | 17 +++
 arch/s390/include/asm/vdso/gettimeofday.h | 11 --
 arch/s390/kernel/time.c   |  5 +++--
 arch/s390/kernel/vdso.c   | 25 ---
 arch/s390/kernel/vdso64/vdso64.lds.S  |  3 ++-
 include/vdso/datapage.h   | 10 -
 10 files changed, 56 insertions(+), 36 deletions(-)
 delete mode 100644 arch/s390/include/asm/vdso/data.h
 create mode 100644 arch/s390/include/asm/vdso/datapage.h

-- 
2.25.1



[PATCH 3/3] lib/vdso: remove struct arch_vdso_data from vdso data struct

2021-03-23 Thread Heiko Carstens
Since commit d60d7de3e16d ("lib/vdso: Allow to add architecture-specific
vdso data") it is possible to provide arch specific VDSO data.

This was only added for s390, which doesn't make use of this anymore.
Therefore remove it again.

Signed-off-by: Heiko Carstens 
---
 arch/Kconfig|  3 ---
 include/vdso/datapage.h | 10 --
 2 files changed, 13 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index ecfd3520b676..35c7114f7ea3 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -1147,9 +1147,6 @@ config HAVE_SPARSE_SYSCALL_NR
  entries at 4000, 5000 and 6000 locations. This option turns on syscall
  related optimizations for a given architecture.
 
-config ARCH_HAS_VDSO_DATA
-   bool
-
 config HAVE_STATIC_CALL
bool
 
diff --git a/include/vdso/datapage.h b/include/vdso/datapage.h
index 73eb622e7663..ee810cae4e1e 100644
--- a/include/vdso/datapage.h
+++ b/include/vdso/datapage.h
@@ -19,12 +19,6 @@
 #include 
 #include 
 
-#ifdef CONFIG_ARCH_HAS_VDSO_DATA
-#include 
-#else
-struct arch_vdso_data {};
-#endif
-
 #define VDSO_BASES (CLOCK_TAI + 1)
 #define VDSO_HRES  (BIT(CLOCK_REALTIME)| \
 BIT(CLOCK_MONOTONIC)   | \
@@ -70,8 +64,6 @@ struct vdso_timestamp {
  * @tz_dsttime:type of DST correction
  * @hrtimer_res:   hrtimer resolution
  * @__unused:  unused
- * @arch_data: architecture specific data (optional, defaults
- * to an empty struct)
  *
  * vdso_data will be accessed by 64 bit and compat code at the same time
  * so we should be careful before modifying this structure.
@@ -105,8 +97,6 @@ struct vdso_data {
s32 tz_dsttime;
u32 hrtimer_res;
u32 __unused;
-
-   struct arch_vdso_data   arch_data;
 };
 
 /*
-- 
2.25.1



Re: [s390x vDSO Bug?] clock_gettime(CLOCK_MONOTONIC_RAW, ...) gets abnormal ts value

2021-03-23 Thread Heiko Carstens
On Tue, Mar 23, 2021 at 08:11:41AM +0100, Heiko Carstens wrote:
> On Tue, Mar 23, 2021 at 02:21:52PM +0800, Li Wang wrote:
> > Hi linux-s390 experts,
> > 
> > We observed that LTP/clock_gettime04 always FAIL on s390x with
> > kernel-v5.12-rc3.
> > To simply show the problem, I rewrite the LTP reproducer as a simple C
> > below.
> > Maybe it's a new bug introduced from the kernel-5.12 series branch?
> > 
> > PASS:
> > 
> > # uname -r
> > 5.11.0-*.s390x
> > 
> > # grep TIME_NS /boot/config-5.11.0-*.s390x
> > no TIME_NS enabled
> > 
> > ## ./test-timer
> > vdso_ts_nsec = 898169901815, vdso_ts.tv_sec = 898, vdso_ts.tv_nsec =
> > 169901815
> > sys_ts_nsec  = 898169904269, sys_ts.tv_sec  = 898, sys_ts.tv_nsec  =
> > 169904269
> > ===> PASS
> > 
> > FAIL:
> > --
> > # uname -r
> > 5.12.0-0.rc3.*.s390x
> > 
> > # grep TIME_NS /boot/config-5.12.0-0.rc3.s390x
> > CONFIG_TIME_NS=y
> > CONFIG_GENERIC_VDSO_TIME_NS=y
> > 
> > # ./test-timer
> > vdso_ts_nsec = 4484351380985507, vdso_ts.tv_sec = 4484351, vdso_ts.tv_nsec
> > = 380985507
> > sys_ts_nsec  = 1446923235377, sys_ts.tv_sec  = 1446, sys_ts.tv_nsec  =
> > 923235377
> > ===> FAIL
> 
> Thanks for reporting!
> 
> I'll look later today into this. I would nearly bet that I broke it
> with commit f8d8977a3d97 ("s390/time: convert tod_clock_base to
> union")

So, I broke it with commit 1ba2d6c0fd4e ("s390/vdso: simplify
__arch_get_hw_counter()"). Reverting that patch will fix it for non
time namespace processes only.

The problem is that the vdso data page contains an array of struct
vdso_data's for each clock source. However only the first member of
that array contains a/the valid struct arch_vdso_data, which is
required for __arch_get_hw_counter(). Which alone is a bit odd...

However for a process which is within a time namespace there is no
(easy) way to access that page (the time namespace specific vdso data
page does not contain valid arch_vdso_data). I guess the real fix is
to simply map yet another page into the vvar mapping and put the
arch_data there. What a mess... :/
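
To make the problem concrete, here is a stripped-down user-space sketch.
The structs are simplified stand-ins, not the kernel definitions, and
init_first_only() / init_all() are hypothetical helper names. Writing
through the plain pointer only touches element 0 of the array, so the
entry used for CLOCK_MONOTONIC_RAW keeps whatever it had before; looping
over all clock source bases fills every element.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins for the sketch; other members are omitted. */
enum { CS_HRES_COARSE = 0, CS_RAW = 1, CS_BASES = 2 };

struct arch_vdso_data {
	int64_t tod_steering_delta;
	uint64_t tod_steering_end;
};

struct vdso_data {
	struct arch_vdso_data arch_data;
};

/* vd->arch_data is the same as vd[0].arch_data: only the first base. */
static inline void init_first_only(struct vdso_data *vd, uint64_t end)
{
	vd->arch_data.tod_steering_end = end;
}

/* Duplicate the steering data into every clock source base. */
static inline void init_all(struct vdso_data *vd, uint64_t end)
{
	int cs;

	for (cs = 0; cs < CS_BASES; cs++)
		vd[cs].arch_data.tod_steering_end = end;
}
```

Note that this sketch says nothing about the time namespace page, which
is a separate mapping and is not covered by such a loop.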


Re: [s390x vDSO Bug?] clock_gettime(CLOCK_MONOTONIC_RAW, ...) gets abnormal ts value

2021-03-23 Thread Heiko Carstens
On Tue, Mar 23, 2021 at 02:21:52PM +0800, Li Wang wrote:
> Hi linux-s390 experts,
> 
> We observed that LTP/clock_gettime04 always FAIL on s390x with
> kernel-v5.12-rc3.
> To simply show the problem, I rewrite the LTP reproducer as a simple C
> below.
> Maybe it's a new bug introduced from the kernel-5.12 series branch?
> 
> PASS:
> 
> # uname -r
> 5.11.0-*.s390x
> 
> # grep TIME_NS /boot/config-5.11.0-*.s390x
> no TIME_NS enabled
> 
> ## ./test-timer
> vdso_ts_nsec = 898169901815, vdso_ts.tv_sec = 898, vdso_ts.tv_nsec =
> 169901815
> sys_ts_nsec  = 898169904269, sys_ts.tv_sec  = 898, sys_ts.tv_nsec  =
> 169904269
> ===> PASS
> 
> FAIL:
> --
> # uname -r
> 5.12.0-0.rc3.*.s390x
> 
> # grep TIME_NS /boot/config-5.12.0-0.rc3.s390x
> CONFIG_TIME_NS=y
> CONFIG_GENERIC_VDSO_TIME_NS=y
> 
> # ./test-timer
> vdso_ts_nsec = 4484351380985507, vdso_ts.tv_sec = 4484351, vdso_ts.tv_nsec
> = 380985507
> sys_ts_nsec  = 1446923235377, sys_ts.tv_sec  = 1446, sys_ts.tv_nsec  =
> 923235377
> ===> FAIL

Thanks for reporting!

I'll look later today into this. I would nearly bet that I broke it
with commit f8d8977a3d97 ("s390/time: convert tod_clock_base to
union")


Re: [PATCH] s390/crc32-vx: Couple of typo fixes

2021-03-22 Thread Heiko Carstens
On Mon, Mar 22, 2021 at 06:35:33PM +0530, Bhaskar Chowdhury wrote:
> 
> s/defintions/definitions/
> s/intermedate/intermediate/
> 
> Signed-off-by: Bhaskar Chowdhury 
> ---
>  arch/s390/crypto/crc32be-vx.S | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Applied, thanks.


Re: [PATCH v1 1/2] s390/kvm: split kvm_s390_real_to_abs

2021-03-22 Thread Heiko Carstens
On Mon, Mar 22, 2021 at 10:53:46AM +0100, David Hildenbrand wrote:
> > > diff --git a/arch/s390/kvm/gaccess.h b/arch/s390/kvm/gaccess.h
> > > index daba10f76936..7c72a5e3449f 100644
> > > --- a/arch/s390/kvm/gaccess.h
> > > +++ b/arch/s390/kvm/gaccess.h
> > > @@ -18,17 +18,14 @@
> > >/**
> > > * kvm_s390_real_to_abs - convert guest real address to guest absolute 
> > > address
> > > - * @vcpu - guest virtual cpu
> > > + * @prefix - guest prefix
> > > * @gra - guest real address
> > > *
> > > * Returns the guest absolute address that corresponds to the passed 
> > > guest real
> > > - * address @gra of a virtual guest cpu by applying its prefix.
> > > + * address @gra of by applying the given prefix.
> > > */
> > > -static inline unsigned long kvm_s390_real_to_abs(struct kvm_vcpu *vcpu,
> > > -  unsigned long gra)
> > > +static inline unsigned long _kvm_s390_real_to_abs(u32 prefix, unsigned 
> > > long gra)
> > 
> > 
> > Just a matter of taste, but maybe this could be named differently?
> > kvm_s390_real2abs_prefix() ? kvm_s390_prefix_real_to_abs()?
> > 
> 
> +1, I also dislike these "_.*" style functions here.

Yes, let's bikeshed then :)

Could you then please try to rename page_to* and everything that looks
similar to page2* please? I'm wondering what the response will be..


Re: [PATCH] s390/kernel: Fix a typo

2021-03-22 Thread Heiko Carstens
On Mon, Mar 22, 2021 at 11:55:00AM +0530, Bhaskar Chowdhury wrote:
> 
> s/struture/structure/
> 
> Signed-off-by: Bhaskar Chowdhury 
> ---
>  arch/s390/kernel/os_info.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/s390/kernel/os_info.c b/arch/s390/kernel/os_info.c
> index 0a5e4bafb6ad..5a7420b23aa8 100644
> --- a/arch/s390/kernel/os_info.c
> +++ b/arch/s390/kernel/os_info.c
> @@ -52,7 +52,7 @@ void os_info_entry_add(int nr, void *ptr, u64 size)
>  }
> 
>  /*
> - * Initialize OS info struture and set lowcore pointer
> + * Initialize OS info structure and set lowcore pointer

Applied, thanks.


[GIT PULL] s390 updates for 5.12-rc4

2021-03-19 Thread Heiko Carstens
Hi Linus,

please pull three s390 specific bug fixes for 5.12-rc4.

Thanks,
Heiko

The following changes since commit 1e28eed17697bcf343c6743f0028cc3b5dd88bf0:

  Linux 5.12-rc3 (2021-03-14 14:41:02 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git tags/s390-5.12-4

for you to fetch changes up to 0b13525c20febcfecccf6fc1db5969727401317d:

  s390/pci: fix leak of PCI device structure (2021-03-15 19:10:56 +0100)


s390 updates for 5.12-rc4

- disable preemption when accessing local per-cpu variables in the new
  counter set driver

- fix steal time that was increased by a factor of four due to a
  missing cputime_to_nsecs() conversion

- fix PCI device structure leak
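
As a rough illustration of where a factor of about four can come from:
assuming CPU time is kept in TOD clock units, with 4096 units per
microsecond, one nanosecond corresponds to roughly 4.096 units. The
helper below is a user-space sketch with a hypothetical name
(tod_units_to_ns); it multiplies by 125/512 in two steps to avoid
overflowing 64 bits. Passing raw units where nanoseconds are expected
would overstate the value by about that factor.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sketch: convert TOD clock units (assumed 4096 per microsecond) to
 * nanoseconds. ns = units * 1000 / 4096 = units * 125 / 512, split
 * into high and low parts to keep the multiplication within 64 bits.
 */
static inline uint64_t tod_units_to_ns(uint64_t units)
{
	return ((units >> 9) * 125) + (((units & 0x1ff) * 125) >> 9);
}
```

For example, 4096 units convert to 1000 ns under this assumption.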


Gerald Schaefer (1):
  s390/vtime: fix increased steal time accounting

Niklas Schnelle (1):
  s390/pci: fix leak of PCI device structure

Thomas Richter (1):
  s390/cpumf: disable preemption when accessing per-cpu variable

 arch/s390/include/asm/pci.h  |  2 +-
 arch/s390/kernel/perf_cpum_cf_diag.c |  3 ++-
 arch/s390/kernel/vtime.c |  2 +-
 arch/s390/pci/pci.c  | 28 
 arch/s390/pci/pci_event.c| 18 ++
 drivers/pci/hotplug/s390_pci_hpc.c   |  3 ++-
 6 files changed, 36 insertions(+), 20 deletions(-)

diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
index 053fe8b8dec7..a75d94a9bcb2 100644
--- a/arch/s390/include/asm/pci.h
+++ b/arch/s390/include/asm/pci.h
@@ -202,7 +202,7 @@ extern unsigned int s390_pci_no_rid;
 - 
*/
 /* Base stuff */
 int zpci_create_device(u32 fid, u32 fh, enum zpci_state state);
-void zpci_remove_device(struct zpci_dev *zdev);
+void zpci_remove_device(struct zpci_dev *zdev, bool set_error);
 int zpci_enable_device(struct zpci_dev *);
 int zpci_disable_device(struct zpci_dev *);
 int zpci_register_ioat(struct zpci_dev *, u8, u64, u64, u64);
diff --git a/arch/s390/kernel/perf_cpum_cf_diag.c b/arch/s390/kernel/perf_cpum_cf_diag.c
index bc302b86ce28..2e3e7edbe3a0 100644
--- a/arch/s390/kernel/perf_cpum_cf_diag.c
+++ b/arch/s390/kernel/perf_cpum_cf_diag.c
@@ -968,7 +968,7 @@ static int cf_diag_all_start(void)
  */
 static size_t cf_diag_needspace(unsigned int sets)
 {
-   struct cpu_cf_events *cpuhw = this_cpu_ptr(&cpu_cf_events);
+   struct cpu_cf_events *cpuhw = get_cpu_ptr(&cpu_cf_events);
size_t bytes = 0;
int i;
 
@@ -984,6 +984,7 @@ static size_t cf_diag_needspace(unsigned int sets)
 sizeof(((struct s390_ctrset_cpudata *)0)->no_sets));
debug_sprintf_event(cf_diag_dbg, 5, "%s bytes %ld\n", __func__,
bytes);
+   put_cpu_ptr(&cpu_cf_events);
return bytes;
 }
 
diff --git a/arch/s390/kernel/vtime.c b/arch/s390/kernel/vtime.c
index 73c7afcc0527..f216a1b2f825 100644
--- a/arch/s390/kernel/vtime.c
+++ b/arch/s390/kernel/vtime.c
@@ -214,7 +214,7 @@ void vtime_flush(struct task_struct *tsk)
avg_steal = S390_lowcore.avg_steal_timer / 2;
if ((s64) steal > 0) {
S390_lowcore.steal_timer = 0;
-   account_steal_time(steal);
+   account_steal_time(cputime_to_nsecs(steal));
avg_steal += steal;
}
S390_lowcore.avg_steal_timer = avg_steal;
diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
index 600881d894dd..91064077526d 100644
--- a/arch/s390/pci/pci.c
+++ b/arch/s390/pci/pci.c
@@ -682,16 +682,36 @@ int zpci_disable_device(struct zpci_dev *zdev)
 }
 EXPORT_SYMBOL_GPL(zpci_disable_device);
 
-void zpci_remove_device(struct zpci_dev *zdev)
+/* zpci_remove_device - Removes the given zdev from the PCI core
+ * @zdev: the zdev to be removed from the PCI core
+ * @set_error: if true the device's error state is set to permanent failure
+ *
+ * Sets a zPCI device to a configured but offline state; the zPCI
+ * device is still accessible through its hotplug slot and the zPCI
+ * API but is removed from the common code PCI bus, making it
+ * no longer available to drivers.
+ */
+void zpci_remove_device(struct zpci_dev *zdev, bool set_error)
 {
struct zpci_bus *zbus = zdev->zbus;
struct pci_dev *pdev;
 
+   if (!zdev->zbus->bus)
+   return;
+
pdev = pci_get_slot(zbus->bus, zdev->devfn);
if (pdev) {
-   if (pdev->is_virtfn)
-   return zpci_iov_remove_virtfn(pdev, zdev->vfn);
+   if (set_error)
+   pdev->error_state = pci_channel_io_perm_failure;
+   if (pdev->is_virtfn) {
+   zpci_iov_remove_virtfn(pdev, zdev->vfn);
+   /* balance pci_get_slot */
+   pci_dev_put(pdev);
+   return;
+   }
  

Re: linux-next: Tree for Mar 19

2021-03-19 Thread Heiko Carstens
On Fri, Mar 19, 2021 at 05:59:50PM +1100, Stephen Rothwell wrote:
> Hi all,
> 
> Warning: Some of the branches in linux-next may still be based on v5.12-rc1,
> so please be careful if you are trying to bisect a bug.
> 
> News: if your -next included tree is based on Linus' tree tag
> v5.12-rc1{,-dontuse} (or somewhere between v5.11 and that tag), please
> consider rebasing it onto v5.12-rc2. Also, please check any branches
> merged into your branch.
> 
> Changes since 20210318:
> 
> The net-next tree gained a conflict against the net tree.
> 
> The amdgpu tree gained a build failure so I used the version from
> next-20210318.
> 
> The security tree gained a conflict against the ext3 tree.
> 
> The rcu tree lost its build failure.
> 
> The akpm-current tree still had its build failure for which I applied
> a hack.
> 
> The akpm tree gained a conflict against the security tree.
> 
> Non-merge commits (relative to Linus' tree): 5051
>  4781 files changed, 329814 insertions(+), 90904 deletions(-)
...
> Merging rust/rust-next (8ef6f74a3571 Rust support)

This breaks now on s390 with commit 8ef6f74a3571 ("Rust support").
make modules_install / depmod now fails with:

depmod: WARNING: /.../lib/modules/5.12.0-rc3-1-g8ef6f74a3571/kernel/drivers/s390/scsi/zfcp.ko needs unknown symbol

for every module (yes, the line is complete).


Re: s390: kernel/entry.o: in function `sys_call_table_emu': (.rodata+0x1bc0): undefined reference to `__s390_'

2021-03-18 Thread Heiko Carstens
On Thu, Mar 18, 2021 at 10:05:01AM -0700, Nick Desaulniers wrote:
> (Replying to 
> https://lore.kernel.org/linux-s390/ca+g9fytbw0hav5ooayck2rz_m2sj73krxpj0idzt+o8qtc1...@mail.gmail.com/)
> 
> Yeah, our CI is failing today, too with the same error on linux-next:
> https://github.com/ClangBuiltLinux/continuous-integration2/runs/2138006304?check_suite_focus=true

It is fixed in linux-next of today.


Re: [PATCH] memcg: set page->private before calling swap_readpage

2021-03-18 Thread Heiko Carstens
On Wed, Mar 17, 2021 at 06:59:59PM -0700, Shakeel Butt wrote:
> The function swap_readpage() (and other functions it calls) extracts swap
> entry from page->private. However for SWP_SYNCHRONOUS_IO, the kernel
> skips the swapcache and thus we need to manually set the page->private
> with the swap entry before calling swap_readpage().
> 
> Signed-off-by: Shakeel Butt 
> Reported-by: Heiko Carstens 
> ---
> 
> Andrew, please squash this into "memcg: charge before adding to
> swapcache on swapin" patch.
> 
>  mm/memory.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index aefd158ae1ea..b6f3410b5902 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3324,7 +3324,11 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
>   workingset_refault(page, shadow);
>  
>   lru_cache_add(page);
> +
> + /* To provide entry to swap_readpage() */
> + set_page_private(page, entry.val);
>   swap_readpage(page, true);
> + set_page_private(page, 0);

Yes, this seems to fix it. Thanks a lot!

Tested-by: Heiko Carstens 


Re: linux-next: Tree for Mar 17

2021-03-17 Thread Heiko Carstens
On Wed, Mar 17, 2021 at 07:42:41PM +1100, Stephen Rothwell wrote:
> Hi all,
> 
> News: there will be no linux-next release on Friday this week.
> 
> Warning: Some of the branches in linux-next are still based on v5.12-rc1,
> so please be careful if you are trying to bisect a bug.
> 
> News: if your -next included tree is based on Linus' tree tag
> v5.12-rc1{,-dontuse} (or somewhere between v5.11 and that tag), please
> consider rebasing it onto v5.12-rc2. Also, please check any branches
> merged into your branch.
> 
> Changes since 20210316:
> 
> New tree: cifsd
> 
> The cifsd tree gained a build failure for which I applied a patch.
> 
> The drm-intel tree gained a conflict against the drm tree.
> 
> The tip tree gained a build failure so I used the version from
> next-20210316.
> 
> The rcu tree gained a build failure so I used the version from
> next-20210316.
> 
> Non-merge commits (relative to Linus' tree): 4404
>  4125 files changed, 288074 insertions(+), 79674 deletions(-)
> 
> 
> 
> I have created today's linux-next tree at
> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
> (patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you

This does unfortunately not compile on s390 due to commit 72dd1ce7ebd3
("quota: wire up quotactl_path").

The patch below fixes it:

diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl
index 4aeaa89fa774..a421905c36e8 100644
--- a/arch/s390/kernel/syscalls/syscall.tbl
+++ b/arch/s390/kernel/syscalls/syscall.tbl
@@ -445,4 +445,4 @@
 440  common	process_madvise		sys_process_madvise		sys_process_madvise
 441  common	epoll_pwait2		sys_epoll_pwait2		compat_sys_epoll_pwait2
 442  common	mount_setattr		sys_mount_setattr		sys_mount_setattr
-443  common	quotactl_path		sys_quotactl_path
+443  common	quotactl_path		sys_quotactl_path		sys_quotactl_path


[GIT PULL] s390 updates for 5.12-rc3

2021-03-10 Thread Heiko Carstens
Hi Linus,

please pull some s390 updates for 5.12-rc3. All of this has actually
been in linux-next since the beginning of last week; however, I rebased
it from rc1 to rc2.

Thanks,
Heiko

The following changes since commit a38fd8748464831584a19438cbb3082b5a2dab15:

  Linux 5.12-rc2 (2021-03-05 17:33:41 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git tags/s390-5.12-3

for you to fetch changes up to 78c7cccaab9d5f9ead44579d79dd7d13a05aec7e:

  s390: remove IBM_PARTITION and CONFIGFS_FS from zfcpdump defconfig (2021-03-08 10:46:30 +0100)


s390 updates for 5.12-rc3

- fix various user space visible copy_to_user() instances which return the
  number of bytes left to copy instead of -EFAULT

- make TMPFS_INODE64 available again for s390 and alpha, now that both
  architectures have been switched to 64-bit ino_t
  see commit 96c0a6a72d18 ("s390,alpha: switch to 64-bit ino_t")

- make sure to release a shared hypervisor resource within the zcore device
  driver also on restart and power down; also remove unneeded surrounding
  debugfs_create return value checks

- for the new hardware counter set device driver rename the uapi header file to
  be a bit more generic; also remove the 60 second read limit, which is not
  really necessary; without the limit the interface is easier to test

- some small cleanups, the largest being to convert all long long in our time
  and idle code to longs

- update defconfigs


Alexander Egorenkov (3):
  s390/zcore: no need to check return value of debugfs_create functions
  s390/zcore: release dump save area on restart or power down
  s390: remove IBM_PARTITION and CONFIGFS_FS from zfcpdump defconfig

Eric Farman (1):
  s390/cio: return -EFAULT if copy_to_user() fails

Heiko Carstens (4):
  s390/time,idle: get rid of unsigned long long
  s390/topology: remove always false if check
  s390,alpha: make TMPFS_INODE64 available again
  s390: update defconfigs

Jiapeng Chong (1):
  s390/cpumf: remove unneeded semicolon

Joe Perches (1):
  s390/tty3270: avoid comma separated statements

Thomas Richter (2):
  s390/cpumf: remove 60 seconds read limit
  s390/cpumf: rename header file to hwctrset.h

Wang Qing (2):
  s390/cio: return -EFAULT if copy_to_user() fails
  s390/crypto: return -EFAULT if copy_to_user() fails

 arch/s390/configs/debug_defconfig                | 16 ++--
 arch/s390/configs/defconfig                      | 11 +-
 arch/s390/configs/zfcpdump_defconfig             |  3 --
 arch/s390/include/asm/idle.h                     | 12 +++---
 arch/s390/include/asm/timex.h                    | 36 +-
 .../uapi/asm/{perf_cpum_cf_diag.h => hwctrset.h} |  0
 arch/s390/kernel/idle.c                          | 12 +++---
 arch/s390/kernel/perf_cpum_cf.c                  |  2 +-
 arch/s390/kernel/perf_cpum_cf_diag.c             | 20 ++
 arch/s390/kernel/time.c                          | 28 +++---
 arch/s390/kernel/topology.c                      |  2 -
 arch/s390/kvm/interrupt.c                        |  2 +-
 drivers/s390/char/tty3270.c                      |  6 ++-
 drivers/s390/char/zcore.c                        | 44 +-
 drivers/s390/cio/device_fsm.c                    |  2 +-
 drivers/s390/cio/vfio_ccw_ops.c                  |  6 +--
 drivers/s390/crypto/vfio_ap_ops.c                |  2 +-
 fs/Kconfig                                       |  2 +-
 18 files changed, 91 insertions(+), 115 deletions(-)
 rename arch/s390/include/uapi/asm/{perf_cpum_cf_diag.h => hwctrset.h} (100%)

diff --git a/arch/s390/configs/debug_defconfig b/arch/s390/configs/debug_defconfig
index 02056b024091..dc0b69058ac4 100644
--- a/arch/s390/configs/debug_defconfig
+++ b/arch/s390/configs/debug_defconfig
@@ -275,9 +275,9 @@ CONFIG_IP_VS_DH=m
 CONFIG_IP_VS_SH=m
 CONFIG_IP_VS_SED=m
 CONFIG_IP_VS_NQ=m
+CONFIG_IP_VS_TWOS=m
 CONFIG_IP_VS_FTP=m
 CONFIG_IP_VS_PE_SIP=m
-CONFIG_NF_TABLES_IPV4=y
 CONFIG_NFT_FIB_IPV4=m
 CONFIG_NF_TABLES_ARP=y
 CONFIG_IP_NF_IPTABLES=m
@@ -298,7 +298,6 @@ CONFIG_IP_NF_SECURITY=m
 CONFIG_IP_NF_ARPTABLES=m
 CONFIG_IP_NF_ARPFILTER=m
 CONFIG_IP_NF_ARP_MANGLE=m
-CONFIG_NF_TABLES_IPV6=y
 CONFIG_NFT_FIB_IPV6=m
 CONFIG_IP6_NF_IPTABLES=m
 CONFIG_IP6_NF_MATCH_AH=m
@@ -481,7 +480,6 @@ CONFIG_NLMON=m
 # CONFIG_NET_VENDOR_AQUANTIA is not set
 # CONFIG_NET_VENDOR_ARC is not set
 # CONFIG_NET_VENDOR_ATHEROS is not set
-# CONFIG_NET_VENDOR_AURORA is not set
 # CONFIG_NET_VENDOR_BROADCOM is not set
 # CONFIG_NET_VENDOR_BROCADE is not set
 # CONFIG_NET_VENDOR_CADENCE is not set
@@ -581,7 +579,6 @@ CONFIG_VIRTIO_BALLOON=m
 CONFIG_VIRTIO_INPUT=y
 CONFIG_VHOST_NET=m
 CONFIG_VHOST_VSOCK=m
-# CONFIG_SURFACE_PLATFORMS is not set
 CONFIG_S390_CCW_IOMMU=y
 CONFIG

Re: [PATCH 0/6] mm: some config cleanups

2021-03-09 Thread Heiko Carstens
On Tue, Mar 09, 2021 at 02:03:04PM +0530, Anshuman Khandual wrote:
> This series contains config cleanup patches which reduce code duplication
> across platforms and also improve maintainability. There is no functional
> change intended with this series. This has been boot tested on arm64 but
> only build tested on some other platforms.
> 
> This applies on 5.12-rc2
> 
> Cc: x...@kernel.org
> Cc: linux-i...@vger.kernel.org
> Cc: linux-s...@vger.kernel.org
> Cc: linux-snps-...@lists.infradead.org
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-m...@vger.kernel.org
> Cc: linux-par...@vger.kernel.org
> Cc: linuxppc-...@lists.ozlabs.org
> Cc: linux-ri...@lists.infradead.org
> Cc: linux...@vger.kernel.org
> Cc: linux-fsde...@vger.kernel.org
> Cc: linux...@kvack.org
> Cc: linux-kernel@vger.kernel.org
> 
> Anshuman Khandual (6):
>   mm: Generalize ARCH_HAS_CACHE_LINE_SIZE
>   mm: Generalize SYS_SUPPORTS_HUGETLBFS (rename as ARCH_SUPPORTS_HUGETLBFS)
>   mm: Generalize ARCH_ENABLE_MEMORY_[HOTPLUG|HOTREMOVE]
>   mm: Drop redundant ARCH_ENABLE_[HUGEPAGE|THP]_MIGRATION
>   mm: Drop redundant ARCH_ENABLE_SPLIT_PMD_PTLOCK
>   mm: Drop redundant HAVE_ARCH_TRANSPARENT_HUGEPAGE
> 
>  arch/arc/Kconfig                      |  9 ++--
>  arch/arm/Kconfig                      | 10 ++---
>  arch/arm64/Kconfig                    | 30 ++
>  arch/ia64/Kconfig                     |  8 ++-
>  arch/mips/Kconfig                     |  6 +-
>  arch/parisc/Kconfig                   |  5 +
>  arch/powerpc/Kconfig                  | 11 ++
>  arch/powerpc/platforms/Kconfig.cputype | 16 +-
>  arch/riscv/Kconfig                    |  5 +
>  arch/s390/Kconfig                     | 12 +++
>  arch/sh/Kconfig                       |  7 +++---
>  arch/sh/mm/Kconfig                    |  8 ---
>  arch/x86/Kconfig                      | 29 ++---
>  fs/Kconfig                            |  5 -
>  mm/Kconfig                            |  9 
>  15 files changed, 48 insertions(+), 122 deletions(-)

for the s390 bits:
Acked-by: Heiko Carstens 


Re: [PATCH] s390: cio: Return -EFAULT if copy_to_user() fails

2021-03-01 Thread Heiko Carstens
On Mon, Mar 01, 2021 at 01:07:26PM -0500, Eric Farman wrote:
> 
> 
> On 3/1/21 8:13 AM, Heiko Carstens wrote:
> > On Mon, Mar 01, 2021 at 08:01:33PM +0800, Wang Qing wrote:
> > > The copy_to_user() function returns the number of bytes remaining to be
> > > copied, but we want to return -EFAULT if the copy doesn't complete.
> > > 
> > > Signed-off-by: Wang Qing 
> > > ---
> > >   drivers/s390/cio/vfio_ccw_ops.c | 4 ++--
> > >   1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > Applied, thanks!
> 
> There's a third copy_to_user() call in this same routine, that deserves the
> same treatment. I'll get that fixup applied.

Thanks a lot - I actually realized that there was a third one, but
blindly assumed that the other patch addressed that (for which the
original broken commit e06670c5fe3b ("s390: vfio-ap: implement
VFIO_DEVICE_GET_INFO ioctl") got an amazing number of eight tags ;))

I'll keep your patch as a separate one, since it fixes a different
upstream patch.


Re: [PATCH] s390: cio: Return -EFAULT if copy_to_user() fails

2021-03-01 Thread Heiko Carstens
On Mon, Mar 01, 2021 at 08:01:33PM +0800, Wang Qing wrote:
> The copy_to_user() function returns the number of bytes remaining to be
> copied, but we want to return -EFAULT if the copy doesn't complete.
> 
> Signed-off-by: Wang Qing 
> ---
>  drivers/s390/cio/vfio_ccw_ops.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Applied, thanks!
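
[Editor's sketch] The pattern these copy_to_user() patches fix can be
illustrated in plain userspace C. This is a minimal sketch, not the driver
code: mock_copy_to_user(), get_info_buggy() and get_info_fixed() are
hypothetical names, and the extra "writable" parameter only exists to
simulate a partially faulting user buffer.

```c
#include <errno.h>
#include <string.h>

/*
 * Hypothetical stand-in for copy_to_user(): copies at most `writable`
 * bytes and, like the real function, returns the number of bytes that
 * could NOT be copied (0 on complete success).
 */
static unsigned long mock_copy_to_user(void *to, const void *from,
				       unsigned long n, unsigned long writable)
{
	unsigned long done = n < writable ? n : writable;

	memcpy(to, from, done);
	return n - done;
}

/* Buggy pattern: the "bytes remaining" count leaks out as the result. */
static long get_info_buggy(void *ubuf, const void *info, unsigned long len,
			   unsigned long writable)
{
	return mock_copy_to_user(ubuf, info, len, writable);
}

/* Fixed pattern: any incomplete copy is reported as -EFAULT. */
static long get_info_fixed(void *ubuf, const void *info, unsigned long len,
			   unsigned long writable)
{
	return mock_copy_to_user(ubuf, info, len, writable) ? -EFAULT : 0;
}
```

With a buffer where only 3 of 8 bytes are writable, the buggy variant
returns the positive value 5, which a caller checking for negative error
codes would not treat as a failure; the fixed variant returns -EFAULT.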


Re: [PATCH] s390: crypto: Return -EFAULT if copy_to_user() fails

2021-03-01 Thread Heiko Carstens
On Mon, Mar 01, 2021 at 08:08:21PM +0800, Wang Qing wrote:
> The copy_to_user() function returns the number of bytes remaining to be
> copied, but we want to return -EFAULT if the copy doesn't complete.
> 
> Signed-off-by: Wang Qing 
> ---
>  drivers/s390/crypto/vfio_ap_ops.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Applied, thanks!


Re: [PATCH] gcc-plugins: Disable GCC_PLUGIN_CYC_COMPLEXITY for s390

2021-02-23 Thread Heiko Carstens
On Tue, Feb 23, 2021 at 09:41:40AM -0800, Guenter Roeck wrote:
> > I tried to explain why we don't want to set COMPILE_TEST for s390
> > anymore. It overrides architecture dependencies in Kconfig, and lots
> > of drivers do not set dependencies for HAS_IOMEM, HAS_DMA, and friends
> > correctly.
> > This constantly generates fallout which is irrelevant for s390 and
> > also for other architectures. It just generates work with close to
> > zero benefit. For drivers which matter for s390 we still see those
> > errors.
> > 
> > > On the other side, if that flag would be set explicitly by
> > > all{yes,mod}config, it would really beg for being misused. We
> > > might then as well add a new flag that is explicitly associated
> > > with all{yes,mod}config, but not with randconfig.
> > 
> > I think that makes most sense, probably also have a flag that is set
> > for randconfig.
> 
> Not sure what value such an option would have, and how it would be used.
> I would argue that randconfig should not set COMPILE_TEST to start with,
> since its purpose should be to test random valid configurations and not
> to compile test arbitrary (and in that case random) code. But that is
> a different question, and just my personal opinion.
> 
> Overall, the question is what kind of additional option you would find
> useful for s390. You make it clear that you don't want COMPILE_TEST.
> At the same time, you still want all{mod,yes}config, but presumably
> excluding options currently restricted by !COMPILE_TEST (such as
> DEBUG_INFO, BPF_PRELOAD, UBSAN_TRAP, GCC_PLUGIN_CYC_COMPLEXITY,
> and a few others). SUPPRESS_NOISY_TESTS would not cover that, but
> neither would RANDCONFIG (or whatever it would be called).

Well, if we would have e.g. RANDCONFIG, then we could probably revert
334ef6ed06fa ("init/Kconfig: make COMPILE_TEST depend on !S390") and
instead let COMPILE_TEST depend on !RANDCONFIG.
I think this _could_ solve all common problems we currently see.

And it would also do what you suggested.


Re: [PATCH] gcc-plugins: Disable GCC_PLUGIN_CYC_COMPLEXITY for s390

2021-02-23 Thread Heiko Carstens
On Mon, Feb 22, 2021 at 08:03:31AM -0800, Guenter Roeck wrote:
> > Maybe, we can add something like CONFIG_SUPPRESS_NOISY_TESTS,
> > which is set to y by all{yes,mod}config.
> > 
> > This is self-documenting, so we do not need the '# too noisy' comment.
> > 
> > 
> > 
> > config SUPPRESS_NOISY_TESTS
> >bool "suppress noisy test"
> > 
> > 
> > config GCC_PLUGIN_CYC_COMPLEXITY
> > bool "Compute the cyclomatic complexity of a function" if EXPERT
> > depends on !SUPPRESS_NOISY_TESTS
> > 
> 
> Good idea. Downside would be that it won't solve the real problem
> for s390 (which is lack of allmodconfig/allyesconfig compile test
> coverage because COMPILE_TEST isn't set anymore), but that is a
> different problem anyway, and my original patch doesn't solve
> that either.

I tried to explain why we don't want to set COMPILE_TEST for s390
anymore. It overrides architecture dependencies in Kconfig, and lots
of drivers do not set dependencies for HAS_IOMEM, HAS_DMA, and friends
correctly.
This constantly generates fallout which is irrelevant for s390 and
also for other architectures. It just generates work with close to
zero benefit. For drivers which matter for s390 we still see those
errors.

> On the other side, if that flag would be set explicitly by
> all{yes,mod}config, it would really beg for being misused. We
> might then as well add a new flag that is explicitly associated
> with all{yes,mod}config, but not with randconfig.

I think that makes most sense, probably also have a flag that is set
for randconfig.


Re: [PATCH] gcc-plugins: Disable GCC_PLUGIN_CYC_COMPLEXITY for s390

2021-02-22 Thread Heiko Carstens
On Sun, Feb 21, 2021 at 02:56:50PM -0800, Guenter Roeck wrote:
> Commit 334ef6ed06fa ("init/Kconfig: make COMPILE_TEST depend on !S390") disabled
> COMPILE_TEST for s390. At the same time, "make allmodconfig/allyesconfig" for
> s390 is still supported. However, it generates thousands of compiler
> messages such as the following, making it highly impractical to run.
> 
> Cyclomatic Complexity 1 scripts/mod/devicetable-offsets.c:main
> Cyclomatic Complexity 1 scripts/mod/devicetable-offsets.c:_GLOBAL__sub_I_00100_0_main
> 
> Since GCC_PLUGIN_CYC_COMPLEXITY is primarily used for testing, disable it
> when building s390 images.
> 
> Cc: Arnd Bergmann 
> Cc: Heiko Carstens 
> Fixes: 334ef6ed06fa ("init/Kconfig: make COMPILE_TEST depend on !S390")
> Signed-off-by: Guenter Roeck 
> ---
>  scripts/gcc-plugins/Kconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/scripts/gcc-plugins/Kconfig b/scripts/gcc-plugins/Kconfig
> index ab9eb4cbe33a..5e9bb500f443 100644
> --- a/scripts/gcc-plugins/Kconfig
> +++ b/scripts/gcc-plugins/Kconfig
> @@ -21,7 +21,7 @@ if GCC_PLUGINS
>  
>  config GCC_PLUGIN_CYC_COMPLEXITY
>   bool "Compute the cyclomatic complexity of a function" if EXPERT
> - depends on !COMPILE_TEST	# too noisy
> + depends on !COMPILE_TEST && !S390   # too noisy

I don't see a reason to disable this in general for s390. COMPILE_TEST
was only disabled for s390 because a lot of irrelevant configs didn't
compile and it would cause a lot of unnecessary work to fix that.

However, the !COMPILE_TEST dependency here looks more like it was
misused for lack of a way to detect whether the config was generated
with allyesconfig/allmodconfig. Maybe that could be added somehow to
Kconfig?


Re: [PATCH] KVM: s390: use ARRAY_SIZE instead of division operation

2021-02-21 Thread Heiko Carstens
On Sat, Feb 20, 2021 at 04:22:37PM +0800, Yang Li wrote:
> This eliminates the following coccicheck warning:
> ./arch/s390/tools/gen_facilities.c:154:37-38: WARNING: Use ARRAY_SIZE
> ./arch/s390/tools/gen_opcode_table.c:141:39-40: WARNING: Use ARRAY_SIZE
> 
> Reported-by: Abaci Robot 
> Signed-off-by: Yang Li 
> ---
>  arch/s390/tools/gen_facilities.c   | 2 +-
>  arch/s390/tools/gen_opcode_table.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/s390/tools/gen_facilities.c b/arch/s390/tools/gen_facilities.c
> index 61ce5b5..5366817 100644
> --- a/arch/s390/tools/gen_facilities.c
> +++ b/arch/s390/tools/gen_facilities.c
> @@ -151,7 +151,7 @@ static void print_facility_lists(void)
>  {
>   unsigned int i;
>  
> - for (i = 0; i < sizeof(facility_defs) / sizeof(facility_defs[0]); i++)
> + for (i = 0; i < ARRAY_SIZE(facility_defs); i++)
>   print_facility_list(&facility_defs[i]);
>  }
>  
> diff --git a/arch/s390/tools/gen_opcode_table.c b/arch/s390/tools/gen_opcode_table.c
> index a1bc02b..468b70c 100644
> --- a/arch/s390/tools/gen_opcode_table.c
> +++ b/arch/s390/tools/gen_opcode_table.c
> @@ -138,7 +138,7 @@ static struct insn_type *insn_format_to_type(char *format)
>   strcpy(tmp, format);
>   base_format = tmp;
>   base_format = strsep(&base_format, "_");
> - for (i = 0; i < sizeof(insn_type_table) / sizeof(insn_type_table[0]); i++) {
> + for (i = 0; i < ARRAY_SIZE(insn_type_table); i++) {

There is a reason why this doesn't use ARRAY_SIZE()...
Please stop sending trivial patches without even looking at the code.
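
[Editor's sketch] The reply does not spell out the reason. One plausible
reading, which is an assumption and not stated in the thread, is that the
files under arch/s390/tools/ are small host programs built against the
normal C library, where the kernel's ARRAY_SIZE() macro is not available.
The sketch below uses a hypothetical table (example_defs) to show the
open-coded idiom next to a locally defined macro.

```c
#include <stddef.h>

/* Hypothetical table, standing in for facility_defs[] in the host tool. */
static const char *const example_defs[] = { "alpha", "beta", "gamma" };

/* Open-coded element count: needs nothing beyond the C language itself. */
static size_t example_defs_count(void)
{
	return sizeof(example_defs) / sizeof(example_defs[0]);
}

/*
 * A host program that wants the macro has to define it locally
 * (assumption: no kernel header providing it is included here).
 */
#ifndef ARRAY_SIZE
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))
#endif

static size_t example_defs_count_macro(void)
{
	return ARRAY_SIZE(example_defs);
}
```

Both functions return the same element count; the difference is only
whether the build environment provides the macro.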


Re: linux-next: Tree for Feb 11

2021-02-12 Thread Heiko Carstens
Hi Vlad,

> > Build fails on s390 using defconfig with:
> >
> > In file included from drivers/net/ethernet/mellanox/mlx5/core/en_tc.h:40,
> >                  from drivers/net/ethernet/mellanox/mlx5/core/en_main.c:45:
> > drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.h:24:29: error: field 'match_level' has incomplete type
> >    24 |  enum mlx5_flow_match_level match_level;
> >       |                             ^~~
> > drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.h:27:26: warning: 'struct mlx5e_encap_entry' declared inside parameter list will not be visible outside of this definition or declaration
> >    27 |  int (*calc_hlen)(struct mlx5e_encap_entry *e);
> >       |                          ^
> >
> > caused by this:
> > commit 0d9f96471493d5483d116c137693f03604332a04 (HEAD, refs/bisect/bad)
> > Author: Vlad Buslov 
> > Date:   Sun Jan 24 22:07:04 2021 +0200
> >
> > net/mlx5e: Extract tc tunnel encap/decap code to dedicated file
> > 
> > Following patches in series extend the extracted code with routing
> > infrastructure. To improve code modularity created a dedicated
> > tc_tun_encap.c source file and move encap/decap related code to the new
> > file. Export code that is used by both regular TC code and encap/decap code
> > into tc_priv.h (new header intended to be used only by TC module). Rename
> > some exported functions by adding "mlx5e_" prefix to their names.
> > 
> > Signed-off-by: Vlad Buslov 
> > Signed-off-by: Dmytro Linkin 
> > Reviewed-by: Roi Dayan 
> > Signed-off-by: Saeed Mahameed 
> >
> > Note: switching on NET_SWITCHDEV fixes the build error.
> 
> Hi Heiko,
> 
> This problem is supposed to be fixed by 36280f0797df ("net/mlx5e: Fix
> tc_tun.h to verify MLX5_ESWITCH config"). I'm trying to reproduce with
> config supplied by test robot in another thread (config: s390-defconfig)
> and current net-next builds fine for me. I've also verified that config
> option you mentioned is not set in that config:
> 
> $ grep NET_SWITCHDEV .config
> # CONFIG_NET_SWITCHDEV is not set
> 
> Can you help me reproduce?

The commit you mention is not part of linux-next 20210211 (I'm not
talking of net-next). So, probably will be fixed with today's
release. I just checked: net-next builds with s390 defconfig.

Thanks!


Re: linux-next: Tree for Feb 11

2021-02-11 Thread Heiko Carstens
On Thu, Feb 11, 2021 at 10:26:04PM +1100, Stephen Rothwell wrote:
> Hi all,
> 
> Changes since 20210210:
> 
> The powerpc tree still had its build failure in the allyesconfig for
> which I applied a supplied patch.
> 
> The v4l-dvb tree lost its build failure.
> 
> The drm-misc tree lost its build failure.
> 
> The modules tree lost its build failure.
> 
> The device-mapper tree gained a build failure so I used the version
> from next-20210210.
> 
> The tip tree lost its boot failure.
> 
> The rcu tree gained conflicts against the block tree.
> 
> The driver-core tree lost its build failure.
> 
> The akpm-current tree gained conflicts against the fscache tree.
> 
> Non-merge commits (relative to Linus' tree): 9533
>  9470 files changed, 385794 insertions(+), 266880 deletions(-)
> 
> 

Build fails on s390 using defconfig with:

In file included from drivers/net/ethernet/mellanox/mlx5/core/en_tc.h:40,
                 from drivers/net/ethernet/mellanox/mlx5/core/en_main.c:45:
drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.h:24:29: error: field 'match_level' has incomplete type
   24 |  enum mlx5_flow_match_level match_level;
      |                             ^~~
drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.h:27:26: warning: 'struct mlx5e_encap_entry' declared inside parameter list will not be visible outside of this definition or declaration
   27 |  int (*calc_hlen)(struct mlx5e_encap_entry *e);
      |                          ^

caused by this:
commit 0d9f96471493d5483d116c137693f03604332a04 (HEAD, refs/bisect/bad)
Author: Vlad Buslov 
Date:   Sun Jan 24 22:07:04 2021 +0200

net/mlx5e: Extract tc tunnel encap/decap code to dedicated file

Following patches in series extend the extracted code with routing
infrastructure. To improve code modularity created a dedicated
tc_tun_encap.c source file and move encap/decap related code to the new
file. Export code that is used by both regular TC code and encap/decap code
into tc_priv.h (new header intended to be used only by TC module). Rename
some exported functions by adding "mlx5e_" prefix to their names.

Signed-off-by: Vlad Buslov 
Signed-off-by: Dmytro Linkin 
Reviewed-by: Roi Dayan 
Signed-off-by: Saeed Mahameed 

Note: switching on NET_SWITCHDEV fixes the build error.


Re: [PATCH] tmpfs: Disallow CONFIG_TMPFS_INODE64 on s390

2021-02-07 Thread Heiko Carstens
On Fri, Feb 05, 2021 at 04:05:51PM -0800, Andrew Morton wrote:
> On Fri,  5 Feb 2021 17:06:20 -0600 Seth Forshee wrote:
> 
> > This feature requires ino_t be 64-bits, which is true for every
> > 64-bit architecture but s390, so prevent this option from being
> > selected there.
> > 
> 
> The previous patch nicely described the end-user impact of the bug. 
> This is especially important when requesting a -stable backport.
> 
> Here's what I ended up with:
> 
> 
> From: Seth Forshee 
> Subject: tmpfs: disallow CONFIG_TMPFS_INODE64 on s390
> 
> Currently there is an assumption in tmpfs that 64-bit architectures also
> have a 64-bit ino_t.  This is not true on s390 which has a 32-bit ino_t. 
> With CONFIG_TMPFS_INODE64=y tmpfs mounts will get 64-bit inode numbers and
> display "inode64" in the mount options, but passing the "inode64" mount
> option will fail.  This leads to the following behavior:
> 
>  # mkdir mnt
>  # mount -t tmpfs nodev mnt
>  # mount -o remount,rw mnt
>  mount: /home/ubuntu/mnt: mount point not mounted or bad option.
> 
> As mount sees "inode64" in the mount options and thus passes it in the
> options for the remount.
> 
> 
> So prevent CONFIG_TMPFS_INODE64 from being selected on s390.
> 
> Link: https://lkml.kernel.org/r/20210205230620.518245-1-seth.fors...@canonical.com
> Fixes: ea3271f7196c ("tmpfs: support 64-bit inums per-sb")
> Signed-off-by: Seth Forshee 
> Cc: Chris Down 
> Cc: Hugh Dickins 
> Cc: Amir Goldstein 
> Cc: Heiko Carstens 
> Cc: Vasily Gorbik 
> Cc: Christian Borntraeger 
> Cc:   [5.9+]
> Signed-off-by: Andrew Morton 
> ---
> 
>  fs/Kconfig |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> --- a/fs/Kconfig~tmpfs-disallow-config_tmpfs_inode64-on-s390
> +++ a/fs/Kconfig
> @@ -203,7 +203,7 @@ config TMPFS_XATTR
>  
>  config TMPFS_INODE64
>   bool "Use 64-bit ino_t by default in tmpfs"
> - depends on TMPFS && 64BIT
> + depends on TMPFS && 64BIT && !S390

Heh, it's sort of funny that we have a similar patch, which
unfortunately was/is not yet on our external features branch,
which does exactly the same.

In any case:

Acked-by: Heiko Carstens 
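
[Editor's sketch] The failure described in the quoted commit message can be
illustrated with two small helpers. This is a sketch under assumptions: the
function names are hypothetical, and the acceptance rule (inode64 is only
allowed when ino_t is at least 64 bits wide) is inferred from the commit
message's statement that passing "inode64" fails on s390, not copied from
the tmpfs source.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of the option check: "inode64" is only accepted
 * when the kernel's ino_t can hold a 64-bit inode number.
 */
static int inode64_option_allowed(size_t ino_t_size)
{
	return ino_t_size >= sizeof(uint64_t);
}

/* What a 64-bit inode number looks like after passing through 32 bits. */
static uint32_t ino_seen_through_32bit(uint64_t ino)
{
	return (uint32_t)ino;
}
```

With a 4-byte ino_t the check rejects the option, which matches the remount
failure shown above: mount reads "inode64" back from the existing mount
options and passes it in again. The second helper shows why a narrower type
is a problem in the first place: inode number 0x100000001 becomes 1.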


Re: [PATCH 2/2] s390: mm: Fix secure storage access exception handling

2021-01-20 Thread Heiko Carstens
On Tue, Jan 19, 2021 at 11:25:01AM +0100, Christian Borntraeger wrote:
> > +   if (user_mode(regs)) {
> > +   send_sig(SIGSEGV, current, 0);
> > +   return;
> > +   } else
> > +   panic("Unexpected PGM 0x3d with TEID bit 61=0");
> 
> use BUG instead of panic? That would kill this process, but it allows
> people to maybe save unaffected data.

It would kill the process, and most likely lead to a deadlocked
system. But with all the "good" debug information being lost, which
wouldn't be the case with panic().
I really don't think this is a good idea.


Re: [PATCH 2/2] s390: mm: Fix secure storage access exception handling

2021-01-20 Thread Heiko Carstens
On Wed, Jan 20, 2021 at 03:39:14PM +0100, Christian Borntraeger wrote:
> On 20.01.21 14:42, Heiko Carstens wrote:
> > On Tue, Jan 19, 2021 at 11:25:01AM +0100, Christian Borntraeger wrote:
> >>> + if (user_mode(regs)) {
> >>> + send_sig(SIGSEGV, current, 0);
> >>> + return;
> >>> + } else
> >>> + panic("Unexpected PGM 0x3d with TEID bit 61=0");
> >>
> >> use BUG instead of panic? That would kill this process, but it allows
> >> people to maybe save unaffected data.
> > 
> > It would kill the process, and most likely lead to deadlock'ed
> > system. But with all the "good" debug information being lost, which
> > wouldn't be the case with panic().
> > I really don't think this is a good idea.
> > 
> 
> My understanding is that Linus hates panic for anything that might be able
> to continue to run. With BUG the admin can decide via panic_on_oops if
> debugging data or runtime data is more important. But mm is more on your
> side, so if you insist on panic we can keep it.

I prefer to have good debug data - and when we are reaching this
panic, then we _most_ likely have data corruption somewhere (wrong
pointer?). So it seems best to me to shut down the machine
immediately in order to avoid any further corruption instead of hoping
that the system stays somehow alive.

Furthermore a panic is easily detectable by a watchdog, while a BUG
may put the system into a limbo state where the real workload doesn't
work anymore, but the keepalive process does. I don't think this is
desirable.


Re: [PATCH 12/18] arch: s390: Remove CONFIG_OPROFILE support

2021-01-15 Thread Heiko Carstens
On Thu, Jan 14, 2021 at 05:05:25PM +0530, Viresh Kumar wrote:
> The "oprofile" user-space tools don't use the kernel OPROFILE support
> any more, and haven't in a long time. User-space has been converted to
> the perf interfaces.
> 
> Remove the old oprofile's architecture specific support.
> 
> Suggested-by: Christoph Hellwig 
> Suggested-by: Linus Torvalds 
> Signed-off-by: Viresh Kumar 
> ---
>  arch/s390/Kconfig                 |  1 -
>  arch/s390/Makefile                |  3 ---
>  arch/s390/configs/debug_defconfig |  1 -
>  arch/s390/configs/defconfig       |  1 -
>  arch/s390/oprofile/Makefile       | 10 -
>  arch/s390/oprofile/init.c         | 37 ---
>  6 files changed, 53 deletions(-)
>  delete mode 100644 arch/s390/oprofile/Makefile
>  delete mode 100644 arch/s390/oprofile/init.c

Acked-by: Heiko Carstens 


[PATCH] epoll: fix compat syscall wire up of epoll_pwait2

2020-12-20 Thread Heiko Carstens
Commit b0a0c2615f6f ("epoll: wire up syscall epoll_pwait2") wired up
the 64 bit syscall instead of the compat variant in a couple of places.

Cc: Willem de Bruijn 
Cc: Al Viro 
Cc: Arnd Bergmann 
Cc: Matthew Wilcox (Oracle) 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Thomas Bogendoerfer 
Cc: Vasily Gorbik 
Cc: Christian Borntraeger 
Cc: "David S. Miller" 
Fixes: b0a0c2615f6f ("epoll: wire up syscall epoll_pwait2")
Signed-off-by: Heiko Carstens 
---
 arch/arm64/include/asm/unistd32.h         | 2 +-
 arch/mips/kernel/syscalls/syscall_n32.tbl | 2 +-
 arch/s390/kernel/syscalls/syscall.tbl     | 2 +-
 arch/sparc/kernel/syscalls/syscall.tbl    | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
index f4bca2b90218..cccfbbefbf95 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -890,7 +890,7 @@ __SYSCALL(__NR_faccessat2, sys_faccessat2)
 #define __NR_process_madvise 440
 __SYSCALL(__NR_process_madvise, sys_process_madvise)
 #define __NR_epoll_pwait2 441
-__SYSCALL(__NR_epoll_pwait2, sys_epoll_pwait2)
+__SYSCALL(__NR_epoll_pwait2, compat_sys_epoll_pwait2)
 
 /*
  * Please add new compat syscalls above this comment and update
diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl
index ad9c3dd0ab1f..0f03ad223f33 100644
--- a/arch/mips/kernel/syscalls/syscall_n32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
@@ -379,4 +379,4 @@
 438	n32	pidfd_getfd			sys_pidfd_getfd
 439	n32	faccessat2			sys_faccessat2
 440	n32	process_madvise			sys_process_madvise
-441	n32	epoll_pwait2			sys_epoll_pwait2
+441	n32	epoll_pwait2			compat_sys_epoll_pwait2
diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl
index 14f6525886a8..d443423495e5 100644
--- a/arch/s390/kernel/syscalls/syscall.tbl
+++ b/arch/s390/kernel/syscalls/syscall.tbl
@@ -443,4 +443,4 @@
 438  common	pidfd_getfd		sys_pidfd_getfd			sys_pidfd_getfd
 439  common	faccessat2		sys_faccessat2			sys_faccessat2
 440  common	process_madvise		sys_process_madvise		sys_process_madvise
-441  common	epoll_pwait2		sys_epoll_pwait2		sys_epoll_pwait2
+441  common	epoll_pwait2		sys_epoll_pwait2		compat_sys_epoll_pwait2
diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl
index c7da4c3271e6..40d8c7cd8298 100644
--- a/arch/sparc/kernel/syscalls/syscall.tbl
+++ b/arch/sparc/kernel/syscalls/syscall.tbl
@@ -486,4 +486,4 @@
 438	common	pidfd_getfd			sys_pidfd_getfd
 439	common	faccessat2			sys_faccessat2
 440	common	process_madvise			sys_process_madvise
-441	common	epoll_pwait2			sys_epoll_pwait2
+441	common	epoll_pwait2			sys_epoll_pwait2		compat_sys_epoll_pwait2
-- 
2.17.1
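
[Editor's sketch] The effect of wiring the native handler into the compat
slot can be modeled with a two-column dispatch table. This is a toy model
under assumptions, not how the kernel generates its syscall tables: all
names (table_entry, dispatch, native_handler, compat_handler) are
hypothetical, and the handlers only return a label so the selected column
is visible.

```c
typedef const char *(*handler_fn)(void);

static const char *native_handler(void) { return "native"; }
static const char *compat_handler(void) { return "compat"; }

/* One table row: a handler for 64-bit callers and one for compat callers. */
struct table_entry {
	int nr;
	const char *name;
	handler_fn native;	/* used by 64-bit tasks */
	handler_fn compat;	/* used by 32-bit (compat) tasks */
};

/* Row as originally wired up: the compat column reuses the native handler. */
static const struct table_entry miswired = {
	441, "epoll_pwait2", native_handler, native_handler
};

/* Row after the fix: the compat column points at the compat variant. */
static const struct table_entry fixed = {
	441, "epoll_pwait2", native_handler, compat_handler
};

/* Pick the column based on the kind of task making the call. */
static const char *dispatch(const struct table_entry *e, int is_compat_task)
{
	handler_fn fn = is_compat_task ? e->compat : e->native;

	return fn();
}
```

In this model a compat task calling through the miswired row ends up in the
native handler, which would interpret 32-bit argument layouts as 64-bit
ones; with the fixed row the same call reaches the compat handler.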



[GIT PULL] more s390 updates for 5.11 merge window

2020-12-18 Thread Heiko Carstens
Hi Linus,

please pull some more small updates for s390. This is mainly to
decouple udelay() and arch_cpu_idle() and simplify both of them,
like I brought it up after the lockdep breakage in 5.10-rc6.

Thanks,
Heiko

The following changes since commit 586592478b1fa8bb8cd6875a9191468e9b1a8b13:

  Merge tag 's390-5.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux (2020-12-14 16:22:26 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git tags/s390-5.11-2

for you to fetch changes up to dfdc6e73cdcf011a04568231132916c6d06b861f:

  s390/zcrypt: convert comma to semicolon (2020-12-16 14:55:50 +0100)


- Always initialize kernel stack backchain when entering the kernel, so
  that unwinding works properly.

- Fix stack unwinder test case to avoid rare interrupt stack corruption.

- Simplify udelay() and just let it busy loop instead of implementing a
  complex logic.

- arch_cpu_idle() cleanup.

- Some other minor improvements.
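
To illustrate the "just busy loop" idea behind the udelay() simplification, here is a minimal user-space sketch. It is not the s390 implementation: the kernel code polls the TOD clock, while this sketch polls the standard C clock() function, and the name busy_delay_us() is made up for the example.

```c
#include <time.h>

/*
 * Hypothetical user-space sketch of a busy-looping delay.
 * Spins until at least `usecs` microseconds of CPU time have passed and
 * returns the number of microseconds that were actually measured.
 */
unsigned long long busy_delay_us(unsigned long long usecs)
{
	const unsigned long long per_sec = (unsigned long long)CLOCKS_PER_SEC;
	/* Round up so the loop never waits for less than requested. */
	unsigned long long ticks_needed =
		(usecs * per_sec + 999999ULL) / 1000000ULL;
	clock_t start = clock();
	unsigned long long elapsed_ticks;

	/* No sleeping, no interrupt handling tricks: simply poll the clock. */
	do {
		elapsed_ticks = (unsigned long long)(clock() - start);
	} while (elapsed_ticks < ticks_needed);

	return elapsed_ticks * 1000000ULL / per_sec;
}
```

A call such as busy_delay_us(2000) returns once at least 2000 microseconds have been measured. The point of the sketch is only that a polling loop needs no special wait state or interrupt bookkeeping.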


Heiko Carstens (10):
  s390: always clear kernel stack backchain before calling functions
  s390: make calls to TRACE_IRQS_OFF/TRACE_IRQS_ON balanced
  s390/test_unwind: fix CALL_ON_STACK tests
  s390/test_unwind: use timer instead of udelay
  s390/delay: simplify udelay
  s390/irq: select HAVE_IRQ_EXIT_ON_IRQ_STACK
  s390/delay: remove udelay_simple()
  s390/idle: merge enabled_wait() and arch_cpu_idle()
  s390/idle: remove raw_local_irq_save()/restore() from arch_cpu_idle()
  s390/idle: allow arch_cpu_idle() to be kprobed

Zheng Yongjun (1):
  s390/zcrypt: convert comma to semicolon

 arch/s390/Kconfig                  |   1 +
 arch/s390/include/asm/delay.h      |  12 ++---
 arch/s390/include/asm/processor.h  |   7 ---
 arch/s390/kernel/entry.S           |  16 +++---
 arch/s390/kernel/idle.c            |  18 ++-
 arch/s390/kernel/ipl.c             |   2 +-
 arch/s390/kernel/setup.c           |   1 -
 arch/s390/lib/delay.c              | 105 -
 arch/s390/lib/test_unwind.c        |  31 ++-
 drivers/s390/cio/device.c          |   2 +-
 drivers/s390/crypto/zcrypt_cex2a.c |   2 +-
 drivers/s390/crypto/zcrypt_cex4.c  |   2 +-
 12 files changed, 44 insertions(+), 155 deletions(-)

diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index a60cc523d810..9e8895cb9ee7 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -153,6 +153,7 @@ config S390
select HAVE_FUTEX_CMPXCHG if FUTEX
select HAVE_GCC_PLUGINS
select HAVE_GENERIC_VDSO
+   select HAVE_IRQ_EXIT_ON_IRQ_STACK
select HAVE_KERNEL_BZIP2
select HAVE_KERNEL_GZIP
select HAVE_KERNEL_LZ4
diff --git a/arch/s390/include/asm/delay.h b/arch/s390/include/asm/delay.h
index 4a08379cd1eb..21a8fe18fe66 100644
--- a/arch/s390/include/asm/delay.h
+++ b/arch/s390/include/asm/delay.h
@@ -13,14 +13,12 @@
 #ifndef _S390_DELAY_H
 #define _S390_DELAY_H
 
-void udelay_enable(void);
-void __ndelay(unsigned long long nsecs);
-void __udelay(unsigned long long usecs);
-void udelay_simple(unsigned long long usecs);
+void __ndelay(unsigned long nsecs);
+void __udelay(unsigned long usecs);
 void __delay(unsigned long loops);
 
-#define ndelay(n) __ndelay((unsigned long long) (n))
-#define udelay(n) __udelay((unsigned long long) (n))
-#define mdelay(n) __udelay((unsigned long long) (n) * 1000)
+#define ndelay(n) __ndelay((unsigned long)(n))
+#define udelay(n) __udelay((unsigned long)(n))
+#define mdelay(n) __udelay((unsigned long)(n) * 1000)
 
 #endif /* defined(_S390_DELAY_H) */
diff --git a/arch/s390/include/asm/processor.h b/arch/s390/include/asm/processor.h
index 6b7269f51f83..2058a435add4 100644
--- a/arch/s390/include/asm/processor.h
+++ b/arch/s390/include/asm/processor.h
@@ -16,14 +16,12 @@
 
 #define CIF_NOHZ_DELAY     2   /* delay HZ disable for a tick */
 #define CIF_FPU            3   /* restore FPU registers */
-#define CIF_IGNORE_IRQ     4   /* ignore interrupt (for udelay) */
 #define CIF_ENABLED_WAIT   5   /* in enabled wait state */
 #define CIF_MCCK_GUEST     6   /* machine check happening in guest */
 #define CIF_DEDICATED_CPU  7   /* this CPU is dedicated */
 
 #define _CIF_NOHZ_DELAY    BIT(CIF_NOHZ_DELAY)
 #define _CIF_FPU           BIT(CIF_FPU)
-#define _CIF_IGNORE_IRQ    BIT(CIF_IGNORE_IRQ)
 #define _CIF_ENABLED_WAIT  BIT(CIF_ENABLED_WAIT)
 #define _CIF_MCCK_GUEST    BIT(CIF_MCCK_GUEST)
 #define _CIF_DEDICATED_CPU BIT(CIF_DEDICATED_CPU)
@@ -292,11 +290,6 @@ static inline unsigned long __rewind_psw(psw_t psw, unsigned long ilc)
return (psw.addr - ilc) & mask;
 }
 
-/*
- * Function to stop a processor until the next interrupt occurs
- */
-void enabled_wait(void);
-
 /*
  * Function to 

Re: __local_bh_enable_ip() vs lockdep

2020-12-18 Thread Heiko Carstens
On Wed, Dec 16, 2020 at 06:52:59PM +0100, Peter Zijlstra wrote:
> On Tue, Dec 15, 2020 at 02:47:24PM -0500, Steven Rostedt wrote:
> > On Tue, 15 Dec 2020 20:01:52 +0100
> > Heiko Carstens  wrote:
> > 
> > > Hello,
> > > 
> > > the ftrace stack tracer kernel selftest is able to trigger the warning
> > > below from time to time. This looks like there is an ordering problem
> > > in __local_bh_enable_ip():
> > > first there is a call to lockdep_softirqs_on() and afterwards
> > > preempt_count_sub() is ftraced before it was able to modify
> > > preempt_count:
> > 
> > Don't run ftrace stack tracer when debugging lockdep. ;-)
> > 
> >   /me runs!
> 
> Ha!, seriously though; that seems like something we've encountered
> before, but my google-fu is failing me.
> 
> Do you remember what, if anything, was the problem with this?
> 
> ---
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index d5bfd5e661fc..9d71046ea247 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -186,7 +186,7 @@ void __local_bh_enable_ip(unsigned long ip, unsigned int cnt)
>* Keep preemption disabled until we are done with
>* softirq processing:
>*/
> - preempt_count_sub(cnt - 1);
> + __preempt_count_sub(cnt - 1);
>  
>   if (unlikely(!in_interrupt() && local_softirq_pending())) {
>   /*

FWIW,

Tested-by: Heiko Carstens 

Peter, will you make a proper patch out of this?


Re: __local_bh_enable_ip() vs lockdep

2020-12-16 Thread Heiko Carstens
On Wed, Dec 16, 2020 at 06:52:59PM +0100, Peter Zijlstra wrote:
> On Tue, Dec 15, 2020 at 02:47:24PM -0500, Steven Rostedt wrote:
> > On Tue, 15 Dec 2020 20:01:52 +0100
> > Heiko Carstens  wrote:
> > 
> > > Hello,
> > > 
> > > the ftrace stack tracer kernel selftest is able to trigger the warning
> > > below from time to time. This looks like there is an ordering problem
> > > in __local_bh_enable_ip():
> > > first there is a call to lockdep_softirqs_on() and afterwards
> > > preempt_count_sub() is ftraced before it was able to modify
> > > preempt_count:
> > 
> > Don't run ftrace stack tracer when debugging lockdep. ;-)
> > 
> >   /me runs!
> 
> Ha!, seriously though; that seems like something we've encountered
> before, but my google-fu is failing me.
> 
> Do you remember what, if anything, was the problem with this?

Actually this looks like:
1a63dcd8765b ("softirq: Reorder trace_softirqs_on to prevent lockdep splat")

I can give it a test, but it looks quite obvious that your patch will
make the problem go away.

> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index d5bfd5e661fc..9d71046ea247 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -186,7 +186,7 @@ void __local_bh_enable_ip(unsigned long ip, unsigned int cnt)
>* Keep preemption disabled until we are done with
>* softirq processing:
>*/
> - preempt_count_sub(cnt - 1);
> + __preempt_count_sub(cnt - 1);
>  
>   if (unlikely(!in_interrupt() && local_softirq_pending())) {
>   /*
> 


__local_bh_enable_ip() vs lockdep

2020-12-15 Thread Heiko Carstens
Hello,

the ftrace stack tracer kernel selftest is able to trigger the warning
below from time to time. This looks like there is an ordering problem
in __local_bh_enable_ip():
first there is a call to lockdep_softirqs_on() and afterwards
preempt_count_sub() is ftraced before it was able to modify
preempt_count:

[ 1016.245418] ------------[ cut here ]------------
[ 1016.245428] DEBUG_LOCKS_WARN_ON(current->softirqs_enabled)
[ 1016.245441] WARNING: CPU: 8 PID: 8300 at kernel/locking/lockdep.c:5298 check_flags.part.0+0x196/0x208
[ 1016.245580] CPU: 8 PID: 8300 Comm: sshd Not tainted 5.11.0-20201215.rc0.git0.d33ce49dca6c.300.fc33.s390x+debug #1
...
[ 1016.245691] Call Trace:
[ 1016.245698]  [<4c1537fa>] check_flags.part.0+0x19a/0x208
[ 1016.245705] ([<4c1537f6>] check_flags.part.0+0x196/0x208)
[ 1016.245711]  [<4cced786>] lock_is_held_type+0x8e/0x1b8
[ 1016.245716]  [<4c172924>] rcu_read_lock_sched_held+0x64/0xb8
[ 1016.245724]  [<4c1b151c>] module_assert_mutex_or_preempt+0x34/0x68
[ 1016.245730]  [<4c1b2e04>] __module_address.part.0+0x2c/0x118
[ 1016.245735]  [<4c1b9dca>] __module_text_address+0x3a/0x90
[ 1016.245741]  [<4c1b9ed4>] is_module_text_address+0x34/0x78
[ 1016.245748]  [<4c0f9a1a>] kernel_text_address+0x5a/0x130
[ 1016.245752]  [<4c0f9b16>] __kernel_text_address+0x26/0x70
[ 1016.245757]  [<4c094038>] unwind_get_return_address+0x40/0x68
[ 1016.245763]  [<4c099dac>] arch_stack_walk+0xac/0xd0
[ 1016.245768]  [<4c18be10>] stack_trace_save+0x50/0x68
[ 1016.245774]  [<4c22d80c>] check_stack+0xc4/0x348
[ 1016.245780]  [<4c22db46>] stack_trace_call+0xb6/0xd0
[ 1016.245785]  [<4cd00082>] ftrace_caller+0x7a/0x7e
[ 1016.245791]  [<4c1081d6>] preempt_count_sub+0x6/0x138 <---
[ 1016.245795]  [<4c0d3d46>] __local_bh_enable_ip+0x13e/0x190 <---
[ 1016.245811]  [<03ff8023c34c>] nft_update_chain_stats+0xdc/0x168 [nf_tables]
[ 1016.245820]  [<03ff8023c916>] nft_do_chain+0x53e/0x550 [nf_tables]
[ 1016.245827]  [<03ff80251974>] nft_do_chain_ipv4+0x6c/0x78 [nf_tables]
[ 1016.245833]  [<4cb0ab00>] nf_hook_slow+0x58/0xf8
[ 1016.245839]  [<4cb1dc24>] nf_hook.constprop.0+0xfc/0x1d0
[ 1016.245844]  [<4cb207b2>] __ip_local_out+0x92/0xe8
[ 1016.245848]  [<4cb20d00>] __ip_queue_xmit+0x1d8/0x640
[ 1016.245854]  [<4cb4578c>] __tcp_transmit_skb+0x3dc/0x770
[ 1016.245858]  [<4cb46e86>] tcp_write_xmit+0x38e/0x758
[ 1016.245863]  [<4cb47298>] __tcp_push_pending_frames+0x48/0x118
[ 1016.245868]  [<4cb2f604>] tcp_sendmsg_locked+0x95c/0xb78
[ 1016.245872]  [<4cb2f864>] tcp_sendmsg+0x44/0x68
[ 1016.245878]  [<4ca30c3c>] sock_sendmsg+0x64/0x78
[ 1016.245882]  [<4ca30cc2>] sock_write_iter+0x72/0x98
[ 1016.245887]  [<4c3dcfda>] new_sync_write+0x10a/0x198
[ 1016.245891]  [<4c3dd6a6>] vfs_write.part.0+0x196/0x290
[ 1016.245896]  [<4c3e0220>] ksys_write+0xb8/0xf8
[ 1016.245900]  [<4ccfd326>] system_call+0xe2/0x29c
[ 1016.245904] INFO: lockdep is turned off.
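
The ordering problem described above can be modelled in user space. The following is only a sketch under stated assumptions: all sim_* names are made up, the constants are modelled on the kernel's softirq bits in preempt_count, and the "tracer hook" stands in for whatever a function tracer does when it enters a traced function and ends up in lockdep's consistency check.

```c
#include <stdbool.h>

/* Modelled on the kernel's softirq field in preempt_count (assumption). */
#define SOFTIRQ_MASK            0xff00u
#define SOFTIRQ_DISABLE_OFFSET  0x200u

static unsigned int sim_preempt_count;       /* simulated preempt_count */
static bool sim_lockdep_softirqs_enabled;    /* lockdep's view of softirq state */
static int sim_warnings;                     /* number of detected mismatches */

/* Stand-in for code that runs when a traced function is entered. */
static void sim_tracer_hook(void)
{
	bool softirqs_still_off = (sim_preempt_count & SOFTIRQ_MASK) != 0;

	/* lockdep believes softirqs are on, preempt_count says otherwise */
	if (softirqs_still_off && sim_lockdep_softirqs_enabled)
		sim_warnings++;
}

/* Untraced variant: only modifies the counter. */
static void sim_raw_count_sub(unsigned int val)
{
	sim_preempt_count -= val;
}

/* Traced variant: the hook runs before the counter is modified. */
static void sim_traced_count_sub(unsigned int val)
{
	sim_tracer_hook();
	sim_raw_count_sub(val);
}

/* Returns the number of warnings seen while "enabling" bottom halves. */
int sim_local_bh_enable(bool use_traced_sub)
{
	const unsigned int cnt = SOFTIRQ_DISABLE_OFFSET;

	sim_warnings = 0;
	sim_preempt_count = cnt;                 /* softirqs disabled */
	sim_lockdep_softirqs_enabled = false;

	sim_lockdep_softirqs_enabled = true;     /* lockdep is told first */
	if (use_traced_sub)
		sim_traced_count_sub(cnt - 1);   /* hook sees stale counter */
	else
		sim_raw_count_sub(cnt - 1);      /* nothing can observe the gap */

	sim_raw_count_sub(1);                    /* finally allow preemption */
	return sim_warnings;
}
```

With the traced variant the hook observes the window in which lockdep's state and the counter disagree. With the untraced variant no code runs inside that window.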


[GIT PULL] s390 updates for 5.11 merge window

2020-12-14 Thread Heiko Carstens
Hi Linus,

please pull s390 updates for the 5.11 merge window.

Note that the diffstat summary when merging this will look slightly different
than the one generated by 'git request-pull' below:

107 files changed, 1268 insertions(+), 1993 deletions(-)

This is because I had to merge our fixes branch, which contains commits which
are already in 5.10, twice into our features branch to resolve dependencies.

There is also "mm: simplify follow_pte{,pmd}" sitting in Andrew's patch
collection which will break s390 compilation due to a conflict with a commit
in this pull request.

However Andrew already has a fixup patch for that, so I guess this problem will
be solved "automatically". That is: Andrew handles it :)

This is the fixup patch:
https://www.ozlabs.org/~akpm/mmotm/broken-out/mm-simplify-follow_ptepmd-fix.patch

Thanks,
Heiko

The following changes since commit f8394f232b1eab649ce2df5c5f15b0e528c92091:

  Linux 5.10-rc3 (2020-11-08 16:10:16 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git tags/s390-5.11-1

for you to fetch changes up to 343dbdb7cb8997a2cb0fd804d6563b8a6de8d49b:

  s390/mm: add support to allocate gigantic hugepages using CMA (2020-12-10 21:11:01 +0100)


- Add support for the hugetlb_cma command line option to allocate gigantic
  hugepages using CMA.

- Add arch_get_random_long() support.

- Add ap bus userspace notifications.

- Increase default size of vmalloc area to 512GB and otherwise let it increase
  dynamically by the size of physical memory. This should fix all occurrences
  where the vmalloc area was not large enough.

- Completely get rid of set_fs() (aka select SET_FS) and rework address space
  handling while doing that, making address space handling much simpler.

- Reimplement getcpu vdso syscall in C.

- Add support for extended SCLP responses (> 4k). This allows, for example,
  handling potentially large system configurations as well.

- Simplify KASAN by removing 3-level page table support and only supporting
  4-levels from now on.

- Improve debug-ability of the kernel decompressor code, which now also prints
  stack traces and symbols to the console in case of problems.

- Remove more power management leftovers.

- Other various fixes and improvements all over the place.


Alexander Gordeev (2):
  s390/vmem: remove redundant check
  s390/vmem: make variable and function names consistent

Christian Borntraeger (1):
  s390/trng: set quality to 1024

Daniel Vetter (1):
  s390/pci: remove races against pte updates

Gerald Schaefer (1):
  s390/mm: add support to allocate gigantic hugepages using CMA

Harald Freudenberger (3):
  s390/ap: ap bus userspace notifications for some bus conditions
  s390/zcrypt/pkey: introduce zcrypt_wait_api_operational() function
  s390/crypto: add arch_get_random_long() support

Heiko Carstens (15):
  s390: fix system call exit path
  s390/mm: extend default vmalloc area size to 512GB
  s390/mm: let vmalloc area size depend on physical memory size
  s390: update defconfigs
  s390/mm: remove unused clear_user_asce()
  Merge branch 'fixes' into features
  s390: add separate program check exit path
  init/Kconfig: make COMPILE_TEST depend on !S390
  Merge branch 'fixes' into features
  s390/mm: remove set_fs / rework address space handling
  s390/mm: use invalid asce instead of kernel asce
  s390/mm: add debug user asce support
  s390/vdso: reimplement getcpu vdso syscall
  s390/vdso: add missing prototypes for vdso functions
  s390/mm: use invalid asce for user space when switching to init_mm

Julian Wiedmann (3):
  s390/prng: let misc_register() add the prng sysfs attributes
  s390/stp: let subsys_system_register() sysfs attributes
  s390/ap: let bus_register() add the AP bus sysfs attributes

Mauro Carvalho Chehab (1):
  s390/cio: fix kernel-doc markups in cio driver.

Niklas Schnelle (2):
  s390/pci: inform when missing required facilities
  s390/Kconfig: default PCI_NR_FUNCTIONS to 512

Philipp Rudo (2):
  s390/kexec_file: fix diag308 subcode when loading crash kernel
  s390/boot: add build-id to decompressor

Qinglang Miao (1):
  s390/cio: fix use-after-free in ccw_device_destroy_console

Sumanth Korikkar (3):
  s390/sclp: use memblock for early read cpu info
  s390/sclp: avoid copy of sclp_info_sccb
  s390/sclp: provide extended sccb support

Sven Schnelle (4):
  s390: fix fpu restore in entry.S
  s390/idle: add missing mt_cycles calculation
  s390/idle: fix accounting with machine checks
  s390/smp: perform initial CPU reset also for SMT siblings

Thomas Richter (1):
  s390/cpum_sf.c: fix file permission for cpum_sfb_size

Vasily Gorbik (18):
  s390/head: set io/ext ha

Re: [patch 12/30] s390/irq: Use irq_desc_kstat_cpu() in show_msi_interrupt()

2020-12-10 Thread Heiko Carstens
On Thu, Dec 10, 2020 at 08:25:48PM +0100, Thomas Gleixner wrote:
> The irq descriptor is already there, no need to look it up again.
> 
> Signed-off-by: Thomas Gleixner 
> Cc: Christian Borntraeger 
> Cc: Heiko Carstens 
> Cc: linux-s...@vger.kernel.org
> ---
>  arch/s390/kernel/irq.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Acked-by: Heiko Carstens 


Re: [PATCH 3/3] s390/mm: Define arch_get_mappable_range()

2020-12-09 Thread Heiko Carstens
On Thu, Dec 10, 2020 at 09:48:11AM +0530, Anshuman Khandual wrote:
> >> Alternatively leaving __segment_load() and vmem_add_memory() unchanged
> >> will create three range checks i.e two memhp_range_allowed() and the
> >> existing VMEM_MAX_PHYS check in vmem_add_mapping() on all the hotplug
> >> paths, which is not optimal.
> > 
> > Ah, sorry. I didn't follow this discussion too closely. I just thought
> > my point of view would be clear: let's not have two different ways to
> > check for the same thing which must be kept in sync.
> > Therefore I was wondering why this next version is still doing
> > that. Please find a way to solve this.
> 
> The following change is after the current series and should work with
> and without memory hotplug enabled. There will be just a single place
> i.e vmem_get_max_addr() to update in case the maximum address changes
> from VMEM_MAX_PHYS to something else later.

Still not. That's way too much code churn for what you want to achieve.
If the s390 specific patch would look like below you can add

Acked-by: Heiko Carstens 

But please make sure that the arch_get_mappable_range() prototype in
linux/memory_hotplug.h is always visible and does not depend on
CONFIG_MEMORY_HOTPLUG. I'd like to avoid seeing sparse warnings
because of this.

Thanks.

diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
index 77767850d0d0..e0e78234ae57 100644
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -291,6 +291,7 @@ int arch_add_memory(int nid, u64 start, u64 size,
if (WARN_ON_ONCE(params->pgprot.pgprot != PAGE_KERNEL.pgprot))
return -EINVAL;
 
+   VM_BUG_ON(!memhp_range_allowed(start, size, 1));
rc = vmem_add_mapping(start, size);
if (rc)
return rc;
diff --git a/arch/s390/mm/vmem.c b/arch/s390/mm/vmem.c
index b239f2ba93b0..ccd55e2f97f9 100644
--- a/arch/s390/mm/vmem.c
+++ b/arch/s390/mm/vmem.c
@@ -4,6 +4,7 @@
  *Author(s): Heiko Carstens 
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -532,11 +533,23 @@ void vmem_remove_mapping(unsigned long start, unsigned long size)
mutex_unlock(&vmem_mutex);
 }
 
+struct range arch_get_mappable_range(void)
+{
+   struct range range;
+
+   range.start = 0;
+   range.end = VMEM_MAX_PHYS;
+   return range;
+}
+
 int vmem_add_mapping(unsigned long start, unsigned long size)
 {
+   struct range range;
int ret;
 
-   if (start + size > VMEM_MAX_PHYS ||
+   range = arch_get_mappable_range();
+   if (start < range.start ||
+   start + size > range.end ||
start + size < start)
return -ERANGE;
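
For readers who want to try the combined range check outside the kernel, here is a small user-space sketch. It mirrors the three conditions from the hunk above. SKETCH_MAX_PHYS, sketch_get_mappable_range() and sketch_check_mapping() are made-up names, and the limit value is an arbitrary assumption, not the real VMEM_MAX_PHYS.

```c
#include <errno.h>
#include <stdint.h>

struct range {
	uint64_t start;
	uint64_t end;
};

/* Hypothetical upper limit, chosen only for this example. */
#define SKETCH_MAX_PHYS (1ULL << 42)

/* Single place that defines which addresses may be mapped. */
static struct range sketch_get_mappable_range(void)
{
	struct range range;

	range.start = 0;
	range.end = SKETCH_MAX_PHYS;
	return range;
}

/* Returns 0 if [start, start + size) is acceptable, -ERANGE otherwise. */
int sketch_check_mapping(uint64_t start, uint64_t size)
{
	struct range range = sketch_get_mappable_range();

	if (start < range.start ||        /* below the mappable range */
	    start + size > range.end ||   /* beyond the mappable range */
	    start + size < start)         /* unsigned wrap-around */
		return -ERANGE;
	return 0;
}
```

The last condition matters because unsigned addition wraps: a huge start plus a small size can produce a small sum that would otherwise pass the upper bound test.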
 


Re: [PATCH 3/3] s390/mm: Define arch_get_mappable_range()

2020-12-09 Thread Heiko Carstens
On Wed, Dec 09, 2020 at 08:07:04AM +0530, Anshuman Khandual wrote:
> >> +  if (seg->end + 1 > VMEM_MAX_PHYS || seg->end + 1 < seg->start_addr) {
> >> +  rc = -ERANGE;
> >> +  goto out_resource;
> >> +  }
> >> +
...
> >> +struct range arch_get_mappable_range(void)
> >> +{
> >> +  struct range memhp_range;
> >> +
> >> +  memhp_range.start = 0;
> >> +  memhp_range.end =  VMEM_MAX_PHYS;
> >> +  return memhp_range;
> >> +}
> >> +
> >>  int arch_add_memory(int nid, u64 start, u64 size,
> >>struct mhp_params *params)
> >>  {
> >> @@ -291,6 +300,7 @@ int arch_add_memory(int nid, u64 start, u64 size,
> >>if (WARN_ON_ONCE(params->pgprot.pgprot != PAGE_KERNEL.pgprot))
> >>return -EINVAL;
> >>  
> >> +  VM_BUG_ON(!memhp_range_allowed(start, size, 1));
> >>rc = vmem_add_mapping(start, size);
> >>if (rc)
> > Is there a reason why you added the memhp_range_allowed() check call
> > to arch_add_memory() instead of vmem_add_mapping()? If you would do
> 
> As I had mentioned previously, memhp_range_allowed() is available with
> CONFIG_MEMORY_HOTPLUG but vmem_add_mapping() is always available. Hence
> there will be a build failure in vmem_add_mapping() for the range check
> memhp_range_allowed() without memory hotplug enabled.
> 
> > that, then the extra code in __segment_load() wouldn't be
> > required.
> > Even though the error message from memhp_range_allowed() might be
> > highly confusing.
>
> Alternatively leaving __segment_load() and vmem_add_memory() unchanged
> will create three range checks i.e two memhp_range_allowed() and the
> existing VMEM_MAX_PHYS check in vmem_add_mapping() on all the hotplug
> paths, which is not optimal.

Ah, sorry. I didn't follow this discussion too closely. I just thought
my point of view would be clear: let's not have two different ways to
check for the same thing which must be kept in sync.
Therefore I was wondering why this next version is still doing
that. Please find a way to solve this.


Re: [PATCH 3/3] s390/mm: Define arch_get_mappable_range()

2020-12-08 Thread Heiko Carstens
On Tue, Dec 08, 2020 at 09:46:18AM +0530, Anshuman Khandual wrote:
> This overrides arch_get_mappable_range() on s390 platform which will be
> used with recently added generic framework. It drops a redundant similar
> check in vmem_add_mapping() while compensating __segment_load() with a new
> address range check to preserve the existing functionality. It also adds a
> VM_BUG_ON() check that would ensure that memhp_range_allowed() has already
> been called on the hotplug path.
> 
> Cc: Heiko Carstens 
> Cc: Vasily Gorbik 
> Cc: David Hildenbrand 
> Cc: linux-s...@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Anshuman Khandual 
> ---
>  arch/s390/mm/extmem.c |  5 +
>  arch/s390/mm/init.c   | 10 ++
>  arch/s390/mm/vmem.c   |  4 
>  3 files changed, 15 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/s390/mm/extmem.c b/arch/s390/mm/extmem.c
> index 5060956b8e7d..cc055a78f7b6 100644
> --- a/arch/s390/mm/extmem.c
> +++ b/arch/s390/mm/extmem.c
> @@ -337,6 +337,11 @@ __segment_load (char *name, int do_nonshared, unsigned long *addr, unsigned long
>   goto out_free_resource;
>   }
>  
> + if (seg->end + 1 > VMEM_MAX_PHYS || seg->end + 1 < seg->start_addr) {
> + rc = -ERANGE;
> + goto out_resource;
> + }
> +
>   rc = vmem_add_mapping(seg->start_addr, seg->end - seg->start_addr + 1);
>   if (rc)
>   goto out_resource;
> diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
> index 77767850d0d0..64937baabf93 100644
> --- a/arch/s390/mm/init.c
> +++ b/arch/s390/mm/init.c
> @@ -278,6 +278,15 @@ device_initcall(s390_cma_mem_init);
>  
>  #endif /* CONFIG_CMA */
>  
> +struct range arch_get_mappable_range(void)
> +{
> + struct range memhp_range;
> +
> + memhp_range.start = 0;
> + memhp_range.end =  VMEM_MAX_PHYS;
> + return memhp_range;
> +}
> +
>  int arch_add_memory(int nid, u64 start, u64 size,
>   struct mhp_params *params)
>  {
> @@ -291,6 +300,7 @@ int arch_add_memory(int nid, u64 start, u64 size,
>   if (WARN_ON_ONCE(params->pgprot.pgprot != PAGE_KERNEL.pgprot))
>   return -EINVAL;
>  
> + VM_BUG_ON(!memhp_range_allowed(start, size, 1));
>   rc = vmem_add_mapping(start, size);
>   if (rc)
>   return rc;
> diff --git a/arch/s390/mm/vmem.c b/arch/s390/mm/vmem.c
> index b239f2ba93b0..749eab43aa93 100644
> --- a/arch/s390/mm/vmem.c
> +++ b/arch/s390/mm/vmem.c
> @@ -536,10 +536,6 @@ int vmem_add_mapping(unsigned long start, unsigned long size)
>  {
>   int ret;
>  
> - if (start + size > VMEM_MAX_PHYS ||
> - start + size < start)
> - return -ERANGE;
> -

Is there a reason why you added the memhp_range_allowed() check call
to arch_add_memory() instead of vmem_add_mapping()? If you would do
that, then the extra code in __segment_load() wouldn't be
required.
Even though the error message from memhp_range_allowed() might be
highly confusing.


[GIT PULL] s390 updates for 5.10-rc7

2020-12-03 Thread Heiko Carstens
Hi Linus,

please pull two late fixes for s390. One commit is fixing lockdep irq state
tracing which broke with -rc6. The other one fixes logical vs physical CPU
address mixup in our PCI code.

Thanks,
Heiko

The following changes since commit b65054597872ce3aefbc6a666385eabdf9e288da:

  Linux 5.10-rc6 (2020-11-29 15:50:50 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git tags/s390-5.10-6

for you to fetch changes up to b1cae1f84a0f609a34ebcaa087fbecef32f69882:

  s390: fix irq state tracing (2020-12-02 18:17:50 +0100)


- fix lockdep irq state tracing

- fix logical vs physical CPU address confusion in PCI code
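
The logical vs physical CPU address mixup can be shown with a toy example. Everything here is an assumption made for illustration: the lookup table, its values, the sketch_* names and the mask that clears the CPU field are not taken from the s390 PCI code.

```c
#include <stdint.h>

#define SKETCH_NR_CPUS 4

/* Hypothetical mapping: logical CPU number -> physical CPU address. */
static const uint16_t sketch_cpu_address[SKETCH_NR_CPUS] = {
	0x00, 0x02, 0x10, 0x12
};

/* Assumed layout: the CPU address lives in bits 8..23 of the low word. */
#define SKETCH_CPU_FIELD_CLEAR 0xff0000ffu

static uint16_t sketch_get_cpu_address(unsigned int logical_cpu)
{
	return sketch_cpu_address[logical_cpu];
}

/* Compose the low address word for a directed interrupt. */
uint32_t sketch_msi_address_lo(uint32_t msi_addr, unsigned int logical_cpu)
{
	uint32_t lo = msi_addr & SKETCH_CPU_FIELD_CLEAR;

	/* Translate first: the device needs the physical CPU address. */
	lo |= (uint32_t)sketch_get_cpu_address(logical_cpu) << 8;
	return lo;
}
```

In this toy mapping, logical CPU 2 has physical address 0x10, so using the logical number directly would target a different CPU. Both values only happen to agree for logical CPU 0.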


Alexander Gordeev (1):
  s390/pci: fix CPU address in MSI for directed IRQ

Heiko Carstens (1):
  s390: fix irq state tracing

 arch/s390/kernel/entry.S | 15 ---
 arch/s390/lib/delay.c    |  5 ++---
 arch/s390/pci/pci_irq.c  | 14 +++---
 3 files changed, 13 insertions(+), 21 deletions(-)

diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S
index 26bb0603c5a1..92beb1444644 100644
--- a/arch/s390/kernel/entry.S
+++ b/arch/s390/kernel/entry.S
@@ -763,12 +763,7 @@ ENTRY(io_int_handler)
xc  __PT_FLAGS(8,%r11),__PT_FLAGS(%r11)
TSTMSK  __LC_CPU_FLAGS,_CIF_IGNORE_IRQ
jo  .Lio_restore
-#if IS_ENABLED(CONFIG_TRACE_IRQFLAGS)
-   tmhh%r8,0x300
-   jz  1f
TRACE_IRQS_OFF
-1:
-#endif
xc  __SF_BACKCHAIN(8,%r15),__SF_BACKCHAIN(%r15)
 .Lio_loop:
lgr %r2,%r11# pass pointer to pt_regs
@@ -791,12 +786,7 @@ ENTRY(io_int_handler)
TSTMSK  __LC_CPU_FLAGS,_CIF_WORK
jnz .Lio_work
 .Lio_restore:
-#if IS_ENABLED(CONFIG_TRACE_IRQFLAGS)
-   tm  __PT_PSW(%r11),3
-   jno 0f
TRACE_IRQS_ON
-0:
-#endif
mvc __LC_RETURN_PSW(16),__PT_PSW(%r11)
tm  __PT_PSW+1(%r11),0x01   # returning to user ?
jno .Lio_exit_kernel
@@ -976,12 +966,7 @@ ENTRY(ext_int_handler)
xc  __PT_FLAGS(8,%r11),__PT_FLAGS(%r11)
TSTMSK  __LC_CPU_FLAGS,_CIF_IGNORE_IRQ
jo  .Lio_restore
-#if IS_ENABLED(CONFIG_TRACE_IRQFLAGS)
-   tmhh%r8,0x300
-   jz  1f
TRACE_IRQS_OFF
-1:
-#endif
xc  __SF_BACKCHAIN(8,%r15),__SF_BACKCHAIN(%r15)
lgr %r2,%r11# pass pointer to pt_regs
lghi%r3,EXT_INTERRUPT
diff --git a/arch/s390/lib/delay.c b/arch/s390/lib/delay.c
index daca7bad66de..8c0c68e7770e 100644
--- a/arch/s390/lib/delay.c
+++ b/arch/s390/lib/delay.c
@@ -33,7 +33,7 @@ EXPORT_SYMBOL(__delay);
 
 static void __udelay_disabled(unsigned long long usecs)
 {
-   unsigned long cr0, cr0_new, psw_mask, flags;
+   unsigned long cr0, cr0_new, psw_mask;
struct s390_idle_data idle;
u64 end;
 
@@ -45,9 +45,8 @@ static void __udelay_disabled(unsigned long long usecs)
psw_mask = __extract_psw() | PSW_MASK_EXT | PSW_MASK_WAIT;
set_clock_comparator(end);
set_cpu_flag(CIF_IGNORE_IRQ);
-   local_irq_save(flags);
psw_idle(&idle, psw_mask);
-   local_irq_restore(flags);
+   trace_hardirqs_off();
clear_cpu_flag(CIF_IGNORE_IRQ);
set_clock_comparator(S390_lowcore.clock_comparator);
__ctl_load(cr0, 0, 0);
diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c
index 743f257cf2cb..75217fb63d7b 100644
--- a/arch/s390/pci/pci_irq.c
+++ b/arch/s390/pci/pci_irq.c
@@ -103,9 +103,10 @@ static int zpci_set_irq_affinity(struct irq_data *data, const struct cpumask *de
 {
struct msi_desc *entry = irq_get_msi_desc(data->irq);
struct msi_msg msg = entry->msg;
+   int cpu_addr = smp_cpu_get_cpu_address(cpumask_first(dest));
 
msg.address_lo &= 0xffff;
-   msg.address_lo |= (cpumask_first(dest) << 8);
+   msg.address_lo |= (cpu_addr << 8);
pci_write_msi_msg(data->irq, &msg);
 
return IRQ_SET_MASK_OK;
@@ -238,6 +239,7 @@ int arch_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type)
unsigned long bit;
struct msi_desc *msi;
struct msi_msg msg;
+   int cpu_addr;
int rc, irq;
 
zdev->aisb = -1UL;
@@ -287,9 +289,15 @@ int arch_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type)
 handle_percpu_irq);
msg.data = hwirq - bit;
if (irq_delivery == DIRECTED) {
+   if (msi->affinity)
+   cpu = cpumask_first(&msi->affinity->mask);
+   else
+   cpu = 0;
+   cpu_addr = smp_cpu_get_cpu_address(cpu);
+
msg.address_lo = zdev->msi_addr & 0xffff;
-  

Re: [PATCH AUTOSEL 5.9 27/39] sched/idle: Fix arch_cpu_idle() vs tracing

2020-12-03 Thread Heiko Carstens
On Thu, Dec 03, 2020 at 08:28:21AM -0500, Sasha Levin wrote:
> From: Peter Zijlstra 
> 
> [ Upstream commit 58c644ba512cfbc2e39b758dd979edd1d6d00e27 ]
> 
> We call arch_cpu_idle() with RCU disabled, but then use
> local_irq_{en,dis}able(), which invokes tracing, which relies on RCU.
> 
> Switch all arch_cpu_idle() implementations to use
> raw_local_irq_{en,dis}able() and carefully manage the
> lockdep,rcu,tracing state like we do in entry.
> 
> (XXX: we really should change arch_cpu_idle() to not return with
> interrupts enabled)
> 
> Reported-by: Sven Schnelle 
> Signed-off-by: Peter Zijlstra (Intel) 
> Reviewed-by: Mark Rutland 
> Tested-by: Mark Rutland 
> Link: https://lkml.kernel.org/r/20201120114925.594122...@infradead.org
> Signed-off-by: Sasha Levin 

This patch broke s390 irq state tracing. A patch to fix this is
scheduled to be merged upstream today (hopefully).
Therefore I think this patch should not yet go into 5.9 stable.


Re: [RFC V2 3/3] s390/mm: Define arch_get_mappable_range()

2020-12-03 Thread Heiko Carstens
On Thu, Dec 03, 2020 at 06:03:00AM +0530, Anshuman Khandual wrote:
> >> diff --git a/arch/s390/mm/extmem.c b/arch/s390/mm/extmem.c
> >> index 5060956b8e7d..cc055a78f7b6 100644
> >> --- a/arch/s390/mm/extmem.c
> >> +++ b/arch/s390/mm/extmem.c
> >> @@ -337,6 +337,11 @@ __segment_load (char *name, int do_nonshared, unsigned long *addr, unsigned long
> >>goto out_free_resource;
> >>}
> >>  
> >> +  if (seg->end + 1 > VMEM_MAX_PHYS || seg->end + 1 < seg->start_addr) {
> >> +  rc = -ERANGE;
> >> +  goto out_resource;
> >> +  }
> >> +
> >>rc = vmem_add_mapping(seg->start_addr, seg->end - seg->start_addr + 1);
> >>if (rc)
> >>goto out_resource;
> >> diff --git a/arch/s390/mm/vmem.c b/arch/s390/mm/vmem.c
> >> index b239f2ba93b0..06dddcc0ce06 100644
> >> --- a/arch/s390/mm/vmem.c
> >> +++ b/arch/s390/mm/vmem.c
> >> @@ -532,14 +532,19 @@ void vmem_remove_mapping(unsigned long start, unsigned long size)
> >>mutex_unlock(&vmem_mutex);
> >>  }
> >>  
> >> +struct range arch_get_mappable_range(void)
> >> +{
> >> +  struct range memhp_range;
> >> +
> >> +  memhp_range.start = 0;
> >> +  memhp_range.end =  VMEM_MAX_PHYS;
> >> +  return memhp_range;
> >> +}
> >> +
> >>  int vmem_add_mapping(unsigned long start, unsigned long size)
> >>  {
> >>int ret;
> >>  
> >> -  if (start + size > VMEM_MAX_PHYS ||
> >> -  start + size < start)
> >> -  return -ERANGE;
> >> -
> > 
> > I really fail to see how this could be considered an improvement for
> > s390. Especially I do not like that the (central) range check is now
> > moved to the caller (__segment_load). Which would mean potential
> > additional future callers would have to duplicate that code as well.
> 
> The physical range check is being moved to the generic hotplug code
> via arch_get_mappable_range() instead, making the existing check in
> vmem_add_mapping() redundant. Dropping the check there necessitates
> adding back a similar check in __segment_load(). Otherwise there
> will be a loss of functionality in terms of range check.
> 
> Maybe we could just keep this existing check in vmem_add_mapping()
> as well in order to avoid this movement, but then it would be a redundant
> check in every hotplug path.
> 
> So I guess the choice is to either have redundant range checks in
> all hotplug paths or future internal callers of vmem_add_mapping()
> take care of the range check.

The problem I have with this current approach from an architecture
perspective: we end up having two completely different methods which
are doing the same and must be kept in sync. This might be obvious
looking at this patch, but I'm sure this will go out-of-sync (aka
broken) sooner or later.

Therefore I would really like to see a single method to do the range
checking. Maybe you could add a callback into architecture code, so
that such an architecture specific function could also be used
elsewhere. Dunno.
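
One way to picture the "single method" idea is a helper that every caller goes through, with the architecture supplying only the limits via a callback. This is a user-space sketch with made-up names (sketch_*), an arbitrary limit, and a plain function pointer standing in for the architecture hook. It is not the interface that was eventually merged.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

struct range {
	uint64_t start;
	uint64_t end;
};

/* Callback type through which "architecture code" reports its limits. */
typedef struct range (*sketch_range_fn)(void);

/* Hypothetical architecture limit, for illustration only. */
static struct range sketch_arch_range(void)
{
	struct range r = { 0, 1ULL << 42 };

	return r;
}

static sketch_range_fn sketch_range_cb = sketch_arch_range;

/* The one and only range check; every caller below uses it. */
static bool sketch_range_allowed(uint64_t start, uint64_t size)
{
	struct range r = sketch_range_cb();
	uint64_t end = start + size;

	return end >= start && start >= r.start && end <= r.end;
}

/* Caller 1: a hotplug-like path working with start and size. */
int sketch_hotplug_add(uint64_t start, uint64_t size)
{
	return sketch_range_allowed(start, size) ? 0 : -ERANGE;
}

/* Caller 2: a segment-load-like path working with an inclusive end. */
int sketch_segment_load(uint64_t start_addr, uint64_t end_inclusive)
{
	if (end_inclusive < start_addr || end_inclusive == UINT64_MAX)
		return -ERANGE;
	if (!sketch_range_allowed(start_addr, end_inclusive - start_addr + 1))
		return -ERANGE;
	return 0;
}
```

Since both callers funnel into sketch_range_allowed(), a change of the limit has to be made in exactly one place, which is the property asked for above.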


Re: [RFC V2 0/3] mm/hotplug: Pre-validate the address range with platform

2020-12-02 Thread Heiko Carstens
On Mon, Nov 30, 2020 at 08:59:49AM +0530, Anshuman Khandual wrote:
> This series adds a mechanism allowing platforms to weigh in and prevalidate
> incoming address range before proceeding further with the memory hotplug.
> This helps prevent potential platform errors for the given address range,
> down the hotplug call chain, which inevitably fails the hotplug itself.
> 
> This mechanism was suggested by David Hildenbrand during another discussion
> with respect to a memory hotplug fix on arm64 platform.
> 
> https://lore.kernel.org/linux-arm-kernel/1600332402-30123-1-git-send-email-anshuman.khand...@arm.com/
> 
> This mechanism focuses on the addressability aspect and not [sub] section
> alignment aspect. Hence check_hotplug_memory_range() and check_pfn_span()
> have been left unchanged. Wondering if all these can still be unified in
> an expanded memhp_range_allowed() check, that can be called from multiple
> memory hot add and remove paths.
> 
> This series applies on v5.10-rc6 and has been slightly tested on arm64.
> But looking for some early feedback here.
> 
> Changes in RFC V2:
> 
> Incorporated all review feedback from David.
> 
> - Added additional range check in __segment_load() on s390 which was lost
> - Changed is_private init in pagemap_range()
> - Moved the framework into mm/memory_hotplug.c
> - Made arch_get_addressable_range() a __weak function
> - Renamed arch_get_addressable_range() as arch_get_mappable_range()
> - Callback arch_get_mappable_range() only handles range requiring linear mapping
> - Merged multiple memhp_range_allowed() checks in register_memory_resource()
> - Replaced WARN() with pr_warn() in memhp_range_allowed()
> - Replaced error return code ERANGE with E2BIG
> 
> Changes in RFC V1:
> 
> https://lore.kernel.org/linux-mm/1606098529-7907-1-git-send-email-anshuman.khand...@arm.com/
> 
> Cc: Heiko Carstens 
> Cc: Vasily Gorbik 
> Cc: Catalin Marinas 
> Cc: Will Deacon 
> Cc: Ard Biesheuvel 
> Cc: Mark Rutland 
> Cc: David Hildenbrand 
> Cc: Andrew Morton 
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-s...@vger.kernel.org
> Cc: linux...@kvack.org
> Cc: linux-kernel@vger.kernel.org

Btw. please use git send-email's --cc-cover option to make sure that
all patches of this series will be sent to all listed cc's.
I really dislike receiving only the cover letter and maybe one patch
and then having to figure out where to find the rest.

Thanks :)


Re: [RFC V2 3/3] s390/mm: Define arch_get_mappable_range()

2020-12-02 Thread Heiko Carstens
On Mon, Nov 30, 2020 at 08:59:52AM +0530, Anshuman Khandual wrote:
> This overrides arch_get_mappable_range() on s390 platform and drops now
> redundant similar check in vmem_add_mapping(). This compensates by adding
> a new check in __segment_load() to preserve the existing functionality.
> 
> Cc: Heiko Carstens 
> Cc: Vasily Gorbik 
> Cc: David Hildenbrand 
> Cc: linux-s...@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Anshuman Khandual 
> ---
>  arch/s390/mm/extmem.c |  5 +
>  arch/s390/mm/vmem.c   | 13 +
>  2 files changed, 14 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/s390/mm/extmem.c b/arch/s390/mm/extmem.c
> index 5060956b8e7d..cc055a78f7b6 100644
> --- a/arch/s390/mm/extmem.c
> +++ b/arch/s390/mm/extmem.c
> @@ -337,6 +337,11 @@ __segment_load (char *name, int do_nonshared, unsigned long *addr, unsigned long
>   goto out_free_resource;
>   }
>  
> + if (seg->end + 1 > VMEM_MAX_PHYS || seg->end + 1 < seg->start_addr) {
> + rc = -ERANGE;
> + goto out_resource;
> + }
> +
>   rc = vmem_add_mapping(seg->start_addr, seg->end - seg->start_addr + 1);
>   if (rc)
>   goto out_resource;
> diff --git a/arch/s390/mm/vmem.c b/arch/s390/mm/vmem.c
> index b239f2ba93b0..06dddcc0ce06 100644
> --- a/arch/s390/mm/vmem.c
> +++ b/arch/s390/mm/vmem.c
> @@ -532,14 +532,19 @@ void vmem_remove_mapping(unsigned long start, unsigned long size)
>   mutex_unlock(&vmem_mutex);
>  }
>  
> +struct range arch_get_mappable_range(void)
> +{
> + struct range memhp_range;
> +
> + memhp_range.start = 0;
> + memhp_range.end =  VMEM_MAX_PHYS;
> + return memhp_range;
> +}
> +
>  int vmem_add_mapping(unsigned long start, unsigned long size)
>  {
>   int ret;
>  
> - if (start + size > VMEM_MAX_PHYS ||
> - start + size < start)
> - return -ERANGE;
> -

I really fail to see how this could be considered an improvement for
s390. In particular, I do not like that the (central) range check is
now moved to the caller (__segment_load), which would mean that any
additional future callers would have to duplicate that code as well.
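
A minimal user-space sketch of the point, assuming made-up names
(EXAMPLE_MAX_PHYS, example_add_mapping() and the two callers are
hypothetical, not the real s390 functions): when the check lives in the
callee, every caller gets it for free, including the wrap-around case.

```c
#include <assert.h>
#include <errno.h>

#define EXAMPLE_MAX_PHYS (1UL << 30)	/* hypothetical upper bound */

/* Central place for the range check, modelled on the removed hunk. */
static int example_add_mapping(unsigned long start, unsigned long size)
{
	/* The second clause catches unsigned wrap-around of start + size. */
	if (start + size > EXAMPLE_MAX_PHYS || start + size < start)
		return -ERANGE;
	/* ... the actual mapping work would happen here ... */
	return 0;
}

/* First caller: passes an inclusive [start, end] segment. */
static int example_segment_load(unsigned long start, unsigned long end)
{
	return example_add_mapping(start, end - start + 1);
}

/* A later, second caller needs no range check of its own. */
static int example_other_caller(unsigned long start, unsigned long size)
{
	return example_add_mapping(start, size);
}
```

If the check were moved into example_segment_load() instead,
example_other_caller() would silently lose it unless somebody remembered
to copy it.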


Re: [GIT pull] locking/urgent for v5.10-rc6

2020-12-02 Thread Heiko Carstens
On Wed, Dec 02, 2020 at 11:16:05AM +0000, Mark Rutland wrote:
> On Wed, Dec 02, 2020 at 11:56:49AM +0100, Heiko Carstens wrote:
> > From 7bd86fb3eb039a4163281472ca79b9158e726526 Mon Sep 17 00:00:00 2001
> > From: Heiko Carstens 
> > Date: Wed, 2 Dec 2020 11:46:01 +0100
> > Subject: [PATCH] s390: fix irq state tracing
> > 
> > With commit 58c644ba512c ("sched/idle: Fix arch_cpu_idle() vs
> > tracing") common code calls arch_cpu_idle() with a lockdep state that
> > tells irqs are on.
> > 
> > This doesn't work very well for s390: psw_idle() will enable interrupts
> > to wait for an interrupt. As soon as an interrupt occurs the interrupt
> > handler will verify if the old context was psw_idle(). If that is the
> > case the interrupt enablement bits in the old program status word will
> > be cleared.
> > 
> > A subsequent test in both the external as well as the io interrupt
> > handler checks if in the old context interrupts were enabled. Due to
> > the above patching of the old program status word it is assumed the
> > old context had interrupts disabled, and therefore a call to
> > TRACE_IRQS_OFF (aka trace_hardirqs_off_caller) is skipped. Which in
> > turn makes lockdep incorrectly "think" that interrupts are enabled
> > within the interrupt handler.
> > 
> > Fix this by unconditionally calling TRACE_IRQS_OFF when entering
> > interrupt handlers. Also call unconditionally TRACE_IRQS_ON when
> > leaving interrupts handlers.
> > 
> > This leaves the special psw_idle() case, which now returns with
> > interrupts disabled, but has an "irqs on" lockdep state. So callers of
> > psw_idle() must adjust the state on their own, if required. This is
> > currently only __udelay_disabled().
> > 
> > Fixes: 58c644ba512c ("sched/idle: Fix arch_cpu_idle() vs tracing")
> > Signed-off-by: Heiko Carstens 
> 
> FWIW, this makes sense to me from what I had to chase on the arm64 side,
> and this seems happy atop v5.10-rc6 with all the lockdep and RCU debug
> options enabled when booting to userspace under QEMU.
> 
> Thanks,
> Mark.

Thanks a lot for having a look and testing this!


Re: [GIT pull] locking/urgent for v5.10-rc6

2020-12-02 Thread Heiko Carstens
On Wed, Dec 02, 2020 at 10:21:16AM +0100, Peter Zijlstra wrote:
> On Tue, Dec 01, 2020 at 08:18:56PM +0100, Heiko Carstens wrote:
> OK, so with a little help from s390/PoO and Sven, the code removed skips
> the TRACE_IRQS_OFF when IRQs were enabled in the old PSW (the previous
> context).
> 
> That sounds entirely the right thing. Irrespective of what the previous
> IRQ state was, the current state is off.
> 
> > diff --git a/arch/s390/kernel/idle.c b/arch/s390/kernel/idle.c
> > index 2b85096964f8..5bd8c1044d09 100644
> > --- a/arch/s390/kernel/idle.c
> > +++ b/arch/s390/kernel/idle.c
> > @@ -123,7 +123,6 @@ void arch_cpu_idle_enter(void)
> >  void arch_cpu_idle(void)
> >  {
> > enabled_wait();
> > -   raw_local_irq_enable();
> >  }
> 
> Currently arch_cpu_idle() is defined as to return with IRQs enabled,
> however, the very first thing we do when we return is
> raw_local_irq_disable(), so this change is harmless.
> 
> It is also the direction I've been arguing for elsewhere in this thread.
> So I'm certainly not complaining.

So I left that raw_local_irq_enable() in to be consistent with other
architectures. enabled_wait() now returns with irqs disabled, but with
a lockdep state that tells irqs are on...  See patch below.
Works and hopefully makes sense ;)

In addition (but not for rc7) I want to get rid of our complex udelay
implementation. I think we don't need that anymore.. so there would be
only the idle code left where we have to play tricks.

From 7bd86fb3eb039a4163281472ca79b9158e726526 Mon Sep 17 00:00:00 2001
From: Heiko Carstens 
Date: Wed, 2 Dec 2020 11:46:01 +0100
Subject: [PATCH] s390: fix irq state tracing

With commit 58c644ba512c ("sched/idle: Fix arch_cpu_idle() vs
tracing") common code calls arch_cpu_idle() with a lockdep state that
tells irqs are on.

This doesn't work very well for s390: psw_idle() will enable interrupts
to wait for an interrupt. As soon as an interrupt occurs the interrupt
handler will verify if the old context was psw_idle(). If that is the
case the interrupt enablement bits in the old program status word will
be cleared.

A subsequent test in both the external as well as the io interrupt
handler checks if in the old context interrupts were enabled. Due to
the above patching of the old program status word it is assumed the
old context had interrupts disabled, and therefore a call to
TRACE_IRQS_OFF (aka trace_hardirqs_off_caller) is skipped. Which in
turn makes lockdep incorrectly "think" that interrupts are enabled
within the interrupt handler.

Fix this by unconditionally calling TRACE_IRQS_OFF when entering
interrupt handlers. Also unconditionally call TRACE_IRQS_ON when
leaving interrupt handlers.

This leaves the special psw_idle() case, which now returns with
interrupts disabled, but has an "irqs on" lockdep state. So callers of
psw_idle() must adjust the state on their own, if required. This is
currently only __udelay_disabled().

Fixes: 58c644ba512c ("sched/idle: Fix arch_cpu_idle() vs tracing")
Signed-off-by: Heiko Carstens 
---
 arch/s390/kernel/entry.S | 15 ---
 arch/s390/lib/delay.c    |  5 ++---
 2 files changed, 2 insertions(+), 18 deletions(-)

diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S
index 26bb0603c5a1..92beb1444644 100644
--- a/arch/s390/kernel/entry.S
+++ b/arch/s390/kernel/entry.S
@@ -763,12 +763,7 @@ ENTRY(io_int_handler)
xc  __PT_FLAGS(8,%r11),__PT_FLAGS(%r11)
TSTMSK  __LC_CPU_FLAGS,_CIF_IGNORE_IRQ
jo  .Lio_restore
-#if IS_ENABLED(CONFIG_TRACE_IRQFLAGS)
-   tmhh%r8,0x300
-   jz  1f
TRACE_IRQS_OFF
-1:
-#endif
xc  __SF_BACKCHAIN(8,%r15),__SF_BACKCHAIN(%r15)
 .Lio_loop:
lgr %r2,%r11# pass pointer to pt_regs
@@ -791,12 +786,7 @@ ENTRY(io_int_handler)
TSTMSK  __LC_CPU_FLAGS,_CIF_WORK
jnz .Lio_work
 .Lio_restore:
-#if IS_ENABLED(CONFIG_TRACE_IRQFLAGS)
-   tm  __PT_PSW(%r11),3
-   jno 0f
TRACE_IRQS_ON
-0:
-#endif
mvc __LC_RETURN_PSW(16),__PT_PSW(%r11)
tm  __PT_PSW+1(%r11),0x01   # returning to user ?
jno .Lio_exit_kernel
@@ -976,12 +966,7 @@ ENTRY(ext_int_handler)
xc  __PT_FLAGS(8,%r11),__PT_FLAGS(%r11)
TSTMSK  __LC_CPU_FLAGS,_CIF_IGNORE_IRQ
jo  .Lio_restore
-#if IS_ENABLED(CONFIG_TRACE_IRQFLAGS)
-   tmhh%r8,0x300
-   jz  1f
TRACE_IRQS_OFF
-1:
-#endif
xc  __SF_BACKCHAIN(8,%r15),__SF_BACKCHAIN(%r15)
lgr %r2,%r11# pass pointer to pt_regs
lghi%r3,EXT_INTERRUPT
diff --git a/arch/s390/lib/delay.c b/arch/s390/lib/delay.c
index daca7bad66de..8c0c68e7770e 100644
--- a/arch/s390/lib/delay.c
+++ b/arch/s390/lib/delay.c
@@ -33,7 +33,7 @@ EXPORT_SYMBOL(__delay);
 
 static void __udelay_dis

Re: [GIT pull] locking/urgent for v5.10-rc6

2020-12-02 Thread Heiko Carstens
On Wed, Dec 02, 2020 at 10:38:05AM +0100, Peter Zijlstra wrote:
> On Wed, Dec 02, 2020 at 08:54:27AM +0100, Heiko Carstens wrote:
> > > > But but but...
> > > > 
> > > >   do_idle() # IRQs on
> > > > local_irq_disable();# IRQs off
> > > > > default_idle_call() # IRQs off
> > >   lockdep_hardirqs_on();  # IRQs off, but lockdep thinks they're on
> > > >   arch_cpu_idle()   # IRQs off
> > > > enabled_wait()  # IRQs off
> > > >   raw_local_save()  # still off
> > > >   psw_idle()# very much off
> > > > ext_int_handler # get an interrupt ?!?!
> > > rcu_irq_enter()   # lockdep thinks IRQs are on <- FAIL
> > > 
> > > I can't much read s390 assembler, but ext_int_handler() has a
> > > TRACE_IRQS_OFF, which would be sufficient to re-align the lockdep state
> > > with the actual state, but there's some condition before it, what's that
> > > test and is that right?
> > 
> > After digging a bit into our asm code: no, it is not right, and only
> > for psw_idle() it is wrong.
> > 
> > What happens with the current code:
> > 
> > - default_idle_call() calls lockdep_hardirqs_on() before calling into
> >   arch_cpu_idle()
> 
> Correct.
> 
> > - our arch_cpu_idle() calls psw_idle() which enables irqs. the irq
> >   handler will call/use the SWITCH_ASYNC macro which clears the
> >   interrupt enabled bits in the old program status word (_only_ for
> >   psw_idle)
> 
> This is the condition at 0: that compares r13 to psw_idle_exit?

Yes, exactly.

> > So I guess my patch which I sent yesterday evening should fix all that
> > mess
> 
> Yes, afaict it does the right thing. Exceptions should call
> TRACE_IRQS_OFF unconditionally, since the hardware will disable
> interrupts upon taking an exception, irrespective of what the previous
> context had.
> 
> On exception return the previous IRQ state is inspected and lockdep is
> changed to match (except for NMIs, which either are ignored by lockdep
> or need a little bit of extra care).

Yes, we do all that, except that it seems odd to test the previous
state for interrupts (not exceptions), since for interrupts the
previous context obviously must have had interrupts enabled.

Except if you play tricks with the old PSW, like we do :/


Re: [GIT pull] locking/urgent for v5.10-rc6

2020-12-01 Thread Heiko Carstens
> > But but but...
> > 
> >   do_idle() # IRQs on
> > local_irq_disable();# IRQs off
> > > default_idle_call() # IRQs off
>   lockdep_hardirqs_on();  # IRQs off, but lockdep thinks they're on
> >   arch_cpu_idle()   # IRQs off
> > enabled_wait()  # IRQs off
> >   raw_local_save()  # still off
> >   psw_idle()# very much off
> > ext_int_handler # get an interrupt ?!?!
> rcu_irq_enter()   # lockdep thinks IRQs are on <- FAIL
> 
> I can't much read s390 assembler, but ext_int_handler() has a
> TRACE_IRQS_OFF, which would be sufficient to re-align the lockdep state
> with the actual state, but there's some condition before it, what's that
> test and is that right?

After digging a bit into our asm code: no, it is not right, and only
for psw_idle() it is wrong.

What happens with the current code:

- default_idle_call() calls lockdep_hardirqs_on() before calling into
  arch_cpu_idle()

- our arch_cpu_idle() calls psw_idle() which enables irqs. The irq
  handler will call/use the SWITCH_ASYNC macro which clears the
  interrupt enabled bits in the old program status word (_only_ for
  psw_idle)

- this again causes the interrupt handler to _not_ call TRACE_IRQS_OFF
  and therefore lockdep thinks interrupts are enabled within the
  interrupt handler

So I guess my patch which I sent yesterday evening should fix all that
mess - plus an explicit trace_hardirqs_off() call in our udelay
implementation is required now.

I'll send a proper patch later.
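
The sequence above can be modelled with a tiny user-space sketch. This is
only an illustration under simplifying assumptions: struct cpu_model and
all function names are invented for the example, and the "old PSW bits
cleared" step is reduced to a boolean.

```c
#include <assert.h>
#include <stdbool.h>

struct cpu_model {
	bool hw_irqs_on;	/* what the hardware is really doing */
	bool lockdep_irqs_on;	/* what lockdep believes */
};

/*
 * Enter idle as described above, then take an interrupt. Returns the
 * "irqs were enabled" bit as the handler sees it in the old PSW.
 */
static bool idle_then_interrupt(struct cpu_model *cpu)
{
	cpu->lockdep_irqs_on = true;	/* lockdep_hardirqs_on() in common code */
	cpu->hw_irqs_on = true;		/* psw_idle() waits with irqs enabled */
	cpu->hw_irqs_on = false;	/* taking the interrupt disables irqs */
	return false;			/* enablement bits cleared for psw_idle */
}

/* Old behaviour: tell lockdep "off" only if the old PSW had irqs enabled. */
static void irq_entry_conditional(struct cpu_model *cpu, bool old_psw_irqs_on)
{
	if (old_psw_irqs_on)
		cpu->lockdep_irqs_on = false;
}

/* Proposed behaviour: always tell lockdep that irqs are off on entry. */
static void irq_entry_unconditional(struct cpu_model *cpu)
{
	cpu->lockdep_irqs_on = false;
}

static bool state_consistent(const struct cpu_model *cpu)
{
	return cpu->hw_irqs_on == cpu->lockdep_irqs_on;
}
```

In this model the conditional variant leaves lockdep believing irqs are on
inside the handler, while the unconditional variant brings both views back
in line.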


Re: [GIT pull] locking/urgent for v5.10-rc6

2020-12-01 Thread Heiko Carstens
On Tue, Dec 01, 2020 at 08:14:41PM +0100, Peter Zijlstra wrote:
> On Tue, Dec 01, 2020 at 06:57:37PM +0000, Mark Rutland wrote:
> > On Tue, Dec 01, 2020 at 07:15:06PM +0100, Peter Zijlstra wrote:
> > > On Tue, Dec 01, 2020 at 03:55:19PM +0100, Peter Zijlstra wrote:
> > > > On Tue, Dec 01, 2020 at 06:46:44AM -0800, Paul E. McKenney wrote:
> > > > 
> > > > > > So after having talked to Sven a bit, the thing that is happening, is
> > > > > > that this is the one place where we take interrupts with RCU being
> > > > > > disabled. Normally RCU is watching and all is well, except during idle.
> > > > > 
> > > > > Isn't interrupt entry supposed to invoke rcu_irq_enter() at some point?
> > > > > Or did this fall victim to recent optimizations?
> > > > 
> > > > It does, but the problem is that s390 is still using
> > > 
> > > I might've been too quick there, I can't actually seem to find where
> > > s390 does rcu_irq_enter()/exit().
> > > 
> > > Also, I'm thinking the below might just about solve the current problem.
> > > The next problem would then be it calling TRACE_IRQS_ON after it did
> > > rcu_irq_exit()... :/
> > 
> > I gave this patch a go under QEMU TCG atop v5.10-rc6 s390 defconfig with
> > PROVE_LOCKING and DEBUG_ATOMIC_SLEEP. It significantly reduces the
> > number of lockdep splats, but IIUC we need to handle the io_int_handler
> > path in addition to the ext_int_handler path, and there's a remaining
> > lockdep splat (below).
> 
> I'm amazed it didn't actually make things worse, given how I failed to
> spot do_IRQ() was arch code etc..
> 
> > If this ends up looking like we'll need more point-fixes, I wonder if we
> > should conditionalise the new behaviour of the core idle code under a
> > new CONFIG symbol for now, and opt-in x86 and arm64, then transition the
> > rest once they've had a chance to test. They'll still be broken in the
> > mean time, but no more so than they previously were.
> 
> We can do that I suppose... :/

Well, the following small patch works for me (plus an additional call to
trace_hardirqs_on() in our udelay implementation - but that's probably
independent).
Is there a reason why this should be considered broken?

diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S
index 26bb0603c5a1..92beb1444644 100644
--- a/arch/s390/kernel/entry.S
+++ b/arch/s390/kernel/entry.S
@@ -763,12 +763,7 @@ ENTRY(io_int_handler)
xc  __PT_FLAGS(8,%r11),__PT_FLAGS(%r11)
TSTMSK  __LC_CPU_FLAGS,_CIF_IGNORE_IRQ
jo  .Lio_restore
-#if IS_ENABLED(CONFIG_TRACE_IRQFLAGS)
-   tmhh%r8,0x300
-   jz  1f
TRACE_IRQS_OFF
-1:
-#endif
xc  __SF_BACKCHAIN(8,%r15),__SF_BACKCHAIN(%r15)
 .Lio_loop:
lgr %r2,%r11# pass pointer to pt_regs
@@ -791,12 +786,7 @@ ENTRY(io_int_handler)
TSTMSK  __LC_CPU_FLAGS,_CIF_WORK
jnz .Lio_work
 .Lio_restore:
-#if IS_ENABLED(CONFIG_TRACE_IRQFLAGS)
-   tm  __PT_PSW(%r11),3
-   jno 0f
TRACE_IRQS_ON
-0:
-#endif
mvc __LC_RETURN_PSW(16),__PT_PSW(%r11)
tm  __PT_PSW+1(%r11),0x01   # returning to user ?
jno .Lio_exit_kernel
@@ -976,12 +966,7 @@ ENTRY(ext_int_handler)
xc  __PT_FLAGS(8,%r11),__PT_FLAGS(%r11)
TSTMSK  __LC_CPU_FLAGS,_CIF_IGNORE_IRQ
jo  .Lio_restore
-#if IS_ENABLED(CONFIG_TRACE_IRQFLAGS)
-   tmhh%r8,0x300
-   jz  1f
TRACE_IRQS_OFF
-1:
-#endif
xc  __SF_BACKCHAIN(8,%r15),__SF_BACKCHAIN(%r15)
lgr %r2,%r11# pass pointer to pt_regs
lghi%r3,EXT_INTERRUPT
diff --git a/arch/s390/kernel/idle.c b/arch/s390/kernel/idle.c
index 2b85096964f8..5bd8c1044d09 100644
--- a/arch/s390/kernel/idle.c
+++ b/arch/s390/kernel/idle.c
@@ -123,7 +123,6 @@ void arch_cpu_idle_enter(void)
 void arch_cpu_idle(void)
 {
enabled_wait();
-   raw_local_irq_enable();
 }
 
 void arch_cpu_idle_exit(void)


Re: [PATCH linux-next] include/getcpu.h: Fixed kernel test robot warning

2020-11-30 Thread Heiko Carstens
On Sat, Nov 28, 2020 at 09:11:57PM +0530, Souptick Joarder wrote:
> Kernel test robot generates below warning ->
> 
> >> arch/s390/kernel/vdso64/getcpu.c:8:5: warning: no previous prototype
> >> for function '__s390_vdso_getcpu' [-Wmissing-prototypes]
>int __s390_vdso_getcpu(unsigned *cpu, unsigned *node, struct
> getcpu_cache *unused)
>^
>arch/s390/kernel/vdso64/getcpu.c:8:1: note: declare 'static' if the
> function is not intended to be used outside of this translation unit
>int __s390_vdso_getcpu(unsigned *cpu, unsigned *node, struct
> getcpu_cache *unused)
>^
>static
>1 warning generated.
> 
> vim +/__s390_vdso_getcpu +8 arch/s390/kernel/vdso64/getcpu.c
> 
>  7
>> 8  int __s390_vdso_getcpu(unsigned *cpu, unsigned *node, struct
>> getcpu_cache *unused)
> 
> It is fixed by adding a prototype.
> 
> Reported-by: kernel test robot 
> Signed-off-by: Souptick Joarder 
> ---
>  include/linux/getcpu.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/include/linux/getcpu.h b/include/linux/getcpu.h
> index c304dcd..43c9208 100644
> --- a/include/linux/getcpu.h
> +++ b/include/linux/getcpu.h
> @@ -16,4 +16,5 @@ struct getcpu_cache {
>   unsigned long blob[128 / sizeof(long)];
>  };
>  
> +int __s390_vdso_getcpu(unsigned *cpu, unsigned *node, struct getcpu_cache *unused);

Sorry, no. We won't add s390 specific prototypes to common code header
files. Anyway, I solved this differently and the "fix" should be in
linux-next soon.


[GIT PULL] s390 updates for 5.10-rc6

2020-11-24 Thread Heiko Carstens
Hi Linus,

please pull one important s390 fix for 5.10-rc6.

Thanks,
Heiko

The following changes since commit 78d732e1f326f74f240d416af9484928303d9951:

  s390/cpum_sf.c: fix file permission for cpum_sfb_size (2020-11-12 12:10:36 +0100)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git tags/s390-5.10-5

for you to fetch changes up to 1179f170b6f0af7bb0b3b7628136eaac450ddf31:

  s390: fix fpu restore in entry.S (2020-11-23 11:52:13 +0100)


- disable interrupts when restoring fpu and vector registers,
  otherwise KVM guests might see corrupted register contents


Sven Schnelle (1):
  s390: fix fpu restore in entry.S

 arch/s390/kernel/asm-offsets.c | 10 +-
 arch/s390/kernel/entry.S   |  2 ++
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/s390/kernel/asm-offsets.c b/arch/s390/kernel/asm-offsets.c
index 2012c1cf0853..483051e10db3 100644
--- a/arch/s390/kernel/asm-offsets.c
+++ b/arch/s390/kernel/asm-offsets.c
@@ -53,11 +53,11 @@ int main(void)
/* stack_frame offsets */
OFFSET(__SF_BACKCHAIN, stack_frame, back_chain);
OFFSET(__SF_GPRS, stack_frame, gprs);
-   OFFSET(__SF_EMPTY, stack_frame, empty1);
-   OFFSET(__SF_SIE_CONTROL, stack_frame, empty1[0]);
-   OFFSET(__SF_SIE_SAVEAREA, stack_frame, empty1[1]);
-   OFFSET(__SF_SIE_REASON, stack_frame, empty1[2]);
-   OFFSET(__SF_SIE_FLAGS, stack_frame, empty1[3]);
+   OFFSET(__SF_EMPTY, stack_frame, empty1[0]);
+   OFFSET(__SF_SIE_CONTROL, stack_frame, empty1[1]);
+   OFFSET(__SF_SIE_SAVEAREA, stack_frame, empty1[2]);
+   OFFSET(__SF_SIE_REASON, stack_frame, empty1[3]);
+   OFFSET(__SF_SIE_FLAGS, stack_frame, empty1[4]);
BLANK();
OFFSET(__VDSO_GETCPU_VAL, vdso_per_cpu_data, getcpu_val);
BLANK();
diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S
index 5346545b9860..26bb0603c5a1 100644
--- a/arch/s390/kernel/entry.S
+++ b/arch/s390/kernel/entry.S
@@ -1068,6 +1068,7 @@ EXPORT_SYMBOL(save_fpu_regs)
  * %r4
  */
 load_fpu_regs:
+   stnsm   __SF_EMPTY(%r15),0xfc
lg  %r4,__LC_CURRENT
aghi%r4,__TASK_thread
TSTMSK  __LC_CPU_FLAGS,_CIF_FPU
@@ -1099,6 +1100,7 @@ load_fpu_regs:
 .Lload_fpu_regs_done:
ni  __LC_CPU_FLAGS+7,255-_CIF_FPU
 .Lload_fpu_regs_exit:
+   ssm __SF_EMPTY(%r15)
BR_EX   %r14
 .Lload_fpu_regs_end:
 ENDPROC(load_fpu_regs)


Re: irq-loongson-pch-pic.c:undefined reference to `of_iomap'

2020-11-17 Thread Heiko Carstens
On Tue, Nov 17, 2020 at 07:34:55PM +0100, Krzysztof Kozlowski wrote:
> > Looking a bit further, I now find that we ended up disabling
> > CONFIG_COMPILE_TEST entirely for arch/um, which is clearly an option
> > that would also work for s390.
> 
> Yes, that was the easier solution than to spread "depends on HAS_IOMEM"
> all over Kconfigs.
> 
> +Cc Greg KH,
> 
> I got similar report around phy drivers:
> https://lore.kernel.org/lkml/202011140335.tcevqhmn-...@intel.com/
> 
> When reproducing this, I saw multiple unmet dependencies on s390 for
> MFD_SYSCON and MFD_STM32_TIMERS.
> 
> I suppose there is no point to fix them all because this will be
> basically UML case, so HAS_IOMEM all over the tree.

FWIW, I just replied a couple of minutes ago, but you might have missed
that:
---
I'll add a patch to the s390 tree which disables CONFIG_COMPILE_TEST
for s390. I wouldn't like to start again chasing/adding missing
'select' or 'depends on' statements in various config files.
---


Re: irq-loongson-pch-pic.c:undefined reference to `of_iomap'

2020-11-17 Thread Heiko Carstens
On Mon, Nov 16, 2020 at 10:21:26AM +0100, Arnd Bergmann wrote:
> > Don't we need the dependencies on HAS_IOMEM for the CONFIG_UML=y
> > case, too?
> 
> I would have expected that as well, but I don't see the problem when building
> an arch/um kernel, all I get is
> 
> ERROR: modpost: "devm_platform_ioremap_resource"
> [drivers/iio/adc/adi-axi-adc.ko] undefined!
> ERROR: modpost: "devm_platform_ioremap_resource"
> [drivers/ptp/ptp_ines.ko] undefined!
> ERROR: modpost: "devm_ioremap_resource"
> [drivers/net/ethernet/xilinx/xilinx_emac.ko] undefined!
> ERROR: modpost: "devm_platform_ioremap_resource_byname"
> [drivers/net/ethernet/xilinx/ll_temac.ko] undefined!
> ERROR: modpost: "devm_ioremap"
> [drivers/net/ethernet/xilinx/ll_temac.ko] undefined!
> ERROR: modpost: "devm_of_iomap"
> [drivers/net/ethernet/xilinx/ll_temac.ko] undefined!
> ERROR: modpost: "__open64_2" [fs/hostfs/hostfs.ko] undefined!
> 
> If I disable those five drivers, I can build and link a uml kernel without
> warnings. I could not find the difference compared to s390 here.
> 
> Looking a bit further, I now find that we ended up disabling
> CONFIG_COMPILE_TEST entirely for arch/um, which is clearly an option
> that would also work for s390.

I'll add a patch to the s390 tree which disables CONFIG_COMPILE_TEST
for s390. I wouldn't like to start again chasing/adding missing
'select' or 'depends on' statements in various config files.


[GIT PULL] s390 updates for 5.10-rc5

2020-11-17 Thread Heiko Carstens
Hi Linus,

please pull some small s390 updates for 5.10-rc5.

Thanks,
Heiko

The following changes since commit f8394f232b1eab649ce2df5c5f15b0e528c92091:

  Linux 5.10-rc3 (2020-11-08 16:10:16 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git tags/s390-5.10-4

for you to fetch changes up to 78d732e1f326f74f240d416af9484928303d9951:

  s390/cpum_sf.c: fix file permission for cpum_sfb_size (2020-11-12 12:10:36 +0100)


- fix system call exit path; avoid return to user space with
  any TIF/CIF/PIF set

- fix file permission for cpum_sfb_size parameter

- another small defconfig update


Heiko Carstens (2):
  s390: fix system call exit path
  s390: update defconfigs

Thomas Richter (1):
  s390/cpum_sf.c: fix file permission for cpum_sfb_size

 arch/s390/configs/debug_defconfig | 1 +
 arch/s390/kernel/entry.S  | 2 ++
 arch/s390/kernel/perf_cpum_sf.c   | 2 +-
 3 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/s390/configs/debug_defconfig b/arch/s390/configs/debug_defconfig
index a4d3c578fbd8..fe6f529ac82c 100644
--- a/arch/s390/configs/debug_defconfig
+++ b/arch/s390/configs/debug_defconfig
@@ -1,3 +1,4 @@
+CONFIG_UAPI_HEADER_TEST=y
 CONFIG_SYSVIPC=y
 CONFIG_POSIX_MQUEUE=y
 CONFIG_WATCH_QUEUE=y
diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S
index 86235919c2d1..5346545b9860 100644
--- a/arch/s390/kernel/entry.S
+++ b/arch/s390/kernel/entry.S
@@ -422,6 +422,7 @@ ENTRY(system_call)
 #endif
LOCKDEP_SYS_EXIT
 .Lsysc_tif:
+   DISABLE_INTS
TSTMSK  __PT_FLAGS(%r11),_PIF_WORK
jnz .Lsysc_work
TSTMSK  __TI_flags(%r12),_TIF_WORK
@@ -444,6 +445,7 @@ ENTRY(system_call)
 # One of the work bits is on. Find out which one.
 #
 .Lsysc_work:
+   ENABLE_INTS
TSTMSK  __TI_flags(%r12),_TIF_NEED_RESCHED
jo  .Lsysc_reschedule
TSTMSK  __PT_FLAGS(%r11),_PIF_SYSCALL_RESTART
diff --git a/arch/s390/kernel/perf_cpum_sf.c b/arch/s390/kernel/perf_cpum_sf.c
index 4f9e4626df55..f100c9209743 100644
--- a/arch/s390/kernel/perf_cpum_sf.c
+++ b/arch/s390/kernel/perf_cpum_sf.c
@@ -2228,4 +2228,4 @@ static int __init init_cpum_sampling_pmu(void)
 }
 
 arch_initcall(init_cpum_sampling_pmu);
-core_param(cpum_sfb_size, CPUM_SF_MAX_SDB, sfb_size, 0640);
+core_param(cpum_sfb_size, CPUM_SF_MAX_SDB, sfb_size, 0644);


Re: [PATCH seccomp 5/8] s390: Enable seccomp architecture tracking

2020-11-09 Thread Heiko Carstens
On Tue, Nov 03, 2020 at 07:43:01AM -0600, YiFei Zhu wrote:
> From: YiFei Zhu 
> 
> To enable seccomp constant action bitmaps, we need to have a static
> mapping to the audit architecture and system call table size. Add these
> for s390.
> 
> Signed-off-by: YiFei Zhu 
> ---
>  arch/s390/include/asm/seccomp.h | 9 +
>  1 file changed, 9 insertions(+)
> 
> diff --git a/arch/s390/include/asm/seccomp.h b/arch/s390/include/asm/seccomp.h
> index 795bbe0d7ca6..71d46f0ba97b 100644
> --- a/arch/s390/include/asm/seccomp.h
> +++ b/arch/s390/include/asm/seccomp.h
> @@ -16,4 +16,13 @@
>  
>  #include 
>  
> +#define SECCOMP_ARCH_NATIVE  AUDIT_ARCH_S390X
> +#define SECCOMP_ARCH_NATIVE_NR   NR_syscalls
> +#define SECCOMP_ARCH_NATIVE_NAME "s390x"
> +#ifdef CONFIG_COMPAT
> +# define SECCOMP_ARCH_COMPAT AUDIT_ARCH_S390
> +# define SECCOMP_ARCH_COMPAT_NR  NR_syscalls
> +# define SECCOMP_ARCH_COMPAT_NAME"s390"
> +#endif
> +

Acked-by: Heiko Carstens 


[GIT PULL] s390 updates for 5.10-rc3

2020-11-05 Thread Heiko Carstens
Hi Linus,

please pull some small s390 updates for 5.10-rc3.

Thanks,
Heiko

The following changes since commit 3cea11cd5e3b00d91caf0b4730194039b45c5891:

  Linux 5.10-rc2 (2020-11-01 14:43:51 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git tags/s390-5.10-3

for you to fetch changes up to 0b2ca2c7d0c9e2731d01b6c862375d44a7e13923:

  s390/pci: fix hot-plug of PCI function missing bus (2020-11-03 15:12:16 +0100)


- fix reference counting for ap devices

- fix paes selftest

- fix pmd_deref()/pud_deref() so they can also handle large pages

- remove unused vdso file and defines

- update defconfigs

- call rcu_cpu_starting() early in smp init code to avoid lockdep warnings

- fix hotplug of PCI function missing bus


Gerald Schaefer (1):
  s390/mm: make pmd/pud_deref() large page aware

Harald Freudenberger (2):
  s390/ap: fix ap devices reference counting
  s390/pkey: fix paes selftest failure with paes and pkey static build

Heiko Carstens (3):
  s390/vdso: remove empty unused file
  s390/vdso: remove unused constants
  s390: update defconfigs

Niklas Schnelle (1):
  s390/pci: fix hot-plug of PCI function missing bus

Qian Cai (1):
  s390/smp: move rcu_cpu_starting() earlier

 arch/s390/configs/debug_defconfig| 10 ---
 arch/s390/configs/defconfig  |  9 ---
 arch/s390/configs/zfcpdump_defconfig |  2 +-
 arch/s390/include/asm/pgtable.h  | 52 +---
 arch/s390/include/asm/vdso/vdso.h|  0
 arch/s390/kernel/asm-offsets.c   |  8 --
 arch/s390/kernel/smp.c   |  3 ++-
 arch/s390/pci/pci_event.c|  4 +++
 drivers/s390/crypto/ap_bus.c | 14 --
 drivers/s390/crypto/pkey_api.c   | 30 +++--
 drivers/s390/crypto/zcrypt_card.c| 13 +
 drivers/s390/crypto/zcrypt_queue.c   |  6 +
 12 files changed, 85 insertions(+), 66 deletions(-)
 delete mode 100644 arch/s390/include/asm/vdso/vdso.h

diff --git a/arch/s390/configs/debug_defconfig b/arch/s390/configs/debug_defconfig
index 0784bf3caf43..a4d3c578fbd8 100644
--- a/arch/s390/configs/debug_defconfig
+++ b/arch/s390/configs/debug_defconfig
@@ -93,9 +93,10 @@ CONFIG_CLEANCACHE=y
 CONFIG_FRONTSWAP=y
 CONFIG_CMA_DEBUG=y
 CONFIG_CMA_DEBUGFS=y
+CONFIG_CMA_AREAS=7
 CONFIG_MEM_SOFT_DIRTY=y
 CONFIG_ZSWAP=y
-CONFIG_ZSMALLOC=m
+CONFIG_ZSMALLOC=y
 CONFIG_ZSMALLOC_STAT=y
 CONFIG_DEFERRED_STRUCT_PAGE_INIT=y
 CONFIG_IDLE_PAGE_TRACKING=y
@@ -378,7 +379,6 @@ CONFIG_NETLINK_DIAG=m
 CONFIG_CGROUP_NET_PRIO=y
 CONFIG_BPF_JIT=y
 CONFIG_NET_PKTGEN=m
-# CONFIG_NET_DROP_MONITOR is not set
 CONFIG_PCI=y
 # CONFIG_PCIEASPM is not set
 CONFIG_PCI_DEBUG=y
@@ -386,7 +386,7 @@ CONFIG_HOTPLUG_PCI=y
 CONFIG_HOTPLUG_PCI_S390=y
 CONFIG_DEVTMPFS=y
 CONFIG_CONNECTOR=y
-CONFIG_ZRAM=m
+CONFIG_ZRAM=y
 CONFIG_BLK_DEV_LOOP=m
 CONFIG_BLK_DEV_CRYPTOLOOP=m
 CONFIG_BLK_DEV_DRBD=m
@@ -689,6 +689,7 @@ CONFIG_CRYPTO_TEST=m
 CONFIG_CRYPTO_DH=m
 CONFIG_CRYPTO_ECDH=m
 CONFIG_CRYPTO_ECRDSA=m
+CONFIG_CRYPTO_SM2=m
 CONFIG_CRYPTO_CURVE25519=m
 CONFIG_CRYPTO_GCM=y
 CONFIG_CRYPTO_CHACHA20POLY1305=m
@@ -709,7 +710,6 @@ CONFIG_CRYPTO_RMD160=m
 CONFIG_CRYPTO_RMD256=m
 CONFIG_CRYPTO_RMD320=m
 CONFIG_CRYPTO_SHA3=m
-CONFIG_CRYPTO_SM3=m
 CONFIG_CRYPTO_TGR192=m
 CONFIG_CRYPTO_WP512=m
 CONFIG_CRYPTO_AES_TI=m
@@ -753,6 +753,7 @@ CONFIG_CRYPTO_DES_S390=m
 CONFIG_CRYPTO_AES_S390=m
 CONFIG_CRYPTO_GHASH_S390=m
 CONFIG_CRYPTO_CRC32_S390=y
+CONFIG_CRYPTO_DEV_VIRTIO=m
 CONFIG_CORDIC=m
 CONFIG_CRC32_SELFTEST=y
 CONFIG_CRC4=m
@@ -829,6 +830,7 @@ CONFIG_NETDEV_NOTIFIER_ERROR_INJECT=m
 CONFIG_FAULT_INJECTION=y
 CONFIG_FAILSLAB=y
 CONFIG_FAIL_PAGE_ALLOC=y
+CONFIG_FAULT_INJECTION_USERCOPY=y
 CONFIG_FAIL_MAKE_REQUEST=y
 CONFIG_FAIL_IO_TIMEOUT=y
 CONFIG_FAIL_FUTEX=y
diff --git a/arch/s390/configs/defconfig b/arch/s390/configs/defconfig
index 905bc8c4cfaf..17d5df2c1eff 100644
--- a/arch/s390/configs/defconfig
+++ b/arch/s390/configs/defconfig
@@ -87,9 +87,10 @@ CONFIG_KSM=y
 CONFIG_TRANSPARENT_HUGEPAGE=y
 CONFIG_CLEANCACHE=y
 CONFIG_FRONTSWAP=y
+CONFIG_CMA_AREAS=7
 CONFIG_MEM_SOFT_DIRTY=y
 CONFIG_ZSWAP=y
-CONFIG_ZSMALLOC=m
+CONFIG_ZSMALLOC=y
 CONFIG_ZSMALLOC_STAT=y
 CONFIG_DEFERRED_STRUCT_PAGE_INIT=y
 CONFIG_IDLE_PAGE_TRACKING=y
@@ -371,7 +372,6 @@ CONFIG_NETLINK_DIAG=m
 CONFIG_CGROUP_NET_PRIO=y
 CONFIG_BPF_JIT=y
 CONFIG_NET_PKTGEN=m
-# CONFIG_NET_DROP_MONITOR is not set
 CONFIG_PCI=y
 # CONFIG_PCIEASPM is not set
 CONFIG_HOTPLUG_PCI=y
@@ -379,7 +379,7 @@ CONFIG_HOTPLUG_PCI_S390=y
 CONFIG_UEVENT_HELPER=y
 CONFIG_DEVTMPFS=y
 CONFIG_CONNECTOR=y
-CONFIG_ZRAM=m
+CONFIG_ZRAM=y
 CONFIG_BLK_DEV_LOOP=m
 CONFIG_BLK_DEV_CRYPTOLOOP=m
 CONFIG_BLK_DEV_DRBD=m
@@ -680,6 +680,7 @@ CONFIG_CRYPTO_TEST=m
 CONFIG_CRYPTO_DH=m
 CONFIG_CRYPTO_ECDH=m
 CONFIG_CRYPTO_ECRDSA=m
+CONFIG_CRYPTO_SM2=m
 CONFIG_CRYPTO_CURVE25519=m

Re: [PATCH] s390: add support for TIF_NOTIFY_SIGNAL

2020-11-02 Thread Heiko Carstens
On Mon, Nov 02, 2020 at 11:59:41AM -0500, Qian Cai wrote:
> On Sun, 2020-11-01 at 17:31 +0000, Heiko Carstens wrote:
> > On Thu, Oct 29, 2020 at 10:21:11AM -0600, Jens Axboe wrote:
> > > Wire up TIF_NOTIFY_SIGNAL handling for s390.
> > > 
> > > Cc: linux-s...@vger.kernel.org
> > > Signed-off-by: Jens Axboe 
> 
> Even though I did confirm that today's linux-next contains this additional 
> patch
> from Heiko below, a z10 guest is still unable to boot. Reverting the whole
> series (reverting only "s390: add support for TIF_NOTIFY_SIGNAL" introduced
> compiling errors) fixed the problem, i.e., git revert --no-edit
> af0dd809f3d3..7b074c15374c [1]
> 
> .config: 
> https://cailca.coding.net/public/linux/mm/git/files/master/s390.config

I'll take a look at it, but probably not today anymore.


Re: [PATCH] s390/smp: Move rcu_cpu_starting() earlier

2020-11-01 Thread Heiko Carstens
On Sat, Oct 31, 2020 at 07:38:52PM -0400, Qian Cai wrote:
> > > This is avoided by moving the call to rcu_cpu_starting up near the
> > > beginning of the smp_init_secondary() function. Note that the
> > > raw_smp_processor_id() is required in order to avoid calling into
> > > lockdep before RCU has declared the CPU to be watched for readers.
> > > 
> > > Link: 
> > > https://lore.kernel.org/lkml/160223032121.7002.1269740091547117869.tip-bot2@tip-bot2/
> > > Signed-off-by: Qian Cai 
> > > ---
> > >  arch/s390/kernel/smp.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > Could you provide the config you used? I'm wondering why I can't
> > reproduce this even though I have lots of debug options enabled.
> https://cailca.coding.net/public/linux/mm/git/files/master/s390.config
> 
> Essentially, I believe it requires CONFIG_PROVE_RCU_LIST=y. Also, it occurs to
> me that this only starts to happen after the commit mentioned in the above 
> link.

Yes, with that enabled I can reproduce it. Thanks! It depends on
CONFIG_RCU_EXPERT. I can't imagine why I didn't have that enabled.. :)


Re: [PATCH] s390/smp: Move rcu_cpu_starting() earlier

2020-10-31 Thread Heiko Carstens
On Wed, Oct 28, 2020 at 02:27:42PM -0400, Qian Cai wrote:
> The call to rcu_cpu_starting() in smp_init_secondary() is not early
> enough in the CPU-hotplug onlining process, which results in lockdep
> splats as follows:
> 
>  WARNING: suspicious RCU usage
>  -
>  kernel/locking/lockdep.c:3497 RCU-list traversed in non-reader section!!
> 
>  other info that might help us debug this:
> 
>  RCU used illegally from offline CPU!
>  rcu_scheduler_active = 1, debug_locks = 1
>  no locks held by swapper/1/0.
> 
>  Call Trace:
>  show_stack+0x158/0x1f0
>  dump_stack+0x1f2/0x238
>  __lock_acquire+0x2640/0x4dd0
>  lock_acquire+0x3a8/0xd08
>  _raw_spin_lock_irqsave+0xc0/0xf0
>  clockevents_register_device+0xa8/0x528
>  init_cpu_timer+0x33e/0x468
>  smp_init_secondary+0x11a/0x328
>  smp_start_secondary+0x82/0x88
> 
> This is avoided by moving the call to rcu_cpu_starting up near the
> beginning of the smp_init_secondary() function. Note that the
> raw_smp_processor_id() is required in order to avoid calling into
> lockdep before RCU has declared the CPU to be watched for readers.
> 
> Link: 
> https://lore.kernel.org/lkml/160223032121.7002.1269740091547117869.tip-bot2@tip-bot2/
> Signed-off-by: Qian Cai 
> ---
>  arch/s390/kernel/smp.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)

Could you provide the config you used? I'm wondering why I can't
reproduce this even though I have lots of debug options enabled.

I will apply it anyway after rc2 has been released, just curious.


[GIT PULL] s390 compile fix for 5.10-rc2

2020-10-26 Thread Heiko Carstens
Hi Linus,

please pull a simple fix, so that s390 compiles again after Joe Perches' commit
33def8498fdd ("treewide: Convert macro and uses of __section(foo) to 
__section("foo")")
which went in just before 5.10-rc1.

Thanks,
Heiko

The following changes since commit 3650b228f83adda7e5ee532e2b90429c03f7b9ec:

  Linux 5.10-rc1 (2020-10-25 15:14:11 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git tags/s390-5.10-2

for you to fetch changes up to 8e90b4b1305a80b1d7712370a163eff269ac1ba2:

  s390: correct __bootdata / __bootdata_preserved macros (2020-10-26 14:18:01 
+0100)


- Fix s390 compile breakage caused by commit 33def8498fdd
  ("treewide: Convert macro and uses of __section(foo) to __section("foo")")


Vasily Gorbik (1):
  s390: correct __bootdata / __bootdata_preserved macros

 arch/s390/include/asm/sections.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/s390/include/asm/sections.h b/arch/s390/include/asm/sections.h
index a996d3990a02..0c2151451ba5 100644
--- a/arch/s390/include/asm/sections.h
+++ b/arch/s390/include/asm/sections.h
@@ -26,14 +26,14 @@ static inline int arch_is_kernel_initmem_freed(unsigned 
long addr)
  * final .boot.data section, which should be identical in the decompressor and
  * the decompressed kernel (that is checked during the build).
  */
-#define __bootdata(var) __section(".boot.data.var") var
+#define __bootdata(var) __section(".boot.data." #var) var
 
 /*
  * .boot.preserved.data is similar to .boot.data, but it is not part of the
  * .init section and thus will be preserved for later use in the decompressed
  * kernel.
  */
-#define __bootdata_preserved(var) __section(".boot.preserved.data.var") var
+#define __bootdata_preserved(var) __section(".boot.preserved.data." #var) var
 
 extern unsigned long __sdma, __edma;
 extern unsigned long __stext_dma, __etext_dma;
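
As an illustration of why this fix is needed, here is a minimal userspace
sketch (not the kernel header itself). The preprocessor never substitutes a
macro parameter inside a string literal, so the old form yielded the same
section name for every variable. The macro names SECTION_NAME_BROKEN and
SECTION_NAME_FIXED and the two helper functions are made up for this example,
and early_ipl_comm is only a sample identifier.

```c
#include <assert.h>
#include <string.h>

/* Broken form: "var" inside the string literal is plain text, not the parameter. */
#define SECTION_NAME_BROKEN(var) ".boot.data.var"

/* Fixed form: #var stringifies the argument, and adjacent literals are concatenated. */
#define SECTION_NAME_FIXED(var) ".boot.data." #var

/* Hypothetical helpers that expose the expanded names for inspection. */
const char *broken_name(void)
{
	return SECTION_NAME_BROKEN(early_ipl_comm);
}

const char *fixed_name(void)
{
	return SECTION_NAME_FIXED(early_ipl_comm);
}
```

With the fixed form every variable gets its own section suffix, which is
presumably what the per-variable .boot.data.* layout relies on.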


Re: BUG: Bad page state in process dirtyc0w_child

2020-09-16 Thread Heiko Carstens
On Sat, Sep 12, 2020 at 09:54:12PM -0400, Qian Cai wrote:
> Occasionally, running this LTP test will trigger an error below on
> s390:
> https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/security/dirtyc0w/dirtyc0w.c
> https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/security/dirtyc0w/dirtyc0w_child.c
> 
> this .config:
> https://gitlab.com/cailca/linux-mm/-/blob/master/s390.config
> 
> [ 6970.253173] LTP: starting dirtyc0w
> [ 6971.599102] BUG: Bad page state in process dirtyc0w_child  pfn:8865d
> [ 6971.599867] page:1a8328d7 refcount:0 mapcount:0 
> mapping: index:0x0 pfn:0x8865d
> [ 6971.599876] flags: 0x4008000e(referenced|uptodate|dirty|swapbacked)
> [ 6971.599886] raw: 4008000e 0100 0122 
> 
> [ 6971.599893] raw:    
> 
> [ 6971.599900] page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
> [ 6971.599906] Modules linked in: loop kvm ip_tables x_tables dm_mirror 
> dm_region_hash dm_log dm_mod [last unloaded: dummy_del_mod]
> [ 6971.599952] CPU: 1 PID: 65238 Comm: dirtyc0w_child Tainted: G   O  
> 5.9.0-rc4-next-20200909 #1
> [ 6971.599959] Hardware name: IBM 2964 N96 400 (z/VM 6.4.0)
> [ 6971.599964] Call Trace:
> [ 6971.599979]  [<73aec038>] show_stack+0x158/0x1f0 
> [ 6971.599986]  [<73af724a>] dump_stack+0x1f2/0x238 
> [ 6971.54]  [<72ed086a>] bad_page+0x1ba/0x1c0 
> [ 6971.60]  [<72ed20c4>] free_pcp_prepare+0x4fc/0x658 
> [ 6971.66]  [<72ed96a6>] free_unref_page+0xae/0x158 
> [ 6971.600013]  [<72e8286a>] unmap_page_range+0xb62/0x1df8 
> [ 6971.600019]  [<72e83bbc>] unmap_single_vma+0xbc/0x1c8 
> [ 6971.600025]  [<72e8418e>] zap_page_range+0x176/0x230 
> [ 6971.600033]  [<72eece8e>] do_madvise+0xfde/0x1270 
> [ 6971.600039]  [<72eed50a>] __s390x_sys_madvise+0x72/0x98 
> [ 6971.600047]  [<73b1cce4>] system_call+0xdc/0x278 
> [ 6971.600053] 2 locks held by dirtyc0w_child/65238:
> [ 6971.600058]  #0: 00013442fa18 (>mmap_lock){}-{3:3}, at: 
> do_madvise+0x17a/0x1270
> [ 6971.600432]  #1: 0001343f9060 (ptlock_ptr(page)#2){+.+.}-{2:2}, at: 
> unmap_page_range+0x640/0x1df8
> [ 6971.600487] Disabling lock debugging due to kernel taint
> 
> Once it happens, running it again will trigger it on another PFN.
> 
> [39717.085115] BUG: Bad page state in process dirtyc0w_child  pfn:af065 
> 
> Any thoughts?

Alexander, Gerald, could you take a look?


Re: [PATCH -next] s390/diag: convert to use DEFINE_SEQ_ATTRIBUTE macro

2020-09-16 Thread Heiko Carstens
On Wed, Sep 16, 2020 at 10:50:29AM +0800, Liu Shixin wrote:
> Use DEFINE_SEQ_ATTRIBUTE macro to simplify the code.
> 
> Signed-off-by: Liu Shixin 
> ---
>  arch/s390/kernel/diag.c | 13 +
>  1 file changed, 1 insertion(+), 12 deletions(-)

Applied, thanks.


Re: [PATCH -next] s390/ap: remove unnecessary spin_lock_init()

2020-09-16 Thread Heiko Carstens
On Wed, Sep 16, 2020 at 02:21:30PM +0800, Qinglang Miao wrote:
> The spinlock ap_poll_timer_lock is initialized statically. It is
> unnecessary to initialize by spin_lock_init().
> 
> Signed-off-by: Qinglang Miao 
> ---
>  drivers/s390/crypto/ap_bus.c | 1 -
>  1 file changed, 1 deletion(-)

Applied, thanks.


Re: [PATCH] s390/zcrypt: remove set_fs() invocation in zcrypt device driver

2020-09-14 Thread Heiko Carstens
On Mon, Sep 14, 2020 at 09:36:07AM +0200, Harald Freudenberger wrote:
> Otherwise how do we provide this fix then? My recommendation would
> be to go the 'usual' way: Commit this s390 internal and then let
> this go out with the next kernel merge window when next time Linus
> is pulling patches from the s390 subsystem for the 5.10 kernel
> development cycle.

I will create a "set_fs" topic branch on kernel.org based on
vfs.git/base.set_fs and add your patch there and also the rest of
s390 set_fs related patches on top of that as soon as things are
ready.


Re: [PATCH] s390/idle: Fix suspicious RCU usage

2020-09-08 Thread Heiko Carstens
On Tue, Sep 08, 2020 at 03:30:31PM +0200, pet...@infradead.org wrote:
> 
> After commit eb1f00237aca ("lockdep,trace: Expose tracepoints") the
> lock tracepoints are visible to lockdep and RCU-lockdep is finding a
> bunch more RCU violations that were previously hidden.
> 
> Switch the idle->seqcount over to using raw_write_*() to avoid the
> lockdep annotation and thus the lock tracepoints.
> 
> Reported-by: Guenter Roeck 
> Signed-off-by: Peter Zijlstra (Intel) 
> ---
>  arch/s390/kernel/idle.c |5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)

Applied, thank you!


[GIT PULL] s390 updates for 5.9-rc1

2020-08-13 Thread Heiko Carstens
Hi Linus,

please pull some small s390 updates for 5.9-rc1.

Thanks,
Heiko

The following changes since commit 00e4db51259a5f936fec1424b884f029479d3981:

  Merge tag 'perf-tools-2020-08-10' of 
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux (2020-08-10 19:21:38 
-0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git tags/s390-5.9-2

for you to fetch changes up to b450eeb0c973ed4125ea91e35613f029337fd28b:

  s390/numa: move code to arch/s390/kernel (2020-08-11 18:16:55 +0200)


- Allow s390 debug feature to handle finally more than 256 CPU numbers, instead
  of truncating the most significant bits.

- Improve THP splitting required by qemu processes by making use of
  walk_page_vma() instead of calling follow_page() for every single page
  within each vma.

- Add missing ZCRYPT dependency to VFIO_AP to fix potential compile problems.

- Remove not required select CLOCKSOURCE_VALIDATE_LAST_CYCLE again.

- Set node distance to LOCAL_DISTANCE instead of 0, since e.g. libnuma
  translates a node distance of 0 to "no NUMA support available".

- Couple of other minor fixes and improvements.


Alexander Gordeev (2):
  s390/numa: set node distance to LOCAL_DISTANCE
  s390/numa: move code to arch/s390/kernel

Gerald Schaefer (1):
  s390/gmap: improve THP splitting

Heiko Carstens (1):
  s390/time: remove select CLOCKSOURCE_VALIDATE_LAST_CYCLE again

Krzysztof Kozlowski (1):
  s390/Kconfig: add missing ZCRYPT dependency to VFIO_AP

Mikhail Zaslonko (1):
  s390/debug: debug feature version 3

Tianjia Zhang (1):
  s390/pkey: remove redundant variable initialization

Vasily Gorbik (1):
  s390/atomic: circumvent gcc 10 build regression

Wang Hai (1):
  s390/test_unwind: fix possible memleak in test_unwind()

 arch/s390/Kbuild  |  1 -
 arch/s390/Kconfig |  2 +-
 arch/s390/include/asm/atomic.h| 12 ++--
 arch/s390/include/asm/debug.h | 17 ++---
 arch/s390/include/asm/topology.h  |  6 --
 arch/s390/kernel/Makefile |  1 +
 arch/s390/kernel/debug.c  | 32 ++--
 arch/s390/{numa => kernel}/numa.c |  0
 arch/s390/lib/test_unwind.c   |  1 +
 arch/s390/mm/gmap.c   | 27 ---
 arch/s390/numa/Makefile   |  2 --
 drivers/s390/crypto/pkey_api.c|  4 ++--
 12 files changed, 59 insertions(+), 46 deletions(-)
 rename arch/s390/{numa => kernel}/numa.c (100%)
 delete mode 100644 arch/s390/numa/Makefile

diff --git a/arch/s390/Kbuild b/arch/s390/Kbuild
index e63940bb57cd..8b98c501142d 100644
--- a/arch/s390/Kbuild
+++ b/arch/s390/Kbuild
@@ -7,5 +7,4 @@ obj-$(CONFIG_S390_HYPFS_FS) += hypfs/
 obj-$(CONFIG_APPLDATA_BASE)+= appldata/
 obj-y  += net/
 obj-$(CONFIG_PCI)  += pci/
-obj-$(CONFIG_NUMA) += numa/
 obj-$(CONFIG_ARCH_HAS_KEXEC_PURGATORY) += purgatory/
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 8c0b52940165..3d86e12e8e3c 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -126,7 +126,6 @@ config S390
select HAVE_ARCH_JUMP_LABEL_RELATIVE
select HAVE_ARCH_KASAN
select HAVE_ARCH_KASAN_VMALLOC
-   select CLOCKSOURCE_VALIDATE_LAST_CYCLE
select CPU_NO_EFFICIENT_FFS if !HAVE_MARCH_Z9_109_FEATURES
select HAVE_ARCH_SECCOMP_FILTER
select HAVE_ARCH_SOFT_DIRTY
@@ -766,6 +765,7 @@ config VFIO_AP
def_tristate n
prompt "VFIO support for AP devices"
depends on S390_AP_IOMMU && VFIO_MDEV_DEVICE && KVM
+   depends on ZCRYPT
help
This driver grants access to Adjunct Processor (AP) devices
via the VFIO mediated device interface.
diff --git a/arch/s390/include/asm/atomic.h b/arch/s390/include/asm/atomic.h
index cae473a7b6f7..11c5952e1afa 100644
--- a/arch/s390/include/asm/atomic.h
+++ b/arch/s390/include/asm/atomic.h
@@ -45,7 +45,11 @@ static inline int atomic_fetch_add(int i, atomic_t *v)
 static inline void atomic_add(int i, atomic_t *v)
 {
 #ifdef CONFIG_HAVE_MARCH_Z196_FEATURES
-   if (__builtin_constant_p(i) && (i > -129) && (i < 128)) {
+   /*
+* Order of conditions is important to circumvent gcc 10 bug:
+* https://gcc.gnu.org/pipermail/gcc-patches/2020-July/549318.html
+*/
+   if ((i > -129) && (i < 128) && __builtin_constant_p(i)) {
__atomic_add_const(i, &v->counter);
return;
}
@@ -112,7 +116,11 @@ static inline s64 atomic64_fetch_add(s64 i, atomic64_t *v)
 static inline void atomic64_add(s64 i, atomic64_t *v)
 {
 #ifdef CONFIG_HAVE_MARCH_Z196_FEATURES
-   if (__builtin_constant_p(i) && (i > -

Re: [PATCH] s390/Kconfig: add missing ZCRYPT dependency to VFIO_AP

2020-08-06 Thread Heiko Carstens
On Wed, Aug 05, 2020 at 05:50:53PM +0200, Krzysztof Kozlowski wrote:
> The VFIO_AP uses ap_driver_register() (and deregister) functions
> implemented in ap_bus.c (compiled into ap.o).  However the ap.o will be
> built only if CONFIG_ZCRYPT is selected.
> 
> This was not visible before commit e93a1695d7fb ("iommu: Enable compile
> testing for some of drivers") because the CONFIG_VFIO_AP depends on
> CONFIG_S390_AP_IOMMU which depends on the missing CONFIG_ZCRYPT.  After
> adding COMPILE_TEST, it is possible to select a configuration with
> VFIO_AP and S390_AP_IOMMU but without the ZCRYPT.
> 
> Add proper dependency to the VFIO_AP to fix build errors:
> 
> ERROR: modpost: "ap_driver_register" [drivers/s390/crypto/vfio_ap.ko] 
> undefined!
> ERROR: modpost: "ap_driver_unregister" [drivers/s390/crypto/vfio_ap.ko] 
> undefined!
> 
> Reported-by: kernel test robot 
> Fixes: e93a1695d7fb ("iommu: Enable compile testing for some of drivers")
> Signed-off-by: Krzysztof Kozlowski 
> ---
>  arch/s390/Kconfig | 1 +
>  1 file changed, 1 insertion(+)

Applied, thanks.


Re: linux-next: build failure after merge of the net-next tree

2020-08-05 Thread Heiko Carstens
On Wed, Aug 05, 2020 at 03:06:27PM +0200, Stefano Brivio wrote:
> On Wed, 5 Aug 2020 22:31:21 +1000
> Stephen Rothwell  wrote:
> 
> > Hi all,
> > 
> > After merging the net-next tree, today's linux-next build (s390 defconfig)
> > failed like this:
> > 
> > net/ipv4/ip_tunnel_core.c:335:2: error: implicit declaration of function 
> > 'csum_ipv6_magic' [-Werror=implicit-function-declaration]
> > 
> > Caused by commit
> > 
> >   4cb47a8644cc ("tunnels: PMTU discovery support for directly bridged IP 
> > packets")
> 
> Ouch, sorry for that.
> 
> I'm getting a few of them by the way:
> 
> ---
> net/core/skbuff.o: In function `skb_checksum_setup_ipv6':
> /home/sbrivio/net-next/net/core/skbuff.c:4980: undefined reference to 
> `csum_ipv6_magic'
> net/core/netpoll.o: In function `netpoll_send_udp':
> /home/sbrivio/net-next/net/core/netpoll.c:419: undefined reference to 
> `csum_ipv6_magic'
> net/netfilter/utils.o: In function `nf_ip6_checksum':
> /home/sbrivio/net-next/net/netfilter/utils.c:74: undefined reference to 
> `csum_ipv6_magic'
> /home/sbrivio/net-next/net/netfilter/utils.c:84: undefined reference to 
> `csum_ipv6_magic'
> net/netfilter/utils.o: In function `nf_ip6_checksum_partial':
> /home/sbrivio/net-next/net/netfilter/utils.c:112: undefined reference to 
> `csum_ipv6_magic'
> net/ipv4/ip_tunnel_core.o:/home/sbrivio/net-next/net/ipv4/ip_tunnel_core.c:335:
>  more undefined references to `csum_ipv6_magic' follow
> ---
> 
> ...checking how it should be fixed now.
> 
> Heiko, by the way, do we want to provide a s390 version similar to the
> existing csum_partial() implementation in
> arch/s390/include/asm/checksum.h right away? Otherwise, I'll just take
> care of the ifdeffery.

You probably only need to include include/net/ip6_checksum.h which
contains the default implementation.

And yes, I put it on my todo list that we need to provide an s390
variant as well.


Re: [PATCH 2/2] s390: convert to GENERIC_VDSO

2020-08-03 Thread Heiko Carstens
On Mon, Aug 03, 2020 at 09:27:36PM +0200, Thomas Gleixner wrote:
> Heiko Carstens  writes:
> 
> > On Mon, Aug 03, 2020 at 06:05:24PM +0200, Thomas Gleixner wrote:
> >> +/**
> >> + * vdso_update_begin - Start of a VDSO update section
> >> + *
> >> + * Allows architecture code to safely update the architecture specific 
> >> VDSO
> >> + * data.
> >> + */
> >> +void vdso_update_begin(void)
> >> +{
> >> +  struct vdso_data *vdata = __arch_get_k_vdso_data();
> >> +
> >> +  raw_spin_lock(&timekeeper_lock);
> >> +  vdso_write_begin(vdata);
> >> +}
> >
> > I would assume that this only works if vdso_update_begin() is called
> > with irqs disabled, otherwise it could deadlock, no?
> 
> Yes.
> 
> > Maybe something like:
> >
> > void vdso_update_begin(unsigned long *flags)
> > {
> > struct vdso_data *vdata = __arch_get_k_vdso_data();
> >
> > raw_spin_lock_irqsave(&timekeeper_lock, *flags);
> > vdso_write_begin(vdata);
> 
> Shudder. Why not returning flags?

That was what I had initially but then looked at lock_timer_base(),
and tried to be consistent. Ok, bad example, since lock_timer_base()
cannot return flags.

> Thought about that briefly, but then hated the flags thing and delegated
> it to the caller. Lockdep will yell if that lock is taken with
> interrupts enabled :)
> 
> But aside of the pointer vs. value thing, I'm fine with doing it in the
> functions.

FWIW, my preference would also be to use values instead of pointers.
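
To make the "values instead of pointers" variant concrete, here is a small
userspace sketch. It is not the kernel implementation: the interrupt state and
the lock are mocked with plain integers, and all names (mock_lock_irqsave(),
mock_unlock_irqrestore(), update_begin(), update_end(), demo_update()) are
hypothetical. It only shows the shape of an API where the begin function
returns the saved flags and the end function takes them by value.

```c
#include <assert.h>

/* Stand-ins for the CPU interrupt state and the lock; purely illustrative. */
static int irqs_enabled = 1;
static int lock_held;

/* Mock of an irqsave-style lock: remember the old state, then disable "irqs". */
static unsigned long mock_lock_irqsave(void)
{
	unsigned long flags = (unsigned long)irqs_enabled;

	irqs_enabled = 0;
	lock_held = 1;
	return flags;
}

/* Mock of the matching unlock: drop the lock, then restore the saved state. */
static void mock_unlock_irqrestore(unsigned long flags)
{
	lock_held = 0;
	irqs_enabled = (int)flags;
}

/* The saved flags are returned by value instead of written through a pointer. */
unsigned long update_begin(void)
{
	return mock_lock_irqsave();
}

void update_end(unsigned long flags)
{
	mock_unlock_irqrestore(flags);
}

/* Returns 1 if the update ran with "irqs" off and the state was restored after. */
int demo_update(void)
{
	unsigned long flags = update_begin();
	int inside_ok = (irqs_enabled == 0 && lock_held == 1);

	update_end(flags);
	return inside_ok && irqs_enabled == 1 && lock_held == 0;
}
```

The caller keeps the flags in a local variable, so there is no pointer to
forget to initialize or to pass to the wrong function.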


Re: [PATCH 2/2] s390: convert to GENERIC_VDSO

2020-08-03 Thread Heiko Carstens
On Mon, Aug 03, 2020 at 06:05:24PM +0200, Thomas Gleixner wrote:
> +/**
> + * vdso_update_begin - Start of a VDSO update section
> + *
> + * Allows architecture code to safely update the architecture specific VDSO
> + * data.
> + */
> +void vdso_update_begin(void)
> +{
> + struct vdso_data *vdata = __arch_get_k_vdso_data();
> +
> + raw_spin_lock(&timekeeper_lock);
> + vdso_write_begin(vdata);
> +}

I would assume that this only works if vdso_update_begin() is called
with irqs disabled, otherwise it could deadlock, no?

Maybe something like:

void vdso_update_begin(unsigned long *flags)
{
struct vdso_data *vdata = __arch_get_k_vdso_data();

raw_spin_lock_irqsave(&timekeeper_lock, *flags);
vdso_write_begin(vdata);
}

void vdso_update_end(unsigned long *flags)
{
struct vdso_data *vdata = __arch_get_k_vdso_data();

vdso_write_end(vdata);
__arch_sync_vdso_data(vdata);
raw_spin_unlock_irqrestore(&timekeeper_lock, *flags);
}

? Just wondering.


[GIT PULL] s390 updates for 5.9 merge window

2020-08-03 Thread Heiko Carstens
Hi Linus,

please pull s390 updates for the 5.9 merge window.

Thanks,
Heiko

The following changes since commit 9ebcfadb0610322ac537dd7aa5d9cbc2b2894c68:

  Linux 5.8-rc3 (2020-06-28 15:00:24 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git tags/s390-5.9-1

for you to fetch changes up to 9a996c67a65d937b23408e56935ef23404c9418e:

  s390/vmemmap: coding style updates (2020-07-27 10:34:19 +0200)


- Add support for function error injection.

- Add support for custom exception handlers, as required by BPF_PROBE_MEM.

- Add support for BPF_PROBE_MEM.

- Add trace events for idle enter / exit for the s390 specific idle
  implementation.

- Remove unused zcore memmap device.

- Remove unused "raw view" from s390 debug feature.

- AP bus + zcrypt device driver code refactoring.

- Provide cex4 cca sysfs attributes for cex3 for zcrypt device driver.

- Expose only minimal interface to walk physmem for mm/memblock. This
  is a common code change and it has been agreed on with Mike Rapoport
  and Andrew Morton that this can go upstream via the s390 tree.

- Rework of the s390 vmem/vmemmap code to allow for future memory hot
  remove.

- Get rid of FORCE_MAX_ZONEORDER to finally allow for order-10
  allocations again, instead of only order-8 allocations.

- Various small improvements and fixes.


Alexander Egorenkov (1):
  s390/zcore: remove memmap device

Christian Borntraeger (1):
  s390: fix comment regarding interrupts in svc

David Hildenbrand (13):
  s390/vmem: get rid of memory segment list
  s390/extmem: remove stale -ENOSPC comment and handling
  mm/memblock: expose only miminal interface to add/walk physmem
  s390/mm: don't set ARCH_KEEP_MEMBLOCK
  s390/vmem: rename vmem_add_mem() to vmem_add_range()
  s390/vmem: consolidate vmem_add_range() and vmem_remove_range()
  s390/vmemmap: extend modify_pagetable() to handle vmemmap
  s390/vmemmap: cleanup when vmemmap_populate() fails
  s390/vmemmap: take the vmem_mutex when populating/freeing
  s390/vmem: cleanup empty page tables
  s390/vmemmap: fallback to PTEs if mapping large PMD fails
  s390/vmemmap: remember unused sub-pmd ranges
  s390/vmemmap: avoid memset(PAGE_UNUSED) when adding consecutive sections

Gustavo A. R. Silva (1):
  s390/appldata: use struct_size() helper

Harald Freudenberger (7):
  s390/pkey: fix smatch warning inconsistent indenting
  s390/zcrypt: fix smatch warnings
  s390/zcrypt: code beautification and struct field renames
  s390/zcrypt: split ioctl function into smaller code units
  s390/ap: rename and clarify ap state machine related stuff
  s390/zcrypt: provide cex4 cca sysfs attributes for cex3
  s390/ap: rework crypto config info and default domain code

Heiko Carstens (11):
  s390/debug: remove raw view
  s390/debug: remove struct __debug_entry from uapi
  s390/smp: move smp_cpus_done() to header file
  s390/smp: add missing linebreak
  s390/mm: fix typo in comment
  s390/mm: avoid trimming to MAX_ORDER
  s390/mm: allow order 10 allocations
  s390/time: use CLOCKSOURCE_MASK
  s390/time: select CLOCKSOURCE_VALIDATE_LAST_CYCLE
  s390/time: improve comparison for tod steering
  s390/vmemmap: coding style updates

Ilya Leoshkevich (4):
  s390/kernel: unify EX_TABLE* implementations
  s390/kernel: expand exception table logic to allow new handling options
  s390/bpf: implement BPF_PROBE_MEM
  s390: enable HAVE_FUNCTION_ERROR_INJECTION

Julian Wiedmann (3):
  s390/qdio: fix statistics for 128 SBALs
  s390/qdio: allow to scan all 128 Input SBALs
  s390/qdio: remove internal polling in non-thinint path

Niklas Schnelle (1):
  s390/pci: clarify comment in s390_mmio_read/write

Oscar Carter (1):
  s390/tty3270: remove function callback casts

Sven Schnelle (5):
  s390: convert to msecs_to_jiffies()
  s390/pci: remove unused functions
  s390/time: remove unused function
  s390/stp: allow group and users to read stp sysfs files
  s390: add trace events for idle enter/exit

 Documentation/s390/s390dbf.rst  |  17 +-
 arch/s390/Kconfig   |   7 +-
 arch/s390/appldata/appldata_os.c|   6 +-
 arch/s390/include/asm/asm-const.h   |  12 +
 arch/s390/include/asm/debug.h   |  18 +-
 arch/s390/include/asm/extable.h |  52 ++-
 arch/s390/include/asm/linkage.h |  35 +-
 arch/s390/include/asm/pci_dma.h |  11 -
 arch/s390/include/asm/pgtable.h |   2 +-
 arch/s390/include/asm/ptrace.h  |   5 +
 arch/s390/include/asm/smp.h |   4 +
 arch/s390/include/asm/syscall_wrapper.h |   6 +-
 arch/s390/include/asm/timex.h   |   5 -
 arch/s390/include/uapi/asm/debug.h  |  3

Re: [PATCH] s390/test_unwind: fix possible memleak in test_unwind()

2020-07-31 Thread Heiko Carstens
On Thu, Jul 30, 2020 at 09:35:15AM +0200, Ilya Leoshkevich wrote:
> On Thu, 2020-07-30 at 14:36 +0800, Wang Hai wrote:
> > test_unwind() misses to call kfree(bt) in an error path.
> > Add the missed function call to fix it.
> > 
> > Fixes: 0610154650f1 ("s390/test_unwind: print verbose unwinding
> > results")
> > Reported-by: Hulk Robot 
> > Signed-off-by: Wang Hai 
> > ---
> >  arch/s390/lib/test_unwind.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/arch/s390/lib/test_unwind.c
> > b/arch/s390/lib/test_unwind.c
> > index 32b7a30b2485..b0b12b46bc57 100644
> > --- a/arch/s390/lib/test_unwind.c
> > +++ b/arch/s390/lib/test_unwind.c
> > @@ -63,6 +63,7 @@ static noinline int test_unwind(struct task_struct
> > *task, struct pt_regs *regs,
> > break;
> > if (state.reliable && !addr) {
> > pr_err("unwind state reliable but addr is
> > 0\n");
> > +   kfree(bt);
> > return -EINVAL;
> > }
> > sprint_symbol(sym, addr);
> 
> Looks good to me, thanks!
> 
> Acked-by: Ilya Leoshkevich 

Applied, thanks!
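
For readers who want to see the pattern in isolation, here is a userspace
sketch of the leak and its fix. It is an assumption-laden stand-in, not the
actual test_unwind() code: malloc()/free() replace kmalloc()/kfree(), the
allocation counter and the names tracked_alloc(), tracked_free(),
unwind_check() and leaked() are invented for the example, and the buffer size
is arbitrary.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Counts buffers that were allocated but not yet freed (illustration only). */
static int live_allocations;

static void *tracked_alloc(size_t n)
{
	void *p = malloc(n);

	if (p)
		live_allocations++;
	return p;
}

static void tracked_free(void *p)
{
	if (p) {
		live_allocations--;
		free(p);
	}
}

/* Mirrors the shape of the error path: the early return must free bt too. */
int unwind_check(int reliable, unsigned long addr)
{
	char *bt = tracked_alloc(256);

	if (!bt)
		return -ENOMEM;
	if (reliable && !addr) {
		tracked_free(bt);	/* corresponds to the kfree(bt) the patch adds */
		return -EINVAL;
	}
	tracked_free(bt);
	return 0;
}

int leaked(void)
{
	return live_allocations;
}
```

Without the tracked_free() call in the early return, every failing call would
leave one buffer behind, which is the leak the patch closes.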


Re: [PATCH v2 0/9] s390: implement and optimize vmemmap_free()

2020-07-24 Thread Heiko Carstens
On Wed, Jul 22, 2020 at 11:45:49AM +0200, David Hildenbrand wrote:
> This series is based on the latest s390/features branch [1]. It
> consolidates vmem_add_range(), vmem_remove_range(), and vmemmap_populate()
> into a single, recursive page table walker. It then implements
> vmemmap_free() and optimizes it by
> - Freeing empty page tables (also done for vmem_remove_range()).
> - Handling cases where the vmemmap of a section does not fill huge pages
>   completely (e.g., sizeof(struct page) == 56).
> 
> vmemmap_free() is currently never used, unless adding standby memory fails
> (unlikely). This is relevant for virtio-mem, which adds/removes memory
> in memory block/section granularity (always removes memory in the same
> granularity it added it).
> 
> I gave this a proper test with my virtio-mem prototype (which I will share
> in the near future), both with 56 byte memmap per page and 64 byte memmap
> per page, with and without huge page support. In both cases, removing
> memory (routed through arch_remove_memory()) will result in
> - all populated vmemmap pages to get removed/freed
> - all applicable page tables for the vmemmap getting removed/freed
> - all applicable page tables for the identity mapping getting removed/freed
> Unfortunately, I don't have access to bigger and z/VM (esp. dcss)
> environments.
> 
> This is the basis for real memory hotunplug support for s390x and should
> complete my journey to s390x vmem/vmemmap code for now
> 
> What needs double-checking is tlb flushing. AFAIKS, as there are no valid
> accesses, doing a single range flush at the end is sufficient, both when
> removing vmemmap pages and the identity mapping.
> 
> [1] 
> https://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git/commit/?h=features
> 
> v1 -> v2:
> - Convert to a single page table walker named "modify_pagetable()", with
>   two helper functions "add_pagetable()" and "remove_pagetable()".
> 
> David Hildenbrand (9):
>   s390/vmem: rename vmem_add_mem() to vmem_add_range()
>   s390/vmem: consolidate vmem_add_range() and vmem_remove_range()
>   s390/vmemmap: extend modify_pagetable() to handle vmemmap
>   s390/vmemmap: cleanup when vmemmap_populate() fails
>   s390/vmemmap: take the vmem_mutex when populating/freeing
>   s390/vmem: cleanup empty page tables
>   s390/vmemmap: fallback to PTEs if mapping large PMD fails
>   s390/vmemmap: remember unused sub-pmd ranges
>   s390/vmemmap: avoid memset(PAGE_UNUSED) when adding consecutive
> sections
> 
>  arch/s390/mm/vmem.c | 637 ++--
>  1 file changed, 442 insertions(+), 195 deletions(-)

Series applied, thank you!


[GIT PULL] s390 updates for 5.8-rc7

2020-07-23 Thread Heiko Carstens
Hello Linus,

please pull s390 updates for 5.8-rc7.

Thanks,
Heiko

The following changes since commit ba47d845d715a010f7b51f6f89bae32845e6acb7:

  Linux 5.8-rc6 (2020-07-19 15:41:18 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git tags/s390-5.8-6

for you to fetch changes up to 0cfa112b33aba4473b00151c75b87818a835702a:

  MAINTAINERS: add Matthew for s390 IOMMU (2020-07-22 17:01:23 +0200)


- Change cpum_cf/perf counter name from DFLT_CCERROR to DFLT_CCFINISH
  to reflect reality and avoid further confusion. This is a user space
  visible change therefore the commit has also a stable tag for 5.7,
  where this counter was introduced.

- Add Matthew Rosato as s390 IOMMU maintainer.


Gerald Schaefer (1):
  MAINTAINERS: add Matthew for s390 IOMMU

Thomas Richter (1):
  s390/cpum_cf,perf: change DFLT_CCERROR counter name

 MAINTAINERS  | 1 +
 arch/s390/kernel/perf_cpum_cf_events.c   | 4 ++--
 tools/perf/pmu-events/arch/s390/cf_z15/extended.json | 2 +-
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index d53db30d1365..df5fc5625ec8 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14862,6 +14862,7 @@ F:  drivers/s390/block/dasd*
 F: include/linux/dasd_mod.h
 
 S390 IOMMU (PCI)
+M: Matthew Rosato 
 M: Gerald Schaefer 
 L: linux-s...@vger.kernel.org
 S: Supported
diff --git a/arch/s390/kernel/perf_cpum_cf_events.c 
b/arch/s390/kernel/perf_cpum_cf_events.c
index 1e3df52b2b65..37265f551a11 100644
--- a/arch/s390/kernel/perf_cpum_cf_events.c
+++ b/arch/s390/kernel/perf_cpum_cf_events.c
@@ -292,7 +292,7 @@ CPUMF_EVENT_ATTR(cf_z15, TX_C_TABORT_SPECIAL, 0x00f5);
 CPUMF_EVENT_ATTR(cf_z15, DFLT_ACCESS, 0x00f7);
 CPUMF_EVENT_ATTR(cf_z15, DFLT_CYCLES, 0x00fc);
 CPUMF_EVENT_ATTR(cf_z15, DFLT_CC, 0x00108);
-CPUMF_EVENT_ATTR(cf_z15, DFLT_CCERROR, 0x00109);
+CPUMF_EVENT_ATTR(cf_z15, DFLT_CCFINISH, 0x00109);
 CPUMF_EVENT_ATTR(cf_z15, MT_DIAG_CYCLES_ONE_THR_ACTIVE, 0x01c0);
 CPUMF_EVENT_ATTR(cf_z15, MT_DIAG_CYCLES_TWO_THR_ACTIVE, 0x01c1);
 
@@ -629,7 +629,7 @@ static struct attribute *cpumcf_z15_pmu_event_attr[] 
__initdata = {
CPUMF_EVENT_PTR(cf_z15, DFLT_ACCESS),
CPUMF_EVENT_PTR(cf_z15, DFLT_CYCLES),
CPUMF_EVENT_PTR(cf_z15, DFLT_CC),
-   CPUMF_EVENT_PTR(cf_z15, DFLT_CCERROR),
+   CPUMF_EVENT_PTR(cf_z15, DFLT_CCFINISH),
CPUMF_EVENT_PTR(cf_z15, MT_DIAG_CYCLES_ONE_THR_ACTIVE),
CPUMF_EVENT_PTR(cf_z15, MT_DIAG_CYCLES_TWO_THR_ACTIVE),
NULL,
diff --git a/tools/perf/pmu-events/arch/s390/cf_z15/extended.json 
b/tools/perf/pmu-events/arch/s390/cf_z15/extended.json
index 2df2e231e9ee..24c4ba2a9ae5 100644
--- a/tools/perf/pmu-events/arch/s390/cf_z15/extended.json
+++ b/tools/perf/pmu-events/arch/s390/cf_z15/extended.json
@@ -380,7 +380,7 @@
{
"Unit": "CPU-M-CF",
"EventCode": "265",
-   "EventName": "DFLT_CCERROR",
+   "EventName": "DFLT_CCFINISH",
"BriefDescription": "Increments by one for every DEFLATE 
CONVERSION CALL instruction executed that ended in Condition Codes 0, 1 or 2",
"PublicDescription": "Increments by one for every DEFLATE 
CONVERSION CALL instruction executed that ended in Condition Codes 0, 1 or 2"
},

