from:"Heiko Carstens"

Re: [PATCH v3 11/11] sysctl: treewide: constify the ctl_table argument of handlers

2024-04-29 Thread Heiko Carstens

On Tue, Apr 23, 2024 at 09:54:46AM +0200, Thomas Weißschuh wrote:
> Adapt the proc_handler function signature to make it clear that handlers
> are not supposed to modify their ctl_table argument.
> 
> This is a prerequisite to moving the static ctl_table structs into
> rodata.
> By migrating all handlers at once a lengthy transition can be avoided.
> 
> The patch was mostly generated by coccinelle with the following script:
> 
> @@
> identifier func, ctl, write, buffer, lenp, ppos;
> @@
> 
> int func(
> - struct ctl_table *ctl,
> + const struct ctl_table *ctl,
>   int write, void *buffer, size_t *lenp, loff_t *ppos)
> { ... }
> 
> In addition to the scripted changes some other changes are done:
> 
> * the typedef proc_handler is adapted
> 
> * the prototypes of non-static handlers are adapted
> 
> * kernel/seccomp.c:{read,write}_actions_logged() and
>   kernel/watchdog.c:proc_watchdog_common() are adapted as they need to
>   be adapted together with the handlers for type-consistency reasons
> 
> Signed-off-by: Thomas Weißschuh 

...

>  arch/s390/appldata/appldata_base.c| 10 ++---
>  arch/s390/kernel/debug.c  |  2 +-
>  arch/s390/kernel/topology.c   |  2 +-
>  arch/s390/mm/cmm.c|  6 +--

Acked-by: Heiko Carstens  # s390
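
A minimal userspace sketch of what the constified handler signature buys, assuming a simplified stand-in for struct ctl_table. All demo_* names are hypothetical and not kernel API; the point is only that a handler taking a const table pointer can still update the value the table points at, while the table itself can be declared const.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the kernel's struct ctl_table. */
struct demo_ctl_table {
	const char *procname;
	void *data;	/* the value the handler reads or writes */
	int maxlen;
};

/* Handler type with a const table argument, mirroring the new signature. */
typedef int demo_proc_handler(const struct demo_ctl_table *ctl, int write,
			      void *buffer, size_t *lenp);

/* Copies one int between the table's data and the caller's buffer. */
int demo_dointvec(const struct demo_ctl_table *ctl, int write,
		  void *buffer, size_t *lenp)
{
	if (*lenp < sizeof(int))
		return -1;
	if (write)
		memcpy(ctl->data, buffer, sizeof(int));	/* pointee stays writable */
	else
		memcpy(buffer, ctl->data, sizeof(int));
	/* An assignment such as ctl->maxlen = 0 would no longer compile. */
	*lenp = sizeof(int);
	return 0;
}

static int demo_value = 42;

/* With const handlers, the table itself can be const (read-only data). */
static const struct demo_ctl_table demo_table = {
	.procname = "demo_value",
	.data = &demo_value,
	.maxlen = sizeof(int),
};
```

Calling demo_dointvec(&demo_table, 0, &out, &len) reads demo_value into out; the same call with write set to 1 stores a new value without ever touching the table.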

Re: [PATCH 7/9] s390: Convert from tasklet to BH workqueue

2024-04-08 Thread Heiko Carstens

On Wed, Mar 27, 2024 at 04:03:12PM +, Allen Pais wrote:
> The only generic interface to execute asynchronously in the BH context is
> tasklet; however, it's marked deprecated and has some design flaws. To
> replace tasklets, BH workqueue support was recently added. A BH workqueue
> behaves similarly to regular workqueues except that the queued work items
> are executed in the BH context.
> 
> This patch converts drivers/infiniband/* from tasklet to BH workqueue.
> 
> Based on the work done by Tejun Heo 
> Branch: https://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git for-6.10

I guess this dependency is a hard requirement due to commit 134874e2eee9
("workqueue: Allow cancel_work_sync() and disable_work() from atomic contexts
on BH work items")?

> ---
>  drivers/s390/block/dasd.c  | 42 
>  drivers/s390/block/dasd_int.h  | 10 +++---
>  drivers/s390/char/con3270.c| 27 
>  drivers/s390/crypto/ap_bus.c   | 24 +++---
>  drivers/s390/crypto/ap_bus.h   |  2 +-
>  drivers/s390/crypto/zcrypt_msgtype50.c |  2 +-
>  drivers/s390/crypto/zcrypt_msgtype6.c  |  4 +--
>  drivers/s390/net/ctcm_fsms.c   |  4 +--
>  drivers/s390/net/ctcm_main.c   | 15 -
>  drivers/s390/net/ctcm_main.h   |  5 +--
>  drivers/s390/net/ctcm_mpc.c| 12 +++
>  drivers/s390/net/ctcm_mpc.h|  7 ++--
>  drivers/s390/net/lcs.c | 26 +++
>  drivers/s390/net/lcs.h |  2 +-
>  drivers/s390/net/qeth_core_main.c  |  2 +-
>  drivers/s390/scsi/zfcp_qdio.c  | 45 +-
>  drivers/s390/scsi/zfcp_qdio.h  |  9 +++---
>  17 files changed, 117 insertions(+), 121 deletions(-)

I'm asking since this patch comes with multiple compile errors, probably due
to the lack of a cross-compiler toolchain on your side.

If the above weren't a hard dependency, I'd say we could take those parts
of your patch which are fine into the s390 tree for 6.10, fix the rest, and
schedule that as well for 6.10 via the s390 tree.

Re: [PATCH v2 6/7] s390: mm: accelerate pagefault when badaccess

2024-04-07 Thread Heiko Carstens

On Wed, Apr 03, 2024 at 04:38:04PM +0800, Kefeng Wang wrote:
> The vm_flags of the vma are already checked under the per-VMA lock. If it
> is a bad access, handle the error directly; there is no need to retry with
> mmap_lock again. Since the page fault is handled under the per-VMA lock,
> count it as a vma lock event with VMA_LOCK_SUCCESS.
> 
> Signed-off-by: Kefeng Wang 
> ---
>  arch/s390/mm/fault.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
> index c421dd44ffbe..162ca2576fd4 100644
> --- a/arch/s390/mm/fault.c
> +++ b/arch/s390/mm/fault.c
> @@ -325,7 +325,8 @@ static void do_exception(struct pt_regs *regs, int access)
>   goto lock_mmap;
>   if (!(vma->vm_flags & access)) {
>   vma_end_read(vma);
> - goto lock_mmap;
> + count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
> +     return handle_fault_error_nolock(regs, SEGV_ACCERR);

Reviewed-by: Heiko Carstens

Re: [PATCH v2 0/7] arch/mm/fault: accelerate pagefault when badaccess

2024-04-07 Thread Heiko Carstens

On Sun, Apr 07, 2024 at 03:49:53PM +0800, Kefeng Wang wrote:
> On 2024/4/4 4:45, Andrew Morton wrote:
> > On Wed, 3 Apr 2024 16:37:58 +0800 Kefeng Wang wrote:
> > 
> > > After VMA lock-based page fault handling was enabled, a bad access hit
> > > under the per-VMA lock falls back to mmap_lock-based handling, which
> > > leads to an unnecessary mmap_lock acquisition and a second vma lookup.
> > > A test from lmbench shows a 34% improvement with this change on arm64,
> > > 
> > >lat_sig -P 1 prot lat_sig 0.29194 -> 0.19198
> > > 
> > > Only build-tested on archs other than arm64.
> > 
> > Thanks.  So we now want a bunch of architectures to runtime test this.  Do
> > we have a selftest in place which will adequately do this?
> 
> I couldn't find such a selftest. A bad access leads to a coredump, and the
> performance of that path should not matter in most scenarios, so having no
> selftest is acceptable. lmbench is an easy way to measure the performance.

The rationale for this series (performance improvement) is a bit odd,
since I would expect that the changed code is usually never executed.

Re: [PATCH 3/4] arch: define CONFIG_PAGE_SIZE_*KB on all architectures

2024-02-27 Thread Heiko Carstens

On Mon, Feb 26, 2024 at 05:14:13PM +0100, Arnd Bergmann wrote:
> From: Arnd Bergmann 
> 
> Most architectures only support a single hardcoded page size. In order
> to ensure that each one of these sets the corresponding Kconfig symbols,
> change over the PAGE_SHIFT definition to the common one and allow
> only the hardware page size to be selected.
> 
> Signed-off-by: Arnd Bergmann 
> ---
...
>  arch/s390/Kconfig  | 1 +
>  arch/s390/include/asm/page.h   | 2 +-
...
> diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
> index fe565f3a3a91..b61c74c10050 100644
> --- a/arch/s390/Kconfig
> +++ b/arch/s390/Kconfig
> @@ -199,6 +199,7 @@ config S390
>   select HAVE_MOD_ARCH_SPECIFIC
>   select HAVE_NMI
>   select HAVE_NOP_MCOUNT
> + select HAVE_PAGE_SIZE_4KB
>   select HAVE_PCI
>   select HAVE_PERF_EVENTS
>   select HAVE_PERF_REGS
> diff --git a/arch/s390/include/asm/page.h b/arch/s390/include/asm/page.h
> index 73b9c3bf377f..ded9548d11d9 100644
> --- a/arch/s390/include/asm/page.h
> +++ b/arch/s390/include/asm/page.h
> @@ -11,7 +11,7 @@
>  #include 
>  #include 
>  
> -#define _PAGE_SHIFT  12
> +#define _PAGE_SHIFT  CONFIG_PAGE_SHIFT

Acked-by: Heiko Carstens

Re: [PATCH 5/5] sched/vtime: do not include header

2024-01-29 Thread Heiko Carstens

On Sun, Jan 28, 2024 at 08:58:54PM +0100, Alexander Gordeev wrote:
> There is no architecture-specific code or data left
> that generic  needs to know about.
> Thus, avoid the inclusion of  header.
> 
> Signed-off-by: Alexander Gordeev 
> ---
>  include/asm-generic/vtime.h | 1 -
>  include/linux/vtime.h   | 4 
>  2 files changed, 5 deletions(-)
>  delete mode 100644 include/asm-generic/vtime.h

I guess you need to get rid of this as well:

arch/powerpc/include/asm/Kbuild:generic-y += vtime.h

Re: [PATCH 4/5] s390/irq,nmi: do not include header

2024-01-29 Thread Heiko Carstens

On Sun, Jan 28, 2024 at 08:58:53PM +0100, Alexander Gordeev wrote:
> update_timer_sys() and update_timer_mcck() are inlines used for
> CPU time accounting from the interrupt and machine-check handlers.
> These routines are specific to s390 architecture, but declared
> via  header, which in turn includes .
> Avoid the extra loop and include  header directly.
> 
> Signed-off-by: Alexander Gordeev 
> ---
>  arch/s390/kernel/irq.c | 1 +
>  arch/s390/kernel/nmi.c | 1 +
>  2 files changed, 2 insertions(+)
...
> +++ b/arch/s390/kernel/irq.c
> +#include 
...
> +++ b/arch/s390/kernel/nmi.c
> +#include 

It is confusing when the patch subject is "do not include.." and all
this patch does is add two includes. I see what this is
doing: getting rid of the implicit include of asm/vtime.h, most likely
via linux/hardirq.h, but that's not very obvious.

Anyway:
Acked-by: Heiko Carstens

Re: [PATCH 3/5] s390/vtime: remove unused __ARCH_HAS_VTIME_TASK_SWITCH leftover

2024-01-29 Thread Heiko Carstens

On Sun, Jan 28, 2024 at 08:58:52PM +0100, Alexander Gordeev wrote:
> __ARCH_HAS_VTIME_TASK_SWITCH macro is not used anymore.
> 
> Signed-off-by: Alexander Gordeev 
> ---
>  arch/s390/include/asm/vtime.h | 2 --
>  1 file changed, 2 deletions(-)

Acked-by: Heiko Carstens

Re: [PATCH 4/4] ptdump: add check_wx_pages debugfs attribute

2024-01-09 Thread Heiko Carstens

On Tue, Jan 09, 2024 at 01:14:38PM +0100, Christophe Leroy wrote:
> Add a writable attribute in debugfs to trigger a
> W^X pages check at any time.
> 
> To trigger the test, just echo any numeric value into
> /sys/kernel/debug/check_wx_pages
> 
> The result is provided into dmesg.
> 
> Signed-off-by: Christophe Leroy 
> ---
>  mm/ptdump.c | 19 +++
>  1 file changed, 19 insertions(+)
...
> +static int check_wx_debugfs_set(void *data, u64 val)
> +{
> + ptdump_check_wx();
> +
> + return 0;
> +}
> +
> +DEFINE_SIMPLE_ATTRIBUTE(check_wx_fops, NULL, check_wx_debugfs_set, "%llu\n");
> +
> +static int ptdump_debugfs_init(void)
> +{
> + debugfs_create_file("check_wx_pages", 0200, NULL, NULL, &check_wx_fops);
> +
> + return 0;
> +}

Wouldn't it be better to have (only?) a readable attribute which triggers
this, and provides the result via this attribute?
That would allow for automated tests without having to parse dmesg.

Re: [linux-next:master] BUILD REGRESSION 2dac75696c6da3c848daa118a729827541c89d33

2023-10-19 Thread Heiko Carstens

On Thu, Oct 19, 2023 at 04:07:35AM +0800, kernel test robot wrote:
> arch/s390/include/asm/ctlreg.h:129:9: warning: array subscript 0 is outside array bounds of 'struct ctlreg[0]' [-Warray-bounds=]
> arch/s390/include/asm/ctlreg.h:80:9: warning: array subscript 0 is outside array bounds of 'struct ctlreg[0]' [-Warray-bounds=]
...
> |-- s390-defconfig
> |   `-- arch-s390-include-asm-ctlreg.h:warning:array-subscript-is-outside-array-bounds-of-struct-ctlreg
...
> s390defconfig   gcc  

I'm wondering how this warning can appear in the builds. array-bounds
warnings are explicitly disabled, see init/Kconfig: CC_NO_ARRAY_BOUNDS. And
as expected, if I compile the kernel with gcc, defconfig, and with or
without W=1 the option -Wno-array-bounds is passed to the compiler.

And also as expected I do not see the above warnings.

So something is quite odd here.

Re: [PATCH mm-unstable v9 14/31] s390: Convert various pgalloc functions to use ptdescs

2023-10-12 Thread Heiko Carstens

On Mon, Aug 07, 2023 at 04:04:56PM -0700, Vishal Moola (Oracle) wrote:
> As part of the conversions to replace pgtable constructor/destructors with
> ptdesc equivalents, convert various page table functions to use ptdescs.
> 
> Some of the functions use the *get*page*() helper functions. Convert
> these to use pagetable_alloc() and ptdesc_address() instead to help
> standardize page tables further.
> 
> Acked-by: Mike Rapoport (IBM) 
> Signed-off-by: Vishal Moola (Oracle) 
> ---
>  arch/s390/include/asm/pgalloc.h |   4 +-
>  arch/s390/include/asm/tlb.h |   4 +-
>  arch/s390/mm/pgalloc.c  | 128 
>  3 files changed, 69 insertions(+), 67 deletions(-)
...
> diff --git a/arch/s390/mm/pgalloc.c b/arch/s390/mm/pgalloc.c
> index d7374add7820..07fc660a24aa 100644
> --- a/arch/s390/mm/pgalloc.c
> +++ b/arch/s390/mm/pgalloc.c
...
> @@ -488,16 +486,20 @@ static void base_pgt_free(unsigned long *table)
>  static unsigned long *base_crst_alloc(unsigned long val)
>  {
>   unsigned long *table;
> + struct ptdesc *ptdesc;
>  
> - table = (unsigned long *)__get_free_pages(GFP_KERNEL, CRST_ALLOC_ORDER);
> - if (table)
> - crst_table_init(table, val);
> + ptdesc = pagetable_alloc(GFP_KERNEL & ~__GFP_HIGHMEM, CRST_ALLOC_ORDER);

I guess I must be missing something, but what is the reason to mask out
__GFP_HIGHMEM here? It is not part of GFP_KERNEL, nor does s390 support
HIGHMEM.
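
A small sketch of why the mask looks redundant, assuming hypothetical flag values (the DEMO_* constants are illustrative and do not match the kernel's GFP bit layout): clearing a bit that is not part of the composite flag set leaves the value unchanged.

```c
/* Hypothetical bit values for illustration; not the kernel's definitions. */
#define DEMO_GFP_HIGHMEM	0x002u
#define DEMO_GFP_IO		0x040u
#define DEMO_GFP_FS		0x080u
#define DEMO_GFP_RECLAIM	0x400u

/* Composite flag set that, like GFP_KERNEL, does not contain HIGHMEM. */
#define DEMO_GFP_KERNEL	(DEMO_GFP_RECLAIM | DEMO_GFP_IO | DEMO_GFP_FS)

/* Clears the HIGHMEM bit from a flag set. */
unsigned int demo_mask_highmem(unsigned int gfp)
{
	return gfp & ~DEMO_GFP_HIGHMEM;
}
```

Under these assumptions demo_mask_highmem(DEMO_GFP_KERNEL) returns DEMO_GFP_KERNEL unchanged, which is the situation the question above describes.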

Re: [PATCH 1/8] S390: Remove sentinel elem from ctl_table arrays

2023-09-07 Thread Heiko Carstens

On Wed, Sep 06, 2023 at 12:03:22PM +0200, Joel Granados via B4 Relay wrote:
> From: Joel Granados 
> 
> This commit comes at the tail end of a greater effort to remove the
> empty elements at the end of the ctl_table arrays (sentinels) which
> will reduce the overall build time size of the kernel and run time
> memory bloat by ~64 bytes per sentinel (further information Link :
> https://lore.kernel.org/all/zo5yx5jfoggi%2f...@bombadil.infradead.org/)
> 
> Remove the sentinel element from appldata_table, s390dbf_table,
> topology_ctl_table, cmm_table and page_table_sysctl. Reduced the
> memory allocation in appldata_register_ops by 1 effectively removing the
> sentinel from ops->ctl_table.
> 
> Signed-off-by: Joel Granados 
> ---
>  arch/s390/appldata/appldata_base.c | 6 ++
>  arch/s390/kernel/debug.c   | 3 +--
>  arch/s390/kernel/topology.c| 3 +--
>  arch/s390/mm/cmm.c | 3 +--
>  arch/s390/mm/pgalloc.c | 3 +--
>  5 files changed, 6 insertions(+), 12 deletions(-)

Acked-by: Heiko Carstens
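
A rough userspace sketch of the two array conventions discussed in the quoted commit message, assuming a hypothetical demo_entry type in place of struct ctl_table: the old style finds the end by walking to an empty element, the new style relies on the array size and saves one element per table.

```c
#include <stddef.h>

/* Hypothetical stand-in for a table entry. */
struct demo_entry {
	const char *procname;
	int value;
};

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Old style: the array ends with an empty sentinel element. */
static const struct demo_entry with_sentinel[] = {
	{ "alpha", 1 },
	{ "beta", 2 },
	{ NULL, 0 },
};

/* New style: no sentinel; the size is passed along explicitly. */
static const struct demo_entry without_sentinel[] = {
	{ "alpha", 1 },
	{ "beta", 2 },
};

/* Counts entries by walking until the sentinel is found. */
size_t demo_count_until_sentinel(const struct demo_entry *table)
{
	size_t n = 0;

	while (table[n].procname)
		n++;
	return n;
}

/* Sums the values of exactly n entries; no sentinel required. */
int demo_sum(const struct demo_entry *table, size_t n)
{
	int sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += table[i].value;
	return sum;
}
```

Both tables describe the same two entries; the second one is smaller by exactly sizeof(struct demo_entry).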

Re: [PATCH rfc v2 04/10] s390: mm: use try_vma_locked_page_fault()

2023-08-24 Thread Heiko Carstens

On Thu, Aug 24, 2023 at 10:16:33AM +0200, Alexander Gordeev wrote:
> On Mon, Aug 21, 2023 at 08:30:50PM +0800, Kefeng Wang wrote:
> > Use new try_vma_locked_page_fault() helper to simplify code.
> > No functional change intended.
> > 
> > Signed-off-by: Kefeng Wang 
> > ---
> >  arch/s390/mm/fault.c | 66 ++--
> >  1 file changed, 27 insertions(+), 39 deletions(-)
...
> > -   fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, regs);
> > -   if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
> > -   vma_end_read(vma);
> > -   if (!(fault & VM_FAULT_RETRY)) {
> > -   count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
> > -   if (likely(!(fault & VM_FAULT_ERROR)))
> > -   fault = 0;
> 
> This fault fixup is removed in the new version.
...

> > +   vmf.vm_flags = VM_WRITE;
> > +   if (vmf.vm_flags == VM_WRITE)
> > +   vmf.flags |= FAULT_FLAG_WRITE;
> > +
> > +   fault = try_vma_locked_page_fault();
> > +   if (fault == VM_FAULT_NONE)
> > +   goto lock_mm;
> 
> Because VM_FAULT_NONE is set to 0 it gets confused with
> the success code of 0 returned by a fault handler. In the
> former case we want to continue, while in the latter -
> successfully return. I think it applies to all archs.
...
> FWIW, this series ends up with kernel BUG at arch/s390/mm/fault.c:341!

Without having looked in detail into this patch: all of this is likely
because s390's fault handling is quite odd. Not only because fault is set
to 0, but also because of the private VM_FAULT values like
VM_FAULT_BADCONTEXT. I'm just cleaning up all of this, but it won't make it
for the next merge window.

Therefore I'd like to ask to drop the s390 conversion of this series, and
if this series is supposed to be merged the s390 conversion needs to be
done later. Let's not waste more time on the current implementation,
please.
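
The ambiguity described in the quoted review can be sketched in plain C. This is an illustration only, with hypothetical DEMO_* values rather than the kernel's VM_FAULT definitions: once a successful result is collapsed to 0, it can no longer be told apart from a "not handled" marker that is also 0.

```c
/* Hypothetical values for illustration; not the kernel's definitions. */
#define DEMO_FAULT_NONE		0x0u	/* helper did not handle the fault */
#define DEMO_FAULT_ERROR	0x1u
#define DEMO_FAULT_RETRY	0x2u
#define DEMO_FAULT_COMPLETED	0x4u

/* Mirrors the kind of fixup quoted above: non-error results become 0. */
unsigned int demo_fixup(unsigned int fault)
{
	if (!(fault & DEMO_FAULT_ERROR))
		fault = 0;
	return fault;
}

/* Returns 1 when the caller would fall back to the mmap_lock path. */
int demo_falls_back(unsigned int fault)
{
	return fault == DEMO_FAULT_NONE;
}
```

In this sketch a completed fault that went through demo_fixup() makes demo_falls_back() return 1, so the caller would retry work that already succeeded.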

Re: [RFC][PATCH] sched: Rename DIE domain

2023-07-14 Thread Heiko Carstens

On Wed, Jul 12, 2023 at 04:10:56PM +0200, Peter Zijlstra wrote:
> Hi
> 
> Thomas just tripped over the x86 topology setup creating a 'DIE' domain
> for the package mask :-)
> 
> Since these names are SCHED_DEBUG only, rename them.
> I don't think anybody *should* be relying on this, but who knows.
> 
> Signed-off-by: Peter Zijlstra (Intel) 
> ---
>  arch/powerpc/kernel/smp.c   | 2 +-
>  arch/s390/kernel/topology.c | 2 +-
>  arch/x86/kernel/smpboot.c   | 2 +-
>  kernel/sched/topology.c | 2 +-
>  4 files changed, 4 insertions(+), 4 deletions(-)

For s390:
Acked-by: Heiko Carstens

Re: [PATCH] cachestat: wire up cachestat for other architectures

2023-05-11 Thread Heiko Carstens

On Wed, May 10, 2023 at 12:58:06PM -0700, Nhat Pham wrote:
> cachestat was previously only wired in for x86 (and architectures using
> the generic unistd.h table):
> 
> https://lore.kernel.org/lkml/20230503013608.2431726-1-npha...@gmail.com/
> 
> This patch wires cachestat in for all the other architectures.
> 
> Signed-off-by: Nhat Pham 
> ---
...
>  arch/s390/kernel/syscalls/syscall.tbl   | 1 +

Acked-by: Heiko Carstens  (s390)

Re: [PATCH v3 00/24] Remove COMMAND_LINE_SIZE from uapi

2023-02-14 Thread Heiko Carstens

On Tue, Feb 14, 2023 at 09:58:17AM +0100, Geert Uytterhoeven wrote:
> Hi Heiko,
> 
> On Tue, Feb 14, 2023 at 9:39 AM Heiko Carstens  wrote:
> > On Tue, Feb 14, 2023 at 08:49:01AM +0100, Alexandre Ghiti wrote:
> > > This all came up in the context of increasing COMMAND_LINE_SIZE in the
> > > RISC-V port.  In theory that's a UABI break, as COMMAND_LINE_SIZE is the
> > > > maximum length of /proc/cmdline and userspace could statically rely on
> > > that to be correct.
> > >
> > > Usually I wouldn't mess around with changing this sort of thing, but
> > > PowerPC increased it with a5980d064fe2 ("powerpc: Bump COMMAND_LINE_SIZE
> > > to 2048").  There are also a handful of examples of COMMAND_LINE_SIZE
> > > increasing, but they're from before the UAPI split so I'm not quite sure
> > > what that means: e5a6a1c90948 ("powerpc: derive COMMAND_LINE_SIZE from
> > > asm-generic"), 684d2fd48e71 ("[S390] kernel: Append scpdata to kernel
> > > boot command line"), 22242681cff5 ("MIPS: Extend COMMAND_LINE_SIZE"),
> > > and 2b74b85693c7 ("sh: Derive COMMAND_LINE_SIZE from
> > > asm-generic/setup.h.").
> > >
> > > It seems to me like COMMAND_LINE_SIZE really just shouldn't have been
> > > part of the uapi to begin with, and userspace should be able to handle
> > > /proc/cmdline of whatever length it turns out to be.  I don't see any
> > > references to COMMAND_LINE_SIZE anywhere but Linux via a quick Google
> > > search, but that's not really enough to consider it unused on my end.
> > >
> > > The feedback on the v1 seemed to indicate that COMMAND_LINE_SIZE really
> > > shouldn't be part of uapi, so this now touches all the ports.  I've
> > > tried to split this all out and leave it bisectable, but I haven't
> > > tested it all that aggressively.
> >
> > Just to confirm this assumption a bit more: that's actually the same
> > conclusion that we ended up with when commit 3da0243f906a ("s390: make
> > command line configurable") went upstream.
> 
> Commit 622021cd6c560ce7 ("s390: make command line configurable"),
> I assume?

Yes, sorry for that. I got distracted while writing and used the wrong
branch to look this up.

Re: [PATCH v3 00/24] Remove COMMAND_LINE_SIZE from uapi

2023-02-14 Thread Heiko Carstens

On Tue, Feb 14, 2023 at 08:49:01AM +0100, Alexandre Ghiti wrote:
> This all came up in the context of increasing COMMAND_LINE_SIZE in the
> RISC-V port.  In theory that's a UABI break, as COMMAND_LINE_SIZE is the
> maximum length of /proc/cmdline and userspace could statically rely on
> that to be correct.
> 
> Usually I wouldn't mess around with changing this sort of thing, but
> PowerPC increased it with a5980d064fe2 ("powerpc: Bump COMMAND_LINE_SIZE
> to 2048").  There are also a handful of examples of COMMAND_LINE_SIZE
> increasing, but they're from before the UAPI split so I'm not quite sure
> what that means: e5a6a1c90948 ("powerpc: derive COMMAND_LINE_SIZE from
> asm-generic"), 684d2fd48e71 ("[S390] kernel: Append scpdata to kernel
> boot command line"), 22242681cff5 ("MIPS: Extend COMMAND_LINE_SIZE"),
> and 2b74b85693c7 ("sh: Derive COMMAND_LINE_SIZE from
> asm-generic/setup.h.").
> 
> It seems to me like COMMAND_LINE_SIZE really just shouldn't have been
> part of the uapi to begin with, and userspace should be able to handle
> /proc/cmdline of whatever length it turns out to be.  I don't see any
> references to COMMAND_LINE_SIZE anywhere but Linux via a quick Google
> search, but that's not really enough to consider it unused on my end.
> 
> The feedback on the v1 seemed to indicate that COMMAND_LINE_SIZE really
> shouldn't be part of uapi, so this now touches all the ports.  I've
> tried to split this all out and leave it bisectable, but I haven't
> tested it all that aggressively.

Just to confirm this assumption a bit more: that's actually the same
conclusion that we ended up with when commit 3da0243f906a ("s390: make
command line configurable") went upstream.
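
As an illustration of the point that userspace should cope with a /proc/cmdline of any length, here is a minimal sketch that reads a stream into a growing heap buffer instead of a fixed COMMAND_LINE_SIZE array. The demo_* helper names are hypothetical; only standard C library calls are used.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Reads the whole stream into a NUL-terminated heap buffer; caller frees. */
char *demo_read_all(FILE *f, size_t *len_out)
{
	size_t cap = 256, len = 0;
	char *buf = malloc(cap);

	if (!buf)
		return NULL;
	for (;;) {
		/* Always leave room for the terminating NUL. */
		size_t n = fread(buf + len, 1, cap - len - 1, f);

		len += n;
		if (n == 0)
			break;
		if (len == cap - 1) {
			/* Buffer is full: double it and keep reading. */
			char *tmp = realloc(buf, cap * 2);

			if (!tmp) {
				free(buf);
				return NULL;
			}
			buf = tmp;
			cap *= 2;
		}
	}
	buf[len] = '\0';
	if (len_out)
		*len_out = len;
	return buf;
}

/* Convenience wrapper for the kernel command line. */
char *demo_read_cmdline(size_t *len_out)
{
	FILE *f = fopen("/proc/cmdline", "r");
	char *s;

	if (!f)
		return NULL;
	s = demo_read_all(f, len_out);
	fclose(f);
	return s;
}
```

Nothing in this sketch depends on a compile-time maximum, so a larger command line on a newer kernel would not require rebuilding the program.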

Re: [PATCH v3 24/24] s390: Remove empty

2023-02-14 Thread Heiko Carstens

On Tue, Feb 14, 2023 at 08:49:25AM +0100, Alexandre Ghiti wrote:
> From: Palmer Dabbelt 
> 
> Signed-off-by: Palmer Dabbelt 
> ---
>  arch/s390/include/asm/setup.h  | 1 -
>  arch/s390/include/uapi/asm/setup.h | 1 -
>  2 files changed, 2 deletions(-)
>  delete mode 100644 arch/s390/include/uapi/asm/setup.h

Acked-by: Heiko Carstens

Re: [PATCH 00/14] Remove clang's -Qunused-arguments from KBUILD_CPPFLAGS

2023-01-05 Thread Heiko Carstens

On Wed, Jan 04, 2023 at 12:54:18PM -0700, Nathan Chancellor wrote:
> Hi all,
...
> This series has seen my personal test framework, which tests several different
> configurations and architectures, with LLVM tip of tree (16.0.0). I have done
> defconfig, allmodconfig, and allnoconfig builds for arm, arm64, i386, mips,
> powerpc, riscv, s390, and x86_64 with GCC 12.2.0 as well but I am hoping the
> rest of the test infrastructure will catch any lurking problems.
> 
> I would like this series to stay together so that there is no opportunity for
> breakage so please consider giving acks so that this can be carried via the
> kbuild tree.
...
>   s390/vdso: Drop unused '-s' flag from KBUILD_AFLAGS_64
>   s390/vdso: Drop '-shared' from KBUILD_CFLAGS_64
>   s390/purgatory: Remove unused '-MD' and unnecessary '-c' flags
...
>  arch/s390/kernel/vdso64/Makefile|  4 +--
>  arch/s390/purgatory/Makefile    |  2 +-

For the s390 bits:
Acked-by: Heiko Carstens

Re: [PATCH] mm: remove zap_page_range and create zap_vma_pages

2023-01-04 Thread Heiko Carstens

On Tue, Jan 03, 2023 at 04:27:32PM -0800, Mike Kravetz wrote:
> zap_page_range was originally designed to unmap pages within an address
> range that could span multiple vmas.  While working on [1], it was
> discovered that all callers of zap_page_range pass a range entirely within
> a single vma.  In addition, the mmu notification call within
> zap_page_range does not correctly handle ranges that span multiple vmas.  When
> crossing a vma boundary, a new mmu_notifier_range_init/end call pair
> with the new vma should be made.
> 
> Instead of fixing zap_page_range, do the following:
> - Create a new routine zap_vma_pages() that will remove all pages within
>   the passed vma.  Most users of zap_page_range pass the entire vma and
>   can use this new routine.
> - For callers of zap_page_range not passing the entire vma, instead call
>   zap_page_range_single().
> - Remove zap_page_range.
> 
> [1] https://lore.kernel.org/linux-mm/20221114235507.294320-2-mike.krav...@oracle.com/
> Suggested-by: Peter Xu 
> Signed-off-by: Mike Kravetz 
> ---
> RFC->v1 Created zap_vma_pages to zap entire vma (Christoph Hellwig)
> Did not add Acked-by's as routine was changed.
> 
>  arch/arm64/kernel/vdso.c|  6 ++---
>  arch/powerpc/kernel/vdso.c  |  4 +---
>  arch/powerpc/platforms/book3s/vas-api.c |  2 +-
>  arch/powerpc/platforms/pseries/vas.c|  3 +--
>  arch/riscv/kernel/vdso.c|  6 ++---
>  arch/s390/kernel/vdso.c |  4 +---
>  arch/s390/mm/gmap.c |  2 +-
>  arch/x86/entry/vdso/vma.c   |  4 +---
>  drivers/android/binder_alloc.c  |  2 +-
>  include/linux/mm.h  |  7 --
>  mm/memory.c | 30 -
>  mm/page-writeback.c |  2 +-
>  net/ipv4/tcp.c  |  7 +++---
>  13 files changed, 21 insertions(+), 58 deletions(-)

For s390:
Acked-by: Heiko Carstens

Re: [PATCH v3] arch: rename all internal names xchg to arch_xchg

2023-01-03 Thread Heiko Carstens

On Fri, Dec 30, 2022 at 03:15:52PM +0100, Andrzej Hajda wrote:
> __xchg will be used for non-atomic xchg macro.
> 
> Signed-off-by: Andrzej Hajda 
> Reviewed-by: Arnd Bergmann 
> ---
> v2: squashed all arch patches into one
> v3: fixed alpha/xchg_local, thx to l...@intel.com
> ---
...
>  arch/s390/include/asm/cmpxchg.h  | 4 ++--
> diff --git a/arch/s390/include/asm/cmpxchg.h b/arch/s390/include/asm/cmpxchg.h
> index 84c3f0d576c5b1..efc16f4aac8643 100644
> --- a/arch/s390/include/asm/cmpxchg.h
> +++ b/arch/s390/include/asm/cmpxchg.h
> @@ -14,7 +14,7 @@
>  
>  void __xchg_called_with_bad_pointer(void);
>  
> -static __always_inline unsigned long __xchg(unsigned long x,
> +static __always_inline unsigned long __arch_xchg(unsigned long x,
>   unsigned long address, int size)

Please adjust the alignment of the second line.

> @@ -77,7 +77,7 @@ static __always_inline unsigned long __xchg(unsigned long x,
>   __typeof__(*(ptr)) __ret;   \
>   \
>   __ret = (__typeof__(*(ptr)))\
> - __xchg((unsigned long)(x), (unsigned long)(ptr),\
> + __arch_xchg((unsigned long)(x), (unsigned long)(ptr),   \
>  sizeof(*(ptr))); \

Same here.

The same is true for a couple of other architectures - not sure if
they care however.

Re: [PATCH] mm: remove kern_addr_valid() completely

2022-10-18 Thread Heiko Carstens

On Tue, Oct 18, 2022 at 03:40:14PM +0800, Kefeng Wang wrote:
> Most architectures (except arm64/x86/sparc) simply return 1 for
> kern_addr_valid(), which is only used in read_kcore(). read_kcore()
> calls copy_from_kernel_nofault(), which can check whether the
> address is a valid kernel address, so kern_addr_valid() is not needed.
> Remove it completely.
> 
> Signed-off-by: Kefeng Wang 
> ---
...
>  arch/s390/include/asm/pgtable.h   |  2 -

For s390:
Acked-by: Heiko Carstens

Re: [PATCH v6 5/7] treewide: use get_random_u32() when possible

2022-10-11 Thread Heiko Carstens

On Mon, Oct 10, 2022 at 05:06:11PM -0600, Jason A. Donenfeld wrote:
> The prandom_u32() function has been a deprecated inline wrapper around
> get_random_u32() for several releases now, and compiles down to the
> exact same code. Replace the deprecated wrapper with a direct call to
> the real function. The same also applies to get_random_int(), which is
> just a wrapper around get_random_u32(). This was done as a basic find
> and replace.
> 
> Reviewed-by: Greg Kroah-Hartman 
> Reviewed-by: Kees Cook 
> Reviewed-by: Yury Norov 
> Acked-by: Toke Høiland-Jørgensen  # for sch_cake
> Acked-by: Chuck Lever  # for nfsd
> Reviewed-by: Jan Kara  # for ext4
> Acked-by: Mika Westerberg  # for thunderbolt
> Acked-by: Darrick J. Wong  # for xfs
> Signed-off-by: Jason A. Donenfeld 
> ---
>  arch/s390/mm/mmap.c        |  2 +-

For s390:
Acked-by: Heiko Carstens

Re: [PATCH v6 4/7] treewide: use get_random_{u8,u16}() when possible, part 2

2022-10-11 Thread Heiko Carstens

On Mon, Oct 10, 2022 at 05:06:10PM -0600, Jason A. Donenfeld wrote:
> Rather than truncate a 32-bit value to a 16-bit value or an 8-bit value,
> simply use the get_random_{u8,u16}() functions, which are faster than
> wasting the additional bytes from a 32-bit value. This was done by hand,
> identifying all of the places where one of the random integer functions
> was used in a non-32-bit context.
> 
> Reviewed-by: Greg Kroah-Hartman 
> Reviewed-by: Kees Cook 
> Reviewed-by: Yury Norov 
> Signed-off-by: Jason A. Donenfeld 
> ---
>  arch/s390/kernel/process.c     | 2 +-

For s390:
Acked-by: Heiko Carstens

Re: [PATCH v6 1/7] treewide: use prandom_u32_max() when possible, part 1

2022-10-11 Thread Heiko Carstens

On Mon, Oct 10, 2022 at 05:06:07PM -0600, Jason A. Donenfeld wrote:
> Rather than incurring a division or requesting too many random bytes for
> the given range, use the prandom_u32_max() function, which only takes
> the minimum required bytes from the RNG and avoids divisions. This was
...
> Reviewed-by: Greg Kroah-Hartman 
> Reviewed-by: Kees Cook 
> Reviewed-by: Yury Norov 
> Reviewed-by: KP Singh 
> Reviewed-by: Jan Kara  # for ext4 and sbitmap
> Reviewed-by: Christoph Böhmwalder  # for drbd
> Acked-by: Ulf Hansson  # for mmc
> Acked-by: Darrick J. Wong  # for xfs
> Signed-off-by: Jason A. Donenfeld 
> ---
>  arch/s390/kernel/process.c|  2 +-
>  arch/s390/kernel/vdso.c   |  2 +-

For s390:
Acked-by: Heiko Carstens
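
One common way to map a 32-bit random value into a range without a division is a widening multiply followed by a shift. The sketch below shows that technique next to the modulo form; it is an assumption that this resembles what the kernel helper does, and the demo_* names are hypothetical.

```c
#include <stdint.h>

/* Maps rnd into [0, ceil) using a 64-bit multiply and a shift, no division. */
uint32_t demo_bounded(uint32_t rnd, uint32_t ceil)
{
	return (uint32_t)(((uint64_t)rnd * ceil) >> 32);
}

/* The modulo form that such a helper would replace; ceil must be nonzero. */
uint32_t demo_bounded_mod(uint32_t rnd, uint32_t ceil)
{
	return rnd % ceil;
}
```

Both functions return a value below ceil, but they map a given input to different outputs, so they are not drop-in identical for code that depends on specific values from a fixed seed.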

Re: [PATCH v3] random: handle archrandom with multiple longs

2022-07-22 Thread Heiko Carstens

On Tue, Jul 19, 2022 at 03:02:07PM +0200, Jason A. Donenfeld wrote:
> The archrandom interface was originally designed for x86, which supplies
> RDRAND/RDSEED for receiving random words into registers, resulting in
> one function to generate an int and another to generate a long. However,
> other architectures don't follow this.
> 
> On arm64, the SMCCC TRNG interface can return between 1 and 3 longs. On
> s390, the CPACF TRNG interface can return arbitrary amounts, with 32
> longs having the same cost as one. On UML, the os_getrandom() interface
> can return arbitrary amounts.
> 
> So change the api signature to take a "max_longs" parameter designating
> the maximum number of longs requested, and then return the number of
> longs generated.
> 
> Since callers need to check this return value and loop anyway, each arch
> implementation does not bother implementing its own loop to try again to
> fill the maximum number of longs. Additionally, all existing callers
> pass in a constant max_longs parameter. Taken together, these two things
> mean that the codegen doesn't really change much for one-word-at-a-time
> platforms, while performance is greatly improved on platforms such as
> s390.
> 
> Cc: Will Deacon 
> Cc: Alexander Gordeev 
> Cc: Thomas Gleixner 
> Cc: H. Peter Anvin 
> Cc: Catalin Marinas 
> Cc: Borislav Petkov 
> Cc: Heiko Carstens 
> Cc: Johannes Berg 
> Cc: Mark Rutland 
> Cc: Harald Freudenberger 
> Acked-by: Michael Ellerman 
> Signed-off-by: Jason A. Donenfeld 
> ---
>  arch/arm64/include/asm/archrandom.h   | 102 --
>  arch/arm64/kernel/kaslr.c |   2 +-
>  arch/powerpc/include/asm/archrandom.h |  30 ++--
>  arch/powerpc/kvm/book3s_hv.c  |   2 +-
>  arch/s390/include/asm/archrandom.h|  29 ++--
>  arch/um/include/asm/archrandom.h  |  21 ++
>  arch/x86/include/asm/archrandom.h |  41 +--
>  arch/x86/kernel/espfix_64.c   |   2 +-
>  drivers/char/random.c |  45 
>  include/asm-generic/archrandom.h  |  18 +----
>  include/linux/random.h|  12 +--
>  11 files changed, 116 insertions(+), 188 deletions(-)

For s390:
Acked-by: Heiko Carstens
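
The calling convention described in the quoted commit message can be sketched as follows, assuming a hypothetical architecture hook that yields at most three words per call. The demo_* functions are stand-ins, not the kernel's interface, and the "random" values are fixed placeholders.

```c
#include <stddef.h>

/*
 * Hypothetical arch hook: stores up to max_longs words in v and returns
 * how many were stored. This stand-in yields at most 3 words per call.
 */
size_t demo_arch_get_random_longs(unsigned long *v, size_t max_longs)
{
	size_t n = max_longs < 3 ? max_longs : 3;

	for (size_t i = 0; i < n; i++)
		v[i] = 0x9e3779b9UL * (i + 1);	/* placeholder, not random */
	return n;
}

/* Caller-side loop: keep asking until the buffer is full or the hook fails. */
size_t demo_fill(unsigned long *buf, size_t want)
{
	size_t got = 0;

	while (got < want) {
		size_t n = demo_arch_get_random_longs(buf + got, want - got);

		if (n == 0)
			break;	/* hook cannot deliver right now */
		got += n;
	}
	return got;
}
```

Because the loop lives in the caller, an architecture that can return many words at once fills the buffer in a single call, while a one-word architecture simply goes around the loop more often.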

Re: [PATCH v2] random: remove CONFIG_ARCH_RANDOM

2022-07-06 Thread Heiko Carstens

On Wed, Jul 06, 2022 at 02:32:25AM +0200, Jason A. Donenfeld wrote:
> When RDRAND was introduced, there was much discussion on whether it
> should be trusted and how the kernel should handle that. Initially, two
> mechanisms cropped up, CONFIG_ARCH_RANDOM, a compile time switch, and
> "nordrand", a boot-time switch.
> 
> Later the thinking evolved. With a properly designed RNG, using RDRAND
> values alone won't harm anything, even if the outputs are malicious.
> Rather, the issue is whether those values are being *trusted* to be good
> or not. And so a new set of options were introduced as the real
> ones that people use -- CONFIG_RANDOM_TRUST_CPU and "random.trust_cpu".
> With these options, RDRAND is used, but it's not always credited. So in
> the worst case, it does nothing, and in the best case, maybe it helps.
> 
> Along the way, CONFIG_ARCH_RANDOM's meaning got sort of pulled into the
> center and became something certain platforms force-select.
> 
> The old options don't really help with much, and it's a bit odd to have
> special handling for these instructions when the kernel can deal fine
> with the existence or untrusted existence or broken existence or
> non-existence of that CPU capability.
> 
> So this commit simplifies things down to the two options that are
> actually used, and removes the confusing old ones that aren't used or
> useful. It leaves "nordrand" for now, as the removal of that will take a
> different route.
> 
> Cc: Catalin Marinas 
> Cc: Will Deacon 
> Cc: Michael Ellerman 
> Cc: Heiko Carstens 
> Cc: Alexander Gordeev 
> Cc: Thomas Gleixner 
> Cc: H. Peter Anvin 
> Cc: Greg Kroah-Hartman 
> Cc: Arnd Bergmann 
> Signed-off-by: Jason A. Donenfeld 
...
>  arch/s390/Kconfig | 15 ---
>  arch/s390/configs/zfcpdump_defconfig  |  1 -
>  arch/s390/crypto/Makefile |  2 +-
>  arch/s390/include/asm/archrandom.h|  3 ---

For s390:
Acked-by: Heiko Carstens
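
For illustration only, the distinction the quoted commit message draws between using and trusting CPU-provided random values can be sketched in userspace C. This is a toy model, not the kernel's implementation: struct toy_pool, toy_mix_arch_random() and the trivial mixing step are all hypothetical names and stand-ins chosen for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy pool; not the kernel's data structure. */
struct toy_pool {
	uint64_t state;        /* mixed input */
	unsigned int credited; /* entropy bits credited so far */
};

/* Always mix the CPU-provided value in; credit it only if it is trusted. */
static void toy_mix_arch_random(struct toy_pool *pool, uint64_t cpu_value,
				bool trust_cpu)
{
	/*
	 * Mixing is unconditional. Both steps are bijective on 64 bits,
	 * so even a malicious value cannot remove what is already there.
	 * This is a stand-in for a real cryptographic mixing function.
	 */
	pool->state ^= cpu_value;
	pool->state *= 0x9e3779b97f4a7c15ULL; /* arbitrary odd multiplier */

	/* Crediting is the only part that depends on trust. */
	if (trust_cpu)
		pool->credited += 64;
}
```

With trust_cpu set to false the value still influences the pool state, but the credited count stays unchanged, which corresponds to the "in the worst case, it does nothing" remark in the quoted text.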

Re: [PATCH v5] mm: Avoid unnecessary page fault retires on shared memory types

2022-05-31 Thread Heiko Carstens

On Mon, May 30, 2022 at 02:34:50PM -0400, Peter Xu wrote:
> I observed that for each of the shared file-backed page faults, we're very
> likely to retry one more time for the 1st write fault upon no page.  It's
> because we'll need to release the mmap lock for dirty rate limit purpose
> with balance_dirty_pages_ratelimited() (in fault_dirty_shared_page()).
> 
> Then after that throttling we return VM_FAULT_RETRY.
> 
> We did that probably because VM_FAULT_RETRY is the only way we can return
> to the fault handler at that time telling it we've released the mmap lock.
> 
> However that's not ideal because it's very likely the fault does not need
> to be retried at all since the pgtable was well installed before the
> throttling, so the next continuous fault (including taking mmap read lock,
> walk the pgtable, etc.) could be in most cases unnecessary.
> 
> It's not only slowing down page faults for shared file-backed, but also add
> more mmap lock contention which is in most cases not needed at all.
> 
> To observe this, one could try to write to some shmem page and look at
> "pgfault" value in /proc/vmstat, then we should expect 2 counts for each
> shmem write simply because we retried, and vm event "pgfault" will capture
> that.
> 
> To make it more efficient, add a new VM_FAULT_COMPLETED return code just to
> show that we've completed the whole fault and released the lock.  It's also
> a hint that we should very possibly not need another fault immediately on
> this page because we've just completed it.
> 
> This patch provides a ~12% perf boost on my aarch64 test VM with a simple
> program sequentially dirtying 400MB shmem file being mmap()ed and these are
> the time it needs:
> 
>   Before: 650.980 ms (+-1.94%)
>   After:  569.396 ms (+-1.38%)
> 
> I believe it could help more than that.
> 
> We need some special care on GUP and the s390 pgfault handler (for gmap
> code before returning from pgfault), the rest changes in the page fault
> handlers should be relatively straightforward.
> 
> Another thing to mention is that mm_account_fault() does take this new
> fault as a generic fault to be accounted, unlike VM_FAULT_RETRY.
> 
> I explicitly didn't touch hmm_vma_fault() and break_ksm() because they do
> not handle VM_FAULT_RETRY even with existing code, so I'm literally keeping
> them as-is.
> 
> Acked-by: Geert Uytterhoeven 
> Acked-by: Peter Zijlstra (Intel) 
> Acked-by: Johannes Weiner 
> Acked-by: Vineet Gupta 
> Acked-by: Guo Ren 
> Acked-by: Max Filippov 
> Acked-by: Christian Borntraeger 
> Acked-by: Michael Ellerman  (powerpc)
> Acked-by: Catalin Marinas 
> Reviewed-by: Alistair Popple 
> Reviewed-by: Ingo Molnar 
> Signed-off-by: Peter Xu 
> ---
...
>  arch/s390/mm/fault.c  | 12 
> diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
> index e173b6187ad5..973dcd05c293 100644
> --- a/arch/s390/mm/fault.c
> +++ b/arch/s390/mm/fault.c
> @@ -433,6 +433,17 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access)
>   goto out_up;
>   goto out;
>   }
> +
> + /* The fault is fully completed (including releasing mmap lock) */
> + if (fault & VM_FAULT_COMPLETED) {
> + if (gmap) {
> + mmap_read_lock(mm);
> + goto out_gmap;
> + }
> + fault = 0;
> + goto out;
> + }
> +
>   if (unlikely(fault & VM_FAULT_ERROR))
>   goto out_up;
>  
> @@ -452,6 +463,7 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access)
>   mmap_read_lock(mm);
>   goto retry;
>   }
> +out_gmap:
>   if (IS_ENABLED(CONFIG_PGSTE) && gmap) {
>   address =  __gmap_link(gmap, current->thread.gmap_addr,
>  address);

FWIW:
Acked-by: Heiko Carstens

Re: [PATCH v4] mm: Avoid unnecessary page fault retires on shared memory types

2022-05-30 Thread Heiko Carstens

On Mon, May 30, 2022 at 12:00:52PM -0400, Peter Xu wrote:
> On Mon, May 30, 2022 at 11:52:54AM -0400, Peter Xu wrote:
> > On Mon, May 30, 2022 at 11:35:10AM +0200, Christian Borntraeger wrote:
> > > > diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
> > > > index 4608cc962ecf..e1d40ca341b7 100644
> > > > --- a/arch/s390/mm/fault.c
> > > > +++ b/arch/s390/mm/fault.c
> > > > @@ -436,12 +436,11 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access)
> > > > /* The fault is fully completed (including releasing mmap lock) */
> > > > if (fault & VM_FAULT_COMPLETED) {
> > > > -   /*
> > > > -* Gmap will need the mmap lock again, so retake it.  TODO:
> > > > -* only conditionally take the lock when CONFIG_PGSTE set.
> > > > -*/
> > > > -   mmap_read_lock(mm);
> > > > -   goto out_gmap;
> > > > +   if (gmap) {
> > > > +   mmap_read_lock(mm);
> > > > +   goto out_gmap;
> > > > +   }
fault = 0;  <
> > > > +   goto out;
> 
> Hmm, right after I replied I found "goto out" could be problematic, since
> all s390 callers of do_exception() will assume it an error condition (side
> note: "goto out_gmap" contains one step to clear "fault" to 0).  I'll
> replace this with "return 0" instead if it looks good to both of you.
> 
> I'll wait for a confirmation before reposting.  Thanks,

Right, that was stupid. Thanks for double checking!

However could you please add "fault = 0" just in front of the goto out
like above? I'd like to avoid having returns and gotos mixed.
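
As a rough illustration of the control flow being discussed, here is a self-contained userspace sketch. It is not the s390 fault handler: the TOY_* values, toy_lock_depth and toy_fault_tail() are hypothetical stand-ins, and the lock is modelled as a simple counter so that the balance can be checked.

```c
#include <stdbool.h>

#define TOY_FAULT_COMPLETED 0x1u /* hypothetical stand-in values */
#define TOY_FAULT_ERROR     0x2u

static int toy_lock_depth; /* models the mmap read lock: 1 when held */

static void toy_read_lock(void)   { toy_lock_depth++; }
static void toy_read_unlock(void) { toy_lock_depth--; }

/*
 * Models the tail of the handler. On entry the lock is held unless
 * TOY_FAULT_COMPLETED is set, in which case it was already released.
 */
static unsigned int toy_fault_tail(unsigned int fault, bool gmap)
{
	if (fault & TOY_FAULT_COMPLETED) {
		if (gmap) {
			toy_read_lock();	/* gmap linking needs the lock */
			goto out_gmap;
		}
		fault = 0;	/* callers treat non-zero as an error */
		goto out;
	}
	if (fault & TOY_FAULT_ERROR)
		goto out_up;
out_gmap:
	/* gmap linking would happen here, with the lock held */
	fault = 0;
out_up:
	toy_read_unlock();
out:
	return fault;
}
```

The sketch keeps a single return statement and clears fault before "goto out", so the completed case is not reported as an error and returns are not mixed with gotos.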

Re: [PATCH v4] mm: Avoid unnecessary page fault retires on shared memory types

2022-05-30 Thread Heiko Carstens

On Fri, May 27, 2022 at 03:39:36PM -0400, Peter Xu wrote:
> diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
> index e173b6187ad5..4608cc962ecf 100644
> --- a/arch/s390/mm/fault.c
> +++ b/arch/s390/mm/fault.c
> @@ -433,6 +433,17 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access)
>   goto out_up;
>   goto out;
>   }
> +
> + /* The fault is fully completed (including releasing mmap lock) */
> + if (fault & VM_FAULT_COMPLETED) {
> + /*
> +  * Gmap will need the mmap lock again, so retake it.  TODO:
> +  * only conditionally take the lock when CONFIG_PGSTE set.
> +  */
> + mmap_read_lock(mm);
> + goto out_gmap;
> + }
> +
>   if (unlikely(fault & VM_FAULT_ERROR))
>   goto out_up;
>  

Guess the patch below on top of your patch is what we want.
Just for clarification: if gmap is not NULL then the process is a kvm
process. So, depending on the workload, this optimization makes sense.

diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
index 4608cc962ecf..e1d40ca341b7 100644
--- a/arch/s390/mm/fault.c
+++ b/arch/s390/mm/fault.c
@@ -436,12 +436,11 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access)

/* The fault is fully completed (including releasing mmap lock) */
if (fault & VM_FAULT_COMPLETED) {
-   /*
-* Gmap will need the mmap lock again, so retake it.  TODO:
-* only conditionally take the lock when CONFIG_PGSTE set.
-*/
-   mmap_read_lock(mm);
-   goto out_gmap;
+   if (gmap) {
+   mmap_read_lock(mm);
+   goto out_gmap;
+   }
+   goto out;
}

if (unlikely(fault & VM_FAULT_ERROR))

Re: [PATCH v3] mm: Avoid unnecessary page fault retires on shared memory types

2022-05-27 Thread Heiko Carstens

On Tue, May 24, 2022 at 07:45:31PM -0400, Peter Xu wrote:
> I observed that for each of the shared file-backed page faults, we're very
> likely to retry one more time for the 1st write fault upon no page.  It's
> because we'll need to release the mmap lock for dirty rate limit purpose
> with balance_dirty_pages_ratelimited() (in fault_dirty_shared_page()).
> 
> Then after that throttling we return VM_FAULT_RETRY.
> 
> We did that probably because VM_FAULT_RETRY is the only way we can return
> to the fault handler at that time telling it we've released the mmap lock.
> 
> However that's not ideal because it's very likely the fault does not need
> to be retried at all since the pgtable was well installed before the
> throttling, so the next continuous fault (including taking mmap read lock,
> walk the pgtable, etc.) could be in most cases unnecessary.
> 
> It's not only slowing down page faults for shared file-backed, but also add
> more mmap lock contention which is in most cases not needed at all.
> 
> To observe this, one could try to write to some shmem page and look at
> "pgfault" value in /proc/vmstat, then we should expect 2 counts for each
> shmem write simply because we retried, and vm event "pgfault" will capture
> that.
> 
> To make it more efficient, add a new VM_FAULT_COMPLETED return code just to
> show that we've completed the whole fault and released the lock.  It's also
> a hint that we should very possibly not need another fault immediately on
> this page because we've just completed it.
> 
> This patch provides a ~12% perf boost on my aarch64 test VM with a simple
> program sequentially dirtying 400MB shmem file being mmap()ed and these are
> the time it needs:
> 
>   Before: 650.980 ms (+-1.94%)
>   After:  569.396 ms (+-1.38%)
> 
> I believe it could help more than that.
> 
> We need some special care on GUP and the s390 pgfault handler (for gmap
> code before returning from pgfault), the rest changes in the page fault
> handlers should be relatively straightforward.
> 
> Another thing to mention is that mm_account_fault() does take this new
> fault as a generic fault to be accounted, unlike VM_FAULT_RETRY.
> 
> I explicitly didn't touch hmm_vma_fault() and break_ksm() because they do
> not handle VM_FAULT_RETRY even with existing code, so I'm literally keeping
> them as-is.
> 
> Signed-off-by: Peter Xu 
...
> diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
> index e173b6187ad5..9503a7cfaf03 100644
> --- a/arch/s390/mm/fault.c
> +++ b/arch/s390/mm/fault.c
> @@ -339,6 +339,7 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access)
>   unsigned long address;
>   unsigned int flags;
>   vm_fault_t fault;
> + bool need_unlock = true;
>   bool is_write;
>  
>   tsk = current;
> @@ -433,6 +434,13 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access)
>   goto out_up;
>   goto out;
>   }
> +
> + /* The fault is fully completed (including releasing mmap lock) */
> + if (fault & VM_FAULT_COMPLETED) {
> + need_unlock = false;
> + goto out_gmap;
> + }
> +
>   if (unlikely(fault & VM_FAULT_ERROR))
>   goto out_up;
>  
> @@ -452,6 +460,7 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access)
>   mmap_read_lock(mm);
>   goto retry;
>   }
> +out_gmap:
>   if (IS_ENABLED(CONFIG_PGSTE) && gmap) {
>   address =  __gmap_link(gmap, current->thread.gmap_addr,
>  address);
> @@ -466,7 +475,8 @@ static inline vm_fault_t do_exception(struct pt_regs *regs, int access)
>   }
>   fault = 0;
>  out_up:
> - mmap_read_unlock(mm);
> + if (need_unlock)
> + mmap_read_unlock(mm);
>  out:

This seems to be incorrect. __gmap_link() requires the mmap_lock to be
held. Christian, Janosch, or David, could you please check?

Re: [PATCH 1/2] locking/lockref: Use try_cmpxchg64 in CMPXCHG_LOOP macro

2022-05-27 Thread Heiko Carstens

On Thu, May 26, 2022 at 01:42:35PM +0100, Mark Rutland wrote:
> On Thu, May 26, 2022 at 10:14:59PM +1000, Michael Ellerman wrote:
> > Linus Torvalds  writes:
> > > On Wed, May 25, 2022 at 7:40 AM Uros Bizjak  wrote:
> > >>
> > >> Use try_cmpxchg64 instead of cmpxchg64 in CMPXCHG_LOOP macro.
> > >> x86 CMPXCHG instruction returns success in ZF flag, so this
> > >> change saves a compare after cmpxchg (and related move instruction
> > >> in front of cmpxchg). The main loop of lockref_get improves from:
> > >
> > > Ack on this one regardless of the 32-bit x86 question.
> > >
> > > HOWEVER.
> > >
> > > I'd like other architectures to pipe up too, because I think right now
> > > x86 is the only one that implements that "arch_try_cmpxchg()" family
> > > of operations natively, and I think the generic fallback for when it
> > > is missing might be kind of nasty.
> > >
> > > Maybe it ends up generating ok code, but it's also possible that it
> > > just didn't matter when it was only used in one place in the
> > > scheduler.
> > 
> > This patch seems to generate slightly *better* code on powerpc.
> > 
> > I see one register-to-register move that gets shifted slightly later, so
> > that it's skipped on the path that returns directly via the SUCCESS
> > case.
> 
> FWIW, I see the same on arm64; a register-to-register move gets moved out of
> the success path. That changes the register allocation, and resulting in one
> fewer move, but otherwise the code generation is the same.

Just for the record: s390 code generation changes in the same way as on
powerpc, so this looks good.
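
To make the difference between the two loop styles concrete, here is a small userspace sketch using C11 atomics. It is only an approximation of the kernel primitives: toy_cmpxchg64(), toy_inc_cmpxchg() and toy_inc_try_cmpxchg() are hypothetical helpers, and atomic_compare_exchange_strong() is used because it has the same "boolean result, expected value refreshed on failure" shape as the try_cmpxchg family.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Classic style: cmpxchg returns the observed value; the caller compares. */
static uint64_t toy_cmpxchg64(_Atomic uint64_t *p, uint64_t old,
			      uint64_t desired)
{
	atomic_compare_exchange_strong(p, &old, desired);
	return old; /* value that was found in memory */
}

static void toy_inc_cmpxchg(_Atomic uint64_t *p)
{
	uint64_t old = atomic_load(p);

	for (;;) {
		uint64_t prev = toy_cmpxchg64(p, old, old + 1);

		if (prev == old)	/* extra compare after the operation */
			break;
		old = prev;		/* extra move before retrying */
	}
}

/* try_cmpxchg style: success is a boolean, 'old' is refreshed on failure. */
static void toy_inc_try_cmpxchg(_Atomic uint64_t *p)
{
	uint64_t old = atomic_load(p);

	while (!atomic_compare_exchange_strong(p, &old, old + 1))
		; /* 'old' now holds the current value; just retry */
}
```

Whether the second form produces better machine code depends on the architecture and compiler, which is the question the replies above answer for powerpc, arm64 and s390.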

Re: [PATCH 22/30] panic: Introduce the panic post-reboot notifier list

2022-05-11 Thread Heiko Carstens

On Mon, May 09, 2022 at 11:16:10AM -0300, Guilherme G. Piccoli wrote:
> On 27/04/2022 19:49, Guilherme G. Piccoli wrote:
> > Currently we have 3 notifier lists in the panic path, which will
> > be wired in a way to allow the notifier callbacks to run in
> > different moments at panic time, in a subsequent patch.
> > 
> > But there is also an odd set of architecture calls hardcoded in
> > the end of panic path, after the restart machinery. They're
> > responsible for late time tunings / events, like enabling a stop
> > button (Sparc) or effectively stopping the machine (s390).
> > 
> > This patch introduces yet another notifier list to offer the
> > architectures a way to add callbacks in such late moment on
> > panic path without the need of ifdefs / hardcoded approaches.
> > 
> > Cc: Alexander Gordeev 
> > Cc: Christian Borntraeger 
> > Cc: "David S. Miller" 
> > Cc: Heiko Carstens 
> > Cc: Sven Schnelle 
> > Cc: Vasily Gorbik 
> > Signed-off-by: Guilherme G. Piccoli 
> 
> Hey S390/SPARC folks, sorry for the ping!
> 
> Any reviews on this V1 would be greatly appreciated, I'm working on V2
> and seeking feedback in the non-reviewed patches.

Sorry, missed that this is quite s390 specific. So, yes, this looks
good to me and nice to see that one of the remaining CONFIG_S390 in
common code will be removed!

For the s390 bits:
Acked-by: Heiko Carstens

Re: [PATCH 13/30] s390/consoles: Improve panic notifiers reliability

2022-04-29 Thread Heiko Carstens

On Wed, Apr 27, 2022 at 07:49:07PM -0300, Guilherme G. Piccoli wrote:
> Currently many console drivers for s390 rely on panic/reboot notifiers
> to invoke callbacks on these events. The panic() function disables local
> IRQs, secondary CPUs and preemption, so callbacks invoked on panic are
> effectively running in atomic context.
> 
> Happens that most of these console callbacks from s390 doesn't take the
> proper care with regards to atomic context, like taking spinlocks that
> might be taken in other function/CPU and hence will cause a lockup
> situation.
> 
> The goal for this patch is to improve the notifiers reliability, acting
> on 4 console drivers, as detailed below:
> 
> (1) con3215: changed a regular spinlock to the trylock alternative.
> 
> (2) con3270: also changed a regular spinlock to its trylock counterpart,
> but here we also have another problem: raw3270_activate_view() takes a
> different spinlock. So, we worked a helper to validate if this other lock
> is safe to acquire, and if so, raw3270_activate_view() should be safe.
> 
> Notice though that there is a functional change here: it's now possible
> to continue the notifier code [reaching con3270_wait_write() and
> con3270_rebuild_update()] without executing raw3270_activate_view().
> 
> (3) sclp: a global lock is used heavily in the functions called from
> the notifier, so we added a check here - if the lock is taken already,
> we just bail-out, preventing the lockup.
> 
> (4) sclp_vt220: same as (3), a lock validation was added to prevent the
> potential lockup problem.
> 
> Besides (1)-(4), we also removed useless void functions, adding the
> code called from the notifier inside its own body, and changed the
> priority of such notifiers to execute late, since they are "heavyweight"
> for the panic environment, so we aim to reduce risks here.
> Changed return values to NOTIFY_DONE as well, the standard one.
> 
> Cc: Alexander Gordeev 
> Cc: Christian Borntraeger 
> Cc: Heiko Carstens 
> Cc: Sven Schnelle 
> Cc: Vasily Gorbik 
> Signed-off-by: Guilherme G. Piccoli 
> ---
> 
> As a design choice, the option used here to verify a given spinlock is taken
> was the function "spin_is_locked()" - but we noticed that it is not often used.
> An alternative would to take the lock with a spin_trylock() and if it succeeds,
> just release the spinlock and continue the code. But that seemed weird...
> 
> Also, we'd like to ask a good validation of case (2) potential functionality
> change from the s390 console experts - far from expert here, and in our naive
> code observation, that seems fine, but that analysis might be missing some
> corner case.
> 
> Thanks in advance!
> 
>  drivers/s390/char/con3215.c| 36 +++--
>  drivers/s390/char/con3270.c| 34 +++
>  drivers/s390/char/raw3270.c| 18 +++
>  drivers/s390/char/raw3270.h|  1 +
>  drivers/s390/char/sclp_con.c   | 28 +--
>  drivers/s390/char/sclp_vt220.c | 42 +++---
>  6 files changed, 96 insertions(+), 63 deletions(-)

Code looks good, and everything still seems to work. I applied this
internally for the time being, and if it passes testing, I'll schedule
it for the next merge window.

Thanks!
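
The trylock approach described in the quoted patch can be sketched in userspace C as follows. This is a simplified model and not the con3215 or sclp code: the lock is a C11 atomic_flag, and toy_console_lock, toy_flush_count, toy_panic_notifier() and TOY_NOTIFY_DONE are hypothetical names used only for this illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_NOTIFY_DONE 0 /* hypothetical stand-in for NOTIFY_DONE */

static atomic_flag toy_console_lock = ATOMIC_FLAG_INIT;
static int toy_flush_count;

static bool toy_spin_trylock(atomic_flag *lock)
{
	/* test_and_set returns the previous state: false means we got it */
	return !atomic_flag_test_and_set_explicit(lock, memory_order_acquire);
}

static void toy_spin_unlock(atomic_flag *lock)
{
	atomic_flag_clear_explicit(lock, memory_order_release);
}

/* Panic-time callback: never spin, because the lock owner may be stopped. */
static int toy_panic_notifier(void)
{
	if (!toy_spin_trylock(&toy_console_lock))
		return TOY_NOTIFY_DONE;	/* lock busy: skip the flush */

	toy_flush_count++;		/* stand-in for flushing console output */
	toy_spin_unlock(&toy_console_lock);
	return TOY_NOTIFY_DONE;
}
```

The trade-off is that output may be lost when the lock happens to be held at panic time, in exchange for not locking up in atomic context.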

Re: [PATCH v2 09/13] powerpc/ftrace: Implement CONFIG_DYNAMIC_FTRACE_WITH_ARGS

2022-02-16 Thread Heiko Carstens

On Tue, Feb 15, 2022 at 09:55:52PM +0530, Naveen N. Rao wrote:
> > > > > > > I think this is wrong. We need to differentiate
> > > > > > > between ftrace_caller() and ftrace_regs_caller()
> > > > > > > here, and only return pt_regs if coming in through
> > > > > > > ftrace_regs_caller() (i.e., FL_SAVE_REGS is set).
> > > > > > 
> > > > > > Not sure I follow you.
> > > > > > 
> > > > > > This is based on 5740a7c71ab6 ("s390/ftrace: add
> > > > > > HAVE_DYNAMIC_FTRACE_WITH_ARGS support")
> > > > > > 
> > > > > > It's all the point of HAVE_DYNAMIC_FTRACE_WITH_ARGS,
> > > > > > have the regs also with ftrace_caller().
> > > > > > 
> > > > > > Sure you only have the params, but that's the same on
> > > > > > s390, so what did I miss ?
> 
> Steven has explained the rationale for this in his other response:
> https://lore.kernel.org/all/20220215093849.556d5...@gandalf.local.home/

Thanks for this pointer, this clarifies a couple of things!

> > > > It looks like s390 is special since it apparently saves all
> > > > registers even for ftrace_caller:
> > > > https://lore.kernel.org/all/YbipdU5X4HNDWIni@osiris/
> > > 
> > > It is not what I understand from their code, see 
> > > https://elixir.bootlin.com/linux/v5.17-rc3/source/arch/s390/kernel/mcount.S#L37
> > > 
> > > 
> > > They have a common macro called with argument 'allregs' which is set
> > > to 0 for ftrace_caller() and 1 for ftrace_regs_caller().
> > > When allregs == 1, the macro seems to save more.
> > > 
> > > But ok, I can do like x86, but I need a trick to know whether
> > > FL_SAVE_REGS is set or not, like they do with fregs->regs.cs
> > > Any idea what the condition can be for powerpc ?
> 
> We'll need to explicitly zero-out something in pt_regs in ftrace_caller().
> We can probably use regs->msr since we don't expect it to be zero when saved
> from ftrace_regs_caller().
> > 
> > Finally, it looks like this change is done  via commit 894979689d3a
> > ("s390/ftrace: provide separate ftrace_caller/ftrace_regs_caller
> > implementations") four hours the same day after the implementation of
> > arch_ftrace_get_regs()
> > 
> > They may have forgotten to change arch_ftrace_get_regs() which was added
> > in commit 5740a7c71ab6 ("s390/ftrace: add HAVE_DYNAMIC_FTRACE_WITH_ARGS
> > support") with the assumption that ftrace_caller and ftrace_regs_caller
> > were identical.
> 
> Indeed, good find!

Thank you for bringing this up!

So, in both variants s390 provides nearly identical data. The only
difference is that for FL_SAVE_REGS the program status word mask is
missing; therefore it is not possible to figure out the condition code
or if interrupts were enabled/disabled.

Vasily, Sven, I think we have two options here:

- don't provide sane psw mask contents at all and say (again) that
  ptregs contents are identical

- provide (finally) a full psw mask contents using epsw, and indicate
  validity with a flags bit in pt_regs

I would vote for the second option, even though epsw is slow. But this
is about the third or fourth time this came up in different
contexts. So I'd guess we should go for the slow but complete
solution. Opinions?
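
A minimal sketch of what the second option could look like, under the assumption that a flags bit marks pt_regs as complete. All names here (TOY_PIF_FULL_REGS, struct toy_pt_regs, struct toy_ftrace_regs, toy_ftrace_get_regs()) are hypothetical and only model the idea; they are not the s390 definitions.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PIF_FULL_REGS 0x1u /* hypothetical "pt_regs are complete" bit */

struct toy_pt_regs {
	uint64_t psw_mask;
	uint64_t gprs[16];
	unsigned int flags;
};

struct toy_ftrace_regs {
	struct toy_pt_regs regs;
};

/* Hand out pt_regs only when the entry code marked them as complete. */
static struct toy_pt_regs *toy_ftrace_get_regs(struct toy_ftrace_regs *fregs)
{
	if (fregs->regs.flags & TOY_PIF_FULL_REGS)
		return &fregs->regs;
	return NULL;
}
```

In this model the entry path that saves everything, including the psw mask, would set the bit, and the lighter entry path would leave it clear, so callers get NULL instead of partially valid register contents.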

Re: [PATCH V3 5/8] sched: s390: Remove unused TASK_SIZE_OF

2021-12-28 Thread Heiko Carstens

On Tue, Dec 28, 2021 at 02:47:26PM +0800, guo...@kernel.org wrote:
> From: Guo Ren 
> 
> This macro isn't used in Linux sched, now. Delete in
> include/linux/sched.h and arch's include/asm.
> 
> Signed-off-by: Guo Ren 
> Signed-off-by: Guo Ren 
> Reviewed-by: Arnd Bergmann 
> ---
>  arch/s390/include/asm/processor.h | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)

Applied, thanks!

Re: [PATCH V2 5/8] sched: s390: Remove unused TASK_SIZE_OF

2021-12-25 Thread Heiko Carstens

On Sat, Dec 25, 2021 at 12:54:27PM +0800, guo...@kernel.org wrote:
> From: Guo Ren 
> 
> This macro isn't used in Linux sched, now. Delete in
> include/linux/sched.h and arch's include/asm.
> 
> Signed-off-by: Guo Ren 
> Reviewed-by: Arnd Bergmann 
> ---
>  arch/s390/include/asm/processor.h | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)

I could pick this up for s390, however the sender (From: field) of this
patch series does not match the From: and Signed-off-by: fields above.

In general I don't pick up such patches, since this doesn't match the
"Developer's Certificate of Origin" requirements.
-> Documentation/process/submitting-patches.rst

Re: [PATCH v1 0/5] Implement livepatch on PPC32

2021-12-14 Thread Heiko Carstens

On Mon, Dec 13, 2021 at 05:50:52PM +, Christophe Leroy wrote:
> Le 13/12/2021 à 18:33, Steven Rostedt a écrit :
> > On Mon, 13 Dec 2021 17:30:48 +
> > Christophe Leroy  wrote:
> > 
> >> Thanks, I will try that.
> >>
> >> I can't find ftrace_graph_func() in s390. Does it mean that s390 doesn't
> >> have a working function tracer anymore ?
> >>
> >> I see your commit 0c0593b45c9b4 ("x86/ftrace: Make function graph use
> >> ftrace directly") is dated 8 Oct 2021 while 5740a7c71ab6 ("s390/ftrace:
> >> add HAVE_DYNAMIC_FTRACE_WITH_ARGS support") is 4 Oct 2021.
> > 
> > Hmm, maybe not. I can't test it.
> > 
> > This needs to be fixed if that's the case.
> > 
> > Thanks for bringing it up!

It still works: we run the full ftrace/kprobes selftests from the
kernel every day on multiple machines with several kernels (Linus'
tree among others, but also linux-next). That said, I wanted to change
s390's code to follow what x86 is currently doing anyway.

One thing to note: commit 5740a7c71ab6 ("s390/ftrace: add
HAVE_DYNAMIC_FTRACE_WITH_ARGS support") looks only that simple because
ftrace_caller _and_ ftrace_regs_caller used to save all register
contents into the pt_regs structure, which never was a requirement,
but implicitly fulfills the HAVE_DYNAMIC_FTRACE_WITH_ARGS
requirements.
Not sure if powerpc passes enough register contents via pt_regs for
HAVE_DYNAMIC_FTRACE_WITH_ARGS though. Might be something to check?

Re: [PATCH v2 0/6] KEXEC_SIG with appended signature

2021-11-30 Thread Heiko Carstens

On Thu, Nov 25, 2021 at 07:02:38PM +0100, Michal Suchanek wrote:
> Hello,
> 
> This is resend of the KEXEC_SIG patchset.
> 
> The first patch is new because it's a cleanup that does not require any
> change to the module verification code.
> 
> The second patch is the only one that is intended to change any
> functionality.
> 
> The rest only deduplicates code but I did not receive any review on that
> part so I don't know if it's desirable as implemented.
> 
> The first two patches can be applied separately without the rest.
> 
> Thanks
> 
> Michal
> 
> Michal Suchanek (6):
>   s390/kexec_file: Don't opencode appended signature check.
>   powerpc/kexec_file: Add KEXEC_SIG support.
>   kexec_file: Don't opencode appended signature verification.
>   module: strip the signature marker in the verification function.
>   module: Use key_being_used_for for log messages in
> verify_appended_signature
>   module: Move duplicate mod_check_sig users code to mod_parse_sig
> 
>  arch/powerpc/Kconfig | 11 +
>  arch/powerpc/kexec/elf_64.c  | 14 ++
>  arch/s390/kernel/machine_kexec_file.c| 42 ++
>  crypto/asymmetric_keys/asymmetric_type.c |  1 +
>  include/linux/module_signature.h |  1 +
>  include/linux/verification.h |  4 ++
>  kernel/module-internal.h |  2 -
>  kernel/module.c  | 12 +++--
>  kernel/module_signature.c| 56 +++-
>  kernel/module_signing.c  | 33 +++---
>  security/integrity/ima/ima_modsig.c  | 22 ++----
>  11 files changed, 113 insertions(+), 85 deletions(-)

For all patches which touch s390:
Acked-by: Heiko Carstens

Re: [PATCH v2 4/4] s390: Use generic version of arch_is_kernel_initmem_freed()

2021-09-28 Thread Heiko Carstens

On Tue, Sep 28, 2021 at 09:15:37AM +0200, Christophe Leroy wrote:
> Generic version of arch_is_kernel_initmem_freed() now does the same
> as s390 version.
> 
> Remove the s390 version.
> 
> Cc: Gerald Schaefer 
> Signed-off-by: Christophe Leroy 
> ---
> v2: No change
> ---
>  arch/s390/include/asm/sections.h | 12 
>  arch/s390/mm/init.c  |  3 ---
>  2 files changed, 15 deletions(-)

Looks good. Thanks for cleaning this up!

Acked-by: Heiko Carstens

Re: [PATCH] ftrace: Cleanup ftrace_dyn_arch_init()

2021-09-03 Thread Heiko Carstens

On Fri, Sep 03, 2021 at 03:18:17PM +0800, Weizhao Ouyang wrote:
> Most ARCHs use empty ftrace_dyn_arch_init(), introduce a weak common
> ftrace_dyn_arch_init() to cleanup them.
> 
> Signed-off-by: Weizhao Ouyang 
> ---
>  arch/arm/kernel/ftrace.c  | 5 -
>  arch/arm64/kernel/ftrace.c| 5 -
>  arch/csky/kernel/ftrace.c | 5 -
>  arch/ia64/kernel/ftrace.c | 6 --
>  arch/microblaze/kernel/ftrace.c   | 5 -
>  arch/mips/include/asm/ftrace.h| 2 ++
>  arch/nds32/kernel/ftrace.c| 5 -
>  arch/parisc/kernel/ftrace.c   | 5 -
>  arch/powerpc/include/asm/ftrace.h | 4 
>  arch/riscv/kernel/ftrace.c| 5 -
>  arch/s390/kernel/ftrace.c | 5 -
>  arch/sh/kernel/ftrace.c   | 5 -
>  arch/sparc/kernel/ftrace.c| 5 -
>  arch/x86/kernel/ftrace.c  | 5 -
>  include/linux/ftrace.h| 1 -
>  kernel/trace/ftrace.c | 5 +
>  16 files changed, 11 insertions(+), 62 deletions(-)

For s390:
Acked-by: Heiko Carstens
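
For readers unfamiliar with the mechanism, the "weak common" default mentioned in the quoted patch description can be sketched like this in plain C. The function name toy_dyn_arch_init() is hypothetical, and the attribute syntax assumes GCC or Clang.

```c
/*
 * Generic default: used only if no other object file provides a
 * non-weak definition with the same name.
 */
__attribute__((weak)) int toy_dyn_arch_init(void)
{
	return 0;
}
```

An architecture that needs real setup work would provide a normal (strong) definition of the same function in its own file, and the linker would pick that one; all others can simply drop their empty copies.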

Re: [PATCH 2/3] trace: refactor TRACE_IRQFLAGS_SUPPORT in Kconfig

2021-07-31 Thread Heiko Carstens

On Sat, Jul 31, 2021 at 02:22:32PM +0900, Masahiro Yamada wrote:
> Make architectures select TRACE_IRQFLAGS_SUPPORT instead of
> having many defines.
> 
> Signed-off-by: Masahiro Yamada 
> ---
...
>  arch/s390/Kconfig | 1 +
>  arch/s390/Kconfig.debug   | 3 ---

For s390:
Acked-by: Heiko Carstens

Re: [PATCH v5 0/6] compat: remove compat_alloc_user_space

2021-07-30 Thread Heiko Carstens

On Tue, Jul 27, 2021 at 04:48:53PM +0200, Arnd Bergmann wrote:
> From: Arnd Bergmann 
> 
> Going through compat_alloc_user_space() to convert indirect system call
> arguments tends to add complexity compared to handling the native and
> compat logic in the same code.
> 
> Out of the other remaining callers, the linux-media series went into
> v5.14, and the network ioctl handling is now fixed in net-next, so
> these are the last remaining users, and I now include the final
> patch to remove the definitions as well.
> 
> Since these patches are now all that remains, it would be nice to
> merge it all through Andrew's Linux-mm tree, which is already based
> on top of linux-next.
...
> 
> Arnd Bergmann (6):
>   kexec: move locking into do_kexec_load
>   kexec: avoid compat_alloc_user_space
>   mm: simplify compat_sys_move_pages
>   mm: simplify compat numa syscalls
>   compat: remove some compat entry points
>   arch: remove compat_alloc_user_space

Our CI reports this with linux-next and running strace selftest in
compat mode:

Unable to handle kernel pointer dereference in virtual kernel address space
Failing address: 038003e7c000 TEID: 038003e7c803
Fault in home space mode while using kernel ASCE.
AS:0001fb388007 R3:8021c007 S:82142000 P:0400 
Oops: 0011 ilc:3 [#1] SMP 
CPU: 0 PID: 1017495 Comm: get_mempolicy Tainted: G   OE 5.14.0-20210730.rc3.git0.4ccc9e2db7ac.300.fc34.s390x+next #1
Hardware name: IBM 2827 H66 708 (LPAR)
Krnl PSW : 0704e0018000 0001f9f11000 (compat_put_bitmap+0x48/0xd0)
   R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
Krnl GPRS: 0081  7d9df1c0 038003e7c008
   0004 7d9df1c4 038003e7be40 0001
   8000  0390 01c8
   00020d6ea000 02aa00401a48 0001fa0a85fa 038003e7bd50
Krnl Code: 0001f9f10ff4: a7bb0001      aghi   %r11,1
           0001f9f10ff8: 41303008      la     %r3,8(%r3)
          #0001f9f10ffc: 41502004      la     %r5,4(%r2)
          >0001f9f11000: e3103ff8ff04  lg     %r1,-8(%r3)
           0001f9f11006: 5010f0a4      st     %r1,164(%r15)
           0001f9f1100a: a50e0081      llilh  %r0,129
           0001f9f1100e: c8402000f0a4  mvcos  0(%r2),164(%r15),%r4
           0001f9f11014: 1799          xr     %r9,%r9
Call Trace:
 [<0001f9f11000>] compat_put_bitmap+0x48/0xd0 
 [<0001fa0a85fa>] kernel_get_mempolicy+0x102/0x178 
 [<0001fa0a86b0>] __s390_sys_get_mempolicy+0x40/0x50 
 [<0001fa92be30>] __do_syscall+0x1c0/0x1e8 
 [<0001fa939148>] system_call+0x78/0xa0 
Last Breaking-Event-Address:
 [<038003e7bc00>] 0x38003e7bc00
Kernel panic - not syncing: Fatal exception: panic_on_oops

Note: I did not try to bisect this, since it looks to me like this
patch series causes the problem. Also, please don't get confused with
the kernel version name. The date encoded is the build date, not the
linux-next version.
linux-next commit 4ccc9e2db7ac ("Add linux-next specific files for
20210729") was used to build the kernel (s390 defconfig).

Re: [PATCH v5 4/6] mm: simplify compat numa syscalls

2021-07-27 Thread Heiko Carstens

On Tue, Jul 27, 2021 at 08:49:40PM +0200, Arnd Bergmann wrote:
> On Tue, Jul 27, 2021 at 8:38 PM Heiko Carstens  wrote:
> >
> > -268  common  mbind          sys_mbind          compat_sys_mbind
> > -269  common  get_mempolicy  sys_get_mempolicy  compat_sys_get_mempolicy
> > -270  common  set_mempolicy  sys_set_mempolicy  compat_sys_set_mempolicy
> > +268  common  mbind          sys_mbind          sys_mbind
> > +269  common  get_mempolicy  sys_get_mempolicy  sys_get_mempolicy
> > +270  common  set_mempolicy  sys_set_mempolicy  sys_set_mempolicy
> >
> > would remove compat_ptr() conversion from nmask above if I'm not mistaken.
> 
> Maybe I'm misremembering how compat syscalls work on s390. Doesn't
> SYSCALL_DEFINEx(sys_mbind) still create two entry points __s390x_sys_mbind()
> and __s390_sys_mbind() with different argument conversion (__SC_CAST vs
> __SC_COMPAT_CAST)? I thought that was the whole point of the macros.

You are remembering correctly, probably because you implemented it ;)
I totally forgot - sorry for the noise!

Re: [PATCH v5 4/6] mm: simplify compat numa syscalls

2021-07-27 Thread Heiko Carstens

On Tue, Jul 27, 2021 at 07:40:05PM +0200, Arnd Bergmann wrote:
> On Tue, Jul 27, 2021 at 7:27 PM Heiko Carstens  wrote:
> > > +static int get_bitmap(unsigned long *mask, const unsigned long __user *nmask,
> > > +   unsigned long maxnode)
> > > +{
> > > + unsigned long nlongs = BITS_TO_LONGS(maxnode);
> > > + int ret;
> > > +
> > > + if (in_compat_syscall())
> > > + ret = compat_get_bitmap(mask,
> > > + (const compat_ulong_t __user *)nmask,
> > > + maxnode);
> >
> > compat_ptr() conversion for e.g. nmask is missing with the next patch
> > which removes the compat system calls.
> > Is that intended or am I missing something?
> 
> I don't think it's needed here, since the pointer comes from the system
> call argument, which has the compat_ptr() conversion applied in
> arch/s390/include/asm/syscall_wrapper.h, not from a compat_uptr_t
> that gets passed indirectly. The compat_get_bitmap() conversion
> is only needed for byte order adjustment, not for converting pointers.
> 
> It's also possible that I'm the one who's missing something.

What I was trying to say: this patch on its own is ok. However with
the next patch you remove the compat system calls and map the regular
system calls instead.

That is:

-COMPAT_SYSCALL_DEFINE6(mbind, compat_ulong_t, start, compat_ulong_t, len,
-  compat_ulong_t, mode, compat_ulong_t __user *, nmask,
-  compat_ulong_t, maxnode, compat_ulong_t, flags)
-{
-   return kernel_mbind(start, len, mode, (unsigned long __user *)nmask,
-   maxnode, flags);
-}

and this:

-268  common  mbind          sys_mbind          compat_sys_mbind
-269  common  get_mempolicy  sys_get_mempolicy  compat_sys_get_mempolicy
-270  common  set_mempolicy  sys_set_mempolicy  compat_sys_set_mempolicy
+268  common  mbind          sys_mbind          sys_mbind
+269  common  get_mempolicy  sys_get_mempolicy  sys_get_mempolicy
+270  common  set_mempolicy  sys_set_mempolicy  sys_set_mempolicy

would remove compat_ptr() conversion from nmask above if I'm not mistaken.
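
A rough sketch of the byte order point from the quoted reply: a compat task
hands over an array of 32-bit words, while the kernel bitmap is an array of
64-bit words, so on a big-endian machine such as s390 the two layouts cannot
simply be copied byte for byte even when the pointer itself is valid. The name
example_compat_get_bitmap and the plain-array interface are assumptions for
illustration only; the real helper reads from user space and is not reproduced
here.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Assemble 64-bit bitmap words from 32-bit compat words.
 * Compat word 0 supplies bits 0..31, compat word 1 supplies bits 32..63,
 * and so on. The result is defined by value, so it does not depend on the
 * byte order of the machine running it.
 */
static void example_compat_get_bitmap(uint64_t *mask, const uint32_t *umask,
				      size_t nbits)
{
	size_t nwords32 = (nbits + 31) / 32;
	size_t i;

	for (i = 0; i < nwords32; i++) {
		if (i % 2 == 0)
			mask[i / 2] = umask[i];                    /* low half */
		else
			mask[i / 2] |= (uint64_t)umask[i] << 32;   /* high half */
	}
}
```

With three compat words { 0x1, 0x2, 0x80000000 } and nbits = 96, the sketch
yields the two 64-bit words 0x0000000200000001 and 0x0000000080000000.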

Re: [PATCH v5 4/6] mm: simplify compat numa syscalls

2021-07-27 Thread Heiko Carstens

On Tue, Jul 27, 2021 at 04:48:57PM +0200, Arnd Bergmann wrote:
> ---
>  include/linux/compat.h |  17 ++--
>  mm/mempolicy.c | 175 +
>  2 files changed, 63 insertions(+), 129 deletions(-)
...
> +static int get_bitmap(unsigned long *mask, const unsigned long __user *nmask,
> +   unsigned long maxnode)
> +{
> + unsigned long nlongs = BITS_TO_LONGS(maxnode);
> + int ret;
> +
> + if (in_compat_syscall())
> + ret = compat_get_bitmap(mask,
> + (const compat_ulong_t __user *)nmask,
> + maxnode);

compat_ptr() conversion for e.g. nmask is missing with the next patch
which removes the compat system calls.
Is that intended or am I missing something?

Re: [PATCH v1 04/12] mm/memory_hotplug: remove nid parameter from arch_remove_memory()

2021-06-08 Thread Heiko Carstens

On Mon, Jun 07, 2021 at 09:54:22PM +0200, David Hildenbrand wrote:
> The parameter is unused, let's remove it.
> 
> Signed-off-by: David Hildenbrand 
> ---
>  arch/arm64/mm/mmu.c| 3 +--
>  arch/ia64/mm/init.c| 3 +--
>  arch/powerpc/mm/mem.c  | 3 +--
>  arch/s390/mm/init.c| 3 +--
>  arch/sh/mm/init.c  | 3 +--
>  arch/x86/mm/init_32.c  | 3 +--
>  arch/x86/mm/init_64.c  | 3 +--
>  include/linux/memory_hotplug.h | 3 +--
>  mm/memory_hotplug.c| 4 ++--
>  mm/memremap.c  | 5 +
>  10 files changed, 11 insertions(+), 22 deletions(-)

For s390:
Acked-by: Heiko Carstens

Re: consolidate the flock uapi definitions

2021-04-15 Thread Heiko Carstens

On Mon, Apr 12, 2021 at 10:55:40AM +0200, Christoph Hellwig wrote:
> Hi all,
> 
> currently we deal with the slight differences in the various architecture
> variants of the flock and flock64 structures in a very crufty way.  This
> series switches to just use small arch hooks and define the rest in
> asm-generic and linux/compat.h instead.
> 
> Diffstat:
>  arch/arm64/include/asm/compat.h|   20 
>  arch/mips/include/asm/compat.h |   23 ++-
>  arch/mips/include/uapi/asm/fcntl.h |   28 +++-
>  arch/parisc/include/asm/compat.h   |   16 
>  arch/powerpc/include/asm/compat.h  |   20 
>  arch/s390/include/asm/compat.h |   20 
>  arch/sparc/include/asm/compat.h|   22 +-
>  arch/x86/include/asm/compat.h  |   24 +++-
>  include/linux/compat.h |   31 +++
>  include/uapi/asm-generic/fcntl.h   |   21 +++--
>  tools/include/uapi/asm-generic/fcntl.h |   21 +++--
>  11 files changed, 54 insertions(+), 192 deletions(-)

for the s390 bits:
Acked-by: Heiko Carstens

Re: [PATCH 0/6] mm: some config cleanups

2021-03-09 Thread Heiko Carstens

On Tue, Mar 09, 2021 at 02:03:04PM +0530, Anshuman Khandual wrote:
> This series contains config cleanup patches which reduces code duplication
> across platforms and also improves maintainability. There is no functional
> change intended with this series. This has been boot tested on arm64 but
> only build tested on some other platforms.
> 
> This applies on 5.12-rc2
> 
> Cc: x...@kernel.org
> Cc: linux-i...@vger.kernel.org
> Cc: linux-s...@vger.kernel.org
> Cc: linux-snps-...@lists.infradead.org
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-m...@vger.kernel.org
> Cc: linux-par...@vger.kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux-ri...@lists.infradead.org
> Cc: linux...@vger.kernel.org
> Cc: linux-fsde...@vger.kernel.org
> Cc: linux...@kvack.org
> Cc: linux-ker...@vger.kernel.org
> 
> Anshuman Khandual (6):
>   mm: Generalize ARCH_HAS_CACHE_LINE_SIZE
>   mm: Generalize SYS_SUPPORTS_HUGETLBFS (rename as ARCH_SUPPORTS_HUGETLBFS)
>   mm: Generalize ARCH_ENABLE_MEMORY_[HOTPLUG|HOTREMOVE]
>   mm: Drop redundant ARCH_ENABLE_[HUGEPAGE|THP]_MIGRATION
>   mm: Drop redundant ARCH_ENABLE_SPLIT_PMD_PTLOCK
>   mm: Drop redundant HAVE_ARCH_TRANSPARENT_HUGEPAGE
> 
>  arch/arc/Kconfig   |  9 ++--
>  arch/arm/Kconfig   | 10 ++---
>  arch/arm64/Kconfig | 30 ++
>  arch/ia64/Kconfig  |  8 ++-
>  arch/mips/Kconfig  |  6 +-
>  arch/parisc/Kconfig|  5 +
>  arch/powerpc/Kconfig   | 11 ++
>  arch/powerpc/platforms/Kconfig.cputype | 16 +-
>  arch/riscv/Kconfig |  5 +
>  arch/s390/Kconfig  | 12 +++
>  arch/sh/Kconfig|  7 +++---
>  arch/sh/mm/Kconfig |  8 ---
>  arch/x86/Kconfig   | 29 ++---
>  fs/Kconfig |  5 -
>  mm/Kconfig     |  9 
>  15 files changed, 48 insertions(+), 122 deletions(-)

for the s390 bits:
Acked-by: Heiko Carstens

Re: [PATCH AUTOSEL 5.9 27/39] sched/idle: Fix arch_cpu_idle() vs tracing

2020-12-03 Thread Heiko Carstens

On Thu, Dec 03, 2020 at 08:28:21AM -0500, Sasha Levin wrote:
> From: Peter Zijlstra 
> 
> [ Upstream commit 58c644ba512cfbc2e39b758dd979edd1d6d00e27 ]
> 
> We call arch_cpu_idle() with RCU disabled, but then use
> local_irq_{en,dis}able(), which invokes tracing, which relies on RCU.
> 
> Switch all arch_cpu_idle() implementations to use
> raw_local_irq_{en,dis}able() and carefully manage the
> lockdep,rcu,tracing state like we do in entry.
> 
> (XXX: we really should change arch_cpu_idle() to not return with
> interrupts enabled)
> 
> Reported-by: Sven Schnelle 
> Signed-off-by: Peter Zijlstra (Intel) 
> Reviewed-by: Mark Rutland 
> Tested-by: Mark Rutland 
> Link: https://lkml.kernel.org/r/20201120114925.594122...@infradead.org
> Signed-off-by: Sasha Levin 

This patch broke s390 irq state tracing. A patch to fix this is
scheduled to be merged upstream today (hopefully).
Therefore I think this patch should not yet go into 5.9 stable.

Re: [PATCH seccomp 5/8] s390: Enable seccomp architecture tracking

2020-11-09 Thread Heiko Carstens

On Tue, Nov 03, 2020 at 07:43:01AM -0600, YiFei Zhu wrote:
> From: YiFei Zhu 
> 
> To enable seccomp constant action bitmaps, we need to have a static
> mapping to the audit architecture and system call table size. Add these
> for s390.
> 
> Signed-off-by: YiFei Zhu 
> ---
>  arch/s390/include/asm/seccomp.h | 9 +
>  1 file changed, 9 insertions(+)
> 
> diff --git a/arch/s390/include/asm/seccomp.h b/arch/s390/include/asm/seccomp.h
> index 795bbe0d7ca6..71d46f0ba97b 100644
> --- a/arch/s390/include/asm/seccomp.h
> +++ b/arch/s390/include/asm/seccomp.h
> @@ -16,4 +16,13 @@
>  
>  #include 
>  
> +#define SECCOMP_ARCH_NATIVE  AUDIT_ARCH_S390X
> +#define SECCOMP_ARCH_NATIVE_NR   NR_syscalls
> +#define SECCOMP_ARCH_NATIVE_NAME "s390x"
> +#ifdef CONFIG_COMPAT
> +# define SECCOMP_ARCH_COMPAT AUDIT_ARCH_S390
> +# define SECCOMP_ARCH_COMPAT_NR  NR_syscalls
> +# define SECCOMP_ARCH_COMPAT_NAME "s390"
> +#endif
> +

Acked-by: Heiko Carstens

Re: [PATCH] ima: add a new CONFIG for loading arch-specific policies

2020-03-02 Thread Heiko Carstens

On Mon, Mar 02, 2020 at 09:56:58AM -0500, Mimi Zohar wrote:
> On Mon, 2020-03-02 at 15:52 +0100, Ard Biesheuvel wrote:
> > On Mon, 2 Mar 2020 at 15:48, Mimi Zohar  wrote:
> > > > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> > > > index beea77046f9b..cafa66313fe2 100644
> > > > --- a/arch/x86/Kconfig
> > > > +++ b/arch/x86/Kconfig
> > > > @@ -230,6 +230,7 @@ config X86
> > > >   select VIRT_TO_BUS
> > > > >   select X86_FEATURE_NAMES if PROC_FS
> > > >   select PROC_PID_ARCH_STATUS if PROC_FS
> > > > + select IMA_SECURE_AND_OR_TRUSTED_BOOT   if EFI
> > >
> > > Not everyone is interested in enabling IMA or requiring IMA runtime
> > > policies.  With this patch, enabling IMA_ARCH_POLICY is therefore
> > > still left up to the person building the kernel.  As a result, I'm
> > > seeing the following warning, which is kind of cool.
> > >
> > > WARNING: unmet direct dependencies detected for
> > > IMA_SECURE_AND_OR_TRUSTED_BOOT
> > >   Depends on [n]: INTEGRITY [=y] && IMA [=y] && IMA_ARCH_POLICY [=n]
> > >   Selected by [y]:
> > >   - X86 [=y] && EFI [=y]
> > >
> > > Ard, Michael, Martin, just making sure this type of warning is
> > > acceptable before upstreaming this patch.  I would appreciate your
> > > tags.
> > >
> > 
> > Ehm, no, warnings like these are not really acceptable. It means there
> > is an inconsistency in the way the Kconfig dependencies are defined.
> > 
> > Does this help:
> > 
> >   select IMA_SECURE_AND_OR_TRUSTED_BOOT   if EFI && IMA_ARCH_POLICY
> > 
> > ?
> 
> Yes, that's fine for x86.  Michael, Martin, do you want something
> similar or would you prefer actually selecting IMA_ARCH_POLICY?

For s390 something like

select IMA_SECURE_AND_OR_TRUSTED_BOOT if IMA_ARCH_POLICY

should be fine.

Thanks,
Heiko

Re: [PATCH v3 02/22] compat: provide compat_ptr() on all architectures

2020-01-07 Thread Heiko Carstens

On Thu, Jan 02, 2020 at 03:55:20PM +0100, Arnd Bergmann wrote:
> In order to avoid needless #ifdef CONFIG_COMPAT checks,
> move the compat_ptr() definition to linux/compat.h
> where it can be seen by any file regardless of the
> architecture.
> 
> Only s390 needs a special definition, this can use the
> self-#define trick we have elsewhere.
> 
> Signed-off-by: Arnd Bergmann 
> ---
>  arch/arm64/include/asm/compat.h   | 17 -
>  arch/mips/include/asm/compat.h| 18 --
>  arch/parisc/include/asm/compat.h  | 17 -
>  arch/powerpc/include/asm/compat.h | 17 -
>  arch/powerpc/oprofile/backtrace.c |  2 +-
>  arch/s390/include/asm/compat.h|  6 +-
>  arch/sparc/include/asm/compat.h   | 17 -
>  arch/x86/include/asm/compat.h | 17 -
>  include/linux/compat.h| 18 ++
>  9 files changed, 20 insertions(+), 109 deletions(-)

For s390:

Acked-by: Heiko Carstens

Re: [PATCH v2 00/29] vmlinux.lds.h: Refactor EXCEPTION_TABLE and NOTES

2019-10-16 Thread Heiko Carstens

On Thu, Oct 10, 2019 at 05:05:40PM -0700, Kees Cook wrote:
> Arch maintainers: please send Acks (if you haven't already) for your
> respective linker script changes; the intention is for this series to
> land via -tip.
> 
> v1: https://lore.kernel.org/lkml/20190926175602.33098-1-keesc...@chromium.org
> v2: clean up commit messages, rename RO_EXCEPTION_TABLE (bp)
> 
> 
> This series works to move the linker sections for NOTES and
> EXCEPTION_TABLE into the RO_DATA area, where they belong on most
> (all?) architectures. The problem being addressed was the discovery
> by Rick Edgecombe that the exception table was accidentally marked
> executable while he was developing his execute-only-memory series. When
> permissions were flipped from readable-and-executable to only-executable,
> the exception table became unreadable, causing things to explode rather
> badly. :)

Feel free to add
Acked-by: Heiko Carstens 
to every patch in this series which touches s390.

Re: [PATCH v2 06/29] s390: Move RO_DATA into "text" PT_LOAD Program Header

2019-10-16 Thread Heiko Carstens

On Thu, Oct 10, 2019 at 05:05:46PM -0700, Kees Cook wrote:
> In preparation for moving NOTES into RO_DATA, move RO_DATA back into the
> "text" PT_LOAD Program Header, as done with other architectures. The
> "data" PT_LOAD now starts with the writable data section.
> 
> Signed-off-by: Kees Cook 
> ---
>  arch/s390/kernel/vmlinux.lds.S | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/s390/kernel/vmlinux.lds.S b/arch/s390/kernel/vmlinux.lds.S
> index 7e0eb4020917..13294fef473e 100644
> --- a/arch/s390/kernel/vmlinux.lds.S
> +++ b/arch/s390/kernel/vmlinux.lds.S
> @@ -52,7 +52,7 @@ SECTIONS
> 
>   NOTES :text :note
> 
> - .dummy : { *(.dummy) } :data
> + .dummy : { *(.dummy) } :text
> 
>   RO_DATA_SECTION(PAGE_SIZE)
> 
> @@ -64,7 +64,7 @@ SECTIONS
>   .data..ro_after_init : {
>*(.data..ro_after_init)
>   JUMP_TABLE_DATA
> - }
> + } :data
>   EXCEPTION_TABLE(16)
>   . = ALIGN(PAGE_SIZE);
>   __end_ro_after_init = .;

Acked-by: Heiko Carstens

Re: [PATCH 2/2] arch: add pidfd and io_uring syscalls everywhere

2019-03-30 Thread Heiko Carstens

On Mon, Mar 25, 2019 at 03:47:37PM +0100, Arnd Bergmann wrote:
> Add the io_uring and pidfd_send_signal system calls to all architectures.
> 
> These system calls are designed to handle both native and compat tasks,
> so all entries are the same across architectures, only arm-compat and
> the generic table still use an old format.
> 
> Signed-off-by: Arnd Bergmann 

> diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl
> index 02579f95f391..3eb56e639b96 100644
> --- a/arch/s390/kernel/syscalls/syscall.tbl
> +++ b/arch/s390/kernel/syscalls/syscall.tbl
> @@ -426,3 +426,7 @@
>  421  32  rt_sigtimedwait_time64        -  compat_sys_rt_sigtimedwait_time64
>  422  32  futex_time64                  -  sys_futex
>  423  32  sched_rr_get_interval_time64  -  sys_sched_rr_get_interval
> +424  common  pidfd_send_signal   sys_pidfd_send_signal
> +425  common  io_uring_setup  sys_io_uring_setup
> +426  common  io_uring_enter  sys_io_uring_enter
> +427  common  io_uring_register   sys_io_uring_register

I was just about to write that io_uring_enter is missing compat
handling, but your first patch actually fixes that. Would have been
good to be cc'ed on both patches :)

For s390:
Acked-by: Heiko Carstens

Re: CONFIG_ARCH_SUPPORTS_INT128: Why not mips, s390, powerpc, and alpha?

2019-03-30 Thread Heiko Carstens

On Fri, Mar 29, 2019 at 01:07:07PM +, George Spelvin wrote:
> (Cross-posted in case there are generic issues; please trim if
> discussion wanders into single-architecture details.)
> 
> I was working on some scaling code that can benefit from 64x64->128-bit
> multiplies.  GCC supports an __int128 type on processors with hardware
> support (including z/Arch and MIPS64), but the support was broken on
> early compilers, so it's gated behind CONFIG_ARCH_SUPPORTS_INT128.
> 
> Currently, of the ten 64-bit architectures Linux supports, that's
> only enabled on x86, ARM, and RISC-V.
> 
> SPARC and HP-PA don't have support.
> 
> But that leaves Alpha, Mips, PowerPC, and S/390x.
> 
> Current mips64, powerpc64, and s390x gcc seems to generate sensible code
> for mul_u64_u64_shr() in  if I cross-compile them.
> 
> I don't have easy access to an Alpha cross-compiler to test, but
> as it has UMULH, I suspect it would work, too.
> 
> Is there a reason it hasn't been enabled on these platforms?

It hasn't been enabled on s390 simply because at least I wasn't aware
of this config option. Feel free to send a patch, otherwise I will
enable this. Whatever you prefer.
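
A minimal sketch of the kind of helper that benefits from __int128, assuming a
compiler that provides unsigned __int128 (GCC and Clang do on 64-bit targets).
The name example_mul_u64_u64_shr is hypothetical and only mirrors the
mul_u64_u64_shr() helper mentioned in the quoted mail; it is not a copy of the
kernel code.

```c
#include <stdint.h>

/*
 * Multiply two 64-bit values into a full 128-bit product and return the
 * product shifted right by 'shift' bits, truncated to 64 bits.
 * 'shift' must be smaller than 128.
 */
static inline uint64_t example_mul_u64_u64_shr(uint64_t a, uint64_t b,
					       unsigned int shift)
{
	return (uint64_t)(((unsigned __int128)a * b) >> shift);
}
```

For example, (1 << 63) * 4 is 2^65, which does not fit into 64 bits; shifted
right by 64 the helper returns 2.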

Thanks for pointing this out!

Re: [PATCH] compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING

2019-03-21 Thread Heiko Carstens

On Wed, Mar 20, 2019 at 03:20:27PM +0900, Masahiro Yamada wrote:
> Commit 60a3cdd06394 ("x86: add optimized inlining") introduced
> CONFIG_OPTIMIZE_INLINING, but it has been available only for x86.
> 
> The idea is obviously arch-agnostic although we need some code fixups.
> This commit moves the config entry from arch/x86/Kconfig.debug to
> lib/Kconfig.debug so that all architectures (except MIPS for now) can
> benefit from it.
> 
> At this moment, I added "depends on !MIPS" because fixing 0day bot reports
> for MIPS was complex to me.
> 
> I tested this patch on my arm/arm64 boards.
> 
> This can make a huge difference in kernel image size especially when
> CONFIG_OPTIMIZE_FOR_SIZE is enabled.
> 
> For example, I got 3.5% smaller arm64 kernel image for v5.1-rc1.
> 
>   dec   file
>   18983424  arch/arm64/boot/Image.before
>   18321920  arch/arm64/boot/Image.after

Well, this will change, since people will now (have to) start adding
__always_inline annotations on all architectures, most likely until
all have about the same number of annotations as x86. This will
reduce the benefit.

Not sure if it's really a win that we get the inline vs
__always_inline discussion now on all architectures.
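
To make the distinction concrete, here is a small sketch assuming GCC or
Clang. Both function names are hypothetical. With optimized inlining the plain
"inline" keyword is only a hint, so code that depends on being inlined has to
say so explicitly through the always_inline attribute (which the kernel wraps
as __always_inline).

```c
/* Plain inline: a hint only, the compiler may emit an out-of-line copy. */
static inline int example_hint(int x)
{
	return x + 1;
}

/*
 * always_inline: the compiler is asked to inline every call, even when it
 * would otherwise decide against it (for example when not optimizing).
 */
static inline __attribute__((__always_inline__)) int example_forced(int x)
{
	return x + 1;
}
```

Both functions compute the same result; the difference is only in what the
compiler is allowed to do at the call site, which is exactly where the
per-architecture annotation work comes from.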

Re: [PATCH v2 29/29] y2038: add 64-bit time_t syscalls to all 32-bit architectures

2019-01-21 Thread Heiko Carstens

On Fri, Jan 18, 2019 at 05:18:35PM +0100, Arnd Bergmann wrote:
> This adds 21 new system calls on each ABI that has 32-bit time_t
> today. All of these have the exact same semantics as their existing
> counterparts, and the new ones all have macro names that end in 'time64'
> for clarification.
> 
> This gets us to the point of being able to safely use a C library
> that has 64-bit time_t in user space. There are still a couple of
> loose ends to tie up in various areas of the code, but this is the
> big one, and should be entirely uncontroversial at this point.
> 
> In particular, there are four system calls (getitimer, setitimer,
> waitid, and getrusage) that don't have a 64-bit counterpart yet,
> but these can all be safely implemented in the C library by wrapping
> around the existing system calls because the 32-bit time_t they
> pass only counts elapsed time, not time since the epoch. They
> will be dealt with later.
> 
> Signed-off-by: Arnd Bergmann 
> ---
>  arch/s390/kernel/syscalls/syscall.tbl   | 20 +

For the s390 bits:
Acked-by: Heiko Carstens

Re: [PATCH v2 28/29] y2038: rename old time and utime syscalls

2019-01-21 Thread Heiko Carstens

On Fri, Jan 18, 2019 at 05:18:34PM +0100, Arnd Bergmann wrote:
> The time, stime, utime, utimes, and futimesat system calls are only
> used on older architectures, and we do not provide y2038 safe variants
> of them, as they are replaced by clock_gettime64, clock_settime64,
> and utimensat_time64.
> 
> However, for consistency it seems better to have the 32-bit architectures
> that still use them call the "time32" entry points (leaving the
> traditional handlers for the 64-bit architectures), like we do for system
> calls that now require two versions.
> 
> Note: We used to always define __ARCH_WANT_SYS_TIME and
> __ARCH_WANT_SYS_UTIME and only set __ARCH_WANT_COMPAT_SYS_TIME and
> __ARCH_WANT_SYS_UTIME32 for compat mode on 64-bit kernels. Now this is
> reversed: only 64-bit architectures set __ARCH_WANT_SYS_TIME/UTIME, while
> we need __ARCH_WANT_SYS_TIME32/UTIME32 for 32-bit architectures and compat
> mode. The resulting asm/unistd.h changes look a bit counterintuitive.
> 
> This is only a cleanup patch and it should not change any behavior.
> 
> Signed-off-by: Arnd Bergmann 
...
>  arch/s390/include/asm/unistd.h  |  2 +-

For the s390 bits:
Acked-by: Heiko Carstens

Re: [PATCH v2 17/29] syscalls: remove obsolete __IGNORE_ macros

2019-01-21 Thread Heiko Carstens

On Fri, Jan 18, 2019 at 05:18:23PM +0100, Arnd Bergmann wrote:
> These are all for ignoring the lack of obsolete system calls,
> which have been marked the same way in scripts/checksyscall.sh,
> so these can be removed.
> 
> Signed-off-by: Arnd Bergmann 
> ---
>  arch/mips/include/asm/unistd.h   | 16 
>  arch/parisc/include/asm/unistd.h |  3 ---
>  arch/s390/include/asm/unistd.h   |  2 --
>  arch/xtensa/include/asm/unistd.h | 12 
>  4 files changed, 33 deletions(-)

For the s390 bits:
Acked-by: Heiko Carstens

Re: [PATCH v2 14/29] arch: add pkey and rseq syscall numbers everywhere

2019-01-21 Thread Heiko Carstens

On Fri, Jan 18, 2019 at 05:18:20PM +0100, Arnd Bergmann wrote:
> Most architectures define system call numbers for the rseq and pkey system
> calls, even when they don't support the features, and perhaps never will.
> 
> Only a few architectures are missing these, so just define them anyway
> for consistency. If we decide to add them later to one of these, the
> system call numbers won't get out of sync then.
> 
> Signed-off-by: Arnd Bergmann 
> ---
>  arch/alpha/include/asm/unistd.h | 4 
>  arch/alpha/kernel/syscalls/syscall.tbl  | 4 
>  arch/ia64/kernel/syscalls/syscall.tbl   | 4 
>  arch/m68k/kernel/syscalls/syscall.tbl   | 4 
>  arch/parisc/include/asm/unistd.h| 3 ---
>  arch/parisc/kernel/syscalls/syscall.tbl | 4 
>  arch/s390/include/asm/unistd.h  | 3 ---
>  arch/s390/kernel/syscalls/syscall.tbl   | 3 +++
>  arch/sh/kernel/syscalls/syscall.tbl | 4 
>  arch/sparc/include/asm/unistd.h | 5 -
>  arch/sparc/kernel/syscalls/syscall.tbl  | 4 
>  arch/xtensa/kernel/syscalls/syscall.tbl | 1 +
>  12 files changed, 28 insertions(+), 15 deletions(-)

For the s390 bits:
Acked-by: Heiko Carstens

Re: [PATCH v2 13/29] arch: add split IPC system calls where needed

2019-01-21 Thread Heiko Carstens

On Fri, Jan 18, 2019 at 05:18:19PM +0100, Arnd Bergmann wrote:
> The IPC system call handling is highly inconsistent across architectures,
> some use sys_ipc, some use separate calls, and some use both.  We also
> have some architectures that require passing IPC_64 in the flags, and
> others that set it implicitly.
> 
> For the addition of a y2038 safe semtimedop() system call, I chose to only
> support the separate entry points, but that requires first supporting
> the regular ones with their own syscall numbers.
> 
> The IPC_64 is now implied by the new semctl/shmctl/msgctl system
> calls even on the architectures that require passing it with the ipc()
> multiplexer.
> 
> I'm not adding the new semtimedop() or semop() on 32-bit architectures,
> those will get implemented using the new semtimedop_time64() version
> that gets added along with the other time64 calls.
> Three 64-bit architectures (powerpc, s390 and sparc) get semtimedop().
> 
> Signed-off-by: Arnd Bergmann 
> ---
> One aspect here that might be a bit controversial is the use of
> the same system call numbers across all architectures, synchronizing
> all of them with the x86-32 numbers. With the new syscall.tbl
> files, I hope we can just keep doing that in the future, and no
> longer require the architecture maintainers to assign a number.
> 
> This is mainly useful for implementers of the C libraries: if
> we can add future system calls everywhere at the same time, using
> a particular version of the kernel headers also guarantees that
> the system call number macro is visible.
> ---
>  arch/m68k/kernel/syscalls/syscall.tbl | 11 +++
>  arch/mips/kernel/syscalls/syscall_o32.tbl | 11 +++
>  arch/powerpc/kernel/syscalls/syscall.tbl  | 13 +
>  arch/s390/kernel/syscalls/syscall.tbl | 12 
>  arch/sh/kernel/syscalls/syscall.tbl   | 11 +++
>  arch/sparc/kernel/syscalls/syscall.tbl| 12 
>  arch/x86/entry/syscalls/syscall_32.tbl| 11 +++
>  7 files changed, 81 insertions(+)

For the s390 bits:
Acked-by: Heiko Carstens

Re: [PATCH 19/21] treewide: add checks for the return value of memblock_alloc*()

2019-01-18 Thread Heiko Carstens

On Wed, Jan 16, 2019 at 03:44:19PM +0200, Mike Rapoport wrote:
> Add check for the return value of memblock_alloc*() functions and call
> panic() in case of error.
> The panic message repeats the one used by panicking memblock allocators with
> adjustment of parameters to include only relevant ones.
> 
> The replacement was mostly automated with semantic patches like the one
> below with manual massaging of format strings.
> 
> @@
> expression ptr, size, align;
> @@
> ptr = memblock_alloc(size, align);
> + if (!ptr)
> + panic("%s: Failed to allocate %lu bytes align=0x%lx\n", __func__,
> size, align);
> 
> Signed-off-by: Mike Rapoport 
...
> diff --git a/arch/s390/numa/toptree.c b/arch/s390/numa/toptree.c
> index 71a608c..0118c77 100644
> --- a/arch/s390/numa/toptree.c
> +++ b/arch/s390/numa/toptree.c
> @@ -31,10 +31,14 @@ struct toptree __ref *toptree_alloc(int level, int id)
>  {
>   struct toptree *res;
> 
> - if (slab_is_available())
> + if (slab_is_available()) {
>   res = kzalloc(sizeof(*res), GFP_KERNEL);
> - else
> + } else {
>   res = memblock_alloc(sizeof(*res), 8);
> + if (!res)
> + panic("%s: Failed to allocate %zu bytes align=0x%x\n",
> +   __func__, sizeof(*res), 8);
> + }
>   if (!res)
>   return res;

Please remove this hunk, since the code _should_ be able to handle
allocation failures anyway (see end of quoted code).

Otherwise for the s390 bits:
Acked-by: Heiko Carstens

Re: [PATCH 15/15] arch: add pkey and rseq syscall numbers everywhere

2019-01-14 Thread Heiko Carstens

On Fri, Jan 11, 2019 at 06:30:43PM +0100, Arnd Bergmann wrote:
> On Thu, Jan 10, 2019 at 9:36 PM Heiko Carstens
>  wrote:
> > On Thu, Jan 10, 2019 at 05:24:35PM +0100, Arnd Bergmann wrote:
> 
> > Since you only need/want the system call numbers, could you please
> > change these lines to:
> >
> > > +384  common  pkey_alloc  -   -
> > > +385  common  pkey_free   -   -
> > > +386  common  pkey_mprotect   -   -
> >
> > Otherwise it _looks_ like we would need compat wrappers here as well,
> > even though all of them would just jump to sys_ni_syscall() in this
> > case. Making this explicit seems to be better.
> 
> Ok, fair enough. I considered doing this originally and then
> decided against it for consistency with the asm-generic file,
> but I don't care much either way.
> 
> Is this something you may want to add later? I'm not sure exactly
> how pkey compares to s390 storage keys, or if this is something
> completely unrelated.

I don't think pkeys will ever work on s390, since they require a key
per mapping, while the s390 storage keys are per physical page.

Re: [PATCH 07/11] y2038: syscalls: rename y2038 compat syscalls

2019-01-10 Thread Heiko Carstens

On Thu, Jan 10, 2019 at 06:22:12PM +0100, Arnd Bergmann wrote:
> diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl
> index f84ea364a302..b3199a744731 100644
> --- a/arch/s390/kernel/syscalls/syscall.tbl
> +++ b/arch/s390/kernel/syscalls/syscall.tbl
> @@ -20,7 +20,7 @@
>  10   common  unlink  sys_unlink     compat_sys_unlink
>  11   common  execve  sys_execve     compat_sys_execve
>  12   common  chdir   sys_chdir      compat_sys_chdir
> -13   32      time    -              compat_sys_time
> +13   32      time    -              sys_time32
>  14   common  mknod   sys_mknod      compat_sys_mknod
>  15   common  chmod   sys_chmod      compat_sys_chmod
>  16   32      lchown  -              compat_sys_s390_lchown16
> @@ -30,11 +30,11 @@
>  22   common  umount  sys_oldumount  compat_sys_oldumount
>  23   32      setuid  -              compat_sys_s390_setuid16
>  24   32      getuid  -              compat_sys_s390_getuid16
> -25   32      stime   -              compat_sys_stime
> +25   32      stime   -              sys_stime32
>  26   common  ptrace  sys_ptrace     compat_sys_ptrace
>  27   common  alarm   sys_alarm      sys_alarm
>  29   common  pause   sys_pause      sys_pause
> -30   common  utime   sys_utime      compat_sys_utime
> +30   common  utime   sys_utime      sys_utime32
...(and more)...

All of them need compat wrappers to clear the uppermost 33 bits of
user space pointers. I assume there is no new *32 system call which
takes u64/s64 arguments; so the pointers should be the only problem.
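
A sketch of the arithmetic behind "the uppermost 33 bits", under the
assumption that a compat task on s390 uses 31-bit addresses, so 64 - 31 = 33
register bits carry no address information. Both helper names are
hypothetical, and the treatment of plain integer arguments is an assumption
added for contrast, not something stated in this thread.

```c
#include <stdint.h>

/* User pointer from a 31-bit task: keep address bits 0..30, drop the rest. */
static inline uint64_t example_compat_ptr_arg(uint64_t reg)
{
	return reg & 0x7fffffffULL;
}

/* Plain 32-bit integer argument (assumed): keep all of the low 32 bits. */
static inline uint64_t example_compat_ulong_arg(uint64_t reg)
{
	return reg & 0xffffffffULL;
}
```

For a register value of 0xdeadbeef80000010 the pointer variant yields 0x10,
while the integer variant yields 0x80000010. A wrapper therefore has to know
which arguments are pointers, which is why a handler without a wrapper is a
problem for pointer arguments only.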

Re: [PATCH 15/15] arch: add pkey and rseq syscall numbers everywhere

2019-01-10 Thread Heiko Carstens

On Thu, Jan 10, 2019 at 05:24:35PM +0100, Arnd Bergmann wrote:
> Most architectures define system call numbers for the rseq and pkey system
> calls, even when they don't support the features, and perhaps never will.
> 
> Only a few architectures are missing these, so just define them anyway
> for consistency. If we decide to add them later to one of these, the
> system call numbers won't get out of sync then.
> 
> Signed-off-by: Arnd Bergmann 
> diff --git a/arch/s390/include/asm/unistd.h b/arch/s390/include/asm/unistd.h
> index a1fbf15d53aa..ed08f114ee91 100644
> --- a/arch/s390/include/asm/unistd.h
> +++ b/arch/s390/include/asm/unistd.h
> @@ -11,9 +11,6 @@
>  #include 
> 
>  #define __IGNORE_time
> -#define __IGNORE_pkey_mprotect
> -#define __IGNORE_pkey_alloc
> -#define __IGNORE_pkey_free
> 
>  #define __ARCH_WANT_NEW_STAT
>  #define __ARCH_WANT_OLD_READDIR
> diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl
> index 428cf512a757..f84ea364a302 100644
> --- a/arch/s390/kernel/syscalls/syscall.tbl
> +++ b/arch/s390/kernel/syscalls/syscall.tbl
> @@ -391,6 +391,9 @@
>  381  common  kexec_file_load  sys_kexec_file_load  compat_sys_kexec_file_load
>  382  common  io_pgetevents    sys_io_pgetevents    compat_sys_io_pgetevents
>  383  common  rseq             sys_rseq             compat_sys_rseq
> +384  common  pkey_alloc       sys_pkey_alloc       sys_pkey_alloc
> +385  common  pkey_free        sys_pkey_free        sys_pkey_free
> +386  common  pkey_mprotect    sys_pkey_mprotect    sys_pkey_mprotect

Since you only need/want the system call numbers, could you please
change these lines to:

> +384  common  pkey_alloc  -   -
> +385  common  pkey_free   -   -
> +386  common  pkey_mprotect   -   -

Otherwise it _looks_ like we would need compat wrappers here as well,
even though all of them would just jump to sys_ni_syscall() in this
case. Making this explicit seems to be better.
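
The effect of a "-" entry can be sketched roughly as follows. This is a toy
model with hypothetical names (example_table, example_dispatch), not the
generated s390 system call table: a missing handler simply ends up at a
"not implemented" stub that returns -ENOSYS, for native and compat callers
alike.

```c
#include <errno.h>
#include <stddef.h>

typedef long (*example_syscall_fn)(void);

/* Stand-in for the kernel's "not implemented" handler. */
static long example_sys_ni_syscall(void)
{
	return -ENOSYS;
}

/* Hypothetical implemented call, present only to contrast with the gaps. */
static long example_sys_rseq(void)
{
	return 0;
}

struct example_entry {
	int nr;
	example_syscall_fn native;	/* NULL plays the role of "-" */
	example_syscall_fn compat;	/* NULL plays the role of "-" */
};

static const struct example_entry example_table[] = {
	{ 383, example_sys_rseq, example_sys_rseq },
	{ 384, NULL, NULL },	/* pkey_alloc: number reserved, no handler */
	{ 385, NULL, NULL },	/* pkey_free */
	{ 386, NULL, NULL },	/* pkey_mprotect */
};

/* Look up 'nr' and call the matching handler, or the stub if there is none. */
static long example_dispatch(int nr, int is_compat)
{
	size_t i;

	for (i = 0; i < sizeof(example_table) / sizeof(example_table[0]); i++) {
		example_syscall_fn fn;

		if (example_table[i].nr != nr)
			continue;
		fn = is_compat ? example_table[i].compat : example_table[i].native;
		return fn ? fn() : example_sys_ni_syscall();
	}
	return example_sys_ni_syscall();
}
```

The number stays reserved either way; the table just no longer suggests that
a real handler, and with it a compat wrapper, exists.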

Re: [PATCH 14/15] arch: add split IPC system calls where needed

2019-01-10 Thread Heiko Carstens

On Thu, Jan 10, 2019 at 05:24:34PM +0100, Arnd Bergmann wrote:
> The IPC system call handling is highly inconsistent across architectures,
> some use sys_ipc, some use separate calls, and some use both.  We also
> have some architectures that require passing IPC_64 in the flags, and
> others that set it implicitly.
> 
> For the addition of a y2038 safe semtimedop() system call, I chose to only
> support the separate entry points, but that requires first supporting
> the regular ones with their own syscall numbers.
> 
> The IPC_64 is now implied by the new semctl/shmctl/msgctl system
> calls even on the architectures that require passing it with the ipc()
> multiplexer.
> 
> I'm not adding the new semtimedop() or semop() on 32-bit architectures,
> those will get implemented using the new semtimedop_time64() version
> that gets added along with the other time64 calls.
> Three 64-bit architectures (powerpc, s390 and sparc) get semtimedop().
> 
> Signed-off-by: Arnd Bergmann 
> ---
> One aspect here that might be a bit controversial is the use of
> the same system call numbers across all architectures, synchronizing
> all of them with the x86-32 numbers. With the new syscall.tbl
> files, I hope we can just keep doing that in the future, and no
> longer require the architecture maintainers to assign a number.
> 
> This is mainly useful for implementers of the C libraries: if
> we can add future system calls everywhere at the same time, using
> a particular version of the kernel headers also guarantees that
> the system call number macro is visible.

> diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl
> index 022fc099b628..428cf512a757 100644
> --- a/arch/s390/kernel/syscalls/syscall.tbl
> +++ b/arch/s390/kernel/syscalls/syscall.tbl
> @@ -391,3 +391,15 @@
>  381  common  kexec_file_load  sys_kexec_file_load  compat_sys_kexec_file_load
>  382  common  io_pgetevents    sys_io_pgetevents    compat_sys_io_pgetevents
>  383  common  rseq             sys_rseq             compat_sys_rseq
> +# room for arch specific syscalls
> +392  64      semtimedop       sys_semtimedop       -
> +393  common  semget           sys_semget           sys_semget
...
> +395  common  shmget           sys_shmget           sys_shmget
...
> +398  common  shmdt            sys_shmdt            sys_shmdt
> +399  common  msgget           sys_msgget           sys_msgget

These four need compat system call wrappers, unfortunately... (well,
actually only shmget and shmdt require them, but let's add them for
all four). See arch/s390/kernel/compat_wrapper.c

I'm afraid this compat special handling will be even more annoying in
the future, since s390 will be the only architecture which requires
this special handling.

_Maybe_ it would make sense to automatically generate a weak compat
system call wrapper for s390 with the SYSCALL_DEFINE macros, but that
probably won't work in all cases.
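
As a rough illustration of what such a wrapper has to do, here is a user-space sketch. It assumes the usual reason for the s390 special case: a 31-bit compat task passes its arguments in 64-bit registers whose upper bits cannot be trusted. Both helper names are hypothetical and only mimic the masking and zero-extension step.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the pointer handling of a compat wrapper
 * (relevant for shmdt): keep only the low 31 address bits.
 */
static uint64_t compat_ptr_bits(uint64_t reg)
{
	return reg & 0x7fffffffULL;
}

/*
 * Hypothetical stand-in for an unsigned long / size_t argument coming
 * from a 31-bit task (relevant for shmget): zero-extend the low 32 bits.
 */
static uint64_t compat_ulong_bits(uint64_t reg)
{
	return (uint32_t)reg;
}
```

Plain int arguments do not need this treatment in the sketch, which matches the remark that only shmget and shmdt strictly require a wrapper.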

Re: [PATCH RFC 1/2] drivers/base: export lock_device_hotplug/unlock_device_hotplug

2018-08-17 Thread Heiko Carstens

On Fri, Aug 17, 2018 at 01:04:58PM +0200, David Hildenbrand wrote:
> >> If there are no objections, I'll go into that direction. But I'll wait
> >> for more comments regarding the general concept first.
> > 
> > It is the middle of the merge window, and maintainers are really busy
> > right now.  I doubt you will get many review comments just yet...
> > 
> 
> This has been broken since 2015, so I guess it can wait a bit :)

I hope you figured out what needs to be locked and why. Your patch description
seems to be "only" about locking order ;)

I tried to figure out and document that partially with 55adc1d05dca ("mm:
add private lock to serialize memory hotplug operations"), and that wasn't
easy to figure out. I was especially concerned about sprinkling
lock/unlock_device_hotplug() calls, which has the potential to make it the
next BKL thing.

Re: [PATCH v3 04/17] y2038: s390: Remove unneeded ipc uapi header files

2018-04-20 Thread Heiko Carstens

On Thu, Apr 19, 2018 at 04:37:24PM +0200, Arnd Bergmann wrote:
> The s390 msgbuf/sembuf/shmbuf header files are all identical to the
> version from asm-generic.
> 
> This patch removes the files and replaces them with 'generic-y'
> statements, to avoid having to modify each copy when we extend sysvipc
> to deal with 64-bit time_t in 32-bit user space.
> 
> Note that unlike alpha and ia64, the ipcbuf.h header file is slightly
> different here, so I'm leaving the private copy.
> 
> To deal with 32-bit compat tasks, we also have to adapt the definitions
> of compat_{shm,sem,msg}id_ds to match the changes to the respective
> asm-generic files.
> 
> Signed-off-by: Arnd Bergmann <a...@arndb.de>
> ---
>  arch/s390/include/asm/compat.h  | 32 
>  arch/s390/include/uapi/asm/Kbuild   |  3 +++
>  arch/s390/include/uapi/asm/msgbuf.h | 38 
>  arch/s390/include/uapi/asm/sembuf.h | 30 ---
>  arch/s390/include/uapi/asm/shmbuf.h | 49 
> -
>  5 files changed, 19 insertions(+), 133 deletions(-)
>  delete mode 100644 arch/s390/include/uapi/asm/msgbuf.h
>  delete mode 100644 arch/s390/include/uapi/asm/sembuf.h
>  delete mode 100644 arch/s390/include/uapi/asm/shmbuf.h

FWIW,

Acked-by: Heiko Carstens <heiko.carst...@de.ibm.com>

Re: [PATCH v1] mm: relax deferred struct page requirements

2017-11-16 Thread Heiko Carstens

On Thu, Nov 16, 2017 at 08:46:01PM -0500, Pavel Tatashin wrote:
> There is no need to have ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT,
> as all the page initialization code is in common code.
> 
> Also, there is no need to depend on MEMORY_HOTPLUG, as initialization code
> does not really use hotplug memory functionality. So, we can remove this
> requirement as well.
> 
> This patch allows to use deferred struct page initialization on all
> platforms with memblock allocator.
> 
> Tested on x86, arm64, and sparc. Also, verified that code compiles on
> PPC with CONFIG_MEMORY_HOTPLUG disabled.
> 
> Signed-off-by: Pavel Tatashin <pasha.tatas...@oracle.com>
> ---
>  arch/powerpc/Kconfig | 1 -
>  arch/s390/Kconfig| 1 -
>  arch/x86/Kconfig | 1 -
>  mm/Kconfig   | 7 +--
>  4 files changed, 1 insertion(+), 9 deletions(-)

For the s390 bit:

Acked-by: Heiko Carstens <heiko.carst...@de.ibm.com>

Re: [v3 9/9] s390: teach platforms not to zero struct pages memory

2017-05-15 Thread Heiko Carstens

Hello Pasha,

> Thank you for looking at this patch. I am worried to make the proposed
> change, because, as I understand in this case we allocate memory not for
> "struct page"s but for table that hold them. So, we will change the behavior
> from the current one, where this table is allocated zeroed, but now it won't
> be zeroed.

The page table, if needed, is allocated and populated a couple of lines
above. See the vmem_pte_alloc() call. So my request to include the hunk
below is still valid ;)

> >If you add the hunk below then this is
> >
> >Acked-by: Heiko Carstens <heiko.carst...@de.ibm.com>
> >
> >diff --git a/arch/s390/mm/vmem.c b/arch/s390/mm/vmem.c
> >index ffe9ba1aec8b..bf88a8b9c24d 100644
> >--- a/arch/s390/mm/vmem.c
> >+++ b/arch/s390/mm/vmem.c
> >@@ -272,7 +272,7 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node)
> > 		if (pte_none(*pt_dir)) {
> > 			void *new_page;
> >
> >-			new_page = vmemmap_alloc_block(PAGE_SIZE, node, true);
> >+			new_page = vmemmap_alloc_block(PAGE_SIZE, node, VMEMMAP_ZERO);
> > 			if (!new_page)
> > 				goto out;
> > 			pte_val(*pt_dir) = __pa(new_page) | pgt_prot;
> >
>

Re: [v3 9/9] s390: teach platforms not to zero struct pages memory

2017-05-08 Thread Heiko Carstens

On Fri, May 05, 2017 at 01:03:16PM -0400, Pavel Tatashin wrote:
> If we are using deferred struct page initialization feature, most of
> "struct page"es are getting initialized after other CPUs are started, and
> hence we are benefiting from doing this job in parallel. However, we are
> still zeroing all the memory that is allocated for "struct pages" using the
> boot CPU.  This patch solves this problem, by deferring zeroing "struct
> pages" to only when they are initialized on s390 platforms.
> 
> Signed-off-by: Pavel Tatashin <pasha.tatas...@oracle.com>
> Reviewed-by: Shannon Nelson <shannon.nel...@oracle.com>
> ---
>  arch/s390/mm/vmem.c |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/s390/mm/vmem.c b/arch/s390/mm/vmem.c
> index 9c75214..ffe9ba1 100644
> --- a/arch/s390/mm/vmem.c
> +++ b/arch/s390/mm/vmem.c
> @@ -252,7 +252,7 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node)
>   void *new_page;
>  
>   new_page = vmemmap_alloc_block(PMD_SIZE, node,
> -true);
> +VMEMMAP_ZERO);
>   if (!new_page)
>   goto out;
>   pmd_val(*pm_dir) = __pa(new_page) | sgt_prot;

If you add the hunk below then this is

Acked-by: Heiko Carstens <heiko.carst...@de.ibm.com>

diff --git a/arch/s390/mm/vmem.c b/arch/s390/mm/vmem.c
index ffe9ba1aec8b..bf88a8b9c24d 100644
--- a/arch/s390/mm/vmem.c
+++ b/arch/s390/mm/vmem.c
@@ -272,7 +272,7 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node)
 		if (pte_none(*pt_dir)) {
 			void *new_page;
 
-			new_page = vmemmap_alloc_block(PAGE_SIZE, node, true);
+			new_page = vmemmap_alloc_block(PAGE_SIZE, node, VMEMMAP_ZERO);
 			if (!new_page)
 				goto out;
 			pte_val(*pt_dir) = __pa(new_page) | pgt_prot;

Re: [v2 5/5] mm: teach platforms not to zero struct pages memory

2017-03-27 Thread Heiko Carstens

On Fri, Mar 24, 2017 at 03:19:52PM -0400, Pavel Tatashin wrote:
> If we are using deferred struct page initialization feature, most of
> "struct page"es are getting initialized after other CPUs are started, and
> hence we are benefiting from doing this job in parallel. However, we are
> still zeroing all the memory that is allocated for "struct pages" using the
> boot CPU.  This patch solves this problem, by deferring zeroing "struct
> pages" to only when they are initialized.
> 
> Signed-off-by: Pavel Tatashin 
> Reviewed-by: Shannon Nelson 
> ---
>  arch/powerpc/mm/init_64.c |2 +-
>  arch/s390/mm/vmem.c   |2 +-
>  arch/sparc/mm/init_64.c   |2 +-
>  arch/x86/mm/init_64.c |2 +-
>  4 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
> index eb4c270..24faf2d 100644
> --- a/arch/powerpc/mm/init_64.c
> +++ b/arch/powerpc/mm/init_64.c
> @@ -181,7 +181,7 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node)
>   if (vmemmap_populated(start, page_size))
>   continue;
> 
> - p = vmemmap_alloc_block(page_size, node, true);
> + p = vmemmap_alloc_block(page_size, node, VMEMMAP_ZERO);
>   if (!p)
>   return -ENOMEM;
> 
> diff --git a/arch/s390/mm/vmem.c b/arch/s390/mm/vmem.c
> index 9c75214..ffe9ba1 100644
> --- a/arch/s390/mm/vmem.c
> +++ b/arch/s390/mm/vmem.c
> @@ -252,7 +252,7 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node)
>   void *new_page;
> 
>   new_page = vmemmap_alloc_block(PMD_SIZE, node,
> -true);
> +VMEMMAP_ZERO);
>   if (!new_page)
>   goto out;
>   pmd_val(*pm_dir) = __pa(new_page) | sgt_prot;

s390 has two call sites that need to be converted, like you did in one of
your previous patches. The same seems to be true for powerpc, unless there
is a reason to not convert them?

Re: [v1 0/5] parallelized "struct page" zeroing

2017-03-24 Thread Heiko Carstens

On Fri, Mar 24, 2017 at 09:51:09AM +0100, Christian Borntraeger wrote:
> On 03/24/2017 12:01 AM, Pavel Tatashin wrote:
> > When deferred struct page initialization feature is enabled, we get a
> > performance gain of initializing vmemmap in parallel after other CPUs are
> > started. However, we still zero the memory for vmemmap using one boot CPU.
> > This patch-set fixes the memset-zeroing limitation by deferring it as well.
> > 
> > Here is example performance gain on SPARC with 32T:
> > base
> > https://hastebin.com/ozanelatat.go
> > 
> > fix
> > https://hastebin.com/utonawukof.go
> > 
> > As you can see without the fix it takes: 97.89s to boot
> > With the fix it takes: 46.91 to boot.
> > 
> > On x86 time saving is going to be even greater (proportionally to memory size)
> > because there are twice as many "struct page"es for the same amount of memory,
> > as base pages are twice smaller.
> 
> Fixing the linux-s390 mailing list email.
> This might be useful for s390 as well.

Unfortunately only for the fake numa case, since as far as I understand it,
parallelization happens only on a node granularity. And since we are
usually only having one node...

But anyway, it won't hurt to set ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT on
s390 also. I'll do some testing and then we'll see.

Pavel, could you please change your patch 5 so it also converts the s390
call sites of vmemmap_alloc_block() so they use VMEMMAP_ZERO instead of
'true' as argument?
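
To illustrate why a named constant reads better than a bare 'true' at the call site, here is a small user-space sketch. Only the name VMEMMAP_ZERO comes from this thread; VMEMMAP_NO_ZERO and alloc_block() are hypothetical and merely stand in for the real allocator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * VMEMMAP_ZERO is the name used in the thread. VMEMMAP_NO_ZERO and
 * alloc_block() are made up for this illustration.
 */
enum vmemmap_zero_mode {
	VMEMMAP_NO_ZERO = 0,
	VMEMMAP_ZERO = 1,
};

/* Allocate a block and clear it only when the caller asks for it. */
static void *alloc_block(size_t size, enum vmemmap_zero_mode mode)
{
	void *p = malloc(size);

	if (p && mode == VMEMMAP_ZERO)
		memset(p, 0, size);
	return p;
}
```

A call such as alloc_block(PAGE_SIZE, VMEMMAP_ZERO) documents itself, while a trailing 'true' forces the reader to look up the prototype.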

Re: [PATCH 1/3] futex: remove duplicated code

2017-03-03 Thread Heiko Carstens

On Fri, Mar 03, 2017 at 01:27:10PM +0100, Jiri Slaby wrote:
> There is code duplicated over all architecture's headers for
> futex_atomic_op_inuser. Namely op decoding, access_ok check for uaddr,
> and comparison of the result.
> 
> Remove this duplication and leave up to the arches only the needed
> assembly which is now in arch_futex_atomic_op_inuser.
> 
> Note that s390 removed access_ok check in d12a29703 ("s390/uaccess:
> remove pointless access_ok() checks") as access_ok there returns true.
> We introduce it back to the helper for the sake of simplicity (it gets
> optimized away anyway).
> 
> Signed-off-by: Jiri Slaby <jsl...@suse.cz>
> ---
>  arch/s390/include/asm/futex.h   | 23 -
>  include/asm-generic/futex.h | 50 
> +++--
>  kernel/futex.c  | 36 ++

Looks good to me and still boots on s390. Therefore for the s390 bits:
Acked-by: Heiko Carstens <heiko.carst...@de.ibm.com>

Thanks!
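
For readers who do not have the duplicated code in mind: the "op decoding" mentioned in the patch description splits the encoded futex operation into four fields. The sketch below follows the FUTEX_OP() layout from the futex uapi header as I understand it; the struct and function names are hypothetical, and the masking of the shift count is an extra safety step added for this user-space version.

```c
#include <assert.h>
#include <stdint.h>

struct futex_op_fields {
	int op;      /* operation, bits 28..30 */
	int cmp;     /* comparison, bits 24..27 */
	int oparg;   /* 12-bit signed operand, bits 12..23 */
	int cmparg;  /* 12-bit signed comparison operand, bits 0..11 */
};

/* Sign-extend the low 12 bits of v to an int. */
static int sext12(uint32_t v)
{
	v &= 0xfff;
	return (v & 0x800) ? (int)v - 0x1000 : (int)v;
}

static struct futex_op_fields decode_futex_op(uint32_t encoded_op)
{
	struct futex_op_fields f;

	f.op = (encoded_op >> 28) & 7;
	f.cmp = (encoded_op >> 24) & 15;
	f.oparg = sext12(encoded_op >> 12);
	f.cmparg = sext12(encoded_op);
	/* Bit 31 (FUTEX_OP_OPARG_SHIFT): use 1 << oparg as the operand. */
	if (encoded_op & (8u << 28))
		f.oparg = (int)(1u << (f.oparg & 31));
	return f;
}
```

After this common step only the atomic read-modify-write on the user address is left to the architecture.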

Re: [RFC 1/4] mm: remove unused TASK_SIZE_OF()

2017-01-01 Thread Heiko Carstens

On Fri, Dec 30, 2016 at 06:56:31PM +0300, Dmitry Safonov wrote:
> All users of TASK_SIZE_OF(tsk) have migrated to mm->task_size or
> TASK_SIZE_MAX since:
> commit d696ca016d57 ("x86/fsgsbase/64: Use TASK_SIZE_MAX for
> FSBASE/GSBASE upper limits"),
> commit a06db751c321 ("pagemap: check permissions and capabilities at
> open time"),
> 
> Signed-off-by: Dmitry Safonov <dsafo...@virtuozzo.com>
> ---
...

> diff --git a/arch/s390/include/asm/processor.h b/arch/s390/include/asm/processor.h
> index 6bca916a5ba0..c53e8e2a51ac 100644
> --- a/arch/s390/include/asm/processor.h
> +++ b/arch/s390/include/asm/processor.h
> @@ -89,10 +89,9 @@ extern void execve_tail(void);
>   * User space process size: 2GB for 31 bit, 4TB or 8PT for 64 bit.
>   */
> 
> -#define TASK_SIZE_OF(tsk)	((tsk)->mm->context.asce_limit)
>  #define TASK_UNMAPPED_BASE	(test_thread_flag(TIF_31BIT) ? \
>  					(1UL << 30) : (1UL << 41))
> -#define TASK_SIZE		TASK_SIZE_OF(current)
> +#define TASK_SIZE		(current->mm->context.asce_limit)
>  #define TASK_MAX_SIZE		(1UL << 53)
> 
>  #define STACK_TOP		(1UL << (test_thread_flag(TIF_31BIT) ? 31:42))

FWIW, for the s390 part:

Acked-by: Heiko Carstens <heiko.carst...@de.ibm.com>

Re: [PATCH 1/3] kernel/sched: introduce vcpu preempted check interface

2016-06-28 Thread Heiko Carstens

On Mon, Jun 27, 2016 at 04:00:43PM +0200, Peter Zijlstra wrote:
> On Mon, Jun 27, 2016 at 01:41:28PM -0400, Pan Xinhui wrote:
> > +++ b/include/linux/sched.h
> > @@ -3293,6 +3293,15 @@ static inline void set_task_cpu(struct task_struct *p, unsigned int cpu)
> >  
> >  #endif /* CONFIG_SMP */
> >  
> > +#ifdef arch_vcpu_is_preempted
> > +static inline bool vcpu_is_preempted(int cpu)
> > +{
> > +   return arch_vcpu_is_preempted(cpu);
> > +}
> > +#else
> > +#define vcpu_is_preempted(cpu) false
> > +#endif
> 
> #ifndef vcpu_is_preempted
> #define vcpu_is_preempted(cpu)	(false)
> #endif
> 
> Is so much simpler...
> 
> Also, please Cc the virt list so that other interested parties can
> comment, and maybe also the s390 folks.

The s390 implementation would be to simply use cpu_is_preempted() from
arch/s390/lib/spinlock.c.
It's nice that there will be a common code function for this!
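
The override pattern Peter suggests can be tried in isolation. This is a sketch only: should_keep_spinning() is a hypothetical caller, and the commented-out architecture definition is an assumption about how an arch header would hook in.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * An architecture that can tell whether a vCPU is preempted would
 * define the macro before this point, for example (hypothetical):
 *
 *   #define vcpu_is_preempted(cpu)	arch_specific_check(cpu)
 */

/* Generic fallback: without hypervisor knowledge report "not preempted". */
#ifndef vcpu_is_preempted
#define vcpu_is_preempted(cpu)	(false)
#endif

/* Hypothetical caller: keep spinning only while the lock owner runs. */
static bool should_keep_spinning(int owner_cpu)
{
	return !vcpu_is_preempted(owner_cpu);
}
```

With the fallback in place common code can call vcpu_is_preempted() unconditionally, and an architecture only has to provide its own definition to change the behaviour.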

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 3/3] param: convert some "on"/"off" users to strtobool

2016-01-28 Thread Heiko Carstens

On Thu, Jan 28, 2016 at 06:17:07AM -0800, Kees Cook wrote:
> This changes several users of manual "on"/"off" parsing to use strtobool.
> 
> Signed-off-by: Kees Cook <keesc...@chromium.org>
> Cc: x...@kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux-s...@vger.kernel.org
> ---
>  arch/powerpc/kernel/rtasd.c  | 10 +++---
>  arch/powerpc/platforms/pseries/hotplug-cpu.c | 11 +++
>  arch/s390/kernel/time.c  |  8 ++--
>  arch/s390/kernel/topology.c  |  8 +++-
>  arch/x86/kernel/aperture_64.c| 13 +++--
>  include/linux/tick.h |  2 +-
>  kernel/time/hrtimer.c| 11 +++
>  kernel/time/tick-sched.c | 11 +++
>  8 files changed, 21 insertions(+), 53 deletions(-)

For the s390 bits:

Acked-by: Heiko Carstens <heiko.carst...@de.ibm.com>
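
For context, the kind of "on"/"off" parsing that the converted call sites no longer open-code looks roughly like this. It is a user-space approximation, not the kernel's strtobool(): parse_bool() is a hypothetical name, and the set of accepted spellings is an assumption about the behaviour after this series.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Look at the first character, and at the second one for "on"/"off".
 * Returns 0 and stores the result, or -EINVAL for unknown input.
 */
static int parse_bool(const char *s, bool *res)
{
	if (!s)
		return -EINVAL;

	switch (s[0]) {
	case 'y': case 'Y': case '1':
		*res = true;
		return 0;
	case 'n': case 'N': case '0':
		*res = false;
		return 0;
	case 'o': case 'O':
		switch (s[1]) {
		case 'n': case 'N':
			*res = true;
			return 0;
		case 'f': case 'F':
			*res = false;
			return 0;
		}
		break;
	}
	return -EINVAL;
}
```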

Re: [PATCH v2 3/3] s390: query dynamic DEBUG_PAGEALLOC setting

2016-01-27 Thread Heiko Carstens

On Wed, Jan 27, 2016 at 09:50:18AM +0100, Christian Borntraeger wrote:
> We can use debug_pagealloc_enabled() to check if we can map
> the identity mapping with 1MB/2GB pages as well as to print
> the current setting in dump_stack.
> 
> Signed-off-by: Christian Borntraeger <borntrae...@de.ibm.com>
> ---
>  arch/s390/kernel/dumpstack.c |  8 
>  arch/s390/mm/vmem.c  | 10 --
>  2 files changed, 8 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/s390/kernel/dumpstack.c b/arch/s390/kernel/dumpstack.c
> index dc8e204..3f352e9 100644
> --- a/arch/s390/kernel/dumpstack.c
> +++ b/arch/s390/kernel/dumpstack.c
> @@ -11,6 +11,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -184,10 +185,9 @@ void die(struct pt_regs *regs, const char *str)
>  #endif
>  #ifdef CONFIG_SMP
>   printk("SMP ");
> -#endif
> -#ifdef CONFIG_DEBUG_PAGEALLOC
> - printk("DEBUG_PAGEALLOC");
> -#endif
> +#endif   
> +if (debug_pagealloc_enabled())
> + printk("DEBUG_PAGEALLOC");
>   printk("\n");

Indentation is broken ("if").
Besides that

Reviewed-by: Heiko Carstens <heiko.carst...@de.ibm.com>

Re: [PATCH 3/4] s390: enable text relative kallsyms for 64-bit targets

2016-01-20 Thread Heiko Carstens

On Wed, Jan 20, 2016 at 10:05:37AM +0100, Ard Biesheuvel wrote:
> This enables the newly introduced text-relative kallsyms support when
> building 64-bit targets. This cuts the size of the kallsyms address
> table in half, reducing the memory footprint of the kernel .rodata
> section by about 250 KB for a defconfig build.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org>
> ---
> 
> diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
> index dbeeb3a049f2..588160fd1db0 100644
> --- a/arch/s390/Kconfig
> +++ b/arch/s390/Kconfig
> @@ -149,6 +149,7 @@ config S390
>   select HAVE_REGS_AND_STACK_ACCESS_API
>   select HAVE_SYSCALL_TRACEPOINTS
>   select HAVE_VIRT_CPU_ACCOUNTING
> + select KALLSYMS_TEXT_RELATIVE if 64BIT

Please remove the "if 64BIT" since s390 is always 64BIT in the meantime.
Tested on s390 and everything seems still to work ;)

Acked-by: Heiko Carstens <heiko.carst...@de.ibm.com>
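
The size reduction claimed in the patch description comes from storing 32-bit offsets relative to a base address instead of full 64-bit addresses. A minimal sketch of that idea, with made-up addresses and hypothetical names throughout:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical symbol addresses, all close to one base address. */
#define NSYMS 3
static const uint64_t text_base = 0x0000000000100000ULL;
static const uint64_t abs_addrs[NSYMS] = {
	0x0000000000100000ULL,
	0x0000000000100040ULL,
	0x0000000000123450ULL,
};

/* Relative table: one 32-bit offset per symbol instead of 64 bits. */
static uint32_t rel_offsets[NSYMS];

static void build_relative_table(void)
{
	for (size_t i = 0; i < NSYMS; i++)
		rel_offsets[i] = (uint32_t)(abs_addrs[i] - text_base);
}

/* Recover the full address from base plus offset. */
static uint64_t sym_address(size_t i)
{
	return text_base + rel_offsets[i];
}
```

The relative table needs half the space of the absolute one, at the cost of one addition per lookup.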

Re: [PATCH 3/4] s390: enable text relative kallsyms for 64-bit targets

2016-01-20 Thread Heiko Carstens

On Wed, Jan 20, 2016 at 11:04:24AM +0100, Ard Biesheuvel wrote:
> On 20 January 2016 at 10:43, Heiko Carstens <heiko.carst...@de.ibm.com> wrote:
> > On Wed, Jan 20, 2016 at 10:05:37AM +0100, Ard Biesheuvel wrote:
> >> This enables the newly introduced text-relative kallsyms support when
> >> building 64-bit targets. This cuts the size of the kallsyms address
> >> table in half, reducing the memory footprint of the kernel .rodata
> >> section by about 250 KB for a defconfig build.
> >>
> >> Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org>
> >> ---
> >>
> >> diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
> >> index dbeeb3a049f2..588160fd1db0 100644
> >> --- a/arch/s390/Kconfig
> >> +++ b/arch/s390/Kconfig
> >> @@ -149,6 +149,7 @@ config S390
> >>   select HAVE_REGS_AND_STACK_ACCESS_API
> >>   select HAVE_SYSCALL_TRACEPOINTS
> >>   select HAVE_VIRT_CPU_ACCOUNTING
> >> + select KALLSYMS_TEXT_RELATIVE if 64BIT
> >
> > Please remove the "if 64BIT" since s390 is always 64BIT in the meantime.
> > Tested on s390 and everything seems still to work ;)
> >
> > Acked-by: Heiko Carstens <heiko.carst...@de.ibm.com>
> >
> 
> Thanks! Did you take a look at /proc/kallsyms, by any chance? It
> should look identical with and without these patches

Close to identical, since the generated code and offsets change a bit with
your new config option enabled and disabled. But only those parts that are
linked behind kernel/kallsyms.c.

However I did run a couple of ftrace, kprobes tests and enforced call
backtraces. Everything still works.

So it looks all good.

Re: [PATCH V3 2/5] mm: mlock: Add new mlock, munlock, and munlockall system calls

2015-07-08 Thread Heiko Carstens

On Tue, Jul 07, 2015 at 01:03:40PM -0400, Eric B Munson wrote:
> With the refactored mlock code, introduce new system calls for mlock,
> munlock, and munlockall.  The new calls will allow the user to specify
> what lock states are being added or cleared.  mlock2 and munlock2 are
> trivial at the moment, but a follow on patch will add a new mlock state
> making them useful.
> 
> munlock2 addresses a limitation of the current implementation.  If a
> user calls mlockall(MCL_CURRENT | MCL_FUTURE) and then later decides
> that MCL_FUTURE should be removed, they would have to call munlockall()
> followed by mlockall(MCL_CURRENT) which could potentially be very
> expensive.  The new munlockall2 system call allows a user to simply
> clear the MCL_FUTURE flag.
> 
> Signed-off-by: Eric B Munson <emun...@akamai.com>

...

> diff --git a/arch/s390/kernel/syscalls.S b/arch/s390/kernel/syscalls.S
> index 1acad02..f6d81d6 100644
> --- a/arch/s390/kernel/syscalls.S
> +++ b/arch/s390/kernel/syscalls.S
> @@ -363,3 +363,6 @@ SYSCALL(sys_bpf,compat_sys_bpf)
>  SYSCALL(sys_s390_pci_mmio_write,compat_sys_s390_pci_mmio_write)
>  SYSCALL(sys_s390_pci_mmio_read,compat_sys_s390_pci_mmio_read)
>  SYSCALL(sys_execveat,compat_sys_execveat)
> +SYSCALL(sys_mlock2,compat_sys_mlock2)		/* 355 */
> +SYSCALL(sys_munlock2,compat_sys_munlock2)
> +SYSCALL(sys_munlockall2,compat_sys_munlockall2)

FWIW, you would also need to add matching lines to the two files

arch/s390/include/uapi/asm/unistd.h
arch/s390/kernel/compat_wrapper.c

so that the system call would be wired up on s390.

Re: [RFC 0/2] Reenable might_sleep() checks for might_fault() when atomic

2014-11-27 Thread Heiko Carstens

On Thu, Nov 27, 2014 at 09:03:01AM +0100, David Hildenbrand wrote:
> > Code like
> > 	spin_lock(lock);
> > 	if (copy_to_user(...))
> > 		rc = ...
> > 	spin_unlock(lock);
> > really *should* generate warnings like it did before.
> > 
> > And *only* code like
> > 	spin_lock(lock);
> 
> Is only code like this valid or also with the spin_lock() dropped?
> (e.g. the access in patch1 if I remember correctly)
> 
> So should page_fault_disable() increment the pagefault counter and the preempt
> counter or only the first one?

Given that a sequence like

	page_fault_disable();
	if (copy_to_user(...))
		rc = ...
	page_fault_enable();

is correct code right now I think page_fault_disable() should increase both.
No need for surprising semantic changes.

> So we would have pagefault code rely on:
> 
> in_disabled_pagefault() ( pagefault_disabled() ... whatever ) instead of
> in_atomic().

No, let's be more defensive: the page fault handler should do nothing if
in_atomic() just like now. But it could have a quick check and emit a one
time warning if page faults aren't disabled in addition.
That might help debugging but keeps the system more likely alive.

might_fault() however should call might_sleep() if page faults aren't
disabled, but that's what you proposed anyway I think.

Re: [RFC 0/2] Reenable might_sleep() checks for might_fault() when atomic

2014-11-26 Thread Heiko Carstens

On Wed, Nov 26, 2014 at 07:04:47PM +0200, Michael S. Tsirkin wrote:
> On Wed, Nov 26, 2014 at 05:51:08PM +0100, Christian Borntraeger wrote:
> > > But this one was giving users in field false positives.
> > 
> > So let's try to fix those, ok? If we can't, then tough luck.
> 
> Sure.
> I think the simplest way might be to make spinlock disable
> preemption when CONFIG_DEBUG_ATOMIC_SLEEP is enabled.
> 
> As a result, userspace access will fail and caller will
> get a nice error.

Yes, _userspace_ now sees unpredictable behaviour, instead of the kernel
emitting a big loud warning to the console.

Please consider this simple example:

int bar(char __user *ptr)
{
	...
	if (copy_to_user(ptr, ...))
		return -EFAULT;
	...
}

SYSCALL_DEFINE1(foo, char __user *, ptr)
{
	int rc;

	...
	rc = bar(ptr);
	if (rc)
		goto out;
	...
out:
	return rc;
}

The above simple system call just works fine, with and without your change,
however if somebody (incorrectly) changes sys_foo() to the code below:

	spin_lock(lock);
	rc = bar(ptr);
	if (rc)
		goto out;
out:
	spin_unlock(lock);
	return rc;

Broken code like above used to generate warnings. With your change we won't
see any warnings anymore. Instead we get random and bad behaviour:

For !CONFIG_PREEMPT if the page at ptr is not mapped, the kernel will see
a fault, potentially schedule and potentially deadlock on lock.
Without _any_ warning anymore.

For CONFIG_PREEMPT if the page at ptr is mapped, everything works. However if
the page is not mapped, userspace now all of a sudden will see an invalid(!)
-EFAULT return code, instead of the kernel resolving the page fault.
Yes, the kernel can't resolve the fault since we hold a spinlock. But the
above bogus code did give warnings to give you an idea that something probably
is not correct.

Who on earth is supposed to debug crap like this???

What we really want is:

Code like
	spin_lock(lock);
	if (copy_to_user(...))
		rc = ...
	spin_unlock(lock);
really *should* generate warnings like it did before.

And *only* code like
	spin_lock(lock);
	page_fault_disable();
	if (copy_to_user(...))
		rc = ...
	page_fault_enable();
	spin_unlock(lock);
should not generate warnings, since the author hopefully knew what he did.

We could achieve that by e.g. adding a couple of pagefault disabled bits
within current_thread_info()->preempt_count, which would allow
pagefault_disable() and pagefault_enable() to modify a different part of
preempt_count than it does now, so there is a way to tell if pagefaults have
been explicitly disabled or are just a side effect of preemption being
disabled.
This would allow might_fault() to restore its old sane behaviour for the
!page_fault_disabled() case.
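
The proposal can be modelled in user space to make the intended semantics concrete. Everything below is a sketch under assumptions: the bit layout, the counter variable and all sketch_* names are hypothetical and do not describe the real preempt_count layout.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical layout: the low 8 bits count preempt_disable() nesting,
 * the next 4 bits count pagefault_disable() nesting.
 */
#define PREEMPT_SHIFT	0
#define PF_SHIFT	8
#define PF_MASK		0xf00u

static unsigned int count;	/* stand-in for preempt_count */

static void sketch_preempt_disable(void)   { count += 1u << PREEMPT_SHIFT; }
static void sketch_preempt_enable(void)    { count -= 1u << PREEMPT_SHIFT; }
static void sketch_pagefault_disable(void) { count += 1u << PF_SHIFT; }
static void sketch_pagefault_enable(void)  { count -= 1u << PF_SHIFT; }

/* Any non-zero bit means "atomic": the fault handler must not sleep. */
static bool sketch_in_atomic(void)
{
	return count != 0;
}

/* True only if page faults were disabled explicitly. */
static bool sketch_pagefault_disabled(void)
{
	return (count & PF_MASK) != 0;
}

/* might_fault() should warn when atomic without an explicit disable. */
static bool sketch_might_fault_warns(void)
{
	return sketch_in_atomic() && !sketch_pagefault_disabled();
}
```

In this model a bare spin_lock() around copy_to_user() still triggers the warning, while the explicit pagefault_disable() variant stays silent, and the fault handler keeps bailing out in both cases because the counter is non-zero.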

Re: [RESEND][PATCH 1/2] lib/scatterlist: Make ARCH_HAS_SG_CHAIN an actual Kconfig

2014-03-23 Thread Heiko Carstens

On Sat, Mar 22, 2014 at 11:13:51AM -0700, Laura Abbott wrote:
> Rather than have architectures #define ARCH_HAS_SG_CHAIN in an architecture
> specific scatterlist.h, make it a proper Kconfig option and use that
> instead. At same time, remove the header files that are now mostly
> useless and just include asm-generic/scatterlist.h.
> 

[...]

> diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
> index 65a0775..d6c2059 100644
> --- a/arch/s390/Kconfig
> +++ b/arch/s390/Kconfig
> @@ -142,6 +142,7 @@ config S390
>  	select SYSCTL_EXCEPTION_TRACE
>  	select VIRT_CPU_ACCOUNTING
>  	select VIRT_TO_BUS
> +	select ARCH_HAS_SG_CHAIN
> 

Acked-by: Heiko Carstens <heiko.carst...@de.ibm.com>

FWIW, it would have been nice to keep the list of selected configs sorted.
However no need to resend.

Re: [PATCH v3 net-next] fix unsafe set_memory_rw from softirq

2013-10-04 Thread Heiko Carstens

On Thu, Oct 03, 2013 at 07:24:06PM -0700, Alexei Starovoitov wrote:
> diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
> index 7092392..a5df511 100644
> --- a/arch/s390/net/bpf_jit_comp.c
> +++ b/arch/s390/net/bpf_jit_comp.c
> @@ -881,7 +881,9 @@ void bpf_jit_free(struct sk_filter *fp)
>  	struct bpf_binary_header *header = (void *)addr;
>  
>  	if (fp->bpf_func == sk_run_filter)
> -		return;
> +		goto free_filter;
>  	set_memory_rw(addr, header->pages);
>  	module_free(NULL, header);
> +free_filter:
> +	kfree(fp);
>  }

For the s390 part:

Acked-by: Heiko Carstens <heiko.carst...@de.ibm.com>

Re: [PATCH] Fixed typo on word accounting in kprobes.c in mutliple architectures

2013-09-20 Thread Heiko Carstens

On Thu, Sep 19, 2013 at 02:33:58AM +0530, Anoop Thomas Mathew wrote:
> Signed-off-by: Anoop Thomas Mathew <a...@profoundis.com>
> ---
>  arch/arc/kernel/kprobes.c     |    2 +-
>  arch/ia64/kernel/kprobes.c    |    2 +-
>  arch/powerpc/kernel/kprobes.c |    2 +-
>  arch/s390/kernel/kprobes.c    |    2 +-
>  arch/sparc/kernel/kprobes.c   |    2 +-
>  5 files changed, 5 insertions(+), 5 deletions(-)

Please send trivial typo fixes to Jiri Kosina <triv...@kernel.org>.
See TRIVIAL PATCHES in MAINTAINERS. Thanks!

Re: Build regressions/improvements in v3.7-rc2

2012-10-23 Thread Heiko Carstens

On Mon, Oct 22, 2012 at 09:50:26PM +0200, Geert Uytterhoeven wrote:
> On Mon, Oct 22, 2012 at 9:47 PM, Geert Uytterhoeven
> <ge...@linux-m68k.org> wrote:
> > JFYI, when comparing v3.7-rc2 to v3.7-rc1[3], the summaries are:
> >   - build errors: +4/-44
> 
>   + arch/s390/include/asm/kvm_para.h: error: redefinition of
> 'kvm_arch_para_features':  => 147:28, 147:99
>   + arch/s390/include/asm/kvm_para.h: error: redefinition of
> 'kvm_check_and_clear_guest_paused':  => 152:91, 152:20
> 
> s390-allmodconfig/s390-allyesconfig/s390-defconfig

Thanks Geert. We already have a build fix for this from David Howells
which is waiting to be merged upstream.

Re: [RFC PATCH v5 12/19] memory-hotplug: introduce new function arch_remove_memory()

2012-07-30 Thread Heiko Carstens

On Fri, Jul 27, 2012 at 06:32:15PM +0800, Wen Congyang wrote:
> We don't call __add_pages() directly in the function add_memory()
> because some other architecture related things need to be done
> before or after calling __add_pages(). So we should introduce
> a new function arch_remove_memory() to revert the things
> done in arch_add_memory().
> 
> Note: the function for s390 is not implemented(I don't know how to
> implement it for s390).

There is no hardware or firmware interface which could trigger a
hot memory remove on s390. So there is nothing that needs to be
implemented.

Re: [PATCH v2 1/2] perf: Move arch specific code into separate arch directory

2010-04-14 Thread Heiko Carstens

On Wed, Apr 14, 2010 at 07:46:12AM -0700, Masami Hiramatsu wrote:
> Ian Munsie wrote:
> > From: Ian Munsie <imun...@au.ibm.com>
> > 
> > The perf userspace tool included some architecture specific code to map
> > registers from the DWARF register number into the names used by the regs
> > and stack access API.
> > 
> > This patch moves the architecture specific code out into a separate
> > arch/x86 directory along with the infrastructure required to use it.
> > 
> > Signed-off-by: Ian Munsie <imun...@au.ibm.com>
> > ---
> > Changes since v1: From Masami Hiramatsu's suggestion, I added a check in the
> > Makefile for if the arch specific Makefile defines PERF_HAVE_DWARF_REGS,
> > printing a message during build if it has not. This simplifies the code
> > removing the odd macro from the previous version and the need for an arch
> > specific arch_dwarf-regs.h. I have not entirely disabled DWARF support for
> > architectures that don't implement the register mappings, so that they can
> > still add a probe based on a line number (they will be missing the ability to
> > capture the value of a variable from a register).
> 
> Hmm, sorry, I don't think it is a good way to go... IMHO, porting dwarf-regs.c
> is so easy (you can just refer systemtap/runtime/loc2c-runtime.h), easier
> than porting kprobe-tracer on another arch. And perf is a part of kernel tree.
> It means that someone who are porting kprobe-tracer, he should port
> dwarf-regs.c too. In that case, PERF_HAVE_DWARF_REGS flag will be used only
> between those two patches in same patchset. So, I suggested you to drop dwarf
> support if dwarf-regs mapping doesn't exist.
> 
> AFAIK, at this point, only s390 users are affected. I'd like to ask
> them to just port a register mapping on perf and test it too.

Hm, I'm a bit lost here. Probably due to lack of context. What would be missing
on s390 and what am I supposed to implement and how can I test it?
Any pointers to git commits?
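
For context, the register mapping discussed above is essentially a lookup table from DWARF register numbers to register names. The following is a minimal sketch for x86-64, assuming the DWARF numbering from the System V AMD64 psABI. The table name x86_64_dwarf_regs and the helper dwarf_regnum_to_name() are hypothetical and are not taken from the perf sources; an s390 port would need its own table.

```c
#include <stddef.h>

/*
 * DWARF register numbers 0..15 for x86-64, in the order defined by the
 * System V AMD64 psABI. The "%name" spelling is an assumption made for
 * this sketch.
 */
static const char *const x86_64_dwarf_regs[] = {
	"%rax", "%rdx", "%rcx", "%rbx", "%rsi", "%rdi", "%rbp", "%rsp",
	"%r8",  "%r9",  "%r10", "%r11", "%r12", "%r13", "%r14", "%r15",
};

/*
 * Hypothetical helper: translate a DWARF register number into a register
 * name, or return NULL if the number is not covered by the table.
 */
static const char *dwarf_regnum_to_name(unsigned int n)
{
	size_t count = sizeof(x86_64_dwarf_regs) / sizeof(x86_64_dwarf_regs[0]);

	return n < count ? x86_64_dwarf_regs[n] : NULL;
}
```

With such a table, a caller would look up the number found in the debug info and refuse to create a register-based probe when the helper returns NULL.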

Re: [PowerPC] 2.6.33-git11 : Badness at kernel/kprobes.c:264

2010-03-08 Thread Heiko Carstens

On Sat, Mar 06, 2010 at 01:40:46PM +0530, Sachin Sant wrote:
 With latest 33 git(2.6.33-git11 : 64096c1741...) on a POWER6 box
 
 type=2000 audit(1267853400.180:1): initialized
 Kprobe smoke test started
 ------------[ cut here ]------------
 Badness at kernel/kprobes.c:264
 NIP: c06251e0 LR: c0625190 CTR: c007914c
 REGS: c000fecc3680 TRAP: 0700   Not tainted  (2.6.33-git11-autotest)
 MSR: 80029032 EE,ME,CE,IR,DR  CR: 2448  XER: 200b
 TASK = c000feca[1] 'swapper' THREAD: c000fecc CPU: 2
 GPR00: 0001 c000fecc3900 c0b297b0 c000fc68
 GPR04: 0004  24022024 c0a2b9d0
 GPR08: 4000 c000fc680004 0001 0004
 GPR12: 2224 c0bc2b00 00051bc3 00051aa1
 GPR16: 00051bbb 00d0 c08011f8 c07f1ba1
 GPR20: 015e87a8 c08e87a8 c000fecc3cc8 c000fecc3cd0
 GPR24: c000fecc3cd8 c000fecc3cc0 c000fecc3be0 
 GPR28:  c0a2b8b8 c0a94888 d0bd0004
 NIP [c06251e0] .free_insn_slot+0x84/0x12c
 LR [c0625190] .free_insn_slot+0x34/0x12c
 Call Trace:
 [c000fecc3900] [c0625190] .free_insn_slot+0x34/0x12c (unreliable)
 [c000fecc3990] [c0622050] .arch_remove_kprobe+0x28/0x48
 [c000fecc3a10] [c0623f58] .__unregister_kprobe_bottom+0x28/0x8c
 [c000fecc3aa0] [c062419c] .unregister_kprobes+0xc0/0xf0
 [c000fecc3b40] [c06241ec] .unregister_kprobe+0x20/0x30
 [c000fecc3bb0] [c00e081c] .init_test_probes+0xc4/0x66c
 [c000fecc3c50] [c08c288c] .init_kprobes+0x1f0/0x230
 [c000fecc3e30] [c00097a4] .do_one_initcall+0x88/0x1bc
 [c000fecc3ee0] [c08a0490] .kernel_init+0x220/0x2dc
 [c000fecc3f90] [c002c4d0] .kernel_thread+0x54/0x70
 Instruction dump:
 7c00f850 7c804b92 2fa4 419c007c 7d0a5b92 7fa44000 409c0070 7d232214
 88090020 6802 7cd0 78000fe0 0b00 2fbc 419e0044 8123001c
 
 2.6.33-git10(64ba99267...) was OK.
 
 This WARN_ON was introduced by commit 4610ee1d36...
 
 kprobes: Introduce generic insn_slot framework

FWIW, same on s390...

Re: [PATCH] perf_counter/powerpc: Fix compilation after perf_counter_overflow change

2009-09-21 Thread Heiko Carstens

On Mon, Sep 21, 2009 at 09:30:43AM +0200, Ingo Molnar wrote:
 
 * Metzger, Markus T markus.t.metz...@intel.com wrote:
 
  -Original Message-
  From: Paul Mackerras [mailto:pau...@samba.org]
  Sent: Monday, September 21, 2009 8:45 AM
  
  
  Markus, please take care in future to mention it in the changelog if
  your patches touch definitions used by other architectures.  If you
  could go so far as to use grep a bit more and fix up other
  architectures' callsites for the things you're changing, that would be
  very much appreciated.  Thanks.
  
  I'm sorry I missed that.
  
  There's one more place in arch/sparc/.
  The below patch should fix it, but I have no means to test it.
 
 You also missed a third thing:
 
 +static inline int
 +perf_output_begin(struct perf_output_handle *handle, struct perf_counter *c,
 + unsigned int size, int nmi, int sample)   { }
 
 an 'int' function returning void ...
 
 Plus the whole !PERF_COUNTERS branch of empty inlines is pointless - these
 facilities are used by perfcounters code only. I fixed that too.

Hi Ingo,

did you fix all of these warnings for !PERF_COUNTERS?

include/linux/perf_counter.h: In function 'perf_output_begin':
include/linux/perf_counter.h:854: warning: no return statement in function returning non-void
include/linux/perf_counter.h: At top level:
include/linux/perf_counter.h:863: warning: 'struct perf_sample_data' declared inside parameter list
include/linux/perf_counter.h:863: warning: its scope is only this definition or declaration, which is probably not what you want
include/linux/perf_counter.h:868: warning: 'struct perf_sample_data' declared inside parameter list
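
Both kinds of warning have a standard remedy in C, sketched below under stated assumptions. Forward-declaring the structs gives their names file scope instead of parameter-list scope, and an int stub must return a value. The return value 0 and the helper example_sample_stub() are assumptions made for illustration; per the quoted reply, the actual fix was to drop the empty inlines altogether.

```c
/*
 * Forward declarations: without these, a struct first named inside a
 * parameter list is scoped to that one declaration only, which is what
 * the "declared inside parameter list" warning complains about.
 */
struct perf_output_handle;
struct perf_counter;
struct perf_sample_data;

/*
 * Stub for a configuration without perf counters. It is declared as
 * returning int, so it has to return something; 0 is an assumption here.
 */
static inline int
perf_output_begin(struct perf_output_handle *handle, struct perf_counter *c,
		  unsigned int size, int nmi, int sample)
{
	return 0;
}

/* Hypothetical stub that takes the forward-declared sample data struct. */
static inline void
example_sample_stub(struct perf_counter *c, struct perf_sample_data *data)
{
}
```

Pointers to incomplete struct types are sufficient for such stubs, so no full struct definitions are needed.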

Re: [PATCH v3 0/3] cpu: pseries: Cpu offline states framework

2009-09-16 Thread Heiko Carstens

On Tue, Sep 15, 2009 at 08:28:34PM +0530, Balbir Singh wrote:
 * Peter Zijlstra a.p.zijls...@chello.nl [2009-09-15 14:11:41]:
 
  On Tue, 2009-09-15 at 17:36 +0530, Gautham R Shenoy wrote:
   This patchset contains the offline state driver implemented for
   pSeries. For pSeries, we define three available_hotplug_states. They are:
   
   online: The processor is online.
   
   offline: This is the default behaviour when the cpu is offlined
   even in the absence of this driver. The CPU would make an
   rtas_stop_self() call and hand the CPU back to the resource pool,
   thereby effectively deallocating that vCPU from the LPAR.
   NOTE: This would result in a configuration change to the LPAR
   which is visible to the outside world.
   
   inactive: This cedes the vCPU to the hypervisor with a cede latency
   specifier value 2.
   NOTE: This option does not result in a configuration change
   and the vCPU would be still entitled to the LPAR to which it earlier
   belonged.
   
   Any feedback on the patchset will be immensely valuable.
  
  I still think it's a layering violation... it's the hypervisor manager
  that should be bothered with what state an off-lined cpu is in.
 
 
 From a design standpoint, where we stand today is:
 
 1. A cede indicates that the CPU is no longer needed and can be
 reassigned (remember we do dedicated CPU partitions in power)
 2. What this patch is trying to do is to say "We don't need the
 CPU, but please don't reassign, put it to sleep"

FWIW, this sounds exactly like what we already have on s390.
But back then I didn't think that adding a common code infrastructure
would make sense :)

Besides the online attribute we have an additional configure
attribute which can only be written to if the cpu is offline.
Writing a 0 to it means that you currently don't need the cpu
anymore and the hypervisor is free to reassign the cpu to a different
LPAR.
Writing a 1 to it means you want to use it. If there are enough
resources you will get it. If not.. bad luck.
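
The rules described above can be modelled as a small state machine. This is only a sketch of the described semantics, not the s390 implementation: struct cpu_state, cpu_write_configure(), the resources_available parameter and the specific error codes are all assumptions chosen for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of one cpu with its "online" and "configure" attributes. */
struct cpu_state {
	bool online;
	bool configured;
};

/*
 * Model a write to the configure attribute.
 * - Writes are only accepted while the cpu is offline.
 * - 0: the cpu is given up, the hypervisor may reassign it.
 * - 1: the cpu is requested; this succeeds only if resources are available.
 * The error codes are assumptions made for this sketch.
 */
static int cpu_write_configure(struct cpu_state *cpu, int val,
			       bool resources_available)
{
	if (val != 0 && val != 1)
		return -EINVAL;
	if (cpu->online)
		return -EBUSY;
	if (val == 0) {
		cpu->configured = false;
		return 0;
	}
	if (!resources_available)
		return -EAGAIN;	/* "bad luck" */
	cpu->configured = true;
	return 0;
}
```

In this model a cpu has to be taken offline first, then deconfigured; getting it back later depends on whether the hypervisor still has resources to hand out.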

Re: [PATCH] Fix ext4 bitops

2008-02-04 Thread Heiko Carstens

| fs/ext4/mballoc.c: In function 'ext4_mb_generate_buddy':
| fs/ext4/mballoc.c:954: error: implicit declaration of function 'generic_find_next_le_bit'

The s390 specific bitops uses parts of the generic implementation.
Include the correct header.
   
   That doesn't work:
   
   fs/built-in.o: In function `ext4_mb_release_inode_pa':
   mballoc.c:(.text+0x95a8a): undefined reference to `generic_find_next_le_bit'
   fs/built-in.o: In function `ext4_mb_init_cache':
   mballoc.c:(.text+0x967ea): undefined reference to `generic_find_next_le_bit'
   
   This still needs generic_find_next_le_bit which comes
   from lib/find_next_bit.c. That one doesn't get built on s390 since we
   don't set GENERIC_FIND_NEXT_BIT.
   Currently we have the lengthy patch below queued.
  
  Similar issue on m68k. As Bastian also saw it on powerpc, I'm getting the
  impression the ext4 people don't (compile) test on big endian machines?
  
  Gr{oetje,eeting}s,
  
 
 I have sent these patches to linux-arch expecting a review from
 different arch people. It is true that the patches are tested only on
 powerpc, x86-64, x86. That's the primary reason for my sending the
 patches to linux-arch.

Is there anything special I need to do so the ext4 code actually uses
ext2_find_next_bit()? Haven't looked at the ext4 code, but I'd like to
test if the s390 implementation is ok.
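
As background on why big-endian architectures need a dedicated helper here: on-disk bitmaps of this kind use little-endian bit numbering, where bit n lives in byte n / 8 at bit position n % 8, independent of the host byte order. The byte-wise search below is a minimal, unoptimised sketch of that numbering; the name le_bitmap_find_next_bit() is hypothetical and this is not the kernel's implementation.

```c
#include <stddef.h>

/*
 * Find the next set bit in a little-endian bitmap, starting at bit
 * "offset". Bit n is stored in byte n / 8 at position n % 8, so the result
 * is the same on little-endian and big-endian hosts.
 * Returns "size" if no set bit is found, following the usual
 * find_next_bit convention.
 */
static size_t le_bitmap_find_next_bit(const unsigned char *map, size_t size,
				      size_t offset)
{
	for (size_t bit = offset; bit < size; bit++) {
		if (map[bit / 8] & (1u << (bit % 8)))
			return bit;
	}
	return size;
}
```

A word-based search using native unsigned long accesses would number the bits differently on a big-endian machine, which is why a byte-order-aware variant has to be provided there.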
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev

Re: [PATCH] powerpc: add new required termio functions

2007-09-12 Thread Heiko Carstens

On Wed, Sep 12, 2007 at 12:04:39PM +1000, Michael Neuling wrote:
 The "tty: termios locking functions break with new termios type" patch
 (f629307c857c030d5a3dd777fee37c8bb395e171) breaks the powerpc compile.
 [...]
 I'm guessing other architectures are broken too?

FWIW, the above quoted patch breaks s390 as well.

Re: [PATCH] powerpc: add new required termio functions

2007-09-12 Thread Heiko Carstens

On Wed, Sep 12, 2007 at 12:34:09PM +0100, Christoph Hellwig wrote:
 On Wed, Sep 12, 2007 at 04:01:09AM -0700, Andrew Morton wrote:
  On Wed, 12 Sep 2007 12:20:32 +0200 Heiko Carstens [EMAIL PROTECTED] wrote:
  
   On Wed, Sep 12, 2007 at 12:04:39PM +1000, Michael Neuling wrote:
    The "tty: termios locking functions break with new termios type" patch
    (f629307c857c030d5a3dd777fee37c8bb395e171) breaks the powerpc compile.
    [...]
    I'm guessing other architectures are broken too?
   
   FWIW, the above quoted patch breaks s390 as well.
  
  Does this fix it?

Yes, it does.

 I might be missing something, but the right fix is probably to
 apply the arch patches from Alan to powerpc and s390.  We don't want to
 be left over without all the nice termios features on these platforms,
 do we?

But not in the rc6 timeframe, I would guess?

