[Bug 203125] Kernel 5.1-rc1 fails to boot on a PowerMac G4 3,6: Caused by (from SRR1=141020): Transfer error ack signal

2019-05-09 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=203125

--- Comment #10 from Christophe Leroy (christophe.le...@c-s.fr) ---
Thanks for testing.

Proposed patch at https://patchwork.ozlabs.org/patch/1097492/

Note that the backport to stable should be straight forward as the directory
and file name changes are in following commit:
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=17312f258cf6eb584f276ad592972ade7e16e318

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

Re: [PATCH] vsprintf: Do not break early boot with probing addresses

2019-05-09 Thread Linus Torvalds
[ Sorry about html and mobile crud, I'm not at the computer right now ]

How about we just undo the whole misguided probe_kernel_address() thing?

The excuse for is was that it would avoid crashes.

It turns out that was wrong, and that it just made things much worse.
Honestly, we haven't really had the crashes that the logic was supposed to
protect against in the first place, so by now it's clear that the whole
thing was a stupid and horrible mistake in the first place.

So let's admit that and just go back to the old sane model.

Seriously, we've never needed that odd probing. It causes issues. It's
wrong and pointless.

   Linus

On Thu, May 9, 2019, 21:32 Sergey Senozhatsky <
sergey.senozhatsky.w...@gmail.com> wrote:

> On (05/09/19 14:19), Petr Mladek wrote:
> > 1. Report on Power:
> >
> > Kernel crashes very early during boot with with CONFIG_PPC_KUAP and
> > CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG
> >
> > The problem is the combination of some new code called via printk(),
> > check_pointer() which calls probe_kernel_read(). That then calls
> > allow_user_access() (PPC_KUAP) and that uses mmu_has_feature() too early
> > (before we've patched features). With the JUMP_LABEL debug enabled that
> > causes us to call printk() & dump_stack() and we end up recursing and
> > overflowing the stack.
>
> Hmm... hmm... PPC does an .opd-based symbol dereference, which
> eventually probe_kernel_read()-s. So early printk(%pS) will do
>
> printk(%pS)
>  dereference_function_descriptor()
>   probe_kernel_address()
>dump_stack()
> printk(%pS)
>  dereference_function_descriptor()
>   probe_kernel_address()
>dump_stack()
> printk(%pS)
>  ...
>
> I'd say... that it's not vsprintf that we want to fix, it's
> the idea that probe_kernel_address() can dump_stack() on any
> platform. On some archs probe_kernel_address()->dump_stack()
> is going nowhere:
>  dump_stack() does probe_kernel_address(), which calls dump_stack(),
>  which calls printk(%pS)->probe_kernel_address() again and again,
>  and again.
>
> -ss
>


Re: [PATCH] vsprintf: Do not break early boot with probing addresses

2019-05-09 Thread Sergey Senozhatsky
On (05/09/19 21:47), Linus Torvalds wrote:
>[ Sorry about html and mobile crud, I'm not at the computer right now ]
>How about we just undo the whole misguided probe_kernel_address() thing?

But the problem will remain - %pS/%pF on PPC (and some other arch-s)
do dereference_function_descriptor(), which calls probe_kernel_address().
So if probe_kernel_address() starts to dump_stack(), then we are heading
towards stack overflow. Unless I'm totally missing something.

-ss


Re: [PATCH] vsprintf: Do not break early boot with probing addresses

2019-05-09 Thread Sergey Senozhatsky
On (05/09/19 14:19), Petr Mladek wrote:
> 1. Report on Power:
> 
> Kernel crashes very early during boot with with CONFIG_PPC_KUAP and
> CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG
> 
> The problem is the combination of some new code called via printk(),
> check_pointer() which calls probe_kernel_read(). That then calls
> allow_user_access() (PPC_KUAP) and that uses mmu_has_feature() too early
> (before we've patched features). With the JUMP_LABEL debug enabled that
> causes us to call printk() & dump_stack() and we end up recursing and
> overflowing the stack.

Hmm... hmm... PPC does an .opd-based symbol dereference, which
eventually probe_kernel_read()-s. So early printk(%pS) will do

printk(%pS)
 dereference_function_descriptor()
  probe_kernel_address()
   dump_stack()
printk(%pS)
 dereference_function_descriptor()
  probe_kernel_address()
   dump_stack()
printk(%pS)
 ...

I'd say... that it's not vsprintf that we want to fix, it's
the idea that probe_kernel_address() can dump_stack() on any
platform. On some archs probe_kernel_address()->dump_stack()
is going nowhere:
 dump_stack() does probe_kernel_address(), which calls dump_stack(),
 which calls printk(%pS)->probe_kernel_address() again and again,
 and again.

-ss


Re: [PATCH v10 11/12] ima: Define ima-modsig template

2019-05-09 Thread Mimi Zohar
On Thu, 2019-04-18 at 00:51 -0300, Thiago Jung Bauermann wrote:
> Define new "d-modsig" template field which holds the digest that is
> expected to match the one contained in the modsig, and also new "modsig"
> template field which holds the appended file signature.
> 
> Add a new "ima-modsig" defined template descriptor with the new fields as
> well as the ones from the "ima-sig" descriptor.
> 
> Change ima_store_measurement() to accept a struct modsig * argument so that
> it can be passed along to the templates via struct ima_event_data.
> 
> Suggested-by: Mimi Zohar 
> Signed-off-by: Thiago Jung Bauermann 

Thanks, Roberto.  Just some thoughts inline below.

Reviewed-by: Mimi Zohar 

> ---



> +/*
> + * Validating the appended signature included in the measurement list 
> requires
> + * the file hash calculated without the appended signature (i.e., the 
> 'd-modsig'
> + * field). Therefore, notify the user if they have the 'modsig' field but not
> + * the 'd-modsig' field in the template.
> + */
> +static void check_current_template_modsig(void)
> +{
> +#define MSG "template with 'modsig' field also needs 'd-modsig' field\n"
> + struct ima_template_desc *template;
> + bool has_modsig, has_dmodsig;
> + static bool checked;
> + int i;
> +
> + /* We only need to notify the user once. */
> + if (checked)
> + return;
> +
> + has_modsig = has_dmodsig = false;
> + template = ima_template_desc_current();
> + for (i = 0; i < template->num_fields; i++) {
> + if (!strcmp(template->fields[i]->field_id, "modsig"))
> + has_modsig = true;
> + else if (!strcmp(template->fields[i]->field_id, "d-modsig"))
> + has_dmodsig = true;
> + }
> +
> + if (has_modsig && !has_dmodsig)
> + pr_notice(MSG);
> +
> + checked = true;
> +#undef MSG
> +}
> +

There was some recent discussion about supporting per IMA policy rule
template formats.  This feature will allow just the kexec kernel image
to require ima-modsig.  When per policy rule template formats support
is upstreamed, this function will need to be updated.


> 
> @@ -389,3 +425,25 @@ int ima_eventsig_init(struct ima_event_data *event_data,
>   return ima_write_template_field_data(xattr_value, event_data->xattr_len,
>DATA_FMT_HEX, field_data);
>  }
> +
> +int ima_eventmodsig_init(struct ima_event_data *event_data,
> +  struct ima_field_data *field_data)
> +{
> + const void *data;
> + u32 data_len;
> + int rc;
> +
> + if (!event_data->modsig)
> + return 0;
> +
> + /*
> +  * The xattr_value for IMA_MODSIG is a runtime structure containing
> +  * pointers. Get its raw data instead.
> +  */

"xattr_value"?  The comment needs some clarification.

Mimi

> + rc = ima_modsig_serialize(event_data->modsig, , _len);
> + if (rc)
> + return rc;
> +
> + return ima_write_template_field_data(data, data_len,
> +  DATA_FMT_HEX, field_data);
> +}



Re: [PATCH v10 09/12] ima: Implement support for module-style appended signatures

2019-05-09 Thread Mimi Zohar
Hi Thiago,

> diff --git a/security/integrity/ima/ima_policy.c 
> b/security/integrity/ima/ima_policy.c
> index fca7a3f23321..a7a20a8c15c1 100644
> --- a/security/integrity/ima/ima_policy.c
> +++ b/security/integrity/ima/ima_policy.c
> @@ -1144,6 +1144,12 @@ void ima_delete_rules(void)
>   }
>  }
>  
> +#define __ima_hook_stringify(str)(#str),
> +
> +const char *const func_tokens[] = {
> + __ima_hooks(__ima_hook_stringify)
> +};
> +
>  #ifdef   CONFIG_IMA_READ_POLICY
>  enum {
>   mask_exec = 0, mask_write, mask_read, mask_append
> @@ -1156,12 +1162,6 @@ static const char *const mask_tokens[] = {
>   "MAY_APPEND"
>  };
>  
> -#define __ima_hook_stringify(str)(#str),
> -
> -static const char *const func_tokens[] = {
> - __ima_hooks(__ima_hook_stringify)
> -};
> -
>  void *ima_policy_start(struct seq_file *m, loff_t *pos)
>  {
>   loff_t l = *pos;

Is moving this something left over from previous versions or there is
a need for this change?

Other than this, the patch looks good.

Mimi



Re: [PATCH v2 4/8] mm/memory_hotplug: Create memory block devices after arch_add_memory()

2019-05-09 Thread David Hildenbrand
> Looks good to me.
> 
>>
>> (I would actually even prefer "memory_block_devices", because memory
>> blocks have different meanins)
>>
> 
> Agree with you, this comes to my mind sometime ago :-)

We have memblocks, memory_blocks  ... I guess memory_block_device is
unique :)

> 
>>>
>>> [...]
>>>
 /*
 @@ -1106,6 +1100,13 @@ int __ref add_memory_resource(int nid, struct 
 resource *res)
if (ret < 0)
goto error;

 +  /* create memory block devices after memory was added */
 +  ret = hotplug_memory_register(start, size);
 +  if (ret) {
 +  arch_remove_memory(nid, start, size, NULL);
>>>
>>> Functionally, it works I think.
>>>
>>> But arch_remove_memory() would remove pages from zone. At this point, we 
>>> just
>>> allocate section/mmap for pages, the zones are empty and pages are not
>>> connected to zone.
>>>
>>> Function  zone = page_zone(page); always gets zone #0, since pages->flags 
>>> is 0
>>> at  this point. This is not exact.
>>>
>>> Would we add some comment to mention this? Or we need to clean up
>>> arch_remove_memory() to take out __remove_zone()?
>>
>> That is precisely what is on my list next (see cover letter).This is
>> already broken when memory that was never onlined is removed again.
>> So I am planning to fix that independently.
>>
> 
> Sounds great :-)

Especially, I suspect a lot of bugs in the area of

1. Remove memory that has never been onlined
2. Remove memory that has been onlined/offlined a couple of times
3. Remove memory that has been onlined to different zones.

Will see when refactoring if my intuition is right :)

> 
> Hope you would cc me in the following series.


Sure! Thanks!


-- 

Thanks,

David / dhildenb


Re: [PATCH v2 4/8] mm/memory_hotplug: Create memory block devices after arch_add_memory()

2019-05-09 Thread Wei Yang
On Thu, May 09, 2019 at 04:58:56PM +0200, David Hildenbrand wrote:
>On 09.05.19 16:31, Wei Yang wrote:
>> On Tue, May 07, 2019 at 08:38:00PM +0200, David Hildenbrand wrote:
>>> Only memory to be added to the buddy and to be onlined/offlined by
>>> user space using memory block devices needs (and should have!) memory
>>> block devices.
>>>
>>> Factor out creation of memory block devices Create all devices after
>>> arch_add_memory() succeeded. We can later drop the want_memblock parameter,
>>> because it is now effectively stale.
>>>
>>> Only after memory block devices have been added, memory can be onlined
>>> by user space. This implies, that memory is not visible to user space at
>>> all before arch_add_memory() succeeded.
>>>
>>> Cc: Greg Kroah-Hartman 
>>> Cc: "Rafael J. Wysocki" 
>>> Cc: David Hildenbrand 
>>> Cc: "mike.tra...@hpe.com" 
>>> Cc: Andrew Morton 
>>> Cc: Ingo Molnar 
>>> Cc: Andrew Banman 
>>> Cc: Oscar Salvador 
>>> Cc: Michal Hocko 
>>> Cc: Pavel Tatashin 
>>> Cc: Qian Cai 
>>> Cc: Wei Yang 
>>> Cc: Arun KS 
>>> Cc: Mathieu Malaterre 
>>> Signed-off-by: David Hildenbrand 
>>> ---
>>> drivers/base/memory.c  | 70 ++
>>> include/linux/memory.h |  2 +-
>>> mm/memory_hotplug.c| 15 -
>>> 3 files changed, 53 insertions(+), 34 deletions(-)
>>>
>>> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
>>> index 6e0cb4fda179..862c202a18ca 100644
>>> --- a/drivers/base/memory.c
>>> +++ b/drivers/base/memory.c
>>> @@ -701,44 +701,62 @@ static int add_memory_block(int base_section_nr)
>>> return 0;
>>> }
>>>
>>> +static void unregister_memory(struct memory_block *memory)
>>> +{
>>> +   BUG_ON(memory->dev.bus != _subsys);
>>> +
>>> +   /* drop the ref. we got via find_memory_block() */
>>> +   put_device(>dev);
>>> +   device_unregister(>dev);
>>> +}
>>> +
>>> /*
>>> - * need an interface for the VM to add new memory regions,
>>> - * but without onlining it.
>>> + * Create memory block devices for the given memory area. Start and size
>>> + * have to be aligned to memory block granularity. Memory block devices
>>> + * will be initialized as offline.
>>>  */
>>> -int hotplug_memory_register(int nid, struct mem_section *section)
>>> +int hotplug_memory_register(unsigned long start, unsigned long size)
>> 
>> One trivial suggestion about the function name.
>> 
>> For memory_block device, sometimes we use the full name
>> 
>> find_memory_block
>> init_memory_block
>> add_memory_block
>> 
>> But sometimes we use *nick* name
>> 
>> hotplug_memory_register
>> register_memory
>> unregister_memory
>> 
>> This is a little bit confusion.
>> 
>> Can we use one name convention here?
>
>We can just go for
>
>crate_memory_blocks() and free_memory_blocks(). Or do
>you have better suggestions?

s/crate/create/

Looks good to me.

>
>(I would actually even prefer "memory_block_devices", because memory
>blocks have different meanins)
>

Agree with you, this comes to my mind sometime ago :-)

>> 
>> [...]
>> 
>>> /*
>>> @@ -1106,6 +1100,13 @@ int __ref add_memory_resource(int nid, struct 
>>> resource *res)
>>> if (ret < 0)
>>> goto error;
>>>
>>> +   /* create memory block devices after memory was added */
>>> +   ret = hotplug_memory_register(start, size);
>>> +   if (ret) {
>>> +   arch_remove_memory(nid, start, size, NULL);
>> 
>> Functionally, it works I think.
>> 
>> But arch_remove_memory() would remove pages from zone. At this point, we just
>> allocate section/mmap for pages, the zones are empty and pages are not
>> connected to zone.
>> 
>> Function  zone = page_zone(page); always gets zone #0, since pages->flags is >> 0
>> at  this point. This is not exact.
>> 
>> Would we add some comment to mention this? Or we need to clean up
>> arch_remove_memory() to take out __remove_zone()?
>
>That is precisely what is on my list next (see cover letter).This is
>already broken when memory that was never onlined is removed again.
>So I am planning to fix that independently.
>

Sounds great :-)

Hope you would cc me in the following series.

>
>-- 
>
>Thanks,
>
>David / dhildenb

-- 
Wei Yang
Help you, Help me


Re: [PATCH] EDAC, mpc85xx: Prevent building as a module

2019-05-09 Thread Borislav Petkov
On Thu, May 09, 2019 at 04:55:34PM +0200, Borislav Petkov wrote:
> On Fri, May 10, 2019 at 12:52:05AM +1000, Michael Ellerman wrote:
> > Thanks. It would be nice if you could send it as a fix for 5.2, it's the
> > last thing blocking one of my allmodconfig builds. But if you don't
> > think it qualifies as a fix that's fine too, it can wait.
> 
> Sure, no problem. Will do a pull request later.

Hmm, so looking at this more, I was able to produce this config with my
ancient cross-compiler:

CONFIG_EDAC_SUPPORT=y
CONFIG_EDAC=m
CONFIG_EDAC_LEGACY_SYSFS=y
CONFIG_EDAC_MPC85XX=y

Now, mpc85xx_edac is built-in and edac_core.ko is a module
(CONFIG_EDAC=m) and that should not work - i.e., builtin code calling
module functions. But my cross-compiler is happily building this without
complaint. Or maybe I'm missing something.

In any case, I *think* the proper fix should be to do:

config EDAC_MPC85XX
bool "Freescale MPC83xx / MPC85xx"
depends on FSL_SOC && EDAC=y

so that you can't even produce the above invalid .config snippet.

Hmmm?

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


RE: [PATCH] vsprintf: Do not break early boot with probing addresses

2019-05-09 Thread David Laight
From: Michal Suchánek
> Sent: 09 May 2019 14:38
...
> > The problem is the combination of some new code called via printk(),
> > check_pointer() which calls probe_kernel_read(). That then calls
> > allow_user_access() (PPC_KUAP) and that uses mmu_has_feature() too early
> > (before we've patched features).
> 
> There is early_mmu_has_feature for this case. mmu_has_feature does not
> work before patching so parts of kernel that can run before patching
> must use the early_ variant which actually runs code reading the
> feature bitmap to determine the answer.

Does the early_ variant get patched so the it is reasonably
efficient after the 'patching' is done?
Or should there be a third version which gets patched across?

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



Re: [next-20190507][powerpc] WARN kernel/cgroup/cgroup.c:6008 with LTP ptrace01 test case

2019-05-09 Thread Roman Gushchin
On Thu, May 09, 2019 at 11:35:36AM +0530, Sachin Sant wrote:
> 
> 
> > On 09-May-2019, at 4:53 AM, Roman Gushchin  wrote:
> > 
> > On Wed, May 08, 2019 at 03:06:30PM +0530, Sachin Sant wrote:
> >> While running LTP tests(specifically ptrace01) following WARNING is 
> >> observed
> >> on POWER8 LPAR running next-20190507 built using 4K page size.
> >> 
> >> [ 3969.979492] msgrcv04 (433) used greatest stack depth: 9328 bytes left
> >> [ 3981.452911] madvise06 (515): drop_caches: 3
> >> [ 4004.575752] WARNING: CPU: 5 PID: 721 at kernel/cgroup/cgroup.c:6008 
> >> cgroup_exit+0x2ac/0x2c0
> >> [ 4004.575781] Modules linked in: overlay rpadlpar_io rpaphp 
> >> iptable_mangle xt_MASQUERADE iptable_nat nf_nat xt_conntrack nf_conntrack 
> >> nf_defrag_ipv4 ipt_REJECT nf_reject_ipv4 xt_tcpudp tun bridge stp llc kvm 
> >> iptable_filter pseries_rng rng_core vmx_crypto ip_tables x_tables autofs4 
> >> [last unloaded: dummy_del_mod]
> >> [ 4004.575837] CPU: 5 PID: 721 Comm: ptrace01 Tainted: G   O  
> >> 5.1.0-next-20190507-autotest #1
> >> [ 4004.575846] NIP:  c1b3026c LR: c1b30054 CTR: 
> >> c1c9f020
> >> [ 4004.575855] REGS: c00171fff840 TRAP: 0700   Tainted: G   O  
> >>  (5.1.0-next-20190507-autotest)
> >> [ 4004.575863] MSR:  80010282b033 
> >>   CR: 44004824  XER: 2000
> >> [ 4004.575885] CFAR: c1b30078 IRQMASK: 1 
> >> [ 4004.575885] GPR00: c1b30054 c00171fffad0 c3938700 
> >> c0027b02fa18 
> >> [ 4004.575885] GPR04: c0027b02fa00  c3ae8700 
> >> 001c180a 
> >> [ 4004.575885] GPR08: 0001 0001 c3ae8700 
> >> 0001 
> >> [ 4004.575885] GPR12: 4400 c0001ec7ea80 c3a4d670 
> >> 0009 
> >> [ 4004.575885] GPR16:  00040100 418004fc 
> >> 0843 
> >> [ 4004.575885] GPR20: 0009 0001 c001715e9200 
> >> c0016d8f4d00 
> >> [ 4004.575885] GPR24: c00171fffd90 0100 c00168692478 
> >> c00171fffb98 
> >> [ 4004.575885] GPR28: c00168692400 c0016d8f4d00 c36420d0 
> >> c0027b02fa00 
> >> [ 4004.575958] NIP [c1b3026c] cgroup_exit+0x2ac/0x2c0
> >> [ 4004.575966] LR [c1b30054] cgroup_exit+0x94/0x2c0
> >> [ 4004.575972] Call Trace:
> >> [ 4004.575979] [c00171fffad0] [c1b30054] 
> >> cgroup_exit+0x94/0x2c0 (unreliable)
> >> [ 4004.575990] [c00171fffb30] [c19cea98] do_exit+0x878/0x1ae0
> >> [ 4004.575999] [c00171fffc00] [c19cfe4c] 
> >> do_group_exit+0xac/0x1d0
> >> [ 4004.576009] [c00171fffc40] [c19ed00c] 
> >> get_signal+0x2bc/0x11c0
> >> [ 4004.576019] [c00171fffd30] [c1867b14] 
> >> do_notify_resume+0x384/0x900
> >> [ 4004.576029] [c00171fffe20] [c183e844] 
> >> ret_from_except_lite+0x70/0x74
> >> [ 4004.576037] Instruction dump:
> >> [ 4004.576043] 314a0001 7d40492d 40c2fff4 3d42001b e92a7288 39290001 
> >> f92a7288 4bfffe5c 
> >> [ 4004.576056] 3d42001b e92a7258 39290001 f92a7258 <0fe0> 4bfffe0c 
> >> 4be91e45 6000 
> >> [ 4004.576071] ---[ end trace 82a1a7c19005ebd6 ]—
> >> 
> >> The WARN_ONCE was added by following commit 
> >> 96b9c592def5 ("cgroup: get rid of cgroup_freezer_frozen_exit()”). 
> >> 
> >> Reverting the patch helps avoid the warning.
> > 
> > Hi Sachin!
> > 
> > Thank you for the report!
> > 
> > Can you, please, check that the patch at https://lkml.org/lkml/2019/5/8/938 
> > 
> > solves the problem?
> > 
> This patch fixes the problem for me.

Thank you for the confirmation!

The patch has just been pulled into the cgroup tree,
and will appear in next soon.


Thanks!


Re: Build failure in v4.4.y.queue (ppc:allmodconfig)

2019-05-09 Thread Guenter Roeck
On Fri, May 10, 2019 at 12:31:05AM +1000, Michael Ellerman wrote:
> Greg Kroah-Hartman  writes:
> > On Wed, May 08, 2019 at 01:26:42PM -0700, Guenter Roeck wrote:
> >> I see multiple instances of:
> >> 
> >> arch/powerpc/kernel/exceptions-64s.S:839: Error:
> >>attempt to move .org backwards
> >> 
> >> in v4.4.y.queue (v4.4.179-143-gc4db218e9451).
> >> 
> >> This is due to commit 9b2d4e06d7f1 ("powerpc/64s: Add support for a store
> >> forwarding barrier at kernel entry/exit"), which is part of a large patch
> >> series and can not easily be reverted.
> >> 
> >> Guess I'll stop doing ppc:allmodconfig builds in v4.4.y ?
> >
> > Michael, I thought this patch series was supposed to fix ppc issues, not
> > add to them :)
> 
> Well it fixes some, but creates others :}
> 
> > Any ideas on what to do here?
> 
> Turning off CONFIG_CBE_RAS (obscure IBM Cell Blade RAS features) is
> sufficient to get it building. Is that an option for your build tests
> Guenter?
> 
I could turn it off unconditionally, meaning it would affect all other
branches. I would rather stop building ppc:allmodconfig for v4.4.y.

Guenter


Re: [PATCH linux-next v10 0/7] ptrace: add PTRACE_GET_SYSCALL_INFO request

2019-05-09 Thread Oleg Nesterov
On 04/16, Dmitry V. Levin wrote:
>
> [Andrew, could you take this patchset into your tree, please?]

Just in case...

I have already acked 6/7.

Other patches look good to me too, just I don't think I can actually review
these non-x86 changes.

Oleg.



Re: [PATCH v10 08/12] ima: Factor xattr_verify() out of ima_appraise_measurement()

2019-05-09 Thread Mimi Zohar
On Thu, 2019-04-18 at 00:51 -0300, Thiago Jung Bauermann wrote:
> Verify xattr signature in a separate function so that the logic in
> ima_appraise_measurement() remains clear when it gains the ability to also
> verify an appended module signature.
> 
> The code in the switch statement is unchanged except for having to
> dereference the status and cause variables (since they're now pointers),
> and fixing the style of a block comment to appease checkpatch.
> 
> Suggested-by: Mimi Zohar 
> Signed-off-by: Thiago Jung Bauermann 

Reviewed-by:  Mimi Zohar 



Re: [PATCH v10 02/12] PKCS#7: Refactor verify_pkcs7_signature()

2019-05-09 Thread Mimi Zohar
On Thu, 2019-04-18 at 00:51 -0300, Thiago Jung Bauermann wrote:
> IMA will need to verify a PKCS#7 signature which has already been parsed.
> For this reason, factor out the code which does that from
> verify_pkcs7_signature() into a new function which takes a struct
> pkcs7_message instead of a data buffer.
> 
> Signed-off-by: Thiago Jung Bauermann 
> Reviewed-by: Mimi Zohar 
> Cc: David Howells 
> Cc: David Woodhouse 
> Cc: Herbert Xu 
> Cc: "David S. Miller" 

Reviewed-by: Mimi Zohar 



Re: [PATCH v10 01/12] MODSIGN: Export module signature definitions

2019-05-09 Thread Mimi Zohar
On Thu, 2019-04-18 at 00:51 -0300, Thiago Jung Bauermann wrote:
> IMA will use the module_signature format for append signatures, so export
> the relevant definitions and factor out the code which verifies that the
> appended signature trailer is valid.
> 
> Also, create a CONFIG_MODULE_SIG_FORMAT option so that IMA can select it
> and be able to use mod_check_sig() without having to depend on either
> CONFIG_MODULE_SIG or CONFIG_MODULES.
> 
> Signed-off-by: Thiago Jung Bauermann 
> Cc: Jessica Yu 

Just a couple minor questions/comments below.

Reviewed-by: Mimi Zohar 

> ---

< snip >


> diff --git a/init/Kconfig b/init/Kconfig
> index 4592bf7997c0..a71019553ee1 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1906,7 +1906,7 @@ config MODULE_SRCVERSION_ALL
>  config MODULE_SIG
>   bool "Module signature verification"
>   depends on MODULES
> - select SYSTEM_DATA_VERIFICATION
> + select MODULE_SIG_FORMAT
>   help
> Check modules for valid signatures upon load: the signature
> is simply appended to the module. For more information see
> @@ -2036,6 +2036,10 @@ config TRIM_UNUSED_KSYMS
>  
>  endif # MODULES
>  
> +config MODULE_SIG_FORMAT
> + def_bool n
> + select SYSTEM_DATA_VERIFICATION

Normally Kconfigs, in the same file, are defined before they are used.
 I'm not sure if that is required or just a convention.


>  config MODULES_TREE_LOOKUP
>   def_bool y
>   depends on PERF_EVENTS || TRACING
> diff --git a/kernel/Makefile b/kernel/Makefile
> index 6c57e78817da..d2f2488f80ab 100644
> --- a/kernel/Makefile
> +++ b/kernel/Makefile
> @@ -57,6 +57,7 @@ endif
>  obj-$(CONFIG_UID16) += uid16.o
>  obj-$(CONFIG_MODULES) += module.o
>  obj-$(CONFIG_MODULE_SIG) += module_signing.o
> +obj-$(CONFIG_MODULE_SIG_FORMAT) += module_signature.o
>  obj-$(CONFIG_KALLSYMS) += kallsyms.o
>  obj-$(CONFIG_BSD_PROCESS_ACCT) += acct.o
>  obj-$(CONFIG_CRASH_CORE) += crash_core.o
> diff --git a/kernel/module.c b/kernel/module.c
> index 985caa467aef..326ddeb364dd 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -19,6 +19,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> diff --git a/kernel/module_signature.c b/kernel/module_signature.c
> new file mode 100644
> index ..6d5e59f27f55
> --- /dev/null
> +++ b/kernel/module_signature.c
> @@ -0,0 +1,45 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +/*
> + * Module signature checker
> + *
> + * Copyright (C) 2012 Red Hat, Inc. All Rights Reserved.
> + * Written by David Howells (dhowe...@redhat.com)
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +/**
> + * mod_check_sig - check that the given signature is sane
> + *
> + * @ms:  Signature to check.
> + * @file_len:Size of the file to which @ms is appended.

"name" is missing.

Mimi

> + */



Re: [PATCH v10 06/12] ima: Use designated initializers for struct ima_event_data

2019-05-09 Thread Mimi Zohar
On Thu, 2019-04-18 at 00:51 -0300, Thiago Jung Bauermann wrote:
> Designated initializers allow specifying only the members of the struct
> that need initialization. Non-mentioned members are initialized to zero.
> 
> This makes the code a bit clearer (particularly in ima_add_boot_aggregate()
> and also allows adding a new member to the struct without having to update
> all struct initializations.
> 
> Signed-off-by: Thiago Jung Bauermann 

Reviewed-by:  Mimi Zohar 

> ---
>  security/integrity/ima/ima_api.c  | 11 +++
>  security/integrity/ima/ima_init.c |  4 ++--
>  2 files changed, 9 insertions(+), 6 deletions(-)
> 
> diff --git a/security/integrity/ima/ima_api.c 
> b/security/integrity/ima/ima_api.c
> index c7505fb122d4..0639d0631f2c 100644
> --- a/security/integrity/ima/ima_api.c
> +++ b/security/integrity/ima/ima_api.c
> @@ -133,8 +133,9 @@ void ima_add_violation(struct file *file, const unsigned 
> char *filename,
>  {
>   struct ima_template_entry *entry;
>   struct inode *inode = file_inode(file);
> - struct ima_event_data event_data = {iint, file, filename, NULL, 0,
> - cause};
> + struct ima_event_data event_data = { .iint = iint, .file = file,
> +  .filename = filename,
> +  .violation = cause };
>   int violation = 1;
>   int result;
>  
> @@ -284,8 +285,10 @@ void ima_store_measurement(struct integrity_iint_cache 
> *iint,
>   int result = -ENOMEM;
>   struct inode *inode = file_inode(file);
>   struct ima_template_entry *entry;
> - struct ima_event_data event_data = {iint, file, filename, xattr_value,
> - xattr_len, NULL};
> + struct ima_event_data event_data = { .iint = iint, .file = file,
> +  .filename = filename,
> +  .xattr_value = xattr_value,
> +  .xattr_len = xattr_len };
>   int violation = 0;
>  
>   if (iint->measured_pcrs & (0x1 << pcr))
> diff --git a/security/integrity/ima/ima_init.c 
> b/security/integrity/ima/ima_init.c
> index 6c9295449751..ef6c3a26296e 100644
> --- a/security/integrity/ima/ima_init.c
> +++ b/security/integrity/ima/ima_init.c
> @@ -49,8 +49,8 @@ static int __init ima_add_boot_aggregate(void)
>   const char *audit_cause = "ENOMEM";
>   struct ima_template_entry *entry;
>   struct integrity_iint_cache tmp_iint, *iint = _iint;
> - struct ima_event_data event_data = {iint, NULL, boot_aggregate_name,
> - NULL, 0, NULL};
> + struct ima_event_data event_data = { .iint = iint,
> +  .filename = boot_aggregate_name };
>   int result = -ENOMEM;
>   int violation = 0;
>   struct {



Re: [PATCH v10 03/12] PKCS#7: Introduce pkcs7_get_digest()

2019-05-09 Thread Mimi Zohar
On Thu, 2019-04-18 at 00:51 -0300, Thiago Jung Bauermann wrote:
> IMA will need to access the digest of the PKCS7 message (as calculated by
> the kernel) before the signature is verified, so introduce
> pkcs7_get_digest() for that purpose.
> 
> Also, modify pkcs7_digest() to detect when the digest was already
> calculated so that it doesn't have to do redundant work. Verifying that
> sinfo->sig->digest isn't NULL is sufficient because both places which
> allocate sinfo->sig (pkcs7_parse_message() and pkcs7_note_signed_info())
> use kzalloc() so sig->digest is always initialized to zero.
> 
> Signed-off-by: Thiago Jung Bauermann 
> Cc: David Howells 
> Cc: Herbert Xu 
> Cc: "David S. Miller" 

Reviewed-by: Mimi Zohar 



Re: [PATCH] powerpc/64s: Use early_mmu_has_feature() in set_kuap()

2019-05-09 Thread Michael Ellerman
On Wed, 2019-05-08 at 12:30:47 UTC, Michael Ellerman wrote:
> When implementing the KUAP support on Radix we fixed one case where
> mmu_has_feature() was being called too early in boot via
> __put_user_size().
> 
> However since then some new code in linux-next has created a new path
> via which we can end up calling mmu_has_feature() too early.
> 
> On P9 this leads to crashes early in boot if we have both PPC_KUAP and
> CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG enabled. Our early boot code
> calls printk() which calls probe_kernel_read(), that does a
> __copy_from_user_inatomic() which calls into set_kuap() and that uses
> mmu_has_feature().
> 
> At that point in boot we haven't patched MMU features yet so the debug
> code in mmu_has_feature() complains, and calls printk(). At that point
> we recurse, eg:
> 
>   ...
>   dump_stack+0xdc
>   probe_kernel_read+0x1a4
>   check_pointer+0x58
>   ...
>   printk+0x40
>   dump_stack_print_info+0xbc
>   dump_stack+0x8
>   probe_kernel_read+0x1a4
>   probe_kernel_read+0x19c
>   check_pointer+0x58
>   ...
>   printk+0x40
>   cpufeatures_process_feature+0xc8
>   scan_cpufeatures_subnodes+0x380
>   of_scan_flat_dt_subnodes+0xb4
>   dt_cpu_ftrs_scan_callback+0x158
>   of_scan_flat_dt+0xf0
>   dt_cpu_ftrs_scan+0x3c
>   early_init_devtree+0x360
>   early_setup+0x9c
> 
> And so on for infinity, symptom is a dead system.
> 
> Even more fun is what happens when using the hash MMU (ie. p8 or p9
> with Radix disabled), and when we don't have
> CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG enabled. With the debug disabled
> we don't check if static keys have been initialised, we just rely on
> the jump label. But the jump label defaults to true so we just whack
> the AMR even though Radix is not enabled.
> 
> Clearing the AMR is fine, but after we've done the user copy we write
> (0b11 << 62) into AMR. When using hash that makes all pages with key
> zero no longer readable or writable. All kernel pages implicitly have
> key zero, and so all of a sudden the kernel can't read or write any of
> its memory. Again dead system.
> 
> In the medium term we have several options for fixing this.
> probe_kernel_read() doesn't need to touch AMR at all, it's not doing a
> user access after all, but it uses __copy_from_user_inatomic() just
> because it's easy, we could fix that.
> 
> It would also be safe to default to not writing to the AMR during
> early boot, until we've detected features. But it's not clear that
> flipping all the MMU features to static_key_false won't introduce
> other bugs.
> 
> But for now just switch to early_mmu_has_feature() in set_kuap(), that
> avoids all the problems with jump labels. It adds the overhead of a
> global lookup and test, but that's probably trivial compared to the
> writes to the AMR anyway.
> 
> Fixes: 890274c2dc4c ("powerpc/64s: Implement KUAP for Radix MMU")
> Signed-off-by: Michael Ellerman 
> Reviewed-by: Russell Currey 

Applied to powerpc next.

https://git.kernel.org/powerpc/c/8150a153c013aa2dd1ffae43370b89ac

cheers


Re: [PATCH] powerpc/book3s/64: check for NULL pointer in pgd_alloc()

2019-05-09 Thread Michael Ellerman
On Mon, 2019-05-06 at 00:20:43 UTC, Rick Lindsley wrote:
> When the memset code was added to pgd_alloc(), it failed to consider that 
> kmem_cache_alloc() can return NULL.  It's uncommon, but not impossible under 
> heavy memory contention.
> 
> Signed-off-by: Rick Lindsley 
> Fixes: cf266dbcd2a7 ("Zero PGD pages on allocation")

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/f39356261c265a0689d7ee568132d516

cheers


Re: [PATCH v8 05/20] KVM: PPC: Book3S HV: Remove pmd_is_leaf()

2019-05-09 Thread Steven Price
On 29/04/2019 03:05, Paul Mackerras wrote:
> On Wed, Apr 03, 2019 at 03:16:12PM +0100, Steven Price wrote:
>> Since pmd_large() is now always available, pmd_is_leaf() is redundant.
>> Replace all uses with calls to pmd_large().
> 
> NAK.  I don't want to do this, because pmd_is_leaf() is purely about
> the guest page tables (the "partition-scoped" radix tree which
> specifies the guest physical to host physical translation), not about
> anything to do with the Linux process page tables.  The guest page
> tables have the same format as the Linux process page tables, but they
> are managed separately.

Fair enough, I'll drop this patch in the next posting.

> If it makes things clearer, I could rename it to "guest_pmd_is_leaf()"
> or something similar.

I'll leave that decision up to you - it might prevent similar confusion
in the future.

Steve


Re: [PATCH v2 4/8] mm/memory_hotplug: Create memory block devices after arch_add_memory()

2019-05-09 Thread David Hildenbrand
On 09.05.19 16:31, Wei Yang wrote:
> On Tue, May 07, 2019 at 08:38:00PM +0200, David Hildenbrand wrote:
>> Only memory to be added to the buddy and to be onlined/offlined by
>> user space using memory block devices needs (and should have!) memory
>> block devices.
>>
>> Factor out creation of memory block devices Create all devices after
>> arch_add_memory() succeeded. We can later drop the want_memblock parameter,
>> because it is now effectively stale.
>>
>> Only after memory block devices have been added, memory can be onlined
>> by user space. This implies, that memory is not visible to user space at
>> all before arch_add_memory() succeeded.
>>
>> Cc: Greg Kroah-Hartman 
>> Cc: "Rafael J. Wysocki" 
>> Cc: David Hildenbrand 
>> Cc: "mike.tra...@hpe.com" 
>> Cc: Andrew Morton 
>> Cc: Ingo Molnar 
>> Cc: Andrew Banman 
>> Cc: Oscar Salvador 
>> Cc: Michal Hocko 
>> Cc: Pavel Tatashin 
>> Cc: Qian Cai 
>> Cc: Wei Yang 
>> Cc: Arun KS 
>> Cc: Mathieu Malaterre 
>> Signed-off-by: David Hildenbrand 
>> ---
>> drivers/base/memory.c  | 70 ++
>> include/linux/memory.h |  2 +-
>> mm/memory_hotplug.c| 15 -
>> 3 files changed, 53 insertions(+), 34 deletions(-)
>>
>> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
>> index 6e0cb4fda179..862c202a18ca 100644
>> --- a/drivers/base/memory.c
>> +++ b/drivers/base/memory.c
>> @@ -701,44 +701,62 @@ static int add_memory_block(int base_section_nr)
>>  return 0;
>> }
>>
>> +static void unregister_memory(struct memory_block *memory)
>> +{
>> +BUG_ON(memory->dev.bus != _subsys);
>> +
>> +/* drop the ref. we got via find_memory_block() */
>> +put_device(>dev);
>> +device_unregister(>dev);
>> +}
>> +
>> /*
>> - * need an interface for the VM to add new memory regions,
>> - * but without onlining it.
>> + * Create memory block devices for the given memory area. Start and size
>> + * have to be aligned to memory block granularity. Memory block devices
>> + * will be initialized as offline.
>>  */
>> -int hotplug_memory_register(int nid, struct mem_section *section)
>> +int hotplug_memory_register(unsigned long start, unsigned long size)
> 
> One trivial suggestion about the function name.
> 
> For memory_block device, sometimes we use the full name
> 
> find_memory_block
> init_memory_block
> add_memory_block
> 
> But sometimes we use *nick* name
> 
> hotplug_memory_register
> register_memory
> unregister_memory
> 
> This is a little bit confusion.
> 
> Can we use one name convention here?

We can just go for

crate_memory_blocks() and free_memory_blocks(). Or do
you have better suggestions?

(I would actually even prefer "memory_block_devices", because memory
blocks have different meanins)

> 
> [...]
> 
>> /*
>> @@ -1106,6 +1100,13 @@ int __ref add_memory_resource(int nid, struct 
>> resource *res)
>>  if (ret < 0)
>>  goto error;
>>
>> +/* create memory block devices after memory was added */
>> +ret = hotplug_memory_register(start, size);
>> +if (ret) {
>> +arch_remove_memory(nid, start, size, NULL);
> 
> Functionally, it works I think.
> 
> But arch_remove_memory() would remove pages from zone. At this point, we just
> allocate section/mmap for pages, the zones are empty and pages are not
> connected to zone.
> 
> Function  zone = page_zone(page); always gets zone #0, since pages->flags is 0
> at  this point. This is not exact.
> 
> Would we add some comment to mention this? Or we need to clean up
> arch_remove_memory() to take out __remove_zone()?

That is precisely what is on my list next (see cover letter).This is
already broken when memory that was never onlined is removed again.
So I am planning to fix that independently.


-- 

Thanks,

David / dhildenb


Re: [PATCH] EDAC, mpc85xx: Prevent building as a module

2019-05-09 Thread Borislav Petkov
On Fri, May 10, 2019 at 12:52:05AM +1000, Michael Ellerman wrote:
> Thanks. It would be nice if you could send it as a fix for 5.2, it's the
> last thing blocking one of my allmodconfig builds. But if you don't
> think it qualifies as a fix that's fine too, it can wait.

Sure, no problem. Will do a pull request later.

Thx.

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


Re: [PATCH] EDAC, mpc85xx: Prevent building as a module

2019-05-09 Thread Michael Ellerman
Borislav Petkov  writes:
> On Mon, May 06, 2019 at 08:50:45AM +0200, Johannes Thumshirn wrote:
>> Acked-by: Johannes Thumshirn 
>
> Queued, thanks.

Thanks. It would be nice if you could send it as a fix for 5.2, it's the
last thing blocking one of my allmodconfig builds. But if you don't
think it qualifies as a fix that's fine too, it can wait.

cheers


Re: [PATCH] crypto: caam/jr - Remove extra memory barrier during job ring dequeue

2019-05-09 Thread Michael Ellerman
Herbert Xu  writes:
> On Thu, May 02, 2019 at 11:08:55AM +, Horia Geanta wrote:
>> >> +
>> >>  static inline void wr_reg32(void __iomem *reg, u32 data)
>> >>  {
>> >>   if (caam_little_end)
>> > 
>> > This crashes on my p5020ds. Did you test on powerpc?
>> > 
>> > # first bad commit: [bbfcac5ff5f26aafa51935a62eb86b6eacfe8a49] crypto: 
>> > caam/jr - Remove extra memory barrier during job ring dequeue
>> 
>> Thanks for the report Michael.
>> 
>> Any hint what would be the proper approach here - to have relaxed I/O 
>> accessors
>> that would work both for ARM and PPC, and avoid ifdeffery etc.?
>
> Since no fix has been found I'm reverting the commit in question.

Thanks.

Sorry I haven't had time to look into it with everything else going on
during the merge window.

cheers


Re: [PATCH] powerpc: Fix compile issue with force DAWR

2019-05-09 Thread Michael Ellerman
Christophe Leroy  writes:
> Le 08/05/2019 à 03:30, Michael Neuling a écrit :
>> If you compile with KVM but without CONFIG_HAVE_HW_BREAKPOINT you fail
>> at linking with:
>>arch/powerpc/kvm/book3s_hv_rmhandlers.o:(.text+0x708): undefined 
>> reference to `dawr_force_enable'
>> 
>> This was caused by this recent patch:
>> commit c1fe190c06723322f2dfac31d3b982c581e434ef
>> Author: Michael Neuling 
>> powerpc: Add force enable of DAWR on P9 option
>
> I think you should use the standard commit format, checkpatch will tell you.
>
>> 
>> This builds dawr_force_enable in always via a new file.
>
> Do we really need a new file just for that ?

I told him to make a new file, rather than drop it in some other file
that's already full of unrealted junk :)

I did mean for him to put all the dawr_force_enable code in the file
though, not just the variable.

> As far as I understand, it is always compiled in, so can't we use 
> another file like traps.o or setup-common.o or somewhere else ?
>
> Or just put an ifdef in arch/powerpc/kvm/book3s_hv_rmhandlers.S ?
> Because your fix will put dawr_force_enable on every build even the ones 
> who don't need it at all.

We don't want to use an ifdef in the KVM code, because that would break
the case where you want to enable the DAWR for use by guests but the
host kernel doesn't have PERF support.

It shouldn't be in obj-y, it should at least be 64-bit only.

But it should be pretty trivial to create a config symbol for it, with
something like:

config DAWR_FORCE_ENABLE
def_bool y
depends on PERF || KVM

cheers


Re: Crashes in linux-next on powerpc with CONFIG_PPC_KUAP and CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG

2019-05-09 Thread Michael Ellerman
Petr Mladek  writes:
> On Wed 2019-05-08 00:54:51, Michael Ellerman wrote:
>> Hi folks,
>> 
>> Just an FYI in case anyone else is seeing crashes very early in boot in
>> linux-next with the above config options.
>>
>> The problem is the combination of some new code called via printk(),
>> check_pointer() which calls probe_kernel_read(). That then calls 
>> allow_user_access() (PPC_KUAP) and that uses mmu_has_feature() too early
>> (before we've patched features). With the JUMP_LABEL debug enabled that
>> causes us to call printk() & dump_stack() and we end up recursing and
>> overflowing the stack.
>
> Sigh, the check_pointer() stuff is in Linus's tree now, see
> the commit 3e5903eb9cff707301712 ("vsprintf: Prevent crash when
> dereferencing invalid pointers").

No worries.

>> Because it happens so early you don't get any output, just an apparently
>> dead system.
>> 
>> The stack trace (which you don't see) is something like:
>> 
>>   ...
>>   dump_stack+0xdc
>>   probe_kernel_read+0x1a4
>>   check_pointer+0x58
>>   string+0x3c
>>   vsnprintf+0x1bc
>>   vscnprintf+0x20
>>   printk_safe_log_store+0x7c
>>   printk+0x40
>>   dump_stack_print_info+0xbc
>>   dump_stack+0x8
>>   probe_kernel_read+0x1a4
>>   probe_kernel_read+0x19c
>>   check_pointer+0x58
>>   string+0x3c
>>   vsnprintf+0x1bc
>>   vscnprintf+0x20
>>   vprintk_store+0x6c
>>   vprintk_emit+0xec
>>   vprintk_func+0xd4
>>   printk+0x40
>>   cpufeatures_process_feature+0xc8
>>   scan_cpufeatures_subnodes+0x380
>>   of_scan_flat_dt_subnodes+0xb4
>>   dt_cpu_ftrs_scan_callback+0x158
>>   of_scan_flat_dt+0xf0
>>   dt_cpu_ftrs_scan+0x3c
>>   early_init_devtree+0x360
>>   early_setup+0x9c
>> 
>> 
>> The simple fix is to use early_mmu_has_feature() in allow_user_access(),
>> but we'd rather not do that because it penalises all
>> copy_to/from_users() for the life of the system with the cost of the
>> runtime check vs the jump label. The irony is probe_kernel_read()
>> shouldn't be allowing user access at all, because we're reading the
>> kernel not userspace.
>
> I have tried to find a lightweight way for a safe reading of unknown
> kernel pointer. But I have not succeeded so far. I see only variants
> with user access. The user access is handled in arch-specific code
> and I do not see any variant without it.
>
> I am not sure on which level it should get fixed.

I sent a fix in powerpc code (sorry might have forgot to Cc you):

  https://patchwork.ozlabs.org/patch/1097015/

I've merged that into the powerpc tree. I think it's too subtle for us
to have an ordering requirement that deep in the user copy code, it was
just a matter of time before it caused a problem, you were just unlucky
it was your patch that did :)

We'll eventually switch it back to using a jump label but make it safe
to call early in boot before we've detected features.

> Could you please send it to lkml to get a wider audience?

I see you also sent a fix, that looks like a safe default to me.

But as I said I'm happy with the powerpc fix, so there's no requirement
from us that your fix get merged.

cheers


Re: Build failure in v4.4.y.queue (ppc:allmodconfig)

2019-05-09 Thread Michal Suchánek
On Thu, 9 May 2019 07:06:32 -0700
Guenter Roeck  wrote:

> On 5/9/19 6:26 AM, Michal Suchánek wrote:
> > On Thu, 9 May 2019 06:07:31 -0700
> > Guenter Roeck  wrote:
> >   
> >> On 5/9/19 2:49 AM, Michal Suchánek wrote:  
> >>> On Thu, 9 May 2019 08:53:24 +0200
> >>> Greg Kroah-Hartman  wrote:
> >>>  
>  On Wed, May 08, 2019 at 01:26:42PM -0700, Guenter Roeck wrote:  
> > I see multiple instances of:
> >
> > arch/powerpc/kernel/exceptions-64s.S:839: Error:
> > attempt to move .org backwards
> >
> > in v4.4.y.queue (v4.4.179-143-gc4db218e9451).
> >
> > This is due to commit 9b2d4e06d7f1 ("powerpc/64s: Add support for a 
> > store
> > forwarding barrier at kernel entry/exit"), which is part of a large 
> > patch
> > series and can not easily be reverted.
> >
> > Guess I'll stop doing ppc:allmodconfig builds in v4.4.y ?  
> 
>  Michael, I thought this patch series was supposed to fix ppc issues, not
>  add to them :)
> 
>  Any ideas on what to do here?  
> >>>
> >>> What exact code do you build?
> >>> 
> >> $ make ARCH=powerpc CROSS_COMPILE=powerpc64-linux- allmodconfig
> >> $ powerpc64-linux-gcc --version
> >> powerpc64-linux-gcc (GCC) 8.3.0
> >>  
> > 
> > Gcc should not see this file. I am asking because I do not see an .org
> > directive at line 839 of 4.4.179. I probably need some different repo
> > or extra patches to see the same code as you do.
> >   
> v4.4.179-143-gc4db218e9451 from
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
> branch linux-4.4.y

Still don't see it. That branch is at 4.4.179 and c4db218e9451 does not
exist after fetching from the repo.

Anyway, here is a patch (untested):

Subject: [PATCH] Move out-of-line exception handlers after relon exception
 handlers.

The relon exception handlers need to be at specific location and code
inflation in the common handler code can cause

Error: attempt to move .org backwards

Signed-off-by: Michal Suchanek 
---
 arch/powerpc/kernel/exceptions-64s.S | 88 ++--
 1 file changed, 44 insertions(+), 44 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index 938a30fef031..1d477d21ff09 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -772,50 +772,6 @@ kvmppc_skip_Hinterrupt:
b   .
 #endif
 
-/*
- * Code from here down to __end_handlers is invoked from the
- * exception prologs above.  Because the prologs assemble the
- * addresses of these handlers using the LOAD_HANDLER macro,
- * which uses an ori instruction, these handlers must be in
- * the first 64k of the kernel image.
- */
-
-/*** Common interrupt handlers ***/
-
-   STD_EXCEPTION_COMMON(0x100, system_reset, system_reset_exception)
-
-   STD_EXCEPTION_COMMON_ASYNC(0x500, hardware_interrupt, do_IRQ)
-   STD_EXCEPTION_COMMON_ASYNC(0x900, decrementer, timer_interrupt)
-   STD_EXCEPTION_COMMON(0x980, hdecrementer, hdec_interrupt)
-#ifdef CONFIG_PPC_DOORBELL
-   STD_EXCEPTION_COMMON_ASYNC(0xa00, doorbell_super, doorbell_exception)
-#else
-   STD_EXCEPTION_COMMON_ASYNC(0xa00, doorbell_super, unknown_exception)
-#endif
-   STD_EXCEPTION_COMMON(0xb00, trap_0b, unknown_exception)
-   STD_EXCEPTION_COMMON(0xd00, single_step, single_step_exception)
-   STD_EXCEPTION_COMMON(0xe00, trap_0e, unknown_exception)
-   STD_EXCEPTION_COMMON(0xe40, emulation_assist, 
emulation_assist_interrupt)
-   STD_EXCEPTION_COMMON_ASYNC(0xe60, hmi_exception, handle_hmi_exception)
-#ifdef CONFIG_PPC_DOORBELL
-   STD_EXCEPTION_COMMON_ASYNC(0xe80, h_doorbell, doorbell_exception)
-#else
-   STD_EXCEPTION_COMMON_ASYNC(0xe80, h_doorbell, unknown_exception)
-#endif
-   STD_EXCEPTION_COMMON_ASYNC(0xf00, performance_monitor, 
performance_monitor_exception)
-   STD_EXCEPTION_COMMON(0x1300, instruction_breakpoint, 
instruction_breakpoint_exception)
-   STD_EXCEPTION_COMMON(0x1502, denorm, unknown_exception)
-#ifdef CONFIG_ALTIVEC
-   STD_EXCEPTION_COMMON(0x1700, altivec_assist, altivec_assist_exception)
-#else
-   STD_EXCEPTION_COMMON(0x1700, altivec_assist, unknown_exception)
-#endif
-#ifdef CONFIG_CBE_RAS
-   STD_EXCEPTION_COMMON(0x1200, cbe_system_error, 
cbe_system_error_exception)
-   STD_EXCEPTION_COMMON(0x1600, cbe_maintenance, cbe_maintenance_exception)
-   STD_EXCEPTION_COMMON(0x1800, cbe_thermal, cbe_thermal_exception)
-#endif /* CONFIG_CBE_RAS */
-
/*
 * Relocation-on interrupts: A subset of the interrupts can be delivered
 * with IR=1/DR=1, if AIL==2 and MSR.HV won't be changed by delivering
@@ -969,6 +925,50 @@ system_call_entry:
 ppc64_runlatch_on_trampoline:
b   __ppc64_runlatch_on
 
+/*
+ * Code from here down to __end_handlers is invoked from the
+ * exception prologs above.  Because the prologs assemble the
+ * addresses of these handlers using 

Re: [PATCH v2 4/8] mm/memory_hotplug: Create memory block devices after arch_add_memory()

2019-05-09 Thread Wei Yang
On Tue, May 07, 2019 at 08:38:00PM +0200, David Hildenbrand wrote:
>Only memory to be added to the buddy and to be onlined/offlined by
>user space using memory block devices needs (and should have!) memory
>block devices.
>
>Factor out creation of memory block devices Create all devices after
>arch_add_memory() succeeded. We can later drop the want_memblock parameter,
>because it is now effectively stale.
>
>Only after memory block devices have been added, memory can be onlined
>by user space. This implies, that memory is not visible to user space at
>all before arch_add_memory() succeeded.
>
>Cc: Greg Kroah-Hartman 
>Cc: "Rafael J. Wysocki" 
>Cc: David Hildenbrand 
>Cc: "mike.tra...@hpe.com" 
>Cc: Andrew Morton 
>Cc: Ingo Molnar 
>Cc: Andrew Banman 
>Cc: Oscar Salvador 
>Cc: Michal Hocko 
>Cc: Pavel Tatashin 
>Cc: Qian Cai 
>Cc: Wei Yang 
>Cc: Arun KS 
>Cc: Mathieu Malaterre 
>Signed-off-by: David Hildenbrand 
>---
> drivers/base/memory.c  | 70 ++
> include/linux/memory.h |  2 +-
> mm/memory_hotplug.c| 15 -
> 3 files changed, 53 insertions(+), 34 deletions(-)
>
>diff --git a/drivers/base/memory.c b/drivers/base/memory.c
>index 6e0cb4fda179..862c202a18ca 100644
>--- a/drivers/base/memory.c
>+++ b/drivers/base/memory.c
>@@ -701,44 +701,62 @@ static int add_memory_block(int base_section_nr)
>   return 0;
> }
> 
>+static void unregister_memory(struct memory_block *memory)
>+{
>+  BUG_ON(memory->dev.bus != _subsys);
>+
>+  /* drop the ref. we got via find_memory_block() */
>+  put_device(>dev);
>+  device_unregister(>dev);
>+}
>+
> /*
>- * need an interface for the VM to add new memory regions,
>- * but without onlining it.
>+ * Create memory block devices for the given memory area. Start and size
>+ * have to be aligned to memory block granularity. Memory block devices
>+ * will be initialized as offline.
>  */
>-int hotplug_memory_register(int nid, struct mem_section *section)
>+int hotplug_memory_register(unsigned long start, unsigned long size)

One trivial suggestion about the function name.

For memory_block device, sometimes we use the full name

find_memory_block
init_memory_block
add_memory_block

But sometimes we use *nick* name

hotplug_memory_register
register_memory
unregister_memory

This is a little bit confusion.

Can we use one name convention here? 

[...]

> /*
>@@ -1106,6 +1100,13 @@ int __ref add_memory_resource(int nid, struct resource 
>*res)
>   if (ret < 0)
>   goto error;
> 
>+  /* create memory block devices after memory was added */
>+  ret = hotplug_memory_register(start, size);
>+  if (ret) {
>+  arch_remove_memory(nid, start, size, NULL);

Functionally, it works I think.

But arch_remove_memory() would remove pages from zone. At this point, we just
allocate section/mmap for pages, the zones are empty and pages are not
connected to zone.

Function  zone = page_zone(page); always gets zone #0, since pages->flags is 0
at  this point. This is not exact.

Would we add some comment to mention this? Or we need to clean up
arch_remove_memory() to take out __remove_zone()?


>+  goto error;
>+  }
>+
>   if (new_node) {
>   /* If sysfs file of new node can't be created, cpu on the node
>* can't be hot-added. There is no rollback way now.
>-- 
>2.20.1

-- 
Wei Yang
Help you, Help me


Re: Build failure in v4.4.y.queue (ppc:allmodconfig)

2019-05-09 Thread Michael Ellerman
Greg Kroah-Hartman  writes:
> On Wed, May 08, 2019 at 01:26:42PM -0700, Guenter Roeck wrote:
>> I see multiple instances of:
>> 
>> arch/powerpc/kernel/exceptions-64s.S:839: Error:
>>  attempt to move .org backwards
>> 
>> in v4.4.y.queue (v4.4.179-143-gc4db218e9451).
>> 
>> This is due to commit 9b2d4e06d7f1 ("powerpc/64s: Add support for a store
>> forwarding barrier at kernel entry/exit"), which is part of a large patch
>> series and can not easily be reverted.
>> 
>> Guess I'll stop doing ppc:allmodconfig builds in v4.4.y ?
>
> Michael, I thought this patch series was supposed to fix ppc issues, not
> add to them :)

Well it fixes some, but creates others :}

> Any ideas on what to do here?

Turning off CONFIG_CBE_RAS (obscure IBM Cell Blade RAS features) is
sufficient to get it building. Is that an option for your build tests
Guenter?

We can try to rearrange some of the exception vectors as Michal
suggested, but that's not without risk either.

cheers


Re: Build failure in v4.4.y.queue (ppc:allmodconfig)

2019-05-09 Thread Guenter Roeck

On 5/9/19 6:26 AM, Michal Suchánek wrote:

On Thu, 9 May 2019 06:07:31 -0700
Guenter Roeck  wrote:


On 5/9/19 2:49 AM, Michal Suchánek wrote:

On Thu, 9 May 2019 08:53:24 +0200
Greg Kroah-Hartman  wrote:
   

On Wed, May 08, 2019 at 01:26:42PM -0700, Guenter Roeck wrote:

I see multiple instances of:

arch/powerpc/kernel/exceptions-64s.S:839: Error:
attempt to move .org backwards

in v4.4.y.queue (v4.4.179-143-gc4db218e9451).

This is due to commit 9b2d4e06d7f1 ("powerpc/64s: Add support for a store
forwarding barrier at kernel entry/exit"), which is part of a large patch
series and can not easily be reverted.

Guess I'll stop doing ppc:allmodconfig builds in v4.4.y ?


Michael, I thought this patch series was supposed to fix ppc issues, not
add to them :)

Any ideas on what to do here?


What exact code do you build?
  

$ make ARCH=powerpc CROSS_COMPILE=powerpc64-linux- allmodconfig
$ powerpc64-linux-gcc --version
powerpc64-linux-gcc (GCC) 8.3.0



Gcc should not see this file. I am asking because I do not see an .org
directive at line 839 of 4.4.179. I probably need some different repo
or extra patches to see the same code as you do.


v4.4.179-143-gc4db218e9451 from
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
branch linux-4.4.y

That also includes 'powerpc/64s: Improve RFI L1-D cache flush fallback',
but reverting it does not make a difference. Also, the .org is
hidden in STD_RELON_EXCEPTION_PSERIES().

Guenter


Re: [PATCH] vsprintf: Do not break early boot with probing addresses

2019-05-09 Thread Petr Mladek
On Thu 2019-05-09 09:13:57, Steven Rostedt wrote:
> On Thu,  9 May 2019 14:19:23 +0200
> Petr Mladek  wrote:
> 
> > The commit 3e5903eb9cff70730 ("vsprintf: Prevent crash when dereferencing
> > invalid pointers") broke boot on several architectures. The common
> > pattern is that probe_kernel_read() is not working during early
> > boot because userspace access framework is not ready.
> > 
> > The check is only the best effort. Let's not rush with it during
> > the early boot.
> > 
> > Details:
> > 
> > 1. Report on Power:
> > 
> > Kernel crashes very early during boot with with CONFIG_PPC_KUAP and
> > CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG
> > 
> > The problem is the combination of some new code called via printk(),
> > check_pointer() which calls probe_kernel_read(). That then calls
> > allow_user_access() (PPC_KUAP) and that uses mmu_has_feature() too early
> > (before we've patched features). With the JUMP_LABEL debug enabled that
> > causes us to call printk() & dump_stack() and we end up recursing and
> > overflowing the stack.
> > 
> > Because it happens so early you don't get any output, just an apparently
> > dead system.
> > 
> > The stack trace (which you don't see) is something like:
> > 
> >   ...
> >   dump_stack+0xdc
> >   probe_kernel_read+0x1a4
> >   check_pointer+0x58
> >   string+0x3c
> >   vsnprintf+0x1bc
> >   vscnprintf+0x20
> >   printk_safe_log_store+0x7c
> >   printk+0x40
> >   dump_stack_print_info+0xbc
> >   dump_stack+0x8
> >   probe_kernel_read+0x1a4
> >   probe_kernel_read+0x19c
> >   check_pointer+0x58
> >   string+0x3c
> >   vsnprintf+0x1bc
> >   vscnprintf+0x20
> >   vprintk_store+0x6c
> >   vprintk_emit+0xec
> >   vprintk_func+0xd4
> >   printk+0x40
> >   cpufeatures_process_feature+0xc8
> >   scan_cpufeatures_subnodes+0x380
> >   of_scan_flat_dt_subnodes+0xb4
> >   dt_cpu_ftrs_scan_callback+0x158
> >   of_scan_flat_dt+0xf0
> >   dt_cpu_ftrs_scan+0x3c
> >   early_init_devtree+0x360
> >   early_setup+0x9c
> > 
> > 2. Report on s390:
> > 
> > vsnprintf invocations, are broken on s390. For example, the early boot
> > output now looks like this where the first (efault) should be
> > the linux_banner:
> > 
> > [0.099985] (efault)
> > [0.099985] setup: Linux is running as a z/VM guest operating system in 
> > 64-bit mode
> > [0.100066] setup: The maximum memory size is 8192MB
> > [0.100070] cma: Reserved 4 MiB at (efault)
> > [0.100100] numa: NUMA mode: (efault)
> > 
> > The reason for this, is that the code assumes that
> > probe_kernel_address() works very early. This however is not true on
> > at least s390. Uaccess on KERNEL_DS works only after page tables have
> > been setup on s390, which happens with setup_arch()->paging_init().
> > 
> > Any probe_kernel_address() invocation before that will return -EFAULT.
> 
> Hmm, this sounds to me that probe_kernel_address() is broken for these
> architectures. Perhaps the system_state check should be in
> probe_kernel_address() for those architectures?

Yeah. Well, these problems are hard to debug. It left a dead power
system with a blank screen. I am not sure if the added check is
worth the pain.

I hope that the check would help to debug problems. But it is yet
another complexity in printk() path. I think that it is fine
to keep it enabled only on the booted system for a while
and get some more feedback.

Best Regards,
Petr


Re: [PATCH v2 4/8] mm/memory_hotplug: Create memory block devices after arch_add_memory()

2019-05-09 Thread David Hildenbrand
On 09.05.19 15:55, Wei Yang wrote:
> On Tue, May 07, 2019 at 08:38:00PM +0200, David Hildenbrand wrote:
>> Only memory to be added to the buddy and to be onlined/offlined by
>> user space using memory block devices needs (and should have!) memory
>> block devices.
>>
>> Factor out creation of memory block devices Create all devices after
>> arch_add_memory() succeeded. We can later drop the want_memblock parameter,
>> because it is now effectively stale.
>>
>> Only after memory block devices have been added, memory can be onlined
>> by user space. This implies, that memory is not visible to user space at
>> all before arch_add_memory() succeeded.
>>
>> Cc: Greg Kroah-Hartman 
>> Cc: "Rafael J. Wysocki" 
>> Cc: David Hildenbrand 
>> Cc: "mike.tra...@hpe.com" 
>> Cc: Andrew Morton 
>> Cc: Ingo Molnar 
>> Cc: Andrew Banman 
>> Cc: Oscar Salvador 
>> Cc: Michal Hocko 
>> Cc: Pavel Tatashin 
>> Cc: Qian Cai 
>> Cc: Wei Yang 
>> Cc: Arun KS 
>> Cc: Mathieu Malaterre 
>> Signed-off-by: David Hildenbrand 
>> ---
>> drivers/base/memory.c  | 70 ++
>> include/linux/memory.h |  2 +-
>> mm/memory_hotplug.c| 15 -
>> 3 files changed, 53 insertions(+), 34 deletions(-)
>>
>> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
>> index 6e0cb4fda179..862c202a18ca 100644
>> --- a/drivers/base/memory.c
>> +++ b/drivers/base/memory.c
>> @@ -701,44 +701,62 @@ static int add_memory_block(int base_section_nr)
>>  return 0;
>> }
>>
>> +static void unregister_memory(struct memory_block *memory)
>> +{
>> +BUG_ON(memory->dev.bus != _subsys);
>> +
>> +/* drop the ref. we got via find_memory_block() */
>> +put_device(>dev);
>> +device_unregister(>dev);
>> +}
>> +
>> /*
>> - * need an interface for the VM to add new memory regions,
>> - * but without onlining it.
>> + * Create memory block devices for the given memory area. Start and size
>> + * have to be aligned to memory block granularity. Memory block devices
>> + * will be initialized as offline.
>>  */
>> -int hotplug_memory_register(int nid, struct mem_section *section)
>> +int hotplug_memory_register(unsigned long start, unsigned long size)
>> {
>> -int ret = 0;
>> +unsigned long block_nr_pages = memory_block_size_bytes() >> PAGE_SHIFT;
>> +unsigned long start_pfn = PFN_DOWN(start);
>> +unsigned long end_pfn = start_pfn + (size >> PAGE_SHIFT);
>> +unsigned long pfn;
>>  struct memory_block *mem;
>> +int ret = 0;
>>
>> -mutex_lock(_sysfs_mutex);
>> +BUG_ON(!IS_ALIGNED(start, memory_block_size_bytes()));
>> +BUG_ON(!IS_ALIGNED(size, memory_block_size_bytes()));
>>
>> -mem = find_memory_block(section);
>> -if (mem) {
>> -mem->section_count++;
>> -put_device(>dev);
>> -} else {
>> -ret = init_memory_block(, section, MEM_OFFLINE);
>> +mutex_lock(_sysfs_mutex);
>> +for (pfn = start_pfn; pfn != end_pfn; pfn += block_nr_pages) {
>> +mem = find_memory_block(__pfn_to_section(pfn));
>> +if (mem) {
>> +WARN_ON_ONCE(false);
> 
> One question here, the purpose of WARN_ON_ONCE(false) is? Would we trigger
> this?

Would happen if something goes terribly wrong. We might want to remove
this once we are sure this will not happen.

I replaced it in the meantime by a

if (WARN_ON_ONCE(mem)) {
put_device(>dev);
ret = -EEXIST;
break;
}

> 
>> +put_device(>dev);
>> +continue;
>> +}
>> +ret = init_memory_block(, __pfn_to_section(pfn),
>> +MEM_OFFLINE);
>>  if (ret)
>> -goto out;
>> -mem->section_count++;
>> +break;
>> +mem->section_count = memory_block_size_bytes() /
>> + MIN_MEMORY_BLOCK_SIZE;
> 
> Maybe we can leverage sections_per_block variable.

Most certainly if it does what I think it does :) thanks!


-- 

Thanks,

David / dhildenb


Re: [PATCH v2 4/8] mm/memory_hotplug: Create memory block devices after arch_add_memory()

2019-05-09 Thread Wei Yang
On Tue, May 07, 2019 at 08:38:00PM +0200, David Hildenbrand wrote:
>Only memory to be added to the buddy and to be onlined/offlined by
>user space using memory block devices needs (and should have!) memory
>block devices.
>
>Factor out creation of memory block devices Create all devices after
>arch_add_memory() succeeded. We can later drop the want_memblock parameter,
>because it is now effectively stale.
>
>Only after memory block devices have been added, memory can be onlined
>by user space. This implies, that memory is not visible to user space at
>all before arch_add_memory() succeeded.
>
>Cc: Greg Kroah-Hartman 
>Cc: "Rafael J. Wysocki" 
>Cc: David Hildenbrand 
>Cc: "mike.tra...@hpe.com" 
>Cc: Andrew Morton 
>Cc: Ingo Molnar 
>Cc: Andrew Banman 
>Cc: Oscar Salvador 
>Cc: Michal Hocko 
>Cc: Pavel Tatashin 
>Cc: Qian Cai 
>Cc: Wei Yang 
>Cc: Arun KS 
>Cc: Mathieu Malaterre 
>Signed-off-by: David Hildenbrand 
>---
> drivers/base/memory.c  | 70 ++
> include/linux/memory.h |  2 +-
> mm/memory_hotplug.c| 15 -
> 3 files changed, 53 insertions(+), 34 deletions(-)
>
>diff --git a/drivers/base/memory.c b/drivers/base/memory.c
>index 6e0cb4fda179..862c202a18ca 100644
>--- a/drivers/base/memory.c
>+++ b/drivers/base/memory.c
>@@ -701,44 +701,62 @@ static int add_memory_block(int base_section_nr)
>   return 0;
> }
> 
>+static void unregister_memory(struct memory_block *memory)
>+{
>+  BUG_ON(memory->dev.bus != _subsys);
>+
>+  /* drop the ref. we got via find_memory_block() */
>+  put_device(>dev);
>+  device_unregister(>dev);
>+}
>+
> /*
>- * need an interface for the VM to add new memory regions,
>- * but without onlining it.
>+ * Create memory block devices for the given memory area. Start and size
>+ * have to be aligned to memory block granularity. Memory block devices
>+ * will be initialized as offline.
>  */
>-int hotplug_memory_register(int nid, struct mem_section *section)
>+int hotplug_memory_register(unsigned long start, unsigned long size)
> {
>-  int ret = 0;
>+  unsigned long block_nr_pages = memory_block_size_bytes() >> PAGE_SHIFT;
>+  unsigned long start_pfn = PFN_DOWN(start);
>+  unsigned long end_pfn = start_pfn + (size >> PAGE_SHIFT);
>+  unsigned long pfn;
>   struct memory_block *mem;
>+  int ret = 0;
> 
>-  mutex_lock(_sysfs_mutex);
>+  BUG_ON(!IS_ALIGNED(start, memory_block_size_bytes()));
>+  BUG_ON(!IS_ALIGNED(size, memory_block_size_bytes()));
> 
>-  mem = find_memory_block(section);
>-  if (mem) {
>-  mem->section_count++;
>-  put_device(>dev);
>-  } else {
>-  ret = init_memory_block(, section, MEM_OFFLINE);
>+  mutex_lock(_sysfs_mutex);
>+  for (pfn = start_pfn; pfn != end_pfn; pfn += block_nr_pages) {
>+  mem = find_memory_block(__pfn_to_section(pfn));
>+  if (mem) {
>+  WARN_ON_ONCE(false);

One question here, the purpose of WARN_ON_ONCE(false) is? Would we trigger
this?

>+  put_device(>dev);
>+  continue;
>+  }
>+  ret = init_memory_block(, __pfn_to_section(pfn),
>+  MEM_OFFLINE);
>   if (ret)
>-  goto out;
>-  mem->section_count++;
>+  break;
>+  mem->section_count = memory_block_size_bytes() /
>+   MIN_MEMORY_BLOCK_SIZE;

Maybe we can leverage sections_per_block variable.

mem->section_count = sections_per_block;

>+  }
>+  if (ret) {
>+  end_pfn = pfn;
>+  for (pfn = start_pfn; pfn != end_pfn; pfn += block_nr_pages) {
>+  mem = find_memory_block(__pfn_to_section(pfn));
>+  if (!mem)
>+  continue;
>+  mem->section_count = 0;
>+  unregister_memory(mem);
>+  }
>   }

-- 
Wei Yang
Help you, Help me


Re: [PATCH] vsprintf: Do not break early boot with probing addresses

2019-05-09 Thread Michal Suchánek
On Thu,  9 May 2019 14:19:23 +0200
Petr Mladek  wrote:

> The commit 3e5903eb9cff70730 ("vsprintf: Prevent crash when dereferencing
> invalid pointers") broke boot on several architectures. The common
> pattern is that probe_kernel_read() is not working during early
> boot because userspace access framework is not ready.
> 
> The check is only the best effort. Let's not rush with it during
> the early boot.
> 
> Details:
> 
> 1. Report on Power:
> 
> Kernel crashes very early during boot with with CONFIG_PPC_KUAP and
> CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG
> 
> The problem is the combination of some new code called via printk(),
> check_pointer() which calls probe_kernel_read(). That then calls
> allow_user_access() (PPC_KUAP) and that uses mmu_has_feature() too early
> (before we've patched features).

There is early_mmu_has_feature for this case. mmu_has_feature does not
work before patching so parts of kernel that can run before patching
must use the early_ variant which actually runs code reading the
feature bitmap to determine the answer.

Thanks

Michal


Re: Build failure in v4.4.y.queue (ppc:allmodconfig)

2019-05-09 Thread Michal Suchánek
On Thu, 9 May 2019 06:07:31 -0700
Guenter Roeck  wrote:

> On 5/9/19 2:49 AM, Michal Suchánek wrote:
> > On Thu, 9 May 2019 08:53:24 +0200
> > Greg Kroah-Hartman  wrote:
> >   
> >> On Wed, May 08, 2019 at 01:26:42PM -0700, Guenter Roeck wrote:  
> >>> I see multiple instances of:
> >>>
> >>> arch/powerpc/kernel/exceptions-64s.S:839: Error:
> >>>   attempt to move .org backwards
> >>>
> >>> in v4.4.y.queue (v4.4.179-143-gc4db218e9451).
> >>>
> >>> This is due to commit 9b2d4e06d7f1 ("powerpc/64s: Add support for a store
> >>> forwarding barrier at kernel entry/exit"), which is part of a large patch
> >>> series and can not easily be reverted.
> >>>
> >>> Guess I'll stop doing ppc:allmodconfig builds in v4.4.y ?  
> >>
> >> Michael, I thought this patch series was supposed to fix ppc issues, not
> >> add to them :)
> >>
> >> Any ideas on what to do here?  
> > 
> > What exact code do you build?
> >  
> $ make ARCH=powerpc CROSS_COMPILE=powerpc64-linux- allmodconfig
> $ powerpc64-linux-gcc --version
> powerpc64-linux-gcc (GCC) 8.3.0
>

Gcc should not see this file. I am asking because I do not see an .org
directive at line 839 of 4.4.179. I probably need some different repo
or extra patches to see the same code as you do.

Thanks

Michal


Re: [PATCH] vsprintf: Do not break early boot with probing addresses

2019-05-09 Thread Steven Rostedt
On Thu,  9 May 2019 14:19:23 +0200
Petr Mladek  wrote:

> The commit 3e5903eb9cff70730 ("vsprintf: Prevent crash when dereferencing
> invalid pointers") broke boot on several architectures. The common
> pattern is that probe_kernel_read() is not working during early
> boot because userspace access framework is not ready.
> 
> The check is only the best effort. Let's not rush with it during
> the early boot.
> 
> Details:
> 
> 1. Report on Power:
> 
> Kernel crashes very early during boot with with CONFIG_PPC_KUAP and
> CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG
> 
> The problem is the combination of some new code called via printk(),
> check_pointer() which calls probe_kernel_read(). That then calls
> allow_user_access() (PPC_KUAP) and that uses mmu_has_feature() too early
> (before we've patched features). With the JUMP_LABEL debug enabled that
> causes us to call printk() & dump_stack() and we end up recursing and
> overflowing the stack.
> 
> Because it happens so early you don't get any output, just an apparently
> dead system.
> 
> The stack trace (which you don't see) is something like:
> 
>   ...
>   dump_stack+0xdc
>   probe_kernel_read+0x1a4
>   check_pointer+0x58
>   string+0x3c
>   vsnprintf+0x1bc
>   vscnprintf+0x20
>   printk_safe_log_store+0x7c
>   printk+0x40
>   dump_stack_print_info+0xbc
>   dump_stack+0x8
>   probe_kernel_read+0x1a4
>   probe_kernel_read+0x19c
>   check_pointer+0x58
>   string+0x3c
>   vsnprintf+0x1bc
>   vscnprintf+0x20
>   vprintk_store+0x6c
>   vprintk_emit+0xec
>   vprintk_func+0xd4
>   printk+0x40
>   cpufeatures_process_feature+0xc8
>   scan_cpufeatures_subnodes+0x380
>   of_scan_flat_dt_subnodes+0xb4
>   dt_cpu_ftrs_scan_callback+0x158
>   of_scan_flat_dt+0xf0
>   dt_cpu_ftrs_scan+0x3c
>   early_init_devtree+0x360
>   early_setup+0x9c
> 
> 2. Report on s390:
> 
> vsnprintf invocations, are broken on s390. For example, the early boot
> output now looks like this where the first (efault) should be
> the linux_banner:
> 
> [0.099985] (efault)
> [0.099985] setup: Linux is running as a z/VM guest operating system in 
> 64-bit mode
> [0.100066] setup: The maximum memory size is 8192MB
> [0.100070] cma: Reserved 4 MiB at (efault)
> [0.100100] numa: NUMA mode: (efault)
> 
> The reason for this, is that the code assumes that
> probe_kernel_address() works very early. This however is not true on
> at least s390. Uaccess on KERNEL_DS works only after page tables have
> been setup on s390, which happens with setup_arch()->paging_init().
> 
> Any probe_kernel_address() invocation before that will return -EFAULT.

Hmm, this sounds to me that probe_kernel_address() is broken for these
architectures. Perhaps the system_state check should be in
probe_kernel_address() for those architectures?


> 
> Fixes: 3e5903eb9cff70730 ("vsprintf: Prevent crash when dereferencing invalid 
> pointers")
> Signed-off-by: Petr Mladek 
> ---
>  lib/vsprintf.c | 9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/vsprintf.c b/lib/vsprintf.c
> index 7b0a6140bfad..8b43a883be6b 100644
> --- a/lib/vsprintf.c
> +++ b/lib/vsprintf.c
> @@ -640,8 +640,13 @@ static const char *check_pointer_msg(const void *ptr)
>   if (!ptr)
>   return "(null)";
>  
> - if (probe_kernel_address(ptr, byte))
> - return "(efault)";

Either that, or we add a macro to those architectures and add:

#ifdef ARCH_NO_EARLY_PROBE_KERNEL_ADDRESS

> + /* User space address handling is not ready during early boot. */
> + if (system_state <= SYSTEM_BOOTING) {
> + if ((unsigned long)ptr < PAGE_SIZE)
> + return "(efault)";
> + } else {

#endif

Why add an unnecessary branch for archs this is not a problem for?

-- Steve

> + if (probe_kernel_address(ptr, byte))
> + return "(efault)";
>  
>   return NULL;
>  }



Re: Build failure in v4.4.y.queue (ppc:allmodconfig)

2019-05-09 Thread Guenter Roeck

On 5/9/19 2:49 AM, Michal Suchánek wrote:

On Thu, 9 May 2019 08:53:24 +0200
Greg Kroah-Hartman  wrote:


On Wed, May 08, 2019 at 01:26:42PM -0700, Guenter Roeck wrote:

I see multiple instances of:

arch/powerpc/kernel/exceptions-64s.S:839: Error:
attempt to move .org backwards

in v4.4.y.queue (v4.4.179-143-gc4db218e9451).

This is due to commit 9b2d4e06d7f1 ("powerpc/64s: Add support for a store
forwarding barrier at kernel entry/exit"), which is part of a large patch
series and can not easily be reverted.

Guess I'll stop doing ppc:allmodconfig builds in v4.4.y ?


Michael, I thought this patch series was supposed to fix ppc issues, not
add to them :)

Any ideas on what to do here?


What exact code do you build?


$ make ARCH=powerpc CROSS_COMPILE=powerpc64-linux- allmodconfig
$ powerpc64-linux-gcc --version
powerpc64-linux-gcc (GCC) 8.3.0

Guenter


Re: [PATCH] vsprintf: Do not break early boot with probing addresses

2019-05-09 Thread Andy Shevchenko
On Thu, May 09, 2019 at 02:19:23PM +0200, Petr Mladek wrote:
> The commit 3e5903eb9cff70730 ("vsprintf: Prevent crash when dereferencing
> invalid pointers") broke boot on several architectures. The common
> pattern is that probe_kernel_read() is not working during early
> boot because userspace access framework is not ready.
> 
> The check is only the best effort. Let's not rush with it during
> the early boot.
> 
> Details:
> 
> 1. Report on Power:
> 
> Kernel crashes very early during boot with with CONFIG_PPC_KUAP and
> CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG
> 
> The problem is the combination of some new code called via printk(),
> check_pointer() which calls probe_kernel_read(). That then calls
> allow_user_access() (PPC_KUAP) and that uses mmu_has_feature() too early
> (before we've patched features). With the JUMP_LABEL debug enabled that
> causes us to call printk() & dump_stack() and we end up recursing and
> overflowing the stack.
> 
> Because it happens so early you don't get any output, just an apparently
> dead system.
> 
> The stack trace (which you don't see) is something like:
> 
>   ...
>   dump_stack+0xdc
>   probe_kernel_read+0x1a4
>   check_pointer+0x58
>   string+0x3c
>   vsnprintf+0x1bc
>   vscnprintf+0x20
>   printk_safe_log_store+0x7c
>   printk+0x40
>   dump_stack_print_info+0xbc
>   dump_stack+0x8
>   probe_kernel_read+0x1a4
>   probe_kernel_read+0x19c
>   check_pointer+0x58
>   string+0x3c
>   vsnprintf+0x1bc
>   vscnprintf+0x20
>   vprintk_store+0x6c
>   vprintk_emit+0xec
>   vprintk_func+0xd4
>   printk+0x40
>   cpufeatures_process_feature+0xc8
>   scan_cpufeatures_subnodes+0x380
>   of_scan_flat_dt_subnodes+0xb4
>   dt_cpu_ftrs_scan_callback+0x158
>   of_scan_flat_dt+0xf0
>   dt_cpu_ftrs_scan+0x3c
>   early_init_devtree+0x360
>   early_setup+0x9c
> 
> 2. Report on s390:
> 
> vsnprintf invocations, are broken on s390. For example, the early boot
> output now looks like this where the first (efault) should be
> the linux_banner:
> 
> [0.099985] (efault)
> [0.099985] setup: Linux is running as a z/VM guest operating system in 
> 64-bit mode
> [0.100066] setup: The maximum memory size is 8192MB
> [0.100070] cma: Reserved 4 MiB at (efault)
> [0.100100] numa: NUMA mode: (efault)
> 
> The reason for this, is that the code assumes that
> probe_kernel_address() works very early. This however is not true on
> at least s390. Uaccess on KERNEL_DS works only after page tables have
> been setup on s390, which happens with setup_arch()->paging_init().
> 
> Any probe_kernel_address() invocation before that will return -EFAULT.
> 

It's seems as a good enough fix.
Reviewed-by: Andy Shevchenko 

Though in all cases would be nice to distinguish error pointers as well.
Something like

if (IS_ERR(ptr))
return err_pointer_str(ptr);

in check_pointer_msg().

> Fixes: 3e5903eb9cff70730 ("vsprintf: Prevent crash when dereferencing invalid 
> pointers")
> Signed-off-by: Petr Mladek 
> ---
>  lib/vsprintf.c | 9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/vsprintf.c b/lib/vsprintf.c
> index 7b0a6140bfad..8b43a883be6b 100644
> --- a/lib/vsprintf.c
> +++ b/lib/vsprintf.c
> @@ -640,8 +640,13 @@ static const char *check_pointer_msg(const void *ptr)
>   if (!ptr)
>   return "(null)";
>  
> - if (probe_kernel_address(ptr, byte))
> - return "(efault)";
> + /* User space address handling is not ready during early boot. */
> + if (system_state <= SYSTEM_BOOTING) {
> + if ((unsigned long)ptr < PAGE_SIZE)
> + return "(efault)";
> + } else {
> + if (probe_kernel_address(ptr, byte))
> + return "(efault)";
>  
>   return NULL;
>  }
> -- 
> 2.16.4
> 

-- 
With Best Regards,
Andy Shevchenko




[PATCH] powerpc/32s: fix flush_hash_pages() on SMP

2019-05-09 Thread Christophe Leroy
flush_hash_pages() runs with data translation off, so current
task_struct has to be accesssed using physical address.

Reported-by: Erhard F. 
Fixes: f7354ccac844 ("powerpc/32: Remove CURRENT_THREAD_INFO and rename TI_CPU")
Cc: sta...@vger.kernel.org
Signed-off-by: Christophe Leroy 
---
 arch/powerpc/mm/book3s32/hash_low.S | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/book3s32/hash_low.S 
b/arch/powerpc/mm/book3s32/hash_low.S
index e27792d0b744..8366c2abeafc 100644
--- a/arch/powerpc/mm/book3s32/hash_low.S
+++ b/arch/powerpc/mm/book3s32/hash_low.S
@@ -539,7 +539,8 @@ _GLOBAL(flush_hash_pages)
 #ifdef CONFIG_SMP
lis r9, (mmu_hash_lock - PAGE_OFFSET)@ha
addir9, r9, (mmu_hash_lock - PAGE_OFFSET)@l
-   lwz r8,TASK_CPU(r2)
+   tophys  (r8, r2)
+   lwz r8, TASK_CPU(r8)
orisr8,r8,9
 10:lwarx   r0,0,r9
cmpi0,r0,0
-- 
2.13.3



Re: [PATCH v2 4/8] mm/memory_hotplug: Create memory block devices after arch_add_memory()

2019-05-09 Thread Wei Yang
On Tue, May 07, 2019 at 08:38:00PM +0200, David Hildenbrand wrote:
>Only memory to be added to the buddy and to be onlined/offlined by
>user space using memory block devices needs (and should have!) memory
>block devices.
>
>Factor out creation of memory block devices Create all devices after
>arch_add_memory() succeeded. We can later drop the want_memblock parameter,
>because it is now effectively stale.
>
>Only after memory block devices have been added, memory can be onlined
>by user space. This implies, that memory is not visible to user space at
>all before arch_add_memory() succeeded.
>
>Cc: Greg Kroah-Hartman 
>Cc: "Rafael J. Wysocki" 
>Cc: David Hildenbrand 
>Cc: "mike.tra...@hpe.com" 
>Cc: Andrew Morton 
>Cc: Ingo Molnar 
>Cc: Andrew Banman 
>Cc: Oscar Salvador 
>Cc: Michal Hocko 
>Cc: Pavel Tatashin 
>Cc: Qian Cai 
>Cc: Wei Yang 
>Cc: Arun KS 
>Cc: Mathieu Malaterre 
>Signed-off-by: David Hildenbrand 
>---
> drivers/base/memory.c  | 70 ++
> include/linux/memory.h |  2 +-
> mm/memory_hotplug.c| 15 -
> 3 files changed, 53 insertions(+), 34 deletions(-)
>
>diff --git a/drivers/base/memory.c b/drivers/base/memory.c
>index 6e0cb4fda179..862c202a18ca 100644
>--- a/drivers/base/memory.c
>+++ b/drivers/base/memory.c
>@@ -701,44 +701,62 @@ static int add_memory_block(int base_section_nr)
>   return 0;
> }
> 
>+static void unregister_memory(struct memory_block *memory)
>+{
>+  BUG_ON(memory->dev.bus != _subsys);
>+
>+  /* drop the ref. we got via find_memory_block() */
>+  put_device(>dev);
>+  device_unregister(>dev);
>+}
>+
> /*
>- * need an interface for the VM to add new memory regions,
>- * but without onlining it.
>+ * Create memory block devices for the given memory area. Start and size
>+ * have to be aligned to memory block granularity. Memory block devices
>+ * will be initialized as offline.
>  */
>-int hotplug_memory_register(int nid, struct mem_section *section)
>+int hotplug_memory_register(unsigned long start, unsigned long size)
> {
>-  int ret = 0;
>+  unsigned long block_nr_pages = memory_block_size_bytes() >> PAGE_SHIFT;
>+  unsigned long start_pfn = PFN_DOWN(start);
>+  unsigned long end_pfn = start_pfn + (size >> PAGE_SHIFT);
>+  unsigned long pfn;
>   struct memory_block *mem;
>+  int ret = 0;
> 
>-  mutex_lock(_sysfs_mutex);
>+  BUG_ON(!IS_ALIGNED(start, memory_block_size_bytes()));
>+  BUG_ON(!IS_ALIGNED(size, memory_block_size_bytes()));

After this change, the call flow looks like this:

add_memory_resource
check_hotplug_memory_range
hotplug_memory_register

Since in check_hotplug_memory_range() has checked the boundary, do we need to
check here again?

-- 
Wei Yang
Help you, Help me


Re: [PATCH v2 4/8] mm/memory_hotplug: Create memory block devices after arch_add_memory()

2019-05-09 Thread David Hildenbrand
On 09.05.19 14:43, Wei Yang wrote:
> On Tue, May 07, 2019 at 08:38:00PM +0200, David Hildenbrand wrote:
>> Only memory to be added to the buddy and to be onlined/offlined by
>> user space using memory block devices needs (and should have!) memory
>> block devices.
>>
>> Factor out creation of memory block devices Create all devices after
>> arch_add_memory() succeeded. We can later drop the want_memblock parameter,
>> because it is now effectively stale.
>>
>> Only after memory block devices have been added, memory can be onlined
>> by user space. This implies, that memory is not visible to user space at
>> all before arch_add_memory() succeeded.
>>
>> Cc: Greg Kroah-Hartman 
>> Cc: "Rafael J. Wysocki" 
>> Cc: David Hildenbrand 
>> Cc: "mike.tra...@hpe.com" 
>> Cc: Andrew Morton 
>> Cc: Ingo Molnar 
>> Cc: Andrew Banman 
>> Cc: Oscar Salvador 
>> Cc: Michal Hocko 
>> Cc: Pavel Tatashin 
>> Cc: Qian Cai 
>> Cc: Wei Yang 
>> Cc: Arun KS 
>> Cc: Mathieu Malaterre 
>> Signed-off-by: David Hildenbrand 
>> ---
>> drivers/base/memory.c  | 70 ++
>> include/linux/memory.h |  2 +-
>> mm/memory_hotplug.c| 15 -
>> 3 files changed, 53 insertions(+), 34 deletions(-)
>>
>> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
>> index 6e0cb4fda179..862c202a18ca 100644
>> --- a/drivers/base/memory.c
>> +++ b/drivers/base/memory.c
>> @@ -701,44 +701,62 @@ static int add_memory_block(int base_section_nr)
>>  return 0;
>> }
>>
>> +static void unregister_memory(struct memory_block *memory)
>> +{
>> +BUG_ON(memory->dev.bus != _subsys);
>> +
>> +/* drop the ref. we got via find_memory_block() */
>> +put_device(>dev);
>> +device_unregister(>dev);
>> +}
>> +
>> /*
>> - * need an interface for the VM to add new memory regions,
>> - * but without onlining it.
>> + * Create memory block devices for the given memory area. Start and size
>> + * have to be aligned to memory block granularity. Memory block devices
>> + * will be initialized as offline.
>>  */
>> -int hotplug_memory_register(int nid, struct mem_section *section)
>> +int hotplug_memory_register(unsigned long start, unsigned long size)
>> {
>> -int ret = 0;
>> +unsigned long block_nr_pages = memory_block_size_bytes() >> PAGE_SHIFT;
>> +unsigned long start_pfn = PFN_DOWN(start);
>> +unsigned long end_pfn = start_pfn + (size >> PAGE_SHIFT);
>> +unsigned long pfn;
>>  struct memory_block *mem;
>> +int ret = 0;
>>
>> -mutex_lock(_sysfs_mutex);
>> +BUG_ON(!IS_ALIGNED(start, memory_block_size_bytes()));
>> +BUG_ON(!IS_ALIGNED(size, memory_block_size_bytes()));
> 
> After this change, the call flow looks like this:
> 
> add_memory_resource
> check_hotplug_memory_range
> hotplug_memory_register
> 
> Since in check_hotplug_memory_range() has checked the boundary, do we need to
> check here again?
> 

I prefer to check for such requirements explicitly in applicable places,
especially if they are placed in different files. Makes code easier to
get. WARN_ON_ONCE will indicate that this has to be assured by the caller.

Thanks!

-- 

Thanks,

David / dhildenb


Re: [PATCH v2 1/8] mm/memory_hotplug: Simplify and fix check_hotplug_memory_range()

2019-05-09 Thread Wei Yang
On Tue, May 07, 2019 at 08:37:57PM +0200, David Hildenbrand wrote:
>By converting start and size to page granularity, we actually ignore
>unaligned parts within a page instead of properly bailing out with an
>error.
>
>Cc: Andrew Morton 
>Cc: Oscar Salvador 
>Cc: Michal Hocko 
>Cc: David Hildenbrand 
>Cc: Pavel Tatashin 
>Cc: Qian Cai 
>Cc: Wei Yang 
>Cc: Arun KS 
>Cc: Mathieu Malaterre 
>Signed-off-by: David Hildenbrand 

Reviewed-by: Wei Yang 


-- 
Wei Yang
Help you, Help me


[PATCH] vsprintf: Do not break early boot with probing addresses

2019-05-09 Thread Petr Mladek
The commit 3e5903eb9cff70730 ("vsprintf: Prevent crash when dereferencing
invalid pointers") broke boot on several architectures. The common
pattern is that probe_kernel_read() is not working during early
boot because userspace access framework is not ready.

The check is only the best effort. Let's not rush with it during
the early boot.

Details:

1. Report on Power:

Kernel crashes very early during boot with with CONFIG_PPC_KUAP and
CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG

The problem is the combination of some new code called via printk(),
check_pointer() which calls probe_kernel_read(). That then calls
allow_user_access() (PPC_KUAP) and that uses mmu_has_feature() too early
(before we've patched features). With the JUMP_LABEL debug enabled that
causes us to call printk() & dump_stack() and we end up recursing and
overflowing the stack.

Because it happens so early you don't get any output, just an apparently
dead system.

The stack trace (which you don't see) is something like:

  ...
  dump_stack+0xdc
  probe_kernel_read+0x1a4
  check_pointer+0x58
  string+0x3c
  vsnprintf+0x1bc
  vscnprintf+0x20
  printk_safe_log_store+0x7c
  printk+0x40
  dump_stack_print_info+0xbc
  dump_stack+0x8
  probe_kernel_read+0x1a4
  probe_kernel_read+0x19c
  check_pointer+0x58
  string+0x3c
  vsnprintf+0x1bc
  vscnprintf+0x20
  vprintk_store+0x6c
  vprintk_emit+0xec
  vprintk_func+0xd4
  printk+0x40
  cpufeatures_process_feature+0xc8
  scan_cpufeatures_subnodes+0x380
  of_scan_flat_dt_subnodes+0xb4
  dt_cpu_ftrs_scan_callback+0x158
  of_scan_flat_dt+0xf0
  dt_cpu_ftrs_scan+0x3c
  early_init_devtree+0x360
  early_setup+0x9c

2. Report on s390:

vsnprintf invocations, are broken on s390. For example, the early boot
output now looks like this where the first (efault) should be
the linux_banner:

[0.099985] (efault)
[0.099985] setup: Linux is running as a z/VM guest operating system in 
64-bit mode
[0.100066] setup: The maximum memory size is 8192MB
[0.100070] cma: Reserved 4 MiB at (efault)
[0.100100] numa: NUMA mode: (efault)

The reason for this, is that the code assumes that
probe_kernel_address() works very early. This however is not true on
at least s390. Uaccess on KERNEL_DS works only after page tables have
been setup on s390, which happens with setup_arch()->paging_init().

Any probe_kernel_address() invocation before that will return -EFAULT.

Fixes: 3e5903eb9cff70730 ("vsprintf: Prevent crash when dereferencing invalid 
pointers")
Signed-off-by: Petr Mladek 
---
 lib/vsprintf.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 7b0a6140bfad..8b43a883be6b 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -640,8 +640,13 @@ static const char *check_pointer_msg(const void *ptr)
if (!ptr)
return "(null)";
 
-   if (probe_kernel_address(ptr, byte))
-   return "(efault)";
+   /* User space address handling is not ready during early boot. */
+   if (system_state <= SYSTEM_BOOTING) {
+   if ((unsigned long)ptr < PAGE_SIZE)
+   return "(efault)";
+   } else {
+   if (probe_kernel_address(ptr, byte))
+   return "(efault)";
 
return NULL;
 }
-- 
2.16.4



[Bug 203125] Kernel 5.1-rc1 fails to boot on a PowerMac G4 3,6: Caused by (from SRR1=141020): Transfer error ack signal

2019-05-09 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=203125

--- Comment #8 from Erhard F. (erhar...@mailbox.org) ---
I tried to apply the patch on top of 5.1.0 but it did not work out 'cause there
is no arch/powerpc/mm/book3s32/hash_low.S. The correct file on my system seemed
arch/powerpc/mm/hash_low_32.S, but changing the path in the patch did not work
either.

What actually did work was manually applying your proposed change in
arch/powerpc/mm/book3s32/hash_low.S! Now the G4 boots up fine with 5.1.0 as it
did with 5.0.x.

Many thanks for the fix!

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 203125] Kernel 5.1-rc1 fails to boot on a PowerMac G4 3,6: Caused by (from SRR1=141020): Transfer error ack signal

2019-05-09 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=203125

--- Comment #9 from Erhard F. (erhar...@mailbox.org) ---
@Christophe: Oops, accidentally I trashed your last comment. Sorry! This patch
didn't apply either. There is no book3s32 directory in my linux-stable tree,
the hash_*.S files are directly in mm/

But does not matter, your code change works!

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

Re: [PATCH] powerpc/ftrace: Enable C Version of recordmcount

2019-05-09 Thread Christophe Leroy




Le 08/05/2019 à 02:45, Michael Ellerman a écrit :

Christophe Leroy  writes:

Selects HAVE_C_RECORDMCOUNT to use the C version of the recordmcount
intead of the old Perl Version of recordmcount.

This should improve build time. It also seems like the old Perl Version
misses some calls to _mcount that the C version finds.


That would make this a bug fix possibly for stable even.

Any more details on what the difference is, is it just some random
subset of functions or some pattern?


I have not been able to identify any pattern. Will add a few details in 
the 'issue' on github.


Christophe



cheers


diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 2711aac24621..d87de4f9da61 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -180,6 +180,7 @@ config PPC
select HAVE_ARCH_NVRAM_OPS
select HAVE_ARCH_SECCOMP_FILTER
select HAVE_ARCH_TRACEHOOK
+   select HAVE_C_RECORDMCOUNT
select HAVE_CBPF_JITif !PPC64
select HAVE_STACKPROTECTOR  if PPC64 && 
$(cc-option,-mstack-protector-guard=tls -mstack-protector-guard-reg=r13)
select HAVE_STACKPROTECTOR  if PPC32 && 
$(cc-option,-mstack-protector-guard=tls -mstack-protector-guard-reg=r2)
--
2.13.3


[Bug 203125] Kernel 5.1-rc1 fails to boot on a PowerMac G4 3,6: Caused by (from SRR1=141020): Transfer error ack signal

2019-05-09 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=203125

--- Comment #7 from Christophe Leroy (christophe.le...@c-s.fr) ---
On 05/09/2019 10:05 AM, bugzilla-dae...@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=203125
> 
> Christophe Leroy (christophe.le...@c-s.fr) changed:
> 
> What|Removed |Added
> 
>   CC||christophe.le...@c-s.fr
> 
> --- Comment #5 from Christophe Leroy (christophe.le...@c-s.fr) ---
> Apparently, MSR DR is not sent and DAR has value 0xc0090654, meaning that a
> virt address tries to get accessed while in real mode.
> 

Could you try the patch below:

diff --git a/arch/powerpc/mm/book3s32/hash_low.S 
b/arch/powerpc/mm/book3s32/hash_low.S
index e27792d0b744..8366c2abeafc 100644
--- a/arch/powerpc/mm/book3s32/hash_low.S
+++ b/arch/powerpc/mm/book3s32/hash_low.S
@@ -539,7 +539,8 @@ _GLOBAL(flush_hash_pages)
  #ifdef CONFIG_SMP
lis r9, (mmu_hash_lock - PAGE_OFFSET)@ha
addir9, r9, (mmu_hash_lock - PAGE_OFFSET)@l
-   lwz r8,TASK_CPU(r2)
+   tophys  (r8, r2)
+   lwz r8, TASK_CPU(r8)
orisr8,r8,9
  10:   lwarx   r0,0,r9
cmpi0,r0,0


Christophe

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 203125] Kernel 5.1-rc1 fails to boot on a PowerMac G4 3,6: Caused by (from SRR1=141020): Transfer error ack signal

2019-05-09 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=203125

--- Comment #6 from Christophe Leroy (christophe.le...@c-s.fr) ---
Try followin change:

diff --git a/arch/powerpc/mm/book3s32/hash_low.S
b/arch/powerpc/mm/book3s32/hash_low.S
index e27792d0b744..8366c2abeafc 100644
--- a/arch/powerpc/mm/book3s32/hash_low.S
+++ b/arch/powerpc/mm/book3s32/hash_low.S
@@ -539,7 +539,8 @@ _GLOBAL(flush_hash_pages)
 #ifdef CONFIG_SMP
 lisr9, (mmu_hash_lock - PAGE_OFFSET)@ha
 addir9, r9, (mmu_hash_lock - PAGE_OFFSET)@l
-lwzr8,TASK_CPU(r2)
+tophys(r8, r2)
+lwzr8, TASK_CPU(r8)
 orisr8,r8,9
 10:lwarxr0,0,r9
 cmpi0,r0,0

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

Re: [Bug 203125] Kernel 5.1-rc1 fails to boot on a PowerMac G4 3,6: Caused by (from SRR1=141020): Transfer error ack signal

2019-05-09 Thread Christophe Leroy




On 05/09/2019 10:05 AM, bugzilla-dae...@bugzilla.kernel.org wrote:

https://bugzilla.kernel.org/show_bug.cgi?id=203125

Christophe Leroy (christophe.le...@c-s.fr) changed:

What|Removed |Added

  CC||christophe.le...@c-s.fr

--- Comment #5 from Christophe Leroy (christophe.le...@c-s.fr) ---
Apparently, MSR DR is not sent and DAR has value 0xc0090654, meaning that a
virt address tries to get accessed while in real mode.



Could you try the patch below:

diff --git a/arch/powerpc/mm/book3s32/hash_low.S 
b/arch/powerpc/mm/book3s32/hash_low.S

index e27792d0b744..8366c2abeafc 100644
--- a/arch/powerpc/mm/book3s32/hash_low.S
+++ b/arch/powerpc/mm/book3s32/hash_low.S
@@ -539,7 +539,8 @@ _GLOBAL(flush_hash_pages)
 #ifdef CONFIG_SMP
lis r9, (mmu_hash_lock - PAGE_OFFSET)@ha
addir9, r9, (mmu_hash_lock - PAGE_OFFSET)@l
-   lwz r8,TASK_CPU(r2)
+   tophys  (r8, r2)
+   lwz r8, TASK_CPU(r8)
orisr8,r8,9
 10:lwarx   r0,0,r9
cmpi0,r0,0


Christophe


[Bug 203125] Kernel 5.1-rc1 fails to boot on a PowerMac G4 3,6: Caused by (from SRR1=141020): Transfer error ack signal

2019-05-09 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=203125

Christophe Leroy (christophe.le...@c-s.fr) changed:

   What|Removed |Added

 CC||christophe.le...@c-s.fr

--- Comment #5 from Christophe Leroy (christophe.le...@c-s.fr) ---
Apparently, MSR DR is not sent and DAR has value 0xc0090654, meaning that a
virt address tries to get accessed while in real mode.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

Re: Build failure in v4.4.y.queue (ppc:allmodconfig)

2019-05-09 Thread Michal Suchánek
On Thu, 9 May 2019 08:53:24 +0200
Greg Kroah-Hartman  wrote:

> On Wed, May 08, 2019 at 01:26:42PM -0700, Guenter Roeck wrote:
> > I see multiple instances of:
> > 
> > arch/powerpc/kernel/exceptions-64s.S:839: Error:
> > attempt to move .org backwards
> > 
> > in v4.4.y.queue (v4.4.179-143-gc4db218e9451).
> > 
> > This is due to commit 9b2d4e06d7f1 ("powerpc/64s: Add support for a store
> > forwarding barrier at kernel entry/exit"), which is part of a large patch
> > series and can not easily be reverted.
> > 
> > Guess I'll stop doing ppc:allmodconfig builds in v4.4.y ?  
> 
> Michael, I thought this patch series was supposed to fix ppc issues, not
> add to them :)
> 
> Any ideas on what to do here?

What exact code do you build?

In my source there the SLB relon handler starts just above this line and
a lot of out-ouf-line are handlers before. Moving some out-of-line
handlers below the parts with fixed location should fix the build error.

Thanks

Michal


Re: Crashes in linux-next on powerpc with CONFIG_PPC_KUAP and CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG

2019-05-09 Thread Petr Mladek
On Wed 2019-05-08 00:54:51, Michael Ellerman wrote:
> Hi folks,
> 
> Just an FYI in case anyone else is seeing crashes very early in boot in
> linux-next with the above config options.
>
> The problem is the combination of some new code called via printk(),
> check_pointer() which calls probe_kernel_read(). That then calls 
> allow_user_access() (PPC_KUAP) and that uses mmu_has_feature() too early
> (before we've patched features). With the JUMP_LABEL debug enabled that
> causes us to call printk() & dump_stack() and we end up recursing and
> overflowing the stack.

Sigh, the check_pointer() stuff is in Linus's tree now, see
the commit 3e5903eb9cff707301712 ("vsprintf: Prevent crash when
dereferencing invalid pointers").

> Because it happens so early you don't get any output, just an apparently
> dead system.
> 
> The stack trace (which you don't see) is something like:
> 
>   ...
>   dump_stack+0xdc
>   probe_kernel_read+0x1a4
>   check_pointer+0x58
>   string+0x3c
>   vsnprintf+0x1bc
>   vscnprintf+0x20
>   printk_safe_log_store+0x7c
>   printk+0x40
>   dump_stack_print_info+0xbc
>   dump_stack+0x8
>   probe_kernel_read+0x1a4
>   probe_kernel_read+0x19c
>   check_pointer+0x58
>   string+0x3c
>   vsnprintf+0x1bc
>   vscnprintf+0x20
>   vprintk_store+0x6c
>   vprintk_emit+0xec
>   vprintk_func+0xd4
>   printk+0x40
>   cpufeatures_process_feature+0xc8
>   scan_cpufeatures_subnodes+0x380
>   of_scan_flat_dt_subnodes+0xb4
>   dt_cpu_ftrs_scan_callback+0x158
>   of_scan_flat_dt+0xf0
>   dt_cpu_ftrs_scan+0x3c
>   early_init_devtree+0x360
>   early_setup+0x9c
> 
> 
> The simple fix is to use early_mmu_has_feature() in allow_user_access(),
> but we'd rather not do that because it penalises all
> copy_to/from_users() for the life of the system with the cost of the
> runtime check vs the jump label. The irony is probe_kernel_read()
> shouldn't be allowing user access at all, because we're reading the
> kernel not userspace.

I have tried to find a lightweight way for a safe reading of unknown
kernel pointer. But I have not succeeded so far. I see only variants
with user access. The user access is handled in arch-specific code
and I do not see any variant without it.

I am not sure on which level it should get fixed.

Could you please send it to lkml to get a wider audience?

Best Regards,
Petr

> For now if you're hitting it just turn off 
> CONFIG_PPC_KUAP and/or CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG.
> 
> cheers


Re: [PATCH] powerpc: Fix compile issue with force DAWR

2019-05-09 Thread Christophe Leroy




Le 08/05/2019 à 03:30, Michael Neuling a écrit :

If you compile with KVM but without CONFIG_HAVE_HW_BREAKPOINT you fail
at linking with:
   arch/powerpc/kvm/book3s_hv_rmhandlers.o:(.text+0x708): undefined reference 
to `dawr_force_enable'

This was caused by this recent patch:
commit c1fe190c06723322f2dfac31d3b982c581e434ef
Author: Michael Neuling 
powerpc: Add force enable of DAWR on P9 option


I think you should use the standard commit format, checkpatch will tell you.



This builds dawr_force_enable in always via a new file.


Do we really need a new file just for that ?
As far as I understand, it is always compiled in, so can't we use 
another file like traps.o or setup-common.o or somewhere else ?


Or just put an ifdef in arch/powerpc/kvm/book3s_hv_rmhandlers.S ?
Because your fix will put dawr_force_enable on every build even the ones 
who don't need it at all.


Christophe



Signed-off-by: Michael Neuling  > ---
  arch/powerpc/kernel/Makefile|  2 +-
  arch/powerpc/kernel/dawr.c  | 11 +++
  arch/powerpc/kernel/hw_breakpoint.c |  3 ---
  3 files changed, 12 insertions(+), 4 deletions(-)
  create mode 100644 arch/powerpc/kernel/dawr.c

diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 0ea6c4aa3a..48a20ef5be 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -49,7 +49,7 @@ obj-y := cputable.o ptrace.o 
syscalls.o \
   signal.o sysfs.o cacheinfo.o time.o \
   prom.o traps.o setup-common.o \
   udbg.o misc.o io.o misc_$(BITS).o \
-  of_platform.o prom_parse.o
+  of_platform.o prom_parse.o dawr.o
  obj-$(CONFIG_PPC64)   += setup_64.o sys_ppc32.o \
   signal_64.o ptrace32.o \
   paca.o nvram_64.o firmware.o
diff --git a/arch/powerpc/kernel/dawr.c b/arch/powerpc/kernel/dawr.c
new file mode 100644
index 00..ca343efd23
--- /dev/null
+++ b/arch/powerpc/kernel/dawr.c
@@ -0,0 +1,11 @@
+// SPDX-License-Identifier: GPL-2.0+
+//
+// DAWR global variables
+//
+// Copyright 2019, Michael Neuling, IBM Corporation.
+
+#include 
+#include 
+
+bool dawr_force_enable;
+EXPORT_SYMBOL_GPL(dawr_force_enable);
diff --git a/arch/powerpc/kernel/hw_breakpoint.c 
b/arch/powerpc/kernel/hw_breakpoint.c
index da307dd93e..78a17454f4 100644
--- a/arch/powerpc/kernel/hw_breakpoint.c
+++ b/arch/powerpc/kernel/hw_breakpoint.c
@@ -381,9 +381,6 @@ void hw_breakpoint_pmu_read(struct perf_event *bp)
/* TODO */
  }
  
-bool dawr_force_enable;

-EXPORT_SYMBOL_GPL(dawr_force_enable);
-
  static ssize_t dawr_write_file_bool(struct file *file,
const char __user *user_buf,
size_t count, loff_t *ppos)



Re: [PATCH] [v2] x86/mpx: fix recursive munmap() corruption

2019-05-09 Thread Ingo Molnar


* Dave Hansen  wrote:

> Reported-by: Richard Biener 
> Reported-by: H.J. Lu 
> Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap")
> Cc: Yang Shi 
> Cc: Michal Hocko 
> Cc: Vlastimil Babka 
> Cc: Andy Lutomirski 
> Cc: x...@kernel.org
> Cc: Andrew Morton 
> Cc: linux-ker...@vger.kernel.org
> Cc: linux...@kvack.org
> Cc: sta...@vger.kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux...@lists.infradead.org
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Cc: Michael Ellerman 
> Cc: linux-a...@vger.kernel.org
> Cc: Guan Xuetao 
> Cc: Jeff Dike 
> Cc: Richard Weinberger 
> Cc: Anton Ivanov 

I've also added your:

  Signed-off-by: Dave Hansen 

Because I suppose you intended to sign off on it?

Thanks,

Ingo


Re: [PATCH v2 5/8] mm/memory_hotplug: Drop MHP_MEMBLOCK_API

2019-05-09 Thread David Hildenbrand
On 09.05.19 01:08, osalvador wrote:
> On Wed, 2019-05-08 at 09:39 +0200, David Hildenbrand wrote:
>> However I haven't really thought it through yet, smalles like that
>> could
>> as well just be handled by the caller of
>> arch_add_memory()/arch_remove_memory() eventually, passing it via
>> something like the altmap parameter.
>>
>> Let's leave it in place for now, we can talk about that once we have
>> new
>> patches from Oscar.
> Hi David,
> 
> I plan to send a new patchset once this is and Dan's are merged,
> otherwise I will have a mayhem with the conflicts.
> 
> I also plan/want to review this patchset, but time is tight this week.
> 

Sure, take your time. I'll resend this patch set most probably tomorrow
or early next week. Cheers!

-- 

Thanks,

David / dhildenb


Re: Build failure in v4.4.y.queue (ppc:allmodconfig)

2019-05-09 Thread Greg Kroah-Hartman
On Wed, May 08, 2019 at 01:26:42PM -0700, Guenter Roeck wrote:
> I see multiple instances of:
> 
> arch/powerpc/kernel/exceptions-64s.S:839: Error:
>   attempt to move .org backwards
> 
> in v4.4.y.queue (v4.4.179-143-gc4db218e9451).
> 
> This is due to commit 9b2d4e06d7f1 ("powerpc/64s: Add support for a store
> forwarding barrier at kernel entry/exit"), which is part of a large patch
> series and can not easily be reverted.
> 
> Guess I'll stop doing ppc:allmodconfig builds in v4.4.y ?

Michael, I thought this patch series was supposed to fix ppc issues, not
add to them :)

Any ideas on what to do here?

thanks,

greg k-h


Re: [PATCH v5 00/16] KVM: PPC: Book3S HV: add XIVE native exploitation mode

2019-05-09 Thread Cédric Le Goater


Satheesh,

>> Xive(both ic-mode=dual and ic-mode=xive) guest fails to boot with 
>> guest memory > 64G, till 64G it boots fine.
>>
>> Note: xics(ic-mode=xics) guest with the same configuration boots fine
> 
> Indeed. The guest hangs because IPIs are not correctly received. The guest 
> sees the EQ page as being filled with zeroes and discards the interrupt 
> whereas the host, KVM and QEMU, sees the correct entries.
> 
> I haven't spotted anything bizarre from guest side. Do we have a 64GB 
> frontier somewhere in KVM ? 

The issue was an erroneous assignment of the EQ page address in QEMU.

I pushed the fix in my QEMU branch : 

  https://github.com/legoater/qemu/commits/xive-next

Thanks,

C.


Re: [next-20190507][powerpc] WARN kernel/cgroup/cgroup.c:6008 with LTP ptrace01 test case

2019-05-09 Thread Sachin Sant


> On 09-May-2019, at 4:53 AM, Roman Gushchin  wrote:
> 
> On Wed, May 08, 2019 at 03:06:30PM +0530, Sachin Sant wrote:
>> While running LTP tests(specifically ptrace01) following WARNING is observed
>> on POWER8 LPAR running next-20190507 built using 4K page size.
>> 
>> [ 3969.979492] msgrcv04 (433) used greatest stack depth: 9328 bytes left
>> [ 3981.452911] madvise06 (515): drop_caches: 3
>> [ 4004.575752] WARNING: CPU: 5 PID: 721 at kernel/cgroup/cgroup.c:6008 
>> cgroup_exit+0x2ac/0x2c0
>> [ 4004.575781] Modules linked in: overlay rpadlpar_io rpaphp iptable_mangle 
>> xt_MASQUERADE iptable_nat nf_nat xt_conntrack nf_conntrack nf_defrag_ipv4 
>> ipt_REJECT nf_reject_ipv4 xt_tcpudp tun bridge stp llc kvm iptable_filter 
>> pseries_rng rng_core vmx_crypto ip_tables x_tables autofs4 [last unloaded: 
>> dummy_del_mod]
>> [ 4004.575837] CPU: 5 PID: 721 Comm: ptrace01 Tainted: G   O  
>> 5.1.0-next-20190507-autotest #1
>> [ 4004.575846] NIP:  c1b3026c LR: c1b30054 CTR: 
>> c1c9f020
>> [ 4004.575855] REGS: c00171fff840 TRAP: 0700   Tainted: G   O
>>(5.1.0-next-20190507-autotest)
>> [ 4004.575863] MSR:  80010282b033 
>>   CR: 44004824  XER: 2000
>> [ 4004.575885] CFAR: c1b30078 IRQMASK: 1 
>> [ 4004.575885] GPR00: c1b30054 c00171fffad0 c3938700 
>> c0027b02fa18 
>> [ 4004.575885] GPR04: c0027b02fa00  c3ae8700 
>> 001c180a 
>> [ 4004.575885] GPR08: 0001 0001 c3ae8700 
>> 0001 
>> [ 4004.575885] GPR12: 4400 c0001ec7ea80 c3a4d670 
>> 0009 
>> [ 4004.575885] GPR16:  00040100 418004fc 
>> 0843 
>> [ 4004.575885] GPR20: 0009 0001 c001715e9200 
>> c0016d8f4d00 
>> [ 4004.575885] GPR24: c00171fffd90 0100 c00168692478 
>> c00171fffb98 
>> [ 4004.575885] GPR28: c00168692400 c0016d8f4d00 c36420d0 
>> c0027b02fa00 
>> [ 4004.575958] NIP [c1b3026c] cgroup_exit+0x2ac/0x2c0
>> [ 4004.575966] LR [c1b30054] cgroup_exit+0x94/0x2c0
>> [ 4004.575972] Call Trace:
>> [ 4004.575979] [c00171fffad0] [c1b30054] cgroup_exit+0x94/0x2c0 
>> (unreliable)
>> [ 4004.575990] [c00171fffb30] [c19cea98] do_exit+0x878/0x1ae0
>> [ 4004.575999] [c00171fffc00] [c19cfe4c] do_group_exit+0xac/0x1d0
>> [ 4004.576009] [c00171fffc40] [c19ed00c] get_signal+0x2bc/0x11c0
>> [ 4004.576019] [c00171fffd30] [c1867b14] 
>> do_notify_resume+0x384/0x900
>> [ 4004.576029] [c00171fffe20] [c183e844] 
>> ret_from_except_lite+0x70/0x74
>> [ 4004.576037] Instruction dump:
>> [ 4004.576043] 314a0001 7d40492d 40c2fff4 3d42001b e92a7288 39290001 
>> f92a7288 4bfffe5c 
>> [ 4004.576056] 3d42001b e92a7258 39290001 f92a7258 <0fe0> 4bfffe0c 
>> 4be91e45 6000 
>> [ 4004.576071] ---[ end trace 82a1a7c19005ebd6 ]—
>> 
>> The WARN_ONCE was added by following commit 
>> 96b9c592def5 ("cgroup: get rid of cgroup_freezer_frozen_exit()”). 
>> 
>> Reverting the patch helps avoid the warning.
> 
> Hi Sachin!
> 
> Thank you for the report!
> 
> Can you, please, check that the patch at https://lkml.org/lkml/2019/5/8/938 
> 
> solves the problem?
> 
This patch fixes the problem for me.

Thanks
-Sachin