Re: [RFC] Efficiency of the phandle_cache on ppc64/SLOF

2019-11-29 Thread Frank Rowand
On 11/29/19 9:10 AM, Sebastian Andrzej Siewior wrote:
> I've been looking at phandle_cache and noticed the following: The raw
> phandle value as generated by dtc starts at zero and is incremented by
> one for each phandle entry. The qemu pSeries model is using SLOF (which
> is probably the same thing as used on real hardware) and there the
> phandle looks like a pointer value.
> With
>   qemu-system-ppc64le -m 16G -machine pseries -smp 8 
> 
> I got the following output:
> | entries: 64
> | phandle 7e732468 slot 28 hash c
> | phandle 7e732ad0 slot 10 hash 27
> | phandle 7e732ee8 slot 28 hash 3a
> | phandle 7e734160 slot 20 hash 36
> | phandle 7e734318 slot 18 hash 3a
> | phandle 7e734428 slot 28 hash 33
> | phandle 7e734538 slot 38 hash 2c
> | phandle 7e734850 slot 10 hash e
> | phandle 7e735220 slot 20 hash 2d
> | phandle 7e735bf0 slot 30 hash d
> | phandle 7e7365c0 slot 0 hash 2d
> | phandle 7e736f90 slot 10 hash d
> | phandle 7e737960 slot 20 hash 2d
> | phandle 7e738330 slot 30 hash d
> | phandle 7e738d00 slot 0 hash 2d
> | phandle 7e739730 slot 30 hash 38
> | phandle 7e73bd08 slot 8 hash 17
> | phandle 7e73c2e0 slot 20 hash 32
> | phandle 7e73c7f8 slot 38 hash 37
> | phandle 7e782420 slot 20 hash 13
> | phandle 7e782ed8 slot 18 hash 1b
> | phandle 7e73ce28 slot 28 hash 39
> | phandle 7e73d390 slot 10 hash 22
> | phandle 7e73d9a8 slot 28 hash 1a
> | phandle 7e73dc28 slot 28 hash 37
> | phandle 7e73de00 slot 0 hash a
> | phandle 7e73e028 slot 28 hash 0
> | phandle 7e7621a8 slot 28 hash 36
> | phandle 7e73e458 slot 18 hash 1e
> | phandle 7e73e608 slot 8 hash 1e
> | phandle 7e740078 slot 38 hash 28
> | phandle 7e740180 slot 0 hash 1d
> | phandle 7e740240 slot 0 hash 33
> | phandle 7e740348 slot 8 hash 29
> | phandle 7e740410 slot 10 hash 2
> | phandle 7e740eb0 slot 30 hash 3e
> | phandle 7e745390 slot 10 hash 33
> | phandle 7e747b08 slot 8 hash c
> | phandle 7e748528 slot 28 hash f
> | phandle 7e74a6e0 slot 20 hash 18
> | phandle 7e74aab0 slot 30 hash b
> | phandle 7e74f788 slot 8 hash d
> | Used entries: 8, hashed: 29
> 
> So the hash array has 64 entries, of which only 8 are populated. Using
> hash_32() populates 29 entries.
> Could someone with real hardware verify this?
> I'm not sure how important this is performance-wise, but it looks like
> a waste to use only 1/8 of the array.

The hash used is based on the assumption you noted and, as stated in the
code, that phandle property values are in a contiguous range of 1..n
(not starting at zero), which is what dtc generates.
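
For reference, the two slot computations can be compared outside the
kernel. A minimal standalone sketch (not kernel code; GOLDEN_RATIO_32 and
hash_32() mirror include/linux/hash.h, the SLOF-style values are taken
from the dump above, and the mask matches the 64-entry cache):

	#include <stdio.h>
	#include <stdint.h>

	#define GOLDEN_RATIO_32 0x61C88647u

	/* same computation as hash_32() in include/linux/hash.h */
	static unsigned int hash_32(uint32_t val, unsigned int bits)
	{
		return (val * GOLDEN_RATIO_32) >> (32 - bits);
	}

	int main(void)
	{
		uint32_t dtc_style[]  = { 1, 2, 3, 4 }; /* contiguous, as dtc emits */
		uint32_t slof_style[] = { 0x7e732468, 0x7e732ad0, 0x7e732ee8 };
		unsigned int i, mask = 64 - 1;          /* 64-entry cache, 6 bits */

		for (i = 0; i < 4; i++)
			printf("dtc  %x: slot %x hash %x\n", dtc_style[i],
			       dtc_style[i] & mask, hash_32(dtc_style[i], 6));
		for (i = 0; i < 3; i++)
			printf("slof %x: slot %x hash %x\n", slof_style[i],
			       slof_style[i] & mask, hash_32(slof_style[i], 6));
		return 0;
	}

Contiguous dtc-style phandles fill consecutive slots under the mask, while
the SLOF-style values collide in their low bits (note the repeated slot 28
in the dump), which is why only 8 of the 64 slots end up used.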

We knew that for systems which do not match those assumptions the hash
would not be optimal.  Unless there is a serious performance problem for
such systems, I do not want to make the phandle hash code more complicated
to optimize for these cases.  And pseries systems have been performing OK,
without any phandle-related performance issues that I remember hearing of,
since before the cache was added, and the cache can only have helped
performance.  Yes, if your observations are correct, some memory is being
wasted, but a 64-entry cache is not very large on a pseries machine.

There is already some pushback from Rob that the existing code is more
complex than needed (e.g. the variable cache size).

-Frank

> 
> The patch used for testing:
> 
> diff --git a/drivers/of/base.c b/drivers/of/base.c
> index 1d667eb730e19..2640d4bc81a9a 100644
> --- a/drivers/of/base.c
> +++ b/drivers/of/base.c
> @@ -197,6 +197,7 @@ void of_populate_phandle_cache(void)
>   u32 cache_entries;
>   struct device_node *np;
>   u32 phandles = 0;
> + struct device_node **cache2;
>  
>   raw_spin_lock_irqsave(&devtree_lock, flags);
>  
> @@ -214,14 +215,32 @@ void of_populate_phandle_cache(void)
>  
>   phandle_cache = kcalloc(cache_entries, sizeof(*phandle_cache),
>   GFP_ATOMIC);
> + cache2 = kcalloc(cache_entries, sizeof(*phandle_cache), GFP_ATOMIC);
>   if (!phandle_cache)
>   goto out;
>  
> + pr_err("%s(%d) entries: %d\n", __func__, __LINE__, cache_entries);
>   for_each_of_allnodes(np)
>   if (np->phandle && np->phandle != OF_PHANDLE_ILLEGAL) {
> + int slot;
>   of_node_get(np);
>   phandle_cache[np->phandle & phandle_cache_mask] = np;
> + slot = hash_32(np->phandle, __ffs(cache_entries));
> + cache2[slot] = np;
> + pr_err("%s(%d) phandle %x slot %x hash %x\n", __func__, __LINE__,
> +np->phandle, np->phandle & phandle_cache_mask, slot);
>   }
> + {
> + int i, filled = 0, filled_hash = 0;
> +
> + for (i = 0; i < cache_entries; i++) {
> + if (phandle_cache[i])
> + filled++;
> + if (cache2[i])
> + filled_hash++;
> + }
> + pr_err("%s(%d) Used entries: %d, hashed: %d\n", __func__,
> +__LINE__, filled, filled_hash);
> + }

Re: [PATCH] powerpc/kasan: KASAN is not supported on RELOCATABLE && FSL_BOOKE

2019-11-29 Thread shaolexi
>On 29/11/2019 at 08:46, Christophe Leroy wrote:
>>
>>
>> On 29/11/2019 at 08:04, Lexi Shao wrote:
>>> CONFIG_RELOCATABLE and CONFIG_KASAN cannot be enabled at the same
>>> time on ppce500 fsl_booke: all functions called before
>>> kasan_early_init() must have KASAN checks disabled. When
>>> CONFIG_RELOCATABLE is enabled on ppce500 fsl_booke, relocate_init()
>>> is called before kasan_early_init(), which triggers a KASAN check
>>> and results in a boot failure.
>>> Call trace, with the functions that trigger KASAN checks marked (*):
>>>- _start
>>> - set_ivor
>>>  - relocate_init(*)
>>>   - early_get_first_memblock_info(*)
>>>- of_scan_flat_dt(*)
>>> ...
>>>  - kasan_early_init
>>>
>>> Potential solutions could be: 1. implement relocate_init() and all its
>>> child functions in a separate file, or 2. introduce a global
>>> variable in KASAN and only enable the KASAN checks once init is done.
>>
>> Solution 1 seems difficult: of_scan_flat_dt() and its children are
>> general functions that can't be set aside.
>> Solution 2 would destroy performance, and would anyway not work with
>> inline instrumentation.
>>
>> Have you tried moving the call to kasan_early_init() before the call
>> of
>> relocate_init() ?
>
>I just tried it with QEMU, it works. I'll send a patch out soon.
>

Yes I tried, but couldn't get it to work on a P1010. There might be a
conflict somewhere else with my kernel config. I will keep on debugging.
Thanks for the prompt reply and trying it out on qemu.

Lexi

>
>
>> On other PPC32, kasan_early_init() is the first thing done after
>> activating the MMU. But AFAIU, the MMU is always active on BOOKE.
>>
>> Christophe
>>
>>>
>>> Disable KASAN when RELOCATABLE is selected on fsl_booke for now until
>>> it is supported.
>>>
>>> Signed-off-by: Lexi Shao 
>>> ---
>>>   arch/powerpc/Kconfig | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index
>>> 3e56c9c2f16e..14f3da63c088 100644
>>> --- a/arch/powerpc/Kconfig
>>> +++ b/arch/powerpc/Kconfig
>>> @@ -171,7 +171,7 @@ config PPC
>>>   select HAVE_ARCH_AUDITSYSCALL
>>>   select HAVE_ARCH_HUGE_VMAPif PPC_BOOK3S_64 &&
>>> PPC_RADIX_MMU
>>>   select HAVE_ARCH_JUMP_LABEL
>>> -select HAVE_ARCH_KASANif PPC32
>>> +select HAVE_ARCH_KASANif PPC32 && !(RELOCATABLE &&
>>> FSL_BOOKE)
>>>   select HAVE_ARCH_KGDB
>>>   select HAVE_ARCH_MMAP_RND_BITS
>>>   select HAVE_ARCH_MMAP_RND_COMPAT_BITSif COMPAT
>>>


Re: [PATCH] ASoC: fsl_sai: add IRQF_SHARED

2019-11-29 Thread Fabio Estevam
Hi Michael,

On Thu, Nov 28, 2019 at 7:38 PM Michael Walle  wrote:
>
> The LS1028A SoC uses the same interrupt line for adjacent SAIs. Use
> IRQF_SHARED to be able to use these SAIs simultaneously.
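
(For context, the change amounts to passing IRQF_SHARED when the driver
claims its interrupt line. A sketch of what the registration then looks
like -- not the actual diff; fsl_sai_isr and the probe variables are the
driver's own names, used here illustratively:)

	/* in probe: a handler on a shared line must recognise "not my
	 * device" and return IRQ_NONE, since the adjacent SAI's
	 * interrupts arrive on the same line. */
	ret = devm_request_irq(&pdev->dev, irq, fsl_sai_isr,
			       IRQF_SHARED, np->name, sai);
	if (ret) {
		dev_err(&pdev->dev, "failed to claim irq %u\n", irq);
		return ret;
	}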

On i.MX8M, SAI5 and SAI6 share the same interrupt number too:

Reviewed-by: Fabio Estevam 

Thanks


Re: [PATCH v2 17/19] powerpc: book3s64: convert to pin_user_pages() and put_user_page()

2019-11-29 Thread John Hubbard

On 11/29/19 3:23 AM, Jan Kara wrote:

On Mon 25-11-19 15:10:33, John Hubbard wrote:

1. Convert from get_user_pages() to pin_user_pages().

2. As required by pin_user_pages(), release these pages via
put_user_page(). In this case, do so via put_user_pages_dirty_lock().

That has the side effect of calling set_page_dirty_lock(), instead
of set_page_dirty(). This is probably more accurate.


Maybe more accurate, but it doesn't work for mm_iommu_unpin(). As far as
I can see, mm_iommu_unpin() gets called from an RCU callback, which is
executed in interrupt context, and you cannot lock pages from such a
context. So you need to queue work from the RCU callback and then do the
real work from the workqueue...

Honza


ah yes, fixed locally. (In order to avoid distracting people during the merge
window, I won't post any more versions of the series until the merge window is
over, unless a maintainer tells me that any of these patches are desired for
5.5.)

With that, we are back to a one-line diff for this part:

@@ -215,7 +214,7 @@ static void mm_iommu_unpin(struct mm_iommu_table_group_mem_t *mem)
if (mem->hpas[i] & MM_IOMMU_TABLE_GROUP_PAGE_DIRTY)
SetPageDirty(page);
 
-   put_page(page);

+   put_user_page(page);
mem->hpas[i] = 0;
}
 }
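
(For reference, the deferral Jan describes is a standard pattern; a
minimal sketch with made-up names, not the actual mm_iommu code:)

	/* The RCU callback runs in softirq context, where
	 * set_page_dirty_lock() must not be called, so it only queues
	 * the real work onto a workqueue. */
	struct mem_release {
		struct rcu_head rcu;
		struct work_struct work;
	};

	static void mem_release_workfn(struct work_struct *work)
	{
		struct mem_release *rel =
			container_of(work, struct mem_release, work);

		/* process context: safe to call put_user_pages_dirty_lock()
		 * and friends here, then free the container */
		kfree(rel);
	}

	static void mem_release_rcu(struct rcu_head *rcu)
	{
		struct mem_release *rel =
			container_of(rcu, struct mem_release, rcu);

		INIT_WORK(&rel->work, mem_release_workfn);
		schedule_work(&rel->work);
	}

	/* caller side: call_rcu(&rel->rcu, mem_release_rcu); */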

btw, I'm also working on your feedback for patch 17 (mm/gup: track
FOLL_PIN pages [1]) from a few days earlier; it's not being ignored, I'm
just trying to avoid distracting people during the merge window.

[1] https://lore.kernel.org/r/20191121093941.ga18...@quack2.suse.cz

thanks,
--
John Hubbard
NVIDIA


Re: Build failure on latest powerpc/merge (311ae9e159d8 io_uring: fix dead-hung for non-iter fixed rw)

2019-11-29 Thread Jens Axboe

On 11/29/19 10:07 AM, Pavel Begunkov wrote:

On 29/11/2019 20:16, Jens Axboe wrote:

On 11/29/19 8:14 AM, Christophe Leroy wrote:


Reverting commit 311ae9e159d8 ("io_uring: fix dead-hung for non-iter
fixed rw") clears the failure.

Most likely an #include is missing.


Huh weird how the build bots didn't catch that. Does the below work?


Yes it works, thanks.


Thanks for reporting and testing, I've queued it up with your reported
and tested-by.


My bad, thanks for the report and fixing.


No worries, usually the build bots are great at finding these before
patches go upstream. They have been unreliable lately, unfortunately.

--
Jens Axboe



Re: [PATCH v5 0/5] Append new variables to vmcoreinfo (TCR_EL1.T1SZ for arm64 and MAX_PHYSMEM_BITS for all archs)

2019-11-29 Thread Bhupesh Sharma
Hi Will,

On Fri, Nov 29, 2019 at 3:54 PM Will Deacon  wrote:
>
> On Fri, Nov 29, 2019 at 01:53:36AM +0530, Bhupesh Sharma wrote:
> > Changes since v4:
> > 
> > - v4 can be seen here:
> >   http://lists.infradead.org/pipermail/kexec/2019-November/023961.html
> > - Addressed comments from Dave and added patches for documenting
> >   new variables appended to vmcoreinfo documentation.
> > - Added testing report shared by Akashi for PATCH 2/5.
>
> Please can you fix your mail setup? The last two times you've sent this
> series it seems to get split into two threads, which is really hard to
> track in my inbox:
>
> First thread:
>
> https://lore.kernel.org/lkml/1574972621-25750-1-git-send-email-bhsha...@redhat.com/
>
> Second thread:
>
> https://lore.kernel.org/lkml/1574972716-25858-1-git-send-email-bhsha...@redhat.com/

There seems to be some issue with my server's msmtp settings. I have
tried resending the v5 (see
).

I hope the threading is ok this time.

Thanks for your patience.

Regards,
Bhupesh



[RESEND PATCH v5 5/5] Documentation/vmcoreinfo: Add documentation for 'TCR_EL1.T1SZ'

2019-11-29 Thread Bhupesh Sharma
Add documentation for TCR_EL1.T1SZ variable being added to
vmcoreinfo.

It indicates the size offset of the memory region addressed by TTBR1_EL1
and hence can be used for determining the vabits_actual value.

Cc: James Morse 
Cc: Mark Rutland 
Cc: Will Deacon 
Cc: Steve Capper 
Cc: Catalin Marinas 
Cc: Ard Biesheuvel 
Cc: Dave Anderson 
Cc: Kazuhito Hagio 
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-ker...@vger.kernel.org
Cc: ke...@lists.infradead.org
Signed-off-by: Bhupesh Sharma 
---
 Documentation/admin-guide/kdump/vmcoreinfo.rst | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/Documentation/admin-guide/kdump/vmcoreinfo.rst 
b/Documentation/admin-guide/kdump/vmcoreinfo.rst
index 447b64314f56..f9349f9d3345 100644
--- a/Documentation/admin-guide/kdump/vmcoreinfo.rst
+++ b/Documentation/admin-guide/kdump/vmcoreinfo.rst
@@ -398,6 +398,12 @@ KERNELOFFSET
 The kernel randomization offset. Used to compute the page offset. If
 KASLR is disabled, this value is zero.
 
+TCR_EL1.T1SZ
+------------
+
+Indicates the size offset of the memory region addressed by TTBR1_EL1
+and hence can be used for determining the vabits_actual value.
+
 arm
 ===
 
-- 
2.7.4



[RESEND PATCH v5 4/5] Documentation/vmcoreinfo: Add documentation for 'MAX_PHYSMEM_BITS'

2019-11-29 Thread Bhupesh Sharma
Add documentation for 'MAX_PHYSMEM_BITS' variable being added to
vmcoreinfo.

'MAX_PHYSMEM_BITS' defines the maximum supported physical address
space memory.

Cc: Boris Petkov 
Cc: Ingo Molnar 
Cc: Thomas Gleixner 
Cc: James Morse 
Cc: Will Deacon 
Cc: Michael Ellerman 
Cc: Paul Mackerras 
Cc: Benjamin Herrenschmidt 
Cc: Dave Anderson 
Cc: Kazuhito Hagio 
Cc: x...@kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-ker...@vger.kernel.org
Cc: ke...@lists.infradead.org
Signed-off-by: Bhupesh Sharma 
---
 Documentation/admin-guide/kdump/vmcoreinfo.rst | 5 +
 1 file changed, 5 insertions(+)

diff --git a/Documentation/admin-guide/kdump/vmcoreinfo.rst 
b/Documentation/admin-guide/kdump/vmcoreinfo.rst
index 007a6b86e0ee..447b64314f56 100644
--- a/Documentation/admin-guide/kdump/vmcoreinfo.rst
+++ b/Documentation/admin-guide/kdump/vmcoreinfo.rst
@@ -93,6 +93,11 @@ It exists in the sparse memory mapping model, and it is also somewhat
 similar to the mem_map variable, both of them are used to translate an
 address.
 
+MAX_PHYSMEM_BITS
+----------------
+
+Defines the maximum supported physical address space memory.
+
 page
 
 
-- 
2.7.4



[RESEND PATCH v5 3/5] Documentation/arm64: Fix a simple typo in memory.rst

2019-11-29 Thread Bhupesh Sharma
Fix a simple typo in arm64/memory.rst

Cc: Jonathan Corbet 
Cc: James Morse 
Cc: Mark Rutland 
Cc: Will Deacon 
Cc: Steve Capper 
Cc: Catalin Marinas 
Cc: Ard Biesheuvel 
Cc: linux-...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Cc: linux-arm-ker...@lists.infradead.org
Signed-off-by: Bhupesh Sharma 
---
 Documentation/arm64/memory.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/arm64/memory.rst b/Documentation/arm64/memory.rst
index 02e02175e6f5..cf03b3290800 100644
--- a/Documentation/arm64/memory.rst
+++ b/Documentation/arm64/memory.rst
@@ -129,7 +129,7 @@ this logic.
 
 As a single binary will need to support both 48-bit and 52-bit VA
 spaces, the VMEMMAP must be sized large enough for 52-bit VAs and
-also must be sized large enought to accommodate a fixed PAGE_OFFSET.
+also must be sized large enough to accommodate a fixed PAGE_OFFSET.
 
 Most code in the kernel should not need to consider the VA_BITS, for
 code that does need to know the VA size the variables are
-- 
2.7.4



[RESEND PATCH v5 2/5] arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo

2019-11-29 Thread Bhupesh Sharma
vabits_actual variable on arm64 indicates the actual VA space size,
and allows a single binary to support both 48-bit and 52-bit VA
spaces.

If the ARMv8.2-LVA optional feature is present, and we are running
with a 64KB page size; then it is possible to use 52-bits of address
space for both userspace and kernel addresses. However, any kernel
binary that supports 52-bit must also be able to fall back to 48-bit
at early boot time if the hardware feature is not present.

Since TCR_EL1.T1SZ indicates the size offset of the memory region
addressed by TTBR1_EL1 (and hence can be used for determining the
vabits_actual value), it makes more sense to export it in vmcoreinfo
rather than the vabits_actual variable: the name of the variable can
change in future kernel versions, whereas an architectural construct
like TCR_EL1.T1SZ is a stable way to indicate the intended field to
user-space.

User-space utilities like makedumpfile and crash-utility need to
read/write this value from/to vmcoreinfo for determining if a virtual
address lies in the linear map range.

The user-space computation for determining whether an address lies in
the linear map range is the same as we have in kernel-space:

  #define __is_lm_address(addr) (!(((u64)addr) & BIT(vabits_actual - 1)))
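
(For illustration, the user-space side can then be a one-liner; a sketch,
not actual makedumpfile code. It relies only on the architectural fact
that the TTBR1_EL1 region spans 2^(64 - T1SZ) bytes, so
vabits_actual = 64 - T1SZ:)

	#include <stdint.h>

	/* t1sz is read from the NUMBER(tcr_el1_t1sz) line in vmcoreinfo */
	static inline int is_lm_address(uint64_t addr, uint64_t t1sz)
	{
		uint64_t vabits_actual = 64 - t1sz;

		return !(addr & (1ULL << (vabits_actual - 1)));
	}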

I have sent out user-space patches for makedumpfile and crash-utility
to add features for obtaining vabits_actual value from TCR_EL1.T1SZ (see
[0] and [1]).

Akashi reported that he was able to use this patchset and the user-space
changes to get user-space working fine with the 52-bit kernel VA
changes (see [2]).

[0]. http://lists.infradead.org/pipermail/kexec/2019-November/023966.html
[1]. http://lists.infradead.org/pipermail/kexec/2019-November/024006.html
[2]. http://lists.infradead.org/pipermail/kexec/2019-November/023992.html

Cc: James Morse 
Cc: Mark Rutland 
Cc: Will Deacon 
Cc: Steve Capper 
Cc: Catalin Marinas 
Cc: Ard Biesheuvel 
Cc: Dave Anderson 
Cc: Kazuhito Hagio 
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-ker...@vger.kernel.org
Cc: ke...@lists.infradead.org
Signed-off-by: Bhupesh Sharma 
---
 arch/arm64/include/asm/pgtable-hwdef.h | 1 +
 arch/arm64/kernel/crash_core.c | 9 +
 2 files changed, 10 insertions(+)

diff --git a/arch/arm64/include/asm/pgtable-hwdef.h 
b/arch/arm64/include/asm/pgtable-hwdef.h
index d9fbd433cc17..d2e7aff5821e 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -215,6 +215,7 @@
 #define TCR_TxSZ(x)(TCR_T0SZ(x) | TCR_T1SZ(x))
 #define TCR_TxSZ_WIDTH 6
 #define TCR_T0SZ_MASK  (((UL(1) << TCR_TxSZ_WIDTH) - 1) << 
TCR_T0SZ_OFFSET)
+#define TCR_T1SZ_MASK  (((UL(1) << TCR_TxSZ_WIDTH) - 1) << 
TCR_T1SZ_OFFSET)
 
 #define TCR_EPD0_SHIFT 7
 #define TCR_EPD0_MASK  (UL(1) << TCR_EPD0_SHIFT)
diff --git a/arch/arm64/kernel/crash_core.c b/arch/arm64/kernel/crash_core.c
index ca4c3e12d8c5..f78310ba65ea 100644
--- a/arch/arm64/kernel/crash_core.c
+++ b/arch/arm64/kernel/crash_core.c
@@ -7,6 +7,13 @@
 #include 
 #include 
 
+static inline u64 get_tcr_el1_t1sz(void);
+
+static inline u64 get_tcr_el1_t1sz(void)
+{
+   return (read_sysreg(tcr_el1) & TCR_T1SZ_MASK) >> TCR_T1SZ_OFFSET;
+}
+
 void arch_crash_save_vmcoreinfo(void)
 {
VMCOREINFO_NUMBER(VA_BITS);
@@ -15,5 +22,7 @@ void arch_crash_save_vmcoreinfo(void)
kimage_voffset);
vmcoreinfo_append_str("NUMBER(PHYS_OFFSET)=0x%llx\n",
PHYS_OFFSET);
+   vmcoreinfo_append_str("NUMBER(tcr_el1_t1sz)=0x%llx\n",
+   get_tcr_el1_t1sz());
vmcoreinfo_append_str("KERNELOFFSET=%lx\n", kaslr_offset());
 }
-- 
2.7.4



[RESEND PATCH v5 1/5] crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo

2019-11-29 Thread Bhupesh Sharma
Right now user-space tools like 'makedumpfile' and 'crash' need to rely
on a best-guess method of determining the value of 'MAX_PHYSMEM_BITS'
supported by the underlying kernel.

This value is used in user-space code to calculate the bit-space
required to store a section for SPARSEMEM (similar to the existing
calculation method used in the kernel implementation):

  #define SECTIONS_SHIFT (MAX_PHYSMEM_BITS - SECTION_SIZE_BITS)
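
(For illustration, the user-space consumption could look like the sketch
below; vmcoreinfo_number() is a made-up helper, not makedumpfile code. It
relies only on the NUMBER(name)=value line format that the kernel's
VMCOREINFO_NUMBER() macro emits:)

	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>

	/* find "NUMBER(name)=value" in the vmcoreinfo note text */
	static long vmcoreinfo_number(const char *vmcoreinfo, const char *name)
	{
		char key[64];
		const char *p;

		snprintf(key, sizeof(key), "NUMBER(%s)=", name);
		p = strstr(vmcoreinfo, key);
		return p ? strtol(p + strlen(key), NULL, 0) : -1;
	}

	/* sections_shift = vmcoreinfo_number(info, "MAX_PHYSMEM_BITS")
	 *                  - SECTION_SIZE_BITS; */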

Now, regressions have been reported in user-space utilities
like 'makedumpfile' and 'crash' on arm64, with the recently added
kernel support for 52-bit physical address space, as there is
no clear method of determining this value in user-space
(other than reading kernel CONFIG flags).

As per suggestion from makedumpfile maintainer (Kazu), it makes more
sense to append 'MAX_PHYSMEM_BITS' to vmcoreinfo in the core code itself
rather than in arch-specific code, so that the user-space code for other
archs can also benefit from this addition to the vmcoreinfo and use it
as a standard way of determining 'SECTIONS_SHIFT' value in user-land.

A reference 'makedumpfile' implementation which reads the
'MAX_PHYSMEM_BITS' value from vmcoreinfo in an arch-independent fashion
is available here:

[0]. 
https://github.com/bhupesh-sharma/makedumpfile/blob/remove-max-phys-mem-bit-v1/arch/ppc64.c#L471

Cc: Boris Petkov 
Cc: Ingo Molnar 
Cc: Thomas Gleixner 
Cc: James Morse 
Cc: Mark Rutland 
Cc: Will Deacon 
Cc: Steve Capper 
Cc: Catalin Marinas 
Cc: Ard Biesheuvel 
Cc: Michael Ellerman 
Cc: Paul Mackerras 
Cc: Benjamin Herrenschmidt 
Cc: Dave Anderson 
Cc: Kazuhito Hagio 
Cc: x...@kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-ker...@vger.kernel.org
Cc: ke...@lists.infradead.org
Signed-off-by: Bhupesh Sharma 
---
 kernel/crash_core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 9f1557b98468..18175687133a 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -413,6 +413,7 @@ static int __init crash_save_vmcoreinfo_init(void)
VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS);
VMCOREINFO_STRUCT_SIZE(mem_section);
VMCOREINFO_OFFSET(mem_section, section_mem_map);
+   VMCOREINFO_NUMBER(MAX_PHYSMEM_BITS);
 #endif
VMCOREINFO_STRUCT_SIZE(page);
VMCOREINFO_STRUCT_SIZE(pglist_data);
-- 
2.7.4



[RESEND PATCH v5 0/5] Append new variables to vmcoreinfo (TCR_EL1.T1SZ for arm64 and MAX_PHYSMEM_BITS for all archs)

2019-11-29 Thread Bhupesh Sharma
- Resending the v5 version as Will Deacon reported that the patchset was
  split into two separate threads while being sent out. It was an issue
  with my 'msmtp' settings, which now seems to be fixed. Please ignore
  all previous v5 versions.

Changes since v4:

- v4 can be seen here:
  http://lists.infradead.org/pipermail/kexec/2019-November/023961.html
- Addressed comments from Dave and added patches for documenting
  new variables appended to vmcoreinfo documentation.
- Added testing report shared by Akashi for PATCH 2/5.

Changes since v3:

- v3 can be seen here:
  http://lists.infradead.org/pipermail/kexec/2019-March/022590.html
- Addressed comments from James and exported TCR_EL1.T1SZ in vmcoreinfo
  instead of PTRS_PER_PGD.
- Added a new patch (via [PATCH 3/3]), which fixes a simple typo in
  'Documentation/arm64/memory.rst'

Changes since v2:

- v2 can be seen here:
  http://lists.infradead.org/pipermail/kexec/2019-March/022531.html
- Protected 'MAX_PHYSMEM_BITS' vmcoreinfo variable under CONFIG_SPARSEMEM
  ifdef sections, as suggested by Kazu.
- Updated vmcoreinfo documentation to add description about
  'MAX_PHYSMEM_BITS' variable (via [PATCH 3/3]).

Changes since v1:

- v1 was sent out as a single patch which can be seen here:
  http://lists.infradead.org/pipermail/kexec/2019-February/022411.html

- v2 breaks the single patch into two independent patches:
  [PATCH 1/2] appends 'PTRS_PER_PGD' to vmcoreinfo for arm64 arch, whereas
  [PATCH 2/2] appends 'MAX_PHYSMEM_BITS' to vmcoreinfo in core kernel code (all 
archs)

This patchset primarily fixes the regression reported in user-space
utilities like 'makedumpfile' and 'crash-utility' on arm64 architecture
with the availability of 52-bit address space feature in underlying
kernel. These regressions have been reported both on CPUs which don't
support ARMv8.2 extensions (i.e. LVA, LPA) and are running newer kernels
and also on prototype platforms (like ARMv8 FVP simulator model) which
support ARMv8.2 extensions and are running newer kernels.

The reason for these regressions is that right now user-space tools
have no direct access to these values (since these are not exported
from the kernel) and hence need to rely on a best-guess method of
determining value of 'vabits_actual' and 'MAX_PHYSMEM_BITS' supported
by underlying kernel.

Exporting these values via vmcoreinfo will help user-land in such cases.
In addition, as per suggestion from makedumpfile maintainer (Kazu),
it makes more sense to append 'MAX_PHYSMEM_BITS' to
vmcoreinfo in the core code itself rather than in arm64 arch-specific
code, so that the user-space code for other archs can also benefit from
this addition to the vmcoreinfo and use it as a standard way of
determining 'SECTIONS_SHIFT' value in user-land.

Cc: Boris Petkov 
Cc: Ingo Molnar 
Cc: Thomas Gleixner 
Cc: Jonathan Corbet 
Cc: James Morse 
Cc: Mark Rutland 
Cc: Will Deacon 
Cc: Steve Capper 
Cc: Catalin Marinas 
Cc: Ard Biesheuvel 
Cc: Michael Ellerman 
Cc: Paul Mackerras 
Cc: Benjamin Herrenschmidt 
Cc: Dave Anderson 
Cc: Kazuhito Hagio 
Cc: x...@kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-ker...@vger.kernel.org
Cc: linux-...@vger.kernel.org
Cc: ke...@lists.infradead.org

Bhupesh Sharma (5):
  crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo
  arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo
  Documentation/arm64: Fix a simple typo in memory.rst
  Documentation/vmcoreinfo: Add documentation for 'MAX_PHYSMEM_BITS'
  Documentation/vmcoreinfo: Add documentation for 'TCR_EL1.T1SZ'

 Documentation/admin-guide/kdump/vmcoreinfo.rst | 11 +++
 Documentation/arm64/memory.rst |  2 +-
 arch/arm64/include/asm/pgtable-hwdef.h |  1 +
 arch/arm64/kernel/crash_core.c |  9 +
 kernel/crash_core.c|  1 +
 5 files changed, 23 insertions(+), 1 deletion(-)

-- 
2.7.4





Re: [PATCH v4 2/2] powerpc/irq: inline call_do_irq() and call_do_softirq()

2019-11-29 Thread Segher Boessenkool
Hi!

On Wed, Nov 27, 2019 at 04:15:15PM +0100, Christophe Leroy wrote:
> On 27/11/2019 at 15:59, Segher Boessenkool wrote:
> >On Wed, Nov 27, 2019 at 02:50:30PM +0100, Christophe Leroy wrote:
> >>So what do we do ? We just drop the "r2" clobber ?
> >
> >You have to make sure your asm code works for all ABIs.  This is quite
> >involved if you do a call to an external function.  The compiler does
> >*not* see this call, so you will have to make sure that all that the
> >compiler and linker do will work, or prevent some of those things (say,
> >inlining of the function containing the call).
> 
> But the whole purpose of the patch is to inline the call to __do_irq() 
> in order to avoid the trampoline function.

Yes, so you call __do_irq.  You have to make sure that what you tell the
compiler -- and what you *don't* tell the compiler -- works with what the
ABIs require, and what the called function expects and provides.

> >That does not fix everything.  The called function requires a specific
> >value in r2 on entry.
> 
> Euh ... but there is nothing like that when using existing 
> call_do_irq().

> How does GCC know that call_do_irq() has same TOC as __do_irq() ?

The existing call_do_irq isn't C code.  It doesn't do anything with r2,
as far as I can see; __do_irq just gets whatever the caller of call_do_irq
has.

So I guess all the callers of call_do_irq have the correct r2 value always
already?  In that case everything Just Works.

> >So all this needs verification.  Hopefully you can get away with just
> >not clobbering r2 (and not adding a nop after the bl), sure.  But this
> >needs to be checked.
> >
> >Changing control flow inside inline assembler always is problematic.
> >Another problem in this case (on all ABIs) is that the compiler does
> >not see you call __do_irq.  Again, you can probably get away with that
> >too, but :-)
> 
> Anyway it sees that I reference it, as it is in the input arguments.
> Isn't that enough?

It is enough for some things, sure.  But not all.


Segher


Re: Build failure on latest powerpc/merge (311ae9e159d8 io_uring: fix dead-hung for non-iter fixed rw)

2019-11-29 Thread Jens Axboe

On 11/29/19 8:14 AM, Christophe Leroy wrote:



On 29/11/2019 at 17:04, Jens Axboe wrote:

On 11/29/19 6:53 AM, Christophe Leroy wrote:

 CC  fs/io_uring.o
fs/io_uring.c: In function ‘loop_rw_iter’:
fs/io_uring.c:1628:21: error: implicit declaration of function ‘kmap’
[-Werror=implicit-function-declaration]
   iovec.iov_base = kmap(iter->bvec->bv_page)
^
fs/io_uring.c:1628:19: warning: assignment makes pointer from integer
without a cast [-Wint-conversion]
   iovec.iov_base = kmap(iter->bvec->bv_page)
  ^
fs/io_uring.c:1643:4: error: implicit declaration of function ‘kunmap’
[-Werror=implicit-function-declaration]
   kunmap(iter->bvec->bv_page);
   ^


Reverting commit 311ae9e159d8 ("io_uring: fix dead-hung for non-iter
fixed rw") clears the failure.

Most likely an #include is missing.


Huh weird how the build bots didn't catch that. Does the below work?


Yes it works, thanks.


Thanks for reporting and testing, I've queued it up with your reported
and tested-by.

--
Jens Axboe



Re: Build failure on latest powerpc/merge (311ae9e159d8 io_uring: fix dead-hung for non-iter fixed rw)

2019-11-29 Thread Christophe Leroy




On 29/11/2019 at 17:04, Jens Axboe wrote:

On 11/29/19 6:53 AM, Christophe Leroy wrote:

    CC  fs/io_uring.o
fs/io_uring.c: In function ‘loop_rw_iter’:
fs/io_uring.c:1628:21: error: implicit declaration of function ‘kmap’
[-Werror=implicit-function-declaration]
  iovec.iov_base = kmap(iter->bvec->bv_page)
   ^
fs/io_uring.c:1628:19: warning: assignment makes pointer from integer
without a cast [-Wint-conversion]
  iovec.iov_base = kmap(iter->bvec->bv_page)
 ^
fs/io_uring.c:1643:4: error: implicit declaration of function ‘kunmap’
[-Werror=implicit-function-declaration]
  kunmap(iter->bvec->bv_page);
  ^


Reverting commit 311ae9e159d8 ("io_uring: fix dead-hung for non-iter
fixed rw") clears the failure.

Most likely an #include is missing.


Huh weird how the build bots didn't catch that. Does the below work?


Yes it works, thanks.

Christophe


Re: Build failure on latest powerpc/merge (311ae9e159d8 io_uring: fix dead-hung for non-iter fixed rw)

2019-11-29 Thread Jens Axboe

On 11/29/19 6:53 AM, Christophe Leroy wrote:

CC  fs/io_uring.o
fs/io_uring.c: In function ‘loop_rw_iter’:
fs/io_uring.c:1628:21: error: implicit declaration of function ‘kmap’
[-Werror=implicit-function-declaration]
  iovec.iov_base = kmap(iter->bvec->bv_page)
   ^
fs/io_uring.c:1628:19: warning: assignment makes pointer from integer
without a cast [-Wint-conversion]
  iovec.iov_base = kmap(iter->bvec->bv_page)
 ^
fs/io_uring.c:1643:4: error: implicit declaration of function ‘kunmap’
[-Werror=implicit-function-declaration]
  kunmap(iter->bvec->bv_page);
  ^


Reverting commit 311ae9e159d8 ("io_uring: fix dead-hung for non-iter
fixed rw") clears the failure.

Most likely an #include is missing.


Huh weird how the build bots didn't catch that. Does the below work?


diff --git a/fs/io_uring.c b/fs/io_uring.c
index 2c2e8c25da01..745eb005fefe 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -69,6 +69,7 @@
 #include 
 #include 
 #include 
+#include <linux/highmem.h>
 
 #define CREATE_TRACE_POINTS

 #include 

--
Jens Axboe



Re: XFS check crash (WAS Re: [PATCH v11 1/4] kasan: support backing vmalloc space with real shadow memory)

2019-11-29 Thread Daniel Axtens
> 
> Nope, it's vm_map_ram() not being handled
 
 
 Another suspicious one. Related to kasan/vmalloc?
>>> 
>>> Very likely the same as with ion:
>>> 
>>> # git grep vm_map_ram|grep xfs
>>> fs/xfs/xfs_buf.c:* vm_map_ram() will allocate auxiliary 
>>> structures (e.g.
>>> fs/xfs/xfs_buf.c:   bp->b_addr = 
>>> vm_map_ram(bp->b_pages, bp->b_page_count,
>> 
>> Aaargh, that's an embarrassing miss.
>> 
>> It's a bit intricate because kasan_vmalloc_populate function is
>> currently set up to take a vm_struct not a vmap_area, but I'll see if I
>> can get something simple out this evening - I'm away for the first part
>> of next week.

For crashes in XFS, binder etc that implicate vm_map_ram, see:
https://lore.kernel.org/linux-mm/20191129154519.30964-1-...@axtens.net/

The easiest way I found to repro the bug is
sudo modprobe i915 mock_selftest=-1

For the lock warnings: the patch for the one that goes through the percpu
alloc path is already queued in mmots.

For Dmitry's latest one where there's an allocation in the
purge_vmap_area_lazy path that triggers a locking warning, you'll have
to wait until next week, sorry.

Regards,
Daniel


[RFC] Efficiency of the phandle_cache on ppc64/SLOF

2019-11-29 Thread Sebastian Andrzej Siewior
I've been looking at phandle_cache and noticed the following: The raw
phandle value as generated by dtc starts at zero and is incremented by
one for each phandle entry. The qemu pSeries model is using SLOF (which
is probably the same thing as used on real hardware) and there the
phandle looks like a pointer value.
With
qemu-system-ppc64le -m 16G -machine pseries -smp 8 

I got the following output:
| entries: 64
| phandle 7e732468 slot 28 hash c
| phandle 7e732ad0 slot 10 hash 27
| phandle 7e732ee8 slot 28 hash 3a
| phandle 7e734160 slot 20 hash 36
| phandle 7e734318 slot 18 hash 3a
| phandle 7e734428 slot 28 hash 33
| phandle 7e734538 slot 38 hash 2c
| phandle 7e734850 slot 10 hash e
| phandle 7e735220 slot 20 hash 2d
| phandle 7e735bf0 slot 30 hash d
| phandle 7e7365c0 slot 0 hash 2d
| phandle 7e736f90 slot 10 hash d
| phandle 7e737960 slot 20 hash 2d
| phandle 7e738330 slot 30 hash d
| phandle 7e738d00 slot 0 hash 2d
| phandle 7e739730 slot 30 hash 38
| phandle 7e73bd08 slot 8 hash 17
| phandle 7e73c2e0 slot 20 hash 32
| phandle 7e73c7f8 slot 38 hash 37
| phandle 7e782420 slot 20 hash 13
| phandle 7e782ed8 slot 18 hash 1b
| phandle 7e73ce28 slot 28 hash 39
| phandle 7e73d390 slot 10 hash 22
| phandle 7e73d9a8 slot 28 hash 1a
| phandle 7e73dc28 slot 28 hash 37
| phandle 7e73de00 slot 0 hash a
| phandle 7e73e028 slot 28 hash 0
| phandle 7e7621a8 slot 28 hash 36
| phandle 7e73e458 slot 18 hash 1e
| phandle 7e73e608 slot 8 hash 1e
| phandle 7e740078 slot 38 hash 28
| phandle 7e740180 slot 0 hash 1d
| phandle 7e740240 slot 0 hash 33
| phandle 7e740348 slot 8 hash 29
| phandle 7e740410 slot 10 hash 2
| phandle 7e740eb0 slot 30 hash 3e
| phandle 7e745390 slot 10 hash 33
| phandle 7e747b08 slot 8 hash c
| phandle 7e748528 slot 28 hash f
| phandle 7e74a6e0 slot 20 hash 18
| phandle 7e74aab0 slot 30 hash b
| phandle 7e74f788 slot 8 hash d
| Used entries: 8, hashed: 29

So the hash array has 64 entries, of which only 8 are populated. Using
hash_32() populates 29 entries.
Could someone with real hardware verify this?
I'm not sure how important this is performance-wise, but it looks like
a waste to use only 1/8 of the array.

The patch used for testing:

diff --git a/drivers/of/base.c b/drivers/of/base.c
index 1d667eb730e19..2640d4bc81a9a 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -197,6 +197,7 @@ void of_populate_phandle_cache(void)
u32 cache_entries;
struct device_node *np;
u32 phandles = 0;
+   struct device_node **cache2;
 
raw_spin_lock_irqsave(&devtree_lock, flags);
 
@@ -214,14 +215,32 @@ void of_populate_phandle_cache(void)
 
phandle_cache = kcalloc(cache_entries, sizeof(*phandle_cache),
GFP_ATOMIC);
+   cache2 = kcalloc(cache_entries, sizeof(*phandle_cache), GFP_ATOMIC);
if (!phandle_cache)
goto out;
 
+   pr_err("%s(%d) entries: %d\n", __func__, __LINE__, cache_entries);
for_each_of_allnodes(np)
if (np->phandle && np->phandle != OF_PHANDLE_ILLEGAL) {
+   int slot;
of_node_get(np);
phandle_cache[np->phandle & phandle_cache_mask] = np;
+   slot = hash_32(np->phandle, __ffs(cache_entries));
+   cache2[slot] = np;
+   pr_err("%s(%d) phandle %x slot %x hash %x\n", __func__, __LINE__,
+  np->phandle, np->phandle & phandle_cache_mask, slot);
}
+   {
+   int i, filled = 0, filled_hash = 0;
+
+   for (i = 0; i < cache_entries; i++) {
+   if (phandle_cache[i])
+   filled++;
+   if (cache2[i])
+   filled_hash++;
+   }
+   pr_err("%s(%d) Used entries: %d, hashed: %d\n", __func__,
+  __LINE__, filled, filled_hash);
+   }
 
 out:
raw_spin_unlock_irqrestore(&devtree_lock, flags);

Sebastian


XFS check crash (WAS Re: [PATCH v11 1/4] kasan: support backing vmalloc space with real shadow memory)

2019-11-29 Thread Qian Cai
eb c7
[  160.068386][  T844] RSP: 0018:c9000a4b7cb0 EFLAGS: 00010a06
[  160.068389][  T844] RAX: 192001fc RBX: c9000f80 RCX: 
c06d10ae
[  160.068391][  T844] RDX: 0003 RSI: dc00 RDI: 
c9000f800060
[  160.068393][  T844] RBP: c9000a4b7cb0 R08: ed130bee89e5 R09: 
0001
[  160.068395][  T844] R10: ed130bee89e4 R11: 88985f744f23 R12: 

[  160.068397][  T844] R13: 889724be0040 R14: 88836c8e5000 R15: 
000c8000
[  160.068399][  T844] FS:  () GS:88985f70() 
knlGS:
[  160.068401][  T844] CS:  0010 DS:  ES:  CR0: 80050033
[  160.068404][  T844] CR2: f52001fc CR3: 001f615b8004 CR4: 
003606e0
[  160.068405][  T844] DR0:  DR1:  DR2: 

[  160.068407][  T844] DR3:  DR6: fffe0ff0 DR7: 
0400
[  160.068410][  T844] Kernel panic - not syncing: Fatal exception
[  160.095178][  T844] Kernel Offset: 0x21c0 from 0x8100 
(relocation range: 0x8000-0xbfff)
[  160.541027][  T844] ---[ end Kernel panic - not syncing: Fatal exception ]---

> 
> Regards,
> Daniel
> 
>> 
>>> 
>>> BUG: unable to handle page fault for address: f52005b8
>>> #PF: supervisor read access in kernel mode
>>> #PF: error_code(0x) - not-present page
>>> PGD 7ffcd067 P4D 7ffcd067 PUD 2cd10067 PMD 66d76067 PTE 0
>>> Oops:  [#1] PREEMPT SMP KASAN
>>> CPU: 2 PID: 9211 Comm: syz-executor.2 Not tainted 5.4.0-next-20191129+ #6
>>> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
>>> rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
>>> RIP: 0010:xfs_sb_read_verify+0xe9/0x540 fs/xfs/libxfs/xfs_sb.c:691
>>> Code: fc ff df 48 c1 ea 03 80 3c 02 00 0f 85 1e 04 00 00 4d 8b ac 24
>>> 30 01 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 ea 48 c1 ea 03 <0f> b6
>>> 04 02 84 c0 74 08 3c 03 0f 8e ad 03 00 00 41 8b 45 00 bf 58
>>> RSP: 0018:c9000a58f8d0 EFLAGS: 00010a06
>>> RAX: dc00 RBX: 1920014b1f1d RCX: c9000af42000
>>> RDX: 192005b8 RSI: 82914404 RDI: 88805cdb1460
>>> RBP: c9000a58fab0 R08: 8880610cd380 R09: ed1005a87045
>>> R10: ed1005a87044 R11: 88802d438223 R12: 88805cdb1340
>>> R13: c9002dc0 R14: c9000a58fa88 R15: 888061b5c000
>>> FS:  7fb49bda9700() GS:88802d40() knlGS:
>>> CS:  0010 DS:  ES:  CR0: 80050033
>>> CR2: f52005b8 CR3: 60769006 CR4: 00760ee0
>>> DR0:  DR1:  DR2: 
>>> DR3:  DR6: fffe0ff0 DR7: 0400
>>> PKRU: 5554
>>> Call Trace:
>>> xfs_buf_ioend+0x228/0xdc0 fs/xfs/xfs_buf.c:1162
>>> __xfs_buf_submit+0x38b/0xe50 fs/xfs/xfs_buf.c:1485
>>> xfs_buf_submit fs/xfs/xfs_buf.h:268 [inline]
>>> xfs_buf_read_uncached+0x15c/0x560 fs/xfs/xfs_buf.c:897
>>> xfs_readsb+0x2d0/0x540 fs/xfs/xfs_mount.c:298
>>> xfs_fc_fill_super+0x3e6/0x11f0 fs/xfs/xfs_super.c:1415
>>> get_tree_bdev+0x444/0x620 fs/super.c:1340
>>> xfs_fc_get_tree+0x1c/0x20 fs/xfs/xfs_super.c:1550
>>> vfs_get_tree+0x8e/0x300 fs/super.c:1545
>>> do_new_mount fs/namespace.c:2822 [inline]
>>> do_mount+0x152d/0x1b50 fs/namespace.c:3142
>>> ksys_mount+0x114/0x130 fs/namespace.c:3351
>>> __do_sys_mount fs/namespace.c:3365 [inline]
>>> __se_sys_mount fs/namespace.c:3362 [inline]
>>> __x64_sys_mount+0xbe/0x150 fs/namespace.c:3362
>>> do_syscall_64+0xfa/0x780 arch/x86/entry/common.c:294
>>> entry_SYSCALL_64_after_hwframe+0x49/0xbe
>>> RIP: 0033:0x46736a
>>> Code: 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f
>>> 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d
>>> 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
>>> RSP: 002b:7fb49bda8a78 EFLAGS: 0202 ORIG_RAX: 00a5
>>> RAX: ffda RBX: 7fb49bda8af0 RCX: 0046736a
>>> RDX: 7fb49bda8ad0 RSI: 2140 RDI: 7fb49bda8af0
>>> RBP: 7fb49bda8ad0 R08: 7fb49bda8b30 R09: 7fb49bda8ad0
>>> R10:  R11: 0202 R12: 7fb49bda8b30
>>> R13: 004b1c60 R14: 004b006d R15: 7fb49bda96bc
>>> Modules linked in:
>>> Dumping ftrace buffer:
>>>   (ftrace buffer empty)
>>> CR2: f52005b8
>>> ---[ end t

Build failure on latest powerpc/merge (311ae9e159d8 io_uring: fix dead-hung for non-iter fixed rw)

2019-11-29 Thread Christophe Leroy

  CC  fs/io_uring.o
fs/io_uring.c: In function ‘loop_rw_iter’:
fs/io_uring.c:1628:21: error: implicit declaration of function ‘kmap’ 
[-Werror=implicit-function-declaration]

    iovec.iov_base = kmap(iter->bvec->bv_page)
 ^
fs/io_uring.c:1628:19: warning: assignment makes pointer from integer 
without a cast [-Wint-conversion]

    iovec.iov_base = kmap(iter->bvec->bv_page)
   ^
fs/io_uring.c:1643:4: error: implicit declaration of function ‘kunmap’ 
[-Werror=implicit-function-declaration]

    kunmap(iter->bvec->bv_page);
    ^


Reverting commit 311ae9e159d8 ("io_uring: fix dead-hung for non-iter 
fixed rw") clears the failure.


Most likely an #include is missing.


Christophe



[PATCH] powerpc/kasan: fix boot failure with RELOCATABLE && FSL_BOOKE

2019-11-29 Thread Christophe Leroy
When enabling CONFIG_RELOCATABLE and CONFIG_KASAN on FSL_BOOKE,
the kernel doesn't boot.

relocate_init() requires the KASAN early shadow area to be set up because
it needs access to the device tree through generic functions.

Call kasan_early_init() before calling relocate_init().

Reported-by: Lexi Shao 
Fixes: 2edb16efc899 ("powerpc/32: Add KASAN support")
Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/head_fsl_booke.S | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/head_fsl_booke.S 
b/arch/powerpc/kernel/head_fsl_booke.S
index 838d9d4650c7..6f7a3a7162c5 100644
--- a/arch/powerpc/kernel/head_fsl_booke.S
+++ b/arch/powerpc/kernel/head_fsl_booke.S
@@ -240,6 +240,9 @@ set_ivor:
 
bl  early_init
 
+#ifdef CONFIG_KASAN
+   bl  kasan_early_init
+#endif
 #ifdef CONFIG_RELOCATABLE
mr  r3,r30
mr  r4,r31
@@ -266,9 +269,6 @@ set_ivor:
 /*
  * Decide what sort of machine this is and initialize the MMU.
  */
-#ifdef CONFIG_KASAN
-   bl  kasan_early_init
-#endif
mr  r3,r30
mr  r4,r31
bl  machine_init
-- 
2.13.3



Re: [PATCH] powerpc/kasan: KASAN is not supported on RELOCATABLE && FSL_BOOKE

2019-11-29 Thread Christophe Leroy




On 29/11/2019 at 08:46, Christophe Leroy wrote:



On 29/11/2019 at 08:04, Lexi Shao wrote:

CONFIG_RELOCATABLE and CONFIG_KASAN cannot be enabled at the same time
on ppce500 fsl_booke: all functions called before kasan_early_init()
must have KASAN checks disabled. When CONFIG_RELOCATABLE is enabled
on ppce500 fsl_booke, relocate_init() is called before kasan_early_init(),
which triggers a KASAN check and results in a boot failure.
Call trace, with the functions that trigger KASAN checks marked (*):
   - _start
    - set_ivor
 - relocate_init(*)
  - early_get_first_memblock_info(*)
   - of_scan_flat_dt(*)
...
 - kasan_early_init

Potential solutions could be: 1. implement relocate_init() and all its
child functions in a separate file, or 2. introduce a global variable in
KASAN and only enable the KASAN checks once init is done.


Solution 1 seems difficult: of_scan_flat_dt() and its children are
general functions that can't be set aside.
Solution 2 would destroy performance, and would anyway not work with
inline instrumentation.


Have you tried moving the call to kasan_early_init() before the call of 
relocate_init() ?


I just tried it with QEMU, it works. I'll send a patch out soon.

Christophe


On other PPC32, kasan_early_init() is the first thing done after
activating the MMU. But AFAIU, the MMU is always active on BOOKE.


Christophe



Disable KASAN when RELOCATABLE is selected on fsl_booke for now until
it is supported.

Signed-off-by: Lexi Shao 
---
  arch/powerpc/Kconfig | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 3e56c9c2f16e..14f3da63c088 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -171,7 +171,7 @@ config PPC
  select HAVE_ARCH_AUDITSYSCALL
  select HAVE_ARCH_HUGE_VMAP    if PPC_BOOK3S_64 && PPC_RADIX_MMU
  select HAVE_ARCH_JUMP_LABEL
-    select HAVE_ARCH_KASAN    if PPC32
+    select HAVE_ARCH_KASAN    if PPC32 && !(RELOCATABLE && 
FSL_BOOKE)

  select HAVE_ARCH_KGDB
  select HAVE_ARCH_MMAP_RND_BITS
  select HAVE_ARCH_MMAP_RND_COMPAT_BITS    if COMPAT



Re: [PATCH v11 1/4] kasan: support backing vmalloc space with real shadow memory

2019-11-29 Thread Dmitry Vyukov
On Fri, Nov 29, 2019 at 1:29 PM Daniel Axtens  wrote:
> >>> Nope, it's vm_map_ram() not being handled
> >> Another suspicious one. Related to kasan/vmalloc?
> > Very likely the same as with ion:
> >
> > # git grep vm_map_ram|grep xfs
> > fs/xfs/xfs_buf.c:* vm_map_ram() will allocate auxiliary 
> > structures (e.g.
> > fs/xfs/xfs_buf.c:   bp->b_addr = 
> > vm_map_ram(bp->b_pages, bp->b_page_count,
>
> Aaargh, that's an embarrassing miss.
>
> It's a bit intricate because kasan_vmalloc_populate function is
> currently set up to take a vm_struct not a vmap_area, but I'll see if I
> can get something simple out this evening - I'm away for the first part
> of next week.
>
> Do you have to do anything interesting to get it to explode with xfs? Is
> it as simple as mounting a drive and doing some I/O? Or do you need to
> do something more involved?

As simple as running syzkaller :)
with this config
https://github.com/google/syzkaller/blob/master/dashboard/config/upstream-kasan.config

> Regards,
> Daniel
>
> >
> >>
> >> BUG: unable to handle page fault for address: f52005b8
> >> #PF: supervisor read access in kernel mode
> >> #PF: error_code(0x) - not-present page
> >> PGD 7ffcd067 P4D 7ffcd067 PUD 2cd10067 PMD 66d76067 PTE 0
> >> Oops:  [#1] PREEMPT SMP KASAN
> >> CPU: 2 PID: 9211 Comm: syz-executor.2 Not tainted 5.4.0-next-20191129+ #6
> >> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
> >> rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
> >> RIP: 0010:xfs_sb_read_verify+0xe9/0x540 fs/xfs/libxfs/xfs_sb.c:691
> >> Code: fc ff df 48 c1 ea 03 80 3c 02 00 0f 85 1e 04 00 00 4d 8b ac 24
> >> 30 01 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 ea 48 c1 ea 03 <0f> b6
> >> 04 02 84 c0 74 08 3c 03 0f 8e ad 03 00 00 41 8b 45 00 bf 58
> >> RSP: 0018:c9000a58f8d0 EFLAGS: 00010a06
> >> RAX: dc00 RBX: 1920014b1f1d RCX: c9000af42000
> >> RDX: 192005b8 RSI: 82914404 RDI: 88805cdb1460
> >> RBP: c9000a58fab0 R08: 8880610cd380 R09: ed1005a87045
> >> R10: ed1005a87044 R11: 88802d438223 R12: 88805cdb1340
> >> R13: c9002dc0 R14: c9000a58fa88 R15: 888061b5c000
> >> FS:  7fb49bda9700() GS:88802d40() 
> >> knlGS:
> >> CS:  0010 DS:  ES:  CR0: 80050033
> >> CR2: f52005b8 CR3: 60769006 CR4: 00760ee0
> >> DR0:  DR1:  DR2: 
> >> DR3:  DR6: fffe0ff0 DR7: 0400
> >> PKRU: 5554
> >> Call Trace:
> >>  xfs_buf_ioend+0x228/0xdc0 fs/xfs/xfs_buf.c:1162
> >>  __xfs_buf_submit+0x38b/0xe50 fs/xfs/xfs_buf.c:1485
> >>  xfs_buf_submit fs/xfs/xfs_buf.h:268 [inline]
> >>  xfs_buf_read_uncached+0x15c/0x560 fs/xfs/xfs_buf.c:897
> >>  xfs_readsb+0x2d0/0x540 fs/xfs/xfs_mount.c:298
> >>  xfs_fc_fill_super+0x3e6/0x11f0 fs/xfs/xfs_super.c:1415
> >>  get_tree_bdev+0x444/0x620 fs/super.c:1340
> >>  xfs_fc_get_tree+0x1c/0x20 fs/xfs/xfs_super.c:1550
> >>  vfs_get_tree+0x8e/0x300 fs/super.c:1545
> >>  do_new_mount fs/namespace.c:2822 [inline]
> >>  do_mount+0x152d/0x1b50 fs/namespace.c:3142
> >>  ksys_mount+0x114/0x130 fs/namespace.c:3351
> >>  __do_sys_mount fs/namespace.c:3365 [inline]
> >>  __se_sys_mount fs/namespace.c:3362 [inline]
> >>  __x64_sys_mount+0xbe/0x150 fs/namespace.c:3362
> >>  do_syscall_64+0xfa/0x780 arch/x86/entry/common.c:294
> >>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> >> RIP: 0033:0x46736a
> >> Code: 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f
> >> 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d
> >> 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
> >> RSP: 002b:7fb49bda8a78 EFLAGS: 0202 ORIG_RAX: 00a5
> >> RAX: ffda RBX: 7fb49bda8af0 RCX: 0046736a
> >> RDX: 7fb49bda8ad0 RSI: 2140 RDI: 7fb49bda8af0
> >> RBP: 7fb49bda8ad0 R08: 7fb49bda8b30 R09: 7fb49bda8ad0
> >> R10:  R11: 0202 R12: 7fb49bda8b30
> >> R13: 004b1c60 R14: 004b006d R15: 7fb49bda96bc
> >> Modules linked in:
> >> Dumping ftrace buffer:
> >>(ftrace buffer empty)
> >> CR2: f52005b8
> >> ---[ end trace eddd8949d4c898df ]---
> >

Re: [PATCH v11 1/4] kasan: support backing vmalloc space with real shadow memory

2019-11-29 Thread Daniel Axtens


>>> Nope, it's vm_map_ram() not being handled
>> 
>> 
>> Another suspicious one. Related to kasan/vmalloc?
>
> Very likely the same as with ion:
>
> # git grep vm_map_ram|grep xfs
> fs/xfs/xfs_buf.c:* vm_map_ram() will allocate auxiliary 
> structures (e.g.
> fs/xfs/xfs_buf.c:   bp->b_addr = vm_map_ram(bp->b_pages, 
> bp->b_page_count,

Aaargh, that's an embarrassing miss.

It's a bit intricate because kasan_vmalloc_populate function is
currently set up to take a vm_struct not a vmap_area, but I'll see if I
can get something simple out this evening - I'm away for the first part
of next week.

Do you have to do anything interesting to get it to explode with xfs? Is
it as simple as mounting a drive and doing some I/O? Or do you need to
do something more involved?

Regards,
Daniel

>  
>> 
>> BUG: unable to handle page fault for address: f52005b8
>> #PF: supervisor read access in kernel mode
>> #PF: error_code(0x) - not-present page
>> PGD 7ffcd067 P4D 7ffcd067 PUD 2cd10067 PMD 66d76067 PTE 0
>> Oops: 0000 [#1] PREEMPT SMP KASAN
>> CPU: 2 PID: 9211 Comm: syz-executor.2 Not tainted 5.4.0-next-20191129+ #6
>> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
>> rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
>> RIP: 0010:xfs_sb_read_verify+0xe9/0x540 fs/xfs/libxfs/xfs_sb.c:691
>> Code: fc ff df 48 c1 ea 03 80 3c 02 00 0f 85 1e 04 00 00 4d 8b ac 24
>> 30 01 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 ea 48 c1 ea 03 <0f> b6
>> 04 02 84 c0 74 08 3c 03 0f 8e ad 03 00 00 41 8b 45 00 bf 58
>> RSP: 0018:c9000a58f8d0 EFLAGS: 00010a06
>> RAX: dc00 RBX: 1920014b1f1d RCX: c9000af42000
>> RDX: 192005b8 RSI: 82914404 RDI: 88805cdb1460
>> RBP: c9000a58fab0 R08: 8880610cd380 R09: ed1005a87045
>> R10: ed1005a87044 R11: 88802d438223 R12: 88805cdb1340
>> R13: c9002dc0 R14: c9000a58fa88 R15: 888061b5c000
>> FS:  7fb49bda9700() GS:88802d40() knlGS:
>> CS:  0010 DS:  ES:  CR0: 80050033
>> CR2: f52005b8 CR3: 60769006 CR4: 00760ee0
>> DR0:  DR1:  DR2: 
>> DR3:  DR6: fffe0ff0 DR7: 0400
>> PKRU: 5554
>> Call Trace:
>>  xfs_buf_ioend+0x228/0xdc0 fs/xfs/xfs_buf.c:1162
>>  __xfs_buf_submit+0x38b/0xe50 fs/xfs/xfs_buf.c:1485
>>  xfs_buf_submit fs/xfs/xfs_buf.h:268 [inline]
>>  xfs_buf_read_uncached+0x15c/0x560 fs/xfs/xfs_buf.c:897
>>  xfs_readsb+0x2d0/0x540 fs/xfs/xfs_mount.c:298
>>  xfs_fc_fill_super+0x3e6/0x11f0 fs/xfs/xfs_super.c:1415
>>  get_tree_bdev+0x444/0x620 fs/super.c:1340
>>  xfs_fc_get_tree+0x1c/0x20 fs/xfs/xfs_super.c:1550
>>  vfs_get_tree+0x8e/0x300 fs/super.c:1545
>>  do_new_mount fs/namespace.c:2822 [inline]
>>  do_mount+0x152d/0x1b50 fs/namespace.c:3142
>>  ksys_mount+0x114/0x130 fs/namespace.c:3351
>>  __do_sys_mount fs/namespace.c:3365 [inline]
>>  __se_sys_mount fs/namespace.c:3362 [inline]
>>  __x64_sys_mount+0xbe/0x150 fs/namespace.c:3362
>>  do_syscall_64+0xfa/0x780 arch/x86/entry/common.c:294
>>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>> RIP: 0033:0x46736a
>> Code: 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f
>> 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d
>> 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
>> RSP: 002b:7fb49bda8a78 EFLAGS: 0202 ORIG_RAX: 00a5
>> RAX: ffda RBX: 7fb49bda8af0 RCX: 0046736a
>> RDX: 7fb49bda8ad0 RSI: 2140 RDI: 7fb49bda8af0
>> RBP: 7fb49bda8ad0 R08: 7fb49bda8b30 R09: 7fb49bda8ad0
>> R10:  R11: 0202 R12: 7fb49bda8b30
>> R13: 004b1c60 R14: 004b006d R15: 7fb49bda96bc
>> Modules linked in:
>> Dumping ftrace buffer:
>>(ftrace buffer empty)
>> CR2: f52005b8
>> ---[ end trace eddd8949d4c898df ]---
>> RIP: 0010:xfs_sb_read_verify+0xe9/0x540 fs/xfs/libxfs/xfs_sb.c:691
>> Code: fc ff df 48 c1 ea 03 80 3c 02 00 0f 85 1e 04 00 00 4d 8b ac 24
>> 30 01 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 ea 48 c1 ea 03 <0f> b6
>> 04 02 84 c0 74 08 3c 03 0f 8e ad 03 00 00 41 8b 45 00 bf 58
>> RSP: 0018:c9000a58f8d0 EFLAGS: 00010a06
>> RAX: dc00 RBX: 1920014b1f1d RCX: c9000af42000
>> RDX: 192005b8 RSI: 82914404 RDI: 88805cdb1460
>> RBP: c9000a58fab0 R08: 8880610cd380 R09: ed1005a87045
>>

Re: linux-next: Fixes tag needs some work in the powerpc tree

2019-11-29 Thread Stephen Rothwell
Hi all,

hmm, that subject is completely wrong, sorry.

On Fri, 29 Nov 2019 23:12:00 +1100 Stephen Rothwell wrote:
> 
> Commit
> 
>   6f090192f822 ("x86/efi: remove unused variables")
> 
> is missing a Signed-off-by from its committer.

-- 
Cheers,
Stephen Rothwell




linux-next: Fixes tag needs some work in the powerpc tree

2019-11-29 Thread Stephen Rothwell
Hi all,

Commit

  6f090192f822 ("x86/efi: remove unused variables")

is missing a Signed-off-by from its committer.

-- 
Cheers,
Stephen Rothwell




Re: [PATCH v11 1/4] kasan: support backing vmalloc space with real shadow memory

2019-11-29 Thread Daniel Axtens
Hi Dmitry,

>> I am testing this support on next-20191129 and seeing the following warnings:
>>
>> BUG: sleeping function called from invalid context at mm/page_alloc.c:4681
>> in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 44, name: kworker/1:1
>> 4 locks held by kworker/1:1/44:
>>  #0: ffff888067c26d28 ((wq_completion)events){+.+.}, at:
>> __write_once_size include/linux/compiler.h:247 [inline]
>>  #0: ffff888067c26d28 ((wq_completion)events){+.+.}, at:
>> arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
>>  #0: ffff888067c26d28 ((wq_completion)events){+.+.}, at: atomic64_set
>> include/asm-generic/atomic-instrumented.h:868 [inline]
>>  #0: ffff888067c26d28 ((wq_completion)events){+.+.}, at:
>> atomic_long_set include/asm-generic/atomic-long.h:40 [inline]
>>  #0: ffff888067c26d28 ((wq_completion)events){+.+.}, at: set_work_data
>> kernel/workqueue.c:615 [inline]
>>  #0: ffff888067c26d28 ((wq_completion)events){+.+.}, at:
>> set_work_pool_and_clear_pending kernel/workqueue.c:642 [inline]
>>  #0: ffff888067c26d28 ((wq_completion)events){+.+.}, at:
>> process_one_work+0x88b/0x1750 kernel/workqueue.c:2235
>>  #1: c92afdf0 (pcpu_balance_work){+.+.}, at:
>> process_one_work+0x8c0/0x1750 kernel/workqueue.c:2239
>>  #2: 8943f080 (pcpu_alloc_mutex){+.+.}, at:
>> pcpu_balance_workfn+0xcc/0x13e0 mm/percpu.c:1845
>>  #3: 89450c78 (vmap_area_lock){+.+.}, at: spin_lock
>> include/linux/spinlock.h:338 [inline]
>>  #3: 89450c78 (vmap_area_lock){+.+.}, at:
>> pcpu_get_vm_areas+0x1449/0x3df0 mm/vmalloc.c:3431
>> Preemption disabled at:
>> [] spin_lock include/linux/spinlock.h:338 [inline]
>> [] pcpu_get_vm_areas+0x1449/0x3df0 mm/vmalloc.c:3431
>> CPU: 1 PID: 44 Comm: kworker/1:1 Not tainted 5.4.0-next-20191129+ #5
>> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.12.0-1 04/01/2014
>> Workqueue: events pcpu_balance_workfn
>> Call Trace:
>>  __dump_stack lib/dump_stack.c:77 [inline]
>>  dump_stack+0x199/0x216 lib/dump_stack.c:118
>>  ___might_sleep.cold.97+0x1f5/0x238 kernel/sched/core.c:6800
>>  __might_sleep+0x95/0x190 kernel/sched/core.c:6753
>>  prepare_alloc_pages mm/page_alloc.c:4681 [inline]
>>  __alloc_pages_nodemask+0x3cd/0x890 mm/page_alloc.c:4730
>>  alloc_pages_current+0x10c/0x210 mm/mempolicy.c:2211
>>  alloc_pages include/linux/gfp.h:532 [inline]
>>  __get_free_pages+0xc/0x40 mm/page_alloc.c:4786
>>  kasan_populate_vmalloc_pte mm/kasan/common.c:762 [inline]
>>  kasan_populate_vmalloc_pte+0x2f/0x1b0 mm/kasan/common.c:753
>>  apply_to_pte_range mm/memory.c:2041 [inline]
>>  apply_to_pmd_range mm/memory.c:2068 [inline]
>>  apply_to_pud_range mm/memory.c:2088 [inline]
>>  apply_to_p4d_range mm/memory.c:2108 [inline]
>>  apply_to_page_range+0x5ca/0xa00 mm/memory.c:2133
>>  kasan_populate_vmalloc+0x69/0xa0 mm/kasan/common.c:791
>>  pcpu_get_vm_areas+0x1596/0x3df0 mm/vmalloc.c:3439
>>  pcpu_create_chunk+0x240/0x7f0 mm/percpu-vm.c:340
>>  pcpu_balance_workfn+0x1033/0x13e0 mm/percpu.c:1934
>>  process_one_work+0x9b5/0x1750 kernel/workqueue.c:2264
>>  worker_thread+0x8b/0xd20 kernel/workqueue.c:2410
>>  kthread+0x365/0x450 kernel/kthread.c:255
>>  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
>>
>>
>> Not sure if it's the same or not. Is it addressed by something in flight?

It looks like this one is the same.

There is a patch to fix it:
https://lore.kernel.org/linux-mm/20191120052719.7201-1-...@axtens.net/

Andrew said he had picked it up on the 22nd:
https://marc.info/?l=linux-mm-commits&m=157438241512561&w=2
It's landed in mmots but not mmotm, so hopefully that will happen and
then it will land in -next very soon!
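
Condensed from the diff quoted elsewhere in this thread, the fix is a
straightforward reordering: kasan_populate_vmalloc() allocates shadow pages
with GFP_KERNEL, which may sleep, so it must run only after vmap_area_lock
has been dropped:

	spin_lock(&vmap_area_lock);
	for (area = 0; area < nr_vms; area++)
		setup_vmalloc_vm_locked(vms[area], vas[area], VM_ALLOC,
					pcpu_get_vm_areas);
	spin_unlock(&vmap_area_lock);

	/* populate the shadow space outside of the lock */
	for (area = 0; area < nr_vms; area++)
		kasan_populate_vmalloc(sizes[area], vms[area]); /* assume success */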

I will look into your other bug report shortly.

Regards,
Daniel

>>
>> My config:
>> https://gist.githubusercontent.com/dvyukov/36c7be311fdec9cd51c649f7c3cb2ddb/raw/39c6f864fdd0ffc53f0822b14c354a73c1695fa1/gistfile1.txt
>
>
> I've tried this fix for pcpu_get_vm_areas:
> https://groups.google.com/d/msg/kasan-dev/t_F2X1MWKwk/h152Z3q2AgAJ
> and it helps. But this will break syzbot on linux-next soon.


Re: [PATCH v11 1/4] kasan: support backing vmalloc space with real shadow memory

2019-11-29 Thread Andrey Ryabinin



On 11/29/19 2:47 PM, Dmitry Vyukov wrote:
> On Fri, Nov 29, 2019 at 12:38 PM Andrey Ryabinin
>  wrote:
>>>>>
>>>>>
>>>>> Not sure if it's the same or not. Is it addressed by something in flight?
>>>>>
>>>>> My config:
>>>>> https://gist.githubusercontent.com/dvyukov/36c7be311fdec9cd51c649f7c3cb2ddb/raw/39c6f864fdd0ffc53f0822b14c354a73c1695fa1/gistfile1.txt
>>>>
>>>>
>>>> I've tried this fix for pcpu_get_vm_areas:
>>>> https://groups.google.com/d/msg/kasan-dev/t_F2X1MWKwk/h152Z3q2AgAJ
>>>> and it helps. But this will break syzbot on linux-next soon.
>>>
>>>
>>> Can this be related as well?
>>> Crashes on accesses to shadow on the ion memory...
>>
>> Nope, it's vm_map_ram() not being handled
> 
> 
> Another suspicious one. Related to kasan/vmalloc?

Very likely the same as with ion:

# git grep vm_map_ram|grep xfs
fs/xfs/xfs_buf.c: * vm_map_ram() will allocate auxiliary structures (e.g.
fs/xfs/xfs_buf.c:	bp->b_addr = vm_map_ram(bp->b_pages, bp->b_page_count,
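
The oops below is consistent with that: the faulting address is a KASAN
shadow address, i.e. the shadow backing the vm_map_ram()-mapped buffer was
never populated. For context, this is the generic KASAN translation that
produces such shadow addresses (existing kernel code, reproduced here only
as a reference):

	/* Each shadow byte covers 8 bytes of address space
	 * (KASAN_SHADOW_SCALE_SHIFT == 3). Reading the shadow of a
	 * vmalloc-space address whose shadow page was never populated
	 * faults exactly as in the oops below. */
	static inline void *kasan_mem_to_shadow(const void *addr)
	{
		return (void *)((unsigned long)addr >> KASAN_SHADOW_SCALE_SHIFT)
			+ KASAN_SHADOW_OFFSET;
	}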
 
> 
> BUG: unable to handle page fault for address: f52005b8
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x) - not-present page
> PGD 7ffcd067 P4D 7ffcd067 PUD 2cd10067 PMD 66d76067 PTE 0
> Oops: 0000 [#1] PREEMPT SMP KASAN
> CPU: 2 PID: 9211 Comm: syz-executor.2 Not tainted 5.4.0-next-20191129+ #6
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
> rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
> RIP: 0010:xfs_sb_read_verify+0xe9/0x540 fs/xfs/libxfs/xfs_sb.c:691
> Code: fc ff df 48 c1 ea 03 80 3c 02 00 0f 85 1e 04 00 00 4d 8b ac 24
> 30 01 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 ea 48 c1 ea 03 <0f> b6
> 04 02 84 c0 74 08 3c 03 0f 8e ad 03 00 00 41 8b 45 00 bf 58
> RSP: 0018:c9000a58f8d0 EFLAGS: 00010a06
> RAX: dc00 RBX: 1920014b1f1d RCX: c9000af42000
> RDX: 192005b8 RSI: 82914404 RDI: 88805cdb1460
> RBP: c9000a58fab0 R08: 8880610cd380 R09: ed1005a87045
> R10: ed1005a87044 R11: 88802d438223 R12: 88805cdb1340
> R13: c9002dc0 R14: c9000a58fa88 R15: 888061b5c000
> FS:  7fb49bda9700() GS:88802d40() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: f52005b8 CR3: 60769006 CR4: 00760ee0
> DR0:  DR1:  DR2: 
> DR3:  DR6: fffe0ff0 DR7: 0400
> PKRU: 5554
> Call Trace:
>  xfs_buf_ioend+0x228/0xdc0 fs/xfs/xfs_buf.c:1162
>  __xfs_buf_submit+0x38b/0xe50 fs/xfs/xfs_buf.c:1485
>  xfs_buf_submit fs/xfs/xfs_buf.h:268 [inline]
>  xfs_buf_read_uncached+0x15c/0x560 fs/xfs/xfs_buf.c:897
>  xfs_readsb+0x2d0/0x540 fs/xfs/xfs_mount.c:298
>  xfs_fc_fill_super+0x3e6/0x11f0 fs/xfs/xfs_super.c:1415
>  get_tree_bdev+0x444/0x620 fs/super.c:1340
>  xfs_fc_get_tree+0x1c/0x20 fs/xfs/xfs_super.c:1550
>  vfs_get_tree+0x8e/0x300 fs/super.c:1545
>  do_new_mount fs/namespace.c:2822 [inline]
>  do_mount+0x152d/0x1b50 fs/namespace.c:3142
>  ksys_mount+0x114/0x130 fs/namespace.c:3351
>  __do_sys_mount fs/namespace.c:3365 [inline]
>  __se_sys_mount fs/namespace.c:3362 [inline]
>  __x64_sys_mount+0xbe/0x150 fs/namespace.c:3362
>  do_syscall_64+0xfa/0x780 arch/x86/entry/common.c:294
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> RIP: 0033:0x46736a
> Code: 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f
> 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d
> 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
> RSP: 002b:7fb49bda8a78 EFLAGS: 0202 ORIG_RAX: 00a5
> RAX: ffda RBX: 7fb49bda8af0 RCX: 0046736a
> RDX: 7fb49bda8ad0 RSI: 2140 RDI: 7fb49bda8af0
> RBP: 7fb49bda8ad0 R08: 7fb49bda8b30 R09: 7fb49bda8ad0
> R10:  R11: 0202 R12: 7fb49bda8b30
> R13: 004b1c60 R14: 004b006d R15: 7fb49bda96bc
> Modules linked in:
> Dumping ftrace buffer:
>(ftrace buffer empty)
> CR2: f52005b8
> ---[ end trace eddd8949d4c898df ]---
> RIP: 0010:xfs_sb_read_verify+0xe9/0x540 fs/xfs/libxfs/xfs_sb.c:691
> Code: fc ff df 48 c1 ea 03 80 3c 02 00 0f 85 1e 04 00 00 4d 8b ac 24
> 30 01 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 ea 48 c1 ea 03 <0f> b6
> 04 02 84 c0 74 08 3c 03 0f 8e ad 03 00 00 41 8b 45 00 bf 58
> RSP: 0018:c9000a58f8d0 EFLAGS: 00010a06
> RAX: dc00 RBX: 1920014b1f1d RCX: c9000af42000
> RDX: 192005b8 RSI: 82914404 RDI: 88805cdb

Re: [PATCH v11 1/4] kasan: support backing vmalloc space with real shadow memory

2019-11-29 Thread Andrey Ryabinin



On 11/29/19 2:02 PM, Dmitry Vyukov wrote:
> On Fri, Nov 29, 2019 at 11:58 AM Dmitry Vyukov  wrote:
>>
>> On Fri, Nov 29, 2019 at 11:43 AM Dmitry Vyukov  wrote:
>>>
>>> On Tue, Nov 19, 2019 at 10:54 AM Andrey Ryabinin
>>>  wrote:
>>>> On 11/18/19 6:29 AM, Daniel Axtens wrote:
>>>>> Qian Cai  writes:
>>>>>
>>>>>> On Thu, 2019-10-31 at 20:39 +1100, Daniel Axtens wrote:
>>>>>>> /*
>>>>>>>  * In this function, newly allocated vm_struct has VM_UNINITIALIZED
>>>>>>>  * flag. It means that vm_struct is not fully initialized.
>>>>>>> @@ -3377,6 +3411,9 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets,
>>>>>>>
>>>>>>> setup_vmalloc_vm_locked(vms[area], vas[area], VM_ALLOC,
>>>>>>>  pcpu_get_vm_areas);
>>>>>>> +
>>>>>>> +   /* assume success here */
>>>>>>> +   kasan_populate_vmalloc(sizes[area], vms[area]);
>>>>>>> }
>>>>>>> spin_unlock(&vmap_area_lock);
>>>>>>
>>>>>> Here it is all wrong. GFP_KERNEL with in_atomic().
>>>>>
>>>>> I think this fix will work, I will do a v12 with it included.
>>>>
>>>> You can send just the fix. Andrew will fold it into the original patch 
>>>> before sending it to Linus.
>>>>
>>>>
>>>>
>>>>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
>>>>> index a4b950a02d0b..bf030516258c 100644
>>>>> --- a/mm/vmalloc.c
>>>>> +++ b/mm/vmalloc.c
>>>>> @@ -3417,11 +3417,14 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets,
>>>>>
>>>>> setup_vmalloc_vm_locked(vms[area], vas[area], VM_ALLOC,
>>>>>  pcpu_get_vm_areas);
>>>>> +   }
>>>>> +   spin_unlock(&vmap_area_lock);
>>>>>
>>>>> +   /* populate the shadow space outside of the lock */
>>>>> +   for (area = 0; area < nr_vms; area++) {
>>>>> /* assume success here */
>>>>> kasan_populate_vmalloc(sizes[area], vms[area]);
>>>>> }
>>>>> -   spin_unlock(&vmap_area_lock);
>>>>>
>>>>> kfree(vas);
>>>>> return vms;
>>>
>>> Hi,
>>>
>>> I am testing this support on next-20191129 and seeing the following 
>>> warnings:
>>>
>>> BUG: sleeping function called from invalid context at mm/page_alloc.c:4681
>>> in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 44, name: kworker/1:1
>>> 4 locks held by kworker/1:1/44:
>>>  #0: ffff888067c26d28 ((wq_completion)events){+.+.}, at:
>>> __write_once_size include/linux/compiler.h:247 [inline]
>>>  #0: ffff888067c26d28 ((wq_completion)events){+.+.}, at:
>>> arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
>>>  #0: ffff888067c26d28 ((wq_completion)events){+.+.}, at: atomic64_set
>>> include/asm-generic/atomic-instrumented.h:868 [inline]
>>>  #0: ffff888067c26d28 ((wq_completion)events){+.+.}, at:
>>> atomic_long_set include/asm-generic/atomic-long.h:40 [inline]
>>>  #0: ffff888067c26d28 ((wq_completion)events){+.+.}, at: set_work_data
>>> kernel/workqueue.c:615 [inline]
>>>  #0: ffff888067c26d28 ((wq_completion)events){+.+.}, at:
>>> set_work_pool_and_clear_pending kernel/workqueue.c:642 [inline]
>>>  #0: ffff888067c26d28 ((wq_completion)events){+.+.}, at:
>>> process_one_work+0x88b/0x1750 kernel/workqueue.c:2235
>>>  #1: c92afdf0 (pcpu_balance_work){+.+.}, at:
>>> process_one_work+0x8c0/0x1750 kernel/workqueue.c:2239
>>>  #2: 8943f080 (pcpu_alloc_mutex){+.+.}, at:
>>> pcpu_balance_workfn+0xcc/0x13e0 mm/percpu.c:1845
>>>  #3: 89450c78 (vmap_area_lock){+.+.}, at: spin_lock
>>> include/linux/spinlock.h:338 [inline]
>>>  #3: 89450c78 (vmap_area_lock){+.+.}, at:
>>> pcpu_get_vm_areas+0x1449/0x3df0 mm/vmalloc.c:3431
>>> Preemption disabled at:
>>> [] spin_lock include/linux/spinlock.h:338 [inline]
>>> [] pcpu_get_vm_areas+0x1449/0x3df0 mm/vmalloc.c:3431
>>> CPU: 1 PID: 44 Comm: kworker/1:1 Not tainted 5.4.

Re: [PATCH v2 17/19] powerpc: book3s64: convert to pin_user_pages() and put_user_page()

2019-11-29 Thread Jan Kara
On Mon 25-11-19 15:10:33, John Hubbard wrote:
> 1. Convert from get_user_pages() to pin_user_pages().
> 
> 2. As required by pin_user_pages(), release these pages via
> put_user_page(). In this case, do so via put_user_pages_dirty_lock().
> 
> That has the side effect of calling set_page_dirty_lock(), instead
> of set_page_dirty(). This is probably more accurate.

Maybe more accurate, but it doesn't work for mm_iommu_unpin(). As far as I
can see, mm_iommu_unpin() gets called from an RCU callback, which executes
in interrupt context, and you cannot lock pages from such a context. So you
need to queue work from the RCU callback and then do the real work from a
workqueue...
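
A sketch of that shape, with a hypothetical work_struct field name (the
actual layout of mm_iommu_table_group_mem_t in iommu_api.c differs):

	static void mm_iommu_unpin_work(struct work_struct *work)
	{
		struct mm_iommu_table_group_mem_t *mem = container_of(work,
				struct mm_iommu_table_group_mem_t, unpin_work);

		mm_iommu_unpin(mem);	/* process context: may lock pages */
		vfree(mem->hpas);
		kfree(mem);
	}

	static void mm_iommu_free_rcu(struct rcu_head *head)
	{
		struct mm_iommu_table_group_mem_t *mem = container_of(head,
				struct mm_iommu_table_group_mem_t, rcu);

		/* interrupt context: just hand off the real work */
		schedule_work(&mem->unpin_work);
	}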

Honza

> 
> As Christoph Hellwig put it, "set_page_dirty() is only safe if we are
> dealing with a file backed page where we have reference on the inode it
> hangs off." [1]
> 
> [1] https://lore.kernel.org/r/20190723153640.gb...@lst.de
> 
> Cc: Jan Kara 
> Signed-off-by: John Hubbard 
> ---
>  arch/powerpc/mm/book3s64/iommu_api.c | 12 +---
>  1 file changed, 5 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/powerpc/mm/book3s64/iommu_api.c b/arch/powerpc/mm/book3s64/iommu_api.c
> index 56cc84520577..fc1670a6fc3c 100644
> --- a/arch/powerpc/mm/book3s64/iommu_api.c
> +++ b/arch/powerpc/mm/book3s64/iommu_api.c
> @@ -103,7 +103,7 @@ static long mm_iommu_do_alloc(struct mm_struct *mm, unsigned long ua,
>   for (entry = 0; entry < entries; entry += chunk) {
>   unsigned long n = min(entries - entry, chunk);
>  
> - ret = get_user_pages(ua + (entry << PAGE_SHIFT), n,
> + ret = pin_user_pages(ua + (entry << PAGE_SHIFT), n,
>   FOLL_WRITE | FOLL_LONGTERM,
>   mem->hpages + entry, NULL);
>   if (ret == n) {
> @@ -167,9 +167,8 @@ static long mm_iommu_do_alloc(struct mm_struct *mm, unsigned long ua,
>   return 0;
>  
>  free_exit:
> - /* free the reference taken */
> - for (i = 0; i < pinned; i++)
> - put_page(mem->hpages[i]);
> + /* free the references taken */
> + put_user_pages(mem->hpages, pinned);
>  
>   vfree(mem->hpas);
>   kfree(mem);
> @@ -212,10 +211,9 @@ static void mm_iommu_unpin(struct mm_iommu_table_group_mem_t *mem)
>   if (!page)
>   continue;
>  
> - if (mem->hpas[i] & MM_IOMMU_TABLE_GROUP_PAGE_DIRTY)
> - SetPageDirty(page);
> + put_user_pages_dirty_lock(&page, 1,
> + mem->hpas[i] & MM_IOMMU_TABLE_GROUP_PAGE_DIRTY);
>  
> - put_page(page);
>   mem->hpas[i] = 0;
>   }
>  }
> -- 
> 2.24.0
> 
-- 
Jan Kara 
SUSE Labs, CR


Re: [PATCH v11 1/4] kasan: support backing vmalloc space with real shadow memory

2019-11-29 Thread Dmitry Vyukov
On Fri, Nov 29, 2019 at 11:43 AM Dmitry Vyukov  wrote:
>
> On Tue, Nov 19, 2019 at 10:54 AM Andrey Ryabinin
>  wrote:
> > On 11/18/19 6:29 AM, Daniel Axtens wrote:
> > > Qian Cai  writes:
> > >
> > >> On Thu, 2019-10-31 at 20:39 +1100, Daniel Axtens wrote:
> > >>> /*
> > >>>  * In this function, newly allocated vm_struct has VM_UNINITIALIZED
> > >>>  * flag. It means that vm_struct is not fully initialized.
> > >>> @@ -3377,6 +3411,9 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets,
> > >>>
> > >>> setup_vmalloc_vm_locked(vms[area], vas[area], VM_ALLOC,
> > >>>  pcpu_get_vm_areas);
> > >>> +
> > >>> +   /* assume success here */
> > >>> +   kasan_populate_vmalloc(sizes[area], vms[area]);
> > >>> }
> > >>> spin_unlock(&vmap_area_lock);
> > >>
> > >> Here it is all wrong. GFP_KERNEL with in_atomic().
> > >
> > > I think this fix will work, I will do a v12 with it included.
> >
> > You can send just the fix. Andrew will fold it into the original patch 
> > before sending it to Linus.
> >
> >
> >
> > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > > index a4b950a02d0b..bf030516258c 100644
> > > --- a/mm/vmalloc.c
> > > +++ b/mm/vmalloc.c
> > > @@ -3417,11 +3417,14 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets,
> > >
> > > setup_vmalloc_vm_locked(vms[area], vas[area], VM_ALLOC,
> > >  pcpu_get_vm_areas);
> > > +   }
> > > +   spin_unlock(&vmap_area_lock);
> > >
> > > +   /* populate the shadow space outside of the lock */
> > > +   for (area = 0; area < nr_vms; area++) {
> > > /* assume success here */
> > > kasan_populate_vmalloc(sizes[area], vms[area]);
> > > }
> > > -   spin_unlock(&vmap_area_lock);
> > >
> > > kfree(vas);
> > > return vms;
>
> Hi,
>
> I am testing this support on next-20191129 and seeing the following warnings:
>
> BUG: sleeping function called from invalid context at mm/page_alloc.c:4681
> in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 44, name: kworker/1:1
> 4 locks held by kworker/1:1/44:
>  #0: ffff888067c26d28 ((wq_completion)events){+.+.}, at:
> __write_once_size include/linux/compiler.h:247 [inline]
>  #0: ffff888067c26d28 ((wq_completion)events){+.+.}, at:
> arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
>  #0: ffff888067c26d28 ((wq_completion)events){+.+.}, at: atomic64_set
> include/asm-generic/atomic-instrumented.h:868 [inline]
>  #0: ffff888067c26d28 ((wq_completion)events){+.+.}, at:
> atomic_long_set include/asm-generic/atomic-long.h:40 [inline]
>  #0: ffff888067c26d28 ((wq_completion)events){+.+.}, at: set_work_data
> kernel/workqueue.c:615 [inline]
>  #0: ffff888067c26d28 ((wq_completion)events){+.+.}, at:
> set_work_pool_and_clear_pending kernel/workqueue.c:642 [inline]
>  #0: ffff888067c26d28 ((wq_completion)events){+.+.}, at:
> process_one_work+0x88b/0x1750 kernel/workqueue.c:2235
>  #1: c92afdf0 (pcpu_balance_work){+.+.}, at:
> process_one_work+0x8c0/0x1750 kernel/workqueue.c:2239
>  #2: 8943f080 (pcpu_alloc_mutex){+.+.}, at:
> pcpu_balance_workfn+0xcc/0x13e0 mm/percpu.c:1845
>  #3: 89450c78 (vmap_area_lock){+.+.}, at: spin_lock
> include/linux/spinlock.h:338 [inline]
>  #3: 89450c78 (vmap_area_lock){+.+.}, at:
> pcpu_get_vm_areas+0x1449/0x3df0 mm/vmalloc.c:3431
> Preemption disabled at:
> [] spin_lock include/linux/spinlock.h:338 [inline]
> [] pcpu_get_vm_areas+0x1449/0x3df0 mm/vmalloc.c:3431
> CPU: 1 PID: 44 Comm: kworker/1:1 Not tainted 5.4.0-next-20191129+ #5
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.12.0-1 04/01/2014
> Workqueue: events pcpu_balance_workfn
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x199/0x216 lib/dump_stack.c:118
>  ___might_sleep.cold.97+0x1f5/0x238 kernel/sched/core.c:6800
>  __might_sleep+0x95/0x190 kernel/sched/core.c:6753
>  prepare_alloc_pages mm/page_alloc.c:4681 [inline]
>  __alloc_pages_nodemask+0x3cd/0x890 mm/page_alloc.c:4730
>  alloc_pages_current+0x10c/0x210 mm/mempolicy.c:2211
>  alloc_pages include/linux/gfp.h:532 [inline]
>  __get_free_pages+0xc/0x40 mm/page_alloc.c:4786
>  kasan_populate_vmalloc_pte mm/kasan/common.c:762 [inline]
>  kasan

Re: [PATCH v5 0/5] Append new variables to vmcoreinfo (TCR_EL1.T1SZ for arm64 and MAX_PHYSMEM_BITS for all archs)

2019-11-29 Thread Will Deacon
On Fri, Nov 29, 2019 at 01:53:36AM +0530, Bhupesh Sharma wrote:
> Changes since v4:
> 
> - v4 can be seen here:
>   http://lists.infradead.org/pipermail/kexec/2019-November/023961.html
> - Addressed comments from Dave and added patches for documenting
>   new variables appended to vmcoreinfo documentation.
> - Added testing report shared by Akashi for PATCH 2/5.

Please can you fix your mail setup? The last two times you've sent this
series it seems to get split into two threads, which is really hard to
track in my inbox:

First thread:

https://lore.kernel.org/lkml/1574972621-25750-1-git-send-email-bhsha...@redhat.com/

Second thread:

https://lore.kernel.org/lkml/1574972716-25858-1-git-send-email-bhsha...@redhat.com/

Thanks,

Will


[PATCH] powerpc/kasan: KASAN is not supported on RELOCATABLE && FSL_BOOKE

2019-11-29 Thread Lexi Shao
CONFIG_RELOCATABLE and CONFIG_KASAN cannot be enabled at the same time
on ppce500 fsl_booke. Any function called before kasan_early_init()
must not trigger KASAN checks. When CONFIG_RELOCATABLE is enabled
on ppce500 fsl_booke, relocate_init() is called before kasan_early_init(),
which triggers a KASAN check and results in a boot failure.
Call trace, with the functions that trigger KASAN checks marked (*):
  - _start
    - set_ivor
      - relocate_init(*)
        - early_get_first_memblock_info(*)
          - of_scan_flat_dt(*)
            ...
    - kasan_early_init

Potential solutions could be to 1. implement relocate_init() and all its
child functions in a separate file, or 2. introduce a global variable in
KASAN and only enable KASAN checks once init is done (see the sketch
below).
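
Option 2 might look roughly like the sketch below. This is illustrative
only, not a posted patch; check_memory_region_inline() is the generic
KASAN check helper, and the flag name is made up:

	/* Gate KASAN checks on a flag that is set once the early shadow
	 * is usable, so code running before kasan_early_init(), such as
	 * relocate_init() on RELOCATABLE fsl_booke, skips the checks. */
	static bool kasan_init_done;

	void __init kasan_early_init(void)
	{
		/* ... set up the early shadow mappings ... */
		kasan_init_done = true;
	}

	static __always_inline bool check_memory_region_inline(unsigned long addr,
					size_t size, bool write,
					unsigned long ret_ip)
	{
		if (!kasan_init_done)
			return true;	/* too early: skip the check */
		/* ... normal shadow memory checks ... */
		return true;
	}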

Disable KASAN when RELOCATABLE is selected on fsl_booke for now until
it is supported.

Signed-off-by: Lexi Shao 
---
 arch/powerpc/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 3e56c9c2f16e..14f3da63c088 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -171,7 +171,7 @@ config PPC
select HAVE_ARCH_AUDITSYSCALL
	select HAVE_ARCH_HUGE_VMAP  if PPC_BOOK3S_64 && PPC_RADIX_MMU
select HAVE_ARCH_JUMP_LABEL
-   select HAVE_ARCH_KASAN  if PPC32
+   select HAVE_ARCH_KASAN  if PPC32 && !(RELOCATABLE && FSL_BOOKE)
select HAVE_ARCH_KGDB
select HAVE_ARCH_MMAP_RND_BITS
select HAVE_ARCH_MMAP_RND_COMPAT_BITS   if COMPAT
-- 
2.12.3