Re: [PATCH] x86/PAT: priority the PAT warn to error to highlight the developer

2019-10-01 Thread Peter Zijlstra
On Tue, Oct 01, 2019 at 05:00:25AM +, Zhang, Jun wrote:
> Please see my comments.
> 

Please use a normal MUA that can quote text you're replying to. This is
unreadable garbage.


RE: [PATCH] x86/PAT: priority the PAT warn to error to highlight the developer

2019-09-30 Thread Zhang, Jun
Please see my comments.

Thanks,
Jun

-Original Message-
From: Borislav Petkov  
Sent: Monday, September 30, 2019 8:03 PM
To: Zhang, Jun 
Cc: dave.han...@linux.intel.com; l...@kernel.org; pet...@infradead.org; 
t...@linutronix.de; mi...@redhat.com; h...@zytor.com; He, Bo ; 
x...@kernel.org; linux-kernel@vger.kernel.org
Subject: Re: [PATCH] x86/PAT: priority the PAT warn to error to highlight the 
developer

On Sun, Sep 29, 2019 at 03:20:31PM +0800, jun.zh...@intel.com wrote:
> From: zhang jun 
> 
> Documentation/x86/pat.txt says:
> set_memory_uc() or set_memory_wc() must use together with 
> set_memory_wb()

I had to open that file to see what it actually says - btw, the filename is 
pat.rst now - and you're very heavily paraphrasing what is there. So try again 
explaining what the requirement is.
[ZJ] next parts come from pat.txt in kernel version 4.19
Drivers wanting to export some pages to userspace do it by using mmap
interface and a combination of
1) pgprot_noncached()
2) io_remap_pfn_range() or remap_pfn_range() or vm_insert_pfn()

With PAT support, a new API pgprot_writecombine is being added. So, drivers can
continue to use the above sequence, with either pgprot_noncached() or
pgprot_writecombine() in step 1, followed by step 2.

In addition, step 2 internally tracks the region as UC or WC in memtype
list in order to ensure no conflicting mapping.

Note that this set of APIs only works with IO (non RAM) regions. If driver
wants to export a RAM region, it has to do set_memory_uc() or set_memory_wc()
as step 0 above and also track the usage of those pages and use set_memory_wb()
before the page is freed to free pool.

> if break the PAT attribute, there are tons of warning like:
> [   45.846872] x86/PAT: NDK MediaCodec_:3753 map pfn RAM range req

That's some android NDK thing, it seems: "The Android NDK is a toolset that 
lets you implement parts of your app in native code,... " lemme guess, they 
have a kernel module?
[ZJ] no, "NDK MediaCodec_" is an android codec2.0 process. It want to use WC 
memory.

> write-combining for [mem 0x1e7a8-0x1e7a87fff], got write-back and 
> in the extremely case, we see kernel panic unexpected like:
> list_del corruption. prev->next should be 88806dbe69c0, but was 
> 888036f048c0

This is not really helpful. You need to explain what exactly you're doing - not 
shortening the error messages.
[ZJ] android codec2.0 want to use WC memory. Which use ion to allocate memory. 
So, we enable drivers/staging/android/ion, which work well except X86, x86 need 
to set_memory_wc().
So there are tons of warning, then list_del corruption. I use this 
patch(https://www.lkml.org/lkml/2019/9/29/25), list crash disappear.
Next is error message.
<4>[49967.389732] x86/PAT: NDK MediaCodec_:10602 map pfn RAM range req 
write-combining for [mem 0x1091f8000-0x1091f8fff], got write-back
<4>[49967.389747] x86/PAT: .vorbis.decoder:10606 map pfn RAM range req 
write-combining for [mem 0x1091f4000-0x1091f7fff], got write-back
<4>[49967.390622] x86/PAT: .vorbis.decoder:10606 map pfn RAM range req 
write-combining for [mem 0x10909-0x109090fff], got write-back
<4>[49967.390687] x86/PAT: .vorbis.decoder:10606 map pfn RAM range req 
write-combining for [mem 0x1091f8000-0x1091fbfff], got write-back
<4>[49967.390855] x86/PAT: NDK MediaCodec_:10602 map pfn RAM range req 
write-combining for [mem 0x1091f4000-0x1091f4fff], got write-back
<4>[49967.391405] x86/PAT: .vorbis.decoder:10606 map pfn RAM range req 
write-combining for [mem 0x109098000-0x109098fff], got write-back
<4>[49967.391454] x86/PAT: NDK MediaCodec_:10602 map pfn RAM range req 
write-combining for [mem 0x1091f8000-0x1091f8fff], got write-back
<4>[49967.391474] x86/PAT: .vorbis.decoder:10606 map pfn RAM range req 
write-combining for [mem 0x1091f4000-0x1091f7fff], got write-back
<4>[49967.392641] x86/PAT: .vorbis.decoder:10606 map pfn RAM range req 
write-combining for [mem 0x14eb68000-0x14eb68fff], got write-back
<4>[49967.392708] x86/PAT: .vorbis.decoder:10606 map pfn RAM range req 
write-combining for [mem 0x1091f8000-0x1091fbfff], got write-back
<4>[49967.393001] x86/PAT: NDK MediaCodec_:10602 map pfn RAM range req 
write-combining for [mem 0x1091f4000-0x1091f4fff], got write-back
<4>[49967.394066] x86/PAT: NDK MediaCodec_:10602 map pfn RAM range req 
write-combining for [mem 0x1091f8000-0x1091f8fff], got write-back
<6>[50045.677129] binder: 3390:3390 transaction failed 29189/-22, size 88-0 
line 3131
<3>[50046.153621] list_del corruption. prev->next should be 89598004c960, 
but was 895ad46e4590
<4>[50046.163464] invalid opcode:  [#1] PREEMPT SMP NOPTI
<4>[50046.169297] CPU: 1 PID: 18792 Comm: Binder:3390_1B Tainted: G U O 
 4.19.68-PKT-190905T163945Z-00031-g9de920e66b4e #1
<4>[50046.182213] RIP: 0010:__list_del_entr

Re: [PATCH] x86/PAT: priority the PAT warn to error to highlight the developer

2019-09-30 Thread Borislav Petkov
On Sun, Sep 29, 2019 at 03:20:31PM +0800, jun.zh...@intel.com wrote:
> From: zhang jun 
> 
> Documentation/x86/pat.txt says:
> set_memory_uc() or set_memory_wc() must use together with set_memory_wb()

I had to open that file to see what it actually says - btw, the filename
is pat.rst now - and you're very heavily paraphrasing what is there. So
try again explaining what the requirement is.

> if break the PAT attribute, there are tons of warning like:
> [   45.846872] x86/PAT: NDK MediaCodec_:3753 map pfn RAM range req

That's some android NDK thing, it seems: "The Android NDK is a toolset
that lets you implement parts of your app in native code,... " lemme
guess, they have a kernel module?

> write-combining for [mem 0x1e7a8-0x1e7a87fff], got write-back
> and in the extremely case, we see kernel panic unexpected like:
> list_del corruption. prev->next should be 88806dbe69c0,
> but was 888036f048c0

This is not really helpful. You need to explain what exactly you're
doing - not shortening the error messages.

> so it's better to priority the warn to error to highlight to
> remind the developer.

Whut?

>From reading what is trying hard to be a sentence, I can only guess what
you're trying to say here. And it doesn't make it clear why is pr_warn()
not enough and it has to be pr_err().

> Signed-off-by: zhang jun 
> Signed-off-by: he, bo 

And this SOB chain is wrong.

> ---
>  arch/x86/mm/pat.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
> index d9fbd4f69920..43a4dfdcedc8 100644
> --- a/arch/x86/mm/pat.c
> +++ b/arch/x86/mm/pat.c
> @@ -897,7 +897,7 @@ static int reserve_pfn_range(u64 paddr, unsigned long 
> size, pgprot_t *vma_prot,
>  
>   pcm = lookup_memtype(paddr);
>   if (want_pcm != pcm) {
> - pr_warn("x86/PAT: %s:%d map pfn RAM range req %s for 
> [mem %#010Lx-%#010Lx], got %s\n",
> + pr_err("x86/PAT: %s:%d map pfn RAM range req %s for 
> [mem %#010Lx-%#010Lx], got %s!!!\n",

Three "!!!" would make this more urgent, huh?

How about you make the error message more informative and user-friendly,
instead?

Thx.

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette