[PATCH v2 5/5] powerpc/vdso: Declare vdso_patches[] as __initdata

2020-08-27 Thread Christophe Leroy
vdso_patches[] table is used only at init time.

Mark it __initdata.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/vdso.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c
index 4ad042995ccc..dfa08a7b4e7c 100644
--- a/arch/powerpc/kernel/vdso.c
+++ b/arch/powerpc/kernel/vdso.c
@@ -76,7 +76,7 @@ struct vdso_patch_def
  * Currently, we only change sync_dicache to do nothing on processors
  * with a coherent icache
  */
-static struct vdso_patch_def vdso_patches[] = {
+static struct vdso_patch_def vdso_patches[] __initdata = {
{
CPU_FTR_COHERENT_ICACHE, CPU_FTR_COHERENT_ICACHE,
"__kernel_sync_dicache", "__kernel_sync_dicache_p5"
-- 
2.25.0



[PATCH v2 2/5] powerpc/vdso: Don't rely on vdso_pages being 0 for failure

2020-08-27 Thread Christophe Leroy
If vdso initialisation failed, vdso_ready is not set.
Otherwise, vdso_pages is only 0 when it is a 32 bits task
and CONFIG_VDSO32 is not selected.

As arch_setup_additional_pages() now bails out directly in
that case, we don't need to set vdso_pages to 0.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/vdso.c | 23 ++-
 1 file changed, 6 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c
index 3ef3fc546ac8..8f245e988a8a 100644
--- a/arch/powerpc/kernel/vdso.c
+++ b/arch/powerpc/kernel/vdso.c
@@ -176,11 +176,6 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, 
int uses_interp)
 
current->mm->context.vdso_base = 0;
 
-   /* vDSO has a problem and was disabled, just don't "enable" it for the
-* process
-*/
-   if (vdso_pages == 0)
-   return 0;
/* Add a page to the vdso size for the data page */
vdso_pages ++;
 
@@ -710,14 +705,16 @@ static int __init vdso_init(void)
 * Initialize the vDSO images in memory, that is do necessary
 * fixups of vDSO symbols, locate trampolines, etc...
 */
-   if (vdso_setup())
-   goto setup_failed;
+   if (vdso_setup()) {
+   pr_err("vDSO setup failure, not enabled !\n");
+   return 0;
+   }
 
if (IS_ENABLED(CONFIG_VDSO32)) {
/* Make sure pages are in the correct state */
pagelist = kcalloc(vdso32_pages + 1, sizeof(struct page *), 
GFP_KERNEL);
if (!pagelist)
-   goto alloc_failed;
+   return 0;
 
pagelist[0] = virt_to_page(vdso_data);
 
@@ -730,7 +727,7 @@ static int __init vdso_init(void)
if (IS_ENABLED(CONFIG_PPC64)) {
pagelist = kcalloc(vdso64_pages + 1, sizeof(struct page *), 
GFP_KERNEL);
if (!pagelist)
-   goto alloc_failed;
+   return 0;
 
pagelist[0] = virt_to_page(vdso_data);
 
@@ -743,14 +740,6 @@ static int __init vdso_init(void)
smp_wmb();
vdso_ready = 1;
 
-   return 0;
-
-setup_failed:
-   pr_err("vDSO setup failure, not enabled !\n");
-alloc_failed:
-   vdso32_pages = 0;
-   vdso64_pages = 0;
-
return 0;
 }
 arch_initcall(vdso_init);
-- 
2.25.0



[PATCH v2 4/5] powerpc/vdso: Declare constant vars as __ro_after_init

2020-08-27 Thread Christophe Leroy
To avoid any risk of modification of vital VDSO variables,
declare them __ro_after_init.

vdso32_kbase and vdso64_kbase could be made 'const', but it would
have high impact on all functions using them as the compiler doesn't
expect const property to be discarded.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/vdso.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c
index fb393266b9cb..4ad042995ccc 100644
--- a/arch/powerpc/kernel/vdso.c
+++ b/arch/powerpc/kernel/vdso.c
@@ -38,19 +38,19 @@
 #define VDSO_ALIGNMENT (1 << 16)
 
 extern char vdso32_start, vdso32_end;
-static unsigned int vdso32_pages;
-static void *vdso32_kbase = &vdso32_start;
-unsigned long vdso32_sigtramp;
-unsigned long vdso32_rt_sigtramp;
+static unsigned int vdso32_pages __ro_after_init;
+static void *vdso32_kbase __ro_after_init = &vdso32_start;
+unsigned long vdso32_sigtramp __ro_after_init;
+unsigned long vdso32_rt_sigtramp __ro_after_init;
 
 extern char vdso64_start, vdso64_end;
-static void *vdso64_kbase = &vdso64_start;
-static unsigned int vdso64_pages;
+static void *vdso64_kbase __ro_after_init = &vdso64_start;
+static unsigned int vdso64_pages __ro_after_init;
 #ifdef CONFIG_PPC64
-unsigned long vdso64_rt_sigtramp;
+unsigned long vdso64_rt_sigtramp __ro_after_init;
 #endif /* CONFIG_PPC64 */
 
-static int vdso_ready;
+static int vdso_ready __ro_after_init;
 
 /*
  * The vdso data page (aka. systemcfg for old ppc64 fans) is here.
-- 
2.25.0



[PATCH v2 1/5] powerpc/vdso: Remove DBG()

2020-08-27 Thread Christophe Leroy
DBG() is defined as void when DEBUG is not defined,
and DEBUG is explicitly undefined.

It means there is no other way than modifying source code
to get the messages printed.

It was most likely useful in the first days of VDSO, but
today the only 3 DBG() calls don't deserve a special
handling.

Just remove them. If one day someone needs such messages back,
use a standard pr_debug() or equivalent.

Signed-off-by: Christophe Leroy 
---
This is a follow up series, applying on top of the series that
switches powerpc VDSO to _install_special_mapping(),
rebased on today's powerpc/next-test (dd419a93bd99)

v2 drops the change to arch_setup_additional_pages() that handled
is_32bit_task() returning true while CONFIG_VDSO32 is not set,
as this should never happen.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/vdso.c | 13 -
 1 file changed, 13 deletions(-)

diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c
index e2568d9ecdff..3ef3fc546ac8 100644
--- a/arch/powerpc/kernel/vdso.c
+++ b/arch/powerpc/kernel/vdso.c
@@ -31,14 +31,6 @@
 #include 
 #include 
 
-#undef DEBUG
-
-#ifdef DEBUG
-#define DBG(fmt...) printk(fmt)
-#else
-#define DBG(fmt...)
-#endif
-
 /* Max supported size for symbol names */
 #define MAX_SYMNAME	64
 
@@ -567,9 +559,6 @@ static __init int vdso_fixup_alt_funcs(struct lib32_elfinfo 
*v32,
if (!match)
continue;
 
-   DBG("replacing %s with %s...\n", patch->gen_name,
-   patch->fix_name ? "NONE" : patch->fix_name);
-
/*
 * Patch the 32 bits and 64 bits symbols. Note that we do not
 * patch the "." symbol on 64 bits.
@@ -704,7 +693,6 @@ static int __init vdso_init(void)
 * Calculate the size of the 64 bits vDSO
 */
vdso64_pages = (&vdso64_end - &vdso64_start) >> PAGE_SHIFT;
-   DBG("vdso64_kbase: %p, 0x%x pages\n", vdso64_kbase, vdso64_pages);
 
vdso32_kbase = &vdso32_start;
 
@@ -712,7 +700,6 @@ static int __init vdso_init(void)
 * Calculate the size of the 32 bits vDSO
 */
vdso32_pages = (&vdso32_end - &vdso32_start) >> PAGE_SHIFT;
-   DBG("vdso32_kbase: %p, 0x%x pages\n", vdso32_kbase, vdso32_pages);
 
/*
 * Setup the syscall map in the vDOS
-- 
2.25.0



[PATCH v2 3/5] powerpc/vdso: Initialise vdso32_kbase at compile time

2020-08-27 Thread Christophe Leroy
Initialise vdso32_kbase at compile time like vdso64_kbase.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/vdso.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c
index 8f245e988a8a..fb393266b9cb 100644
--- a/arch/powerpc/kernel/vdso.c
+++ b/arch/powerpc/kernel/vdso.c
@@ -37,13 +37,12 @@
 /* The alignment of the vDSO */
 #define VDSO_ALIGNMENT (1 << 16)
 
+extern char vdso32_start, vdso32_end;
 static unsigned int vdso32_pages;
-static void *vdso32_kbase;
+static void *vdso32_kbase = &vdso32_start;
 unsigned long vdso32_sigtramp;
 unsigned long vdso32_rt_sigtramp;
 
-extern char vdso32_start, vdso32_end;
-
 extern char vdso64_start, vdso64_end;
 static void *vdso64_kbase = &vdso64_start;
 static unsigned int vdso64_pages;
@@ -689,8 +688,6 @@ static int __init vdso_init(void)
 */
vdso64_pages = (&vdso64_end - &vdso64_start) >> PAGE_SHIFT;
 
-   vdso32_kbase = &vdso32_start;
-
/*
 * Calculate the size of the 32 bits vDSO
 */
-- 
2.25.0



Re: [PATCH v3 2/3] usb typec: mt6360: Rename driver/Kconfig/Makefile from mt6360 to mt636x

2020-08-27 Thread ChiYuan Huang
Guenter Roeck wrote on Fri, Aug 28, 2020 at 12:41 AM:
>
> On Thu, Aug 27, 2020 at 07:18:56PM +0800, cy_huang wrote:
> > From: ChiYuan Huang 
> >
> > 1. Rename file from tcpci_mt6360.c to tcpci_mt636x.c
> > 2. Rename internal function from mt6360 to mt636x, except the register
> > definition.
> > 3. Change Kconfig/Makefile from MT6360 to MT636X.
> >
> > Signed-off-by: ChiYuan Huang 
> > ---
> >  drivers/usb/typec/tcpm/Kconfig|   6 +-
> >  drivers/usb/typec/tcpm/Makefile   |   2 +-
> >  drivers/usb/typec/tcpm/tcpci_mt6360.c | 212 --
> >  drivers/usb/typec/tcpm/tcpci_mt636x.c | 212 ++
> >  4 files changed, 216 insertions(+), 216 deletions(-)
> >  delete mode 100644 drivers/usb/typec/tcpm/tcpci_mt6360.c
> >  create mode 100644 drivers/usb/typec/tcpm/tcpci_mt636x.c
>
> Maybe Heikki is ok with this change, but I am not, for the reasons
> mentioned before. So I won't approve this patch. Note that, either
> case, it should be merged with the first patch.

Yes, I agree with your opinion. With 636x, the range from 0 to 9 is too
wide, and the parts may not all be compatible.
It is even possible that parts in that number range don't have the
same function.
So I'm going to remove the rename patch.
Do I need to add a patch named "revert", or should I just remove it? I'm
not sure which way is better.

It seems you all want the code change to be squashed into the first
patch, with the DT binding as the second one. Right?


>
> Guenter


[PATCH] f2fs: prevent compressed file from being disabled after releasing cblocks

2020-08-27 Thread Daeho Jeong
From: Daeho Jeong 

After releasing cblocks, a compressed file can accidentally have
compression disabled, since it has zero cblocks. As we are using the
IMMUTABLE flag to represent the released-cblocks state, we can add an
IMMUTABLE check when deciding whether to disable compression on the file.

Signed-off-by: Daeho Jeong 
---
 fs/f2fs/f2fs.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 02811ce15059..14d30740ba03 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -3936,6 +3936,8 @@ static inline u64 f2fs_disable_compressed_file(struct 
inode *inode)
if (!f2fs_compressed_file(inode))
return 0;
if (S_ISREG(inode->i_mode)) {
+   if (IS_IMMUTABLE(inode))
+   return 1;
if (get_dirty_pages(inode))
return 1;
if (fi->i_compr_blocks)
-- 
2.28.0.402.g5ffc5be6b7-goog



Re: [PATCH v1 4/9] powerpc/vdso: Remove unnecessary ifdefs in vdso_pagelist initialization

2020-08-27 Thread Christophe Leroy




On 28/08/2020 at 07:40, Christophe Leroy wrote:



On 27/08/2020 at 15:19, Michael Ellerman wrote:

Christophe Leroy  writes:

On 08/26/2020 02:58 PM, Michael Ellerman wrote:

Christophe Leroy  writes:

diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c
index daef14a284a3..bbb69832fd46 100644
--- a/arch/powerpc/kernel/vdso.c
+++ b/arch/powerpc/kernel/vdso.c
@@ -718,16 +710,14 @@ static int __init vdso_init(void)

...

-
-#ifdef CONFIG_VDSO32
   vdso32_kbase = _start;
   /*
@@ -735,8 +725,6 @@ static int __init vdso_init(void)
    */
   vdso32_pages = (_end - _start) >> PAGE_SHIFT;
   DBG("vdso32_kbase: %p, 0x%x pages\n", vdso32_kbase, 
vdso32_pages);

-#endif


This didn't build for ppc64le:


/opt/cross/gcc-8.20_binutils-2.32/powerpc64-unknown-linux-gnu/bin/powerpc64-unknown-linux-gnu-ld: 
arch/powerpc/kernel/vdso.o:(.toc+0x0): undefined reference to 
`vdso32_end'

/opt/cross/gcc-8.20_binutils-2.32/powerpc64-unknown-linux-gnu/bin/powerpc64-unknown-linux-gnu-ld: 
arch/powerpc/kernel/vdso.o:(.toc+0x8): undefined reference to 
`vdso32_start'
    make[1]: *** [/scratch/michael/build/maint/Makefile:1166: 
vmlinux] Error 1

    make: *** [Makefile:185: __sub-make] Error 2

So I just put that ifdef back.



The problem is because is_32bit() can still return true even when
CONFIG_VDSO32 is not set.


Hmm, you're right. My config had CONFIG_COMPAT enabled.

But that seems like a bug, if someone enables COMPAT on ppc64le they are
almost certainly going to want VDSO32 as well.

So I think I'll do a lead up patch as below.


Ah yes, and with that then no need to consider the case where 
is_32bit_task() is true and CONFIG_VDSO32 is not selected.


I'll update my leading series accordingly.


I meant follow up series.

Christophe


Christophe



cheers

diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/platforms/Kconfig.cputype
index d4fd109f177e..cf2da1e401ef 100644
--- a/arch/powerpc/platforms/Kconfig.cputype
+++ b/arch/powerpc/platforms/Kconfig.cputype
@@ -501,13 +501,12 @@ endmenu
 
 config VDSO32
 	def_bool y
-	depends on PPC32 || CPU_BIG_ENDIAN
+	depends on PPC32 || COMPAT
 	help
 	  This symbol controls whether we build the 32-bit VDSO. We obviously
 	  want to do that if we're building a 32-bit kernel. If we're building
-	  a 64-bit kernel then we only want a 32-bit VDSO if we're building for
-	  big endian. That is because the only little endian configuration we
-	  support is ppc64le which is 64-bit only.
+	  a 64-bit kernel then we only want a 32-bit VDSO if we're also enabling
+	  COMPAT.
 
 choice
 	prompt "Endianness selection"



[PATCH 0/1] Remove redundant condition for MTK_TIMER

2020-08-27 Thread Freddy Hsin
Remove the redundant condition on MTK_TIMER: the driver works
normally on MTK platforms, so COMPILE_TEST is no longer needed
for development purposes.

Freddy Hsin (1):
  timer: mt6873: remove COMPILE_TEST condition for MTK timer

 drivers/clocksource/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


[PATCH v1 1/1] timer: mt6873: remove COMPILE_TEST condition for MTK timer

2020-08-27 Thread Freddy Hsin
MTK timer driver can work on MTK platform normally, so remove
the redundant condition for MTK_TIMER

Signed-off-by: Freddy Hsin 
---
 drivers/clocksource/Kconfig |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig
index 9141838..1ec5d94 100644
--- a/drivers/clocksource/Kconfig
+++ b/drivers/clocksource/Kconfig
@@ -472,7 +472,7 @@ config SYS_SUPPORTS_SH_CMT
bool
 
 config MTK_TIMER
-   bool "Mediatek timer driver" if COMPILE_TEST
+   bool "Mediatek timer driver"
depends on HAS_IOMEM
select TIMER_OF
select CLKSRC_MMIO
-- 
1.7.9.5


Re: [PATCH v1 4/9] powerpc/vdso: Remove unnecessary ifdefs in vdso_pagelist initialization

2020-08-27 Thread Christophe Leroy




On 27/08/2020 at 15:19, Michael Ellerman wrote:

Christophe Leroy  writes:

On 08/26/2020 02:58 PM, Michael Ellerman wrote:

Christophe Leroy  writes:

diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c
index daef14a284a3..bbb69832fd46 100644
--- a/arch/powerpc/kernel/vdso.c
+++ b/arch/powerpc/kernel/vdso.c
@@ -718,16 +710,14 @@ static int __init vdso_init(void)

...
   
-

-#ifdef CONFIG_VDSO32
vdso32_kbase = _start;
   
   	/*

@@ -735,8 +725,6 @@ static int __init vdso_init(void)
 */
vdso32_pages = (_end - _start) >> PAGE_SHIFT;
DBG("vdso32_kbase: %p, 0x%x pages\n", vdso32_kbase, vdso32_pages);
-#endif


This didn't build for ppc64le:


/opt/cross/gcc-8.20_binutils-2.32/powerpc64-unknown-linux-gnu/bin/powerpc64-unknown-linux-gnu-ld:
 arch/powerpc/kernel/vdso.o:(.toc+0x0): undefined reference to `vdso32_end'

/opt/cross/gcc-8.20_binutils-2.32/powerpc64-unknown-linux-gnu/bin/powerpc64-unknown-linux-gnu-ld:
 arch/powerpc/kernel/vdso.o:(.toc+0x8): undefined reference to `vdso32_start'
make[1]: *** [/scratch/michael/build/maint/Makefile:1166: vmlinux] Error 1
make: *** [Makefile:185: __sub-make] Error 2

So I just put that ifdef back.



The problem is because is_32bit() can still return true even when
CONFIG_VDSO32 is not set.


Hmm, you're right. My config had CONFIG_COMPAT enabled.

But that seems like a bug, if someone enables COMPAT on ppc64le they are
almost certainly going to want VDSO32 as well.

So I think I'll do a lead up patch as below.


Ah yes, and with that then no need to consider the case where 
is_32bit_task() is true and CONFIG_VDSO32 is not selected.


I'll update my leading series accordingly.

Christophe



cheers

diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/platforms/Kconfig.cputype
index d4fd109f177e..cf2da1e401ef 100644
--- a/arch/powerpc/platforms/Kconfig.cputype
+++ b/arch/powerpc/platforms/Kconfig.cputype
@@ -501,13 +501,12 @@ endmenu
 
 config VDSO32
 	def_bool y
-	depends on PPC32 || CPU_BIG_ENDIAN
+	depends on PPC32 || COMPAT
 	help
 	  This symbol controls whether we build the 32-bit VDSO. We obviously
 	  want to do that if we're building a 32-bit kernel. If we're building
-	  a 64-bit kernel then we only want a 32-bit VDSO if we're building for
-	  big endian. That is because the only little endian configuration we
-	  support is ppc64le which is 64-bit only.
+	  a 64-bit kernel then we only want a 32-bit VDSO if we're also enabling
+	  COMPAT.
 
 choice
 	prompt "Endianness selection"



Re: [PATCH 09/23] cachefiles: use ASSERT_FAIL()/ASSERT_WARN() to cleanup some code

2020-08-27 Thread kernel test robot
Hi Chunguang,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on mkp-scsi/for-next]
[also build test ERROR on scsi/for-next block/for-next linus/master 
asm-generic/master v5.9-rc2 next-20200827]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Chunguang-Xu/clean-up-the-code-related-to-ASSERT/20200827-182148
base:   https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git for-next
config: sh-allmodconfig (attached as .config)
compiler: sh4-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=sh 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All error/warnings (new ones prefixed by >>):

   In file included from include/linux/kernel.h:11,
from include/linux/list.h:9,
from include/linux/module.h:12,
from fs/cachefiles/bind.c:8:
   fs/cachefiles/bind.c: In function 'cachefiles_daemon_bind':
>> fs/cachefiles/internal.h:319:31: error: 'x' undeclared (first use in this 
>> function)
 319 | #define ASSERT(X) ASSERT_FAIL(x)
 |   ^
   include/linux/compiler.h:78:42: note: in definition of macro 'unlikely'
  78 | # define unlikely(x) __builtin_expect(!!(x), 0)
 |  ^
   fs/cachefiles/internal.h:319:19: note: in expansion of macro 'ASSERT_FAIL'
 319 | #define ASSERT(X) ASSERT_FAIL(x)
 |   ^~~
   fs/cachefiles/bind.c:39:2: note: in expansion of macro 'ASSERT'
  39 |  ASSERT(cache->fstop_percent >= 0 &&
 |  ^~
   fs/cachefiles/internal.h:319:31: note: each undeclared identifier is 
reported only once for each function it appears in
 319 | #define ASSERT(X) ASSERT_FAIL(x)
 |   ^
   include/linux/compiler.h:78:42: note: in definition of macro 'unlikely'
  78 | # define unlikely(x) __builtin_expect(!!(x), 0)
 |  ^
   fs/cachefiles/internal.h:319:19: note: in expansion of macro 'ASSERT_FAIL'
 319 | #define ASSERT(X) ASSERT_FAIL(x)
 |   ^~~
   fs/cachefiles/bind.c:39:2: note: in expansion of macro 'ASSERT'
  39 |  ASSERT(cache->fstop_percent >= 0 &&
 |  ^~
--
   In file included from include/linux/kernel.h:11,
from include/linux/list.h:9,
from include/linux/module.h:12,
from fs/cachefiles/daemon.c:8:
   fs/cachefiles/daemon.c: In function 'cachefiles_daemon_release':
>> fs/cachefiles/internal.h:319:31: error: 'x' undeclared (first use in this 
>> function)
 319 | #define ASSERT(X) ASSERT_FAIL(x)
 |   ^
   include/linux/compiler.h:78:42: note: in definition of macro 'unlikely'
  78 | # define unlikely(x) __builtin_expect(!!(x), 0)
 |  ^
   fs/cachefiles/internal.h:319:19: note: in expansion of macro 'ASSERT_FAIL'
 319 | #define ASSERT(X) ASSERT_FAIL(x)
 |   ^~~
   fs/cachefiles/daemon.c:135:2: note: in expansion of macro 'ASSERT'
 135 |  ASSERT(cache);
 |  ^~
   fs/cachefiles/internal.h:319:31: note: each undeclared identifier is 
reported only once for each function it appears in
 319 | #define ASSERT(X) ASSERT_FAIL(x)
 |   ^
   include/linux/compiler.h:78:42: note: in definition of macro 'unlikely'
  78 | # define unlikely(x) __builtin_expect(!!(x), 0)
 |  ^
   fs/cachefiles/internal.h:319:19: note: in expansion of macro 'ASSERT_FAIL'
 319 | #define ASSERT(X) ASSERT_FAIL(x)
 |   ^~~
   fs/cachefiles/daemon.c:135:2: note: in expansion of macro 'ASSERT'
 135 |  ASSERT(cache);
 |  ^~
   fs/cachefiles/daemon.c: In function 'cachefiles_daemon_write':
>> fs/cachefiles/internal.h:319:31: error: 'x' undeclared (first use in this 
>> function)
 319 | #define ASSERT(X) ASSERT_FAIL(x)
 |   ^
   include/linux/compiler.h:78:42: note: in definition of macro 'unlikely'
  78 | # define unlikely(x) __builtin_expect(!!(x), 0)
 |  ^
   fs/cachefiles/internal.h:319:19: note: in expansion of macro 'ASSERT_FAIL'
 319 | #define ASSERT(X) ASSERT_FAIL(x)
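
[Archive note] The errors above all come from one line: the macro parameter is spelled `X` but the body uses `x`, so every expansion refers to an undeclared identifier. Below is a minimal userspace sketch of the corrected definition. The body of ASSERT_FAIL() and the helper names (assert_failures, check_percent) are assumptions for illustration; the real macro from the series is not shown in this report and would presumably call BUG() rather than count failures.

```c
#include <stdio.h>

/* Hypothetical counter so the sketch can report failures instead of
 * stopping the machine as kernel code would. */
int assert_failures;

/* Assumed stand-in for the ASSERT_FAIL() helper introduced by the series. */
#define ASSERT_FAIL(x)                                                 \
	do {                                                           \
		if (!(x)) {                                            \
			assert_failures++;                             \
			fprintf(stderr, "assertion failed: %s\n", #x); \
		}                                                      \
	} while (0)

/* The parameter name and the name used in the body must match exactly.
 * The failing build had ASSERT(X) expanding to ASSERT_FAIL(x). */
#define ASSERT(X) ASSERT_FAIL(X)

/* Returns 1 if the assertion held, 0 if it fired. */
int check_percent(int percent)
{
	int before = assert_failures;

	ASSERT(percent >= 0 && percent < 100);
	return assert_failures == before;
}
```

Macro parameter names are case sensitive, so the mismatch only shows up once the macro is actually expanded, which is why each caller reports the same error.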

linux-next: Tree for Aug 28

2020-08-27 Thread Stephen Rothwell
Hi all,

News:  There will be no linux-next releases next Monday or Tuesday.

Changes since 20200827:

The net-next tree gained a conflict against the net tree.

Non-merge commits (relative to Linus' tree): 3122
 3668 files changed, 104159 insertions(+), 39605 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc, an allmodconfig for x86_64, a
multi_v7_defconfig for arm and a native build of tools/perf. After
the final fixups (if any), I do an x86_64 modules_install followed by
builds for x86_64 allnoconfig, powerpc allnoconfig (32 and 64 bit),
ppc44x_defconfig, allyesconfig and pseries_le_defconfig and i386, sparc
and sparc64 defconfig and htmldocs. And finally, a simple boot test
of the powerpc pseries_le_defconfig kernel in qemu (with and without
kvm enabled).

Below is a summary of the state of the merge.

I am currently merging 328 trees (counting Linus' and 86 trees of bug
fix patches pending for the current merge release).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (15bc20c6af4c Merge tag 'tty-5.9-rc3' of 
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty)
Merging fixes/fixes (9123e3a74ec7 Linux 5.9-rc1)
Merging kbuild-current/fixes (d012a7190fc1 Linux 5.9-rc2)
Merging arc-current/for-curr (89d29997f103 irqchip/eznps: Fix build error for 
!ARC700 builds)
Merging arm-current/fixes (5c6360ee4a0e ARM: 8988/1: mmu: fix crash in EFI 
calls due to p4d typo in create_mapping_late())
Merging arm64-fixes/for-next/fixes (8d75785a8142 ARM64: vdso32: Install vdso32 
from vdso_install)
Merging arm-soc-fixes/arm/fixes (9c8b0a9c37b7 Merge tag 'imx-fixes-5.9' of 
git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into arm/fixes)
Merging uniphier-fixes/fixes (48778464bb7d Linux 5.8-rc2)
Merging drivers-memory-fixes/fixes (7ff3a2a626f7 memory: jz4780_nemc: Fix an 
error pointer vs NULL check in probe())
Merging m68k-current/for-linus (382f429bb559 m68k: defconfig: Update defconfigs 
for v5.8-rc3)
Merging powerpc-fixes/fixes (16d83a540ca4 Revert "powerpc/powernv/idle: Replace 
CPU feature check with PVR check")
Merging s390-fixes/fixes (bffc2f7aa963 s390/vmem: fix vmem_add_range for 
4-level paging)
Merging sparc/master (0a95a6d1a4cd sparc: use for_each_child_of_node() macro)
Merging fscrypt-current/for-stable (2b4eae95c736 fscrypt: don't evict dirty 
inodes after removing key)
Merging net/master (b43c75abfd08 rxrpc: Fix memory leak in 
rxkad_verify_response())
Merging bpf/master (7787b6fc938e bpf, sysctl: Let bpf_stats_handler take a 
kernel pointer buffer)
Merging ipsec/master (45a36a18d019 xfrmi: drop ignore_df check before updating 
pmtu)
Merging netfilter/master (3622adb02623 ipv6: ndisc: adjust 
ndisc_ifinfo_sysctl_change prototype)
Merging ipvs/master (7c7ab580db49 net: Convert to use the fallthrough macro)
Merging wireless-drivers/master (4afc850e2e9e mwifiex: Increase AES key storage 
size to 256 bits)
Merging mac80211/master (2d9b55508556 cfg80211: Adjust 6 GHz frequency to 
channel conversion)
Merging rdma-fixes/for-rc (097a9d23b725 RDMA/bnxt_re: Remove the qp from list 
only if the qp destroy succeeds)
Merging sound-current/for-linus (858e0ad9301d ALSA: hda/hdmi: always check pin 
power status in i915 pin fixup)
Merging sound-asoc-fixes/for-linus (d563b6c834ae Merge series "ASoC: Fix return 
check for devm_regmap_init_sdw()" from Vinod Koul :)
Merging regmap-fixes/for-linus (d012a7190fc1 Linux 5.9-rc2)
Merging regulator-fixes/for-linus (3bec5b6aae83 Merge tag 'v5.9-rc2' into 
regulator-5.9)
Merging spi-fixes/for-linus (3812e0343b42 Merge remote-tracking branch 
'spi/for-5.9' into spi-linus)
Merging pci-current/for-linus (7c2308f79fc8 PCI/P2PDMA: Fix build without DMA 
ops)
Merging driver-core.current/driver-core-linus (d012a7190fc1 Linux 5.9-rc2)
Merging tty.current/tty-linus (ea1fc02e12b6 tty: serial: imx: add dep

Re: [PATCH v18 00/32] per memcg lru_lock

2020-08-27 Thread Alex Shi


On 2020/8/28 at 9:40 AM, Daniel Jordan wrote:
> I went back to your v1 post to see what motivated you originally, and you had
> some results from aim9 but nothing about where this reared its head in the
> first place.  How did you discover the bottleneck?  I'm just curious about how
> lru_lock hurts in practice.

We have seen very high 'sys' time on some of our business machines, and found
that much of the time was spent on the lru_lock and/or the zone lock. It seems
a per-memcg lru_lock could help with this, but we still have no idea about the
zone lock.

Thanks
Alex


Re: Aw: Re: [PATCH v5 3/7] drm/mediatek: disable tmds on mt2701

2020-08-27 Thread Frank Wunderlich
Without this patch I get flickering/horizontal distortion on my 1280x1024
TFT (it looks like every second line is shifted horizontally by approx. 5 px
relative to the line above).

Fbcon is unreadable with this problem.

Hard to describe by words only :(

On 28 August 2020 at 01:46:07 CEST, Chun-Kuang Hu wrote:
>Hi, Frank:
>
>Matthias Brugger wrote on Thu, Aug 27, 2020 at 10:28 PM:
>>
>>
>>
>> On 27/08/2020 15:41, Frank Wunderlich wrote:
>> > Hi Matthias,
>> >
>> > any opinions about the dts-changes?
>> >
>>
>> they look good to me.
>>
>> > maybe series except the tmds-Patch get merged...so i add it only to
>my own repo till we find a better way?
>> > currently mainline does not support hdmi at all for the board. the
>tmds-patch is only a fix for specific resolutions which have a
>"flickering" without this Patch.
>> >
>>
>> Well let's see what CK's opinion.
>>
>
>Because no one has comment on this patch, I could apply this patch but
>I need you to add more experiment information so if someone meets
>another bug, he could fix his bug and consider your problem.
>
>Regards,
>Chun-Kuang.
>
>> Regards,
>> Matthias
>

regards Frank


Re: [PATCH 6/6] gpio: zynq: Simplify with dev_err_probe()

2020-08-27 Thread Michal Simek



On 27. 08. 20 22:08, Krzysztof Kozlowski wrote:
> Common pattern of handling deferred probe can be simplified with
> dev_err_probe().  Less code and also it prints the error value.
> 
> Signed-off-by: Krzysztof Kozlowski 
> ---
>  drivers/gpio/gpio-zynq.c | 8 +++-
>  1 file changed, 3 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpio/gpio-zynq.c b/drivers/gpio/gpio-zynq.c
> index 53d1387592fd..0b5a17ab996f 100644
> --- a/drivers/gpio/gpio-zynq.c
> +++ b/drivers/gpio/gpio-zynq.c
> @@ -929,11 +929,9 @@ static int zynq_gpio_probe(struct platform_device *pdev)
>  
>   /* Retrieve GPIO clock */
>  	gpio->clk = devm_clk_get(&pdev->dev, NULL);
> -	if (IS_ERR(gpio->clk)) {
> -		if (PTR_ERR(gpio->clk) != -EPROBE_DEFER)
> -			dev_err(&pdev->dev, "input clock not found.\n");
> -		return PTR_ERR(gpio->clk);
> -	}
> +	if (IS_ERR(gpio->clk))
> +		return dev_err_probe(&pdev->dev, PTR_ERR(gpio->clk), "input clock not found.\n");
> +
>  	ret = clk_prepare_enable(gpio->clk);
>  	if (ret) {
>  		dev_err(&pdev->dev, "Unable to enable clock.\n");
> 

Reviewed-by: Michal Simek 

Thanks,
Michal
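
[Archive note] The pattern that the patch folds into dev_err_probe() is "log the error unless it is a probe deferral, then return the error code". The following is a rough userspace model of that behaviour, not the kernel implementation: model_err_probe() and error_messages_printed are hypothetical names, and the real helper takes a struct device and a printf-style format.

```c
#include <errno.h>
#include <stdio.h>

/* EPROBE_DEFER is kernel-internal (value 517 in include/linux/errno.h);
 * it is defined here only so the sketch builds in userspace. */
#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517
#endif

/* Hypothetical counter so callers can observe whether a message was logged. */
int error_messages_printed;

/* Model of the pattern: stay quiet on deferral, log anything else,
 * and always hand the error code back so the caller can return it. */
int model_err_probe(const char *dev_name, int err, const char *msg)
{
	if (err != -EPROBE_DEFER) {
		fprintf(stderr, "%s: error %d: %s", dev_name, err, msg);
		error_messages_printed++;
	}
	return err;
}
```

Because the helper returns the code it was given, the caller can collapse the five-line if block into a single return statement, as the diff above does.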


Re: [PATCH] mmc: sdhci-msm: When dev_pm_opp_of_add_table() returns 0 it's not an error

2020-08-27 Thread Viresh Kumar
On 27-08-20, 08:33, Douglas Anderson wrote:
> The commit d05a7238fe1c ("mmc: sdhci-msm: Unconditionally call
> dev_pm_opp_of_remove_table()") works fine in the case where there is
> no OPP table.  However, if there is an OPP table then
> dev_pm_opp_of_add_table() will return 0.  Since 0 != -ENODEV then the
> "if (ret != -ENODEV)" will evaluate to true and we'll fall into the
> error case.  Oops.
> 
> Let's fix this.
> 
> Fixes: d05a7238fe1c ("mmc: sdhci-msm: Unconditionally call 
> dev_pm_opp_of_remove_table()")
> Signed-off-by: Douglas Anderson 
> ---
> 
>  drivers/mmc/host/sdhci-msm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/mmc/host/sdhci-msm.c b/drivers/mmc/host/sdhci-msm.c
> index b7e47107a31a..55101dba42bd 100644
> --- a/drivers/mmc/host/sdhci-msm.c
> +++ b/drivers/mmc/host/sdhci-msm.c
> @@ -2284,7 +2284,7 @@ static int sdhci_msm_probe(struct platform_device *pdev)
>  
>   /* OPP table is optional */
>  	ret = dev_pm_opp_of_add_table(&pdev->dev);
> -	if (ret != -ENODEV) {
> +	if (ret && ret != -ENODEV) {
>  		dev_err(&pdev->dev, "Invalid OPP table in Device tree\n");
>   goto opp_cleanup;
>   }

Wow!

How many bugs did I introduce with a simple patch :(

@Ulf, since this is material for 5.10 I was planning to resend the
original patch itself with all the things fixed. Will you be able to
rebase your tree? Or do you want to apply fixes separately ?

-- 
viresh
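
[Archive note] The bug described in the quoted commit message is easy to check in isolation. The two predicates below are a sketch with hypothetical names; they only model the condition, not the driver.

```c
#include <errno.h>

/* Original check: anything other than -ENODEV counts as an error,
 * which wrongly includes the success value 0. */
int opp_result_is_error_buggy(int ret)
{
	return ret != -ENODEV;
}

/* Fixed check: 0 means the table was added, -ENODEV means there is no
 * table (allowed, since it is optional); everything else is an error. */
int opp_result_is_error_fixed(int ret)
{
	return ret && ret != -ENODEV;
}
```

With the original predicate a board that does have a valid OPP table takes the error path, which matches the failure described above.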


Re: Printing bitfields in the kernel (Re: [PATCH] drm: Parse Colorimetry data block from EDID)

2020-08-27 Thread Joe Perches
On Thu, 2020-08-27 at 10:34 +0300, Pekka Paalanen wrote:
> On Wed, 26 Aug 2020 22:23:28 +0800
> Algea Cao  wrote:
> 
> > CEA 861.3 spec adds colorimetry data block for HDMI.
> > Parsing the block to get the colorimetry data from
> > panel.

If flags are int, I could imagine another %p extension
where %*p is used like:

printk("flags: %*pn", flags, bitstrings)

where flags is:

BIT(0)
BIT(1)
...
BIT(last)

and

char *bitstrings[] = {
"bit 0 description",
"bit 1 description",
...
"last bit description"
};

Or define YA struct with 2 entries as the struct members
and use that.

struct foo {
unsigned long flags;
char **descriptions;
};

struct foo bar = {.flags =  .descriptions = bitstrings};

printk("flags: %p\n, );
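
[Archive note] No such %p extension exists in the kernel's printk; the idea above is a proposal. As a rough userspace sketch of what such a formatter might do with the flags/descriptions pair, assuming a '|' separator and the hypothetical names flag_names and format_flags():

```c
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical pairing of a flags word with one description per bit. */
struct flag_names {
	unsigned long flags;
	const char *const *descriptions;
	size_t count;		/* number of entries in descriptions[] */
};

/* Write the description of every set bit into buf, separated by '|'.
 * Returns the number of characters written (excluding the NUL),
 * or -1 if buf is too small. */
int format_flags(const struct flag_names *fn, char *buf, size_t len)
{
	size_t used = 0;
	size_t max_bits = sizeof(unsigned long) * CHAR_BIT;

	if (len == 0)
		return -1;
	buf[0] = '\0';

	for (size_t i = 0; i < fn->count && i < max_bits; i++) {
		int n;

		if (!(fn->flags & (1UL << i)))
			continue;

		n = snprintf(buf + used, len - used, "%s%s",
			     used ? "|" : "", fn->descriptions[i]);
		if (n < 0 || (size_t)n >= len - used)
			return -1;
		used += (size_t)n;
	}
	return (int)used;
}
```

Carrying the array length alongside the flags is a design choice of this sketch; the struct in the mail has no count member, so a real implementation would need some other way to bound the table.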




Re: [RFC 0/2] Add risc-v vhost-net support

2020-08-27 Thread Anup Patel
On Fri, Jul 24, 2020 at 2:25 PM Yifei Jiang  wrote:
>
> Hi,
>
> These two patches enable support for vhost-net on RISC-V architecture. They
> are developed based on the Linux source in this repo:
> https://github.com/avpatel/linux, the branch is riscv_kvm_v13.
>
> The accompanying QEMU is from the repo: https://github.com/alistair23/qemu,
> the branch is hyp-ext-v0.6.next. In order for the QEMU to work with KVM, the
> patch found here is necessary: https://patchwork.kernel.org/cover/11435965/
>
> Several steps to use this:
>
> 1. create virbr0 on riscv64 emulation
> $ brctl addbr virbr0
> $ brctl stp virbr0 on
> $ ifconfig virbr0 up
> $ ifconfig virbr0  netmask 
>
> 2. boot riscv64 guestOS on riscv64 emulation
> $ ./qemu-system-riscv64 -M virt,accel=kvm -m 1024M -cpu host -nographic \
> -name guest=riscv-guest \
> -smp 2 \
> -kernel ./Image \
> -drive file=./guest.img,format=raw,id=hd0 \
> -device virtio-blk,drive=hd0 \
> -netdev type=tap,vhost=on,script=./ifup.sh,downscript=./ifdown.sh,id=net0 \
> -append "root=/dev/vda rw console=ttyS0 earlycon=sbi"
>
> $ cat ifup.sh
> #!/bin/sh
> brctl addif virbr0 $1
> ifconfig $1 up
>
> $ cat ifdown.sh
> #!/bin/sh
> ifconfig $1 down
> brctl delif virbr0 $1
>
> This benchmark compares vhost-net with virtio:
>
> $ ./netperf -H  -l 100 -t TCP_STREAM
>
> vhost-net:
> Recv    Send    Send
> Socket  Socket  Message  Elapsed
> Size    Size    Size     Time     Throughput
> bytes   bytes   bytes    secs.    10^6bits/sec
>
> 131072  16384   16384    100.07   457.55
>
> virtio:
> Recv    Send    Send
> Socket  Socket  Message  Elapsed
> Size    Size    Size     Time     Throughput
> bytes   bytes   bytes    secs.    10^6bits/sec
>
> 131072  16384   16384    100.07   227.02
>
>
> The next step is to support irqfd on RISC-V architecture.
>
> Yifei Jiang (2):
>   RISC-V: KVM: enable ioeventfd capability and compile for risc-v
>   RISC-V: KVM: read\write kernel mmio device support
>
>  arch/riscv/kvm/Kconfig |  2 ++
>  arch/riscv/kvm/Makefile|  2 +-
>  arch/riscv/kvm/vcpu_exit.c | 38 --
>  arch/riscv/kvm/vm.c|  1 +
>  4 files changed, 36 insertions(+), 7 deletions(-)
>
> --
> 2.19.1
>
>

I will be squashing these patches into PATCH7 of v14 KVM RISC-V series.

I will also add your Signed-off-by to PATCH7 of v14 KVM RISC-V to
acknowledge your efforts.

Thanks,
Anup


possible deadlock in proc_pid_syscall (2)

2020-08-27 Thread syzbot
Hello,

syzbot found the following issue on:

HEAD commit:15bc20c6 Merge tag 'tty-5.9-rc3' of git://git.kernel.org/p..
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=15349f9690
kernel config:  https://syzkaller.appspot.com/x/.config?x=978db74cb30aa994
dashboard link: https://syzkaller.appspot.com/bug?extid=db9cdf3dd1f64252c6ef
compiler:   gcc (GCC) 10.1.0-syz 20200507

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+db9cdf3dd1f64252c...@syzkaller.appspotmail.com

==
WARNING: possible circular locking dependency detected
5.9.0-rc2-syzkaller #0 Not tainted
--
syz-executor.0/18445 is trying to acquire lock:
88809f2e0dc8 (>exec_update_mutex){+.+.}-{3:3}, at: lock_trace fs/proc/base.c:408 [inline]
88809f2e0dc8 (>exec_update_mutex){+.+.}-{3:3}, at: proc_pid_syscall+0xaa/0x2b0 fs/proc/base.c:646

but task is already holding lock:
88808e9a3c30 (>lock){+.+.}-{3:3}, at: seq_read+0x61/0x1070 fs/seq_file.c:155

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #3 (>lock){+.+.}-{3:3}:
   __mutex_lock_common kernel/locking/mutex.c:956 [inline]
   __mutex_lock+0x134/0x10e0 kernel/locking/mutex.c:1103
   seq_read+0x61/0x1070 fs/seq_file.c:155
   pde_read fs/proc/inode.c:306 [inline]
   proc_reg_read+0x221/0x300 fs/proc/inode.c:318
   do_loop_readv_writev fs/read_write.c:734 [inline]
   do_loop_readv_writev fs/read_write.c:721 [inline]
   do_iter_read+0x48e/0x6e0 fs/read_write.c:955
   vfs_readv+0xe5/0x150 fs/read_write.c:1073
   kernel_readv fs/splice.c:355 [inline]
   default_file_splice_read.constprop.0+0x4e6/0x9e0 fs/splice.c:412
   do_splice_to+0x137/0x170 fs/splice.c:871
   splice_direct_to_actor+0x307/0x980 fs/splice.c:950
   do_splice_direct+0x1b3/0x280 fs/splice.c:1059
   do_sendfile+0x55f/0xd40 fs/read_write.c:1540
   __do_sys_sendfile64 fs/read_write.c:1601 [inline]
   __se_sys_sendfile64 fs/read_write.c:1587 [inline]
   __x64_sys_sendfile64+0x1cc/0x210 fs/read_write.c:1587
   do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

-> #2 (sb_writers#4){.+.+}-{0:0}:
   percpu_down_read include/linux/percpu-rwsem.h:51 [inline]
   __sb_start_write+0x234/0x470 fs/super.c:1672
   sb_start_write include/linux/fs.h:1643 [inline]
   mnt_want_write+0x3a/0xb0 fs/namespace.c:354
   ovl_setattr+0x5c/0x850 fs/overlayfs/inode.c:28
   notify_change+0xb60/0x10a0 fs/attr.c:336
   chown_common+0x4a9/0x550 fs/open.c:674
   do_fchownat+0x126/0x1e0 fs/open.c:704
   __do_sys_lchown fs/open.c:729 [inline]
   __se_sys_lchown fs/open.c:727 [inline]
   __x64_sys_lchown+0x7a/0xc0 fs/open.c:727
   do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

-> #1 (_i_mutex_dir_key[depth]){}-{3:3}:
   down_read+0x96/0x420 kernel/locking/rwsem.c:1492
   inode_lock_shared include/linux/fs.h:789 [inline]
   lookup_slow fs/namei.c:1560 [inline]
   walk_component+0x409/0x6a0 fs/namei.c:1860
   lookup_last fs/namei.c:2309 [inline]
   path_lookupat+0x1ba/0x830 fs/namei.c:2333
   filename_lookup+0x19f/0x560 fs/namei.c:2366
   create_local_trace_uprobe+0x87/0x4e0 kernel/trace/trace_uprobe.c:1574
   perf_uprobe_init+0x132/0x210 kernel/trace/trace_event_perf.c:323
   perf_uprobe_event_init+0xff/0x1c0 kernel/events/core.c:9580
   perf_try_init_event+0x12a/0x560 kernel/events/core.c:10899
   perf_init_event kernel/events/core.c:10951 [inline]
   perf_event_alloc.part.0+0xdee/0x3770 kernel/events/core.c:11229
   perf_event_alloc kernel/events/core.c:11608 [inline]
   __do_sys_perf_event_open+0x72c/0x2cb0 kernel/events/core.c:11724
   do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

-> #0 (>exec_update_mutex){+.+.}-{3:3}:
   check_prev_add kernel/locking/lockdep.c:2496 [inline]
   check_prevs_add kernel/locking/lockdep.c:2601 [inline]
   validate_chain kernel/locking/lockdep.c:3218 [inline]
   __lock_acquire+0x2a6b/0x5640 kernel/locking/lockdep.c:4426
   lock_acquire+0x1f1/0xad0 kernel/locking/lockdep.c:5005
   __mutex_lock_common kernel/locking/mutex.c:956 [inline]
   __mutex_lock+0x134/0x10e0 kernel/locking/mutex.c:1103
   lock_trace fs/proc/base.c:408 [inline]
   proc_pid_syscall+0xaa/0x2b0 fs/proc/base.c:646
   proc_single_show+0x116/0x1e0 fs/proc/base.c:775
   seq_read+0x432/0x1070 fs/seq_file.c:208
   do_loop_readv_writev fs/read_write.c:734 [inline]
   do_loop_readv_writev fs/read_write.c:721 [inline]
   do_iter_read+0x48e/0x6e0 

Re: [PATCH RFC 2/2] target/kvm: Add interfaces needed for log dirty

2020-08-27 Thread Anup Patel
On Thu, Aug 27, 2020 at 1:54 PM Yifei Jiang  wrote:
>
> Add two interfaces of log dirty for kvm_main.c, and delete the interface
> kvm_vm_ioctl_get_dirty_log which is redundantly defined.
>
> CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT is added in defconfig.
>
> Signed-off-by: Yifei Jiang 
> Signed-off-by: Yipeng Yin 
> ---
>  arch/riscv/configs/defconfig |  1 +
>  arch/riscv/kvm/Kconfig   |  1 +
>  arch/riscv/kvm/mmu.c | 43 
>  arch/riscv/kvm/vm.c  |  6 -
>  4 files changed, 45 insertions(+), 6 deletions(-)
>
> diff --git a/arch/riscv/configs/defconfig b/arch/riscv/configs/defconfig
> index d36e1000bbd3..857d799672c2 100644
> --- a/arch/riscv/configs/defconfig
> +++ b/arch/riscv/configs/defconfig
> @@ -19,6 +19,7 @@ CONFIG_SOC_VIRT=y
>  CONFIG_SMP=y
>  CONFIG_VIRTUALIZATION=y
>  CONFIG_KVM=y
> +CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT=y
>  CONFIG_HOTPLUG_CPU=y
>  CONFIG_MODULES=y
>  CONFIG_MODULE_UNLOAD=y
> diff --git a/arch/riscv/kvm/Kconfig b/arch/riscv/kvm/Kconfig
> index 2356dc52ebb3..91fcffc70e5d 100644
> --- a/arch/riscv/kvm/Kconfig
> +++ b/arch/riscv/kvm/Kconfig
> @@ -26,6 +26,7 @@ config KVM
> select KVM_MMIO
> select HAVE_KVM_VCPU_ASYNC_IOCTL
> select SRCU
> +   select KVM_GENERIC_DIRTYLOG_READ_PROTECT
> help
>   Support hosting virtualized guest machines.
>
> diff --git a/arch/riscv/kvm/mmu.c b/arch/riscv/kvm/mmu.c
> index 88bce80ee983..df2a470c25e4 100644
> --- a/arch/riscv/kvm/mmu.c
> +++ b/arch/riscv/kvm/mmu.c
> @@ -358,6 +358,43 @@ void stage2_wp_memory_region(struct kvm *kvm, int slot)
> kvm_flush_remote_tlbs(kvm);
>  }
>
> +/**
> + * kvm_mmu_write_protect_pt_masked() - write protect dirty pages
> + * @kvm:The KVM pointer
> + * @slot:   The memory slot associated with mask
> + * @gfn_offset: The gfn offset in memory slot
> + * @mask:   The mask of dirty pages at offset 'gfn_offset' in this memory
> + *  slot to be write protected
> + *
> + * Walks bits set in mask write protects the associated pte's. Caller must
> + * acquire kvm_mmu_lock.
> + */
> +static void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
> +struct kvm_memory_slot *slot,
> +gfn_t gfn_offset, unsigned long mask)
> +{
> +phys_addr_t base_gfn = slot->base_gfn + gfn_offset;
> +phys_addr_t start = (base_gfn +  __ffs(mask)) << PAGE_SHIFT;
> +phys_addr_t end = (base_gfn + __fls(mask) + 1) << PAGE_SHIFT;
> +
> +stage2_wp_range(kvm, start, end);
> +}
> +
> +/*
> + * kvm_arch_mmu_enable_log_dirty_pt_masked - enable dirty logging for selected
> + * dirty pages.
> + *
> + * It calls kvm_mmu_write_protect_pt_masked to write protect selected pages to
> + * enable dirty logging for them.
> + */
> +void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
> +struct kvm_memory_slot *slot,
> +gfn_t gfn_offset, unsigned long mask)
> +{
> +kvm_mmu_write_protect_pt_masked(kvm, slot, gfn_offset, mask);
> +}
> +
> +
>  int stage2_ioremap(struct kvm *kvm, gpa_t gpa, phys_addr_t hpa,
>unsigned long size, bool writable)
>  {
> @@ -433,6 +470,12 @@ void kvm_arch_sync_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot)
>  {
>  }
>
> +void kvm_arch_flush_remote_tlbs_memslot(struct kvm *kvm,
> +   struct kvm_memory_slot *memslot)
> +{
> +   kvm_flush_remote_tlbs(kvm);
> +}
> +
>  void kvm_arch_free_memslot(struct kvm *kvm, struct kvm_memory_slot *free)
>  {
>  }
> diff --git a/arch/riscv/kvm/vm.c b/arch/riscv/kvm/vm.c
> index 4f2498198cb5..f7405676903b 100644
> --- a/arch/riscv/kvm/vm.c
> +++ b/arch/riscv/kvm/vm.c
> @@ -12,12 +12,6 @@
>  #include 
>  #include 
>
> -int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
> -{
> -   /* TODO: To be added later. */
> -   return -ENOTSUPP;
> -}
> -
>  int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>  {
> int r;
> --
> 2.19.1
>
>

I already have a similar change as part of v14 KVM RISC-V series.

Let us coordinate better. Please let us know in advance about any
KVM RISC-V feature you plan to work on. Otherwise, effort gets
wasted at your end or at ours.

Regards,
Anup
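
As a side note for readers of the quoted hunk, the way kvm_mmu_write_protect_pt_masked() turns a dirty mask into an address range can be sketched in user-space C. This is a sketch under assumptions: dirty_mask_to_range() and struct gpa_range are hypothetical names, PAGE_SHIFT is assumed to be 12, and the GCC/Clang builtins stand in for the kernel's __ffs()/__fls().

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB pages */

struct gpa_range {
	uint64_t start;	/* first byte to write protect */
	uint64_t end;	/* one past the last byte */
};

/*
 * Map the lowest and highest set bit of 'mask' to a guest physical
 * address range, relative to the slot base plus 'gfn_offset'.
 */
static struct gpa_range dirty_mask_to_range(uint64_t slot_base_gfn,
					    uint64_t gfn_offset,
					    uint64_t mask)
{
	uint64_t base_gfn = slot_base_gfn + gfn_offset;
	unsigned int first, last;
	struct gpa_range r;

	assert(mask != 0);	/* the bit-scan builtins are undefined for 0 */

	first = (unsigned int)__builtin_ctzll(mask);		/* like __ffs() */
	last = 63u - (unsigned int)__builtin_clzll(mask);	/* like __fls() */

	r.start = (base_gfn + first) << PAGE_SHIFT;
	r.end = (base_gfn + last + 1) << PAGE_SHIFT;
	return r;
}
```

One consequence visible in the sketch: the range spans everything between the first and last set bit, so pages whose bits are clear inside that span are covered as well.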


Re: WARNING: at drivers/opp/core.c:678 dev_pm_opp_set_rate+0x4cc/0x5d4 - on arm x15

2020-08-27 Thread Viresh Kumar
On 27-08-20, 21:18, Stephen Rothwell wrote:
> Hi Viresh,
> 
> On Thu, 27 Aug 2020 15:16:51 +0530 Viresh Kumar wrote:
> >
> > On 27-08-20, 15:04, Naresh Kamboju wrote:
> > > While boot testing arm x15 devices the Kernel warning noticed with linux next
> > > tag 20200825.
> > > 
> > > BAD:  next-20200825
> > > GOOD:  next-20200824
> > > 
> > > metadata:
> > >   git branch: master
> > >   git repo: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
> > >   git commit: 3a00d3dfd4b68b208ecd5405e676d06c8ad6bb63
> > >   git describe: next-20200825
> > >   make_kernelversion: 5.9.0-rc2
> > >   kernel-config:
> > > https://builds.tuxbuild.com/LDTu4GFMmvkJspza5LJIjQ/kernel.config
> > > 
> > > We are working on git bisect and boot testing on x15 and get back to you. 
> > >  
> > 
> > Was this working earlier ? But considering that multiple things
> > related to OPP broke recently, it may be an OPP core bug as well. Not
> > sure though.
> > 
> > Can you give me delta between both the next branches for drivers/opp/
> > path ? I didn't get these tags after fetching linux-next.
> 
> Yeah, you need to explicitly fetch the tags as only the latest tag is
> part of the branches in the tree.

Ah, I see. Thanks.

-- 
viresh


Re: [PATCH RFC 1/2] riscv/kvm: Fix use VSIP_VALID_MASK mask HIP register

2020-08-27 Thread Anup Patel
On Thu, Aug 27, 2020 at 1:53 PM Yifei Jiang  wrote:
>
> A correct sip/sie value of 0x222 is wrongly masked to 0x000 by
> VSIP_VALID_MASK because the mask is applied after the shift. This patch fixes it.
>
> Signed-off-by: Yifei Jiang 
> Signed-off-by: Yipeng Yin 
> ---
>  arch/riscv/kvm/vcpu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c
> index adb0815951aa..297e921f 100644
> --- a/arch/riscv/kvm/vcpu.c
> +++ b/arch/riscv/kvm/vcpu.c
> @@ -419,8 +419,8 @@ static int kvm_riscv_vcpu_set_reg_csr(struct kvm_vcpu *vcpu,
>
> if (reg_num == KVM_REG_RISCV_CSR_REG(sip) ||
> reg_num == KVM_REG_RISCV_CSR_REG(sie)) {
> -   reg_val = reg_val << VSIP_TO_HVIP_SHIFT;
> reg_val = reg_val & VSIP_VALID_MASK;
> +   reg_val = reg_val << VSIP_TO_HVIP_SHIFT;

Thanks for this fix. I have squashed it into PATCH5 of KVM RISC-V v14
series.

Regards,
Anup
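
For reference, the ordering problem fixed by the quoted hunk can be reproduced in a few lines of C. This is a sketch under assumptions: the mask value 0x222 is taken from the commit message, the shift of 1 is assumed for illustration, and the two helper names are hypothetical.

```c
#include <assert.h>

#define VSIP_VALID_MASK		0x222UL	/* value quoted in the commit message */
#define VSIP_TO_HVIP_SHIFT	1	/* assumption for this sketch */

/* The shifted mask shares no bits with the mask itself. */
static_assert(((VSIP_VALID_MASK << VSIP_TO_HVIP_SHIFT) & VSIP_VALID_MASK) == 0,
	      "shifted mask and mask must not overlap in this sketch");

/* Old order: shift first, then mask. Valid bits are shifted out of the mask. */
static unsigned long sip_to_hvip_old(unsigned long reg_val)
{
	reg_val = reg_val << VSIP_TO_HVIP_SHIFT;
	reg_val = reg_val & VSIP_VALID_MASK;
	return reg_val;
}

/* New order: mask the guest-visible bits first, then shift them. */
static unsigned long sip_to_hvip_new(unsigned long reg_val)
{
	reg_val = reg_val & VSIP_VALID_MASK;
	reg_val = reg_val << VSIP_TO_HVIP_SHIFT;
	return reg_val;
}
```

Under these assumptions, the old order turns 0x222 into 0x000 while the new order yields 0x444.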


Re: [PATCH v2 3/3] riscv: Add cache information in AUX vector

2020-08-27 Thread kernel test robot
Hi Zong,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v5.9-rc2 next-20200827]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Zong-Li/Get-cache-information-from-userland/20200827-162439
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
15bc20c6af4ceee97a1f90b43c0e386643c071b4
config: riscv-randconfig-s032-20200827 (attached as .config)
compiler: riscv64-linux-gcc (GCC) 9.3.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# apt-get install sparse
# sparse version: v0.6.2-191-g10164920-dirty
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=riscv 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 


sparse warnings: (new ones prefixed by >>)

>> arch/riscv/kernel/cacheinfo.c:39:16: sparse: sparse: Using plain integer as NULL pointer

# https://github.com/0day-ci/linux/commit/a51c248ba0626069792c3f84c8879f685f4a1ff6
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Zong-Li/Get-cache-information-from-userland/20200827-162439
git checkout a51c248ba0626069792c3f84c8879f685f4a1ff6
vim +39 arch/riscv/kernel/cacheinfo.c

26  
27  static struct cacheinfo *get_cacheinfo(u32 level, enum cache_type type)
28  {
29  struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(smp_processor_id());
30  struct cacheinfo *this_leaf;
31  int index;
32  
33  for (index = 0; index < this_cpu_ci->num_leaves; index++) {
34  this_leaf = this_cpu_ci->info_list + index;
35  if (this_leaf->level == level && this_leaf->type == type)
36  return this_leaf;
37  }
38  
  > 39  return 0;
40  }
41  
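
The warning on line 39 is about returning the integer 0 from a function that returns a pointer. A minimal user-space sketch of the same lookup that returns NULL instead is shown below; struct leaf and find_leaf() are hypothetical stand-ins for the kernel's struct cacheinfo and get_cacheinfo(), not the actual fix.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct cacheinfo. */
struct leaf {
	unsigned int level;
	int type;
};

/*
 * Return the first entry matching 'level' and 'type', or NULL when
 * there is none. Returning NULL rather than 0 keeps sparse quiet and
 * makes the "not found" case explicit to the reader.
 */
static struct leaf *find_leaf(struct leaf *leaves, size_t n,
			      unsigned int level, int type)
{
	size_t i;

	assert(leaves != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (leaves[i].level == level && leaves[i].type == type)
			return &leaves[i];
	}

	return NULL;
}
```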

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


.config.gz
Description: application/gzip


Re: [RFC][PATCH 3/7] kprobes: Remove kretprobe hash

2020-08-27 Thread Masami Hiramatsu
On Thu, 27 Aug 2020 18:12:40 +0200
Peter Zijlstra  wrote:

> @@ -1313,25 +1261,28 @@ void kprobe_busy_end(void)
>  void kprobe_flush_task(struct task_struct *tk)
>  {
>   struct kretprobe_instance *ri;
> - struct hlist_head *head, empty_rp;
> + struct hlist_head empty_rp;
> + struct llist_node *node;
>   struct hlist_node *tmp;

We don't need this tmp anymore.

> @@ -1935,71 +1932,45 @@ unsigned long __kretprobe_trampoline_han
>   unsigned long trampoline_address,
>   void *frame_pointer)
>  {
> + kprobe_opcode_t *correct_ret_addr = NULL;
>   struct kretprobe_instance *ri = NULL;
> - struct hlist_head *head, empty_rp;
> + unsigned long orig_ret_address = 0;
> + struct llist_node *first, *node;
> + struct hlist_head empty_rp;
>   struct hlist_node *tmp;

Here too.

I'm trying to port this patch on my v4 series. I'll add my RFC patch of
kretprobe_holder too.

Thank you,

-- 
Masami Hiramatsu 


[PATCH v4] power: supply: sbs-battery: don't assume i2c errors as battery disconnect

2020-08-27 Thread Ikjoon Jang
The current sbs-battery driver treats every smbus error as a disconnection
event when the battery-detect pin isn't supplied, and restores the present
state as soon as any transaction succeeds.

This can lead to unwanted state changes between present and !present
when there's one i2c error and other following commands were successful.

This patch provides a unified way of checking presence by calling
sbs_get_battery_presence_and_health() when detect pin is not used.

Signed-off-by: Ikjoon Jang 
---
v4: rebase from merge conflict, amend commit messages
v3: check return value of get_presence_and_health()
v2: combine get_presence_and_health functions to reuse
---

 drivers/power/supply/sbs-battery.c | 25 +
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/drivers/power/supply/sbs-battery.c b/drivers/power/supply/sbs-battery.c
index 6273211cd673..dacc4bc1c013 100644
--- a/drivers/power/supply/sbs-battery.c
+++ b/drivers/power/supply/sbs-battery.c
@@ -959,10 +959,17 @@ static int sbs_get_property(struct power_supply *psy,
return -EINVAL;
}
 
-   if (!chip->gpio_detect &&
-   chip->is_present != (ret >= 0)) {
-   sbs_update_presence(chip, (ret >= 0));
-   power_supply_changed(chip->power_supply);
+   if (!chip->gpio_detect && chip->is_present != (ret >= 0)) {
+   bool old_present = chip->is_present;
+   union power_supply_propval val;
+
+   ret = sbs_get_battery_presence_and_health(
+   client, POWER_SUPPLY_PROP_PRESENT, &val);
+
+   sbs_update_presence(chip, !ret && val.intval);
+
+   if (old_present != chip->is_present)
+   power_supply_changed(chip->power_supply);
}
 
 done:
@@ -1147,11 +1154,13 @@ static int sbs_probe(struct i2c_client *client)
 * to the battery.
 */
if (!(force_load || chip->gpio_detect)) {
-   rc = sbs_read_word_data(client, sbs_data[REG_STATUS].addr);
+   union power_supply_propval val;
 
-   if (rc < 0) {
-   dev_err(&client->dev, "%s: Failed to get device status\n",
-   __func__);
+   rc = sbs_get_battery_presence_and_health(
+   client, POWER_SUPPLY_PROP_PRESENT, &val);
+   if (rc < 0 || !val.intval) {
+   dev_err(&client->dev, "Failed to get present status\n");
+   rc = -ENODEV;
goto exit_psupply;
}
}
-- 
2.28.0.402.g5ffc5be6b7-goog



Re: [PATCH 17/19] z2ram: reindent

2020-08-27 Thread Christoph Hellwig
On Fri, Aug 28, 2020 at 10:57:46AM +1000, Finn Thain wrote:
> On Thu, 27 Aug 2020, Joe Perches wrote:
> 
> > 
> > checkpatch already does this.
> > 
> 
> Did you use checkpatch to generate this patch?

I used scripts/Lindent.


Re: [PATCH 09/10] sh: don't allow non-coherent DMA for NOMMU

2020-08-27 Thread Christoph Hellwig
On Thu, Aug 27, 2020 at 10:11:53PM -0400, Rich Felker wrote:
> > This change broke SD card support on J2 because MMC_SPI spuriously
> > depends on HAS_DMA. It looks like it can be fixed just by removing
> > that dependency from drivers/mmc/host/Kconfig.
> 
> It can't. mmc_spi_probe fails with ENOMEM, probably due to trying to
> do some DMA setup thing that's not going to be needed if the
> underlying SPI device doesn't support/use DMA.

Adding the linux-mmc and linux-spi lists, as that seems pretty odd.


RE: [PATCH 1/2] dt-bindings: pwm: renesas,pwm-rcar: Add r8a774e1 support

2020-08-27 Thread Yoshihiro Shimoda
Hi Lad-san,

> From: Lad Prabhakar, Sent: Tuesday, August 25, 2020 7:45 PM
> 
> From: Marian-Cristian Rotariu 
> 
> Document RZ/G2H (R8A774E1) SoC bindings.
> 
> No driver change is needed due to the fallback compatible value
> "renesas,pwm-rcar".
> 
> Signed-off-by: Marian-Cristian Rotariu 
> 
> Signed-off-by: Lad Prabhakar 
> ---

Thank you for the patch!

Reviewed-by: Yoshihiro Shimoda 

Best regards,
Yoshihiro Shimoda

>  Documentation/devicetree/bindings/pwm/renesas,pwm-rcar.yaml | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/Documentation/devicetree/bindings/pwm/renesas,pwm-rcar.yaml
> b/Documentation/devicetree/bindings/pwm/renesas,pwm-rcar.yaml
> index daadde9ff9c4..5407c11e92a4 100644
> --- a/Documentation/devicetree/bindings/pwm/renesas,pwm-rcar.yaml
> +++ b/Documentation/devicetree/bindings/pwm/renesas,pwm-rcar.yaml
> @@ -20,6 +20,7 @@ properties:
>- renesas,pwm-r8a774a1  # RZ/G2M
>- renesas,pwm-r8a774b1  # RZ/G2N
>- renesas,pwm-r8a774c0  # RZ/G2E
> +  - renesas,pwm-r8a774e1  # RZ/G2H
>- renesas,pwm-r8a7778   # R-Car M1A
>- renesas,pwm-r8a7779   # R-Car H1
>- renesas,pwm-r8a7790   # R-Car H2
> --
> 2.17.1



Re: [Cocci] [PATCH] usb: atm: don't use snprintf() for sysfs attrs

2020-08-27 Thread Joe Perches
On Thu, 2020-08-27 at 15:45 -0700, Joe Perches wrote:
> On Thu, 2020-08-27 at 15:20 -0700, Kees Cook wrote:
> > On Fri, Aug 28, 2020 at 12:01:34AM +0300, Denis Efremov wrote:
> > > Just FYI, I've sent an addition to the device_attr_show.cocci script[1]
> > > to turn simple cases of snprintf (e.g. "%i") to sprintf. Looks like many
> > > developers would like it more than changing snprintf to scnprintf. As
> > > for me, I don't like the idea of automated altering of the original
> > > logic from bounded snprintf to unbounded one with sprintf.
> > 
> > Agreed. This just makes me cringe. If the API design declares that when
> > a show() callback starts, buf has been allocated with PAGE_SIZE bytes,
> > then that's how the logic should proceed, and it should be using
> > scnprintf...
> > 
> > show(...) {
> > size_t remaining = PAGE_SIZE;
> > 
> > ...
> > remaining -= scnprintf(buf, remaining, "fmt", var args ...);
> > remaining -= scnprintf(buf, remaining, "fmt", var args ...);
> > remaining -= scnprintf(buf, remaining, "fmt", var args ...);
> > 
> > return PAGE_SIZE - remaining;
> > }
> 
> It seems likely that coccinelle could do those transform
> with any of sprintf/snprintf/scnprintf too.
> 
> Though my bikeshed would use a single function and have
> that function know the maximum output size

Perhaps something like the below with a sample conversion
that uses single and multiple sysfs_emit uses.

I believe coccinelle can _mostly_ automate this.
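
For readers who want to try the idea outside the kernel, here is a user-space sketch of a position-aware emit helper. It is an illustration under assumptions, not the patch itself: emit_at() is a hypothetical name, BUF_SIZE stands in for PAGE_SIZE, and vsnprintf() plus a clamp stands in for vscnprintf().

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

#define BUF_SIZE 4096	/* assumption: stand-in for PAGE_SIZE */

/*
 * Format at 'pos' inside the BUF_SIZE buffer starting at 'buf'.
 * Returns the number of characters actually stored (excluding the
 * NUL), or 0 when 'pos' does not point inside the buffer.
 */
static int emit_at(char *buf, char *pos, const char *fmt, ...)
{
	va_list args;
	size_t room;
	int len;

	assert(buf != NULL && pos != NULL);

	if (pos < buf || (size_t)(pos - buf) >= BUF_SIZE)
		return 0;

	room = BUF_SIZE - (size_t)(pos - buf);

	va_start(args, fmt);
	len = vsnprintf(pos, room, fmt, args);
	va_end(args);

	if (len < 0)
		return 0;
	if ((size_t)len >= room)
		len = (int)(room - 1);	/* clamp to what was stored */

	return len;
}
```

A caller keeps a running position and does pos += emit_at(buf, pos, ...) for each piece, then returns pos - buf as the total length, so no call can write past the end of the buffer.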

---
 fs/sysfs/file.c   | 30 ++
 include/linux/sysfs.h |  8 
 kernel/power/main.c   | 49 ++---
 3 files changed, 64 insertions(+), 23 deletions(-)

diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index eb6897ab78e7..c0ff3ba8e373 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -707,3 +707,33 @@ int sysfs_change_owner(struct kobject *kobj, kuid_t kuid, kgid_t kgid)
return 0;
 }
 EXPORT_SYMBOL_GPL(sysfs_change_owner);
+
+/**
+ * sysfs_emit - scnprintf equivalent, aware of PAGE_SIZE buffer.
+ * @buf:   start of PAGE_SIZE buffer.
+ * @pos:   current position in buffer
+ *  (pos - buf) must always be < PAGE_SIZE
+ * @fmt:   format
+ * @...:   arguments to format
+ *
+ *
+ * Returns number of characters written at pos.
+ */
+int sysfs_emit(char *buf, char *pos, const char *fmt, ...)
+{
+   int len;
+   va_list args;
+
+   WARN(pos < buf, "pos < buf\n");
+   WARN(pos - buf >= PAGE_SIZE, "pos >= PAGE_SIZE (%tu > %lu)\n",
+pos - buf, PAGE_SIZE);
+   if (pos < buf || pos - buf >= PAGE_SIZE)
+   return 0;
+
+   va_start(args, fmt);
+   len = vscnprintf(pos, PAGE_SIZE - (pos - buf), fmt, args);
+   va_end(args);
+
+   return len;
+}
+EXPORT_SYMBOL_GPL(sysfs_emit);
diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h
index 34e84122f635..5a21d3d30016 100644
--- a/include/linux/sysfs.h
+++ b/include/linux/sysfs.h
@@ -329,6 +329,8 @@ int sysfs_groups_change_owner(struct kobject *kobj,
 int sysfs_group_change_owner(struct kobject *kobj,
 const struct attribute_group *groups, kuid_t kuid,
 kgid_t kgid);
+__printf(3, 4)
+int sysfs_emit(char *buf, char *pos, const char *fmt, ...);
 
 #else /* CONFIG_SYSFS */
 
@@ -576,6 +578,12 @@ static inline int sysfs_group_change_owner(struct kobject *kobj,
return 0;
 }
 
+__printf(3, 4)
+static inline int sysfs_emit(char *buf, char *pos, const char *fmt, ...)
+{
+   return 0;
+}
+
 #endif /* CONFIG_SYSFS */
 
 static inline int __must_check sysfs_create_file(struct kobject *kobj,
diff --git a/kernel/power/main.c b/kernel/power/main.c
index 40f86ec4ab30..f3fb9f255234 100644
--- a/kernel/power/main.c
+++ b/kernel/power/main.c
@@ -100,7 +100,7 @@ int pm_async_enabled = 1;
 static ssize_t pm_async_show(struct kobject *kobj, struct kobj_attribute *attr,
 char *buf)
 {
-   return sprintf(buf, "%d\n", pm_async_enabled);
+   return sysfs_emit(buf, buf, "%d\n", pm_async_enabled);
 }
 
 static ssize_t pm_async_store(struct kobject *kobj, struct kobj_attribute *attr,
@@ -124,7 +124,7 @@ power_attr(pm_async);
 static ssize_t mem_sleep_show(struct kobject *kobj, struct kobj_attribute *attr,
  char *buf)
 {
-   char *s = buf;
+   char *pos = buf;
suspend_state_t i;
 
for (i = PM_SUSPEND_MIN; i < PM_SUSPEND_MAX; i++)
@@ -132,16 +132,16 @@ static ssize_t mem_sleep_show(struct kobject *kobj, struct kobj_attribute *attr,
const char *label = mem_sleep_states[i];
 
if (mem_sleep_current == i)
-   s += sprintf(s, "[%s] ", label);
+   pos += sysfs_emit(buf, pos, "[%s] ", label);
else
-   s += sprintf(s, "%s ", label);
+ 

Re: [RFC][PATCH 6/7] freelist: Lock less freelist

2020-08-27 Thread Lai Jiangshan
On Fri, Aug 28, 2020 at 12:23 AM Peter Zijlstra  wrote:

> +static inline void __freelist_add(struct freelist_node *node, struct freelist_head *list)
> +{
> +   /*
> +* Since the refcount is zero, and nobody can increase it once it's
> +* zero (except us, and we run only one copy of this method per node at
> +* a time, i.e. the single thread case), then we know we can safely


> +
> +   /*
> +* OK, the head must have changed on us, but we still need to decrement
> +* the refcount we increased.
> +*/
> +   refs = atomic_fetch_add(-1, &prev->refs);
> +   if (refs == REFS_ON_FREELIST + 1)
> +   __freelist_add(prev, list);

I'm curious whether it is correct to just set the prev->refs to zero and return
@prev? So that it can remove an unneeded "add()()" pair (although in
an unlikely branch) and __freelist_add() can be folded into freelist_add()
for tidier code.

Thanks
Lai.


[PATCH] f2fs: make fibmap consistent with fiemap for compression chunk

2020-08-27 Thread Daeho Jeong
From: Daeho Jeong 

Currently fibmap returns a zero address for a compression chunk. This
is inconsistent with the output of fiemap, which returns the real
physical block address related to the compression chunk. Therefore
I suggest that fibmap return the same output as fiemap.

Signed-off-by: Daeho Jeong 
---
 fs/f2fs/data.c | 33 -
 1 file changed, 33 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index c1b676be67b9..8c26c5d0c778 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -3708,36 +3708,6 @@ static int f2fs_set_data_page_dirty(struct page *page)
return 0;
 }
 
-
-static sector_t f2fs_bmap_compress(struct inode *inode, sector_t block)
-{
-#ifdef CONFIG_F2FS_FS_COMPRESSION
-   struct dnode_of_data dn;
-   sector_t start_idx, blknr = 0;
-   int ret;
-
-   start_idx = round_down(block, F2FS_I(inode)->i_cluster_size);
-
-   set_new_dnode(&dn, inode, NULL, NULL, 0);
-   ret = f2fs_get_dnode_of_data(&dn, start_idx, LOOKUP_NODE);
-   if (ret)
-   return 0;
-
-   if (dn.data_blkaddr != COMPRESS_ADDR) {
-   dn.ofs_in_node += block - start_idx;
-   blknr = f2fs_data_blkaddr(&dn);
-   if (!__is_valid_data_blkaddr(blknr))
-   blknr = 0;
-   }
-
-   f2fs_put_dnode(&dn);
-   return blknr;
-#else
-   return 0;
-#endif
-}
-
-
 static sector_t f2fs_bmap(struct address_space *mapping, sector_t block)
 {
struct inode *inode = mapping->host;
@@ -3753,9 +3723,6 @@ static sector_t f2fs_bmap(struct address_space *mapping, sector_t block)
if (mapping_tagged(mapping, PAGECACHE_TAG_DIRTY))
filemap_write_and_wait(mapping);
 
-   if (f2fs_compressed_file(inode))
-   blknr = f2fs_bmap_compress(inode, block);
-
if (!get_data_block_bmap(inode, block, &tmp, 0))
blknr = tmp.b_blocknr;
 out:
-- 
2.28.0.402.g5ffc5be6b7-goog



Re: [Patch v2 0/4] tracing: trivial cleanup

2020-08-27 Thread Wei Yang
Steven,

Would you like to pick this up?

On Sun, Jul 12, 2020 at 09:10:32AM +0800, Wei Yang wrote:
>Some trivial cleanup for tracing.
>
>v2:
>  * drop patch 1
>  * merge patch 4 & 5
>  * introduce a new patch change the return value of tracing_init_dentry()
>
>Wei Yang (4):
>  tracing: simplify the logic by defining next to be "lasst + 1"
>  tracing: save one trace_event->type by using __TRACE_LAST_TYPE
>  tracing: toplevel d_entry already initialized
>  tracing: make tracing_init_dentry() returns an integer instead of a
>d_entry pointer
>
> kernel/trace/trace.c | 36 ++--
> kernel/trace/trace.h |  2 +-
> kernel/trace/trace_dynevent.c|  8 +++
> kernel/trace/trace_events.c  |  9 ++-
> kernel/trace/trace_events_synth.c|  9 +++
> kernel/trace/trace_functions_graph.c |  8 +++
> kernel/trace/trace_hwlat.c   |  8 +++
> kernel/trace/trace_kprobe.c  | 10 
> kernel/trace/trace_output.c  | 14 +--
> kernel/trace/trace_printk.c  |  8 +++
> kernel/trace/trace_stack.c   | 12 +-
> kernel/trace/trace_stat.c|  8 +++
> kernel/trace/trace_uprobe.c  |  9 ---
> 13 files changed, 66 insertions(+), 75 deletions(-)
>
>-- 
>2.20.1 (Apple Git-117)

-- 
Wei Yang
Help you, Help me


Re: [PATCH bpf-next v1 8/8] bpf/selftests: Test for bpf_per_cpu_ptr()

2020-08-27 Thread Hao Luo
Thanks for taking a look!

On Fri, Aug 21, 2020 at 8:30 PM Andrii Nakryiko
 wrote:
>
> On Wed, Aug 19, 2020 at 3:42 PM Hao Luo  wrote:
> >
> > Test bpf_per_cpu_ptr(). Test two paths in the kernel. If the base
> > pointer points to a struct, the returned reg is of type PTR_TO_BTF_ID.
> > Direct pointer dereference can be applied on the returned variable.
> > If the base pointer isn't a struct, the returned reg is of type
> > PTR_TO_MEM, which also supports direct pointer dereference.
> >
> > Signed-off-by: Hao Luo 
> > ---
>
> Acked-by: Andrii Nakryiko 
>
[...]
> >
> >  __u64 out__runqueues = -1;
> >  __u64 out__bpf_prog_active = -1;
> > +__u32 out__rq_cpu = -1;
> > +unsigned long out__process_counts = -1;
>
> try to not use long for variables, it is 32-bit integer in user-space
> but always 64-bit in BPF. This causes problems when using skeleton on
> 32-bit architecture.
>

Ack. I will use another variable of type 'int' instead.

> >
> > -extern const struct rq runqueues __ksym; /* struct type global var. */
> > +extern const struct rq runqueues __ksym; /* struct type percpu var. */
> >  extern const int bpf_prog_active __ksym; /* int type global var. */
> > +extern const unsigned long process_counts __ksym; /* int type percpu var. */
> >
> >  SEC("raw_tp/sys_enter")
> >  int handler(const void *ctx)
> >  {
> > +   struct rq *rq;
> > +   unsigned long *count;
> > +
> > out__runqueues = (__u64)&runqueues;
> > out__bpf_prog_active = (__u64)&bpf_prog_active;
> >
> > +   rq = (struct rq *)bpf_per_cpu_ptr(&runqueues, 1);
> > +   if (rq)
> > +   out__rq_cpu = rq->cpu;
>
> this is awesome!
>
> Are there any per-cpu variables that are arrays? Would be nice to test
> those too.
>
>

There are per-cpu arrays, but they are not common. There is a
'pmc_prev_left' in arch/x86; I can add that to this test.

[...]


[PATCH net-next v2 3/3] hinic: add support to query function table

2020-08-27 Thread Luo bin
add debugfs node for querying function table, for example:
cat /sys/kernel/debug/hinic/:15:00.0/func_table/valid

Signed-off-by: Luo bin 
---
V0~V1:
- remove command interfaces to the read only files
- split addition of each object into a separate patch

V1~V2:
- remove vlan_id and vlan_mode from the func_table_fields

 .../net/ethernet/huawei/hinic/hinic_debugfs.c | 92 ++-
 .../net/ethernet/huawei/hinic/hinic_debugfs.h | 79 
 drivers/net/ethernet/huawei/hinic/hinic_dev.h |  3 +
 .../net/ethernet/huawei/hinic/hinic_hw_dev.h  |  2 +
 .../net/ethernet/huawei/hinic/hinic_main.c| 15 +++
 5 files changed, 190 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/huawei/hinic/hinic_debugfs.c 
b/drivers/net/ethernet/huawei/hinic/hinic_debugfs.c
index d10d0a6d9f13..19eb839177ec 100644
--- a/drivers/net/ethernet/huawei/hinic/hinic_debugfs.c
+++ b/drivers/net/ethernet/huawei/hinic/hinic_debugfs.c
@@ -70,6 +70,63 @@ static u64 hinic_dbg_get_rq_info(struct hinic_dev *nic_dev, 
struct hinic_rq *rq,
return 0;
 }
 
+enum func_tbl_info {
+   VALID,
+   RX_MODE,
+   MTU,
+   RQ_DEPTH,
+   QUEUE_NUM,
+};
+
+static char *func_table_fields[] = {"valid", "rx_mode", "mtu", "rq_depth", 
"cfg_q_num"};
+
+static int hinic_dbg_get_func_table(struct hinic_dev *nic_dev, int idx)
+{
+   struct tag_sml_funcfg_tbl *funcfg_table_elem;
+   struct hinic_cmd_lt_rd *read_data;
+   u16 out_size = sizeof(*read_data);
+   int err;
+
+   read_data = kzalloc(sizeof(*read_data), GFP_KERNEL);
+   if (!read_data)
+   return ~0;
+
+   read_data->node = TBL_ID_FUNC_CFG_SM_NODE;
+   read_data->inst = TBL_ID_FUNC_CFG_SM_INST;
+   read_data->entry_size = HINIC_FUNCTION_CONFIGURE_TABLE_SIZE;
+   read_data->lt_index = HINIC_HWIF_FUNC_IDX(nic_dev->hwdev->hwif);
+   read_data->len = HINIC_FUNCTION_CONFIGURE_TABLE_SIZE;
+
+   err = hinic_port_msg_cmd(nic_dev->hwdev, HINIC_PORT_CMD_RD_LINE_TBL, read_data,
+sizeof(*read_data), read_data, &out_size);
+   if (err || out_size != sizeof(*read_data) || read_data->status) {
+   netif_err(nic_dev, drv, nic_dev->netdev,
+ "Failed to get func table, err: %d, status: 0x%x, out 
size: 0x%x\n",
+ err, read_data->status, out_size);
+   kfree(read_data);
+   return ~0;
+   }
+
+   funcfg_table_elem = (struct tag_sml_funcfg_tbl *)read_data->data;
+
+   switch (idx) {
+   case VALID:
+   return funcfg_table_elem->dw0.bs.valid;
+   case RX_MODE:
+   return funcfg_table_elem->dw0.bs.nic_rx_mode;
+   case MTU:
+   return funcfg_table_elem->dw1.bs.mtu;
+   case RQ_DEPTH:
+   return funcfg_table_elem->dw13.bs.cfg_rq_depth;
+   case QUEUE_NUM:
+   return funcfg_table_elem->dw13.bs.cfg_q_num;
+   }
+
+   kfree(read_data);
+
+   return ~0;
+}
+
 static ssize_t hinic_dbg_cmd_read(struct file *filp, char __user *buffer, 
size_t count,
  loff_t *ppos)
 {
@@ -91,6 +148,10 @@ static ssize_t hinic_dbg_cmd_read(struct file *filp, char 
__user *buffer, size_t
out = hinic_dbg_get_rq_info(dbg->dev, dbg->object, *desc);
break;
 
+   case HINIC_DBG_FUNC_TABLE:
+   out = hinic_dbg_get_func_table(dbg->dev, *desc);
+   break;
+
default:
netif_warn(dbg->dev, drv, dbg->dev->netdev, "Invalid hinic 
debug cmd: %d\n",
   dbg->type);
@@ -136,7 +197,9 @@ static int create_dbg_files(struct hinic_dev *dev, enum 
hinic_dbg_type type, voi
 
 static void rem_dbg_files(struct hinic_debug_priv *dbg)
 {
-   debugfs_remove_recursive(dbg->root);
+   if (dbg->type != HINIC_DBG_FUNC_TABLE)
+   debugfs_remove_recursive(dbg->root);
+
kfree(dbg);
 }
 
@@ -184,6 +247,21 @@ void hinic_rq_debug_rem(struct hinic_rq *rq)
rem_dbg_files(rq->dbg);
 }
 
+int hinic_func_table_debug_add(struct hinic_dev *dev)
+{
+   if (HINIC_IS_VF(dev->hwdev->hwif))
+   return 0;
+
+   return create_dbg_files(dev, HINIC_DBG_FUNC_TABLE, dev, dev->func_tbl_dbgfs, &dev->dbg,
+   func_table_fields, ARRAY_SIZE(func_table_fields));
+}
+
+void hinic_func_table_debug_rem(struct hinic_dev *dev)
+{
+   if (!HINIC_IS_VF(dev->hwdev->hwif) && dev->dbg)
+   rem_dbg_files(dev->dbg);
+}
+
 void hinic_sq_dbgfs_init(struct hinic_dev *nic_dev)
 {
nic_dev->sq_dbgfs = debugfs_create_dir("SQs", nic_dev->dbgfs_root);
@@ -204,6 +282,18 @@ void hinic_rq_dbgfs_uninit(struct hinic_dev *nic_dev)
debugfs_remove_recursive(nic_dev->rq_dbgfs);
 }
 
+void hinic_func_tbl_dbgfs_init(struct hinic_dev *nic_dev)
+{
+   if (!HINIC_IS_VF(nic_dev->hwdev->hwif))
+   nic_dev->func_tbl_dbgfs = 
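
In hinic_dbg_get_func_table() as posted above, each case of the switch returns directly, so the trailing kfree(read_data) is reached only for an unknown idx. A minimal userspace sketch of a single-exit variant follows, assuming calloc()/free() as stand-ins for kzalloc()/kfree(); struct table_reply, enum field and the helper names are hypothetical and not part of the driver.

```c
#include <stdlib.h>

/* Counts live buffers so the example can show that none is leaked. */
static int live_buffers;

struct table_reply {            /* hypothetical stand-in for hinic_cmd_lt_rd */
	int status;
	unsigned int valid;
	unsigned int mtu;
};

enum field { FIELD_VALID, FIELD_MTU };

static struct table_reply *reply_alloc(void)
{
	struct table_reply *r = calloc(1, sizeof(*r));

	if (r)
		live_buffers++;
	return r;
}

static void reply_free(struct table_reply *r)
{
	if (r)
		live_buffers--;
	free(r);
}

/* Returns the requested field, or ~0 on failure.
 * The buffer is freed on every path because all cases fall to one exit. */
static int query_field(int idx)
{
	struct table_reply *r = reply_alloc();
	int ret = ~0;

	if (!r)
		return ret;

	/* Pretend the device filled in the reply. */
	r->valid = 1;
	r->mtu = 1500;

	switch (idx) {
	case FIELD_VALID:
		ret = r->valid;
		break;
	case FIELD_MTU:
		ret = r->mtu;
		break;
	}

	reply_free(r);
	return ret;
}
```

Storing the result in a local and breaking out of the switch keeps the cleanup in one place, at the cost of one extra variable.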

[PATCH net-next v2 2/3] hinic: add support to query rq info

2020-08-27 Thread Luo bin
add debugfs node for querying rq info, for example:
cat /sys/kernel/debug/hinic/:15:00.0/RQs/0x0/rq_hw_pi

Signed-off-by: Luo bin 
---
V0~V1:
- remove command interfaces to the read only files
- split addition of each object into a separate patch

 .../net/ethernet/huawei/hinic/hinic_debugfs.c | 66 +++
 .../net/ethernet/huawei/hinic/hinic_debugfs.h |  8 +++
 drivers/net/ethernet/huawei/hinic/hinic_dev.h |  2 +
 .../net/ethernet/huawei/hinic/hinic_hw_io.c   |  1 +
 .../net/ethernet/huawei/hinic/hinic_hw_qp.h   |  3 +
 .../net/ethernet/huawei/hinic/hinic_main.c| 23 ++-
 6 files changed, 101 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/huawei/hinic/hinic_debugfs.c 
b/drivers/net/ethernet/huawei/hinic/hinic_debugfs.c
index 2a1050cb400e..d10d0a6d9f13 100644
--- a/drivers/net/ethernet/huawei/hinic/hinic_debugfs.c
+++ b/drivers/net/ethernet/huawei/hinic/hinic_debugfs.c
@@ -40,6 +40,36 @@ static u64 hinic_dbg_get_sq_info(struct hinic_dev *nic_dev, 
struct hinic_sq *sq,
return 0;
 }
 
+enum rq_dbg_info {
+   GLB_RQ_ID,
+   RQ_HW_PI,
+   RQ_SW_CI,
+   RQ_SW_PI,
+   RQ_MSIX_ENTRY,
+};
+
+static char *rq_fields[] = {"glb_rq_id", "rq_hw_pi", "rq_sw_ci", "rq_sw_pi", 
"rq_msix_entry"};
+
+static u64 hinic_dbg_get_rq_info(struct hinic_dev *nic_dev, struct hinic_rq 
*rq, int idx)
+{
+   struct hinic_wq *wq = rq->wq;
+
+   switch (idx) {
+   case GLB_RQ_ID:
+   return nic_dev->hwdev->func_to_io.global_qpn + rq->qid;
+   case RQ_HW_PI:
+   return be16_to_cpu(*(__be16 *)(rq->pi_virt_addr)) & wq->mask;
+   case RQ_SW_CI:
+   return atomic_read(&wq->cons_idx) & wq->mask;
+   case RQ_SW_PI:
+   return atomic_read(&wq->prod_idx) & wq->mask;
+   case RQ_MSIX_ENTRY:
+   return rq->msix_entry;
+   }
+
+   return 0;
+}
+
 static ssize_t hinic_dbg_cmd_read(struct file *filp, char __user *buffer, 
size_t count,
  loff_t *ppos)
 {
@@ -57,6 +87,10 @@ static ssize_t hinic_dbg_cmd_read(struct file *filp, char 
__user *buffer, size_t
out = hinic_dbg_get_sq_info(dbg->dev, dbg->object, *desc);
break;
 
+   case HINIC_DBG_RQ_INFO:
+   out = hinic_dbg_get_rq_info(dbg->dev, dbg->object, *desc);
+   break;
+
default:
netif_warn(dbg->dev, drv, dbg->dev->netdev, "Invalid hinic 
debug cmd: %d\n",
   dbg->type);
@@ -128,6 +162,28 @@ void hinic_sq_debug_rem(struct hinic_sq *sq)
rem_dbg_files(sq->dbg);
 }
 
+int hinic_rq_debug_add(struct hinic_dev *dev, u16 rq_id)
+{
+   struct hinic_rq *rq;
+   struct dentry *root;
+   char sub_dir[16];
+
+   rq = dev->rxqs[rq_id].rq;
+
+   sprintf(sub_dir, "0x%x", rq_id);
+
+   root = debugfs_create_dir(sub_dir, dev->rq_dbgfs);
+
+   return create_dbg_files(dev, HINIC_DBG_RQ_INFO, rq, root, &rq->dbg, rq_fields,
+   ARRAY_SIZE(rq_fields));
+}
+
+void hinic_rq_debug_rem(struct hinic_rq *rq)
+{
+   if (rq->dbg)
+   rem_dbg_files(rq->dbg);
+}
+
 void hinic_sq_dbgfs_init(struct hinic_dev *nic_dev)
 {
nic_dev->sq_dbgfs = debugfs_create_dir("SQs", nic_dev->dbgfs_root);
@@ -138,6 +194,16 @@ void hinic_sq_dbgfs_uninit(struct hinic_dev *nic_dev)
debugfs_remove_recursive(nic_dev->sq_dbgfs);
 }
 
+void hinic_rq_dbgfs_init(struct hinic_dev *nic_dev)
+{
+   nic_dev->rq_dbgfs = debugfs_create_dir("RQs", nic_dev->dbgfs_root);
+}
+
+void hinic_rq_dbgfs_uninit(struct hinic_dev *nic_dev)
+{
+   debugfs_remove_recursive(nic_dev->rq_dbgfs);
+}
+
 void hinic_dbg_init(struct hinic_dev *nic_dev)
 {
nic_dev->dbgfs_root = 
debugfs_create_dir(pci_name(nic_dev->hwdev->hwif->pdev),
diff --git a/drivers/net/ethernet/huawei/hinic/hinic_debugfs.h 
b/drivers/net/ethernet/huawei/hinic/hinic_debugfs.h
index 45fb3b40f487..186ca4a26919 100644
--- a/drivers/net/ethernet/huawei/hinic/hinic_debugfs.h
+++ b/drivers/net/ethernet/huawei/hinic/hinic_debugfs.h
@@ -12,10 +12,18 @@ int hinic_sq_debug_add(struct hinic_dev *dev, u16 sq_id);
 
 void hinic_sq_debug_rem(struct hinic_sq *sq);
 
+int hinic_rq_debug_add(struct hinic_dev *dev, u16 rq_id);
+
+void hinic_rq_debug_rem(struct hinic_rq *rq);
+
 void hinic_sq_dbgfs_init(struct hinic_dev *nic_dev);
 
 void hinic_sq_dbgfs_uninit(struct hinic_dev *nic_dev);
 
+void hinic_rq_dbgfs_init(struct hinic_dev *nic_dev);
+
+void hinic_rq_dbgfs_uninit(struct hinic_dev *nic_dev);
+
 void hinic_dbg_init(struct hinic_dev *nic_dev);
 
 void hinic_dbg_uninit(struct hinic_dev *nic_dev);
diff --git a/drivers/net/ethernet/huawei/hinic/hinic_dev.h 
b/drivers/net/ethernet/huawei/hinic/hinic_dev.h
index 95d9548014ac..0876a699d205 100644
--- a/drivers/net/ethernet/huawei/hinic/hinic_dev.h
+++ b/drivers/net/ethernet/huawei/hinic/hinic_dev.h
@@ -60,6 +60,7 @@ struct hinic_intr_coal_info {
 
 enum hinic_dbg_type {

[PATCH net-next v2 1/3] hinic: add support to query sq info

2020-08-27 Thread Luo bin
add debugfs node for querying sq info, for example:
cat /sys/kernel/debug/hinic/:15:00.0/SQs/0x0/sq_pi

Signed-off-by: Luo bin 
---
V0~V1:
- remove command interfaces to the read only files
- split addition of each object into a separate patch

 drivers/net/ethernet/huawei/hinic/Makefile|   3 +-
 .../net/ethernet/huawei/hinic/hinic_debugfs.c | 162 ++
 .../net/ethernet/huawei/hinic/hinic_debugfs.h |  27 +++
 drivers/net/ethernet/huawei/hinic/hinic_dev.h |  15 ++
 .../net/ethernet/huawei/hinic/hinic_hw_dev.c  |   1 +
 .../net/ethernet/huawei/hinic/hinic_hw_io.c   |   1 +
 .../net/ethernet/huawei/hinic/hinic_hw_io.h   |   1 +
 .../net/ethernet/huawei/hinic/hinic_hw_qp.h   |   3 +
 .../net/ethernet/huawei/hinic/hinic_main.c|  45 -
 9 files changed, 254 insertions(+), 4 deletions(-)
 create mode 100644 drivers/net/ethernet/huawei/hinic/hinic_debugfs.c
 create mode 100644 drivers/net/ethernet/huawei/hinic/hinic_debugfs.h

diff --git a/drivers/net/ethernet/huawei/hinic/Makefile 
b/drivers/net/ethernet/huawei/hinic/Makefile
index 67b59d0ba769..2f89119c9b69 100644
--- a/drivers/net/ethernet/huawei/hinic/Makefile
+++ b/drivers/net/ethernet/huawei/hinic/Makefile
@@ -4,4 +4,5 @@ obj-$(CONFIG_HINIC) += hinic.o
 hinic-y := hinic_main.o hinic_tx.o hinic_rx.o hinic_port.o hinic_hw_dev.o \
   hinic_hw_io.o hinic_hw_qp.o hinic_hw_cmdq.o hinic_hw_wq.o \
   hinic_hw_mgmt.o hinic_hw_api_cmd.o hinic_hw_eqs.o hinic_hw_if.o \
-  hinic_common.o hinic_ethtool.o hinic_devlink.o hinic_hw_mbox.o 
hinic_sriov.o
+  hinic_common.o hinic_ethtool.o hinic_devlink.o hinic_hw_mbox.o \
+  hinic_sriov.o hinic_debugfs.o
diff --git a/drivers/net/ethernet/huawei/hinic/hinic_debugfs.c 
b/drivers/net/ethernet/huawei/hinic/hinic_debugfs.c
new file mode 100644
index ..2a1050cb400e
--- /dev/null
+++ b/drivers/net/ethernet/huawei/hinic/hinic_debugfs.c
@@ -0,0 +1,162 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Huawei HiNIC PCI Express Linux driver
+ * Copyright(c) 2017 Huawei Technologies Co., Ltd
+ */
+
+#include 
+#include 
+
+#include "hinic_debugfs.h"
+
+static struct dentry *hinic_dbgfs_root;
+
+enum sq_dbg_info {
+   GLB_SQ_ID,
+   SQ_PI,
+   SQ_CI,
+   SQ_FI,
+   SQ_MSIX_ENTRY,
+};
+
+static char *sq_fields[] = {"glb_sq_id", "sq_pi", "sq_ci", "sq_fi", 
"sq_msix_entry"};
+
+static u64 hinic_dbg_get_sq_info(struct hinic_dev *nic_dev, struct hinic_sq 
*sq, int idx)
+{
+   struct hinic_wq *wq = sq->wq;
+
+   switch (idx) {
+   case GLB_SQ_ID:
+   return nic_dev->hwdev->func_to_io.global_qpn + sq->qid;
+   case SQ_PI:
+   return atomic_read(&wq->prod_idx) & wq->mask;
+   case SQ_CI:
+   return atomic_read(&wq->cons_idx) & wq->mask;
+   case SQ_FI:
+   return be16_to_cpu(*(__be16 *)(sq->hw_ci_addr)) & wq->mask;
+   case SQ_MSIX_ENTRY:
+   return sq->msix_entry;
+   }
+
+   return 0;
+}
+
+static ssize_t hinic_dbg_cmd_read(struct file *filp, char __user *buffer, 
size_t count,
+ loff_t *ppos)
+{
+   struct hinic_debug_priv *dbg;
+   char ret_buf[20];
+   int *desc;
+   u64 out;
+   int ret;
+
+   desc = filp->private_data;
+   dbg = container_of(desc, struct hinic_debug_priv, field_id[*desc]);
+
+   switch (dbg->type) {
+   case HINIC_DBG_SQ_INFO:
+   out = hinic_dbg_get_sq_info(dbg->dev, dbg->object, *desc);
+   break;
+
+   default:
+   netif_warn(dbg->dev, drv, dbg->dev->netdev, "Invalid hinic 
debug cmd: %d\n",
+  dbg->type);
+   return -EINVAL;
+   }
+
+   ret = snprintf(ret_buf, sizeof(ret_buf), "0x%llx\n", out);
+
+   return simple_read_from_buffer(buffer, count, ppos, ret_buf, ret);
+}
+
+static const struct file_operations hinic_dbg_cmd_fops = {
+   .owner = THIS_MODULE,
+   .open  = simple_open,
+   .read  = hinic_dbg_cmd_read,
+};
+
+static int create_dbg_files(struct hinic_dev *dev, enum hinic_dbg_type type, 
void *data,
+   struct dentry *root, struct hinic_debug_priv **dbg, 
char **field,
+   int nfile)
+{
+   struct hinic_debug_priv *tmp;
+   int i;
+
+   tmp = kzalloc(sizeof(*tmp), GFP_KERNEL);
+   if (!tmp)
+   return -ENOMEM;
+
+   tmp->dev = dev;
+   tmp->object = data;
+   tmp->type = type;
+   tmp->root = root;
+
+   for (i = 0; i < nfile; i++) {
+   tmp->field_id[i] = i;
+   debugfs_create_file(field[i], 0400, root, &tmp->field_id[i], &hinic_dbg_cmd_fops);
+   }
+
+   *dbg = tmp;
+
+   return 0;
+}
+
+static void rem_dbg_files(struct hinic_debug_priv *dbg)
+{
+   debugfs_remove_recursive(dbg->root);
+   kfree(dbg);
+}
+
+int hinic_sq_debug_add(struct hinic_dev *dev, u16 sq_id)
+{
+   struct hinic_sq *sq;
+   struct dentry *root;
+   

[PATCH net-next v2 0/3] hinic: add debugfs support

2020-08-27 Thread Luo bin
add debugfs node for querying sq/rq info and function table

Luo bin (3):
  hinic: add support to query sq info
  hinic: add support to query rq info
  hinic: add support to query function table

 drivers/net/ethernet/huawei/hinic/Makefile|   3 +-
 .../net/ethernet/huawei/hinic/hinic_debugfs.c | 318 ++
 .../net/ethernet/huawei/hinic/hinic_debugfs.h | 114 +++
 drivers/net/ethernet/huawei/hinic/hinic_dev.h |  20 ++
 .../net/ethernet/huawei/hinic/hinic_hw_dev.c  |   1 +
 .../net/ethernet/huawei/hinic/hinic_hw_dev.h  |   2 +
 .../net/ethernet/huawei/hinic/hinic_hw_io.c   |   2 +
 .../net/ethernet/huawei/hinic/hinic_hw_io.h   |   1 +
 .../net/ethernet/huawei/hinic/hinic_hw_qp.h   |   6 +
 .../net/ethernet/huawei/hinic/hinic_main.c|  83 -
 10 files changed, 544 insertions(+), 6 deletions(-)
 create mode 100644 drivers/net/ethernet/huawei/hinic/hinic_debugfs.c
 create mode 100644 drivers/net/ethernet/huawei/hinic/hinic_debugfs.h

-- 
2.17.1
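
The read handler in this series recovers its private structure from a pointer to a single element of field_id[], using the value stored in that element as the array index. A minimal userspace sketch of the same idea follows, assuming offsetof() arithmetic in place of the kernel's container_of(); struct debug_priv, NFIELDS and the helper names are hypothetical.

```c
#include <stddef.h>

#define NFIELDS 5

/* Hypothetical, cut-down version of the private struct. */
struct debug_priv {
	int type;
	int field_id[NFIELDS];  /* invariant: field_id[i] == i */
};

static void priv_init(struct debug_priv *p, int type)
{
	p->type = type;
	for (int i = 0; i < NFIELDS; i++)
		p->field_id[i] = i;
}

/* Recover the enclosing struct from a pointer to field_id[*desc].
 * Because the element stores its own index, subtracting the offset of
 * that element lands on the start of the struct. */
static struct debug_priv *priv_from_field(int *desc)
{
	size_t off = offsetof(struct debug_priv, field_id) +
		     (size_t)*desc * sizeof(int);

	return (struct debug_priv *)((char *)desc - off);
}
```

The trick depends entirely on the invariant field_id[i] == i, so the array must never be modified after setup.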



[Patch v2 4/7] mm/hugetlb: count file_region to be added when regions_needed != NULL

2020-08-27 Thread Wei Yang
There are only two use cases for add_reservation_in_range():

* count file_region and return the number in regions_needed
* do the real list operation without counting

This means it is not necessary to have two parameters to classify these
two cases.

Just use regions_needed to separate them.

Signed-off-by: Wei Yang 
Reviewed-by: Baoquan He 
Reviewed-by: Mike Kravetz 
---
 mm/hugetlb.c | 33 +
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index cbe67428bf99..bbccbfeb8601 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -319,16 +319,17 @@ static void coalesce_file_region(struct resv_map *resv, 
struct file_region *rg)
}
 }
 
-/* Must be called with resv->lock held. Calling this with count_only == true
- * will count the number of pages to be added but will not modify the linked
- * list. If regions_needed != NULL and count_only == true, then regions_needed
- * will indicate the number of file_regions needed in the cache to carry out to
- * add the regions for this range.
+/*
+ * Must be called with resv->lock held.
+ *
+ * Calling this with regions_needed != NULL will count the number of pages
+ * to be added but will not modify the linked list. And regions_needed will
+ * indicate the number of file_regions needed in the cache to carry out to add
+ * the regions for this range.
  */
 static long add_reservation_in_range(struct resv_map *resv, long f, long t,
 struct hugetlb_cgroup *h_cg,
-struct hstate *h, long *regions_needed,
-bool count_only)
+struct hstate *h, long *regions_needed)
 {
long add = 0;
struct list_head *head = &resv->regions;
@@ -364,14 +365,14 @@ static long add_reservation_in_range(struct resv_map 
*resv, long f, long t,
 */
if (rg->from > last_accounted_offset) {
add += rg->from - last_accounted_offset;
-   if (!count_only) {
+   if (!regions_needed) {
nrg = get_file_region_entry_from_cache(
resv, last_accounted_offset, rg->from);
record_hugetlb_cgroup_uncharge_info(h_cg, h,
resv, nrg);
list_add(&nrg->link, rg->link.prev);
coalesce_file_region(resv, nrg);
-   } else if (regions_needed)
+   } else
*regions_needed += 1;
}
 
@@ -383,13 +384,13 @@ static long add_reservation_in_range(struct resv_map 
*resv, long f, long t,
 */
if (last_accounted_offset < t) {
add += t - last_accounted_offset;
-   if (!count_only) {
+   if (!regions_needed) {
nrg = get_file_region_entry_from_cache(
resv, last_accounted_offset, t);
record_hugetlb_cgroup_uncharge_info(h_cg, h, resv, nrg);
list_add(&nrg->link, rg->link.prev);
coalesce_file_region(resv, nrg);
-   } else if (regions_needed)
+   } else
*regions_needed += 1;
}
 
@@ -482,8 +483,8 @@ static long region_add(struct resv_map *resv, long f, long 
t,
 retry:
 
/* Count how many regions are actually needed to execute this add. */
-   add_reservation_in_range(resv, f, t, NULL, NULL, &actual_regions_needed,
-true);
+   add_reservation_in_range(resv, f, t, NULL, NULL,
+&actual_regions_needed);
 
/*
 * Check for sufficient descriptors in the cache to accommodate
@@ -511,7 +512,7 @@ static long region_add(struct resv_map *resv, long f, long 
t,
goto retry;
}
 
-   add = add_reservation_in_range(resv, f, t, h_cg, h, NULL, false);
+   add = add_reservation_in_range(resv, f, t, h_cg, h, NULL);
 
resv->adds_in_progress -= in_regions_needed;
 
@@ -547,9 +548,9 @@ static long region_chg(struct resv_map *resv, long f, long 
t,
 
spin_lock(&resv->lock);
 
-   /* Count how many hugepages in this range are NOT respresented. */
+   /* Count how many hugepages in this range are NOT represented. */
chg = add_reservation_in_range(resv, f, t, NULL, NULL,
-  out_regions_needed, true);
+  out_regions_needed);
 
if (*out_regions_needed == 0)
*out_regions_needed = 1;
-- 
2.20.1 (Apple Git-117)
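
The convention introduced by this patch, where a non-NULL output pointer selects count-only mode, can be sketched in isolation. This is a simplified userspace model under stated assumptions: the map is a fixed-size array, ranges are assumed not to overlap, and struct resv, struct range and add_range() are hypothetical names, not the kernel's.

```c
#include <stddef.h>

#define MAX_REGIONS 8

struct range { long from, to; };

/* Hypothetical, array-backed stand-in for the reservation map. */
struct resv {
	struct range r[MAX_REGIONS];
	int n;
};

/* Handles the half-open range [f, t).
 * regions_needed != NULL: only count; the map is left untouched.
 * regions_needed == NULL: really record the range (dropped if the map is full).
 * Returns the number of pages the range covers in both modes. */
static long add_range(struct resv *resv, long f, long t, long *regions_needed)
{
	long add = t - f;

	if (regions_needed) {
		*regions_needed += 1;
	} else if (resv->n < MAX_REGIONS) {
		resv->r[resv->n].from = f;
		resv->r[resv->n].to = t;
		resv->n++;
	}
	return add;
}
```

A caller first passes a counter to learn how many entries it needs, then calls again with NULL to perform the update, so no separate boolean flag is required.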



[PATCH] ubifs: setflags: Don't show error message when vfs_ioc_setflags_prepare() fails

2020-08-27 Thread Zhihao Cheng
The following steps trigger ubifs_err:
  1. useradd -m freg                                        (Under root)
  2. cd /home/freg && mkdir mp                              (Under freg)
  3. mount -t ubifs /dev/ubi0_0 /home/freg/mp               (Under root)
  4. cd /home/freg && echo 123 > mp/a                       (Under root)
  5. cd mp && chown freg a && chgrp freg a && chmod 777 a   (Under root)
  6. chattr +i a                                            (Under freg)

UBIFS error (ubi0:0 pid 1723): ubifs_ioctl [ubifs]: can't modify inode
65 attributes
chattr: Operation not permitted while setting flags on a

This is not a UBIFS problem; it is caused by the task privilege check on
file operations. Remove the error message from the kernel, as other
filesystems (e.g. ext4) do, since userspace tools already provide enough
information.

Signed-off-by: Zhihao Cheng 
---
 fs/ubifs/ioctl.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/fs/ubifs/ioctl.c b/fs/ubifs/ioctl.c
index 3df9be2c684c..4363d85a3fd4 100644
--- a/fs/ubifs/ioctl.c
+++ b/fs/ubifs/ioctl.c
@@ -134,7 +134,6 @@ static int setflags(struct inode *inode, int flags)
return err;
 
 out_unlock:
-   ubifs_err(c, "can't modify inode %lu attributes", inode->i_ino);
mutex_unlock(&ui->ui_mutex);
ubifs_release_budget(c, &req);
return err;
-- 
2.25.4



[Patch v2 5/7] mm/hugetlb: a page from buddy is not on any list

2020-08-27 Thread Wei Yang
A page allocated from buddy is not on any list, so using list_add()
is enough.

Signed-off-by: Wei Yang 
Reviewed-by: Baoquan He 
Reviewed-by: Mike Kravetz 
---
 mm/hugetlb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index bbccbfeb8601..5a71cb7acf6b 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2428,7 +2428,7 @@ struct page *alloc_huge_page(struct vm_area_struct *vma,
h->resv_huge_pages--;
}
spin_lock(&hugetlb_lock);
-   list_move(&page->lru, &h->hugepage_activelist);
+   list_add(&page->lru, &h->hugepage_activelist);
/* Fall through */
}
hugetlb_cgroup_commit_charge(idx, pages_per_huge_page(h), h_cg, page);
-- 
2.20.1 (Apple Git-117)



[Patch v2 0/7] mm/hugetlb: code refine and simplification

2020-08-27 Thread Wei Yang
The following are some cleanups for hugetlb.

A simple test with tools/testing/selftests/vm/map_hugetlb passes.

v2:
  * drop 5/6/10 since similar patches are merged or under review.
  * adjust 2 based on comment from Mike Kravetz

Wei Yang (7):
  mm/hugetlb: not necessary to coalesce regions recursively
  mm/hugetlb: remove VM_BUG_ON(!nrg) in
get_file_region_entry_from_cache()
  mm/hugetlb: use list_splice to merge two list at once
  mm/hugetlb: count file_region to be added when regions_needed != NULL
  mm/hugetlb: a page from buddy is not on any list
  mm/hugetlb: return non-isolated page in the loop instead of break and
check
  mm/hugetlb: narrow the hugetlb_lock protection area during preparing
huge page

 mm/hugetlb.c | 77 +++-
 1 file changed, 34 insertions(+), 43 deletions(-)

-- 
2.20.1 (Apple Git-117)



[Patch v2 7/7] mm/hugetlb: narrow the hugetlb_lock protection area during preparing huge page

2020-08-27 Thread Wei Yang
set_hugetlb_cgroup[_rsvd]() just manipulates page-local data, which does
not need to be protected by hugetlb_lock.

Let's move it out of the locked region.

Signed-off-by: Wei Yang 
Reviewed-by: Baoquan He 
Reviewed-by: Mike Kravetz 
---
 mm/hugetlb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 6ad365dd1e96..ae840dc09197 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1492,9 +1492,9 @@ static void prep_new_huge_page(struct hstate *h, struct 
page *page, int nid)
 {
INIT_LIST_HEAD(&page->lru);
set_compound_page_dtor(page, HUGETLB_PAGE_DTOR);
-   spin_lock(&hugetlb_lock);
set_hugetlb_cgroup(page, NULL);
set_hugetlb_cgroup_rsvd(page, NULL);
+   spin_lock(&hugetlb_lock);
h->nr_huge_pages++;
h->nr_huge_pages_node[nid]++;
spin_unlock(&hugetlb_lock);
-- 
2.20.1 (Apple Git-117)



[Patch v2 3/7] mm/hugetlb: use list_splice to merge two list at once

2020-08-27 Thread Wei Yang
Instead of adding the allocated file_regions one by one to region_cache,
we can use list_splice() to merge the two lists at once.

Also, since we know the number of entries in the list, increase the count
directly.

Signed-off-by: Wei Yang 
Reviewed-by: Baoquan He 
Reviewed-by: Mike Kravetz 
---
 mm/hugetlb.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f325839be617..cbe67428bf99 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -441,11 +441,8 @@ static int allocate_file_region_entries(struct resv_map 
*resv,
 
spin_lock(&resv->lock);
 
-   list_for_each_entry_safe(rg, trg, &allocated_regions, link) {
-   list_del(&rg->link);
-   list_add(&rg->link, &resv->region_cache);
-   resv->region_cache_count++;
-   }
+   list_splice(&allocated_regions, &resv->region_cache);
+   resv->region_cache_count += to_allocate;
}
 
return 0;
-- 
2.20.1 (Apple Git-117)
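
The difference between moving entries one by one and splicing the whole list can be sketched with a small circular doubly linked list. This is a userspace approximation under stated assumptions: the helper names (list_init, list_add_front, list_splice_front, list_len) are hypothetical and only mimic the shape of the kernel's list API.

```c
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Insert node n right after head. */
static void list_add_front(struct list_head *n, struct list_head *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Move every node of 'list' to the front of 'head' in O(1).
 * Like the kernel's list_splice(), 'list' itself is not reinitialised. */
static void list_splice_front(struct list_head *list, struct list_head *head)
{
	struct list_head *first = list->next;
	struct list_head *last = list->prev;

	if (first == list)
		return;                 /* nothing to move */

	first->prev = head;
	last->next = head->next;
	head->next->prev = last;
	head->next = first;
}

static size_t list_len(const struct list_head *head)
{
	size_t n = 0;

	for (const struct list_head *p = head->next; p != head; p = p->next)
		n++;
	return n;
}
```

Only four pointers change regardless of how many entries are moved, which is why the entry counter can be bumped once by the known total instead of once per entry.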



[Patch v2 6/7] mm/hugetlb: return non-isolated page in the loop instead of break and check

2020-08-27 Thread Wei Yang
Function dequeue_huge_page_node_exact() iterates over the free list and
returns the first non-isolated page.

Instead of breaking out and checking the loop variable, we can return from
within the loop directly. This removes a redundant check.

Signed-off-by: Wei Yang 
Reviewed-by: Mike Kravetz 
---
 mm/hugetlb.c | 26 --
 1 file changed, 12 insertions(+), 14 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 5a71cb7acf6b..6ad365dd1e96 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1033,20 +1033,18 @@ static struct page *dequeue_huge_page_node_exact(struct 
hstate *h, int nid)
 {
struct page *page;
 
-   list_for_each_entry(page, &h->hugepage_freelists[nid], lru)
-   if (!PageHWPoison(page))
-   break;
-   /*
-* if 'non-isolated free hugepage' not found on the list,
-* the allocation fails.
-*/
-   if (&h->hugepage_freelists[nid] == &page->lru)
-   return NULL;
-   list_move(&page->lru, &h->hugepage_activelist);
-   set_page_refcounted(page);
-   h->free_huge_pages--;
-   h->free_huge_pages_node[nid]--;
-   return page;
+   list_for_each_entry(page, &h->hugepage_freelists[nid], lru) {
+   if (PageHWPoison(page))
+   continue;
+
+   list_move(&page->lru, &h->hugepage_activelist);
+   set_page_refcounted(page);
+   h->free_huge_pages--;
+   h->free_huge_pages_node[nid]--;
+   return page;
+   }
+
+   return NULL;
 }
 
 static struct page *dequeue_huge_page_nodemask(struct hstate *h, gfp_t 
gfp_mask, int nid,
-- 
2.20.1 (Apple Git-117)
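
The control-flow change in this patch can be illustrated without any kernel types. A minimal sketch, assuming an array in place of the free list; struct page_stub and first_usable() are hypothetical names.

```c
#include <stddef.h>

struct page_stub {              /* hypothetical stand-in for struct page */
	int poisoned;
	int id;
};

/* Return the first usable entry, or NULL if every entry is poisoned.
 * Returning from inside the loop removes the "did the loop run off the
 * end?" check that a break-based version needs afterwards. */
static struct page_stub *first_usable(struct page_stub *pages, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (pages[i].poisoned)
			continue;
		return &pages[i];
	}
	return NULL;
}
```

With the early return, reaching the statement after the loop can only mean that nothing suitable was found.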



[Patch v2 1/7] mm/hugetlb: not necessary to coalesce regions recursively

2020-08-27 Thread Wei Yang
Per my understanding, we keep the regions ordered and always coalesce
them properly. So to maintain this property it is enough to coalesce a
region with its neighbours.

Let's simplify this.

Signed-off-by: Wei Yang 
Reviewed-by: Baoquan He 
Reviewed-by: Mike Kravetz 
---
 mm/hugetlb.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 590111ea6975..62ec74f6d03f 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -307,8 +307,7 @@ static void coalesce_file_region(struct resv_map *resv, 
struct file_region *rg)
list_del(&rg->link);
kfree(rg);
 
-   coalesce_file_region(resv, prg);
-   return;
+   rg = prg;
}
 
nrg = list_next_entry(rg, link);
@@ -318,9 +317,6 @@ static void coalesce_file_region(struct resv_map *resv, 
struct file_region *rg)
 
list_del(&rg->link);
kfree(rg);
-
-   coalesce_file_region(resv, nrg);
-   return;
}
 }
 
-- 
2.20.1 (Apple Git-117)
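
The argument in this patch is that, in an ordered and already coalesced set of regions, a changed region can only merge with its immediate neighbours. A simplified sketch follows, assuming a sorted array instead of a linked list and ignoring the uncharge-info comparison the kernel also performs; struct region, region_del() and coalesce_at() are hypothetical names.

```c
#include <stddef.h>

struct region { long from, to; };

/* Remove entry i from a sorted array of *n regions. */
static void region_del(struct region *r, size_t *n, size_t i)
{
	for (size_t j = i + 1; j < *n; j++)
		r[j - 1] = r[j];
	(*n)--;
}

/* Merge r[i] with its immediate neighbours when they touch.
 * The array is assumed sorted and already coalesced everywhere else,
 * so one look to the left and one to the right is sufficient. */
static void coalesce_at(struct region *r, size_t *n, size_t i)
{
	if (i > 0 && r[i - 1].to == r[i].from) {
		r[i - 1].to = r[i].to;
		region_del(r, n, i);
		i--;                    /* continue with the merged entry */
	}
	if (i + 1 < *n && r[i].to == r[i + 1].from) {
		r[i].to = r[i + 1].to;
		region_del(r, n, i + 1);
	}
}
```

The `i--` after the left merge plays the role of `rg = prg` in the patch: the right-hand check then runs against the merged entry, without recursion.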



[Patch v2 2/7] mm/hugetlb: remove VM_BUG_ON(!nrg) in get_file_region_entry_from_cache()

2020-08-27 Thread Wei Yang
We are sure to get a valid file_region, otherwise the
VM_BUG_ON(resv->region_cache_count <= 0) at the very beginning would be
triggered.

Let's remove the redundant one.

Signed-off-by: Wei Yang 
---
 mm/hugetlb.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 62ec74f6d03f..f325839be617 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -238,7 +238,6 @@ get_file_region_entry_from_cache(struct resv_map *resv, 
long from, long to)
 
resv->region_cache_count--;
nrg = list_first_entry(&resv->region_cache, struct file_region, link);
-   VM_BUG_ON(!nrg);
list_del(&nrg->link);
 
nrg->from = from;
-- 
2.20.1 (Apple Git-117)



[PATCH v2] media: uvcvideo: Convey full colorspace information to V4L2

2020-08-27 Thread Adam Goode
The Color Matching Descriptor has been present in USB cameras since
the original version of UVC, but it has never been fully exposed
in Linux.

This change informs V4L2 of all of the UVC colorspace parameters:
color primaries, transfer characteristics, and YCbCr encoding.
videodev2.h doesn't have values for all the possible UVC color settings,
so they are mapped as closely as possible.

Additionally, this patch overrides the default setting for quantization
for UVC MJPEG. By default, V4L2 assumes that MJPEG is full range encoded,
which is not correct for UVC.

JPEG itself does not specify YCbCr encoding information; this is
left to some other metadata. For typical JPEG images (those that conform to
JFIF, see https://www.w3.org/Graphics/JPEG/jfif3.pdf), the colorspace is
specified as YCbCr CCIR 601 with full range. The use of this variant
on the 601 standard in JFIF is the reason that V4L2 defaults to full range
for JPEG.

A JPEG image isn't a JFIF unless it contains an APP0 tag with 'JFIF', and
the UVC standard is clear that APP0 is optional for its MJPEG payload.
It does not mention JFIF at all. Moreover, it provides color metadata
in the Color Matching Descriptor, all using limited range as of UVC 1.5.

Note that web browsers such as Chrome and Firefox already ignore V4L2's
quantization for USB devices and use the correct limited range, but
other programs such as qv4l2 will incorrectly interpret the encoding of
MJPEG from USB cameras without this change.

Since there are many YUV and non-YUV formats supported by UVC cameras (but
not mentioned in the official specifications), and the quantization is
also not specified for these formats, I am not changing that behavior: all
formats besides MJPEG will stay at V4L2_QUANTIZATION_DEFAULT as before.

Signed-off-by: Adam Goode 
---

Changes in v2:
 - Apply quantization override only for MJPEG.
 - Provide more comments and background information about JPEG vs JFIF.
 - Explain the substitutions for xfer func and ycbcr encoding.

 drivers/media/usb/uvc/uvc_driver.c | 87 --
 drivers/media/usb/uvc/uvc_v4l2.c   |  6 +++
 drivers/media/usb/uvc/uvcvideo.h   |  6 ++-
 3 files changed, 94 insertions(+), 5 deletions(-)

diff --git a/drivers/media/usb/uvc/uvc_driver.c 
b/drivers/media/usb/uvc/uvc_driver.c
index 431d86e1c94b..4e530a4bf976 100644
--- a/drivers/media/usb/uvc/uvc_driver.c
+++ b/drivers/media/usb/uvc/uvc_driver.c
@@ -248,10 +248,10 @@ static struct uvc_format_desc *uvc_format_by_guid(const 
u8 guid[16])
return NULL;
 }
 
-static u32 uvc_colorspace(const u8 primaries)
+static enum v4l2_colorspace uvc_colorspace(const u8 primaries)
 {
-   static const u8 colorprimaries[] = {
-   0,
+   static const enum v4l2_colorspace colorprimaries[] = {
+   V4L2_COLORSPACE_DEFAULT,  /* Unspecified */
V4L2_COLORSPACE_SRGB,
V4L2_COLORSPACE_470_SYSTEM_M,
V4L2_COLORSPACE_470_SYSTEM_BG,
@@ -262,7 +262,61 @@ static u32 uvc_colorspace(const u8 primaries)
if (primaries < ARRAY_SIZE(colorprimaries))
return colorprimaries[primaries];
 
-   return 0;
+   return V4L2_COLORSPACE_DEFAULT;  /* Reserved */
+}
+
+static enum v4l2_xfer_func uvc_xfer_func(const u8 transfer_characteristics)
+{
+   /* V4L2 currently does not currently have definitions for all
+* possible values of UVC transfer characteristics. If
+* v4l2_xfer_func is extended with new values, the mapping
+* below should be updated.
+*
+* Substitutions are taken from the mapping given for
+* V4L2_XFER_FUNC_DEFAULT documented in videodev2.h.
+*/
+   static const enum v4l2_xfer_func xfer_funcs[] = {
+   V4L2_XFER_FUNC_DEFAULT,/* Unspecified */
+   V4L2_XFER_FUNC_709,
+   V4L2_XFER_FUNC_709,/* Substitution for BT.470-2 M */
+   V4L2_XFER_FUNC_709,/* Substitution for BT.470-2 B, G */
+   V4L2_XFER_FUNC_709,/* Substitution for SMPTE 170M */
+   V4L2_XFER_FUNC_SMPTE240M,
+   V4L2_XFER_FUNC_NONE,
+   V4L2_XFER_FUNC_SRGB,
+   };
+
+   if (transfer_characteristics < ARRAY_SIZE(xfer_funcs))
+   return xfer_funcs[transfer_characteristics];
+
+   return V4L2_XFER_FUNC_DEFAULT;  /* Reserved */
+}
+
+static enum v4l2_ycbcr_encoding uvc_ycbcr_enc(const u8 matrix_coefficients)
+{
+   /* V4L2 currently does not currently have definitions for all
+* possible values of UVC matrix coefficients. If
+* v4l2_ycbcr_encoding is extended with new values, the
+* mapping below should be updated.
+*
+* Substitutions are taken from the mapping given for
+* V4L2_YCBCR_ENC_DEFAULT documented in videodev2.h.
+*
+* FCC is assumed to be close enough to 601.
+*/
+   static const enum v4l2_ycbcr_encoding ycbcr_encs[] = {
+   

Re: [PATCH v3] mm: Fix kthread_use_mm() vs TLB invalidate

2020-08-27 Thread Nicholas Piggin
Excerpts from pet...@infradead.org's message of August 21, 2020 11:04 pm:
> On Fri, Aug 21, 2020 at 11:09:51AM +0530, Aneesh Kumar K.V wrote:
>> Peter Zijlstra  writes:
>> 
>> > For SMP systems using IPI based TLB invalidation, looking at
>> > current->active_mm is entirely reasonable. This then presents the
>> > following race condition:
>> >
>> >
>> >   CPU0                    CPU1
>> >
>> >   flush_tlb_mm(mm)        use_mm(mm)
>> >
>> >                             tsk->active_mm = mm;
>> >
>> >                               if (tsk->active_mm == mm)
>> >                                 // flush TLBs
>> >
>> >                             switch_mm(old_mm,mm,tsk);
>> >
>> >
>> > Where it is possible the IPI flushed the TLBs for @old_mm, not @mm,
>> > because the IPI lands before we actually switched.
>> >
>> > Avoid this by disabling IRQs across changing ->active_mm and
>> > switch_mm().
>> >
>> > [ There are all sorts of reasons this might be harmless for various
>> > architecture specific reasons, but best not leave the door open at
>> > all. ]
>> 
>> 
>> Do we have similar race with exec_mmap()? I am looking at exec_mmap()
>> running in parallel with do_exit_flush_lazy_tlb(). We can get
>> 
>>  if (current->active_mm == mm) {
>> 
>> true and if we don't disable irq around updating tsk->mm/active_mm we
>> can end up doing mmdrop on wrong mm?
> 
> exec_mmap() is called after de_thread(), there should not be any mm
> specific invalidations around I think.
> 
> Then again, CLONE_VM without CLONE_THREAD might still be possible, so
> yeah, we probably want IRQs disabled there too, just for consistency and
> general paranoia if nothing else.

The problem is probably not this TLB flushing race, but I think there
is a lazy tlb race.

  call_usermodehelper()
    kernel_execve()
      old_mm = current->mm;
      active_mm = current->active_mm;
      *** preempt *** -------------------->  schedule()
                                               prev->active_mm = NULL;
                                               mmdrop(prev active mm)
                                             ...
                      <--------------------  schedule()
      current->mm = mm;
      current->active_mm = mm;
      if (!old_mm)
          mmdrop(active_mm); /* double free! */

There are possibly other problematic interleavings. powerpc also has an
issue with switching away a lazy tlb mm via IPI, which is basically the
same problem, so I just illustrate the more general issue.

I think we just make it a rule that these always get updated under
local_irq_disable, to be safe.

Trouble is we can't just do it, because some architectures can't do
activate_mm with irqs disabled: ARM and UM, at least. UM can't even
do it with preemption disabled. We can probably change them to make
them work, but I'm not sure what the best way to go is. My first attempt
is to require activate_mm to do the mm switching and the irq disable as
well, but I'll need some help from the archs.

I'll send out rfcs in a minute.

Thanks,
Nick
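
The interleaving above can be modelled in user space. The following is a minimal single-threaded sketch, not kernel code: the toy_* types and helpers and the install_mm_*() functions are hypothetical stand-ins, the "preemption" is an ordinary function call, and the irq-disabled region is only marked by comments. It shows why sampling active_mm before the preemption point drops the wrong mm, while sampling and updating it in one uninterrupted region does not.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; all names are hypothetical. */
struct toy_mm { int refs; };
struct toy_task { struct toy_mm *mm, *active_mm; };

static void toy_mmgrab(struct toy_mm *mm) { mm->refs++; }
static void toy_mmdrop(struct toy_mm *mm) { mm->refs--; }

/*
 * Models the kernel thread being switched out and back in: the lazily
 * borrowed mm is dropped and a different one is borrowed on return.
 */
static void toy_preempt(struct toy_task *t, struct toy_mm *next_lazy)
{
	toy_mmdrop(t->active_mm);
	toy_mmgrab(next_lazy);
	t->active_mm = next_lazy;
}

/* Racy ordering: active_mm is sampled before the preemption point. */
static void install_mm_racy(struct toy_task *t, struct toy_mm *mm,
			    struct toy_mm *next_lazy)
{
	struct toy_mm *old_mm = t->mm;
	struct toy_mm *active_mm = t->active_mm;

	toy_preempt(t, next_lazy);	/* *** preempt *** */

	t->mm = mm;
	t->active_mm = mm;
	if (!old_mm)
		toy_mmdrop(active_mm);	/* stale pointer: second drop */
}

/* Ordering under the proposed rule: sample and update in one region. */
static void install_mm_safe(struct toy_task *t, struct toy_mm *mm,
			    struct toy_mm *next_lazy)
{
	struct toy_mm *old_mm, *active_mm;

	assert(mm != NULL);
	toy_preempt(t, next_lazy);	/* preemption can only land here */

	/* kernel: local_irq_disable() */
	old_mm = t->mm;
	active_mm = t->active_mm;
	t->mm = mm;
	t->active_mm = mm;
	/* kernel: local_irq_enable() */

	if (!old_mm)
		toy_mmdrop(active_mm);	/* drops the mm actually borrowed */
}
```

In the racy variant the first lazy mm loses two references and the second one leaks a reference; in the safe variant each borrowed reference is dropped exactly once.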


Re: [PATCH net-next v1 3/3] hinic: add support to query function table

2020-08-27 Thread luobin (L)
On 2020/8/28 3:44, Jakub Kicinski wrote:
> On Thu, 27 Aug 2020 19:13:21 +0800 Luo bin wrote:
>> +switch (idx) {
>> +case VALID:
>> +return funcfg_table_elem->dw0.bs.valid;
>> +case RX_MODE:
>> +return funcfg_table_elem->dw0.bs.nic_rx_mode;
>> +case MTU:
>> +return funcfg_table_elem->dw1.bs.mtu;
>> +case VLAN_MODE:
>> +return funcfg_table_elem->dw1.bs.vlan_mode;
>> +case VLAN_ID:
>> +return funcfg_table_elem->dw1.bs.vlan_id;
>> +case RQ_DEPTH:
>> +return funcfg_table_elem->dw13.bs.cfg_rq_depth;
>> +case QUEUE_NUM:
>> +return funcfg_table_elem->dw13.bs.cfg_q_num;
> 
> The first two patches look fairly unobjectionable to me, but here the
> information does not seem that driver-specific. What's vlan_mode, and
> vlan_id in the context of PF? Why expose mtu, is it different than
> netdev mtu? What's valid? rq_depth?
> .
> 
The vlan_mode and vlan_id fields in the function table are provided for VFs in
the QinQ scenario and are useless for the PF. Querying a VF's function table is
not supported yet, so there is no need to expose vlan_id and vlan_mode, and I'll
remove them in my next patchset. The function table is saved in hw, and we expose
the mtu to check that the mtu saved in hw is the same as the netdev mtu. The valid
field indicates whether this function is enabled or not, and the hw can judge
whether the RQ buffer in the host is sufficient by comparing the values of rq
depth, pi and ci.




[PATCH v2] stackleak: Fix a race between stack erasing sysctl handlers

2020-08-27 Thread Muchun Song
There is a race between the assignment of `table->data` and the write of a
value through the `table->data` pointer in __do_proc_doulongvec_minmax() on
another thread.

CPU0:                                 CPU1:
                                      proc_sys_write
stack_erasing_sysctl                    proc_sys_call_handler
  table->data = &state;                   stack_erasing_sysctl
                                            table->data = &state;
  proc_doulongvec_minmax
    do_proc_doulongvec_minmax           sysctl_head_finish
      __do_proc_doulongvec_minmax         unuse_table
        i = table->data;
        *i = val;  // corrupt CPU1's stack

Fix this by duplicating the `table`, and only update the duplicate of
it.

Fixes: 964c9dff0091 ("stackleak: Allow runtime disabling of kernel stack erasing")
Signed-off-by: Muchun Song 
---
changelogs in v2:
 1. Add more details about how the race happened to the commit message.

 kernel/stackleak.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/kernel/stackleak.c b/kernel/stackleak.c
index a8fc9ae1d03d..fd95b87478ff 100644
--- a/kernel/stackleak.c
+++ b/kernel/stackleak.c
@@ -25,10 +25,15 @@ int stack_erasing_sysctl(struct ctl_table *table, int write,
int ret = 0;
int state = !static_branch_unlikely(_erasing_bypass);
int prev_state = state;
+   struct ctl_table dup_table = *table;
 
-   table->data = &state;
-   table->maxlen = sizeof(int);
-   ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+   /*
+* In order to avoid races with __do_proc_doulongvec_minmax(), we
+* can duplicate the @table and alter the duplicate of it.
+*/
+   dup_table.data = &state;
+   dup_table.maxlen = sizeof(int);
+   ret = proc_dointvec_minmax(&dup_table, write, buffer, lenp, ppos);
state = !!state;
if (ret || !write || state == prev_state)
return ret;
-- 
2.11.0
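
The idea of the fix can be shown outside the kernel. This is a minimal user-space sketch under stated assumptions: struct toy_ctl_table is a hypothetical cut-down model of struct ctl_table, and toy_store_int() merely stands in for proc_dointvec_minmax(). The handler copies the shared table, points only the copy at its local variable, and so never publishes a stack address through the shared table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down model of struct ctl_table. */
struct toy_ctl_table {
	void *data;
	size_t maxlen;
};

/* Stand-in for proc_dointvec_minmax(): stores val through table->data. */
static void toy_store_int(const struct toy_ctl_table *table, int val)
{
	assert(table->data != NULL && table->maxlen == sizeof(int));
	*(int *)table->data = val;
}

/* Handler that reads the shared table but never writes to it. */
static int toy_handler(const struct toy_ctl_table *shared, int val)
{
	int state = 0;
	struct toy_ctl_table dup_table = *shared;	/* private copy */

	/* Only the copy ever points at this stack frame. */
	dup_table.data = &state;
	dup_table.maxlen = sizeof(int);
	toy_store_int(&dup_table, val);

	return state;
}
```

Because the shared table is taken by const pointer, a concurrent caller can never observe another caller's stack address in it, which is the property the patch relies on.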



Re: [PATCH] coresight: cti: write regsiters directly in cti_enable_hw()

2020-08-27 Thread Tingwei Zhang
On Fri, Aug 28, 2020 at 02:12:53AM +0800, Mathieu Poirier wrote:
> Hi Tingwei,
> 
> On Tue, Aug 18, 2020 at 07:10:57PM +0800, Tingwei Zhang wrote:
> > Deadlock as below is triggered by one CPU holds drvdata->spinlock
> > and calls cti_enable_hw(). Smp_call_function_single() is called
> > in cti_enable_hw() and tries to let another CPU write CTI registers.
> > That CPU is trying to get drvdata->spinlock in cti_cpu_pm_notify()
> > and doesn't response to IPI from smp_call_function_single().
> > 
> > [  988.335937] CPU: 6 PID: 10258 Comm: sh Tainted: GWL
> > 5.8.0-rc6-mainline-16783-gc38daa79b26b-dirty #1
> > [  988.346364] Hardware name: Thundercomm Dragonboard 845c (DT)
> > [  988.352073] pstate: 2045 (nzCv daif +PAN -UAO BTYPE=--)
> > [  988.357689] pc : smp_call_function_single+0x158/0x1b8
> > [  988.362782] lr : smp_call_function_single+0x124/0x1b8
> > ...
> > [  988.451638] Call trace:
> > [  988.454119]  smp_call_function_single+0x158/0x1b8
> > [  988.458866]  cti_enable+0xb4/0xf8 [coresight_cti]
> > [  988.463618]  coresight_control_assoc_ectdev+0x6c/0x128 [coresight]
> > [  988.469855]  coresight_enable+0x1f0/0x364 [coresight]
> > [  988.474957]  enable_source_store+0x5c/0x9c [coresight]
> > [  988.480140]  dev_attr_store+0x14/0x28
> > [  988.483839]  sysfs_kf_write+0x38/0x4c
> > [  988.487532]  kernfs_fop_write+0x1c0/0x2b0
> > [  988.491585]  vfs_write+0xfc/0x300
> > [  988.494931]  ksys_write+0x78/0xe0
> > [  988.498283]  __arm64_sys_write+0x18/0x20
> > [  988.502240]  el0_svc_common+0x98/0x160
> > [  988.506024]  do_el0_svc+0x78/0x80
> > [  988.509377]  el0_sync_handler+0xd4/0x270
> > [  988.513337]  el0_sync+0x164/0x180
> > 
> 
> Was this the full log or you did cut some of it?
> 

I cut some CPU registers' value since it's too long and not relevant.
The Call trace is full.

> > This change write CTI registers directly in cti_enable_hw().
> > Config->hw_powered has been checked to be true with spinlock holded.
> > CTI is powered and can be programmed until spinlock is released.
> > 
> 
> From your explanation above it seems that cti_enable_hw() was called from,
> say CPUy, to enable the CTI associated to CPUx.  CTIx's drvdata->spinlock
> was taken and smp_call_function_single() called right after.  That woke up
> CPUx and cti_cpu_pm_notify() was executed on CPUx in interrupt context,
> trying to take CTIx's drvdata->spinlock.  That hung CPUx and the kernel got
> angry.  Is my assessment correct?
> 

Most of it is correct. The only difference is that CPUx is powered on when
cti_enable_hw() is called.  Otherwise it would goto cti_state_unchanged:
and would not call cti_enable_hw_smp_call(). cti_cpu_pm_notify() is called
when CPUx tries to suspend, not when it resumes.

> If so I don't think the fix suggested in this patch will work.  The same
> condition will happen whenever cti_enable_hw() is called on a CPU to enable
> a CTI that belongs to another CPU and that cti_cpu_pm_notify() is called on
> latter CPU at the same time.
> 

I'm not sure I understand this correctly.  Let me clarify it a little bit.
It's a deadlock because cti_enable_hw() holds the spinlock and calls
cti_enable_hw_smp_call() from CPUx to enable the CTI associated with CPUy,
then waits for cti_enable_hw_smp_call() to return. The IPI is sent to CPUy
while CPUy is in cti_cpu_pm_notify() waiting for the spinlock. In this
patch, I remove cti_enable_hw_smp_call() and write the CTI registers
directly from CPUx. CPUx no longer waits for CPUy and releases the spinlock
after programming the CTI registers. After cti_enable_hw() releases the
spinlock, cti_cpu_pm_notify() continues to run. Since the spinlock is held
and config->hw_powered is true, we don't need to worry about CPUy powering
down while we program the CTI on CPUx.

> I think a better solution is to grab the lock in cti_enable_hw() and check
> the value of ->ctidev.cpu.  If not a global CPU, i.e >= 0, then release the
> lock and call smp_call_function_single().  In cti_enable_hw_smp_call() take
> the lock again and move forward from there.
> 

After cti_enable_hw() releases the lock, the CPU may be taken offline by the
user, in which case cti_enable_hw_smp_call() will fail.



> I have applied the other two patches in this set so no need to send them
> again.
>
Thanks,
Tingwei 
> Thanks,
> Mathieu
> 
> > Fixes: 6a0953ce7de9 ("coresight: cti: Add CPU idle pm notifer to CTI
> devices")
> > Signed-off-by: Tingwei Zhang 
> > ---
> >  drivers/hwtracing/coresight/coresight-cti.c | 17 +
> >  1 file changed, 1 insertion(+), 16 deletions(-)
> > 
> > diff --git a/drivers/hwtracing/coresight/coresight-cti.c
> b/drivers/hwtracing/coresight/coresight-cti.c
> > index 3ccc703dc940..869569eb8c7f 100644
> > --- a/drivers/hwtracing/coresight/coresight-cti.c
> > +++ b/drivers/hwtracing/coresight/coresight-cti.c
> > @@ -86,13 +86,6 @@ void cti_write_all_hw_regs(struct cti_drvdata
> *drvdata)
> > CS_LOCK(drvdata->base);
> >  }
> >  
> > -static void cti_enable_hw_smp_call(void *info)
> > -{
> > -   struct 

[PATCH] arm64: fix some spelling mistakes in the comments by codespell

2020-08-27 Thread Xiaoming Ni
arch/arm64/include/asm/cpu_ops.h:24: necesary ==> necessary
arch/arm64/include/asm/kvm_arm.h:69: maintainance ==> maintenance
arch/arm64/include/asm/cpufeature.h:361: capabilties ==> capabilities
arch/arm64/kernel/perf_regs.c:19: compatability ==> compatibility
arch/arm64/kernel/smp_spin_table.c:86: endianess ==> endianness
arch/arm64/kernel/smp_spin_table.c:88: endianess ==> endianness
arch/arm64/kvm/vgic/vgic-mmio-v3.c:1004: targetting ==> targeting
arch/arm64/kvm/vgic/vgic-mmio-v3.c:1005: targetting ==> targeting

Signed-off-by: Xiaoming Ni 
---
 arch/arm64/include/asm/cpu_ops.h| 2 +-
 arch/arm64/include/asm/cpufeature.h | 2 +-
 arch/arm64/include/asm/kvm_arm.h| 2 +-
 arch/arm64/kernel/perf_regs.c   | 2 +-
 arch/arm64/kernel/smp_spin_table.c  | 4 ++--
 arch/arm64/kvm/vgic/vgic-mmio-v3.c  | 4 ++--
 6 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/include/asm/cpu_ops.h b/arch/arm64/include/asm/cpu_ops.h
index d28e8f37d3b4..e95c4df83911 100644
--- a/arch/arm64/include/asm/cpu_ops.h
+++ b/arch/arm64/include/asm/cpu_ops.h
@@ -21,7 +21,7 @@
  * mechanism for doing so, tests whether it is possible to boot
  * the given CPU.
  * @cpu_boot:  Boots a cpu into the kernel.
- * @cpu_postboot: Optionally, perform any post-boot cleanup or necesary
+ * @cpu_postboot: Optionally, perform any post-boot cleanup or necessary
  * synchronisation. Called from the cpu being booted.
  * @cpu_can_disable: Determines whether a CPU can be disabled based on
  * mechanism-specific information.
diff --git a/arch/arm64/include/asm/cpufeature.h 
b/arch/arm64/include/asm/cpufeature.h
index 89b4f0142c28..3a42dc8e697c 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -358,7 +358,7 @@ static inline int cpucap_default_scope(const struct 
arm64_cpu_capabilities *cap)
 }
 
 /*
- * Generic helper for handling capabilties with multiple (match,enable) pairs
+ * Generic helper for handling capabilities with multiple (match,enable) pairs
  * of call backs, sharing the same capability bit.
  * Iterate over each entry to see if at least one matches.
  */
diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index 51c1d9918999..21f91aebc052 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -66,7 +66,7 @@
  * TWI:Trap WFI
  * TIDCP:  Trap L2CTLR/L2ECTLR
  * BSU_IS: Upgrade barriers to the inner shareable domain
- * FB: Force broadcast of all maintainance operations
+ * FB: Force broadcast of all maintenance operations
  * AMO:Override CPSR.A and enable signaling with VA
  * IMO:Override CPSR.I and enable signaling with VI
  * FMO:Override CPSR.F and enable signaling with VF
diff --git a/arch/arm64/kernel/perf_regs.c b/arch/arm64/kernel/perf_regs.c
index 666b225aeb3a..94e8718e7229 100644
--- a/arch/arm64/kernel/perf_regs.c
+++ b/arch/arm64/kernel/perf_regs.c
@@ -16,7 +16,7 @@ u64 perf_reg_value(struct pt_regs *regs, int idx)
 
/*
 * Our handling of compat tasks (PERF_SAMPLE_REGS_ABI_32) is weird, but
-* we're stuck with it for ABI compatability reasons.
+* we're stuck with it for ABI compatibility reasons.
 *
 * For a 32-bit consumer inspecting a 32-bit task, then it will look at
 * the first 16 registers (see arch/arm/include/uapi/asm/perf_regs.h).
diff --git a/arch/arm64/kernel/smp_spin_table.c 
b/arch/arm64/kernel/smp_spin_table.c
index c8a3fee00c11..5892e79fa429 100644
--- a/arch/arm64/kernel/smp_spin_table.c
+++ b/arch/arm64/kernel/smp_spin_table.c
@@ -83,9 +83,9 @@ static int smp_spin_table_cpu_prepare(unsigned int cpu)
 
/*
 * We write the release address as LE regardless of the native
-* endianess of the kernel. Therefore, any boot-loaders that
+* endianness of the kernel. Therefore, any boot-loaders that
 * read this address need to convert this address to the
-* boot-loader's endianess before jumping. This is mandated by
+* boot-loader's endianness before jumping. This is mandated by
 * the boot protocol.
 */
writeq_relaxed(__pa_symbol(secondary_holding_pen), release_addr);
diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v3.c 
b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
index 5c786b915cd3..52d6f24f65dc 100644
--- a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
+++ b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
@@ -1001,8 +1001,8 @@ void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg, 
bool allow_group1)
raw_spin_lock_irqsave(>irq_lock, flags);
 
/*
-* An access targetting Group0 SGIs can only generate
-* those, while an access targetting Group1 SGIs can
+* An access targeting Group0 SGIs can only generate
+* those, while an access targeting Group1 SGIs can

Re: [PATCH] net: dsa: mt7530: fix advertising unsupported

2020-08-27 Thread Florian Fainelli




On 8/27/2020 2:15 AM, Landen Chao wrote:
> 1000baseT_Half

Looks like this part of the commit subject spilled into the commit message.

> Remove 1000baseT_Half to advertise correct hardware capability in
> phylink_validate() callback function.
>
> Fixes: 38f790a80560 ("net: dsa: mt7530: Add support for port 5")
> Signed-off-by: Landen Chao

Reviewed-by: Florian Fainelli 
--
Florian


possible deadlock in proc_pid_stack (2)

2020-08-27 Thread syzbot
Hello,

syzbot found the following issue on:

HEAD commit:494d311a Add linux-next specific files for 20200821
git tree:   linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=1644d50e90
kernel config:  https://syzkaller.appspot.com/x/.config?x=a61d44f28687f508
dashboard link: https://syzkaller.appspot.com/bug?extid=a26ada9907073b2f6f97
compiler:   gcc (GCC) 10.1.0-syz 20200507

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+a26ada9907073b2f6...@syzkaller.appspotmail.com

==
WARNING: possible circular locking dependency detected
5.9.0-rc1-next-20200821-syzkaller #0 Not tainted
--
syz-executor.2/26207 is trying to acquire lock:
8880a064e688 (>exec_update_mutex){+.+.}-{3:3}, at: lock_trace 
fs/proc/base.c:408 [inline]
8880a064e688 (>exec_update_mutex){+.+.}-{3:3}, at: 
proc_pid_stack+0xf0/0x2a0 fs/proc/base.c:452

but task is already holding lock:
88809dc4d8f8 (>lock){+.+.}-{3:3}, at: seq_read+0x61/0x1070 
fs/seq_file.c:155

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #3 (>lock){+.+.}-{3:3}:
   __mutex_lock_common kernel/locking/mutex.c:956 [inline]
   __mutex_lock+0x134/0x10e0 kernel/locking/mutex.c:1103
   seq_read+0x61/0x1070 fs/seq_file.c:155
   do_loop_readv_writev fs/read_write.c:734 [inline]
   do_loop_readv_writev fs/read_write.c:721 [inline]
   do_iter_read+0x48e/0x6e0 fs/read_write.c:955
   vfs_readv+0xe5/0x150 fs/read_write.c:1073
   kernel_readv fs/splice.c:355 [inline]
   default_file_splice_read.constprop.0+0x4e6/0x9e0 fs/splice.c:412
   do_splice_to+0x137/0x170 fs/splice.c:871
   splice_direct_to_actor+0x307/0x980 fs/splice.c:950
   do_splice_direct+0x1b3/0x280 fs/splice.c:1059
   do_sendfile+0x55f/0xd40 fs/read_write.c:1540
   __do_sys_sendfile64 fs/read_write.c:1601 [inline]
   __se_sys_sendfile64 fs/read_write.c:1587 [inline]
   __x64_sys_sendfile64+0x1cc/0x210 fs/read_write.c:1587
   do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

-> #2 (sb_writers#4){.+.+}-{0:0}:
   percpu_down_read include/linux/percpu-rwsem.h:51 [inline]
   __sb_start_write+0x234/0x470 fs/super.c:1672
   sb_start_write include/linux/fs.h:1643 [inline]
   mnt_want_write+0x3a/0xb0 fs/namespace.c:354
   ovl_do_remove+0xe1/0xe00 fs/overlayfs/dir.c:889
   vfs_rmdir.part.0+0x113/0x430 fs/namei.c:3712
   vfs_rmdir fs/namei.c:3698 [inline]
   do_rmdir+0x3ae/0x440 fs/namei.c:3773
   do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

-> #1 (_i_mutex_dir_key[depth]#2){}-{3:3}:
   down_read+0x96/0x420 kernel/locking/rwsem.c:1492
   inode_lock_shared include/linux/fs.h:789 [inline]
   lookup_slow fs/namei.c:1560 [inline]
   walk_component+0x409/0x6a0 fs/namei.c:1860
   lookup_last fs/namei.c:2309 [inline]
   path_lookupat+0x1ba/0x830 fs/namei.c:2333
   filename_lookup+0x19f/0x560 fs/namei.c:2366
   create_local_trace_uprobe+0x87/0x4e0 kernel/trace/trace_uprobe.c:1574
   perf_uprobe_init+0x132/0x210 kernel/trace/trace_event_perf.c:323
   perf_uprobe_event_init+0xff/0x1c0 kernel/events/core.c:9608
   perf_try_init_event+0x12a/0x560 kernel/events/core.c:10927
   perf_init_event kernel/events/core.c:10979 [inline]
   perf_event_alloc.part.0+0xe04/0x3790 kernel/events/core.c:11257
   perf_event_alloc kernel/events/core.c:11636 [inline]
   __do_sys_perf_event_open+0x72c/0x2cb0 kernel/events/core.c:11752
   do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

-> #0 (>exec_update_mutex){+.+.}-{3:3}:
   check_prev_add kernel/locking/lockdep.c:2496 [inline]
   check_prevs_add kernel/locking/lockdep.c:2601 [inline]
   validate_chain kernel/locking/lockdep.c:3218 [inline]
   __lock_acquire+0x2a6b/0x5640 kernel/locking/lockdep.c:4426
   lock_acquire+0x1f1/0xad0 kernel/locking/lockdep.c:5005
   __mutex_lock_common kernel/locking/mutex.c:956 [inline]
   __mutex_lock+0x134/0x10e0 kernel/locking/mutex.c:1103
   lock_trace fs/proc/base.c:408 [inline]
   proc_pid_stack+0xf0/0x2a0 fs/proc/base.c:452
   proc_single_show+0x116/0x1e0 fs/proc/base.c:775
   seq_read+0x432/0x1070 fs/seq_file.c:208
   do_loop_readv_writev fs/read_write.c:734 [inline]
   do_loop_readv_writev fs/read_write.c:721 [inline]
   do_iter_read+0x48e/0x6e0 fs/read_write.c:955
   vfs_readv+0xe5/0x150 fs/read_write.c:1073
   do_preadv fs/read_write.c:1165 [inline]
   __do_sys_preadv fs/read_write.c:1215 [inline]
   __se_sys_preadv fs/read_write.c:1210 [inline]
   

Results from the 2020 Linux Foundation Technical Advisory Board election

2020-08-27 Thread Jonathan Corbet
This year's election for the Linux Foundation Technical Advisory Board had
955 authorized voters; 235 of them cast ballots.  The results were:

1: Laura Abbott
2: Kees Cook
3: Dan Williams
4: Christian Brauner
5: Chris Mason

6: Olof Johansson
7: Frank Rowand

The top five will serve two-year terms on the TAB.

Many thanks to all who voted this year, and to all of the candidates.

jon


[PATCH v2] mm/hugetlb: Fix a race between hugetlb sysctl handlers

2020-08-27 Thread Muchun Song
There is a race between the assignment of `table->data` and the write of a
value through the `table->data` pointer in __do_proc_doulongvec_minmax() on
another thread.

CPU0:                                 CPU1:
                                      proc_sys_write
hugetlb_sysctl_handler                  proc_sys_call_handler
hugetlb_sysctl_handler_common             hugetlb_sysctl_handler
  table->data =                             hugetlb_sysctl_handler_common
                                              table->data =
    proc_doulongvec_minmax
      do_proc_doulongvec_minmax           sysctl_head_finish
        __do_proc_doulongvec_minmax         unuse_table
          i = table->data;
          *i = val;  // corrupt CPU1's stack
Fix this by duplicating the `table`, and only update the duplicate of
it. Also introduce a helper, proc_hugetlb_doulongvec_minmax(), to
simplify the code.

The following oops was seen:

BUG: kernel NULL pointer dereference, address: 
#PF: supervisor instruction fetch in kernel mode
#PF: error_code(0x0010) - not-present page
Code: Bad RIP value.
...
Call Trace:
 ? set_max_huge_pages+0x3da/0x4f0
 ? alloc_pool_huge_page+0x150/0x150
 ? proc_doulongvec_minmax+0x46/0x60
 ? hugetlb_sysctl_handler_common+0x1c7/0x200
 ? nr_hugepages_store+0x20/0x20
 ? copy_fd_bitmaps+0x170/0x170
 ? hugetlb_sysctl_handler+0x1e/0x20
 ? proc_sys_call_handler+0x2f1/0x300
 ? unregister_sysctl_table+0xb0/0xb0
 ? __fd_install+0x78/0x100
 ? proc_sys_write+0x14/0x20
 ? __vfs_write+0x4d/0x90
 ? vfs_write+0xef/0x240
 ? ksys_write+0xc0/0x160
 ? __ia32_sys_read+0x50/0x50
 ? __close_fd+0x129/0x150
 ? __x64_sys_write+0x43/0x50
 ? do_syscall_64+0x6c/0x200
 ? entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: e5ff215941d5 ("hugetlb: multiple hstates for multiple page sizes")
Signed-off-by: Muchun Song 
---
changelogs in v2:
 1. Add more details about how the race happened to the commit message.
 2. Remove unnecessary assignment of table->maxlen.

 mm/hugetlb.c | 26 --
 1 file changed, 20 insertions(+), 6 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index a301c2d672bf..4c2a2620eeed 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3454,6 +3454,22 @@ static unsigned int allowed_mems_nr(struct hstate *h)
 }
 
 #ifdef CONFIG_SYSCTL
+static int proc_hugetlb_doulongvec_minmax(struct ctl_table *table, int write,
+ void *buffer, size_t *length,
+ loff_t *ppos, unsigned long *out)
+{
+   struct ctl_table dup_table;
+
+   /*
+* In order to avoid races with __do_proc_doulongvec_minmax(), we
+* can duplicate the @table and alter the duplicate of it.
+*/
+   dup_table = *table;
+   dup_table.data = out;
+
+   return proc_doulongvec_minmax(&dup_table, write, buffer, length, ppos);
+}
+
 static int hugetlb_sysctl_handler_common(bool obey_mempolicy,
 struct ctl_table *table, int write,
 void *buffer, size_t *length, loff_t *ppos)
@@ -3465,9 +3481,8 @@ static int hugetlb_sysctl_handler_common(bool 
obey_mempolicy,
if (!hugepages_supported())
return -EOPNOTSUPP;
 
-   table->data = 
-   table->maxlen = sizeof(unsigned long);
-   ret = proc_doulongvec_minmax(table, write, buffer, length, ppos);
+   ret = proc_hugetlb_doulongvec_minmax(table, write, buffer, length, ppos,
+);
if (ret)
goto out;
 
@@ -3510,9 +3525,8 @@ int hugetlb_overcommit_handler(struct ctl_table *table, 
int write,
if (write && hstate_is_gigantic(h))
return -EINVAL;
 
-   table->data = 
-   table->maxlen = sizeof(unsigned long);
-   ret = proc_doulongvec_minmax(table, write, buffer, length, ppos);
+   ret = proc_hugetlb_doulongvec_minmax(table, write, buffer, length, ppos,
+);
if (ret)
goto out;
 
-- 
2.11.0



Re: [PATCH v6 0/3] io_uring: add restrictions to support untrusted applications and guests

2020-08-27 Thread Jens Axboe
On 8/27/20 8:58 AM, Stefano Garzarella wrote:
> v6:
>  - moved restriction checks in a function [Jens]
>  - changed ret value handling in io_register_restrictions() [Jens]
> 
> v5: https://lore.kernel.org/io-uring/20200827134044.82821-1-sgarz...@redhat.com/
> v4: https://lore.kernel.org/io-uring/20200813153254.93731-1-sgarz...@redhat.com/
> v3: https://lore.kernel.org/io-uring/20200728160101.48554-1-sgarz...@redhat.com/
> RFC v2: https://lore.kernel.org/io-uring/20200716124833.93667-1-sgarz...@redhat.com
> RFC v1: https://lore.kernel.org/io-uring/20200710141945.129329-1-sgarz...@redhat.com
> 
> Following the proposal that I sent about restrictions [1], I wrote this series
> to add restrictions in io_uring.
> 
> I also wrote helpers in liburing and a test case (test/register-restrictions.c)
> available in this repository:
> https://github.com/stefano-garzarella/liburing (branch: io_uring_restrictions)
> 
> Just to recap the proposal, the idea is to add some restrictions to the
> operations (sqe opcode and flags, register opcode) to safely allow untrusted
> applications or guests to use io_uring queues.
> 
> The first patch changes io_uring_register(2) opcodes into an enumeration to
> keep track of the last opcode available.
> 
> The second patch adds the IORING_REGISTER_RESTRICTIONS opcode and the code to
> handle restrictions.
> 
> The third patch adds the IORING_SETUP_R_DISABLED flag to start the rings
> disabled, allowing the user to register restrictions, buffers and files before
> starting to process SQEs.

Applied, thanks.

-- 
Jens Axboe



[PATCH v3 1/4] block: Move bio merge related functions into blk-merge.c

2020-08-27 Thread Baolin Wang
It's better to move bio merge related functions into blk-merge.c,
which contains all merge related functions.

Signed-off-by: Baolin Wang 
Reviewed-by: Christoph Hellwig 
---
 block/blk-core.c  | 156 -
 block/blk-merge.c | 157 ++
 2 files changed, 157 insertions(+), 156 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index d9d6326..ed79109 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -642,162 +642,6 @@ void blk_put_request(struct request *req)
 }
 EXPORT_SYMBOL(blk_put_request);
 
-static void blk_account_io_merge_bio(struct request *req)
-{
-   if (!blk_do_io_stat(req))
-   return;
-
-   part_stat_lock();
-   part_stat_inc(req->part, merges[op_stat_group(req_op(req))]);
-   part_stat_unlock();
-}
-
-bool bio_attempt_back_merge(struct request *req, struct bio *bio,
-   unsigned int nr_segs)
-{
-   const int ff = bio->bi_opf & REQ_FAILFAST_MASK;
-
-   if (!ll_back_merge_fn(req, bio, nr_segs))
-   return false;
-
-   trace_block_bio_backmerge(req->q, req, bio);
-   rq_qos_merge(req->q, req, bio);
-
-   if ((req->cmd_flags & REQ_FAILFAST_MASK) != ff)
-   blk_rq_set_mixed_merge(req);
-
-   req->biotail->bi_next = bio;
-   req->biotail = bio;
-   req->__data_len += bio->bi_iter.bi_size;
-
-   bio_crypt_free_ctx(bio);
-
-   blk_account_io_merge_bio(req);
-   return true;
-}
-
-bool bio_attempt_front_merge(struct request *req, struct bio *bio,
-   unsigned int nr_segs)
-{
-   const int ff = bio->bi_opf & REQ_FAILFAST_MASK;
-
-   if (!ll_front_merge_fn(req, bio, nr_segs))
-   return false;
-
-   trace_block_bio_frontmerge(req->q, req, bio);
-   rq_qos_merge(req->q, req, bio);
-
-   if ((req->cmd_flags & REQ_FAILFAST_MASK) != ff)
-   blk_rq_set_mixed_merge(req);
-
-   bio->bi_next = req->bio;
-   req->bio = bio;
-
-   req->__sector = bio->bi_iter.bi_sector;
-   req->__data_len += bio->bi_iter.bi_size;
-
-   bio_crypt_do_front_merge(req, bio);
-
-   blk_account_io_merge_bio(req);
-   return true;
-}
-
-bool bio_attempt_discard_merge(struct request_queue *q, struct request *req,
-   struct bio *bio)
-{
-   unsigned short segments = blk_rq_nr_discard_segments(req);
-
-   if (segments >= queue_max_discard_segments(q))
-   goto no_merge;
-   if (blk_rq_sectors(req) + bio_sectors(bio) >
-   blk_rq_get_max_sectors(req, blk_rq_pos(req)))
-   goto no_merge;
-
-   rq_qos_merge(q, req, bio);
-
-   req->biotail->bi_next = bio;
-   req->biotail = bio;
-   req->__data_len += bio->bi_iter.bi_size;
-   req->nr_phys_segments = segments + 1;
-
-   blk_account_io_merge_bio(req);
-   return true;
-no_merge:
-   req_set_nomerge(q, req);
-   return false;
-}
-
-/**
- * blk_attempt_plug_merge - try to merge with %current's plugged list
- * @q: request_queue new bio is being queued at
- * @bio: new bio being queued
- * @nr_segs: number of segments in @bio
- * @same_queue_rq: pointer to  request that gets filled in when
- * another request associated with @q is found on the plug list
- * (optional, may be %NULL)
- *
- * Determine whether @bio being queued on @q can be merged with a request
- * on %current's plugged list.  Returns %true if merge was successful,
- * otherwise %false.
- *
- * Plugging coalesces IOs from the same issuer for the same purpose without
- * going through @q->queue_lock.  As such it's more of an issuing mechanism
- * than scheduling, and the request, while may have elvpriv data, is not
- * added on the elevator at this point.  In addition, we don't have
- * reliable access to the elevator outside queue lock.  Only check basic
- * merging parameters without querying the elevator.
- *
- * Caller must ensure !blk_queue_nomerges(q) beforehand.
- */
-bool blk_attempt_plug_merge(struct request_queue *q, struct bio *bio,
-   unsigned int nr_segs, struct request **same_queue_rq)
-{
-   struct blk_plug *plug;
-   struct request *rq;
-   struct list_head *plug_list;
-
-   plug = blk_mq_plug(q, bio);
-   if (!plug)
-   return false;
-
-   plug_list = &plug->mq_list;
-
-   list_for_each_entry_reverse(rq, plug_list, queuelist) {
-   bool merged = false;
-
-   if (rq->q == q && same_queue_rq) {
-   /*
-* Only blk-mq multiple hardware queues case checks the
-* rq in the same queue, there should be only one such
-* rq in a queue
-**/
-   *same_queue_rq = rq;
-   }
-
-   if (rq->q != q || !blk_rq_merge_ok(rq, bio))
-   continue;
-
-   switch (blk_try_merge(rq, bio)) {

Re: [PATCH] sched/fair: Fix wrong cpu selecting from isolated domain

2020-08-27 Thread Xunlei Pang
On 2020/8/24 PM8:30, Xunlei Pang wrote:
> We've met problems that occasionally tasks with full cpumask
> (e.g. by putting it into a cpuset or setting to full affinity)
> were migrated to our isolated cpus in production environment.
> 
> After some analysis, we found that it is due to the current
> select_idle_smt() not considering the sched_domain mask.
> 
> Fix it by checking the valid domain mask in select_idle_smt().
> 
> Fixes: 10e2f1acd010 ("sched/core: Rewrite and improve select_idle_siblings()")
> Reported-by: Wetp Zhang 
> Signed-off-by: Xunlei Pang 
> ---
>  kernel/sched/fair.c | 9 +
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 1a68a05..fa942c4 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6075,7 +6075,7 @@ static int select_idle_core(struct task_struct *p, 
> struct sched_domain *sd, int
>  /*
>   * Scan the local SMT mask for idle CPUs.
>   */
> -static int select_idle_smt(struct task_struct *p, int target)
> +static int select_idle_smt(struct task_struct *p, struct sched_domain *sd, 
> int target)
>  {
>   int cpu;
>  
> @@ -6083,7 +6083,8 @@ static int select_idle_smt(struct task_struct *p, int 
> target)
>   return -1;
>  
>   for_each_cpu(cpu, cpu_smt_mask(target)) {
> - if (!cpumask_test_cpu(cpu, p->cpus_ptr))
> + if (!cpumask_test_cpu(cpu, p->cpus_ptr) ||
> + !cpumask_test_cpu(cpu, sched_domain_span(sd)))
>   continue;
>   if (available_idle_cpu(cpu) || sched_idle_cpu(cpu))
>   return cpu;
> @@ -6099,7 +6100,7 @@ static inline int select_idle_core(struct task_struct 
> *p, struct sched_domain *s
>   return -1;
>  }
>  
> -static inline int select_idle_smt(struct task_struct *p, int target)
> +static inline int select_idle_smt(struct task_struct *p, struct sched_domain 
> *sd, int target)
>  {
>   return -1;
>  }
> @@ -6274,7 +6275,7 @@ static int select_idle_sibling(struct task_struct *p, 
> int prev, int target)
>   if ((unsigned)i < nr_cpumask_bits)
>   return i;
>  
> - i = select_idle_smt(p, target);
> + i = select_idle_smt(p, sd, target);
>   if ((unsigned)i < nr_cpumask_bits)
>   return i;
>  
> 

Hi Peter, any other comments?


[PATCH v3 3/4] block: Add a new helper to attempt to merge a bio

2020-08-27 Thread Baolin Wang
There is a lot of duplicated code when trying to merge a bio from the
plug list and the sw queue. Introduce a new helper to attempt to merge
a bio, which simplifies blk_bio_list_merge() and
blk_attempt_plug_merge().

Signed-off-by: Baolin Wang 
---
 block/blk-merge.c| 104 ++-
 block/blk-mq-sched.c |   6 +--
 block/blk.h  |  21 ---
 3 files changed, 71 insertions(+), 60 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index b09e9fc..80c9744 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -907,13 +907,14 @@ static void blk_account_io_merge_bio(struct request *req)
part_stat_unlock();
 }
 
-bool bio_attempt_back_merge(struct request *req, struct bio *bio,
-   unsigned int nr_segs)
+enum bio_merge_status bio_attempt_back_merge(struct request *req,
+struct bio *bio,
+unsigned int nr_segs)
 {
const int ff = bio->bi_opf & REQ_FAILFAST_MASK;
 
if (!ll_back_merge_fn(req, bio, nr_segs))
-   return false;
+   return BIO_MERGE_FAILED;
 
trace_block_bio_backmerge(req->q, req, bio);
rq_qos_merge(req->q, req, bio);
@@ -928,16 +929,17 @@ bool bio_attempt_back_merge(struct request *req, struct 
bio *bio,
bio_crypt_free_ctx(bio);
 
blk_account_io_merge_bio(req);
-   return true;
+   return BIO_MERGE_OK;
 }
 
-bool bio_attempt_front_merge(struct request *req, struct bio *bio,
-   unsigned int nr_segs)
+enum bio_merge_status bio_attempt_front_merge(struct request *req,
+ struct bio *bio,
+ unsigned int nr_segs)
 {
const int ff = bio->bi_opf & REQ_FAILFAST_MASK;
 
if (!ll_front_merge_fn(req, bio, nr_segs))
-   return false;
+   return BIO_MERGE_FAILED;
 
trace_block_bio_frontmerge(req->q, req, bio);
rq_qos_merge(req->q, req, bio);
@@ -954,11 +956,12 @@ bool bio_attempt_front_merge(struct request *req, struct 
bio *bio,
bio_crypt_do_front_merge(req, bio);
 
blk_account_io_merge_bio(req);
-   return true;
+   return BIO_MERGE_OK;
 }
 
-bool bio_attempt_discard_merge(struct request_queue *q, struct request *req,
-   struct bio *bio)
+enum bio_merge_status bio_attempt_discard_merge(struct request_queue *q,
+   struct request *req,
+   struct bio *bio)
 {
unsigned short segments = blk_rq_nr_discard_segments(req);
 
@@ -976,10 +979,39 @@ bool bio_attempt_discard_merge(struct request_queue *q, 
struct request *req,
req->nr_phys_segments = segments + 1;
 
blk_account_io_merge_bio(req);
-   return true;
+   return BIO_MERGE_OK;
 no_merge:
req_set_nomerge(q, req);
-   return false;
+   return BIO_MERGE_FAILED;
+}
+
+static enum bio_merge_status blk_attempt_bio_merge(struct request_queue *q,
+  struct request *rq,
+  struct bio *bio,
+  unsigned int nr_segs,
+  bool sched_allow_merge)
+{
+   if (!blk_rq_merge_ok(rq, bio))
+   return BIO_MERGE_NONE;
+
+   switch (blk_try_merge(rq, bio)) {
+   case ELEVATOR_BACK_MERGE:
+   if (!sched_allow_merge ||
+   (sched_allow_merge && blk_mq_sched_allow_merge(q, rq, bio)))
+   return bio_attempt_back_merge(rq, bio, nr_segs);
+   break;
+   case ELEVATOR_FRONT_MERGE:
+   if (!sched_allow_merge ||
+   (sched_allow_merge && blk_mq_sched_allow_merge(q, rq, bio)))
+   return bio_attempt_front_merge(rq, bio, nr_segs);
+   break;
+   case ELEVATOR_DISCARD_MERGE:
+   return bio_attempt_discard_merge(q, rq, bio);
+   default:
+   return BIO_MERGE_NONE;
+   }
+
+   return BIO_MERGE_FAILED;
 }
 
 /**
@@ -1018,8 +1050,6 @@ bool blk_attempt_plug_merge(struct request_queue *q, 
struct bio *bio,
plug_list = &plug->mq_list;
 
list_for_each_entry_reverse(rq, plug_list, queuelist) {
-   bool merged = false;
-
if (rq->q == q && same_queue_rq) {
/*
 * Only blk-mq multiple hardware queues case checks the
@@ -1029,24 +1059,11 @@ bool blk_attempt_plug_merge(struct request_queue *q, 
struct bio *bio,
*same_queue_rq = rq;
}
 
-   if (rq->q != q || !blk_rq_merge_ok(rq, bio))
+   if (rq->q != q)
continue;
 
-   switch (blk_try_merge(rq, bio)) {
-   case 

[PATCH v3 4/4] block: Remove blk_mq_attempt_merge() function

2020-08-27 Thread Baolin Wang
The small blk_mq_attempt_merge() function is only called by
__blk_mq_sched_bio_merge(), just open code it.

Signed-off-by: Baolin Wang 
Reviewed-by: Christoph Hellwig 
---
 block/blk-mq-sched.c | 44 
 1 file changed, 16 insertions(+), 28 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 94db0c9..205d971 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -391,28 +391,6 @@ bool blk_mq_sched_try_merge(struct request_queue *q, 
struct bio *bio,
 }
 EXPORT_SYMBOL_GPL(blk_mq_sched_try_merge);
 
-/*
- * Reverse check our software queue for entries that we could potentially
- * merge with. Currently includes a hand-wavy stop count of 8, to not spend
- * too much time checking for merges.
- */
-static bool blk_mq_attempt_merge(struct request_queue *q,
-struct blk_mq_hw_ctx *hctx,
-struct blk_mq_ctx *ctx, struct bio *bio,
-unsigned int nr_segs)
-{
-   enum hctx_type type = hctx->type;
-
-   lockdep_assert_held(&ctx->lock);
-
-   if (blk_bio_list_merge(q, &ctx->rq_lists[type], bio, nr_segs)) {
-   ctx->rq_merged++;
-   return true;
-   }
-
-   return false;
-}
-
 bool __blk_mq_sched_bio_merge(struct request_queue *q, struct bio *bio,
unsigned int nr_segs)
 {
@@ -426,14 +404,24 @@ bool __blk_mq_sched_bio_merge(struct request_queue *q, 
struct bio *bio,
return e->type->ops.bio_merge(hctx, bio, nr_segs);
 
type = hctx->type;
-   if ((hctx->flags & BLK_MQ_F_SHOULD_MERGE) &&
-   !list_empty_careful(&ctx->rq_lists[type])) {
-   /* default per sw-queue merge */
-   spin_lock(&ctx->lock);
-   ret = blk_mq_attempt_merge(q, hctx, ctx, bio, nr_segs);
-   spin_unlock(&ctx->lock);
+   if (!(hctx->flags & BLK_MQ_F_SHOULD_MERGE) ||
+   list_empty_careful(&ctx->rq_lists[type]))
+   return false;
+
+   /* default per sw-queue merge */
+   spin_lock(&ctx->lock);
+   /*
+* Reverse check our software queue for entries that we could
+* potentially merge with. Currently includes a hand-wavy stop
+* count of 8, to not spend too much time checking for merges.
+*/
+   if (blk_bio_list_merge(q, &ctx->rq_lists[type], bio, nr_segs)) {
+   ctx->rq_merged++;
+   ret = true;
}
 
+   spin_unlock(&ctx->lock);
+
return ret;
 }
 
-- 
1.8.3.1



[PATCH v3 2/4] block: Move blk_mq_bio_list_merge() into blk-merge.c

2020-08-27 Thread Baolin Wang
Move the blk_mq_bio_list_merge() into blk-merge.c and
rename it as a generic name.

Signed-off-by: Baolin Wang 
---
 block/blk-merge.c  | 44 
 block/blk-mq-sched.c   | 46 +-
 block/blk.h|  2 ++
 block/kyber-iosched.c  |  2 +-
 include/linux/blk-mq.h |  2 --
 5 files changed, 48 insertions(+), 48 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 3aa2de5..b09e9fc 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -1052,3 +1052,47 @@ bool blk_attempt_plug_merge(struct request_queue *q, 
struct bio *bio,
 
return false;
 }
+
+/*
+ * Iterate list of requests and see if we can merge this bio with any
+ * of them.
+ */
+bool blk_bio_list_merge(struct request_queue *q, struct list_head *list,
+   struct bio *bio, unsigned int nr_segs)
+{
+   struct request *rq;
+   int checked = 8;
+
+   list_for_each_entry_reverse(rq, list, queuelist) {
+   bool merged = false;
+
+   if (!checked--)
+   break;
+
+   if (!blk_rq_merge_ok(rq, bio))
+   continue;
+
+   switch (blk_try_merge(rq, bio)) {
+   case ELEVATOR_BACK_MERGE:
+   if (blk_mq_sched_allow_merge(q, rq, bio))
+   merged = bio_attempt_back_merge(rq, bio,
+   nr_segs);
+   break;
+   case ELEVATOR_FRONT_MERGE:
+   if (blk_mq_sched_allow_merge(q, rq, bio))
+   merged = bio_attempt_front_merge(rq, bio,
+   nr_segs);
+   break;
+   case ELEVATOR_DISCARD_MERGE:
+   merged = bio_attempt_discard_merge(q, rq, bio);
+   break;
+   default:
+   continue;
+   }
+
+   return merged;
+   }
+
+   return false;
+}
+EXPORT_SYMBOL_GPL(blk_bio_list_merge);
diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index d2790e5..82acff9 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -392,50 +392,6 @@ bool blk_mq_sched_try_merge(struct request_queue *q, 
struct bio *bio,
 EXPORT_SYMBOL_GPL(blk_mq_sched_try_merge);
 
 /*
- * Iterate list of requests and see if we can merge this bio with any
- * of them.
- */
-bool blk_mq_bio_list_merge(struct request_queue *q, struct list_head *list,
-  struct bio *bio, unsigned int nr_segs)
-{
-   struct request *rq;
-   int checked = 8;
-
-   list_for_each_entry_reverse(rq, list, queuelist) {
-   bool merged = false;
-
-   if (!checked--)
-   break;
-
-   if (!blk_rq_merge_ok(rq, bio))
-   continue;
-
-   switch (blk_try_merge(rq, bio)) {
-   case ELEVATOR_BACK_MERGE:
-   if (blk_mq_sched_allow_merge(q, rq, bio))
-   merged = bio_attempt_back_merge(rq, bio,
-   nr_segs);
-   break;
-   case ELEVATOR_FRONT_MERGE:
-   if (blk_mq_sched_allow_merge(q, rq, bio))
-   merged = bio_attempt_front_merge(rq, bio,
-   nr_segs);
-   break;
-   case ELEVATOR_DISCARD_MERGE:
-   merged = bio_attempt_discard_merge(q, rq, bio);
-   break;
-   default:
-   continue;
-   }
-
-   return merged;
-   }
-
-   return false;
-}
-EXPORT_SYMBOL_GPL(blk_mq_bio_list_merge);
-
-/*
  * Reverse check our software queue for entries that we could potentially
  * merge with. Currently includes a hand-wavy stop count of 8, to not spend
  * too much time checking for merges.
@@ -449,7 +405,7 @@ static bool blk_mq_attempt_merge(struct request_queue *q,
 
lockdep_assert_held(&ctx->lock);
 
-   if (blk_mq_bio_list_merge(q, &ctx->rq_lists[type], bio, nr_segs)) {
+   if (blk_bio_list_merge(q, &ctx->rq_lists[type], bio, nr_segs)) {
ctx->rq_merged++;
return true;
}
diff --git a/block/blk.h b/block/blk.h
index 49e2928..d6152d2 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -177,6 +177,8 @@ bool bio_attempt_discard_merge(struct request_queue *q, 
struct request *req,
struct bio *bio);
 bool blk_attempt_plug_merge(struct request_queue *q, struct bio *bio,
unsigned int nr_segs, struct request **same_queue_rq);
+bool blk_bio_list_merge(struct request_queue *q, struct list_head *list,
+   struct bio *bio, unsigned int nr_segs);
 
 void blk_account_io_start(struct request *req);
 void 

[PATCH v3 0/4] Some clean-ups for bio merge

2020-08-27 Thread Baolin Wang
Hi,

There is some duplicated code when trying to merge a bio from the plugged
list and the software queue, so this patch set does some clean-ups in the
bio merge paths. Any comments are welcome. Thanks.

Changes from v2:
 - Split blk_mq_bio_list_merge() moving into a separate patch.
 - Add reviewed-by tag from Christoph.
 - Coding style improvement.

Changes from v1:
 - Drop patch 2 and patch 5 in v1 patch set.
 - Add reviewed-by tag from Christoph.
 - Move blk_mq_bio_list_merge() into blk-merge.c and rename it.
 - Some coding style improvements.


Baolin Wang (4):
  block: Move bio merge related functions into blk-merge.c
  block: Move blk_mq_bio_list_merge() into blk-merge.c
  block: Add a new helper to attempt to merge a bio
  block: Remove blk_mq_attempt_merge() function

 block/blk-core.c   | 156 -
 block/blk-merge.c  | 203 +
 block/blk-mq-sched.c   |  94 +--
 block/blk.h|  23 --
 block/kyber-iosched.c  |   2 +-
 include/linux/blk-mq.h |   2 -
 6 files changed, 240 insertions(+), 240 deletions(-)

-- 
1.8.3.1



Re: [PATCH 3/4] Input: twl4030_keypad - Fix handling of platform_get_irq() error

2020-08-27 Thread kernel test robot
Hi Krzysztof,

I love your patch! Perhaps something to improve:

[auto build test WARNING on input/next]
[also build test WARNING on sunxi/sunxi/for-next linus/master v5.9-rc2 
next-20200827]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Krzysztof-Kozlowski/Input-ep93xx_keypad-Fix-handling-of-platform_get_irq-error/20200827-152706
base:   https://git.kernel.org/pub/scm/linux/kernel/git/dtor/input.git next
config: i386-randconfig-m021-20200828 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-15) 9.3.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

smatch warnings:
drivers/input/keyboard/twl4030_keypad.c:379 twl4030_kp_probe() warn: unsigned 
'kp->irq' is never less than zero.

# 
https://github.com/0day-ci/linux/commit/d83af6799bafdf8f1f84ddfc48876f621735963b
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Krzysztof-Kozlowski/Input-ep93xx_keypad-Fix-handling-of-platform_get_irq-error/20200827-152706
git checkout d83af6799bafdf8f1f84ddfc48876f621735963b
vim +379 drivers/input/keyboard/twl4030_keypad.c

   318  
   319  /*
   320   * Registers keypad device with input subsystem
   321   * and configures TWL4030 keypad registers
   322   */
   323  static int twl4030_kp_probe(struct platform_device *pdev)
   324  {
   325  struct twl4030_keypad_data *pdata = 
dev_get_platdata(&pdev->dev);
   326  const struct matrix_keymap_data *keymap_data = NULL;
   327  struct twl4030_keypad *kp;
   328  struct input_dev *input;
   329  u8 reg;
   330  int error;
   331  
   332  kp = devm_kzalloc(&pdev->dev, sizeof(*kp), GFP_KERNEL);
   333  if (!kp)
   334  return -ENOMEM;
   335  
   336  input = devm_input_allocate_device(&pdev->dev);
   337  if (!input)
   338  return -ENOMEM;
   339  
   340  /* get the debug device */
   341  kp->dbg_dev = &pdev->dev;
   342  kp->input   = input;
   343  
   344  /* setup input device */
   345  input->name = "TWL4030 Keypad";
   346  input->phys = "twl4030_keypad/input0";
   347  
   348  input->id.bustype   = BUS_HOST;
   349  input->id.vendor= 0x0001;
   350  input->id.product   = 0x0001;
   351  input->id.version   = 0x0003;
   352  
   353  if (pdata) {
   354  if (!pdata->rows || !pdata->cols || 
!pdata->keymap_data) {
   355  dev_err(&pdev->dev, "Missing platform_data\n");
   356  return -EINVAL;
   357  }
   358  
   359  kp->n_rows = pdata->rows;
   360  kp->n_cols = pdata->cols;
   361  kp->autorepeat = pdata->rep;
   362  keymap_data = pdata->keymap_data;
   363  } else {
   364  error = matrix_keypad_parse_properties(&pdev->dev, 
&kp->n_rows,
   365 &kp->n_cols);
   366  if (error)
   367  return error;
   368  
   369  kp->autorepeat = true;
   370  }
   371  
   372  if (kp->n_rows > TWL4030_MAX_ROWS || kp->n_cols > 
TWL4030_MAX_COLS) {
   373  dev_err(&pdev->dev,
   374  "Invalid rows/cols amount specified in 
platform/devicetree data\n");
   375  return -EINVAL;
   376  }
   377  
   378  kp->irq = platform_get_irq(pdev, 0);
 > 379  if (kp->irq < 0)
   380  return kp->irq;
   381  
   382  error = matrix_keypad_build_keymap(keymap_data, NULL,
   383 TWL4030_MAX_ROWS,
   384 1 << TWL4030_ROW_SHIFT,
   385 kp->keymap, input);
   386  if (error) {
   387  dev_err(kp->dbg_dev, "Failed to build keymap\n");
   388  return error;
   389  }
   390  
   391  input_set_capability(input, EV_MSC, MSC_SCAN);
   392  /* Enable auto repeat feature of Linux input subsystem */
   393  if (kp->autorepeat)
   394  __set_bit(EV_REP, input->evbit);
   395  
   396  error = input_register_device(input);
   397  if (error) {
   398  dev_err(kp->dbg_dev,
   399  "Unable to register twl4030 keypad device\n");
   400   

[git pull] drm fixes for 5.9-rc3

2020-08-27 Thread Dave Airlie
Hey Linus,

As expected a bit of an rc3 uptick, amdgpu and msm are the main ones,
one msm patch was from the merge window, but had dependencies and we
dropped it until the other tree had landed. Otherwise it's a couple of
fixes for core, and etnaviv, and single i915, exynos, omap fixes.

I'm still tracking the Sandybridge gpu relocations issue, if we don't
see much movement I might just queue up the reverts. I'll talk to
Daniel next week once he's back from holidays.

Dave.

drm-fixes-2020-08-28:
drm fixes for 5.9-rc3

core:
- Take modeset bkl for legacy drivers.

dp_mst:
- Allow null crtc in dp_mst.

i915:
- Fix command parser desc matching with masks

amdgpu:
- Misc display fixes
- Backlight fixes
- MPO fix for DCN1
- Fixes for Sienna Cichlid
- Fixes for Navy Flounder
- Vega SW CTF fixes
- SMU fix for Raven
- Fix a possible overflow in INFO ioctl
- Gfx10 clockgating fix

msm:
- opp/bw scaling patch followup
- frequency restoring fux
- vblank in atomic commit fix
- dpu modesetting fixes
- fencing fix

etnaviv:
- scheduler interaction fix
- gpu init regression fix

exynos:
- Just drop __iommu annotation to fix sparse warning.

omap:
- locking state fix.

The following changes since commit d012a7190fc1fd72ed48911e77ca97ba4521bccd:

  Linux 5.9-rc2 (2020-08-23 14:08:43 -0700)

are available in the Git repository at:

  git://anongit.freedesktop.org/drm/drm tags/drm-fixes-2020-08-28

for you to fetch changes up to 2a3f9da32de4616f0104209194e9bd3dfae092c9:

  Merge tag 'drm-intel-fixes-2020-08-27' of
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes (2020-08-28
11:02:53 +1000)


drm fixes for 5.9-rc3

core:
- Take modeset bkl for legacy drivers.

dp_mst:
- Allow null crtc in dp_mst.

i915:
- Fix command parser desc matching with masks

amdgpu:
- Misc display fixes
- Backlight fixes
- MPO fix for DCN1
- Fixes for Sienna Cichlid
- Fixes for Navy Flounder
- Vega SW CTF fixes
- SMU fix for Raven
- Fix a possible overflow in INFO ioctl
- Gfx10 clockgating fix

msm:
- opp/bw scaling patch followup
- frequency restoring fix
- vblank in atomic commit fix
- dpu modesetting fixes
- fencing fix

etnaviv:
- scheduler interaction fix
- gpu init regression fix

exynos:
- Just drop __iommu annotation to fix sparse warning.

omap:
- locking state fix.


Alex Deucher (1):
  drm/amdgpu: Fix buffer overflow in INFO ioctl

Alexander Monakov (1):
  drm/amd/display: use correct scale for actual_brightness

Bhawanpreet Lakha (1):
  drm/dp_mst: Don't return error code when crtc is null

Brandon Syu (1):
  drm/amd/display: Keep current gain when ABM disable immediately

Christian Gmeiner (1):
  drm/etnaviv: fix external abort seen on GC600 rev 0x19

Daniel Vetter (1):
  drm/modeset-lock: Take the modeset BKL for legacy drivers

Dave Airlie (6):
  Merge tag 'exynos-drm-fixes-v5.9-rc3' of
git://git.kernel.org/.../daeinki/drm-exynos into drm-fixes
  Merge branch 'etnaviv/fixes' of
https://git.pengutronix.de/git/lst/linux into drm-fixes
  Merge tag 'drm-msm-fixes-2020-08-24' of
https://gitlab.freedesktop.org/drm/msm into drm-fixes
  Merge tag 'amd-drm-fixes-5.9-2020-08-26' of
git://people.freedesktop.org/~agd5f/linux into drm-fixes
  Merge tag 'drm-misc-fixes-2020-08-26' of
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
  Merge tag 'drm-intel-fixes-2020-08-27' of
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

Dinghao Liu (1):
  drm/amd/display: Fix memleak in amdgpu_dm_mode_config_init

Dmitry Baryshkov (1):
  drm/msm/a6xx: fix gmu start on newer firmware

Evan Quan (4):
  drm/amd/pm: correct Vega10 swctf limit setting
  drm/amd/pm: correct Vega12 swctf limit setting
  drm/amd/pm: correct Vega20 swctf limit setting
  drm/amd/pm: correct the thermal alert temperature limit settings

Furquan Shaikh (1):
  drivers: gpu: amd: Initialize amdgpu_dm_backlight_caps object to
0 in amdgpu_dm_update_backlight_caps

Jaehyun Chung (1):
  drm/amd/display: Revert HDCP disable sequence change

Jiansong Chen (5):
  drm/amd/pm: enable run_btc callback for sienna_cichlid
  drm/amd/pm: set VCN pg per instances
  drm/amdgpu/gfx10: refine mgcg setting
  drm/amdgpu: use MODE1 reset for navy_flounder by default
  drm/amdgpu: disable runtime pm for navy_flounder

Jonathan Marek (1):
  drm/msm/a6xx: fix frequency not always being restored on GMU resume

Kalyan Thota (2):
  drm/msm/dpu: Fix reservation failures in modeset
  drm/msm/dpu: Fix scale params in plane validation

Krishna Manikandan (1):
  drm/msm: add shutdown support for display platform_driver

Lucas Stach (1):
  drm/etnaviv: always start/stop scheduler in timeout processing

Maarten Lankhorst (1):
  Merge tag 'v5.9-rc2' into drm-misc-fixes

Marek Szyprowski (1):
  drm/exynos: gem: Fix sparse warning

Mika Kuoppala 

Re: [External] Re: [PATCH] mm/hugetlb: Fix a race between hugetlb sysctl handlers

2020-08-27 Thread Muchun Song
On Fri, Aug 28, 2020 at 5:51 AM Mike Kravetz  wrote:
>
> On 8/25/20 7:47 PM, Muchun Song wrote:
> >
> > CPU0: CPU1:
> >   proc_sys_write
> > hugetlb_sysctl_handler  proc_sys_call_handler
> > hugetlb_sysctl_handler_common hugetlb_sysctl_handler
> >   table->data =
> > hugetlb_sysctl_handler_common
> >   table->data = 
> > proc_doulongvec_minmax
> >   do_proc_doulongvec_minmax sysctl_head_finish
> > __do_proc_doulongvec_minmax
> >   i = table->data;
> >   *i = val; // corrupt CPU1 stack
>
> Thanks Muchun!
> Can you please add this to the commit message.

OK, I will do that. Thanks.

>
> Also, when looking closer at the patch I do not think setting table->maxlen
> is necessary in these routines.  maxlen is set when the hugetlb ctl_table
> entries are defined and initialized.  This is not something you introduced.
> The unnecessary assignments are in the existing code.  However, there is no
> need to carry them forward.

Yeah, I agree with you. I will remove the unnecessary assignment of
table->maxlen.

>
> --
> Mike Kravetz



-- 
Yours,
Muchun


Re: [PATCH v1 01/10] powerpc/pseries/iommu: Replace hard-coded page shift

2020-08-27 Thread Alexey Kardashevskiy



On 28/08/2020 01:32, Leonardo Bras wrote:
> Hello Alexey, thank you for this feedback!
> 
> On Sat, 2020-08-22 at 19:33 +1000, Alexey Kardashevskiy wrote:
>>> +#define TCE_RPN_BITS   52  /* Bits 0-51 represent 
>>> RPN on TCE */
>>
>> Ditch this one and use MAX_PHYSMEM_BITS instead? I am pretty sure this
>> is the actual limit.
> 
> I understand this MAX_PHYSMEM_BITS(51) comes from the maximum physical memory 
> addressable in the machine. IIUC, it means we can access physical address up 
> to (1ul << MAX_PHYSMEM_BITS). 
> 
> This 52 comes from PAPR "Table 9. TCE Definition" which defines bits
> 0-51 as the RPN. By looking at code, I understand that it means we may input 
> any address < (1ul << 52) to TCE.
> 
> In practice, MAX_PHYSMEM_BITS should be enough as of today, because I suppose 
> we can't ever pass a physical page address over 
> (1ul << 51), and TCE accepts up to (1ul << 52).
> But if we ever increase MAX_PHYSMEM_BITS, it doesn't necessarily mean that 
> TCE_RPN_BITS will also be increased, so I think they are independent values. 
> 
> Does it make sense? Please let me know if I am missing something.

The underlying hardware is PHB3/4 about which the IODA2 Version 2.4
6Apr2012.pdf spec says:

"The number of most significant RPN bits implemented in the TCE is
dependent on the max size of System Memory to be supported by the platform".

IODA3 is the same on this matter.

This is MAX_PHYSMEM_BITS and PHB itself does not have any other limits
on top of that. So the only real limit comes from MAX_PHYSMEM_BITS and
where TCE_RPN_BITS comes from exactly - I have no idea.


> 
>>
>>
>>> +#define TCE_RPN_MASK(ps)   ((1ul << (TCE_RPN_BITS - (ps))) - 1)
>>>  #define TCE_VALID  0x800   /* TCE valid */
>>>  #define TCE_ALLIO  0x400   /* TCE valid for all lpars */
>>>  #define TCE_PCI_WRITE  0x2 /* write from PCI 
>>> allowed */
>>> diff --git a/arch/powerpc/platforms/pseries/iommu.c 
>>> b/arch/powerpc/platforms/pseries/iommu.c
>>> index e4198700ed1a..8fe23b7dff3a 100644
>>> --- a/arch/powerpc/platforms/pseries/iommu.c
>>> +++ b/arch/powerpc/platforms/pseries/iommu.c
>>> @@ -107,6 +107,9 @@ static int tce_build_pSeries(struct iommu_table *tbl, 
>>> long index,
>>> u64 proto_tce;
>>> __be64 *tcep;
>>> u64 rpn;
>>> +   const unsigned long tceshift = tbl->it_page_shift;
>>> +   const unsigned long pagesize = IOMMU_PAGE_SIZE(tbl);
>>> +   const u64 rpn_mask = TCE_RPN_MASK(tceshift);
>>
>> Using IOMMU_PAGE_SIZE macro for the page size and not using
>> IOMMU_PAGE_MASK for the mask - this inconsistency makes my small brain
>> explode :) I understand the history but man... Oh well, ok.
>>
> 
> Yeah, it feels kind of weird after two IOMMU related consts. :)
> But sure IOMMU_PAGE_MASK() would not be useful here :)
> 
> And this kind of got me thinking:
>>> +   rpn = __pa(uaddr) >> tceshift;
>>> +   *tcep = cpu_to_be64(proto_tce | (rpn & rpn_mask) << tceshift);
> Why not:
>   rpn_mask =  TCE_RPN_MASK(tceshift) << tceshift;


A mask for a page number (but not the address!) hurts my brain. Masks
are good against addresses, but numbers should already have all bits
adjusted imho. Maybe it is just me :-/


>   
>   rpn = __pa(uaddr) & rpn_mask;
>   *tcep = cpu_to_be64(proto_tce | rpn)
> 
> I am usually afraid of changing stuff like this, but I think it's safe.
> 
>> Good, otherwise. Thanks,
> 
> Thank you for reviewing!
>  
> 
> 

-- 
Alexey


RE: drivers/net/wireless/realtek/rtw88/rtw8821c.c:71:8: warning: type qualifiers ignored on function return type

2020-08-27 Thread Tony Chuang
Hi Andy

> + linux-wireless
> 
> kernel test robot  writes:
> 
> > Hi Tzu-En,
> >
> > First bad commit (maybe != root cause):
> >
> > tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> master
> > head:   15bc20c6af4ceee97a1f90b43c0e386643c071b4
> > commit: f745eb9ca5bf823bc5c0f82a434cefb41c57844e rtw88: 8821c: Add
> 8821CE to Kconfig and Makefile
> > date:   6 weeks ago
> > config: arm-randconfig-r012-20200827 (attached as .config)
> > compiler: arm-linux-gnueabi-gcc (GCC) 9.3.0
> > reproduce (this is a W=1 build):
> > wget
> https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O
> ~/bin/make.cross
> > chmod +x ~/bin/make.cross
> > git checkout f745eb9ca5bf823bc5c0f82a434cefb41c57844e
> > # save the attached .config to linux build tree
> > COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0
> make.cross ARCH=arm
> >
> > If you fix the issue, kindly add following tag as appropriate
> > Reported-by: kernel test robot 
> >
> > All warnings (new ones prefixed by >>):
> >
> >>> drivers/net/wireless/realtek/rtw88/rtw8821c.c:71:8: warning: type
> qualifiers ignored on function return type [-Wignored-qualifiers]
> >   71 | static const u8 rtw8821c_get_swing_index(struct rtw_dev
> *rtwdev)
> >  |^
> 
> Tony, please check this.
> 

Andy, please send a patch to fix it.

Thanks,
Yen-Hsuan


RE: drivers/net/wireless/realtek/rtw88/pci.c:1477:5: warning: no previous prototype for 'rtw_pci_probe'

2020-08-27 Thread Tony Chuang
> + linux-wireless
> 
> kernel test robot  writes:
> 
> > Hi Zong-Zhe,
> >
> > FYI, the error/warning still remains.
> >
> > tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> master
> > head:   23ee3e4e5bd27bdbc0f1785eef7209ce872794c7
> > commit: 72f256c2b948622cc45ff8bc0456dd6039d8fe36 rtw88: extract:
> > export symbols about pci interface
> > date:   10 weeks ago
> > config: arc-randconfig-r026-20200725 (attached as .config)
> > compiler: arc-elf-gcc (GCC) 9.3.0
> > reproduce (this is a W=1 build):
> > wget
> > https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross
> > -O ~/bin/make.cross
> > chmod +x ~/bin/make.cross
> > git checkout 72f256c2b948622cc45ff8bc0456dd6039d8fe36
> > # save the attached .config to linux build tree
> > COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0
> make.cross ARCH=arc
> >
> > If you fix the issue, kindly add following tag as appropriate
> > Reported-by: kernel test robot 
> >
> > All warnings (new ones prefixed by >>):
> >
> >>> drivers/net/wireless/realtek/rtw88/pci.c:1477:5: warning: no
> >>> previous prototype for 'rtw_pci_probe' [-Wmissing-prototypes]
> > 1477 | int rtw_pci_probe(struct pci_dev *pdev,
> >  | ^
> >>> drivers/net/wireless/realtek/rtw88/pci.c:1557:6: warning: no
> >>> previous prototype for 'rtw_pci_remove' [-Wmissing-prototypes]
> > 1557 | void rtw_pci_remove(struct pci_dev *pdev)
> >  |  ^~
> >>> drivers/net/wireless/realtek/rtw88/pci.c:1579:6: warning: no
> >>> previous prototype for 'rtw_pci_shutdown' [-Wmissing-prototypes]
> > 1579 | void rtw_pci_shutdown(struct pci_dev *pdev)
> >  |  ^~~~
> 
> Tony, these are older warnings but please also check these.
> 

I think this warning can be ignored, as the commit was going to export
pci symbols for the follow-up patches to use, such as:

f56f08636dda rtw88: extract: make 8723d an individual kernel module
416e87fcc780 rtw88: extract: make 8822b an individual kernel module
ba0fbe236fb8 rtw88: extract: make 8822c an individual kernel module

And these patches were submitted and applied together.

Thanks,
Yen-Hsuan


[PATCH v9 2/2] phy: Add USB3 PHY support for Intel LGM SoC

2020-08-27 Thread Ramuthevar,Vadivel MuruganX
From: Ramuthevar Vadivel Murugan 

Add support for USB PHY on Intel LGM SoC.

Signed-off-by: Ramuthevar Vadivel Murugan 

Reviewed-by: Philipp Zabel 
---
 drivers/phy/Kconfig   |  10 ++
 drivers/phy/Makefile  |   1 +
 drivers/phy/phy-lgm-usb.c | 284 ++
 3 files changed, 295 insertions(+)
 create mode 100644 drivers/phy/phy-lgm-usb.c

diff --git a/drivers/phy/Kconfig b/drivers/phy/Kconfig
index de9362c25c07..83797b2e6406 100644
--- a/drivers/phy/Kconfig
+++ b/drivers/phy/Kconfig
@@ -49,6 +49,16 @@ config PHY_XGENE
help
  This option enables support for APM X-Gene SoC multi-purpose PHY.
 
+config USB_LGM_PHY
+   tristate "INTEL Lightning Mountain USB PHY Driver"
+   select USB_PHY
+   select REGULATOR
+   select REGULATOR_FIXED_VOLTAGE
+   help
+ Enable this to support Intel DWC3 PHY USB phy. This driver provides
+ interface to interact with USB GEN-II and USB 3.x PHY that is part
+ of the Intel network SOC.
+
 source "drivers/phy/allwinner/Kconfig"
 source "drivers/phy/amlogic/Kconfig"
 source "drivers/phy/broadcom/Kconfig"
diff --git a/drivers/phy/Makefile b/drivers/phy/Makefile
index c27408e4daae..6eb2916773c5 100644
--- a/drivers/phy/Makefile
+++ b/drivers/phy/Makefile
@@ -8,6 +8,7 @@ obj-$(CONFIG_GENERIC_PHY_MIPI_DPHY) += phy-core-mipi-dphy.o
 obj-$(CONFIG_PHY_LPC18XX_USB_OTG)  += phy-lpc18xx-usb-otg.o
 obj-$(CONFIG_PHY_XGENE)+= phy-xgene.o
 obj-$(CONFIG_PHY_PISTACHIO_USB)+= phy-pistachio-usb.o
+obj-$(CONFIG_USB_LGM_PHY)  += phy-lgm-usb.o
 obj-y  += allwinner/   \
   amlogic/ \
   broadcom/\
diff --git a/drivers/phy/phy-lgm-usb.c b/drivers/phy/phy-lgm-usb.c
new file mode 100644
index ..309c8f0e0724
--- /dev/null
+++ b/drivers/phy/phy-lgm-usb.c
@@ -0,0 +1,284 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Intel LGM USB PHY driver
+ *
+ * Copyright (C) 2020 Intel Corporation.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define CTRL1_OFFSET   0x14
+#define SRAM_EXT_LD_DONE   BIT(25)
+#define SRAM_INIT_DONE BIT(26)
+
+#define TCPC_OFFSET0x1014
+#define TCPC_MUX_CTL   GENMASK(1, 0)
+#define MUX_NC 0
+#define MUX_USB1
+#define MUX_DP 2
+#define MUX_USBDP  3
+#define TCPC_FLIPPED   BIT(2)
+#define TCPC_LOW_POWER_EN  BIT(3)
+#define TCPC_VALID BIT(4)
+#define TCPC_CONN  \
+   (TCPC_VALID | FIELD_PREP(TCPC_MUX_CTL, MUX_USB))
+#define TCPC_DISCONN   \
+   (TCPC_VALID | FIELD_PREP(TCPC_MUX_CTL, MUX_NC) | TCPC_LOW_POWER_EN)
+
+static const char *const PHY_RESETS[] = { "phy31", "phy", };
+static const char *const CTL_RESETS[] = { "apb", "ctrl", };
+
+struct tca_apb {
+   struct reset_control *resets[ARRAY_SIZE(PHY_RESETS)];
+   struct regulator *vbus;
+   struct work_struct wk;
+   struct usb_phy phy;
+
+   bool regulator_enabled;
+   bool phy_initialized;
+   bool connected;
+};
+
+static int get_flipped(struct tca_apb *ta, bool *flipped)
+{
+   union extcon_property_value property;
+   int ret;
+
+   ret = extcon_get_property(ta->phy.edev, EXTCON_USB_HOST,
+ EXTCON_PROP_USB_TYPEC_POLARITY, &property);
+   if (ret) {
+   dev_err(ta->phy.dev, "no polarity property from extcon\n");
+   return ret;
+   }
+
+   *flipped = property.intval;
+
+   return 0;
+}
+
+static int phy_init(struct usb_phy *phy)
+{
+   struct tca_apb *ta = container_of(phy, struct tca_apb, phy);
+   void __iomem *ctrl1 = phy->io_priv + CTRL1_OFFSET;
+   int val, ret, i;
+
+   if (ta->phy_initialized)
+   return 0;
+
+   for (i = 0; i < ARRAY_SIZE(PHY_RESETS); i++)
+   reset_control_deassert(ta->resets[i]);
+
+   ret = readl_poll_timeout(ctrl1, val, val & SRAM_INIT_DONE, 10, 10 * 1000);
+   if (ret) {
+   dev_err(ta->phy.dev, "SRAM init failed, 0x%x\n", val);
+   return ret;
+   }
+
+   writel(readl(ctrl1) | SRAM_EXT_LD_DONE, ctrl1);
+
+   ta->phy_initialized = true;
+   if (!ta->phy.edev) {
+   writel(TCPC_CONN, ta->phy.io_priv + TCPC_OFFSET);
+   return phy->set_vbus(phy, true);
+   }
+
+   schedule_work(&ta->wk);
+
+   return ret;
+}
+
+static void phy_shutdown(struct usb_phy *phy)
+{
+   struct tca_apb *ta = container_of(phy, struct tca_apb, phy);
+   int i;
+
+   if (!ta->phy_initialized)
+   return;
+
+   ta->phy_initialized = false;
+   flush_work(&ta->wk);
+   ta->phy.set_vbus(&ta->phy, false);
+
+   ta->connected = false;
+   writel(TCPC_DISCONN, 

Re: [PATCH v4 0/3] Mediatek pinctrl patch on mt8192

2020-08-27 Thread CK Hu
Hi, Linus:

On Thu, 2020-08-27 at 10:52 +0200, Linus Walleij wrote:
> On Mon, Aug 17, 2020 at 2:18 AM Zhiyong Tao  wrote:
> 
> > This series includes 3 patches:
> > 1.add pinctrl file on mt8192.
> > 2.add pinctrl binding document on mt8192.
> > 3.add pinctrl driver on MT8192.
> 
> Patches applied for v5.10!

I do not see these patches in your tree [1]. Have you applied them? I
would like to pick these patches from your tree.

[1]
https://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl.git/

Regards,
CK

> Thanks!
> Linus Walleij



[PATCH v9 1/2] dt-bindings: phy: Add USB PHY support for Intel LGM SoC

2020-08-27 Thread Ramuthevar,Vadivel MuruganX
From: Ramuthevar Vadivel Murugan 

Add the dt-schema to support USB PHY on Intel LGM SoC

Signed-off-by: Ramuthevar Vadivel Murugan 

Reviewed-by: Rob Herring 
---
 .../devicetree/bindings/phy/intel,lgm-usb-phy.yaml | 58 ++
 1 file changed, 58 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/phy/intel,lgm-usb-phy.yaml

diff --git a/Documentation/devicetree/bindings/phy/intel,lgm-usb-phy.yaml 
b/Documentation/devicetree/bindings/phy/intel,lgm-usb-phy.yaml
new file mode 100644
index ..ce62c0b94daf
--- /dev/null
+++ b/Documentation/devicetree/bindings/phy/intel,lgm-usb-phy.yaml
@@ -0,0 +1,58 @@
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/phy/intel,lgm-usb-phy.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Intel LGM USB PHY Device Tree Bindings
+
+maintainers:
+  - Vadivel Murugan Ramuthevar 
+
+properties:
+  compatible:
+const: intel,lgm-usb-phy
+
+  reg:
+maxItems: 1
+
+  clocks:
+maxItems: 1
+
+  resets:
+items:
+  - description: USB PHY and Host controller reset
+  - description: APB BUS reset
+  - description: General Hardware reset
+
+  reset-names:
+items:
+  - const: phy
+  - const: apb
+  - const: phy31
+
+  "#phy-cells":
+const: 0
+
+required:
+  - compatible
+  - clocks
+  - reg
+  - resets
+  - reset-names
+  - "#phy-cells"
+
+additionalProperties: false
+
+examples:
+  - |
+usb-phy@e7e0 {
+compatible = "intel,lgm-usb-phy";
+reg = <0xe7e0 0x1>;
+clocks = < 153>;
+resets = < 0x70 0x24>,
+ < 0x70 0x26>,
+ < 0x70 0x28>;
+reset-names = "phy", "apb", "phy31";
+#phy-cells = <0>;
+};
-- 
2.11.0



[PATCH v9 0/2] phy: Add USB PHY support on Intel LGM SoC

2020-08-27 Thread Ramuthevar,Vadivel MuruganX
The USB PHY is optimized for low power dissipation while active, idle, or
on standby.
It requires minimal external components (a single resistor) for best operation.
It supports 10/5-Gbps high-speed data transmission rates through a 3-m USB 3.x
cable.
---
v9:
  - Vinod review comments update
  - remove depends on USB_SUPPORT
  - replace ret variable by 0 in return statement
  - replace dev_info by dev_dbg
  - handle ret and extcon_get_state separately
v8-resend:
  - Correct the typo error in my previous patch
v8:
  - Rebase to V5.9-rc1
v7:
  - No Change
v6:
  - No Change
v5:
  - As per Felipe and Greg's suggestion usb phy driver reviewed patches
changed the folder from drivers/usb/phy to drivers/phy
  - Reviewed-By tag added in commit message
v4:
  - Andy's review comments addressed
  - drop the excess error debug prints
  - error check optimized
  - merge the split line to one line
v3:
  - Andy's review comments update
  - hardcode return value changed to actual return value from the callee
  - add error check is fixed according to the above
  - correct the redundant assignment
  - combine the split line into one line
v2:
  - Address Phillip's review comments
  - replace devm_reset_control_get() by devm_reset_control_get_exclusive()
  - re-design the assert and deassert function calls as per review comments
  - address kbuild bot warnings
  - add the comments
v1:
  - initial version
---
dt-bindings: usb: Add USB PHY support for Intel LGM SoC
v9:
  - No Change
v8-resend:
  - No change
v8:
  - No Change
v7:
  - Fixed the bot issue: usb-phy@e7e0: '#phy-cells' is a required property
v6:
  - Fixed the bot issue.
  - replace node-name by usb-phy@ in example
v5:
  - Reviewed-By tag added
v4:
  - No Change
v3:
  - No Change
v2:
  - No Change
v1:
  - initial version
 
Ramuthevar Vadivel Murugan (2):
  dt-bindings: phy: Add USB PHY support for Intel LGM SoC
  phy: Add USB3 PHY support for Intel LGM SoC

 .../devicetree/bindings/phy/intel,lgm-usb-phy.yaml |  58 +
 drivers/phy/Kconfig|  10 +
 drivers/phy/Makefile   |   1 +
 drivers/phy/phy-lgm-usb.c  | 281 +
 4 files changed, 350 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/phy/intel,lgm-usb-phy.yaml
 create mode 100644 drivers/phy/phy-lgm-usb.c

-- 
2.11.0



[PATCH v3] Input: elants_i2c - Report resolution of ABS_MT_TOUCH_MAJOR by FW information.

2020-08-27 Thread Johnny Chuang
This patch adds a new behavior to report touch major resolution
based on information provided by firmware.

During initialization, the driver acquires touch information from the touch IC.
It contains one byte holding the resolution value of ABS_MT_TOUCH_MAJOR.
The touch driver reports the touch major resolution based on this information.

Signed-off-by: Johnny Chuang 
---
Changes in v2:
  - register a real resolution value from firmware,
instead of hardcoding resolution value as 1 by flag.
Changes in v3:
  - modify git log message from flag to real value.
  - modify driver comment from flag to real value.
---
 drivers/input/touchscreen/elants_i2c.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/input/touchscreen/elants_i2c.c 
b/drivers/input/touchscreen/elants_i2c.c
index b0bd5bb..661a3ee 100644
--- a/drivers/input/touchscreen/elants_i2c.c
+++ b/drivers/input/touchscreen/elants_i2c.c
@@ -151,6 +151,7 @@ struct elants_data {
 
bool wake_irq_enabled;
bool keep_power_in_suspend;
+   u8 report_major_resolution;
 
/* Must be last to be used for DMA operations */
u8 buf[MAX_PACKET_SIZE] cacheline_aligned;
@@ -459,6 +460,9 @@ static int elants_i2c_query_ts_info(struct elants_data *ts)
rows = resp[2] + resp[6] + resp[10];
cols = resp[3] + resp[7] + resp[11];
 
+   /* get report resolution value of ABS_MT_TOUCH_MAJOR */
+   ts->report_major_resolution = resp[16];
+
/* Process mm_to_pixel information */
error = elants_i2c_execute_command(client,
   get_osr_cmd, sizeof(get_osr_cmd),
@@ -1325,6 +1329,8 @@ static int elants_i2c_probe(struct i2c_client *client,
 0, MT_TOOL_PALM, 0, 0);
input_abs_set_res(ts->input, ABS_MT_POSITION_X, ts->x_res);
input_abs_set_res(ts->input, ABS_MT_POSITION_Y, ts->y_res);
+   if (ts->report_major_resolution > 0)
+   input_abs_set_res(ts->input, ABS_MT_TOUCH_MAJOR, 
ts->report_major_resolution);
 
touchscreen_parse_properties(ts->input, true, &ts->prop);
 
-- 
2.7.4



[PATCH] of: of_match_node: Make stub an inline function to avoid W=1 warnings

2020-08-27 Thread Andrew Lunn
When building without CONFIG_OF and with W=1, errors are given about unused
arrays of match data, because of_match_node is stubbed as a macro. The
compiler does not see that it takes parameters when not a stub, so it
generates warnings about unused variables. Replace the stub with an
inline function to avoid these false warnings.

Signed-off-by: Andrew Lunn 
---
 include/linux/of.h | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/include/linux/of.h b/include/linux/of.h
index 5cf7ae0465d1..e9838387e7d9 100644
--- a/include/linux/of.h
+++ b/include/linux/of.h
@@ -991,7 +991,12 @@ static inline int of_map_id(struct device_node *np, u32 id,
 }
 
 #define of_match_ptr(_ptr) NULL
-#define of_match_node(_matches, _node) NULL
+
+static inline const struct of_device_id *of_match_node(
+   const struct of_device_id *matches, const struct device_node *node)
+{
+   return NULL;
+}
 #endif /* CONFIG_OF */
 
 /* Default string compare functions, Allow arch asm/prom.h to override */
-- 
2.28.0
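
For illustration only, the difference between a macro stub and an inline-function stub can be sketched in plain user-space C. The types and names below (struct device_id, struct node, match_node_macro, match_node_inline, demo_ids, demo_probe) are hypothetical stand-ins, not the kernel definitions:

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; names are illustrative. */
struct device_id { const char *compatible; };
struct node { const char *name; };

/*
 * Macro stub: the arguments vanish during preprocessing, so a table
 * passed as _matches is never referenced and the compiler may warn
 * that the table is unused.
 */
#define match_node_macro(_matches, _node) NULL

/*
 * Inline stub: the arguments are real parameters, so the caller's
 * table counts as used even though the body ignores it.
 */
static inline const struct device_id *match_node_inline(
	const struct device_id *matches, const struct node *node)
{
	(void)matches;
	(void)node;
	return NULL;
}

static const struct device_id demo_ids[] = {
	{ "vendor,demo" },
	{ NULL },
};

/* Returns 1 when a match is found, 0 otherwise (always 0 with the stub). */
int demo_probe(const struct node *np)
{
	return match_node_inline(demo_ids, np) != NULL;
}
```

With the macro variant, demo_ids would disappear from the translation unit after preprocessing, which is the kind of situation the W=1 unused-variable warning reports.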



Re: [PATCH v8 2/8] powerpc/vdso: Remove __kernel_datapage_offset and simplify __get_datapage()

2020-08-27 Thread Michael Ellerman
Dmitry Safonov <0x7f454...@gmail.com> writes:
> Hello,
>
> On Wed, 26 Aug 2020 at 15:39, Michael Ellerman  wrote:
>> Christophe Leroy  writes:
> [..]
>> > arch_remap() gets replaced by vdso_remap()
>> >
>> > For arch_unmap(), I'm wondering how/what other architectures do, because
>> > powerpc seems to be the only one to erase the vdso context pointer when
>> > unmapping the vdso.
>>
>> Yeah. The original unmap/remap stuff was added for CRIU, which I thought
>> people tested on other architectures (more than powerpc even).
>>
>> Possibly no one really cares about vdso unmap though, vs just moving the
>> vdso.
>>
>> We added a test for vdso unmap recently because it happened to trigger a
>> KAUP failure, and someone actually hit it & reported it.
>
> You're right, CRIU cares much more about moving vDSO.
> It's done for each restoree and as on most setups vDSO is premapped and
> used by the application - it's actively tested.
> Speaking about vDSO unmap - that's concerning only for heterogeneous C/R,
> i.e. when an application is migrated from a system that uses vDSO to the one
> which doesn't - it's a much rarer scenario.
> (for arm it's !CONFIG_VDSO, for x86 it's `vdso=0` boot parameter)

Ah OK that explains it.

The case we hit of VDSO unmapping was some strange "library OS" thing
which had explicitly unmapped the VDSO, so also very rare.

> Looking at the code, it seems quite easy to provide/maintain .close() for
> vm_special_mapping. A bit harder to add a test from CRIU side
> (as glibc won't know on restore that it can't use vdso anymore),
> but totally not impossible.
>
>> Running that test on arm64 segfaults:
>>
>>   # ./sigreturn_vdso
>>   VDSO is at 0x8191f000-0x8191 (4096 bytes)
>>   Signal delivered OK with VDSO mapped
>>   VDSO moved to 0x8191a000-0x8191afff (4096 bytes)
>>   Signal delivered OK with VDSO moved
>>   Unmapped VDSO
>>   Remapped the stack executable
>>   [   48.556191] potentially unexpected fatal signal 11.
>>   [   48.556752] CPU: 0 PID: 140 Comm: sigreturn_vdso Not tainted 
>> 5.9.0-rc2-00057-g2ac69819ba9e #190
>>   [   48.556990] Hardware name: linux,dummy-virt (DT)
>>   [   48.557336] pstate: 60001000 (nZCv daif -PAN -UAO BTYPE=--)
>>   [   48.557475] pc : 8191a7bc
>>   [   48.557603] lr : 8191a7bc
>>   [   48.557697] sp : c13c9e90
>>   [   48.557873] x29: c13cb0e0 x28: 
>>   [   48.558201] x27:  x26: 
>>   [   48.558337] x25:  x24: 
>>   [   48.558754] x23:  x22: 
>>   [   48.558893] x21: 004009b0 x20: 
>>   [   48.559046] x19: 00400ff0 x18: 
>>   [   48.559180] x17: 817da300 x16: 00412010
>>   [   48.559312] x15:  x14: 001c
>>   [   48.559443] x13: 656c626174756365 x12: 7865206b63617473
>>   [   48.559625] x11: 0003 x10: 0101010101010101
>>   [   48.559828] x9 : 818afda8 x8 : 0081
>>   [   48.559973] x7 : 6174732065687420 x6 : 64657070616d6552
>>   [   48.560115] x5 : 0e0388bd x4 : 0040135d
>>   [   48.560270] x3 :  x2 : 0001
>>   [   48.560412] x1 : 0003 x0 : 004120b8
>>   Segmentation fault
>>   #
>>
>> So I think we need to keep the unmap hook. Maybe it should be handled by
>> the special_mapping stuff generically.
>
> I'll cook a patch for vm_special_mapping if you don't mind :-)

That would be great, thanks!

cheers


RE: [PATCH 1/1] iommu/vt-d: Use device numa domain if RHSA is missing

2020-08-27 Thread Tian, Kevin
> From: Lu Baolu 
> Sent: Thursday, August 27, 2020 1:57 PM
> 
> If there are multiple NUMA domains but the RHSA is missing in ACPI/DMAR
> table, we could default to the device NUMA domain as fall back. This also
> benefits the vIOMMU use case where only a single vIOMMU is exposed,
> hence
> no RHSA will be present but device numa domain can be correct.

This benefits vIOMMU, but it does not apply only to the single-vIOMMU
case. The logic still holds in multiple-vIOMMU cases as long as RHSA is
not provided.

> 
> Cc: Jacob Pan 
> Cc: Kevin Tian 
> Cc: Ashok Raj 
> Signed-off-by: Lu Baolu 
> ---
>  drivers/iommu/intel/iommu.c | 31 +--
>  1 file changed, 29 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index e0516d64d7a3..bce158468abf 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -700,12 +700,41 @@ static int
> domain_update_iommu_superpage(struct dmar_domain *domain,
>   return fls(mask);
>  }
> 
> +static int domain_update_device_node(struct dmar_domain *domain)
> +{
> + struct device_domain_info *info;
> + int nid = NUMA_NO_NODE;
> +
> + assert_spin_locked(&device_domain_lock);
> +
> + if (list_empty(&domain->devices))
> + return NUMA_NO_NODE;
> +
> + list_for_each_entry(info, &domain->devices, link) {
> + if (!info->dev)
> + continue;
> +
> + nid = dev_to_node(info->dev);
> + if (nid != NUMA_NO_NODE)
> + break;
> + }

There could be multiple device NUMA nodes, as devices within the
same domain may sit behind different IOMMUs. Of course there
is no perfect answer in such a situation, and this patch is still an
obvious improvement over the current always-on-node0 policy. But
a comment about this implication would be welcome.
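
As a rough user-space sketch of the "first valid node wins" policy discussed here (not the driver code; NO_NODE and first_device_node are hypothetical names, and the device list is reduced to an array of node ids):

```c
#define NO_NODE (-1)	/* stand-in for NUMA_NO_NODE */

/*
 * Return the node id of the first device that reports one, or NO_NODE
 * if none does. Devices later in the array are ignored even when they
 * sit on a different node, which is the implication raised above.
 */
int first_device_node(const int *dev_nodes, int count)
{
	int i;

	for (i = 0; i < count; i++) {
		if (dev_nodes[i] != NO_NODE)
			return dev_nodes[i];
	}

	return NO_NODE;
}
```

For devices on nodes { NO_NODE, 2, 1 } this picks node 2, although node 1 would be an equally defensible choice.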

> +
> + return nid;
> +}
> +
>  /* Some capabilities may be different across iommus */
>  static void domain_update_iommu_cap(struct dmar_domain *domain)
>  {
>   domain_update_iommu_coherency(domain);
>   domain->iommu_snooping =
> domain_update_iommu_snooping(NULL);
>   domain->iommu_superpage =
> domain_update_iommu_superpage(domain, NULL);
> +
> + /*
> +  * If RHSA is missing, we should default to the device numa domain
> +  * as fall back.
> +  */
> + if (domain->nid == NUMA_NO_NODE)
> + domain->nid = domain_update_device_node(domain);
>  }
> 
>  struct context_entry *iommu_context_addr(struct intel_iommu *iommu, u8
> bus,
> @@ -5086,8 +5115,6 @@ static struct iommu_domain
> *intel_iommu_domain_alloc(unsigned type)
>   if (type == IOMMU_DOMAIN_DMA)
>   intel_init_iova_domain(dmar_domain);
> 
> - domain_update_iommu_cap(dmar_domain);
> -

Is this intended or a mistake? If the former, it looks like a separate fix...

>   domain = &dmar_domain->domain;
>   domain->geometry.aperture_start = 0;
>   domain->geometry.aperture_end   =
> --
> 2.17.1



Re: [PATCH 09/10] sh: don't allow non-coherent DMA for NOMMU

2020-08-27 Thread Rich Felker
On Thu, Aug 27, 2020 at 10:00:48PM -0400, Rich Felker wrote:
> On Tue, Jul 14, 2020 at 02:18:55PM +0200, Christoph Hellwig wrote:
> > The code handling non-coherent DMA depends on being able to remap code
> > as non-cached.  But that can't be done without an MMU, so using this
> > option on NOMMU builds is broken.
> > 
> > Signed-off-by: Christoph Hellwig 
> > ---
> >  arch/sh/Kconfig | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig
> > index f8027eee08edae..337eb496c45a0a 100644
> > --- a/arch/sh/Kconfig
> > +++ b/arch/sh/Kconfig
> > @@ -61,6 +61,7 @@ config SUPERH
> > select MAY_HAVE_SPARSE_IRQ
> > select MODULES_USE_ELF_RELA
> > select NEED_SG_DMA_LENGTH
> > +   select NO_DMA if !MMU && !DMA_COHERENT
> > select NO_GENERIC_PCI_IOPORT_MAP if PCI
> > select OLD_SIGACTION
> > select OLD_SIGSUSPEND
> > @@ -135,7 +136,7 @@ config DMA_COHERENT
> > bool
> 
> This change broke SD card support on J2 because MMC_SPI spuriously
> depends on HAS_DMA. It looks like it can be fixed just by removing
> that dependency from drivers/mmc/host/Kconfig.

It can't. mmc_spi_probe fails with ENOMEM, probably due to trying to
do some DMA setup thing that's not going to be needed if the
underlying SPI device doesn't support/use DMA.

Rich


Re: [PATCH 09/10] sh: don't allow non-coherent DMA for NOMMU

2020-08-27 Thread Rich Felker
On Tue, Jul 14, 2020 at 02:18:55PM +0200, Christoph Hellwig wrote:
> The code handling non-coherent DMA depends on being able to remap code
> as non-cached.  But that can't be done without an MMU, so using this
> option on NOMMU builds is broken.
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  arch/sh/Kconfig | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig
> index f8027eee08edae..337eb496c45a0a 100644
> --- a/arch/sh/Kconfig
> +++ b/arch/sh/Kconfig
> @@ -61,6 +61,7 @@ config SUPERH
>   select MAY_HAVE_SPARSE_IRQ
>   select MODULES_USE_ELF_RELA
>   select NEED_SG_DMA_LENGTH
> + select NO_DMA if !MMU && !DMA_COHERENT
>   select NO_GENERIC_PCI_IOPORT_MAP if PCI
>   select OLD_SIGACTION
>   select OLD_SIGSUSPEND
> @@ -135,7 +136,7 @@ config DMA_COHERENT
>   bool

This change broke SD card support on J2 because MMC_SPI spuriously
depends on HAS_DMA. It looks like it can be fixed just by removing
that dependency from drivers/mmc/host/Kconfig.

Rich


Re: [PATCH v11 5/5] kdump: update Documentation about crashkernel

2020-08-27 Thread chenzhou
Hi Catalin,


On 2020/8/19 20:03, Dave Young wrote:
> On 08/18/20 at 03:07pm, chenzhou wrote:
>>
>> On 2020/8/10 14:03, Dave Young wrote:
>>> Hi,
>>>
> Previously I remember we talked about to use similar logic as X86, but I
> remember you mentioned on some arm64 platform there could be no low
> memory at all.  Is this not a problem now for the fallback?  Just be
> curious, thanks for the update, for the common part looks good.
 Hi Dave,

 Did you mean this discussion: https://lkml.org/lkml/2019/12/27/122?
>>> I meant about this reply instead :)
>>> https://lkml.org/lkml/2020/1/16/616
>> Hi Dave,
>>
>> Sorry for not replying in time, I was on holiday last week.
> Hi, no problem, thanks for following up.
>
>> The platform James mentioned may exist for which have no devices and need no 
>> low memory.
>> For our arm64 server platform, there are some devices and need low memory.
>>
>> I got it. For the platform with no low memory, reserving crashkernel will  
>> always fail.
>> How about like this:
> I think the question should leave to Catalin or James, I have no
> suggestion about this:)
Any suggestions about this?

Thanks,
Chen Zhou
>
>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>> index a8e34d97a894..4df18c7ea438 100644
>> --- a/arch/arm64/mm/init.c
>> +++ b/arch/arm64/mm/init.c
>> @@ -147,7 +147,7 @@ static void __init reserve_crashkernel(void)
>> }
>> memblock_reserve(crash_base, crash_size);
>>  
>> -   if (crash_base >= CRASH_ADDR_LOW_MAX) {
>> +   if (memstart_addr < CRASH_ADDR_LOW_MAX && crash_base >= 
>> CRASH_ADDR_LOW_MAX) {
>> const char *rename = "Crash kernel (low)";
>>  
>> if (reserve_crashkernel_low()) {
>>
>>
>> Thanks,
>> Chen Zhou
>>
>>> Thanks
>>> Dave
>>>
>>>
>>> .
>>>
>>
>
> .
>




Re: [PATCH v1 06/10] powerpc/pseries/iommu: Add ddw_list_add() helper

2020-08-27 Thread Alexey Kardashevskiy



On 28/08/2020 08:11, Leonardo Bras wrote:
> On Mon, 2020-08-24 at 13:46 +1000, Alexey Kardashevskiy wrote:
>>>  static int find_existing_ddw_windows(void)
>>>  {
>>> int len;
>>> @@ -887,18 +905,11 @@ static int find_existing_ddw_windows(void)
>>> if (!direct64)
>>> continue;
>>>  
>>> -   window = kzalloc(sizeof(*window), GFP_KERNEL);
>>> -   if (!window || len < sizeof(struct dynamic_dma_window_prop)) {
>>> +   window = ddw_list_add(pdn, direct64);
>>> +   if (!window || len < sizeof(*direct64)) {
>>
>> Since you are touching this code, it looks like the "len <
>> sizeof(*direct64)" part should go above to "if (!direct64)".
> 
> Sure, makes sense.
> It will be fixed for v2.
> 
>>
>>
>>
>>> kfree(window);
>>> remove_ddw(pdn, true);
>>> -   continue;
>>> }
>>> -
>>> -   window->device = pdn;
>>> -   window->prop = direct64;
>>> -   spin_lock(&direct_window_list_lock);
>>> -   list_add(&window->list, &direct_window_list);
>>> -   spin_unlock(&direct_window_list_lock);
>>> }
>>>  
>>> return 0;
>>> @@ -1261,7 +1272,8 @@ static u64 enable_ddw(struct pci_dev *dev, struct 
>>> device_node *pdn)
>>> dev_dbg(&dev->dev, "created tce table LIOBN 0x%x for %pOF\n",
>>>   create.liobn, dn);
>>>  
>>> -   window = kzalloc(sizeof(*window), GFP_KERNEL);
>>> +   /* Add new window to existing DDW list */
>>
>> The comment seems to duplicate what the ddw_list_add name already suggests.
> 
> Ok, I will remove it then.
> 
>>> +   window = ddw_list_add(pdn, ddwprop);
>>> if (!window)
>>> goto out_clear_window;
>>>  
>>> @@ -1280,16 +1292,14 @@ static u64 enable_ddw(struct pci_dev *dev, struct 
>>> device_node *pdn)
>>> goto out_free_window;
>>> }
>>>  
>>> -   window->device = pdn;
>>> -   window->prop = ddwprop;
>>> -   spin_lock(&direct_window_list_lock);
>>> -   list_add(&window->list, &direct_window_list);
>>> -   spin_unlock(&direct_window_list_lock);
>>
>> I'd leave these 3 lines here and in find_existing_ddw_windows() (which
>> would make  ddw_list_add -> ddw_prop_alloc). In general you want to have
>> less stuff to do on the failure path. kmalloc may fail and needs kfree
>> but you can safely delay list_add (which cannot fail) and avoid having
>> the lock held twice in the same function (one of them is hidden inside
>> ddw_list_add).
>> Not sure if this change is really needed after all. Thanks,
> 
> I understand this leads to better performance in case anything fails.
> Also, I think list_add happening in the end is less error-prone (in
> case the list is checked between list_add and a fail).

Performance was not in my mind at all.

I noticed you remove from a list with a lock held, which was not there
before, and there is a bunch of labels on the exit path, so I started
looking for list_add() and checking that you do not remove from the list twice.


> But what if we put it at the end?
> What is the chance of a kzalloc of 4 pointers (struct direct_window)
> failing after walk_system_ram_range?

This is not about chances really, it is about readability. If, let's say,
kmalloc failed, you just go to the error exit label and simply call kfree()
on that pointer; kfree will do nothing if it is NULL already. Simple.
list_del() does not have this simplicity.
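
The pattern argued for here can be sketched in user-space C, with free() standing in for kfree() and a hand-rolled singly linked list standing in for the kernel list. All names (struct window, enable_window, window_count, fail_step) are hypothetical and only illustrate "allocate early, publish on the list last":

```c
#include <stdlib.h>

struct window {
	struct window *next;
	int prop;
};

static struct window *window_list;

/*
 * Returns 0 on success, -1 on failure. fail_step is a hypothetical knob
 * that simulates a failure before (1) or after (2) the allocation.
 */
int enable_window(int prop, int fail_step)
{
	struct window *w = NULL;

	if (fail_step == 1)
		goto out_free;	/* failed before allocation: w is NULL */

	w = calloc(1, sizeof(*w));
	if (!w)
		goto out_free;

	if (fail_step == 2)
		goto out_free;	/* failed after allocation */

	/* Nothing can fail past this point, so publish on the list last. */
	w->prop = prop;
	w->next = window_list;
	window_list = w;
	return 0;

out_free:
	free(w);	/* free(NULL) is a no-op, like kfree(NULL) */
	return -1;
}

/* Count the entries currently published on the list. */
int window_count(void)
{
	const struct window *w;
	int n = 0;

	for (w = window_list; w; w = w->next)
		n++;
	return n;
}
```

Because the entry is added to the list only after the last step that can fail, the single error label never has to undo a list insertion.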


> Is it not worth doing that for making enable_ddw() easier to
> understand?

This is my goal here :)


> 
> Best regards,
> Leonardo
> 

-- 
Alexey


[PATCH v3 3/6] IMA: update process_buffer_measurement to measure buffer hash

2020-08-27 Thread Tushar Sugandhi
process_buffer_measurement() currently only measures the input buffer.
When the buffer being measured is too large, it may result in bloated
IMA logs.

Introduce a boolean parameter measure_buf_hash to support measuring
hash of a buffer, which would be much smaller, instead of the buffer
itself.
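
A minimal user-space sketch of the idea, assuming a stand-in hash: FNV-1a is used below only because it is short, whereas IMA uses a cryptographic hash. The names log_measurement, fnv1a64 and DIGEST_SIZE are hypothetical and not part of the patch:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DIGEST_SIZE 8	/* size of the stand-in digest, in bytes */

/* FNV-1a, a simple non-cryptographic hash used here only as a stand-in. */
static uint64_t fnv1a64(const unsigned char *data, size_t len)
{
	uint64_t h = 14695981039346656037ULL;
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= data[i];
		h *= 1099511628211ULL;
	}
	return h;
}

/*
 * Copy what would be logged into out and return its length: the whole
 * buffer, or only its digest when measure_buf_hash is non-zero.
 * Returns 0 if out is too small.
 */
size_t log_measurement(const void *buf, size_t size, int measure_buf_hash,
		       unsigned char *out, size_t out_cap)
{
	if (measure_buf_hash) {
		uint64_t digest = fnv1a64(buf, size);

		if (out_cap < DIGEST_SIZE)
			return 0;
		memcpy(out, &digest, DIGEST_SIZE);
		return DIGEST_SIZE;
	}

	if (out_cap < size)
		return 0;
	memcpy(out, buf, size);
	return size;
}
```

The logged size stays at DIGEST_SIZE regardless of the input length when the flag is set, which is the property that keeps the log from growing with large buffers.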
Signed-off-by: Tushar Sugandhi 
---
 security/integrity/ima/ima.h |  3 +-
 security/integrity/ima/ima_appraise.c|  2 +-
 security/integrity/ima/ima_asymmetric_keys.c |  2 +-
 security/integrity/ima/ima_main.c| 29 ++--
 security/integrity/ima/ima_queue_keys.c  |  3 +-
 5 files changed, 32 insertions(+), 7 deletions(-)

diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index 83ed57147e68..ba332de8ed0b 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -267,7 +267,8 @@ void ima_store_measurement(struct integrity_iint_cache 
*iint, struct file *file,
   struct ima_template_desc *template_desc);
 int process_buffer_measurement(struct inode *inode, const void *buf, int size,
   const char *eventname, enum ima_hooks func,
-  int pcr, const char *func_data);
+  int pcr, const char *func_data,
+  bool measure_buf_hash);
 void ima_audit_measurement(struct integrity_iint_cache *iint,
   const unsigned char *filename);
 int ima_alloc_init_template(struct ima_event_data *event_data,
diff --git a/security/integrity/ima/ima_appraise.c 
b/security/integrity/ima/ima_appraise.c
index 372d16382960..20adffe5bf58 100644
--- a/security/integrity/ima/ima_appraise.c
+++ b/security/integrity/ima/ima_appraise.c
@@ -336,7 +336,7 @@ int ima_check_blacklist(struct integrity_iint_cache *iint,
if ((rc == -EPERM) && (iint->flags & IMA_MEASURE))
process_buffer_measurement(NULL, digest, digestsize,
   "blacklisted-hash", NONE,
-  pcr, NULL);
+  pcr, NULL, false);
}
 
return rc;
diff --git a/security/integrity/ima/ima_asymmetric_keys.c 
b/security/integrity/ima/ima_asymmetric_keys.c
index 1c68c500c26f..a74095793936 100644
--- a/security/integrity/ima/ima_asymmetric_keys.c
+++ b/security/integrity/ima/ima_asymmetric_keys.c
@@ -60,5 +60,5 @@ void ima_post_key_create_or_update(struct key *keyring, 
struct key *key,
 */
process_buffer_measurement(NULL, payload, payload_len,
   keyring->description, KEY_CHECK, 0,
-  keyring->description);
+  keyring->description, false);
 }
diff --git a/security/integrity/ima/ima_main.c 
b/security/integrity/ima/ima_main.c
index 0979a62a9257..52cbbc1f7ea2 100644
--- a/security/integrity/ima/ima_main.c
+++ b/security/integrity/ima/ima_main.c
@@ -733,17 +733,21 @@ int ima_load_data(enum kernel_load_data_id id)
  * @func: IMA hook
  * @pcr: pcr to extend the measurement
  * @func_data: private data specific to @func, can be NULL.
+ * @measure_buf_hash: if set to true - will measure hash of the buf,
+ *instead of buf
  *
  * Based on policy, the buffer is measured into the ima log.
  */
 int process_buffer_measurement(struct inode *inode, const void *buf, int size,
   const char *eventname, enum ima_hooks func,
-  int pcr, const char *func_data)
+  int pcr, const char *func_data,
+  bool measure_buf_hash)
 {
int ret = 0;
const char *audit_cause = "ENOMEM";
struct ima_template_entry *entry = NULL;
struct integrity_iint_cache iint = {};
+   struct integrity_iint_cache digest_iint = {};
struct ima_event_data event_data = {.iint = &iint,
.filename = eventname,
.buf = buf,
@@ -752,7 +756,7 @@ int process_buffer_measurement(struct inode *inode, const 
void *buf, int size,
struct {
struct ima_digest_data hdr;
char digest[IMA_MAX_DIGEST_SIZE];
-   } hash = {};
+   } hash = {}, digest_hash = {};
int violation = 0;
int action = 0;
u32 secid;
@@ -801,6 +805,24 @@ int process_buffer_measurement(struct inode *inode, const 
void *buf, int size,
goto out;
}
 
+   if (measure_buf_hash) {
+   digest_iint.ima_hash = &digest_hash.hdr;
+   digest_iint.ima_hash->algo = ima_hash_algo;
+   digest_iint.ima_hash->length = hash_digest_size[ima_hash_algo];
+
+   ret = ima_calc_buffer_hash(hash.hdr.digest,
+  iint.ima_hash->length,
+   

[PATCH v3 2/6] IMA: change process_buffer_measurement return type from void to int

2020-08-27 Thread Tushar Sugandhi
process_buffer_measurement() does not return the result of the operation.
Therefore, the consumers of this function cannot act on it, if needed.

Update return type of process_buffer_measurement() from void to int.

Signed-off-by: Tushar Sugandhi 
---
 security/integrity/ima/ima.h  |  6 +++---
 security/integrity/ima/ima_main.c | 14 +++---
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index 8875085db689..83ed57147e68 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -265,9 +265,9 @@ void ima_store_measurement(struct integrity_iint_cache 
*iint, struct file *file,
   struct evm_ima_xattr_data *xattr_value,
   int xattr_len, const struct modsig *modsig, int pcr,
   struct ima_template_desc *template_desc);
-void process_buffer_measurement(struct inode *inode, const void *buf, int size,
-   const char *eventname, enum ima_hooks func,
-   int pcr, const char *func_data);
+int process_buffer_measurement(struct inode *inode, const void *buf, int size,
+  const char *eventname, enum ima_hooks func,
+  int pcr, const char *func_data);
 void ima_audit_measurement(struct integrity_iint_cache *iint,
   const unsigned char *filename);
 int ima_alloc_init_template(struct ima_event_data *event_data,
diff --git a/security/integrity/ima/ima_main.c 
b/security/integrity/ima/ima_main.c
index c870fd6d2f83..0979a62a9257 100644
--- a/security/integrity/ima/ima_main.c
+++ b/security/integrity/ima/ima_main.c
@@ -736,9 +736,9 @@ int ima_load_data(enum kernel_load_data_id id)
  *
  * Based on policy, the buffer is measured into the ima log.
  */
-void process_buffer_measurement(struct inode *inode, const void *buf, int size,
-   const char *eventname, enum ima_hooks func,
-   int pcr, const char *func_data)
+int process_buffer_measurement(struct inode *inode, const void *buf, int size,
+  const char *eventname, enum ima_hooks func,
+  int pcr, const char *func_data)
 {
int ret = 0;
const char *audit_cause = "ENOMEM";
@@ -758,7 +758,7 @@ void process_buffer_measurement(struct inode *inode, const 
void *buf, int size,
u32 secid;
 
if (!ima_policy_flag)
-   return;
+   return 0;
 
/*
 * Both LSM hooks and auxilary based buffer measurements are
@@ -772,7 +772,7 @@ void process_buffer_measurement(struct inode *inode, const 
void *buf, int size,
action = ima_get_action(inode, current_cred(), secid, 0, func,
&pcr, &template, func_data);
if (!(action & IMA_MEASURE))
-   return;
+   return 0;
}
 
if (!pcr)
@@ -787,7 +787,7 @@ void process_buffer_measurement(struct inode *inode, const 
void *buf, int size,
pr_err("template %s init failed, result: %d\n",
   (strlen(template->name) ?
template->name : template->fmt), ret);
-   return;
+   return ret;
}
}
 
@@ -819,7 +819,7 @@ void process_buffer_measurement(struct inode *inode, const 
void *buf, int size,
func_measure_str(func),
audit_cause, ret, 0, ret);
 
-   return;
+   return ret;
 }
 
 /**
-- 
2.17.1



Re: [PATCH v3 01/16] kprobes: Add generic kretprobe trampoline handler

2020-08-27 Thread Masami Hiramatsu
On Fri, 28 Aug 2020 01:38:44 +0900
Masami Hiramatsu  wrote:

> +unsigned long __kretprobe_trampoline_handler(struct pt_regs *regs,
> + unsigned long trampoline_address,
> + void *frame_pointer)
> +{
> + struct kretprobe_instance *ri = NULL;
> + struct hlist_head *head, empty_rp;
> + struct hlist_node *tmp;
> + unsigned long flags, orig_ret_address = 0;
> + kprobe_opcode_t *correct_ret_addr = NULL;
> + bool skipped = false;
> +
> + INIT_HLIST_HEAD(&empty_rp);
> + kretprobe_hash_lock(current, &head, &flags);
> +
> + /*
> +  * It is possible to have multiple instances associated with a given
> +  * task either because multiple functions in the call path have
> +  * return probes installed on them, and/or more than one
> +  * return probe was registered for a target function.
> +  *
> +  * We can handle this because:
> +  * - instances are always pushed into the head of the list
> +  * - when multiple return probes are registered for the same
> +  *   function, the (chronologically) first instance's ret_addr
> +  *   will be the real return address, and all the rest will
> +  *   point to kretprobe_trampoline.
> +  */
> + hlist_for_each_entry(ri, head, hlist) {
> + if (ri->task != current)
> + /* another task is sharing our hash bucket */
> + continue;
> + /*
> +  * Return probes must be pushed on this hash list correct
> +  * order (same as return order) so that it can be popped
> +  * correctly. However, if we find it is pushed it incorrect
> +  * order, this means we find a function which should not be
> +  * probed, because the wrong order entry is pushed on the
> +  * path of processing other kretprobe itself.
> +  */
> + if (ri->fp != frame_pointer) {
> + if (!skipped)
> + pr_warn("kretprobe is stacked incorrectly. 
> Trying to fixup.\n");
> + skipped = true;
> + continue;
> + }
> +
> + orig_ret_address = (unsigned long)ri->ret_addr;
> + if (skipped)
> + pr_warn("%ps must be blacklisted because of incorrect kretprobe order\n",
> + ri->rp->kp.addr);
> +
> + if (orig_ret_address != trampoline_address)
> + /*
> +  * This is the real return address. Any other
> +  * instances associated with this task are for
> +  * other calls deeper on the call stack
> +  */
> + break;
> + }
> +
> + kretprobe_assert(ri, orig_ret_address, trampoline_address);
> +
> + correct_ret_addr = ri->ret_addr;

Oops, this code is odd: why do we have both orig_ret_address *and*
correct_ret_addr?
I'll clean this up.
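
For readers following the quoted loop, the walk described in its comment can be modelled in userspace roughly as below. This is only a sketch under stated assumptions: struct inst, TRAMPOLINE and find_real_ret_addr() are hypothetical names made up for illustration, the hash list is replaced by a plain array with the most recently pushed instance first, and the single result value is just one possible shape of the cleanup, not the actual follow-up patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the address of kretprobe_trampoline. */
#define TRAMPOLINE 0xdeadUL

/* Hypothetical, simplified model of one kretprobe instance. */
struct inst {
	int task;               /* owner task id */
	unsigned long fp;       /* frame pointer recorded at function entry */
	unsigned long ret_addr; /* saved return address (may be TRAMPOLINE) */
};

/*
 * Walk instances from most recently pushed to oldest and return the first
 * saved return address that is not the trampoline. Instances owned by
 * another task, or recorded for a different frame, are skipped.
 * Returns 0 if no real return address is found.
 */
static unsigned long find_real_ret_addr(const struct inst *list, size_t n,
					int task, unsigned long fp)
{
	size_t i;

	assert(list != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (list[i].task != task)
			continue; /* another task shares the bucket */
		if (list[i].fp != fp)
			continue; /* stacked in the wrong order, skip it */
		if (list[i].ret_addr != TRAMPOLINE)
			return list[i].ret_addr; /* the real return address */
	}
	return 0;
}
```

With one value returned from the walk there is no need for a second variable holding the same address, which is the redundancy pointed out above.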

Thanks,

-- 
Masami Hiramatsu 


[PATCH v3 4/6] IMA: add policy to measure critical data from kernel components

2020-08-27 Thread Tushar Sugandhi
There would be several candidate kernel components suitable for IMA
measurement. Not all of them would have support for IMA measurement.
Also, system administrators may not want to measure data for all of
them, even when they support IMA measurement. An IMA policy specific
to various kernel components is needed to measure their respective
critical data.

Add a new IMA policy "critical_kernel_data_sources" to support measuring
various critical kernel components. This policy would enable the
system administrators to limit the measurement to the components,
if the components support IMA measurement.

Signed-off-by: Tushar Sugandhi 
---
 Documentation/ABI/testing/ima_policy |  3 +++
 security/integrity/ima/ima_policy.c  | 29 +++-
 2 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/Documentation/ABI/testing/ima_policy 
b/Documentation/ABI/testing/ima_policy
index cd572912c593..7ccdc1964e29 100644
--- a/Documentation/ABI/testing/ima_policy
+++ b/Documentation/ABI/testing/ima_policy
@@ -48,6 +48,9 @@ Description:
template:= name of a defined IMA template type
(eg, ima-ng). Only valid when action is "measure".
pcr:= decimal value
+   critical_kernel_data_sources:= list of kernel
+   components (eg, selinux|apparmor|dm-crypt) that
+   contain data critical to the security of the kernel.
 
default policy:
# PROC_SUPER_MAGIC
diff --git a/security/integrity/ima/ima_policy.c 
b/security/integrity/ima/ima_policy.c
index 8866e84d0062..c8a044705347 100644
--- a/security/integrity/ima/ima_policy.c
+++ b/security/integrity/ima/ima_policy.c
@@ -33,6 +33,7 @@
 #define IMA_PCR0x0100
 #define IMA_FSNAME 0x0200
 #define IMA_KEYRINGS   0x0400
+#define IMA_DATA_SOURCES   0x0800
 
 #define UNKNOWN0
 #define MEASURE0x0001  /* same as IMA_MEASURE */
@@ -84,6 +85,7 @@ struct ima_rule_entry {
} lsm[MAX_LSM_RULES];
char *fsname;
struct ima_rule_opt_list *keyrings; /* Measure keys added to these 
keyrings */
+   struct ima_rule_opt_list *data_sources; /* Measure data from these 
sources */
struct ima_template_desc *template;
 };
 
@@ -911,7 +913,7 @@ enum {
Opt_uid_lt, Opt_euid_lt, Opt_fowner_lt,
Opt_appraise_type, Opt_appraise_flag,
Opt_permit_directio, Opt_pcr, Opt_template, Opt_keyrings,
-   Opt_err
+   Opt_data_sources, Opt_err
 };
 
 static const match_table_t policy_tokens = {
@@ -948,6 +950,7 @@ static const match_table_t policy_tokens = {
{Opt_pcr, "pcr=%s"},
{Opt_template, "template=%s"},
{Opt_keyrings, "keyrings=%s"},
+   {Opt_data_sources, "critical_kernel_data_sources=%s"},
{Opt_err, NULL}
 };
 
@@ -1312,6 +1315,24 @@ static int ima_parse_rule(char *rule, struct 
ima_rule_entry *entry)
 
entry->flags |= IMA_KEYRINGS;
break;
+   case Opt_data_sources:
+   ima_log_string(ab, "critical_kernel_data_sources",
+  args[0].from);
+
+   if (entry->data_sources) {
+   result = -EINVAL;
+   break;
+   }
+
+   entry->data_sources = ima_alloc_rule_opt_list(args);
+   if (IS_ERR(entry->data_sources)) {
+   result = PTR_ERR(entry->data_sources);
+   entry->data_sources = NULL;
+   break;
+   }
+
+   entry->flags |= IMA_DATA_SOURCES;
+   break;
case Opt_fsuuid:
ima_log_string(ab, "fsuuid", args[0].from);
 
@@ -1692,6 +1713,12 @@ int ima_policy_show(struct seq_file *m, void *v)
seq_puts(m, " ");
}
 
+   if (entry->flags & IMA_DATA_SOURCES) {
+   seq_puts(m, "critical_kernel_data_sources=");
+   ima_show_rule_opt_list(m, entry->data_sources);
+   seq_puts(m, " ");
+   }
+
if (entry->flags & IMA_PCR) {
snprintf(tbuf, sizeof(tbuf), "%d", entry->pcr);
seq_printf(m, pt(Opt_pcr), tbuf);
-- 
2.17.1
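
As an illustration of what a rule such as critical_kernel_data_sources=selinux|apparmor|dm-crypt is meant to express, the following userspace sketch checks whether a source name is one of the '|'-separated items. The helper name source_in_list() is hypothetical; the patch itself stores the parsed items via ima_alloc_rule_opt_list() rather than re-scanning the string, so this only models the matching semantics as I understand them.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Return true if @name appears as a whole item in the '|'-separated
 * @list, e.g. "selinux|apparmor|dm-crypt". Hypothetical helper used
 * for illustration only.
 */
static bool source_in_list(const char *list, const char *name)
{
	size_t name_len;

	assert(list && name);

	name_len = strlen(name);
	if (name_len == 0)
		return false;

	while (*list) {
		/* Length of the current item, up to '|' or end of string. */
		size_t item_len = strcspn(list, "|");

		if (item_len == name_len && !strncmp(list, name, name_len))
			return true;

		list += item_len;
		if (*list == '|')
			list++; /* step over the separator */
	}
	return false;
}
```

Whole-item comparison matters here: "dm" must not match "dm-crypt".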



[PATCH v3 6/6] IMA: validate supported kernel data sources before measurement

2020-08-27 Thread Tushar Sugandhi
Currently, IMA does not restrict random data sources from measuring their
data using ima_measure_critical_data(). Any kernel data source can call
the function, and its data will get measured as long as the input
event_data_source is part of the IMA policy -
CRITICAL_DATA+critical_kernel_data_sources. This may result in the IMA log
getting bloated by random data sources. Supporting random data sources
at run-time may also impact the reliability of the system.

To ensure that only data from supported sources are measured, the kernel
component needs to be added to a compile-time list of supported sources
(an "allowed list of components") in ima.h. IMA then validates the input
parameter - event_data_source passed to ima_measure_critical_data()
against this allowed list at run-time.

Provide an infrastructure for kernel data sources to be added to
the supported data sources list at compile-time. Update
ima_measure_critical_data() to validate, at run-time, that the data
source is supported before measuring the data.

Signed-off-by: Tushar Sugandhi 
---
 security/integrity/ima/ima.h  | 29 +
 security/integrity/ima/ima_main.c |  3 +++
 2 files changed, 32 insertions(+)

diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index 00b84052c8f1..ecb0a1e7378f 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -228,6 +228,35 @@ extern const char *const func_tokens[];
 
 struct modsig;
 
+#define __ima_supported_kernel_data_sources(source)\
+   source(MIN_SOURCE, min_source)  \
+   source(MAX_SOURCE, max_source)
+
+#define __ima_enum_stringify(ENUM, str) (#str),
+
+enum ima_supported_kernel_data_sources {
+   __ima_supported_kernel_data_sources(__ima_hook_enumify)
+};
+
+static const char * const ima_supported_kernel_data_sources_str[] = {
+   __ima_supported_kernel_data_sources(__ima_enum_stringify)
+};
+
+static inline bool ima_kernel_data_source_is_supported(const char *source)
+{
+   int i;
+
+   if (!source)
+   return false;
+
+   for (i = MIN_SOURCE + 1; i < MAX_SOURCE; i++) {
+   if (!strcmp(ima_supported_kernel_data_sources_str[i], source))
+   return true;
+   }
+
+   return false;
+}
+
 #ifdef CONFIG_IMA_QUEUE_EARLY_BOOT_KEYS
 /*
  * To track keys that need to be measured.
diff --git a/security/integrity/ima/ima_main.c 
b/security/integrity/ima/ima_main.c
index a889bf40cb7e..41be4d1d839e 100644
--- a/security/integrity/ima/ima_main.c
+++ b/security/integrity/ima/ima_main.c
@@ -888,6 +888,9 @@ int ima_measure_critical_data(const char *event_name,
if (!event_name || !event_data_source || !buf || !buf_len)
return -EINVAL;
 
+   if (!ima_kernel_data_source_is_supported(event_data_source))
+   return -EPERM;
+
return process_buffer_measurement(NULL, buf, buf_len, event_name,
  CRITICAL_DATA, 0, event_data_source,
  measure_buf_hash);
-- 
2.17.1
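
The compile-time allowed list in this patch is an X-macro: one list expands once into an enum and once into a matching string table. A self-contained userspace sketch of that pattern follows. The macro and function names are shortened, and the two entries example_a and example_b are invented for illustration; the patch as posted ships only the MIN_SOURCE and MAX_SOURCE sentinels.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* One line per supported source; example_a and example_b are made up. */
#define SUPPORTED_SOURCES(X)		\
	X(MIN_SOURCE, min_source)	\
	X(SRC_EXAMPLE_A, example_a)	\
	X(SRC_EXAMPLE_B, example_b)	\
	X(MAX_SOURCE, max_source)

#define AS_ENUM(ENUM, str)	ENUM,
#define AS_STRING(ENUM, str)	#str,

/* Expands to: MIN_SOURCE, SRC_EXAMPLE_A, SRC_EXAMPLE_B, MAX_SOURCE, */
enum supported_source { SUPPORTED_SOURCES(AS_ENUM) };

/* Expands to: "min_source", "example_a", "example_b", "max_source", */
static const char *const supported_source_str[] = {
	SUPPORTED_SOURCES(AS_STRING)
};

static bool source_is_supported(const char *source)
{
	int i;

	/* The enum and the string table must stay the same length. */
	assert(sizeof(supported_source_str) / sizeof(supported_source_str[0])
	       == MAX_SOURCE + 1);

	if (!source)
		return false;

	/* Skip the MIN/MAX sentinels; only real entries are allowed. */
	for (i = MIN_SOURCE + 1; i < MAX_SOURCE; i++) {
		if (!strcmp(supported_source_str[i], source))
			return true;
	}
	return false;
}
```

Adding a supported source is then a one-line change to the list, and the sentinels themselves are never accepted as source names.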



[PATCH v3 1/6] IMA: generalize keyring specific measurement constructs

2020-08-27 Thread Tushar Sugandhi
IMA functions such as ima_match_keyring(), process_buffer_measurement(),
ima_match_policy() etc. handle data specific to keyrings. Currently,
these constructs are not generic enough to handle data specific to
other funcs. This makes them harder to extend without code duplication.

Refactor the keyring specific measurement constructs to be generic and
reusable in other measurement scenarios.

Signed-off-by: Tushar Sugandhi 
---
 security/integrity/ima/ima.h|  6 ++---
 security/integrity/ima/ima_api.c|  6 ++---
 security/integrity/ima/ima_main.c   |  6 ++---
 security/integrity/ima/ima_policy.c | 42 -
 4 files changed, 33 insertions(+), 27 deletions(-)

diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index 38043074ce5e..8875085db689 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -255,7 +255,7 @@ static inline void ima_process_queued_keys(void) {}
 int ima_get_action(struct inode *inode, const struct cred *cred, u32 secid,
   int mask, enum ima_hooks func, int *pcr,
   struct ima_template_desc **template_desc,
-  const char *keyring);
+  const char *func_data);
 int ima_must_measure(struct inode *inode, int mask, enum ima_hooks func);
 int ima_collect_measurement(struct integrity_iint_cache *iint,
struct file *file, void *buf, loff_t size,
@@ -267,7 +267,7 @@ void ima_store_measurement(struct integrity_iint_cache 
*iint, struct file *file,
   struct ima_template_desc *template_desc);
 void process_buffer_measurement(struct inode *inode, const void *buf, int size,
const char *eventname, enum ima_hooks func,
-   int pcr, const char *keyring);
+   int pcr, const char *func_data);
 void ima_audit_measurement(struct integrity_iint_cache *iint,
   const unsigned char *filename);
 int ima_alloc_init_template(struct ima_event_data *event_data,
@@ -283,7 +283,7 @@ const char *ima_d_path(const struct path *path, char 
**pathbuf, char *filename);
 int ima_match_policy(struct inode *inode, const struct cred *cred, u32 secid,
 enum ima_hooks func, int mask, int flags, int *pcr,
 struct ima_template_desc **template_desc,
-const char *keyring);
+const char *func_data);
 void ima_init_policy(void);
 void ima_update_policy(void);
 void ima_update_policy_flag(void);
diff --git a/security/integrity/ima/ima_api.c b/security/integrity/ima/ima_api.c
index 4f39fb93f278..af218babd198 100644
--- a/security/integrity/ima/ima_api.c
+++ b/security/integrity/ima/ima_api.c
@@ -170,7 +170,7 @@ void ima_add_violation(struct file *file, const unsigned 
char *filename,
  * @func: caller identifier
  * @pcr: pointer filled in if matched measure policy sets pcr=
  * @template_desc: pointer filled in if matched measure policy sets template=
- * @keyring: keyring name used to determine the action
+ * @func_data: private data specific to @func, can be NULL.
  *
  * The policy is defined in terms of keypairs:
  * subj=, obj=, type=, func=, mask=, fsmagic=
@@ -186,14 +186,14 @@ void ima_add_violation(struct file *file, const unsigned 
char *filename,
 int ima_get_action(struct inode *inode, const struct cred *cred, u32 secid,
   int mask, enum ima_hooks func, int *pcr,
   struct ima_template_desc **template_desc,
-  const char *keyring)
+  const char *func_data)
 {
int flags = IMA_MEASURE | IMA_AUDIT | IMA_APPRAISE | IMA_HASH;
 
flags &= ima_policy_flag;
 
return ima_match_policy(inode, cred, secid, func, mask, flags, pcr,
-   template_desc, keyring);
+   template_desc, func_data);
 }
 
 /*
diff --git a/security/integrity/ima/ima_main.c 
b/security/integrity/ima/ima_main.c
index 8a91711ca79b..c870fd6d2f83 100644
--- a/security/integrity/ima/ima_main.c
+++ b/security/integrity/ima/ima_main.c
@@ -732,13 +732,13 @@ int ima_load_data(enum kernel_load_data_id id)
  * @eventname: event name to be used for the buffer entry.
  * @func: IMA hook
  * @pcr: pcr to extend the measurement
- * @keyring: keyring name to determine the action to be performed
+ * @func_data: private data specific to @func, can be NULL.
  *
  * Based on policy, the buffer is measured into the ima log.
  */
 void process_buffer_measurement(struct inode *inode, const void *buf, int size,
const char *eventname, enum ima_hooks func,
-   int pcr, const char *keyring)
+   int pcr, const char *func_data)
 {
int ret = 0;
const char *audit_cause = "ENOMEM";
@@ -770,7 +770,7 @@ void process_buffer_measurement(struct inode *inode, const 
void *buf, int 

[PATCH v3 5/6] IMA: add hook to measure critical data from kernel components

2020-08-27 Thread Tushar Sugandhi
Currently, IMA does not provide a generic function for kernel components
to measure their data. A generic function provided by IMA would make it
easier and faster for various parts of the kernel to start using the IMA
infrastructure, would avoid code duplication, and would ensure consistent
usage of the IMA policy "critical_kernel_data_sources" across the kernel.

Add a new IMA func CRITICAL_DATA and a corresponding IMA hook
ima_measure_critical_data() to support measuring various critical kernel
components. Limit the measurement to the components that are specified
in the IMA policy - CRITICAL_DATA+critical_kernel_data_sources.

Signed-off-by: Tushar Sugandhi 
---
 Documentation/ABI/testing/ima_policy |  8 ++-
 include/linux/ima.h  | 11 +
 security/integrity/ima/ima.h |  1 +
 security/integrity/ima/ima_api.c |  2 +-
 security/integrity/ima/ima_main.c| 24 
 security/integrity/ima/ima_policy.c  | 34 
 6 files changed, 73 insertions(+), 7 deletions(-)

diff --git a/Documentation/ABI/testing/ima_policy 
b/Documentation/ABI/testing/ima_policy
index 7ccdc1964e29..36d9cee9704d 100644
--- a/Documentation/ABI/testing/ima_policy
+++ b/Documentation/ABI/testing/ima_policy
@@ -29,7 +29,7 @@ Description:
base:   func:= 
[BPRM_CHECK][MMAP_CHECK][CREDS_CHECK][FILE_CHECK][MODULE_CHECK]
[FIRMWARE_CHECK]
[KEXEC_KERNEL_CHECK] [KEXEC_INITRAMFS_CHECK]
-   [KEXEC_CMDLINE] [KEY_CHECK]
+   [KEXEC_CMDLINE] [KEY_CHECK] [CRITICAL_DATA]
mask:= [[^]MAY_READ] [[^]MAY_WRITE] [[^]MAY_APPEND]
   [[^]MAY_EXEC]
fsmagic:= hex value
@@ -51,6 +51,8 @@ Description:
critical_kernel_data_sources:= list of kernel
components (eg, selinux|apparmor|dm-crypt) that
contain data critical to the security of the kernel.
+   Only valid when action is "measure" and func is
+   CRITICAL_DATA.
 
default policy:
# PROC_SUPER_MAGIC
@@ -128,3 +130,7 @@ Description:
keys added to .builtin_trusted_keys or .ima keyring:
 
measure func=KEY_CHECK 
keyrings=.builtin_trusted_keys|.ima
+
+   Example of measure rule using CRITICAL_DATA to measure critical 
data
+
+   measure func=CRITICAL_DATA 
critical_kernel_data_sources=selinux|apparmor|dm-crypt
diff --git a/include/linux/ima.h b/include/linux/ima.h
index d15100de6cdd..136fc02580db 100644
--- a/include/linux/ima.h
+++ b/include/linux/ima.h
@@ -26,6 +26,10 @@ extern int ima_post_read_file(struct file *file, void *buf, 
loff_t size,
 extern void ima_post_path_mknod(struct dentry *dentry);
 extern int ima_file_hash(struct file *file, char *buf, size_t buf_size);
 extern void ima_kexec_cmdline(int kernel_fd, const void *buf, int size);
+extern int ima_measure_critical_data(const char *event_name,
+const char *event_data_source,
+const void *buf, int buf_len,
+bool measure_buf_hash);
 
 #ifdef CONFIG_IMA_KEXEC
 extern void ima_add_kexec_buffer(struct kimage *image);
@@ -104,6 +108,13 @@ static inline int ima_file_hash(struct file *file, char 
*buf, size_t buf_size)
 }
 
 static inline void ima_kexec_cmdline(int kernel_fd, const void *buf, int size) 
{}
+static inline int ima_measure_critical_data(const char *event_name,
+   const char *event_data_source,
+   const void *buf, int buf_len,
+   bool measure_buf_hash)
+{
+   return -EOPNOTSUPP;
+}
 #endif /* CONFIG_IMA */
 
 #ifndef CONFIG_IMA_KEXEC
diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index ba332de8ed0b..00b84052c8f1 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -200,6 +200,7 @@ static inline unsigned int ima_hash_key(u8 *digest)
hook(POLICY_CHECK, policy)  \
hook(KEXEC_CMDLINE, kexec_cmdline)  \
hook(KEY_CHECK, key)\
+   hook(CRITICAL_DATA, critical_data)  \
hook(MAX_CHECK, none)
 
 #define __ima_hook_enumify(ENUM, str)  ENUM,
diff --git a/security/integrity/ima/ima_api.c b/security/integrity/ima/ima_api.c
index af218babd198..9917e1730cb6 100644
--- a/security/integrity/ima/ima_api.c
+++ b/security/integrity/ima/ima_api.c
@@ -176,7 +176,7 @@ void ima_add_violation(struct file *file, const unsigned 
char *filename,
  * subj=, obj=, type=, func=, mask=, fsmagic=
  * subj,obj, and type: are LSM specific.
  * func: FILE_CHECK | BPRM_CHECK | CREDS_CHECK | 

[PATCH v3 0/6] IMA: Infrastructure for measurement of critical kernel data

2020-08-27 Thread Tushar Sugandhi
There are several kernel components that contain critical data which, if
accidentally or maliciously altered, can compromise the security of the
kernel. Examples of such components include LSMs like SELinux or
AppArmor, and device-mapper targets like dm-crypt, dm-verity etc.

Many of these components do not use the capabilities provided by the kernel
integrity subsystem (IMA), and thus they do not get the benefits of
extended TPM PCR quotes and ultimately the benefits of remote attestation.

This series bridges this gap, so that potential kernel components that
contain data critical to the security of the kernel could take advantage
of IMA's measuring and quoting abilities - thus ultimately enabling
remote attestation for their specific data.

System administrators may want to pick and choose which kernel
components they would want to enable for measurements, quoting, and
remote attestation. To enable that, a new IMA policy is introduced.

And lastly, the functionality is exposed through a function
ima_measure_critical_data(). The functionality is generic enough to
measure the data of any kernel component at run-time. To ensure that only
data from supported sources are measured, the kernel component needs to
be added to a compile-time list of supported sources (an "allowed list
of components"). IMA validates the source passed to
ima_measure_critical_data() against this allowed list at run-time. 

This series is based on the following repo/branch:

 repo: https://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity.git
 branch: next-integrity
 commit d012a7190fc1 ("Linux 5.9-rc2")

This series also has a dependency on the following patch series:
 https://patchwork.kernel.org/patch/11709527/

Change Log v3:
Incorporated feedback from Mimi on v2.
 - Renamed the policy "data_sources" to
   "critical_kernel_data_sources".
 - Added "critical_kernel_data_sources" description in
   Documentation/ima-policy.
 - Split CRITICAL_DATA + critical_kernel_data_sources into two separate
   patches.
 - Merged hook ima_measure_critical_data() + CRITICAL_DATA into a single
   patch.
 - Added functionality to validate data sources before measurement.

Change Log v2:
 - Reverted the unnecessary indentations in existing #define.
 - Updated the description to replace the word 'enlightened' with
   'supported'.
 - Reverted the unnecessary rename of attribute size to buf_len.
 - Introduced a boolean parameter measure_buf_hash as per community
   feedback to support measuring hash of the buffer, instead of the
   buffer itself.


Tushar Sugandhi (6):
  IMA: generalize keyring specific measurement constructs
  IMA: change process_buffer_measurement return type from void to int
  IMA: update process_buffer_measurement to measure buffer hash
  IMA: add policy to measure critical data from kernel components
  IMA: add hook to measure critical data from kernel components
  IMA: validate supported kernel data sources before measurement

 Documentation/ABI/testing/ima_policy |  11 +-
 include/linux/ima.h  |  11 ++
 security/integrity/ima/ima.h |  41 +++-
 security/integrity/ima/ima_api.c |   8 +-
 security/integrity/ima/ima_appraise.c|   2 +-
 security/integrity/ima/ima_asymmetric_keys.c |   2 +-
 security/integrity/ima/ima_main.c|  72 +++--
 security/integrity/ima/ima_policy.c  | 101 +++
 security/integrity/ima/ima_queue_keys.c  |   3 +-
 9 files changed, 205 insertions(+), 46 deletions(-)

-- 
2.17.1



RE: [PATCH v2] Input: elants_i2c - Report resolution of ABS_MT_TOUCH_MAJOR by FW information.

2020-08-27 Thread Johnny.Chuang
> On Wed, 26 Aug 2020 at 18:44, Johnny Chuang
>  wrote:
> >
> > This patch adds a new behavior to report touch major resolution based
> > on information provided by firmware.
> >
> > In initial process, driver acquires touch information from touch ic.
> > This information contains of one flag about reporting resolution of
> > ABS_MT_TOUCH_MAJOR is needed, or not.
> > Touch driver will report touch major resolution after geting this flag.
> 
> I think this paragraph needs updating now that the firmware's reporting the
> actual resolution instead of a flag.

Thanks Harry, I will update patch v3 for this.

> 
> >
> > Signed-off-by: Johnny Chuang 
> > ---
> > Changes in v2:
> >   - register real resolution instead of true/false.
> > ---
> >  drivers/input/touchscreen/elants_i2c.c | 6 ++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/drivers/input/touchscreen/elants_i2c.c
> > b/drivers/input/touchscreen/elants_i2c.c
> > index b0bd5bb..dc7f4a5 100644
> > --- a/drivers/input/touchscreen/elants_i2c.c
> > +++ b/drivers/input/touchscreen/elants_i2c.c
> > @@ -151,6 +151,7 @@ struct elants_data {
> >
> > bool wake_irq_enabled;
> > bool keep_power_in_suspend;
> > +   u8 report_major_resolution;
> >
> > /* Must be last to be used for DMA operations */
> > u8 buf[MAX_PACKET_SIZE] cacheline_aligned; @@ -459,6
> > +460,9 @@ static int elants_i2c_query_ts_info(struct elants_data *ts)
> > rows = resp[2] + resp[6] + resp[10];
> > cols = resp[3] + resp[7] + resp[11];
> >
> > +   /* Decide if report resolution of ABS_MT_TOUCH_MAJOR */
> > +   ts->report_major_resolution = resp[16];
> > +
> > /* Process mm_to_pixel information */
> > error = elants_i2c_execute_command(client,
> >get_osr_cmd,
> > sizeof(get_osr_cmd), @@ -1325,6 +1329,8 @@ static int
> elants_i2c_probe(struct i2c_client *client,
> >  0, MT_TOOL_PALM, 0, 0);
> > input_abs_set_res(ts->input, ABS_MT_POSITION_X, ts->x_res);
> > input_abs_set_res(ts->input, ABS_MT_POSITION_Y, ts->y_res);
> > +   if (ts->report_major_resolution > 0)
> > +   input_abs_set_res(ts->input, ABS_MT_TOUCH_MAJOR,
> > + ts->report_major_resolution);
> >
> > touchscreen_parse_properties(ts->input, true, &ts->prop);
> >
> > --
> > 2.7.4
> >
> 
> Harry Cutts
> Chrome OS Touch/Input team

--
Johnny



[PATCH v2] padata: add another maintainer and another list

2020-08-27 Thread Daniel Jordan
At Steffen's request, I'll help maintain padata for the foreseeable
future.

While at it, let's have patches go to lkml too since the code is now
used outside of crypto.

Signed-off-by: Daniel Jordan 
Cc: Herbert Xu 
Cc: Steffen Klassert 
Cc: linux-cry...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 MAINTAINERS | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 3b186ade3597..06a1b8a6d953 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -13024,7 +13024,9 @@ F:  lib/packing.c
 
 PADATA PARALLEL EXECUTION MECHANISM
 M: Steffen Klassert 
+M: Daniel Jordan 
 L: linux-cry...@vger.kernel.org
+L: linux-kernel@vger.kernel.org
 S: Maintained
 F: Documentation/core-api/padata.rst
 F: include/linux/padata.h
-- 
2.28.0



Re: [PATCH v1 04/10] powerpc/kernel/iommu: Add new iommu_table_in_use() helper

2020-08-27 Thread Alexey Kardashevskiy



On 28/08/2020 04:34, Leonardo Bras wrote:
> On Sat, 2020-08-22 at 20:34 +1000, Alexey Kardashevskiy wrote:
>>> +
>>> +   /*ignore reserved bit0*/
>>
>> s/ignore reserved bit0/ ignore reserved bit0 /  (add spaces)
> 
> Fixed
> 
>>> +   if (tbl->it_offset == 0)
>>> +   p1_start = 1;
>>> +
>>> +   /* Check if reserved memory is valid*/
>>
>> A missing space here.
> 
> Fixed
> 
>>
>>> +   if (tbl->it_reserved_start >= tbl->it_offset &&
>>> +   tbl->it_reserved_start <= (tbl->it_offset + tbl->it_size) &&
>>> +   tbl->it_reserved_end   >= tbl->it_offset &&
>>> +   tbl->it_reserved_end   <= (tbl->it_offset + tbl->it_size)) {
>>
>> Uff. What if tbl->it_reserved_end is bigger than tbl->it_offset +
>> tbl->it_size?
>>
>> The reserved area is to preserve MMIO32 so it is for it_offset==0 only
>> and the boundaries are checked in the only callsite, and it is unlikely
>> to change soon or ever.
>>
>> Rather than bothering with fixing that, maybe just add (did not test):
>>
>> if (WARN_ON((
>> (tbl->it_reserved_start || tbl->it_reserved_end) && (it_offset != 0))
>> (tbl->it_reserved_start > it_offset && tbl->it_reserved_end < it_offset
>> + it_size) && (it_offset == 0)) )
>>  return true;
>>
>> Or simply always look for it_offset..it_reserved_start and
>> it_reserved_end..it_offset+it_size and if there is no reserved area,
>> initialize it_reserved_start=it_reserved_end=it_offset so the first
>> it_offset..it_reserved_start becomes a no-op.
> 
> The problem here is that the values of it_reserved_{start,end} are not
> necessarily valid. I mean, on iommu_table_reserve_pages() the values
> are stored however they are given (bit reserving is done only if they
> are valid). 
> 
> Having a it_reserved_{start,end} value outside the valid ranges would
> cause find_next_bit() to run over memory outside the bitmap.
> Even if those values are < tbl->it_offset, the resulting
> subtraction on unsigned would cause it to become a big value and run
> over memory outside the bitmap.
> 
> But I think you are right. That is not the place to check if the
> reserved values are valid. It should just trust them here.
> I intend to change iommu_table_reserve_pages() to only store the
> parameters in it_reserved_{start,end} if they are in the range, and to
> store it_offset in both of them if they are not.
> 
> What do you think?

This should work, yes.
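
A minimal sketch of the approach agreed on here, written as a userspace model: the reserved range is stored only if it lies inside the table, and otherwise both ends collapse to it_offset. struct tbl_model and both helper names are hypothetical, the struct carries only the fields named in this thread, and treating both bounds as inclusive limits of the table window is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of the table fields discussed in the thread. */
struct tbl_model {
	unsigned long it_offset;
	unsigned long it_size;
	unsigned long it_reserved_start;
	unsigned long it_reserved_end;
};

/* True if [start, end] is ordered and lies within the table window. */
static bool range_in_table(const struct tbl_model *t,
			   unsigned long start, unsigned long end)
{
	return start <= end &&
	       start >= t->it_offset &&
	       end <= t->it_offset + t->it_size;
}

/*
 * Store the reserved range only when it is valid. Otherwise collapse it
 * to an empty range at it_offset, so that later scans of
 * it_offset..it_reserved_start and it_reserved_end..it_offset + it_size
 * never leave the bitmap and the first scan becomes a no-op.
 */
static void reserve_range(struct tbl_model *t,
			  unsigned long start, unsigned long end)
{
	assert(t != NULL);

	if (range_in_table(t, start, end)) {
		t->it_reserved_start = start;
		t->it_reserved_end = end;
	} else {
		t->it_reserved_start = t->it_offset;
		t->it_reserved_end = t->it_offset;
	}
}
```

With the fields normalised at store time, the in-use check can trust them and no longer needs its own range validation.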


> 
> Thanks for the feedback!
> Leonardo Bras
> 
> 
> 

-- 
Alexey


Re: [PATCH] padata: add a reviewer

2020-08-27 Thread Daniel Jordan
On Thu, Aug 27, 2020 at 08:44:09AM +0200, Steffen Klassert wrote:
> Please also consider to add yourself as one of the maintainers.

Ok, sure!  I'll take you up on that.


Re: [PATCH] kvm x86/mmu: use KVM_REQ_MMU_SYNC to sync when needed

2020-08-27 Thread Lai Jiangshan
Ping @Sean Christopherson

On Mon, Aug 24, 2020 at 5:18 PM Lai Jiangshan  wrote:
>
> From: Lai Jiangshan 
>
> 8c8560b83390 ("KVM: x86/mmu: Use KVM_REQ_TLB_FLUSH_CURRENT for MMU specific
> flushes") changed it without giving any reason in the changelog.
>
> In theory, the syncing is needed, and this needs to be fixed by reverting
> this part of the change.
>
> Signed-off-by: Lai Jiangshan 
> ---
>  arch/x86/kvm/mmu/mmu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 4e03841f053d..9a93de921f2b 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -2468,7 +2468,7 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct 
> kvm_vcpu *vcpu,
> }
>
> if (sp->unsync_children)
> -   kvm_make_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu);
> +   kvm_make_request(KVM_REQ_MMU_SYNC, vcpu);
>
> __clear_sp_write_flooding_count(sp);
>
> --
> 2.19.1.6.gb485710b
>


RE: [PATCH 07/11] soundwire: intel: Only call sdw stream APIs for the first cpu_dai

2020-08-27 Thread Liao, Bard
> -Original Message-
> From: Vinod Koul 
> Sent: Wednesday, August 26, 2020 5:47 PM
> To: Bard Liao 
> Cc: alsa-de...@alsa-project.org; linux-kernel@vger.kernel.org; ti...@suse.de;
> broo...@kernel.org; gre...@linuxfoundation.org; j...@cadence.com;
> srinivas.kandaga...@linaro.org; rander.w...@linux.intel.com;
> ranjani.sridha...@linux.intel.com; hui.w...@canonical.com; pierre-
> louis.boss...@linux.intel.com; Kale, Sanyog R ; Lin,
> Mengdong ; Liao, Bard 
> Subject: Re: [PATCH 07/11] soundwire: intel: Only call sdw stream APIs for
> the first cpu_dai
> 
> On 18-08-20, 10:41, Bard Liao wrote:
> > We should call these APIs once per stream. So we can only call it when
> > the dai ops is invoked for the first cpu dai.
> >
> > Signed-off-by: Bard Liao 
> > Reviewed-by: Pierre-Louis Bossart
> > 
> > Reviewed-by: Ranjani Sridharan 
> > ---
> >  drivers/soundwire/intel.c | 45
> > +--
> >  1 file changed, 39 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/soundwire/intel.c b/drivers/soundwire/intel.c
> > index 89a8ad1f80e8..7c63581270fd 100644
> > --- a/drivers/soundwire/intel.c
> > +++ b/drivers/soundwire/intel.c
> > @@ -941,11 +941,13 @@ static int intel_hw_params(struct
> > snd_pcm_substream *substream,  static int intel_prepare(struct
> snd_pcm_substream *substream,
> >  struct snd_soc_dai *dai)
> >  {
> > +   struct snd_soc_pcm_runtime *rtd = substream->private_data;
> > +   struct snd_soc_dai *first_cpu_dai = asoc_rtd_to_cpu(rtd, 0);
> > struct sdw_cdns *cdns = snd_soc_dai_get_drvdata(dai);
> > struct sdw_intel *sdw = cdns_to_intel(cdns);
> > struct sdw_cdns_dma_data *dma;
> > int ch, dir;
> > -   int ret;
> > +   int ret = 0;
> >
> > dma = snd_soc_dai_get_dma_data(dai, substream);
> > if (!dma) {
> > @@ -985,7 +987,13 @@ static int intel_prepare(struct
> snd_pcm_substream *substream,
> > goto err;
> > }
> >
> > -   ret = sdw_prepare_stream(dma->stream);
> > +   /*
> > +* All cpu dais belong to a stream. To ensure sdw_prepare_stream
> > +* is called once per stream, we should call it only when
> > +* dai = first_cpu_dai.
> > +*/
> > +   if (first_cpu_dai == dai)
> > +   ret = sdw_prepare_stream(dma->stream);
> 
> Hmmm why not use the one place which is unique in the card to call this,
> hint machine dais are only called once.

Yes, we can call it in the machine driver. But shouldn't it belong to the
platform level? The point is that if we move this to the machine driver, it
will force people to implement it in their own Intel machine drivers.
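
To make the "once per stream" guard concrete, here is a small userspace model of the pattern in the quoted hunk. struct dai_model, struct rtd_model, prepare_stream() and dai_prepare() are hypothetical stand-ins and not the ASoC or SoundWire API; the counter exists only so the behaviour can be observed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a cpu dai and the runtime that owns it. */
struct dai_model {
	int id;
};

struct rtd_model {
	struct dai_model *cpu_dais; /* all cpu dais of one stream */
	int num_cpus;
	int prepare_calls;          /* how often the stream was prepared */
};

/* Stream-level operation that must run once per stream. */
static int prepare_stream(struct rtd_model *rtd)
{
	rtd->prepare_calls++;
	return 0;
}

/* Per-dai callback: invoked once for every cpu dai of the stream. */
static int dai_prepare(struct rtd_model *rtd, struct dai_model *dai)
{
	int ret = 0;

	assert(rtd && dai && rtd->num_cpus > 0);

	/* Per-dai setup would go here. */

	/* Only the first cpu dai triggers the stream-level call. */
	if (dai == &rtd->cpu_dais[0])
		ret = prepare_stream(rtd);

	return ret;
}
```

The guard keeps the stream call inside the platform-level callback, which is the trade-off being argued for above, at the cost of a pointer comparison on every dai callback.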

> 
> ~Vinod


Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2020-08-27 Thread Kehuan Feng
Hi Hillf,

Unfortunately, above mem barriers don't help. The issue shows up
within 1 minute ...

On Thu, Aug 27, 2020 at 8:58 PM Hillf Danton wrote:

>
>
> On Thu, 27 Aug 2020 14:56:31 +0800 Kehuan Feng wrote:
> >
> > > Lets see if TCQ_F_NOLOC is making fq_codel different in your testing.
> >
> > I assume you meant disabling NOLOCK for pfifo_fast.
> >
> > Here is the modification,
> >
> > --- ./net/sched/sch_generic.c.orig  2020-08-24 22:02:04.589830751 +0800
> > +++ ./net/sched/sch_generic.c   2020-08-27 10:17:10.148977195 +0800
> > @@ -792,7 +792,7 @@
> > .dump   =   pfifo_fast_dump,
> > .change_tx_queue_len =  pfifo_fast_change_tx_queue_len,
> > .owner  =   THIS_MODULE,
> > -   .static_flags   =   TCQ_F_NOLOCK | TCQ_F_CPUSTATS,
> > +   .static_flags   =   TCQ_F_CPUSTATS,
> >
> > With it, the issue never happened again in over 3 hours of stress
> > testing, and I restarted the test twice. No surprises. Quite stable...
>
> Jaw off. That is great news and I'm failing again to explain the test
> result wrt the difference TCQ_F_NOLOCK can make in running qdisc.
>
> Nothing comes into mind other than two mem barriers though only one is
> needed...
>
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -3040,6 +3040,7 @@ static void __netif_reschedule(struct Qd
>
>  void __netif_schedule(struct Qdisc *q)
>  {
> +   smp_mb__before_atomic();
> if (!test_and_set_bit(__QDISC_STATE_SCHED, &q->state))
> __netif_reschedule(q);
>  }
> @@ -4899,6 +4900,7 @@ static __latent_entropy void net_tx_acti
>  */
> smp_mb__before_atomic();
> clear_bit(__QDISC_STATE_SCHED, &q->state);
> +   smp_mb__after_atomic();
> qdisc_run(q);
> if (root_lock)
> spin_unlock(root_lock);
>


Re: [PATCH v11 25/25] x86/cet/shstk: Add arch_prctl functions for shadow stack

2020-08-27 Thread H.J. Lu
On Thu, Aug 27, 2020 at 6:35 PM Andy Lutomirski  wrote:
>
> On Thu, Aug 27, 2020 at 12:38 PM H.J. Lu  wrote:
> >
> > On Thu, Aug 27, 2020 at 11:56 AM Andy Lutomirski  
> > wrote:
> > >
> > >
> > >
> > > > On Aug 27, 2020, at 11:13 AM, Yu, Yu-cheng  
> > > > wrote:
> > > >
> > > > On 8/27/2020 6:36 AM, Florian Weimer wrote:
> > > >> * H. J. Lu:
> > >  On Thu, Aug 27, 2020 at 6:19 AM Florian Weimer  
> > >  wrote:
> > > >
> > > > * Dave Martin:
> > > >
> > > >> You're right that this has implications: for i386, libc probably 
> > > >> pulls
> > > >> more arguments off the stack than are really there in some 
> > > >> situations.
> > > >> This isn't a new problem though.  There are already generic prctls 
> > > >> with
> > > >> fewer than 4 args that are used on x86.
> > > >
> > > > As originally posted, glibc prctl would have to know that it has to 
> > > > pull
> > > > an u64 argument off the argument list for ARCH_X86_CET_DISABLE.  But
> > > > then the u64 argument is a problem for arch_prctl as well.
> > > >
> > > >>>
> > > >>> Argument of ARCH_X86_CET_DISABLE is int and passed in register.
> > > >> The commit message and the C source say otherwise, I think (not sure
> > > >> about the C source, not a kernel hacker).
> > > >
> > > > H.J. Lu suggested that we fix x86 arch_prctl() to take four arguments, 
> > > > and then keep MMAP_SHSTK as an arch_prctl().  Because now the map flags 
> > > > and size are all in registers, this also solves problems being pointed 
> > > > out earlier.  Without a wrapper, the shadow stack mmap call (from user 
> > > > space) will be:
> > > >
> > > > syscall(__NR_arch_prctl, ARCH_X86_CET_MMAP_SHSTK, size, MAP_32BIT).
> > >
> > > I admit I don’t see a show stopping technical reason we can’t add 
> > > arguments to an existing syscall, but I’m pretty sure it’s unprecedented, 
> > > and it doesn’t seem like a good idea.
> >
> > prctl prototype is:
> >
> > extern int prctl (int __option, ...)
> >
> > and implemented in kernel as:
> >
> >   int prctl(int option, unsigned long arg2, unsigned long arg3,
> >  unsigned long arg4, unsigned long arg5);
> >
> > Not all prctl operations take all 5 arguments.   It also applies
> > to arch_prctl.  It is quite normal for different operations of
> > arch_prctl to take different numbers of arguments.
>
> If by "quite normal" you mean "does not happen", then I agree.
>
> In any event, I will not have anything to do with a patch that changes
> an existing syscall signature unless Linus personally acks it.  So if
> you want to email him and linux-abi, be my guest.

Can you think of ANY issues with passing more arguments to arch_prctl?
syscall () provided by glibc always passes 6 arguments to the kernel.
Arguments are already in the registers.  What kind of problems do
you see?

-- 
H.J.
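
As an illustration of the point about prctl operations taking different numbers of arguments, the following is a small sketch for Linux with glibc. It queries PR_GET_DUMPABLE once through the variadic prctl() wrapper with only the option, and once through syscall() with all five arguments spelled out. The helper names are hypothetical, and the sketch assumes the kernel ignores the unused arguments for this particular option.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * glibc wrapper: prctl() is variadic, so only the option is given here.
 * Whatever the wrapper passes for the remaining arguments is assumed to
 * be ignored by the kernel for PR_GET_DUMPABLE.
 */
static int dumpable_via_wrapper(void)
{
	return prctl(PR_GET_DUMPABLE);
}

/* Raw system call with all five arguments; the unused ones are zero. */
static long dumpable_via_raw_syscall(void)
{
	return syscall(SYS_prctl, PR_GET_DUMPABLE, 0UL, 0UL, 0UL, 0UL);
}
```

Both helpers should report the same dumpable state for the calling process. This only demonstrates existing prctl behaviour; it does not settle whether adding arguments to arch_prctl is acceptable.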

