date:20190920

Re: [PATCH -net] zd1211rw: zd_usb: Use "%zu" to format size_t

2019-09-20 Thread Kalle Valo

Geert Uytterhoeven  wrote:

> On 32-bit:
> 
> drivers/net/wireless/zydas/zd1211rw/zd_usb.c: In function ‘check_read_regs’:
> drivers/net/wireless/zydas/zd1211rw/zd_def.h:18:25: warning: format ‘%ld’ expects argument of type ‘long int’, but argument 6 has type ‘size_t’ {aka ‘unsigned int’} [-Wformat=]
>   dev_printk(level, dev, "%s() " fmt, __func__, ##args)
>^~~
> drivers/net/wireless/zydas/zd1211rw/zd_def.h:22:4: note: in expansion of macro ‘dev_printk_f’
>   dev_printk_f(KERN_DEBUG, dev, fmt, ## args)
>   ^~~~
> drivers/net/wireless/zydas/zd1211rw/zd_usb.c:1635:3: note: in expansion of macro ‘dev_dbg_f’
>dev_dbg_f(zd_usb_dev(usb),
>^
> drivers/net/wireless/zydas/zd1211rw/zd_usb.c:1636:51: note: format string is defined here
>"error: actual length %d less than expected %ld\n",
>~~^
>%d
> 
> Fixes: 84b0b66352470e64 ("zd1211rw: zd_usb: Use struct_size() helper")
> Signed-off-by: Geert Uytterhoeven 

Patch applied to wireless-drivers.git, thanks.

6355592e6b55 zd1211rw: zd_usb: Use "%zu" to format size_t
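
For readers who want to reproduce the point of the fix outside the kernel, here is a minimal userspace sketch. It assumes nothing about the driver beyond the message text quoted in the warning; the helper name format_short_read() is hypothetical. "%zu" is the C99 conversion for size_t, so the same format string is correct on both 32-bit and 64-bit builds.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: formats the message from the warning above.
 * "%zu" matches size_t regardless of whether size_t is 32 or 64 bits wide,
 * which "%ld" does not on 32-bit targets.
 * Returns the snprintf() result.
 */
static int format_short_read(char *buf, size_t buflen,
			     int actual, size_t expected)
{
	return snprintf(buf, buflen,
			"error: actual length %d less than expected %zu\n",
			actual, expected);
}
```

Building this with -Wformat on a 32-bit target should stay silent, whereas the "%ld" variant would produce the warning quoted above.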

-- 
https://patchwork.kernel.org/patch/11151959/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

Re: [PATCH 5.3 00/21] 5.3.1-stable review

2019-09-20 Thread Greg Kroah-Hartman

On Fri, Sep 20, 2019 at 08:11:35PM +0530, Naresh Kamboju wrote:
> On Fri, 20 Sep 2019 at 03:36, Greg Kroah-Hartman
>  wrote:
> >
> > This is the start of the stable review cycle for the 5.3.1 release.
> > There are 21 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Sat 21 Sep 2019 09:44:25 PM UTC.
> > Anything received after that time might be too late.
> >
> > The whole patch series can be found in one patch at:
> > 
> > https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.3.1-rc1.gz
> > or in the git tree and branch at:
> > 
> > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git 
> > linux-5.3.y
> > and the diffstat can be found below.
> >
> > thanks,
> >
> > greg k-h
> >
> 
> Results from Linaro’s test farm.
> No regressions on arm64, arm, x86_64, and i386.

Nice to see 5.3.0 pass everything :)

Thanks for testing all of these and letting me know.

greg k-h

Re: [PATCH 5.3 00/21] 5.3.1-stable review

2019-09-20 Thread Greg Kroah-Hartman

On Fri, Sep 20, 2019 at 03:17:48PM -0600, shuah wrote:
> On 9/19/19 4:03 PM, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 5.3.1 release.
> > There are 21 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> > 
> > Responses should be made by Sat 21 Sep 2019 09:44:25 PM UTC.
> > Anything received after that time might be too late.
> > 
> > The whole patch series can be found in one patch at:
> > 
> > https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.3.1-rc1.gz
> > or in the git tree and branch at:
> > 
> > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git 
> > linux-5.3.y
> > and the diffstat can be found below.
> > 
> > thanks,
> > 
> > greg k-h
> > 
> 
> Compiled and booted on my test system. No dmesg regressions.

Thanks for testing all of these and letting me know.

greg k-h

Re: [PATCH 5.2 000/124] 5.2.17-stable review

2019-09-20 Thread Greg Kroah-Hartman

On Fri, Sep 20, 2019 at 11:37:38AM -0700, Guenter Roeck wrote:
> On Fri, Sep 20, 2019 at 12:01:28AM +0200, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 5.2.17 release.
> > There are 124 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> > 
> > Responses should be made by Sat 21 Sep 2019 09:44:25 PM UTC.
> > Anything received after that time might be too late.
> > 
> 
> Ok, here we are:
> 
> Build results:
>   total: 159 pass: 159 fail: 0
> Qemu test results:
>   total: 390 pass: 390 fail: 0

Wonderful, thanks for testing all of these and letting me know.

greg k-h

Re: [PATCH] riscv: Fix memblock reservation for device tree blob

2019-09-20 Thread Anup Patel

On Sat, Sep 21, 2019 at 6:30 AM Albert Ou  wrote:
>
> This fixes an error with how the FDT blob is reserved in memblock.
> An incorrect physical address calculation exposed the FDT header to
> unintended corruption, which typically manifested with of_fdt_raw_init()
> faulting during late boot after fdt_totalsize() returned a wrong value.
> Systems with smaller physical memory sizes more frequently trigger this
> issue, as the kernel is more likely to allocate from the DMA32 zone
> where bbl places the DTB after the kernel image.
>
> Commit 671f9a3e2e24 ("RISC-V: Setup initial page tables in two stages")
> changed the mapping of the DTB to reside in the fixmap area.
> Consequently, early_init_fdt_reserve_self() cannot be used anymore in
> setup_bootmem() since it relies on __pa() to derive a physical address,
> which does not work with dtb_early_va that is no longer a valid kernel
> logical address.
>
> The reserved[0x1] region shows the effect of the pointer underflow
> resulting from the __pa(initial_boot_params) offset subtraction:
>
> [0.00] MEMBLOCK configuration:
> [0.00]  memory size = 0x1fe0 reserved size = 0x00a2e514
> [0.00]  memory.cnt  = 0x1
> [0.00]  memory[0x0] [0x8020-0x9fff], 0x1fe0 bytes flags: 0x0
> [0.00]  reserved.cnt  = 0x2
> [0.00]  reserved[0x0]   [0x8020-0x80c2dfeb], 0x00a2dfec bytes flags: 0x0
> [0.00]  reserved[0x1]   [0xfff08010-0xfff080100527], 0x0528 bytes flags: 0x0
>
> With the fix applied:
>
> [0.00] MEMBLOCK configuration:
> [0.00]  memory size = 0x1fe0 reserved size = 0x00a2e514
> [0.00]  memory.cnt  = 0x1
> [0.00]  memory[0x0] [0x8020-0x9fff], 0x1fe0 bytes flags: 0x0
> [0.00]  reserved.cnt  = 0x2
> [0.00]  reserved[0x0]   [0x8020-0x80c2dfeb], 0x00a2dfec bytes flags: 0x0
> [0.00]  reserved[0x1]   [0x80e0-0x80e00527], 0x0528 bytes flags: 0x0

Thanks for catching this issue.

Most of us probably did not notice this issue because:
1. We generally have enough RAM on QEMU and SiFive Unleashed.
2. Most people boot Linux with OpenSBI FW_JUMP on QEMU and U-Boot on
SiFive Unleashed, which places the FDT quite far from the end of the
Linux kernel.

The Linux ARM64 kernel also uses FIXMAP to access the FDT, and
early_init_fdt_reserve_self() is not used there either.
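
To illustrate why a __pa()-style conversion goes wrong for a fixmap pointer, here is a small userspace sketch. All constants and helper names are hypothetical and chosen only for illustration; they are not the real RISC-V layout. The sketch assumes the conversion is a plain constant-offset subtraction that is only meaningful for linear-map addresses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout, for illustration only (not the real RISC-V values). */
#define PAGE_OFFSET 0xffffffe000000000ULL /* VA where the linear map starts */
#define PHYS_BASE   0x0000000080200000ULL /* PA mapped at PAGE_OFFSET */
#define MEM_SIZE    0x0000000020000000ULL /* 512 MiB of RAM */
#define FIXMAP_VA   0xffffffcefee00000ULL /* a fixmap address below PAGE_OFFSET */

/* Constant-offset conversion: no check that va is inside the linear map. */
static uint64_t linear_pa(uint64_t va)
{
	return va - PAGE_OFFSET + PHYS_BASE;
}

/* True if pa falls inside the (hypothetical) RAM range. */
static bool pa_in_ram(uint64_t pa)
{
	return pa >= PHYS_BASE && pa - PHYS_BASE < MEM_SIZE;
}
```

With these assumed values, a linear-map address converts to a physical address inside RAM, while the fixmap address wraps around to a value far outside it. That is the kind of bogus region the reserved[0x1] entry in the first log shows, and it is why the patch records the physical address handed to setup_vm() instead.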

>
> Fixes: 671f9a3e2e24 ("RISC-V: Setup initial page tables in two stages")
> Signed-off-by: Albert Ou 
> ---
>  arch/riscv/mm/init.c | 13 -
>  1 file changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> index f0ba713..52d007c 100644
> --- a/arch/riscv/mm/init.c
> +++ b/arch/riscv/mm/init.c
> @@ -11,6 +11,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  #include 
>  #include 
> @@ -82,6 +83,8 @@ static void __init setup_initrd(void)
>  }
>  #endif /* CONFIG_BLK_DEV_INITRD */
>
> +static phys_addr_t __dtb_pa __initdata;

Maybe dtb_early_pa would be a more consistent name than __dtb_pa,
because it matches dtb_early_va used below.

> +
>  void __init setup_bootmem(void)
>  {
> struct memblock_region *reg;
> @@ -117,7 +120,12 @@ void __init setup_bootmem(void)
> setup_initrd();
>  #endif /* CONFIG_BLK_DEV_INITRD */
>
> -   early_init_fdt_reserve_self();
> +   /*
> +* Avoid using early_init_fdt_reserve_self() since __pa() does
> +* not work for DTB pointers that are fixmap addresses
> +*/
> +   memblock_reserve(__dtb_pa, fdt_totalsize(dtb_early_va));
> +
> early_init_fdt_scan_reserved_mem();
> memblock_allow_resize();
> memblock_dump_all();
> @@ -333,6 +341,7 @@ static uintptr_t __init best_map_size(phys_addr_t base, 
> phys_addr_t size)
> "not use absolute addressing."
>  #endif
>
> +

Please remove this newline addition.

>  asmlinkage void __init setup_vm(uintptr_t dtb_pa)
>  {
> uintptr_t va, end_va;
> @@ -393,6 +402,8 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
>
> /* Save pointer to DTB for early FDT parsing */
> dtb_early_va = (void *)fix_to_virt(FIX_FDT) + (dtb_pa & ~PAGE_MASK);
> +   /* Save physical address for memblock reservation */
> +   __dtb_pa = dtb_pa;
>  }
>
>  static void __init setup_vm_final(void)
> --
> 2.7.4
>
>
> ___
> linux-riscv mailing list
> linux-ri...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv

This deserves to be a stable kernel fix as well.
You should add:
Cc: sta...@vger.kernel.org
to your commit description.

Apart from minor nits above.

Reviewed-by: Anup Patel 

I tried this patch for both RV64 and RV32 on QEMU with
Yocto rootfs.

Tested-by:

[PATCH] perf docs: Allow man page date to be specified

2019-09-20 Thread Ian Rogers

With this change if a perf_date parameter is provided to asciidoc
then it will override the default date written to the man page metadata.
Without this change, or if the perf_date isn't specified, then the
current date is written to the metadata. Having this parameter allows
the metadata to be constant if builds happen on different dates. The
name of the parameter is intended to be consistent with the existing
perf_version parameter.

Signed-off-by: Ian Rogers 
---
 tools/perf/Documentation/asciidoc.conf | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/perf/Documentation/asciidoc.conf 
b/tools/perf/Documentation/asciidoc.conf
index 356b23a40339..2b62ba1e72b7 100644
--- a/tools/perf/Documentation/asciidoc.conf
+++ b/tools/perf/Documentation/asciidoc.conf
@@ -71,6 +71,9 @@ ifdef::backend-docbook[]
 [header]
 template::[header-declarations]
 
+ifdef::perf_date[]
+{perf_date}
+endif::perf_date[]
 
 {mantitle}
 {manvolnum}
-- 
2.23.0.351.gc4317032e6-goog

Re: pci: endpoint test BUG

2019-09-20 Thread Randy Dunlap

On 9/20/19 7:04 PM, Hillf Danton wrote:
>>> It will be resent if no one saw the message.
>>
>> I didn't see it and I can't find it on lore.kernel.org/linux-pci/.
>
> Respin, git send-email works/jj/pci-epf-uaf.txt
> ...
> From: Hillf Danton 
> To: Bjorn Helgaas 
> Cc: linux-pci ,
>     LKML ,
>     Randy Dunlap ,
>     Al Viro ,
>     Dan Carpenter ,
>     Lorenzo Pieralisi ,
>     Kishon Vijay Abraham I ,
>     Andrey Konovalov ,
>     Hillf Danton 
> Subject: [PATCH] PCI: endpoint: Fix uaf on unregistering driver
> Date: Sat, 21 Sep 2019 09:58:28 +0800
> Message-Id: <20190921015828.15644-1-hdan...@sina.com>
> MIME-Version: 1.0
> Content-Transfer-Encoding: 8bit
>
> Result: 250
>
> And let me know you see it.

No, not seeing the patch in my Inbox nor on lore.kernel.org.

It's a mystery to me.

-- 
~Randy

Re: [RFC] microoptimizing hlist_add_{before,behind}

2019-09-20 Thread Al Viro

On Sat, Sep 21, 2019 at 12:12:33AM +0100, Al Viro wrote:
>   Neither hlist_add_before() nor hlist_add_behind() should ever
> be called with both arguments pointing to the same hlist_node.
> However, gcc doesn't know that, so it ends up with pointless reloads.
> AFAICS, the following generates better code, is obviously equivalent
> in case when arguments are different and actually even in case when
> they are same, the end result is identical (if the hlist hadn't been
> corrupted even earlier than that).
> 
>   Objections?
> 
> Signed-off-by: Al Viro 

*gyah*

git diff >/tmp/y1



scp-out /tmp/y1



My apologies ;-/  Correct diff follows:

diff --git a/include/linux/list.h b/include/linux/list.h
index 85c92555e31f..5c84383675bc 100644
--- a/include/linux/list.h
+++ b/include/linux/list.h
@@ -793,21 +793,21 @@ static inline void hlist_add_head(struct hlist_node *n, 
struct hlist_head *h)
 static inline void hlist_add_before(struct hlist_node *n,
struct hlist_node *next)
 {
-   n->pprev = next->pprev;
+   struct hlist_node **p = n->pprev = next->pprev;
n->next = next;
next->pprev = &n->next;
-   WRITE_ONCE(*(n->pprev), n);
+   WRITE_ONCE(*p, n);
 }
 
 static inline void hlist_add_behind(struct hlist_node *n,
struct hlist_node *prev)
 {
-   n->next = prev->next;
+   struct hlist_node *p = n->next = prev->next;
prev->next = n;
n->pprev = &prev->next;
 
-   if (n->next)
-   n->next->pprev  = &n->next;
+   if (p)
+   p->pprev  = &n->next;
 }
 
 /* after that we'll appear to be on some hlist and hlist_del will work */
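
The idea in the diff can be tried in userspace with a cut-down copy of the data structure. This is only a sketch: the struct definitions are minimal stand-ins, the function name hlist_add_before_sketch() is hypothetical, and the final store is a plain assignment where the kernel uses WRITE_ONCE().

```c
#include <stddef.h>

/* Minimal stand-ins for the kernel's hlist types. */
struct hlist_node {
	struct hlist_node *next, **pprev;
};

struct hlist_head {
	struct hlist_node *first;
};

/*
 * Same shape as the proposed hlist_add_before(): the value of next->pprev
 * is kept in a local, so the compiler does not have to reload n->pprev
 * after the stores through n and next (which it must otherwise assume
 * could alias).
 */
static void hlist_add_before_sketch(struct hlist_node *n,
				    struct hlist_node *next)
{
	struct hlist_node **p = n->pprev = next->pprev;

	n->next = next;
	next->pprev = &n->next;
	*p = n;		/* the kernel version uses WRITE_ONCE() here */
}
```

Comparing the generated assembly of this variant against one that stores through n->pprev at the end is one way to see the reloads the original message refers to.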

Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()

2019-09-20 Thread Willy Tarreau

On Fri, Sep 20, 2019 at 04:30:20PM -0700, Andy Lutomirski wrote:
> So I think that just improving the
> getrandom()-is-blocking-on-x86-and-arm behavior, adding GRND_INSECURE
> and GRND_SECURE_BLOCKING, and adding the warning if 0 is passed is
> good enough.

I think so as well. Anyway, keep in mind that *with a sane API*,
userland can improve very quickly (faster than kernel deployments in
the field). But userland developers need reliable and testable support for
features. If it's enough to do #ifndef GRND_xxx/#define GRND_xxx and
call getrandom() with these flags to detect support, it's basically 5
reliable lines of code to add to userland to make a warning disappear
and/or to allow a system that previously failed to boot to now boot. So
this gives strong incentive to userland to adopt the new API, provided
there's a way for the developer to understand what's happening (which
the warning does).
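
The "5 reliable lines" pattern described above might look roughly like the sketch below. It rests on several assumptions: GRND_INSECURE is the flag proposed in this thread, the fallback value 0x0004 is an assumption that should be checked against your kernel headers, the function name early_random() is hypothetical, and the libc is assumed to provide getrandom() in <sys/random.h>.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/random.h>
#include <sys/types.h>

/* Assumed value for the proposed flag; verify against <linux/random.h>. */
#ifndef GRND_INSECURE
#define GRND_INSECURE 0x0004
#endif

/*
 * Hypothetical helper: ask for best-effort bytes without blocking.
 * A kernel that does not know the flag rejects it with EINVAL, which
 * doubles as the feature test; in that case fall back to GRND_NONBLOCK.
 * Returns the number of bytes written, or -1 with errno set.
 */
static ssize_t early_random(void *buf, size_t len)
{
	ssize_t r = getrandom(buf, len, GRND_INSECURE);

	if (r < 0 && errno == EINVAL)
		r = getrandom(buf, len, GRND_NONBLOCK);
	return r;
}
```

The design point is that support is detected by calling the syscall and looking at the error, not by comparing kernel version numbers, which is what makes the feature testable from userland.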

If we do it right, all we'll hear are userland developers complaining
that those stupid kernel developers have changed their API again and
really don't know what they want. That will be a good sign that the
warning flows back to them and that adoption is taking.

And if the change is small enough, maybe it could make sense to backport
it to stable versions to fix boot issues. With a testable feature it
does make sense.

Willy

Re: [PATCH] dt-bindings: net: dwmac: fix 'mac-mode' type

2019-09-20 Thread Florian Fainelli




On 9/20/2019 6:11 PM, Jakub Kicinski wrote:
> On Tue, 17 Sep 2019 13:30:52 +0300, Alexandru Ardelean wrote:
>> The 'mac-mode' property is similar to 'phy-mode' and 'phy-connection-type',
>> which are enums of mode strings.
>>
>> The 'dwmac' driver supports almost all modes declared in the 'phy-mode'
>> enum (except for 1 or 2). But in general, there may be a case where
>> 'mac-mode' becomes more generic and is moved as part of phylib or netdev.
>>
>> In any case, the 'mac-mode' field should be made an enum, and it also makes
>> sense to just reference the 'phy-connection-type' from
>> 'ethernet-controller.yaml'. That will also make it more future-proof for new
>> modes.
>>
>> Signed-off-by: Alexandru Ardelean 
> 
> Applied, thank you!
> 
> FWIW I had to add the Fixes tag by hand, either ozlabs patchwork or my
> git-pw doesn't have the automagic handling there, yet.

AFAICT the ozlabs patchwork instance does not do it, but if you have
patchwork administrative rights (the Django administration panel, I mean)
then it is simple to add the regular expression to the list of tags that
patchwork already recognizes. I had tried getting that included by
default, but it also counted all of those tags and therefore was not
particularly fine-grained:

https://lists.ozlabs.org/pipermail/patchwork/2017-January/003910.html
-- 
Florian

Re: [PATCH] perf: add support for logging debug messages to file

2019-09-20 Thread Changbin Du

On Fri, Sep 20, 2019 at 10:53:56PM +0200, Jiri Olsa wrote:
> On Sun, Sep 15, 2019 at 06:27:40PM +0800, Changbin Du wrote:
> > When in TUI mode, it is impossible to show all the debug messages on
> > the console. This makes it hard to debug perf issues using debug messages.
> > This patch adds support for logging debug messages to a file to resolve
> > this problem.
> > 
> > The usage is:
> > perf --debug verbose=2 --debug file=1 COMMAND
> > 
> > And the path of log file is '~/perf.log'.
> > 
> > Signed-off-by: Changbin Du 
> > ---
> >  tools/perf/Documentation/perf.txt |  4 +++-
> >  tools/perf/util/debug.c   | 20 
> >  2 files changed, 23 insertions(+), 1 deletion(-)
> > 
> > diff --git a/tools/perf/Documentation/perf.txt 
> > b/tools/perf/Documentation/perf.txt
> > index 401f0ed67439..45db7b22d1a5 100644
> > --- a/tools/perf/Documentation/perf.txt
> > +++ b/tools/perf/Documentation/perf.txt
> > @@ -16,7 +16,8 @@ OPTIONS
> > Setup debug variable (see list below) in value
> > range (0, 10). Use like:
> >   --debug verbose   # sets verbose = 1
> > - --debug verbose=2 # sets verbose = 2
> > + --debug verbose=2 --debug file=1
> > +   # sets verbose = 2 and save log to file
> 
> it's variable already, why not allow to pass the path directly like:
> 
>   --debug file=~/perf.log
> 
> would be great if we won't need to use --debug twice and allow:
> 
>   --debug verbose=2,file=perf.log
>
This could be done, but we need to change the option-parsing code
first. I will do it later.
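
As a rough idea of what parsing the combined form `--debug verbose=2,file=perf.log` could involve, here is a standalone sketch. It is not perf's actual option parser: struct debug_opts and parse_debug_opts() are hypothetical names, and only the two variables discussed in this thread are handled.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the two variables discussed here. */
struct debug_opts {
	int verbose;
	char file[256];
};

/*
 * Parse a comma-separated list such as "verbose=2,file=perf.log".
 * A bare "verbose" means verbose = 1, matching the documented behaviour.
 * Returns 0 on success, -1 on an unknown name or a missing file value.
 */
static int parse_debug_opts(const char *arg, struct debug_opts *o)
{
	char copy[512];
	char *tok = copy;

	if (strlen(arg) >= sizeof(copy))
		return -1;
	strcpy(copy, arg);

	while (tok) {
		char *next = strchr(tok, ',');
		char *val;

		if (next)
			*next++ = '\0';		/* terminate this name=value pair */
		val = strchr(tok, '=');
		if (val)
			*val++ = '\0';		/* split name from value */

		if (!strcmp(tok, "verbose"))
			o->verbose = val ? atoi(val) : 1;
		else if (!strcmp(tok, "file") && val && *val)
			snprintf(o->file, sizeof(o->file), "%s", val);
		else
			return -1;
		tok = next;
	}
	return 0;
}
```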

> jirka

-- 
Cheers,
Changbin Du

Re: [PATCH v3 0/9] hacking: make 'kernel hacking' menu better structurized

2019-09-20 Thread Changbin Du

Gentle ping for status of this series. thx!

On Mon, Sep 09, 2019 at 10:44:44PM +0800, Changbin Du wrote:
> This series is a trivial improvement to the layout of the 'kernel hacking'
> configuration menu. It now has many items, and looking them up takes a
> little time because they are not yet well structured.
> 
> Early discussion is here:
> https://lkml.org/lkml/2019/9/1/39
> 
> This is a preview:
> 
>       printk and dmesg options  --->
>       Compile-time checks and compiler options  --->
>       Generic Kernel Debugging Instruments  --->
>   -*- Kernel debugging
>   [*]   Miscellaneous debug code
>       Memory Debugging  --->
>   [ ] Debug shared IRQ handlers
>       Debug Oops, Lockups and Hangs  --->
>       Scheduler Debugging  --->
>   [*] Enable extra timekeeping sanity checking
>       Lock Debugging (spinlocks, mutexes, etc...)  --->
>   -*- Stack backtrace support
>   [ ] Warn for all uses of unseeded randomness
>   [ ] kobject debugging
>       Debug kernel data structures  --->
>   [ ] Debug credential management
>       RCU Debugging  --->
>   [ ] Force round-robin CPU selection for unbound work items
>   [ ] Force extended block device numbers and spread them
>   [ ] Enable CPU hotplug state control
>   [*] Latency measuring infrastructure
>   [*] Tracers  --->
>   [ ] Remote debugging over FireWire early on boot
>   [*] Sample kernel code  --->
>   [*] Filter access to /dev/mem
>   [ ]   Filter I/O access to /dev/mem
>   [ ] Additional debug code for syzbot
>       x86 Debugging  --->
>       Kernel Testing and Coverage  --->
>
>       < Exit >    < Help >    < Save >    < Load >
> 
> v3:
>   o change subject prefix.
> v2:
>   o rebase to linux-next.
>   o move DEBUG_FS to 'Generic Kernel Debugging Instruments'
>   o move DEBUG_NOTIFIERS to 'Debug kernel data structures'
> 
> Changbin Du (9):
>   hacking: Group sysrq/kgdb/ubsan into 'Generic Kernel Debugging
> Instruments'
>   hacking: Create submenu for arch special debugging options
>   hacking: Group kernel data structures debugging together
>   hacking: Move kernel testing and coverage options to same submenu
>   hacking: Move Oops into 'Lockups and Hangs'
>   hacking: Move SCHED_STACK_END_CHECK after DEBUG_STACK_USAGE
>   hacking: Create a submenu for scheduler debugging options
>   hacking: Move DEBUG_BUGVERBOSE to 'printk and dmesg options'
>   hacking: Move DEBUG_FS to 'Generic Kernel Debugging Instruments'
> 
>  lib/Kconfig.debug | 659 --
>  1 file changed, 340 insertions(+), 319 deletions(-)
> 
> -- 
> 2.20.1
> 

-- 
Cheers,
Changbin Du

Re: [PATCH] dt-bindings: net: remove un-implemented property

2019-09-20 Thread Jakub Kicinski

On Wed, 18 Sep 2019 14:14:47 +0300, Alexandru Ardelean wrote:
> The `adi,disable-energy-detect` property was implemented in an initial
> version of the `adin` driver series, but after a review it was discarded in
> favor of implementing the ETHTOOL_PHY_EDPD phy-tunable option.
> 
> With the ETHTOOL_PHY_EDPD control, it's possible to disable/enable
> Energy-Detect-Power-Down for the `adin` PHY, so this device-tree is not
> needed.
> 
> Fixes: 767078132ff9 ("dt-bindings: net: add bindings for ADIN PHY driver")
> Signed-off-by: Alexandru Ardelean 

Applied, thank you!

[PATCH] quota: code cleanup for hash bits calculation

2019-09-20 Thread Chengguang Xu

Code cleanup for hash bits calculation by
calling rounddown_pow_of_two() and ilog2()

Signed-off-by: Chengguang Xu 
---
 fs/quota/dquot.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index 6e826b454082..679dd3b5db70 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -2983,13 +2983,9 @@ static int __init dquot_init(void)
 
/* Find power-of-two hlist_heads which can fit into allocation */
nr_hash = (1UL << order) * PAGE_SIZE / sizeof(struct hlist_head);
-   dq_hash_bits = 0;
-   do {
-   dq_hash_bits++;
-   } while (nr_hash >> dq_hash_bits);
-   dq_hash_bits--;
+   nr_hash = rounddown_pow_of_two(nr_hash);
+   dq_hash_bits = ilog2(nr_hash);
 
-   nr_hash = 1UL << dq_hash_bits;
dq_hash_mask = nr_hash - 1;
for (i = 0; i < nr_hash; i++)
INIT_HLIST_HEAD(dquot_hash + i);
-- 
2.21.0
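
The equivalence the cleanup relies on can be checked in userspace. The sketch below uses simple stand-ins for the kernel's ilog2() and rounddown_pow_of_two() (the names ilog2_ul() and rounddown_pow2_ul() are hypothetical, and they are only valid for n > 0) and compares the removed loop with the new two-call form.

```c
/* Stand-in for ilog2(): floor(log2(n)), valid for n > 0. */
static unsigned int ilog2_ul(unsigned long n)
{
	unsigned int bits = 0;

	while (n >>= 1)
		bits++;
	return bits;
}

/* Stand-in for rounddown_pow_of_two(): largest power of two <= n. */
static unsigned long rounddown_pow2_ul(unsigned long n)
{
	return 1UL << ilog2_ul(n);
}

/* The loop removed by the patch. */
static unsigned int hash_bits_old(unsigned long nr_hash)
{
	unsigned int bits = 0;

	do {
		bits++;
	} while (nr_hash >> bits);
	return bits - 1;
}

/* The replacement: round down first, then take the log. */
static unsigned int hash_bits_new(unsigned long nr_hash)
{
	return ilog2_ul(rounddown_pow2_ul(nr_hash));
}
```

Both forms give floor(log2(nr_hash)) for the sizes that can occur here; for example, 1000 hlist heads round down to 512, which is 9 hash bits either way.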

[PATCH] selinux: Remove load size limit

2019-09-20 Thread zhanglin

Load size was limited to 64MB. This was a legacy limitation due to vmalloc()
that was removed a while ago.

Limiting load size to 64MB is both pointless and affects real world use
cases.

Signed-off-by: zhanglin 
---
 security/selinux/selinuxfs.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/security/selinux/selinuxfs.c b/security/selinux/selinuxfs.c
index f3a5a138a096..4249400e9712 100644
--- a/security/selinux/selinuxfs.c
+++ b/security/selinux/selinuxfs.c
@@ -549,10 +549,6 @@ static ssize_t sel_write_load(struct file *file, const 
char __user *buf,
if (*ppos != 0)
goto out;
 
-   length = -EFBIG;
-   if (count > 64 * 1024 * 1024)
-   goto out;
-
length = -ENOMEM;
data = vmalloc(count);
if (!data)
-- 
2.17.1

Re: PROBLEM: nfs? crash in Linux 5.3 (possible regression)

2019-09-20 Thread Nick Bowler

On 9/20/19, Trond Myklebust  wrote:
> On Fri, 2019-09-20 at 14:23 -0400, Nick Bowler wrote:
>> Not sure how reproducible this is.  Since I've never seen a crash
>> like this before it may be a regression compared to, say, Linux 4.19
>> but I am not certain because this particular machine is brand new so
>> I don't have experience with older kernels on it...

So it actually seems pretty reliably reproducible, 4 attempts to compile
Linux on Linux 5.3 and all four crash the same way, although there's
definitely some randomness here...

On the other hand, I cannot reproduce if I install Linux 5.2 so it does
seem like a regression in 5.3.  I will see how well bisecting goes...

>> [  796.050025] BUG: kernel NULL pointer dereference, address:
>> 0014
>> [  796.051280] #PF: supervisor read access in kernel mode
>> [  796.053063] #PF: error_code(0x) - not-present page
>> [  796.054636] PGD 0 P4D 0
>> [  796.055688] Oops:  [#1] PREEMPT SMP
>> [  796.056768] CPU: 2 PID: 190 Comm: kworker/2:2 Tainted: GW
>>   5.3.0 #6
>> [  796.057953] Hardware name: To Be Filled By O.E.M. To Be Filled By
>> O.E.M./B450 Gaming-ITX/ac, BIOS P3.30 05/17/2019
>> [  796.059329] Workqueue: events key_garbage_collector
>> [  796.060623] RIP: 0010:keyring_gc_check_iterator+0x27/0x30
>
> That would be the keyring garbage collector, not NFS.
>
> Cced keyri...@vger.kernel.org
>
>
>> [  796.061845] Code: 44 00 00 48 83 e7 fc b8 01 00 00 00 f6 87 80 00
>> 00 00 21 75 19 48 8b 57 58 48 39 16 7c 05 48 85 d2 7f 0b 48 8b 87 a0
>> 00 00 00 <0f> b6 40 14 c3 0f 1f 40 00 48 83 e7 fc e9 27 eb ff ff 0f
>> 1f
>> 80 00
>> [  796.064638] RSP: 0018:b40fc0757df8 EFLAGS: 00010282
>> [  796.066058] RAX:  RBX: a14338caed80 RCX:
>> b40fc0757e40
>> [  796.067531] RDX: a1433ae85558 RSI: b40fc0757e40 RDI:
>> a1433ae85500
>> [  796.069014] RBP: b40fc0757e40 R08:  R09:
>> 000f
>> [  796.070513] R10: 8080808080808080 R11: 0001 R12:
>> a4cd6180
>> [  796.072025] R13: a14338caee10 R14: a14338caedf0 R15:
>> a1433ffeff00
>> [  796.073567] FS:  () GS:a1434048()
>> knlGS:
>> [  796.075171] CS:  0010 DS:  ES:  CR0: 80050033
>> [  796.076785] CR2: 0014 CR3: 000747ce6000 CR4:
>> 003406e0
>> [  796.078445] Call Trace:
>> [  796.080091]  assoc_array_subtree_iterate+0x55/0x100
>> [  796.081770]  keyring_gc+0x3f/0x80
>> [  796.083447]  key_garbage_collector+0x330/0x3d0
>> [  796.085155]  process_one_work+0x1cb/0x320
>> [  796.086869]  worker_thread+0x28/0x3c0
>> [  796.088603]  ? process_one_work+0x320/0x320
>> [  796.090335]  kthread+0x106/0x120
>> [  796.092053]  ? kthread_create_on_node+0x40/0x40
>> [  796.093810]  ret_from_fork+0x1f/0x30
>> [  796.095569] Modules linked in: sha1_ssse3 sha1_generic cbc cts
>> rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace ext4 crc16 mbcache
>> jbd2 iwlmvm mac80211 libarc4 amdgpu iwlwifi snd_hda_codec_realtek
>> snd_hda_codec_generic kvm_amd gpu_sched kvm snd_hda_codec_hdmi
>> drm_kms_helper irqbypass k10temp syscopyarea sysfillrect sysimgblt
>> fb_sys_fops video ttm cfg80211 snd_hda_intel snd_hda_codec drm
>> snd_hwdep rfkill snd_hda_core backlight snd_pcm evdev snd_timer snd
>> soundcore efivarfs dm_crypt hid_generic igb hwmon i2c_algo_bit sr_mod
>> cdrom sunrpc dm_mod
>> [  796.104033] CR2: 0014
>> [  796.106304] ---[ end trace 695aee10f9202347 ]---
>> [  796.108585] RIP: 0010:keyring_gc_check_iterator+0x27/0x30
>> [  796.110894] Code: 44 00 00 48 83 e7 fc b8 01 00 00 00 f6 87 80 00
>> 00 00 21 75 19 48 8b 57 58 48 39 16 7c 05 48 85 d2 7f 0b 48 8b 87 a0
>> 00 00 00 <0f> b6 40 14 c3 0f 1f 40 00 48 83 e7 fc e9 27 eb ff ff 0f
>> 1f
>> 80 00
>> [  796.115773] RSP: 0018:b40fc0757df8 EFLAGS: 00010282
>> [  796.118209] RAX:  RBX: a14338caed80 RCX:
>> b40fc0757e40
>> [  796.120683] RDX: a1433ae85558 RSI: b40fc0757e40 RDI:
>> a1433ae85500
>> [  796.123176] RBP: b40fc0757e40 R08:  R09:
>> 000f
>> [  796.125668] R10: 8080808080808080 R11: 0001 R12:
>> a4cd6180
>> [  796.128104] R13: a14338caee10 R14: a14338caedf0 R15:
>> a1433ffeff00
>> [  796.130493] FS:  () GS:a1434048()
>> knlGS:
>> [  796.132923] CS:  0010 DS:  ES:  CR0: 80050033
>> [  796.135266] CR2: 0014 CR3: 000747ce6000 CR4:
>> 003406e0
>

Re: [PATCH] drivers/net: release skb on failure

2019-09-20 Thread Jakub Kicinski

On Tue, 17 Sep 2019 23:45:21 -0500, Navid Emamdoost wrote:
> In ql_run_loopback_test, ql_lb_send does not release skb when fails. So
> it must be released before returning.
> 
> Signed-off-by: Navid Emamdoost 
> ---
>  drivers/net/ethernet/qlogic/qlge/qlge_ethtool.c | 4 +++-

Thanks for the patch, this driver has been moved, please see

commit 955315b0dc8c8641311430f40fbe53990ba40e33
Author: Benjamin Poirier 
Date:   Tue Jul 23 15:14:13 2019 +0900

qlge: Move drivers/net/ethernet/qlogic/qlge/ to
drivers/staging/qlge/ 
The hardware has been declared EOL by the vendor more than 5 years
ago. What's more relevant to the Linux kernel is that the quality
of this driver is not on par with many other mainline drivers.

Cc: Manish Chopra 
Message-id: <20190617074858.32467-1-bpoir...@suse.com>
Signed-off-by: Benjamin Poirier 
Signed-off-by: David S. Miller 

Could you rebase, and send the new version to GregKH as he is the
stable maintainer?

>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/qlogic/qlge/qlge_ethtool.c 
> b/drivers/net/ethernet/qlogic/qlge/qlge_ethtool.c
> index a6886cc5654c..d539b71b2a5c 100644
> --- a/drivers/net/ethernet/qlogic/qlge/qlge_ethtool.c
> +++ b/drivers/net/ethernet/qlogic/qlge/qlge_ethtool.c
> @@ -544,8 +544,10 @@ static int ql_run_loopback_test(struct ql_adapter *qdev)
>   skb_put(skb, size);
>   ql_create_lb_frame(skb, size);
>   rc = ql_lb_send(skb, qdev->ndev);
> - if (rc != NETDEV_TX_OK)
> + if (rc != NETDEV_TX_OK) {
> + dev_kfree_skb_any(skb);
>   return -EPIPE;
> + }
>   atomic_inc(&qdev->lb_count);
>   }
>   /* Give queue time to settle before testing results. */

Re: [PATCH 3/3] mm:fix gup_pud_range

2019-09-20 Thread Qiujun Huang

On Sat, Sep 21, 2019 at 9:19 AM John Hubbard  wrote:
>
> On 9/20/19 5:33 PM, Qiujun Huang wrote:
> >> On 9/20/19 8:51 AM, Qiujun Huang wrote:
> ...
> >> It would be nice if this spelled out a little more clearly what's
> >> wrong. I think you and Aneesh are saying that the entry is really
> >> a swap entry, created by the MCE response to a bad page?
> > do_machine_check->
> > do_memory_failure->
> > memory_failure->
> > hwpoison_user_mappings
> > will updated PUD level PTE entry as a swap entry.
> >
> > static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
> > unsigned long address, void *arg)
> > {
> > ...
> > if (PageHWPoison(page) && !(flags & TTU_IGNORE_HWPOISON)) {
> > pteval = swp_entry_to_pte(make_hwpoison_entry(subpage));
>
> OK, that helps. Let's add something approximately like this to the
> commit description:
>
> do_machine_check()
>   do_memory_failure()
> memory_failure()
>   hw_poison_user_mappings()
> try_to_unmap()
>   pteval = swp_entry_to_pte(make_hwpoison_entry(subpage));
>
> ...and now we have a swap entry that indicates that the page entry
> refers to a bad (and poisoned) page of memory, but gup_fast() at this
> level of the page table was ignoring swap entries, and incorrectly
> assuming that "!pxd_none() == valid and present".
>
> And this was not just a poisoned page problem, but a general swap entry
> problem. So, any swap entry type (device memory migration, numa migration,
> or just regular swapping) could lead to the same problem.
>
> Fix this by checking for pxd_present(), instead of pxd_none().
>
>
> > ...
> >>
> >>>
> >>> Signed-off-by: Qiujun Huang 
> >>> ---
> >>>  mm/gup.c | 2 ++
> >>>  1 file changed, 2 insertions(+)
> >>>
> >>> diff --git a/mm/gup.c b/mm/gup.c
> >>> index 98f13ab..6157ed9 100644
> >>> --- a/mm/gup.c
> >>> +++ b/mm/gup.c
> >>> @@ -2230,6 +2230,8 @@ static int gup_pud_range(p4d_t p4d, unsigned long 
> >>> addr, unsigned long end,
> >>>   next = pud_addr_end(addr, end);
> >>>   if (pud_none(pud))
> >>>   return 0;
> >>> + if (unlikely(!pud_present(pud)))
> >>> + return 0;
> >>
> >> If the MCE hwpoison behavior puts in swap entries, then it seems like all
> >> page table walkers would need to check for p*d_present(), and maybe at all
> >> levels too, right?
> > I think so
> >>
>
> Should those changes be part of this fix, do you think?

Yes, please. Thanks.
>
> thanks,
> --
> John Hubbard
> NVIDIA
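
The distinction between "not none" and "present" discussed above can be shown with a toy model. Everything below is hypothetical: the entry encoding (bit 0 as the present bit, a shifted payload for swap-style entries) and all toy_* names are invented for illustration and do not match any real architecture's page table format.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy entry encoding, invented for this sketch: bit 0 means "present". */
typedef uint64_t toy_pud_t;
#define TOY_PRESENT 0x1ULL

static bool toy_pud_none(toy_pud_t pud)
{
	return pud == 0;
}

static bool toy_pud_present(toy_pud_t pud)
{
	return (pud & TOY_PRESENT) != 0;
}

/* A non-present, non-empty entry, e.g. a hwpoison marker (payload != 0). */
static toy_pud_t toy_make_swap_entry(uint64_t payload)
{
	return payload << 1;
}

/* Walker that only checks none: returns 1 if it would descend. */
static int toy_walk_none_only(toy_pud_t pud)
{
	return toy_pud_none(pud) ? 0 : 1;
}

/* Walker with the extra present check: bails out on swap-style entries. */
static int toy_walk_checked(toy_pud_t pud)
{
	if (toy_pud_none(pud))
		return 0;
	if (!toy_pud_present(pud))
		return 0;
	return 1;
}
```

In this model a swap-style entry is neither none nor present, so the none-only walker would go on to treat it as a valid mapping, while the checked walker returns 0 and leaves the case to the slow path.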

Re: [PATCH 3.16 000/132] 3.16.74-rc1 review

2019-09-20 Thread Guenter Roeck


On 9/20/19 2:16 PM, Ben Hutchings wrote:
> On Fri, 2019-09-20 at 13:04 -0700, Guenter Roeck wrote:
>> On Fri, Sep 20, 2019 at 03:23:34PM +0100, Ben Hutchings wrote:
>>> This is the start of the stable review cycle for the 3.16.74 release.
>>> There are 132 patches in this series, which will be posted as responses
>>> to this one.  If anyone has any issues with these being applied, please
>>> let me know.
>>>
>>> Responses should be made by Mon Sep 23 20:00:00 UTC 2019.
>>> Anything received after that time might be too late.
>>>
>>
>> Build results:
>> total: 136 pass: 135 fail: 1
>> Failed builds:
>> arm:allmodconfig
>> Qemu test results:
>> total: 229 pass: 229 fail: 0
>>
>> Build errors in arm:allmodconfig are along the line of
>>
>> In file included from include/linux/printk.h:5,
>>   from include/linux/kernel.h:13,
>>   from include/linux/clk.h:16,
>>   from drivers/gpu/drm/tilcdc/tilcdc_drv.h:21,
>>   from drivers/gpu/drm/tilcdc/tilcdc_drv.c:20:
>> include/linux/init.h:343:7: error: 'cleanup_module' specifies less restrictive attribute than its target 'tilcdc_drm_fini': 'cold'
>>
>> In addition to a few errors like that, there are literally thousands
>> of similar warnings.
>
> It looks like this is triggered by you switching arm builds from gcc 8
> to 9, rather than by any code change.



After reverting to gcc 8.3.0 for arm, I get:

Build results:
total: 136 pass: 136 fail: 0
Qemu test results:
total: 229 pass: 229 fail: 0

Sorry for the noise.

Guenter

Re: [PATCH] jffs2:freely allocate memory when parameters are invalid

2019-09-20 Thread Hou Tao

Hi Richard,

On 2019/9/20 22:38, Richard Weinberger wrote:
> On Fri, Sep 20, 2019 at 4:14 PM Xiaoming Ni  wrote:
>> I still think this is easier to understand:
>>  Free the memory allocated by the current function in the failed branch
> 
> Please note that jffs2 is in "odd fixes only" maintenance mode.
> Therefore patches like this cannot be processed.
> 
> On my never ending review queue are some other jffs2 patches which
> seem to address
> real problems. These go first.
> 
> I see that many patches come from Huawei, maybe one of you can help
> maintaining jffs2?
> Reviews, tests, etc.. are very welcome!
> 
In Huawei we use jffs2 broadly in our products to support filesystem on raw
NOR flash and NAND flash, so fixing the bugs in jffs2 means a lot to us.

Although I have not read all of the jffs2 code thoroughly, I have found and "fixed"
some bugs in jffs2, and I am willing to help the jffs2 community in any way. Maybe
we can start by testing and reviewing the pending patches in patchwork?

Regards,
Tao

Re: [PATCH] dt-bindings: net: dwmac: fix 'mac-mode' type

2019-09-20 Thread Jakub Kicinski

On Tue, 17 Sep 2019 13:30:52 +0300, Alexandru Ardelean wrote:
> The 'mac-mode' property is similar to 'phy-mode' and 'phy-connection-type',
> which are enums of mode strings.
> 
> The 'dwmac' driver supports almost all modes declared in the 'phy-mode'
> enum (except for 1 or 2). But in general, there may be a case where
> 'mac-mode' becomes more generic and is moved as part of phylib or netdev.
> 
> In any case, the 'mac-mode' field should be made an enum, and it also makes
> sense to just reference the 'phy-connection-type' from
> 'ethernet-controller.yaml'. That will also make it more future-proof for new
> modes.
> 
> Signed-off-by: Alexandru Ardelean 

Applied, thank you!

FWIW I had to add the Fixes tag by hand, either ozlabs patchwork or my
git-pw doesn't have the automagic handling there, yet.

Re: [RESEND PATCH v2] mm/oom_killer: Add task UID to info message on an oom kill

2019-09-20 Thread Rafael Aquini

On Fri, Sep 20, 2019 at 05:13:40PM -0700, Andrew Morton wrote:
> On Thu, 13 Jun 2019 10:23:18 +0200 Michal Hocko  wrote:
> 
> > On Wed 12-06-19 13:57:53, Joel Savitz wrote:
> > > In the event of an oom kill, useful information about the killed
> > > process is printed to dmesg. Users, especially system administrators,
> > > will find it useful to immediately see the UID of the process.
> > 
> > Could you be more specific please? We already print uid when dumping
> > eligible tasks so it is not overly hard to find that information in the
> > oom report. Well, except when dumping of eligible tasks is disabled. Is
> > this what you are after?
> > 
> > Please always be specific about usecases in the changelog. A terse
> > statement that something is useful doesn't tell much very often.
> > 
> 
> 
> I'll add this to the changelog:
> 
> : We already print uid when dumping eligible tasks so it is not overly hard
> : to find that information in the oom report.  However this information is
> : unavailable then dumping of eligible tasks is disabled.
 

Thanks Andrew! just a minor nit there: 's/then/when/'


Acked-by: Rafael Aquini 
>

[PATCH] riscv: Fix memblock reservation for device tree blob

2019-09-20 Thread Albert Ou

This fixes an error with how the FDT blob is reserved in memblock.
An incorrect physical address calculation exposed the FDT header to
unintended corruption, which typically manifested with of_fdt_raw_init()
faulting during late boot after fdt_totalsize() returned a wrong value.
Systems with smaller physical memory sizes more frequently trigger this
issue, as the kernel is more likely to allocate from the DMA32 zone
where bbl places the DTB after the kernel image.

Commit 671f9a3e2e24 ("RISC-V: Setup initial page tables in two stages")
changed the mapping of the DTB to reside in the fixmap area.
Consequently, early_init_fdt_reserve_self() cannot be used anymore in
setup_bootmem() since it relies on __pa() to derive a physical address,
which does not work with dtb_early_va that is no longer a valid kernel
logical address.

The reserved[0x1] region shows the effect of the pointer underflow
resulting from the __pa(initial_boot_params) offset subtraction:

[0.00] MEMBLOCK configuration:
[0.00]  memory size = 0x1fe0 reserved size = 0x00a2e514
[0.00]  memory.cnt  = 0x1
[0.00]  memory[0x0] [0x8020-0x9fff], 0x1fe0 bytes flags: 0x0
[0.00]  reserved.cnt  = 0x2
[0.00]  reserved[0x0]   [0x8020-0x80c2dfeb], 0x00a2dfec bytes flags: 0x0
[0.00]  reserved[0x1]   [0xfff08010-0xfff080100527], 0x0528 bytes flags: 0x0

With the fix applied:

[0.00] MEMBLOCK configuration:
[0.00]  memory size = 0x1fe0 reserved size = 0x00a2e514
[0.00]  memory.cnt  = 0x1
[0.00]  memory[0x0] [0x8020-0x9fff], 0x1fe0 bytes flags: 0x0
[0.00]  reserved.cnt  = 0x2
[0.00]  reserved[0x0]   [0x8020-0x80c2dfeb], 0x00a2dfec bytes flags: 0x0
[0.00]  reserved[0x1]   [0x80e0-0x80e00527], 0x0528 bytes flags: 0x0

Fixes: 671f9a3e2e24 ("RISC-V: Setup initial page tables in two stages")
Signed-off-by: Albert Ou 
---
 arch/riscv/mm/init.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index f0ba713..52d007c 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -11,6 +11,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -82,6 +83,8 @@ static void __init setup_initrd(void)
 }
 #endif /* CONFIG_BLK_DEV_INITRD */
 
+static phys_addr_t __dtb_pa __initdata;
+
 void __init setup_bootmem(void)
 {
struct memblock_region *reg;
@@ -117,7 +120,12 @@ void __init setup_bootmem(void)
setup_initrd();
 #endif /* CONFIG_BLK_DEV_INITRD */
 
-   early_init_fdt_reserve_self();
+   /*
+* Avoid using early_init_fdt_reserve_self() since __pa() does
+* not work for DTB pointers that are fixmap addresses
+*/
+   memblock_reserve(__dtb_pa, fdt_totalsize(dtb_early_va));
+
early_init_fdt_scan_reserved_mem();
memblock_allow_resize();
memblock_dump_all();
@@ -333,6 +341,7 @@ static uintptr_t __init best_map_size(phys_addr_t base, 
phys_addr_t size)
"not use absolute addressing."
 #endif
 
+
 asmlinkage void __init setup_vm(uintptr_t dtb_pa)
 {
uintptr_t va, end_va;
@@ -393,6 +402,8 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
 
/* Save pointer to DTB for early FDT parsing */
dtb_early_va = (void *)fix_to_virt(FIX_FDT) + (dtb_pa & ~PAGE_MASK);
+   /* Save physical address for memblock reservation */
+   __dtb_pa = dtb_pa;
 }
 
 static void __init setup_vm_final(void)
-- 
2.7.4
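
The failure mode described in the changelog above can be illustrated outside
the kernel. The following is a minimal user-space sketch, not kernel code: the
address constants and the model_pa()/model_va() helpers are hypothetical
stand-ins chosen for illustration, assuming that __pa() is a plain offset
subtraction that is only meaningful for linear-map addresses. Feeding it a
fixmap-style address that lies below the linear map makes the subtraction wrap
around, which is the kind of bogus reserved[] base shown in the first dump.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical layout, for illustration only: a linear map that starts at
 * MODEL_PAGE_OFFSET and maps physical memory starting at MODEL_PHYS_BASE.
 */
#define MODEL_PAGE_OFFSET   UINT64_C(0xffffffe000000000)
#define MODEL_PHYS_BASE     UINT64_C(0x0000000080200000)
#define MODEL_VA_PA_OFFSET  (MODEL_PAGE_OFFSET - MODEL_PHYS_BASE)

/* Hypothetical end of RAM, DTB physical address and fixmap alias of the DTB. */
#define MODEL_RAM_END       UINT64_C(0x00000000a0000000)
#define MODEL_DTB_PA        UINT64_C(0x0000000080e00000)
#define MODEL_FIXMAP_DTB_VA UINT64_C(0xffffffcefee00000)

/* Model of __pa(): only meaningful for addresses inside the linear map. */
static uint64_t model_pa(uint64_t va)
{
	return va - MODEL_VA_PA_OFFSET;
}

/* Model of __va(): the inverse mapping for linear-map addresses. */
static uint64_t model_va(uint64_t pa)
{
	return pa + MODEL_VA_PA_OFFSET;
}

/*
 * Base address to reserve for the DTB. With use_saved_pa set, the physical
 * address recorded at early boot is used (the approach taken by the patch);
 * otherwise the address is derived from the fixmap pointer via model_pa().
 */
static uint64_t model_dtb_reserve_base(int use_saved_pa)
{
	if (use_saved_pa)
		return MODEL_DTB_PA;
	return model_pa(MODEL_FIXMAP_DTB_VA);
}
```

In this model, model_dtb_reserve_base(1) returns the real DTB address, while
model_dtb_reserve_base(0) returns a value far above the end of RAM because the
unsigned subtraction wraps.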

Re: pci: endpoint test BUG

2019-09-20 Thread Randy Dunlap

On 9/20/19 5:38 PM, Hillf Danton wrote:
>>Kishon, Hillf, can you turn it into a patch and send it asap please ?
> 
> What was sent a couple of days before,
> 
> To: Bjorn Helgaas 
> Cc: linux-pci , LKML 
> Subject: [PATCH] PCI: endpoint: Fix uaf on unregistering driver
> ...
> 
> Fixes: ef1433f717a2 ("PCI: endpoint: Create configfs entry for each pci_epf_device_id table entry")
> Reported-and-tested-by: Randy Dunlap 
> Cc: Al Viro 
> Cc: Dan Carpenter 
> Cc: Lorenzo Pieralisi 
> Cc: Kishon Vijay Abraham I 
> Cc: Andrey Konovalov 
> Signed-off-by: Hillf Danton 
> ---
> 
> and it is certain that  is on the Cc list.
> 
> It will be resent if no one saw the message.

I didn't see it and I can't find it on lore.kernel.org/linux-pci/.

-- 
~Randy

Re: [patch 3/6] posix-cpu-timers: Restrict timer_create() permissions

2019-09-20 Thread Frederic Weisbecker

On Thu, Sep 05, 2019 at 02:03:42PM +0200, Thomas Gleixner wrote:
> Right now there is no restriction at all to attach a Posix CPU timer to any
> process in the system. Per thread CPU timers are limited to be created by
> threads in the same thread group.
> 
> Timers can be used to observe activity of tasks and also impose overhead on
> the process to which they are attached because that process needs to do the
> fine grained CPU time accounting.
> 
> Limit the ability to attach timers to a process by checking whether the
> task which is creating the timer has permissions to attach ptrace on the
> target process.
> 
> Signed-off-by: Thomas Gleixner 

Makes sense. I hope no serious user currently relies on that lack of
restriction. Let's just apply it and wait for complaints, if any.

Reviewed-by: Frederic Weisbecker

Re: [PATCH 3/3] mm:fix gup_pud_range

2019-09-20 Thread Qiujun Huang

>On 9/20/19 8:51 AM, Qiujun Huang wrote:
>> __get_user_pages_fast try to walk the page table but the
>> hugepage pte is replace by hwpoison swap entry by mca path.
>
>I expect you mean MCE (machine check exception), rather than mca?
Yeah
>
>> ...
>> [15798.177437] mce: Uncorrected hardware memory error in
>>   user-access at 224f1761c0
>> [15798.180171] MCE 0x224f176: Killing pal_main:6784 due to
>>   hardware memory corruption
>> [15798.180176] MCE 0x224f176: Killing qemu-system-x86:167336
>>   due to hardware memory corruption
>> ...
>> [15798.180206] BUG: unable to handle kernel
>> [15798.180226] paging request at 89123000
>> [15798.180236] IP: [] gup_pud_range+
>>   0x13e/0x1e0
>> ...
>>
>> We need to skip the hwpoison entry in gup_pud_range.
>
>It would be nice if this spelled out a little more clearly what's
>wrong. I think you and Aneesh are saying that the entry is really
>a swap entry, created by the MCE response to a bad page?
do_machine_check->
do_memory_failure->
memory_failure->
hwpoison_user_mappings
will update the PUD-level PTE entry to a swap entry.

static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
unsigned long address, void *arg)
{
...
if (PageHWPoison(page) && !(flags & TTU_IGNORE_HWPOISON)) {
pteval = swp_entry_to_pte(make_hwpoison_entry(subpage));
if (PageHuge(page)) {
int nr = 1 << compound_order(page);
hugetlb_count_sub(nr, mm);
set_huge_swap_pte_at(mm, address,
pvmw.pte, pteval,
vma_mmu_pagesize(vma));
} else {
dec_mm_counter(mm, mm_counter(page));
set_pte_at(mm, address, pvmw.pte, pteval);
}
...

and, gup_pud_range will reference the pud entry.

gup_pud_range->gup_pmd_range:
static int gup_pmd_range(pud_t pud, unsigned long addr, unsigned long end,
int write, struct page **pages, int *nr)
{
unsigned long next;
pmd_t *pmdp;

pmdp = pmd_offset(&pud, addr);
do {
pmd_t pmd = *pmdp;  <--the pmdp is hwpoison swap entry. 89123000
and results in corruption

...
>
>>
>> Signed-off-by: Qiujun Huang 
>> ---
>>  mm/gup.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/mm/gup.c b/mm/gup.c
>> index 98f13ab..6157ed9 100644
>> --- a/mm/gup.c
>> +++ b/mm/gup.c
>> @@ -2230,6 +2230,8 @@ static int gup_pud_range(p4d_t p4d, unsigned long 
>> addr, unsigned long end,
>>   next = pud_addr_end(addr, end);
>>   if (pud_none(pud))
>>   return 0;
>> + if (unlikely(!pud_present(pud)))
>> + return 0;
>
>If the MCE hwpoison behavior puts in swap entries, then it seems like all
>page table walkers would need to check for p*d_present(), and maybe at all
>levels too, right?
I think so
>
>thanks,



On Sat, Sep 21, 2019 at 3:37 AM John Hubbard  wrote:
>
> On 9/20/19 8:51 AM, Qiujun Huang wrote:
> > __get_user_pages_fast try to walk the page table but the
> > hugepage pte is replace by hwpoison swap entry by mca path.
>
> I expect you mean MCE (machine check exception), rather than mca?
>
> > ...
> > [15798.177437] mce: Uncorrected hardware memory error in
> >   user-access at 224f1761c0
> > [15798.180171] MCE 0x224f176: Killing pal_main:6784 due to
> >   hardware memory corruption
> > [15798.180176] MCE 0x224f176: Killing qemu-system-x86:167336
> >   due to hardware memory corruption
> > ...
> > [15798.180206] BUG: unable to handle kernel
> > [15798.180226] paging request at 89123000
> > [15798.180236] IP: [] gup_pud_range+
> >   0x13e/0x1e0
> > ...
> >
> > We need to skip the hwpoison entry in gup_pud_range.
>
> It would be nice if this spelled out a little more clearly what's
> wrong. I think you and Aneesh are saying that the entry is really
> a swap entry, created by the MCE response to a bad page?
>
> >
> > Signed-off-by: Qiujun Huang 
> > ---
> >  mm/gup.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/mm/gup.c b/mm/gup.c
> > index 98f13ab..6157ed9 100644
> > --- a/mm/gup.c
> > +++ b/mm/gup.c
> > @@ -2230,6 +2230,8 @@ static int gup_pud_range(p4d_t p4d, unsigned long 
> > addr, unsigned long end,
> >   next = pud_addr_end(addr, end);
> >   if (pud_none(pud))
> >   return 0;
> > + if (unlikely(!pud_present(pud)))
> > + return 0;
>
> If the MCE hwpoison behavior puts in swap entries, then it seems like all
> page table walkers would need to check for p*d_present(), and maybe at all
> levels too, right?
>
> thanks,
> --
> John Hubbard
> NVIDIA
>
>
> >   if (unlikely(pud_huge(pud))) {
> >   if (!gup_huge_pud(pud, pudp, addr, next, flags,
> > pages, nr))
> >
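
The distinction discussed in this thread, an entry that is not "none" but also
not "present", can be shown with a small user-space model. This is only a
sketch under stated assumptions: the entry layout (bit 0 as the present bit),
the model_* names and the swap-style encoding are hypothetical and do not
match any real architecture's page table format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical entry layout: bit 0 is the "present" bit. */
typedef uint64_t model_pud_t;
#define MODEL_PRESENT UINT64_C(0x1)

/* An entry is "none" only when it is completely empty. */
static bool model_pud_none(model_pud_t pud)
{
	return pud == 0;
}

/* An entry is "present" only when the hardware would walk through it. */
static bool model_pud_present(model_pud_t pud)
{
	return (pud & MODEL_PRESENT) != 0;
}

/* Swap-style entry: payload in the upper bits, present bit left clear. */
static model_pud_t model_make_swap_entry(uint64_t payload)
{
	return payload << 1;
}

/* Walker that only checks for an empty entry before descending. */
static bool model_descends_none_only(model_pud_t pud)
{
	return !model_pud_none(pud);
}

/* Walker that also refuses non-present entries, as the patch proposes. */
static bool model_descends_with_present(model_pud_t pud)
{
	if (model_pud_none(pud))
		return false;
	if (!model_pud_present(pud))
		return false;
	return true;
}
```

With a swap-style entry, the first walker descends into what it believes is a
lower-level table, while the second one bails out. For empty and present
entries both walkers agree.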

Re: [RFC v2] zswap: Add CONFIG_ZSWAP_IO_SWITCH to handle swap IO issue

2019-09-20 Thread Randy Dunlap

On 9/19/19 11:35 PM, Hui Zhu wrote:
> This is the second version of this patch.  The previous version is in
> https://lkml.org/lkml/2019/9/11/935
> I updated the commit introduction and Kconfig  because it is not clear.
> 
Hi,
Just a few minor fixes (below):

> 
> Signed-off-by: Hui Zhu 
> ---
>  include/linux/swap.h |  3 +++
>  mm/Kconfig   | 18 +
>  mm/page_io.c | 16 +++
>  mm/zswap.c   | 55 
> 
>  4 files changed, 92 insertions(+)
> 
> diff --git a/mm/Kconfig b/mm/Kconfig
> index 56cec63..5408d65 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -546,6 +546,24 @@ config ZSWAP
> they have not be fully explored on the large set of potential
> configurations and workloads that exist.
>  
> +config ZSWAP_IO_SWITCH
> + bool "Compressed cache for swap pages according to the IO status"
> + depends on ZSWAP
> + def_bool n

Just drop the "def_bool n".  It's already a "bool" and 'n' is the default value 
for it.

> + help
> +   This function help the system that normal swap speed is higher

helps the system in which normal swap speed is higher

> +   than zswap speed to handle the swap IO issue.
> +   For example, a VM that is disk device is not set cache config or

possibly:
  For example, a VM where the disk device is not set for cache config or

> +   set cache=writeback.
> +
> +   This function make zswap just work when the disk of the swap file

  This function makes

> +   is under high IO load.
> +   It add two parameters read_in_flight_limit and write_in_flight_limit 
> to

  It adds two parameters (read_in_flight_limit and 
write_in_flight_limit) to

> +   zswap.  When zswap is enabled, pages will be stored to zswap only
> +   when the IO in flight number of swap device is bigger than

   of the swap device

> +   zswap_read_in_flight_limit or zswap_write_in_flight_limit.
> +   If unsure, say "n".
> +
>  config ZPOOL
>   tristate "Common API for compressed memory storage"
>   help

> diff --git a/mm/zswap.c b/mm/zswap.c
> index 0e22744..1255645 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -62,6 +62,13 @@ static u64 zswap_reject_compress_poor;
>  static u64 zswap_reject_alloc_fail;
>  /* Store failed because the entry metadata could not be allocated (rare) */
>  static u64 zswap_reject_kmemcache_fail;
> +#ifdef CONFIG_ZSWAP_IO_SWITCH
> +/* Store failed because zswap_read_in_flight_limit or
> + * zswap_write_in_flight_limit is bigger than IO in flight number of
> + * swap device
> + */

Please use the documented multi-line comment format.  E.g.:

/*
 * Store failed because zswap_read_in_flight_limit or
 * zswap_write_in_flight_limit is bigger than IO in flight number of
 * swap device.
 */

> +static u64 zswap_reject_io;
> +#endif
>  /* Duplicate store was encountered (rare) */
>  static u64 zswap_duplicate_entry;
>  
> @@ -114,6 +121,22 @@ static bool zswap_same_filled_pages_enabled = true;
>  module_param_named(same_filled_pages_enabled, 
> zswap_same_filled_pages_enabled,
>  bool, 0644);
>  
> +#ifdef CONFIG_ZSWAP_IO_SWITCH
> +/* zswap will not try to store the page if zswap_read_in_flight_limit is
> + * bigger than IO read in flight number of swap device
> + */

Use documented multi-line comment format.

> +static unsigned int zswap_read_in_flight_limit;
> +module_param_named(read_in_flight_limit, zswap_read_in_flight_limit,
> +uint, 0644);
> +
> +/* zswap will not try to store the page if zswap_write_in_flight_limit is
> + * bigger than IO write in flight number of swap device
> + */

ditto.

thanks.
-- 
~Randy

Re: [PATCH v9 07/11] dt-bindings: pwm: pwm-mediatek: add a property "num-pwms"

2019-09-20 Thread Thierry Reding

On Fri, Sep 20, 2019 at 06:49:07AM +0800, Sam Shih wrote:
> From: Ryder Lee 
> 
> This adds a property "num-pwms" in example so that we could
> specify the number of PWM channels via device tree.
> 
> Signed-off-by: Ryder Lee 
> Signed-off-by: Sam Shih 
> Reviewed-by: Matthias Brugger 
> Acked-by: Uwe Kleine-König 
> ---
> Changes since v6:
> Follow reviewers's comments:
> - The subject should indicate this is for Mediatek
> 
> Changes since v5:
> - Add an Acked-by tag
> - This file is original v4 patch 5/10
> (https://patchwork.kernel.org/patch/11102577/)
> 
> ---
>  Documentation/devicetree/bindings/pwm/pwm-mediatek.txt | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)

You failed to address Rob's questions repeatedly and I agree with him
that you can just as easily derive the number of PWMs from the specific
compatible string. I won't be applying this and none of the patches that
depend on it.

Thierry



Re: [RESEND PATCH v2] mm/oom_killer: Add task UID to info message on an oom kill

2019-09-20 Thread Andrew Morton

On Thu, 13 Jun 2019 10:23:18 +0200 Michal Hocko  wrote:

> On Wed 12-06-19 13:57:53, Joel Savitz wrote:
> > In the event of an oom kill, useful information about the killed
> > process is printed to dmesg. Users, especially system administrators,
> > will find it useful to immediately see the UID of the process.
> 
> Could you be more specific please? We already print uid when dumping
> eligible tasks so it is not overly hard to find that information in the
> oom report. Well, except when dumping of eligible tasks is disabled. Is
> this what you are after?
> 
> Please always be specific about usecases in the changelog. A terse
> statement that something is useful doesn't tell much very often.
> 



I'll add this to the changelog:

: We already print uid when dumping eligible tasks so it is not overly hard
: to find that information in the oom report.  However this information is
: unavailable then dumping of eligible tasks is disabled.

[RFC]Sample module for Kernel access to Ftrace instances.

2019-09-20 Thread Divya Indi

[PATCH] tracing: Sample module to demonstrate kernel access to Ftrace

Hi,

This patch is for a sample module to demonstrate the use of APIs that 
were introduced/exported in order to access Ftrace instances from within the 
kernel.

Please Note: This module is dependent on -
- commit: f45d122 tracing: Kernel access to Ftrace instances
- Patches pending review: 
https://lore.kernel.org/lkml/1565805327-579-1-git-send-email-divya.i...@oracle.com/

The sample module creates or looks up a trace array called sample-instance at
module load time.
We then start a kernel thread (simple-thread) to -
1) Enable tracing for event "sample_event" to the buffer associated with the trace
array "sample-instance".
2) Start a timer that will disable tracing to this buffer after 5 sec (i.e. tracing
is disabled after count=4).
3) Write to the buffer using trace_array_printk()
4) Stop the kernel thread and destroy the buffer during module unload.

A sample output for the same -

# tracer: nop
#
# entries-in-buffer/entries-written: 16/16   #P:4
#
#  _-=> irqs-off
# / _=> need-resched
#| / _---=> hardirq/softirq
#|| / _--=> preempt-depth
#||| / delay
#   TASK-PID   CPU#  TIMESTAMP  FUNCTION
#  | |   |      | |
 sample-instance-26797 [003]  955180.489833: simple_thread: 
trace_array_printk: count=0
 sample-instance-26797 [003]  955180.489836: sample_event: count value=0 at 
jiffies=5249940864
 sample-instance-26797 [003]  955181.513722: simple_thread: 
trace_array_printk: count=1
 sample-instance-26797 [003]  955181.513724: sample_event: count value=1 at 
jiffies=5249941888
 sample-instance-26797 [003]  955182.537629: simple_thread: 
trace_array_printk: count=2
 sample-instance-26797 [003]  955182.537631: sample_event: count value=2 at 
jiffies=5249942912
 sample-instance-26797 [003]  955183.561516: simple_thread: 
trace_array_printk: count=3
 sample-instance-26797 [003]  955183.561518: sample_event: count value=3 at 
jiffies=5249943936
 sample-instance-26797 [003]  955184.585423: simple_thread: 
trace_array_printk: count=4
 sample-instance-26797 [003]  955184.585427: sample_event: count value=4 at 
jiffies=5249944960
 sample-instance-26797 [003]  955185.609344: simple_thread: 
trace_array_printk: count=5
 sample-instance-26797 [003]  955186.633241: simple_thread: 
trace_array_printk: count=6
 sample-instance-26797 [003]  955187.657157: simple_thread: 
trace_array_printk: count=7
 sample-instance-26797 [003]  955188.681039: simple_thread: 
trace_array_printk: count=8
 sample-instance-26797 [003]  955189.704937: simple_thread: 
trace_array_printk: count=9
 sample-instance-26797 [003]  955190.728840: simple_thread: 
trace_array_printk: count=10

Let me know if you have any questions.

Thanks,
Divya
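
The entry count in the sample output above (16/16) can be cross-checked with a
tiny user-space model. This is not the sample module itself: the function name
and parameters are hypothetical, and the model simply assumes that every loop
iteration writes one trace_array_printk() line and that sample_event is
recorded only up to and including the last count seen before the timer
disables the event.

```c
#include <assert.h>

/*
 * Count the buffer entries produced by the sample thread.
 * iterations:        number of loop passes (count = 0 .. iterations - 1)
 * last_traced_count: last count for which sample_event was still enabled
 */
static unsigned int model_entries_written(unsigned int iterations,
					  unsigned int last_traced_count)
{
	unsigned int entries = 0;

	for (unsigned int count = 0; count < iterations; count++) {
		entries++;		/* the trace_array_printk() line */
		if (count <= last_traced_count)
			entries++;	/* the sample_event line */
	}
	return entries;
}
```

For the run shown above (count=0 through count=10, event disabled after
count=4), the model gives 11 printk lines plus 5 event lines.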

[PATCH] tracing: Sample module to demonstrate kernel access to Ftrace instances.

2019-09-20 Thread Divya Indi

This is a sample module to demonstrate the use of the newly introduced and
exported APIs to access Ftrace instances from within the kernel.

Newly introduced APIs used here -

1. Create a new trace array if it does not exist.
struct trace_array *trace_array_create(const char *name)

2. Destroy/Remove a trace array.
int trace_array_destroy(struct trace_array *tr)

3. Lookup a trace array, given its name.
struct trace_array *trace_array_lookup(const char *name)

4. Enable/Disable trace events:
int trace_array_set_clr_event(struct trace_array *tr, const char *system,
const char *event, int set);

Exported APIs -
1. trace_printk equivalent for instances.
int trace_array_printk(struct trace_array *tr,
   unsigned long ip, const char *fmt, ...);

2. Helper function.
void trace_printk_init_buffers(void);

3. To decrement the reference counter.
void trace_array_put(struct trace_array *tr)

Signed-off-by: Divya Indi 
Reviewed-by: Manjunath Patil 
Reviewed-by: Joe Jin 
---
 samples/Kconfig  |   7 ++
 samples/Makefile |   1 +
 samples/ftrace_instance/Makefile |   6 ++
 samples/ftrace_instance/sample-trace-array.c | 134 +++
 samples/ftrace_instance/sample-trace-array.h |  84 +
 5 files changed, 232 insertions(+)
 create mode 100644 samples/ftrace_instance/Makefile
 create mode 100644 samples/ftrace_instance/sample-trace-array.c
 create mode 100644 samples/ftrace_instance/sample-trace-array.h

diff --git a/samples/Kconfig b/samples/Kconfig
index d63cc8a..1c7864b 100644
--- a/samples/Kconfig
+++ b/samples/Kconfig
@@ -20,6 +20,13 @@ config SAMPLE_TRACE_PRINTK
 This builds a module that calls trace_printk() and can be used to
 test various trace_printk() calls from a module.
 
+config SAMPLE_TRACE_ARRAY
+tristate "Build sample module for kernel access to Ftrace instances"
+   depends on EVENT_TRACING && m
+   help
+This builds a module that demonstrates the use of various APIs to
+access Ftrace instances from within the kernel.
+
 config SAMPLE_KOBJECT
tristate "Build kobject examples"
help
diff --git a/samples/Makefile b/samples/Makefile
index debf892..02c444e 100644
--- a/samples/Makefile
+++ b/samples/Makefile
@@ -17,6 +17,7 @@ obj-$(CONFIG_SAMPLE_RPMSG_CLIENT) += rpmsg/
 subdir-$(CONFIG_SAMPLE_SECCOMP)+= seccomp
 obj-$(CONFIG_SAMPLE_TRACE_EVENTS)  += trace_events/
 obj-$(CONFIG_SAMPLE_TRACE_PRINTK)  += trace_printk/
+obj-$(CONFIG_SAMPLE_TRACE_ARRAY)   += ftrace_instance/
 obj-$(CONFIG_VIDEO_PCI_SKELETON)   += v4l/
 obj-y  += vfio-mdev/
 subdir-$(CONFIG_SAMPLE_VFS)+= vfs
diff --git a/samples/ftrace_instance/Makefile b/samples/ftrace_instance/Makefile
new file mode 100644
index 000..3603b13
--- /dev/null
+++ b/samples/ftrace_instance/Makefile
@@ -0,0 +1,6 @@
+# Builds a module that calls various routines to access Ftrace instances.
+# To use(as root):  insmod sample-trace-array.ko
+
+CFLAGS_sample-trace-array.o := -I$(src)
+
+obj-$(CONFIG_SAMPLE_TRACE_ARRAY) += sample-trace-array.o
diff --git a/samples/ftrace_instance/sample-trace-array.c 
b/samples/ftrace_instance/sample-trace-array.c
new file mode 100644
index 000..0595bc7
--- /dev/null
+++ b/samples/ftrace_instance/sample-trace-array.c
@@ -0,0 +1,134 @@
+// SPDX-License-Identifier: GPL-2.0-only
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/*
+ * Any file that uses trace points, must include the header.
+ * But only one file, must include the header by defining
+ * CREATE_TRACE_POINTS first.  This will make the C code that
+ * creates the handles for the trace points.
+ */
+#define CREATE_TRACE_POINTS
+#include "sample-trace-array.h"
+
+struct trace_array *tr;
+static void mytimer_handler(struct timer_list *unused);
+static struct task_struct *simple_tsk;
+
+/*
+ * mytimer: Timer setup to disable tracing for event "sample_event". This
+ * timer is only for the purposes of the sample module to demonstrate access of
+ * Ftrace instances from within kernel.
+ */
+static DEFINE_TIMER(mytimer, mytimer_handler);
+
+static void mytimer_handler(struct timer_list *unused)
+{
+   /*
+* Disable tracing for event "sample_event".
+*/
+   trace_array_set_clr_event(tr, "sample-subsystem", "sample_event", 0);
+}
+
+static void simple_thread_func(int count)
+{
+   set_current_state(TASK_INTERRUPTIBLE);
+   schedule_timeout(HZ);
+
+   /*
+* Printing count value using trace_array_printk() - trace_printk()
+* equivalent for the instance buffers.
+*/
+   trace_array_printk(tr, _THIS_IP_, "trace_array_printk: count=%d\n",
+   count);
+   /*
+* Tracepoint for event "sample_event". This will print the
+* current value of count and current jiffies.
+*/
+

[GIT PULL] libnvdimm for 5.4

2019-09-20 Thread Dan Williams

Hi Linus, please pull from:

  git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
tags/libnvdimm-for-5.4

...to receive some reworks to better support nvdimms on powerpc and an
nvdimm security interface update.

There was some last minute build breakage detected in -next so I've
left a patch that finalizes the powerpc compatibility work and some
other fixes out of this pull request.  The build fix requires a new
symbol export that needs an ack from ppc folks, so I'm going to save
that for a post -rc1 update. "libnvdimm/dax: Pick the right alignment
default when creating dax devices" is not typically something I would
send during the -rc cycle, but I see no strong reason for it to wait
until v5.5.

The pending fixes for others watching are:

Aneesh Kumar K.V (4):
  libnvdimm/dax: Pick the right alignment default when creating dax devices
  mm/nvdimm: Fix endian conversion issues
  libnvdimm/altmap: Track namespace boundaries in altmap
  libnvdimm/region: Initialize bad block for volatile namespaces

Nathan Chancellor (1):
  libnvdimm/nfit_test: Fix acpi_handle redefinition

---

The following changes since commit d45331b00ddb179e291766617259261c112db872:

  Linux 5.3-rc4 (2019-08-11 13:26:41 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
tags/libnvdimm-for-5.4

for you to fetch changes up to 5b26db95fee3f1ce0d096b2de0ac6f3716171093:

  libnvdimm: Use PAGE_SIZE instead of SZ_4K for align check
(2019-09-05 16:11:14 -0700)


libnvdimm for 5.4

- Rework the nvdimm core to accommodate architectures with different page
  sizes and ones that can change supported huge page sizes at boot
  time rather than a compile time constant.

- Introduce a distinct 'frozen' attribute for the nvdimm security state
  since it is independent of the locked state.

- Miscellaneous fixups.


Aneesh Kumar K.V (6):
  libnvdimm/of_pmem: Provide a unique name for bus provider
  libnvdimm/pmem: Advance namespace seed for specific probe errors
  libnvdimm/pfn_dev: Add a build check to make sure we notice when
struct page size change
  libnvdimm/pfn_dev: Add page size and struct page size to pfn superblock
  libnvdimm/label: Remove the dpa align check
  libnvdimm: Use PAGE_SIZE instead of SZ_4K for align check

Dan Williams (5):
  tools/testing/nvdimm: Fix fallthrough warning
  libnvdimm/security: Introduce a 'frozen' attribute
  libnvdimm/security: Tighten scope of nvdimm->busy vs security operations
  libnvdimm/security: Consolidate 'security' operations
  libnvdimm/region: Rewrite _probe_success() to _advance_seeds()

Gustavo A. R. Silva (1):
  libnvdimm, region: Use struct_size() in kzalloc()

 drivers/acpi/nfit/intel.c|  59 ++--
 drivers/nvdimm/bus.c |  10 +-
 drivers/nvdimm/dimm_devs.c   | 134 ++
 drivers/nvdimm/label.c   |   5 -
 drivers/nvdimm/namespace_devs.c  |  40 ++--
 drivers/nvdimm/nd-core.h |  54 ---
 drivers/nvdimm/nd.h  |   4 +
 drivers/nvdimm/of_pmem.c |   2 +-
 drivers/nvdimm/pfn.h |   5 +-
 drivers/nvdimm/pfn_devs.c|  35 ++-
 drivers/nvdimm/pmem.c|  29 +-
 drivers/nvdimm/region_devs.c |  83 
 drivers/nvdimm/security.c| 199 ++-
 include/linux/libnvdimm.h|   9 +-
 tools/testing/nvdimm/dimm_devs.c |  19 +---
 tools/testing/nvdimm/test/nfit.c |   3 +-
 16 files changed, 346 insertions(+), 344 deletions(-)

Re: [RFC patch 03/15] x86/entry: Use generic syscall entry function

2019-09-20 Thread Andy Lutomirski

On Thu, Sep 19, 2019 at 8:09 AM Thomas Gleixner  wrote:
>
> Replace the syscall entry work handling with the generic version. Provide
> the necessary helper inlines to handle the real architecture specific
> parts, e.g. audit and seccomp invocations.

> -   if (work & (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_EMU)) {
> -   ret = tracehook_report_syscall_entry(regs);
> -   if (ret || (work & _TIF_SYSCALL_EMU))
> -   return -1L;
> -   }

Unless I missed something, you lost the _TIF_SYSCALL_EMU abomination.

Re: [RFC patch 02/15] x86/entry: Remove _TIF_NOHZ from _TIF_WORK_SYSCALL_ENTRY

2019-09-20 Thread Andy Lutomirski

On Thu, Sep 19, 2019 at 8:09 AM Thomas Gleixner  wrote:
>
> Evaluating _TIF_NOHZ to decide whether to use the slow syscall entry path
> is not only pointless, it's actually counterproductive:
>
>  1) Context tracking code is invoked unconditionally before that flag is
> evaluated.
>
>  2) If the flag is set the slow path is invoked for nothing due to #1

Can we also get rid of TIF_NOHZ on x86?

Re: [RFC patch 01/15] entry: Provide generic syscall entry functionality

2019-09-20 Thread Andy Lutomirski

On Thu, Sep 19, 2019 at 8:09 AM Thomas Gleixner  wrote:
>
> On syscall entry certain work needs to be done conditionally like tracing,
> seccomp etc. This code is duplicated in all architectures.
>
> Provide a generic version.
>
> Signed-off-by: Thomas Gleixner 
> ---
>  arch/Kconfig |3 +
>  include/linux/entry-common.h |  122 
> +++
>  kernel/Makefile  |1
>  kernel/entry/Makefile|3 +
>  kernel/entry/common.c|   33 +++
>  5 files changed, 162 insertions(+)
>
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -27,6 +27,9 @@ config HAVE_IMA_KEXEC
>  config HOTPLUG_SMT
> bool
>
> +config GENERIC_ENTRY
> +   bool
> +
>  config OPROFILE
> tristate "OProfile system profiling"
> depends on PROFILING
> --- /dev/null
> +++ b/include/linux/entry-common.h
> @@ -0,0 +1,122 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef __LINUX_ENTRYCOMMON_H
> +#define __LINUX_ENTRYCOMMON_H
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include 
> +
> +/*
> + * Define dummy _TIF work flags if not defined by the architecture or for
> + * disabled functionality.
> + */
> +#ifndef _TIF_SYSCALL_TRACE
> +# define _TIF_SYSCALL_TRACE  (0)
> +#endif
> +
> +#ifndef _TIF_SYSCALL_EMU
> +# define _TIF_SYSCALL_EMU  (0)
> +#endif
> +
> +#ifndef _TIF_SYSCALL_TRACEPOINT
> +# define _TIF_SYSCALL_TRACEPOINT   (0)
> +#endif
> +
> +#ifndef _TIF_SECCOMP
> +# define _TIF_SECCOMP  (0)
> +#endif
> +
> +#ifndef _TIF_AUDIT
> +# define _TIF_AUDIT  (0)
> +#endif

I'm wondering if these should be __TIF (double-underscore) or
MAYBE_TIF_ or something to avoid errors where people do flags |=
TIF_WHATEVER and get surprised.
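
To make the concern concrete, here is a minimal user-space sketch. It is not
kernel code: the DEMO_* names and the helper are made up for illustration.
When a mask falls back to a dummy 0, OR-ing it into a flags word compiles
cleanly, does nothing, and a later test of the same mask never fires.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the kernel's definitions. */
#define DEMO_TIF_SECCOMP	(1UL << 8)	/* feature provided by the arch */
#define DEMO_TIF_AUDIT		(0UL)		/* dummy fallback: feature absent */

/* Set a flag, then report whether a later test of that mask would see it. */
static bool demo_set_and_test(unsigned long *flags, unsigned long mask)
{
	assert(flags != NULL);
	*flags |= mask;		/* silently a no-op when mask == 0 */
	return (*flags & mask) != 0;
}
```

With a real bit the helper returns true; with the dummy mask it returns false
and the flags word is unchanged, which is the surprise a distinct prefix for
the dummy definitions would help to flag.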

> +/**
> + * syscall_enter_from_usermode - Check and handle work before invoking
> + *  a syscall
> + * @regs:  Pointer to currents pt_regs
> + * @syscall:   The syscall number
> + *
> + * Invoked from architecture specific syscall entry code with interrupts
> + * enabled.
> + *
> + * Returns: The original or a modified syscall number
> + */

Maybe document that it can return -1 to skip the syscall and that, if
this happens, it may use syscall_set_error() or
syscall_set_return_value() first.  If neither of those is called and
-1 is returned, then the syscall will fail with ENOSYS.
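
A rough user-space model of the return convention described above, in case it
helps with wording the kerneldoc. Everything here (struct demo_regs, the
verdict enum, both functions) is hypothetical and only mimics the behaviour:
the result register is preset to -ENOSYS, the entry work may overwrite it, and
a -1 return means the syscall body is never invoked.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical miniature of the register state; not the kernel's pt_regs. */
struct demo_regs {
	long ret;	/* value user space will see as the syscall result */
};

enum demo_verdict { DEMO_ALLOW, DEMO_SKIP, DEMO_SKIP_WITH_ERROR };

/* Stand-in for the entry work: returns the syscall number, or -1 to skip. */
static long demo_entry_work(struct demo_regs *regs, long nr,
			    enum demo_verdict verdict)
{
	assert(regs != NULL);

	switch (verdict) {
	case DEMO_SKIP_WITH_ERROR:
		regs->ret = -EPERM;	/* plays the role of syscall_set_error() */
		return -1;
	case DEMO_SKIP:
		return -1;		/* result left at its preset value */
	default:
		return nr;
	}
}

static long demo_do_syscall(long nr, enum demo_verdict verdict)
{
	struct demo_regs regs = { .ret = -ENOSYS };	/* preset on entry */

	nr = demo_entry_work(&regs, nr, verdict);
	if (nr >= 0)
		regs.ret = 0;	/* pretend the real syscall ran and succeeded */

	return regs.ret;
}
```

In this model a plain skip yields -ENOSYS, while a skip that first sets an
error yields that error instead.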

Re: [PATCH v2 0/4] debug_pagealloc improvements through page_owner

2019-09-20 Thread Andrew Morton

On Thu, 22 Aug 2019 16:03:44 -0700 Andrew Morton  wrote:

> On Tue, 20 Aug 2019 15:18:24 +0200 Vlastimil Babka  wrote:
> 
> > v2: also fix THP split handling (added Patch 1) per Kirill
> > 
> > > The debug_pagealloc functionality serves a similar purpose on the page
> > > allocator level that slub_debug does on the kmalloc level, which is to
> > > detect bad users. One notable feature that slub_debug has is storing stack
> > > traces of who last allocated and freed the object. On page level we track
> > > allocations via page_owner, but that info is discarded when freeing, and
> > > we don't track freeing at all. This series improves those aspects. With
> > > both debug_pagealloc and page_owner enabled, we can then get bug reports
> > > such as the example in Patch 4.
> > > 
> > > SLUB debug tracking additionally stores cpu, pid and timestamp. This could be
> > > added later, if deemed useful enough to justify the additional page_ext
> > > structure size.
> 
> Thanks.  I split [1/1] out of the series as a bugfix and turned this
> into a three-patch series.
> 

None of which anyone has yet reviewed :(

Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()

2019-09-20 Thread Andy Lutomirski

On Fri, Sep 20, 2019 at 3:44 PM Linus Torvalds
 wrote:
>
> On Fri, Sep 20, 2019 at 1:51 PM Andy Lutomirski  wrote:
> >
> > To be clear, when I say "blocking", I mean "blocks until we're ready,
> > but we make sure we're ready in a moderately timely manner".
>
> .. and I want a pony.
>
> The problem is that you start from an assumption that we simply can't
> seem to do.

Eh, fair enough, I wasn't thinking about platforms without fast clocks.

I'm very nervous about allowing getrandom(..., 0) to fail with
-EAGAIN, though.  On a very, very brief search, I didn't find any
programs that would incorrectly assume it worked, but I can easily
imagine programs crashing, and that might be bad, too.  At the end of
the day, most user programmers who call getrandom() really did notice
that we flubbed the ABI, and either they were too lazy to fall back to
/dev/urandom, or they didn't want to for some reason, or they
genuinely want the blocking behavior.  And people who work with little
embedded systems without good clocks that basically can't generate
random numbers already know this, and they have little scripts to help
out.

So I think that just improving the
getrandom()-is-blocking-on-x86-and-arm behavior, adding GRND_INSECURE
and GRND_SECURE_BLOCKING, and adding the warning if 0 is passed is
good enough.  I suppose we could also have separate
GRND_SECURE_BLOCKING and GRND_SECURE_BLOCK_FOREVER.  We could also say
that, if you want to block forever, you should poll() on /dev/random
(with my patches applied, where this actually does what users would
want).

--Andy
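
For reference, a sketch of what a cautious user-space caller can already do
with the existing interface: ask with GRND_NONBLOCK and handle -EAGAIN
explicitly instead of relying on flags == 0. The function name is made up;
getrandom() and GRND_NONBLOCK are the existing glibc (2.25 or later) wrapper
and flag. None of the flags proposed in this thread are used.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/random.h>
#include <sys/types.h>

/*
 * Hypothetical helper. Returns 0 when buf was filled, -EAGAIN when the
 * pool is not initialized yet, or another negative errno on failure.
 */
static int demo_fill_random(unsigned char *buf, size_t len)
{
	size_t done = 0;

	assert(buf != NULL);

	while (done < len) {
		ssize_t n = getrandom(buf + done, len - done, GRND_NONBLOCK);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted by a signal: retry */
			return -errno;		/* includes EAGAIN: not ready yet */
		}
		done += (size_t)n;		/* short reads are allowed */
	}

	return 0;
}
```

What to do on -EAGAIN (wait, poll, or fall back to another source) is exactly
the policy question discussed above, so the sketch leaves it to the caller.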

Re: [PATCH V3 4/4] ASoC: fsl_asrc: Fix error with S24_3LE format bitstream in i.MX8

2019-09-20 Thread Nicolin Chen

Hello Shengjiu,

One issue with the error-out path and some nitpicks inline. Thanks.

On Thu, Sep 19, 2019 at 08:11:42PM +0800, Shengjiu Wang wrote:
> There is error "aplay: pcm_write:2023: write error: Input/output error"
> on i.MX8QM/i.MX8QXP platform for S24_3LE format.
> 
> In i.MX8QM/i.MX8QXP, the DMA is EDMA, which doesn't support 24-bit
> samples, but we didn't add any constraint, which causes issues.
> 
> So we need to query the caps of dma, then update the hw parameters
> according to the caps.
> 
> Signed-off-by: Shengjiu Wang 
> ---
>  sound/soc/fsl/fsl_asrc.c |  4 +--
>  sound/soc/fsl/fsl_asrc.h |  3 +++
>  sound/soc/fsl/fsl_asrc_dma.c | 52 +++-
>  3 files changed, 50 insertions(+), 9 deletions(-)
> 
> @@ -276,6 +274,11 @@ static int fsl_asrc_dma_startup(struct snd_pcm_substream *substream)
>   struct device *dev = component->dev;
>   struct fsl_asrc *asrc_priv = dev_get_drvdata(dev);
>   struct fsl_asrc_pair *pair;
> + bool tx = substream->stream == SNDRV_PCM_STREAM_PLAYBACK;
> + u8 dir = tx ? OUT : IN;
> + struct dma_chan *tmp_chan;
> + struct snd_dmaengine_dai_dma_data *dma_data;

Nit: would it be possible to reorganize these a bit? Usually
we put struct things together unless there is a dependency,
similar to fsl_asrc_dma_hw_params().

> @@ -285,9 +288,44 @@ static int fsl_asrc_dma_startup(struct snd_pcm_substream *substream)
>  
>   runtime->private_data = pair;
>  
> + /* Request a temp pair, which is release in the end */

Nit: "which will be released later" or "and will release it
later"? And could we use a word like "dummy"? Or at least I
would love to see a comment explaining the parameter "1"
in the function call below.

> + ret = fsl_asrc_request_pair(1, pair);
> + if (ret < 0) {
> + dev_err(dev, "failed to request asrc pair\n");
> + return ret;
> + }
> +
> + tmp_chan = fsl_asrc_get_dma_channel(pair, dir);
> + if (!tmp_chan) {
> + dev_err(dev, "can't get dma channel\n");

Could we align with other error messages using "failed to"?

> + ret = snd_soc_set_runtime_hwparams(substream, _imx_hardware);
> + if (ret)
> + return ret;
> +
[...]
> + dma_release_channel(tmp_chan);
> + fsl_asrc_release_pair(pair);

I think we need an "out:" here for those error-out routines
to goto. Otherwise, it'd be a pair leak?
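
Roughly the shape I have in mind, as a user-space sketch rather than driver
code. All demo_* names are hypothetical stand-ins for the pair and DMA channel
helpers; the counters only exist so the leak (or absence of it) is visible.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical counters standing in for the temp pair and DMA channel. */
static int demo_pairs_held;
static int demo_chans_held;

static int demo_request_pair(void)
{
	demo_pairs_held++;
	return 0;
}

static void demo_release_pair(void)
{
	assert(demo_pairs_held > 0);
	demo_pairs_held--;
}

static bool demo_get_channel(bool ok)
{
	if (ok)
		demo_chans_held++;
	return ok;
}

static void demo_release_channel(void)
{
	assert(demo_chans_held > 0);
	demo_chans_held--;
}

static int demo_startup(bool chan_ok, bool hwparams_ok)
{
	int ret;

	ret = demo_request_pair();
	if (ret < 0)
		return ret;		/* nothing acquired yet */

	if (!demo_get_channel(chan_ok)) {
		ret = -EINVAL;
		goto out_release_pair;	/* the pair must not leak */
	}

	/* Stands in for the hw params step, which may also fail. */
	ret = hwparams_ok ? 0 : -EINVAL;

	/* Success and failure share the release path: both are temporary. */
	demo_release_channel();
out_release_pair:
	demo_release_pair();
	return ret;
}
```

Every exit after the first acquisition goes through the label, so no early
return can leave the temporary pair allocated.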

> +

Could we drop this? There is a blank line below already :)

>  
>   return 0;
>  }
> -- 
> 2.21.0
>

Re: [PATCH -next] mm/kmemleak: record the current memory pool size

2019-09-20 Thread Andrew Morton

On Thu, 15 Aug 2019 11:02:16 +0100 Catalin Marinas  wrote:

> On Wed, Aug 14, 2019 at 03:07:11PM -0400, Qian Cai wrote:
> > The only way to obtain the current memory pool size for a running kernel
> > is to check back the kernel config file which is inconvenient. Record it
> > in the kernel messages.
> > 
> > Signed-off-by: Qian Cai 
> > ---
> >  mm/kmemleak.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/mm/kmemleak.c b/mm/kmemleak.c
> > index b8bbe9ac5472..1f74f8bcb4eb 100644
> > --- a/mm/kmemleak.c
> > +++ b/mm/kmemleak.c
> > @@ -1967,7 +1967,8 @@ static int __init kmemleak_late_init(void)
> > > mutex_unlock(&scan_mutex);
> > }
> >  
> > -   pr_info("Kernel memory leak detector initialized\n");
> > +   pr_info("Kernel memory leak detector initialized (mem pool size: %d)\n",
> > +   mem_pool_free_count);
> 
> I wouldn't actually call it the "memory pool size" as I see the size as
> a constant set at config time. What about "memory pool available"?
> 
> (even this one is not entirely accurate since we have a
> mem_pool_free_list but I expect such list not to have too many elements
> at the late_initcall time)
> 
> If you change the printed string:
> 
> Acked-by: Catalin Marinas 

--- a/mm/kmemleak.c~mm-kmemleak-record-the-current-memory-pool-size-fix
+++ a/mm/kmemleak.c
@@ -1967,7 +1967,7 @@ static int __init kmemleak_late_init(voi
mutex_unlock(&scan_mutex);
}
 
-   pr_info("Kernel memory leak detector initialized (mem pool size: %d)\n",
+   pr_info("Kernel memory leak detector initialized (mem pool available: %d)\n",
mem_pool_free_count);
 
return 0;
_

[PATCH v16 05/19] kunit: test: add the concept of expectations

2019-09-20 Thread Brendan Higgins

Add support for expectations, which allow properties to be specified and
then verified in tests.

Signed-off-by: Brendan Higgins 
Reviewed-by: Greg Kroah-Hartman 
Reviewed-by: Logan Gunthorpe 
Reviewed-by: Stephen Boyd 
---
 include/kunit/test.h | 836 +++
 lib/kunit/test.c |  62 
 2 files changed, 898 insertions(+)

diff --git a/include/kunit/test.h b/include/kunit/test.h
index 6781c756f11b..30a62de16bc9 100644
--- a/include/kunit/test.h
+++ b/include/kunit/test.h
@@ -9,6 +9,8 @@
 #ifndef _KUNIT_TEST_H
 #define _KUNIT_TEST_H
 
+#include 
+#include 
 #include 
 #include 
 
@@ -372,4 +374,838 @@ void __printf(3, 4) kunit_printk(const char *level,
 #define kunit_err(test, fmt, ...) \
kunit_printk(KERN_ERR, test, fmt, ##__VA_ARGS__)
 
+/**
+ * KUNIT_SUCCEED() - A no-op expectation. Only exists for code clarity.
+ * @test: The test context object.
+ *
+ * The opposite of KUNIT_FAIL(), it is an expectation that cannot fail. In other
+ * words, it does nothing and only exists for code clarity. See
+ * KUNIT_EXPECT_TRUE() for more information.
+ */
+#define KUNIT_SUCCEED(test) do {} while (0)
+
+void kunit_do_assertion(struct kunit *test,
+   struct kunit_assert *assert,
+   bool pass,
+   const char *fmt, ...);
+
+#define KUNIT_ASSERTION(test, pass, assert_class, INITIALIZER, fmt, ...) do {  \
+   struct assert_class __assertion = INITIALIZER; \
+   kunit_do_assertion(test,   \
+  &__assertion.assert,\
+  pass,   \
+  fmt,\
+  ##__VA_ARGS__); \
+} while (0)
+
+
+#define KUNIT_FAIL_ASSERTION(test, assert_type, fmt, ...) \
+   KUNIT_ASSERTION(test,  \
+   false, \
+   kunit_fail_assert, \
+   KUNIT_INIT_FAIL_ASSERT_STRUCT(test, assert_type),  \
+   fmt,   \
+   ##__VA_ARGS__)
+
+/**
+ * KUNIT_FAIL() - Always causes a test to fail when evaluated.
+ * @test: The test context object.
+ * @fmt: an informational message to be printed when the assertion is made.
+ * @...: string format arguments.
+ *
+ * The opposite of KUNIT_SUCCEED(), it is an expectation that always fails. In
+ * other words, it always results in a failed expectation, and consequently
+ * always causes the test case to fail when evaluated. See KUNIT_EXPECT_TRUE()
+ * for more information.
+ */
+#define KUNIT_FAIL(test, fmt, ...)\
+   KUNIT_FAIL_ASSERTION(test, \
+KUNIT_EXPECTATION,\
+fmt,  \
+##__VA_ARGS__)
+
+#define KUNIT_UNARY_ASSERTION(test,   \
+ assert_type, \
+ condition,   \
+ expected_true,   \
+ fmt, \
+ ...) \
+   KUNIT_ASSERTION(test,  \
+   !!(condition) == !!expected_true,  \
+   kunit_unary_assert,\
+   KUNIT_INIT_UNARY_ASSERT_STRUCT(test,   \
+  assert_type,\
+  #condition, \
+  expected_true), \
+   fmt,   \
+   ##__VA_ARGS__)
+
+#define KUNIT_TRUE_MSG_ASSERTION(test, assert_type, condition, fmt, ...)   \
+   KUNIT_UNARY_ASSERTION(test,\
+ assert_type, \
+ condition,   \
+ true,\
+ fmt, \
+

[PATCH v16 01/19] kunit: test: add KUnit test runner core

2019-09-20 Thread Brendan Higgins

Add core facilities for defining unit tests; this provides a common way
to define test cases, functions that execute code which is under test
and determine whether the code under test behaves as expected; this also
provides a way to group together related test cases in test suites (here
we call them test_modules).

Just define test cases and how to execute them for now; setting
expectations on code will be defined later.

Signed-off-by: Brendan Higgins 
Reviewed-by: Greg Kroah-Hartman 
Reviewed-by: Logan Gunthorpe 
Reviewed-by: Luis Chamberlain 
Reviewed-by: Stephen Boyd 
---
 include/kunit/test.h | 188 ++
 lib/kunit/Kconfig|  17 
 lib/kunit/Makefile   |   1 +
 lib/kunit/test.c | 191 +++
 4 files changed, 397 insertions(+)
 create mode 100644 include/kunit/test.h
 create mode 100644 lib/kunit/Kconfig
 create mode 100644 lib/kunit/Makefile
 create mode 100644 lib/kunit/test.c

diff --git a/include/kunit/test.h b/include/kunit/test.h
new file mode 100644
index ..e30d1bf2fb68
--- /dev/null
+++ b/include/kunit/test.h
@@ -0,0 +1,188 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Base unit test (KUnit) API.
+ *
+ * Copyright (C) 2019, Google LLC.
+ * Author: Brendan Higgins 
+ */
+
+#ifndef _KUNIT_TEST_H
+#define _KUNIT_TEST_H
+
+#include 
+
+struct kunit;
+
+/**
+ * struct kunit_case - represents an individual test case.
+ *
+ * @run_case: the function representing the actual test case.
+ * @name: the name of the test case.
+ *
+ * A test case is a function with the signature,
+ * ``void (*)(struct kunit *)`` that makes expectations (see
+ * KUNIT_EXPECT_TRUE()) about code under test. Each test case is associated
+ * with a  kunit_suite and will be run after the suite's init
+ * function and followed by the suite's exit function.
+ *
+ * A test case should be static and should only be created with the
+ * KUNIT_CASE() macro; additionally, every array of test cases should be
+ * terminated with an empty test case.
+ *
+ * Example:
+ *
+ * .. code-block:: c
+ *
+ * void add_test_basic(struct kunit *test)
+ * {
+ * KUNIT_EXPECT_EQ(test, 1, add(1, 0));
+ * KUNIT_EXPECT_EQ(test, 2, add(1, 1));
+ * KUNIT_EXPECT_EQ(test, 0, add(-1, 1));
+ * KUNIT_EXPECT_EQ(test, INT_MAX, add(0, INT_MAX));
+ * KUNIT_EXPECT_EQ(test, -1, add(INT_MAX, INT_MIN));
+ * }
+ *
+ * static struct kunit_case example_test_cases[] = {
+ * KUNIT_CASE(add_test_basic),
+ * {}
+ * };
+ *
+ */
+struct kunit_case {
+   void (*run_case)(struct kunit *test);
+   const char *name;
+
+   /* private: internal use only. */
+   bool success;
+};
+
+/**
+ * KUNIT_CASE - A helper for creating a  kunit_case
+ *
+ * @test_name: a reference to a test case function.
+ *
+ * Takes a symbol for a function representing a test case and creates a
+ *  kunit_case object from it. See the documentation for
+ *  kunit_case for an example on how to use it.
+ */
+#define KUNIT_CASE(test_name) { .run_case = test_name, .name = #test_name }
+
+/**
+ * struct kunit_suite - describes a related collection of  kunit_case
+ *
+ * @name:  the name of the test. Purely informational.
+ * @init:  called before every test case.
+ * @exit:  called after every test case.
+ * @test_cases:a null terminated array of test cases.
+ *
+ * A kunit_suite is a collection of related  kunit_case s, such that
+ * @init is called before every test case and @exit is called after every
+ * test case, similar to the notion of a *test fixture* or a *test class*
+ * in other unit testing frameworks like JUnit or Googletest.
+ *
+ * Every  kunit_case must be associated with a kunit_suite for KUnit
+ * to run it.
+ */
+struct kunit_suite {
+   const char name[256];
+   int (*init)(struct kunit *test);
+   void (*exit)(struct kunit *test);
+   struct kunit_case *test_cases;
+};
+
+/**
+ * struct kunit - represents a running instance of a test.
+ *
+ * @priv: for user to store arbitrary data. Commonly used to pass data
+ *   created in the init function (see  kunit_suite).
+ *
+ * Used to store information about the current context under which the test
+ * is running. Most of this data is private and should only be accessed
+ * indirectly via public functions; the one exception is @priv which can be
+ * used by the test writer to store arbitrary data.
+ */
+struct kunit {
+   void *priv;
+
+   /* private: internal use only. */
+   const char *name; /* Read only after initialization! */
+   /*
+* success starts as true, and may only be set to false during a
+* test case; thus, it is safe to update this across multiple
+* threads using WRITE_ONCE; however, as a consequence, it may only
+* be read after the test case finishes once all threads associated
+* with the test case have terminated.
+

[PATCH v16 02/19] kunit: test: add test resource management API

2019-09-20 Thread Brendan Higgins

Create a common API for test managed resources like memory and test
objects. A lot of times a test will want to set up infrastructure to be
used in test cases; this could be anything from just wanting to allocate
some memory to setting up a driver stack; this defines facilities for
creating "test resources" which are managed by the test infrastructure
and are automatically cleaned up at the conclusion of the test.

Signed-off-by: Brendan Higgins 
Reviewed-by: Greg Kroah-Hartman 
Reviewed-by: Logan Gunthorpe 
Reviewed-by: Stephen Boyd 
---
 include/kunit/test.h | 187 +++
 lib/kunit/test.c | 163 +
 2 files changed, 350 insertions(+)

diff --git a/include/kunit/test.h b/include/kunit/test.h
index e30d1bf2fb68..6781c756f11b 100644
--- a/include/kunit/test.h
+++ b/include/kunit/test.h
@@ -9,8 +9,72 @@
 #ifndef _KUNIT_TEST_H
 #define _KUNIT_TEST_H
 
+#include 
 #include 
 
+struct kunit_resource;
+
+typedef int (*kunit_resource_init_t)(struct kunit_resource *, void *);
+typedef void (*kunit_resource_free_t)(struct kunit_resource *);
+
+/**
+ * struct kunit_resource - represents a *test managed resource*
+ * @allocation: for the user to store arbitrary data.
+ * @free: a user supplied function to free the resource. Populated by
+ * kunit_alloc_resource().
+ *
+ * Represents a *test managed resource*, a resource which will automatically be
+ * cleaned up at the end of a test case.
+ *
+ * Example:
+ *
+ * .. code-block:: c
+ *
+ * struct kunit_kmalloc_params {
+ * size_t size;
+ * gfp_t gfp;
+ * };
+ *
+ * static int kunit_kmalloc_init(struct kunit_resource *res, void *context)
+ * {
+ * struct kunit_kmalloc_params *params = context;
+ * res->allocation = kmalloc(params->size, params->gfp);
+ *
+ * if (!res->allocation)
+ * return -ENOMEM;
+ *
+ * return 0;
+ * }
+ *
+ * static void kunit_kmalloc_free(struct kunit_resource *res)
+ * {
+ * kfree(res->allocation);
+ * }
+ *
+ * void *kunit_kmalloc(struct kunit *test, size_t size, gfp_t gfp)
+ * {
+ * struct kunit_kmalloc_params params;
+ * struct kunit_resource *res;
+ *
+ * params.size = size;
+ * params.gfp = gfp;
+ *
+ * res = kunit_alloc_resource(test, kunit_kmalloc_init,
+ * kunit_kmalloc_free, &params);
+ * if (res)
+ * return res->allocation;
+ *
+ * return NULL;
+ * }
+ */
+struct kunit_resource {
+   void *allocation;
+   kunit_resource_free_t free;
+
+   /* private: internal use only. */
+   struct list_head node;
+};
+
 struct kunit;
 
 /**
@@ -114,6 +178,13 @@ struct kunit {
 * with the test case have terminated.
 */
bool success; /* Read only after test_case finishes! */
+   spinlock_t lock; /* Guards all mutable test state. */
+   /*
+* Because resources is a list that may be updated multiple times (with
+* new resources) from any thread associated with a test case, we must
+* protect it with some type of lock.
+*/
+   struct list_head resources; /* Protected by lock. */
 };
 
 void kunit_init_test(struct kunit *test, const char *name);
@@ -147,6 +218,122 @@ int kunit_run_tests(struct kunit_suite *suite);
}  \
late_initcall(kunit_suite_init##suite)
 
+/*
+ * Like kunit_alloc_resource() below, but returns the struct kunit_resource
+ * object that contains the allocation. This is mostly for testing purposes.
+ */
+struct kunit_resource *kunit_alloc_and_get_resource(struct kunit *test,
+   kunit_resource_init_t init,
+   kunit_resource_free_t free,
+   gfp_t internal_gfp,
+   void *context);
+
+/**
+ * kunit_alloc_resource() - Allocates a *test managed resource*.
+ * @test: The test context object.
+ * @init: a user supplied function to initialize the resource.
+ * @free: a user supplied function to free the resource.
+ * @internal_gfp: gfp to use for internal allocations, if unsure, use GFP_KERNEL
+ * @context: for the user to pass in arbitrary data to the init function.
+ *
+ * Allocates a *test managed resource*, a resource which will automatically be
+ * cleaned up at the end of a test case. See  kunit_resource for an
+ * example.
+ *
+ * NOTE: KUnit needs to allocate memory for each kunit_resource object. You must
+ * specify an @internal_gfp that is compatible with the use context of your
+ * resource.
+ */
+static inline void *kunit_alloc_resource(struct kunit *test,
+kunit_resource_init_t init,
+

Re: [PATCH v4 09/15] arm64: dts: msm8996: thermal: Add interrupt support

2019-09-20 Thread Stephen Boyd

Quoting Amit Kucheria (2019-09-20 15:14:58)
> On Fri, Sep 20, 2019 at 3:09 PM Stephen Boyd  wrote:
> >
> > Ok so the plan is to change DT and then change it back? That sounds
> > quite bad so please fix the thermal core to not care about this before
> > applying these changes so that we don't churn DT.
> 
> Hi Stephen,
> 
> Our emails crossed paths. I think we could just make the property
> optional so that we can remove the property completely for drivers
> that support interrupts. Comments?

OK. This means that the delay properties become irrelevant once an
interrupt is there? I guess that's OK. My concern is that we need to
choose one or the other when it would be simpler to have both and
fallback to the delays so that DT migration strategies are purely
additive. It's not like the delays aren't calculated to be those numbers
anymore. They're just not going to be used.

> 
> That is a bigger change to the bindings and I don't want to hold the
> tsens interrupt support hostage to agreement on this.

Alright. I admit I haven't looked into the details but is it hard for
some reason to make it use interrupts before delays?

[RFC] microoptimizing hlist_add_{before,behind}

2019-09-20 Thread Al Viro

Neither hlist_add_before() nor hlist_add_behind() should ever
be called with both arguments pointing to the same hlist_node.
However, gcc doesn't know that, so it ends up with pointless reloads.
AFAICS, the following generates better code, is obviously equivalent
in case when arguments are different and actually even in case when
they are same, the end result is identical (if the hlist hadn't been
corrupted even earlier than that).

Objections?

Signed-off-by: Al Viro 
---
diff --git a/include/linux/list.h b/include/linux/list.h
index 85c92555e31f..aee8232e6827 100644
--- a/include/linux/list.h
+++ b/include/linux/list.h
@@ -793,21 +793,21 @@ static inline void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
 static inline void hlist_add_before(struct hlist_node *n,
struct hlist_node *next)
 {
-   n->pprev = next->pprev;
+   struct hlist_node *p = n->pprev = next->pprev;
n->next = next;
next->pprev = &n->next;
-   WRITE_ONCE(*(n->pprev), n);
+   WRITE_ONCE(*p, n);
 }
 
 static inline void hlist_add_behind(struct hlist_node *n,
struct hlist_node *prev)
 {
-   n->next = prev->next;
+   struct hlist_node *p = n->next = prev->next;
prev->next = n;
n->pprev = &prev->next;
 
-   if (n->next)
-   n->next->pprev  = &n->next;
+   if (p)
+   p->pprev  = &n->next;
 }
 
 /* after that we'll appear to be on some hlist and hlist_del will work */
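
For anyone who wants to poke at the idea outside the kernel, a user-space
sketch of both helpers with the value cached in a local. Assumptions: the
demo_hnode type is a stripped-down stand-in for hlist_node, a plain store
replaces WRITE_ONCE(), and the local in the add_before variant is declared as
a pointer-to-pointer because that is the type of pprev. The asserts encode the
"never called with both arguments the same" precondition.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for struct hlist_node; same two fields, nothing else. */
struct demo_hnode {
	struct demo_hnode *next;
	struct demo_hnode **pprev;
};

static void demo_add_before(struct demo_hnode *n, struct demo_hnode *next)
{
	struct demo_hnode **p;

	assert(n != next);

	p = n->pprev = next->pprev;	/* cache it: no reload of n->pprev */
	n->next = next;
	next->pprev = &n->next;
	*p = n;				/* plain store in place of WRITE_ONCE() */
}

static void demo_add_behind(struct demo_hnode *n, struct demo_hnode *prev)
{
	struct demo_hnode *p;

	assert(n != prev);

	p = n->next = prev->next;	/* cache it: no reload of n->next */
	prev->next = n;
	n->pprev = &prev->next;

	if (p)
		p->pprev = &n->next;
}
```

Because the compiler cannot prove that n and the other node are distinct, the
local spares it from re-reading a field it has just written.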

Re: [PATCH v2] mm: implement write-behind policy for sequential file writes

2019-09-20 Thread Linus Torvalds

On Fri, Sep 20, 2019 at 4:05 PM Linus Torvalds
 wrote:
>
>
> Now, I hear you say "those are so small these days that it doesn't
> matter". And maybe you're right. But particularly for slow media,
> triggering good streaming write behavior has been a problem in the
> past.

Which reminds me: the writebehind trigger should likely be tied to the
estimate of the bdi write speed.

We _do_ have that avg_write_bandwidth thing in the bdi_writeback
structure, it sounds like a potentially good idea to try to use that
to estimate when to do writebehind.

No?

Linus
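
A back-of-the-envelope sketch of what "tied to the write speed" could look
like, purely as an illustration and not a proposal for the actual patch. The
function, its parameters and the clamp are hypothetical; the only thing taken
from the above is the idea of feeding in a bandwidth estimate such as
avg_write_bandwidth, assumed here to be in pages per second.

```c
#include <assert.h>

/*
 * Hypothetical policy: start writebehind once a file has roughly
 * window_ms worth of dirty pages at the estimated device bandwidth,
 * clamped to [min_pages, max_pages].
 */
static unsigned long demo_writebehind_pages(unsigned long bw_pages_per_sec,
					    unsigned long window_ms,
					    unsigned long min_pages,
					    unsigned long max_pages)
{
	unsigned long pages;

	assert(min_pages <= max_pages);

	/* bw * window_ms / 1000, split up to keep intermediates small */
	pages = bw_pages_per_sec / 1000 * window_ms +
		bw_pages_per_sec % 1000 * window_ms / 1000;

	if (pages < min_pages)
		pages = min_pages;
	if (pages > max_pages)
		pages = max_pages;

	return pages;
}
```

With 4K pages and a 100 ms window, a device estimated at about 100 MB/s
(25600 pages/s) would get a trigger of 2560 pages, while slow media ends up
at or near the lower clamp.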

Re: [PATCH v2] mm: implement write-behind policy for sequential file writes

2019-09-20 Thread Linus Torvalds

On Fri, Sep 20, 2019 at 12:35 AM Konstantin Khlebnikov
 wrote:
>
> This patch implements write-behind policy which tracks sequential writes
> and starts background writeback when a file has enough dirty pages.

Apart from a spelling error ("contigious"), my only reaction is that
I've wanted this for the multi-file writes, not just for single big
files.

Yes, single big files may be a simpler and perhaps the "10% effort for
90% of the gain", and thus the right thing to do, but I do wonder if
you've looked at simply extending it to cover multiple files when
people copy a whole directory (or unpack a tar-file, or similar).

Now, I hear you say "those are so small these days that it doesn't
matter". And maybe you're right. But particularly for slow media,
triggering good streaming write behavior has been a problem in the
past.

So I'm wondering whether the "writebehind" state should perhaps be
considered a process state, rather than "struct file" state, and
also start triggering for writing smaller files.

Maybe this was already discussed and people decided that the big-file
case was so much easier that it wasn't worth worrying about
writebehind for multiple files.

Linus

Re: [PATCH] perf record: fix priv level with branch sampling for paranoid=2

2019-09-20 Thread Stephane Eranian

On Fri, Sep 20, 2019 at 12:12 PM Jiri Olsa  wrote:
>
> On Tue, Sep 03, 2019 at 11:26:03PM -0700, Stephane Eranian wrote:
> > Now that the default perf_events paranoid level is set to 2, a regular user
> > cannot monitor kernel level activity anymore. As such, with the following
> > cmdline:
> >
> > $ perf record -e cycles date
> >
> > The perf tool first tries cycles:uk but then falls back to cycles:u
> > as can be seen in the perf report --header-only output:
> >
> >   cmdline : /export/hda3/tmp/perf.tip record -e cycles ls
> >   event : name = cycles:u, , id = { 436186, ... }
> >
> > This is okay as long as there is a way to learn the priv level was changed
> > internally by the tool.
> >
> > But consider a similar example:
> >
> > $ perf record -b -e cycles date
> > Error:
> > You may not have permission to collect stats.
> >
> > Consider tweaking /proc/sys/kernel/perf_event_paranoid,
> > which controls use of the performance events system by
> > unprivileged users (without CAP_SYS_ADMIN).
> > ...
> >
> > Why is that treated differently given that the branch sampling inherits the
> > priv level of the first event in this case, i.e., cycles:u? It turns out
> > that the branch sampling code is more picky and also checks exclude_hv.
> >
> > In the fallback path, perf record is setting exclude_kernel = 1, but it
> > does not change exclude_hv. This does not seem to match the restriction
> > imposed by paranoid = 2.
> >
> > This patch fixes the problem by forcing exclude_hv = 1 in the fallback
> > for paranoid=2. With this in place:
> >
> > $ perf record -b -e cycles date
> >   cmdline : /export/hda3/tmp/perf.tip record -b -e cycles ls
> >   event : name = cycles:u, , id = { 436847, ... }
> >
> > And the command succeeds as expected.
> >
> > Signed-off-by: Stephane Eranian 
> > ---
> >  tools/perf/util/evsel.c | 6 --
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> > index 85825384f9e8..3cbe06fdf7f7 100644
> > --- a/tools/perf/util/evsel.c
> > +++ b/tools/perf/util/evsel.c
> > @@ -2811,9 +2811,11 @@ bool perf_evsel__fallback(struct evsel *evsel, int err,
> >   if (evsel->name)
> >   free(evsel->name);
> >   evsel->name = new_name;
> > - scnprintf(msg, msgsize,
> > -"kernel.perf_event_paranoid=%d, trying to fall back to excluding kernel samples", paranoid);
> > + scnprintf(msg, msgsize, "kernel.perf_event_paranoid=%d, trying "
> > +   "to fall back to excluding kernel and hypervisor "
> > +   " samples", paranoid);
>
> extra space in here^
>
> Warning:
> kernel.perf_event_paranoid=2, trying to fall back to excluding kernel and hypervisor  samples
>
> other than that it looks good to me
>
Fixed in v2.

> Acked-by: Jiri Olsa 
>
> thanks,
> jirka
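
For the record, the double space comes from adjacent string literal
concatenation: one piece ends with a space and the next one starts with one.
A small stand-alone sketch of the two variants follows; the function names are
made up, and snprintf() stands in for the scnprintf() helper used in the tool.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* v1 layout: "hypervisor " followed by " samples" yields two spaces. */
static int demo_format_v1(char *msg, size_t size, int paranoid)
{
	assert(msg != NULL);
	return snprintf(msg, size, "kernel.perf_event_paranoid=%d, trying "
			"to fall back to excluding kernel and hypervisor "
			" samples", paranoid);
}

/* Fixed layout: only the left-hand piece carries the separating space. */
static int demo_format_v2(char *msg, size_t size, int paranoid)
{
	assert(msg != NULL);
	return snprintf(msg, size, "kernel.perf_event_paranoid=%d, trying "
			"to fall back to excluding kernel and hypervisor "
			"samples", paranoid);
}
```

Keeping the separator consistently at the end of each piece avoids both the
doubled and the missing space when a long message is split.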

[PATCH v2] perf record: fix priv level with branch sampling for paranoid=2

2019-09-20 Thread Stephane Eranian

Now that the default perf_events paranoid level is set to 2, a regular user
cannot monitor kernel level activity anymore. As such, with the following
cmdline:

$ perf record -e cycles date

The perf tool first tries cycles:uk but then falls back to cycles:u
as can be seen in the perf report --header-only output:

  cmdline : /export/hda3/tmp/perf.tip record -e cycles ls
  event : name = cycles:u, , id = { 436186, ... }

This is okay as long as there is a way to learn the priv level was changed
internally by the tool.

But consider a similar example:

$ perf record -b -e cycles date
Error:
You may not have permission to collect stats.

Consider tweaking /proc/sys/kernel/perf_event_paranoid,
which controls use of the performance events system by
unprivileged users (without CAP_SYS_ADMIN).
...

Why is that treated differently given that the branch sampling inherits the
priv level of the first event in this case, i.e., cycles:u? It turns out
that the branch sampling code is more picky and also checks exclude_hv.

In the fallback path, perf record is setting exclude_kernel = 1, but it
does not change exclude_hv. This does not seem to match the restriction
imposed by paranoid = 2.

This patch fixes the problem by forcing exclude_hv = 1 in the fallback
for paranoid=2. With this in place:

$ perf record -b -e cycles date
  cmdline : /export/hda3/tmp/perf.tip record -b -e cycles ls
  event : name = cycles:u, , id = { 436847, ... }

And the command succeeds as expected.

V2: fix an extra space in the fallback message.

Signed-off-by: Stephane Eranian 
---
 tools/perf/util/evsel.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 85825384f9e8..3cbe06fdf7f7 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2811,9 +2811,11 @@ bool perf_evsel__fallback(struct evsel *evsel, int err,
if (evsel->name)
free(evsel->name);
evsel->name = new_name;
-   scnprintf(msg, msgsize,
-"kernel.perf_event_paranoid=%d, trying to fall back to excluding kernel samples", paranoid);
+   scnprintf(msg, msgsize, "kernel.perf_event_paranoid=%d, trying "
+ "to fall back to excluding kernel and hypervisor "
+ " samples", paranoid);
evsel->core.attr.exclude_kernel = 1;
+   evsel->core.attr.exclude_hv = 1;

return true;
}
-- 
2.23.0.187.g17f5b7556c-goog
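
As a stand-alone illustration of the attribute state the fallback ends up
with, here is a sketch that fills a struct perf_event_attr for a user-only
cycles event. The helper name is made up and this is not the tool's code; the
struct, its exclude_kernel and exclude_hv bits and the constants come from the
UAPI header <linux/perf_event.h>. Branch sampling setup is left out.

```c
#include <assert.h>
#include <linux/perf_event.h>
#include <string.h>

/* Hypothetical helper: a cycles event restricted to user level. */
static void demo_init_user_only_cycles(struct perf_event_attr *attr)
{
	assert(attr != NULL);

	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_HARDWARE;
	attr->config = PERF_COUNT_HW_CPU_CYCLES;
	attr->exclude_kernel = 1;	/* what the fallback already did */
	attr->exclude_hv = 1;		/* what this patch adds */
}
```

The sketch only prepares the attribute; opening the event would still go
through perf_event_open() and remains subject to the paranoid setting.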

Re: [PATCH v2 4/4] task: RCUify the assignment of rq->curr

2019-09-20 Thread Frederic Weisbecker

On Sat, Sep 14, 2019 at 07:35:02AM -0500, Eric W. Biederman wrote:
> 
> The current task on the runqueue is currently read with rcu_dereference().
> 
> To obtain ordinary rcu semantics for an rcu_dereference of rq->curr it needs
> to be paired with rcu_assign_pointer of rq->curr, which provides the
> memory barrier necessary to order assignments to the task_struct
> and the assignment to rq->curr.
> 
> Unfortunately the assignment of rq->curr in __schedule is a hot path,
> and it has already been shown that additional barriers in that code
> will reduce the performance of the scheduler.  So I will attempt to
> describe below why you can effectively have ordinary rcu semantics
> without any additional barriers.
> 
> The assignment of rq->curr in init_idle is a slow path called once
> per cpu and that can use rcu_assign_pointer() without any concerns.
> 
> As I write this there are effectively two users of rcu_dereference on
> rq->curr.  There is the membarrier code in kernel/sched/membarrier.c
> that only looks at "->mm" after the rcu_dereference.  Then there is
> task_numa_compare() in kernel/sched/fair.c.  My best reading of the
> code shows that task_numa_compare only accesses: "->flags",
> "->cpus_ptr", "->numa_group", "->numa_faults[]",
> "->total_numa_faults", and "->se.cfs_rq".
> 
> The code in __schedule() essentially does:
>   rq_lock(...);
>   smp_mb__after_spinlock();
> 
>   next = pick_next_task(...);
>   rq->curr = next;
> 
>   context_switch(prev, next);
> 
> At the start of the function the rq_lock/smp_mb__after_spinlock
> pair provides a full memory barrier.  Further there is a full memory barrier
> in context_switch().
> 
> This means that any task that has already run and modified itself (the
> common case) has already seen two memory barriers before __schedule()
> runs and begins executing.  A task that modifies itself then sees a
> third full memory barrier pair with the rq_lock();
> 
> For a brand new task that is enqueued with wake_up_new_task() there
> are the memory barriers present from taking and releasing the
> pi_lock and the rq_lock as the process is enqueued, as well as the
> full memory barrier at the start of __schedule() assuming __schedule()
> happens on the same cpu.
> 
> This means that by the time we reach the assignment of rq->curr
> except for values on the task struct modified in pick_next_task
> the code has the same guarantees as if it used rcu_assign_pointer.
> 
> Reading through all of the implementations of pick_next_task it
> appears pick_next_task is limited to modifying the task_struct fields
> "->se", "->rt", "->dl".  These fields are the sched_entity structures
> of the various schedulers.
> 
> Further "->se.cfs_rq" is only changed in cgroup attach/move operations
> initiated by userspace.
> 
> Unless I have missed something this means that in practice the
> users of "rcu_dereference(rq->curr)" get normal rcu semantics of
> rcu_dereference() for the fields they care about, despite the
> assignment of rq->curr in __schedule() not using rcu_assign_pointer.
> 
> Link: 
> https://lore.kernel.org/r/20190903200603.gw2...@hirez.programming.kicks-ass.net
> Signed-off-by: "Eric W. Biederman" 
> ---
>  kernel/sched/core.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 69015b7c28da..668262806942 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3857,7 +3857,11 @@ static void __sched notrace __schedule(bool preempt)
>  
>   if (likely(prev != next)) {
>   rq->nr_switches++;
> - rq->curr = next;
> + /*
> +  * RCU users of rcu_dereference(rq->curr) may not see
> +  * changes to task_struct made by pick_next_task().
> +  */
> + RCU_INIT_POINTER(rq->curr, next);

It would be nice to have more explanation in the comment as to why we
don't use rcu_assign_pointer() here (the very fast-path issue), why it
is expected to be fine (the rq_lock() + post-spinlock barrier), and
under which conditions. A short summary of the changelog would do,
because that line implies way too many subtleties.

Thanks.

Verify ACK packet in handshake in kernel module (access TCP state table)

2019-09-20 Thread Swarm

First time emailing to this mailing list so please let me know if I made 
a mistake in how I sent it. I'm trying to receive a notification from 
the kernel once it verifies an ACK packet in a handshake. Problem is, 
there is no API or kernel resource I've seen that supports this feature 
for both syncookies and normal handshakes. Where exactly in the kernel 
does the ACK get verified? If there isn't a way to be notified of it, 
where should I start adding that feature into the kernel?

Re: [GIT PULL] VFIO updates for v5.4-rc1

2019-09-20 Thread pr-tracker-bot

The pull request you sent on Fri, 20 Sep 2019 15:12:26 -0600:

> git://github.com/awilliam/linux-vfio.git tags/vfio-v5.4-rc1

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/1ddd00276fd5fbd14dd5e366d8777dcd5f2d1b65

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker

Re: [GIT PULL] clk changes for the merge window

2019-09-20 Thread pr-tracker-bot

The pull request you sent on Fri, 20 Sep 2019 14:40:42 -0700:

> https://git.kernel.org/pub/scm/linux/kernel/git/clk/linux.git 
> tags/clk-for-linus

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/a703d279c57e1bfe2b6536c3a17c1c498b416d24

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker

Re: [PATCH 3.16 000/132] 3.16.74-rc1 review

2019-09-20 Thread Guenter Roeck

On Fri, Sep 20, 2019 at 10:16:49PM +0100, Ben Hutchings wrote:
> On Fri, 2019-09-20 at 13:04 -0700, Guenter Roeck wrote:
> > On Fri, Sep 20, 2019 at 03:23:34PM +0100, Ben Hutchings wrote:
> > > This is the start of the stable review cycle for the 3.16.74 release.
> > > There are 132 patches in this series, which will be posted as responses
> > > to this one.  If anyone has any issues with these being applied, please
> > > let me know.
> > > 
> > > Responses should be made by Mon Sep 23 20:00:00 UTC 2019.
> > > Anything received after that time might be too late.
> > > 
> > 
> > Build results:
> > total: 136 pass: 135 fail: 1
> > Failed builds:
> > arm:allmodconfig
> > Qemu test results:
> > total: 229 pass: 229 fail: 0
> > 
> > Build errors in arm:allmodconfig are along the lines of
> > 
> > In file included from include/linux/printk.h:5,
> >  from include/linux/kernel.h:13,
> >  from include/linux/clk.h:16,
> >  from drivers/gpu/drm/tilcdc/tilcdc_drv.h:21,
> >  from drivers/gpu/drm/tilcdc/tilcdc_drv.c:20:
> > include/linux/init.h:343:7: error: 'cleanup_module'
> > specifies less restrictive attribute than its target 'tilcdc_drm_fini': 
> > 'cold'
> > 
> > In addition to a few errors like that, there are literally thousands
> > of similar warnings.
> 
> It looks like this is triggered by you switching arm builds from gcc 8
> to 9, rather than by any code change.
> 
Ah, good point.

> Does it actually make sense to try to support building Linux 3.16 with
> gcc 9?  If so, I suppose I'll need to add:
> 

It helps streamline my builds and reduces the number of compilers
I have to keep around. No problem, though; I can switch back to an older
compiler for arm on 3.16.

Guenter

[PATCH] tracing: prevent memory leak

2019-09-20 Thread Navid Emamdoost

In predicate_parse(), one error path returns directly instead of going
to out_free, which leaks memory.

Signed-off-by: Navid Emamdoost 
---
 kernel/trace/trace_events_filter.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index c773b8fb270c..c9a74f82b14a 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -452,8 +452,10 @@ predicate_parse(const char *str, int nr_parens, int nr_preds,
 
switch (*next) {
case '(':   /* #2 */
-   if (top - op_stack > nr_parens)
-   return ERR_PTR(-EINVAL);
+   if (top - op_stack > nr_parens) {
+   ret = -EINVAL;
+   goto out_free;
+   }
*(++top) = invert;
continue;
case '!':   /* #3 */
-- 
2.17.1

Re: [PATCH 6/7] pwm: jz4740: Make PWM start with the active part

2019-09-20 Thread Thierry Reding

On Mon, Aug 12, 2019 at 11:58:53PM +0200, Uwe Kleine-König wrote:
> On Mon, Aug 12, 2019 at 10:50:01PM +0200, Paul Cercueil wrote:
> > 
> > 
> > On Mon, Aug 12, 2019 at 7:55, Uwe Kleine-König wrote:
> > > On Fri, Aug 09, 2019 at 07:33:24PM +0200, Paul Cercueil wrote:
> > > > 
> > > > 
> > > >  On Fri, Aug 9, 2019 at 19:10, Uwe Kleine-König wrote:
> > > >  > On Fri, Aug 09, 2019 at 02:30:30PM +0200, Paul Cercueil wrote:
> > > >  > >  The PWM will always start with the inactive part. To counter
> > > >  > >  that, when PWM is enabled we switch the configured polarity,
> > > >  > >  and use 'period - duty + 1' as the real duty.
> > > >  >
> > > >  > Where does the + 1 come from? This looks wrong. (So if duty=0 is
> > > >  > requested you use duty = period + 1?)
> > > > 
> > > >  You'd never request duty == 0, would you?
> > > > 
> > > >  Your duty must always be in the inclusive range [1, period]
> > > >  (hardware values, not ns). A duty of 0 is a hardware fault
> > > >  (on the jz4740 it is).
> > > 
> > > From the PWM framework's POV duty cycle = 0 is perfectly valid. Similar
> > > > to duty == period. Not supporting duty cycle 0 is another limitation of
> > > your PWM that should be documented.
> > > 
> > > For actual use cases of duty cycle = 0 see drivers/hwmon/pwm-fan.c or
> > > drivers/leds/leds-pwm.c.
> > 
> > Perfectly valid for the PWM framework, maybe; but what is the expected
> > output then? A constant inactive state?
> 
> Yes, a constant inactive state is expected. This is consistent: in the
> same way, a constant active output is expected when using
> duty == period.
> 
> > Then I guess I can just disable the PWM output in the driver when
> > configured with duty == 0.
> 
> Some time ago I argued with Thierry that we could drop the concept of
> enabled/disabled for a PWM because a disabled PWM is supposed to behave
> identically to duty=0. This is however only nearly true because with
> duty=0 the time the PWM is inactive still is a multiple of the period.
> 
> I tend to agree that disabling the PWM when duty=0 is requested is
> better than to fail the request (or configure for duty=1 $whateverunit).
> I'm looking forward to what Thierry's opinion is here.

Agreed. If in order to meet the expectations of duty == 0 you have to
disable the PWM, then that's what you should do.

Thierry
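
The outcome of this subthread (invert the polarity, program
'period - duty + 1', and disable the output when duty == 0 is requested)
can be sketched in a few lines. This is only an illustration in Python,
not driver code: the function name, the dictionary keys and the range
check are assumptions made for the example, and all values are hardware
ticks rather than ns.

```python
def pwm_hw_config(period, duty):
    """Map a requested (period, duty) pair, in hardware ticks, to settings.

    Hypothetical helper for illustration only; not the jz4740 driver API.
    """
    if period < 1 or not 0 <= duty <= period:
        raise ValueError("need period >= 1 and 0 <= duty <= period")

    if duty == 0:
        # The hardware cannot represent duty == 0, so the output is
        # disabled to get the expected constant inactive level.
        return {"enabled": False, "invert_polarity": False, "hw_duty": None}

    # The hardware always starts with the inactive part, so the polarity
    # is inverted and the complementary duty is programmed instead.
    return {"enabled": True, "invert_polarity": True,
            "hw_duty": period - duty + 1}


if __name__ == "__main__":
    for duty in (0, 1, 50, 100):
        print(duty, pwm_hw_config(100, duty))
```

With this mapping the programmed duty always stays within [1, period],
which is the range the hardware is said to accept.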



Re: [PATCH V3 3/4] ASoC: pcm_dmaengine: Extract snd_dmaengine_pcm_refine_runtime_hwparams

2019-09-20 Thread Nicolin Chen

On Thu, Sep 19, 2019 at 08:11:41PM +0800, Shengjiu Wang wrote:
> When setting the runtime hardware parameters, we may need to query
> the capability of the DMA to complete the parameters.
> 
> This patch extracts this operation from the
> dmaengine_pcm_set_runtime_hwparams function into a separate function,
> snd_dmaengine_pcm_refine_runtime_hwparams, so that other components
> which need this feature can call it.
> 
> Signed-off-by: Shengjiu Wang 

> @@ -145,58 +140,15 @@ static int dmaengine_pcm_set_runtime_hwparams(struct 
> snd_pcm_substream *substrea

> + ret = snd_dmaengine_pcm_refine_runtime_hwparams(substream,
> + dma_data,
> + &hw,
> + chan);
> + if (ret)
> + return ret;
>  
>   return snd_soc_set_runtime_hwparams(substream, &hw);
> +
> }

Just a nit, why add a line here? :)

The rest looks good to me, not sure whether the name "refine"
would be the best one though, would like to wait for opinions
from others.

Thanks

Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()

2019-09-20 Thread Linus Torvalds

On Fri, Sep 20, 2019 at 1:51 PM Andy Lutomirski  wrote:
>
> To be clear, when I say "blocking", I mean "blocks until we're ready,
> but we make sure we're ready in a moderately timely manner".

.. and I want a pony.

The problem is that you start from an assumption that we simply can't
seem to do.

> In other words, I want GRND_SECURE_BLOCKING and /dev/random reads to
> genuinely always work and to genuinely never take much longer than 5s.
> I don't want a special case where they fail.

Honestly, if that's the case and we _had_ such a method of
initializing the rng, then I suspect we could just ignore the flags
entirely, with the possible exception of GRND_NONBLOCK. And even that
is "possible exception", because once your worst-case is a one-time
delay of 5s at boot time thing, you might as well consider it
nonblocking in general.

Yes, there are some in-kernel users that really can't afford to do
even that 5s delay (not just may they be atomic, but more likely it's
just that we don't want to delay _everything_ by 5s), but they don't
use the getrandom() system call anyway.

> The exposed user APIs are, subject to bikeshedding that can happen
> later over the actual values, etc:

So the thing is, you start from the impossible assumption, and _if_
you hold that assumption then we might as well just keep the existing
"zero means blocking", because nobody minds.

I'd love to say "yes, we can guarantee good enough entropy for
everybody in 5s and we don't even need to warn about it, because
everybody will be comfortable with the state of our entropy at that
point".

It sounds like a _lovely_ model.

But honestly, it simply sounds unlikely.

Now, there are different kinds of unlikely.

In particular, if you actually have a CPU cycle counter that actually
runs at least on the same order of magnitude as the CPU frequency -
then I believe in the jitter entropy more than in many other cases.

Sadly, many platforms don't have that kind of cycle counter.

I've also not seen a hugely believable "yes, the jitter entropy is
real" paper. Alexander points to the existing jitterentropy crypto
code, and claims it can fill all our entropy needs in two seconds, but
there are big caveats:

 (a) that code uses random_get_entropy(), which on a PC is that nice
fast TSC that we want. On other platforms (or on really old PC's - we
technically support CPU's still that don't have rdtsc)? It might be
zero. Every time.

 (b) How was it tested? There are lots of randomness tests, but most
of them can be fooled with a simple counter through a cryptographic
hash - which you basically need to do anyway on whatever entropy
source you have in order to "whiten" it. It's simply _really_ hard to
decide on entropy.

So it's really easy to make the randomness of some input look really
good, without any real idea how good it truly is. And maybe it really
is very very good on one particular machine, and then on another one
(with either a simpler in-order core or a lower-frequency timestamp
counter) it might be horrendously bad, and you'll never know.
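
The "simple counter through a cryptographic hash" point can be tried
directly. The sketch below is an illustration only, with assumed choices
throughout (SHA-256 as the hash, a monobit frequency count as the
"randomness test", hypothetical function names); it is not how any kernel
code measures entropy. The input is fully predictable, yet the statistic
looks fine.

```python
import hashlib


def counter_hash_bits(n_blocks):
    """Return the bits of SHA-256(0), SHA-256(1), ... as a list of 0/1."""
    bits = []
    for counter in range(n_blocks):
        digest = hashlib.sha256(counter.to_bytes(8, "big")).digest()
        for byte in digest:
            bits.extend((byte >> shift) & 1 for shift in range(8))
    return bits


def ones_fraction(bits):
    """Monobit statistic: fraction of bits that are 1 (ideal is 0.5)."""
    return sum(bits) / len(bits)


if __name__ == "__main__":
    stream = counter_hash_bits(1000)
    # The input is a plain counter, so anyone who knows the scheme can
    # reproduce every bit, yet the frequency statistic looks unremarkable.
    print(len(stream), round(ones_fraction(stream), 4))
```

A statistic like this says something about the whitening step and
nothing about how much entropy went in.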

So I'd love to believe in your simple model. Really. I just don't see
how to get there reliably.

Matthew Garrett pointed to one analysis on jitterentropy, and that one
wasn't all that optimistic.

I do think jitterentropy would likely be good enough in practice - at
least on PC's with a TSC - for the fairly small window at boot and
getrandom(0). As I mentioned, I don't think it will make anybody
_happy_, but it might be one of those things where it's a compromise
that at least works for people, with the key generation people who are
really unhappy with it having a new option for their case.

And maybe Alexander can convince people that when you run the
jitterentropy code a hundred billion times, the end result (not the
random stream from it, but the jitter bits themselves - but I'm not
even sure how to boil it down) - really is random.

 Linus

RE: [RFC] buildtar: add case for riscv architecture

2019-09-20 Thread Palmer Dabbelt


On Tue, 17 Sep 2019 02:35:10 PDT (-0700), m...@aurabindo.in wrote:

‐‐‐ Original Message ‐‐‐
On Sunday, September 15, 2019 12:57 AM, Palmer Dabbelt wrote:


On Sat, 14 Sep 2019 06:05:59 PDT (-0700), Anup Patel wrote:

> > -Original Message-
> > From: linux-kernel-ow...@vger.kernel.org On Behalf Of Palmer Dabbelt
> > Sent: Saturday, September 14, 2019 6:30 PM
> > To: m...@aurabindo.in
> > Cc: Troy Benjegerdes troy.benjeger...@sifive.com; Paul Walmsley
> > paul.walms...@sifive.com; a...@eecs.berkeley.edu; linux-
> > ri...@lists.infradead.org; linux-kernel@vger.kernel.org; linux-
> > kbu...@vger.kernel.org
> > Subject: Re: [RFC] buildtar: add case for riscv architecture
> > On Wed, 11 Sep 2019 05:54:07 PDT (-0700), m...@aurabindo.in wrote:
> >
> > > > None of the available RiscV platforms that I’m aware of use compressed
> > > > images, unless there are some new bootloaders I haven’t seen yet.
> > >
> > > >
> > >
> > > I noticed that the default build image is Image.gz, which is why I
> > > thought it's a good idea to copy it into the tarball. Does such a
> > > copy not make sense at this point?
> >
> > Image.gz can't be booted directly: it's just Image that's been compressed
> > with the standard gzip command. A bootloader would have to decompress
> > that image before loading it into memory, which requires extra bootloader
> > support.
> > Contrast that with the zImage style images (which are vmlinuz on x86), which
> > are self-extracting and therefore require no bootloader support. The
> > examples for u-boot all use the "booti" command, which expects
> > uncompressed images.
> > Poking around I couldn't figure out a way to have u-boot decompress the
> > images, but that applies to arm64 as well so I'm not sure if I'm missing
> > something.
> > If I was doing this, I'd copy over arch/riscv/boot/Image and call it
> > "/boot/image-${KERNELRELEASE}", as calling it vmlinuz is a bit confusing to
> > me because I'd expect vmlinuz to be a self-extracting compressed
> > executable and not a raw gzip file.
>
> On the contrary, it is indeed possible to boot Image.gz directly using
> U-Boot booti command so this patch would be useful.
> Atish had got it working on U-Boot but he has deferred booti Image.gz
> support due to a few more dependent changes. Maybe he can share
> more info.

Oh, great. I guess it makes sense to just put both in the tarball, then, as
users will still need to use the Image format for now.



Uncompressed vmlinux is already copied by default. This patch just adds the
Image.gz into the archive as vmlinuz. But as you said, since the name vmlinuz is
reserved for self-extracting archives, should I keep the original name, Image.gz?


vmlinux is not the same as Image: vmlinux is an ELF file that can't be loaded 
directly by most bootloaders, Image is a mostly-flat binary with a small header 
that we're expecting can be booted by most bootloaders.
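
The three artifacts discussed here can be told apart by their first
bytes. The sketch below is an illustration only: the function name is
hypothetical, and it relies solely on the standard gzip and ELF magic
numbers. It deliberately does not parse the small header of the flat
Image, since its layout is not described in this thread.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"   # first two bytes of any gzip stream
ELF_MAGIC = b"\x7fELF"     # first four bytes of any ELF file


def classify_kernel_file(data):
    """Classify a kernel build artifact by its leading bytes."""
    if data.startswith(GZIP_MAGIC):
        return "gzip"   # e.g. Image.gz: must be decompressed before booting
    if data.startswith(ELF_MAGIC):
        return "elf"    # e.g. vmlinux
    return "other"      # e.g. a flat Image; its header is not parsed here


if __name__ == "__main__":
    print(classify_kernel_file(gzip.compress(b"flat image payload")))
    print(classify_kernel_file(ELF_MAGIC + b"\x02\x01\x01"))
    print(classify_kernel_file(b"\x00" * 64))
```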

[PATCH v5 0/1] intel_cht_int33fe: Split code to USB Micro-B and Type-C variants

2019-09-20 Thread Yauhen Kharuzhy

Patch to support INT33FE ACPI pseudo-device on hardware with USB Micro-B
connector.

v5:
- Spelling corrections in Kconfig, commit description and comments;
- Micro-B code: Remove warning at fuel gauge registration failure and
  use PTR_ERR_OR_ZERO() for simplicity.

v4:
- Micro-B variant: Don't print error to the kernel log if i2c_acpi_new_device()
  has returned -EPROBE_DEFER.

v3:
- Rename TypeB variant to Micro-B (we have only one such device for now and it
  has Micro-B connector)
- Rebase on current linus/master
- Remove empty lines and replace "TypeC" by "Type-C"

v2:
Instead of defining two separated modules with two separated config
options, compile {common,typeb,typec} sources into one .ko module.
Call the needed variant-specific probe function based on the hardware type
detected in common code.

Yauhen Kharuzhy (1):
  platform/x86/intel_cht_int33fe: Split code to USB Micro-B and Type-C
variants

 drivers/platform/x86/Kconfig  |  10 +-
 drivers/platform/x86/Makefile |   4 +
 .../platform/x86/intel_cht_int33fe_common.c   | 147 ++
 .../platform/x86/intel_cht_int33fe_common.h   |  41 +
 .../platform/x86/intel_cht_int33fe_microb.c   |  57 +++
 ...ht_int33fe.c => intel_cht_int33fe_typec.c} |  78 +-
 6 files changed, 265 insertions(+), 72 deletions(-)
 create mode 100644 drivers/platform/x86/intel_cht_int33fe_common.c
 create mode 100644 drivers/platform/x86/intel_cht_int33fe_common.h
 create mode 100644 drivers/platform/x86/intel_cht_int33fe_microb.c
 rename drivers/platform/x86/{intel_cht_int33fe.c => intel_cht_int33fe_typec.c} (82%)

-- 
2.23.0.rc1

[PATCH v5 1/1] platform/x86/intel_cht_int33fe: Split code to USB Micro-B and Type-C variants

2019-09-20 Thread Yauhen Kharuzhy

The existing intel_cht_int33fe ACPI pseudo-device driver assumes that the
hardware has a Type-C connector and registers the related devices
described as I2C connections in the _CRS resource.

At least one device (Lenovo Yoga Book YB1-91L/F) with a Micro-B USB
connector exists. It has an INT33FE device in the DSDT table, but only
two I2C connections are described: the PMIC and the BQ27452 battery
fuel gauge.

Splitting the existing INT33FE driver allows the code for the USB Micro-B
(or AB) connector variant to be maintained separately and makes it simpler.

Split the driver into intel_cht_int33fe_common.c and
intel_cht_int33fe_{microb,typec}.c. Compile all these sources into one .ko
module to keep things simple for the user.

Signed-off-by: Yauhen Kharuzhy 
---
 drivers/platform/x86/Kconfig  |  10 +-
 drivers/platform/x86/Makefile |   4 +
 .../platform/x86/intel_cht_int33fe_common.c   | 147 ++
 .../platform/x86/intel_cht_int33fe_common.h   |  41 +
 .../platform/x86/intel_cht_int33fe_microb.c   |  57 +++
 ...ht_int33fe.c => intel_cht_int33fe_typec.c} |  78 +-
 6 files changed, 265 insertions(+), 72 deletions(-)
 create mode 100644 drivers/platform/x86/intel_cht_int33fe_common.c
 create mode 100644 drivers/platform/x86/intel_cht_int33fe_common.h
 create mode 100644 drivers/platform/x86/intel_cht_int33fe_microb.c
 rename drivers/platform/x86/{intel_cht_int33fe.c => intel_cht_int33fe_typec.c} (82%)

diff --git a/drivers/platform/x86/Kconfig b/drivers/platform/x86/Kconfig
index 1b67bb578f9f..e9e5aa791caf 100644
--- a/drivers/platform/x86/Kconfig
+++ b/drivers/platform/x86/Kconfig
@@ -930,14 +930,20 @@ config INTEL_CHT_INT33FE
  This driver add support for the INT33FE ACPI device found on
  some Intel Cherry Trail devices.
 
+ There are two kinds of INT33FE ACPI device possible: for hardware
+ with USB Type-C and Micro-B connectors. This driver supports both.
+
  The INT33FE ACPI device has a CRS table with I2cSerialBusV2
- resources for 3 devices: Maxim MAX17047 Fuel Gauge Controller,
+ resources for Fuel Gauge Controller and (in the Type-C variant)
  FUSB302 USB Type-C Controller and PI3USB30532 USB switch.
  This driver instantiates i2c-clients for these, so that standard
  i2c drivers for these chips can bind to the them.
 
  If you enable this driver it is advised to also select
- CONFIG_TYPEC_FUSB302=m and CONFIG_BATTERY_MAX17042=m.
+ CONFIG_BATTERY_BQ27XXX=m or CONFIG_BATTERY_BQ27XXX_I2C=m for Micro-B
+ device and CONFIG_TYPEC_FUSB302=m and CONFIG_BATTERY_MAX17042=m
+ for Type-C device.
+
 
 config INTEL_INT0002_VGPIO
tristate "Intel ACPI INT0002 Virtual GPIO driver"
diff --git a/drivers/platform/x86/Makefile b/drivers/platform/x86/Makefile
index 415104033060..216d3b6fd6a7 100644
--- a/drivers/platform/x86/Makefile
+++ b/drivers/platform/x86/Makefile
@@ -61,6 +61,10 @@ obj-$(CONFIG_TOSHIBA_BT_RFKILL)  += toshiba_bluetooth.o
 obj-$(CONFIG_TOSHIBA_HAPS) += toshiba_haps.o
 obj-$(CONFIG_TOSHIBA_WMI)  += toshiba-wmi.o
 obj-$(CONFIG_INTEL_CHT_INT33FE)+= intel_cht_int33fe.o
+intel_cht_int33fe-objs := intel_cht_int33fe_common.o \
+  intel_cht_int33fe_typec.o \
+  intel_cht_int33fe_microb.o
+
 obj-$(CONFIG_INTEL_INT0002_VGPIO) += intel_int0002_vgpio.o
 obj-$(CONFIG_INTEL_HID_EVENT)  += intel-hid.o
 obj-$(CONFIG_INTEL_VBTN)   += intel-vbtn.o
diff --git a/drivers/platform/x86/intel_cht_int33fe_common.c b/drivers/platform/x86/intel_cht_int33fe_common.c
new file mode 100644
index ..42dd11623f56
--- /dev/null
+++ b/drivers/platform/x86/intel_cht_int33fe_common.c
@@ -0,0 +1,147 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Common code for Intel Cherry Trail ACPI INT33FE pseudo device drivers
+ * (USB Micro-B and Type-C connector variants).
+ *
+ * Copyright (c) 2019 Yauhen Kharuzhy 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "intel_cht_int33fe_common.h"
+
+#define EXPECTED_PTYPE 4
+
+static int cht_int33fe_i2c_res_filter(struct acpi_resource *ares, void *data)
+{
+   struct acpi_resource_i2c_serialbus *sb;
+   int *count = data;
+
+   if (i2c_acpi_get_i2c_resource(ares, &sb))
+   (*count)++;
+
+   return 1;
+}
+
+static int cht_int33fe_count_i2c_clients(struct device *dev)
+{
+   struct acpi_device *adev;
+   LIST_HEAD(resource_list);
+   int count = 0;
+
+   adev = ACPI_COMPANION(dev);
+   if (!adev)
+   return -EINVAL;
+
+   acpi_dev_get_resources(adev, &resource_list,
+  cht_int33fe_i2c_res_filter, &count);
+
+   acpi_dev_free_resource_list(&resource_list);
+
+   return count;
+}
+
+static int cht_int33fe_check_hw_type(struct device *dev)
+{
+   unsigned long long ptyp;
+   acpi_status status;
+   int ret;
+
+   status =

Re: [PATCH RFC 02/14] drivers: irqchip: pdc: Do not toggle IRQ_ENABLE during mask/unmask

2019-09-20 Thread Lina Iyer


On Fri, Sep 20 2019 at 16:22 -0600, Stephen Boyd wrote:

>Quoting Lina Iyer (2019-09-11 09:15:57)
>> On Thu, Sep 05 2019 at 18:39 -0600, Stephen Boyd wrote:
>> >Quoting Lina Iyer (2019-08-29 11:11:51)
>> >> When an interrupt is to be serviced, the convention is to mask the
>> >> interrupt at the chip and unmask after servicing the interrupt. Enabling
>> >> and disabling the interrupt at the PDC irqchip causes an interrupt storm
>> >> due to the way dual edge interrupts are handled in hardware.
>> >>
>> >> Skip configuring the PDC when the IRQ is masked and unmasked, instead
>> >> use the irq_enable/irq_disable callbacks to toggle the IRQ_ENABLE
>> >> register at the PDC. The PDC's IRQ_ENABLE register is only used during
>> >> the monitoring mode when the system is asleep and is not needed for
>> >> active mode detection.
>> >
>> >I think this is saying that we want to always let the line be sent
>> >through the PDC to the parent irqchip, in this case GIC, so that we
>> >don't get an interrupt storm for dual edge interrupts? Why do dual
>> >edge interrupts cause a problem?
>> >
>> I am not sure about the hardware details, but the PDC designers did not
>> expect enable and disable to be called whenever the interrupt is
>> handled. This especially becomes a problem for dual edge interrupts, which
>> seem to generate an interrupt storm when enabled/disabled while handling
>> the interrupt.
>>
>
>Ok. I just wanted to confirm that masking "doesn't matter" to the PDC
>because it assumes the irqchip closer to the CPU will be able to mask it
>anyway. Is that right?


That is correct.

Re: [PATCH RFC 02/14] drivers: irqchip: pdc: Do not toggle IRQ_ENABLE during mask/unmask

2019-09-20 Thread Stephen Boyd

Quoting Lina Iyer (2019-09-11 09:15:57)
> On Thu, Sep 05 2019 at 18:39 -0600, Stephen Boyd wrote:
> >Quoting Lina Iyer (2019-08-29 11:11:51)
> >> When an interrupt is to be serviced, the convention is to mask the
> >> interrupt at the chip and unmask after servicing the interrupt. Enabling
> >> and disabling the interrupt at the PDC irqchip causes an interrupt storm
> >> due to the way dual edge interrupts are handled in hardware.
> >>
> >> Skip configuring the PDC when the IRQ is masked and unmasked, instead
> >> use the irq_enable/irq_disable callbacks to toggle the IRQ_ENABLE
> >> register at the PDC. The PDC's IRQ_ENABLE register is only used during
> >> the monitoring mode when the system is asleep and is not needed for
> >> active mode detection.
> >
> >I think this is saying that we want to always let the line be sent
> >through the PDC to the parent irqchip, in this case GIC, so that we
> >don't get an interrupt storm for dual edge interrupts? Why do dual
> >edge interrupts cause a problem?
> >
> I am not sure about the hardware details, but the PDC designers did not
> expect enable and disable to be called whenever the interrupt is
> handled. This especially becomes a problem for dual edge interrupts, which
> seem to generate an interrupt storm when enabled/disabled while handling
> the interrupt.
> 

Ok. I just wanted to confirm that masking "doesn't matter" to the PDC
because it assumes the irqchip closer to the CPU will be able to mask it
anyway. Is that right?

Re: [PATCH RFC 05/14] dt-bindings/interrupt-controller: pdc: add SPI config register

2019-09-20 Thread Stephen Boyd

Quoting Lina Iyer (2019-09-17 14:50:20)
> On Fri, Sep 13 2019 at 13:53 -0600, Lina Iyer wrote:
> >On Thu, Sep 05 2019 at 18:03 -0600, Stephen Boyd wrote:
> >>Quoting Lina Iyer (2019-09-03 10:07:22)
> >>>On Mon, Sep 02 2019 at 07:58 -0600, Marc Zyngier wrote:
> On 02/09/2019 14:38, Rob Herring wrote:
> > On Thu, Aug 29, 2019 at 12:11:54PM -0600, Lina Iyer wrote:
> >>>These are not GIC registers but located on the PDC interface to the GIC.
> >>>They may or may not be secure access controlled, depending on the SoC.
> >>>
> >>
> >>It looks like it falls under this "mailbox" device which is really the
> >>catch all bucket for bits with no home besides they're related to the
> >>apps CPUs/subsystem.
> >>
> >Thanks for pointing to this.
> >>  apss_shared: mailbox@1799 {
> >>  compatible = "qcom,sdm845-apss-shared";
> >>  reg = <0 0x1799 0 0x1000>;
> > >But this doesn't seem correct. The registers in this page are not all
> > >mailbox doorbell registers. We should restrict the space allocated to
> > >the mbox to 0xC or something, definitely not the whole page. They
> > >cannot all be treated as mailbox registers.

Well the binding is already done and this is the compatible string for
this node and register region. Sounds like this node is a mailbox plus
some more stuff in the same page.

> >>  #mbox-cells = <1>;
> >>  };
> >>
> >>Can you point to this node with a phandle and then parse the reg
> >>property out of it to use in the scm readl/writel APIs? Maybe it can be
> > >>a two cell property with <&apss_shared 0xf0> to indicate the offset to
> >>the registers to read/write? In non-secure mode presumably we need to
> >>also write these registers? Good news is that there's a regmap for this
> >>driver already, so maybe that can be acquired from the pdc driver.
> >>
> > >The register space collection seems to be a mix of different types of
> > >application processor registers that should probably not be grouped
> > >under one subsystem. A single regmap doesn't seem correct either.

Why isn't a single regmap correct? The PDC driver should be able to use
it to read/write into this register space. The lock on the regmap will
need to be changed to a raw lock though for RT. Otherwise it looks OK to
me.

Re: [PATCH v4 09/15] arm64: dts: msm8996: thermal: Add interrupt support

2019-09-20 Thread Amit Kucheria

On Fri, Sep 20, 2019 at 3:09 PM Stephen Boyd  wrote:
>
> Quoting Amit Kucheria (2019-09-20 15:07:25)
> > On Fri, Sep 20, 2019 at 3:02 PM Stephen Boyd  wrote:
> > >
> > > Quoting Amit Kucheria (2019-09-20 14:52:24)
> > > > Register upper-lower interrupts for each of the two tsens controllers.
> > > >
> > > > Signed-off-by: Amit Kucheria 
> > > > ---
> > > >  arch/arm64/boot/dts/qcom/msm8996.dtsi | 60 ++-
> > > >  1 file changed, 32 insertions(+), 28 deletions(-)
> > > >
> > > > diff --git a/arch/arm64/boot/dts/qcom/msm8996.dtsi b/arch/arm64/boot/dts/qcom/msm8996.dtsi
> > > > index 96c0a481f454..bb763b362c16 100644
> > > > --- a/arch/arm64/boot/dts/qcom/msm8996.dtsi
> > > > +++ b/arch/arm64/boot/dts/qcom/msm8996.dtsi
> > > > @@ -175,8 +175,8 @@
> > > >
> > > > thermal-zones {
> > > > cpu0-thermal {
> > > > -   polling-delay-passive = <250>;
> > > > -   polling-delay = <1000>;
> > > > +   polling-delay-passive = <0>;
> > > > +   polling-delay = <0>;
> > >
> > > I thought the plan was to make this unnecessary to change?
> >
> > IMO that change should be part of a different series to the thermal
> > core. I've not actually started working on it yet (traveling for the
> > next 10 days or so) but plan to do it.
> >
>
> Ok so the plan is to change DT and then change it back? That sounds
> quite bad so please fix the thermal core to not care about this before
> applying these changes so that we don't churn DT.

Hi Stephen,

Our emails crossed paths. I think we could just make the property
optional so that we can remove the property completely for drivers
that support interrupts. Comments?

That is a bigger change to the bindings and I don't want to hold the
tsens interrupt support hostage to agreement on this.

Regards,
Amit

Re: [PATCH bpf] libbpf: fix version identification on busybox

2019-09-20 Thread Ivan Khoronzhuk


On Fri, Sep 20, 2019 at 02:51:14PM -0700, Andrii Nakryiko wrote:

On Fri, Sep 20, 2019 at 12:19 PM Ivan Khoronzhuk
 wrote:


On Fri, Sep 20, 2019 at 09:34:51PM +0300, Ivan Khoronzhuk wrote:
>On Fri, Sep 20, 2019 at 09:41:54AM -0700, Andrii Nakryiko wrote:
>>On Fri, Sep 20, 2019 at 1:22 AM Ivan Khoronzhuk
>> wrote:
>>>
>>>On Thu, Sep 19, 2019 at 01:02:40PM -0700, Andrii Nakryiko wrote:
On Thu, Sep 19, 2019 at 11:22 AM Ivan Khoronzhuk
 wrote:
>
> Embedded systems very often have a stripped version of sort in
> busybox, with no -V option present. It breaks the native build on the
> target board, causing a recursive loop.
>
> BusyBox v1.24.1 (2019-04-06 04:09:16 UTC) multi-call binary. \
> Usage: sort [-nrugMcszbdfimSTokt] [-o FILE] [-k \
> start[.offset][opts][,end[.offset][opts]] [-t CHAR] [FILE]...
>
> Let's modify the command a little to avoid the -V option.
>
> Fixes: dadb81d0afe732 ("libbpf: make libbpf.map source of truth for libbpf version")
>
> Signed-off-by: Ivan Khoronzhuk 
> ---
>
> Based on bpf/master
>
>  tools/lib/bpf/Makefile | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tools/lib/bpf/Makefile b/tools/lib/bpf/Makefile
> index c6f94cffe06e..a12490ad6215 100644
> --- a/tools/lib/bpf/Makefile
> +++ b/tools/lib/bpf/Makefile
> @@ -3,7 +3,7 @@
>
>  LIBBPF_VERSION := $(shell \
> grep -oE '^LIBBPF_([0-9.]+)' libbpf.map | \
> -   sort -rV | head -n1 | cut -d'_' -f2)
> +   cut -d'_' -f2 | sort -r | head -n1)

You can't just sort alphabetically, because:

1.2
1.11

should be in that order. See discussion on mailing thread for original 
commit.
>>>
>>>if X1.X2.X3, where X = {0,1,9}
>>>Then it can be:
>>>
>>>-LIBBPF_VERSION := $(shell \
>>>-   grep -oE '^LIBBPF_([0-9.]+)' libbpf.map | \
>>>-   sort -rV | head -n1 | cut -d'_' -f2)
>>>+_LBPFLIST := $(patsubst %;,%,$(patsubst LIBBPF_%,%,$(filter LIBBPF_%, \
>>>+   $(shell cat libbpf.map))))
>>>+_LBPFLIST2 := $(foreach v,$(_LBPFLIST), \
>>>+   $(subst $() $(),,$(foreach n,$(subst .,$() $(),$(v)), \
>>>+   $(shell printf "%05d" $(n)))))
>>>+_LBPF_VER := $(word $(words $(sort $(_LBPFLIST2))), $(sort $(_LBPFLIST2)))
>>>+LIBBPF_VERSION := $(patsubst %_$(_LBPF_VER),%,$(filter %_$(_LBPF_VER), \
>>>+$(join $(addsuffix _, $(_LBPFLIST)),$(_LBPFLIST2))))
>>>
>>>It's bigger, but it avoids invoking grep/sort/cut/head (only cat/printf
>>>are used), and thus the -V option as well.
>>>
>>
>>No way, this is way too ugly (and still unreliable, if we ever have
>>X.Y.Z.W or something). I'd rather go with my original approach of
>Yes, forgot to add
>X1,X2,X3,...XN, where X = {0,1,9} and N = const for all versions.
>But frankly, 1.0.0 looks too far.

It actually works for any number of components X1.X2...X100,
but not when versions with different component counts are mixed, such as:
X1.X2.X3
and
X1.X2.X3.X4

But it is no problem at all to extend this solution to handle all cases
by just adding leading zeros to every "transformed version", say limiting it
to 10 possible 'dots' (%5*10d), and it will work like clockwork. The
advantage is that it uses mostly make functions.

There could be a couple more solutions with sed; I'm not sure they would
look any less maniacal.

>
>>fetching the last version in libbpf.map file. See
>>https://www.spinics.net/lists/netdev/msg592703.html.

Yes it's nice but, no sort, no X1.X2.X3XN

The main thing is to solve it for the long term.


Thinking a bit more about this, I'm even more convinced that we should
just go with my original approach: find last section in libbpf.map and
extract LIBBPF version from that. That will handle whatever crazy
version format we might decide to use (e.g., 1.2.3-experimental).
We'll just need to make sure that latest version is the last in
libbpf.map, which will just happen naturally. So instead of this
Makefile complexity, please can you port back my original approach?
Thanks!


I don't insist; I posted it for the record and to show that the versions can
be sorted alphabetically. I can live with cross-compilation, which I hope goes
in soon; on the host there is no need to worry about this at all. So I'd
better leave this change up to you.

--
Regards,
Ivan Khoronzhuk
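
The ordering problem discussed in this thread (1.11 must rank above 1.2, which a plain alphabetical sort gets wrong) can be sketched in C. This is only a minimal illustration, not code from libbpf or its Makefile: the function name version_cmp() is hypothetical, and it assumes versions made of digits and dots, treating a missing component as 0.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Compare two dotted version strings component by component, numerically.
 * Returns <0, 0 or >0 like strcmp(). A missing component counts as 0,
 * so "1.2" and "1.2.0" compare equal (an assumption of this sketch).
 */
static int version_cmp(const char *a, const char *b)
{
	assert(a != NULL && b != NULL);

	while (*a != '\0' || *b != '\0') {
		char *end_a, *end_b;
		/* strtoul() stops at the first non-digit: '.' or the end. */
		unsigned long na = strtoul(a, &end_a, 10);
		unsigned long nb = strtoul(b, &end_b, 10);

		if (na != nb)
			return na < nb ? -1 : 1;

		/* Neither side advanced: non-numeric tail, compare as text. */
		if (end_a == a && end_b == b)
			return strcmp(a, b);

		/* Skip the separator, if any, and go to the next component. */
		a = (*end_a == '.') ? end_a + 1 : end_a;
		b = (*end_b == '.') ? end_b + 1 : end_b;
	}
	return 0;
}
```

With this comparison "1.11" ranks above "1.2", whereas strcmp() orders them the other way round, which is the case `sort -r` without -V runs into.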

Re: [0/2] net: dsa: vsc73xx: Adjustments for vsc73xx_platform_probe()

2019-09-20 Thread Jakub Kicinski

On Fri, 20 Sep 2019 09:36:57 -0700, Florian Fainelli wrote:
> On 9/20/19 8:30 AM, Markus Elfring wrote:
> >> netdev is closed at the moment for patch.  
> > 
> > I wonder about this information.  
> 
> This is covered here:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/networking/netdev-FAQ.rst#n40
> 
> and you can skip the reading and check this URL:
> 
> http://vger.kernel.org/~davem/net-next.html

Indeed, looks like we have a mix of clean ups of varying clarity here.
I will just drop all, including the devm_platform_ioremap_resource()
conversion patches from patchwork for now. 

Markus, please repost them all once net-next opens.
Sorry for the inconvenience.

Re: [PATCH v4 09/15] arm64: dts: msm8996: thermal: Add interrupt support

2019-09-20 Thread Amit Kucheria

On Fri, Sep 20, 2019 at 3:07 PM Amit Kucheria  wrote:
>
> On Fri, Sep 20, 2019 at 3:02 PM Stephen Boyd  wrote:
> >
> > Quoting Amit Kucheria (2019-09-20 14:52:24)
> > > Register upper-lower interrupts for each of the two tsens controllers.
> > >
> > > Signed-off-by: Amit Kucheria 
> > > ---
> > >  arch/arm64/boot/dts/qcom/msm8996.dtsi | 60 ++-
> > >  1 file changed, 32 insertions(+), 28 deletions(-)
> > >
> > > diff --git a/arch/arm64/boot/dts/qcom/msm8996.dtsi 
> > > b/arch/arm64/boot/dts/qcom/msm8996.dtsi
> > > index 96c0a481f454..bb763b362c16 100644
> > > --- a/arch/arm64/boot/dts/qcom/msm8996.dtsi
> > > +++ b/arch/arm64/boot/dts/qcom/msm8996.dtsi
> > > @@ -175,8 +175,8 @@
> > >
> > > thermal-zones {
> > > cpu0-thermal {
> > > -   polling-delay-passive = <250>;
> > > -   polling-delay = <1000>;
> > > +   polling-delay-passive = <0>;
> > > +   polling-delay = <0>;
> >
> > I thought the plan was to make this unnecessary to change?
>
> IMO that change should be part of a different series to the thermal
> core. I've not actually started working on it yet (traveling for the
> next 10 days or so) but plan to do it.

In fact, I was thinking of making the entire property optional, so I
started down the path of converting the thermal bindings to YAML but
haven't finished the process yet.

Re: [PATCH v4 09/15] arm64: dts: msm8996: thermal: Add interrupt support

2019-09-20 Thread Stephen Boyd

Quoting Amit Kucheria (2019-09-20 15:07:25)
> On Fri, Sep 20, 2019 at 3:02 PM Stephen Boyd  wrote:
> >
> > Quoting Amit Kucheria (2019-09-20 14:52:24)
> > > Register upper-lower interrupts for each of the two tsens controllers.
> > >
> > > Signed-off-by: Amit Kucheria 
> > > ---
> > >  arch/arm64/boot/dts/qcom/msm8996.dtsi | 60 ++-
> > >  1 file changed, 32 insertions(+), 28 deletions(-)
> > >
> > > diff --git a/arch/arm64/boot/dts/qcom/msm8996.dtsi 
> > > b/arch/arm64/boot/dts/qcom/msm8996.dtsi
> > > index 96c0a481f454..bb763b362c16 100644
> > > --- a/arch/arm64/boot/dts/qcom/msm8996.dtsi
> > > +++ b/arch/arm64/boot/dts/qcom/msm8996.dtsi
> > > @@ -175,8 +175,8 @@
> > >
> > > thermal-zones {
> > > cpu0-thermal {
> > > -   polling-delay-passive = <250>;
> > > -   polling-delay = <1000>;
> > > +   polling-delay-passive = <0>;
> > > +   polling-delay = <0>;
> >
> > I thought the plan was to make this unnecessary to change?
> 
> IMO that change should be part of a different series to the thermal
> core. I've not actually started working on it yet (traveling for the
> next 10 days or so) but plan to do it.
> 

Ok so the plan is to change DT and then change it back? That sounds
quite bad so please fix the thermal core to not care about this before
applying these changes so that we don't churn DT.

Re: [PATCH v4 09/15] arm64: dts: msm8996: thermal: Add interrupt support

2019-09-20 Thread Amit Kucheria

On Fri, Sep 20, 2019 at 3:02 PM Stephen Boyd  wrote:
>
> Quoting Amit Kucheria (2019-09-20 14:52:24)
> > Register upper-lower interrupts for each of the two tsens controllers.
> >
> > Signed-off-by: Amit Kucheria 
> > ---
> >  arch/arm64/boot/dts/qcom/msm8996.dtsi | 60 ++-
> >  1 file changed, 32 insertions(+), 28 deletions(-)
> >
> > diff --git a/arch/arm64/boot/dts/qcom/msm8996.dtsi 
> > b/arch/arm64/boot/dts/qcom/msm8996.dtsi
> > index 96c0a481f454..bb763b362c16 100644
> > --- a/arch/arm64/boot/dts/qcom/msm8996.dtsi
> > +++ b/arch/arm64/boot/dts/qcom/msm8996.dtsi
> > @@ -175,8 +175,8 @@
> >
> > thermal-zones {
> > cpu0-thermal {
> > -   polling-delay-passive = <250>;
> > -   polling-delay = <1000>;
> > +   polling-delay-passive = <0>;
> > +   polling-delay = <0>;
>
> I thought the plan was to make this unnecessary to change?

IMO that change should be part of a different series to the thermal
core. I've not actually started working on it yet (traveling for the
next 10 days or so) but plan to do it.

Regards,
Amit

Re: [PATCH v4 15/15] drivers: thermal: tsens: Add interrupt support

2019-09-20 Thread Stephen Boyd

Quoting Amit Kucheria (2019-09-20 14:52:30)
> Depending on the IP version, TSENS supports upper, lower and critical
> threshold interrupts. We only add support for upper and lower threshold
> interrupts for now.
> 
> TSENSv2 has an irq [status|clear|mask] bit tuple for each sensor while
> earlier versions only have a single bit per sensor to denote status and
> clear. These differences are handled transparently by the interrupt
> handler. At each interrupt, we reprogram the new upper and lower threshold
> in the .set_trip callback.
> 
> Signed-off-by: Amit Kucheria 
> ---

Reviewed-by: Stephen Boyd

Re: [PATCH v4 14/15] drivers: thermal: tsens: Create function to return sign-extended temperature

2019-09-20 Thread Stephen Boyd

Quoting Amit Kucheria (2019-09-20 14:52:29)
> Hide the details of how to convert values read from TSENS HW to mCelsius
> behind a function. All versions of the IP can be supported as a result.
> 
> Signed-off-by: Amit Kucheria 
> ---

Reviewed-by: Stephen Boyd 

Just one nit below.

>  drivers/thermal/qcom/tsens-common.c | 50 +
>  1 file changed, 36 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/thermal/qcom/tsens-common.c 
> b/drivers/thermal/qcom/tsens-common.c
> index ea2c46cc6a66..6b6b3841c2d0 100644
> --- a/drivers/thermal/qcom/tsens-common.c
> +++ b/drivers/thermal/qcom/tsens-common.c
> @@ -310,6 +331,7 @@ int __init init_common(struct tsens_priv *priv)
> goto err_put_device;
> }
> }
> +
> for (i = 0, j = VALID_0; i < priv->feat->max_sensors; i++, j++) {
> priv->rf[j] = devm_regmap_field_alloc(dev, priv->tm_map,
>   priv->fields[j]);

Drop this hunk?
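
The idea of hiding the raw-to-mCelsius conversion behind one function can be sketched in userspace C. This is not the driver's code: sign_extend() and raw_to_mcelsius() are hypothetical names, and both the field width and the assumption that the hardware reports tenths of a degree Celsius are made up for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Interpret the low `bits` bits of `raw` as a two's-complement number
 * and widen it to a signed 32-bit value. `bits` must be in 1..32.
 */
static int32_t sign_extend(uint32_t raw, unsigned int bits)
{
	assert(bits >= 1 && bits <= 32);

	uint32_t sign = UINT32_C(1) << (bits - 1);
	uint32_t mask = sign | (sign - 1);	/* low `bits` bits set */
	uint32_t val = raw & mask;

	/* (val ^ sign) - sign maps the upper half of the range to negatives. */
	return (int32_t)((val ^ sign) - sign);
}

/*
 * Hypothetical conversion: assumes the field holds deci-Celsius,
 * so one unit corresponds to 100 millidegrees Celsius.
 */
static int32_t raw_to_mcelsius(uint32_t raw, unsigned int bits)
{
	return sign_extend(raw, bits) * 100;
}
```

Callers then only deal with millidegrees; the field width and scaling stay in one place and can differ per IP version.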

Re: [PATCH v4 09/15] arm64: dts: msm8996: thermal: Add interrupt support

2019-09-20 Thread Stephen Boyd

Quoting Amit Kucheria (2019-09-20 14:52:24)
> Register upper-lower interrupts for each of the two tsens controllers.
> 
> Signed-off-by: Amit Kucheria 
> ---
>  arch/arm64/boot/dts/qcom/msm8996.dtsi | 60 ++-
>  1 file changed, 32 insertions(+), 28 deletions(-)
> 
> diff --git a/arch/arm64/boot/dts/qcom/msm8996.dtsi 
> b/arch/arm64/boot/dts/qcom/msm8996.dtsi
> index 96c0a481f454..bb763b362c16 100644
> --- a/arch/arm64/boot/dts/qcom/msm8996.dtsi
> +++ b/arch/arm64/boot/dts/qcom/msm8996.dtsi
> @@ -175,8 +175,8 @@
>  
> thermal-zones {
> cpu0-thermal {
> -   polling-delay-passive = <250>;
> -   polling-delay = <1000>;
> +   polling-delay-passive = <0>;
> +   polling-delay = <0>;

I thought the plan was to make this unnecessary to change?

[RFC][PATCH RT 5/7] revert-block

2019-09-20 Thread Steven Rostedt

From: "Steven Rostedt (VMware)" 

Revert swork version of: block: blk-mq: move blk_queue_usage_counter_release() 
into process context

In order to switch to upstream, we need to revert the swork code.

Signed-off-by: Steven Rostedt (VMware) 
---
 block/blk-core.c   | 14 +-
 include/linux/blkdev.h |  2 --
 2 files changed, 1 insertion(+), 15 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index f7e16b4466f0..3f08d6fd0787 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -969,21 +969,12 @@ void blk_queue_exit(struct request_queue *q)
percpu_ref_put(&q->q_usage_counter);
 }
 
-static void blk_queue_usage_counter_release_swork(struct swork_event *sev)
-{
-   struct request_queue *q =
-   container_of(sev, struct request_queue, mq_pcpu_wake);
-
-   wake_up_all(&q->mq_freeze_wq);
-}
-
 static void blk_queue_usage_counter_release(struct percpu_ref *ref)
 {
struct request_queue *q =
container_of(ref, struct request_queue, q_usage_counter);
 
-   if (wq_has_sleeper(&q->mq_freeze_wq))
-   swork_queue(&q->mq_pcpu_wake);
+   wake_up_all(&q->mq_freeze_wq);
 }
 
 static void blk_rq_timed_out_timer(struct timer_list *t)
@@ -1080,7 +1071,6 @@ struct request_queue *blk_alloc_queue_node(gfp_t 
gfp_mask, int node_id,
queue_flag_set_unlocked(QUEUE_FLAG_BYPASS, q);
 
init_waitqueue_head(&q->mq_freeze_wq);
-   INIT_SWORK(&q->mq_pcpu_wake, blk_queue_usage_counter_release_swork);
 
/*
 * Init percpu_ref in atomic mode so that it's faster to shutdown.
@@ -3970,8 +3960,6 @@ int __init blk_dev_init(void)
if (!kblockd_workqueue)
panic("Failed to create kblockd\n");
 
-   BUG_ON(swork_get());
-
request_cachep = kmem_cache_create("blkdev_requests",
sizeof(struct request), 0, SLAB_PANIC, NULL);
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 7b7c0bc6a514..f1960add94df 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -27,7 +27,6 @@
 #include 
 #include 
 #include 
-#include 
 
 struct module;
 struct scsi_ioctl_command;
@@ -656,7 +655,6 @@ struct request_queue {
 #endif
struct rcu_head rcu_head;
wait_queue_head_t   mq_freeze_wq;
-   struct swork_event  mq_pcpu_wake;
struct percpu_ref   q_usage_counter;
struct list_headall_q_node;
 
-- 
2.20.1

[RFC][PATCH RT 3/7] revert-thermal

2019-09-20 Thread Steven Rostedt

From: "Steven Rostedt (VMware)" 

Revert: thermal: Defer thermal wakups to threads

Signed-off-by: Steven Rostedt (VMware) 
---
 drivers/thermal/x86_pkg_temp_thermal.c | 52 ++
 1 file changed, 3 insertions(+), 49 deletions(-)

diff --git a/drivers/thermal/x86_pkg_temp_thermal.c 
b/drivers/thermal/x86_pkg_temp_thermal.c
index a5991cbb408f..1ef937d799e4 100644
--- a/drivers/thermal/x86_pkg_temp_thermal.c
+++ b/drivers/thermal/x86_pkg_temp_thermal.c
@@ -29,7 +29,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 
@@ -330,7 +329,7 @@ static void pkg_thermal_schedule_work(int cpu, struct 
delayed_work *work)
schedule_delayed_work_on(cpu, work, ms);
 }
 
-static void pkg_thermal_notify_work(struct swork_event *event)
+static int pkg_thermal_notify(u64 msr_val)
 {
int cpu = smp_processor_id();
struct pkg_device *pkgdev;
@@ -349,47 +348,9 @@ static void pkg_thermal_notify_work(struct swork_event 
*event)
}
 
spin_unlock_irqrestore(&pkg_temp_lock, flags);
-}
-
-#ifdef CONFIG_PREEMPT_RT_FULL
-static struct swork_event notify_work;
-
-static int pkg_thermal_notify_work_init(void)
-{
-   int err;
-
-   err = swork_get();
-   if (err)
-   return err;
-
-   INIT_SWORK(&notify_work, pkg_thermal_notify_work);
return 0;
 }
 
-static void pkg_thermal_notify_work_cleanup(void)
-{
-   swork_put();
-}
-
-static int pkg_thermal_notify(u64 msr_val)
-{
-   swork_queue(&notify_work);
-   return 0;
-}
-
-#else  /* !CONFIG_PREEMPT_RT_FULL */
-
-static int pkg_thermal_notify_work_init(void) { return 0; }
-
-static void pkg_thermal_notify_work_cleanup(void) {  }
-
-static int pkg_thermal_notify(u64 msr_val)
-{
-   pkg_thermal_notify_work(NULL);
-   return 0;
-}
-#endif /* CONFIG_PREEMPT_RT_FULL */
-
 static int pkg_temp_thermal_device_add(unsigned int cpu)
 {
int pkgid = topology_logical_package_id(cpu);
@@ -554,16 +515,11 @@ static int __init pkg_temp_thermal_init(void)
if (!x86_match_cpu(pkg_temp_thermal_ids))
return -ENODEV;
 
-   if (!pkg_thermal_notify_work_init())
-   return -ENODEV;
-
max_packages = topology_max_packages();
packages = kcalloc(max_packages, sizeof(struct pkg_device *),
   GFP_KERNEL);
-   if (!packages) {
-   ret = -ENOMEM;
-   goto err;
-   }
+   if (!packages)
+   return -ENOMEM;
 
ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "thermal/x86_pkg:online",
pkg_thermal_cpu_online, 
pkg_thermal_cpu_offline);
@@ -581,7 +537,6 @@ static int __init pkg_temp_thermal_init(void)
return 0;
 
 err:
-   pkg_thermal_notify_work_cleanup();
kfree(packages);
return ret;
 }
@@ -595,7 +550,6 @@ static void __exit pkg_temp_thermal_exit(void)
cpuhp_remove_state(pkg_thermal_hp_state);
debugfs_remove_recursive(debugfs);
kfree(packages);
-   pkg_thermal_notify_work_cleanup();
 }
 module_exit(pkg_temp_thermal_exit)
 
-- 
2.20.1

[RFC][PATCH RT 0/7] Revert of simple work, and backport workqueue rework

2019-09-20 Thread Steven Rostedt

To be able to backport some of the fixes done in upstream RT, I need
to revert the simple work code and incorporate the work queue rework
that was done in rt-devel. I originally thought this crashed under
stress test (as I reported in the stable meeting), but found that
the system booted a different kernel that had known issues. 4.19-rt with
these patches has passed all my tests. But before posting an rc1,
I was hoping to get some people to look it over and perhaps even test
it some to make sure there aren't any issues here.

The reason I'm doing this is that this goes beyond the stable scope
for 4.19-rt, but I believe it is still necessary.

-- Steve



Daniel Wagner (1):
  thermal: Defer thermal wakups to threads

Sebastian Andrzej Siewior (3):
  fs/aio: simple simple work
  block: blk-mq: move blk_queue_usage_counter_release() into process context
  workqueue: rework

Steven Rostedt (VMware) (3):
  revert-aio
  revert-thermal
  revert-block


 block/blk-core.c   |  10 +-
 drivers/block/loop.c   |   2 +-
 drivers/spi/spi-rockchip.c |   1 -
 drivers/thermal/x86_pkg_temp_thermal.c |  52 +-
 fs/aio.c   |  12 +-
 include/linux/blk-cgroup.h |   2 +-
 include/linux/blkdev.h |   4 +-
 include/linux/interrupt.h  |   5 -
 include/linux/kthread-cgroup.h |  17 --
 include/linux/kthread.h|  15 +-
 include/linux/swait.h  |  14 ++
 include/linux/workqueue.h  |   4 -
 init/main.c|   1 -
 kernel/irq/manage.c|  36 +---
 kernel/kthread.c   |  14 --
 kernel/sched/core.c|   1 +
 kernel/time/hrtimer.c  |  24 ---
 kernel/workqueue.c | 300 ++---
 18 files changed, 168 insertions(+), 346 deletions(-)
 delete mode 100644 include/linux/kthread-cgroup.h

[RFC][PATCH RT 7/7] workqueue: rework

2019-09-20 Thread Steven Rostedt

From: Sebastian Andrzej Siewior 

[ Upstream commit d15a862f24df983458533aebd6fa207ecdd1095a ]

This is an all-in change of the workqueue rework.
The worker_pool.lock is converted to raw_spinlock_t. With this change we can
schedule work items from preempt-disabled sections and from sections with
interrupts disabled. This change allows us to remove all the kthread_.*
workarounds we used to have.

Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Steven Rostedt (VMware) 
---
 block/blk-core.c   |   6 +-
 drivers/block/loop.c   |   2 +-
 drivers/spi/spi-rockchip.c |   1 -
 drivers/thermal/x86_pkg_temp_thermal.c |  28 +--
 fs/aio.c   |  10 +-
 include/linux/blk-cgroup.h |   2 +-
 include/linux/blkdev.h |   2 +-
 include/linux/interrupt.h  |   5 -
 include/linux/kthread-cgroup.h |  17 --
 include/linux/kthread.h|  15 +-
 include/linux/swait.h  |  14 ++
 include/linux/workqueue.h  |   4 -
 init/main.c|   1 -
 kernel/irq/manage.c|  36 +--
 kernel/kthread.c   |  14 --
 kernel/sched/core.c|   1 +
 kernel/time/hrtimer.c  |  24 --
 kernel/workqueue.c | 300 +++--
 18 files changed, 164 insertions(+), 318 deletions(-)
 delete mode 100644 include/linux/kthread-cgroup.h

diff --git a/block/blk-core.c b/block/blk-core.c
index 8d4c5d69c5c4..f0cb86ef8215 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -969,7 +969,7 @@ void blk_queue_exit(struct request_queue *q)
percpu_ref_put(&q->q_usage_counter);
 }
 
-static void blk_queue_usage_counter_release_wrk(struct kthread_work *work)
+static void blk_queue_usage_counter_release_wrk(struct work_struct *work)
 {
struct request_queue *q =
container_of(work, struct request_queue, mq_pcpu_wake);
@@ -983,7 +983,7 @@ static void blk_queue_usage_counter_release(struct 
percpu_ref *ref)
container_of(ref, struct request_queue, q_usage_counter);
 
if (wq_has_sleeper(&q->mq_freeze_wq))
-   kthread_schedule_work(&q->mq_pcpu_wake);
+   schedule_work(&q->mq_pcpu_wake);
 }
 
 static void blk_rq_timed_out_timer(struct timer_list *t)
@@ -1080,7 +1080,7 @@ struct request_queue *blk_alloc_queue_node(gfp_t 
gfp_mask, int node_id,
queue_flag_set_unlocked(QUEUE_FLAG_BYPASS, q);
 
init_waitqueue_head(&q->mq_freeze_wq);
-   kthread_init_work(&q->mq_pcpu_wake, 
blk_queue_usage_counter_release_wrk);
+   INIT_WORK(&q->mq_pcpu_wake, blk_queue_usage_counter_release_wrk);
 
/*
 * Init percpu_ref in atomic mode so that it's faster to shutdown.
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 298c96b6befa..cef8e00c9d9d 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -70,7 +70,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
diff --git a/drivers/spi/spi-rockchip.c b/drivers/spi/spi-rockchip.c
index b56619418cea..fdcf3076681b 100644
--- a/drivers/spi/spi-rockchip.c
+++ b/drivers/spi/spi-rockchip.c
@@ -22,7 +22,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #define DRIVER_NAME "rockchip-spi"
 
diff --git a/drivers/thermal/x86_pkg_temp_thermal.c 
b/drivers/thermal/x86_pkg_temp_thermal.c
index 82f21fd4afb0..1ef937d799e4 100644
--- a/drivers/thermal/x86_pkg_temp_thermal.c
+++ b/drivers/thermal/x86_pkg_temp_thermal.c
@@ -29,7 +29,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 
@@ -330,7 +329,7 @@ static void pkg_thermal_schedule_work(int cpu, struct 
delayed_work *work)
schedule_delayed_work_on(cpu, work, ms);
 }
 
-static void pkg_thermal_notify_work(struct kthread_work *work)
+static int pkg_thermal_notify(u64 msr_val)
 {
int cpu = smp_processor_id();
struct pkg_device *pkgdev;
@@ -349,32 +348,8 @@ static void pkg_thermal_notify_work(struct kthread_work 
*work)
}
 
spin_unlock_irqrestore(&pkg_temp_lock, flags);
-}
-
-#ifdef CONFIG_PREEMPT_RT_FULL
-static DEFINE_KTHREAD_WORK(notify_work, pkg_thermal_notify_work);
-
-static int pkg_thermal_notify(u64 msr_val)
-{
-   kthread_schedule_work(&notify_work);
-   return 0;
-}
-
-static void pkg_thermal_notify_flush(void)
-{
-   kthread_flush_work(&notify_work);
-}
-
-#else  /* !CONFIG_PREEMPT_RT_FULL */
-
-static void pkg_thermal_notify_flush(void) { }
-
-static int pkg_thermal_notify(u64 msr_val)
-{
-   pkg_thermal_notify_work(NULL);
return 0;
 }
-#endif /* CONFIG_PREEMPT_RT_FULL */
 
 static int pkg_temp_thermal_device_add(unsigned int cpu)
 {
@@ -573,7 +548,6 @@ static void __exit pkg_temp_thermal_exit(void)
platform_thermal_package_rate_control = NULL;
 
cpuhp_remove_state(pkg_thermal_hp_state);
-   pkg_thermal_notify_flush();
debugfs_remove_recursive(debugfs);
kfree(packages);
 }
diff

[RFC][PATCH RT 1/7] revert-aio

2019-09-20 Thread Steven Rostedt

From: "Steven Rostedt (VMware)" 

revert: fs/aio: simple simple work

Signed-off-by: Steven Rostedt (VMware) 
---
 fs/aio.c | 15 ++-
 1 file changed, 2 insertions(+), 13 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 16dcf8521c2c..911e23087dfb 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -42,7 +42,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include 
 #include 
@@ -122,7 +121,6 @@ struct kioctx {
longnr_pages;
 
struct rcu_work free_rwork; /* see free_ioctx() */
-   struct swork_event  free_swork; /* see free_ioctx() */
 
/*
 * signals when all in-flight requests are done
@@ -267,7 +265,6 @@ static int __init aio_setup(void)
.mount  = aio_mount,
.kill_sb= kill_anon_super,
};
-   BUG_ON(swork_get());
aio_mnt = kern_mount(&aio_fs);
if (IS_ERR(aio_mnt))
panic("Failed to create aio fs mount.");
@@ -609,9 +606,9 @@ static void free_ioctx_reqs(struct percpu_ref *ref)
  * and ctx->users has dropped to 0, so we know no more kiocbs can be submitted 
-
  * now it's safe to cancel any that need to be.
  */
-static void free_ioctx_users_work(struct swork_event *sev)
+static void free_ioctx_users(struct percpu_ref *ref)
 {
-   struct kioctx *ctx = container_of(sev, struct kioctx, free_swork);
+   struct kioctx *ctx = container_of(ref, struct kioctx, users);
struct aio_kiocb *req;
 
spin_lock_irq(&ctx->ctx_lock);
@@ -629,14 +626,6 @@ static void free_ioctx_users_work(struct swork_event *sev)
percpu_ref_put(&ctx->reqs);
 }
 
-static void free_ioctx_users(struct percpu_ref *ref)
-{
-   struct kioctx *ctx = container_of(ref, struct kioctx, users);
-
-   INIT_SWORK(&ctx->free_swork, free_ioctx_users_work);
-   swork_queue(&ctx->free_swork);
-}
-
 static int ioctx_add_table(struct kioctx *ctx, struct mm_struct *mm)
 {
unsigned i, new_nr;
-- 
2.20.1

[RFC][PATCH RT 6/7] block: blk-mq: move blk_queue_usage_counter_release() into process context

2019-09-20 Thread Steven Rostedt

From: Sebastian Andrzej Siewior 

[ Upstream commit 61c928ecf4fe200bda9b49a0813b5ba0f43995b5 ]

| BUG: sleeping function called from invalid context at 
kernel/locking/rtmutex.c:914
| in_atomic(): 1, irqs_disabled(): 0, pid: 255, name: kworker/u257:6
| 5 locks held by kworker/u257:6/255:
|  #0:  ("events_unbound"){.+.+.+}, at: [] 
process_one_work+0x171/0x5e0
|  #1:  ((>work)){+.+.+.}, at: [] 
process_one_work+0x171/0x5e0
|  #2:  (>scan_mutex){+.+.+.}, at: [] 
__scsi_add_device+0xa3/0x130 [scsi_mod]
|  #3:  (>tag_list_lock){+.+...}, at: [] 
blk_mq_init_queue+0x96a/0xa50
|  #4:  (rcu_read_lock_sched){..}, at: [] 
percpu_ref_kill_and_confirm+0x1d/0x120
| Preemption disabled at:[] 
blk_mq_freeze_queue_start+0x56/0x70
|
| CPU: 2 PID: 255 Comm: kworker/u257:6 Not tainted 3.18.7-rt0+ #1
| Workqueue: events_unbound async_run_entry_fn
|  0003 8800bc29f998 815b3a12 
|   8800bc29f9b8 8109aa16 8800bc29fa28
|  8800bc5d1bc8 8800bc29f9e8 815b8dd4 8800
| Call Trace:
|  [] dump_stack+0x4f/0x7c
|  [] __might_sleep+0x116/0x190
|  [] rt_spin_lock+0x24/0x60
|  [] __wake_up+0x29/0x60
|  [] blk_mq_usage_counter_release+0x1e/0x20
|  [] percpu_ref_kill_and_confirm+0x106/0x120
|  [] blk_mq_freeze_queue_start+0x56/0x70
|  [] blk_mq_update_tag_set_depth+0x40/0xd0
|  [] blk_mq_init_queue+0x98c/0xa50
|  [] scsi_mq_alloc_queue+0x20/0x60 [scsi_mod]
|  [] scsi_alloc_sdev+0x2f5/0x370 [scsi_mod]
|  [] scsi_probe_and_add_lun+0x9e4/0xdd0 [scsi_mod]
|  [] __scsi_add_device+0x126/0x130 [scsi_mod]
|  [] ata_scsi_scan_host+0xaf/0x200 [libata]
|  [] async_port_probe+0x46/0x60 [libata]
|  [] async_run_entry_fn+0x3b/0xf0
|  [] process_one_work+0x201/0x5e0

percpu_ref_kill_and_confirm() invokes blk_mq_usage_counter_release() in
a rcu-sched region. swait based wake queue can't be used due to
wake_up_all() usage and disabled interrupts in !RT configs (as reported
by Corey Minyard).
The wq_has_sleeper() check has been suggested by Peter Zijlstra.

Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Steven Rostedt (VMware) 
---
 block/blk-core.c   | 12 +++-
 include/linux/blkdev.h |  2 ++
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 3f08d6fd0787..8d4c5d69c5c4 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -969,12 +969,21 @@ void blk_queue_exit(struct request_queue *q)
percpu_ref_put(&q->q_usage_counter);
 }
 
+static void blk_queue_usage_counter_release_wrk(struct kthread_work *work)
+{
+   struct request_queue *q =
+   container_of(work, struct request_queue, mq_pcpu_wake);
+
+   wake_up_all(&q->mq_freeze_wq);
+}
+
 static void blk_queue_usage_counter_release(struct percpu_ref *ref)
 {
struct request_queue *q =
container_of(ref, struct request_queue, q_usage_counter);
 
-   wake_up_all(&q->mq_freeze_wq);
+   if (wq_has_sleeper(&q->mq_freeze_wq))
+   kthread_schedule_work(&q->mq_pcpu_wake);
 }
 
 static void blk_rq_timed_out_timer(struct timer_list *t)
@@ -1071,6 +1080,7 @@ struct request_queue *blk_alloc_queue_node(gfp_t 
gfp_mask, int node_id,
queue_flag_set_unlocked(QUEUE_FLAG_BYPASS, q);
 
init_waitqueue_head(&q->mq_freeze_wq);
+   kthread_init_work(&q->mq_pcpu_wake, 
blk_queue_usage_counter_release_wrk);
 
/*
 * Init percpu_ref in atomic mode so that it's faster to shutdown.
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index f1960add94df..15a489abfb62 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -13,6 +13,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -655,6 +656,7 @@ struct request_queue {
 #endif
struct rcu_head rcu_head;
wait_queue_head_t   mq_freeze_wq;
+   struct kthread_work mq_pcpu_wake;
struct percpu_ref   q_usage_counter;
struct list_headall_q_node;
 
-- 
2.20.1
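
The pattern this patch relies on, embedding a work item in a larger structure and recovering the enclosing structure in the handler via container_of(), can be sketched in userspace C. Everything here is a hypothetical stand-in: struct work_item, struct fake_queue and run_work() are not kernel APIs, and run_work() calls the handler directly where a real worker thread would run it later in process context.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, simplified work item: just a callback. */
struct work_item {
	void (*func)(struct work_item *work);
};

/* Hypothetical stand-in for a structure that embeds a work item. */
struct fake_queue {
	int id;
	int wakeups;			/* how often the handler ran */
	struct work_item wake_work;
};

/* Handler: recover the enclosing fake_queue from the embedded member. */
static void fake_queue_wake(struct work_item *work)
{
	assert(work != NULL);

	struct fake_queue *q = container_of(work, struct fake_queue, wake_work);

	q->wakeups++;
}

/* "Run" a work item; a real worker thread would do this later. */
static void run_work(struct work_item *work)
{
	work->func(work);
}
```

The release callback only has to queue the embedded item; the handler then finds its queue without any extra lookup, which is why the patch adds a work member to struct request_queue.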

[RFC][PATCH RT 2/7] fs/aio: simple simple work

2019-09-20 Thread Steven Rostedt

From: Sebastian Andrzej Siewior 

[ Upstream commit 1a142116f6435ef070ecebb66d2d599507c10601 ]

|BUG: sleeping function called from invalid context at 
kernel/locking/rtmutex.c:768
|in_atomic(): 1, irqs_disabled(): 0, pid: 26, name: rcuos/2
|2 locks held by rcuos/2/26:
| #0:  (rcu_callback){.+.+..}, at: [] 
rcu_nocb_kthread+0x1e2/0x380
| #1:  (rcu_read_lock_sched){.+.+..}, at: [] 
percpu_ref_kill_rcu+0xa6/0x1c0
|Preemption disabled at:[] rcu_nocb_kthread+0x263/0x380
|Call Trace:
| [] dump_stack+0x4e/0x9c
| [] __might_sleep+0xfb/0x170
| [] rt_spin_lock+0x24/0x70
| [] free_ioctx_users+0x30/0x130
| [] percpu_ref_kill_rcu+0x1b4/0x1c0
| [] rcu_nocb_kthread+0x263/0x380
| [] kthread+0xd6/0xf0
| [] ret_from_fork+0x7c/0xb0

Replace this with a preempt_disable()-friendly swork.

Reported-By: Mike Galbraith 
Suggested-by: Benjamin LaHaise 
Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Steven Rostedt (VMware) 
---
 fs/aio.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 911e23087dfb..0c613d805bf1 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -121,6 +121,7 @@ struct kioctx {
longnr_pages;
 
struct rcu_work free_rwork; /* see free_ioctx() */
+   struct kthread_work free_kwork; /* see free_ioctx() */
 
/*
 * signals when all in-flight requests are done
@@ -606,9 +607,9 @@ static void free_ioctx_reqs(struct percpu_ref *ref)
  * and ctx->users has dropped to 0, so we know no more kiocbs can be submitted 
-
  * now it's safe to cancel any that need to be.
  */
-static void free_ioctx_users(struct percpu_ref *ref)
+static void free_ioctx_users_work(struct kthread_work *work)
 {
-   struct kioctx *ctx = container_of(ref, struct kioctx, users);
+   struct kioctx *ctx = container_of(work, struct kioctx, free_kwork);
struct aio_kiocb *req;
 
spin_lock_irq(&ctx->ctx_lock);
@@ -626,6 +627,14 @@ static void free_ioctx_users(struct percpu_ref *ref)
percpu_ref_put(&ctx->reqs);
 }
 
+static void free_ioctx_users(struct percpu_ref *ref)
+{
+   struct kioctx *ctx = container_of(ref, struct kioctx, users);
+
+   kthread_init_work(&ctx->free_kwork, free_ioctx_users_work);
+   kthread_schedule_work(&ctx->free_kwork);
+}
+
 static int ioctx_add_table(struct kioctx *ctx, struct mm_struct *mm)
 {
unsigned i, new_nr;
-- 
2.20.1

[RFC][PATCH RT 4/7] thermal: Defer thermal wakups to threads

2019-09-20 Thread Steven Rostedt

From: Daniel Wagner 

[ Upstream commit ad2408dc248fe58536eef5b2b5734d8f9d3a280b ]

On RT the spin lock in pkg_temp_thermal_platform_thermal_notify will
call schedule while we run in irq context.

[] dump_stack+0x4e/0x8f
[] __schedule_bug+0xa6/0xb4
[] __schedule+0x5b4/0x700
[] schedule+0x2a/0x90
[] rt_spin_lock_slowlock+0xe5/0x2d0
[] rt_spin_lock+0x25/0x30
[] pkg_temp_thermal_platform_thermal_notify+0x45/0x134 
[x86_pkg_temp_thermal]
[] ? therm_throt_process+0x1b/0x160
[] intel_thermal_interrupt+0x211/0x250
[] smp_thermal_interrupt+0x21/0x40
[] thermal_interrupt+0x6d/0x80

Let's defer the work to a kthread.

Signed-off-by: Daniel Wagner 
Signed-off-by: Steven Rostedt (VMware) 
[bigeasy: reorder init/deinit position. TODO: flush swork on exit]
Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/thermal/x86_pkg_temp_thermal.c | 28 +-
 1 file changed, 27 insertions(+), 1 deletion(-)

diff --git a/drivers/thermal/x86_pkg_temp_thermal.c 
b/drivers/thermal/x86_pkg_temp_thermal.c
index 1ef937d799e4..82f21fd4afb0 100644
--- a/drivers/thermal/x86_pkg_temp_thermal.c
+++ b/drivers/thermal/x86_pkg_temp_thermal.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -329,7 +330,7 @@ static void pkg_thermal_schedule_work(int cpu, struct 
delayed_work *work)
schedule_delayed_work_on(cpu, work, ms);
 }
 
-static int pkg_thermal_notify(u64 msr_val)
+static void pkg_thermal_notify_work(struct kthread_work *work)
 {
int cpu = smp_processor_id();
struct pkg_device *pkgdev;
@@ -348,8 +349,32 @@ static int pkg_thermal_notify(u64 msr_val)
}
 
spin_unlock_irqrestore(&pkg_temp_lock, flags);
+}
+
+#ifdef CONFIG_PREEMPT_RT_FULL
+static DEFINE_KTHREAD_WORK(notify_work, pkg_thermal_notify_work);
+
+static int pkg_thermal_notify(u64 msr_val)
+{
+   kthread_schedule_work(&notify_work);
+   return 0;
+}
+
+static void pkg_thermal_notify_flush(void)
+{
+   kthread_flush_work(_work);
+}
+
+#else  /* !CONFIG_PREEMPT_RT_FULL */
+
+static void pkg_thermal_notify_flush(void) { }
+
+static int pkg_thermal_notify(u64 msr_val)
+{
+   pkg_thermal_notify_work(NULL);
return 0;
 }
+#endif /* CONFIG_PREEMPT_RT_FULL */
 
 static int pkg_temp_thermal_device_add(unsigned int cpu)
 {
@@ -548,6 +573,7 @@ static void __exit pkg_temp_thermal_exit(void)
platform_thermal_package_rate_control = NULL;
 
cpuhp_remove_state(pkg_thermal_hp_state);
+   pkg_thermal_notify_flush();
debugfs_remove_recursive(debugfs);
kfree(packages);
 }
-- 
2.20.1

[PATCH v4 06/15] arm64: dts: msm8916: thermal: Fixup HW ids for cpu sensors

2019-09-20 Thread Amit Kucheria

msm8916 uses sensors 0, 1, 2, 4 and 5. Sensor 3 is NOT used. Fixup the
device tree so that the correct sensor ID is used and as a result we can
actually check the temperature for the cpu2_3 sensor.

Signed-off-by: Amit Kucheria 
Reviewed-by: Daniel Lezcano 
Reviewed-by: Stephen Boyd 
---
 arch/arm64/boot/dts/qcom/msm8916.dtsi | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/boot/dts/qcom/msm8916.dtsi b/arch/arm64/boot/dts/qcom/msm8916.dtsi
index 5ea9fb8f2f87..8686e101905c 100644
--- a/arch/arm64/boot/dts/qcom/msm8916.dtsi
+++ b/arch/arm64/boot/dts/qcom/msm8916.dtsi
@@ -179,7 +179,7 @@
polling-delay-passive = <250>;
polling-delay = <1000>;
 
-   thermal-sensors = < 4>;
+   thermal-sensors = < 5>;
 
trips {
cpu0_1_alert0: trip-point@0 {
@@ -209,7 +209,7 @@
polling-delay-passive = <250>;
polling-delay = <1000>;
 
-   thermal-sensors = < 3>;
+   thermal-sensors = < 4>;
 
trips {
cpu2_3_alert0: trip-point@0 {
-- 
2.17.1

[PATCH v4 07/15] dt-bindings: thermal: tsens: Convert over to a yaml schema

2019-09-20 Thread Amit Kucheria

Older IP only supports the 'uplow' interrupt, but newer IP supports
'uplow' and 'critical' interrupts. Document interrupt support in the
tsens driver by converting over to a YAML schema.

Suggested-by: Stephen Boyd 
Signed-off-by: Amit Kucheria 
---
 .../bindings/thermal/qcom-tsens.txt   |  55 --
 .../bindings/thermal/qcom-tsens.yaml  | 168 ++
 MAINTAINERS   |   1 +
 3 files changed, 169 insertions(+), 55 deletions(-)
 delete mode 100644 Documentation/devicetree/bindings/thermal/qcom-tsens.txt
 create mode 100644 Documentation/devicetree/bindings/thermal/qcom-tsens.yaml

diff --git a/Documentation/devicetree/bindings/thermal/qcom-tsens.txt b/Documentation/devicetree/bindings/thermal/qcom-tsens.txt
deleted file mode 100644
index 673cc1831ee9..
--- a/Documentation/devicetree/bindings/thermal/qcom-tsens.txt
+++ /dev/null
@@ -1,55 +0,0 @@
-* QCOM SoC Temperature Sensor (TSENS)
-
-Required properties:
-- compatible:
-  Must be one of the following:
-- "qcom,msm8916-tsens" (MSM8916)
-- "qcom,msm8974-tsens" (MSM8974)
-- "qcom,msm8996-tsens" (MSM8996)
-- "qcom,qcs404-tsens", "qcom,tsens-v1" (QCS404)
-- "qcom,msm8998-tsens", "qcom,tsens-v2" (MSM8998)
-- "qcom,sdm845-tsens", "qcom,tsens-v2" (SDM845)
-  The generic "qcom,tsens-v2" property must be used as a fallback for any SoC
-  with version 2 of the TSENS IP. MSM8996 is the only exception because the
-  generic property did not exist when support was added.
-  Similarly, the generic "qcom,tsens-v1" property must be used as a fallback for
-  any SoC with version 1 of the TSENS IP.
-
-- reg: Address range of the thermal registers.
-  New platforms containing v2.x.y of the TSENS IP must specify the SROT and TM
-  register spaces separately, with order being TM before SROT.
-  See Example 2, below.
-
-- #thermal-sensor-cells : Should be 1. See ./thermal.txt for a description.
-- #qcom,sensors: Number of sensors in tsens block
-- Refer to Documentation/devicetree/bindings/nvmem/nvmem.txt to know how to specify
-nvmem cells
-
-Example 1 (legacy support before a fallback tsens-v2 property was introduced):
-tsens: thermal-sensor@90 {
-   compatible = "qcom,msm8916-tsens";
-   reg = <0x4a8000 0x2000>;
-   nvmem-cells = <_caldata>, <_calsel>;
-   nvmem-cell-names = "caldata", "calsel";
-   #thermal-sensor-cells = <1>;
-   };
-
-Example 2 (for any platform containing v2 of the TSENS IP):
-tsens0: thermal-sensor@c263000 {
-   compatible = "qcom,sdm845-tsens", "qcom,tsens-v2";
-   reg = <0xc263000 0x1ff>, /* TM */
-   <0xc222000 0x1ff>; /* SROT */
-   #qcom,sensors = <13>;
-   #thermal-sensor-cells = <1>;
-   };
-
-Example 3 (for any platform containing v1 of the TSENS IP):
-tsens: thermal-sensor@4a9000 {
-   compatible = "qcom,qcs404-tsens", "qcom,tsens-v1";
-   reg = <0x004a9000 0x1000>, /* TM */
- <0x004a8000 0x1000>; /* SROT */
-   nvmem-cells = <_caldata>;
-   nvmem-cell-names = "calib";
-   #qcom,sensors = <10>;
-   #thermal-sensor-cells = <1>;
-   };
diff --git a/Documentation/devicetree/bindings/thermal/qcom-tsens.yaml b/Documentation/devicetree/bindings/thermal/qcom-tsens.yaml
new file mode 100644
index ..23afc7bf5a44
--- /dev/null
+++ b/Documentation/devicetree/bindings/thermal/qcom-tsens.yaml
@@ -0,0 +1,168 @@
+# SPDX-License-Identifier: (GPL-2.0 OR MIT)
+# Copyright 2019 Linaro Ltd.
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/thermal/qcom-tsens.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: QCOM SoC Temperature Sensor (TSENS)
+
+maintainers:
+  - Amit Kucheria 
+
+description: |
+  QCOM SoCs have TSENS IP to allow temperature measurement. There are currently
+  three distinct major versions of the IP that is supported by a single driver.
+  The IP versions are named v0.1, v1 and v2 in the driver, where v0.1 captures
+  everything before v1 when there was no versioning information.
+
+properties:
+  compatible:
+oneOf:
+  - description: v0.1 of TSENS
+items:
+  - enum:
+  - qcom,msm8916-tsens
+  - qcom,msm8974-tsens
+  - const: qcom,tsens-v0_1
+
+  - description: v1 of TSENS
+items:
+  - enum:
+  - qcom,qcs404-tsens
+  - const: qcom,tsens-v1
+
+  - description: v2 of TSENS
+items:
+  - enum:
+  - qcom,msm8996-tsens
+  - qcom,msm8998-tsens
+  - qcom,sdm845-tsens
+  - const: qcom,tsens-v2
+
+  reg:
+maxItems: 2
+items:
+  - description: TM registers
+  - description: SROT registers
+
+  nvmem-cells:
+minItems: 1
+maxItems: 2
+description:
+  Reference to an nvmem node for the calibration data
+
+

[PATCH v4 04/15] drivers: thermal: tsens: Add debugfs support

2019-09-20 Thread Amit Kucheria

Dump some basic version info and sensor details into debugfs. Example
from qcs404 below:

--(/sys/kernel/debug) $ ls tsens/
4a9000.thermal-sensor  version
--(/sys/kernel/debug) $ cat tsens/version
1.4.0
--(/sys/kernel/debug) $ cat tsens/4a9000.thermal-sensor/sensors
max: 11
num: 10

      id    slope   offset
--------------------------
       0     3200   404000
       1     3200   404000
       2     3200   404000
       3     3200   404000
       4     3200   404000
       5     3200   404000
       6     3200   404000
       7     3200   404000
       8     3200   404000
       9     3200   404000
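
The column layout of the table above comes from a fixed-width format string.
As a rough userspace illustration (struct sensor_row and format_sensor_row()
are hypothetical names; the driver itself uses seq_printf() with the same
"%8d %8d %8d" format), one row can be produced like this:

```c
#include <stdio.h>

/* Hypothetical stand-in for the per-sensor calibration data. */
struct sensor_row {
	int hw_id;
	int slope;
	int offset;
};

/*
 * Format one table row with three right-aligned, 8-character columns
 * separated by single spaces. Returns the number of characters that
 * snprintf() would have written, excluding the terminating NUL.
 */
int format_sensor_row(char *buf, size_t len, const struct sensor_row *r)
{
	return snprintf(buf, len, "%8d %8d %8d\n",
			r->hw_id, r->slope, r->offset);
}
```

Each row is 26 characters plus a newline, so values wider than 8 digits
would push the columns out of alignment.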

Signed-off-by: Amit Kucheria 
Reviewed-by: Stephen Boyd 
---
 drivers/thermal/qcom/tsens-common.c | 83 +
 drivers/thermal/qcom/tsens.c|  2 +
 drivers/thermal/qcom/tsens.h|  6 +++
 3 files changed, 91 insertions(+)

diff --git a/drivers/thermal/qcom/tsens-common.c b/drivers/thermal/qcom/tsens-common.c
index 7437bfe196e5..ea2c46cc6a66 100644
--- a/drivers/thermal/qcom/tsens-common.c
+++ b/drivers/thermal/qcom/tsens-common.c
@@ -3,6 +3,7 @@
  * Copyright (c) 2015, The Linux Foundation. All rights reserved.
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -139,6 +140,77 @@ int get_temp_common(struct tsens_sensor *s, int *temp)
return 0;
 }
 
+#ifdef CONFIG_DEBUG_FS
+static int dbg_sensors_show(struct seq_file *s, void *data)
+{
+   struct platform_device *pdev = s->private;
+   struct tsens_priv *priv = platform_get_drvdata(pdev);
+   int i;
+
+   seq_printf(s, "max: %2d\nnum: %2d\n\n",
+  priv->feat->max_sensors, priv->num_sensors);
+
+   seq_puts(s, "  idslope   offset\n--\n");
+   for (i = 0;  i < priv->num_sensors; i++) {
+   seq_printf(s, "%8d %8d %8d\n", priv->sensor[i].hw_id,
+  priv->sensor[i].slope, priv->sensor[i].offset);
+   }
+
+   return 0;
+}
+
+static int dbg_version_show(struct seq_file *s, void *data)
+{
+   struct platform_device *pdev = s->private;
+   struct tsens_priv *priv = platform_get_drvdata(pdev);
+   u32 maj_ver, min_ver, step_ver;
+   int ret;
+
+   if (tsens_ver(priv) > VER_0_1) {
+   ret = regmap_field_read(priv->rf[VER_MAJOR], _ver);
+   if (ret)
+   return ret;
+   ret = regmap_field_read(priv->rf[VER_MINOR], _ver);
+   if (ret)
+   return ret;
+   ret = regmap_field_read(priv->rf[VER_STEP], _ver);
+   if (ret)
+   return ret;
+   seq_printf(s, "%d.%d.%d\n", maj_ver, min_ver, step_ver);
+   } else {
+   seq_puts(s, "0.1.0\n");
+   }
+
+   return 0;
+}
+
+DEFINE_SHOW_ATTRIBUTE(dbg_version);
+DEFINE_SHOW_ATTRIBUTE(dbg_sensors);
+
+static void tsens_debug_init(struct platform_device *pdev)
+{
+   struct tsens_priv *priv = platform_get_drvdata(pdev);
+   struct dentry *root, *file;
+
+   root = debugfs_lookup("tsens", NULL);
+   if (!root)
+   priv->debug_root = debugfs_create_dir("tsens", NULL);
+   else
+   priv->debug_root = root;
+
+   file = debugfs_lookup("version", priv->debug_root);
+   if (!file)
+   debugfs_create_file("version", 0444, priv->debug_root,
+   pdev, _version_fops);
+
+   /* A directory for each instance of the TSENS IP */
+   priv->debug = debugfs_create_dir(dev_name(>dev), 
priv->debug_root);
+   debugfs_create_file("sensors", 0444, priv->debug, pdev, 
_sensors_fops);
+}
+#else
+static inline void tsens_debug_init(struct platform_device *pdev) {}
+#endif
+
 static const struct regmap_config tsens_config = {
.name   = "tm",
.reg_bits   = 32,
@@ -199,6 +271,15 @@ int __init init_common(struct tsens_priv *priv)
goto err_put_device;
}
 
+   if (tsens_ver(priv) > VER_0_1) {
+   for (i = VER_MAJOR; i <= VER_STEP; i++) {
+   priv->rf[i] = devm_regmap_field_alloc(dev, priv->srot_map,
+ priv->fields[i]);
+   if (IS_ERR(priv->rf[i]))
+   return PTR_ERR(priv->rf[i]);
+   }
+   }
+
priv->rf[TSENS_EN] = devm_regmap_field_alloc(dev, priv->srot_map,
 priv->fields[TSENS_EN]);
if (IS_ERR(priv->rf[TSENS_EN])) {
@@ -238,6 +319,8 @@ int __init init_common(struct tsens_priv *priv)
}
}
 
+   tsens_debug_init(op);
+
return 0;
 
 err_put_device:
diff --git a/drivers/thermal/qcom/tsens.c b/drivers/thermal/qcom/tsens.c
index 06c6bbd69a1a..772aa76b50e1 100644
--- a/drivers/thermal/qcom/tsens.c
+++ b/drivers/thermal/qcom/tsens.c
@@ -3,6 +3,7 @@
  * Copyright (c) 2015, The Linux Foundation. All rights reserved.
  */

[PATCH v4 03/15] drivers: thermal: tsens: Add __func__ identifier to debug statements

2019-09-20 Thread Amit Kucheria

Printing the function name when enabling debugging makes logs easier to
read.
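
A small userspace sketch of the same idea, assuming a hypothetical
LOG_WITH_FUNC() macro that writes into a buffer instead of the kernel log
(the patch itself simply passes __func__ to dev_dbg() and dev_err()):

```c
#include <stdio.h>

/*
 * Hypothetical userspace stand-in for dev_dbg(): prefix the message with
 * the name of the calling function. At least one format argument is
 * required so that only standard __VA_ARGS__ is used.
 */
#define LOG_WITH_FUNC(buf, len, fmt, ...) \
	snprintf((buf), (len), "%s: " fmt, __func__, __VA_ARGS__)

/* Writes "report_offset: offset:<value>\n" into buf. */
int report_offset(char *buf, size_t len, int offset)
{
	return LOG_WITH_FUNC(buf, len, "offset:%d\n", offset);
}
```

Because __func__ expands inside the calling function, the prefix always
names the function that logged the message, not the helper macro.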

Signed-off-by: Amit Kucheria 
Reviewed-by: Stephen Boyd 
Reviewed-by: Daniel Lezcano 
---
 drivers/thermal/qcom/tsens-common.c | 8 
 drivers/thermal/qcom/tsens.c| 6 +++---
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/thermal/qcom/tsens-common.c b/drivers/thermal/qcom/tsens-common.c
index c037bdf92c66..7437bfe196e5 100644
--- a/drivers/thermal/qcom/tsens-common.c
+++ b/drivers/thermal/qcom/tsens-common.c
@@ -42,8 +42,8 @@ void compute_intercept_slope(struct tsens_priv *priv, u32 *p1,
 
for (i = 0; i < priv->num_sensors; i++) {
dev_dbg(priv->dev,
-   "sensor%d - data_point1:%#x data_point2:%#x\n",
-   i, p1[i], p2[i]);
+   "%s: sensor%d - data_point1:%#x data_point2:%#x\n",
+   __func__, i, p1[i], p2[i]);
 
priv->sensor[i].slope = SLOPE_DEFAULT;
if (mode == TWO_PT_CALIB) {
@@ -60,7 +60,7 @@ void compute_intercept_slope(struct tsens_priv *priv, u32 *p1,
priv->sensor[i].offset = (p1[i] * SLOPE_FACTOR) -
(CAL_DEGC_PT1 *
priv->sensor[i].slope);
-   dev_dbg(priv->dev, "offset:%d\n", priv->sensor[i].offset);
+   dev_dbg(priv->dev, "%s: offset:%d\n", __func__, priv->sensor[i].offset);
}
 }
 
@@ -209,7 +209,7 @@ int __init init_common(struct tsens_priv *priv)
if (ret)
goto err_put_device;
if (!enabled) {
-   dev_err(dev, "tsens device is not enabled\n");
+   dev_err(dev, "%s: device not enabled\n", __func__);
ret = -ENODEV;
goto err_put_device;
}
diff --git a/drivers/thermal/qcom/tsens.c b/drivers/thermal/qcom/tsens.c
index 542a7f8c3d96..06c6bbd69a1a 100644
--- a/drivers/thermal/qcom/tsens.c
+++ b/drivers/thermal/qcom/tsens.c
@@ -127,7 +127,7 @@ static int tsens_probe(struct platform_device *pdev)
of_property_read_u32(np, "#qcom,sensors", _sensors);
 
if (num_sensors <= 0) {
-   dev_err(dev, "invalid number of sensors\n");
+   dev_err(dev, "%s: invalid number of sensors\n", __func__);
return -EINVAL;
}
 
@@ -156,7 +156,7 @@ static int tsens_probe(struct platform_device *pdev)
 
ret = priv->ops->init(priv);
if (ret < 0) {
-   dev_err(dev, "tsens init failed\n");
+   dev_err(dev, "%s: init failed\n", __func__);
return ret;
}
 
@@ -164,7 +164,7 @@ static int tsens_probe(struct platform_device *pdev)
ret = priv->ops->calibrate(priv);
if (ret < 0) {
if (ret != -EPROBE_DEFER)
-   dev_err(dev, "tsens calibration failed\n");
+   dev_err(dev, "%s: calibration failed\n", __func__);
return ret;
}
}
-- 
2.17.1

[PATCH v4 01/15] drivers: thermal: tsens: Get rid of id field in tsens_sensor

2019-09-20 Thread Amit Kucheria

There are two fields - id and hw_id - to track which sensor an action is
to be performed on. This is because the sensors connected to a TSENS IP
might not be contiguous, e.g. 1, 2, 4, 5 with 3 being skipped.

This causes confusion in the code which uses hw_id sometimes and id
other times (tsens_get_temp, tsens_get_trend).

Switch to only using the hw_id field to track the physical ID of the
sensor. When we iterate through all the sensors connected to an IP
block, we use an index i to loop through the list of sensors, and then
return the actual hw_id that is registered on that index.
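
To illustrate the index versus hw_id distinction, here is a minimal sketch
with trimmed-down, hypothetical structures (struct sensor, init_sensors() and
index_of_hw_id() are not part of the driver). The loop index only walks the
array, while hw_id records which physical sensor sits in that slot:

```c
#include <stddef.h>

/* Hypothetical, trimmed-down version of a per-sensor structure. */
struct sensor {
	int hw_id;   /* physical sensor ID on the TSENS block */
};

/* Fill the array: slot i holds the i-th connected sensor's hardware ID. */
void init_sensors(struct sensor *s, const int *hw_ids, size_t n)
{
	for (size_t i = 0; i < n; i++)
		s[i].hw_id = hw_ids[i];
}

/* Find the array slot for a hardware ID; returns -1 if it is not wired up. */
int index_of_hw_id(const struct sensor *s, size_t n, int hw_id)
{
	for (size_t i = 0; i < n; i++)
		if (s[i].hw_id == hw_id)
			return (int)i;
	return -1;
}
```

With hardware IDs 0, 1, 2, 4 and 5 connected, slot 3 holds hw_id 4, and a
lookup for hw_id 3 fails because that sensor is skipped.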

Signed-off-by: Amit Kucheria 
Reviewed-by: Stephen Boyd 
Reviewed-by: Daniel Lezcano 
---
 drivers/thermal/qcom/tsens-8960.c   |  4 ++--
 drivers/thermal/qcom/tsens-common.c | 16 +---
 drivers/thermal/qcom/tsens.c| 11 +--
 drivers/thermal/qcom/tsens.h| 10 --
 4 files changed, 20 insertions(+), 21 deletions(-)

diff --git a/drivers/thermal/qcom/tsens-8960.c b/drivers/thermal/qcom/tsens-8960.c
index 8d9b721dadb6..3e1436fda1eb 100644
--- a/drivers/thermal/qcom/tsens-8960.c
+++ b/drivers/thermal/qcom/tsens-8960.c
@@ -243,11 +243,11 @@ static inline int code_to_mdegC(u32 adc_code, const struct tsens_sensor *s)
return adc_code * slope + offset;
 }
 
-static int get_temp_8960(struct tsens_priv *priv, int id, int *temp)
+static int get_temp_8960(struct tsens_sensor *s, int *temp)
 {
int ret;
u32 code, trdy;
-   const struct tsens_sensor *s = >sensor[id];
+   struct tsens_priv *priv = s->priv;
unsigned long timeout;
 
timeout = jiffies + usecs_to_jiffies(TIMEOUT_US);
diff --git a/drivers/thermal/qcom/tsens-common.c b/drivers/thermal/qcom/tsens-common.c
index 528df8801254..c037bdf92c66 100644
--- a/drivers/thermal/qcom/tsens-common.c
+++ b/drivers/thermal/qcom/tsens-common.c
@@ -83,11 +83,12 @@ static inline int code_to_degc(u32 adc_code, const struct tsens_sensor *s)
return degc;
 }
 
-int get_temp_tsens_valid(struct tsens_priv *priv, int i, int *temp)
+int get_temp_tsens_valid(struct tsens_sensor *s, int *temp)
 {
-   struct tsens_sensor *s = >sensor[i];
-   u32 temp_idx = LAST_TEMP_0 + s->hw_id;
-   u32 valid_idx = VALID_0 + s->hw_id;
+   struct tsens_priv *priv = s->priv;
+   int hw_id = s->hw_id;
+   u32 temp_idx = LAST_TEMP_0 + hw_id;
+   u32 valid_idx = VALID_0 + hw_id;
u32 last_temp = 0, valid, mask;
int ret;
 
@@ -123,12 +124,13 @@ int get_temp_tsens_valid(struct tsens_priv *priv, int i, int *temp)
return 0;
 }
 
-int get_temp_common(struct tsens_priv *priv, int i, int *temp)
+int get_temp_common(struct tsens_sensor *s, int *temp)
 {
-   struct tsens_sensor *s = >sensor[i];
+   struct tsens_priv *priv = s->priv;
+   int hw_id = s->hw_id;
int last_temp = 0, ret;
 
-   ret = regmap_field_read(priv->rf[LAST_TEMP_0 + s->hw_id], _temp);
+   ret = regmap_field_read(priv->rf[LAST_TEMP_0 + hw_id], _temp);
if (ret)
return ret;
 
diff --git a/drivers/thermal/qcom/tsens.c b/drivers/thermal/qcom/tsens.c
index 0627d8615c30..6ed687a6e53c 100644
--- a/drivers/thermal/qcom/tsens.c
+++ b/drivers/thermal/qcom/tsens.c
@@ -14,19 +14,19 @@
 
 static int tsens_get_temp(void *data, int *temp)
 {
-   const struct tsens_sensor *s = data;
+   struct tsens_sensor *s = data;
struct tsens_priv *priv = s->priv;
 
-   return priv->ops->get_temp(priv, s->id, temp);
+   return priv->ops->get_temp(s, temp);
 }
 
 static int tsens_get_trend(void *data, int trip, enum thermal_trend *trend)
 {
-   const struct tsens_sensor *s = data;
+   struct tsens_sensor *s = data;
struct tsens_priv *priv = s->priv;
 
if (priv->ops->get_trend)
-   return priv->ops->get_trend(priv, s->id, trend);
+   return priv->ops->get_trend(s, trend);
 
return -ENOTSUPP;
 }
@@ -86,8 +86,7 @@ static int tsens_register(struct tsens_priv *priv)
 
for (i = 0;  i < priv->num_sensors; i++) {
priv->sensor[i].priv = priv;
-   priv->sensor[i].id = i;
-   tzd = devm_thermal_zone_of_sensor_register(priv->dev, i,
+   tzd = devm_thermal_zone_of_sensor_register(priv->dev, priv->sensor[i].hw_id,
   >sensor[i],
   _of_ops);
if (IS_ERR(tzd))
diff --git a/drivers/thermal/qcom/tsens.h b/drivers/thermal/qcom/tsens.h
index 2fd94997245b..d022e726d074 100644
--- a/drivers/thermal/qcom/tsens.h
+++ b/drivers/thermal/qcom/tsens.h
@@ -31,7 +31,6 @@ enum tsens_ver {
  * @priv: tsens device instance that this sensor is connected to
  * @tzd: pointer to the thermal zone that this sensor is in
  * @offset: offset of temperature adjustment curve
- * @id: Sensor ID
  * @hw_id: HW ID can be used in case of platform-specific IDs
  * @slope: slope of

[PATCH v4 02/15] drivers: thermal: tsens: Simplify code flow in tsens_probe

2019-09-20 Thread Amit Kucheria

Move platform_set_drvdata up to avoid an extra 'if (ret)' check after
the call to tsens_register.

Signed-off-by: Amit Kucheria 
Reviewed-by: Stephen Boyd 
Reviewed-by: Daniel Lezcano 
---
 drivers/thermal/qcom/tsens.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/thermal/qcom/tsens.c b/drivers/thermal/qcom/tsens.c
index 6ed687a6e53c..542a7f8c3d96 100644
--- a/drivers/thermal/qcom/tsens.c
+++ b/drivers/thermal/qcom/tsens.c
@@ -149,6 +149,8 @@ static int tsens_probe(struct platform_device *pdev)
priv->feat = data->feat;
priv->fields = data->fields;
 
+   platform_set_drvdata(pdev, priv);
+
if (!priv->ops || !priv->ops->init || !priv->ops->get_temp)
return -EINVAL;
 
@@ -167,11 +169,7 @@ static int tsens_probe(struct platform_device *pdev)
}
}
 
-   ret = tsens_register(priv);
-
-   platform_set_drvdata(pdev, priv);
-
-   return ret;
+   return tsens_register(priv);
 }
 
 static int tsens_remove(struct platform_device *pdev)
-- 
2.17.1

[PATCH v4 00/15] thermal: qcom: tsens: Add interrupt support

2019-09-20 Thread Amit Kucheria

Changes since v3:
- Fix up the YAML definitions based on Rob's review

Changes since v2:
- Addressed Stephen's review comment
- Moved the dt-bindings to yaml (This throws up some new warnings in various
  QCOM devicetrees. I'll send out a separate series to fix them up)
- Collected reviews and acks
- Added the dt-bindings to MAINTAINERS

Changes since v1:
- Collected reviews and acks
- Addressed Stephen's review comments (hopefully I got them all).
- Completely removed critical interrupt infrastructure from this series.
  Will post that separately.
- Fixed a bug in sign-extension of temperature.
- Fixed DT bindings to use the name of the interrupt e.g. "uplow" and use
  platform_get_irq_byname().

Add interrupt support to TSENS. The first 6 patches are general fixes and
cleanups to the driver before interrupt support is introduced.

This series has been developed against qcs404 and sdm845 and then tested on
msm8916 and msm8974 (Thanks Brian). Testing on msm8998 would be appreciated
since I don't have hardware handy.

Amit Kucheria (15):
  drivers: thermal: tsens: Get rid of id field in tsens_sensor
  drivers: thermal: tsens: Simplify code flow in tsens_probe
  drivers: thermal: tsens: Add __func__ identifier to debug statements
  drivers: thermal: tsens: Add debugfs support
  arm: dts: msm8974: thermal: Add thermal zones for each sensor
  arm64: dts: msm8916: thermal: Fixup HW ids for cpu sensors
  dt-bindings: thermal: tsens: Convert over to a yaml schema
  arm64: dts: sdm845: thermal: Add interrupt support
  arm64: dts: msm8996: thermal: Add interrupt support
  arm64: dts: msm8998: thermal: Add interrupt support
  arm64: dts: qcs404: thermal: Add interrupt support
  arm: dts: msm8974: thermal: Add interrupt support
  arm64: dts: msm8916: thermal: Add interrupt support
  drivers: thermal: tsens: Create function to return sign-extended
temperature
  drivers: thermal: tsens: Add interrupt support

 .../bindings/thermal/qcom-tsens.txt   |  55 --
 .../bindings/thermal/qcom-tsens.yaml  | 168 ++
 MAINTAINERS   |   1 +
 arch/arm/boot/dts/qcom-msm8974.dtsi   | 108 +++-
 arch/arm64/boot/dts/qcom/msm8916.dtsi |  26 +-
 arch/arm64/boot/dts/qcom/msm8996.dtsi |  60 +-
 arch/arm64/boot/dts/qcom/msm8998.dtsi |  82 +--
 arch/arm64/boot/dts/qcom/qcs404.dtsi  |  42 +-
 arch/arm64/boot/dts/qcom/sdm845.dtsi  |  88 +--
 drivers/thermal/qcom/tsens-8960.c |   4 +-
 drivers/thermal/qcom/tsens-common.c   | 529 --
 drivers/thermal/qcom/tsens-v0_1.c |  11 +
 drivers/thermal/qcom/tsens-v1.c   |  29 +
 drivers/thermal/qcom/tsens-v2.c   |  13 +
 drivers/thermal/qcom/tsens.c  |  58 +-
 drivers/thermal/qcom/tsens.h  | 286 --
 16 files changed, 1248 insertions(+), 312 deletions(-)
 delete mode 100644 Documentation/devicetree/bindings/thermal/qcom-tsens.txt
 create mode 100644 Documentation/devicetree/bindings/thermal/qcom-tsens.yaml

-- 
2.17.1

[PATCH v4 14/15] drivers: thermal: tsens: Create function to return sign-extended temperature

2019-09-20 Thread Amit Kucheria

Hide the details of how to convert values read from TSENS HW to mCelsius
behind a function. All versions of the IP can be supported as a result.
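
For the deciCelsius case, the conversion boils down to sign-extending a
narrow register field and scaling by 100. The sketch below is a userspace
approximation, not the driver code: sign_extend_field() and decic_to_millic()
are hypothetical names, and the 12-bit field width used in the usage note is
an assumption for illustration. The raw value is assumed to be already
shifted down to bit 0, so msb - lsb is the position of the sign bit.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sign-extend the low (index + 1) bits of value, where index is the
 * position of the sign bit. Written without relying on arithmetic right
 * shifts of negative numbers.
 */
int32_t sign_extend_field(uint32_t value, unsigned int index)
{
	assert(index < 32);

	uint32_t sign = UINT32_C(1) << index;
	uint32_t field = value & ((sign << 1) - 1u);  /* drop bits above the field */

	/* Result lies in [-sign, sign - 1], which always fits in int32_t. */
	return (int32_t)((int64_t)(field ^ sign) - (int64_t)sign);
}

/* Convert a raw deciCelsius field to milliCelsius, honouring the sign bit. */
int decic_to_millic(uint32_t raw, unsigned int msb, unsigned int lsb)
{
	return sign_extend_field(raw, msb - lsb) * 100;
}
```

With an assumed 12-bit field (msb 11, lsb 0), a raw value of 300 is 30.0
degrees, i.e. 30000 mC, and a raw value of 0xFFF is -0.1 degrees, i.e.
-100 mC.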

Signed-off-by: Amit Kucheria 
---
 drivers/thermal/qcom/tsens-common.c | 50 +
 1 file changed, 36 insertions(+), 14 deletions(-)

diff --git a/drivers/thermal/qcom/tsens-common.c b/drivers/thermal/qcom/tsens-common.c
index ea2c46cc6a66..6b6b3841c2d0 100644
--- a/drivers/thermal/qcom/tsens-common.c
+++ b/drivers/thermal/qcom/tsens-common.c
@@ -84,13 +84,46 @@ static inline int code_to_degc(u32 adc_code, const struct tsens_sensor *s)
return degc;
 }
 
+/**
+ * tsens_hw_to_mC - Return sign-extended temperature in mCelsius.
+ * @s: Pointer to sensor struct
+ * @field: Index into regmap_field array pointing to temperature data
+ *
+ * This function handles temperature returned in ADC code or deciCelsius
+ * depending on IP version.
+ *
+ * Return: Temperature in milliCelsius on success, a negative errno will
+ * be returned in error cases
+ */
+static int tsens_hw_to_mC(struct tsens_sensor *s, int field)
+{
+   struct tsens_priv *priv = s->priv;
+   u32 resolution;
+   u32 temp = 0;
+   int ret;
+
+   resolution = priv->fields[LAST_TEMP_0].msb -
+   priv->fields[LAST_TEMP_0].lsb;
+
+   ret = regmap_field_read(priv->rf[field], );
+   if (ret)
+   return ret;
+
+   /* Convert temperature from ADC code to milliCelsius */
+   if (priv->feat->adc)
+   return code_to_degc(temp, s) * 1000;
+
+   /* deciCelsius -> milliCelsius along with sign extension */
+   return sign_extend32(temp, resolution) * 100;
+}
+
 int get_temp_tsens_valid(struct tsens_sensor *s, int *temp)
 {
struct tsens_priv *priv = s->priv;
int hw_id = s->hw_id;
u32 temp_idx = LAST_TEMP_0 + hw_id;
u32 valid_idx = VALID_0 + hw_id;
-   u32 last_temp = 0, valid, mask;
+   u32 valid;
int ret;
 
ret = regmap_field_read(priv->rf[valid_idx], );
@@ -108,19 +141,7 @@ int get_temp_tsens_valid(struct tsens_sensor *s, int *temp)
}
 
/* Valid bit is set, OK to read the temperature */
-   ret = regmap_field_read(priv->rf[temp_idx], _temp);
-   if (ret)
-   return ret;
-
-   if (priv->feat->adc) {
-   /* Convert temperature from ADC code to milliCelsius */
-   *temp = code_to_degc(last_temp, s) * 1000;
-   } else {
-   mask = GENMASK(priv->fields[LAST_TEMP_0].msb,
-  priv->fields[LAST_TEMP_0].lsb);
-   /* Convert temperature from deciCelsius to milliCelsius */
-   *temp = sign_extend32(last_temp, fls(mask) - 1) * 100;
-   }
+   *temp = tsens_hw_to_mC(s, temp_idx);
 
return 0;
 }
@@ -310,6 +331,7 @@ int __init init_common(struct tsens_priv *priv)
goto err_put_device;
}
}
+
for (i = 0, j = VALID_0; i < priv->feat->max_sensors; i++, j++) {
priv->rf[j] = devm_regmap_field_alloc(dev, priv->tm_map,
  priv->fields[j]);
-- 
2.17.1

[PATCH v4 13/15] arm64: dts: msm8916: thermal: Add interrupt support

2019-09-20 Thread Amit Kucheria

Register upper-lower interrupt for the tsens controller.

Signed-off-by: Amit Kucheria 
---
 arch/arm64/boot/dts/qcom/msm8916.dtsi | 22 --
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/boot/dts/qcom/msm8916.dtsi b/arch/arm64/boot/dts/qcom/msm8916.dtsi
index 8686e101905c..c0d0492d90ec 100644
--- a/arch/arm64/boot/dts/qcom/msm8916.dtsi
+++ b/arch/arm64/boot/dts/qcom/msm8916.dtsi
@@ -176,8 +176,8 @@
 
thermal-zones {
cpu0_1-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 5>;
 
@@ -206,8 +206,8 @@
};
 
cpu2_3-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 4>;
 
@@ -236,8 +236,8 @@
};
 
gpu-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 2>;
 
@@ -256,8 +256,8 @@
};
 
camera-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 1>;
 
@@ -271,8 +271,8 @@
};
 
modem-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 0>;
 
@@ -816,6 +816,8 @@
nvmem-cells = <_caldata>, <_calsel>;
nvmem-cell-names = "calib", "calib_sel";
#qcom,sensors = <5>;
+   interrupts = ;
+   interrupt-names = "uplow";
#thermal-sensor-cells = <1>;
};
 
-- 
2.17.1

[PATCH v4 11/15] arm64: dts: qcs404: thermal: Add interrupt support

2019-09-20 Thread Amit Kucheria

Register upper-lower interrupt for the tsens controller.

Signed-off-by: Amit Kucheria 
---
 arch/arm64/boot/dts/qcom/qcs404.dtsi | 42 +++-
 1 file changed, 22 insertions(+), 20 deletions(-)

diff --git a/arch/arm64/boot/dts/qcom/qcs404.dtsi b/arch/arm64/boot/dts/qcom/qcs404.dtsi
index 3d0789775009..065a60d50a07 100644
--- a/arch/arm64/boot/dts/qcom/qcs404.dtsi
+++ b/arch/arm64/boot/dts/qcom/qcs404.dtsi
@@ -280,6 +280,8 @@
nvmem-cells = <_caldata>;
nvmem-cell-names = "calib";
#qcom,sensors = <10>;
+   interrupts = ;
+   interrupt-names = "uplow";
#thermal-sensor-cells = <1>;
};
 
@@ -1071,8 +1073,8 @@
 
thermal-zones {
aoss-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 0>;
 
@@ -1086,8 +1088,8 @@
};
 
q6-hvx-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 1>;
 
@@ -1101,8 +1103,8 @@
};
 
lpass-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 2>;
 
@@ -1116,8 +1118,8 @@
};
 
wlan-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 3>;
 
@@ -1131,8 +1133,8 @@
};
 
cluster-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 4>;
 
@@ -1165,8 +1167,8 @@
};
 
cpu0-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 5>;
 
@@ -1199,8 +1201,8 @@
};
 
cpu1-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 6>;
 
@@ -1233,8 +1235,8 @@
};
 
cpu2-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 7>;
 
@@ -1267,8 +1269,8 @@
};
 
cpu3-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 8>;
 
@@ -1301,8 +1303,8 @@
};
 
gpu-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 9>;
 
-- 
2.17.1

[PATCH v4 08/15] arm64: dts: sdm845: thermal: Add interrupt support

2019-09-20 Thread Amit Kucheria

Register upper-lower interrupts for each of the two tsens controllers.

Signed-off-by: Amit Kucheria 
Reviewed-by: Stephen Boyd 
---
 arch/arm64/boot/dts/qcom/sdm845.dtsi | 88 +++-
 1 file changed, 46 insertions(+), 42 deletions(-)

diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
index 4babff5f19b5..fdd74c39b744 100644
--- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
+++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
@@ -2386,6 +2386,8 @@
reg = <0 0x0c263000 0 0x1ff>, /* TM */
  <0 0x0c222000 0 0x1ff>; /* SROT */
#qcom,sensors = <13>;
+   interrupts = ;
+   interrupt-names = "uplow";
#thermal-sensor-cells = <1>;
};
 
@@ -2394,6 +2396,8 @@
reg = <0 0x0c265000 0 0x1ff>, /* TM */
  <0 0x0c223000 0 0x1ff>; /* SROT */
#qcom,sensors = <8>;
+   interrupts = ;
+   interrupt-names = "uplow";
#thermal-sensor-cells = <1>;
};
 
@@ -2712,8 +2716,8 @@
 
thermal-zones {
cpu0-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 1>;
 
@@ -2756,8 +2760,8 @@
};
 
cpu1-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 2>;
 
@@ -2800,8 +2804,8 @@
};
 
cpu2-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 3>;
 
@@ -2844,8 +2848,8 @@
};
 
cpu3-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 4>;
 
@@ -2888,8 +2892,8 @@
};
 
cpu4-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 7>;
 
@@ -2932,8 +2936,8 @@
};
 
cpu5-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 8>;
 
@@ -2976,8 +2980,8 @@
};
 
cpu6-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 9>;
 
@@ -3020,8 +3024,8 @@
};
 
cpu7-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 10>;
 
@@ -3064,8 +3068,8 @@
};
 
aoss0-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 0>;
 
@@ -3079,8 +3083,8 @@
};
 
cluster0-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 5>;
 
@@ -3099,8 +3103,8 @@
};
 
cluster1-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 6>;
 
@@ -3119,8 +3123,8 @@
};
 
gpu-thermal-top {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+

[PATCH v4 05/15] arm: dts: msm8974: thermal: Add thermal zones for each sensor

2019-09-20 Thread Amit Kucheria

msm8974 has 11 sensors connected to a single TSENS IP. Define a thermal
zone for each of those sensors to expose the temperature of each zone.

Signed-off-by: Amit Kucheria 
Tested-by: Brian Masney 
Reviewed-by: Stephen Boyd 
---
 arch/arm/boot/dts/qcom-msm8974.dtsi | 90 +
 1 file changed, 90 insertions(+)

diff --git a/arch/arm/boot/dts/qcom-msm8974.dtsi b/arch/arm/boot/dts/qcom-msm8974.dtsi
index 369e58f64145..d32f639505f1 100644
--- a/arch/arm/boot/dts/qcom-msm8974.dtsi
+++ b/arch/arm/boot/dts/qcom-msm8974.dtsi
@@ -217,6 +217,96 @@
};
};
};
+
+   q6-dsp-thermal {
+   polling-delay-passive = <250>;
+   polling-delay = <1000>;
+
+   thermal-sensors = < 1>;
+
+   trips {
+   q6_dsp_alert0: trip-point0 {
+   temperature = <9>;
+   hysteresis = <2000>;
+   type = "hot";
+   };
+   };
+   };
+
+   modemtx-thermal {
+   polling-delay-passive = <250>;
+   polling-delay = <1000>;
+
+   thermal-sensors = < 2>;
+
+   trips {
+   modemtx_alert0: trip-point0 {
+   temperature = <9>;
+   hysteresis = <2000>;
+   type = "hot";
+   };
+   };
+   };
+
+   video-thermal {
+   polling-delay-passive = <0>;
+   polling-delay = <1000>;
+
+   thermal-sensors = < 3>;
+
+   trips {
+   video_alert0: trip-point0 {
+   temperature = <95000>;
+   hysteresis = <2000>;
+   type = "hot";
+   };
+   };
+   };
+
+   wlan-thermal {
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
+
+   thermal-sensors = < 4>;
+
+   trips {
+   wlan_alert0: trip-point0 {
+   temperature = <105000>;
+   hysteresis = <2000>;
+   type = "hot";
+   };
+   };
+   };
+
+   gpu-thermal-top {
+   polling-delay-passive = <250>;
+   polling-delay = <1000>;
+
+   thermal-sensors = < 9>;
+
+   trips {
+   gpu1_alert0: trip-point0 {
+   temperature = <9>;
+   hysteresis = <2000>;
+   type = "hot";
+   };
+   };
+   };
+
+   gpu-thermal-bottom {
+   polling-delay-passive = <250>;
+   polling-delay = <1000>;
+
+   thermal-sensors = < 10>;
+
+   trips {
+   gpu2_alert0: trip-point0 {
+   temperature = <9>;
+   hysteresis = <2000>;
+   type = "hot";
+   };
+   };
+   };
};
 
cpu-pmu {
-- 
2.17.1

[PATCH v4 09/15] arm64: dts: msm8996: thermal: Add interrupt support

2019-09-20 Thread Amit Kucheria

Register upper-lower interrupts for each of the two tsens controllers.

Signed-off-by: Amit Kucheria 
---
 arch/arm64/boot/dts/qcom/msm8996.dtsi | 60 ++-
 1 file changed, 32 insertions(+), 28 deletions(-)

diff --git a/arch/arm64/boot/dts/qcom/msm8996.dtsi b/arch/arm64/boot/dts/qcom/msm8996.dtsi
index 96c0a481f454..bb763b362c16 100644
--- a/arch/arm64/boot/dts/qcom/msm8996.dtsi
+++ b/arch/arm64/boot/dts/qcom/msm8996.dtsi
@@ -175,8 +175,8 @@
 
thermal-zones {
cpu0-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 3>;
 
@@ -196,8 +196,8 @@
};
 
cpu1-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 5>;
 
@@ -217,8 +217,8 @@
};
 
cpu2-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 8>;
 
@@ -238,8 +238,8 @@
};
 
cpu3-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 10>;
 
@@ -259,8 +259,8 @@
};
 
gpu-thermal-top {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 6>;
 
@@ -274,8 +274,8 @@
};
 
gpu-thermal-bottom {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 7>;
 
@@ -289,8 +289,8 @@
};
 
m4m-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 1>;
 
@@ -304,8 +304,8 @@
};
 
l3-or-venus-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 2>;
 
@@ -319,8 +319,8 @@
};
 
cluster0-l2-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 7>;
 
@@ -334,8 +334,8 @@
};
 
cluster1-l2-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 12>;
 
@@ -349,8 +349,8 @@
};
 
camera-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 1>;
 
@@ -364,8 +364,8 @@
};
 
q6-dsp-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 2>;
 
@@ -379,8 +379,8 @@
};
 
mem-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 3>;
 
@@ -394,8 +394,8 @@
};
 
modemtx-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;

[PATCH v4 10/15] arm64: dts: msm8998: thermal: Add interrupt support

2019-09-20 Thread Amit Kucheria

Register upper-lower interrupts for each of the two tsens controllers.

Signed-off-by: Amit Kucheria 
---
 arch/arm64/boot/dts/qcom/msm8998.dtsi | 82 ++-
 1 file changed, 42 insertions(+), 40 deletions(-)

diff --git a/arch/arm64/boot/dts/qcom/msm8998.dtsi b/arch/arm64/boot/dts/qcom/msm8998.dtsi
index c13ed7aeb1e0..1e2f77b38f2c 100644
--- a/arch/arm64/boot/dts/qcom/msm8998.dtsi
+++ b/arch/arm64/boot/dts/qcom/msm8998.dtsi
@@ -440,8 +440,8 @@
 
thermal-zones {
cpu0-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 1>;
 
@@ -461,8 +461,8 @@
};
 
cpu1-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 2>;
 
@@ -482,8 +482,8 @@
};
 
cpu2-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 3>;
 
@@ -503,8 +503,8 @@
};
 
cpu3-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 4>;
 
@@ -524,8 +524,8 @@
};
 
cpu4-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 7>;
 
@@ -545,8 +545,8 @@
};
 
cpu5-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 8>;
 
@@ -566,8 +566,8 @@
};
 
cpu6-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 9>;
 
@@ -587,8 +587,8 @@
};
 
cpu7-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 10>;
 
@@ -608,8 +608,8 @@
};
 
gpu-thermal-bottom {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 12>;
 
@@ -623,8 +623,8 @@
};
 
gpu-thermal-top {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 13>;
 
@@ -638,8 +638,8 @@
};
 
clust0-mhm-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 5>;
 
@@ -653,8 +653,8 @@
};
 
clust1-mhm-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 6>;
 
@@ -668,8 +668,8 @@
};
 
cluster1-l2-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 11>;
 
@@ -683,8 +683,8 @@
};
 
modem-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;

[PATCH v4 15/15] drivers: thermal: tsens: Add interrupt support

2019-09-20 Thread Amit Kucheria

Depending on the IP version, TSENS supports upper, lower and critical
threshold interrupts. We only add support for upper and lower threshold
interrupts for now.

TSENSv2 has an irq [status|clear|mask] bit tuple for each sensor while
earlier versions only have a single bit per sensor to denote status and
clear. These differences are handled transparently by the interrupt
handler. At each interrupt, we reprogram the new upper and lower threshold
in the .set_trip callback.

Signed-off-by: Amit Kucheria 
---
 drivers/thermal/qcom/tsens-common.c | 376 ++--
 drivers/thermal/qcom/tsens-v0_1.c   |  11 +
 drivers/thermal/qcom/tsens-v1.c |  29 +++
 drivers/thermal/qcom/tsens-v2.c |  13 +
 drivers/thermal/qcom/tsens.c|  31 ++-
 drivers/thermal/qcom/tsens.h| 270 
 6 files changed, 668 insertions(+), 62 deletions(-)

diff --git a/drivers/thermal/qcom/tsens-common.c b/drivers/thermal/qcom/tsens-common.c
index 6b6b3841c2d0..03bf1b8133ea 100644
--- a/drivers/thermal/qcom/tsens-common.c
+++ b/drivers/thermal/qcom/tsens-common.c
@@ -13,6 +13,31 @@
 #include 
 #include "tsens.h"
 
+/**
+ * struct tsens_irq_data - IRQ status and temperature violations
+ * @up_viol:upper threshold violated
+ * @up_thresh:  upper threshold temperature value
+ * @up_irq_mask:mask register for upper threshold irqs
+ * @up_irq_clear:   clear register for upper threshold irqs
+ * @low_viol:   lower threshold violated
+ * @low_thresh: lower threshold temperature value
+ * @low_irq_mask:   mask register for lower threshold irqs
+ * @low_irq_clear:  clear register for lower threshold irqs
+ *
+ * Structure containing data about temperature threshold settings and
+ * irq status if they were violated.
+ */
+struct tsens_irq_data {
+   u32 up_viol;
+   int up_thresh;
+   u32 up_irq_mask;
+   u32 up_irq_clear;
+   u32 low_viol;
+   int low_thresh;
+   u32 low_irq_mask;
+   u32 low_irq_clear;
+};
+
 char *qfprom_read(struct device *dev, const char *cname)
 {
struct nvmem_cell *cell;
@@ -65,6 +90,14 @@ void compute_intercept_slope(struct tsens_priv *priv, u32 *p1,
}
 }
 
+static inline u32 degc_to_code(int degc, const struct tsens_sensor *s)
+{
+   u64 code = (degc * s->slope + s->offset) / SLOPE_FACTOR;
+
+   pr_debug("%s: raw_code: 0x%llx, degc:%d\n", __func__, code, degc);
+   return clamp_val(code, THRESHOLD_MIN_ADC_CODE, THRESHOLD_MAX_ADC_CODE);
+}
+
 static inline int code_to_degc(u32 adc_code, const struct tsens_sensor *s)
 {
int degc, num, den;
@@ -117,6 +150,313 @@ static int tsens_hw_to_mC(struct tsens_sensor *s, int field)
return sign_extend32(temp, resolution) * 100;
 }
 
+/**
+ * tsens_mC_to_hw - Convert temperature to hardware register value
+ * @s: Pointer to sensor struct
+ * @temp: temperature in milliCelsius to be programmed to hardware
+ *
+ * This function outputs the value to be written to hardware in ADC code
+ * or deciCelsius depending on IP version.
+ *
+ * Return: ADC code or temperature in deciCelsius.
+ */
+static int tsens_mC_to_hw(struct tsens_sensor *s, int temp)
+{
+   struct tsens_priv *priv = s->priv;
+
+   /* milliC to adc code */
+   if (priv->feat->adc)
+   return degc_to_code(temp / 1000, s);
+
+   /* milliC to deciC */
+   return temp / 100;
+}
+
+static inline enum tsens_ver tsens_version(struct tsens_priv *priv)
+{
+   return priv->feat->ver_major;
+}
+
+static void tsens_set_interrupt_v1(struct tsens_priv *priv, u32 hw_id,
+  enum tsens_irq_type irq_type, bool enable)
+{
+   u32 index;
+
+   switch (irq_type) {
+   case UPPER:
+   index = UP_INT_CLEAR_0 + hw_id;
+   break;
+   case LOWER:
+   index = LOW_INT_CLEAR_0 + hw_id;
+   break;
+   }
+   regmap_field_write(priv->rf[index], enable ? 0 : 1);
+}
+
+static void tsens_set_interrupt_v2(struct tsens_priv *priv, u32 hw_id,
+  enum tsens_irq_type irq_type, bool enable)
+{
+   u32 index_mask, index_clear;
+
+   /*
+* To enable the interrupt flag for a sensor:
+*- clear the mask bit
+* To disable the interrupt flag for a sensor:
+*- Mask further interrupts for this sensor
+*- Write 1 followed by 0 to clear the interrupt
+*/
+   switch (irq_type) {
+   case UPPER:
+   index_mask  = UP_INT_MASK_0 + hw_id;
+   index_clear = UP_INT_CLEAR_0 + hw_id;
+   break;
+   case LOWER:
+   index_mask  = LOW_INT_MASK_0 + hw_id;
+   index_clear = LOW_INT_CLEAR_0 + hw_id;
+   break;
+   }
+
+   if (enable) {
+   regmap_field_write(priv->rf[index_mask], 0);
+   } else {
+   regmap_field_write(priv->rf[index_mask],  1);
+
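
The conversion helpers added in the hunk above (degc_to_code() and tsens_mC_to_hw()) can be sketched in standalone form as follows. This is only an illustration: the struct `sensor_cal`, the `uses_adc` flag, and the numeric values chosen for `SLOPE_FACTOR` and the ADC clamp limits are assumptions, not values taken from the patch, which gets its definitions from tsens.h. The sketch also uses a signed intermediate so that negative results clamp to the minimum, which is a choice made here rather than something the patch states.

```c
#include <stdint.h>

/* Assumed values for illustration only; the driver defines its own. */
#define SLOPE_FACTOR            1000
#define THRESHOLD_MIN_ADC_CODE  0x0
#define THRESHOLD_MAX_ADC_CODE  0x3ff

/* Hypothetical stand-in for the per-sensor calibration data. */
struct sensor_cal {
	int slope;
	int offset;
};

/* Degrees Celsius -> ADC code, clamped to the assumed valid range. */
static uint32_t degc_to_code(int degc, const struct sensor_cal *s)
{
	int64_t code = ((int64_t)degc * s->slope + s->offset) / SLOPE_FACTOR;

	if (code < THRESHOLD_MIN_ADC_CODE)
		return THRESHOLD_MIN_ADC_CODE;
	if (code > THRESHOLD_MAX_ADC_CODE)
		return THRESHOLD_MAX_ADC_CODE;
	return (uint32_t)code;
}

/* MilliCelsius -> value to program: ADC code, or deciCelsius otherwise. */
static int mc_to_hw(int temp_mc, int uses_adc, const struct sensor_cal *s)
{
	if (uses_adc)
		return (int)degc_to_code(temp_mc / 1000, s);
	return temp_mc / 100;
}
```

With a made-up calibration of slope 3200 and offset 500000, a 90000 mC threshold maps to ADC code 788 on ADC-based IP and to 900 deciCelsius otherwise.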

[PATCH v4 12/15] arm: dts: msm8974: thermal: Add interrupt support

2019-09-20 Thread Amit Kucheria

Register upper-lower interrupt for the tsens controller.

Signed-off-by: Amit Kucheria 
Tested-by: Brian Masney 
---
 arch/arm/boot/dts/qcom-msm8974.dtsi | 36 +++--
 1 file changed, 19 insertions(+), 17 deletions(-)

diff --git a/arch/arm/boot/dts/qcom-msm8974.dtsi 
b/arch/arm/boot/dts/qcom-msm8974.dtsi
index d32f639505f1..290f7c3827d4 100644
--- a/arch/arm/boot/dts/qcom-msm8974.dtsi
+++ b/arch/arm/boot/dts/qcom-msm8974.dtsi
@@ -139,8 +139,8 @@
 
thermal-zones {
cpu-thermal0 {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 5>;
 
@@ -159,8 +159,8 @@
};
 
cpu-thermal1 {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 6>;
 
@@ -179,8 +179,8 @@
};
 
cpu-thermal2 {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 7>;
 
@@ -199,8 +199,8 @@
};
 
cpu-thermal3 {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 8>;
 
@@ -219,8 +219,8 @@
};
 
q6-dsp-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 1>;
 
@@ -234,8 +234,8 @@
};
 
modemtx-thermal {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 2>;
 
@@ -250,7 +250,7 @@
 
video-thermal {
polling-delay-passive = <0>;
-   polling-delay = <1000>;
+   polling-delay = <0>;
 
thermal-sensors = < 3>;
 
@@ -279,8 +279,8 @@
};
 
gpu-thermal-top {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 9>;
 
@@ -294,8 +294,8 @@
};
 
gpu-thermal-bottom {
-   polling-delay-passive = <250>;
-   polling-delay = <1000>;
+   polling-delay-passive = <0>;
+   polling-delay = <0>;
 
thermal-sensors = < 10>;
 
@@ -531,6 +531,8 @@
nvmem-cells = <_calib>, <_backup>;
nvmem-cell-names = "calib", "calib_backup";
#qcom,sensors = <11>;
+   interrupts = ;
+   interrupt-names = "uplow";
#thermal-sensor-cells = <1>;
};
 
-- 
2.17.1

Re: [PATCH bpf] libbpf: fix version identification on busybox

2019-09-20 Thread Andrii Nakryiko

On Fri, Sep 20, 2019 at 12:19 PM Ivan Khoronzhuk
 wrote:
>
> On Fri, Sep 20, 2019 at 09:34:51PM +0300, Ivan Khoronzhuk wrote:
> >On Fri, Sep 20, 2019 at 09:41:54AM -0700, Andrii Nakryiko wrote:
> >>On Fri, Sep 20, 2019 at 1:22 AM Ivan Khoronzhuk
> >> wrote:
> >>>
> >>>On Thu, Sep 19, 2019 at 01:02:40PM -0700, Andrii Nakryiko wrote:
> On Thu, Sep 19, 2019 at 11:22 AM Ivan Khoronzhuk
>  wrote:
> >
> > It's very often for embedded to have stripped version of sort in
> > busybox, when no -V option present. It breaks build natively on target
> > board causing recursive loop.
> >
> > BusyBox v1.24.1 (2019-04-06 04:09:16 UTC) multi-call binary. \
> > Usage: sort [-nrugMcszbdfimSTokt] [-o FILE] [-k \
> > start[.offset][opts][,end[.offset][opts]] [-t CHAR] [FILE]...
> >
> > Lets modify command a little to avoid -V option.
> >
> > Fixes: dadb81d0afe732 ("libbpf: make libbpf.map source of truth for 
> > libbpf version")
> >
> > Signed-off-by: Ivan Khoronzhuk 
> > ---
> >
> > Based on bpf/master
> >
> >  tools/lib/bpf/Makefile | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/tools/lib/bpf/Makefile b/tools/lib/bpf/Makefile
> > index c6f94cffe06e..a12490ad6215 100644
> > --- a/tools/lib/bpf/Makefile
> > +++ b/tools/lib/bpf/Makefile
> > @@ -3,7 +3,7 @@
> >
> >  LIBBPF_VERSION := $(shell \
> > grep -oE '^LIBBPF_([0-9.]+)' libbpf.map | \
> > -   sort -rV | head -n1 | cut -d'_' -f2)
> > +   cut -d'_' -f2 | sort -r | head -n1)
> 
> You can't just sort alphabetically, because:
> 
> 1.2
> 1.11
> 
> should be in that order. See discussion on mailing thread for original 
> commit.
> >>>
> >>>if X1.X2.X3, where X = {0,1,9}
> >>>Then it can be:
> >>>
> >>>-LIBBPF_VERSION := $(shell \
> >>>-   grep -oE '^LIBBPF_([0-9.]+)' libbpf.map | \
> >>>-   sort -rV | head -n1 | cut -d'_' -f2)
> >>>+_LBPFLIST := $(patsubst %;,%,$(patsubst LIBBPF_%,%,$(filter LIBBPF_%, \
> >>>+   $(shell cat libbpf.map
> >>>+_LBPFLIST2 := $(foreach v,$(_LBPFLIST), \
> >>>+   $(subst $() $(),,$(foreach n,$(subst .,$() $(),$(v)), \
> >>>+   $(shell printf "%05d" $(n)
> >>>+_LBPF_VER := $(word $(words $(sort $(_LBPFLIST2))), $(sort $(_LBPFLIST2)))
> >>>+LIBBPF_VERSION := $(patsubst %_$(_LBPF_VER),%,$(filter %_$(_LBPF_VER), \
> >>>+$(join $(addsuffix _, $(_LBPFLIST)),$(_LBPFLIST2
> >>>
> >>>It's bigger but avoids invocations of grep/sort/cut/head, only cat/printf
> >>>, thus -V option also.
> >>>
> >>
> >>No way, this is way too ugly (and still unreliable, if we ever have
> >>X.Y.Z.W or something). I'd rather go with my original approach of
> >Yes, forgot to add
> >X1,X2,X3,...XN, where X = {0,1,9} and N = const for all versions.
> >But frankly, 1.0.0 looks too far.
>
> It actually works for any numbs of X1.X2...X100
> but not when you have couple kindof:
> X1.X2.X3
> and
> X1.X2.X3.X4
>
> But, no absolutely any problem to extend this solution to handle all cases,
> by just adding leading 0 to every "transformed version", say limit it to 10
> possible 'dots' (%5*10d) and it will work as clocks. Advantage - mostly make
> functions.
>
> Here can be couple more solutions with sed, not sure it can look less maniac.
>
> >
> >>fetching the last version in libbpf.map file. See
> >>https://www.spinics.net/lists/netdev/msg592703.html.
>
> Yes it's nice but, no sort, no X1.X2.X3XN
>
> Main is to solve it for a long time.

Thinking a bit more about this, I'm even more convinced that we should
just go with my original approach: find last section in libbpf.map and
extract LIBBPF version from that. That will handle whatever crazy
version format we might decide to use (e.g., 1.2.3-experimental).
We'll just need to make sure that latest version is the last in
libbpf.map, which will just happen naturally. So instead of this
Makefile complexity, please can you port back my original approach?
Thanks!

>
> >>
> 
> >  LIBBPF_MAJOR_VERSION := $(firstword $(subst ., ,$(LIBBPF_VERSION)))
> >
> >  MAKEFLAGS += --no-print-directory
> > --
> > 2.17.1
> >
> >>>
> >>>--
> >>>Regards,
> >>>Ivan Khoronzhuk
> >
> >--
> >Regards,
> >Ivan Khoronzhuk
>
> --
> Regards,
> Ivan Khoronzhuk
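
The ordering problem raised in this thread (1.2 has to sort before 1.11, which a plain alphabetical sort gets wrong) can be illustrated with a small C sketch. It is not what the libbpf Makefile does and not a proposal from the thread; `version_cmp` is a hypothetical helper that compares dotted numeric versions component by component, and it assumes purely numeric components.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Compare two dotted numeric versions component by component.
 * Returns <0, 0 or >0 like strcmp(). Missing components count as 0,
 * so "1.2" and "1.2.0" compare equal (a choice made for this sketch).
 */
static int version_cmp(const char *a, const char *b)
{
	while (*a || *b) {
		char *end_a, *end_b;
		unsigned long na = strtoul(a, &end_a, 10);
		unsigned long nb = strtoul(b, &end_b, 10);

		if (na != nb)
			return na < nb ? -1 : 1;

		/* Neither side had digits to parse: stop instead of looping. */
		if (end_a == a && end_b == b)
			break;

		/* Skip the '.' separator, if any. */
		a = (*end_a == '.') ? end_a + 1 : end_a;
		b = (*end_b == '.') ? end_b + 1 : end_b;
	}
	return 0;
}
```

A byte-wise comparison such as strcmp() puts "1.2" after "1.11" because '2' is greater than '1', whereas the numeric comparison above puts it before. A suffix such as "-experimental" is not handled here, which is part of why taking the last entry in libbpf.map, as suggested above, sidesteps the parsing question entirely.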

Re: [PATCH 1/2] mmc: sdhci: Let drivers define their DMA mask

2019-09-20 Thread Nicolin Chen

On Fri, Sep 20, 2019 at 04:53:16PM +0200, Thierry Reding wrote:
> From: Adrian Hunter 
> 
> Add host operation ->set_dma_mask() so that drivers can define their own
> DMA masks.
> 
> Signed-off-by: Adrian Hunter 
> Signed-off-by: Thierry Reding 

Tested-by: Nicolin Chen 

Ran a boot test with both patches on a Tegra186 board.

Thanks!

> ---
>  drivers/mmc/host/sdhci.c | 12 
>  drivers/mmc/host/sdhci.h |  1 +
>  2 files changed, 5 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
> index a5dc5aae973e..bc04c3180477 100644
> --- a/drivers/mmc/host/sdhci.c
> +++ b/drivers/mmc/host/sdhci.c
> @@ -3756,18 +3756,14 @@ int sdhci_setup_host(struct sdhci_host *host)
>   host->flags &= ~SDHCI_USE_ADMA;
>   }
>  
> - /*
> -  * It is assumed that a 64-bit capable device has set a 64-bit DMA mask
> -  * and *must* do 64-bit DMA.  A driver has the opportunity to change
> -  * that during the first call to ->enable_dma().  Similarly
> -  * SDHCI_QUIRK2_BROKEN_64_BIT_DMA must be left to the drivers to
> -  * implement.
> -  */
>   if (sdhci_can_64bit_dma(host))
>   host->flags |= SDHCI_USE_64_BIT_DMA;
>  
>   if (host->flags & (SDHCI_USE_SDMA | SDHCI_USE_ADMA)) {
> - ret = sdhci_set_dma_mask(host);
> + if (host->ops->set_dma_mask)
> + ret = host->ops->set_dma_mask(host);
> + else
> + ret = sdhci_set_dma_mask(host);
>  
>   if (!ret && host->ops->enable_dma)
>   ret = host->ops->enable_dma(host);
> diff --git a/drivers/mmc/host/sdhci.h b/drivers/mmc/host/sdhci.h
> index 902f855efe8f..8285498c0d8a 100644
> --- a/drivers/mmc/host/sdhci.h
> +++ b/drivers/mmc/host/sdhci.h
> @@ -622,6 +622,7 @@ struct sdhci_ops {
>  
>   u32 (*irq)(struct sdhci_host *host, u32 intmask);
>  
> + int (*set_dma_mask)(struct sdhci_host *host);
>   int (*enable_dma)(struct sdhci_host *host);
>   unsigned int(*get_max_clock)(struct sdhci_host *host);
>   unsigned int(*get_min_clock)(struct sdhci_host *host);
> -- 
> 2.23.0
>
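
The dispatch pattern the quoted patch introduces (use the driver's ->set_dma_mask() hook if it is set, otherwise fall back to the core default) can be sketched outside the kernel as below. All names ending in `_sketch`, the `dma_bits` field, and the 64/32-bit policies are hypothetical and exist only to make the example self-contained; they are not the sdhci structures.

```c
#include <stddef.h>

struct host_sketch;

/* Hypothetical ops table: a NULL hook means "use the core default". */
struct host_ops_sketch {
	int (*set_dma_mask)(struct host_sketch *host);
};

struct host_sketch {
	const struct host_ops_sketch *ops;
	unsigned int dma_bits;
};

/* Stand-in for the core's default policy. */
static int default_set_dma_mask(struct host_sketch *host)
{
	host->dma_bits = 64;
	return 0;
}

/* Example driver hook that restricts DMA to 32 bits (made-up policy). */
static int narrow_set_dma_mask(struct host_sketch *host)
{
	host->dma_bits = 32;
	return 0;
}

/* Prefer the driver's hook when present, else fall back to the default. */
static int setup_dma_mask(struct host_sketch *host)
{
	if (host->ops && host->ops->set_dma_mask)
		return host->ops->set_dma_mask(host);
	return default_set_dma_mask(host);
}
```

Drivers that leave the hook unset keep the existing behaviour, so only hardware with unusual addressing limits needs to provide one.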

RE: [PATCH] perf map: fix overlapped map handling

2019-09-20 Thread Steve MacLean

>>  after->start = map->end;
>> +after->pgoff = pos->map_ip(pos, map->end);
>
> So is this equivalent to what __split_vma() does in the kernel, i.e.:
>
>if (new_below)
>new->vm_end = addr;
>else {
>new->vm_start = addr;
>new->vm_pgoff += ((addr - vma->vm_start) >> PAGE_SHIFT);
>}
>
> where new->vm_pgoff starts equal to the vm_pgoff of the mmap being split?

It is roughly equivalent.  The pgoff in struct map is stored in bytes not in 
pages, so it doesn't include the shift.

An earlier version of this patch used:
after->start = map->end;
+   after->pgoff += map->end - pos->start;

instead of the newer, functionally equivalent:
after->start = map->end;
+   after->pgoff = pos->map_ip(pos, map->end);

I preferred the latter form as it made more sense with the assertion that the 
mapping of map->end should match in pos and after.

Steve
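
To make the equivalence of the two forms concrete, here is a small sketch. It assumes the default address translation, where map_ip() computes addr - start + pgoff; `struct map_sketch` and `split_after` are hypothetical reductions of the perf structures, kept only as detailed as the argument needs.

```c
#include <stdint.h>

/* Hypothetical, reduced version of perf's struct map. */
struct map_sketch {
	uint64_t start;
	uint64_t end;
	uint64_t pgoff;   /* file offset of 'start', in bytes (not pages) */
};

/* Address -> file offset, assuming the default (non-identity) mapping. */
static uint64_t map_ip(const struct map_sketch *m, uint64_t addr)
{
	return addr - m->start + m->pgoff;
}

/* Build the piece of 'pos' that remains after a new map ends at split_addr. */
static struct map_sketch split_after(const struct map_sketch *pos,
				     uint64_t split_addr)
{
	struct map_sketch after = *pos;

	after.start = split_addr;
	/* Same result as: after.pgoff += split_addr - pos->start; */
	after.pgoff = map_ip(pos, split_addr);
	return after;
}
```

Because after.start is set to the split address, any address in the remaining range translates to the same file offset through either `pos` or `after`, which is the assertion mentioned above.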

Re: [PATCH v3 07/15] dt-bindings: thermal: tsens: Convert over to a yaml schema

2019-09-20 Thread Amit Kucheria

On Tue, Sep 17, 2019 at 12:06 PM Rob Herring  wrote:
>
> On Wed, Sep 11, 2019 at 12:46:24PM +0530, Amit Kucheria wrote:
> > Document interrupt support in the tsens driver by converting over to a
> > YAML schema.
> >
> > Suggested-by: Stephen Boyd 
> > Signed-off-by: Amit Kucheria 
> > ---
> >  .../bindings/thermal/qcom-tsens.txt   |  55 --
> >  .../bindings/thermal/qcom-tsens.yaml  | 174 ++
> >  MAINTAINERS   |   1 +
> >  3 files changed, 175 insertions(+), 55 deletions(-)
> >  delete mode 100644 Documentation/devicetree/bindings/thermal/qcom-tsens.txt
> >  create mode 100644 
> > Documentation/devicetree/bindings/thermal/qcom-tsens.yaml
>
>
> > diff --git a/Documentation/devicetree/bindings/thermal/qcom-tsens.yaml 
> > b/Documentation/devicetree/bindings/thermal/qcom-tsens.yaml
> > new file mode 100644
> > index ..6784766fe58f
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/thermal/qcom-tsens.yaml
> > @@ -0,0 +1,174 @@
> > +# SPDX-License-Identifier: (GPL-2.0 OR MIT)
> > +# Copyright 2019 Linaro Ltd.
> > +%YAML 1.2
> > +---
> > +$id: http://devicetree.org/schemas/thermal/qcom-tsens.yaml#
> > +$schema: http://devicetree.org/meta-schemas/core.yaml#
> > +
> > +title: QCOM SoC Temperature Sensor (TSENS)
> > +
> > +maintainers:
> > +  - Amit Kucheria 
> > +
> > +description: |
> > +  QCOM SoCs have TSENS IP to allow temperature measurement. There are 
> > currently
> > +  three distinct major versions of the IP that is supported by a single 
> > driver.
> > +  The IP versions are named v0.1, v1 and v2 in the driver, where v0.1 
> > captures
> > +  everything before v1 when there was no versioning information.
> > +
> > +properties:
> > +  compatible:
> > +oneOf:
> > +  - description: v0.1 of TSENS
> > +items:
> > +  - enum:
> > +  - qcom,msm8916-tsens
> > +  - qcom,msm8974-tsens
> > +  - const: qcom,tsens-v0_1
> > +
> > +  - description: v1 of TSENS
> > +items:
> > +  - enum:
> > +  - qcom,qcs404-tsens
> > +  - const: qcom,tsens-v1
> > +
> > +  - description: v2 of TSENS
> > +items:
> > +  - enum:
> > +  - qcom,msm8996-tsens
> > +  - qcom,msm8998-tsens
> > +  - qcom,sdm845-tsens
> > +  - const: qcom,tsens-v2
> > +
> > +  reg:
> > +maxItems: 2
> > +items:
> > +  - description: TM registers
> > +  - description: SROT registers
> > +
> > +  nvmem-cells:
> > +minItems: 1
> > +maxItems: 2
> > +description:
> > +  Reference to an nvmem node for the calibration data
> > +
> > +  nvmem-cells-names:
>
> This is going to require 2 items, so you need an explicit minItems and
> maxItems.

Will fix.

> > +items:
> > +  - enum:
> > +- caldata
> > +- calsel
> > +
> > +  "#qcom,sensors":
> > +allOf:
> > +  - $ref: /schemas/types.yaml#/definitions/uint32
> > +  - minimum: 1
> > +  - maximum: 16
> > +description:
> > +  Number of sensors enabled on this platform
> > +
> > +  "#thermal-sensor-cells":
> > +const: 1
> > +description:
> > +  Number of cells required to uniquely identify the thermal sensors. 
> > Since
> > +  we have multiple sensors this is set to 1
> > +
> > +allOf:
> > +  - if:
> > +  properties:
> > +compatible:
> > +  contains:
> > +enum:
> > +  - qcom,msm8916-tsens
> > +  - qcom,msm8974-tsens
> > +  - qcom,qcs404-tsens
> > +  - qcom,tsens-v0_1
> > +  - qcom,tsens-v1
> > +then:
> > +  properties:
> > +interrupts:
>
> > +  minItems: 1
> > +  maxItems: 1
>
> These can be implicit.

Will remove all of these.

> > +  items:
> > +- description: Combined interrupt if upper or lower threshold 
> > crossed
> > +interrupt-names:
> > +  minItems: 1
> > +  maxItems: 1
>
> ditto.
>
> > +  items:
> > +- const: uplow
> > +
> > +else:
> > +  properties:
> > +interrupts:
> > +  minItems: 2
> > +  maxItems: 2
>
> ditto.
>
> > +  items:
> > +- description: Combined interrupt if upper or lower threshold 
> > crossed
> > +- description: Interrupt if critical threshold crossed
> > +interrupt-names:
> > +  minItems: 2
> > +  maxItems: 2
>
> ditto.
>
> > +  items:
> > +- const: uplow
> > +- const: critical
> > +
> > +required:
> > +  - compatible
> > +  - reg
> > +  - "#qcom,sensors"
> > +  - interrupts
> > +  - interrupt-names
> > +  - "#thermal-sensor-cells"
> > +
> > +examples:
> > +  - |
> > +#include 
> > +// Example 1 (legacy: for pre v1 IP):
> > +tsens1: thermal-sensor@90 {
> > +   compatible = "qcom,msm8916-tsens", "qcom,tsens-v0_1";
> > +   reg =
