Re: [PATCH v2] x86/PCI: fix a memory leak bug

2019-04-16 Thread Ingo Molnar


* Wenwen Wang  wrote:

> On Tue, Apr 16, 2019 at 3:33 PM Thomas Gleixner  wrote:
> >
> > On Tue, 16 Apr 2019, Wenwen Wang wrote:
> >
> > > In pcibios_irq_init(), the PCI IRQ routing table 'pirq_table' is firstly
> > > found through pirq_find_routing_table(). If the table is not found and
> > > 'CONFIG_PCI_BIOS' is defined, the table is then allocated in
> > > pcibios_get_irq_routing_table() using kmalloc(). In the following
> > > execution, if the I/O APIC is used, this table is actually not used.
> > > However, in that case, the allocated table is not freed, which can lead to
> > > a memory leak bug.
> >
> > s/which can lead to/which is/
> >
> > There is no 'can'. It simply is a memory leak.
> >
> > > To fix this issue, this patch frees the allocated table if it is not used.
> >
> > To fix this issue, free the allocated table if it is not used.
> >
> > 'this patch' is completely redundant information and discouraged in
> > Documentation/process/
> >
> 
> Thanks for your suggestions, Thomas. I will revise the commit's message.
> 
> Wenwen
> 
> > Other than that:
> >
> > Acked-by: Thomas Gleixner 

You didn't add Thomas's Acked-by to your commit ...

Thanks,

Ingo


Re: [PATCH-tip 0/2] locking/rwsem: Rwsem rearchitecture part 2 follow-up patches

2019-04-16 Thread Ingo Molnar


* Waiman Long  wrote:

> On 04/16/2019 01:37 PM, Peter Zijlstra wrote:
> > On Tue, Apr 16, 2019 at 01:03:10PM -0400, Waiman Long wrote:
> >> On 04/16/2019 10:17 AM, Peter Zijlstra wrote:
> >>> On Tue, Apr 16, 2019 at 09:18:50AM -0400, Waiman Long wrote:
>  On 04/16/2019 09:10 AM, Peter Zijlstra wrote:
> > On Mon, Apr 15, 2019 at 04:58:27PM -0400, Waiman Long wrote:
> >> This series contain 2 follow-up patches to alleviate the performance
> >> regression found in the page_fault1 test of the will-it-scale 
> >> benchmark.
> >> This does not recover all the lost performance, but reclaim a sizeable
> >> portion of it.
> >>
> >> The regression was found on an Intel system. I have run the test on
> >> an AMD system. The regression wasn't seen there.  There are only minor
> >> variations in performance. Perhaps the page fault path is quite 
> >> different
> >> between Intel and AMD systems.
> > Can you please just fold this back into the appropriate patches? Trying
> > to review all the back and forth is painful.
>  I will send out an update part 2 patch with patch 1 of this series
>  merged into the writer spinning on reader patch. Patch 2 of this series
>  will be a standalone one.
> >>> Hmm, in that case I can fold it back too. So hold off on sending it.
> >>>
> >>> I thought #2 was a fixup for an earlier patch as well.
> >> #2 is a performance fix.
> > Of this patch?
> >
> > 206038 N T Apr 13 Waiman Long (7.5K) ├─>[PATCH v4 11/16] locking/rwsem: 
> > Enable readers spinning on writer
> >
> > Fixes should have a Fixes: tag. And if the patch it fixes isn't a commit
> > yet, the patch should be refreshed to not need a fix.
> 
> The original patch isn't wrong. This patch just introduce another idea
> to make it better. That is why I would still like to separate it as a
> distinct patch.

Yeah, I think it's better to have it in two separate patches. Basically 
patch #1 has a downside for certain workloads, which the heuristics in 
patch #2 improve. That's the only connection between the two patches.

If we find some other worst-case workload then the split of the commits 
would allow more finegrained examination of the effects of these 
performance tunings.

Thanks,

Ingo


Re: [PATCH v2] proc/sysctl: add shared variables for range check

2019-04-16 Thread Matteo Croce
On April 17, 2019 12:17:41 PM GMT+09:00, Andrew Morton 
 wrote:
> On Wed, 17 Apr 2019 02:59:43 +0200 Matteo Croce 
> wrote:
> 
> > In the sysctl code the proc_dointvec_minmax() function is often used
> to
> > validate the user supplied value between an allowed range. This
> function
> > uses the extra1 and extra2 members from struct ctl_table as minimum
> and
> > maximum allowed value.
> > 
> > On sysctl handler declaration, in every source file there are some
> readonly
> > variables containing just an integer which address is assigned to
> the
> > extra1 and extra2 members, so the sysctl range is enforced.
> > 
> > The special values 0, 1 and INT_MAX are very often used as range
> boundary,
> > leading duplication of variables like zero=0, one=1, int_max=INT_MAX
> in
> > different source files:
> > 
> > $ git grep -E '\.extra[12].*&(zero|one|int_max)\b' |wc -l
> > 245
> > 
> > This patch adds three const variables for the most commonly used
> values,
> > and use them instead of creating a local one for every object file.
> > 
> 
> Nice.  A few thoughts:
> 
> > --- a/arch/s390/appldata/appldata_base.c
> > +++ b/arch/s390/appldata/appldata_base.c
> > @@ -220,15 +220,13 @@ appldata_timer_handler(struct ctl_table *ctl,
> int write,
> >void __user *buffer, size_t *lenp, loff_t *ppos)
> >  {
> > int timer_active = appldata_timer_active;
> > -   int zero = 0;
> > -   int one = 1;
> > int rc;
> > struct ctl_table ctl_entry = {
> > .procname   = ctl->procname,
> > .data   = _active,
> > .maxlen = sizeof(int),
> > -   .extra1 = ,
> > -   .extra2 = ,
> > +   .extra1 = (void *)_zero,
> > +   .extra2 = (void *)_one,
> 
> The casts are ugly, and by casting away constness they introduce the
> risk that some errant could could change the value of 0, 1 and
> INT_MAX!
> Maybe - perhaps trying to do that would cause a segv but still,
> they're ugly.
> 
> A proper fix would require changing extra1 and extra2 to const void *.
> 
> Perhaps that would be unfeasibly extensive?
> 

Hi Andrew,

I agree that the casts are ugly, but the "casts discards const qualifier" is 
way more ugly, so I have no choice.

I though about declaring extra1,2 as const, I quickly checked for code which 
write into these pointers and I found none, but I only looked for one, two and 
int_max values.

We could do a deeper search to see if other values are safe to turn to const.

> > ...
> >
> > --- a/fs/proc/proc_sysctl.c
> > +++ b/fs/proc/proc_sysctl.c
> > @@ -21,6 +21,11 @@ static const struct inode_operations
> proc_sys_inode_operations;
> >  static const struct file_operations proc_sys_dir_file_operations;
> >  static const struct inode_operations proc_sys_dir_operations;
> >  
> > +/* shared constants to be used in various sysctls */
> > +const int sysctl_zero = 0;
> > +const int sysctl_one = 1;
> > +const int sysctl_int_max = INT_MAX;
> 
> Don't these require EXPORT_SYMBOL()?

Yes, for kernel modules, as the kbuild bot just pointed out.

Regards,
-- 
Matteo Croce
per aspera ad upstream


Re: [PATCH V2 0/3] Introduce Thermal Pressure

2019-04-16 Thread Ingo Molnar


* Ingo Molnar  wrote:

> * Thara Gopinath  wrote:
> 
> > The test results below shows 3-5% improvement in performance when
> > using the third solution compared to the default system today where
> > scheduler is unware of cpu capacity limitations due to thermal events.
> 
> The numbers look very promising!
> 
> I've rearranged the results to make the performance properties of the 
> various approaches and parameters easier to see:
> 
>  (seconds, lower is better)
> 
>Hackbench   Aobench   Dhrystone
>  =   ===   =
> Vanilla kernel (No Thermal Pressure) 10.21141.581.14
> Instantaneous thermal pressure   10.16141.631.15
> Thermal Pressure Averaging:
>   - PELT fmwk 9.88134.481.19
>   - non-PELT Algo. Decay : 500 ms 9.94133.621.09
>   - non-PELT Algo. Decay : 250 ms 7.52137.221.012
>   - non-PELT Algo. Decay : 125 ms 9.87137.551.12

So what I forgot to say is that IMO your results show robust improvements 
over the vanilla kernel of around 5%, with a relatively straightforward 
thermal pressure metric. So I suspect we could do something like this, if 
there was a bit more measurements done to get the best decay period 
established - the 125-250-500 msecs results seem a bit coarse and not 
entirely unambiguous.

In terms of stddev: the perf stat --pre hook could be used to add a dummy 
benchmark run, to heat up the test system, to get more reliable, less 
noisy numbers?

BTW., that big improvement in hackbench results to 7.52 at 250 msecs, is 
that real, or a fluke perhaps?

Thanks,

Ingo


Re: [PATCH v4 2/2] x86/boot/KASLR: skip the specified crashkernel region

2019-04-16 Thread Pingfan Liu
On Wed, Apr 17, 2019 at 3:01 AM Borislav Petkov  wrote:
>
> On Mon, Apr 08, 2019 at 01:58:35PM +0800, Pingfan Liu wrote:
> > crashkernel=x@y or or =range1:size1[,range2:size2,...]@offset option may
> > fail to reserve the required memory region if KASLR puts kernel into the
> > region. To avoid this uncertainty, asking KASLR to skip the required
> > region.
> >
> > Signed-off-by: Pingfan Liu 
> > Cc: Thomas Gleixner 
> > Cc: Ingo Molnar 
> > Cc: Borislav Petkov 
> > Cc: "H. Peter Anvin" 
> > Cc: Baoquan He 
> > Cc: Will Deacon 
> > Cc: Nicolas Pitre 
> > Cc: Vivek Goyal 
> > Cc: Chao Fan 
> > Cc: "Kirill A. Shutemov" 
> > Cc: Ard Biesheuvel 
> > CC: Hari Bathini 
> > Cc: linux-kernel@vger.kernel.org
> > ---
> >  arch/x86/boot/compressed/kaslr.c | 40 
> > 
> >  1 file changed, 40 insertions(+)
> >
> > diff --git a/arch/x86/boot/compressed/kaslr.c 
> > b/arch/x86/boot/compressed/kaslr.c
> > index 2e53c05..765a593 100644
> > --- a/arch/x86/boot/compressed/kaslr.c
> > +++ b/arch/x86/boot/compressed/kaslr.c
> > @@ -107,6 +107,7 @@ enum mem_avoid_index {
> >   MEM_AVOID_BOOTPARAMS,
> >   MEM_AVOID_MEMMAP_BEGIN,
> >   MEM_AVOID_MEMMAP_END = MEM_AVOID_MEMMAP_BEGIN + MAX_MEMMAP_REGIONS - 
> > 1,
> > + MEM_AVOID_CRASHKERNEL,
> >   MEM_AVOID_MAX,
> >  };
> >
> > @@ -131,6 +132,11 @@ char *skip_spaces(const char *str)
> >  }
> >  #include "../../../../lib/ctype.c"
> >  #include "../../../../lib/cmdline.c"
> > +#ifdef CONFIG_CRASH_CORE
> > +#define printk
> > +#define _BOOT_KASLR
> > +#include "../../../../lib/parse_crashkernel.c"
> > +#endif
> >
> >  static int
> >  parse_memmap(char *p, unsigned long long *start, unsigned long long *size)
> > @@ -292,6 +298,39 @@ static void handle_mem_options(void)
> >   return;
> >  }
> >
> > +static u64 mem_ram_size(void)
> > +{
> > + struct boot_e820_entry *entry;
> > + u64 total_sz = 0;
> > + int i;
> > +
> > + for (i = 0; i < boot_params->e820_entries; i++) {
> > + entry = _params->e820_table[i];
> > + /* Skip non-RAM entries. */
> > + if (entry->type != E820_TYPE_RAM)
> > + continue;
> > + total_sz += entry->size;
> > + }
> > + return total_sz;
> > +}
> > +
> > +/*
> > + * For crashkernel=size@offset or =range1:size1[,range2:size2,...]@offset
> > + * options, recording mem_avoid for them.
> > + */
> > +static void handle_crashkernel_options(void)
> > +{
> > + unsigned long long crash_size, crash_base = 0;
> > + char *cmdline = (char *)get_cmd_line_ptr();
> > + u64 total_sz = mem_ram_size();
> > +
> > + parse_crashkernel(cmdline, total_sz, _size, _base);
>
> That function has a return value which you could test. And then you
> don't need to set crash_base to 0 above.
>
Take __parse_crashkernel()->parse_crashkernel_simple() for example. If
no offset given, then it still return 0, but crash_base is dangling.

Regards,
Pingfan


Re: [PATCH v2] proc/sysctl: add shared variables for range check

2019-04-16 Thread kbuild test robot
Hi Matteo,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v5.1-rc5]
[cannot apply to next-20190416]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Matteo-Croce/proc-sysctl-add-shared-variables-for-range-check/20190417-105810
config: i386-randconfig-m1-04170939 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
# save the attached .config to linux build tree
make ARCH=i386 

All errors (new ones prefixed by >>):

   ERROR: "sysctl_zero" [net/sctp/sctp.ko] undefined!
   ERROR: "sysctl_one" [net/sctp/sctp.ko] undefined!
   ERROR: "sysctl_int_max" [net/sctp/sctp.ko] undefined!
>> ERROR: "sysctl_zero" [net/mpls/mpls_router.ko] undefined!
>> ERROR: "sysctl_one" [net/mpls/mpls_router.ko] undefined!
   ERROR: "sysctl_zero" [net/ipv6/ipv6.ko] undefined!
   ERROR: "sysctl_one" [net/ipv6/ipv6.ko] undefined!
   ERROR: "sysctl_zero" [drivers/gpu/drm/i915/i915.ko] undefined!
   ERROR: "sysctl_one" [drivers/gpu/drm/i915/i915.ko] undefined!

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip


Re: [PATCH v4 1/2] kernel/crash_core: separate the parsing routines to lib/parse_crashkernel.c

2019-04-16 Thread Pingfan Liu
On Wed, Apr 17, 2019 at 3:01 AM Borislav Petkov  wrote:
>
> On Mon, Apr 08, 2019 at 01:58:34PM +0800, Pingfan Liu wrote:
> > Beside kernel, at early boot stage, the KASLR code also needs to parse the
> > crashkernel=x@y or crashkernel=ramsize-range:size[,...][@offset] option,
> > and avoid to put randomized kernel in the region.
> >
> > Extracting the parsing related routines to lib/parse_crashkernel.c, so it
> > will be handy included by other
> > files.
>
> Use this commit message for your next submission:
>
> crash: Carve out crashkernel= cmdline parsing
>
> Make the "crashkernel=" parsing functionality available to the early
> KASLR code. Will be used by a later patch to parse crashkernel regions
> which KASLR should aviod.
>
OK.

> > Signed-off-by: Pingfan Liu 
> > Cc: Thomas Gleixner 
> > Cc: Ingo Molnar 
> > Cc: Borislav Petkov 
> > Cc: "H. Peter Anvin" 
> > Cc: Baoquan He 
> > Cc: Will Deacon 
> > Cc: Nicolas Pitre 
> > Cc: Chao Fan 
> > Cc: "Kirill A. Shutemov" 
> > Cc: Ard Biesheuvel 
> > Cc: Vivek Goyal 
> > CC: Hari Bathini 
> > Cc: linux-kernel@vger.kernel.org
> > ---
> >  kernel/crash_core.c | 273 -
> >  lib/Makefile|   2 +
> >  lib/parse_crashkernel.c | 289 
> > 
> >  3 files changed, 291 insertions(+), 273 deletions(-)
>
> And this is not how you carve out code.
>
> First, you do a patch which does only code move. Nothing more.
>
> In a follow on patch, you make the changes to the moved code so that it
> is immediately visible what you're changing.
>
Will fix it. Thanks for your review.

Regards,
Pingfan


Re: [PATCH V2 0/3] Introduce Thermal Pressure

2019-04-16 Thread Ingo Molnar


* Thara Gopinath  wrote:

> The test results below shows 3-5% improvement in performance when
> using the third solution compared to the default system today where
> scheduler is unware of cpu capacity limitations due to thermal events.

The numbers look very promising!

I've rearranged the results to make the performance properties of the 
various approaches and parameters easier to see:

 (seconds, lower is better)

 Hackbench   Aobench   Dhrystone
 =   ===   =
Vanilla kernel (No Thermal Pressure) 10.21141.581.14
Instantaneous thermal pressure   10.16141.631.15
Thermal Pressure Averaging:
  - PELT fmwk 9.88134.481.19
  - non-PELT Algo. Decay : 500 ms 9.94133.621.09
  - non-PELT Algo. Decay : 250 ms 7.52137.221.012
  - non-PELT Algo. Decay : 125 ms 9.87137.551.12


Firstly, a couple of questions about the numbers:

   1)

  Is the 1.012 result for "non-PELT 250 msecs Dhrystone" really 1.012?
  You reported it as:

 non-PELT Algo. Decay : 250 ms   1.012   7.02%

  But the formatting is significant 3 digits versus only two for all 
  the other results.

   2)

  You reported the hackbench numbers with "10 runs" - did the other 
  benchmarks use 10 runs as well? Maybe you used fewer runs for the 
  longest benchmark, Aobench?

Secondly, it appears the non-PELT decaying average is the best approach, 
but the results are a bit coarse around the ~250 msecs peak. Maybe it 
would be good to measure it in 50 msecs steps between 50 msecs and 1000 
msecs - but only if it can be scripted sanely:

A possible approach would be to add a debug sysctl for the tuning period, 
and script all these benchmark runs and the printing of the results. You 
could add another (debug) sysctl to turn the 'instant' logic on, and to 
restore vanilla kernel behavior as well - this makes it all much easier 
to script and measure with a single kernel image, without having to 
reboot the kernel. The sysctl overhead will not be measurable for 
workloads like this.

Then you can use "perf stat --null --table" to measure runtime and stddev 
easily and with a single tool, for example:

  dagon:~> perf stat --null --sync --repeat 10 --table ./hackbench 20 
>benchmark.out

  Performance counter stats for './hackbench 20' (10 runs):

   # Table of individual measurements:
   0.15246 (-0.03960) ##
   0.20832 (+0.01627) ##
   0.17895 (-0.01310) ##
   0.19791 (+0.00585) #
   0.19209 (+0.4) #
   0.19406 (+0.00201) #
   0.22484 (+0.03278) ###
   0.18695 (-0.00511) #
   0.19032 (-0.00174) #
   0.19464 (+0.00259) #

   # Final result:
   0.19205 +- 0.00592 seconds time elapsed  ( +-  3.08% )

Note how all the individual measurements can be captured this way, 
without seeing the benchmark output itself. So difference benchmarks can 
be measured this way, assuming they don't have too long setup time.

Thanks,

Ingo


[PATCH v3 11/11] riscv: Make mmap allocation top-down by default

2019-04-16 Thread Alexandre Ghiti
In order to avoid wasting user address space by using bottom-up mmap
allocation scheme, prefer top-down scheme when possible.

Before:
root@qemuriscv64:~# cat /proc/self/maps
0001-00016000 r-xp  fe:00 6389   /bin/cat.coreutils
00016000-00017000 r--p 5000 fe:00 6389   /bin/cat.coreutils
00017000-00018000 rw-p 6000 fe:00 6389   /bin/cat.coreutils
00018000-00039000 rw-p  00:00 0  [heap]
156000-16d000 r-xp  fe:00 7193   /lib/ld-2.28.so
16d000-16e000 r--p 00016000 fe:00 7193   /lib/ld-2.28.so
16e000-16f000 rw-p 00017000 fe:00 7193   /lib/ld-2.28.so
16f000-17 rw-p  00:00 0
17-172000 r-xp  00:00 0  [vdso]
174000-176000 rw-p  00:00 0
176000-1555674000 r-xp  fe:00 7187   /lib/libc-2.28.so
1555674000-1555678000 r--p 000fd000 fe:00 7187   /lib/libc-2.28.so
1555678000-155567a000 rw-p 00101000 fe:00 7187   /lib/libc-2.28.so
155567a000-15556a rw-p  00:00 0
3fffb9-3fffbb1000 rw-p  00:00 0  [stack]

After:
root@qemuriscv64:~# cat /proc/self/maps
0001-00016000 r-xp  fe:00 6389   /bin/cat.coreutils
00016000-00017000 r--p 5000 fe:00 6389   /bin/cat.coreutils
00017000-00018000 rw-p 6000 fe:00 6389   /bin/cat.coreutils
00018000-00039000 rw-p  00:00 0  [heap]
3ff7eb6000-3ff7ed8000 rw-p  00:00 0
3ff7ed8000-3ff7fd6000 r-xp  fe:00 7187   /lib/libc-2.28.so
3ff7fd6000-3ff7fda000 r--p 000fd000 fe:00 7187   /lib/libc-2.28.so
3ff7fda000-3ff7fdc000 rw-p 00101000 fe:00 7187   /lib/libc-2.28.so
3ff7fdc000-3ff7fe2000 rw-p  00:00 0
3ff7fe4000-3ff7fe6000 r-xp  00:00 0  [vdso]
3ff7fe6000-3ff7ffd000 r-xp  fe:00 7193   /lib/ld-2.28.so
3ff7ffd000-3ff7ffe000 r--p 00016000 fe:00 7193   /lib/ld-2.28.so
3ff7ffe000-3ff7fff000 rw-p 00017000 fe:00 7193   /lib/ld-2.28.so
3ff7fff000-3ff800 rw-p  00:00 0
3fff888000-3fff8a9000 rw-p  00:00 0  [stack]

Signed-off-by: Alexandre Ghiti 
Reviewed-by: Christoph Hellwig 
---
 arch/riscv/Kconfig | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index eb56c82d8aa1..f5897e0dbc1c 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -49,6 +49,17 @@ config RISCV
select GENERIC_IRQ_MULTI_HANDLER
select ARCH_HAS_PTE_SPECIAL
select HAVE_EBPF_JIT if 64BIT
+   select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT if MMU
+   select HAVE_ARCH_MMAP_RND_BITS
+
+config ARCH_MMAP_RND_BITS_MIN
+   default 18
+
+# max bits determined by the following formula:
+#  VA_BITS - PAGE_SHIFT - 3
+config ARCH_MMAP_RND_BITS_MAX
+   default 33 if 64BIT # SV48 based
+   default 18
 
 config MMU
def_bool y
-- 
2.20.1



[PATCH v3 10/11] mips: Use generic mmap top-down layout

2019-04-16 Thread Alexandre Ghiti
mips uses a top-down layout by default that fits the generic functions.
At the same time, this commit allows to fix problem uncovered
and not fixed for mips here:
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1429066.html

Signed-off-by: Alexandre Ghiti 
---
 arch/mips/Kconfig |  1 +
 arch/mips/include/asm/processor.h |  5 ---
 arch/mips/mm/mmap.c   | 67 ---
 3 files changed, 1 insertion(+), 72 deletions(-)

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 4a5f5b0ee9a9..ec2f07561e4d 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -14,6 +14,7 @@ config MIPS
select ARCH_USE_CMPXCHG_LOCKREF if 64BIT
select ARCH_USE_QUEUED_RWLOCKS
select ARCH_USE_QUEUED_SPINLOCKS
+   select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT if MMU
select ARCH_WANT_IPC_PARSE_VERSION
select BUILDTIME_EXTABLE_SORT
select CLONE_BACKWARDS
diff --git a/arch/mips/include/asm/processor.h 
b/arch/mips/include/asm/processor.h
index aca909bd7841..fba18d4a9190 100644
--- a/arch/mips/include/asm/processor.h
+++ b/arch/mips/include/asm/processor.h
@@ -29,11 +29,6 @@
 
 extern unsigned int vced_count, vcei_count;
 
-/*
- * MIPS does have an arch_pick_mmap_layout()
- */
-#define HAVE_ARCH_PICK_MMAP_LAYOUT 1
-
 #ifdef CONFIG_32BIT
 #ifdef CONFIG_KVM_GUEST
 /* User space process size is limited to 1GB in KVM Guest Mode */
diff --git a/arch/mips/mm/mmap.c b/arch/mips/mm/mmap.c
index ffbe69f3a7d9..61e65a69bb09 100644
--- a/arch/mips/mm/mmap.c
+++ b/arch/mips/mm/mmap.c
@@ -20,43 +20,6 @@
 unsigned long shm_align_mask = PAGE_SIZE - 1;  /* Sane caches */
 EXPORT_SYMBOL(shm_align_mask);
 
-/* gap between mmap and stack */
-#define MIN_GAP(128*1024*1024UL)
-#define MAX_GAP((STACK_TOP)/6*5)
-#define STACK_RND_MASK (0x7ff >> (PAGE_SHIFT - 12))
-
-static int mmap_is_legacy(struct rlimit *rlim_stack)
-{
-   if (current->personality & ADDR_COMPAT_LAYOUT)
-   return 1;
-
-   if (rlim_stack->rlim_cur == RLIM_INFINITY)
-   return 1;
-
-   return sysctl_legacy_va_layout;
-}
-
-static unsigned long mmap_base(unsigned long rnd, struct rlimit *rlim_stack)
-{
-   unsigned long gap = rlim_stack->rlim_cur;
-   unsigned long pad = stack_guard_gap;
-
-   /* Account for stack randomization if necessary */
-   if (current->flags & PF_RANDOMIZE)
-   pad += (STACK_RND_MASK << PAGE_SHIFT);
-
-   /* Values close to RLIM_INFINITY can overflow. */
-   if (gap + pad > gap)
-   gap += pad;
-
-   if (gap < MIN_GAP)
-   gap = MIN_GAP;
-   else if (gap > MAX_GAP)
-   gap = MAX_GAP;
-
-   return PAGE_ALIGN(STACK_TOP - gap - rnd);
-}
-
 #define COLOUR_ALIGN(addr, pgoff)  \
addr) + shm_align_mask) & ~shm_align_mask) +\
 (((pgoff) << PAGE_SHIFT) & shm_align_mask))
@@ -154,36 +117,6 @@ unsigned long arch_get_unmapped_area_topdown(struct file 
*filp,
addr0, len, pgoff, flags, DOWN);
 }
 
-unsigned long arch_mmap_rnd(void)
-{
-   unsigned long rnd;
-
-#ifdef CONFIG_COMPAT
-   if (TASK_IS_32BIT_ADDR)
-   rnd = get_random_long() & ((1UL << mmap_rnd_compat_bits) - 1);
-   else
-#endif /* CONFIG_COMPAT */
-   rnd = get_random_long() & ((1UL << mmap_rnd_bits) - 1);
-
-   return rnd << PAGE_SHIFT;
-}
-
-void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
-{
-   unsigned long random_factor = 0UL;
-
-   if (current->flags & PF_RANDOMIZE)
-   random_factor = arch_mmap_rnd();
-
-   if (mmap_is_legacy(rlim_stack)) {
-   mm->mmap_base = TASK_UNMAPPED_BASE + random_factor;
-   mm->get_unmapped_area = arch_get_unmapped_area;
-   } else {
-   mm->mmap_base = mmap_base(random_factor, rlim_stack);
-   mm->get_unmapped_area = arch_get_unmapped_area_topdown;
-   }
-}
-
 static inline unsigned long brk_rnd(void)
 {
unsigned long rnd = get_random_long();
-- 
2.20.1



Aw: Re: [PATCH v1 0/4] make hdmi work on bananapi-r2

2019-04-16 Thread Frank Wunderlich
Hi CK Hu,

you mean the problematic patch is fix possible_crtcs (4/4) and the others are 
ok?

can you push the first 3 while working on the last one?

regards Frank


[PATCH v3 09/11] mips: Use STACK_TOP when computing mmap base address

2019-04-16 Thread Alexandre Ghiti
mmap base address must be computed wrt stack top address, using TASK_SIZE
is wrong since STACK_TOP and TASK_SIZE are not equivalent.

Signed-off-by: Alexandre Ghiti 
---
 arch/mips/mm/mmap.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/mips/mm/mmap.c b/arch/mips/mm/mmap.c
index 3ff82c6f7e24..ffbe69f3a7d9 100644
--- a/arch/mips/mm/mmap.c
+++ b/arch/mips/mm/mmap.c
@@ -22,7 +22,7 @@ EXPORT_SYMBOL(shm_align_mask);
 
 /* gap between mmap and stack */
 #define MIN_GAP(128*1024*1024UL)
-#define MAX_GAP((TASK_SIZE)/6*5)
+#define MAX_GAP((STACK_TOP)/6*5)
 #define STACK_RND_MASK (0x7ff >> (PAGE_SHIFT - 12))
 
 static int mmap_is_legacy(struct rlimit *rlim_stack)
@@ -54,7 +54,7 @@ static unsigned long mmap_base(unsigned long rnd, struct 
rlimit *rlim_stack)
else if (gap > MAX_GAP)
gap = MAX_GAP;
 
-   return PAGE_ALIGN(TASK_SIZE - gap - rnd);
+   return PAGE_ALIGN(STACK_TOP - gap - rnd);
 }
 
 #define COLOUR_ALIGN(addr, pgoff)  \
-- 
2.20.1



[PATCH v3 08/11] mips: Properly account for stack randomization and stack guard gap

2019-04-16 Thread Alexandre Ghiti
This commit takes care of stack randomization and stack guard gap when
computing mmap base address and checks if the task asked for randomization.
This fixes the problem uncovered and not fixed for mips here:
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1429066.html

Signed-off-by: Alexandre Ghiti 
---
 arch/mips/mm/mmap.c | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/mips/mm/mmap.c b/arch/mips/mm/mmap.c
index 2f616ebeb7e0..3ff82c6f7e24 100644
--- a/arch/mips/mm/mmap.c
+++ b/arch/mips/mm/mmap.c
@@ -21,8 +21,9 @@ unsigned long shm_align_mask = PAGE_SIZE - 1; /* Sane caches 
*/
 EXPORT_SYMBOL(shm_align_mask);
 
 /* gap between mmap and stack */
-#define MIN_GAP (128*1024*1024UL)
-#define MAX_GAP ((TASK_SIZE)/6*5)
+#define MIN_GAP(128*1024*1024UL)
+#define MAX_GAP((TASK_SIZE)/6*5)
+#define STACK_RND_MASK (0x7ff >> (PAGE_SHIFT - 12))
 
 static int mmap_is_legacy(struct rlimit *rlim_stack)
 {
@@ -38,6 +39,15 @@ static int mmap_is_legacy(struct rlimit *rlim_stack)
 static unsigned long mmap_base(unsigned long rnd, struct rlimit *rlim_stack)
 {
unsigned long gap = rlim_stack->rlim_cur;
+   unsigned long pad = stack_guard_gap;
+
+   /* Account for stack randomization if necessary */
+   if (current->flags & PF_RANDOMIZE)
+   pad += (STACK_RND_MASK << PAGE_SHIFT);
+
+   /* Values close to RLIM_INFINITY can overflow. */
+   if (gap + pad > gap)
+   gap += pad;
 
if (gap < MIN_GAP)
gap = MIN_GAP;
-- 
2.20.1



[PATCH v3 07/11] arm: Use generic mmap top-down layout

2019-04-16 Thread Alexandre Ghiti
arm uses a top-down mmap layout by default that exactly fits the generic
functions, so get rid of arch specific code and use the generic version
by selecting ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT.

Signed-off-by: Alexandre Ghiti 
---
 arch/arm/Kconfig |  1 +
 arch/arm/include/asm/processor.h |  2 --
 arch/arm/mm/mmap.c   | 62 
 3 files changed, 1 insertion(+), 64 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 850b4805e2d1..f8f603da181f 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -28,6 +28,7 @@ config ARM
select ARCH_SUPPORTS_ATOMIC_RMW
select ARCH_USE_BUILTIN_BSWAP
select ARCH_USE_CMPXCHG_LOCKREF
+   select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT if MMU
select ARCH_WANT_IPC_PARSE_VERSION
select BUILDTIME_EXTABLE_SORT if MMU
select CLONE_BACKWARDS
diff --git a/arch/arm/include/asm/processor.h b/arch/arm/include/asm/processor.h
index 57fe73ea0f72..944ef1fb1237 100644
--- a/arch/arm/include/asm/processor.h
+++ b/arch/arm/include/asm/processor.h
@@ -143,8 +143,6 @@ static inline void prefetchw(const void *ptr)
 #endif
 #endif
 
-#define HAVE_ARCH_PICK_MMAP_LAYOUT
-
 #endif
 
 #endif /* __ASM_ARM_PROCESSOR_H */
diff --git a/arch/arm/mm/mmap.c b/arch/arm/mm/mmap.c
index 0b94b674aa91..b8d912ac9e61 100644
--- a/arch/arm/mm/mmap.c
+++ b/arch/arm/mm/mmap.c
@@ -17,43 +17,6 @@
addr)+SHMLBA-1)&~(SHMLBA-1)) +  \
 (((pgoff)<> (PAGE_SHIFT - 12))
-
-static int mmap_is_legacy(struct rlimit *rlim_stack)
-{
-   if (current->personality & ADDR_COMPAT_LAYOUT)
-   return 1;
-
-   if (rlim_stack->rlim_cur == RLIM_INFINITY)
-   return 1;
-
-   return sysctl_legacy_va_layout;
-}
-
-static unsigned long mmap_base(unsigned long rnd, struct rlimit *rlim_stack)
-{
-   unsigned long gap = rlim_stack->rlim_cur;
-   unsigned long pad = stack_guard_gap;
-
-   /* Account for stack randomization if necessary */
-   if (current->flags & PF_RANDOMIZE)
-   pad += (STACK_RND_MASK << PAGE_SHIFT);
-
-   /* Values close to RLIM_INFINITY can overflow. */
-   if (gap + pad > gap)
-   gap += pad;
-
-   if (gap < MIN_GAP)
-   gap = MIN_GAP;
-   else if (gap > MAX_GAP)
-   gap = MAX_GAP;
-
-   return PAGE_ALIGN(STACK_TOP - gap - rnd);
-}
-
 /*
  * We need to ensure that shared mappings are correctly aligned to
  * avoid aliasing issues with VIPT caches.  We need to ensure that
@@ -181,31 +144,6 @@ arch_get_unmapped_area_topdown(struct file *filp, const 
unsigned long addr0,
return addr;
 }
 
-unsigned long arch_mmap_rnd(void)
-{
-   unsigned long rnd;
-
-   rnd = get_random_long() & ((1UL << mmap_rnd_bits) - 1);
-
-   return rnd << PAGE_SHIFT;
-}
-
-void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
-{
-   unsigned long random_factor = 0UL;
-
-   if (current->flags & PF_RANDOMIZE)
-   random_factor = arch_mmap_rnd();
-
-   if (mmap_is_legacy(rlim_stack)) {
-   mm->mmap_base = TASK_UNMAPPED_BASE + random_factor;
-   mm->get_unmapped_area = arch_get_unmapped_area;
-   } else {
-   mm->mmap_base = mmap_base(random_factor, rlim_stack);
-   mm->get_unmapped_area = arch_get_unmapped_area_topdown;
-   }
-}
-
 /*
  * You really shouldn't be using read() or write() on /dev/mem.  This
  * might go away in the future.
-- 
2.20.1



[PATCH v3 06/11] arm: Use STACK_TOP when computing mmap base address

2019-04-16 Thread Alexandre Ghiti
mmap base address must be computed wrt stack top address, using TASK_SIZE
is wrong since STACK_TOP and TASK_SIZE are not equivalent.

Signed-off-by: Alexandre Ghiti 
---
 arch/arm/mm/mmap.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm/mm/mmap.c b/arch/arm/mm/mmap.c
index bff3d00bda5b..0b94b674aa91 100644
--- a/arch/arm/mm/mmap.c
+++ b/arch/arm/mm/mmap.c
@@ -19,7 +19,7 @@
 
 /* gap between mmap and stack */
 #define MIN_GAP(128*1024*1024UL)
-#define MAX_GAP((TASK_SIZE)/6*5)
+#define MAX_GAP((STACK_TOP)/6*5)
 #define STACK_RND_MASK (0x7ff >> (PAGE_SHIFT - 12))
 
 static int mmap_is_legacy(struct rlimit *rlim_stack)
@@ -51,7 +51,7 @@ static unsigned long mmap_base(unsigned long rnd, struct 
rlimit *rlim_stack)
else if (gap > MAX_GAP)
gap = MAX_GAP;
 
-   return PAGE_ALIGN(TASK_SIZE - gap - rnd);
+   return PAGE_ALIGN(STACK_TOP - gap - rnd);
 }
 
 /*
-- 
2.20.1



Detecting x86 LAPIC timer frequency from CPUID data

2019-04-16 Thread Daniel Drake
The CPUID.0x16 leaf provides "Bus (Reference) Frequency (in MHz)".

In the thread "No 8254 PIT & no HPET on new Intel N3350 platforms
causes kernel panic during early boot" we are exploring ways to have
the kernel avoid using the PIT/HPET IRQ0 timer in more cases, and
Thomas Gleixner suggested that we could use this CPUID data to set
lapic_timer_frequency, avoiding the need for calibrate_APIC_clock()
to measure the APIC clock against the IRQ0 timer.

I'm thinking of the the following code change, however I get
unexpected results on Intel i7-8565U (Whiskey Lake). When
booting without this change, and with apic=notscdeadline (so that
APIC clock gets calibrated and used), the bus speed is detected as
23MHz:

 ... lapic delta = 149994
 ... PM-Timer delta = 357939
 ... PM-Timer result ok
 . delta 149994
 . mult: 6442193
 . calibration result: 23999
 . CPU clock speed is 1991.0916 MHz.
 . host bus clock speed is 23.0999 MHz.

However the CPUID.0x16 ECX reports a 100MHz bus speed on this device,
so this code change would produce a significantly different calibration.

Am I doing anything obviously wrong?

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 3fae23834069..6c51ce842f86 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -679,6 +679,16 @@ static unsigned long cpu_khz_from_cpuid(void)
 
cpuid(0x16, _base_mhz, _max_mhz, _bus_mhz, );
 
+#ifdef CONFIG_X86_LOCAL_APIC
+   /*
+* If bus frequency is provided in CPUID data, set
+* global lapic_timer_frequency to bus_clock_cycles/jiffy.
+* This avoids having to calibrate the APIC timer later.
+*/
+   if (ecx_bus_mhz)
+   lapic_timer_frequency = (ecx_bus_mhz * 100) / HZ;
+#endif
+
return eax_base_mhz * 1000;
 }
 
-- 
2.19.1



[PATCH v3 05/11] arm: Properly account for stack randomization and stack guard gap

2019-04-16 Thread Alexandre Ghiti
This commit takes care of stack randomization and stack guard gap when
computing mmap base address and checks if the task asked for randomization.
This fixes the problem uncovered and not fixed for arm here:
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1429066.html

Signed-off-by: Alexandre Ghiti 
---
 arch/arm/mm/mmap.c | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/arm/mm/mmap.c b/arch/arm/mm/mmap.c
index f866870db749..bff3d00bda5b 100644
--- a/arch/arm/mm/mmap.c
+++ b/arch/arm/mm/mmap.c
@@ -18,8 +18,9 @@
 (((pgoff)<> (PAGE_SHIFT - 12))
 
 static int mmap_is_legacy(struct rlimit *rlim_stack)
 {
@@ -35,6 +36,15 @@ static int mmap_is_legacy(struct rlimit *rlim_stack)
 static unsigned long mmap_base(unsigned long rnd, struct rlimit *rlim_stack)
 {
unsigned long gap = rlim_stack->rlim_cur;
+   unsigned long pad = stack_guard_gap;
+
+   /* Account for stack randomization if necessary */
+   if (current->flags & PF_RANDOMIZE)
+   pad += (STACK_RND_MASK << PAGE_SHIFT);
+
+   /* Values close to RLIM_INFINITY can overflow. */
+   if (gap + pad > gap)
+   gap += pad;
 
if (gap < MIN_GAP)
gap = MIN_GAP;
-- 
2.20.1



[PATCH v3 04/11] arm64, mm: Move generic mmap layout functions to mm

2019-04-16 Thread Alexandre Ghiti
arm64 handles top-down mmap layout in a way that can be easily reused
by other architectures, so make it available in mm.
It then introduces a new config ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT
that can be set by other architectures to benefit from those functions.
Note that this new config depends on MMU being enabled, if selected
without MMU support, a warning will be thrown.

Suggested-by: Christoph Hellwig 
Signed-off-by: Alexandre Ghiti 
---
 arch/Kconfig   |  8 
 arch/arm64/Kconfig |  1 +
 arch/arm64/include/asm/processor.h |  2 -
 arch/arm64/mm/mmap.c   | 76 --
 kernel/sysctl.c|  6 ++-
 mm/util.c  | 74 -
 6 files changed, 86 insertions(+), 81 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 3368786a..7c8965c64590 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -684,6 +684,14 @@ config HAVE_ARCH_COMPAT_MMAP_BASES
  and vice-versa 32-bit applications to call 64-bit mmap().
  Required for applications doing different bitness syscalls.
 
+# This allows to use a set of generic functions to determine mmap base
+# address by giving priority to top-down scheme only if the process
+# is not in legacy mode (compat task, unlimited stack size or
+# sysctl_legacy_va_layout).
+config ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT
+   bool
+   depends on MMU
+
 config HAVE_COPY_THREAD_TLS
bool
help
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 7e34b9eba5de..670719a26b45 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -66,6 +66,7 @@ config ARM64
select ARCH_SUPPORTS_INT128 if GCC_VERSION >= 5 || CC_IS_CLANG
select ARCH_SUPPORTS_NUMA_BALANCING
select ARCH_WANT_COMPAT_IPC_PARSE_VERSION
+   select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT
select ARCH_WANT_FRAME_POINTERS
select ARCH_HAS_UBSAN_SANITIZE_ALL
select ARM_AMBA
diff --git a/arch/arm64/include/asm/processor.h 
b/arch/arm64/include/asm/processor.h
index 5d9ce62bdebd..4de2a2fd605a 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -274,8 +274,6 @@ static inline void spin_lock_prefetch(const void *ptr)
 "nop") : : "p" (ptr));
 }
 
-#define HAVE_ARCH_PICK_MMAP_LAYOUT
-
 #endif
 
 extern unsigned long __ro_after_init signal_minsigstksz; /* sigframe size */
diff --git a/arch/arm64/mm/mmap.c b/arch/arm64/mm/mmap.c
index ac89686c4af8..c74224421216 100644
--- a/arch/arm64/mm/mmap.c
+++ b/arch/arm64/mm/mmap.c
@@ -31,82 +31,6 @@
 
 #include 
 
-/*
- * Leave enough space between the mmap area and the stack to honour ulimit in
- * the face of randomisation.
- */
-#define MIN_GAP (SZ_128M)
-#define MAX_GAP(STACK_TOP/6*5)
-
-static int mmap_is_legacy(struct rlimit *rlim_stack)
-{
-   if (current->personality & ADDR_COMPAT_LAYOUT)
-   return 1;
-
-   if (rlim_stack->rlim_cur == RLIM_INFINITY)
-   return 1;
-
-   return sysctl_legacy_va_layout;
-}
-
-unsigned long arch_mmap_rnd(void)
-{
-   unsigned long rnd;
-
-#ifdef CONFIG_COMPAT
-   if (is_compat_task())
-   rnd = get_random_long() & ((1UL << mmap_rnd_compat_bits) - 1);
-   else
-#endif
-   rnd = get_random_long() & ((1UL << mmap_rnd_bits) - 1);
-   return rnd << PAGE_SHIFT;
-}
-
-static unsigned long mmap_base(unsigned long rnd, struct rlimit *rlim_stack)
-{
-   unsigned long gap = rlim_stack->rlim_cur;
-   unsigned long pad = stack_guard_gap;
-
-   /* Account for stack randomization if necessary */
-   if (current->flags & PF_RANDOMIZE)
-   pad += (STACK_RND_MASK << PAGE_SHIFT);
-
-   /* Values close to RLIM_INFINITY can overflow. */
-   if (gap + pad > gap)
-   gap += pad;
-
-   if (gap < MIN_GAP)
-   gap = MIN_GAP;
-   else if (gap > MAX_GAP)
-   gap = MAX_GAP;
-
-   return PAGE_ALIGN(STACK_TOP - gap - rnd);
-}
-
-/*
- * This function, called very early during the creation of a new process VM
- * image, sets up which VM layout function to use:
- */
-void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
-{
-   unsigned long random_factor = 0UL;
-
-   if (current->flags & PF_RANDOMIZE)
-   random_factor = arch_mmap_rnd();
-
-   /*
-* Fall back to the standard layout if the personality bit is set, or
-* if the expected stack growth is unlimited:
-*/
-   if (mmap_is_legacy(rlim_stack)) {
-   mm->mmap_base = TASK_UNMAPPED_BASE + random_factor;
-   mm->get_unmapped_area = arch_get_unmapped_area;
-   } else {
-   mm->mmap_base = mmap_base(random_factor, rlim_stack);
-   mm->get_unmapped_area = arch_get_unmapped_area_topdown;
-   }
-}
-
 /*
  * You really shouldn't be using read() or write() on /dev/mem.  

Re: [PATCH v2] proc/sysctl: add shared variables for range check

2019-04-16 Thread kbuild test robot
Hi Matteo,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v5.1-rc5]
[cannot apply to next-20190416]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Matteo-Croce/proc-sysctl-add-shared-variables-for-range-check/20190417-105810
config: x86_64-randconfig-m0-04170939 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

All errors (new ones prefixed by >>):

   ERROR: "sysctl_zero" [net/sctp/sctp.ko] undefined!
   ERROR: "sysctl_one" [net/sctp/sctp.ko] undefined!
   ERROR: "sysctl_int_max" [net/sctp/sctp.ko] undefined!
>> ERROR: "sysctl_one" [net/rxrpc/rxrpc.ko] undefined!
>> ERROR: "sysctl_zero" [net/ipv6/ipv6.ko] undefined!
>> ERROR: "sysctl_one" [net/ipv6/ipv6.ko] undefined!
   make[2]: *** [__modpost] Error 1
   make[2]: Target '_modpost' not remade because of errors.
   make[1]: *** [modules] Error 2
   make[1]: Target '_all' not remade because of errors.

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip


[PATCH v3 03/11] arm64: Consider stack randomization for mmap base only when necessary

2019-04-16 Thread Alexandre Ghiti
Do not offset mmap base address because of stack randomization if
current task does not want randomization.

Signed-off-by: Alexandre Ghiti 
---
 arch/arm64/mm/mmap.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/mm/mmap.c b/arch/arm64/mm/mmap.c
index ed4f9915f2b8..ac89686c4af8 100644
--- a/arch/arm64/mm/mmap.c
+++ b/arch/arm64/mm/mmap.c
@@ -65,7 +65,11 @@ unsigned long arch_mmap_rnd(void)
 static unsigned long mmap_base(unsigned long rnd, struct rlimit *rlim_stack)
 {
unsigned long gap = rlim_stack->rlim_cur;
-   unsigned long pad = (STACK_RND_MASK << PAGE_SHIFT) + stack_guard_gap;
+   unsigned long pad = stack_guard_gap;
+
+   /* Account for stack randomization if necessary */
+   if (current->flags & PF_RANDOMIZE)
+   pad += (STACK_RND_MASK << PAGE_SHIFT);
 
/* Values close to RLIM_INFINITY can overflow. */
if (gap + pad > gap)
-- 
2.20.1



[PATCH v3 02/11] arm64: Make use of is_compat_task instead of hardcoding this test

2019-04-16 Thread Alexandre Ghiti
Each architecture has its own way to determine if a task is a compat task,
by using is_compat_task in arch_mmap_rnd, it allows more genericity and
then it prepares its moving to mm/.

Signed-off-by: Alexandre Ghiti 
---
 arch/arm64/mm/mmap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/mm/mmap.c b/arch/arm64/mm/mmap.c
index 842c8a5fcd53..ed4f9915f2b8 100644
--- a/arch/arm64/mm/mmap.c
+++ b/arch/arm64/mm/mmap.c
@@ -54,7 +54,7 @@ unsigned long arch_mmap_rnd(void)
unsigned long rnd;
 
 #ifdef CONFIG_COMPAT
-   if (test_thread_flag(TIF_32BIT))
+   if (is_compat_task())
rnd = get_random_long() & ((1UL << mmap_rnd_compat_bits) - 1);
else
 #endif
-- 
2.20.1



[PATCH v3 01/11] mm, fs: Move randomize_stack_top from fs to mm

2019-04-16 Thread Alexandre Ghiti
This preparatory commit moves this function so that further introduction
of generic topdown mmap layout is contained only in mm/util.c.

Signed-off-by: Alexandre Ghiti 
Reviewed-by: Christoph Hellwig 
---
 fs/binfmt_elf.c| 20 
 include/linux/mm.h |  2 ++
 mm/util.c  | 22 ++
 3 files changed, 24 insertions(+), 20 deletions(-)

diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 7d09d125f148..045f3b29d264 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -662,26 +662,6 @@ static unsigned long load_elf_interp(struct elfhdr 
*interp_elf_ex,
  * libraries.  There is no binary dependent code anywhere else.
  */
 
-#ifndef STACK_RND_MASK
-#define STACK_RND_MASK (0x7ff >> (PAGE_SHIFT - 12))/* 8MB of VA */
-#endif
-
-static unsigned long randomize_stack_top(unsigned long stack_top)
-{
-   unsigned long random_variable = 0;
-
-   if (current->flags & PF_RANDOMIZE) {
-   random_variable = get_random_long();
-   random_variable &= STACK_RND_MASK;
-   random_variable <<= PAGE_SHIFT;
-   }
-#ifdef CONFIG_STACK_GROWSUP
-   return PAGE_ALIGN(stack_top) + random_variable;
-#else
-   return PAGE_ALIGN(stack_top) - random_variable;
-#endif
-}
-
 static int load_elf_binary(struct linux_binprm *bprm)
 {
struct file *interpreter = NULL; /* to shut gcc up */
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 76769749b5a5..087824a5059f 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2312,6 +2312,8 @@ extern int install_special_mapping(struct mm_struct *mm,
   unsigned long addr, unsigned long len,
   unsigned long flags, struct page **pages);
 
+unsigned long randomize_stack_top(unsigned long stack_top);
+
 extern unsigned long get_unmapped_area(struct file *, unsigned long, unsigned 
long, unsigned long, unsigned long);
 
 extern unsigned long mmap_region(struct file *file, unsigned long addr,
diff --git a/mm/util.c b/mm/util.c
index d559bde497a9..a54afb9b4faa 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -14,6 +14,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include 
 
@@ -291,6 +293,26 @@ int vma_is_stack_for_current(struct vm_area_struct *vma)
return (vma->vm_start <= KSTK_ESP(t) && vma->vm_end >= KSTK_ESP(t));
 }
 
+#ifndef STACK_RND_MASK
+#define STACK_RND_MASK (0x7ff >> (PAGE_SHIFT - 12)) /* 8MB of VA */
+#endif
+
+unsigned long randomize_stack_top(unsigned long stack_top)
+{
+   unsigned long random_variable = 0;
+
+   if (current->flags & PF_RANDOMIZE) {
+   random_variable = get_random_long();
+   random_variable &= STACK_RND_MASK;
+   random_variable <<= PAGE_SHIFT;
+   }
+#ifdef CONFIG_STACK_GROWSUP
+   return PAGE_ALIGN(stack_top) + random_variable;
+#else
+   return PAGE_ALIGN(stack_top) - random_variable;
+#endif
+}
+
 #if defined(CONFIG_MMU) && !defined(HAVE_ARCH_PICK_MMAP_LAYOUT)
 void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
 {
-- 
2.20.1



[PATCH v3 00/11] Provide generic top-down mmap layout functions

2019-04-16 Thread Alexandre Ghiti
This series introduces generic functions to make top-down mmap layout
easily accessible to architectures, in particular riscv which was
the initial goal of this series.
The generic implementation was taken from arm64 and used successively
by arm, mips and finally riscv.

Note that in addition the series fixes 2 issues:
- stack randomization was taken into account even if not necessary.
- [1] fixed an issue with mmap base which did not take into account
  randomization but did not report it to arm and mips, so by moving
  arm64 into a generic library, this problem is now fixed for both
  architectures.

This work is an effort to factorize architecture functions to avoid
code duplication and oversights as in [1].

[1]: https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1429066.html

Changes in v3:
  - Split into small patches to ease review as suggested by Christoph
Hellwig and Kees Cook
  - Move help text of new config as a comment, as suggested by Christoph
  - Make new config depend on MMU, as suggested by Christoph

Changes in v2 as suggested by Christoph Hellwig:
  - Preparatory patch that moves randomize_stack_top
  - Fix duplicate config in riscv
  - Align #if defined on next line => this gives rise to a checkpatch
warning. I found this pattern all around the tree, in the same proportion
as the previous pattern which was less pretty:
git grep -C 1 -n -P "^#if defined.+\|\|.*$" 

Alexandre Ghiti (11):
  mm, fs: Move randomize_stack_top from fs to mm
  arm64: Make use of is_compat_task instead of hardcoding this test
  arm64: Consider stack randomization for mmap base only when necessary
  arm64, mm: Move generic mmap layout functions to mm
  arm: Properly account for stack randomization and stack guard gap
  arm: Use STACK_TOP when computing mmap base address
  arm: Use generic mmap top-down layout
  mips: Properly account for stack randomization and stack guard gap
  mips: Use STACK_TOP when computing mmap base address
  mips: Use generic mmap top-down layout
  riscv: Make mmap allocation top-down by default

 arch/Kconfig   |  8 +++
 arch/arm/Kconfig   |  1 +
 arch/arm/include/asm/processor.h   |  2 -
 arch/arm/mm/mmap.c | 52 
 arch/arm64/Kconfig |  1 +
 arch/arm64/include/asm/processor.h |  2 -
 arch/arm64/mm/mmap.c   | 72 --
 arch/mips/Kconfig  |  1 +
 arch/mips/include/asm/processor.h  |  5 --
 arch/mips/mm/mmap.c| 57 --
 arch/riscv/Kconfig | 11 
 fs/binfmt_elf.c| 20 ---
 include/linux/mm.h |  2 +
 kernel/sysctl.c|  6 +-
 mm/util.c  | 96 +-
 15 files changed, 123 insertions(+), 213 deletions(-)

-- 
2.20.1



mmotm 2019-04-16-22-01 uploaded

2019-04-16 Thread akpm
The mm-of-the-moment snapshot 2019-04-16-22-01 has been uploaded to

   http://www.ozlabs.org/~akpm/mmotm/

mmotm-readme.txt says

README for mm-of-the-moment:

http://www.ozlabs.org/~akpm/mmotm/

This is a snapshot of my -mm patch queue.  Uploaded at random hopefully
more than once a week.

You will need quilt to apply these patches to the latest Linus release (5.x
or 5.x-rcY).  The series file is in broken-out.tar.gz and is duplicated in
http://ozlabs.org/~akpm/mmotm/series

The file broken-out.tar.gz contains two datestamp files: .DATE and
.DATE--mm-dd-hh-mm-ss.  Both contain the string -mm-dd-hh-mm-ss,
followed by the base kernel version against which this patch series is to
be applied.

This tree is partially included in linux-next.  To see which patches are
included in linux-next, consult the `series' file.  Only the patches
within the #NEXT_PATCHES_START/#NEXT_PATCHES_END markers are included in
linux-next.


A full copy of the full kernel tree with the linux-next and mmotm patches
already applied is available through git within an hour of the mmotm
release.  Individual mmotm releases are tagged.  The master branch always
points to the latest release, so it's constantly rebasing.

http://git.cmpxchg.org/cgit.cgi/linux-mmotm.git/



The directory http://www.ozlabs.org/~akpm/mmots/ (mm-of-the-second)
contains daily snapshots of the -mm tree.  It is updated more frequently
than mmotm, and is untested.

A git copy of this tree is available at

http://git.cmpxchg.org/cgit.cgi/linux-mmots.git/

and use of this tree is similar to
http://git.cmpxchg.org/cgit.cgi/linux-mmotm.git/, described above.


This mmotm tree contains the following patches against 5.1-rc5:
(patches marked "*" will be included in linux-next)

  origin.patch
* checkpatch-dont-interpret-stack-dumps-as-commit-ids.patch
* mm-add-sys-kernel-slab-cache-cache_dma32.patch
* 
coredump-fix-race-condition-between-mmget_not_zero-get_task_mm-and-core-dumping.patch
* userfaultfd-use-rcu-to-free-the-task-struct-when-fork-fails.patch
* slab-store-tagged-freelist-for-off-slab-slabmgmt.patch
* mm-swapoff-shmem_find_swap_entries-filter-out-other-types.patch
* mm-swapoff-remove-too-limiting-swap_unuse_max_tries.patch
* mm-swapoff-take-notice-of-completion-sooner.patch
* mm-swapoff-shmem_unuse-stop-eviction-without-igrab.patch
* mm-swapoff-shmem_unuse-stop-eviction-without-igrab-fix.patch
* 
mm-memory_hotplug-do-not-unlock-when-fails-to-take-the-device_hotplug_lock.patch
* 
mm-vmstat-fix-proc-vmstat-format-for-config_debug_tlbflush=y-config_smp=n.patch
* proc-fix-map_files-test-on-f29.patch
* proc-fixup-proc-pid-vm-test.patch
* mm-hotplug-treat-cma-pages-as-unmovable.patch
* mm-hotplug-treat-cma-pages-as-unmovable-v4.patch
* mm-fix-inactive-list-balancing-between-numa-nodes-and-cgroups.patch
* kcov-improve-config_arch_has_kcov-help-text.patch
* watchdog-hard-lockup-message-should-end-with-a-newline.patch
* init-initialize-jump-labels-before-command-line-option-parsing.patch
* kmemleak-fix-unused-function-warning.patch
* mm-memory_hotplug-drop-memory-device-reference-after-find_memory_block.patch
* zram-pass-down-the-bvec-we-need-to-read-into-in-the-work-struct.patch
* lib-kconfigdebug-fix-build-error-without-config_block.patch
* prctl-fix-false-positive-in-validate_prctl_map.patch
* scripts-spellingtxt-add-more-typos-to-spellingtxt-and-sort.patch
* arch-sh-boards-mach-dreamcast-irqc-remove-duplicate-header.patch
* debugobjects-move-printk-out-of-db-lock-critical-sections.patch
* ocfs2-use-common-file-type-conversion.patch
* ocfs2-fix-ocfs2-read-inode-data-panic-in-ocfs2_iget.patch
* ocfs2-clear-zero-in-unaligned-direct-io.patch
* ocfs2-clear-zero-in-unaligned-direct-io-checkpatch-fixes.patch
* ocfs2-wait-for-recovering-done-after-direct-unlock-request.patch
* ocfs2-checkpoint-appending-truncate-log-transaction-before-flushing.patch
* ramfs-support-o_tmpfile.patch
  mm.patch
* list-add-function-list_rotate_to_front.patch
* slob-respect-list_head-abstraction-layer.patch
* slob-use-slab_list-instead-of-lru.patch
* slub-add-comments-to-endif-pre-processor-macros.patch
* slub-use-slab_list-instead-of-lru.patch
* slab-use-slab_list-instead-of-lru.patch
* mm-remove-stale-comment-from-page-struct.patch
* slub-remove-useless-kmem_cache_debug-before-remove_full.patch
* mm-slab-remove-unneed-check-in-cpuup_canceled.patch
* slub-update-the-comment-about-slab-frozen.patch
* slab-fix-an-infinite-loop-in-leaks_show.patch
* slab-fix-an-infinite-loop-in-leaks_show-fix.patch
* mm-vmscan-drop-zone-id-from-kswapd-tracepoints.patch
* mm-cma_debugc-fix-the-break-condition-in-cma_maxchunk_get.patch
* userfaultfd-sysctl-add-vmunprivileged_userfaultfd.patch
* userfaultfd-sysctl-add-vmunprivileged_userfaultfd-fix.patch
* page-cache-store-only-head-pages-in-i_pages.patch
* page-cache-store-only-head-pages-in-i_pages-fix.patch
* page-cache-store-only-head-pages-in-i_pages-fix-fix.patch
* mm-page_alloc-disallow-__gfp_comp-in-alloc_pages_exact.patch
* 

Re: [PATCH] x86/tlb: Revert: Align TLB invalidation info

2019-04-16 Thread Nadav Amit
> On Apr 16, 2019, at 11:28 AM, Peter Zijlstra  wrote:
> 
> On Tue, Apr 16, 2019 at 10:45:05AM -0700, Linus Torvalds wrote:
>> So very much Ack on that patch, but maybe we could do a bit more cleanup 
>> here?
> 
> Yeah, Nadav was going to try and clean that up. But I figured we should
> get this revert in and backported while it's hot :-)

I will get to it hopefully next week. I need to do some benchmarking to see
the impact of getting it off the stack, although usually the IPI itself
dominates the TLB shootdown performance overhead.

Re: [PATCH v1 1/1] thermal: rcar_gen3_thermal: request IRQ after device initialization

2019-04-16 Thread Jiada Wang

Hi Eugeniu

thanks for your test & comments and adding more people for review

I will add necessary backtrace information to description and
rephrase commit summary in V2 patch

Thanks,
Jiada

On 2019/04/17 2:48, Eugeniu Rosca wrote:

Hi Jiada,

Adding below people, since they've made recent contributions to the
driver and might be interested in your patch:

git log master --since="1 year" -- drivers/thermal/rcar_gen3_thermal.c \
 | grep -o "\-by:.*" | sed 's/\-by: //' | sort | uniq -c | sort -rn
   7 Eduardo Valentin 
   6 Simon Horman 
   5 Niklas Söderlund 
   2 Geert Uytterhoeven 
   1 Sergei Shtylyov 
   1 Marek Vasut 
   1 Kuninori Morimoto 
   1 Hien Dang 
   1 Fabrizio Castro 
   1 Dien Pham 
   1 Daniel Lezcano 
   1 Biju Das 

I confirm that loading and unloading the rcar3 thermal driver in a
loop produces soft lockup using v5.1-rc5-10-g618d919cae2f on
H3-ES2.0-Salvator-X.

Full log and .config can be found here:
https://gist.github.com/erosca/1f76b6dd897cdc39581fca475155e363

I post an excerpt from the above [1] (why not including it in the
description?). Also, why not rephrasing the commit summary line in such
a way that everybody understands this patch fixes a severe issue, e.g.
"thermal: rcar_gen3_thermal: Fix soft lockup on probe" ?

BTW, with this patch applied I left the thermal driver being
loaded/unloaded on the target for over one hour w/o seeing the issue
reproduced. So, while there might be slight variations in how the final
solution looks like, I think the patch already deserves a:

Tested-by: Eugeniu Rosca 

[1] Soft lockup reproduced with v5.1-rc5-10-g618d919cae2f

root@rcar-gen3:~# while true; do rmmod rcar_gen3_thermal; modprobe 
rcar_gen3_thermal; done
[   43.439043] rcar_gen3_thermal e6198000.thermal: TSC0: Loaded 0 trip points
[   43.451670] rcar_gen3_thermal e6198000.thermal: TSC1: Loaded 0 trip points
[   43.463974] rcar_gen3_thermal e6198000.thermal: TSC2: Loaded 0 trip points

[..]

[  553.966104] rcar_gen3_thermal e6198000.thermal: TSC0: Loaded 0 trip points
[  553.978759] rcar_gen3_thermal e6198000.thermal: TSC1: Loaded 0 trip points
[  553.991058] rcar_gen3_thermal e6198000.thermal: TSC2: Loaded 0 trip points
[  562.235306] renesas_sdhi_internal_dmac ee10.sd: timeout waiting for 
hardware interrupt (CMD25)
[  567.353336] renesas_sdhi_internal_dmac ee10.sd: timeout waiting for 
hardware interrupt (CMD13)
[  572.473318] renesas_sdhi_internal_dmac ee10.sd: timeout waiting for 
hardware interrupt (CMD13)
[  577.593328] renesas_sdhi_internal_dmac ee10.sd: timeout waiting for 
hardware interrupt (CMD12)
[  579.189148] rcu: INFO: rcu_preempt self-detected stall on CPU
[  579.195329] rcu: 0-: (1 GPs behind) idle=b76/1/0x4004 
softirq=263851/263851 fqs=6251 last_accelerate: e095/4240, Nonlazy posted: ...
[  579.209711] rcu:  (t=25008 jiffies g=346801 q=468)
[  579.214801] Task dump for CPU 0:
[  579.218178] modprobeR  running task0  6337   1420 0x002a
[  579.225514] Call trace:
[  579.228103]  dump_backtrace+0x0/0x1dc
[  579.231934]  show_stack+0x24/0x30
[  579.235410]  sched_show_task+0x31c/0x36c
[  579.239507]  dump_cpu_task+0xb0/0xc0
[  579.243248]  rcu_dump_cpu_stacks+0x220/0x238
[  579.247702]  rcu_sched_clock_irq+0x8a4/0x141c
[  579.252249]  update_process_times+0x34/0x64
[  579.256617]  tick_sched_handle+0x80/0x98
[  579.260714]  tick_sched_timer+0x64/0xbc
[  579.264722]  __hrtimer_run_queues+0x5c0/0xb84
[  579.269266]  hrtimer_interrupt+0x1ec/0x454
[  579.273547]  arch_timer_handler_phys+0x40/0x58
[  579.278185]  handle_percpu_devid_irq+0x174/0x6e8
[  579.282999]  generic_handle_irq+0x3c/0x54
[  579.287185]  __handle_domain_irq+0x114/0x118
[  579.291639]  gic_handle_irq+0x70/0xac
[  579.295465]  el1_irq+0xbc/0x180
[  579.298756]  __asan_load8+0x8c/0x9c
[  579.302403]  rcu_is_watching+0x80/0x8c
[  579.306322]  rebalance_domains+0x12c/0x584
[  579.310599]  run_rebalance_domains+0x1f4/0x298
[  579.315231]  __do_softirq+0x4c0/0xab8
[  579.319061]  irq_exit+0x148/0x1d8
[  579.322530]  __handle_domain_irq+0xc0/0x118
[  579.326894]  gic_handle_irq+0x70/0xac
[  579.330720]  el1_irq+0xbc/0x180
[  579.334012]  lock_is_held_type+0xec/0x144
[  579.338201]  rcu_read_lock_sched_held+0x90/0x98
[  579.342927]  kmem_cache_alloc+0x328/0x3e0
[  579.347114]  create_object+0x5c/0x39c
[  579.350944]  kmemleak_alloc+0x54/0x88
[  579.354774]  __kmalloc_track_caller+0x1c8/0x434
[  579.359499]  devres_alloc_node+0x40/0x8c
[  579.363597]  __devm_request_region+0x48/0xc8
[  579.368055]  devm_ioremap_resource+0xcc/0x148
[  579.372626]  rcar_gen3_thermal_probe+0x288/0x618 [rcar_gen3_thermal]
[  579.379231]  platform_drv_probe+0x70/0xe4
[  579.383420]  really_probe+0x2d8/0x3d8
[  579.387249]  driver_probe_device+0x154/0x164
[  579.391705]  device_driver_attach+0x98/0xa0
[  579.396070]  __driver_attach+0xf0/0xf4
[  579.399987]  bus_for_each_dev+0x114/0x13c
[  579.404173]  

Re: [PATCH] ARM: dts: exynos: add CCI-400 PMU nodes support to Exynos542x SoCs

2019-04-16 Thread Anand Moon
Hi Krzysztof,

On Tue, 16 Apr 2019 at 15:49, Krzysztof Kozlowski  wrote:
>
> On Mon, 15 Apr 2019 at 14:24, Anand Moon  wrote:
> > Cache Coherent Interface (CCI) among Cortex-A15 and Cortex-A7, G2D, G3D and 
> > SSS
> >
> > Level 0 > CPU blocks such as Cortex-A15 (CA15), Cortex-A7 (CA7) are
> > joined as the member of Level 0 CCI bus
> >
> > Level 1 > Display engine block (DISP) and 2D graphic engines (G2D) are
> > directly connected to Level 1.
> >   DISP, MDMA, SSS.
> >
> > Level 2 > While all the other IP is connected to Level 1 bus via Level 2 bus
> >G3D, MSCL, MFC, ISP, JPEG/Rotator/DMA/PERI, NAND/SD/EMMC.
> >
> > So my question is the mapped with the cci ip block correct.
> > Level 0 (cci_control0)
> > Level 1 (cci_control1)
> > Level 2 (cci_control1)
>
> Hi Anand,
>
> I do not understand the question - what is mapped with correctly or not?
>
> Best regards,
> Krzysztof

Following the 
https://www.kernel.org/doc/Documentation/devicetree/bindings/arm/cci.txt
CCI node linked to CPU and DMA nodes for example.

On this line below diagram from Exynos 5422 UM show various IP block
linked to CCI level.
Below image just elaborate overall architecture of Exynos 5422 CCI.

[0] https://imgur.com/gallery/0xJSwGQ

So we should map the various IP block to corresponding CCI level.

Best Regards
-Anand


Re: [PATCH v2] panic: add an option to replay all the printk message in buffer

2019-04-16 Thread Andrew Morton
On Wed, 10 Apr 2019 23:37:18 +0800 Feng Tang  wrote:

> Currently on panic, kernel will lower the loglevel and print out
> new printk msg only with console_flush_on_panic().
> 
> Add an option for users to configure the "panic_print" to see
> all dmesg in buffer, some of which they may have never seen due
> to the loglevel setting, which will help debugging too.
> 
> Thanks to Petr Mladek as somes codes come directly from the sample
> code in his review comments.

CONFIG_PRINTK=n:

kernel/printk/printk.c: In function console_unlock:
kernel/printk/printk.c:2419:11: warning: __builtin_memcpy writing 27 bytes into 
a region of size 0 overflows the destination [-Wstringop-overflow=]
len += sprintf(text + len,
   ^~~
"Replaying the entire log:\n");
~~

because LOG_LINE_MAX=0 and PREFIX_MAX=0.


Which is interesting.  The pre-existing

len = sprintf(text,
  "** %llu printk messages dropped **\n",
  log_first_seq - console_seq);

in console_unlock() has the same issue, but the compiler doesn't seem
to want to warn.

(Also, using sprintf() is a bit lame for the new message - could use
strlcpy()).

I'll drop the patch for now - we don't want that warning to come out. 
console_unlock() needs some fixing for the CONFIG_PRINTK=n case.



Re: [PATCH] ARM: debug-ll: add default address for digicolor

2019-04-16 Thread Baruch Siach
Hi Arnd,

On Tue, Apr 16, 2019 at 03:17:30PM +0200, Arnd Bergmann wrote:
> The digicolor platform has three UARTs, but the Kconfig.debug
> file explicitly lists port zero as the one to be used for the
> console, while not providing any default values.
> 
> This can get an automated randconfig build stuck in a loop
> waiting for the user to input the number. As we already know
> the physical address, this patch provides that number as
> default, along with a reasonable default value for the virtual
> address.
> 
> Cc: Baruch Siach 
> Signed-off-by: Arnd Bergmann 
> ---
>  arch/arm/Kconfig.debug | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/arm/Kconfig.debug b/arch/arm/Kconfig.debug
> index a8a1e14f20ab..e5f0b36d797f 100644
> --- a/arch/arm/Kconfig.debug
> +++ b/arch/arm/Kconfig.debug
> @@ -1676,6 +1676,7 @@ config DEBUG_UART_PHYS
>   default 0xe6e68000 if DEBUG_RCAR_GEN2_SCIF1
>   default 0xe6ee if DEBUG_RCAR_GEN2_SCIF4
>   default 0xe8008000 if DEBUG_R7S72100_SCIF2
> + default 0xf740 if DEBUG_DIGICOLOR_UA0

The addruart macro in arch/arm/include/debug/digicolor.S adds the 0x740 offset. 
So CONFIG_DEBUG_UART_PHYS should be 0xf000.

>   default 0xfbe0 if ARCH_EBSA110
>   default 0xf1012000 if DEBUG_MVEBU_UART0_ALTERNATE
>   default 0xf1012100 if DEBUG_MVEBU_UART1_ALTERNATE
> @@ -1785,6 +1786,7 @@ config DEBUG_UART_VIRT
>   default 0xfd012000 if DEBUG_MVEBU_UART0_ALTERNATE && ARCH_MV78XX0
>   default 0xfd883000 if DEBUG_ALPINE_UART0
>   default 0xfde12000 if DEBUG_MVEBU_UART0_ALTERNATE && ARCH_DOVE
> + default 0xfe000740 if DEBUG_DIGICOLOR_UA0

The value I used when testing used to be 0xf010.

>   default 0xfe012000 if DEBUG_MVEBU_UART0_ALTERNATE && ARCH_ORION5X
>   default 0xfe017000 if DEBUG_MMP_UART2
>   default 0xfe018000 if DEBUG_MMP_UART3

Thanks,
baruch

-- 
 http://baruch.siach.name/blog/  ~. .~   Tk Open Systems
=}ooO--U--Ooo{=
   - bar...@tkos.co.il - tel: +972.52.368.4656, http://www.tkos.co.il -


Re: [PATCH v3] arm64: sysreg: make mrs_s and msr_s macros work with Clang and LTO

2019-04-16 Thread Kees Cook
On Tue, Apr 16, 2019 at 12:08 PM Mark Rutland  wrote:
>
> On Mon, Apr 15, 2019 at 10:22:27AM -0700, Nick Desaulniers wrote:
> > On Mon, Apr 15, 2019 at 10:06 AM Mark Rutland  wrote:
> > > It would be nice if we could simply rely on a more recent binutils these
> > > days, which supports the generic S sysreg
> > > definition. That would mean we could get rid of the whole msr_s/mrs_s
> > > hack by turning that into a CPP macro which built that name.
> > >
> > > It looks like binutils has been able to do that since September 2014...
> > >
> > > Are folk using toolchains older than that to compile kernels?
> >
> > Do you have a link to a commit?  If we can pinpoint the binutils
> > version, that might help.
>
> IIUC any version of binutils with commit:
>
>   df7b4545b2b49572 ("[PATCH/AArch64] Generic support for all system registers 
> using mrs and msr")
>
> ... should be sufficent.

This appears to be binutils 2.25:

$ git describe --match 'binutils-2_*' --contains df7b4545b2b49572
binutils-2_25~418

Documentation/process/changes.rst lists 2.20 as current minimum for binutils.

Ubuntu's old LTS (Trusty, Apr 2014) has 2.24, and that release is
about to exit support this month.

Debian's old-stable (Apr 2015) has 2.25. It's about to exit support
when Debian Buster releases.

RHEL6 (2010) has 2.20. RHEL7 (2014) has 2.25.

It seems not unreasonable to require 2.25 at least for arm64 builds?

-- 
Kees Cook


Re: next/master boot bisection: next-20190215 on beaglebone-black

2019-04-16 Thread Kees Cook
On Tue, Apr 16, 2019 at 4:04 PM Guenter Roeck  wrote:
>
> On Tue, Apr 16, 2019 at 1:37 PM Dan Williams  wrote:
> > Ah, no, the problem is that jump_label_init() is called by
> > setup_arch() on x86, and smp_prepare_boot_cpu() on powerpc, but not
> > until after parse_args() on ARM.
> >
> Anywhere but arm64, x86, and ppc, really.
>
> $ git grep jump_label_init arch
> arch/arm64/kernel/smp.c:jump_label_init();
> arch/powerpc/lib/feature-fixups.c:  jump_label_init();
> arch/x86/kernel/setup.c:jump_label_init();

Oooh, nice. Yeah, so, this is already a bug for "hardened_usercopy=0"
which sets static branches too.

> > Given it appears to be safe to call jump_label_init() early how about
> > something like the following?
> >
> > diff --git a/init/main.c b/init/main.c
> > index 598e278b46f7..7d4025d665eb 100644
> > --- a/init/main.c
> > +++ b/init/main.c
> > @@ -582,6 +582,8 @@ asmlinkage __visible void __init start_kernel(void)
> > page_alloc_init();
> >
> > pr_notice("Kernel command line: %s\n", boot_command_line);
> > +   /* parameters may set static keys */
> > +   jump_label_init();
> > parse_early_param();
> > after_dashes = parse_args("Booting kernel",
> >   static_command_line, __start___param,
> > @@ -591,8 +593,6 @@ asmlinkage __visible void __init start_kernel(void)
> > parse_args("Setting init args", after_dashes, NULL, 0, -1, 
> > -1,
> >NULL, set_init_arg);
> >
> > -   jump_label_init();
> > -
>
> That should work, unless there was a reason to have it that late. It
> doesn't look like that was the case, but I may be missing something.

Yes please. :) Let's fix it like you've suggested.

Reviewed-by: Kees Cook 

-- 
Kees Cook


Re: fs/proc: Crash observed in next_tgid (fs/proc/base.c)

2019-04-16 Thread Kees Cook
On Mon, Apr 15, 2019 at 7:58 AM Jitendra Sharma  wrote:
>
> Hi Kees Cook/Luis,
>
> We are observing one kernel crash in next_tgid function through
> getdents64 path. Call stack is as shown below:
>
> -000|has_group_leader_pid(inline)
> -000|next_tgid(
> | [X20] ns = 0xFF87CABB1AC0,
> | [locdesc] iter = (
> | [locdesc] tgid = 424,
> | [locdesc] task = ?))
> | [X21] p = 0xFFD0F948
> | [X21] task = 0xFFD0F948
> -001|proc_pid_readdir(
> | [X20] file = 0xFFD1AC60FC40,
> | [X19] ctx = 0xFF8027363E40)
> | [X21] ns = 0xFF87CABB1AC0
> -002|proc_root_readdir(
> | [X20] file = 0xFFD1AC60FC40,
> | [X19] ctx = 0xFF8027363E40)
> -003|iterate_dir(
> | [X19] file = 0xFFD1AC60FC40,
> | [X22] ctx = 0xFF8027363E40)
> | [X23] inode = 0xFFD1F20246D0
> -004|SYSC_getdents64(inline)
> -004|sys_getdents64(
> | ?,
> | ?,
> | [X19] count = 4200)
> | [X19] count = 4200
> | [X20] f = ([X20] file = 0xAC60FC43AC60FC40, [X20] flags = 1207898624)
> | [X0] error = -1720
> -005|el0_svc_naked(asm)
> -->|exception
> -006|NUX:0x78C5AD7D38(asm)
> ---|end of frame
>
>
>  From this call stack,task: 0xFFD0F948, seems to be invalid.
> As(from ramdumps) it doesn't have any valid fields. And while trying to
> access the fields of this task struct in has_group_leader_pid, abort is
> happening.
>
>  From the dumps, its not clear why the task struct is coming to be some
> invalid (Possibly task has already exited).  This issue is observed
> during normal monkey testing for long hours.
>
> Could you please provide some pointers which could help in debugging
> this issue further.

Do you have any hints on how to reproduce this? I assume something is
missing proper locking or RCU handling, but I don't see anything
obvious in the surrounding code yet...

-- 
Kees Cook


Re: [PATCH v1 1/4] mm/memory_hotplug: Release memory resource after arch_remove_memory()

2019-04-16 Thread Andrew Morton
On Wed, 10 Apr 2019 10:07:24 +0200 David Hildenbrand  wrote:

> Care to fixup both u64 to resource_size_t? Or should I send a patch?
> Whatever you prefer.

Please send along a fixup.

This patch series has no evidence of having been reviewed :(.  Can you
suggest who could help us out here?



Re: [PATCH] stm class: Fix out of bound access from bitmap allocation

2019-04-16 Thread Sai Prakash Ranjan

On 4/16/2019 8:30 PM, Alexander Shishkin wrote:

Sai Prakash Ranjan  writes:


From: Mulu He 

Bitmap allocation works on array of unsigned longs and
for stm master allocation when the number of software
channels is 32, 4 bytes are allocated and there is a out of
bound access at the first 8 bytes access of bitmap region.


Does the below fix the problem for you?

 From fb22b9ab109b332e58d72df13563e270befbd0e3 Mon Sep 17 00:00:00 2001
From: Alexander Shishkin 
Date: Tue, 16 Apr 2019 17:47:02 +0300
Subject: [PATCH] stm class: Fix channel bitmap on 32-bit systems

Commit 7bd1d4093c2f ("stm class: Introduce an abstraction for System Trace
Module devices") naively calculates the channel bitmap size in 64-bit
chunks regardless of the size of underlying unsigned long, making the
bitmap half as big on a 32-bit system. This leads to an out of bounds
access with the upper half of the bitmap.

Fix this by using BITS_TO_LONGS. While at it, convert to using
struct_size() for the total size calculation of the master struct.

Signed-off-by: Alexander Shishkin 
Fixes: 7bd1d4093c2f ("stm class: Introduce an abstraction for System Trace Module 
devices")
Reported-by: Mulu He 
Cc: sta...@vger.kernel.org # v4.4+
---
  drivers/hwtracing/stm/core.c | 6 ++
  1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/hwtracing/stm/core.c b/drivers/hwtracing/stm/core.c
index 5b5807cbcf7c..8c45e79e47db 100644
--- a/drivers/hwtracing/stm/core.c
+++ b/drivers/hwtracing/stm/core.c
@@ -166,11 +166,9 @@ stm_master(struct stm_device *stm, unsigned int idx)
  static int stp_master_alloc(struct stm_device *stm, unsigned int idx)
  {
struct stp_master *master;
-   size_t size;
  
-	size = ALIGN(stm->data->sw_nchannels, 8) / 8;

-   size += sizeof(struct stp_master);
-   master = kzalloc(size, GFP_ATOMIC);
+   master = kzalloc(struct_size(master, chan_map, 
BITS_TO_LONGS(stm->data->sw_nchannels)),
+GFP_ATOMIC);
if (!master)
return -ENOMEM;
  



++ David

Yes it does fix the issue. Actually initial fix internally was using
BITS_TO_LONGS, don't no why they deferred from it.

Anyways thanks for the patch.

- Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


RE: [PATCH V12 1/5] dt-bindings: fsl: scu: add thermal binding

2019-04-16 Thread Aisheng Dong
> From: Anson Huang
> Sent: Tuesday, April 16, 2019 11:22 AM
> 
> NXP i.MX8QXP is an ARMv8 SoC with a Cortex-M4 core inside as system
> controller, the system controller is in charge of system power, clock and
> thermal sensors etc. management, Linux kernel has to communicate with
> system controller via MU (message unit) IPC to get temperature from thermal
> sensors, this patch adds binding doc for i.MX system controller thermal 
> driver.
> 
> Signed-off-by: Anson Huang 
> ---
> No changes.
> ---
>  .../devicetree/bindings/arm/freescale/fsl,scu.txt| 16
> 
>  1 file changed, 16 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
> b/Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
> index 5d7dbab..f4fb6d5 100644
> --- a/Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
> +++ b/Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
> @@ -133,6 +133,17 @@ RTC bindings based on SCU Message Protocol
> Required properties:
>  - compatible: should be "fsl,imx8qxp-sc-rtc";
> 
> +Thermal bindings based on SCU Message Protocol
> +
> +
> +Required properties:
> +- compatible:Should be :
> +   "fsl,imx8qxp-sc-thermal"
> + followed by "fsl,imx-sc-thermal";
> +
> +- #thermal-sensor-cells: See
> Documentation/devicetree/bindings/thermal/thermal.txt
> + for a description.

Better to have an explicit value here.
e.g.
Must be 1. See xxx for a description.

Otherwise:
Reviewed-by: Dong Aisheng 

Regards
Dong Aisheng


Re: [PATCH v1 1/3] thermal: rockchip: add pinctrl control

2019-04-16 Thread elaine.zhang

hi,

在 2019/4/16 下午6:12, Daniel Lezcano 写道:

Hi Elaine,

On 11/04/2019 09:46, elaine.zhang wrote:

hi,

在 2019/4/4 上午11:03, Daniel Lezcano 写道:

On 01/04/2019 08:43, Elaine Zhang wrote:

Based on the TSADC Tshut mode to select pinctrl,
instead of setting pinctrl based on architecture
(Not depends on pinctrl setting by "init" or "default").
And it requires setting the tshut polarity before select pinctrl.

I'm not sure to fully read the description. Can you rephrase/elaborate
the changelog?

Example:
     tsadc: tsadc@ff25 {
     compatible = "rockchip,rk3328-tsadc";
     .
     pinctrl-names = "init", "default", "sleep";
     pinctrl-0 = <_gpio>;
     pinctrl-1 = <_out>;
     pinctrl-2 = <_gpio>;
     status = "disabled";
     .
     };
      {
     rockchip,hw-tshut-mode = <0>; /* tshut mode 0:CRU 1:GPIO */
     status = "okay";
     };
     Application on the product, hope tsadc overtemperature reset cru to
restart.
     But when pinctrl is initialized, it will setting pinctrl by "init"
and "default".
     So the pinctrl will set iomux to "otp_out", which may be make system
crash.

why?
This "otp_out" IO may be connected to the RESET circuit on the hardware. 
If the IO is in the wrong state, it will trigger RESET (similar to the 
effect of pressing the RESET button), which will cause the system to 
restart all the time.



     tsadc gpio iomux:
     "otp_gpio" : just normal gpio, keep default level.
     "otp_out" : tsadc shut down io, when overtemperature,this io may be
trigger high
     level or low level(depend on the tsadc polarity).

     After correction:
     if rockchip,hw-tshut-mode = <0>; (tsadc overtemperature reset cru to
restart)
     select pinctrl to otp_gpio
     if rockchip,hw-tshut-mode = <1>;(tsadc overtemperature IO triggers
output at high or low levels)
     select pinctrl to otp_out.
     Actively select iomux based on rockchip,hw-tshut-mode,
     rather than relying on the pinctrl framework to select iomux.

Let me rephrase it and tell me if I understood correctly:

"When the temperature sensor mode is set to 0, it will automatically
reset the board via the Clock-Reset-Unit (CRU) if the over temperature
threshold is reached. However, when the pinctrl initializes, it does a
transition to "otp_out" which may lead the SoC to crash (why?)

Explicitly use the pinctrl to set/unset the right mode instead of
relying on the pinctrl init mode"

This explanation is correct.


So this patch is a fix and it must contain the Fixes: ... tag.

OK, I'll fix that in the v2.




Signed-off-by: Elaine Zhang 
---
   drivers/thermal/rockchip_thermal.c | 61
+++---
   1 file changed, 50 insertions(+), 11 deletions(-)

diff --git a/drivers/thermal/rockchip_thermal.c
b/drivers/thermal/rockchip_thermal.c
index 9c7643d62ed7..faa6c7792155 100644
--- a/drivers/thermal/rockchip_thermal.c
+++ b/drivers/thermal/rockchip_thermal.c
@@ -34,7 +34,7 @@
    */
   enum tshut_mode {
   TSHUT_MODE_CRU = 0,
-    TSHUT_MODE_GPIO,
+    TSHUT_MODE_OTP,

Why do you change the enum name? The impact on the patch is much higher,
no ?

I just want to make it a little bit more intuitive to understand the
definition of mode.

TSHUT_MODE_CRU: pinctrl select iomux to otp_gpio,the io is normal io.
TSHUT_MODE_OTP: pinctrl select iomux to otp_out, the io is tsadc shut io.

I understand, but at the end the changes for this patch are counter
intuitive, so it is preferable to keep it as is so we can review the
important part.

[ ... ]

OK, keep the enum name. I'll fix it in the V2.



   +static void thermal_pinctrl_select_otp(struct
rockchip_thermal_data *thermal)
+{
+    if (!IS_ERR(thermal->pinctrl) &&
!IS_ERR_OR_NULL(thermal->otp_state))
+    pinctrl_select_state(thermal->pinctrl,
+ thermal->otp_state);
+}
+
+static void thermal_pinctrl_select_gpio(struct rockchip_thermal_data
*thermal)
+{
+    if (!IS_ERR(thermal->pinctrl) &&
!IS_ERR_OR_NULL(thermal->gpio_state))
+    pinctrl_select_state(thermal->pinctrl,
+ thermal->gpio_state);
+}

You should not have to create a couple of specific functions just to
check the pinctrl pointers are set. The caller should do that.

Because there are several places where the call is made,create a couple
of specific functions reduce a lot of duplicated code.

No, that does not reduce a lot of duplicated code, it hides the fact
there is no control from the caller. See the comments below.

[ ... ]

OK, I'll fix it in the V2.



   static int rockchip_configure_from_dt(struct device *dev,
     struct device_node *np,
     struct rockchip_thermal_data *thermal)
@@ -1103,7 +1120,7 @@ static int rockchip_configure_from_dt(struct
device *dev,
   if (of_property_read_u32(np, "rockchip,hw-tshut-mode",
_mode)) {
    

Re: [PATCH] init: Initialize jump labels before command line option parsing

2019-04-16 Thread Andrew Morton
On Tue, 16 Apr 2019 17:39:03 -0700 Guenter Roeck  wrote:

> > > Has it been confirmed that this fixes
> > > mm-shuffle-initial-free-memory-to-improve-memory-side-cache-utilization.patch
> > > on beaglebone-black?
> >
> > This only fixes dynamically enabling the shuffling on 32-bit ARM.
> > Guenter happened to run without the mm-only 'force-enable-always'
> > patch and when he went to use the command line option to enable it he
> > hit the jump-label warning.
> >
> 
> For my part I have not seen the original failure; it seems that the
> kernelci logs are no longer present. As such, I neither know how it
> looks like nor how to (try to) reproduce it. I just thought it might
> be worthwhile to run the patch through my boot tests to see if
> anything pops up. From the feedback I got, though, it sounded like the
> failure is/was very omap2 specific, so I would not be able to
> reproduce it anyway.

hm.  Maybe we forge ahead and see if someone hits the issue who
can work with us on resolving it.  It sounds like the affected population
will be quite small.  But still, ugh :(


Re: [PATCH 2/2] kernel: use sysctl shared variables for range check

2019-04-16 Thread Kees Cook
On Tue, Apr 16, 2019 at 10:21 PM Kees Cook  wrote:
>
> On Mon, Apr 8, 2019 at 5:09 PM Matteo Croce  wrote:
> >
> > Use the shared variables for range check, instead of declaring a local one
> > in every source file.
> >
> > Signed-off-by: Matteo Croce 
> > ---
> >  kernel/pid_namespace.c |   3 +-
> >  kernel/sysctl.c| 193 -
> >  kernel/ucount.c|   6 +-
> >  3 files changed, 98 insertions(+), 104 deletions(-)
> >
> > diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c
> > index aa6e72fb7c08..ddbb51bc4968 100644
> > --- a/kernel/pid_namespace.c
> > +++ b/kernel/pid_namespace.c
> > @@ -290,14 +290,13 @@ static int pid_ns_ctl_handler(struct ctl_table 
> > *table, int write,
> >  }
> >
> >  extern int pid_max;
> > -static int zero = 0;
> >  static struct ctl_table pid_ns_ctl_table[] = {
> > {
> > .procname = "ns_last_pid",
> > .maxlen = sizeof(int),
> > .mode = 0666, /* permissions are checked in the handler */
> > .proc_handler = pid_ns_ctl_handler,
> > -   .extra1 = ,
> > +   .extra1 = (void *)_zero,
>
> BTW, I don't think these (void *) casts are actually needed. I thought
> extra1/2 were already void * so assignments don't need the casting.

Nevermind, I see akpm already mentioned this, and I see it's the
"const" removal now.

-- 
Kees Cook


RE: [PATCH] i2c: imx: correct the method of getting private data in notifier_call

2019-04-16 Thread Anson Huang
Hi, Aisheng

Best Regards!
Anson Huang

> -Original Message-
> From: Aisheng Dong
> Sent: Wednesday, April 17, 2019 11:13 AM
> To: Anson Huang ; shawn...@kernel.org;
> s.ha...@pengutronix.de; ker...@pengutronix.de; feste...@gmail.com;
> wsa+rene...@sang-engineering.com; u.kleine-koe...@pengutronix.de;
> e...@deif.com; li...@rempel-privat.de; mo...@codeaurora.org; Laurentiu
> Tudor ; p...@axentia.se; linux-
> i...@vger.kernel.org; linux-arm-ker...@lists.infradead.org; linux-
> ker...@vger.kernel.org
> Cc: dl-linux-imx 
> Subject: RE: [PATCH] i2c: imx: correct the method of getting private data in
> notifier_call
> 
> > From: Anson Huang
> > Sent: Wednesday, April 17, 2019 10:00 AM
> >
> > The way of getting private imx_i2c_struct in
> > i2c_imx_clk_notifier_call() is incorrect, should use clk_change_nb
> > element to get correct address and avoid below kernel dump during
> > POST_RATE_CHANGE notify by clk
> > framework:
> >
> > Unable to handle kernel paging request at virtual address 03ef1488 pgd
> > =
> > (ptrval) [03ef1488] *pgd= Internal error: Oops: 5 [#1] PREEMPT
> > SMP ARM Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
> > Workqueue: events reduce_bus_freq_handler PC is at
> > i2c_imx_set_clk+0x10/0xb8 LR is at i2c_imx_clk_notifier_call+0x20/0x28
> > pc : [<806a893c>]lr : [<806a8a04>]psr: a0080013
> > sp : bf399dd8  ip : bf3432ac  fp : bf7c1dc0
> > r10: 0002  r9 :   r8 : 
> > r7 : 03ef1480  r6 : bf399e50  r5 :   r4 : 
> > r3 : bf025300  r2 : bf399e50  r1 : 00b71b00  r0 : bf399be8
> > Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
> > Control: 10c5387d  Table: 4e03004a  DAC: 0051 Process kworker/2:1
> > (pid: 38, stack limit = 0x(ptrval))
> > Stack: (0xbf399dd8 to 0xbf39a000)
> > 9dc0:
> > 806a89e4 
> > 9de0:  bf399e50 0002 806a8a04 806a89e4 80142900 
> > 
> > 9e00: bf34ef18 bf34ef04   bf399e50 80142d84 
> > bf399e6c
> > 9e20: bf34ef00 80f214c4 bf025300 0002 80f08d08 bf017480 
> > 80142df0
> > 9e40:  80166ed8 80c27638 8045de58 bf352340 03ef1480
> > 00b71b00 0f82e242
> > 9e60: bf025300 0002 03ef1480 80f60e5c 0001 8045edf0
> > 0002 8045eb08
> > 9e80: bf025300 0002 03ef1480 8045ee10 03ef1480 8045eb08
> > bf01be40 0002
> > 9ea0: 03ef1480 8045ee10 07de2900 8045eb08 bf01b780 0002
> > 07de2900 8045ee10
> > 9ec0: 80c27898 bf399ee4 bf020a80 0002 1f78a400 8045ee10 80f60e5c
> > 80460514
> > 9ee0: 80f60e5c bf01b600 bf01b480 80460460 0f82e242 bf383a80 bf383a00
> > 80f60e5c
> > 9f00:  bf7c1dc0 80f60e70 80460564 80f60df0 80f60d24 80f60df0
> > 8011e72c
> > 9f20:  80f60df0 80f60e6c bf7c4f00  8011e7ac bf274000
> > 8013bd84
> > 9f40: bf7c1dd8 80f03d00 bf274000 bf7c1dc0 bf274014 bf7c1dd8 80f03d00
> > bf398000
> > 9f60: 0008 8013bfb4  bf25d100 bf25d0c0 
> > bf274000 8013bf88
> > 9f80: bf25d11c bf0cfebc  8014140c bf25d0c0 801412ec 
> > 
> > 9fa0:    801010e8  
> >  
> > 9fc0:      
> >  
> > 9fe0:     0013 
> >   [<806a893c>] (i2c_imx_set_clk) from [<806a8a04>]
> > (i2c_imx_clk_notifier_call+0x20/0x28)
> > [<806a8a04>] (i2c_imx_clk_notifier_call) from [<80142900>]
> > (notifier_call_chain+0x44/0x84) [<80142900>] (notifier_call_chain)
> > from [<80142d84>] (__srcu_notifier_call_chain+0x44/0x98)
> > [<80142d84>] (__srcu_notifier_call_chain) from [<80142df0>]
> > (srcu_notifier_call_chain+0x18/0x20)
> > [<80142df0>] (srcu_notifier_call_chain) from [<8045de58>]
> > (__clk_notify+0x78/0xa4) [<8045de58>] (__clk_notify) from [<8045edf0>]
> > (__clk_recalc_rates+0x60/0xb4) [<8045edf0>] (__clk_recalc_rates) from
> > [<8045ee10>] (__clk_recalc_rates+0x80/0xb4)
> > Code: e92d40f8 e5903298 e59072a0 e1530001 (e5975008) ---[ end trace
> > fc7f5514b97b6cbb ]---
> >
> > Fixes: 90ad2cbe88c2("i2c: imx: use clk notifier for rate changes")
> > Signed-off-by: Anson Huang 
> 
> Please also provide how to reproduce it.
> And it seems not a new issue, should we CC stable?

This issue is NOT easy to reproduce unless creating a dedicated test case, 
upstream
kernel normally does NOT go into this path, I met this issue during internal 
bus-freq
test, and it should be a real issue. Code review also can find this issue I 
think.

Yes, if it is indeed a real issue, we should CC stable.

Anson.

> 
> Regards
> Dong Aisheng
> 
> > ---
> >  drivers/i2c/busses/i2c-imx.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/i2c/busses/i2c-imx.c
> > b/drivers/i2c/busses/i2c-imx.c index
> > c0c3043..fd70b11 100644
> > --- a/drivers/i2c/busses/i2c-imx.c
> > +++ b/drivers/i2c/busses/i2c-imx.c
> > @@ -515,9 +515,9 @@ static int i2c_imx_clk_notifier_call(struct
> > notifier_block 

Re: [PATCH 2/2] kernel: use sysctl shared variables for range check

2019-04-16 Thread Kees Cook
On Mon, Apr 8, 2019 at 5:09 PM Matteo Croce  wrote:
>
> Use the shared variables for range check, instead of declaring a local one
> in every source file.
>
> Signed-off-by: Matteo Croce 
> ---
>  kernel/pid_namespace.c |   3 +-
>  kernel/sysctl.c| 193 -
>  kernel/ucount.c|   6 +-
>  3 files changed, 98 insertions(+), 104 deletions(-)
>
> diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c
> index aa6e72fb7c08..ddbb51bc4968 100644
> --- a/kernel/pid_namespace.c
> +++ b/kernel/pid_namespace.c
> @@ -290,14 +290,13 @@ static int pid_ns_ctl_handler(struct ctl_table *table, 
> int write,
>  }
>
>  extern int pid_max;
> -static int zero = 0;
>  static struct ctl_table pid_ns_ctl_table[] = {
> {
> .procname = "ns_last_pid",
> .maxlen = sizeof(int),
> .mode = 0666, /* permissions are checked in the handler */
> .proc_handler = pid_ns_ctl_handler,
> -   .extra1 = ,
> +   .extra1 = (void *)_zero,

BTW, I don't think these (void *) casts are actually needed. I thought
extra1/2 were already void * so assignments don't need the casting.

-- 
Kees Cook


Re: [PATCH v2] proc/sysctl: add shared variables for range check

2019-04-16 Thread Andrew Morton
On Wed, 17 Apr 2019 02:59:43 +0200 Matteo Croce  wrote:

> In the sysctl code the proc_dointvec_minmax() function is often used to
> validate the user supplied value between an allowed range. This function
> uses the extra1 and extra2 members from struct ctl_table as minimum and
> maximum allowed value.
> 
> On sysctl handler declaration, in every source file there are some readonly
> variables containing just an integer which address is assigned to the
> extra1 and extra2 members, so the sysctl range is enforced.
> 
> The special values 0, 1 and INT_MAX are very often used as range boundary,
> leading duplication of variables like zero=0, one=1, int_max=INT_MAX in
> different source files:
> 
> $ git grep -E '\.extra[12].*&(zero|one|int_max)\b' |wc -l
> 245
> 
> This patch adds three const variables for the most commonly used values,
> and use them instead of creating a local one for every object file.
> 

Nice.  A few thoughts:

> --- a/arch/s390/appldata/appldata_base.c
> +++ b/arch/s390/appldata/appldata_base.c
> @@ -220,15 +220,13 @@ appldata_timer_handler(struct ctl_table *ctl, int write,
>  void __user *buffer, size_t *lenp, loff_t *ppos)
>  {
>   int timer_active = appldata_timer_active;
> - int zero = 0;
> - int one = 1;
>   int rc;
>   struct ctl_table ctl_entry = {
>   .procname   = ctl->procname,
>   .data   = _active,
>   .maxlen = sizeof(int),
> - .extra1 = ,
> - .extra2 = ,
> + .extra1 = (void *)_zero,
> + .extra2 = (void *)_one,

The casts are ugly, and by casting away constness they introduce the
risk that some errant could could change the value of 0, 1 and INT_MAX!
Maybe - perhaps trying to do that would cause a segv but still,
they're ugly.

A proper fix would require changing extra1 and extra2 to const void *. 
Perhaps that would be unfeasibly extensive?

> ...
>
> --- a/fs/proc/proc_sysctl.c
> +++ b/fs/proc/proc_sysctl.c
> @@ -21,6 +21,11 @@ static const struct inode_operations 
> proc_sys_inode_operations;
>  static const struct file_operations proc_sys_dir_file_operations;
>  static const struct inode_operations proc_sys_dir_operations;
>  
> +/* shared constants to be used in various sysctls */
> +const int sysctl_zero = 0;
> +const int sysctl_one = 1;
> +const int sysctl_int_max = INT_MAX;

Don't these require EXPORT_SYMBOL()?




RE: [PATCH] i2c: imx: correct the method of getting private data in notifier_call

2019-04-16 Thread Aisheng Dong
[...]

> >
> > Fixes: 90ad2cbe88c2("i2c: imx: use clk notifier for rate changes")
> > Signed-off-by: Anson Huang 
> 
> Please also provide how to reproduce it.
> And it seems not a new issue, should we CC stable?

Besides above comments:

Reviewed-by: Dong Aisheng 

Regards
Dong Aisheng

> 
> Regards
> Dong Aisheng
> 
> > ---
> >  drivers/i2c/busses/i2c-imx.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/i2c/busses/i2c-imx.c
> > b/drivers/i2c/busses/i2c-imx.c index
> > c0c3043..fd70b11 100644
> > --- a/drivers/i2c/busses/i2c-imx.c
> > +++ b/drivers/i2c/busses/i2c-imx.c
> > @@ -515,9 +515,9 @@ static int i2c_imx_clk_notifier_call(struct
> > notifier_block *nb,
> >  unsigned long action, void *data)  {
> > struct clk_notifier_data *ndata = data;
> > -   struct imx_i2c_struct *i2c_imx = container_of(>clk,
> > +   struct imx_i2c_struct *i2c_imx = container_of(nb,
> >   struct imx_i2c_struct,
> > - clk);
> > + clk_change_nb);
> >
> > if (action & POST_RATE_CHANGE)
> > i2c_imx_set_clk(i2c_imx, ndata->new_rate);
> > --
> > 2.7.4



RE: [PATCH] i2c: imx: correct the method of getting private data in notifier_call

2019-04-16 Thread Aisheng Dong
> From: Anson Huang
> Sent: Wednesday, April 17, 2019 10:00 AM
> 
> The way of getting private imx_i2c_struct in i2c_imx_clk_notifier_call() is
> incorrect, should use clk_change_nb element to get correct address and avoid
> below kernel dump during POST_RATE_CHANGE notify by clk
> framework:
> 
> Unable to handle kernel paging request at virtual address 03ef1488 pgd =
> (ptrval) [03ef1488] *pgd= Internal error: Oops: 5 [#1] PREEMPT SMP
> ARM Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
> Workqueue: events reduce_bus_freq_handler PC is at
> i2c_imx_set_clk+0x10/0xb8 LR is at i2c_imx_clk_notifier_call+0x20/0x28
> pc : [<806a893c>]lr : [<806a8a04>]psr: a0080013
> sp : bf399dd8  ip : bf3432ac  fp : bf7c1dc0
> r10: 0002  r9 :   r8 : 
> r7 : 03ef1480  r6 : bf399e50  r5 :   r4 : 
> r3 : bf025300  r2 : bf399e50  r1 : 00b71b00  r0 : bf399be8
> Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
> Control: 10c5387d  Table: 4e03004a  DAC: 0051 Process kworker/2:1
> (pid: 38, stack limit = 0x(ptrval))
> Stack: (0xbf399dd8 to 0xbf39a000)
> 9dc0:
> 806a89e4 
> 9de0:  bf399e50 0002 806a8a04 806a89e4 80142900 
> 
> 9e00: bf34ef18 bf34ef04   bf399e50 80142d84 
> bf399e6c
> 9e20: bf34ef00 80f214c4 bf025300 0002 80f08d08 bf017480 
> 80142df0
> 9e40:  80166ed8 80c27638 8045de58 bf352340 03ef1480
> 00b71b00 0f82e242
> 9e60: bf025300 0002 03ef1480 80f60e5c 0001 8045edf0
> 0002 8045eb08
> 9e80: bf025300 0002 03ef1480 8045ee10 03ef1480 8045eb08
> bf01be40 0002
> 9ea0: 03ef1480 8045ee10 07de2900 8045eb08 bf01b780 0002
> 07de2900 8045ee10
> 9ec0: 80c27898 bf399ee4 bf020a80 0002 1f78a400 8045ee10 80f60e5c
> 80460514
> 9ee0: 80f60e5c bf01b600 bf01b480 80460460 0f82e242 bf383a80 bf383a00
> 80f60e5c
> 9f00:  bf7c1dc0 80f60e70 80460564 80f60df0 80f60d24 80f60df0
> 8011e72c
> 9f20:  80f60df0 80f60e6c bf7c4f00  8011e7ac bf274000
> 8013bd84
> 9f40: bf7c1dd8 80f03d00 bf274000 bf7c1dc0 bf274014 bf7c1dd8 80f03d00
> bf398000
> 9f60: 0008 8013bfb4  bf25d100 bf25d0c0 
> bf274000 8013bf88
> 9f80: bf25d11c bf0cfebc  8014140c bf25d0c0 801412ec 
> 
> 9fa0:    801010e8  
>  
> 9fc0:      
>  
> 9fe0:     0013 
>   [<806a893c>] (i2c_imx_set_clk) from [<806a8a04>]
> (i2c_imx_clk_notifier_call+0x20/0x28)
> [<806a8a04>] (i2c_imx_clk_notifier_call) from [<80142900>]
> (notifier_call_chain+0x44/0x84) [<80142900>] (notifier_call_chain) from
> [<80142d84>] (__srcu_notifier_call_chain+0x44/0x98)
> [<80142d84>] (__srcu_notifier_call_chain) from [<80142df0>]
> (srcu_notifier_call_chain+0x18/0x20)
> [<80142df0>] (srcu_notifier_call_chain) from [<8045de58>]
> (__clk_notify+0x78/0xa4) [<8045de58>] (__clk_notify) from [<8045edf0>]
> (__clk_recalc_rates+0x60/0xb4) [<8045edf0>] (__clk_recalc_rates) from
> [<8045ee10>] (__clk_recalc_rates+0x80/0xb4)
> Code: e92d40f8 e5903298 e59072a0 e1530001 (e5975008) ---[ end trace
> fc7f5514b97b6cbb ]---
> 
> Fixes: 90ad2cbe88c2("i2c: imx: use clk notifier for rate changes")
> Signed-off-by: Anson Huang 

Please also provide how to reproduce it.
And it seems not a new issue, should we CC stable?

Regards
Dong Aisheng

> ---
>  drivers/i2c/busses/i2c-imx.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/i2c/busses/i2c-imx.c b/drivers/i2c/busses/i2c-imx.c index
> c0c3043..fd70b11 100644
> --- a/drivers/i2c/busses/i2c-imx.c
> +++ b/drivers/i2c/busses/i2c-imx.c
> @@ -515,9 +515,9 @@ static int i2c_imx_clk_notifier_call(struct
> notifier_block *nb,
>unsigned long action, void *data)  {
>   struct clk_notifier_data *ndata = data;
> - struct imx_i2c_struct *i2c_imx = container_of(>clk,
> + struct imx_i2c_struct *i2c_imx = container_of(nb,
> struct imx_i2c_struct,
> -   clk);
> +   clk_change_nb);
> 
>   if (action & POST_RATE_CHANGE)
>   i2c_imx_set_clk(i2c_imx, ndata->new_rate);
> --
> 2.7.4



Re: [PATCH v1 1/1] thermal: rcar_gen3_thermal: request IRQ after device initialization

2019-04-16 Thread Jiada Wang

Hi Daniel

On 2019/04/17 4:22, Daniel Lezcano wrote:

On 11/04/2019 12:03, Jiada Wang wrote:

Currently IRQ is remain enabled after .remove, later if device is probed,
IRQ is requested before .thermal_init, this may cause IRQ function be
triggered but not able to clear IRQ status, thus cause system to hang.

this patch by moving request of IRQ after device initialization to
avoid this issue.


Why not disable the interrupt and clear the irq status in the .remove
callback, so the place is clean when probing again?


 struct rcar_gen3_thermal_priv *priv = dev_get_drvdata(dev);

 rcar_thermal_irq_set(priv, false);

should do the trick no ?


yes, this issue also can be addressed by disable the interrupt in .remove.

But there is race condition between .remove and irq thread function,
(which enables IRQ)
so driver need to ensure threaded irq has been disabled before call 
rcar_thermal_irq_set(priv, false) in .remove.

this adds additional complexity.

I am fine with both solutions,
what is your opinion?

Thanks,
Jiada


Signed-off-by: Jiada Wang 
---
  drivers/thermal/rcar_gen3_thermal.c | 48 -
  1 file changed, 26 insertions(+), 22 deletions(-)

diff --git a/drivers/thermal/rcar_gen3_thermal.c 
b/drivers/thermal/rcar_gen3_thermal.c
index 88fa41cf16e8..4d095d7f9763 100644
--- a/drivers/thermal/rcar_gen3_thermal.c
+++ b/drivers/thermal/rcar_gen3_thermal.c
@@ -375,28 +375,6 @@ static int rcar_gen3_thermal_probe(struct platform_device 
*pdev)
  
  	platform_set_drvdata(pdev, priv);
  
-	/*

-* Request 2 (of the 3 possible) IRQs, the driver only needs to
-* to trigger on the low and high trip points of the current
-* temp window at this point.
-*/
-   for (i = 0; i < 2; i++) {
-   irq = platform_get_irq(pdev, i);
-   if (irq < 0)
-   return irq;
-
-   irqname = devm_kasprintf(dev, GFP_KERNEL, "%s:ch%d",
-dev_name(dev), i);
-   if (!irqname)
-   return -ENOMEM;
-
-   ret = devm_request_threaded_irq(dev, irq, rcar_gen3_thermal_irq,
-   rcar_gen3_thermal_irq_thread,
-   IRQF_SHARED, irqname, priv);
-   if (ret)
-   return ret;
-   }
-
pm_runtime_enable(dev);
pm_runtime_get_sync(dev);
  
@@ -458,6 +436,32 @@ static int rcar_gen3_thermal_probe(struct platform_device *pdev)

goto error_unregister;
}
  
+	/*

+* Request 2 (of the 3 possible) IRQs, the driver only needs to
+* to trigger on the low and high trip points of the current
+* temp window at this point.
+*/
+   for (i = 0; i < 2; i++) {
+   irq = platform_get_irq(pdev, i);
+   if (irq < 0) {
+   ret = irq;
+   goto error_unregister;
+   }
+
+   irqname = devm_kasprintf(dev, GFP_KERNEL, "%s:ch%d",
+dev_name(dev), i);
+   if (!irqname) {
+   ret = -ENOMEM;
+   goto error_unregister;
+   }
+
+   ret = devm_request_threaded_irq(dev, irq, rcar_gen3_thermal_irq,
+   rcar_gen3_thermal_irq_thread,
+   IRQF_SHARED, irqname, priv);
+   if (ret)
+   goto error_unregister;
+   }
+
rcar_thermal_irq_set(priv, true);
  
  	return 0;







[PATCH 0/3] fix leaked of_node references in drivers/firmware

2019-04-16 Thread Wen Yang
The call to of_get_cpu_node/of_find_compatible_node/of_parse_phandle...
returns a node pointer with refcount incremented thus it must be
explicitly decremented after the last usage.

We developed a coccinelle SmPL to detect drivers/firmware code and
found some issues.
This patch series fixes those issues.

Wen Yang (3):
  firmware: arm_sdei: fix leaked of_node references
  firmware: psci: fix leaked of_node references
  firmware: stratix10-svc: fix leaked of_node references

 drivers/firmware/arm_sdei.c  |  1 +
 drivers/firmware/psci.c  |  4 +++-
 drivers/firmware/stratix10-svc.c | 14 ++
 3 files changed, 14 insertions(+), 5 deletions(-)

-- 
2.9.5



[PATCH 1/3] firmware: arm_sdei: fix leaked of_node references

2019-04-16 Thread Wen Yang
In sdei_present_dt function, fw_np is obtained by calling
of_find_node_by_name(), np is obtained by calling
of_find_matching_node(), and the reference counts of those
two device_nodes, fw_np and np, are increased.
But when the function exits, only of_node_put is called on np,
and fw_np's reference count is leaked.

Detected by coccinelle with the following warnings:
./drivers/firmware/arm_sdei.c:1088:2-8: ERROR: missing of_node_put; acquired a 
node pointer with refcount incremented on line 1082, but without a 
corresponding object release within this function.
./drivers/firmware/arm_sdei.c:1091:1-7: ERROR: missing of_node_put; acquired a 
node pointer with refcount incremented on line 1082, but without a 
corresponding object release within this function.

Signed-off-by: Wen Yang 
Cc: James Morse 
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
---
 drivers/firmware/arm_sdei.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/firmware/arm_sdei.c b/drivers/firmware/arm_sdei.c
index e6376f9..2faa329 100644
--- a/drivers/firmware/arm_sdei.c
+++ b/drivers/firmware/arm_sdei.c
@@ -1084,6 +1084,7 @@ static bool __init sdei_present_dt(void)
return false;
 
np = of_find_matching_node(fw_np, sdei_of_match);
+   of_node_put(fw_np);
if (!np)
return false;
of_node_put(np);
-- 
2.9.5



[PATCH 2/3] firmware: psci: fix leaked of_node references

2019-04-16 Thread Wen Yang
The call to of_find_matching_node_and_match returns a node pointer
with refcount incremented thus it must be explicitly decremented
after the last usage.

672 int __init psci_dt_init(void)
673 {
674 struct device_node *np;
...
678 np = of_find_matching_node_and_match(...);
679
680 if (!np || !of_device_is_available(np))
682 return -ENODEV;  ---> leaked here
...
686 return init_fn(np);  ---> released here
687 }

Detected by using coccinelle.

Signed-off-by: Wen Yang 
Cc: Mark Rutland 
Cc: Lorenzo Pieralisi 
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
---
 drivers/firmware/psci.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/firmware/psci.c b/drivers/firmware/psci.c
index c80ec1d..e4143f8 100644
--- a/drivers/firmware/psci.c
+++ b/drivers/firmware/psci.c
@@ -677,8 +677,10 @@ int __init psci_dt_init(void)
 
np = of_find_matching_node_and_match(NULL, psci_of_match, _np);
 
-   if (!np || !of_device_is_available(np))
+   if (!np || !of_device_is_available(np)) {
+   of_node_put(np);
return -ENODEV;
+   }
 
init_fn = (psci_initcall_t)matched_np->data;
return init_fn(np);
-- 
2.9.5



[PATCH 3/3] firmware: stratix10-svc: fix leaked of_node references

2019-04-16 Thread Wen Yang
In stratix10_svc_init function, fw_np is obtained by calling
of_find_node_by_name(), np is obtained by calling
of_find_matching_node(), and the reference counts of those
two device_nodes, fw_np and np, are increased.
But when the function exits, only of_node_put is called on np,
and fw_np's reference count is leaked.

Detected by coccinelle with the following warnings:
./drivers/firmware/stratix10-svc.c:1020:2-8: ERROR: missing of_node_put; 
acquired a node pointer with refcount incremented on line 1014, but without a 
corresponding object release within this function.
./drivers/firmware/stratix10-svc.c:1025:2-8: ERROR: missing of_node_put; 
acquired a node pointer with refcount incremented on line 1014, but without a 
corresponding object release within this function.
./drivers/firmware/stratix10-svc.c:1027:1-7: ERROR: missing of_node_put; 
acquired a node pointer with refcount incremented on line 1014, but without a 
corresponding object release within this function.

Signed-off-by: Wen Yang 
Cc: Greg Kroah-Hartman 
Cc: Alan Tull 
Cc: Richard Gong 
Cc: Nicolas Saenz Julienne 
Cc: linux-kernel@vger.kernel.org
---
 drivers/firmware/stratix10-svc.c | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/firmware/stratix10-svc.c b/drivers/firmware/stratix10-svc.c
index 6e65148..482a6bd 100644
--- a/drivers/firmware/stratix10-svc.c
+++ b/drivers/firmware/stratix10-svc.c
@@ -1016,15 +1016,21 @@ static int __init stratix10_svc_init(void)
return -ENODEV;
 
np = of_find_matching_node(fw_np, stratix10_svc_drv_match);
-   if (!np)
-   return -ENODEV;
+   if (!np) {
+   ret = -ENODEV;
+   goto out_put_fw_np;
+   }
 
of_node_put(np);
ret = of_platform_populate(fw_np, stratix10_svc_drv_match, NULL, NULL);
if (ret)
-   return ret;
+   goto out_put_fw_np;
 
-   return platform_driver_register(_svc_driver);
+   ret = platform_driver_register(_svc_driver);
+
+out_put_fw_np:
+   of_node_put(fw_np);
+   return ret;
 }
 
 static void __exit stratix10_svc_exit(void)
-- 
2.9.5



[PATCH 1/2] power: supply: fix leaked of_node refs in ab8500_bm_of_probe

2019-04-16 Thread Wen Yang
The call to of_parse_phandle returns a node pointer with refcount
incremented thus it must be explicitly decremented after the last
usage.

492 int ab8500_bm_of_probe(struct device *dev,
493struct device_node *np,
494struct abx500_bm_data *bm)
495 {
496 const struct batres_vs_temp *tmp_batres_tbl;
497 struct device_node *battery_node;
...
501 /* get phandle to 'battery-info' node */
502 battery_node = of_parse_phandle(np, "battery", 0);
...
509 if (!btech) {
510 dev_warn(dev, "missing property battery-name/type\n");
511 return -EINVAL;---> leaked here
512 }
...
540 of_node_put(battery_node);   ---> released here
541
542 return 0;
543 }

Detected by coccinelle with the following warnings:
./drivers/power/supply/ab8500_bmdata.c:511:2-8: ERROR: missing of_node_put; 
acquired a node pointer with refcount incremented on line 502, but without a 
corresponding object release within this function.

Signed-off-by: Wen Yang 
Cc: Sebastian Reichel 
Cc: linux...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 drivers/power/supply/ab8500_bmdata.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/power/supply/ab8500_bmdata.c 
b/drivers/power/supply/ab8500_bmdata.c
index 7b2b699..f6a6697 100644
--- a/drivers/power/supply/ab8500_bmdata.c
+++ b/drivers/power/supply/ab8500_bmdata.c
@@ -508,6 +508,7 @@ int ab8500_bm_of_probe(struct device *dev,
btech = of_get_property(battery_node, "stericsson,battery-type", NULL);
if (!btech) {
dev_warn(dev, "missing property battery-name/type\n");
+   of_node_put(battery_node);
return -EINVAL;
}
 
-- 
2.9.5



[PATCH 2/2] power: supply: fix leaked of_node refs in power_supply_get_battery_info

2019-04-16 Thread Wen Yang
The call to of_parse_phandle returns a node pointer with refcount
incremented thus it must be explicitly decremented after the last
usage.

Detected by coccinelle with the following warnings:
./drivers/power/supply/power_supply_core.c:601:2-8: ERROR: missing of_node_put; 
acquired a node pointer with refcount incremented on line 595, but without a 
corresponding object release within this function.
./drivers/power/supply/power_supply_core.c:604:2-8: ERROR: missing of_node_put; 
acquired a node pointer with refcount incremented on line 595, but without a 
corresponding object release within this function.
./drivers/power/supply/power_supply_core.c:632:2-8: ERROR: missing of_node_put; 
acquired a node pointer with refcount incremented on line 595, but without a 
corresponding object release within this function.
./drivers/power/supply/power_supply_core.c:635:2-8: ERROR: missing of_node_put; 
acquired a node pointer with refcount incremented on line 595, but without a 
corresponding object release within this function.
./drivers/power/supply/power_supply_core.c:653:3-9: ERROR: missing of_node_put; 
acquired a node pointer with refcount incremented on line 595, but without a 
corresponding object release within this function.
./drivers/power/supply/power_supply_core.c:664:3-9: ERROR: missing of_node_put; 
acquired a node pointer with refcount incremented on line 595, but without a 
corresponding object release within this function.
./drivers/power/supply/power_supply_core.c:673:1-7: ERROR: missing of_node_put; 
acquired a node pointer with refcount incremented on line 595, but without a 
corresponding object release within this function.

Signed-off-by: Wen Yang 
Cc: Sebastian Reichel 
Cc: linux...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 drivers/power/supply/power_supply_core.c | 24 
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/drivers/power/supply/power_supply_core.c 
b/drivers/power/supply/power_supply_core.c
index 65c619c..874495c 100644
--- a/drivers/power/supply/power_supply_core.c
+++ b/drivers/power/supply/power_supply_core.c
@@ -598,10 +598,12 @@ int power_supply_get_battery_info(struct power_supply 
*psy,
 
err = of_property_read_string(battery_np, "compatible", );
if (err)
-   return err;
+   goto out_put_node;
 
-   if (strcmp("simple-battery", value))
-   return -ENODEV;
+   if (strcmp("simple-battery", value)) {
+   err = -ENODEV;
+   goto out_put_node;
+   }
 
/* The property and field names below must correspond to elements
 * in enum power_supply_property. For reasoning, see
@@ -629,10 +631,12 @@ int power_supply_get_battery_info(struct power_supply 
*psy,
 
len = of_property_count_u32_elems(battery_np, "ocv-capacity-celsius");
if (len < 0 && len != -EINVAL) {
-   return len;
+   err = len;
+   goto out_put_node;
} else if (len > POWER_SUPPLY_OCV_TEMP_MAX) {
dev_err(>dev, "Too many temperature values\n");
-   return -EINVAL;
+   err = -EINVAL;
+   goto out_put_node;
} else if (len > 0) {
of_property_read_u32_array(battery_np, "ocv-capacity-celsius",
   info->ocv_temp, len);
@@ -650,7 +654,8 @@ int power_supply_get_battery_info(struct power_supply *psy,
dev_err(>dev, "failed to get %s\n", propname);
kfree(propname);
power_supply_put_battery_info(psy, info);
-   return -EINVAL;
+   err = -EINVAL;
+   goto out_put_node;
}
 
kfree(propname);
@@ -661,7 +666,8 @@ int power_supply_get_battery_info(struct power_supply *psy,
devm_kcalloc(>dev, tab_len, sizeof(*table), 
GFP_KERNEL);
if (!info->ocv_table[index]) {
power_supply_put_battery_info(psy, info);
-   return -ENOMEM;
+   err = -ENOMEM;
+   goto out_put_node;
}
 
for (i = 0; i < tab_len; i++) {
@@ -670,7 +676,9 @@ int power_supply_get_battery_info(struct power_supply *psy,
}
}
 
-   return 0;
+out_put_node:
+   of_node_put(battery_np);
+   return err;
 }
 EXPORT_SYMBOL_GPL(power_supply_get_battery_info);
 
-- 
2.9.5



[PATCH 0/2] fix leaked of_node references in drivers/power

2019-04-16 Thread Wen Yang
The call to of_get_cpu_node/of_find_compatible_node/of_parse_phandle...
returns a node pointer with refcount incremented thus it must be
explicitly decremented after the last usage.

We developed a coccinelle SmPL to detect drivers/power code and
found some issues.
This patch series fixes those issues.

Cc: Sebastian Reichel 
Cc: linux...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

Wen Yang (2):
  power: supply: fix leaked of_node refs in ab8500_bm_of_probe
  power: supply: fix leaked of_node refs in
power_supply_get_battery_info

 drivers/power/supply/ab8500_bmdata.c |  1 +
 drivers/power/supply/power_supply_core.c | 24 
 2 files changed, 17 insertions(+), 8 deletions(-)

-- 
2.9.5



Re: [PATCH 1/2] ras: fix an off-by-one error in __find_elem()

2019-04-16 Thread Cong Wang
On Tue, Apr 16, 2019 at 7:31 PM Cong Wang  wrote:
> Yes it is, I have a stacktrace in production which clearly shows
> del_elem.isra.1+0x34/0x40, unlike the one I triggered via fake
> PFN's. I can show you if you want, it is on 4.14, so very unlikely
> it is interesting to anyone here.

Correct myself:

The one triggered via fake PFN's also shows
del_elem.constprop.1+0x39/0x40.

Sorry for the mistake, I kinda read into another crash...


Re: [PATCH 1/2] ras: fix an off-by-one error in __find_elem()

2019-04-16 Thread Cong Wang
On Tue, Apr 16, 2019 at 6:53 PM Luck, Tony  wrote:
>
> On Tue, Apr 16, 2019 at 04:47:55PM -0700, Cong Wang wrote:
> > 229 static void del_elem(struct ce_array *ca, int idx)
> > 230 {
> > 231 /* Save us a function call when deleting the last element. */
> > 232 if (ca->n - (idx + 1))
> > 233 memmove((void *)>array[idx],
> > 234 (void *)>array[idx + 1],
> > 235 (ca->n - (idx + 1)) * sizeof(u64));
> > 236
> > 237 ca->n--;
> > 238 }
> >
> > idx is ca->n and ca->n is MAX_ELEMS-1, then the above if statement
> > becomes true, therefore idx+1 is MAX_ELEMS which is just beyond
> > the valid range.
>
> Is that really the memmove() where we die?  It looks like
> it has a special case for dealing with the last element.

Yes it is, I have a stacktrace in production which clearly shows
del_elem.isra.1+0x34/0x40, unlike the one I triggered via fake
PFN's. I can show you if you want, it is on 4.14, so very unlikely
it is interesting to anyone here.



>
> But this:
>
> 296 ret = find_elem(ca, pfn, );
> 297 if (ret < 0) {
> 298 /*
> 299  * Shift range [to-end] to make room for one more element.
> 300  */
> 301 memmove((void *)>array[to + 1],
> 302 (void *)>array[to],
> 303 (ca->n - to) * sizeof(u64));
> 304
>
> looks like it also needs a special case for when "to ==  MAX_ELEMS-1"
> (we don't need to memmove).

In the specific I talked about, find_elem() returns non-negative, so it won't
even hit this branch. Remember how it passed the if check
(this_pfn == pfn)? ;)


Thanks.


linux-next: build warning after merge of the block tree

2019-04-16 Thread Stephen Rothwell
Hi all,

After merging the block tree, today's linux-next build (x86_64
allmodconfig) produced this warning:

fs/f2fs/node.c: In function 'f2fs_remove_inode_page':
fs/f2fs/node.c:1193:47: warning: format '%zu' expects argument of type 
'size_t', but argument 5 has type 'blkcnt_t' {aka 'long long unsigned int'} 
[-Wformat=]
"Inconsistent i_blocks, ino:%lu, iblocks:%zu",
 ~~^
 %llu
inode->i_ino, inode->i_blocks);
  ~~~   

Introduced by commit

  90ae238d9dac ("f2fs: fix to avoid panic in f2fs_remove_inode_page()")

from the f2fs tree interacting with commit

  72deb455b5ec ("block: remove CONFIG_LBDAF")

-- 
Cheers,
Stephen Rothwell


pgpnPxYAPdrPt.pgp
Description: OpenPGP digital signature


[PATCH -next] regulator: stm32-pwr: Make some symbols static

2019-04-16 Thread Wei Yongjun
Fixes the following sparse warnings:

drivers/regulator/stm32-pwr.c:35:5: warning:
 symbol 'ready_mask_table' was not declared. Should it be static?
drivers/regulator/stm32-pwr.c:47:5: warning:
 symbol 'stm32_pwr_reg_is_ready' was not declared. Should it be static?
drivers/regulator/stm32-pwr.c:57:5: warning:
 symbol 'stm32_pwr_reg_is_enabled' was not declared. Should it be static?

Fixes: 6cdae8173f67 ("regulator: Add support for stm32 power regulators")
Signed-off-by: Wei Yongjun 
---
 drivers/regulator/stm32-pwr.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/regulator/stm32-pwr.c b/drivers/regulator/stm32-pwr.c
index e434b26d4c8b..222d593d76a2 100644
--- a/drivers/regulator/stm32-pwr.c
+++ b/drivers/regulator/stm32-pwr.c
@@ -32,7 +32,7 @@ enum {
STM32PWR_REG_NUM_REGS
 };
 
-u32 ready_mask_table[STM32PWR_REG_NUM_REGS] = {
+static u32 ready_mask_table[STM32PWR_REG_NUM_REGS] = {
[PWR_REG11] = REG_1_1_RDY,
[PWR_REG18] = REG_1_8_RDY,
[PWR_USB33] = USB_3_3_RDY,
@@ -44,7 +44,7 @@ struct stm32_pwr_reg {
u32 ready_mask;
 };
 
-int stm32_pwr_reg_is_ready(struct regulator_dev *rdev)
+static int stm32_pwr_reg_is_ready(struct regulator_dev *rdev)
 {
struct stm32_pwr_reg *priv = rdev_get_drvdata(rdev);
u32 val;
@@ -54,7 +54,7 @@ int stm32_pwr_reg_is_ready(struct regulator_dev *rdev)
return (val & priv->ready_mask);
 }
 
-int stm32_pwr_reg_is_enabled(struct regulator_dev *rdev)
+static int stm32_pwr_reg_is_enabled(struct regulator_dev *rdev)
 {
struct stm32_pwr_reg *priv = rdev_get_drvdata(rdev);
u32 val;





[PATCH -next] regulator: stm32-pwr: Fix return value check in stm32_pwr_regulator_probe()

2019-04-16 Thread Wei Yongjun
In case of error, the function of_iomap() returns NULL pointer not
ERR_PTR(). The IS_ERR() test in the return value check should be
replaced with NULL test.

Fixes: 6cdae8173f67 ("regulator: Add support for stm32 power regulators")
Signed-off-by: Wei Yongjun 
---
 drivers/regulator/stm32-pwr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/regulator/stm32-pwr.c b/drivers/regulator/stm32-pwr.c
index e434b26d4c8b..d4b9d6accfad 100644
--- a/drivers/regulator/stm32-pwr.c
+++ b/drivers/regulator/stm32-pwr.c
@@ -140,9 +140,9 @@ static int stm32_pwr_regulator_probe(struct platform_device 
*pdev)
int i, ret = 0;
 
base = of_iomap(np, 0);
-   if (IS_ERR(base)) {
+   if (!base) {
dev_err(>dev, "Unable to map IO memory\n");
-   return PTR_ERR(base);
+   return -ENOMEM;
}
 
config.dev = >dev;





Re: [greybus-dev] [PATCH v2] Staging: greybus: Cleanup in greybus driver

2019-04-16 Thread Viresh Kumar
On 16-04-19, 17:13, Madhumitha Prabakaran wrote:
> Fix a blank line after structure declarations. Also, convert
> macros into inline functions in order to maintain Linux kernel
> coding style based on which the inline function is
> preferable over the macro.
> 
> Blank line fixes are suggested by checkpatch.pl

This was all intentional and we use such macros everywhere in kernel. No need
for such a patch, sorry.

-- 
viresh


Re: [PATCH v4 2/2] dt-bindings: cpufreq: Document allwinner,sun50i-h6-operating-points

2019-04-16 Thread Viresh Kumar
On 16-04-19, 11:52, Yangtao Li wrote:
> Allwinner Process Voltage Scaling Tables defines the voltage and
> frequency value based on the speedbin blown in the efuse combination.
> The sunxi-cpufreq-nvmem driver reads the efuse value from the SoC to
> provide the OPP framework with required information.
> This is used to determine the voltage and frequency value for each
> OPP of operating-points-v2 table when it is parsed by the OPP framework.
> 
> The "allwinner,sun50i-h6-operating-points" DT extends the
> "operating-points-v2"
> with following parameters:
> - nvmem-cells (NVMEM area containig the speedbin information)
> - opp-microvolt-: voltage in micro Volts.
>   At runtime, the platform can pick a  and matching
>   opp-microvolt- property.
>   HW: :
>   sun50iw-h6  speed0 speed1 speed2
> 
> Signed-off-by: Yangtao Li 
> ---
>  .../bindings/opp/sun50i-nvmem-cpufreq.txt | 167 ++
>  1 file changed, 167 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/opp/sun50i-nvmem-cpufreq.txt

I would need an Ack from Rob before applying this.

-- 
viresh


Re: [PATCH v4 0/2] cpufreq: Add sunxi nvmem based CPU scaling driver

2019-04-16 Thread Viresh Kumar
On 16-04-19, 11:52, Yangtao Li wrote:
> Add sunxi nvmem based CPU scaling driver, refers to qcom-cpufreq-kryo.
> 
> Yangtao Li (2):
>   cpufreq: Add sunxi nvmem based CPU scaling driver
>   dt-bindings: cpufreq: Document allwinner,sun50i-h6-operating-points
> 
>  .../bindings/opp/sun50i-nvmem-cpufreq.txt | 167 +
>  MAINTAINERS   |   7 +
>  drivers/cpufreq/Kconfig.arm   |  12 +
>  drivers/cpufreq/Makefile  |   1 +
>  drivers/cpufreq/cpufreq-dt-platdev.c  |   2 +
>  drivers/cpufreq/sun50i-cpufreq-nvmem.c| 226 ++
>  6 files changed, 415 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/opp/sun50i-nvmem-cpufreq.txt
>  create mode 100644 drivers/cpufreq/sun50i-cpufreq-nvmem.c
> 
> ---
> v4:
> -I removed this sunxi_cpufreq_soc_data structure for now.

Why this change ?

> -Convert to less generic name.
> -Update soc_bin xlate.
> v3:
> -update changelog and title
> -convert compatibles to allwinner,cpu-operating-points-v2
> -document the valid names for opp-microvolt-
> v2:
> -update changelog
> -convert to dev_pm_opp_set_prop_name instead of
>  dev_pm_opp_set_supported_hw
> -some change in OPP Node  
> ---
> 2.17.0

-- 
viresh


[PATCH] i2c: imx: correct the method of getting private data in notifier_call

2019-04-16 Thread Anson Huang
The way of getting private imx_i2c_struct in i2c_imx_clk_notifier_call()
is incorrect, should use clk_change_nb element to get correct address
and avoid below kernel dump during POST_RATE_CHANGE notify by clk
framework:

Unable to handle kernel paging request at virtual address 03ef1488
pgd = (ptrval)
[03ef1488] *pgd=
Internal error: Oops: 5 [#1] PREEMPT SMP ARM
Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
Workqueue: events reduce_bus_freq_handler
PC is at i2c_imx_set_clk+0x10/0xb8
LR is at i2c_imx_clk_notifier_call+0x20/0x28
pc : [<806a893c>]lr : [<806a8a04>]psr: a0080013
sp : bf399dd8  ip : bf3432ac  fp : bf7c1dc0
r10: 0002  r9 :   r8 : 
r7 : 03ef1480  r6 : bf399e50  r5 :   r4 : 
r3 : bf025300  r2 : bf399e50  r1 : 00b71b00  r0 : bf399be8
Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
Control: 10c5387d  Table: 4e03004a  DAC: 0051
Process kworker/2:1 (pid: 38, stack limit = 0x(ptrval))
Stack: (0xbf399dd8 to 0xbf39a000)
9dc0:   806a89e4 
9de0:  bf399e50 0002 806a8a04 806a89e4 80142900  
9e00: bf34ef18 bf34ef04   bf399e50 80142d84  bf399e6c
9e20: bf34ef00 80f214c4 bf025300 0002 80f08d08 bf017480  80142df0
9e40:  80166ed8 80c27638 8045de58 bf352340 03ef1480 00b71b00 0f82e242
9e60: bf025300 0002 03ef1480 80f60e5c 0001 8045edf0 0002 8045eb08
9e80: bf025300 0002 03ef1480 8045ee10 03ef1480 8045eb08 bf01be40 0002
9ea0: 03ef1480 8045ee10 07de2900 8045eb08 bf01b780 0002 07de2900 8045ee10
9ec0: 80c27898 bf399ee4 bf020a80 0002 1f78a400 8045ee10 80f60e5c 80460514
9ee0: 80f60e5c bf01b600 bf01b480 80460460 0f82e242 bf383a80 bf383a00 80f60e5c
9f00:  bf7c1dc0 80f60e70 80460564 80f60df0 80f60d24 80f60df0 8011e72c
9f20:  80f60df0 80f60e6c bf7c4f00  8011e7ac bf274000 8013bd84
9f40: bf7c1dd8 80f03d00 bf274000 bf7c1dc0 bf274014 bf7c1dd8 80f03d00 bf398000
9f60: 0008 8013bfb4  bf25d100 bf25d0c0  bf274000 8013bf88
9f80: bf25d11c bf0cfebc  8014140c bf25d0c0 801412ec  
9fa0:    801010e8    
9fc0:        
9fe0:     0013   
[<806a893c>] (i2c_imx_set_clk) from [<806a8a04>] 
(i2c_imx_clk_notifier_call+0x20/0x28)
[<806a8a04>] (i2c_imx_clk_notifier_call) from [<80142900>] 
(notifier_call_chain+0x44/0x84)
[<80142900>] (notifier_call_chain) from [<80142d84>] 
(__srcu_notifier_call_chain+0x44/0x98)
[<80142d84>] (__srcu_notifier_call_chain) from [<80142df0>] 
(srcu_notifier_call_chain+0x18/0x20)
[<80142df0>] (srcu_notifier_call_chain) from [<8045de58>] 
(__clk_notify+0x78/0xa4)
[<8045de58>] (__clk_notify) from [<8045edf0>] (__clk_recalc_rates+0x60/0xb4)
[<8045edf0>] (__clk_recalc_rates) from [<8045ee10>] 
(__clk_recalc_rates+0x80/0xb4)
Code: e92d40f8 e5903298 e59072a0 e1530001 (e5975008)
---[ end trace fc7f5514b97b6cbb ]---

Fixes: 90ad2cbe88c2("i2c: imx: use clk notifier for rate changes")
Signed-off-by: Anson Huang 
---
 drivers/i2c/busses/i2c-imx.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/i2c/busses/i2c-imx.c b/drivers/i2c/busses/i2c-imx.c
index c0c3043..fd70b11 100644
--- a/drivers/i2c/busses/i2c-imx.c
+++ b/drivers/i2c/busses/i2c-imx.c
@@ -515,9 +515,9 @@ static int i2c_imx_clk_notifier_call(struct notifier_block 
*nb,
 unsigned long action, void *data)
 {
struct clk_notifier_data *ndata = data;
-   struct imx_i2c_struct *i2c_imx = container_of(>clk,
+   struct imx_i2c_struct *i2c_imx = container_of(nb,
  struct imx_i2c_struct,
- clk);
+ clk_change_nb);
 
if (action & POST_RATE_CHANGE)
i2c_imx_set_clk(i2c_imx, ndata->new_rate);
-- 
2.7.4



Re: [RFC][PATCH] tracing/x86: Save CR2 before tracing irqsoff on error_entry

2019-04-16 Thread He Zhe
>From the last replies in the thread, it seems some work is going on. May I ask
when we can possibly roughly have a fix or workaround?

Thanks,
Zhe


On 3/21/19 10:15 AM, Steven Rostedt wrote:
> From: Steven Rostedt (VMware) 
>
> He Zhe reported a crash by enabling trace events and selecting
> "userstacktrace" which will read the stack of userspace for every trace
> event recorded. Zhe narrowed it down to:
>
>  c3bc8fd637a9 ("tracing: Centralize preemptirq tracepoints and unify their 
> usage")
>
> With the problem config, I was able to also reproduce the error. I
> narrowed it down to just having to do the following:
>
>  # cd /sys/kernel/tracing
>  # echo 1 > options/userstacktrace
>  # echo 1 > events/preemptirq/irq_disable/enable
>
> And sure enough, I triggered a crash. Well, it was systemd crashing
> with a bad memory access??
>
>  systemd-journal[537]: segfault at ed8cb8 ip 7f7fffc9fef5 sp 
> 7ffc4062cb10 error 7
>
> And it would crash similarly each time I tried it, but always at a
> different place. After spending the day on this, I finally figured it
> out. The bug is happening in entry_64.S right after error_entry.
> There's two TRACE_IRQS_OFF in that code path, which if I comment out,
> the bug goes away. Then it dawned on me that the crash always happens
> when systemd does a normal page fault. We had this bug before, and it
> was with the exception trace points.
>
> The issue is that a tracepoint can fault (reading vmalloc or whatever).
> And doing a userspace stack trace most definitely will fault. But if we
> are coming from a legitimate page fault, the address of that fault (in
> the CR2 register) will be lost if we fault before we get to the page
> fault handler. That's exactly what is happening.
>
> To solve this, a TRACE_IRQS_OFF_CR2 (and ON for consistency) was added
> that saves the CR2 register. A new trace_hardirqs_off_thunk_cr2 is
> created that stores the cr2 register, calls the
> trace_hardirqs_off_caller, then on return restores the cr2 register if
> it changed, before returning.
>
> On my tests this fixes the issue. I just want to know if this is a
> legitimate fix or if someone can come up with a better fix?
>
> Note: this also saves the exception context just like the
> do_page_fault() function does.
>
> Note2: This only gets enabled when lockdep or irq tracing is enabled,
> which is not recommended for production environments.
>
> Link: 
> http://lkml.kernel.org/r/897cf5cf-fc24-8a64-cb28-847f2d2e6...@windriver.com
>
> Fixes: c3bc8fd637a9 ("tracing: Centralize preemptirq tracepoints and unify 
> their usage")
> Signed-off-by: Steven Rostedt (VMware) 
> ---
> diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
> index 7bc105f47d21..7edffec328e2 100644
> --- a/arch/x86/entry/common.c
> +++ b/arch/x86/entry/common.c
> @@ -292,6 +292,32 @@ __visible void do_syscall_64(unsigned long nr, struct 
> pt_regs *regs)
>  
>   syscall_return_slowpath(regs);
>  }
> +
> +extern void trace_hardirqs_on_caller(unsigned long caller_addr);
> +__visible void trace_hardirqs_on_caller_cr2(unsigned long caller_addr)
> +{
> + unsigned long address = read_cr2(); /* Get the faulting address */
> + enum ctx_state prev_state;
> +
> + prev_state = exception_enter();
> + trace_hardirqs_on_caller(caller_addr);
> + if (address != read_cr2())
> + write_cr2(address);
> + exception_exit(prev_state);
> +}
> +
> +extern void trace_hardirqs_off_caller(unsigned long caller_addr);
> +__visible void trace_hardirqs_off_caller_cr2(unsigned long caller_addr)
> +{
> + unsigned long address = read_cr2(); /* Get the faulting address */
> + enum ctx_state prev_state;
> +
> + prev_state = exception_enter();
> + trace_hardirqs_off_caller(caller_addr);
> + if (address != read_cr2())
> + write_cr2(address);
> + exception_exit(prev_state);
> +}
>  #endif
>  
>  #if defined(CONFIG_X86_32) || defined(CONFIG_IA32_EMULATION)
> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> index 1f0efdb7b629..73ddf24fed3e 100644
> --- a/arch/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -1248,12 +1248,12 @@ ENTRY(error_entry)
>* we fix gsbase, and we should do it before enter_from_user_mode
>* (which can take locks).
>*/
> - TRACE_IRQS_OFF
> + TRACE_IRQS_OFF_CR2
>   CALL_enter_from_user_mode
>   ret
>  
>  .Lerror_entry_done:
> - TRACE_IRQS_OFF
> + TRACE_IRQS_OFF_CR2
>   ret
>  
>   /*
> diff --git a/arch/x86/entry/thunk_64.S b/arch/x86/entry/thunk_64.S
> index be36bf4e0957..1300b53b98cb 100644
> --- a/arch/x86/entry/thunk_64.S
> +++ b/arch/x86/entry/thunk_64.S
> @@ -41,6 +41,8 @@
>  #ifdef CONFIG_TRACE_IRQFLAGS
>   THUNK trace_hardirqs_on_thunk,trace_hardirqs_on_caller,1
>   THUNK trace_hardirqs_off_thunk,trace_hardirqs_off_caller,1
> + THUNK trace_hardirqs_on_thunk_cr2,trace_hardirqs_on_caller_cr2,1
> + THUNK 

Re: [PATCH 1/2] ras: fix an off-by-one error in __find_elem()

2019-04-16 Thread Luck, Tony
On Tue, Apr 16, 2019 at 04:47:55PM -0700, Cong Wang wrote:
> 229 static void del_elem(struct ce_array *ca, int idx)
> 230 {
> 231 /* Save us a function call when deleting the last element. */
> 232 if (ca->n - (idx + 1))
> 233 memmove((void *)>array[idx],
> 234 (void *)>array[idx + 1],
> 235 (ca->n - (idx + 1)) * sizeof(u64));
> 236
> 237 ca->n--;
> 238 }
> 
> idx is ca->n and ca->n is MAX_ELEMS-1, then the above if statement
> becomes true, therefore idx+1 is MAX_ELEMS which is just beyond
> the valid range.

Is that really the memmove() where we die?  It looks like
it has a special case for dealing with the last element.

But this:

296 ret = find_elem(ca, pfn, );
297 if (ret < 0) {
298 /*
299  * Shift range [to-end] to make room for one more element.
300  */
301 memmove((void *)>array[to + 1],
302 (void *)>array[to],
303 (ca->n - to) * sizeof(u64));
304

looks like it also needs a special case for when "to ==  MAX_ELEMS-1"
(we don't need to memmove).

-Tony


[GIT PULL] Hyper-V commits for 5.1

2019-04-16 Thread Sasha Levin
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA512

The following changes since commit 9e98c678c2d6ae3a17cb2de55d17f69dddaa231b:

  Linux 5.1-rc1 (2019-03-17 14:22:26 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux.git 
tags/hyperv-fixes-signed

for you to fetch changes up to a0033bd1eae4650b69be07c17cb87393da584563:

  Drivers: hv: vmbus: Remove the undesired put_cpu_ptr() in hv_synic_cleanup() 
(2019-04-13 09:36:35 -0400)

- 
Three fixes:

 1. Fix for a race condition in the hyper-v ringbuffer code by Kimberly
Brown.
 2. Fix to show monitor data only when monitor pages are actually
allocated, also by Kimberly Brown.
 3. Fix cpu reference counting in the vmbus code by Dexuan Cui.

- 
Dexuan Cui (1):
  Drivers: hv: vmbus: Remove the undesired put_cpu_ptr() in 
hv_synic_cleanup()

Kimberly Brown (4):
  Drivers: hv: vmbus: Expose monitor data only when monitor pages are used
  Drivers: hv: vmbus: Refactor chan->state if statement
  Drivers: hv: vmbus: Set ring_info field to 0 and remove memset
  Drivers: hv: vmbus: Fix race condition with new ring_buffer_info mutex

 Documentation/ABI/stable/sysfs-bus-vmbus |  12 ++-
 drivers/hv/channel_mgmt.c|   3 +
 drivers/hv/hv.c  |   1 -
 drivers/hv/hyperv_vmbus.h|   3 +
 drivers/hv/ring_buffer.c |  22 +++-
 drivers/hv/vmbus_drv.c   | 166 +--
 include/linux/hyperv.h   |   7 +-
 7 files changed, 175 insertions(+), 39 deletions(-)
-BEGIN PGP SIGNATURE-

iQIzBAEBCgAdFiEE4n5dijQDou9mhzu83qZv95d3LNwFAly2gjwACgkQ3qZv95d3
LNwtLg//UPVpBuilk7pzyumaEYNNeVcq781h/cghAOxiTDPlCI/qAkshdPj9ePAI
UWmBl/lUTXWcN26Xb2NLF9VulXfTFCiphQdgYMQEVslnOvbYLMLkIOOkrkovT6ie
OE5x1j+2S2uF+nwZvLq+FYuS5OiCBx8De1HaCUJQjh1dnrcdjCBTk+idE12tdqa6
pXO32CBKLUK0AEo9yHfeyg9RRsXV89u0wDEOhS160sqx5B6o1TevGOxkPDDQKrKG
mo80aY17MV3ljcODt4pqCNz1rITV6wwZ2oTz1sEv+5nMsqGZ9dcpXYQiQp0huaVl
/j4v/xY9gUDhGYvixoT8Mn87zc7y0y6o+VMcoM7YO7NJkbDA3XN6UaGzXdon061U
w25JBpWdHmot+5PxNafp708V7OBcWpbCoEFhSKvzKcYZlnGo4OA0om966v1WsW2K
ssXEPS4qdwtRM1dyK9c/Lt9HgaZWXzpIzj2bMxVjYPMeAmOd6W0QlfnuZuxYHmRd
aoZswMwFiC2lyOoHtbo50EzeKLwxVju+ToHFTf2jRXnjsLvDXkE6FmX9DSYnwvnE
Vs5qCvW8ynOPhTWAn6eKl0Ysb12+m36w+MEol7JTUlw4XPw7ahHsuxsN4TJQnshD
mZYG2Mrc38B/YOytT0asT5Z9U5ob9U7446I2Rn3XbdjVs9EEBAA=
=gXIY
-END PGP SIGNATURE-


[PATCH v3 2/2] platform/chrome: wilco_ec: Add USB PowerShare Policy control

2019-04-16 Thread Nick Crews
USB PowerShare is a policy which affects charging via the special
USB PowerShare port (marked with a small lightning bolt or battery icon)
when in low power states:
- In S0, the port will always provide power.
- In S0ix, if power_share is enabled, then power will be supplied to
  the port when on AC or if battery is > 50%. Else no power is supplied.
- In S5, if power_share is enabled, then power will be supplied to
  the port when on AC. Else no power is supplied.

v3 changes:
- Drop a silly blank line
- Use val > 1 instead of val != 0 && val != 1
v2 changes:
- Move documentation to Documentation/ABI/testing/sysfs-platform-wilco-ec
- Zero out reserved bytes in requests.

Signed-off-by: Nick Crews 
---
 .../ABI/testing/sysfs-platform-wilco-ec   | 16 
 drivers/platform/chrome/wilco_ec/sysfs.c  | 92 +++
 2 files changed, 108 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-platform-wilco-ec 
b/Documentation/ABI/testing/sysfs-platform-wilco-ec
index e074c203cd32..0c07d5e0b737 100644
--- a/Documentation/ABI/testing/sysfs-platform-wilco-ec
+++ b/Documentation/ABI/testing/sysfs-platform-wilco-ec
@@ -9,3 +9,19 @@ Description:
Input should be parseable by kstrtou8() to 0 or 1.
Output will be either "0\n" or "1\n".
 
+What:  /sys/bus/platform/devices/GOOG000C\:00/usb_power_share
+Date:  April 2019
+KernelVersion: 5.2
+Description:
+   Control the USB PowerShare Policy. USB PowerShare is a policy
+   which affects charging via the special USB PowerShare port
+   (marked with a small lightning bolt or battery icon) when in
+   low power states:
+   - In S0, the port will always provide power.
+   - In S0ix, if power_share is enabled, then power will be
+ supplied to the port when on AC or if battery is > 50%.
+ Else no power is supplied.
+   - In S5, if power_share is enabled, then power will be supplied
+ to the port when on AC. Else no power is supplied.
+
+   Input should be either "0" or "1".
diff --git a/drivers/platform/chrome/wilco_ec/sysfs.c 
b/drivers/platform/chrome/wilco_ec/sysfs.c
index 959b5da2eb16..14e1eee95d42 100644
--- a/drivers/platform/chrome/wilco_ec/sysfs.c
+++ b/drivers/platform/chrome/wilco_ec/sysfs.c
@@ -23,6 +23,26 @@ struct boot_on_ac_request {
u8 reserved7;
 } __packed;
 
+#define CMD_USB_POWER_SHARE 0x39
+
+enum usb_power_share_op {
+   POWER_SHARE_GET = 0,
+   POWER_SHARE_SET = 1,
+};
+
+struct usb_power_share_request {
+   u8 cmd; /* Always CMD_USB_POWER_SHARE */
+   u8 reserved;
+   u8 op;  /* One of enum usb_power_share_op */
+   u8 val; /* When setting, either 0 or 1 */
+} __packed;
+
+struct usb_power_share_response {
+   u8 reserved;
+   u8 status;  /* Set by EC to 0 on success, other value on failure */
+   u8 val; /* When getting, set by EC to either 0 or 1 */
+} __packed;
+
 static ssize_t boot_on_ac_store(struct device *dev,
struct device_attribute *attr,
const char *buf, size_t count)
@@ -57,8 +77,80 @@ static ssize_t boot_on_ac_store(struct device *dev,
 
 static DEVICE_ATTR_WO(boot_on_ac);
 
+static int send_usb_power_share(struct wilco_ec_device *ec,
+   struct usb_power_share_request *rq,
+   struct usb_power_share_response *rs)
+{
+   struct wilco_ec_message msg;
+   int ret;
+
+   memset(, 0, sizeof(msg));
+   msg.type = WILCO_EC_MSG_LEGACY;
+   msg.request_data = rq;
+   msg.request_size = sizeof(*rq);
+   msg.response_data = rs;
+   msg.response_size = sizeof(*rs);
+   ret = wilco_ec_mailbox(ec, );
+   if (ret < 0)
+   return ret;
+   if (rs->status)
+   return -EIO;
+
+   return 0;
+}
+
+static ssize_t usb_power_share_show(struct device *dev,
+   struct device_attribute *attr, char *buf)
+{
+   struct wilco_ec_device *ec = dev_get_drvdata(dev);
+   struct usb_power_share_request rq;
+   struct usb_power_share_response rs;
+   int ret;
+
+   memset(, 0, sizeof(rq));
+   rq.cmd = CMD_USB_POWER_SHARE;
+   rq.op = POWER_SHARE_GET;
+
+   ret = send_usb_power_share(ec, , );
+   if (ret < 0)
+   return ret;
+
+   return sprintf(buf, "%d\n", rs.val);
+}
+
+static ssize_t usb_power_share_store(struct device *dev,
+struct device_attribute *attr,
+const char *buf, size_t count)
+{
+   struct wilco_ec_device *ec = dev_get_drvdata(dev);
+   struct usb_power_share_request rq;
+   struct usb_power_share_response rs;
+   int ret;
+   u8 val;
+
+   ret = kstrtou8(buf, 10, );
+   if (ret < 0)
+   

[PATCH v3 1/2] platform/chrome: wilco_ec: Add Boot on AC support

2019-04-16 Thread Nick Crews
Boot on AC is a policy which makes the device boot from S5 when AC
power is connected. This is useful for users who want to run their
device headless or with a dock.

v3 changes:
- Add docstring to wilco_ec_add_sysfs()
- Tweak a comment
- Use val > 1 instead of val != 0 && val != 1
v2 changes:
- Move documentation to Documentation/ABI/testing/sysfs-platform-wilco-ec

Signed-off-by: Nick Crews 
---
 .../ABI/testing/sysfs-platform-wilco-ec   | 11 +++
 drivers/platform/chrome/wilco_ec/Makefile |  2 +-
 drivers/platform/chrome/wilco_ec/core.c   |  9 +++
 drivers/platform/chrome/wilco_ec/sysfs.c  | 77 +++
 include/linux/platform_data/wilco-ec.h| 12 +++
 5 files changed, 110 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/ABI/testing/sysfs-platform-wilco-ec
 create mode 100644 drivers/platform/chrome/wilco_ec/sysfs.c

diff --git a/Documentation/ABI/testing/sysfs-platform-wilco-ec 
b/Documentation/ABI/testing/sysfs-platform-wilco-ec
new file mode 100644
index ..e074c203cd32
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-platform-wilco-ec
@@ -0,0 +1,11 @@
+What:  /sys/bus/platform/devices/GOOG000C\:00/boot_on_ac
+Date:  April 2019
+KernelVersion: 5.2
+Description:
+   Boot on AC is a policy which makes the device boot from S5
+   when AC power is connected. This is useful for users who
+   want to run their device headless or with a dock.
+
+   Input should be parseable by kstrtou8() to 0 or 1.
+   Output will be either "0\n" or "1\n".
+
diff --git a/drivers/platform/chrome/wilco_ec/Makefile 
b/drivers/platform/chrome/wilco_ec/Makefile
index 063e7fb4ea17..1281dd7737c4 100644
--- a/drivers/platform/chrome/wilco_ec/Makefile
+++ b/drivers/platform/chrome/wilco_ec/Makefile
@@ -1,6 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0
 
-wilco_ec-objs  := core.o mailbox.o
+wilco_ec-objs  := core.o mailbox.o sysfs.o
 obj-$(CONFIG_WILCO_EC) += wilco_ec.o
 wilco_ec_debugfs-objs  := debugfs.o
 obj-$(CONFIG_WILCO_EC_DEBUGFS) += wilco_ec_debugfs.o
diff --git a/drivers/platform/chrome/wilco_ec/core.c 
b/drivers/platform/chrome/wilco_ec/core.c
index 05e1e2be1c91..abd15d04e57b 100644
--- a/drivers/platform/chrome/wilco_ec/core.c
+++ b/drivers/platform/chrome/wilco_ec/core.c
@@ -89,8 +89,16 @@ static int wilco_ec_probe(struct platform_device *pdev)
goto unregister_debugfs;
}
 
+   ret = wilco_ec_add_sysfs(ec);
+   if (ret < 0) {
+   dev_err(dev, "Failed to create sysfs entries: %d", ret);
+   goto unregister_rtc;
+   }
+
return 0;
 
+unregister_rtc:
+   platform_device_unregister(ec->rtc_pdev);
 unregister_debugfs:
if (ec->debugfs_pdev)
platform_device_unregister(ec->debugfs_pdev);
@@ -102,6 +110,7 @@ static int wilco_ec_remove(struct platform_device *pdev)
 {
struct wilco_ec_device *ec = platform_get_drvdata(pdev);
 
+   wilco_ec_remove_sysfs(ec);
platform_device_unregister(ec->rtc_pdev);
if (ec->debugfs_pdev)
platform_device_unregister(ec->debugfs_pdev);
diff --git a/drivers/platform/chrome/wilco_ec/sysfs.c 
b/drivers/platform/chrome/wilco_ec/sysfs.c
new file mode 100644
index ..959b5da2eb16
--- /dev/null
+++ b/drivers/platform/chrome/wilco_ec/sysfs.c
@@ -0,0 +1,77 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright 2019 Google LLC
+ *
+ * Sysfs properties to view and modify EC-controlled features on Wilco devices.
+ * The entries will appear under /sys/bus/platform/devices/GOOG000C:00/
+ *
+ * See Documentation/ABI/testing/sysfs-platform-wilco-ec for more information.
+ */
+
+#include 
+#include 
+
+#define CMD_KB_CMOS0x7C
+#define SUB_CMD_KB_CMOS_AUTO_ON0x03
+
+struct boot_on_ac_request {
+   u8 cmd; /* Always CMD_KB_CMOS */
+   u8 reserved1;
+   u8 sub_cmd; /* Always SUB_CMD_KB_CMOS_AUTO_ON */
+   u8 reserved3to5[3];
+   u8 val; /* Either 0 or 1 */
+   u8 reserved7;
+} __packed;
+
+static ssize_t boot_on_ac_store(struct device *dev,
+   struct device_attribute *attr,
+   const char *buf, size_t count)
+{
+   struct wilco_ec_device *ec = dev_get_drvdata(dev);
+   struct boot_on_ac_request rq;
+   struct wilco_ec_message msg;
+   int ret;
+   u8 val;
+
+   ret = kstrtou8(buf, 10, );
+   if (ret < 0)
+   return ret;
+   if (val > 1)
+   return -EINVAL;
+
+   memset(, 0, sizeof(rq));
+   rq.cmd = CMD_KB_CMOS;
+   rq.sub_cmd = SUB_CMD_KB_CMOS_AUTO_ON;
+   rq.val = val;
+
+   memset(, 0, sizeof(msg));
+   msg.type = WILCO_EC_MSG_LEGACY;
+   msg.request_data = 
+   msg.request_size = sizeof(rq);
+   ret = 

Re: [PATCH v2 02/19] PM / devfreq: tegra: Replace readl-writel with relaxed versions

2019-04-16 Thread Chanwoo Choi
Hi,

On 19. 4. 15. 오후 11:54, Dmitry Osipenko wrote:
> There is no need to insert memory barrier on each readl/writel
> invocation, hence use the relaxed versions.
> 
> Signed-off-by: Dmitry Osipenko 
> ---
>  drivers/devfreq/tegra-devfreq.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/devfreq/tegra-devfreq.c b/drivers/devfreq/tegra-devfreq.c
> index 02905978abe1..0c0909fba545 100644
> --- a/drivers/devfreq/tegra-devfreq.c
> +++ b/drivers/devfreq/tegra-devfreq.c
> @@ -191,23 +191,23 @@ static struct tegra_actmon_emc_ratio 
> actmon_emc_ratios[] = {
>  
>  static u32 actmon_readl(struct tegra_devfreq *tegra, u32 offset)
>  {
> - return readl(tegra->regs + offset);
> + return readl_relaxed(tegra->regs + offset);
>  }
>  
>  static void actmon_writel(struct tegra_devfreq *tegra, u32 val, u32 offset)
>  {
> - writel(val, tegra->regs + offset);
> + writel_relaxed(val, tegra->regs + offset);
>  }
>  
>  static u32 device_readl(struct tegra_devfreq_device *dev, u32 offset)
>  {
> - return readl(dev->regs + offset);
> + return readl_relaxed(dev->regs + offset);
>  }
>  
>  static void device_writel(struct tegra_devfreq_device *dev, u32 val,
> u32 offset)
>  {
> - writel(val, dev->regs + offset);
> + writel_relaxed(val, dev->regs + offset);
>  }
>  
>  static unsigned long do_percent(unsigned long val, unsigned int pct)
> 

Reviewed-by: Chanwoo Choi 

-- 
Best Regards,
Chanwoo Choi
Samsung Electronics


Re: [PATCH v2 06/19] PM / devfreq: tegra: Fix missed error checking on devfreq initialization failure

2019-04-16 Thread Chanwoo Choi
Hi,

On 19. 4. 16. 오후 11:29, Dmitry Osipenko wrote:
> 16.04.2019 5:32, Chanwoo Choi пишет:
>> Hi,
>>
>> patch6/7/8/9 are for handling of exception handling in probe() function.
>> Actually, I'm not sure that there are special reason to split out
>> the patches. I think that you can squash patch6/7/8/9 to only one patch.
> 
> Indeed, I was rebasing and reordering patches multiple times and looks like 
> there is no reason not to squash these patches now.
> 
>> Also, even if patch6/7/8/9 handle the exception handling in probe(),
>> the tegra_devfreq_probe() doesn't support the restoring sequence
>> when fail happen. I think that if you want to fix the fail case of probe(),
>> please add the restoring sequence about following function.
>> - clk_disable_unprepare()
>> - clk_notifier_unregister()
>> - dev_pm_opp_remove()
> 
> When all of 6/7/8/9 patches are applied, the clk_notifier_register() becomes 
> the last invocation of the probe function and clk_enable() is kept at the 
> first place of the probe order. Hence the sequence you're suggesting is 
> already incorrect because error-unwinding order usually should be opposite to 
> the probe order. It looks to me that the current final result of these 
> patches is already correct.

You're right. When I replied it, I have not considered the order.
Sorry, it made you some confusion.

> 
> 


-- 
Best Regards,
Chanwoo Choi
Samsung Electronics


[PATCH v2] proc/sysctl: add shared variables for range check

2019-04-16 Thread Matteo Croce
In the sysctl code the proc_dointvec_minmax() function is often used to
validate the user supplied value between an allowed range. This function
uses the extra1 and extra2 members from struct ctl_table as minimum and
maximum allowed value.

On sysctl handler declaration, in every source file there are some readonly
variables containing just an integer which address is assigned to the
extra1 and extra2 members, so the sysctl range is enforced.

The special values 0, 1 and INT_MAX are very often used as range boundary,
leading duplication of variables like zero=0, one=1, int_max=INT_MAX in
different source files:

$ git grep -E '\.extra[12].*&(zero|one|int_max)\b' |wc -l
245

This patch adds three const variables for the most commonly used values,
and use them instead of creating a local one for every object file.

Signed-off-by: Matteo Croce 
---
 arch/s390/appldata/appldata_base.c|  15 +-
 arch/s390/kernel/topology.c   |   6 +-
 arch/x86/entry/vdso/vdso32-setup.c|   7 +-
 arch/x86/kernel/itmt.c|   6 +-
 drivers/base/firmware_loader/fallback_table.c |  11 +-
 drivers/gpu/drm/i915/i915_perf.c  |   8 +-
 drivers/hv/vmbus_drv.c|   6 +-
 drivers/s390/char/sclp_async.c|   7 +-
 drivers/tty/tty_ldisc.c   |   6 +-
 drivers/xen/balloon.c |   7 +-
 fs/eventpoll.c|   3 +-
 fs/notify/inotify/inotify_user.c  |   8 +-
 fs/proc/proc_sysctl.c |   5 +
 include/linux/sysctl.h|   4 +
 ipc/ipc_sysctl.c  |  35 ++--
 kernel/pid_namespace.c|   3 +-
 kernel/sysctl.c   | 193 +-
 kernel/ucount.c   |   6 +-
 net/core/neighbour.c  |  20 +-
 net/core/sysctl_net_core.c|  34 ++-
 net/dccp/sysctl.c |  16 +-
 net/ipv4/sysctl_net_ipv4.c|  58 +++---
 net/ipv6/addrconf.c   |   6 +-
 net/ipv6/route.c  |   7 +-
 net/ipv6/sysctl_net_ipv6.c|   8 +-
 net/mpls/af_mpls.c|  10 +-
 net/netfilter/ipvs/ip_vs_ctl.c|   3 +-
 net/rxrpc/sysctl.c|   9 +-
 net/sctp/sysctl.c |  35 ++--
 net/sunrpc/xprtrdma/transport.c   |   3 +-
 security/keys/sysctl.c|  26 ++-
 security/loadpin/loadpin.c|   6 +-
 security/yama/yama_lsm.c  |   3 +-
 33 files changed, 261 insertions(+), 319 deletions(-)

diff --git a/arch/s390/appldata/appldata_base.c 
b/arch/s390/appldata/appldata_base.c
index e4b58240ec53..82ae75b5ead6 100644
--- a/arch/s390/appldata/appldata_base.c
+++ b/arch/s390/appldata/appldata_base.c
@@ -220,15 +220,13 @@ appldata_timer_handler(struct ctl_table *ctl, int write,
   void __user *buffer, size_t *lenp, loff_t *ppos)
 {
int timer_active = appldata_timer_active;
-   int zero = 0;
-   int one = 1;
int rc;
struct ctl_table ctl_entry = {
.procname   = ctl->procname,
.data   = _active,
.maxlen = sizeof(int),
-   .extra1 = ,
-   .extra2 = ,
+   .extra1 = (void *)_zero,
+   .extra2 = (void *)_one,
};
 
rc = proc_douintvec_minmax(_entry, write, buffer, lenp, ppos);
@@ -255,13 +253,12 @@ appldata_interval_handler(struct ctl_table *ctl, int 
write,
   void __user *buffer, size_t *lenp, loff_t *ppos)
 {
int interval = appldata_interval;
-   int one = 1;
int rc;
struct ctl_table ctl_entry = {
.procname   = ctl->procname,
.data   = ,
.maxlen = sizeof(int),
-   .extra1 = ,
+   .extra1 = (void *)_one,
};
 
rc = proc_dointvec_minmax(_entry, write, buffer, lenp, ppos);
@@ -289,13 +286,11 @@ appldata_generic_handler(struct ctl_table *ctl, int write,
struct list_head *lh;
int rc, found;
int active;
-   int zero = 0;
-   int one = 1;
struct ctl_table ctl_entry = {
.data   = ,
.maxlen = sizeof(int),
-   .extra1 = ,
-   .extra2 = ,
+   .extra1 = (void *)_zero,
+   .extra2 = (void *)_one,
};
 
found = 0;
diff --git a/arch/s390/kernel/topology.c b/arch/s390/kernel/topology.c
index 8964a3f60aad..347eb433c0a5 100644
--- a/arch/s390/kernel/topology.c
+++ b/arch/s390/kernel/topology.c
@@ -587,15 +587,13 

Re: [PATCH v2 05/19] PM / devfreq: tegra: Replace write memory barrier with the read barrier

2019-04-16 Thread Chanwoo Choi
On 19. 4. 16. 오후 10:57, Dmitry Osipenko wrote:
> 16.04.2019 11:00, Chanwoo Choi пишет:
>> Hi,
>>
>> On 19. 4. 15. 오후 11:54, Dmitry Osipenko wrote:
>>> The write memory barrier isn't needed because the BUS buffer is flushed
>>> by read after write that happens after the removed wmb(), we will also
>>> use readl() instead of the relaxed version to ensure that read is indeed
>>> completed.
>>>
>>> Signed-off-by: Dmitry Osipenko 
>>> ---
>>>  drivers/devfreq/tegra-devfreq.c | 3 +--
>>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/devfreq/tegra-devfreq.c 
>>> b/drivers/devfreq/tegra-devfreq.c
>>> index d62fb1b0d9bb..f0f0d78f6cbf 100644
>>> --- a/drivers/devfreq/tegra-devfreq.c
>>> +++ b/drivers/devfreq/tegra-devfreq.c
>>> @@ -243,8 +243,7 @@ static void tegra_devfreq_update_wmark(struct 
>>> tegra_devfreq *tegra,
>>>  static void actmon_write_barrier(struct tegra_devfreq *tegra)
>>>  {
>>> /* ensure the update has reached the ACTMON */
>>> -   wmb();
>>> -   actmon_readl(tegra, ACTMON_GLB_STATUS);
>>> +   readl(tegra->regs + ACTMON_GLB_STATUS);
>>
>> I think that this meaning of actmon_write_barrier() keeps
>> the order of 'store' assembly command without the execution change
>> from compiler optimization by using the wmb().
> 
> The IO mapped memory is strongly-ordered on ARM, hence all readl/writel 
> accesses are guaranteed to be ordered by default. I think wmb() here is just 
> a cut-n-pasted relic from old downstream driver.

OK.

> 
>> But, this patch edits it as following:
>> The result of the following two cases are same?
>>
>> [original code]
>>  wmb()
>>  read_relaxed()
>>
>> [new code by this patch]
>>  readl_relaxed()
>>  rmb()
> 
> Yes, the result is the same. The wmb() is not just about IO accesses, but 
> about all kind of memory accesses and at least on Tegra30 it is quite 
> expensive operation because it translates into L2XO cache syncing 
> (arm_heavy_mb) + dsb(). It should be more efficient to flush out writes with 
> a read-back and then wait for that read operation to be completed.
> 
> 

OK. Thanks for explanation.

Reviewed-by: Chanwoo Choi 


-- 
Best Regards,
Chanwoo Choi
Samsung Electronics


[PATCH v4 2/2] power_supply: platform/chrome: wilco_ec: Add charging config driver

2019-04-16 Thread Nick Crews
Add control of the charging algorithm used on Wilco devices.
See Documentation/ABI/testing/sysfs-class-power-wilco for the
userspace interface and other info.

v4 changes:
-Move implementation from
 drivers/platform/chrome/wilco_ec/charge_config.c to
 drivers/power/supply/wilco_charger.c
-Move drivers/platform/chrome/wilco_ec/properties.h to
 include/linux/platform_data/wilco-ec-properties.h
-Remove parentheses in switch statement in psp_val_to_charge_mode()
-Check for any negatvie return code from psp_val_to_charge_mode()
 instead of just -EINVAL so its less brittle
-Tweak comments in wilco-ec-properties.h
v3 changes:
-Add this changelog
-Fix commit message tags
v2 changes:
-Update Documentation to say KernelVersion 5.2
-Update Documentation to explain Trickle mode better.
-rename things from using *PCC* to *CHARGE*
-Split up conversions between POWER_SUPPLY_PROP_CHARGE_TYPE values
and Wilco EC codes
-Use devm_ flavor of power_supply_register(), which simplifies things
-Add extra error checking on property messages received from the EC
-Fix bug in memcpy() calls in properties.c
-Refactor fill_property_id()
-Add valid input checks to charge_type
-Properly convert charge_type when get()ting

Signed-off-by: Nick Crews 
---
 .../ABI/testing/sysfs-class-power-wilco   |  30 +++
 drivers/platform/chrome/wilco_ec/Kconfig  |   9 +
 drivers/platform/chrome/wilco_ec/Makefile |   2 +
 drivers/platform/chrome/wilco_ec/core.c   |  16 ++
 drivers/platform/chrome/wilco_ec/properties.c | 125 
 drivers/power/supply/wilco_charger.c  | 190 ++
 .../linux/platform_data/wilco-ec-properties.h |  68 +++
 include/linux/platform_data/wilco-ec.h|   2 +
 8 files changed, 442 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-class-power-wilco
 create mode 100644 drivers/platform/chrome/wilco_ec/properties.c
 create mode 100644 drivers/power/supply/wilco_charger.c
 create mode 100644 include/linux/platform_data/wilco-ec-properties.h

diff --git a/Documentation/ABI/testing/sysfs-class-power-wilco 
b/Documentation/ABI/testing/sysfs-class-power-wilco
new file mode 100644
index ..7f3b01310476
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-class-power-wilco
@@ -0,0 +1,30 @@
+What:  /sys/class/power_supply/wilco_charger/charge_type
+Date:  April 2019
+KernelVersion: 5.2
+Description:
+   What charging algorithm to use:
+
+   Standard: Fully charges battery at a standard rate.
+   Adaptive: Battery settings adaptively optimized based on
+   typical battery usage pattern.
+   Fast: Battery charges over a shorter period.
+   Trickle: Extends battery lifespan, intended for users who
+   primarily use their Chromebook while connected to AC.
+   Custom: A low and high threshold percentage is specified.
+   Charging begins when level drops below
+   charge_control_start_threshold, and ceases when
+   level is above charge_control_end_threshold.
+
+What:  
/sys/class/power_supply/wilco_charger/charge_control_start_threshold
+Date:  April 2019
+KernelVersion: 5.2
+Description:
+   Used when charge_type="Custom", as described above. Measured in
+   percentages. The valid range is [50, 95].
+
+What:  
/sys/class/power_supply/wilco_charger/charge_control_end_threshold
+Date:  April 2019
+KernelVersion: 5.2
+Description:
+   Used when charge_type="Custom", as described above. Measured in
+   percentages. The valid range is [55, 100].
diff --git a/drivers/platform/chrome/wilco_ec/Kconfig 
b/drivers/platform/chrome/wilco_ec/Kconfig
index e09e4cebe9b4..1c427830bd57 100644
--- a/drivers/platform/chrome/wilco_ec/Kconfig
+++ b/drivers/platform/chrome/wilco_ec/Kconfig
@@ -18,3 +18,12 @@ config WILCO_EC_DEBUGFS
  manipulation and allow for testing arbitrary commands.  This
  interface is intended for debug only and will not be present
  on production devices.
+
+config WILCO_EC_CHARGE_CNTL
+   tristate "Enable charging control"
+   depends on WILCO_EC
+   help
+ If you say Y here, you get support to control the charging
+ routines performed by the Wilco Embedded Controller.
+ Further information can be found in
+ Documentation/ABI/testing/sysfs-class-power-wilco)
diff --git a/drivers/platform/chrome/wilco_ec/Makefile 
b/drivers/platform/chrome/wilco_ec/Makefile
index 063e7fb4ea17..7e980f56f793 100644
--- a/drivers/platform/chrome/wilco_ec/Makefile
+++ b/drivers/platform/chrome/wilco_ec/Makefile
@@ -4,3 +4,5 @@ wilco_ec-objs   := core.o mailbox.o
 obj-$(CONFIG_WILCO_EC) += wilco_ec.o
 wilco_ec_debugfs-objs  := debugfs.o
 obj-$(CONFIG_WILCO_EC_DEBUGFS) += wilco_ec_debugfs.o

[PATCH v4 1/2] power_supply: Add more charge types and CHARGE_CONTROL_* properties

2019-04-16 Thread Nick Crews
Add "Standard", "Adaptive", and "Custom" modes to the charge_type
property, to expand the existing "Trickle" and "Fast" modes.
In addition, add POWER_SUPPLY_PROP_CHARGE_CONTROL_START_THRESHOLD
and POWER_SUPPLY_PROP_CHARGE_CONTROL_END_THRESHOLD properties, to expand
the existing CHARGE_CONTROL_* properties. I am adding them in order
to support a new Chrome OS device, but these properties should be
general enough that they can be used on other devices.

The meaning of "Standard" is obvious, but "Adaptive" and "Custom" are
more tricky: "Adaptive" means that the charge controller uses some
custom algorithm to change the charge type automatically, with no
configuration needed. "Custom" means that the charge controller uses the
POWER_SUPPLY_PROP_CHARGE_CONTROL_* properties as configuration for some
other algorithm. For example, in the use case that I am supporting,
this means the battery begins charging when the percentage
level drops below POWER_SUPPLY_PROP_CHARGE_CONTROL_START_THRESHOLD and
charging ceases when the percentage level goes above
POWER_SUPPLY_PROP_CHARGE_CONTROL_END_THRESHOLD.

v4 changes:
- Add documentation for the new properties, and add documentation for
  the the previously missing charge_control_limit and
  charge_control_limit_max properties.

Signed-off-by: Nick Crews 
---
 Documentation/ABI/testing/sysfs-class-power | 51 +++--
 drivers/power/supply/power_supply_sysfs.c   |  4 +-
 include/linux/power_supply.h| 10 +++-
 3 files changed, 59 insertions(+), 6 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-class-power 
b/Documentation/ABI/testing/sysfs-class-power
index 5e23e22dce1b..b77e30b9014e 100644
--- a/Documentation/ABI/testing/sysfs-class-power
+++ b/Documentation/ABI/testing/sysfs-class-power
@@ -114,15 +114,60 @@ Description:
Access: Read
Valid values: Represented in microamps
 
+What:  /sys/class/power_supply//charge_control_limit
+Date:  Oct 2012
+Contact:   linux...@vger.kernel.org
+Description:
+   Maximum allowable charging current. Used for charge rate
+   throttling for thermal cooling or improving battery health.
+
+   Access: Read, Write
+   Valid values: Represented in microamps
+
+What:  /sys/class/power_supply//charge_control_limit_max
+Date:  Oct 2012
+Contact:   linux...@vger.kernel.org
+Description:
+   Maximum legal value for the charge_control_limit property.
+
+   Access: Read
+   Valid values: Represented in microamps
+
+What:  
/sys/class/power_supply//charge_control_start_threshold
+Date:  April 2019
+Contact:   linux...@vger.kernel.org
+Description:
+   Represents a battery percentage level, below which charging will
+   begin.
+
+   Access: Read, Write
+   Valid values: 0 - 100 (percent)
+
+What:  
/sys/class/power_supply//charge_control_end_threshold
+Date:  April 2019
+Contact:   linux...@vger.kernel.org
+Description:
+   Represents a battery percentage level, above which charging will
+   stop.
+
+   Access: Read, Write
+   Valid values: 0 - 100 (percent)
+
 What:  /sys/class/power_supply//charge_type
 Date:  July 2009
 Contact:   linux...@vger.kernel.org
 Description:
Represents the type of charging currently being applied to the
-   battery.
+   battery. "Trickle", "Fast", and "Standard" all mean different
+   charging speeds. "Adaptive" means that the charger uses some
+   algorithm to adjust the charge rate dynamically, without
+   any user configuration required. "Custom" means that the charger
+   uses the charge_control_* properties as configuration for some
+   different algorithm.
 
-   Access: Read
-   Valid values: "Unknown", "N/A", "Trickle", "Fast"
+   Access: Read, Write
+   Valid values: "Unknown", "N/A", "Trickle", "Fast", "Standard",
+ "Adaptive", "Custom"
 
 What:  /sys/class/power_supply//charge_term_current
 Date:  July 2014
diff --git a/drivers/power/supply/power_supply_sysfs.c 
b/drivers/power/supply/power_supply_sysfs.c
index dce24f596160..6104a3f03d46 100644
--- a/drivers/power/supply/power_supply_sysfs.c
+++ b/drivers/power/supply/power_supply_sysfs.c
@@ -56,7 +56,7 @@ static const char * const power_supply_status_text[] = {
 };
 
 static const char * const power_supply_charge_type_text[] = {
-   "Unknown", "N/A", "Trickle", "Fast"
+   "Unknown", "N/A", "Trickle", "Fast", "Standard", "Adaptive", "Custom"
 };
 
 static const char * const power_supply_health_text[] = {
@@ -274,6 +274,8 @@ static struct device_attribute power_supply_attrs[] = {

Re: [PATCH] init: Initialize jump labels before command line option parsing

2019-04-16 Thread Guenter Roeck
On Tue, Apr 16, 2019 at 5:04 PM Dan Williams  wrote:
>
> On Tue, Apr 16, 2019 at 4:44 PM Andrew Morton  
> wrote:
> >
> > On Tue, 16 Apr 2019 13:54:04 -0700 Dan Williams  
> > wrote:
> >
> > > When a module option, or core kernel argument, toggles a static-key it
> > > requires jump labels to be initialized early. While x86, PowerPC, and
> > > ARM64 arrange for jump_label_init() to be called before parse_args(),
> > > ARM does not.
> > >
> > >   Kernel command line: rdinit=/sbin/init page_alloc.shuffle=1 panic=-1 
> > > console=ttyAMA0,115200 page_alloc.shuffle=1
> > >   [ cut here ]
> > >   WARNING: CPU: 0 PID: 0 at ./include/linux/jump_label.h:303
> > >   page_alloc_shuffle+0x12c/0x1ac
> > >   static_key_enable(): static key 'page_alloc_shuffle_key+0x0/0x4' used
> > >   before call to jump_label_init()
> > >   Modules linked in:
> > >   CPU: 0 PID: 0 Comm: swapper Not tainted
> > >   5.1.0-rc4-next-20190410-3-g3367c36ce744 #1
> > >   Hardware name: ARM Integrator/CP (Device Tree)
> > >   [] (unwind_backtrace) from [] (show_stack+0x10/0x18)
> > >   [] (show_stack) from [] (dump_stack+0x18/0x24)
> > >   [] (dump_stack) from [] (__warn+0xe0/0x108)
> > >   [] (__warn) from [] (warn_slowpath_fmt+0x44/0x6c)
> > >   [] (warn_slowpath_fmt) from []
> > >   (page_alloc_shuffle+0x12c/0x1ac)
> > >   [] (page_alloc_shuffle) from [] 
> > > (shuffle_store+0x28/0x48)
> > >   [] (shuffle_store) from [] (parse_args+0x1f4/0x350)
> > >   [] (parse_args) from [] (start_kernel+0x1c0/0x488)
> > >
> > > Move the fallback call to jump_label_init() to occur before
> > > parse_args(). The redundant calls to jump_label_init() in other archs
> > > are left intact in case they have static key toggling use cases that are
> > > even earlier than option parsing.
> >
> > Has it been confirmed that this fixes
> > mm-shuffle-initial-free-memory-to-improve-memory-side-cache-utilization.patch
> > on beaglebone-black?
>
> This only fixes dynamically enabling the shuffling on 32-bit ARM.
> Guenter happened to run without the mm-only 'force-enable-always'
> patch and when he went to use the command line option to enable it he
> hit the jump-label warning.
>

For my part I have not seen the original failure; it seems that the
kernelci logs are no longer present. As such, I neither know how it
looks like nor how to (try to) reproduce it. I just thought it might
be worthwhile to run the patch through my boot tests to see if
anything pops up. From the feedback I got, though, it sounded like the
failure is/was very omap2 specific, so I would not be able to
reproduce it anyway.

Guenter

> The original beaglebone-black failure was something different and
> avoided this case because the jump-label was never used.
>
> I am otherwise unable to recreate the failure on either the original
> failing -next, nor on v5.1-rc5 plus the latest state of the patches. I
> need from someone who is actually still seeing the failure so they can
> compare with the configuration that is working for me. For reference
> that's the Yocto beaglebone-black defconfig:
>
> https://github.com/jumpnow/meta-bbb/blob/thud/recipes-kernel/linux/linux-stable-5.0/beaglebone/defconfig


Re: [PATCH] x86/mm/mem_encrypt: fix a crash with kmemleak_scan

2019-04-16 Thread Qian Cai
On 4/16/19 1:11 PM, Borislav Petkov wrote:
>> +/*
>> + * Inform kmemleak about the hole in the .bss section since the
>> + * corresponding pages will be unmapped with DEBUG_PAGEALLOC=y.
>> + */
>> +kmemleak_free_part((void *)vaddr, vaddr_end - vaddr);
>>  free_init_pages("unused decrypted", vaddr, vaddr_end);
> 
> I don't understand what the logic here is: we have a couple of other
> free_init_pages() calls but they don't have kmemleak_free_part() in
> front.
> 
> Now, if kmemleak needs to be told that memory is getting freed, why
> isn't kmemleak_free_part() called in free_init_pages() ?
> 
> This needs more explanation.

kmemleak_init() will register the data/bss sections (only register
.data..ro_after_init if not within .data) and then kmemleak_scan() will scan
those address and dereference them looking for pointer referencing. If
free_init_pages() free and unmap pages in those sections, kmemleak_scan() will
trigger a crash if referencing one of those addresses.

I checked other x86 free_init_pages() call sites and don't see anything obvious
where another place to free an address in those sections.

__smp_locks[]: .smp_locks
__initramfs_start[]: .init
__init_begin: .init
from text_end to rodata_start: contains .notes, __ex_table
from rodata_end to _sdata: .pci_fixup, __ksymtab, __ksymtab_gpl etc

So, I don't think it need to add kmemleak_free_part() in every free_init_pages()
calls.


Re: [PATCH] KVM: vmx: print more APICv fields in dump_vmcs

2019-04-16 Thread Krish Sadhukhan




On 04/15/2019 06:35 AM, Paolo Bonzini wrote:

The SVI, RVI, virtual-APIC page address and APIC-access page address fields
were left out of dump_vmcs.  Add them.

Signed-off-by: Paolo Bonzini 
---
  arch/x86/kvm/vmx/vmx.c | 13 +++--
  1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index ab432a930ae8..f8054dc1de65 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -5723,8 +5723,17 @@ static void dump_vmcs(void)
if (secondary_exec_control & SECONDARY_EXEC_TSC_SCALING)
pr_err("TSC Multiplier = 0x%016llx\n",
   vmcs_read64(TSC_MULTIPLIER));
-   if (cpu_based_exec_ctrl & CPU_BASED_TPR_SHADOW)
-   pr_err("TPR Threshold = 0x%02x\n", vmcs_read32(TPR_THRESHOLD));
+   if (cpu_based_exec_ctrl & CPU_BASED_TPR_SHADOW) {
+   if (secondary_exec_control & 
SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY) {
+   u16 status = vmcs_read16(GUEST_INTR_STATUS);
+   pr_err("SVI|RVI = %02x|%02x ", status >> 8, status & 
0xff);
+   }
+   pr_err(KERN_CONT "TPR Threshold = 0x%02x\n", 
vmcs_read32(TPR_THRESHOLD));
+   if (secondary_exec_control & 
(SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
+ 
SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE))
+   pr_err("APIC-access addr = 0x%016llx ", 
vmcs_read64(APIC_ACCESS_ADDR));
+   pr_err(KERN_CONT "virt-APIC addr=0x%016llx\n", 
vmcs_read64(VIRTUAL_APIC_PAGE_ADDR));
+   }
if (pin_based_exec_ctrl & PIN_BASED_POSTED_INTR)
pr_err("PostedIntrVec = 0x%02x\n", vmcs_read16(POSTED_INTR_NV));
if ((secondary_exec_control & SECONDARY_EXEC_ENABLE_EPT))


Reviewed-by: Krish Sadhukhan 


Re: [PATCH v2 12/19] PM / devfreq: tegra: Avoid inconsistency of current frequency value

2019-04-16 Thread Chanwoo Choi
Hi,

On 19. 4. 17. 오전 12:40, Dmitry Osipenko wrote:
> 16.04.2019 10:15, Chanwoo Choi пишет:
>> Hi,
>>
>> On 19. 4. 15. 오후 11:54, Dmitry Osipenko wrote:
>>> The frequency value potentially could change in-between. It doesn't
>>> cause any real problem at all right now, but that could change in the
>>> future. Hence let's avoid the inconsistency.
>>>
>>> Signed-off-by: Dmitry Osipenko 
>>> ---
>>>  drivers/devfreq/tegra-devfreq.c | 6 --
>>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/devfreq/tegra-devfreq.c 
>>> b/drivers/devfreq/tegra-devfreq.c
>>> index a668e4fbc874..f1a6f951813a 100644
>>> --- a/drivers/devfreq/tegra-devfreq.c
>>> +++ b/drivers/devfreq/tegra-devfreq.c
>>> @@ -496,13 +496,15 @@ static int tegra_devfreq_get_dev_status(struct device 
>>> *dev,
>>>  {
>>> struct tegra_devfreq *tegra = dev_get_drvdata(dev);
>>> struct tegra_devfreq_device *actmon_dev;
>>> +   unsigned long cur_freq;
>>>  
>>> -   stat->current_frequency = tegra->cur_freq * KHZ;
>>> +   cur_freq = READ_ONCE(tegra->cur_freq);
>>>  
>>> /* To be used by the tegra governor */
>>> stat->private_data = tegra;
>>>  
>>> /* The below are to be used by the other governors */
>>> +   stat->current_frequency = cur_freq * KHZ;
>>>  
>>> actmon_dev = >devices[MCALL];
>>>  
>>> @@ -513,7 +515,7 @@ static int tegra_devfreq_get_dev_status(struct device 
>>> *dev,
>>> stat->busy_time *= 100 / BUS_SATURATION_RATIO;
>>>  
>>> /* Number of cycles in a sampling period */
>>> -   stat->total_time = ACTMON_SAMPLING_PERIOD * tegra->cur_freq;
>>> +   stat->total_time = ACTMON_SAMPLING_PERIOD * cur_freq;
>>>  
>>> stat->busy_time = min(stat->busy_time, stat->total_time);
>>>  
>>>
>>
>> The read/write access of tegra->cur_freq is in the single routine
>> of update_devfreq() as following. I think that there are no any
>> potential problem about the inconsistency of tegra->cur_freq.
> 
> No, that's wrong assumption. The tegra->cur_freq is changed by the clock 
> notifier that runs asynchronously with the devfreq driver when EMC clock rate 
> is changed by something else in the kernel. 
> 
>> IMHO, if there are no any problem now, I'm not sure that we need
>> to apply this patch.
>>
>> update_devfreq()
>> {
>>  devfreq->governor->get_target_freq()
>>  devfreq_update_stats(devfreq)
>>  tegra_devfreq_get_dev_status()
>>  stat->current_frequency = tegra->cur_freq * KHZ;
>>
>>  devfreq_set_target()
>>  tegra_devfreq_target()
>>  clk_set_min_rate(emc_rate, )
>>  tegra_actmon_rate_notify_cb()
>>  tegra->cur_freq = data->new_rate / KHZ;
>>  
>>  clk_set_rate(emc_rate, )
>>  tegra_actmon_rate_notify_cb()
>>  tegra->cur_freq = data->new_rate / KHZ;
>> }
>>
>>
> 
> The cur_freq value is changed by the clock notifier that runs asynchronously 
> with the rest of the devfreq driver. Hence potentially compiler may generate 
> two separate fetches of the cur_freq value, then the clock rate could be 
> changed by other CPU core simultaneously with tegra_devfreq_get_dev_status() 
> or kernel may re-schedule preemptively, changing the clock rate in-between of 
> the two fetches.
> 
> 

Thanks. I understand why have to consider the inconsistency of clock
which is used on the multiple points.

Looks good to me.
Reviewed-by: Chanwoo Choi 

-- 
Best Regards,
Chanwoo Choi
Samsung Electronics


Re: [PATCH] zram: pass down the bvec we need to read into in the work struct

2019-04-16 Thread Sergey Senozhatsky
On (04/16/19 16:53), Andrew Morton wrote:
> > Adding more Cc and stable (i thought this was 5.1 addition). Note that
> > without this patch on arch/kernel where PAGE_SIZE != 4096 userspace
> > could read random memory through a zram block device (thought userspace
> > probably would have no control on the address being read).
> 
> Looks good to me.
> 
> Minchan & Sergey, can you please review?

Sorry.

Looks OK to me.

Reviewed-by: Sergey Senozhatsky 

-ss


Re: [PATCH v2 10/19] PM / devfreq: tegra: Drop primary interrupt handler

2019-04-16 Thread Chanwoo Choi
Hi,

On 19. 4. 17. 오전 12:23, Dmitry Osipenko wrote:
> 16.04.2019 8:56, Chanwoo Choi пишет:
>> Hi,
>>
>> It looks good to me to drop the primary interrupt handler
>> but I have some comments. Please check it.
>>
>> On 19. 4. 15. 오후 11:54, Dmitry Osipenko wrote:
>>> There is no real need in the primary interrupt handler, hence move
>>> everything to the secondary (threaded) handler. In a result locking
>>> is consistent now and there are no potential races with the interrupt
>>> handler because it is protected with the devfreq's mutex.
>>>
>>> Signed-off-by: Dmitry Osipenko 
>>> ---
>>>  drivers/devfreq/tegra-devfreq.c | 61 -
>>>  1 file changed, 22 insertions(+), 39 deletions(-)
>>>
>>> diff --git a/drivers/devfreq/tegra-devfreq.c 
>>> b/drivers/devfreq/tegra-devfreq.c
>>> index 2a1464098200..69b557df5084 100644
>>> --- a/drivers/devfreq/tegra-devfreq.c
>>> +++ b/drivers/devfreq/tegra-devfreq.c
>>> @@ -144,7 +144,6 @@ static struct tegra_devfreq_device_config 
>>> actmon_device_configs[] = {
>>>  struct tegra_devfreq_device {
>>> const struct tegra_devfreq_device_config *config;
>>> void __iomem *regs;
>>> -   spinlock_t lock;
>>>  
>>> /* Average event count sampled in the last interrupt */
>>> u32 avg_count;
>>> @@ -249,11 +248,8 @@ static void actmon_write_barrier(struct tegra_devfreq 
>>> *tegra)
>>>  static void actmon_isr_device(struct tegra_devfreq *tegra,
>>>   struct tegra_devfreq_device *dev)
>>>  {
>>> -   unsigned long flags;
>>> u32 intr_status, dev_ctrl;
>>>  
>>> -   spin_lock_irqsave(>lock, flags);
>>> -
>>> dev->avg_count = device_readl(dev, ACTMON_DEV_AVG_COUNT);
>>> tegra_devfreq_update_avg_wmark(tegra, dev);
>>>  
>>> @@ -302,26 +298,6 @@ static void actmon_isr_device(struct tegra_devfreq 
>>> *tegra,
>>> device_writel(dev, ACTMON_INTR_STATUS_CLEAR, ACTMON_DEV_INTR_STATUS);
>>>  
>>> actmon_write_barrier(tegra);
>>> -
>>> -   spin_unlock_irqrestore(>lock, flags);
>>> -}
>>> -
>>> -static irqreturn_t actmon_isr(int irq, void *data)
>>> -{
>>> -   struct tegra_devfreq *tegra = data;
>>> -   bool handled = false;
>>> -   unsigned int i;
>>> -   u32 val;
>>> -
>>> -   val = actmon_readl(tegra, ACTMON_GLB_STATUS);
>>> -   for (i = 0; i < ARRAY_SIZE(tegra->devices); i++) {
>>> -   if (val & tegra->devices[i].config->irq_mask) {
>>> -   actmon_isr_device(tegra, tegra->devices + i);
>>> -   handled = true;
>>> -   }
>>> -   }
>>> -
>>> -   return handled ? IRQ_WAKE_THREAD : IRQ_NONE;
>>>  }
>>>  
>>>  static unsigned long actmon_cpu_to_emc_rate(struct tegra_devfreq *tegra,
>>> @@ -348,35 +324,46 @@ static void actmon_update_target(struct tegra_devfreq 
>>> *tegra,
>>> unsigned long cpu_freq = 0;
>>> unsigned long static_cpu_emc_freq = 0;
>>> unsigned int avg_sustain_coef;
>>> -   unsigned long flags;
>>> +   u32 avg_count;
>>>  
>>> if (dev->config->avg_dependency_threshold) {
>>> cpu_freq = cpufreq_get(0);
>>> static_cpu_emc_freq = actmon_cpu_to_emc_rate(tegra, cpu_freq);
>>> }
>>>  
>>> -   spin_lock_irqsave(>lock, flags);
>>> -
>>> -   dev->target_freq = dev->avg_count / ACTMON_SAMPLING_PERIOD;
>>> +   avg_count = dev->avg_count;
>>> +   dev->target_freq = avg_count / ACTMON_SAMPLING_PERIOD;
>>
>> Actually, this change is not related to this patch.
>> Please keep the original code.
>>
>>> avg_sustain_coef = 100 * 100 / dev->config->boost_up_threshold;
>>> dev->target_freq = do_percent(dev->target_freq, avg_sustain_coef);
>>> dev->target_freq += dev->boost_freq;
>>>  
>>> -   if (dev->avg_count >= dev->config->avg_dependency_threshold)
>>> +   if (avg_count >= dev->config->avg_dependency_threshold)
>>
>> ditto.
> 
> Good catch, it's a leftover from v1 that I forgot to revert. Thank you.
> 
> [snip]
> 
>>
>> When I review this patch, I have a question
>> about why tegra_actmon_rate_notify_cb is needed.
>>
>> tegra_actmon_rate_notify_cb() do something
>> when the clock rate of emc_clock is changed.
>> I think that 'emc_clock' is changed by this driver.
>> It means that the this driver can catch the change timing
>> of emc_clock rate without notifier. 
> 
> The devfreq driver isn't the only driver of the EMC clock rate. The devfreq 
> driver changes EMC freq dynamically based of on average memory usage 
> activity, but for some hardware units (like display controller for example) 
> there is a requirement for a minimum memory bandwidth (isochronous 
> transactions) and hence when display is waking up from suspend it immediately 
> requires an amount of memory bandwidth that could be higher than ACTMON 
> hardware unit suggests and besides the ACTMON's reaction is delayed by the 
> sampling period (12ms).

Thanks for explaining it. If EMC clock is used on multiple point,
I understand why the clock notifier is necessary.

> 
>> IMO, it is possible to call tegra_devfreq_update_wmark()
>> directly without 

Re: [PATCH v2 17/19] PM / devfreq: tegra: Support Tegra30

2019-04-16 Thread Chanwoo Choi
On 19. 4. 17. 오전 12:49, Dmitry Osipenko wrote:
> 16.04.2019 10:48, Chanwoo Choi пишет:
>> Hi,
>>
>> On 19. 4. 15. 오후 11:55, Dmitry Osipenko wrote:
>>> The devfreq driver can be used on Tegra30 without any code change and
>>> it works perfectly fine, the default Tegra124 parameters are good enough
>>> for Tegra30.
>>>
>>> Signed-off-by: Dmitry Osipenko 
>>> ---
>>>  drivers/devfreq/Kconfig | 4 ++--
>>>  drivers/devfreq/tegra-devfreq.c | 1 +
>>>  2 files changed, 3 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
>>> index a78dffe603c1..78c33ddd4eea 100644
>>> --- a/drivers/devfreq/Kconfig
>>> +++ b/drivers/devfreq/Kconfig
>>> @@ -92,8 +92,8 @@ config ARM_EXYNOS_BUS_DEVFREQ
>>>   This does not yet operate with optimal voltages.
>>>  
>>>  config ARM_TEGRA_DEVFREQ
>>> -   tristate "Tegra DEVFREQ Driver"
>>> -   depends on ARCH_TEGRA_124_SOC
>>> +   tristate "NVIDIA Tegra30+ DEVFREQ Driver"
>>
>> Looks good to me. But, I have a question.
>>
>> 'Tegra30+' expression is enough in order to indicate
>> the support for both tegra30 and tegra124?
>>
>> In my case, it is difficult to catch the meaning
>> of which tegra30+ support the kind of tegra124.
> 
> The ACTMON serves Tegra30, Tegra124 and Tegra210 SoC's. The later Tegra 
> generations also have the ACTMON hardware unit, but most of actual drivers 
> have been moved into firmware on the later gens, including the ACTMON driver. 
> I'll change the wording to "Tegra30/124/210" for clarity. Note that Tegra210 
> support isn't implemented at the moment by the driver, but that may change in 
> the future.

OK. Thanks.

> 
>>> +   depends on ARCH_TEGRA
>>> select PM_OPP
>>> help
>>>   This adds the DEVFREQ driver for the Tegra family of SoCs.
>>> diff --git a/drivers/devfreq/tegra-devfreq.c 
>>> b/drivers/devfreq/tegra-devfreq.c
>>> index f0711d5ad27d..0a2465a58cf5 100644
>>> --- a/drivers/devfreq/tegra-devfreq.c
>>> +++ b/drivers/devfreq/tegra-devfreq.c
>>> @@ -719,6 +719,7 @@ static int tegra_devfreq_remove(struct platform_device 
>>> *pdev)
>>>  }
>>>  
>>>  static const struct of_device_id tegra_devfreq_of_match[] = {
>>> +   { .compatible = "nvidia,tegra30-actmon" },
>>> { .compatible = "nvidia,tegra124-actmon" },
>>> { },
>>>  };
>>>
>>
>> Reviewed-by: Chanwoo Choi 
>>
> 
> Thanks.
> 
> 


-- 
Best Regards,
Chanwoo Choi
Samsung Electronics


Re: [PATCH v2 18/19] PM / devfreq: tegra: Enable COMPILE_TEST for the driver

2019-04-16 Thread Chanwoo Choi
On 19. 4. 17. 오전 12:52, Dmitry Osipenko wrote:
> 16.04.2019 10:43, Chanwoo Choi пишет:
>> On 19. 4. 15. 오후 11:55, Dmitry Osipenko wrote:
>>> The driver's compilation doesn't have any specific dependencies, hence
>>> the COMPILE_TEST option can be supported in Kconfig.
>>>
>>> Signed-off-by: Dmitry Osipenko 
>>> ---
>>>  drivers/devfreq/Kconfig | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
>>> index 78c33ddd4eea..bd6652863e7d 100644
>>> --- a/drivers/devfreq/Kconfig
>>> +++ b/drivers/devfreq/Kconfig
>>> @@ -93,7 +93,7 @@ config ARM_EXYNOS_BUS_DEVFREQ
>>>  
>>>  config ARM_TEGRA_DEVFREQ
>>> tristate "NVIDIA Tegra30+ DEVFREQ Driver"
>>> -   depends on ARCH_TEGRA
>>> +   depends on ARCH_TEGRA || COMPILE_TEST
>>> select PM_OPP
>>> help
>>>   This adds the DEVFREQ driver for the Tegra family of SoCs.
>>>
>>
>> I think that it doesn't need to make it a separate patch.
>> You can merge this patch to patch17 and then drop this patch.
>>
> 
> I'd prefer to have this change as a separate patch because this is a 
> logically distinct change from the patch 17.
> 
> 

OK.

Reviewed-by: Chanwoo Choi 

-- 
Best Regards,
Chanwoo Choi
Samsung Electronics


Re: [PATCH v2 19/19] PM / devfreq: Introduce driver for NVIDIA Tegra20

2019-04-16 Thread Chanwoo Choi
Hi,

On 19. 4. 17. 오전 1:11, Dmitry Osipenko wrote:
> 16.04.2019 11:31, Chanwoo Choi пишет:
>> Hi,
>>
>> On 19. 4. 15. 오후 11:55, Dmitry Osipenko wrote:
>>> Add devfreq driver for NVIDIA Tegra20 SoC's. The driver periodically
>>> reads out Memory Controller counters and adjusts memory frequency based
>>> on the memory clients activity.
>>>
>>> Signed-off-by: Dmitry Osipenko 
>>> ---
>>>  MAINTAINERS   |   8 ++
>>>  drivers/devfreq/Kconfig   |  10 ++
>>>  drivers/devfreq/Makefile  |   1 +
>>>  drivers/devfreq/tegra20-devfreq.c | 177 ++
>>>  4 files changed, 196 insertions(+)
>>>  create mode 100644 drivers/devfreq/tegra20-devfreq.c
>>>
>>> diff --git a/MAINTAINERS b/MAINTAINERS
>>> index 80b59db1b6e4..91f475ec4545 100644
>>> --- a/MAINTAINERS
>>> +++ b/MAINTAINERS
>>> @@ -10056,6 +10056,14 @@ F: include/linux/memblock.h
>>>  F: mm/memblock.c
>>>  F: Documentation/core-api/boot-time-mm.rst
>>>  
>>> +MEMORY FREQUENCY SCALING DRIVER FOR NVIDIA TEGRA20
>>> +M: Dmitry Osipenko 
>>> +L: linux...@vger.kernel.org
>>> +L: linux-te...@vger.kernel.org
>>> +T: git git://git.kernel.org/pub/scm/linux/kernel/git/mzx/devfreq.git
>>> +S: Maintained
>>> +F: drivers/devfreq/tegra20-devfreq.c
>>> +
>>>  MEMORY MANAGEMENT
>>>  L: linux...@kvack.org
>>>  W: http://www.linux-mm.org
>>> diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
>>> index bd6652863e7d..af4c86c4e0f6 100644
>>> --- a/drivers/devfreq/Kconfig
>>> +++ b/drivers/devfreq/Kconfig
>>> @@ -100,6 +100,16 @@ config ARM_TEGRA_DEVFREQ
>>>   It reads ACTMON counters of memory controllers and adjusts the
>>>   operating frequencies and voltages with OPP support.
>>>  
>>> +config ARM_TEGRA20_DEVFREQ
>>> +   tristate "NVIDIA Tegra20 DEVFREQ Driver"
>>> +   depends on (TEGRA_MC && TEGRA20_EMC) || COMPILE_TEST
>>> +   select DEVFREQ_GOV_SIMPLE_ONDEMAND
>>> +   select PM_OPP
>>> +   help
>>> + This adds the DEVFREQ driver for the Tegra20 family of SoCs.
>>> + It reads Memory Controller counters and adjusts the operating
>>> + frequencies and voltages with OPP support.
>>> +
>>>  config ARM_RK3399_DMC_DEVFREQ
>>> tristate "ARM RK3399 DMC DEVFREQ Driver"
>>> depends on ARCH_ROCKCHIP
>>> diff --git a/drivers/devfreq/Makefile b/drivers/devfreq/Makefile
>>> index 32b8d4d3f12c..6fcc5596b8b7 100644
>>> --- a/drivers/devfreq/Makefile
>>> +++ b/drivers/devfreq/Makefile
>>> @@ -11,6 +11,7 @@ obj-$(CONFIG_DEVFREQ_GOV_PASSIVE) += governor_passive.o
>>>  obj-$(CONFIG_ARM_EXYNOS_BUS_DEVFREQ)   += exynos-bus.o
>>>  obj-$(CONFIG_ARM_RK3399_DMC_DEVFREQ)   += rk3399_dmc.o
>>>  obj-$(CONFIG_ARM_TEGRA_DEVFREQ)+= tegra-devfreq.o
>>> +obj-$(CONFIG_ARM_TEGRA20_DEVFREQ)  += tegra20-devfreq.o
>>>  
>>>  # DEVFREQ Event Drivers
>>>  obj-$(CONFIG_PM_DEVFREQ_EVENT) += event/
>>> diff --git a/drivers/devfreq/tegra20-devfreq.c 
>>> b/drivers/devfreq/tegra20-devfreq.c
>>> new file mode 100644
>>> index ..18c9aad7a9d7
>>> --- /dev/null
>>> +++ b/drivers/devfreq/tegra20-devfreq.c
>>> @@ -0,0 +1,177 @@
>>> +// SPDX-License-Identifier: GPL-2.0
>>> +/*
>>> + * NVIDIA Tegra20 devfreq driver
>>> + *
>>> + * Author: Dmitry Osipenko 
>>> + */
>>
>> It doesn't any "Copyright (c) 2019 ..." sentence.
> 
> I'll add one in v3.
> 
>>> +
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +
>>> +#include 
>>
>> I can find the '' header file
>> on mainline branch. But mc.h is included in linux-next.git. 
>>
>> If you don't share the patch related to mc.h,
>> the kernel build will be failed when apply it the devfreq.git
>> on final step. Actually, it should make the immutable branch
>> between two related maintainers in order to remove the build fail.
> 
> The '' header file exists since v3.18 [0], seems you just 
> missed something.
> 
> [0] 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/include/soc/tegra/mc.h?h=v3.18=8918465163171322c77a19d5258a95f56d89d2e4

Sorry. It is my missing point. When I tried to find it,
it is included in the mainline kernel.

> 
>>> +
>>> +#include "governor.h"
>>> +
>>> +#define MC_STAT_CONTROL0x90
>>> +#define MC_STAT_EMC_CLOCK_LIMIT0xa0
>>> +#define MC_STAT_EMC_CLOCKS 0xa4
>>> +#define MC_STAT_EMC_CONTROL0xa8
>>> +#define MC_STAT_EMC_COUNT  0xb8
>>> +
>>> +#define EMC_GATHER_CLEAR   (1 << 8)
>>> +#define EMC_GATHER_ENABLE  (3 << 8)
>>> +
>>> +struct tegra_devfreq {
>>> +   struct devfreq *devfreq;
>>> +   struct clk *clk;
>>> +   void __iomem *regs;
>>> +};
>>> +
>>> +static int tegra_devfreq_target(struct device *dev, unsigned long *freq,
>>> +   u32 flags)
>>> +{
>>> +   struct tegra_devfreq *tegra = dev_get_drvdata(dev);
>>> +   struct dev_pm_opp *opp;
>>> +   unsigned 

Re: [PATCH] arm64: defconfig: Update UFSHCD for Hi3660 soc

2019-04-16 Thread Leo Yan
On Tue, Apr 16, 2019 at 06:02:21PM +0100, Valentin Schneider wrote:
> Commit 7ee7ef24d02d ("scsi: arm64: defconfig: enable configs for Hisilicon 
> ufs")
> set 'CONFIG_SCSI_UFS_HISI=y', but the configs it depends
> on
> 
>   (CONFIG_SCSI_HFSHCD_PLATFORM && CONFIG_SCSI_UFSHCD)
> 
> were left to being built as modules.
> 
> Commit 1f4fa50dd48f ("arm64: defconfig: Regenerate for v4.20") "fixed"
> that by reverting to 'CONFIG_SCSI_UFS_HISI=m'.
> 
> Thing is, if the rootfs is stored in the on-board flash (which
> is the "canonical" way of doing things), we either need these drivers
> to be built-in, or we need to fiddle with an initramfs to access that
> flash and eventually load the modules installed over there.
> 
> The former is the easiest, do that.
> 
> Signed-off-by: Valentin Schneider 

Looks good to me:

Reviewed-by: Leo Yan 

> ---
>  arch/arm64/configs/defconfig | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
> index 2d9c39033c1a..32fb03503b0b 100644
> --- a/arch/arm64/configs/defconfig
> +++ b/arch/arm64/configs/defconfig
> @@ -222,10 +222,10 @@ CONFIG_BLK_DEV_SD=y
>  CONFIG_SCSI_SAS_ATA=y
>  CONFIG_SCSI_HISI_SAS=y
>  CONFIG_SCSI_HISI_SAS_PCI=y
> -CONFIG_SCSI_UFSHCD=m
> -CONFIG_SCSI_UFSHCD_PLATFORM=m
> +CONFIG_SCSI_UFSHCD=y
> +CONFIG_SCSI_UFSHCD_PLATFORM=y
>  CONFIG_SCSI_UFS_QCOM=m
> -CONFIG_SCSI_UFS_HISI=m
> +CONFIG_SCSI_UFS_HISI=y
>  CONFIG_ATA=y
>  CONFIG_SATA_AHCI=y
>  CONFIG_SATA_AHCI_PLATFORM=y
> --
> 2.20.1
> 


ANNOUNCE: pahole v1.13 (More BTF, __aligned__, __packed__)

2019-04-16 Thread Arnaldo Carvalho de Melo
Hi,

The v1.13 release of pahole and its friends is out, available at
the usual places:

Main git repo:

  git://git.kernel.org/pub/scm/devel/pahole/pahole.git

Mirror git repo:

  https://github.com/acmel/dwarves.git

tarball + gpg signature:

  https://fedorapeople.org/~acme/dwarves/dwarves-1.13.tar.xz
  https://fedorapeople.org/~acme/dwarves/dwarves-1.13.tar.sign

Here is a summary of changes:

- BTF

  - Use of the recently introduced BTF deduplication algorithm present in the
Linux kernel's libbpf library, which allows for all the types in a multi
compile unit binary such as vmlinux to be compactly stored, without 
duplicates.

E.g.: from roughly:

$ readelf -SW ../build/v5.1-rc4+/vmlinux | grep .debug_info.*PROGBITS
  [63] .debug_info PROGBITS  1d80be0 c3c18b9 00 0 0 1
$
195 MiB

to:

$ time pahole --btf_encode ../build/v5.1-rc4+/vmlinux 
real0m19.168s
user0m17.707s # On a Lenovo t480s (i7-8650U) SSD
sys 0m1.337s
$

$ readelf -SW ../build/v5.1-rc4+/vmlinux | grep .BTF.*PROGBITS
  [78] .BTFPROGBITS  27b49f61 1e23c3 00 0 0 1
$ 
~2 MiB

  - Introduce a 'btfdiff' utility that prints the output from DWARF and from
BTF, comparing the pretty printed outputs, running it on various linux
kernel images, such as an allyesconfig for ppc64.

Running it on the above 5.1-rc4+ vmlinux:

  $ btfdiff ../build/v5.1-rc4+/vmlinux
  $ 

No differences from the types generated from the DWARF ELF sections to the
ones generated from the BTF ELF section.

  - Add a BTF loader, i.e. 'pahole -F btf' allows pretty printing of structs
and unions in the same fashion as with DWARF info, and since BTF is way
more compact, using it is much faster than using DWARF.

  $ cat ../build/v5.1-rc4+/vmlinux > /dev/null
  $ perf stat -e cycles pahole -F btf ../build/v5.1-rc4+/vmlinux  > 
/dev/null
  
   Performance counter stats for 'pahole -F btf ../build/v5.1-rc4+/vmlinux':
  
 229,712,692  cycles:u  
  
 0.063379597 seconds time elapsed
 0.056265000 seconds user
 0.006911000 seconds sys
  
  $ perf stat -e cycles pahole -F dwarf ../build/v5.1-rc4+/vmlinux  > 
/dev/null
  
   Performance counter stats for 'pahole -F dwarf 
../build/v5.1-rc4+/vmlinux':
  
  49,579,679,466  cycles:u  
  
13.063487352 seconds time elapsed
12.612512000 seconds user
 0.426226000 seconds sys
  $

- Better union support:

  - Allow unions to be specified in pahole in the same fashion as structs

$ pahole -C thread_union ../build/v5.1-rc4+/net/ipv4/tcp.o
union thread_union {
struct task_struct task __attribute__((__aligned__(64))); /* 0 
11008 */
long unsigned int  stack[2048];   /* 0 
16384 */
};
$

- Infer __attribute__((__packed__)) when structs have no alignment holes
  and violate basic types (integer, longs, short integer) natural alignment
  requirements. Several heuristics are used to infer the __packed__
  attribute, see the changeset log for descriptions.

  $ pahole -F btf -C boot_e820_entry ../build/v5.1-rc4+/vmlinux
  struct boot_e820_entry {
  __u64 addr; /* 0 8 */
  __u64 size; /* 8 8 */
  __u32 type; /*16 4 */

/* size: 20, cachelines: 1, members: 3 */
/* last cacheline: 20 bytes */
  } __attribute__((__packed__));
  $ 

  $ pahole -F btf -C lzma_header ../build/v5.1-rc4+/vmlinux
  struct lzma_header {
  uint8_tpos;  /* 0 1 */
  uint32_t   dict_size;/* 1 4 */
  uint64_t   dst_size; /* 5 8 */

  /* size: 13, cachelines: 1, members: 3 */
  /* last cacheline: 13 bytes */
  } __attribute__((__packed__));

- Support DWARF5's DW_AT_alignment, which, together with the __packed__
  attribute inference algorithms produce output that, when compiled, should
  produce structures with layouts that match the original source code.

  See it in action with 'struct task_struct', which will also show some of the
  new information at the struct summary, at the end of the struct:

  $ pahole -C task_struct ../build/v5.1-rc4+/vmlinux | tail -19
/* --- cacheline 103 boundary (6592 bytes) --- */
struct vm_struct   * stack_vm_area;  /*  6592 8 */
refcount_t stack_refcount;   /*  6600 4 */
  
/* XXX 4 bytes hole, try to pack */
  
void * security; /*  6608 8 */
  
/* XXX 40 bytes hole, try to pack */
  
/* --- 

Re: [PATCH v5 1/4] dt-bindings: arm: coresight: Add new compatible for static replicator

2019-04-16 Thread Leo Yan
On Tue, Apr 16, 2019 at 02:18:40PM -0600, Mathieu Poirier wrote:
> Hi Leo,
> 
> On Fri, 12 Apr 2019 at 04:28, Leo Yan  wrote:
> >
> > CoreSight uses below bindings for replicator:
> >
> >   Dynamic replicator, aka. configurable replicator:
> > "arm,coresight-dynamic-replicator", "arm,primecell";
> >
> >   Static replicator, aka. non-configurable replicator:
> > "arm,coresight-replicator";
> >
> > The compatible string "arm,coresight-replicator" is not an explicit
> > naming to express the replicator is 'static'.  To unify the naming
> > convention, this patch introduces a new compatible string
> > "arm,coresight-static-replicator" for the static replicator; the
> > compatible string "arm,coresight-replicator" is kept for backward
> > compatibility, but tag it as obsolete and suggest to use the new
> > compatible string.
> >
> > As result CoreSight replicator have below bindings:
> >
> >   Dynamic replicator:
> > "arm,coresight-dynamic-replicator", "arm,primecell";
> >
> >   Static replicator:
> > "arm,coresight-static-replicator";
> > "arm,coresight-replicator"; (obsolete)
> >
> > Signed-off-by: Leo Yan 
> > ---
> >  Documentation/devicetree/bindings/arm/coresight.txt | 7 +--
> >  1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/Documentation/devicetree/bindings/arm/coresight.txt 
> > b/Documentation/devicetree/bindings/arm/coresight.txt
> > index f8aff65ab921..d02d160fa8ac 100644
> > --- a/Documentation/devicetree/bindings/arm/coresight.txt
> > +++ b/Documentation/devicetree/bindings/arm/coresight.txt
> > @@ -69,7 +69,10 @@ its hardware characteristcs.
> >
> > * compatible: Currently supported value is (note the absence of the
> >   AMBA markee):
> > -   - "arm,coresight-replicator"
> > +   - Coresight Non-configurable Replicator:
> > +   "arm,coresight-static-replicator";
> > +   "arm,coresight-replicator"; (OBSOLETE. For backward
> > +   compatibility and will be removed)
> >
> > * port or ports: see "Graph bindings for Coresight" below.
> >
> > @@ -169,7 +172,7 @@ Example:
> > /* non-configurable replicators don't show up on the
> >  * AMBA bus.  As such no need to add "arm,primecell".
> >  */
> > -   compatible = "arm,coresight-replicator";
> > +   compatible = "arm,coresight-static-replicator";
> >
> > out-ports {
> > #address-cells = <1>;
> > --
> > 2.17.1
> 
> Since this is a binding patch it needs to be sent on its own.

Thanks for reminding, Mathieu.

Since this is the second time you remind me to send DT binding related
patches separately, so I may misunderstand your meaning and want to get
clarification to avoid making the same mistake for many times.

Before I remembered in one patch set we need to organise patches with
sending document patch (or document changing patch) ahead and then
followed by the corresponding code change patch.  So this can give the
reviewers more clear context;  and this also can present the merging
dependency between document change patches and the code change patches.

This is the rule I followed in this patch set and I sent to CoreSight
and DT maintainers (and mailing lists) together.

Please let me know what you think about this?  And also welcome
Rob/Mark's suggestions.

Thanks,
Leo Yan


Re: [PATCH] init: Initialize jump labels before command line option parsing

2019-04-16 Thread Dan Williams
On Tue, Apr 16, 2019 at 4:44 PM Andrew Morton  wrote:
>
> On Tue, 16 Apr 2019 13:54:04 -0700 Dan Williams  
> wrote:
>
> > When a module option, or core kernel argument, toggles a static-key it
> > requires jump labels to be initialized early. While x86, PowerPC, and
> > ARM64 arrange for jump_label_init() to be called before parse_args(),
> > ARM does not.
> >
> >   Kernel command line: rdinit=/sbin/init page_alloc.shuffle=1 panic=-1 
> > console=ttyAMA0,115200 page_alloc.shuffle=1
> >   [ cut here ]
> >   WARNING: CPU: 0 PID: 0 at ./include/linux/jump_label.h:303
> >   page_alloc_shuffle+0x12c/0x1ac
> >   static_key_enable(): static key 'page_alloc_shuffle_key+0x0/0x4' used
> >   before call to jump_label_init()
> >   Modules linked in:
> >   CPU: 0 PID: 0 Comm: swapper Not tainted
> >   5.1.0-rc4-next-20190410-3-g3367c36ce744 #1
> >   Hardware name: ARM Integrator/CP (Device Tree)
> >   [] (unwind_backtrace) from [] (show_stack+0x10/0x18)
> >   [] (show_stack) from [] (dump_stack+0x18/0x24)
> >   [] (dump_stack) from [] (__warn+0xe0/0x108)
> >   [] (__warn) from [] (warn_slowpath_fmt+0x44/0x6c)
> >   [] (warn_slowpath_fmt) from []
> >   (page_alloc_shuffle+0x12c/0x1ac)
> >   [] (page_alloc_shuffle) from [] 
> > (shuffle_store+0x28/0x48)
> >   [] (shuffle_store) from [] (parse_args+0x1f4/0x350)
> >   [] (parse_args) from [] (start_kernel+0x1c0/0x488)
> >
> > Move the fallback call to jump_label_init() to occur before
> > parse_args(). The redundant calls to jump_label_init() in other archs
> > are left intact in case they have static key toggling use cases that are
> > even earlier than option parsing.
>
> Has it been confirmed that this fixes
> mm-shuffle-initial-free-memory-to-improve-memory-side-cache-utilization.patch
> on beaglebone-black?

This only fixes dynamically enabling the shuffling on 32-bit ARM.
Guenter happened to run without the mm-only 'force-enable-always'
patch and when he went to use the command line option to enable it he
hit the jump-label warning.

The original beaglebone-black failure was something different and
avoided this case because the jump-label was never used.

I am otherwise unable to recreate the failure on either the original
failing -next, nor on v5.1-rc5 plus the latest state of the patches. I
need from someone who is actually still seeing the failure so they can
compare with the configuration that is working for me. For reference
that's the Yocto beaglebone-black defconfig:

https://github.com/jumpnow/meta-bbb/blob/thud/recipes-kernel/linux/linux-stable-5.0/beaglebone/defconfig


with due respect

2019-04-16 Thread Mr Duna Wattara
Dear Friend,

I know that this mail will come to you as a surprise as we have never
met before, but need not to worry as I am contacting you independently
of my investigation and no one is informed of this communication.

I need your urgent assistance in transferring the sum of $11.3million
immediately to your private account.The money has been here in our
Bank lying dormant for years now without anybody coming for the claim of it.

I want to release the money to you as the relative to our deceased
customer (the account owner) who died a long with his supposed NEXT OF
KIN since 16th October 2005. The Banking laws here does not allow such
money to stay more than 14 years, because the money will be recalled
to the Bank treasury account as unclaimed fund.

By indicating your interest I will send you the full details on how
the business will be executed.

Please respond urgently and delete if you are not interested.

Best Regards,
Mr. Duna Wattara.


Re: [PATCH] zram: pass down the bvec we need to read into in the work struct

2019-04-16 Thread Andrew Morton
On Wed, 10 Apr 2019 15:43:50 -0400 Jerome Glisse  wrote:

> Adding more Cc and stable (i thought this was 5.1 addition). Note that
> without this patch on arch/kernel where PAGE_SIZE != 4096 userspace
> could read random memory through a zram block device (thought userspace
> probably would have no control on the address being read).

Looks good to me.

Minchan & Sergey, can you please review?


From: Jérôme Glisse 
Subject: zram: pass down the bvec we need to read into in the work struct

When scheduling work item to read page we need to pass down the proper
bvec struct which points to the page to read into.  Before this patch it
uses a randomly initialized bvec (only if PAGE_SIZE != 4096) which is
wrong.

Note that without this patch on arch/kernel where PAGE_SIZE != 4096
userspace could read random memory through a zram block device (thought
userspace probably would have no control on the address being read).

Link: http://lkml.kernel.org/r/20190408183219.26377-1-jgli...@redhat.com
Signed-off-by: Jérôme Glisse 
Reviewed-by: Andrew Morton 
Cc: Minchan Kim 
Cc: Nitin Gupta 
Cc: Sergey Senozhatsky 
Cc: 
Signed-off-by: Andrew Morton 
---

 drivers/block/zram/zram_drv.c |5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- 
a/drivers/block/zram/zram_drv.c~zram-pass-down-the-bvec-we-need-to-read-into-in-the-work-struct
+++ a/drivers/block/zram/zram_drv.c
@@ -774,18 +774,18 @@ struct zram_work {
struct zram *zram;
unsigned long entry;
struct bio *bio;
+   struct bio_vec bvec;
 };
 
 #if PAGE_SIZE != 4096
 static void zram_sync_read(struct work_struct *work)
 {
-   struct bio_vec bvec;
struct zram_work *zw = container_of(work, struct zram_work, work);
struct zram *zram = zw->zram;
unsigned long entry = zw->entry;
struct bio *bio = zw->bio;
 
-   read_from_bdev_async(zram, , entry, bio);
+   read_from_bdev_async(zram, >bvec, entry, bio);
 }
 
 /*
@@ -798,6 +798,7 @@ static int read_from_bdev_sync(struct zr
 {
struct zram_work work;
 
+   work.bvec = *bvec;
work.zram = zram;
work.entry = entry;
work.bio = bio;
_



[PATCH] iio: adc: qcom-spmi-adc5: Fix of-based module autoloading

2019-04-16 Thread Bjorn Andersson
The of_device_id table needs to be registered as module alias in order
for automatic module loading to pick the kernel module based on the
DeviceTree compatible. So add MODULE_DEVICE_TABLE() to make this happen.

Fixes: e13d757279bb ("iio: adc: Add QCOM SPMI PMIC5 ADC driver")
Cc: sta...@vger.kernel.org
Signed-off-by: Bjorn Andersson 
---
 drivers/iio/adc/qcom-spmi-adc5.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/iio/adc/qcom-spmi-adc5.c b/drivers/iio/adc/qcom-spmi-adc5.c
index 6a866cc187f7..21fdcde77883 100644
--- a/drivers/iio/adc/qcom-spmi-adc5.c
+++ b/drivers/iio/adc/qcom-spmi-adc5.c
@@ -664,6 +664,7 @@ static const struct of_device_id adc5_match_table[] = {
},
{ }
 };
+MODULE_DEVICE_TABLE(of, adc5_match_table);
 
 static int adc5_get_dt_data(struct adc5_chip *adc, struct device_node *node)
 {
-- 
2.18.0



Re: [PATCH v3 3/5] powerpc: Use the correct style for SPDX License Identifier

2019-04-16 Thread Andrew Donnellan

On 17/4/19 1:28 am, Nishad Kamdar wrote:

This patch corrects the SPDX License Identifier style
in the powerpc Hardware Architecture related files.

Suggested-by: Joe Perches 
Signed-off-by: Nishad Kamdar 
---
TIL there's a different style for source vs headers... sigh. :( Thanks 
for fixing.


Acked-by: Andrew Donnellan 


  arch/powerpc/include/asm/pnv-ocxl.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/pnv-ocxl.h 
b/arch/powerpc/include/asm/pnv-ocxl.h
index 208b5503f4ed..7de82647e761 100644
--- a/arch/powerpc/include/asm/pnv-ocxl.h
+++ b/arch/powerpc/include/asm/pnv-ocxl.h
@@ -1,4 +1,4 @@
-// SPDX-License-Identifier: GPL-2.0+
+/* SPDX-License-Identifier: GPL-2.0+ */
  // Copyright 2017 IBM Corp.
  #ifndef _ASM_PNV_OCXL_H
  #define _ASM_PNV_OCXL_H



--
Andrew Donnellan  OzLabs, ADL Canberra
andrew.donnel...@au1.ibm.com  IBM Australia Limited



Re: [PATCH 1/2] ras: fix an off-by-one error in __find_elem()

2019-04-16 Thread Cong Wang
On Tue, Apr 16, 2019 at 4:28 PM Luck, Tony  wrote:
>
> On Tue, Apr 16, 2019 at 04:18:57PM -0700, Cong Wang wrote:
> > > The problem case occurs when we've seen enough distinct
> > > errors that we have filled every entry, then we try to
> > > look up a pfn that is larger that any seen before.
> > >
> > > The loop:
> > >
> > > while (min < max) {
> > > ...
> > > }
> > >
> > > will terminate with "min" set to MAX_ELEMS. Then we
> > > execute:
> > >
> > > this_pfn = PFN(ca->array[min]);
> > >
> > > which references beyond the end of the space allocated
> > > for ca->array.
> >
> > Exactly.
>
> Hmmm. But can we ever really have this happen?  The call
> sequence to get here looks like:
>
>
> mutex_lock(_mutex);
>
> if (ca->n == MAX_ELEMS)
> WARN_ON(!del_lru_elem_unlocked(ca));
>
> ret = find_elem(ca, pfn, );
>
> I.e. if the array was all the way full, we delete one element
> before calling find_elem().  So when we get here:
>
> static int __find_elem(struct ce_array *ca, u64 pfn, unsigned int *to)
> {
> u64 this_pfn;
> int min = 0, max = ca->n;
>
> The biggest value "max" can have is MAX_ELEMS-1

This is exactly the explanation for why the crash is inside
memmove() rather than inside find_elem(). del_elem() actually
accesses off-by-two once we pass its 'if' check in line 232:

229 static void del_elem(struct ce_array *ca, int idx)
230 {
231 /* Save us a function call when deleting the last element. */
232 if (ca->n - (idx + 1))
233 memmove((void *)>array[idx],
234 (void *)>array[idx + 1],
235 (ca->n - (idx + 1)) * sizeof(u64));
236
237 ca->n--;
238 }

idx is ca->n and ca->n is MAX_ELEMS-1, then the above if statement
becomes true, therefore idx+1 is MAX_ELEMS which is just beyond
the valid range.

Thanks.


Re: [PATCH 2/2] kernel: use sysctl shared variables for range check

2019-04-16 Thread Andrew Morton
On Wed, 10 Apr 2019 15:59:55 -0700 Kees Cook  wrote:

> On Wed, Apr 10, 2019 at 3:54 PM Matteo Croce  wrote:
> >
> > On Thu, Apr 11, 2019 at 12:34 AM Kees Cook  wrote:
> > >
> > > On Wed, Apr 10, 2019 at 3:30 PM Matteo Croce  wrote:
> > > >
> > > > FYI, this are the stats from my local repo, just to let you the size
> > > > of a series with all the changes in it:
> > > >
> > > > $ git --no-pager log --stat --oneline linus/master
> > > >  2 files changed, 4 insertions(+), 9 deletions(-)
> > > >  2 files changed, 7 insertions(+), 14 deletions(-)
> > > >  6 files changed, 15 insertions(+), 30 deletions(-)
> > > >  1 file changed, 16 insertions(+), 19 deletions(-)
> > > >  1 file changed, 3 insertions(+), 5 deletions(-)
> > > >  3 files changed, 15 insertions(+), 20 deletions(-)
> > > >  12 files changed, 93 insertions(+), 116 deletions(-)
> > > >  3 files changed, 98 insertions(+), 104 deletions(-)
> > > >  2 files changed, 9 insertions(+)
> > >
> > > Tiny! :) Seriously, though, I think this should be fine to take
> > > directly to Linus after patch 1 lands, or see if akpm wants to carry
> > > it directly.
> > >
> > > --
> > > Kees Cook
> >
> > Nice. Still as an 8 patches series, or squashed into a bigger one?
> 
> I suspect a single patch would be fine, but let's wait to hear from akpm.

Bring it!


Re: [PATCH] init: Initialize jump labels before command line option parsing

2019-04-16 Thread Andrew Morton
On Tue, 16 Apr 2019 13:54:04 -0700 Dan Williams  
wrote:

> When a module option, or core kernel argument, toggles a static-key it
> requires jump labels to be initialized early. While x86, PowerPC, and
> ARM64 arrange for jump_label_init() to be called before parse_args(),
> ARM does not.
> 
>   Kernel command line: rdinit=/sbin/init page_alloc.shuffle=1 panic=-1 
> console=ttyAMA0,115200 page_alloc.shuffle=1
>   [ cut here ]
>   WARNING: CPU: 0 PID: 0 at ./include/linux/jump_label.h:303
>   page_alloc_shuffle+0x12c/0x1ac
>   static_key_enable(): static key 'page_alloc_shuffle_key+0x0/0x4' used
>   before call to jump_label_init()
>   Modules linked in:
>   CPU: 0 PID: 0 Comm: swapper Not tainted
>   5.1.0-rc4-next-20190410-3-g3367c36ce744 #1
>   Hardware name: ARM Integrator/CP (Device Tree)
>   [] (unwind_backtrace) from [] (show_stack+0x10/0x18)
>   [] (show_stack) from [] (dump_stack+0x18/0x24)
>   [] (dump_stack) from [] (__warn+0xe0/0x108)
>   [] (__warn) from [] (warn_slowpath_fmt+0x44/0x6c)
>   [] (warn_slowpath_fmt) from []
>   (page_alloc_shuffle+0x12c/0x1ac)
>   [] (page_alloc_shuffle) from [] 
> (shuffle_store+0x28/0x48)
>   [] (shuffle_store) from [] (parse_args+0x1f4/0x350)
>   [] (parse_args) from [] (start_kernel+0x1c0/0x488)
> 
> Move the fallback call to jump_label_init() to occur before
> parse_args(). The redundant calls to jump_label_init() in other archs
> are left intact in case they have static key toggling use cases that are
> even earlier than option parsing.

Has it been confirmed that this fixes
mm-shuffle-initial-free-memory-to-improve-memory-side-cache-utilization.patch
on beaglebone-black?



Re: [PATCH] binfmt_elf: Move brk out of mmap when doing direct loader exec

2019-04-16 Thread Andrew Morton
On Tue, 16 Apr 2019 18:14:00 -0500 Kees Cook  wrote:

> On Tue, Apr 16, 2019 at 6:04 PM Andrew Morton  
> wrote:
> >
> > >
> > > Reported-by: Ali Saidi 
> > > Link: 
> > > https://lkml.kernel.org/r/CAGXu5jJ5sj3emOT2QPxQkNQk0qbU6zEfu9=omfhx_p0nckp...@mail.gmail.com
> > > Fixes: eab09532d400 ("binfmt_elf: use ELF_ET_DYN_BASE only for PIE")
> > > Signed-off-by: Kees Cook 
> >
> > No cc:stable?
> 
> Probably it should be, yes. I think I'm just shy about that when
> poking ELF mappings. :)

Well, the -stable bots will backport anything that might look slightly
like a fix anyway.

I'll add cc:stable and shall hold it out until 5.2-rc1, so it should
get a bit of a spin before being backported.


Re: [PATCH] xpad.c: Send necessary control transfer to certain Xbox controllers

2019-04-16 Thread Mike Salvatore
All,

I took another look at this patch and realized it was missing a #include.
I've added the appropriate #include to a new version of the patch.

Thanks,
Mike Salvatore



From 6fbf80fa3d5bc39fa054350a663080e1380046f8 Mon Sep 17 00:00:00 2001
From: Mike Salvatore 
Date: Wed, 13 Mar 2019 22:11:37 -0400
Subject: [PATCH] Input: xpad - send control init message to certain Xbox
 controllers

The Xbox controller with idVendor == 0x045e and idProduct == 0x028e
requires that a specific control transfer be sent from the host to the
device before the device will send data to the host.

This patch introduces an xboxone_control_packet struct and a mechanism
for sending control packets to devices that require them at
initialization.

Signed-off-by: Mike Salvatore 
---
 drivers/input/joystick/xpad.c | 57 +++
 1 file changed, 57 insertions(+)

diff --git a/drivers/input/joystick/xpad.c b/drivers/input/joystick/xpad.c
index cfc8b94527b9..756df325bfa6 100644
--- a/drivers/input/joystick/xpad.c
+++ b/drivers/input/joystick/xpad.c
@@ -81,6 +81,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -460,6 +461,25 @@ struct xboxone_init_packet {
.len= ARRAY_SIZE(_data),\
}
 
+struct xboxone_control_packet {
+   u16 idVendor;
+   u16 idProduct;
+   struct usb_ctrlrequest ctrlrequest;
+};
+
+#define XBOXONE_CONTROL_PKT(_vid, _pid, _reqtype, _req, _value, _index, _len)  
\
+   {   \
+   .idVendor   = (_vid),   \
+   .idProduct  = (_pid),   \
+   .ctrlrequest= { \
+   .bRequestType   = (_reqtype),   \
+   .bRequest   = (_req),   \
+   .wValue = (_value), \
+   .wIndex = (_index), \
+   .wLength= (_len),   \
+   },  \
+   }
+
 
 /*
  * This packet is required for all Xbox One pads with 2015
@@ -537,6 +557,13 @@ static const struct xboxone_init_packet 
xboxone_init_packets[] = {
XBOXONE_INIT_PKT(0x24c6, 0x543a, xboxone_rumbleend_init),
 };
 
+static const struct xboxone_control_packet xboxone_control_packets[] = {
+   XBOXONE_CONTROL_PKT(0x045e, 0x028e,
+   USB_DIR_IN | USB_TYPE_VENDOR | USB_RECIP_INTERFACE,
+   USB_REQ_CLEAR_FEATURE,
+   USB_DEVICE_REMOTE_WAKEUP, 0, 0),
+};
+
 struct xpad_output_packet {
u8 data[XPAD_PKT_LEN];
u8 len;
@@ -1119,6 +1146,31 @@ static int xpad_init_output(struct usb_interface *intf, 
struct usb_xpad *xpad,
return error;
 }
 
+static int xpad_init_control_msg(struct usb_xpad *xpad)
+{
+   struct usb_device *udev = xpad->udev;
+   size_t i;
+
+   for (i = 0; i < ARRAY_SIZE(xboxone_control_packets); i++) {
+   u16 idVendor = xboxone_control_packets[i].idVendor;
+   u16 idProduct = xboxone_control_packets[i].idProduct;
+
+   if (le16_to_cpu(udev->descriptor.idVendor) == idVendor
+   && le16_to_cpu(udev->descriptor.idProduct) == idProduct) {
+   const struct usb_ctrlrequest *ctrlrequest =
+   &(xboxone_control_packets[i].ctrlrequest);
+
+   return usb_control_msg(udev, usb_sndctrlpipe(udev, 0),
+   ctrlrequest->bRequest,
+   ctrlrequest->bRequestType, ctrlrequest->wValue,
+   ctrlrequest->wIndex, NULL, ctrlrequest->wLength,
+   2 * HZ);
+   }
+   }
+
+   return 0;
+}
+
 static void xpad_stop_output(struct usb_xpad *xpad)
 {
if (xpad->xtype != XTYPE_UNKNOWN) {
@@ -1839,6 +1891,11 @@ static int xpad_probe(struct usb_interface *intf, const 
struct usb_device_id *id
if (error)
goto err_deinit_output;
}
+
+   error = xpad_init_control_msg(xpad);
+   if (error)
+   goto err_deinit_output;
+
return 0;
 
 err_deinit_output:
-- 
2.17.1



On 3/23/19 12:46 PM, Mike Salvatore wrote:
> From 3051524e62d68b920019bcb50a713e736fcf4234 Mon Sep 17 00:00:00 2001
> From: Mike Salvatore 
> Date: Wed, 13 Mar 2019 22:11:37 -0400
> Subject: [PATCH] Input: xpad - send control init message to certain Xbox
>  controllers
> 
> The Xbox controller with idVendor == 0x045e and idProduct == 0x028e
> requires that a specific control transfer be sent from the host to the
> device before the device will send data to the host.
> 
> This patch introduces an xboxone_control_packet struct and a mechanism
> for sending control packets to devices that require them at
> initialization.
> 
> Signed-off-by: Mike Salvatore 
> ---
>  drivers/input/joystick/xpad.c | 56 

Re: [PATCH 1/2] ras: fix an off-by-one error in __find_elem()

2019-04-16 Thread Luck, Tony
On Tue, Apr 16, 2019 at 04:18:57PM -0700, Cong Wang wrote:
> > The problem case occurs when we've seen enough distinct
> > errors that we have filled every entry, then we try to
> > look up a pfn that is larger that any seen before.
> >
> > The loop:
> >
> > while (min < max) {
> > ...
> > }
> >
> > will terminate with "min" set to MAX_ELEMS. Then we
> > execute:
> >
> > this_pfn = PFN(ca->array[min]);
> >
> > which references beyond the end of the space allocated
> > for ca->array.
> 
> Exactly.

Hmmm. But can we ever really have this happen?  The call
sequence to get here looks like:


mutex_lock(_mutex);

if (ca->n == MAX_ELEMS)
WARN_ON(!del_lru_elem_unlocked(ca));

ret = find_elem(ca, pfn, );

I.e. if the array was all the way full, we delete one element
before calling find_elem().  So when we get here:

static int __find_elem(struct ce_array *ca, u64 pfn, unsigned int *to)
{
u64 this_pfn;
int min = 0, max = ca->n;

The biggest value "max" can have is MAX_ELEMS-1


-Tony


Re: [PATCH 1/2] ras: fix an off-by-one error in __find_elem()

2019-04-16 Thread Cong Wang
On Tue, Apr 16, 2019 at 3:18 PM Luck, Tony  wrote:
>
> On Tue, Apr 16, 2019 at 11:07:26AM +0200, Borislav Petkov wrote:
> > On Mon, Apr 15, 2019 at 06:20:00PM -0700, Cong Wang wrote:
> > > ce_arr.array[] is always within the range [0, ce_arr.n-1].
> > > However, the binary search code in __find_elem() uses ce_arr.n
> > > as the maximum index, which could lead to an off-by-one
> > > out-of-bound access when the element after the last is exactly
> > > the one just got deleted, that is, 'min' returned to caller as
> > > 'ce_arr.n'.
> >
> > Sorry, I don't follow.
> >
> > There's a debugfs interface in /sys/kernel/debug/ras/cec/ with which you
> > can input random PFNs and test the thing.
> >
> > Show me pls how this can happen with an example.
>
> The array of previously seen pfn values is one page.
>
> The problem case occurs when we've seen enough distinct
> errors that we have filled every entry, then we try to
> look up a pfn that is larger that any seen before.
>
> The loop:
>
> while (min < max) {
> ...
> }
>
> will terminate with "min" set to MAX_ELEMS. Then we
> execute:
>
> this_pfn = PFN(ca->array[min]);
>
> which references beyond the end of the space allocated
> for ca->array.

Exactly.


>
> Probably won't crash, but we will read a garbage value
> from whatever memory is allocated next.


It will eventually crash, this can be proved. :)


>
> Chances are high that the test:
>
> if (this_pfn == pfn)
>
> won't find that the garbage value matches the pfn that
> we were looking for ... so we will likley be lucky and
> not do anything too dumb. But we shouldn't just cross
> our fingers and hope.
>
> Fix looks mostly OK, but we should probably move the
>
> if (to)
> *to = min;
>
> inside the new
>
> if (min < ca->n) {
> ...
> }
>
> clause.

No, we can't, in case of -ENOKEY, we still have to save the index
for caller to insert it at the end of the array, therefore *to must be
always saved.

Thanks.


Re: [PATCH 0/4] z3fold: support page migration

2019-04-16 Thread Andrew Morton
On Thu, 11 Apr 2019 17:32:12 +0200 Vitaly Wool  wrote:

> This patchset implements page migration support and slightly better
> buddy search. To implement page migration support, z3fold has to move
> away from the current scheme of handle encoding. i. e. stop encoding
> page address in handles. Instead, a small per-page structure is created
> which will contain actual addresses for z3fold objects, while pointers
> to fields of that structure will be used as handles.

Can you please help find a reviewer for this work?

For some reason I'm seeing a massive number of rejects when trying to
apply these.  It looks like your mail client performed some sort of
selective space-stuffing.  I suggest you email a patch to yourself,
check that the result applies properly.





Re: [v2 RFC PATCH 0/9] Another Approach to Use PMEM as NUMA Node

2019-04-16 Thread Yang Shi



Why cannot we start simple and build from there? In other words I 
do not

think we really need anything like N_CPU_MEM at all.
In this patchset N_CPU_MEM is used to tell us what nodes are cpuless 
nodes.
They would be the preferred demotion target.  Of course, we could 
rely on
firmware to just demote to the next best node, but it may be a 
"preferred"
node, if so I don't see too much benefit achieved by demotion. Am I 
missing

anything?

Why cannot we simply demote in the proximity order? Why do you make
cpuless nodes so special? If other close nodes are vacant then just use
them.


And, I'm supposed we agree to *not* migrate from PMEM node (cpuless 
node) to any other node on reclaim path, right? If so we need know if 
the current node is DRAM node or PMEM node. If DRAM node, do demotion; 
if PMEM node, do swap. So, using N_CPU_MEM to tell us if the current 
node is DRAM node or not.


We could. But, this raises another question, would we prefer to just 
demote to the next fallback node (just try once), if it is contended, 
then just swap (i.e. DRAM0 -> PMEM0 -> Swap); or would we prefer to 
try all the nodes in the fallback order to find the first less 
contended one (i.e. DRAM0 -> PMEM0 -> DRAM1 -> PMEM1 -> Swap)?



|--| |--| |--|    |--|
|PMEM0|---|DRAM0| --- CPU0 --- CPU1 --- |DRAM1| --- |PMEM1|
|--| |--| |--|   |--|

The first one sounds simpler, and the current implementation does so 
and this needs find out the closest PMEM node by recognizing cpuless 
node.


If we prefer go with the second option, it is definitely unnecessary 
to specialize any node.






Re: [v2 RFC PATCH 0/9] Another Approach to Use PMEM as NUMA Node

2019-04-16 Thread Yang Shi




On 4/16/19 4:04 PM, Dave Hansen wrote:

On 4/16/19 2:59 PM, Yang Shi wrote:

On 4/16/19 2:22 PM, Dave Hansen wrote:

Keith Busch had a set of patches to let you specify the demotion order
via sysfs for fun.  The rules we came up with were:
1. Pages keep no history of where they have been
2. Each node can only demote to one other node

Does this mean any remote node? Or just DRAM to PMEM, but remote PMEM
might be ok?

In Keith's code, I don't think we differentiated.  We let any node
demote to any other node you want, as long as it follows the cycle rule.


I recall Keith's code let the userspace define the target node. Anyway, 
we may need add one rule: not migrate-on-reclaim from PMEM node. 
Demoting from PMEM to DRAM sounds pointless.





Re: [PATCH v2 1/2] ras: fix an off-by-one error in __find_elem()

2019-04-16 Thread Cong Wang
On Tue, Apr 16, 2019 at 2:46 PM Borislav Petkov  wrote:
>
> On Tue, Apr 16, 2019 at 02:33:50PM -0700, Cong Wang wrote:
> > ce_arr.array[] is always within the range [0, ce_arr.n-1].
> > However, the binary search code in __find_elem() uses ce_arr.n
> > as the maximum index, which could lead to an off-by-one
> > out-of-bound access right after the while loop. In this case,
> > we should not even read it, just return -ENOKEY instead.
> >
> > Note, this could cause a kernel crash if ce_arr.n is exactly
> > MAX_ELEMS.
>
> "Could cause"?
>
> I'm still waiting for a demonstration. You can build a case through
> writing values in the debugfs nodes I pointed you at or even with a
> patch ontop preparing the exact conditions for it to crash. And then
> give me that "recipe" to trigger it here in a VM.

It is actually fairly easy:

1) Fill the whole page with PFN's:
for i in `seq 0 511`; do echo $i >> /sys/kernel/debug/ras/cec/pfn; done

2) Set thresh to 1 in order to trigger the deletion:
echo 1 > /sys/kernel/debug/ras/cec/count_threshold

3) Repeatedly add and remove the last element:
echo 512 >> /sys/kernel/debug/ras/cec/pfn
(until you get a crash.)

In case you still don't get it, here it is:

[   57.732593] BUG: unable to handle kernel paging request at 9c667bca
[   57.734994] #PF error: [PROT] [WRITE]
[   57.735891] PGD 75601067 P4D 75601067 PUD 75605067 PMD 7bca1063 PTE
80007bca0061
[   57.737702] Oops: 0003 [#1] SMP PTI
[   57.738533] CPU: 0 PID: 649 Comm: bash Not tainted 5.1.0-rc5+ #561
[   57.739965] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS ?-20180724_192412-buildhw-07.phx2.fedoraproject.org-1.fc29
04/01/2014
[   57.742892] RIP: 0010:__memmove+0x57/0x1a0
[   57.743853] Code: 00 72 05 40 38 fe 74 3b 48 83 ea 20 48 83 ea 20
4c 8b 1e 4c 8b 56 08 4c 8b 4e 10 4c 8b 46 18 48 8d 76 20 4c 89 1f 4c
89 57 08 <4c> 89 4f 10 4c 89 47 18 48 8d 7f 20 73 d4 48 83 c2 20 e9 a2
00 00
[   57.748150] RSP: 0018:be2ec0c8bdf8 EFLAGS: 00010206
[   57.749371] RAX: 9c667a5c1ff0 RBX: 0001 RCX: 0ff8
[   57.751018] RDX: 0007fe921fb8 RSI: 9c667bca0018 RDI: 9c667bc9fff0
[   57.752674] RBP: 0200 R08:  R09: 015c
[   57.754325] R10: 00040001 R11: 5a5a5a5a5a5a5a5a R12: 0004
[   57.755976] R13: 9c6671787778 R14: 9c6671787728 R15: 9c6671787750
[   57.757631] FS:  7f33ca294740() GS:9c667d80()
knlGS:
[   57.759689] CS:  0010 DS:  ES:  CR0: 80050033
[   57.761023] CR2: 9c667bca CR3: 7061e000 CR4: 000406f0
[   57.762681] Call Trace:
[   57.763275]  del_elem.constprop.1+0x39/0x40
[   57.764260]  cec_add_elem+0x1e4/0x211
[   57.765129]  simple_attr_write+0xa2/0xc3
[   57.766057]  debugfs_attr_write+0x45/0x5c
[   57.767005]  full_proxy_write+0x4b/0x65
[   57.767911]  ? full_proxy_poll+0x50/0x50
[   57.768844]  vfs_write+0xb8/0xf5
[   57.769613]  ksys_write+0x6b/0xb8
[   57.770407]  do_syscall_64+0x57/0x65
[   57.771249]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

I will leave it as a homework for explaining why the crash is inside
memmove(). ;)

Thanks.


Re: [PATCH] binfmt_elf: Move brk out of mmap when doing direct loader exec

2019-04-16 Thread Kees Cook
On Tue, Apr 16, 2019 at 6:04 PM Andrew Morton  wrote:
>
> On Mon, 15 Apr 2019 21:23:20 -0700 Kees Cook  wrote:
>
> > Commit eab09532d400 ("binfmt_elf: use ELF_ET_DYN_BASE only for PIE"),
> > made changes in the rare case when the ELF loader was directly invoked
> > (e.g to set a non-inheritable LD_LIBRARY_PATH, testing new versions of
> > the loader), by moving into the mmap region to avoid both ET_EXEC and PIE
> > binaries. This had the effect of also moving the brk region into mmap,
> > which could lead to the stack and brk being arbitrarily close to each
> > other. An unlucky process wouldn't get its requested stack size and stack
> > allocations could end up scribbling on the heap.
> >
> > This is illustrated here. In the case of using the loader directly, brk
> > (so helpfully identified as "[heap]") is allocated with the _loader_
> > not the binary. For example, with ASLR entirely disabled, you can see
> > this more clearly:
> >
> > $ /bin/cat /proc/self/maps
> > 4000-c000 r-xp  ... /bin/cat
> > 5575b000-5575c000 r--p 7000 ... /bin/cat
> > 5575c000-5575d000 rw-p 8000 ... /bin/cat
> > 5575d000-5577e000 rw-p  ... [heap]
> > ...
> > 77ff7000-77ffa000 r--p  ... [vvar]
> > 77ffa000-77ffc000 r-xp  ... [vdso]
> > 77ffc000-77ffd000 r--p 00027000 ... /lib/x86_64-linux-gnu/ld-2.27.so
> > 77ffd000-77ffe000 rw-p 00028000 ... /lib/x86_64-linux-gnu/ld-2.27.so
> > 77ffe000-77fff000 rw-p  ...
> > 7ffde000-7000 rw-p  ... [stack]
> >
> > $ /lib/x86_64-linux-gnu/ld-2.27.so /bin/cat /proc/self/maps
> > ...
> > 77bcc000-77bd4000 r-xp  ... /bin/cat
> > 77bd4000-77dd3000 ---p 8000 ... /bin/cat
> > 77dd3000-77dd4000 r--p 7000 ... /bin/cat
> > 77dd4000-77dd5000 rw-p 8000 ... /bin/cat
> > 77dd5000-77dfc000 r-xp  ... /lib/x86_64-linux-gnu/ld-2.27.so
> > 77fb2000-77fd6000 rw-p  ...
> > 77ff7000-77ffa000 r--p  ... [vvar]
> > 77ffa000-77ffc000 r-xp  ... [vdso]
> > 77ffc000-77ffd000 r--p 00027000 ... /lib/x86_64-linux-gnu/ld-2.27.so
> > 77ffd000-77ffe000 rw-p 00028000 ... /lib/x86_64-linux-gnu/ld-2.27.so
> > 77ffe000-7802 rw-p  ... [heap]
> > 7ffde000-7000 rw-p  ... [stack]
> >
> > The solution is to move brk out of mmap and into ELF_ET_DYN_BASE since
> > nothing is there in this direct loader case (and ET_EXEC still far
> > away at 0x40). Anything that ran before should still work (i.e. the
> > ultimately-launched binary already had the brk very far from its text,
> > so this should be no different from a COMPAT_BRK standpoint). The only
> > risk I see here is that if someone started to suddenly depend on the
> > entire memory space above the mmap region being available when launching
> > binaries via a direct loader execs which seems highly unlikely, I'd hope:
> > this would mean a binary would _not_ work when execed normally.
> >
> > Reported-by: Ali Saidi 
> > Link: 
> > https://lkml.kernel.org/r/CAGXu5jJ5sj3emOT2QPxQkNQk0qbU6zEfu9=omfhx_p0nckp...@mail.gmail.com
> > Fixes: eab09532d400 ("binfmt_elf: use ELF_ET_DYN_BASE only for PIE")
> > Signed-off-by: Kees Cook 
>
> No cc:stable?

Probably it should be, yes. I think I'm just shy about that when
poking ELF mappings. :)

-- 
Kees Cook


  1   2   3   4   5   6   7   8   9   10   >