RE: [PATCH 00/17] fs, btrfs refcount conversions

2017-03-06 Thread Reshetova, Elena
> At 03/06/2017 05:43 PM, Reshetova, Elena wrote:
> >
> >> At 03/03/2017 04:55 PM, Elena Reshetova wrote:
> >>> Now that the new refcount_t type and API are finally merged
> >>> (see include/linux/refcount.h), the following
> >>> patches convert various refcounters in the btrfs filesystem from atomic_t
> >>> to refcount_t. By doing this we prevent intentional or accidental
> >>> underflows or overflows that can lead to use-after-free vulnerabilities.
> >>>
> >>> The below patches are fully independent and can be cherry-picked separately.
> >>> Since we convert all kernel subsystems in the same fashion, resulting
> >>> in about 300 patches, we have to group them for sending at least in some
> >>> fashion to be manageable. Please excuse the long cc list.
> >>>
> >>> These patches have been tested with xfstests by running btrfs-related tests.
> >>> btrfs debug was enabled, which warns on refcount errors, too. No output
> >>> related to refcount errors was produced. However, the following errors
> >>> occurred during the run:
> >>>  * tests btrfs/078, btrfs/114, btrfs/115: no errors anywhere in dmesg, but
> >>>  the process hangs. They all seem to be around qgroup; sometimes an error
> >>>  such as "qgroup scan failed -4" is visible before it blocks, but not always.
> >>
> >> How reproducible is the hang?
> >
> > Always in my environment, but I would not go much into investigating why it
> > happens, if it works for you.
> > My test environment is far from ideal: I am testing in a VM with rather old
> > userspace and a couple of additional changes in,
> > so there are many things that can potentially go wrong. Anyway, the strace
> > for 078 is in the attachment.
> 
> Thanks for the strace.
> 
> However no "-f" is passed to strace, so it doesn't contain much useful info.
> 
> >
> > If the patches pass all tests on your side, could you please take them in
> > and propagate further?
> > I will continue with other kernel subsystems.
> 
> The patchset itself looks like a common cleanup, while I did encounter
> several cases (almost all scrub tests) causing kernel warning due to
> underflow.

Oh, could you please send me the warning outputs? I can hopefully analyze and 
fix them. 

Best Regards,
Elena.

> 
> So I'm afraid the patchset will not be merged until we fix all the
> underflows.
> 
> But thanks for the patchset, it helps us to expose a lot of problems.
> 
> Thanks,
> Qu
> 
> >
> > Best Regards,
> > Elena.
> >
> >
> >>
> >> I also see the -EINTR output, but that seems to be designed for
> >> btrfs/11[45].
> >>
> >> btrfs/078 is unrelated to qgroup, and all three of these tests pass in my
> >> test environment, which is v4.11-rc1 with your patches applied.
> >>
> >> I ran these 3 tests in a row with default and space_cache=v2 mount
> >> options, and 5 times for each mount option, no hang at all.
> >>
> >> It would help much if more info can be provided, from blocked process
> >> backtrace to test mount option to base commit.
> >>
> >> Thanks,
> >> Qu
> >>
> >>>  * test btrfs/104 dmesg has additional error output:
> >>>  BTRFS warning (device vdc): qgroup 258 reserved space underflow, have: 0,
> >>>  to free: 4096
> >>>  I tried looking at the code on what causes the failure, but could not figure
> >>>  it out. It doesn't seem to be related to any refcount changes, at least IMO.
> >>>
> >>> The above test failures are hard for me to understand and interpret, but
> >>> they don't seem to relate to refcount conversions.
> >>>
> >>> Elena Reshetova (17):
> >>>   fs, btrfs: convert btrfs_bio.refs from atomic_t to refcount_t
> >>>   fs, btrfs: convert btrfs_transaction.use_count from atomic_t to
> >>> refcount_t
> >>>   fs, btrfs: convert extent_map.refs from atomic_t to refcount_t
> >>>   fs, btrfs: convert btrfs_ordered_extent.refs from atomic_t to
> >>> refcount_t
> >>>   fs, btrfs: convert btrfs_caching_control.count from atomic_t to
> >>> refcount_t
> >>>   fs, btrfs: convert btrfs_delayed_ref_node.refs from atomic_t to
> >>> refcount_t
> >>>   fs, btrfs: convert btrfs_delayed_node.refs from atomic_t to refcount_t
> >>>   fs, btrfs: convert btrfs_delayed_item.refs from atomic_t to refcount_t
> >>>   fs, btrfs: convert btrfs_root.refs from atomic_t to refcount_t
> >>>   fs, btrfs: convert extent_state.refs from atomic_t to refcount_t
> >>>   fs, btrfs: convert compressed_bio.pending_bios from atomic_t to
> >>> refcount_t
> >>>   fs, btrfs: convert scrub_recover.refs from atomic_t to refcount_t
> >>>   fs, btrfs: convert scrub_page.refs from atomic_t to refcount_t
> >>>   fs, btrfs: convert scrub_block.refs from atomic_t to refcount_t
> >>>   fs, btrfs: convert scrub_parity.refs from atomic_t to refcount_t
> >>>   fs, btrfs: convert scrub_ctx.refs from atomic_t to refcount_t
> >>>   fs, btrfs: convert btrfs_raid_bio.refs from atomic_t to refcount_t
> >>>
> >>>  fs/btrfs/backref.c   |  2 +-
> >>>  fs/btrfs/compression.c   | 18 -
> >>>  fs/btrfs/ctree.h | 
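
The cover letter's motivation (catching underflows and overflows that a plain atomic_t would silently allow) can be illustrated with a small userspace sketch. This is not the kernel's include/linux/refcount.h implementation: the type and function names below are hypothetical, the code is single-threaded for clarity, and the saturate-at-UINT_MAX policy is an assumption made for this example.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for refcount_t; single-threaded for clarity. */
struct sketch_refcount {
	unsigned int refs;
};

/* Increment, but refuse to resurrect a freed object and saturate at UINT_MAX. */
static bool sketch_refcount_inc(struct sketch_refcount *r)
{
	if (r->refs == 0) {
		fprintf(stderr, "refcount: increment on 0; possible use-after-free\n");
		return false;
	}
	if (r->refs == UINT_MAX)
		return true;	/* saturated: stay pinned, leak instead of wrapping */
	r->refs++;
	return true;
}

/* Decrement; return true only when the last reference was dropped. */
static bool sketch_refcount_dec_and_test(struct sketch_refcount *r)
{
	if (r->refs == 0) {
		fprintf(stderr, "refcount: underflow; decrement on 0\n");
		return false;
	}
	if (r->refs == UINT_MAX)
		return false;	/* saturated counters are never released */
	return --r->refs == 0;
}
```

With a plain integer counter the extra decrement would wrap to a huge value and go unnoticed; here it is reported, which is the kind of warning Qu describes seeing in the scrub tests.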

Re: [PATCH] docs: Clarify details for reporting security bugs

2017-03-06 Thread Kees Cook
On Mon, Mar 6, 2017 at 11:27 PM, Jonathan Corbet  wrote:
> On Mon, 6 Mar 2017 11:13:51 -0800
> Kees Cook  wrote:
>
>> The kernel security team is regularly asked to provide CVE identifiers,
>> which we don't normally do. This updates the documentation to mention
>> this and adds some more details about coordination and patch handling
>> that come up regularly. Based on an earlier draft by Willy Tarreau.
>>
>> Signed-off-by: Kees Cook 
>> Acked-by: Willy Tarreau 
>
> Seems good, applied to the docs tree, thanks.

Thanks!

>
>> Related question: shouldn't security-bugs.rst and submitting-patches.rst live
>> in /process/ rather than /admin-guide/ ?
>
> The former should maybe be there, depending on just who we think it
> should be aimed at, I guess.  submitting-patches.rst is already in
> process/, though, so I'm not quite sure I understand that question?

Ah, maybe I was looking at the wrong thing. Regardless, it seems like
security-bugs.rst is the same "type" as "submitting-patches.rst", so
perhaps it should move?

-Kees

-- 
Kees Cook
Pixel Security


[tip:perf/core] tools arch x86: Include asm/cmpxchg.h

2017-03-06 Thread tip-bot for Arnaldo Carvalho de Melo
Commit-ID:  3337e682d9f3043bb0b925d976558ed5c41b0a09
Gitweb: http://git.kernel.org/tip/3337e682d9f3043bb0b925d976558ed5c41b0a09
Author: Arnaldo Carvalho de Melo 
AuthorDate: Wed, 22 Feb 2017 16:54:53 -0300
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Fri, 3 Mar 2017 19:07:13 -0300

tools arch x86: Include asm/cmpxchg.h

Will be included from atomic.h and used in refcount.h

Cc: Adrian Hunter 
Cc: David Ahern 
Cc: Elena Reshetova 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Wang Nan 
Link: http://lkml.kernel.org/n/tip-pzrydfee75mhq64kazxmf...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/arch/x86/include/asm/cmpxchg.h | 89 
 tools/perf/MANIFEST  |  1 +
 tools/scripts/Makefile.include   |  9 
 3 files changed, 99 insertions(+)

diff --git a/tools/arch/x86/include/asm/cmpxchg.h b/tools/arch/x86/include/asm/cmpxchg.h
new file mode 100644
index 000..f525326
--- /dev/null
+++ b/tools/arch/x86/include/asm/cmpxchg.h
@@ -0,0 +1,89 @@
+#ifndef TOOLS_ASM_X86_CMPXCHG_H
+#define TOOLS_ASM_X86_CMPXCHG_H
+
+#include 
+
+/*
+ * Non-existent functions to indicate usage errors at link time
+ * (or compile-time if the compiler implements __compiletime_error()).
+ */
+extern void __cmpxchg_wrong_size(void)
+   __compiletime_error("Bad argument size for cmpxchg");
+
+/*
+ * Constants for operation sizes. On 32-bit, the 64-bit size is set to
+ * -1 because sizeof will never return -1, thereby making those switch
+ * case statements guaranteed dead code which the compiler will
+ * eliminate, and allowing the "missing symbol in the default case" to
+ * indicate a usage error.
+ */
+#define __X86_CASE_B   1
+#define __X86_CASE_W   2
+#define __X86_CASE_L   4
+#ifdef __x86_64__
+#define __X86_CASE_Q   8
+#else
+#define __X86_CASE_Q   -1  /* sizeof will never return -1 */
+#endif
+
+/*
+ * Atomic compare and exchange.  Compare OLD with MEM, if identical,
+ * store NEW in MEM.  Return the initial value in MEM.  Success is
+ * indicated by comparing RETURN with OLD.
+ */
+#define __raw_cmpxchg(ptr, old, new, size, lock)   \
+({ \
+   __typeof__(*(ptr)) __ret;   \
+   __typeof__(*(ptr)) __old = (old);   \
+   __typeof__(*(ptr)) __new = (new);   \
+   switch (size) { \
+   case __X86_CASE_B:  \
+   {   \
+   volatile u8 *__ptr = (volatile u8 *)(ptr);  \
+   asm volatile(lock "cmpxchgb %2,%1"  \
+: "=a" (__ret), "+m" (*__ptr)  \
+: "q" (__new), "0" (__old) \
+: "memory");   \
+   break;  \
+   }   \
+   case __X86_CASE_W:  \
+   {   \
+   volatile u16 *__ptr = (volatile u16 *)(ptr);\
+   asm volatile(lock "cmpxchgw %2,%1"  \
+: "=a" (__ret), "+m" (*__ptr)  \
+: "r" (__new), "0" (__old) \
+: "memory");   \
+   break;  \
+   }   \
+   case __X86_CASE_L:  \
+   {   \
+   volatile u32 *__ptr = (volatile u32 *)(ptr);\
+   asm volatile(lock "cmpxchgl %2,%1"  \
+: "=a" (__ret), "+m" (*__ptr)  \
+: "r" (__new), "0" (__old) \
+: "memory");   \
+   break;  \
+   }   \
+   case __X86_CASE_Q:  \
+   {   \
+   volatile u64 *__ptr = (volatile u64 *)(ptr);\
+   asm volatile(lock "cmpxchgq %2,%1"  \
+: "=a" (__ret), "+m" (*__ptr)  \
+: "r" (__new), "0" (__old) \
+  
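
The contract described in the patch comment (compare OLD with MEM, store NEW if identical, return the initial value of MEM) can be sketched in portable userspace C. This is only an illustration of the semantics, assuming a GCC- or Clang-compatible compiler: it uses the `__atomic_compare_exchange_n` builtin instead of the inline asm in the patch, and the `sketch_` names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: compare *ptr with old; if equal, store new_val.
 * Returns the value *ptr held before the operation, like the macro in
 * the patch. Uses the GCC/Clang __atomic builtin rather than inline asm.
 */
static uint32_t sketch_cmpxchg32(uint32_t *ptr, uint32_t old, uint32_t new_val)
{
	uint32_t expected = old;

	/* On failure the builtin writes the current value into 'expected'. */
	__atomic_compare_exchange_n(ptr, &expected, new_val, false,
				    __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
	return expected;
}

/* Typical caller: retry until the update lands on an unchanged value. */
static uint32_t sketch_atomic_inc(uint32_t *ptr)
{
	uint32_t old;
	uint32_t seen = __atomic_load_n(ptr, __ATOMIC_RELAXED);

	do {
		old = seen;
		seen = sketch_cmpxchg32(ptr, old, old + 1);
	} while (seen != old);	/* success is indicated by return == old */

	return old + 1;
}
```

The retry loop shows why returning the initial value is convenient: a failed attempt already hands the caller the fresh value to retry with.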

[RFC PATCH 1/5] Document: Add document file for Sony CXD2880 DVB-T2/T tuner + demodulator

2017-03-06 Thread Yasunari.Takiguchi
From: Yasunari Takiguchi 

This is the driver for Sony CXD2880 DVB-T2/T tuner + demodulator.

Regarding this third Beta Release, the status is:
- Tested on Raspberry Pi 3.
- The DVB-API operates under dvbv5 tools.

Signed-off-by: Yasunari Takiguchi 
Signed-off-by: Masayuki Yamamoto 
Signed-off-by: Hideki Nozawa 
Signed-off-by: Kota Yonezawa 
Signed-off-by: Toshihiko Matsumoto 
Signed-off-by: Satoshi Watanabe 
---
 .../devicetree/bindings/media/spi/sony-cxd2880.txt |   16 
 1 file changed, 16 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/media/spi/sony-cxd2880.txt

diff --git a/Documentation/devicetree/bindings/media/spi/sony-cxd2880.txt b/Documentation/devicetree/bindings/media/spi/sony-cxd2880.txt
new file mode 100644
index 000..bdbb047
--- /dev/null
+++ b/Documentation/devicetree/bindings/media/spi/sony-cxd2880.txt
@@ -0,0 +1,16 @@
+Sony CXD2880 DVB-T2/T tuner + demodulator driver SPI adapter
+
+Required properties:
+- compatible: Should be "sony,cxd2880".
+- reg: SPI chip select number for the device.
+- spi-max-frequency: Maximum bus speed, should be set to <55000000> (55MHz).
+- status: Should be "okay"
+
+Example:
+
+cxd2880@0 {
+   compatible = "sony,cxd2880";
+   reg = <0>; /* CE0 */
+   spi-max-frequency = <55000000>; /* 55MHz */
+   status = "okay";
+};
-- 
1.7.9.5



[tip:perf/core] tools include: Introduce atomic_cmpxchg_{relaxed,release}()

2017-03-06 Thread tip-bot for Arnaldo Carvalho de Melo
Commit-ID:  2bcdeadbc094b4f6511aedea1e5b8052bf0cc89c
Gitweb: http://git.kernel.org/tip/2bcdeadbc094b4f6511aedea1e5b8052bf0cc89c
Author: Arnaldo Carvalho de Melo 
AuthorDate: Wed, 22 Feb 2017 16:57:53 -0300
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Fri, 3 Mar 2017 19:07:14 -0300

tools include: Introduce atomic_cmpxchg_{relaxed,release}()

Will be used by refcnt.h

Cc: Adrian Hunter 
Cc: David Ahern 
Cc: Elena Reshetova 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Wang Nan 
Link: http://lkml.kernel.org/n/tip-jszriruqfqpez1bkivwfj...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/include/linux/atomic.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/tools/include/linux/atomic.h b/tools/include/linux/atomic.h
index 4e3d3d1..9f21fc2 100644
--- a/tools/include/linux/atomic.h
+++ b/tools/include/linux/atomic.h
@@ -3,4 +3,10 @@
 
 #include 
 
+/* atomic_cmpxchg_relaxed */
+#ifndef atomic_cmpxchg_relaxed
+#define  atomic_cmpxchg_relaxed atomic_cmpxchg
+#define  atomic_cmpxchg_release atomic_cmpxchg
+#endif /* atomic_cmpxchg_relaxed */
+
 #endif /* __TOOLS_LINUX_ATOMIC_H */
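
The fallback pattern in this patch (alias the weaker-ordered names to the fully ordered operation when no dedicated variant exists) can be shown in a standalone form. This is a sketch under assumptions: `sketch_atomic_cmpxchg` is a hypothetical stand-in built on the GCC/Clang `__atomic` builtin, not the tools/ implementation.

```c
/* Hypothetical fully ordered primitive standing in for atomic_cmpxchg(). */
static int sketch_atomic_cmpxchg(int *v, int old, int new_val)
{
	int expected = old;

	__atomic_compare_exchange_n(v, &expected, new_val, 0,
				    __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
	return expected;	/* value seen before the operation */
}

/*
 * Same shape as the patch: if no relaxed variant was provided, alias
 * the relaxed and release names to the fully ordered one. A stronger
 * ordering is a correct (if possibly slower) substitute for a weaker one.
 */
#ifndef sketch_atomic_cmpxchg_relaxed
#define sketch_atomic_cmpxchg_relaxed	sketch_atomic_cmpxchg
#define sketch_atomic_cmpxchg_release	sketch_atomic_cmpxchg
#endif
```

Callers written against the `_relaxed` or `_release` names then build unchanged whether or not an optimized variant is available.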


RE: [PATCH] Input: elan_i2c - add ASUS EeeBook X205TA special touchpad fw

2017-03-06 Thread 廖崇榮
Hi Dmitry

-Original Message-
From: Dmitry Torokhov [mailto:dmitry.torok...@gmail.com] 
Sent: Tuesday, March 07, 2017 3:55 AM
To: KT Liao
Cc: linux-in...@vger.kernel.org; linux-kernel@vger.kernel.org; Matjaz
Hegedic
Subject: Re: [PATCH] Input: elan_i2c - add ASUS EeeBook X205TA special
touchpad fw

On Sun, Mar 05, 2017 at 03:13:02AM +0100, Matjaz Hegedic wrote:
> EeeBook X205TA is yet another ASUS device with a special touchpad 
> firmware that needs to be accounted for during initialization, or else 
> the touchpad will go into an invalid state upon suspend/resume.
> Adding the appropriate ic_type and product_id check fixes the problem.

KT, does this look reasonable? Are there more ASUS models that need such
handling?
[KT] : I just discussed it with the FW team.
We can't confirm it right now because it's an old product. Also, the solution
focuses on a power-on issue, not suspend/resume.
I will let you know once we figure it out.

Our FW has been modified, so the issue should not happen on new models.

Thanks  KT
> 
> Signed-off-by: Matjaz Hegedic 
> ---
>  drivers/input/mouse/elan_i2c_core.c | 22 --
>  1 file changed, 12 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/input/mouse/elan_i2c_core.c 
> b/drivers/input/mouse/elan_i2c_core.c
> index 2c7d287..dde3ad7 100644
> --- a/drivers/input/mouse/elan_i2c_core.c
> +++ b/drivers/input/mouse/elan_i2c_core.c
> @@ -218,17 +218,19 @@ static int elan_query_product(struct 
> elan_tp_data *data)
>  
>  static int elan_check_ASUS_special_fw(struct elan_tp_data *data)  {
> - if (data->ic_type != 0x0E)
> - return false;
> -
> - switch (data->product_id) {
> - case 0x05 ... 0x07:
> - case 0x09:
> - case 0x13:
> - return true;
> - default:
> - return false;
> + if (data->ic_type == 0x0E) {
> + switch (data->product_id) {
> + case 0x05 ... 0x07:
> + case 0x09:
> + case 0x13:
> + return true;
> + }
>   }
> + /* ASUS EeeBook X205TA */
> + else if (data->ic_type == 0x8 && data->product_id == 0x26)
> + return true;
> +
> + return false;
>  }
>  
>  static int __elan_initialize(struct elan_tp_data *data)
> --
> 2.7.4
> 

Thanks.

-- 
Dmitry
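
For readers who want to try the proposed matching logic outside the driver, here is a standalone sketch of the post-patch check. The struct is a hypothetical reduction of `struct elan_tp_data` to the two fields the check reads, and their widths are assumptions; the case range is a GNU C extension, as in the quoted patch.

```c
#include <stdbool.h>

/* Hypothetical reduced form of struct elan_tp_data: only the fields used here. */
struct sketch_tp_data {
	unsigned char ic_type;
	unsigned char product_id;
};

static bool sketch_check_asus_special_fw(const struct sketch_tp_data *data)
{
	if (data->ic_type == 0x0E) {
		switch (data->product_id) {
		case 0x05 ... 0x07:	/* GNU C case range, as in the patch */
		case 0x09:
		case 0x13:
			return true;
		}
	} else if (data->ic_type == 0x08 && data->product_id == 0x26) {
		/* ASUS EeeBook X205TA */
		return true;
	}

	return false;
}
```

The product IDs are only meaningful in combination with the IC type, so 0x26 on an 0x0E part, or 0x05 on an 0x08 part, does not match.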



Re: [PATCH 1/9] mm: fix 100% CPU kswapd busyloop on unreclaimable nodes

2017-03-06 Thread Minchan Kim
On Mon, Mar 06, 2017 at 11:24:10AM -0500, Johannes Weiner wrote:
> On Mon, Mar 06, 2017 at 10:37:40AM +0900, Minchan Kim wrote:
> > On Fri, Mar 03, 2017 at 08:59:54AM +0100, Michal Hocko wrote:
> > > On Fri 03-03-17 10:26:09, Minchan Kim wrote:
> > > > On Tue, Feb 28, 2017 at 04:39:59PM -0500, Johannes Weiner wrote:
> > > > > @@ -3316,6 +3325,9 @@ static int balance_pgdat(pg_data_t *pgdat, int order, int classzone_idx)
> > > > >   sc.priority--;
> > > > >   } while (sc.priority >= 1);
> > > > >  
> > > > > + if (!sc.nr_reclaimed)
> > > > > + pgdat->kswapd_failures++;
> > > > 
> > > > sc.nr_reclaimed is reset to zero at the beginning of the big loop above,
> > > > so most of the time pgdat->kswapd_failures is increased.
> 
> That wasn't intentional; I didn't see the sc.nr_reclaimed reset.
> 
> ---
> 
> From e126db716926ff353b35f3a6205bd5853e01877b Mon Sep 17 00:00:00 2001
> From: Johannes Weiner 
> Date: Mon, 6 Mar 2017 10:53:59 -0500
> Subject: [PATCH] mm: fix 100% CPU kswapd busyloop on unreclaimable nodes fix
> 
> Check kswapd failure against the cumulative nr_reclaimed count, not
> against the count from the lowest priority iteration.
> 
> Suggested-by: Minchan Kim 
> Signed-off-by: Johannes Weiner 
Acked-by: Minchan Kim 
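
The difference between the two checks discussed above can be modelled in a few lines. This is a simplified sketch, not the balance_pgdat() code: the function names, the `SKETCH_DEF_PRIORITY` constant, and the array of per-pass results are all hypothetical, and the real loop can exit early.

```c
#define SKETCH_DEF_PRIORITY 12

/*
 * Hypothetical model of the reclaim loop: reclaimed_at[p] is how many
 * pages the pass at priority p freed. Returns 1 if the run would be
 * counted as a failure.
 */
static int sketch_failed_last_pass(const unsigned long *reclaimed_at)
{
	unsigned long nr_reclaimed = 0;
	int priority;

	for (priority = SKETCH_DEF_PRIORITY; priority >= 1; priority--) {
		nr_reclaimed = 0;	/* reset on every iteration */
		nr_reclaimed += reclaimed_at[priority];
	}
	return nr_reclaimed == 0;	/* only sees the last pass */
}

static int sketch_failed_cumulative(const unsigned long *reclaimed_at)
{
	unsigned long total = 0;
	int priority;

	for (priority = SKETCH_DEF_PRIORITY; priority >= 1; priority--)
		total += reclaimed_at[priority];
	return total == 0;		/* sees the whole run */
}
```

A run that frees pages early but nothing in its final pass is a failure under the first check and a success under the second, which is the false positive Minchan pointed out.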




Re: [PATCH 1/2] x86/platform: Add a low priority low frequency NMI call chain

2017-03-06 Thread Ingo Molnar

* Mike Travis  wrote:

> Add a new NMI call chain that is called last after all other NMI handlers
> have been checked and did not "handle" the NMI.  This mimics the current
> NMI_UNKNOWN call chain except it eliminates the WARNING message about
> multiple NMI handlers registering on this call chain.
> 
> This call chain dramatically lowers the NMI call frequency when high
> frequency NMI tools are in use, notably the perf tools.  It is required
> for NMI handlers that cannot sustain a high NMI call rate without
> ramifications to the system operability.

So how about we just turn off that warning instead? I don't remember the last
time it actually _helped_ us find any kernel or hardware bug - and it has
caused tons of problems...

It's not like we warn about excess regular IRQs either - we either handle them
or at most increase a counter somewhere. We could do the same for NMIs:
introduce a counter somewhere that counts the number of seemingly unhandled
NMIs.

But in any case, we should not spam the kernel log, neither with high, nor with
low frequency.

Thanks,

Ingo
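
The "count instead of warn" idea can be sketched in userspace C. Everything here is an assumption made for illustration (the names dispatch_nmi(), unhandled_nmi_count and the sample handlers are hypothetical); it is not the x86 NMI dispatch code.

```c
#include <stdatomic.h>

/* Hypothetical names; a userspace sketch, not the x86 NMI code. */
static atomic_ulong unhandled_nmi_count;

/* A handler returns nonzero if it claimed the event. */
typedef int (*nmi_handler_fn)(void);

/* Two sample handlers used to exercise the dispatcher. */
static int handler_ignores(void) { return 0; }
static int handler_claims(void)  { return 1; }

/* Run every handler; return how many of them claimed the event. */
static int dispatch_nmi(nmi_handler_fn *handlers, int nhandlers)
{
    int handled = 0;
    int i;

    for (i = 0; i < nhandlers; i++)
        handled += handlers[i]() ? 1 : 0;

    /* Count the unhandled event quietly instead of logging a warning. */
    if (!handled)
        atomic_fetch_add(&unhandled_nmi_count, 1);

    return handled;
}

/* Read the number of events that no handler claimed. */
static unsigned long unhandled_nmis(void)
{
    return atomic_load(&unhandled_nmi_count);
}
```

A counter like this could be exported through whatever statistics interface is convenient; the point is only that the log stays quiet.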


Re: [PATCH] irqchip/gic-v3: Fix GICD_CTLR_ARE_NS bit field

2017-03-06 Thread Marc Zyngier
On Tue, Mar 07 2017 at  4:07:05 am GMT, Alim Akhtar wrote:
> From: Alim Akhtar 
>
> As per GICv3 Architecture specification 8.9.4 field descriptions,
> GICD_CTLR_ARE_NS is bit[5]. This patch correct the same.
>
> Fixes: 021f6537 ("irqchip: gic-v3: Initial support for GICv3")
> Cc: sta...@vger.kernel.org
> Signed-off-by: Alim Akhtar 
> ---
>  include/linux/irqchip/arm-gic-v3.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/linux/irqchip/arm-gic-v3.h b/include/linux/irqchip/arm-gic-v3.h
> index e808f8a..4aaf639 100644
> --- a/include/linux/irqchip/arm-gic-v3.h
> +++ b/include/linux/irqchip/arm-gic-v3.h
> @@ -57,7 +57,7 @@
>  
>  #define GICD_CTLR_RWP (1U << 31)
>  #define GICD_CTLR_DS (1U << 6)
> -#define GICD_CTLR_ARE_NS (1U << 4)
> +#define GICD_CTLR_ARE_NS (1U << 5)
>  #define GICD_CTLR_ENABLE_G1A (1U << 1)
>  #define GICD_CTLR_ENABLE_G1  (1U << 0)

No, the issue is much more subtle.

- When the access is secure in a system that supports two security
  states, this is bit[5] indeed.

- When the access is non-secure in a system that supports two security
  states, this is bit[4] (so that software written for a single security
  mode can run on both sides of the security fence).

- In a system that only supports a single security state, this is bit[4]
  too.

Given that Linux is only designed to run on the non-secure side (at
least when paired with GICv3), I stand by my original bit layout.

Thanks,

M.
-- 
Jazz is not dead, it just smell funny.
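
The three cases listed in the reply above can be written down as a small lookup. This is a sketch only: gicd_ctlr_are_ns_mask() and the two macro names are hypothetical, and the kernel header simply hard-codes the non-secure view.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration; not taken from the kernel header. */
#define ARE_NS_BIT_NONSECURE_VIEW  (1U << 4)
#define ARE_NS_BIT_SECURE_VIEW     (1U << 5)

/*
 * Return the mask of the ARE_NS field as seen by the accessing software,
 * following the three cases described in the reply.
 */
static uint32_t gicd_ctlr_are_ns_mask(bool two_security_states,
                                      bool secure_access)
{
    /* Secure access on a system with two security states: bit[5]. */
    if (two_security_states && secure_access)
        return ARE_NS_BIT_SECURE_VIEW;

    /* Non-secure access, or a single security state: bit[4]. */
    return ARE_NS_BIT_NONSECURE_VIEW;
}
```

Software that only ever runs non-secure sees bit[4] in both remaining cases, which is the argument for leaving the define unchanged.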


Re: [PATCH] docs: Clarify details for reporting security bugs

2017-03-06 Thread Jonathan Corbet
On Mon, 6 Mar 2017 11:13:51 -0800
Kees Cook  wrote:

> The kernel security team is regularly asked to provide CVE identifiers,
> which we don't normally do. This updates the documentation to mention
> this and adds some more details about coordination and patch handling
> that come up regularly. Based on an earlier draft by Willy Tarreau.
> 
> Signed-off-by: Kees Cook 
> Acked-by: Willy Tarreau 

Seems good, applied to the docs tree, thanks.

> Related question: shouldn't security-bugs.rst and submitting-patches.rst live
> in /process/ rather than /admin-guide/ ?

The former should maybe be there, depending on just who we think it
should be aimed at, I guess.  submitting-patches.rst is already in
process/, though, so I'm not quite sure I understand that question?

Thanks,

jon


[tip:perf/core] tools arch x86: Introduce atomic_cmpxchg()

2017-03-06 Thread tip-bot for Arnaldo Carvalho de Melo
Commit-ID:  8a73615df3b8973df2de1455c00e9169522d8257
Gitweb: http://git.kernel.org/tip/8a73615df3b8973df2de1455c00e9169522d8257
Author: Arnaldo Carvalho de Melo 
AuthorDate: Wed, 22 Feb 2017 16:55:43 -0300
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Fri, 3 Mar 2017 19:07:13 -0300

tools arch x86: Introduce atomic_cmpxchg()

Will be used by atomic_cmpxchg_relaxed(), in turn used by refcount.h.

Cc: Adrian Hunter 
Cc: David Ahern 
Cc: Elena Reshetova 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Wang Nan 
Link: http://lkml.kernel.org/n/tip-kdmovd3l4gw5b1w31ypr6...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/arch/x86/include/asm/atomic.h | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/tools/arch/x86/include/asm/atomic.h b/tools/arch/x86/include/asm/atomic.h
index 059e33e..328eece 100644
--- a/tools/arch/x86/include/asm/atomic.h
+++ b/tools/arch/x86/include/asm/atomic.h
@@ -7,6 +7,8 @@
 
 #define LOCK_PREFIX "\n\tlock; "
 
+#include 
+
 /*
  * Atomic operations that C can't guarantee us.  Useful for
  * resource counting etc..
@@ -62,4 +64,9 @@ static inline int atomic_dec_and_test(atomic_t *v)
GEN_UNARY_RMWcc(LOCK_PREFIX "decl", v->counter, "%0", "e");
 }
 
+static __always_inline int atomic_cmpxchg(atomic_t *v, int old, int new)
+{
+   return cmpxchg(&v->counter, old, new);
+}
+
 #endif /* _TOOLS_LINUX_ASM_X86_ATOMIC_H */
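
For readers unfamiliar with compare-and-exchange, the semantics can be modelled with C11 atomics. This is a sketch under stated assumptions: model_atomic_t, model_atomic_cmpxchg() and model_inc_not_zero() are hypothetical names, and the retry loop is a generic illustration of how a reference counter might use the primitive, not the contents of refcount.h.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for atomic_t, built on C11 atomics. */
typedef struct { atomic_int counter; } model_atomic_t;

/*
 * If the counter equals 'old', store 'new_val'. In both cases return the
 * value that was observed before the operation.
 */
static int model_atomic_cmpxchg(model_atomic_t *v, int old, int new_val)
{
    int expected = old;

    /* On failure, 'expected' is overwritten with the current value. */
    atomic_compare_exchange_strong(&v->counter, &expected, new_val);
    return expected;
}

/* Increment the counter unless it is zero; retry if another thread raced. */
static bool model_inc_not_zero(model_atomic_t *v)
{
    int val = atomic_load(&v->counter);

    for (;;) {
        int seen;

        if (val == 0)
            return false;

        seen = model_atomic_cmpxchg(v, val, val + 1);
        if (seen == val)
            return true;

        val = seen;  /* lost the race, retry with the fresh value */
    }
}
```

A caller detects success by comparing the returned value with the 'old' value it passed in, which is exactly what the retry loop does.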


Re: [PATCH] perf probe: Return errno when does not hit any event

2017-03-06 Thread Kefeng Wang
+ Arnaldo Carvalho de Melo 

On 2017/3/6 17:34, Kefeng Wang wrote:
> On old perf, when using perf probe -d to delete a nonexistent event,
> it returns an errno, e.g.,
> 
> -bash-4.3# perf probe -d xxx  || echo $?
> Info: Event "*:xxx" does not exist.
>   Error: Failed to delete events.
> 255
> 
> But now perf_del_probe_events() always sets ret = 0, unlike the
> previous del_perf_probe_events(). After this patch, it returns an errno
> again, e.g.,
> 
> -bash-4.3# ./perf probe -d xxx  || echo $?
>   Error: Failed to delete events.
> 254
> 
> And it is more appropriate to return -ENOENT instead of -EPERM.
> 
> Signed-off-by: Kefeng Wang 
> ---
>  tools/perf/builtin-probe.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
> index 1fcebc3..c46b41c 100644
> --- a/tools/perf/builtin-probe.c
> +++ b/tools/perf/builtin-probe.c
> @@ -444,7 +444,8 @@ static int perf_del_probe_events(struct strfilter *filter)
>   if (ret == -ENOENT && ret2 == -ENOENT)
>   pr_debug("\"%s\" does not hit any event.\n", str);
>   /* Note that this is silently ignored */
> - ret = 0;
> + else
> + ret = 0;
>  
>  error:
>   if (kfd >= 0)
> 
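
The effect of the one-line change can be modelled in isolation. This is a simplified sketch: del_result_before(), del_result_after() and shell_status() are hypothetical helpers, ret and ret2 stand for the two deletion results compared in the hunk, and the real function has other error paths that are not modelled.

```c
#include <errno.h>

/* Before the patch: the result is always overwritten with 0. */
static int del_result_before(int ret, int ret2)
{
    (void)ret;
    (void)ret2;
    return 0;
}

/* After the patch: -ENOENT survives when neither deletion hit an event. */
static int del_result_after(int ret, int ret2)
{
    if (ret == -ENOENT && ret2 == -ENOENT)
        return ret;
    return 0;
}

/* A process exit status keeps only the low 8 bits of the returned value. */
static int shell_status(int ret)
{
    return (unsigned char)ret;
}
```

On Linux, where ENOENT is 2 and EPERM is 1, this matches the shell output quoted in the commit message: -2 shows up as 254 and -1 as 255.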



Re: [PATCH] zram: set physical queue limits to avoid array out of bounds accesses

2017-03-06 Thread Hannes Reinecke
On 03/07/2017 06:22 AM, Minchan Kim wrote:
> Hello Johannes,
> 
> On Mon, Mar 06, 2017 at 11:23:35AM +0100, Johannes Thumshirn wrote:
>> zram can handle at most SECTORS_PER_PAGE sectors in a bio's bvec. When using
>> the NVMe over Fabrics loopback target which potentially sends a huge bulk of
>> pages attached to the bio's bvec this results in a kernel panic because of
>> array out of bounds accesses in zram_decompress_page().
> 
> First of all, thanks for the report and fix up!
> Unfortunately, I'm not familiar with that interface of block layer.
> 
> It seems this is a material for stable so I want to understand it clear.
> Could you say more specific things to educate me?
> 
> What scenario/When/How it is problem?  It will help for me to understand!
> 
The problem is that zram as it currently stands can only handle bios
where each bvec contains a single page (or, to be precise, a chunk of
data with a length of a page).

This is not an automatic guarantee from the block layer (who is free to
send us bios with arbitrary-sized bvecs), so we need to set the queue
limits to ensure that.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Teamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
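
What a per-segment limit buys the driver can be shown with a small model. This sketch is an assumption-laden illustration, not block-layer code: struct model_limits, segments_needed() and segment_len() are hypothetical, and a 4 KiB page size is assumed. The idea is that once a driver advertises a maximum segment size, the layer above splits larger requests so that no single chunk exceeds it.

```c
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u  /* assumption: 4 KiB pages */

/* Hypothetical stand-in for the limits a driver advertises. */
struct model_limits {
    size_t max_segment_size;  /* largest chunk the driver accepts */
};

/* Number of segments needed to carry request_len bytes under the limit. */
static size_t segments_needed(size_t request_len,
                              const struct model_limits *lim)
{
    return (request_len + lim->max_segment_size - 1) / lim->max_segment_size;
}

/* Length of segment idx after splitting; 0 if idx is past the end. */
static size_t segment_len(size_t request_len, size_t idx,
                          const struct model_limits *lim)
{
    size_t start = idx * lim->max_segment_size;
    size_t remaining;

    if (start >= request_len)
        return 0;

    remaining = request_len - start;
    return remaining < lim->max_segment_size ? remaining
                                             : lim->max_segment_size;
}
```

With the limit set to one page, a 10000-byte request arrives as three segments of 4096, 4096 and 1808 bytes, so a per-page buffer is never overrun.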


Re: [PATCH] x86/fpu: fix boolreturn.cocci warnings

2017-03-06 Thread Ingo Molnar

* kbuild test robot  wrote:

> arch/x86/kernel/fpu/xstate.c:931:9-10: WARNING: return of 0/1 in function 'xfeatures_mxcsr_quirk' with return type bool
> 
>  Return statements in functions returning bool should use
>  true/false instead of 1/0.

Note that this is a totally bogus warning. I personally find a 0/1 return more
readable than a textual 'true/false', even if bools are used, and nowhere does the
kernel mandate the use of 0/1.

So NAK ...

Thanks,

Ingo


Re: [lustre-devel] [PATCH] staging: lustre: fix sparse warning about different address spaces

2017-03-06 Thread Oleg Drokin

On Mar 1, 2017, at 6:57 PM, Mario Bambagini wrote:

> fixed the following sparse warning by adding proper cast:
> drivers/staging//lustre/lustre/obdclass/obd_config.c:1055:74: warning: incorrect type in argument 2 (different address spaces)
> drivers/staging//lustre/lustre/obdclass/obd_config.c:1055:74:    expected char const [noderef] *
> drivers/staging//lustre/lustre/obdclass/obd_config.c:1055:74:    got char *[assigned] sval
> 
> Signed-off-by: Mario Bambagini 

The patch is fine, but just be advised this whole function is going away real soon
now per Al Viro's request (and also because it no longer does what it should).

> ---
> drivers/staging/lustre/lustre/obdclass/obd_config.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/staging/lustre/lustre/obdclass/obd_config.c b/drivers/staging/lustre/lustre/obdclass/obd_config.c
> index 9ca84c7..8fce88f 100644
> --- a/drivers/staging/lustre/lustre/obdclass/obd_config.c
> +++ b/drivers/staging/lustre/lustre/obdclass/obd_config.c
> @@ -1052,7 +1052,8 @@ int class_process_proc_param(char *prefix, struct lprocfs_vars *lvars,
> 
>   oldfs = get_fs();
>   set_fs(KERNEL_DS);
> - rc = var->fops->write(, sval,
> + rc = var->fops->write(,
> + (const char __user *)sval,
>   vallen, NULL);
>   set_fs(oldfs);
>   }
> -- 
> 2.1.4
> 
> ___
> lustre-devel mailing list
> lustre-de...@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org
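
For context on the "different address spaces" warning, the annotation can be modelled in a standalone file. This is a sketch with assumptions: the macro definition below follows, to the best of my knowledge, how kernel headers of that era defined __user for the sparse checker, and model_write() and call_with_kernel_buffer() are hypothetical names. A regular compiler sees an empty macro, so the cast is a no-op outside the checker.

```c
#include <stddef.h>
#include <string.h>

/*
 * Assumed to mirror the kernel's annotation: only the sparse checker
 * (which defines __CHECKER__) sees the address-space attribute.
 */
#ifdef __CHECKER__
# define __user __attribute__((noderef, address_space(1)))
#else
# define __user
#endif

/* Hypothetical write callback that expects a user-space pointer. */
static long model_write(const char __user *buf, size_t len)
{
    (void)buf;
    return (long)len;
}

/* Passing a kernel buffer: the cast states the address space explicitly. */
static long call_with_kernel_buffer(char *sval)
{
    return model_write((const char __user *)sval, strlen(sval));
}
```

The cast only changes what the checker is told; it does not make the pointer a real user-space address, which is why the original code brackets the call with set_fs(KERNEL_DS).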



[tip:perf/core] perf vendor events: Add mapping for KnightsMill PMU events

2017-03-06 Thread tip-bot for Karol Wachowski
Commit-ID:  771ceddaadd0a2b31603034b36dca50943ff6836
Gitweb: http://git.kernel.org/tip/771ceddaadd0a2b31603034b36dca50943ff6836
Author: Karol Wachowski 
AuthorDate: Mon, 20 Feb 2017 12:50:40 +0100
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Fri, 3 Mar 2017 19:07:13 -0300

perf vendor events: Add mapping for KnightsMill PMU events

Reuse events from KnightsLanding for KnightsMill

Signed-off-by: Karol Wachowski 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Dave Hansen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Cc: Piotr Luc 
Cc: Srinivas Pandruvada 
Link: http://lkml.kernel.org/r/1487591440-25172-1-git-send-email-karol.wachow...@intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/pmu-events/arch/x86/mapfile.csv | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv b/tools/perf/pmu-events/arch/x86/mapfile.csv
index 12181bb..d1a12e5 100644
--- a/tools/perf/pmu-events/arch/x86/mapfile.csv
+++ b/tools/perf/pmu-events/arch/x86/mapfile.csv
@@ -17,6 +17,7 @@ GenuineIntel-6-3A,v18,ivybridge,core
 GenuineIntel-6-3E,v19,ivytown,core
 GenuineIntel-6-2D,v20,jaketown,core
 GenuineIntel-6-57,v9,knightslanding,core
+GenuineIntel-6-85,v9,knightslanding,core
 GenuineIntel-6-1E,v2,nehalemep,core
 GenuineIntel-6-1F,v2,nehalemep,core
 GenuineIntel-6-1A,v2,nehalemep,core
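
How such a map is consulted can be sketched as a plain table lookup. This is an illustration only: struct cpu_map_entry, its field names and events_dir_for() are hypothetical, the column meanings are my reading of the CSV rows in the diff, and the perf build processes the real file differently.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory form of a few mapfile.csv rows from the diff. */
struct cpu_map_entry {
    const char *cpuid;       /* vendor-family-model string */
    const char *version;     /* event list version */
    const char *events_dir;  /* directory holding the event definitions */
    const char *type;        /* event type, "core" in these rows */
};

static const struct cpu_map_entry model_map[] = {
    { "GenuineIntel-6-57", "v9", "knightslanding", "core" },
    { "GenuineIntel-6-85", "v9", "knightslanding", "core" },  /* new row */
    { "GenuineIntel-6-1E", "v2", "nehalemep",      "core" },
};

/* Return the events directory for a CPU id, or NULL if it is unmapped. */
static const char *events_dir_for(const char *cpuid)
{
    size_t i;

    for (i = 0; i < sizeof(model_map) / sizeof(model_map[0]); i++) {
        if (strcmp(model_map[i].cpuid, cpuid) == 0)
            return model_map[i].events_dir;
    }
    return NULL;
}
```

Adding the 6-85 row is all it takes for that model to resolve to the existing knightslanding event set; no new event files are needed.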


[PATCH 1/2] usb: xhci-mtk: rebuild xhci_mtk_setup()

2017-03-06 Thread Chunfeng Yun
simplify xhci_mtk_setup() and add xhci_mtk_start() for
xhci_driver_overrides struct

Signed-off-by: Chunfeng Yun 
---
 drivers/usb/host/xhci-mtk.c |   16 +++-
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/usb/host/xhci-mtk.c b/drivers/usb/host/xhci-mtk.c
index 67d5dc7..9636884 100644
--- a/drivers/usb/host/xhci-mtk.c
+++ b/drivers/usb/host/xhci-mtk.c
@@ -381,8 +381,10 @@ static int usb_wakeup_of_property_parse(struct xhci_hcd_mtk *mtk,
 }
 
 static int xhci_mtk_setup(struct usb_hcd *hcd);
+static int xhci_mtk_start(struct usb_hcd *hcd);
 static const struct xhci_driver_overrides xhci_mtk_overrides __initconst = {
.reset = xhci_mtk_setup,
+   .start = xhci_mtk_start,
 };
 
 static struct hc_driver __read_mostly xhci_mtk_hc_driver;
@@ -492,7 +494,6 @@ static void xhci_mtk_quirks(struct device *dev, struct xhci_hcd *xhci)
 /* called during probe() after chip reset completes */
 static int xhci_mtk_setup(struct usb_hcd *hcd)
 {
-   struct xhci_hcd *xhci = hcd_to_xhci(hcd);
struct xhci_hcd_mtk *mtk = hcd_to_mtk(hcd);
int ret;
 
@@ -502,9 +503,14 @@ static int xhci_mtk_setup(struct usb_hcd *hcd)
return ret;
}
 
-   ret = xhci_gen_setup(hcd, xhci_mtk_quirks);
-   if (ret)
-   return ret;
+   return xhci_gen_setup(hcd, xhci_mtk_quirks);
+}
+
+static int xhci_mtk_start(struct usb_hcd *hcd)
+{
+   struct xhci_hcd *xhci = hcd_to_xhci(hcd);
+   struct xhci_hcd_mtk *mtk = hcd_to_mtk(hcd);
+   int ret;
 
if (usb_hcd_is_primary_hcd(hcd)) {
mtk->num_u3_ports = xhci->num_usb3_ports;
@@ -514,7 +520,7 @@ static int xhci_mtk_setup(struct usb_hcd *hcd)
return ret;
}
 
-   return ret;
+   return xhci_run(hcd);
 }
 
 static int xhci_mtk_probe(struct platform_device *pdev)
-- 
1.7.9.5
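
The override pattern used by the patch, a table of optional hooks that replace generic callbacks, can be sketched in userspace. All names below (model_hcd, model_driver, model_overrides, init_driver() and the callbacks) are hypothetical; the sketch only mirrors the shape of the change, where the platform start hook does its own work and then falls through to the generic start routine.

```c
#include <stddef.h>

/* Hypothetical host controller state. */
struct model_hcd {
    int setup_done;
    int started;
};

typedef int (*hcd_op)(struct model_hcd *);

struct model_driver    { hcd_op reset; hcd_op start; };
struct model_overrides { hcd_op reset; hcd_op start; };

/* Generic callbacks, used unless a platform overrides them. */
static int generic_reset(struct model_hcd *h) { h->setup_done = 1; return 0; }
static int generic_start(struct model_hcd *h) { h->started = 1; return 0; }

/* Fill in generic callbacks, then apply any non-NULL overrides. */
static void init_driver(struct model_driver *drv,
                        const struct model_overrides *over)
{
    drv->reset = generic_reset;
    drv->start = generic_start;

    if (over) {
        if (over->reset)
            drv->reset = over->reset;
        if (over->start)
            drv->start = over->start;
    }
}

/* Platform hook: needs state produced by reset, then runs generic start. */
static int platform_start(struct model_hcd *h)
{
    if (!h->setup_done)
        return -1;
    return generic_start(h);
}
```

Splitting the work this way keeps the reset hook limited to setup, while anything that depends on the result of setup runs from the start hook.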



[PATCH 2/2] usb: xhci-mtk: fix checkpatch warning and error

2017-03-06 Thread Chunfeng Yun
There are two warnings and an error reported by checkpatch.pl,
as follows:

WARNING:BLOCK_COMMENT_STYLE: Block comments should align
the * on each line

ERROR:COMPLEX_MACRO: Macros with complex values should be
enclosed in parentheses

Signed-off-by: Chunfeng Yun 
---
 drivers/usb/host/xhci-mtk.c |   16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/usb/host/xhci-mtk.c b/drivers/usb/host/xhci-mtk.c
index 9636884..22c94fe 100644
--- a/drivers/usb/host/xhci-mtk.c
+++ b/drivers/usb/host/xhci-mtk.c
@@ -287,9 +287,9 @@ static void usb_wakeup_ip_sleep_dis(struct xhci_hcd_mtk *mtk)
 }
 
 /*
-* for line-state wakeup mode, phy's power should not power-down
-* and only support cable plug in/out
-*/
+ * for line-state wakeup mode, phy's power should not power-down
+ * and only support cable plug in/out
+ */
 static void usb_wakeup_line_state_en(struct xhci_hcd_mtk *mtk)
 {
u32 tmp;
@@ -350,10 +350,10 @@ static int usb_wakeup_of_property_parse(struct xhci_hcd_mtk *mtk,
struct device *dev = mtk->dev;
 
/*
-   * wakeup function is optional, so it is not an error if this property
-   * does not exist, and in such case, no need to get relative
-   * properties anymore.
-   */
+* wakeup function is optional, so it is not an error if this property
+* does not exist, and in such case, no need to get relative
+* properties anymore.
+*/
of_property_read_u32(dn, "mediatek,wakeup-src", &mtk->wakeup_src);
if (!mtk->wakeup_src)
return 0;
@@ -796,7 +796,7 @@ static int __maybe_unused xhci_mtk_resume(struct device *dev)
 static const struct dev_pm_ops xhci_mtk_pm_ops = {
SET_SYSTEM_SLEEP_PM_OPS(xhci_mtk_suspend, xhci_mtk_resume)
 };
-#define DEV_PM_OPS IS_ENABLED(CONFIG_PM) ? &xhci_mtk_pm_ops : NULL
+#define DEV_PM_OPS (IS_ENABLED(CONFIG_PM) ? &xhci_mtk_pm_ops : NULL)
 
 #ifdef CONFIG_OF
 static const struct of_device_id mtk_xhci_of_match[] = {
-- 
1.7.9.5
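
Why checkpatch insists on the parentheses is easiest to see with integers. This is a standalone sketch with hypothetical macro names (FLAG_ON, PICK_BAD, PICK_GOOD); it is not code from the driver.

```c
#define FLAG_ON 1

/* Unparenthesized: surrounding operators bind into the macro body. */
#define PICK_BAD   FLAG_ON ? 2 : 3

/* Parenthesized: the conditional is evaluated as one unit. */
#define PICK_GOOD  (FLAG_ON ? 2 : 3)

/* Expands to 10 + 1 ? 2 : 3, which parses as (10 + 1) ? 2 : 3. */
static int bad_plus_ten(void)
{
    return 10 + PICK_BAD;
}

/* Expands to 10 + (1 ? 2 : 3). */
static int good_plus_ten(void)
{
    return 10 + PICK_GOOD;
}
```

The first helper returns 2 and the second returns 12. A compiler may warn about the first expansion, which is the same problem checkpatch flags at the macro definition.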



Re: [PATCH] zram: set physical queue limits to avoid array out of bounds accesses

2017-03-06 Thread Minchan Kim
Hi Hannes,

On Tue, Mar 7, 2017 at 4:00 PM, Hannes Reinecke  wrote:
> On 03/07/2017 06:22 AM, Minchan Kim wrote:
>> Hello Johannes,
>>
>> On Mon, Mar 06, 2017 at 11:23:35AM +0100, Johannes Thumshirn wrote:
>>> zram can handle at most SECTORS_PER_PAGE sectors in a bio's bvec. When using
>>> the NVMe over Fabrics loopback target which potentially sends a huge bulk of
>>> pages attached to the bio's bvec this results in a kernel panic because of
>>> array out of bounds accesses in zram_decompress_page().
>>
>> First of all, thanks for the report and fix up!
>> Unfortunately, I'm not familiar with that interface of block layer.
>>
>> It seems this is a material for stable so I want to understand it clear.
>> Could you say more specific things to educate me?
>>
>> What scenario/When/How it is problem?  It will help for me to understand!
>>

Thanks for the quick response!

> The problem is that zram as it currently stands can only handle bios
> where each bvec contains a single page (or, to be precise, a chunk of
> data with a length of a page).

Right.

>
> This is not an automatic guarantee from the block layer (who is free to
> send us bios with arbitrary-sized bvecs), so we need to set the queue
> limits to ensure that.

What does it mean "bios with arbitrary-sized bvecs"?
What kinds of scenario is it used/useful?

And how can we solve it by setting queue limit?

Sorry for the many questions due to limited knowledge.

Thanks.


Re: [PATCH v5 05/15] livepatch/powerpc: add TIF_PATCH_PENDING thread flag

2017-03-06 Thread Balbir Singh
On Mon, 2017-02-13 at 19:42 -0600, Josh Poimboeuf wrote:
> Add the TIF_PATCH_PENDING thread flag to enable the new livepatch
> per-task consistency model for powerpc.  The bit getting set indicates
> the thread has a pending patch which needs to be applied when the thread
> exits the kernel.
> 
> The bit is included in the _TIF_USER_WORK_MASK macro so that
> do_notify_resume() and klp_update_patch_state() get called when the bit
> is set.
> 
> Signed-off-by: Josh Poimboeuf 
> Reviewed-by: Petr Mladek 
> Reviewed-by: Miroslav Benes 
> Reviewed-by: Kamalesh Babulal 
> ---

Reviewed-by: Balbir Singh 



Re: [GIT PULL 00/35] perf/core improvements and fixes

2017-03-06 Thread Ingo Molnar

* Arnaldo Carvalho de Melo <a...@kernel.org> wrote:

> From: Arnaldo Carvalho de Melo <a...@redhat.com>
> 
> Hi Ingo,
> 
>   Please consider pulling,
> 
> - Arnaldo
> 
> Test results at the end of this message, as usual.
> 
> The following changes since commit 9d020d33fc1b2faa0eb35859df1381ca5dc94ffe:
> 
>   Merge branch 'linus' into perf/urgent, to resolve conflict (2017-03-02 08:05:45 +0100)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.11-20170306
> 
> for you to fetch changes up to 001916b94a04809a94abb07daba6f9ace01906ba:
> 
>   perf bench numa: Add more comment for -c option (2017-03-06 12:39:30 -0300)
> 
> 
> perf/core improvements and fixes:
> 
> New features:
> 
> - Allow sorting by symbol_size in 'perf report' and 'perf top' (Charles Baylis)
> 
>   E.g.:
> 
>   # perf report -s symbol_size,symbol
> 
>   Samples: 9K of event 'cycles:k', Event count (approx.): 2870461623
>   Overhead  Symbol size  Symbol
>     14.55%          326  [k] flush_tlb_mm_range
>      7.20%         1045  [k] filemap_map_pages
>      5.82%          124  [k] vma_interval_tree_insert
>      5.18%         2430  [k] unmap_page_range
>      2.57%          571  [k] vma_interval_tree_remove
>      1.94%          494  [k] page_add_file_rmap
>      1.82%          740  [k] page_remove_rmap
>      1.66%         1017  [k] release_pages
>      1.57%         1636  [k] update_blocked_averages
>      1.57%           76  [k] unlock_page
> 
> - Add support for -p/--pid, -a/--all-cpus and -C/--cpu in 'perf ftrace' (Namhyung Kim)
> 
> Change in behaviour:
> 
> - Make system wide (-a) the default option if no target was specified and one
>   of following conditions is met:
> 
>   - No workload specified (current behaviour)
> 
>   - A workload is specified but all requested events are system wide ones,
>     like uncore ones. (Jiri Olsa)
> 
> Fixes:
> 
> - Add missing initialization to the instruction decoder used in the
>   intel PT/BTS code, which was causing lots of failures in 'perf test',
>   looking for a value when there was none (Adrian Hunter)
> 
> Infrastructure:
> 
> - Add arch code needed to adopt the kernel's refcount_t to aid in
>   catching bugs when using atomic_t as a reference counter, basically
>   cmpxchg related functions (Arnaldo Carvalho de Melo)
> 
> - Convert the code using atomic_t as reference counts to refcount_t
>   (Elena Reshetova)
> 
> - Add feature test for sched_getcpu() to more easily check for its
>   presence in the many libc implementations and across different
>   versions of such C libraries (Arnaldo Carvalho de Melo)
> 
> - Issue a HW watchdog disable hint in 'perf stat' for when some of the
>   requested events can't get counted because a PMU counter is taken by that
>   watchdog (Borislav Petkov).
> 
> - Add mapping for Intel's KnightsMill PMU events (Karol Wachowski)
> 
> Documentation:
> 
> - Clarify the term 'convergence' in:
> 
>      perf bench numa numa-mem -h --show_convergence (Jiri Olsa)
> 
> Kernel code:
> 
> - Ensure probe location is at function entry in kretprobes (Naveen N. Rao)
> 
> - Allow return probes with offsets and absolute addresses (Naveen N. Rao)
> 
> Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com>
> 
> 
> Adrian Hunter (1):
>   perf intel-PT/BTS: Add missing initialization
> 
> Arnaldo Carvalho de Melo (12):
>   tools include: Adopt __compiletime_error
>   tools arch x86: Include asm/cmpxchg.h
>   tools arch x86: Introduce atomic_cmpxchg()
>   tools include: Introduce atomic_cmpxchg_{relaxed,release}()
>   tools include: Provide gcc based cmpxchg fallback for !x86
>   tools include: Add UINT_MAX def to kernel.h
>   tools include: Adopt kernel's refcount.h
>   perf evlist: Clarify a bit the use of perf_mmap->refcnt
>   tools build: Add test for sched_getcpu()
>   perf bench futex: Use __maybe_unused
>   perf bench futex: Fix build on musl + clang
>   tools build: Use the same CC for feature detection and actual build
> 
> Borislav Petkov (1):
>   perf stat: Issue a HW watchdog disable hint
> 
> Charles Baylis (1):
>   perf tools: Allow sorting by symbol size
> 
> Elena Reshetova (9):
>   perf cgroup: Convert cgroup_sel.refcnt from atomic_t to refcount_t
>   perf cpumap: Convert cpu_map.refcnt from atomic_t to refcount_t
>   perf comm: Convert comm_str.refcnt from atomic_t to refcount_t
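
For readers unfamiliar with why these conversions need cmpxchg-style helpers, the following is a rough user-space sketch of a saturating reference counter built on a compare-and-swap loop. It is written with C11 atomics and is not the kernel's or the tools/ copy of refcount.h. The names `model_refcount`, `model_refcount_inc_not_zero` and `model_refcount_dec_and_test` are hypothetical, and the exact saturation policy shown (pin at UINT_MAX, refuse to go below zero) is an assumption made for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a checked reference counter. */
struct model_refcount {
	atomic_uint refs;
};

void model_refcount_set(struct model_refcount *r, unsigned int n)
{
	atomic_store(&r->refs, n);
}

/*
 * Take a reference unless the count is already zero (object being freed).
 * A counter that has reached UINT_MAX stays there instead of wrapping.
 */
bool model_refcount_inc_not_zero(struct model_refcount *r)
{
	unsigned int old = atomic_load(&r->refs);

	for (;;) {
		if (old == 0)
			return false;	/* would resurrect a dead object */
		if (old == UINT_MAX)
			return true;	/* saturated: never overflow */
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&r->refs, &old, old + 1))
			return true;
	}
}

/*
 * Drop a reference. Returns true only for the caller that moves the
 * count from 1 to 0. An attempt to drop below zero is refused.
 */
bool model_refcount_dec_and_test(struct model_refcount *r)
{
	unsigned int old = atomic_load(&r->refs);

	for (;;) {
		if (old == 0)
			return false;	/* underflow attempt */
		if (old == UINT_MAX)
			return false;	/* saturated counters are never freed */
		if (atomic_compare_exchange_weak(&r->refs, &old, old - 1))
			return old == 1;
	}
}
```

A plain atomic increment and decrement cannot express the "unless zero" and "unless saturated" conditions in one step, which is why the compare-and-swap loop is the natural building block here.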

[PATCH] staging: rtl8192e: fix coding style issue, improve error handling

2017-03-06 Thread Suniel Mahesh
Fix coding style issue and comments in rtl_core.c
Return -ENOMEM, if it is out of memory
Pointer comparison with NULL replaced by logical NOT

Signed-off-by: Suniel Mahesh 
---
 drivers/staging/rtl8192e/rtl8192e/rtl_core.c | 253 +++
 1 file changed, 100 insertions(+), 153 deletions(-)

diff --git a/drivers/staging/rtl8192e/rtl8192e/rtl_core.c b/drivers/staging/rtl8192e/rtl8192e/rtl_core.c
index 4c0caa6..1099c94 100644
--- a/drivers/staging/rtl8192e/rtl8192e/rtl_core.c
+++ b/drivers/staging/rtl8192e/rtl8192e/rtl_core.c
@@ -1,4 +1,4 @@
-/**
+/
  * Copyright(c) 2008 - 2010 Realtek Corporation. All rights reserved.
  *
  * Based on the r8180 driver, which is:
@@ -17,7 +17,7 @@
  *
  * Contact Information:
  * wlanfae 
-**/
+ */
 #include 
 #include 
 #include 
@@ -37,7 +37,6 @@
 static int channels = 0x3fff;
 static char *ifname = "wlan%d";
 
-
 static const struct rtl819x_ops rtl819xp_ops = {
.nic_type   = NIC_8192E,
.get_eeprom_size= rtl92e_get_eeprom_size,
@@ -100,9 +99,7 @@ static void _rtl92e_hard_data_xmit(struct sk_buff *skb, struct net_device *dev,
 static int _rtl92e_down(struct net_device *dev, bool shutdownrf);
 static void _rtl92e_restart(void *data);
 
-/
-   -IO STUFF-
-*/
+/* IO STUFF */
 
 u8 rtl92e_readb(struct net_device *dev, int x)
 {
@@ -140,9 +137,7 @@ void rtl92e_writew(struct net_device *dev, int x, u16 y)
udelay(20);
 }
 
-/
-   -GENERAL FUNCTION-
-*/
+/* GENERAL FUNCTION */
 bool rtl92e_set_rf_state(struct net_device *dev,
 enum rt_rf_power_state StateToSet,
 RT_RF_CHANGE_SOURCE ChangeSource)
@@ -200,7 +195,6 @@ bool rtl92e_set_rf_state(struct net_device *dev,
priv->rtllib->RfOffReason = 0;
bActionAllowed = true;
 
-
if (rtState == eRfOff &&
ChangeSource >= RF_CHANGE_BY_HW)
bConnectBySSID = true;
@@ -223,7 +217,8 @@ bool rtl92e_set_rf_state(struct net_device *dev,
else
priv->blinked_ingpio = false;
rtllib_MgntDisconnect(priv->rtllib,
- 
WLAN_REASON_DISASSOC_STA_HAS_LEFT);
+ WLAN_REASON_DISASSOC_STA_
+   HAS_LEFT);
}
}
if ((ChangeSource == RF_CHANGE_BY_HW) && !priv->bHwRadioOff)
@@ -247,7 +242,6 @@ bool rtl92e_set_rf_state(struct net_device *dev,
 StateToSet, priv->rtllib->RfOffReason);
PHY_SetRFPowerState(dev, StateToSet);
if (StateToSet == eRfOn) {
-
if (bConnectBySSID && priv->blinked_ingpio) {
schedule_delayed_work(
 >associate_procedure_wq, 0);
@@ -346,16 +340,16 @@ static void _rtl92e_update_cap(struct net_device *dev, u16 cap)
}
}
 
-   if (net->mode & (IEEE_G|IEEE_N_24G)) {
+   if (net->mode & (IEEE_G | IEEE_N_24G)) {
u8  slot_time_val;
u8  CurSlotTime = priv->slot_time;
 
if ((cap & WLAN_CAPABILITY_SHORT_SLOT_TIME) &&
-  (!priv->rtllib->pHTInfo->bCurrentRT2RTLongSlotTime)) {
+   (!priv->rtllib->pHTInfo->bCurrentRT2RTLongSlotTime)) {
if (CurSlotTime != SHORT_SLOT_TIME) {
slot_time_val = SHORT_SLOT_TIME;
priv->rtllib->SetHwRegHandler(dev,
-HW_VAR_SLOT_TIME, &slot_time_val);
+   HW_VAR_SLOT_TIME, &slot_time_val);
}
} else {
if (CurSlotTime != NON_SHORT_SLOT_TIME) {
@@ -407,7 +401,6 @@ static void _rtl92e_qos_activate(void *data)
for (i = 0; i <  QOS_QUEUE_NUM; i++)
priv->rtllib->SetHwRegHandler(dev, HW_VAR_AC_PARAM, (u8 *)());
 
-
 success:
mutex_unlock(>mutex);
 

[PATCH 2/5] locking/locktorture: Fix rwsem reader_delay

2017-03-06 Thread Davidlohr Bueso
The reader delay callback should account for the number of reader
threads, not writers. Using the writer count could even trigger a
division by zero if the user explicitly disables writers.

Signed-off-by: Davidlohr Bueso 
---
 kernel/locking/locktorture.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c
index f24582d4dad3..0ef5510f8742 100644
--- a/kernel/locking/locktorture.c
+++ b/kernel/locking/locktorture.c
@@ -570,7 +570,7 @@ static void torture_rwsem_read_delay(struct torture_random_state *trsp)
 
/* We want a long delay occasionally to force massive contention.  */
if (!(torture_random(trsp) %
- (cxt.nrealwriters_stress * 2000 * longdelay_ms)))
+ (cxt.nrealreaders_stress * 2000 * longdelay_ms)))
mdelay(longdelay_ms * 2);
else
mdelay(longdelay_ms / 2);
-- 
2.6.6
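
The division-by-zero hazard mentioned in the changelog can be reproduced in a few lines of user-space C. The sketch below models only the modulus computation. `model_want_long_delay` is a hypothetical name, and the explicit zero check is an addition for illustration: the patch itself only switches the count from writers to readers and adds no such guard.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Decide whether to take the occasional long delay.
 *
 * nthreads stands for the thread count that scales the modulus
 * (cxt.nrealreaders_stress in the patched reader callback). If that
 * count is zero the modulus is zero, and "rnd % 0" is undefined
 * behaviour in C, so this model returns false instead.
 */
bool model_want_long_delay(unsigned long rnd, long nthreads, long longdelay_ms)
{
	long modulus = nthreads * 2000 * longdelay_ms;

	if (modulus <= 0)
		return false;	/* illustrative guard, not part of the patch */

	return (rnd % (unsigned long)modulus) == 0;
}
```

With four threads and longdelay_ms of 100 the modulus is 800000, so roughly one call in 800000 picks the long delay. With zero threads the unguarded expression would divide by zero, which is the case the changelog warns about when the writer count is used and writers are disabled.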



[PATCH 1/5] locking: Introduce range reader/writer lock

2017-03-06 Thread Davidlohr Bueso
This implements a sleepable range rwlock, based on interval tree, serializing
conflicting/intersecting/overlapping ranges within the tree. The largest range
is given by [0, ~0 - 1] (inclusive). Unlike traditional locks, range locking
involves dealing with the tree itself and the range to be locked, normally
stack allocated and always explicitly prepared/initialized by the user in a
[a0, a1] a0 <= a1 sorted manner, before actually taking the lock.

We allow exclusive locking of arbitrary ranges. We guarantee that each
range is locked only after all conflicting range locks requested previously
have been unlocked. Thus we achieve fairness and avoid livelocks.

When a new range lock is requested, we add its interval to the tree and store
the number of intervals intersecting it in 'blocking_ranges'. When a range is
unlocked, we again walk the intervals that intersect with the unlocked one and
decrement their 'blocking_ranges'. We wake up the owner of any range lock whose
'blocking_ranges' drops to 0.

For the shared case, the 'blocking_ranges' is only incremented if the
intersecting range is not marked as a reader. In order to mitigate some of
the tree walk overhead for non-intersecting ranges, the tree's min/max values
are maintained and consulted in O(1) in the fastpath.
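
The 'blocking_ranges' bookkeeping described above can be sketched in user space as follows. This is a deliberately simplified model and not the patch's code: it uses a linear scan over a small fixed array instead of an interval tree, has no spinlock, never sleeps, and omits the min/max fastpath. All `model_*` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_RANGES 16

struct model_range {
	unsigned long start, last;	/* inclusive, start <= last */
	bool reader;			/* shared (true) or exclusive (false) */
	unsigned int blocking_ranges;	/* conflicting ranges queued before us */
};

/* Stand-in for the interval tree: a flat table of queued/held ranges. */
struct model_tree {
	struct model_range *slots[MODEL_MAX_RANGES];
};

/* Two ranges conflict if they intersect and are not both readers. */
static bool model_conflicts(const struct model_range *a,
			    const struct model_range *b)
{
	bool intersect = a->start <= b->last && b->start <= a->last;

	return intersect && !(a->reader && b->reader);
}

/*
 * Queue a range. Returns the number of ranges blocking it (0 means the
 * lock is held immediately), or -1 if the table is full.
 */
int model_range_lock(struct model_tree *t, struct model_range *r)
{
	size_t i, free_slot = MODEL_MAX_RANGES;

	r->blocking_ranges = 0;
	for (i = 0; i < MODEL_MAX_RANGES; i++) {
		if (!t->slots[i]) {
			if (free_slot == MODEL_MAX_RANGES)
				free_slot = i;
			continue;
		}
		if (model_conflicts(t->slots[i], r))
			r->blocking_ranges++;
	}
	if (free_slot == MODEL_MAX_RANGES)
		return -1;

	t->slots[free_slot] = r;
	return (int)r->blocking_ranges;
}

/*
 * Release a held range. Returns how many waiters dropped to zero
 * blocking ranges, i.e. how many owners would be woken up.
 */
unsigned int model_range_unlock(struct model_tree *t, struct model_range *r)
{
	unsigned int woken = 0;
	size_t i;

	for (i = 0; i < MODEL_MAX_RANGES; i++) {
		struct model_range *s = t->slots[i];

		if (s == r) {
			t->slots[i] = NULL;
			continue;
		}
		if (s && model_conflicts(s, r) && s->blocking_ranges > 0 &&
		    --s->blocking_ranges == 0)
			woken++;
	}
	return woken;
}
```

As a usage example, take a writer on [0, 10], then readers on [5, 15] and [8, 20], then a writer on [12, 30]. The two readers are each blocked by one range (the first writer, not each other), and the second writer is blocked by two (both readers). Unlocking the first writer wakes both readers at once; the second writer proceeds only after both readers have unlocked.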

How much does it cost:
----------------------

The cost of locking and unlocking a range is O(log(R_all)+R_int), where R_all
is the total number of ranges and R_int is the number of ranges intersecting
the new range to be added.

Due to their sharable nature, full range locks can be compared with
rw-semaphores, which also serve as a comparison from a mutex standpoint, since
writer-only situations are pretty similar nowadays.

The first difference is the memory footprint: rwsems are larger than tree
locks (40 vs 24 bytes), but the latter require an additional 64 bytes of stack
for the range structure.

Secondly, because every range call is serialized by the tree->lock, any lock()
fastpath will have at least an interval_tree_insert() and spinlock lock+unlock
overhead, compared to a single atomic insn in the case of rwsems. The same
obviously applies to the unlock() case.

The torture module was used to measure 1-1 differences in lock acquisition with
increasing core counts over a period of 10 minutes. Readers and writers are
interleaved, with a slight advantage to writers as it's the first kthread that is
created. The following shows the avg ops/minute with various thread-setups on
boxes with small and large core-counts.

** 4-core AMD Opteron **
(write-only)
rwsem-2thr: 4198.5, stddev: 7.77
range-2thr: 4199.1, stddev: 0.73

rwsem-4thr: 6036.8, stddev: 50.91
range-4thr: 6004.9, stddev: 126.57

rwsem-8thr: 6245.6, stddev: 59.39
range-8thr: 6229.3, stddev: 10.60

(read-only)
rwsem-2thr: 5930.7, stddev: 21.92
range-2thr: 5917.3, stddev: 25.45

rwsem-4thr: 9881.6, stddev: 0.70
range-4thr: 9540.2, stddev: 98.28

rwsem-8thr: 11633.2, stddev: 7.72
range-8thr: 11314.7, stddev: 62.22

For the read/write-only cases, there is very little difference between the
range lock and rwsems, with up to a 3% hit, which could very well be considered
in the noise range.

(read-write)
rwsem-write-1thr: 1744.8, stddev: 11.59
rwsem-read-1thr:  1043.1, stddev: 3.97
range-write-1thr: 1740.2, stddev: 5.99
range-read-1thr:  1022.5, stddev: 6.41

rwsem-write-2thr: 1662.5, stddev: 0.70
rwsem-read-2thr:  1278.0, stddev: 25.45
range-write-2thr: 1321.5, stddev: 51.61
range-read-2thr:  1243.5, stddev: 30.40

rwsem-write-4thr: 1761.0, stddev: 11.31
rwsem-read-4thr:  1426.0, stddev: 7.07
range-write-4thr: 1417.0, stddev: 29.69
range-read-4thr:  1398.0, stddev: 56.56

While a single reader thread and a single writer thread do not show much
difference, increasing core counts shows that in reader/writer workloads,
writer threads can take a hit in raw performance of up to ~20%, while reader
throughput is quite similar among both locks.
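
For reference, the percentages quoted here are relative throughput losses against the rwsem baseline. A tiny helper (the name `model_hit_percent` is hypothetical, not part of the patch) makes the arithmetic explicit:

```c
#include <assert.h>

/*
 * Relative throughput loss of candidate versus baseline, in percent.
 * A positive result means the candidate is slower than the baseline.
 */
double model_hit_percent(double baseline, double candidate)
{
	if (baseline == 0.0)
		return 0.0;	/* no meaningful ratio without a baseline */

	return (baseline - candidate) / baseline * 100.0;
}
```

Using the 2-thread read-write figures above, model_hit_percent(1662.5, 1321.5) is about 20.5, which is where the "~20%" writer hit comes from. The 4-thread read-only figures, 9881.6 vs 9540.2, give roughly 3.5.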

** 240-core (ht) IvyBridge **
(write-only)
rwsem-120thr: 6844.5, stddev: 82.73
range-120thr: 6070.5, stddev: 85.55

rwsem-240thr: 6292.5, stddev: 146.3
range-240thr: 6099.0, stddev: 15.55

rwsem-480thr: 6164.8, stddev: 33.94
range-480thr: 6062.3, stddev: 19.79

(read-only)
rwsem-120thr: 136860.4, stddev: 2539.92
range-120thr: 138052.2, stddev: 327.39

rwsem-240thr: 235297.5, stddev: 2220.50
range-240thr: 232099.1, stddev: 3614.72

rwsem-480thr: 272683.0, stddev: 3924.32
range-480thr: 256539.2, stddev: 9541.69

Similar to the small box, larger machines show that range locks take only a
minor (up to ~6% for 480 threads) hit even in completely exclusive or shared
scenarios.

(read-write)
rwsem-write-60thr: 4658.1, stddev: 1303.19
rwsem-read-60thr:  1108.7, stddev: 718.42
range-write-60thr: 3203.6, stddev: 139.30
range-read-60thr:  1852.8, stddev: 147.5

rwsem-write-120thr: 3971.3, stddev: 1413.0
rwsem-read-120thr:  1038.8, stddev: 353.51
range-write-120thr: 2282.1, stddev: 207.18
range-read-120thr:  1856.5, stddev: 198.69

rwsem-write-240thr: 4112.7, stddev: 2448.1
rwsem-read-240thr:  1277.4, 

[PATCH 3/5] locking/locktorture: Fix num reader/writer corner cases

2017-03-06 Thread Davidlohr Bueso
Things can explode for locktorture if the user sets nwriters_stress=0,
nreaders_stress=0, or both. Fix this by not assuming we always want to
torture writer threads.

Signed-off-by: Davidlohr Bueso 
---
 kernel/locking/locktorture.c | 76 +---
 1 file changed, 44 insertions(+), 32 deletions(-)

diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c
index 0ef5510f8742..a68167803eee 100644
--- a/kernel/locking/locktorture.c
+++ b/kernel/locking/locktorture.c
@@ -715,8 +715,7 @@ static void __torture_print_stats(char *page,
 {
bool fail = 0;
int i, n_stress;
-   long max = 0;
-   long min = statp[0].n_lock_acquired;
+   long max = 0, min = statp ? statp[0].n_lock_acquired : 0;
long long sum = 0;
 
n_stress = write ? cxt.nrealwriters_stress : cxt.nrealreaders_stress;
@@ -823,7 +822,7 @@ static void lock_torture_cleanup(void)
 * such, only perform the underlying torture-specific cleanups,
 * and avoid anything related to locktorture.
 */
-   if (!cxt.lwsa)
+   if (!cxt.lwsa && !cxt.lrsa)
goto end;
 
if (writer_tasks) {
@@ -898,6 +897,13 @@ static int __init lock_torture_init(void)
firsterr = -EINVAL;
goto unwind;
}
+
+   if (nwriters_stress == 0 && nreaders_stress == 0) {
+   pr_alert("lock-torture: must run at least one locking thread\n");
+   firsterr = -EINVAL;
+   goto unwind;
+   }
+
if (cxt.cur_ops->init)
cxt.cur_ops->init();
 
@@ -921,17 +927,19 @@ static int __init lock_torture_init(void)
 #endif
 
/* Initialize the statistics so that each run gets its own numbers. */
+   if (nwriters_stress) {
+   lock_is_write_held = 0;
+   cxt.lwsa = kmalloc(sizeof(*cxt.lwsa) * cxt.nrealwriters_stress, GFP_KERNEL);
+   if (cxt.lwsa == NULL) {
+   VERBOSE_TOROUT_STRING("cxt.lwsa: Out of memory");
+   firsterr = -ENOMEM;
+   goto unwind;
+   }
 
-   lock_is_write_held = 0;
-   cxt.lwsa = kmalloc(sizeof(*cxt.lwsa) * cxt.nrealwriters_stress, GFP_KERNEL);
-   if (cxt.lwsa == NULL) {
-   VERBOSE_TOROUT_STRING("cxt.lwsa: Out of memory");
-   firsterr = -ENOMEM;
-   goto unwind;
-   }
-   for (i = 0; i < cxt.nrealwriters_stress; i++) {
-   cxt.lwsa[i].n_lock_fail = 0;
-   cxt.lwsa[i].n_lock_acquired = 0;
+   for (i = 0; i < cxt.nrealwriters_stress; i++) {
+   cxt.lwsa[i].n_lock_fail = 0;
+   cxt.lwsa[i].n_lock_acquired = 0;
+   }
}
 
if (cxt.cur_ops->readlock) {
@@ -948,19 +956,21 @@ static int __init lock_torture_init(void)
cxt.nrealreaders_stress = cxt.nrealwriters_stress;
}
 
-   lock_is_read_held = 0;
-   cxt.lrsa = kmalloc(sizeof(*cxt.lrsa) * cxt.nrealreaders_stress, GFP_KERNEL);
-   if (cxt.lrsa == NULL) {
-   VERBOSE_TOROUT_STRING("cxt.lrsa: Out of memory");
-   firsterr = -ENOMEM;
-   kfree(cxt.lwsa);
-   cxt.lwsa = NULL;
-   goto unwind;
-   }
-
-   for (i = 0; i < cxt.nrealreaders_stress; i++) {
-   cxt.lrsa[i].n_lock_fail = 0;
-   cxt.lrsa[i].n_lock_acquired = 0;
+   if (nreaders_stress) {
+   lock_is_read_held = 0;
+   cxt.lrsa = kmalloc(sizeof(*cxt.lrsa) * cxt.nrealreaders_stress, GFP_KERNEL);
+   if (cxt.lrsa == NULL) {
+   VERBOSE_TOROUT_STRING("cxt.lrsa: Out of memory");
+   firsterr = -ENOMEM;
+   kfree(cxt.lwsa);
+   cxt.lwsa = NULL;
+   goto unwind;
+   }
+
+   for (i = 0; i < cxt.nrealreaders_stress; i++) {
+   cxt.lrsa[i].n_lock_fail = 0;
+   cxt.lrsa[i].n_lock_acquired = 0;
+   }
}
}
 
@@ -990,12 +1000,14 @@ static int __init lock_torture_init(void)
goto unwind;
}
 
-   writer_tasks = kzalloc(cxt.nrealwriters_stress * sizeof(writer_tasks[0]),
-  GFP_KERNEL);
-   if (writer_tasks == NULL) {
-   VERBOSE_TOROUT_ERRSTRING("writer_tasks: Out of memory");
-   firsterr = -ENOMEM;
-   goto unwind;
+   if (nwriters_stress) {
+   writer_tasks = kzalloc(cxt.nrealwriters_stress * sizeof(writer_tasks[0]),
+  GFP_KERNEL);
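
The NULL guard added to __torture_print_stats() in the first hunk above can be illustrated outside the kernel. What follows is a minimal userspace sketch, not the locktorture code: struct stress_stats_sketch, struct stats_summary and summarize_stats() are hypothetical names introduced here. It only shows that a missing per-thread array (for example when no readers were requested) should give an all-zero summary rather than a read of element 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-thread counters kept by locktorture. */
struct stress_stats_sketch {
	long n_lock_fail;
	long n_lock_acquired;
};

struct stats_summary {
	long min;
	long max;
	long long sum;
};

/*
 * Summarize n entries of statp. A NULL array or n == 0 gives an all-zero
 * summary rather than reading statp[0].
 */
static struct stats_summary summarize_stats(const struct stress_stats_sketch *statp,
					    size_t n)
{
	struct stats_summary s = { 0, 0, 0 };
	size_t i;

	if (!statp || n == 0)
		return s;

	s.min = statp[0].n_lock_acquired;
	for (i = 0; i < n; i++) {
		long v = statp[i].n_lock_acquired;

		if (v < s.min)
			s.min = v;
		if (v > s.max)
			s.max = v;
		s.sum += v;
	}
	return s;
}

/* Usage: no reader threads were created, so the reader array is NULL. */
void summarize_stats_example(void)
{
	struct stats_summary s = summarize_stats(NULL, 0);

	assert(s.min == 0 && s.max == 0 && s.sum == 0);
}
```

The sketch checks both the pointer and the count; the patch itself only tests the pointer, which is sufficient there because the array is allocated only when the matching thread count is nonzero.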

RE: [v3] mmc: sdhci-cadence: add HS400 enhanced strobe support

2017-03-06 Thread Piotr Sroka
Hi Masahiro,

Thanks for all of your reviews.

Best Regards
Piotr Sroka

> -Original Message-
> From: Masahiro Yamada [mailto:yamada.masah...@socionext.com]
> Sent: 06 March, 2017 9:59 PM
> Subject: Re: [v3] mmc: sdhci-cadence: add HS400 enhanced strobe support
> 
> Hi Piotr,
> 
> 2017-03-06 17:28 GMT+09:00 Piotr Sroka :
> > Add support for HS400ES mode to Cadence SDHCI driver.
> >
> > Signed-off-by: Piotr Sroka 
> > ---
> > Changes in v2:
> > - Modify enhanced strobe function to handle disabling
> >   enhanced strobe inside the function.
> >   Do not rely on mmc_set_ios() being called
> >   immediately after host->ops->hs400_enhanced_strobe(host, >ios).
> > Changes in v3:
> > - Few coding-style fixes were made.
> 
> 
> Thanks for the update, and it looks good to me.
> 
> Reviewed-by: Masahiro Yamada 
> 
> --
> Best Regards
> Masahiro Yamada


Re: [PATCH 00/17] fs, btrfs refcount conversions

2017-03-06 Thread Qu Wenruo



At 03/06/2017 05:43 PM, Reshetova, Elena wrote:



At 03/03/2017 04:55 PM, Elena Reshetova wrote:

Now that the new refcount_t type and API are finally merged
(see include/linux/refcount.h), the following
patches convert various refcounters in the btrfs filesystem from atomic_t
to refcount_t. By doing this we prevent intentional or accidental
underflows or overflows that can lead to use-after-free vulnerabilities.

The below patches are fully independent and can be cherry-picked separately.
Since we convert all kernel subsystems in the same fashion, resulting
in about 300 patches, we have to group them for sending at least in some
fashion to be manageable. Please excuse the long cc list.

These patches have been tested with xfstests by running btrfs-related tests.
btrfs debug was enabled, as were warnings on refcount errors. No output related
to refcount errors was produced. However, the following errors occurred during
the run:
 * tests btrfs/078, btrfs/114, btrfs/115: no errors anywhere in dmesg, but the
 process hangs. They all seem to be around qgroup; sometimes an error such as
 "qgroup scan failed -4" is visible before it blocks, but not always.


How reproducible is the hang?


Always in my environment, but I would rather not spend much time investigating
why it happens if it works for you.
My test environment is far from ideal: I am testing in a VM with a rather old
userspace and a couple of additional changes in,
so there are many things that can potentially go wrong. Anyway, the strace for
078 is in the attachment.


Thanks for the strace.

However no "-f" is passed to strace, so it doesn't contain much useful info.



If the patches pass all tests on your side, could you please take them in and 
propagate further?
I will continue with other kernel subsystems.


The patchset itself looks like a common cleanup, but I did encounter
several cases (almost all scrub tests) that cause kernel warnings due to
underflow.

So I'm afraid the patchset will not be merged until we fix all the
underflows.

But thanks for the patchset, it helps us expose a lot of problems.

Thanks,
Qu



Best Regards,
Elena.




I also see the -EINTR output, but that seems to be designed for
btrfs/11[45].

btrfs/078 is unrelated to qgroup, and all three of these tests pass in my
test environment, which is v4.11-rc1 with your patches applied.

I ran these 3 tests in a row with default and space_cache=v2 mount
options, and 5 times for each mount option, no hang at all.

It would help a lot if more info could be provided, from the blocked process
backtrace to the test mount options to the base commit.

Thanks,
Qu


 * test btrfs/104 dmesg has additional error output:
 BTRFS warning (device vdc): qgroup 258 reserved space underflow, have: 0,
 to free: 4096
 I tried looking at the code on what causes the failure, but could not figure
 it out. It doesn't seem to be related to any refcount changes at least IMO.

The above test failures are hard for me to understand and interpret, but
they don't seem to relate to refcount conversions.

Elena Reshetova (17):
  fs, btrfs: convert btrfs_bio.refs from atomic_t to refcount_t
  fs, btrfs: convert btrfs_transaction.use_count from atomic_t to
refcount_t
  fs, btrfs: convert extent_map.refs from atomic_t to refcount_t
  fs, btrfs: convert btrfs_ordered_extent.refs from atomic_t to
refcount_t
  fs, btrfs: convert btrfs_caching_control.count from atomic_t to
refcount_t
  fs, btrfs: convert btrfs_delayed_ref_node.refs from atomic_t to
refcount_t
  fs, btrfs: convert btrfs_delayed_node.refs from atomic_t to refcount_t
  fs, btrfs: convert btrfs_delayed_item.refs from atomic_t to refcount_t
  fs, btrfs: convert btrfs_root.refs from atomic_t to refcount_t
  fs, btrfs: convert extent_state.refs from atomic_t to refcount_t
  fs, btrfs: convert compressed_bio.pending_bios from atomic_t to
refcount_t
  fs, btrfs: convert scrub_recover.refs from atomic_t to refcount_t
  fs, btrfs: convert scrub_page.refs from atomic_t to refcount_t
  fs, btrfs: convert scrub_block.refs from atomic_t to refcount_t
  fs, btrfs: convert scrub_parity.refs from atomic_t to refcount_t
  fs, btrfs: convert scrub_ctx.refs from atomic_t to refcount_t
  fs, btrfs: convert btrfs_raid_bio.refs from atomic_t to refcount_t

 fs/btrfs/backref.c   |  2 +-
 fs/btrfs/compression.c   | 18 -
 fs/btrfs/ctree.h |  5 +++--
 fs/btrfs/delayed-inode.c | 46 ++--
 fs/btrfs/delayed-inode.h |  5 +++--
 fs/btrfs/delayed-ref.c   |  8 
 fs/btrfs/delayed-ref.h   |  8 +---
 fs/btrfs/disk-io.c   |  6 +++---
 fs/btrfs/disk-io.h   |  4 ++--
 fs/btrfs/extent-tree.c   | 20 +--
 fs/btrfs/extent_io.c | 18 -
 fs/btrfs/extent_io.h |  3 ++-
 fs/btrfs/extent_map.c| 10 +-
 fs/btrfs/extent_map.h|  3 ++-
 fs/btrfs/ordered-data.c  | 20 +--
 fs/btrfs/ordered-data.h  |  2 +-
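
The reason a refcount_t conversion surfaces underflows that atomic_t silently tolerated can be sketched in a few lines. This is a single-threaded userspace model under stated assumptions, not the kernel's implementation: struct ref_sketch, ref_sketch_get() and ref_sketch_put() are hypothetical names, the real API is atomic, and the kernel reports misuse with a warning rather than through an out-parameter.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical, non-atomic model of a checked reference counter. */
struct ref_sketch {
	unsigned int refs;
};

/*
 * Take a reference. Refuses to resurrect a counter that already hit zero
 * (that object may be freed) and saturates instead of wrapping.
 */
static bool ref_sketch_get(struct ref_sketch *r)
{
	if (r->refs == 0)
		return false;
	if (r->refs == UINT_MAX)
		return true;	/* saturated: stays pinned, never wraps to 0 */
	r->refs++;
	return true;
}

/*
 * Drop a reference. Returns true when the caller dropped the last one.
 * A put on a zero counter is flagged as an underflow and leaves it at zero.
 */
static bool ref_sketch_put(struct ref_sketch *r, bool *underflow)
{
	*underflow = false;
	if (r->refs == 0) {
		*underflow = true;
		return false;
	}
	if (r->refs == UINT_MAX)
		return false;	/* saturated counters are never released */
	return --r->refs == 0;
}

/* Usage: one extra put is reported instead of wrapping around. */
void ref_sketch_example(void)
{
	struct ref_sketch r = { 1 };
	bool underflow;

	assert(ref_sketch_put(&r, &underflow) && !underflow);	/* last reference */
	assert(!ref_sketch_put(&r, &underflow) && underflow);	/* unbalanced put */
}
```

With a plain counter the unbalanced put would wrap to a huge value and go unnoticed, which is the kind of problem the scrub tests mentioned above appear to have exposed.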
 

Re: [PATCH V3 1/3] sched/deadline: Replenishment timer should fire in the next period

2017-03-06 Thread Wanpeng Li
2017-02-28 18:07 GMT+08:00 Daniel Bristot de Oliveira :
> Currently, the replenishment timer is set to fire at the deadline
> of a task. Although that works for implicit deadline tasks because the
> deadline is equal to the beginning of the next period, that is not correct
> for constrained deadline tasks (deadline < period).
>
> For instance:
>
> f.c:
>  --- %< ---
> int main (void)
> {
> for(;;);
> }
>  --- >% ---
>
>   # gcc -o f f.c
>
>   # trace-cmd record -e sched:sched_switch  \
>-e syscalls:sys_exit_sched_setattr   \
>chrt -d --sched-runtime  49000   \
>--sched-deadline 5   \
>--sched-period  10 0 ./f
>
>   # trace-cmd report | grep "{pid of ./f}"
>
> After setting parameters, the task is replenished and continues running
> until being throttled.
>  f-11295 [003] 13322.113776: sys_exit_sched_setattr: 0x0
>
> The task is throttled after running 492318 ms, as expected.

This should be us (microseconds), not ms.

Otherwise,

Reviewed-by: Wanpeng Li 

>  f-11295 [003] 13322.606094: sched_switch:   f:11295 [-1] R ==> \
>watchdog/3:32 [0]
>
> But then, the task is replenished 500719 ms after the first
> replenishment:
> <idle>-0 [003] 13322.614495: sched_switch:   swapper/3:0 [120] R \
>==> f:11295 [-1]
>
> Running for 490277 ms:
>  f-11295 [003] 13323.104772: sched_switch:   f:11295 [-1] R ==>  \
>swapper/3:0 [120]
>
> Hence, in the first period, the task runs 2 * runtime, and that is a bug.
>
> During the first replenishment, the next deadline is set one period away.
> So the runtime / period starts to be respected. However, as the second
> replenishment took place in the wrong instant, the next replenishment
> will also be held in a wrong instant of time. Rather than occurring in
> the nth period away from the first activation, it is taking place
> in the (nth period - relative deadline).
>
> Signed-off-by: Daniel Bristot de Oliveira 
> Reviewed-by: Luca Abeni 
> Reviewed-by: Steven Rostedt (VMware) 
> Reviewed-by: Juri Lelli 
> Cc: Ingo Molnar 
> Cc: Peter Zijlstra 
> Cc: Juri Lelli 
> Cc: Tommaso Cucinotta 
> Cc: Luca Abeni 
> Cc: Steven Rostedt 
> Cc: Mike Galbraith 
> Cc: Romulo Silva de Oliveira 
> Cc: Daniel Bristot de Oliveira 
> Cc: linux-kernel@vger.kernel.org
> ---
>  kernel/sched/deadline.c | 9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 70ef2b1..3c94d85 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -505,10 +505,15 @@ static void update_dl_entity(struct sched_dl_entity *dl_se,
> }
>  }
>
> +static inline u64 dl_next_period(struct sched_dl_entity *dl_se)
> +{
> +   return dl_se->deadline - dl_se->dl_deadline + dl_se->dl_period;
> +}
> +
>  /*
>   * If the entity depleted all its runtime, and if we want it to sleep
>   * while waiting for some new execution time to become available, we
> - * set the bandwidth enforcement timer to the replenishment instant
> + * set the bandwidth replenishment timer to the replenishment instant
>   * and try to activate it.
>   *
>   * Notice that it is important for the caller to know if the timer
> @@ -530,7 +535,7 @@ static int start_dl_timer(struct task_struct *p)
>  * that it is actually coming from rq->clock and not from
>  * hrtimer's time base reading.
>  */
> -   act = ns_to_ktime(dl_se->deadline);
> +   act = ns_to_ktime(dl_next_period(dl_se));
> now = hrtimer_cb_get_time(timer);
> delta = ktime_to_ns(now) - rq_clock(rq);
> act = ktime_add_ns(act, delta);
> --
> 2.9.3
>
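
The arithmetic behind dl_next_period() in the quoted patch can be checked with a small userspace sketch. struct dl_entity_sketch and next_period() are hypothetical names, and the numbers (relative deadline 500 ms, period 1 s, activation at 2 s) are illustrative assumptions rather than values taken from the trace above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for struct sched_dl_entity. */
struct dl_entity_sketch {
	uint64_t deadline;	/* absolute deadline of the current instance, ns */
	uint64_t dl_deadline;	/* relative deadline, ns */
	uint64_t dl_period;	/* period, ns */
};

/*
 * Start of the next period: the activation time of the current instance
 * (absolute deadline minus relative deadline) plus one period.
 */
static uint64_t next_period(const struct dl_entity_sketch *e)
{
	return e->deadline - e->dl_deadline + e->dl_period;
}

/* Usage: a constrained task activated at t = 2 s. */
void next_period_example(void)
{
	struct dl_entity_sketch e = {
		.deadline = 2500000000ULL,
		.dl_deadline = 500000000ULL,
		.dl_period = 1000000000ULL,
	};

	/* The timer should fire at 3 s, not at the 2.5 s deadline. */
	assert(next_period(&e) == 3000000000ULL);
}
```

When dl_deadline equals dl_period the expression reduces to the absolute deadline, which is why the old behaviour was already right for implicit deadline tasks.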


Re: [PATCH] zram: set physical queue limits to avoid array out of bounds accesses

2017-03-06 Thread Minchan Kim
Hello Johannes,

On Mon, Mar 06, 2017 at 11:23:35AM +0100, Johannes Thumshirn wrote:
> zram can handle at most SECTORS_PER_PAGE sectors in a bio's bvec. When using
> the NVMe over Fabrics loopback target which potentially sends a huge bulk of
> pages attached to the bio's bvec this results in a kernel panic because of
> array out of bounds accesses in zram_decompress_page().

First of all, thanks for the report and the fix!
Unfortunately, I'm not familiar with that interface of the block layer.

This looks like material for stable, so I want to understand it clearly.
Could you describe the problem more specifically?

In what scenario does it occur, and when and how does it become a problem?
That would help me understand.

Thanks.

> 
> Signed-off-by: Johannes Thumshirn 
> ---
>  drivers/block/zram/zram_drv.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index e27d89a..dceb5ed 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -1189,6 +1189,8 @@ static int zram_add(void)
>   blk_queue_io_min(zram->disk->queue, PAGE_SIZE);
>   blk_queue_io_opt(zram->disk->queue, PAGE_SIZE);
>   zram->disk->queue->limits.discard_granularity = PAGE_SIZE;
> + zram->disk->queue->limits.max_sectors = SECTORS_PER_PAGE;
> + zram->disk->queue->limits.chunk_sectors = 0;
>   blk_queue_max_discard_sectors(zram->disk->queue, UINT_MAX);
>   /*
>* zram_bio_discard() will clear all logical blocks if logical block
> -- 
> 1.8.5.6
> 
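
For readers equally unfamiliar with the limit being set, the sector arithmetic can be sketched in userspace. This is only an illustration under stated assumptions: a 4 KiB page size is assumed, the SKETCH_* macros and both helpers are hypothetical names, and the real enforcement is done by the block layer when it splits requests against the queue limits, not by code like this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_SECTOR_SHIFT 9	/* 512-byte sectors */
#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */
#define SKETCH_SECTORS_PER_PAGE (1u << (SKETCH_PAGE_SHIFT - SKETCH_SECTOR_SHIFT))

/* Number of 512-byte sectors needed to hold len bytes, rounded up. */
static uint64_t bytes_to_sectors(uint64_t len)
{
	return (len + (1u << SKETCH_SECTOR_SHIFT) - 1) >> SKETCH_SECTOR_SHIFT;
}

/* True if a request of len bytes respects a one-page max_sectors limit. */
static bool fits_one_page_limit(uint64_t len)
{
	return bytes_to_sectors(len) <= SKETCH_SECTORS_PER_PAGE;
}

/* Usage: one page passes, two pages exceed the limit. */
void zram_limit_example(void)
{
	assert(SKETCH_SECTORS_PER_PAGE == 8);
	assert(fits_one_page_limit(4096));
	assert(!fits_one_page_limit(2 * 4096));
}
```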


[PATCH 0/5] locking Introduce range reader/writer lock

2017-03-06 Thread Davidlohr Bueso
Hi,

Here's a very tardy proposal for enhancements to Jan's original[1] range lock
using interval trees. Because at some point it would be awesome to switch
mmap_sem from rwsem to range rwlock, I've focused on making it sharable and on
performance enhancements that reduce the delta between this and conventional
locks as much as possible -- details in patch 1.

The rest of the patches add support for testing the new lock and actually
make use of it for lustre. It has passed quite a bit of artificial pounding and
I believe/hope it is in shape to consider.

Applies on top of v4.11-rc1.

[1] https://lkml.org/lkml/2013/1/31/483

Thanks!

Davidlohr Bueso (5):
  locking: Introduce range reader/writer lock
  locking/locktorture: Fix rwsem reader_delay
  locking/locktorture: Fix num reader/writer corner cases
  locking/locktorture: Support range rwlocks
  staging/lustre: Use generic range rwlock

 drivers/gpu/drm/Kconfig|   2 -
 drivers/gpu/drm/i915/Kconfig   |   1 -
 drivers/staging/lustre/lustre/llite/Makefile   |   2 +-
 drivers/staging/lustre/lustre/llite/file.c |  21 +-
 .../staging/lustre/lustre/llite/llite_internal.h   |   4 +-
 drivers/staging/lustre/lustre/llite/llite_lib.c|   3 +-
 drivers/staging/lustre/lustre/llite/range_lock.c   | 239 ---
 drivers/staging/lustre/lustre/llite/range_lock.h   |  82 
 include/linux/range_rwlock.h   |  96 +
 kernel/locking/Makefile|   2 +-
 kernel/locking/locktorture.c   | 299 +
 kernel/locking/range_rwlock.c  | 462 +
 lib/Kconfig|  14 -
 lib/Kconfig.debug  |   1 -
 lib/Makefile   |   2 +-
 15 files changed, 792 insertions(+), 438 deletions(-)
 delete mode 100644 drivers/staging/lustre/lustre/llite/range_lock.c
 delete mode 100644 drivers/staging/lustre/lustre/llite/range_lock.h
 create mode 100644 include/linux/range_rwlock.h
 create mode 100644 kernel/locking/range_rwlock.c

-- 
2.6.6
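
As a rough illustration of what a sharable range lock has to decide, the conflict rule between two requests can be sketched as follows. This is an assumption-laden userspace sketch, not the code in patch 1: struct range_sketch and both helpers are hypothetical names, ranges are taken to be inclusive, and the interval tree that would find overlapping ranges efficiently is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock request over the inclusive range [start, last]. */
struct range_sketch {
	unsigned long start;
	unsigned long last;
	bool writer;
};

/* Two inclusive ranges overlap if each starts no later than the other ends. */
static bool ranges_overlap(const struct range_sketch *a,
			   const struct range_sketch *b)
{
	return a->start <= b->last && b->start <= a->last;
}

/*
 * Requests conflict only if they overlap and at least one is a writer;
 * overlapping readers can share.
 */
static bool ranges_conflict(const struct range_sketch *a,
			    const struct range_sketch *b)
{
	return ranges_overlap(a, b) && (a->writer || b->writer);
}

/* Usage: overlapping readers share, an overlapping writer does not. */
void range_sketch_example(void)
{
	struct range_sketch r1 = { 0, 4095, false };
	struct range_sketch r2 = { 1024, 2047, false };
	struct range_sketch w = { 4000, 5000, true };

	assert(!ranges_conflict(&r1, &r2));
	assert(ranges_conflict(&r1, &w));
}
```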



[PATCH 1/1] mtk-vcodec: check the vp9 decoder buffer index from VPU.

2017-03-06 Thread Wu-Cheng Li
From: Wu-Cheng Li 

VPU firmware has a bug and may return an invalid buffer index for
some vp9 videos. Check the buffer indexes before accessing the
buffer.

Signed-off-by: Wu-Cheng Li 
---
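
The idea of the check, reduced to a standalone userspace sketch: every index reported by the firmware is compared against the table size before it is used, and an out-of-range value is turned into -EIO. SKETCH_MAX_FRM_BUF_NUM, struct vsi_sketch and validate_indexes_sketch() are hypothetical names, and the table size of 9 is an illustrative assumption, not the driver's value.

```c
#include <assert.h>
#include <errno.h>

#define SKETCH_MAX_FRM_BUF_NUM 9	/* assumption: illustrative table size */

/* Hypothetical subset of the indexes shared with the firmware. */
struct vsi_sketch {
	unsigned int sf_frm_idx;
	unsigned int frm_to_show_idx;
	unsigned int new_fb_idx;
};

/* Return 0 if every index is usable, -EIO on the first one that is not. */
static int validate_indexes_sketch(const struct vsi_sketch *vsi)
{
	/* Stricter bound for this index, mirroring the "- 1" in the patch. */
	if (vsi->sf_frm_idx >= SKETCH_MAX_FRM_BUF_NUM - 1)
		return -EIO;
	if (vsi->frm_to_show_idx >= SKETCH_MAX_FRM_BUF_NUM)
		return -EIO;
	if (vsi->new_fb_idx >= SKETCH_MAX_FRM_BUF_NUM)
		return -EIO;
	return 0;
}

/* Usage: an index equal to the table size is rejected. */
void validate_indexes_example(void)
{
	struct vsi_sketch bad = { 0, SKETCH_MAX_FRM_BUF_NUM, 0 };

	assert(validate_indexes_sketch(&bad) == -EIO);
}
```

Because the fields are unsigned, a single upper-bound comparison per index is enough; a negative value from the firmware would appear as a very large number and be rejected as well.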
 drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c |  6 +
 .../media/platform/mtk-vcodec/vdec/vdec_vp9_if.c   | 26 ++
 drivers/media/platform/mtk-vcodec/vdec_drv_if.h|  2 ++
 3 files changed, 34 insertions(+)

diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
index 502877a4b1df..7ebcf9e57ac7 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
@@ -1176,6 +1176,12 @@ static void vb2ops_vdec_buf_queue(struct vb2_buffer *vb)
   "[%d] vdec_if_decode() src_buf=%d, size=%zu, 
fail=%d, res_chg=%d",
   ctx->id, src_buf->index,
   src_mem.size, ret, res_chg);
+
+   if (ret == -EIO) {
+   mtk_v4l2_err("[%d] Unrecoverable error in vdec_if_decode.",
+   ctx->id);
+   ctx->state = MTK_STATE_ABORT;
+   }
return;
}
 
diff --git a/drivers/media/platform/mtk-vcodec/vdec/vdec_vp9_if.c b/drivers/media/platform/mtk-vcodec/vdec/vdec_vp9_if.c
index e91a3b425b0c..5539b1853f16 100644
--- a/drivers/media/platform/mtk-vcodec/vdec/vdec_vp9_if.c
+++ b/drivers/media/platform/mtk-vcodec/vdec/vdec_vp9_if.c
@@ -718,6 +718,26 @@ static void get_free_fb(struct vdec_vp9_inst *inst, struct vdec_fb **out_fb)
 	*out_fb = fb;
 }
 
+static int validate_vsi_array_indexes(struct vdec_vp9_inst *inst,
+		struct vdec_vp9_vsi *vsi) {
+	if (vsi->sf_frm_idx >= VP9_MAX_FRM_BUF_NUM - 1) {
+		mtk_vcodec_err(inst, "Invalid vsi->sf_frm_idx=%u.",
+				vsi->sf_frm_idx);
+		return -EIO;
+	}
+	if (vsi->frm_to_show_idx >= VP9_MAX_FRM_BUF_NUM) {
+		mtk_vcodec_err(inst, "Invalid vsi->frm_to_show_idx=%u.",
+				vsi->frm_to_show_idx);
+		return -EIO;
+	}
+	if (vsi->new_fb_idx >= VP9_MAX_FRM_BUF_NUM) {
+		mtk_vcodec_err(inst, "Invalid vsi->new_fb_idx=%u.",
+				vsi->new_fb_idx);
+		return -EIO;
+	}
+	return 0;
+}
+
 static void vdec_vp9_deinit(unsigned long h_vdec)
 {
 	struct vdec_vp9_inst *inst = (struct vdec_vp9_inst *)h_vdec;
@@ -834,6 +854,12 @@ static int vdec_vp9_decode(unsigned long h_vdec, struct mtk_vcodec_mem *bs,
 			goto DECODE_ERROR;
 		}
 
+		ret = validate_vsi_array_indexes(inst, vsi);
+		if (ret) {
+			mtk_vcodec_err(inst, "Invalid values from VPU.");
+			goto DECODE_ERROR;
+		}
+
 		if (vsi->resolution_changed) {
 			if (!vp9_alloc_work_buf(inst)) {
 				ret = -EINVAL;
diff --git a/drivers/media/platform/mtk-vcodec/vdec_drv_if.h b/drivers/media/platform/mtk-vcodec/vdec_drv_if.h
index db6b5205ffb1..ded1154481cd 100644
--- a/drivers/media/platform/mtk-vcodec/vdec_drv_if.h
+++ b/drivers/media/platform/mtk-vcodec/vdec_drv_if.h
@@ -85,6 +85,8 @@ void vdec_if_deinit(struct mtk_vcodec_ctx *ctx);
  * @res_chg: [out] resolution change happens if current bs have different
  * picture width/height
  * Note: To flush the decoder when reaching EOF, set input bitstream as NULL.
+ *
+ * Return: 0 on success. -EIO on unrecoverable error.
  */
 int vdec_if_decode(struct mtk_vcodec_ctx *ctx, struct mtk_vcodec_mem *bs,
   struct vdec_fb *fb, bool *res_chg);
-- 
2.12.0.rc1.440.g5b76565f74-goog
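
The pattern in this patch (range-check every index that comes back from
firmware before using it to address an array, and report -EIO otherwise)
can be sketched in plain userspace C. This is only an illustration under
stated assumptions, not the driver code: MAX_FRM_BUF_NUM, struct fw_state,
validate_fw_indexes() and pick_frame() are hypothetical names, and the
buffer count of 9 is made up.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins; the real driver's buffer count and layout differ. */
#define MAX_FRM_BUF_NUM 9

struct fw_state {
	unsigned int frm_to_show_idx;	/* written by firmware: untrusted */
	unsigned int new_fb_idx;	/* written by firmware: untrusted */
};

/* Return 0 if every firmware-supplied index is in range, -EIO otherwise. */
static int validate_fw_indexes(const struct fw_state *st)
{
	assert(st != NULL);

	if (st->frm_to_show_idx >= MAX_FRM_BUF_NUM) {
		fprintf(stderr, "invalid frm_to_show_idx=%u\n",
			st->frm_to_show_idx);
		return -EIO;
	}
	if (st->new_fb_idx >= MAX_FRM_BUF_NUM) {
		fprintf(stderr, "invalid new_fb_idx=%u\n", st->new_fb_idx);
		return -EIO;
	}
	return 0;
}

/* Read a frame slot only after the indexes have been validated. */
static int pick_frame(const struct fw_state *st, const int *frames, int *out)
{
	int ret = validate_fw_indexes(st);

	if (ret)
		return ret;	/* caller treats -EIO as unrecoverable */
	*out = frames[st->frm_to_show_idx];
	return 0;
}
```

A caller would stop decoding when pick_frame() returns -EIO, in the same
spirit as the patch moving the context to an abort state.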



Re: Subject: [PATCH v4] USB:Core: BugFix: Proper handling of Race Condition when two USB class drivers try to call init_usb_class simultaneously

2017-03-06 Thread Ajay Kaher
 
 
 
> On Fri, 3 Mar 2017, Ajay Kaher wrote:
> 
> > > usb_class->kref is not accessible outside the file.c
> > > as usb_class is _static_ inside the file.c and
> > > pointer of usb_class->kref is not passed anywhere.
> > > 
> > > Hence as you wanted, there are no references of usb_class->kref
> > > other than taken by init_usb_class() and released by destroy_usb_class().
> > 
> > Verified the code again. I hope my last comments clarified the points
> > you had in mind and help you accept the patch :)
>  
> Your main point is that usb_class->kref is accessed from only two
> points, both of which are protected by the new mutex.  This means there
> is no reason for the value to be a struct kref at all.  You should
> change it to an int (and change its name).  Leaving it as a kref will
> make readers wonder why it needs to be updated atomically.

In many places in the Linux kernel, krefs are used under a mutex or
spinlock, and this has no side effects.

Changing it to an int and handling it (i.e. get/put) within file.c does
not seem good when we already have kref. Instead, we could have a
non-atomic version of kref. We can discuss a non-atomic kref in another
thread, if you are interested.

> Also, why does destroy_usb_class() have that "if (usb_class)" test?
> Isn't it true that usb_class can never be NULL there?

Removed in Patch v4.

thanks,
ajay kaher
 
  
Signed-off-by: Ajay Kaher
 
---

 drivers/usb/core/file.c |9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/core/file.c b/drivers/usb/core/file.c
index 822ced9..422ce7b 100644
--- a/drivers/usb/core/file.c
+++ b/drivers/usb/core/file.c
@@ -27,6 +27,7 @@
 #define MAX_USB_MINORS 256
 static const struct file_operations *usb_minors[MAX_USB_MINORS];
 static DECLARE_RWSEM(minor_rwsem);
+static DEFINE_MUTEX(init_usb_class_mutex);

 static int usb_open(struct inode *inode, struct file *file)
 {
@@ -109,8 +110,9 @@ static void release_usb_class(struct kref *kref)

 static void destroy_usb_class(void)
 {
-	if (usb_class)
-		kref_put(&usb_class->kref, release_usb_class);
+	mutex_lock(&init_usb_class_mutex);
+	kref_put(&usb_class->kref, release_usb_class);
+	mutex_unlock(&init_usb_class_mutex);
 }

 int usb_major_init(void)
@@ -171,7 +173,10 @@ int usb_register_dev(struct usb_interface *intf,
 	if (intf->minor >= 0)
 		return -EADDRINUSE;

+	mutex_lock(&init_usb_class_mutex);
 	retval = init_usb_class();
+	mutex_unlock(&init_usb_class_mutex);
+
 	if (retval)
 		return retval;
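
For readers following the kref-versus-int discussion in this thread, the
reviewer's suggestion (a plain counter, since every access is already
serialized by the new mutex) can be sketched in userspace C with pthreads.
This is only an illustration under stated assumptions, not code from
file.c: struct shared_class, class_get() and class_put() are hypothetical
names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct shared_class {
	int refs;	/* plain int: only touched with class_lock held */
};

static struct shared_class *cls;
static pthread_mutex_t class_lock = PTHREAD_MUTEX_INITIALIZER;

/* Take a reference, creating the object on first use. Returns 0 or -1. */
static int class_get(void)
{
	int ret = 0;

	pthread_mutex_lock(&class_lock);
	if (cls) {
		cls->refs++;
	} else {
		cls = calloc(1, sizeof(*cls));
		if (cls)
			cls->refs = 1;
		else
			ret = -1;
	}
	pthread_mutex_unlock(&class_lock);
	return ret;
}

/* Drop a reference, freeing the object when the last user goes away. */
static void class_put(void)
{
	pthread_mutex_lock(&class_lock);
	assert(cls && cls->refs > 0);
	if (--cls->refs == 0) {
		free(cls);
		cls = NULL;
	}
	pthread_mutex_unlock(&class_lock);
}
```

Because both the "does it exist?" check and the counter update happen
under the same lock, two callers racing in class_get() cannot both
allocate the object, which is the race the patch is about.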


[PATCH v1 0/1] mtk-vcodec: check the vp9 decoder buffer index from VPU

2017-03-06 Thread Wu-Cheng Li
From: Wu-Cheng Li 

This patch guards against an invalid buffer index returned by the
VPU firmware.

Wu-Cheng Li (1):
  mtk-vcodec: check the vp9 decoder buffer index from VPU.

 drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c |  6 +
 .../media/platform/mtk-vcodec/vdec/vdec_vp9_if.c   | 26 ++
 drivers/media/platform/mtk-vcodec/vdec_drv_if.h|  2 ++
 3 files changed, 34 insertions(+)

-- 
2.12.0.rc1.440.g5b76565f74-goog



[PATCH] sched/deadline: Add missing update_rq_clock() in dl_task_timer()

2017-03-06 Thread Wanpeng Li
From: Wanpeng Li 

The following warning can be triggered by hot-unplugging the CPU
on which an active SCHED_DEADLINE task is running:

 ------------[ cut here ]------------
 WARNING: CPU: 7 PID: 0 at kernel/sched/sched.h:833 replenish_dl_entity+0x71e/0xc40
 rq->clock_update_flags < RQCF_ACT_SKIP
 CPU: 7 PID: 0 Comm: swapper/7 Tainted: GB   4.11.0-rc1+ #24
 Hardware name: LENOVO ThinkCentre M8500t-N000/SHARKBAY, BIOS FBKTC1AUS 02/16/2016
 Call Trace:
  
  dump_stack+0x85/0xc4
  __warn+0x172/0x1b0
  warn_slowpath_fmt+0xb4/0xf0
  ? __warn+0x1b0/0x1b0
  ? debug_check_no_locks_freed+0x2c0/0x2c0
  ? cpudl_set+0x3d/0x2b0
  replenish_dl_entity+0x71e/0xc40
  enqueue_task_dl+0x2ea/0x12e0
  ? dl_task_timer+0x777/0x990
  ? __hrtimer_run_queues+0x270/0xa50
  dl_task_timer+0x316/0x990
  ? enqueue_task_dl+0x12e0/0x12e0
  ? enqueue_task_dl+0x12e0/0x12e0
  __hrtimer_run_queues+0x270/0xa50
  ? hrtimer_cancel+0x20/0x20
  ? hrtimer_interrupt+0x119/0x600
  hrtimer_interrupt+0x19c/0x600
  ? trace_hardirqs_off+0xd/0x10
  local_apic_timer_interrupt+0x74/0xe0
  smp_apic_timer_interrupt+0x76/0xa0
  apic_timer_interrupt+0x93/0xa0

The DL task is migrated to a suitable later-deadline rq once the DL
timer fires and the current rq is offline. The rq clock of the new rq
should then be updated. This patch fixes the warning by updating the rq
clock after taking the new rq's lock.

Cc: Juri Lelli 
Cc: Peter Zijlstra 
Cc: Ingo Molnar 
Cc: Thomas Gleixner 
Cc: Matt Fleming 
Signed-off-by: Wanpeng Li 
---
 kernel/sched/deadline.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 99b2c33..c6db3fd 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -638,6 +638,7 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
 		lockdep_unpin_lock(&rq->lock, rf.cookie);
 		rq = dl_task_offline_migration(rq, p);
 		rf.cookie = lockdep_pin_lock(&rq->lock);
+		update_rq_clock(rq);
 
 		/*
 		 * Now that the task has been migrated to the new RQ and we
-- 
2.7.4
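
The warning condition printed in the trace
(rq->clock_update_flags < RQCF_ACT_SKIP) can be illustrated with a toy
userspace model. This is a sketch under stated assumptions, not scheduler
code: struct toy_rq and the toy_* helpers are hypothetical, and the flag
values are assumed to match the kernel's definitions of that era.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed flag values; everything else here is a toy model. */
#define RQCF_REQ_SKIP	0x01
#define RQCF_ACT_SKIP	0x02
#define RQCF_UPDATED	0x04

struct toy_rq {
	unsigned int clock_update_flags;
	unsigned long long clock;
};

/* Model of update_rq_clock(): refresh the clock and record that fact. */
static void toy_update_rq_clock(struct toy_rq *rq, unsigned long long now)
{
	assert(rq != NULL);
	rq->clock = now;
	rq->clock_update_flags |= RQCF_UPDATED;
}

/* True if using rq->clock now would trip the warning from the trace. */
static bool toy_clock_is_stale(const struct toy_rq *rq)
{
	return rq->clock_update_flags < RQCF_ACT_SKIP;
}
```

In this model a freshly locked destination rq starts out stale, and one
call to toy_update_rq_clock() clears the condition, which mirrors the
one-line fix in the patch.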



Re: + mm-reclaim-madv_free-pages.patch added to -mm tree

2017-03-06 Thread Minchan Kim
On Mon, Mar 06, 2017 at 10:49:06AM -0500, Johannes Weiner wrote:

< snip >

> > @@ -1413,20 +1413,24 @@ static int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
> >  			 * Store the swap location in the pte.
> >  			 * See handle_pte_fault() ...
> >  			 */
> > -			VM_BUG_ON_PAGE(!PageSwapCache(page) && PageSwapBacked(page),
> > -				page);
> > +			if (VM_WARN_ON_ONCE(PageSwapBacked(page) &&
> > +						!PageSwapCache(page))) {
> > +				ret = SWAP_FAIL;
> 
> But you're not adding the !swapbacked && swapcache case?
> 
> > +				page_vma_mapped_walk_done(&pvmw);
> > +				break;
> > +			}
> 
> [...]
> 
> > -			/*
> > -			 * swapin page could be clean, it has data stored in
> > -			 * swap. We can't silently discard it without setting
> > -			 * swap entry in the page table.
> > -			 */
> > -			if (!PageDirty(page) && !PageSwapCache(page)) {
> > -				/* It's a freeable page by MADV_FREE */
> > -				dec_mm_counter(mm, MM_ANONPAGES);
> > -				goto discard;
> > -			} else if (!PageSwapBacked(page)) {
> > -				/* dirty MADV_FREE page */
> > +			/* MADV_FREE page check */
> > +			if (!PageSwapBacked(page)) {
> > +				if (!PageDirty(page)) {
> > +					dec_mm_counter(mm, MM_ANONPAGES);
> > +					goto discard;
> > +				}
> 
> Andrew already has this, you might want to send the warning changes as
> a separate patch on top of this one.

Here it goes.

From d42d296950c3bbce74afddcff307fa18eef305fe Mon Sep 17 00:00:00 2001
From: Minchan Kim 
Date: Tue, 7 Mar 2017 14:48:37 +0900
Subject: [PATCH] mm: fix lazyfree bug on check in try_to_unmap_one

If a page is swapbacked, it should be in the swapcache
in try_to_unmap_one's path.

If a page is !swapbacked, it shouldn't be in the swapcache
in try_to_unmap_one's path.

Check both cases at once and, if the check fails, warn and
return SWAP_FAIL. Such a bug never means we should shut down
the kernel.

Suggested-by: Johannes Weiner 
Signed-off-by: Minchan Kim 
---
 mm/rmap.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index 35acb83..9925f32 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1413,8 +1413,13 @@ static int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 			 * Store the swap location in the pte.
 			 * See handle_pte_fault() ...
 			 */
-			VM_BUG_ON_PAGE(!PageSwapCache(page) && PageSwapBacked(page),
-				page);
+			if (VM_WARN_ON_ONCE(PageSwapBacked(page) !=
+						PageSwapCache(page))) {
+				ret = SWAP_FAIL;
+				page_vma_mapped_walk_done(&pvmw);
+				break;
+
+			}
 
 			/* MADV_FREE page check */
 			if (!PageSwapBacked(page)) {
-- 
2.7.4
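
The reason a single "!=" covers both cases raised in review (swapbacked
but not in swapcache, and !swapbacked but in swapcache) is easiest to see
as a truth table. A minimal userspace sketch, assuming a hypothetical
struct toy_page with two boolean fields standing in for the page flags:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two page flags discussed above. */
struct toy_page {
	bool swap_backed;
	bool in_swap_cache;
};

/*
 * At this point the two flags are expected to agree: swapbacked pages
 * are in the swap cache, !swapbacked (MADV_FREE) pages are not.
 * Comparing the booleans with != flags both inconsistent combinations.
 */
static bool toy_flags_inconsistent(const struct toy_page *page)
{
	assert(page != NULL);
	return page->swap_backed != page->in_swap_cache;
}
```

Only the (true, true) and (false, false) combinations pass; the earlier
"&&" form of the check caught just one of the two bad combinations.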



RE: [PATCH v5] PCI: Xilinx NWL: Modifying irq chip for legacy interrupts

2017-03-06 Thread Bharat Kumar Gogada
Hi Marc,

can you please look into my last comments ?

Regards,
Bharat
> Subject: RE: [PATCH v5] PCI: Xilinx NWL: Modifying irq chip for legacy interrupts
> 
> Waiting for Marc's Reply...
> 
> > > -----Original Message-----
> > > From: Marc Zyngier [mailto:marc.zyng...@arm.com]
> > > Sent: Thursday, February 09, 2017 9:33 PM
> > > To: Bharat Kumar Gogada ; bhelg...@google.com;
> > > r...@kernel.org; paul.gortma...@windriver.com;
> > > colin.k...@canonical.com; linux-...@vger.kernel.org
> > > Cc: linux-arm-ker...@lists.infradead.org;
> > > linux-kernel@vger.kernel.org; michal.si...@xilinx.com;
> > > a...@arndb.de; Ravikiran Gummaluri 
> > > Subject: Re: [PATCH v5] PCI: Xilinx NWL: Modifying irq chip for legacy interrupts
> > >
> > > On 09/02/17 15:16, Bharat Kumar Gogada wrote:
> > > >>
> > > >> On 09/02/17 12:01, Bharat Kumar Gogada wrote:
> > >  On 06/02/17 07:03, Bharat Kumar Gogada wrote:
> > > > +static struct irq_chip nwl_leg_irq_chip = {
> > > > +   .name = "nwl_pcie:legacy",
> > > > +   .irq_enable = nwl_unmask_leg_irq,
> > > > +   .irq_disable = nwl_mask_leg_irq,
> > > 
> > >  You don't need these two if they are implemented in terms of
> > > mask/unmask.
> > > >>>
> > > >>> These are being invoked by some drivers other than interrupt flow.
> > > >>> Ex: drivers/net/wireless/ath/ath9k/main.c
> > > >>> static int ath_reset_internal(struct ath_softc *sc, struct
> > > >>> ath9k_channel *hchan) {
> > > >>>  
> > > >>>  disable_irq(sc->irq);
> > > >>>  tasklet_disable(&sc->intr_tq);
> > > >>> ...
> > > >>> ...
> > > >>> enable_irq(sc->irq);
> > > >>> spin_unlock_bh(&sc->sc_pcu_lock);
> > > >>> }
> > > >>> For us masking/unmasking is the way to enable/disable interrupts.
> > > >>
> > > >> And if you looked at the way disable_irq is implemented, you
> > > >> would have found out that it falls back to masking if there is no
> > > >> disable method, preserving the semantic you expect.
> > > >>
> > > > Yes I did see, but this fallback requires the extra
> > > > "IRQ_DISABLE_UNLAZY" flag to be set for each virq.
> > >
> > > No it doesn't. If you do a disable_irq(), the interrupt is flagged
> > > as disabled, but nothing gets done. If an interrupt actually fires,
> > > then the interrupt gets masked, and the handler is not called.
> > Yes agreed, this is where the problem comes for us. Here is the
> > scenario Ex:drivers/net/wireless/ath/ath9k/main.c
> > static int ath_reset_internal(struct ath_softc *sc, struct
> > ath9k_channel *hchan) {
> >   
> >ath9k_hw_set_interrupts(ah);
> >ath9k_hw_enable_interrupts(ah);
> >...
> >   enable_irq(sc->irq);
> >   ...
> > }
> > If you look at this, they enable hardware interrupts first and then
> > call enable_irq(); at that point the virq is still in the disabled
> > state. So, if an interrupt is raised in this window, the handler is
> > never invoked and DEASSERT_INTx will not be seen. As I mentioned in my
> > subject, the irq line between the bridge and the GIC goes low only
> > after it sees DEASSERT_INTx. But since DEASSERT_INTx is never seen, the
> > line stays high, causing a CPU stall.
> > So for this kind of EP we need those two methods.
> >
> > Bharat
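
The lazy-disable behaviour described in the quoted reply (disable_irq()
only flags the interrupt; the line is masked the first time it actually
fires, and the handler is not run) can be modelled in a few lines of
userspace C. This is a simplified sketch under stated assumptions, not
the genirq implementation: struct toy_irq and the toy_* functions are
hypothetical, and delivering the missed interrupt from toy_enable_irq()
stands in for whatever the hardware or the core does to re-deliver it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_irq {
	bool disabled;	/* software state set by toy_disable_irq() */
	bool masked;	/* state of the (pretend) hardware mask */
	bool pending;	/* an interrupt fired while disabled */
	int handled;	/* how many times the handler ran */
};

/* Lazy disable: flag only, the hardware mask is left alone. */
static void toy_disable_irq(struct toy_irq *irq)
{
	assert(irq != NULL);
	irq->disabled = true;
}

/* Called when the device raises the interrupt. */
static void toy_irq_fires(struct toy_irq *irq)
{
	if (irq->masked)
		return;			/* masked: nothing is delivered */
	if (irq->disabled) {
		irq->masked = true;	/* mask on first occurrence */
		irq->pending = true;
		return;			/* handler is not called */
	}
	irq->handled++;
}

/* Enable: unmask and deliver anything that was missed (assumption). */
static void toy_enable_irq(struct toy_irq *irq)
{
	irq->disabled = false;
	irq->masked = false;
	if (irq->pending) {
		irq->pending = false;
		irq->handled++;
	}
}
```

The window discussed in this thread corresponds to the time between
toy_irq_fires() returning without running the handler and the later
toy_enable_irq() call.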


[PATCH 4/5] locking/locktorture: Support range rwlocks

2017-03-06 Thread Davidlohr Bueso
Torture the reader/writer range locks. Each thread will attempt to
lock+unlock a range of up to [0, 4096].

Signed-off-by: Davidlohr Bueso 
---
 kernel/locking/locktorture.c | 221 +--
 1 file changed, 172 insertions(+), 49 deletions(-)

diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c
index a68167803eee..76de50da4cdc 100644
--- a/kernel/locking/locktorture.c
+++ b/kernel/locking/locktorture.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -89,13 +90,13 @@ static void lock_torture_cleanup(void);
  */
 struct lock_torture_ops {
 	void (*init)(void);
-	int (*writelock)(void);
+	int (*writelock)(void *arg);
 	void (*write_delay)(struct torture_random_state *trsp);
 	void (*task_boost)(struct torture_random_state *trsp);
-	void (*writeunlock)(void);
-	int (*readlock)(void);
+	void (*writeunlock)(void *arg);
+	int (*readlock)(void *arg);
 	void (*read_delay)(struct torture_random_state *trsp);
-	void (*readunlock)(void);
+	void (*readunlock)(void *arg);
 
 	unsigned long flags; /* for irq spinlocks */
 	const char *name;
@@ -117,7 +118,7 @@ static struct lock_torture_cxt cxt = { 0, 0, false,
  * Definitions for lock torture testing.
  */
 
-static int torture_lock_busted_write_lock(void)
+static int torture_lock_busted_write_lock(void *arg)
 {
 	return 0;  /* BUGGY, do not use in real life!!! */
 }
@@ -136,7 +137,7 @@ static void torture_lock_busted_write_delay(struct torture_random_state *trsp)
 #endif
 }
 
-static void torture_lock_busted_write_unlock(void)
+static void torture_lock_busted_write_unlock(void *arg)
 {
 	  /* BUGGY, do not use in real life!!! */
 }
@@ -159,7 +160,8 @@ static struct lock_torture_ops lock_busted_ops = {
 
 static DEFINE_SPINLOCK(torture_spinlock);
 
-static int torture_spin_lock_write_lock(void) __acquires(torture_spinlock)
+static int torture_spin_lock_write_lock(void *arg)
+	__acquires(torture_spinlock)
 {
 	spin_lock(&torture_spinlock);
 	return 0;
@@ -185,7 +187,8 @@ static void torture_spin_lock_write_delay(struct torture_random_state *trsp)
 #endif
 }
 
-static void torture_spin_lock_write_unlock(void) __releases(torture_spinlock)
+static void torture_spin_lock_write_unlock(void *arg)
+	__releases(torture_spinlock)
 {
 	spin_unlock(&torture_spinlock);
 }
@@ -201,8 +204,8 @@ static struct lock_torture_ops spin_lock_ops = {
 	.name		= "spin_lock"
 };
 
-static int torture_spin_lock_write_lock_irq(void)
-__acquires(torture_spinlock)
+static int torture_spin_lock_write_lock_irq(void *arg)
+	__acquires(torture_spinlock)
 {
 	unsigned long flags;
 
@@ -211,7 +214,7 @@ __acquires(torture_spinlock)
 	return 0;
 }
 
-static void torture_lock_spin_write_unlock_irq(void)
+static void torture_lock_spin_write_unlock_irq(void *arg)
 __releases(torture_spinlock)
 {
 	spin_unlock_irqrestore(&torture_spinlock, cxt.cur_ops->flags);
@@ -230,7 +233,8 @@ static struct lock_torture_ops spin_lock_irq_ops = {
 
 static DEFINE_RWLOCK(torture_rwlock);
 
-static int torture_rwlock_write_lock(void) __acquires(torture_rwlock)
+static int torture_rwlock_write_lock(void *arg)
+	__acquires(torture_rwlock)
 {
 	write_lock(&torture_rwlock);
 	return 0;
@@ -251,12 +255,14 @@ static void torture_rwlock_write_delay(struct torture_random_state *trsp)
 		udelay(shortdelay_us);
 }
 
-static void torture_rwlock_write_unlock(void) __releases(torture_rwlock)
+static void torture_rwlock_write_unlock(void *arg)
+	__releases(torture_rwlock)
 {
 	write_unlock(&torture_rwlock);
 }
 
-static int torture_rwlock_read_lock(void) __acquires(torture_rwlock)
+static int torture_rwlock_read_lock(void *arg)
+	__acquires(torture_rwlock)
 {
 	read_lock(&torture_rwlock);
 	return 0;
@@ -277,7 +283,8 @@ static void torture_rwlock_read_delay(struct torture_random_state *trsp)
 		udelay(shortdelay_us);
 }
 
-static void torture_rwlock_read_unlock(void) __releases(torture_rwlock)
+static void torture_rwlock_read_unlock(void *arg)
+	__releases(torture_rwlock)
 {
 	read_unlock(&torture_rwlock);
 }
@@ -293,7 +300,8 @@ static struct lock_torture_ops rw_lock_ops = {
 	.name		= "rw_lock"
 };
 
-static int torture_rwlock_write_lock_irq(void) __acquires(torture_rwlock)
+static int torture_rwlock_write_lock_irq(void *arg)
+	__acquires(torture_rwlock)
 {
 	unsigned long flags;
 
@@ -302,13 +310,14 @@ static int torture_rwlock_write_lock_irq(void) __acquires(torture_rwlock)
 	return 0;
 }
 
-static void torture_rwlock_write_unlock_irq(void)
-__releases(torture_rwlock)
+static void torture_rwlock_write_unlock_irq(void *arg)
+	__releases(torture_rwlock)
 {
 	write_unlock_irqrestore(&torture_rwlock, cxt.cur_ops->flags);
 }
 
-static int torture_rwlock_read_lock_irq(void) 

[PATCH 4/5] locking/locktorture: Support range rwlocks

2017-03-06 Thread Davidlohr Bueso
Torture the reader/writer range locks. Each thread will attempt to
lock+unlock a range of up to [0, 4096].

Signed-off-by: Davidlohr Bueso 
---
 kernel/locking/locktorture.c | 221 +--
 1 file changed, 172 insertions(+), 49 deletions(-)

diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c
index a68167803eee..76de50da4cdc 100644
--- a/kernel/locking/locktorture.c
+++ b/kernel/locking/locktorture.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -89,13 +90,13 @@ static void lock_torture_cleanup(void);
  */
 struct lock_torture_ops {
void (*init)(void);
-   int (*writelock)(void);
+   int (*writelock)(void *arg);
void (*write_delay)(struct torture_random_state *trsp);
void (*task_boost)(struct torture_random_state *trsp);
-   void (*writeunlock)(void);
-   int (*readlock)(void);
+   void (*writeunlock)(void *arg);
+   int (*readlock)(void *arg);
void (*read_delay)(struct torture_random_state *trsp);
-   void (*readunlock)(void);
+   void (*readunlock)(void *arg);
 
unsigned long flags; /* for irq spinlocks */
const char *name;
@@ -117,7 +118,7 @@ static struct lock_torture_cxt cxt = { 0, 0, false,
  * Definitions for lock torture testing.
  */
 
-static int torture_lock_busted_write_lock(void)
+static int torture_lock_busted_write_lock(void *arg)
 {
return 0;  /* BUGGY, do not use in real life!!! */
 }
@@ -136,7 +137,7 @@ static void torture_lock_busted_write_delay(struct 
torture_random_state *trsp)
 #endif
 }
 
-static void torture_lock_busted_write_unlock(void)
+static void torture_lock_busted_write_unlock(void *arg)
 {
  /* BUGGY, do not use in real life!!! */
 }
@@ -159,7 +160,8 @@ static struct lock_torture_ops lock_busted_ops = {
 
 static DEFINE_SPINLOCK(torture_spinlock);
 
-static int torture_spin_lock_write_lock(void) __acquires(torture_spinlock)
+static int torture_spin_lock_write_lock(void *arg)
+   __acquires(torture_spinlock)
 {
spin_lock(&torture_spinlock);
return 0;
@@ -185,7 +187,8 @@ static void torture_spin_lock_write_delay(struct 
torture_random_state *trsp)
 #endif
 }
 
-static void torture_spin_lock_write_unlock(void) __releases(torture_spinlock)
+static void torture_spin_lock_write_unlock(void *arg)
+   __releases(torture_spinlock)
 {
spin_unlock(&torture_spinlock);
 }
@@ -201,8 +204,8 @@ static struct lock_torture_ops spin_lock_ops = {
.name   = "spin_lock"
 };
 
-static int torture_spin_lock_write_lock_irq(void)
-__acquires(torture_spinlock)
+static int torture_spin_lock_write_lock_irq(void *arg)
+   __acquires(torture_spinlock)
 {
unsigned long flags;
 
@@ -211,7 +214,7 @@ __acquires(torture_spinlock)
return 0;
 }
 
-static void torture_lock_spin_write_unlock_irq(void)
+static void torture_lock_spin_write_unlock_irq(void *arg)
 __releases(torture_spinlock)
 {
spin_unlock_irqrestore(&torture_spinlock, cxt.cur_ops->flags);
@@ -230,7 +233,8 @@ static struct lock_torture_ops spin_lock_irq_ops = {
 
 static DEFINE_RWLOCK(torture_rwlock);
 
-static int torture_rwlock_write_lock(void) __acquires(torture_rwlock)
+static int torture_rwlock_write_lock(void *arg)
+   __acquires(torture_rwlock)
 {
write_lock(&torture_rwlock);
return 0;
@@ -251,12 +255,14 @@ static void torture_rwlock_write_delay(struct 
torture_random_state *trsp)
udelay(shortdelay_us);
 }
 
-static void torture_rwlock_write_unlock(void) __releases(torture_rwlock)
+static void torture_rwlock_write_unlock(void *arg)
+   __releases(torture_rwlock)
 {
write_unlock(&torture_rwlock);
 }
 
-static int torture_rwlock_read_lock(void) __acquires(torture_rwlock)
+static int torture_rwlock_read_lock(void *arg)
+   __acquires(torture_rwlock)
 {
read_lock(&torture_rwlock);
return 0;
@@ -277,7 +283,8 @@ static void torture_rwlock_read_delay(struct 
torture_random_state *trsp)
udelay(shortdelay_us);
 }
 
-static void torture_rwlock_read_unlock(void) __releases(torture_rwlock)
+static void torture_rwlock_read_unlock(void *arg)
+   __releases(torture_rwlock)
 {
read_unlock(&torture_rwlock);
 }
@@ -293,7 +300,8 @@ static struct lock_torture_ops rw_lock_ops = {
.name   = "rw_lock"
 };
 
-static int torture_rwlock_write_lock_irq(void) __acquires(torture_rwlock)
+static int torture_rwlock_write_lock_irq(void *arg)
+   __acquires(torture_rwlock)
 {
unsigned long flags;
 
@@ -302,13 +310,14 @@ static int torture_rwlock_write_lock_irq(void) 
__acquires(torture_rwlock)
return 0;
 }
 
-static void torture_rwlock_write_unlock_irq(void)
-__releases(torture_rwlock)
+static void torture_rwlock_write_unlock_irq(void *arg)
+   __releases(torture_rwlock)
 {
write_unlock_irqrestore(&torture_rwlock, cxt.cur_ops->flags);
 }
 
-static int torture_rwlock_read_lock_irq(void) __acquires(torture_rwlock)
+static 
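
As a rough user-space illustration of the signature change in this patch: the
lock ops callbacks gain a `void *arg`, presumably so that a range lock can be
handed the range to operate on, while the existing lock types ignore it. All
names below (`struct demo_lock_ops`, `struct demo_range`, the toy lock state)
are hypothetical and only mirror the shape of `struct lock_torture_ops`; this
is a sketch, not the kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical range descriptor; the kernel's range lock type is not shown. */
struct demo_range {
	unsigned long start;
	unsigned long last;
};

/* Same shape as the patched callbacks: each one now receives a void *arg. */
struct demo_lock_ops {
	int (*writelock)(void *arg);
	void (*writeunlock)(void *arg);
	const char *name;
};

static int demo_held;                  /* toy, single-threaded lock state */
static unsigned long demo_locked_last; /* last endpoint seen by the range variant */

/* Whole-lock variant: like the spinlock and rwlock callbacks, arg is unused. */
static int demo_plain_write_lock(void *arg)
{
	(void)arg;
	assert(!demo_held);
	demo_held = 1;
	return 0;
}

static void demo_plain_write_unlock(void *arg)
{
	(void)arg;
	demo_held = 0;
}

/* Range variant: arg carries the range the caller wants to lock. */
static int demo_range_write_lock(void *arg)
{
	struct demo_range *r = arg;

	assert(!demo_held && r->start <= r->last);
	demo_held = 1;
	demo_locked_last = r->last;
	return 0;
}

static void demo_range_write_unlock(void *arg)
{
	(void)arg;
	demo_held = 0;
}

static const struct demo_lock_ops demo_plain_ops = {
	.writelock   = demo_plain_write_lock,
	.writeunlock = demo_plain_write_unlock,
	.name        = "demo_plain",
};

static const struct demo_lock_ops demo_range_ops = {
	.writelock   = demo_range_write_lock,
	.writeunlock = demo_range_write_unlock,
	.name        = "demo_range",
};

/* One lock+unlock cycle through the ops table, as a torture thread might do. */
static int demo_torture_once(const struct demo_lock_ops *ops, void *arg)
{
	int ret = ops->writelock(arg);

	if (ret)
		return ret;
	ops->writeunlock(arg);
	return 0;
}
```

A caller would pass NULL with `demo_plain_ops` and a pointer to a
`struct demo_range` such as `{ 0, 4096 }` with `demo_range_ops`.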

Re: + mm-reclaim-madv_free-pages.patch added to -mm tree

2017-03-06 Thread Minchan Kim
On Mon, Mar 06, 2017 at 10:49:06AM -0500, Johannes Weiner wrote:
> On Mon, Mar 06, 2017 at 12:03:44PM +0900, Minchan Kim wrote:
> > On Fri, Mar 03, 2017 at 10:18:51AM -0500, Johannes Weiner wrote:
> > > On Fri, Mar 03, 2017 at 11:52:37AM +0900, Minchan Kim wrote:
> > > > On Tue, Feb 28, 2017 at 04:32:38PM -0800, a...@linux-foundation.org 
> > > > wrote:
> > > > > 
> > > > > The patch titled
> > > > >  Subject: mm: reclaim MADV_FREE pages
> > > > > has been added to the -mm tree.  Its filename is
> > > > >  mm-reclaim-madv_free-pages.patch
> > > > > 
> > > > > This patch should soon appear at
> > > > > 
> > > > > http://ozlabs.org/~akpm/mmots/broken-out/mm-reclaim-madv_free-pages.patch
> > > > > and later at
> > > > > 
> > > > > http://ozlabs.org/~akpm/mmotm/broken-out/mm-reclaim-madv_free-pages.patch
> > > > > 
> > > > > Before you just go and hit "reply", please:
> > > > >a) Consider who else should be cc'ed
> > > > >b) Prefer to cc a suitable mailing list as well
> > > > >c) Ideally: find the original patch on the mailing list and do a
> > > > >   reply-to-all to that, adding suitable additional cc's
> > > > > 
> > > > > *** Remember to use Documentation/SubmitChecklist when testing your 
> > > > > code ***
> > > > > 
> > > > > The -mm tree is included into linux-next and is updated
> > > > > there every 3-4 working days
> > > > > 
> > > > > --
> > > > > From: Shaohua Li 
> > > > > Subject: mm: reclaim MADV_FREE pages
> > > > > 
> > > > > When memory pressure is high, we free MADV_FREE pages.  If the pages 
> > > > > are
> > > > > not dirty in pte, the pages could be freed immediately.  Otherwise we
> > > > > can't reclaim them.  We put the pages back to anonymous LRU list (by
> > > > > setting SwapBacked flag) and the pages will be reclaimed in normal 
> > > > > swapout
> > > > > way.
> > > > > 
> > > > > We use normal page reclaim policy.  Since MADV_FREE pages are put into
> > > > > inactive file list, such pages and inactive file pages are reclaimed
> > > > > according to their age.  This is expected, because we don't want to
> > > > > reclaim too many MADV_FREE pages before used once pages.
> > > > > 
> > > > > Based on Minchan's original patch
> > > > > 
> > > > > Link: 
> > > > > http://lkml.kernel.org/r/14b8eb1d3f6bf6cc492833f183ac8c304e560484.1487965799.git.s...@fb.com
> > > > > Signed-off-by: Shaohua Li 
> > > > > Acked-by: Minchan Kim 
> > > > > Acked-by: Michal Hocko 
> > > > > Acked-by: Johannes Weiner 
> > > > > Acked-by: Hillf Danton 
> > > > > Cc: Hugh Dickins 
> > > > > Cc: Rik van Riel 
> > > > > Cc: Mel Gorman 
> > > > > Signed-off-by: Andrew Morton 
> > > > > ---
> > > > 
> > > > < snip >
> > > > 
> > > > > @@ -1419,11 +1413,21 @@ static int try_to_unmap_one(struct page
> > > > >   VM_BUG_ON_PAGE(!PageSwapCache(page) && 
> > > > > PageSwapBacked(page),
> > > > >   page);
> > > > >  
> > > > > - if (!PageDirty(page)) {
> > > > > + /*
> > > > > +  * swapin page could be clean, it has data 
> > > > > stored in
> > > > > +  * swap. We can't silently discard it without 
> > > > > setting
> > > > > +  * swap entry in the page table.
> > > > > +  */
> > > > > + if (!PageDirty(page) && !PageSwapCache(page)) {
> > > > >   /* It's a freeable page by MADV_FREE */
> > > > >   dec_mm_counter(mm, MM_ANONPAGES);
> > > > > - rp->lazyfreed++;
> > > > >   goto discard;
> > > > > + } else if (!PageSwapBacked(page)) {
> > > > > + /* dirty MADV_FREE page */
> > > > > + set_pte_at(mm, address, pvmw.pte, 
> > > > > pteval);
> > > > > + ret = SWAP_DIRTY;
> > > > > + page_vma_mapped_walk_done();
> > > > > + break;
> > > > >   }
> > > > 
> > > > There is no point to make this logic complicated with clean swapin-page.
> > > > 
> > > > Andrew,
> > > > Could you fold below patch into the mm-reclaim-madv_free-pages.patch
> > > > if others are not against?
> > > > 
> > > > Thanks.
> > > > 
> > > > From 0c28f6560fbc4e65da4f4a8cc4664ab9f7b11cf3 Mon Sep 17 00:00:00 2001
> > > > From: Minchan Kim 
> > > > Date: Fri, 3 Mar 2017 11:42:52 +0900
> > > > Subject: [PATCH] mm: clean up lazyfree page handling
> > > > 
> > > > We can make it simple to understand without need to be aware of
> > > > clean-swapin page.
> > > > This patch just cleans up lazyfree page handling in try_to_unmap_one.
> > > > 
> > > > Signed-off-by: Minchan Kim 
> > > 
> > > Agreed, this is a little easier to follow.
> > > 
> > > Acked-by: Johannes Weiner 
> > 
> > Thanks, Johannes.
> > 
> > > 
> > > 
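
For readers following the quoted try_to_unmap_one() hunk, its branch structure
can be modelled in user space roughly as below. This is only a sketch of the
hunk as quoted in this thread (before the proposed cleanup, which is not shown
here); `struct demo_page`, `enum demo_outcome` and `demo_classify()` are
hypothetical names, and the three booleans stand in for PageDirty(),
PageSwapCache() and PageSwapBacked().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcomes, named after what each branch of the hunk does. */
enum demo_outcome {
	DEMO_DISCARD,     /* clean lazyfree page: "goto discard" */
	DEMO_SWAP_DIRTY,  /* dirty MADV_FREE page: pte restored, ret = SWAP_DIRTY */
	DEMO_FALLTHROUGH, /* neither branch taken: code after the hunk runs */
};

/* Stand-ins for the page flag tests used in the quoted hunk. */
struct demo_page {
	bool dirty;       /* PageDirty() */
	bool swap_cache;  /* PageSwapCache() */
	bool swap_backed; /* PageSwapBacked() */
};

/* Mirrors the if / else-if chain of the quoted hunk, nothing more. */
static enum demo_outcome demo_classify(const struct demo_page *p)
{
	if (!p->dirty && !p->swap_cache)
		return DEMO_DISCARD;
	else if (!p->swap_backed)
		return DEMO_SWAP_DIRTY;
	return DEMO_FALLTHROUGH;
}
```

What happens on the fall-through path is outside the quoted context, so the
sketch deliberately stops there.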

[PATCH 5/5] staging/lustre: Use generic range rwlock

2017-03-06 Thread Davidlohr Bueso
This replaces the in-house version, which is also derived
from Jan's interval tree implementation.

Cc: oleg.dro...@intel.com
Cc: andreas.dil...@intel.com
Cc: jsimm...@infradead.org
Cc: lustre-de...@lists.lustre.org

Signed-off-by: Davidlohr Bueso 
---
XXX: compile tested only. The in-house version uses 'ulong long' while the
generic one uses 'ulong'; is this a problem?
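
To make the width question in the note above concrete: 'unsigned long' is 32
bits wide on 32-bit kernels, while 'unsigned long long' is at least 64 bits
everywhere, so a byte offset at or above 4 GiB does not survive being stored in
the narrower type. The sketch below models the two widths with fixed-size
types; `demo_wide_t`, `demo_narrow_t` and the helpers are hypothetical names.
Whether this actually matters for lustre depends on whether 32-bit builds need
such offsets, which the note leaves open.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical models of the two endpoint widths discussed in the note. */
typedef uint64_t demo_wide_t;   /* 'unsigned long long' endpoint */
typedef uint32_t demo_narrow_t; /* 'unsigned long' endpoint on a 32-bit build */

/* Value a narrow range endpoint would actually hold for a given offset. */
static demo_narrow_t demo_narrow_endpoint(demo_wide_t off)
{
	return (demo_narrow_t)off;
}

/* Returns 1 if storing the offset in the narrow type changes its value. */
static int demo_offset_truncates(demo_wide_t off)
{
	return (demo_wide_t)demo_narrow_endpoint(off) != off;
}
```

With these helpers, an offset of 4096 is preserved, while 4 GiB + 4096 would be
recorded as 4096, so two distinct ranges could appear to overlap.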

 drivers/staging/lustre/lustre/llite/Makefile   |   2 +-
 drivers/staging/lustre/lustre/llite/file.c |  21 +-
 .../staging/lustre/lustre/llite/llite_internal.h   |   4 +-
 drivers/staging/lustre/lustre/llite/llite_lib.c|   3 +-
 drivers/staging/lustre/lustre/llite/range_lock.c   | 239 -
 drivers/staging/lustre/lustre/llite/range_lock.h   |  82 ---
 6 files changed, 15 insertions(+), 336 deletions(-)
 delete mode 100644 drivers/staging/lustre/lustre/llite/range_lock.c
 delete mode 100644 drivers/staging/lustre/lustre/llite/range_lock.h

diff --git a/drivers/staging/lustre/lustre/llite/Makefile 
b/drivers/staging/lustre/lustre/llite/Makefile
index 322d4fa63f5d..922a901bc62c 100644
--- a/drivers/staging/lustre/lustre/llite/Makefile
+++ b/drivers/staging/lustre/lustre/llite/Makefile
@@ -1,6 +1,6 @@
 obj-$(CONFIG_LUSTRE_FS) += lustre.o
 lustre-y := dcache.o dir.o file.o llite_lib.o llite_nfs.o \
-   rw.o rw26.o namei.o symlink.o llite_mmap.o range_lock.o \
+   rw.o rw26.o namei.o symlink.o llite_mmap.o \
xattr.o xattr_cache.o xattr_security.o \
super25.o statahead.o glimpse.o lcommon_cl.o lcommon_misc.o \
vvp_dev.o vvp_page.o vvp_lock.o vvp_io.o vvp_object.o \
diff --git a/drivers/staging/lustre/lustre/llite/file.c 
b/drivers/staging/lustre/lustre/llite/file.c
index 481c0d01d4c6..1a14a79f87f8 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -42,6 +42,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "../include/lustre/ll_fiemap.h"
 #include "../include/lustre/lustre_ioctl.h"
 #include "../include/lustre_swab.h"
@@ -1055,7 +1056,7 @@ ll_file_io_generic(const struct lu_env *env, struct 
vvp_io_args *args,
struct ll_inode_info *lli = ll_i2info(file_inode(file));
struct ll_file_data  *fd  = LUSTRE_FPRIVATE(file);
struct vvp_io *vio = vvp_env_io(env);
-   struct range_lock range;
+   struct range_rwlock range;
struct cl_io *io;
ssize_t result = 0;
int rc = 0;
@@ -1072,9 +1073,9 @@ ll_file_io_generic(const struct lu_env *env, struct 
vvp_io_args *args,
bool range_locked = false;
 
if (file->f_flags & O_APPEND)
-   range_lock_init(&range, 0, LUSTRE_EOF);
+   range_rwlock_init(&range, 0, LUSTRE_EOF);
else
-   range_lock_init(&range, *ppos, *ppos + count - 1);
+   range_rwlock_init(&range, *ppos, *ppos + count - 1);
 
vio->vui_fd  = LUSTRE_FPRIVATE(file);
vio->vui_iter = args->u.normal.via_iter;
@@ -1087,10 +1088,9 @@ ll_file_io_generic(const struct lu_env *env, struct 
vvp_io_args *args,
if (((iot == CIT_WRITE) ||
 (iot == CIT_READ && (file->f_flags & O_DIRECT))) &&
!(vio->vui_fd->fd_flags & LL_FILE_GROUP_LOCKED)) {
-   CDEBUG(D_VFSTRACE, "Range lock [%llu, %llu]\n",
-  range.rl_node.in_extent.start,
-  range.rl_node.in_extent.end);
-   rc = range_lock(&lli->lli_write_tree, &range);
+   CDEBUG(D_VFSTRACE, "Range lock [%lu, %lu]\n",
+  range.node.start, range.node.last);
+   rc = range_write_lock_interruptible(&lli->lli_write_tree, &range);
if (rc < 0)
goto out;
 
@@ -1100,10 +1100,9 @@ ll_file_io_generic(const struct lu_env *env, struct 
vvp_io_args *args,
rc = cl_io_loop(env, io);
ll_cl_remove(file, env);
if (range_locked) {
-   CDEBUG(D_VFSTRACE, "Range unlock [%llu, %llu]\n",
-  range.rl_node.in_extent.start,
-  range.rl_node.in_extent.end);
-   range_unlock(&lli->lli_write_tree, &range);
+   CDEBUG(D_VFSTRACE, "Range unlock [%lu, %lu]\n",
+  range.node.start, range.node.last);
+   range_write_unlock(&lli->lli_write_tree, &range);
}
} else {
/* cl_io_rw_init() handled IO */
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h 
b/drivers/staging/lustre/lustre/llite/llite_internal.h
index 55f68acd85d1..aa2ae72e3e70 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -49,8 +49,8 @@
 #include 
 #include 
 #include 
+#include 
 #include "vvp_internal.h"
-#include 

Re: [tip:x86/asm] x86/asm: Optimize clear_page()

2017-03-06 Thread Yinghai Lu
On Wed, Mar 1, 2017 at 1:47 AM, tip-bot for Borislav Petkov
 wrote:
> Commit-ID:  49ca7bb328c630dd43be626534b49e19513296fd
> Gitweb: http://git.kernel.org/tip/49ca7bb328c630dd43be626534b49e19513296fd
> Author: Borislav Petkov 
> AuthorDate: Thu, 9 Feb 2017 01:34:49 +0100
> Committer:  Ingo Molnar 
> CommitDate: Wed, 1 Mar 2017 10:18:32 +0100
>
> x86/asm: Optimize clear_page()
>
> Currently, we CALL clear_page() which then JMPs to the proper function
> chosen by the alternatives.
>
> What we should do instead is CALL the proper function directly. (This
> was something Ingo suggested a while ago). So let's do that.

Looks like this one broke kexec.
After reverting it, kexec works again.

10:~/k # sh kk
add_buffer: base:43fff6000 bufsz:80e0 memsz:a000
add_buffer: base:43fff1000 bufsz:44ce memsz:44ce
add_buffer: base:43c00 bufsz:eb2360 memsz:352e000
add_buffer: base:439d0d000 bufsz:22f2060 memsz:22f2060
add_buffer: base:43fff bufsz:70 memsz:70
add_buffer: base:43ffef000 bufsz:140 memsz:140
10:~/k # [   79.250483] BUG: unable to handle kernel paging request at
c467661dc038
[   79.251562] IP: __handle_mm_fault+0x256/0x910
[   79.252157] PGD 0
[   79.252159]
[   79.252733] Oops:  [#1] SMP
[   79.253243] Modules linked in:
[   79.253718] CPU: 4 PID: 5593 Comm: hald-addon-stor Not tainted
4.11.0-rc1-yh-00100-g00db9e3-dirty #175
[   79.255054] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
[   79.256069] task: 8b43794c task.stack: b30dc6dac000
[   79.256887] RIP: 0010:__handle_mm_fault+0x256/0x910
[   79.257545] RSP: :b30dc6dafdd0 EFLAGS: 00010282
[   79.258225] RAX: 3928261dc000 RBX: 8b417a38dcf0 RCX: 3000
[   79.259175] RDX: 09cc3928261dcc7c RSI: 09cc3928261dcc7c RDI: b30dc6dafe48
[   79.260126] RBP: b30dc6dafe70 R08: 0001 R09: 8b43794c0c60
[   79.261095] R10: 3638e619 R11: 0001 R12: 8b427a72a538
[   79.261963] R13: c467661dc038 R14: b30dc6dafde0 R15: 0154
[   79.262903] FS:  7f29c1ce4740() GS:8b427ba0()
knlGS:
[   79.263973] CS:  0010 DS:  ES:  CR0: 80050033
[   79.264741] CR2: c467661dc038 CR3: 00033a512000 CR4: 06e0
[   79.265679] Call Trace:
[   79.266003]  ? handle_mm_fault+0x138/0x320
[   79.266431]  handle_mm_fault+0x247/0x320
[   79.266968]  ? handle_mm_fault+0x47/0x320
[   79.267491]  __do_page_fault+0x49f/0x500
[   79.268039]  do_page_fault+0x65/0x80
[   79.268508]  page_fault+0x22/0x30
[   79.268975] RIP: 0033:0x7f29c0ed53e8
[   79.269443] RSP: 002b:7ffe63a0e080 EFLAGS: 00010246
[   79.271605] RAX:  RBX: 07c7 RCX: 7f29c0ed53e8
[   79.272794] RDX: 07c7 RSI: 0002 RDI: 0060d0e0
[   79.273741] RBP: 0002 R08: 7f29c1457de0 R09: 
[   79.274698] R10: 0001 R11: 0246 R12: 0060ac20
[   79.275648] R13: 0060d0e0 R14: 0060ac28 R15: 7f29c1457de0
[   79.276596] Code: 3f 00 00 41 81 e5 f8 0f 00 00 f6 c2 80 48 0f 44
c1 4c 03 2d 25 9d ca 01 48 21 d0 49 01 c5 4d 85 ed 4c 89 6d 90 0f 84
d1 04 00 00 <49> 8b 75 00 48 f7 c6 9f ff ff ff 75 6a 48 8b 05 be 35 eb
01 a8
[   79.279121] RIP: __handle_mm_fault+0x256/0x910 RSP: b30dc6dafdd0
[   79.279965] CR2: c467661dc038
[   79.280403] ---[ end trace 7bd128a831f77757 ]---
[   79.298303] general protection fault:  [#2] SMP
[   79.298997] Modules linked in:
[   79.299402] CPU: 4 PID: 5593 Comm: hald-addon-stor Tainted: G
D 4.11.0-rc1-yh-00100-g00db9e3-dirty #175
[   79.300794] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
[   79.301707] task: 8b43794c task.stack: b30dc6dac000
[   79.302502] RIP: 0010:__wake_up_common+0x4a/0x90
[   79.303133] RSP: :8b427ba03de0 EFLAGS: 00010006
[   79.303807] RAX: b30dc6263da0 RBX: 765622af RCX: 
[   79.304769] RDX:  RSI: 0001 RDI: b30dc6263da0
[   79.305730] RBP: 8b427ba03e18 R08:  R09: 0001
[   79.306691] R10:  R11: 0e2e7ae4 R12: afe71d08
[   79.307642] R13: 58e0432d872b20f9 R14:  R15: 0001
[   79.308571] FS:  7f29c1ce4740() GS:8b427ba0()
knlGS:
[   79.309653] CS:  0010 DS:  ES:  CR0: 80050033
[   79.310434] CR2: c467661dc038 CR3: 00033a512000 CR4: 06e0
[   79.311398] Call Trace:
[   79.311724]  
[   79.311998]  __wake_up+0x39/0x50
[   79.312458]  wake_up_klogd_work_func+0x52/0x60
[   79.313119]  irq_work_run_list+0x43/0x70
[   79.313634]  ? tick_sched_handle.isra.16+0x50/0x50
[   79.314289]  irq_work_tick+0x40/0x50
[   79.314754]  update_process_times+0x42/0x60
[   79.315332]  tick_sched_handle.isra.16+0x41/0x50
[   79.315933]  tick_sched_timer+0x3d/0x70
[   79.316472]  __hrtimer_run_queues+0x264/0x440
[   

Re: linux-next: build failure after merge of the staging tree

2017-03-06 Thread Greg KH
On Tue, Mar 07, 2017 at 12:25:42PM +1100, Stephen Rothwell wrote:
> Hi Greg,
> 
> After merging the staging tree, today's linux-next build (x86_64
> allmodconfig) failed like this:
> 
> drivers/staging/media/atomisp/i2c/mt9m114.c:38:41: fatal error: 
> linux/atomisp_gmin_platform.h: No such file or directory
> drivers/staging/media/atomisp/i2c/gc2235.c:32:41: fatal error: 
> linux/atomisp_gmin_platform.h: No such file or directory
> drivers/staging/media/atomisp/i2c/ov2722.c:32:41: fatal error: 
> linux/atomisp_gmin_platform.h: No such file or directory
> drivers/staging/media/atomisp/platform/clock/vlv2_plat_clock.c:25:35: fatal 
> error: linux/vlv2_plat_clock.h: No such file or directory
> drivers/staging/media/atomisp/platform/intel-mid/intel_mid_pcihelpers.c:28:38:
>  fatal error: asm/intel_mid_pcihelpers.h: No such file or directory
> drivers/staging/media/atomisp/platform/intel-mid/atomisp_gmin_platform.c:10:35:
>  fatal error: linux/vlv2_plat_clock.h: No such file or directory
> In file included from 
> drivers/staging/media/atomisp/pci/atomisp2/./atomisp_drvfs.c:26:0:
> drivers/staging/media/atomisp/pci/atomisp2/./atomisp_compat.h:27:27: fatal 
> error: linux/atomisp.h: No such file or directory
> 
> Caused by commit
> 
>   a49d25364dfb ("staging/atomisp: Add support for the Intel IPU v2")
> 
> or maybe some of the followups?
> 
> I have used the staging tree from next-20170306 for today.

I just got a report from the kbuild bot about this as well, it didn't
used to happen before, and I can't duplicate it myself.  Alan, any hints
as to what broke here?

thanks,

greg k-h


[lkp-robot] [block] 165a5e22fa: BUG_kmalloc-#(Not_tainted):Poison_overwritten

2017-03-06 Thread kernel test robot
FYI, we noticed the following commit:

commit: 165a5e22fafb127ecb5914e12e8c32a1f0d3f820 ("block: Move bdi_unregister() 
to del_gendisk()")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

in testcase: trinity
with following parameters:

runtime: 300s

test-description: Trinity is a linux system call fuzz tester.
test-url: http://codemonkey.org.uk/projects/trinity/


on test machine: qemu-system-x86_64 -enable-kvm -m 420M

caused below changes (please refer to attached dmesg/kmsg for entire 
log/backtrace):


+-----------------------------------------------------------------------------------------------------+------------+------------+
|                                                                                                     | 113285b473 | 165a5e22fa |
+-----------------------------------------------------------------------------------------------------+------------+------------+
| boot_successes                                                                                      | 32         | 14         |
| boot_failures                                                                                       | 1          | 27         |
| invoked_oom-killer:gfp_mask=0x                                                                      | 1          |            |
| Mem-Info                                                                                            | 1          |            |
| page_allocation_failure:order:#,mode:#(GFP_USER),nodemask=(null)                                    | 1          |            |
| Out_of_memory:Kill_process                                                                          | 1          |            |
| page_allocation_failure:order:#,mode:#(GFP_KERNEL_ACCOUNT|__GFP_ZERO|__GFP_NOTRACK),nodemask=(null) | 1          |            |
| BUG_kmalloc-#(Not_tainted):Poison_overwritten                                                       | 0          | 23         |
| INFO:#-#.First_byte#instead_of                                                                      | 0          | 23         |
| INFO:Allocated_in_bdi_init_age=#cpu=#pid=                                                           | 0          | 23         |
| INFO:Freed_in_wb_congested_put_age=#cpu=#pid=                                                       | 0          | 23         |
| INFO:Slab#objects=#used=#fp=0x(null)flags=                                                          | 0          | 23         |
| INFO:Object#@offset=#fp=                                                                            | 0          | 23         |
| BUG:kernel_hang_in_test_stage                                                                       | 0          | 2          |
| BUG:kernel_hang_in_boot_stage                                                                       | 0          | 2          |
+-----------------------------------------------------------------------------------------------------+------------+------------+



[   17.819559] sd 0:0:0:0: [sdb] Write Protect is off
[   17.819562] sd 0:0:0:0: [sdb] Mode Sense: 73 00 10 08
[   17.823330] sd 0:0:0:0: [sdb] Write cache: enabled, read cache: enabled, supports DPO and FUA
[   17.826134] slram: not enough parameters.
[   17.830848] =============================================================================
[   17.831013] BUG kmalloc-16 (Not tainted): Poison overwritten
[   17.831013] -----------------------------------------------------------------------------
[   17.831013] 
[   17.831013] Disabling lock debugging due to kernel taint
[   17.831013] INFO: 0x88001296ddc8-0x88001296ddd0. First byte 0x6a instead of 0x6b
[   17.831013] INFO: Allocated in bdi_init+0x85/0x31d age=119 cpu=0 pid=161
[   17.831013]  ___slab_alloc+0x479/0x4c6
[   17.831013]  __slab_alloc+0x41/0x71
[   17.831013]  kmem_cache_alloc+0x57/0xc2
[   17.831013]  bdi_init+0x85/0x31d
[   17.831013]  bdi_alloc_node+0x3f/0x53
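
For readers new to this report format: with slab debugging enabled, a freed object is filled with the poison byte 0x6b (POISON_FREE in include/linux/poison.h), and a later check reports the first byte that no longer matches. The following is a minimal userspace sketch of that kind of check, not the kernel's implementation. The helper names are hypothetical, and the kernel's separate end-of-object marker byte is left out.

```c
#include <stddef.h>
#include <string.h>

#define POISON_FREE 0x6b	/* byte written into freed slab objects */

/* Hypothetical helper: simulate freeing an object by poisoning it. */
static void poison_object(unsigned char *obj, size_t len)
{
	memset(obj, POISON_FREE, len);
}

/*
 * Hypothetical helper: return the offset of the first byte that is no
 * longer POISON_FREE, or -1 if the whole object is still poisoned.
 */
static long first_poison_mismatch(const unsigned char *obj, size_t len)
{
	for (size_t i = 0; i < len; i++) {
		if (obj[i] != POISON_FREE)
			return (long)i;
	}
	return -1;
}
```

A byte reading 0x6a is what a single decrement of a poisoned byte would leave behind. That would be consistent with something like a counter being dropped after the object was freed, but the log excerpt alone does not establish that.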


To reproduce:

git clone https://github.com/01org/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k  job-script  # job-script is attached in this email



Thanks,
Ying Huang
#
# Automatically generated file; DO NOT EDIT.
# Linux/x86_64 4.10.0 Kernel Configuration
#
CONFIG_64BIT=y
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_MMU=y
CONFIG_ARCH_MMAP_RND_BITS_MIN=28
CONFIG_ARCH_MMAP_RND_BITS_MAX=32
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=8
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16
CONFIG_NEED_DMA_MAP_STATE=y
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y

Re: [PATCH 1/2] device: Stop requiring that struct device is embedded in struct pci_dev

2017-03-06 Thread Bart Van Assche
On Tue, 2017-03-07 at 05:08 +, Parav Pandit wrote:
> I replied with pseudo code in previous reply to Bart to bring back dma_device 
> member in the ib_device.
> dma_device member was already present in near past of few weeks.
> It should be able to work using it without performance impact and without 
> touching driver core layer like in this patch.

That's confusing and was a source of bugs and inconsistencies. We do not
want two device structures in struct ib_device (struct device dev and struct
device *dma_device).

Bart.

RE: [PATCH 1/2] device: Stop requiring that struct device is embedded in struct pci_dev

2017-03-06 Thread Parav Pandit
Hi Greg,

> -Original Message-
> From: Greg Kroah-Hartman [mailto:gre...@linuxfoundation.org]
> Sent: Monday, March 6, 2017 10:53 PM
> To: Bart Van Assche 
> Cc: Doug Ledford ; Sebastian Ott
> ; Parav Pandit ; linux-
> r...@vger.kernel.org; linux-kernel@vger.kernel.org; Bjorn Helgaas
> ; Benjamin Herrenschmidt
> ; David Woodhouse ;
> H . Peter Anvin ; Ingo Molnar ;
> Russell King 
> Subject: Re: [PATCH 1/2] device: Stop requiring that struct device is
> embedded in struct pci_dev
> 
> On Mon, Mar 06, 2017 at 04:35:48PM -0800, Bart Van Assche wrote:
> > The dma mapping operations of several architectures and also of
> > several I/O MMU implementations need to translate a struct device
> > pointer into a struct pci_dev pointer. This translation is performed
> > by to_pci_dev(). That macro assumes that struct device is embedded in
> > struct pci_dev. However, that is not the case for the device structure
> > in struct ib_device.
> 
> Then don't blindly cast it backwards!  Fix that up, an ib device should have
> access to the dma structures that the PCI device it depends on has.
> If not, you need to set that up properly in the IB core, don't mess with the
> driver core for this at all.
> 
I replied with pseudo code in previous reply to Bart to bring back dma_device 
member in the ib_device.
dma_device member was already present in near past of few weeks.
It should be able to work using it without performance impact and without 
touching driver core layer like in this patch.



Re: [PATCH] xfs: remove kmem_zalloc_greedy

2017-03-06 Thread Christoph Hellwig
On Tue, Mar 07, 2017 at 11:54:20AM +1100, Dave Chinner wrote:
> > Or maybe I've misunderstood, and you're asking if we should try
> > kmem_zalloc(4 pages), then kmem_zalloc(1 page), and only then switch to
> > the __vmalloc calls?
> 
> Just call kmem_zalloc_large() for 4 pages without a fallback on
> failure - that's exactly how we handle allocations for things like
> the 64k xattr buffers

Yeah, that sounds fine.  I didn't remember that we actually tried
kmalloc before vmalloc for kmem_zalloc_large.
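
The "try kmalloc, then vmalloc" behaviour mentioned above can be sketched in userspace as follows. This is only an illustration of the fallback pattern, not the XFS code: the function and type names are hypothetical, and calloc() stands in for both kernel allocators.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator hook, so the first attempt can be made to fail. */
typedef void *(*alloc_fn)(size_t size);

static void *zeroing_alloc(size_t size)
{
	return calloc(1, size);
}

static void *always_fail(size_t size)
{
	(void)size;
	return NULL;
}

/*
 * Try the primary allocator first (standing in for kmalloc) and use the
 * fallback (standing in for vmalloc) only if that fails. A NULL result
 * is passed straight back; the caller adds no further fallback.
 */
static void *zalloc_large_sketch(size_t size, alloc_fn primary,
				 alloc_fn fallback)
{
	void *p = primary(size);

	if (p)
		return p;
	return fallback(size);
}
```

With this shape the caller makes a single call for the 4-page buffer and handles NULL itself, which seems to be what "without a fallback on failure" refers to.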


Re: netlink: GPF in netlink_unicast

2017-03-06 Thread Richard Guy Briggs
On 2017-03-06 10:10, Cong Wang wrote:
> On Mon, Mar 6, 2017 at 2:54 AM, Dmitry Vyukov  wrote:
> > Hello,
> >
> > I've got the following crash while running syzkaller fuzzer on
> > net-next/8d70eeb84ab277377c017af6a21d0a337025dede:
> >
> > kasan: GPF could be caused by NULL-ptr deref or user memory access
> > general protection fault:  [#1] SMP KASAN
> > Dumping ftrace buffer:
> >(ftrace buffer empty)
> > Modules linked in:
> > CPU: 0 PID: 883 Comm: kauditd Not tainted 4.10.0+ #6
> > Hardware name: Google Google Compute Engine/Google Compute Engine,
> > BIOS Google 01/01/2011
> > task: 8801d79f0240 task.stack: 8801d7a2
> > RIP: 0010:sock_sndtimeo include/net/sock.h:2162 [inline]
> > RIP: 0010:netlink_unicast+0xdd/0x730 net/netlink/af_netlink.c:1249
> > RSP: 0018:8801d7a27c38 EFLAGS: 00010206
> > RAX: 0056 RBX: 8801d7a27cd0 RCX: 
> > RDX:  RSI:  RDI: 02b0
> > RBP: 8801d7a27cf8 R08: ed00385cf286 R09: ed00385cf286
> > R10: 0006 R11: ed00385cf285 R12: 
> > R13: dc00 R14: 8801c2fc3c80 R15: 014000c0
> > FS:  () GS:8801dbe0() knlGS:
> > CS:  0010 DS:  ES:  CR0: 80050033
> > CR2: 20cfd000 CR3: 0001c758f000 CR4: 001406f0
> > Call Trace:
> >  kauditd_send_unicast_skb+0x3c/0x70 kernel/audit.c:482
> >  kauditd_thread+0x174/0xb00 kernel/audit.c:599
> >  kthread+0x326/0x3f0 kernel/kthread.c:229
> >  ret_from_fork+0x31/0x40 arch/x86/entry/entry_64.S:430
> > Code: 44 89 fe e8 56 15 ff ff 8b 8d 70 ff ff ff 49 89 c6 31 c0 85 c9
> > 75 27 e8 b2 b2 f4 fd 49 8d bc 24 b0 02 00 00 48 89 f8 48 c1 e8 03 <42>
> > 80 3c 28 00 0f 85 37 06 00 00 49 8b 84 24 b0 02 00 00 4c 8d
> > RIP: sock_sndtimeo include/net/sock.h:2162 [inline] RSP: 8801d7a27c38
> > RIP: netlink_unicast+0xdd/0x730 net/netlink/af_netlink.c:1249 RSP:
> > 8801d7a27c38
> > ---[ end trace ad1bba9d457430b6 ]---
> > Kernel panic - not syncing: Fatal exception
> >
> >
> > This is not reproducible and seems to be caused by an elusive race.
> > However, looking at the code I don't see any proper protection of
> > audit_sock (other than the if (!audit_pid) which is obviously not
> > enough to protect against races).
> 
> audit_cmd_mutex is supposed to protect it, I think.
> But kauditd_send_unicast_skb() seems not holding this mutex.

Hmm, I wonder if it makes sense to wrap most of the contents of the
outer while loop in kauditd_thread in the audit_cmd_mutex, or around the
first two inner while loops and the "if (auditd)" condition after the
"quick_loop:" label.  The condition on auditd is supposed to catch that
case.  We don't want it locked while playing with the scheduler at the
bottom of that function.
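
To make the locking idea concrete, here is a minimal userspace sketch using pthreads. It is not the audit code: every name is hypothetical, and it shows only that the NULL check and the dereference happen under the same mutex the writer takes. Object lifetime and the question of where to drop the lock before scheduling are not modelled.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the socket the sender dereferences. */
struct fake_sock {
	int sndtimeo;
};

static pthread_mutex_t cmd_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct fake_sock *shared_sock;	/* may be cleared by another thread */

/* Writer side: install or clear the socket under the mutex. */
static void set_sock(struct fake_sock *sk)
{
	pthread_mutex_lock(&cmd_mutex);
	shared_sock = sk;
	pthread_mutex_unlock(&cmd_mutex);
}

/*
 * Reader side: check and use the pointer while holding the same mutex,
 * so it cannot be cleared between the check and the dereference.
 * Returns -1 when no socket is installed.
 */
static int send_timeout(void)
{
	int ret = -1;

	pthread_mutex_lock(&cmd_mutex);
	if (shared_sock)
		ret = shared_sock->sndtimeo;
	pthread_mutex_unlock(&cmd_mutex);
	return ret;
}
```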

> Richard?

- RGB

--
Richard Guy Briggs 
Kernel Security Engineering, Base Operating Systems, Red Hat
Remote, Ottawa, Canada
Voice: +1.647.777.2635, Internal: (81) 32635


Re: [PATCH 1/2] device: Stop requiring that struct device is embedded in struct pci_dev

2017-03-06 Thread Greg Kroah-Hartman
On Mon, Mar 06, 2017 at 04:35:48PM -0800, Bart Van Assche wrote:
> The dma mapping operations of several architectures and also of
> several I/O MMU implementations need to translate a struct
> device pointer into a struct pci_dev pointer. This translation
> is performed by to_pci_dev(). That macro assumes that struct
> device is embedded in struct pci_dev. However, that is not the
> case for the device structure in struct ib_device.

Then don't blindly cast it backwards!  Fix that up, an ib device should
have access to the dma structures that the PCI device it depends on has.
If not, you need to set that up properly in the IB core, don't mess with
the driver core for this at all.

Somehow all other subsystems work just fine, don't instantly think that
the driver core needs to bend to the will of the IB code, because you
are somehow "special".  Hint, you aren't :)

greg k-h
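
The "cast it backwards" problem can be illustrated with a small userspace sketch. The struct layouts below are simplified stand-ins rather than the kernel definitions, and the `_sketch` names are hypothetical. The dma_device pointer follows the member discussed elsewhere in this thread.

```c
#include <stddef.h>

struct device {
	int id;
};

struct pci_dev {
	int vendor;
	struct device dev;	/* embedded, so walking back to pci_dev is valid */
};

struct ib_device_sketch {
	struct device dev;		/* its own device, not inside any pci_dev */
	struct device *dma_device;	/* points at the PCI device's struct device */
};

/* Userspace equivalent of the kernel's container_of() idea. */
#define container_of_sketch(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Only meaningful when d really is the dev member of a struct pci_dev.
 * Passing &ib_device_sketch.dev would yield a bogus pointer.
 */
static struct pci_dev *to_pci_dev_sketch(struct device *d)
{
	return container_of_sketch(d, struct pci_dev, dev);
}
```

In this sketch, going through dma_device reaches the real pci_dev, whereas applying the macro to the ib device's own struct device would compute an address inside unrelated memory.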


[PATCH] irqchip/gic-v3: Fix GICD_CTLR_ARE_NS bit field

2017-03-06 Thread Alim Akhtar
From: Alim Akhtar 

As per GICv3 Architecture specification 8.9.4 field descriptions,
GICD_CTLR_ARE_NS is bit[5]. This patch corrects the same.

Fixes: 021f6537 ("irqchip: gic-v3: Initial support for GICv3")
Cc: sta...@vger.kernel.org
Signed-off-by: Alim Akhtar 
---
 include/linux/irqchip/arm-gic-v3.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/irqchip/arm-gic-v3.h 
b/include/linux/irqchip/arm-gic-v3.h
index e808f8a..4aaf639 100644
--- a/include/linux/irqchip/arm-gic-v3.h
+++ b/include/linux/irqchip/arm-gic-v3.h
@@ -57,7 +57,7 @@
 
 #define GICD_CTLR_RWP  (1U << 31)
 #define GICD_CTLR_DS   (1U << 6)
-#define GICD_CTLR_ARE_NS   (1U << 4)
+#define GICD_CTLR_ARE_NS   (1U << 5)
 #define GICD_CTLR_ENABLE_G1A   (1U << 1)
 #define GICD_CTLR_ENABLE_G1    (1U << 0)
 
-- 
2.7.4
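
As a quick illustration of what the one-line change does to a mask test, here is a small userspace sketch using the post-patch values from the diff. It takes the patch's reading of the specification at face value, and the helper name is hypothetical.

```c
#include <stdint.h>

/* Values as they appear after the patch above. */
#define GICD_CTLR_RWP		(1U << 31)
#define GICD_CTLR_DS		(1U << 6)
#define GICD_CTLR_ARE_NS	(1U << 5)
#define GICD_CTLR_ENABLE_G1A	(1U << 1)
#define GICD_CTLR_ENABLE_G1	(1U << 0)

/* Hypothetical helper: is ARE_NS set in a GICD_CTLR register value? */
static int gicd_ctlr_are_ns(uint32_t ctlr)
{
	return (ctlr & GICD_CTLR_ARE_NS) != 0;
}
```

With the old definition (1U << 4), the same test would look at mask 0x10 instead of 0x20, so a register value with only bit 5 set would have been reported as not having ARE_NS.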



Re: [PATCH net-next 1/5] ldmvsw: better use of link up and down on ldom vswitch

2017-03-06 Thread Shannon Nelson



On 3/6/2017 3:53 PM, Florian Fainelli wrote:

On 03/06/2017 03:15 PM, Shannon Nelson wrote:

When an ldom VM is bound, the network vswitch infrastructure is set up for
it, but was being forced 'UP' by the userland switch configuration script.
When 'UP' but not actually connected to a running VM, the ipv6 neighbor
probes fail (not a horrible thing) and start cluttering up the kernel logs.
Funny thing: these are debug messages that never actually show up, but
we do see the net_ratelimited messages that say N callbacks were
suppressed.

This patch defers the netif_carrier_on() until an actual link has been
established with the VM, as indicated by receiving an LDC_EVENT_UP from
the underlying LDC protocol.  Similarly, we take the link down when we
see the LDC_EVENT_RESET.

Orabug: 25525312

Signed-off-by: Shannon Nelson 
---
 drivers/net/ethernet/sun/ldmvsw.c |   10 +++---
 drivers/net/ethernet/sun/sunvnet_common.c |   14 ++
 2 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/sun/ldmvsw.c 
b/drivers/net/ethernet/sun/ldmvsw.c
index 89952de..c6f6d59 100644
--- a/drivers/net/ethernet/sun/ldmvsw.c
+++ b/drivers/net/ethernet/sun/ldmvsw.c
@@ -41,8 +41,8 @@
 static u8 vsw_port_hwaddr[ETH_ALEN] = {0xFE, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF};

 #define DRV_MODULE_NAME    "ldmvsw"
-#define DRV_MODULE_VERSION "1.1"
-#define DRV_MODULE_RELDATE "February 3, 2017"
+#define DRV_MODULE_VERSION "1.2"
+#define DRV_MODULE_RELDATE "March 4, 2017"

 static char version[] =
DRV_MODULE_NAME " " DRV_MODULE_VERSION " (" DRV_MODULE_RELDATE ")";
@@ -133,7 +133,6 @@ static void vsw_poll_controller(struct net_device *dev)
 #endif

 static const struct net_device_ops vsw_ops = {
-   .ndo_open   = sunvnet_open_common,


Is this change intentional? It was not entirely obvious where you would
be setting ::ndo_open in other places.


Yes, it is correct.  It does look a bit odd, but nearly all the work is 
done in the _probe(), and now the carrier_on happens a little later when 
the LDC_EVENT_UP is received, so there's no longer a need for the 
_open() call.
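
The behaviour described here, where carrier state follows channel events rather than the administrative 'UP', can be sketched as a small state handler. This is a userspace illustration with hypothetical names, not the driver code. The assignments stand in for netif_carrier_on() and, presumably, netif_carrier_off().

```c
#include <stdbool.h>

/* Hypothetical event codes standing in for the LDC events in the patch. */
enum ldc_event_sketch {
	EV_UP,
	EV_RESET,
	EV_DATA
};

struct port_sketch {
	bool carrier;
};

/* Carrier follows the link state reported by the channel. */
static void handle_event(struct port_sketch *p, enum ldc_event_sketch ev)
{
	switch (ev) {
	case EV_UP:
		p->carrier = true;	/* stands in for netif_carrier_on() */
		break;
	case EV_RESET:
		p->carrier = false;	/* stands in for netif_carrier_off() */
		break;
	default:
		break;			/* other events leave the carrier alone */
	}
}
```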


sln






[PATCH 09/18] pstore: Replace arguments for read() API

2017-03-06 Thread Kees Cook
The argument list for the pstore_read() interface is unwieldy. This change
passes the new struct pstore_record instead. The erst backend was already
doing something similar internally.
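
A rough userspace sketch of the shape of this change: instead of one out-pointer per value, the reader fills a single record. The struct below is an illustrative stand-in, not struct pstore_record. Its field set is an assumption, inferred from the old argument list and the record-> accesses in the diff, and the function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>	/* ssize_t */
#include <time.h>	/* struct timespec */

/* Illustrative stand-in for the record; the field set is an assumption. */
struct record_sketch {
	uint64_t id;
	int type;
	int count;
	struct timespec time;
	char *buf;
	bool compressed;
	ssize_t ecc_notice_size;
};

/*
 * New-style read: one pointer in, every output written through it.
 * Returns the payload length, mirroring the ssize_t return in the patch.
 */
static ssize_t read_sketch(struct record_sketch *record)
{
	static char payload[] = "oops";

	record->id = 1;
	record->type = 0;
	record->count = 0;
	record->time.tv_sec = 0;
	record->time.tv_nsec = 0;
	record->buf = payload;
	record->compressed = false;
	record->ecc_notice_size = 0;
	return (ssize_t)strlen(payload);
}
```

Adding a new output later then means adding a field to the record rather than touching the signature of every backend.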

Signed-off-by: Kees Cook 
---
 arch/powerpc/kernel/nvram_64.c|  61 +++---
 drivers/acpi/apei/erst.c  |  38 ++
 drivers/firmware/efi/efi-pstore.c | 104 --
 fs/pstore/platform.c  |   7 +--
 fs/pstore/ram.c   |  53 ++-
 include/linux/pstore.h|  20 +++-
 6 files changed, 124 insertions(+), 159 deletions(-)

diff --git a/arch/powerpc/kernel/nvram_64.c b/arch/powerpc/kernel/nvram_64.c
index d5e2b8309939..7f192001d09a 100644
--- a/arch/powerpc/kernel/nvram_64.c
+++ b/arch/powerpc/kernel/nvram_64.c
@@ -442,10 +442,7 @@ static int nvram_pstore_write(enum pstore_type_id type,
  * Returns the length of the data we read from each partition.
  * Returns 0 if we've been called before.
  */
-static ssize_t nvram_pstore_read(u64 *id, enum pstore_type_id *type,
-   int *count, struct timespec *time, char **buf,
-   bool *compressed, ssize_t *ecc_notice_size,
-   struct pstore_info *psi)
+static ssize_t nvram_pstore_read(struct pstore_record *record)
 {
struct oops_log_info *oops_hdr;
unsigned int err_type, id_no, size = 0;
@@ -459,40 +456,40 @@ static ssize_t nvram_pstore_read(u64 *id, enum 
pstore_type_id *type,
switch (nvram_type_ids[read_type]) {
case PSTORE_TYPE_DMESG:
part = _log_partition;
-   *type = PSTORE_TYPE_DMESG;
+   record->type = PSTORE_TYPE_DMESG;
break;
case PSTORE_TYPE_PPC_COMMON:
sig = NVRAM_SIG_SYS;
part = _partition;
-   *type = PSTORE_TYPE_PPC_COMMON;
-   *id = PSTORE_TYPE_PPC_COMMON;
-   time->tv_sec = 0;
-   time->tv_nsec = 0;
+   record->type = PSTORE_TYPE_PPC_COMMON;
+   record->id = PSTORE_TYPE_PPC_COMMON;
+   record->time.tv_sec = 0;
+   record->time.tv_nsec = 0;
break;
 #ifdef CONFIG_PPC_PSERIES
case PSTORE_TYPE_PPC_RTAS:
part = _log_partition;
-   *type = PSTORE_TYPE_PPC_RTAS;
-   time->tv_sec = last_rtas_event;
-   time->tv_nsec = 0;
+   record->type = PSTORE_TYPE_PPC_RTAS;
+   record->time.tv_sec = last_rtas_event;
+   record->time.tv_nsec = 0;
break;
case PSTORE_TYPE_PPC_OF:
sig = NVRAM_SIG_OF;
part = _config_partition;
-   *type = PSTORE_TYPE_PPC_OF;
-   *id = PSTORE_TYPE_PPC_OF;
-   time->tv_sec = 0;
-   time->tv_nsec = 0;
+   record->type = PSTORE_TYPE_PPC_OF;
+   record->id = PSTORE_TYPE_PPC_OF;
+   record->time.tv_sec = 0;
+   record->time.tv_nsec = 0;
break;
 #endif
 #ifdef CONFIG_PPC_POWERNV
case PSTORE_TYPE_PPC_OPAL:
sig = NVRAM_SIG_FW;
part = _partition;
-   *type = PSTORE_TYPE_PPC_OPAL;
-   *id = PSTORE_TYPE_PPC_OPAL;
-   time->tv_sec = 0;
-   time->tv_nsec = 0;
+   record->type = PSTORE_TYPE_PPC_OPAL;
+   record->id = PSTORE_TYPE_PPC_OPAL;
+   record->time.tv_sec = 0;
+   record->time.tv_nsec = 0;
break;
 #endif
default:
@@ -520,10 +517,10 @@ static ssize_t nvram_pstore_read(u64 *id, enum 
pstore_type_id *type,
return 0;
}
 
-   *count = 0;
+   record->count = 0;
 
if (part->os_partition)
-   *id = id_no;
+   record->id = id_no;
 
if (nvram_type_ids[read_type] == PSTORE_TYPE_DMESG) {
size_t length, hdr_size;
@@ -533,28 +530,28 @@ static ssize_t nvram_pstore_read(u64 *id, enum pstore_type_id *type,
/* Old format oops header had 2-byte record size */
hdr_size = sizeof(u16);
length = be16_to_cpu(oops_hdr->version);
-   time->tv_sec = 0;
-   time->tv_nsec = 0;
+   record->time.tv_sec = 0;
+   record->time.tv_nsec = 0;
} else {
hdr_size = sizeof(*oops_hdr);
length = be16_to_cpu(oops_hdr->report_length);
-   time->tv_sec = be64_to_cpu(oops_hdr->timestamp);
-   time->tv_nsec = 0;
+   record->time.tv_sec = be64_to_cpu(oops_hdr->timestamp);
+   record->time.tv_nsec = 0;
}
-   *buf = kmemdup(buff + hdr_size, 

[PATCH 09/18] pstore: Replace arguments for read() API

2017-03-06 Thread Kees Cook
The argument list for the pstore_read() interface is unwieldy. This change
passes the new struct pstore_record instead. The erst backend was already
doing something similar internally.

Signed-off-by: Kees Cook 
---
 arch/powerpc/kernel/nvram_64.c    |  61 +++---
 drivers/acpi/apei/erst.c          |  38 ++
 drivers/firmware/efi/efi-pstore.c | 104 --
 fs/pstore/platform.c              |   7 +--
 fs/pstore/ram.c                   |  53 ++-
 include/linux/pstore.h            |  20 +++-
 6 files changed, 124 insertions(+), 159 deletions(-)
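
For readers skimming the series, the shape of the conversion can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct demo_record, struct demo_time, enum demo_type_id and both demo_read_*() functions are hypothetical stand-ins, not the kernel's struct pstore_record or any backend's read() callback. Only the field names (type, id, count, time) are taken from the hunks below.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for enum pstore_type_id. */
enum demo_type_id { DEMO_TYPE_DMESG = 0, DEMO_TYPE_OTHER = 1 };

/* Hypothetical stand-in for struct timespec. */
struct demo_time {
	long tv_sec;
	long tv_nsec;
};

/* Hypothetical stand-in for struct pstore_record (illustrative fields only). */
struct demo_record {
	enum demo_type_id type;
	uint64_t id;
	int count;
	struct demo_time time;
	char *buf;
};

static char demo_payload[] = "oops";

/* Old style: every result travels through its own out-parameter. */
static long demo_read_old(uint64_t *id, enum demo_type_id *type, int *count,
			  struct demo_time *time, char **buf)
{
	*type = DEMO_TYPE_DMESG;
	*id = 1;
	*count = 0;
	time->tv_sec = 0;
	time->tv_nsec = 0;
	*buf = demo_payload;
	return (long)strlen(demo_payload);
}

/* New style: the same results are stored in one record the caller owns. */
static long demo_read_new(struct demo_record *record)
{
	record->type = DEMO_TYPE_DMESG;
	record->id = 1;
	record->count = 0;
	record->time.tv_sec = 0;
	record->time.tv_nsec = 0;
	record->buf = demo_payload;
	return (long)strlen(demo_payload);
}
```

A caller of the new form declares one struct demo_record, passes its address, and reads the fields afterwards. Adding a field later then touches the struct and the backends that fill it, rather than every prototype.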

diff --git a/arch/powerpc/kernel/nvram_64.c b/arch/powerpc/kernel/nvram_64.c
index d5e2b8309939..7f192001d09a 100644
--- a/arch/powerpc/kernel/nvram_64.c
+++ b/arch/powerpc/kernel/nvram_64.c
@@ -442,10 +442,7 @@ static int nvram_pstore_write(enum pstore_type_id type,
  * Returns the length of the data we read from each partition.
  * Returns 0 if we've been called before.
  */
-static ssize_t nvram_pstore_read(u64 *id, enum pstore_type_id *type,
-   int *count, struct timespec *time, char **buf,
-   bool *compressed, ssize_t *ecc_notice_size,
-   struct pstore_info *psi)
+static ssize_t nvram_pstore_read(struct pstore_record *record)
 {
struct oops_log_info *oops_hdr;
unsigned int err_type, id_no, size = 0;
@@ -459,40 +456,40 @@ static ssize_t nvram_pstore_read(u64 *id, enum pstore_type_id *type,
switch (nvram_type_ids[read_type]) {
case PSTORE_TYPE_DMESG:
part = &oops_log_partition;
-   *type = PSTORE_TYPE_DMESG;
+   record->type = PSTORE_TYPE_DMESG;
break;
case PSTORE_TYPE_PPC_COMMON:
sig = NVRAM_SIG_SYS;
part = &common_partition;
-   *type = PSTORE_TYPE_PPC_COMMON;
-   *id = PSTORE_TYPE_PPC_COMMON;
-   time->tv_sec = 0;
-   time->tv_nsec = 0;
+   record->type = PSTORE_TYPE_PPC_COMMON;
+   record->id = PSTORE_TYPE_PPC_COMMON;
+   record->time.tv_sec = 0;
+   record->time.tv_nsec = 0;
break;
 #ifdef CONFIG_PPC_PSERIES
case PSTORE_TYPE_PPC_RTAS:
part = &rtas_log_partition;
-   *type = PSTORE_TYPE_PPC_RTAS;
-   time->tv_sec = last_rtas_event;
-   time->tv_nsec = 0;
+   record->type = PSTORE_TYPE_PPC_RTAS;
+   record->time.tv_sec = last_rtas_event;
+   record->time.tv_nsec = 0;
break;
case PSTORE_TYPE_PPC_OF:
sig = NVRAM_SIG_OF;
part = &of_config_partition;
-   *type = PSTORE_TYPE_PPC_OF;
-   *id = PSTORE_TYPE_PPC_OF;
-   time->tv_sec = 0;
-   time->tv_nsec = 0;
+   record->type = PSTORE_TYPE_PPC_OF;
+   record->id = PSTORE_TYPE_PPC_OF;
+   record->time.tv_sec = 0;
+   record->time.tv_nsec = 0;
break;
 #endif
 #ifdef CONFIG_PPC_POWERNV
case PSTORE_TYPE_PPC_OPAL:
sig = NVRAM_SIG_FW;
part = &skiboot_partition;
-   *type = PSTORE_TYPE_PPC_OPAL;
-   *id = PSTORE_TYPE_PPC_OPAL;
-   time->tv_sec = 0;
-   time->tv_nsec = 0;
+   record->type = PSTORE_TYPE_PPC_OPAL;
+   record->id = PSTORE_TYPE_PPC_OPAL;
+   record->time.tv_sec = 0;
+   record->time.tv_nsec = 0;
break;
 #endif
default:
@@ -520,10 +517,10 @@ static ssize_t nvram_pstore_read(u64 *id, enum pstore_type_id *type,
return 0;
}
 
-   *count = 0;
+   record->count = 0;
 
if (part->os_partition)
-   *id = id_no;
+   record->id = id_no;
 
if (nvram_type_ids[read_type] == PSTORE_TYPE_DMESG) {
size_t length, hdr_size;
@@ -533,28 +530,28 @@ static ssize_t nvram_pstore_read(u64 *id, enum pstore_type_id *type,
/* Old format oops header had 2-byte record size */
hdr_size = sizeof(u16);
length = be16_to_cpu(oops_hdr->version);
-   time->tv_sec = 0;
-   time->tv_nsec = 0;
+   record->time.tv_sec = 0;
+   record->time.tv_nsec = 0;
} else {
hdr_size = sizeof(*oops_hdr);
length = be16_to_cpu(oops_hdr->report_length);
-   time->tv_sec = be64_to_cpu(oops_hdr->timestamp);
-   time->tv_nsec = 0;
+   record->time.tv_sec = be64_to_cpu(oops_hdr->timestamp);
+   record->time.tv_nsec = 0;
}
-   *buf = kmemdup(buff + hdr_size, length, GFP_KERNEL);
+
