Re: [PATCH 0/2] Automatically load the vmx_crypto module if supported

2016-07-12 Thread Alastair D'Silva
On Wed, 2016-07-13 at 15:47 +1000, alast...@au1.ibm.com wrote:
> From: Alastair D'Silva 
> > This series allows the vmx_crypto module to be detected and
> automatically
> loaded via UDEV if the CPU supports the vector crypto feature.
> > Alastair D'Silva (2):
>   powerpc: Add module autoloading based on CPU features
>   crypto: vmx - Convert to CPU feature based module autoloading
> >  arch/powerpc/Kconfig  |  1 +
>  arch/powerpc/include/asm/cpufeature.h | 70
> +++
>  drivers/crypto/vmx/Kconfig|  2 +-
>  drivers/crypto/vmx/vmx.c  |  6 +--
>  4 files changed, 74 insertions(+), 5 deletions(-)
>  create mode 100644 arch/powerpc/include/asm/cpufeature.h

Please ignore the following:
  [PATCH 1/2] Allow drivers to be autoloaded.
  [PATCH 2/2] Automatically load the vmx_crypto module if supported.

-- Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819



[PATCH 2/2] Automatically load the vmx_crypto module if supported.

2016-07-12 Thread alastair
From: Alastair D'Silva 

This patch utilises the GENERIC_CPU_AUTOPROBE infrastructure
to automatically load the vmx_crypto module if the CPU supports
it.

Signed-off-by: Alastair D'Silva 
---
 drivers/crypto/vmx/Kconfig | 2 +-
 drivers/crypto/vmx/vmx.c   | 6 ++
 2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/crypto/vmx/Kconfig b/drivers/crypto/vmx/Kconfig
index 89d8208..a83ead1 100644
--- a/drivers/crypto/vmx/Kconfig
+++ b/drivers/crypto/vmx/Kconfig
@@ -1,7 +1,7 @@
 config CRYPTO_DEV_VMX_ENCRYPT
tristate "Encryption acceleration support on P8 CPU"
depends on CRYPTO_DEV_VMX
-   default y
+   default m
help
  Support for VMX cryptographic acceleration instructions on Power8 CPU.
  This module supports acceleration for AES and GHASH in hardware. If you
diff --git a/drivers/crypto/vmx/vmx.c b/drivers/crypto/vmx/vmx.c
index e163d57..5a40f2f 100644
--- a/drivers/crypto/vmx/vmx.c
+++ b/drivers/crypto/vmx/vmx.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -43,9 +44,6 @@ int __init p8_init(void)
int ret = 0;
struct crypto_alg **alg_it;
 
-   if (!(cur_cpu_spec->cpu_user_features2 & PPC_FEATURE2_VEC_CRYPTO))
-   return -ENODEV;
-
for (alg_it = algs; *alg_it; alg_it++) {
ret = crypto_register_alg(*alg_it);
printk(KERN_INFO "crypto_register_alg '%s' = %d\n",
@@ -78,7 +76,7 @@ void __exit p8_exit(void)
crypto_unregister_shash(&p8_ghash_alg);
 }
 
-module_init(p8_init);
+module_cpu_feature_match(PPC_MODULE_FEATURE_VEC_CRYPTO, p8_init);
 module_exit(p8_exit);
 
 MODULE_AUTHOR("Marcelo Cerri");
-- 
2.7.4



Re: [PATCH 02/34] mm, vmscan: move lru_lock to the node

2016-07-12 Thread Balbir Singh
On Tue, Jul 12, 2016 at 12:18:05PM +0100, Mel Gorman wrote:
> On Tue, Jul 12, 2016 at 09:06:04PM +1000, Balbir Singh wrote:
> > > diff --git a/Documentation/cgroup-v1/memory.txt 
> > > b/Documentation/cgroup-v1/memory.txt
> > > index b14abf217239..946e69103cdd 100644
> > > --- a/Documentation/cgroup-v1/memory.txt
> > > +++ b/Documentation/cgroup-v1/memory.txt
> > > @@ -267,11 +267,11 @@ When oom event notifier is registered, event will 
> > > be delivered.
> > > Other lock order is following:
> > > PG_locked.
> > > mm->page_table_lock
> > > -   zone->lru_lock
> > > +   zone_lru_lock
> > 
> > zone_lru_lock is a little confusing, can't we just call it
> > node_lru_lock?
> > 
> 
> It's a matter of perspective. People familiar with the VM already expect
> a zone lock so will be looking for it. I can do a rename if you insist
> but it may not actually help.

I don't want to insist, but zone_ in the name can be confusing, as it
leads us to think that the lru_lock is still in the zone.

If the rest of the reviewers are fine with it, we don't need to rename.

> 
> > > @@ -496,7 +496,6 @@ struct zone {
> > >   /* Write-intensive fields used by page reclaim */
> > >  
> > >   /* Fields commonly accessed by the page reclaim scanner */
> > > - spinlock_t  lru_lock;
> > >   struct lruvec   lruvec;
> > >  
> > >   /*
> > > @@ -690,6 +689,9 @@ typedef struct pglist_data {
> > >   /* Number of pages migrated during the rate limiting time interval */
> > >   unsigned long numabalancing_migrate_nr_pages;
> > >  #endif
> > > + /* Write-intensive fields used by page reclaim */
> > > + ZONE_PADDING(_pad1_)
> > 
> > I thought this was to have zone->lock and zone->lru_lock in different
> > cachelines, do we still need the padding here?
> > 
> 
> The zone padding currently keeps the page lock wait tables, page allocator
> lists, compaction and vmstats on separate cache lines. They're still
> fine.
> 
> The node padding may not be necessary. It currently ensures that zonelists
> and numa balancing are separate from the LRU lock but there is no guarantee
> the current arrangement is optimal. It would depend on both the kernel
> config and the workload but it may be necessary in the future to split
> node into read-mostly sections and then different write-intensive sections
> similar to what has happened to struct zone in the past.
>
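
As an illustration of the cache-line separation discussed above, here is a
minimal userspace sketch, not kernel code. The struct demo_node, its field
names and the 64-byte DEMO_CACHELINE value are assumptions invented for the
example; the kernel's ZONE_PADDING and the real pglist_data layout differ.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cache-line size; real hardware may use a different value. */
#define DEMO_CACHELINE 64

/* Hypothetical node layout: read-mostly data first, then write-heavy groups. */
struct demo_node {
	unsigned long zonelist_ptr;                   /* read-mostly */
	_Alignas(DEMO_CACHELINE) int lru_lock;        /* write-intensive: reclaim */
	unsigned long lru_pages;
	_Alignas(DEMO_CACHELINE) unsigned long stats; /* write-intensive: counters */
};

/* Index of the cache line a field starts in, relative to the struct start. */
#define DEMO_LINE_OF(field) (offsetof(struct demo_node, field) / DEMO_CACHELINE)

/* Compile-time checks that the groups do not share a cache line. */
static_assert(DEMO_LINE_OF(zonelist_ptr) != DEMO_LINE_OF(lru_lock),
	      "lock must not share a line with read-mostly data");
static_assert(DEMO_LINE_OF(lru_lock) != DEMO_LINE_OF(stats),
	      "lock must not share a line with the counters");
```

Whether such a split pays off depends on the kernel config and the workload,
as noted in the quoted reply; the sketch only shows how the separation can be
expressed and checked.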

Fair enough

Balbir Singh. 


[PATCH 2/2] crypto: vmx - Convert to CPU feature based module autoloading

2016-07-12 Thread alastair
From: Alastair D'Silva 

This patch utilises the GENERIC_CPU_AUTOPROBE infrastructure
to automatically load the vmx_crypto module if the CPU supports
it.

Signed-off-by: Alastair D'Silva 
---
 drivers/crypto/vmx/Kconfig | 2 +-
 drivers/crypto/vmx/vmx.c   | 6 ++
 2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/crypto/vmx/Kconfig b/drivers/crypto/vmx/Kconfig
index 89d8208..a83ead1 100644
--- a/drivers/crypto/vmx/Kconfig
+++ b/drivers/crypto/vmx/Kconfig
@@ -1,7 +1,7 @@
 config CRYPTO_DEV_VMX_ENCRYPT
tristate "Encryption acceleration support on P8 CPU"
depends on CRYPTO_DEV_VMX
-   default y
+   default m
help
  Support for VMX cryptographic acceleration instructions on Power8 CPU.
  This module supports acceleration for AES and GHASH in hardware. If you
diff --git a/drivers/crypto/vmx/vmx.c b/drivers/crypto/vmx/vmx.c
index e163d57..5a40f2f 100644
--- a/drivers/crypto/vmx/vmx.c
+++ b/drivers/crypto/vmx/vmx.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -43,9 +44,6 @@ int __init p8_init(void)
int ret = 0;
struct crypto_alg **alg_it;
 
-   if (!(cur_cpu_spec->cpu_user_features2 & PPC_FEATURE2_VEC_CRYPTO))
-   return -ENODEV;
-
for (alg_it = algs; *alg_it; alg_it++) {
ret = crypto_register_alg(*alg_it);
printk(KERN_INFO "crypto_register_alg '%s' = %d\n",
@@ -78,7 +76,7 @@ void __exit p8_exit(void)
crypto_unregister_shash(&p8_ghash_alg);
 }
 
-module_init(p8_init);
+module_cpu_feature_match(PPC_MODULE_FEATURE_VEC_CRYPTO, p8_init);
 module_exit(p8_exit);
 
 MODULE_AUTHOR("Marcelo Cerri");
-- 
2.7.4



[PATCH 0/2] Automatically load the vmx_crypto module if supported

2016-07-12 Thread alastair
From: Alastair D'Silva 

This series allows the vmx_crypto module to be detected and automatically
loaded via UDEV if the CPU supports the vector crypto feature.

Alastair D'Silva (2):
  powerpc: Add module autoloading based on CPU features
  crypto: vmx - Convert to CPU feature based module autoloading

 arch/powerpc/Kconfig  |  1 +
 arch/powerpc/include/asm/cpufeature.h | 70 +++
 drivers/crypto/vmx/Kconfig|  2 +-
 drivers/crypto/vmx/vmx.c  |  6 +--
 4 files changed, 74 insertions(+), 5 deletions(-)
 create mode 100644 arch/powerpc/include/asm/cpufeature.h

-- 
2.7.4



[PATCH 1/2] powerpc: Add module autoloading based on CPU features

2016-07-12 Thread alastair
From: Alastair D'Silva 

This patch provides the necessary infrastructure to allow drivers
to be automatically loaded via UDEV. It implements the minimum
required to be able to use module_cpu_feature_match to trigger
the GENERIC_CPU_AUTOPROBE mechanisms.

The features exposed are a mirror of the cpu_user_features
(converted to an offset from a mask). This decision was made to
ensure that the behavior between features for module loading and
userspace are consistent.
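
The mask-to-offset conversion described above can be sketched in plain
userspace C. This is only an illustration: demo_ilog2(), demo_feature2_index()
and demo_cpu_has_feature() are hypothetical helpers, and the two mask values
are assumed for the example rather than taken from this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative mask values; assumptions, not the header's contents. */
#define DEMO_FEATURE_HAS_ALTIVEC  0x10000000u  /* bit in the first feature word */
#define DEMO_FEATURE2_VEC_CRYPTO  0x02000000u  /* bit in the second feature word */

/* Userspace stand-in for ilog2() applied to a single-bit mask. */
static unsigned int demo_ilog2(uint32_t mask)
{
	unsigned int bit = 0;

	assert(mask != 0);
	while (mask >>= 1)
		bit++;
	return bit;
}

/* Features from the second word are offset by 32, as in the patch. */
static unsigned int demo_feature2_index(uint32_t mask)
{
	return 32 + demo_ilog2(mask);
}

/* Test one feature index (0..63) against the two 32-bit feature words. */
static int demo_cpu_has_feature(uint32_t word1, uint32_t word2,
				unsigned int index)
{
	if (index < 32)
		return (word1 >> index) & 1u;
	return (word2 >> (index - 32)) & 1u;
}
```

With two 32-bit words, every feature gets a unique index below
MAX_CPU_FEATURES (2 * 32), which is what lets a single number identify a
feature for module matching.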

Signed-off-by: Alastair D'Silva 
---
 arch/powerpc/Kconfig  |  1 +
 arch/powerpc/include/asm/cpufeature.h | 70 +++
 2 files changed, 71 insertions(+)
 create mode 100644 arch/powerpc/include/asm/cpufeature.h

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 0a9d439..a6e49db 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -164,6 +164,7 @@ config PPC
select ARCH_HAS_UBSAN_SANITIZE_ALL
select ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT
select HAVE_LIVEPATCH if HAVE_DYNAMIC_FTRACE_WITH_REGS
+   select GENERIC_CPU_AUTOPROBE
 
 config GENERIC_CSUM
def_bool CPU_LITTLE_ENDIAN
diff --git a/arch/powerpc/include/asm/cpufeature.h 
b/arch/powerpc/include/asm/cpufeature.h
new file mode 100644
index 000..df31627
--- /dev/null
+++ b/arch/powerpc/include/asm/cpufeature.h
@@ -0,0 +1,70 @@
+/* CPU feature definitions for module loading, used by
+ * module_cpu_feature_match(), see asm/cputable.h for powerpc CPU features
+ *
+ * Copyright 2016 Alastair D'Silva, IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#ifndef __ASM_POWERPC_CPUFEATURE_H
+#define __ASM_POWERPC_CPUFEATURE_H
+
+#include 
+
+/* Keep these in step with powerpc/include/asm/cputable.h */
+#define MAX_CPU_FEATURES (2 * 32)
+
+#define PPC_MODULE_FEATURE_32  (ilog2(PPC_FEATURE_32))
+#define PPC_MODULE_FEATURE_64  (ilog2(PPC_FEATURE_64))
+#define PPC_MODULE_FEATURE_601_INSTR  (ilog2(PPC_FEATURE_601_INSTR))
+#define PPC_MODULE_FEATURE_HAS_ALTIVEC  (ilog2(PPC_FEATURE_HAS_ALTIVEC))
+#define PPC_MODULE_FEATURE_HAS_FPU  (ilog2(PPC_FEATURE_HAS_FPU))
+#define PPC_MODULE_FEATURE_HAS_MMU  (ilog2(PPC_FEATURE_HAS_MMU))
+#define PPC_MODULE_FEATURE_HAS_4xxMAC  (ilog2(PPC_FEATURE_HAS_4xxMAC))
+#define PPC_MODULE_FEATURE_UNIFIED_CACHE  (ilog2(PPC_FEATURE_UNIFIED_CACHE))
+#define PPC_MODULE_FEATURE_HAS_SPE  (ilog2(PPC_FEATURE_HAS_SPE))
+#define PPC_MODULE_FEATURE_HAS_EFP_SINGLE  (ilog2(PPC_FEATURE_HAS_EFP_SINGLE))
+#define PPC_MODULE_FEATURE_HAS_EFP_DOUBLE  (ilog2(PPC_FEATURE_HAS_EFP_DOUBLE))
+#define PPC_MODULE_FEATURE_NO_TB  (ilog2(PPC_FEATURE_NO_TB))
+#define PPC_MODULE_FEATURE_POWER4  (ilog2(PPC_FEATURE_POWER4))
+#define PPC_MODULE_FEATURE_POWER5  (ilog2(PPC_FEATURE_POWER5))
+#define PPC_MODULE_FEATURE_POWER5_PLUS  (ilog2(PPC_FEATURE_POWER5_PLUS))
+#define PPC_MODULE_FEATURE_CELL  (ilog2(PPC_FEATURE_CELL))
+#define PPC_MODULE_FEATURE_BOOKE  (ilog2(PPC_FEATURE_BOOKE))
+#define PPC_MODULE_FEATURE_SMT  (ilog2(PPC_FEATURE_SMT))
+#define PPC_MODULE_FEATURE_ICACHE_SNOOP  (ilog2(PPC_FEATURE_ICACHE_SNOOP))
+#define PPC_MODULE_FEATURE_ARCH_2_05  (ilog2(PPC_FEATURE_ARCH_2_05))
+#define PPC_MODULE_FEATURE_PA6T  (ilog2(PPC_FEATURE_PA6T))
+#define PPC_MODULE_FEATURE_HAS_DFP  (ilog2(PPC_FEATURE_HAS_DFP))
+#define PPC_MODULE_FEATURE_POWER6_EXT  (ilog2(PPC_FEATURE_POWER6_EXT))
+#define PPC_MODULE_FEATURE_ARCH_2_06  (ilog2(PPC_FEATURE_ARCH_2_06))
+#define PPC_MODULE_FEATURE_HAS_VSX  (ilog2(PPC_FEATURE_HAS_VSX))
+#define PPC_MODULE_FEATURE_PSERIES_PERFMON_COMPAT  (ilog2(PPC_FEATURE_PSERIES_PERFMON_COMPAT))
+#define PPC_MODULE_FEATURE_TRUE_LE  (ilog2(PPC_FEATURE_TRUE_LE))
+#define PPC_MODULE_FEATURE_PPC_LE  (ilog2(PPC_FEATURE_PPC_LE))
+
+#define PPC_MODULE_FEATURE_ARCH_2_07  (32 + ilog2(PPC_FEATURE2_ARCH_2_07))
+#define PPC_MODULE_FEATURE_HTM  (32 + ilog2(PPC_FEATURE2_HTM))
+#define PPC_MODULE_FEATURE_DSCR  (32 + ilog2(PPC_FEATURE2_DSCR))
+#define PPC_MODULE_FEATURE_EBB  (32 + ilog2(PPC_FEATURE2_EBB))
+#define PPC_MODULE_FEATURE_ISEL  (32 + ilog2(PPC_FEATURE2_ISEL))
+#define 

[PATCH 1/2] Allow drivers to be autoloaded.

2016-07-12 Thread alastair
From: Alastair D'Silva 

This patch provides the necessary infrastructure to allow drivers
to be automatically loaded via UDEV. It implements the minimum
required to be able to use module_cpu_feature_match to trigger
the GENERIC_CPU_AUTOPROBE mechanisms.

The features exposed are a mirror of the cpu_user_features
(converted to an offset from a mask). This decision was made to
ensure that the behavior between features for module loading and
userspace are consistent.

Signed-off-by: Alastair D'Silva 
---
 arch/powerpc/Kconfig  |  1 +
 arch/powerpc/include/asm/cpufeature.h | 68 +++
 2 files changed, 69 insertions(+)
 create mode 100644 arch/powerpc/include/asm/cpufeature.h

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 0a9d439..a6e49db 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -164,6 +164,7 @@ config PPC
select ARCH_HAS_UBSAN_SANITIZE_ALL
select ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT
select HAVE_LIVEPATCH if HAVE_DYNAMIC_FTRACE_WITH_REGS
+   select GENERIC_CPU_AUTOPROBE
 
 config GENERIC_CSUM
def_bool CPU_LITTLE_ENDIAN
diff --git a/arch/powerpc/include/asm/cpufeature.h 
b/arch/powerpc/include/asm/cpufeature.h
new file mode 100644
index 000..6d52527
--- /dev/null
+++ b/arch/powerpc/include/asm/cpufeature.h
@@ -0,0 +1,68 @@
+/*
+ * Copyright 2016 Alastair D'Silva, IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#ifndef __ASM_CPUFEATURE_H
+#define __ASM_CPUFEATURE_H
+
+#include 
+
+/* Keep these in step with powerpc/include/asm/cputable.h */
+#define MAX_CPU_FEATURES (2 * 32)
+
+#define PPC_MODULE_FEATURE_32  (ilog2(PPC_FEATURE_32))
+#define PPC_MODULE_FEATURE_64  (ilog2(PPC_FEATURE_64))
+#define PPC_MODULE_FEATURE_601_INSTR  (ilog2(PPC_FEATURE_601_INSTR))
+#define PPC_MODULE_FEATURE_HAS_ALTIVEC  (ilog2(PPC_FEATURE_HAS_ALTIVEC))
+#define PPC_MODULE_FEATURE_HAS_FPU  (ilog2(PPC_FEATURE_HAS_FPU))
+#define PPC_MODULE_FEATURE_HAS_MMU  (ilog2(PPC_FEATURE_HAS_MMU))
+#define PPC_MODULE_FEATURE_HAS_4xxMAC  (ilog2(PPC_FEATURE_HAS_4xxMAC))
+#define PPC_MODULE_FEATURE_UNIFIED_CACHE  (ilog2(PPC_FEATURE_UNIFIED_CACHE))
+#define PPC_MODULE_FEATURE_HAS_SPE  (ilog2(PPC_FEATURE_HAS_SPE))
+#define PPC_MODULE_FEATURE_HAS_EFP_SINGLE  (ilog2(PPC_FEATURE_HAS_EFP_SINGLE))
+#define PPC_MODULE_FEATURE_HAS_EFP_DOUBLE  (ilog2(PPC_FEATURE_HAS_EFP_DOUBLE))
+#define PPC_MODULE_FEATURE_NO_TB  (ilog2(PPC_FEATURE_NO_TB))
+#define PPC_MODULE_FEATURE_POWER4  (ilog2(PPC_FEATURE_POWER4))
+#define PPC_MODULE_FEATURE_POWER5  (ilog2(PPC_FEATURE_POWER5))
+#define PPC_MODULE_FEATURE_POWER5_PLUS  (ilog2(PPC_FEATURE_POWER5_PLUS))
+#define PPC_MODULE_FEATURE_CELL  (ilog2(PPC_FEATURE_CELL))
+#define PPC_MODULE_FEATURE_BOOKE  (ilog2(PPC_FEATURE_BOOKE))
+#define PPC_MODULE_FEATURE_SMT  (ilog2(PPC_FEATURE_SMT))
+#define PPC_MODULE_FEATURE_ICACHE_SNOOP  (ilog2(PPC_FEATURE_ICACHE_SNOOP))
+#define PPC_MODULE_FEATURE_ARCH_2_05  (ilog2(PPC_FEATURE_ARCH_2_05))
+#define PPC_MODULE_FEATURE_PA6T  (ilog2(PPC_FEATURE_PA6T))
+#define PPC_MODULE_FEATURE_HAS_DFP  (ilog2(PPC_FEATURE_HAS_DFP))
+#define PPC_MODULE_FEATURE_POWER6_EXT  (ilog2(PPC_FEATURE_POWER6_EXT))
+#define PPC_MODULE_FEATURE_ARCH_2_06  (ilog2(PPC_FEATURE_ARCH_2_06))
+#define PPC_MODULE_FEATURE_HAS_VSX  (ilog2(PPC_FEATURE_HAS_VSX))
+#define PPC_MODULE_FEATURE_PSERIES_PERFMON_COMPAT  (ilog2(PPC_FEATURE_PSERIES_PERFMON_COMPAT))
+#define PPC_MODULE_FEATURE_TRUE_LE  (ilog2(PPC_FEATURE_TRUE_LE))
+#define PPC_MODULE_FEATURE_PPC_LE  (ilog2(PPC_FEATURE_PPC_LE))
+
+#define PPC_MODULE_FEATURE_ARCH_2_07  (32 + ilog2(PPC_FEATURE2_ARCH_2_07))
+#define PPC_MODULE_FEATURE_HTM  (32 + ilog2(PPC_FEATURE2_HTM))
+#define PPC_MODULE_FEATURE_DSCR  (32 + ilog2(PPC_FEATURE2_DSCR))
+#define PPC_MODULE_FEATURE_EBB  (32 + ilog2(PPC_FEATURE2_EBB))
+#define PPC_MODULE_FEATURE_ISEL  (32 + ilog2(PPC_FEATURE2_ISEL))
+#define PPC_MODULE_FEATURE_TAR  (32 + ilog2(PPC_FEATURE2_TAR))
+#define PPC_MODULE_FEATURE_VEC_CRYPTO  (32 + 

Re: Add MediaTek USB3 DRD Driver

2016-07-12 Thread chunfeng yun
Hi Felipe:

Could you please give me some suggestions if you have already reviewed
some of the code?

Thanks a lot.


On Wed, 2016-06-15 at 11:07 +0800, Chunfeng Yun wrote:
> From 48552e96e4e33f8830cb6a59154fe148425532fd Mon Sep 17 00:00:00 2001
> From: Chunfeng Yun 
> Date: Wed, 15 Jun 2016 10:58:10 +0800
> Subject: [PATCH v4,0/5] Add MediaTek USB3 DRD Driver
> 
> These patches introduce the MediaTek USB3 dual-role controller
> driver.
> 
> The driver can be configured as Dual-Role Device (DRD),
> Peripheral Only and Host Only (xHCI) modes. It works well
> with Mass Storage, RNDIS and g_zero on FS/HS and SS. And it is
> tested on MT8173 platform which only contains USB2.0 device IP,
> and on MT6290 platform which contains USB3.0 device IP.
> 
> Change in v4:
> 1. fix build errors on non-mediatek platforms
> 2. provide manual dual-role switch via debugfs instead of sysfs
> 

> --
> 1.7.9.5
> 
> 




Re: [Query] Preemption (hogging) of the work handler

2016-07-12 Thread Sergey Senozhatsky
Cc Petr Mladek.

On (07/12/16 16:19), Viresh Kumar wrote:
[..]
> Okay, we have tracked this BUG and it's really interesting.

good find!

> I hacked the platform's serial driver to implement a putchar() routine
> that simply writes to the FIFO in polling mode, that helped us in
> tracing on where we are going wrong.
> 
> The problem is that we are running asynchronous printks and we call
> wake_up_process() from the last running CPU which has disabled
> interrupts. That takes us to: try_to_wake_up().
> 
> In our case the CPU gets deadlocked on this line in try_to_wake_up().
> 
> raw_spin_lock_irqsave(&p->pi_lock, flags);

yeah, printk() can't handle these types of recursion. it can prevent
printk() calls issued from within the logbuf_lock spinlock section,
with some limitations:

if (unlikely(logbuf_cpu == smp_processor_id())) {
recursion_bug = true;
return;
}

raw_spin_lock(&logbuf_lock);
logbuf_cpu = this_cpu;
...
logbuf_cpu = UINT_MAX;
raw_spin_unlock(&logbuf_lock);

so should, for instance, raw_spin_unlock() generate spin_dump(), printk()
will blow up (both sync and async), because logbuf_cpu is already reset.
it may look that async printk added another source of recursion - wake_up().
but, apparently, this is not exactly correct. because there is already a
wake_up() call in console_unlock() - up().
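
To make the window concrete, here is a small single-threaded model of the
logbuf_cpu check, written as a hedged userspace sketch. struct demo_logbuf and
the demo_* functions are invented for illustration and only mimic the order of
operations quoted above; they are not the kernel's printk code.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define DEMO_NO_CPU UINT_MAX

/* Hypothetical model of logbuf_lock plus the logbuf_cpu owner marker. */
struct demo_logbuf {
	bool lock_held;
	unsigned int owner_cpu;
	bool recursion_bug;
};

enum demo_result { DEMO_ENTERED, DEMO_RECURSION_CAUGHT, DEMO_WOULD_SPIN };

/* Model of entering printk() on a given CPU. */
static enum demo_result demo_printk_enter(struct demo_logbuf *lb,
					  unsigned int cpu)
{
	if (lb->owner_cpu == cpu) {
		/* Recursion on the owning CPU is detected and bails out. */
		lb->recursion_bug = true;
		return DEMO_RECURSION_CAUGHT;
	}
	if (lb->lock_held)
		/* A real spinlock would spin here; forever if this CPU holds it. */
		return DEMO_WOULD_SPIN;
	lb->lock_held = true;
	lb->owner_cpu = cpu;
	return DEMO_ENTERED;
}

/* Model of "logbuf_cpu = UINT_MAX", done while the lock is still held. */
static void demo_printk_reset_owner(struct demo_logbuf *lb)
{
	assert(lb->lock_held);
	lb->owner_cpu = DEMO_NO_CPU;
}

/* Model of the final unlock. */
static void demo_printk_unlock(struct demo_logbuf *lb)
{
	assert(lb->lock_held);
	lb->lock_held = false;
}
```

A nested call made between demo_printk_reset_owner() and demo_printk_unlock()
is no longer caught by the owner check, which corresponds to the spin_dump()
case described above.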

printk()
 if (logbuf_cpu == smp_processor_id())
return;

 raw_spin_lock(&logbuf_lock);
 logbuf_cpu = this_cpu;
 ...
 logbuf_cpu = UINT_MAX;
 raw_spin_unlock(&logbuf_lock);

 console_trylock()
   raw_spin_lock_irqsave(&sem->lock)  << ***
   raw_spin_unlock_irqsave(&sem->lock)  << ***

 console_unlock()
  up()
   raw_spin_lock_irqsave(&sem->lock)  << ***
__up()
 wake_up_process()
  try_to_wake_up()  << *** in many places


*** a printk() call from here will kill the system. either it will
recurse printk(), or spin forever in 'nested' printk() on one of
the already taken spin locks.

I had an idea of waking up a printk_kthread under logbuf_lock,
so `logbuf_cpu == smp_processor_id()' test would help here. But
it turned out to introduce a regression in printk() behaviour.
apart from that, it didn't fix any of the existing recursion
printks.

there is printk_deferred() printk that is supposed to be used for
printing under scheduler locks, but it won't help in all of the cases.

printk() has many issues.

> I will explain how:
> 
> The try_to_wake_up() function takes us through the scheduler code (RT
> sched), to the hrtimer code, where we eventually call ktime_get() (for
> the MONOTONIC clock used for hrtimer). And this function has this:
> 
> WARN_ON(timekeeping_suspended);
> 
> This starts another printk while we are in the middle of
> wake_up_process() and the CPU tries to take the above lock again and
> gets stuck there :)
> 
> This doesn't happen everytime because we don't always call ktime_get()
> and it is called only if hrtimer_active() returns false.
> 
> This happened because of a WARN_ON() but it can happen anyway. Think
> about this case:
> 
> - offline all CPUs, except 0
> - call any routine that prints messages after disabling interrupts,
>   etc.
> - If any of the function within wake_up_process() does a print, we are
>   screwed.
> 
> So the thing is that we can't really call wake_up_process() in cases
> where the last CPU disables interrupts. And that's why my fixup patch
> (which moved to synchronous prints after suspend) really works.
> 
> @Jan and Sergey: I would expect a patch from you guys to fix this
> properly :)
> 
> Maybe something more in can_print_async() routine, like:
> 
> only-one-cpu-online + irqs_disabled()
> 
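
The condition suggested above can be written as a small predicate. This is a
hedged userspace sketch: demo_can_print_async() and its parameters are
hypothetical and are not the helper from the async printk series; the values
would come from num_online_cpus(), irqs_disabled() and the state of the printk
kthread in real code.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Decide whether a message may be handed to the printing thread.
 * Falls back to synchronous printing when waking that thread is unsafe.
 */
static bool demo_can_print_async(unsigned int online_cpus,
				 bool irqs_disabled,
				 bool printk_thread_ready)
{
	assert(online_cpus >= 1);

	if (!printk_thread_ready)
		return false;
	/* Last CPU with interrupts off: a wake-up here may recurse into printk. */
	if (online_cpus == 1 && irqs_disabled)
		return false;
	return true;
}
```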

right. adding only (num_online_cpus() > 1) check to can_printk_async()
*in theory* can break some cases. for example, SMP system, with only
one online CPU, active rt_sched throttling (not necessarily because of
printk kthread, any other task will do), and some of interrupts services
by CPU0 keep calling printk(), so deferred printk IRQ will stay busy:

echo 0 > /sys//cpu{1..NR_CPUS}/online  # only CPU0 is active

CPU0
sched()
 printk_deferred()
IRQ
wake_up_klogd_work_func()
console_trylock()
console_unlock()

IRQ
printk()

IRQ
printk()

IRQ
  

Re: [PATCH 1/2] HID: logitech-hidpp: add battery support for HID++ 2.0 devices

2016-07-12 Thread Peter Hutterer
On Fri, Jul 08, 2016 at 04:35:45PM +0200, Bastien Nocera wrote:
> On Wed, 2016-06-29 at 19:28 +1000, Peter Hutterer wrote:
> > +static int hidpp_battery_get_property(struct power_supply *psy,
> > + enum power_supply_property psp,
> > + union power_supply_propval
> > *val)
> > +{
> > +   struct hidpp_device *hidpp = power_supply_get_drvdata(psy);
> > +   int ret = 0;
> > +
> > +   switch(psp) {
> > +   case POWER_SUPPLY_PROP_STATUS:
> > +   val->intval = hidpp->battery.status;
> > +   break;
> > +   case POWER_SUPPLY_PROP_CAPACITY:
> > +   val->intval = hidpp->battery.level;
> > +   break;
> > +   default:
> 
> You forgot to handle POWER_SUPPLY_PROP_SCOPE. This means that UPower
> thinks it's supplying power to the computer to which it is connected.
> 
> Should be set to "POWER_SUPPLY_SCOPE_DEVICE". This should fix it.
> 
> From 8fbfcfd411a4b2c55ec24adc8b8ecc0bca2db5e3 Mon Sep 17 00:00:00 2001
> From: Bastien Nocera 
> Date: Fri, 8 Jul 2016 16:34:18 +0200
> Subject: [PATCH] HID: logitech-hidpp: Add scope to battery
> 
> Without a scope defined, UPower assumes that the battery is providing
> power to the computer it's connected to, like a laptop battery or a UPS.
> 
> Signed-off-by: Bastien Nocera 

Tested-by: Peter Hutterer 

Cheers,
   Peter

> ---
>  drivers/hid/hid-logitech-hidpp.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/hid/hid-logitech-hidpp.c 
> b/drivers/hid/hid-logitech-hidpp.c
> index 4eeb550..4aaf237 100644
> --- a/drivers/hid/hid-logitech-hidpp.c
> +++ b/drivers/hid/hid-logitech-hidpp.c
> @@ -761,6 +761,7 @@ static int hidpp20_battery_event(struct hidpp_device 
> *hidpp,
>  static enum power_supply_property hidpp_battery_props[] = {
> POWER_SUPPLY_PROP_STATUS,
> POWER_SUPPLY_PROP_CAPACITY,
> +   POWER_SUPPLY_PROP_SCOPE,
>  };
>  
>  static int hidpp_battery_get_property(struct power_supply *psy,
> @@ -777,6 +778,9 @@ static int hidpp_battery_get_property(struct power_supply 
> *psy,
> case POWER_SUPPLY_PROP_CAPACITY:
> val->intval = hidpp->battery.level;
> break;
> +   case POWER_SUPPLY_PROP_SCOPE:
> +   val->intval = POWER_SUPPLY_SCOPE_DEVICE;
> +   break;
> default:
> ret = -EINVAL;
> break;
> -- 
> 2.7.4
> 


[PATCH 1/2] ipc/sem.c: Fix complex_count vs. simple op race

2016-07-12 Thread Manfred Spraul
Commit 6d07b68ce16a ("ipc/sem.c: optimize sem_lock()") introduced a
race:

sem_lock has a fast path that allows parallel simple operations.
There are two reasons why a simple operation cannot run in parallel:
- a non-simple operation is ongoing (sma->sem_perm.lock held)
- a complex operation is sleeping (sma->complex_count != 0)

As both facts are stored independently, a thread can bypass the current
checks by sleeping in the right positions. See below for more details
(or kernel bugzilla 105651).

The patch fixes that by creating one variable (complex_mode)
that tracks both reasons why parallel operations are not possible.

The patch also updates stale documentation regarding the locking.

With regards to stable kernels:
The patch is required for all kernels that include the
commit 6d07b68ce16a ("ipc/sem.c: optimize sem_lock()") (3.10?)

The alternative is to revert the patch that introduced the race.

The patch is safe for backporting, i.e. it makes no assumptions
about memory barriers in spin_unlock_wait() or that the acquire
within spin_lock() is after setting the lock variable.

Background:
Here is the race of the current implementation:

Thread A: (simple op)
- does the first "sma->complex_count == 0" test

Thread B: (complex op)
- does sem_lock(): This includes an array scan. But the scan can't
  find Thread A, because Thread A does not own sem->lock yet.
- the thread does the operation, increases complex_count,
  drops sem_lock, sleeps

Thread A:
- spin_lock(&sem->lock), spin_is_locked(sma->sem_perm.lock)
- sleeps before the complex_count test

Thread C: (complex op)
- does sem_lock (no array scan, complex_count==1)
- wakes up Thread B.
- decrements complex_count

Thread A:
- does the complex_count test

Bug:
Now both thread A and thread C operate on the same array, without
any synchronization.

Full memory barriers are required to synchronize changes of
complex_mode and the lock operations.
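
The single-flag scheme can be modelled in userspace with C11 atomics. This is
a hedged sketch, not the patch itself: struct demo_sem_array and the demo_*
functions are invented names, the global sem_perm.lock is not modelled, and
plain spinning booleans stand in for the kernel spinlocks.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define DEMO_NSEMS 4

struct demo_sem_array {
	atomic_bool complex_mode;         /* true: no parallel simple ops */
	atomic_bool sem_lock[DEMO_NSEMS]; /* per-semaphore locks, true = held */
};

static void demo_lock(atomic_bool *lock)
{
	while (atomic_exchange_explicit(lock, true, memory_order_acquire))
		; /* spin until the previous holder releases */
}

static void demo_unlock(atomic_bool *lock)
{
	atomic_store_explicit(lock, false, memory_order_release);
}

/* Caller is assumed to hold the global lock (not modelled here). */
static void demo_complexmode_enter(struct demo_sem_array *sma)
{
	if (atomic_load_explicit(&sma->complex_mode, memory_order_relaxed))
		return; /* already in complex mode */
	atomic_store_explicit(&sma->complex_mode, true, memory_order_relaxed);
	/* Full barrier: the flag must be visible before lock states are read. */
	atomic_thread_fence(memory_order_seq_cst);
	for (int i = 0; i < DEMO_NSEMS; i++)
		while (atomic_load_explicit(&sma->sem_lock[i],
					    memory_order_acquire))
			; /* wait for in-flight simple ops to finish */
}

/* Returns semnum on the fast path, -1 if the global lock is needed. */
static int demo_sem_lock_simple(struct demo_sem_array *sma, int semnum)
{
	assert(semnum >= 0 && semnum < DEMO_NSEMS);

	if (!atomic_load_explicit(&sma->complex_mode, memory_order_relaxed)) {
		demo_lock(&sma->sem_lock[semnum]);
		/* Full barrier pairing with demo_complexmode_enter(). */
		atomic_thread_fence(memory_order_seq_cst);
		if (!atomic_load_explicit(&sma->complex_mode,
					  memory_order_acquire))
			return semnum; /* fast path successful */
		demo_unlock(&sma->sem_lock[semnum]);
	}
	return -1;
}
```

Because both reasons for blocking the fast path are folded into one flag, a
simple operation only has to re-check that single flag after taking its
per-semaphore lock.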

Fixes: 6d07b68ce16a ("ipc/sem.c: optimize sem_lock()")
Reported-by: fel...@informatik.uni-bremen.de
Signed-off-by: Manfred Spraul 
Cc: 
---
 include/linux/sem.h |   1 +
 ipc/sem.c   | 130 +++-
 2 files changed, 79 insertions(+), 52 deletions(-)

diff --git a/include/linux/sem.h b/include/linux/sem.h
index 976ce3a..d0efd6e 100644
--- a/include/linux/sem.h
+++ b/include/linux/sem.h
@@ -21,6 +21,7 @@ struct sem_array {
struct list_headlist_id;/* undo requests on this array 
*/
int sem_nsems;  /* no. of semaphores in array */
int complex_count;  /* pending complex operations */
+   boolcomplex_mode;   /* no parallel simple ops */
 };
 
 #ifdef CONFIG_SYSVIPC
diff --git a/ipc/sem.c b/ipc/sem.c
index ae72b3c..0da63c8 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -162,14 +162,21 @@ static int sysvipc_sem_proc_show(struct seq_file *s, void 
*it);
 
 /*
  * Locking:
+ * a) global sem_lock() for read/write
  * sem_undo.id_next,
  * sem_array.complex_count,
- * sem_array.pending{_alter,_cont},
- * sem_array.sem_undo: global sem_lock() for read/write
- * sem_undo.proc_next: only "current" is allowed to read/write that field.
+ * sem_array.complex_mode
+ * sem_array.pending{_alter,_const},
+ * sem_array.sem_undo
  *
+ * b) global or semaphore sem_lock() for read/write:
  * sem_array.sem_base[i].pending_{const,alter}:
- * global or semaphore sem_lock() for read/write
+ * sem_array.complex_mode (for read)
+ *
+ * c) special:
+ * sem_undo_list.list_proc:
+ * * undo_list->lock for write
+ * * rcu for read
  */
 
 #define sc_semmsl  sem_ctls[0]
@@ -260,28 +267,59 @@ static void sem_rcu_free(struct rcu_head *head)
 }
 
 /*
- * Wait until all currently ongoing simple ops have completed.
+ * Enter the mode suitable for non-simple operations:
  * Caller must own sem_perm.lock.
- * New simple ops cannot start, because simple ops first check
- * that sem_perm.lock is free.
- * that a) sem_perm.lock is free and b) complex_count is 0.
  */
-static void sem_wait_array(struct sem_array *sma)
+static void complexmode_enter(struct sem_array *sma)
 {
int i;
struct sem *sem;
 
-   if (sma->complex_count)  {
-   /* The thread that increased sma->complex_count waited on
-* all sem->lock locks. Thus we don't need to wait again.
-*/
+   if (sma->complex_mode)  {
+   /* We are already in complex_mode. Nothing to do */
return;
}
+   WRITE_ONCE(sma->complex_mode, true);
+
+   /* We need a full barrier:
+* The write to complex_mode must be visible
+* before we read the first sem->lock spinlock state.
+*/
+   smp_mb();
 
for (i = 0; i < sma->sem_nsems; i++) {
sem = sma->sem_base + i;

[PATCH 2/2] ipc/sem.c: Remove duplicated memory barriers.

2016-07-12 Thread Manfred Spraul
With 2c610022711 (locking/qspinlock: Fix spin_unlock_wait() some more),
memory barriers were added into spin_unlock_wait().
Thus another barrier is not required.

And as explained in 055ce0fd1b8 (locking/qspinlock: Add comments),
spin_lock() provides a barrier so that reads within the critical
section cannot happen before the write for the lock is visible.
i.e. spin_lock provides an acquire barrier after the write of the lock
variable, this barrier pairs with the smp_mb() in complexmode_enter().

Please review!
For x86, the patch is safe. But I don't know enough about all archs
that support SMP.

Signed-off-by: Manfred Spraul 
---
 ipc/sem.c | 14 --
 1 file changed, 14 deletions(-)

diff --git a/ipc/sem.c b/ipc/sem.c
index 0da63c8..d7b4212 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -291,14 +291,6 @@ static void complexmode_enter(struct sem_array *sma)
sem = sma->sem_base + i;
spin_unlock_wait(&sem->lock);
}
-   /*
-* spin_unlock_wait() is not a memory barriers, it is only a
-* control barrier. The code must pair with spin_unlock(&sem->lock),
-* thus just the control barrier is insufficient.
-*
-* smp_rmb() is sufficient, as writes cannot pass the control barrier.
-*/
-   smp_rmb();
 }
 
 /*
@@ -363,12 +355,6 @@ static inline int sem_lock(struct sem_array *sma, struct 
sembuf *sops,
 */
spin_lock(&sem->lock);
 
-   /*
-* A full barrier is required: the write of sem->lock
-* must be visible before the read is executed
-*/
-   smp_mb();
-
if (!smp_load_acquire(&sma->complex_mode)) {
/* fast path successful! */
return sops->sem_num;
-- 
2.5.5



[PATCH 0/2] ipc/sem.c: sem_lock fixes

2016-07-12 Thread Manfred Spraul
Hi Andrew, Hi Peter,

next version of the sem_lock() fixes:
The patches are again vs. tip.

Patch 1 is ready for merging, Patch 2 is for review.

- Patch 1 is the patch as in -next since January
  It fixes the race that was found by Felix.
- Patch 2 removes the memory barriers that are part of the qspinlock
  code.
- (The hysteresis patch would be patch 3. The risk of regressions
  can't be ruled out, thus it must wait for benchmarks from real
  workload tests)

Patch 1+2 were one patch, I've split patches so that review and
backporting are simpler.

--
Manfred


Re: [RFC 0/3] extend kexec_file_load system call

2016-07-12 Thread Stewart Smith
Russell King - ARM Linux  writes:
> On Tue, Jul 12, 2016 at 10:58:05PM +0200, Petr Tesarik wrote:
>> I'm not an expert on DTB, so I can't provide an example of code
>> execution, but you have already mentioned the /chosen/linux,stdout-path
>> property. If an attacker redirects the bootloader to an insecure
>> console, they may get access to the system that would otherwise be
>> impossible.
>
> I fail to see how kexec connects with the boot loader - the DTB image
> that's being talked about is one which is passed from the currently
> running kernel to the to-be-kexec'd kernel.  For ARM (and I suspect
> also ARM64) that's a direct call chain which doesn't involve any
> boot loader or firmware, and certainly none that would involve the
> passed DTB image.

For OpenPOWER machines, kexec is the bootloader. Our bootloader is a
linux kernel and initramfs with a UI (petitboot) - this means we never
have to write a device driver twice: write a kernel one and you're done
(for booting from the device and using it in your OS).

-- 
Stewart Smith
OPAL Architect, IBM.




Re: [PATCH v3] Input: synaptics-rmi4: Support regulator supplies

2016-07-12 Thread Bjorn Andersson
On Fri, Jun 24, 2016 at 5:40 PM, Andrew Duggan  wrote:
> On 06/10/2016 10:25 PM, Bjorn Andersson wrote:
>>
>> From: Bjorn Andersson 
>>
>> Support the two supplies - vdd and vio - to make it possible to control
>> power to the Synaptics chip.
>>
>> Signed-off-by: Bjorn Andersson 
>> Signed-off-by: Bjorn Andersson 
>
>
> Reviewed-by: Andrew Duggan 

Dmitry, do you have any comments, or would you mind picking this up?

Regards,
Bjorn

>
>
>> ---
>>   .../devicetree/bindings/input/rmi4/rmi_i2c.txt |  9 +
>>   drivers/input/rmi4/rmi_i2c.c   | 41
>> ++
>>   2 files changed, 50 insertions(+)
>>
>> diff --git a/Documentation/devicetree/bindings/input/rmi4/rmi_i2c.txt
>> b/Documentation/devicetree/bindings/input/rmi4/rmi_i2c.txt
>> index 95fa715c6046..ec908b91fd90 100644
>> --- a/Documentation/devicetree/bindings/input/rmi4/rmi_i2c.txt
>> +++ b/Documentation/devicetree/bindings/input/rmi4/rmi_i2c.txt
>> @@ -22,6 +22,15 @@ See
>> Documentation/devicetree/bindings/interrupt-controller/interrupts.txt
>>   - syna,reset-delay-ms: The number of milliseconds to wait after
>> resetting the
>> device.
>>   +- syna,startup-delay-ms: The number of milliseconds to wait after
>> powering on
>> +the device.
>> +
>> +- vdd-supply: VDD power supply.
>> +See ../regulator/regulator.txt
>> +
>> +- vio-supply: VIO power supply
>> +See ../regulator/regulator.txt
>> +
>>   Function Parameters:
>>   Parameters specific to RMI functions are contained in child nodes of the
>> rmi device
>>node. Documentation for the parameters of each function can be found
>> in:
>> diff --git a/drivers/input/rmi4/rmi_i2c.c b/drivers/input/rmi4/rmi_i2c.c
>> index a96a326b53bd..71dc6cdde8ac 100644
>> --- a/drivers/input/rmi4/rmi_i2c.c
>> +++ b/drivers/input/rmi4/rmi_i2c.c
>> @@ -11,6 +11,8 @@
>>   #include 
>>   #include 
>>   #include 
>> +#include 
>> +#include 
>>   #include "rmi_driver.h"
>> #define BUFFER_SIZE_INCREMENT 32
>> @@ -37,6 +39,9 @@ struct rmi_i2c_xport {
>> u8 *tx_buf;
>> size_t tx_buf_size;
>> +
>> +   struct regulator_bulk_data supplies[2];
>> +   u32 startup_delay;
>>   };
>> #define RMI_PAGE_SELECT_REGISTER 0xff
>> @@ -246,6 +251,24 @@ static int rmi_i2c_probe(struct i2c_client *client,
>> return -ENODEV;
>> }
>>   + rmi_i2c->supplies[0].supply = "vdd";
>> +   rmi_i2c->supplies[1].supply = "vio";
>> +   retval = devm_regulator_bulk_get(&client->dev,
>> +ARRAY_SIZE(rmi_i2c->supplies),
>> +rmi_i2c->supplies);
>> +   if (retval < 0)
>> +   return retval;
>> +
>> +   retval = regulator_bulk_enable(ARRAY_SIZE(rmi_i2c->supplies),
>> +  rmi_i2c->supplies);
>> +   if (retval < 0)
>> +   return retval;
>> +
>> +   of_property_read_u32(client->dev.of_node, "syna,startup-delay-ms",
>> +&rmi_i2c->startup_delay);
>> +
>> +   msleep(rmi_i2c->startup_delay);
>> +
>> rmi_i2c->client = client;
>> mutex_init(&rmi_i2c->page_mutex);
>>   @@ -286,6 +309,7 @@ static int rmi_i2c_remove(struct i2c_client *client)
>> struct rmi_i2c_xport *rmi_i2c = i2c_get_clientdata(client);
>> rmi_unregister_transport_device(&rmi_i2c->xport);
>> +   regulator_bulk_disable(ARRAY_SIZE(rmi_i2c->supplies),
>> rmi_i2c->supplies);
>> return 0;
>>   }
>> @@ -308,6 +332,9 @@ static int rmi_i2c_suspend(struct device *dev)
>> dev_warn(dev, "Failed to enable irq for wake:
>> %d\n",
>> ret);
>> }
>> +
>> +   regulator_bulk_disable(ARRAY_SIZE(rmi_i2c->supplies),
>> rmi_i2c->supplies);
>> +
>> return ret;
>>   }
>>   @@ -317,6 +344,12 @@ static int rmi_i2c_resume(struct device *dev)
>> struct rmi_i2c_xport *rmi_i2c = i2c_get_clientdata(client);
>> int ret;
>>   + ret = regulator_bulk_enable(ARRAY_SIZE(rmi_i2c->supplies),
>> rmi_i2c->supplies);
>> +   if (ret)
>> +   return ret;
>> +
>> +   msleep(rmi_i2c->startup_delay);
>> +
>> enable_irq(rmi_i2c->irq);
>> if (device_may_wakeup(&client->dev)) {
>> ret = disable_irq_wake(rmi_i2c->irq);
>> @@ -346,6 +379,8 @@ static int rmi_i2c_runtime_suspend(struct device *dev)
>> disable_irq(rmi_i2c->irq);
>>   + regulator_bulk_disable(ARRAY_SIZE(rmi_i2c->supplies),
>> rmi_i2c->supplies);
>> +
>> return 0;
>>   }
>>   @@ -355,6 +390,12 @@ static int rmi_i2c_runtime_resume(struct device
>> *dev)
>> struct rmi_i2c_xport *rmi_i2c = i2c_get_clientdata(client);
>> int ret;
>>   + ret = regulator_bulk_enable(ARRAY_SIZE(rmi_i2c->supplies),
>> rmi_i2c->supplies);
>> +   if 

xen: migration: guest kernel gets stuck because of too-early-swappness

2016-07-12 Thread Zhangbo (Oscar)
Hi all:
  We found that guests such as RHEL6 occasionally get stuck after 
migration.
  The stack of the stuck guest kernel is as follows:
PID: 18 TASK: 88007de61500 CPU: 1 COMMAND: "xenwatch"
#0 [88007de62e40] schedule at 8150d692
#1 [88007de62f08] io_schedule at 8150de73
#2 [88007de62f28] get_request_wait at 8125e4c8
#3 [88007de62fb8] blk_queue_bio at 8125e60d
#4 [88007de63038] generic_make_request at 8125ccce
#5 [88007de63108] submit_bio at 8125d02d
#6 [88007de63158] swap_writepage at 81154374
#7 [88007de63188] pageout.clone.2 at 8113205b
#8 [88007de63238] shrink_page_list.clone.3 at 811326e5
#9 [88007de63388] shrink_inactive_list at 81133263
#10 [88007de63538] shrink_mem_cgroup_zone at 81133afe
#11 [88007de63608] shrink_zone at 81133dc3
#12 [88007de63678] do_try_to_free_pages at 81133f25
#13 [88007de63718] try_to_free_pages at 811345f2
#14 [88007de637b8] __alloc_pages_nodemask at 8112be48
#15 [88007de638f8] kmem_getpages at 811669d2
#16 [88007de63928] fallback_alloc at 811675ea
#17 [88007de639a8] cache_alloc_node at 81167369
#18 [88007de63a08] kmem_cache_alloc at 811682eb
#19 [88007de63a48] idr_pre_get at 812786c0
#20 [88007de63a78] ida_pre_get at 8127870c
#21 [88007de63a98] proc_register at 811efc71
#22 [88007de63ae8] proc_mkdir_mode at 811f0082
#23 [88007de63b18] proc_mkdir at 811f00b6
#24 [88007de63b28] register_handler_proc at 810e54fb
#25 [88007de63bf8] __setup_irq at 810e2594
#26 [88007de63c48] request_threaded_irq at 810e2e43
#27 [88007de63ca8] serial8250_startup at 81356fac
#28 [88007de63cf8] uart_resume_port at 813547be
#29 [88007de63d78] serial8250_resume_port at 813567b6
#30 [88007de63d98] serial_pnp_resume at 81358a58
#31 [88007de63da8] pnp_bus_resume at 81311853
#32 [88007de63dc8] dpm_resume_end at 813648a8
#33 [88007de63e28] shutdown_handler at 81319351
#34 [88007de63e68] xenwatch_thread at 8131ab1a
#35 [88007de63ee8] kthread at 81096916
#36 [88007de63f48] kernel_thread at 8100c0ca

  Our guess at the cause is:
  1 Guests with 3.* kernels, such as RHEL6, that are not built with 
CONFIG_PREEMPT do NOT call FREEZE/THAW before resuming disks, so kernel 
threads may be active before the disks are available. Kernel threads may 
allocate memory, which can occasionally trigger swapping. Swapping before 
the disks are ready may cause the kernel to get stuck.
This problem is fixed by: 
https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?h=linux-3.16.y&id=2edbf3c6af0f5f1f9d2ef00a15339c10beaff405
  2 However, even the xenwatch kernel thread itself needs to allocate memory, 
and any attempt to allocate memory before the disk is resumed may cause the 
deadlock shown above.

  So, how can we fix the kernel hang caused by swapping too early? Thanks 
in advance.


ZhangBo(Oscar)


Re: [PATCH v6 5/5] usb: dwc3: rockchip: add devicetree bindings documentation

2016-07-12 Thread William.wu

Dear Rob,


On 2016/7/11 23:13, Rob Herring wrote:

On Thu, Jul 07, 2016 at 10:58:44AM +0800, William Wu wrote:

This patch adds the devicetree documentation required for Rockchip
USB3.0 core wrapper consisting of USB3.0 IP from Synopsys.

It supports DRD mode, and could operate in device mode (SS, HS, FS)
and host mode (SS, HS, FS, LS).

Signed-off-by: William Wu 
---
Changes in v6:
- rename bus_clk, and add usbdrd3_1 node as an example (Heiko)

Changes in v5:
- rename clock-names, and remove unnecessary clocks (Heiko)

Changes in v4:
- modify commit log, and add phy documentation location (Sergei)

Changes in v3:
- add dwc3 address (balbi)

Changes in v2:
- add rockchip,dwc3.txt to Documentation/devicetree/bindings/ (balbi, Brian)

  .../devicetree/bindings/usb/rockchip,dwc3.txt  | 59 ++
  1 file changed, 59 insertions(+)
  create mode 100644 Documentation/devicetree/bindings/usb/rockchip,dwc3.txt

Acked-by: Rob Herring 


Thanks a lot. I'll add the Acked-by in the next patch.










Re: [PATCH v6 3/5] usb: dwc3: add phyif_utmi_quirk

2016-07-12 Thread William.wu

Dear Rob,


On 2016/7/11 22:58, Rob Herring wrote:

On Thu, Jul 07, 2016 at 10:54:24AM +0800, William Wu wrote:

Add a quirk to configure the core to support the
UTMI+ PHY with an 8- or 16-bit interface. UTMI+ PHY
interface is hardware property, and it's platform
dependent. Normall, the PHYIf can be configured

s/Normall/Normally/

Yeah, I'll fix it. :-)

s/PHYIf/PHYIF/

The DWC3 controller databook calls the "PHY Interface" field "PHYIf",
so I used "PHYIf" here. However, "PHYIF" seems to be the more common
spelling, so I'll fix it. :-)



during coreconsultant. But for some specific usb

s/usb/USB/


Thanks, I'll fix it. :-)




cores(e.g. rk3399 soc dwc3), the default PHYIf
configuration value is fault, so we need to
reconfigure it by software.

And refer to the dwc3 databook, the GUSB2PHYCFG.USBTRDTIM

s/dwc3/DWC3/

Thanks, I'll fix it too.



must be set to the corresponding value according to
the UTMI+ PHY interface.

And wrap your lines at 70-74 characters.


Thanks for your suggestion, I'll pay attention to this in the next 
patch. :-)


Best Regards,
William Wu


Rob








[PATCH] [linux-next] input: Fix a double word "is is" in include/linux/input.h

2016-07-12 Thread Masanari Iida
This patch fixes a double word "is is" found in
Documentation/DocBook/device-drivers.xml.
Because that file is generated from comments in the sources,
the fix has to be made in include/linux/input.h.

Signed-off-by: Masanari Iida 
---
 include/linux/input.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/input.h b/include/linux/input.h
index 1e967694e9a5..a65e3b24fb18 100644
--- a/include/linux/input.h
+++ b/include/linux/input.h
@@ -95,7 +95,7 @@ struct input_value {
  * @grab: input handle that currently has the device grabbed (via
  * EVIOCGRAB ioctl). When a handle grabs a device it becomes sole
  * recipient for all input events coming from the device
- * @event_lock: this spinlock is is taken when input core receives
+ * @event_lock: this spinlock is taken when input core receives
  * and processes a new event for the device (in input_event()).
  * Code that accesses and/or modifies parameters of a device
  * (such as keymap or absmin, absmax, absfuzz, etc.) after device
-- 
2.9.1.200.gb1ec08f



Re: [CRIU] Introspecting userns relationships to other namespaces?

2016-07-12 Thread W. Trevor King
On Tue, Jul 12, 2016 at 05:08:43PM -0700, Andrew Vagin wrote:
> Here is a patch to get an owning user namespace:
> https://github.com/avagin/linux-task-diag/commit/7fad8ff3fc4110bebf0920cec2388390b3bd2238
> https://github.com/avagin/linux-task-diag/commit/2663bc803d324785e328261f3c07a0fef37d2088
>
> Here is an example how it looks from user-space:
> https://github.com/avagin/linux-task-diag/blob/namespaces/tools/testing/selftests/nsfs/owner.c#L49

Overall this looks good to me (I left a handful of uninformed comments
inline ;).

It doesn't make it easy to walk leafward, but it doesn't look like the
kernel has a convenient way to list child namespaces either.
Something like /proc/<pid>/task/<tid>/children (with
CONFIG_PROC_CHILDREN) for namespaces would make it easier to get a
complete system overview (as far as your credentials and position in
the namespace hierarchies allow).  But looking at the
CONFIG_PROC_CHILDREN implementation doesn't make me all that excited
about mimicking it for namespaces ;).

You can still brute-force it in userspace by walking the root-most
procfs's you can find and peeking at all the /proc/<pid>/ns/… entries
(but yuck ;).  With mount and other namespaces not being hierarchical,
the “leafward” idea may not be all that useful anyway, but having a
more compact collection of mount namespaces (say) that you know about
would be nice.  Where “know about” should probably mean “know it
exists” but not necessarily “have permission to enter”.  Still,
getting that figured out can happen independently of this parent/owner
work.

Cheers,
Trevor

-- 
This email may be signed or encrypted with GnuPG (http://www.gnupg.org).
For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy




Re: [CRIU] Introspecting userns relationships to other namespaces?

2016-07-12 Thread Andrew Vagin
On Sat, Jul 09, 2016 at 01:29:20PM -0500, Eric W. Biederman wrote:
> ebied...@xmission.com (Eric W. Biederman) writes:
> 
> > Andrew Vagin  writes:
> >
> >> All these thoughts about security make me thinking that kcmp is what we
> >> should use here. It's maybe something like this:
> >>
> >> kcmp(pid1, pid2, KCMP_NS_USERNS, fd1, fd2)
> >>
> >> - to check if userns of the fd1 namepsace is equal to the fd2 userns
> >>
> >> kcmp(pid1, pid2, KCMP_NS_PARENT, fd1, fd2)
> >>
> >> - to check if a parent namespace of the fd1 pidns is equal to fd pidns.
> >>
> >> fd1 and fd2 is file descriptors to namespace files.
> >>
> >> So if we want to build a hierarchy, we need to collect all namespaces
> >> and then enumerate them to check dependencies with help of kcmp.
> >
> > That is certainly one way to go.
> >
> > There is a funny case where we would want to compare a user namespace
> > file descriptor to a parent user namespace file descriptor.
> >
> >
> > Grumble, Grumble.  I think this may actually a case for creating ioctls
> > for these two cases.  Now that random nsfs file descriptors are bind
> > mountable the original reason for using proc files is not as pressing.
> >
> > One ioctl for the user namespace that owns a file descriptor.
> > One ioctl for the parent namespace of a namespace file descriptor.
> >
> > We also need some way to get a command file descriptor for a file system
> > super block.  Al Viro has a pet project for cleaning up the mount API
> > and this might be the idea excuse to start looking at that.
> >
> > (In principle we might be able to run commands through the namespace
> >  file descriptor and using an ioctl feels dirty.  But an ioctl that
> >  only uses the fd and request argument does not suffer from the same
> >  problems that ioctls that have to pass additional arguments suffer
> >  from.)
> 
> Of course it should be an error perhaps -EINVAL to get a user
> namespace owner or parent namespace that is outside of a processes
> current user namespace or pid namespace.  That way thing stay bounded
> within the current namespaces the process is in.  Which prevents any
> leak possibilities, and keeps CRIU working.

I prepared patches with ioctl-s to see what this would look like.

Here is a whole series:
https://github.com/avagin/linux-task-diag/commits/namespaces

Here is a patch to get an owning user namespace:
https://github.com/avagin/linux-task-diag/commit/7fad8ff3fc4110bebf0920cec2388390b3bd2238
https://github.com/avagin/linux-task-diag/commit/2663bc803d324785e328261f3c07a0fef37d2088

Here is an example how it looks from user-space:
https://github.com/avagin/linux-task-diag/blob/namespaces/tools/testing/selftests/nsfs/owner.c#L49

I like the idea with ioctl-s. James, Michael, Trevor, what is your
opinion about this?
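
For readers who don't want to open the selftest, here is a rough userspace sketch of the "owning user namespace" ioctl. It is not taken from the patches linked above. It assumes the request number that mainline's <linux/nsfs.h> carries as NS_GET_USERNS (Linux 4.9 and later), and open_owning_userns() is a hypothetical helper name.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Assumption: request number as found in mainline's <linux/nsfs.h>;
 * the series under discussion may define it differently.
 */
#ifndef NS_GET_USERNS
#define NS_GET_USERNS _IO(0xb7, 0x1)
#endif

/*
 * Hypothetical helper: open a namespace file (for example
 * "/proc/self/ns/pid") and ask the kernel for a new fd that refers to
 * the user namespace owning it.
 * Returns the new fd, or -1 with errno set.
 */
static int open_owning_userns(const char *ns_path)
{
	int ns_fd, owner_fd, saved_errno;

	assert(ns_path);

	ns_fd = open(ns_path, O_RDONLY);
	if (ns_fd < 0)
		return -1;

	/* Only the fd and the request are passed, no extra argument. */
	owner_fd = ioctl(ns_fd, NS_GET_USERNS);

	/* close() may overwrite errno, so keep the ioctl's value. */
	saved_errno = errno;
	close(ns_fd);
	errno = saved_errno;

	return owner_fd;
}
```

The returned fd is itself a namespace file, so a caller could fstat() it and compare st_ino values to group namespaces by owner.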

> 
> Eric


Re: [PATCH v6 3/5] usb: dwc3: add phyif_utmi_quirk

2016-07-12 Thread William.wu

Dear Rob,


On 2016/7/11 22:54, Rob Herring wrote:

On Fri, Jul 08, 2016 at 02:33:09PM +0200, Heiko Stuebner wrote:

Hi William,

Am Donnerstag, 7. Juli 2016, 10:54:24 schrieb William Wu:

Add a quirk to configure the core to support the
UTMI+ PHY with an 8- or 16-bit interface. UTMI+ PHY
interface is hardware property, and it's platform
dependent. Normall, the PHYIf can be configured
during coreconsultant. But for some specific usb
cores(e.g. rk3399 soc dwc3), the default PHYIf
configuration value is fault, so we need to
reconfigure it by software.

And refer to the dwc3 databook, the GUSB2PHYCFG.USBTRDTIM
must be set to the corresponding value according to
the UTMI+ PHY interface.

Signed-off-by: William Wu 
---

[...]

diff --git a/Documentation/devicetree/bindings/usb/dwc3.txt
b/Documentation/devicetree/bindings/usb/dwc3.txt index 020b0e9..8d7317d
100644
--- a/Documentation/devicetree/bindings/usb/dwc3.txt
+++ b/Documentation/devicetree/bindings/usb/dwc3.txt
@@ -42,6 +42,10 @@ Optional properties:
   - snps,dis-u2-freeclk-exists-quirk: when set, clear the
u2_freeclk_exists in GUSB2PHYCFG, specify that USB2 PHY doesn't provide
a free-running PHY clock.
+ - snps,phyif-utmi-quirk: when set core will set phyif UTMI+ interface.
+ - snps,phyif-utmi: the value to configure the core to support a UTMI+
PHY +   with an 8- or 16-bit interface. Value 0 select 8-bit
+   interface, value 1 select 16-bit interface.

maybe
snps,phyif-utmi-width = <8> or <16>;

Seems like this could be common. Any other bindings have something
similar already? If not "utmi-width" is fine.


It seems that no existing dwc3 binding has anything similar,
so I prefer to use “utmi-width”. :-)




devicetree is about describing the hardware, not the things that get written
to registers :-) . The conversion from the described width to the register
value can easily be done in the driver.


Also I don't think you need two properties for this. If the snps,phyif-utmi
property is specified it indicates that you want to manually set the width
and if it is absent you want to use the IC default. All functions reading
property-values indicate if the property is missing.

Agreed.

Rob








[RESEND PATCH] soc: mediatek: PMIC wrap: Extend the waiting time to 10ms.

2016-07-12 Thread Henry Chen
Reads sometimes fail with a timeout because the PMIC cannot transfer data
to the PMIC wrapper in time. Extend the waiting time to 10ms to reduce the
failure rate.

Signed-off-by: Henry Chen 
---
Resend to fix the typo in the commit message
---
 drivers/soc/mediatek/mtk-pmic-wrap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/soc/mediatek/mtk-pmic-wrap.c 
b/drivers/soc/mediatek/mtk-pmic-wrap.c
index a003ba2..a5f1093 100644
--- a/drivers/soc/mediatek/mtk-pmic-wrap.c
+++ b/drivers/soc/mediatek/mtk-pmic-wrap.c
@@ -583,7 +583,7 @@ static int pwrap_wait_for_state(struct pmic_wrapper *wrp,
 {
unsigned long timeout;
 
-   timeout = jiffies + usecs_to_jiffies(255);
+   timeout = jiffies + usecs_to_jiffies(10000);
 
do {
if (time_after(jiffies, timeout))
-- 
1.8.1.1.dirty
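
As a side note on how a microsecond timeout turns into ticks, here is a simplified userspace model, not the kernel's implementation. The name usecs_to_ticks() and the tick_hz parameter (standing in for HZ) are made up for illustration, and tick_hz is assumed to divide one second evenly.

```c
#include <assert.h>

#define USEC_PER_SEC 1000000UL

/*
 * Simplified model of rounding a microsecond timeout up to whole
 * ticks. tick_hz plays the role of the kernel's HZ and is assumed to
 * divide USEC_PER_SEC evenly (true for 100, 250 and 1000).
 */
static unsigned long usecs_to_ticks(unsigned long usecs, unsigned long tick_hz)
{
	unsigned long usecs_per_tick;

	assert(tick_hz > 0 && USEC_PER_SEC % tick_hz == 0);
	usecs_per_tick = USEC_PER_SEC / tick_hz;

	/* Round up so a non-zero timeout never becomes zero ticks. */
	return (usecs + usecs_per_tick - 1) / usecs_per_tick;
}
```

In this model 255us is a single tick at a 1000 Hz tick rate, while the 10ms from the subject line is 10 ticks at 1000 Hz and 3 ticks at 250 Hz.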



Re: [PATCH v3 2/4] drm/rockchip: add an common abstracted PSR driver

2016-07-12 Thread Yakir Yang

Daniel,

On 07/12/2016 08:38 PM, Daniel Vetter wrote:

On Fri, Jul 01, 2016 at 02:00:00PM -0400, Sean Paul wrote:

On Fri, Jul 1, 2016 at 5:19 AM, Yakir Yang  wrote:

The PSR driver exports five symbols for device-specific drivers:
- rockchip_drm_psr_register()
- rockchip_drm_psr_unregister()
- rockchip_drm_psr_enable()
- rockchip_drm_psr_disable()
- rockchip_drm_psr_flush()

Encoder drivers should call the register/unregister interfaces to hook
themselves into the common PSR driver. Each encoder has to implement the
'psr_set' callback, which is used to set the PSR state on the hardware side.

The CRTC driver calls the enable/disable interfaces when vblank is
enabled/disabled; the common PSR driver then calls the callback registered
by the encoder to set the PSR state.


This feels overly complicated. It seems like you could cut out a bunch
of code by just coding the psr functions into vop and
analogix_dp-rockchip. I suppose the only reason to keep it abstracted
would be if you plan on supporting psr in a different encoder or crtc
in rockchip, or if you're planning on moving this into drm core.

Agreed on the layers of indirection. Also, you end up with 3 delayed
timers in total:
- defio timer from fbdev emulation
- timer in this abstraction
- delayed work in the psr backend driver

I'd cut out at least the middle one.

But since this seems to correctly use the ->dirty callback it gets my Ack
either way ;-)


Aha, thanks :-D

- Yakir


Cheers, Daniel


Perhaps others will disagree with this sentiment and this is the right
thing to do.


The fb driver calls the flush interface in the 'fb->dirty' callback; this
helper function forces all PSR-enabled encoders to exit from PSR
for 3 seconds.

Signed-off-by: Yakir Yang 
---
Changes in v3:
- split the psr flow into a common abstracted PSR driver
- implement the 'fb->dirty' callback function (Daniel)
- avoid using notify to acquire the vact event (Daniel)
- remove the psr_active() callback which was introduced in v2

Changes in v2: None

  drivers/gpu/drm/rockchip/Makefile   |   2 +-
  drivers/gpu/drm/rockchip/rockchip_drm_fb.c  |  12 ++
  drivers/gpu/drm/rockchip/rockchip_drm_psr.c | 200 
  drivers/gpu/drm/rockchip/rockchip_drm_psr.h |  12 ++
  drivers/gpu/drm/rockchip/rockchip_drm_vop.c |  24 
  5 files changed, 249 insertions(+), 1 deletion(-)
  create mode 100644 drivers/gpu/drm/rockchip/rockchip_drm_psr.c
  create mode 100644 drivers/gpu/drm/rockchip/rockchip_drm_psr.h

diff --git a/drivers/gpu/drm/rockchip/Makefile 
b/drivers/gpu/drm/rockchip/Makefile
index 05d0713..9746365 100644
--- a/drivers/gpu/drm/rockchip/Makefile
+++ b/drivers/gpu/drm/rockchip/Makefile
@@ -3,7 +3,7 @@
  # Direct Rendering Infrastructure (DRI) in XFree86 4.1.0 and higher.

  rockchipdrm-y := rockchip_drm_drv.o rockchip_drm_fb.o \
-   rockchip_drm_gem.o rockchip_drm_vop.o
+   rockchip_drm_gem.o rockchip_drm_psr.o rockchip_drm_vop.o
  rockchipdrm-$(CONFIG_DRM_FBDEV_EMULATION) += rockchip_drm_fbdev.o

  obj-$(CONFIG_ROCKCHIP_ANALOGIX_DP) += analogix_dp-rockchip.o
diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_fb.c 
b/drivers/gpu/drm/rockchip/rockchip_drm_fb.c
index 20f12bc..0fec18f 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_fb.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_fb.c
@@ -21,6 +21,7 @@

  #include "rockchip_drm_drv.h"
  #include "rockchip_drm_gem.h"
+#include "rockchip_drm_psr.h"

  #define to_rockchip_fb(x) container_of(x, struct rockchip_drm_fb, fb)

@@ -66,9 +67,20 @@ static int rockchip_drm_fb_create_handle(struct 
drm_framebuffer *fb,
  rockchip_fb->obj[0], handle);
  }

+static int rockchip_drm_fb_dirty(struct drm_framebuffer *fb,
+struct drm_file *file,
+unsigned int flags, unsigned int color,
+struct drm_clip_rect *clips,
+unsigned int num_clips)
+{
+   rockchip_drm_psr_flush();
+   return 0;
+}
+
  static const struct drm_framebuffer_funcs rockchip_drm_fb_funcs = {
 .destroy= rockchip_drm_fb_destroy,
 .create_handle  = rockchip_drm_fb_create_handle,
+   .dirty  = rockchip_drm_fb_dirty,
  };

  static struct rockchip_drm_fb *
diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_psr.c 
b/drivers/gpu/drm/rockchip/rockchip_drm_psr.c
new file mode 100644
index 000..c03
--- /dev/null
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_psr.c
@@ -0,0 +1,200 @@
+#include 
+
+#include "rockchip_drm_psr.h"
+
+#define PSR_FLUSH_TIMEOUT  msecs_to_jiffies(3000) /* 3 seconds */
+
+static LIST_HEAD(psr_list);
+static DEFINE_MUTEX(psr_list_mutex);

I'm not crazy about these globals. Perhaps you can initialize them
with the rockchip driver and tuck them in a driver-level struct
(rockchip_drm_private or something).



+
+enum psr_state {
+   PSR_FLUSH,
+   PSR_ENABLE,
+   

Re: [PATCH] [RFC V1]s390/perf: fix 'start' address of module's map

2016-07-12 Thread Songshan Gong
I am sending this email to test the connection to the linux-kernel mailing 
list. Please ignore it.


On 7/11/2016 4:11 PM, Songshan Gong wrote:



On 7/8/2016 11:18 PM, Jiri Olsa wrote:

On Thu, Jul 07, 2016 at 09:49:36AM +0800, Song Shan Gong wrote:

At present, when creating a module's map, perf gets the 'start' address by
parsing '/proc/modules', but that is the module base address, not the start
address of the '.text' section. On most archs this is OK. But s390 places the
'GOT' and 'PLT' relocations before the '.text' section, so there is an offset
between the module base address and the '.text' section, which causes wrong
symbol resolution for modules.

Fix this bug by getting the 'start' address of the module's map from
'/sys/module/[module name]/sections/.text' instead of '/proc/modules'.
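
To make the idea concrete, here is a minimal sketch of parsing such a sections file. It is not the code from the patch. parse_section_addr() is a hypothetical helper, and it assumes the file holds a single hexadecimal address followed by a newline.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse the contents of a
 * /sys/module/<name>/sections/.text file, assumed to hold one
 * hexadecimal address such as "0x000003ff80001000\n".
 * Returns 0 and stores the address on success, -1 on malformed input.
 */
static int parse_section_addr(const char *text, unsigned long long *addr)
{
	char *end;
	unsigned long long val;

	assert(text && addr);

	errno = 0;
	val = strtoull(text, &end, 16);	/* base 16 accepts a "0x" prefix */
	if (errno != 0 || end == text)
		return -1;
	if (*end != '\0' && *end != '\n')
		return -1;

	*addr = val;
	return 0;
}
```

The difference between this value and the base address listed in /proc/modules would be the offset that the commit message describes for s390.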


cool, does this fix the 'perf test 1' for s390? that'd be great


I've checked, and 'perf test 1' still fails. But that test checks whether
'vmlinux symtab matches kallsyms'. vmlinux is built when compiling the
kernel, so it contains no symbol info for modules. For kallsyms, the loader
also separates kernel symbols from module symbols. This patch is intended to
fix wrong symbol resolution for modules, so I think there is no relationship
between this patch and the testcase 'perf test 1'.

I also debugged why 'perf test 1' fails; there are at least two reasons:
1. perf gets the kernel start address by finding the first non-zero value of
the symbols {'_text', '_stext'} in /proc/kallsyms. On s390 that is always
'_stext', but '_stext' is not actually the first symbol on s390; there are
other symbols before it, for example 'iplstart'.
2. In addition, when loading by parsing /proc/kallsyms, if the kernel map
start is the non-zero value taken from '_stext' because '_text' is zero,
'_text' is not included in the kernel map. When loading from vmlinux,
'_text' is always added to the kernel map whether it is zero or not. So when
'_text' is zero, the two loading paths are inconsistent about whether
'_text' is in the kernel map.

I'll try to fix this bug later.

Thanks for your comments.


I'll send a few comments shortly

thanks,
jirka





--
SongShan Gong



Re: [PATCH v6 3/5] usb: dwc3: add phyif_utmi_quirk

2016-07-12 Thread William.wu

Dear Heiko,


On 2016/7/10 7:47, Heiko Stuebner wrote:

Am Samstag, 9. Juli 2016, 11:38:00 schrieb William.wu:

Dear Heiko & Balbi,

On 2016/7/8 21:29, Felipe Balbi wrote:

Hi,

Heiko Stuebner  writes:

Am Donnerstag, 7. Juli 2016, 10:54:24 schrieb William Wu:

Add a quirk to configure the core to support the
UTMI+ PHY with an 8- or 16-bit interface. UTMI+ PHY
interface is hardware property, and it's platform
dependent. Normall, the PHYIf can be configured
during coreconsultant. But for some specific usb
cores(e.g. rk3399 soc dwc3), the default PHYIf
configuration value is fault, so we need to
reconfigure it by software.

And refer to the dwc3 databook, the GUSB2PHYCFG.USBTRDTIM
must be set to the corresponding value according to
the UTMI+ PHY interface.

Signed-off-by: William Wu 
---

[...]


diff --git a/Documentation/devicetree/bindings/usb/dwc3.txt
b/Documentation/devicetree/bindings/usb/dwc3.txt index
020b0e9..8d7317d
100644
--- a/Documentation/devicetree/bindings/usb/dwc3.txt
+++ b/Documentation/devicetree/bindings/usb/dwc3.txt

@@ -42,6 +42,10 @@ Optional properties:
- snps,dis-u2-freeclk-exists-quirk: when set, clear the

u2_freeclk_exists in GUSB2PHYCFG, specify that USB2 PHY doesn't
provide

a free-running PHY clock.

+ - snps,phyif-utmi-quirk: when set core will set phyif UTMI+
interface.
+ - snps,phyif-utmi: the value to configure the core to support a
UTMI+
PHY +   with an 8- or 16-bit interface. Value 0

select 8-bit

+   interface, value 1 select 16-bit interface.

maybe

snps,phyif-utmi-width = <8> or <16>;

devicetree is about describing the hardware, not the things that get
written to registers :-) . The conversion from the described width to
the register value can easily be done in the driver.

Thanks for your suggestion:-)
Yes, “snps,phyif-utmi-width = <8> or <16>” is much clearer and easier to
understand.
I had considered the same dts property for phyif-utmi, but for the time
being I have no good idea for the conversion from the described width to
the register value.

For the phyif utmi width configuration we need to set two fields in the
GUSB2PHYCFG register, according to the DWC3 USB3.0 controller databook
version 3.00a, 6.3.46 GUSB2PHYCFG:
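------------------------------------------------------------------
 Bits   | Name      | Description
------------------------------------------------------------------
 13:10  | USBTRDTIM | Sets the turnaround time in PHY clocks.
        |           | 4'h5: When the MAC interface is 16-bit UTMI+
        |           | 4'h9: When the MAC interface is 8-bit UTMI+/ULPI.
------------------------------------------------------------------
 3      | PHYIF     | If UTMI+ is selected, the application uses this
        |           | bit to configure core to support a UTMI+ PHY
        |           | with an 8- or 16-bit interface.
        |           | 1'b0: 8 bits
        |           | 1'b1: 16 bits
------------------------------------------------------------------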


And I think maybe I can try to do this:
change it in dts:
  snps,phyif-utmi-width = <8> or <16>;

Then convert to register value like this:
 device_property_read_u8(dev, "snps,phyif-utmi-width",
   &phyif_utmi_width);

 dwc->phyif_utmi = phyif_utmi_width >> 4;

   After the conversion, a dwc->phyif_utmi value of 0 means 8 bits and 1
means 16 bits,
   which makes it easier to configure GUSB2PHYCFG.

Is it OK?

or you could just store the actual width value read from the dts and make
the core handle accordingly, making everything a bit more explicit.

I guess personally I'd do something like:

make dwc->phyif_utmi a regular unsigned int

in probe:
ret = device_property_read_u8(dev, "snps,phyif-utmi-width",
			      &dwc->phyif_utmi);
if (ret < 0) {
	dwc->phyif_utmi = 0;
} else if (dwc->phyif_utmi != 16 && dwc->phyif_utmi != 8) {
	dev_err(dev, "unsupported utmi interface width %d\n",
		dwc->phyif_utmi);
	return -EINVAL;
}


when setting your GUSB2PHYCFG register:

if (dwc->phyif_utmi > 0) {
	reg &= ~(DWC3_GUSB2PHYCFG_PHYIF_MASK |
		 DWC3_GUSB2PHYCFG_USBTRDTIM_MASK);
	usbtrdtim = (dwc->phyif_utmi == 16) ? USBTRDTIM_UTMI_16_BIT :
					      USBTRDTIM_UTMI_8_BIT;
	phyif = (dwc->phyif_utmi == 16) ? 1 : 0;
	reg |= DWC3_GUSB2PHYCFG_PHYIF(phyif) |
	       DWC3_GUSB2PHYCFG_USBTRDTIM(usbtrdtim);
}
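
For anyone who wants to check the bit arithmetic outside the kernel, here is a standalone sketch of the same width-to-register mapping. It is a userspace illustration and not driver code. The field positions and the 4'h5/4'h9 values come from the databook table quoted earlier in this mail, while the macro names and gusb2phycfg_set_utmi_width() are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout as given in the quoted GUSB2PHYCFG table. */
#define GUSB2PHYCFG_PHYIF_SHIFT      3
#define GUSB2PHYCFG_PHYIF_MASK       (UINT32_C(0x1) << GUSB2PHYCFG_PHYIF_SHIFT)
#define GUSB2PHYCFG_USBTRDTIM_SHIFT  10
#define GUSB2PHYCFG_USBTRDTIM_MASK   (UINT32_C(0xf) << GUSB2PHYCFG_USBTRDTIM_SHIFT)

#define USBTRDTIM_UTMI_8_BIT   UINT32_C(9)	/* 4'h9 */
#define USBTRDTIM_UTMI_16_BIT  UINT32_C(5)	/* 4'h5 */

/*
 * Hypothetical helper: program both fields from a width in bits.
 * Returns 0 and updates *reg for width 8 or 16; returns -1 and leaves
 * *reg untouched for any other width.
 */
static int gusb2phycfg_set_utmi_width(uint32_t *reg, unsigned int width)
{
	uint32_t phyif, usbtrdtim;

	assert(reg);

	switch (width) {
	case 8:
		phyif = 0;
		usbtrdtim = USBTRDTIM_UTMI_8_BIT;
		break;
	case 16:
		phyif = 1;
		usbtrdtim = USBTRDTIM_UTMI_16_BIT;
		break;
	default:
		return -1;
	}

	/* Clear both fields first, then set the new values. */
	*reg &= ~(GUSB2PHYCFG_PHYIF_MASK | GUSB2PHYCFG_USBTRDTIM_MASK);
	*reg |= (phyif << GUSB2PHYCFG_PHYIF_SHIFT) |
		(usbtrdtim << GUSB2PHYCFG_USBTRDTIM_SHIFT);
	return 0;
}
```

Starting from a register value of 0, a width of 16 gives 0x1408 (PHYIF set, USBTRDTIM = 5) and a width of 8 gives 0x2400 (PHYIF clear, USBTRDTIM = 9).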


Ah yes, it seems 

[PATCH] lockdep: fix warning in case of no_validate lock

2016-07-12 Thread Ming Lei
Several locks are now marked as no_validate, so the name of the
lock class can differ from the name of a no_validate lock.

Skip the name check in that case, which fixes the
following warning:

[   14.413292] [ cut here ]
[   14.413297] WARNING: CPU: 1 PID: 1434 at kernel/locking/lockdep.c:704
register_lock_class+0x44a/0x4f0
[   14.413298] Modules linked in: bcache raid1 psmouse dax_pmem dax nd_pmem
serio_raw nvme nd_btt nvme_core floppy null_blk configs autofs4
[   14.413309] CPU: 1 PID: 1434 Comm: bcache-register Tainted: G
W 4.7.0-rc6-next-20160708+ #2069
[   14.413310] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
rel-1.9.0-0-g01a84be-prebuilt.qemu-project.org 04/01/2014
[   14.413311]   880076973900 ba459dec 

[   14.413313]   880076973940 ba08af71 
02c0ba1312a2
[   14.413316]  bb9fd1e0   
880074d078d0
[   14.413318] Call Trace:
[   14.413321]  [] dump_stack+0x85/0xc9
[   14.413323]  [] __warn+0xd1/0xf0
[   14.413324]  [] warn_slowpath_null+0x1d/0x20
[   14.413326]  [] register_lock_class+0x44a/0x4f0
[   14.413328]  [] __lock_acquire+0x85/0x1920
[   14.413330]  [] ? kvm_clock_read+0x23/0x40
[   14.413332]  [] ? sched_clock+0x9/0x10
[   14.413334]  [] ? sched_clock_local+0x18/0x80
[   14.413336]  [] lock_acquire+0xd4/0x240
[   14.413342]  [] ? mca_reap+0x54/0x180 [bcache]
[   14.413343]  [] down_write_trylock+0x67/0x80
[   14.413347]  [] ? mca_reap+0x54/0x180 [bcache]
[   14.413351]  [] mca_reap+0x54/0x180 [bcache]
[   14.413354]  [] mca_alloc+0xc2/0x5a0 [bcache]
[   14.413358]  [] bch_btree_node_get+0x143/0x290 [bcache]
[   14.413364]  [] run_cache_set+0x239/0x8f0 [bcache]
[   14.413368]  [] register_bcache+0x14d2/0x1a30 [bcache]
[   14.413370]  [] ? mutex_lock_nested+0x2db/0x460
[   14.413372]  [] kobj_attr_store+0xf/0x20
[   14.413374]  [] sysfs_kf_write+0x44/0x60
[   14.413376]  [] kernfs_fop_write+0x144/0x1e0
[   14.413378]  [] __vfs_write+0x28/0x120
[   14.413380]  [] ? percpu_down_read+0x57/0x90
[   14.413381]  [] ? __sb_start_write+0xca/0xe0
[   14.413382]  [] ? __sb_start_write+0xca/0xe0
[   14.413383]  [] vfs_write+0xb5/0x1b0
[   14.413385]  [] ? trace_hardirqs_on_caller+0xef/0x210
[   14.413386]  [] SyS_write+0x49/0xa0
[   14.413388]  [] entry_SYSCALL_64_fastpath+0x23/0xc1
[   14.413391]  [] ? __this_cpu_preempt_check+0x13/0x20
[   14.413392] ---[ end trace 16710c495b4dbf2f ]---

Signed-off-by: Ming Lei 
---
 kernel/locking/lockdep.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 589d763..cf071ec 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -701,7 +701,8 @@ look_up_lock_class(struct lockdep_map *lock, unsigned int subclass)
 * Huh! same key, different name? Did someone trample
 * on some memory? We're most confused.
 */
-   WARN_ON_ONCE(class->name != lock->name);
+   WARN_ON_ONCE(lock->key != &__lockdep_no_validate__ &&
+   class->name != lock->name);
return class;
}
}
-- 
1.9.1
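
The condition changed by the patch above can be modelled in userspace. This is only an illustrative sketch, not kernel code: struct toy_key, struct toy_lock, struct toy_class, no_validate_key and name_mismatch_should_warn() are hypothetical stand-ins for lockdep's key, struct lockdep_map, struct lock_class, __lockdep_no_validate__ and the WARN_ON_ONCE() condition.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a lockdep class key. */
struct toy_key {
	int dummy;
};

/* Hypothetical stand-in for struct lockdep_map. */
struct toy_lock {
	const struct toy_key *key;
	const char *name;
};

/* Hypothetical stand-in for struct lock_class. */
struct toy_class {
	const struct toy_key *key;
	const char *name;
};

/* Plays the role of __lockdep_no_validate__ in this model. */
static struct toy_key no_validate_key;

/*
 * Mirrors the patched condition: a name mismatch is only treated as
 * suspicious when the lock does not use the shared "no validate" key.
 * Names are compared by pointer, as in the original check.
 */
static bool name_mismatch_should_warn(const struct toy_class *class,
				      const struct toy_lock *lock)
{
	return lock->key != &no_validate_key && class->name != lock->name;
}
```

With an ordinary key, differing names still trigger the warning; with the shared key they no longer do.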



Re: [PULL] lkdtm update (next)

2016-07-12 Thread Kees Cook
On Tue, Jul 12, 2016 at 9:00 PM, Greg KH  wrote:
> On Tue, Jul 12, 2016 at 02:42:22PM -0400, Kees Cook wrote:
>> On Thu, Jul 7, 2016 at 2:14 PM, Kees Cook  wrote:
>> > Hi,
>> >
>> > Please pull these lkdtm changes for next.
>>
>> Friendly ping... I'd like this refactor to make it in time for the 4.8
>> merge window. :)
>
> Sorry, was on vacation last week, and am at LinuxCon Japan this week,
> will get to it in a day or so.  Don't worry, it will make 4.8 :)

Awesome, thanks!

-Kees

-- 
Kees Cook
Chrome OS & Brillo Security


Re: [PATCH 1/2] crypto: vmx - Adding asm subroutines for XTS

2016-07-12 Thread Stewart Smith
Stephen Rothwell  writes:
> On Mon, 11 Jul 2016 16:07:39 -0300 Paulo Flabiano Smorigo 
>  wrote:
>>
>> diff --git a/drivers/crypto/vmx/aesp8-ppc.pl 
>> b/drivers/crypto/vmx/aesp8-ppc.pl
>> index 2280539..813ffcc 100644
>> --- a/drivers/crypto/vmx/aesp8-ppc.pl
>> +++ b/drivers/crypto/vmx/aesp8-ppc.pl
>> @@ -1,4 +1,11 @@
>> -#!/usr/bin/env perl
>> +#! /usr/bin/env perl
>> +# Copyright 2014-2016 The OpenSSL Project Authors. All Rights Reserved.
>> +#
>> +# Licensed under the OpenSSL license (the "License").  You may not use
>> +# this file except in compliance with the License.  You can obtain a copy
>> +# in the file LICENSE in the source distribution or at
>> +# https://www.openssl.org/source/license.html
>
> So, I assume that this license is compatible with the GPLv2?

https://people.gnome.org/~markmc/openssl-and-the-gpl.html has an
explanation and points to:
https://www.openssl.org/docs/faq.html#LEGAL2

which makes it anything but clear.

It appears the answer is "probably not, unless you have an explicit
exemption in your license".

-- 
Stewart Smith
OPAL Architect, IBM.



Re: [lkp] [usb] 9696ef14de: WARNING: CPU: 0 PID: 1 at lib/list_debug.c:36 __list_add+0x104/0x188

2016-07-12 Thread Ye Xiaolong
On Wed, Jul 13, 2016 at 01:55:26AM +, Peter Chen wrote:
> 
>
>>-Original Message-
>>From: lkp-requ...@eclists.intel.com [mailto:lkp-requ...@eclists.intel.com] On 
>>Behalf
>>Of kernel test robot
>>Sent: Wednesday, July 13, 2016 9:28 AM
>>To: Peter Chen <peter.c...@nxp.com>
>>Cc: 0day robot <fengguang...@intel.com>; LKML <linux-kernel@vger.kernel.org>;
>>l...@01.org
>>Subject: [lkp] [usb] 9696ef14de: WARNING: CPU: 0 PID: 1 at lib/list_debug.c:36
>>__list_add+0x104/0x188
>>
>>
>>FYI, we noticed the following commit:
>>
>>https://github.com/0day-ci/linux Peter-Chen/usb-udc-core-fix-error-
>>handling/20160711-100832
>>commit 9696ef14ded07fb0847f8e1cdda6d98a89ecd4f2 ("usb: udc: core: fix error
>>handling")
>>
>
>Thanks,  but I really can't find the relationship between my patch and dump.
>Can you reproduce it after running again or without my patch?
>

Sorry, it's a false report: the error dump also shows up in the parent commit.
Please ignore the report, and sorry for the noise.

Thanks,
Xiaolong

>Peter
>
>>in testcase: boot
>>
>>on test machine: 2 threads qemu-system-x86_64 -enable-kvm -cpu Nehalem with
>>320M memory
>>
>>caused below changes:
>>
>>[   22.076363] WARNING: CPU: 0 PID: 1 at lib/list_debug.c:36
>>__list_add+0x104/0x188
>>[   22.079911] list_add double add: new=88000e422e10, 
>>prev=88000e422e10,
>>next=8800135e9168.
>>[   22.086059] CPU: 0 PID: 1 Comm: swapper Tainted: GW   
>>4.7.0-rc4-
>>00110-g9696ef1 #1
>>[   22.087856] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
>>Debian-1.8.2-1 04/01/2014
>>[   22.097535]   88001345fc00 8190c8e2
>>88001345fc40
>>[   22.099469]  81175453 00240009 88000e422e10
>>8800135e9168
>>[   22.101220]  88000e422e10  0001
>>88001345fca8
>>[   22.103291] Call Trace:
>>[   22.109166]  [] dump_stack+0x19/0x1b
>>[   22.110520]  [] __warn+0x10a/0x125
>>[   22.111632]  [] warn_slowpath_fmt+0x46/0x4e
>>[   22.112939]  [] ? klist_add_tail+0x29/0x62
>>[   22.114418]  [] __list_add+0x104/0x188
>>[   22.115600]  [] klist_add_tail+0x53/0x62
>>[   22.118742]  [] device_add+0xb03/0xcca
>>[   22.121931]  [] device_create_groups_vargs+0xfe/0x133
>>[   22.123390]  [] device_create_with_groups+0x2b/0x2d
>>[   22.130047]  [] ? __mutex_unlock_slowpath+0x1cb/0x1d8
>>[   22.131513]  [] misc_register+0x1e0/0x2ec
>>[   22.132751]  [] mousedev_init+0x55/0x81
>>[   22.134198]  [] ? input_leds_init+0x12/0x12
>>[   22.135442]  [] do_one_initcall+0xdc/0x16a
>>[   22.139131]  [] kernel_init_freeable+0x387/0x419
>>[   22.142059]  [] kernel_init+0xa/0x103
>>[   22.144618]  [] ret_from_fork+0x1f/0x40
>>[   22.148576]  [] ? rest_init+0xb8/0xb8
>>[   22.150009] ---[ end trace 75873bca450a4fe4 ]---
>>[   22.151559] mousedev: PS/2 mouse device common for all mice
>>[   22.155003] evbug: Connected device: input0 (Power Button at
>>LNXPWRBN/button/input0)
>>
>>
>>FYI, raw QEMU command line is:
>>
>>  qemu-system-x86_64 -enable-kvm -cpu Nehalem -kernel /pkg/linux/x86_64-
>>randconfig-s0-07121340/gcc-
>>6/9696ef14ded07fb0847f8e1cdda6d98a89ecd4f2/vmlinuz-4.7.0-rc4-00110-g9696ef1
>>-append 'root=/dev/ram0 user=lkp job=/lkp/scheduled/vm-intel12-yocto-x86_64-
>>3/bisect_boot-1-yocto-minimal-x86_64.cgz-x86_64-randconfig-s0-07121340-
>>9696ef14ded07fb0847f8e1cdda6d98a89ecd4f2-20160712-19858-1j3k6zi-0.yaml
>>ARCH=x86_64 kconfig=x86_64-randconfig-s0-07121340 branch=linux-devel/devel-
>>hourly-2016071211 commit=9696ef14ded07fb0847f8e1cdda6d98a89ecd4f2
>>BOOT_IMAGE=/pkg/linux/x86_64-randconfig-s0-07121340/gcc-
>>6/9696ef14ded07fb0847f8e1cdda6d98a89ecd4f2/vmlinuz-4.7.0-rc4-00110-g9696ef1
>>max_uptime=600 RESULT_ROOT=/result/boot/1/vm-intel12-yocto-x86_64/yocto-
>>minimal-x86_64.cgz/x86_64-randconfig-s0-07121340/gcc-
>>6/9696ef14ded07fb0847f8e1cdda6d98a89ecd4f2/0 LKP_SERVER=inn
>>earlyprintk=ttyS0,115200 systemd.log_level=err debug apic=debug
>>sysrq_always_enabled rcupdate.rcu_cpu_stall_timeout=100 panic=-1
>>softlockup_panic=1 nmi_watchdog=panic oops=panic load_ramdisk=2
>>prompt_ramdisk=0 console=ttyS0,115200 console=tty0 vga=normal rw ip=vm-
>>intel12-yocto-x86_64-3::dhcp drbd.minor_count=8'  -initrd 
>>/fs/KVM/initrd-vm-intel12-
>>yocto-x86_64-3 -m 320 -smp 2 -device e1000,netdev=net0 -netdev user,id=net0 -
>>boot order=nc -no-reboot -watchdog i6300esb -rtc base=localtime -drive
>>file=/fs/KVM/disk0-vm-intel12-yocto-x86_64-3,media=disk,if=virtio -drive
>>file=/fs/KVM/disk1-vm-intel12-yocto-x86_64-3,media=disk,if=virtio -pidfile
>>/dev/shm/kboot/pid-vm-intel12-yocto-x86_64-3 -serial 
>>file:/dev/shm/kboot/serial-vm-
>>intel12-yocto-x86_64-3 -daemonize -display none -monitor null
>>
>>
>>
>>
>>
>>Thanks,
>>Xiaolong


Re: [v5 PATCH 1/5] extcon: Add Type-C and DP support

2016-07-12 Thread Chris Zhong

Hi Chanwoo Choi

On 07/13/2016 10:05 AM, Chanwoo Choi wrote:

Hi Chris,

On 2016년 07월 13일 10:39, Chris Zhong wrote:

Hi Chanwoo Choi


On 07/13/2016 09:11 AM, Chanwoo Choi wrote:

Hi Chris,

I'm now developing the extcon property on extcon-test branch.
But, it has not been completed.

On next version, I'll remove the notification about extcon property
and only support the following two functions.
- extcon_set_cable_property()
- extcon_get_cable_property()

This is because the number of properties will grow, and every property
depends on a specific external connector (e.g., EXTCON_PROP_USB_VBUS
depends on the EXTCON_TYPE_USB type). When that external connector
is detached, the extcon framework should reset its properties to their default state.

Yes, I think getting the notification from the cable state is enough; that is
actually how I am using it.

OK.


Notifying on every property change may send too many notifications.
For example, assume that EXTCON_TYPE_USB has over 20 properties:
when EXTCON_USB or EXTCON_USB_HOST is detached, extcon would have to send
a notification for each of those properties and one more notification
for the state of the external connector.

So, I'll send the RFC patchset without the property notification.

Lastly,
I have a comment on below.

Thanks,
Chanwoo Choi

On 2016년 07월 13일 00:09, Chris Zhong wrote:

Add EXTCON_DISP_DP for the Display external connector. For Type-C
connector the DisplayPort can work as an Alternate Mode(VESA DisplayPort
Alt Mode on USB Type-C Standard). The Type-C support both normal and
flipped orientation, so add a property to extcon.

Signe-off-by: Chris Zhong 

Signed-off-by: Chris Zhong 
---

Changes in v5:
- support get property

Changes in v4: None
Changes in v3: None
Changes in v2: None
Changes in v1: None

   drivers/extcon/extcon.c | 28 
   include/linux/extcon.h  | 13 +
   2 files changed, 41 insertions(+)

diff --git a/drivers/extcon/extcon.c b/drivers/extcon/extcon.c
index a1117db..2591b28 100644
--- a/drivers/extcon/extcon.c
+++ b/drivers/extcon/extcon.c
@@ -157,6 +157,11 @@ struct __extcon_info {
   .id = EXTCON_DISP_VGA,
   .name = "VGA",
   },
+[EXTCON_DISP_DP] = {
+.type = EXTCON_TYPE_DISP,
+.id = EXTCON_DISP_DP,
+.name = "DP",
+},
 /* Miscellaneous external connector */
   [EXTCON_DOCK] = {
@@ -270,6 +275,7 @@ static bool is_extcon_property_supported(unsigned int id,
   switch (prop) {
   case EXTCON_PROP_USB_ID:
   case EXTCON_PROP_USB_VBUS:
+case EXTCON_PROP_TYPEC_POLARITY:
   return true;
   default:
   break;
@@ -286,6 +292,8 @@ static bool is_extcon_property_supported(unsigned int id,
   }
   case EXTCON_TYPE_DISP:
   switch (prop) {
+case EXTCON_PROP_TYPEC_POLARITY:

Should the EXTCON_PROP_TYPEC_POLARITY property be added to both EXTCON_TYPE_USB and
EXTCON_TYPE_DISP?
Isn't EXTCON_PROP_TYPEC_POLARITY a property of USB Type-C?

It is for USB Type-C, but in DisplayPort alt mode both EXTCON_USB and
EXTCON_USB_HOST may be detached. Is setting a property on a detached
cable supported? If so, I think moving this case to EXTCON_USB is fine.

One external connector can set the state of more than one connector type
if it supports several functions, for example EXTCON_USB and EXTCON_CHG_USB_SDP.
The existing extcon drivers [1] (e.g., max14577, max77693) set the state of
both the EXTCON_USB and EXTCON_CHG_USB_SDP connectors at the same time
when a USB cable is attached, because in this case the USB connector is used for
both power supply (EXTCON_CHG_USB_SDP) and data transfer (EXTCON_USB).
[1] 
https://git.kernel.org/cgit/linux/kernel/git/chanwoo/extcon.git/commit/?h=extcon-next=8b45b6a0741678902810d7be95e635c210fbb198

DP Alt mode uses USB Type-C. So, when a USB Type-C connector is attached
for DP Alt mode,
maybe you can set the following two connector states and one property:
- extcon_set_cable_state(edev, [EXTCON_USB or EXTCON_USB_HOST], 1);
- extcon_set_cable_state(edev, EXTCON_DISP_DP, 1);
- extcon_set_cable_property(edev, [EXTCON_USB or EXTCON_USB_HOST],
EXTCON_PROP_TYPEC_POLARITY, 0 or 1);

Thanks,
Chanwoo Choi


There are 4 modes for Type-C DP alt mode:
1) USB host only  :

extcon_set_cable_state(edev, EXTCON_USB_HOST, 1);
extcon_set_cable_state(edev, EXTCON_USB, 0);
extcon_set_cable_state(edev, EXTCON_DISP_DP, 0);

2) USB device only

extcon_set_cable_state(edev, EXTCON_USB_HOST, 0);
extcon_set_cable_state(edev, EXTCON_USB, 1);
extcon_set_cable_state(edev, EXTCON_DISP_DP, 0);

3) DP only

extcon_set_cable_state(edev, EXTCON_USB_HOST, 0);
extcon_set_cable_state(edev, EXTCON_USB, 0);
extcon_set_cable_state(edev, EXTCON_DISP_DP, 1);

4) USB + DP

extcon_set_cable_state(edev, EXTCON_USB_HOST, 1);
extcon_set_cable_state(edev, EXTCON_USB, 0);
extcon_set_cable_state(edev, EXTCON_DISP_DP, 1);
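
The four cases listed above can be summarised as a small lookup. This is a minimal userspace sketch, not driver code: enum typec_dp_mode, struct cable_states and states_for_mode() are hypothetical names introduced here, and a real driver would pass each resulting flag to extcon_set_cable_state() for the matching connector id.

```c
#include <stdbool.h>

/* Hypothetical names for the four Type-C DP alt mode cases. */
enum typec_dp_mode {
	MODE_USB_HOST_ONLY,
	MODE_USB_DEVICE_ONLY,
	MODE_DP_ONLY,
	MODE_USB_AND_DP,
};

/* Desired attach state of the three connectors for one mode. */
struct cable_states {
	bool usb_host;	/* state for EXTCON_USB_HOST */
	bool usb;	/* state for EXTCON_USB */
	bool disp_dp;	/* state for EXTCON_DISP_DP */
};

/* Return the connector states for a mode, following the list above. */
static struct cable_states states_for_mode(enum typec_dp_mode mode)
{
	struct cable_states s = { false, false, false };

	switch (mode) {
	case MODE_USB_HOST_ONLY:
		s.usb_host = true;
		break;
	case MODE_USB_DEVICE_ONLY:
		s.usb = true;
		break;
	case MODE_DP_ONLY:
		s.disp_dp = true;
		break;
	case MODE_USB_AND_DP:
		s.usb_host = true;
		s.disp_dp = true;
		break;
	}
	return s;
}
```

Keeping the mapping in one place makes it easy to see that the DP-only case leaves both USB connectors detached, which is the case the polarity question is about.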


for 3rd mode: DP only, 

Re: [RFC 0/3] extend kexec_file_load system call

2016-07-12 Thread Dave Young
On 07/12/16 at 03:50pm, Mark Rutland wrote:
> On Tue, Jul 12, 2016 at 04:24:10PM +0200, Arnd Bergmann wrote:
> > On Tuesday, July 12, 2016 10:18:11 AM CEST Vivek Goyal wrote:
> > > > 
> > > > On Open Firmware, the DT is extracted from running firmware and copied
> > > > into dynamically allocated data structures. After a kexec, the runtime
> > > > interface to the firmware is not available, so the flattened DT format
> > > > was created as a way to pass the same data in a binary blob to the new
> > > > kernel in a format that can be read from the kernel by walking the
> > > > directories in /proc/device-tree/*.
> > > 
> > > So this DT is available inside kernel and running kernel can still
> > > retrieve it and pass it to second kernel?
> > 
> > The kernel only uses the flattened DT blob at boot time and converts
> > it into the runtime data structures (struct device_node). The original
> > dtb is typically overwritten later.
> 
> On arm64 we deliberately preserved the DTB, so we can take that and
> build a new DTB from that kernel-side.
> 
> > > > - we typically ship devicetree sources for embedded machines with the
> > > >   kernel sources. As more hardware of the system gets enabled, the
> > > >   devicetree gains extra nodes and properties that describe the hardware
> > > >   more completely, so we need to use the latest DT blob to use all
> > > >   the drivers
> > > > 
> > > > - in some cases, kernels will fail to boot at all with an older version
> > > >   of the DT, or fail to use the devices that were working on the
> > > >   earlier kernel. This is usually considered a bug, but it's not rare
> > > > 
> > > > - In some cases, the kernel can update its DT at runtime, and the new
> > > >   settings are expected to be available in the new kernel too, though
> > > >   there are cases where you actually don't want the modified contents.
> > > 
> > > I am assuming that the modified DT and the unmodified one are both accessible to
> > > the kernel. And if user space can decide which modified fields to use
> > > for new kernels and which not, then the same can be done in the kernel too?
> > 
> > The unmodified DT can typically be found on disk next to the kernel binary.
> > The option you have is to either read it from /proc/devicetree or to
> > read it from /boot/*.dtb.
> 
> /proc/devicetree (aka /sys/firmware/devicetree) is a filesystem derived
> from the raw DTB (which is exposed at /sys/firmware/fdt).
> 
> The blob that was handed to the kernel at boot time is exposed at
> /sys/firmware/fdt.

I believe the blob can be read and passed to the kexec'd kernel from kernel code
without the extra fd.

But consider that we can kexec into a different kernel with a different initrd, so
there will be use cases for passing a totally different dtb as well. From my
understanding that is reasonable, but yes, I think we should think carefully about the design.

Thanks
Dave

> Thanks,
> Mark.
> 
> ___
> kexec mailing list
> ke...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec


linux-next: manual merge of the drm-misc tree with the arm tree

2016-07-12 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the drm-misc tree got a conflict in:

  drivers/gpu/drm/rockchip/rockchip_drm_drv.c

between commit:

  062993b15e8e ("drm: convert DT component matching to 
component_match_add_release()")

from the arm tree and commit:

  6d5fa28c13b9 ("gpu: drm: rockchip_drm_drv: add missing of_node_put after 
calling of_parse_phandle")

from the drm-misc tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc drivers/gpu/drm/rockchip/rockchip_drm_drv.c
index 7fd20c0e1fc8,f0bd1ee8b128..
--- a/drivers/gpu/drm/rockchip/rockchip_drm_drv.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_drv.c
@@@ -438,9 -433,8 +438,10 @@@ static int rockchip_drm_platform_probe(
is_support_iommu = false;
}
  
+   of_node_put(iommu);
 -  component_match_add(dev, &match, compare_of, port->parent);
 +  of_node_get(port->parent);
 +  component_match_add_release(dev, &match, release_of,
 +  compare_of, port->parent);
of_node_put(port);
}
  


Re: [PATCH v2 1/1] block: fix blk_queue_split() resource exhaustion

2016-07-12 Thread Mike Snitzer
On Tue, Jul 12 2016 at 10:18pm -0400,
Eric Wheeler  wrote:

> On Tue, 12 Jul 2016, NeilBrown wrote:
> 
> > On Tue, Jul 12 2016, Lars Ellenberg wrote:
> > 
> > >
> > > Instead, I suggest to distinguish between recursive calls to
> > > generic_make_request(), and pushing back the remainder part in
> > > blk_queue_split(), by pointing current->bio_lists to a
> > >   struct recursion_to_iteration_bio_lists {
> > >   struct bio_list recursion;
> > >   struct bio_list queue;
> > >   }
> > >
> > > By providing each q->make_request_fn() with an empty "recursion"
> > > bio_list, then merging any recursively submitted bios to the
> > > head of the "queue" list, we can make the recursion-to-iteration
> > > logic in generic_make_request() process deepest level bios first,
> > > and "sibling" bios of the same level in "natural" order.
> > >
> > > Signed-off-by: Lars Ellenberg 
> > > Signed-off-by: Roland Kammerer 
> > 
> > Reviewed-by: NeilBrown 
> > 
> > Thanks again for doing this - I think this is a very significant
> > improvement and could allow other simplifications.
> 
> Thank you Lars for all of this work!  
> 
> It seems like there have been many 4.3+ blockdev stacking issues and this 
> will certainly address some of those (maybe all of them?).  (I think we 
> hit this while trying drbd in 4.4 so we dropped back to 4.1 without 
> issue.)  It would be great to hear 4.4.y stable pick this up if 
> compatible.
> 
> 
> Do you believe that this patch would solve any of the proposals by others 
> since 4.3 related to bio splitting/large bios?  I've been collecting a 
> list, none of which appear have landed yet as of 4.7-rc7 (but correct me 
> if I'm wrong):
> 
> A.  [PATCH v2] block: make sure big bio is splitted into at most 256 bvecs
>   by Ming Lei: https://patchwork.kernel.org/patch/9169483/
> 
> B.  block: don't make BLK_DEF_MAX_SECTORS too big
>   by Shaohua Li: http://www.spinics.net/lists/linux-bcache/msg03525.html
> 
> C.  [1/3] block: flush queued bios when process blocks to avoid deadlock
>   by Mikulas Patocka: https://patchwork.kernel.org/patch/9204125/
>   (was https://patchwork.kernel.org/patch/7398411/)
> 
> D.  dm-crypt: Fix error with too large bios
>   by Mikulas Patocka: https://patchwork.kernel.org/patch/9138595/
> 
> The A,B,D are known to fix large bio issues when stacking dm+bcache 
> (though the B,D are trivial and probably necessary even with your patch).
> 
> Patch C was mentioned earlier in this thread by Mike Snitzer and you 
> commented briefly that his patch might solve the issue; given that, and in 
> the interest of minimizing duplicate effort, which of the following best 
> describes the situation?
> 
>   1. Your patch could supersede Mikulas's patch; they address the same 
> issue.
> 
>   2. Mikulas's patch addresses different issues such that both patches 
> should be applied.
> 
>   3. There is overlap between both your patch and Mikulas's such that both 
> #1,#2 are true and effort to solve this has been duplicated.
> 
> 
> If #3, then what might be done to resolve the overlap?

Mikulas confirmed to me that he believes Lars' v2 patch will fix the
dm-snapshot problem, which is being tracked with this BZ:
https://bugzilla.kernel.org/show_bug.cgi?id=119841

We'll see how testing goes (currently underway).


[PATCH] mm: fix calculation accounting dirtyable highmem

2016-07-12 Thread Minchan Kim
When I tested vmscale in mmtests on 32-bit, I found the benchmark
was roughly 1.4 times slower (elapsed time rose from 26.48 to 38.08).

                base        node
                   1    global-1
User           12.98       16.04
System        147.61      166.42
Elapsed        26.48       38.08

With vmstat, I found that the average IO wait is much higher than with
base.

The reason was that highmem_dirtyable_memory wrongly accumulates free pages
and highmem_file_pages across the HIGHMEM to MOVABLE zones.
With that, dirty_thresh in throttle_vm_writeout is always
0, so it calls congestion_wait frequently once writeback
starts.

With this patch, most of the regression is recovered.

                base        node          fi
                   1    global-1         fix
User           12.98       16.04       13.78
System        147.61      166.42      143.92
Elapsed        26.48       38.08       29.64

Signed-off-by: Minchan Kim 
---
 mm/page-writeback.c | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 8db1db2..bf27594 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -307,27 +307,31 @@ static unsigned long highmem_dirtyable_memory(unsigned long total)
 {
 #ifdef CONFIG_HIGHMEM
int node;
-   unsigned long x = 0;
+   unsigned long x;
int i;
-   unsigned long dirtyable = highmem_file_pages;
+   unsigned long dirtyable = 0;
 
for_each_node_state(node, N_HIGH_MEMORY) {
for (i = ZONE_NORMAL + 1; i < MAX_NR_ZONES; i++) {
struct zone *z;
+   unsigned long nr_pages;
 
if (!is_highmem_idx(i))
continue;
 
z = &NODE_DATA(node)->node_zones[i];
-   dirtyable += zone_page_state(z, NR_FREE_PAGES);
+   if (!populated_zone(z))
+   continue;
 
+   nr_pages = zone_page_state(z, NR_FREE_PAGES);
/* watch for underflows */
-   dirtyable -= min(dirtyable, high_wmark_pages(z));
-
-   x += dirtyable;
+   nr_pages -= min(nr_pages, high_wmark_pages(z));
+   dirtyable += nr_pages;
}
}
 
+   x = dirtyable + highmem_file_pages;
+
/*
 * Unreclaimable memory (kernel memory or anonymous memory
 * without swap) can bring down the dirtyable pages below
-- 
1.9.1
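
The effect of the loop change above can be checked with a small userspace model. This is a sketch under stated assumptions, not the kernel function: struct toy_zone, min_ul(), dirtyable_old() and dirtyable_new() are hypothetical names, each zone is reduced to three numbers, and the clamping that the real function applies after the loop is left out.

```c
#include <stddef.h>

/* Hypothetical, simplified view of one highmem zone. */
struct toy_zone {
	unsigned long free_pages;	/* models NR_FREE_PAGES */
	unsigned long high_wmark;	/* models high_wmark_pages(z) */
	int populated;			/* models populated_zone(z) */
};

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/*
 * Old loop: the running total, which already contains the file pages,
 * is added to x once per zone, so it is counted repeatedly.
 */
static unsigned long dirtyable_old(const struct toy_zone *z, size_t n,
				   unsigned long file_pages)
{
	unsigned long x = 0;
	unsigned long dirtyable = file_pages;
	size_t i;

	for (i = 0; i < n; i++) {
		dirtyable += z[i].free_pages;
		/* watch for underflows */
		dirtyable -= min_ul(dirtyable, z[i].high_wmark);
		x += dirtyable;
	}
	return x;
}

/*
 * New loop: each populated zone contributes its own free pages above
 * the high watermark, and the file pages are added exactly once.
 */
static unsigned long dirtyable_new(const struct toy_zone *z, size_t n,
				   unsigned long file_pages)
{
	unsigned long dirtyable = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned long nr_pages;

		if (!z[i].populated)
			continue;
		nr_pages = z[i].free_pages;
		/* watch for underflows */
		nr_pages -= min_ul(nr_pages, z[i].high_wmark);
		dirtyable += nr_pages;
	}
	return dirtyable + file_pages;
}
```

For one populated zone with 500 free pages and a watermark of 100, one empty unpopulated zone, and 1000 highmem file pages, the old loop yields 2800 while the new one yields 1400.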



[PATCH] mm: fix pgalloc_stall on unpopulated zone

2016-07-12 Thread Minchan Kim
If we use sc->reclaim_idx for accounting pgstall, the count can be
incremented on an unpopulated zone. For example, the movable zone (which
my system doesn't have) gets the count if the allocation request was
GFP_HIGHUSER_MOVABLE. That makes no sense.

This patch fixes it so that the stall is accounted on the first populated
zone at or below the highest_zoneidx of the request.

Signed-off-by: Minchan Kim 
---
 fs/buffer.c  | 2 +-
 include/linux/swap.h | 3 ++-
 mm/page_alloc.c  | 3 ++-
 mm/vmscan.c  | 5 +++--
 4 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 46b3568..69841f4 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -268,7 +268,7 @@ static void free_more_memory(void)
gfp_zone(GFP_NOFS), NULL);
if (z->zone)
try_to_free_pages(node_zonelist(nid, GFP_NOFS), 0,
-   GFP_NOFS, NULL);
+   GFP_NOFS, NULL, gfp_zone(GFP_NOFS));
}
 }
 
diff --git a/include/linux/swap.h b/include/linux/swap.h
index cc753c6..935f7e1 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -309,7 +309,8 @@ extern void lru_cache_add_active_or_unevictable(struct page *page,
 /* linux/mm/vmscan.c */
 extern unsigned long pgdat_reclaimable_pages(struct pglist_data *pgdat);
 extern unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
-   gfp_t gfp_mask, nodemask_t *mask);
+   gfp_t gfp_mask, nodemask_t *mask,
+   enum zone_type classzone_idx);
 extern int __isolate_lru_page(struct page *page, isolate_mode_t mode);
 extern unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *memcg,
  unsigned long nr_pages,
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 80c9b9a..5f20d4b 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3305,7 +3305,8 @@ __perform_reclaim(gfp_t gfp_mask, unsigned int order,
current->reclaim_state = &reclaim_state;
 
progress = try_to_free_pages(ac->zonelist, order, gfp_mask,
-   ac->nodemask);
+   ac->nodemask,
+   zonelist_zone_idx(ac->preferred_zoneref));
 
current->reclaim_state = NULL;
lockdep_clear_current_reclaim_state();
diff --git a/mm/vmscan.c b/mm/vmscan.c
index c538a8c..1f91e2e 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2855,13 +2855,14 @@ static bool throttle_direct_reclaim(gfp_t gfp_mask, struct zonelist *zonelist,
 }
 
 unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
-   gfp_t gfp_mask, nodemask_t *nodemask)
+   gfp_t gfp_mask, nodemask_t *nodemask,
+   enum zone_type classzone_idx)
 {
unsigned long nr_reclaimed;
struct scan_control sc = {
.nr_to_reclaim = SWAP_CLUSTER_MAX,
.gfp_mask = (gfp_mask = memalloc_noio_flags(gfp_mask)),
-   .reclaim_idx = gfp_zone(gfp_mask),
+   .reclaim_idx = classzone_idx,
.order = order,
.nodemask = nodemask,
.priority = DEF_PRIORITY,
-- 
1.9.1
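
The idea in the description above, accounting on the first populated zone at or below the requested index, can be illustrated in isolation. This is a hypothetical sketch: first_populated_at_or_below() and the populated[] array are assumptions made for the example, whereas the patch itself takes the index from zonelist_zone_idx(ac->preferred_zoneref).

```c
#include <stdbool.h>

/*
 * Walk downwards from highest_idx and return the index of the first
 * populated zone, or -1 if no zone at or below that index is populated.
 */
static int first_populated_at_or_below(const bool *populated, int highest_idx)
{
	int i;

	for (i = highest_idx; i >= 0; i--) {
		if (populated[i])
			return i;
	}
	return -1;
}
```

On a system with no movable zone, a request whose highest allowed zone is the movable one would then be accounted on the highest zone that actually has pages.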



Re: [PATCH v3 3/9] DocBook/v4l: Add compressed video formats used on MT8173 codec driver

2016-07-12 Thread 李務誠
On Wed, Jul 13, 2016 at 3:14 AM, Nicolas Dufresne
 wrote:
> Le mardi 12 juillet 2016 à 15:08 -0400, Nicolas Dufresne a écrit :
>> Le mardi 12 juillet 2016 à 16:16 +0800, Wu-Cheng Li (李務誠) a écrit :
>> > Decoder hardware produces MT21 (compressed). Image processor can
>> > convert it to a format that can be input of display driver.
>> > Tiffany.
>> > When do you plan to upstream image processor (mtk-mdp)?
>> > >
>> > > It can be as input format for encoder, MDP and display drivers in
>> > our
>> > > platform.
>> > I remember display driver can only accept uncompressed MT21. Right?
>> > Basically V4L2_PIX_FMT_MT21 is compressed and is like an opaque
>> > format. It's not usable until it's decompressed and converted by
>> > image
>> > processor.
>>
>> Previously it was described as MediaTek block mode, and now as a
>> MediaTek compressed format. It makes me think you have no idea what
>> this pixel format really is. Is that right ?
>>
>> The main reason why I keep asking, is that we often find similarities
>> between what vendor like to call their proprietary formats. Doing the
>> proper research helps not creating a mess like in Android where you
>> have a lot of formats that all point to the same format. I believe
>> there was the same concern when Samsung wanted to introduce their Z-
>> flip-Z NV12 tile format. In the end they simply provided sufficient
>> documentation so we could document it and implement software
>> converters
>> for test and validation purpose.
>
> Here's the kind of information we want in the documentation.
>
> https://chromium.googlesource.com/chromium/src/media/+/master/base/video_types.h#40
That is the documentation of decompressed MT21. Originally MT21 was meant
to be a YUV format that we could map and use from the CPU. The name was later
changed to mean a compressed format. In the current design only the MTK image
processor can convert it; software cannot decompress it. I'm not sure we
should document the internals of the format if it cannot be decompressed in
software. For chromium, I'll update the code to explain that MT21 is an opaque
compressed format.
>
>   // MediaTek proprietary format. MT21 is similar to NV21 except the memory
>   // layout and pixel layout (swizzles). 12bpp with Y plane followed by a 2x2
>   // interleaved VU plane. Each image contains two buffers -- Y plane and VU
>   // plane. Two planes can be non-contiguous in memory. The starting addresses
>   // of Y plane and VU plane are 4KB alignment.
>   // Suppose image dimension is (width, height). For both Y plane and VU plane:
>   // Row pitch = ((width+15)/16) * 16.
>   // Plane size = Row pitch * (((height+31)/32)*32)
>
> Now obviously this is incomplete, as the swizzling needs to be documented of course.
>
>>
>> regards,
>> Nicolas
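
The row pitch and plane size rules in the quoted comment can be written out as a small calculation. This is a sketch of the arithmetic only, for the layout the thread calls decompressed MT21; align_up(), mt21_row_pitch() and mt21_plane_size() are hypothetical helper names and say nothing about the swizzling or the compressed form.

```c
#include <stddef.h>

/* Round value up to the next multiple of alignment. */
static size_t align_up(size_t value, size_t alignment)
{
	return (value + alignment - 1) / alignment * alignment;
}

/* Row pitch = ((width + 15) / 16) * 16 */
static size_t mt21_row_pitch(size_t width)
{
	return align_up(width, 16);
}

/* Plane size = row pitch * (((height + 31) / 32) * 32) */
static size_t mt21_plane_size(size_t width, size_t height)
{
	return mt21_row_pitch(width) * align_up(height, 32);
}
```

For a 1920x1080 image the row pitch stays 1920, the height is padded to 1088 rows, and the plane size comes to 2088960.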


linux-next: manual merge of the drm tree with the v4l-dvb tree

2016-07-12 Thread Stephen Rothwell
Hi Dave,

Today's linux-next merge of the drm tree got a conflict in:

  drivers/media/platform/omap/omap_voutdef.h

between commit:

  77430f0396af ("[media] omap_vout: use control framework")

from the v4l-dvb tree and commit:

  781a162244a2 ("[media] omap_vout: Switch to use the video/omapfb_dss.h header 
file")

from the drm tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc drivers/media/platform/omap/omap_voutdef.h
index 49de1475e473,94b5d65afb19..
--- a/drivers/media/platform/omap/omap_voutdef.h
+++ b/drivers/media/platform/omap/omap_voutdef.h
@@@ -11,8 -11,7 +11,8 @@@
  #ifndef OMAP_VOUTDEF_H
  #define OMAP_VOUTDEF_H
  
 +#include 
- #include 
+ #include 
  #include 
  
  #define YUYV_BPP2


Re: [dm-devel] [PATCH v2 1/1] block: fix blk_queue_split() resource exhaustion

2016-07-12 Thread Eric Wheeler
On Tue, 12 Jul 2016, NeilBrown wrote:

> On Tue, Jul 12 2016, Lars Ellenberg wrote:
> 
> >
> > Instead, I suggest to distinguish between recursive calls to
> > generic_make_request(), and pushing back the remainder part in
> > blk_queue_split(), by pointing current->bio_lists to a
> > struct recursion_to_iteration_bio_lists {
> > struct bio_list recursion;
> > struct bio_list queue;
> > }
> >
> > By providing each q->make_request_fn() with an empty "recursion"
> > bio_list, then merging any recursively submitted bios to the
> > head of the "queue" list, we can make the recursion-to-iteration
> > logic in generic_make_request() process deepest level bios first,
> > and "sibling" bios of the same level in "natural" order.
> >
> > Signed-off-by: Lars Ellenberg 
> > Signed-off-by: Roland Kammerer 
> 
> Reviewed-by: NeilBrown 
> 
> Thanks again for doing this - I think this is a very significant
> improvement and could allow other simplifications.

Thank you Lars for all of this work!  

It seems like there have been many 4.3+ blockdev stacking issues and this 
will certainly address some of those (maybe all of them?).  (I think we 
hit this while trying drbd in 4.4 so we dropped back to 4.1 without 
issue.)  It would be great to hear 4.4.y stable pick this up if 
compatible.


Do you believe that this patch would solve any of the proposals by others 
since 4.3 related to bio splitting/large bios?  I've been collecting a 
list, none of which appears to have landed yet as of 4.7-rc7 (but correct me 
if I'm wrong):

A.  [PATCH v2] block: make sure big bio is splitted into at most 256 bvecs
by Ming Lei: https://patchwork.kernel.org/patch/9169483/

B.  block: don't make BLK_DEF_MAX_SECTORS too big
by Shaohua Li: http://www.spinics.net/lists/linux-bcache/msg03525.html

C.  [1/3] block: flush queued bios when process blocks to avoid deadlock
by Mikulas Patocka: https://patchwork.kernel.org/patch/9204125/
(was https://patchwork.kernel.org/patch/7398411/)

D.  dm-crypt: Fix error with too large bios
by Mikulas Patocka: https://patchwork.kernel.org/patch/9138595/

The A,B,D are known to fix large bio issues when stacking dm+bcache 
(though the B,D are trivial and probably necessary even with your patch).

Patch C was mentioned earlier in this thread by Mike Snitzer and you 
commented briefly that his patch might solve the issue; given that, and in 
the interest of minimizing duplicate effort, which of the following best 
describes the situation?

  1. Your patch could supersede Mikulas's patch; they address the same 
issue.

  2. Mikulas's patch addresses different issues such that both patches 
should be applied.

  3. There is overlap between both your patch and Mikulas's such that both 
#1,#2 are true and effort to solve this has been duplicated.


If #3, then what might be done to resolve the overlap?

What are the opinions of the authors and can a consensus be reached so we 
can see these pushed upstream with the appropriate stable Cc tags and 
ultimately fix 4.4.y?


--
Eric Wheeler
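
The two-list scheme in the quoted patch description (an empty "recursion" list per call, merged to the head of a "queue" list) can be sketched as a toy userspace model. Everything here is an assumption made for illustration: struct toy_bio, struct toy_bio_list and toy_generic_make_request() are hypothetical, and a bio's "children" stand for the bios its make_request_fn would submit recursively.

```c
#include <stddef.h>

#define MAX_CHILDREN 4

/* Hypothetical toy bio: children are what processing it would submit. */
struct toy_bio {
	int id;
	struct toy_bio *next;
	struct toy_bio *children[MAX_CHILDREN];
	size_t nr_children;
};

struct toy_bio_list {
	struct toy_bio *head;
	struct toy_bio *tail;
};

static void list_init(struct toy_bio_list *l)
{
	l->head = NULL;
	l->tail = NULL;
}

static void list_add_tail(struct toy_bio_list *l, struct toy_bio *b)
{
	b->next = NULL;
	if (l->tail)
		l->tail->next = b;
	else
		l->head = b;
	l->tail = b;
}

static struct toy_bio *list_pop(struct toy_bio_list *l)
{
	struct toy_bio *b = l->head;

	if (b) {
		l->head = b->next;
		if (!l->head)
			l->tail = NULL;
		b->next = NULL;
	}
	return b;
}

/* Put all of src in front of dst, leaving src empty. */
static void list_merge_head(struct toy_bio_list *dst, struct toy_bio_list *src)
{
	if (!src->head)
		return;
	src->tail->next = dst->head;
	if (!dst->tail)
		dst->tail = src->tail;
	dst->head = src->head;
	list_init(src);
}

/*
 * Process bios iteratively and record the order in which they are
 * handled. Returns the number of ids written to order[].
 */
static size_t toy_generic_make_request(struct toy_bio *first, int *order,
				       size_t max)
{
	struct toy_bio_list recursion, queue;
	struct toy_bio *bio = first;
	size_t n = 0;
	size_t i;

	list_init(&queue);
	while (bio && n < max) {
		/* Each "make_request_fn" call starts with an empty list. */
		list_init(&recursion);
		order[n++] = bio->id;
		for (i = 0; i < bio->nr_children; i++)
			list_add_tail(&recursion, bio->children[i]);
		/* Deeper-level bios go ahead of everything already queued. */
		list_merge_head(&queue, &recursion);
		bio = list_pop(&queue);
	}
	return n;
}
```

If bio 0 submits bios 1 and 2, and bio 1 submits bios 3 and 4, this model handles them in the order 0, 1, 3, 4, 2: deepest level first, siblings in their natural order.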


Re: [PATCH] sched/fair: do not announce throttled next buddy in dequeue_task_fair

2016-07-12 Thread Wanpeng Li
2016-07-13 9:58 GMT+08:00 Xunlei Pang :
> On 2016/07/13 at 09:50, Wanpeng Li wrote:
>> 2016-07-13 1:25 GMT+08:00  :
>>> Konstantin Khlebnikov  writes:
>>>
>>>> On 11.07.2016 15:12, Xunlei Pang wrote:
>>>>> On 2016/07/11 at 17:54, Wanpeng Li wrote:
>>>>>> Hi Konstantin, Xunlei,
>>>>>> 2016-07-11 16:42 GMT+08:00 Xunlei Pang :
>>>>>>> On 2016/07/11 at 16:22, Xunlei Pang wrote:
>>>>>>>> On 2016/07/11 at 15:25, Wanpeng Li wrote:
>>>>>>>>> 2016-06-16 20:57 GMT+08:00 Konstantin Khlebnikov
>>>>>>>>> :
>>>>>>>>>> Hierarchy could be already throttled at this point. Throttled next
>>>>>>>>>> buddy could trigger null pointer dereference in
>>>>>>>>>> pick_next_task_fair().
>>>>>>>>> There is cfs_rq->next check in pick_next_entity(), so how can null
>>>>>>>>> pointer dereference happen?
>>>>>>>> I guess it's the following code leading to a NULL se returned:
>>>>>>> s/NULL/empty-entity cfs_rq se/
>>>>>>>
>>>>>>>> pick_next_entity():
>>>>>>>>  if (cfs_rq->next && wakeup_preempt_entity(cfs_rq->next, left) < 1)
>>>>>>  ^
>>>>>> I think this will return false.
>>>>> With the wrong throttled_hierarchy(), I think this can happen. But after
>>>>> we have the
>>>>> corrected throttled_hierarchy() patch, I can't see how it is possible.
>>>>>
>>>>> dequeue_task_fair():
>>>>>  if (task_sleep && parent_entity(se))
>>>>>  set_next_buddy(parent_entity(se));
>>>>>
>>>>> How does dequeue_task_fair() with DEQUEUE_SLEEP set(true task_sleep)
>>>>> happen to a throttled hierarchy?
>>>>> IOW, a task belongs to a throttled hierarchy is running?
>>>>>
>>>>> Maybe Konstantin knows the reason.
>>>> This function (dequeue_task_fair) check throttling but at point it could
>>>> skip several
>>>> levels and announce as next buddy actually throttled entry.
>>>> Probably this bug hadn't happened but this's really hard to prove that
>>>> this is impossible.
>>>> ->set_curr_task(), PI-boost or some tricky migration in balancer could
>>>> break this easily.
>>> sched_setscheduler can call put_prev_task, which then can cause a
>>> throttle outside of __schedule(), then the task blocks normally and
>>> deactivate_task(DEQUEUE_SLEEP) happens and you lose.
>> The cfs_rq_throttled() check in dequeue_task_fair() will capture the
>> cfs_rq which is throttled in sched_setscheduler::put_prev_task path,
>> so nothing lost, where I miss?
>
> cfs_rq_throttled() returns false for child cgroups in the throttled 
> hierarchy, so
> throttled_hierarchy() should be relied on in such cases.

Yes, so what's lost in bsegall's reply?

Regards,
Wanpeng Li


[PATCH V2] iommu: arm-smmu: drop devm_free_irq when driver detach

2016-07-12 Thread Peng Fan
There is no need to call devm_free_irq() when the driver detaches:
devres_release_all(), which is called after 'drv->remove', will
release all managed resources.

Signed-off-by: Peng Fan 
Reviewed-by: Robin Murphy 
Cc: Will Deacon 
---

V2:
 Fix compile warning. Add Robin's Reviewed-by TAG.

 drivers/iommu/arm-smmu.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 860652e..b7ef1d8 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -2018,7 +2018,6 @@ out_put_masters:
 
 static int arm_smmu_device_remove(struct platform_device *pdev)
 {
-	int i;
 	struct device *dev = &pdev->dev;
 	struct arm_smmu_device *curr, *smmu = NULL;
 	struct rb_node *node;
@@ -2045,9 +2044,6 @@ static int arm_smmu_device_remove(struct platform_device *pdev)
 	if (!bitmap_empty(smmu->context_map, ARM_SMMU_MAX_CBS))
 		dev_err(dev, "removing device with active domains!\n");
 
-	for (i = 0; i < smmu->num_global_irqs; ++i)
-		devm_free_irq(smmu->dev, smmu->irqs[i], smmu);
-
 	/* Turn the thing off */
 	writel(sCR0_CLIENTPD, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0);
 	return 0;
-- 
2.6.2



Re: dm stripe: add DAX support

2016-07-12 Thread Mike Snitzer
On Tue, Jul 12 2016 at  6:22pm -0400,
Kani, Toshimitsu  wrote:

> On Fri, 2016-06-24 at 14:29 -0400, Mike Snitzer wrote:
> > 
> > BTW, if in your testing you could evaluate/quantify any extra overhead
> > from DM that'd be useful to share.  It could be there are bottlenecks
> > that need to be fixed, etc.
> 
> Here are some results from fio benchmark.  The test is single-threaded and is
> bound to one CPU.
> 
>  DAX  LVM   IOPS   NOTE
>  ---
>   Y    N    790K
>   Y    Y    754K   5% overhead with LVM
>   N    N    567K
>   N    Y    457K   20% overhead with LVM
> 
>  DAX: Y: mount -o dax,noatime, N: mount -o noatime
>  LVM: Y: dm-linear on pmem0 device, N: pmem0 device
>  fio: bs=4k, size=2G, direct=1, rw=randread, numjobs=1
> 
> Among the 5% overhead with DAX/LVM, the new DM direct_access interfaces
> account for less than 0.5%.
> 
>  dm_blk_direct_access 0.28%
>  linear_direct_access 0.17%
> 
> The average latency increases slightly from 0.93us to 0.95us.  I think most of
> the overhead comes from the submit_bio() path, which is used only for
> accessing metadata with DAX.  I believe this is due to cloning bio for each
> request in DM.  There is 12% more L2 miss in total.
> 
> Without DAX, 20% overhead is observed with LVM.  Average latency increases
> from 1.39us to 1.82us.  Without DAX, bio is cloned for both data and metadata.
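
For readers checking the NOTE column in the quoted table, the overhead figures follow from the IOPS values by simple arithmetic. The snippet is only a sketch of that calculation; the function name is made up for illustration and is not part of fio or DM.

```c
#include <assert.h>

/* Relative throughput loss in percent when dm-linear (LVM) is added
 * under the filesystem. IOPS values are taken as quoted, in thousands. */
static double lvm_overhead_percent(double iops_without_lvm, double iops_with_lvm)
{
	assert(iops_without_lvm > 0.0);
	return (iops_without_lvm - iops_with_lvm) / iops_without_lvm * 100.0;
}
```

With the quoted figures, lvm_overhead_percent(790.0, 754.0) comes to about 4.6 and lvm_overhead_percent(567.0, 457.0) to about 19.4, in line with the rounded 5% and 20% in the table.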

Thanks for putting this summary together.  Unfortunately none of the DM
changes can be queued for 4.8 until Jens takes the 2 block core patches:
https://patchwork.kernel.org/patch/9196021/
https://patchwork.kernel.org/patch/9196019/

Not sure what the holdup and/or issue is with them.  But I've asked
twice (and implicitly a 3rd time here).  Hopefully they land in time for
4.8.

Mike


Re: [PATCH v4] [media] pci: Add tw5864 driver - fixed few style nits, going to resubmit soon

2016-07-12 Thread Andrey Utkin
Found and fixed a few very minor coding style nits; will resubmit in a few days.
Still waiting for comments on v4.

https://github.com/bluecherrydvr/linux/commits/tw5864

commit 31f7c98a144cb3fb8a94662f002d9b6142d1f390
Author: Andrey Utkin 
Date:   Wed Jul 13 05:00:28 2016 +0300

Fix checkpatch --strict issue

 CHECK: Alignment should match open parenthesis
 #3599: FILE: drivers/media/pci/tw5864/tw5864-video.c:539:
 +static int tw5864_fmt_vid_cap(struct file *file, void *priv,
 +   struct v4l2_format *f)

commit 11a09a1048af597ecf374507b08c809eed91b86d
Author: Andrey Utkin 
Date:   Wed Jul 13 04:59:34 2016 +0300

Fix checkpatch --strict issue

 CHECK: Please don't use multiple blank lines
 #3244: FILE: drivers/media/pci/tw5864/tw5864-video.c:184:

commit 861b2ba8593db7abe89291a4ba85976519783f4a
Author: Andrey Utkin 
Date:   Wed Jul 13 04:58:37 2016 +0300

Fix checkpatch --strict issue

 CHECK: No space is necessary after a cast
 #3053: FILE: drivers/media/pci/tw5864/tw5864-util.c:36:
 +   return (u8) tw_readl(TW5864_IND_DATA);


Re: [V3 PATCH 1/2] x86/panic: Replace smp_send_stop() with kdump friendly version

2016-07-12 Thread 'Dave Young'
On 07/12/16 at 02:49am, 河合英宏 / KAWAI,HIDEHIRO wrote:
> Hi Dave,
> 
> Thanks for the comments.
> 
> > From: Dave Young [mailto:dyo...@redhat.com]
> > Sent: Monday, July 11, 2016 5:35 PM
> > 
> > On 07/05/16 at 08:33pm, Hidehiro Kawai wrote:
> > > This patch fixes one of the problems reported by Daniel Walker
> > > (https://lkml.org/lkml/2015/6/24/44).
> > >
> > > If crash_kexec_post_notifiers boot option is specified, other CPUs
> > > are stopped by smp_send_stop() instead of machine_crash_shutdown()
> > > in crash_kexec() path.  This behavior change leads to two problems.
> > >
> > >  Problem 1:
> > >  octeon_generic_shutdown() for MIPS OCTEON assumes that other CPUs are
> > >  still online and try to stop their watchdog timer.  If
> > >  smp_send_stop() is called before octeon_generic_shutdown(), stopping
> > >  watchdog timer will fail because other CPUs have been offlined by
> > >  smp_send_stop().
> > >
> > >panic()
> > >  if crash_kexec_post_notifiers == 1
> > >smp_send_stop()
> > >atomic_notifier_call_chain()
> > >kmsg_dump()
> > >  crash_kexec()
> > >machine_crash_shutdown()
> > >  octeon_generic_shutdown() // shutdown watchdog for ONLINE CPUs
> > >
> > >  Problem 2:
> > >  Most of architectures stop other CPUs in machine_crash_shutdown()
> > >  path, and they also do something needed for kdump.  For example,
> > >  they save registers, disable virtualization extensions, and so on.
> > >  However, if smp_send_stop() stops other CPUs before
> > >  machine_crash_shutdown(), we miss those operations.
> > >
> > > How do we fix these problems?  In the first place, we should stop
> > > other CPUs as soon as possible when panic() was called, otherwise
> > > other CPUs may wipe out a clue to the cause of the failure.  So, we
> > > replace smp_send_stop() with more suitable one for kdump.
> > 
> > We have been avoiding extra things in panic path, but unfortunately
> > crash_kexec_post_notifiers were added. I tend to agree the best place
> > for this stuff is in 2nd kernel or purgatory instead of in 1st kernel.
> 
> Several months ago, I posted a patch set which writes regs to the SEL, generates
> an event to send an SNMP message, and starts/stops the BMC's watchdog timer in
> purgatory.  This feature requires a BMC with a KCS (Keyboard Controller Style)
> I/F, but most enterprise-grade servers would have one.
> (http://thread.gmane.org/gmane.linux.kernel.kexec/15382)
> 
> Doing kmsg_dump things in purgatory wouldn't be suitable (should be done
> in the 2nd kernel before enabling devices and IRQs?)

In theory it is doable; maybe do something like oldmem_kmsg_dump while
/proc/vmcore is initializing?

>  
> > As for this patch I'm not sure it is safe to replace the smp_send_stop
> > with the kdump friendly function. I'm also not sure if the kdump friendly
> > function is safe for kdump. Will glad to hear opinions from other
> > arch experts.
> 
> This stuff depends on architectures, so I speak only about
> x86 (the logic doesn't change on other architectures at this time).
> 
> kdump path with crash_kexec_post_notifiers disabled:
>  panic()
>__crash_kexec()
>  crash_setup_regs()
>  crash_save_vmcoreinfo()
>  machine_crash_shutdown()
>native_machine_crash_shutdown()
>  panic_smp_send_stop() /* mostly same as original 
> * kdump_nmi_shootdown_cpus()
> */
> 
> kdump path with crash_kexec_post_notifiers enabled:
>  panic()
>panic_smp_send_stop()
>__crash_kexec()
>  crash_setup_regs()
>  crash_save_vmcoreinfo()
>  machine_crash_shutdown()
>native_machine_crash_shutdown()
>  panic_smp_send_stop() // do nothing
> 
> The difference is whether other CPUs are stopped before crash_setup_regs()
> and crash_save_vmcoreinfo().  Since crash_setup_regs() and
> crash_save_vmcoreinfo() just save information to some memory area,
> they wouldn't be affected by panic_smp_send_stop().  This means
> placing panic_smp_send_stop() before __crash_kexec() is safe.
> 
> BTW, I noticed my patch breaks Xen kernel.  I'll fix it in the
> next version.

But it does break stuff that depends on CPUs not being disabled,
like problem 1 you mentioned in the patch log.
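
The "do nothing" on the second panic_smp_send_stop() call in the quoted call graph can be pictured as a guard flag. What follows is a toy user-space model written under that assumption; the names ending in _model are hypothetical, and nothing here is taken from the actual patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy user-space model; none of this is the kernel's real code. */
static bool cpus_stopped;   /* set once the other CPUs have been stopped */
static int stop_requests;   /* how many times the stop work actually ran */

/* Hypothetical stand-in for the patch's panic_smp_send_stop(). */
static void panic_smp_send_stop_model(void)
{
	if (cpus_stopped)
		return;         /* second call from the crash path: do nothing */
	cpus_stopped = true;
	stop_requests++;        /* real code would save regs and stop CPUs here */
}

/* Model of panic() followed by native_machine_crash_shutdown(). */
static int panic_path_model(bool crash_kexec_post_notifiers)
{
	cpus_stopped = false;
	stop_requests = 0;
	if (crash_kexec_post_notifiers)
		panic_smp_send_stop_model();    /* early stop in panic() */
	panic_smp_send_stop_model();            /* stop in the crash shutdown path */
	assert(cpus_stopped);   /* other CPUs must be stopped before kexec */
	return stop_requests;
}
```

In this model the stop work runs exactly once on either path; what changes is only how early it runs, which is the point under discussion.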

> 
> > BTW, if one want to use crash_kexec_post_notifiers he should take the
> > risk of unreliable kdump. How about only call smp_send_stop in case no
> > crash_kexec_post_notifiers being used.
> 
> Unlike panic_smp_send_stop()/kdump_nmi_shootdown_cpus(), smp_send_stop()
> for x86 tries to stop other CPUs with normal IPI before issuing NMI IPI.
> This would be because NMI IPI has a risk of deadlock.  We checked if
> the kdump path has a risk of deadlock in the case of NMI panic and fixed
> it.  But I'm not sure about the normal panic path.  I agree with using
> smp_send_stop() if crash_kexec_post_notifiers or kdump is disabled.

What I mean is like below, problem 1 will not exist in this way, but
kdump will be unreliable:
if 

Re: [PATCH] sched/fair: do not announce throttled next buddy in dequeue_task_fair

2016-07-12 Thread Xunlei Pang
On 2016/07/13 at 09:50, Wanpeng Li wrote:
> 2016-07-13 1:25 GMT+08:00  :
>> Konstantin Khlebnikov  writes:
>>
>>> On 11.07.2016 15:12, Xunlei Pang wrote:
>>>> On 2016/07/11 at 17:54, Wanpeng Li wrote:
>>>>> Hi Konstantin, Xunlei,
>>>>> 2016-07-11 16:42 GMT+08:00 Xunlei Pang :
>>>>>> On 2016/07/11 at 16:22, Xunlei Pang wrote:
>>>>>>> On 2016/07/11 at 15:25, Wanpeng Li wrote:
>>>>>>>> 2016-06-16 20:57 GMT+08:00 Konstantin Khlebnikov
>>>>>>>> :
>>>>>>>>> Hierarchy could be already throttled at this point. Throttled next
>>>>>>>>> buddy could trigger null pointer dereference in pick_next_task_fair().
>>>>>>>> There is cfs_rq->next check in pick_next_entity(), so how can null
>>>>>>>> pointer dereference happen?
>>>>>>> I guess it's the following code leading to a NULL se returned:
>>>>>> s/NULL/empty-entity cfs_rq se/
>>>>>>
>>>>>>> pick_next_entity():
>>>>>>>  if (cfs_rq->next && wakeup_preempt_entity(cfs_rq->next, left) < 1)
>>>>>  ^
>>>>> I think this will return false.
>>>> With the wrong throttled_hierarchy(), I think this can happen. But after
>>>> we have the
>>>> corrected throttled_hierarchy() patch, I can't see how it is possible.
>>>>
>>>> dequeue_task_fair():
>>>>  if (task_sleep && parent_entity(se))
>>>>  set_next_buddy(parent_entity(se));
>>>>
>>>> How does dequeue_task_fair() with DEQUEUE_SLEEP set(true task_sleep)
>>>> happen to a throttled hierarchy?
>>>> IOW, a task belongs to a throttled hierarchy is running?
>>>>
>>>> Maybe Konstantin knows the reason.
>>> This function (dequeue_task_fair) check throttling but at point it could 
>>> skip several
>>> levels and announce as next buddy actually throttled entry.
>>> Probably this bug hadn't happened but this's really hard to prove that this 
>>> is impossible.
>>> ->set_curr_task(), PI-boost or some tricky migration in balancer could 
>>> break this easily.
>> sched_setscheduler can call put_prev_task, which then can cause a
>> throttle outside of __schedule(), then the task blocks normally and
>> deactivate_task(DEQUEUE_SLEEP) happens and you lose.
> The cfs_rq_throttled() check in dequeue_task_fair() will capture the
> cfs_rq which is throttled in sched_setscheduler::put_prev_task path,
> so nothing lost, where I miss?

cfs_rq_throttled() returns false for child cgroups in the throttled hierarchy,
so throttled_hierarchy() should be relied on in such cases.
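
To make the distinction concrete, here is a toy model of the two checks. It assumes, as I understand the scheduler code, that a group keeps both a 'throttled' flag for itself and a 'throttle_count' that is bumped for every group below a throttled one; the toy_ names are hypothetical and this is not kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a per-group CFS runqueue; not the kernel's struct cfs_rq. */
struct toy_cfs_rq {
	struct toy_cfs_rq *parent;
	int throttled;          /* this group itself ran out of quota */
	int throttle_count;     /* throttled groups at or above this one */
};

/* Only looks at the group itself. */
static int toy_cfs_rq_throttled(const struct toy_cfs_rq *rq)
{
	return rq->throttled;
}

/* Also true when any ancestor is throttled. */
static int toy_throttled_hierarchy(const struct toy_cfs_rq *rq)
{
	return rq->throttle_count > 0;
}

/* Throttle 'rq' and propagate the count to the listed descendants. */
static void toy_throttle(struct toy_cfs_rq *rq,
			 struct toy_cfs_rq **descendants, size_t n)
{
	rq->throttled = 1;
	rq->throttle_count++;
	for (size_t i = 0; i < n; i++) {
		assert(descendants[i]->parent != NULL);
		descendants[i]->throttle_count++;
	}
}
```

After throttling a parent, the child reports 0 from toy_cfs_rq_throttled() but nonzero from toy_throttled_hierarchy(), which is the case described above.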

Regards,
Xunlei


Re: [PATCH v3 3/9] DocBook/v4l: Add compressed video formats used on MT8173 codec driver

2016-07-12 Thread tiffany lin
Hi Nicolas,

On Tue, 2016-07-12 at 15:14 -0400, Nicolas Dufresne wrote:
> On Tuesday, 12 July 2016 at 15:08 -0400, Nicolas Dufresne wrote:
> > On Tuesday, 12 July 2016 at 16:16 +0800, Wu-Cheng Li (李務誠) wrote:
> > > Decoder hardware produces MT21 (compressed). Image processor can
> > > convert it to a format that can be input of display driver.
> > > Tiffany.
> > > When do you plan to upstream image processor (mtk-mdp)?
> > > > 
> > > > It can be as input format for encoder, MDP and display drivers in
> > > our
> > > > platform.
> > > I remember display driver can only accept uncompressed MT21. Right?
> > > Basically V4L2_PIX_FMT_MT21 is compressed and is like an opaque
> > > format. It's not usable until it's decompressed and converted by
> > > image
> > > processor.
> > 
> > Previously it was described as MediaTek block mode, and now as a
> > MediaTek compressed format. It makes me think you have no idea what
> > this pixel format really is. Is that right ?
> > 
> > The main reason why I keep asking, is that we often find similarities
> > between what vendor like to call their proprietary formats. Doing the
> > proper research helps not creating a mess like in Android where you
> > have a lot of formats that all point to the same format. I believe
> > there was the same concern when Samsung wanted to introduce their Z-
> > flip-Z NV12 tile format. In the end they simply provided sufficient
> > documentation so we could document it and implement software
> > converters
> > for test and validation purpose.
> 
> Here's the kind of information we want in the documentation.
> 
> https://chromium.googlesource.com/chromium/src/media/+/master/base/vide
> o_types.h#40
> 
>   // MediaTek proprietary format. MT21 is similar to NV21 except the memory
>   // layout and pixel layout (swizzles). 12bpp with Y plane followed by a 2x2
>   // interleaved VU plane. Each image contains two buffers -- Y plane and VU
>   // plane. Two planes can be non-contiguous in memory. The starting addresses
>   // of Y plane and VU plane are 4KB alignment.
>   // Suppose image dimension is (width, height). For both Y plane and VU 
> plane:
>   // Row pitch = ((width+15)/16) * 16.
>   // Plane size = Row pitch * (((height+31)/32)*32)
> 
> Now obviously this is incomplete, as the swizzling need to be documented of 
> course.
> 
Because it is ultimately a compressed format produced by our codec hardware,
we cannot describe its swizzling.
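
The pitch and plane-size arithmetic from the quoted Chromium comment can still be written down independently of the swizzling. This is a minimal sketch assuming plain integer division; the function names are hypothetical and do not come from any driver.

```c
#include <assert.h>
#include <stddef.h>

/* Round 'value' up to the next multiple of 'align' (align > 0). */
static size_t round_up_to(size_t value, size_t align)
{
	assert(align > 0);
	return (value + align - 1) / align * align;
}

/* Row pitch = ((width+15)/16) * 16, per the quoted comment. */
static size_t mt21_row_pitch(size_t width)
{
	return round_up_to(width, 16);
}

/* Plane size = row pitch * (((height+31)/32)*32), which the quoted
 * comment applies to both the Y plane and the VU plane. */
static size_t mt21_plane_size(size_t width, size_t height)
{
	return mt21_row_pitch(width) * round_up_to(height, 32);
}
```

For a 1920x1080 image this gives a row pitch of 1920 and 1088 padded rows, so a plane size of 2088960.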

best regards,
Tiffany

> > 
> > regards,
> > Nicolas




Re: [v5 PATCH 1/5] extcon: Add Type-C and DP support

2016-07-12 Thread Chanwoo Choi
Hi Chris,

On 2016-07-13 10:39, Chris Zhong wrote:
> Hi Chanwoo Choi
> 
> 
> On 07/13/2016 09:11 AM, Chanwoo Choi wrote:
>> Hi Chris,
>>
>> I'm now developing the extcon property on extcon-test branch.
>> But, it has not been completed.
>>
>> On next version, I'll remove the notification about extcon property
>> and only support the following two functions.
>> - extcon_set_cable_property()
>> - extcon_get_cable_property()
>>
>> Because the number of properties will grow and all properties
>> depend on a specific external connector (e.g., EXTCON_PROP_USB_VBUS
>> depends on the EXTCON_TYPE_USB type). When that external connector
>> is detached, the extcon framework should reset the property to its default
>> state.
> 
> Yes, I think getting the notification from cable state is enough, actually I 
> am using it like you said.

OK. 

> 
>>
>> It may send too many notifications for extcon properties.
>> For example, assume that EXTCON_TYPE_USB has over 20 properties:
>> when EXTCON_USB or EXTCON_USB_HOST is detached, extcon would have to send
>> a notification for each of those properties plus one more notification
>> for the state of the external connector.
>>
>> So, I'll send the RFC patchset without the property notification.
>>
>> Lastly,
>> I have a comment on below.
>>
>> Thanks,
>> Chanwoo Choi
>>
>> On 2016-07-13 00:09, Chris Zhong wrote:
>>> Add EXTCON_DISP_DP for the Display external connector. For Type-C
>>> connector the DisplayPort can work as an Alternate Mode(VESA DisplayPort
>>> Alt Mode on USB Type-C Standard). The Type-C support both normal and
>>> flipped orientation, so add a property to extcon.
>>>
>>> Signe-off-by: Chris Zhong 
>>>
>>> Signed-off-by: Chris Zhong 
>>> ---
>>>
>>> Changes in v5:
>>> - support get property
>>>
>>> Changes in v4: None
>>> Changes in v3: None
>>> Changes in v2: None
>>> Changes in v1: None
>>>
>>>   drivers/extcon/extcon.c | 28 
>>>   include/linux/extcon.h  | 13 +
>>>   2 files changed, 41 insertions(+)
>>>
>>> diff --git a/drivers/extcon/extcon.c b/drivers/extcon/extcon.c
>>> index a1117db..2591b28 100644
>>> --- a/drivers/extcon/extcon.c
>>> +++ b/drivers/extcon/extcon.c
>>> @@ -157,6 +157,11 @@ struct __extcon_info {
>>>   .id = EXTCON_DISP_VGA,
>>>   .name = "VGA",
>>>   },
>>> +[EXTCON_DISP_DP] = {
>>> +.type = EXTCON_TYPE_DISP,
>>> +.id = EXTCON_DISP_DP,
>>> +.name = "DP",
>>> +},
>>> /* Miscellaneous external connector */
>>>   [EXTCON_DOCK] = {
>>> @@ -270,6 +275,7 @@ static bool is_extcon_property_supported(unsigned int 
>>> id,
>>>   switch (prop) {
>>>   case EXTCON_PROP_USB_ID:
>>>   case EXTCON_PROP_USB_VBUS:
>>> +case EXTCON_PROP_TYPEC_POLARITY:
>>>   return true;
>>>   default:
>>>   break;
>>> @@ -286,6 +292,8 @@ static bool is_extcon_property_supported(unsigned int 
>>> id,
>>>   }
>>>   case EXTCON_TYPE_DISP:
>>>   switch (prop) {
>>> +case EXTCON_PROP_TYPEC_POLARITY:
>> Should the EXTCON_PROP_TYPEC_POLARITY property be added to both EXTCON_TYPE_USB and
>> EXTCON_TYPE_DISP?
>> Isn't EXTCON_PROP_TYPEC_POLARITY a property of USB Type-C?
> 
> It is for USB Type-C, but in DisplayPort Alt Mode both EXTCON_USB and
> EXTCON_USB_HOST may be detached. Is setting a property on a detached cable
> supported? If so, I think moving this case to EXTCON_USB is fine.

A single physical connector can set the state of more than one extcon
connector when it supports several functions.
For example, take EXTCON_USB and EXTCON_CHG_USB_SDP:
the existing extcon drivers [1] (e.g., max14577, max77693) set the state of
both EXTCON_USB and EXTCON_CHG_USB_SDP at the same time
when a USB cable is attached, because in this case the USB connector is used for
both power supply (EXTCON_CHG_USB_SDP) and data transfer (EXTCON_USB).
[1]
https://git.kernel.org/cgit/linux/kernel/git/chanwoo/extcon.git/commit/?h=extcon-next&id=8b45b6a0741678902810d7be95e635c210fbb198

DP Alt Mode uses USB Type-C. So, when a USB Type-C connector is attached
for DP Alt Mode,
maybe you can set the following two connector states and one property:
- extcon_set_cable_state(edev, [EXTCON_USB or EXTCON_USB_HOST], 1);
- extcon_set_cable_state(edev, EXTCON_DISP_DP, 1);
- extcon_set_cable_property(edev, [EXTCON_USB or EXTCON_USB_HOST],
EXTCON_PROP_TYPEC_POLARITY, 0 or 1);
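
The sequence above, together with the rule that a property returns to its default when its connector is detached, can be sketched as a small user-space model. Everything here is hypothetical (the toy_ names are not the extcon API), and the model assumes a property can only be set on an attached connector, which this thread leaves open.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical connector ids; not the extcon API. */
enum toy_cable { TOY_USB, TOY_USB_HOST, TOY_DISP_DP, TOY_CABLE_MAX };

struct toy_extcon {
	bool attached[TOY_CABLE_MAX];
	int typec_polarity[TOY_CABLE_MAX];      /* 0 = normal (default), 1 = flipped */
};

/* Attach or detach a connector; detaching resets its property to default. */
static void toy_set_cable_state(struct toy_extcon *e, enum toy_cable id, bool on)
{
	assert(id < TOY_CABLE_MAX);
	e->attached[id] = on;
	if (!on)
		e->typec_polarity[id] = 0;
}

/* Set the polarity property; refused (returns false) when detached. */
static bool toy_set_polarity(struct toy_extcon *e, enum toy_cable id, int flipped)
{
	assert(id < TOY_CABLE_MAX);
	if (!e->attached[id])
		return false;
	e->typec_polarity[id] = flipped ? 1 : 0;
	return true;
}
```

A consumer in this model only needs the cable-state change to know that the polarity value is no longer valid, which matches the plan to drop per-property notifications.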

Thanks,
Chanwoo Choi



Re: [PATCH v6 1/2] Documentation: bindings: add dt doc for Rockchip PCIe controller

2016-07-12 Thread Brian Norris
Hi,

On Wed, Jul 13, 2016 at 09:45:43AM +0800, Shawn Lin wrote:
> On 2016/7/13 9:31, Brian Norris wrote:
> >On Wed, Jul 13, 2016 at 09:10:15AM +0800, Shawn Lin wrote:
> >At some level, it's a matter of preference. But when you're talking
> >about the rk3399 PCIe "interrupt controller" domain, it seems that you
> >should be talking about HW bits in the controller -- i.e., you have a
> >4-bit interrupt status bitfield, that we typically call [0:3]. If you
> >use [1:4], then you have to remember to subtract 1 mentally when mapping
> >to the actual HW bit. I believe that confusion (since bitfields normally
> >count from 0) might have helped cause the infinite loop bug I noticed
> >too. And I also think that counting from 0 helps clarify the fact that
> >your interrupt controller indexing is an independent numbering from the
> >PCI interrupt numbering, even though they happen to map 1:1.
> 
> If that is how we should number our index base, we should
> probably start it from 5, as the layout of INTx is
> PCIE_CLIENT_INT_STATUS[5:8]... ?

Possibly better than starting from 1, but IMO also doesn't make sense,
because the other bits aren't interrupts you want to translate on behalf
of other devices (are they?) -- they're interrupt bits consumed by the
host controller itself. (If they are possibly needed for translation,
then sure, index the entire status register and handle it in the driver,
and start the INTx mapping from 5 here.)

[...]

> >If you still think it makes more sense to count from 1, then I won't
> >stop you.
> 
> I don't have a hard opinion for the index base as I think it's trivial.

It's simple, but I think it influenced code understanding and bugginess.

> So if it's more sensible to you, I will apply your suggestion.

Well, I was just offering my opinion. I think it makes more sense, but
maybe it doesn't to you.

Brian


Re: [PATCH v3 3/9] DocBook/v4l: Add compressed video formats used on MT8173 codec driver

2016-07-12 Thread tiffany lin
On Tue, 2016-07-12 at 16:16 +0800, Wu-Cheng Li (李務誠) wrote:
> On Mon, Jul 11, 2016 at 10:56 AM, tiffany lin  
> wrote:
> > Hi Hans,
> >
> > On Fri, 2016-07-08 at 12:23 +0200, Hans Verkuil wrote:
> >> On 05/30/2016 02:29 PM, Tiffany Lin wrote:
> >> > Add V4L2_PIX_FMT_MT21 documentation
> >> >
> >> > Signed-off-by: Tiffany Lin 
> >> > ---
> >> >  Documentation/DocBook/media/v4l/pixfmt.xml |6 ++
> >> >  1 file changed, 6 insertions(+)
> >> >
> >> > diff --git a/Documentation/DocBook/media/v4l/pixfmt.xml 
> >> > b/Documentation/DocBook/media/v4l/pixfmt.xml
> >> > index 5a08aee..d40e0ce 100644
> >> > --- a/Documentation/DocBook/media/v4l/pixfmt.xml
> >> > +++ b/Documentation/DocBook/media/v4l/pixfmt.xml
> >> > @@ -1980,6 +1980,12 @@ array. Anything what's in between the UYVY lines 
> >> > is JPEG data and should be
> >> >  concatenated to form the JPEG stream. 
> >> >  
> >> >   
> >> > + 
> >> > +   V4L2_PIX_FMT_MT21
> >> > +   'MT21'
> >> > +   Compressed two-planar YVU420 format used by Mediatek 
> >> > MT8173
> >> > +   codec driver.
> >>
> >> Can you give a few more details? The encoder driver doesn't seem to 
> >> produce this
> >> format, so who is creating this? Where is this format documented?
> Decoder hardware produces MT21 (compressed). Image processor can
> convert it to a format that can be input of display driver. Tiffany.
> When do you plan to upstream image processor (mtk-mdp)?
> >
We are working on this. Will upstream soon.

> > It can be as input format for encoder, MDP and display drivers in our
> > platform.
> I remember display driver can only accept uncompressed MT21. Right?
> Basically V4L2_PIX_FMT_MT21 is compressed and is like an opaque
> format. It's not usable until it's decompressed and converted by image
> processor.
That's right in MT8173 platform.

best regards,
Tiffany
> > This private format is only available in our platform.
> > So I put it in "Reserved Format Identifiers" sections.
> >
> >
> > best regards,
> > Tiffany
> >
> >> Regards,
> >>
> >>   Hans
> >>
> >> > + 
> >> > 
> >> >
> >> >  
> >> >
> >
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-media" in
> > the body of a message to majord...@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html




Re: [PATCH v3 3/9] DocBook/v4l: Add compressed video formats used on MT8173 codec driver

2016-07-12 Thread tiffany lin
Hi Nicolas,

On Tue, 2016-07-12 at 15:08 -0400, Nicolas Dufresne wrote:
> On Tuesday, 12 July 2016 at 16:16 +0800, Wu-Cheng Li (李務誠) wrote:
> > Decoder hardware produces MT21 (compressed). Image processor can
> > convert it to a format that can be input of display driver. Tiffany.
> > When do you plan to upstream image processor (mtk-mdp)?
> > >
> > > It can be as input format for encoder, MDP and display drivers in
> > our
> > > platform.
> > I remember display driver can only accept uncompressed MT21. Right?
> > Basically V4L2_PIX_FMT_MT21 is compressed and is like an opaque
> > format. It's not usable until it's decompressed and converted by
> > image
> > processor.
> 
> Previously it was described as MediaTek block mode, and now as a
> MediaTek compressed format. It makes me think you have no idea what
> this pixel format really is. Is that right ?
> 
That's not right.
It's a compressed format, as I documented in "[PATCH v3 3/9] DocBook/v4l:
Add compressed video formats used on MT8173 codec driver."
On the MT8173 platform, when using this format, we need the Image Processor to
convert it to a standard format, as Wu-Cheng mentioned.
To avoid this ambiguity, I will change it to V4L2_PIX_FMT_M21C, which
indicates that it is compressed data. Is that OK?

best regards,
Tiffany

> The main reason why I keep asking, is that we often find similarities
> between what vendor like to call their proprietary formats. Doing the
> proper research helps not creating a mess like in Android where you
> have a lot of formats that all point to the same format. I believe
> there was the same concern when Samsung wanted to introduce their Z-
> flip-Z NV12 tile format. In the end they simply provided sufficient
> documentation so we could document it and implement software converters
> for test and validation purpose.
> 
> regards,
> Nicolas




Re: [PATCH v4] [media] pci: Add tw5864 driver

2016-07-12 Thread Andrey Utkin
On Mon, Jul 11, 2016 at 09:40:53AM -0700, Joe Perches wrote:
> Each of these blocks will start with the dev_ prefix
> and the subsequent lines will not have the same prefix

Yes. I checked how it looks before submitting, but I didn't see
this as a problem. I don't mind changing that (I have found a few
micro-issues with checkpatch --strict anyway and would like to resubmit), but
I would like to hear a second opinion.

> It also might be better to issue something like a single
> line dev_warn referring to the driver code and just leave
> this comment in the driver sources.
> 
> Something like:
> 
>   dev_warn(_dev->dev,
>   "This driver has known defects in video quality\n");

Things get complicated if you consider mainstream distros and their
years-behind kernels. The simplest way to preserve correspondence
between state of driver and such notice is to contain the notice in the
compiled driver. I hope the state of affairs will change to better
someday :)


Re: [PATCH] sched/fair: do not announce throttled next buddy in dequeue_task_fair

2016-07-12 Thread Wanpeng Li
2016-07-13 1:25 GMT+08:00  :
> Konstantin Khlebnikov  writes:
>
>> On 11.07.2016 15:12, Xunlei Pang wrote:
>>> On 2016/07/11 at 17:54, Wanpeng Li wrote:
>>>> Hi Konstantin, Xunlei,
>>>> 2016-07-11 16:42 GMT+08:00 Xunlei Pang :
>>>>> On 2016/07/11 at 16:22, Xunlei Pang wrote:
>>>>>> On 2016/07/11 at 15:25, Wanpeng Li wrote:
>>>>>>> 2016-06-16 20:57 GMT+08:00 Konstantin Khlebnikov
>>>>>>> :
>>>>>>>> Hierarchy could be already throttled at this point. Throttled next
>>>>>>>> buddy could trigger null pointer dereference in pick_next_task_fair().
>>>>>>> There is cfs_rq->next check in pick_next_entity(), so how can null
>>>>>>> pointer dereference happen?
>>>>>> I guess it's the following code leading to a NULL se returned:
>>>>> s/NULL/empty-entity cfs_rq se/
>>>>>
>>>>>> pick_next_entity():
>>>>>>  if (cfs_rq->next && wakeup_preempt_entity(cfs_rq->next, left) < 1)
>>>>  ^
>>>> I think this will return false.
>>>
>>> With the wrong throttled_hierarchy(), I think this can happen. But after we 
>>> have the
>>> corrected throttled_hierarchy() patch, I can't see how it is possible.
>>>
>>> dequeue_task_fair():
>>>  if (task_sleep && parent_entity(se))
>>>  set_next_buddy(parent_entity(se));
>>>
>>> How does dequeue_task_fair() with DEQUEUE_SLEEP set(true task_sleep) happen 
>>> to a throttled hierarchy?
>>> IOW, a task belongs to a throttled hierarchy is running?
>>>
>>> Maybe Konstantin knows the reason.
>>
>> This function (dequeue_task_fair) check throttling but at point it could 
>> skip several
>> levels and announce as next buddy actually throttled entry.
>> Probably this bug hadn't happened but this's really hard to prove that this 
>> is impossible.
>> ->set_curr_task(), PI-boost or some tricky migration in balancer could break 
>> this easily.
>
> sched_setscheduler can call put_prev_task, which then can cause a
> throttle outside of __schedule(), then the task blocks normally and
> deactivate_task(DEQUEUE_SLEEP) happens and you lose.

The cfs_rq_throttled() check in dequeue_task_fair() will capture the
cfs_rq which is throttled in sched_setscheduler::put_prev_task path,
so nothing lost, where I miss?

Regards,
Wanpeng Li


Re: [PATCH v6 1/2] Documentation: bindings: add dt doc for Rockchip PCIe controller

2016-07-12 Thread Shawn Lin

On 2016/7/13 9:31, Brian Norris wrote:

Hi Shawn,

On Wed, Jul 13, 2016 at 09:10:15AM +0800, Shawn Lin wrote:

On 2016/7/7 8:39, Brian Norris wrote:

On Wed, Jul 06, 2016 at 03:16:37PM +0800, Shawn Lin wrote:

+   #interrupt-cells = <1>;
+   interrupt-map-mask = <0 0 0 7>;
+   interrupt-map = <0 0 0 1 &pcie0_intc 1>,
+   <0 0 0 2 &pcie0_intc 2>,
+   <0 0 0 3 &pcie0_intc 3>,
+   <0 0 0 4 &pcie0_intc 4>;


I'm a little lost on this one, so forgive my ignorance; how did you
determine the last value in each entry (i.e., the 1, 2, 3, and 4 IRQ
numbers for pcie0_intc)? IIUC, those are supposed to represent indices
into the IRQ status register found in the PCIe interrupt status
register, and so they should be 0-based (i.e., 0, 1, 2, 3). And then
you'd have:

interrupt-map = <0 0 0 1 &pcie0_intc 0>,
<0 0 0 2 &pcie0_intc 1>,
<0 0 0 3 &pcie0_intc 2>,
<0 0 0 4 &pcie0_intc 3>;

But then, I never got this sub-node binding to work quite right, so I
may be missing something.

EDIT: ooh, I see what's going on! I'll comment on the driver as well,
but it looks like you're translating the register status to a HW IRQ
number with 'ffs(reg)', which yields a 1-based index. I think it is most
sensible to use a 0-based index (i.e., 'ffs(reg) - 1'). Now, that only
will work if you get the whole interrupt-map + interrupt-controller
thing right (i.e., using a subnode for the interrupt controller) --
otherwise, IRQ mapping might not work right. I suspect that's one reason
the original driver writer might have used 1-based indexing in the first
place.
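
The 1-based versus 0-based point can be illustrated in user space with the POSIX ffs() from <strings.h>, which, like the kernel helper of the same name, returns the 1-based position of the lowest set bit. This is only a sketch: the 4-bit status field and the function names are assumptions for illustration, not the driver's code.

```c
#include <assert.h>
#include <strings.h>    /* POSIX ffs() */

/* ffs() gives the 1-based position of the lowest set bit, or 0 if none. */
static int hwirq_one_based(unsigned int status)
{
	assert((status & ~0xFu) == 0);  /* model a 4-bit INTx status field */
	return ffs((int)status);
}

/* 0-based variant: bit 0 maps to hwirq 0; -1 means nothing is pending. */
static int hwirq_zero_based(unsigned int status)
{
	assert((status & ~0xFu) == 0);
	return ffs((int)status) - 1;
}
```

With the 1-based form, a status of 0x1 maps to hwirq 1 and matches the first interrupt-map above; with the 0-based form it maps to hwirq 0 and matches the second one.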


Yes, I got it, but what's the difference?


At some level, it's a matter of preference. But when you're talking
about the rk3399 PCIe "interrupt controller" domain, it seems that you
should be talking about HW bits in the controller -- i.e., you have a
4-bit interrupt status bitfield, that we typically call [0:3]. If you
use [1:4], then you have to remember to subtract 1 mentally when mapping
to the actual HW bit. I believe that confusion (since bitfields normally
count from 0) might have helped cause the infinite loop bug I noticed
too. And I also think that counting from 0 helps clarify the fact that
your interrupt controller indexing is an independent numbering from the
PCI interrupt numbering, even though they happen to map 1:1.


If that's how we should number our index base, we should
probably start it from 5, as the layout of INTx is
PCIE_CLIENT_INT_STATUS[5:8]... ?



But then, PCI INTx numbering is kinda weird already, as it starts from
1. So maybe it's just as valid to say our domain starts from 1 as well.


You still need to get the whole interrupt-map + interrupt-controller
thing right, and the code (ffs(reg) - 1), if your suggestion is applied.


Yes, of course. And I already sent you patches that do that.


Looking at most of the docs for PCIe bindings, I saw they also use a
0-based index; what about those?


I don't know which ones you're referring to. I see that altera-pcie.txt
supports interrupt indices counting from 1, but that's probably because
they're using the same broken binding that was in your ~v3 patches
(where the pcie node has both 'interrupt-controller' and
'interrupt-map', with phandles to itself), so they had no other choice.

If you still think it makes more sense to count from 1, then I won't
stop you.


I don't have a strong opinion on the index base, as I think it's trivial.
So if it's more sensible to you, I will apply your suggestion.



Regards,
Brian






--
Best Regards
Shawn Lin



Re: [v5 PATCH 1/5] extcon: Add Type-C and DP support

2016-07-12 Thread Chris Zhong

Hi Chanwoo Choi


On 07/13/2016 09:11 AM, Chanwoo Choi wrote:

Hi Chris,

I'm now developing the extcon property on extcon-test branch.
But, it has not been completed.

On next version, I'll remove the notification about extcon property
and only support the following two functions.
- extcon_set_cable_property()
- extcon_get_cable_property()

This is because the number of properties will grow, and every property
depends on a specific external connector (e.g., EXTCON_PROP_USB_VBUS
depends on the EXTCON_TYPE_USB type). When that external connector
is detached, the extcon framework should reset the property state to its default.


Yes, I think getting the notification from cable state is enough, 
actually I am using it like you said.




That could send too many notifications for extcon properties.
For example, assume that EXTCON_TYPE_USB has over 20 properties:
when EXTCON_USB or EXTCON_USB_HOST is detached, extcon would have to send
a notification for each of those 20+ properties, plus one more notification
for the state of the external connector.

So, I'll send the RFC patchset without property notifications.

Lastly,
I have a comment on below.

Thanks,
Chanwoo Choi

On 2016-07-13 00:09, Chris Zhong wrote:

Add EXTCON_DISP_DP for the Display external connector. For a Type-C
connector, DisplayPort can work as an Alternate Mode (VESA DisplayPort
Alt Mode on USB Type-C Standard). Type-C supports both normal and
flipped orientation, so add a property to extcon.

Signed-off-by: Chris Zhong 
---

Changes in v5:
- support get property

Changes in v4: None
Changes in v3: None
Changes in v2: None
Changes in v1: None

  drivers/extcon/extcon.c | 28 
  include/linux/extcon.h  | 13 +
  2 files changed, 41 insertions(+)

diff --git a/drivers/extcon/extcon.c b/drivers/extcon/extcon.c
index a1117db..2591b28 100644
--- a/drivers/extcon/extcon.c
+++ b/drivers/extcon/extcon.c
@@ -157,6 +157,11 @@ struct __extcon_info {
.id = EXTCON_DISP_VGA,
.name = "VGA",
},
+   [EXTCON_DISP_DP] = {
+   .type = EXTCON_TYPE_DISP,
+   .id = EXTCON_DISP_DP,
+   .name = "DP",
+   },
  
  	/* Miscellaneous external connector */

[EXTCON_DOCK] = {
@@ -270,6 +275,7 @@ static bool is_extcon_property_supported(unsigned int id,
switch (prop) {
case EXTCON_PROP_USB_ID:
case EXTCON_PROP_USB_VBUS:
+   case EXTCON_PROP_TYPEC_POLARITY:
return true;
default:
break;
@@ -286,6 +292,8 @@ static bool is_extcon_property_supported(unsigned int id,
}
case EXTCON_TYPE_DISP:
switch (prop) {
+   case EXTCON_PROP_TYPEC_POLARITY:

Should the EXTCON_PROP_TYPEC_POLARITY property be added to both EXTCON_TYPE_USB and
EXTCON_TYPE_DISP?
Isn't EXTCON_PROP_TYPEC_POLARITY a property of USB Type-C?


It is for USB Type-C, but in DisplayPort alt mode both EXTCON_USB and
EXTCON_USB_HOST may be detached. Is setting a property on a detached cable
supported? If so, I think moving this case to EXTCON_USB is fine.


Thanks
Chris



+   return true;
default:
break;
}
@@ -547,6 +555,26 @@ int extcon_get_cable_property(struct extcon_dev *edev, 
unsigned int id,
enum extcon_property prop,
union extcon_property_value *val)
  {
+   struct extcon_cable *cable;
+   int index;
+
+   if (!edev)
+   return -EINVAL;
+
+   /* Check the property whether is supported or not */
+   if (!is_extcon_property_supported(id, prop))
+   return -EINVAL;
+
+   /* Find the cable index of external connector by using id */
+   index = find_cable_index_by_id(edev, id);
+   if (index < 0)
+   return index;
+
+   /* Store the property value */
+   cable = &edev->cables[index];
+
+   val->intval = cable->propval[prop].intval;
+
return 0;
  }

Once I finish developing get_cable_property, I'll send an RFC patchset.
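
To make the get/set discussion above concrete, here is a minimal user-space sketch of per-cable property storage. It is not the extcon implementation: every name in it (cable_type, cable_prop, prop_value, cable_set_property, cable_get_property, cable_detach) is a hypothetical stand-in, and the assumption that a detached cable falls back to a default value of 0 is only one reading of the "default state" mentioned earlier in the thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* All names below are illustrative stand-ins, not the kernel's definitions. */
enum cable_type { TYPE_USB, TYPE_DISP };
enum cable_prop { PROP_USB_VBUS, PROP_TYPEC_POLARITY, PROP_MAX };

union prop_value {
	int intval;
};

struct cable {
	enum cable_type type;
	bool attached;
	union prop_value propval[PROP_MAX];
};

/* Which property is valid for which connector type (mirrors the patch's switch). */
static bool prop_supported(enum cable_type type, enum cable_prop prop)
{
	switch (type) {
	case TYPE_USB:
		return prop == PROP_USB_VBUS || prop == PROP_TYPEC_POLARITY;
	case TYPE_DISP:
		return prop == PROP_TYPEC_POLARITY;
	}
	return false;
}

static int cable_set_property(struct cable *c, enum cable_prop prop,
			      union prop_value val)
{
	if (!c || prop >= PROP_MAX || !prop_supported(c->type, prop))
		return -EINVAL;
	c->propval[prop] = val;
	return 0;
}

static int cable_get_property(const struct cable *c, enum cable_prop prop,
			      union prop_value *val)
{
	if (!c || !val || prop >= PROP_MAX || !prop_supported(c->type, prop))
		return -EINVAL;
	*val = c->propval[prop];
	return 0;
}

/* Assumption: on detach, every property falls back to its default (0 here). */
static void cable_detach(struct cable *c)
{
	assert(c != NULL);
	c->attached = false;
	memset(c->propval, 0, sizeof(c->propval));
}
```

With this shape a consumer only needs the cable-state notification: when it fires, the consumer reads whatever properties it cares about, and no per-property notification is required.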

  
diff --git a/include/linux/extcon.h b/include/linux/extcon.h

index f6f0a8d..50ef87f 100644
--- a/include/linux/extcon.h
+++ b/include/linux/extcon.h
@@ -77,6 +77,7 @@ enum extcon_type {
  #define EXTCON_DISP_MHL   41  /* Mobile High-Definition Link */
  #define EXTCON_DISP_DVI   42  /* Digital Visual Interface */
  #define EXTCON_DISP_VGA   43  /* Video Graphics Array */
+#define EXTCON_DISP_DP 44  /* DisplayPort */
  
  /* Miscellaneous external connector */

  #define EXTCON_DOCK   60
@@ -108,9 +109,13 @@ enum extcon_property {
 * - EXTCON_PROP_USB_USB
 * @type:   integer (int value)
 * @value:  0 (low) or 1 (high)
+* 

[lkp] [mm, kasan] 7392becb25: BUG: KASAN: slab-out-of-bounds in bucket_table_alloc+0x79/0x1a0 at addr ffff88003e400000

2016-07-12 Thread kernel test robot
+0xb0/0xb0
[   22.807662]  [] ? __kthread_parkme+0xb0/0xb0
[   22.810556] Object at 88003e40, in cache kmalloc-4194304
[   22.810556] Object at 88003e40, in cache kmalloc-4194304
[   22.813231] Memory state around the buggy address:


FYI, raw QEMU command line is:

qemu-system-x86_64 -enable-kvm -cpu Haswell,+smep,+smap -kernel 
/pkg/linux/x86_64-randconfig-s2-07120443/gcc-6/7392becb255cd6c0e7bedaabd58f638b732772f2/vmlinuz-4.7.0-rc6-1-g7392bec
 -append 'root=/dev/ram0 user=lkp 
job=/lkp/scheduled/vm-kbuild-1G-5/bisect_boot-1-debian-x86_64-2015-02-07.cgz-x86_64-randconfig-s2-07120443-7392becb255cd6c0e7bedaabd58f638b732772f2-20160712-21427-xipcnl-0.yaml
 ARCH=x86_64 kconfig=x86_64-randconfig-s2-07120443 
branch=linux-devel/devel-spot-201607120350 
commit=7392becb255cd6c0e7bedaabd58f638b732772f2 
BOOT_IMAGE=/pkg/linux/x86_64-randconfig-s2-07120443/gcc-6/7392becb255cd6c0e7bedaabd58f638b732772f2/vmlinuz-4.7.0-rc6-1-g7392bec
 max_uptime=600 
RESULT_ROOT=/result/boot/1/vm-kbuild-1G/debian-x86_64-2015-02-07.cgz/x86_64-randconfig-s2-07120443/gcc-6/7392becb255cd6c0e7bedaabd58f638b732772f2/0
 LKP_SERVER=inn earlyprintk=ttyS0,115200 systemd.log_level=err debug apic=debug 
sysrq_always_enabled rcupdate.rcu_cpu_stall_timeout=100 panic=-1 
softlockup_panic=1 nmi_watchdog=panic oops=panic load_ramdisk=2 
prompt_ramdisk=0 console=ttyS0,115200 console=tty0 vga=normal rw 
ip=vm-kbuild-1G-5::dhcp'  -initrd /fs/sdg1/initrd-vm-kbuild-1G-5 -m 1024 
-smp 2 -device e1000,netdev=net0 -netdev user,id=net0,hostfwd=tcp::23004-:22 
-boot order=nc -no-reboot -watchdog i6300esb -rtc base=localtime -device 
virtio-scsi-pci,id=scsi0 -drive 
file=/fs/sdg1/disk0-vm-kbuild-1G-5,if=none,id=hd0,media=disk,aio=native,cache=none
 -device scsi-hd,bus=scsi0.0,drive=hd0,scsi-id=1,lun=0 -drive 
file=/fs/sdg1/disk1-vm-kbuild-1G-5,if=none,id=hd1,media=disk,aio=native,cache=none
 -device scsi-hd,bus=scsi0.0,drive=hd1,scsi-id=1,lun=1 -drive 
file=/fs/sdg1/disk2-vm-kbuild-1G-5,if=none,id=hd2,media=disk,aio=native,cache=none
 -device scsi-hd,bus=scsi0.0,drive=hd2,scsi-id=1,lun=2 -drive 
file=/fs/sdg1/disk3-vm-kbuild-1G-5,if=none,id=hd3,media=disk,aio=native,cache=none
 -device scsi-hd,bus=scsi0.0,drive=hd3,scsi-id=1,lun=3 -drive 
file=/fs/sdg1/disk4-vm-kbuild-1G-5,if=none,id=hd4,media=disk,aio=native,cache=none
 -device scsi-hd,bus=scsi0.0,drive=hd4,scsi-id=1,lun=4 -pidfile 
/dev/shm/kboot/pid-vm-kbuild-1G-5 -serial 
file:/dev/shm/kboot/serial-vm-kbuild-1G-5 -daemonize -display none -monitor 
null 





Thanks,
Xiaolong
#
# Automatically generated file; DO NOT EDIT.
# Linux/x86_64 4.7.0-rc6 Kernel Configuration
#
CONFIG_64BIT=y
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_MMU=y
CONFIG_ARCH_MMAP_RND_BITS_MIN=28
CONFIG_ARCH_MMAP_RND_BITS_MAX=32
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=8
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16
CONFIG_NEED_DMA_MAP_STATE=y
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
CONFIG_ZONE_DMA32=y
CONFIG_AUDIT_ARCH=y
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_KASAN_SHADOW_OFFSET=0xdc00
CONFIG_ARCH_HWEIGHT_CFLAGS="-fcall-saved-rdi -fcall-saved-rsi -fcall-saved-rdx 
-fcall-saved-rcx -fcall-saved-r8 -fcall-saved-r9 -fcall-saved-r10 
-fcall-saved-r11"
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_DEBUG_RODATA=y
CONFIG_PGTABLE_LEVELS=4
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
CONFIG_CONSTRUCTORS=y
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_EXTABLE_SORT=y

#
# General setup
#
CONFIG_BROKEN_ON_SMP=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_CROSS_COMPILE=""
# CONFIG_COMPILE_TEST is not set
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_HAVE_KERNEL_LZ4=y
# CONFIG_KERNEL_GZIP is not set
# CONFIG_KERNEL_BZIP2 is not set
# CONFIG_KERNEL_LZMA is not set
# CONFIG_KERNEL_XZ is not set
CONFIG_KERNEL_LZO=y
# CONFIG_KERNEL_LZ4 is not set
CONFIG_DEFAULT_HOSTNAME="(none)"
# CONFIG_SYSVIPC is not set
# CONFIG_POSIX_MQUEUE is not set
CONFIG_CROSS_MEMORY_ATTACH=y
CONFIG_FHANDLE=y
# CONFIG_USELIB is not set
# CONFIG_AUDIT is not set
CONFIG_HAVE_ARCH_AUDITSYSCALL=y

#
# IRQ subsystem
#
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_IRQ_SH

Re: [PATCH 2/2] mmc: dw_mmc: Print proper voltage on error

2016-07-12 Thread Jaehoon Chung
Hi Krzysztof,

On 07/12/2016 11:08 PM, Krzysztof Kozlowski wrote:
> The commit 97f659a2e972 ("mmc: dw_mmc: prevent to set the wrong
> value") reordered the code so the 'uhs' variable used in
> mmc_regulator_set_vqmmc() error message is always 0 at that time thus
> always printing 3.3 voltage.  Instead use value obtained from ios in
> printed error message.

Commit 97f659a2e972 was dropped because some boards didn't work properly with it.
Some boards don't use the vqmmc supply, so it is not defined in their device tree.

It should take only a short time to fix. I will resend that patch next,
and I will check this patch at that time.

Best Regards,
Jaehoon Chung

> 
> Signed-off-by: Krzysztof Kozlowski 
> ---
>  drivers/mmc/host/dw_mmc.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/mmc/host/dw_mmc.c b/drivers/mmc/host/dw_mmc.c
> index c2a128628b31..7de561065003 100644
> --- a/drivers/mmc/host/dw_mmc.c
> +++ b/drivers/mmc/host/dw_mmc.c
> @@ -1416,8 +1416,8 @@ static int dw_mci_switch_voltage(struct mmc_host *mmc, 
> struct mmc_ios *ios)
>   ret = mmc_regulator_set_vqmmc(mmc, ios);
>   if (ret) {
>   dev_err(&mmc->class_dev,
> -  "Regulator set error %d - %s V\n",
> -  ret, uhs & v18 ? "1.8" : "3.3");
> +  "Regulator set error %d - %s\n",
> +  ret, mmc_voltage_to_str(ios));
>   return ret;
>   }
>  
> 



Re: [PATCH v6 1/2] Documentation: bindings: add dt doc for Rockchip PCIe controller

2016-07-12 Thread Brian Norris
Hi Shawn,

On Wed, Jul 13, 2016 at 09:10:15AM +0800, Shawn Lin wrote:
> On 2016/7/7 8:39, Brian Norris wrote:
> >On Wed, Jul 06, 2016 at 03:16:37PM +0800, Shawn Lin wrote:
> >>+   #interrupt-cells = <1>;
> >>+   interrupt-map-mask = <0 0 0 7>;
> >>+   interrupt-map = <0 0 0 1 &pcie0_intc 1>,
> >>+   <0 0 0 2 &pcie0_intc 2>,
> >>+   <0 0 0 3 &pcie0_intc 3>,
> >>+   <0 0 0 4 &pcie0_intc 4>;
> >
> >I'm a little lost on this one, so forgive my ignorance; how did you
> >determine the last value in each entry (i.e., the 1, 2, 3, and 4 IRQ
> >numbers for pcie0_intc)? IIUC, those are supposed to represent indices
> >into the IRQ status register found in the PCIe interrupt status
> >register, and so they should be 0-based (i.e., 0, 1, 2, 3). And then
> >you'd have:
> >
> > interrupt-map = <0 0 0 1 &pcie0_intc 0>,
> > <0 0 0 2 &pcie0_intc 1>,
> > <0 0 0 3 &pcie0_intc 2>,
> > <0 0 0 4 &pcie0_intc 3>;
> >
> >But then, I never got this sub-node binding to work quite right, so I
> >may be missing something.
> >
> >EDIT: ooh, I see what's going on! I'll comment on the driver as well,
> >but it looks like you're translating the register status to a HW IRQ
> >number with 'ffs(reg)', which yields a 1-based index. I think it is most
> >sensible to use a 0-based index (i.e., 'ffs(reg) - 1'). Now, that only
> >will work if you get the whole interrupt-map + interrupt-controller
> >thing right (i.e., using a subnode for the interrupt controller) --
> >otherwise, IRQ mapping might not work right. I suspect that's one reason
> >the original driver writer might have used 1-based indexing in the first
> >place.
> 
> Yes, I got it, but what's the difference?

At some level, it's a matter of preference. But when you're talking
about the rk3399 PCIe "interrupt controller" domain, it seems that you
should be talking about HW bits in the controller -- i.e., you have a
4-bit interrupt status bitfield, that we typically call [0:3]. If you
use [1:4], then you have to remember to subtract 1 mentally when mapping
to the actual HW bit. I believe that confusion (since bitfields normally
count from 0) might have helped cause the infinite loop bug I noticed
too. And I also think that counting from 0 helps clarify the fact that
your interrupt controller indexing is an independent numbering from the
PCI interrupt numbering, even though they happen to map 1:1.

But then, PCI INTx numbering is kinda weird already, as it starts from
1. So maybe it's just as valid to say our domain starts from 1 as well.
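
The 0-based decode argued for above can be sketched in plain user-space C. This is only an illustration, not the Rockchip driver: INTX_STATUS_MASK, next_hwirq() and collect_pending() are hypothetical names, and the sketch assumes the four INTx status bits have already been shifted down to bits [0:3].

```c
#include <assert.h>
#include <strings.h>	/* ffs() */

/* Assumed: four INTx status bits, already shifted down to [0:3]. */
#define INTX_STATUS_MASK 0xfu

/*
 * Return the 0-based hardware IRQ number of the lowest pending bit,
 * or -1 if nothing is pending. ffs() is 1-based, hence the "- 1".
 */
static int next_hwirq(unsigned int status)
{
	status &= INTX_STATUS_MASK;
	if (!status)
		return -1;
	return ffs((int)status) - 1;
}

/*
 * Visit every pending bit once. The bit is cleared with the same
 * 0-based index that was decoded, so the loop always terminates.
 */
static int collect_pending(unsigned int status, int hwirqs[4])
{
	int n = 0;
	int hwirq;

	while ((hwirq = next_hwirq(status)) >= 0) {
		assert(n < 4);
		hwirqs[n++] = hwirq;
		status &= ~(1u << hwirq);
	}
	return n;
}
```

One way a 1-based index can go wrong in a loop like this: if the value returned by ffs() were used directly as the shift count when clearing, a status of 0x1 would clear bit 1 instead of bit 0 and the loop would never make progress. Whether that is what happened in the driver under review is not established here.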

> You still need to get the whole interrupt-map + interrupt-controller
> thing right, and the code (ffs(reg) - 1), if your suggestion is applied.

Yes, of course. And I already sent you patches that do that.

> Looking at most of the docs for PCIe bindings, I saw they also use a
> 0-based index; what about those?

I don't know which ones you're referring to. I see that altera-pcie.txt
supports interrupt indices counting from 1, but that's probably because
they're using the same broken binding that was in your ~v3 patches
(where the pcie node has both 'interrupt-controller' and
'interrupt-map', with phandles to itself), so they had no other choice.

If you still think it makes more sense to count from 1, then I won't
stop you.

Regards,
Brian


[PATCH v4] f2fs: fix to avoid data update racing between GC and DIO

2016-07-12 Thread Chao Yu
Data in a file can be operated on by GC and DIO simultaneously, so we can
hit the races below:

For write case:
Thread A                                Thread B
- generic_file_direct_write
 - invalidate_inode_pages2_range
 - f2fs_direct_IO
  - do_blockdev_direct_IO
   - do_direct_IO
    - get_more_blocks
                                        - f2fs_gc
                                         - do_garbage_collect
                                          - gc_data_segment
                                           - move_data_page
                                            - do_write_data_page
                                            migrate data block to new block address
   - dio_bio_submit
   update user data to old block address

For read case:
Thread A                                Thread B
- generic_file_direct_write
 - invalidate_inode_pages2_range
 - f2fs_direct_IO
  - do_blockdev_direct_IO
   - do_direct_IO
    - get_more_blocks
                                        - f2fs_balance_fs
                                         - f2fs_gc
                                          - do_garbage_collect
                                           - gc_data_segment
                                            - move_data_page
                                             - do_write_data_page
                                             migrate data block to new block address
                                          - write_checkpoint
                                           - do_checkpoint
                                            - clear_prefree_segments
                                             - f2fs_issue_discard
                                             discard old block address
   - dio_bio_submit
   update user buffer from obsolete block address

To fix this, DIO and GC operating on the same file should exclude
each other.
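
As a rough illustration of the exclusion scheme (one lock per DIO direction, with GC taking both or backing off), here is a self-contained user-space sketch. It is not the f2fs code: it replaces the kernel rw_semaphore with a toy try-only lock built on C11 atomics, the DIO side here uses a non-blocking try where the patch uses a blocking down_read(), and all names (try_rwlock, file_locks, gc_try_lock, ...) are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum { DIO_READ = 0, DIO_WRITE = 1 };

/* Toy try-only reader/writer lock: -1 = held by a writer, n >= 0 = n readers. */
struct try_rwlock {
	atomic_int state;
};

static bool try_read_lock(struct try_rwlock *l)
{
	int cur = atomic_load(&l->state);

	while (cur >= 0) {
		/* On failure, cur is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&l->state, &cur, cur + 1))
			return true;
	}
	return false;	/* a writer holds the lock */
}

static void read_unlock(struct try_rwlock *l)
{
	int prev = atomic_fetch_sub(&l->state, 1);

	assert(prev > 0);	/* must have been read-locked */
}

static bool try_write_lock(struct try_rwlock *l)
{
	int expected = 0;	/* succeeds only with no readers and no writer */

	return atomic_compare_exchange_strong(&l->state, &expected, -1);
}

static void write_unlock(struct try_rwlock *l)
{
	atomic_store(&l->state, 0);
}

/* Hypothetical per-file state: one lock per DIO direction. */
struct file_locks {
	struct try_rwlock dio[2];
};

static void file_locks_init(struct file_locks *f)
{
	atomic_init(&f->dio[DIO_READ].state, 0);
	atomic_init(&f->dio[DIO_WRITE].state, 0);
}

/* GC side: take both locks or none; the caller skips the block on failure. */
static bool gc_try_lock(struct file_locks *f)
{
	if (!try_write_lock(&f->dio[DIO_READ]))
		return false;
	if (!try_write_lock(&f->dio[DIO_WRITE])) {
		write_unlock(&f->dio[DIO_READ]);
		return false;
	}
	return true;
}

static void gc_unlock(struct file_locks *f)
{
	write_unlock(&f->dio[DIO_WRITE]);
	write_unlock(&f->dio[DIO_READ]);
}
```

Because GC only ever tries the locks and releases the first one when the second is busy, it never sleeps while holding a lock that a DIO path is waiting on; it simply moves on to the next block.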

Signed-off-by: Chao Yu 
---
v4: split rwsem to avoid deadlock between dio reader and dio writer.
 fs/f2fs/data.c  |  6 +-
 fs/f2fs/f2fs.h  |  1 +
 fs/f2fs/gc.c| 20 
 fs/f2fs/super.c |  2 ++
 4 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 20b3016..62947c30 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1714,6 +1714,7 @@ static ssize_t f2fs_direct_IO(struct kiocb *iocb, struct 
iov_iter *iter)
struct inode *inode = mapping->host;
size_t count = iov_iter_count(iter);
loff_t offset = iocb->ki_pos;
+   int rw = iov_iter_rw(iter);
int err;
 
err = check_direct_IO(inode, iter, offset);
@@ -1727,8 +1728,11 @@ static ssize_t f2fs_direct_IO(struct kiocb *iocb, struct 
iov_iter *iter)
 
trace_f2fs_direct_IO_enter(inode, offset, count, iov_iter_rw(iter));
 
+   down_read(&F2FS_I(inode)->dio_rwsem[rw]);
err = blockdev_direct_IO(iocb, inode, iter, get_data_block_dio);
-   if (iov_iter_rw(iter) == WRITE) {
+   up_read(&F2FS_I(inode)->dio_rwsem[rw]);
+
+   if (rw == WRITE) {
if (err > 0)
set_inode_flag(inode, FI_UPDATE_WRITE);
else if (err < 0)
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 1190c04..5973759 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -474,6 +474,7 @@ struct f2fs_inode_info {
struct list_head inmem_pages;   /* inmemory pages managed by f2fs */
struct mutex inmem_lock;/* lock for inmemory pages */
struct extent_tree *extent_tree;/* cached extent_tree entry */
+   struct rw_semaphore dio_rwsem[2];/* avoid racing between dio and gc */
 };
 
 static inline void get_extent_info(struct extent_info *ext,
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index c612137..5c8acf7 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -755,12 +755,32 @@ next_step:
/* phase 3 */
inode = find_gc_inode(gc_list, dni.ino);
if (inode) {
+   struct f2fs_inode_info *fi = F2FS_I(inode);
+   bool locked = false;
+
+   if (S_ISREG(inode->i_mode)) {
+   if (!down_write_trylock(&fi->dio_rwsem[READ]))
+   continue;
+   if (!down_write_trylock(
+   &fi->dio_rwsem[WRITE])) {
+   up_write(&fi->dio_rwsem[READ]);
+   continue;
+   }
+   locked = true;
+   }
+
start_bidx = start_bidx_of_node(nofs, inode)
+ ofs_in_node;
if (f2fs_encrypted_inode(inode) && 
S_ISREG(inode->i_mode))
move_encrypted_block(inode, start_bidx);
else

Re: Fix issue with alternatives/paravirt patches

2016-07-12 Thread Jessica Yu

+++ Josh Poimboeuf [12/07/16 09:01 -0500]:

On Tue, Jul 12, 2016 at 01:55:54PM +0200, Miroslav Benes wrote:

On Thu, 7 Jul 2016, Josh Poimboeuf wrote:

> On Thu, Jul 07, 2016 at 05:56:33PM +0200, Petr Mladek wrote:
> > On Tue 2016-07-05 22:34:58, Jessica Yu wrote:
> > > Hi,
> > >
> > > A few months ago, Chris Arges reported a bug involving alternatives/paravirt
> > > patching that was discussed here [1] and here [2]. To briefly summarize the
> > > bug, patch modules that contained .altinstructions or .parainstructions
> > > sections would break because these alternative/paravirt patches would be
> > > applied first by the module loader (see x86 module_finalize()), then
> > > livepatch would later clobber these patches when applying per-object
> > > relocations. This led to crashes and unpredictable behavior.
> > >
> > > One conclusion we reached from our last discussion was that we will
> > > need to introduce some arch-specific code to address this problem.
> > > This patchset presents a possible fix for the bug by adding a new
> > > arch-specific arch_klp_init_object_loaded() function that by default
> > > does nothing but can be overridden by different arches.
> > >
> > > To fix this issue for x86, since we can access a patch module's Elf
> > > sections through mod->klp_info, we can simply delay the calls to
> > > apply_paravirt() and apply_alternatives() to arch_klp_init_object_loaded(),
> > > which is called after relocations have been written for an object.
> > > In addition, for patch modules, .parainstructions and .altinstructions are
> > > prefixed by ".klp.arch.${objname}" so that the module loader ignores them
> > > and livepatch can apply them manually.
> >
> > The solution looks correct to me. The fun will be how to generate
> > the sections. If I get this correctly, it is not enough to rename
> > the existing ones. Instead, we need to split .parainstructions
> > and .altinstructions sections into per-object ones.
> >
> > I wonder if there is a plan for this. Especially I am interested
> > into the patches created from sources ;-) I wonder if we could add
> > a tag somewhere and improve the build infrastructure.
>
> Yeah.  I'd like to reiterate[1] that this would all be a lot easier if
> we weren't circumventing module dependencies.
>
> [1] https://lkml.kernel.org/r/20160404161428.3qap2i4vpgda6...@treble.redhat.com

Oh, we haven't come to any conclusion. I think it would be a great topic
for Plumbers conf. It is always better to discuss such things personally.
What do you think? Any volunteer to propose it? :)


Well, it's somewhat related to my "Livepatch module creation tooling"
proposed talk, because I suspect the tooling could be *much* simpler if
we didn't circumvent module dependencies.  So I'll probably talk about
that aspect of it.

But it would be great if somebody wanted to submit a separate talk to
explore the pros and cons of our current "load patches to modules before
the modules themselves have been loaded" approach and if there are any
viable alternatives.


In addition to Josh's linked discussion, we also once talked about the
idea of forcing module dependencies (exactly!) one year ago today:

http://www.spinics.net/lists/live-patching/msg00946.html

I've forgotten a lot of what I was blabbering about back then (and
this was way before we talked about the .klp.rela stuff), but I do
remember we talked a bit about forcing module dependencies and
potentially forcing to-be-patched modules to be loaded first before
loading the patch module. I still don't think we should do that but
instead we could potentially implement something more palatable like
enforcing maybe one patch module per object (so maybe depmod could
record dependencies for us) or something like that.

It would be interesting to revisit the problem again, esp. in the
context of recent changes. If I end up collecting enough talking
points, I can submit a proposal.

Jessica


Re: linux-next: please clean up the hid tree

2016-07-12 Thread Stephen Rothwell
Hi Jiri,

On Wed, 13 Jul 2016 09:28:09 +1000 Stephen Rothwell wrote:
>
> The hid tree
> (git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid.git#for-next)
> seems to be based on v3.19 and consists of a large number of merges
> (of stuff that is now in Linus' tree) and ends with one particularly
> large revert (of a merge).

What I forgot to say is that that last revert creates lots of conflicts
when I merge your tree:

CONFLICT (content): Merge conflict in sound/pci/hda/patch_realtek.c
CONFLICT (content): Merge conflict in mm/shmem.c
CONFLICT (content): Merge conflict in fs/dcache.c
CONFLICT (content): Merge conflict in drivers/gpu/drm/amd/powerplay/inc/smu74_discrete.h
CONFLICT (content): Merge conflict in drivers/gpu/drm/amd/powerplay/hwmgr/polaris10_hwmgr.h
CONFLICT (content): Merge conflict in drivers/gpu/drm/amd/powerplay/hwmgr/polaris10_hwmgr.c
CONFLICT (content): Merge conflict in arch/powerpc/Kconfig
CONFLICT (content): Merge conflict in arch/arm/mach-omap2/omap-smp.c
CONFLICT (content): Merge conflict in Makefile

So I have just dropped the hid tree until it is cleaned up.

-- 
Cheers,
Stephen Rothwell


RE: [f2fs-dev] [PATCH 3/7] f2fs: drop any block plugging

2016-07-12 Thread hebiao (G)
Hi, Kim,

You are right: if the file system can merge bios as much as possible, that is
very helpful to the block layer. But besides merging bios, the plugging mechanism
also gives advance knowledge of the IO stream. For example, we can leave low-power
mode in advance when a plug starts, and re-enter low-power mode only when the plug
ends, to avoid entering and leaving low-power mode frequently. So I want to know
whether keeping the plug mechanism has any side effect in F2FS?

-Original Message-
From: Jaegeuk Kim [mailto:jaeg...@kernel.org] 
Sent: July 13, 2016 1:08
To: Yuchao (T) 
Cc: linux-kernel@vger.kernel.org; linux-fsde...@vger.kernel.org; 
linux-f2fs-de...@lists.sourceforge.net; CHEN CHUN YEN (IAN) 
; hebiao (G) 
Subject: Re: [f2fs-dev] [PATCH 3/7] f2fs: drop any block plugging

Hi Chao,

On Tue, Jul 12, 2016 at 09:38:11AM +0800, Chao Yu wrote:
> On 2016/7/10 0:32, Jaegeuk Kim wrote:
> > On Sat, Jul 09, 2016 at 10:28:49AM +0800, Chao Yu wrote:
> >> Hi Jaegeuk,
> >>
> >> On 2016/6/9 1:24, Jaegeuk Kim wrote:
> >>> In f2fs, we don't need to keep block plugging for NODE and DATA 
> >>> writes, since we already merged bios as much as possible.
> >>
> >> IMO, we can not remove block plug, this is because there are still 
> >> many conditions which stops us merging r/w IOs into one bio as we 
> >> expect, theoretically, block plug can hold bios as much as 
> >> possible, then submitting them into queue in batch, it will reduce 
> >> racing of grabbing queue->lock during bio submitting, if we drop 
> >> them, when syncing nodes or flushing datas, we will suffer more lock 
> >> racing.
> >>
> >> Or there are something I am missing, do you suffer any performance 
> >> issue on block plug?
> > 
> > In the latest patch, I've turned off plugging forcefully, only if 
> > the underlying device is SMR drive.
> 
> Got it.
> 
> > And, still I removed other block plugging, since I couldn't see any 
> > performance regression.
> 
> I suspect that in low-end machine with single-queue which is used in 
> block layer, we will suffer regression.
> 
> > Even in some workloads, I could have seen some inverted IOs due to 
> > race condition between plugged and unplugged IOs.
> 
> For data path, what about enabling block plug for IPU/SSR ?

Not sure. IPU and SSR will produce small (likely random) writes.
What I'm seeing here is that we already try to merge bios as much as possible.
Thus, I'm in doubt why we need to wait for merging them by block layer.
If possible, could you check this in android?

Thanks,

> >>>
> >>> Signed-off-by: Jaegeuk Kim 
> >>> ---
> >>>  fs/f2fs/checkpoint.c |  4 
> >>>  fs/f2fs/data.c   | 17 ++---
> >>>  fs/f2fs/gc.c |  5 -
> >>>  fs/f2fs/segment.c|  7 +--
> >>>  4 files changed, 11 insertions(+), 22 deletions(-)
> >>>
> >>> diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c index 
> >>> 5ddd15c..4179c7b 100644
> >>> --- a/fs/f2fs/checkpoint.c
> >>> +++ b/fs/f2fs/checkpoint.c
> >>> @@ -897,11 +897,8 @@ static int block_operations(struct f2fs_sb_info *sbi)
> >>>   .nr_to_write = LONG_MAX,
> >>>   .for_reclaim = 0,
> >>>   };
> >>> - struct blk_plug plug;
> >>>   int err = 0;
> >>>  
> >>> - blk_start_plug(&plug);
> >>> -
> >>>  retry_flush_dents:
> >>>   f2fs_lock_all(sbi);
> >>>   /* write all the dirty dentry pages */ @@ -938,7 +935,6 @@ 
> >>> retry_flush_nodes:
> >>>   goto retry_flush_nodes;
> >>>   }
> >>>  out:
> >>> - blk_finish_plug(&plug);
> >>>   return err;
> >>>  }
> >>>  
> >>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index 
> >>> 30dc448..5f655d0 100644
> >>> --- a/fs/f2fs/data.c
> >>> +++ b/fs/f2fs/data.c
> >>> @@ -98,10 +98,13 @@ static struct bio *__bio_alloc(struct 
> >>> f2fs_sb_info *sbi, block_t blk_addr,  }
> >>>  
> >>>  static inline void __submit_bio(struct f2fs_sb_info *sbi, int rw,
> >>> - struct bio *bio)
> >>> + struct bio *bio, enum page_type type)
> >>>  {
> >>> - if (!is_read_io(rw))
> >>> + if (!is_read_io(rw)) {
> >>>   atomic_inc(&sbi->nr_wb_bios);
> >>> + if (current->plug && (type == DATA || type == NODE))
> >>> + blk_finish_plug(current->plug);
> >>> + }
> >>>   submit_bio(rw, bio);
> >>>  }
> >>>  
> >>> @@ -117,7 +120,7 @@ static void __submit_merged_bio(struct f2fs_bio_info 
> >>> *io)
> >>>   else
> >>>   trace_f2fs_submit_write_bio(io->sbi->sb, fio, io->bio);
> >>>  
> >>> - __submit_bio(io->sbi, fio->rw, io->bio);
> >>> + __submit_bio(io->sbi, fio->rw, io->bio, fio->type);
> >>>   io->bio = NULL;
> >>>  }
> >>>  
> >>> @@ -235,7 +238,7 @@ int f2fs_submit_page_bio(struct f2fs_io_info *fio)
> >>>   return -EFAULT;
> >>>   }
> >>>  
> >>> - __submit_bio(fio->sbi, fio->rw, bio);
> >>> + __submit_bio(fio->sbi, fio->rw, bio, fio->type);
> >>>   return 0;
> >>>  }
> >>>  
> >>> @@ -1040,7 +1043,7 @@ got_it:
> >>>*/
> >>> 

Re: [PATCH v6 1/2] Documentation: bindings: add dt doc for Rockchip PCIe controller

2016-07-12 Thread Shawn Lin

On 2016/7/7 8:39, Brian Norris wrote:

Hi Shawn,

On Wed, Jul 06, 2016 at 03:16:37PM +0800, Shawn Lin wrote:

This patch adds a binding that describes the Rockchip PCIe controller
found on Rockchip SoCs PCIe interface.

Signed-off-by: Shawn Lin 

Acked-by: Rob Herring 
---

Changes in v6:
- add ack tag from Rob

Changes in v5:
- fix wrong example reported by Marc
- add separate section to describe the interrupt controller child
  node

Changes in v4:
- fix example of adding intermediate interrupt controller for pcie
  legacy interrupt

Changes in v3:
- fix example dts code suggested by Rob and Marc
- remove driver's behaviour of regulator

Changes in v2:
- fix lots clk/reset stuff suggested by Heiko
- remove msi-parent and add msi-map suggested by Marc
- drop phy related stuff
- some others minor fixes

 .../devicetree/bindings/pci/rockchip-pcie.txt  | 104 +
 1 file changed, 104 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/pci/rockchip-pcie.txt

diff --git a/Documentation/devicetree/bindings/pci/rockchip-pcie.txt 
b/Documentation/devicetree/bindings/pci/rockchip-pcie.txt
new file mode 100644
index 000..7616ecc
--- /dev/null
+++ b/Documentation/devicetree/bindings/pci/rockchip-pcie.txt
@@ -0,0 +1,104 @@
+* Rockchip AXI PCIe Root Port Bridge DT description
+
+Required properties:
+- #address-cells: Address representation for root ports, set to <3>
+- #size-cells: Size representation for root ports, set to <2>
+- #interrupt-cells: specifies the number of cells needed to encode an
+   interrupt source. The value must be 1.
+- compatible: Should contain "rockchip,rk3399-pcie"
+- reg: Two register ranges as listed in the reg-names property
+- reg-names: Must include the following names
+   - "axi-base"
+   - "apb-base"
+- clocks: Must contain an entry for each entry in clock-names.
+   See ../clocks/clock-bindings.txt for details.
+- clock-names: Must include the following entries:
+   - "aclk"
+   - "aclk-perf"
+   - "hclk"
+   - "pm"
+- msi-map: Maps a Requester ID to an MSI controller and associated.
+   See ./pci-msi.txt
+- phys: From PHY bindings: Phandle for the Generic PHY for PCIe.
+- phy-names:  MUST be "pcie-phy".
+- interrupts: Three interrupt entries must be specified.
+- interrupt-names: Must include the following names
+   - "sys"
+   - "legacy"
+   - "client"
+- resets: Must contain five entries for each entry in reset-names.
+  See ../reset/reset.txt for details.
+- reset-names: Must include the following names
+   - "core"
+   - "mgmt"
+   - "mgmt-sticky"
+   - "pipe"
+- pinctrl-names : The pin control state names
+- pinctrl-0: The "default" pinctrl state
+- #interrupt-cells: specifies the number of cells needed to encode an
+   interrupt source. The value must be 1.
+- interrupt-map-mask and interrupt-map: standard PCI properties
+
+*Interrupt controller child node*
+The core controller provides a single interrupt for legacy INTx. So,
+pcie node should create an interrupt controller node to support 'interrupt-map'
+DT functionality. The driver will create an IRQ domain for this map, decode
+the four INTx interrupts in ISR and route them to this domain.


Where in your driver do you actually handle this child node? I don't see
anything, but perhaps I'm missing something. I see how your earlier
revisions of this driver used of_get_next_child() to acquire the child
node, for use with irq_domain_add_linear(). But that's not in this
version...


+
+Required properties for Interrupt controller child node:
+- interrupt-controller: identifies the node as an interrupt controller
+- #address-cells: specifies the number of cells needed to encode an
+   address. The value must be 0.
+- #interrupt-cells: specifies the number of cells needed to encode an
+   interrupt source. The value must be 1.
+
+Optional Property:


These optional properties apply to the pcie node, not the interrupt
controller child, right? Seems like the subnode and its properties
should be last (i.e., the 'Optional Property' section should be above
'Interrupt controller child node').


Okay, I will move it ahead.




+- ep-gpios: contain the entry for pre-reset gpio
+- num-lanes: number of lanes to use
+- vpcie3v3-supply: The phandle to the 3.3v regulator to use for pcie.
+- vpcie1v8-supply: The phandle to the 1.8v regulator to use for pcie.
+- vpcie0v9-supply: The phandle to the 0.9v regulator to use for pcie.
+
+Example:
+
+pcie0: pcie@f800 {
+   compatible = "rockchip,rk3399-pcie";
+   #address-cells = <3>;
+   #size-cells = <2>;
+   clocks = < ACLK_PCIE>, < ACLK_PERF_PCIE>,
+< PCLK_PCIE>, < SCLK_PCIE_PM>;
+   clock-names = "aclk", "aclk-perf",
+ "hclk", "pm";
+   bus-range = <0x0 0x1>;
+   interrupts = , ,
+;
+   interrupt-names = "sys", 

Re: Build regressions/improvements in v4.7-rc7

2016-07-12 Thread Geert Uytterhoeven
On Wed, Jul 13, 2016 at 2:57 AM, Geert Uytterhoeven
 wrote:
> JFYI, when comparing v4.7-rc7[1] to v4.7-rc6[3], the summaries are:
>   - build errors: +7/-6

  + error: main.c: undefined reference to `__stack_chk_guard':  =>
.init.text+0x166), .init.text+0x1d6)

x86_64-randconfig

> [1] http://kisskb.ellerman.id.au/kisskb/head/10595/ (261 out of 263 configs)
> [3] http://kisskb.ellerman.id.au/kisskb/head/10562/ (260 out of 263 configs)

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [v5 PATCH 1/5] extcon: Add Type-C and DP support

2016-07-12 Thread Chanwoo Choi
Hi Chris,

I'm now developing the extcon property support on the extcon-test branch,
but it has not been completed yet.

In the next version, I'll remove the notification for extcon properties
and only support the following two functions:
- extcon_set_cable_property()
- extcon_get_cable_property()

The reason is that the number of properties will grow, and every property
depends on a specific external connector (e.g., EXTCON_PROP_USB_VBUS
depends on the EXTCON_TYPE_USB type). When that external connector is
detached, the extcon framework should reset the property state to its default.

Sending a notification for each property may result in too many notifications.
For example, assume that EXTCON_TYPE_USB has over 20 properties:
when EXTCON_USB or EXTCON_USB_HOST is detached, extcon would have to send
a notification for each of those properties and one more notification
for state of external connector.
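
To make the count concrete, here is a minimal userspace sketch of the two detach behaviours. It is only an illustration under stated assumptions: every name with a demo_ or DEMO_ prefix is hypothetical and not part of the extcon API, and the property count is fixed at exactly 20 for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from the extcon API. */
#define DEMO_NUM_PROPS 20

struct demo_cable {
	bool attached;
	int propval[DEMO_NUM_PROPS];
};

/* Counts the notifications a listener would receive; illustration only. */
static int demo_notify_count;

static void demo_notify(void)
{
	demo_notify_count++;
}

/*
 * Detach with per-property notification: one event for every property
 * that is reset to its default, plus one for the connector state itself.
 */
static void demo_detach_noisy(struct demo_cable *c)
{
	size_t i;

	for (i = 0; i < DEMO_NUM_PROPS; i++) {
		c->propval[i] = 0;
		demo_notify();
	}
	c->attached = false;
	demo_notify();
}

/*
 * Detach without property notification: the properties still return to
 * their defaults, but only the connector state change is announced.
 */
static void demo_detach_quiet(struct demo_cable *c)
{
	size_t i;

	for (i = 0; i < DEMO_NUM_PROPS; i++)
		c->propval[i] = 0;
	c->attached = false;
	demo_notify();
}
```

With 20 properties, the first variant produces 21 events for a single detach, while the second produces one and leaves consumers to read properties on demand.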

So, I'll send the RFC patchset without the property notification.

Lastly,
I have a comment on below.

Thanks,
Chanwoo Choi

On 2016년 07월 13일 00:09, Chris Zhong wrote:
> Add EXTCON_DISP_DP for the Display external connector. For Type-C
> connector the DisplayPort can work as an Alternate Mode(VESA DisplayPort
> Alt Mode on USB Type-C Standard). The Type-C support both normal and
> flipped orientation, so add a property to extcon.
> 
> Signe-off-by: Chris Zhong 
> 
> Signed-off-by: Chris Zhong 
> ---
> 
> Changes in v5:
> - support get property
> 
> Changes in v4: None
> Changes in v3: None
> Changes in v2: None
> Changes in v1: None
> 
>  drivers/extcon/extcon.c | 28 
>  include/linux/extcon.h  | 13 +
>  2 files changed, 41 insertions(+)
> 
> diff --git a/drivers/extcon/extcon.c b/drivers/extcon/extcon.c
> index a1117db..2591b28 100644
> --- a/drivers/extcon/extcon.c
> +++ b/drivers/extcon/extcon.c
> @@ -157,6 +157,11 @@ struct __extcon_info {
>   .id = EXTCON_DISP_VGA,
>   .name = "VGA",
>   },
> + [EXTCON_DISP_DP] = {
> + .type = EXTCON_TYPE_DISP,
> + .id = EXTCON_DISP_DP,
> + .name = "DP",
> + },
>  
>   /* Miscellaneous external connector */
>   [EXTCON_DOCK] = {
> @@ -270,6 +275,7 @@ static bool is_extcon_property_supported(unsigned int id,
>   switch (prop) {
>   case EXTCON_PROP_USB_ID:
>   case EXTCON_PROP_USB_VBUS:
> + case EXTCON_PROP_TYPEC_POLARITY:
>   return true;
>   default:
>   break;
> @@ -286,6 +292,8 @@ static bool is_extcon_property_supported(unsigned int id,
>   }
>   case EXTCON_TYPE_DISP:
>   switch (prop) {
> + case EXTCON_PROP_TYPEC_POLARITY:

Should the EXTCON_PROP_TYPEC_POLARITY property be added to both EXTCON_TYPE_USB
and EXTCON_TYPE_DISP?
Isn't EXTCON_PROP_TYPEC_POLARITY a property of USB Type-C?

> + return true;
>   default:
>   break;
>   }
> @@ -547,6 +555,26 @@ int extcon_get_cable_property(struct extcon_dev *edev, 
> unsigned int id,
>   enum extcon_property prop,
>   union extcon_property_value *val)
>  {
> + struct extcon_cable *cable;
> + int index;
> +
> + if (!edev)
> + return -EINVAL;
> +
> + /* Check the property whether is supported or not */
> + if (!is_extcon_property_supported(id, prop))
> + return -EINVAL;
> +
> + /* Find the cable index of external connector by using id */
> + index = find_cable_index_by_id(edev, id);
> + if (index < 0)
> + return index;
> +
> + /* Store the property value */
> + cable = &edev->cables[index];
> +
> + val->intval = cable->propval[prop].intval;
> +
>   return 0;
>  }

Once I have finished developing get_cable_property, I'll send the RFC patchset.

>  
> diff --git a/include/linux/extcon.h b/include/linux/extcon.h
> index f6f0a8d..50ef87f 100644
> --- a/include/linux/extcon.h
> +++ b/include/linux/extcon.h
> @@ -77,6 +77,7 @@ enum extcon_type {
>  #define EXTCON_DISP_MHL  41  /* Mobile High-Definition Link 
> */
>  #define EXTCON_DISP_DVI  42  /* Digital Visual Interface */
>  #define EXTCON_DISP_VGA  43  /* Video Graphics Array */
> +#define EXTCON_DISP_DP   44  /* DisplayPort */
>  
>  /* Miscellaneous external connector */
>  #define EXTCON_DOCK  60
> @@ -108,9 +109,13 @@ enum extcon_property {
>* - EXTCON_PROP_USB_USB
>* @type:   integer (int value)
>* @value:  0 (low) or 1 (high)
> +  * - EXTCON_PROP_TYPEC_POLARITY,
> +  * @type:   integer (int value)
> +  * @value:  0 (normal) or 1 (flip)
>*/
>   EXTCON_PROP_USB_ID = 0,
>   EXTCON_PROP_USB_VBUS,
> + EXTCON_PROP_TYPEC_POLARITY,
>  
>   /* Properties of EXTCON_TYPE_CHG. */
>   /* Properties of 

[PATCH 1/3] Documentation: dt: Intersil isl12057 is not a trivial device

2016-07-12 Thread Alexandre Belloni
The ISL12057 has its own documentation file, so remove it from trivial-devices.txt

Signed-off-by: Alexandre Belloni 
---
 Documentation/devicetree/bindings/i2c/trivial-devices.txt | 1 -
 1 file changed, 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/i2c/trivial-devices.txt 
b/Documentation/devicetree/bindings/i2c/trivial-devices.txt
index 539874490492..a397d39ea741 100644
--- a/Documentation/devicetree/bindings/i2c/trivial-devices.txt
+++ b/Documentation/devicetree/bindings/i2c/trivial-devices.txt
@@ -50,7 +50,6 @@ fsl,sgtl5000  SGTL5000: Ultra Low-Power Audio Codec
 gmt,g751   G751: Digital Temperature Sensor and Thermal Watchdog 
with Two-Wire Interface
 infineon,slb9635tt Infineon SLB9635 (Soft-) I2C TPM (old protocol, max 
100khz)
 infineon,slb9645tt Infineon SLB9645 I2C TPM (new protocol, max 400khz)
-isil,isl12057  Intersil ISL12057 I2C RTC Chip
 isil,isl29028  Intersil ISL29028 Ambient Light and Proximity Sensor
 maxim,ds1050   5 Bit Programmable, Pulse-Width Modulator
 maxim,max1237  Low-Power, 4-/12-Channel, 2-Wire Serial, 12-Bit ADCs
-- 
2.8.1



[PATCH 2/3] rtc: ds1307: add Intersil ISL12057 support

2016-07-12 Thread Alexandre Belloni
Intersil ISL12057 is a drop-in replacement for DS1337. It can be supported
by the ds1307 driver.

Signed-off-by: Alexandre Belloni 
---
 drivers/rtc/Kconfig  | 8 
 drivers/rtc/rtc-ds1307.c | 6 ++
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/rtc/Kconfig b/drivers/rtc/Kconfig
index f47c2f5ff70d..ba0b3e7ce4c5 100644
--- a/drivers/rtc/Kconfig
+++ b/drivers/rtc/Kconfig
@@ -198,14 +198,14 @@ config RTC_DRV_AS3722
  will be called rtc-as3722.
 
 config RTC_DRV_DS1307
-   tristate "Dallas/Maxim DS1307/37/38/39/40, ST M41T00, EPSON RX-8025"
+   tristate "Dallas/Maxim DS1307/37/38/39/40, ST M41T00, EPSON RX-8025, 
ISL12057"
help
  If you say yes here you get support for various compatible RTC
  chips (often with battery backup) connected with I2C. This driver
  should handle DS1307, DS1337, DS1338, DS1339, DS1340, ST M41T00,
- EPSON RX-8025 and probably other chips. In some cases the RTC
- must already have been initialized (by manufacturing or a
- bootloader).
+ EPSON RX-8025, Intersil ISL12057 and probably other chips. In some
+ cases the RTC must already have been initialized (by manufacturing or
+ a bootloader).
 
  The first seven registers on these chips hold an RTC, and other
  registers may add features such as NVRAM, a trickle charger for
diff --git a/drivers/rtc/rtc-ds1307.c b/drivers/rtc/rtc-ds1307.c
index 4c5890864d9c..52be48f2e3c0 100644
--- a/drivers/rtc/rtc-ds1307.c
+++ b/drivers/rtc/rtc-ds1307.c
@@ -186,6 +186,7 @@ static const struct i2c_device_id ds1307_id[] = {
{ "mcp7941x", mcp794xx },
{ "pt7c4338", ds_1307 },
{ "rx8025", rx_8025 },
+   { "isl12057", ds_1337 },
{ }
 };
 MODULE_DEVICE_TABLE(i2c, ds1307_id);
@@ -1333,6 +1334,11 @@ static int ds1307_probe(struct i2c_client *client,
if (of_property_read_bool(client->dev.of_node, "wakeup-source")) {
ds1307_can_wakeup_device = true;
}
+   /* Intersil ISL12057 DT backward compatibility */
+   if (of_property_read_bool(client->dev.of_node,
+ "isil,irq2-can-wakeup-machine")) {
+   ds1307_can_wakeup_device = true;
+   }
 #endif
 
switch (ds1307->type) {
-- 
2.8.1



Re:Re: [f2fs-dev] [PATCH] f2fs: return proper error code

2016-07-12 Thread Tiezhu Yang
At 2016-07-12 09:45:43, "Chao Yu"  wrote:
>On 2016/7/11 7:20, Tiezhu Yang wrote:
>> When the length of file name is more than F2FS_NAME_LEN,
>
>Seem @name indicates a xattr/key name, not a file name.

Yes, you are right. Sorry for the noise.

Thanks,

Re: [PULL] lkdtm update (next)

2016-07-12 Thread Greg KH
On Tue, Jul 12, 2016 at 02:42:22PM -0400, Kees Cook wrote:
> On Thu, Jul 7, 2016 at 2:14 PM, Kees Cook  wrote:
> > Hi,
> >
> > Please pull these lkdtm changes for next.
> 
> Friendly ping... I'd like this refactor to make it in time for the 4.8
> merge window. :)

Sorry, was on vacation last week, and am at LinuxCon Japan this week,
will get to it in a day or so.  Don't worry, it will make 4.8 :)

greg k-h


[PATCH 0/3] rtc: remove intersil isl12057

2016-07-12 Thread Alexandre Belloni
Arnaud,

This is the series I intend to apply once you confirm my previous patch
is working.

Alexandre Belloni (3):
  Documentation: dt: Intersil isl12057 is not a trivial device
  rtc: ds1307: add Intersil ISL12057 support
  rtc: isl12057: remove driver

 .../devicetree/bindings/i2c/trivial-devices.txt|   1 -
 drivers/rtc/Kconfig|  18 +-
 drivers/rtc/Makefile   |   1 -
 drivers/rtc/rtc-ds1307.c   |   6 +
 drivers/rtc/rtc-isl12057.c | 643 -
 5 files changed, 10 insertions(+), 659 deletions(-)
 delete mode 100644 drivers/rtc/rtc-isl12057.c

-- 
2.8.1



[PATCH 3/3] rtc: isl12057: remove driver

2016-07-12 Thread Alexandre Belloni
The Intersil isl12057 is now supported by the ds1307 driver.

Signed-off-by: Alexandre Belloni 
---
 drivers/rtc/Kconfig|  10 -
 drivers/rtc/Makefile   |   1 -
 drivers/rtc/rtc-isl12057.c | 643 -
 3 files changed, 654 deletions(-)
 delete mode 100644 drivers/rtc/rtc-isl12057.c

diff --git a/drivers/rtc/Kconfig b/drivers/rtc/Kconfig
index ba0b3e7ce4c5..88777e4440dd 100644
--- a/drivers/rtc/Kconfig
+++ b/drivers/rtc/Kconfig
@@ -378,16 +378,6 @@ config RTC_DRV_ISL12022
  This driver can also be built as a module. If so, the module
  will be called rtc-isl12022.
 
-config RTC_DRV_ISL12057
-   select REGMAP_I2C
-   tristate "Intersil ISL12057"
-   help
- If you say yes here you get support for the Intersil ISL12057
- I2C RTC chip.
-
- This driver can also be built as a module. If so, the module
- will be called rtc-isl12057.
-
 config RTC_DRV_X1205
tristate "Xicor/Intersil X1205"
help
diff --git a/drivers/rtc/Makefile b/drivers/rtc/Makefile
index 7cf7ad559c79..0e0c15bab5d2 100644
--- a/drivers/rtc/Makefile
+++ b/drivers/rtc/Makefile
@@ -71,7 +71,6 @@ obj-$(CONFIG_RTC_DRV_HID_SENSOR_TIME) += rtc-hid-sensor-time.o
 obj-$(CONFIG_RTC_DRV_HYM8563)  += rtc-hym8563.o
 obj-$(CONFIG_RTC_DRV_IMXDI)    += rtc-imxdi.o
 obj-$(CONFIG_RTC_DRV_ISL12022) += rtc-isl12022.o
-obj-$(CONFIG_RTC_DRV_ISL12057) += rtc-isl12057.o
 obj-$(CONFIG_RTC_DRV_ISL1208)  += rtc-isl1208.o
 obj-$(CONFIG_RTC_DRV_JZ4740)   += rtc-jz4740.o
 obj-$(CONFIG_RTC_DRV_LP8788)   += rtc-lp8788.o
diff --git a/drivers/rtc/rtc-isl12057.c b/drivers/rtc/rtc-isl12057.c
deleted file mode 100644
index 0e7f0f52bfe4..
--- a/drivers/rtc/rtc-isl12057.c
+++ /dev/null
@@ -1,643 +0,0 @@
-/*
- * rtc-isl12057 - Driver for Intersil ISL12057 I2C Real Time Clock
- *
- * Copyright (C) 2013, Arnaud EBALARD 
- *
- * This work is largely based on Intersil ISL1208 driver developed by
- * Hebert Valerio Riedel .
- *
- * Detailed datasheet on which this development is based is available here:
- *
- *  http://natisbad.org/NAS2/refs/ISL12057.pdf
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- */
-
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-
-#define DRV_NAME "rtc-isl12057"
-
-/* RTC section */
-#define ISL12057_REG_RTC_SC      0x00    /* Seconds */
-#define ISL12057_REG_RTC_MN      0x01    /* Minutes */
-#define ISL12057_REG_RTC_HR      0x02    /* Hours */
-#define ISL12057_REG_RTC_HR_PM   BIT(5)  /* AM/PM bit in 12h format */
-#define ISL12057_REG_RTC_HR_MIL  BIT(6)  /* 24h/12h format */
-#define ISL12057_REG_RTC_DW      0x03    /* Day of the Week */
-#define ISL12057_REG_RTC_DT      0x04    /* Date */
-#define ISL12057_REG_RTC_MO      0x05    /* Month */
-#define ISL12057_REG_RTC_MO_CEN  BIT(7)  /* Century bit */
-#define ISL12057_REG_RTC_YR      0x06    /* Year */
-#define ISL12057_RTC_SEC_LEN     7
-
-/* Alarm 1 section */
-#define ISL12057_REG_A1_SC       0x07    /* Alarm 1 Seconds */
-#define ISL12057_REG_A1_MN       0x08    /* Alarm 1 Minutes */
-#define ISL12057_REG_A1_HR       0x09    /* Alarm 1 Hours */
-#define ISL12057_REG_A1_HR_PM    BIT(5)  /* AM/PM bit in 12h format */
-#define ISL12057_REG_A1_HR_MIL   BIT(6)  /* 24h/12h format */
-#define ISL12057_REG_A1_DWDT     0x0A    /* Alarm 1 Date / Day of the week */
-#define ISL12057_REG_A1_DWDT_B   BIT(6)  /* DW / DT selection bit */
-#define ISL12057_A1_SEC_LEN      4
-
-/* Alarm 2 section */
-#define ISL12057_REG_A2_MN       0x0B    /* Alarm 2 Minutes */
-#define ISL12057_REG_A2_HR       0x0C    /* Alarm 2 Hours */
-#define ISL12057_REG_A2_DWDT     0x0D    /* Alarm 2 Date / Day of the week */
-#define ISL12057_A2_SEC_LEN      3
-
-/* Control/Status registers */
-#define ISL12057_REG_INT         0x0E
-#define ISL12057_REG_INT_A1IE    BIT(0)  /* Alarm 1 interrupt enable bit */
-#define ISL12057_REG_INT_A2IE    BIT(1)  /* Alarm 2 interrupt enable bit */
-#define ISL12057_REG_INT_INTCN   BIT(2)  /* Interrupt control enable bit */
-#define ISL12057_REG_INT_RS1     BIT(3)  /* Freq out control bit 1 */
-#define ISL12057_REG_INT_RS2     BIT(4)  /* Freq out control bit 2 */
-#define ISL12057_REG_INT_EOSC    BIT(7)  /* Oscillator enable bit */
-
-#define ISL12057_REG_SR          0x0F
-#define ISL12057_REG_SR_A1F      BIT(0)  /* Alarm 1 interrupt bit */
-#define ISL12057_REG_SR_A2F      BIT(1)  /* Alarm 2 interrupt bit */
-#define ISL12057_REG_SR_OSF      BIT(7)  /* 

Build regressions/improvements in v4.7-rc7

2016-07-12 Thread Geert Uytterhoeven
Below is the list of build error/warning regressions/improvements in
v4.7-rc7[1] compared to v4.6[2].

Summarized:
  - build errors: +10/-7
  - build warnings: +1294/-1054

JFYI, when comparing v4.7-rc7[1] to v4.7-rc6[3], the summaries are:
  - build errors: +7/-6
  - build warnings: +1013/-939

Note that there may be false regressions, as some logs are incomplete.
Still, they're build errors/warnings.

As I haven't mastered kup yet, there's no verbose summary at
http://www.kernel.org/pub/linux/kernel/people/geert/linux-log/v4.7-rc7.summary.gz

Happy fixing! ;-)

Thanks to the linux-next team for providing the build service.

[1] http://kisskb.ellerman.id.au/kisskb/head/10595/ (261 out of 263 configs)
[2] http://kisskb.ellerman.id.au/kisskb/head/10344/ (all 263 configs)
[3] http://kisskb.ellerman.id.au/kisskb/head/10562/ (260 out of 263 configs)


*** ERRORS ***

10 error regressions:
  + /tmp/cc4MUMku.s: Error: pcrel too far BFD_RELOC_BFIN_10:  => 889
  + error: No rule to make target /etc/sound/pndsperm.bin:  => N/A
  + error: No rule to make target drivers/scsi/aic7xxx/aicasm/*.[chyl]:  => N/A
  + error: main.c: undefined reference to `__stack_chk_guard':  => 
.init.text+0x1d6), .init.text+0x166)
  + error: relocation truncated to fit: R_PPC64_REL24 against symbol 
`.del_timer_sync' defined in .text section in kernel/built-in.o:  => 
(.text+0x1ffb41c)
  + error: relocation truncated to fit: R_PPC64_REL24 against symbol 
`.queue_work_on' defined in .text section in kernel/built-in.o:  => 
(.text+0x1ffae6c), (.text+0x1ffb3b4)
  + error: sdio.c: relocation truncated to fit: R_PPC64_REL24 against symbol 
`._raw_spin_lock_bh' defined in .spinlock.text section in kernel/built-in.o:  
=> (.text+0x1ffbad8)
  + error: sdio.c: relocation truncated to fit: R_PPC64_REL24 against symbol 
`.kfree_skb' defined in .text section in net/built-in.o:  => (.text+0x1ff98d0)
  + error: sdio.c: relocation truncated to fit: R_PPC64_REL24 against symbol 
`.skb_pull' defined in .text section in net/built-in.o:  => (.text+0x1ffb1f4), 
(.text+0x1ff8be0), (.text+0x1ff9b28), (.text+0x1ff8c70)
  + error: sdio.c: relocation truncated to fit: R_PPC64_REL24 against symbol 
`.skb_push' defined in .text section in net/built-in.o:  => (.text+0x1ffb15c)

7 error improvements:
  - /home/kisskb/slave/src/arch/sh/mm/cache-sh4.c: error: 'cached_to_uncached' 
undeclared (first use in this function): 99:17 => 
  - /home/kisskb/slave/src/arch/sh/mm/cache-sh4.c: error: implicit declaration 
of function 'cpu_context' [-Werror=implicit-function-declaration]: 192:2 => 
  - /tmp/ccSuvE7E.s: Error: pcrel too far BFD_RELOC_BFIN_10: 889 => 
  - error: hns_dsaf_main.c: relocation truncated to fit: R_PPC64_REL24 against 
symbol `.eeh_check_failure' defined in .text section in 
arch/powerpc/kernel/built-in.o: (.text+0x1ffa59c), (.text+0x1ffaa5c), 
(.text+0x1ffaeec), (.text+0x1ffaca4), (.text+0x1ffa8a8), (.text+0x1ffb134), 
(.text+0x1ffa6f8) => 
  - error: hns_dsaf_main.c: relocation truncated to fit: R_PPC64_REL24 against 
symbol `_restgpr0_18' defined in .text.save.restore section in 
arch/powerpc/lib/built-in.o: (.text+0x1ff9314) => 
  - error: hns_dsaf_main.c: relocation truncated to fit: R_PPC64_REL24 against 
symbol `_savegpr0_27' defined in .text.save.restore section in 
arch/powerpc/lib/built-in.o: (.text+0x1ffb284) => 
  - error: relocation truncated to fit: R_PPC64_REL24 against symbol 
`_savegpr0_28' defined in .text.save.restore section in 
arch/powerpc/lib/built-in.o: (.text+0x1ff7890) => 


*** WARNINGS ***

1294 warning regressions:

[Deleted 1153 lines about "warning: -ffunction-sections disabled; it makes 
profiling impossible [enabled by default]" on parisc-allmodconfig]

  + /home/kisskb/slave/src/arch/arm/include/asm/io.h: warning: 'bellbits' may 
be used uninitialized in this function [-Wuninitialized]:  => 101:2
  + /home/kisskb/slave/src/arch/powerpc/include/asm/io.h: warning: 'px_cmd' may 
be used uninitialized in this function [-Wmaybe-uninitialized]:  => 163:2
  + /home/kisskb/slave/src/arch/powerpc/include/asm/io.h: warning: 'px_is' may 
be used uninitialized in this function [-Wmaybe-uninitialized]:  => 163:2
  + /home/kisskb/slave/src/arch/sh/kernel/cpu/sh2a/../../entry-common.S: 
Warning: overflow in branch to syscall_call; converted into longer instruction 
sequence:  => 208
  + /home/kisskb/slave/src/arch/sh/kernel/cpu/sh2a/../../entry-common.S: 
Warning: overflow in branch to syscall_trace_entry; converted into longer 
instruction sequence:  => 356, 358
  + /home/kisskb/slave/src/arch/sh/kernel/cpu/sh4/../sh3/../../entry-common.S: 
Warning: overflow in branch to syscall_exit_work; converted into longer 
instruction sequence  AS  usr/initramfs_data.o:  => 392
  + /home/kisskb/slave/src/arch/sh/math-emu/math.c: warning: statement with no 
effect [-Wunused-value]  CC  arch/sh/mm/mmap.o:  => 108:1
  + /home/kisskb/slave/src/arch/x86/include/asm/irqflags.h: warning: 'flags' 
may be used uninitialized in this function 

Re: [PATCH 2/3] Update name field for all shrinker instances

2016-07-12 Thread Tony Jones
On 07/09/2016 01:52 AM, Janani Ravichandran wrote:

> diff --git a/fs/super.c b/fs/super.c
> index d78b984..051073c 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -241,6 +241,7 @@ static struct super_block *alloc_super(struct 
> file_system_type *type, int flags)
>   s->s_time_gran = 10;
>   s->cleancache_poolid = CLEANCACHE_NO_POOL;
>  
> + s->s_shrink.name = "super_cache_shrinker";

my patchset made this a little more granular wrt superblock types by including 
type->name




Re: [PATCH v4 2/2] input: add ADC resistor ladder driver

2016-07-12 Thread Dmitry Torokhov
Hi Alexandre,

On Tue, Jul 12, 2016 at 09:36:26PM +0200, Alexandre Belloni wrote:
> A common way of multiplexing buttons on a single input in cheap devices is
> to use a resistor ladder on an ADC. This driver supports that configuration
> by polling an ADC channel provided by IIO.

This looks quite reasonable, just a few small comments.
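
For readers unfamiliar with the technique, the closest-voltage matching that the quoted poll function performs can be sketched in plain userspace C roughly as follows. This is an illustration only, not the driver's code: the demo_ names are hypothetical, and voltages are assumed to be unsigned values in one common unit.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model of one ladder entry. */
struct demo_button {
	uint32_t voltage;	/* expected ADC reading while pressed */
	uint32_t keycode;	/* key reported for that reading */
};

/* Absolute difference of two unsigned values without wrap-around. */
static uint32_t demo_absdiff(uint32_t a, uint32_t b)
{
	return a > b ? a - b : b - a;
}

/*
 * Return the keycode whose expected voltage is nearest to the reading,
 * or 0 when the idle ("key up") voltage is strictly closer than any button.
 */
static uint32_t demo_match_key(const struct demo_button *map, size_t n,
			       uint32_t keyup_voltage, uint32_t value)
{
	uint32_t closest = UINT32_MAX;
	uint32_t keycode = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		uint32_t diff = demo_absdiff(map[i].voltage, value);

		if (diff < closest) {
			closest = diff;
			keycode = map[i].keycode;
		}
	}

	if (demo_absdiff(keyup_voltage, value) < closest)
		return 0;

	return keycode;
}
```

Picking the nearest entry rather than testing fixed thresholds tolerates resistor and ADC inaccuracy, at the cost of a linear scan per poll.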

> 
> Acked-by: Jonathan Cameron 
> Signed-off-by: Alexandre Belloni 
> ---
>  drivers/input/keyboard/Kconfig|  15 +++
>  drivers/input/keyboard/Makefile   |   1 +
>  drivers/input/keyboard/adc-keys.c | 210 
> ++
>  3 files changed, 226 insertions(+)
>  create mode 100644 drivers/input/keyboard/adc-keys.c
> 
> diff --git a/drivers/input/keyboard/Kconfig b/drivers/input/keyboard/Kconfig
> index 509608c95994..4cf042cc5e63 100644
> --- a/drivers/input/keyboard/Kconfig
> +++ b/drivers/input/keyboard/Kconfig
> @@ -12,6 +12,21 @@ menuconfig INPUT_KEYBOARD
>  
>  if INPUT_KEYBOARD
>  
> +config KEYBOARD_ADC
> + tristate "ADC ladder Buttons"
> + depends on IIO
> + select INPUT_POLLDEV
> + help
> +   This driver implements support for buttons connected
> +   to an ADC using a resistor ladder.
> +
> +   Say Y here if your device has such buttons connected to an ADC.  Your
> +   board-specific setup logic must also provide a configuration data
> +   saying mapping voltages to buttons.
> +
> +   To compile this driver as a module, choose M here: the
> +   module will be called adc_keys.
> +
>  config KEYBOARD_ADP5520
>   tristate "Keypad Support for ADP5520 PMIC"
>   depends on PMIC_ADP5520
> diff --git a/drivers/input/keyboard/Makefile b/drivers/input/keyboard/Makefile
> index 1d416ddf84e4..d9f4cfcf3410 100644
> --- a/drivers/input/keyboard/Makefile
> +++ b/drivers/input/keyboard/Makefile
> @@ -4,6 +4,7 @@
>  
>  # Each configuration option enables a list of files.
>  
> +obj-$(CONFIG_KEYBOARD_ADC)   += adc-keys.o
>  obj-$(CONFIG_KEYBOARD_ADP5520)   += adp5520-keys.o
>  obj-$(CONFIG_KEYBOARD_ADP5588)   += adp5588-keys.o
>  obj-$(CONFIG_KEYBOARD_ADP5589)   += adp5589-keys.o
> diff --git a/drivers/input/keyboard/adc-keys.c 
> b/drivers/input/keyboard/adc-keys.c
> new file mode 100644
> index ..cf299ff517a0
> --- /dev/null
> +++ b/drivers/input/keyboard/adc-keys.c
> @@ -0,0 +1,210 @@
> +/* Input driver for resistor ladder connected on ADC
> + *
> + * Copyright (c) 2016 Alexandre Belloni
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License version 2 as published 
> by
> + * the Free Software Foundation.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +struct adc_keys_button {
> + u32 voltage;
> + u32 keycode;
> +};
> +
> +struct adc_keys_state {
> + struct iio_channel *channel;
> + u32 num_keys;
> + u32 last_key;
> + u32 keyup_voltage;
> + struct adc_keys_button *map;

const

> +};
> +
> +static void adc_keys_poll(struct input_polled_dev *dev)
> +{
> + struct adc_keys_state *st = dev->private;
> + int i, value, ret;
> + u32 diff, closest = 0x;
> + int keycode = 0;
> +
> + ret = iio_read_channel_processed(st->channel, &value);
> + if (ret < 0) {

> + if (st->last_key) {
> + input_report_key(dev->input, st->last_key, 0);
> + input_sync(dev->input);
> + st->last_key = 0;
> + }
> + return;
> + }
> +
> + for (i = 0; i < st->num_keys; i++) {
> + diff = abs(st->map[i].voltage - value);
> + if (diff < closest) {
> + closest = diff;
> + keycode = st->map[i].keycode;
> + }
> + }
> +
> + if (abs(st->keyup_voltage - value) < closest) {
> + input_report_key(dev->input, st->last_key, 0);
> + st->last_key = 0;
> + } else {
> + if (st->last_key && st->last_key != keycode)
> + input_report_key(dev->input, st->last_key, 0);
> + input_report_key(dev->input, keycode, 1);
> + st->last_key = keycode;
> + }

I think this can be simplified a bit, see version below.

> +
> + input_sync(dev->input);
> +}
> +
> +static int adc_keys_load_dt_keymap(struct device *dev,
> +struct adc_keys_state *st)
> +{
> + struct device_node *pp, *np = dev->of_node;
> + int i;
> +
> + st->num_keys = of_get_child_count(np);
> + if (st->num_keys == 0) {
> + dev_err(dev, "keymap is missing\n");
> + return -EINVAL;
> + }

There is no need to limit this driver to OF, generic device properties
will allow us to use it on DT, ACPI and legacy boards.

> +
> 

Re: [PATCH 3/3] Add name fields in shrinker tracepoint definitions

2016-07-12 Thread Tony Jones
On 07/11/2016 07:18 AM, Vlastimil Babka wrote:
> On 07/09/2016 11:05 AM, Janani Ravichandran wrote:
>> Currently, the mm_shrink_slab_start and mm_shrink_slab_end
>> tracepoints tell us how much time was spent in a shrinker, the number of
>> objects scanned, etc. But there is no information about the identity of
>> the shrinker. This patch enables the trace output to display names of
>> shrinkers.
>>
>> ---
>>  include/trace/events/vmscan.h | 10 --
>>  1 file changed, 8 insertions(+), 2 deletions(-)
>>
>> diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
>> index 0101ef3..be4c5b0 100644
>> --- a/include/trace/events/vmscan.h
>> +++ b/include/trace/events/vmscan.h
>> @@ -189,6 +189,7 @@ TRACE_EVENT(mm_shrink_slab_start,
>>  cache_items, delta, total_scan),
>>
>>  TP_STRUCT__entry(
>> +__field(char *, name)
>>  __field(struct shrinker *, shr)
>>  __field(void *, shrink)
>>  __field(int, nid)
>> @@ -202,6 +203,7 @@ TRACE_EVENT(mm_shrink_slab_start,
>>  ),
>>
>>  TP_fast_assign(
>> +__entry->name = shr->name;
>>  __entry->shr = shr;
>>  __entry->shrink = shr->scan_objects;
>>  __entry->nid = sc->nid;
>> @@ -214,7 +216,8 @@ TRACE_EVENT(mm_shrink_slab_start,
>>  __entry->total_scan = total_scan;
>>  ),
>>
>> -TP_printk("%pF %p: nid: %d objects to shrink %ld gfp_flags %s 
>> pgs_scanned %ld lru_pgs %ld cache items %ld delta %lld total_scan %ld",
>> +TP_printk("name: %s %pF %p: nid: %d objects to shrink %ld gfp_flags %s 
>> pgs_scanned %ld lru_pgs %ld cache items %ld delta %lld total_scan %ld",
>> +__entry->name,
> 
> Is this legal to do when printing is not done via the /sys ... file 
> itself, but raw data is collected and then printed by e.g. trace-cmd? 
> How can it possibly interpret the "char *" kernel pointer?

I actually had a similar patch set to this. I was going to post it, but Janani
beat me to it ;-)

Vlastimil is correct; I'll attach my patch below so you can see the
difference.  Otherwise you won't get correct behavior passing through perf.   
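
As a rough userspace illustration of the point (not the tracing macros themselves), compare a record that stores only a pointer with one that carries the bytes. All demo_ names here are hypothetical, and DEMO_NAME_LEN merely mirrors the 32-byte limit used in the patch below.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_NAME_LEN 32

/*
 * Record that stores only the address of the name. Once the raw record is
 * handed to another address space, the pointer cannot be dereferenced.
 */
struct demo_rec_ptr {
	const char *name;
};

/* Record that carries the bytes themselves, so any consumer can print it. */
struct demo_rec_arr {
	char name[DEMO_NAME_LEN];
};

/* Bounded copy that always NUL-terminates, similar in spirit to strlcpy(). */
static void demo_copy_name(char *dst, const char *src, size_t size)
{
	size_t len;

	if (size == 0)
		return;

	len = strlen(src);
	if (len >= size)
		len = size - 1;
	memcpy(dst, src, len);
	dst[len] = '\0';
}

/* Fill a self-contained record from a name owned by someone else. */
static void demo_fill(struct demo_rec_arr *rec, const char *name)
{
	demo_copy_name(rec->name, name, sizeof(rec->name));
}
```

A byte-for-byte copy of struct demo_rec_arr still prints the right name, which is the property a raw-data consumer relies on; names longer than 31 characters are truncated.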

I also have a patch which adds a similar latency script (python) but interfaces 
it into the perf script setup.

Tony

---

Pass shrinker name in shrink slab tracepoints

Signed-off-by: Tony Jones 
---
 include/trace/events/vmscan.h | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index 0101ef3..0a15948 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -16,6 +16,8 @@
 #define RECLAIM_WB_SYNC    0x0004u /* Unused, all reclaim async */
 #define RECLAIM_WB_ASYNC   0x0008u
 
+#define SHRINKER_NAME_LEN  (size_t)32
+
 #define show_reclaim_flags(flags)  \
(flags) ? __print_flags(flags, "|", \
{RECLAIM_WB_ANON,   "RECLAIM_WB_ANON"}, \
@@ -190,6 +192,7 @@ TRACE_EVENT(mm_shrink_slab_start,
 
TP_STRUCT__entry(
__field(struct shrinker *, shr)
+   __array(char, name, SHRINKER_NAME_LEN)
__field(void *, shrink)
__field(int, nid)
__field(long, nr_objects_to_shrink)
@@ -203,6 +206,7 @@ TRACE_EVENT(mm_shrink_slab_start,
 
TP_fast_assign(
__entry->shr = shr;
+   strlcpy(__entry->name, shr->name, SHRINKER_NAME_LEN);
__entry->shrink = shr->scan_objects;
__entry->nid = sc->nid;
__entry->nr_objects_to_shrink = nr_objects_to_shrink;
@@ -214,9 +218,10 @@ TRACE_EVENT(mm_shrink_slab_start,
__entry->total_scan = total_scan;
),
 
-   TP_printk("%pF %p: nid: %d objects to shrink %ld gfp_flags %s 
pgs_scanned %ld lru_pgs %ld cache items %ld delta %lld total_scan %ld",
+   TP_printk("%pF %p(%s): nid: %d objects to shrink %ld gfp_flags %s 
pgs_scanned %ld lru_pgs %ld cache items %ld delta %lld total_scan %ld",
__entry->shrink,
__entry->shr,
+   __entry->name,
__entry->nid,
__entry->nr_objects_to_shrink,
show_gfp_flags(__entry->gfp_flags),
@@ -236,6 +241,7 @@ TRACE_EVENT(mm_shrink_slab_end,
 
TP_STRUCT__entry(
__field(struct shrinker *, shr)
+   __array(char, name, SHRINKER_NAME_LEN)
__field(int, nid)
__field(void *, shrink)
__field(long, unused_scan)
@@ -246,6 +252,7 @@ TRACE_EVENT(mm_shrink_slab_end,
 
TP_fast_assign(
__entry->shr = shr;
+   strlcpy(__entry->name, shr->name, SHRINKER_NAME_LEN);
__entry->nid = nid;
__entry->shrink = shr->scan_objects;
__entry->unused_scan = unused_scan_cnt;
@@ -254,9 +261,10 @@ 

[PATCH] rtc: ds1307: fix century bit support

2016-07-12 Thread Alexandre Belloni
Add an option to properly support the century bit of ds1337 and compatibles
and ds1340.
Because the driver had a bug until now, it is not possible to switch users
to the fixed code directly as RTCs in the field will wrongly have the
century bit set.

Signed-off-by: Alexandre Belloni 
---

Arnaud, do you mind testing that patch, I still don't have the necessary
hardware but I think this is the proper course of action.
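
For anyone testing, here is a minimal userspace sketch of the intended decoding. It is an assumption-laden illustration: the demo_ names are hypothetical, and the century flag is assumed to be bit 7 of the month register. The flag is tested with a bitwise AND so that only the century bit, and not the month digits, decides the result.

```c
#include <stdint.h>

/* Assumed bit position: bit 7 of the month register flags the century. */
#define DEMO_BIT_CENTURY 0x80

/* Convert two packed BCD digits to binary. */
static unsigned int demo_bcd2bin(uint8_t v)
{
	return (v & 0x0f) + (v >> 4) * 10;
}

/*
 * Decode tm_year (years since 1900) from the year and month registers.
 * The base is 20YY; a set century bit moves the result to 21YY.
 */
static int demo_decode_year(uint8_t year_reg, uint8_t month_reg)
{
	int tm_year = (int)demo_bcd2bin(year_reg) + 100;

	if (month_reg & DEMO_BIT_CENTURY)
		tm_year += 100;

	return tm_year;
}
```

With year register 0x16 and month register 0x07 this gives 116 (2016). With the century bit set (0x87) it gives 216 (2116), which is the "date in the next century" effect described in the Kconfig help for RTCs set by an older kernel.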

 drivers/rtc/Kconfig  | 14 ++
 drivers/rtc/rtc-ds1307.c | 48 +++-
 2 files changed, 57 insertions(+), 5 deletions(-)

diff --git a/drivers/rtc/Kconfig b/drivers/rtc/Kconfig
index 8526f1cded08..f47c2f5ff70d 100644
--- a/drivers/rtc/Kconfig
+++ b/drivers/rtc/Kconfig
@@ -224,6 +224,20 @@ config RTC_DRV_DS1307_HWMON
  Say Y here if you want to expose temperature sensor data on
  rtc-ds1307 (only DS3231)
 
+config RTC_DRV_DS1307_CENTURY
+   bool "Century bit support for rtc-ds1307"
+   depends on RTC_DRV_DS1307
+   default n
+   help
+ The DS1307 driver suffered from a bug where it was enabling the
+ century bit unconditionally but never used it when reading the time.
+ It made the driver unable to support dates beyond 2099.
+ Setting this option will add proper support for the century bit but if
+ the time was previously set using a kernel predating this option,
+ reading the date will return a date in the next century.
+ To solve that, you could boot a kernel without this option set, set
+ the RTC date and then boot a kernel with this option set.
+
 config RTC_DRV_DS1374
tristate "Dallas/Maxim DS1374"
help
diff --git a/drivers/rtc/rtc-ds1307.c b/drivers/rtc/rtc-ds1307.c
index 8e1c5cb6ece6..4c5890864d9c 100644
--- a/drivers/rtc/rtc-ds1307.c
+++ b/drivers/rtc/rtc-ds1307.c
@@ -382,10 +382,25 @@ static int ds1307_get_time(struct device *dev, struct 
rtc_time *t)
t->tm_mday = bcd2bin(ds1307->regs[DS1307_REG_MDAY] & 0x3f);
tmp = ds1307->regs[DS1307_REG_MONTH] & 0x1f;
t->tm_mon = bcd2bin(tmp) - 1;
-
-   /* assume 20YY not 19YY, and ignore DS1337_BIT_CENTURY */
t->tm_year = bcd2bin(ds1307->regs[DS1307_REG_YEAR]) + 100;
 
+#ifdef CONFIG_RTC_DRV_DS1307_CENTURY
+   switch (ds1307->type) {
+   case ds_1337:
+   case ds_1339:
+   case ds_3231:
+   if (ds1307->regs[DS1307_REG_MONTH] & DS1337_BIT_CENTURY)
+   t->tm_year += 100;
+   break;
+   case ds_1340:
+   if (ds1307->regs[DS1307_REG_HOUR] & DS1340_BIT_CENTURY)
+   t->tm_year += 100;
+   break;
+   default:
+   break;
+   }
+#endif
+
dev_dbg(dev, "%s secs=%d, mins=%d, "
"hours=%d, mday=%d, mon=%d, year=%d, wday=%d\n",
"read", t->tm_sec, t->tm_min,
@@ -409,6 +424,27 @@ static int ds1307_set_time(struct device *dev, struct 
rtc_time *t)
t->tm_hour, t->tm_mday,
t->tm_mon, t->tm_year, t->tm_wday);
 
+#ifdef CONFIG_RTC_DRV_DS1307_CENTURY
+   if (t->tm_year < 100)
+   return -EINVAL;
+
+   switch (ds1307->type) {
+   case ds_1337:
+   case ds_1339:
+   case ds_3231:
+   case ds_1340:
+   if (t->tm_year > 299)
+   return -EINVAL;
+   default:
+   if (t->tm_year > 199)
+   return -EINVAL;
+   break;
+   }
+#else
+   if (t->tm_year < 100 || t->tm_year > 199)
+   return -EINVAL;
+#endif
+
buf[DS1307_REG_SECS] = bin2bcd(t->tm_sec);
buf[DS1307_REG_MIN] = bin2bcd(t->tm_min);
buf[DS1307_REG_HOUR] = bin2bcd(t->tm_hour);
@@ -424,11 +460,13 @@ static int ds1307_set_time(struct device *dev, struct 
rtc_time *t)
case ds_1337:
case ds_1339:
case ds_3231:
-   buf[DS1307_REG_MONTH] |= DS1337_BIT_CENTURY;
+   if (t->tm_year > 199)
+   buf[DS1307_REG_MONTH] |= DS1337_BIT_CENTURY;
break;
case ds_1340:
-   buf[DS1307_REG_HOUR] |= DS1340_BIT_CENTURY_EN
-   | DS1340_BIT_CENTURY;
+   buf[DS1307_REG_HOUR] |= DS1340_BIT_CENTURY_EN;
+   if (t->tm_year > 199)
+   buf[DS1307_REG_HOUR] |= DS1340_BIT_CENTURY;
break;
case mcp794xx:
/*
-- 
2.8.1
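
The century-bit logic in the patch above can be sketched outside the driver. This is only an illustration, not the driver's code: example_tm_year() and the EXAMPLE_BIT_CENTURY value of 0x80 are assumptions made for the sketch. It shows the flag being tested with a bitwise AND; a logical AND would be true for any non-zero register value.

```c
#include <stdint.h>

/* Assumed mask, for illustration only; the driver has its own define. */
#define EXAMPLE_BIT_CENTURY 0x80

/*
 * Hypothetical helper: compute tm_year (years since 1900) from a
 * two-digit binary year and the raw month register.
 */
static int example_tm_year(uint8_t year, uint8_t month_reg)
{
	int tm_year = year + 100;		/* 20YY by default */

	if (month_reg & EXAMPLE_BIT_CENTURY)	/* bitwise test of the flag */
		tm_year += 100;			/* 21YY */

	return tm_year;
}
```

With the century flag clear, year 16 maps to tm_year 116 (2016); with the flag set it maps to 216 (2116).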



[PATCH] media: s5p-mfc remove unnecessary error messages

2016-07-12 Thread Shuah Khan
Remove unnecessary error messages, as the appropriate error code is already returned.

Signed-off-by: Shuah Khan 
---
 drivers/media/platform/s5p-mfc/s5p_mfc.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/media/platform/s5p-mfc/s5p_mfc.c 
b/drivers/media/platform/s5p-mfc/s5p_mfc.c
index b6fde20..906f80c 100644
--- a/drivers/media/platform/s5p-mfc/s5p_mfc.c
+++ b/drivers/media/platform/s5p-mfc/s5p_mfc.c
@@ -759,7 +759,6 @@ static int s5p_mfc_open(struct file *file)
/* Allocate memory for context */
ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
if (!ctx) {
-   mfc_err("Not enough memory\n");
ret = -ENOMEM;
goto err_alloc;
}
@@ -776,7 +775,6 @@ static int s5p_mfc_open(struct file *file)
while (dev->ctx[ctx->num]) {
ctx->num++;
if (ctx->num >= MFC_NUM_CONTEXTS) {
-   mfc_err("Too many open contexts\n");
ret = -EBUSY;
goto err_no_ctx;
}
-- 
2.7.4



Re: divide error: 0000 [#1] SMP in task_numa_migrate - handle_mm_fault vanilla 4.4.6

2016-07-12 Thread Greg KH
On Tue, Jul 12, 2016 at 03:12:35PM +0200, Peter Zijlstra wrote:
> On Mon, Jul 11, 2016 at 03:33:53PM -0700, Greg KH wrote:
> 
> > Oops, this commit does not apply cleanly to 4.6 or 4.4-stable trees.
> > Can someone send me the backported version that they have tested to
> > work properly so I can queue it up?
> 
> I've never actually been able to reproduce, but the attached patches
> apply, the reject was trivial.
> 
> They seem to compile and boot on my main test rig, but nothing else was
> done but build the next kernel with it.

Thanks for these, now applied.

greg k-h


[PATCH] CodingStyle: Remove "Don't use C99-style comments"

2016-07-12 Thread Joe Perches
Because Linus may still be reading source code on greenbar paper
instead of color terminals with code syntax highlighting and
appropriate font decorations.

Link: 
http://lkml.kernel.org/r/ca+55afyqyjerovmssosks7pesszbr4vnp-3quuwhqk4a4_j...@mail.gmail.com

Signed-off-by: Joe Perches 
---
 Documentation/CodingStyle | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/Documentation/CodingStyle b/Documentation/CodingStyle
index 9a70ddd..19b2e9c 100644
--- a/Documentation/CodingStyle
+++ b/Documentation/CodingStyle
@@ -461,9 +461,6 @@ When commenting the kernel API functions, please use the 
kernel-doc format.
 See the files Documentation/kernel-doc-nano-HOWTO.txt and scripts/kernel-doc
 for details.
 
-Linux style for comments is the C89 "/* ... */" style.
-Don't use C99-style "// ..." comments.
-
 The preferred style for long (multi-line) comments is:
 
/*
-- 
2.8.0.rc4.16.g56331f8



Re: [Query] Preemption (hogging) of the work handler

2016-07-12 Thread Viresh Kumar
On 12-07-16, 16:19, Viresh Kumar wrote:
> Okay, we have tracked this BUG and it's really interesting.
> 
> I hacked the platform's serial driver to implement a putchar() routine
> that simply writes to the FIFO in polling mode, that helped us in
> tracing on where we are going wrong.
> 
> The problem is that we are running asynchronous printks and we call
> wake_up_process() from the last running CPU which has disabled
> interrupts. That takes us to: try_to_wake_up().
> 
> In our case the CPU gets deadlocked on this line in try_to_wake_up().
> 
> raw_spin_lock_irqsave(&p->pi_lock, flags);
> 
> I will explain how:
> 
> The try_to_wake_up() function takes us through the scheduler code (RT
> sched), to the hrtimer code, where we eventually call ktime_get() (for
> the MONOTONIC clock used for hrtimer). And this function has this:
> 
> WARN_ON(timekeeping_suspended);
> 
> This starts another printk while we are in the middle of
> wake_up_process() and the CPU tries to take the above lock again and
> gets stuck there :)
> 
> This doesn't happen every time because we don't always call ktime_get()
> and it is called only if hrtimer_active() returns false.
> 
> This happened because of a WARN_ON() but it can happen anyway. Think
> about this case:
> 
> - offline all CPUs, except 0
> - call any routine that prints messages after disabling interrupts,
>   etc.
> - If any of the functions within wake_up_process() does a print, we are
>   screwed.
> 
> So the thing is that we can't really call wake_up_process() in cases
> where the last CPU disables interrupts. And that's why my fixup patch
> (which moved to synchronous prints after suspend) really works.

Actually, any printk done from wake_up_process() will hit this, even
if all the other CPUs are up as well :)

It's only BUG_ON() which has special handling in printk, and so we
print that safely.

-- 
viresh
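
The recursion described above can be modelled in a few lines of userspace C. This is a toy sketch, not kernel code: every toy_* name is hypothetical, and the lock is a plain flag that reports a failure where a real raw spinlock would spin forever.

```c
#include <stdbool.h>

/* Toy model of a non-recursive lock: just a flag. */
struct toy_lock {
	bool held;
};

/* Returns false where a real spinlock would spin forever. */
static bool toy_lock_acquire(struct toy_lock *l)
{
	if (l->held)
		return false;
	l->held = true;
	return true;
}

static void toy_lock_release(struct toy_lock *l)
{
	l->held = false;
}

static struct toy_lock toy_pi_lock;
static int toy_deadlocks;

static void toy_printk(bool warn_inside_wakeup);

/* Stand-in for the wake-up path: takes the lock, may warn while holding it. */
static void toy_wake_up(bool warn_inside_wakeup)
{
	if (!toy_lock_acquire(&toy_pi_lock)) {
		toy_deadlocks++;	/* the same CPU already holds the lock */
		return;
	}
	if (warn_inside_wakeup)
		toy_printk(false);	/* warning -> printk -> wake-up again */
	toy_lock_release(&toy_pi_lock);
}

/* Stand-in for an asynchronous printk that wakes a printing thread. */
static void toy_printk(bool warn_inside_wakeup)
{
	toy_wake_up(warn_inside_wakeup);
}

/* Issue one printk and report how many self-deadlocks the model hit. */
static int toy_run(bool warn_inside_wakeup)
{
	toy_pi_lock.held = false;
	toy_deadlocks = 0;
	toy_printk(warn_inside_wakeup);
	return toy_deadlocks;
}
```

In this model a print issued without a nested warning completes cleanly, while a warning raised inside the wake-up path re-enters the lock once and is counted as a deadlock.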


[PATCH] remoteproc: qcom: hexagon: Clean up mpss validation

2016-07-12 Thread Bjorn Andersson
As reported by Dan, the unsigned "val" can't be negative. But instead of
correcting the check for early errors here, followed by a wait for the
validation result to show the error or success, we can consolidate these
two parts of the validation process into the validation function.

Reported-by: Dan Carpenter 
Signed-off-by: Bjorn Andersson 
---
 drivers/remoteproc/qcom_q6v5_pil.c | 18 +++---
 1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/drivers/remoteproc/qcom_q6v5_pil.c 
b/drivers/remoteproc/qcom_q6v5_pil.c
index 63406ee689d9..b8fba4dc2447 100644
--- a/drivers/remoteproc/qcom_q6v5_pil.c
+++ b/drivers/remoteproc/qcom_q6v5_pil.c
@@ -386,7 +386,6 @@ static int q6v5_mpss_validate(struct q6v5 *qproc, const 
struct firmware *fw)
phys_addr_t fw_addr;
bool relocate;
size_t size;
-   u32 val;
int ret;
int i;
 
@@ -425,8 +424,13 @@ static int q6v5_mpss_validate(struct q6v5 *qproc, const 
struct firmware *fw)
writel(size, qproc->rmb_base + RMB_PMI_CODE_LENGTH_REG);
}
 
-   val = readl(qproc->rmb_base + RMB_MBA_STATUS_REG);
-   return val < 0 ? val : 0;
+   ret = q6v5_rmb_mba_wait(qproc, RMB_MBA_AUTH_COMPLETE, 1);
+   if (ret == -ETIMEDOUT)
+   dev_err(qproc->dev, "MPSS authentication timed out\n");
+   else if (ret < 0)
+   dev_err(qproc->dev, "MPSS authentication failed: %d\n", ret);
+
+   return ret < 0 ? ret : 0;
 }
 
 static int q6v5_mpss_load(struct q6v5 *qproc)
@@ -463,14 +467,6 @@ static int q6v5_mpss_load(struct q6v5 *qproc)
goto release_firmware;
 
ret = q6v5_mpss_validate(qproc, fw);
-   if (ret)
-   goto release_firmware;
-
-   ret = q6v5_rmb_mba_wait(qproc, RMB_MBA_AUTH_COMPLETE, 1);
-   if (ret == -ETIMEDOUT)
-   dev_err(qproc->dev, "MPSS authentication timed out\n");
-   else if (ret < 0)
-   dev_err(qproc->dev, "MPSS authentication failed: %d\n", ret);
 
 release_firmware:
release_firmware(fw);
-- 
2.5.0



[PATCH 1/2] checkpatch: Skip long lines that use an EFI_GUID macro

2016-07-12 Thread Joe Perches
These are also possible single line uses that exceed the
generic maximum line length (typically 80 columns)

Signed-off-by: Joe Perches 
---
 scripts/checkpatch.pl | 4 
 1 file changed, 4 insertions(+)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 4904ced..cc787e6 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -2764,6 +2764,10 @@ sub process {
 $line =~ /^\+\s*#\s*define\s+\w+\s+$String$/) {
$msg_type = "";
 
+   # EFI_GUID is another special case
+   } elsif ($line =~ /^\+.*\bEFI_GUID\s*\(/) {
+   $msg_type = "";
+
# Otherwise set the alternate message types
 
# a comment starts before $max_line_length
-- 
2.8.0.rc4.16.g56331f8



[PATCH 2/2] checkpatch: Allow c99 style // comments

2016-07-12 Thread Joe Perches
Sanitise the lines that contain c99 comments so that
the error doesn't get emitted.

Signed-off-by: Joe Perches 
---
 scripts/checkpatch.pl | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index cc787e6..a0e5112 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -55,6 +55,7 @@ my $spelling_file = "$D/spelling.txt";
 my $codespell = 0;
 my $codespellfile = "/usr/share/codespell/dictionary.txt";
 my $color = 1;
+my $allow_c99_comments = 1;
 
 sub help {
my ($exitcode) = @_;
@@ -1145,6 +1146,11 @@ sub sanitise_line {
$res =~ s@(\#\s*(?:error|warning)\s+).*@$1$clean@;
}
 
+   if ($allow_c99_comments && $res =~ m@(//.*$)@) {
+   my $match = $1;
+   $res =~ s/\Q$match\E/"$;" x length($match)/e;
+   }
+
return $res;
 }
 
-- 
2.8.0.rc4.16.g56331f8
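
What the sanitising step above does can be sketched in C (the script itself is Perl). This is an illustration under stated assumptions: mask_c99_comment() is a hypothetical helper, the placeholder character is chosen by the caller, and the sketch does not check whether the "//" sits inside a string literal.

```c
#include <string.h>

/*
 * Hypothetical helper: overwrite a "// ..." comment with a placeholder
 * character, keeping the line length unchanged so that column positions
 * in later checks still line up.
 */
static void mask_c99_comment(char *line, char placeholder)
{
	char *p = strstr(line, "//");

	if (!p)
		return;		/* no C99 comment on this line */

	for (; *p && *p != '\n'; p++)
		*p = placeholder;
}
```

Keeping the length constant mirrors the patch's use of a repeated placeholder of length($match); later checks then see filler where the comment text used to be.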



[PATCH] arm64: take SHN_LIVEPATCH syms into account when calculating plt_max_entries

2016-07-12 Thread Jessica Yu

SHN_LIVEPATCH symbols are technically a subset of SHN_UNDEF/undefined
symbols, except that their addresses are resolved by livepatch at runtime.
Therefore, when calculating the upper-bound for the number of plt entries
to allocate, make sure to take livepatch symbols into account as well.

Signed-off-by: Jessica Yu 
---
arch/arm64/kernel/module-plts.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/module-plts.c b/arch/arm64/kernel/module-plts.c
index 1ce90d8..1e95dc1 100644
--- a/arch/arm64/kernel/module-plts.c
+++ b/arch/arm64/kernel/module-plts.c
@@ -122,7 +122,8 @@ static unsigned int count_plts(Elf64_Sym *syms, Elf64_Rela 
*rela, int num)
* as well, so modules can never grow beyond that limit.
*/
   s = syms + ELF64_R_SYM(rela[i].r_info);
-   if (s->st_shndx != SHN_UNDEF)
+   if (s->st_shndx != SHN_UNDEF &&
+   s->st_shndx != SHN_LIVEPATCH)
   break;

   /*
--
2.5.5
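
The counting rule changed above can be illustrated with a simplified, standalone sketch. The ex_* names and the one-field symbol struct are hypothetical, the 0xff20 value for SHN_LIVEPATCH is assumed from the kernel's uapi ELF header, and the real count_plts() walks relocations rather than a bare symbol array.

```c
#include <stddef.h>
#include <stdint.h>

#define EX_SHN_UNDEF		0
#define EX_SHN_LIVEPATCH	0xff20	/* assumed value, see lead-in */

/* Hypothetical, cut-down symbol: only the section index matters here. */
struct ex_sym {
	uint16_t st_shndx;
};

/*
 * Count symbols that may need a PLT entry: undefined symbols and
 * livepatch symbols, both of which are resolved outside the module.
 */
static unsigned int ex_count_plt_candidates(const struct ex_sym *syms, size_t n)
{
	unsigned int count = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (syms[i].st_shndx != EX_SHN_UNDEF &&
		    syms[i].st_shndx != EX_SHN_LIVEPATCH)
			continue;	/* defined in the module, no PLT needed */
		count++;
	}

	return count;
}
```

Without the second comparison, a livepatch symbol would be skipped and the upper bound would come out too small.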



Severe performance regression w/ 4.4+ on Android due to cgroup locking changes

2016-07-12 Thread John Stultz
Hey Tejun,

  So Dmitry Shmidt recently noticed that with 4.4 based systems we're
seeing quite a bit of performance overhead from
__cgroup_procs_write().

With the 4.4 tree as it stands, we're seeing __cgroup_procs_write() quite
often take 10s of milliseconds to execute (with max times up in the
80ms range).

While with 4.1 it was quite often in the single usec range, and max
time values were still in the sub-millisecond range.

The majority of these performance regressions seem to come from the
locking changes in:

3014dde762f6 ("cgroup: simplify threadgroup locking")
and
1ed1328792ff  ("sched, cgroup: replace signal_struct->group_rwsem with
a global percpu_rwsem")

Dmitry has found that by reverting these two changes (which don't
revert easily), we can get back down to the 10-100 usec range for
most calls, with max values occasionally spiking to ~18ms.

Those two commits do talk about performance regressions, that were
supposedly alleviated by percpu_rwsem changes, but I'm not sure we are
seeing this.

In 1ed1328792ff, the commit talks about the write path being a fairly
cold path, but with Android I worry this may not actually be the case,
as Android uses cpuset cgroups to group tasks into foreground and
background tasks, but this means when switching applications, tasks
are migrated between cgroups. Putting an additional 80 millisecond
delay on this adds potentially visible latencies on task switching.

Reverting those two changes in the Android common.git tree doesn't
feel like a good long term solution here, so I was wondering if you
had any thoughts on how to further reduce the performance regression
here?

All the credit for finding this goes to Dmitry, I just was able to
reproduce his results and thought we should bring it up for discussion
here.

thanks
-john


Re: [PATCH net-next 3/3] bpf: avoid stack copy and use skb ctx for event output

2016-07-12 Thread Fengguang Wu

Hi Daniel,

On Wed, Jul 13, 2016 at 01:45:47AM +0200, Daniel Borkmann wrote:

On 07/13/2016 01:25 AM, kbuild test robot wrote:

Hi,

[auto build test WARNING on net-next/master]

url:
https://github.com/0day-ci/linux/commits/Daniel-Borkmann/BPF-event-output-helper-improvements/20160713-065944
config: s390-allyesconfig (attached as .config)
compiler: s390x-linux-gnu-gcc (Debian 5.3.1-8) 5.3.1 20160205
reproduce:
 wget 
https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross
 -O ~/bin/make.cross
 chmod +x ~/bin/make.cross
 # save the attached .config to linux build tree
 make.cross ARCH=s390

All warnings (new ones prefixed by >>):

kernel/trace/bpf_trace.c: In function 'bpf_perf_event_output':
kernel/trace/bpf_trace.c:284:1: warning: 'bpf_perf_event_output' uses 
dynamic stack allocation
 }
 ^
kernel/trace/bpf_trace.c: In function 'bpf_event_output':

kernel/trace/bpf_trace.c:319:1: warning: 'bpf_event_output' uses dynamic stack 
allocation

 }
 ^


Hmm, searching a bit on lkml, it seems these warnings on s390 are actually
mostly harmless I believe [1][2] ... looks like they are there to find structs
sitting on stack, for example, at least that's also what the currently existing
one in the above line (bpf_trace.c +284) appears to be about.


Yes it does look so. All such warnings happen only in s390:

% g -h -o '[^ ]*config' *dynamic-stack* | sort | uniq -c | sort -nr
   118 s390-allyesconfig
80 s390-allmodconfig

Let's ignore all of them on s390.

Thanks,
Fengguang


  [1] http://lkml.iu.edu/hypermail/linux/kernel/1601.2/04074.html
  [2] https://lkml.org/lkml/2013/6/25/42


Re: System freezes after OOM

2016-07-12 Thread Mikulas Patocka
The problem of swapping to dm-crypt is this.

The free memory goes low, kswapd decides that some page should be swapped 
out. However, when you swap to an encrypted device, writeback of each page 
requires another page to hold the encrypted data. dm-crypt uses mempools 
for all its structures and pages, so that it can make forward progress 
even if there is no memory free. However, the mempool code first allocates 
from general memory allocator and resorts to the mempool only if the 
memory is below limit.

So every attempt to swap out some page allocates another page.

As long as swapping is in progress, the free memory is below the limit 
(because the swapping activity itself consumes any memory over the limit). 
And that triggered the OOM killer prematurely.
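
The allocation order described above (general allocator first, reserved pool only as a fallback) can be sketched as follows. This is a toy model with hypothetical names, not the kernel's mempool implementation; the result of the general allocator is passed in as a parameter so that the fallback path is easy to see.

```c
#include <stddef.h>

/* Toy pool: a small stack of pre-reserved elements. */
struct toy_mempool {
	void *reserve[4];
	int nr_free;
};

/*
 * Hypothetical allocation routine. "from_general" is whatever the general
 * allocator returned (NULL on failure). The reserve is touched only when
 * the general allocator could not satisfy the request.
 */
static void *toy_mempool_alloc(struct toy_mempool *pool, void *from_general)
{
	if (from_general)
		return from_general;	/* normal case: reserve untouched */

	if (pool->nr_free > 0)
		return pool->reserve[--pool->nr_free];

	return NULL;	/* a real pool would wait for an element to be freed */
}
```

As long as the general allocator succeeds, every request consumes ordinary memory, which matches the point above that each swap-out attempt allocates another page.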


On Tue, 12 Jul 2016, Michal Hocko wrote:

> On Mon 11-07-16 11:43:02, Mikulas Patocka wrote:
> [...]
> > The general problem is that the memory allocator does 16 retries to 
> > allocate a page and then triggers the OOM killer (and it doesn't take into 
> > account how much swap space is free or how many dirty pages were really 
> > swapped out while it waited).
> 
> Well, that is not how it works exactly. We retry as long as there is a
> reclaim progress (at least one page freed) back off only if the
> reclaimable memory can exceed watermarks which is scaled down in 16
> retries. The overall size of free swap is not really that important if we
> cannot swap out like here due to complete memory reserves depletion:
> https://okozina.fedorapeople.org/bugs/swap_on_dmcrypt/vmlog-1462458369-0/sample-00011/dmesg:
> [   90.491276] Node 0 DMA free:0kB min:60kB low:72kB high:84kB 
> active_anon:4096kB inactive_anon:4636kB active_file:212kB inactive_file:280kB 
> unevictable:488kB isolated(anon):0kB isolated(file):0kB present:15992kB 
> managed:15908kB mlocked:488kB dirty:276kB writeback:4636kB mapped:476kB 
> shmem:12kB slab_reclaimable:204kB slab_unreclaimable:4700kB kernel_stack:48kB 
> pagetables:120kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB 
> free_cma:0kB writeback_tmp:0kB pages_scanned:61132 all_unreclaimable? yes
> [   90.491283] lowmem_reserve[]: 0 977 977 977
> [   90.491286] Node 0 DMA32 free:0kB min:3828kB low:4824kB high:5820kB 
> active_anon:423820kB inactive_anon:424916kB active_file:17996kB 
> inactive_file:21800kB unevictable:20724kB isolated(anon):384kB 
> isolated(file):0kB present:1032184kB managed:1001260kB mlocked:20724kB 
> dirty:25236kB writeback:49972kB mapped:23076kB shmem:1364kB 
> slab_reclaimable:13796kB slab_unreclaimable:43008kB kernel_stack:2816kB 
> pagetables:7320kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB 
> free_cma:0kB writeback_tmp:0kB pages_scanned:5635400 all_unreclaimable? yes
> 
> Look at the amount of free memory. It is completely depleted. So it
> smells like a process which has access to memory reserves has consumed
> all of it. I suspect a __GFP_MEMALLOC resp. PF_MEMALLOC from softirq
> context user which went off the leash.

It is caused by the commit f9054c70d28bc214b2857cf8db8269f4f45a5e23. Prior 
to this commit, mempool allocations set __GFP_NOMEMALLOC, so they never 
exhausted reserved memory. With this commit, mempool allocations drop 
__GFP_NOMEMALLOC, so they can dig deeper (if the process has PF_MEMALLOC, 
they can bypass all limits).

But swapping should proceed even if there is no memory free. There is a 
comment "TODO: this could cause a theoretical memory reclaim deadlock in 
the swap out path." in the function add_to_swap - but apart from that, 
swap should proceed even with no available memory, as long as all the 
drivers in the block layer use mempools.

> > So, it could prematurely trigger OOM killer on any slow swapping device 
> > (including dm-crypt). Michal Hocko reworked the OOM killer in the patch 
> > 0a0337e0d1d134465778a16f5cbea95086e8e9e0, but it still has the flaw that 
> > it triggers OOM even if there is plenty of swap space free.
> > 
> > Michal, would you accept a change to the OOM killer, to prevent it from 
> > triggering when there is free swap space?
> 
> No this doesn't sound like a proper solution. The current decision
> logic, as explained above relies on the feedback from the reclaim. A
> free swap space doesn't really mean we can make a forward progress.

I'm interested - why would you need to trigger the OOM killer if there is 
free swap space?

The only possibility is that all the memory is filled with unswappable 
kernel pages - but that condition could be detected if there is an unusually 
low number of anonymous and cache pages. Besides that - in what situation 
is triggering the OOM killer with free swap desired?

> -- 
> Michal Hocko
> SUSE Labs
> 

The kernel 4.7-rc almost deadlocks in another way. The machine got stuck 
and the following stacktrace was obtained when swapping to dm-crypt.

We can see that dm-crypt does a mempool allocation. But the mempool 
allocation somehow falls into throttle_vm_writeout. There, it waits for 
0.1 seconds. 

Re: [PATCH net-next 3/3] bpf: avoid stack copy and use skb ctx for event output

2016-07-12 Thread Daniel Borkmann

On 07/13/2016 01:25 AM, kbuild test robot wrote:

Hi,

[auto build test WARNING on net-next/master]

url:
https://github.com/0day-ci/linux/commits/Daniel-Borkmann/BPF-event-output-helper-improvements/20160713-065944
config: s390-allyesconfig (attached as .config)
compiler: s390x-linux-gnu-gcc (Debian 5.3.1-8) 5.3.1 20160205
reproduce:
 wget 
https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross
 -O ~/bin/make.cross
 chmod +x ~/bin/make.cross
 # save the attached .config to linux build tree
 make.cross ARCH=s390

All warnings (new ones prefixed by >>):

kernel/trace/bpf_trace.c: In function 'bpf_perf_event_output':
kernel/trace/bpf_trace.c:284:1: warning: 'bpf_perf_event_output' uses 
dynamic stack allocation
 }
 ^
kernel/trace/bpf_trace.c: In function 'bpf_event_output':

kernel/trace/bpf_trace.c:319:1: warning: 'bpf_event_output' uses dynamic stack 
allocation

 }
 ^


Hmm, searching a bit on lkml, it seems these warnings on s390 are actually
mostly harmless I believe [1][2] ... looks like they are there to find structs
sitting on stack, for example, at least that's also what the currently existing
one in the above line (bpf_trace.c +284) appears to be about.

Thanks,
Daniel

  [1] http://lkml.iu.edu/hypermail/linux/kernel/1601.2/04074.html
  [2] https://lkml.org/lkml/2013/6/25/42


Re: [RFC 0/3] extend kexec_file_load system call

2016-07-12 Thread Stewart Smith
Petr Tesarik  writes:
> On Tue, 12 Jul 2016 13:25:11 -0300
> Thiago Jung Bauermann  wrote:
>
>> Hi Eric,
>> 
>> I'm trying to understand your concerns leading to your nack. I hope you 
>> don't mind expanding your thoughts on them a bit.
>> 
>> Am Dienstag, 12 Juli 2016, 08:25:48 schrieb Eric W. Biederman:
>> > AKASHI Takahiro  writes:
>> > > Device tree blob must be passed to a second kernel on DTB-capable
>> > > archs, like powerpc and arm64, but the current kernel interface
>> > > lacks this support.
>> > > 
>> > > This patch extends kexec_file_load system call by adding an extra
>> > > argument to this syscall so that an arbitrary number of file descriptors
>> > > can be handed out from user space to the kernel.
>> > > 
>> > > See the background [1].
>> > > 
>> > > Please note that the new interface looks quite similar to the current
>> > > system call, but that it won't always mean that it provides the "binary
>> > > compatibility."
>> > > 
>> > > [1] http://lists.infradead.org/pipermail/kexec/2016-June/016276.html
>> > 
>> > So this design is wrong.  The kernel already has the device tree blob,
>> > you should not be extracting it from the kernel munging it, and then
>> > reinserting it in the kernel if you want signatures and everything to
>> > pass.
>> 
>> I don't understand how the kernel signature will be invalidated. 
>> 
>> There are some types of boot images that can embed a device tree blob in 
>> them, but the kernel can also be handed a separate device tree blob from 
>> firmware, the boot loader, or kexec. This latter case is what we are 
>> discussing, so we are not talking about modifying an embedded blob in the 
>> kernel image.
>> 
>> > What x86 does is pass its equivalent of the device tree blob from one
>> > kernel to another directly and behind the scenes.  It does not go
>> > through userspace for this.
>> > 
>> > Until a persuasive case can be made for going around the kernel and
>> > probably adding a feature (like code execution) that can be used to
>> > defeat the signature scheme I am going to nack this.
>> 
>> I also don't understand what you mean by code execution. How does passing a 
>> device tree blob via kexec enable code execution? How can the signature 
>> scheme be defeated?
>
> I'm not an expert on DTB, so I can't provide an example of code
> execution, but you have already mentioned the /chosen/linux,stdout-path
> property. If an attacker redirects the bootloader to an insecure
> console, they may get access to the system that would otherwise be
> impossible.

In this case, the user is sitting at the (or one of the) console(s) of
the machine. There could be petitboot UIs running on the VGA display,
IPMI serial over lan, local serial port. The logic behind setting
/chosen/linux,stdout-path is (currently) mostly to set it for the kernel
to what the user is interacting with. i.e. if you select an OS installer
to boot from the VGA console, you get a graphical installer running and
if you selected it from a text console, you get a text installer running
(on the appropriate console).

So the bootloader (petitboot) needs to work out which console is being
interacted with in order to set up /chosen/linux,stdout-path correctly.

This specific option could be passed as a kernel command line to the
next kernel, yes. However, isn't the kernel command line also an attack
vector? Is *every* command line option safe?

> In general, tampering with the hardware inventory of a machine opens up
> a security hole, and one must be very cautious which modifications are
> allowed. You're giving this power to an (unsigned, hence untrusted)
> userspace application; Eric argues that only the kernel should have
> this power.

In the case of petitboot on OpenPOWER, this is (or will be) a signed and
trusted kernel and userspace, verified by a previous bit of firmware.

-- 
Stewart Smith
OPAL Architect, IBM.



Re: [RFC 0/3] extend kexec_file_load system call

2016-07-12 Thread Stewart Smith
Vivek Goyal  writes:
> On Tue, Jul 12, 2016 at 10:58:09AM -0300, Thiago Jung Bauermann wrote:
>> Hello Eric,
>> 
>> Am Dienstag, 12 Juli 2016, 08:25:48 schrieb Eric W. Biederman:
>> > AKASHI Takahiro  writes:
>> > > Device tree blob must be passed to a second kernel on DTB-capable
>> > > archs, like powerpc and arm64, but the current kernel interface
>> > > lacks this support.
>> > > 
>> > > This patch extends kexec_file_load system call by adding an extra
>> > > argument to this syscall so that an arbitrary number of file descriptors
>> > > can be handed out from user space to the kernel.
>> > > 
>> > > See the background [1].
>> > > 
>> > > Please note that the new interface looks quite similar to the current
>> > > system call, but that it won't always mean that it provides the "binary
>> > > compatibility."
>> > > 
>> > > [1] http://lists.infradead.org/pipermail/kexec/2016-June/016276.html
>> > 
>> > So this design is wrong.  The kernel already has the device tree blob,
>> > you should not be extracting it from the kernel munging it, and then
>> > reinserting it in the kernel if you want signatures and everything to
>> > pass.
>> > 
>> > What x86 does is pass its equivalent of the device tree blob from one
>> > kernel to another directly and behind the scenes.  It does not go
>> > through userspace for this.
>> > 
>> > Until a persuasive case can be made for going around the kernel and
>> > probably adding a feature (like code execution) that can be used to
>> > defeat the signature scheme I am going to nack this.
>> 
>> There are situations where userspace needs to change things in the device 
>> tree to be used by the next kernel.
>> 
>> For example, Petitboot (the boot loader used in OpenPOWER machines) is a 
>> userspace application running in an intermediary Linux instance and uses 
>> kexec to load the target OS. It has to modify the device tree that will be 
>> used by the next kernel so that the next kernel uses the same console that 
>> petitboot was configured to use (i.e., set the /chosen/linux,stdout-path 
>> property). It also modifies the device tree to allow the kernel to inherit 
>> Petitboot's Openfirmware framebuffer.
>
> Can some of this be done with the help of kernel command line options for
> second kernel?

How would this be any more secure?

Passing in an address for a framebuffer via command line option means
you could scribble over any bit of memory, which is the same kind of
damage you could do by modifying the device tree.

-- 
Stewart Smith
OPAL Architect, IBM.



Re: [PATCH] userspace API definitions for auto-focus coil

2016-07-12 Thread Mauro Carvalho Chehab
Em Sat, 18 Jun 2016 17:38:46 +0200
Pavel Machek  escreveu:

> Hi!
> 
> > > Not V4L2_CID_USER_AD5820...?  
> > 
> > The rest of the controls have no USER as part of the macro name, so I
> > wouldn't use it here either.  
> 
> Ok.
> 
> > > Ok, separate header file for 2 lines seemed like a bit of overkill,
> > > but why not.  
> > 
> > That follows an existing pattern of how controls have been implemented in
> > other drivers.  
> 
> Ok.
> 
> > Could you merge this with the driver patch? I've dropped that from my ad5820
> > branch as it does not compile.  
> 
> Yes, merged patch should be in your inbox now.

The V4L2 core changes should be in a separate patch. Btw, you'll also
need to patch documentation to reflect such changes. We're right now
moving from DocBook to ReST markup language. The patches for it are
right now on a separate topic branch (docs-next), to be merged for
Kernel 4.8 on the next merge window.

You should either base the patch on such branch or wait for it to be
merged back mainstream to write such documentation additions.


> 
> Thanks,
>   Pavel
> 


-- 
Thanks,
Mauro


Re: [PATCH] loop: Make user notify for adding loop device failed

2016-07-12 Thread Jens Axboe

On 06/06/2016 07:05 PM, Minfei Huang wrote:

No error number is returned if the loop driver fails in alloc_disk()
while adding a new loop device. Return a proper error number so the
user is notified in this case.


Sorry about the delay, vacation got in the way. Added for 4.8, thanks.

--
Jens Axboe



Re: [PATCH 2/3] mm, meminit: Always return a valid node from early_pfn_to_nid

2016-07-12 Thread David Rientjes
On Fri, 8 Jul 2016, Mel Gorman wrote:

> early_pfn_to_nid can return node 0 if a PFN is invalid on machines
> that have no node 0. A machine with only node 1 was observed to crash
> with the following message
> 
>  BUG: unable to handle kernel paging request at 0002a3c8
>  PGD 0
>  Modules linked in:
>  Hardware name: Supermicro H8DSP-8/H8DSP-8, BIOS 080011  06/30/2006
>  task: 81c0d500 ti: 81c0 task.ti: 81c0
>  RIP: 0010:[]  [] 
> reserve_bootmem_region+0x6a/0xef
>  RSP: :81c03eb0  EFLAGS: 00010086
>  RAX:  RBX:  RCX: 
>  RDX: 81c03ec0 RSI: 81d205c0 RDI: 8213ee60
>  R13: ea00 R14: ea20 R15: ea20
>  FS:  () GS:8800fba0() knlGS:
>  CS:  0010 DS:  ES:  CR0: 80050033
>  CR2: 0002a3c8 CR3: 01c06000 CR4: 06b0
>  Stack:
>   81c03f00 0400 8800fbfc3200 81e2a2c0
>   81c03fb0 81c03f20 81dadf7d ea000240
>   ea00   0001
>  Call Trace:
>   [] free_all_bootmem+0x4b/0x12a
>   [] mem_init+0x70/0xa3
>   [] start_kernel+0x25b/0x49b
> 
> The problem is that early_page_uninitialised uses the early_pfn_to_nid
> helper which returns node 0 for invalid PFNs. No caller of early_pfn_to_nid
> cares except early_page_uninitialised. This patch has early_pfn_to_nid
> always return a valid node.
> 
> Signed-off-by: Mel Gorman 
> Cc:  # 4.2+

Acked-by: David Rientjes 

This makes me wonder about meminit_pfn_in_nid(), however, since if 
__early_pfn_to_nid() returns -1, which is the case in this bug, 
meminit_pfn_in_nid() will return true for any passed node.
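
The concern can be illustrated with a small userspace model. This is a sketch only: the model_* names, the PFN table and the fallback argument are hypothetical stand-ins and are not the kernel's code. It assumes the raw lookup returns -1 for a PFN it does not know, that the patched early_pfn_to_nid() substitutes an online node in that case, and that the in-node check treats -1 as "matches any node".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical memory layout: a machine whose only node is node 1. */
struct model_pfn_range {
	unsigned long start;	/* first PFN in the range */
	unsigned long end;	/* one past the last PFN */
	int nid;		/* node that owns the range */
};

static const struct model_pfn_range model_ranges[] = {
	{ 0x1000, 0x2000, 1 },
};

/* Models the raw lookup: returns -1 when no range covers the PFN. */
int model_raw_pfn_to_nid(unsigned long pfn)
{
	size_t i;

	for (i = 0; i < sizeof(model_ranges) / sizeof(model_ranges[0]); i++)
		if (pfn >= model_ranges[i].start && pfn < model_ranges[i].end)
			return model_ranges[i].nid;
	return -1;
}

/* Models the patched behaviour: never hand back an invalid node. */
int model_early_pfn_to_nid(unsigned long pfn, int first_online_node)
{
	int nid = model_raw_pfn_to_nid(pfn);

	if (nid < 0)
		nid = first_online_node;
	return nid;
}

/* Models the in-node check: an unknown PFN matches every node. */
bool model_pfn_in_nid(unsigned long pfn, int node)
{
	int nid = model_raw_pfn_to_nid(pfn);

	if (nid >= 0 && nid != node)
		return false;
	return true;
}
```

In this model an unknown PFN such as 0x10 is reported as belonging to whichever node is asked about, while a known PFN is only accepted for its real node.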


Re: [PATCH 3/3] mm, meminit: Ensure node is online before checking whether pages are uninitialised

2016-07-12 Thread David Rientjes
On Fri, 8 Jul 2016, Mel Gorman wrote:

> early_page_uninitialised looks up an arbitrary PFN. While a machine without
> node 0 will boot with "mm, page_alloc: Always return a valid node from
> early_pfn_to_nid", it works because it assumes that nodes are always in
> PFN order. This is not guaranteed so this patch adds robustness by always
> checking if the node being checked is online.
> 
> Signed-off-by: Mel Gorman 
> Cc:  # 4.2+

Acked-by: David Rientjes 


Linux 3.18.36-rt38-rc2

2016-07-12 Thread Steven Rostedt

Dear RT Folks,

This is the RT stable review cycle of patch 3.18.36-rt38-rc2.

Please scream at me if I messed something up. Please test the patches too.

The -rc release will be uploaded to kernel.org and will be deleted when
the final release is out. This is just a review release (or release candidate).

The pre-releases will not be pushed to the git repository, only the
final release is.

If all goes well, this patch will be converted to the next main release
on 7/14/2016.

Only difference from v1 is the removal of "ARM: imx: always use TWD on
IMX6Q"

Enjoy,

-- Steve


To build 3.18.36-rt38-rc2 directly, the following patches should be applied:

  http://www.kernel.org/pub/linux/kernel/v3.x/linux-3.18.tar.xz

  http://www.kernel.org/pub/linux/kernel/v3.x/patch-3.18.36.xz

  http://www.kernel.org/pub/linux/kernel/projects/rt/3.18/patch-3.18.36-rt38-rc2.patch.xz

You can also build from 3.18.36-rt37 by applying the incremental patch:

http://www.kernel.org/pub/linux/kernel/projects/rt/3.18/incr/patch-3.18.36-rt37-rt38-rc2.patch.xz


Changes from 3.18.36-rt37:

---


Josh Cartwright (1):
  list_bl: fixup bogus lockdep warning

Luiz Capitulino (1):
  mm: perform lru_add_drain_all() remotely

Mike Galbraith (2):
  mm/zsmalloc: Use get/put_cpu_light in zs_map_object()/zs_unmap_object()
  drivers/block/zram: Replace bit spinlocks with rtmutex for -rt

Peter Zijlstra (1):
  sched,preempt: Fix preempt_count manipulations

Rik van Riel (1):
  kvm, rt: change async pagefault code locking for PREEMPT_RT

Sebastian Andrzej Siewior (6):
  net: dev: always take qdisc's busylock in __dev_xmit_skb()
  drivers/block/zram: fixup compile for !RT
  kernel/printk: Don't try to print from IRQ/NMI region
  arm: lazy preempt: correct resched condition
  locallock: add local_lock_on()
  trace: correct off by one while recording the trace-event

Steven Rostedt (Red Hat) (1):
  Linux 3.18.36-rt38-rc2

Thomas Gleixner (1):
  perf/x86/intel/rapl: Make PMU lock raw


 arch/arm/kernel/entry-armv.S                |  6 -
 arch/x86/kernel/cpu/perf_event_intel_rapl.c | 20 +++---
 arch/x86/kernel/kvm.c                       | 37 +-
 drivers/block/zram/zram_drv.c               | 30 +++--
 drivers/block/zram/zram_drv.h               | 41 +
 include/asm-generic/preempt.h               |  4 +--
 include/linux/list_bl.h                     | 12 +
 include/linux/locallock.h                   |  6 +
 kernel/printk/printk.c                      | 10 +++
 kernel/trace/trace_events.c                 |  8 ++
 localversion-rt                             |  2 +-
 mm/swap.c                                   | 37 +-
 mm/zsmalloc.c                               |  4 +--
 net/core/dev.c                              |  4 +++
 14 files changed, 161 insertions(+), 60 deletions(-)


Linux 4.1.27-rt31-rc2

2016-07-12 Thread Steven Rostedt

Dear RT Folks,

This is the RT stable review cycle of patch 4.1.27-rt31-rc2.

Please scream at me if I messed something up. Please test the patches too.

The -rc release will be uploaded to kernel.org and will be deleted when
the final release is out. This is just a review release (or release candidate).

The pre-releases will not be pushed to the git repository, only the
final release is.

If all goes well, this patch will be converted to the next main release
on 7/14/2016.

Only difference from v1 is the removal of "ARM: imx: always use TWD on
IMX6Q"

Enjoy,

-- Steve


To build 4.1.27-rt31-rc2 directly, the following patches should be applied:

  http://www.kernel.org/pub/linux/kernel/v4.x/linux-4.1.tar.xz

  http://www.kernel.org/pub/linux/kernel/v4.x/patch-4.1.27.xz

  http://www.kernel.org/pub/linux/kernel/projects/rt/4.1/patch-4.1.27-rt31-rc2.patch.xz

You can also build from 4.1.27-rt30 by applying the incremental patch:

http://www.kernel.org/pub/linux/kernel/projects/rt/4.1/incr/patch-4.1.27-rt30-rt31-rc2.patch.xz


Changes from 4.1.27-rt30:

---


Alexandre Belloni (5):
  ARM: at91: pm: simply call at91_pm_init
  ARM: at91: pm: find and remap the pmc
  ARM: at91: pm: move idle functions to pm.c
  ARM: at91: remove useless includes and function prototypes
  usb: gadget: atmel: access the PMC using regmap

Josh Cartwright (1):
  list_bl: fixup bogus lockdep warning

Luiz Capitulino (1):
  mm: perform lru_add_drain_all() remotely

Mike Galbraith (2):
  mm/zsmalloc: Use get/put_cpu_light in zs_map_object()/zs_unmap_object()
  drivers/block/zram: Replace bit spinlocks with rtmutex for -rt

Peter Zijlstra (1):
  sched,preempt: Fix preempt_count manipulations

Rik van Riel (1):
  kvm, rt: change async pagefault code locking for PREEMPT_RT

Sebastian Andrzej Siewior (6):
  net: dev: always take qdisc's busylock in __dev_xmit_skb()
  drivers/block/zram: fixup compile for !RT
  kernel/printk: Don't try to print from IRQ/NMI region
  arm: lazy preempt: correct resched condition
  locallock: add local_lock_on()
  trace: correct off by one while recording the trace-event

Steven Rostedt (Red Hat) (1):
  Linux 4.1.27-rt31-rc2

Thomas Gleixner (1):
  perf/x86/intel/rapl: Make PMU lock raw


 arch/arm/kernel/entry-armv.S                |  6 ++-
 arch/arm/mach-at91/at91rm9200.c             |  2 -
 arch/arm/mach-at91/at91sam9.c               |  2 -
 arch/arm/mach-at91/generic.h                | 13 +-
 arch/arm/mach-at91/pm.c                     | 70 -
 arch/arm/mach-at91/sama5.c                  |  2 +-
 arch/x86/kernel/cpu/perf_event_intel_rapl.c | 20 -
 arch/x86/kernel/kvm.c                       | 37 +++
 drivers/block/zram/zram_drv.c               | 30 +++--
 drivers/block/zram/zram_drv.h               | 41 +
 drivers/clk/at91/pmc.c                      | 15 ---
 drivers/usb/gadget/udc/atmel_usba_udc.c     | 20 -
 drivers/usb/gadget/udc/atmel_usba_udc.h     |  2 +
 include/asm-generic/preempt.h               |  4 +-
 include/linux/list_bl.h                     | 12 ++---
 include/linux/locallock.h                   |  6 +++
 kernel/printk/printk.c                      | 10 +
 kernel/trace/trace_events.c                 |  8
 localversion-rt                             |  2 +-
 mm/swap.c                                   | 37 ---
 mm/zsmalloc.c                               |  4 +-
 net/core/dev.c                              |  4 ++
 22 files changed, 235 insertions(+), 112 deletions(-)


Re: [PATCH 29/66] tools lib traceevent: Use str_error_r()

2016-07-12 Thread Steven Rostedt
On Tue, 12 Jul 2016 20:14:24 -0300
Arnaldo Carvalho de Melo  wrote:

> Em Tue, Jul 12, 2016 at 07:11:08PM -0400, Steven Rostedt escreveu:
> > On Tue, 12 Jul 2016 19:40:04 -0300 Arnaldo Carvalho de Melo wrote:
> > > To make it portable to non-glibc systems, that follow the XSI variant
> > > instead of the GNU specific one that gets in place when _GNU_SOURCE is
> > > defined.  
> 
> > >  #include "event-parse.h"
> > > @@ -6131,12 +6132,7 @@ int pevent_strerror(struct pevent *pevent __maybe_unused,
> > > + str_error_r(errnum, buf, buflen);
> > >   return 0;  
> > 
> > What library is used with this? When I port this over to trace-cmd
> > (which is still needed as I develop this there), it fails to build.
> > "undefined reference to str_error_r"  
> 
> tools/lib/str_error_r.c
> 
> Forgot about the out of tree copy :-\
> 

Yeah, we really need to make this into a real library. I haven't had
the time to do that. Hopefully in August I can talk with some people at
LinuxCon to see the best way to go about doing that.

Anyway, I may just take that file and port it to trace-cmd.
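
For reference, a minimal sketch of what such a helper can look like when carried out of tree. It assumes the XSI variant of strerror_r(), which returns an int and writes the message into the caller's buffer. This is an illustration, not a verbatim copy of tools/lib/str_error_r.c, and the fallback message text is made up for the example.

```c
/* Request the XSI strerror_r() rather than the GNU-specific one. */
#undef _GNU_SOURCE
#include <stdio.h>
#include <string.h>

/*
 * Always returns buf, so the result can be passed straight to a
 * printf-style call regardless of which libc is in use.
 */
char *str_error_r(int errnum, char *buf, size_t buflen)
{
	/* XSI variant: 0 on success, an error number otherwise. */
	int err = strerror_r(errnum, buf, buflen);

	if (err)
		snprintf(buf, buflen, "strerror_r(%d) failed: %d", errnum, err);
	return buf;
}
```

Returning the buffer in every case is what makes the call site independent of the GNU versus XSI return-type difference.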

-- Steve


Linux 3.14.72-rt76-rc2

2016-07-12 Thread Steven Rostedt

Dear RT Folks,

This is the RT stable review cycle of patch 3.14.72-rt76-rc2.

Please scream at me if I messed something up. Please test the patches too.

The -rc release will be uploaded to kernel.org and will be deleted when
the final release is out. This is just a review release (or release candidate).

The pre-releases will not be pushed to the git repository, only the
final release is.

If all goes well, this patch will be converted to the next main release
on 7/14/2016.

Only difference from v1 is the removal of "ARM: imx: always use TWD on
IMX6Q"

Enjoy,

-- Steve


To build 3.14.72-rt76-rc2 directly, the following patches should be applied:

  http://www.kernel.org/pub/linux/kernel/v3.x/linux-3.14.tar.xz

  http://www.kernel.org/pub/linux/kernel/v3.x/patch-3.14.72.xz

  http://www.kernel.org/pub/linux/kernel/projects/rt/3.14/patch-3.14.72-rt76-rc2.patch.xz

You can also build from 3.14.72-rt75 by applying the incremental patch:

http://www.kernel.org/pub/linux/kernel/projects/rt/3.14/incr/patch-3.14.72-rt75-rt76-rc2.patch.xz


Changes from 3.14.72-rt75:

---


Corey Minyard (1):
  x86: Fix an RT MCE crash

Josh Cartwright (1):
  list_bl: fixup bogus lockdep warning

Luiz Capitulino (1):
  mm: perform lru_add_drain_all() remotely

Mike Galbraith (1):
  mm/zsmalloc: Use get/put_cpu_light in zs_map_object()/zs_unmap_object()

Peter Zijlstra (1):
  sched,preempt: Fix preempt_count manipulations

Rik van Riel (1):
  kvm, rt: change async pagefault code locking for PREEMPT_RT

Sebastian Andrzej Siewior (5):
  net: dev: always take qdisc's busylock in __dev_xmit_skb()
  kernel/printk: Don't try to print from IRQ/NMI region
  arm: lazy preempt: correct resched condition
  locallock: add local_lock_on()
  trace: correct off by one while recording the trace-event

Steven Rostedt (Red Hat) (1):
  Linux 3.14.72-rt76-rc2

Thomas Gleixner (1):
  perf/x86/intel/rapl: Make PMU lock raw


 arch/arm/kernel/entry-armv.S                |  6 -
 arch/x86/kernel/cpu/mcheck/mce.c            |  3 ++-
 arch/x86/kernel/cpu/perf_event_intel_rapl.c | 20
 arch/x86/kernel/kvm.c                       | 37 +++--
 include/asm-generic/preempt.h               |  4 ++--
 include/linux/list_bl.h                     | 12 ++
 include/linux/locallock.h                   |  6 +
 include/trace/ftrace.h                      |  3 +++
 kernel/printk/printk.c                      | 10
 localversion-rt                             |  2 +-
 mm/swap.c                                   | 37 +++--
 mm/zsmalloc.c                               |  4 ++--
 net/core/dev.c                              |  4
 13 files changed, 101 insertions(+), 47 deletions(-)


Re: [PATCH 29/66] tools lib traceevent: Use str_error_r()

2016-07-12 Thread Arnaldo Carvalho de Melo
Em Tue, Jul 12, 2016 at 07:25:19PM -0400, Steven Rostedt escreveu:
> On Tue, 12 Jul 2016 20:14:24 -0300
> Arnaldo Carvalho de Melo  wrote:
> 
> > Em Tue, Jul 12, 2016 at 07:11:08PM -0400, Steven Rostedt escreveu:
> > > On Tue, 12 Jul 2016 19:40:04 -0300 Arnaldo Carvalho de Melo wrote:
> > > > To make it portable to non-glibc systems, that follow the XSI variant
> > > > instead of the GNU specific one that gets in place when _GNU_SOURCE is
> > > > defined.  
> > 
> > > >  #include "event-parse.h"
> > > > @@ -6131,12 +6132,7 @@ int pevent_strerror(struct pevent *pevent __maybe_unused,
> > > > +   str_error_r(errnum, buf, buflen);
> > > > return 0;  

> > > What library is used with this? When I port this over to trace-cmd
> > > (which is still needed as I develop this there), it fails to build.
> > > "undefined reference to str_error_r"  

> > tools/lib/str_error_r.c

> > Forgot about the out of tree copy :-\
 
> Yeah, we really need to make this into a real library. I haven't had
> the time to do that. Hopefully in August I can talk with some people at

What exactly do you mean by that? To grab a copy of what is in tools/
and have it turned into a library somewhere else?

Or to freeze its interfaces and create a .so with a committed ABI?

I kind of like the way it is now... :-)

> LinuxCon to see the best way to go about doing that.
> 
> Anyway, I may just take that file and port it to trace-cmd.

Yeah, that would solve this specific case.

- Arnaldo


Re: [Query] Preemption (hogging) of the work handler

2016-07-12 Thread Viresh Kumar
On 11-07-16, 15:35, Viresh Kumar wrote:
> Sometimes, the platform doesn't come back after suspend. I have tried
> enabling no-console-suspend and the last line it prints is:
> 
> Disabling non-boot CPUs
> 
> And nothing after that at all. We have to forcefully reboot the phone
> after that. Moving the prints to the synchronous path (using
> echo 1 > /sys/module/printk/parameters/synchronous) fixes that issue.
> 
> So the asynchronous printing has an issue that only we are hitting.
> It looks like all the CPUs are gone except CPU0, and that CPU is
> hogged by the printk thread to print stuff as well as to suspend the
> system, and something eventually goes wrong.
> 
> I am only using the 3 patches from V12 version of the series.

Okay, we have tracked down this bug and it's really interesting.

I hacked the platform's serial driver to implement a putchar() routine
that simply writes to the FIFO in polling mode, which helped us trace
where things go wrong.

The problem is that we are running asynchronous printks and we call
wake_up_process() from the last running CPU which has disabled
interrupts. That takes us to: try_to_wake_up().

In our case the CPU gets deadlocked on this line in try_to_wake_up().

raw_spin_lock_irqsave(&p->pi_lock, flags);

I will explain how:

The try_to_wake_up() function takes us through the scheduler code (RT
sched), to the hrtimer code, where we eventually call ktime_get() (for
the MONOTONIC clock used for hrtimer). And this function has this:

WARN_ON(timekeeping_suspended);

This starts another printk while we are in the middle of
wake_up_process() and the CPU tries to take the above lock again and
gets stuck there :)

This doesn't happen every time because we don't always call ktime_get()
and it is called only if hrtimer_active() returns false.

This happened because of a WARN_ON() but it can happen anyway. Think
about this case:

- offline all CPUs, except 0
- call any routine that prints messages after disabling interrupts,
  etc.
- If any of the functions within wake_up_process() does a print, we are
  screwed.

So the thing is that we can't really call wake_up_process() in cases
where the last CPU disables interrupts. And that's why my fixup patch
(which moved to synchronous prints after suspend) really works.

@Jan and Sergey: I would expect a patch from you guys to fix this
properly :)

Maybe something more in can_print_async() routine, like:

only-one-cpu-online + irqs_disabled()

or whatever.
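
The recursion described above, and the kind of check suggested for can_print_async(), can be modelled in plain userspace C. This is only a sketch under stated assumptions: struct printk_model and all model_* names are hypothetical, the lock is a simple flag rather than a real raw spinlock, and a second attempt to take the lock is recorded as a deadlock instead of actually spinning.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct printk_model {
	int online_cpus;		/* CPUs still online */
	bool irqs_disabled;		/* interrupts off on the current CPU */
	bool timekeeping_suspended;	/* makes the modelled ktime_get() warn */
	bool guard_enabled;		/* apply the suggested extra check */
	bool pi_lock_held;		/* stands in for p->pi_lock */
	bool deadlocked;		/* set where a real CPU would spin forever */
	int sync_prints;		/* messages printed synchronously */
};

void model_printk(struct printk_model *m);

/* Stands in for try_to_wake_up() on the printing thread. */
void model_wake_up_process(struct printk_model *m)
{
	if (m->pi_lock_held) {
		/* Same CPU, interrupts off, lock already held: stuck. */
		m->deadlocked = true;
		return;
	}
	m->pi_lock_held = true;
	/* scheduler -> hrtimer -> ktime_get() -> WARN_ON() -> printk() */
	if (m->timekeeping_suspended)
		model_printk(m);
	m->pi_lock_held = false;
}

/* The suggested condition: only one CPU online and interrupts disabled. */
bool model_can_print_async(const struct printk_model *m)
{
	if (m->guard_enabled && m->online_cpus == 1 && m->irqs_disabled)
		return false;
	return true;
}

void model_printk(struct printk_model *m)
{
	if (model_can_print_async(m))
		model_wake_up_process(m);	/* defer to the printing thread */
	else
		m->sync_prints++;		/* print directly, no wakeup */
}
```

With guard_enabled left false, one model_printk() call on the last online CPU with timekeeping suspended ends with deadlocked set. With the guard enabled, the message takes the synchronous path and the wakeup is never attempted.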

-- 
viresh

