date:20170823

Re: linux-next: build warning after merge of the scsi-mkp tree

2017-08-23 Thread Stephen Rothwell

Hi all,

[Just adding Dick and James to the cc list]

On Thu, 24 Aug 2017 15:57:56 +1000 Stephen Rothwell  
wrote:
>
> After merging the scsi-mkp tree, today's linux-next build (powerpc
> ppc64_defconfig) produced this warning:
> 
> drivers/scsi/lpfc/lpfc_nvmet.c:1457:1: warning: 
> 'lpfc_nvmet_replenish_context' defined but not used [-Wunused-function]
>  lpfc_nvmet_replenish_context(struct lpfc_hba *phba,
>  ^
> 
> Introduced by commit
> 
>   9a38e4f1c82f ("scsi: lpfc: Fix MRQ > 1 context list handling")

-- 
Cheers,
Stephen Rothwell

Re: linux-next: build warning after merge of the scsi-mkp tree

2017-08-23 Thread Stephen Rothwell

Hi all,

[Just adding Dick and James to the cc list]

On Thu, 24 Aug 2017 15:57:56 +1000 Stephen Rothwell  
wrote:
>
> After merging the scsi-mkp tree, today's linux-next build (powerpc
> ppc64_defconfig) produced this warning:
> 
> drivers/scsi/lpfc/lpfc_nvmet.c:1457:1: warning: 
> 'lpfc_nvmet_replenish_context' defined but not used [-Wunused-function]
>  lpfc_nvmet_replenish_context(struct lpfc_hba *phba,
>  ^
> 
> Introduced by commit
> 
>   9a38e4f1c82f ("scsi: lpfc: Fix MRQ > 1 context list handling")

-- 
Cheers,
Stephen Rothwell

linux-next: build warning after merge of the scsi-mkp tree

2017-08-23 Thread Stephen Rothwell

Hi Martin,

After merging the scsi-mkp tree, today's linux-next build (powerpc
ppc64_defconfig) produced this warning:

drivers/scsi/lpfc/lpfc_nvmet.c:1457:1: warning: 'lpfc_nvmet_replenish_context' 
defined but not used [-Wunused-function]
 lpfc_nvmet_replenish_context(struct lpfc_hba *phba,
 ^

Introduced by commit

  9a38e4f1c82f ("scsi: lpfc: Fix MRQ > 1 context list handling")

-- 
Cheers,
Stephen Rothwell

linux-next: build warning after merge of the scsi-mkp tree

2017-08-23 Thread Stephen Rothwell

Hi Martin,

After merging the scsi-mkp tree, today's linux-next build (powerpc
ppc64_defconfig) produced this warning:

drivers/scsi/lpfc/lpfc_nvmet.c:1457:1: warning: 'lpfc_nvmet_replenish_context' 
defined but not used [-Wunused-function]
 lpfc_nvmet_replenish_context(struct lpfc_hba *phba,
 ^

Introduced by commit

  9a38e4f1c82f ("scsi: lpfc: Fix MRQ > 1 context list handling")

-- 
Cheers,
Stephen Rothwell

Re: [PATCH] tty: xilinx_uartps: move to arch_initcall for earlier console

2017-08-23 Thread Michal Simek

On 23.8.2017 12:55, Shubhrajyoti Datta wrote:
> Signed-off-by: Shubhrajyoti Datta 

Empty commit message?
What's the reason for this change?

M

> ---
>  drivers/tty/serial/xilinx_uartps.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/tty/serial/xilinx_uartps.c 
> b/drivers/tty/serial/xilinx_uartps.c
> index ff1b115..a239343 100644
> --- a/drivers/tty/serial/xilinx_uartps.c
> +++ b/drivers/tty/serial/xilinx_uartps.c
> @@ -1647,7 +1647,7 @@ static void __exit cdns_uart_exit(void)
>   uart_unregister_driver(_uart_uart_driver);
>  }
>  
> -module_init(cdns_uart_init);
> +arch_initcall(cdns_uart_init);
>  module_exit(cdns_uart_exit);
>  
>  MODULE_DESCRIPTION("Driver for Cadence UART");
>

Re: [PATCH] tty: xilinx_uartps: move to arch_initcall for earlier console

2017-08-23 Thread Michal Simek

On 23.8.2017 12:55, Shubhrajyoti Datta wrote:
> Signed-off-by: Shubhrajyoti Datta 

Empty commit message?
What's the reason for this change?

M

> ---
>  drivers/tty/serial/xilinx_uartps.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/tty/serial/xilinx_uartps.c 
> b/drivers/tty/serial/xilinx_uartps.c
> index ff1b115..a239343 100644
> --- a/drivers/tty/serial/xilinx_uartps.c
> +++ b/drivers/tty/serial/xilinx_uartps.c
> @@ -1647,7 +1647,7 @@ static void __exit cdns_uart_exit(void)
>   uart_unregister_driver(_uart_uart_driver);
>  }
>  
> -module_init(cdns_uart_init);
> +arch_initcall(cdns_uart_init);
>  module_exit(cdns_uart_exit);
>  
>  MODULE_DESCRIPTION("Driver for Cadence UART");
>

Re: [PATCH v3 2/5] dmaengine: Add STM32 DMAMUX driver

2017-08-23 Thread Peter Ujfalusi

On 2017-08-21 12:34, Pierre Yves MORDRET wrote:
> OK. I will redesign my driver to take into account this idea.
> 
> I believe I should get rid of my custom API in DMA for channelID as well. 
> Please
> confirm. Not very clear for me whether I can keep it or not.

Yes, you should be able to get rid of any custom API.
The DMA event router should be 'invisible' to both the DMA driver itself
and for the DMA users as well. If you end up still needing custom API
(which I doubt) then it is better to extend the core support to cover
that in a generic way since it is likely that other might have similar
needs.

You need to identify what you need to manage with the DMA router, again
in my cases:
am335x/am437x: The DMA channel is fixed, but the DMA event to be handled
by the channel (DMA event muxer) is selected by the router.
dra7x: The DMA event is the fixed part and I needed to translate that to
local eDMA/sDMA events and in eDMA I needed to pick a channel based on
the translated event.

The same router code works for eDMA and sDMA in dra7, while the two DMA
engines are different in their internal works.

At the end of the day I only needed to plug the DMA event routers and
there is no change in the DMA drivers at all.

I'm happy to answer any questions you might have, if I can.

- Péter

Re: [PATCH v3 2/5] dmaengine: Add STM32 DMAMUX driver

2017-08-23 Thread Peter Ujfalusi

On 2017-08-21 12:34, Pierre Yves MORDRET wrote:
> OK. I will redesign my driver to take into account this idea.
> 
> I believe I should get rid of my custom API in DMA for channelID as well. 
> Please
> confirm. Not very clear for me whether I can keep it or not.

Yes, you should be able to get rid of any custom API.
The DMA event router should be 'invisible' to both the DMA driver itself
and for the DMA users as well. If you end up still needing custom API
(which I doubt) then it is better to extend the core support to cover
that in a generic way since it is likely that other might have similar
needs.

You need to identify what you need to manage with the DMA router, again
in my cases:
am335x/am437x: The DMA channel is fixed, but the DMA event to be handled
by the channel (DMA event muxer) is selected by the router.
dra7x: The DMA event is the fixed part and I needed to translate that to
local eDMA/sDMA events and in eDMA I needed to pick a channel based on
the translated event.

The same router code works for eDMA and sDMA in dra7, while the two DMA
engines are different in their internal works.

At the end of the day I only needed to plug the DMA event routers and
there is no change in the DMA drivers at all.

I'm happy to answer any questions you might have, if I can.

- Péter

[RFC 3/3] sched: depend on 64BIT_ATOMIC_ACCESS to determine if to use min_vruntime_copy

2017-08-23 Thread Hoeun Ryu

 'min_vruntime_copy' is copied when 'min_vruntime' is updated for cfq_rq
and used to check if updating 'min_vruntime' is completed on reader side.
 Because 'min_vruntime' variable is 64bit, we need a mimic of seqlock to
check if the variable is not being updated on 32bit machines.

 On 64BIT_ATOMIC_ACCESS enabled machines, 64bit accesses are atomic even
though the machines are 32bit, so we can directly access 'min_vruntime'
on the architectures.

 Depend on CONFIG_64BIT_ATOMIC_ACCESS instead of CONFIG_64BIT to determine
whether 'min_vruntime_copy' variable is used for synchronization or not.
And align 'min_vruntime' by 8 if 64BIT_ATOMIC_ALIGNED_ACCESS is true
because 64BIT_ATOMIC_ALIGNED_ACCESS enabled system can access the variable
atomically only when it' aligned.

Signed-off-by: Hoeun Ryu 
---
 kernel/sched/fair.c  | 6 +++---
 kernel/sched/sched.h | 6 +-
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index c95880e..840658f 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -536,7 +536,7 @@ static void update_min_vruntime(struct cfs_rq *cfs_rq)
 
/* ensure we never gain time by being placed backwards. */
cfs_rq->min_vruntime = max_vruntime(cfs_rq->min_vruntime, vruntime);
-#ifndef CONFIG_64BIT
+#ifndef CONFIG_64BIT_ATOMIC_ACCESS
smp_wmb();
cfs_rq->min_vruntime_copy = cfs_rq->min_vruntime;
 #endif
@@ -5975,7 +5975,7 @@ static void migrate_task_rq_fair(struct task_struct *p)
struct cfs_rq *cfs_rq = cfs_rq_of(se);
u64 min_vruntime;
 
-#ifndef CONFIG_64BIT
+#ifndef CONFIG_64BIT_ATOMIC_ACCESS
u64 min_vruntime_copy;
 
do {
@@ -9173,7 +9173,7 @@ void init_cfs_rq(struct cfs_rq *cfs_rq)
 {
cfs_rq->tasks_timeline = RB_ROOT;
cfs_rq->min_vruntime = (u64)(-(1LL << 20));
-#ifndef CONFIG_64BIT
+#ifndef CONFIG_64BIT_ATOMIC_ACCESS
cfs_rq->min_vruntime_copy = cfs_rq->min_vruntime;
 #endif
 #ifdef CONFIG_SMP
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index eeef1a3..870010b 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -421,8 +421,12 @@ struct cfs_rq {
unsigned int nr_running, h_nr_running;
 
u64 exec_clock;
+#ifndef CONFIG_64BIT_ATOMIC_ALIGNED_ACCESS
u64 min_vruntime;
-#ifndef CONFIG_64BIT
+#else
+   u64 min_vruntime __attribute__((aligned(sizeof(u64;
+#endif
+#ifndef CONFIG_64BIT_ATOMIC_ACCESS
u64 min_vruntime_copy;
 #endif
 
-- 
2.7.4

[RFC 3/3] sched: depend on 64BIT_ATOMIC_ACCESS to determine if to use min_vruntime_copy

2017-08-23 Thread Hoeun Ryu

 'min_vruntime_copy' is copied when 'min_vruntime' is updated for cfq_rq
and used to check if updating 'min_vruntime' is completed on reader side.
 Because 'min_vruntime' variable is 64bit, we need a mimic of seqlock to
check if the variable is not being updated on 32bit machines.

 On 64BIT_ATOMIC_ACCESS enabled machines, 64bit accesses are atomic even
though the machines are 32bit, so we can directly access 'min_vruntime'
on the architectures.

 Depend on CONFIG_64BIT_ATOMIC_ACCESS instead of CONFIG_64BIT to determine
whether 'min_vruntime_copy' variable is used for synchronization or not.
And align 'min_vruntime' by 8 if 64BIT_ATOMIC_ALIGNED_ACCESS is true
because 64BIT_ATOMIC_ALIGNED_ACCESS enabled system can access the variable
atomically only when it' aligned.

Signed-off-by: Hoeun Ryu 
---
 kernel/sched/fair.c  | 6 +++---
 kernel/sched/sched.h | 6 +-
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index c95880e..840658f 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -536,7 +536,7 @@ static void update_min_vruntime(struct cfs_rq *cfs_rq)
 
/* ensure we never gain time by being placed backwards. */
cfs_rq->min_vruntime = max_vruntime(cfs_rq->min_vruntime, vruntime);
-#ifndef CONFIG_64BIT
+#ifndef CONFIG_64BIT_ATOMIC_ACCESS
smp_wmb();
cfs_rq->min_vruntime_copy = cfs_rq->min_vruntime;
 #endif
@@ -5975,7 +5975,7 @@ static void migrate_task_rq_fair(struct task_struct *p)
struct cfs_rq *cfs_rq = cfs_rq_of(se);
u64 min_vruntime;
 
-#ifndef CONFIG_64BIT
+#ifndef CONFIG_64BIT_ATOMIC_ACCESS
u64 min_vruntime_copy;
 
do {
@@ -9173,7 +9173,7 @@ void init_cfs_rq(struct cfs_rq *cfs_rq)
 {
cfs_rq->tasks_timeline = RB_ROOT;
cfs_rq->min_vruntime = (u64)(-(1LL << 20));
-#ifndef CONFIG_64BIT
+#ifndef CONFIG_64BIT_ATOMIC_ACCESS
cfs_rq->min_vruntime_copy = cfs_rq->min_vruntime;
 #endif
 #ifdef CONFIG_SMP
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index eeef1a3..870010b 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -421,8 +421,12 @@ struct cfs_rq {
unsigned int nr_running, h_nr_running;
 
u64 exec_clock;
+#ifndef CONFIG_64BIT_ATOMIC_ALIGNED_ACCESS
u64 min_vruntime;
-#ifndef CONFIG_64BIT
+#else
+   u64 min_vruntime __attribute__((aligned(sizeof(u64;
+#endif
+#ifndef CONFIG_64BIT_ATOMIC_ACCESS
u64 min_vruntime_copy;
 #endif
 
-- 
2.7.4

[RFC 0/3] add 64BIT_ATOMIC_ACCESS and 64BIT_ATOMIC_ALIGNED_ACCESS

2017-08-23 Thread Hoeun Ryu

 On some 32-bit architectures, 64bit accesses are atomic when certain
conditions are satisfied.

 For example, on LPAE (Large Physical Address Extension) enabled ARM
architecture, 'ldrd/strd' (load/store doublewords) instructions are 64bit
atomic as long as the address is 64-bit aligned. This feature is to
guarantee atomic accesses on newly introduced 64bit wide descriptors in
the translation tables, and 's/u64' variables can be accessed atomically
when they are aligned(8) on LPAE enabled ARM architecture machines.

 Introducing 64BIT_ATOMIC_ACCESS and 64BIT_ATOMIC_ALIGNED_ACCESS, which
can be true for the 32bit architectures as well as 64bit architectures.

 we can optimize some kernel codes using seqlock (timekeeping) or mimics
of it (like in sched/cfq) simply to read or write 64bit variables.
 The existing codes depend on CONFIG_64BIT to determine whether the 64bit
variables can be directly accessed or need additional synchronization
primitives like seqlock. CONFIG_64BIT_ATOMIC_ACCESS can be used instead of
CONFIG_64BIT in the cases.

 64BIT_ATOMIC_ALIGNED_ACCESS can be used in the variable declaration to
indicate the alignment requirement to the compiler
(__attribute__((aligned(8 in the way of #ifdef.

 The last patch "sched: depend on 64BIT_ATOMIC_ACCESS to determine if to
use min_vruntime_copy" is an example of this approach.

 I'd like to know what the architecture maintainers and kernel maintainers
think about it. I think I can make more examples (mostly removing seqlock
to access the 64bit variables on the machines) if this approach is
accepted.

Hoeun Ryu (3):
  arch: add 64BIT_ATOMIC_ACCESS to support 64bit atomic access on 32bit
machines
  arm: enable 64BIT_ATOMIC(_ALIGNED)_ACCESS on LPAE enabled machines
  sched: depend on 64BIT_ATOMIC_ACCESS to determine if to use
min_vruntime_copy

 arch/Kconfig | 20 
 arch/arm/mm/Kconfig  |  2 ++
 kernel/sched/fair.c  |  6 +++---
 kernel/sched/sched.h |  6 +-
 4 files changed, 30 insertions(+), 4 deletions(-)

-- 
2.7.4

[RFC 1/3] arch: add 64BIT_ATOMIC_ACCESS to support 64bit atomic access on 32bit machines

2017-08-23 Thread Hoeun Ryu

 On some 32-bit architectures, 64bit accesses are atomic when certain
conditions are satisfied.
 For example, on LPAE (Large Physical Address Extension) enabled ARM
architecture, 'ldrd/strd' (load/store doublewords) instructions are 64bit
atomic as long as the address is 64-bit aligned. This feature is to
guarantee atomic accesses on newly introduced 64bit wide descriptors in
the translation tables, and 's/u64' variables can be accessed atomically
when they are aligned(8) on LPAE enabled ARM architecture machines.

 Introducing 64BIT_ATOMIC_ACCESS and 64BIT_ATOMIC_ALIGNED_ACCESS, which
can be true for the 32bit architectures as well as 64bit architectures,
we can optimize some kernel codes using seqlock (timekeeping) or mimics
of it (like in sched/cfq) simply to read or write 64bit variables.
 The existing codes depend on CONFIG_64BIT to determine whether the 64bit
variables can be directly accessed or need additional synchronization
primitives like seqlock. CONFIG_64BIT_ATOMIC_ACCESS can be used instead of
CONFIG_64BIT in the cases.
 64BIT_ATOMIC_ALIGNED_ACCESS can be used in the variable declaration to
indicate the alignment requirement to the compiler
(__attribute__((aligned(8 in the way of #ifdef.

Signed-off-by: Hoeun Ryu 
---
 arch/Kconfig | 20 
 1 file changed, 20 insertions(+)

diff --git a/arch/Kconfig b/arch/Kconfig
index 21d0089..1def331 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -115,6 +115,26 @@ config UPROBES
managed by the kernel and kept transparent to the probed
application. )
 
+config 64BIT_ATOMIC_ACCESS
+   def_bool 64BIT
+   help
+ On some 32bit architectures as well as 64bit architectures,
+ 64bit accesses are atomic when certain conditions are satisfied.
+
+ This symbol should be selected by an architecture if 64 bit
+ accesses can be atomic.
+
+config 64BIT_ATOMIC_ALIGNED_ACCESS
+   def_bool n
+   depends on 64BIT_ATOMIC_ACCESS
+   help
+ On 64BIT_ATOMIC_ACCESS enabled system, the address should be
+ aligned by 8 to guarantee the accesses are atomic.
+
+ This symbol should be selected by an architecture if 64 bit
+ accesses are required to be 64 bit aligned to guarantee that
+ the 64bit accesses are atomic.
+
 config HAVE_64BIT_ALIGNED_ACCESS
def_bool 64BIT && !HAVE_EFFICIENT_UNALIGNED_ACCESS
help
-- 
2.7.4

[RFC 0/3] add 64BIT_ATOMIC_ACCESS and 64BIT_ATOMIC_ALIGNED_ACCESS

2017-08-23 Thread Hoeun Ryu

 On some 32-bit architectures, 64bit accesses are atomic when certain
conditions are satisfied.

 For example, on LPAE (Large Physical Address Extension) enabled ARM
architecture, 'ldrd/strd' (load/store doublewords) instructions are 64bit
atomic as long as the address is 64-bit aligned. This feature is to
guarantee atomic accesses on newly introduced 64bit wide descriptors in
the translation tables, and 's/u64' variables can be accessed atomically
when they are aligned(8) on LPAE enabled ARM architecture machines.

 Introducing 64BIT_ATOMIC_ACCESS and 64BIT_ATOMIC_ALIGNED_ACCESS, which
can be true for the 32bit architectures as well as 64bit architectures.

 we can optimize some kernel codes using seqlock (timekeeping) or mimics
of it (like in sched/cfq) simply to read or write 64bit variables.
 The existing codes depend on CONFIG_64BIT to determine whether the 64bit
variables can be directly accessed or need additional synchronization
primitives like seqlock. CONFIG_64BIT_ATOMIC_ACCESS can be used instead of
CONFIG_64BIT in the cases.

 64BIT_ATOMIC_ALIGNED_ACCESS can be used in the variable declaration to
indicate the alignment requirement to the compiler
(__attribute__((aligned(8 in the way of #ifdef.

 The last patch "sched: depend on 64BIT_ATOMIC_ACCESS to determine if to
use min_vruntime_copy" is an example of this approach.

 I'd like to know what the architecture maintainers and kernel maintainers
think about it. I think I can make more examples (mostly removing seqlock
to access the 64bit variables on the machines) if this approach is
accepted.

Hoeun Ryu (3):
  arch: add 64BIT_ATOMIC_ACCESS to support 64bit atomic access on 32bit
machines
  arm: enable 64BIT_ATOMIC(_ALIGNED)_ACCESS on LPAE enabled machines
  sched: depend on 64BIT_ATOMIC_ACCESS to determine if to use
min_vruntime_copy

 arch/Kconfig | 20 
 arch/arm/mm/Kconfig  |  2 ++
 kernel/sched/fair.c  |  6 +++---
 kernel/sched/sched.h |  6 +-
 4 files changed, 30 insertions(+), 4 deletions(-)

-- 
2.7.4

[RFC 1/3] arch: add 64BIT_ATOMIC_ACCESS to support 64bit atomic access on 32bit machines

2017-08-23 Thread Hoeun Ryu

 On some 32-bit architectures, 64bit accesses are atomic when certain
conditions are satisfied.
 For example, on LPAE (Large Physical Address Extension) enabled ARM
architecture, 'ldrd/strd' (load/store doublewords) instructions are 64bit
atomic as long as the address is 64-bit aligned. This feature is to
guarantee atomic accesses on newly introduced 64bit wide descriptors in
the translation tables, and 's/u64' variables can be accessed atomically
when they are aligned(8) on LPAE enabled ARM architecture machines.

 Introducing 64BIT_ATOMIC_ACCESS and 64BIT_ATOMIC_ALIGNED_ACCESS, which
can be true for the 32bit architectures as well as 64bit architectures,
we can optimize some kernel codes using seqlock (timekeeping) or mimics
of it (like in sched/cfq) simply to read or write 64bit variables.
 The existing codes depend on CONFIG_64BIT to determine whether the 64bit
variables can be directly accessed or need additional synchronization
primitives like seqlock. CONFIG_64BIT_ATOMIC_ACCESS can be used instead of
CONFIG_64BIT in the cases.
 64BIT_ATOMIC_ALIGNED_ACCESS can be used in the variable declaration to
indicate the alignment requirement to the compiler
(__attribute__((aligned(8 in the way of #ifdef.

Signed-off-by: Hoeun Ryu 
---
 arch/Kconfig | 20 
 1 file changed, 20 insertions(+)

diff --git a/arch/Kconfig b/arch/Kconfig
index 21d0089..1def331 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -115,6 +115,26 @@ config UPROBES
managed by the kernel and kept transparent to the probed
application. )
 
+config 64BIT_ATOMIC_ACCESS
+   def_bool 64BIT
+   help
+ On some 32bit architectures as well as 64bit architectures,
+ 64bit accesses are atomic when certain conditions are satisfied.
+
+ This symbol should be selected by an architecture if 64 bit
+ accesses can be atomic.
+
+config 64BIT_ATOMIC_ALIGNED_ACCESS
+   def_bool n
+   depends on 64BIT_ATOMIC_ACCESS
+   help
+ On 64BIT_ATOMIC_ACCESS enabled system, the address should be
+ aligned by 8 to guarantee the accesses are atomic.
+
+ This symbol should be selected by an architecture if 64 bit
+ accesses are required to be 64 bit aligned to guarantee that
+ the 64bit accesses are atomic.
+
 config HAVE_64BIT_ALIGNED_ACCESS
def_bool 64BIT && !HAVE_EFFICIENT_UNALIGNED_ACCESS
help
-- 
2.7.4

[RFC 2/3] arm: enable 64BIT_ATOMIC(_ALIGNED)_ACCESS on LPAE enabled machines

2017-08-23 Thread Hoeun Ryu

 'ldrd/strd' (load/store doublewords) instructions are 64bit atomic as
long as the address is 64-bit aligned on LPAE (Large Physical Address
Extension) enabled architectures. This feature is to guarantee atomic
accesses on newly introduced 64bit wide descriptors in the translation
tables.

 Making 64BIT_ATOMIC_ACCESS true, some kernel codes to access 64bit
variables can be optimized by omitting seqlock or the mimic of it.
 Also make 64BIT_ATOMIC_ALIGNED_ACCESS true, the 64bit atomic access is
guarnteed only when the address is 64bit algined.

Signed-off-by: Hoeun Ryu 
---
 arch/arm/mm/Kconfig | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
index 60cdfdc..3142572 100644
--- a/arch/arm/mm/Kconfig
+++ b/arch/arm/mm/Kconfig
@@ -660,6 +660,8 @@ config ARM_LPAE
bool "Support for the Large Physical Address Extension"
depends on MMU && CPU_32v7 && !CPU_32v6 && !CPU_32v5 && \
!CPU_32v4 && !CPU_32v3
+   select 64BIT_ATOMIC_ACCESS
+   select 64BIT_ATOMIC_ALIGNED_ACCESS
help
  Say Y if you have an ARMv7 processor supporting the LPAE page
  table format and you would like to access memory beyond the
-- 
2.7.4

[RFC 2/3] arm: enable 64BIT_ATOMIC(_ALIGNED)_ACCESS on LPAE enabled machines

2017-08-23 Thread Hoeun Ryu

 'ldrd/strd' (load/store doublewords) instructions are 64bit atomic as
long as the address is 64-bit aligned on LPAE (Large Physical Address
Extension) enabled architectures. This feature is to guarantee atomic
accesses on newly introduced 64bit wide descriptors in the translation
tables.

 Making 64BIT_ATOMIC_ACCESS true, some kernel codes to access 64bit
variables can be optimized by omitting seqlock or the mimic of it.
 Also make 64BIT_ATOMIC_ALIGNED_ACCESS true, the 64bit atomic access is
guarnteed only when the address is 64bit algined.

Signed-off-by: Hoeun Ryu 
---
 arch/arm/mm/Kconfig | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
index 60cdfdc..3142572 100644
--- a/arch/arm/mm/Kconfig
+++ b/arch/arm/mm/Kconfig
@@ -660,6 +660,8 @@ config ARM_LPAE
bool "Support for the Large Physical Address Extension"
depends on MMU && CPU_32v7 && !CPU_32v6 && !CPU_32v5 && \
!CPU_32v4 && !CPU_32v3
+   select 64BIT_ATOMIC_ACCESS
+   select 64BIT_ATOMIC_ALIGNED_ACCESS
help
  Say Y if you have an ARMv7 processor supporting the LPAE page
  table format and you would like to access memory beyond the
-- 
2.7.4

[PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request

2017-08-23 Thread js1304

From: Joonsoo Kim 

Freepage on ZONE_HIGHMEM doesn't work for kernel memory so it's not that
important to reserve. When ZONE_MOVABLE is used, this problem would
theorectically cause to decrease usable memory for GFP_HIGHUSER_MOVABLE
allocation request which is mainly used for page cache and anon page
allocation. So, fix it.

And, defining sysctl_lowmem_reserve_ratio array by MAX_NR_ZONES - 1 size
makes code complex. For example, if there is highmem system, following
reserve ratio is activated for *NORMAL ZONE* which would be easyily
misleading people.

 #ifdef CONFIG_HIGHMEM
 32
 #endif

This patch also fix this situation by defining sysctl_lowmem_reserve_ratio
array by MAX_NR_ZONES and place "#ifdef" to right place.

Reviewed-by: Aneesh Kumar K.V 
Acked-by: Vlastimil Babka 
Signed-off-by: Joonsoo Kim 
---
 include/linux/mmzone.h |  2 +-
 mm/page_alloc.c| 11 ++-
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index e7e92c8..e5f134b 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -882,7 +882,7 @@ int min_free_kbytes_sysctl_handler(struct ctl_table *, int,
void __user *, size_t *, loff_t *);
 int watermark_scale_factor_sysctl_handler(struct ctl_table *, int,
void __user *, size_t *, loff_t *);
-extern int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1];
+extern int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES];
 int lowmem_reserve_ratio_sysctl_handler(struct ctl_table *, int,
void __user *, size_t *, loff_t *);
 int percpu_pagelist_fraction_sysctl_handler(struct ctl_table *, int,
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 90b1996..6faa53d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -202,17 +202,18 @@ static void __free_pages_ok(struct page *page, unsigned 
int order);
  * TBD: should special case ZONE_DMA32 machines here - in those we normally
  * don't need any ZONE_NORMAL reservation
  */
-int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1] = {
+int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES] = {
 #ifdef CONFIG_ZONE_DMA
-256,
+   [ZONE_DMA] = 256,
 #endif
 #ifdef CONFIG_ZONE_DMA32
-256,
+   [ZONE_DMA32] = 256,
 #endif
+   [ZONE_NORMAL] = 32,
 #ifdef CONFIG_HIGHMEM
-32,
+   [ZONE_HIGHMEM] = INT_MAX,
 #endif
-32,
+   [ZONE_MOVABLE] = INT_MAX,
 };
 
 EXPORT_SYMBOL(totalram_pages);
-- 
2.7.4

[PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request

2017-08-23 Thread js1304

From: Joonsoo Kim 

Freepage on ZONE_HIGHMEM doesn't work for kernel memory so it's not that
important to reserve. When ZONE_MOVABLE is used, this problem would
theorectically cause to decrease usable memory for GFP_HIGHUSER_MOVABLE
allocation request which is mainly used for page cache and anon page
allocation. So, fix it.

And, defining sysctl_lowmem_reserve_ratio array by MAX_NR_ZONES - 1 size
makes code complex. For example, if there is highmem system, following
reserve ratio is activated for *NORMAL ZONE* which would be easyily
misleading people.

 #ifdef CONFIG_HIGHMEM
 32
 #endif

This patch also fix this situation by defining sysctl_lowmem_reserve_ratio
array by MAX_NR_ZONES and place "#ifdef" to right place.

Reviewed-by: Aneesh Kumar K.V 
Acked-by: Vlastimil Babka 
Signed-off-by: Joonsoo Kim 
---
 include/linux/mmzone.h |  2 +-
 mm/page_alloc.c| 11 ++-
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index e7e92c8..e5f134b 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -882,7 +882,7 @@ int min_free_kbytes_sysctl_handler(struct ctl_table *, int,
void __user *, size_t *, loff_t *);
 int watermark_scale_factor_sysctl_handler(struct ctl_table *, int,
void __user *, size_t *, loff_t *);
-extern int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1];
+extern int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES];
 int lowmem_reserve_ratio_sysctl_handler(struct ctl_table *, int,
void __user *, size_t *, loff_t *);
 int percpu_pagelist_fraction_sysctl_handler(struct ctl_table *, int,
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 90b1996..6faa53d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -202,17 +202,18 @@ static void __free_pages_ok(struct page *page, unsigned 
int order);
  * TBD: should special case ZONE_DMA32 machines here - in those we normally
  * don't need any ZONE_NORMAL reservation
  */
-int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1] = {
+int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES] = {
 #ifdef CONFIG_ZONE_DMA
-256,
+   [ZONE_DMA] = 256,
 #endif
 #ifdef CONFIG_ZONE_DMA32
-256,
+   [ZONE_DMA32] = 256,
 #endif
+   [ZONE_NORMAL] = 32,
 #ifdef CONFIG_HIGHMEM
-32,
+   [ZONE_HIGHMEM] = INT_MAX,
 #endif
-32,
+   [ZONE_MOVABLE] = INT_MAX,
 };
 
 EXPORT_SYMBOL(totalram_pages);
-- 
2.7.4

Re: [PATCH][V2][netdev-next] gre: remove duplicated assignment of iph

2017-08-23 Thread David Miller

From: Colin King 
Date: Wed, 23 Aug 2017 12:59:48 +0100

> From: Colin Ian King 
> 
> iph is being assigned the same value twice; remove the redundant
> first assignment. (Thanks to Nikolay Aleksandrov for pointing out
> that the first asssignment should be removed and not the second)
> 
> Fixes warning:
> net/ipv4/ip_gre.c:265:2: warning: Value stored to 'iph' is never read
> 
> Signed-off-by: Colin Ian King 

Much better, applied, thanks.

Re: [PATCH][V2][netdev-next] gre: remove duplicated assignment of iph

2017-08-23 Thread David Miller

From: Colin King 
Date: Wed, 23 Aug 2017 12:59:48 +0100

> From: Colin Ian King 
> 
> iph is being assigned the same value twice; remove the redundant
> first assignment. (Thanks to Nikolay Aleksandrov for pointing out
> that the first asssignment should be removed and not the second)
> 
> Fixes warning:
> net/ipv4/ip_gre.c:265:2: warning: Value stored to 'iph' is never read
> 
> Signed-off-by: Colin Ian King 

Much better, applied, thanks.

Re: [PATCH net] sctp: Avoid out-of-bounds reads from address storage

2017-08-23 Thread David Miller

From: Stefano Brivio 
Date: Wed, 23 Aug 2017 13:27:13 +0200

> inet_diag_msg_sctp{,l}addr_fill() and sctp_get_sctp_info() copy
> sizeof(sockaddr_storage) bytes to fill in sockaddr structs used
> to export diagnostic information to userspace.
> 
> However, the memory allocated to store sockaddr information is
> smaller than that and depends on the address family, so we leak
> up to 100 uninitialized bytes to userspace. Just use the size of
> the source structs instead, in all the three cases this is what
> userspace expects. Zero out the remaining memory.
> 
> Unused bytes (i.e. when IPv4 addresses are used) in source
> structs sctp_sockaddr_entry and sctp_transport are already
> cleared by sctp_add_bind_addr() and sctp_transport_new(),
> respectively.
> 
> Noticed while testing KASAN-enabled kernel with 'ss':
 ...
> This fixes CVE-2017-7558.
> 
> References: https://bugzilla.redhat.com/show_bug.cgi?id=1480266
> Fixes: 8f840e47f190 ("sctp: add the sctp_diag.c file")
> Cc:  # 4.7+
> Cc: Xin Long 
> Cc: Vlad Yasevich 
> Cc: Neil Horman 
> Signed-off-by: Stefano Brivio 

Applied and queued up for -stable.

Do not put "stable@kernel..." into networking patch submissions.
For networking, I handle the stable submissions by hand myself.

Thank you.

Re: [PATCH net] sctp: Avoid out-of-bounds reads from address storage

2017-08-23 Thread David Miller

From: Stefano Brivio 
Date: Wed, 23 Aug 2017 13:27:13 +0200

> inet_diag_msg_sctp{,l}addr_fill() and sctp_get_sctp_info() copy
> sizeof(sockaddr_storage) bytes to fill in sockaddr structs used
> to export diagnostic information to userspace.
> 
> However, the memory allocated to store sockaddr information is
> smaller than that and depends on the address family, so we leak
> up to 100 uninitialized bytes to userspace. Just use the size of
> the source structs instead, in all the three cases this is what
> userspace expects. Zero out the remaining memory.
> 
> Unused bytes (i.e. when IPv4 addresses are used) in source
> structs sctp_sockaddr_entry and sctp_transport are already
> cleared by sctp_add_bind_addr() and sctp_transport_new(),
> respectively.
> 
> Noticed while testing KASAN-enabled kernel with 'ss':
 ...
> This fixes CVE-2017-7558.
> 
> References: https://bugzilla.redhat.com/show_bug.cgi?id=1480266
> Fixes: 8f840e47f190 ("sctp: add the sctp_diag.c file")
> Cc:  # 4.7+
> Cc: Xin Long 
> Cc: Vlad Yasevich 
> Cc: Neil Horman 
> Signed-off-by: Stefano Brivio 

Applied and queued up for -stable.

Do not put "stable@kernel..." into networking patch submissions.
For networking, I handle the stable submissions by hand myself.

Thank you.

Re: [PATCH 3/3 v3] net: tipc: constify genl_ops

2017-08-23 Thread David Miller

From: Arvind Yadav 
Date: Wed, 23 Aug 2017 16:22:20 +0530

> genl_ops are not supposed to change at runtime. All functions
> working with genl_ops provided by  work with
> const genl_ops. So mark the non-const structs as const.
> 
> Signed-off-by: Arvind Yadav 

Applied.

Re: [PATCH 3/3 v3] net: tipc: constify genl_ops

2017-08-23 Thread David Miller

From: Arvind Yadav 
Date: Wed, 23 Aug 2017 16:22:20 +0530

> genl_ops are not supposed to change at runtime. All functions
> working with genl_ops provided by  work with
> const genl_ops. So mark the non-const structs as const.
> 
> Signed-off-by: Arvind Yadav 

Applied.

Re: [PATCH 0/3] constify dsa_switch_ops

2017-08-23 Thread David Miller

From: Andrew Lunn 
Date: Wed, 23 Aug 2017 14:45:56 +0200

> On Wed, Aug 23, 2017 at 03:46:56PM +0530, Arvind Yadav wrote:
>> dsa_switch_ops are not supposed to change at runtime. All functions
>> working with dsa_switch_ops provided by  work with
>> const dsa_switch_ops. So mark the non-const structs as const.
>> 
>> Arvind Yadav (3):
>>   [PATCH 1/3] net: dsa: loop: constify dsa_switch_ops
>>   [PATCH 2/3] net: dsa: lan9303: constify dsa_switch_ops
>>   [PATCH 3/3] net: dsa: mt7530: constify dsa_switch_ops
> 
> For the whole series:
> 
> Reviewed-by: Andrew Lunn 

None of these patches apply to net-next, things are const there
already.

Re: [PATCH 0/3] constify dsa_switch_ops

2017-08-23 Thread David Miller

From: Andrew Lunn 
Date: Wed, 23 Aug 2017 14:45:56 +0200

> On Wed, Aug 23, 2017 at 03:46:56PM +0530, Arvind Yadav wrote:
>> dsa_switch_ops are not supposed to change at runtime. All functions
>> working with dsa_switch_ops provided by  work with
>> const dsa_switch_ops. So mark the non-const structs as const.
>> 
>> Arvind Yadav (3):
>>   [PATCH 1/3] net: dsa: loop: constify dsa_switch_ops
>>   [PATCH 2/3] net: dsa: lan9303: constify dsa_switch_ops
>>   [PATCH 3/3] net: dsa: mt7530: constify dsa_switch_ops
> 
> For the whole series:
> 
> Reviewed-by: Andrew Lunn 

None of these patches apply to net-next, things are const there
already.

[PATCH] iommu/mediatek: Fix a build fail of m4u_type

2017-08-23 Thread Yong Wu

The commit ("iommu/mediatek: Enlarge the validate PA range
for 4GB mode") introduce the following build error:

   drivers/iommu/mtk_iommu.c: In function 'mtk_iommu_hw_init':
>> drivers/iommu/mtk_iommu.c:536:30: error: 'const struct mtk_iommu_data'
 has no member named 'm4u_type'; did you mean 'm4u_dom'?
 if (data->enable_4GB && data->m4u_type != M4U_MT8173) {

This patch fix it, use "m4u_plat" instead of "m4u_type".

Reported-by: kernel test robot 
Signed-off-by: Yong Wu 
---
1) In the v2 of mt2712 iommu support patch-set, We changed a variant
name from "m4u_type" to "m4u_plat", But I'm sorry that I forgot to
change it in the later patch.
2) This patch is based on [1].
[1]:https://lists.linuxfoundation.org/pipermail/iommu/2017-August/023847.html
3) I cann't find the commit id of the patch ("iommu/mediatek: Enlarge
the validate PA range for 4GB mode") in the [2].
So I don't write the commit id for that patch in the commit message.
[2]https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git next
---
 drivers/iommu/mtk_iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 4f233e1..bc00e40 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -533,7 +533,7 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data 
*data)
 
writel_relaxed(F_MMU_IVRP_PA_SET(data->protect_base, data->enable_4GB),
   data->base + REG_MMU_IVRP_PADDR);
-   if (data->enable_4GB && data->m4u_type != M4U_MT8173) {
+   if (data->enable_4GB && data->m4u_plat != M4U_MT8173) {
/*
 * If 4GB mode is enabled, the validate PA range is from
 * 0x1__ to 0x1__. here record bit[32:30].
-- 
1.9.1

[PATCH] iommu/mediatek: Fix a build fail of m4u_type

2017-08-23 Thread Yong Wu

The commit ("iommu/mediatek: Enlarge the validate PA range
for 4GB mode") introduce the following build error:

   drivers/iommu/mtk_iommu.c: In function 'mtk_iommu_hw_init':
>> drivers/iommu/mtk_iommu.c:536:30: error: 'const struct mtk_iommu_data'
 has no member named 'm4u_type'; did you mean 'm4u_dom'?
 if (data->enable_4GB && data->m4u_type != M4U_MT8173) {

This patch fix it, use "m4u_plat" instead of "m4u_type".

Reported-by: kernel test robot 
Signed-off-by: Yong Wu 
---
1) In the v2 of mt2712 iommu support patch-set, We changed a variant
name from "m4u_type" to "m4u_plat", But I'm sorry that I forgot to
change it in the later patch.
2) This patch is based on [1].
[1]:https://lists.linuxfoundation.org/pipermail/iommu/2017-August/023847.html
3) I cann't find the commit id of the patch ("iommu/mediatek: Enlarge
the validate PA range for 4GB mode") in the [2].
So I don't write the commit id for that patch in the commit message.
[2]https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git next
---
 drivers/iommu/mtk_iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 4f233e1..bc00e40 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -533,7 +533,7 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data 
*data)
 
writel_relaxed(F_MMU_IVRP_PA_SET(data->protect_base, data->enable_4GB),
   data->base + REG_MMU_IVRP_PADDR);
-   if (data->enable_4GB && data->m4u_type != M4U_MT8173) {
+   if (data->enable_4GB && data->m4u_plat != M4U_MT8173) {
/*
 * If 4GB mode is enabled, the validate PA range is from
 * 0x1__ to 0x1__. here record bit[32:30].
-- 
1.9.1

Re: linux-next: build warning after merge of the phy-next tree

2017-08-23 Thread Stephen Rothwell

Hi Greg,

On Tue, 22 Aug 2017 15:07:33 +1000 Stephen Rothwell  
wrote:
>
> Hi Kishon,
> 
> After merging the phy-next tree, today's linux-next build (x86_64
> allmodconfig) produced this warning:
> 
> drivers/phy/ralink/phy-ralink-usb.c: In function 'ralink_usb_phy_probe':
> drivers/phy/ralink/phy-ralink-usb.c:195:13: warning: cast from pointer to 
> integer of different size [-Wpointer-to-int-cast]
>   phy->clk = (u32) match->data;
>  ^
> 
> Introduced by commit
> 
>   2411a736ff09 ("phy: ralink-usb: add driver for Mediatek/Ralink")

I now get this warning after merging the usb tree ...

-- 
Cheers,
Stephen Rothwell

Re: linux-next: build warning after merge of the phy-next tree

2017-08-23 Thread Stephen Rothwell

Hi Greg,

On Tue, 22 Aug 2017 15:07:33 +1000 Stephen Rothwell  
wrote:
>
> Hi Kishon,
> 
> After merging the phy-next tree, today's linux-next build (x86_64
> allmodconfig) produced this warning:
> 
> drivers/phy/ralink/phy-ralink-usb.c: In function 'ralink_usb_phy_probe':
> drivers/phy/ralink/phy-ralink-usb.c:195:13: warning: cast from pointer to 
> integer of different size [-Wpointer-to-int-cast]
>   phy->clk = (u32) match->data;
>  ^
> 
> Introduced by commit
> 
>   2411a736ff09 ("phy: ralink-usb: add driver for Mediatek/Ralink")

I now get this warning after merging the usb tree ...

-- 
Cheers,
Stephen Rothwell

Re: [PATCH][net-next] net: hinic: make functions set_ctrl0 and set_ctrl1 static

2017-08-23 Thread David Miller

From: Colin King 
Date: Wed, 23 Aug 2017 10:59:40 +0100

> From: Colin Ian King 
> 
> The functions set_ctrl0 and set_ctrl1 are local to the source and do
> not need to be in global scope, so make them static.
> 
> Cleans up sparse warnings:
> symbol 'set_ctrl0' was not declared. Should it be static?
> symbol 'set_ctrl1' was not declared. Should it be static?
> 
> Signed-off-by: Colin Ian King 

Applied.

Re: [PATCH][net-next] net: hinic: make functions set_ctrl0 and set_ctrl1 static

2017-08-23 Thread David Miller

From: Colin King 
Date: Wed, 23 Aug 2017 10:59:40 +0100

> From: Colin Ian King 
> 
> The functions set_ctrl0 and set_ctrl1 are local to the source and do
> not need to be in global scope, so make them static.
> 
> Cleans up sparse warnings:
> symbol 'set_ctrl0' was not declared. Should it be static?
> symbol 'set_ctrl1' was not declared. Should it be static?
> 
> Signed-off-by: Colin Ian King 

Applied.

[PATCH] acpi: acpica: fix acpi operand cache leak in dsutils.c

2017-08-23 Thread Seunghun Han

I found an ACPI cache leak in ACPI early termination and boot continuing case.

When early termination is occurred due to malicious ACPI table, Linux kernel
terminates ACPI function and continues to boot process. While kernel terminates
ACPI function, kmem_cache_destroy() reports Acpi-Operand cache leak.

Boot log of ACPI operand cache leak is as follows:
>[0.585957] ACPI: Added _OSI(Module Device)
>[0.587218] ACPI: Added _OSI(Processor Device)
>[0.588530] ACPI: Added _OSI(3.0 _SCP Extensions)
>[0.589790] ACPI: Added _OSI(Processor Aggregator Device)
>[0.591534] ACPI Error: Illegal I/O port address/length above 64K: 
>C806E0004002/0x2 (20170303/hwvalid-155)
>[0.594351] ACPI Exception: AE_LIMIT, Unable to initialize fixed events 
>(20170303/evevent-88)
>[0.597858] ACPI: Unable to start the ACPI Interpreter
>[0.599162] ACPI Error: Could not remove SCI handler (20170303/evmisc-281)
>[0.601836] kmem_cache_destroy Acpi-Operand: Slab cache still has objects
>[0.603556] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.12.0-rc5 #26
>[0.605159] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS 
>VirtualBox 12/01/2006
>[0.609177] Call Trace:
>[0.610063]  ? dump_stack+0x5c/0x81
>[0.68]  ? kmem_cache_destroy+0x1aa/0x1c0
>[0.612632]  ? acpi_sleep_proc_init+0x27/0x27
>[0.613906]  ? acpi_os_delete_cache+0xa/0x10
>[0.617986]  ? acpi_ut_delete_caches+0x3f/0x7b
>[0.619293]  ? acpi_terminate+0xa/0x14
>[0.620394]  ? acpi_init+0x2af/0x34f
>[0.621616]  ? __class_create+0x4c/0x80
>[0.623412]  ? video_setup+0x7f/0x7f
>[0.624585]  ? acpi_sleep_proc_init+0x27/0x27
>[0.625861]  ? do_one_initcall+0x4e/0x1a0
>[0.627513]  ? kernel_init_freeable+0x19e/0x21f
>[0.628972]  ? rest_init+0x80/0x80
>[0.630043]  ? kernel_init+0xa/0x100
>[0.631084]  ? ret_from_fork+0x25/0x30
>[0.633343] vgaarb: loaded
>[0.635036] EDAC MC: Ver: 3.0.0
>[0.638601] PCI: Probing PCI hardware
>[0.639833] PCI host bridge to bus :00
>[0.641031] pci_bus :00: root bus resource [io  0x-0x]
> ... Continue to boot and log is omitted ...

I analyzed this memory leak in detail and found acpi_ds_obj_stack_pop_and_
delete() function miscalculated the top of the stack. acpi_ds_obj_stack_push()
function uses walk_state->operand_index for start position of the top, but
acpi_ds_obj_stack_pop_and_delete() function considers index 0 for it.
Therefore, this causes acpi operand memory leak.

This cache leak causes a security threat because an old kernel (<= 4.9) shows
memory locations of kernel functions in stack dump. Some malicious users
could use this information to neutralize kernel ASLR.

I made a patch to fix ACPI operand cache leak.

Signed-off-by: Seunghun Han 
---
 drivers/acpi/acpica/dsutils.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/acpica/dsutils.c b/drivers/acpi/acpica/dsutils.c
index 0dabd9b..2c8a060 100644
--- a/drivers/acpi/acpica/dsutils.c
+++ b/drivers/acpi/acpica/dsutils.c
@@ -705,6 +705,8 @@ acpi_ds_create_operands(struct acpi_walk_state *walk_state,
union acpi_parse_object *arguments[ACPI_OBJ_NUM_OPERANDS];
u32 arg_count = 0;
u32 index = walk_state->num_operands;
+   u32 prev_num_operands = walk_state->num_operands;
+   u32 new_num_operands;
u32 i;
 
ACPI_FUNCTION_TRACE_PTR(ds_create_operands, first_arg);
@@ -733,6 +735,7 @@ acpi_ds_create_operands(struct acpi_walk_state *walk_state,
 
/* Create the interpreter arguments, in reverse order */
 
+   new_num_operands = index;
index--;
for (i = 0; i < arg_count; i++) {
arg = arguments[index];
@@ -757,7 +760,11 @@ acpi_ds_create_operands(struct acpi_walk_state *walk_state,
 * pop everything off of the operand stack and delete those
 * objects
 */
-   acpi_ds_obj_stack_pop_and_delete(arg_count, walk_state);
+   walk_state->num_operands = i;
+   acpi_ds_obj_stack_pop_and_delete(new_num_operands, walk_state);
+
+   /* Restore operand count */
+   walk_state->num_operands = prev_num_operands;
 
ACPI_EXCEPTION((AE_INFO, status, "While creating Arg %u", index));
return_ACPI_STATUS(status);
-- 
2.1.4

[PATCH] acpi: acpica: fix acpi operand cache leak in dsutils.c

2017-08-23 Thread Seunghun Han

I found an ACPI cache leak in ACPI early termination and boot continuing case.

When early termination is occurred due to malicious ACPI table, Linux kernel
terminates ACPI function and continues to boot process. While kernel terminates
ACPI function, kmem_cache_destroy() reports Acpi-Operand cache leak.

Boot log of ACPI operand cache leak is as follows:
>[0.585957] ACPI: Added _OSI(Module Device)
>[0.587218] ACPI: Added _OSI(Processor Device)
>[0.588530] ACPI: Added _OSI(3.0 _SCP Extensions)
>[0.589790] ACPI: Added _OSI(Processor Aggregator Device)
>[0.591534] ACPI Error: Illegal I/O port address/length above 64K: 
>C806E0004002/0x2 (20170303/hwvalid-155)
>[0.594351] ACPI Exception: AE_LIMIT, Unable to initialize fixed events 
>(20170303/evevent-88)
>[0.597858] ACPI: Unable to start the ACPI Interpreter
>[0.599162] ACPI Error: Could not remove SCI handler (20170303/evmisc-281)
>[0.601836] kmem_cache_destroy Acpi-Operand: Slab cache still has objects
>[0.603556] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.12.0-rc5 #26
>[0.605159] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS 
>VirtualBox 12/01/2006
>[0.609177] Call Trace:
>[0.610063]  ? dump_stack+0x5c/0x81
>[0.68]  ? kmem_cache_destroy+0x1aa/0x1c0
>[0.612632]  ? acpi_sleep_proc_init+0x27/0x27
>[0.613906]  ? acpi_os_delete_cache+0xa/0x10
>[0.617986]  ? acpi_ut_delete_caches+0x3f/0x7b
>[0.619293]  ? acpi_terminate+0xa/0x14
>[0.620394]  ? acpi_init+0x2af/0x34f
>[0.621616]  ? __class_create+0x4c/0x80
>[0.623412]  ? video_setup+0x7f/0x7f
>[0.624585]  ? acpi_sleep_proc_init+0x27/0x27
>[0.625861]  ? do_one_initcall+0x4e/0x1a0
>[0.627513]  ? kernel_init_freeable+0x19e/0x21f
>[0.628972]  ? rest_init+0x80/0x80
>[0.630043]  ? kernel_init+0xa/0x100
>[0.631084]  ? ret_from_fork+0x25/0x30
>[0.633343] vgaarb: loaded
>[0.635036] EDAC MC: Ver: 3.0.0
>[0.638601] PCI: Probing PCI hardware
>[0.639833] PCI host bridge to bus :00
>[0.641031] pci_bus :00: root bus resource [io  0x-0x]
> ... Continue to boot and log is omitted ...

I analyzed this memory leak in detail and found acpi_ds_obj_stack_pop_and_
delete() function miscalculated the top of the stack. acpi_ds_obj_stack_push()
function uses walk_state->operand_index for start position of the top, but
acpi_ds_obj_stack_pop_and_delete() function considers index 0 for it.
Therefore, this causes acpi operand memory leak.

This cache leak causes a security threat because an old kernel (<= 4.9) shows
memory locations of kernel functions in stack dump. Some malicious users
could use this information to neutralize kernel ASLR.

I made a patch to fix ACPI operand cache leak.

Signed-off-by: Seunghun Han 
---
 drivers/acpi/acpica/dsutils.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/acpica/dsutils.c b/drivers/acpi/acpica/dsutils.c
index 0dabd9b..2c8a060 100644
--- a/drivers/acpi/acpica/dsutils.c
+++ b/drivers/acpi/acpica/dsutils.c
@@ -705,6 +705,8 @@ acpi_ds_create_operands(struct acpi_walk_state *walk_state,
union acpi_parse_object *arguments[ACPI_OBJ_NUM_OPERANDS];
u32 arg_count = 0;
u32 index = walk_state->num_operands;
+   u32 prev_num_operands = walk_state->num_operands;
+   u32 new_num_operands;
u32 i;
 
ACPI_FUNCTION_TRACE_PTR(ds_create_operands, first_arg);
@@ -733,6 +735,7 @@ acpi_ds_create_operands(struct acpi_walk_state *walk_state,
 
/* Create the interpreter arguments, in reverse order */
 
+   new_num_operands = index;
index--;
for (i = 0; i < arg_count; i++) {
arg = arguments[index];
@@ -757,7 +760,11 @@ acpi_ds_create_operands(struct acpi_walk_state *walk_state,
 * pop everything off of the operand stack and delete those
 * objects
 */
-   acpi_ds_obj_stack_pop_and_delete(arg_count, walk_state);
+   walk_state->num_operands = i;
+   acpi_ds_obj_stack_pop_and_delete(new_num_operands, walk_state);
+
+   /* Restore operand count */
+   walk_state->num_operands = prev_num_operands;
 
ACPI_EXCEPTION((AE_INFO, status, "While creating Arg %u", index));
return_ACPI_STATUS(status);
-- 
2.1.4

Re: [PATCH v3 1/3] lockdep: Make LOCKDEP_CROSSRELEASE configs all part of PROVE_LOCKING

2017-08-23 Thread Byungchul Park

On Wed, Aug 23, 2017 at 12:26:52PM +0200, Peter Zijlstra wrote:
> On Wed, Aug 23, 2017 at 11:43:23AM +0900, Byungchul Park wrote:
> > On Tue, Aug 22, 2017 at 03:49:22PM +0200, Peter Zijlstra wrote:
> 
> > > So I think we'll end up hitting a lockdep deficiency and not trigger the
> > > splat on flush_work(), see also:
> > > 
> > >   https://lwn.net/Articles/332801/
> > > 
> > > lock_map_acquire_read() is a read-recursive and will not in fact create
> > > any dependencies because of this issue.
> > > 
> > > In specific, check_prev_add() has:
> > > 
> > >   if (next->read == 2 || prev->read == 2)
> > >   return 1;
> > > 
> > > This means that for:
> > > 
> > >   lock_map_acquire_read(W)(2)
> > >   down_write(A)   (0)
> > > 
> > >   down_write(A)   (0)
> > >   wait_for_completion(C)  (0)
> > > 
> > >   lock_map_acquire_read(W)(2)
> > >   complete(C) (0)
> > > 
> > > All the (2) effectively go away and 'solve' our current issue, but:
> > > 
> > >   lock_map_acquire_read(W)(2)
> > >   mutex_lock(A)   (0)
> > > 
> > >   mutex_lock(A)   (0)
> > >   lock_map_acquire(W) (0)
> > > 
> > > as per flush_work() will not in fact trigger anymore either.
> > 
> > It should be triggered. Lockdep code should be fixed so that it does.
> 
> Yeah, its just something we never got around to. Once every so often I
> get reminded of it, like now. But then it sits on the todo list and
> never quite happens.

I want to try it.

Re: [PATCH v3 1/3] lockdep: Make LOCKDEP_CROSSRELEASE configs all part of PROVE_LOCKING

2017-08-23 Thread Byungchul Park

On Wed, Aug 23, 2017 at 12:26:52PM +0200, Peter Zijlstra wrote:
> On Wed, Aug 23, 2017 at 11:43:23AM +0900, Byungchul Park wrote:
> > On Tue, Aug 22, 2017 at 03:49:22PM +0200, Peter Zijlstra wrote:
> 
> > > So I think we'll end up hitting a lockdep deficiency and not trigger the
> > > splat on flush_work(), see also:
> > > 
> > >   https://lwn.net/Articles/332801/
> > > 
> > > lock_map_acquire_read() is a read-recursive and will not in fact create
> > > any dependencies because of this issue.
> > > 
> > > In specific, check_prev_add() has:
> > > 
> > >   if (next->read == 2 || prev->read == 2)
> > >   return 1;
> > > 
> > > This means that for:
> > > 
> > >   lock_map_acquire_read(W)(2)
> > >   down_write(A)   (0)
> > > 
> > >   down_write(A)   (0)
> > >   wait_for_completion(C)  (0)
> > > 
> > >   lock_map_acquire_read(W)(2)
> > >   complete(C) (0)
> > > 
> > > All the (2) effectively go away and 'solve' our current issue, but:
> > > 
> > >   lock_map_acquire_read(W)(2)
> > >   mutex_lock(A)   (0)
> > > 
> > >   mutex_lock(A)   (0)
> > >   lock_map_acquire(W) (0)
> > > 
> > > as per flush_work() will not in fact trigger anymore either.
> > 
> > It should be triggered. Lockdep code should be fixed so that it does.
> 
> Yeah, its just something we never got around to. Once every so often I
> get reminded of it, like now. But then it sits on the todo list and
> never quite happens.

I want to try it.

Re: [PATCH v3 1/3] lockdep: Make LOCKDEP_CROSSRELEASE configs all part of PROVE_LOCKING

2017-08-23 Thread Byungchul Park

On Wed, Aug 23, 2017 at 12:46:48PM +0200, Peter Zijlstra wrote:
> On Wed, Aug 23, 2017 at 11:31:18AM +0900, Byungchul Park wrote:
> > On Tue, Aug 22, 2017 at 11:06:03AM +0200, Peter Zijlstra wrote:
> 
> > Currently, we do the following in process_one_work(),
> > 
> > lockdep_map_acquire for a workqueue
> > lockdep_map_acquire for a work
> > 
> > But IMHO it should be,
> > 
> > lockdep_map_acquire for a pair of workqueue and work.
> > 
> > Right?
> 
> No, it is right. We need the two 'locks'.
> 
> The work one is for flush_work(), the workqueue one is for
> flush_workqueue().
> 
> Just like how flush_work() must not depend on any lock taken inside the
> work, flush_workqueue() callers must not hold any lock acquired inside
> any work ran inside the workqueue. This cannot be done with a single
> 'lock'.

Thank you for explanation.

> The reason flush_work() also depends on the wq 'lock' is because doing
> flush_work() from inside work is a possible problem for single threaded
> workqueues and workqueues with a rescuer.
>  
> > > Agreed. The interaction with workqueues is buggered.
> > 
> > I think original uses of lockdep_map were already wrong. I mean it's
> > not a problem of newly introduced code.
> 
> Not wrong per-se, the new code certainly places more constraints on it.

"the new code places more constraints on it" is just the right expression.

Re: [PATCH v3 1/3] lockdep: Make LOCKDEP_CROSSRELEASE configs all part of PROVE_LOCKING

2017-08-23 Thread Byungchul Park

On Wed, Aug 23, 2017 at 12:46:48PM +0200, Peter Zijlstra wrote:
> On Wed, Aug 23, 2017 at 11:31:18AM +0900, Byungchul Park wrote:
> > On Tue, Aug 22, 2017 at 11:06:03AM +0200, Peter Zijlstra wrote:
> 
> > Currently, we do the following in process_one_work(),
> > 
> > lockdep_map_acquire for a workqueue
> > lockdep_map_acquire for a work
> > 
> > But IMHO it should be,
> > 
> > lockdep_map_acquire for a pair of workqueue and work.
> > 
> > Right?
> 
> No, it is right. We need the two 'locks'.
> 
> The work one is for flush_work(), the workqueue one is for
> flush_workqueue().
> 
> Just like how flush_work() must not depend on any lock taken inside the
> work, flush_workqueue() callers must not hold any lock acquired inside
> any work ran inside the workqueue. This cannot be done with a single
> 'lock'.

Thank you for explanation.

> The reason flush_work() also depends on the wq 'lock' is because doing
> flush_work() from inside work is a possible problem for single threaded
> workqueues and workqueues with a rescuer.
>  
> > > Agreed. The interaction with workqueues is buggered.
> > 
> > I think original uses of lockdep_map were already wrong. I mean it's
> > not a problem of newly introduced code.
> 
> Not wrong per-se, the new code certainly places more constraints on it.

"the new code places more constraints on it" is just the right expression.

[PATCH v8 3/3] PCI: iproc: add device shutdown for PCI RC

2017-08-23 Thread Oza Pawandeep

PERST must be asserted around ~500ms before the reboot is applied.

During soft reset (e.g., "reboot" from Linux) on some iproc based SOCs
LCPLL clock and PERST both goes off simultaneously.
This will cause certain Endpoints Intel NVMe not get detected, upon
next boot sequence.

This is specifically happening with Intel P3700 400GB series.
Endpoint is expecting the clock for some amount of time after PERST is
asserted, which is not happening in Stingray (iproc based SOC).
This causes NVMe to behave in undefined way.

On the contrary, Intel x86 boards will have ref clock running, so we
do not see this behavior there.

Besides, PCI spec does not stipulate about such timings.
In-fact it does not tell us, whether PCIe device should consider
refclk to be available or not to be.

It is completely up to vendor to design their EP the way they want
with respect to ref clock availability.

500ms is just based on the observation and taken as safe margin.
This patch adds platform shutdown where it should be
called in device_shutdown while reboot command is issued.
So in sequence first Endpoint Shutdown (e.g. nvme_shutdown)
followed by RC shutdown, which issues safe PERST assertion.

Signed-off-by: Oza Pawandeep 
Reviewed-by: Ray Jui 
Reviewed-by: Scott Branden 

diff --git a/drivers/pci/host/pcie-iproc-platform.c 
b/drivers/pci/host/pcie-iproc-platform.c
index 2253119..a5073a9 100644
--- a/drivers/pci/host/pcie-iproc-platform.c
+++ b/drivers/pci/host/pcie-iproc-platform.c
@@ -134,6 +134,13 @@ static int iproc_pcie_pltfm_remove(struct platform_device 
*pdev)
return iproc_pcie_remove(pcie);
 }
 
+static void iproc_pcie_pltfm_shutdown(struct platform_device *pdev)
+{
+   struct iproc_pcie *pcie = platform_get_drvdata(pdev);
+
+   iproc_pcie_shutdown(pcie);
+}
+
 static struct platform_driver iproc_pcie_pltfm_driver = {
.driver = {
.name = "iproc-pcie",
@@ -141,6 +148,7 @@ static int iproc_pcie_pltfm_remove(struct platform_device 
*pdev)
},
.probe = iproc_pcie_pltfm_probe,
.remove = iproc_pcie_pltfm_remove,
+   .shutdown = iproc_pcie_pltfm_shutdown,
 };
 module_platform_driver(iproc_pcie_pltfm_driver);
 
diff --git a/drivers/pci/host/pcie-iproc.c b/drivers/pci/host/pcie-iproc.c
index 37f4adf..cbdabe8 100644
--- a/drivers/pci/host/pcie-iproc.c
+++ b/drivers/pci/host/pcie-iproc.c
@@ -659,7 +659,7 @@ static int iproc_pcie_config_write32(struct pci_bus *bus, 
unsigned int devfn,
.write = iproc_pcie_config_write32,
 };
 
-static void iproc_pcie_reset(struct iproc_pcie *pcie)
+static void iproc_pcie_perst_ctrl(struct iproc_pcie *pcie, bool assert)
 {
u32 val;
 
@@ -671,19 +671,26 @@ static void iproc_pcie_reset(struct iproc_pcie *pcie)
if (pcie->ep_is_internal)
return;
 
-   /*
-* Select perst_b signal as reset source. Put the device into reset,
-* and then bring it out of reset
-*/
-   val = iproc_pcie_read_reg(pcie, IPROC_PCIE_CLK_CTRL);
-   val &= ~EP_PERST_SOURCE_SELECT & ~EP_MODE_SURVIVE_PERST &
-   ~RC_PCIE_RST_OUTPUT;
-   iproc_pcie_write_reg(pcie, IPROC_PCIE_CLK_CTRL, val);
-   udelay(250);
-
-   val |= RC_PCIE_RST_OUTPUT;
-   iproc_pcie_write_reg(pcie, IPROC_PCIE_CLK_CTRL, val);
-   msleep(100);
+   if (assert) {
+   val = iproc_pcie_read_reg(pcie, IPROC_PCIE_CLK_CTRL);
+   val &= ~EP_PERST_SOURCE_SELECT & ~EP_MODE_SURVIVE_PERST &
+   ~RC_PCIE_RST_OUTPUT;
+   iproc_pcie_write_reg(pcie, IPROC_PCIE_CLK_CTRL, val);
+   udelay(250);
+   } else {
+   val = iproc_pcie_read_reg(pcie, IPROC_PCIE_CLK_CTRL);
+   val |= RC_PCIE_RST_OUTPUT;
+   iproc_pcie_write_reg(pcie, IPROC_PCIE_CLK_CTRL, val);
+   msleep(100);
+   }
+}
+
+int iproc_pcie_shutdown(struct iproc_pcie *pcie)
+{
+   iproc_pcie_perst_ctrl(pcie, true);
+   msleep(500);
+
+   return 0;
 }
 
 static int iproc_pcie_check_link(struct iproc_pcie *pcie)
@@ -1365,7 +1372,8 @@ int iproc_pcie_setup(struct iproc_pcie *pcie, struct 
list_head *res)
goto err_exit_phy;
}
 
-   iproc_pcie_reset(pcie);
+   iproc_pcie_perst_ctrl(pcie, true);
+   iproc_pcie_perst_ctrl(pcie, false);
 
if (pcie->need_ob_cfg) {
ret = iproc_pcie_map_ranges(pcie, res);
diff --git a/drivers/pci/host/pcie-iproc.h b/drivers/pci/host/pcie-iproc.h
index 0bbe2ea..a6b55ce 100644
--- a/drivers/pci/host/pcie-iproc.h
+++ b/drivers/pci/host/pcie-iproc.h
@@ -110,6 +110,7 @@ struct iproc_pcie {
 
 int iproc_pcie_setup(struct iproc_pcie *pcie, struct list_head *res);
 int iproc_pcie_remove(struct iproc_pcie *pcie);
+int iproc_pcie_shutdown(struct iproc_pcie *pcie);
 
 #ifdef CONFIG_PCIE_IPROC_MSI
 int iproc_msi_init(struct iproc_pcie *pcie, struct device_node *node);
-- 
1.9.1

[PATCH v8 3/3] PCI: iproc: add device shutdown for PCI RC

2017-08-23 Thread Oza Pawandeep

PERST must be asserted around ~500ms before the reboot is applied.

During soft reset (e.g., "reboot" from Linux) on some iproc based SOCs
LCPLL clock and PERST both goes off simultaneously.
This will cause certain Endpoints Intel NVMe not get detected, upon
next boot sequence.

This is specifically happening with Intel P3700 400GB series.
Endpoint is expecting the clock for some amount of time after PERST is
asserted, which is not happening in Stingray (iproc based SOC).
This causes NVMe to behave in undefined way.

On the contrary, Intel x86 boards will have ref clock running, so we
do not see this behavior there.

Besides, PCI spec does not stipulate about such timings.
In-fact it does not tell us, whether PCIe device should consider
refclk to be available or not to be.

It is completely up to vendor to design their EP the way they want
with respect to ref clock availability.

500ms is just based on the observation and taken as safe margin.
This patch adds platform shutdown where it should be
called in device_shutdown while reboot command is issued.
So in sequence first Endpoint Shutdown (e.g. nvme_shutdown)
followed by RC shutdown, which issues safe PERST assertion.

Signed-off-by: Oza Pawandeep 
Reviewed-by: Ray Jui 
Reviewed-by: Scott Branden 

diff --git a/drivers/pci/host/pcie-iproc-platform.c 
b/drivers/pci/host/pcie-iproc-platform.c
index 2253119..a5073a9 100644
--- a/drivers/pci/host/pcie-iproc-platform.c
+++ b/drivers/pci/host/pcie-iproc-platform.c
@@ -134,6 +134,13 @@ static int iproc_pcie_pltfm_remove(struct platform_device 
*pdev)
return iproc_pcie_remove(pcie);
 }
 
+static void iproc_pcie_pltfm_shutdown(struct platform_device *pdev)
+{
+   struct iproc_pcie *pcie = platform_get_drvdata(pdev);
+
+   iproc_pcie_shutdown(pcie);
+}
+
 static struct platform_driver iproc_pcie_pltfm_driver = {
.driver = {
.name = "iproc-pcie",
@@ -141,6 +148,7 @@ static int iproc_pcie_pltfm_remove(struct platform_device 
*pdev)
},
.probe = iproc_pcie_pltfm_probe,
.remove = iproc_pcie_pltfm_remove,
+   .shutdown = iproc_pcie_pltfm_shutdown,
 };
 module_platform_driver(iproc_pcie_pltfm_driver);
 
diff --git a/drivers/pci/host/pcie-iproc.c b/drivers/pci/host/pcie-iproc.c
index 37f4adf..cbdabe8 100644
--- a/drivers/pci/host/pcie-iproc.c
+++ b/drivers/pci/host/pcie-iproc.c
@@ -659,7 +659,7 @@ static int iproc_pcie_config_write32(struct pci_bus *bus, 
unsigned int devfn,
.write = iproc_pcie_config_write32,
 };
 
-static void iproc_pcie_reset(struct iproc_pcie *pcie)
+static void iproc_pcie_perst_ctrl(struct iproc_pcie *pcie, bool assert)
 {
u32 val;
 
@@ -671,19 +671,26 @@ static void iproc_pcie_reset(struct iproc_pcie *pcie)
if (pcie->ep_is_internal)
return;
 
-   /*
-* Select perst_b signal as reset source. Put the device into reset,
-* and then bring it out of reset
-*/
-   val = iproc_pcie_read_reg(pcie, IPROC_PCIE_CLK_CTRL);
-   val &= ~EP_PERST_SOURCE_SELECT & ~EP_MODE_SURVIVE_PERST &
-   ~RC_PCIE_RST_OUTPUT;
-   iproc_pcie_write_reg(pcie, IPROC_PCIE_CLK_CTRL, val);
-   udelay(250);
-
-   val |= RC_PCIE_RST_OUTPUT;
-   iproc_pcie_write_reg(pcie, IPROC_PCIE_CLK_CTRL, val);
-   msleep(100);
+   if (assert) {
+   val = iproc_pcie_read_reg(pcie, IPROC_PCIE_CLK_CTRL);
+   val &= ~EP_PERST_SOURCE_SELECT & ~EP_MODE_SURVIVE_PERST &
+   ~RC_PCIE_RST_OUTPUT;
+   iproc_pcie_write_reg(pcie, IPROC_PCIE_CLK_CTRL, val);
+   udelay(250);
+   } else {
+   val = iproc_pcie_read_reg(pcie, IPROC_PCIE_CLK_CTRL);
+   val |= RC_PCIE_RST_OUTPUT;
+   iproc_pcie_write_reg(pcie, IPROC_PCIE_CLK_CTRL, val);
+   msleep(100);
+   }
+}
+
+int iproc_pcie_shutdown(struct iproc_pcie *pcie)
+{
+   iproc_pcie_perst_ctrl(pcie, true);
+   msleep(500);
+
+   return 0;
 }
 
 static int iproc_pcie_check_link(struct iproc_pcie *pcie)
@@ -1365,7 +1372,8 @@ int iproc_pcie_setup(struct iproc_pcie *pcie, struct 
list_head *res)
goto err_exit_phy;
}
 
-   iproc_pcie_reset(pcie);
+   iproc_pcie_perst_ctrl(pcie, true);
+   iproc_pcie_perst_ctrl(pcie, false);
 
if (pcie->need_ob_cfg) {
ret = iproc_pcie_map_ranges(pcie, res);
diff --git a/drivers/pci/host/pcie-iproc.h b/drivers/pci/host/pcie-iproc.h
index 0bbe2ea..a6b55ce 100644
--- a/drivers/pci/host/pcie-iproc.h
+++ b/drivers/pci/host/pcie-iproc.h
@@ -110,6 +110,7 @@ struct iproc_pcie {
 
 int iproc_pcie_setup(struct iproc_pcie *pcie, struct list_head *res);
 int iproc_pcie_remove(struct iproc_pcie *pcie);
+int iproc_pcie_shutdown(struct iproc_pcie *pcie);
 
 #ifdef CONFIG_PCIE_IPROC_MSI
 int iproc_msi_init(struct iproc_pcie *pcie, struct device_node *node);
-- 
1.9.1

[PATCH v8 1/3] PCI: iproc: factor out ep configuration access

2017-08-23 Thread Oza Pawandeep

This patch factors out ep configuration access
as a separate function.

Signed-off-by: Oza Pawandeep 

diff --git a/drivers/pci/host/pcie-iproc.c b/drivers/pci/host/pcie-iproc.c
index c574863..61d9be6 100644
--- a/drivers/pci/host/pcie-iproc.c
+++ b/drivers/pci/host/pcie-iproc.c
@@ -448,6 +448,31 @@ static inline void iproc_pcie_apb_err_disable(struct 
pci_bus *bus,
}
 }
 
+static void __iomem *iproc_pcie_map_ep_cfg_reg(struct iproc_pcie *pcie,
+  unsigned int busno,
+  unsigned int slot,
+  unsigned int fn,
+  int where)
+{
+   u16 offset;
+   u32 val;
+
+   /* EP device access */
+   val = (busno << CFG_ADDR_BUS_NUM_SHIFT) |
+   (slot << CFG_ADDR_DEV_NUM_SHIFT) |
+   (fn << CFG_ADDR_FUNC_NUM_SHIFT) |
+   (where & CFG_ADDR_REG_NUM_MASK) |
+   (1 & CFG_ADDR_CFG_TYPE_MASK);
+
+   iproc_pcie_write_reg(pcie, IPROC_PCIE_CFG_ADDR, val);
+   offset = iproc_pcie_reg_offset(pcie, IPROC_PCIE_CFG_DATA);
+
+   if (iproc_pcie_reg_is_invalid(offset))
+   return NULL;
+
+   return (pcie->base + offset);
+}
+
 /**
  * Note access to the configuration registers are protected at the higher layer
  * by 'pci_lock' in drivers/pci/access.c
@@ -459,7 +484,6 @@ static void __iomem *iproc_pcie_map_cfg_bus(struct 
iproc_pcie *pcie,
 {
unsigned slot = PCI_SLOT(devfn);
unsigned fn = PCI_FUNC(devfn);
-   u32 val;
u16 offset;
 
/* root complex access */
@@ -484,18 +508,7 @@ static void __iomem *iproc_pcie_map_cfg_bus(struct 
iproc_pcie *pcie,
if (slot > 0)
return NULL;
 
-   /* EP device access */
-   val = (busno << CFG_ADDR_BUS_NUM_SHIFT) |
-   (slot << CFG_ADDR_DEV_NUM_SHIFT) |
-   (fn << CFG_ADDR_FUNC_NUM_SHIFT) |
-   (where & CFG_ADDR_REG_NUM_MASK) |
-   (1 & CFG_ADDR_CFG_TYPE_MASK);
-   iproc_pcie_write_reg(pcie, IPROC_PCIE_CFG_ADDR, val);
-   offset = iproc_pcie_reg_offset(pcie, IPROC_PCIE_CFG_DATA);
-   if (iproc_pcie_reg_is_invalid(offset))
-   return NULL;
-   else
-   return (pcie->base + offset);
+   return iproc_pcie_map_ep_cfg_reg(pcie, busno, slot, fn, where);
 }
 
 static void __iomem *iproc_pcie_bus_map_cfg_bus(struct pci_bus *bus,
-- 
1.9.1

[PATCH v8 2/3] PCI: iproc: retry request when CRS returned from EP

2017-08-23 Thread Oza Pawandeep

PCIe spec r3.1, sec 2.3.2
If CRS software visibility is not enabled, the RC must reissue the
config request as a new request.

- If CRS software visibility is enabled,
- for a config read of Vendor ID, the RC must return 0x0001 data
- for all other config reads/writes, the RC must reissue the
  request

iproc PCIe Controller spec:
4.7.3.3. Retry Status On Configuration Cycle
Endpoints are allowed to generate retry status on configuration
cycles. In this case, the RC needs to re-issue the request. The IP
does not handle this because the number of configuration cycles needed
will probably be less than the total number of non-posted operations
needed.

When a retry status is received on the User RX interface for a
configuration request that was sent on the User TX interface,
it will be indicated with a completion with the CMPL_STATUS field set
to 2=CRS, and the user will have to find the address and data values
and send a new transaction on the User TX interface.
When the internal configuration space returns a retry status during a
configuration cycle (user_cscfg = 1) on the Command/Status interface,
the pcie_cscrs will assert with the pcie_csack signal to indicate the
CRS status.
When the CRS Software Visibility Enable register in the Root Control
register is enabled, the IP will return the data value to 0x0001 for
the Vendor ID value and 0x  (all 1’s) for the rest of the data in
the request for reads of offset 0 that return with CRS status.  This
is true for both the User RX Interface and for the Command/Status
interface.  When CRS Software Visibility is enabled, the CMPL_STATUS
field of the completion on the User RX Interface will not be 2=CRS and
the pcie_cscrs signal will not assert on the Command/Status interface.

Per PCIe r3.1, sec 2.3.2, config requests that receive completions
with Configuration Request Retry Status (CRS) should be reissued by
the hardware except reads of the Vendor ID when CRS Software
Visibility is enabled.

This hardware never reissues configuration requests when it receives
CRS completions.
Note that, neither PCIe host bridge nor PCIe core re-issues the
request for any configuration offset.

For config reads, this hardware returns CFG_RETRY_STATUS data when
it receives a CRS completion for a config read, regardless of the
address of the read or the CRS Software Visibility Enable bit.

This patch implements iproc_pcie_config_read which gets called for
Stingray, if it receives a CRS completion, it retries reading it again.
In case of timeout, it returns 0x.
For other iproc based SOC, it falls back to PCI generic APIs.

Signed-off-by: Oza Pawandeep 

diff --git a/drivers/pci/host/pcie-iproc.c b/drivers/pci/host/pcie-iproc.c
index 61d9be6..37f4adf 100644
--- a/drivers/pci/host/pcie-iproc.c
+++ b/drivers/pci/host/pcie-iproc.c
@@ -68,6 +68,9 @@
 #define APB_ERR_EN_SHIFT 0
 #define APB_ERR_EN   BIT(APB_ERR_EN_SHIFT)
 
+#define CFG_RETRY_STATUS 0x0001
+#define CFG_RETRY_STATUS_TIMEOUT_US  50 /* 500 milli-seconds. */
+
 /* derive the enum index of the outbound/inbound mapping registers */
 #define MAP_REG(base_reg, index)  ((base_reg) + (index) * 2)
 
@@ -473,6 +476,64 @@ static void __iomem *iproc_pcie_map_ep_cfg_reg(struct 
iproc_pcie *pcie,
return (pcie->base + offset);
 }
 
+static unsigned int iproc_pcie_cfg_retry(void __iomem *cfg_data_p)
+{
+   int timeout = CFG_RETRY_STATUS_TIMEOUT_US;
+   unsigned int data;
+
+   /*
+* As per PCIe spec r3.1, sec 2.3.2, CRS Software
+* Visibility only affects config read of the Vendor ID.
+* For config write or any other config read the Root must
+* automatically re-issue configuration request again as a
+* new request.
+*
+* For config reads, this hardware returns CFG_RETRY_STATUS data when
+* it receives a CRS completion for a config read, regardless of the
+* address of the read or the CRS Software Visibility Enable bit. As a
+* partial workaround for this, we retry in software any read that
+* returns CFG_RETRY_STATUS.
+*/
+   data = readl(cfg_data_p);
+   while (data == CFG_RETRY_STATUS && timeout--) {
+   udelay(1);
+   data = readl(cfg_data_p);
+   }
+
+   if (data == CFG_RETRY_STATUS)
+   data = 0x;
+
+   return data;
+}
+
+static int iproc_pcie_config_read(struct pci_bus *bus, unsigned int devfn,
+   int where, int size, u32 *val)
+{
+   struct iproc_pcie *pcie = iproc_data(bus);
+   unsigned int slot = PCI_SLOT(devfn);
+   unsigned int fn = PCI_FUNC(devfn);
+   unsigned int busno = bus->number;
+   void __iomem *cfg_data_p;
+   unsigned int data;
+
+   /* root complex access. */
+   if (busno == 0)
+   return pci_generic_config_read32(bus, devfn, where, size, val);
+
+   cfg_data_p =

[PATCH v8 1/3] PCI: iproc: factor out ep configuration access

2017-08-23 Thread Oza Pawandeep

This patch factors out ep configuration access
as a separate function.

Signed-off-by: Oza Pawandeep 

diff --git a/drivers/pci/host/pcie-iproc.c b/drivers/pci/host/pcie-iproc.c
index c574863..61d9be6 100644
--- a/drivers/pci/host/pcie-iproc.c
+++ b/drivers/pci/host/pcie-iproc.c
@@ -448,6 +448,31 @@ static inline void iproc_pcie_apb_err_disable(struct 
pci_bus *bus,
}
 }
 
+static void __iomem *iproc_pcie_map_ep_cfg_reg(struct iproc_pcie *pcie,
+  unsigned int busno,
+  unsigned int slot,
+  unsigned int fn,
+  int where)
+{
+   u16 offset;
+   u32 val;
+
+   /* EP device access */
+   val = (busno << CFG_ADDR_BUS_NUM_SHIFT) |
+   (slot << CFG_ADDR_DEV_NUM_SHIFT) |
+   (fn << CFG_ADDR_FUNC_NUM_SHIFT) |
+   (where & CFG_ADDR_REG_NUM_MASK) |
+   (1 & CFG_ADDR_CFG_TYPE_MASK);
+
+   iproc_pcie_write_reg(pcie, IPROC_PCIE_CFG_ADDR, val);
+   offset = iproc_pcie_reg_offset(pcie, IPROC_PCIE_CFG_DATA);
+
+   if (iproc_pcie_reg_is_invalid(offset))
+   return NULL;
+
+   return (pcie->base + offset);
+}
+
 /**
  * Note access to the configuration registers are protected at the higher layer
  * by 'pci_lock' in drivers/pci/access.c
@@ -459,7 +484,6 @@ static void __iomem *iproc_pcie_map_cfg_bus(struct 
iproc_pcie *pcie,
 {
unsigned slot = PCI_SLOT(devfn);
unsigned fn = PCI_FUNC(devfn);
-   u32 val;
u16 offset;
 
/* root complex access */
@@ -484,18 +508,7 @@ static void __iomem *iproc_pcie_map_cfg_bus(struct 
iproc_pcie *pcie,
if (slot > 0)
return NULL;
 
-   /* EP device access */
-   val = (busno << CFG_ADDR_BUS_NUM_SHIFT) |
-   (slot << CFG_ADDR_DEV_NUM_SHIFT) |
-   (fn << CFG_ADDR_FUNC_NUM_SHIFT) |
-   (where & CFG_ADDR_REG_NUM_MASK) |
-   (1 & CFG_ADDR_CFG_TYPE_MASK);
-   iproc_pcie_write_reg(pcie, IPROC_PCIE_CFG_ADDR, val);
-   offset = iproc_pcie_reg_offset(pcie, IPROC_PCIE_CFG_DATA);
-   if (iproc_pcie_reg_is_invalid(offset))
-   return NULL;
-   else
-   return (pcie->base + offset);
+   return iproc_pcie_map_ep_cfg_reg(pcie, busno, slot, fn, where);
 }
 
 static void __iomem *iproc_pcie_bus_map_cfg_bus(struct pci_bus *bus,
-- 
1.9.1

[PATCH v8 2/3] PCI: iproc: retry request when CRS returned from EP

2017-08-23 Thread Oza Pawandeep

PCIe spec r3.1, sec 2.3.2
If CRS software visibility is not enabled, the RC must reissue the
config request as a new request.

- If CRS software visibility is enabled,
- for a config read of Vendor ID, the RC must return 0x0001 data
- for all other config reads/writes, the RC must reissue the
  request

iproc PCIe Controller spec:
4.7.3.3. Retry Status On Configuration Cycle
Endpoints are allowed to generate retry status on configuration
cycles. In this case, the RC needs to re-issue the request. The IP
does not handle this because the number of configuration cycles needed
will probably be less than the total number of non-posted operations
needed.

When a retry status is received on the User RX interface for a
configuration request that was sent on the User TX interface,
it will be indicated with a completion with the CMPL_STATUS field set
to 2=CRS, and the user will have to find the address and data values
and send a new transaction on the User TX interface.
When the internal configuration space returns a retry status during a
configuration cycle (user_cscfg = 1) on the Command/Status interface,
the pcie_cscrs will assert with the pcie_csack signal to indicate the
CRS status.
When the CRS Software Visibility Enable register in the Root Control
register is enabled, the IP will return the data value to 0x0001 for
the Vendor ID value and 0x  (all 1’s) for the rest of the data in
the request for reads of offset 0 that return with CRS status.  This
is true for both the User RX Interface and for the Command/Status
interface.  When CRS Software Visibility is enabled, the CMPL_STATUS
field of the completion on the User RX Interface will not be 2=CRS and
the pcie_cscrs signal will not assert on the Command/Status interface.

Per PCIe r3.1, sec 2.3.2, config requests that receive completions
with Configuration Request Retry Status (CRS) should be reissued by
the hardware except reads of the Vendor ID when CRS Software
Visibility is enabled.

This hardware never reissues configuration requests when it receives
CRS completions.
Note that, neither PCIe host bridge nor PCIe core re-issues the
request for any configuration offset.

For config reads, this hardware returns CFG_RETRY_STATUS data when
it receives a CRS completion for a config read, regardless of the
address of the read or the CRS Software Visibility Enable bit.

This patch implements iproc_pcie_config_read which gets called for
Stingray, if it receives a CRS completion, it retries reading it again.
In case of timeout, it returns 0x.
For other iproc based SOC, it falls back to PCI generic APIs.

Signed-off-by: Oza Pawandeep 

diff --git a/drivers/pci/host/pcie-iproc.c b/drivers/pci/host/pcie-iproc.c
index 61d9be6..37f4adf 100644
--- a/drivers/pci/host/pcie-iproc.c
+++ b/drivers/pci/host/pcie-iproc.c
@@ -68,6 +68,9 @@
 #define APB_ERR_EN_SHIFT 0
 #define APB_ERR_EN   BIT(APB_ERR_EN_SHIFT)
 
+#define CFG_RETRY_STATUS 0x0001
+#define CFG_RETRY_STATUS_TIMEOUT_US  50 /* 500 milli-seconds. */
+
 /* derive the enum index of the outbound/inbound mapping registers */
 #define MAP_REG(base_reg, index)  ((base_reg) + (index) * 2)
 
@@ -473,6 +476,64 @@ static void __iomem *iproc_pcie_map_ep_cfg_reg(struct 
iproc_pcie *pcie,
return (pcie->base + offset);
 }
 
+static unsigned int iproc_pcie_cfg_retry(void __iomem *cfg_data_p)
+{
+   int timeout = CFG_RETRY_STATUS_TIMEOUT_US;
+   unsigned int data;
+
+   /*
+* As per PCIe spec r3.1, sec 2.3.2, CRS Software
+* Visibility only affects config read of the Vendor ID.
+* For config write or any other config read the Root must
+* automatically re-issue configuration request again as a
+* new request.
+*
+* For config reads, this hardware returns CFG_RETRY_STATUS data when
+* it receives a CRS completion for a config read, regardless of the
+* address of the read or the CRS Software Visibility Enable bit. As a
+* partial workaround for this, we retry in software any read that
+* returns CFG_RETRY_STATUS.
+*/
+   data = readl(cfg_data_p);
+   while (data == CFG_RETRY_STATUS && timeout--) {
+   udelay(1);
+   data = readl(cfg_data_p);
+   }
+
+   if (data == CFG_RETRY_STATUS)
+   data = 0x;
+
+   return data;
+}
+
+static int iproc_pcie_config_read(struct pci_bus *bus, unsigned int devfn,
+   int where, int size, u32 *val)
+{
+   struct iproc_pcie *pcie = iproc_data(bus);
+   unsigned int slot = PCI_SLOT(devfn);
+   unsigned int fn = PCI_FUNC(devfn);
+   unsigned int busno = bus->number;
+   void __iomem *cfg_data_p;
+   unsigned int data;
+
+   /* root complex access. */
+   if (busno == 0)
+   return pci_generic_config_read32(bus, devfn, where, size, val);
+
+   cfg_data_p = iproc_pcie_map_ep_cfg_reg(pcie,

[PATCH v8 0/3] PCI: iproc: SOC specific fixes

2017-08-23 Thread Oza Pawandeep

PCI: iproc: Retry request when CRS returned from EP Above patch adds
support for CRS in PCI RC driver, otherwise if not handled at lower
level, the user space PMD (poll mode drivers) can timeout.

PCI: iproc: add device shutdown for PCI RC This fixes the issue where
certian PCI endpoints are not getting detected on Stingray SOC after
reboot.

Changes Since v7:
Factor out the ep config access code.

Changes Since v6:
Rebased patches on top of Lorenzo's patches.
Bjorn's comments addressed.
now the confg retry returns 0x as data.
Added reference to PCIe spec and iproc Controller spec in Changelog.

Changes Since v5:
Ray's comments addressed.

Changes Since v4:
Bjorn's comments addressed.

Changes Since v3:
[re-send]

Changes Since v2:
Fix compilation errors for pcie-iproc-platform.ko which was caught
by kbuild.

Oza Pawandeep (3):
  PCI: iproc: factor-out ep configuration access
  PCI: iproc: Retry request when CRS returned from EP
  PCI: iproc: add device shutdown for PCI RC

 drivers/pci/host/pcie-iproc-platform.c |   8 ++
 drivers/pci/host/pcie-iproc.c  | 143 ++---
 drivers/pci/host/pcie-iproc.h  |   1 +
 3 files changed, 124 insertions(+), 28 deletions(-)

-- 
1.9.1

[PATCH v8 0/3] PCI: iproc: SOC specific fixes

2017-08-23 Thread Oza Pawandeep

PCI: iproc: Retry request when CRS returned from EP Above patch adds
support for CRS in PCI RC driver, otherwise if not handled at lower
level, the user space PMD (poll mode drivers) can timeout.

PCI: iproc: add device shutdown for PCI RC This fixes the issue where
certian PCI endpoints are not getting detected on Stingray SOC after
reboot.

Changes Since v7:
Factor out the ep config access code.

Changes Since v6:
Rebased patches on top of Lorenzo's patches.
Bjorn's comments addressed.
now the confg retry returns 0x as data.
Added reference to PCIe spec and iproc Controller spec in Changelog.

Changes Since v5:
Ray's comments addressed.

Changes Since v4:
Bjorn's comments addressed.

Changes Since v3:
[re-send]

Changes Since v2:
Fix compilation errors for pcie-iproc-platform.ko which was caught
by kbuild.

Oza Pawandeep (3):
  PCI: iproc: factor-out ep configuration access
  PCI: iproc: Retry request when CRS returned from EP
  PCI: iproc: add device shutdown for PCI RC

 drivers/pci/host/pcie-iproc-platform.c |   8 ++
 drivers/pci/host/pcie-iproc.c  | 143 ++---
 drivers/pci/host/pcie-iproc.h  |   1 +
 3 files changed, 124 insertions(+), 28 deletions(-)

-- 
1.9.1

Re: [PATCH V1] pinctrl: qcom: spmi-gpio: Correct power_source range check

2017-08-23 Thread Bjorn Andersson

On Tue 18 Jul 23:39 PDT 2017, fengl...@codeaurora.org wrote:

> From: Fenglin Wu 
> 
> Power source selection in DIG_VIN_CTL is indexed from 0, in the range
> check it shouldn't be equal to the total number of power sources.
> 

Acked-by: Bjorn Andersson 

Regards,
Bjorn

> Signed-off-by: Fenglin Wu 
> ---
>  drivers/pinctrl/qcom/pinctrl-spmi-gpio.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c 
> b/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c
> index 664b641..8b77c04 100644
> --- a/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c
> +++ b/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c
> @@ -375,7 +375,7 @@ static int pmic_gpio_config_set(struct pinctrl_dev 
> *pctldev, unsigned int pin,
>   pad->is_enabled = false;
>   break;
>   case PIN_CONFIG_POWER_SOURCE:
> - if (arg > pad->num_sources)
> + if (arg >= pad->num_sources)
>   return -EINVAL;
>   pad->power_source = arg;
>   break;
> -- 
> Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project.
>

Re: [PATCH V1] pinctrl: qcom: spmi-gpio: Correct power_source range check

2017-08-23 Thread Bjorn Andersson

On Tue 18 Jul 23:39 PDT 2017, fengl...@codeaurora.org wrote:

> From: Fenglin Wu 
> 
> Power source selection in DIG_VIN_CTL is indexed from 0, in the range
> check it shouldn't be equal to the total number of power sources.
> 

Acked-by: Bjorn Andersson 

Regards,
Bjorn

> Signed-off-by: Fenglin Wu 
> ---
>  drivers/pinctrl/qcom/pinctrl-spmi-gpio.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c 
> b/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c
> index 664b641..8b77c04 100644
> --- a/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c
> +++ b/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c
> @@ -375,7 +375,7 @@ static int pmic_gpio_config_set(struct pinctrl_dev 
> *pctldev, unsigned int pin,
>   pad->is_enabled = false;
>   break;
>   case PIN_CONFIG_POWER_SOURCE:
> - if (arg > pad->num_sources)
> + if (arg >= pad->num_sources)
>   return -EINVAL;
>   pad->power_source = arg;
>   break;
> -- 
> Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project.
>

Re: Do we really need d_weak_revalidate???

2017-08-23 Thread Ian Kent

On 24/08/17 12:07, NeilBrown wrote:
> 
> 
> The more precise details, that automount action for indirect automount
> points is not triggered when the 'browse' option is used, is probably
> not necessary.
> 
> Ian: if you agree with that text, and Michael doesn't provide alternate
> evidence, I'll submit a formal patch for the man page or should we
> just wait until the patch actually lands?

So far only David commented about using ENOENT rather than EREMOTE.

I prefer ENOENT for this case myself and he didn't object when I
explained why, David, any concerns?

Al has been silent so far so either he hasn't seen it or he's ok with
it, Al, any concerns?

And I guess if there are no concerns there's a good chance Andrew is
ok with it for the next merge window, Andrew?

If everyone agrees then we could go ahead immediately so there's a
better chance of getting it into released man pages closer to the
change being merged.

Ian

Re: Do we really need d_weak_revalidate???

2017-08-23 Thread Ian Kent

On 24/08/17 12:07, NeilBrown wrote:
> 
> 
> The more precise details, that automount action for indirect automount
> points is not triggered when the 'browse' option is used, is probably
> not necessary.
> 
> Ian: if you agree with that text, and Michael doesn't provide alternate
> evidence, I'll submit a formal patch for the man page or should we
> just wait until the patch actually lands?

So far only David commented about using ENOENT rather than EREMOTE.

I prefer ENOENT for this case myself and he didn't object when I
explained why, David, any concerns?

Al has been silent so far so either he hasn't seen it or he's ok with
it, Al, any concerns?

And I guess if there are no concerns there's a good chance Andrew is
ok with it for the next merge window, Andrew?

If everyone agrees then we could go ahead immediately so there's a
better chance of getting it into released man pages closer to the
change being merged.

Ian

[RESENT PATCH v7 5/7] ASoC: rockchip: Add support for DP codec

2017-08-23 Thread Jeffy Chen

Add support for optional cdn dp codec.

Signed-off-by: Jeffy Chen 
---

Changes in v7: None
Changes in v6: None
Changes in v3: None
Changes in v2: None

 sound/soc/rockchip/Kconfig|  1 +
 sound/soc/rockchip/rk3399_gru_sound.c | 59 +--
 2 files changed, 58 insertions(+), 2 deletions(-)

diff --git a/sound/soc/rockchip/Kconfig b/sound/soc/rockchip/Kconfig
index c84487805876..8f0d0d8d34e6 100644
--- a/sound/soc/rockchip/Kconfig
+++ b/sound/soc/rockchip/Kconfig
@@ -68,6 +68,7 @@ config SND_SOC_RK3399_GRU_SOUND
select SND_SOC_RT5514
select SND_SOC_DA7219
select SND_SOC_RT5514_SPI
+   select SND_SOC_HDMI_CODEC
help
  Say Y or M here if you want to add support multiple codecs for SoC
  audio on Rockchip RK3399 GRU boards.
diff --git a/sound/soc/rockchip/rk3399_gru_sound.c 
b/sound/soc/rockchip/rk3399_gru_sound.c
index 347c34d0db4c..91aab5c18f3b 100644
--- a/sound/soc/rockchip/rk3399_gru_sound.c
+++ b/sound/soc/rockchip/rk3399_gru_sound.c
@@ -228,6 +228,45 @@ static int rockchip_sound_da7219_init(struct 
snd_soc_pcm_runtime *rtd)
return 0;
 }
 
+static int rockchip_sound_cdndp_hw_params(struct snd_pcm_substream *substream,
+ struct snd_pcm_hw_params *params)
+{
+   struct snd_soc_pcm_runtime *rtd = substream->private_data;
+   struct snd_soc_dai *cpu_dai = rtd->cpu_dai;
+   struct snd_soc_dai *codec_dai = rtd->codec_dai;
+   int mclk, ret;
+
+   /* in bypass mode, the mclk has to be one of the frequencies below */
+   switch (params_rate(params)) {
+   case 8000:
+   case 16000:
+   case 24000:
+   case 32000:
+   case 48000:
+   case 64000:
+   case 96000:
+   mclk = 12288000;
+   break;
+   case 11025:
+   case 22050:
+   case 44100:
+   case 88200:
+   mclk = 11289600;
+   break;
+   default:
+   return -EINVAL;
+   }
+
+   ret = snd_soc_dai_set_sysclk(cpu_dai, 0, mclk,
+SND_SOC_CLOCK_OUT);
+   if (ret < 0) {
+   dev_err(codec_dai->dev, "Can't set cpu clock out %d\n", ret);
+   return ret;
+   }
+
+   return 0;
+}
+
 static const struct snd_soc_ops rockchip_sound_max98357a_ops = {
.hw_params = rockchip_sound_max98357a_hw_params,
 };
@@ -240,6 +279,10 @@ static const struct snd_soc_ops rockchip_sound_da7219_ops 
= {
.hw_params = rockchip_sound_da7219_hw_params,
 };
 
+static struct snd_soc_ops rockchip_sound_cdndp_ops = {
+   .hw_params = rockchip_sound_cdndp_hw_params,
+};
+
 static struct snd_soc_card rockchip_sound_card = {
.name = "rk3399-gru-sound",
.owner = THIS_MODULE,
@@ -252,6 +295,7 @@ static struct snd_soc_card rockchip_sound_card = {
 };
 
 enum {
+   DAILINK_CDNDP,
DAILINK_DA7219,
DAILINK_MAX98357A,
DAILINK_RT5514,
@@ -259,6 +303,7 @@ enum {
 };
 
 static const char * const dailink_compat[] = {
+   [DAILINK_CDNDP] = "rockchip,rk3399-cdn-dp",
[DAILINK_DA7219] = "dlg,da7219",
[DAILINK_MAX98357A] = "maxim,max98357a",
[DAILINK_RT5514] = "realtek,rt5514-i2c",
@@ -266,6 +311,14 @@ static const char * const dailink_compat[] = {
 };
 
 static const struct snd_soc_dai_link rockchip_dais[] = {
+   [DAILINK_CDNDP] = {
+   .name = "DP",
+   .stream_name = "DP PCM",
+   .codec_dai_name = "i2s-hifi",
+   .ops = _sound_cdndp_ops,
+   .dai_fmt = SND_SOC_DAIFMT_I2S | SND_SOC_DAIFMT_NB_NF |
+   SND_SOC_DAIFMT_CBS_CFS,
+   },
[DAILINK_DA7219] = {
.name = "DA7219",
.stream_name = "DA7219 PCM",
@@ -316,7 +369,7 @@ static int rockchip_sound_codec_node_match(struct 
device_node *np_codec)
 static int rockchip_sound_of_parse_dais(struct device *dev,
struct snd_soc_card *card)
 {
-   struct device_node *np_cpu;
+   struct device_node *np_cpu, *np_cpu0, *np_cpu1;
struct device_node *np_codec;
struct snd_soc_dai_link *dai;
int i, index;
@@ -326,7 +379,8 @@ static int rockchip_sound_of_parse_dais(struct device *dev,
if (!card->dai_link)
return -ENOMEM;
 
-   np_cpu = of_parse_phandle(dev->of_node, "rockchip,cpu", 0);
+   np_cpu0 = of_parse_phandle(dev->of_node, "rockchip,cpu", 0);
+   np_cpu1 = of_parse_phandle(dev->of_node, "rockchip,cpu", 1);
 
card->num_links = 0;
for (i = 0; i < ARRAY_SIZE(rockchip_dais); i++) {
@@ -342,6 +396,7 @@ static int rockchip_sound_of_parse_dais(struct device *dev,
if (index < 0)
continue;
 
+   np_cpu = (index == DAILINK_CDNDP) ? np_cpu1 : np_cpu0;
if (!np_cpu) {
dev_err(dev, "Missing

[RESENT PATCH v7 5/7] ASoC: rockchip: Add support for DP codec

2017-08-23 Thread Jeffy Chen

Add support for optional cdn dp codec.

Signed-off-by: Jeffy Chen 
---

Changes in v7: None
Changes in v6: None
Changes in v3: None
Changes in v2: None

 sound/soc/rockchip/Kconfig|  1 +
 sound/soc/rockchip/rk3399_gru_sound.c | 59 +--
 2 files changed, 58 insertions(+), 2 deletions(-)

diff --git a/sound/soc/rockchip/Kconfig b/sound/soc/rockchip/Kconfig
index c84487805876..8f0d0d8d34e6 100644
--- a/sound/soc/rockchip/Kconfig
+++ b/sound/soc/rockchip/Kconfig
@@ -68,6 +68,7 @@ config SND_SOC_RK3399_GRU_SOUND
select SND_SOC_RT5514
select SND_SOC_DA7219
select SND_SOC_RT5514_SPI
+   select SND_SOC_HDMI_CODEC
help
  Say Y or M here if you want to add support multiple codecs for SoC
  audio on Rockchip RK3399 GRU boards.
diff --git a/sound/soc/rockchip/rk3399_gru_sound.c 
b/sound/soc/rockchip/rk3399_gru_sound.c
index 347c34d0db4c..91aab5c18f3b 100644
--- a/sound/soc/rockchip/rk3399_gru_sound.c
+++ b/sound/soc/rockchip/rk3399_gru_sound.c
@@ -228,6 +228,45 @@ static int rockchip_sound_da7219_init(struct 
snd_soc_pcm_runtime *rtd)
return 0;
 }
 
+static int rockchip_sound_cdndp_hw_params(struct snd_pcm_substream *substream,
+ struct snd_pcm_hw_params *params)
+{
+   struct snd_soc_pcm_runtime *rtd = substream->private_data;
+   struct snd_soc_dai *cpu_dai = rtd->cpu_dai;
+   struct snd_soc_dai *codec_dai = rtd->codec_dai;
+   int mclk, ret;
+
+   /* in bypass mode, the mclk has to be one of the frequencies below */
+   switch (params_rate(params)) {
+   case 8000:
+   case 16000:
+   case 24000:
+   case 32000:
+   case 48000:
+   case 64000:
+   case 96000:
+   mclk = 12288000;
+   break;
+   case 11025:
+   case 22050:
+   case 44100:
+   case 88200:
+   mclk = 11289600;
+   break;
+   default:
+   return -EINVAL;
+   }
+
+   ret = snd_soc_dai_set_sysclk(cpu_dai, 0, mclk,
+SND_SOC_CLOCK_OUT);
+   if (ret < 0) {
+   dev_err(codec_dai->dev, "Can't set cpu clock out %d\n", ret);
+   return ret;
+   }
+
+   return 0;
+}
+
 static const struct snd_soc_ops rockchip_sound_max98357a_ops = {
.hw_params = rockchip_sound_max98357a_hw_params,
 };
@@ -240,6 +279,10 @@ static const struct snd_soc_ops rockchip_sound_da7219_ops 
= {
.hw_params = rockchip_sound_da7219_hw_params,
 };
 
+static struct snd_soc_ops rockchip_sound_cdndp_ops = {
+   .hw_params = rockchip_sound_cdndp_hw_params,
+};
+
 static struct snd_soc_card rockchip_sound_card = {
.name = "rk3399-gru-sound",
.owner = THIS_MODULE,
@@ -252,6 +295,7 @@ static struct snd_soc_card rockchip_sound_card = {
 };
 
 enum {
+   DAILINK_CDNDP,
DAILINK_DA7219,
DAILINK_MAX98357A,
DAILINK_RT5514,
@@ -259,6 +303,7 @@ enum {
 };
 
 static const char * const dailink_compat[] = {
+   [DAILINK_CDNDP] = "rockchip,rk3399-cdn-dp",
[DAILINK_DA7219] = "dlg,da7219",
[DAILINK_MAX98357A] = "maxim,max98357a",
[DAILINK_RT5514] = "realtek,rt5514-i2c",
@@ -266,6 +311,14 @@ static const char * const dailink_compat[] = {
 };
 
 static const struct snd_soc_dai_link rockchip_dais[] = {
+   [DAILINK_CDNDP] = {
+   .name = "DP",
+   .stream_name = "DP PCM",
+   .codec_dai_name = "i2s-hifi",
+   .ops = _sound_cdndp_ops,
+   .dai_fmt = SND_SOC_DAIFMT_I2S | SND_SOC_DAIFMT_NB_NF |
+   SND_SOC_DAIFMT_CBS_CFS,
+   },
[DAILINK_DA7219] = {
.name = "DA7219",
.stream_name = "DA7219 PCM",
@@ -316,7 +369,7 @@ static int rockchip_sound_codec_node_match(struct 
device_node *np_codec)
 static int rockchip_sound_of_parse_dais(struct device *dev,
struct snd_soc_card *card)
 {
-   struct device_node *np_cpu;
+   struct device_node *np_cpu, *np_cpu0, *np_cpu1;
struct device_node *np_codec;
struct snd_soc_dai_link *dai;
int i, index;
@@ -326,7 +379,8 @@ static int rockchip_sound_of_parse_dais(struct device *dev,
if (!card->dai_link)
return -ENOMEM;
 
-   np_cpu = of_parse_phandle(dev->of_node, "rockchip,cpu", 0);
+   np_cpu0 = of_parse_phandle(dev->of_node, "rockchip,cpu", 0);
+   np_cpu1 = of_parse_phandle(dev->of_node, "rockchip,cpu", 1);
 
card->num_links = 0;
for (i = 0; i < ARRAY_SIZE(rockchip_dais); i++) {
@@ -342,6 +396,7 @@ static int rockchip_sound_of_parse_dais(struct device *dev,
if (index < 0)
continue;
 
+   np_cpu = (index == DAILINK_CDNDP) ? np_cpu1 : np_cpu0;
if (!np_cpu) {
dev_err(dev, "Missing 'rockchip,cpu' for %s\n",

[RESENT PATCH v7 7/7] dt-bindings: ASoC: rockchip: Update description of rockchip,codec

2017-08-23 Thread Jeffy Chen

Update description for newly added optional audio codecs.

Signed-off-by: Jeffy Chen 
Acked-by: Rob Herring 
---

Changes in v7: None
Changes in v6: None
Changes in v3: None
Changes in v2: None

 Documentation/devicetree/bindings/sound/rockchip,rk3399-gru-sound.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git 
a/Documentation/devicetree/bindings/sound/rockchip,rk3399-gru-sound.txt 
b/Documentation/devicetree/bindings/sound/rockchip,rk3399-gru-sound.txt
index eac91db07178..72d3cf4c2606 100644
--- a/Documentation/devicetree/bindings/sound/rockchip,rk3399-gru-sound.txt
+++ b/Documentation/devicetree/bindings/sound/rockchip,rk3399-gru-sound.txt
@@ -4,7 +4,7 @@ Required properties:
 - compatible: "rockchip,rk3399-gru-sound"
 - rockchip,cpu: The phandle of the Rockchip I2S controller that's
   connected to the codecs
-- rockchip,codec: The phandle of the MAX98357A/RT5514/DA7219 codecs
+- rockchip,codec: The phandle of the audio codecs
 
 Optional properties:
 - dmic-wakeup-delay-ms : specify delay time (ms) for DMIC ready.
-- 
2.11.0

[RESENT PATCH v7 4/7] ASoC: rockchip: Parse dai links from dts

2017-08-23 Thread Jeffy Chen

Refactor rockchip_sound_probe, parse dai links from dts instead of
hard coding them.

Signed-off-by: Jeffy Chen 
Reviewed-by: Matthias Kaehlcke 
Tested-by: Matthias Kaehlcke 
---

Changes in v7: None
Changes in v6: None
Changes in v3:
Use compatible to match audio codecs
-- Suggested-by Matthias Kaehlcke 

Changes in v2:
Let rockchip,codec-names be a required property, because we plan to
add more supported codecs to the fixed dai link list in the driver.

 sound/soc/rockchip/rk3399_gru_sound.c | 140 ++
 1 file changed, 91 insertions(+), 49 deletions(-)

diff --git a/sound/soc/rockchip/rk3399_gru_sound.c 
b/sound/soc/rockchip/rk3399_gru_sound.c
index cd2fdba922f1..347c34d0db4c 100644
--- a/sound/soc/rockchip/rk3399_gru_sound.c
+++ b/sound/soc/rockchip/rk3399_gru_sound.c
@@ -240,14 +240,42 @@ static const struct snd_soc_ops rockchip_sound_da7219_ops 
= {
.hw_params = rockchip_sound_da7219_hw_params,
 };
 
+static struct snd_soc_card rockchip_sound_card = {
+   .name = "rk3399-gru-sound",
+   .owner = THIS_MODULE,
+   .dapm_widgets = rockchip_dapm_widgets,
+   .num_dapm_widgets = ARRAY_SIZE(rockchip_dapm_widgets),
+   .dapm_routes = rockchip_dapm_routes,
+   .num_dapm_routes = ARRAY_SIZE(rockchip_dapm_routes),
+   .controls = rockchip_controls,
+   .num_controls = ARRAY_SIZE(rockchip_controls),
+};
+
 enum {
+   DAILINK_DA7219,
DAILINK_MAX98357A,
DAILINK_RT5514,
-   DAILINK_DA7219,
DAILINK_RT5514_DSP,
 };
 
-static struct snd_soc_dai_link rockchip_dailinks[] = {
+static const char * const dailink_compat[] = {
+   [DAILINK_DA7219] = "dlg,da7219",
+   [DAILINK_MAX98357A] = "maxim,max98357a",
+   [DAILINK_RT5514] = "realtek,rt5514-i2c",
+   [DAILINK_RT5514_DSP] = "realtek,rt5514-spi",
+};
+
+static const struct snd_soc_dai_link rockchip_dais[] = {
+   [DAILINK_DA7219] = {
+   .name = "DA7219",
+   .stream_name = "DA7219 PCM",
+   .codec_dai_name = "da7219-hifi",
+   .init = rockchip_sound_da7219_init,
+   .ops = _sound_da7219_ops,
+   /* set da7219 as slave */
+   .dai_fmt = SND_SOC_DAIFMT_I2S | SND_SOC_DAIFMT_NB_NF |
+   SND_SOC_DAIFMT_CBS_CFS,
+   },
[DAILINK_MAX98357A] = {
.name = "MAX98357A",
.stream_name = "MAX98357A PCM",
@@ -266,16 +294,6 @@ static struct snd_soc_dai_link rockchip_dailinks[] = {
.dai_fmt = SND_SOC_DAIFMT_I2S | SND_SOC_DAIFMT_NB_NF |
SND_SOC_DAIFMT_CBS_CFS,
},
-   [DAILINK_DA7219] = {
-   .name = "DA7219",
-   .stream_name = "DA7219 PCM",
-   .codec_dai_name = "da7219-hifi",
-   .init = rockchip_sound_da7219_init,
-   .ops = _sound_da7219_ops,
-   /* set da7219 as slave */
-   .dai_fmt = SND_SOC_DAIFMT_I2S | SND_SOC_DAIFMT_NB_NF |
-   SND_SOC_DAIFMT_CBS_CFS,
-   },
/* RT5514 DSP for voice wakeup via spi bus */
[DAILINK_RT5514_DSP] = {
.name = "RT5514 DSP",
@@ -284,42 +302,72 @@ static struct snd_soc_dai_link rockchip_dailinks[] = {
},
 };
 
-static struct snd_soc_card rockchip_sound_card = {
-   .name = "rk3399-gru-sound",
-   .owner = THIS_MODULE,
-   .dai_link = rockchip_dailinks,
-   .num_links =  ARRAY_SIZE(rockchip_dailinks),
-   .dapm_widgets = rockchip_dapm_widgets,
-   .num_dapm_widgets = ARRAY_SIZE(rockchip_dapm_widgets),
-   .dapm_routes = rockchip_dapm_routes,
-   .num_dapm_routes = ARRAY_SIZE(rockchip_dapm_routes),
-   .controls = rockchip_controls,
-   .num_controls = ARRAY_SIZE(rockchip_controls),
-};
-
-static int rockchip_sound_probe(struct platform_device *pdev)
+static int rockchip_sound_codec_node_match(struct device_node *np_codec)
 {
-   struct snd_soc_card *card = _sound_card;
-   struct device_node *cpu_node;
-   int i, ret;
+   int i;
 
-   cpu_node = of_parse_phandle(pdev->dev.of_node, "rockchip,cpu", 0);
-   if (!cpu_node) {
-   dev_err(>dev, "Property 'rockchip,cpu' missing or 
invalid\n");
-   return -EINVAL;
+   for (i = 0; i < ARRAY_SIZE(dailink_compat); i++) {
+   if (of_device_is_compatible(np_codec, dailink_compat[i]))
+   return i;
}
+   return -1;
+}
 
-   for (i = 0; i < ARRAY_SIZE(rockchip_dailinks); i++) {
-   rockchip_dailinks[i].platform_of_node = cpu_node;
-   rockchip_dailinks[i].cpu_of_node = cpu_node;
-
-   rockchip_dailinks[i].codec_of_node =
-   of_parse_phandle(pdev->dev.of_node, "rockchip,codec", 
i);
-   if (!rockchip_dailinks[i].codec_of_node) {
-   dev_err(>dev,

[RESENT PATCH v7 7/7] dt-bindings: ASoC: rockchip: Update description of rockchip,codec

2017-08-23 Thread Jeffy Chen

Update description for newly added optional audio codecs.

Signed-off-by: Jeffy Chen 
Acked-by: Rob Herring 
---

Changes in v7: None
Changes in v6: None
Changes in v3: None
Changes in v2: None

 Documentation/devicetree/bindings/sound/rockchip,rk3399-gru-sound.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git 
a/Documentation/devicetree/bindings/sound/rockchip,rk3399-gru-sound.txt 
b/Documentation/devicetree/bindings/sound/rockchip,rk3399-gru-sound.txt
index eac91db07178..72d3cf4c2606 100644
--- a/Documentation/devicetree/bindings/sound/rockchip,rk3399-gru-sound.txt
+++ b/Documentation/devicetree/bindings/sound/rockchip,rk3399-gru-sound.txt
@@ -4,7 +4,7 @@ Required properties:
 - compatible: "rockchip,rk3399-gru-sound"
 - rockchip,cpu: The phandle of the Rockchip I2S controller that's
   connected to the codecs
-- rockchip,codec: The phandle of the MAX98357A/RT5514/DA7219 codecs
+- rockchip,codec: The phandle of the audio codecs
 
 Optional properties:
 - dmic-wakeup-delay-ms : specify delay time (ms) for DMIC ready.
-- 
2.11.0

[RESENT PATCH v7 4/7] ASoC: rockchip: Parse dai links from dts

2017-08-23 Thread Jeffy Chen

Refactor rockchip_sound_probe, parse dai links from dts instead of
hard coding them.

Signed-off-by: Jeffy Chen 
Reviewed-by: Matthias Kaehlcke 
Tested-by: Matthias Kaehlcke 
---

Changes in v7: None
Changes in v6: None
Changes in v3:
Use compatible to match audio codecs
-- Suggested-by Matthias Kaehlcke 

Changes in v2:
Let rockchip,codec-names be a required property, because we plan to
add more supported codecs to the fixed dai link list in the driver.

 sound/soc/rockchip/rk3399_gru_sound.c | 140 ++
 1 file changed, 91 insertions(+), 49 deletions(-)

diff --git a/sound/soc/rockchip/rk3399_gru_sound.c 
b/sound/soc/rockchip/rk3399_gru_sound.c
index cd2fdba922f1..347c34d0db4c 100644
--- a/sound/soc/rockchip/rk3399_gru_sound.c
+++ b/sound/soc/rockchip/rk3399_gru_sound.c
@@ -240,14 +240,42 @@ static const struct snd_soc_ops rockchip_sound_da7219_ops 
= {
.hw_params = rockchip_sound_da7219_hw_params,
 };
 
+static struct snd_soc_card rockchip_sound_card = {
+   .name = "rk3399-gru-sound",
+   .owner = THIS_MODULE,
+   .dapm_widgets = rockchip_dapm_widgets,
+   .num_dapm_widgets = ARRAY_SIZE(rockchip_dapm_widgets),
+   .dapm_routes = rockchip_dapm_routes,
+   .num_dapm_routes = ARRAY_SIZE(rockchip_dapm_routes),
+   .controls = rockchip_controls,
+   .num_controls = ARRAY_SIZE(rockchip_controls),
+};
+
 enum {
+   DAILINK_DA7219,
DAILINK_MAX98357A,
DAILINK_RT5514,
-   DAILINK_DA7219,
DAILINK_RT5514_DSP,
 };
 
-static struct snd_soc_dai_link rockchip_dailinks[] = {
+static const char * const dailink_compat[] = {
+   [DAILINK_DA7219] = "dlg,da7219",
+   [DAILINK_MAX98357A] = "maxim,max98357a",
+   [DAILINK_RT5514] = "realtek,rt5514-i2c",
+   [DAILINK_RT5514_DSP] = "realtek,rt5514-spi",
+};
+
+static const struct snd_soc_dai_link rockchip_dais[] = {
+   [DAILINK_DA7219] = {
+   .name = "DA7219",
+   .stream_name = "DA7219 PCM",
+   .codec_dai_name = "da7219-hifi",
+   .init = rockchip_sound_da7219_init,
+   .ops = _sound_da7219_ops,
+   /* set da7219 as slave */
+   .dai_fmt = SND_SOC_DAIFMT_I2S | SND_SOC_DAIFMT_NB_NF |
+   SND_SOC_DAIFMT_CBS_CFS,
+   },
[DAILINK_MAX98357A] = {
.name = "MAX98357A",
.stream_name = "MAX98357A PCM",
@@ -266,16 +294,6 @@ static struct snd_soc_dai_link rockchip_dailinks[] = {
.dai_fmt = SND_SOC_DAIFMT_I2S | SND_SOC_DAIFMT_NB_NF |
SND_SOC_DAIFMT_CBS_CFS,
},
-   [DAILINK_DA7219] = {
-   .name = "DA7219",
-   .stream_name = "DA7219 PCM",
-   .codec_dai_name = "da7219-hifi",
-   .init = rockchip_sound_da7219_init,
-   .ops = _sound_da7219_ops,
-   /* set da7219 as slave */
-   .dai_fmt = SND_SOC_DAIFMT_I2S | SND_SOC_DAIFMT_NB_NF |
-   SND_SOC_DAIFMT_CBS_CFS,
-   },
/* RT5514 DSP for voice wakeup via spi bus */
[DAILINK_RT5514_DSP] = {
.name = "RT5514 DSP",
@@ -284,42 +302,72 @@ static struct snd_soc_dai_link rockchip_dailinks[] = {
},
 };
 
-static struct snd_soc_card rockchip_sound_card = {
-   .name = "rk3399-gru-sound",
-   .owner = THIS_MODULE,
-   .dai_link = rockchip_dailinks,
-   .num_links =  ARRAY_SIZE(rockchip_dailinks),
-   .dapm_widgets = rockchip_dapm_widgets,
-   .num_dapm_widgets = ARRAY_SIZE(rockchip_dapm_widgets),
-   .dapm_routes = rockchip_dapm_routes,
-   .num_dapm_routes = ARRAY_SIZE(rockchip_dapm_routes),
-   .controls = rockchip_controls,
-   .num_controls = ARRAY_SIZE(rockchip_controls),
-};
-
-static int rockchip_sound_probe(struct platform_device *pdev)
+static int rockchip_sound_codec_node_match(struct device_node *np_codec)
 {
-   struct snd_soc_card *card = _sound_card;
-   struct device_node *cpu_node;
-   int i, ret;
+   int i;
 
-   cpu_node = of_parse_phandle(pdev->dev.of_node, "rockchip,cpu", 0);
-   if (!cpu_node) {
-   dev_err(>dev, "Property 'rockchip,cpu' missing or 
invalid\n");
-   return -EINVAL;
+   for (i = 0; i < ARRAY_SIZE(dailink_compat); i++) {
+   if (of_device_is_compatible(np_codec, dailink_compat[i]))
+   return i;
}
+   return -1;
+}
 
-   for (i = 0; i < ARRAY_SIZE(rockchip_dailinks); i++) {
-   rockchip_dailinks[i].platform_of_node = cpu_node;
-   rockchip_dailinks[i].cpu_of_node = cpu_node;
-
-   rockchip_dailinks[i].codec_of_node =
-   of_parse_phandle(pdev->dev.of_node, "rockchip,codec", 
i);
-   if (!rockchip_dailinks[i].codec_of_node) {
-   dev_err(>dev,
-   "Property[%d] 'rockchip,codec' missing or

[RESENT PATCH v7 0/7] ASoC: rockchip: Parse dai links from dts

2017-08-23 Thread Jeffy Chen


Currently we are using a fixed list of dai links in the driver.
This serial of patches would let the driver parse dai links from
dts, so that we can make some of them optional for future boards.

Tested on my chromebook bob(with cros 4.4 kernel), it still works
after disabled rt5514 codecs in the dts.


Changes in v7:
Rebase on the newest for-next

Changes in v6:
Add dmic wakeup delay(not used for now).

Changes in v3:
Use compatible to match audio codecs
-- Suggested-by Matthias Kaehlcke 

Changes in v2:
Let rockchip,codec-names be a required property, because we plan to
add more supported codecs to the fixed dai link list in the driver.

Jeffy Chen (7):
  ASoC: rockchip: Use codec of_node and dai_name for rt5514 dsp
  arm64: dts: rockchip: Add rt5514 dsp for Gru
  arm64: dts: rockchip: Update rt5514 devices' compatible for Gru
  ASoC: rockchip: Parse dai links from dts
  ASoC: rockchip: Add support for DP codec
  ASoC: rockchip: Add support for DMIC codec
  dt-bindings: ASoC: rockchip: Update description of rockchip,codec

 .../bindings/sound/rockchip,rk3399-gru-sound.txt   |   2 +-
 arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi   |   7 +-
 sound/soc/rockchip/Kconfig |   2 +
 sound/soc/rockchip/rk3399_gru_sound.c  | 255 +++--
 4 files changed, 186 insertions(+), 80 deletions(-)

-- 
2.11.0

[RESENT PATCH v7 6/7] ASoC: rockchip: Add support for DMIC codec

2017-08-23 Thread Jeffy Chen

Add support for optional dmic codec.

Signed-off-by: Jeffy Chen 
---

Changes in v7: None
Changes in v6:
Add dmic wakeup delay(not used for now).

Changes in v3: None
Changes in v2: None

 sound/soc/rockchip/Kconfig|  1 +
 sound/soc/rockchip/rk3399_gru_sound.c | 36 +++
 2 files changed, 37 insertions(+)

diff --git a/sound/soc/rockchip/Kconfig b/sound/soc/rockchip/Kconfig
index 8f0d0d8d34e6..b0825370d262 100644
--- a/sound/soc/rockchip/Kconfig
+++ b/sound/soc/rockchip/Kconfig
@@ -69,6 +69,7 @@ config SND_SOC_RK3399_GRU_SOUND
select SND_SOC_DA7219
select SND_SOC_RT5514_SPI
select SND_SOC_HDMI_CODEC
+   select SND_SOC_DMIC
help
  Say Y or M here if you want to add support multiple codecs for SoC
  audio on Rockchip RK3399 GRU boards.
diff --git a/sound/soc/rockchip/rk3399_gru_sound.c 
b/sound/soc/rockchip/rk3399_gru_sound.c
index 91aab5c18f3b..5ab25962cabd 100644
--- a/sound/soc/rockchip/rk3399_gru_sound.c
+++ b/sound/soc/rockchip/rk3399_gru_sound.c
@@ -267,6 +267,28 @@ static int rockchip_sound_cdndp_hw_params(struct 
snd_pcm_substream *substream,
return 0;
 }
 
+static int rockchip_sound_dmic_hw_params(struct snd_pcm_substream *substream,
+struct snd_pcm_hw_params *params)
+{
+   struct snd_soc_pcm_runtime *rtd = substream->private_data;
+   unsigned int mclk;
+   int ret;
+
+   mclk = params_rate(params) * SOUND_FS;
+
+   ret = snd_soc_dai_set_sysclk(rtd->cpu_dai, 0, mclk, 0);
+   if (ret) {
+   dev_err(rtd->card->dev, "%s() error setting sysclk to %u: %d\n",
+   __func__, mclk, ret);
+   return ret;
+   }
+
+   /* Wait for DMIC stable */
+   msleep(dmic_wakeup_delay);
+
+   return 0;
+}
+
 static const struct snd_soc_ops rockchip_sound_max98357a_ops = {
.hw_params = rockchip_sound_max98357a_hw_params,
 };
@@ -283,6 +305,10 @@ static struct snd_soc_ops rockchip_sound_cdndp_ops = {
.hw_params = rockchip_sound_cdndp_hw_params,
 };
 
+static struct snd_soc_ops rockchip_sound_dmic_ops = {
+   .hw_params = rockchip_sound_dmic_hw_params,
+};
+
 static struct snd_soc_card rockchip_sound_card = {
.name = "rk3399-gru-sound",
.owner = THIS_MODULE,
@@ -297,6 +323,7 @@ static struct snd_soc_card rockchip_sound_card = {
 enum {
DAILINK_CDNDP,
DAILINK_DA7219,
+   DAILINK_DMIC,
DAILINK_MAX98357A,
DAILINK_RT5514,
DAILINK_RT5514_DSP,
@@ -305,6 +332,7 @@ enum {
 static const char * const dailink_compat[] = {
[DAILINK_CDNDP] = "rockchip,rk3399-cdn-dp",
[DAILINK_DA7219] = "dlg,da7219",
+   [DAILINK_DMIC] = "dmic-codec",
[DAILINK_MAX98357A] = "maxim,max98357a",
[DAILINK_RT5514] = "realtek,rt5514-i2c",
[DAILINK_RT5514_DSP] = "realtek,rt5514-spi",
@@ -329,6 +357,14 @@ static const struct snd_soc_dai_link rockchip_dais[] = {
.dai_fmt = SND_SOC_DAIFMT_I2S | SND_SOC_DAIFMT_NB_NF |
SND_SOC_DAIFMT_CBS_CFS,
},
+   [DAILINK_DMIC] = {
+   .name = "DMIC",
+   .stream_name = "DMIC PCM",
+   .codec_dai_name = "dmic-hifi",
+   .ops = _sound_dmic_ops,
+   .dai_fmt = SND_SOC_DAIFMT_I2S | SND_SOC_DAIFMT_NB_NF |
+   SND_SOC_DAIFMT_CBS_CFS,
+   },
[DAILINK_MAX98357A] = {
.name = "MAX98357A",
.stream_name = "MAX98357A PCM",
-- 
2.11.0

[RESENT PATCH v7 0/7] ASoC: rockchip: Parse dai links from dts

2017-08-23 Thread Jeffy Chen


Currently we are using a fixed list of dai links in the driver.
This serial of patches would let the driver parse dai links from
dts, so that we can make some of them optional for future boards.

Tested on my chromebook bob(with cros 4.4 kernel), it still works
after disabled rt5514 codecs in the dts.


Changes in v7:
Rebase on the newest for-next

Changes in v6:
Add dmic wakeup delay(not used for now).

Changes in v3:
Use compatible to match audio codecs
-- Suggested-by Matthias Kaehlcke 

Changes in v2:
Let rockchip,codec-names be a required property, because we plan to
add more supported codecs to the fixed dai link list in the driver.

Jeffy Chen (7):
  ASoC: rockchip: Use codec of_node and dai_name for rt5514 dsp
  arm64: dts: rockchip: Add rt5514 dsp for Gru
  arm64: dts: rockchip: Update rt5514 devices' compatible for Gru
  ASoC: rockchip: Parse dai links from dts
  ASoC: rockchip: Add support for DP codec
  ASoC: rockchip: Add support for DMIC codec
  dt-bindings: ASoC: rockchip: Update description of rockchip,codec

 .../bindings/sound/rockchip,rk3399-gru-sound.txt   |   2 +-
 arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi   |   7 +-
 sound/soc/rockchip/Kconfig |   2 +
 sound/soc/rockchip/rk3399_gru_sound.c  | 255 +++--
 4 files changed, 186 insertions(+), 80 deletions(-)

-- 
2.11.0

[RESENT PATCH v7 6/7] ASoC: rockchip: Add support for DMIC codec

2017-08-23 Thread Jeffy Chen

Add support for optional dmic codec.

Signed-off-by: Jeffy Chen 
---

Changes in v7: None
Changes in v6:
Add dmic wakeup delay(not used for now).

Changes in v3: None
Changes in v2: None

 sound/soc/rockchip/Kconfig|  1 +
 sound/soc/rockchip/rk3399_gru_sound.c | 36 +++
 2 files changed, 37 insertions(+)

diff --git a/sound/soc/rockchip/Kconfig b/sound/soc/rockchip/Kconfig
index 8f0d0d8d34e6..b0825370d262 100644
--- a/sound/soc/rockchip/Kconfig
+++ b/sound/soc/rockchip/Kconfig
@@ -69,6 +69,7 @@ config SND_SOC_RK3399_GRU_SOUND
select SND_SOC_DA7219
select SND_SOC_RT5514_SPI
select SND_SOC_HDMI_CODEC
+   select SND_SOC_DMIC
help
  Say Y or M here if you want to add support multiple codecs for SoC
  audio on Rockchip RK3399 GRU boards.
diff --git a/sound/soc/rockchip/rk3399_gru_sound.c 
b/sound/soc/rockchip/rk3399_gru_sound.c
index 91aab5c18f3b..5ab25962cabd 100644
--- a/sound/soc/rockchip/rk3399_gru_sound.c
+++ b/sound/soc/rockchip/rk3399_gru_sound.c
@@ -267,6 +267,28 @@ static int rockchip_sound_cdndp_hw_params(struct 
snd_pcm_substream *substream,
return 0;
 }
 
+static int rockchip_sound_dmic_hw_params(struct snd_pcm_substream *substream,
+struct snd_pcm_hw_params *params)
+{
+   struct snd_soc_pcm_runtime *rtd = substream->private_data;
+   unsigned int mclk;
+   int ret;
+
+   mclk = params_rate(params) * SOUND_FS;
+
+   ret = snd_soc_dai_set_sysclk(rtd->cpu_dai, 0, mclk, 0);
+   if (ret) {
+   dev_err(rtd->card->dev, "%s() error setting sysclk to %u: %d\n",
+   __func__, mclk, ret);
+   return ret;
+   }
+
+   /* Wait for DMIC stable */
+   msleep(dmic_wakeup_delay);
+
+   return 0;
+}
+
 static const struct snd_soc_ops rockchip_sound_max98357a_ops = {
.hw_params = rockchip_sound_max98357a_hw_params,
 };
@@ -283,6 +305,10 @@ static struct snd_soc_ops rockchip_sound_cdndp_ops = {
.hw_params = rockchip_sound_cdndp_hw_params,
 };
 
+static struct snd_soc_ops rockchip_sound_dmic_ops = {
+   .hw_params = rockchip_sound_dmic_hw_params,
+};
+
 static struct snd_soc_card rockchip_sound_card = {
.name = "rk3399-gru-sound",
.owner = THIS_MODULE,
@@ -297,6 +323,7 @@ static struct snd_soc_card rockchip_sound_card = {
 enum {
DAILINK_CDNDP,
DAILINK_DA7219,
+   DAILINK_DMIC,
DAILINK_MAX98357A,
DAILINK_RT5514,
DAILINK_RT5514_DSP,
@@ -305,6 +332,7 @@ enum {
 static const char * const dailink_compat[] = {
[DAILINK_CDNDP] = "rockchip,rk3399-cdn-dp",
[DAILINK_DA7219] = "dlg,da7219",
+   [DAILINK_DMIC] = "dmic-codec",
[DAILINK_MAX98357A] = "maxim,max98357a",
[DAILINK_RT5514] = "realtek,rt5514-i2c",
[DAILINK_RT5514_DSP] = "realtek,rt5514-spi",
@@ -329,6 +357,14 @@ static const struct snd_soc_dai_link rockchip_dais[] = {
.dai_fmt = SND_SOC_DAIFMT_I2S | SND_SOC_DAIFMT_NB_NF |
SND_SOC_DAIFMT_CBS_CFS,
},
+   [DAILINK_DMIC] = {
+   .name = "DMIC",
+   .stream_name = "DMIC PCM",
+   .codec_dai_name = "dmic-hifi",
+   .ops = _sound_dmic_ops,
+   .dai_fmt = SND_SOC_DAIFMT_I2S | SND_SOC_DAIFMT_NB_NF |
+   SND_SOC_DAIFMT_CBS_CFS,
+   },
[DAILINK_MAX98357A] = {
.name = "MAX98357A",
.stream_name = "MAX98357A PCM",
-- 
2.11.0

[RESENT PATCH v7 1/7] ASoC: rockchip: Use codec of_node and dai_name for rt5514 dsp

2017-08-23 Thread Jeffy Chen

Currently we are using codec name for rt5514 dsp dai link, use codec
of_node instead.

Signed-off-by: Jeffy Chen 
---

Changes in v7:
Rebase on the newest for-next

Changes in v6: None
Changes in v3: None
Changes in v2: None

 sound/soc/rockchip/rk3399_gru_sound.c | 34 ++
 1 file changed, 2 insertions(+), 32 deletions(-)

diff --git a/sound/soc/rockchip/rk3399_gru_sound.c 
b/sound/soc/rockchip/rk3399_gru_sound.c
index 566ccb39fb31..cd2fdba922f1 100644
--- a/sound/soc/rockchip/rk3399_gru_sound.c
+++ b/sound/soc/rockchip/rk3399_gru_sound.c
@@ -247,8 +247,6 @@ enum {
DAILINK_RT5514_DSP,
 };
 
-#define DAILINK_ENTITIES   (DAILINK_DA7219 + 1)
-
 static struct snd_soc_dai_link rockchip_dailinks[] = {
[DAILINK_MAX98357A] = {
.name = "MAX98357A",
@@ -282,8 +280,7 @@ static struct snd_soc_dai_link rockchip_dailinks[] = {
[DAILINK_RT5514_DSP] = {
.name = "RT5514 DSP",
.stream_name = "Wake on Voice",
-   .codec_name = "snd-soc-dummy",
-   .codec_dai_name = "snd-soc-dummy-dai",
+   .codec_dai_name = "rt5514-dsp-cpu-dai",
},
 };
 
@@ -300,17 +297,10 @@ static struct snd_soc_card rockchip_sound_card = {
.num_controls = ARRAY_SIZE(rockchip_controls),
 };
 
-static int rockchip_sound_match_stub(struct device *dev, void *data)
-{
-   return 1;
-}
-
 static int rockchip_sound_probe(struct platform_device *pdev)
 {
struct snd_soc_card *card = _sound_card;
struct device_node *cpu_node;
-   struct device *dev;
-   struct device_driver *drv;
int i, ret;
 
cpu_node = of_parse_phandle(pdev->dev.of_node, "rockchip,cpu", 0);
@@ -319,7 +309,7 @@ static int rockchip_sound_probe(struct platform_device 
*pdev)
return -EINVAL;
}
 
-   for (i = 0; i < DAILINK_ENTITIES; i++) {
+   for (i = 0; i < ARRAY_SIZE(rockchip_dailinks); i++) {
rockchip_dailinks[i].platform_of_node = cpu_node;
rockchip_dailinks[i].cpu_of_node = cpu_node;
 
@@ -332,22 +322,6 @@ static int rockchip_sound_probe(struct platform_device 
*pdev)
}
}
 
-   /**
-* To acquire the spi driver of the rt5514 and set the dai-links names
-* for soc_bind_dai_link
-*/
-   drv = driver_find("rt5514", _bus_type);
-   if (!drv) {
-   dev_err(>dev, "Can not find the rt5514 driver at the spi 
bus\n");
-   return -EINVAL;
-   }
-
-   dev = driver_find_device(drv, NULL, NULL, rockchip_sound_match_stub);
-   if (!dev) {
-   dev_err(>dev, "Can not find the rt5514 device\n");
-   return -ENODEV;
-   }
-
/* Set DMIC wakeup delay */
ret = device_property_read_u32(>dev, "dmic-wakeup-delay-ms",
_wakeup_delay);
@@ -357,10 +331,6 @@ static int rockchip_sound_probe(struct platform_device 
*pdev)
"no optional property 'dmic-wakeup-delay-ms' found, 
default: no delay\n");
}
 
-   rockchip_dailinks[DAILINK_RT5514_DSP].cpu_name = 
kstrdup_const(dev_name(dev), GFP_KERNEL);
-   rockchip_dailinks[DAILINK_RT5514_DSP].cpu_dai_name = 
kstrdup_const(dev_name(dev), GFP_KERNEL);
-   rockchip_dailinks[DAILINK_RT5514_DSP].platform_name = 
kstrdup_const(dev_name(dev), GFP_KERNEL);
-
card->dev = >dev;
 
ret = devm_snd_soc_register_card(>dev, card);
-- 
2.11.0

[RESENT PATCH v7 1/7] ASoC: rockchip: Use codec of_node and dai_name for rt5514 dsp

2017-08-23 Thread Jeffy Chen

Currently we are using codec name for rt5514 dsp dai link, use codec
of_node instead.

Signed-off-by: Jeffy Chen 
---

Changes in v7:
Rebase on the newest for-next

Changes in v6: None
Changes in v3: None
Changes in v2: None

 sound/soc/rockchip/rk3399_gru_sound.c | 34 ++
 1 file changed, 2 insertions(+), 32 deletions(-)

diff --git a/sound/soc/rockchip/rk3399_gru_sound.c 
b/sound/soc/rockchip/rk3399_gru_sound.c
index 566ccb39fb31..cd2fdba922f1 100644
--- a/sound/soc/rockchip/rk3399_gru_sound.c
+++ b/sound/soc/rockchip/rk3399_gru_sound.c
@@ -247,8 +247,6 @@ enum {
DAILINK_RT5514_DSP,
 };
 
-#define DAILINK_ENTITIES   (DAILINK_DA7219 + 1)
-
 static struct snd_soc_dai_link rockchip_dailinks[] = {
[DAILINK_MAX98357A] = {
.name = "MAX98357A",
@@ -282,8 +280,7 @@ static struct snd_soc_dai_link rockchip_dailinks[] = {
[DAILINK_RT5514_DSP] = {
.name = "RT5514 DSP",
.stream_name = "Wake on Voice",
-   .codec_name = "snd-soc-dummy",
-   .codec_dai_name = "snd-soc-dummy-dai",
+   .codec_dai_name = "rt5514-dsp-cpu-dai",
},
 };
 
@@ -300,17 +297,10 @@ static struct snd_soc_card rockchip_sound_card = {
.num_controls = ARRAY_SIZE(rockchip_controls),
 };
 
-static int rockchip_sound_match_stub(struct device *dev, void *data)
-{
-   return 1;
-}
-
 static int rockchip_sound_probe(struct platform_device *pdev)
 {
struct snd_soc_card *card = _sound_card;
struct device_node *cpu_node;
-   struct device *dev;
-   struct device_driver *drv;
int i, ret;
 
cpu_node = of_parse_phandle(pdev->dev.of_node, "rockchip,cpu", 0);
@@ -319,7 +309,7 @@ static int rockchip_sound_probe(struct platform_device 
*pdev)
return -EINVAL;
}
 
-   for (i = 0; i < DAILINK_ENTITIES; i++) {
+   for (i = 0; i < ARRAY_SIZE(rockchip_dailinks); i++) {
rockchip_dailinks[i].platform_of_node = cpu_node;
rockchip_dailinks[i].cpu_of_node = cpu_node;
 
@@ -332,22 +322,6 @@ static int rockchip_sound_probe(struct platform_device 
*pdev)
}
}
 
-   /**
-* To acquire the spi driver of the rt5514 and set the dai-links names
-* for soc_bind_dai_link
-*/
-   drv = driver_find("rt5514", _bus_type);
-   if (!drv) {
-   dev_err(>dev, "Can not find the rt5514 driver at the spi 
bus\n");
-   return -EINVAL;
-   }
-
-   dev = driver_find_device(drv, NULL, NULL, rockchip_sound_match_stub);
-   if (!dev) {
-   dev_err(>dev, "Can not find the rt5514 device\n");
-   return -ENODEV;
-   }
-
/* Set DMIC wakeup delay */
ret = device_property_read_u32(>dev, "dmic-wakeup-delay-ms",
_wakeup_delay);
@@ -357,10 +331,6 @@ static int rockchip_sound_probe(struct platform_device 
*pdev)
"no optional property 'dmic-wakeup-delay-ms' found, 
default: no delay\n");
}
 
-   rockchip_dailinks[DAILINK_RT5514_DSP].cpu_name = 
kstrdup_const(dev_name(dev), GFP_KERNEL);
-   rockchip_dailinks[DAILINK_RT5514_DSP].cpu_dai_name = 
kstrdup_const(dev_name(dev), GFP_KERNEL);
-   rockchip_dailinks[DAILINK_RT5514_DSP].platform_name = 
kstrdup_const(dev_name(dev), GFP_KERNEL);
-
card->dev = >dev;
 
ret = devm_snd_soc_register_card(>dev, card);
-- 
2.11.0

[RESENT PATCH v7 2/7] arm64: dts: rockchip: Add rt5514 dsp for Gru

2017-08-23 Thread Jeffy Chen

Add rt5514 dsp of_node to codec list for Gru boards.

Signed-off-by: Jeffy Chen 
---

Changes in v7: None
Changes in v6: None
Changes in v3: None
Changes in v2: None

 arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi 
b/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi
index 199a5118b20d..5772c52fbfd3 100644
--- a/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi
@@ -514,7 +514,8 @@
sound {
compatible = "rockchip,rk3399-gru-sound";
rockchip,cpu = < >;
-   rockchip,codec = <  >;
+   rockchip,codec = < 
+  _spi_audio>;
};
 };
 
-- 
2.11.0

[RESENT PATCH v7 3/7] arm64: dts: rockchip: Update rt5514 devices' compatible for Gru

2017-08-23 Thread Jeffy Chen

Currently the rt5514 i2c driver and rt5514 spi driver are using the same
compatible string.

Add additional unused compatible strings to identify them for Gru
boards.

Signed-off-by: Jeffy Chen 
---

Changes in v7: None
Changes in v6: None
Changes in v3: None
Changes in v2: None

 arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi 
b/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi
index 5772c52fbfd3..50fb11ad9f17 100644
--- a/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi
@@ -611,7 +611,7 @@ ap_i2c_mic:  {
i2c-scl-rising-time-ns = <300>;
 
headsetcodec: rt5514@57 {
-   compatible = "realtek,rt5514";
+   compatible = "realtek,rt5514", "realtek,rt5514-i2c";
reg = <0x57>;
realtek,dmic-init-delay-ms = <20>;
};
@@ -820,7 +820,7 @@ ap_i2c_audio:  {
status = "okay";
 
wacky_spi_audio: spi2@0 {
-   compatible = "realtek,rt5514";
+   compatible = "realtek,rt5514", "realtek,rt5514-spi";
reg = <0>;
interrupt-parent = <>;
interrupts = <13 IRQ_TYPE_LEVEL_HIGH>;
-- 
2.11.0

[RESENT PATCH v7 2/7] arm64: dts: rockchip: Add rt5514 dsp for Gru

2017-08-23 Thread Jeffy Chen

Add rt5514 dsp of_node to codec list for Gru boards.

Signed-off-by: Jeffy Chen 
---

Changes in v7: None
Changes in v6: None
Changes in v3: None
Changes in v2: None

 arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi 
b/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi
index 199a5118b20d..5772c52fbfd3 100644
--- a/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi
@@ -514,7 +514,8 @@
sound {
compatible = "rockchip,rk3399-gru-sound";
rockchip,cpu = < >;
-   rockchip,codec = <  >;
+   rockchip,codec = < 
+  _spi_audio>;
};
 };
 
-- 
2.11.0

[RESENT PATCH v7 3/7] arm64: dts: rockchip: Update rt5514 devices' compatible for Gru

2017-08-23 Thread Jeffy Chen

Currently the rt5514 i2c driver and rt5514 spi driver are using the same
compatible string.

Add additional unused compatible strings to identify them for Gru
boards.

Signed-off-by: Jeffy Chen 
---

Changes in v7: None
Changes in v6: None
Changes in v3: None
Changes in v2: None

 arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi 
b/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi
index 5772c52fbfd3..50fb11ad9f17 100644
--- a/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi
@@ -611,7 +611,7 @@ ap_i2c_mic:  {
i2c-scl-rising-time-ns = <300>;
 
headsetcodec: rt5514@57 {
-   compatible = "realtek,rt5514";
+   compatible = "realtek,rt5514", "realtek,rt5514-i2c";
reg = <0x57>;
realtek,dmic-init-delay-ms = <20>;
};
@@ -820,7 +820,7 @@ ap_i2c_audio:  {
status = "okay";
 
wacky_spi_audio: spi2@0 {
-   compatible = "realtek,rt5514";
+   compatible = "realtek,rt5514", "realtek,rt5514-spi";
reg = <0>;
interrupt-parent = <>;
interrupts = <13 IRQ_TYPE_LEVEL_HIGH>;
-- 
2.11.0

Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]

2017-08-23 Thread Sergey Senozhatsky

Hi,

On (08/24/17 12:39), Boqun Feng wrote:
> On Wed, Aug 23, 2017 at 02:55:17PM +0900, Sergey Senozhatsky wrote:
> > On (08/23/17 13:35), Boqun Feng wrote:
> > > > KERN_CONT and "\n" should not be together. "\n" flushes the cont
> > > > buffer immediately.
> > > > 
> > > 
> > > Hmm.. Not quite familiar with printk() stuffs, but I could see several
> > > usages of printk(KERN_CONT "...\n") in kernel.
> > > 
> > > Did a bit research myself, and I now think the inappropriate use is to
> > > use a KERN_CONT printk *after* another printk ending with a "\n".
> > 
> > ah... I didn't check __print_lock_name(): it leaves unflushed cont buffer
> > upon the return. sorry, your code is correct.
> > 
> 
> So means printk(KERN_CON "..."); + printk(KERN_CONT "...\n") is a
> correct usage, right?

well, yes. with one precondition - there should be no printk-s from other
CPUs/tasks in between

printk(KERN_CON "...");  +  printk(KERN_CONT "...\n")
   ^
here we can have a preliminary flush and broken
cont line. but it's been this way forever.

-ss

Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]

2017-08-23 Thread Sergey Senozhatsky

Hi,

On (08/24/17 12:39), Boqun Feng wrote:
> On Wed, Aug 23, 2017 at 02:55:17PM +0900, Sergey Senozhatsky wrote:
> > On (08/23/17 13:35), Boqun Feng wrote:
> > > > KERN_CONT and "\n" should not be together. "\n" flushes the cont
> > > > buffer immediately.
> > > > 
> > > 
> > > Hmm.. Not quite familiar with printk() stuffs, but I could see several
> > > usages of printk(KERN_CONT "...\n") in kernel.
> > > 
> > > Did a bit research myself, and I now think the inappropriate use is to
> > > use a KERN_CONT printk *after* another printk ending with a "\n".
> > 
> > ah... I didn't check __print_lock_name(): it leaves unflushed cont buffer
> > upon the return. sorry, your code is correct.
> > 
> 
> So means printk(KERN_CON "..."); + printk(KERN_CONT "...\n") is a
> correct usage, right?

well, yes. with one precondition - there should be no printk-s from other
CPUs/tasks in between

printk(KERN_CON "...");  +  printk(KERN_CONT "...\n")
   ^
here we can have a preliminary flush and broken
cont line. but it's been this way forever.

-ss

Re: Do we really need d_weak_revalidate???

2017-08-23 Thread Ian Kent

On 24/08/17 12:07, NeilBrown wrote:
> On Wed, Aug 23 2017, Ian Kent wrote:
> 
>>
>> That inconsistency has bothered me for quite a while now.
>>
>> It was carried over from the autofs module behavior when automounting
>> support was added to the VFS. What's worse is it prevents the use of
>> the AT_NO_AUTOMOUNT flag from working properly with fstatat(2) and with
>> statx().
>>
>> There is some risk in changing that so it does work but it really does
>> need to work to enable userspace to not trigger an automount by using
>> this flag.
>>
>> So that's (hopefully) going to change soonish, see:
>> http://ozlabs.org/~akpm/mmotm/broken-out/autofs-fix-at_no_automount-not-being-honored.patch
>>
>> The result should be that stat family calls don't trigger automounts except
>> for fstatat(2) and statx() which will require the AT_NO_AUTOMOUNT flag.
>>
> 
> oooh, yes.  That's much better - thanks.
> 
> We should make sure that change gets into the man pages...

Yes, I was wondering who to contact for that.

> 
> First however, we should probably correct the man page!
> stat.2 says:
> 
> 
>   NOTES
>On Linux, lstat() will generally not trigger  automounter
>action,  whereas  stat()  will  (but  see  the description of
>fstatat() AT_NO_AUTOMOUNT fag, above).
> 
> which is wrong: lstat and stat treat automounts the same.
> @Michael: do you recall why you inserted that text?  The commit message
> in commit 1ef5b2805471 ("stat.2: Cosmetic reworking of timestamp
> discussion in NOTES") is not very helpful.
> 
> I propose correcting to
> 
>   NOTES:
>   On Linux, lstat() nor stat() act as though AT_NO_AUTOMOUNT was set
>   and will not trigger automounter action for direct automount
>   points, though they may (prior to 4.14) for indirect automount
>   points.

Shouldn't that be "lstat() and stat() act as though AT_NO_AUTOMOUNT is
set ..."

> 
> 
> The more precise details, that automount action for indirect automount
> points is not triggered when the 'browse' option is used, is probably
> not necessary.
> 
> Ian: if you agree with that text, and Michael doesn't provide alternate
> evidence, I'll submit a formal patch for the man page or should we
> just wait until the patch actually lands?

I thought the fstatat() description needed attention too, doubly so with
the AT_NO_AUTOMOUNT change.

The "The fstatat() system call operates in exactly the same way as stat()"
is wrong in the same way as the stat() description was wrong.

After the change fstatat() will trigger automounts if the AT_NO_AUTOMOUNT
flag is not given which is different from lstat() and stat().

The updated NOTE above probably needs to be referred to in order to clarify
what's meant by "in exactly the same way" since that probably refers to the
information returned rather than whether an automount will be done.

Ian

Re: Do we really need d_weak_revalidate???

2017-08-23 Thread Ian Kent

On 24/08/17 12:07, NeilBrown wrote:
> On Wed, Aug 23 2017, Ian Kent wrote:
> 
>>
>> That inconsistency has bothered me for quite a while now.
>>
>> It was carried over from the autofs module behavior when automounting
>> support was added to the VFS. What's worse is it prevents the use of
>> the AT_NO_AUTOMOUNT flag from working properly with fstatat(2) and with
>> statx().
>>
>> There is some risk in changing that so it does work but it really does
>> need to work to enable userspace to not trigger an automount by using
>> this flag.
>>
>> So that's (hopefully) going to change soonish, see:
>> http://ozlabs.org/~akpm/mmotm/broken-out/autofs-fix-at_no_automount-not-being-honored.patch
>>
>> The result should be that stat family calls don't trigger automounts except
>> for fstatat(2) and statx() which will require the AT_NO_AUTOMOUNT flag.
>>
> 
> oooh, yes.  That's much better - thanks.
> 
> We should make sure that change gets into the man pages...

Yes, I was wondering who to contact for that.

> 
> First however, we should probably correct the man page!
> stat.2 says:
> 
> 
>   NOTES
>On Linux, lstat() will generally not trigger  automounter
>action,  whereas  stat()  will  (but  see  the description of
>fstatat() AT_NO_AUTOMOUNT fag, above).
> 
> which is wrong: lstat and stat treat automounts the same.
> @Michael: do you recall why you inserted that text?  The commit message
> in commit 1ef5b2805471 ("stat.2: Cosmetic reworking of timestamp
> discussion in NOTES") is not very helpful.
> 
> I propose correcting to
> 
>   NOTES:
>   On Linux, lstat() nor stat() act as though AT_NO_AUTOMOUNT was set
>   and will not trigger automounter action for direct automount
>   points, though they may (prior to 4.14) for indirect automount
>   points.

Shouldn't that be "lstat() and stat() act as though AT_NO_AUTOMOUNT is
set ..."

> 
> 
> The more precise details, that automount action for indirect automount
> points is not triggered when the 'browse' option is used, is probably
> not necessary.
> 
> Ian: if you agree with that text, and Michael doesn't provide alternate
> evidence, I'll submit a formal patch for the man page or should we
> just wait until the patch actually lands?

I thought the fstatat() description needed attention too, doubly so with
the AT_NO_AUTOMOUNT change.

The "The fstatat() system call operates in exactly the same way as stat()"
is wrong in the same way as the stat() description was wrong.

After the change fstatat() will trigger automounts if the AT_NO_AUTOMOUNT
flag is not given which is different from lstat() and stat().

The updated NOTE above probably needs to be referred to in order to clarify
what's meant by "in exactly the same way" since that probably refers to the
information returned rather than whether an automount will be done.

Ian

Re: [PATCH v2] ASoC: Add a sanity check before using dai driver name

2017-08-23 Thread jeffy


Hi Dong,

Thanks for noticing, will send new patch soon :)

On 08/24/2017 11:46 AM, Donglin Peng wrote:

On Thu, Aug 24, 2017 at 11:34 AM, Jeffy Chen  wrote:

 list_for_each_entry(dai, >dai_list, list) {
 if (dlc->dai_name && strcmp(dai->name, dlc->dai_name)
-   && strcmp(dai->driver->name, dlc->dai_name))
+   && (!dai->driver->name
+   || !strcmp(dai->driver->name, dlc->dai_name))
 continue;


If the dai->driver->name is match with the dlc->dai_name, does it need
to continue?


hmm, sorry, i did this in a hurry, should setup my board and test it..




 return dai;

Re: [PATCH v2] ASoC: Add a sanity check before using dai driver name

2017-08-23 Thread jeffy


Hi Dong,

Thanks for noticing, will send new patch soon :)

On 08/24/2017 11:46 AM, Donglin Peng wrote:

On Thu, Aug 24, 2017 at 11:34 AM, Jeffy Chen  wrote:

 list_for_each_entry(dai, >dai_list, list) {
 if (dlc->dai_name && strcmp(dai->name, dlc->dai_name)
-   && strcmp(dai->driver->name, dlc->dai_name))
+   && (!dai->driver->name
+   || !strcmp(dai->driver->name, dlc->dai_name))
 continue;


If the dai->driver->name is match with the dlc->dai_name, does it need
to continue?


hmm, sorry, i did this in a hurry, should setup my board and test it..




 return dai;

[PATCH v3] ASoC: Add a sanity check before using dai driver name

2017-08-23 Thread Jeffy Chen

The dai driver's name is allowed to be NULL. So add a sanity check for
that.

Signed-off-by: Jeffy Chen 
Reported-by: Donglin Peng 
---

Changes in v3:
Fix typo

Changes in v2:
Keep the oringinal check style.

 sound/soc/soc-core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c
index fc1bb2da3e2e..8827c2225ba3 100644
--- a/sound/soc/soc-core.c
+++ b/sound/soc/soc-core.c
@@ -1029,7 +1029,8 @@ struct snd_soc_dai *snd_soc_find_dai(
continue;
list_for_each_entry(dai, >dai_list, list) {
if (dlc->dai_name && strcmp(dai->name, dlc->dai_name)
-   && strcmp(dai->driver->name, dlc->dai_name))
+   && (!dai->driver->name
+   || strcmp(dai->driver->name, dlc->dai_name)))
continue;
 
return dai;
-- 
2.11.0

Re: [PATCH v7 1/2] PCI: iproc: Retry request when CRS returned from EP

2017-08-23 Thread Oza Oza

On Wed, Aug 23, 2017 at 11:30 PM, Bjorn Helgaas  wrote:
> On Wed, Aug 23, 2017 at 09:32:06PM +0530, Oza Oza wrote:
>> On Wed, Aug 23, 2017 at 9:22 PM, Sinan Kaya  wrote:
>> > Hi Oza,
>> >
>> >> In working Enumuration case I get following:
>> >> [9.125976] pci :00:00.0: bridge configuration invalid ([bus
>> >> 00-00]), re-configuring
>> >> [9.134267] where=0x0 val=0x0001
>> >> [9.146946] where=0x0 val=0x0001
>> >> [9.158943] where=0x0 val=0x0001
>> >> [9.170945] where=0x0 val=0x0001
>> >> [9.186945] where=0x0 val=0x0001
>> >> [9.210944] where=0x0 val=0x0001
>> >> [9.250943] where=0x0 val=0x0001
>> >> [9.322942] where=0x0 val=0x0001
>> >> [9.458943] where=0x0 val=0x0001
>> >> [9.726942] where=0x0 val=0x9538086>> actual vendor and device id.
>> >>
>> >> so I think I have to retry in RC driver, so the old code still holds good.
>> >> except that I have to do factoring out
>> > You need to return 0x0001 for vendor ID register and return 0x 
>> > for
>> > other registers like COMMAND register during the CRS period.
>
> The proposal we're trying to implement is to handle this controller as
> an RP that does not support CRS SV.  Such an RP would never return
> 0x0001 for the vendor/device ID.  If it never got a successful
> completion, it would return 0x.
>
> So I think you do have to either retry (as in your v7 patch) or add a
> delay before we start enumeration.
>
> It looks like we waited somewhere between 320ms and 590ms before this
> device became ready.
>
> The PCI specs do require a delay after a reset (PCIe r3.1, sec 6.6.1)
> before software sends a config request.  I don't think there's
> anywhere in the PCI core where we delay before we first enumerate
> devices under a host bridge.  I'm not sure we'd *want* such a delay in
> the PCI core, because on many systems the BIOS has already initialized
> the PCI controller, and an additional delay would be unnecessary and
> would only slow down boot.
>
> But it might make sense to add a delay in the PCI controller driver --
> it knows when the reset occurs and probably knows more about the
> firmware environment.  I haven't tried to analyze all of sec 6.6.1,
> but this section:
>
>   Unless Readiness Notifications mechanisms are used (see Section
>   6.23), the Root Complex and/or system software must allow at least
>   1.0 s after a Conventional Reset of a device, before it may
>   determine that a device which fails to return a Successful
>   Completion status for a valid Configuration Request is a broken
>   device. This period is independent of how quickly Link training
>   completes.
>
>   Note: This delay is analogous to the Trhfa parameter specified for
>   PCI/PCI-X, and is intended to allow an adequate amount of time for
>   devices which require self initialization.
>
> makes it sound like a 1sec delay might be needed.  That sounds like an
> awful lot, but this device did take close to that amount of time.
>
> I don't care which way you go.  You've already implemented the delay
> in the v7 patch, and that's fine with me.
>

Thanks for your inputs Bjorn.
I will have v8 patch which will have;
factored out separate patch + retry implementation of v7 patch.

> Bjorn

Re: [PATCH v7 1/2] PCI: iproc: Retry request when CRS returned from EP

2017-08-23 Thread Oza Oza

On Wed, Aug 23, 2017 at 11:30 PM, Bjorn Helgaas  wrote:
> On Wed, Aug 23, 2017 at 09:32:06PM +0530, Oza Oza wrote:
>> On Wed, Aug 23, 2017 at 9:22 PM, Sinan Kaya  wrote:
>> > Hi Oza,
>> >
>> >> In working Enumuration case I get following:
>> >> [9.125976] pci :00:00.0: bridge configuration invalid ([bus
>> >> 00-00]), re-configuring
>> >> [9.134267] where=0x0 val=0x0001
>> >> [9.146946] where=0x0 val=0x0001
>> >> [9.158943] where=0x0 val=0x0001
>> >> [9.170945] where=0x0 val=0x0001
>> >> [9.186945] where=0x0 val=0x0001
>> >> [9.210944] where=0x0 val=0x0001
>> >> [9.250943] where=0x0 val=0x0001
>> >> [9.322942] where=0x0 val=0x0001
>> >> [9.458943] where=0x0 val=0x0001
>> >> [9.726942] where=0x0 val=0x9538086>> actual vendor and device id.
>> >>
>> >> so I think I have to retry in RC driver, so the old code still holds good.
>> >> except that I have to do factoring out
>> > You need to return 0x0001 for vendor ID register and return 0x 
>> > for
>> > other registers like COMMAND register during the CRS period.
>
> The proposal we're trying to implement is to handle this controller as
> an RP that does not support CRS SV.  Such an RP would never return
> 0x0001 for the vendor/device ID.  If it never got a successful
> completion, it would return 0x.
>
> So I think you do have to either retry (as in your v7 patch) or add a
> delay before we start enumeration.
>
> It looks like we waited somewhere between 320ms and 590ms before this
> device became ready.
>
> The PCI specs do require a delay after a reset (PCIe r3.1, sec 6.6.1)
> before software sends a config request.  I don't think there's
> anywhere in the PCI core where we delay before we first enumerate
> devices under a host bridge.  I'm not sure we'd *want* such a delay in
> the PCI core, because on many systems the BIOS has already initialized
> the PCI controller, and an additional delay would be unnecessary and
> would only slow down boot.
>
> But it might make sense to add a delay in the PCI controller driver --
> it knows when the reset occurs and probably knows more about the
> firmware environment.  I haven't tried to analyze all of sec 6.6.1,
> but this section:
>
>   Unless Readiness Notifications mechanisms are used (see Section
>   6.23), the Root Complex and/or system software must allow at least
>   1.0 s after a Conventional Reset of a device, before it may
>   determine that a device which fails to return a Successful
>   Completion status for a valid Configuration Request is a broken
>   device. This period is independent of how quickly Link training
>   completes.
>
>   Note: This delay is analogous to the Trhfa parameter specified for
>   PCI/PCI-X, and is intended to allow an adequate amount of time for
>   devices which require self initialization.
>
> makes it sound like a 1sec delay might be needed.  That sounds like an
> awful lot, but this device did take close to that amount of time.
>
> I don't care which way you go.  You've already implemented the delay
> in the v7 patch, and that's fine with me.
>

Thanks for your inputs Bjorn.
I will have v8 patch which will have;
factored out separate patch + retry implementation of v7 patch.

> Bjorn

[PATCH v3] ASoC: Add a sanity check before using dai driver name

2017-08-23 Thread Jeffy Chen

The dai driver's name is allowed to be NULL. So add a sanity check for
that.

Signed-off-by: Jeffy Chen 
Reported-by: Donglin Peng 
---

Changes in v3:
Fix typo

Changes in v2:
Keep the oringinal check style.

 sound/soc/soc-core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c
index fc1bb2da3e2e..8827c2225ba3 100644
--- a/sound/soc/soc-core.c
+++ b/sound/soc/soc-core.c
@@ -1029,7 +1029,8 @@ struct snd_soc_dai *snd_soc_find_dai(
continue;
list_for_each_entry(dai, >dai_list, list) {
if (dlc->dai_name && strcmp(dai->name, dlc->dai_name)
-   && strcmp(dai->driver->name, dlc->dai_name))
+   && (!dai->driver->name
+   || strcmp(dai->driver->name, dlc->dai_name)))
continue;
 
return dai;
-- 
2.11.0

Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]

2017-08-23 Thread Boqun Feng

On Wed, Aug 23, 2017 at 02:55:17PM +0900, Sergey Senozhatsky wrote:
> On (08/23/17 13:35), Boqun Feng wrote:
> > > KERN_CONT and "\n" should not be together. "\n" flushes the cont
> > > buffer immediately.
> > > 
> > 
> > Hmm.. Not quite familiar with printk() stuffs, but I could see several
> > usages of printk(KERN_CONT "...\n") in kernel.
> > 
> > Did a bit research myself, and I now think the inappropriate use is to
> > use a KERN_CONT printk *after* another printk ending with a "\n".
> 
> ah... I didn't check __print_lock_name(): it leaves unflushed cont buffer
> upon the return. sorry, your code is correct.
> 

So means printk(KERN_CON "..."); + printk(KERN_CONT "...\n") is a
correct usage, right? Thanks. Again, not familiar with printk stuff,
glad you can help me go through this ;-)

Regards,
Boqun

>   -ss
> 
> > > > printk("\n *** DEADLOCK ***\n\n");
> > > > +   } else if (cross_lock(src->instance)) {
> > > > +   printk(" Possible unsafe locking scenario by 
> > > > crosslock:\n\n");
> > > > +   printk("   CPU0CPU1\n");
> > > > +   printk("   \n");
> > > > +   printk("  lock(");
> > > > +   __print_lock_name(target);
> > > > +   printk(KERN_CONT ");\n");
> > > > +   printk("  lock(");
> > > > +   __print_lock_name(source);
> > > > +   printk(KERN_CONT ");\n");
> > > > +   printk("   lock(");
> > > > +   __print_lock_name(parent == source ? target : parent);
> > > > +   printk(KERN_CONT ");\n");
> > > > +   printk("   unlock(");
> > > > +   __print_lock_name(source);
> > > > +   printk(KERN_CONT ");\n");
> > > > +   printk("\n *** DEADLOCK ***\n\n");
> > > > } else {
> > > > printk(" Possible unsafe locking scenario:\n\n");
> > > > printk("   CPU0CPU1\n");
> > > > -- 
> > > > 2.14.1
> > > > 
> 
> 


signature.asc
Description: PGP signature

Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]

2017-08-23 Thread Boqun Feng

On Wed, Aug 23, 2017 at 02:55:17PM +0900, Sergey Senozhatsky wrote:
> On (08/23/17 13:35), Boqun Feng wrote:
> > > KERN_CONT and "\n" should not be together. "\n" flushes the cont
> > > buffer immediately.
> > > 
> > 
> > Hmm.. Not quite familiar with printk() stuffs, but I could see several
> > usages of printk(KERN_CONT "...\n") in kernel.
> > 
> > Did a bit research myself, and I now think the inappropriate use is to
> > use a KERN_CONT printk *after* another printk ending with a "\n".
> 
> ah... I didn't check __print_lock_name(): it leaves unflushed cont buffer
> upon the return. sorry, your code is correct.
> 

So means printk(KERN_CON "..."); + printk(KERN_CONT "...\n") is a
correct usage, right? Thanks. Again, not familiar with printk stuff,
glad you can help me go through this ;-)

Regards,
Boqun

>   -ss
> 
> > > > printk("\n *** DEADLOCK ***\n\n");
> > > > +   } else if (cross_lock(src->instance)) {
> > > > +   printk(" Possible unsafe locking scenario by 
> > > > crosslock:\n\n");
> > > > +   printk("   CPU0CPU1\n");
> > > > +   printk("   \n");
> > > > +   printk("  lock(");
> > > > +   __print_lock_name(target);
> > > > +   printk(KERN_CONT ");\n");
> > > > +   printk("  lock(");
> > > > +   __print_lock_name(source);
> > > > +   printk(KERN_CONT ");\n");
> > > > +   printk("   lock(");
> > > > +   __print_lock_name(parent == source ? target : parent);
> > > > +   printk(KERN_CONT ");\n");
> > > > +   printk("   unlock(");
> > > > +   __print_lock_name(source);
> > > > +   printk(KERN_CONT ");\n");
> > > > +   printk("\n *** DEADLOCK ***\n\n");
> > > > } else {
> > > > printk(" Possible unsafe locking scenario:\n\n");
> > > > printk("   CPU0CPU1\n");
> > > > -- 
> > > > 2.14.1
> > > > 
> 
> 


signature.asc
Description: PGP signature

Re: [PATCH RESEND1 00/12] ALSA: vsnd: Add Xen para-virtualized frontend driver

2017-08-23 Thread Takashi Sakamoto


On Aug 23 2017 23:51, Oleksandr Grytsov wrote:

Hi,

Thank you for detailed explanation.

We understand that emulated interrupt on the frontend side is completely not
acceptable and definitely we need to provide some feedback mechanism from
Dom0 to DomU.

In our case it is technically impossible to provide precise period interrupt
(mostly because our backend is a user space application).
The best we can implement it is provide number of frames (time, bytes etc.)
consumed by real HW. This info will be outdated due to different delays but
we can provide precise timestamps when this info was acquired.


Stuffs of ALSA PCM in kernel land is an abstraction layer for actual
hardware for data transmission. The stuffs get affects from a design of
actual hardware. Furthermore, sound subsystems on the other operating
systems such as Microsoft Windows are also designed with a consideration
about actual hardware. When you design any interfaces as an abstraction
for such software layer, it's better to understand actual hardware and
design of low-level software layer somehow.

Actually the 'sndif' has no good abstraction for actual hardware,
therefore an idea to implement frontend driver as an ALSA driver is not
reasonable at all. It's better to implement it as an application in
the other software layer, e.g. sinks/sources of PulseAudio in DomU via
Xenbus. This idea is nearer an original concept of Xen framework, I
guess. But I don't know we can write any applications of Xenbus in user
land of DomU or not.

Anyway, it's not a good idea to have an ALSA driver for the present 
'sndif', in my opinion.



Regards

Takashi Sakamoto

Re: [PATCH RESEND1 00/12] ALSA: vsnd: Add Xen para-virtualized frontend driver

2017-08-23 Thread Takashi Sakamoto


On Aug 23 2017 23:51, Oleksandr Grytsov wrote:

Hi,

Thank you for detailed explanation.

We understand that emulated interrupt on the frontend side is completely not
acceptable and definitely we need to provide some feedback mechanism from
Dom0 to DomU.

In our case it is technically impossible to provide precise period interrupt
(mostly because our backend is a user space application).
The best we can implement it is provide number of frames (time, bytes etc.)
consumed by real HW. This info will be outdated due to different delays but
we can provide precise timestamps when this info was acquired.


Stuffs of ALSA PCM in kernel land is an abstraction layer for actual
hardware for data transmission. The stuffs get affects from a design of
actual hardware. Furthermore, sound subsystems on the other operating
systems such as Microsoft Windows are also designed with a consideration
about actual hardware. When you design any interfaces as an abstraction
for such software layer, it's better to understand actual hardware and
design of low-level software layer somehow.

Actually the 'sndif' has no good abstraction for actual hardware,
therefore an idea to implement frontend driver as an ALSA driver is not
reasonable at all. It's better to implement it as an application in
the other software layer, e.g. sinks/sources of PulseAudio in DomU via
Xenbus. This idea is nearer an original concept of Xen framework, I
guess. But I don't know we can write any applications of Xenbus in user
land of DomU or not.

Anyway, it's not a good idea to have an ALSA driver for the present 
'sndif', in my opinion.



Regards

Takashi Sakamoto

Re: Do we really need d_weak_revalidate???

2017-08-23 Thread Ian Kent

On 24/08/17 11:21, NeilBrown wrote:
> On Wed, Aug 23 2017, Ian Kent wrote:
> 
>> On 23/08/17 10:32, Ian Kent wrote:
>>> On 23/08/17 09:06, NeilBrown wrote:
 On Mon, Aug 21 2017, Ian Kent wrote:

>>
>> A mount isn't triggered by kern_path(pathname, 0, ).
>> That '0' would need to include one of
>>   LOOKUP_PARENT | LOOKUP_DIRECTORY |
>>   LOOKUP_OPEN | LOOKUP_CREATE | LOOKUP_AUTOMOUNT
>>
>> to trigger an automount (otherwise you just get -EISDIR).
>
> It's perfectly sensible to think that but there is a case where a
> a mount is triggered when using kern_path().
>
> The EISDIR return occurs for positive dentrys, negative dentrys
> will still trigger an automount (which is autofs specific,
> indirect mount map using nobrowse option, the install default).

 Ok, I understand this better now.  This difference between direct and
 indirect mounts is slightly awkward. It is visible from user-space, but
 not elegant to document.
 When you use O_PATH to open a direct automount that has not already been
 triggered, the open returns the underlying directory (and fstatfs
 confirms that it is AUTOFS_SUPER_MAGIC).  When you use O_PATH on
 an indirect automount, it *will* trigger the automount when "nobrowse" is
 in effect, but it won't when "browse" is in effect.
>>>
>>> That inconsistency has bothered me for quite a while now.
>>>
>>> It was carried over from the autofs module behavior when automounting
>>> support was added to the VFS. What's worse is it prevents the use of
>>> the AT_NO_AUTOMOUNT flag from working properly with fstatat(2) and with
>>> statx().
>>>
>>> There is some risk in changing that so it does work but it really does
>>> need to work to enable userspace to not trigger an automount by using
>>> this flag.
>>>
>>> So that's (hopefully) going to change soonish, see:
>>> http://ozlabs.org/~akpm/mmotm/broken-out/autofs-fix-at_no_automount-not-being-honored.patch
>>>
>>> The result should be that stat family calls don't trigger automounts except
>>> for fstatat(2) and statx() which will require the AT_NO_AUTOMOUNT flag.
>>>

 So we cannot just say "O_PATH doesn't trigger automounts", which is
 essentially what I said in

 https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/commit/?id=97a45d02e6671482e8b2cdcce3951930bf6bdb94

 It might be possible to modify automount so that it was more consistent
 - i.e. if the point is triggered by a mkdir has been done, just to the
 mkdir.  If it is triggered after a mkdir has been done, do the mount.  I
 guess that might be racy, and in any case is hard to justify.

 Maybe I should change it to be about "direct automounts", and add a note
 that indirect automounts aren't so predictable.
>>>
>>> Right and the semantics should be much more consistent in the near future.
>>> I hope (and expect) this semantic change won't cause problems.
>>>

 But back to my original issue of wanting to discard
 kern_path_mountpoint, what would you think of the following approach -
 slight revised from before.

 Thanks,
 NeilBrown

 diff --git a/fs/autofs4/autofs_i.h b/fs/autofs4/autofs_i.h
 index beef981aa54f..7663ea82e68d 100644
 --- a/fs/autofs4/autofs_i.h
 +++ b/fs/autofs4/autofs_i.h
 @@ -135,10 +135,13 @@ static inline struct autofs_info 
 *autofs4_dentry_ino(struct dentry *dentry)
  /* autofs4_oz_mode(): do we see the man behind the curtain?  (The
   * processes which do manipulations for us in user space sees the raw
   * filesystem without "magic".)
 + * A process performing certain ioctls can get temporary oz status.
   */
 +extern struct task_struct *autofs_tmp_oz;
  static inline int autofs4_oz_mode(struct autofs_sb_info *sbi)
  {
 -  return sbi->catatonic || task_pgrp(current) == sbi->oz_pgrp;
 +  return sbi->catatonic || task_pgrp(current) == sbi->oz_pgrp ||
 +  autofs_tmp_oz == current;
  }
  
  struct inode *autofs4_get_inode(struct super_block *, umode_t);
 diff --git a/fs/autofs4/dev-ioctl.c b/fs/autofs4/dev-ioctl.c
 index dd9f1bebb5a3..d76401669a20 100644
 --- a/fs/autofs4/dev-ioctl.c
 +++ b/fs/autofs4/dev-ioctl.c
 @@ -200,6 +200,20 @@ static int autofs_dev_ioctl_protosubver(struct file 
 *fp,
return 0;
  }
  
 +struct task_struct *autofs_tmp_oz;
 +int kern_path_oz(const char *pathname, int flags, struct path *path)
 +{
 +  static DEFINE_MUTEX(autofs_oz);
 +  int err;
 +
 +  mutex_lock(_oz);
 +  autofs_tmp_oz = current;
 +  err = kern_path(pathname, flags, path);
 +  autofs_tmp_oz = NULL;
 +  mutex_unlock(_oz);
 +  return err;
 +}
 +
>>>
>>> It's simple enough but does look like it will attract criticism as being
>>> a hack!
>>>
>>> The kern_path_locked() function is very similar to what was

Re: Do we really need d_weak_revalidate???

2017-08-23 Thread Ian Kent

On 24/08/17 11:21, NeilBrown wrote:
> On Wed, Aug 23 2017, Ian Kent wrote:
> 
>> On 23/08/17 10:32, Ian Kent wrote:
>>> On 23/08/17 09:06, NeilBrown wrote:
 On Mon, Aug 21 2017, Ian Kent wrote:

>>
>> A mount isn't triggered by kern_path(pathname, 0, ).
>> That '0' would need to include one of
>>   LOOKUP_PARENT | LOOKUP_DIRECTORY |
>>   LOOKUP_OPEN | LOOKUP_CREATE | LOOKUP_AUTOMOUNT
>>
>> to trigger an automount (otherwise you just get -EISDIR).
>
> It's perfectly sensible to think that but there is a case where a
> a mount is triggered when using kern_path().
>
> The EISDIR return occurs for positive dentrys, negative dentrys
> will still trigger an automount (which is autofs specific,
> indirect mount map using nobrowse option, the install default).

 Ok, I understand this better now.  This difference between direct and
 indirect mounts is slightly awkward. It is visible from user-space, but
 not elegant to document.
 When you use O_PATH to open a direct automount that has not already been
 triggered, the open returns the underlying directory (and fstatfs
 confirms that it is AUTOFS_SUPER_MAGIC).  When you use O_PATH on
 an indirect automount, it *will* trigger the automount when "nobrowse" is
 in effect, but it won't when "browse" is in effect.
>>>
>>> That inconsistency has bothered me for quite a while now.
>>>
>>> It was carried over from the autofs module behavior when automounting
>>> support was added to the VFS. What's worse is it prevents the use of
>>> the AT_NO_AUTOMOUNT flag from working properly with fstatat(2) and with
>>> statx().
>>>
>>> There is some risk in changing that so it does work but it really does
>>> need to work to enable userspace to not trigger an automount by using
>>> this flag.
>>>
>>> So that's (hopefully) going to change soonish, see:
>>> http://ozlabs.org/~akpm/mmotm/broken-out/autofs-fix-at_no_automount-not-being-honored.patch
>>>
>>> The result should be that stat family calls don't trigger automounts except
>>> for fstatat(2) and statx() which will require the AT_NO_AUTOMOUNT flag.
>>>

 So we cannot just say "O_PATH doesn't trigger automounts", which is
 essentially what I said in

 https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/commit/?id=97a45d02e6671482e8b2cdcce3951930bf6bdb94

 It might be possible to modify automount so that it was more consistent
 - i.e. if the point is triggered by a mkdir has been done, just to the
 mkdir.  If it is triggered after a mkdir has been done, do the mount.  I
 guess that might be racy, and in any case is hard to justify.

 Maybe I should change it to be about "direct automounts", and add a note
 that indirect automounts aren't so predictable.
>>>
>>> Right and the semantics should be much more consistent in the near future.
>>> I hope (and expect) this semantic change won't cause problems.
>>>

 But back to my original issue of wanting to discard
 kern_path_mountpoint, what would you think of the following approach -
 slight revised from before.

 Thanks,
 NeilBrown

 diff --git a/fs/autofs4/autofs_i.h b/fs/autofs4/autofs_i.h
 index beef981aa54f..7663ea82e68d 100644
 --- a/fs/autofs4/autofs_i.h
 +++ b/fs/autofs4/autofs_i.h
 @@ -135,10 +135,13 @@ static inline struct autofs_info 
 *autofs4_dentry_ino(struct dentry *dentry)
  /* autofs4_oz_mode(): do we see the man behind the curtain?  (The
   * processes which do manipulations for us in user space sees the raw
   * filesystem without "magic".)
 + * A process performing certain ioctls can get temporary oz status.
   */
 +extern struct task_struct *autofs_tmp_oz;
  static inline int autofs4_oz_mode(struct autofs_sb_info *sbi)
  {
 -  return sbi->catatonic || task_pgrp(current) == sbi->oz_pgrp;
 +  return sbi->catatonic || task_pgrp(current) == sbi->oz_pgrp ||
 +  autofs_tmp_oz == current;
  }
  
  struct inode *autofs4_get_inode(struct super_block *, umode_t);
 diff --git a/fs/autofs4/dev-ioctl.c b/fs/autofs4/dev-ioctl.c
 index dd9f1bebb5a3..d76401669a20 100644
 --- a/fs/autofs4/dev-ioctl.c
 +++ b/fs/autofs4/dev-ioctl.c
 @@ -200,6 +200,20 @@ static int autofs_dev_ioctl_protosubver(struct file 
 *fp,
return 0;
  }
  
 +struct task_struct *autofs_tmp_oz;
 +int kern_path_oz(const char *pathname, int flags, struct path *path)
 +{
 +  static DEFINE_MUTEX(autofs_oz);
 +  int err;
 +
 +  mutex_lock(_oz);
 +  autofs_tmp_oz = current;
 +  err = kern_path(pathname, flags, path);
 +  autofs_tmp_oz = NULL;
 +  mutex_unlock(_oz);
 +  return err;
 +}
 +
>>>
>>> It's simple enough but does look like it will attract criticism as being
>>> a hack!
>>>
>>> The kern_path_locked() function is very similar to what was

Re: [RFC PATCH 1/4] mm: madvise: read loop's step size beforehand in madvise_inject_error(), prepare for THP support.

2017-08-23 Thread Naoya Horiguchi

On Wed, Aug 23, 2017 at 10:20:02AM -0400, Zi Yan wrote:
> On 23 Aug 2017, at 3:49, Naoya Horiguchi wrote:
> 
> > On Mon, Aug 14, 2017 at 09:52:13PM -0400, Zi Yan wrote:
> >> From: Zi Yan 
> >>
> >> The loop in madvise_inject_error() reads its step size from a page
> >> after it is soft-offlined. It works because the page is:
> >> 1) a hugetlb page: the page size does not change;
> >> 2) a base page: the page size does not change;
> >> 3) a THP: soft-offline always splits THPs, thus, it is OK to use
> >>PAGE_SIZE as step size.
> >>
> >> It will be a problem when soft-offline supports THP migrations.
> >> When a THP is migrated without split during soft-offlining, the THP
> >> is split after migration, thus, before and after soft-offlining page
> >> sizes do not match. This causes a THP to be unnecessarily soft-lined,
> >> at most, 511 times, wasting free space.
> >
> > Hi Zi Yan,
> >
> > Thank you for the suggestion.
> >
> > I think that when madvise(MADV_SOFT_OFFLINE) is called with some range
> > over more than one 4kB page, the caller clearly intends to call
> > soft_offline_page() over all 4kB pages within the range in order to
> > simulate the multiple soft-offline events. Please note that the caller
> > only knows that specific pages are half-broken, and expect that all such
> > pages are offlined. So the end result should be same, whether the given
> > range is backed by thp or not.
> >
> 
> But if the given virtual address is backed by a THP and the THP is 
> soft-offlined
> without splitting (enabled by following patches), the old for-loop will cause 
> extra
> 511 THPs being soft-offlined.
> 
> For example, the caller wants to offline VPN 0-511, which is backed by a THP 
> whose
> address range is PFN 0-511. In the first iteration of the for-loop,
> get_user_pages_fast(VPN0, ...) will return the THP and soft_offline_page() 
> will offline the THP,
> replacing it with a new THP, say PFN 512-1023, so VPN 0-511 is backed by PFN 
> 512-1023.
> But the original THP will be split after it is freed, thus, for-loop will not 
> end
> at this moment, but continues to offline VPN1, which leads to PFN 512-1023 
> being offlined
> and replaced by another THP, say 1024-1535. This will go on and end up with
> 511 extra THPs are offlined. That is why we need to this patch to tell
> whether the THP is offlined as a whole or just its head page is offlined.

Thanks for elaborating this. I understand your point.
But I still not sure what the best behavior is.

madvise(MADV_SOFT_OFFLINE) is a test feature and giving multi-page range
on the call works like some stress testing. So multiple thp migrations
seem to me an expected behavior. At least it behaves in the same manner
as calling madvise(MADV_SOFT_OFFLINE) 512 times on VPN0-VPN511 separately,
which is consistent.

So I still feel like leaving the current behavior as long as your later
patches work without this change.

Thanks,
Naoya Horiguchi

> 
> Let me know if it is still not clear to you. Or I missed something.
> 
> >>
> >> Signed-off-by: Zi Yan 
> >> ---
> >>  mm/madvise.c | 21 ++---
> >>  1 file changed, 18 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/mm/madvise.c b/mm/madvise.c
> >> index 47d8d8a25eae..49f6774db259 100644
> >> --- a/mm/madvise.c
> >> +++ b/mm/madvise.c
> >> @@ -612,19 +612,22 @@ static long madvise_remove(struct vm_area_struct 
> >> *vma,
> >>  static int madvise_inject_error(int behavior,
> >>unsigned long start, unsigned long end)
> >>  {
> >> -  struct page *page;
> >> +  struct page *page = NULL;
> >> +  unsigned long page_size = PAGE_SIZE;
> >>
> >>if (!capable(CAP_SYS_ADMIN))
> >>return -EPERM;
> >>
> >> -  for (; start < end; start += PAGE_SIZE <<
> >> -  compound_order(compound_head(page))) {
> >> +  for (; start < end; start += page_size) {
> >>int ret;
> >>
> >>ret = get_user_pages_fast(start, 1, 0, );
> >>if (ret != 1)
> >>return ret;
> >>
> >> +  page_size = (PAGE_SIZE << compound_order(compound_head(page))) -
> >> +  (PAGE_SIZE * (page - compound_head(page)));
> >> +
> >
> > Assigning a value which is not 4kB or some hugepage size into page_size
> > might be confusing because that's not what the name says. You can introduce
> > 'next' virtual address and ALIGN() might be helpful to calculate it.
> 
> Like:
> 
> next = ALIGN(start, PAGE_SIZE< 
> I think it works. Thanks.
> 
> 
> --
> Best Regards
> Yan Zi

Re: [RFC PATCH 1/4] mm: madvise: read loop's step size beforehand in madvise_inject_error(), prepare for THP support.

2017-08-23 Thread Naoya Horiguchi

On Wed, Aug 23, 2017 at 10:20:02AM -0400, Zi Yan wrote:
> On 23 Aug 2017, at 3:49, Naoya Horiguchi wrote:
> 
> > On Mon, Aug 14, 2017 at 09:52:13PM -0400, Zi Yan wrote:
> >> From: Zi Yan 
> >>
> >> The loop in madvise_inject_error() reads its step size from a page
> >> after it is soft-offlined. It works because the page is:
> >> 1) a hugetlb page: the page size does not change;
> >> 2) a base page: the page size does not change;
> >> 3) a THP: soft-offline always splits THPs, thus, it is OK to use
> >>PAGE_SIZE as step size.
> >>
> >> It will be a problem when soft-offline supports THP migrations.
> >> When a THP is migrated without split during soft-offlining, the THP
> >> is split after migration, thus, before and after soft-offlining page
> >> sizes do not match. This causes a THP to be unnecessarily soft-lined,
> >> at most, 511 times, wasting free space.
> >
> > Hi Zi Yan,
> >
> > Thank you for the suggestion.
> >
> > I think that when madvise(MADV_SOFT_OFFLINE) is called with some range
> > over more than one 4kB page, the caller clearly intends to call
> > soft_offline_page() over all 4kB pages within the range in order to
> > simulate the multiple soft-offline events. Please note that the caller
> > only knows that specific pages are half-broken, and expect that all such
> > pages are offlined. So the end result should be same, whether the given
> > range is backed by thp or not.
> >
> 
> But if the given virtual address is backed by a THP and the THP is 
> soft-offlined
> without splitting (enabled by following patches), the old for-loop will cause 
> extra
> 511 THPs being soft-offlined.
> 
> For example, the caller wants to offline VPN 0-511, which is backed by a THP 
> whose
> address range is PFN 0-511. In the first iteration of the for-loop,
> get_user_pages_fast(VPN0, ...) will return the THP and soft_offline_page() 
> will offline the THP,
> replacing it with a new THP, say PFN 512-1023, so VPN 0-511 is backed by PFN 
> 512-1023.
> But the original THP will be split after it is freed, thus, for-loop will not 
> end
> at this moment, but continues to offline VPN1, which leads to PFN 512-1023 
> being offlined
> and replaced by another THP, say 1024-1535. This will go on and end up with
> 511 extra THPs are offlined. That is why we need to this patch to tell
> whether the THP is offlined as a whole or just its head page is offlined.

Thanks for elaborating this. I understand your point.
But I still not sure what the best behavior is.

madvise(MADV_SOFT_OFFLINE) is a test feature and giving multi-page range
on the call works like some stress testing. So multiple thp migrations
seem to me an expected behavior. At least it behaves in the same manner
as calling madvise(MADV_SOFT_OFFLINE) 512 times on VPN0-VPN511 separately,
which is consistent.

So I still feel like leaving the current behavior as long as your later
patches work without this change.

Thanks,
Naoya Horiguchi

> 
> Let me know if it is still not clear to you. Or I missed something.
> 
> >>
> >> Signed-off-by: Zi Yan 
> >> ---
> >>  mm/madvise.c | 21 ++---
> >>  1 file changed, 18 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/mm/madvise.c b/mm/madvise.c
> >> index 47d8d8a25eae..49f6774db259 100644
> >> --- a/mm/madvise.c
> >> +++ b/mm/madvise.c
> >> @@ -612,19 +612,22 @@ static long madvise_remove(struct vm_area_struct 
> >> *vma,
> >>  static int madvise_inject_error(int behavior,
> >>unsigned long start, unsigned long end)
> >>  {
> >> -  struct page *page;
> >> +  struct page *page = NULL;
> >> +  unsigned long page_size = PAGE_SIZE;
> >>
> >>if (!capable(CAP_SYS_ADMIN))
> >>return -EPERM;
> >>
> >> -  for (; start < end; start += PAGE_SIZE <<
> >> -  compound_order(compound_head(page))) {
> >> +  for (; start < end; start += page_size) {
> >>int ret;
> >>
> >>ret = get_user_pages_fast(start, 1, 0, );
> >>if (ret != 1)
> >>return ret;
> >>
> >> +  page_size = (PAGE_SIZE << compound_order(compound_head(page))) -
> >> +  (PAGE_SIZE * (page - compound_head(page)));
> >> +
> >
> > Assigning a value which is not 4kB or some hugepage size into page_size
> > might be confusing because that's not what the name says. You can introduce
> > 'next' virtual address and ALIGN() might be helpful to calculate it.
> 
> Like:
> 
> next = ALIGN(start, PAGE_SIZE< 
> I think it works. Thanks.
> 
> 
> --
> Best Regards
> Yan Zi

Re: [PATCH] zram: add zstd to the supported algorithms list

2017-08-23 Thread Minchan Kim

Hello Sergey,

On Thu, Aug 24, 2017 at 10:49:36AM +0900, Sergey Senozhatsky wrote:
> Add ZSTD to the list of supported compression algorithms.
> 
> Official benchmarks [1]:

First of all, thanks for the work!

I want to ask one thing.

Could you add some benchmark(e.g.,) result(comp ratio and speed)
compared to (inflate, lzo, lz4)?

I want to see how much it's good for small data that ours is 4K.
Thanks!

> 
> Compressor name Ratio   Compression Decompress.
> zstd 1.1.3 -1   2.877   430 MB/s1110 MB/s
> zlib 1.2.8 -1   2.743   110 MB/s400 MB/s
> brotli 0.5.2 -0 2.708   400 MB/s430 MB/s
> quicklz 1.5.0 -12.238   550 MB/s710 MB/s
> lzo1x 2.09 -1   2.108   650 MB/s830 MB/s
> lz4 1.7.5   2.101   720 MB/s3600 MB/s
> snappy 1.1.32.091   500 MB/s1650 MB/s
> lzf 3.6 -1  2.077   400 MB/s860 MB/s
> 
> [1] https://github.com/facebook/zstd
> 
> Signed-off-by: Sergey Senozhatsky 
> ---
>  drivers/block/zram/zcomp.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/block/zram/zcomp.c b/drivers/block/zram/zcomp.c
> index 5b8992beffec..cc66daec7bbc 100644
> --- a/drivers/block/zram/zcomp.c
> +++ b/drivers/block/zram/zcomp.c
> @@ -31,6 +31,9 @@ static const char * const backends[] = {
>  #endif
>  #if IS_ENABLED(CONFIG_CRYPTO_842)
>   "842",
> +#endif
> +#if IS_ENABLED(CONFIG_CRYPTO_ZSTD)
> + "zstd",
>  #endif
>   NULL
>  };
> -- 
> 2.14.1
>

Re: [PATCH v2 2/2] KVM: X86: Fix loss of exception which has not yet injected

2017-08-23 Thread Wanpeng Li

2017-08-23 20:27 GMT+08:00 Paolo Bonzini :
> On 23/08/2017 12:23, Wanpeng Li wrote:
>> @@ -6341,7 +6345,8 @@ static int inject_pending_event(struct kvm_vcpu *vcpu, 
>> bool req_int_win)
>>   int r;
>>
>>   /* try to reinject previous events if any */
>> - if (vcpu->arch.exception.pending) {
>> + if (vcpu->arch.exception.pending ||
>> + vcpu->arch.exception.injected) {
>>   trace_kvm_inj_exception(vcpu->arch.exception.nr,
>>   vcpu->arch.exception.has_error_code,
>>   vcpu->arch.exception.error_code);
>> @@ -6357,6 +6362,11 @@ static int inject_pending_event(struct kvm_vcpu 
>> *vcpu, bool req_int_win)
>>   }
>>
>>   r = kvm_x86_ops->queue_exception(vcpu);
>> + if (r == 0 && vcpu->arch.exception.pending &&
>> + !vcpu->arch.exception.injected) {
>> + vcpu->arch.exception.pending = false;
>> + vcpu->arch.exception.injected = true;
>> + }
>>   return r;
>>   }
>>
>
> There are some changes needed here:
>
> - this "if" should check only .injected and call
> kvm_x86_ops->queue_exception.  The "if" for .pending, which handles
> rflags/dr7, assigns false to .pending and true to .injected, and also
> calls kvm_x86_ops->queue_exception, should be after the call to
> check_nested_events.
>
> - in the call to inject_pending_event, the "else" should have a
>
> WARN_ON(vcpu->arch.exception.pending);
>
> just for completeness.
>
>
> Also, nested_vmx_check_exception has to be moved from
> vmx_queue_exception to vmx_check_nested_events, so that
> nested_run_pending is checked correctly.  Something like this:
>
> if (vcpu->arch.exception.pending &&
> nested_vmx_check_exception(vcpu, _qual)) {
> if (vmx->nested.nested_run_pending)
> return -EBUSY;
>
> nested_vmx_inject_exception_vmexit(vcpu, exit_qual);
> return 0;
> }
>
> Maybe you can have:
>
> - patch 1 which I'll apply now
>
> - patch 2 is the same as this one, with only the above changes to
> inject_pending_event and the WARN_ON.
>
> - patch 3 moves the nested_vmx_inject_exception_vmexit call from
> nested_vmx_check_exception to vmx_queue_exception.
>
> - patch 4 moves the code to vmx_check_nested_events and add the
> nested_run_pending check, fixing the original bug.
>
> As to GET/SET_VCPU_EVENTS, for now I would not do any change.  Instead,
> we can do
>
> /*
>  * FIXME: pass injected and pending separately.  This is only
>  * needed for nested virtualization, whose state cannot be
>  * migrated yet.  For now we combine them just in case.
>  */
> events->exception.injected =
> (vcpu->arch.exception.pending ||
>  vcpu->arch.exception.injected) &&
> !kvm_exception_is_soft(vcpu->arch.exception.nr);
> }
>

I complete all the comments in v3, thanks for the review. :)

Regards,
Wanpeng Li

> and set_vcpu_events can just set .injected = false.  Separating the two
> will need a separate capability and KVM_ENABLE_CAP.
>
> Paolo
>
>>
>> diff --git a/tools/arch/x86/include/uapi/asm/kvm.h 
>> b/tools/arch/x86/include/uapi/asm/kvm.h
>> index c2824d0..7c55916 100644
>> --- a/tools/arch/x86/include/uapi/asm/kvm.h
>> +++ b/tools/arch/x86/include/uapi/asm/kvm.h
>> @@ -296,9 +296,9 @@ struct kvm_reinject_control {
>>  struct kvm_vcpu_events {
>>   struct {
>>   __u8 injected;
>> + __u8 pending;
>>   __u8 nr;
>>   __u8 has_error_code;
>> - __u8 pad;
>>   __u32 error_code;
>>   } exception;
>>   struct {
>
> I think you don't need t

Agreed.

Re: [PATCH] zram: add zstd to the supported algorithms list

2017-08-23 Thread Minchan Kim

Hello Sergey,

On Thu, Aug 24, 2017 at 10:49:36AM +0900, Sergey Senozhatsky wrote:
> Add ZSTD to the list of supported compression algorithms.
> 
> Official benchmarks [1]:

First of all, thanks for the work!

I want to ask one thing.

Could you add some benchmark(e.g.,) result(comp ratio and speed)
compared to (inflate, lzo, lz4)?

I want to see how much it's good for small data that ours is 4K.
Thanks!

> 
> Compressor name Ratio   Compression Decompress.
> zstd 1.1.3 -1   2.877   430 MB/s1110 MB/s
> zlib 1.2.8 -1   2.743   110 MB/s400 MB/s
> brotli 0.5.2 -0 2.708   400 MB/s430 MB/s
> quicklz 1.5.0 -12.238   550 MB/s710 MB/s
> lzo1x 2.09 -1   2.108   650 MB/s830 MB/s
> lz4 1.7.5   2.101   720 MB/s3600 MB/s
> snappy 1.1.32.091   500 MB/s1650 MB/s
> lzf 3.6 -1  2.077   400 MB/s860 MB/s
> 
> [1] https://github.com/facebook/zstd
> 
> Signed-off-by: Sergey Senozhatsky 
> ---
>  drivers/block/zram/zcomp.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/block/zram/zcomp.c b/drivers/block/zram/zcomp.c
> index 5b8992beffec..cc66daec7bbc 100644
> --- a/drivers/block/zram/zcomp.c
> +++ b/drivers/block/zram/zcomp.c
> @@ -31,6 +31,9 @@ static const char * const backends[] = {
>  #endif
>  #if IS_ENABLED(CONFIG_CRYPTO_842)
>   "842",
> +#endif
> +#if IS_ENABLED(CONFIG_CRYPTO_ZSTD)
> + "zstd",
>  #endif
>   NULL
>  };
> -- 
> 2.14.1
>

Re: [PATCH v2 2/2] KVM: X86: Fix loss of exception which has not yet injected

2017-08-23 Thread Wanpeng Li

2017-08-23 20:27 GMT+08:00 Paolo Bonzini :
> On 23/08/2017 12:23, Wanpeng Li wrote:
>> @@ -6341,7 +6345,8 @@ static int inject_pending_event(struct kvm_vcpu *vcpu, 
>> bool req_int_win)
>>   int r;
>>
>>   /* try to reinject previous events if any */
>> - if (vcpu->arch.exception.pending) {
>> + if (vcpu->arch.exception.pending ||
>> + vcpu->arch.exception.injected) {
>>   trace_kvm_inj_exception(vcpu->arch.exception.nr,
>>   vcpu->arch.exception.has_error_code,
>>   vcpu->arch.exception.error_code);
>> @@ -6357,6 +6362,11 @@ static int inject_pending_event(struct kvm_vcpu 
>> *vcpu, bool req_int_win)
>>   }
>>
>>   r = kvm_x86_ops->queue_exception(vcpu);
>> + if (r == 0 && vcpu->arch.exception.pending &&
>> + !vcpu->arch.exception.injected) {
>> + vcpu->arch.exception.pending = false;
>> + vcpu->arch.exception.injected = true;
>> + }
>>   return r;
>>   }
>>
>
> There are some changes needed here:
>
> - this "if" should check only .injected and call
> kvm_x86_ops->queue_exception.  The "if" for .pending, which handles
> rflags/dr7, assigns false to .pending and true to .injected, and also
> calls kvm_x86_ops->queue_exception, should be after the call to
> check_nested_events.
>
> - in the call to inject_pending_event, the "else" should have a
>
> WARN_ON(vcpu->arch.exception.pending);
>
> just for completeness.
>
>
> Also, nested_vmx_check_exception has to be moved from
> vmx_queue_exception to vmx_check_nested_events, so that
> nested_run_pending is checked correctly.  Something like this:
>
> if (vcpu->arch.exception.pending &&
> nested_vmx_check_exception(vcpu, _qual)) {
> if (vmx->nested.nested_run_pending)
> return -EBUSY;
>
> nested_vmx_inject_exception_vmexit(vcpu, exit_qual);
> return 0;
> }
>
> Maybe you can have:
>
> - patch 1 which I'll apply now
>
> - patch 2 is the same as this one, with only the above changes to
> inject_pending_event and the WARN_ON.
>
> - patch 3 moves the nested_vmx_inject_exception_vmexit call from
> nested_vmx_check_exception to vmx_queue_exception.
>
> - patch 4 moves the code to vmx_check_nested_events and add the
> nested_run_pending check, fixing the original bug.
>
> As to GET/SET_VCPU_EVENTS, for now I would not do any change.  Instead,
> we can do
>
> /*
>  * FIXME: pass injected and pending separately.  This is only
>  * needed for nested virtualization, whose state cannot be
>  * migrated yet.  For now we combine them just in case.
>  */
> events->exception.injected =
> (vcpu->arch.exception.pending ||
>  vcpu->arch.exception.injected) &&
> !kvm_exception_is_soft(vcpu->arch.exception.nr);
> }
>

I complete all the comments in v3, thanks for the review. :)

Regards,
Wanpeng Li

> and set_vcpu_events can just set .injected = false.  Separating the two
> will need a separate capability and KVM_ENABLE_CAP.
>
> Paolo
>
>>
>> diff --git a/tools/arch/x86/include/uapi/asm/kvm.h 
>> b/tools/arch/x86/include/uapi/asm/kvm.h
>> index c2824d0..7c55916 100644
>> --- a/tools/arch/x86/include/uapi/asm/kvm.h
>> +++ b/tools/arch/x86/include/uapi/asm/kvm.h
>> @@ -296,9 +296,9 @@ struct kvm_reinject_control {
>>  struct kvm_vcpu_events {
>>   struct {
>>   __u8 injected;
>> + __u8 pending;
>>   __u8 nr;
>>   __u8 has_error_code;
>> - __u8 pad;
>>   __u32 error_code;
>>   } exception;
>>   struct {
>
> I think you don't need t

Agreed.

[PATCH] usb:xhci:Fix regression when ATI chipsets detected

2017-08-23 Thread Sandeep Singh

From: Sandeep Singh 

The following commit cause a regression on ATI chipsets.
'commit e788787ef4f9 ("usb:xhci:Add quirk for Certain
failing HP keyboard on reset after resume")'

This causes pinfo->smbus_dev to be wrongly set to NULL on
systems with the ATI chipset that this function checks for first.

Added conditional check for AMD chipsets to avoid the overwriting
pinfo->smbus_dev.

Reported-by: Ben Hutchings 
Fixes: e788787ef4f9 ("usb:xhci:Add quirk for Certain
failing HP keyboard on reset after resume")
cc: Nehal Shah 
cc: 
Signed-off-by: Sandeep Singh 
Signed-off-by: Shyam Sundar S K 
---
 drivers/usb/host/pci-quirks.c | 35 ++-
 1 file changed, 18 insertions(+), 17 deletions(-)

diff --git a/drivers/usb/host/pci-quirks.c b/drivers/usb/host/pci-quirks.c
index 5f4ca78..58b9685 100644
--- a/drivers/usb/host/pci-quirks.c
+++ b/drivers/usb/host/pci-quirks.c
@@ -142,29 +142,30 @@ static int amd_chipset_sb_type_init(struct 
amd_chipset_info *pinfo)
pinfo->sb_type.gen = AMD_CHIPSET_SB700;
else if (rev >= 0x40 && rev <= 0x4f)
pinfo->sb_type.gen = AMD_CHIPSET_SB800;
-   }
-   pinfo->smbus_dev = pci_get_device(PCI_VENDOR_ID_AMD,
- 0x145c, NULL);
-   if (pinfo->smbus_dev) {
-   pinfo->sb_type.gen = AMD_CHIPSET_TAISHAN;
} else {
pinfo->smbus_dev = pci_get_device(PCI_VENDOR_ID_AMD,
PCI_DEVICE_ID_AMD_HUDSON2_SMBUS, NULL);
 
-   if (!pinfo->smbus_dev) {
-   pinfo->sb_type.gen = NOT_AMD_CHIPSET;
-   return 0;
+   if (pinfo->smbus_dev) {
+   rev = pinfo->smbus_dev->revision;
+   if (rev >= 0x11 && rev <= 0x14)
+   pinfo->sb_type.gen = AMD_CHIPSET_HUDSON2;
+   else if (rev >= 0x15 && rev <= 0x18)
+   pinfo->sb_type.gen = AMD_CHIPSET_BOLTON;
+   else if (rev >= 0x39 && rev <= 0x3a)
+   pinfo->sb_type.gen = AMD_CHIPSET_YANGTZE;
+   } else {
+   pinfo->smbus_dev = pci_get_device(PCI_VENDOR_ID_AMD,
+ 0x145c, NULL);
+   if (pinfo->smbus_dev) {
+   rev = pinfo->smbus_dev->revision;
+   pinfo->sb_type.gen = AMD_CHIPSET_TAISHAN;
+   } else {
+   pinfo->sb_type.gen = NOT_AMD_CHIPSET;
+   return 0;
+   }
}
-
-   rev = pinfo->smbus_dev->revision;
-   if (rev >= 0x11 && rev <= 0x14)
-   pinfo->sb_type.gen = AMD_CHIPSET_HUDSON2;
-   else if (rev >= 0x15 && rev <= 0x18)
-   pinfo->sb_type.gen = AMD_CHIPSET_BOLTON;
-   else if (rev >= 0x39 && rev <= 0x3a)
-   pinfo->sb_type.gen = AMD_CHIPSET_YANGTZE;
}
-
pinfo->sb_type.rev = rev;
return 1;
 }
-- 
2.7.4

[PATCH] usb:xhci:Fix regression when ATI chipsets detected

2017-08-23 Thread Sandeep Singh

From: Sandeep Singh 

The following commit cause a regression on ATI chipsets.
'commit e788787ef4f9 ("usb:xhci:Add quirk for Certain
failing HP keyboard on reset after resume")'

This causes pinfo->smbus_dev to be wrongly set to NULL on
systems with the ATI chipset that this function checks for first.

Added conditional check for AMD chipsets to avoid the overwriting
pinfo->smbus_dev.

Reported-by: Ben Hutchings 
Fixes: e788787ef4f9 ("usb:xhci:Add quirk for Certain
failing HP keyboard on reset after resume")
cc: Nehal Shah 
cc: 
Signed-off-by: Sandeep Singh 
Signed-off-by: Shyam Sundar S K 
---
 drivers/usb/host/pci-quirks.c | 35 ++-
 1 file changed, 18 insertions(+), 17 deletions(-)

diff --git a/drivers/usb/host/pci-quirks.c b/drivers/usb/host/pci-quirks.c
index 5f4ca78..58b9685 100644
--- a/drivers/usb/host/pci-quirks.c
+++ b/drivers/usb/host/pci-quirks.c
@@ -142,29 +142,30 @@ static int amd_chipset_sb_type_init(struct 
amd_chipset_info *pinfo)
pinfo->sb_type.gen = AMD_CHIPSET_SB700;
else if (rev >= 0x40 && rev <= 0x4f)
pinfo->sb_type.gen = AMD_CHIPSET_SB800;
-   }
-   pinfo->smbus_dev = pci_get_device(PCI_VENDOR_ID_AMD,
- 0x145c, NULL);
-   if (pinfo->smbus_dev) {
-   pinfo->sb_type.gen = AMD_CHIPSET_TAISHAN;
} else {
pinfo->smbus_dev = pci_get_device(PCI_VENDOR_ID_AMD,
PCI_DEVICE_ID_AMD_HUDSON2_SMBUS, NULL);
 
-   if (!pinfo->smbus_dev) {
-   pinfo->sb_type.gen = NOT_AMD_CHIPSET;
-   return 0;
+   if (pinfo->smbus_dev) {
+   rev = pinfo->smbus_dev->revision;
+   if (rev >= 0x11 && rev <= 0x14)
+   pinfo->sb_type.gen = AMD_CHIPSET_HUDSON2;
+   else if (rev >= 0x15 && rev <= 0x18)
+   pinfo->sb_type.gen = AMD_CHIPSET_BOLTON;
+   else if (rev >= 0x39 && rev <= 0x3a)
+   pinfo->sb_type.gen = AMD_CHIPSET_YANGTZE;
+   } else {
+   pinfo->smbus_dev = pci_get_device(PCI_VENDOR_ID_AMD,
+ 0x145c, NULL);
+   if (pinfo->smbus_dev) {
+   rev = pinfo->smbus_dev->revision;
+   pinfo->sb_type.gen = AMD_CHIPSET_TAISHAN;
+   } else {
+   pinfo->sb_type.gen = NOT_AMD_CHIPSET;
+   return 0;
+   }
}
-
-   rev = pinfo->smbus_dev->revision;
-   if (rev >= 0x11 && rev <= 0x14)
-   pinfo->sb_type.gen = AMD_CHIPSET_HUDSON2;
-   else if (rev >= 0x15 && rev <= 0x18)
-   pinfo->sb_type.gen = AMD_CHIPSET_BOLTON;
-   else if (rev >= 0x39 && rev <= 0x3a)
-   pinfo->sb_type.gen = AMD_CHIPSET_YANGTZE;
}
-
pinfo->sb_type.rev = rev;
return 1;
 }
-- 
2.7.4

Re: [PATCH 2/2] tracing: Fix kmemleak in set_trigger_filter

2017-08-23 Thread Chunyu Hu



- Original Message -
> From: "Steven Rostedt" 
> To: "Chunyu Hu" 
> Cc: mi...@kernel.org, linux-kernel@vger.kernel.org
> Sent: Thursday, August 24, 2017 10:15:41 AM
> Subject: Re: [PATCH 2/2] tracing: Fix kmemleak in set_trigger_filter
> 
> On Wed, 23 Aug 2017 18:58:03 -0400 (EDT)
> Chunyu Hu  wrote:
> 
> > - Original Message -
> > > From: "Steven Rostedt" 
> > > To: "Chunyu Hu" 
> > > Cc: mi...@kernel.org, linux-kernel@vger.kernel.org
> > > Sent: Wednesday, August 23, 2017 12:52:49 PM
> > > Subject: Re: [PATCH 2/2] tracing: Fix kmemleak in set_trigger_filter
> > > 
> > > On Wed, 23 Aug 2017 10:41:55 -0400
> > > Steven Rostedt  wrote:
> > >   
> > > >  * On success, returns 0 and *@filterp points to the new filter.  On
> > > >  * failure, returns -errno and *@filterp may point to %NULL or to a new
> > > >  * filter.  In the latter case, the returned filter contains error
> > > >  * information if @set_str is %true and the caller is responsible for
> > > >  * freeing it.
> > > > 
> > > > So filter contains an error string when it fails. It seems that we
> > > > should somehow propagate that up the chain to display. I'll look more
> > > > into this.
> > > 
> > > The bug is in create_filter(), because "set_str" is set to false, and
> > > the filter should not be passed back allocated on error.
> > 
> > Thanks for all the analysis. I think you are right. I'll try to have a test
> > on it
> > in case we miss something. But please don't block on my test.
> > 
> 
> My tests are almost done, but I wont send anything till tomorrow. I can
> wait a day to post. There's a few other changes I need to send to Linus
> as well.

Tested with your patch, I did not hit the leak issue and other kmemleak. 
Thanks.

> 
> -- Steve
> 

-- 
Regards,
Chunyu Hu

Re: [PATCH 2/2] tracing: Fix kmemleak in set_trigger_filter

2017-08-23 Thread Chunyu Hu



- Original Message -
> From: "Steven Rostedt" 
> To: "Chunyu Hu" 
> Cc: mi...@kernel.org, linux-kernel@vger.kernel.org
> Sent: Thursday, August 24, 2017 10:15:41 AM
> Subject: Re: [PATCH 2/2] tracing: Fix kmemleak in set_trigger_filter
> 
> On Wed, 23 Aug 2017 18:58:03 -0400 (EDT)
> Chunyu Hu  wrote:
> 
> > - Original Message -
> > > From: "Steven Rostedt" 
> > > To: "Chunyu Hu" 
> > > Cc: mi...@kernel.org, linux-kernel@vger.kernel.org
> > > Sent: Wednesday, August 23, 2017 12:52:49 PM
> > > Subject: Re: [PATCH 2/2] tracing: Fix kmemleak in set_trigger_filter
> > > 
> > > On Wed, 23 Aug 2017 10:41:55 -0400
> > > Steven Rostedt  wrote:
> > >   
> > > >  * On success, returns 0 and *@filterp points to the new filter.  On
> > > >  * failure, returns -errno and *@filterp may point to %NULL or to a new
> > > >  * filter.  In the latter case, the returned filter contains error
> > > >  * information if @set_str is %true and the caller is responsible for
> > > >  * freeing it.
> > > > 
> > > > So filter contains an error string when it fails. It seems that we
> > > > should somehow propagate that up the chain to display. I'll look more
> > > > into this.
> > > 
> > > The bug is in create_filter(), because "set_str" is set to false, and
> > > the filter should not be passed back allocated on error.
> > 
> > Thanks for all the analysis. I think you are right. I'll try to have a test
> > on it
> > in case we miss something. But please don't block on my test.
> > 
> 
> My tests are almost done, but I wont send anything till tomorrow. I can
> wait a day to post. There's a few other changes I need to send to Linus
> as well.

Tested with your patch, I did not hit the leak issue and other kmemleak. 
Thanks.

> 
> -- Steve
> 

-- 
Regards,
Chunyu Hu

Re: [PATCH 0/1] devpts: use dynamic_dname() to generate proc name

2017-08-23 Thread Stefan Lippers-Hollmann

Hi

On 2017-08-23, Eric W. Biederman wrote:
> Linus Torvalds  writes:
> > On Wed, Aug 23, 2017 at 6:49 PM, Linus Torvalds 
> >  wrote:  
[...]
> This is so far untested (except for compiling) but I think this will
> work.
> 
> I factor out devpts_ptmx_path out of devpts_acquire so the code
> doesn't have to do unnecessary and confusing work, and add the
> new function devpts_mnt.
> 
> I revert the code to keep anything except a dentry in
> tty->link->driver_data.
> 
> And reduce the peer opening to a single function ptm_open_peer.
> 
> It takes lines of code but the result is very straightforward code.

I've given this a quick test, while it seems to fix the initial problem
with umounting /dev/ptmx, it does introduce a new one - trying to open 
an xterm (KDE5's konsole to be exact) doesn't open a shell (the shell 
window remains totally empty) and trying to ssh into the system fails 
with "PTY allocation request failed on channel 0", logging in via a 
real tty and creating a new pbuilder chroot from there succeeds.

Regards
Stefan Lippers-Hollmann

pgpI8qaz3RlGW.pgp
Description: Digitale Signatur von OpenPGP

Re: [PATCH 0/1] devpts: use dynamic_dname() to generate proc name

2017-08-23 Thread Stefan Lippers-Hollmann

Hi

On 2017-08-23, Eric W. Biederman wrote:
> Linus Torvalds  writes:
> > On Wed, Aug 23, 2017 at 6:49 PM, Linus Torvalds 
> >  wrote:  
[...]
> This is so far untested (except for compiling) but I think this will
> work.
> 
> I factor out devpts_ptmx_path out of devpts_acquire so the code
> doesn't have to do unnecessary and confusing work, and add the
> new function devpts_mnt.
> 
> I revert the code to keep anything except a dentry in
> tty->link->driver_data.
> 
> And reduce the peer opening to a single function ptm_open_peer.
> 
> It takes lines of code but the result is very straightforward code.

I've given this a quick test, while it seems to fix the initial problem
with umounting /dev/ptmx, it does introduce a new one - trying to open 
an xterm (KDE5's konsole to be exact) doesn't open a shell (the shell 
window remains totally empty) and trying to ssh into the system fails 
with "PTY allocation request failed on channel 0", logging in via a 
real tty and creating a new pbuilder chroot from there succeeds.

Regards
Stefan Lippers-Hollmann

pgpI8qaz3RlGW.pgp
Description: Digitale Signatur von OpenPGP

[PATCH v3 2/4] KVM: X86: Fix loss of exception which has not yet injected

2017-08-23 Thread Wanpeng Li

From: Wanpeng Li 

vmx_complete_interrupts() assumes that the exception is always injected, 
so it would be dropped by kvm_clear_exception_queue(). This patch separates 
exception.pending from exception.injected, exception.inject represents the 
exception is injected or the exception should be reinjected due to vmexit 
occurs during event delivery in VMX non-root operation. exception.pending 
represents the exception is queued and will be cleared when injecting the 
exception to the guest. So exception.pending and exception.injected can 
cooperate to guarantee exception will not be lost.

Reported-by: Radim Krčmář 
Cc: Paolo Bonzini 
Cc: Radim Krčmář 
Signed-off-by: Wanpeng Li 
---
v2 -> v3:
 * the changes to inject_pending_event and adds the WARN_ON
 * combine injected and pending exception for GET/SET_VCPU_EVENTS

 arch/x86/include/asm/kvm_host.h |  2 +-
 arch/x86/kvm/svm.c  |  2 +-
 arch/x86/kvm/vmx.c  |  4 +--
 arch/x86/kvm/x86.c  | 56 ++---
 arch/x86/kvm/x86.h  |  4 +--
 5 files changed, 42 insertions(+), 26 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 9d90787..6e385ac 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -547,8 +547,8 @@ struct kvm_vcpu_arch {
 
struct kvm_queued_exception {
bool pending;
+   bool injected;
bool has_error_code;
-   bool reinject;
u8 nr;
u32 error_code;
u8 nested_apf;
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index a2fddce..6a439a1 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -655,7 +655,7 @@ static void svm_queue_exception(struct kvm_vcpu *vcpu)
struct vcpu_svm *svm = to_svm(vcpu);
unsigned nr = vcpu->arch.exception.nr;
bool has_error_code = vcpu->arch.exception.has_error_code;
-   bool reinject = vcpu->arch.exception.reinject;
+   bool reinject = vcpu->arch.exception.injected;
u32 error_code = vcpu->arch.exception.error_code;
 
/*
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index c5f43ab..902b780 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2503,7 +2503,7 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu)
struct vcpu_vmx *vmx = to_vmx(vcpu);
unsigned nr = vcpu->arch.exception.nr;
bool has_error_code = vcpu->arch.exception.has_error_code;
-   bool reinject = vcpu->arch.exception.reinject;
+   bool reinject = vcpu->arch.exception.injected;
u32 error_code = vcpu->arch.exception.error_code;
u32 intr_info = nr | INTR_INFO_VALID_MASK;
 
@@ -10948,7 +10948,7 @@ static void vmcs12_save_pending_event(struct kvm_vcpu 
*vcpu,
u32 idt_vectoring;
unsigned int nr;
 
-   if (vcpu->arch.exception.pending && vcpu->arch.exception.reinject) {
+   if (vcpu->arch.exception.injected) {
nr = vcpu->arch.exception.nr;
idt_vectoring = nr | VECTORING_INFO_VALID_MASK;
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8f41b88..b698b2f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -389,15 +389,18 @@ static void kvm_multiple_exception(struct kvm_vcpu *vcpu,
 
kvm_make_request(KVM_REQ_EVENT, vcpu);
 
-   if (!vcpu->arch.exception.pending) {
+   if (!vcpu->arch.exception.pending ||
+   !vcpu->arch.exception.injected) {
queue:
if (has_error && !is_protmode(vcpu))
has_error = false;
-   vcpu->arch.exception.pending = true;
+   if (reinject)
+   vcpu->arch.exception.injected = true;
+   else
+   vcpu->arch.exception.pending = true;
vcpu->arch.exception.has_error_code = has_error;
vcpu->arch.exception.nr = nr;
vcpu->arch.exception.error_code = error_code;
-   vcpu->arch.exception.reinject = reinject;
return;
}
 
@@ -3069,8 +3072,14 @@ static void kvm_vcpu_ioctl_x86_get_vcpu_events(struct 
kvm_vcpu *vcpu,
   struct kvm_vcpu_events *events)
 {
process_nmi(vcpu);
+   /*
+* FIXME: pass injected and pending separately.  This is only
+* needed for nested virtualization, whose state cannot be
+* migrated yet.  For now we combine them just in case.
+*/
events->exception.injected =
-   vcpu->arch.exception.pending &&
+   (vcpu->arch.exception.pending ||
+vcpu->arch.exception.injected) &&
!kvm_exception_is_soft(vcpu->arch.exception.nr);
events->exception.nr = vcpu->arch.exception.nr;

[PATCH v3 4/4] KVM: nVMX: Fix trying to cancel vmlauch/vmresume

2017-08-23 Thread Wanpeng Li

From: Wanpeng Li 

[ cut here ]
WARNING: CPU: 7 PID: 3861 at /home/kernel/ssd/kvm/arch/x86/kvm//vmx.c:11299 
nested_vmx_vmexit+0x176e/0x1980 [kvm_intel]
CPU: 7 PID: 3861 Comm: qemu-system-x86 Tainted: GW  OE   4.13.0-rc4+ #11
RIP: 0010:nested_vmx_vmexit+0x176e/0x1980 [kvm_intel]
Call Trace:
 ? kvm_multiple_exception+0x149/0x170 [kvm]
 ? handle_emulation_failure+0x79/0x230 [kvm]
 ? load_vmcs12_host_state+0xa80/0xa80 [kvm_intel]
 ? check_chain_key+0x137/0x1e0
 ? reexecute_instruction.part.168+0x130/0x130 [kvm]
 nested_vmx_inject_exception_vmexit+0xb7/0x100 [kvm_intel]
 ? nested_vmx_inject_exception_vmexit+0xb7/0x100 [kvm_intel]
 vmx_queue_exception+0x197/0x300 [kvm_intel]
 kvm_arch_vcpu_ioctl_run+0x1b0c/0x2c90 [kvm]
 ? kvm_arch_vcpu_runnable+0x220/0x220 [kvm]
 ? preempt_count_sub+0x18/0xc0
 ? restart_apic_timer+0x17d/0x300 [kvm]
 ? kvm_lapic_restart_hv_timer+0x37/0x50 [kvm]
 ? kvm_arch_vcpu_load+0x1d8/0x350 [kvm]
 kvm_vcpu_ioctl+0x4e4/0x910 [kvm]
 ? kvm_vcpu_ioctl+0x4e4/0x910 [kvm]
 ? kvm_dev_ioctl+0xbe0/0xbe0 [kvm]

The flag "nested_run_pending", which can override the decision of which should 
run 
next, L1 or L2. nested_run_pending=1 means that we *must* run L2 next, not L1. 
This 
is necessary in particular when L1 did a VMLAUNCH of L2 and therefore expects 
L2 to 
be run (and perhaps be injected with an event it specified, etc.). 
Nested_run_pending 
is especially intended to avoid switching  to L1 in the injection 
decision-point.

I catch this in the queue exception path, this patch fixes it by requesting 
an immediate VM exit from L2 and keeping the exception for L1 pending for a 
subsequent nested VM exit.

Cc: Paolo Bonzini 
Cc: Radim Krčmář 
Signed-off-by: Wanpeng Li 
---
 arch/x86/kvm/vmx.c | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 21760b8..6f88a79 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2501,16 +2501,8 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu)
struct vcpu_vmx *vmx = to_vmx(vcpu);
unsigned nr = vcpu->arch.exception.nr;
bool has_error_code = vcpu->arch.exception.has_error_code;
-   bool reinject = vcpu->arch.exception.injected;
u32 error_code = vcpu->arch.exception.error_code;
u32 intr_info = nr | INTR_INFO_VALID_MASK;
-   unsigned long exit_qual;
-
-   if (!reinject && is_guest_mode(vcpu) &&
-   nested_vmx_check_exception(vcpu, _qual)) {
-   nested_vmx_inject_exception_vmexit(vcpu, exit_qual);
-   return;
-   }
 
if (has_error_code) {
vmcs_write32(VM_ENTRY_EXCEPTION_ERROR_CODE, error_code);
@@ -10988,10 +10980,20 @@ static void vmcs12_save_pending_event(struct kvm_vcpu 
*vcpu,
 static int vmx_check_nested_events(struct kvm_vcpu *vcpu, bool external_intr)
 {
struct vcpu_vmx *vmx = to_vmx(vcpu);
+   unsigned long exit_qual;
 
if (kvm_event_needs_reinjection(vcpu))
return -EBUSY;
 
+   if (vcpu->arch.exception.pending &&
+   nested_vmx_check_exception(vcpu, _qual)) {
+   if (vmx->nested.nested_run_pending)
+   return -EBUSY;
+
+   nested_vmx_inject_exception_vmexit(vcpu, exit_qual);
+   return 0;
+}
+
if (nested_cpu_has_preemption_timer(get_vmcs12(vcpu)) &&
vmx->nested.preemption_timer_expired) {
if (vmx->nested.nested_run_pending)
-- 
2.7.4

[PATCH v3 2/4] KVM: X86: Fix loss of exception which has not yet injected

2017-08-23 Thread Wanpeng Li

From: Wanpeng Li 

vmx_complete_interrupts() assumes that the exception is always injected, 
so it would be dropped by kvm_clear_exception_queue(). This patch separates 
exception.pending from exception.injected, exception.inject represents the 
exception is injected or the exception should be reinjected due to vmexit 
occurs during event delivery in VMX non-root operation. exception.pending 
represents the exception is queued and will be cleared when injecting the 
exception to the guest. So exception.pending and exception.injected can 
cooperate to guarantee exception will not be lost.

Reported-by: Radim Krčmář 
Cc: Paolo Bonzini 
Cc: Radim Krčmář 
Signed-off-by: Wanpeng Li 
---
v2 -> v3:
 * the changes to inject_pending_event and adds the WARN_ON
 * combine injected and pending exception for GET/SET_VCPU_EVENTS

 arch/x86/include/asm/kvm_host.h |  2 +-
 arch/x86/kvm/svm.c  |  2 +-
 arch/x86/kvm/vmx.c  |  4 +--
 arch/x86/kvm/x86.c  | 56 ++---
 arch/x86/kvm/x86.h  |  4 +--
 5 files changed, 42 insertions(+), 26 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 9d90787..6e385ac 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -547,8 +547,8 @@ struct kvm_vcpu_arch {
 
struct kvm_queued_exception {
bool pending;
+   bool injected;
bool has_error_code;
-   bool reinject;
u8 nr;
u32 error_code;
u8 nested_apf;
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index a2fddce..6a439a1 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -655,7 +655,7 @@ static void svm_queue_exception(struct kvm_vcpu *vcpu)
struct vcpu_svm *svm = to_svm(vcpu);
unsigned nr = vcpu->arch.exception.nr;
bool has_error_code = vcpu->arch.exception.has_error_code;
-   bool reinject = vcpu->arch.exception.reinject;
+   bool reinject = vcpu->arch.exception.injected;
u32 error_code = vcpu->arch.exception.error_code;
 
/*
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index c5f43ab..902b780 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2503,7 +2503,7 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu)
struct vcpu_vmx *vmx = to_vmx(vcpu);
unsigned nr = vcpu->arch.exception.nr;
bool has_error_code = vcpu->arch.exception.has_error_code;
-   bool reinject = vcpu->arch.exception.reinject;
+   bool reinject = vcpu->arch.exception.injected;
u32 error_code = vcpu->arch.exception.error_code;
u32 intr_info = nr | INTR_INFO_VALID_MASK;
 
@@ -10948,7 +10948,7 @@ static void vmcs12_save_pending_event(struct kvm_vcpu 
*vcpu,
u32 idt_vectoring;
unsigned int nr;
 
-   if (vcpu->arch.exception.pending && vcpu->arch.exception.reinject) {
+   if (vcpu->arch.exception.injected) {
nr = vcpu->arch.exception.nr;
idt_vectoring = nr | VECTORING_INFO_VALID_MASK;
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8f41b88..b698b2f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -389,15 +389,18 @@ static void kvm_multiple_exception(struct kvm_vcpu *vcpu,
 
kvm_make_request(KVM_REQ_EVENT, vcpu);
 
-   if (!vcpu->arch.exception.pending) {
+   if (!vcpu->arch.exception.pending ||
+   !vcpu->arch.exception.injected) {
queue:
if (has_error && !is_protmode(vcpu))
has_error = false;
-   vcpu->arch.exception.pending = true;
+   if (reinject)
+   vcpu->arch.exception.injected = true;
+   else
+   vcpu->arch.exception.pending = true;
vcpu->arch.exception.has_error_code = has_error;
vcpu->arch.exception.nr = nr;
vcpu->arch.exception.error_code = error_code;
-   vcpu->arch.exception.reinject = reinject;
return;
}
 
@@ -3069,8 +3072,14 @@ static void kvm_vcpu_ioctl_x86_get_vcpu_events(struct 
kvm_vcpu *vcpu,
   struct kvm_vcpu_events *events)
 {
process_nmi(vcpu);
+   /*
+* FIXME: pass injected and pending separately.  This is only
+* needed for nested virtualization, whose state cannot be
+* migrated yet.  For now we combine them just in case.
+*/
events->exception.injected =
-   vcpu->arch.exception.pending &&
+   (vcpu->arch.exception.pending ||
+vcpu->arch.exception.injected) &&
!kvm_exception_is_soft(vcpu->arch.exception.nr);
events->exception.nr = vcpu->arch.exception.nr;
events->exception.has_error_code = vcpu->arch.exception.has_error_code;
@@ -3125,6 +3134,7 @@ static int

[PATCH v3 4/4] KVM: nVMX: Fix trying to cancel vmlauch/vmresume

2017-08-23 Thread Wanpeng Li

From: Wanpeng Li 

[ cut here ]
WARNING: CPU: 7 PID: 3861 at /home/kernel/ssd/kvm/arch/x86/kvm//vmx.c:11299 
nested_vmx_vmexit+0x176e/0x1980 [kvm_intel]
CPU: 7 PID: 3861 Comm: qemu-system-x86 Tainted: GW  OE   4.13.0-rc4+ #11
RIP: 0010:nested_vmx_vmexit+0x176e/0x1980 [kvm_intel]
Call Trace:
 ? kvm_multiple_exception+0x149/0x170 [kvm]
 ? handle_emulation_failure+0x79/0x230 [kvm]
 ? load_vmcs12_host_state+0xa80/0xa80 [kvm_intel]
 ? check_chain_key+0x137/0x1e0
 ? reexecute_instruction.part.168+0x130/0x130 [kvm]
 nested_vmx_inject_exception_vmexit+0xb7/0x100 [kvm_intel]
 ? nested_vmx_inject_exception_vmexit+0xb7/0x100 [kvm_intel]
 vmx_queue_exception+0x197/0x300 [kvm_intel]
 kvm_arch_vcpu_ioctl_run+0x1b0c/0x2c90 [kvm]
 ? kvm_arch_vcpu_runnable+0x220/0x220 [kvm]
 ? preempt_count_sub+0x18/0xc0
 ? restart_apic_timer+0x17d/0x300 [kvm]
 ? kvm_lapic_restart_hv_timer+0x37/0x50 [kvm]
 ? kvm_arch_vcpu_load+0x1d8/0x350 [kvm]
 kvm_vcpu_ioctl+0x4e4/0x910 [kvm]
 ? kvm_vcpu_ioctl+0x4e4/0x910 [kvm]
 ? kvm_dev_ioctl+0xbe0/0xbe0 [kvm]

The flag "nested_run_pending", which can override the decision of which should 
run 
next, L1 or L2. nested_run_pending=1 means that we *must* run L2 next, not L1. 
This 
is necessary in particular when L1 did a VMLAUNCH of L2 and therefore expects 
L2 to 
be run (and perhaps be injected with an event it specified, etc.). 
Nested_run_pending 
is especially intended to avoid switching  to L1 in the injection 
decision-point.

I catch this in the queue exception path, this patch fixes it by requesting 
an immediate VM exit from L2 and keeping the exception for L1 pending for a 
subsequent nested VM exit.

Cc: Paolo Bonzini 
Cc: Radim Krčmář 
Signed-off-by: Wanpeng Li 
---
 arch/x86/kvm/vmx.c | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 21760b8..6f88a79 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2501,16 +2501,8 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu)
struct vcpu_vmx *vmx = to_vmx(vcpu);
unsigned nr = vcpu->arch.exception.nr;
bool has_error_code = vcpu->arch.exception.has_error_code;
-   bool reinject = vcpu->arch.exception.injected;
u32 error_code = vcpu->arch.exception.error_code;
u32 intr_info = nr | INTR_INFO_VALID_MASK;
-   unsigned long exit_qual;
-
-   if (!reinject && is_guest_mode(vcpu) &&
-   nested_vmx_check_exception(vcpu, _qual)) {
-   nested_vmx_inject_exception_vmexit(vcpu, exit_qual);
-   return;
-   }
 
if (has_error_code) {
vmcs_write32(VM_ENTRY_EXCEPTION_ERROR_CODE, error_code);
@@ -10988,10 +10980,20 @@ static void vmcs12_save_pending_event(struct kvm_vcpu 
*vcpu,
 static int vmx_check_nested_events(struct kvm_vcpu *vcpu, bool external_intr)
 {
struct vcpu_vmx *vmx = to_vmx(vcpu);
+   unsigned long exit_qual;
 
if (kvm_event_needs_reinjection(vcpu))
return -EBUSY;
 
+   if (vcpu->arch.exception.pending &&
+   nested_vmx_check_exception(vcpu, _qual)) {
+   if (vmx->nested.nested_run_pending)
+   return -EBUSY;
+
+   nested_vmx_inject_exception_vmexit(vcpu, exit_qual);
+   return 0;
+}
+
if (nested_cpu_has_preemption_timer(get_vmcs12(vcpu)) &&
vmx->nested.preemption_timer_expired) {
if (vmx->nested.nested_run_pending)
-- 
2.7.4

[PATCH v3 1/4] KVM: VMX: use kvm_event_needs_reinjection

2017-08-23 Thread Wanpeng Li

From: Wanpeng Li 

Use kvm_event_needs_reinjection() encapsulation.

Cc: Paolo Bonzini 
Cc: Radim Krčmář 
Signed-off-by: Wanpeng Li 
---
 arch/x86/kvm/vmx.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index dd710d3..c5f43ab 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -10988,9 +10988,7 @@ static int vmx_check_nested_events(struct kvm_vcpu 
*vcpu, bool external_intr)
 {
struct vcpu_vmx *vmx = to_vmx(vcpu);
 
-   if (vcpu->arch.exception.pending ||
-   vcpu->arch.nmi_injected ||
-   vcpu->arch.interrupt.pending)
+   if (kvm_event_needs_reinjection(vcpu))
return -EBUSY;
 
if (nested_cpu_has_preemption_timer(get_vmcs12(vcpu)) &&
-- 
2.7.4

[PATCH v3 3/4] KVM: VMX: Move the nested_vmx_inject_exception_vmexit call from nested_vmx_check_exception to vmx_queue_exception

2017-08-23 Thread Wanpeng Li

From: Wanpeng Li 

Move the nested_vmx_inject_exception_vmexit call from 
nested_vmx_check_exception 
to vmx_queue_exception. 

Signed-off-by: Wanpeng Li 
---
 arch/x86/kvm/vmx.c | 21 +++--
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 902b780..21760b8 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2459,15 +2459,14 @@ static void nested_vmx_inject_exception_vmexit(struct 
kvm_vcpu *vcpu,
  * KVM wants to inject page-faults which it got to the guest. This function
  * checks whether in a nested guest, we need to inject them to L1 or L2.
  */
-static int nested_vmx_check_exception(struct kvm_vcpu *vcpu)
+static int nested_vmx_check_exception(struct kvm_vcpu *vcpu, unsigned long 
*exit_qual)
 {
struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
unsigned int nr = vcpu->arch.exception.nr;
 
if (nr == PF_VECTOR) {
if (vcpu->arch.exception.nested_apf) {
-   nested_vmx_inject_exception_vmexit(vcpu,
-  
vcpu->arch.apf.nested_apf_token);
+   *exit_qual = vcpu->arch.apf.nested_apf_token;
return 1;
}
/*
@@ -2481,16 +2480,15 @@ static int nested_vmx_check_exception(struct kvm_vcpu 
*vcpu)
 */
if (nested_vmx_is_page_fault_vmexit(vmcs12,

vcpu->arch.exception.error_code)) {
-   nested_vmx_inject_exception_vmexit(vcpu, 
vcpu->arch.cr2);
+   *exit_qual = vcpu->arch.cr2;
return 1;
}
} else {
-   unsigned long exit_qual = 0;
-   if (nr == DB_VECTOR)
-   exit_qual = vcpu->arch.dr6;
-
if (vmcs12->exception_bitmap & (1u << nr)) {
-   nested_vmx_inject_exception_vmexit(vcpu, exit_qual);
+   if (nr == DB_VECTOR)
+   *exit_qual = vcpu->arch.dr6;
+   else
+   *exit_qual = 0;
return 1;
}
}
@@ -2506,10 +2504,13 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu)
bool reinject = vcpu->arch.exception.injected;
u32 error_code = vcpu->arch.exception.error_code;
u32 intr_info = nr | INTR_INFO_VALID_MASK;
+   unsigned long exit_qual;
 
if (!reinject && is_guest_mode(vcpu) &&
-   nested_vmx_check_exception(vcpu))
+   nested_vmx_check_exception(vcpu, _qual)) {
+   nested_vmx_inject_exception_vmexit(vcpu, exit_qual);
return;
+   }
 
if (has_error_code) {
vmcs_write32(VM_ENTRY_EXCEPTION_ERROR_CODE, error_code);
-- 
2.7.4

1 2 3 4 5 6 7 8 9 10 >

1 - 100 of 1688 matches

Mail list logo