Re: linux-next: build warning after merge of the scsi-mkp tree
Hi all, [Just adding Dick and James to the cc list] On Thu, 24 Aug 2017 15:57:56 +1000 Stephen Rothwellwrote: > > After merging the scsi-mkp tree, today's linux-next build (powerpc > ppc64_defconfig) produced this warning: > > drivers/scsi/lpfc/lpfc_nvmet.c:1457:1: warning: > 'lpfc_nvmet_replenish_context' defined but not used [-Wunused-function] > lpfc_nvmet_replenish_context(struct lpfc_hba *phba, > ^ > > Introduced by commit > > 9a38e4f1c82f ("scsi: lpfc: Fix MRQ > 1 context list handling") -- Cheers, Stephen Rothwell
Re: linux-next: build warning after merge of the scsi-mkp tree
Hi all, [Just adding Dick and James to the cc list] On Thu, 24 Aug 2017 15:57:56 +1000 Stephen Rothwell wrote: > > After merging the scsi-mkp tree, today's linux-next build (powerpc > ppc64_defconfig) produced this warning: > > drivers/scsi/lpfc/lpfc_nvmet.c:1457:1: warning: > 'lpfc_nvmet_replenish_context' defined but not used [-Wunused-function] > lpfc_nvmet_replenish_context(struct lpfc_hba *phba, > ^ > > Introduced by commit > > 9a38e4f1c82f ("scsi: lpfc: Fix MRQ > 1 context list handling") -- Cheers, Stephen Rothwell
linux-next: build warning after merge of the scsi-mkp tree
Hi Martin, After merging the scsi-mkp tree, today's linux-next build (powerpc ppc64_defconfig) produced this warning: drivers/scsi/lpfc/lpfc_nvmet.c:1457:1: warning: 'lpfc_nvmet_replenish_context' defined but not used [-Wunused-function] lpfc_nvmet_replenish_context(struct lpfc_hba *phba, ^ Introduced by commit 9a38e4f1c82f ("scsi: lpfc: Fix MRQ > 1 context list handling") -- Cheers, Stephen Rothwell
linux-next: build warning after merge of the scsi-mkp tree
Hi Martin, After merging the scsi-mkp tree, today's linux-next build (powerpc ppc64_defconfig) produced this warning: drivers/scsi/lpfc/lpfc_nvmet.c:1457:1: warning: 'lpfc_nvmet_replenish_context' defined but not used [-Wunused-function] lpfc_nvmet_replenish_context(struct lpfc_hba *phba, ^ Introduced by commit 9a38e4f1c82f ("scsi: lpfc: Fix MRQ > 1 context list handling") -- Cheers, Stephen Rothwell
Re: [PATCH] tty: xilinx_uartps: move to arch_initcall for earlier console
On 23.8.2017 12:55, Shubhrajyoti Datta wrote: > Signed-off-by: Shubhrajyoti DattaEmpty commit message? What's the reason for this change? M > --- > drivers/tty/serial/xilinx_uartps.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/tty/serial/xilinx_uartps.c > b/drivers/tty/serial/xilinx_uartps.c > index ff1b115..a239343 100644 > --- a/drivers/tty/serial/xilinx_uartps.c > +++ b/drivers/tty/serial/xilinx_uartps.c > @@ -1647,7 +1647,7 @@ static void __exit cdns_uart_exit(void) > uart_unregister_driver(_uart_uart_driver); > } > > -module_init(cdns_uart_init); > +arch_initcall(cdns_uart_init); > module_exit(cdns_uart_exit); > > MODULE_DESCRIPTION("Driver for Cadence UART"); >
Re: [PATCH] tty: xilinx_uartps: move to arch_initcall for earlier console
On 23.8.2017 12:55, Shubhrajyoti Datta wrote: > Signed-off-by: Shubhrajyoti Datta Empty commit message? What's the reason for this change? M > --- > drivers/tty/serial/xilinx_uartps.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/tty/serial/xilinx_uartps.c > b/drivers/tty/serial/xilinx_uartps.c > index ff1b115..a239343 100644 > --- a/drivers/tty/serial/xilinx_uartps.c > +++ b/drivers/tty/serial/xilinx_uartps.c > @@ -1647,7 +1647,7 @@ static void __exit cdns_uart_exit(void) > uart_unregister_driver(_uart_uart_driver); > } > > -module_init(cdns_uart_init); > +arch_initcall(cdns_uart_init); > module_exit(cdns_uart_exit); > > MODULE_DESCRIPTION("Driver for Cadence UART"); >
Re: [PATCH v3 2/5] dmaengine: Add STM32 DMAMUX driver
On 2017-08-21 12:34, Pierre Yves MORDRET wrote: > OK. I will redesign my driver to take into account this idea. > > I believe I should get rid of my custom API in DMA for channelID as well. > Please > confirm. Not very clear for me whether I can keep it or not. Yes, you should be able to get rid of any custom API. The DMA event router should be 'invisible' to both the DMA driver itself and for the DMA users as well. If you end up still needing custom API (which I doubt) then it is better to extend the core support to cover that in a generic way since it is likely that other might have similar needs. You need to identify what you need to manage with the DMA router, again in my cases: am335x/am437x: The DMA channel is fixed, but the DMA event to be handled by the channel (DMA event muxer) is selected by the router. dra7x: The DMA event is the fixed part and I needed to translate that to local eDMA/sDMA events and in eDMA I needed to pick a channel based on the translated event. The same router code works for eDMA and sDMA in dra7, while the two DMA engines are different in their internal works. At the end of the day I only needed to plug the DMA event routers and there is no change in the DMA drivers at all. I'm happy to answer any questions you might have, if I can. - Péter
Re: [PATCH v3 2/5] dmaengine: Add STM32 DMAMUX driver
On 2017-08-21 12:34, Pierre Yves MORDRET wrote: > OK. I will redesign my driver to take into account this idea. > > I believe I should get rid of my custom API in DMA for channelID as well. > Please > confirm. Not very clear for me whether I can keep it or not. Yes, you should be able to get rid of any custom API. The DMA event router should be 'invisible' to both the DMA driver itself and for the DMA users as well. If you end up still needing custom API (which I doubt) then it is better to extend the core support to cover that in a generic way since it is likely that other might have similar needs. You need to identify what you need to manage with the DMA router, again in my cases: am335x/am437x: The DMA channel is fixed, but the DMA event to be handled by the channel (DMA event muxer) is selected by the router. dra7x: The DMA event is the fixed part and I needed to translate that to local eDMA/sDMA events and in eDMA I needed to pick a channel based on the translated event. The same router code works for eDMA and sDMA in dra7, while the two DMA engines are different in their internal works. At the end of the day I only needed to plug the DMA event routers and there is no change in the DMA drivers at all. I'm happy to answer any questions you might have, if I can. - Péter
[RFC 3/3] sched: depend on 64BIT_ATOMIC_ACCESS to determine if to use min_vruntime_copy
'min_vruntime_copy' is copied when 'min_vruntime' is updated for cfq_rq and used to check if updating 'min_vruntime' is completed on reader side. Because 'min_vruntime' variable is 64bit, we need a mimic of seqlock to check if the variable is not being updated on 32bit machines. On 64BIT_ATOMIC_ACCESS enabled machines, 64bit accesses are atomic even though the machines are 32bit, so we can directly access 'min_vruntime' on the architectures. Depend on CONFIG_64BIT_ATOMIC_ACCESS instead of CONFIG_64BIT to determine whether 'min_vruntime_copy' variable is used for synchronization or not. And align 'min_vruntime' by 8 if 64BIT_ATOMIC_ALIGNED_ACCESS is true because 64BIT_ATOMIC_ALIGNED_ACCESS enabled system can access the variable atomically only when it' aligned. Signed-off-by: Hoeun Ryu--- kernel/sched/fair.c | 6 +++--- kernel/sched/sched.h | 6 +- 2 files changed, 8 insertions(+), 4 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index c95880e..840658f 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -536,7 +536,7 @@ static void update_min_vruntime(struct cfs_rq *cfs_rq) /* ensure we never gain time by being placed backwards. */ cfs_rq->min_vruntime = max_vruntime(cfs_rq->min_vruntime, vruntime); -#ifndef CONFIG_64BIT +#ifndef CONFIG_64BIT_ATOMIC_ACCESS smp_wmb(); cfs_rq->min_vruntime_copy = cfs_rq->min_vruntime; #endif @@ -5975,7 +5975,7 @@ static void migrate_task_rq_fair(struct task_struct *p) struct cfs_rq *cfs_rq = cfs_rq_of(se); u64 min_vruntime; -#ifndef CONFIG_64BIT +#ifndef CONFIG_64BIT_ATOMIC_ACCESS u64 min_vruntime_copy; do { @@ -9173,7 +9173,7 @@ void init_cfs_rq(struct cfs_rq *cfs_rq) { cfs_rq->tasks_timeline = RB_ROOT; cfs_rq->min_vruntime = (u64)(-(1LL << 20)); -#ifndef CONFIG_64BIT +#ifndef CONFIG_64BIT_ATOMIC_ACCESS cfs_rq->min_vruntime_copy = cfs_rq->min_vruntime; #endif #ifdef CONFIG_SMP diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index eeef1a3..870010b 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -421,8 +421,12 @@ struct cfs_rq { unsigned int nr_running, h_nr_running; u64 exec_clock; +#ifndef CONFIG_64BIT_ATOMIC_ALIGNED_ACCESS u64 min_vruntime; -#ifndef CONFIG_64BIT +#else + u64 min_vruntime __attribute__((aligned(sizeof(u64; +#endif +#ifndef CONFIG_64BIT_ATOMIC_ACCESS u64 min_vruntime_copy; #endif -- 2.7.4
[RFC 3/3] sched: depend on 64BIT_ATOMIC_ACCESS to determine if to use min_vruntime_copy
'min_vruntime_copy' is copied when 'min_vruntime' is updated for cfq_rq and used to check if updating 'min_vruntime' is completed on reader side. Because 'min_vruntime' variable is 64bit, we need a mimic of seqlock to check if the variable is not being updated on 32bit machines. On 64BIT_ATOMIC_ACCESS enabled machines, 64bit accesses are atomic even though the machines are 32bit, so we can directly access 'min_vruntime' on the architectures. Depend on CONFIG_64BIT_ATOMIC_ACCESS instead of CONFIG_64BIT to determine whether 'min_vruntime_copy' variable is used for synchronization or not. And align 'min_vruntime' by 8 if 64BIT_ATOMIC_ALIGNED_ACCESS is true because 64BIT_ATOMIC_ALIGNED_ACCESS enabled system can access the variable atomically only when it' aligned. Signed-off-by: Hoeun Ryu --- kernel/sched/fair.c | 6 +++--- kernel/sched/sched.h | 6 +- 2 files changed, 8 insertions(+), 4 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index c95880e..840658f 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -536,7 +536,7 @@ static void update_min_vruntime(struct cfs_rq *cfs_rq) /* ensure we never gain time by being placed backwards. */ cfs_rq->min_vruntime = max_vruntime(cfs_rq->min_vruntime, vruntime); -#ifndef CONFIG_64BIT +#ifndef CONFIG_64BIT_ATOMIC_ACCESS smp_wmb(); cfs_rq->min_vruntime_copy = cfs_rq->min_vruntime; #endif @@ -5975,7 +5975,7 @@ static void migrate_task_rq_fair(struct task_struct *p) struct cfs_rq *cfs_rq = cfs_rq_of(se); u64 min_vruntime; -#ifndef CONFIG_64BIT +#ifndef CONFIG_64BIT_ATOMIC_ACCESS u64 min_vruntime_copy; do { @@ -9173,7 +9173,7 @@ void init_cfs_rq(struct cfs_rq *cfs_rq) { cfs_rq->tasks_timeline = RB_ROOT; cfs_rq->min_vruntime = (u64)(-(1LL << 20)); -#ifndef CONFIG_64BIT +#ifndef CONFIG_64BIT_ATOMIC_ACCESS cfs_rq->min_vruntime_copy = cfs_rq->min_vruntime; #endif #ifdef CONFIG_SMP diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index eeef1a3..870010b 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -421,8 +421,12 @@ struct cfs_rq { unsigned int nr_running, h_nr_running; u64 exec_clock; +#ifndef CONFIG_64BIT_ATOMIC_ALIGNED_ACCESS u64 min_vruntime; -#ifndef CONFIG_64BIT +#else + u64 min_vruntime __attribute__((aligned(sizeof(u64; +#endif +#ifndef CONFIG_64BIT_ATOMIC_ACCESS u64 min_vruntime_copy; #endif -- 2.7.4
[RFC 0/3] add 64BIT_ATOMIC_ACCESS and 64BIT_ATOMIC_ALIGNED_ACCESS
On some 32-bit architectures, 64bit accesses are atomic when certain conditions are satisfied. For example, on LPAE (Large Physical Address Extension) enabled ARM architecture, 'ldrd/strd' (load/store doublewords) instructions are 64bit atomic as long as the address is 64-bit aligned. This feature is to guarantee atomic accesses on newly introduced 64bit wide descriptors in the translation tables, and 's/u64' variables can be accessed atomically when they are aligned(8) on LPAE enabled ARM architecture machines. Introducing 64BIT_ATOMIC_ACCESS and 64BIT_ATOMIC_ALIGNED_ACCESS, which can be true for the 32bit architectures as well as 64bit architectures. we can optimize some kernel codes using seqlock (timekeeping) or mimics of it (like in sched/cfq) simply to read or write 64bit variables. The existing codes depend on CONFIG_64BIT to determine whether the 64bit variables can be directly accessed or need additional synchronization primitives like seqlock. CONFIG_64BIT_ATOMIC_ACCESS can be used instead of CONFIG_64BIT in the cases. 64BIT_ATOMIC_ALIGNED_ACCESS can be used in the variable declaration to indicate the alignment requirement to the compiler (__attribute__((aligned(8 in the way of #ifdef. The last patch "sched: depend on 64BIT_ATOMIC_ACCESS to determine if to use min_vruntime_copy" is an example of this approach. I'd like to know what the architecture maintainers and kernel maintainers think about it. I think I can make more examples (mostly removing seqlock to access the 64bit variables on the machines) if this approach is accepted. Hoeun Ryu (3): arch: add 64BIT_ATOMIC_ACCESS to support 64bit atomic access on 32bit machines arm: enable 64BIT_ATOMIC(_ALIGNED)_ACCESS on LPAE enabled machines sched: depend on 64BIT_ATOMIC_ACCESS to determine if to use min_vruntime_copy arch/Kconfig | 20 arch/arm/mm/Kconfig | 2 ++ kernel/sched/fair.c | 6 +++--- kernel/sched/sched.h | 6 +- 4 files changed, 30 insertions(+), 4 deletions(-) -- 2.7.4
[RFC 1/3] arch: add 64BIT_ATOMIC_ACCESS to support 64bit atomic access on 32bit machines
On some 32-bit architectures, 64bit accesses are atomic when certain conditions are satisfied. For example, on LPAE (Large Physical Address Extension) enabled ARM architecture, 'ldrd/strd' (load/store doublewords) instructions are 64bit atomic as long as the address is 64-bit aligned. This feature is to guarantee atomic accesses on newly introduced 64bit wide descriptors in the translation tables, and 's/u64' variables can be accessed atomically when they are aligned(8) on LPAE enabled ARM architecture machines. Introducing 64BIT_ATOMIC_ACCESS and 64BIT_ATOMIC_ALIGNED_ACCESS, which can be true for the 32bit architectures as well as 64bit architectures, we can optimize some kernel codes using seqlock (timekeeping) or mimics of it (like in sched/cfq) simply to read or write 64bit variables. The existing codes depend on CONFIG_64BIT to determine whether the 64bit variables can be directly accessed or need additional synchronization primitives like seqlock. CONFIG_64BIT_ATOMIC_ACCESS can be used instead of CONFIG_64BIT in the cases. 64BIT_ATOMIC_ALIGNED_ACCESS can be used in the variable declaration to indicate the alignment requirement to the compiler (__attribute__((aligned(8 in the way of #ifdef. Signed-off-by: Hoeun Ryu--- arch/Kconfig | 20 1 file changed, 20 insertions(+) diff --git a/arch/Kconfig b/arch/Kconfig index 21d0089..1def331 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -115,6 +115,26 @@ config UPROBES managed by the kernel and kept transparent to the probed application. ) +config 64BIT_ATOMIC_ACCESS + def_bool 64BIT + help + On some 32bit architectures as well as 64bit architectures, + 64bit accesses are atomic when certain conditions are satisfied. + + This symbol should be selected by an architecture if 64 bit + accesses can be atomic. + +config 64BIT_ATOMIC_ALIGNED_ACCESS + def_bool n + depends on 64BIT_ATOMIC_ACCESS + help + On 64BIT_ATOMIC_ACCESS enabled system, the address should be + aligned by 8 to guarantee the accesses are atomic. + + This symbol should be selected by an architecture if 64 bit + accesses are required to be 64 bit aligned to guarantee that + the 64bit accesses are atomic. + config HAVE_64BIT_ALIGNED_ACCESS def_bool 64BIT && !HAVE_EFFICIENT_UNALIGNED_ACCESS help -- 2.7.4
[RFC 0/3] add 64BIT_ATOMIC_ACCESS and 64BIT_ATOMIC_ALIGNED_ACCESS
On some 32-bit architectures, 64bit accesses are atomic when certain conditions are satisfied. For example, on LPAE (Large Physical Address Extension) enabled ARM architecture, 'ldrd/strd' (load/store doublewords) instructions are 64bit atomic as long as the address is 64-bit aligned. This feature is to guarantee atomic accesses on newly introduced 64bit wide descriptors in the translation tables, and 's/u64' variables can be accessed atomically when they are aligned(8) on LPAE enabled ARM architecture machines. Introducing 64BIT_ATOMIC_ACCESS and 64BIT_ATOMIC_ALIGNED_ACCESS, which can be true for the 32bit architectures as well as 64bit architectures. we can optimize some kernel codes using seqlock (timekeeping) or mimics of it (like in sched/cfq) simply to read or write 64bit variables. The existing codes depend on CONFIG_64BIT to determine whether the 64bit variables can be directly accessed or need additional synchronization primitives like seqlock. CONFIG_64BIT_ATOMIC_ACCESS can be used instead of CONFIG_64BIT in the cases. 64BIT_ATOMIC_ALIGNED_ACCESS can be used in the variable declaration to indicate the alignment requirement to the compiler (__attribute__((aligned(8 in the way of #ifdef. The last patch "sched: depend on 64BIT_ATOMIC_ACCESS to determine if to use min_vruntime_copy" is an example of this approach. I'd like to know what the architecture maintainers and kernel maintainers think about it. I think I can make more examples (mostly removing seqlock to access the 64bit variables on the machines) if this approach is accepted. Hoeun Ryu (3): arch: add 64BIT_ATOMIC_ACCESS to support 64bit atomic access on 32bit machines arm: enable 64BIT_ATOMIC(_ALIGNED)_ACCESS on LPAE enabled machines sched: depend on 64BIT_ATOMIC_ACCESS to determine if to use min_vruntime_copy arch/Kconfig | 20 arch/arm/mm/Kconfig | 2 ++ kernel/sched/fair.c | 6 +++--- kernel/sched/sched.h | 6 +- 4 files changed, 30 insertions(+), 4 deletions(-) -- 2.7.4
[RFC 1/3] arch: add 64BIT_ATOMIC_ACCESS to support 64bit atomic access on 32bit machines
On some 32-bit architectures, 64bit accesses are atomic when certain conditions are satisfied. For example, on LPAE (Large Physical Address Extension) enabled ARM architecture, 'ldrd/strd' (load/store doublewords) instructions are 64bit atomic as long as the address is 64-bit aligned. This feature is to guarantee atomic accesses on newly introduced 64bit wide descriptors in the translation tables, and 's/u64' variables can be accessed atomically when they are aligned(8) on LPAE enabled ARM architecture machines. Introducing 64BIT_ATOMIC_ACCESS and 64BIT_ATOMIC_ALIGNED_ACCESS, which can be true for the 32bit architectures as well as 64bit architectures, we can optimize some kernel codes using seqlock (timekeeping) or mimics of it (like in sched/cfq) simply to read or write 64bit variables. The existing codes depend on CONFIG_64BIT to determine whether the 64bit variables can be directly accessed or need additional synchronization primitives like seqlock. CONFIG_64BIT_ATOMIC_ACCESS can be used instead of CONFIG_64BIT in the cases. 64BIT_ATOMIC_ALIGNED_ACCESS can be used in the variable declaration to indicate the alignment requirement to the compiler (__attribute__((aligned(8 in the way of #ifdef. Signed-off-by: Hoeun Ryu --- arch/Kconfig | 20 1 file changed, 20 insertions(+) diff --git a/arch/Kconfig b/arch/Kconfig index 21d0089..1def331 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -115,6 +115,26 @@ config UPROBES managed by the kernel and kept transparent to the probed application. ) +config 64BIT_ATOMIC_ACCESS + def_bool 64BIT + help + On some 32bit architectures as well as 64bit architectures, + 64bit accesses are atomic when certain conditions are satisfied. + + This symbol should be selected by an architecture if 64 bit + accesses can be atomic. + +config 64BIT_ATOMIC_ALIGNED_ACCESS + def_bool n + depends on 64BIT_ATOMIC_ACCESS + help + On 64BIT_ATOMIC_ACCESS enabled system, the address should be + aligned by 8 to guarantee the accesses are atomic. + + This symbol should be selected by an architecture if 64 bit + accesses are required to be 64 bit aligned to guarantee that + the 64bit accesses are atomic. + config HAVE_64BIT_ALIGNED_ACCESS def_bool 64BIT && !HAVE_EFFICIENT_UNALIGNED_ACCESS help -- 2.7.4
[RFC 2/3] arm: enable 64BIT_ATOMIC(_ALIGNED)_ACCESS on LPAE enabled machines
'ldrd/strd' (load/store doublewords) instructions are 64bit atomic as long as the address is 64-bit aligned on LPAE (Large Physical Address Extension) enabled architectures. This feature is to guarantee atomic accesses on newly introduced 64bit wide descriptors in the translation tables. Making 64BIT_ATOMIC_ACCESS true, some kernel codes to access 64bit variables can be optimized by omitting seqlock or the mimic of it. Also make 64BIT_ATOMIC_ALIGNED_ACCESS true, the 64bit atomic access is guarnteed only when the address is 64bit algined. Signed-off-by: Hoeun Ryu--- arch/arm/mm/Kconfig | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig index 60cdfdc..3142572 100644 --- a/arch/arm/mm/Kconfig +++ b/arch/arm/mm/Kconfig @@ -660,6 +660,8 @@ config ARM_LPAE bool "Support for the Large Physical Address Extension" depends on MMU && CPU_32v7 && !CPU_32v6 && !CPU_32v5 && \ !CPU_32v4 && !CPU_32v3 + select 64BIT_ATOMIC_ACCESS + select 64BIT_ATOMIC_ALIGNED_ACCESS help Say Y if you have an ARMv7 processor supporting the LPAE page table format and you would like to access memory beyond the -- 2.7.4
[RFC 2/3] arm: enable 64BIT_ATOMIC(_ALIGNED)_ACCESS on LPAE enabled machines
'ldrd/strd' (load/store doublewords) instructions are 64bit atomic as long as the address is 64-bit aligned on LPAE (Large Physical Address Extension) enabled architectures. This feature is to guarantee atomic accesses on newly introduced 64bit wide descriptors in the translation tables. Making 64BIT_ATOMIC_ACCESS true, some kernel codes to access 64bit variables can be optimized by omitting seqlock or the mimic of it. Also make 64BIT_ATOMIC_ALIGNED_ACCESS true, the 64bit atomic access is guarnteed only when the address is 64bit algined. Signed-off-by: Hoeun Ryu --- arch/arm/mm/Kconfig | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig index 60cdfdc..3142572 100644 --- a/arch/arm/mm/Kconfig +++ b/arch/arm/mm/Kconfig @@ -660,6 +660,8 @@ config ARM_LPAE bool "Support for the Large Physical Address Extension" depends on MMU && CPU_32v7 && !CPU_32v6 && !CPU_32v5 && \ !CPU_32v4 && !CPU_32v3 + select 64BIT_ATOMIC_ACCESS + select 64BIT_ATOMIC_ALIGNED_ACCESS help Say Y if you have an ARMv7 processor supporting the LPAE page table format and you would like to access memory beyond the -- 2.7.4
[PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
From: Joonsoo KimFreepage on ZONE_HIGHMEM doesn't work for kernel memory so it's not that important to reserve. When ZONE_MOVABLE is used, this problem would theorectically cause to decrease usable memory for GFP_HIGHUSER_MOVABLE allocation request which is mainly used for page cache and anon page allocation. So, fix it. And, defining sysctl_lowmem_reserve_ratio array by MAX_NR_ZONES - 1 size makes code complex. For example, if there is highmem system, following reserve ratio is activated for *NORMAL ZONE* which would be easyily misleading people. #ifdef CONFIG_HIGHMEM 32 #endif This patch also fix this situation by defining sysctl_lowmem_reserve_ratio array by MAX_NR_ZONES and place "#ifdef" to right place. Reviewed-by: Aneesh Kumar K.V Acked-by: Vlastimil Babka Signed-off-by: Joonsoo Kim --- include/linux/mmzone.h | 2 +- mm/page_alloc.c| 11 ++- 2 files changed, 7 insertions(+), 6 deletions(-) diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index e7e92c8..e5f134b 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -882,7 +882,7 @@ int min_free_kbytes_sysctl_handler(struct ctl_table *, int, void __user *, size_t *, loff_t *); int watermark_scale_factor_sysctl_handler(struct ctl_table *, int, void __user *, size_t *, loff_t *); -extern int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1]; +extern int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES]; int lowmem_reserve_ratio_sysctl_handler(struct ctl_table *, int, void __user *, size_t *, loff_t *); int percpu_pagelist_fraction_sysctl_handler(struct ctl_table *, int, diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 90b1996..6faa53d 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -202,17 +202,18 @@ static void __free_pages_ok(struct page *page, unsigned int order); * TBD: should special case ZONE_DMA32 machines here - in those we normally * don't need any ZONE_NORMAL reservation */ -int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1] = { +int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES] = { #ifdef CONFIG_ZONE_DMA -256, + [ZONE_DMA] = 256, #endif #ifdef CONFIG_ZONE_DMA32 -256, + [ZONE_DMA32] = 256, #endif + [ZONE_NORMAL] = 32, #ifdef CONFIG_HIGHMEM -32, + [ZONE_HIGHMEM] = INT_MAX, #endif -32, + [ZONE_MOVABLE] = INT_MAX, }; EXPORT_SYMBOL(totalram_pages); -- 2.7.4
[PATCH] mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request
From: Joonsoo Kim Freepage on ZONE_HIGHMEM doesn't work for kernel memory so it's not that important to reserve. When ZONE_MOVABLE is used, this problem would theorectically cause to decrease usable memory for GFP_HIGHUSER_MOVABLE allocation request which is mainly used for page cache and anon page allocation. So, fix it. And, defining sysctl_lowmem_reserve_ratio array by MAX_NR_ZONES - 1 size makes code complex. For example, if there is highmem system, following reserve ratio is activated for *NORMAL ZONE* which would be easyily misleading people. #ifdef CONFIG_HIGHMEM 32 #endif This patch also fix this situation by defining sysctl_lowmem_reserve_ratio array by MAX_NR_ZONES and place "#ifdef" to right place. Reviewed-by: Aneesh Kumar K.V Acked-by: Vlastimil Babka Signed-off-by: Joonsoo Kim --- include/linux/mmzone.h | 2 +- mm/page_alloc.c| 11 ++- 2 files changed, 7 insertions(+), 6 deletions(-) diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index e7e92c8..e5f134b 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -882,7 +882,7 @@ int min_free_kbytes_sysctl_handler(struct ctl_table *, int, void __user *, size_t *, loff_t *); int watermark_scale_factor_sysctl_handler(struct ctl_table *, int, void __user *, size_t *, loff_t *); -extern int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1]; +extern int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES]; int lowmem_reserve_ratio_sysctl_handler(struct ctl_table *, int, void __user *, size_t *, loff_t *); int percpu_pagelist_fraction_sysctl_handler(struct ctl_table *, int, diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 90b1996..6faa53d 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -202,17 +202,18 @@ static void __free_pages_ok(struct page *page, unsigned int order); * TBD: should special case ZONE_DMA32 machines here - in those we normally * don't need any ZONE_NORMAL reservation */ -int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES-1] = { +int sysctl_lowmem_reserve_ratio[MAX_NR_ZONES] = { #ifdef CONFIG_ZONE_DMA -256, + [ZONE_DMA] = 256, #endif #ifdef CONFIG_ZONE_DMA32 -256, + [ZONE_DMA32] = 256, #endif + [ZONE_NORMAL] = 32, #ifdef CONFIG_HIGHMEM -32, + [ZONE_HIGHMEM] = INT_MAX, #endif -32, + [ZONE_MOVABLE] = INT_MAX, }; EXPORT_SYMBOL(totalram_pages); -- 2.7.4
Re: [PATCH][V2][netdev-next] gre: remove duplicated assignment of iph
From: Colin KingDate: Wed, 23 Aug 2017 12:59:48 +0100 > From: Colin Ian King > > iph is being assigned the same value twice; remove the redundant > first assignment. (Thanks to Nikolay Aleksandrov for pointing out > that the first asssignment should be removed and not the second) > > Fixes warning: > net/ipv4/ip_gre.c:265:2: warning: Value stored to 'iph' is never read > > Signed-off-by: Colin Ian King Much better, applied, thanks.
Re: [PATCH][V2][netdev-next] gre: remove duplicated assignment of iph
From: Colin King Date: Wed, 23 Aug 2017 12:59:48 +0100 > From: Colin Ian King > > iph is being assigned the same value twice; remove the redundant > first assignment. (Thanks to Nikolay Aleksandrov for pointing out > that the first asssignment should be removed and not the second) > > Fixes warning: > net/ipv4/ip_gre.c:265:2: warning: Value stored to 'iph' is never read > > Signed-off-by: Colin Ian King Much better, applied, thanks.
Re: [PATCH net] sctp: Avoid out-of-bounds reads from address storage
From: Stefano BrivioDate: Wed, 23 Aug 2017 13:27:13 +0200 > inet_diag_msg_sctp{,l}addr_fill() and sctp_get_sctp_info() copy > sizeof(sockaddr_storage) bytes to fill in sockaddr structs used > to export diagnostic information to userspace. > > However, the memory allocated to store sockaddr information is > smaller than that and depends on the address family, so we leak > up to 100 uninitialized bytes to userspace. Just use the size of > the source structs instead, in all the three cases this is what > userspace expects. Zero out the remaining memory. > > Unused bytes (i.e. when IPv4 addresses are used) in source > structs sctp_sockaddr_entry and sctp_transport are already > cleared by sctp_add_bind_addr() and sctp_transport_new(), > respectively. > > Noticed while testing KASAN-enabled kernel with 'ss': ... > This fixes CVE-2017-7558. > > References: https://bugzilla.redhat.com/show_bug.cgi?id=1480266 > Fixes: 8f840e47f190 ("sctp: add the sctp_diag.c file") > Cc: # 4.7+ > Cc: Xin Long > Cc: Vlad Yasevich > Cc: Neil Horman > Signed-off-by: Stefano Brivio Applied and queued up for -stable. Do not put "stable@kernel..." into networking patch submissions. For networking, I handle the stable submissions by hand myself. Thank you.
Re: [PATCH net] sctp: Avoid out-of-bounds reads from address storage
From: Stefano Brivio Date: Wed, 23 Aug 2017 13:27:13 +0200 > inet_diag_msg_sctp{,l}addr_fill() and sctp_get_sctp_info() copy > sizeof(sockaddr_storage) bytes to fill in sockaddr structs used > to export diagnostic information to userspace. > > However, the memory allocated to store sockaddr information is > smaller than that and depends on the address family, so we leak > up to 100 uninitialized bytes to userspace. Just use the size of > the source structs instead, in all the three cases this is what > userspace expects. Zero out the remaining memory. > > Unused bytes (i.e. when IPv4 addresses are used) in source > structs sctp_sockaddr_entry and sctp_transport are already > cleared by sctp_add_bind_addr() and sctp_transport_new(), > respectively. > > Noticed while testing KASAN-enabled kernel with 'ss': ... > This fixes CVE-2017-7558. > > References: https://bugzilla.redhat.com/show_bug.cgi?id=1480266 > Fixes: 8f840e47f190 ("sctp: add the sctp_diag.c file") > Cc: # 4.7+ > Cc: Xin Long > Cc: Vlad Yasevich > Cc: Neil Horman > Signed-off-by: Stefano Brivio Applied and queued up for -stable. Do not put "stable@kernel..." into networking patch submissions. For networking, I handle the stable submissions by hand myself. Thank you.
Re: [PATCH 3/3 v3] net: tipc: constify genl_ops
From: Arvind YadavDate: Wed, 23 Aug 2017 16:22:20 +0530 > genl_ops are not supposed to change at runtime. All functions > working with genl_ops provided by work with > const genl_ops. So mark the non-const structs as const. > > Signed-off-by: Arvind Yadav Applied.
Re: [PATCH 3/3 v3] net: tipc: constify genl_ops
From: Arvind Yadav Date: Wed, 23 Aug 2017 16:22:20 +0530 > genl_ops are not supposed to change at runtime. All functions > working with genl_ops provided by work with > const genl_ops. So mark the non-const structs as const. > > Signed-off-by: Arvind Yadav Applied.
Re: [PATCH 0/3] constify dsa_switch_ops
From: Andrew LunnDate: Wed, 23 Aug 2017 14:45:56 +0200 > On Wed, Aug 23, 2017 at 03:46:56PM +0530, Arvind Yadav wrote: >> dsa_switch_ops are not supposed to change at runtime. All functions >> working with dsa_switch_ops provided by work with >> const dsa_switch_ops. So mark the non-const structs as const. >> >> Arvind Yadav (3): >> [PATCH 1/3] net: dsa: loop: constify dsa_switch_ops >> [PATCH 2/3] net: dsa: lan9303: constify dsa_switch_ops >> [PATCH 3/3] net: dsa: mt7530: constify dsa_switch_ops > > For the whole series: > > Reviewed-by: Andrew Lunn None of these patches apply to net-next, things are const there already.
Re: [PATCH 0/3] constify dsa_switch_ops
From: Andrew Lunn Date: Wed, 23 Aug 2017 14:45:56 +0200 > On Wed, Aug 23, 2017 at 03:46:56PM +0530, Arvind Yadav wrote: >> dsa_switch_ops are not supposed to change at runtime. All functions >> working with dsa_switch_ops provided by work with >> const dsa_switch_ops. So mark the non-const structs as const. >> >> Arvind Yadav (3): >> [PATCH 1/3] net: dsa: loop: constify dsa_switch_ops >> [PATCH 2/3] net: dsa: lan9303: constify dsa_switch_ops >> [PATCH 3/3] net: dsa: mt7530: constify dsa_switch_ops > > For the whole series: > > Reviewed-by: Andrew Lunn None of these patches apply to net-next, things are const there already.
[PATCH] iommu/mediatek: Fix a build fail of m4u_type
The commit ("iommu/mediatek: Enlarge the validate PA range for 4GB mode") introduce the following build error: drivers/iommu/mtk_iommu.c: In function 'mtk_iommu_hw_init': >> drivers/iommu/mtk_iommu.c:536:30: error: 'const struct mtk_iommu_data' has no member named 'm4u_type'; did you mean 'm4u_dom'? if (data->enable_4GB && data->m4u_type != M4U_MT8173) { This patch fix it, use "m4u_plat" instead of "m4u_type". Reported-by: kernel test robotSigned-off-by: Yong Wu --- 1) In the v2 of mt2712 iommu support patch-set, We changed a variant name from "m4u_type" to "m4u_plat", But I'm sorry that I forgot to change it in the later patch. 2) This patch is based on [1]. [1]:https://lists.linuxfoundation.org/pipermail/iommu/2017-August/023847.html 3) I cann't find the commit id of the patch ("iommu/mediatek: Enlarge the validate PA range for 4GB mode") in the [2]. So I don't write the commit id for that patch in the commit message. [2]https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git next --- drivers/iommu/mtk_iommu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index 4f233e1..bc00e40 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -533,7 +533,7 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data *data) writel_relaxed(F_MMU_IVRP_PA_SET(data->protect_base, data->enable_4GB), data->base + REG_MMU_IVRP_PADDR); - if (data->enable_4GB && data->m4u_type != M4U_MT8173) { + if (data->enable_4GB && data->m4u_plat != M4U_MT8173) { /* * If 4GB mode is enabled, the validate PA range is from * 0x1__ to 0x1__. here record bit[32:30]. -- 1.9.1
[PATCH] iommu/mediatek: Fix a build fail of m4u_type
The commit ("iommu/mediatek: Enlarge the validate PA range for 4GB mode") introduce the following build error: drivers/iommu/mtk_iommu.c: In function 'mtk_iommu_hw_init': >> drivers/iommu/mtk_iommu.c:536:30: error: 'const struct mtk_iommu_data' has no member named 'm4u_type'; did you mean 'm4u_dom'? if (data->enable_4GB && data->m4u_type != M4U_MT8173) { This patch fix it, use "m4u_plat" instead of "m4u_type". Reported-by: kernel test robot Signed-off-by: Yong Wu --- 1) In the v2 of mt2712 iommu support patch-set, We changed a variant name from "m4u_type" to "m4u_plat", But I'm sorry that I forgot to change it in the later patch. 2) This patch is based on [1]. [1]:https://lists.linuxfoundation.org/pipermail/iommu/2017-August/023847.html 3) I cann't find the commit id of the patch ("iommu/mediatek: Enlarge the validate PA range for 4GB mode") in the [2]. So I don't write the commit id for that patch in the commit message. [2]https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git next --- drivers/iommu/mtk_iommu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index 4f233e1..bc00e40 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -533,7 +533,7 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data *data) writel_relaxed(F_MMU_IVRP_PA_SET(data->protect_base, data->enable_4GB), data->base + REG_MMU_IVRP_PADDR); - if (data->enable_4GB && data->m4u_type != M4U_MT8173) { + if (data->enable_4GB && data->m4u_plat != M4U_MT8173) { /* * If 4GB mode is enabled, the validate PA range is from * 0x1__ to 0x1__. here record bit[32:30]. -- 1.9.1
Re: linux-next: build warning after merge of the phy-next tree
Hi Greg, On Tue, 22 Aug 2017 15:07:33 +1000 Stephen Rothwellwrote: > > Hi Kishon, > > After merging the phy-next tree, today's linux-next build (x86_64 > allmodconfig) produced this warning: > > drivers/phy/ralink/phy-ralink-usb.c: In function 'ralink_usb_phy_probe': > drivers/phy/ralink/phy-ralink-usb.c:195:13: warning: cast from pointer to > integer of different size [-Wpointer-to-int-cast] > phy->clk = (u32) match->data; > ^ > > Introduced by commit > > 2411a736ff09 ("phy: ralink-usb: add driver for Mediatek/Ralink") I now get this warning after merging the usb tree ... -- Cheers, Stephen Rothwell
Re: linux-next: build warning after merge of the phy-next tree
Hi Greg, On Tue, 22 Aug 2017 15:07:33 +1000 Stephen Rothwell wrote: > > Hi Kishon, > > After merging the phy-next tree, today's linux-next build (x86_64 > allmodconfig) produced this warning: > > drivers/phy/ralink/phy-ralink-usb.c: In function 'ralink_usb_phy_probe': > drivers/phy/ralink/phy-ralink-usb.c:195:13: warning: cast from pointer to > integer of different size [-Wpointer-to-int-cast] > phy->clk = (u32) match->data; > ^ > > Introduced by commit > > 2411a736ff09 ("phy: ralink-usb: add driver for Mediatek/Ralink") I now get this warning after merging the usb tree ... -- Cheers, Stephen Rothwell
Re: [PATCH][net-next] net: hinic: make functions set_ctrl0 and set_ctrl1 static
From: Colin KingDate: Wed, 23 Aug 2017 10:59:40 +0100 > From: Colin Ian King > > The functions set_ctrl0 and set_ctrl1 are local to the source and do > not need to be in global scope, so make them static. > > Cleans up sparse warnings: > symbol 'set_ctrl0' was not declared. Should it be static? > symbol 'set_ctrl1' was not declared. Should it be static? > > Signed-off-by: Colin Ian King Applied.
Re: [PATCH][net-next] net: hinic: make functions set_ctrl0 and set_ctrl1 static
From: Colin King Date: Wed, 23 Aug 2017 10:59:40 +0100 > From: Colin Ian King > > The functions set_ctrl0 and set_ctrl1 are local to the source and do > not need to be in global scope, so make them static. > > Cleans up sparse warnings: > symbol 'set_ctrl0' was not declared. Should it be static? > symbol 'set_ctrl1' was not declared. Should it be static? > > Signed-off-by: Colin Ian King Applied.
[PATCH] acpi: acpica: fix acpi operand cache leak in dsutils.c
I found an ACPI cache leak in ACPI early termination and boot continuing case. When early termination is occurred due to malicious ACPI table, Linux kernel terminates ACPI function and continues to boot process. While kernel terminates ACPI function, kmem_cache_destroy() reports Acpi-Operand cache leak. Boot log of ACPI operand cache leak is as follows: >[0.585957] ACPI: Added _OSI(Module Device) >[0.587218] ACPI: Added _OSI(Processor Device) >[0.588530] ACPI: Added _OSI(3.0 _SCP Extensions) >[0.589790] ACPI: Added _OSI(Processor Aggregator Device) >[0.591534] ACPI Error: Illegal I/O port address/length above 64K: >C806E0004002/0x2 (20170303/hwvalid-155) >[0.594351] ACPI Exception: AE_LIMIT, Unable to initialize fixed events >(20170303/evevent-88) >[0.597858] ACPI: Unable to start the ACPI Interpreter >[0.599162] ACPI Error: Could not remove SCI handler (20170303/evmisc-281) >[0.601836] kmem_cache_destroy Acpi-Operand: Slab cache still has objects >[0.603556] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.12.0-rc5 #26 >[0.605159] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS >VirtualBox 12/01/2006 >[0.609177] Call Trace: >[0.610063] ? dump_stack+0x5c/0x81 >[0.68] ? kmem_cache_destroy+0x1aa/0x1c0 >[0.612632] ? acpi_sleep_proc_init+0x27/0x27 >[0.613906] ? acpi_os_delete_cache+0xa/0x10 >[0.617986] ? acpi_ut_delete_caches+0x3f/0x7b >[0.619293] ? acpi_terminate+0xa/0x14 >[0.620394] ? acpi_init+0x2af/0x34f >[0.621616] ? __class_create+0x4c/0x80 >[0.623412] ? video_setup+0x7f/0x7f >[0.624585] ? acpi_sleep_proc_init+0x27/0x27 >[0.625861] ? do_one_initcall+0x4e/0x1a0 >[0.627513] ? kernel_init_freeable+0x19e/0x21f >[0.628972] ? rest_init+0x80/0x80 >[0.630043] ? kernel_init+0xa/0x100 >[0.631084] ? ret_from_fork+0x25/0x30 >[0.633343] vgaarb: loaded >[0.635036] EDAC MC: Ver: 3.0.0 >[0.638601] PCI: Probing PCI hardware >[0.639833] PCI host bridge to bus :00 >[0.641031] pci_bus :00: root bus resource [io 0x-0x] > ... Continue to boot and log is omitted ... I analyzed this memory leak in detail and found acpi_ds_obj_stack_pop_and_ delete() function miscalculated the top of the stack. acpi_ds_obj_stack_push() function uses walk_state->operand_index for start position of the top, but acpi_ds_obj_stack_pop_and_delete() function considers index 0 for it. Therefore, this causes acpi operand memory leak. This cache leak causes a security threat because an old kernel (<= 4.9) shows memory locations of kernel functions in stack dump. Some malicious users could use this information to neutralize kernel ASLR. I made a patch to fix ACPI operand cache leak. Signed-off-by: Seunghun Han--- drivers/acpi/acpica/dsutils.c | 9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/drivers/acpi/acpica/dsutils.c b/drivers/acpi/acpica/dsutils.c index 0dabd9b..2c8a060 100644 --- a/drivers/acpi/acpica/dsutils.c +++ b/drivers/acpi/acpica/dsutils.c @@ -705,6 +705,8 @@ acpi_ds_create_operands(struct acpi_walk_state *walk_state, union acpi_parse_object *arguments[ACPI_OBJ_NUM_OPERANDS]; u32 arg_count = 0; u32 index = walk_state->num_operands; + u32 prev_num_operands = walk_state->num_operands; + u32 new_num_operands; u32 i; ACPI_FUNCTION_TRACE_PTR(ds_create_operands, first_arg); @@ -733,6 +735,7 @@ acpi_ds_create_operands(struct acpi_walk_state *walk_state, /* Create the interpreter arguments, in reverse order */ + new_num_operands = index; index--; for (i = 0; i < arg_count; i++) { arg = arguments[index]; @@ -757,7 +760,11 @@ acpi_ds_create_operands(struct acpi_walk_state *walk_state, * pop everything off of the operand stack and delete those * objects */ - acpi_ds_obj_stack_pop_and_delete(arg_count, walk_state); + walk_state->num_operands = i; + acpi_ds_obj_stack_pop_and_delete(new_num_operands, walk_state); + + /* Restore operand count */ + walk_state->num_operands = prev_num_operands; ACPI_EXCEPTION((AE_INFO, status, "While creating Arg %u", index)); return_ACPI_STATUS(status); -- 2.1.4
[PATCH] acpi: acpica: fix acpi operand cache leak in dsutils.c
I found an ACPI cache leak in ACPI early termination and boot continuing case. When early termination is occurred due to malicious ACPI table, Linux kernel terminates ACPI function and continues to boot process. While kernel terminates ACPI function, kmem_cache_destroy() reports Acpi-Operand cache leak. Boot log of ACPI operand cache leak is as follows: >[0.585957] ACPI: Added _OSI(Module Device) >[0.587218] ACPI: Added _OSI(Processor Device) >[0.588530] ACPI: Added _OSI(3.0 _SCP Extensions) >[0.589790] ACPI: Added _OSI(Processor Aggregator Device) >[0.591534] ACPI Error: Illegal I/O port address/length above 64K: >C806E0004002/0x2 (20170303/hwvalid-155) >[0.594351] ACPI Exception: AE_LIMIT, Unable to initialize fixed events >(20170303/evevent-88) >[0.597858] ACPI: Unable to start the ACPI Interpreter >[0.599162] ACPI Error: Could not remove SCI handler (20170303/evmisc-281) >[0.601836] kmem_cache_destroy Acpi-Operand: Slab cache still has objects >[0.603556] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.12.0-rc5 #26 >[0.605159] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS >VirtualBox 12/01/2006 >[0.609177] Call Trace: >[0.610063] ? dump_stack+0x5c/0x81 >[0.68] ? kmem_cache_destroy+0x1aa/0x1c0 >[0.612632] ? acpi_sleep_proc_init+0x27/0x27 >[0.613906] ? acpi_os_delete_cache+0xa/0x10 >[0.617986] ? acpi_ut_delete_caches+0x3f/0x7b >[0.619293] ? acpi_terminate+0xa/0x14 >[0.620394] ? acpi_init+0x2af/0x34f >[0.621616] ? __class_create+0x4c/0x80 >[0.623412] ? video_setup+0x7f/0x7f >[0.624585] ? acpi_sleep_proc_init+0x27/0x27 >[0.625861] ? do_one_initcall+0x4e/0x1a0 >[0.627513] ? kernel_init_freeable+0x19e/0x21f >[0.628972] ? rest_init+0x80/0x80 >[0.630043] ? kernel_init+0xa/0x100 >[0.631084] ? ret_from_fork+0x25/0x30 >[0.633343] vgaarb: loaded >[0.635036] EDAC MC: Ver: 3.0.0 >[0.638601] PCI: Probing PCI hardware >[0.639833] PCI host bridge to bus :00 >[0.641031] pci_bus :00: root bus resource [io 0x-0x] > ... Continue to boot and log is omitted ... I analyzed this memory leak in detail and found acpi_ds_obj_stack_pop_and_ delete() function miscalculated the top of the stack. acpi_ds_obj_stack_push() function uses walk_state->operand_index for start position of the top, but acpi_ds_obj_stack_pop_and_delete() function considers index 0 for it. Therefore, this causes acpi operand memory leak. This cache leak causes a security threat because an old kernel (<= 4.9) shows memory locations of kernel functions in stack dump. Some malicious users could use this information to neutralize kernel ASLR. I made a patch to fix ACPI operand cache leak. Signed-off-by: Seunghun Han --- drivers/acpi/acpica/dsutils.c | 9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/drivers/acpi/acpica/dsutils.c b/drivers/acpi/acpica/dsutils.c index 0dabd9b..2c8a060 100644 --- a/drivers/acpi/acpica/dsutils.c +++ b/drivers/acpi/acpica/dsutils.c @@ -705,6 +705,8 @@ acpi_ds_create_operands(struct acpi_walk_state *walk_state, union acpi_parse_object *arguments[ACPI_OBJ_NUM_OPERANDS]; u32 arg_count = 0; u32 index = walk_state->num_operands; + u32 prev_num_operands = walk_state->num_operands; + u32 new_num_operands; u32 i; ACPI_FUNCTION_TRACE_PTR(ds_create_operands, first_arg); @@ -733,6 +735,7 @@ acpi_ds_create_operands(struct acpi_walk_state *walk_state, /* Create the interpreter arguments, in reverse order */ + new_num_operands = index; index--; for (i = 0; i < arg_count; i++) { arg = arguments[index]; @@ -757,7 +760,11 @@ acpi_ds_create_operands(struct acpi_walk_state *walk_state, * pop everything off of the operand stack and delete those * objects */ - acpi_ds_obj_stack_pop_and_delete(arg_count, walk_state); + walk_state->num_operands = i; + acpi_ds_obj_stack_pop_and_delete(new_num_operands, walk_state); + + /* Restore operand count */ + walk_state->num_operands = prev_num_operands; ACPI_EXCEPTION((AE_INFO, status, "While creating Arg %u", index)); return_ACPI_STATUS(status); -- 2.1.4
Re: [PATCH v3 1/3] lockdep: Make LOCKDEP_CROSSRELEASE configs all part of PROVE_LOCKING
On Wed, Aug 23, 2017 at 12:26:52PM +0200, Peter Zijlstra wrote: > On Wed, Aug 23, 2017 at 11:43:23AM +0900, Byungchul Park wrote: > > On Tue, Aug 22, 2017 at 03:49:22PM +0200, Peter Zijlstra wrote: > > > > So I think we'll end up hitting a lockdep deficiency and not trigger the > > > splat on flush_work(), see also: > > > > > > https://lwn.net/Articles/332801/ > > > > > > lock_map_acquire_read() is a read-recursive and will not in fact create > > > any dependencies because of this issue. > > > > > > In specific, check_prev_add() has: > > > > > > if (next->read == 2 || prev->read == 2) > > > return 1; > > > > > > This means that for: > > > > > > lock_map_acquire_read(W)(2) > > > down_write(A) (0) > > > > > > down_write(A) (0) > > > wait_for_completion(C) (0) > > > > > > lock_map_acquire_read(W)(2) > > > complete(C) (0) > > > > > > All the (2) effectively go away and 'solve' our current issue, but: > > > > > > lock_map_acquire_read(W)(2) > > > mutex_lock(A) (0) > > > > > > mutex_lock(A) (0) > > > lock_map_acquire(W) (0) > > > > > > as per flush_work() will not in fact trigger anymore either. > > > > It should be triggered. Lockdep code should be fixed so that it does. > > Yeah, its just something we never got around to. Once every so often I > get reminded of it, like now. But then it sits on the todo list and > never quite happens. I want to try it.
Re: [PATCH v3 1/3] lockdep: Make LOCKDEP_CROSSRELEASE configs all part of PROVE_LOCKING
On Wed, Aug 23, 2017 at 12:26:52PM +0200, Peter Zijlstra wrote: > On Wed, Aug 23, 2017 at 11:43:23AM +0900, Byungchul Park wrote: > > On Tue, Aug 22, 2017 at 03:49:22PM +0200, Peter Zijlstra wrote: > > > > So I think we'll end up hitting a lockdep deficiency and not trigger the > > > splat on flush_work(), see also: > > > > > > https://lwn.net/Articles/332801/ > > > > > > lock_map_acquire_read() is a read-recursive and will not in fact create > > > any dependencies because of this issue. > > > > > > In specific, check_prev_add() has: > > > > > > if (next->read == 2 || prev->read == 2) > > > return 1; > > > > > > This means that for: > > > > > > lock_map_acquire_read(W)(2) > > > down_write(A) (0) > > > > > > down_write(A) (0) > > > wait_for_completion(C) (0) > > > > > > lock_map_acquire_read(W)(2) > > > complete(C) (0) > > > > > > All the (2) effectively go away and 'solve' our current issue, but: > > > > > > lock_map_acquire_read(W)(2) > > > mutex_lock(A) (0) > > > > > > mutex_lock(A) (0) > > > lock_map_acquire(W) (0) > > > > > > as per flush_work() will not in fact trigger anymore either. > > > > It should be triggered. Lockdep code should be fixed so that it does. > > Yeah, its just something we never got around to. Once every so often I > get reminded of it, like now. But then it sits on the todo list and > never quite happens. I want to try it.
Re: [PATCH v3 1/3] lockdep: Make LOCKDEP_CROSSRELEASE configs all part of PROVE_LOCKING
On Wed, Aug 23, 2017 at 12:46:48PM +0200, Peter Zijlstra wrote: > On Wed, Aug 23, 2017 at 11:31:18AM +0900, Byungchul Park wrote: > > On Tue, Aug 22, 2017 at 11:06:03AM +0200, Peter Zijlstra wrote: > > > Currently, we do the following in process_one_work(), > > > > lockdep_map_acquire for a workqueue > > lockdep_map_acquire for a work > > > > But IMHO it should be, > > > > lockdep_map_acquire for a pair of workqueue and work. > > > > Right? > > No, it is right. We need the two 'locks'. > > The work one is for flush_work(), the workqueue one is for > flush_workqueue(). > > Just like how flush_work() must not depend on any lock taken inside the > work, flush_workqueue() callers must not hold any lock acquired inside > any work ran inside the workqueue. This cannot be done with a single > 'lock'. Thank you for explanation. > The reason flush_work() also depends on the wq 'lock' is because doing > flush_work() from inside work is a possible problem for single threaded > workqueues and workqueues with a rescuer. > > > > Agreed. The interaction with workqueues is buggered. > > > > I think original uses of lockdep_map were already wrong. I mean it's > > not a problem of newly introduced code. > > Not wrong per-se, the new code certainly places more constraints on it. "the new code places more constraints on it" is just the right expression.
Re: [PATCH v3 1/3] lockdep: Make LOCKDEP_CROSSRELEASE configs all part of PROVE_LOCKING
On Wed, Aug 23, 2017 at 12:46:48PM +0200, Peter Zijlstra wrote: > On Wed, Aug 23, 2017 at 11:31:18AM +0900, Byungchul Park wrote: > > On Tue, Aug 22, 2017 at 11:06:03AM +0200, Peter Zijlstra wrote: > > > Currently, we do the following in process_one_work(), > > > > lockdep_map_acquire for a workqueue > > lockdep_map_acquire for a work > > > > But IMHO it should be, > > > > lockdep_map_acquire for a pair of workqueue and work. > > > > Right? > > No, it is right. We need the two 'locks'. > > The work one is for flush_work(), the workqueue one is for > flush_workqueue(). > > Just like how flush_work() must not depend on any lock taken inside the > work, flush_workqueue() callers must not hold any lock acquired inside > any work ran inside the workqueue. This cannot be done with a single > 'lock'. Thank you for explanation. > The reason flush_work() also depends on the wq 'lock' is because doing > flush_work() from inside work is a possible problem for single threaded > workqueues and workqueues with a rescuer. > > > > Agreed. The interaction with workqueues is buggered. > > > > I think original uses of lockdep_map were already wrong. I mean it's > > not a problem of newly introduced code. > > Not wrong per-se, the new code certainly places more constraints on it. "the new code places more constraints on it" is just the right expression.
[PATCH v8 3/3] PCI: iproc: add device shutdown for PCI RC
PERST must be asserted around ~500ms before the reboot is applied. During soft reset (e.g., "reboot" from Linux) on some iproc based SOCs LCPLL clock and PERST both goes off simultaneously. This will cause certain Endpoints Intel NVMe not get detected, upon next boot sequence. This is specifically happening with Intel P3700 400GB series. Endpoint is expecting the clock for some amount of time after PERST is asserted, which is not happening in Stingray (iproc based SOC). This causes NVMe to behave in undefined way. On the contrary, Intel x86 boards will have ref clock running, so we do not see this behavior there. Besides, PCI spec does not stipulate about such timings. In-fact it does not tell us, whether PCIe device should consider refclk to be available or not to be. It is completely up to vendor to design their EP the way they want with respect to ref clock availability. 500ms is just based on the observation and taken as safe margin. This patch adds platform shutdown where it should be called in device_shutdown while reboot command is issued. So in sequence first Endpoint Shutdown (e.g. nvme_shutdown) followed by RC shutdown, which issues safe PERST assertion. Signed-off-by: Oza PawandeepReviewed-by: Ray Jui Reviewed-by: Scott Branden diff --git a/drivers/pci/host/pcie-iproc-platform.c b/drivers/pci/host/pcie-iproc-platform.c index 2253119..a5073a9 100644 --- a/drivers/pci/host/pcie-iproc-platform.c +++ b/drivers/pci/host/pcie-iproc-platform.c @@ -134,6 +134,13 @@ static int iproc_pcie_pltfm_remove(struct platform_device *pdev) return iproc_pcie_remove(pcie); } +static void iproc_pcie_pltfm_shutdown(struct platform_device *pdev) +{ + struct iproc_pcie *pcie = platform_get_drvdata(pdev); + + iproc_pcie_shutdown(pcie); +} + static struct platform_driver iproc_pcie_pltfm_driver = { .driver = { .name = "iproc-pcie", @@ -141,6 +148,7 @@ static int iproc_pcie_pltfm_remove(struct platform_device *pdev) }, .probe = iproc_pcie_pltfm_probe, .remove = iproc_pcie_pltfm_remove, + .shutdown = iproc_pcie_pltfm_shutdown, }; module_platform_driver(iproc_pcie_pltfm_driver); diff --git a/drivers/pci/host/pcie-iproc.c b/drivers/pci/host/pcie-iproc.c index 37f4adf..cbdabe8 100644 --- a/drivers/pci/host/pcie-iproc.c +++ b/drivers/pci/host/pcie-iproc.c @@ -659,7 +659,7 @@ static int iproc_pcie_config_write32(struct pci_bus *bus, unsigned int devfn, .write = iproc_pcie_config_write32, }; -static void iproc_pcie_reset(struct iproc_pcie *pcie) +static void iproc_pcie_perst_ctrl(struct iproc_pcie *pcie, bool assert) { u32 val; @@ -671,19 +671,26 @@ static void iproc_pcie_reset(struct iproc_pcie *pcie) if (pcie->ep_is_internal) return; - /* -* Select perst_b signal as reset source. Put the device into reset, -* and then bring it out of reset -*/ - val = iproc_pcie_read_reg(pcie, IPROC_PCIE_CLK_CTRL); - val &= ~EP_PERST_SOURCE_SELECT & ~EP_MODE_SURVIVE_PERST & - ~RC_PCIE_RST_OUTPUT; - iproc_pcie_write_reg(pcie, IPROC_PCIE_CLK_CTRL, val); - udelay(250); - - val |= RC_PCIE_RST_OUTPUT; - iproc_pcie_write_reg(pcie, IPROC_PCIE_CLK_CTRL, val); - msleep(100); + if (assert) { + val = iproc_pcie_read_reg(pcie, IPROC_PCIE_CLK_CTRL); + val &= ~EP_PERST_SOURCE_SELECT & ~EP_MODE_SURVIVE_PERST & + ~RC_PCIE_RST_OUTPUT; + iproc_pcie_write_reg(pcie, IPROC_PCIE_CLK_CTRL, val); + udelay(250); + } else { + val = iproc_pcie_read_reg(pcie, IPROC_PCIE_CLK_CTRL); + val |= RC_PCIE_RST_OUTPUT; + iproc_pcie_write_reg(pcie, IPROC_PCIE_CLK_CTRL, val); + msleep(100); + } +} + +int iproc_pcie_shutdown(struct iproc_pcie *pcie) +{ + iproc_pcie_perst_ctrl(pcie, true); + msleep(500); + + return 0; } static int iproc_pcie_check_link(struct iproc_pcie *pcie) @@ -1365,7 +1372,8 @@ int iproc_pcie_setup(struct iproc_pcie *pcie, struct list_head *res) goto err_exit_phy; } - iproc_pcie_reset(pcie); + iproc_pcie_perst_ctrl(pcie, true); + iproc_pcie_perst_ctrl(pcie, false); if (pcie->need_ob_cfg) { ret = iproc_pcie_map_ranges(pcie, res); diff --git a/drivers/pci/host/pcie-iproc.h b/drivers/pci/host/pcie-iproc.h index 0bbe2ea..a6b55ce 100644 --- a/drivers/pci/host/pcie-iproc.h +++ b/drivers/pci/host/pcie-iproc.h @@ -110,6 +110,7 @@ struct iproc_pcie { int iproc_pcie_setup(struct iproc_pcie *pcie, struct list_head *res); int iproc_pcie_remove(struct iproc_pcie *pcie); +int iproc_pcie_shutdown(struct iproc_pcie *pcie); #ifdef CONFIG_PCIE_IPROC_MSI int iproc_msi_init(struct iproc_pcie *pcie, struct device_node *node); -- 1.9.1
[PATCH v8 3/3] PCI: iproc: add device shutdown for PCI RC
PERST must be asserted around ~500ms before the reboot is applied. During soft reset (e.g., "reboot" from Linux) on some iproc based SOCs LCPLL clock and PERST both goes off simultaneously. This will cause certain Endpoints Intel NVMe not get detected, upon next boot sequence. This is specifically happening with Intel P3700 400GB series. Endpoint is expecting the clock for some amount of time after PERST is asserted, which is not happening in Stingray (iproc based SOC). This causes NVMe to behave in undefined way. On the contrary, Intel x86 boards will have ref clock running, so we do not see this behavior there. Besides, PCI spec does not stipulate about such timings. In-fact it does not tell us, whether PCIe device should consider refclk to be available or not to be. It is completely up to vendor to design their EP the way they want with respect to ref clock availability. 500ms is just based on the observation and taken as safe margin. This patch adds platform shutdown where it should be called in device_shutdown while reboot command is issued. So in sequence first Endpoint Shutdown (e.g. nvme_shutdown) followed by RC shutdown, which issues safe PERST assertion. Signed-off-by: Oza Pawandeep Reviewed-by: Ray Jui Reviewed-by: Scott Branden diff --git a/drivers/pci/host/pcie-iproc-platform.c b/drivers/pci/host/pcie-iproc-platform.c index 2253119..a5073a9 100644 --- a/drivers/pci/host/pcie-iproc-platform.c +++ b/drivers/pci/host/pcie-iproc-platform.c @@ -134,6 +134,13 @@ static int iproc_pcie_pltfm_remove(struct platform_device *pdev) return iproc_pcie_remove(pcie); } +static void iproc_pcie_pltfm_shutdown(struct platform_device *pdev) +{ + struct iproc_pcie *pcie = platform_get_drvdata(pdev); + + iproc_pcie_shutdown(pcie); +} + static struct platform_driver iproc_pcie_pltfm_driver = { .driver = { .name = "iproc-pcie", @@ -141,6 +148,7 @@ static int iproc_pcie_pltfm_remove(struct platform_device *pdev) }, .probe = iproc_pcie_pltfm_probe, .remove = iproc_pcie_pltfm_remove, + .shutdown = iproc_pcie_pltfm_shutdown, }; module_platform_driver(iproc_pcie_pltfm_driver); diff --git a/drivers/pci/host/pcie-iproc.c b/drivers/pci/host/pcie-iproc.c index 37f4adf..cbdabe8 100644 --- a/drivers/pci/host/pcie-iproc.c +++ b/drivers/pci/host/pcie-iproc.c @@ -659,7 +659,7 @@ static int iproc_pcie_config_write32(struct pci_bus *bus, unsigned int devfn, .write = iproc_pcie_config_write32, }; -static void iproc_pcie_reset(struct iproc_pcie *pcie) +static void iproc_pcie_perst_ctrl(struct iproc_pcie *pcie, bool assert) { u32 val; @@ -671,19 +671,26 @@ static void iproc_pcie_reset(struct iproc_pcie *pcie) if (pcie->ep_is_internal) return; - /* -* Select perst_b signal as reset source. Put the device into reset, -* and then bring it out of reset -*/ - val = iproc_pcie_read_reg(pcie, IPROC_PCIE_CLK_CTRL); - val &= ~EP_PERST_SOURCE_SELECT & ~EP_MODE_SURVIVE_PERST & - ~RC_PCIE_RST_OUTPUT; - iproc_pcie_write_reg(pcie, IPROC_PCIE_CLK_CTRL, val); - udelay(250); - - val |= RC_PCIE_RST_OUTPUT; - iproc_pcie_write_reg(pcie, IPROC_PCIE_CLK_CTRL, val); - msleep(100); + if (assert) { + val = iproc_pcie_read_reg(pcie, IPROC_PCIE_CLK_CTRL); + val &= ~EP_PERST_SOURCE_SELECT & ~EP_MODE_SURVIVE_PERST & + ~RC_PCIE_RST_OUTPUT; + iproc_pcie_write_reg(pcie, IPROC_PCIE_CLK_CTRL, val); + udelay(250); + } else { + val = iproc_pcie_read_reg(pcie, IPROC_PCIE_CLK_CTRL); + val |= RC_PCIE_RST_OUTPUT; + iproc_pcie_write_reg(pcie, IPROC_PCIE_CLK_CTRL, val); + msleep(100); + } +} + +int iproc_pcie_shutdown(struct iproc_pcie *pcie) +{ + iproc_pcie_perst_ctrl(pcie, true); + msleep(500); + + return 0; } static int iproc_pcie_check_link(struct iproc_pcie *pcie) @@ -1365,7 +1372,8 @@ int iproc_pcie_setup(struct iproc_pcie *pcie, struct list_head *res) goto err_exit_phy; } - iproc_pcie_reset(pcie); + iproc_pcie_perst_ctrl(pcie, true); + iproc_pcie_perst_ctrl(pcie, false); if (pcie->need_ob_cfg) { ret = iproc_pcie_map_ranges(pcie, res); diff --git a/drivers/pci/host/pcie-iproc.h b/drivers/pci/host/pcie-iproc.h index 0bbe2ea..a6b55ce 100644 --- a/drivers/pci/host/pcie-iproc.h +++ b/drivers/pci/host/pcie-iproc.h @@ -110,6 +110,7 @@ struct iproc_pcie { int iproc_pcie_setup(struct iproc_pcie *pcie, struct list_head *res); int iproc_pcie_remove(struct iproc_pcie *pcie); +int iproc_pcie_shutdown(struct iproc_pcie *pcie); #ifdef CONFIG_PCIE_IPROC_MSI int iproc_msi_init(struct iproc_pcie *pcie, struct device_node *node); -- 1.9.1
[PATCH v8 1/3] PCI: iproc: factor out ep configuration access
This patch factors out ep configuration access as a separate function. Signed-off-by: Oza Pawandeepdiff --git a/drivers/pci/host/pcie-iproc.c b/drivers/pci/host/pcie-iproc.c index c574863..61d9be6 100644 --- a/drivers/pci/host/pcie-iproc.c +++ b/drivers/pci/host/pcie-iproc.c @@ -448,6 +448,31 @@ static inline void iproc_pcie_apb_err_disable(struct pci_bus *bus, } } +static void __iomem *iproc_pcie_map_ep_cfg_reg(struct iproc_pcie *pcie, + unsigned int busno, + unsigned int slot, + unsigned int fn, + int where) +{ + u16 offset; + u32 val; + + /* EP device access */ + val = (busno << CFG_ADDR_BUS_NUM_SHIFT) | + (slot << CFG_ADDR_DEV_NUM_SHIFT) | + (fn << CFG_ADDR_FUNC_NUM_SHIFT) | + (where & CFG_ADDR_REG_NUM_MASK) | + (1 & CFG_ADDR_CFG_TYPE_MASK); + + iproc_pcie_write_reg(pcie, IPROC_PCIE_CFG_ADDR, val); + offset = iproc_pcie_reg_offset(pcie, IPROC_PCIE_CFG_DATA); + + if (iproc_pcie_reg_is_invalid(offset)) + return NULL; + + return (pcie->base + offset); +} + /** * Note access to the configuration registers are protected at the higher layer * by 'pci_lock' in drivers/pci/access.c @@ -459,7 +484,6 @@ static void __iomem *iproc_pcie_map_cfg_bus(struct iproc_pcie *pcie, { unsigned slot = PCI_SLOT(devfn); unsigned fn = PCI_FUNC(devfn); - u32 val; u16 offset; /* root complex access */ @@ -484,18 +508,7 @@ static void __iomem *iproc_pcie_map_cfg_bus(struct iproc_pcie *pcie, if (slot > 0) return NULL; - /* EP device access */ - val = (busno << CFG_ADDR_BUS_NUM_SHIFT) | - (slot << CFG_ADDR_DEV_NUM_SHIFT) | - (fn << CFG_ADDR_FUNC_NUM_SHIFT) | - (where & CFG_ADDR_REG_NUM_MASK) | - (1 & CFG_ADDR_CFG_TYPE_MASK); - iproc_pcie_write_reg(pcie, IPROC_PCIE_CFG_ADDR, val); - offset = iproc_pcie_reg_offset(pcie, IPROC_PCIE_CFG_DATA); - if (iproc_pcie_reg_is_invalid(offset)) - return NULL; - else - return (pcie->base + offset); + return iproc_pcie_map_ep_cfg_reg(pcie, busno, slot, fn, where); } static void __iomem *iproc_pcie_bus_map_cfg_bus(struct pci_bus *bus, -- 1.9.1
[PATCH v8 2/3] PCI: iproc: retry request when CRS returned from EP
PCIe spec r3.1, sec 2.3.2 If CRS software visibility is not enabled, the RC must reissue the config request as a new request. - If CRS software visibility is enabled, - for a config read of Vendor ID, the RC must return 0x0001 data - for all other config reads/writes, the RC must reissue the request iproc PCIe Controller spec: 4.7.3.3. Retry Status On Configuration Cycle Endpoints are allowed to generate retry status on configuration cycles. In this case, the RC needs to re-issue the request. The IP does not handle this because the number of configuration cycles needed will probably be less than the total number of non-posted operations needed. When a retry status is received on the User RX interface for a configuration request that was sent on the User TX interface, it will be indicated with a completion with the CMPL_STATUS field set to 2=CRS, and the user will have to find the address and data values and send a new transaction on the User TX interface. When the internal configuration space returns a retry status during a configuration cycle (user_cscfg = 1) on the Command/Status interface, the pcie_cscrs will assert with the pcie_csack signal to indicate the CRS status. When the CRS Software Visibility Enable register in the Root Control register is enabled, the IP will return the data value to 0x0001 for the Vendor ID value and 0x (all 1’s) for the rest of the data in the request for reads of offset 0 that return with CRS status. This is true for both the User RX Interface and for the Command/Status interface. When CRS Software Visibility is enabled, the CMPL_STATUS field of the completion on the User RX Interface will not be 2=CRS and the pcie_cscrs signal will not assert on the Command/Status interface. Per PCIe r3.1, sec 2.3.2, config requests that receive completions with Configuration Request Retry Status (CRS) should be reissued by the hardware except reads of the Vendor ID when CRS Software Visibility is enabled. This hardware never reissues configuration requests when it receives CRS completions. Note that, neither PCIe host bridge nor PCIe core re-issues the request for any configuration offset. For config reads, this hardware returns CFG_RETRY_STATUS data when it receives a CRS completion for a config read, regardless of the address of the read or the CRS Software Visibility Enable bit. This patch implements iproc_pcie_config_read which gets called for Stingray, if it receives a CRS completion, it retries reading it again. In case of timeout, it returns 0x. For other iproc based SOC, it falls back to PCI generic APIs. Signed-off-by: Oza Pawandeepdiff --git a/drivers/pci/host/pcie-iproc.c b/drivers/pci/host/pcie-iproc.c index 61d9be6..37f4adf 100644 --- a/drivers/pci/host/pcie-iproc.c +++ b/drivers/pci/host/pcie-iproc.c @@ -68,6 +68,9 @@ #define APB_ERR_EN_SHIFT 0 #define APB_ERR_EN BIT(APB_ERR_EN_SHIFT) +#define CFG_RETRY_STATUS 0x0001 +#define CFG_RETRY_STATUS_TIMEOUT_US 50 /* 500 milli-seconds. */ + /* derive the enum index of the outbound/inbound mapping registers */ #define MAP_REG(base_reg, index) ((base_reg) + (index) * 2) @@ -473,6 +476,64 @@ static void __iomem *iproc_pcie_map_ep_cfg_reg(struct iproc_pcie *pcie, return (pcie->base + offset); } +static unsigned int iproc_pcie_cfg_retry(void __iomem *cfg_data_p) +{ + int timeout = CFG_RETRY_STATUS_TIMEOUT_US; + unsigned int data; + + /* +* As per PCIe spec r3.1, sec 2.3.2, CRS Software +* Visibility only affects config read of the Vendor ID. +* For config write or any other config read the Root must +* automatically re-issue configuration request again as a +* new request. +* +* For config reads, this hardware returns CFG_RETRY_STATUS data when +* it receives a CRS completion for a config read, regardless of the +* address of the read or the CRS Software Visibility Enable bit. As a +* partial workaround for this, we retry in software any read that +* returns CFG_RETRY_STATUS. +*/ + data = readl(cfg_data_p); + while (data == CFG_RETRY_STATUS && timeout--) { + udelay(1); + data = readl(cfg_data_p); + } + + if (data == CFG_RETRY_STATUS) + data = 0x; + + return data; +} + +static int iproc_pcie_config_read(struct pci_bus *bus, unsigned int devfn, + int where, int size, u32 *val) +{ + struct iproc_pcie *pcie = iproc_data(bus); + unsigned int slot = PCI_SLOT(devfn); + unsigned int fn = PCI_FUNC(devfn); + unsigned int busno = bus->number; + void __iomem *cfg_data_p; + unsigned int data; + + /* root complex access. */ + if (busno == 0) + return pci_generic_config_read32(bus, devfn, where, size, val); + + cfg_data_p =
[PATCH v8 1/3] PCI: iproc: factor out ep configuration access
This patch factors out ep configuration access as a separate function. Signed-off-by: Oza Pawandeep diff --git a/drivers/pci/host/pcie-iproc.c b/drivers/pci/host/pcie-iproc.c index c574863..61d9be6 100644 --- a/drivers/pci/host/pcie-iproc.c +++ b/drivers/pci/host/pcie-iproc.c @@ -448,6 +448,31 @@ static inline void iproc_pcie_apb_err_disable(struct pci_bus *bus, } } +static void __iomem *iproc_pcie_map_ep_cfg_reg(struct iproc_pcie *pcie, + unsigned int busno, + unsigned int slot, + unsigned int fn, + int where) +{ + u16 offset; + u32 val; + + /* EP device access */ + val = (busno << CFG_ADDR_BUS_NUM_SHIFT) | + (slot << CFG_ADDR_DEV_NUM_SHIFT) | + (fn << CFG_ADDR_FUNC_NUM_SHIFT) | + (where & CFG_ADDR_REG_NUM_MASK) | + (1 & CFG_ADDR_CFG_TYPE_MASK); + + iproc_pcie_write_reg(pcie, IPROC_PCIE_CFG_ADDR, val); + offset = iproc_pcie_reg_offset(pcie, IPROC_PCIE_CFG_DATA); + + if (iproc_pcie_reg_is_invalid(offset)) + return NULL; + + return (pcie->base + offset); +} + /** * Note access to the configuration registers are protected at the higher layer * by 'pci_lock' in drivers/pci/access.c @@ -459,7 +484,6 @@ static void __iomem *iproc_pcie_map_cfg_bus(struct iproc_pcie *pcie, { unsigned slot = PCI_SLOT(devfn); unsigned fn = PCI_FUNC(devfn); - u32 val; u16 offset; /* root complex access */ @@ -484,18 +508,7 @@ static void __iomem *iproc_pcie_map_cfg_bus(struct iproc_pcie *pcie, if (slot > 0) return NULL; - /* EP device access */ - val = (busno << CFG_ADDR_BUS_NUM_SHIFT) | - (slot << CFG_ADDR_DEV_NUM_SHIFT) | - (fn << CFG_ADDR_FUNC_NUM_SHIFT) | - (where & CFG_ADDR_REG_NUM_MASK) | - (1 & CFG_ADDR_CFG_TYPE_MASK); - iproc_pcie_write_reg(pcie, IPROC_PCIE_CFG_ADDR, val); - offset = iproc_pcie_reg_offset(pcie, IPROC_PCIE_CFG_DATA); - if (iproc_pcie_reg_is_invalid(offset)) - return NULL; - else - return (pcie->base + offset); + return iproc_pcie_map_ep_cfg_reg(pcie, busno, slot, fn, where); } static void __iomem *iproc_pcie_bus_map_cfg_bus(struct pci_bus *bus, -- 1.9.1
[PATCH v8 2/3] PCI: iproc: retry request when CRS returned from EP
PCIe spec r3.1, sec 2.3.2 If CRS software visibility is not enabled, the RC must reissue the config request as a new request. - If CRS software visibility is enabled, - for a config read of Vendor ID, the RC must return 0x0001 data - for all other config reads/writes, the RC must reissue the request iproc PCIe Controller spec: 4.7.3.3. Retry Status On Configuration Cycle Endpoints are allowed to generate retry status on configuration cycles. In this case, the RC needs to re-issue the request. The IP does not handle this because the number of configuration cycles needed will probably be less than the total number of non-posted operations needed. When a retry status is received on the User RX interface for a configuration request that was sent on the User TX interface, it will be indicated with a completion with the CMPL_STATUS field set to 2=CRS, and the user will have to find the address and data values and send a new transaction on the User TX interface. When the internal configuration space returns a retry status during a configuration cycle (user_cscfg = 1) on the Command/Status interface, the pcie_cscrs will assert with the pcie_csack signal to indicate the CRS status. When the CRS Software Visibility Enable register in the Root Control register is enabled, the IP will return the data value to 0x0001 for the Vendor ID value and 0x (all 1’s) for the rest of the data in the request for reads of offset 0 that return with CRS status. This is true for both the User RX Interface and for the Command/Status interface. When CRS Software Visibility is enabled, the CMPL_STATUS field of the completion on the User RX Interface will not be 2=CRS and the pcie_cscrs signal will not assert on the Command/Status interface. Per PCIe r3.1, sec 2.3.2, config requests that receive completions with Configuration Request Retry Status (CRS) should be reissued by the hardware except reads of the Vendor ID when CRS Software Visibility is enabled. This hardware never reissues configuration requests when it receives CRS completions. Note that, neither PCIe host bridge nor PCIe core re-issues the request for any configuration offset. For config reads, this hardware returns CFG_RETRY_STATUS data when it receives a CRS completion for a config read, regardless of the address of the read or the CRS Software Visibility Enable bit. This patch implements iproc_pcie_config_read which gets called for Stingray, if it receives a CRS completion, it retries reading it again. In case of timeout, it returns 0x. For other iproc based SOC, it falls back to PCI generic APIs. Signed-off-by: Oza Pawandeep diff --git a/drivers/pci/host/pcie-iproc.c b/drivers/pci/host/pcie-iproc.c index 61d9be6..37f4adf 100644 --- a/drivers/pci/host/pcie-iproc.c +++ b/drivers/pci/host/pcie-iproc.c @@ -68,6 +68,9 @@ #define APB_ERR_EN_SHIFT 0 #define APB_ERR_EN BIT(APB_ERR_EN_SHIFT) +#define CFG_RETRY_STATUS 0x0001 +#define CFG_RETRY_STATUS_TIMEOUT_US 50 /* 500 milli-seconds. */ + /* derive the enum index of the outbound/inbound mapping registers */ #define MAP_REG(base_reg, index) ((base_reg) + (index) * 2) @@ -473,6 +476,64 @@ static void __iomem *iproc_pcie_map_ep_cfg_reg(struct iproc_pcie *pcie, return (pcie->base + offset); } +static unsigned int iproc_pcie_cfg_retry(void __iomem *cfg_data_p) +{ + int timeout = CFG_RETRY_STATUS_TIMEOUT_US; + unsigned int data; + + /* +* As per PCIe spec r3.1, sec 2.3.2, CRS Software +* Visibility only affects config read of the Vendor ID. +* For config write or any other config read the Root must +* automatically re-issue configuration request again as a +* new request. +* +* For config reads, this hardware returns CFG_RETRY_STATUS data when +* it receives a CRS completion for a config read, regardless of the +* address of the read or the CRS Software Visibility Enable bit. As a +* partial workaround for this, we retry in software any read that +* returns CFG_RETRY_STATUS. +*/ + data = readl(cfg_data_p); + while (data == CFG_RETRY_STATUS && timeout--) { + udelay(1); + data = readl(cfg_data_p); + } + + if (data == CFG_RETRY_STATUS) + data = 0x; + + return data; +} + +static int iproc_pcie_config_read(struct pci_bus *bus, unsigned int devfn, + int where, int size, u32 *val) +{ + struct iproc_pcie *pcie = iproc_data(bus); + unsigned int slot = PCI_SLOT(devfn); + unsigned int fn = PCI_FUNC(devfn); + unsigned int busno = bus->number; + void __iomem *cfg_data_p; + unsigned int data; + + /* root complex access. */ + if (busno == 0) + return pci_generic_config_read32(bus, devfn, where, size, val); + + cfg_data_p = iproc_pcie_map_ep_cfg_reg(pcie,
[PATCH v8 0/3] PCI: iproc: SOC specific fixes
PCI: iproc: Retry request when CRS returned from EP Above patch adds support for CRS in PCI RC driver, otherwise if not handled at lower level, the user space PMD (poll mode drivers) can timeout. PCI: iproc: add device shutdown for PCI RC This fixes the issue where certian PCI endpoints are not getting detected on Stingray SOC after reboot. Changes Since v7: Factor out the ep config access code. Changes Since v6: Rebased patches on top of Lorenzo's patches. Bjorn's comments addressed. now the confg retry returns 0x as data. Added reference to PCIe spec and iproc Controller spec in Changelog. Changes Since v5: Ray's comments addressed. Changes Since v4: Bjorn's comments addressed. Changes Since v3: [re-send] Changes Since v2: Fix compilation errors for pcie-iproc-platform.ko which was caught by kbuild. Oza Pawandeep (3): PCI: iproc: factor-out ep configuration access PCI: iproc: Retry request when CRS returned from EP PCI: iproc: add device shutdown for PCI RC drivers/pci/host/pcie-iproc-platform.c | 8 ++ drivers/pci/host/pcie-iproc.c | 143 ++--- drivers/pci/host/pcie-iproc.h | 1 + 3 files changed, 124 insertions(+), 28 deletions(-) -- 1.9.1
[PATCH v8 0/3] PCI: iproc: SOC specific fixes
PCI: iproc: Retry request when CRS returned from EP Above patch adds support for CRS in PCI RC driver, otherwise if not handled at lower level, the user space PMD (poll mode drivers) can timeout. PCI: iproc: add device shutdown for PCI RC This fixes the issue where certian PCI endpoints are not getting detected on Stingray SOC after reboot. Changes Since v7: Factor out the ep config access code. Changes Since v6: Rebased patches on top of Lorenzo's patches. Bjorn's comments addressed. now the confg retry returns 0x as data. Added reference to PCIe spec and iproc Controller spec in Changelog. Changes Since v5: Ray's comments addressed. Changes Since v4: Bjorn's comments addressed. Changes Since v3: [re-send] Changes Since v2: Fix compilation errors for pcie-iproc-platform.ko which was caught by kbuild. Oza Pawandeep (3): PCI: iproc: factor-out ep configuration access PCI: iproc: Retry request when CRS returned from EP PCI: iproc: add device shutdown for PCI RC drivers/pci/host/pcie-iproc-platform.c | 8 ++ drivers/pci/host/pcie-iproc.c | 143 ++--- drivers/pci/host/pcie-iproc.h | 1 + 3 files changed, 124 insertions(+), 28 deletions(-) -- 1.9.1
Re: [PATCH V1] pinctrl: qcom: spmi-gpio: Correct power_source range check
On Tue 18 Jul 23:39 PDT 2017, fengl...@codeaurora.org wrote: > From: Fenglin Wu> > Power source selection in DIG_VIN_CTL is indexed from 0, in the range > check it shouldn't be equal to the total number of power sources. > Acked-by: Bjorn Andersson Regards, Bjorn > Signed-off-by: Fenglin Wu > --- > drivers/pinctrl/qcom/pinctrl-spmi-gpio.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c > b/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c > index 664b641..8b77c04 100644 > --- a/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c > +++ b/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c > @@ -375,7 +375,7 @@ static int pmic_gpio_config_set(struct pinctrl_dev > *pctldev, unsigned int pin, > pad->is_enabled = false; > break; > case PIN_CONFIG_POWER_SOURCE: > - if (arg > pad->num_sources) > + if (arg >= pad->num_sources) > return -EINVAL; > pad->power_source = arg; > break; > -- > Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, > a Linux Foundation Collaborative Project. >
Re: [PATCH V1] pinctrl: qcom: spmi-gpio: Correct power_source range check
On Tue 18 Jul 23:39 PDT 2017, fengl...@codeaurora.org wrote: > From: Fenglin Wu > > Power source selection in DIG_VIN_CTL is indexed from 0, in the range > check it shouldn't be equal to the total number of power sources. > Acked-by: Bjorn Andersson Regards, Bjorn > Signed-off-by: Fenglin Wu > --- > drivers/pinctrl/qcom/pinctrl-spmi-gpio.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c > b/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c > index 664b641..8b77c04 100644 > --- a/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c > +++ b/drivers/pinctrl/qcom/pinctrl-spmi-gpio.c > @@ -375,7 +375,7 @@ static int pmic_gpio_config_set(struct pinctrl_dev > *pctldev, unsigned int pin, > pad->is_enabled = false; > break; > case PIN_CONFIG_POWER_SOURCE: > - if (arg > pad->num_sources) > + if (arg >= pad->num_sources) > return -EINVAL; > pad->power_source = arg; > break; > -- > Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, > a Linux Foundation Collaborative Project. >
Re: Do we really need d_weak_revalidate???
On 24/08/17 12:07, NeilBrown wrote: > > > The more precise details, that automount action for indirect automount > points is not triggered when the 'browse' option is used, is probably > not necessary. > > Ian: if you agree with that text, and Michael doesn't provide alternate > evidence, I'll submit a formal patch for the man page or should we > just wait until the patch actually lands? So far only David commented about using ENOENT rather than EREMOTE. I prefer ENOENT for this case myself and he didn't object when I explained why, David, any concerns? Al has been silent so far so either he hasn't seen it or he's ok with it, Al, any concerns? And I guess if there are no concerns there's a good chance Andrew is ok with it for the next merge window, Andrew? If everyone agrees then we could go ahead immediately so there's a better chance of getting it into released man pages closer to the change being merged. Ian
Re: Do we really need d_weak_revalidate???
On 24/08/17 12:07, NeilBrown wrote: > > > The more precise details, that automount action for indirect automount > points is not triggered when the 'browse' option is used, is probably > not necessary. > > Ian: if you agree with that text, and Michael doesn't provide alternate > evidence, I'll submit a formal patch for the man page or should we > just wait until the patch actually lands? So far only David commented about using ENOENT rather than EREMOTE. I prefer ENOENT for this case myself and he didn't object when I explained why, David, any concerns? Al has been silent so far so either he hasn't seen it or he's ok with it, Al, any concerns? And I guess if there are no concerns there's a good chance Andrew is ok with it for the next merge window, Andrew? If everyone agrees then we could go ahead immediately so there's a better chance of getting it into released man pages closer to the change being merged. Ian
[RESENT PATCH v7 5/7] ASoC: rockchip: Add support for DP codec
Add support for optional cdn dp codec. Signed-off-by: Jeffy Chen--- Changes in v7: None Changes in v6: None Changes in v3: None Changes in v2: None sound/soc/rockchip/Kconfig| 1 + sound/soc/rockchip/rk3399_gru_sound.c | 59 +-- 2 files changed, 58 insertions(+), 2 deletions(-) diff --git a/sound/soc/rockchip/Kconfig b/sound/soc/rockchip/Kconfig index c84487805876..8f0d0d8d34e6 100644 --- a/sound/soc/rockchip/Kconfig +++ b/sound/soc/rockchip/Kconfig @@ -68,6 +68,7 @@ config SND_SOC_RK3399_GRU_SOUND select SND_SOC_RT5514 select SND_SOC_DA7219 select SND_SOC_RT5514_SPI + select SND_SOC_HDMI_CODEC help Say Y or M here if you want to add support multiple codecs for SoC audio on Rockchip RK3399 GRU boards. diff --git a/sound/soc/rockchip/rk3399_gru_sound.c b/sound/soc/rockchip/rk3399_gru_sound.c index 347c34d0db4c..91aab5c18f3b 100644 --- a/sound/soc/rockchip/rk3399_gru_sound.c +++ b/sound/soc/rockchip/rk3399_gru_sound.c @@ -228,6 +228,45 @@ static int rockchip_sound_da7219_init(struct snd_soc_pcm_runtime *rtd) return 0; } +static int rockchip_sound_cdndp_hw_params(struct snd_pcm_substream *substream, + struct snd_pcm_hw_params *params) +{ + struct snd_soc_pcm_runtime *rtd = substream->private_data; + struct snd_soc_dai *cpu_dai = rtd->cpu_dai; + struct snd_soc_dai *codec_dai = rtd->codec_dai; + int mclk, ret; + + /* in bypass mode, the mclk has to be one of the frequencies below */ + switch (params_rate(params)) { + case 8000: + case 16000: + case 24000: + case 32000: + case 48000: + case 64000: + case 96000: + mclk = 12288000; + break; + case 11025: + case 22050: + case 44100: + case 88200: + mclk = 11289600; + break; + default: + return -EINVAL; + } + + ret = snd_soc_dai_set_sysclk(cpu_dai, 0, mclk, +SND_SOC_CLOCK_OUT); + if (ret < 0) { + dev_err(codec_dai->dev, "Can't set cpu clock out %d\n", ret); + return ret; + } + + return 0; +} + static const struct snd_soc_ops rockchip_sound_max98357a_ops = { .hw_params = rockchip_sound_max98357a_hw_params, }; @@ -240,6 +279,10 @@ static const struct snd_soc_ops rockchip_sound_da7219_ops = { .hw_params = rockchip_sound_da7219_hw_params, }; +static struct snd_soc_ops rockchip_sound_cdndp_ops = { + .hw_params = rockchip_sound_cdndp_hw_params, +}; + static struct snd_soc_card rockchip_sound_card = { .name = "rk3399-gru-sound", .owner = THIS_MODULE, @@ -252,6 +295,7 @@ static struct snd_soc_card rockchip_sound_card = { }; enum { + DAILINK_CDNDP, DAILINK_DA7219, DAILINK_MAX98357A, DAILINK_RT5514, @@ -259,6 +303,7 @@ enum { }; static const char * const dailink_compat[] = { + [DAILINK_CDNDP] = "rockchip,rk3399-cdn-dp", [DAILINK_DA7219] = "dlg,da7219", [DAILINK_MAX98357A] = "maxim,max98357a", [DAILINK_RT5514] = "realtek,rt5514-i2c", @@ -266,6 +311,14 @@ static const char * const dailink_compat[] = { }; static const struct snd_soc_dai_link rockchip_dais[] = { + [DAILINK_CDNDP] = { + .name = "DP", + .stream_name = "DP PCM", + .codec_dai_name = "i2s-hifi", + .ops = _sound_cdndp_ops, + .dai_fmt = SND_SOC_DAIFMT_I2S | SND_SOC_DAIFMT_NB_NF | + SND_SOC_DAIFMT_CBS_CFS, + }, [DAILINK_DA7219] = { .name = "DA7219", .stream_name = "DA7219 PCM", @@ -316,7 +369,7 @@ static int rockchip_sound_codec_node_match(struct device_node *np_codec) static int rockchip_sound_of_parse_dais(struct device *dev, struct snd_soc_card *card) { - struct device_node *np_cpu; + struct device_node *np_cpu, *np_cpu0, *np_cpu1; struct device_node *np_codec; struct snd_soc_dai_link *dai; int i, index; @@ -326,7 +379,8 @@ static int rockchip_sound_of_parse_dais(struct device *dev, if (!card->dai_link) return -ENOMEM; - np_cpu = of_parse_phandle(dev->of_node, "rockchip,cpu", 0); + np_cpu0 = of_parse_phandle(dev->of_node, "rockchip,cpu", 0); + np_cpu1 = of_parse_phandle(dev->of_node, "rockchip,cpu", 1); card->num_links = 0; for (i = 0; i < ARRAY_SIZE(rockchip_dais); i++) { @@ -342,6 +396,7 @@ static int rockchip_sound_of_parse_dais(struct device *dev, if (index < 0) continue; + np_cpu = (index == DAILINK_CDNDP) ? np_cpu1 : np_cpu0; if (!np_cpu) { dev_err(dev, "Missing
[RESENT PATCH v7 5/7] ASoC: rockchip: Add support for DP codec
Add support for optional cdn dp codec. Signed-off-by: Jeffy Chen --- Changes in v7: None Changes in v6: None Changes in v3: None Changes in v2: None sound/soc/rockchip/Kconfig| 1 + sound/soc/rockchip/rk3399_gru_sound.c | 59 +-- 2 files changed, 58 insertions(+), 2 deletions(-) diff --git a/sound/soc/rockchip/Kconfig b/sound/soc/rockchip/Kconfig index c84487805876..8f0d0d8d34e6 100644 --- a/sound/soc/rockchip/Kconfig +++ b/sound/soc/rockchip/Kconfig @@ -68,6 +68,7 @@ config SND_SOC_RK3399_GRU_SOUND select SND_SOC_RT5514 select SND_SOC_DA7219 select SND_SOC_RT5514_SPI + select SND_SOC_HDMI_CODEC help Say Y or M here if you want to add support multiple codecs for SoC audio on Rockchip RK3399 GRU boards. diff --git a/sound/soc/rockchip/rk3399_gru_sound.c b/sound/soc/rockchip/rk3399_gru_sound.c index 347c34d0db4c..91aab5c18f3b 100644 --- a/sound/soc/rockchip/rk3399_gru_sound.c +++ b/sound/soc/rockchip/rk3399_gru_sound.c @@ -228,6 +228,45 @@ static int rockchip_sound_da7219_init(struct snd_soc_pcm_runtime *rtd) return 0; } +static int rockchip_sound_cdndp_hw_params(struct snd_pcm_substream *substream, + struct snd_pcm_hw_params *params) +{ + struct snd_soc_pcm_runtime *rtd = substream->private_data; + struct snd_soc_dai *cpu_dai = rtd->cpu_dai; + struct snd_soc_dai *codec_dai = rtd->codec_dai; + int mclk, ret; + + /* in bypass mode, the mclk has to be one of the frequencies below */ + switch (params_rate(params)) { + case 8000: + case 16000: + case 24000: + case 32000: + case 48000: + case 64000: + case 96000: + mclk = 12288000; + break; + case 11025: + case 22050: + case 44100: + case 88200: + mclk = 11289600; + break; + default: + return -EINVAL; + } + + ret = snd_soc_dai_set_sysclk(cpu_dai, 0, mclk, +SND_SOC_CLOCK_OUT); + if (ret < 0) { + dev_err(codec_dai->dev, "Can't set cpu clock out %d\n", ret); + return ret; + } + + return 0; +} + static const struct snd_soc_ops rockchip_sound_max98357a_ops = { .hw_params = rockchip_sound_max98357a_hw_params, }; @@ -240,6 +279,10 @@ static const struct snd_soc_ops rockchip_sound_da7219_ops = { .hw_params = rockchip_sound_da7219_hw_params, }; +static struct snd_soc_ops rockchip_sound_cdndp_ops = { + .hw_params = rockchip_sound_cdndp_hw_params, +}; + static struct snd_soc_card rockchip_sound_card = { .name = "rk3399-gru-sound", .owner = THIS_MODULE, @@ -252,6 +295,7 @@ static struct snd_soc_card rockchip_sound_card = { }; enum { + DAILINK_CDNDP, DAILINK_DA7219, DAILINK_MAX98357A, DAILINK_RT5514, @@ -259,6 +303,7 @@ enum { }; static const char * const dailink_compat[] = { + [DAILINK_CDNDP] = "rockchip,rk3399-cdn-dp", [DAILINK_DA7219] = "dlg,da7219", [DAILINK_MAX98357A] = "maxim,max98357a", [DAILINK_RT5514] = "realtek,rt5514-i2c", @@ -266,6 +311,14 @@ static const char * const dailink_compat[] = { }; static const struct snd_soc_dai_link rockchip_dais[] = { + [DAILINK_CDNDP] = { + .name = "DP", + .stream_name = "DP PCM", + .codec_dai_name = "i2s-hifi", + .ops = _sound_cdndp_ops, + .dai_fmt = SND_SOC_DAIFMT_I2S | SND_SOC_DAIFMT_NB_NF | + SND_SOC_DAIFMT_CBS_CFS, + }, [DAILINK_DA7219] = { .name = "DA7219", .stream_name = "DA7219 PCM", @@ -316,7 +369,7 @@ static int rockchip_sound_codec_node_match(struct device_node *np_codec) static int rockchip_sound_of_parse_dais(struct device *dev, struct snd_soc_card *card) { - struct device_node *np_cpu; + struct device_node *np_cpu, *np_cpu0, *np_cpu1; struct device_node *np_codec; struct snd_soc_dai_link *dai; int i, index; @@ -326,7 +379,8 @@ static int rockchip_sound_of_parse_dais(struct device *dev, if (!card->dai_link) return -ENOMEM; - np_cpu = of_parse_phandle(dev->of_node, "rockchip,cpu", 0); + np_cpu0 = of_parse_phandle(dev->of_node, "rockchip,cpu", 0); + np_cpu1 = of_parse_phandle(dev->of_node, "rockchip,cpu", 1); card->num_links = 0; for (i = 0; i < ARRAY_SIZE(rockchip_dais); i++) { @@ -342,6 +396,7 @@ static int rockchip_sound_of_parse_dais(struct device *dev, if (index < 0) continue; + np_cpu = (index == DAILINK_CDNDP) ? np_cpu1 : np_cpu0; if (!np_cpu) { dev_err(dev, "Missing 'rockchip,cpu' for %s\n",
[RESENT PATCH v7 7/7] dt-bindings: ASoC: rockchip: Update description of rockchip,codec
Update description for newly added optional audio codecs. Signed-off-by: Jeffy ChenAcked-by: Rob Herring --- Changes in v7: None Changes in v6: None Changes in v3: None Changes in v2: None Documentation/devicetree/bindings/sound/rockchip,rk3399-gru-sound.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/devicetree/bindings/sound/rockchip,rk3399-gru-sound.txt b/Documentation/devicetree/bindings/sound/rockchip,rk3399-gru-sound.txt index eac91db07178..72d3cf4c2606 100644 --- a/Documentation/devicetree/bindings/sound/rockchip,rk3399-gru-sound.txt +++ b/Documentation/devicetree/bindings/sound/rockchip,rk3399-gru-sound.txt @@ -4,7 +4,7 @@ Required properties: - compatible: "rockchip,rk3399-gru-sound" - rockchip,cpu: The phandle of the Rockchip I2S controller that's connected to the codecs -- rockchip,codec: The phandle of the MAX98357A/RT5514/DA7219 codecs +- rockchip,codec: The phandle of the audio codecs Optional properties: - dmic-wakeup-delay-ms : specify delay time (ms) for DMIC ready. -- 2.11.0
[RESENT PATCH v7 4/7] ASoC: rockchip: Parse dai links from dts
Refactor rockchip_sound_probe, parse dai links from dts instead of hard coding them. Signed-off-by: Jeffy ChenReviewed-by: Matthias Kaehlcke Tested-by: Matthias Kaehlcke --- Changes in v7: None Changes in v6: None Changes in v3: Use compatible to match audio codecs -- Suggested-by Matthias Kaehlcke Changes in v2: Let rockchip,codec-names be a required property, because we plan to add more supported codecs to the fixed dai link list in the driver. sound/soc/rockchip/rk3399_gru_sound.c | 140 ++ 1 file changed, 91 insertions(+), 49 deletions(-) diff --git a/sound/soc/rockchip/rk3399_gru_sound.c b/sound/soc/rockchip/rk3399_gru_sound.c index cd2fdba922f1..347c34d0db4c 100644 --- a/sound/soc/rockchip/rk3399_gru_sound.c +++ b/sound/soc/rockchip/rk3399_gru_sound.c @@ -240,14 +240,42 @@ static const struct snd_soc_ops rockchip_sound_da7219_ops = { .hw_params = rockchip_sound_da7219_hw_params, }; +static struct snd_soc_card rockchip_sound_card = { + .name = "rk3399-gru-sound", + .owner = THIS_MODULE, + .dapm_widgets = rockchip_dapm_widgets, + .num_dapm_widgets = ARRAY_SIZE(rockchip_dapm_widgets), + .dapm_routes = rockchip_dapm_routes, + .num_dapm_routes = ARRAY_SIZE(rockchip_dapm_routes), + .controls = rockchip_controls, + .num_controls = ARRAY_SIZE(rockchip_controls), +}; + enum { + DAILINK_DA7219, DAILINK_MAX98357A, DAILINK_RT5514, - DAILINK_DA7219, DAILINK_RT5514_DSP, }; -static struct snd_soc_dai_link rockchip_dailinks[] = { +static const char * const dailink_compat[] = { + [DAILINK_DA7219] = "dlg,da7219", + [DAILINK_MAX98357A] = "maxim,max98357a", + [DAILINK_RT5514] = "realtek,rt5514-i2c", + [DAILINK_RT5514_DSP] = "realtek,rt5514-spi", +}; + +static const struct snd_soc_dai_link rockchip_dais[] = { + [DAILINK_DA7219] = { + .name = "DA7219", + .stream_name = "DA7219 PCM", + .codec_dai_name = "da7219-hifi", + .init = rockchip_sound_da7219_init, + .ops = _sound_da7219_ops, + /* set da7219 as slave */ + .dai_fmt = SND_SOC_DAIFMT_I2S | SND_SOC_DAIFMT_NB_NF | + SND_SOC_DAIFMT_CBS_CFS, + }, [DAILINK_MAX98357A] = { .name = "MAX98357A", .stream_name = "MAX98357A PCM", @@ -266,16 +294,6 @@ static struct snd_soc_dai_link rockchip_dailinks[] = { .dai_fmt = SND_SOC_DAIFMT_I2S | SND_SOC_DAIFMT_NB_NF | SND_SOC_DAIFMT_CBS_CFS, }, - [DAILINK_DA7219] = { - .name = "DA7219", - .stream_name = "DA7219 PCM", - .codec_dai_name = "da7219-hifi", - .init = rockchip_sound_da7219_init, - .ops = _sound_da7219_ops, - /* set da7219 as slave */ - .dai_fmt = SND_SOC_DAIFMT_I2S | SND_SOC_DAIFMT_NB_NF | - SND_SOC_DAIFMT_CBS_CFS, - }, /* RT5514 DSP for voice wakeup via spi bus */ [DAILINK_RT5514_DSP] = { .name = "RT5514 DSP", @@ -284,42 +302,72 @@ static struct snd_soc_dai_link rockchip_dailinks[] = { }, }; -static struct snd_soc_card rockchip_sound_card = { - .name = "rk3399-gru-sound", - .owner = THIS_MODULE, - .dai_link = rockchip_dailinks, - .num_links = ARRAY_SIZE(rockchip_dailinks), - .dapm_widgets = rockchip_dapm_widgets, - .num_dapm_widgets = ARRAY_SIZE(rockchip_dapm_widgets), - .dapm_routes = rockchip_dapm_routes, - .num_dapm_routes = ARRAY_SIZE(rockchip_dapm_routes), - .controls = rockchip_controls, - .num_controls = ARRAY_SIZE(rockchip_controls), -}; - -static int rockchip_sound_probe(struct platform_device *pdev) +static int rockchip_sound_codec_node_match(struct device_node *np_codec) { - struct snd_soc_card *card = _sound_card; - struct device_node *cpu_node; - int i, ret; + int i; - cpu_node = of_parse_phandle(pdev->dev.of_node, "rockchip,cpu", 0); - if (!cpu_node) { - dev_err(>dev, "Property 'rockchip,cpu' missing or invalid\n"); - return -EINVAL; + for (i = 0; i < ARRAY_SIZE(dailink_compat); i++) { + if (of_device_is_compatible(np_codec, dailink_compat[i])) + return i; } + return -1; +} - for (i = 0; i < ARRAY_SIZE(rockchip_dailinks); i++) { - rockchip_dailinks[i].platform_of_node = cpu_node; - rockchip_dailinks[i].cpu_of_node = cpu_node; - - rockchip_dailinks[i].codec_of_node = - of_parse_phandle(pdev->dev.of_node, "rockchip,codec", i); - if (!rockchip_dailinks[i].codec_of_node) { - dev_err(>dev,
[RESENT PATCH v7 7/7] dt-bindings: ASoC: rockchip: Update description of rockchip,codec
Update description for newly added optional audio codecs. Signed-off-by: Jeffy Chen Acked-by: Rob Herring --- Changes in v7: None Changes in v6: None Changes in v3: None Changes in v2: None Documentation/devicetree/bindings/sound/rockchip,rk3399-gru-sound.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/devicetree/bindings/sound/rockchip,rk3399-gru-sound.txt b/Documentation/devicetree/bindings/sound/rockchip,rk3399-gru-sound.txt index eac91db07178..72d3cf4c2606 100644 --- a/Documentation/devicetree/bindings/sound/rockchip,rk3399-gru-sound.txt +++ b/Documentation/devicetree/bindings/sound/rockchip,rk3399-gru-sound.txt @@ -4,7 +4,7 @@ Required properties: - compatible: "rockchip,rk3399-gru-sound" - rockchip,cpu: The phandle of the Rockchip I2S controller that's connected to the codecs -- rockchip,codec: The phandle of the MAX98357A/RT5514/DA7219 codecs +- rockchip,codec: The phandle of the audio codecs Optional properties: - dmic-wakeup-delay-ms : specify delay time (ms) for DMIC ready. -- 2.11.0
[RESENT PATCH v7 4/7] ASoC: rockchip: Parse dai links from dts
Refactor rockchip_sound_probe, parse dai links from dts instead of hard coding them. Signed-off-by: Jeffy Chen Reviewed-by: Matthias Kaehlcke Tested-by: Matthias Kaehlcke --- Changes in v7: None Changes in v6: None Changes in v3: Use compatible to match audio codecs -- Suggested-by Matthias Kaehlcke Changes in v2: Let rockchip,codec-names be a required property, because we plan to add more supported codecs to the fixed dai link list in the driver. sound/soc/rockchip/rk3399_gru_sound.c | 140 ++ 1 file changed, 91 insertions(+), 49 deletions(-) diff --git a/sound/soc/rockchip/rk3399_gru_sound.c b/sound/soc/rockchip/rk3399_gru_sound.c index cd2fdba922f1..347c34d0db4c 100644 --- a/sound/soc/rockchip/rk3399_gru_sound.c +++ b/sound/soc/rockchip/rk3399_gru_sound.c @@ -240,14 +240,42 @@ static const struct snd_soc_ops rockchip_sound_da7219_ops = { .hw_params = rockchip_sound_da7219_hw_params, }; +static struct snd_soc_card rockchip_sound_card = { + .name = "rk3399-gru-sound", + .owner = THIS_MODULE, + .dapm_widgets = rockchip_dapm_widgets, + .num_dapm_widgets = ARRAY_SIZE(rockchip_dapm_widgets), + .dapm_routes = rockchip_dapm_routes, + .num_dapm_routes = ARRAY_SIZE(rockchip_dapm_routes), + .controls = rockchip_controls, + .num_controls = ARRAY_SIZE(rockchip_controls), +}; + enum { + DAILINK_DA7219, DAILINK_MAX98357A, DAILINK_RT5514, - DAILINK_DA7219, DAILINK_RT5514_DSP, }; -static struct snd_soc_dai_link rockchip_dailinks[] = { +static const char * const dailink_compat[] = { + [DAILINK_DA7219] = "dlg,da7219", + [DAILINK_MAX98357A] = "maxim,max98357a", + [DAILINK_RT5514] = "realtek,rt5514-i2c", + [DAILINK_RT5514_DSP] = "realtek,rt5514-spi", +}; + +static const struct snd_soc_dai_link rockchip_dais[] = { + [DAILINK_DA7219] = { + .name = "DA7219", + .stream_name = "DA7219 PCM", + .codec_dai_name = "da7219-hifi", + .init = rockchip_sound_da7219_init, + .ops = _sound_da7219_ops, + /* set da7219 as slave */ + .dai_fmt = SND_SOC_DAIFMT_I2S | SND_SOC_DAIFMT_NB_NF | + SND_SOC_DAIFMT_CBS_CFS, + }, [DAILINK_MAX98357A] = { .name = "MAX98357A", .stream_name = "MAX98357A PCM", @@ -266,16 +294,6 @@ static struct snd_soc_dai_link rockchip_dailinks[] = { .dai_fmt = SND_SOC_DAIFMT_I2S | SND_SOC_DAIFMT_NB_NF | SND_SOC_DAIFMT_CBS_CFS, }, - [DAILINK_DA7219] = { - .name = "DA7219", - .stream_name = "DA7219 PCM", - .codec_dai_name = "da7219-hifi", - .init = rockchip_sound_da7219_init, - .ops = _sound_da7219_ops, - /* set da7219 as slave */ - .dai_fmt = SND_SOC_DAIFMT_I2S | SND_SOC_DAIFMT_NB_NF | - SND_SOC_DAIFMT_CBS_CFS, - }, /* RT5514 DSP for voice wakeup via spi bus */ [DAILINK_RT5514_DSP] = { .name = "RT5514 DSP", @@ -284,42 +302,72 @@ static struct snd_soc_dai_link rockchip_dailinks[] = { }, }; -static struct snd_soc_card rockchip_sound_card = { - .name = "rk3399-gru-sound", - .owner = THIS_MODULE, - .dai_link = rockchip_dailinks, - .num_links = ARRAY_SIZE(rockchip_dailinks), - .dapm_widgets = rockchip_dapm_widgets, - .num_dapm_widgets = ARRAY_SIZE(rockchip_dapm_widgets), - .dapm_routes = rockchip_dapm_routes, - .num_dapm_routes = ARRAY_SIZE(rockchip_dapm_routes), - .controls = rockchip_controls, - .num_controls = ARRAY_SIZE(rockchip_controls), -}; - -static int rockchip_sound_probe(struct platform_device *pdev) +static int rockchip_sound_codec_node_match(struct device_node *np_codec) { - struct snd_soc_card *card = _sound_card; - struct device_node *cpu_node; - int i, ret; + int i; - cpu_node = of_parse_phandle(pdev->dev.of_node, "rockchip,cpu", 0); - if (!cpu_node) { - dev_err(>dev, "Property 'rockchip,cpu' missing or invalid\n"); - return -EINVAL; + for (i = 0; i < ARRAY_SIZE(dailink_compat); i++) { + if (of_device_is_compatible(np_codec, dailink_compat[i])) + return i; } + return -1; +} - for (i = 0; i < ARRAY_SIZE(rockchip_dailinks); i++) { - rockchip_dailinks[i].platform_of_node = cpu_node; - rockchip_dailinks[i].cpu_of_node = cpu_node; - - rockchip_dailinks[i].codec_of_node = - of_parse_phandle(pdev->dev.of_node, "rockchip,codec", i); - if (!rockchip_dailinks[i].codec_of_node) { - dev_err(>dev, - "Property[%d] 'rockchip,codec' missing or
[RESENT PATCH v7 0/7] ASoC: rockchip: Parse dai links from dts
Currently we are using a fixed list of dai links in the driver. This serial of patches would let the driver parse dai links from dts, so that we can make some of them optional for future boards. Tested on my chromebook bob(with cros 4.4 kernel), it still works after disabled rt5514 codecs in the dts. Changes in v7: Rebase on the newest for-next Changes in v6: Add dmic wakeup delay(not used for now). Changes in v3: Use compatible to match audio codecs -- Suggested-by Matthias KaehlckeChanges in v2: Let rockchip,codec-names be a required property, because we plan to add more supported codecs to the fixed dai link list in the driver. Jeffy Chen (7): ASoC: rockchip: Use codec of_node and dai_name for rt5514 dsp arm64: dts: rockchip: Add rt5514 dsp for Gru arm64: dts: rockchip: Update rt5514 devices' compatible for Gru ASoC: rockchip: Parse dai links from dts ASoC: rockchip: Add support for DP codec ASoC: rockchip: Add support for DMIC codec dt-bindings: ASoC: rockchip: Update description of rockchip,codec .../bindings/sound/rockchip,rk3399-gru-sound.txt | 2 +- arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi | 7 +- sound/soc/rockchip/Kconfig | 2 + sound/soc/rockchip/rk3399_gru_sound.c | 255 +++-- 4 files changed, 186 insertions(+), 80 deletions(-) -- 2.11.0
[RESENT PATCH v7 6/7] ASoC: rockchip: Add support for DMIC codec
Add support for optional dmic codec. Signed-off-by: Jeffy Chen--- Changes in v7: None Changes in v6: Add dmic wakeup delay(not used for now). Changes in v3: None Changes in v2: None sound/soc/rockchip/Kconfig| 1 + sound/soc/rockchip/rk3399_gru_sound.c | 36 +++ 2 files changed, 37 insertions(+) diff --git a/sound/soc/rockchip/Kconfig b/sound/soc/rockchip/Kconfig index 8f0d0d8d34e6..b0825370d262 100644 --- a/sound/soc/rockchip/Kconfig +++ b/sound/soc/rockchip/Kconfig @@ -69,6 +69,7 @@ config SND_SOC_RK3399_GRU_SOUND select SND_SOC_DA7219 select SND_SOC_RT5514_SPI select SND_SOC_HDMI_CODEC + select SND_SOC_DMIC help Say Y or M here if you want to add support multiple codecs for SoC audio on Rockchip RK3399 GRU boards. diff --git a/sound/soc/rockchip/rk3399_gru_sound.c b/sound/soc/rockchip/rk3399_gru_sound.c index 91aab5c18f3b..5ab25962cabd 100644 --- a/sound/soc/rockchip/rk3399_gru_sound.c +++ b/sound/soc/rockchip/rk3399_gru_sound.c @@ -267,6 +267,28 @@ static int rockchip_sound_cdndp_hw_params(struct snd_pcm_substream *substream, return 0; } +static int rockchip_sound_dmic_hw_params(struct snd_pcm_substream *substream, +struct snd_pcm_hw_params *params) +{ + struct snd_soc_pcm_runtime *rtd = substream->private_data; + unsigned int mclk; + int ret; + + mclk = params_rate(params) * SOUND_FS; + + ret = snd_soc_dai_set_sysclk(rtd->cpu_dai, 0, mclk, 0); + if (ret) { + dev_err(rtd->card->dev, "%s() error setting sysclk to %u: %d\n", + __func__, mclk, ret); + return ret; + } + + /* Wait for DMIC stable */ + msleep(dmic_wakeup_delay); + + return 0; +} + static const struct snd_soc_ops rockchip_sound_max98357a_ops = { .hw_params = rockchip_sound_max98357a_hw_params, }; @@ -283,6 +305,10 @@ static struct snd_soc_ops rockchip_sound_cdndp_ops = { .hw_params = rockchip_sound_cdndp_hw_params, }; +static struct snd_soc_ops rockchip_sound_dmic_ops = { + .hw_params = rockchip_sound_dmic_hw_params, +}; + static struct snd_soc_card rockchip_sound_card = { .name = "rk3399-gru-sound", .owner = THIS_MODULE, @@ -297,6 +323,7 @@ static struct snd_soc_card rockchip_sound_card = { enum { DAILINK_CDNDP, DAILINK_DA7219, + DAILINK_DMIC, DAILINK_MAX98357A, DAILINK_RT5514, DAILINK_RT5514_DSP, @@ -305,6 +332,7 @@ enum { static const char * const dailink_compat[] = { [DAILINK_CDNDP] = "rockchip,rk3399-cdn-dp", [DAILINK_DA7219] = "dlg,da7219", + [DAILINK_DMIC] = "dmic-codec", [DAILINK_MAX98357A] = "maxim,max98357a", [DAILINK_RT5514] = "realtek,rt5514-i2c", [DAILINK_RT5514_DSP] = "realtek,rt5514-spi", @@ -329,6 +357,14 @@ static const struct snd_soc_dai_link rockchip_dais[] = { .dai_fmt = SND_SOC_DAIFMT_I2S | SND_SOC_DAIFMT_NB_NF | SND_SOC_DAIFMT_CBS_CFS, }, + [DAILINK_DMIC] = { + .name = "DMIC", + .stream_name = "DMIC PCM", + .codec_dai_name = "dmic-hifi", + .ops = _sound_dmic_ops, + .dai_fmt = SND_SOC_DAIFMT_I2S | SND_SOC_DAIFMT_NB_NF | + SND_SOC_DAIFMT_CBS_CFS, + }, [DAILINK_MAX98357A] = { .name = "MAX98357A", .stream_name = "MAX98357A PCM", -- 2.11.0
[RESENT PATCH v7 0/7] ASoC: rockchip: Parse dai links from dts
Currently we are using a fixed list of dai links in the driver. This serial of patches would let the driver parse dai links from dts, so that we can make some of them optional for future boards. Tested on my chromebook bob(with cros 4.4 kernel), it still works after disabled rt5514 codecs in the dts. Changes in v7: Rebase on the newest for-next Changes in v6: Add dmic wakeup delay(not used for now). Changes in v3: Use compatible to match audio codecs -- Suggested-by Matthias Kaehlcke Changes in v2: Let rockchip,codec-names be a required property, because we plan to add more supported codecs to the fixed dai link list in the driver. Jeffy Chen (7): ASoC: rockchip: Use codec of_node and dai_name for rt5514 dsp arm64: dts: rockchip: Add rt5514 dsp for Gru arm64: dts: rockchip: Update rt5514 devices' compatible for Gru ASoC: rockchip: Parse dai links from dts ASoC: rockchip: Add support for DP codec ASoC: rockchip: Add support for DMIC codec dt-bindings: ASoC: rockchip: Update description of rockchip,codec .../bindings/sound/rockchip,rk3399-gru-sound.txt | 2 +- arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi | 7 +- sound/soc/rockchip/Kconfig | 2 + sound/soc/rockchip/rk3399_gru_sound.c | 255 +++-- 4 files changed, 186 insertions(+), 80 deletions(-) -- 2.11.0
[RESENT PATCH v7 6/7] ASoC: rockchip: Add support for DMIC codec
Add support for optional dmic codec. Signed-off-by: Jeffy Chen --- Changes in v7: None Changes in v6: Add dmic wakeup delay(not used for now). Changes in v3: None Changes in v2: None sound/soc/rockchip/Kconfig| 1 + sound/soc/rockchip/rk3399_gru_sound.c | 36 +++ 2 files changed, 37 insertions(+) diff --git a/sound/soc/rockchip/Kconfig b/sound/soc/rockchip/Kconfig index 8f0d0d8d34e6..b0825370d262 100644 --- a/sound/soc/rockchip/Kconfig +++ b/sound/soc/rockchip/Kconfig @@ -69,6 +69,7 @@ config SND_SOC_RK3399_GRU_SOUND select SND_SOC_DA7219 select SND_SOC_RT5514_SPI select SND_SOC_HDMI_CODEC + select SND_SOC_DMIC help Say Y or M here if you want to add support multiple codecs for SoC audio on Rockchip RK3399 GRU boards. diff --git a/sound/soc/rockchip/rk3399_gru_sound.c b/sound/soc/rockchip/rk3399_gru_sound.c index 91aab5c18f3b..5ab25962cabd 100644 --- a/sound/soc/rockchip/rk3399_gru_sound.c +++ b/sound/soc/rockchip/rk3399_gru_sound.c @@ -267,6 +267,28 @@ static int rockchip_sound_cdndp_hw_params(struct snd_pcm_substream *substream, return 0; } +static int rockchip_sound_dmic_hw_params(struct snd_pcm_substream *substream, +struct snd_pcm_hw_params *params) +{ + struct snd_soc_pcm_runtime *rtd = substream->private_data; + unsigned int mclk; + int ret; + + mclk = params_rate(params) * SOUND_FS; + + ret = snd_soc_dai_set_sysclk(rtd->cpu_dai, 0, mclk, 0); + if (ret) { + dev_err(rtd->card->dev, "%s() error setting sysclk to %u: %d\n", + __func__, mclk, ret); + return ret; + } + + /* Wait for DMIC stable */ + msleep(dmic_wakeup_delay); + + return 0; +} + static const struct snd_soc_ops rockchip_sound_max98357a_ops = { .hw_params = rockchip_sound_max98357a_hw_params, }; @@ -283,6 +305,10 @@ static struct snd_soc_ops rockchip_sound_cdndp_ops = { .hw_params = rockchip_sound_cdndp_hw_params, }; +static struct snd_soc_ops rockchip_sound_dmic_ops = { + .hw_params = rockchip_sound_dmic_hw_params, +}; + static struct snd_soc_card rockchip_sound_card = { .name = "rk3399-gru-sound", .owner = THIS_MODULE, @@ -297,6 +323,7 @@ static struct snd_soc_card rockchip_sound_card = { enum { DAILINK_CDNDP, DAILINK_DA7219, + DAILINK_DMIC, DAILINK_MAX98357A, DAILINK_RT5514, DAILINK_RT5514_DSP, @@ -305,6 +332,7 @@ enum { static const char * const dailink_compat[] = { [DAILINK_CDNDP] = "rockchip,rk3399-cdn-dp", [DAILINK_DA7219] = "dlg,da7219", + [DAILINK_DMIC] = "dmic-codec", [DAILINK_MAX98357A] = "maxim,max98357a", [DAILINK_RT5514] = "realtek,rt5514-i2c", [DAILINK_RT5514_DSP] = "realtek,rt5514-spi", @@ -329,6 +357,14 @@ static const struct snd_soc_dai_link rockchip_dais[] = { .dai_fmt = SND_SOC_DAIFMT_I2S | SND_SOC_DAIFMT_NB_NF | SND_SOC_DAIFMT_CBS_CFS, }, + [DAILINK_DMIC] = { + .name = "DMIC", + .stream_name = "DMIC PCM", + .codec_dai_name = "dmic-hifi", + .ops = _sound_dmic_ops, + .dai_fmt = SND_SOC_DAIFMT_I2S | SND_SOC_DAIFMT_NB_NF | + SND_SOC_DAIFMT_CBS_CFS, + }, [DAILINK_MAX98357A] = { .name = "MAX98357A", .stream_name = "MAX98357A PCM", -- 2.11.0
[RESENT PATCH v7 1/7] ASoC: rockchip: Use codec of_node and dai_name for rt5514 dsp
Currently we are using codec name for rt5514 dsp dai link, use codec of_node instead. Signed-off-by: Jeffy Chen--- Changes in v7: Rebase on the newest for-next Changes in v6: None Changes in v3: None Changes in v2: None sound/soc/rockchip/rk3399_gru_sound.c | 34 ++ 1 file changed, 2 insertions(+), 32 deletions(-) diff --git a/sound/soc/rockchip/rk3399_gru_sound.c b/sound/soc/rockchip/rk3399_gru_sound.c index 566ccb39fb31..cd2fdba922f1 100644 --- a/sound/soc/rockchip/rk3399_gru_sound.c +++ b/sound/soc/rockchip/rk3399_gru_sound.c @@ -247,8 +247,6 @@ enum { DAILINK_RT5514_DSP, }; -#define DAILINK_ENTITIES (DAILINK_DA7219 + 1) - static struct snd_soc_dai_link rockchip_dailinks[] = { [DAILINK_MAX98357A] = { .name = "MAX98357A", @@ -282,8 +280,7 @@ static struct snd_soc_dai_link rockchip_dailinks[] = { [DAILINK_RT5514_DSP] = { .name = "RT5514 DSP", .stream_name = "Wake on Voice", - .codec_name = "snd-soc-dummy", - .codec_dai_name = "snd-soc-dummy-dai", + .codec_dai_name = "rt5514-dsp-cpu-dai", }, }; @@ -300,17 +297,10 @@ static struct snd_soc_card rockchip_sound_card = { .num_controls = ARRAY_SIZE(rockchip_controls), }; -static int rockchip_sound_match_stub(struct device *dev, void *data) -{ - return 1; -} - static int rockchip_sound_probe(struct platform_device *pdev) { struct snd_soc_card *card = _sound_card; struct device_node *cpu_node; - struct device *dev; - struct device_driver *drv; int i, ret; cpu_node = of_parse_phandle(pdev->dev.of_node, "rockchip,cpu", 0); @@ -319,7 +309,7 @@ static int rockchip_sound_probe(struct platform_device *pdev) return -EINVAL; } - for (i = 0; i < DAILINK_ENTITIES; i++) { + for (i = 0; i < ARRAY_SIZE(rockchip_dailinks); i++) { rockchip_dailinks[i].platform_of_node = cpu_node; rockchip_dailinks[i].cpu_of_node = cpu_node; @@ -332,22 +322,6 @@ static int rockchip_sound_probe(struct platform_device *pdev) } } - /** -* To acquire the spi driver of the rt5514 and set the dai-links names -* for soc_bind_dai_link -*/ - drv = driver_find("rt5514", _bus_type); - if (!drv) { - dev_err(>dev, "Can not find the rt5514 driver at the spi bus\n"); - return -EINVAL; - } - - dev = driver_find_device(drv, NULL, NULL, rockchip_sound_match_stub); - if (!dev) { - dev_err(>dev, "Can not find the rt5514 device\n"); - return -ENODEV; - } - /* Set DMIC wakeup delay */ ret = device_property_read_u32(>dev, "dmic-wakeup-delay-ms", _wakeup_delay); @@ -357,10 +331,6 @@ static int rockchip_sound_probe(struct platform_device *pdev) "no optional property 'dmic-wakeup-delay-ms' found, default: no delay\n"); } - rockchip_dailinks[DAILINK_RT5514_DSP].cpu_name = kstrdup_const(dev_name(dev), GFP_KERNEL); - rockchip_dailinks[DAILINK_RT5514_DSP].cpu_dai_name = kstrdup_const(dev_name(dev), GFP_KERNEL); - rockchip_dailinks[DAILINK_RT5514_DSP].platform_name = kstrdup_const(dev_name(dev), GFP_KERNEL); - card->dev = >dev; ret = devm_snd_soc_register_card(>dev, card); -- 2.11.0
[RESENT PATCH v7 1/7] ASoC: rockchip: Use codec of_node and dai_name for rt5514 dsp
Currently we are using codec name for rt5514 dsp dai link, use codec of_node instead. Signed-off-by: Jeffy Chen --- Changes in v7: Rebase on the newest for-next Changes in v6: None Changes in v3: None Changes in v2: None sound/soc/rockchip/rk3399_gru_sound.c | 34 ++ 1 file changed, 2 insertions(+), 32 deletions(-) diff --git a/sound/soc/rockchip/rk3399_gru_sound.c b/sound/soc/rockchip/rk3399_gru_sound.c index 566ccb39fb31..cd2fdba922f1 100644 --- a/sound/soc/rockchip/rk3399_gru_sound.c +++ b/sound/soc/rockchip/rk3399_gru_sound.c @@ -247,8 +247,6 @@ enum { DAILINK_RT5514_DSP, }; -#define DAILINK_ENTITIES (DAILINK_DA7219 + 1) - static struct snd_soc_dai_link rockchip_dailinks[] = { [DAILINK_MAX98357A] = { .name = "MAX98357A", @@ -282,8 +280,7 @@ static struct snd_soc_dai_link rockchip_dailinks[] = { [DAILINK_RT5514_DSP] = { .name = "RT5514 DSP", .stream_name = "Wake on Voice", - .codec_name = "snd-soc-dummy", - .codec_dai_name = "snd-soc-dummy-dai", + .codec_dai_name = "rt5514-dsp-cpu-dai", }, }; @@ -300,17 +297,10 @@ static struct snd_soc_card rockchip_sound_card = { .num_controls = ARRAY_SIZE(rockchip_controls), }; -static int rockchip_sound_match_stub(struct device *dev, void *data) -{ - return 1; -} - static int rockchip_sound_probe(struct platform_device *pdev) { struct snd_soc_card *card = _sound_card; struct device_node *cpu_node; - struct device *dev; - struct device_driver *drv; int i, ret; cpu_node = of_parse_phandle(pdev->dev.of_node, "rockchip,cpu", 0); @@ -319,7 +309,7 @@ static int rockchip_sound_probe(struct platform_device *pdev) return -EINVAL; } - for (i = 0; i < DAILINK_ENTITIES; i++) { + for (i = 0; i < ARRAY_SIZE(rockchip_dailinks); i++) { rockchip_dailinks[i].platform_of_node = cpu_node; rockchip_dailinks[i].cpu_of_node = cpu_node; @@ -332,22 +322,6 @@ static int rockchip_sound_probe(struct platform_device *pdev) } } - /** -* To acquire the spi driver of the rt5514 and set the dai-links names -* for soc_bind_dai_link -*/ - drv = driver_find("rt5514", _bus_type); - if (!drv) { - dev_err(>dev, "Can not find the rt5514 driver at the spi bus\n"); - return -EINVAL; - } - - dev = driver_find_device(drv, NULL, NULL, rockchip_sound_match_stub); - if (!dev) { - dev_err(>dev, "Can not find the rt5514 device\n"); - return -ENODEV; - } - /* Set DMIC wakeup delay */ ret = device_property_read_u32(>dev, "dmic-wakeup-delay-ms", _wakeup_delay); @@ -357,10 +331,6 @@ static int rockchip_sound_probe(struct platform_device *pdev) "no optional property 'dmic-wakeup-delay-ms' found, default: no delay\n"); } - rockchip_dailinks[DAILINK_RT5514_DSP].cpu_name = kstrdup_const(dev_name(dev), GFP_KERNEL); - rockchip_dailinks[DAILINK_RT5514_DSP].cpu_dai_name = kstrdup_const(dev_name(dev), GFP_KERNEL); - rockchip_dailinks[DAILINK_RT5514_DSP].platform_name = kstrdup_const(dev_name(dev), GFP_KERNEL); - card->dev = >dev; ret = devm_snd_soc_register_card(>dev, card); -- 2.11.0
[RESENT PATCH v7 2/7] arm64: dts: rockchip: Add rt5514 dsp for Gru
Add rt5514 dsp of_node to codec list for Gru boards. Signed-off-by: Jeffy Chen--- Changes in v7: None Changes in v6: None Changes in v3: None Changes in v2: None arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi b/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi index 199a5118b20d..5772c52fbfd3 100644 --- a/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi +++ b/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi @@ -514,7 +514,8 @@ sound { compatible = "rockchip,rk3399-gru-sound"; rockchip,cpu = < >; - rockchip,codec = < >; + rockchip,codec = < + _spi_audio>; }; }; -- 2.11.0
[RESENT PATCH v7 3/7] arm64: dts: rockchip: Update rt5514 devices' compatible for Gru
Currently the rt5514 i2c driver and rt5514 spi driver are using the same compatible string. Add additional unused compatible strings to identify them for Gru boards. Signed-off-by: Jeffy Chen--- Changes in v7: None Changes in v6: None Changes in v3: None Changes in v2: None arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi b/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi index 5772c52fbfd3..50fb11ad9f17 100644 --- a/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi +++ b/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi @@ -611,7 +611,7 @@ ap_i2c_mic: { i2c-scl-rising-time-ns = <300>; headsetcodec: rt5514@57 { - compatible = "realtek,rt5514"; + compatible = "realtek,rt5514", "realtek,rt5514-i2c"; reg = <0x57>; realtek,dmic-init-delay-ms = <20>; }; @@ -820,7 +820,7 @@ ap_i2c_audio: { status = "okay"; wacky_spi_audio: spi2@0 { - compatible = "realtek,rt5514"; + compatible = "realtek,rt5514", "realtek,rt5514-spi"; reg = <0>; interrupt-parent = <>; interrupts = <13 IRQ_TYPE_LEVEL_HIGH>; -- 2.11.0
[RESENT PATCH v7 2/7] arm64: dts: rockchip: Add rt5514 dsp for Gru
Add rt5514 dsp of_node to codec list for Gru boards. Signed-off-by: Jeffy Chen --- Changes in v7: None Changes in v6: None Changes in v3: None Changes in v2: None arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi b/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi index 199a5118b20d..5772c52fbfd3 100644 --- a/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi +++ b/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi @@ -514,7 +514,8 @@ sound { compatible = "rockchip,rk3399-gru-sound"; rockchip,cpu = < >; - rockchip,codec = < >; + rockchip,codec = < + _spi_audio>; }; }; -- 2.11.0
[RESENT PATCH v7 3/7] arm64: dts: rockchip: Update rt5514 devices' compatible for Gru
Currently the rt5514 i2c driver and rt5514 spi driver are using the same compatible string. Add additional unused compatible strings to identify them for Gru boards. Signed-off-by: Jeffy Chen --- Changes in v7: None Changes in v6: None Changes in v3: None Changes in v2: None arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi b/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi index 5772c52fbfd3..50fb11ad9f17 100644 --- a/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi +++ b/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi @@ -611,7 +611,7 @@ ap_i2c_mic: { i2c-scl-rising-time-ns = <300>; headsetcodec: rt5514@57 { - compatible = "realtek,rt5514"; + compatible = "realtek,rt5514", "realtek,rt5514-i2c"; reg = <0x57>; realtek,dmic-init-delay-ms = <20>; }; @@ -820,7 +820,7 @@ ap_i2c_audio: { status = "okay"; wacky_spi_audio: spi2@0 { - compatible = "realtek,rt5514"; + compatible = "realtek,rt5514", "realtek,rt5514-spi"; reg = <0>; interrupt-parent = <>; interrupts = <13 IRQ_TYPE_LEVEL_HIGH>; -- 2.11.0
Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
Hi, On (08/24/17 12:39), Boqun Feng wrote: > On Wed, Aug 23, 2017 at 02:55:17PM +0900, Sergey Senozhatsky wrote: > > On (08/23/17 13:35), Boqun Feng wrote: > > > > KERN_CONT and "\n" should not be together. "\n" flushes the cont > > > > buffer immediately. > > > > > > > > > > Hmm.. Not quite familiar with printk() stuffs, but I could see several > > > usages of printk(KERN_CONT "...\n") in kernel. > > > > > > Did a bit research myself, and I now think the inappropriate use is to > > > use a KERN_CONT printk *after* another printk ending with a "\n". > > > > ah... I didn't check __print_lock_name(): it leaves unflushed cont buffer > > upon the return. sorry, your code is correct. > > > > So means printk(KERN_CON "..."); + printk(KERN_CONT "...\n") is a > correct usage, right? well, yes. with one precondition - there should be no printk-s from other CPUs/tasks in between printk(KERN_CON "..."); + printk(KERN_CONT "...\n") ^ here we can have a preliminary flush and broken cont line. but it's been this way forever. -ss
Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
Hi, On (08/24/17 12:39), Boqun Feng wrote: > On Wed, Aug 23, 2017 at 02:55:17PM +0900, Sergey Senozhatsky wrote: > > On (08/23/17 13:35), Boqun Feng wrote: > > > > KERN_CONT and "\n" should not be together. "\n" flushes the cont > > > > buffer immediately. > > > > > > > > > > Hmm.. Not quite familiar with printk() stuffs, but I could see several > > > usages of printk(KERN_CONT "...\n") in kernel. > > > > > > Did a bit research myself, and I now think the inappropriate use is to > > > use a KERN_CONT printk *after* another printk ending with a "\n". > > > > ah... I didn't check __print_lock_name(): it leaves unflushed cont buffer > > upon the return. sorry, your code is correct. > > > > So means printk(KERN_CON "..."); + printk(KERN_CONT "...\n") is a > correct usage, right? well, yes. with one precondition - there should be no printk-s from other CPUs/tasks in between printk(KERN_CON "..."); + printk(KERN_CONT "...\n") ^ here we can have a preliminary flush and broken cont line. but it's been this way forever. -ss
Re: Do we really need d_weak_revalidate???
On 24/08/17 12:07, NeilBrown wrote: > On Wed, Aug 23 2017, Ian Kent wrote: > >> >> That inconsistency has bothered me for quite a while now. >> >> It was carried over from the autofs module behavior when automounting >> support was added to the VFS. What's worse is it prevents the use of >> the AT_NO_AUTOMOUNT flag from working properly with fstatat(2) and with >> statx(). >> >> There is some risk in changing that so it does work but it really does >> need to work to enable userspace to not trigger an automount by using >> this flag. >> >> So that's (hopefully) going to change soonish, see: >> http://ozlabs.org/~akpm/mmotm/broken-out/autofs-fix-at_no_automount-not-being-honored.patch >> >> The result should be that stat family calls don't trigger automounts except >> for fstatat(2) and statx() which will require the AT_NO_AUTOMOUNT flag. >> > > oooh, yes. That's much better - thanks. > > We should make sure that change gets into the man pages... Yes, I was wondering who to contact for that. > > First however, we should probably correct the man page! > stat.2 says: > > > NOTES >On Linux, lstat() will generally not trigger automounter >action, whereas stat() will (but see the description of >fstatat() AT_NO_AUTOMOUNT fag, above). > > which is wrong: lstat and stat treat automounts the same. > @Michael: do you recall why you inserted that text? The commit message > in commit 1ef5b2805471 ("stat.2: Cosmetic reworking of timestamp > discussion in NOTES") is not very helpful. > > I propose correcting to > > NOTES: > On Linux, lstat() nor stat() act as though AT_NO_AUTOMOUNT was set > and will not trigger automounter action for direct automount > points, though they may (prior to 4.14) for indirect automount > points. Shouldn't that be "lstat() and stat() act as though AT_NO_AUTOMOUNT is set ..." > > > The more precise details, that automount action for indirect automount > points is not triggered when the 'browse' option is used, is probably > not necessary. > > Ian: if you agree with that text, and Michael doesn't provide alternate > evidence, I'll submit a formal patch for the man page or should we > just wait until the patch actually lands? I thought the fstatat() description needed attention too, doubly so with the AT_NO_AUTOMOUNT change. The "The fstatat() system call operates in exactly the same way as stat()" is wrong in the same way as the stat() description was wrong. After the change fstatat() will trigger automounts if the AT_NO_AUTOMOUNT flag is not given which is different from lstat() and stat(). The updated NOTE above probably needs to be referred to in order to clarify what's meant by "in exactly the same way" since that probably refers to the information returned rather than whether an automount will be done. Ian
Re: Do we really need d_weak_revalidate???
On 24/08/17 12:07, NeilBrown wrote: > On Wed, Aug 23 2017, Ian Kent wrote: > >> >> That inconsistency has bothered me for quite a while now. >> >> It was carried over from the autofs module behavior when automounting >> support was added to the VFS. What's worse is it prevents the use of >> the AT_NO_AUTOMOUNT flag from working properly with fstatat(2) and with >> statx(). >> >> There is some risk in changing that so it does work but it really does >> need to work to enable userspace to not trigger an automount by using >> this flag. >> >> So that's (hopefully) going to change soonish, see: >> http://ozlabs.org/~akpm/mmotm/broken-out/autofs-fix-at_no_automount-not-being-honored.patch >> >> The result should be that stat family calls don't trigger automounts except >> for fstatat(2) and statx() which will require the AT_NO_AUTOMOUNT flag. >> > > oooh, yes. That's much better - thanks. > > We should make sure that change gets into the man pages... Yes, I was wondering who to contact for that. > > First however, we should probably correct the man page! > stat.2 says: > > > NOTES >On Linux, lstat() will generally not trigger automounter >action, whereas stat() will (but see the description of >fstatat() AT_NO_AUTOMOUNT fag, above). > > which is wrong: lstat and stat treat automounts the same. > @Michael: do you recall why you inserted that text? The commit message > in commit 1ef5b2805471 ("stat.2: Cosmetic reworking of timestamp > discussion in NOTES") is not very helpful. > > I propose correcting to > > NOTES: > On Linux, lstat() nor stat() act as though AT_NO_AUTOMOUNT was set > and will not trigger automounter action for direct automount > points, though they may (prior to 4.14) for indirect automount > points. Shouldn't that be "lstat() and stat() act as though AT_NO_AUTOMOUNT is set ..." > > > The more precise details, that automount action for indirect automount > points is not triggered when the 'browse' option is used, is probably > not necessary. > > Ian: if you agree with that text, and Michael doesn't provide alternate > evidence, I'll submit a formal patch for the man page or should we > just wait until the patch actually lands? I thought the fstatat() description needed attention too, doubly so with the AT_NO_AUTOMOUNT change. The "The fstatat() system call operates in exactly the same way as stat()" is wrong in the same way as the stat() description was wrong. After the change fstatat() will trigger automounts if the AT_NO_AUTOMOUNT flag is not given which is different from lstat() and stat(). The updated NOTE above probably needs to be referred to in order to clarify what's meant by "in exactly the same way" since that probably refers to the information returned rather than whether an automount will be done. Ian
Re: [PATCH v2] ASoC: Add a sanity check before using dai driver name
Hi Dong, Thanks for noticing, will send new patch soon :) On 08/24/2017 11:46 AM, Donglin Peng wrote: On Thu, Aug 24, 2017 at 11:34 AM, Jeffy Chenwrote: list_for_each_entry(dai, >dai_list, list) { if (dlc->dai_name && strcmp(dai->name, dlc->dai_name) - && strcmp(dai->driver->name, dlc->dai_name)) + && (!dai->driver->name + || !strcmp(dai->driver->name, dlc->dai_name)) continue; If the dai->driver->name is match with the dlc->dai_name, does it need to continue? hmm, sorry, i did this in a hurry, should setup my board and test it.. return dai;
Re: [PATCH v2] ASoC: Add a sanity check before using dai driver name
Hi Dong, Thanks for noticing, will send new patch soon :) On 08/24/2017 11:46 AM, Donglin Peng wrote: On Thu, Aug 24, 2017 at 11:34 AM, Jeffy Chen wrote: list_for_each_entry(dai, >dai_list, list) { if (dlc->dai_name && strcmp(dai->name, dlc->dai_name) - && strcmp(dai->driver->name, dlc->dai_name)) + && (!dai->driver->name + || !strcmp(dai->driver->name, dlc->dai_name)) continue; If the dai->driver->name is match with the dlc->dai_name, does it need to continue? hmm, sorry, i did this in a hurry, should setup my board and test it.. return dai;
[PATCH v3] ASoC: Add a sanity check before using dai driver name
The dai driver's name is allowed to be NULL. So add a sanity check for that. Signed-off-by: Jeffy ChenReported-by: Donglin Peng --- Changes in v3: Fix typo Changes in v2: Keep the oringinal check style. sound/soc/soc-core.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c index fc1bb2da3e2e..8827c2225ba3 100644 --- a/sound/soc/soc-core.c +++ b/sound/soc/soc-core.c @@ -1029,7 +1029,8 @@ struct snd_soc_dai *snd_soc_find_dai( continue; list_for_each_entry(dai, >dai_list, list) { if (dlc->dai_name && strcmp(dai->name, dlc->dai_name) - && strcmp(dai->driver->name, dlc->dai_name)) + && (!dai->driver->name + || strcmp(dai->driver->name, dlc->dai_name))) continue; return dai; -- 2.11.0
Re: [PATCH v7 1/2] PCI: iproc: Retry request when CRS returned from EP
On Wed, Aug 23, 2017 at 11:30 PM, Bjorn Helgaaswrote: > On Wed, Aug 23, 2017 at 09:32:06PM +0530, Oza Oza wrote: >> On Wed, Aug 23, 2017 at 9:22 PM, Sinan Kaya wrote: >> > Hi Oza, >> > >> >> In working Enumuration case I get following: >> >> [9.125976] pci :00:00.0: bridge configuration invalid ([bus >> >> 00-00]), re-configuring >> >> [9.134267] where=0x0 val=0x0001 >> >> [9.146946] where=0x0 val=0x0001 >> >> [9.158943] where=0x0 val=0x0001 >> >> [9.170945] where=0x0 val=0x0001 >> >> [9.186945] where=0x0 val=0x0001 >> >> [9.210944] where=0x0 val=0x0001 >> >> [9.250943] where=0x0 val=0x0001 >> >> [9.322942] where=0x0 val=0x0001 >> >> [9.458943] where=0x0 val=0x0001 >> >> [9.726942] where=0x0 val=0x9538086>> actual vendor and device id. >> >> >> >> so I think I have to retry in RC driver, so the old code still holds good. >> >> except that I have to do factoring out >> > You need to return 0x0001 for vendor ID register and return 0x >> > for >> > other registers like COMMAND register during the CRS period. > > The proposal we're trying to implement is to handle this controller as > an RP that does not support CRS SV. Such an RP would never return > 0x0001 for the vendor/device ID. If it never got a successful > completion, it would return 0x. > > So I think you do have to either retry (as in your v7 patch) or add a > delay before we start enumeration. > > It looks like we waited somewhere between 320ms and 590ms before this > device became ready. > > The PCI specs do require a delay after a reset (PCIe r3.1, sec 6.6.1) > before software sends a config request. I don't think there's > anywhere in the PCI core where we delay before we first enumerate > devices under a host bridge. I'm not sure we'd *want* such a delay in > the PCI core, because on many systems the BIOS has already initialized > the PCI controller, and an additional delay would be unnecessary and > would only slow down boot. > > But it might make sense to add a delay in the PCI controller driver -- > it knows when the reset occurs and probably knows more about the > firmware environment. I haven't tried to analyze all of sec 6.6.1, > but this section: > > Unless Readiness Notifications mechanisms are used (see Section > 6.23), the Root Complex and/or system software must allow at least > 1.0 s after a Conventional Reset of a device, before it may > determine that a device which fails to return a Successful > Completion status for a valid Configuration Request is a broken > device. This period is independent of how quickly Link training > completes. > > Note: This delay is analogous to the Trhfa parameter specified for > PCI/PCI-X, and is intended to allow an adequate amount of time for > devices which require self initialization. > > makes it sound like a 1sec delay might be needed. That sounds like an > awful lot, but this device did take close to that amount of time. > > I don't care which way you go. You've already implemented the delay > in the v7 patch, and that's fine with me. > Thanks for your inputs Bjorn. I will have v8 patch which will have; factored out separate patch + retry implementation of v7 patch. > Bjorn
Re: [PATCH v7 1/2] PCI: iproc: Retry request when CRS returned from EP
On Wed, Aug 23, 2017 at 11:30 PM, Bjorn Helgaas wrote: > On Wed, Aug 23, 2017 at 09:32:06PM +0530, Oza Oza wrote: >> On Wed, Aug 23, 2017 at 9:22 PM, Sinan Kaya wrote: >> > Hi Oza, >> > >> >> In working Enumuration case I get following: >> >> [9.125976] pci :00:00.0: bridge configuration invalid ([bus >> >> 00-00]), re-configuring >> >> [9.134267] where=0x0 val=0x0001 >> >> [9.146946] where=0x0 val=0x0001 >> >> [9.158943] where=0x0 val=0x0001 >> >> [9.170945] where=0x0 val=0x0001 >> >> [9.186945] where=0x0 val=0x0001 >> >> [9.210944] where=0x0 val=0x0001 >> >> [9.250943] where=0x0 val=0x0001 >> >> [9.322942] where=0x0 val=0x0001 >> >> [9.458943] where=0x0 val=0x0001 >> >> [9.726942] where=0x0 val=0x9538086>> actual vendor and device id. >> >> >> >> so I think I have to retry in RC driver, so the old code still holds good. >> >> except that I have to do factoring out >> > You need to return 0x0001 for vendor ID register and return 0x >> > for >> > other registers like COMMAND register during the CRS period. > > The proposal we're trying to implement is to handle this controller as > an RP that does not support CRS SV. Such an RP would never return > 0x0001 for the vendor/device ID. If it never got a successful > completion, it would return 0x. > > So I think you do have to either retry (as in your v7 patch) or add a > delay before we start enumeration. > > It looks like we waited somewhere between 320ms and 590ms before this > device became ready. > > The PCI specs do require a delay after a reset (PCIe r3.1, sec 6.6.1) > before software sends a config request. I don't think there's > anywhere in the PCI core where we delay before we first enumerate > devices under a host bridge. I'm not sure we'd *want* such a delay in > the PCI core, because on many systems the BIOS has already initialized > the PCI controller, and an additional delay would be unnecessary and > would only slow down boot. > > But it might make sense to add a delay in the PCI controller driver -- > it knows when the reset occurs and probably knows more about the > firmware environment. I haven't tried to analyze all of sec 6.6.1, > but this section: > > Unless Readiness Notifications mechanisms are used (see Section > 6.23), the Root Complex and/or system software must allow at least > 1.0 s after a Conventional Reset of a device, before it may > determine that a device which fails to return a Successful > Completion status for a valid Configuration Request is a broken > device. This period is independent of how quickly Link training > completes. > > Note: This delay is analogous to the Trhfa parameter specified for > PCI/PCI-X, and is intended to allow an adequate amount of time for > devices which require self initialization. > > makes it sound like a 1sec delay might be needed. That sounds like an > awful lot, but this device did take close to that amount of time. > > I don't care which way you go. You've already implemented the delay > in the v7 patch, and that's fine with me. > Thanks for your inputs Bjorn. I will have v8 patch which will have; factored out separate patch + retry implementation of v7 patch. > Bjorn
[PATCH v3] ASoC: Add a sanity check before using dai driver name
The dai driver's name is allowed to be NULL. So add a sanity check for that. Signed-off-by: Jeffy Chen Reported-by: Donglin Peng --- Changes in v3: Fix typo Changes in v2: Keep the oringinal check style. sound/soc/soc-core.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c index fc1bb2da3e2e..8827c2225ba3 100644 --- a/sound/soc/soc-core.c +++ b/sound/soc/soc-core.c @@ -1029,7 +1029,8 @@ struct snd_soc_dai *snd_soc_find_dai( continue; list_for_each_entry(dai, >dai_list, list) { if (dlc->dai_name && strcmp(dai->name, dlc->dai_name) - && strcmp(dai->driver->name, dlc->dai_name)) + && (!dai->driver->name + || strcmp(dai->driver->name, dlc->dai_name))) continue; return dai; -- 2.11.0
Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
On Wed, Aug 23, 2017 at 02:55:17PM +0900, Sergey Senozhatsky wrote: > On (08/23/17 13:35), Boqun Feng wrote: > > > KERN_CONT and "\n" should not be together. "\n" flushes the cont > > > buffer immediately. > > > > > > > Hmm.. Not quite familiar with printk() stuffs, but I could see several > > usages of printk(KERN_CONT "...\n") in kernel. > > > > Did a bit research myself, and I now think the inappropriate use is to > > use a KERN_CONT printk *after* another printk ending with a "\n". > > ah... I didn't check __print_lock_name(): it leaves unflushed cont buffer > upon the return. sorry, your code is correct. > So means printk(KERN_CON "..."); + printk(KERN_CONT "...\n") is a correct usage, right? Thanks. Again, not familiar with printk stuff, glad you can help me go through this ;-) Regards, Boqun > -ss > > > > > printk("\n *** DEADLOCK ***\n\n"); > > > > + } else if (cross_lock(src->instance)) { > > > > + printk(" Possible unsafe locking scenario by > > > > crosslock:\n\n"); > > > > + printk(" CPU0CPU1\n"); > > > > + printk(" \n"); > > > > + printk(" lock("); > > > > + __print_lock_name(target); > > > > + printk(KERN_CONT ");\n"); > > > > + printk(" lock("); > > > > + __print_lock_name(source); > > > > + printk(KERN_CONT ");\n"); > > > > + printk(" lock("); > > > > + __print_lock_name(parent == source ? target : parent); > > > > + printk(KERN_CONT ");\n"); > > > > + printk(" unlock("); > > > > + __print_lock_name(source); > > > > + printk(KERN_CONT ");\n"); > > > > + printk("\n *** DEADLOCK ***\n\n"); > > > > } else { > > > > printk(" Possible unsafe locking scenario:\n\n"); > > > > printk(" CPU0CPU1\n"); > > > > -- > > > > 2.14.1 > > > > > > signature.asc Description: PGP signature
Re: possible circular locking dependency detected [was: linux-next: Tree for Aug 22]
On Wed, Aug 23, 2017 at 02:55:17PM +0900, Sergey Senozhatsky wrote: > On (08/23/17 13:35), Boqun Feng wrote: > > > KERN_CONT and "\n" should not be together. "\n" flushes the cont > > > buffer immediately. > > > > > > > Hmm.. Not quite familiar with printk() stuffs, but I could see several > > usages of printk(KERN_CONT "...\n") in kernel. > > > > Did a bit research myself, and I now think the inappropriate use is to > > use a KERN_CONT printk *after* another printk ending with a "\n". > > ah... I didn't check __print_lock_name(): it leaves unflushed cont buffer > upon the return. sorry, your code is correct. > So means printk(KERN_CON "..."); + printk(KERN_CONT "...\n") is a correct usage, right? Thanks. Again, not familiar with printk stuff, glad you can help me go through this ;-) Regards, Boqun > -ss > > > > > printk("\n *** DEADLOCK ***\n\n"); > > > > + } else if (cross_lock(src->instance)) { > > > > + printk(" Possible unsafe locking scenario by > > > > crosslock:\n\n"); > > > > + printk(" CPU0CPU1\n"); > > > > + printk(" \n"); > > > > + printk(" lock("); > > > > + __print_lock_name(target); > > > > + printk(KERN_CONT ");\n"); > > > > + printk(" lock("); > > > > + __print_lock_name(source); > > > > + printk(KERN_CONT ");\n"); > > > > + printk(" lock("); > > > > + __print_lock_name(parent == source ? target : parent); > > > > + printk(KERN_CONT ");\n"); > > > > + printk(" unlock("); > > > > + __print_lock_name(source); > > > > + printk(KERN_CONT ");\n"); > > > > + printk("\n *** DEADLOCK ***\n\n"); > > > > } else { > > > > printk(" Possible unsafe locking scenario:\n\n"); > > > > printk(" CPU0CPU1\n"); > > > > -- > > > > 2.14.1 > > > > > > signature.asc Description: PGP signature
Re: [PATCH RESEND1 00/12] ALSA: vsnd: Add Xen para-virtualized frontend driver
On Aug 23 2017 23:51, Oleksandr Grytsov wrote: Hi, Thank you for detailed explanation. We understand that emulated interrupt on the frontend side is completely not acceptable and definitely we need to provide some feedback mechanism from Dom0 to DomU. In our case it is technically impossible to provide precise period interrupt (mostly because our backend is a user space application). The best we can implement it is provide number of frames (time, bytes etc.) consumed by real HW. This info will be outdated due to different delays but we can provide precise timestamps when this info was acquired. Stuffs of ALSA PCM in kernel land is an abstraction layer for actual hardware for data transmission. The stuffs get affects from a design of actual hardware. Furthermore, sound subsystems on the other operating systems such as Microsoft Windows are also designed with a consideration about actual hardware. When you design any interfaces as an abstraction for such software layer, it's better to understand actual hardware and design of low-level software layer somehow. Actually the 'sndif' has no good abstraction for actual hardware, therefore an idea to implement frontend driver as an ALSA driver is not reasonable at all. It's better to implement it as an application in the other software layer, e.g. sinks/sources of PulseAudio in DomU via Xenbus. This idea is nearer an original concept of Xen framework, I guess. But I don't know we can write any applications of Xenbus in user land of DomU or not. Anyway, it's not a good idea to have an ALSA driver for the present 'sndif', in my opinion. Regards Takashi Sakamoto
Re: [PATCH RESEND1 00/12] ALSA: vsnd: Add Xen para-virtualized frontend driver
On Aug 23 2017 23:51, Oleksandr Grytsov wrote: Hi, Thank you for detailed explanation. We understand that emulated interrupt on the frontend side is completely not acceptable and definitely we need to provide some feedback mechanism from Dom0 to DomU. In our case it is technically impossible to provide precise period interrupt (mostly because our backend is a user space application). The best we can implement it is provide number of frames (time, bytes etc.) consumed by real HW. This info will be outdated due to different delays but we can provide precise timestamps when this info was acquired. Stuffs of ALSA PCM in kernel land is an abstraction layer for actual hardware for data transmission. The stuffs get affects from a design of actual hardware. Furthermore, sound subsystems on the other operating systems such as Microsoft Windows are also designed with a consideration about actual hardware. When you design any interfaces as an abstraction for such software layer, it's better to understand actual hardware and design of low-level software layer somehow. Actually the 'sndif' has no good abstraction for actual hardware, therefore an idea to implement frontend driver as an ALSA driver is not reasonable at all. It's better to implement it as an application in the other software layer, e.g. sinks/sources of PulseAudio in DomU via Xenbus. This idea is nearer an original concept of Xen framework, I guess. But I don't know we can write any applications of Xenbus in user land of DomU or not. Anyway, it's not a good idea to have an ALSA driver for the present 'sndif', in my opinion. Regards Takashi Sakamoto
Re: Do we really need d_weak_revalidate???
On 24/08/17 11:21, NeilBrown wrote: > On Wed, Aug 23 2017, Ian Kent wrote: > >> On 23/08/17 10:32, Ian Kent wrote: >>> On 23/08/17 09:06, NeilBrown wrote: On Mon, Aug 21 2017, Ian Kent wrote: >> >> A mount isn't triggered by kern_path(pathname, 0, ). >> That '0' would need to include one of >> LOOKUP_PARENT | LOOKUP_DIRECTORY | >> LOOKUP_OPEN | LOOKUP_CREATE | LOOKUP_AUTOMOUNT >> >> to trigger an automount (otherwise you just get -EISDIR). > > It's perfectly sensible to think that but there is a case where a > a mount is triggered when using kern_path(). > > The EISDIR return occurs for positive dentrys, negative dentrys > will still trigger an automount (which is autofs specific, > indirect mount map using nobrowse option, the install default). Ok, I understand this better now. This difference between direct and indirect mounts is slightly awkward. It is visible from user-space, but not elegant to document. When you use O_PATH to open a direct automount that has not already been triggered, the open returns the underlying directory (and fstatfs confirms that it is AUTOFS_SUPER_MAGIC). When you use O_PATH on an indirect automount, it *will* trigger the automount when "nobrowse" is in effect, but it won't when "browse" is in effect. >>> >>> That inconsistency has bothered me for quite a while now. >>> >>> It was carried over from the autofs module behavior when automounting >>> support was added to the VFS. What's worse is it prevents the use of >>> the AT_NO_AUTOMOUNT flag from working properly with fstatat(2) and with >>> statx(). >>> >>> There is some risk in changing that so it does work but it really does >>> need to work to enable userspace to not trigger an automount by using >>> this flag. >>> >>> So that's (hopefully) going to change soonish, see: >>> http://ozlabs.org/~akpm/mmotm/broken-out/autofs-fix-at_no_automount-not-being-honored.patch >>> >>> The result should be that stat family calls don't trigger automounts except >>> for fstatat(2) and statx() which will require the AT_NO_AUTOMOUNT flag. >>> So we cannot just say "O_PATH doesn't trigger automounts", which is essentially what I said in https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/commit/?id=97a45d02e6671482e8b2cdcce3951930bf6bdb94 It might be possible to modify automount so that it was more consistent - i.e. if the point is triggered by a mkdir has been done, just to the mkdir. If it is triggered after a mkdir has been done, do the mount. I guess that might be racy, and in any case is hard to justify. Maybe I should change it to be about "direct automounts", and add a note that indirect automounts aren't so predictable. >>> >>> Right and the semantics should be much more consistent in the near future. >>> I hope (and expect) this semantic change won't cause problems. >>> But back to my original issue of wanting to discard kern_path_mountpoint, what would you think of the following approach - slight revised from before. Thanks, NeilBrown diff --git a/fs/autofs4/autofs_i.h b/fs/autofs4/autofs_i.h index beef981aa54f..7663ea82e68d 100644 --- a/fs/autofs4/autofs_i.h +++ b/fs/autofs4/autofs_i.h @@ -135,10 +135,13 @@ static inline struct autofs_info *autofs4_dentry_ino(struct dentry *dentry) /* autofs4_oz_mode(): do we see the man behind the curtain? (The * processes which do manipulations for us in user space sees the raw * filesystem without "magic".) + * A process performing certain ioctls can get temporary oz status. */ +extern struct task_struct *autofs_tmp_oz; static inline int autofs4_oz_mode(struct autofs_sb_info *sbi) { - return sbi->catatonic || task_pgrp(current) == sbi->oz_pgrp; + return sbi->catatonic || task_pgrp(current) == sbi->oz_pgrp || + autofs_tmp_oz == current; } struct inode *autofs4_get_inode(struct super_block *, umode_t); diff --git a/fs/autofs4/dev-ioctl.c b/fs/autofs4/dev-ioctl.c index dd9f1bebb5a3..d76401669a20 100644 --- a/fs/autofs4/dev-ioctl.c +++ b/fs/autofs4/dev-ioctl.c @@ -200,6 +200,20 @@ static int autofs_dev_ioctl_protosubver(struct file *fp, return 0; } +struct task_struct *autofs_tmp_oz; +int kern_path_oz(const char *pathname, int flags, struct path *path) +{ + static DEFINE_MUTEX(autofs_oz); + int err; + + mutex_lock(_oz); + autofs_tmp_oz = current; + err = kern_path(pathname, flags, path); + autofs_tmp_oz = NULL; + mutex_unlock(_oz); + return err; +} + >>> >>> It's simple enough but does look like it will attract criticism as being >>> a hack! >>> >>> The kern_path_locked() function is very similar to what was
Re: Do we really need d_weak_revalidate???
On 24/08/17 11:21, NeilBrown wrote: > On Wed, Aug 23 2017, Ian Kent wrote: > >> On 23/08/17 10:32, Ian Kent wrote: >>> On 23/08/17 09:06, NeilBrown wrote: On Mon, Aug 21 2017, Ian Kent wrote: >> >> A mount isn't triggered by kern_path(pathname, 0, ). >> That '0' would need to include one of >> LOOKUP_PARENT | LOOKUP_DIRECTORY | >> LOOKUP_OPEN | LOOKUP_CREATE | LOOKUP_AUTOMOUNT >> >> to trigger an automount (otherwise you just get -EISDIR). > > It's perfectly sensible to think that but there is a case where a > a mount is triggered when using kern_path(). > > The EISDIR return occurs for positive dentrys, negative dentrys > will still trigger an automount (which is autofs specific, > indirect mount map using nobrowse option, the install default). Ok, I understand this better now. This difference between direct and indirect mounts is slightly awkward. It is visible from user-space, but not elegant to document. When you use O_PATH to open a direct automount that has not already been triggered, the open returns the underlying directory (and fstatfs confirms that it is AUTOFS_SUPER_MAGIC). When you use O_PATH on an indirect automount, it *will* trigger the automount when "nobrowse" is in effect, but it won't when "browse" is in effect. >>> >>> That inconsistency has bothered me for quite a while now. >>> >>> It was carried over from the autofs module behavior when automounting >>> support was added to the VFS. What's worse is it prevents the use of >>> the AT_NO_AUTOMOUNT flag from working properly with fstatat(2) and with >>> statx(). >>> >>> There is some risk in changing that so it does work but it really does >>> need to work to enable userspace to not trigger an automount by using >>> this flag. >>> >>> So that's (hopefully) going to change soonish, see: >>> http://ozlabs.org/~akpm/mmotm/broken-out/autofs-fix-at_no_automount-not-being-honored.patch >>> >>> The result should be that stat family calls don't trigger automounts except >>> for fstatat(2) and statx() which will require the AT_NO_AUTOMOUNT flag. >>> So we cannot just say "O_PATH doesn't trigger automounts", which is essentially what I said in https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/commit/?id=97a45d02e6671482e8b2cdcce3951930bf6bdb94 It might be possible to modify automount so that it was more consistent - i.e. if the point is triggered by a mkdir has been done, just to the mkdir. If it is triggered after a mkdir has been done, do the mount. I guess that might be racy, and in any case is hard to justify. Maybe I should change it to be about "direct automounts", and add a note that indirect automounts aren't so predictable. >>> >>> Right and the semantics should be much more consistent in the near future. >>> I hope (and expect) this semantic change won't cause problems. >>> But back to my original issue of wanting to discard kern_path_mountpoint, what would you think of the following approach - slight revised from before. Thanks, NeilBrown diff --git a/fs/autofs4/autofs_i.h b/fs/autofs4/autofs_i.h index beef981aa54f..7663ea82e68d 100644 --- a/fs/autofs4/autofs_i.h +++ b/fs/autofs4/autofs_i.h @@ -135,10 +135,13 @@ static inline struct autofs_info *autofs4_dentry_ino(struct dentry *dentry) /* autofs4_oz_mode(): do we see the man behind the curtain? (The * processes which do manipulations for us in user space sees the raw * filesystem without "magic".) + * A process performing certain ioctls can get temporary oz status. */ +extern struct task_struct *autofs_tmp_oz; static inline int autofs4_oz_mode(struct autofs_sb_info *sbi) { - return sbi->catatonic || task_pgrp(current) == sbi->oz_pgrp; + return sbi->catatonic || task_pgrp(current) == sbi->oz_pgrp || + autofs_tmp_oz == current; } struct inode *autofs4_get_inode(struct super_block *, umode_t); diff --git a/fs/autofs4/dev-ioctl.c b/fs/autofs4/dev-ioctl.c index dd9f1bebb5a3..d76401669a20 100644 --- a/fs/autofs4/dev-ioctl.c +++ b/fs/autofs4/dev-ioctl.c @@ -200,6 +200,20 @@ static int autofs_dev_ioctl_protosubver(struct file *fp, return 0; } +struct task_struct *autofs_tmp_oz; +int kern_path_oz(const char *pathname, int flags, struct path *path) +{ + static DEFINE_MUTEX(autofs_oz); + int err; + + mutex_lock(_oz); + autofs_tmp_oz = current; + err = kern_path(pathname, flags, path); + autofs_tmp_oz = NULL; + mutex_unlock(_oz); + return err; +} + >>> >>> It's simple enough but does look like it will attract criticism as being >>> a hack! >>> >>> The kern_path_locked() function is very similar to what was
Re: [RFC PATCH 1/4] mm: madvise: read loop's step size beforehand in madvise_inject_error(), prepare for THP support.
On Wed, Aug 23, 2017 at 10:20:02AM -0400, Zi Yan wrote: > On 23 Aug 2017, at 3:49, Naoya Horiguchi wrote: > > > On Mon, Aug 14, 2017 at 09:52:13PM -0400, Zi Yan wrote: > >> From: Zi Yan> >> > >> The loop in madvise_inject_error() reads its step size from a page > >> after it is soft-offlined. It works because the page is: > >> 1) a hugetlb page: the page size does not change; > >> 2) a base page: the page size does not change; > >> 3) a THP: soft-offline always splits THPs, thus, it is OK to use > >>PAGE_SIZE as step size. > >> > >> It will be a problem when soft-offline supports THP migrations. > >> When a THP is migrated without split during soft-offlining, the THP > >> is split after migration, thus, before and after soft-offlining page > >> sizes do not match. This causes a THP to be unnecessarily soft-lined, > >> at most, 511 times, wasting free space. > > > > Hi Zi Yan, > > > > Thank you for the suggestion. > > > > I think that when madvise(MADV_SOFT_OFFLINE) is called with some range > > over more than one 4kB page, the caller clearly intends to call > > soft_offline_page() over all 4kB pages within the range in order to > > simulate the multiple soft-offline events. Please note that the caller > > only knows that specific pages are half-broken, and expect that all such > > pages are offlined. So the end result should be same, whether the given > > range is backed by thp or not. > > > > But if the given virtual address is backed by a THP and the THP is > soft-offlined > without splitting (enabled by following patches), the old for-loop will cause > extra > 511 THPs being soft-offlined. > > For example, the caller wants to offline VPN 0-511, which is backed by a THP > whose > address range is PFN 0-511. In the first iteration of the for-loop, > get_user_pages_fast(VPN0, ...) will return the THP and soft_offline_page() > will offline the THP, > replacing it with a new THP, say PFN 512-1023, so VPN 0-511 is backed by PFN > 512-1023. > But the original THP will be split after it is freed, thus, for-loop will not > end > at this moment, but continues to offline VPN1, which leads to PFN 512-1023 > being offlined > and replaced by another THP, say 1024-1535. This will go on and end up with > 511 extra THPs are offlined. That is why we need to this patch to tell > whether the THP is offlined as a whole or just its head page is offlined. Thanks for elaborating this. I understand your point. But I still not sure what the best behavior is. madvise(MADV_SOFT_OFFLINE) is a test feature and giving multi-page range on the call works like some stress testing. So multiple thp migrations seem to me an expected behavior. At least it behaves in the same manner as calling madvise(MADV_SOFT_OFFLINE) 512 times on VPN0-VPN511 separately, which is consistent. So I still feel like leaving the current behavior as long as your later patches work without this change. Thanks, Naoya Horiguchi > > Let me know if it is still not clear to you. Or I missed something. > > >> > >> Signed-off-by: Zi Yan > >> --- > >> mm/madvise.c | 21 ++--- > >> 1 file changed, 18 insertions(+), 3 deletions(-) > >> > >> diff --git a/mm/madvise.c b/mm/madvise.c > >> index 47d8d8a25eae..49f6774db259 100644 > >> --- a/mm/madvise.c > >> +++ b/mm/madvise.c > >> @@ -612,19 +612,22 @@ static long madvise_remove(struct vm_area_struct > >> *vma, > >> static int madvise_inject_error(int behavior, > >>unsigned long start, unsigned long end) > >> { > >> - struct page *page; > >> + struct page *page = NULL; > >> + unsigned long page_size = PAGE_SIZE; > >> > >>if (!capable(CAP_SYS_ADMIN)) > >>return -EPERM; > >> > >> - for (; start < end; start += PAGE_SIZE << > >> - compound_order(compound_head(page))) { > >> + for (; start < end; start += page_size) { > >>int ret; > >> > >>ret = get_user_pages_fast(start, 1, 0, ); > >>if (ret != 1) > >>return ret; > >> > >> + page_size = (PAGE_SIZE << compound_order(compound_head(page))) - > >> + (PAGE_SIZE * (page - compound_head(page))); > >> + > > > > Assigning a value which is not 4kB or some hugepage size into page_size > > might be confusing because that's not what the name says. You can introduce > > 'next' virtual address and ALIGN() might be helpful to calculate it. > > Like: > > next = ALIGN(start, PAGE_SIZE< > I think it works. Thanks. > > > -- > Best Regards > Yan Zi
Re: [RFC PATCH 1/4] mm: madvise: read loop's step size beforehand in madvise_inject_error(), prepare for THP support.
On Wed, Aug 23, 2017 at 10:20:02AM -0400, Zi Yan wrote: > On 23 Aug 2017, at 3:49, Naoya Horiguchi wrote: > > > On Mon, Aug 14, 2017 at 09:52:13PM -0400, Zi Yan wrote: > >> From: Zi Yan > >> > >> The loop in madvise_inject_error() reads its step size from a page > >> after it is soft-offlined. It works because the page is: > >> 1) a hugetlb page: the page size does not change; > >> 2) a base page: the page size does not change; > >> 3) a THP: soft-offline always splits THPs, thus, it is OK to use > >>PAGE_SIZE as step size. > >> > >> It will be a problem when soft-offline supports THP migrations. > >> When a THP is migrated without split during soft-offlining, the THP > >> is split after migration, thus, before and after soft-offlining page > >> sizes do not match. This causes a THP to be unnecessarily soft-lined, > >> at most, 511 times, wasting free space. > > > > Hi Zi Yan, > > > > Thank you for the suggestion. > > > > I think that when madvise(MADV_SOFT_OFFLINE) is called with some range > > over more than one 4kB page, the caller clearly intends to call > > soft_offline_page() over all 4kB pages within the range in order to > > simulate the multiple soft-offline events. Please note that the caller > > only knows that specific pages are half-broken, and expect that all such > > pages are offlined. So the end result should be same, whether the given > > range is backed by thp or not. > > > > But if the given virtual address is backed by a THP and the THP is > soft-offlined > without splitting (enabled by following patches), the old for-loop will cause > extra > 511 THPs being soft-offlined. > > For example, the caller wants to offline VPN 0-511, which is backed by a THP > whose > address range is PFN 0-511. In the first iteration of the for-loop, > get_user_pages_fast(VPN0, ...) will return the THP and soft_offline_page() > will offline the THP, > replacing it with a new THP, say PFN 512-1023, so VPN 0-511 is backed by PFN > 512-1023. > But the original THP will be split after it is freed, thus, for-loop will not > end > at this moment, but continues to offline VPN1, which leads to PFN 512-1023 > being offlined > and replaced by another THP, say 1024-1535. This will go on and end up with > 511 extra THPs are offlined. That is why we need to this patch to tell > whether the THP is offlined as a whole or just its head page is offlined. Thanks for elaborating this. I understand your point. But I still not sure what the best behavior is. madvise(MADV_SOFT_OFFLINE) is a test feature and giving multi-page range on the call works like some stress testing. So multiple thp migrations seem to me an expected behavior. At least it behaves in the same manner as calling madvise(MADV_SOFT_OFFLINE) 512 times on VPN0-VPN511 separately, which is consistent. So I still feel like leaving the current behavior as long as your later patches work without this change. Thanks, Naoya Horiguchi > > Let me know if it is still not clear to you. Or I missed something. > > >> > >> Signed-off-by: Zi Yan > >> --- > >> mm/madvise.c | 21 ++--- > >> 1 file changed, 18 insertions(+), 3 deletions(-) > >> > >> diff --git a/mm/madvise.c b/mm/madvise.c > >> index 47d8d8a25eae..49f6774db259 100644 > >> --- a/mm/madvise.c > >> +++ b/mm/madvise.c > >> @@ -612,19 +612,22 @@ static long madvise_remove(struct vm_area_struct > >> *vma, > >> static int madvise_inject_error(int behavior, > >>unsigned long start, unsigned long end) > >> { > >> - struct page *page; > >> + struct page *page = NULL; > >> + unsigned long page_size = PAGE_SIZE; > >> > >>if (!capable(CAP_SYS_ADMIN)) > >>return -EPERM; > >> > >> - for (; start < end; start += PAGE_SIZE << > >> - compound_order(compound_head(page))) { > >> + for (; start < end; start += page_size) { > >>int ret; > >> > >>ret = get_user_pages_fast(start, 1, 0, ); > >>if (ret != 1) > >>return ret; > >> > >> + page_size = (PAGE_SIZE << compound_order(compound_head(page))) - > >> + (PAGE_SIZE * (page - compound_head(page))); > >> + > > > > Assigning a value which is not 4kB or some hugepage size into page_size > > might be confusing because that's not what the name says. You can introduce > > 'next' virtual address and ALIGN() might be helpful to calculate it. > > Like: > > next = ALIGN(start, PAGE_SIZE< > I think it works. Thanks. > > > -- > Best Regards > Yan Zi
Re: [PATCH] zram: add zstd to the supported algorithms list
Hello Sergey, On Thu, Aug 24, 2017 at 10:49:36AM +0900, Sergey Senozhatsky wrote: > Add ZSTD to the list of supported compression algorithms. > > Official benchmarks [1]: First of all, thanks for the work! I want to ask one thing. Could you add some benchmark(e.g.,) result(comp ratio and speed) compared to (inflate, lzo, lz4)? I want to see how much it's good for small data that ours is 4K. Thanks! > > Compressor name Ratio Compression Decompress. > zstd 1.1.3 -1 2.877 430 MB/s1110 MB/s > zlib 1.2.8 -1 2.743 110 MB/s400 MB/s > brotli 0.5.2 -0 2.708 400 MB/s430 MB/s > quicklz 1.5.0 -12.238 550 MB/s710 MB/s > lzo1x 2.09 -1 2.108 650 MB/s830 MB/s > lz4 1.7.5 2.101 720 MB/s3600 MB/s > snappy 1.1.32.091 500 MB/s1650 MB/s > lzf 3.6 -1 2.077 400 MB/s860 MB/s > > [1] https://github.com/facebook/zstd > > Signed-off-by: Sergey Senozhatsky> --- > drivers/block/zram/zcomp.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/block/zram/zcomp.c b/drivers/block/zram/zcomp.c > index 5b8992beffec..cc66daec7bbc 100644 > --- a/drivers/block/zram/zcomp.c > +++ b/drivers/block/zram/zcomp.c > @@ -31,6 +31,9 @@ static const char * const backends[] = { > #endif > #if IS_ENABLED(CONFIG_CRYPTO_842) > "842", > +#endif > +#if IS_ENABLED(CONFIG_CRYPTO_ZSTD) > + "zstd", > #endif > NULL > }; > -- > 2.14.1 >
Re: [PATCH v2 2/2] KVM: X86: Fix loss of exception which has not yet injected
2017-08-23 20:27 GMT+08:00 Paolo Bonzini: > On 23/08/2017 12:23, Wanpeng Li wrote: >> @@ -6341,7 +6345,8 @@ static int inject_pending_event(struct kvm_vcpu *vcpu, >> bool req_int_win) >> int r; >> >> /* try to reinject previous events if any */ >> - if (vcpu->arch.exception.pending) { >> + if (vcpu->arch.exception.pending || >> + vcpu->arch.exception.injected) { >> trace_kvm_inj_exception(vcpu->arch.exception.nr, >> vcpu->arch.exception.has_error_code, >> vcpu->arch.exception.error_code); >> @@ -6357,6 +6362,11 @@ static int inject_pending_event(struct kvm_vcpu >> *vcpu, bool req_int_win) >> } >> >> r = kvm_x86_ops->queue_exception(vcpu); >> + if (r == 0 && vcpu->arch.exception.pending && >> + !vcpu->arch.exception.injected) { >> + vcpu->arch.exception.pending = false; >> + vcpu->arch.exception.injected = true; >> + } >> return r; >> } >> > > There are some changes needed here: > > - this "if" should check only .injected and call > kvm_x86_ops->queue_exception. The "if" for .pending, which handles > rflags/dr7, assigns false to .pending and true to .injected, and also > calls kvm_x86_ops->queue_exception, should be after the call to > check_nested_events. > > - in the call to inject_pending_event, the "else" should have a > > WARN_ON(vcpu->arch.exception.pending); > > just for completeness. > > > Also, nested_vmx_check_exception has to be moved from > vmx_queue_exception to vmx_check_nested_events, so that > nested_run_pending is checked correctly. Something like this: > > if (vcpu->arch.exception.pending && > nested_vmx_check_exception(vcpu, _qual)) { > if (vmx->nested.nested_run_pending) > return -EBUSY; > > nested_vmx_inject_exception_vmexit(vcpu, exit_qual); > return 0; > } > > Maybe you can have: > > - patch 1 which I'll apply now > > - patch 2 is the same as this one, with only the above changes to > inject_pending_event and the WARN_ON. > > - patch 3 moves the nested_vmx_inject_exception_vmexit call from > nested_vmx_check_exception to vmx_queue_exception. > > - patch 4 moves the code to vmx_check_nested_events and add the > nested_run_pending check, fixing the original bug. > > As to GET/SET_VCPU_EVENTS, for now I would not do any change. Instead, > we can do > > /* > * FIXME: pass injected and pending separately. This is only > * needed for nested virtualization, whose state cannot be > * migrated yet. For now we combine them just in case. > */ > events->exception.injected = > (vcpu->arch.exception.pending || > vcpu->arch.exception.injected) && > !kvm_exception_is_soft(vcpu->arch.exception.nr); > } > I complete all the comments in v3, thanks for the review. :) Regards, Wanpeng Li > and set_vcpu_events can just set .injected = false. Separating the two > will need a separate capability and KVM_ENABLE_CAP. > > Paolo > >> >> diff --git a/tools/arch/x86/include/uapi/asm/kvm.h >> b/tools/arch/x86/include/uapi/asm/kvm.h >> index c2824d0..7c55916 100644 >> --- a/tools/arch/x86/include/uapi/asm/kvm.h >> +++ b/tools/arch/x86/include/uapi/asm/kvm.h >> @@ -296,9 +296,9 @@ struct kvm_reinject_control { >> struct kvm_vcpu_events { >> struct { >> __u8 injected; >> + __u8 pending; >> __u8 nr; >> __u8 has_error_code; >> - __u8 pad; >> __u32 error_code; >> } exception; >> struct { > > I think you don't need t Agreed.
Re: [PATCH] zram: add zstd to the supported algorithms list
Hello Sergey, On Thu, Aug 24, 2017 at 10:49:36AM +0900, Sergey Senozhatsky wrote: > Add ZSTD to the list of supported compression algorithms. > > Official benchmarks [1]: First of all, thanks for the work! I want to ask one thing. Could you add some benchmark(e.g.,) result(comp ratio and speed) compared to (inflate, lzo, lz4)? I want to see how much it's good for small data that ours is 4K. Thanks! > > Compressor name Ratio Compression Decompress. > zstd 1.1.3 -1 2.877 430 MB/s1110 MB/s > zlib 1.2.8 -1 2.743 110 MB/s400 MB/s > brotli 0.5.2 -0 2.708 400 MB/s430 MB/s > quicklz 1.5.0 -12.238 550 MB/s710 MB/s > lzo1x 2.09 -1 2.108 650 MB/s830 MB/s > lz4 1.7.5 2.101 720 MB/s3600 MB/s > snappy 1.1.32.091 500 MB/s1650 MB/s > lzf 3.6 -1 2.077 400 MB/s860 MB/s > > [1] https://github.com/facebook/zstd > > Signed-off-by: Sergey Senozhatsky > --- > drivers/block/zram/zcomp.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/block/zram/zcomp.c b/drivers/block/zram/zcomp.c > index 5b8992beffec..cc66daec7bbc 100644 > --- a/drivers/block/zram/zcomp.c > +++ b/drivers/block/zram/zcomp.c > @@ -31,6 +31,9 @@ static const char * const backends[] = { > #endif > #if IS_ENABLED(CONFIG_CRYPTO_842) > "842", > +#endif > +#if IS_ENABLED(CONFIG_CRYPTO_ZSTD) > + "zstd", > #endif > NULL > }; > -- > 2.14.1 >
Re: [PATCH v2 2/2] KVM: X86: Fix loss of exception which has not yet injected
2017-08-23 20:27 GMT+08:00 Paolo Bonzini : > On 23/08/2017 12:23, Wanpeng Li wrote: >> @@ -6341,7 +6345,8 @@ static int inject_pending_event(struct kvm_vcpu *vcpu, >> bool req_int_win) >> int r; >> >> /* try to reinject previous events if any */ >> - if (vcpu->arch.exception.pending) { >> + if (vcpu->arch.exception.pending || >> + vcpu->arch.exception.injected) { >> trace_kvm_inj_exception(vcpu->arch.exception.nr, >> vcpu->arch.exception.has_error_code, >> vcpu->arch.exception.error_code); >> @@ -6357,6 +6362,11 @@ static int inject_pending_event(struct kvm_vcpu >> *vcpu, bool req_int_win) >> } >> >> r = kvm_x86_ops->queue_exception(vcpu); >> + if (r == 0 && vcpu->arch.exception.pending && >> + !vcpu->arch.exception.injected) { >> + vcpu->arch.exception.pending = false; >> + vcpu->arch.exception.injected = true; >> + } >> return r; >> } >> > > There are some changes needed here: > > - this "if" should check only .injected and call > kvm_x86_ops->queue_exception. The "if" for .pending, which handles > rflags/dr7, assigns false to .pending and true to .injected, and also > calls kvm_x86_ops->queue_exception, should be after the call to > check_nested_events. > > - in the call to inject_pending_event, the "else" should have a > > WARN_ON(vcpu->arch.exception.pending); > > just for completeness. > > > Also, nested_vmx_check_exception has to be moved from > vmx_queue_exception to vmx_check_nested_events, so that > nested_run_pending is checked correctly. Something like this: > > if (vcpu->arch.exception.pending && > nested_vmx_check_exception(vcpu, _qual)) { > if (vmx->nested.nested_run_pending) > return -EBUSY; > > nested_vmx_inject_exception_vmexit(vcpu, exit_qual); > return 0; > } > > Maybe you can have: > > - patch 1 which I'll apply now > > - patch 2 is the same as this one, with only the above changes to > inject_pending_event and the WARN_ON. > > - patch 3 moves the nested_vmx_inject_exception_vmexit call from > nested_vmx_check_exception to vmx_queue_exception. > > - patch 4 moves the code to vmx_check_nested_events and add the > nested_run_pending check, fixing the original bug. > > As to GET/SET_VCPU_EVENTS, for now I would not do any change. Instead, > we can do > > /* > * FIXME: pass injected and pending separately. This is only > * needed for nested virtualization, whose state cannot be > * migrated yet. For now we combine them just in case. > */ > events->exception.injected = > (vcpu->arch.exception.pending || > vcpu->arch.exception.injected) && > !kvm_exception_is_soft(vcpu->arch.exception.nr); > } > I complete all the comments in v3, thanks for the review. :) Regards, Wanpeng Li > and set_vcpu_events can just set .injected = false. Separating the two > will need a separate capability and KVM_ENABLE_CAP. > > Paolo > >> >> diff --git a/tools/arch/x86/include/uapi/asm/kvm.h >> b/tools/arch/x86/include/uapi/asm/kvm.h >> index c2824d0..7c55916 100644 >> --- a/tools/arch/x86/include/uapi/asm/kvm.h >> +++ b/tools/arch/x86/include/uapi/asm/kvm.h >> @@ -296,9 +296,9 @@ struct kvm_reinject_control { >> struct kvm_vcpu_events { >> struct { >> __u8 injected; >> + __u8 pending; >> __u8 nr; >> __u8 has_error_code; >> - __u8 pad; >> __u32 error_code; >> } exception; >> struct { > > I think you don't need t Agreed.
[PATCH] usb:xhci:Fix regression when ATI chipsets detected
From: Sandeep SinghThe following commit cause a regression on ATI chipsets. 'commit e788787ef4f9 ("usb:xhci:Add quirk for Certain failing HP keyboard on reset after resume")' This causes pinfo->smbus_dev to be wrongly set to NULL on systems with the ATI chipset that this function checks for first. Added conditional check for AMD chipsets to avoid the overwriting pinfo->smbus_dev. Reported-by: Ben Hutchings Fixes: e788787ef4f9 ("usb:xhci:Add quirk for Certain failing HP keyboard on reset after resume") cc: Nehal Shah cc: Signed-off-by: Sandeep Singh Signed-off-by: Shyam Sundar S K --- drivers/usb/host/pci-quirks.c | 35 ++- 1 file changed, 18 insertions(+), 17 deletions(-) diff --git a/drivers/usb/host/pci-quirks.c b/drivers/usb/host/pci-quirks.c index 5f4ca78..58b9685 100644 --- a/drivers/usb/host/pci-quirks.c +++ b/drivers/usb/host/pci-quirks.c @@ -142,29 +142,30 @@ static int amd_chipset_sb_type_init(struct amd_chipset_info *pinfo) pinfo->sb_type.gen = AMD_CHIPSET_SB700; else if (rev >= 0x40 && rev <= 0x4f) pinfo->sb_type.gen = AMD_CHIPSET_SB800; - } - pinfo->smbus_dev = pci_get_device(PCI_VENDOR_ID_AMD, - 0x145c, NULL); - if (pinfo->smbus_dev) { - pinfo->sb_type.gen = AMD_CHIPSET_TAISHAN; } else { pinfo->smbus_dev = pci_get_device(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_HUDSON2_SMBUS, NULL); - if (!pinfo->smbus_dev) { - pinfo->sb_type.gen = NOT_AMD_CHIPSET; - return 0; + if (pinfo->smbus_dev) { + rev = pinfo->smbus_dev->revision; + if (rev >= 0x11 && rev <= 0x14) + pinfo->sb_type.gen = AMD_CHIPSET_HUDSON2; + else if (rev >= 0x15 && rev <= 0x18) + pinfo->sb_type.gen = AMD_CHIPSET_BOLTON; + else if (rev >= 0x39 && rev <= 0x3a) + pinfo->sb_type.gen = AMD_CHIPSET_YANGTZE; + } else { + pinfo->smbus_dev = pci_get_device(PCI_VENDOR_ID_AMD, + 0x145c, NULL); + if (pinfo->smbus_dev) { + rev = pinfo->smbus_dev->revision; + pinfo->sb_type.gen = AMD_CHIPSET_TAISHAN; + } else { + pinfo->sb_type.gen = NOT_AMD_CHIPSET; + return 0; + } } - - rev = pinfo->smbus_dev->revision; - if (rev >= 0x11 && rev <= 0x14) - pinfo->sb_type.gen = AMD_CHIPSET_HUDSON2; - else if (rev >= 0x15 && rev <= 0x18) - pinfo->sb_type.gen = AMD_CHIPSET_BOLTON; - else if (rev >= 0x39 && rev <= 0x3a) - pinfo->sb_type.gen = AMD_CHIPSET_YANGTZE; } - pinfo->sb_type.rev = rev; return 1; } -- 2.7.4
[PATCH] usb:xhci:Fix regression when ATI chipsets detected
From: Sandeep Singh The following commit cause a regression on ATI chipsets. 'commit e788787ef4f9 ("usb:xhci:Add quirk for Certain failing HP keyboard on reset after resume")' This causes pinfo->smbus_dev to be wrongly set to NULL on systems with the ATI chipset that this function checks for first. Added conditional check for AMD chipsets to avoid the overwriting pinfo->smbus_dev. Reported-by: Ben Hutchings Fixes: e788787ef4f9 ("usb:xhci:Add quirk for Certain failing HP keyboard on reset after resume") cc: Nehal Shah cc: Signed-off-by: Sandeep Singh Signed-off-by: Shyam Sundar S K --- drivers/usb/host/pci-quirks.c | 35 ++- 1 file changed, 18 insertions(+), 17 deletions(-) diff --git a/drivers/usb/host/pci-quirks.c b/drivers/usb/host/pci-quirks.c index 5f4ca78..58b9685 100644 --- a/drivers/usb/host/pci-quirks.c +++ b/drivers/usb/host/pci-quirks.c @@ -142,29 +142,30 @@ static int amd_chipset_sb_type_init(struct amd_chipset_info *pinfo) pinfo->sb_type.gen = AMD_CHIPSET_SB700; else if (rev >= 0x40 && rev <= 0x4f) pinfo->sb_type.gen = AMD_CHIPSET_SB800; - } - pinfo->smbus_dev = pci_get_device(PCI_VENDOR_ID_AMD, - 0x145c, NULL); - if (pinfo->smbus_dev) { - pinfo->sb_type.gen = AMD_CHIPSET_TAISHAN; } else { pinfo->smbus_dev = pci_get_device(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_HUDSON2_SMBUS, NULL); - if (!pinfo->smbus_dev) { - pinfo->sb_type.gen = NOT_AMD_CHIPSET; - return 0; + if (pinfo->smbus_dev) { + rev = pinfo->smbus_dev->revision; + if (rev >= 0x11 && rev <= 0x14) + pinfo->sb_type.gen = AMD_CHIPSET_HUDSON2; + else if (rev >= 0x15 && rev <= 0x18) + pinfo->sb_type.gen = AMD_CHIPSET_BOLTON; + else if (rev >= 0x39 && rev <= 0x3a) + pinfo->sb_type.gen = AMD_CHIPSET_YANGTZE; + } else { + pinfo->smbus_dev = pci_get_device(PCI_VENDOR_ID_AMD, + 0x145c, NULL); + if (pinfo->smbus_dev) { + rev = pinfo->smbus_dev->revision; + pinfo->sb_type.gen = AMD_CHIPSET_TAISHAN; + } else { + pinfo->sb_type.gen = NOT_AMD_CHIPSET; + return 0; + } } - - rev = pinfo->smbus_dev->revision; - if (rev >= 0x11 && rev <= 0x14) - pinfo->sb_type.gen = AMD_CHIPSET_HUDSON2; - else if (rev >= 0x15 && rev <= 0x18) - pinfo->sb_type.gen = AMD_CHIPSET_BOLTON; - else if (rev >= 0x39 && rev <= 0x3a) - pinfo->sb_type.gen = AMD_CHIPSET_YANGTZE; } - pinfo->sb_type.rev = rev; return 1; } -- 2.7.4
Re: [PATCH 2/2] tracing: Fix kmemleak in set_trigger_filter
- Original Message - > From: "Steven Rostedt"> To: "Chunyu Hu" > Cc: mi...@kernel.org, linux-kernel@vger.kernel.org > Sent: Thursday, August 24, 2017 10:15:41 AM > Subject: Re: [PATCH 2/2] tracing: Fix kmemleak in set_trigger_filter > > On Wed, 23 Aug 2017 18:58:03 -0400 (EDT) > Chunyu Hu wrote: > > > - Original Message - > > > From: "Steven Rostedt" > > > To: "Chunyu Hu" > > > Cc: mi...@kernel.org, linux-kernel@vger.kernel.org > > > Sent: Wednesday, August 23, 2017 12:52:49 PM > > > Subject: Re: [PATCH 2/2] tracing: Fix kmemleak in set_trigger_filter > > > > > > On Wed, 23 Aug 2017 10:41:55 -0400 > > > Steven Rostedt wrote: > > > > > > > * On success, returns 0 and *@filterp points to the new filter. On > > > > * failure, returns -errno and *@filterp may point to %NULL or to a new > > > > * filter. In the latter case, the returned filter contains error > > > > * information if @set_str is %true and the caller is responsible for > > > > * freeing it. > > > > > > > > So filter contains an error string when it fails. It seems that we > > > > should somehow propagate that up the chain to display. I'll look more > > > > into this. > > > > > > The bug is in create_filter(), because "set_str" is set to false, and > > > the filter should not be passed back allocated on error. > > > > Thanks for all the analysis. I think you are right. I'll try to have a test > > on it > > in case we miss something. But please don't block on my test. > > > > My tests are almost done, but I wont send anything till tomorrow. I can > wait a day to post. There's a few other changes I need to send to Linus > as well. Tested with your patch, I did not hit the leak issue and other kmemleak. Thanks. > > -- Steve > -- Regards, Chunyu Hu
Re: [PATCH 2/2] tracing: Fix kmemleak in set_trigger_filter
- Original Message - > From: "Steven Rostedt" > To: "Chunyu Hu" > Cc: mi...@kernel.org, linux-kernel@vger.kernel.org > Sent: Thursday, August 24, 2017 10:15:41 AM > Subject: Re: [PATCH 2/2] tracing: Fix kmemleak in set_trigger_filter > > On Wed, 23 Aug 2017 18:58:03 -0400 (EDT) > Chunyu Hu wrote: > > > - Original Message - > > > From: "Steven Rostedt" > > > To: "Chunyu Hu" > > > Cc: mi...@kernel.org, linux-kernel@vger.kernel.org > > > Sent: Wednesday, August 23, 2017 12:52:49 PM > > > Subject: Re: [PATCH 2/2] tracing: Fix kmemleak in set_trigger_filter > > > > > > On Wed, 23 Aug 2017 10:41:55 -0400 > > > Steven Rostedt wrote: > > > > > > > * On success, returns 0 and *@filterp points to the new filter. On > > > > * failure, returns -errno and *@filterp may point to %NULL or to a new > > > > * filter. In the latter case, the returned filter contains error > > > > * information if @set_str is %true and the caller is responsible for > > > > * freeing it. > > > > > > > > So filter contains an error string when it fails. It seems that we > > > > should somehow propagate that up the chain to display. I'll look more > > > > into this. > > > > > > The bug is in create_filter(), because "set_str" is set to false, and > > > the filter should not be passed back allocated on error. > > > > Thanks for all the analysis. I think you are right. I'll try to have a test > > on it > > in case we miss something. But please don't block on my test. > > > > My tests are almost done, but I wont send anything till tomorrow. I can > wait a day to post. There's a few other changes I need to send to Linus > as well. Tested with your patch, I did not hit the leak issue and other kmemleak. Thanks. > > -- Steve > -- Regards, Chunyu Hu
Re: [PATCH 0/1] devpts: use dynamic_dname() to generate proc name
Hi On 2017-08-23, Eric W. Biederman wrote: > Linus Torvaldswrites: > > On Wed, Aug 23, 2017 at 6:49 PM, Linus Torvalds > > wrote: [...] > This is so far untested (except for compiling) but I think this will > work. > > I factor out devpts_ptmx_path out of devpts_acquire so the code > doesn't have to do unnecessary and confusing work, and add the > new function devpts_mnt. > > I revert the code to keep anything except a dentry in > tty->link->driver_data. > > And reduce the peer opening to a single function ptm_open_peer. > > It takes lines of code but the result is very straightforward code. I've given this a quick test, while it seems to fix the initial problem with umounting /dev/ptmx, it does introduce a new one - trying to open an xterm (KDE5's konsole to be exact) doesn't open a shell (the shell window remains totally empty) and trying to ssh into the system fails with "PTY allocation request failed on channel 0", logging in via a real tty and creating a new pbuilder chroot from there succeeds. Regards Stefan Lippers-Hollmann pgpI8qaz3RlGW.pgp Description: Digitale Signatur von OpenPGP
Re: [PATCH 0/1] devpts: use dynamic_dname() to generate proc name
Hi On 2017-08-23, Eric W. Biederman wrote: > Linus Torvalds writes: > > On Wed, Aug 23, 2017 at 6:49 PM, Linus Torvalds > > wrote: [...] > This is so far untested (except for compiling) but I think this will > work. > > I factor out devpts_ptmx_path out of devpts_acquire so the code > doesn't have to do unnecessary and confusing work, and add the > new function devpts_mnt. > > I revert the code to keep anything except a dentry in > tty->link->driver_data. > > And reduce the peer opening to a single function ptm_open_peer. > > It takes lines of code but the result is very straightforward code. I've given this a quick test, while it seems to fix the initial problem with umounting /dev/ptmx, it does introduce a new one - trying to open an xterm (KDE5's konsole to be exact) doesn't open a shell (the shell window remains totally empty) and trying to ssh into the system fails with "PTY allocation request failed on channel 0", logging in via a real tty and creating a new pbuilder chroot from there succeeds. Regards Stefan Lippers-Hollmann pgpI8qaz3RlGW.pgp Description: Digitale Signatur von OpenPGP
[PATCH v3 2/4] KVM: X86: Fix loss of exception which has not yet injected
From: Wanpeng Livmx_complete_interrupts() assumes that the exception is always injected, so it would be dropped by kvm_clear_exception_queue(). This patch separates exception.pending from exception.injected, exception.inject represents the exception is injected or the exception should be reinjected due to vmexit occurs during event delivery in VMX non-root operation. exception.pending represents the exception is queued and will be cleared when injecting the exception to the guest. So exception.pending and exception.injected can cooperate to guarantee exception will not be lost. Reported-by: Radim Krčmář Cc: Paolo Bonzini Cc: Radim Krčmář Signed-off-by: Wanpeng Li --- v2 -> v3: * the changes to inject_pending_event and adds the WARN_ON * combine injected and pending exception for GET/SET_VCPU_EVENTS arch/x86/include/asm/kvm_host.h | 2 +- arch/x86/kvm/svm.c | 2 +- arch/x86/kvm/vmx.c | 4 +-- arch/x86/kvm/x86.c | 56 ++--- arch/x86/kvm/x86.h | 4 +-- 5 files changed, 42 insertions(+), 26 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 9d90787..6e385ac 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -547,8 +547,8 @@ struct kvm_vcpu_arch { struct kvm_queued_exception { bool pending; + bool injected; bool has_error_code; - bool reinject; u8 nr; u32 error_code; u8 nested_apf; diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index a2fddce..6a439a1 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -655,7 +655,7 @@ static void svm_queue_exception(struct kvm_vcpu *vcpu) struct vcpu_svm *svm = to_svm(vcpu); unsigned nr = vcpu->arch.exception.nr; bool has_error_code = vcpu->arch.exception.has_error_code; - bool reinject = vcpu->arch.exception.reinject; + bool reinject = vcpu->arch.exception.injected; u32 error_code = vcpu->arch.exception.error_code; /* diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index c5f43ab..902b780 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -2503,7 +2503,7 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu) struct vcpu_vmx *vmx = to_vmx(vcpu); unsigned nr = vcpu->arch.exception.nr; bool has_error_code = vcpu->arch.exception.has_error_code; - bool reinject = vcpu->arch.exception.reinject; + bool reinject = vcpu->arch.exception.injected; u32 error_code = vcpu->arch.exception.error_code; u32 intr_info = nr | INTR_INFO_VALID_MASK; @@ -10948,7 +10948,7 @@ static void vmcs12_save_pending_event(struct kvm_vcpu *vcpu, u32 idt_vectoring; unsigned int nr; - if (vcpu->arch.exception.pending && vcpu->arch.exception.reinject) { + if (vcpu->arch.exception.injected) { nr = vcpu->arch.exception.nr; idt_vectoring = nr | VECTORING_INFO_VALID_MASK; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 8f41b88..b698b2f 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -389,15 +389,18 @@ static void kvm_multiple_exception(struct kvm_vcpu *vcpu, kvm_make_request(KVM_REQ_EVENT, vcpu); - if (!vcpu->arch.exception.pending) { + if (!vcpu->arch.exception.pending || + !vcpu->arch.exception.injected) { queue: if (has_error && !is_protmode(vcpu)) has_error = false; - vcpu->arch.exception.pending = true; + if (reinject) + vcpu->arch.exception.injected = true; + else + vcpu->arch.exception.pending = true; vcpu->arch.exception.has_error_code = has_error; vcpu->arch.exception.nr = nr; vcpu->arch.exception.error_code = error_code; - vcpu->arch.exception.reinject = reinject; return; } @@ -3069,8 +3072,14 @@ static void kvm_vcpu_ioctl_x86_get_vcpu_events(struct kvm_vcpu *vcpu, struct kvm_vcpu_events *events) { process_nmi(vcpu); + /* +* FIXME: pass injected and pending separately. This is only +* needed for nested virtualization, whose state cannot be +* migrated yet. For now we combine them just in case. +*/ events->exception.injected = - vcpu->arch.exception.pending && + (vcpu->arch.exception.pending || +vcpu->arch.exception.injected) && !kvm_exception_is_soft(vcpu->arch.exception.nr); events->exception.nr = vcpu->arch.exception.nr;
[PATCH v3 4/4] KVM: nVMX: Fix trying to cancel vmlauch/vmresume
From: Wanpeng Li[ cut here ] WARNING: CPU: 7 PID: 3861 at /home/kernel/ssd/kvm/arch/x86/kvm//vmx.c:11299 nested_vmx_vmexit+0x176e/0x1980 [kvm_intel] CPU: 7 PID: 3861 Comm: qemu-system-x86 Tainted: GW OE 4.13.0-rc4+ #11 RIP: 0010:nested_vmx_vmexit+0x176e/0x1980 [kvm_intel] Call Trace: ? kvm_multiple_exception+0x149/0x170 [kvm] ? handle_emulation_failure+0x79/0x230 [kvm] ? load_vmcs12_host_state+0xa80/0xa80 [kvm_intel] ? check_chain_key+0x137/0x1e0 ? reexecute_instruction.part.168+0x130/0x130 [kvm] nested_vmx_inject_exception_vmexit+0xb7/0x100 [kvm_intel] ? nested_vmx_inject_exception_vmexit+0xb7/0x100 [kvm_intel] vmx_queue_exception+0x197/0x300 [kvm_intel] kvm_arch_vcpu_ioctl_run+0x1b0c/0x2c90 [kvm] ? kvm_arch_vcpu_runnable+0x220/0x220 [kvm] ? preempt_count_sub+0x18/0xc0 ? restart_apic_timer+0x17d/0x300 [kvm] ? kvm_lapic_restart_hv_timer+0x37/0x50 [kvm] ? kvm_arch_vcpu_load+0x1d8/0x350 [kvm] kvm_vcpu_ioctl+0x4e4/0x910 [kvm] ? kvm_vcpu_ioctl+0x4e4/0x910 [kvm] ? kvm_dev_ioctl+0xbe0/0xbe0 [kvm] The flag "nested_run_pending", which can override the decision of which should run next, L1 or L2. nested_run_pending=1 means that we *must* run L2 next, not L1. This is necessary in particular when L1 did a VMLAUNCH of L2 and therefore expects L2 to be run (and perhaps be injected with an event it specified, etc.). Nested_run_pending is especially intended to avoid switching to L1 in the injection decision-point. I catch this in the queue exception path, this patch fixes it by requesting an immediate VM exit from L2 and keeping the exception for L1 pending for a subsequent nested VM exit. Cc: Paolo Bonzini Cc: Radim Krčmář Signed-off-by: Wanpeng Li --- arch/x86/kvm/vmx.c | 18 ++ 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 21760b8..6f88a79 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -2501,16 +2501,8 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu) struct vcpu_vmx *vmx = to_vmx(vcpu); unsigned nr = vcpu->arch.exception.nr; bool has_error_code = vcpu->arch.exception.has_error_code; - bool reinject = vcpu->arch.exception.injected; u32 error_code = vcpu->arch.exception.error_code; u32 intr_info = nr | INTR_INFO_VALID_MASK; - unsigned long exit_qual; - - if (!reinject && is_guest_mode(vcpu) && - nested_vmx_check_exception(vcpu, _qual)) { - nested_vmx_inject_exception_vmexit(vcpu, exit_qual); - return; - } if (has_error_code) { vmcs_write32(VM_ENTRY_EXCEPTION_ERROR_CODE, error_code); @@ -10988,10 +10980,20 @@ static void vmcs12_save_pending_event(struct kvm_vcpu *vcpu, static int vmx_check_nested_events(struct kvm_vcpu *vcpu, bool external_intr) { struct vcpu_vmx *vmx = to_vmx(vcpu); + unsigned long exit_qual; if (kvm_event_needs_reinjection(vcpu)) return -EBUSY; + if (vcpu->arch.exception.pending && + nested_vmx_check_exception(vcpu, _qual)) { + if (vmx->nested.nested_run_pending) + return -EBUSY; + + nested_vmx_inject_exception_vmexit(vcpu, exit_qual); + return 0; +} + if (nested_cpu_has_preemption_timer(get_vmcs12(vcpu)) && vmx->nested.preemption_timer_expired) { if (vmx->nested.nested_run_pending) -- 2.7.4
[PATCH v3 2/4] KVM: X86: Fix loss of exception which has not yet injected
From: Wanpeng Li vmx_complete_interrupts() assumes that the exception is always injected, so it would be dropped by kvm_clear_exception_queue(). This patch separates exception.pending from exception.injected, exception.inject represents the exception is injected or the exception should be reinjected due to vmexit occurs during event delivery in VMX non-root operation. exception.pending represents the exception is queued and will be cleared when injecting the exception to the guest. So exception.pending and exception.injected can cooperate to guarantee exception will not be lost. Reported-by: Radim Krčmář Cc: Paolo Bonzini Cc: Radim Krčmář Signed-off-by: Wanpeng Li --- v2 -> v3: * the changes to inject_pending_event and adds the WARN_ON * combine injected and pending exception for GET/SET_VCPU_EVENTS arch/x86/include/asm/kvm_host.h | 2 +- arch/x86/kvm/svm.c | 2 +- arch/x86/kvm/vmx.c | 4 +-- arch/x86/kvm/x86.c | 56 ++--- arch/x86/kvm/x86.h | 4 +-- 5 files changed, 42 insertions(+), 26 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 9d90787..6e385ac 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -547,8 +547,8 @@ struct kvm_vcpu_arch { struct kvm_queued_exception { bool pending; + bool injected; bool has_error_code; - bool reinject; u8 nr; u32 error_code; u8 nested_apf; diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index a2fddce..6a439a1 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -655,7 +655,7 @@ static void svm_queue_exception(struct kvm_vcpu *vcpu) struct vcpu_svm *svm = to_svm(vcpu); unsigned nr = vcpu->arch.exception.nr; bool has_error_code = vcpu->arch.exception.has_error_code; - bool reinject = vcpu->arch.exception.reinject; + bool reinject = vcpu->arch.exception.injected; u32 error_code = vcpu->arch.exception.error_code; /* diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index c5f43ab..902b780 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -2503,7 +2503,7 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu) struct vcpu_vmx *vmx = to_vmx(vcpu); unsigned nr = vcpu->arch.exception.nr; bool has_error_code = vcpu->arch.exception.has_error_code; - bool reinject = vcpu->arch.exception.reinject; + bool reinject = vcpu->arch.exception.injected; u32 error_code = vcpu->arch.exception.error_code; u32 intr_info = nr | INTR_INFO_VALID_MASK; @@ -10948,7 +10948,7 @@ static void vmcs12_save_pending_event(struct kvm_vcpu *vcpu, u32 idt_vectoring; unsigned int nr; - if (vcpu->arch.exception.pending && vcpu->arch.exception.reinject) { + if (vcpu->arch.exception.injected) { nr = vcpu->arch.exception.nr; idt_vectoring = nr | VECTORING_INFO_VALID_MASK; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 8f41b88..b698b2f 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -389,15 +389,18 @@ static void kvm_multiple_exception(struct kvm_vcpu *vcpu, kvm_make_request(KVM_REQ_EVENT, vcpu); - if (!vcpu->arch.exception.pending) { + if (!vcpu->arch.exception.pending || + !vcpu->arch.exception.injected) { queue: if (has_error && !is_protmode(vcpu)) has_error = false; - vcpu->arch.exception.pending = true; + if (reinject) + vcpu->arch.exception.injected = true; + else + vcpu->arch.exception.pending = true; vcpu->arch.exception.has_error_code = has_error; vcpu->arch.exception.nr = nr; vcpu->arch.exception.error_code = error_code; - vcpu->arch.exception.reinject = reinject; return; } @@ -3069,8 +3072,14 @@ static void kvm_vcpu_ioctl_x86_get_vcpu_events(struct kvm_vcpu *vcpu, struct kvm_vcpu_events *events) { process_nmi(vcpu); + /* +* FIXME: pass injected and pending separately. This is only +* needed for nested virtualization, whose state cannot be +* migrated yet. For now we combine them just in case. +*/ events->exception.injected = - vcpu->arch.exception.pending && + (vcpu->arch.exception.pending || +vcpu->arch.exception.injected) && !kvm_exception_is_soft(vcpu->arch.exception.nr); events->exception.nr = vcpu->arch.exception.nr; events->exception.has_error_code = vcpu->arch.exception.has_error_code; @@ -3125,6 +3134,7 @@ static int
[PATCH v3 4/4] KVM: nVMX: Fix trying to cancel vmlauch/vmresume
From: Wanpeng Li [ cut here ] WARNING: CPU: 7 PID: 3861 at /home/kernel/ssd/kvm/arch/x86/kvm//vmx.c:11299 nested_vmx_vmexit+0x176e/0x1980 [kvm_intel] CPU: 7 PID: 3861 Comm: qemu-system-x86 Tainted: GW OE 4.13.0-rc4+ #11 RIP: 0010:nested_vmx_vmexit+0x176e/0x1980 [kvm_intel] Call Trace: ? kvm_multiple_exception+0x149/0x170 [kvm] ? handle_emulation_failure+0x79/0x230 [kvm] ? load_vmcs12_host_state+0xa80/0xa80 [kvm_intel] ? check_chain_key+0x137/0x1e0 ? reexecute_instruction.part.168+0x130/0x130 [kvm] nested_vmx_inject_exception_vmexit+0xb7/0x100 [kvm_intel] ? nested_vmx_inject_exception_vmexit+0xb7/0x100 [kvm_intel] vmx_queue_exception+0x197/0x300 [kvm_intel] kvm_arch_vcpu_ioctl_run+0x1b0c/0x2c90 [kvm] ? kvm_arch_vcpu_runnable+0x220/0x220 [kvm] ? preempt_count_sub+0x18/0xc0 ? restart_apic_timer+0x17d/0x300 [kvm] ? kvm_lapic_restart_hv_timer+0x37/0x50 [kvm] ? kvm_arch_vcpu_load+0x1d8/0x350 [kvm] kvm_vcpu_ioctl+0x4e4/0x910 [kvm] ? kvm_vcpu_ioctl+0x4e4/0x910 [kvm] ? kvm_dev_ioctl+0xbe0/0xbe0 [kvm] The flag "nested_run_pending", which can override the decision of which should run next, L1 or L2. nested_run_pending=1 means that we *must* run L2 next, not L1. This is necessary in particular when L1 did a VMLAUNCH of L2 and therefore expects L2 to be run (and perhaps be injected with an event it specified, etc.). Nested_run_pending is especially intended to avoid switching to L1 in the injection decision-point. I catch this in the queue exception path, this patch fixes it by requesting an immediate VM exit from L2 and keeping the exception for L1 pending for a subsequent nested VM exit. Cc: Paolo Bonzini Cc: Radim Krčmář Signed-off-by: Wanpeng Li --- arch/x86/kvm/vmx.c | 18 ++ 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 21760b8..6f88a79 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -2501,16 +2501,8 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu) struct vcpu_vmx *vmx = to_vmx(vcpu); unsigned nr = vcpu->arch.exception.nr; bool has_error_code = vcpu->arch.exception.has_error_code; - bool reinject = vcpu->arch.exception.injected; u32 error_code = vcpu->arch.exception.error_code; u32 intr_info = nr | INTR_INFO_VALID_MASK; - unsigned long exit_qual; - - if (!reinject && is_guest_mode(vcpu) && - nested_vmx_check_exception(vcpu, _qual)) { - nested_vmx_inject_exception_vmexit(vcpu, exit_qual); - return; - } if (has_error_code) { vmcs_write32(VM_ENTRY_EXCEPTION_ERROR_CODE, error_code); @@ -10988,10 +10980,20 @@ static void vmcs12_save_pending_event(struct kvm_vcpu *vcpu, static int vmx_check_nested_events(struct kvm_vcpu *vcpu, bool external_intr) { struct vcpu_vmx *vmx = to_vmx(vcpu); + unsigned long exit_qual; if (kvm_event_needs_reinjection(vcpu)) return -EBUSY; + if (vcpu->arch.exception.pending && + nested_vmx_check_exception(vcpu, _qual)) { + if (vmx->nested.nested_run_pending) + return -EBUSY; + + nested_vmx_inject_exception_vmexit(vcpu, exit_qual); + return 0; +} + if (nested_cpu_has_preemption_timer(get_vmcs12(vcpu)) && vmx->nested.preemption_timer_expired) { if (vmx->nested.nested_run_pending) -- 2.7.4
[PATCH v3 1/4] KVM: VMX: use kvm_event_needs_reinjection
From: Wanpeng LiUse kvm_event_needs_reinjection() encapsulation. Cc: Paolo Bonzini Cc: Radim Krčmář Signed-off-by: Wanpeng Li --- arch/x86/kvm/vmx.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index dd710d3..c5f43ab 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -10988,9 +10988,7 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu, bool external_intr) { struct vcpu_vmx *vmx = to_vmx(vcpu); - if (vcpu->arch.exception.pending || - vcpu->arch.nmi_injected || - vcpu->arch.interrupt.pending) + if (kvm_event_needs_reinjection(vcpu)) return -EBUSY; if (nested_cpu_has_preemption_timer(get_vmcs12(vcpu)) && -- 2.7.4
[PATCH v3 3/4] KVM: VMX: Move the nested_vmx_inject_exception_vmexit call from nested_vmx_check_exception to vmx_queue_exception
From: Wanpeng LiMove the nested_vmx_inject_exception_vmexit call from nested_vmx_check_exception to vmx_queue_exception. Signed-off-by: Wanpeng Li --- arch/x86/kvm/vmx.c | 21 +++-- 1 file changed, 11 insertions(+), 10 deletions(-) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 902b780..21760b8 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -2459,15 +2459,14 @@ static void nested_vmx_inject_exception_vmexit(struct kvm_vcpu *vcpu, * KVM wants to inject page-faults which it got to the guest. This function * checks whether in a nested guest, we need to inject them to L1 or L2. */ -static int nested_vmx_check_exception(struct kvm_vcpu *vcpu) +static int nested_vmx_check_exception(struct kvm_vcpu *vcpu, unsigned long *exit_qual) { struct vmcs12 *vmcs12 = get_vmcs12(vcpu); unsigned int nr = vcpu->arch.exception.nr; if (nr == PF_VECTOR) { if (vcpu->arch.exception.nested_apf) { - nested_vmx_inject_exception_vmexit(vcpu, - vcpu->arch.apf.nested_apf_token); + *exit_qual = vcpu->arch.apf.nested_apf_token; return 1; } /* @@ -2481,16 +2480,15 @@ static int nested_vmx_check_exception(struct kvm_vcpu *vcpu) */ if (nested_vmx_is_page_fault_vmexit(vmcs12, vcpu->arch.exception.error_code)) { - nested_vmx_inject_exception_vmexit(vcpu, vcpu->arch.cr2); + *exit_qual = vcpu->arch.cr2; return 1; } } else { - unsigned long exit_qual = 0; - if (nr == DB_VECTOR) - exit_qual = vcpu->arch.dr6; - if (vmcs12->exception_bitmap & (1u << nr)) { - nested_vmx_inject_exception_vmexit(vcpu, exit_qual); + if (nr == DB_VECTOR) + *exit_qual = vcpu->arch.dr6; + else + *exit_qual = 0; return 1; } } @@ -2506,10 +2504,13 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu) bool reinject = vcpu->arch.exception.injected; u32 error_code = vcpu->arch.exception.error_code; u32 intr_info = nr | INTR_INFO_VALID_MASK; + unsigned long exit_qual; if (!reinject && is_guest_mode(vcpu) && - nested_vmx_check_exception(vcpu)) + nested_vmx_check_exception(vcpu, _qual)) { + nested_vmx_inject_exception_vmexit(vcpu, exit_qual); return; + } if (has_error_code) { vmcs_write32(VM_ENTRY_EXCEPTION_ERROR_CODE, error_code); -- 2.7.4