Re: [RESEND][PATCH] cpuidle/powernv : Restore different PSSCR for idle and hotplug
* Benjamin Herrenschmidt [2018-03-01 08:40:22]:

> On Thu, 2018-03-01 at 01:03 +0530, Akshay Adiga wrote:
> > commit 1e1601b38e6e ("powerpc/powernv/idle: Restore SPRs for deep idle
> > states via stop API.") uses stop-api provided by the firmware to restore
> > PSSCR. PSSCR restore is required for handling special wakeup. When special
> > wakeup is completed, the core enters stop state based on restored PSSCR.
> >
> > Currently PSSCR is restored to deepest available stop state, causing
> > a idle cpu to enter deeper stop state on a special wakeup, which causes
> > the cpu to hang on wakeup.
> >
> > A "sensors" command which reads temperature (through DTS sensors) on idle
> > cpu can trigger special wakeup.
> >
> > Failed Scenario :
> > Request restore of PSSCR with RL = 11
> > cpu enters idle state (stop5)
> > user triggers "sensors" command
> > Assert special wakeup on cpu
> > Restores PSSCR with RL = 11 < Done by firmware
> > Read DTS sensor
> > Deassert special wakeup
> > cpu enters idle state (stop11) <-- Instead of stop5
> >
> > Cpu hang is caused because cpu ended up in a deeper state than it requested
> >
> > This patch fixes instability caused by special wakeup when stop11 is
> > enabled. Requests restore of PSSCR to deepest stop state used by cpuidle.
> > Only when offlining cpu, request restore of PSSCR to deepest stop state.
> > On onlining cpu, request restore of PSSCR to deepest stop state used by
> > cpuidle.
>
> So if we chose a stop state, but somebody else does a special wakeup,
> we'll end up going back into a *deeper* one than the one we came from ?

Unfortunately yes. This is the current limitation. If we are in stop4
or above and have not set a PSSCR value to be restored, the hardware
defaults to all bits set (stop15), leading to entry into stop11 after
the special wakeup is removed. So the requirement is that a correct
PSSCR restore value must be set using stop-api.

The restore value only needs to represent a group of states: stop4, 5,
6 and 7 have identical state loss, so we can set a PSSCR of 4, 5 or 7
for entry into any of these stop states and do not have to use stop-api
to set the exact value of stop4 or stop5 at every entry.

> I still think this is broken by design. The chip should automatically
> go back to the state it went to after special wakeup, thus the PPE
> controlling the state should override the PSSCR value accordingly
> rather than relying on those SW hoops.

Special wakeup de-assertion and re-entry hits this limitation because
the original content of the PSSCR SPR has been lost, and hence the CME
does not know what was requested. Additional stop-api calls from
software could have been avoided, but in practice only the cpu hotplug
case uses stop11 and needs this stop-api call. We can default the
system to stop4 or stop5, then make a stop-api call to explicitly set
stop11 before hotplugging the cpu out. We have to restore the deepest
cpuidle state (stop4/5) again when the cpu comes online. As such, this
is not much software overhead. But we need an elegant method to control
these calls from OPAL flags so that kernel behaviour can be more
closely controlled.

If we want to use stop11 in cpuidle (despite it being very slow) for
evaluation reasons, then we will need more stop-api calls to choose
between stop4/5 and stop11, since they belong to different groups. Even
in this case, since stop11 is the slow path, we would set stop11 before
entry and restore to stop4/5 after wakeup. This way we still completely
avoid stop-api calls in the fast-path stop4/5 entry/exit.

--Vaidy
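To make the policy discussed above concrete, here is a minimal user-space sketch in C. It is not the kernel code from the patch: the function names, the `STOP_MAX` constant and the exact group boundaries are hypothetical. Only "stop4 to stop7 share identical state loss" and "stop11 belongs to a different group" come from the mail; treating stop0 to stop3 and stop8 to stop15 as one group each is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of stop levels (the RL value requested in PSSCR). */
enum { STOP_MAX = 15 };

/*
 * Map a stop level to a state-loss group.
 * From the thread: stop4..stop7 lose identical state, stop11 is in a
 * different (deeper) group. The other boundaries are assumptions.
 */
static int state_loss_group(int rl)
{
	assert(rl >= 0 && rl <= STOP_MAX);
	if (rl < 4)
		return 0;	/* assumed shallow group */
	if (rl < 8)
		return 1;	/* stop4..stop7, per the mail */
	return 2;		/* assumed deep group containing stop11 */
}

/*
 * Choose the RL value to register for restore through the stop-api:
 * the deepest state overall only while the cpu is being offlined,
 * otherwise the deepest state that cpuidle actually uses.
 */
static int psscr_restore_rl(bool cpu_offline, int deepest_cpuidle_rl,
			    int deepest_rl)
{
	return cpu_offline ? deepest_rl : deepest_cpuidle_rl;
}

/*
 * After a special wakeup the core re-enters the restored level. This is
 * a problem when that level loses more state than the level requested.
 */
static bool reentry_is_deeper(int requested_rl, int restored_rl)
{
	return state_loss_group(restored_rl) > state_loss_group(requested_rl);
}
```

With these helpers, the failed scenario corresponds to `reentry_is_deeper(5, 11)` being true, while restoring any level in the stop4 to stop7 group for a stop5 request is harmless.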
Re: KASAN: use-after-free Read in __list_del_entry_valid (3)
On Tue, Mar 6, 2018 at 9:30 AM, syzbot wrote:
> Hello,
>
> syzbot hit the following crash on upstream commit
> 094b58e1040a44f991d7ab628035e69c4d6b79c9 (Mon Mar 5 19:57:06 2018 +)
> Merge tag 'linux-kselftest-4.16-rc5' of
> git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

I'll take a look at this one,
Martijn

>
> Unfortunately, I don't have any reproducer for this crash yet.
> Raw console output is attached.
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached.
> user-space arch: i386
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+09e05aba06723a94d...@syzkaller.appspotmail.com
> It will help syzbot understand when the bug is fixed. See footer for
> details.
> If you forward the report, please keep this part and the footer.
>
> binder: release 6174:6185 transaction 4 in, still active
> binder: send failed reply for transaction 4 to 6174:6185
> binder: 6194:6198 ERROR: BC_REGISTER_LOOPER called without request
> ==
> binder: 6198 RLIMIT_NICE not set
> BUG: KASAN: use-after-free in __list_del_entry_valid+0x144/0x150
> lib/list_debug.c:54
> Read of size 8 at addr 8801daede810 by task kworker/1:1/24
>
> CPU: 1 PID: 24 Comm: kworker/1:1 Not tainted 4.16.0-rc4+ #252
> binder: BINDER_SET_CONTEXT_MGR already set
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Workqueue: events binder_deferred_func
> Call Trace:
> __dump_stack lib/dump_stack.c:17 [inline]
> dump_stack+0x194/0x24d lib/dump_stack.c:53
> binder: 6194:6206 got new transaction with bad transaction stack,
> transaction 9 has target 6194:0
> print_address_description+0x73/0x250 mm/kasan/report.c:256
> kasan_report_error mm/kasan/report.c:354 [inline]
> kasan_report+0x23c/0x360 mm/kasan/report.c:412
> __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
> __list_del_entry_valid+0x144/0x150 lib/list_debug.c:54
> __list_del_entry include/linux/list.h:117 [inline]
> list_del_init include/linux/list.h:159 [inline]
> binder_dequeue_work_head_ilocked drivers/android/binder.c:893 [inline]
> binder_dequeue_work_head drivers/android/binder.c:913 [inline]
> binder_release_work+0x163/0x490 drivers/android/binder.c:4191
> binder: 6194:6206 transaction failed 29201/-71, size 0-0 line 2875
> binder: 6191:6205 ioctl 40046207 0 returned -16
> binder_thread_release+0x4d0/0x720 drivers/android/binder.c:4396
> binder_deferred_release drivers/android/binder.c:4939 [inline]
> binder_deferred_func+0x4f4/0x1340 drivers/android/binder.c:5022
> binder: BINDER_SET_CONTEXT_MGR already set
> binder: 6200:6207 ioctl 40046207 0 returned -16
> binder: 6191:6208 ERROR: BC_REGISTER_LOOPER called without request
> process_one_work+0xc47/0x1bb0 kernel/workqueue.c:2113
> binder: 6208 RLIMIT_NICE not set
> binder: 6200:6212 ERROR: BC_REGISTER_LOOPER called without request
> binder: 6212 RLIMIT_NICE not set
> binder: 6191:6213 got new transaction with bad transaction stack,
> transaction 11 has target 6194:0
> worker_thread+0x223/0x1990 kernel/workqueue.c:2247
> binder: 6191:6213 transaction failed 29201/-71, size 0-0 line 2875
> binder: 6198 RLIMIT_NICE not set
> binder: release 6200:6207 transaction 14 out, still active
> binder: undelivered TRANSACTION_COMPLETE
> kthread+0x33c/0x400 kernel/kthread.c:238
> ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:406
>
> Allocated by task 6185:
> save_stack+0x43/0xd0 mm/kasan/kasan.c:447
> set_track mm/kasan/kasan.c:459 [inline]
> kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:552
> kmem_cache_alloc_trace+0x136/0x740 mm/slab.c:3607
> kmalloc include/linux/slab.h:512 [inline]
> kzalloc include/linux/slab.h:701 [inline]
> binder_transaction+0x13c1/0x81c0 drivers/android/binder.c:2900
> binder_thread_write+0xb50/0x3840 drivers/android/binder.c:3513
> binder_ioctl_write_read.isra.38+0x261/0xcb0 drivers/android/binder.c:4451
> binder_ioctl+0xb72/0x1417 drivers/android/binder.c:4591
> C_SYSC_ioctl fs/compat_ioctl.c:1461 [inline]
> compat_SyS_ioctl+0x151/0x2a30 fs/compat_ioctl.c:1407
> do_syscall_32_irqs_on arch/x86/entry/common.c:330 [inline]
> do_fast_syscall_32+0x3ec/0xf9f arch/x86/entry/common.c:392
> entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
>
> Freed by task 24:
> save_stack+0x43/0xd0 mm/kasan/kasan.c:447
> set_track mm/kasan/kasan.c:459 [inline]
> __kasan_slab_free+0x11a/0x170 mm/kasan/kasan.c:520
> kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:527
> __cache_free mm/slab.c:3485 [inline]
> kfree+0xd9/0x260 mm/slab.c:3800
> binder_free_transaction+0x6a/0x90 drivers/android/binder.c:1966
> binder_send_failed_reply+0x1c9/0x380 drivers/android/binder.c:2005
> binder_thread_release+0x4bb/0x720 drivers/android/binder.c:4395
> binder_deferred_release drivers/android/binder.c:4939 [inline]
> binder_deferred_func+0x4f4/0x1340 drivers/android/binder.c:5022
> process_one_work+0xc47/0x1bb0
[PATCH v5 4/7] x86: Align x86_64 PCI_MMCONFIG with 32-bit variant
From: Jan Kiszka

Allow to enable PCI_MMCONFIG when only SFI is present and make this option
default on. This will help consolidating both into one Kconfig statement.

Signed-off-by: Jan Kiszka
---
 arch/x86/Kconfig | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index eb7f43f23521..c19f5342ec2b 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2659,7 +2659,8 @@ config PCI_DOMAINS
 
 config PCI_MMCONFIG
 	bool "Support mmconfig PCI config space access"
-	depends on X86_64 && PCI && ACPI
+	default y
+	depends on X86_64 && PCI && (ACPI || SFI)
 
 config PCI_CNB20LE_QUIRK
 	bool "Read CNB20LE Host Bridge Windows" if EXPERT
-- 
2.13.6
[PATCH v5 3/7] x86/jailhouse: Enable PCI mmconfig access in inmates
From: Otavio Pontes

Use the PCI mmconfig base address exported by jailhouse in boot parameters
in order to access the memory mapped PCI configuration space.

Signed-off-by: Otavio Pontes
[Jan: rebased, fixed !CONFIG_PCI_MMCONFIG, used pcibios_last_bus]
Signed-off-by: Jan Kiszka
Reviewed-by: Andy Shevchenko
---
 arch/x86/include/asm/pci_x86.h | 2 ++
 arch/x86/kernel/jailhouse.c    | 8 ++++++++
 arch/x86/pci/mmconfig-shared.c | 4 ++--
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/pci_x86.h b/arch/x86/include/asm/pci_x86.h
index eb66fa9cd0fc..959d618dbb17 100644
--- a/arch/x86/include/asm/pci_x86.h
+++ b/arch/x86/include/asm/pci_x86.h
@@ -151,6 +151,8 @@ extern int pci_mmconfig_insert(struct device *dev, u16 seg, u8 start, u8 end,
 			       phys_addr_t addr);
 extern int pci_mmconfig_delete(u16 seg, u8 start, u8 end);
 extern struct pci_mmcfg_region *pci_mmconfig_lookup(int segment, int bus);
+extern struct pci_mmcfg_region *__init pci_mmconfig_add(int segment, int start,
+							int end, u64 addr);
 
 extern struct list_head pci_mmcfg_list;
 
diff --git a/arch/x86/kernel/jailhouse.c b/arch/x86/kernel/jailhouse.c
index b68fd895235a..fa183a131edc 100644
--- a/arch/x86/kernel/jailhouse.c
+++ b/arch/x86/kernel/jailhouse.c
@@ -124,6 +124,14 @@ static int __init jailhouse_pci_arch_init(void)
 	if (pcibios_last_bus < 0)
 		pcibios_last_bus = 0xff;
 
+#ifdef CONFIG_PCI_MMCONFIG
+	if (setup_data.pci_mmconfig_base) {
+		pci_mmconfig_add(0, 0, pcibios_last_bus,
+				 setup_data.pci_mmconfig_base);
+		pci_mmcfg_arch_init();
+	}
+#endif
+
 	return 0;
 }
 
diff --git a/arch/x86/pci/mmconfig-shared.c b/arch/x86/pci/mmconfig-shared.c
index 96684d0adcf9..0e590272366b 100644
--- a/arch/x86/pci/mmconfig-shared.c
+++ b/arch/x86/pci/mmconfig-shared.c
@@ -94,8 +94,8 @@ static struct pci_mmcfg_region *pci_mmconfig_alloc(int segment, int start,
 	return new;
 }
 
-static struct pci_mmcfg_region *__init pci_mmconfig_add(int segment, int start,
-							int end, u64 addr)
+struct pci_mmcfg_region *__init pci_mmconfig_add(int segment, int start,
+						 int end, u64 addr)
 {
 	struct pci_mmcfg_region *new;
 
-- 
2.13.6
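For readers unfamiliar with memory mapped configuration space, the following stand-alone C sketch shows how a config register address is commonly derived from an mmconfig base: each bus gets 1 MiB, each device/function 4 KiB. The helper names `mmcfg_offset` and `mmcfg_address` are hypothetical and not part of the patch, and the sketch assumes the region starts at bus 0, as in the `pci_mmconfig_add(0, 0, pcibios_last_bus, ...)` call above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Offset of a config register inside an mmconfig region that starts
 * at bus 0: bus in bits 27..20, devfn in bits 19..12, register in
 * bits 11..0.
 */
static uint64_t mmcfg_offset(unsigned int bus, unsigned int devfn,
			     unsigned int reg)
{
	assert(bus < 256 && devfn < 256 && reg < 4096);
	return ((uint64_t)bus << 20) | ((uint64_t)devfn << 12) | reg;
}

/* Physical address of the register, given the region base address. */
static uint64_t mmcfg_address(uint64_t base, unsigned int bus,
			      unsigned int devfn, unsigned int reg)
{
	return base + mmcfg_offset(bus, devfn, reg);
}
```

A region covering buses 0 to 0xff, which is what the patch registers when `pcibios_last_bus` was left unset, therefore spans 256 MiB of address space.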
[PATCH v5 5/7] x86: Consolidate PCI_MMCONFIG configs
From: Jan Kiszka

Since e279b6c1d329 ("x86: start unification of arch/x86/Kconfig.*"), we
have two PCI_MMCONFIG entries, one from the original i386 and another
from x86_64. This consolidates both entries into a single one.

Signed-off-by: Jan Kiszka
---
 arch/x86/Kconfig | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index c19f5342ec2b..8986a6b6e3df 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2641,8 +2641,10 @@ config PCI_DIRECT
 	depends on PCI && (X86_64 || (PCI_GODIRECT || PCI_GOANY || PCI_GOOLPC || PCI_GOMMCONFIG))
 
 config PCI_MMCONFIG
-	def_bool y
-	depends on X86_32 && PCI && (ACPI || SFI) && (PCI_GOMMCONFIG || PCI_GOANY)
+	bool "Support mmconfig PCI config space access" if X86_64
+	default y
+	depends on PCI && (ACPI || SFI)
+	depends on X86_64 || (PCI_GOANY || PCI_GOMMCONFIG)
 
 config PCI_OLPC
 	def_bool y
@@ -2657,11 +2659,6 @@ config PCI_DOMAINS
 	def_bool y
 	depends on PCI
 
-config PCI_MMCONFIG
-	bool "Support mmconfig PCI config space access"
-	default y
-	depends on X86_64 && PCI && (ACPI || SFI)
-
 config PCI_CNB20LE_QUIRK
 	bool "Read CNB20LE Host Bridge Windows" if EXPERT
 	depends on PCI
-- 
2.13.6
[PATCH v5 2/7] PCI: Scan all functions when running over Jailhouse
From: Jan Kiszka

Per PCIe r4.0, sec 7.5.1.1.9, multi-function devices are required to
have a function 0. Therefore, Linux scans for devices at function 0
(devfn 0/8/16/...) and only scans for other functions if function 0 has
its Multi-Function Device bit set or ARI or SR-IOV indicate there are
more functions.

The Jailhouse hypervisor may pass individual functions of a
multi-function device to a guest without passing function 0, which means
a Linux guest won't find them.

Change Linux PCI probing so it scans all function numbers when running
as a guest over Jailhouse.

This is technically prohibited by the spec, so it is possible that PCI
devices without the Multi-Function Device bit set may have unexpected
behavior in response to this probe.

Derived from original patch by Benedikt Spranger.

CC: Benedikt Spranger
Signed-off-by: Jan Kiszka
Acked-by: Bjorn Helgaas
Reviewed-by: Andy Shevchenko
---
 arch/x86/pci/legacy.c |  4 +++-
 drivers/pci/probe.c   | 22 +++++++++++++++++++---
 2 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/arch/x86/pci/legacy.c b/arch/x86/pci/legacy.c
index 1cb01abcb1be..dfbe6ac38830 100644
--- a/arch/x86/pci/legacy.c
+++ b/arch/x86/pci/legacy.c
@@ -4,6 +4,7 @@
 #include
 #include
 #include
+#include
 #include
 
 /*
@@ -34,13 +35,14 @@ int __init pci_legacy_init(void)
 
 void pcibios_scan_specific_bus(int busn)
 {
+	int stride = jailhouse_paravirt() ? 1 : 8;
 	int devfn;
 	u32 l;
 
 	if (pci_find_bus(0, busn))
 		return;
 
-	for (devfn = 0; devfn < 256; devfn += 8) {
+	for (devfn = 0; devfn < 256; devfn += stride) {
 		if (!raw_pci_read(0, busn, devfn, PCI_VENDOR_ID, 2, &l) &&
 		    l != 0x0000 && l != 0xffff) {
 			DBG("Found device at %02x:%02x [%04x]\n", busn, devfn, l);
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index ef5377438a1e..3c365dc996e7 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -16,6 +16,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include "pci.h"
@@ -2518,14 +2519,29 @@ static unsigned int pci_scan_child_bus_extend(struct pci_bus *bus,
 {
 	unsigned int used_buses, normal_bridges = 0, hotplug_bridges = 0;
 	unsigned int start = bus->busn_res.start;
-	unsigned int devfn, cmax, max = start;
+	unsigned int devfn, fn, cmax, max = start;
 	struct pci_dev *dev;
+	int nr_devs;
 
 	dev_dbg(&bus->dev, "scanning bus\n");
 
 	/* Go find them, Rover! */
-	for (devfn = 0; devfn < 0x100; devfn += 8)
-		pci_scan_slot(bus, devfn);
+	for (devfn = 0; devfn < 256; devfn += 8) {
+		nr_devs = pci_scan_slot(bus, devfn);
+
+		/*
+		 * The Jailhouse hypervisor may pass individual functions of a
+		 * multi-function device to a guest without passing function 0.
+		 * Look for them as well.
+		 */
+		if (jailhouse_paravirt() && nr_devs == 0) {
+			for (fn = 1; fn < 8; fn++) {
+				dev = pci_scan_single_device(bus, devfn + fn);
+				if (dev)
+					dev->multifunction = 1;
+			}
+		}
+	}
 
 	/* Reserve buses for SR-IOV capability */
 	used_buses = pci_iov_bus_range(bus);
-- 
2.13.6
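The effect of the probe.c change can be illustrated with a small user-space model in C. This is only a sketch: the arrays `present` and `fn0_multifunction` stand in for real config space reads, the names `scan_slot` and `scan_bus` are hypothetical, and ARI and SR-IOV are left out for brevity. What it reproduces is the rule from the patch: scan function 0 of each slot as usual, and only when nothing was found and the guest runs over Jailhouse, probe functions 1 to 7 directly.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_DEVFN 256	/* 32 slots x 8 functions on one bus */

/*
 * Model of the regular slot scan: function 0 must respond, and further
 * functions are only probed if function 0 reports multi-function.
 * Returns the number of functions found in this slot.
 */
static int scan_slot(const bool present[NR_DEVFN],
		     const bool fn0_multifunction[NR_DEVFN / 8],
		     unsigned int devfn, bool found[NR_DEVFN])
{
	unsigned int fn;
	int nr = 0;

	assert(devfn < NR_DEVFN && (devfn & 7) == 0);
	if (!present[devfn])
		return 0;

	found[devfn] = true;
	nr++;
	if (fn0_multifunction[devfn / 8]) {
		for (fn = 1; fn < 8; fn++) {
			if (present[devfn + fn]) {
				found[devfn + fn] = true;
				nr++;
			}
		}
	}
	return nr;
}

/*
 * Model of the bus scan with the Jailhouse fallback: when the regular
 * scan finds nothing in a slot, probe functions 1..7 individually.
 * Returns the total number of functions found on the bus.
 */
static int scan_bus(const bool present[NR_DEVFN],
		    const bool fn0_multifunction[NR_DEVFN / 8],
		    bool jailhouse, bool found[NR_DEVFN])
{
	unsigned int devfn, fn;
	int total = 0;

	for (devfn = 0; devfn < NR_DEVFN; devfn++)
		found[devfn] = false;

	for (devfn = 0; devfn < NR_DEVFN; devfn += 8) {
		int nr = scan_slot(present, fn0_multifunction, devfn, found);

		if (jailhouse && nr == 0) {
			for (fn = 1; fn < 8; fn++) {
				if (present[devfn + fn]) {
					found[devfn + fn] = true;
					nr++;
				}
			}
		}
		total += nr;
	}
	return total;
}
```

In this model a guest that was handed only function 2 of slot 3 (devfn 26) sees no device without the fallback and exactly one device with it.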
[PATCH v5 6/7] x86/jailhouse: Allow to use PCI_MMCONFIG without ACPI
From: Jan Kiszka

Jailhouse does not use ACPI, but it does support MMCONFIG. Make sure the
latter can be built without having to enable ACPI as well. Primarily, we
need to make the AMD mmconf-fam10h_64 depend upon MMCONFIG and ACPI,
instead of just the former.

Saves some bytes in the Jailhouse non-root kernel.

Signed-off-by: Jan Kiszka
---
 arch/x86/Kconfig          | 6 +++++-
 arch/x86/kernel/Makefile  | 2 +-
 arch/x86/kernel/cpu/amd.c | 2 +-
 3 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 8986a6b6e3df..b53340e71f84 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2643,7 +2643,7 @@ config PCI_DIRECT
 config PCI_MMCONFIG
 	bool "Support mmconfig PCI config space access" if X86_64
 	default y
-	depends on PCI && (ACPI || SFI)
+	depends on PCI && (ACPI || SFI || JAILHOUSE_GUEST)
 	depends on X86_64 || (PCI_GOANY || PCI_GOMMCONFIG)
 
 config PCI_OLPC
@@ -2659,6 +2659,10 @@ config PCI_DOMAINS
 	def_bool y
 	depends on PCI
 
+config MMCONF_FAM10H
+	def_bool y
+	depends on X86_64 && PCI_MMCONFIG && ACPI
+
 config PCI_CNB20LE_QUIRK
 	bool "Read CNB20LE Host Bridge Windows" if EXPERT
 	depends on PCI
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 29786c87e864..73ccf80c09a2 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -146,6 +146,6 @@ ifeq ($(CONFIG_X86_64),y)
 	obj-$(CONFIG_GART_IOMMU)	+= amd_gart_64.o aperture_64.o
 	obj-$(CONFIG_CALGARY_IOMMU)	+= pci-calgary_64.o tce_64.o
 
-	obj-$(CONFIG_PCI_MMCONFIG)	+= mmconf-fam10h_64.o
+	obj-$(CONFIG_MMCONF_FAM10H)	+= mmconf-fam10h_64.o
 	obj-y				+= vsmp_64.o
 endif
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index f0e6456ca7d3..12bc0a1139da 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -716,7 +716,7 @@ static void init_amd_k8(struct cpuinfo_x86 *c)
 
 static void init_amd_gh(struct cpuinfo_x86 *c)
 {
-#ifdef CONFIG_X86_64
+#ifdef CONFIG_MMCONF_FAM10H
 	/* do this for boot cpu */
 	if (c == &boot_cpu_data)
 		check_enable_amd_mmconf_dmi();
-- 
2.13.6
[PATCH v5 2/7] PCI: Scan all functions when running over Jailhouse
From: Jan Kiszka Per PCIe r4.0, sec 7.5.1.1.9, multi-function devices are required to have a function 0. Therefore, Linux scans for devices at function 0 (devfn 0/8/16/...) and only scans for other functions if function 0 has its Multi-Function Device bit set or ARI or SR-IOV indicate there are more functions. The Jailhouse hypervisor may pass individual functions of a multi-function device to a guest without passing function 0, which means a Linux guest won't find them. Change Linux PCI probing so it scans all function numbers when running as a guest over Jailhouse. This is technically prohibited by the spec, so it is possible that PCI devices without the Multi-Function Device bit set may have unexpected behavior in response to this probe. Derived from original patch by Benedikt Spranger. CC: Benedikt Spranger Signed-off-by: Jan Kiszka Acked-by: Bjorn Helgaas Reviewed-by: Andy Shevchenko --- arch/x86/pci/legacy.c | 4 +++- drivers/pci/probe.c | 22 +++--- 2 files changed, 22 insertions(+), 4 deletions(-) diff --git a/arch/x86/pci/legacy.c b/arch/x86/pci/legacy.c index 1cb01abcb1be..dfbe6ac38830 100644 --- a/arch/x86/pci/legacy.c +++ b/arch/x86/pci/legacy.c @@ -4,6 +4,7 @@ #include #include #include +#include #include /* @@ -34,13 +35,14 @@ int __init pci_legacy_init(void) void pcibios_scan_specific_bus(int busn) { + int stride = jailhouse_paravirt() ? 
1 : 8; int devfn; u32 l; if (pci_find_bus(0, busn)) return; - for (devfn = 0; devfn < 256; devfn += 8) { + for (devfn = 0; devfn < 256; devfn += stride) { if (!raw_pci_read(0, busn, devfn, PCI_VENDOR_ID, 2, ) && l != 0x && l != 0x) { DBG("Found device at %02x:%02x [%04x]\n", busn, devfn, l); diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index ef5377438a1e..3c365dc996e7 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -16,6 +16,7 @@ #include #include #include +#include #include #include #include "pci.h" @@ -2518,14 +2519,29 @@ static unsigned int pci_scan_child_bus_extend(struct pci_bus *bus, { unsigned int used_buses, normal_bridges = 0, hotplug_bridges = 0; unsigned int start = bus->busn_res.start; - unsigned int devfn, cmax, max = start; + unsigned int devfn, fn, cmax, max = start; struct pci_dev *dev; + int nr_devs; dev_dbg(>dev, "scanning bus\n"); /* Go find them, Rover! */ - for (devfn = 0; devfn < 0x100; devfn += 8) - pci_scan_slot(bus, devfn); + for (devfn = 0; devfn < 256; devfn += 8) { + nr_devs = pci_scan_slot(bus, devfn); + + /* +* The Jailhouse hypervisor may pass individual functions of a +* multi-function device to a guest without passing function 0. +* Look for them as well. +*/ + if (jailhouse_paravirt() && nr_devs == 0) { + for (fn = 1; fn < 8; fn++) { + dev = pci_scan_single_device(bus, devfn + fn); + if (dev) + dev->multifunction = 1; + } + } + } /* Reserve buses for SR-IOV capability */ used_buses = pci_iov_bus_range(bus); -- 2.13.6
[PATCH v5 6/7] x86/jailhouse: Allow to use PCI_MMCONFIG without ACPI
From: Jan Kiszka

Jailhouse does not use ACPI, but it does support MMCONFIG. Make sure the
latter can be built without having to enable ACPI as well. Primarily, we
need to make the AMD mmconf-fam10h_64 depend upon MMCONFIG and ACPI,
instead of just the former.

Saves some bytes in the Jailhouse non-root kernel.

Signed-off-by: Jan Kiszka
---
 arch/x86/Kconfig          | 6 +-
 arch/x86/kernel/Makefile  | 2 +-
 arch/x86/kernel/cpu/amd.c | 2 +-
 3 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 8986a6b6e3df..b53340e71f84 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2643,7 +2643,7 @@ config PCI_DIRECT
 config PCI_MMCONFIG
 	bool "Support mmconfig PCI config space access" if X86_64
 	default y
-	depends on PCI && (ACPI || SFI)
+	depends on PCI && (ACPI || SFI || JAILHOUSE_GUEST)
 	depends on X86_64 || (PCI_GOANY || PCI_GOMMCONFIG)
 
 config PCI_OLPC
@@ -2659,6 +2659,10 @@ config PCI_DOMAINS
 	def_bool y
 	depends on PCI
 
+config MMCONF_FAM10H
+	def_bool y
+	depends on X86_64 && PCI_MMCONFIG && ACPI
+
 config PCI_CNB20LE_QUIRK
 	bool "Read CNB20LE Host Bridge Windows" if EXPERT
 	depends on PCI
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 29786c87e864..73ccf80c09a2 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -146,6 +146,6 @@ ifeq ($(CONFIG_X86_64),y)
 	obj-$(CONFIG_GART_IOMMU)	+= amd_gart_64.o aperture_64.o
 	obj-$(CONFIG_CALGARY_IOMMU)	+= pci-calgary_64.o tce_64.o
 
-	obj-$(CONFIG_PCI_MMCONFIG)	+= mmconf-fam10h_64.o
+	obj-$(CONFIG_MMCONF_FAM10H)	+= mmconf-fam10h_64.o
 	obj-y				+= vsmp_64.o
 endif
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index f0e6456ca7d3..12bc0a1139da 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -716,7 +716,7 @@ static void init_amd_k8(struct cpuinfo_x86 *c)
 
 static void init_amd_gh(struct cpuinfo_x86 *c)
 {
-#ifdef CONFIG_X86_64
+#ifdef CONFIG_MMCONF_FAM10H
 	/* do this for boot cpu */
 	if (c == &boot_cpu_data)
 		check_enable_amd_mmconf_dmi();
-- 
2.13.6
[PATCH v5 7/7] MAINTAINERS: Add entry for Jailhouse
From: Jan Kiszka

Signed-off-by: Jan Kiszka
---
 MAINTAINERS | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 4623caf8d72d..6dc0b8f3ae0e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -7523,6 +7523,13 @@ Q:	http://patchwork.linuxtv.org/project/linux-media/list/
 S:	Maintained
 F:	drivers/media/dvb-frontends/ix2505v*
 
+JAILHOUSE HYPERVISOR INTERFACE
+M:	Jan Kiszka
+L:	jailhouse-...@googlegroups.com
+S:	Maintained
+F:	arch/x86/kernel/jailhouse.c
+F:	arch/x86/include/asm/jailhouse_para.h
+
 JC42.4 TEMPERATURE SENSOR DRIVER
 M:	Guenter Roeck
 L:	linux-hw...@vger.kernel.org
-- 
2.13.6
[PATCH v5 0/7] jailhouse: Enhance secondary Jailhouse guest support /wrt PCI
Basic x86 support [1] for running Linux as a secondary Jailhouse [2] guest
is currently pending in the tip tree. This builds on top and enhances the
PCI support for x86 and also ARM guests (ARM[64] does not require platform
patches and works already).

Key elements of this series are:
 - detection of Jailhouse via device tree hypervisor node
 - function-level PCI scan if Jailhouse is detected
 - MMCONFIG support for x86 guests

As most changes affect x86, I would suggest routing the series via tip as
well, after the necessary acks are collected.

Changes in v5:
 - fix build breakage of patch 6 on i386

Changes in v4:
 - split up Kconfig changes
 - respect pcibios_last_bus during mmconfig setup
 - cosmetic changes requested by Andy

Changes in v3:
 - avoided duplicate scans of PCI functions under Jailhouse
 - reformatted PCI_MMCONFIG condition and rephrased related commit log

Changes in v2:
 - adjusted commit log and include ordering in patch 2
 - rebased over Linus master

Jan

[1] https://lkml.org/lkml/2017/11/27/125
[2] http://jailhouse-project.org

CC: Benedikt Spranger
CC: Juergen Gross
CC: Mark Rutland
CC: Otavio Pontes
CC: Rob Herring

Jan Kiszka (6):
  jailhouse: Provide detection for non-x86 systems
  PCI: Scan all functions when running over Jailhouse
  x86: Align x86_64 PCI_MMCONFIG with 32-bit variant
  x86: Consolidate PCI_MMCONFIG configs
  x86/jailhouse: Allow to use PCI_MMCONFIG without ACPI
  MAINTAINERS: Add entry for Jailhouse

Otavio Pontes (1):
  x86/jailhouse: Enable PCI mmconfig access in inmates

 Documentation/devicetree/bindings/jailhouse.txt |  8 
 MAINTAINERS                                     |  7 +++
 arch/x86/Kconfig                                | 12 +++-
 arch/x86/include/asm/jailhouse_para.h           |  2 +-
 arch/x86/include/asm/pci_x86.h                  |  2 ++
 arch/x86/kernel/Makefile                        |  2 +-
 arch/x86/kernel/cpu/amd.c                       |  2 +-
 arch/x86/kernel/jailhouse.c                     |  8 
 arch/x86/pci/legacy.c                           |  4 +++-
 arch/x86/pci/mmconfig-shared.c                  |  4 ++--
 drivers/pci/probe.c                             | 22 +++---
 include/linux/hypervisor.h                      | 17 +++--
 12 files changed, 74 insertions(+), 16 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/jailhouse.txt

-- 
2.13.6
[PATCH v5 1/7] jailhouse: Provide detection for non-x86 systems
From: Jan Kiszka

Implement jailhouse_paravirt() via device tree probing on architectures
!= x86. Will be used by the PCI core.

CC: Rob Herring
CC: Mark Rutland
CC: Juergen Gross
Signed-off-by: Jan Kiszka
Reviewed-by: Juergen Gross
---
 Documentation/devicetree/bindings/jailhouse.txt |  8 
 arch/x86/include/asm/jailhouse_para.h           |  2 +-
 include/linux/hypervisor.h                      | 17 +++--
 3 files changed, 24 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/jailhouse.txt

diff --git a/Documentation/devicetree/bindings/jailhouse.txt b/Documentation/devicetree/bindings/jailhouse.txt
new file mode 100644
index ..2901c25ff340
--- /dev/null
+++ b/Documentation/devicetree/bindings/jailhouse.txt
@@ -0,0 +1,8 @@
+Jailhouse non-root cell device tree bindings
+
+
+When running in a non-root Jailhouse cell (partition), the device tree of this
+platform shall have a top-level "hypervisor" node with the following
+properties:
+
+- compatible = "jailhouse,cell"
diff --git a/arch/x86/include/asm/jailhouse_para.h b/arch/x86/include/asm/jailhouse_para.h
index 875b54376689..b885a961a150 100644
--- a/arch/x86/include/asm/jailhouse_para.h
+++ b/arch/x86/include/asm/jailhouse_para.h
@@ -1,7 +1,7 @@
 /* SPDX-License-Identifier: GPL2.0 */
 
 /*
- * Jailhouse paravirt_ops implementation
+ * Jailhouse paravirt detection
  *
  * Copyright (c) Siemens AG, 2015-2017
  *
diff --git a/include/linux/hypervisor.h b/include/linux/hypervisor.h
index b19563f9a8eb..fc08b433c856 100644
--- a/include/linux/hypervisor.h
+++ b/include/linux/hypervisor.h
@@ -8,15 +8,28 @@
  */
 
 #ifdef CONFIG_X86
+
+#include
 #include
+
 static inline void hypervisor_pin_vcpu(int cpu)
 {
 	x86_platform.hyper.pin_vcpu(cpu);
 }
-#else
+
+#else /* !CONFIG_X86 */
+
+#include
+
 static inline void hypervisor_pin_vcpu(int cpu)
 {
 }
-#endif
+
+static inline bool jailhouse_paravirt(void)
+{
+	return of_find_compatible_node(NULL, NULL, "jailhouse,cell");
+}
+
+#endif /* !CONFIG_X86 */
 
 #endif /* __LINUX_HYPEVISOR_H */
-- 
2.13.6
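As a rough illustration of what a "compatible" lookup involves: a device tree `compatible` property is a list of NUL-terminated strings packed back to back. The userspace sketch below checks such a list for a given string. `compatible_has()` is a hypothetical helper written for this note and uses a plain exact match; it is not the kernel's of_find_compatible_node(), which also walks the tree and returns the node with its reference count raised.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if the packed string list prop[0..len) contains an entry
 * equal to compat. Hypothetical helper for illustration only.
 */
static bool compatible_has(const char *prop, size_t len, const char *compat)
{
	size_t off = 0;

	while (off < len) {
		/* Find the terminating NUL of the current entry. */
		const char *end = memchr(prop + off, '\0', len - off);

		if (!end)
			return false;	/* unterminated entry: malformed list */
		if (strcmp(prop + off, compat) == 0)
			return true;
		off = (size_t)(end - prop) + 1;	/* skip entry and its NUL */
	}
	return false;
}
```

A property holding "jailhouse,cell" followed by some other (made-up) string would match "jailhouse,cell" but not the bare prefix "jailhouse".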
[PATCH v4 3/3] security: Add an example sample dynamic LSM
This adds an example LSM that utilizes the features added by the
dynamically loadable LSMs patch. It prevents the user from running:

  date --set="October 21 2015 16:29:00 PDT"

Once the module is unloaded, the command is once again allowed.

Signed-off-by: Sargun Dhillon
---
 samples/Kconfig           |  6 ++
 samples/Makefile          |  2 +-
 samples/lsm/Makefile      |  4 
 samples/lsm/lsm_example.c | 33 +
 4 files changed, 44 insertions(+), 1 deletion(-)
 create mode 100644 samples/lsm/Makefile
 create mode 100644 samples/lsm/lsm_example.c

diff --git a/samples/Kconfig b/samples/Kconfig
index c332a3b9de05..022242c0b50b 100644
--- a/samples/Kconfig
+++ b/samples/Kconfig
@@ -117,4 +117,10 @@ config SAMPLE_STATX
 	help
 	  Build example userspace program to use the new extended-stat syscall.
 
+config SAMPLE_DYNAMIC_LSM
+	tristate "Build LSM examples -- loadable modules only"
+	depends on SECURITY_DYNAMIC_HOOKS && m
+	help
+	  This builds an example dynamic LSM
+
 endif # SAMPLES
diff --git a/samples/Makefile b/samples/Makefile
index db54e766ddb1..9d23835d6e6d 100644
--- a/samples/Makefile
+++ b/samples/Makefile
@@ -3,4 +3,4 @@
 obj-$(CONFIG_SAMPLES)	+= kobject/ kprobes/ trace_events/ livepatch/ \
 			   hw_breakpoint/ kfifo/ kdb/ hidraw/ rpmsg/ seccomp/ \
 			   configfs/ connector/ v4l/ trace_printk/ blackfin/ \
-			   vfio-mdev/ statx/
+			   vfio-mdev/ statx/ lsm/
diff --git a/samples/lsm/Makefile b/samples/lsm/Makefile
new file mode 100644
index ..d4ccb940f18b
--- /dev/null
+++ b/samples/lsm/Makefile
@@ -0,0 +1,4 @@
+# builds the loadable LSM example kernel modules;
+# then to use one (as root): insmod
+# and to unload: rmmod module_name
+obj-$(CONFIG_SAMPLE_DYNAMIC_LSM) += lsm_example.o
diff --git a/samples/lsm/lsm_example.c b/samples/lsm/lsm_example.c
new file mode 100644
index ..95c56ebd4d16
--- /dev/null
+++ b/samples/lsm/lsm_example.c
@@ -0,0 +1,33 @@
+/*
+ * This sample hooks into the "settime"
+ *
+ * Once you run it, the following will not be allowed:
+ * date --set="October 21 2015 16:29:00 PDT"
+ */
+
+#include
+#include
+#include
+
+static int settime_cb(const struct timespec *ts, const struct timezone *tz)
+{
+	/* We aren't allowed to travel to October 21 2015 16:29 PDT */
+	if (ts->tv_sec >= 1445470140 && ts->tv_sec < 1445470200)
+		return -EPERM;
+
+	return 0;
+}
+
+static struct security_hook_list sample_hooks[] = {
+	LSM_HOOK_INIT(settime, settime_cb),
+};
+
+static int __init lsm_init(void)
+{
+	return security_add_dynamic_hooks(sample_hooks,
+					  ARRAY_SIZE(sample_hooks),
+					  "sample");
+}
+
+module_init(lsm_init)
+MODULE_LICENSE("GPL");
-- 
2.14.1
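The two constants in settime_cb() can be checked by hand: 16:29:00 PDT is 23:29:00 UTC, and 2015-10-21 is 16729 days after 1970-01-01, so the window starts at 16729 * 86400 + 84540 = 1445470140 and ends 60 seconds later. The userspace sketch below repeats that arithmetic. `days_from_civil()`, `utc_seconds()` and `settime_blocked()` are illustrative names chosen for this note; they are not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Days between 1970-01-01 and the given proleptic Gregorian date. */
static int64_t days_from_civil(int y, int m, int d)
{
	int64_t era, yoe, mp, doy, doe;

	if (m <= 2)
		y -= 1;			/* treat Jan/Feb as end of previous year */
	era = (y >= 0 ? y : y - 399) / 400;
	yoe = y - era * 400;		/* year of era, 0..399 */
	mp = (m + 9) % 12;		/* month index with March = 0 */
	doy = (153 * mp + 2) / 5 + d - 1;
	doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
	return era * 146097 + doe - 719468;
}

/* Seconds since the epoch for a UTC date and time (no leap seconds). */
static int64_t utc_seconds(int y, int m, int d, int hh, int mm, int ss)
{
	return days_from_civil(y, m, d) * 86400 + hh * 3600 + mm * 60 + ss;
}

/* Same half-open window as the sample's settime_cb(). */
static bool settime_blocked(int64_t tv_sec)
{
	return tv_sec >= 1445470140 && tv_sec < 1445470200;
}
```

Because the window is half-open and one minute wide, any request for a time within 16:29 PDT is rejected, while 16:30:00 PDT is allowed again.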
[PATCH v4 0/3] Safe, dynamically loadable LSM hooks
This patchset introduces safe dynamic LSM support. These are currently
not unloadable, until we figure out a use case that needs that. Adding
an unload hook is trivial given the way the patch is written.

This exposes a second mechanism of loading hooks which are in modules.
These hooks are behind static keys, so they should come at low
performance overhead. The built-in hook heads are read-only, whereas the
dynamic hooks are mutable.

Not all hooks can be loaded into. Some hooks are blacklisted, and
therefore trying to load a module which plugs into those hooks will fail.

One of the big benefits of loadable security modules is to help with
"unknown unknowns". Although livepatch is excellent, sometimes a
surgical LSM is simpler.

It includes an example LSM that prevents specific time travel.

Changes since v3:
  * re-added hook blacklist
  * return error, rather than panic, if unable to allocate memory

Changes since v2:
  * inode get/set security is re-added
  * xfrm singleton hook re-added
  * Security hooks are turned into an array
  * Security hooks and dynamic hooks enum is collapsed

Changes since v1:
  * It no longer allows unloading of modules
  * prctl is fixed
  * inode get/set security is removed
  * xfrm singleton hook removed

Sargun Dhillon (3):
  security: Refactor LSM hooks into an array and enum
  security: Expose a mechanism to load lsm hooks dynamically at runtime
  security: Add an example sample dynamic LSM

 include/linux/lsm_hooks.h | 459 --
 samples/Kconfig           |   6 +
 samples/Makefile          |   2 +-
 samples/lsm/Makefile      |   4 +
 samples/lsm/lsm_example.c |  33 
 security/Kconfig          |   9 +
 security/inode.c          |  13 +-
 security/security.c       | 222 --
 8 files changed, 508 insertions(+), 240 deletions(-)
 create mode 100644 samples/lsm/Makefile
 create mode 100644 samples/lsm/lsm_example.c

-- 
2.14.1
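The mechanism this cover letter describes might be modelled in userspace roughly as follows. Everything here is an assumption made for illustration: plain arrays stand in for the hook list heads, a bool stands in for the per-hook static key, SRCU protection is left out, and the dynamic hooks are assumed to run only after the built-in ones have allowed the operation. `add_dynamic_hook()`, `call_hook()`, the enum values and the error codes are hypothetical names and choices, not the patch's API.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum hook_id { HOOK_SETTIME, HOOK_BLACKLISTED_EXAMPLE, HOOK_MAX };

typedef int (*hook_fn)(long tv_sec);

#define MAX_DYNAMIC 4

/* Built-in hook: fixed at build time (read-only in the real kernel). */
static int builtin_settime(long tv_sec)
{
	(void)tv_sec;
	return 0;
}

static const hook_fn builtin_hooks[HOOK_MAX] = {
	[HOOK_SETTIME] = builtin_settime,
};

/* Dynamic hooks: mutable, filled in when a "module" registers. */
static hook_fn dynamic_hooks[HOOK_MAX][MAX_DYNAMIC];
static size_t dynamic_count[HOOK_MAX];
static bool dynamic_enabled[HOOK_MAX];	/* stand-in for a static key */

/* Hooks that refuse dynamic registration. */
static const bool blacklisted[HOOK_MAX] = {
	[HOOK_BLACKLISTED_EXAMPLE] = true,
};

static int add_dynamic_hook(enum hook_id id, hook_fn fn)
{
	if (blacklisted[id])
		return -EINVAL;		/* error code chosen for this sketch */
	if (dynamic_count[id] == MAX_DYNAMIC)
		return -ENOMEM;
	dynamic_hooks[id][dynamic_count[id]++] = fn;
	dynamic_enabled[id] = true;	/* enable the slow path for this hook */
	return 0;
}

static int call_hook(enum hook_id id, long arg)
{
	size_t i;
	int rc = 0;

	if (builtin_hooks[id])
		rc = builtin_hooks[id](arg);
	if (rc || !dynamic_enabled[id])
		return rc;		/* fast path: nothing registered */

	for (i = 0; i < dynamic_count[id]; i++) {
		rc = dynamic_hooks[id][i](arg);
		if (rc)
			break;		/* first denial wins */
	}
	return rc;
}

/* Mirrors the time window used by the sample LSM in this series. */
static int deny_time_travel(long tv_sec)
{
	return (tv_sec >= 1445470140 && tv_sec < 1445470200) ? -EPERM : 0;
}
```

Before registration, `call_hook(HOOK_SETTIME, ...)` only consults the built-in hook. After `add_dynamic_hook(HOOK_SETTIME, deny_time_travel)` the same call can also be denied by the dynamic hook, while registering against the blacklisted slot fails.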
[PATCH v4 2/3] security: Expose a mechanism to load lsm hooks dynamically at runtime
This patch adds dynamic security hooks. These hooks are designed to allow
for safe runtime loading. These hooks are only run after all built-in and
major LSMs are run. The LSMs enabled by this feature must be minor LSMs,
but they can poke at the security blobs, as the blobs should be
initialized by the time their callback happens.

There should be little runtime performance overhead for this feature, as
it's protected behind static_keys, which are disabled by default, and are
only enabled per-hook at runtime, when a module is loaded.

Currently, the hook heads for dynamic hooks are kept separate because,
unlike the built-in hook heads, they cannot be read-only: hooks are added
to them at runtime.

Some hooks are blacklisted, and attempting to load an LSM with any of them
in use will fail.

Signed-off-by: Sargun Dhillon
---
 include/linux/lsm_hooks.h |  26 +-
 security/Kconfig          |   9 +++
 security/inode.c          |  13 ++-
 security/security.c       | 198 --
 4 files changed, 234 insertions(+), 12 deletions(-)

diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index d28c7f5b01c1..4e6351957dff 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -28,6 +28,7 @@
 #include
 #include
 #include
+#include
 
 /**
  * union security_list_options - Linux Security Module hook function list
@@ -1968,6 +1969,9 @@ struct security_hook_list {
 	enum lsm_hook			head_idx;
 	union security_list_options	hook;
 	char				*lsm;
+#ifdef CONFIG_SECURITY_DYNAMIC_HOOKS
+	struct module			*owner;
+#endif
 } __randomize_layout;
 
 /*
@@ -1976,11 +1980,29 @@ struct security_hook_list {
  * care of the common case and reduces the amount of
  * text involved.
  */
+#ifdef CONFIG_SECURITY_DYNAMIC_HOOKS
+#define LSM_HOOK_INIT(HEAD, HOOK)		\
+	{					\
+		.head_idx = HOOK_IDX(HEAD),	\
+		.hook = { .HEAD = HOOK },	\
+		.owner = THIS_MODULE,		\
+	}
+
+#else
 #define LSM_HOOK_INIT(HEAD, HOOK) \
 	{ .head_idx = HOOK_IDX(HEAD), .hook = { .HEAD = HOOK } }
+#endif
 
-extern char *lsm_names;
-
+#ifdef CONFIG_SECURITY_DYNAMIC_HOOKS
+extern int security_add_dynamic_hooks(struct security_hook_list *hooks,
+				      int count, char *lsm);
+#else
+static inline int security_add_dynamic_hooks(struct security_hook_list *hooks,
+					     int count, char *lsm)
+{
+	return -EOPNOTSUPP;
+}
+#endif
 extern void security_add_hooks(struct security_hook_list *hooks, int count,
 				char *lsm);
 
diff --git a/security/Kconfig b/security/Kconfig
index c4302067a3ad..481b93b0d4d9 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -36,6 +36,15 @@ config SECURITY_WRITABLE_HOOKS
 	bool
 	default n
 
+config SECURITY_DYNAMIC_HOOKS
+	bool "Runtime loadable (minor) LSMs via LKMs"
+	depends on SECURITY && SRCU
+	help
+	  This enables LSMs which live in LKMs, and supports loading, and
+	  unloading them safely at runtime. These LSMs must be minor LSMs.
+	  They cannot circumvent the built-in LSMs.
+	  If you are unsure how to answer this question, answer N.
+
 config SECURITYFS
 	bool "Enable the securityfs filesystem"
 	help
diff --git a/security/inode.c b/security/inode.c
index 8dd9ca8848e4..89be07b044a5 100644
--- a/security/inode.c
+++ b/security/inode.c
@@ -22,6 +22,10 @@
 #include
 #include
 #include
+#include
+
+extern char *lsm_names;
+extern struct mutex lsm_lock;
 
 static struct vfsmount *mount;
 static int mount_count;
@@ -312,8 +316,13 @@ static struct dentry *lsm_dentry;
 static ssize_t lsm_read(struct file *filp, char __user *buf, size_t count,
 			loff_t *ppos)
 {
-	return simple_read_from_buffer(buf, count, ppos, lsm_names,
-		strlen(lsm_names));
+	ssize_t ret;
+
+	mutex_lock(&lsm_lock);
+	ret = simple_read_from_buffer(buf, count, ppos, lsm_names,
+				      strlen(lsm_names));
+	mutex_unlock(&lsm_lock);
+	return ret;
 }
 
 static const struct file_operations lsm_ops = {
diff --git a/security/security.c b/security/security.c
index b9fb297b824e..492d44dd0549 100644
--- a/security/security.c
+++ b/security/security.c
@@ -29,6 +29,7 @@
 #include
 #include
 #include
+#include
 
 #define MAX_LSM_EVM_XATTR	2
 
@@ -36,10 +37,18 @@
 #define SECURITY_NAME_MAX	10
 
 static struct list_head security_hook_heads[__MAX_LSM_HOOK] __lsm_ro_after_init;
-static ATOMIC_NOTIFIER_HEAD(lsm_notifier_chain);
-
 #define HOOK_HEAD(NAME)	(&security_hook_heads[HOOK_IDX(NAME)])
 
+#ifdef CONFIG_SECURITY_DYNAMIC_HOOKS
+static struct list_head dynamic_security_hook_heads[__MAX_LSM_HOOK];
+static struct srcu_struct dynamic_hook_srcus[__MAX_LSM_HOOK];
Re: WARNING: kmalloc bug in memdup_user
On Tue, Mar 06, 2018 at 10:59:02PM -0800, syzbot wrote:
> Hello,
>
> syzbot hit the following crash on upstream commit
> ce380619fab99036f5e745c7a865b21c59f005f6 (Tue Mar 6 04:31:14 2018 +)
> Merge tag 'please-pull-ia64_misc' of
> git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux
>
> So far this crash happened 52 times on upstream.
> C reproducer is attached.
> syzkaller reproducer is attached.
> Raw console output is attached.
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached.
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+a38b0e9f694c379ca...@syzkaller.appspotmail.com
> It will help syzbot understand when the bug is fixed. See footer for
> details.
> If you forward the report, please keep this part and the footer.
>
> audit: type=1400 audit(1520367364.281:6): avc: denied { map } for
> pid=4138 comm="bash" path="/bin/bash" dev="sda1" ino=1457
> scontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023
> tcontext=system_u:object_r:file_t:s0 tclass=file permissive=1
> audit: type=1400 audit(1520367370.605:7): avc: denied { map } for
> pid=4152 comm="syzkaller100190" path="/root/syzkaller100190328" dev="sda1"
> ino=16481 scontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023
> tcontext=unconfined_u:object_r:user_home_t:s0 tclass=file permissive=1
> WARNING: CPU: 0 PID: 4152 at mm/slab_common.c:1012 kmalloc_slab+0x5d/0x70
> mm/slab_common.c:1012
> Kernel panic - not syncing: panic_on_warn set ...
>
> CPU: 0 PID: 4152 Comm: syzkaller100190 Not tainted 4.16.0-rc4+ #343
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:17 [inline]
>  dump_stack+0x194/0x24d lib/dump_stack.c:53
>  panic+0x1e4/0x41c kernel/panic.c:183
>  __warn+0x1dc/0x200 kernel/panic.c:547
>  report_bug+0x211/0x2d0 lib/bug.c:184
>  fixup_bug.part.11+0x37/0x80 arch/x86/kernel/traps.c:178
>  fixup_bug arch/x86/kernel/traps.c:247 [inline]
>  do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296
>  do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
>  invalid_op+0x1b/0x40 arch/x86/entry/entry_64.S:986
> RIP: 0010:kmalloc_slab+0x5d/0x70 mm/slab_common.c:1012
> RSP: 0018:8801bf76f970 EFLAGS: 00010246
> RAX: RBX: fff4 RCX: 819733cb
> RDX: 8423372f RSI: RDI: 3efef4b4
> RBP: 8801bf76f970 R08: R09:
> R10: 88613380 R11: R12: 3efef4b4
> R13: 2080 R14: 014200c0 R15: 8801bf76fa68
>  __do_kmalloc mm/slab.c:3700 [inline]
>  __kmalloc_track_caller+0x21/0x760 mm/slab.c:3720
>  memdup_user+0x2c/0x90 mm/util.c:160
>  ucma_set_option+0x11f/0x4d0 drivers/infiniband/core/ucma.c:1297
>  ucma_write+0x2d6/0x3d0 drivers/infiniband/core/ucma.c:1627
>  __vfs_write+0xef/0x970 fs/read_write.c:480
>  vfs_write+0x189/0x510 fs/read_write.c:544
>  SYSC_write fs/read_write.c:589 [inline]
>  SyS_write+0xef/0x220 fs/read_write.c:581
>  do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
>  entry_SYSCALL_64_after_hwframe+0x42/0xb7
> RIP: 0033:0x43fe69
> RSP: 002b:7ffe099a6388 EFLAGS: 0217 ORIG_RAX: 0001
> RAX: ffda RBX: 004002c8 RCX: 0043fe69
> RDX: 006b RSI: 20c0 RDI: 0003
> RBP: 006ca018 R08: 004002c8 R09: 004002c8
> R10: 004002c8 R11: 0217 R12: 00401790
> R13: 00401820 R14: R15:
> Dumping ftrace buffer:
>    (ftrace buffer empty)
> Kernel Offset: disabled
> Rebooting in 86400 seconds..

I'm surprised that it surfaced only now. It is a clear bug: the user's
input wasn't checked. But it is not clear to me why optval wasn't declared
as u64.

Thanks
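
To illustrate the kind of check the reply says was missing, here is a minimal userspace sketch. It is not the actual ucma fix: dup_option() and OPT_MAX_LEN are hypothetical names invented for this example, and malloc()/memcpy() stand in for the kernel's allocation and copy-from-user steps. The only point is that a caller-supplied length gets bounded before it reaches the allocator.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical upper bound, chosen only for this sketch. */
#define OPT_MAX_LEN 4096u

/*
 * Userspace analogue of a bounded duplicate-from-caller helper.
 * The length is validated before any allocation is attempted.
 * Returns 0 and stores the copy in *out on success, -EINVAL for a
 * bad argument or length, -ENOMEM if the allocation fails.
 */
int dup_option(const void *src, size_t len, void **out)
{
	void *copy;

	if (src == NULL || out == NULL || len == 0 || len > OPT_MAX_LEN)
		return -EINVAL;

	copy = malloc(len);
	if (copy == NULL)
		return -ENOMEM;

	memcpy(copy, src, len);
	*out = copy;
	return 0;
}
```

With a check like this, an oversized request is rejected with an error code instead of being handed to the allocator, which is where the warning in the trace above was raised.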
[PATCH v4 1/3] security: Refactor LSM hooks into an array and enum
This commit should have no functional change. It changes the security hook
list heads struct into an array. Additionally, it exposes all of the hooks
via an enum. This loses memory layout randomization as the enum is not
randomized.

Signed-off-by: Sargun Dhillon
---
 include/linux/lsm_hooks.h | 433 +++---
 security/security.c       |  30 ++--
 2 files changed, 233 insertions(+), 230 deletions(-)

diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index 7161d8e7ee79..d28c7f5b01c1 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -1729,241 +1729,243 @@ union security_list_options {
 #endif /* CONFIG_BPF_SYSCALL */
 };
-struct security_hook_heads {
-	struct list_head binder_set_context_mgr;
-	struct list_head binder_transaction;
-	struct list_head binder_transfer_binder;
-	struct list_head binder_transfer_file;
-	struct list_head ptrace_access_check;
-	struct list_head ptrace_traceme;
-	struct list_head capget;
-	struct list_head capset;
-	struct list_head capable;
-	struct list_head quotactl;
-	struct list_head quota_on;
-	struct list_head syslog;
-	struct list_head settime;
-	struct list_head vm_enough_memory;
-	struct list_head bprm_set_creds;
-	struct list_head bprm_check_security;
-	struct list_head bprm_committing_creds;
-	struct list_head bprm_committed_creds;
-	struct list_head sb_alloc_security;
-	struct list_head sb_free_security;
-	struct list_head sb_copy_data;
-	struct list_head sb_remount;
-	struct list_head sb_kern_mount;
-	struct list_head sb_show_options;
-	struct list_head sb_statfs;
-	struct list_head sb_mount;
-	struct list_head sb_umount;
-	struct list_head sb_pivotroot;
-	struct list_head sb_set_mnt_opts;
-	struct list_head sb_clone_mnt_opts;
-	struct list_head sb_parse_opts_str;
-	struct list_head dentry_init_security;
-	struct list_head dentry_create_files_as;
+enum lsm_hook {
+	LSM_HOOK_binder_set_context_mgr,
+	LSM_HOOK_binder_transaction,
+	LSM_HOOK_binder_transfer_binder,
+	LSM_HOOK_binder_transfer_file,
+	LSM_HOOK_ptrace_access_check,
+	LSM_HOOK_ptrace_traceme,
+	LSM_HOOK_capget,
+	LSM_HOOK_capset,
+	LSM_HOOK_capable,
+	LSM_HOOK_quotactl,
+	LSM_HOOK_quota_on,
+	LSM_HOOK_syslog,
+	LSM_HOOK_settime,
+	LSM_HOOK_vm_enough_memory,
+	LSM_HOOK_bprm_set_creds,
+	LSM_HOOK_bprm_check_security,
+	LSM_HOOK_bprm_committing_creds,
+	LSM_HOOK_bprm_committed_creds,
+	LSM_HOOK_sb_alloc_security,
+	LSM_HOOK_sb_free_security,
+	LSM_HOOK_sb_copy_data,
+	LSM_HOOK_sb_remount,
+	LSM_HOOK_sb_kern_mount,
+	LSM_HOOK_sb_show_options,
+	LSM_HOOK_sb_statfs,
+	LSM_HOOK_sb_mount,
+	LSM_HOOK_sb_umount,
+	LSM_HOOK_sb_pivotroot,
+	LSM_HOOK_sb_set_mnt_opts,
+	LSM_HOOK_sb_clone_mnt_opts,
+	LSM_HOOK_sb_parse_opts_str,
+	LSM_HOOK_dentry_init_security,
+	LSM_HOOK_dentry_create_files_as,
 #ifdef CONFIG_SECURITY_PATH
-	struct list_head path_unlink;
-	struct list_head path_mkdir;
-	struct list_head path_rmdir;
-	struct list_head path_mknod;
-	struct list_head path_truncate;
-	struct list_head path_symlink;
-	struct list_head path_link;
-	struct list_head path_rename;
-	struct list_head path_chmod;
-	struct list_head path_chown;
-	struct list_head path_chroot;
+	LSM_HOOK_path_unlink,
+	LSM_HOOK_path_mkdir,
+	LSM_HOOK_path_rmdir,
+	LSM_HOOK_path_mknod,
+	LSM_HOOK_path_truncate,
+	LSM_HOOK_path_symlink,
+	LSM_HOOK_path_link,
+	LSM_HOOK_path_rename,
+	LSM_HOOK_path_chmod,
+	LSM_HOOK_path_chown,
+	LSM_HOOK_path_chroot,
 #endif
-	struct list_head inode_alloc_security;
-	struct list_head inode_free_security;
-	struct list_head inode_init_security;
-	struct list_head inode_create;
-	struct list_head inode_link;
-	struct list_head inode_unlink;
-	struct list_head inode_symlink;
-	struct list_head inode_mkdir;
-	struct list_head inode_rmdir;
-	struct list_head inode_mknod;
-	struct list_head inode_rename;
-	struct list_head inode_readlink;
-	struct list_head inode_follow_link;
-	struct list_head inode_permission;
-	struct list_head inode_setattr;
-	struct list_head inode_getattr;
-	struct list_head inode_setxattr;
-	struct list_head inode_post_setxattr;
-	struct list_head inode_getxattr;
-	struct list_head inode_listxattr;
-	struct list_head inode_removexattr;
-	struct list_head inode_need_killpriv;
-	struct list_head inode_killpriv;
-	struct list_head inode_getsecurity;
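
The archived diff is cut off before the array declaration, so the following is only a rough sketch of the pattern the commit message describes: an enum naming each hook, and an array of list heads indexed by that enum. It assumes a userspace stand-in for struct list_head (struct list_node here), and LSM_HOOK_MAX, hook_heads, hook_heads_init(), hook_add() and hook_is_empty() are hypothetical names that are not taken from the patch.

```c
#include <stddef.h>

/* Minimal stand-in for the kernel's struct list_head. */
struct list_node {
	struct list_node *next, *prev;
};

/* Three hook names from the patch plus a hypothetical sentinel. */
enum lsm_hook {
	LSM_HOOK_capget,
	LSM_HOOK_capset,
	LSM_HOOK_capable,
	LSM_HOOK_MAX	/* hypothetical: number of hooks, not a real hook */
};

/* One list head per hook, indexed by the enum instead of by field name. */
static struct list_node hook_heads[LSM_HOOK_MAX];

/* Make every head an empty circular list, in a single loop. */
void hook_heads_init(void)
{
	for (int i = 0; i < LSM_HOOK_MAX; i++) {
		hook_heads[i].next = &hook_heads[i];
		hook_heads[i].prev = &hook_heads[i];
	}
}

/* Append a node to the tail of the list for one hook. */
void hook_add(enum lsm_hook hook, struct list_node *node)
{
	struct list_node *head = &hook_heads[hook];

	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* An empty circular list has a head that points at itself. */
int hook_is_empty(enum lsm_hook hook)
{
	return hook_heads[hook].next == &hook_heads[hook];
}
```

One practical difference from the struct form is visible in hook_heads_init(): with an array, all heads can be initialized in a loop rather than one field at a time. The fixed enum order is also why the commit message notes that layout randomization is lost.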
Re: [PATCH 4.4 054/108] mtd: cfi: convert inline functions to macros
On Mon, 05 Mar 2018 02:22:52 + Ben Hutchings wrote:

> On Thu, 2018-02-15 at 16:16 +0100, Greg Kroah-Hartman wrote:
> > 4.4-stable review patch. If anyone has any objections, please let me know.
> >
> > --
> >
> > From: Arnd Bergmann
> >
> > commit 9e343e87d2c4c707ef8fae2844864d4dde3a2d13 upstream.
> [...]
> > -static inline int map_word_andequal(struct map_info *map, map_word val1, map_word val2, map_word val3)
> > -{
> > -	int i;
> > -
> > -	for (i = 0; i < map_words(map); i++) {
> > -		if ((val1.x[i] & val2.x[i]) != val3.x[i])
> > -			return 0;
> > -	}
> > -
> > -	return 1;
> > -}
> [...]
> > +#define map_word_andequal(map, val1, val2, val3) \
> > +({ \
> > +	int i, ret = 1; \
> > +	for (i = 0; i < map_words(map); i++) { \
> > +		if (((val1).x[i] & (val2).x[i]) != (val2).x[i]) { \
> [...]
>
> The right-hand side of this comparison is now using val2 instead of
> val3. (This bug seems to be unfixed upstream.)

Indeed. That said, it is harmless in practice, since all users of
map_word_andequal() pass the same value as val2 and val3. Maybe we should
just patch the macro and all call-sites to remove val3.

>
> Ben.
>

-- 
Boris Brezillon, Bootlin (formerly Free Electrons)
Embedded Linux and Kernel engineering
https://bootlin.com
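
The difference between the two comparisons can be shown with a small standalone sketch. This is not the kernel code: map_word is reduced to a fixed two-word struct, MAP_WORDS is a hypothetical constant, and the two functions below are plain-C stand-ins for the removed inline function and for the comparison that appears in the quoted macro.

```c
#include <stddef.h>

#define MAP_WORDS 2	/* hypothetical fixed word count for this sketch */

typedef struct {
	unsigned long x[MAP_WORDS];
} map_word;

/* Semantics of the removed inline function: (val1 & val2) == val3. */
int andequal_intended(map_word val1, map_word val2, map_word val3)
{
	for (size_t i = 0; i < MAP_WORDS; i++) {
		if ((val1.x[i] & val2.x[i]) != val3.x[i])
			return 0;
	}
	return 1;
}

/* Comparison as it appears in the quoted macro: val3 is never read. */
int andequal_as_converted(map_word val1, map_word val2, map_word val3)
{
	(void)val3;
	for (size_t i = 0; i < MAP_WORDS; i++) {
		if ((val1.x[i] & val2.x[i]) != val2.x[i])
			return 0;
	}
	return 1;
}
```

When a caller passes the same value for val2 and val3, both functions agree, which matches the observation in the reply. They diverge as soon as val3 differs from val2, for example when checking that masked bits are all clear (val3 zero with a non-zero mask).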
Re: [PATCH V2 1/2] mmc: sdhci-msm: Add support to store supported vdd-io voltages
Hi Doug, Jeremy,

On 3/3/2018 4:38 AM, Jeremy McNicoll wrote:
> On 2018-03-02 10:23 AM, Doug Anderson wrote:
> > Hi,
> >
> > On Sun, Feb 11, 2018 at 10:01 PM, Vijay Viswanath wrote:
> > > During probe check whether the vdd-io regulator of sdhc platform device
> > > can support 1.8V and 3V and store this information as a capability of
> > > platform device.
> > >
> > > Signed-off-by: Vijay Viswanath
> > > ---
> > >  drivers/mmc/host/sdhci-msm.c | 38 ++
> > >  1 file changed, 38 insertions(+)
> > >
> > > diff --git a/drivers/mmc/host/sdhci-msm.c b/drivers/mmc/host/sdhci-msm.c
> > > index c283291..5c23e92 100644
> > > --- a/drivers/mmc/host/sdhci-msm.c
> > > +++ b/drivers/mmc/host/sdhci-msm.c
> > > @@ -23,6 +23,7 @@
> > >  #include
> > >  #include "sdhci-pltfm.h"
> > > +#include
> >
> > This is a strange sort order for this include file. Why is it after
> > the local include?
> >
> > >  #define CORE_MCI_VERSION 0x50
> > >  #define CORE_VERSION_MAJOR_SHIFT 28
> > > @@ -81,6 +82,9 @@
> > >  #define CORE_HC_SELECT_IN_HS400 (6 << 19)
> > >  #define CORE_HC_SELECT_IN_MASK (7 << 19)
> > > +#define CORE_3_0V_SUPPORT (1 << 25)
> > > +#define CORE_1_8V_SUPPORT (1 << 26)
> > > +
> >
> > Is there something magical about 25 and 26? This is a new caps field,
> > so I'd have expected 0 and 1.

Yes, these bit positions match the corresponding capability bits in the
Capabilities Register (offset 0x40). The bit positions become important
when the capabilities register doesn't show support for some voltages, but
we can support those voltages. In that case we will have to fake the
capabilities. The changes for that are not yet pushed up.

> > >  #define CORE_CSR_CDC_CTLR_CFG0 0x130
> > >  #define CORE_SW_TRIG_FULL_CALIB BIT(16)
> > >  #define CORE_HW_AUTOCAL_ENA BIT(17)
> > > @@ -148,6 +152,7 @@ struct sdhci_msm_host {
> > >  	u32 curr_io_level;
> > >  	wait_queue_head_t pwr_irq_wait;
> > >  	bool pwr_irq_flag;
> > > +	u32 caps_0;
> > >  };
> > >  static unsigned int msm_get_clock_rate_for_bus_mode(struct sdhci_host *host,
> > > @@ -1313,6 +1318,35 @@ static void sdhci_msm_writeb(struct sdhci_host *host, u8 val, int reg)
> > >  	sdhci_msm_check_power_status(host, req_type);
> > >  }
> > > +static int sdhci_msm_set_regulator_caps(struct sdhci_msm_host *msm_host)
> > > +{
> > > +	struct mmc_host *mmc = msm_host->mmc;
> > > +	struct regulator *supply = mmc->supply.vqmmc;
> > > +	int i, count;
> > > +	u32 caps = 0, vdd_uV;
> > > +
> > > +	if (!IS_ERR(mmc->supply.vqmmc)) {
> > > +		count = regulator_count_voltages(supply);
> > > +		if (count < 0)
> > > +			return count;
> > > +		for (i = 0; i < count; i++) {
> > > +			vdd_uV = regulator_list_voltage(supply, i);
> > > +			if (vdd_uV <= 0)
> > > +				continue;
> > > +			if (vdd_uV > 270)
> > > +				caps |= CORE_3_0V_SUPPORT;
> > > +			if (vdd_uV < 195)
> > > +				caps |= CORE_1_8V_SUPPORT;
> > > +		}
> >
> > Shouldn't you be using regulator_is_supported_voltage() rather than
> > open coding? Also: I've never personally worked on a device where it
> > was used, but there is definitely a concept floating about of a voltage
> > level of 1.2V. Maybe should copy the ranges from
> > mmc_regulator_set_vqmmc()?

regulator_is_supported_voltage() checks for a range and it also uses
regulator_list_voltage() internally. regulator_list_voltage() is also an
exported API for use by drivers AFAIK. Please correct me if it is not.

> > Also: seems like you should have some way to deal with "caps" ending up
> > w/ no bits set. IIRC you can have a regulator that can be enabled /
> > disabled but doesn't list a voltage, so if someone messed up their
> > device tree you could end up in this case. Should you print a warning?
> > ...or treat it as if we support "3.0V"? ...or ? I guess it depends on
> > how do you want patch #2 to behave in that case.
>
> Both, initialize it to sane value and print something. This way at least
> you have a good chance of booting and not hard hanging and you are given
> a reasonable message indicating what needs to be fixed.
>
> -jeremy
>
> > > +	}
> >
> > How should things behave if vqmmc is an error? In that case is it
> > important to not set "CORE_IO_PAD_PWR_SWITCH_EN" in patch set #2?
> > ...or should you set "CORE_IO_PAD_PWR_SWITCH_EN" but then make sure you
> > don't set "CORE_IO_PAD_PWR_SWITCH"?

Thanks for the suggestion. If the regulators exist and don't list the
voltages, then I believe initialization itself will not happen. We will
not have any available OCR, and sdhci_setup_host should fail. But these
enhancements can be incorporated. Since this patch is already
acknowledged, I will incorporate these changes in a subsequent patch.

> > > +	msm_host->caps_0 |= caps;
> > > +	pr_debug("%s: %s: supported caps: 0x%08x\n", mmc_hostname(mmc),
> > > +			__func__, caps);
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +
> > >  static const struct of_device_id sdhci_msm_dt_match[] = {
> > >  	{ .compatible = "qcom,sdhci-msm-v4" },
> > >  	{},
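
The classification loop and the suggested fallback for an empty result can be sketched in isolation. This is a userspace approximation, not driver code: caps_from_voltages() is a hypothetical helper that takes a plain array where the driver would query the regulator, and the two microvolt thresholds are assumptions, since the numeric constants in the archived patch appear truncated. The fallback to 3.0V is just one of the options raised in the review, picked here for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define CORE_3_0V_SUPPORT (1u << 25)
#define CORE_1_8V_SUPPORT (1u << 26)

/* Assumed thresholds in microvolts; not taken from the archived patch. */
#define VDD_3_0V_MIN_UV 2700000
#define VDD_1_8V_MAX_UV 1950000

/*
 * Classify a list of selectable voltages (in microvolts) into
 * capability bits. Entries <= 0 stand for "not selectable" or an
 * error code and are skipped; the values are kept signed so that
 * negative error codes are actually caught by that test.
 */
uint32_t caps_from_voltages(const int *uV, size_t count)
{
	uint32_t caps = 0;

	for (size_t i = 0; i < count; i++) {
		if (uV[i] <= 0)
			continue;
		if (uV[i] > VDD_3_0V_MIN_UV)
			caps |= CORE_3_0V_SUPPORT;
		if (uV[i] < VDD_1_8V_MAX_UV)
			caps |= CORE_1_8V_SUPPORT;
	}

	/* Nothing usable listed: warn and fall back to a sane default. */
	if (caps == 0) {
		fprintf(stderr, "no usable vdd-io voltage listed, assuming 3.0V\n");
		caps = CORE_3_0V_SUPPORT;
	}

	return caps;
}
```

Calling it with {1800000, 3300000} would set both bits, while a list containing only error codes or zeros would take the warning path and report 3.0V only.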
Re: [PATCH 1/3] vfio/pci: Pull BAR mapping setup from read-write path
On Wed, Feb 28, 2018 at 01:14:46PM -0700, Alex Williamson wrote:
> This creates a common helper that we'll use for ioeventfd setup.
>
> Signed-off-by: Alex Williamson

Reviewed-by: Peter Xu

-- 
Peter Xu
Re: [PATCH v6] mmc: Export host capabilities to debugfs.
On Wednesday 07 March 2018 12:10 PM, Avri Altman wrote:
>
>> -Original Message-
>> From: Harish Jenny K N [mailto:harish_kand...@mentor.com]
>> Sent: Wednesday, March 07, 2018 7:38 AM
>> To: ulf.hans...@linaro.org; linus.wall...@linaro.org;
>> adrian.hun...@intel.com; shawn@rock-chips.com; Avri Altman;
>> andriy.shevche...@linux.intel.com
>> Cc: linux-...@vger.kernel.org; linux-kernel@vger.kernel.org;
>> harish_kand...@mentor.com; vladimir_zapols...@mentor.com
>> Subject: [PATCH v6] mmc: Export host capabilities to debugfs.
>>
>> This patch exports the host capabilities to debugfs
>>
>> This idea of sharing host capabilities over debugfs came up from Abbas Raza
>> Earlier discussions:
>> https://lkml.org/lkml/2018/3/5/357
>> https://www.spinics.net/lists/linux-mmc/msg48219.html
>>
>> Signed-off-by: Harish Jenny K N
>> ---
>>
>>
>> +static int mmc_caps_show(struct seq_file *s, void *unused) {
>> +	struct mmc_host *host = s->private;
>> +	u32 caps = host->caps;
>> +
>> +	seq_puts(s, "\nMMC Host capabilities are:\n");
>> +	seq_puts(s, "=\n");
>> +	seq_printf(s, "Can the host do 4 bit transfers :\t%s\n",
>> +		   ((caps & MMC_CAP_4_BIT_DATA) ? "Yes" : "No"));
> Maybe use a more compact form, and just call a macro with the applicable
> (stringified) bit?

Something like this?

#define YN(bit) ((caps & bit) ? "Yes" : "No")

and then call

seq_printf(s, "Can the host do 4 bit transfers :\t%s\n", YN(MMC_CAP_4_BIT_DATA));

Thanks,
Harish Jenny K N
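
As a hedged illustration of the macro idea, the sketch below shows one possible variant in plain userspace C. CAP_YN() and format_cap() are hypothetical names, snprintf() stands in for seq_printf(), and the capability bit values are placeholders defined only so the example is self-contained. Compared with the YN() form proposed in the reply, this variant takes the caps word as an explicit argument instead of relying on a local variable named caps, and parenthesizes both arguments so that an expression such as A | B can be passed as the bit.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Placeholder bit values, defined here only for the sketch. */
#define MMC_CAP_4_BIT_DATA (1u << 0)
#define MMC_CAP_8_BIT_DATA (1u << 6)

/* Both arguments are parenthesized so compound expressions stay intact. */
#define CAP_YN(caps, bit) (((caps) & (bit)) ? "Yes" : "No")

/*
 * Format one "description : Yes/No" line into buf.
 * Returns the snprintf() result (length the full line needs).
 */
int format_cap(char *buf, size_t len, const char *desc,
	       uint32_t caps, uint32_t bit)
{
	return snprintf(buf, len, "%s :\t%s\n", desc, CAP_YN(caps, bit));
}
```

The parentheses matter because & binds tighter than |: without them, a call like YN(A | B) would expand to (caps & A | B), which is non-zero whenever B is non-zero, regardless of caps.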
Re: [PATCHv2 2/5] x86/boot/compressed/64: Find a place for 32-bit trampoline
* Kirill A. Shutemov wrote:

> On Tue, Feb 27, 2018 at 06:42:14PM +0300, Kirill A. Shutemov wrote:
> > If a bootloader enables 64-bit mode with 4-level paging, we might need to
> > switch over to 5-level paging. The switching requires the disabling of
> > paging, which works fine if kernel itself is loaded below 4G.
> >
> > But if the bootloader puts the kernel above 4G (not sure if anybody does
> > this), we would lose control as soon as paging is disabled, because the
> > code becomes unreachable to the CPU.
> >
> > To handle the situation, we need a trampoline in lower memory that would
> > take care of switching on 5-level paging.
> >
> > This patch finds a spot in low memory for a trampoline.
> >
> > The heuristic is based on code in reserve_bios_regions().
> >
> > We find the end of low memory based on BIOS and EBDA start addresses.
> > The trampoline is put just before end of low memory. It mimics the
> > approach taken to allocate memory for the real-mode trampoline.
> >
> > Signed-off-by: Kirill A. Shutemov
> > Tested-by: Borislav Petkov
> > ---
> >  arch/x86/boot/compressed/misc.c       |  4
> >  arch/x86/boot/compressed/pgtable.h    | 11 +++
> >  arch/x86/boot/compressed/pgtable_64.c | 34 ++
> >  3 files changed, 49 insertions(+)
> >  create mode 100644 arch/x86/boot/compressed/pgtable.h
> >
> > diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
> > index b50c42455e25..e58409667b13 100644
> > --- a/arch/x86/boot/compressed/misc.c
> > +++ b/arch/x86/boot/compressed/misc.c
> > @@ -14,6 +14,7 @@
> >
> >  #include "misc.h"
> >  #include "error.h"
> > +#include "pgtable.h"
> >  #include "../string.h"
> >  #include "../voffset.h"
> >
> > @@ -372,6 +373,9 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
> >  	debug_putaddr(output_len);
> >  	debug_putaddr(kernel_total_size);
> >
> > +	/* Report address of 32-bit trampoline */
> > +	debug_putaddr(trampoline_32bit);
> > +
> >  	/*
> >  	 * The memory hole needed for the kernel is the larger of either
> >  	 * the entire decompressed kernel plus relocation table, or the
>
> 0-day found problem with the patch on 32-bit config.
>
> Here's fixup:
>
> diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
> index e58409667b13..8e4b55dd5df9 100644
> --- a/arch/x86/boot/compressed/misc.c
> +++ b/arch/x86/boot/compressed/misc.c
> @@ -373,8 +373,10 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
>  	debug_putaddr(output_len);
>  	debug_putaddr(kernel_total_size);
>
> +#ifdef CONFIG_X86_64
>  	/* Report address of 32-bit trampoline */
>  	debug_putaddr(trampoline_32bit);
> +#endif

The prototype of trampoline_32bit should be in an #ifdef as well, as the
variable only exists on 64-bit kernels.

Thanks,

	Ingo
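
The review point, guarding the declaration with the same condition as the use, can be shown with a small standalone sketch. It is not the boot code: SKETCH_X86_64 is a hypothetical stand-in for CONFIG_X86_64, put_value() and report_values() are invented helpers, and the sizes are placeholder numbers. The macro is left undefined below, which corresponds to the 32-bit configuration.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for CONFIG_X86_64; uncomment for the 64-bit case. */
/* #define SKETCH_X86_64 */

#ifdef SKETCH_X86_64
/* Only visible when the variable can actually exist. */
unsigned long *trampoline_32bit;
#endif

static unsigned long output_len = 0x1000;		/* placeholder value */
static unsigned long kernel_total_size = 0x2000;	/* placeholder value */

static void put_value(const char *name, unsigned long value)
{
	printf("%s: 0x%lx\n", name, value);
}

/* Prints the debug values and returns how many were reported. */
int report_values(void)
{
	int reported = 0;

	put_value("output_len", output_len);
	reported++;
	put_value("kernel_total_size", kernel_total_size);
	reported++;
#ifdef SKETCH_X86_64
	/* Use is guarded by the same condition as the declaration. */
	put_value("trampoline_32bit", (unsigned long)(uintptr_t)trampoline_32bit);
	reported++;
#endif
	return reported;
}
```

Keeping both the declaration and the use under one condition means an accidental unguarded reference in the other configuration fails at compile time as an undeclared identifier, rather than surviving until link time.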
[PATCH] scsi: jazz_esp, sun3x_esp: Pass struct device pointer in dma calls
In jazz_esp and sun3x_esp, the esp_driver_ops methods pass esp->dev in
dma api calls as if it was a pointer to a struct device. But it actually
points to a struct platform_device. Fix this.

Cc: Thomas Bogendoerfer
Signed-off-by: Finn Thain
---
 drivers/scsi/jazz_esp.c  | 2 +-
 drivers/scsi/sun3x_esp.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/jazz_esp.c b/drivers/scsi/jazz_esp.c
index 9aaa74e349cc..6eb5ff3e2e61 100644
--- a/drivers/scsi/jazz_esp.c
+++ b/drivers/scsi/jazz_esp.c
@@ -147,7 +147,7 @@ static int esp_jazz_probe(struct platform_device *dev)
 	esp = shost_priv(host);
 
 	esp->host = host;
-	esp->dev = dev;
+	esp->dev = &dev->dev;
 	esp->ops = &jazz_esp_ops;
 
 	res = platform_get_resource(dev, IORESOURCE_MEM, 0);
diff --git a/drivers/scsi/sun3x_esp.c b/drivers/scsi/sun3x_esp.c
index d50c5ed8f428..0b1421cdf8a0 100644
--- a/drivers/scsi/sun3x_esp.c
+++ b/drivers/scsi/sun3x_esp.c
@@ -210,7 +210,7 @@ static int esp_sun3x_probe(struct platform_device *dev)
 	esp = shost_priv(host);
 
 	esp->host = host;
-	esp->dev = dev;
+	esp->dev = &dev->dev;
 	esp->ops = &sun3x_esp_ops;
 
 	res = platform_get_resource(dev, IORESOURCE_MEM, 0);
-- 
2.16.1
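The distinction the patch relies on is that a platform device embeds a generic device, so the two pointers are not interchangeable. A minimal userspace sketch, assuming simplified stand-in types (every `*_sketch` name is hypothetical and only the embedding is modelled):

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; only the embedding matters. */
struct device_sketch {
	const char *name;
};

struct platform_device_sketch {
	int id;
	struct device_sketch dev;	/* embedded generic device */
};

struct esp_sketch {
	struct device_sketch *dev;
};

void esp_probe_sketch(struct esp_sketch *esp,
		      struct platform_device_sketch *pdev)
{
	/* Store the embedded device, not the platform device itself. */
	esp->dev = &pdev->dev;
}
```

Code that later treats `esp->dev` as a generic device then sees the right object, which is what the DMA calls mentioned in the commit message expect.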
[tip:x86/pti] objtool: Fix 32-bit build
Commit-ID:  63474dc4ac7ed3848a4786b9592dd061901f606d
Gitweb:     https://git.kernel.org/tip/63474dc4ac7ed3848a4786b9592dd061901f606d
Author:     Josh Poimboeuf
AuthorDate: Tue, 6 Mar 2018 17:58:15 -0600
Committer:  Ingo Molnar
CommitDate: Wed, 7 Mar 2018 07:50:38 +0100

objtool: Fix 32-bit build

Fix the objtool build when cross-compiling a 64-bit kernel on a 32-bit
host. This also simplifies read_retpoline_hints() a bit and makes its
implementation similar to most of the other annotation reading functions.

Reported-by: Sven Joachim
Signed-off-by: Josh Poimboeuf
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Fixes: b5bc2231b8ad ("objtool: Add retpoline validation")
Link: http://lkml.kernel.org/r/2ca46c636c23aa9c9d57d53c75de4ee3ddf7a7df.1520380691.git.jpoim...@redhat.com
Signed-off-by: Ingo Molnar
---
 tools/objtool/check.c | 27 +++
 1 file changed, 7 insertions(+), 20 deletions(-)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 46c1d239cc1b..92b6a2c21631 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1116,42 +1116,29 @@ static int read_unwind_hints(struct objtool_file *file)
 
 static int read_retpoline_hints(struct objtool_file *file)
 {
-	struct section *sec, *relasec;
+	struct section *sec;
 	struct instruction *insn;
 	struct rela *rela;
-	int i;
 
-	sec = find_section_by_name(file->elf, ".discard.retpoline_safe");
+	sec = find_section_by_name(file->elf, ".rela.discard.retpoline_safe");
 	if (!sec)
 		return 0;
 
-	relasec = sec->rela;
-	if (!relasec) {
-		WARN("missing .rela.discard.retpoline_safe section");
-		return -1;
-	}
-
-	if (sec->len % sizeof(unsigned long)) {
-		WARN("retpoline_safe size mismatch: %d %ld", sec->len, sizeof(unsigned long));
-		return -1;
-	}
-
-	for (i = 0; i < sec->len / sizeof(unsigned long); i++) {
-		rela = find_rela_by_dest(sec, i * sizeof(unsigned long));
-		if (!rela) {
-			WARN("can't find rela for retpoline_safe[%d]", i);
+	list_for_each_entry(rela, &sec->rela_list, list) {
+		if (rela->sym->type != STT_SECTION) {
+			WARN("unexpected relocation symbol type in %s", sec->name);
 			return -1;
 		}
 
 		insn = find_insn(file, rela->sym->sec, rela->addend);
 		if (!insn) {
-			WARN("can't find insn for retpoline_safe[%d]", i);
+			WARN("bad .discard.retpoline_safe entry");
 			return -1;
 		}
 
 		if (insn->type != INSN_JUMP_DYNAMIC &&
 		    insn->type != INSN_CALL_DYNAMIC) {
-			WARN_FUNC("retpoline_safe hint not a indirect jump/call",
+			WARN_FUNC("retpoline_safe hint not an indirect jump/call",
 				  insn->sec, insn->offset);
 			return -1;
 		}
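The commit message does not spell out what exactly broke on a 32-bit host. One host dependence that is visible in the removed lines is `sizeof(unsigned long)`, which is a property of the machine running objtool rather than of the kernel being checked. The sketch below only illustrates that general hazard; `TARGET_ENTRY_SIZE` and both function names are assumptions made for the example, not objtool code.

```c
#include <stddef.h>

/* Assumed entry size for a 64-bit target: one pointer-sized slot. */
#define TARGET_ENTRY_SIZE 8u

/* Host-dependent: sizeof(unsigned long) is 4 on ILP32 hosts, 8 on LP64. */
size_t entries_by_host_long(size_t sec_len)
{
	return sec_len / sizeof(unsigned long);
}

/* Host-independent: uses the target's entry size explicitly. */
size_t entries_by_target_size(size_t sec_len)
{
	return sec_len / TARGET_ENTRY_SIZE;
}
```

Walking the relocation list directly, as the patch does, avoids computing an entry count from a host type at all.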
[GIT PULL] s390 patches for 4.16-rc5
Hi Linus, please pull from the 'for-linus' branch of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git for-linus to receive the following updates: Nine bug fixes for s390: * Three fixes for the expoline code, one of them is strictly speaking a cleanup but as it relates to code added with 4.16 I would like to include the patch. * Three timer related fixes in the common I/O layer * A fix for the handling of internal DASD request which could cause panics. * One correction in regard to the accounting of pud page tables vs. compat tasks. * The register scrubbing in entry.S caused spurious crashes, this is fixed now as well. Christian Borntraeger (1): s390/entry.S: fix spurious zeroing of r0 Eugeniu Rosca (1): s390: Replace IS_ENABLED(EXPOLINE_*) with IS_ENABLED(CONFIG_EXPOLINE_*) Guenter Roeck (1): s390: Fix runtime warning about negative pgtables_bytes Hendrik Brueckner (1): s390/clean-up: use CFI_* macros in entry.S Martin Schwidefsky (1): s390: do not bypass BPENTER for interrupt system calls Sebastian Ott (3): s390/cio: fix ccw_device_start_timeout API s390/cio: fix return code after missing interrupt s390/cio: clear timer when terminating driver I/O Stefan Haberland (1): s390/dasd: fix handling of internal requests arch/s390/include/asm/mmu_context.h | 1 + arch/s390/kernel/entry.S| 10 +++--- arch/s390/kernel/nospec-branch.c| 4 +-- drivers/s390/block/dasd.c | 21 --- drivers/s390/cio/device_fsm.c | 7 ++-- drivers/s390/cio/device_ops.c | 72 + drivers/s390/cio/io_sch.h | 1 + 7 files changed, 54 insertions(+), 62 deletions(-) diff --git a/arch/s390/include/asm/mmu_context.h b/arch/s390/include/asm/mmu_context.h index 65154ea..6c8ce15 100644 --- a/arch/s390/include/asm/mmu_context.h +++ b/arch/s390/include/asm/mmu_context.h @@ -63,6 +63,7 @@ static inline int init_new_context(struct task_struct *tsk, _ASCE_USER_BITS | _ASCE_TYPE_SEGMENT; /* pgd_alloc() did not account this pmd */ mm_inc_nr_pmds(mm); + mm_inc_nr_puds(mm); } crst_table_init((unsigned long *) 
mm->pgd, pgd_entry_type(mm)); return 0; diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S index 13a133a..a5621ea 100644 --- a/arch/s390/kernel/entry.S +++ b/arch/s390/kernel/entry.S @@ -14,6 +14,7 @@ #include #include #include +#include #include #include #include @@ -230,7 +231,7 @@ _PIF_WORK = (_PIF_PER_TRAP | _PIF_SYSCALL_RESTART) .hidden \name .type \name,@function \name: - .cfi_startproc + CFI_STARTPROC #ifdef CONFIG_HAVE_MARCH_Z10_FEATURES exrl0,0f #else @@ -239,7 +240,7 @@ _PIF_WORK = (_PIF_PER_TRAP | _PIF_SYSCALL_RESTART) #endif j . 0: br \reg - .cfi_endproc + CFI_ENDPROC .endm GEN_BR_THUNK __s390x_indirect_jump_r1use_r9,%r9,%r1 @@ -426,13 +427,13 @@ ENTRY(system_call) UPDATE_VTIME %r8,%r9,__LC_SYNC_ENTER_TIMER BPENTER __TI_flags(%r12),_TIF_ISOLATE_BP stmg%r0,%r7,__PT_R0(%r11) - # clear user controlled register to prevent speculative use - xgr %r0,%r0 mvc __PT_R8(64,%r11),__LC_SAVE_AREA_SYNC mvc __PT_PSW(16,%r11),__LC_SVC_OLD_PSW mvc __PT_INT_CODE(4,%r11),__LC_SVC_ILC stg %r14,__PT_FLAGS(%r11) .Lsysc_do_svc: + # clear user controlled register to prevent speculative use + xgr %r0,%r0 # load address of system call table lg %r10,__THREAD_sysc_table(%r13,%r12) llgh%r8,__PT_INT_CODE+2(%r11) @@ -1439,6 +1440,7 @@ cleanup_critical: stg %r15,__LC_SYSTEM_TIMER 0: # update accounting time stamp mvc __LC_LAST_UPDATE_TIMER(8),__LC_SYNC_ENTER_TIMER + BPENTER __TI_flags(%r12),_TIF_ISOLATE_BP # set up saved register r11 lg %r15,__LC_KERNEL_STACK la %r9,STACK_FRAME_OVERHEAD(%r15) diff --git a/arch/s390/kernel/nospec-branch.c b/arch/s390/kernel/nospec-branch.c index 69d7fcf..9aff72d 100644 --- a/arch/s390/kernel/nospec-branch.c +++ b/arch/s390/kernel/nospec-branch.c @@ -2,8 +2,8 @@ #include #include -int nospec_call_disable = IS_ENABLED(EXPOLINE_OFF); -int nospec_return_disable = !IS_ENABLED(EXPOLINE_FULL); +int nospec_call_disable = IS_ENABLED(CONFIG_EXPOLINE_OFF); +int nospec_return_disable = !IS_ENABLED(CONFIG_EXPOLINE_FULL); static int __init 
nospectre_v2_setup_early(char *str) { diff --git a/drivers/s390/block/dasd.c b/drivers/s390/block/dasd.c index a7c15f0..ecef8e7 100644 --- a/drivers/s390/block/dasd.c +++ b/drivers/s390/block/dasd.c @@ -2581,8 +2581,6 @@ int dasd_cancel_req(struct dasd_ccw_req *cqr) case DASD_CQR_QUEUED: /* request was not started - just set to cleared */
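The nospec-branch.c hunk above adds the missing CONFIG_ prefix. Why the unprefixed form compiles silently can be seen from a cut-down version of the preprocessor trick behind IS_ENABLED(). The sketch is simplified from the pattern in include/linux/kconfig.h (built-in options only, no module case); all macro names ending in `_SKETCH` or starting with `IS_DEF_` are local to this example.

```c
/* Expands to "0," only when the pasted suffix is exactly 1. */
#define ARG_PLACEHOLDER_1 0,
#define TAKE_SECOND_ARG(ignored, val, ...) val

/* Three steps so the option is macro-expanded before token pasting. */
#define IS_DEF_STEP1(x) IS_DEF_STEP2(x)
#define IS_DEF_STEP2(val) IS_DEF_STEP3(ARG_PLACEHOLDER_##val)
#define IS_DEF_STEP3(arg1_or_junk) TAKE_SECOND_ARG(arg1_or_junk 1, 0, 0)

#define IS_ENABLED_SKETCH(option) IS_DEF_STEP1(option)

/* Assumption for the example: the option is enabled in the config. */
#define CONFIG_EXPOLINE_FULL 1

/* Correct spelling: evaluates to 1 because the macro is defined to 1. */
int with_prefix(void)
{
	return IS_ENABLED_SKETCH(CONFIG_EXPOLINE_FULL);
}

/* Missing prefix: no such macro exists, so this is silently 0. */
int without_prefix(void)
{
	return IS_ENABLED_SKETCH(EXPOLINE_FULL);
}
```

Because an undefined name is not an error here, a misspelled option yields a constant 0 instead of a build failure, which would explain how the typo could go unnoticed.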
Re: [pci PATCH v3 0/3] Add support for unmanaged SR-IOV
On Tue, Mar 06, 2018 at 11:29:08AM -0800, Alexander Duyck wrote: > This series is meant to add support for SR-IOV on devices when the VFs are > not managed by the kernel. Examples of recent patches attempting to do this > include: > virto - https://patchwork.kernel.org/patch/10241225/ > pci-stub - https://patchwork.kernel.org/patch/10109935/ > vfio - https://patchwork.kernel.org/patch/10103353/ > uio - https://patchwork.kernel.org/patch/9974031/ nvme and ema seems to be existing examples. Care to throw in conversions while you're at it?
RE: [PATCH v6] mmc: Export host capabilities to debugfs.
> -Original Message- > From: Harish Jenny K N [mailto:harish_kand...@mentor.com] > Sent: Wednesday, March 07, 2018 7:38 AM > To: ulf.hans...@linaro.org; linus.wall...@linaro.org; > adrian.hun...@intel.com; shawn@rock-chips.com; Avri Altman >; andriy.shevche...@linux.intel.com > Cc: linux-...@vger.kernel.org; linux-kernel@vger.kernel.org; > harish_kand...@mentor.com; vladimir_zapols...@mentor.com > Subject: [PATCH v6] mmc: Export host capabilities to debugfs. > > This patch exports the host capabilities to debugfs > > This idea of sharing host capabilities over debugfs came up from Abbas Raza > Earlier discussions: > https://lkml.org/lkml/2018/3/5/357 > https://www.spinics.net/lists/linux-mmc/msg48219.html > > Signed-off-by: Harish Jenny K N > --- > > > +static int mmc_caps_show(struct seq_file *s, void *unused) { > + struct mmc_host *host = s->private; > + u32 caps = host->caps; > + > + seq_puts(s, "\nMMC Host capabilities are:\n"); > + seq_puts(s, > "=\n"); > + seq_printf(s, "Can the host do 4 bit transfers :\t%s\n", > +((caps & MMC_CAP_4_BIT_DATA) ? "Yes" : "No")); Maybe use a more compact form, and just call a macro with the applicable (stringified) bit? Thanks, Avri
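The suggestion above (a macro that takes the bit and stringifies it) might look roughly like the following userspace sketch. The flag values and the `SHOW_CAP` name are hypothetical, and `snprintf()` stands in for `seq_printf()`.

```c
#include <stdio.h>

/* Arbitrary bit values for the sketch; not taken from the mmc headers. */
#define CAP_4_BIT_DATA_SKETCH (1u << 0)
#define CAP_8_BIT_DATA_SKETCH (1u << 6)

/* Formats "<FLAG NAME>: Yes|No"; #bit turns the argument into a string. */
#define SHOW_CAP(buf, len, caps, bit) \
	snprintf((buf), (len), "%s: %s", #bit, ((caps) & (bit)) ? "Yes" : "No")
```

Each capability then costs one line at the call site, for example `SHOW_CAP(buf, sizeof(buf), caps, CAP_4_BIT_DATA_SKETCH);`, instead of a hand-written sentence per bit.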
[PATCH 1/1] iommu/arm-smmu: Add support for qcom,smmu-500 variant
Qualcomm's arm-smmu 500 implementation supports runtime pm
so enable the same.

Signed-off-by: Vivek Gautam
---

Based on iommu/arm-smmu pm runtime support series [1]:
[PATCH v8 0/5] iommu/arm-smmu: Add runtime pm/sleep support

Tested on sdm845 with necessary support to enable the smmu
and with necessary user.

[1] https://lkml.org/lkml/2018/3/2/325

 Documentation/devicetree/bindings/iommu/arm,smmu.txt | 14 ++
 drivers/iommu/arm-smmu.c                             |  8 
 2 files changed, 22 insertions(+)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
index 6ea27bd4f785..0b5c6d2a9865 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
@@ -18,6 +18,7 @@ conditions.
                         "arm,mmu-500"
                         "cavium,smmu-v2"
                         "qcom,<soc>-smmu-v2", "qcom,smmu-v2"
+                        "qcom,<soc>-smmu-500", "qcom,smmu-500"
 
                   depending on the particular implementation and/or the
                   version of the architecture implemented.
@@ -30,6 +31,10 @@ conditions.
                   An example string would be -
                   "qcom,msm8996-smmu-v2", "qcom,smmu-v2".
 
+                  "qcom,smmu-500" is arm,mmu-500 implementation that supports
+                  efficient power management by supporting smmu's state
+                  retention.
+
 - reg           : Base address and size of the SMMU.
 
 - #global-interrupts : The number of global interrupts exposed by the
@@ -179,3 +184,12 @@ conditions.
 			 < SMMU_MDP_AHB_CLK>;
 		clock-names = "bus", "iface";
 	};
+
+	smmu5: iommu {
+		compatible = "qcom,sdm845-smmu-500", "qcom,smmu-500";
+		reg = <0x1500 0x8>;
+		#iommu-cells = <2>;
+		#global-interrupts = <1>;
+
+		...
+	};
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 7a96c924ae22..7f52456c6b25 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -2008,6 +2008,12 @@ static const char * const qcom_smmuv2_clks[] = {
 	"bus", "iface",
 };
 
+static const struct arm_smmu_match_data qcom_smmu500 = {
+	.version = ARM_SMMU_V2,
+	.model = ARM_MMU500,
+	.rpm_supported = true,
+};
+
 static const struct arm_smmu_match_data qcom_smmuv2 = {
 	.version = ARM_SMMU_V2,
 	.model = QCOM_SMMUV2,
@@ -2024,6 +2030,7 @@ static const struct of_device_id arm_smmu_of_match[] = {
 	{ .compatible = "arm,mmu-500", .data = &arm_mmu500 },
 	{ .compatible = "cavium,smmu-v2", .data = &cavium_smmuv2 },
 	{ .compatible = "qcom,smmu-v2", .data = &qcom_smmuv2 },
+	{ .compatible = "qcom,smmu-500", .data = &qcom_smmu500 },
 	{ },
 };
 MODULE_DEVICE_TABLE(of, arm_smmu_of_match);
@@ -2394,6 +2401,7 @@ IOMMU_OF_DECLARE(arm_mmu401, "arm,mmu-401");
 IOMMU_OF_DECLARE(arm_mmu500, "arm,mmu-500");
 IOMMU_OF_DECLARE(cavium_smmuv2, "cavium,smmu-v2");
 IOMMU_OF_DECLARE(qcom_smmuv2, "qcom,smmu-v2");
+IOMMU_OF_DECLARE(qcom_smmu500, "qcom,smmu-500");
 
 MODULE_DESCRIPTION("IOMMU API for ARM architected SMMU implementations");
 MODULE_AUTHOR("Will Deacon ");
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
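The match-table entry added by the patch ties a compatible string to per-variant data. How such a table is consulted can be sketched in plain C as below. This is a deliberately simplified model: real OF matching walks a node's whole list of compatible strings, and every `*_sketch` name and field value here is an assumption made for the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-variant data, reduced to two fields. */
struct match_data_sketch {
	int version;
	int rpm_supported;
};

struct of_id_sketch {
	const char *compatible;
	const struct match_data_sketch *data;
};

static const struct match_data_sketch mmu500_sketch = { 2, 0 };
static const struct match_data_sketch qcom_smmu500_sketch = { 2, 1 };

static const struct of_id_sketch match_table_sketch[] = {
	{ "arm,mmu-500", &mmu500_sketch },
	{ "qcom,smmu-500", &qcom_smmu500_sketch },
	{ NULL, NULL },	/* sentinel, like the empty { } entry */
};

/* Returns the data for an exact compatible match, or NULL if none. */
const struct match_data_sketch *lookup_sketch(const char *compatible)
{
	const struct of_id_sketch *id;

	for (id = match_table_sketch; id->compatible; id++)
		if (strcmp(id->compatible, compatible) == 0)
			return id->data;
	return NULL;
}
```

A probe routine would then branch on the returned data, for example enabling runtime PM only when `rpm_supported` is set.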
Re: [RFC] rcu: Prevent expedite reporting within RCU read-side section
On 3/7/2018 2:55 PM, Byungchul Park wrote: On 3/6/2018 10:42 PM, Boqun Feng wrote: On Tue, Mar 06, 2018 at 02:31:58PM +0900, Byungchul Park wrote: Hello Paul and RCU folks, I am afraid I correctly understand and fix it. But I really wonder why sync_rcu_exp_handler() reports the quiescent state even in the case that current task is within a RCU read-side section. Do I miss something? If I correctly understand it and you agree with it, I can add more logic which make it more expedited by boosting current or making it urgent when we fail to report the quiescent state on the IPI. ->8- From 0b0191f506c19ce331a1fdb7c2c5a00fb23fbcf2 Mon Sep 17 00:00:00 2001 From: Byungchul Park Date: Tue, 6 Mar 2018 13:54:41 +0900 Subject: [RFC] rcu: Prevent expedite reporting within RCU read-side section We report the quiescent state for this cpu if it's out of RCU read-side section at the moment IPI was just fired during the expedite process. However, current code reports the quiescent state even in the case: 1) the current task is still within a RCU read-side section 2) the current task has been blocked within the RCU read-side section If this happens, the task will queue itself in rcu_preempt_note_context_switch() using rcu_preempt_ctxt_queue(). The gp kthread will wait for this task to dequeue itself. IOW, we have other mechanism to wait for this task other than bottom-up qs reporting tree. So I think we are fine here. Right. Basically we consider both the quiscent state within the current task and queued tasks on rcu nodes that you mentioned, to control grace periods when PREEMPT kernel is used. Actually my concern was if it's safe to clear the bit of 'expmask' on the IPI for all possible cases, even though anyway blocked tasks would try to prevent the grace period from ending. I worried if something subtle might cause problems, but the code looks fine on second thought as you said. Thank you for your explanation. 
In addition, by making quiescent states reported and bits of expmask cleared only when it's out of rcu read sections, of course keeping other mechanism unchanged like what you mentioned, I think we can avoid unnecessary locking ops and other statements, keeping the code still sane, even though the benefit might be small. For example, by removing some evitable calls to rcu_report_cpu_mult() either directly or indirectly. I'm not sure if RCU maintainers think it's worthy tho. Regards, Boqun Since we don't get to the quiescent state yet in the case, we shouldn't report it but check it another time. Signed-off-by: Byungchul Park --- kernel/rcu/tree_exp.h | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h index 73e1d3d..cc69d14 100644 --- a/kernel/rcu/tree_exp.h +++ b/kernel/rcu/tree_exp.h @@ -731,13 +731,13 @@ static void sync_rcu_exp_handler(void *info) /* * We are either exiting an RCU read-side critical section (negative * values of t->rcu_read_lock_nesting) or are not in one at all - * (zero value of t->rcu_read_lock_nesting). Or we are in an RCU - * read-side critical section that blocked before this expedited - * grace period started. Either way, we can immediately report - * the quiescent state. + * (zero value of t->rcu_read_lock_nesting). We can immediately + * report the quiescent state. */ - rdp = this_cpu_ptr(rsp->rda); - rcu_report_exp_rdp(rsp, rdp, true); + if (t->rcu_read_lock_nesting <= 0) { + rdp = this_cpu_ptr(rsp->rda); + rcu_report_exp_rdp(rsp, rdp, true); + } } /** -- 1.9.1 -- Thanks, Byungchul
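For readers skimming the diff, the condition the RFC proposes can be isolated in a few lines. This shows only the check from the RFC hunk, not merged kernel behaviour (the thread itself concludes that the existing code is fine); `task_sketch` and `rfc_would_report_qs()` are hypothetical names.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the one task field the RFC tests. */
struct task_sketch {
	int rcu_read_lock_nesting;
};

/*
 * Condition from the RFC hunk: negative means the task is exiting a
 * read-side critical section, zero means it is not in one, positive
 * means it is still inside one.
 */
bool rfc_would_report_qs(const struct task_sketch *t)
{
	return t->rcu_read_lock_nesting <= 0;
}
```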
[PATCH v1 1/9] PCI/PM: Move pcie_clear_root_pme_status() to core
From: Bjorn Helgaas

Move pcie_clear_root_pme_status() from the port driver to the PCI core so
it will be available even when the port driver isn't present. No
functional change intended.

Signed-off-by: Bjorn Helgaas
---
 drivers/pci/pci.c              |    9 +
 drivers/pci/pci.h              |    1 +
 drivers/pci/pcie/portdrv.h     |    2 --
 drivers/pci/pcie/portdrv_pci.c |    9 -
 4 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index f6a4dd10d9b0..120e3393fc35 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1683,6 +1683,15 @@ int pci_set_pcie_reset_state(struct pci_dev *dev, enum pcie_reset_state state)
 }
 EXPORT_SYMBOL_GPL(pci_set_pcie_reset_state);
 
+/**
+ * pcie_clear_root_pme_status - Clear root port PME interrupt status.
+ * @dev: PCIe root port or event collector.
+ */
+void pcie_clear_root_pme_status(struct pci_dev *dev)
+{
+	pcie_capability_set_dword(dev, PCI_EXP_RTSTA, PCI_EXP_RTSTA_PME);
+}
+
 /**
  * pci_check_pme_status - Check if given device has generated PME.
  * @dev: Device to check.
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index fcd81911b127..813ca2c895d8 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -71,6 +71,7 @@ void pci_update_current_state(struct pci_dev *dev, pci_power_t state);
 void pci_power_up(struct pci_dev *dev);
 void pci_disable_enabled_device(struct pci_dev *dev);
 int pci_finish_runtime_suspend(struct pci_dev *dev);
+void pcie_clear_root_pme_status(struct pci_dev *dev);
 int __pci_pme_wakeup(struct pci_dev *dev, void *ign);
 void pci_pme_restore(struct pci_dev *dev);
 bool pci_dev_keep_suspended(struct pci_dev *dev);
diff --git a/drivers/pci/pcie/portdrv.h b/drivers/pci/pcie/portdrv.h
index a854bc569117..a4fc44d52206 100644
--- a/drivers/pci/pcie/portdrv.h
+++ b/drivers/pci/pcie/portdrv.h
@@ -34,8 +34,6 @@ void pcie_port_bus_unregister(void);
 
 struct pci_dev;
 
-void pcie_clear_root_pme_status(struct pci_dev *dev);
-
 #ifdef CONFIG_HOTPLUG_PCI_PCIE
 extern bool pciehp_msi_disabled;
 
diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index fb1c1bb87316..4413dd85e923 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -50,15 +50,6 @@ __setup("pcie_ports=", pcie_port_setup);
 
 /* global data */
 
-/**
- * pcie_clear_root_pme_status - Clear root port PME interrupt status.
- * @dev: PCIe root port or event collector.
- */
-void pcie_clear_root_pme_status(struct pci_dev *dev)
-{
-	pcie_capability_set_dword(dev, PCI_EXP_RTSTA, PCI_EXP_RTSTA_PME);
-}
-
 static int pcie_portdrv_restore_config(struct pci_dev *dev)
 {
 	int retval;
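It may look odd that a helper named "clear" calls pcie_capability_set_dword(). The PME Status bit in the Root Status register is write-1-to-clear, so "setting" it in the value written back is what clears it. The userspace model below illustrates that under simplifying assumptions: the toy_* names are hypothetical, and every bit outside the RW1C mask is treated as read-only.

```c
#include <stdint.h>

/* PME Status bit, same value as the kernel's PCI_EXP_RTSTA_PME. */
#define TOY_RTSTA_PME	0x00010000u

/*
 * Model one write to a register whose rw1c_mask bits are
 * write-1-to-clear. All other bits are treated as read-only here,
 * which is a simplification for this sketch.
 */
static uint32_t toy_rw1c_write(uint32_t reg, uint32_t val, uint32_t rw1c_mask)
{
	return reg & ~(val & rw1c_mask);
}

/*
 * Read-modify-write "set bits": read the register, OR in the
 * requested bits, write the result back.
 */
static uint32_t toy_set_dword(uint32_t reg, uint32_t set, uint32_t rw1c_mask)
{
	uint32_t val = reg | set;	/* value software writes back */

	return toy_rw1c_write(reg, val, rw1c_mask);
}
```

Starting from a register value with PME Status set, toy_set_dword(reg, TOY_RTSTA_PME, TOY_RTSTA_PME) returns the same value with that bit cleared; if the bit was already clear, the write changes nothing.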
[PATCH v1 5/9] PCI/portdrv: Remove pcie_port_bus_type link order dependency
From: Bjorn Helgaas

The pcie_port_bus_type must be registered before drivers that depend on it
can be registered. Those drivers include:

  pcied_init()              # PCIe native hotplug driver
  aer_service_init()        # AER driver
  dpc_service_init()        # DPC driver
  pcie_pme_service_init()   # PME driver

Previously we registered pcie_port_bus_type from pcie_portdrv_init(), a
device_initcall. The callers of pcie_port_service_register() (above) are
also device_initcalls. This is fragile because the device_initcall
ordering depends on link order, which is not explicit.

Register pcie_port_bus_type from pci_driver_init() along with pci_bus_type.
This removes the link order dependency between portdrv and the pciehp, AER,
DPC, and PCIe PME drivers.

Signed-off-by: Bjorn Helgaas
---
 drivers/pci/pci-driver.c       |   45 +++-
 drivers/pci/pcie/Makefile      |    2 +
 drivers/pci/pcie/portdrv_bus.c |   56
 drivers/pci/pcie/portdrv_pci.c |   13 +
 4 files changed, 46 insertions(+), 70 deletions(-)
 delete mode 100644 drivers/pci/pcie/portdrv_bus.c

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 38ee7c8b4d1a..4db85a0faf34 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -7,6 +7,7 @@
  */
 
 #include
+#include
 #include
 #include
 #include
@@ -19,6 +20,7 @@
 #include
 #include
 #include "pci.h"
+#include "pcie/portdrv.h"
 
 struct pci_dynid {
 	struct list_head node;
@@ -1553,8 +1555,49 @@ struct bus_type pci_bus_type = {
 };
 EXPORT_SYMBOL(pci_bus_type);
 
+#ifdef CONFIG_PCIEPORTBUS
+static int pcie_port_bus_match(struct device *dev, struct device_driver *drv)
+{
+	struct pcie_device *pciedev;
+	struct pcie_port_service_driver *driver;
+
+	if (drv->bus != &pcie_port_bus_type || dev->bus != &pcie_port_bus_type)
+		return 0;
+
+	pciedev = to_pcie_device(dev);
+	driver = to_service_driver(drv);
+
+	if (driver->service != pciedev->service)
+		return 0;
+
+	if ((driver->port_type != PCIE_ANY_PORT) &&
+	    (driver->port_type != pci_pcie_type(pciedev->port)))
+		return 0;
+
+	return 1;
+}
+
+struct bus_type pcie_port_bus_type = {
+	.name		= "pci_express",
+	.match		= pcie_port_bus_match,
+};
+EXPORT_SYMBOL_GPL(pcie_port_bus_type);
+#endif
+
 static int __init pci_driver_init(void)
 {
-	return bus_register(&pci_bus_type);
+	int ret;
+
+	ret = bus_register(&pci_bus_type);
+	if (ret)
+		return ret;
+
+#ifdef CONFIG_PCIEPORTBUS
+	ret = bus_register(&pcie_port_bus_type);
+	if (ret)
+		return ret;
+#endif
+
+	return 0;
 }
 postcore_initcall(pci_driver_init);
diff --git a/drivers/pci/pcie/Makefile b/drivers/pci/pcie/Makefile
index 223e4c34c29a..e01c10c97b95 100644
--- a/drivers/pci/pcie/Makefile
+++ b/drivers/pci/pcie/Makefile
@@ -6,7 +6,7 @@
 # Build PCI Express ASPM if needed
 obj-$(CONFIG_PCIEASPM)		+= aspm.o
 
-pcieportdrv-y			:= portdrv_core.o portdrv_pci.o portdrv_bus.o
+pcieportdrv-y			:= portdrv_core.o portdrv_pci.o
 pcieportdrv-$(CONFIG_ACPI)	+= portdrv_acpi.o
 
 obj-$(CONFIG_PCIEPORTBUS)	+= pcieportdrv.o
diff --git a/drivers/pci/pcie/portdrv_bus.c b/drivers/pci/pcie/portdrv_bus.c
deleted file mode 100644
index f0fba552a0e2..
--- a/drivers/pci/pcie/portdrv_bus.c
+++ /dev/null
@@ -1,56 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/*
- * File:	portdrv_bus.c
- * Purpose:	PCI Express Port Bus Driver's Bus Overloading Functions
- *
- * Copyright (C) 2004 Intel
- * Copyright (C) Tom Long Nguyen (tom.l.ngu...@intel.com)
- */
-
-#include
-#include
-#include
-#include
-#include
-
-#include
-#include "portdrv.h"
-
-static int pcie_port_bus_match(struct device *dev, struct device_driver *drv);
-
-struct bus_type pcie_port_bus_type = {
-	.name		= "pci_express",
-	.match		= pcie_port_bus_match,
-};
-EXPORT_SYMBOL_GPL(pcie_port_bus_type);
-
-static int pcie_port_bus_match(struct device *dev, struct device_driver *drv)
-{
-	struct pcie_device *pciedev;
-	struct pcie_port_service_driver *driver;
-
-	if (drv->bus != &pcie_port_bus_type || dev->bus != &pcie_port_bus_type)
-		return 0;
-
-	pciedev = to_pcie_device(dev);
-	driver = to_service_driver(drv);
-
-	if (driver->service != pciedev->service)
-		return 0;
-
-	if ((driver->port_type != PCIE_ANY_PORT) &&
-	    (driver->port_type != pci_pcie_type(pciedev->port)))
-		return 0;
-
-	return 1;
-}
-
-int pcie_port_bus_register(void)
-{
-	return bus_register(&pcie_port_bus_type);
-}
-
-void pcie_port_bus_unregister(void)
-{
-	bus_unregister(&pcie_port_bus_type);
-}
diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index c08ebd237242..9475886eeb62 100644
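The link-order problem described in the changelog can be reproduced with a toy initcall runner. This is a sketch only: the toy_* names are hypothetical, the level numbers mirror the kernel's postcore (2) and device (6) initcall levels, and array order stands in for link order.

```c
#include <stdbool.h>
#include <stddef.h>

/* Subset of the initcall levels; lower levels run first. */
enum toy_level { TOY_POSTCORE = 2, TOY_DEVICE = 6 };

struct toy_initcall {
	enum toy_level level;
	int (*fn)(void);
};

static bool toy_bus_registered;
static int toy_service_failures;

/* Stands in for registering pcie_port_bus_type. */
static int toy_bus_init(void)
{
	toy_bus_registered = true;
	return 0;
}

/* Stands in for a service driver that needs the bus to exist. */
static int toy_service_init(void)
{
	if (!toy_bus_registered) {
		toy_service_failures++;
		return -1;
	}
	return 0;
}

/* Run level by level; within one level, table (link) order decides. */
static void toy_do_initcalls(const struct toy_initcall *calls, size_t n)
{
	for (int level = 0; level <= 7; level++)
		for (size_t i = 0; i < n; i++)
			if ((int)calls[i].level == level)
				calls[i].fn();
}

/* Link the service driver first, the bus second, and count failures. */
static int toy_run(enum toy_level bus_level)
{
	const struct toy_initcall calls[] = {
		{ TOY_DEVICE, toy_service_init },
		{ bus_level, toy_bus_init },
	};

	toy_bus_registered = false;
	toy_service_failures = 0;
	toy_do_initcalls(calls, 2);
	return toy_service_failures;
}
```

With the bus registered at the same level as the service driver, toy_run(TOY_DEVICE) reports one failure because the unlucky link order runs the service first. With the bus moved to TOY_POSTCORE, toy_run(TOY_POSTCORE) reports none, whatever the table order.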
[PATCH v1 9/9] PCI/portdrv: Remove "pcie_hp=nomsi" kernel parameter
From: Bjorn Helgaas

7570a333d8b0 ("PCI: Add pcie_hp=nomsi to disable MSI/MSI-X for pciehp
driver") added the "pcie_hp=nomsi" kernel parameter to work around this
error on shutdown:

  irq 16: nobody cared (try booting with the "irqpoll" option)
  Pid: 1081, comm: reboot Not tainted 3.2.0 #1
  ...
  Disabling IRQ #16

This happened on an unspecified system (possibly involving the Integrated
Device Technology, Inc. Device 807f bridge) where "an un-wanted interrupt
is generated when PCI driver switches from MSI/MSI-X to INTx while shutting
down the device."

The implication was that the device was buggy, but it is normal for a
device to use INTx after MSI/MSI-X have been disabled. The only problem
was that the driver was still attached and it wasn't prepared for INTx
interrupts. Prarit Bhargava fixed this issue with fda78d7a0ead ("PCI/MSI:
Stop disabling MSI/MSI-X in pci_device_shutdown()").

There is no automated way to set this parameter, so it's not very useful
for distributions or end users. It's really only useful for debugging, and
we have "pci=nomsi" for that purpose.

Revert 7570a333d8b0 to remove the "pcie_hp=nomsi" parameter.

Signed-off-by: Bjorn Helgaas
CC: MUNEDA Takahiro
CC: Kenji Kaneshige
CC: Prarit Bhargava
---
 Documentation/admin-guide/kernel-parameters.txt |    4
 drivers/pci/pcie/portdrv.h                      |   12
 drivers/pci/pcie/portdrv_core.c                 |   20 +++-
 3 files changed, 3 insertions(+), 33 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 1d1d53f85ddd..761749562165 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3130,10 +3130,6 @@
 		force	Enable ASPM even on devices that claim not to support it.
 			WARNING: Forcing ASPM on may cause system lockups.
 
-	pcie_hp=	[PCIE] PCI Express Hotplug driver options:
-		nomsi	Do not use MSI for PCI Express Native Hotplug (this
-			makes all PCIe ports use INTx for hotplug services).
-
 	pcie_ports=	[PCIE] PCIe ports handling:
 		auto	Ask the BIOS whether or not to use native PCIe services
 			associated with PCIe ports (PME, hot-plug, AER). Use
diff --git a/drivers/pci/pcie/portdrv.h b/drivers/pci/pcie/portdrv.h
index 2c19cf9ffea2..87a87cb9f42d 100644
--- a/drivers/pci/pcie/portdrv.h
+++ b/drivers/pci/pcie/portdrv.h
@@ -34,18 +34,6 @@ void pcie_port_bus_unregister(void);
 
 struct pci_dev;
 
-#ifdef CONFIG_HOTPLUG_PCI_PCIE
-extern bool pciehp_msi_disabled;
-
-static inline bool pciehp_no_msi(void)
-{
-	return pciehp_msi_disabled;
-}
-
-#else  /* !CONFIG_HOTPLUG_PCI_PCIE */
-static inline bool pciehp_no_msi(void) { return false; }
-#endif /* !CONFIG_HOTPLUG_PCI_PCIE */
-
 #ifdef CONFIG_PCIE_PME
 extern bool pcie_pme_msi_disabled;
 
diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
index 29210e9bfbd3..bf9c5c885957 100644
--- a/drivers/pci/pcie/portdrv_core.c
+++ b/drivers/pci/pcie/portdrv_core.c
@@ -21,17 +21,6 @@
 #include "../pci.h"
 #include "portdrv.h"
 
-bool pciehp_msi_disabled;
-
-static int __init pciehp_setup(char *str)
-{
-	if (!strncmp(str, "nomsi", 5))
-		pciehp_msi_disabled = true;
-
-	return 1;
-}
-__setup("pcie_hp=", pciehp_setup);
-
 /**
  * release_pcie_device - free PCI Express port service device structure
  * @dev: Port service device to release
@@ -169,16 +158,13 @@ static int pcie_init_service_irqs(struct pci_dev *dev, int *irqs, int mask)
 		irqs[i] = -1;
 
 	/*
-	 * If we support PME or hotplug, but we can't use MSI/MSI-X for
-	 * them, we have to fall back to INTx or other interrupts, e.g., a
-	 * system shared interrupt.
+	 * If we support PME but can't use MSI/MSI-X for it, we have to
+	 * fall back to INTx or other interrupts, e.g., a system shared
+	 * interrupt.
 	 */
 	if ((mask & PCIE_PORT_SERVICE_PME) && pcie_pme_no_msi())
 		goto legacy_irq;
 
-	if ((mask & PCIE_PORT_SERVICE_HP) && pciehp_no_msi())
-		goto legacy_irq;
-
 	/* Try to use MSI-X or MSI if supported */
 	if (pcie_port_enable_irq_vec(dev, irqs, mask) == 0)
 		return 0;
[PATCH v1 2/9] PCI/PM: Clear PCIe PME Status bit in core, not PCIe port driver
From: Bjorn Helgaas

fe31e69740ed ("PCI/PCIe: Clear Root PME Status bits early during system
resume") added a .resume_noirq() callback to the PCIe port driver to clear
the PME Status bit during resume to work around a BIOS issue.

The BIOS evidently enabled PME interrupts for ACPI-based runtime wakeups
but did not clear the PME Status bit during resume, which meant PMEs after
resume did not trigger interrupts because PME Status did not transition
from cleared to set.

The fix was in the PCIe port driver, so it worked when CONFIG_PCIEPORTBUS
was set. But I think we *always* want the fix because the platform may use
PME interrupts even if Linux is built without the PCIe port driver.

Move the fix from the port driver to the PCI core so we can work around
this "PME doesn't work after waking from a sleep state" issue regardless of
CONFIG_PCIEPORTBUS.

Signed-off-by: Bjorn Helgaas
---
 drivers/pci/pci-driver.c       |   14 ++
 drivers/pci/pcie/portdrv_pci.c |   15 ---
 2 files changed, 14 insertions(+), 15 deletions(-)

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 3bed6beda051..bf0704b75f79 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -525,6 +525,18 @@ static void pci_pm_default_resume_early(struct pci_dev *pci_dev)
 	pci_fixup_device(pci_fixup_resume_early, pci_dev);
 }
 
+static void pcie_resume_early(struct pci_dev *pci_dev)
+{
+	/*
+	 * Some BIOSes forget to clear Root PME Status bits after system wakeup
+	 * which breaks ACPI-based runtime wakeup on PCI Express, so clear those
+	 * bits now just in case (shouldn't hurt).
+	 */
+	if (pci_is_pcie(pci_dev) &&
+	    pci_pcie_type(pci_dev) == PCI_EXP_TYPE_ROOT_PORT)
+		pcie_clear_root_pme_status(pci_dev);
+}
+
 /*
  * Default "suspend" method for devices that have no driver provided suspend,
  * or not even a driver at all (second part).
@@ -873,6 +885,8 @@ static int pci_pm_resume_noirq(struct device *dev)
 	if (pci_has_legacy_pm_support(pci_dev))
 		return pci_legacy_resume_early(dev);
 
+	pcie_resume_early(pci_dev);
+
 	if (drv && drv->pm && drv->pm->resume_noirq)
 		error = drv->pm->resume_noirq(dev);
 
diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index 4413dd85e923..f91afd09e356 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -62,20 +62,6 @@ static int pcie_portdrv_restore_config(struct pci_dev *dev)
 }
 
 #ifdef CONFIG_PM
-static int pcie_port_resume_noirq(struct device *dev)
-{
-	struct pci_dev *pdev = to_pci_dev(dev);
-
-	/*
-	 * Some BIOSes forget to clear Root PME Status bits after system wakeup
-	 * which breaks ACPI-based runtime wakeup on PCI Express, so clear those
-	 * bits now just in case (shouldn't hurt).
-	 */
-	if (pci_pcie_type(pdev) == PCI_EXP_TYPE_ROOT_PORT)
-		pcie_clear_root_pme_status(pdev);
-	return 0;
-}
-
 static int pcie_port_runtime_suspend(struct device *dev)
 {
 	return to_pci_dev(dev)->bridge_d3 ? 0 : -EBUSY;
@@ -103,7 +89,6 @@ static const struct dev_pm_ops pcie_portdrv_pm_ops = {
 	.thaw		= pcie_port_device_resume,
 	.poweroff	= pcie_port_device_suspend,
 	.restore	= pcie_port_device_resume,
-	.resume_noirq	= pcie_port_resume_noirq,
 	.runtime_suspend = pcie_port_runtime_suspend,
 	.runtime_resume	= pcie_port_runtime_resume,
 	.runtime_idle	= pcie_port_runtime_idle,
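The failure mode in the changelog (no interrupt because PME Status never goes from cleared to set) can be pictured with a tiny model. It is a sketch under stated assumptions: struct toy_root_port and the toy_* helpers are hypothetical, and interrupt delivery is reduced to a counter.

```c
#include <stdbool.h>

/* Hypothetical model of a root port's PME state. */
struct toy_root_port {
	bool pme_status;	/* models the Root Status PME Status bit */
	int interrupts;		/* PME interrupts seen by the OS */
};

/* A PME arrives: an interrupt fires only on a cleared -> set transition. */
static void toy_pme_arrives(struct toy_root_port *rp)
{
	if (!rp->pme_status) {
		rp->pme_status = true;
		rp->interrupts++;
	}
}

/* What the resume-time workaround does: drop a possibly stale status. */
static void toy_clear_pme_status(struct toy_root_port *rp)
{
	rp->pme_status = false;
}
```

If the model starts with pme_status already set, as the BIOS reportedly left it, toy_pme_arrives() raises nothing. After toy_clear_pme_status(), the next PME is counted, which is the effect the patch wants on every resume whether or not the port driver is built.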
[PATCH v1 7/9] PCI/portdrv: Simplify PCIe feature permission checking
From: Bjorn Helgaas

Some PCIe features (AER, DPC, hotplug, PME) can be managed by either the
platform firmware or the OS, so the host bridge driver may have to request
permission from the platform before using them.  On ACPI systems, this is
done by negotiate_os_control() in acpi_pci_root_add().

The PCIe port driver later uses pcie_port_platform_notify() and
pcie_port_acpi_setup() to figure out whether it can use these features.
But all we need is a single bit for each service, so these interfaces are
needlessly complicated.

Simplify this by adding bits in the struct pci_host_bridge to show when the
OS has permission to use each feature:

  + unsigned int use_aer:1;      /* OS may use PCIe AER */
  + unsigned int use_hotplug:1;  /* OS may use PCIe hotplug */
  + unsigned int use_pme:1;      /* OS may use PCIe PME */

These are set when we create a host bridge, and the host bridge driver can
clear the bits corresponding to any feature the platform doesn't want us to
use.

Signed-off-by: Bjorn Helgaas
---
 drivers/acpi/pci_root.c         |   13 ++--
 drivers/pci/pcie/Makefile       |    1 -
 drivers/pci/pcie/portdrv.h      |   11 --
 drivers/pci/pcie/portdrv_core.c |   42 ---
 drivers/pci/probe.c             |   10 +
 include/linux/pci.h             |    3 +++
 6 files changed, 50 insertions(+), 30 deletions(-)

diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
index 6fc204a52493..dce53527cdc1 100644
--- a/drivers/acpi/pci_root.c
+++ b/drivers/acpi/pci_root.c
@@ -871,6 +871,7 @@ struct pci_bus *acpi_pci_root_create(struct acpi_pci_root *root,
 	struct acpi_device *device = root->device;
 	int node = acpi_get_node(device->handle);
 	struct pci_bus *bus;
+	struct pci_host_bridge *host_bridge;

 	info->root = root;
 	info->bridge = device;
@@ -895,9 +896,17 @@ struct pci_bus *acpi_pci_root_create(struct acpi_pci_root *root,
 	if (!bus)
 		goto out_release_info;

+	host_bridge = to_pci_host_bridge(bus->bridge);
+	if (!(root->osc_control_set & OSC_PCI_EXPRESS_NATIVE_HP_CONTROL))
+		host_bridge->use_hotplug = 0;
+	if (!(root->osc_control_set & OSC_PCI_EXPRESS_AER_CONTROL))
+		host_bridge->use_aer = 0;
+	if (!(root->osc_control_set & OSC_PCI_EXPRESS_PME_CONTROL))
+		host_bridge->use_pme = 0;
+
 	pci_scan_child_bus(bus);
-	pci_set_host_bridge_release(to_pci_host_bridge(bus->bridge),
-				    acpi_pci_root_release_info, info);
+	pci_set_host_bridge_release(host_bridge, acpi_pci_root_release_info,
+				    info);
 	if (node != NUMA_NO_NODE)
 		dev_printk(KERN_DEBUG, &bus->dev, "on NUMA node %d\n", node);
 	return bus;
diff --git a/drivers/pci/pcie/Makefile b/drivers/pci/pcie/Makefile
index e01c10c97b95..11fb633b866c 100644
--- a/drivers/pci/pcie/Makefile
+++ b/drivers/pci/pcie/Makefile
@@ -7,7 +7,6 @@ obj-$(CONFIG_PCIEASPM)		+= aspm.o

 pcieportdrv-y			:= portdrv_core.o portdrv_pci.o
-pcieportdrv-$(CONFIG_ACPI)	+= portdrv_acpi.o

 obj-$(CONFIG_PCIEPORTBUS)	+= pcieportdrv.o

diff --git a/drivers/pci/pcie/portdrv.h b/drivers/pci/pcie/portdrv.h
index 749d200936d9..2c19cf9ffea2 100644
--- a/drivers/pci/pcie/portdrv.h
+++ b/drivers/pci/pcie/portdrv.h
@@ -66,15 +66,4 @@ static inline bool pcie_pme_no_msi(void) { return false; }
 static inline void pcie_pme_interrupt_enable(struct pci_dev *dev, bool en) {}
 #endif /* !CONFIG_PCIE_PME */

-#ifdef CONFIG_ACPI
-void pcie_port_acpi_setup(struct pci_dev *port, int *mask);
-
-static inline void pcie_port_platform_notify(struct pci_dev *port, int *mask)
-{
-	pcie_port_acpi_setup(port, mask);
-}
-#else /* !CONFIG_ACPI */
-static inline void pcie_port_platform_notify(struct pci_dev *port, int *mask){}
-#endif /* !CONFIG_ACPI */
-
 #endif /* _PORTDRV_H_ */
diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
index 94ce4dc50d1a..29210e9bfbd3 100644
--- a/drivers/pci/pcie/portdrv_core.c
+++ b/drivers/pci/pcie/portdrv_core.c
@@ -207,19 +207,20 @@ static int pcie_init_service_irqs(struct pci_dev *dev, int *irqs, int mask)
  */
 static int get_port_device_capability(struct pci_dev *dev)
 {
+	struct pci_host_bridge *host = pci_find_host_bridge(dev->bus);
+	bool native;
 	int services = 0;
-	int cap_mask = 0;

-	cap_mask = PCIE_PORT_SERVICE_PME | PCIE_PORT_SERVICE_HP;
-	if (pci_aer_available())
-		cap_mask |= PCIE_PORT_SERVICE_AER | PCIE_PORT_SERVICE_DPC;
-
-	if (pcie_ports_auto)
-		pcie_port_platform_notify(dev, &cap_mask);
+	/*
+	 * If the user specified "pcie_ports=native", use the PCIe services
+	 * regardless of whether the platform has given us permission.  On
+	 * ACPI systems, this means we ignore _OSC.
+	 */
+	native = !pcie_ports_auto;
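The permission scheme described in this patch can be modelled outside the kernel. The following is only a sketch under stated assumptions, not the kernel code: struct host_bridge_model, bridge_create(), bridge_apply_osc() and feature_allowed() are hypothetical names, the _OSC bit values are assumed to match the OSC_PCI_EXPRESS_* definitions in include/linux/acpi.h, and hardware capability checks are left out entirely.

```c
#include <stdbool.h>
#include <stdint.h>

/* _OSC control bits (values assumed to mirror include/linux/acpi.h). */
#define OSC_NATIVE_HP_CONTROL	0x01u
#define OSC_PME_CONTROL		0x04u
#define OSC_AER_CONTROL		0x08u

/* Hypothetical stand-in for the three new bits in struct pci_host_bridge. */
struct host_bridge_model {
	unsigned int use_aer:1;		/* OS may use PCIe AER */
	unsigned int use_hotplug:1;	/* OS may use PCIe hotplug */
	unsigned int use_pme:1;		/* OS may use PCIe PME */
};

enum pcie_feature { FEATURE_AER, FEATURE_HOTPLUG, FEATURE_PME };

/* A new host bridge starts with every feature permitted. */
struct host_bridge_model bridge_create(void)
{
	struct host_bridge_model hb = {
		.use_aer = 1, .use_hotplug = 1, .use_pme = 1,
	};
	return hb;
}

/* The host bridge driver clears the bit for each feature _OSC did not grant. */
void bridge_apply_osc(struct host_bridge_model *hb, uint32_t granted)
{
	if (!(granted & OSC_NATIVE_HP_CONTROL))
		hb->use_hotplug = 0;
	if (!(granted & OSC_AER_CONTROL))
		hb->use_aer = 0;
	if (!(granted & OSC_PME_CONTROL))
		hb->use_pme = 0;
}

/* "native" models pcie_ports=native, which overrides the platform's answer. */
bool feature_allowed(const struct host_bridge_model *hb, bool native,
		     enum pcie_feature f)
{
	if (native)
		return true;
	switch (f) {
	case FEATURE_AER:	return hb->use_aer;
	case FEATURE_HOTPLUG:	return hb->use_hotplug;
	case FEATURE_PME:	return hb->use_pme;
	}
	return false;
}
```

With this model, a platform that grants only PME control leaves use_pme set and clears the other two bits, while passing native = true makes feature_allowed() return true for every feature regardless of the bits.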
[PATCH v1 4/9] PCI/portdrv: Disable port driver in compat mode
From: Bjorn Helgaas

The "pcie_ports=compat" kernel parameter sets pcie_ports_disabled, which is
intended to disable the PCIe port driver.  But even when it was disabled,
we registered pcie_portdriver so we could work around a BIOS PME issue (see
fe31e69740ed ("PCI/PCIe: Clear Root PME Status bits early during system
resume")).

Registering the driver meant that the pcie_portdrv_probe() path called
pci_enable_device(), pci_save_state(), pm_runtime_set_autosuspend_delay(),
pm_runtime_use_autosuspend(), etc., even when the driver was disabled.

We've since moved the BIOS PME workaround from the port driver to the core,
so stop registering the PCIe port driver in compat mode.

This means "pcie_ports=compat" will now be basically the same as turning
off CONFIG_PCIEPORTBUS completely.

Signed-off-by: Bjorn Helgaas
---
 drivers/pci/pcie/portdrv_core.c |    3 ---
 drivers/pci/pcie/portdrv_pci.c  |    2 +-
 2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
index ef3bad4ad010..9db77c683732 100644
--- a/drivers/pci/pcie/portdrv_core.c
+++ b/drivers/pci/pcie/portdrv_core.c
@@ -212,9 +212,6 @@ static int get_port_device_capability(struct pci_dev *dev)
 	int services = 0;
 	int cap_mask = 0;

-	if (pcie_ports_disabled)
-		return 0;
-
 	cap_mask = PCIE_PORT_SERVICE_PME | PCIE_PORT_SERVICE_HP
 			| PCIE_PORT_SERVICE_VC;
 	if (pci_aer_available())
diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index f91afd09e356..c08ebd237242 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -262,7 +262,7 @@ static int __init pcie_portdrv_init(void)
 	int retval;

 	if (pcie_ports_disabled)
-		return pci_register_driver(&pcie_portdriver);
+		return -EACCES;

 	dmi_check_system(pcie_portdrv_dmi_table);
[PATCH v1 6/9] PCI/portdrv: Remove unused PCIE_PORT_SERVICE_VC
From: Bjorn Helgaas

No driver registers for PCIE_PORT_SERVICE_VC, so remove it.

This removes the VC "service" files from /sys/bus/pci_express/devices,
e.g., :07:00.0:pcie108, :08:04.0:pcie208 (all the files that contained "8"
as the last digit of the "pcieXXX" part).  The port driver created these
files for PCIe port devices that have a VC Capability.

Since this reduces PCIE_PORT_DEVICE_MAXSERVICES and moves DPC down into the
spot where VC used to be, the DPC sysfs files will now be named "pcieXX8".
I don't think there's anything useful userspace can do with those files, so
I hope nobody cares about these filenames.

There is no VC driver that calls pcie_port_service_register(), so there
never was a /sys/bus/pci_express/drivers/vc directory.

Signed-off-by: Bjorn Helgaas
---
 drivers/pci/pcie/portdrv.h      |    2 +-
 drivers/pci/pcie/portdrv_acpi.c |    2 +-
 drivers/pci/pcie/portdrv_core.c |   14 --
 include/linux/pcieport_if.h     |    4 +---
 4 files changed, 7 insertions(+), 15 deletions(-)

diff --git a/drivers/pci/pcie/portdrv.h b/drivers/pci/pcie/portdrv.h
index a4fc44d52206..749d200936d9 100644
--- a/drivers/pci/pcie/portdrv.h
+++ b/drivers/pci/pcie/portdrv.h
@@ -12,7 +12,7 @@

 #include

-#define PCIE_PORT_DEVICE_MAXSERVICES   5
+#define PCIE_PORT_DEVICE_MAXSERVICES   4
 /*
  * The PCIe Capability Interrupt Message Number (PCIe r3.1, sec 7.8.2) must
  * be one of the first 32 MSI-X entries.  Per PCI r3.0, sec 6.8.3.1, MSI
diff --git a/drivers/pci/pcie/portdrv_acpi.c b/drivers/pci/pcie/portdrv_acpi.c
index 319c94976873..4a1b50867c98 100644
--- a/drivers/pci/pcie/portdrv_acpi.c
+++ b/drivers/pci/pcie/portdrv_acpi.c
@@ -48,7 +48,7 @@ void pcie_port_acpi_setup(struct pci_dev *port, int *srv_mask)

 	flags = root->osc_control_set;

-	*srv_mask = PCIE_PORT_SERVICE_VC | PCIE_PORT_SERVICE_DPC;
+	*srv_mask = PCIE_PORT_SERVICE_DPC;
 	if (flags & OSC_PCI_EXPRESS_NATIVE_HP_CONTROL)
 		*srv_mask |= PCIE_PORT_SERVICE_HP;
 	if (flags & OSC_PCI_EXPRESS_PME_CONTROL)
diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
index 9db77c683732..94ce4dc50d1a 100644
--- a/drivers/pci/pcie/portdrv_core.c
+++ b/drivers/pci/pcie/portdrv_core.c
@@ -189,10 +189,8 @@ static int pcie_init_service_irqs(struct pci_dev *dev, int *irqs, int mask)
 	if (ret < 0)
 		return -ENODEV;

-	for (i = 0; i < PCIE_PORT_DEVICE_MAXSERVICES; i++) {
-		if (i != PCIE_PORT_SERVICE_VC_SHIFT)
-			irqs[i] = pci_irq_vector(dev, 0);
-	}
+	for (i = 0; i < PCIE_PORT_DEVICE_MAXSERVICES; i++)
+		irqs[i] = pci_irq_vector(dev, 0);

 	return 0;
 }
@@ -212,8 +210,7 @@ static int get_port_device_capability(struct pci_dev *dev)
 	int services = 0;
 	int cap_mask = 0;

-	cap_mask = PCIE_PORT_SERVICE_PME | PCIE_PORT_SERVICE_HP
-			| PCIE_PORT_SERVICE_VC;
+	cap_mask = PCIE_PORT_SERVICE_PME | PCIE_PORT_SERVICE_HP;
 	if (pci_aer_available())
 		cap_mask |= PCIE_PORT_SERVICE_AER | PCIE_PORT_SERVICE_DPC;

@@ -240,9 +237,6 @@ static int get_port_device_capability(struct pci_dev *dev)
 		 */
 		pci_disable_pcie_error_reporting(dev);
 	}
-	/* VC support */
-	if (pci_find_ext_capability(dev, PCI_EXT_CAP_ID_VC))
-		services |= PCIE_PORT_SERVICE_VC;
 	/* Root ports are capable of generating PME too */
 	if ((cap_mask & PCIE_PORT_SERVICE_PME) &&
 	    pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT) {
@@ -332,7 +326,7 @@ int pcie_port_device_register(struct pci_dev *dev)
 	 */
 	status = pcie_init_service_irqs(dev, irqs, capabilities);
 	if (status) {
-		capabilities &= PCIE_PORT_SERVICE_VC | PCIE_PORT_SERVICE_HP;
+		capabilities &= PCIE_PORT_SERVICE_HP;
 		if (!capabilities)
 			goto error_disable;
 	}
diff --git a/include/linux/pcieport_if.h b/include/linux/pcieport_if.h
index b69769dbf659..28eb21731db6 100644
--- a/include/linux/pcieport_if.h
+++ b/include/linux/pcieport_if.h
@@ -20,9 +20,7 @@
 #define PCIE_PORT_SERVICE_AER		(1 << PCIE_PORT_SERVICE_AER_SHIFT)
 #define PCIE_PORT_SERVICE_HP_SHIFT	2	/* Native Hotplug */
 #define PCIE_PORT_SERVICE_HP		(1 << PCIE_PORT_SERVICE_HP_SHIFT)
-#define PCIE_PORT_SERVICE_VC_SHIFT	3	/* Virtual Channel */
-#define PCIE_PORT_SERVICE_VC		(1 << PCIE_PORT_SERVICE_VC_SHIFT)
-#define PCIE_PORT_SERVICE_DPC_SHIFT	4	/* Downstream Port Containment */
+#define PCIE_PORT_SERVICE_DPC_SHIFT	3	/* Downstream Port Containment */
 #define PCIE_PORT_SERVICE_DPC		(1 << PCIE_PORT_SERVICE_DPC_SHIFT)

 struct pcie_device {
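The filename change mentioned in this commit message can be illustrated with a small userspace sketch. It rests on an assumption that is not part of the patch: that the port driver builds the "pcieXXX" suffix as ((port type - 4) << 8) | service bit and prints it with "%03x". The names descriptor_id() and service_name() are hypothetical, and the device names passed in are examples only.

```c
#include <stddef.h>
#include <stdio.h>

/* PCIe port types, as in include/uapi/linux/pci_regs.h. */
#define PCI_EXP_TYPE_ROOT_PORT	0x4
#define PCI_EXP_TYPE_UPSTREAM	0x5
#define PCI_EXP_TYPE_DOWNSTREAM	0x6

/* Service bits before and after the patch. */
#define SERVICE_VC_OLD		(1 << 3)
#define SERVICE_DPC_OLD		(1 << 4)
#define SERVICE_DPC_NEW		(1 << 3)

/* Assumed formula: port type offset in the high digit, service bit below. */
unsigned int descriptor_id(unsigned int port_type, unsigned int service)
{
	return ((port_type - 4) << 8) | service;
}

/* Format "<pci name>:pcieXXX" into buf; returns what snprintf() returns. */
int service_name(char *buf, size_t len, const char *pci_name,
		 unsigned int port_type, unsigned int service)
{
	return snprintf(buf, len, "%s:pcie%03x", pci_name,
			descriptor_id(port_type, service));
}
```

Under that assumption, the old VC service on an upstream port yields a name ending in "pcie108" and on a downstream port "pcie208", matching the examples in the commit message. DPC on a root port moves from "pcie010" to "pcie008" once its shift drops from 4 to 3.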
[PATCH v1 8/9] PCI/portdrv: Remove unnecessary include of
From: Bjorn Helgaas

portdrv_pci.c doesn't use anything from .  Remove the include of it.  No
functional change intended.

Signed-off-by: Bjorn Helgaas
---
 drivers/pci/pcie/portdrv_pci.c |    1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index 9475886eeb62..d12b58db18a1 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -18,7 +18,6 @@
 #include
 #include
 #include
-#include

 #include "../pci.h"
 #include "portdrv.h"
[PATCH v1 3/9] PCI/PM: Clear PCIe PME Status bit for Root Complex Event Collectors
From: Bjorn Helgaas

Per PCIe r4.0, sec 6.1.6, Root Complex Event Collectors can generate PME
interrupts on behalf of Root Complex Integrated Endpoints.

Linux does not currently enable PME interrupts from RC Event Collectors,
but fe31e69740ed ("PCI/PCIe: Clear Root PME Status bits early during system
resume") suggests PME interrupts may be enabled by the platform for
ACPI-based runtime wakeup.

Clear the PCIe PME Status bit for Root Complex Event Collectors during
resume, just like we already do for Root Ports.

If the BIOS enables PME interrupts for an event collector and neglects to
clear the status bit on resume, this change should fix the same bug as
fe31e69740ed (PMEs not working after waking from a sleep state), but for
Root Complex Integrated Endpoints.

Signed-off-by: Bjorn Helgaas
---
 drivers/pci/pci-driver.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index bf0704b75f79..38ee7c8b4d1a 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -533,7 +533,8 @@ static void pcie_resume_early(struct pci_dev *pci_dev)
 	 * bits now just in case (shouldn't hurt).
 	 */
 	if (pci_is_pcie(pci_dev) &&
-	    pci_pcie_type(pci_dev) == PCI_EXP_TYPE_ROOT_PORT)
+	    (pci_pcie_type(pci_dev) == PCI_EXP_TYPE_ROOT_PORT ||
+	     pci_pcie_type(pci_dev) == PCI_EXP_TYPE_RC_EC))
 		pcie_clear_root_pme_status(pci_dev);
 }
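The widened condition in this patch amounts to a predicate on the PCIe device type. A minimal userspace sketch of it follows. The function name should_clear_root_pme_status() is hypothetical, and the PCI_EXP_TYPE_* values are assumed to match include/uapi/linux/pci_regs.h.

```c
#include <stdbool.h>

/* PCIe device/port types (values assumed from pci_regs.h). */
#define PCI_EXP_TYPE_ENDPOINT	0x0	/* Express Endpoint */
#define PCI_EXP_TYPE_ROOT_PORT	0x4	/* Root Port */
#define PCI_EXP_TYPE_RC_END	0x9	/* Root Complex Integrated Endpoint */
#define PCI_EXP_TYPE_RC_EC	0xa	/* Root Complex Event Collector */

/*
 * After the patch, the PME Status bit is cleared early in resume for
 * Root Ports and for Root Complex Event Collectors, and for nothing else.
 */
bool should_clear_root_pme_status(bool is_pcie, int pcie_type)
{
	return is_pcie &&
	       (pcie_type == PCI_EXP_TYPE_ROOT_PORT ||
		pcie_type == PCI_EXP_TYPE_RC_EC);
}
```

The Root Complex Integrated Endpoints themselves are not matched. Only the Event Collector that signals PME on their behalf is, which mirrors how Root Ports are handled for devices below them.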
[PATCH v1 0/9] PCI: Simplify PCIe port driver
This is an attempt to move a few things out of the port driver.

Patches 1-2 move a workaround for a BIOS PME issue from the port driver to
the PCI core, so it doesn't depend on CONFIG_PCIEPORTBUS.

Patch 3 extends that workaround so it works for Root Complex Event
Collectors.  I haven't seen reports of this being a problem, but I think we
should handle Event Collector PMEs the same as Root Port PMEs.

Patch 4 disables the port driver completely for "pcie_ports=compat".  We
used to register the driver, claim port devices, enable them, etc., as part
of supporting the above BIOS workaround.

Patch 5 removes a port driver link order dependency.

Patch 6 removes the unused VC service.

Patch 7 simplifies the _OSC code path by keeping more of the details in the
ACPI pci_root.c driver.

Patch 8 removes an unnecessary #include.

Patch 9 removes the "pcie_hp=nomsi" parameter.  This was added to work
around an issue when shutting down devices, but a later patch fixed the
root cause, and I don't think we need such a specific parameter any more
(we still have "pci=nomsi").

---

Bjorn Helgaas (9):
      PCI/PM: Move pcie_clear_root_pme_status() to core
      PCI/PM: Clear PCIe PME Status bit in core, not PCIe port driver
      PCI/PM: Clear PCIe PME Status bit for Root Complex Event Collectors
      PCI/portdrv: Disable port driver in compat mode
      PCI/portdrv: Remove pcie_port_bus_type link order dependency
      PCI/portdrv: Remove unused PCIE_PORT_SERVICE_VC
      PCI/portdrv: Simplify PCIe feature permission checking
      PCI/portdrv: Remove unnecessary include of
      PCI/portdrv: Remove "pcie_hp=nomsi" kernel parameter

 Documentation/admin-guide/kernel-parameters.txt |    4 -
 drivers/acpi/pci_root.c                         |   13 +++-
 drivers/pci/pci-driver.c                        |   60 ++
 drivers/pci/pci.c                               |    9 +++
 drivers/pci/pci.h                               |    1
 drivers/pci/pcie/Makefile                       |    3 -
 drivers/pci/pcie/portdrv.h                      |   27
 drivers/pci/pcie/portdrv_acpi.c                 |    2 -
 drivers/pci/pcie/portdrv_bus.c                  |   56 -
 drivers/pci/pcie/portdrv_core.c                 |   77 ++-
 drivers/pci/pcie/portdrv_pci.c                  |   40 +---
 drivers/pci/probe.c                             |   10 +++
 include/linux/pci.h                             |    3 +
 include/linux/pcieport_if.h                     |    4 -
 14 files changed, 131 insertions(+), 178 deletions(-)
 delete mode 100644 drivers/pci/pcie/portdrv_bus.c
Re: [PATCH v2] xhci: Fix front USB ports on ASUS PRIME B350M-A
Hi Matthias, Do you have any concern about this patch? Hopefully this can get merged for v4.16… Kai-Heng
Re: [PATCH 3/3] vfio/pci: Add ioeventfd support
On Wed, Feb 28, 2018 at 01:15:20PM -0700, Alex Williamson wrote:

[...]

> @@ -1174,6 +1206,8 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	vdev->irq_type = VFIO_PCI_NUM_IRQS;
>  	mutex_init(&vdev->igate);
>  	spin_lock_init(&vdev->irqlock);
> +	mutex_init(&vdev->ioeventfds_lock);

Should we destroy the mutex in vfio_pci_remove()?

I see that vfio_pci_device.igate is also never destroyed, so I'm not sure
about either of them.

Thanks,

> +	INIT_LIST_HEAD(&vdev->ioeventfds_list);
>
>  	ret = vfio_add_group_dev(&pdev->dev, &vfio_pci_ops, vdev);
>  	if (ret) {

--
Peter Xu
Re: [RFC] rcu: Prevent expedite reporting within RCU read-side section
On 3/6/2018 10:42 PM, Boqun Feng wrote:
> On Tue, Mar 06, 2018 at 02:31:58PM +0900, Byungchul Park wrote:
>> Hello Paul and RCU folks,
>>
>> I am afraid I correctly understand and fix it. But I really wonder why
>> sync_rcu_exp_handler() reports the quiescent state even in the case that
>> current task is within a RCU read-side section. Do I miss something?
>>
>> If I correctly understand it and you agree with it, I can add more logic
>> which make it more expedited by boosting current or making it urgent
>> when we fail to report the quiescent state on the IPI.
>>
>> ----->8-----
>> From 0b0191f506c19ce331a1fdb7c2c5a00fb23fbcf2 Mon Sep 17 00:00:00 2001
>> From: Byungchul Park
>> Date: Tue, 6 Mar 2018 13:54:41 +0900
>> Subject: [RFC] rcu: Prevent expedite reporting within RCU read-side section
>>
>> We report the quiescent state for this cpu if it's out of RCU read-side
>> section at the moment IPI was just fired during the expedite process.
>>
>> However, current code reports the quiescent state even in the case:
>>
>>    1) the current task is still within a RCU read-side section
>>    2) the current task has been blocked within the RCU read-side section
>
> If this happens, the task will queue itself in
> rcu_preempt_note_context_switch() using rcu_preempt_ctxt_queue(). The gp
> kthread will wait for this task to dequeue itself. IOW, we have other
> mechanism to wait for this task other than bottom-up qs reporting tree.
> So I think we are fine here.

Right. Basically we consider both the quiescent state within the current
task and the queued tasks on rcu nodes that you mentioned, to control grace
periods when a PREEMPT kernel is used.

Actually my concern was whether it's safe to clear the bit of 'expmask' on
the IPI for all possible cases, even though blocked tasks would anyway
prevent the grace period from ending.

I worried that something subtle might cause problems, but the code looks
fine on second thought, as you said.

Thank you for your explanation.

> Regards,
> Boqun
>
>> Since we don't get to the quiescent state yet in the case, we shouldn't
>> report it but check it another time.
>>
>> Signed-off-by: Byungchul Park
>> ---
>>  kernel/rcu/tree_exp.h | 12 ++--
>>  1 file changed, 6 insertions(+), 6 deletions(-)
>>
>> diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
>> index 73e1d3d..cc69d14 100644
>> --- a/kernel/rcu/tree_exp.h
>> +++ b/kernel/rcu/tree_exp.h
>> @@ -731,13 +731,13 @@ static void sync_rcu_exp_handler(void *info)
>>  	/*
>>  	 * We are either exiting an RCU read-side critical section (negative
>>  	 * values of t->rcu_read_lock_nesting) or are not in one at all
>> -	 * (zero value of t->rcu_read_lock_nesting).  Or we are in an RCU
>> -	 * read-side critical section that blocked before this expedited
>> -	 * grace period started.  Either way, we can immediately report
>> -	 * the quiescent state.
>> +	 * (zero value of t->rcu_read_lock_nesting).  We can immediately
>> +	 * report the quiescent state.
>>  	 */
>> -	rdp = this_cpu_ptr(rsp->rda);
>> -	rcu_report_exp_rdp(rsp, rdp, true);
>> +	if (t->rcu_read_lock_nesting <= 0) {
>> +		rdp = this_cpu_ptr(rsp->rda);
>> +		rcu_report_exp_rdp(rsp, rdp, true);
>> +	}
>>  }
>>
>>  /**
>> --
>> 1.9.1

--
Thanks,
Byungchul
Re: [RFC] rcu: Prevent expedite reporting within RCU read-side section
On 3/6/2018 10:42 PM, Boqun Feng wrote:
> On Tue, Mar 06, 2018 at 02:31:58PM +0900, Byungchul Park wrote:
>> Hello Paul and RCU folks,
>>
>> I am afraid I correctly understand and fix it. But I really wonder why
>> sync_rcu_exp_handler() reports the quiescent state even in the case that
>> current task is within a RCU read-side section. Do I miss something?
>>
>> If I correctly understand it and you agree with it, I can add more logic
>> which make it more expedited by boosting current or making it urgent
>> when we fail to report the quiescent state on the IPI.
>>
>> ->8-
>> From 0b0191f506c19ce331a1fdb7c2c5a00fb23fbcf2 Mon Sep 17 00:00:00 2001
>> From: Byungchul Park
>> Date: Tue, 6 Mar 2018 13:54:41 +0900
>> Subject: [RFC] rcu: Prevent expedite reporting within RCU read-side section
>>
>> We report the quiescent state for this cpu if it's out of RCU read-side
>> section at the moment IPI was just fired during the expedite process.
>>
>> However, current code reports the quiescent state even in the case:
>>
>> 1) the current task is still within a RCU read-side section
>> 2) the current task has been blocked within the RCU read-side section
>
> If this happens, the task will queue itself in
> rcu_preempt_note_context_switch() using rcu_preempt_ctxt_queue(). The gp
> kthread will wait for this task to dequeue itself. IOW, we have other
> mechanism to wait for this task other than bottom-up qs reporting tree.
> So I think we are fine here.

Right. Basically we consider both the quiescent state within the current
task and the queued tasks on rcu nodes that you mentioned, to control grace
periods when a PREEMPT kernel is used.

Actually my concern was whether it's safe to clear the bit of 'expmask' on
the IPI in all possible cases, even though blocked tasks would anyway
prevent the grace period from ending.

I worried that something subtle might cause problems, but the code looks
fine on second thought, as you said.

Thank you for your explanation.

> Regards,
> Boqun
>
>> Since we don't get to the quiescent state yet in the case, we shouldn't
>> report it but check it another time.
>>
>> Signed-off-by: Byungchul Park
>> ---
>>  kernel/rcu/tree_exp.h | 12 ++--
>>  1 file changed, 6 insertions(+), 6 deletions(-)
>>
>> diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
>> index 73e1d3d..cc69d14 100644
>> --- a/kernel/rcu/tree_exp.h
>> +++ b/kernel/rcu/tree_exp.h
>> @@ -731,13 +731,13 @@ static void sync_rcu_exp_handler(void *info)
>>  	/*
>>  	 * We are either exiting an RCU read-side critical section (negative
>>  	 * values of t->rcu_read_lock_nesting) or are not in one at all
>> -	 * (zero value of t->rcu_read_lock_nesting). Or we are in an RCU
>> -	 * read-side critical section that blocked before this expedited
>> -	 * grace period started. Either way, we can immediately report
>> -	 * the quiescent state.
>> +	 * (zero value of t->rcu_read_lock_nesting). We can immediately
>> +	 * report the quiescent state.
>>  	 */
>> -	rdp = this_cpu_ptr(rsp->rda);
>> -	rcu_report_exp_rdp(rsp, rdp, true);
>> +	if (t->rcu_read_lock_nesting <= 0) {
>> +		rdp = this_cpu_ptr(rsp->rda);
>> +		rcu_report_exp_rdp(rsp, rdp, true);
>> +	}
>>  }
>>
>>  /**
>> --
>> 1.9.1

--
Thanks,
Byungchul
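For readers following the thread, the point about two independent waiting mechanisms can be illustrated with a toy model. This is only a sketch under stated assumptions: the struct layout and the toy_* helpers are hypothetical and are not the kernel's rcu_node implementation. It only shows why clearing a CPU's bit from the IPI does not by itself end the grace period while a reader is still queued.

```c
#include <stdbool.h>

/* Toy model of one node; field names echo the thread, the layout is hypothetical. */
struct toy_node {
	unsigned long expmask;	/* CPUs that still owe a quiescent state */
	int blocked_readers;	/* readers queued after blocking in a read-side section */
};

/* IPI handler analog: the CPU's bit is cleared unconditionally. */
static void toy_report_cpu(struct toy_node *n, int cpu)
{
	n->expmask &= ~(1UL << cpu);
}

/* Context-switch analog: a preempted reader queues itself on the node. */
static void toy_block_reader(struct toy_node *n)
{
	n->blocked_readers++;
}

/* Unlock analog: a queued reader leaves its read-side section and dequeues. */
static void toy_unblock_reader(struct toy_node *n)
{
	if (n->blocked_readers > 0)
		n->blocked_readers--;
}

/* The expedited grace period may end only when both mechanisms are drained. */
static bool toy_exp_gp_done(const struct toy_node *n)
{
	return n->expmask == 0 && n->blocked_readers == 0;
}
```

In this model, reporting every CPU while one reader is still queued leaves toy_exp_gp_done() false; it turns true only after the reader dequeues itself.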
[PATCH] ipmi:ssif: Fix double probe from tryacpi and trydmi
The IPMI SSIF driver's parameters tryacpi and trydmi are both set to true.
The addition of the IPMI DMI driver, which creates a platform device for
each IPMI device, causes the SSIF probe to be done twice on the same SMB I2C
address for the BMC. The fix is to not call trydmi if tryacpi is able to
find the I2C address for the BMC from the SPMI ACPI table and probe
successfully.

Signed-off-by: Jiandi An
---
 drivers/char/ipmi/ipmi_ssif.c | 35 ---
 1 file changed, 24 insertions(+), 11 deletions(-)

diff --git a/drivers/char/ipmi/ipmi_ssif.c b/drivers/char/ipmi/ipmi_ssif.c
index 9d3b0fa..5c57363 100644
--- a/drivers/char/ipmi/ipmi_ssif.c
+++ b/drivers/char/ipmi/ipmi_ssif.c
@@ -1981,29 +1981,41 @@ static int try_init_spmi(struct SPMITable *spmi)
 	return new_ssif_client(myaddr, NULL, 0, 0, SI_SPMI, NULL);
 }

-static void spmi_find_bmc(void)
+static int spmi_find_bmc(void)
 {
 	acpi_status status;
 	struct SPMITable *spmi;
 	int i;
+	int rc = 0;

 	if (acpi_disabled)
-		return;
+		return -EPERM;

 	if (acpi_failure)
-		return;
+		return -ENODEV;

 	for (i = 0; ; i++) {
 		status = acpi_get_table(ACPI_SIG_SPMI, i+1,
 					(struct acpi_table_header **)&spmi);
-		if (status != AE_OK)
-			return;
+		if (status != AE_OK) {
+			if (i == 0)
+				return -ENODEV;
+			else
+				return 0;
+		}

-		try_init_spmi(spmi);
+		rc = try_init_spmi(spmi);
+		if (rc)
+			return rc;
 	}
+
+	return 0;
 }
 #else
-static void spmi_find_bmc(void) { }
+static int spmi_find_bmc(void)
+{
+	return -ENODEV;
+}
 #endif

 #ifdef CONFIG_DMI
@@ -2104,12 +2116,13 @@ static int init_ipmi_ssif(void)
 				addr[i]);
 	}

-	if (ssif_tryacpi)
+	if (ssif_tryacpi) {
 		ssif_i2c_driver.driver.acpi_match_table =
 			ACPI_PTR(ssif_acpi_match);
-
-	if (ssif_tryacpi)
-		spmi_find_bmc();
+		rv = spmi_find_bmc();
+		if (!rv)
+			ssif_trydmi = false;
+	}

 	if (ssif_trydmi) {
 		rv = platform_driver_register(_driver);
--
Jiandi An
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
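The control flow this patch introduces can be sketched outside the kernel as follows. This is a minimal userspace model, not the driver code: struct probe_cfg, find_bmc_via_spmi() and init_probe() are hypothetical names, and the "probe" is reduced to a counter so that the double-probe effect is visible.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the driver state; not the real ipmi_ssif symbols. */
struct probe_cfg {
	bool try_acpi;		/* mirrors the tryacpi parameter */
	bool try_dmi;		/* mirrors the trydmi parameter */
	int spmi_tables;	/* number of SPMI tables the fake firmware exposes */
};

/* Walk the fake SPMI tables: -ENODEV when none exist, 0 once all were probed. */
static int find_bmc_via_spmi(const struct probe_cfg *cfg, int *probes)
{
	int i;

	for (i = 0; ; i++) {
		if (i >= cfg->spmi_tables)
			return i == 0 ? -ENODEV : 0;
		(*probes)++;	/* stands in for a successful try_init_spmi() */
	}
}

/* Returns how many times the BMC address ends up being probed. */
static int init_probe(struct probe_cfg cfg)
{
	int probes = 0;

	if (cfg.try_acpi && find_bmc_via_spmi(&cfg, &probes) == 0)
		cfg.try_dmi = false;	/* ACPI path succeeded, so skip DMI */

	if (cfg.try_dmi)
		probes++;	/* stands in for the DMI platform-device probe */

	return probes;
}
```

With both flags set and one SPMI table present, init_probe() counts a single probe; without the try_dmi reset it would count two, which is the double probe described in the commit message.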
[PATCH] staging: lustre: Remove VLA usage
The kernel would like to remove all VLA usage. This switches to a simple
kasprintf() instead.

Signed-off-by: Kees Cook
---
 drivers/staging/lustre/lustre/llite/xattr.c | 19 +--
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/xattr.c b/drivers/staging/lustre/lustre/llite/xattr.c
index 532384c91447..aab4eab64289 100644
--- a/drivers/staging/lustre/lustre/llite/xattr.c
+++ b/drivers/staging/lustre/lustre/llite/xattr.c
@@ -87,7 +87,7 @@ ll_xattr_set_common(const struct xattr_handler *handler,
 		    const char *name, const void *value, size_t size,
 		    int flags)
 {
-	char fullname[strlen(handler->prefix) + strlen(name) + 1];
+	char *fullname;
 	struct ll_sb_info *sbi = ll_i2sbi(inode);
 	struct ptlrpc_request *req = NULL;
 	const char *pv = value;
@@ -141,10 +141,13 @@ ll_xattr_set_common(const struct xattr_handler *handler,
 		return -EPERM;
 	}

-	sprintf(fullname, "%s%s\n", handler->prefix, name);
+	fullname = kasprintf(GFP_KERNEL, "%s%s\n", handler->prefix, name);
+	if (!fullname)
+		return -ENOMEM;
 	rc = md_setxattr(sbi->ll_md_exp, ll_inode2fid(inode),
 			 valid, fullname, pv, size, 0, flags,
 			 ll_i2suppgid(inode), &req);
+	kfree(fullname);
 	if (rc) {
 		if (rc == -EOPNOTSUPP && handler->flags == XATTR_USER_T) {
 			LCONSOLE_INFO("Disabling user_xattr feature because it is not supported on the server\n");
@@ -364,7 +367,7 @@ static int ll_xattr_get_common(const struct xattr_handler *handler,
 			       struct dentry *dentry, struct inode *inode,
 			       const char *name, void *buffer, size_t size)
 {
-	char fullname[strlen(handler->prefix) + strlen(name) + 1];
+	char *fullname;
 	struct ll_sb_info *sbi = ll_i2sbi(inode);
 #ifdef CONFIG_FS_POSIX_ACL
 	struct ll_inode_info *lli = ll_i2info(inode);
@@ -411,9 +414,13 @@ static int ll_xattr_get_common(const struct xattr_handler *handler,
 	if (handler->flags == XATTR_ACL_DEFAULT_T && !S_ISDIR(inode->i_mode))
 		return -ENODATA;
 #endif
-	sprintf(fullname, "%s%s\n", handler->prefix, name);
-	return ll_xattr_list(inode, fullname, handler->flags, buffer, size,
-			     OBD_MD_FLXATTR);
+	fullname = kasprintf(GFP_KERNEL, "%s%s\n", handler->prefix, name);
+	if (!fullname)
+		return -ENOMEM;
+	rc = ll_xattr_list(inode, fullname, handler->flags, buffer, size,
+			   OBD_MD_FLXATTR);
+	kfree(fullname);
+	return rc;
 }

 static ssize_t ll_getxattr_lov(struct inode *inode, void *buf, size_t buf_size)
--
2.7.4

--
Kees Cook
Pixel Security
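For anyone unfamiliar with the pattern, here is a userspace analog of replacing a VLA with a sized-on-demand heap allocation. It is only a sketch: xasprintf() and xattr_fullname() are hypothetical helpers written for illustration, built on the standard two-pass vsnprintf() idiom rather than the kernel's kasprintf(). The sketch also uses a plain "%s%s" format and does not reproduce the trailing "\n" in the patch's format string.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Userspace analog of kasprintf(): format into a freshly allocated,
 * exactly sized heap buffer. Returns NULL on failure; the caller frees.
 */
static char *xasprintf(const char *fmt, ...)
{
	va_list ap, ap2;
	char *buf;
	int len;

	va_start(ap, fmt);
	va_copy(ap2, ap);
	len = vsnprintf(NULL, 0, fmt, ap);	/* first pass: measure only */
	va_end(ap);
	if (len < 0) {
		va_end(ap2);
		return NULL;
	}

	buf = malloc((size_t)len + 1);
	if (buf)
		vsnprintf(buf, (size_t)len + 1, fmt, ap2);	/* second pass: format */
	va_end(ap2);
	return buf;
}

/* Build "<prefix><name>" without a VLA; hypothetical helper for illustration. */
static char *xattr_fullname(const char *prefix, const char *name)
{
	return xasprintf("%s%s", prefix, name);
}
```

As in the patch, the caller has to check for NULL and release the buffer on every exit path, which is the cost of moving the name off the stack.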
[PATCH] staging: iio: meter: Remove reduntant __func__ from debug print
From: HariPrasath Elango

dev_dbg includes the function name & line number by default when dynamic
debugging is enabled. Hence __func__ is redundant here and is removed.

Signed-off-by: HariPrasath Elango
---
 drivers/staging/iio/meter/ade7758_trigger.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/iio/meter/ade7758_trigger.c b/drivers/staging/iio/meter/ade7758_trigger.c
index 1f0d1a0..da489ae 100644
--- a/drivers/staging/iio/meter/ade7758_trigger.c
+++ b/drivers/staging/iio/meter/ade7758_trigger.c
@@ -34,7 +34,7 @@ static int ade7758_data_rdy_trigger_set_state(struct iio_trigger *trig,
 {
 	struct iio_dev *indio_dev = iio_trigger_get_drvdata(trig);

-	dev_dbg(&indio_dev->dev, "%s (%d)\n", __func__, state);
+	dev_dbg(&indio_dev->dev, "(%d)\n", state);
 	return ade7758_set_irq(&indio_dev->dev, state);
 }

--
2.10.0.GIT
[PATCH v6] mmc: Export host capabilities to debugfs.
This patch exports the host capabilities to debugfs.

This idea of sharing host capabilities over debugfs came up from Abbas Raza.

Earlier discussions:
https://lkml.org/lkml/2018/3/5/357
https://www.spinics.net/lists/linux-mmc/msg48219.html

Signed-off-by: Harish Jenny K N
---
Changes in v6:
- Used DEFINE_SHOW_ATTRIBUTE

Changes in v5:
- Added parser logic in kernel by using debugfs_create_file for caps and caps2 instead of debugfs_create_x32
- Changed Author

Changes in v4:
- Moved the creation of nodes to mmc_add_host_debugfs
- Exported caps2
- Renamed host_caps to caps

Changes in v3:
- Removed typecasting of &host->caps to (u32 *)

Changes in v2:
- Changed Author

 drivers/mmc/core/debugfs.c | 120 +
 1 file changed, 120 insertions(+)

diff --git a/drivers/mmc/core/debugfs.c b/drivers/mmc/core/debugfs.c
index c51e0c0..136bdf7 100644
--- a/drivers/mmc/core/debugfs.c
+++ b/drivers/mmc/core/debugfs.c
@@ -225,6 +225,120 @@ static int mmc_clock_opt_set(void *data, u64 val)
 DEFINE_SIMPLE_ATTRIBUTE(mmc_clock_fops, mmc_clock_opt_get, mmc_clock_opt_set,
 	"%llu\n");

+static int mmc_caps_show(struct seq_file *s, void *unused)
+{
+	struct mmc_host *host = s->private;
+	u32 caps = host->caps;
+
+	seq_puts(s, "\nMMC Host capabilities are:\n");
+	seq_puts(s, "=\n");
+	seq_printf(s, "Can the host do 4 bit transfers :\t%s\n",
+		   ((caps & MMC_CAP_4_BIT_DATA) ? "Yes" : "No"));
+	seq_printf(s, "Can do MMC high-speed timing :\t%s\n",
+		   ((caps & MMC_CAP_MMC_HIGHSPEED) ? "Yes" : "No"));
+	seq_printf(s, "Can do SD high-speed timing :\t%s\n",
+		   ((caps & MMC_CAP_SD_HIGHSPEED) ? "Yes" : "No"));
+	seq_printf(s, "Can signal pending SDIO IRQs :\t%s\n",
+		   ((caps & MMC_CAP_SDIO_IRQ) ? "Yes" : "No"));
+	seq_printf(s, "Talks only SPI protocols :\t%s\n",
+		   ((caps & MMC_CAP_SPI) ? "Yes" : "No"));
+	seq_printf(s, "Needs polling for card-detection :\t%s\n",
+		   ((caps & MMC_CAP_NEEDS_POLL) ? "Yes" : "No"));
+	seq_printf(s, "Can the host do 8 bit transfers :\t%s\n",
+		   ((caps & MMC_CAP_8_BIT_DATA) ? "Yes" : "No"));
+	seq_printf(s, "Suspend (e)MMC/SD at idle :\t%s\n",
+		   ((caps & MMC_CAP_AGGRESSIVE_PM) ? "Yes" : "No"));
+	seq_printf(s, "Nonremovable e.g. eMMC :\t%s\n",
+		   ((caps & MMC_CAP_NONREMOVABLE) ? "Yes" : "No"));
+	seq_printf(s, "Waits while card is busy :\t%s\n",
+		   ((caps & MMC_CAP_WAIT_WHILE_BUSY) ? "Yes" : "No"));
+	seq_printf(s, "Allow erase/trim commands :\t%s\n",
+		   ((caps & MMC_CAP_ERASE) ? "Yes" : "No"));
+	seq_printf(s, "Can support DDR mode at 3.3V :\t%s\n",
+		   ((caps & MMC_CAP_3_3V_DDR) ? "Yes" : "No"));
+	seq_printf(s, "Can support DDR mode at 1.8V :\t%s\n",
+		   ((caps & MMC_CAP_1_8V_DDR) ? "Yes" : "No"));
+	seq_printf(s, "Can support DDR mode at 1.2V :\t%s\n",
+		   ((caps & MMC_CAP_1_2V_DDR) ? "Yes" : "No"));
+	seq_printf(s, "Can power off after boot :\t%s\n",
+		   ((caps & MMC_CAP_POWER_OFF_CARD) ? "Yes" : "No"));
+	seq_printf(s, "CMD14/CMD19 bus width ok :\t%s\n",
+		   ((caps & MMC_CAP_BUS_WIDTH_TEST) ? "Yes" : "No"));
+	seq_printf(s, "Host supports UHS SDR12 mode :\t%s\n",
+		   ((caps & MMC_CAP_UHS_SDR12) ? "Yes" : "No"));
+	seq_printf(s, "Host supports UHS SDR25 mode :\t%s\n",
+		   ((caps & MMC_CAP_UHS_SDR25) ? "Yes" : "No"));
+	seq_printf(s, "Host supports UHS SDR50 mode :\t%s\n",
+		   ((caps & MMC_CAP_UHS_SDR50) ? "Yes" : "No"));
+	seq_printf(s, "Host supports UHS SDR104 mode :\t%s\n",
+		   ((caps & MMC_CAP_UHS_SDR104) ? "Yes" : "No"));
+	seq_printf(s, "Host supports UHS DDR50 mode :\t%s\n",
+		   ((caps & MMC_CAP_UHS_DDR50) ? "Yes" : "No"));
+	seq_printf(s, "Host supports Driver Type A :\t%s\n",
+		   ((caps & MMC_CAP_DRIVER_TYPE_A) ? "Yes" : "No"));
+	seq_printf(s, "Host supports Driver Type C :\t%s\n",
+		   ((caps & MMC_CAP_DRIVER_TYPE_C) ? "Yes" : "No"));
+	seq_printf(s, "Host supports Driver Type D :\t%s\n",
+		   ((caps & MMC_CAP_DRIVER_TYPE_D) ? "Yes" : "No"));
+	seq_printf(s, "RW reqs can be completed within mmc_request_done() :\t%s\n",
+		   ((caps & MMC_CAP_DONE_COMPLETE) ? "Yes" : "No"));
+	seq_printf(s, "Enable card detect wake :\t%s\n",
+		   ((caps & MMC_CAP_CD_WAKE) ? "Yes" : "No"));
+	seq_printf(s, "Commands during data transfer :\t%s\n",
+		   ((caps & MMC_CAP_CMD_DURING_TFR) ? "Yes" : "No"));
+	seq_printf(s, "CMD23 supported. :\t%s\n",
+		   ((caps & MMC_CAP_CMD23) ? "Yes" : "No"));
+	seq_printf(s,
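The decoding above repeats one seq_printf() per capability bit. As a possible alternative, and not what the posted patch does, the same output could be driven from a table. The sketch below is userspace C under stated assumptions: the DEMO_CAP_* values, struct cap_desc and format_caps() are hypothetical, and the bit positions are illustrative rather than taken from the kernel's MMC host header.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative bit positions only; the real MMC_CAP_* values live in the kernel. */
#define DEMO_CAP_4_BIT_DATA	(1u << 0)
#define DEMO_CAP_MMC_HIGHSPEED	(1u << 1)
#define DEMO_CAP_SD_HIGHSPEED	(1u << 2)

struct cap_desc {
	uint32_t bit;		/* capability flag to test */
	const char *text;	/* human-readable description */
};

static const struct cap_desc demo_caps[] = {
	{ DEMO_CAP_4_BIT_DATA,    "Can the host do 4 bit transfers" },
	{ DEMO_CAP_MMC_HIGHSPEED, "Can do MMC high-speed timing" },
	{ DEMO_CAP_SD_HIGHSPEED,  "Can do SD high-speed timing" },
};

/*
 * Write one "description :\tYes|No" line per table entry into out.
 * Returns the number of bytes written, or -1 if out is too small.
 */
static int format_caps(uint32_t caps, char *out, size_t outlen)
{
	size_t used = 0;
	size_t i;

	for (i = 0; i < sizeof(demo_caps) / sizeof(demo_caps[0]); i++) {
		int n = snprintf(out + used, outlen - used, "%s :\t%s\n",
				 demo_caps[i].text,
				 (caps & demo_caps[i].bit) ? "Yes" : "No");

		if (n < 0 || (size_t)n >= outlen - used)
			return -1;
		used += (size_t)n;
	}
	return (int)used;
}
```

Adding a capability then means adding one table row instead of another call, at the cost of one extra indirection when reading the code.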
[PATCH] security: Fix IMA Kconfig for dependencies on ARM64
The TPM_CRB driver provides TPM support on ARM64. If it is built as a
module, the TPM chip is registered after IMA init. tpm_pcr_read() in the IMA
driver then fails and displays the following message, even though a TPM chip
is eventually present on the system:

ima: No TPM chip found, activating TPM-bypass! (rc=-19)

Fix the IMA Kconfig to select TCG_CRB so the TPM_CRB driver is built into
the kernel and initializes before the IMA driver.

Signed-off-by: Jiandi An
---
 security/integrity/ima/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/security/integrity/ima/Kconfig b/security/integrity/ima/Kconfig
index 35ef693..6a8f677 100644
--- a/security/integrity/ima/Kconfig
+++ b/security/integrity/ima/Kconfig
@@ -10,6 +10,7 @@ config IMA
 	select CRYPTO_HASH_INFO
 	select TCG_TPM if HAS_IOMEM && !UML
 	select TCG_TIS if TCG_TPM && X86
+	select TCG_CRB if TCG_TPM && ACPI
 	select TCG_IBMVTPM if TCG_TPM && PPC_PSERIES
 	help
 	  The Trusted Computing Group(TCG) runtime Integrity
--
Jiandi An
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
Re: [PATCH] staging: iio: adc: Remove reduntant __func__ from debug print
On Wed, Mar 07, 2018 at 10:40:05AM +0530, hariprasath.ela...@gmail.com wrote:
> From: HariPrasath Elango
>
> dev_dbg includes the function name & line number by default when dynamic
> debugging is enabled. Hence __func__ is redundant here and is removed.
>
> Signed-off-by: HariPrasath Elango
> ---
>  drivers/staging/iio/meter/ade7758_trigger.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/staging/iio/meter/ade7758_trigger.c b/drivers/staging/iio/meter/ade7758_trigger.c
> index 1f0d1a0..da489ae 100644
> --- a/drivers/staging/iio/meter/ade7758_trigger.c
> +++ b/drivers/staging/iio/meter/ade7758_trigger.c
> @@ -34,7 +34,7 @@ static int ade7758_data_rdy_trigger_set_state(struct iio_trigger *trig,
>  {
>  	struct iio_dev *indio_dev = iio_trigger_get_drvdata(trig);
>
> -	dev_dbg(&indio_dev->dev, "%s (%d)\n", __func__, state);
> +	dev_dbg(&indio_dev->dev, "(%d)\n", state);
>  	return ade7758_set_irq(&indio_dev->dev, state);
>  }
>
> --
> 2.10.0.GIT
>

Please ignore this patch as the subject line is wrong. It should be 'meter'
and not 'adc'.
Re: [PATCH v2] arm64: dts: msm8916: Add cpu cooling maps
On Wed, Mar 7, 2018 at 10:30 AM, Amit Kucheria wrote:
> From: Rajendra Nayak
>
> Add cpu cooling maps for cpu passive trip points. The cpu cooling
> device states are mapped to cpufreq based scaling frequencies.
>
> Signed-off-by: Rajendra Nayak
> Signed-off-by: Amit Kucheria
> ---
>  arch/arm64/boot/dts/qcom/msm8916.dtsi | 19 +++
>  1 file changed, 19 insertions(+)
>
> diff --git a/arch/arm64/boot/dts/qcom/msm8916.dtsi b/arch/arm64/boot/dts/qcom/msm8916.dtsi
> index e468277..66b318e 100644
> --- a/arch/arm64/boot/dts/qcom/msm8916.dtsi
> +++ b/arch/arm64/boot/dts/qcom/msm8916.dtsi
> @@ -15,6 +15,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  / {
>  	model = "Qualcomm Technologies, Inc. MSM8916";
> @@ -115,6 +116,7 @@
>  			cpu-idle-states = <_SPC>;
>  			clocks = < 0>;
>  			operating-points-v2 = <_opp_table>;
> +			#cooling-cells = <2>;

LGTM.
[PATCH] staging: iio: adc: Remove reduntant __func__ from debug print
From: HariPrasath Elango

dev_dbg includes the function name & line number by default when dynamic
debugging is enabled. Hence __func__ is redundant here and is removed.

Signed-off-by: HariPrasath Elango
---
 drivers/staging/iio/meter/ade7758_trigger.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/iio/meter/ade7758_trigger.c b/drivers/staging/iio/meter/ade7758_trigger.c
index 1f0d1a0..da489ae 100644
--- a/drivers/staging/iio/meter/ade7758_trigger.c
+++ b/drivers/staging/iio/meter/ade7758_trigger.c
@@ -34,7 +34,7 @@ static int ade7758_data_rdy_trigger_set_state(struct iio_trigger *trig,
 {
 	struct iio_dev *indio_dev = iio_trigger_get_drvdata(trig);

-	dev_dbg(&indio_dev->dev, "%s (%d)\n", __func__, state);
+	dev_dbg(&indio_dev->dev, "(%d)\n", state);
 	return ade7758_set_irq(&indio_dev->dev, state);
 }

--
2.10.0.GIT
Re: [PATCH v9 14/15] cpufreq: Add module to register cpufreq on Krait CPUs
On 06-03-18, 20:09, Sricharan R wrote:
> From: Stephen Boyd
>
> Register a cpufreq-generic device whenever we detect that a
> "qcom,krait" compatible CPU is present in DT.
>
> Acked-by: Viresh Kumar
> [Sricharan: updated to use dev_pm_opp_set_prop_name and
>             nvmem apis]
> Signed-off-by: Sricharan R
> Signed-off-by: Stephen Boyd
> ---
>  drivers/cpufreq/Kconfig.arm          |  10 ++
>  drivers/cpufreq/Makefile             |   1 +
>  drivers/cpufreq/cpufreq-dt-platdev.c |   5 +
>  drivers/cpufreq/qcom-cpufreq.c       | 183 +++
>  4 files changed, 199 insertions(+)
>  create mode 100644 drivers/cpufreq/qcom-cpufreq.c

Acked-by: Viresh Kumar

--
viresh
[PATCH dts/arm/aspeed-g5 v1] ARM: dts: aspeed-g5: Add IPMI KCS node
The IPMI KCS device is part of the LPC interface and is used for
communication with the host processor.

Signed-off-by: Haiyue Wang
---
 arch/arm/boot/dts/aspeed-g5.dtsi | 43 +++-
 1 file changed, 42 insertions(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/aspeed-g5.dtsi b/arch/arm/boot/dts/aspeed-g5.dtsi
index 8eac57c..f443169 100644
--- a/arch/arm/boot/dts/aspeed-g5.dtsi
+++ b/arch/arm/boot/dts/aspeed-g5.dtsi
@@ -267,8 +267,40 @@
 		ranges = <0x0 0x1e789000 0x1000>;

 		lpc_bmc: lpc-bmc@0 {
-			compatible = "aspeed,ast2500-lpc-bmc";
+			compatible = "aspeed,ast2500-lpc-bmc", "simple-mfd", "syscon";
 			reg = <0x0 0x80>;
+			reg-io-width = <4>;
+
+			#address-cells = <1>;
+			#size-cells = <1>;
+			ranges = <0x0 0x0 0x80>;
+
+			kcs1: kcs1@0 {
+				compatible = "aspeed,ast2500-kcs-bmc";
+				reg = <0x0 0x80>;
+				interrupts = <8>;
+				kcs_chan = <1>;
+				kcs_addr = <0x0>;
+				status = "disabled";
+			};
+
+			kcs2: kcs2@0 {
+				compatible = "aspeed,ast2500-kcs-bmc";
+				reg = <0x0 0x80>;
+				interrupts = <8>;
+				kcs_chan = <2>;
+				kcs_addr = <0x0>;
+				status = "disabled";
+			};
+
+			kcs3: kcs3@0 {
+				compatible = "aspeed,ast2500-kcs-bmc";
+				reg = <0x0 0x80>;
+				interrupts = <8>;
+				kcs_chan = <3>;
+				kcs_addr = <0x0>;
+				status = "disabled";
+			};
 		};

 		lpc_host: lpc-host@80 {
@@ -294,6 +326,15 @@
 				status = "disabled";
 			};

+			kcs4: kcs4@0 {
+				compatible = "aspeed,ast2500-kcs-bmc";
+				reg = <0x0 0xa0>;
+				interrupts = <8>;
+				kcs_chan = <4>;
+				kcs_addr = <0x0>;
+				status = "disabled";
+			};
+
 			lhc: lhc@20 {
 				compatible = "aspeed,ast2500-lhc";
 				reg = <0x20 0x24 0x48 0x8>;
--
2.7.4