Re: [RESEND][PATCH] cpuidle/powernv : Restore different PSSCR for idle and hotplug

2018-03-06 Thread Vaidyanathan Srinivasan
* Benjamin Herrenschmidt  [2018-03-01 08:40:22]:

> On Thu, 2018-03-01 at 01:03 +0530, Akshay Adiga wrote:
> > commit 1e1601b38e6e ("powerpc/powernv/idle: Restore SPRs for deep idle
> > states via stop API.") uses stop-api provided by the firmware to restore
> > PSSCR. PSSCR restore is required for handling special wakeup. When special
> > wakeup is completed, the core enters stop state based on restored PSSCR.
> > 
> > Currently PSSCR is restored to deepest available stop state, causing
> > an idle cpu to enter a deeper stop state on a special wakeup, which causes
> > the cpu to hang on wakeup.
> > 
> > A "sensors" command which reads temperature (through DTS sensors) on idle
> > cpu can trigger special wakeup.
> > 
> > Failed Scenario :
> > Request restore of PSSCR with RL = 11
> > cpu enters idle state (stop5)
> >   user triggers "sensors" command
> >    Assert special wakeup on cpu
> >      Restores PSSCR with RL = 11  <-- Done by firmware
> >       Read DTS sensor
> >    Deassert special wakeup
> >       cpu enters idle state (stop11) <-- Instead of stop5
> > 
> > Cpu hang is caused because cpu ended up in a deeper state than it requested
> > 
> > This patch fixes instability caused by special wakeup when stop11 is
> > enabled. Requests restore of PSSCR to deepest stop state used by cpuidle.
> > Only when offlining cpu, request restore of PSSCR to deepest stop state.
> > On onlining cpu, request restore of PSSCR to deepest stop state used by
> > cpuidle.
> 
> So if we chose a stop state, but somebody else does a special wakeup,
> we'll end up going back into a *deeper* one than the one we came from ?

Unfortunately yes.  This is the current limitation.  If we are in stop4
or deeper and have not set a PSSCR value to be restored, the hardware
defaults to all bits set (stop15), which leads to entry into stop11
after the special wakeup is removed.  So we must set a correct PSSCR
restore value using stop-api.

The restore value only needs to represent the group of states.  For
example, stop4,5,6,7 have identical state loss, so we can set a PSSCR
restore value of 4, 5 or 7 for any of these stop states and do not have
to use stop-api to set the exact value of stop4 or stop5 at every entry.
 
> I still think this is broken by design. The chip should automatically
> go back to the state it went to after special wakeup, thus the PPE
> controlling the state should override the PSSCR value accordingly
> rather than relying on those SW hoops.

Special wakeup de-assertion and re-entry hits this limitation where we
have lost the original content of PSSCR SPR and hence CME does not know
what was requested.

Additional stop-api calls from software could have been avoided, but
in practice only the cpu hotplug case uses stop11 and needs this
stop-api call.  We can default the system to stop4 or stop5, make
a stop-api call to explicitly set stop11, and then hotplug out the
cpu.  We have to restore the deepest cpuidle state (stop4/5) during
online.  As such this is not much software overhead.  But we need an
elegant method to control these calls from OPAL flags so that kernel
behaviour can be more closely controlled.

If we want to use stop11 in cpuidle (despite it being very slow) for
evaluation reasons, then we will need more stop-api calls to choose
between stop4/5 and stop11, since they belong to different groups.
Even in this case, since stop11 is the slow path, we would want to set
stop11 before entry and restore stop4/5 after wakeup.  This way we
still completely avoid stop-api calls in the fast-path stop4/5 entry/exit.

--Vaidy
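
The restore policy discussed in this reply can be sketched as a small userspace model. This is only an illustration under stated assumptions: the function names are hypothetical, the choice of stop5 as the deepest cpuidle state is assumed, and the grouping of states in fours is a guess that merely agrees with the two facts given in the thread (stop4,5,6,7 share one state-loss group, stop11 is in another). It is not the kernel or firmware implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names and values; only the stop levels come from the thread. */
enum {
	DEEPEST_CPUIDLE_STOP = 5,	/* assumed deepest state used by cpuidle */
	DEEPEST_HOTPLUG_STOP = 11,	/* deepest state, used only for offline cpus */
};

/*
 * Assumption: states are grouped in fours by state loss (stop4-7, stop8-11).
 * The thread only confirms that stop4,5,6,7 form one group and that stop11
 * belongs to a different one.
 */
static int state_loss_group(int rl)
{
	assert(rl >= 0 && rl <= 15);
	return rl / 4;
}

/* PSSCR restore level to request via stop-api: stop11 only while offlining. */
static int pick_restore_rl(bool going_offline)
{
	return going_offline ? DEEPEST_HOTPLUG_STOP : DEEPEST_CPUIDLE_STOP;
}

/*
 * After special wakeup is deasserted the core re-enters the restored level.
 * Re-entry is treated as safe when it stays in the requested state-loss group.
 */
static bool reentry_is_safe(int requested_rl, int restore_rl)
{
	return state_loss_group(requested_rl) == state_loss_group(restore_rl);
}
```

In this model the failed scenario from the patch description corresponds to `reentry_is_safe(5, 11)` being false, while any stop4/5 entry with the cpuidle restore value stays within its group.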



Re: KASAN: use-after-free Read in __list_del_entry_valid (3)

2018-03-06 Thread Martijn Coenen
On Tue, Mar 6, 2018 at 9:30 AM, syzbot wrote:
> Hello,
>
> syzbot hit the following crash on upstream commit
> 094b58e1040a44f991d7ab628035e69c4d6b79c9 (Mon Mar 5 19:57:06 2018 +)
> Merge tag 'linux-kselftest-4.16-rc5' of
> git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

I'll take a look at this one,

Martijn

>
> Unfortunately, I don't have any reproducer for this crash yet.
> Raw console output is attached.
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached.
> user-space arch: i386
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+09e05aba06723a94d...@syzkaller.appspotmail.com
> It will help syzbot understand when the bug is fixed. See footer for
> details.
> If you forward the report, please keep this part and the footer.
>
> binder: release 6174:6185 transaction 4 in, still active
> binder: send failed reply for transaction 4 to 6174:6185
> binder: 6194:6198 ERROR: BC_REGISTER_LOOPER called without request
> ==
> binder: 6198 RLIMIT_NICE not set
> BUG: KASAN: use-after-free in __list_del_entry_valid+0x144/0x150
> lib/list_debug.c:54
> Read of size 8 at addr 8801daede810 by task kworker/1:1/24
>
> CPU: 1 PID: 24 Comm: kworker/1:1 Not tainted 4.16.0-rc4+ #252
> binder: BINDER_SET_CONTEXT_MGR already set
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Workqueue: events binder_deferred_func
> Call Trace:
>  __dump_stack lib/dump_stack.c:17 [inline]
>  dump_stack+0x194/0x24d lib/dump_stack.c:53
> binder: 6194:6206 got new transaction with bad transaction stack,
> transaction 9 has target 6194:0
>  print_address_description+0x73/0x250 mm/kasan/report.c:256
>  kasan_report_error mm/kasan/report.c:354 [inline]
>  kasan_report+0x23c/0x360 mm/kasan/report.c:412
>  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
>  __list_del_entry_valid+0x144/0x150 lib/list_debug.c:54
>  __list_del_entry include/linux/list.h:117 [inline]
>  list_del_init include/linux/list.h:159 [inline]
>  binder_dequeue_work_head_ilocked drivers/android/binder.c:893 [inline]
>  binder_dequeue_work_head drivers/android/binder.c:913 [inline]
>  binder_release_work+0x163/0x490 drivers/android/binder.c:4191
> binder: 6194:6206 transaction failed 29201/-71, size 0-0 line 2875
> binder: 6191:6205 ioctl 40046207 0 returned -16
>  binder_thread_release+0x4d0/0x720 drivers/android/binder.c:4396
>  binder_deferred_release drivers/android/binder.c:4939 [inline]
>  binder_deferred_func+0x4f4/0x1340 drivers/android/binder.c:5022
> binder: BINDER_SET_CONTEXT_MGR already set
> binder: 6200:6207 ioctl 40046207 0 returned -16
> binder: 6191:6208 ERROR: BC_REGISTER_LOOPER called without request
>  process_one_work+0xc47/0x1bb0 kernel/workqueue.c:2113
> binder: 6208 RLIMIT_NICE not set
> binder: 6200:6212 ERROR: BC_REGISTER_LOOPER called without request
> binder: 6212 RLIMIT_NICE not set
> binder: 6191:6213 got new transaction with bad transaction stack,
> transaction 11 has target 6194:0
>  worker_thread+0x223/0x1990 kernel/workqueue.c:2247
> binder: 6191:6213 transaction failed 29201/-71, size 0-0 line 2875
> binder: 6198 RLIMIT_NICE not set
> binder: release 6200:6207 transaction 14 out, still active
> binder: undelivered TRANSACTION_COMPLETE
>  kthread+0x33c/0x400 kernel/kthread.c:238
>  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:406
>
> Allocated by task 6185:
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:447
>  set_track mm/kasan/kasan.c:459 [inline]
>  kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:552
>  kmem_cache_alloc_trace+0x136/0x740 mm/slab.c:3607
>  kmalloc include/linux/slab.h:512 [inline]
>  kzalloc include/linux/slab.h:701 [inline]
>  binder_transaction+0x13c1/0x81c0 drivers/android/binder.c:2900
>  binder_thread_write+0xb50/0x3840 drivers/android/binder.c:3513
>  binder_ioctl_write_read.isra.38+0x261/0xcb0 drivers/android/binder.c:4451
>  binder_ioctl+0xb72/0x1417 drivers/android/binder.c:4591
>  C_SYSC_ioctl fs/compat_ioctl.c:1461 [inline]
>  compat_SyS_ioctl+0x151/0x2a30 fs/compat_ioctl.c:1407
>  do_syscall_32_irqs_on arch/x86/entry/common.c:330 [inline]
>  do_fast_syscall_32+0x3ec/0xf9f arch/x86/entry/common.c:392
>  entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
>
> Freed by task 24:
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:447
>  set_track mm/kasan/kasan.c:459 [inline]
>  __kasan_slab_free+0x11a/0x170 mm/kasan/kasan.c:520
>  kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:527
>  __cache_free mm/slab.c:3485 [inline]
>  kfree+0xd9/0x260 mm/slab.c:3800
>  binder_free_transaction+0x6a/0x90 drivers/android/binder.c:1966
>  binder_send_failed_reply+0x1c9/0x380 drivers/android/binder.c:2005
>  binder_thread_release+0x4bb/0x720 drivers/android/binder.c:4395
>  binder_deferred_release drivers/android/binder.c:4939 [inline]
>  binder_deferred_func+0x4f4/0x1340 drivers/android/binder.c:5022
>  process_one_work+0xc47/0x1bb0 

[PATCH v5 4/7] x86: Align x86_64 PCI_MMCONFIG with 32-bit variant

2018-03-06 Thread Jan Kiszka
From: Jan Kiszka 

Allow enabling PCI_MMCONFIG when only SFI is present and make this
option default to on. This will help consolidate both entries into one
Kconfig statement.

Signed-off-by: Jan Kiszka 
---
 arch/x86/Kconfig | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index eb7f43f23521..c19f5342ec2b 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2659,7 +2659,8 @@ config PCI_DOMAINS
 
 config PCI_MMCONFIG
 	bool "Support mmconfig PCI config space access"
-	depends on X86_64 && PCI && ACPI
+	default y
+	depends on X86_64 && PCI && (ACPI || SFI)
 
 config PCI_CNB20LE_QUIRK
 	bool "Read CNB20LE Host Bridge Windows" if EXPERT
-- 
2.13.6



[PATCH v5 3/7] x86/jailhouse: Enable PCI mmconfig access in inmates

2018-03-06 Thread Jan Kiszka
From: Otavio Pontes 

Use the PCI mmconfig base address exported by jailhouse in boot
parameters in order to access the memory mapped PCI configuration space.

Signed-off-by: Otavio Pontes 
[Jan: rebased, fixed !CONFIG_PCI_MMCONFIG, used pcibios_last_bus]
Signed-off-by: Jan Kiszka 
Reviewed-by: Andy Shevchenko 
---
 arch/x86/include/asm/pci_x86.h | 2 ++
 arch/x86/kernel/jailhouse.c    | 8 ++++++++
 arch/x86/pci/mmconfig-shared.c | 4 ++--
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/pci_x86.h b/arch/x86/include/asm/pci_x86.h
index eb66fa9cd0fc..959d618dbb17 100644
--- a/arch/x86/include/asm/pci_x86.h
+++ b/arch/x86/include/asm/pci_x86.h
@@ -151,6 +151,8 @@ extern int pci_mmconfig_insert(struct device *dev, u16 seg, u8 start, u8 end,
 			       phys_addr_t addr);
 extern int pci_mmconfig_delete(u16 seg, u8 start, u8 end);
 extern struct pci_mmcfg_region *pci_mmconfig_lookup(int segment, int bus);
+extern struct pci_mmcfg_region *__init pci_mmconfig_add(int segment, int start,
+							int end, u64 addr);
 
 extern struct list_head pci_mmcfg_list;
 
diff --git a/arch/x86/kernel/jailhouse.c b/arch/x86/kernel/jailhouse.c
index b68fd895235a..fa183a131edc 100644
--- a/arch/x86/kernel/jailhouse.c
+++ b/arch/x86/kernel/jailhouse.c
@@ -124,6 +124,14 @@ static int __init jailhouse_pci_arch_init(void)
 	if (pcibios_last_bus < 0)
 		pcibios_last_bus = 0xff;
 
+#ifdef CONFIG_PCI_MMCONFIG
+	if (setup_data.pci_mmconfig_base) {
+		pci_mmconfig_add(0, 0, pcibios_last_bus,
+				 setup_data.pci_mmconfig_base);
+		pci_mmcfg_arch_init();
+	}
+#endif
+
 	return 0;
 }
 
diff --git a/arch/x86/pci/mmconfig-shared.c b/arch/x86/pci/mmconfig-shared.c
index 96684d0adcf9..0e590272366b 100644
--- a/arch/x86/pci/mmconfig-shared.c
+++ b/arch/x86/pci/mmconfig-shared.c
@@ -94,8 +94,8 @@ static struct pci_mmcfg_region *pci_mmconfig_alloc(int segment, int start,
 	return new;
 }
 
-static struct pci_mmcfg_region *__init pci_mmconfig_add(int segment, int start,
-							int end, u64 addr)
+struct pci_mmcfg_region *__init pci_mmconfig_add(int segment, int start,
+						 int end, u64 addr)
 {
 	struct pci_mmcfg_region *new;
 
-- 
2.13.6
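
The patch above registers one MMCONFIG region covering buses 0 through pcibios_last_bus at the base address passed in by Jailhouse. As a rough sketch of what that region implies, the standard PCIe ECAM layout places each bus in a 1 MiB window and each function in a 4 KiB page inside it. The helper names and the example base address used in the test values are hypothetical and not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * ECAM layout: 1 MiB per bus, 4 KiB of config space per function (devfn).
 * Hypothetical helper, not a kernel function.
 */
static uint64_t mmcfg_addr(uint64_t base, unsigned int bus,
			   unsigned int devfn, unsigned int reg)
{
	assert(bus < 256 && devfn < 256 && reg < 4096);
	return base + ((uint64_t)bus << 20) + ((uint64_t)devfn << 12) + reg;
}

/* Size of the window needed to cover buses 0..last_bus inclusive. */
static uint64_t mmcfg_window_size(unsigned int last_bus)
{
	assert(last_bus < 256);
	return ((uint64_t)last_bus + 1) << 20;
}
```

With pcibios_last_bus left at its 0xff fallback, the region spans the full 256 MiB that ECAM needs for one segment.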



[PATCH v5 5/7] x86: Consolidate PCI_MMCONFIG configs

2018-03-06 Thread Jan Kiszka
From: Jan Kiszka 

Since e279b6c1d329 ("x86: start unification of arch/x86/Kconfig.*"), we
have two PCI_MMCONFIG entries, one from the original i386 and another
from x86_64. This consolidates both entries into a single one.

Signed-off-by: Jan Kiszka 
---
 arch/x86/Kconfig | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index c19f5342ec2b..8986a6b6e3df 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2641,8 +2641,10 @@ config PCI_DIRECT
 	depends on PCI && (X86_64 || (PCI_GODIRECT || PCI_GOANY || PCI_GOOLPC || PCI_GOMMCONFIG))
 
 config PCI_MMCONFIG
-	def_bool y
-	depends on X86_32 && PCI && (ACPI || SFI) && (PCI_GOMMCONFIG || PCI_GOANY)
+	bool "Support mmconfig PCI config space access" if X86_64
+	default y
+	depends on PCI && (ACPI || SFI)
+	depends on X86_64 || (PCI_GOANY || PCI_GOMMCONFIG)
 
 config PCI_OLPC
 	def_bool y
@@ -2657,11 +2659,6 @@ config PCI_DOMAINS
 	def_bool y
 	depends on PCI
 
-config PCI_MMCONFIG
-	bool "Support mmconfig PCI config space access"
-	default y
-	depends on X86_64 && PCI && (ACPI || SFI)
-
 config PCI_CNB20LE_QUIRK
 	bool "Read CNB20LE Host Bridge Windows" if EXPERT
 	depends on PCI
-- 
2.13.6
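
One way to convince oneself that the consolidation above keeps the dependency logic intact is to enumerate all input combinations. The sketch below models only the `depends on` expressions from the diff; it assumes X86_32 is exactly the negation of X86_64 and ignores the prompt visibility (`if X86_64`) and the question of which PCI_GO* symbols can be set on which architecture. All C names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

struct kconf {
	bool x86_64, pci, acpi, sfi, goany, gommconfig;
};

/* Former 32-bit entry; X86_32 is modeled as !x86_64 (assumption). */
static bool old_32(const struct kconf *k)
{
	return !k->x86_64 && k->pci && (k->acpi || k->sfi) &&
	       (k->gommconfig || k->goany);
}

/* Former 64-bit entry, as it stands after patch 4/7. */
static bool old_64(const struct kconf *k)
{
	return k->x86_64 && k->pci && (k->acpi || k->sfi);
}

/* Consolidated entry: both "depends on" lines must hold. */
static bool merged(const struct kconf *k)
{
	return k->pci && (k->acpi || k->sfi) &&
	       (k->x86_64 || k->goany || k->gommconfig);
}

/* Count input combinations where the merged entry disagrees with the old pair. */
static int count_mismatches(void)
{
	int mismatches = 0;

	for (unsigned int bits = 0; bits < 64; bits++) {
		struct kconf k = {
			.x86_64 = bits & 1,
			.pci = bits & 2,
			.acpi = bits & 4,
			.sfi = bits & 8,
			.goany = bits & 16,
			.gommconfig = bits & 32,
		};

		if (merged(&k) != (old_32(&k) || old_64(&k)))
			mismatches++;
	}
	return mismatches;
}
```

Under these assumptions `count_mismatches()` finds no combination where the single entry behaves differently from the two it replaces.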



[PATCH v5 2/7] PCI: Scan all functions when running over Jailhouse

2018-03-06 Thread Jan Kiszka
From: Jan Kiszka 

Per PCIe r4.0, sec 7.5.1.1.9, multi-function devices are required to
have a function 0.  Therefore, Linux scans for devices at function 0
(devfn 0/8/16/...) and only scans for other functions if function 0
has its Multi-Function Device bit set or ARI or SR-IOV indicate
there are more functions.

The Jailhouse hypervisor may pass individual functions of a
multi-function device to a guest without passing function 0, which
means a Linux guest won't find them.

Change Linux PCI probing so it scans all function numbers when
running as a guest over Jailhouse.

This is technically prohibited by the spec, so it is possible that
PCI devices without the Multi-Function Device bit set may have
unexpected behavior in response to this probe.

Derived from original patch by Benedikt Spranger.

CC: Benedikt Spranger 
Signed-off-by: Jan Kiszka 
Acked-by: Bjorn Helgaas 
Reviewed-by: Andy Shevchenko 
---
 arch/x86/pci/legacy.c |  4 +++-
 drivers/pci/probe.c   | 22 +++++++++++++++++++---
 2 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/arch/x86/pci/legacy.c b/arch/x86/pci/legacy.c
index 1cb01abcb1be..dfbe6ac38830 100644
--- a/arch/x86/pci/legacy.c
+++ b/arch/x86/pci/legacy.c
@@ -4,6 +4,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 /*
@@ -34,13 +35,14 @@ int __init pci_legacy_init(void)
 
 void pcibios_scan_specific_bus(int busn)
 {
+	int stride = jailhouse_paravirt() ? 1 : 8;
 	int devfn;
 	u32 l;
 
 	if (pci_find_bus(0, busn))
 		return;
 
-	for (devfn = 0; devfn < 256; devfn += 8) {
+	for (devfn = 0; devfn < 256; devfn += stride) {
 		if (!raw_pci_read(0, busn, devfn, PCI_VENDOR_ID, 2, &l) &&
 		    l != 0x0000 && l != 0xffff) {
 			DBG("Found device at %02x:%02x [%04x]\n", busn, devfn, l);
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index ef5377438a1e..3c365dc996e7 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include "pci.h"
@@ -2518,14 +2519,29 @@ static unsigned int pci_scan_child_bus_extend(struct pci_bus *bus,
 {
 	unsigned int used_buses, normal_bridges = 0, hotplug_bridges = 0;
 	unsigned int start = bus->busn_res.start;
-	unsigned int devfn, cmax, max = start;
+	unsigned int devfn, fn, cmax, max = start;
 	struct pci_dev *dev;
+	int nr_devs;
 
 	dev_dbg(&bus->dev, "scanning bus\n");
 
 	/* Go find them, Rover! */
-	for (devfn = 0; devfn < 0x100; devfn += 8)
-		pci_scan_slot(bus, devfn);
+	for (devfn = 0; devfn < 256; devfn += 8) {
+		nr_devs = pci_scan_slot(bus, devfn);
+
+		/*
+		 * The Jailhouse hypervisor may pass individual functions of a
+		 * multi-function device to a guest without passing function 0.
+		 * Look for them as well.
+		 */
+		if (jailhouse_paravirt() && nr_devs == 0) {
+			for (fn = 1; fn < 8; fn++) {
+				dev = pci_scan_single_device(bus, devfn + fn);
+				if (dev)
+					dev->multifunction = 1;
+			}
+		}
+	}
 
 	/* Reserve buses for SR-IOV capability */
 	used_buses = pci_iov_bus_range(bus);
-- 
2.13.6
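
The effect of the probe.c change above can be illustrated with a small userspace model of one bus. This is a sketch only: `struct sim_bus` and the scan helpers are hypothetical stand-ins for the real PCI core, config space reads are replaced by a presence table, and ARI and SR-IOV are not modeled.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_DEVFN 256

/* Simulated bus: which functions the hypervisor exposes to the guest. */
struct sim_bus {
	bool present[NR_DEVFN];
	bool multifunction[NR_DEVFN];	/* Multi-Function Device bit of function 0 */
};

/* devfn packs the slot number in bits 7:3 and the function in bits 2:0. */
static unsigned int devfn_of(unsigned int slot, unsigned int fn)
{
	assert(slot < 32 && fn < 8);
	return (slot << 3) | fn;
}

/* Conventional slot scan: function 0 first, the rest only if flagged. */
static int scan_slot(const struct sim_bus *bus, unsigned int devfn)
{
	int found;

	if (!bus->present[devfn])
		return 0;
	found = 1;
	if (bus->multifunction[devfn])
		for (unsigned int fn = 1; fn < 8; fn++)
			if (bus->present[devfn + fn])
				found++;
	return found;
}

/* Bus scan; with jailhouse set, empty slots are probed at functions 1..7 too. */
static int scan_bus(const struct sim_bus *bus, bool jailhouse)
{
	int total = 0;

	for (unsigned int devfn = 0; devfn < NR_DEVFN; devfn += 8) {
		int nr_devs = scan_slot(bus, devfn);

		total += nr_devs;
		if (jailhouse && nr_devs == 0)
			for (unsigned int fn = 1; fn < 8; fn++)
				if (bus->present[devfn + fn])
					total++;
	}
	return total;
}
```

A function passed through without its function 0, for example slot 3 function 2, is invisible to the conventional scan in this model and is found only when the Jailhouse path is enabled.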



[PATCH v5 6/7] x86/jailhouse: Allow to use PCI_MMCONFIG without ACPI

2018-03-06 Thread Jan Kiszka
From: Jan Kiszka 

Jailhouse does not use ACPI, but it does support MMCONFIG. Make sure the
latter can be built without having to enable ACPI as well. Primarily, we
need to make the AMD mmconf-fam10h_64 depend upon MMCONFIG and ACPI,
instead of just the former.

Saves some bytes in the Jailhouse non-root kernel.

Signed-off-by: Jan Kiszka 
---
 arch/x86/Kconfig          | 6 +++++-
 arch/x86/kernel/Makefile  | 2 +-
 arch/x86/kernel/cpu/amd.c | 2 +-
 3 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 8986a6b6e3df..b53340e71f84 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2643,7 +2643,7 @@ config PCI_DIRECT
 config PCI_MMCONFIG
 	bool "Support mmconfig PCI config space access" if X86_64
 	default y
-	depends on PCI && (ACPI || SFI)
+	depends on PCI && (ACPI || SFI || JAILHOUSE_GUEST)
 	depends on X86_64 || (PCI_GOANY || PCI_GOMMCONFIG)
 
 config PCI_OLPC
@@ -2659,6 +2659,10 @@ config PCI_DOMAINS
 	def_bool y
 	depends on PCI
 
+config MMCONF_FAM10H
+	def_bool y
+	depends on X86_64 && PCI_MMCONFIG && ACPI
+
 config PCI_CNB20LE_QUIRK
 	bool "Read CNB20LE Host Bridge Windows" if EXPERT
 	depends on PCI
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 29786c87e864..73ccf80c09a2 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -146,6 +146,6 @@ ifeq ($(CONFIG_X86_64),y)
 	obj-$(CONFIG_GART_IOMMU)	+= amd_gart_64.o aperture_64.o
 	obj-$(CONFIG_CALGARY_IOMMU)	+= pci-calgary_64.o tce_64.o
 
-	obj-$(CONFIG_PCI_MMCONFIG)	+= mmconf-fam10h_64.o
+	obj-$(CONFIG_MMCONF_FAM10H)	+= mmconf-fam10h_64.o
 	obj-y				+= vsmp_64.o
 endif
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index f0e6456ca7d3..12bc0a1139da 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -716,7 +716,7 @@ static void init_amd_k8(struct cpuinfo_x86 *c)
 
 static void init_amd_gh(struct cpuinfo_x86 *c)
 {
-#ifdef CONFIG_X86_64
+#ifdef CONFIG_MMCONF_FAM10H
 	/* do this for boot cpu */
 	if (c == &boot_cpu_data)
 		check_enable_amd_mmconf_dmi();
-- 
2.13.6



[PATCH v5 2/7] PCI: Scan all functions when running over Jailhouse

2018-03-06 Thread Jan Kiszka
From: Jan Kiszka 

Per PCIe r4.0, sec 7.5.1.1.9, multi-function devices are required to
have a function 0.  Therefore, Linux scans for devices at function 0
(devfn 0/8/16/...) and only scans for other functions if function 0
has its Multi-Function Device bit set or ARI or SR-IOV indicate
there are more functions.

The Jailhouse hypervisor may pass individual functions of a
multi-function device to a guest without passing function 0, which
means a Linux guest won't find them.

Change Linux PCI probing so it scans all function numbers when
running as a guest over Jailhouse.

This is technically prohibited by the spec, so it is possible that
PCI devices without the Multi-Function Device bit set may have
unexpected behavior in response to this probe.

Derived from original patch by Benedikt Spranger.

CC: Benedikt Spranger 
Signed-off-by: Jan Kiszka 
Acked-by: Bjorn Helgaas 
Reviewed-by: Andy Shevchenko 
---
 arch/x86/pci/legacy.c |  4 +++-
 drivers/pci/probe.c   | 22 +++---
 2 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/arch/x86/pci/legacy.c b/arch/x86/pci/legacy.c
index 1cb01abcb1be..dfbe6ac38830 100644
--- a/arch/x86/pci/legacy.c
+++ b/arch/x86/pci/legacy.c
@@ -4,6 +4,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 /*
@@ -34,13 +35,14 @@ int __init pci_legacy_init(void)
 
 void pcibios_scan_specific_bus(int busn)
 {
+   int stride = jailhouse_paravirt() ? 1 : 8;
int devfn;
u32 l;
 
if (pci_find_bus(0, busn))
return;
 
-   for (devfn = 0; devfn < 256; devfn += 8) {
+   for (devfn = 0; devfn < 256; devfn += stride) {
if (!raw_pci_read(0, busn, devfn, PCI_VENDOR_ID, 2, &l) &&
l != 0x0000 && l != 0xffff) {
DBG("Found device at %02x:%02x [%04x]\n", busn, devfn, l);
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index ef5377438a1e..3c365dc996e7 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include "pci.h"
@@ -2518,14 +2519,29 @@ static unsigned int pci_scan_child_bus_extend(struct pci_bus *bus,
 {
unsigned int used_buses, normal_bridges = 0, hotplug_bridges = 0;
unsigned int start = bus->busn_res.start;
-   unsigned int devfn, cmax, max = start;
+   unsigned int devfn, fn, cmax, max = start;
struct pci_dev *dev;
+   int nr_devs;
 
dev_dbg(&bus->dev, "scanning bus\n");
 
/* Go find them, Rover! */
-   for (devfn = 0; devfn < 0x100; devfn += 8)
-   pci_scan_slot(bus, devfn);
+   for (devfn = 0; devfn < 256; devfn += 8) {
+   nr_devs = pci_scan_slot(bus, devfn);
+
+   /*
+* The Jailhouse hypervisor may pass individual functions of a
+* multi-function device to a guest without passing function 0.
+* Look for them as well.
+*/
+   if (jailhouse_paravirt() && nr_devs == 0) {
+   for (fn = 1; fn < 8; fn++) {
+   dev = pci_scan_single_device(bus, devfn + fn);
+   if (dev)
+   dev->multifunction = 1;
+   }
+   }
+   }
 
/* Reserve buses for SR-IOV capability */
used_buses = pci_iov_bus_range(bus);
-- 
2.13.6
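
Editor's note: the scan change described above can be tried out in user space. The following is a minimal sketch, not the kernel implementation. struct sim_bus, scan_slot() and scan_bus() are hypothetical stand-ins for the real bus, pci_scan_slot() and the loop in pci_scan_child_bus_extend(), and the model assumes that a slot scan reports nothing when function 0 is absent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* devfn packs a 5-bit device (slot) number and a 3-bit function number. */
#define DEVFN(slot, fn)  ((((slot) & 0x1f) << 3) | ((fn) & 0x07))

/* Hypothetical model of one bus: which devfns answer a config read. */
struct sim_bus {
	bool present[256];
	bool multifunction[256];	/* only meaningful for function 0 */
};

/*
 * Stand-in for a slot scan: probe function 0 and look at functions 1..7
 * only if function 0 exists and reports itself as multi-function.
 */
static int scan_slot(const struct sim_bus *bus, int devfn)
{
	int nr = 0;

	assert(bus != NULL && (devfn & 0x07) == 0);

	if (!bus->present[devfn])
		return 0;

	nr = 1;
	if (bus->multifunction[devfn])
		for (int fn = 1; fn < 8; fn++)
			nr += bus->present[devfn + fn];

	return nr;
}

/*
 * Bus scan. With jailhouse == true, an empty slot scan is followed by a
 * probe of functions 1..7, mirroring the fallback added by the patch.
 */
static int scan_bus(const struct sim_bus *bus, bool jailhouse)
{
	int found = 0;

	for (int devfn = 0; devfn < 256; devfn += 8) {
		int nr_devs = scan_slot(bus, devfn);

		if (jailhouse && nr_devs == 0)
			for (int fn = 1; fn < 8; fn++)
				nr_devs += bus->present[devfn + fn];

		found += nr_devs;
	}

	return found;
}
```

With only function 1 of device 2 present, scan_bus(&bus, false) finds nothing while scan_bus(&bus, true) finds one function. Slots that do have a function 0 are scanned exactly once in both modes, which matches the "avoided duplicate scans" item in the v3 changelog.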



[PATCH v5 0/7] jailhouse: Enhance secondary Jailhouse guest support /wrt PCI

2018-03-06 Thread Jan Kiszka
Basic x86 support [1] for running Linux as a secondary Jailhouse [2] guest
is currently pending in the tip tree. This series builds on top of it and
enhances PCI support for x86 and ARM guests (ARM[64] does not require
platform patches and already works).

Key elements of this series are:
 - detection of Jailhouse via device tree hypervisor node
 - function-level PCI scan if Jailhouse is detected
 - MMCONFIG support for x86 guests

As most changes affect x86, I would suggest routing the series via tip as
well, once the necessary acks are collected.

Changes in v5:
 - fix build breakage of patch 6 on i386

Changes in v4:
 - split up Kconfig changes
 - respect pcibios_last_bus during mmconfig setup
 - cosmetic changes requested by Andy

Changes in v3:
 - avoided duplicate scans of PCI functions under Jailhouse
 - reformatted PCI_MMCONFIG condition and rephrased related commit log

Changes in v2:
 - adjusted commit log and include ordering in patch 2
 - rebased over Linus master

Jan

[1] https://lkml.org/lkml/2017/11/27/125
[2] http://jailhouse-project.org

CC: Benedikt Spranger 
CC: Juergen Gross 
CC: Mark Rutland 
CC: Otavio Pontes 
CC: Rob Herring 

Jan Kiszka (6):
  jailhouse: Provide detection for non-x86 systems
  PCI: Scan all functions when running over Jailhouse
  x86: Align x86_64 PCI_MMCONFIG with 32-bit variant
  x86: Consolidate PCI_MMCONFIG configs
  x86/jailhouse: Allow to use PCI_MMCONFIG without ACPI
  MAINTAINERS: Add entry for Jailhouse

Otavio Pontes (1):
  x86/jailhouse: Enable PCI mmconfig access in inmates

 Documentation/devicetree/bindings/jailhouse.txt |  8 
 MAINTAINERS |  7 +++
 arch/x86/Kconfig| 12 +++-
 arch/x86/include/asm/jailhouse_para.h   |  2 +-
 arch/x86/include/asm/pci_x86.h  |  2 ++
 arch/x86/kernel/Makefile|  2 +-
 arch/x86/kernel/cpu/amd.c   |  2 +-
 arch/x86/kernel/jailhouse.c |  8 
 arch/x86/pci/legacy.c   |  4 +++-
 arch/x86/pci/mmconfig-shared.c  |  4 ++--
 drivers/pci/probe.c | 22 +++---
 include/linux/hypervisor.h  | 17 +++--
 12 files changed, 74 insertions(+), 16 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/jailhouse.txt

-- 
2.13.6



[PATCH v5 7/7] MAINTAINERS: Add entry for Jailhouse

2018-03-06 Thread Jan Kiszka
From: Jan Kiszka 

Signed-off-by: Jan Kiszka 
---
 MAINTAINERS | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 4623caf8d72d..6dc0b8f3ae0e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -7523,6 +7523,13 @@ Q:   http://patchwork.linuxtv.org/project/linux-media/list/
 S: Maintained
 F: drivers/media/dvb-frontends/ix2505v*
 
+JAILHOUSE HYPERVISOR INTERFACE
+M: Jan Kiszka 
+L: jailhouse-...@googlegroups.com
+S: Maintained
+F: arch/x86/kernel/jailhouse.c
+F: arch/x86/include/asm/jailhouse_para.h
+
 JC42.4 TEMPERATURE SENSOR DRIVER
 M: Guenter Roeck 
 L: linux-hw...@vger.kernel.org
-- 
2.13.6



[PATCH v5 1/7] jailhouse: Provide detection for non-x86 systems

2018-03-06 Thread Jan Kiszka
From: Jan Kiszka 

Implement jailhouse_paravirt() via device tree probing on architectures
!= x86. Will be used by the PCI core.

CC: Rob Herring 
CC: Mark Rutland 
CC: Juergen Gross 
Signed-off-by: Jan Kiszka 
Reviewed-by: Juergen Gross 
---
 Documentation/devicetree/bindings/jailhouse.txt |  8 
 arch/x86/include/asm/jailhouse_para.h   |  2 +-
 include/linux/hypervisor.h  | 17 +++--
 3 files changed, 24 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/jailhouse.txt

diff --git a/Documentation/devicetree/bindings/jailhouse.txt b/Documentation/devicetree/bindings/jailhouse.txt
new file mode 100644
index ..2901c25ff340
--- /dev/null
+++ b/Documentation/devicetree/bindings/jailhouse.txt
@@ -0,0 +1,8 @@
+Jailhouse non-root cell device tree bindings
+
+
+When running in a non-root Jailhouse cell (partition), the device tree of this
+platform shall have a top-level "hypervisor" node with the following
+properties:
+
+- compatible = "jailhouse,cell"
diff --git a/arch/x86/include/asm/jailhouse_para.h b/arch/x86/include/asm/jailhouse_para.h
index 875b54376689..b885a961a150 100644
--- a/arch/x86/include/asm/jailhouse_para.h
+++ b/arch/x86/include/asm/jailhouse_para.h
@@ -1,7 +1,7 @@
 /* SPDX-License-Identifier: GPL2.0 */
 
 /*
- * Jailhouse paravirt_ops implementation
+ * Jailhouse paravirt detection
  *
  * Copyright (c) Siemens AG, 2015-2017
  *
diff --git a/include/linux/hypervisor.h b/include/linux/hypervisor.h
index b19563f9a8eb..fc08b433c856 100644
--- a/include/linux/hypervisor.h
+++ b/include/linux/hypervisor.h
@@ -8,15 +8,28 @@
  */
 
 #ifdef CONFIG_X86
+
+#include 
 #include 
+
 static inline void hypervisor_pin_vcpu(int cpu)
 {
x86_platform.hyper.pin_vcpu(cpu);
 }
-#else
+
+#else /* !CONFIG_X86 */
+
+#include 
+
 static inline void hypervisor_pin_vcpu(int cpu)
 {
 }
-#endif
+
+static inline bool jailhouse_paravirt(void)
+{
+   return of_find_compatible_node(NULL, NULL, "jailhouse,cell");
+}
+
+#endif /* !CONFIG_X86 */
 
 #endif /* __LINUX_HYPEVISOR_H */
-- 
2.13.6
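
Editor's note: the detection above relies on a "compatible" match against "jailhouse,cell". As a rough illustration of what such a match involves, here is a user-space sketch that checks a raw "compatible" property, which is a sequence of NUL-terminated strings. compatible_matches() and jailhouse_cell_detected() are hypothetical helpers, not kernel API. The real of_find_compatible_node() walks the whole device tree and returns a node pointer, which this sketch does not model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if any string in the property equals 'wanted'.
 * 'prop' holds 'len' bytes made of back-to-back NUL-terminated strings.
 */
static bool compatible_matches(const char *prop, size_t len, const char *wanted)
{
	size_t off = 0;

	assert(prop != NULL && wanted != NULL);

	while (off < len) {
		const char *entry = prop + off;
		const char *end = memchr(entry, '\0', len - off);

		if (!end)
			break;	/* unterminated tail, ignore it */

		if (strcmp(entry, wanted) == 0)
			return true;

		off += (size_t)(end - entry) + 1;
	}

	return false;
}

/* Hypothetical counterpart of jailhouse_paravirt() for a single node. */
static bool jailhouse_cell_detected(const char *prop, size_t len)
{
	return compatible_matches(prop, len, "jailhouse,cell");
}
```

A property holding "vendor,board" followed by "jailhouse,cell" matches, while "jailhouse,cellx" does not, because entries are compared as whole strings rather than by prefix.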



[PATCH v4 3/3] security: Add an example sample dynamic LSM

2018-03-06 Thread Sargun Dhillon
This adds an example LSM that uses the features added by the
dynamically loadable LSMs patch. While the module is loaded, it prevents
the user from running:
date --set="October 21 2015 16:29:00 PDT"
Once the module is unloaded, the command is allowed again.

Signed-off-by: Sargun Dhillon 
---
 samples/Kconfig   |  6 ++
 samples/Makefile  |  2 +-
 samples/lsm/Makefile  |  4 
 samples/lsm/lsm_example.c | 33 +
 4 files changed, 44 insertions(+), 1 deletion(-)
 create mode 100644 samples/lsm/Makefile
 create mode 100644 samples/lsm/lsm_example.c

diff --git a/samples/Kconfig b/samples/Kconfig
index c332a3b9de05..022242c0b50b 100644
--- a/samples/Kconfig
+++ b/samples/Kconfig
@@ -117,4 +117,10 @@ config SAMPLE_STATX
help
  Build example userspace program to use the new extended-stat syscall.
 
+config SAMPLE_DYNAMIC_LSM
+   tristate "Build LSM examples -- loadable modules only"
+   depends on SECURITY_DYNAMIC_HOOKS && m
+   help
+ This builds an example dynamic LSM
+
 endif # SAMPLES
diff --git a/samples/Makefile b/samples/Makefile
index db54e766ddb1..9d23835d6e6d 100644
--- a/samples/Makefile
+++ b/samples/Makefile
@@ -3,4 +3,4 @@
 obj-$(CONFIG_SAMPLES)  += kobject/ kprobes/ trace_events/ livepatch/ \
   hw_breakpoint/ kfifo/ kdb/ hidraw/ rpmsg/ seccomp/ \
   configfs/ connector/ v4l/ trace_printk/ blackfin/ \
-  vfio-mdev/ statx/
+  vfio-mdev/ statx/ lsm/
diff --git a/samples/lsm/Makefile b/samples/lsm/Makefile
new file mode 100644
index ..d4ccb940f18b
--- /dev/null
+++ b/samples/lsm/Makefile
@@ -0,0 +1,4 @@
+# builds the loadable LSM example kernel modules;
+# then to use one (as root):  insmod 
+# and to unload: rmmod module_name
+obj-$(CONFIG_SAMPLE_DYNAMIC_LSM) += lsm_example.o
diff --git a/samples/lsm/lsm_example.c b/samples/lsm/lsm_example.c
new file mode 100644
index ..95c56ebd4d16
--- /dev/null
+++ b/samples/lsm/lsm_example.c
@@ -0,0 +1,33 @@
+/*
+ * This sample hooks into the "settime"
+ *
+ * Once you run it, the following will not be allowed:
+ * date --set="October 21 2015 16:29:00 PDT"
+ */
+
+#include 
+#include 
+#include 
+
+static int settime_cb(const struct timespec *ts, const struct timezone *tz)
+{
+   /* We aren't allowed to travel to October 21 2015 16:29 PDT */
+   if (ts->tv_sec >= 1445470140 && ts->tv_sec < 1445470200)
+   return -EPERM;
+
+   return 0;
+}
+
+static struct security_hook_list sample_hooks[] = {
+   LSM_HOOK_INIT(settime, settime_cb),
+};
+
+static int __init lsm_init(void)
+{
+   return security_add_dynamic_hooks(sample_hooks,
+ ARRAY_SIZE(sample_hooks),
+ "sample");
+}
+
+module_init(lsm_init)
+MODULE_LICENSE("GPL");
-- 
2.14.1
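
Editor's note: the time window used by settime_cb() can be checked outside the kernel. The sketch below copies the two constants from the patch; settime_check() and utc_hour_minute() are hypothetical user-space helpers. PDT is UTC-7, so 16:29 PDT corresponds to 23:29 UTC.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <time.h>

/* Window boundaries as used in the sample LSM (seconds since the epoch). */
#define BLOCK_START ((time_t)1445470140)
#define BLOCK_END   ((time_t)1445470200)

/* Same comparison as the sample callback: reject the one-minute window. */
static int settime_check(time_t sec)
{
	if (sec >= BLOCK_START && sec < BLOCK_END)
		return -EPERM;

	return 0;
}

/* Decode a timestamp in UTC so the window can be sanity-checked. */
static int utc_hour_minute(time_t sec, int *hour, int *minute)
{
	struct tm *tm;

	assert(hour != NULL && minute != NULL);

	tm = gmtime(&sec);
	if (!tm)
		return -1;

	*hour = tm->tm_hour;
	*minute = tm->tm_min;
	return 0;
}
```

The window is half-open: 1445470140 is rejected and 1445470200 is accepted, so exactly the 60 seconds of 16:29 PDT are blocked.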



[PATCH v4 0/3] Safe, dynamically loadable LSM hooks

2018-03-06 Thread Sargun Dhillon
This patchset introduces safe dynamic LSM support. These are currently
not unloadable, until we figure out a use case that needs that. Adding
an unload hook is trivial given the way the patch is written.

This exposes a second mechanism of loading hooks which are in modules.
These hooks are behind static keys, so they should come at low performance
overhead. The built-in hook heads are read-only, whereas the dynamic hooks
are mutable.

Not all hooks can be loaded into. Some hooks are blacklisted, and therefore
trying to load a module which plugs into those hooks will fail.

One of the big benefits of loadable security modules is that they help
with "unknown unknowns". Although livepatch is excellent, sometimes a
surgical LSM is simpler.

It includes an example LSM that prevents specific time travel.

Changes since v3:
  * re-added hook blacklist
  * return error, rather than panic if unable to allocate memory

Changes since v2:
  * inode get/set security is readded
  * xfrm singleton hook readded
  * Security hooks are turned into an array
  * Security hooks and dynamic hooks enum is collapsed

Changes since v1:
  * It no longer allows unloading of modules
  * prctl is fixed
  * inode get/set security is removed
  * xfrm singleton hook removed


Sargun Dhillon (3):
  security: Refactor LSM hooks into an array and enum
  security: Expose a mechanism to load lsm hooks dynamically at runtime
  security: Add an example sample dynamic LSM

 include/linux/lsm_hooks.h | 459 --
 samples/Kconfig   |   6 +
 samples/Makefile  |   2 +-
 samples/lsm/Makefile  |   4 +
 samples/lsm/lsm_example.c |  33 
 security/Kconfig  |   9 +
 security/inode.c  |  13 +-
 security/security.c   | 222 --
 8 files changed, 508 insertions(+), 240 deletions(-)
 create mode 100644 samples/lsm/Makefile
 create mode 100644 samples/lsm/lsm_example.c

-- 
2.14.1
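
Editor's note: the dispatch order described in this cover letter (built-in hooks first, then dynamic hooks behind a per-hook switch, with some hooks refusing dynamic registration) can be modelled in user space. This is a sketch under stated assumptions, not the patch's code. All names are hypothetical, a plain bool stands in for the static key, the choice of blacklisted hook is arbitrary, and the -EPERM returned for a blacklisted hook is an assumption, since the posting does not say which error is used.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum hook_idx { HOOK_SETTIME, HOOK_INODE_GETSECURITY, MAX_HOOK };

typedef int (*hook_fn)(long arg);

#define MAX_PER_HOOK 4

struct hook_list {
	hook_fn fns[MAX_PER_HOOK];
	size_t count;
};

static struct hook_list builtin_hooks[MAX_HOOK];	/* read-only after init in the kernel */
static struct hook_list dynamic_hooks[MAX_HOOK];	/* stays writable */
static bool dynamic_enabled[MAX_HOOK];			/* stand-in for a per-hook static key */

/* Hypothetical blacklist: which hooks refuse dynamic registration. */
static const bool blacklisted[MAX_HOOK] = {
	[HOOK_INODE_GETSECURITY] = true,
};

static int add_hook(struct hook_list *list, hook_fn fn)
{
	assert(list != NULL && fn != NULL);

	if (list->count == MAX_PER_HOOK)
		return -ENOMEM;

	list->fns[list->count++] = fn;
	return 0;
}

static int add_builtin_hook(enum hook_idx idx, hook_fn fn)
{
	return add_hook(&builtin_hooks[idx], fn);
}

static int add_dynamic_hook(enum hook_idx idx, hook_fn fn)
{
	int ret;

	if (blacklisted[idx])
		return -EPERM;	/* assumed error code */

	ret = add_hook(&dynamic_hooks[idx], fn);
	if (!ret)
		dynamic_enabled[idx] = true;	/* "flip the static key" */

	return ret;
}

/* Run built-in hooks first; consult dynamic hooks only if enabled. */
static int call_hook(enum hook_idx idx, long arg)
{
	for (size_t i = 0; i < builtin_hooks[idx].count; i++) {
		int ret = builtin_hooks[idx].fns[i](arg);

		if (ret)
			return ret;
	}

	if (!dynamic_enabled[idx])
		return 0;

	for (size_t i = 0; i < dynamic_hooks[idx].count; i++) {
		int ret = dynamic_hooks[idx].fns[i](arg);

		if (ret)
			return ret;
	}

	return 0;
}

/* Two sample callbacks, used only to make the ordering observable. */
static int deny_odd(long arg)
{
	return (arg % 2 != 0) ? -EACCES : 0;
}

static int deny_big(long arg)
{
	return (arg > 100) ? -EPERM : 0;
}
```

With deny_odd registered as built-in and deny_big as dynamic on HOOK_SETTIME, an argument of 101 is rejected by the built-in hook before the dynamic one is consulted, while 102 passes the built-in hook and is then rejected by the dynamic one. Until a dynamic hook is registered, the dynamic list is never walked, which is the property the static keys are meant to provide cheaply.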



[PATCH v4 2/3] security: Expose a mechanism to load lsm hooks dynamically at runtime

2018-03-06 Thread Sargun Dhillon
This patch adds dynamic security hooks. These hooks are designed to allow
for safe runtime loading.

These hooks run only after all built-in and major LSMs have run.
The LSMs enabled by this feature must be minor LSMs, but they can poke
at the security blobs, as the blobs should be initialized by the time
their callback happens.

There should be little runtime performance overhead for this feature,
as it's protected behind static_keys, which are disabled by default,
and are only enabled per-hook at runtime, when a module is loaded.

Currently, the hook heads for dynamic hooks are kept separate, because
they are not read-only like the heads of the built-in hooks.

Some hooks are blacklisted, and attempting to load an LSM with any
of them in use will fail.

Signed-off-by: Sargun Dhillon 
---
 include/linux/lsm_hooks.h |  26 +-
 security/Kconfig  |   9 +++
 security/inode.c  |  13 ++-
 security/security.c   | 198 --
 4 files changed, 234 insertions(+), 12 deletions(-)

diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index d28c7f5b01c1..4e6351957dff 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -28,6 +28,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /**
  * union security_list_options - Linux Security Module hook function list
@@ -1968,6 +1969,9 @@ struct security_hook_list {
enum lsm_hook   head_idx;
union security_list_options hook;
char*lsm;
+#ifdef CONFIG_SECURITY_DYNAMIC_HOOKS
+   struct module   *owner;
+#endif
 } __randomize_layout;
 
 /*
@@ -1976,11 +1980,29 @@ struct security_hook_list {
  * care of the common case and reduces the amount of
  * text involved.
  */
+#ifdef CONFIG_SECURITY_DYNAMIC_HOOKS
+#define LSM_HOOK_INIT(HEAD, HOOK)  \
+   {   \
+   .head_idx = HOOK_IDX(HEAD), \
+   .hook = { .HEAD = HOOK },   \
+   .owner = THIS_MODULE,   \
+   }
+
+#else
 #define LSM_HOOK_INIT(HEAD, HOOK) \
{ .head_idx = HOOK_IDX(HEAD), .hook = { .HEAD = HOOK } }
+#endif
 
-extern char *lsm_names;
-
+#ifdef CONFIG_SECURITY_DYNAMIC_HOOKS
+extern int security_add_dynamic_hooks(struct security_hook_list *hooks,
+ int count, char *lsm);
+#else
+static inline int security_add_dynamic_hooks(struct security_hook_list *hooks,
+int count, char *lsm)
+{
+   return -EOPNOTSUPP;
+}
+#endif
 extern void security_add_hooks(struct security_hook_list *hooks, int count,
char *lsm);
 
diff --git a/security/Kconfig b/security/Kconfig
index c4302067a3ad..481b93b0d4d9 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -36,6 +36,15 @@ config SECURITY_WRITABLE_HOOKS
bool
default n
 
+config SECURITY_DYNAMIC_HOOKS
+   bool "Runtime loadable (minor) LSMs via LKMs"
+   depends on SECURITY && SRCU
+   help
+ This enables LSMs which live in LKMs, and supports loading, and
+ unloading them safely at runtime. These LSMs must be minor LSMs.
+ They cannot circumvent the built-in LSMs.
+ If you are unsure how to answer this question, answer N.
+
 config SECURITYFS
bool "Enable the securityfs filesystem"
help
diff --git a/security/inode.c b/security/inode.c
index 8dd9ca8848e4..89be07b044a5 100644
--- a/security/inode.c
+++ b/security/inode.c
@@ -22,6 +22,10 @@
 #include 
 #include 
 #include 
+#include 
+
+extern char *lsm_names;
+extern struct mutex lsm_lock;
 
 static struct vfsmount *mount;
 static int mount_count;
@@ -312,8 +316,13 @@ static struct dentry *lsm_dentry;
 static ssize_t lsm_read(struct file *filp, char __user *buf, size_t count,
loff_t *ppos)
 {
-   return simple_read_from_buffer(buf, count, ppos, lsm_names,
-   strlen(lsm_names));
+   ssize_t ret;
+
+   mutex_lock(&lsm_lock);
+   ret = simple_read_from_buffer(buf, count, ppos, lsm_names,
+ strlen(lsm_names));
+   mutex_unlock(&lsm_lock);
+   return ret;
 }
 
 static const struct file_operations lsm_ops = {
diff --git a/security/security.c b/security/security.c
index b9fb297b824e..492d44dd0549 100644
--- a/security/security.c
+++ b/security/security.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define MAX_LSM_EVM_XATTR  2
 
@@ -36,10 +37,18 @@
 #define SECURITY_NAME_MAX  10
 
 static struct list_head security_hook_heads[__MAX_LSM_HOOK] __lsm_ro_after_init;
-static ATOMIC_NOTIFIER_HEAD(lsm_notifier_chain);
-
 #define HOOK_HEAD(NAME) (&security_hook_heads[HOOK_IDX(NAME)])
 
+#ifdef CONFIG_SECURITY_DYNAMIC_HOOKS
+static struct list_head dynamic_security_hook_heads[__MAX_LSM_HOOK];
+static struct srcu_struct dynamic_hook_srcus[__MAX_LSM_HOOK];




Re: WARNING: kmalloc bug in memdup_user

2018-03-06 Thread Leon Romanovsky
On Tue, Mar 06, 2018 at 10:59:02PM -0800, syzbot wrote:
> Hello,
>
> syzbot hit the following crash on upstream commit
> ce380619fab99036f5e745c7a865b21c59f005f6 (Tue Mar 6 04:31:14 2018 +)
> Merge tag 'please-pull-ia64_misc' of
> git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux
>
> So far this crash happened 52 times on upstream.
> C reproducer is attached.
> syzkaller reproducer is attached.
> Raw console output is attached.
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached.
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+a38b0e9f694c379ca...@syzkaller.appspotmail.com
> It will help syzbot understand when the bug is fixed. See footer for
> details.
> If you forward the report, please keep this part and the footer.
>
> audit: type=1400 audit(1520367364.281:6): avc:  denied  { map } for
> pid=4138 comm="bash" path="/bin/bash" dev="sda1" ino=1457
> scontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023
> tcontext=system_u:object_r:file_t:s0 tclass=file permissive=1
> audit: type=1400 audit(1520367370.605:7): avc:  denied  { map } for
> pid=4152 comm="syzkaller100190" path="/root/syzkaller100190328" dev="sda1"
> ino=16481 scontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023
> tcontext=unconfined_u:object_r:user_home_t:s0 tclass=file permissive=1
> WARNING: CPU: 0 PID: 4152 at mm/slab_common.c:1012 kmalloc_slab+0x5d/0x70
> mm/slab_common.c:1012
> Kernel panic - not syncing: panic_on_warn set ...
>
> CPU: 0 PID: 4152 Comm: syzkaller100190 Not tainted 4.16.0-rc4+ #343
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:17 [inline]
>  dump_stack+0x194/0x24d lib/dump_stack.c:53
>  panic+0x1e4/0x41c kernel/panic.c:183
>  __warn+0x1dc/0x200 kernel/panic.c:547
>  report_bug+0x211/0x2d0 lib/bug.c:184
>  fixup_bug.part.11+0x37/0x80 arch/x86/kernel/traps.c:178
>  fixup_bug arch/x86/kernel/traps.c:247 [inline]
>  do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296
>  do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
>  invalid_op+0x1b/0x40 arch/x86/entry/entry_64.S:986
> RIP: 0010:kmalloc_slab+0x5d/0x70 mm/slab_common.c:1012
> RSP: 0018:8801bf76f970 EFLAGS: 00010246
> RAX:  RBX: fff4 RCX: 819733cb
> RDX: 8423372f RSI:  RDI: 3efef4b4
> RBP: 8801bf76f970 R08:  R09: 
> R10: 88613380 R11:  R12: 3efef4b4
> R13: 2080 R14: 014200c0 R15: 8801bf76fa68
>  __do_kmalloc mm/slab.c:3700 [inline]
>  __kmalloc_track_caller+0x21/0x760 mm/slab.c:3720
>  memdup_user+0x2c/0x90 mm/util.c:160
>  ucma_set_option+0x11f/0x4d0 drivers/infiniband/core/ucma.c:1297
>  ucma_write+0x2d6/0x3d0 drivers/infiniband/core/ucma.c:1627
>  __vfs_write+0xef/0x970 fs/read_write.c:480
>  vfs_write+0x189/0x510 fs/read_write.c:544
>  SYSC_write fs/read_write.c:589 [inline]
>  SyS_write+0xef/0x220 fs/read_write.c:581
>  do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
>  entry_SYSCALL_64_after_hwframe+0x42/0xb7
> RIP: 0033:0x43fe69
> RSP: 002b:7ffe099a6388 EFLAGS: 0217 ORIG_RAX: 0001
> RAX: ffda RBX: 004002c8 RCX: 0043fe69
> RDX: 006b RSI: 20c0 RDI: 0003
> RBP: 006ca018 R08: 004002c8 R09: 004002c8
> R10: 004002c8 R11: 0217 R12: 00401790
> R13: 00401820 R14:  R15: 
> Dumping ftrace buffer:
>(ftrace buffer empty)
> Kernel Offset: disabled
> Rebooting in 86400 seconds..

I'm surprised that it surfaced only now.
It is a clear bug: the user's input wasn't checked.
But it is not clear to me why optval wasn't declared as u64.
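
To make the missing check concrete, here is a minimal userspace sketch of
validating a caller-supplied length before allocating and copying. It is
not the ucma fix. demo_dup_checked and DEMO_MAX_OPTLEN are hypothetical
names, and the bound is an arbitrary value chosen for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical upper bound; a real fix would pick a limit that fits the option. */
#define DEMO_MAX_OPTLEN 4096u

/*
 * Userspace stand-in for "duplicate a caller-supplied buffer": reject a
 * zero or oversized length before any allocation is attempted.
 * On success returns 0 and stores a heap copy in *out (caller frees it).
 */
static int demo_dup_checked(const void *src, size_t len, void **out)
{
	void *copy;

	assert(out != NULL);
	*out = NULL;
	if (src == NULL || len == 0 || len > DEMO_MAX_OPTLEN)
		return -EINVAL;

	copy = malloc(len);
	if (copy == NULL)
		return -ENOMEM;

	memcpy(copy, src, len);
	*out = copy;
	return 0;
}
```

With the bound in place, an absurd length is rejected with -EINVAL at the
entry point and never reaches the allocator.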

Thanks


signature.asc
Description: PGP signature


[PATCH v4 1/3] security: Refactor LSM hooks into an array and enum

2018-03-06 Thread Sargun Dhillon
This commit should have no functional change. It changes the security hook
list heads struct into an array. Additionally, it exposes all of the hooks
via an enum. This loses memory layout randomization as the enum is not
randomized.

Signed-off-by: Sargun Dhillon 
---
 include/linux/lsm_hooks.h | 433 +++---
 security/security.c   |  30 ++--
 2 files changed, 233 insertions(+), 230 deletions(-)
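
For readers skimming the diff, the shape of the refactor can be sketched
in a few lines of standalone C. This is an illustration under assumptions,
not the patch itself: the demo_ and DEMO_ names are made up, only three
hooks are listed, and the token-pasting definition of the index macro is a
guess at how a HOOK_IDX-style macro could work.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for the kernel's struct list_head. */
struct demo_list_head {
	struct demo_list_head *next, *prev;
};

/* Before: one named member per hook. */
struct demo_hook_heads_old {
	struct demo_list_head capget;
	struct demo_list_head capset;
	struct demo_list_head capable;
};

/* After: an enum names each slot and the heads live in a plain array. */
enum demo_lsm_hook {
	DEMO_LSM_HOOK_capget,
	DEMO_LSM_HOOK_capset,
	DEMO_LSM_HOOK_capable,
	DEMO_MAX_LSM_HOOK,
};

static struct demo_list_head demo_hook_heads[DEMO_MAX_LSM_HOOK];

/* Token pasting maps a hook name to its enum index, then to its slot. */
#define DEMO_HOOK_IDX(NAME)  DEMO_LSM_HOOK_##NAME
#define DEMO_HOOK_HEAD(NAME) (&demo_hook_heads[DEMO_HOOK_IDX(NAME)])

/* An array lets every head be initialised as an empty list in one loop. */
static void demo_init_hook_heads(void)
{
	for (size_t i = 0; i < (size_t)DEMO_MAX_LSM_HOOK; i++) {
		demo_hook_heads[i].next = &demo_hook_heads[i];
		demo_hook_heads[i].prev = &demo_hook_heads[i];
	}
}

/* Returns 1 when no callback is attached to the given hook. */
static int demo_hook_is_empty(enum demo_lsm_hook idx)
{
	assert((int)idx >= 0 && (int)idx < DEMO_MAX_LSM_HOOK);
	return demo_hook_heads[idx].next == &demo_hook_heads[idx];
}
```

The array form is what allows code to loop over all hooks or index them at
runtime. The cost noted in the commit message is that enum order is fixed,
so the layout can no longer be randomized the way struct members can.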

diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index 7161d8e7ee79..d28c7f5b01c1 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -1729,241 +1729,243 @@ union security_list_options {
 #endif /* CONFIG_BPF_SYSCALL */
 };
 
-struct security_hook_heads {
-   struct list_head binder_set_context_mgr;
-   struct list_head binder_transaction;
-   struct list_head binder_transfer_binder;
-   struct list_head binder_transfer_file;
-   struct list_head ptrace_access_check;
-   struct list_head ptrace_traceme;
-   struct list_head capget;
-   struct list_head capset;
-   struct list_head capable;
-   struct list_head quotactl;
-   struct list_head quota_on;
-   struct list_head syslog;
-   struct list_head settime;
-   struct list_head vm_enough_memory;
-   struct list_head bprm_set_creds;
-   struct list_head bprm_check_security;
-   struct list_head bprm_committing_creds;
-   struct list_head bprm_committed_creds;
-   struct list_head sb_alloc_security;
-   struct list_head sb_free_security;
-   struct list_head sb_copy_data;
-   struct list_head sb_remount;
-   struct list_head sb_kern_mount;
-   struct list_head sb_show_options;
-   struct list_head sb_statfs;
-   struct list_head sb_mount;
-   struct list_head sb_umount;
-   struct list_head sb_pivotroot;
-   struct list_head sb_set_mnt_opts;
-   struct list_head sb_clone_mnt_opts;
-   struct list_head sb_parse_opts_str;
-   struct list_head dentry_init_security;
-   struct list_head dentry_create_files_as;
+enum lsm_hook {
+   LSM_HOOK_binder_set_context_mgr,
+   LSM_HOOK_binder_transaction,
+   LSM_HOOK_binder_transfer_binder,
+   LSM_HOOK_binder_transfer_file,
+   LSM_HOOK_ptrace_access_check,
+   LSM_HOOK_ptrace_traceme,
+   LSM_HOOK_capget,
+   LSM_HOOK_capset,
+   LSM_HOOK_capable,
+   LSM_HOOK_quotactl,
+   LSM_HOOK_quota_on,
+   LSM_HOOK_syslog,
+   LSM_HOOK_settime,
+   LSM_HOOK_vm_enough_memory,
+   LSM_HOOK_bprm_set_creds,
+   LSM_HOOK_bprm_check_security,
+   LSM_HOOK_bprm_committing_creds,
+   LSM_HOOK_bprm_committed_creds,
+   LSM_HOOK_sb_alloc_security,
+   LSM_HOOK_sb_free_security,
+   LSM_HOOK_sb_copy_data,
+   LSM_HOOK_sb_remount,
+   LSM_HOOK_sb_kern_mount,
+   LSM_HOOK_sb_show_options,
+   LSM_HOOK_sb_statfs,
+   LSM_HOOK_sb_mount,
+   LSM_HOOK_sb_umount,
+   LSM_HOOK_sb_pivotroot,
+   LSM_HOOK_sb_set_mnt_opts,
+   LSM_HOOK_sb_clone_mnt_opts,
+   LSM_HOOK_sb_parse_opts_str,
+   LSM_HOOK_dentry_init_security,
+   LSM_HOOK_dentry_create_files_as,
 #ifdef CONFIG_SECURITY_PATH
-   struct list_head path_unlink;
-   struct list_head path_mkdir;
-   struct list_head path_rmdir;
-   struct list_head path_mknod;
-   struct list_head path_truncate;
-   struct list_head path_symlink;
-   struct list_head path_link;
-   struct list_head path_rename;
-   struct list_head path_chmod;
-   struct list_head path_chown;
-   struct list_head path_chroot;
+   LSM_HOOK_path_unlink,
+   LSM_HOOK_path_mkdir,
+   LSM_HOOK_path_rmdir,
+   LSM_HOOK_path_mknod,
+   LSM_HOOK_path_truncate,
+   LSM_HOOK_path_symlink,
+   LSM_HOOK_path_link,
+   LSM_HOOK_path_rename,
+   LSM_HOOK_path_chmod,
+   LSM_HOOK_path_chown,
+   LSM_HOOK_path_chroot,
 #endif
-   struct list_head inode_alloc_security;
-   struct list_head inode_free_security;
-   struct list_head inode_init_security;
-   struct list_head inode_create;
-   struct list_head inode_link;
-   struct list_head inode_unlink;
-   struct list_head inode_symlink;
-   struct list_head inode_mkdir;
-   struct list_head inode_rmdir;
-   struct list_head inode_mknod;
-   struct list_head inode_rename;
-   struct list_head inode_readlink;
-   struct list_head inode_follow_link;
-   struct list_head inode_permission;
-   struct list_head inode_setattr;
-   struct list_head inode_getattr;
-   struct list_head inode_setxattr;
-   struct list_head inode_post_setxattr;
-   struct list_head inode_getxattr;
-   struct list_head inode_listxattr;
-   struct list_head inode_removexattr;
-   struct list_head inode_need_killpriv;
-   struct list_head inode_killpriv;
-   struct list_head inode_getsecurity;


Re: [PATCH 4.4 054/108] mtd: cfi: convert inline functions to macros

2018-03-06 Thread Boris Brezillon
On Mon, 05 Mar 2018 02:22:52 +
Ben Hutchings  wrote:

> On Thu, 2018-02-15 at 16:16 +0100, Greg Kroah-Hartman wrote:
> > 4.4-stable review patch.  If anyone has any objections, please let me know.
> > 
> > --
> > 
> > From: Arnd Bergmann 
> > 
> > commit 9e343e87d2c4c707ef8fae2844864d4dde3a2d13 upstream.  
> [...]
> > -static inline int map_word_andequal(struct map_info *map, map_word val1, 
> > map_word val2, map_word val3)
> > -{
> > -   int i;
> > -
> > -   for (i = 0; i < map_words(map); i++) {
> > -   if ((val1.x[i] & val2.x[i]) != val3.x[i])
> > -   return 0;
> > -   }
> > -
> > -   return 1;
> > -}  
> [...]
> > +#define map_word_andequal(map, val1, val2, val3)   \
> > +({ \
> > +   int i, ret = 1; \
> > +   for (i = 0; i < map_words(map); i++) {  \
> > +   if (((val1).x[i] & (val2).x[i]) != (val2).x[i]) {   \  
> [...]
> 
> The right-hand side of this comparison is now using val2 instead of
> val3.  (This bug seems to be unfixed upstream.)

Indeed. This being said, it's not buggy since all users of
map_word_andequal() pass the same value to val2 and val3.

Maybe we should just patch the macro and all call-sites to remove val3.
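
A small standalone sketch of the difference being discussed, assuming a
simplified map_word with a fixed word count (demo_map_word, demo_andequal
and demo_andequal_as_quoted are made-up names, not the MTD definitions):

```c
#include <assert.h>

#define DEMO_MAX_WORDS 4

/* Simplified stand-in for the MTD map_word type. */
typedef struct {
	unsigned long x[DEMO_MAX_WORDS];
} demo_map_word;

/* Original inline semantics: (val1 & val2) == val3, word by word. */
static int demo_andequal(int nwords, demo_map_word val1,
			 demo_map_word val2, demo_map_word val3)
{
	assert(nwords >= 0 && nwords <= DEMO_MAX_WORDS);
	for (int i = 0; i < nwords; i++) {
		if ((val1.x[i] & val2.x[i]) != val3.x[i])
			return 0;
	}
	return 1;
}

/* Semantics of the macro as quoted: val2 on the right-hand side, val3 unused. */
static int demo_andequal_as_quoted(int nwords, demo_map_word val1,
				   demo_map_word val2, demo_map_word val3)
{
	(void)val3;
	assert(nwords >= 0 && nwords <= DEMO_MAX_WORDS);
	for (int i = 0; i < nwords; i++) {
		if ((val1.x[i] & val2.x[i]) != val2.x[i])
			return 0;
	}
	return 1;
}
```

The two versions agree whenever val2 and val3 are the same value, which is
why existing callers are unaffected. They diverge as soon as a caller
passes a mask in val2 and a different expected value in val3.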

> 
> Ben.
> 



-- 
Boris Brezillon, Bootlin (formerly Free Electrons)
Embedded Linux and Kernel engineering
https://bootlin.com


Re: [PATCH V2 1/2] mmc: sdhci-msm: Add support to store supported vdd-io voltages

2018-03-06 Thread Vijay Viswanath

Hi Doug, Jeremy,

On 3/3/2018 4:38 AM, Jeremy McNicoll wrote:

On 2018-03-02 10:23 AM, Doug Anderson wrote:

Hi,

On Sun, Feb 11, 2018 at 10:01 PM, Vijay Viswanath
 wrote:

During probe check whether the vdd-io regulator of sdhc platform device
can support 1.8V and 3V and store this information as a capability of
platform device.

Signed-off-by: Vijay Viswanath 
---
  drivers/mmc/host/sdhci-msm.c | 38 
++

  1 file changed, 38 insertions(+)

diff --git a/drivers/mmc/host/sdhci-msm.c b/drivers/mmc/host/sdhci-msm.c
index c283291..5c23e92 100644
--- a/drivers/mmc/host/sdhci-msm.c
+++ b/drivers/mmc/host/sdhci-msm.c
@@ -23,6 +23,7 @@
  #include 

  #include "sdhci-pltfm.h"
+#include 


This is a strange sort order for this include file.  Why is it after
the local include?



  #define CORE_MCI_VERSION   0x50
  #define CORE_VERSION_MAJOR_SHIFT   28
@@ -81,6 +82,9 @@
  #define CORE_HC_SELECT_IN_HS400    (6 << 19)
  #define CORE_HC_SELECT_IN_MASK (7 << 19)

+#define CORE_3_0V_SUPPORT  (1 << 25)
+#define CORE_1_8V_SUPPORT  (1 << 26)
+


Is there something magical about 25 and 26?  This is a new caps field,
so I'd have expected 0 and 1.




Yes, these bit positions match the corresponding bits in the 
Capabilities Register (offset 0x40). The bit positions become important 
when the capabilities register doesn't show support for some voltages 
that we can in fact support. In that case we will have to fake the 
capabilities. The changes for that have not been pushed up yet.



  #define CORE_CSR_CDC_CTLR_CFG0 0x130
  #define CORE_SW_TRIG_FULL_CALIB    BIT(16)
  #define CORE_HW_AUTOCAL_ENA    BIT(17)
@@ -148,6 +152,7 @@ struct sdhci_msm_host {
 u32 curr_io_level;
 wait_queue_head_t pwr_irq_wait;
 bool pwr_irq_flag;
+   u32 caps_0;
  };

  static unsigned int msm_get_clock_rate_for_bus_mode(struct 
sdhci_host *host,
@@ -1313,6 +1318,35 @@ static void sdhci_msm_writeb(struct sdhci_host 
*host, u8 val, int reg)

 sdhci_msm_check_power_status(host, req_type);
  }

+static int sdhci_msm_set_regulator_caps(struct sdhci_msm_host 
*msm_host)

+{
+   struct mmc_host *mmc = msm_host->mmc;
+   struct regulator *supply = mmc->supply.vqmmc;
+   int i, count;
+   u32 caps = 0, vdd_uV;
+
+   if (!IS_ERR(mmc->supply.vqmmc)) {
+   count = regulator_count_voltages(supply);
+   if (count < 0)
+   return count;
+   for (i = 0; i < count; i++) {
+   vdd_uV = regulator_list_voltage(supply, i);
+   if (vdd_uV <= 0)
+   continue;
+   if (vdd_uV > 2700000)
+   caps |= CORE_3_0V_SUPPORT;
+   if (vdd_uV < 1950000)
+   caps |= CORE_1_8V_SUPPORT;
+   }


Shouldn't you be using regulator_is_supported_voltage() rather than
open coding?  Also: I've never personally worked on a device where it
was used, but there is definitely a concept floating about of a
voltage level of 1.2V.  Maybe should copy the ranges from
mmc_regulator_set_vqmmc()?




regulator_is_supported_voltage() checks for a range and it also uses 
regulator_list_voltage() internally. regulator_list_voltage() is also an 
exported API for use by drivers AFAIK. Please correct if it is not.



Also: seems like you should have some way to deal with "caps" ending
up w/ no bits set.  IIRC you can have a regulator that can be enabled
/ disabled but doesn't list a voltage, so if someone messed up their
device tree you could end up in this case.  Should you print a
warning?  ...or treat it as if we support "3.0V"?  ...or ?  I guess it
depends on how do you want patch #2 to behave in that case.


Both, initialize it to sane value and print something.  This way at
least you have a good chance of booting and not hard hanging and you
are given a reasonable message indicating what needs to be fixed.

-jeremy





+   }


How should things behave if vqmmc is an error?  In that case is it
important to not set "CORE_IO_PAD_PWR_SWITCH_EN" in patch set #2?
...or should you set "CORE_IO_PAD_PWR_SWITCH_EN" but then make sure
you don't set "CORE_IO_PAD_PWR_SWITCH"?




Thanks for the suggestion. If the regulator exists but doesn't list any 
voltages, then I believe initialization itself will not happen. We will 
not have any available OCR, and sdhci_setup_host should fail.
But these enhancements can be incorporated. Since this patch is already 
acknowledged, I will incorporate these changes in a subsequent patch.
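
As a sketch of the fallback suggested in review (set a sane default and
print something when no voltage bit ends up set), here is a userspace
model of the loop. It is not the driver code: demo_caps_from_voltages and
the DEMO_ names are hypothetical, the array stands in for what
regulator_list_voltage() would return, and the 2.7 V and 1.95 V thresholds
as well as the 3.0 V default are assumptions. The value is kept signed so
that negative error returns are skipped as well as zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define DEMO_3_0V_SUPPORT (1u << 25)
#define DEMO_1_8V_SUPPORT (1u << 26)

/*
 * voltages_uV models the per-selector values a regulator would list, in
 * microvolts. Entries <= 0 (unsupported selector or error) are skipped.
 */
static uint32_t demo_caps_from_voltages(const int *voltages_uV, size_t count)
{
	uint32_t caps = 0;

	assert(voltages_uV != NULL || count == 0);
	for (size_t i = 0; i < count; i++) {
		int uV = voltages_uV[i];

		if (uV <= 0)
			continue;
		if (uV > 2700000)
			caps |= DEMO_3_0V_SUPPORT;
		if (uV < 1950000)
			caps |= DEMO_1_8V_SUPPORT;
	}

	/* Fallback: never return an empty mask, and say why. */
	if (caps == 0) {
		fprintf(stderr, "no usable voltage listed, assuming 3.0V\n");
		caps = DEMO_3_0V_SUPPORT;
	}
	return caps;
}
```

A list containing 1800000 and 2950000 yields both bits, while an empty or
all-invalid list falls back to the 3.0 V bit with a warning instead of
leaving the mask at zero.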



+   msm_host->caps_0 |= caps;
+   pr_debug("%s: %s: supported caps: 0x%08x\n", mmc_hostname(mmc),
+   __func__, caps);
+
+   return 0;
+}
+
+
  static const struct of_device_id sdhci_msm_dt_match[] = {
 { .compatible = "qcom,sdhci-msm-v4" },
 {},

Re: [PATCH 1/3] vfio/pci: Pull BAR mapping setup from read-write path

2018-03-06 Thread Peter Xu
On Wed, Feb 28, 2018 at 01:14:46PM -0700, Alex Williamson wrote:
> This creates a common helper that we'll use for ioeventfd setup.
> 
> Signed-off-by: Alex Williamson 

Reviewed-by: Peter Xu 

-- 
Peter Xu


Re: [PATCH v6] mmc: Export host capabilities to debugfs.

2018-03-06 Thread Harish Jenny K N


On Wednesday 07 March 2018 12:10 PM, Avri Altman wrote:
>
>> -Original Message-
>> From: Harish Jenny K N [mailto:harish_kand...@mentor.com]
>> Sent: Wednesday, March 07, 2018 7:38 AM
>> To: ulf.hans...@linaro.org; linus.wall...@linaro.org;
>> adrian.hun...@intel.com; shawn@rock-chips.com; Avri Altman
>> ; andriy.shevche...@linux.intel.com
>> Cc: linux-...@vger.kernel.org; linux-kernel@vger.kernel.org;
>> harish_kand...@mentor.com; vladimir_zapols...@mentor.com
>> Subject: [PATCH v6] mmc: Export host capabilities to debugfs.
>>
>> This patch exports the host capabilities to debugfs
>>
>> This idea of sharing host capabilities over debugfs came up from Abbas Raza
>>  Earlier discussions:
>> https://lkml.org/lkml/2018/3/5/357
>> https://www.spinics.net/lists/linux-mmc/msg48219.html
>>
>> Signed-off-by: Harish Jenny K N 
>> ---
>>
>>
>> +static int mmc_caps_show(struct seq_file *s, void *unused) {
>> +struct mmc_host *host = s->private;
>> +u32 caps = host->caps;
>> +
>> +seq_puts(s, "\nMMC Host capabilities are:\n");
>> +seq_puts(s,
>> "=\n");
>> +seq_printf(s, "Can the host do 4 bit transfers :\t%s\n",
>> +   ((caps & MMC_CAP_4_BIT_DATA) ? "Yes" : "No"));
> Maybe use a more compact form, and just call a macro with the applicable 
> (stringified) bit?

Something like this ?

#define YN(bit) ((caps & bit) ? "Yes" : "No")
and then call
seq_printf(s, "Can the host do 4 bit transfers :\t%s\n", 
YN(MMC_CAP_4_BIT_DATA));
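
A standalone sketch of that macro idea, with two adjustments that are
suggestions rather than anything from the patch: the argument is
parenthesised, and caps is passed explicitly instead of being picked up
from the enclosing scope. The DEMO_ names, the bit values, and the 8 bit
line are made up for the example, and snprintf() stands in for
seq_printf().

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Stand-in capability bits; the values are illustrative only. */
#define DEMO_CAP_4_BIT_DATA (1u << 0)
#define DEMO_CAP_8_BIT_DATA (1u << 6)

/* Parenthesise both arguments so that (A | B) is tested as a whole. */
#define DEMO_YN(caps, bit) (((caps) & (bit)) ? "Yes" : "No")

/* Formats one line per capability into buf; returns snprintf()'s result. */
static int demo_format_caps(char *buf, size_t len, uint32_t caps)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len,
			"Can the host do 4 bit transfers :\t%s\n"
			"Can the host do 8 bit transfers :\t%s\n",
			DEMO_YN(caps, DEMO_CAP_4_BIT_DATA),
			DEMO_YN(caps, DEMO_CAP_8_BIT_DATA));
}
```

Without the inner parentheses, an argument such as A | B would expand to
caps & A | B, which C parses as (caps & A) | B and which is non-zero
whenever B is.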



Thanks,
Harish Jenny K N


Re: [PATCHv2 2/5] x86/boot/compressed/64: Find a place for 32-bit trampoline

2018-03-06 Thread Ingo Molnar

* Kirill A. Shutemov  wrote:

> On Tue, Feb 27, 2018 at 06:42:14PM +0300, Kirill A. Shutemov wrote:
> > If a bootloader enables 64-bit mode with 4-level paging, we might need to
> > switch over to 5-level paging. The switching requires the disabling of
> > paging, which works fine if kernel itself is loaded below 4G.
> > 
> > But if the bootloader puts the kernel above 4G (not sure if anybody does
> > this), we would lose control as soon as paging is disabled, because the
> > code becomes unreachable to the CPU.
> > 
> > To handle the situation, we need a trampoline in lower memory that would
> > take care of switching on 5-level paging.
> > 
> > This patch finds a spot in low memory for a trampoline.
> > 
> > The heuristic is based on code in reserve_bios_regions().
> > 
> > We find the end of low memory based on BIOS and EBDA start addresses.
> > The trampoline is put just before the end of low memory. This mimics the
> > approach taken to allocate memory for the real-mode trampoline.
> > 
> > Signed-off-by: Kirill A. Shutemov 
> > Tested-by: Borislav Petkov 
> > ---
> >  arch/x86/boot/compressed/misc.c       |  4 
> >  arch/x86/boot/compressed/pgtable.h    | 11 +++
> >  arch/x86/boot/compressed/pgtable_64.c | 34 ++
> >  3 files changed, 49 insertions(+)
> >  create mode 100644 arch/x86/boot/compressed/pgtable.h
> > 
> > diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
> > index b50c42455e25..e58409667b13 100644
> > --- a/arch/x86/boot/compressed/misc.c
> > +++ b/arch/x86/boot/compressed/misc.c
> > @@ -14,6 +14,7 @@
> >  
> >  #include "misc.h"
> >  #include "error.h"
> > +#include "pgtable.h"
> >  #include "../string.h"
> >  #include "../voffset.h"
> >  
> > @@ -372,6 +373,9 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
> > debug_putaddr(output_len);
> > debug_putaddr(kernel_total_size);
> >  
> > +   /* Report address of 32-bit trampoline */
> > +   debug_putaddr(trampoline_32bit);
> > +
> > /*
> >  * The memory hole needed for the kernel is the larger of either
> >  * the entire decompressed kernel plus relocation table, or the
> 
> 0-day found problem with the patch on 32-bit config.
> 
> Here's fixup:
> 
> diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
> index e58409667b13..8e4b55dd5df9 100644
> --- a/arch/x86/boot/compressed/misc.c
> +++ b/arch/x86/boot/compressed/misc.c
> @@ -373,8 +373,10 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
>   debug_putaddr(output_len);
>   debug_putaddr(kernel_total_size);
>  
> +#ifdef CONFIG_X86_64
>   /* Report address of 32-bit trampoline */
>   debug_putaddr(trampoline_32bit);
> +#endif

The prototype of trampoline_32bit should be in an #ifdef as well, as the
variable only exists on 64-bit kernels.

Thanks,

Ingo
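
As a rough sketch of that suggestion (not the actual follow-up patch), both the declaration and the use sit under the same guard, so a 32-bit build never sees the symbol. The variable's type, its definition in this file, and the helper `count_reported_addrs` are assumptions made to keep the example self-contained; CONFIG_X86_64 is expected to come from the build (for example -DCONFIG_X86_64).

```c
#include <stddef.h>

#ifdef CONFIG_X86_64
/* Declared only where the variable exists; stand-in for the header prototype. */
extern unsigned long *trampoline_32bit;
/* Definition placed here only so the sketch links on its own. */
unsigned long *trampoline_32bit;
#endif

/* Hypothetical helper: returns how many addresses a debug dump would report. */
static int count_reported_addrs(void)
{
	int n = 2;	/* output_len and kernel_total_size are always reported */

#ifdef CONFIG_X86_64
	(void)trampoline_32bit;	/* referenced only where it is declared */
	n++;
#endif
	return n;
}
```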


[PATCH] scsi: jazz_esp, sun3x_esp: Pass struct device pointer in dma calls

2018-03-06 Thread Finn Thain
In jazz_esp and sun3x_esp, the esp_driver_ops methods pass esp->dev
in dma api calls as if it was a pointer to a struct device. But
it actually points to a struct platform_device. Fix this.

Cc: Thomas Bogendoerfer 
Signed-off-by: Finn Thain 
---
 drivers/scsi/jazz_esp.c  | 2 +-
 drivers/scsi/sun3x_esp.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/jazz_esp.c b/drivers/scsi/jazz_esp.c
index 9aaa74e349cc..6eb5ff3e2e61 100644
--- a/drivers/scsi/jazz_esp.c
+++ b/drivers/scsi/jazz_esp.c
@@ -147,7 +147,7 @@ static int esp_jazz_probe(struct platform_device *dev)
esp = shost_priv(host);
 
esp->host = host;
-   esp->dev = dev;
+   esp->dev = &dev->dev;
esp->ops = &jazz_esp_ops;
 
res = platform_get_resource(dev, IORESOURCE_MEM, 0);
diff --git a/drivers/scsi/sun3x_esp.c b/drivers/scsi/sun3x_esp.c
index d50c5ed8f428..0b1421cdf8a0 100644
--- a/drivers/scsi/sun3x_esp.c
+++ b/drivers/scsi/sun3x_esp.c
@@ -210,7 +210,7 @@ static int esp_sun3x_probe(struct platform_device *dev)
esp = shost_priv(host);
 
esp->host = host;
-   esp->dev = dev;
+   esp->dev = &dev->dev;
esp->ops = &sun3x_esp_ops;
 
res = platform_get_resource(dev, IORESOURCE_MEM, 0);
-- 
2.16.1
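
To illustrate why the two pointers are not interchangeable, here is a small sketch with simplified stand-in structures (the field sets are illustrative, not the kernel definitions). Because the struct device is embedded and is not the first member, the address of the platform device and the address of its `dev` member differ; `to_dev` and `dev_offset` are hypothetical helper names.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel structures. */
struct device {
	const char *init_name;
};

struct platform_device {
	const char *name;
	int id;
	struct device dev;	/* embedded, and not the first member */
};

/* What the fixed probe code stores: the address of the embedded member. */
static struct device *to_dev(struct platform_device *pdev)
{
	return &pdev->dev;
}

/* Byte distance between the two addresses; non-zero in this layout. */
static size_t dev_offset(void)
{
	return offsetof(struct platform_device, dev);
}
```

Code that receives the outer pointer but treats it as a struct device would read the wrong bytes, which is the kind of mismatch the patch description refers to.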



[tip:x86/pti] objtool: Fix 32-bit build

2018-03-06 Thread tip-bot for Josh Poimboeuf
Commit-ID:  63474dc4ac7ed3848a4786b9592dd061901f606d
Gitweb: https://git.kernel.org/tip/63474dc4ac7ed3848a4786b9592dd061901f606d
Author: Josh Poimboeuf 
AuthorDate: Tue, 6 Mar 2018 17:58:15 -0600
Committer:  Ingo Molnar 
CommitDate: Wed, 7 Mar 2018 07:50:38 +0100

objtool: Fix 32-bit build

Fix the objtool build when cross-compiling a 64-bit kernel on a 32-bit
host.  This also simplifies read_retpoline_hints() a bit and makes its
implementation similar to most of the other annotation reading
functions.

Reported-by: Sven Joachim 
Signed-off-by: Josh Poimboeuf 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Fixes: b5bc2231b8ad ("objtool: Add retpoline validation")
Link: http://lkml.kernel.org/r/2ca46c636c23aa9c9d57d53c75de4ee3ddf7a7df.1520380691.git.jpoim...@redhat.com
Signed-off-by: Ingo Molnar 
---
 tools/objtool/check.c | 27 +++
 1 file changed, 7 insertions(+), 20 deletions(-)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 46c1d239cc1b..92b6a2c21631 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1116,42 +1116,29 @@ static int read_unwind_hints(struct objtool_file *file)
 
 static int read_retpoline_hints(struct objtool_file *file)
 {
-   struct section *sec, *relasec;
+   struct section *sec;
struct instruction *insn;
struct rela *rela;
-   int i;
 
-   sec = find_section_by_name(file->elf, ".discard.retpoline_safe");
+   sec = find_section_by_name(file->elf, ".rela.discard.retpoline_safe");
if (!sec)
return 0;
 
-   relasec = sec->rela;
-   if (!relasec) {
-   WARN("missing .rela.discard.retpoline_safe section");
-   return -1;
-   }
-
-   if (sec->len % sizeof(unsigned long)) {
-   WARN("retpoline_safe size mismatch: %d %ld", sec->len, sizeof(unsigned long));
-   return -1;
-   }
-
-   for (i = 0; i < sec->len / sizeof(unsigned long); i++) {
-   rela = find_rela_by_dest(sec, i * sizeof(unsigned long));
-   if (!rela) {
-   WARN("can't find rela for retpoline_safe[%d]", i);
+   list_for_each_entry(rela, &sec->rela_list, list) {
+   if (rela->sym->type != STT_SECTION) {
+   WARN("unexpected relocation symbol type in %s", sec->name);
return -1;
}
 
insn = find_insn(file, rela->sym->sec, rela->addend);
if (!insn) {
-   WARN("can't find insn for retpoline_safe[%d]", i);
+   WARN("bad .discard.retpoline_safe entry");
return -1;
}
 
if (insn->type != INSN_JUMP_DYNAMIC &&
insn->type != INSN_CALL_DYNAMIC) {
-   WARN_FUNC("retpoline_safe hint not a indirect jump/call",
+   WARN_FUNC("retpoline_safe hint not an indirect jump/call",
  insn->sec, insn->offset);
return -1;
}
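
The removed WARN() printed a sizeof() result with %ld. The commit message does not spell out the exact build error, so take this as an assumption: sizeof yields size_t, which need not be long on a 32-bit host, and a format check can then reject the call. The patch sidesteps the issue by dropping the size check; the sketch below only shows the portable conversion, %zu, in a hypothetical helper `format_mismatch`.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Format a size-mismatch message. %zu matches size_t on both 32-bit and
 * 64-bit hosts, and %u matches the unsigned int length.
 */
static int format_mismatch(char *buf, size_t buflen,
			   unsigned int len, size_t entry_size)
{
	return snprintf(buf, buflen, "size mismatch: %u %zu", len, entry_size);
}
```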


[GIT PULL] s390 patches for 4.16-rc5

2018-03-06 Thread Martin Schwidefsky
Hi Linus,

please pull from the 'for-linus' branch of

git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git for-linus

to receive the following updates:

Nine bug fixes for s390:

 * Three fixes for the expoline code. One of them is strictly speaking
   a cleanup, but as it relates to code added with 4.16 I would like to
   include the patch.

 * Three timer-related fixes in the common I/O layer.

 * A fix for the handling of internal DASD requests, which could cause panics.

 * One correction to the accounting of pud page tables vs.
   compat tasks.

 * The register scrubbing in entry.S caused spurious crashes; this is
   fixed now as well.

Christian Borntraeger (1):
  s390/entry.S: fix spurious zeroing of r0

Eugeniu Rosca (1):
  s390: Replace IS_ENABLED(EXPOLINE_*) with IS_ENABLED(CONFIG_EXPOLINE_*)

Guenter Roeck (1):
  s390: Fix runtime warning about negative pgtables_bytes

Hendrik Brueckner (1):
  s390/clean-up: use CFI_* macros in entry.S

Martin Schwidefsky (1):
  s390: do not bypass BPENTER for interrupt system calls

Sebastian Ott (3):
  s390/cio: fix ccw_device_start_timeout API
  s390/cio: fix return code after missing interrupt
  s390/cio: clear timer when terminating driver I/O

Stefan Haberland (1):
  s390/dasd: fix handling of internal requests

 arch/s390/include/asm/mmu_context.h |  1 +
 arch/s390/kernel/entry.S            | 10 +++---
 arch/s390/kernel/nospec-branch.c    |  4 +--
 drivers/s390/block/dasd.c           | 21 ---
 drivers/s390/cio/device_fsm.c       |  7 ++--
 drivers/s390/cio/device_ops.c       | 72 +
 drivers/s390/cio/io_sch.h           |  1 +
 7 files changed, 54 insertions(+), 62 deletions(-)

diff --git a/arch/s390/include/asm/mmu_context.h b/arch/s390/include/asm/mmu_context.h
index 65154ea..6c8ce15 100644
--- a/arch/s390/include/asm/mmu_context.h
+++ b/arch/s390/include/asm/mmu_context.h
@@ -63,6 +63,7 @@ static inline int init_new_context(struct task_struct *tsk,
   _ASCE_USER_BITS | _ASCE_TYPE_SEGMENT;
/* pgd_alloc() did not account this pmd */
mm_inc_nr_pmds(mm);
+   mm_inc_nr_puds(mm);
}
crst_table_init((unsigned long *) mm->pgd, pgd_entry_type(mm));
return 0;
diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S
index 13a133a..a5621ea 100644
--- a/arch/s390/kernel/entry.S
+++ b/arch/s390/kernel/entry.S
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -230,7 +231,7 @@ _PIF_WORK   = (_PIF_PER_TRAP | _PIF_SYSCALL_RESTART)
.hidden \name
.type \name,@function
 \name:
-   .cfi_startproc
+   CFI_STARTPROC
 #ifdef CONFIG_HAVE_MARCH_Z10_FEATURES
exrl    0,0f
 #else
@@ -239,7 +240,7 @@ _PIF_WORK   = (_PIF_PER_TRAP | _PIF_SYSCALL_RESTART)
 #endif
j   .
 0: br  \reg
-   .cfi_endproc
+   CFI_ENDPROC
.endm
 
GEN_BR_THUNK __s390x_indirect_jump_r1use_r9,%r9,%r1
@@ -426,13 +427,13 @@ ENTRY(system_call)
UPDATE_VTIME %r8,%r9,__LC_SYNC_ENTER_TIMER
BPENTER __TI_flags(%r12),_TIF_ISOLATE_BP
stmg    %r0,%r7,__PT_R0(%r11)
-   # clear user controlled register to prevent speculative use
-   xgr %r0,%r0
mvc __PT_R8(64,%r11),__LC_SAVE_AREA_SYNC
mvc __PT_PSW(16,%r11),__LC_SVC_OLD_PSW
mvc __PT_INT_CODE(4,%r11),__LC_SVC_ILC
stg %r14,__PT_FLAGS(%r11)
 .Lsysc_do_svc:
+   # clear user controlled register to prevent speculative use
+   xgr %r0,%r0
# load address of system call table
lg  %r10,__THREAD_sysc_table(%r13,%r12)
llgh    %r8,__PT_INT_CODE+2(%r11)
@@ -1439,6 +1440,7 @@ cleanup_critical:
stg %r15,__LC_SYSTEM_TIMER
 0: # update accounting time stamp
mvc __LC_LAST_UPDATE_TIMER(8),__LC_SYNC_ENTER_TIMER
+   BPENTER __TI_flags(%r12),_TIF_ISOLATE_BP
# set up saved register r11
lg  %r15,__LC_KERNEL_STACK
la  %r9,STACK_FRAME_OVERHEAD(%r15)
diff --git a/arch/s390/kernel/nospec-branch.c b/arch/s390/kernel/nospec-branch.c
index 69d7fcf..9aff72d 100644
--- a/arch/s390/kernel/nospec-branch.c
+++ b/arch/s390/kernel/nospec-branch.c
@@ -2,8 +2,8 @@
 #include 
 #include 
 
-int nospec_call_disable = IS_ENABLED(EXPOLINE_OFF);
-int nospec_return_disable = !IS_ENABLED(EXPOLINE_FULL);
+int nospec_call_disable = IS_ENABLED(CONFIG_EXPOLINE_OFF);
+int nospec_return_disable = !IS_ENABLED(CONFIG_EXPOLINE_FULL);
 
 static int __init nospectre_v2_setup_early(char *str)
 {
diff --git a/drivers/s390/block/dasd.c b/drivers/s390/block/dasd.c
index a7c15f0..ecef8e7 100644
--- a/drivers/s390/block/dasd.c
+++ b/drivers/s390/block/dasd.c
@@ -2581,8 +2581,6 @@ int dasd_cancel_req(struct dasd_ccw_req *cqr)
case DASD_CQR_QUEUED:
/* request was not started - just set to cleared */
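
The nospec-branch.c hunk above replaces IS_ENABLED(EXPOLINE_*) with IS_ENABLED(CONFIG_EXPOLINE_*). A simplified re-creation of the preprocessor trick behind such a test shows why the missing prefix matters: a name that is not defined to 1 quietly evaluates to 0 instead of failing the build. The macro names below are made up for this sketch, and the real kernel macro additionally checks the _MODULE variant of the option.

```c
/* Expands to "0," only when pasted with a value of 1. */
#define ARG_PLACEHOLDER_1 0,
#define TAKE_SECOND_ARG(ignored, val, ...) val
/* If the option was 1 this becomes (0, 1, 0, ~), otherwise (junk 1, 0, ~). */
#define IS_DEFINED_3(arg1_or_junk) TAKE_SECOND_ARG(arg1_or_junk 1, 0, ~)
#define IS_DEFINED_2(val) IS_DEFINED_3(ARG_PLACEHOLDER_##val)
#define IS_SET(option) IS_DEFINED_2(option)

/* Assumption for the sketch: the build enabled this option. */
#define CONFIG_EXPOLINE_FULL 1

/* With the CONFIG_ prefix the option is seen as enabled. */
static int with_prefix(void)    { return IS_SET(CONFIG_EXPOLINE_FULL); }
/* Without the prefix the name is undefined and evaluates to 0. */
static int without_prefix(void) { return IS_SET(EXPOLINE_FULL); }
```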
 

Re: [pci PATCH v3 0/3] Add support for unmanaged SR-IOV

2018-03-06 Thread Christoph Hellwig
On Tue, Mar 06, 2018 at 11:29:08AM -0800, Alexander Duyck wrote:
> This series is meant to add support for SR-IOV on devices when the VFs are
> not managed by the kernel. Examples of recent patches attempting to do this
> include:
> virtio - https://patchwork.kernel.org/patch/10241225/
> pci-stub - https://patchwork.kernel.org/patch/10109935/
> vfio - https://patchwork.kernel.org/patch/10103353/
> uio - https://patchwork.kernel.org/patch/9974031/

nvme and ema seem to be existing examples.  Care to throw in
conversions while you're at it?


RE: [PATCH v6] mmc: Export host capabilities to debugfs.

2018-03-06 Thread Avri Altman


> -Original Message-
> From: Harish Jenny K N [mailto:harish_kand...@mentor.com]
> Sent: Wednesday, March 07, 2018 7:38 AM
> To: ulf.hans...@linaro.org; linus.wall...@linaro.org;
> adrian.hun...@intel.com; shawn@rock-chips.com; Avri Altman
> ; andriy.shevche...@linux.intel.com
> Cc: linux-...@vger.kernel.org; linux-kernel@vger.kernel.org;
> harish_kand...@mentor.com; vladimir_zapols...@mentor.com
> Subject: [PATCH v6] mmc: Export host capabilities to debugfs.
> 
> This patch exports the host capabilities to debugfs
> 
> This idea of sharing host capabilities over debugfs came up from Abbas Raza
>  Earlier discussions:
> https://lkml.org/lkml/2018/3/5/357
> https://www.spinics.net/lists/linux-mmc/msg48219.html
> 
> Signed-off-by: Harish Jenny K N 
> ---
> 
> 
> +static int mmc_caps_show(struct seq_file *s, void *unused) {
> + struct mmc_host *host = s->private;
> + u32 caps = host->caps;
> +
> + seq_puts(s, "\nMMC Host capabilities are:\n");
> + seq_puts(s,
> "=\n");
> + seq_printf(s, "Can the host do 4 bit transfers :\t%s\n",
> +((caps & MMC_CAP_4_BIT_DATA) ? "Yes" : "No"));

Maybe use a more compact form, and just call a macro with the applicable 
(stringified) bit?


Thanks,
Avri


[PATCH 1/1] iommu/arm-smmu: Add support for qcom,smmu-500 variant

2018-03-06 Thread Vivek Gautam
Qualcomm's arm-smmu 500 implementation supports runtime pm
so enable the same.

Signed-off-by: Vivek Gautam 
---

 Based on iommu/arm-smmu pm runtime support series [1]:
 [PATCH v8 0/5] iommu/arm-smmu: Add runtime pm/sleep support
 
 Tested on sdm845 with necessary support to enable the smmu
 and with necessary user.

 [1] https://lkml.org/lkml/2018/3/2/325

 Documentation/devicetree/bindings/iommu/arm,smmu.txt | 14 ++
 drivers/iommu/arm-smmu.c                             |  8 
 2 files changed, 22 insertions(+)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
index 6ea27bd4f785..0b5c6d2a9865 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
@@ -18,6 +18,7 @@ conditions.
 "arm,mmu-500"
 "cavium,smmu-v2"
 "qcom,<soc>-smmu-v2", "qcom,smmu-v2"
+"qcom,<soc>-smmu-500", "qcom,smmu-500"
 
   depending on the particular implementation and/or the
   version of the architecture implemented.
@@ -30,6 +31,10 @@ conditions.
   An example string would be -
   "qcom,msm8996-smmu-v2", "qcom,smmu-v2".
 
+  "qcom,smmu-500" is arm,mmu-500 implementation that supports
+  efficient power management by supporting smmu's state
+  retention.
+
 - reg   : Base address and size of the SMMU.
 
 - #global-interrupts : The number of global interrupts exposed by the
@@ -179,3 +184,12 @@ conditions.
 < SMMU_MDP_AHB_CLK>;
clock-names = "bus", "iface";
};
+
+   smmu5: iommu {
+   compatible = "qcom,sdm845-smmu-500", "qcom,smmu-500";
+   reg = <0x1500 0x8>;
+   #iommu-cells = <2>;
+   #global-interrupts = <1>;
+
+   ...
+   };
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 7a96c924ae22..7f52456c6b25 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -2008,6 +2008,12 @@ static const char * const qcom_smmuv2_clks[] = {
"bus", "iface",
 };
 
+static const struct arm_smmu_match_data qcom_smmu500 = {
+   .version = ARM_SMMU_V2,
+   .model = ARM_MMU500,
+   .rpm_supported = true,
+};
+
 static const struct arm_smmu_match_data qcom_smmuv2 = {
.version = ARM_SMMU_V2,
.model = QCOM_SMMUV2,
@@ -2024,6 +2030,7 @@ static const struct of_device_id arm_smmu_of_match[] = {
{ .compatible = "arm,mmu-500", .data = &arm_mmu500 },
{ .compatible = "cavium,smmu-v2", .data = &cavium_smmuv2 },
{ .compatible = "qcom,smmu-v2", .data = &qcom_smmuv2 },
+   { .compatible = "qcom,smmu-500", .data = &qcom_smmu500 },
{ },
 };
 MODULE_DEVICE_TABLE(of, arm_smmu_of_match);
@@ -2394,6 +2401,7 @@ IOMMU_OF_DECLARE(arm_mmu401, "arm,mmu-401");
 IOMMU_OF_DECLARE(arm_mmu500, "arm,mmu-500");
 IOMMU_OF_DECLARE(cavium_smmuv2, "cavium,smmu-v2");
 IOMMU_OF_DECLARE(qcom_smmuv2, "qcom,smmu-v2");
+IOMMU_OF_DECLARE(qcom_smmu500, "qcom,smmu-500");
 
 MODULE_DESCRIPTION("IOMMU API for ARM architected SMMU implementations");
 MODULE_AUTHOR("Will Deacon ");
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
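
The patch relies on a compatible-string match table to select per-variant data. A self-contained sketch of that lookup pattern follows, using stand-in types in place of struct of_device_id and the driver's own structures. The `lookup` helper, the table layout, and the assumption that the plain "arm,mmu-500" entry leaves runtime PM off are illustrative only.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in types; the real driver uses its own enums and match structures. */
enum smmu_model { GENERIC_SMMU, ARM_MMU500, QCOM_SMMUV2 };

struct match_data {
	enum smmu_model model;
	bool rpm_supported;
};

struct match_entry {
	const char *compatible;
	const struct match_data *data;
};

/* Same model, different runtime PM capability (an assumption of this sketch). */
static const struct match_data arm_mmu500   = { ARM_MMU500, false };
static const struct match_data qcom_smmu500 = { ARM_MMU500, true };

static const struct match_entry match_table[] = {
	{ "arm,mmu-500",   &arm_mmu500 },
	{ "qcom,smmu-500", &qcom_smmu500 },
	{ NULL, NULL },	/* sentinel terminates the table */
};

/* Return the data for a compatible string, or NULL if nothing matches. */
static const struct match_data *lookup(const char *compatible)
{
	const struct match_entry *e;

	for (e = match_table; e->compatible; e++)
		if (strcmp(e->compatible, compatible) == 0)
			return e->data;
	return NULL;
}
```

Keeping the variant differences in data rather than in code means a new compatible string needs only one table entry and one match_data instance.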



Re: [RFC] rcu: Prevent expedite reporting within RCU read-side section

2018-03-06 Thread Byungchul Park

On 3/7/2018 2:55 PM, Byungchul Park wrote:

On 3/6/2018 10:42 PM, Boqun Feng wrote:

On Tue, Mar 06, 2018 at 02:31:58PM +0900, Byungchul Park wrote:

Hello Paul and RCU folks,

I am not sure I understand this correctly or that my fix is right, but I
really wonder why sync_rcu_exp_handler() reports the quiescent state even
when the current task is within an RCU read-side section. Am I missing
something?

If I understand it correctly and you agree, I can add more logic to make
it more expedited, by boosting current or marking it urgent when we fail
to report the quiescent state on the IPI.

->8-
 From 0b0191f506c19ce331a1fdb7c2c5a00fb23fbcf2 Mon Sep 17 00:00:00 2001
From: Byungchul Park 
Date: Tue, 6 Mar 2018 13:54:41 +0900
Subject: [RFC] rcu: Prevent expedite reporting within RCU read-side section


We report the quiescent state for this CPU if it is outside any RCU
read-side section at the moment the IPI fires during the expedited
process.

However, the current code reports the quiescent state even when:

    1) the current task is still within an RCU read-side section
    2) the current task has been blocked within the RCU read-side section




If this happens, the task will queue itself in
rcu_preempt_note_context_switch() using rcu_preempt_ctxt_queue(). The gp
kthread will wait for this task to dequeue itself. IOW, we have another
mechanism to wait for this task besides the bottom-up qs reporting tree.
So I think we are fine here.


Right. Basically we consider both the quiescent state of the current
task and the queued tasks on rcu nodes that you mentioned, to control
grace periods when a PREEMPT kernel is used.

Actually my concern was whether it's safe to clear the bit of 'expmask'
on the IPI in all possible cases, even though blocked tasks would
prevent the grace period from ending anyway.

I worried that something subtle might cause problems, but the code looks
fine on second thought, as you said. Thank you for your explanation.


In addition, by reporting quiescent states and clearing bits of expmask
only when the task is outside RCU read-side sections, while keeping the
other mechanism you mentioned unchanged, I think we can avoid
unnecessary locking ops and other statements and still keep the code
sane, even though the benefit might be small.

For example, by removing some avoidable calls to rcu_report_cpu_mult(),
either directly or indirectly. I'm not sure whether the RCU maintainers
think it's worthwhile, though.


Regards,
Boqun


Since we have not reached the quiescent state yet in these cases, we
shouldn't report it but should check again later.

Signed-off-by: Byungchul Park 
---
  kernel/rcu/tree_exp.h | 12 ++--
  1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index 73e1d3d..cc69d14 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -731,13 +731,13 @@ static void sync_rcu_exp_handler(void *info)
  /*
   * We are either exiting an RCU read-side critical section (negative
   * values of t->rcu_read_lock_nesting) or are not in one at all
- * (zero value of t->rcu_read_lock_nesting).  Or we are in an RCU
- * read-side critical section that blocked before this expedited
- * grace period started.  Either way, we can immediately report
- * the quiescent state.
+ * (zero value of t->rcu_read_lock_nesting). We can immediately
+ * report the quiescent state.
   */
-    rdp = this_cpu_ptr(rsp->rda);
-    rcu_report_exp_rdp(rsp, rdp, true);
+    if (t->rcu_read_lock_nesting <= 0) {
+        rdp = this_cpu_ptr(rsp->rda);
+        rcu_report_exp_rdp(rsp, rdp, true);
+    }
  }
  /**
--
1.9.1





--
Thanks,
Byungchul
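
As a rough illustration of the condition the RFC adds (not the kernel code itself), the check can be modeled in user space. The `struct task` type and the `should_report_qs()` helper are assumptions made for this sketch. Only the `rcu_read_lock_nesting` field and its sign convention (negative while exiting a reader, zero outside one, positive inside one) come from the patch comment.

```c
#include <stdbool.h>

/* Hypothetical, simplified task state; the real field lives in task_struct. */
struct task {
	int rcu_read_lock_nesting; /* >0: inside a reader, 0: outside, <0: exiting */
};

/* Rule proposed by the RFC: report only when not inside a read-side section. */
static bool should_report_qs(const struct task *t)
{
	return t->rcu_read_lock_nesting <= 0;
}
```

As discussed in the thread, the existing unconditional report is still safe, because a reader that blocked is queued on its rcu node and holds the grace period open by that separate path.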


[PATCH v1 1/9] PCI/PM: Move pcie_clear_root_pme_status() to core

2018-03-06 Thread Bjorn Helgaas
From: Bjorn Helgaas 

Move pcie_clear_root_pme_status() from the port driver to the PCI core so
it will be available even when the port driver isn't present.  No
functional change intended.

Signed-off-by: Bjorn Helgaas 
---
 drivers/pci/pci.c  |9 +
 drivers/pci/pci.h  |1 +
 drivers/pci/pcie/portdrv.h |2 --
 drivers/pci/pcie/portdrv_pci.c |9 -
 4 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index f6a4dd10d9b0..120e3393fc35 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1683,6 +1683,15 @@ int pci_set_pcie_reset_state(struct pci_dev *dev, enum pcie_reset_state state)
 }
 EXPORT_SYMBOL_GPL(pci_set_pcie_reset_state);
 
+/**
+ * pcie_clear_root_pme_status - Clear root port PME interrupt status.
+ * @dev: PCIe root port or event collector.
+ */
+void pcie_clear_root_pme_status(struct pci_dev *dev)
+{
+   pcie_capability_set_dword(dev, PCI_EXP_RTSTA, PCI_EXP_RTSTA_PME);
+}
+
 /**
  * pci_check_pme_status - Check if given device has generated PME.
  * @dev: Device to check.
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index fcd81911b127..813ca2c895d8 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -71,6 +71,7 @@ void pci_update_current_state(struct pci_dev *dev, pci_power_t state);
 void pci_power_up(struct pci_dev *dev);
 void pci_disable_enabled_device(struct pci_dev *dev);
 int pci_finish_runtime_suspend(struct pci_dev *dev);
+void pcie_clear_root_pme_status(struct pci_dev *dev);
 int __pci_pme_wakeup(struct pci_dev *dev, void *ign);
 void pci_pme_restore(struct pci_dev *dev);
 bool pci_dev_keep_suspended(struct pci_dev *dev);
diff --git a/drivers/pci/pcie/portdrv.h b/drivers/pci/pcie/portdrv.h
index a854bc569117..a4fc44d52206 100644
--- a/drivers/pci/pcie/portdrv.h
+++ b/drivers/pci/pcie/portdrv.h
@@ -34,8 +34,6 @@ void pcie_port_bus_unregister(void);
 
 struct pci_dev;
 
-void pcie_clear_root_pme_status(struct pci_dev *dev);
-
 #ifdef CONFIG_HOTPLUG_PCI_PCIE
 extern bool pciehp_msi_disabled;
 
diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index fb1c1bb87316..4413dd85e923 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -50,15 +50,6 @@ __setup("pcie_ports=", pcie_port_setup);
 
 /* global data */
 
-/**
- * pcie_clear_root_pme_status - Clear root port PME interrupt status.
- * @dev: PCIe root port or event collector.
- */
-void pcie_clear_root_pme_status(struct pci_dev *dev)
-{
-   pcie_capability_set_dword(dev, PCI_EXP_RTSTA, PCI_EXP_RTSTA_PME);
-}
-
 static int pcie_portdrv_restore_config(struct pci_dev *dev)
 {
int retval;
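
It may look odd that a function named "clear" calls pcie_capability_set_dword(). A minimal user-space sketch, assuming the PME Status bit is write-1-to-clear (RW1C), shows why setting the bit in a read-modify-write clears it. `struct rw1c_reg`, `reg_read()`, `reg_write()` and the `RTSTA_PME` value are illustrative assumptions, not the kernel's config-space accessors.

```c
#include <stdint.h>

#define RTSTA_PME 0x00010000u /* illustrative bit position for PME Status */

/* Hypothetical model of a register holding one write-1-to-clear status bit. */
struct rw1c_reg {
	uint32_t value;
};

static uint32_t reg_read(const struct rw1c_reg *r)
{
	return r->value;
}

/*
 * Writing 1 to the RW1C bit clears it; writing 0 leaves it unchanged.
 * All other bits are ignored by this model.
 */
static void reg_write(struct rw1c_reg *r, uint32_t v)
{
	r->value &= ~(v & RTSTA_PME);
}

/* Read-modify-write "set" of the bit, which on RW1C hardware clears it. */
static void clear_pme_status(struct rw1c_reg *r)
{
	reg_write(r, reg_read(r) | RTSTA_PME);
}
```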



[PATCH v1 5/9] PCI/portdrv: Remove pcie_port_bus_type link order dependency

2018-03-06 Thread Bjorn Helgaas
From: Bjorn Helgaas 

The pcie_port_bus_type must be registered before drivers that depend on it
can be registered.  Those drivers include:

  pcied_init()              # PCIe native hotplug driver
  aer_service_init()        # AER driver
  dpc_service_init()        # DPC driver
  pcie_pme_service_init()   # PME driver

Previously we registered pcie_port_bus_type from pcie_portdrv_init(), a
device_initcall.  The callers of pcie_port_service_register() (above) are
also device_initcalls.  This is fragile because the device_initcall
ordering depends on link order, which is not explicit.

Register pcie_port_bus_type from pci_driver_init() along with pci_bus_type.
This removes the link order dependency between portdrv and the pciehp, AER,
DPC, and PCIe PME drivers.

Signed-off-by: Bjorn Helgaas 
---
 drivers/pci/pci-driver.c   |   45 +++-
 drivers/pci/pcie/Makefile  |2 +
 drivers/pci/pcie/portdrv_bus.c |   56 
 drivers/pci/pcie/portdrv_pci.c |   13 +
 4 files changed, 46 insertions(+), 70 deletions(-)
 delete mode 100644 drivers/pci/pcie/portdrv_bus.c

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 38ee7c8b4d1a..4db85a0faf34 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -7,6 +7,7 @@
  */
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -19,6 +20,7 @@
 #include 
 #include 
 #include "pci.h"
+#include "pcie/portdrv.h"
 
 struct pci_dynid {
struct list_head node;
@@ -1553,8 +1555,49 @@ struct bus_type pci_bus_type = {
 };
 EXPORT_SYMBOL(pci_bus_type);
 
+#ifdef CONFIG_PCIEPORTBUS
+static int pcie_port_bus_match(struct device *dev, struct device_driver *drv)
+{
+   struct pcie_device *pciedev;
+   struct pcie_port_service_driver *driver;
+
+   if (drv->bus != &pcie_port_bus_type || dev->bus != &pcie_port_bus_type)
+   return 0;
+
+   pciedev = to_pcie_device(dev);
+   driver = to_service_driver(drv);
+
+   if (driver->service != pciedev->service)
+   return 0;
+
+   if ((driver->port_type != PCIE_ANY_PORT) &&
+   (driver->port_type != pci_pcie_type(pciedev->port)))
+   return 0;
+
+   return 1;
+}
+
+struct bus_type pcie_port_bus_type = {
+   .name   = "pci_express",
+   .match  = pcie_port_bus_match,
+};
+EXPORT_SYMBOL_GPL(pcie_port_bus_type);
+#endif
+
 static int __init pci_driver_init(void)
 {
-   return bus_register(&pci_bus_type);
+   int ret;
+
+   ret = bus_register(&pci_bus_type);
+   if (ret)
+   return ret;
+
+#ifdef CONFIG_PCIEPORTBUS
+   ret = bus_register(&pcie_port_bus_type);
+   if (ret)
+   return ret;
+#endif
+
+   return 0;
 }
 postcore_initcall(pci_driver_init);
diff --git a/drivers/pci/pcie/Makefile b/drivers/pci/pcie/Makefile
index 223e4c34c29a..e01c10c97b95 100644
--- a/drivers/pci/pcie/Makefile
+++ b/drivers/pci/pcie/Makefile
@@ -6,7 +6,7 @@
 # Build PCI Express ASPM if needed
 obj-$(CONFIG_PCIEASPM) += aspm.o
 
-pcieportdrv-y  := portdrv_core.o portdrv_pci.o portdrv_bus.o
+pcieportdrv-y  := portdrv_core.o portdrv_pci.o
 pcieportdrv-$(CONFIG_ACPI) += portdrv_acpi.o
 
 obj-$(CONFIG_PCIEPORTBUS)  += pcieportdrv.o
diff --git a/drivers/pci/pcie/portdrv_bus.c b/drivers/pci/pcie/portdrv_bus.c
deleted file mode 100644
index f0fba552a0e2..
--- a/drivers/pci/pcie/portdrv_bus.c
+++ /dev/null
@@ -1,56 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/*
- * File:   portdrv_bus.c
- * Purpose:PCI Express Port Bus Driver's Bus Overloading Functions
- *
- * Copyright (C) 2004 Intel
- * Copyright (C) Tom Long Nguyen (tom.l.ngu...@intel.com)
- */
-
-#include 
-#include 
-#include 
-#include 
-#include 
-
-#include 
-#include "portdrv.h"
-
-static int pcie_port_bus_match(struct device *dev, struct device_driver *drv);
-
-struct bus_type pcie_port_bus_type = {
-   .name   = "pci_express",
-   .match  = pcie_port_bus_match,
-};
-EXPORT_SYMBOL_GPL(pcie_port_bus_type);
-
-static int pcie_port_bus_match(struct device *dev, struct device_driver *drv)
-{
-   struct pcie_device *pciedev;
-   struct pcie_port_service_driver *driver;
-
-   if (drv->bus != &pcie_port_bus_type || dev->bus != &pcie_port_bus_type)
-   return 0;
-
-   pciedev = to_pcie_device(dev);
-   driver = to_service_driver(drv);
-
-   if (driver->service != pciedev->service)
-   return 0;
-
-   if ((driver->port_type != PCIE_ANY_PORT) &&
-   (driver->port_type != pci_pcie_type(pciedev->port)))
-   return 0;
-
-   return 1;
-}
-
-int pcie_port_bus_register(void)
-{
-   return bus_register(&pcie_port_bus_type);
-}
-
-void pcie_port_bus_unregister(void)
-{
-   bus_unregister(&pcie_port_bus_type);
-}
diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index c08ebd237242..9475886eeb62 100644
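
The ordering argument in the commit message can be sketched in user space. This is a toy model, not the kernel's initcall machinery: the two-level `enum level`, `struct initcall`, `register_bus()`, `register_service()` and `run_initcalls()` are all names assumed for illustration. It shows that two calls at the same level depend on table (link) order, while moving the bus registration to an earlier level removes that dependency.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical two-level model; the kernel has more initcall levels. */
enum level { LEVEL_POSTCORE, LEVEL_DEVICE, LEVEL_COUNT };

typedef int (*initcall_fn)(void);

struct initcall {
	enum level level;
	initcall_fn fn;
};

static bool bus_registered;

static int register_bus(void)
{
	bus_registered = true;
	return 0;
}

/* A service driver can only register once its bus exists. */
static int register_service(void)
{
	return bus_registered ? 0 : -1;
}

/*
 * Run every call of one level before any call of the next level.
 * Within a level, calls run in table ("link") order.
 * Returns the number of calls that failed.
 */
static int run_initcalls(const struct initcall *calls, size_t n)
{
	int failures = 0;

	for (int lvl = 0; lvl < LEVEL_COUNT; lvl++)
		for (size_t i = 0; i < n; i++)
			if (calls[i].level == (enum level)lvl && calls[i].fn() != 0)
				failures++;
	return failures;
}
```

In this model, listing the service before the bus at the same level produces one failure, while the same listing with the bus at the postcore level produces none.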

[PATCH v1 9/9] PCI/portdrv: Remove "pcie_hp=nomsi" kernel parameter

2018-03-06 Thread Bjorn Helgaas
From: Bjorn Helgaas 

7570a333d8b0 ("PCI: Add pcie_hp=nomsi to disable MSI/MSI-X for pciehp
driver") added the "pcie_hp=nomsi" kernel parameter to work around this
error on shutdown:

  irq 16: nobody cared (try booting with the "irqpoll" option)
  Pid: 1081, comm: reboot Not tainted 3.2.0 #1
  ...
  Disabling IRQ #16

This happened on an unspecified system (possibly involving the Integrated
Device Technology, Inc. Device 807f bridge) where "an un-wanted interrupt
is generated when PCI driver switches from MSI/MSI-X to INTx while shutting
down the device."

The implication was that the device was buggy, but it is normal for a
device to use INTx after MSI/MSI-X have been disabled.  The only problem
was that the driver was still attached and it wasn't prepared for INTx
interrupts.  Prarit Bhargava fixed this issue with fda78d7a0ead ("PCI/MSI:
Stop disabling MSI/MSI-X in pci_device_shutdown()").

There is no automated way to set this parameter, so it's not very useful
for distributions or end users.  It's really only useful for debugging, and
we have "pci=nomsi" for that purpose.

Revert 7570a333d8b0 to remove the "pcie_hp=nomsi" parameter.

Signed-off-by: Bjorn Helgaas 
CC: MUNEDA Takahiro 
CC: Kenji Kaneshige 
CC: Prarit Bhargava 
---
 Documentation/admin-guide/kernel-parameters.txt |4 
 drivers/pci/pcie/portdrv.h  |   12 
 drivers/pci/pcie/portdrv_core.c |   20 +++-
 3 files changed, 3 insertions(+), 33 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 1d1d53f85ddd..761749562165 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3130,10 +3130,6 @@
force   Enable ASPM even on devices that claim not to support it.
WARNING: Forcing ASPM on may cause system lockups.
 
-   pcie_hp=    [PCIE] PCI Express Hotplug driver options:
-   nomsi   Do not use MSI for PCI Express Native Hotplug (this
-   makes all PCIe ports use INTx for hotplug services).
-
pcie_ports= [PCIE] PCIe ports handling:
auto    Ask the BIOS whether or not to use native PCIe services
associated with PCIe ports (PME, hot-plug, AER).  Use
diff --git a/drivers/pci/pcie/portdrv.h b/drivers/pci/pcie/portdrv.h
index 2c19cf9ffea2..87a87cb9f42d 100644
--- a/drivers/pci/pcie/portdrv.h
+++ b/drivers/pci/pcie/portdrv.h
@@ -34,18 +34,6 @@ void pcie_port_bus_unregister(void);
 
 struct pci_dev;
 
-#ifdef CONFIG_HOTPLUG_PCI_PCIE
-extern bool pciehp_msi_disabled;
-
-static inline bool pciehp_no_msi(void)
-{
-   return pciehp_msi_disabled;
-}
-
-#else  /* !CONFIG_HOTPLUG_PCI_PCIE */
-static inline bool pciehp_no_msi(void) { return false; }
-#endif /* !CONFIG_HOTPLUG_PCI_PCIE */
-
 #ifdef CONFIG_PCIE_PME
 extern bool pcie_pme_msi_disabled;
 
diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
index 29210e9bfbd3..bf9c5c885957 100644
--- a/drivers/pci/pcie/portdrv_core.c
+++ b/drivers/pci/pcie/portdrv_core.c
@@ -21,17 +21,6 @@
 #include "../pci.h"
 #include "portdrv.h"
 
-bool pciehp_msi_disabled;
-
-static int __init pciehp_setup(char *str)
-{
-   if (!strncmp(str, "nomsi", 5))
-   pciehp_msi_disabled = true;
-
-   return 1;
-}
-__setup("pcie_hp=", pciehp_setup);
-
 /**
  * release_pcie_device - free PCI Express port service device structure
  * @dev: Port service device to release
@@ -169,16 +158,13 @@ static int pcie_init_service_irqs(struct pci_dev *dev, int *irqs, int mask)
irqs[i] = -1;
 
/*
-* If we support PME or hotplug, but we can't use MSI/MSI-X for
-* them, we have to fall back to INTx or other interrupts, e.g., a
-* system shared interrupt.
+* If we support PME but can't use MSI/MSI-X for it, we have to
+* fall back to INTx or other interrupts, e.g., a system shared
+* interrupt.
 */
if ((mask & PCIE_PORT_SERVICE_PME) && pcie_pme_no_msi())
goto legacy_irq;
 
-   if ((mask & PCIE_PORT_SERVICE_HP) && pciehp_no_msi())
-   goto legacy_irq;
-
/* Try to use MSI-X or MSI if supported */
if (pcie_port_enable_irq_vec(dev, irqs, mask) == 0)
return 0;
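
The interrupt selection that remains after this patch can be summarized in a small sketch. It is a simplification under stated assumptions: `choose_irq_mode()`, `enum irq_mode` and the `SERVICE_*` bit values are hypothetical, `pme_no_msi` stands in for the result of pcie_pme_no_msi(), and `msi_available` stands in for pcie_port_enable_irq_vec() succeeding. The real function also fills in the irqs[] array and can fail outright.

```c
#include <stdbool.h>

#define SERVICE_PME 0x1 /* illustrative bit values, not the kernel's */
#define SERVICE_HP  0x2

enum irq_mode { IRQ_MSI, IRQ_LEGACY };

/* Decide between MSI/MSI-X and the legacy (INTx or shared) interrupt path. */
static enum irq_mode choose_irq_mode(int mask, bool pme_no_msi, bool msi_available)
{
	/* PME is the only service that can still force the legacy path. */
	if ((mask & SERVICE_PME) && pme_no_msi)
		return IRQ_LEGACY;

	/* Hotplug no longer has its own opt-out; it follows the common rule. */
	return msi_available ? IRQ_MSI : IRQ_LEGACY;
}
```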



[PATCH v1 2/9] PCI/PM: Clear PCIe PME Status bit in core, not PCIe port driver

2018-03-06 Thread Bjorn Helgaas
From: Bjorn Helgaas 

fe31e69740ed ("PCI/PCIe: Clear Root PME Status bits early during system
resume") added a .resume_noirq() callback to the PCIe port driver to clear
the PME Status bit during resume to work around a BIOS issue.

The BIOS evidently enabled PME interrupts for ACPI-based runtime wakeups
but did not clear the PME Status bit during resume, which meant PMEs after
resume did not trigger interrupts because PME Status did not transition
from cleared to set.

The fix was in the PCIe port driver, so it worked when CONFIG_PCIEPORTBUS
was set.  But I think we *always* want the fix because the platform may use
PME interrupts even if Linux is built without the PCIe port driver.

Move the fix from the port driver to the PCI core so we can work around
this "PME doesn't work after waking from a sleep state" issue regardless of
CONFIG_PCIEPORTBUS.

Signed-off-by: Bjorn Helgaas 
---
 drivers/pci/pci-driver.c   |   14 ++
 drivers/pci/pcie/portdrv_pci.c |   15 ---
 2 files changed, 14 insertions(+), 15 deletions(-)

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 3bed6beda051..bf0704b75f79 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -525,6 +525,18 @@ static void pci_pm_default_resume_early(struct pci_dev *pci_dev)
pci_fixup_device(pci_fixup_resume_early, pci_dev);
 }
 
+static void pcie_resume_early(struct pci_dev *pci_dev)
+{
+   /*
+* Some BIOSes forget to clear Root PME Status bits after system wakeup
+* which breaks ACPI-based runtime wakeup on PCI Express, so clear those
+* bits now just in case (shouldn't hurt).
+*/
+   if (pci_is_pcie(pci_dev) &&
+   pci_pcie_type(pci_dev) == PCI_EXP_TYPE_ROOT_PORT)
+   pcie_clear_root_pme_status(pci_dev);
+}
+
 /*
  * Default "suspend" method for devices that have no driver provided suspend,
  * or not even a driver at all (second part).
@@ -873,6 +885,8 @@ static int pci_pm_resume_noirq(struct device *dev)
if (pci_has_legacy_pm_support(pci_dev))
return pci_legacy_resume_early(dev);
 
+   pcie_resume_early(pci_dev);
+
if (drv && drv->pm && drv->pm->resume_noirq)
error = drv->pm->resume_noirq(dev);
 
diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index 4413dd85e923..f91afd09e356 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -62,20 +62,6 @@ static int pcie_portdrv_restore_config(struct pci_dev *dev)
 }
 
 #ifdef CONFIG_PM
-static int pcie_port_resume_noirq(struct device *dev)
-{
-   struct pci_dev *pdev = to_pci_dev(dev);
-
-   /*
-* Some BIOSes forget to clear Root PME Status bits after system wakeup
-* which breaks ACPI-based runtime wakeup on PCI Express, so clear those
-* bits now just in case (shouldn't hurt).
-*/
-   if (pci_pcie_type(pdev) == PCI_EXP_TYPE_ROOT_PORT)
-   pcie_clear_root_pme_status(pdev);
-   return 0;
-}
-
 static int pcie_port_runtime_suspend(struct device *dev)
 {
return to_pci_dev(dev)->bridge_d3 ? 0 : -EBUSY;
@@ -103,7 +89,6 @@ static const struct dev_pm_ops pcie_portdrv_pm_ops = {
.thaw   = pcie_port_device_resume,
.poweroff   = pcie_port_device_suspend,
.restore= pcie_port_device_resume,
-   .resume_noirq   = pcie_port_resume_noirq,
.runtime_suspend = pcie_port_runtime_suspend,
.runtime_resume = pcie_port_runtime_resume,
.runtime_idle   = pcie_port_runtime_idle,
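
The failure mode described in the changelog above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: struct root_port_model, model_pme_event(), model_clear_pme_status() and model_resume_then_pme() are hypothetical names, not kernel interfaces, and the model assumes an interrupt is signalled only when the PME Status bit goes from cleared to set.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a Root Port's PME state; not a kernel structure. */
struct root_port_model {
	bool pme_status;	/* models the PME Status bit */
	int interrupts;		/* interrupts delivered so far */
};

/*
 * Model a PME arriving at the port.  An interrupt is raised only when
 * PME Status goes from cleared to set; if the bit is already set, the
 * event is absorbed without a new interrupt.
 */
static void model_pme_event(struct root_port_model *port)
{
	if (!port->pme_status) {
		port->pme_status = true;
		port->interrupts++;
	}
}

/* Model the resume-time workaround: clear the stale status bit. */
static void model_clear_pme_status(struct root_port_model *port)
{
	port->pme_status = false;
}

/* Return the number of interrupts seen for one PME after resume. */
static int model_resume_then_pme(bool clear_on_resume)
{
	/* Assumption: the BIOS left PME Status set across the sleep state. */
	struct root_port_model port = { .pme_status = true, .interrupts = 0 };

	if (clear_on_resume)
		model_clear_pme_status(&port);

	model_pme_event(&port);
	return port.interrupts;
}
```

In this model, model_resume_then_pme(false) reports no interrupt for the post-resume PME, while model_resume_then_pme(true) reports one, which is the behavior the core now provides regardless of CONFIG_PCIEPORTBUS.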



[PATCH v1 4/9] PCI/portdrv: Disable port driver in compat mode

2018-03-06 Thread Bjorn Helgaas
From: Bjorn Helgaas 

The "pcie_ports=compat" kernel parameter sets pcie_ports_disabled, which is
intended to disable the PCIe port driver.  But even when it was disabled,
we registered pcie_portdriver so we could work around a BIOS PME issue (see
fe31e69740ed ("PCI/PCIe: Clear Root PME Status bits early during system
resume")).

Registering the driver meant that the pcie_portdrv_probe() path called
pci_enable_device(), pci_save_state(), pm_runtime_set_autosuspend_delay(),
pm_runtime_use_autosuspend(), etc., even when the driver was disabled.

We've since moved the BIOS PME workaround from the port driver to the core,
so stop registering the PCIe port driver in compat mode.

This means "pcie_ports=compat" will now be basically the same as turning
off CONFIG_PCIEPORTBUS completely.

Signed-off-by: Bjorn Helgaas 
---
 drivers/pci/pcie/portdrv_core.c |3 ---
 drivers/pci/pcie/portdrv_pci.c  |2 +-
 2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
index ef3bad4ad010..9db77c683732 100644
--- a/drivers/pci/pcie/portdrv_core.c
+++ b/drivers/pci/pcie/portdrv_core.c
@@ -212,9 +212,6 @@ static int get_port_device_capability(struct pci_dev *dev)
int services = 0;
int cap_mask = 0;
 
-   if (pcie_ports_disabled)
-   return 0;
-
cap_mask = PCIE_PORT_SERVICE_PME | PCIE_PORT_SERVICE_HP
| PCIE_PORT_SERVICE_VC;
if (pci_aer_available())
diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index f91afd09e356..c08ebd237242 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -262,7 +262,7 @@ static int __init pcie_portdrv_init(void)
int retval;
 
if (pcie_ports_disabled)
-   return pci_register_driver(&pcie_portdriver);
+   return -EACCES;
 
dmi_check_system(pcie_portdrv_dmi_table);
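
The effect of the portdrv_pci.c hunk can be sketched in userspace as follows. The names struct sketch_portdrv_state and sketch_portdrv_init() are hypothetical and only model the control flow; the registration counter stands in for pci_register_driver() and is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the port driver's global state. */
struct sketch_portdrv_state {
	bool ports_disabled;	/* models pcie_ports_disabled */
	int registered;		/* how many times the driver was registered */
};

/*
 * Model of the init path after this patch: in compat mode we bail out
 * before registering anything, so no probe path can run later.
 */
static int sketch_portdrv_init(struct sketch_portdrv_state *state)
{
	if (state->ports_disabled)
		return -EACCES;

	state->registered++;	/* stands in for driver registration */
	return 0;
}
```

With ports_disabled set, the function returns -EACCES and leaves the registration count at zero, which is why "pcie_ports=compat" now behaves much like building without CONFIG_PCIEPORTBUS.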
 



[PATCH v1 7/9] PCI/portdrv: Simplify PCIe feature permission checking

2018-03-06 Thread Bjorn Helgaas
From: Bjorn Helgaas 

Some PCIe features (AER, DPC, hotplug, PME) can be managed by either the
platform firmware or the OS, so the host bridge driver may have to request
permission from the platform before using them.  On ACPI systems, this is
done by negotiate_os_control() in acpi_pci_root_add().

The PCIe port driver later uses pcie_port_platform_notify() and
pcie_port_acpi_setup() to figure out whether it can use these features.
But all we need is a single bit for each service, so these interfaces are
needlessly complicated.

Simplify this by adding bits in the struct pci_host_bridge to show when the
OS has permission to use each feature:

  + unsigned int use_aer:1;   /* OS may use PCIe AER */
  + unsigned int use_hotplug:1;   /* OS may use PCIe hotplug */
  + unsigned int use_pme:1;   /* OS may use PCIe PME */

These are set when we create a host bridge, and the host bridge driver can
clear the bits corresponding to any feature the platform doesn't want us to
use.

Signed-off-by: Bjorn Helgaas 
---
 drivers/acpi/pci_root.c |   13 ++--
 drivers/pci/pcie/Makefile   |1 -
 drivers/pci/pcie/portdrv.h  |   11 --
 drivers/pci/pcie/portdrv_core.c |   42 ---
 drivers/pci/probe.c |   10 +
 include/linux/pci.h |3 +++
 6 files changed, 50 insertions(+), 30 deletions(-)

diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
index 6fc204a52493..dce53527cdc1 100644
--- a/drivers/acpi/pci_root.c
+++ b/drivers/acpi/pci_root.c
@@ -871,6 +871,7 @@ struct pci_bus *acpi_pci_root_create(struct acpi_pci_root *root,
struct acpi_device *device = root->device;
int node = acpi_get_node(device->handle);
struct pci_bus *bus;
+   struct pci_host_bridge *host_bridge;
 
info->root = root;
info->bridge = device;
@@ -895,9 +896,17 @@ struct pci_bus *acpi_pci_root_create(struct acpi_pci_root *root,
if (!bus)
goto out_release_info;
 
+   host_bridge = to_pci_host_bridge(bus->bridge);
+   if (!(root->osc_control_set & OSC_PCI_EXPRESS_NATIVE_HP_CONTROL))
+   host_bridge->use_hotplug = 0;
+   if (!(root->osc_control_set & OSC_PCI_EXPRESS_AER_CONTROL))
+   host_bridge->use_aer = 0;
+   if (!(root->osc_control_set & OSC_PCI_EXPRESS_PME_CONTROL))
+   host_bridge->use_pme = 0;
+
pci_scan_child_bus(bus);
-   pci_set_host_bridge_release(to_pci_host_bridge(bus->bridge),
-   acpi_pci_root_release_info, info);
+   pci_set_host_bridge_release(host_bridge, acpi_pci_root_release_info,
+   info);
if (node != NUMA_NO_NODE)
dev_printk(KERN_DEBUG, &bus->dev, "on NUMA node %d\n", node);
return bus;
diff --git a/drivers/pci/pcie/Makefile b/drivers/pci/pcie/Makefile
index e01c10c97b95..11fb633b866c 100644
--- a/drivers/pci/pcie/Makefile
+++ b/drivers/pci/pcie/Makefile
@@ -7,7 +7,6 @@
 obj-$(CONFIG_PCIEASPM) += aspm.o
 
 pcieportdrv-y  := portdrv_core.o portdrv_pci.o
-pcieportdrv-$(CONFIG_ACPI) += portdrv_acpi.o
 
 obj-$(CONFIG_PCIEPORTBUS)  += pcieportdrv.o
 
diff --git a/drivers/pci/pcie/portdrv.h b/drivers/pci/pcie/portdrv.h
index 749d200936d9..2c19cf9ffea2 100644
--- a/drivers/pci/pcie/portdrv.h
+++ b/drivers/pci/pcie/portdrv.h
@@ -66,15 +66,4 @@ static inline bool pcie_pme_no_msi(void) { return false; }
 static inline void pcie_pme_interrupt_enable(struct pci_dev *dev, bool en) {}
 #endif /* !CONFIG_PCIE_PME */
 
-#ifdef CONFIG_ACPI
-void pcie_port_acpi_setup(struct pci_dev *port, int *mask);
-
-static inline void pcie_port_platform_notify(struct pci_dev *port, int *mask)
-{
-   pcie_port_acpi_setup(port, mask);
-}
-#else /* !CONFIG_ACPI */
-static inline void pcie_port_platform_notify(struct pci_dev *port, int *mask){}
-#endif /* !CONFIG_ACPI */
-
 #endif /* _PORTDRV_H_ */
diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
index 94ce4dc50d1a..29210e9bfbd3 100644
--- a/drivers/pci/pcie/portdrv_core.c
+++ b/drivers/pci/pcie/portdrv_core.c
@@ -207,19 +207,20 @@ static int pcie_init_service_irqs(struct pci_dev *dev, int *irqs, int mask)
  */
 static int get_port_device_capability(struct pci_dev *dev)
 {
+   struct pci_host_bridge *host = pci_find_host_bridge(dev->bus);
+   bool native;
int services = 0;
-   int cap_mask = 0;
 
-   cap_mask = PCIE_PORT_SERVICE_PME | PCIE_PORT_SERVICE_HP;
-   if (pci_aer_available())
-   cap_mask |= PCIE_PORT_SERVICE_AER | PCIE_PORT_SERVICE_DPC;
-
-   if (pcie_ports_auto)
-   pcie_port_platform_notify(dev, &cap_mask);
+   /*
+* If the user specified "pcie_ports=native", use the PCIe services
+* regardless of whether the platform has given us permission.  On
+* ACPI systems, this means we ignore _OSC.
+*/
+   native = !pcie_ports_auto;
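
The per-feature permission bits described in the changelog can be modeled in a few lines of userspace C. This is a sketch, not the kernel code: struct sketch_host_bridge, the SKETCH_OSC_* values and the three helper functions are hypothetical, and the "native || use_hotplug" check is an assumption about how the port driver consumes the bits, based on the comment in the hunk above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative _OSC control bits; the values are assumptions of this sketch. */
#define SKETCH_OSC_HP_CONTROL	0x1u
#define SKETCH_OSC_PME_CONTROL	0x4u
#define SKETCH_OSC_AER_CONTROL	0x8u

/* Hypothetical stand-in for the new bits in struct pci_host_bridge. */
struct sketch_host_bridge {
	unsigned int use_aer:1;		/* OS may use PCIe AER */
	unsigned int use_hotplug:1;	/* OS may use PCIe hotplug */
	unsigned int use_pme:1;		/* OS may use PCIe PME */
};

/* All bits start out set when the host bridge is created. */
static struct sketch_host_bridge sketch_bridge_create(void)
{
	struct sketch_host_bridge bridge = {
		.use_aer = 1, .use_hotplug = 1, .use_pme = 1,
	};

	return bridge;
}

/* Host bridge driver: clear the bit for each feature the platform kept. */
static void sketch_apply_osc(struct sketch_host_bridge *bridge,
			     uint32_t granted)
{
	if (!(granted & SKETCH_OSC_HP_CONTROL))
		bridge->use_hotplug = 0;
	if (!(granted & SKETCH_OSC_AER_CONTROL))
		bridge->use_aer = 0;
	if (!(granted & SKETCH_OSC_PME_CONTROL))
		bridge->use_pme = 0;
}

/* Port driver: "pcie_ports=native" overrides the platform's answer. */
static bool sketch_may_use_hotplug(const struct sketch_host_bridge *bridge,
				   bool native)
{
	return native || bridge->use_hotplug;
}
```

The point of the design is that the port driver only reads one bit per service from the host bridge, so it needs no ACPI-specific callback of its own.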
 

[PATCH v1 6/9] PCI/portdrv: Remove unused PCIE_PORT_SERVICE_VC

2018-03-06 Thread Bjorn Helgaas
From: Bjorn Helgaas 

No driver registers for PCIE_PORT_SERVICE_VC, so remove it.

This removes the VC "service" files from /sys/bus/pci_express/devices,
e.g., :07:00.0:pcie108, :08:04.0:pcie208 (all the files that
contained "8" as the last digit of the "pcieXXX" part).  The port driver
created these files for PCIe port devices that have a VC Capability.

Since this reduces PCIE_PORT_DEVICE_MAXSERVICES and moves DPC down into the
spot where VC used to be, the DPC sysfs files will now be named "pcieXX8".
I don't think there's anything useful userspace can do with those files, so
I hope nobody cares about these filenames.

There is no VC driver that calls pcie_port_service_register(), so there
never was a /sys/bus/pci_express/drivers/vc directory.

Signed-off-by: Bjorn Helgaas 
---
 drivers/pci/pcie/portdrv.h  |2 +-
 drivers/pci/pcie/portdrv_acpi.c |2 +-
 drivers/pci/pcie/portdrv_core.c |   14 --
 include/linux/pcieport_if.h |4 +---
 4 files changed, 7 insertions(+), 15 deletions(-)

diff --git a/drivers/pci/pcie/portdrv.h b/drivers/pci/pcie/portdrv.h
index a4fc44d52206..749d200936d9 100644
--- a/drivers/pci/pcie/portdrv.h
+++ b/drivers/pci/pcie/portdrv.h
@@ -12,7 +12,7 @@
 
 #include 
 
-#define PCIE_PORT_DEVICE_MAXSERVICES   5
+#define PCIE_PORT_DEVICE_MAXSERVICES   4
 /*
  * The PCIe Capability Interrupt Message Number (PCIe r3.1, sec 7.8.2) must
  * be one of the first 32 MSI-X entries.  Per PCI r3.0, sec 6.8.3.1, MSI
diff --git a/drivers/pci/pcie/portdrv_acpi.c b/drivers/pci/pcie/portdrv_acpi.c
index 319c94976873..4a1b50867c98 100644
--- a/drivers/pci/pcie/portdrv_acpi.c
+++ b/drivers/pci/pcie/portdrv_acpi.c
@@ -48,7 +48,7 @@ void pcie_port_acpi_setup(struct pci_dev *port, int *srv_mask)
 
flags = root->osc_control_set;
 
-   *srv_mask = PCIE_PORT_SERVICE_VC | PCIE_PORT_SERVICE_DPC;
+   *srv_mask = PCIE_PORT_SERVICE_DPC;
if (flags & OSC_PCI_EXPRESS_NATIVE_HP_CONTROL)
*srv_mask |= PCIE_PORT_SERVICE_HP;
if (flags & OSC_PCI_EXPRESS_PME_CONTROL)
diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
index 9db77c683732..94ce4dc50d1a 100644
--- a/drivers/pci/pcie/portdrv_core.c
+++ b/drivers/pci/pcie/portdrv_core.c
@@ -189,10 +189,8 @@ static int pcie_init_service_irqs(struct pci_dev *dev, int *irqs, int mask)
if (ret < 0)
return -ENODEV;
 
-   for (i = 0; i < PCIE_PORT_DEVICE_MAXSERVICES; i++) {
-   if (i != PCIE_PORT_SERVICE_VC_SHIFT)
-   irqs[i] = pci_irq_vector(dev, 0);
-   }
+   for (i = 0; i < PCIE_PORT_DEVICE_MAXSERVICES; i++)
+   irqs[i] = pci_irq_vector(dev, 0);
 
return 0;
 }
@@ -212,8 +210,7 @@ static int get_port_device_capability(struct pci_dev *dev)
int services = 0;
int cap_mask = 0;
 
-   cap_mask = PCIE_PORT_SERVICE_PME | PCIE_PORT_SERVICE_HP
-   | PCIE_PORT_SERVICE_VC;
+   cap_mask = PCIE_PORT_SERVICE_PME | PCIE_PORT_SERVICE_HP;
if (pci_aer_available())
cap_mask |= PCIE_PORT_SERVICE_AER | PCIE_PORT_SERVICE_DPC;
 
@@ -240,9 +237,6 @@ static int get_port_device_capability(struct pci_dev *dev)
 */
pci_disable_pcie_error_reporting(dev);
}
-   /* VC support */
-   if (pci_find_ext_capability(dev, PCI_EXT_CAP_ID_VC))
-   services |= PCIE_PORT_SERVICE_VC;
/* Root ports are capable of generating PME too */
if ((cap_mask & PCIE_PORT_SERVICE_PME)
&& pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT) {
@@ -332,7 +326,7 @@ int pcie_port_device_register(struct pci_dev *dev)
 */
status = pcie_init_service_irqs(dev, irqs, capabilities);
if (status) {
-   capabilities &= PCIE_PORT_SERVICE_VC | PCIE_PORT_SERVICE_HP;
+   capabilities &= PCIE_PORT_SERVICE_HP;
if (!capabilities)
goto error_disable;
}
diff --git a/include/linux/pcieport_if.h b/include/linux/pcieport_if.h
index b69769dbf659..28eb21731db6 100644
--- a/include/linux/pcieport_if.h
+++ b/include/linux/pcieport_if.h
@@ -20,9 +20,7 @@
 #define PCIE_PORT_SERVICE_AER  (1 << PCIE_PORT_SERVICE_AER_SHIFT)
 #define PCIE_PORT_SERVICE_HP_SHIFT 2   /* Native Hotplug */
 #define PCIE_PORT_SERVICE_HP   (1 << PCIE_PORT_SERVICE_HP_SHIFT)
-#define PCIE_PORT_SERVICE_VC_SHIFT 3   /* Virtual Channel */
-#define PCIE_PORT_SERVICE_VC   (1 << PCIE_PORT_SERVICE_VC_SHIFT)
-#define PCIE_PORT_SERVICE_DPC_SHIFT 4   /* Downstream Port Containment */
+#define PCIE_PORT_SERVICE_DPC_SHIFT 3   /* Downstream Port Containment */
 #define PCIE_PORT_SERVICE_DPC  (1 << PCIE_PORT_SERVICE_DPC_SHIFT)
 
 struct pcie_device {
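
The sysfs name change mentioned in the changelog can be worked through with a short sketch. It assumes the "pcieXXX" suffix is built as ((port type - 4) << 8) | service bit and printed as three hex digits, which is consistent with the pcie108 and pcie208 examples in the message but is an assumption here. The SKETCH_* constants and both helper functions are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* PCIe port types used in this sketch (assumed values). */
#define SKETCH_TYPE_ROOT_PORT	0x4
#define SKETCH_TYPE_UPSTREAM	0x5
#define SKETCH_TYPE_DOWNSTREAM	0x6

/* Service bit for a given shift, i.e. (1 << PCIE_PORT_SERVICE_*_SHIFT). */
static unsigned int sketch_service_bit(unsigned int shift)
{
	return 1u << shift;
}

/*
 * Build the "pcieXXX" suffix of a service device name.  Assumption:
 * the descriptor is ((port_type - 4) << 8) | service.
 */
static void sketch_service_name(char *buf, size_t len,
				unsigned int port_type, unsigned int service)
{
	snprintf(buf, len, "pcie%03x", ((port_type - 4) << 8) | service);
}
```

Under these assumptions, shift 3 on an upstream port yields "pcie108", the name VC used to have and DPC has after this patch, while the old DPC shift of 4 on a root port would have yielded "pcie010".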



[PATCH v1 8/9] PCI/portdrv: Remove unnecessary include of

2018-03-06 Thread Bjorn Helgaas
From: Bjorn Helgaas 

portdrv_pci.c doesn't use anything from .  Remove the
include of it.  No functional change intended.

Signed-off-by: Bjorn Helgaas 
---
 drivers/pci/pcie/portdrv_pci.c |1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index 9475886eeb62..d12b58db18a1 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -18,7 +18,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include "../pci.h"
 #include "portdrv.h"



[PATCH v1 3/9] PCI/PM: Clear PCIe PME Status bit for Root Complex Event Collectors

2018-03-06 Thread Bjorn Helgaas
From: Bjorn Helgaas 

Per PCIe r4.0, sec 6.1.6, Root Complex Event Collectors can generate PME
interrupts on behalf of Root Complex Integrated Endpoints.

Linux does not currently enable PME interrupts from RC Event Collectors,
but fe31e69740ed ("PCI/PCIe: Clear Root PME Status bits early during system
resume") suggests PME interrupts may be enabled by the platform for ACPI-
based runtime wakeup.

Clear the PCIe PME Status bit for Root Complex Event Collectors during
resume, just like we already do for Root Ports.

If the BIOS enables PME interrupts for an event collector and neglects to
clear the status bit on resume, this change should fix the same bug as
fe31e69740ed (PMEs not working after waking from a sleep state), but for
Root Complex Integrated Endpoints.

Signed-off-by: Bjorn Helgaas 
---
 drivers/pci/pci-driver.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index bf0704b75f79..38ee7c8b4d1a 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -533,7 +533,8 @@ static void pcie_resume_early(struct pci_dev *pci_dev)
 * bits now just in case (shouldn't hurt).
 */
if (pci_is_pcie(pci_dev) &&
-   pci_pcie_type(pci_dev) == PCI_EXP_TYPE_ROOT_PORT)
+   (pci_pcie_type(pci_dev) == PCI_EXP_TYPE_ROOT_PORT ||
+pci_pcie_type(pci_dev) == PCI_EXP_TYPE_RC_EC))
pcie_clear_root_pme_status(pci_dev);
 }
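
The widened condition can be checked in isolation with a small predicate. This is a userspace sketch: sketch_needs_pme_status_clear() and the SKETCH_TYPE_* constants are hypothetical, and the numeric type values are assumptions used only for the illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* PCIe device/port types used in this sketch (assumed values). */
#define SKETCH_TYPE_ENDPOINT	0x0
#define SKETCH_TYPE_ROOT_PORT	0x4
#define SKETCH_TYPE_RC_EC	0xa

/*
 * Mirror of the condition in the hunk above: PME Status is cleared on
 * resume for PCIe Root Ports and Root Complex Event Collectors only.
 */
static bool sketch_needs_pme_status_clear(bool is_pcie, int type)
{
	return is_pcie &&
	       (type == SKETCH_TYPE_ROOT_PORT || type == SKETCH_TYPE_RC_EC);
}
```

Endpoints and conventional PCI devices are left alone; only the two device types that can signal PME interrupts on behalf of other devices are touched.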
 



[PATCH v1 6/9] PCI/portdrv: Remove unused PCIE_PORT_SERVICE_VC

2018-03-06 Thread Bjorn Helgaas
From: Bjorn Helgaas 

No driver registers for PCIE_PORT_SERVICE_VC, so remove it.

This removes the VC "service" files from /sys/bus/pci_express/devices,
e.g., :07:00.0:pcie108, :08:04.0:pcie208 (all the files that
contained "8" as the last digit of the "pcieXXX" part).  The port driver
created these files for PCIe port devices that have a VC Capability.

Since this reduces PCIE_PORT_DEVICE_MAXSERVICES and moves DPC down into the
spot where VC used to be, the DPC sysfs files will now be named "pcieXX8".
I don't think there's anything useful userspace can do with those files, so
I hope nobody cares about these filenames.

There is no VC driver that calls pcie_port_service_register(), so there
never was a /sys/bus/pci_express/drivers/vc directory.

Signed-off-by: Bjorn Helgaas 
---
 drivers/pci/pcie/portdrv.h  |2 +-
 drivers/pci/pcie/portdrv_acpi.c |2 +-
 drivers/pci/pcie/portdrv_core.c |   14 --
 include/linux/pcieport_if.h |4 +---
 4 files changed, 7 insertions(+), 15 deletions(-)

diff --git a/drivers/pci/pcie/portdrv.h b/drivers/pci/pcie/portdrv.h
index a4fc44d52206..749d200936d9 100644
--- a/drivers/pci/pcie/portdrv.h
+++ b/drivers/pci/pcie/portdrv.h
@@ -12,7 +12,7 @@
 
 #include 
 
-#define PCIE_PORT_DEVICE_MAXSERVICES   5
+#define PCIE_PORT_DEVICE_MAXSERVICES   4
 /*
  * The PCIe Capability Interrupt Message Number (PCIe r3.1, sec 7.8.2) must
  * be one of the first 32 MSI-X entries.  Per PCI r3.0, sec 6.8.3.1, MSI
diff --git a/drivers/pci/pcie/portdrv_acpi.c b/drivers/pci/pcie/portdrv_acpi.c
index 319c94976873..4a1b50867c98 100644
--- a/drivers/pci/pcie/portdrv_acpi.c
+++ b/drivers/pci/pcie/portdrv_acpi.c
@@ -48,7 +48,7 @@ void pcie_port_acpi_setup(struct pci_dev *port, int *srv_mask)
 
flags = root->osc_control_set;
 
-   *srv_mask = PCIE_PORT_SERVICE_VC | PCIE_PORT_SERVICE_DPC;
+   *srv_mask = PCIE_PORT_SERVICE_DPC;
if (flags & OSC_PCI_EXPRESS_NATIVE_HP_CONTROL)
*srv_mask |= PCIE_PORT_SERVICE_HP;
if (flags & OSC_PCI_EXPRESS_PME_CONTROL)
diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
index 9db77c683732..94ce4dc50d1a 100644
--- a/drivers/pci/pcie/portdrv_core.c
+++ b/drivers/pci/pcie/portdrv_core.c
@@ -189,10 +189,8 @@ static int pcie_init_service_irqs(struct pci_dev *dev, int 
*irqs, int mask)
if (ret < 0)
return -ENODEV;
 
-   for (i = 0; i < PCIE_PORT_DEVICE_MAXSERVICES; i++) {
-   if (i != PCIE_PORT_SERVICE_VC_SHIFT)
-   irqs[i] = pci_irq_vector(dev, 0);
-   }
+   for (i = 0; i < PCIE_PORT_DEVICE_MAXSERVICES; i++)
+   irqs[i] = pci_irq_vector(dev, 0);
 
return 0;
 }
@@ -212,8 +210,7 @@ static int get_port_device_capability(struct pci_dev *dev)
int services = 0;
int cap_mask = 0;
 
-   cap_mask = PCIE_PORT_SERVICE_PME | PCIE_PORT_SERVICE_HP
-   | PCIE_PORT_SERVICE_VC;
+   cap_mask = PCIE_PORT_SERVICE_PME | PCIE_PORT_SERVICE_HP;
if (pci_aer_available())
cap_mask |= PCIE_PORT_SERVICE_AER | PCIE_PORT_SERVICE_DPC;
 
@@ -240,9 +237,6 @@ static int get_port_device_capability(struct pci_dev *dev)
 */
pci_disable_pcie_error_reporting(dev);
}
-   /* VC support */
-   if (pci_find_ext_capability(dev, PCI_EXT_CAP_ID_VC))
-   services |= PCIE_PORT_SERVICE_VC;
/* Root ports are capable of generating PME too */
if ((cap_mask & PCIE_PORT_SERVICE_PME)
&& pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT) {
@@ -332,7 +326,7 @@ int pcie_port_device_register(struct pci_dev *dev)
 */
status = pcie_init_service_irqs(dev, irqs, capabilities);
if (status) {
-   capabilities &= PCIE_PORT_SERVICE_VC | PCIE_PORT_SERVICE_HP;
+   capabilities &= PCIE_PORT_SERVICE_HP;
if (!capabilities)
goto error_disable;
}
diff --git a/include/linux/pcieport_if.h b/include/linux/pcieport_if.h
index b69769dbf659..28eb21731db6 100644
--- a/include/linux/pcieport_if.h
+++ b/include/linux/pcieport_if.h
@@ -20,9 +20,7 @@
 #define PCIE_PORT_SERVICE_AER  (1 << PCIE_PORT_SERVICE_AER_SHIFT)
 #define PCIE_PORT_SERVICE_HP_SHIFT 2   /* Native Hotplug */
 #define PCIE_PORT_SERVICE_HP   (1 << PCIE_PORT_SERVICE_HP_SHIFT)
-#define PCIE_PORT_SERVICE_VC_SHIFT 3   /* Virtual Channel */
-#define PCIE_PORT_SERVICE_VC   (1 << PCIE_PORT_SERVICE_VC_SHIFT)
-#define PCIE_PORT_SERVICE_DPC_SHIFT 4   /* Downstream Port Containment */
+#define PCIE_PORT_SERVICE_DPC_SHIFT 3   /* Downstream Port Containment */
 #define PCIE_PORT_SERVICE_DPC  (1 << PCIE_PORT_SERVICE_DPC_SHIFT)
 
 struct pcie_device {



[PATCH v1 0/9] PCI: Simplify PCIe port driver

2018-03-06 Thread Bjorn Helgaas
This is an attempt to move a few things out of the port driver.

Patches 1-2 move a workaround for a BIOS PME issue from the port driver to
the PCI core, so it doesn't depend on CONFIG_PCIEPORTBUS.

Patch 3 extends that workaround so it works for Root Complex Event
Collectors.  I haven't seen reports of this being a problem, but I think we
should handle Event Collector PMEs the same as Root Port PMEs.

Patch 4 disables the port driver completely for "pcie_ports=compat".  We
used to register the driver, claim port devices, enable them, etc., as part
of supporting the above BIOS workaround.

Patch 5 removes a port driver link order dependency.

Patch 6 removes the unused VC service.

Patch 7 simplifies the _OSC code path by keeping more of the details in the
ACPI pci_root.c driver.

Patch 8 removes an unnecessary #include.

Patch 9 removes the "pcie_hp=nomsi" parameter.  This was added to work
around an issue when shutting down devices, but a later patch fixed the
root cause, and I don't think we need such a specific parameter any more
(we still have "pci=nomsi").

---

Bjorn Helgaas (9):
  PCI/PM: Move pcie_clear_root_pme_status() to core
  PCI/PM: Clear PCIe PME Status bit in core, not PCIe port driver
  PCI/PM: Clear PCIe PME Status bit for Root Complex Event Collectors
  PCI/portdrv: Disable port driver in compat mode
  PCI/portdrv: Remove pcie_port_bus_type link order dependency
  PCI/portdrv: Remove unused PCIE_PORT_SERVICE_VC
  PCI/portdrv: Simplify PCIe feature permission checking
  PCI/portdrv: Remove unnecessary include of 
  PCI/portdrv: Remove "pcie_hp=nomsi" kernel parameter


 Documentation/admin-guide/kernel-parameters.txt |4 -
 drivers/acpi/pci_root.c |   13 +++-
 drivers/pci/pci-driver.c|   60 ++
 drivers/pci/pci.c   |9 +++
 drivers/pci/pci.h   |1 
 drivers/pci/pcie/Makefile   |3 -
 drivers/pci/pcie/portdrv.h  |   27 
 drivers/pci/pcie/portdrv_acpi.c |2 -
 drivers/pci/pcie/portdrv_bus.c  |   56 -
 drivers/pci/pcie/portdrv_core.c |   77 ++-
 drivers/pci/pcie/portdrv_pci.c  |   40 +---
 drivers/pci/probe.c |   10 +++
 include/linux/pci.h |3 +
 include/linux/pcieport_if.h |4 -
 14 files changed, 131 insertions(+), 178 deletions(-)
 delete mode 100644 drivers/pci/pcie/portdrv_bus.c


Re: [PATCH v2] xhci: Fix front USB ports on ASUS PRIME B350M-A

2018-03-06 Thread Kai Heng Feng

Hi Matthias,

Do you have any concern about this patch?

Hopefully this can get merged for v4.16…

Kai-Heng


Re: [PATCH 3/3] vfio/pci: Add ioeventfd support

2018-03-06 Thread Peter Xu
On Wed, Feb 28, 2018 at 01:15:20PM -0700, Alex Williamson wrote:

[...]

> @@ -1174,6 +1206,8 @@ static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>   vdev->irq_type = VFIO_PCI_NUM_IRQS;
>   mutex_init(&vdev->igate);
>   spin_lock_init(&vdev->irqlock);
> + mutex_init(&vdev->ioeventfds_lock);

Should we destroy the mutex in vfio_pci_remove()?

I see that vfio_pci_device.igate also has no destructor.  I'm
not sure about either.

Thanks,

> + INIT_LIST_HEAD(&vdev->ioeventfds_list);
>  
>   ret = vfio_add_group_dev(&pdev->dev, &vfio_pci_ops, vdev);
>   if (ret) {

-- 
Peter Xu


Re: [RFC] rcu: Prevent expedite reporting within RCU read-side section

2018-03-06 Thread Byungchul Park

On 3/6/2018 10:42 PM, Boqun Feng wrote:

On Tue, Mar 06, 2018 at 02:31:58PM +0900, Byungchul Park wrote:

Hello Paul and RCU folks,

I am not sure I understand this correctly, or that my fix is right. But I
really wonder why sync_rcu_exp_handler() reports the quiescent state even
when the current task is within an RCU read-side section. Am I missing
something?

If my understanding is correct and you agree with it, I can add more logic
that makes it more expedited, by boosting current or marking it urgent
when we fail to report the quiescent state on the IPI.

->8-
 From 0b0191f506c19ce331a1fdb7c2c5a00fb23fbcf2 Mon Sep 17 00:00:00 2001
From: Byungchul Park 
Date: Tue, 6 Mar 2018 13:54:41 +0900
Subject: [RFC] rcu: Prevent expedite reporting within RCU read-side section

We report the quiescent state for this CPU if it is outside an RCU read-side
section at the moment the IPI fires during the expedited process.

However, the current code reports the quiescent state even when:

1) the current task is still within an RCU read-side section
2) the current task has been blocked within the RCU read-side section



If this happens, the task will queue itself in
rcu_preempt_note_context_switch() using rcu_preempt_ctxt_queue(). The gp
kthread will wait for this task to dequeue itself. IOW, we have a
mechanism other than the bottom-up qs reporting tree to wait for this task.
So I think we are fine here.


Right. Basically we consider both the quiescent state within the current
task and the queued tasks on rcu nodes that you mentioned, to control grace
periods when a PREEMPT kernel is used.

Actually my concern was whether it's safe to clear the bit of 'expmask' on
the IPI in all possible cases, even though blocked tasks would anyway
prevent the grace period from ending.

I worried that something subtle might cause problems, but on second thought
the code looks fine, as you said. Thank you for your explanation.


Regards,
Boqun


Since we have not yet reached the quiescent state in these cases, we
shouldn't report it but check again later.

Signed-off-by: Byungchul Park 
---
  kernel/rcu/tree_exp.h | 12 ++--
  1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index 73e1d3d..cc69d14 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -731,13 +731,13 @@ static void sync_rcu_exp_handler(void *info)
/*
 * We are either exiting an RCU read-side critical section (negative
 * values of t->rcu_read_lock_nesting) or are not in one at all
-* (zero value of t->rcu_read_lock_nesting).  Or we are in an RCU
-* read-side critical section that blocked before this expedited
-* grace period started.  Either way, we can immediately report
-* the quiescent state.
+* (zero value of t->rcu_read_lock_nesting). We can immediately
+* report the quiescent state.
 */
-   rdp = this_cpu_ptr(rsp->rda);
-   rcu_report_exp_rdp(rsp, rdp, true);
+   if (t->rcu_read_lock_nesting <= 0) {
+   rdp = this_cpu_ptr(rsp->rda);
+   rcu_report_exp_rdp(rsp, rdp, true);
+   }
  }
  
  /**

--
1.9.1



--
Thanks,
Byungchul


[PATCH] ipmi:ssif: Fix double probe from tryacpi and trydmi

2018-03-06 Thread Jiandi An
The IPMI SSIF driver's tryacpi and trydmi parameters are both
set to true.  Since the IPMI DMI driver was added to create a
platform device for each IPMI device, the SSIF probe is done
twice on the same SMB I2C address for the BMC.  The fix is to
skip trydmi if tryacpi finds the BMC's I2C address in the SPMI
ACPI table and probes it successfully.

Signed-off-by: Jiandi An 
---
 drivers/char/ipmi/ipmi_ssif.c | 35 ---
 1 file changed, 24 insertions(+), 11 deletions(-)

diff --git a/drivers/char/ipmi/ipmi_ssif.c b/drivers/char/ipmi/ipmi_ssif.c
index 9d3b0fa..5c57363 100644
--- a/drivers/char/ipmi/ipmi_ssif.c
+++ b/drivers/char/ipmi/ipmi_ssif.c
@@ -1981,29 +1981,41 @@ static int try_init_spmi(struct SPMITable *spmi)
return new_ssif_client(myaddr, NULL, 0, 0, SI_SPMI, NULL);
 }
 
-static void spmi_find_bmc(void)
+static int spmi_find_bmc(void)
 {
acpi_status  status;
struct SPMITable *spmi;
int  i;
+   int  rc = 0;
 
if (acpi_disabled)
-   return;
+   return -EPERM;
 
if (acpi_failure)
-   return;
+   return -ENODEV;
 
for (i = 0; ; i++) {
status = acpi_get_table(ACPI_SIG_SPMI, i+1,
(struct acpi_table_header **)&spmi);
-   if (status != AE_OK)
-   return;
+   if (status != AE_OK) {
+   if (i == 0)
+   return -ENODEV;
+   else
+   return 0;
+   }
 
-   try_init_spmi(spmi);
+   rc = try_init_spmi(spmi);
+   if (rc)
+   return rc;
}
+
+   return 0;
 }
 #else
-static void spmi_find_bmc(void) { }
+static int spmi_find_bmc(void)
+{
+   return -ENODEV;
+}
 #endif
 
 #ifdef CONFIG_DMI
@@ -2104,12 +2116,13 @@ static int init_ipmi_ssif(void)
   addr[i]);
}
 
-   if (ssif_tryacpi)
+   if (ssif_tryacpi) {
ssif_i2c_driver.driver.acpi_match_table =
ACPI_PTR(ssif_acpi_match);
-
-   if (ssif_tryacpi)
-   spmi_find_bmc();
+   rv = spmi_find_bmc();
+   if (!rv)
+   ssif_trydmi = false;
+   }
 
if (ssif_trydmi) {
rv = platform_driver_register(_driver);
-- 
Jiandi An
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm 
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux 
Foundation Collaborative Project.



[PATCH] staging: lustre: Remove VLA usage

2018-03-06 Thread Kees Cook
The kernel would like to remove all VLA usage. This switches to a
simple kasprintf() instead.

Signed-off-by: Kees Cook 
---
 drivers/staging/lustre/lustre/llite/xattr.c | 19 +--
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/xattr.c b/drivers/staging/lustre/lustre/llite/xattr.c
index 532384c91447..aab4eab64289 100644
--- a/drivers/staging/lustre/lustre/llite/xattr.c
+++ b/drivers/staging/lustre/lustre/llite/xattr.c
@@ -87,7 +87,7 @@ ll_xattr_set_common(const struct xattr_handler *handler,
const char *name, const void *value, size_t size,
int flags)
 {
-   char fullname[strlen(handler->prefix) + strlen(name) + 1];
+   char *fullname;
struct ll_sb_info *sbi = ll_i2sbi(inode);
struct ptlrpc_request *req = NULL;
const char *pv = value;
@@ -141,10 +141,13 @@ ll_xattr_set_common(const struct xattr_handler *handler,
return -EPERM;
}
 
-   sprintf(fullname, "%s%s\n", handler->prefix, name);
+   fullname = kasprintf(GFP_KERNEL, "%s%s\n", handler->prefix, name);
+   if (!fullname)
+   return -ENOMEM;
rc = md_setxattr(sbi->ll_md_exp, ll_inode2fid(inode),
 valid, fullname, pv, size, 0, flags,
 ll_i2suppgid(inode), &req);
+   kfree(fullname);
if (rc) {
if (rc == -EOPNOTSUPP && handler->flags == XATTR_USER_T) {
LCONSOLE_INFO("Disabling user_xattr feature because it is not supported on the server\n");
@@ -364,7 +367,7 @@ static int ll_xattr_get_common(const struct xattr_handler *handler,
   struct dentry *dentry, struct inode *inode,
   const char *name, void *buffer, size_t size)
 {
-   char fullname[strlen(handler->prefix) + strlen(name) + 1];
+   char *fullname;
struct ll_sb_info *sbi = ll_i2sbi(inode);
 #ifdef CONFIG_FS_POSIX_ACL
struct ll_inode_info *lli = ll_i2info(inode);
@@ -411,9 +414,13 @@ static int ll_xattr_get_common(const struct xattr_handler *handler,
if (handler->flags == XATTR_ACL_DEFAULT_T && !S_ISDIR(inode->i_mode))
return -ENODATA;
 #endif
-   sprintf(fullname, "%s%s\n", handler->prefix, name);
-   return ll_xattr_list(inode, fullname, handler->flags, buffer, size,
-OBD_MD_FLXATTR);
+   fullname = kasprintf(GFP_KERNEL, "%s%s\n", handler->prefix, name);
+   if (!fullname)
+   return -ENOMEM;
+   rc = ll_xattr_list(inode, fullname, handler->flags, buffer, size,
+  OBD_MD_FLXATTR);
+   kfree(fullname);
+   return rc;
 }
 
 static ssize_t ll_getxattr_lov(struct inode *inode, void *buf, size_t buf_size)
-- 
2.7.4


-- 
Kees Cook
Pixel Security


[PATCH] staging: iio: meter: Remove reduntant __func__ from debug print

2018-03-06 Thread hariprasath . elango
From: HariPrasath Elango 

dev_dbg includes the function name & line number by default when dynamic
debugging is enabled. Hence __func__ is redundant here and is removed.

Signed-off-by: HariPrasath Elango 
---
 drivers/staging/iio/meter/ade7758_trigger.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/iio/meter/ade7758_trigger.c b/drivers/staging/iio/meter/ade7758_trigger.c
index 1f0d1a0..da489ae 100644
--- a/drivers/staging/iio/meter/ade7758_trigger.c
+++ b/drivers/staging/iio/meter/ade7758_trigger.c
@@ -34,7 +34,7 @@ static int ade7758_data_rdy_trigger_set_state(struct iio_trigger *trig,
 {
struct iio_dev *indio_dev = iio_trigger_get_drvdata(trig);
 
-   dev_dbg(&indio_dev->dev, "%s (%d)\n", __func__, state);
+   dev_dbg(&indio_dev->dev, "(%d)\n", state);
return ade7758_set_irq(&indio_dev->dev, state);
 }
 
-- 
2.10.0.GIT



[PATCH v6] mmc: Export host capabilities to debugfs.

2018-03-06 Thread Harish Jenny K N
This patch exports the host capabilities to debugfs.

The idea of sharing host capabilities over debugfs
came from Abbas Raza.
Earlier discussions:
https://lkml.org/lkml/2018/3/5/357
https://www.spinics.net/lists/linux-mmc/msg48219.html

Signed-off-by: Harish Jenny K N 
---

Changes in v6:
- Used DEFINE_SHOW_ATTRIBUTE

Changes in v5:
- Added parser logic in kernel by using debugfs_create_file for caps and caps2
  instead of debugfs_create_x32
- Changed Author

Changes in v4:
- Moved the creation of nodes to mmc_add_host_debugfs
- Exported caps2
- Renamed host_caps to caps

Changes in v3:
- Removed typecasting of &host->caps to (u32 *)

Changes in v2:
- Changed Author

 drivers/mmc/core/debugfs.c | 120 +
 1 file changed, 120 insertions(+)

diff --git a/drivers/mmc/core/debugfs.c b/drivers/mmc/core/debugfs.c
index c51e0c0..136bdf7 100644
--- a/drivers/mmc/core/debugfs.c
+++ b/drivers/mmc/core/debugfs.c
@@ -225,6 +225,120 @@ static int mmc_clock_opt_set(void *data, u64 val)
 DEFINE_SIMPLE_ATTRIBUTE(mmc_clock_fops, mmc_clock_opt_get, mmc_clock_opt_set,
"%llu\n");

+static int mmc_caps_show(struct seq_file *s, void *unused)
+{
+   struct mmc_host *host = s->private;
+   u32 caps = host->caps;
+
+   seq_puts(s, "\nMMC Host capabilities are:\n");
+   seq_puts(s, "=\n");
+   seq_printf(s, "Can the host do 4 bit transfers :\t%s\n",
+  ((caps & MMC_CAP_4_BIT_DATA) ? "Yes" : "No"));
+   seq_printf(s, "Can do MMC high-speed timing :\t%s\n",
+  ((caps & MMC_CAP_MMC_HIGHSPEED) ? "Yes" : "No"));
+   seq_printf(s, "Can do SD high-speed timing :\t%s\n",
+  ((caps & MMC_CAP_SD_HIGHSPEED) ? "Yes" : "No"));
+   seq_printf(s, "Can signal pending SDIO IRQs :\t%s\n",
+  ((caps & MMC_CAP_SDIO_IRQ) ? "Yes" : "No"));
+   seq_printf(s, "Talks only SPI protocols :\t%s\n",
+  ((caps & MMC_CAP_SPI) ? "Yes" : "No"));
+   seq_printf(s, "Needs polling for card-detection :\t%s\n",
+  ((caps & MMC_CAP_NEEDS_POLL) ? "Yes" : "No"));
+   seq_printf(s, "Can the host do 8 bit transfers :\t%s\n",
+  ((caps & MMC_CAP_8_BIT_DATA) ? "Yes" : "No"));
+   seq_printf(s, "Suspend (e)MMC/SD at idle :\t%s\n",
+  ((caps & MMC_CAP_AGGRESSIVE_PM) ? "Yes" : "No"));
+   seq_printf(s, "Nonremovable e.g. eMMC :\t%s\n",
+  ((caps & MMC_CAP_NONREMOVABLE) ? "Yes" : "No"));
+   seq_printf(s, "Waits while card is busy :\t%s\n",
+  ((caps & MMC_CAP_WAIT_WHILE_BUSY) ? "Yes" : "No"));
+   seq_printf(s, "Allow erase/trim commands :\t%s\n",
+  ((caps & MMC_CAP_ERASE) ? "Yes" : "No"));
+   seq_printf(s, "Can support DDR mode at 3.3V :\t%s\n",
+  ((caps & MMC_CAP_3_3V_DDR) ? "Yes" : "No"));
+   seq_printf(s, "Can support DDR mode at 1.8V :\t%s\n",
+  ((caps & MMC_CAP_1_8V_DDR) ? "Yes" : "No"));
+   seq_printf(s, "Can support DDR mode at 1.2V :\t%s\n",
+  ((caps & MMC_CAP_1_2V_DDR) ? "Yes" : "No"));
+   seq_printf(s, "Can power off after boot :\t%s\n",
+  ((caps & MMC_CAP_POWER_OFF_CARD) ? "Yes" : "No"));
+   seq_printf(s, "CMD14/CMD19 bus width ok :\t%s\n",
+  ((caps & MMC_CAP_BUS_WIDTH_TEST) ? "Yes" : "No"));
+   seq_printf(s, "Host supports UHS SDR12 mode :\t%s\n",
+  ((caps & MMC_CAP_UHS_SDR12) ? "Yes" : "No"));
+   seq_printf(s, "Host supports UHS SDR25 mode :\t%s\n",
+  ((caps & MMC_CAP_UHS_SDR25) ? "Yes" : "No"));
+   seq_printf(s, "Host supports UHS SDR50 mode :\t%s\n",
+  ((caps & MMC_CAP_UHS_SDR50) ? "Yes" : "No"));
+   seq_printf(s, "Host supports UHS SDR104 mode :\t%s\n",
+  ((caps & MMC_CAP_UHS_SDR104) ? "Yes" : "No"));
+   seq_printf(s, "Host supports UHS DDR50 mode :\t%s\n",
+  ((caps & MMC_CAP_UHS_DDR50) ? "Yes" : "No"));
+   seq_printf(s, "Host supports Driver Type A :\t%s\n",
+  ((caps & MMC_CAP_DRIVER_TYPE_A) ? "Yes" : "No"));
+   seq_printf(s, "Host supports Driver Type C :\t%s\n",
+  ((caps & MMC_CAP_DRIVER_TYPE_C) ? "Yes" : "No"));
+   seq_printf(s, "Host supports Driver Type D :\t%s\n",
+  ((caps & MMC_CAP_DRIVER_TYPE_D) ? "Yes" : "No"));
+   seq_printf(s, "RW reqs can be completed within mmc_request_done() :\t%s\n",
+  ((caps & MMC_CAP_DONE_COMPLETE) ? "Yes" : "No"));
+   seq_printf(s, "Enable card detect wake :\t%s\n",
+  ((caps & MMC_CAP_CD_WAKE) ? "Yes" : "No"));
+   seq_printf(s, "Commands during data transfer :\t%s\n",
+  ((caps & MMC_CAP_CMD_DURING_TFR) ? "Yes" : "No"));
+   seq_printf(s, "CMD23 supported. :\t%s\n",
+  ((caps & MMC_CAP_CMD23) ? "Yes" : "No"));
+   seq_printf(s, 

[PATCH] security: Fix IMA Kconfig for dependencies on ARM64

2018-03-06 Thread Jiandi An
The TPM CRB driver (TCG_CRB) provides TPM support on ARM64.  If it
is built as a module, the TPM chip is registered after IMA
init.  tpm_pcr_read() in the IMA driver then fails and IMA
displays the following message even though a TPM chip
is present on the system:

ima: No TPM chip found, activating TPM-bypass! (rc=-19)

Fix the IMA Kconfig to select TCG_CRB so the TPM CRB driver is
built into the kernel and initializes before the IMA driver.

Signed-off-by: Jiandi An 
---
 security/integrity/ima/Kconfig | 1 +
 1 file changed, 1 insertion(+)
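
Note on the ordering problem described above: it can be modelled in a few
lines of userspace C. This is only a sketch of the reasoning, not kernel
code. All demo_* names are hypothetical, and the assumption that a built-in
TPM driver probes before IMA initializes is taken from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model state: has any TPM chip been registered yet? */
static bool demo_tpm_registered;

/* Stand-in for the TPM driver's probe registering a chip. */
static void demo_tpm_driver_probe(void)
{
	demo_tpm_registered = true;
}

/*
 * Stand-in for IMA init: it looks for a TPM once, at init time.
 * Returns 0 if a chip is present, -ENODEV otherwise (ENODEV is 19 on
 * Linux, matching the rc=-19 in the log line quoted above).
 */
static int demo_ima_init(void)
{
	return demo_tpm_registered ? 0 : -ENODEV;
}

/* Built-in driver: probe happens during boot, before IMA init. */
static int demo_boot_builtin(void)
{
	demo_tpm_registered = false;
	demo_tpm_driver_probe();
	return demo_ima_init();
}

/* Modular driver: IMA init runs first, the module is loaded later. */
static int demo_boot_modular(void)
{
	int rc;

	demo_tpm_registered = false;
	rc = demo_ima_init();		/* no chip registered yet */
	demo_tpm_driver_probe();	/* too late for IMA */
	return rc;
}

/* Usage example covering both orderings. */
static void demo_boot_check(void)
{
	assert(demo_boot_builtin() == 0);
	assert(demo_boot_modular() == -ENODEV);
}
```

In the modular case the chip does get registered eventually, but IMA has
already fallen back to TPM-bypass by then, which is the behaviour the
Kconfig change is meant to avoid.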

diff --git a/security/integrity/ima/Kconfig b/security/integrity/ima/Kconfig
index 35ef693..6a8f677 100644
--- a/security/integrity/ima/Kconfig
+++ b/security/integrity/ima/Kconfig
@@ -10,6 +10,7 @@ config IMA
select CRYPTO_HASH_INFO
select TCG_TPM if HAS_IOMEM && !UML
select TCG_TIS if TCG_TPM && X86
+   select TCG_CRB if TCG_TPM && ACPI
select TCG_IBMVTPM if TCG_TPM && PPC_PSERIES
help
  The Trusted Computing Group(TCG) runtime Integrity
-- 
Jiandi An
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm 
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux 
Foundation Collaborative Project.



Re: [PATCH] staging: iio: adc: Remove reduntant __func__ from debug print

2018-03-06 Thread
On Wed, Mar 07, 2018 at 10:40:05AM +0530, hariprasath.ela...@gmail.com wrote:
> From: HariPrasath Elango 
> 
> dev_dbg can include the function name & line number when dynamic
> debugging is enabled. Hence __func__ is redundant here and is removed.
> 
> Signed-off-by: HariPrasath Elango 
> ---
>  drivers/staging/iio/meter/ade7758_trigger.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/staging/iio/meter/ade7758_trigger.c b/drivers/staging/iio/meter/ade7758_trigger.c
> index 1f0d1a0..da489ae 100644
> --- a/drivers/staging/iio/meter/ade7758_trigger.c
> +++ b/drivers/staging/iio/meter/ade7758_trigger.c
> @@ -34,7 +34,7 @@ static int ade7758_data_rdy_trigger_set_state(struct iio_trigger *trig,
>  {
>   struct iio_dev *indio_dev = iio_trigger_get_drvdata(trig);
>  
> - dev_dbg(&indio_dev->dev, "%s (%d)\n", __func__, state);
> + dev_dbg(&indio_dev->dev, "(%d)\n", state);
>   return ade7758_set_irq(&indio_dev->dev, state);
>  }
>  
> -- 
> 2.10.0.GIT
> 

Please ignore this patch as the subject line is wrong. It should be
'meter' and not 'adc'.


Re: [PATCH v2] arm64: dts: msm8916: Add cpu cooling maps

2018-03-06 Thread Viresh Kumar
On Wed, Mar 7, 2018 at 10:30 AM, Amit Kucheria  wrote:
> From: Rajendra Nayak 
>
> Add cpu cooling maps for cpu passive trip points. The cpu cooling
> device states are mapped to cpufreq based scaling frequencies.
>
> Signed-off-by: Rajendra Nayak 
> Signed-off-by: Amit Kucheria 
> ---
>  arch/arm64/boot/dts/qcom/msm8916.dtsi | 19 +++
>  1 file changed, 19 insertions(+)
>
> diff --git a/arch/arm64/boot/dts/qcom/msm8916.dtsi b/arch/arm64/boot/dts/qcom/msm8916.dtsi
> index e468277..66b318e 100644
> --- a/arch/arm64/boot/dts/qcom/msm8916.dtsi
> +++ b/arch/arm64/boot/dts/qcom/msm8916.dtsi
> @@ -15,6 +15,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  / {
> model = "Qualcomm Technologies, Inc. MSM8916";
> @@ -115,6 +116,7 @@
> cpu-idle-states = <_SPC>;
> clocks = < 0>;
> operating-points-v2 = <_opp_table>;
> +   #cooling-cells = <2>;

LGTM.


[PATCH] staging: iio: adc: Remove reduntant __func__ from debug print

2018-03-06 Thread hariprasath . elango
From: HariPrasath Elango 

dev_dbg can include the function name & line number when dynamic
debugging is enabled. Hence __func__ is redundant here and is removed.

Signed-off-by: HariPrasath Elango 
---
 drivers/staging/iio/meter/ade7758_trigger.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/iio/meter/ade7758_trigger.c b/drivers/staging/iio/meter/ade7758_trigger.c
index 1f0d1a0..da489ae 100644
--- a/drivers/staging/iio/meter/ade7758_trigger.c
+++ b/drivers/staging/iio/meter/ade7758_trigger.c
@@ -34,7 +34,7 @@ static int ade7758_data_rdy_trigger_set_state(struct iio_trigger *trig,
 {
 	struct iio_dev *indio_dev = iio_trigger_get_drvdata(trig);
 
-	dev_dbg(&indio_dev->dev, "%s (%d)\n", __func__, state);
+	dev_dbg(&indio_dev->dev, "(%d)\n", state);
 	return ade7758_set_irq(&indio_dev->dev, state);
 }
 
-- 
2.10.0.GIT



Re: [PATCH v9 14/15] cpufreq: Add module to register cpufreq on Krait CPUs

2018-03-06 Thread Viresh Kumar
On 06-03-18, 20:09, Sricharan R wrote:
> From: Stephen Boyd 
> 
> Register a cpufreq-generic device whenever we detect that a
> "qcom,krait" compatible CPU is present in DT.
> 
> Acked-by: Viresh Kumar 
> [Sricharan: updated to use dev_pm_opp_set_prop_name and
>   nvmem apis]
> Signed-off-by: Sricharan R 
> Signed-off-by: Stephen Boyd 
> ---
>  drivers/cpufreq/Kconfig.arm  |  10 ++
>  drivers/cpufreq/Makefile |   1 +
>  drivers/cpufreq/cpufreq-dt-platdev.c |   5 +
>  drivers/cpufreq/qcom-cpufreq.c   | 183 +++
>  4 files changed, 199 insertions(+)
>  create mode 100644 drivers/cpufreq/qcom-cpufreq.c

Acked-by: Viresh Kumar 

-- 
viresh


[PATCH dts/arm/aspeed-g5 v1] ARM: dts: aspeed-g5: Add IPMI KCS node

2018-03-06 Thread Haiyue Wang
The IPMI KCS device is part of the LPC interface and is used for
communication with the host processor.

Signed-off-by: Haiyue Wang 
---
 arch/arm/boot/dts/aspeed-g5.dtsi | 43 +++-
 1 file changed, 42 insertions(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/aspeed-g5.dtsi b/arch/arm/boot/dts/aspeed-g5.dtsi
index 8eac57c..f443169 100644
--- a/arch/arm/boot/dts/aspeed-g5.dtsi
+++ b/arch/arm/boot/dts/aspeed-g5.dtsi
@@ -267,8 +267,40 @@
ranges = <0x0 0x1e789000 0x1000>;
 
lpc_bmc: lpc-bmc@0 {
-   compatible = "aspeed,ast2500-lpc-bmc";
+   compatible = "aspeed,ast2500-lpc-bmc", "simple-mfd", "syscon";
reg = <0x0 0x80>;
+   reg-io-width = <4>;
+
+   #address-cells = <1>;
+   #size-cells = <1>;
+   ranges = <0x0 0x0 0x80>;
+
+   kcs1: kcs1@0 {
+   compatible = "aspeed,ast2500-kcs-bmc";
+   reg = <0x0 0x80>;
+   interrupts = <8>;
+   kcs_chan = <1>;
+   kcs_addr = <0x0>;
+   status = "disabled";
+   };
+
+   kcs2: kcs2@0 {
+   compatible = "aspeed,ast2500-kcs-bmc";
+   reg = <0x0 0x80>;
+   interrupts = <8>;
+   kcs_chan = <2>;
+   kcs_addr = <0x0>;
+   status = "disabled";
+   };
+
+   kcs3: kcs3@0 {
+   compatible = "aspeed,ast2500-kcs-bmc";
+   reg = <0x0 0x80>;
+   interrupts = <8>;
+   kcs_chan = <3>;
+   kcs_addr = <0x0>;
+   status = "disabled";
+   };
};
 
lpc_host: lpc-host@80 {
@@ -294,6 +326,15 @@
status = "disabled";
};
 
+   kcs4: kcs4@0 {
+   compatible = "aspeed,ast2500-kcs-bmc";
+   reg = <0x0 0xa0>;
+   interrupts = <8>;
+   kcs_chan = <4>;
+   kcs_addr = <0x0>;
+   status = "disabled";
+   };
+
lhc: lhc@20 {
compatible = "aspeed,ast2500-lhc";
reg = <0x20 0x24 0x48 0x8>;
-- 
2.7.4


