Re: [PATCH net-next] modules: allow modprobe load regular elf binaries

2018-03-09 Thread Alexei Starovoitov

On 3/9/18 10:23 AM, Andy Lutomirski wrote:
> On Mar 9, 2018, at 10:15 AM, Greg KH wrote:
>> Oh, and for the record, I like Andy's proposal as well as dumping this
>> into a kernel module "blob" with the exception that this now would take
>> up unswappable memory, which isn't the nicest and is one big reason we
>> removed the in-kernel-memory firmware blobs many years ago.
>
> It might not be totally crazy to back it by tmpfs.

interesting. how do you propose to do it?
Something like:
- create /umh_module_tempxxx dir
- mount tmpfs there
- copy elf into it and exec it?



Re: [PATCH v2] regmap: irq: fix ack-invert

2018-03-09 Thread Tim Harvey
On Wed, Mar 7, 2018 at 6:36 AM, Mark Brown  wrote:
> On Tue, Mar 06, 2018 at 06:57:49AM -0800, Tony Lindgren wrote:
>
>> > By using regmap_irq_update_bits to ACK the interrupts we use the masked
>> > status bits so we take care not to affect any other bits then use
>> > ack_invert to determine if we clear or set those bits.
>
>> This change to use regmap_irq_update_bits() now breaks things for
>> me with cpcap interrupts. So it seems to cause a non-inverted mode
>> regression. There should be no need to read the ack register, I
>> guess that's the whole idea of having a separate ack register :)
>
> Yes, that'd be my expectation as well - the register should be just
> write only.  regmap_update_bits() definitely isn't the right thing here
> since it will suppress the write part of the read/modify/write cycle if
> it detects that it didn't actually modify anything as an optimization.

understood. I will put together a v3 soon.

Thanks,

Tim


Re: [PATCH net-next] modules: allow modprobe load regular elf binaries

2018-03-09 Thread Linus Torvalds
On Fri, Mar 9, 2018 at 10:48 AM, Andy Lutomirski wrote:
>> On Mar 9, 2018, at 10:17 AM, Linus Torvalds wrote:
>>
>> Hmm. I wish we had an "execute blob" model, but we really don't, and
>> it would be hard/impossible to do without pinning the pages in memory.
>>
>
> Why so hard?  We can already execute a struct file for execveat, and Alexei 
> already has this working for umh.
> Surely we can make an immutable (as in even root can’t write it) 
> kernel-internal tmpfs file, execveat it, then unlink it.

And what do you think that does? It pins the memory for the whole
time. As a *copy* of the original file.

Anyway, see my other suggestion that makes this all irrelevant. Just
wait synchronously (until the exit), and just use deny_write_access().

The "synchronous wait" means that you don't have the semantic change
(and really, it's *required* anyway for the whole mutual exclusion
against another thread racing to load the same module), and the
deny_write_access() means that we don't need to make another copy.

Linus


Re: [tip:perf/core] perf/x86/intel: Disable userspace RDPMC usage for large PEBS

2018-03-09 Thread Liang, Kan



On 3/9/2018 12:42 PM, Peter Zijlstra wrote:
> On Fri, Mar 09, 2018 at 09:31:11AM -0500, Vince Weaver wrote:
>> On Fri, 9 Mar 2018, tip-bot for Kan Liang wrote:
>>> Commit-ID:  1af22eba248efe2de25658041a80a3d40fb3e92e
>>> Gitweb: https://git.kernel.org/tip/1af22eba248efe2de25658041a80a3d40fb3e92e
>>> Author: Kan Liang
>>> AuthorDate: Mon, 12 Feb 2018 14:20:35 -0800
>>> Committer:  Ingo Molnar
>>> CommitDate: Fri, 9 Mar 2018 08:22:23 +0100
>>>
>>> perf/x86/intel: Disable userspace RDPMC usage for large PEBS
>>
>> So this whole commit log is about disabling RDPMC usage for "large PEBS"
>> but the actual change disables RDPMC if "PERF_X86_EVENT_FREERUNNING"
>>
>> Either the commit log is really misleading, or else a poor name was chosen
>> for this feature.
>
> It's the same thing, and yes that might want renaming I suppose.

Yes, I will send a patch to rename "FREERUNNING" to "LARGE_PEBS"
and fix the confusion.


Thanks,
Kan



Re: 4.15-rc9 new insecure W+X mapping warning

2018-03-09 Thread Meelis Roos
> This is Intel SE7520JR22S mainboard with 2 64-bit P4 xeons. Earlier 
> kernels up to 4.14 have had W+X checking on but found nothing. Now I 
> tried 4.15.0-rc9-00023-g1f07476ec143 and it gives a new W+X warning. 

Actually, I was wrong about earlier kernels - I just did not have
CONFIG_DEBUG_WX turned on before and earlier kernels did not check it.

Recompiled 4.14 with CONFIG_DEBUG_WX=y and the problem is there. So this 
is not a Linux regression but a peculiarity with the SE7520JR22S, it 
seems.

Is there anything that Linux might be doing wrong?

> [   10.880663] ------------[ cut here ]------------
> [   10.880755] x86/mm: Found insecure W+X mapping at address 
> d051fb08/0x8800
> [   10.880900] WARNING: CPU: 2 PID: 1 at arch/x86/mm/dump_pagetables.c:266 
> note_page+0x718/0xb89
> [   10.881035] Modules linked in:
> [   10.881128] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 
> 4.15.0-rc9-00023-g1f07476ec143 #104
> [   10.881264] Hardware name: Intel  
> /SE7520JR22S, BIOS SE7520JR22.86B.P.10.00.0087.120820051348 12/08/2005
> [   10.881405] RIP: 0010:note_page+0x718/0xb89
> [   10.881491] RSP: :c9013e48 EFLAGS: 00010296
> [   10.881578] RAX: 0051 RBX: c9013ec8 RCX: 
> 8164f938
> [   10.881666] RDX: 0001 RSI: 0092 RDI: 
> 82b468cc
> [   10.881756] RBP: 0061 R08: 0177 R09: 
> 01d7
> [   10.881844] R10: 0720072007200720 R11: 0720072007200720 R12: 
> 
> [   10.881932] R13:  R14: 0001 R15: 
> 88099000
> [   10.882022] FS:  () GS:88003fc8() 
> knlGS:
> [   10.882156] CS:  0010 DS:  ES:  CR0: 80050033
> [   10.882243] CR2:  CR3: 0200a000 CR4: 
> 06e0
> [   10.882331] Call Trace:
> [   10.882423]  ptdump_walk_pgd_level_core+0x367/0x3a5
> [   10.882511]  ptdump_walk_pgd_level_checkwx+0x10/0x3e
> [   10.882602]  kernel_init+0x2e/0x10f
> [   10.882688]  ? rest_init+0xb9/0xb9
> [   10.882775]  ret_from_fork+0x35/0x40
> [   10.882861] Code: fb ff ff 41 f7 c7 00 10 00 00 0f 85 e2 fe ff ff e9 36 fd 
> ff ff c6 05 7d 45 6f 01 01 48 89 f2 48 c7 c7 08 5b ee 81 e8 4b d6 00 00 <0f> 
> ff 48 8b 73 10 e9 bc f9 ff ff 4d 85 ed 0f 84 b9 01 00 00 41 
> [   10.883103] ---[ end trace bc3e2cf1a1adfa39 ]---
> [   10.896336] x86/mm: Checked W+X mappings: FAILED, 266243 W+X pages found.
> [   10.896430] x86/mm: Checking user space page tables
> [   10.909522] x86/mm: Checked W+X mappings: FAILED, 56 W+X pages found.

-- 
Meelis Roos (mr...@linux.ee)


Re: [PATCH net-next] modules: allow modprobe load regular elf binaries

2018-03-09 Thread Kees Cook
On Fri, Mar 9, 2018 at 10:50 AM, Linus Torvalds wrote:
> On Fri, Mar 9, 2018 at 10:43 AM, Kees Cook wrote:
>>
>> Module loading (via kernel_read_file()) already uses
>> deny_write_access(), and so does do_open_execat(). As long as module
>> loading doesn't call allow_write_access() before the execve() has
>> started in the new implementation, I think we'd be covered here.
>
> No. kernel_read_file() only does it *during* the read.

Ah, true. And looking at this again, shouldn't deny_write_access()
happen _before_ the LSM check in kernel_read_file()? That looks like a
problem...

-Kees

-- 
Kees Cook
Pixel Security


Re: [PATCH v2 2/2] riscv/atomic: Strengthen implementations with fences

2018-03-09 Thread Palmer Dabbelt

On Fri, 09 Mar 2018 10:36:44 PST (-0800), parri.and...@gmail.com wrote:

On Fri, Mar 09, 2018 at 09:56:21AM -0800, Palmer Dabbelt wrote:

On Fri, 09 Mar 2018 04:13:40 PST (-0800), parri.and...@gmail.com wrote:
>Atomics present the same issue with locking: release and acquire
>variants need to be strengthened to meet the constraints defined
>by the Linux-kernel memory consistency model [1].
>
>Atomics present a further issue: implementations of atomics such
>as atomic_cmpxchg() and atomic_add_unless() rely on LR/SC pairs,
>which do not give full-ordering with .aqrl; for example, current
>implementations allow the "lr-sc-aqrl-pair-vs-full-barrier" test
>below to end up with the state indicated in the "exists" clause.
>
>In order to "synchronize" LKMM and RISC-V's implementation, this
>commit strengthens the implementations of the atomics operations
>by replacing .rl and .aq with the use of ("lightweight") fences,
>and by replacing .aqrl LR/SC pairs in sequences such as:
>
>  0:  lr.w.aqrl  %0, %addr
>  bne%0, %old, 1f
>  ...
>  sc.w.aqrl  %1, %new, %addr
>  bnez   %1, 0b
>  1:
>
>with sequences of the form:
>
>  0:  lr.w   %0, %addr
>  bne%0, %old, 1f
>  ...
>  sc.w.rl%1, %new, %addr   /* SC-release   */
>  bnez   %1, 0b
>  fence  rw, rw/* "full" fence */
>  1:
>
>following Daniel's suggestion.
>
>These modifications were validated with simulation of the RISC-V
>memory consistency model.
>
>C lr-sc-aqrl-pair-vs-full-barrier
>
>{}
>
>P0(int *x, int *y, atomic_t *u)
>{
>int r0;
>int r1;
>
>WRITE_ONCE(*x, 1);
>r0 = atomic_cmpxchg(u, 0, 1);
>r1 = READ_ONCE(*y);
>}
>
>P1(int *x, int *y, atomic_t *v)
>{
>int r0;
>int r1;
>
>WRITE_ONCE(*y, 1);
>r0 = atomic_cmpxchg(v, 0, 1);
>r1 = READ_ONCE(*x);
>}
>
>exists (u=1 /\ v=1 /\ 0:r1=0 /\ 1:r1=0)
>
>[1] https://marc.info/?l=linux-kernel&m=151930201102853&w=2
>    https://groups.google.com/a/groups.riscv.org/forum/#!topic/isa-dev/hKywNHBkAXM
>    https://marc.info/?l=linux-kernel&m=151633436614259&w=2
>
>Suggested-by: Daniel Lustig 
>Signed-off-by: Andrea Parri 
>Cc: Palmer Dabbelt 
>Cc: Albert Ou 
>Cc: Daniel Lustig 
>Cc: Alan Stern 
>Cc: Will Deacon 
>Cc: Peter Zijlstra 
>Cc: Boqun Feng 
>Cc: Nicholas Piggin 
>Cc: David Howells 
>Cc: Jade Alglave 
>Cc: Luc Maranget 
>Cc: "Paul E. McKenney" 
>Cc: Akira Yokosawa 
>Cc: Ingo Molnar 
>Cc: Linus Torvalds 
>Cc: linux-ri...@lists.infradead.org
>Cc: linux-kernel@vger.kernel.org
>---
> arch/riscv/include/asm/atomic.h  | 417 +--
> arch/riscv/include/asm/cmpxchg.h | 391 +---
> 2 files changed, 588 insertions(+), 220 deletions(-)
>
>diff --git a/arch/riscv/include/asm/atomic.h b/arch/riscv/include/asm/atomic.h
>index e65d1cd89e28b..855115ace98c8 100644
>--- a/arch/riscv/include/asm/atomic.h
>+++ b/arch/riscv/include/asm/atomic.h
>@@ -24,6 +24,20 @@
> #include 
>
> #define ATOMIC_INIT(i) { (i) }
>+
>+#define __atomic_op_acquire(op, args...)   \
>+({ \
>+   typeof(op##_relaxed(args)) __ret  = op##_relaxed(args); \
>+   __asm__ __volatile__(RISCV_ACQUIRE_BARRIER "" ::: "memory");\
>+   __ret;  \
>+})
>+
>+#define __atomic_op_release(op, args...)   \
>+({ \
>+   __asm__ __volatile__(RISCV_RELEASE_BARRIER "" ::: "memory");\
>+   op##_relaxed(args); \
>+})
>+
> static __always_inline int atomic_read(const atomic_t *v)
> {
>return READ_ONCE(v->counter);
>@@ -50,22 +64,23 @@ static __always_inline void atomic64_set(atomic64_t *v, long i)
>  * have the AQ or RL bits set.  These don't return anything, so there's only
>  * one version to worry about.
>  */
>-#define ATOMIC_OP(op, asm_op, I, asm_type, c_type, prefix)                             \
>-static __always_inline void atomic##prefix##_##op(c_type i, atomic##prefix##_t *v)     \
>-{                                                                                      \
>-   __asm__ __volatile__ (                                                              \
>-           "amo" #asm_op "." #asm_type " zero, %1, %0"                                 \
>-           : "+A" (v->counter)                                                         \
>-           : "r" (I)                                                                   \
>-           : "memory");                                                                \
>-}
>+#define ATOMIC_OP(op, asm_op, I, asm_type, c_type, prefix) \
>+static __always_inline \
>+void atomic##prefix##_##op(c_type i, atomic##prefix##_t *v)\
>+{ 

Re: [PATCH net-next] modules: allow modprobe load regular elf binaries

2018-03-09 Thread David Miller
From: Alexei Starovoitov 
Date: Fri, 9 Mar 2018 10:50:49 -0800

> On 3/9/18 10:23 AM, Andy Lutomirski wrote:
>> It might not be totally crazy to back it by tmpfs.
> 
> interesting. how do you propose to do it?
> Something like:
> - create /umh_module_tempxxx dir
> - mount tmpfs there
> - copy elf into it and exec it?

I think the idea is that it's an internal tmpfs mount that only
the kernel has access to.

And I don't think that even hurts your debuggability concerns.  The
user can just attach using the foo.ko file in the actual filesystem.



Re: [PATCH][V2] firmware: dmi_scan: add DMI_OEM_STRING support to dmi_matches

2018-03-09 Thread Alex Hung
On Fri, Mar 9, 2018 at 5:33 AM, Jean Delvare  wrote:
> Hi Alex,
>
> On Tue, 27 Feb 2018 22:48:14 -0800, Alex Hung wrote:
>> OEM strings are defined by each OEM and they contain customized and
>> useful OEM information. Supporting it provides more flexible uses of
>> the dmi_matches function.
>>
>> Signed-off-by: Alex Hung 
>> ---
>>  drivers/firmware/dmi_scan.c | 11 +--
>>  include/linux/mod_devicetable.h |  1 +
>>  2 files changed, 10 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/firmware/dmi_scan.c b/drivers/firmware/dmi_scan.c
>> index e763e14..c712e66 100644
>> --- a/drivers/firmware/dmi_scan.c
>> +++ b/drivers/firmware/dmi_scan.c
>> @@ -775,7 +775,15 @@ static bool dmi_matches(const struct dmi_system_id *dmi)
>>   int s = dmi->matches[i].slot;
>>   if (s == DMI_NONE)
>>   break;
>> - if (dmi_ident[s]) {
>> + if (s == DMI_OEM_STRING) {
>> + /* DMI_OEM_STRING must be exact match */
>> + const struct dmi_device *valid;
>> +
>> + valid = dmi_find_device(DMI_DEV_TYPE_OEM_STRING,
>> + dmi->matches[i].substr, NULL);
>> + if (valid)
>> + continue;
>> + } else if (dmi_ident[s]) {
>>   if (dmi->matches[i].exact_match) {
>>   if (!strcmp(dmi_ident[s],
>>   dmi->matches[i].substr))
>> @@ -786,7 +794,6 @@ static bool dmi_matches(const struct dmi_system_id *dmi)
>>   continue;
>>   }
>>   }
>> -
>>   /* No match */
>>   return false;
>>   }
>
> Please avoid gratuitous blank line changes.
>
>> diff --git a/include/linux/mod_devicetable.h b/include/linux/mod_devicetable.h
>> index 48fb2b4..7d361be 100644
>> --- a/include/linux/mod_devicetable.h
>> +++ b/include/linux/mod_devicetable.h
>> @@ -502,6 +502,7 @@ enum dmi_field {
>>   DMI_CHASSIS_SERIAL,
>>   DMI_CHASSIS_ASSET_TAG,
>>   DMI_STRING_MAX,
>> + DMI_OEM_STRING, /* special case - will not be in dmi_ident */
>>  };
>>
>>  struct dmi_strmatch {
>
> Other than this, I'm happy with this version, so with the blank line
> restored:
>
> Reviewed-by: Jean Delvare 

Thank you very much.

>
> However it doesn't make sense to commit this change unless there will
> be at least one user of it. What is the status of the piece of code
> which was supposed to use this new feature?

The original use of DMI for _OSI is no longer needed - the OEM _OSI
string will always be enabled. However, this patch is still needed
because DMI_OEM_STRING is more suitable for many DMI quirks,
especially for Dell systems. Many, if not all, DMI quirks for Dell
systems that use DMI_PRODUCT_NAME can be (and should be) replaced by
DMI_OEM_STRING because 1) the OEM string contains the system id,
2) multiple product names can be used for the same system id, and
3) the number of DMI quirks can be reduced.

For example, the DMI_MATCH(DMI_PRODUCT_NAME, "OptiPlex 9020M") in
commit 1f59ab2783aed04f131 can be replaced by
DMI_MATCH_EXACT(DMI_OEM_STRING, "1[0669]")

I will start sending DMI quirks with DMI_OEM_STRING myself and perhaps
send a cleanup patch to replace DMI_PRODUCT_NAME with DMI_OEM_STRING
for the Dell systems I have access to. With this patch in place first,
I will be able to convince others to use DMI_OEM_STRING because there
will be less risk of spending time in vain.

Cheers,
Alex Hung

>
> --
> Jean Delvare
> SUSE L3 Support


Re: [PATCH net-next] modules: allow modprobe load regular elf binaries

2018-03-09 Thread David Miller
From: Linus Torvalds 
Date: Fri, 9 Mar 2018 10:53:45 -0800

> Anyway, see my other suggestion that makes this all irrelevant. Just
> wait synchronously (until the exit), and just use deny_write_access().

What exit?

Once the helper UMH is invoked, it runs asynchronously taking eBPF
translation requests.


Re: [PATCH 1/2] dt-bindings: serial: stm32: add RS485 optional properties

2018-03-09 Thread Greg Kroah-Hartman
On Wed, Feb 28, 2018 at 10:56:18AM +, Bich HEMON wrote:
> Add options for enabling RS485 hardware control and configuring
> Driver Enable signal:
> - rs485-rts-delay
> - rs485-rx-during-tx
> - rs485-rts-active-low
> - linux,rs485-enabled-at-boot-time
> 
> Signed-off-by: Bich Hemon 
> Reviewed-by: Rob Herring 
> ---
>  Documentation/devicetree/bindings/serial/st,stm32-usart.txt | 2 ++
>  1 file changed, 2 insertions(+)

This series does not apply to my tty-next tree at all.  Can you rebase
and resend?

thanks,

greg k-h


Re: [PATCH net-next] modules: allow modprobe load regular elf binaries

2018-03-09 Thread Alexei Starovoitov

On 3/9/18 10:50 AM, Linus Torvalds wrote:
> On Fri, Mar 9, 2018 at 10:43 AM, Kees Cook wrote:
>> Module loading (via kernel_read_file()) already uses
>> deny_write_access(), and so does do_open_execat(). As long as module
>> loading doesn't call allow_write_access() before the execve() has
>> started in the new implementation, I think we'd be covered here.
>
> No. kernel_read_file() only does it *during* the read.
>
> So there's a huge big honking gap between the two.
>
> Also, the second part of my suggestion was to be entirely synchronous
> with the whole execution of the process, and do it within the "we do
> mutual exclusion for modules with the same name" logic.
>
> Note that Andrei's patch uses UMH_WAIT_EXEC. That's basically
> "vfork+exec" - it only waits for the exec to have started, it doesn't
> wait for the whole thing.


It's not waiting for the whole thing, because once bpfilter starts it
stays running/sleeping because it's stateful. It needs normal
malloc-ed memory to keep the state of iptable->bpf translation that
it will use later during subsequent translation calls.
Theoretically it can use bpf maps pinned in kernel memory to keep
this state, but then it's non-swappable. It's better to keep bpfilter
state in its own user memory.



[PATCH v2 00/13] PCI: Simplify PCIe port driver

2018-03-09 Thread Bjorn Helgaas
This is an attempt to move a few things out of the port driver.

I added these new patches since v1:

  Merge pcieport_if.h into portdrv.h
Merge pcieport_if.h and portdrv.h to reduce clutter

  Remove unnecessary "pcie_ports=auto" parameter
This is the default setting anyway, so specifying the parameter doesn't
do anything.

  Encapsulate pcie_ports_auto inside the port driver
"pcie_ports_auto" was declared in linux/pci.h even though nobody
outside the port driver used it.

  Rename and reverse sense of pcie_ports_auto
"pcie_ports_auto" is connected with the "pcie_ports=native" parameter,
so rename it to match.

Other changes since v1:
  - Rebase onto my pci/portdrv branch.
  - Rename pcie_resume_early() to pcie_pme_root_status_cleanup() as
suggested by Rafael.
  - Add Rafael's Reviewed-by tags.

v1: https://lkml.kernel.org/r/152040297576.240786.1532465558381209070.st...@bhelgaas-glaptop.roam.corp.google.com

---

Bjorn Helgaas (13):
  PCI/portdrv: Merge pcieport_if.h into portdrv.h
  PCI/PM: Move pcie_clear_root_pme_status() to core
  PCI/PM: Clear PCIe PME Status bit in core, not PCIe port driver
  PCI/PM: Clear PCIe PME Status bit for Root Complex Event Collectors
  PCI/portdrv: Disable port driver in compat mode
  PCI/portdrv: Remove pcie_port_bus_type link order dependency
  PCI/portdrv: Remove unused PCIE_PORT_SERVICE_VC
  PCI/portdrv: Simplify PCIe feature permission checking
  PCI/portdrv: Remove unnecessary include of 
  PCI/portdrv: Remove "pcie_hp=nomsi" kernel parameter
  PCI/portdrv: Remove unnecessary "pcie_ports=auto" parameter
  PCI/portdrv: Encapsulate pcie_ports_auto inside the port driver
  PCI/portdrv: Rename and reverse sense of pcie_ports_auto


 Documentation/admin-guide/kernel-parameters.txt |   19 ++---
 drivers/acpi/pci_root.c |   13 +++
 drivers/pci/hotplug/pciehp.h|2 -
 drivers/pci/pci-driver.c|   59 +++
 drivers/pci/pci.c   |9 ++
 drivers/pci/pci.h   |1 
 drivers/pci/pcie/Makefile   |3 -
 drivers/pci/pcie/aer/aerdrv.h   |2 -
 drivers/pci/pcie/pcie-dpc.c |2 -
 drivers/pci/pcie/pcieport_if.h  |   71 ---
 drivers/pci/pcie/pme.c  |1 
 drivers/pci/pcie/portdrv.h  |   88 ---
 drivers/pci/pcie/portdrv_acpi.c |3 -
 drivers/pci/pcie/portdrv_bus.c  |   56 ---
 drivers/pci/pcie/portdrv_core.c |   73 +++
 drivers/pci/pcie/portdrv_pci.c  |   54 ++
 drivers/pci/probe.c |   10 +++
 include/linux/pci.h |5 +
 18 files changed, 198 insertions(+), 273 deletions(-)
 delete mode 100644 drivers/pci/pcie/pcieport_if.h
 delete mode 100644 drivers/pci/pcie/portdrv_bus.c


[PATCH v2 02/13] PCI/PM: Move pcie_clear_root_pme_status() to core

2018-03-09 Thread Bjorn Helgaas
From: Bjorn Helgaas 

Move pcie_clear_root_pme_status() from the port driver to the PCI core so
it will be available even when the port driver isn't present.  No
functional change intended.

Signed-off-by: Bjorn Helgaas 
Reviewed-by: Rafael J. Wysocki 
---
 drivers/pci/pci.c  |9 +
 drivers/pci/pci.h  |1 +
 drivers/pci/pcie/portdrv.h |2 --
 drivers/pci/pcie/portdrv_pci.c |9 -
 4 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index f6a4dd10d9b0..120e3393fc35 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1683,6 +1683,15 @@ int pci_set_pcie_reset_state(struct pci_dev *dev, enum pcie_reset_state state)
 }
 EXPORT_SYMBOL_GPL(pci_set_pcie_reset_state);
 
+/**
+ * pcie_clear_root_pme_status - Clear root port PME interrupt status.
+ * @dev: PCIe root port or event collector.
+ */
+void pcie_clear_root_pme_status(struct pci_dev *dev)
+{
+   pcie_capability_set_dword(dev, PCI_EXP_RTSTA, PCI_EXP_RTSTA_PME);
+}
+
 /**
  * pci_check_pme_status - Check if given device has generated PME.
  * @dev: Device to check.
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index fcd81911b127..813ca2c895d8 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -71,6 +71,7 @@ void pci_update_current_state(struct pci_dev *dev, pci_power_t state);
 void pci_power_up(struct pci_dev *dev);
 void pci_disable_enabled_device(struct pci_dev *dev);
 int pci_finish_runtime_suspend(struct pci_dev *dev);
+void pcie_clear_root_pme_status(struct pci_dev *dev);
 int __pci_pme_wakeup(struct pci_dev *dev, void *ign);
 void pci_pme_restore(struct pci_dev *dev);
 bool pci_dev_keep_suspended(struct pci_dev *dev);
diff --git a/drivers/pci/pcie/portdrv.h b/drivers/pci/pcie/portdrv.h
index d4009e35702c..7086086e45d0 100644
--- a/drivers/pci/pcie/portdrv.h
+++ b/drivers/pci/pcie/portdrv.h
@@ -93,8 +93,6 @@ void pcie_port_bus_unregister(void);
 
 struct pci_dev;
 
-void pcie_clear_root_pme_status(struct pci_dev *dev);
-
 #ifdef CONFIG_HOTPLUG_PCI_PCIE
 extern bool pciehp_msi_disabled;
 
diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index 977bd3cca2e5..d6f10a97d400 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -49,15 +49,6 @@ __setup("pcie_ports=", pcie_port_setup);
 
 /* global data */
 
-/**
- * pcie_clear_root_pme_status - Clear root port PME interrupt status.
- * @dev: PCIe root port or event collector.
- */
-void pcie_clear_root_pme_status(struct pci_dev *dev)
-{
-   pcie_capability_set_dword(dev, PCI_EXP_RTSTA, PCI_EXP_RTSTA_PME);
-}
-
 static int pcie_portdrv_restore_config(struct pci_dev *dev)
 {
int retval;



[PATCH v2 01/13] PCI/portdrv: Merge pcieport_if.h into portdrv.h

2018-03-09 Thread Bjorn Helgaas
From: Bjorn Helgaas 

pcieport_if.h contained the interfaces to register port service driver,
e.g., pcie_port_service_register().  portdrv.h contained internal data
structures of the port driver.

I don't think it's worth keeping those files separate, since both headers
and their users are all inside the PCI core.

Merge pcieport_if.h directly in drivers/pci/pcie/portdrv.h and update the
users to include that instead.

Signed-off-by: Bjorn Helgaas 
---
 drivers/pci/hotplug/pciehp.h|2 +
 drivers/pci/pcie/aer/aerdrv.h   |2 +
 drivers/pci/pcie/pcie-dpc.c |2 +
 drivers/pci/pcie/pcieport_if.h  |   71 ---
 drivers/pci/pcie/pme.c  |1 -
 drivers/pci/pcie/portdrv.h  |   61 +-
 drivers/pci/pcie/portdrv_acpi.c |1 -
 drivers/pci/pcie/portdrv_bus.c  |1 -
 drivers/pci/pcie/portdrv_core.c |1 -
 drivers/pci/pcie/portdrv_pci.c  |1 -
 10 files changed, 63 insertions(+), 80 deletions(-)
 delete mode 100644 drivers/pci/pcie/pcieport_if.h

diff --git a/drivers/pci/hotplug/pciehp.h b/drivers/pci/hotplug/pciehp.h
index 08072bcaa381..88e917c9120f 100644
--- a/drivers/pci/hotplug/pciehp.h
+++ b/drivers/pci/hotplug/pciehp.h
@@ -23,7 +23,7 @@
 #include 
 #include 
 
-#include "../pcie/pcieport_if.h"
+#include "../pcie/portdrv.h"
 
 #define MY_NAME"pciehp"
 
diff --git a/drivers/pci/pcie/aer/aerdrv.h b/drivers/pci/pcie/aer/aerdrv.h
index 568326f385b7..a884f68bada4 100644
--- a/drivers/pci/pcie/aer/aerdrv.h
+++ b/drivers/pci/pcie/aer/aerdrv.h
@@ -13,7 +13,7 @@
 #include 
 #include 
 
-#include "../pcieport_if.h"
+#include "../portdrv.h"
 
 #define SYSTEM_ERROR_INTR_ON_MESG_MASK (PCI_EXP_RTCTL_SECEE|   \
PCI_EXP_RTCTL_SENFEE|   \
diff --git a/drivers/pci/pcie/pcie-dpc.c b/drivers/pci/pcie/pcie-dpc.c
index bac895de4c72..8c57d607e603 100644
--- a/drivers/pci/pcie/pcie-dpc.c
+++ b/drivers/pci/pcie/pcie-dpc.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 
-#include "pcieport_if.h"
+#include "portdrv.h"
 #include "../pci.h"
 #include "aer/aerdrv.h"
 
diff --git a/drivers/pci/pcie/pcieport_if.h b/drivers/pci/pcie/pcieport_if.h
deleted file mode 100644
index b69769dbf659..
--- a/drivers/pci/pcie/pcieport_if.h
+++ /dev/null
@@ -1,71 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-/*
- * File:   pcieport_if.h
- * Purpose:PCI Express Port Bus Driver's IF Data Structure
- *
- * Copyright (C) 2004 Intel
- * Copyright (C) Tom Long Nguyen (tom.l.ngu...@intel.com)
- */
-
-#ifndef _PCIEPORT_IF_H_
-#define _PCIEPORT_IF_H_
-
-/* Port Type */
-#define PCIE_ANY_PORT  (~0)
-
-/* Service Type */
-#define PCIE_PORT_SERVICE_PME_SHIFT    0   /* Power Management Event */
-#define PCIE_PORT_SERVICE_PME  (1 << PCIE_PORT_SERVICE_PME_SHIFT)
-#define PCIE_PORT_SERVICE_AER_SHIFT    1   /* Advanced Error Reporting */
-#define PCIE_PORT_SERVICE_AER  (1 << PCIE_PORT_SERVICE_AER_SHIFT)
-#define PCIE_PORT_SERVICE_HP_SHIFT     2   /* Native Hotplug */
-#define PCIE_PORT_SERVICE_HP   (1 << PCIE_PORT_SERVICE_HP_SHIFT)
-#define PCIE_PORT_SERVICE_VC_SHIFT     3   /* Virtual Channel */
-#define PCIE_PORT_SERVICE_VC   (1 << PCIE_PORT_SERVICE_VC_SHIFT)
-#define PCIE_PORT_SERVICE_DPC_SHIFT    4   /* Downstream Port Containment */
-#define PCIE_PORT_SERVICE_DPC  (1 << PCIE_PORT_SERVICE_DPC_SHIFT)
-
-struct pcie_device {
-   int irq;/* Service IRQ/MSI/MSI-X Vector */
-   struct pci_dev *port;   /* Root/Upstream/Downstream Port */
-   u32 service;/* Port service this device represents */
-   void*priv_data; /* Service Private Data */
-   struct device   device; /* Generic Device Interface */
-};
-#define to_pcie_device(d) container_of(d, struct pcie_device, device)
-
-static inline void set_service_data(struct pcie_device *dev, void *data)
-{
-   dev->priv_data = data;
-}
-
-static inline void *get_service_data(struct pcie_device *dev)
-{
-   return dev->priv_data;
-}
-
-struct pcie_port_service_driver {
-   const char *name;
-   int (*probe) (struct pcie_device *dev);
-   void (*remove) (struct pcie_device *dev);
-   int (*suspend) (struct pcie_device *dev);
-   int (*resume) (struct pcie_device *dev);
-
-   /* Device driver may resume normal operations */
-   void (*error_resume)(struct pci_dev *dev);
-
-   /* Link Reset Capability - AER service driver specific */
-   pci_ers_result_t (*reset_link) (struct pci_dev *dev);
-
-   int port_type;  /* Type of the port this driver can handle */
-   u32 service;/* Port service this device represents */
-
-   struct device_driver driver;
-};
-#define to_service_driver(d) \
-   container_of(d, struct pcie_port_service_driver, driver)
-
-int pcie_port_service_register(struct pcie_port_service_driver *new);
-void pcie_port_service_unregister(struct p

[PATCH v2 04/13] PCI/PM: Clear PCIe PME Status bit for Root Complex Event Collectors

2018-03-09 Thread Bjorn Helgaas
From: Bjorn Helgaas 

Per PCIe r4.0, sec 6.1.6, Root Complex Event Collectors can generate PME
interrupts on behalf of Root Complex Integrated Endpoints.

Linux does not currently enable PME interrupts from RC Event Collectors,
but fe31e69740ed ("PCI/PCIe: Clear Root PME Status bits early during system
resume") suggests PME interrupts may be enabled by the platform for ACPI-
based runtime wakeup.

Clear the PCIe PME Status bit for Root Complex Event Collectors during
resume, just like we already do for Root Ports.

If the BIOS enables PME interrupts for an event collector and neglects to
clear the status bit on resume, this change should fix the same bug as
fe31e69740ed (PMEs not working after waking from a sleep state), but for
Root Complex Integrated Endpoints.

Signed-off-by: Bjorn Helgaas 
Reviewed-by: Rafael J. Wysocki 
---
 drivers/pci/pci-driver.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index e561fa0f456c..204d2b54c2a4 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -533,7 +533,8 @@ static void pcie_pme_root_status_cleanup(struct pci_dev *pci_dev)
 * Clear those bits now just in case (shouldn't hurt).
 */
if (pci_is_pcie(pci_dev) &&
-   pci_pcie_type(pci_dev) == PCI_EXP_TYPE_ROOT_PORT)
+   (pci_pcie_type(pci_dev) == PCI_EXP_TYPE_ROOT_PORT ||
+pci_pcie_type(pci_dev) == PCI_EXP_TYPE_RC_EC))
pcie_clear_root_pme_status(pci_dev);
 }
 



[PATCH v2 03/13] PCI/PM: Clear PCIe PME Status bit in core, not PCIe port driver

2018-03-09 Thread Bjorn Helgaas
From: Bjorn Helgaas 

fe31e69740ed ("PCI/PCIe: Clear Root PME Status bits early during system
resume") added a .resume_noirq() callback to the PCIe port driver to clear
the PME Status bit during resume to work around a BIOS issue.

The BIOS evidently enabled PME interrupts for ACPI-based runtime wakeups
but did not clear the PME Status bit during resume, which meant PMEs after
resume did not trigger interrupts because PME Status did not transition
from cleared to set.

The fix was in the PCIe port driver, so it worked when CONFIG_PCIEPORTBUS
was set.  But I think we *always* want the fix because the platform may use
PME interrupts even if Linux is built without the PCIe port driver.

Move the fix from the port driver to the PCI core so we can work around
this "PME doesn't work after waking from a sleep state" issue regardless of
CONFIG_PCIEPORTBUS.

Signed-off-by: Bjorn Helgaas 
---
 drivers/pci/pci-driver.c   |   14 ++
 drivers/pci/pcie/portdrv_pci.c |   15 ---
 2 files changed, 14 insertions(+), 15 deletions(-)

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 3bed6beda051..e561fa0f456c 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -525,6 +525,18 @@ static void pci_pm_default_resume_early(struct pci_dev *pci_dev)
pci_fixup_device(pci_fixup_resume_early, pci_dev);
 }
 
+static void pcie_pme_root_status_cleanup(struct pci_dev *pci_dev)
+{
+   /*
+* Some BIOSes forget to clear Root PME Status bits after system
+* wakeup, which breaks ACPI-based runtime wakeup on PCI Express.
+* Clear those bits now just in case (shouldn't hurt).
+*/
+   if (pci_is_pcie(pci_dev) &&
+   pci_pcie_type(pci_dev) == PCI_EXP_TYPE_ROOT_PORT)
+   pcie_clear_root_pme_status(pci_dev);
+}
+
 /*
  * Default "suspend" method for devices that have no driver provided suspend,
  * or not even a driver at all (second part).
@@ -873,6 +885,8 @@ static int pci_pm_resume_noirq(struct device *dev)
if (pci_has_legacy_pm_support(pci_dev))
return pci_legacy_resume_early(dev);
 
+   pcie_pme_root_status_cleanup(pci_dev);
+
if (drv && drv->pm && drv->pm->resume_noirq)
error = drv->pm->resume_noirq(dev);
 
diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index d6f10a97d400..ec9e936c2a5b 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -61,20 +61,6 @@ static int pcie_portdrv_restore_config(struct pci_dev *dev)
 }
 
 #ifdef CONFIG_PM
-static int pcie_port_resume_noirq(struct device *dev)
-{
-   struct pci_dev *pdev = to_pci_dev(dev);
-
-   /*
-* Some BIOSes forget to clear Root PME Status bits after system wakeup
-* which breaks ACPI-based runtime wakeup on PCI Express, so clear those
-* bits now just in case (shouldn't hurt).
-*/
-   if (pci_pcie_type(pdev) == PCI_EXP_TYPE_ROOT_PORT)
-   pcie_clear_root_pme_status(pdev);
-   return 0;
-}
-
 static int pcie_port_runtime_suspend(struct device *dev)
 {
return to_pci_dev(dev)->bridge_d3 ? 0 : -EBUSY;
@@ -102,7 +88,6 @@ static const struct dev_pm_ops pcie_portdrv_pm_ops = {
.thaw   = pcie_port_device_resume,
.poweroff   = pcie_port_device_suspend,
.restore= pcie_port_device_resume,
-   .resume_noirq   = pcie_port_resume_noirq,
.runtime_suspend = pcie_port_runtime_suspend,
.runtime_resume = pcie_port_runtime_resume,
.runtime_idle   = pcie_port_runtime_idle,



[PATCH v2 07/13] PCI/portdrv: Remove unused PCIE_PORT_SERVICE_VC

2018-03-09 Thread Bjorn Helgaas
From: Bjorn Helgaas 

No driver registers for PCIE_PORT_SERVICE_VC, so remove it.

This removes the VC "service" files from /sys/bus/pci_express/devices,
e.g., :07:00.0:pcie108, :08:04.0:pcie208 (all the files that
contained "8" as the last digit of the "pcieXXX" part).  The port driver
created these files for PCIe port devices that have a VC Capability.

Since this reduces PCIE_PORT_DEVICE_MAXSERVICES and moves DPC down into the
spot where VC used to be, the DPC sysfs files will now be named "pcieXX8".
I don't think there's anything useful userspace can do with those files, so
I hope nobody cares about these filenames.

There is no VC driver that calls pcie_port_service_register(), so there
never was a /sys/bus/pci_express/drivers/vc directory.
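
The renaming described above can be checked with a small user-space sketch.
It assumes the "pcieXXX" suffix is built as ((port type - 4) << 8) | service
bit and printed with "%03x", which is consistent with the pcie108 and pcie208
names quoted above if those are an upstream and a downstream port.  The names
service_id(), service_name() and the enums are made up for illustration and
are not the kernel's identifiers.

```c
#include <assert.h>
#include <stdio.h>

/* Port types from the PCIe Capabilities register. */
enum { TYPE_ROOT_PORT = 0x4, TYPE_UPSTREAM = 0x5, TYPE_DOWNSTREAM = 0x6 };

/* Service bit positions after this patch: VC is gone, DPC moved to 3. */
enum { SHIFT_PME = 0, SHIFT_AER = 1, SHIFT_HP = 2, SHIFT_DPC = 3 };

/* Assumed encoding of the numeric suffix (hypothetical helper). */
static unsigned int service_id(int port_type, int shift)
{
	assert(port_type >= TYPE_ROOT_PORT);
	return ((unsigned int)(port_type - 4) << 8) | (1u << shift);
}

/* Build "<device>:pcieXXX" into buf; returns what snprintf() returns. */
static int service_name(char *buf, size_t len, const char *bdf,
			int port_type, int shift)
{
	return snprintf(buf, len, "%s:pcie%03x", bdf,
			service_id(port_type, shift));
}
```

Under that assumption, a DPC service on an upstream port now gets the "108"
suffix that VC used to have, e.g. service_name(buf, sizeof(buf),
"0000:07:00.0", TYPE_UPSTREAM, SHIFT_DPC) with a hypothetical device name.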

Signed-off-by: Bjorn Helgaas 
Reviewed-by: Rafael J. Wysocki 
---
 drivers/pci/pcie/portdrv.h  |6 ++
 drivers/pci/pcie/portdrv_acpi.c |2 +-
 drivers/pci/pcie/portdrv_core.c |   14 --
 3 files changed, 7 insertions(+), 15 deletions(-)

diff --git a/drivers/pci/pcie/portdrv.h b/drivers/pci/pcie/portdrv.h
index 7086086e45d0..7bfd75f9197b 100644
--- a/drivers/pci/pcie/portdrv.h
+++ b/drivers/pci/pcie/portdrv.h
@@ -19,12 +19,10 @@
 #define PCIE_PORT_SERVICE_AER  (1 << PCIE_PORT_SERVICE_AER_SHIFT)
 #define PCIE_PORT_SERVICE_HP_SHIFT 2   /* Native Hotplug */
 #define PCIE_PORT_SERVICE_HP   (1 << PCIE_PORT_SERVICE_HP_SHIFT)
-#define PCIE_PORT_SERVICE_VC_SHIFT 3   /* Virtual Channel */
-#define PCIE_PORT_SERVICE_VC   (1 << PCIE_PORT_SERVICE_VC_SHIFT)
-#define PCIE_PORT_SERVICE_DPC_SHIFT 4   /* Downstream Port Containment */
+#define PCIE_PORT_SERVICE_DPC_SHIFT 3   /* Downstream Port Containment */
 #define PCIE_PORT_SERVICE_DPC  (1 << PCIE_PORT_SERVICE_DPC_SHIFT)
 
-#define PCIE_PORT_DEVICE_MAXSERVICES   5
+#define PCIE_PORT_DEVICE_MAXSERVICES   4
 
 /* Port Type */
 #define PCIE_ANY_PORT  (~0)
diff --git a/drivers/pci/pcie/portdrv_acpi.c b/drivers/pci/pcie/portdrv_acpi.c
index 53f60053bd47..9d12650dc2ae 100644
--- a/drivers/pci/pcie/portdrv_acpi.c
+++ b/drivers/pci/pcie/portdrv_acpi.c
@@ -47,7 +47,7 @@ void pcie_port_acpi_setup(struct pci_dev *port, int *srv_mask)
 
flags = root->osc_control_set;
 
-   *srv_mask = PCIE_PORT_SERVICE_VC | PCIE_PORT_SERVICE_DPC;
+   *srv_mask = PCIE_PORT_SERVICE_DPC;
if (flags & OSC_PCI_EXPRESS_NATIVE_HP_CONTROL)
*srv_mask |= PCIE_PORT_SERVICE_HP;
if (flags & OSC_PCI_EXPRESS_PME_CONTROL)
diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
index 9a41751db332..bf851da97947 100644
--- a/drivers/pci/pcie/portdrv_core.c
+++ b/drivers/pci/pcie/portdrv_core.c
@@ -188,10 +188,8 @@ static int pcie_init_service_irqs(struct pci_dev *dev, int *irqs, int mask)
if (ret < 0)
return -ENODEV;
 
-   for (i = 0; i < PCIE_PORT_DEVICE_MAXSERVICES; i++) {
-   if (i != PCIE_PORT_SERVICE_VC_SHIFT)
-   irqs[i] = pci_irq_vector(dev, 0);
-   }
+   for (i = 0; i < PCIE_PORT_DEVICE_MAXSERVICES; i++)
+   irqs[i] = pci_irq_vector(dev, 0);
 
return 0;
 }
@@ -211,8 +209,7 @@ static int get_port_device_capability(struct pci_dev *dev)
int services = 0;
int cap_mask = 0;
 
-   cap_mask = PCIE_PORT_SERVICE_PME | PCIE_PORT_SERVICE_HP
-   | PCIE_PORT_SERVICE_VC;
+   cap_mask = PCIE_PORT_SERVICE_PME | PCIE_PORT_SERVICE_HP;
if (pci_aer_available())
cap_mask |= PCIE_PORT_SERVICE_AER | PCIE_PORT_SERVICE_DPC;
 
@@ -239,9 +236,6 @@ static int get_port_device_capability(struct pci_dev *dev)
 */
pci_disable_pcie_error_reporting(dev);
}
-   /* VC support */
-   if (pci_find_ext_capability(dev, PCI_EXT_CAP_ID_VC))
-   services |= PCIE_PORT_SERVICE_VC;
/* Root ports are capable of generating PME too */
if ((cap_mask & PCIE_PORT_SERVICE_PME)
&& pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT) {
@@ -331,7 +325,7 @@ int pcie_port_device_register(struct pci_dev *dev)
 */
status = pcie_init_service_irqs(dev, irqs, capabilities);
if (status) {
-   capabilities &= PCIE_PORT_SERVICE_VC | PCIE_PORT_SERVICE_HP;
+   capabilities &= PCIE_PORT_SERVICE_HP;
if (!capabilities)
goto error_disable;
}



[PATCH v2 05/13] PCI/portdrv: Disable port driver in compat mode

2018-03-09 Thread Bjorn Helgaas
From: Bjorn Helgaas 

The "pcie_ports=compat" kernel parameter sets pcie_ports_disabled, which is
intended to disable the PCIe port driver.  But even when it was disabled,
we registered pcie_portdriver so we could work around a BIOS PME issue (see
fe31e69740ed ("PCI/PCIe: Clear Root PME Status bits early during system
resume")).

Registering the driver meant that the pcie_portdrv_probe() path called
pci_enable_device(), pci_save_state(), pm_runtime_set_autosuspend_delay(),
pm_runtime_use_autosuspend(), etc., even when the driver was disabled.

We've since moved the BIOS PME workaround from the port driver to the core,
so stop registering the PCIe port driver in compat mode.

This means "pcie_ports=compat" will now be basically the same as turning
off CONFIG_PCIEPORTBUS completely.

Signed-off-by: Bjorn Helgaas 
Reviewed-by: Rafael J. Wysocki 
---
 drivers/pci/pcie/portdrv_core.c |3 ---
 drivers/pci/pcie/portdrv_pci.c  |2 +-
 2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
index 4268b2fc2c7a..9a41751db332 100644
--- a/drivers/pci/pcie/portdrv_core.c
+++ b/drivers/pci/pcie/portdrv_core.c
@@ -211,9 +211,6 @@ static int get_port_device_capability(struct pci_dev *dev)
int services = 0;
int cap_mask = 0;
 
-   if (pcie_ports_disabled)
-   return 0;
-
cap_mask = PCIE_PORT_SERVICE_PME | PCIE_PORT_SERVICE_HP
| PCIE_PORT_SERVICE_VC;
if (pci_aer_available())
diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index ec9e936c2a5b..5d9d5305ebef 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -261,7 +261,7 @@ static int __init pcie_portdrv_init(void)
int retval;
 
if (pcie_ports_disabled)
-   return pci_register_driver(&pcie_portdriver);
+   return -EACCES;
 
dmi_check_system(pcie_portdrv_dmi_table);
 



[PATCH v2 08/13] PCI/portdrv: Simplify PCIe feature permission checking

2018-03-09 Thread Bjorn Helgaas
From: Bjorn Helgaas 

Some PCIe features (AER, DPC, hotplug, PME) can be managed by either the
platform firmware or the OS, so the host bridge driver may have to request
permission from the platform before using them.  On ACPI systems, this is
done by negotiate_os_control() in acpi_pci_root_add().

The PCIe port driver later uses pcie_port_platform_notify() and
pcie_port_acpi_setup() to figure out whether it can use these features.
But all we need is a single bit for each service, so these interfaces are
needlessly complicated.

Simplify this by adding bits in the struct pci_host_bridge to show when the
OS has permission to use each feature:

  + unsigned int use_aer:1;   /* OS may use PCIe AER */
  + unsigned int use_hotplug:1;   /* OS may use PCIe hotplug */
  + unsigned int use_pme:1;   /* OS may use PCIe PME */

These are set when we create a host bridge, and the host bridge driver can
clear the bits corresponding to any feature the platform doesn't want us to
use.
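
A minimal user-space sketch of that scheme, not the kernel code: the three
bit-fields mirror the ones listed above, while struct host_bridge, the
GRANT_* flags and both helpers are stand-ins invented here to model "set at
creation, cleared by the host bridge driver".

```c
#include <assert.h>

/* Illustrative flags for what the platform granted (e.g. via _OSC). */
#define GRANT_HOTPLUG	(1u << 0)
#define GRANT_PME	(1u << 1)
#define GRANT_AER	(1u << 2)

/* Models only the three new bits of struct pci_host_bridge. */
struct host_bridge {
	unsigned int use_aer:1;		/* OS may use PCIe AER */
	unsigned int use_hotplug:1;	/* OS may use PCIe hotplug */
	unsigned int use_pme:1;		/* OS may use PCIe PME */
};

/* A newly created bridge starts with every feature allowed. */
static void host_bridge_init(struct host_bridge *hb)
{
	assert(hb);
	hb->use_aer = 1;
	hb->use_hotplug = 1;
	hb->use_pme = 1;
}

/* The host bridge driver clears the bit for each feature not granted. */
static void host_bridge_apply_grants(struct host_bridge *hb,
				     unsigned int granted)
{
	assert(hb);
	if (!(granted & GRANT_HOTPLUG))
		hb->use_hotplug = 0;
	if (!(granted & GRANT_AER))
		hb->use_aer = 0;
	if (!(granted & GRANT_PME))
		hb->use_pme = 0;
}
```

A host bridge driver that never calls the second helper leaves all three
features available to the OS, which is the default the text describes.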

Signed-off-by: Bjorn Helgaas 
Reviewed-by: Rafael J. Wysocki 
---
 drivers/acpi/pci_root.c |   13 ++--
 drivers/pci/pcie/Makefile   |1 -
 drivers/pci/pcie/portdrv.h  |   11 --
 drivers/pci/pcie/portdrv_core.c |   42 ---
 drivers/pci/probe.c |   10 +
 include/linux/pci.h |3 +++
 6 files changed, 50 insertions(+), 30 deletions(-)

diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
index 6fc204a52493..65ebefb99815 100644
--- a/drivers/acpi/pci_root.c
+++ b/drivers/acpi/pci_root.c
@@ -871,6 +871,7 @@ struct pci_bus *acpi_pci_root_create(struct acpi_pci_root *root,
struct acpi_device *device = root->device;
int node = acpi_get_node(device->handle);
struct pci_bus *bus;
+   struct pci_host_bridge *host_bridge;
 
info->root = root;
info->bridge = device;
@@ -895,9 +896,17 @@ struct pci_bus *acpi_pci_root_create(struct acpi_pci_root *root,
if (!bus)
goto out_release_info;
 
+   host_bridge = to_pci_host_bridge(bus->bridge);
+   if (!(root->osc_control_set & OSC_PCI_EXPRESS_NATIVE_HP_CONTROL))
+   host_bridge->use_hotplug = 0;
+   if (!(root->osc_control_set & OSC_PCI_EXPRESS_AER_CONTROL))
+   host_bridge->use_aer = 0;
+   if (!(root->osc_control_set & OSC_PCI_EXPRESS_PME_CONTROL))
+   host_bridge->use_pme = 0;
+
pci_scan_child_bus(bus);
-   pci_set_host_bridge_release(to_pci_host_bridge(bus->bridge),
-   acpi_pci_root_release_info, info);
+   pci_set_host_bridge_release(host_bridge, acpi_pci_root_release_info,
+   info);
if (node != NUMA_NO_NODE)
dev_printk(KERN_DEBUG, &bus->dev, "on NUMA node %d\n", node);
return bus;
diff --git a/drivers/pci/pcie/Makefile b/drivers/pci/pcie/Makefile
index e01c10c97b95..11fb633b866c 100644
--- a/drivers/pci/pcie/Makefile
+++ b/drivers/pci/pcie/Makefile
@@ -7,7 +7,6 @@
 obj-$(CONFIG_PCIEASPM) += aspm.o
 
 pcieportdrv-y  := portdrv_core.o portdrv_pci.o
-pcieportdrv-$(CONFIG_ACPI) += portdrv_acpi.o
 
 obj-$(CONFIG_PCIEPORTBUS)  += pcieportdrv.o
 
diff --git a/drivers/pci/pcie/portdrv.h b/drivers/pci/pcie/portdrv.h
index 7bfd75f9197b..ed84e767085f 100644
--- a/drivers/pci/pcie/portdrv.h
+++ b/drivers/pci/pcie/portdrv.h
@@ -123,15 +123,4 @@ static inline bool pcie_pme_no_msi(void) { return false; }
 static inline void pcie_pme_interrupt_enable(struct pci_dev *dev, bool en) {}
 #endif /* !CONFIG_PCIE_PME */
 
-#ifdef CONFIG_ACPI
-void pcie_port_acpi_setup(struct pci_dev *port, int *mask);
-
-static inline void pcie_port_platform_notify(struct pci_dev *port, int *mask)
-{
-   pcie_port_acpi_setup(port, mask);
-}
-#else /* !CONFIG_ACPI */
-static inline void pcie_port_platform_notify(struct pci_dev *port, int *mask){}
-#endif /* !CONFIG_ACPI */
-
 #endif /* _PORTDRV_H_ */
diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
index bf851da97947..589960fdd8a8 100644
--- a/drivers/pci/pcie/portdrv_core.c
+++ b/drivers/pci/pcie/portdrv_core.c
@@ -206,19 +206,20 @@ static int pcie_init_service_irqs(struct pci_dev *dev, int *irqs, int mask)
  */
 static int get_port_device_capability(struct pci_dev *dev)
 {
+   struct pci_host_bridge *host = pci_find_host_bridge(dev->bus);
+   bool native;
int services = 0;
-   int cap_mask = 0;
 
-   cap_mask = PCIE_PORT_SERVICE_PME | PCIE_PORT_SERVICE_HP;
-   if (pci_aer_available())
-   cap_mask |= PCIE_PORT_SERVICE_AER | PCIE_PORT_SERVICE_DPC;
-
-   if (pcie_ports_auto)
-   pcie_port_platform_notify(dev, &cap_mask);
+   /*
+* If the user specified "pcie_ports=native", use the PCIe services
+* regardless of whether the platform has given us permission.  On
+* ACPI systems, this means we ignore

[PATCH v2 10/13] PCI/portdrv: Remove "pcie_hp=nomsi" kernel parameter

2018-03-09 Thread Bjorn Helgaas
From: Bjorn Helgaas 

7570a333d8b0 ("PCI: Add pcie_hp=nomsi to disable MSI/MSI-X for pciehp
driver") added the "pcie_hp=nomsi" kernel parameter to work around this
error on shutdown:

  irq 16: nobody cared (try booting with the "irqpoll" option)
  Pid: 1081, comm: reboot Not tainted 3.2.0 #1
  ...
  Disabling IRQ #16

This happened on an unspecified system (possibly involving the Integrated
Device Technology, Inc. Device 807f bridge) where "an un-wanted interrupt
is generated when PCI driver switches from MSI/MSI-X to INTx while shutting
down the device."

The implication was that the device was buggy, but it is normal for a
device to use INTx after MSI/MSI-X have been disabled.  The only problem
was that the driver was still attached and it wasn't prepared for INTx
interrupts.  Prarit Bhargava fixed this issue with fda78d7a0ead ("PCI/MSI:
Stop disabling MSI/MSI-X in pci_device_shutdown()").

There is no automated way to set this parameter, so it's not very useful
for distributions or end users.  It's really only useful for debugging, and
we have "pci=nomsi" for that purpose.

Revert 7570a333d8b0 to remove the "pcie_hp=nomsi" parameter.

Signed-off-by: Bjorn Helgaas 
Reviewed-by: Rafael J. Wysocki 
CC: MUNEDA Takahiro 
CC: Kenji Kaneshige 
CC: Prarit Bhargava 
---
 Documentation/admin-guide/kernel-parameters.txt |4 
 drivers/pci/pcie/portdrv.h  |   12 
 drivers/pci/pcie/portdrv_core.c |   20 +++-
 3 files changed, 3 insertions(+), 33 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 1d1d53f85ddd..761749562165 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3130,10 +3130,6 @@
force   Enable ASPM even on devices that claim not to support it.
WARNING: Forcing ASPM on may cause system lockups.
 
-   pcie_hp=[PCIE] PCI Express Hotplug driver options:
-   nomsi   Do not use MSI for PCI Express Native Hotplug (this
-   makes all PCIe ports use INTx for hotplug services).
-
pcie_ports= [PCIE] PCIe ports handling:
auto    Ask the BIOS whether or not to use native PCIe services
associated with PCIe ports (PME, hot-plug, AER).  Use
diff --git a/drivers/pci/pcie/portdrv.h b/drivers/pci/pcie/portdrv.h
index ed84e767085f..86368f9341d7 100644
--- a/drivers/pci/pcie/portdrv.h
+++ b/drivers/pci/pcie/portdrv.h
@@ -91,18 +91,6 @@ void pcie_port_bus_unregister(void);
 
 struct pci_dev;
 
-#ifdef CONFIG_HOTPLUG_PCI_PCIE
-extern bool pciehp_msi_disabled;
-
-static inline bool pciehp_no_msi(void)
-{
-   return pciehp_msi_disabled;
-}
-
-#else  /* !CONFIG_HOTPLUG_PCI_PCIE */
-static inline bool pciehp_no_msi(void) { return false; }
-#endif /* !CONFIG_HOTPLUG_PCI_PCIE */
-
 #ifdef CONFIG_PCIE_PME
 extern bool pcie_pme_msi_disabled;
 
diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
index 589960fdd8a8..ed24a407518a 100644
--- a/drivers/pci/pcie/portdrv_core.c
+++ b/drivers/pci/pcie/portdrv_core.c
@@ -20,17 +20,6 @@
 #include "../pci.h"
 #include "portdrv.h"
 
-bool pciehp_msi_disabled;
-
-static int __init pciehp_setup(char *str)
-{
-   if (!strncmp(str, "nomsi", 5))
-   pciehp_msi_disabled = true;
-
-   return 1;
-}
-__setup("pcie_hp=", pciehp_setup);
-
 /**
  * release_pcie_device - free PCI Express port service device structure
  * @dev: Port service device to release
@@ -168,16 +157,13 @@ static int pcie_init_service_irqs(struct pci_dev *dev, int *irqs, int mask)
irqs[i] = -1;
 
/*
-* If we support PME or hotplug, but we can't use MSI/MSI-X for
-* them, we have to fall back to INTx or other interrupts, e.g., a
-* system shared interrupt.
+* If we support PME but can't use MSI/MSI-X for it, we have to
+* fall back to INTx or other interrupts, e.g., a system shared
+* interrupt.
 */
if ((mask & PCIE_PORT_SERVICE_PME) && pcie_pme_no_msi())
goto legacy_irq;
 
-   if ((mask & PCIE_PORT_SERVICE_HP) && pciehp_no_msi())
-   goto legacy_irq;
-
/* Try to use MSI-X or MSI if supported */
if (pcie_port_enable_irq_vec(dev, irqs, mask) == 0)
return 0;



[PATCH v2 13/13] PCI/portdrv: Rename and reverse sense of pcie_ports_auto

2018-03-09 Thread Bjorn Helgaas
From: Bjorn Helgaas 

The platform may restrict the OS's use of PCIe services, e.g., via the ACPI
_OSC method.  The user may use "pcie_ports=native" to force the port driver
to use PCIe services even if the platform asked us not to.

The "pcie_ports=native" parameter determines the setting of
pcie_ports_auto.  Rename this to pcie_ports_native and reverse the
sense to simplify the code.
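
The effect of the reversed flag can be modelled in a few lines of user-space
C.  This is a sketch under assumptions, not the driver: ports_disabled and
ports_native stand in for pcie_ports_disabled and pcie_ports_native, and
parse_pcie_ports() and may_use_service() are hypothetical helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

static bool ports_disabled;	/* models pcie_ports_disabled */
static bool ports_native;	/* models pcie_ports_native, false by default */

/* Handle the value given to "pcie_ports=" on the command line. */
static void parse_pcie_ports(const char *str)
{
	assert(str);
	if (!strncmp(str, "compat", 6))
		ports_disabled = true;
	else if (!strncmp(str, "native", 6))
		ports_native = true;
}

/* Usable if the user forced native mode or the platform gave permission. */
static bool may_use_service(bool platform_allows)
{
	return ports_native || platform_allows;
}
```

With the flag defaulting to false, each check reads as "forced by the user
or allowed by the platform" without a negated local variable.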

Signed-off-by: Bjorn Helgaas 
---
 drivers/pci/pcie/portdrv.h  |2 +-
 drivers/pci/pcie/portdrv_core.c |   15 ---
 drivers/pci/pcie/portdrv_pci.c  |   10 +-
 3 files changed, 10 insertions(+), 17 deletions(-)

diff --git a/drivers/pci/pcie/portdrv.h b/drivers/pci/pcie/portdrv.h
index 62e28b5afa51..3e0058a5500f 100644
--- a/drivers/pci/pcie/portdrv.h
+++ b/drivers/pci/pcie/portdrv.h
@@ -12,7 +12,7 @@
 
 #include 
 
-extern bool pcie_ports_auto;
+extern bool pcie_ports_native;
 
 /* Service Type */
 #define PCIE_PORT_SERVICE_PME_SHIFT 0   /* Power Management Event */
diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
index ed24a407518a..a1f838f2646a 100644
--- a/drivers/pci/pcie/portdrv_core.c
+++ b/drivers/pci/pcie/portdrv_core.c
@@ -193,17 +193,10 @@ static int pcie_init_service_irqs(struct pci_dev *dev, int *irqs, int mask)
 static int get_port_device_capability(struct pci_dev *dev)
 {
struct pci_host_bridge *host = pci_find_host_bridge(dev->bus);
-   bool native;
int services = 0;
 
-   /*
-* If the user specified "pcie_ports=native", use the PCIe services
-* regardless of whether the platform has given us permission.  On
-* ACPI systems, this means we ignore _OSC.
-*/
-   native = !pcie_ports_auto;
-
-   if (dev->is_hotplug_bridge && (native || host->use_hotplug)) {
+   if (dev->is_hotplug_bridge &&
+   (pcie_ports_native || host->use_hotplug)) {
services |= PCIE_PORT_SERVICE_HP;
 
/*
@@ -215,7 +208,7 @@ static int get_port_device_capability(struct pci_dev *dev)
}
 
if (pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ERR) &&
-   pci_aer_available() && (native || host->use_aer)) {
+   pci_aer_available() && (pcie_ports_native || host->use_aer)) {
services |= PCIE_PORT_SERVICE_AER;
 
/*
@@ -231,7 +224,7 @@ static int get_port_device_capability(struct pci_dev *dev)
 * those yet.
 */
if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT &&
-   (native || host->use_pme)) {
+   (pcie_ports_native || host->use_pme)) {
services |= PCIE_PORT_SERVICE_PME;
 
/*
diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index 8b62192342ac..569fd636b3c4 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -25,18 +25,18 @@
 bool pcie_ports_disabled;
 
 /*
- * If this switch is set, ACPI _OSC will be used to determine whether or not to
- * enable PCIe port native services.
+ * If the user specified "pcie_ports=native", use the PCIe services regardless
+ * of whether the platform has given us permission.  On ACPI systems, this
+ * means we ignore _OSC.
  */
-bool pcie_ports_auto = true;
+bool pcie_ports_native;
 
 static int __init pcie_port_setup(char *str)
 {
if (!strncmp(str, "compat", 6)) {
pcie_ports_disabled = true;
} else if (!strncmp(str, "native", 6)) {
-   pcie_ports_disabled = false;
-   pcie_ports_auto = false;
+   pcie_ports_native = true;
}
 
return 1;



[PATCH v2 11/13] PCI/portdrv: Remove unnecessary "pcie_ports=auto" parameter

2018-03-09 Thread Bjorn Helgaas
From: Bjorn Helgaas 

The "pcie_ports=auto" parameter set pcie_ports_disabled and pcie_ports_auto
to their compiled-in defaults, so specifying the parameter is the same as
not using it at all.

Remove the "pcie_ports=auto" parameter and update the documentation.

Signed-off-by: Bjorn Helgaas 
---
 Documentation/admin-guide/kernel-parameters.txt |   15 +++
 drivers/pci/pcie/portdrv_pci.c  |3 ---
 2 files changed, 7 insertions(+), 11 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 761749562165..26565794a573 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3130,14 +3130,13 @@
force   Enable ASPM even on devices that claim not to support it.
WARNING: Forcing ASPM on may cause system lockups.
 
-   pcie_ports= [PCIE] PCIe ports handling:
-   auto    Ask the BIOS whether or not to use native PCIe services
-   associated with PCIe ports (PME, hot-plug, AER).  Use
-   them only if that is allowed by the BIOS.
-   native  Use native PCIe services associated with PCIe ports
-   unconditionally.
-   compat  Treat PCIe ports as PCI-to-PCI bridges, disable the PCIe
-   ports driver.
+   pcie_ports= [PCIE] PCIe port services handling:
+   native  Use native PCIe services (PME, AER, DPC, PCIe hotplug)
+   even if the platform doesn't give the OS permission to
+   use them.  This may cause conflicts if the platform
+   also tries to use these services.
+   compat  Disable native PCIe services (PME, AER, DPC, PCIe
+   hotplug).
 
pcie_port_pm=   [PCIE] PCIe port power management handling:
off Disable power management of all PCIe ports
diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index 1997d9f2743e..8b62192342ac 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -37,9 +37,6 @@ static int __init pcie_port_setup(char *str)
} else if (!strncmp(str, "native", 6)) {
pcie_ports_disabled = false;
pcie_ports_auto = false;
-   } else if (!strncmp(str, "auto", 4)) {
-   pcie_ports_disabled = false;
-   pcie_ports_auto = true;
}
 
return 1;



[PATCH v2 12/13] PCI/portdrv: Encapsulate pcie_ports_auto inside the port driver

2018-03-09 Thread Bjorn Helgaas
From: Bjorn Helgaas 

"pcie_ports_auto" is only used inside the PCIe port driver itself, so
move it from include/linux/pci.h to portdrv.h so it's not visible to the
whole kernel.

Signed-off-by: Bjorn Helgaas 
---
 drivers/pci/pcie/portdrv.h |2 ++
 include/linux/pci.h|2 --
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pcie/portdrv.h b/drivers/pci/pcie/portdrv.h
index 86368f9341d7..62e28b5afa51 100644
--- a/drivers/pci/pcie/portdrv.h
+++ b/drivers/pci/pcie/portdrv.h
@@ -12,6 +12,8 @@
 
 #include 
 
+extern bool pcie_ports_auto;
+
 /* Service Type */
 #define PCIE_PORT_SERVICE_PME_SHIFT 0   /* Power Management Event */
 #define PCIE_PORT_SERVICE_PME  (1 << PCIE_PORT_SERVICE_PME_SHIFT)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 40aec7a6fdd9..2c0e5d929fd2 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1449,10 +1449,8 @@ static inline int pci_irqd_intx_xlate(struct irq_domain *d,
 
 #ifdef CONFIG_PCIEPORTBUS
 extern bool pcie_ports_disabled;
-extern bool pcie_ports_auto;
 #else
 #define pcie_ports_disabled    true
-#define pcie_ports_auto        false
 #endif
 
 #ifdef CONFIG_PCIEASPM



[PATCH v2 09/13] PCI/portdrv: Remove unnecessary include of

2018-03-09 Thread Bjorn Helgaas
From: Bjorn Helgaas 

portdrv_pci.c doesn't use anything from .  Remove the
include of it.  No functional change intended.

Signed-off-by: Bjorn Helgaas 
Reviewed-by: Rafael J. Wysocki 
---
 drivers/pci/pcie/portdrv_pci.c |1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index 127321e17184..1997d9f2743e 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -17,7 +17,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include "../pci.h"
 #include "portdrv.h"



[PATCH v2 06/13] PCI/portdrv: Remove pcie_port_bus_type link order dependency

2018-03-09 Thread Bjorn Helgaas
From: Bjorn Helgaas 

The pcie_port_bus_type must be registered before drivers that depend on it
can be registered.  Those drivers include:

  pcied_init()# PCIe native hotplug driver
  aer_service_init()  # AER driver
  dpc_service_init()  # DPC driver
  pcie_pme_service_init() # PME driver

Previously we registered pcie_port_bus_type from pcie_portdrv_init(), a
device_initcall.  The callers of pcie_port_service_register() (above) are
also device_initcalls.  This is fragile because the device_initcall
ordering depends on link order, which is not explicit.

Register pcie_port_bus_type from pci_driver_init() along with pci_bus_type.
This removes the link order dependency between portdrv and the pciehp, AER,
DPC, and PCIe PME drivers.
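
The ordering argument can be illustrated with a toy model of initcall
levels.  It is only a sketch: the level numbers follow the kernel's ordering
(postcore before device), but struct initcall, run_initcalls() and the two
init functions are invented here, and a failed registration is modelled as
a plain error return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum level { POSTCORE = 2, DEVICE = 6 };

struct initcall {
	enum level level;
	int (*fn)(void);
};

static bool bus_registered;
static int services_registered;

static int bus_init(void)	/* stands in for registering the bus type */
{
	bus_registered = true;
	return 0;
}

static int service_init(void)	/* stands in for e.g. aer_service_init() */
{
	if (!bus_registered)
		return -1;	/* bus type is not there yet */
	services_registered++;
	return 0;
}

/* Run level by level; within one level, table ("link") order decides. */
static int run_initcalls(const struct initcall *calls, size_t n)
{
	int failures = 0;

	assert(calls || n == 0);
	for (int level = 0; level <= DEVICE; level++)
		for (size_t i = 0; i < n; i++)
			if ((int)calls[i].level == level && calls[i].fn() != 0)
				failures++;
	return failures;
}
```

If both entries sit at DEVICE level and the service happens to come first in
the table, the service fails; moving the bus registration to POSTCORE makes
the table order irrelevant.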

Signed-off-by: Bjorn Helgaas 
Reviewed-by: Rafael J. Wysocki 
---
 drivers/pci/pci-driver.c   |   44 +++-
 drivers/pci/pcie/Makefile  |2 +
 drivers/pci/pcie/portdrv_bus.c |   55 
 drivers/pci/pcie/portdrv_pci.c |   13 +
 4 files changed, 45 insertions(+), 69 deletions(-)
 delete mode 100644 drivers/pci/pcie/portdrv_bus.c

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 204d2b54c2a4..02ecbdafe38b 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include "pci.h"
+#include "pcie/portdrv.h"
 
 struct pci_dynid {
struct list_head node;
@@ -1553,8 +1554,49 @@ struct bus_type pci_bus_type = {
 };
 EXPORT_SYMBOL(pci_bus_type);
 
+#ifdef CONFIG_PCIEPORTBUS
+static int pcie_port_bus_match(struct device *dev, struct device_driver *drv)
+{
+   struct pcie_device *pciedev;
+   struct pcie_port_service_driver *driver;
+
+   if (drv->bus != &pcie_port_bus_type || dev->bus != &pcie_port_bus_type)
+   return 0;
+
+   pciedev = to_pcie_device(dev);
+   driver = to_service_driver(drv);
+
+   if (driver->service != pciedev->service)
+   return 0;
+
+   if ((driver->port_type != PCIE_ANY_PORT) &&
+   (driver->port_type != pci_pcie_type(pciedev->port)))
+   return 0;
+
+   return 1;
+}
+
+struct bus_type pcie_port_bus_type = {
+   .name   = "pci_express",
+   .match  = pcie_port_bus_match,
+};
+EXPORT_SYMBOL_GPL(pcie_port_bus_type);
+#endif
+
 static int __init pci_driver_init(void)
 {
-   return bus_register(&pci_bus_type);
+   int ret;
+
+   ret = bus_register(&pci_bus_type);
+   if (ret)
+   return ret;
+
+#ifdef CONFIG_PCIEPORTBUS
+   ret = bus_register(&pcie_port_bus_type);
+   if (ret)
+   return ret;
+#endif
+
+   return 0;
 }
 postcore_initcall(pci_driver_init);
diff --git a/drivers/pci/pcie/Makefile b/drivers/pci/pcie/Makefile
index 223e4c34c29a..e01c10c97b95 100644
--- a/drivers/pci/pcie/Makefile
+++ b/drivers/pci/pcie/Makefile
@@ -6,7 +6,7 @@
 # Build PCI Express ASPM if needed
 obj-$(CONFIG_PCIEASPM) += aspm.o
 
-pcieportdrv-y  := portdrv_core.o portdrv_pci.o portdrv_bus.o
+pcieportdrv-y  := portdrv_core.o portdrv_pci.o
 pcieportdrv-$(CONFIG_ACPI) += portdrv_acpi.o
 
 obj-$(CONFIG_PCIEPORTBUS)  += pcieportdrv.o
diff --git a/drivers/pci/pcie/portdrv_bus.c b/drivers/pci/pcie/portdrv_bus.c
deleted file mode 100644
index 4969ccf6b214..
--- a/drivers/pci/pcie/portdrv_bus.c
+++ /dev/null
@@ -1,55 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/*
- * File:   portdrv_bus.c
- * Purpose:PCI Express Port Bus Driver's Bus Overloading Functions
- *
- * Copyright (C) 2004 Intel
- * Copyright (C) Tom Long Nguyen (tom.l.ngu...@intel.com)
- */
-
-#include 
-#include 
-#include 
-#include 
-#include 
-
-#include "portdrv.h"
-
-static int pcie_port_bus_match(struct device *dev, struct device_driver *drv);
-
-struct bus_type pcie_port_bus_type = {
-   .name   = "pci_express",
-   .match  = pcie_port_bus_match,
-};
-EXPORT_SYMBOL_GPL(pcie_port_bus_type);
-
-static int pcie_port_bus_match(struct device *dev, struct device_driver *drv)
-{
-   struct pcie_device *pciedev;
-   struct pcie_port_service_driver *driver;
-
-   if (drv->bus != &pcie_port_bus_type || dev->bus != &pcie_port_bus_type)
-   return 0;
-
-   pciedev = to_pcie_device(dev);
-   driver = to_service_driver(drv);
-
-   if (driver->service != pciedev->service)
-   return 0;
-
-   if ((driver->port_type != PCIE_ANY_PORT) &&
-   (driver->port_type != pci_pcie_type(pciedev->port)))
-   return 0;
-
-   return 1;
-}
-
-int pcie_port_bus_register(void)
-{
-   return bus_register(&pcie_port_bus_type);
-}
-
-void pcie_port_bus_unregister(void)
-{
-   bus_unregister(&pcie_port_bus_type);
-}
diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index 5d9d5305ebef..127321e17184 100644
--- a/drivers/pc

Re: [PATCH] vsprintf: Make "null" pointer dereference more robust

2018-03-09 Thread Linus Torvalds
On Fri, Mar 9, 2018 at 7:01 AM, Petr Mladek  wrote:
> Also it makes the handling unified. We print:
>
>+ (null) when pure NULL pointer is dereferenced
>+ (efault) when an invalid address is dereferenced
>+ pointer address otherwise

This is still fundamentally completely wrong.

It never prints "pointer address", and if it were to do that, it would be wrong.

It should never ever trigger for an address operation, only for the
"we will get _data_ from the pointer" case.

The strchr thing is also completely broken, and in a very subtle way.
"strchr(string, 0)" is special, and the Open Group states

  "The terminating null byte is considered to be part of the string"

so a NUL character will *always* return success, which is actually
completely wrong for this case, because now it does that whole crazy
 thing for %p that it shouldn't do.

Not that I actually verified that our strchr() follows the actual
rules anyway - I personally consider "strchr(string, 0)" to not really
be "special", but be a bug. Either way, the comment is wrong, but the
code is also wrong.

 Linus
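
The strchr() behavior discussed above can be reproduced in user space.  A
minimal sketch, assuming a C library that follows the standard wording; the
helper names and the "abc" set are made up here and are not taken from the
patch under review.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Naive membership test: "is c one of the characters in set?" */
static int in_set_naive(const char *set, int c)
{
	assert(set);
	return strchr(set, c) != NULL;
}

/* Same test, but a NUL byte is never treated as a member. */
static int in_set(const char *set, int c)
{
	assert(set);
	return c != '\0' && strchr(set, c) != NULL;
}
```

Because the terminating NUL counts as part of the string, the naive version
reports a match for c == 0 with any set, which is the subtle failure mode
described in the message.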


Re: Bug: Microblaze stopped booting after 0fa1c579349fdd90173381712ad78aa99c09d38b

2018-03-09 Thread Rob Herring
On Fri, Mar 9, 2018 at 6:51 AM, Alvaro G. M.  wrote:
> Hi,
>
> I've found via git bisect that 0fa1c579349fdd90173381712ad78aa99c09d38b
> makes microblaze unbootable.
>
> I'm sorry I can't provide any console output, as nothing appears at all,
> even when setting earlyprintk (or at least I wasn't able to get anything
> back!).

Ah, looks like microblaze doesn't set CONFIG_NO_BOOTMEM and so
memblock_virt_alloc() doesn't work for CONFIG_HAVE_MEMBLOCK &&
!CONFIG_NO_BOOTMEM. AFAICT, microblaze doesn't really need bootmem and
it can be removed, but I'm still investigating. Can you try out this
branch[1]?

Rob

[1] git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git microblaze-fixes


Re: [PATCH] perf annotate: Don't prepend symfs path to build_id_filename

2018-03-09 Thread Arnaldo Carvalho de Melo
On Sun, Feb 11, 2018 at 02:19:37PM -0500, Martin Vuille wrote:
> build_id_filename already contains symfs path if applicable, so
> don't prepend it a second time.

Where is the analysis that shows that that is the case? I looked here at
the implementation for dso__build_id_filename() and couldn't find where
the symfs would be appended. Can you clarify?

- Arnaldo
 
> Signed-off-by: Martin Vuille 
> ---
>  tools/perf/util/annotate.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
> index 28b233c3dcbe..425b7f0760ec 100644
> --- a/tools/perf/util/annotate.c
> +++ b/tools/perf/util/annotate.c
> @@ -1381,7 +1381,7 @@ static int dso__disassemble_filename(struct dso *dso, char *filename, size_t fil
>  
>   build_id_filename = dso__build_id_filename(dso, NULL, 0, false);
>   if (build_id_filename) {
> - __symbol__join_symfs(filename, filename_size, build_id_filename);
> + scnprintf(filename, filename_size, "%s", build_id_filename);
>   free(build_id_filename);
>   } else {
>   if (dso->has_build_id)
> -- 
> 2.13.6


[PATCH] exec: Set file unwritable before LSM check

2018-03-09 Thread Kees Cook
The LSM check should happen after the file has been confirmed to be
unchanging. Without this, we could have a ToCToU issue between the
LSM verification and the actual contents of the file later.
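
A user-space model of the ordering, as a sketch only: struct file_model and
all four functions are hypothetical stand-ins, with deny_write() playing the
role of deny_write_access() and policy_check() the role of the LSM hook.

```c
#include <assert.h>
#include <stdbool.h>

struct file_model {
	int writers;		/* processes holding the file open for write */
	bool write_denied;	/* set while the reader has writes blocked */
	bool lsm_allows;	/* what the policy check will answer */
};

static int deny_write(struct file_model *f)
{
	if (f->writers > 0)
		return -1;	/* busy, in the spirit of -ETXTBSY */
	f->write_denied = true;
	return 0;
}

static void allow_write(struct file_model *f)
{
	f->write_denied = false;
}

static int policy_check(const struct file_model *f)
{
	return f->lsm_allows ? 0 : -1;
}

/* Freeze first, then check, then read; undo the freeze on every exit. */
static int read_file_model(struct file_model *f)
{
	int ret;

	assert(f);
	ret = deny_write(f);
	if (ret)
		return ret;	/* nothing to undo yet */

	ret = policy_check(f);
	if (ret)
		goto out;	/* must not leave writes blocked */

	/* ... read the now-stable contents here ... */
out:
	allow_write(f);
	return ret;
}
```

Once the check runs after the freeze, a failed check has something to undo,
which is why the error path in the patch becomes "goto out" instead of a
plain return.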

Signed-off-by: Kees Cook 
---
Only loadpin and SELinux implement this hook. From what I can see, this
won't change anything for either of them. IMA calls kernel_read_file(),
but looking there it seems those callers won't be negatively impacted
either. Can folks double-check this and send an Ack please?
---
 fs/exec.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 7eb8d21bcab9..a919a827d181 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -895,13 +895,13 @@ int kernel_read_file(struct file *file, void **buf, loff_t *size,
if (!S_ISREG(file_inode(file)->i_mode) || max_size < 0)
return -EINVAL;
 
-   ret = security_kernel_read_file(file, id);
+   ret = deny_write_access(file);
if (ret)
return ret;
 
-   ret = deny_write_access(file);
+   ret = security_kernel_read_file(file, id);
if (ret)
-   return ret;
+   goto out;
 
i_size = i_size_read(file_inode(file));
if (max_size > 0 && i_size > max_size) {
-- 
2.7.4


-- 
Kees Cook
Pixel Security


Re: [PATCH V2] ARM: dts: BCM5301X: add missing LEDs for Buffalo WZR-900DHP

2018-03-09 Thread Florian Fainelli
On Wed,  7 Mar 2018 20:33:56 +0900, musashino.o...@gmail.com wrote:
> From: INAGAKI Hiroshi 
> 
> Buffalo WZR-900DHP has 8 LEDs, but there are no LED definitions in the
> dts, so these LEDs cannot be configured.
> Add the missing LED definitions for WZR-900DHP.
> 
> Signed-off-by: INAGAKI Hiroshi 
> ---

Applied to devicetree/next, thanks!
--
Florian


Re: [tip:perf/core] perf/x86/intel: Disable userspace RDPMC usage for large PEBS

2018-03-09 Thread Vince Weaver
On Fri, 9 Mar 2018, Peter Zijlstra wrote:

> On Fri, Mar 09, 2018 at 09:31:11AM -0500, Vince Weaver wrote:
> > On Fri, 9 Mar 2018, tip-bot for Kan Liang wrote:
> > 
> > > Commit-ID:  1af22eba248efe2de25658041a80a3d40fb3e92e
> > > Gitweb:     https://git.kernel.org/tip/1af22eba248efe2de25658041a80a3d40fb3e92e
> > > Author: Kan Liang 
> > > AuthorDate: Mon, 12 Feb 2018 14:20:35 -0800
> > > Committer:  Ingo Molnar 
> > > CommitDate: Fri, 9 Mar 2018 08:22:23 +0100
> > > 
> > > perf/x86/intel: Disable userspace RDPMC usage for large PEBS
> > > 
> > 
> > 
> > So this whole commit log is about disabling RDPMC usage for "large PEBS"
> > but the actual change disables RDPMC if "PERF_X86_EVENT_FREERUNNING"
> > 
> > Either the commit log is really misleading, or else a poor name was chosen 
> > for this feature.
> 
> It's the same thing, and yes that might want renaming I suppose.

I apologize for noticing these things so late in the game, but I haven't 
had time to keep up with a full lkml feed recently so I only see these 
things once I'm CC'd on them.

So to summarize this: rdpmc is only disabled on a per-event basis, and 
only if that event is doing multi-pebs sampling?

If that's true, then I don't think I have an issue with this.

We finally got rdpmc support in a released PAPI, and it is a massive
improvement when self-monitoring (even moreso if KPTI is enabled) so I was 
just trying to make sure this wouldn't suddenly disable rdpmc out from 
under us.

Vince


Re: [PATCH net-next] modules: allow modprobe load regular elf binaries

2018-03-09 Thread Linus Torvalds
On Fri, Mar 9, 2018 at 10:57 AM, David Miller  wrote:
>
> Once the helper UMH is invoked, it runs asynchronously taking eBPF
> translation requests.

How?

Really. See my comment about mutual exclusion. The current patch is
*broken* because it doesn't handle it. Really.

Think of it this way - you may have now started *five* of those things
concurrently by mistake.

The actual module loading case never does that, because the actual
module loading case has per-module serialization that got
short-circuited.

How are you going to handle five processes doing the same setup concurrently?

  Linus
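
The serialization point above can be sketched in userspace C. This is only
an illustration under stated assumptions: the names helper_lock,
helper_started, spawn_helper() and request_helper() are hypothetical and do
not come from the patch under discussion. The idea is that every requester
goes through one lock, and only the first one actually performs the setup.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical state; none of these names exist in the patch. */
static pthread_mutex_t helper_lock = PTHREAD_MUTEX_INITIALIZER;
static int helper_started;	/* protected by helper_lock */
static int helper_starts;	/* counts how often the setup really ran */

/* Stand-in for the expensive "start the helper process" step. */
static void spawn_helper(void)
{
	helper_starts++;
}

/*
 * Every requester calls this. The mutex serializes callers, and the
 * flag makes sure the setup runs once no matter how many callers race.
 */
static void request_helper(void)
{
	pthread_mutex_lock(&helper_lock);
	if (!helper_started) {
		spawn_helper();
		helper_started = 1;
	}
	pthread_mutex_unlock(&helper_lock);
}
```

Calling request_helper() from five threads, or five times in a row, leaves
helper_starts at 1. Without the lock and flag, each caller would run
spawn_helper() on its own.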


Re: [PATCH] exec: Set file unwritable before LSM check

2018-03-09 Thread Linus Torvalds
On Fri, Mar 9, 2018 at 11:07 AM, Kees Cook  wrote:
> The LSM check should happen after the file has been confirmed to be
> unchanging. Without this, we could have a ToCToU issue between the
> LSM verification and the actual contents of the file later.

Can we please not add random crazy six-letter acronyms that nobody
uses outside of a very small community?

The point of a commit message is to *explain*, not confuse.

Linus


Re: [PATCH] perf tools arm64: Add libdw DWARF post unwind support for ARM64

2018-03-09 Thread Kim Phillips
On Fri, 9 Mar 2018 13:49:50 -0500
Martin Vuille  wrote:

> For https://patchwork.kernel.org/patch/10211483/, I'm not sure how to go 
> about doing a reply to all.

Hit reply-all from the copy in your Sent folder.

Kim


Re: [PATCH v2] ARM: dts: BCM5301X: Add support for Linksys EA9500

2018-03-09 Thread Florian Fainelli
Hi Vivek,

On 03/02/2018 11:41 AM, Vivek Unune wrote:
> Hardware Info
> -
> 
> Processor   - Broadcom BCM4709C0KFEBG dual-core @ 1.4 GHz
> Switch      - BCM53012 in BCM4709C0KFEBG & external BCM53125
> DDR3 RAM    - 256 MB
> Flash       - 128 MB (Toshiba TC58BVG0S3HTA00)
> 2.4GHz      - BCM4366 4×4 2.4/5G single chip 802.11ac SoC
> Power Amp   - Skyworks SE2623L 2.4 GHz power amp (x4)
> 5GHz x 2    - BCM4366 4×4 2.4/5G single chip 802.11ac SoC
> Power Amp   - PLX Technology PEX8603 3-lane, 3-port PCIe switch
> Ports       - 8 Ports, 1 WAN Ports
> Antennas    - 8 Antennas
> Serial Port - @J6 [GND,TX,RX] (VCC NC) 115200 8n1
> 
> Tested with OpenWrt built with DSA driver and Kernel v4.14
> 
> Note:
> 
> "make sure that port 0 of the internal switch is not accidentally
> configured back to untagged since that would cause problem when
> terminating the VLAN tag on the SW side." - Florian Fainelli [1]
> 
> This can be ensured by running following command in OpenWrt:
> 
> bridge vlan add vid 1 dev extsw pvid tagged
> 
> [1] https://www.spinics.net/lists/arm-kernel/msg590992.html

Glad you got it working finally! Out of curiosity, I am assuming you
have Broadcom tags enabled on the internal switch and disabled on the
external BCM53125 switch, is that correct?

Just a few nits below.

> 
> Signed-off-by: Vivek Unune 
> ---
> Changes in v2:
>  - Properly define mdio mux, internal mdio, external mdio, mii bus
>  - Now we define usb3 phy as a mdio node connected to internal mdio,
>thanks to work done by Rafał Miłecki on the bcm usb3 phy mdio driver
>  - Define external SW as a mdio-mii node connected to external mdio
> ---
>  arch/arm/boot/dts/bcm47094-linksys-panamera.dts | 239 +++-
>  arch/arm/boot/dts/bcm47094.dtsi |   6 +-
>  arch/arm/boot/dts/bcm5301x.dtsi |  55 +-
>  3 files changed, 288 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/arm/boot/dts/bcm47094-linksys-panamera.dts b/arch/arm/boot/dts/bcm47094-linksys-panamera.dts
> index b6750f7..5f53207 100644
> --- a/arch/arm/boot/dts/bcm47094-linksys-panamera.dts
> +++ b/arch/arm/boot/dts/bcm47094-linksys-panamera.dts
> @@ -7,7 +7,7 @@
>  /dts-v1/;
>  
>  #include "bcm47094.dtsi"
> -#include "bcm5301x-nand-cs0-bch8.dtsi"
> +#include "bcm5301x-nand-cs0-bch1.dtsi"

This sounds like an independent bugfix, can you submit that separately?

>  
>  / {
>   compatible = "linksys,panamera", "brcm,bcm47094", "brcm,bcm4708";
> @@ -32,5 +32,242 @@
>   linux,code = ;
>   gpios = <&chipcommon 3 GPIO_ACTIVE_LOW>;
>   };
> +
> + rfkill {
> + label = "WiFi";
> + linux,code = ;
> + gpios = <&chipcommon 16 GPIO_ACTIVE_LOW>;
> + };
> +
> + reset {
> + label = "Reset";
> + linux,code = ;
> + gpios = <&chipcommon 17 GPIO_ACTIVE_LOW>;
> + };
> + };
> +
> + leds {
> + compatible = "gpio-leds";
> +
> + wps {
> + label = "bcm53xx:white:wps";
> + gpios = <&chipcommon 22 GPIO_ACTIVE_LOW>;
> + };
> +
> + usb2 {
> + label = "bcm53xx:green:usb2";
> + gpios = <&chipcommon 1 GPIO_ACTIVE_LOW>;
> + trigger-sources = <&ohci_port2>, <&ehci_port2>;
> + linux,default-trigger = "usbport";
> + };
> +
> + usb3 {
> + label = "bcm53xx:green:usb3";
> + gpios = <&chipcommon 2 GPIO_ACTIVE_LOW>;
> + trigger-sources = <&ohci_port1>, <&ehci_port1>,
> +   <&xhci_port1>;
> + linux,default-trigger = "usbport";
> + };
> +
> + power {
> + label = "bcm53xx:white:power";
> + gpios = <&chipcommon 4 GPIO_ACTIVE_HIGH>;
> + };
> +
> + wifi-disabled {
> + label = "bcm53xx:amber:wifi-disabled";
> + gpios = <&chipcommon 0 GPIO_ACTIVE_LOW>;
> + };
> +
> + wifi-enabled {
> + label = "bcm53xx:white:wifi-enabled";
> + gpios = <&chipcommon 5 GPIO_ACTIVE_HIGH>;
> + };
> +
> + bluebar1 {
> + label = "bcm53xx:white:bluebar1";
> + gpios = <&chipcommon 11 GPIO_ACTIVE_HIGH>;
> + };
> +
> + bluebar2 {
> + label = "bcm53xx:white:bluebar2";
> + gpios = <&chipcommon 12 GPIO_ACTIVE_HIGH>;
> + };
> +
> + bluebar3 {
> + label = "bcm53xx:white:bluebar3";
> + gpios = <&

Re: [PATCH] perf tools arm64: Add libdw DWARF post unwind support for ARM64

2018-03-09 Thread Martin Vuille

Yes, thought of doing that.

Unfortunately it was sent directly from git, so I do not have a copy of the 
message that was sent.

MV


On 03/09/18 14:15, Kim Phillips wrote:

On Fri, 9 Mar 2018 13:49:50 -0500
Martin Vuille  wrote:


For https://patchwork.kernel.org/patch/10211483/, I'm not sure how to go about 
doing a reply to all.

Hit reply-all from the copy in your Sent folder.

Kim




[PATCH 0/2] hwmon: (ucd9000) Add gpio and debugfs interfaces

2018-03-09 Thread Eddie James
The ucd9000 series chips have gpio pins. Add a gpio chip interface to the ucd
device so that users can query and set the state of the gpio pins.

Add a debugfs interface using the existing pmbus debugfs directory to provide
MFR_STATUS and the status of the gpi faults to users.

Christopher Bostic (2):
  hwmon: (ucd9000) Add gpio chip interface
  hwmon: (ucd9000) Add debugfs attributes to provide mfr_status

 drivers/hwmon/pmbus/ucd9000.c | 392 +-
 1 file changed, 391 insertions(+), 1 deletion(-)

-- 
1.8.3.1



[PATCH 2/2] hwmon: (ucd9000) Add debugfs attributes to provide mfr_status

2018-03-09 Thread Eddie James
From: Christopher Bostic 

Expose the gpiN_fault fields of mfr_status as individual debugfs
attributes. This provides a way for users to be easily notified of gpi
faults. Also provide the whole mfr_status register in debugfs.

Signed-off-by: Christopher Bostic 
Signed-off-by: Andrew Jeffery 
Signed-off-by: Eddie James 
---
 drivers/hwmon/pmbus/ucd9000.c | 172 +-
 1 file changed, 171 insertions(+), 1 deletion(-)

diff --git a/drivers/hwmon/pmbus/ucd9000.c b/drivers/hwmon/pmbus/ucd9000.c
index e3a507f..297da0e 100644
--- a/drivers/hwmon/pmbus/ucd9000.c
+++ b/drivers/hwmon/pmbus/ucd9000.c
@@ -19,6 +19,7 @@
  * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -36,6 +37,7 @@
 #define UCD9000_NUM_PAGES  0xd6
 #define UCD9000_FAN_CONFIG_INDEX   0xe7
 #define UCD9000_FAN_CONFIG 0xe8
+#define UCD9000_MFR_STATUS 0xf3
 #define UCD9000_GPIO_SELECT0xfa
 #define UCD9000_GPIO_CONFIG0xfb
 #define UCD9000_DEVICE_ID  0xfd
@@ -63,13 +65,22 @@
 #define UCD901XX_NUM_GPIOS 26
 #define UCD90910_NUM_GPIOS 26
 
+#define UCD9000_DEBUGFS_NAME_LEN   24
+#define UCD9000_GPI_COUNT  8
+
 struct ucd9000_data {
u8 fan_data[UCD9000_NUM_FAN][I2C_SMBUS_BLOCK_MAX];
struct pmbus_driver_info info;
struct gpio_chip gpio;
+   struct dentry *debugfs;
 };
 #define to_ucd9000_data(_info) container_of(_info, struct ucd9000_data, info)
 
+struct ucd9000_debugfs_entry {
+   struct i2c_client *client;
+   u8 index;
+};
+
 static int ucd9000_get_fan_config(struct i2c_client *client, int fan)
 {
int fan_config = 0;
@@ -328,6 +339,156 @@ static int ucd9000_gpio_direction_output(struct gpio_chip *gc,
  val);
 }
 
+#if IS_ENABLED(CONFIG_DEBUG_FS)
+static int ucd9000_get_mfr_status(struct i2c_client *client, u8 *buffer)
+{
+   int ret = pmbus_set_page(client, 0);
+
+   if (ret < 0)
+   return ret;
+
+   /*
+* With the ucd90120 and ucd90124 devices, this command [MFR_STATUS]
+* is 2 bytes long (bits 0-15).  With the ucd90240 this command is 5
+* bytes long.  With all other devices, it is 4 bytes long.
+*/
+   return i2c_smbus_read_block_data(client, UCD9000_MFR_STATUS, buffer);
+}
+
+static int ucd9000_debugfs_show_mfr_status_bit(void *data, u64 *val)
+{
+   struct ucd9000_debugfs_entry *entry = data;
+   struct i2c_client *client = entry->client;
+   u8 buffer[4];
+   int ret;
+
+   /*
+* This attribute is only created for devices that return 4 bytes for
+* status_mfr, so it's safe to call with 4-byte buffer.
+*/
+   ret = ucd9000_get_mfr_status(client, buffer);
+   if (ret < 0) {
+   dev_err(&client->dev, "Failed to read mfr status. rc:%d\n",
+   ret);
+
+   return ret;
+   }
+
+   /*
+* Attribute only created for devices with gpi fault bits at bits
+* 16-23, which is the second byte of the response.
+*/
+   *val = !!(buffer[1] & BIT(entry->index));
+
+   return 0;
+}
+DEFINE_DEBUGFS_ATTRIBUTE(ucd9000_debugfs_mfr_status_bit,
+ucd9000_debugfs_show_mfr_status_bit, NULL, "%1lld\n");
+
+static int ucd9000_debugfs_show_mfr_status_word2(void *data, u64 *val)
+{
+   struct i2c_client *client = data;
+   __be16 buffer;
+   int ret;
+
+   ret = ucd9000_get_mfr_status(client, (u8 *)&buffer);
+   if (ret < 0) {
+   dev_err(&client->dev, "Failed to read mfr status. rc:%d\n",
+   ret);
+
+   return ret;
+   }
+
+   *val = be16_to_cpu(buffer);
+
+   return 0;
+}
+DEFINE_DEBUGFS_ATTRIBUTE(ucd9000_debugfs_mfr_status_word2,
+ucd9000_debugfs_show_mfr_status_word2, NULL,
+"%04llx\n");
+
+static int ucd9000_debugfs_show_mfr_status_word4(void *data, u64 *val)
+{
+   struct i2c_client *client = data;
+   __be32 buffer;
+   int ret;
+
+   ret = ucd9000_get_mfr_status(client, (u8 *)&buffer);
+   if (ret < 0) {
+   dev_err(&client->dev, "Failed to read mfr status. rc:%d\n",
+   ret);
+
+   return ret;
+   }
+
+   *val = be32_to_cpu(buffer);
+
+   return 0;
+}
+DEFINE_DEBUGFS_ATTRIBUTE(ucd9000_debugfs_mfr_status_word4,
+ucd9000_debugfs_show_mfr_status_word4, NULL,
+"%08llx\n");
+
+static int ucd9000_init_debugfs(struct i2c_client *client,
+   const struct i2c_device_id *mid,
+   struct ucd9000_data *data)
+{
+   struct dentry *debugfs;
+   struct ucd9000_debugfs_entry *entries;
+   int i;
+   char name[UCD9000_DEBUGFS_NAME_LEN];
+
+   debugfs = pmbus_get_debugfs_dir(c
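
The byte indexing used by the GPI fault attribute in the patch above can be
checked with a small standalone sketch. It assumes, as the patch comments
state, that the 4-byte MFR_STATUS response is big-endian and that the GPI
fault bits are bits 16-23. The helper names status_word4(),
gpi_fault_from_byte() and gpi_fault_from_word() are hypothetical and exist
only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assemble a 4-byte big-endian response into a host-order 32-bit word. */
static uint32_t status_word4(const uint8_t buf[4])
{
	return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
	       ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
}

/* GPI fault n read by indexing the second response byte directly. */
static int gpi_fault_from_byte(const uint8_t buf[4], unsigned int n)
{
	return !!(buf[1] & (1u << n));
}

/* The same fault read from bits 16-23 of the assembled word. */
static int gpi_fault_from_word(const uint8_t buf[4], unsigned int n)
{
	return !!(status_word4(buf) & (1u << (16 + n)));
}
```

Both helpers agree for every n from 0 to 7, because byte 1 of a big-endian
32-bit value holds bits 23 down to 16.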

[PATCH 1/2] hwmon: (ucd9000) Add gpio chip interface

2018-03-09 Thread Eddie James
From: Christopher Bostic 

Add a struct gpio_chip and define some methods so that this device's
I/O can be accessed via /sys/class/gpio.

Signed-off-by: Christopher Bostic 
Signed-off-by: Andrew Jeffery 
Signed-off-by: Eddie James 
---
 drivers/hwmon/pmbus/ucd9000.c | 220 ++
 1 file changed, 220 insertions(+)

diff --git a/drivers/hwmon/pmbus/ucd9000.c b/drivers/hwmon/pmbus/ucd9000.c
index b74dbec..e3a507f 100644
--- a/drivers/hwmon/pmbus/ucd9000.c
+++ b/drivers/hwmon/pmbus/ucd9000.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "pmbus.h"
 
 enum chips { ucd9000, ucd90120, ucd90124, ucd90160, ucd9090, ucd90910 };
@@ -35,8 +36,18 @@
 #define UCD9000_NUM_PAGES  0xd6
 #define UCD9000_FAN_CONFIG_INDEX   0xe7
 #define UCD9000_FAN_CONFIG 0xe8
+#define UCD9000_GPIO_SELECT0xfa
+#define UCD9000_GPIO_CONFIG0xfb
 #define UCD9000_DEVICE_ID  0xfd
 
+/* GPIO CONFIG bits */
+#define UCD9000_GPIO_CONFIG_ENABLE BIT(0)
+#define UCD9000_GPIO_CONFIG_OUT_ENABLE BIT(1)
+#define UCD9000_GPIO_CONFIG_OUT_VALUE  BIT(2)
+#define UCD9000_GPIO_CONFIG_STATUS BIT(3)
+#define UCD9000_GPIO_INPUT 0
+#define UCD9000_GPIO_OUTPUT1
+
 #define UCD9000_MON_TYPE(x)(((x) >> 5) & 0x07)
 #define UCD9000_MON_PAGE(x)((x) & 0x0f)
 
@@ -47,9 +58,15 @@
 
 #define UCD9000_NUM_FAN4
 
+#define UCD9000_GPIO_NAME_LEN  16
+#define UCD9090_NUM_GPIOS  23
+#define UCD901XX_NUM_GPIOS 26
+#define UCD90910_NUM_GPIOS 26
+
 struct ucd9000_data {
u8 fan_data[UCD9000_NUM_FAN][I2C_SMBUS_BLOCK_MAX];
struct pmbus_driver_info info;
+   struct gpio_chip gpio;
 };
 #define to_ucd9000_data(_info) container_of(_info, struct ucd9000_data, info)
 
@@ -149,6 +166,168 @@ static int ucd9000_read_byte_data(struct i2c_client *client, int page, int reg)
 };
 MODULE_DEVICE_TABLE(of, ucd9000_of_match);
 
+static int ucd9000_gpio_read_config(struct i2c_client *client,
+   unsigned int offset)
+{
+   int ret;
+
+   /* No page set required */
+   ret = i2c_smbus_write_byte_data(client, UCD9000_GPIO_SELECT, offset);
+   if (ret < 0) {
+   dev_err(&client->dev, "Failed to select GPIO %d: %d\n", offset,
+   ret);
+
+   return ret;
+   }
+
+   return i2c_smbus_read_byte_data(client, UCD9000_GPIO_CONFIG);
+}
+
+static int ucd9000_gpio_get(struct gpio_chip *gc, unsigned int offset)
+{
+   struct i2c_client *client  = gpiochip_get_data(gc);
+   int ret;
+
+   ret = ucd9000_gpio_read_config(client, offset);
+   if (ret < 0) {
+   dev_err(&client->dev, "failed to read GPIO %d config: %d\n",
+   offset, ret);
+
+   return ret;
+   }
+
+   return !!(ret & UCD9000_GPIO_CONFIG_STATUS);
+}
+
+static void ucd9000_gpio_set(struct gpio_chip *gc, unsigned int offset,
+int value)
+{
+   struct i2c_client *client = gpiochip_get_data(gc);
+   int ret;
+
+   ret = ucd9000_gpio_read_config(client, offset);
+   if (ret < 0) {
+   dev_err(&client->dev, "failed to read GPIO %d config: %d\n",
+   offset, ret);
+
+   return;
+   }
+
+   if (value) {
+   if (ret & UCD9000_GPIO_CONFIG_STATUS)
+   return;
+
+   ret |= UCD9000_GPIO_CONFIG_STATUS;
+   } else {
+   if (!(ret & UCD9000_GPIO_CONFIG_STATUS))
+   return;
+
+   ret &= ~UCD9000_GPIO_CONFIG_STATUS;
+   }
+
+   ret |= UCD9000_GPIO_CONFIG_ENABLE;
+
+   /* Page set not required */
+   ret = i2c_smbus_write_byte_data(client, UCD9000_GPIO_CONFIG, ret);
+   if (ret < 0) {
+   dev_err(&client->dev, "Failed to write GPIO %d config: %d\n",
+   offset, ret);
+
+   return;
+   }
+
+   ret &= ~UCD9000_GPIO_CONFIG_ENABLE;
+
+   ret = i2c_smbus_write_byte_data(client, UCD9000_GPIO_CONFIG, ret);
+   if (ret < 0)
+   dev_err(&client->dev, "Failed to write GPIO %d config: %d\n",
+   offset, ret);
+}
+
+static int ucd9000_gpio_get_direction(struct gpio_chip *gc,
+ unsigned int offset)
+{
+   struct i2c_client *client = gpiochip_get_data(gc);
+   int ret;
+
+   ret = ucd9000_gpio_read_config(client, offset);
+   if (ret < 0) {
+   dev_err(&client->dev, "failed to read GPIO %d config: %d\n",
+   offset, ret);
+
+   return ret;
+   }
+
+   return !(ret & UCD9000_GPIO_CONFIG_OUT_ENABLE);
+}
+
+static int ucd9000_gpio_set_direction(struct gpio_chip *gc,
+ unsigned int offset, bool direction_out,
+ int requested_out)
+{
+   struct i2c_cl

[GIT PULL] platform-drivers-x86 for 4.16-6

2018-03-09 Thread Darren Hart
Hi Linus,

This race condition discovery and the resulting fix landed later than I
would have liked in the RC cycle, but I judged it worth fixing now
rather than waiting for stable. The Dell drivers have some legacy
dependencies we've been slowly untangling. I think a more invasive
change should happen in 4.17, but this minimal impact change fixes the
race condition and keeps Kconfig happy.

The following changes since commit 1cedc6385d5f7310af0a08831c6c4303486ba850:

  platform/x86: wmi: Fix misuse of vsprintf extension %pULL (2018-03-01 10:01:39 -0800)

are available in the git repository at:

  git://git.infradead.org/linux-platform-drivers-x86.git tags/platform-drivers-x86-v4.16-6

for you to fetch changes up to 32d7b19bad9695c4c9026b0ceb3a384561ddee70:

  platform/x86: dell-smbios: Resolve dependency error on DCDBAS (2018-03-09 09:35:46 -0800)

Thanks,

Darren Hart
VMware Open Source Technology Center


platform-drivers-x86 for v4.16-6

Correct a module loading race condition between the DELL_SMBIOS backend
modules and the first user by converting them to bool features of the
DELL_SMBIOS driver. Fixup the resulting Kconfig dependency issue with
DCDBAS.

The following is an automated git shortlog grouped by driver:

 -  Resolve dependency error on DCDBAS
 -  Allow for SMBIOS backend defaults
 -  Link all dell-smbios-* modules together
 -  Rename dell-smbios source to dell-smbios-base
 -  Correct some style warnings


Darren Hart (VMware) (2):
  platform/x86: Allow for SMBIOS backend defaults
  platform/x86: dell-smbios: Resolve dependency error on DCDBAS

Mario Limonciello (3):
  platform/x86: dell-smbios: Correct some style warnings
  platform/x86: dell-smbios: Rename dell-smbios source to dell-smbios-base
  platform/x86: dell-smbios: Link all dell-smbios-* modules together

 drivers/platform/x86/Kconfig   | 27 ++--
 drivers/platform/x86/Makefile  |  5 ++--
 .../x86/{dell-smbios.c => dell-smbios-base.c}  | 29 +++---
 drivers/platform/x86/dell-smbios-smm.c | 18 +++---
 drivers/platform/x86/dell-smbios-wmi.c | 14 +++
 drivers/platform/x86/dell-smbios.h | 27 +++-
 6 files changed, 82 insertions(+), 38 deletions(-)
 rename drivers/platform/x86/{dell-smbios.c => dell-smbios-base.c} (96%)

-- 
Darren Hart
VMware Open Source Technology Center


Re: [PATCH] perf annotate: Don't prepend symfs path to build_id_filename

2018-03-09 Thread Martin Vuille

dso__build_id_filename calls build_id_cache__linkname

build_id_cache__linkname uses buildid_dir

symbol__config_symfs includes the symfs directory in buildid_dir

So it's not necessary to prepend it again.
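
A minimal userspace sketch of the double prefix described above. The helper
names and the example paths are hypothetical and only model the call chain;
they are not the perf functions themselves. The build-id path is assumed to
carry the symfs prefix already, because it is derived from buildid_dir.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Models the build-id lookup: its result already starts with symfs. */
static void make_build_id_filename(char *out, size_t len, const char *symfs)
{
	snprintf(out, len, "%s/.debug/.build-id/ab/cdef", symfs);
}

/* Old behaviour: join symfs onto a path that already contains it. */
static void filename_old(char *out, size_t len, const char *symfs)
{
	char build_id_filename[256];

	make_build_id_filename(build_id_filename, sizeof(build_id_filename),
			       symfs);
	snprintf(out, len, "%s%s", symfs, build_id_filename);
}

/* New behaviour: copy the build-id path unchanged. */
static void filename_new(char *out, size_t len, const char *symfs)
{
	char build_id_filename[256];

	make_build_id_filename(build_id_filename, sizeof(build_id_filename),
			       symfs);
	snprintf(out, len, "%s", build_id_filename);
}
```

With a symfs of "/sysroot", filename_old() yields a path that begins with
"/sysroot/sysroot/", while filename_new() yields the path once-prefixed.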


Should've included those notes in the original submission.

Will do better next time.

MV


On 03/09/18 14:07, Arnaldo Carvalho de Melo wrote:

Em Sun, Feb 11, 2018 at 02:19:37PM -0500, Martin Vuille escreveu:

build_id_filename already contains symfs path if applicable, so
don't prepend it a second time.

Where is the analysis that shows that that is the case? I looked here at
the implementation for dso__build_id_filename() and couldn't find where
was it that the symfs would be appended, can you clarify?

- Arnaldo
  

Signed-off-by: Martin Vuille 
---
  tools/perf/util/annotate.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 28b233c3dcbe..425b7f0760ec 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -1381,7 +1381,7 @@ static int dso__disassemble_filename(struct dso *dso, char *filename, size_t fil
  
  	build_id_filename = dso__build_id_filename(dso, NULL, 0, false);

if (build_id_filename) {
-   __symbol__join_symfs(filename, filename_size, build_id_filename);
+   scnprintf(filename, filename_size, "%s", build_id_filename);
free(build_id_filename);
} else {
if (dso->has_build_id)
--
2.13.6




RE: [PATCH] clk: clk-fixed-factor: Use new macro CLK_OF_DECLARE_DRIVER

2018-03-09 Thread Rajan Vaja
Hi Stephen,

Thanks for the review.

> -Original Message-
> From: Stephen Boyd [mailto:sb...@kernel.org]
> Sent: Friday, March 09, 2018 10:25 AM
> To: Rajan Vaja ; mturque...@baylibre.com
> Cc: linux-...@vger.kernel.org; linux-kernel@vger.kernel.org; Jolly Shah
> ; Michal Simek ; Rajan Vaja
> 
> Subject: Re: [PATCH] clk: clk-fixed-factor: Use new macro
> CLK_OF_DECLARE_DRIVER
> 
> Quoting Rajan Vaja (2018-03-08 06:15:00)
> > The fixed factor clock is initialized twice: once at of_clk_init() time
> > and again during platform driver probe. So declare the fixed factor clock
> > with CLK_OF_DECLARE_DRIVER instead of CLK_OF_DECLARE.
> >
> > See below commit for reference:
> > "clk: sunxi: apb0: Use new macro CLK_OF_DECLARE_DRIVER"
> > (sha1: 915128b621a05c63fa58ca9e4cbdf394bbe592f3)
> >
> > Signed-off-by: Rajan Vaja 
> > Suggested-by: Michal Simek 
> > ---
> >  drivers/clk/clk-fixed-factor.c | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/clk/clk-fixed-factor.c b/drivers/clk/clk-fixed-factor.c
> > index a5d402d..d72ef2d 100644
> > --- a/drivers/clk/clk-fixed-factor.c
> > +++ b/drivers/clk/clk-fixed-factor.c
> > @@ -196,8 +196,9 @@ void __init of_fixed_factor_clk_setup(struct device_node *node)
> >  {
> >         _of_fixed_factor_clk_setup(node);
> >  }
> > -CLK_OF_DECLARE(fixed_factor_clk, "fixed-factor-clock",
> > -   of_fixed_factor_clk_setup);
> > +
> > +CLK_OF_DECLARE_DRIVER(fixed_factor_clk, "fixed-factor-clock",
> > + of_fixed_factor_clk_setup);
> >
> 
> Is the intent to register the clk twice? I believe things are working as
> intended without this patch, so maybe you can explain a little more what
> you're trying to fix.
[Rajan] Yes. During of_clk_init(), if a DT fixed-factor clock has a parent
that is neither listed in the clock-output-names of a clock controller nor
registered as a clock provider, of_clk_init() will try to force its
registration in the second loop:

if (force || parent_ready(clk_provider->np)) {

/* Don't populate platform devices */
of_node_set_flag(clk_provider->np,
 OF_POPULATED);

So registration of this DT fixed-factor clock fails, because the parent name
returned below (called from _of_fixed_factor_clk_setup()) is NULL:
parent_name = of_clk_get_parent_name(node, 0);

On the other hand, even though registration failed, the node is still marked
OF_POPULATED, so the probe in clk-fixed-factor.c is never called and the
DT fixed-factor clock is never registered.

The same issue is discussed at https://lkml.org/lkml/2017/6/5/681 .
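
The sequence above can be modelled in a few lines of standalone C. This is a
simplified simulation, not kernel code: struct node, clk_setup(),
early_init(), late_probe() and simulate() are hypothetical names, and the
only behaviour taken from the kernel is that CLK_OF_DECLARE_DRIVER clears
OF_POPULATED before calling the setup function.

```c
#include <assert.h>
#include <stdbool.h>

struct node {
	bool parent_ready;	/* has the parent clock been registered? */
	bool populated;		/* models the OF_POPULATED flag */
	bool registered;	/* did this clock end up registered? */
};

/* Setup fails while the parent cannot be resolved. */
static int clk_setup(struct node *n)
{
	if (!n->parent_ready)
		return -1;
	n->registered = true;
	return 0;
}

/* Models the early init loop, including the forced second pass. */
static void early_init(struct node *n, bool force, bool declare_driver)
{
	if (!force && !n->parent_ready)
		return;
	n->populated = true;		/* no platform device later on */
	if (declare_driver)
		n->populated = false;	/* the _DRIVER variant clears it */
	clk_setup(n);
}

/* Models the platform driver probe, skipped for populated nodes. */
static void late_probe(struct node *n)
{
	if (n->populated || n->registered)
		return;
	clk_setup(n);
}

/* Forced early init with a missing parent, then a late probe. */
static bool simulate(bool declare_driver)
{
	struct node n = { false, false, false };

	early_init(&n, true, declare_driver);
	n.parent_ready = true;	/* parent shows up after early init */
	late_probe(&n);
	return n.registered;
}
```

In this model simulate(false) leaves the clock unregistered, which is the
failure described above, and simulate(true) registers it from the probe path.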


Re: [PATCH] perf tools arm64: Add libdw DWARF post unwind support for ARM64

2018-03-09 Thread Arnaldo Carvalho de Melo
Em Fri, Mar 09, 2018 at 01:49:50PM -0500, Martin Vuille escreveu:
> Hi,
> 
> I made two other submissions that may also have been overlooked:
> 
> https://patchwork.kernel.org/patch/10211401/ -- This one has the S-o-B

Ok, replied to that one, I can't see where is it that the symfs is being
first appended, please clarify that in the patch commit log message.
 
> https://patchwork.kernel.org/patch/10211473/ -- RFC, was looking for 
> comments, has the S-o-B

[RFC,1/1] perf annotate: Don't prepend symfs path to vmlinux path

So, lemme try to provide the precise steps to reproduce this problem:

[root@jouet ~]# perf record -F 1 sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.017 MB perf.data (11 samples) ]
[root@jouet ~]# perf buildid-list
44d954246227536955cb1ecbe9ef2a05665876b6 /lib/modules/4.16.0-rc4/build/vmlinux
87ae276466bc68e958c9817f11d5e09f14510585 [vdso]
3113881229974f02113945e92c1a4d4f146e061c /usr/lib64/libc-2.26.so
[root@jouet ~]#

then we go on and remove that buildid from the cache:

[root@jouet ~]# perf record -F 1 sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.017 MB perf.data (10 samples) ]
[root@jouet ~]# perf report --dso \[kernel.vmlinux\] | grep -v ^# | head -5
38.00%  sleep[k] filemap_map_pages
19.68%  sleep[k] elf_map
 2.69%  perf [k] perf_iterate_ctx
 0.29%  perf [k] end_repeat_nmi
 0.04%  perf [k] native_sched_clock
[root@jouet ~]#
[root@jouet ~]# perf buildid-list
44d954246227536955cb1ecbe9ef2a05665876b6 /lib/modules/4.16.0-rc4/build/vmlinux
87ae276466bc68e958c9817f11d5e09f14510585 [vdso]
3113881229974f02113945e92c1a4d4f146e061c /usr/lib64/libc-2.26.so
[root@jouet ~]# ls -la ~/.debug/.build-id/44/d954246227536955cb1ecbe9ef2a05665876b6
lrwxrwxrwx. 1 root root 77 Mar  9 16:14 /root/.debug/.build-id/44/d954246227536955cb1ecbe9ef2a05665876b6 -> ../../home/build/v4.16.0-rc4/vmlinux/44d954246227536955cb1ecbe9ef2a05665876b6
[root@jouet ~]# ls -la ~/.debug/home/build/v4.16.0-rc4/vmlinux/44d954246227536955cb1ecbe9ef2a05665876b6/
total 510840
drwxr-xr-x. 2 root root  4096 Mar  9 16:14 .
drwxr-xr-x. 3 root root  4096 Mar  6 11:35 ..
-rwxr-xr-x. 1 root root 523085744 Mar  6 11:35 elf
-rw-r--r--. 1 root root 0 Mar  9 16:14 probes
[root@jouet ~]# perf buildid-cache --remove /lib/modules/4.16.0-rc4/build/vmlinux
[root@jouet ~]# ls -la ~/.debug/.build-id/44/d954246227536955cb1ecbe9ef2a05665876b6
ls: cannot access '/root/.debug/.build-id/44/d954246227536955cb1ecbe9ef2a05665876b6': No such file or directory
[root@jouet ~]# ls -la ~/.debug/home/build/v4.16.0-rc4/vmlinux/44d954246227536955cb1ecbe9ef2a05665876b6/
ls: cannot access '/root/.debug/home/build/v4.16.0-rc4/vmlinux/44d954246227536955cb1ecbe9ef2a05665876b6/': No such file or directory
[root@jouet ~]#

Ok, so now I do:

[root@jouet ~]# perf annotate --stdio --vmlinux /lib/modules/4.16.0-rc4/build/vmlinux filemap_map_pages
Failed to open [kernel.kallsyms]_text, continuing without symbols
Error:
The perf.data file has no samples!
[root@jouet ~]# 


But I ran out of time today, this one needs a bit more investigation, I
couldn't get to that dso__disassemble_filename() in the above case :-\
 
> For https://patchwork.kernel.org/patch/10211483/, I'm not sure how to go 
> about doing a reply to all.

So, no need for that, just state here that you provide your Signed-off-by: to
it and I'll add it, which is what I'm doing now since this seems to be your
intent, right?

I'll just add another Link: tag pointing to your reply to -this- message,
if it comes with a S-o-B for https://patchwork.kernel.org/patch/10211483/, ok?

- Arnaldo
 
> I had some email problems and was cut-off from the list for a while.
> 
> MV
> 
> 
> On 03/09/18 13:24, Arnaldo Carvalho de Melo wrote:
> > Em Fri, Mar 09, 2018 at 12:07:20PM -0600, Kim Phillips escreveu:
> > > On Fri, 9 Mar 2018 12:06:27 -0300
> > > Arnaldo Carvalho de Melo  wrote:
> > > 
> > > Hi Arnaldo,
> > > 
> > > > Em Thu, Mar 08, 2018 at 09:10:30PM -0600, Kim Phillips escreveu:
> > > > > Based on prior work:
> > > > > 
> > > > > https://lkml.org/lkml/2014/5/6/395
> > > > Thanks, looks good, applying.
> > > > 
> > > > Jean, is everything ok with you on this?
> > > By now your email to Jean should have bounced with "The email account
> > > that you tried to reach does not exist."  Removing Jean from Cc.
> > > 
> > > It seems like you're applying patches.  There are a couple that have
> > > slipped through the cracks: Can you please take a look at applying them?
> > > 
> > > - "perf tools: Fixing uninitialised variable"
> > >https://patchwork.kernel.org/patch/10179381/
> > [acme@jouet perf]$ git tag --contains d2785de15f1bd42d613d56bbac5a007e7293b874
> > perf-core-for-mingo-4.17-20180216
> > 
> > commit d2785de15f1bd42d613d56bbac5a007e7293b874
> > Author: Mathieu Poirier 
> > AuthorDate: Mon Feb 12 13:32:37 2018 -0700
> > Commit: Arnaldo Car

[PATCH v2] exec: Set file unwritable before LSM check

2018-03-09 Thread Kees Cook
The LSM check should happen after the file has been confirmed to be
unchanging. Without this, we could have a race between the Time of Check
(the call to security_kernel_read_file() which could read the file and
make access policy decisions) and the Time of Use (starting with
kernel_read_file()'s reading of the file contents). In theory, file
contents could change between the two.

Signed-off-by: Kees Cook 
---
v2: Clarify the ToC/ToU race (Linus)

Only loadpin and SELinux currently implement this hook. From what
I can see, this won't change anything for either of them. IMA calls
kernel_read_file(), but looking there it seems those callers won't be
negatively impacted either. Can folks double-check this and send an
Ack please?
---
 fs/exec.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 7eb8d21bcab9..a919a827d181 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -895,13 +895,13 @@ int kernel_read_file(struct file *file, void **buf, loff_t *size,
if (!S_ISREG(file_inode(file)->i_mode) || max_size < 0)
return -EINVAL;
 
-   ret = security_kernel_read_file(file, id);
+   ret = deny_write_access(file);
if (ret)
return ret;
 
-   ret = deny_write_access(file);
+   ret = security_kernel_read_file(file, id);
if (ret)
-   return ret;
+   goto out;
 
i_size = i_size_read(file_inode(file));
if (max_size > 0 && i_size > max_size) {
-- 
2.7.4


-- 
Kees Cook
Pixel Security
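
The ordering argument in the commit message can be illustrated with a small
userspace model. Everything here is hypothetical (struct fake_file,
try_write(), policy_check() and the two load functions); it only models the
order of the "deny writes" and "policy check" steps, not the kernel
implementation.

```c
#include <assert.h>
#include <string.h>

struct fake_file {
	char data[16];
	int write_denied;
};

/* A writer succeeds only while writes are not denied. */
static int try_write(struct fake_file *f, const char *s)
{
	if (f->write_denied)
		return -1;
	strncpy(f->data, s, sizeof(f->data) - 1);
	f->data[sizeof(f->data) - 1] = '\0';
	return 0;
}

/* Stand-in policy: approve only the content "good". */
static int policy_check(const struct fake_file *f)
{
	return strcmp(f->data, "good") == 0 ? 0 : -1;
}

/* Old order: check first, deny writes afterwards. */
static int load_check_first(struct fake_file *f)
{
	if (policy_check(f))
		return -1;
	try_write(f, "evil");	/* a writer slips in after the check */
	f->write_denied = 1;
	return strcmp(f->data, "good") == 0;	/* 1 if used == checked */
}

/* New order: deny writes first, then check. */
static int load_deny_first(struct fake_file *f)
{
	f->write_denied = 1;
	if (policy_check(f)) {
		f->write_denied = 0;	/* undo on failure */
		return -1;
	}
	try_write(f, "evil");	/* the same writer now fails */
	return strcmp(f->data, "good") == 0;	/* 1 if used == checked */
}
```

With identical starting content, load_check_first() ends up using content
that was never checked, while load_deny_first() uses exactly what the check
saw.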


Re: [PATCH 1/3] bus: fsl-mc: add restool userspace support

2018-03-09 Thread Greg KH
On Wed, Mar 07, 2018 at 10:51:35AM -0600, Ioana Ciornei wrote:
> Adding kernel support for restool, a userspace tool for resource
> management, means exporting an ioctl capable device file representing
> the root resource container.
> This new functionality in the fsl-mc bus driver intends to provide
> restool an interface to interact with the MC firmware.
> Commands that are composed in userspace are sent to the MC firmware
> through the RESTOOL_SEND_MC_COMMAND ioctl.
> By default the implicit MC I/O portal is used for this operation,
> but if the implicit one is busy, a dynamic portal is allocated and then
> freed upon execution.
> 
> Signed-off-by: Ioana Ciornei 
> ---
>  Documentation/ioctl/ioctl-number.txt|   1 +
>  Documentation/networking/dpaa2/overview.rst |   4 +
>  drivers/bus/fsl-mc/Kconfig  |   7 +
>  drivers/bus/fsl-mc/Makefile |   3 +
>  drivers/bus/fsl-mc/fsl-mc-allocator.c   |   5 +
>  drivers/bus/fsl-mc/fsl-mc-bus.c |  19 +++
>  drivers/bus/fsl-mc/fsl-mc-private.h |  56 +++
>  drivers/bus/fsl-mc/fsl-mc-restool.c | 219 
> 

This is a "tiny" patch, yet I think it needs to be broken up more, as
you are mixing a few different things in the same patch, and you forgot
one big thing...

>  8 files changed, 314 insertions(+)
>  create mode 100644 drivers/bus/fsl-mc/fsl-mc-restool.c
> 
> diff --git a/Documentation/ioctl/ioctl-number.txt b/Documentation/ioctl/ioctl-number.txt
> index 6501389..d427397 100644
> --- a/Documentation/ioctl/ioctl-number.txt
> +++ b/Documentation/ioctl/ioctl-number.txt
> @@ -170,6 +170,7 @@ Code  Seq#(hex)   Include FileComments
>  'R'  00-1F   linux/random.h  conflict!
>  'R'  01  linux/rfkill.h  conflict!
>  'R'  C0-DF   net/bluetooth/rfcomm.h
> +'R'  E0  drivers/bus/fsl-mc/fsl-mc-private.h
>  'S'  all linux/cdrom.h   conflict!
>  'S'  80-81   scsi/scsi_ioctl.h   conflict!
>  'S'  82-FF   scsi/scsi.h conflict!
> diff --git a/Documentation/networking/dpaa2/overview.rst b/Documentation/networking/dpaa2/overview.rst
> index 79fede4..1056445 100644
> --- a/Documentation/networking/dpaa2/overview.rst
> +++ b/Documentation/networking/dpaa2/overview.rst
> @@ -127,6 +127,10 @@ level.
>  
>  DPRCs can be defined statically and populated with objects
>  via a config file passed to the MC when firmware starts it.
> +There is also a Linux user space tool called "restool" that can be
> +used to create/destroy containers and objects dynamically. The latest
> +version of restool can be found at:
> +https://github.com/qoriq-open-source/restool
>  
>  DPAA2 Objects for an Ethernet Network Interface
>  ---
> diff --git a/drivers/bus/fsl-mc/Kconfig b/drivers/bus/fsl-mc/Kconfig
> index c23c77c..66ec3b9 100644
> --- a/drivers/bus/fsl-mc/Kconfig
> +++ b/drivers/bus/fsl-mc/Kconfig
> @@ -14,3 +14,10 @@ config FSL_MC_BUS
> architecture.  The fsl-mc bus driver handles discovery of
> DPAA2 objects (which are represented as Linux devices) and
> binding objects to drivers.
> +
> +config FSL_MC_RESTOOL
> + bool "Management Complex (MC) restool support"
> + depends on FSL_MC_BUS
> + help
> +   Provides kernel support for the Management Complex resource
> +   manager user-space tool - restool.

Why would you want to make this a build option?  Why would you ever
_not_ want this?


> diff --git a/drivers/bus/fsl-mc/Makefile b/drivers/bus/fsl-mc/Makefile
> index 6a97f2c..9a155e3 100644
> --- a/drivers/bus/fsl-mc/Makefile
> +++ b/drivers/bus/fsl-mc/Makefile
> @@ -14,3 +14,6 @@ mc-bus-driver-objs := fsl-mc-bus.o \
> fsl-mc-allocator.o \
> fsl-mc-msi.o \
> dpmcp.o
> +
> +# MC restool kernel support
> +obj-$(CONFIG_FSL_MC_RESTOOL) += fsl-mc-restool.o
> diff --git a/drivers/bus/fsl-mc/fsl-mc-allocator.c b/drivers/bus/fsl-mc/fsl-mc-allocator.c
> index 452c5d7..fb1442b 100644
> --- a/drivers/bus/fsl-mc/fsl-mc-allocator.c
> +++ b/drivers/bus/fsl-mc/fsl-mc-allocator.c
> @@ -646,3 +646,8 @@ int __init fsl_mc_allocator_driver_init(void)
>  {
>   return fsl_mc_driver_register(&fsl_mc_allocator_driver);
>  }
> +
> +void fsl_mc_allocator_driver_exit(void)
> +{
> + fsl_mc_driver_unregister(&fsl_mc_allocator_driver);
> +}

Why are you mixing the bus/driver changes in with the addition of the
ioctl?  That should be broken out into the "first" patch of this series,
to make the addition of the ioctl easier to see and review.

> +#define RESTOOL_IOCTL_TYPE   'R'
> +#define RESTOOL_IOCTL_SEQ0xE0
> +
> +#define RESTOOL_SEND_MC_COMMAND \
> + _IOWR(RESTOOL_IOCTL_TYPE, RESTOOL_IOCTL_SEQ, struct mc_command)

"struct mc_command" is not defined as a structure that can cross the
user/kernel boundary at all.  At the least it is not in a public uapi
header file.  It also does not use the correc

Re: [PATCH 2/3] bus: fsl-mc: add root dprc rescan attribute

2018-03-09 Thread Greg KH
On Wed, Mar 07, 2018 at 10:51:36AM -0600, Ioana Ciornei wrote:
> Introduce the rescan attribute as a device attribute to
> synchronize the fsl-mc bus objects and the MC firmware.
> 
> To rescan the root dprc only, e.g.
> echo 1 > /sys/bus/fsl-mc/devices/dprc.1/rescan
> 
> Signed-off-by: Ioana Ciornei 
> ---
>  drivers/bus/fsl-mc/dprc-driver.c|  4 ++--
>  drivers/bus/fsl-mc/fsl-mc-bus.c | 28 
>  drivers/bus/fsl-mc/fsl-mc-private.h |  3 +++
>  3 files changed, 33 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/bus/fsl-mc/dprc-driver.c b/drivers/bus/fsl-mc/dprc-driver.c
> index 52c7e15..be80e3a 100644
> --- a/drivers/bus/fsl-mc/dprc-driver.c
> +++ b/drivers/bus/fsl-mc/dprc-driver.c
> @@ -214,8 +214,8 @@ static void dprc_add_new_devices(struct fsl_mc_device *mc_bus_dev,
>   * populated before they can get allocation requests from probe callbacks
>   * of the device drivers for the non-allocatable devices.
>   */
> -static int dprc_scan_objects(struct fsl_mc_device *mc_bus_dev,
> -  unsigned int *total_irq_count)
> +int dprc_scan_objects(struct fsl_mc_device *mc_bus_dev,
> +   unsigned int *total_irq_count)
>  {
>   int num_child_objects;
>   int dprc_get_obj_failures;
> diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c
> index 240b99d..763cbeb 100644
> --- a/drivers/bus/fsl-mc/fsl-mc-bus.c
> +++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
> @@ -137,8 +137,36 @@ static ssize_t modalias_show(struct device *dev, struct device_attribute *attr,
>  }
>  static DEVICE_ATTR_RO(modalias);
>  
> +static ssize_t rescan_store(struct device *dev,
> + struct device_attribute *attr,
> + const char *buf, size_t count)
> +{
> + struct fsl_mc_device *root_mc_dev;
> + struct fsl_mc_bus *root_mc_bus;
> + unsigned long val;
> +
> + if (!fsl_mc_is_root_dprc(dev))
> + return -EINVAL;
> +
> + root_mc_dev = to_fsl_mc_device(dev);
> + root_mc_bus = to_fsl_mc_bus(root_mc_dev);
> +
> + if (kstrtoul(buf, 0, &val) < 0)
> + return -EINVAL;
> +
> + if (val) {
> + mutex_lock(&root_mc_bus->scan_mutex);
> + dprc_scan_objects(root_mc_dev, NULL);
> + mutex_unlock(&root_mc_bus->scan_mutex);
> + }
> +
> + return count;
> +}
> +static DEVICE_ATTR_WO(rescan);

You did not add the correct new documentation in Documentation/ABI/ for
the new sysfs attributes you are creating.  Please do so as part of this
patch series.

thanks,

greg k-h


Re: [PATCH] ASoC: soc-core: Add missing NULL check

2018-03-09 Thread Pavel Machek
On Fri 2018-03-09 10:45:16, Kees Cook wrote:
> On Fri, Mar 9, 2018 at 4:50 AM, Mark Brown  wrote:
> > On Thu, Mar 08, 2018 at 12:06:53PM -0800, Kees Cook wrote:
> >
> >> If a codec is not attached to the sound soc, a NULL deref is possible as a
> >> regular user in /sys.
> >
> > I can't parse this, sorry.  What is the "sound soc"?
> 
> SoC's sound component? I'm not sure either. :) I was just sending the
> patch that I mentioned from the thread where Pavel mentioned this
> Oops.
> 
> Pavel, can you isolate the specific file that is causing the oops?
> (Maybe this patch should be a WARN() instead of silent return 0, since
> we still don't want to crash, but it should be considered a bug...)

The crash is reproducible on linux-next on a Nokia N900. I have also seen
a hang on a Nokia N9, with a different kernel, that may be related.

And yes, WARN() would be nicer.

> >> +++ b/sound/soc/soc-core.c
> >> @@ -137,6 +137,9 @@ static ssize_t soc_codec_reg_show(struct snd_soc_codec *codec, char *buf,
> >>   size_t total = 0;
> >>   loff_t p = 0;
> >>
> >> + if (!codec || !codec->driver)
> >> + return 0;
> >> +
> >
> > How are we managing to create a sysfs file for a CODEC which doesn't
> > have a CODEC struct associated with it?  That is obviously nonsensical
> > and suggests we've got some more serious problem going on here - if
> > there's no CODEC those sysfs attributes simply shouldn't be there.
> 
> No idea! Hopefully Pavel has more details...

Pavel probably can reproduce it...

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html




Re: [PATCH] perf tools arm64: Add libdw DWARF post unwind support for ARM64

2018-03-09 Thread Martin Vuille

Thanks.

I replied to your message with additional information.

I will update the commit message and resubmit the patch.

MV


On 03/09/18 14:29, Arnaldo Carvalho de Melo wrote:

Em Fri, Mar 09, 2018 at 01:49:50PM -0500, Martin Vuille escreveu:

Hi,

I made two other submissions that may also have been overlooked:

https://patchwork.kernel.org/patch/10211401/ -- This one has the S-o-B

Ok, replied to that one. I can't see where the symfs is first being
appended; please clarify that in the patch commit log message.
  




RE: [PATCH v3 2/6] PCI: hv: hv_eject_device_work(): remove the bogus test

2018-03-09 Thread Haiyang Zhang


> -Original Message-
> From: Dexuan Cui
> Sent: Tuesday, March 6, 2018 1:22 PM
> To: bhelg...@google.com; linux-...@vger.kernel.org; KY Srinivasan
> ; Stephen Hemminger ;
> o...@aepfle.de; a...@canonical.com; jasow...@redhat.com
> Cc: linux-kernel@vger.kernel.org; driverdev-de...@linuxdriverproject.org;
> Haiyang Zhang ; vkuzn...@redhat.com;
> marcelo.ce...@canonical.com; Michael Kelley (EOSG)
> ; Dexuan Cui ; Jack
> Morgenstein ; sta...@vger.kernel.org
> Subject: [PATCH v3 2/6] PCI: hv: hv_eject_device_work(): remove the bogus test
> 
> When we're in the function, hpdev->state must be hv_pcichild_ejecting:
> see hv_pci_eject_device().
> 
> Signed-off-by: Dexuan Cui 
> Cc: Vitaly Kuznetsov 
> Cc: Jack Morgenstein 
> Cc: sta...@vger.kernel.org
> Cc: Stephen Hemminger 
> Cc: K. Y. Srinivasan 
> Cc: Michael Kelley (EOSG) 
> ---

Acked-by: Haiyang Zhang 


RE: [PATCH v3 1/6] PCI: hv: fix a comment typo in _hv_pcifront_read_config()

2018-03-09 Thread Haiyang Zhang


> -Original Message-
> From: Dexuan Cui
> Sent: Tuesday, March 6, 2018 1:22 PM
> To: bhelg...@google.com; linux-...@vger.kernel.org; KY Srinivasan
> ; Stephen Hemminger ;
> o...@aepfle.de; a...@canonical.com; jasow...@redhat.com
> Cc: linux-kernel@vger.kernel.org; driverdev-de...@linuxdriverproject.org;
> Haiyang Zhang ; vkuzn...@redhat.com;
> marcelo.ce...@canonical.com; Michael Kelley (EOSG)
> ; Dexuan Cui ;
> sta...@vger.kernel.org
> Subject: [PATCH v3 1/6] PCI: hv: fix a comment typo in
> _hv_pcifront_read_config()
> 
> No functional change.
> 
> Signed-off-by: Dexuan Cui 
> Fixes: bdd74440d9e8 ("PCI: hv: Add explicit barriers to config space access")
> Cc: Vitaly Kuznetsov 
> Cc: sta...@vger.kernel.org
> Cc: Stephen Hemminger 
> Cc: K. Y. Srinivasan 
> ---

Acked-by: Haiyang Zhang 



Re: [PATCH net-next] modules: allow modprobe load regular elf binaries

2018-03-09 Thread Linus Torvalds
On Fri, Mar 9, 2018 at 11:12 AM, Linus Torvalds
 wrote:
>
> How are you going to handle five processes doing the same setup concurrently?

Side note: it's not just serialization. It's also "is it actually up
and running".

The rule for "request_module()" (for a real module) has been that it
returns when the module is actually alive and active and has done
its initcalls.

The UMH_WAIT_EXEC behavior (ignore the serialization - you could do
that in the caller) doesn't actually have any semantics AT
ALL. It only means that you get the error returns from execve()
itself, so you know that the executable file actually existed and
parsed right enough to get started.

But you don't actually have any reason to believe that it has *done*
anything, and started processing any requests. There's no reason
what-so-ever to believe that it has registered itself for any
asynchronous requests or anything like that.

So in the real module case, you can do

request_module("modulename");

and just start using whatever resource you just requested. So the
netfilter code literally does

request_module("nft-chain-%u-%.*s", family,
   nla_len(nla), (const char *)nla_data(nla));
nfnl_lock(NFNL_SUBSYS_NFTABLES);
type = __nf_tables_chain_type_lookup(nla, family);
if (type != NULL)
return ERR_PTR(-EAGAIN);

and doesn't even care about error handling for request_module()
itself, because it knows that either the module got loaded and is
ready, or something failed. And it needs to look that chain type up
anyway, so the failure is indicated by _that_.

With a UMH_WAIT_EXEC? No. You have *nothing*. You know the thing
started, but it might have SIGSEGV'd immediately, and you have
absolutely no way of knowing, and absolutely no way of even figuring
it out. You can wait - forever - for something to bind to whatever
dynamic resource you're expecting. You'll just fundamentally never
know.

You can try again, of course. Add a timeout, and try again in five
seconds or something. Maybe it will work then. Maybe it won't. You
won't have any way to know the _second_ time around either. Or the
third. Or...

See why I say it has to be synchronous?

If it's synchronous, you can actually do things like

 (a) maybe you only need a one-time thing, and don't have any state
("load fixed tables, be done") and that's it. If the process returns
with no error code, you're all done, and you know it's fine.

 (b) maybe the process wants to start a listener daemon or something
like the traditional inetd model. It can open the socket, it can start
listening on it, and it can fork off a child and check its status. It
can then do exit(0) if everything is fine, and now request_module()
returns.

See the difference? Even if you ended up with a background process
(like in that (b) case), you did so with *error* handling, and you did
so knowing that the state has actually been set up by the time the
request_module() returns.
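
Case (b) above can be sketched in userspace C roughly as follows. This is
only a model of the helper's startup path, assuming a pipe is used for the
readiness report; start_listener_sync() and its setup_ok parameter are
made-up names for illustration, and the model child exits right after
reporting where a real daemon would keep serving requests.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper startup: returns 0 only after the background
 * child has reported that its setup finished, non-zero otherwise.
 * The return value stands in for the helper's exit status.
 */
static int start_listener_sync(int setup_ok)
{
	int ready[2];
	char c = 0;
	pid_t pid;
	ssize_t n;

	if (pipe(ready) < 0)
		return 1;

	pid = fork();
	if (pid < 0) {
		close(ready[0]);
		close(ready[1]);
		return 1;
	}

	if (pid == 0) {
		/* Child: do the setup, report readiness, then serve. */
		close(ready[0]);
		if (setup_ok && write(ready[1], "1", 1) != 1)
			_exit(1);
		/* A real daemon would enter its request loop here. */
		_exit(setup_ok ? 0 : 1);
	}

	assert(pid > 0);
	close(ready[1]);
	/* Blocks until the child reports readiness or dies (EOF). */
	n = read(ready[0], &c, 1);
	close(ready[0]);
	/* Reap only because the model child exits immediately. */
	waitpid(pid, NULL, 0);

	return (n == 1 && c == '1') ? 0 : 1;
}
```

The point of the sketch is that the caller gets an answer either way: a
child that dies during setup shows up as EOF on the pipe, so there is no
case where the caller waits without knowing.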

And if you use the proper module loading exclusion, it also means that
that (b) can know it's the only process starting up, and it's not
racing with another one.  It might still want to do the usual
lock-files in user space to protect against just the admin starting it
manually, but at least you don't have the situation that a hundred
threads just had a thundering herd where they all ended up using the
same kernel facility, and they all independently started a hundred
usermode helpers.

  Linus


Re: [PATCH net-next] modules: allow modprobe load regular elf binaries

2018-03-09 Thread Andy Lutomirski
On Fri, Mar 9, 2018 at 6:55 PM, David Miller  wrote:
> From: Alexei Starovoitov 
> Date: Fri, 9 Mar 2018 10:50:49 -0800
>
>> On 3/9/18 10:23 AM, Andy Lutomirski wrote:
>>> It might not be totally crazy to back it by tmpfs.
>>
>> interesting. how do you propose to do it?
>> Something like:
>> - create /umh_module_tempxxx dir
>> - mount tmpfs there
>> - copy elf into it and exec it?
>
> I think the idea is that it's an internal tmpfs mount that only
> the kernel has access to.

That's what I was imagining.  There's precedent.  For example, there's
a very short piece of code that does it in
drivers/gpu/drm/i915/i915_gemfs.c.


>
> And I don't think that even hurts your debuggability concerns.  The
> user can just attach using the foo.ko file in the actual filesystem.
>

Not if the .ko is actually a shim that actually just contains a blob
and a few lines of code to kick off the umh.  But one could still
debug it using kernel debug symbols (like vDSO debugging works right
now, at least if your distro is in a good mood) or by reading the
contents from /proc/PID/exe.


RE: [PATCH v3 4/6] PCI: hv: remove hbus->enum_sem

2018-03-09 Thread Haiyang Zhang


> -Original Message-
> From: Dexuan Cui
> Sent: Tuesday, March 6, 2018 1:22 PM
> To: bhelg...@google.com; linux-...@vger.kernel.org; KY Srinivasan
> ; Stephen Hemminger ;
> o...@aepfle.de; a...@canonical.com; jasow...@redhat.com
> Cc: linux-kernel@vger.kernel.org; driverdev-de...@linuxdriverproject.org;
> Haiyang Zhang ; vkuzn...@redhat.com;
> marcelo.ce...@canonical.com; Michael Kelley (EOSG)
> ; Dexuan Cui ; Jack
> Morgenstein ; sta...@vger.kernel.org
> Subject: [PATCH v3 4/6] PCI: hv: remove hbus->enum_sem
> 
> Since we serialize the present/eject work items now, we don't need the
> semaphore any more.
> 
> This is suggested by Michael Kelley.
> 
> Signed-off-by: Dexuan Cui 
> Cc: Vitaly Kuznetsov 
> Cc: Jack Morgenstein 
> Cc: sta...@vger.kernel.org
> Cc: Stephen Hemminger 
> Cc: K. Y. Srinivasan 
> Cc: Michael Kelley (EOSG) 
> ---

Acked-by: Haiyang Zhang 


RE: [PATCH v3 3/6] PCI: hv: serialize the present/eject work items

2018-03-09 Thread Haiyang Zhang


> -Original Message-
> From: Dexuan Cui
> Sent: Tuesday, March 6, 2018 1:22 PM
> To: bhelg...@google.com; linux-...@vger.kernel.org; KY Srinivasan
> ; Stephen Hemminger ;
> o...@aepfle.de; a...@canonical.com; jasow...@redhat.com
> Cc: linux-kernel@vger.kernel.org; driverdev-de...@linuxdriverproject.org;
> Haiyang Zhang ; vkuzn...@redhat.com;
> marcelo.ce...@canonical.com; Michael Kelley (EOSG)
> ; Dexuan Cui ; Jack
> Morgenstein ; sta...@vger.kernel.org
> Subject: [PATCH v3 3/6] PCI: hv: serialize the present/eject work items
> 
> When we hot-remove the device, we first receive a PCI_EJECT message and
> then receive a PCI_BUS_RELATIONS message with bus_rel->device_count == 0.
> 
> The first message is offloaded to hv_eject_device_work(), and the second is
> offloaded to pci_devices_present_work(). Both the paths can be running
> list_del(&hpdev->list_entry), causing general protection fault, because
> system_wq can run them concurrently.
> 
> The patch eliminates the race condition.
> 
> Signed-off-by: Dexuan Cui 
> Tested-by: Adrian Suhov 
> Tested-by: Chris Valean 
> Cc: Vitaly Kuznetsov 
> Cc: Jack Morgenstein 
> Cc: sta...@vger.kernel.org
> Cc: Stephen Hemminger 
> Cc: K. Y. Srinivasan 
> ---

Acked-by: Haiyang Zhang 


RE: [PATCH v3 6/6] PCI: hv: fix 2 hang issues in hv_compose_msi_msg()

2018-03-09 Thread Haiyang Zhang


> -Original Message-
> From: Dexuan Cui
> Sent: Tuesday, March 6, 2018 1:22 PM
> To: bhelg...@google.com; linux-...@vger.kernel.org; KY Srinivasan
> ; Stephen Hemminger ;
> o...@aepfle.de; a...@canonical.com; jasow...@redhat.com
> Cc: linux-kernel@vger.kernel.org; driverdev-de...@linuxdriverproject.org;
> Haiyang Zhang ; vkuzn...@redhat.com;
> marcelo.ce...@canonical.com; Michael Kelley (EOSG)
> ; Dexuan Cui ;
> sta...@vger.kernel.org; Jack Morgenstein 
> Subject: [PATCH v3 6/6] PCI: hv: fix 2 hang issues in hv_compose_msi_msg()
> 
> 1. With the patch "x86/vector/msi: Switch to global reservation mode"
> (4900be8360), the recent v4.15 and newer kernels always hang for 1-vCPU
> Hyper-V VM with SR-IOV. This is because when we reach
> hv_compose_msi_msg() by request_irq() -> request_threaded_irq() ->
> __setup_irq() -> irq_startup() -> __irq_startup() ->
> irq_domain_activate_irq() -> ... -> msi_domain_activate() -> ... ->
> hv_compose_msi_msg(), local irq is disabled in __setup_irq().
> 
> Fix this by polling the channel.
> 
> 2. If the host is ejecting the VF device before we reach hv_compose_msi_msg(),
> in a UP VM, we can hang in hv_compose_msi_msg() forever, because at this
> time the host doesn't respond to the CREATE_INTERRUPT request. This issue
> also happens to old kernels like v4.14, v4.13, etc.
> 
> Fix this by polling the channel for the PCI_EJECT message and
> hpdev->state, and by checking the PCI vendor ID.
> 
> Note: actually the above issues also happen to a SMP VM, if
> "hbus->hdev->channel->target_cpu == smp_processor_id()" is true.
> 
> Signed-off-by: Dexuan Cui 
> Tested-by: Adrian Suhov 
> Tested-by: Chris Valean 
> Cc: sta...@vger.kernel.org
> Cc: Stephen Hemminger 
> Cc: K. Y. Srinivasan 
> Cc: Vitaly Kuznetsov 
> Cc: Jack Morgenstein 
> ---

Acked-by: Haiyang Zhang 


RE: [PATCH v3 5/6] PCI: hv: hv_pci_devices_present(): only queue a new work when necessary

2018-03-09 Thread Haiyang Zhang


> -Original Message-
> From: Dexuan Cui
> Sent: Tuesday, March 6, 2018 1:22 PM
> To: bhelg...@google.com; linux-...@vger.kernel.org; KY Srinivasan
> ; Stephen Hemminger ;
> o...@aepfle.de; a...@canonical.com; jasow...@redhat.com
> Cc: linux-kernel@vger.kernel.org; driverdev-de...@linuxdriverproject.org;
> Haiyang Zhang ; vkuzn...@redhat.com;
> marcelo.ce...@canonical.com; Michael Kelley (EOSG)
> ; Dexuan Cui ; Jack
> Morgenstein ; sta...@vger.kernel.org
> Subject: [PATCH v3 5/6] PCI: hv: hv_pci_devices_present(): only queue a new
> work when necessary
> 
> If there is a pending work, we just need to add the new dr into the dr_list.
> 
> This is suggested by Michael Kelley.
> 
> Signed-off-by: Dexuan Cui 
> Cc: Vitaly Kuznetsov 
> Cc: Jack Morgenstein 
> Cc: sta...@vger.kernel.org
> Cc: Stephen Hemminger 
> Cc: K. Y. Srinivasan 
> Cc: Michael Kelley (EOSG) 
> ---

Acked-by: Haiyang Zhang 


Re: [PATCH v4] cpuset: Enable cpuset controller in default hierarchy

2018-03-09 Thread Mike Galbraith
On Fri, 2018-03-09 at 13:20 -0500, Waiman Long wrote:
> On 03/09/2018 01:17 PM, Mike Galbraith wrote:
> > On Fri, 2018-03-09 at 12:45 -0500, Waiman Long wrote:
> >> On 03/09/2018 11:34 AM, Mike Galbraith wrote:
> >>> On Fri, 2018-03-09 at 10:35 -0500, Waiman Long wrote:
> >>>> Given the fact that thread mode had been merged into 4.14, it is now
> >>>> time to enable cpuset to be used in the default hierarchy (cgroup v2)
> >>>> as it is clearly threaded.
> >>>>
> >>>> The cpuset controller had experienced feature creep since its
> >>>> introduction more than a decade ago. Besides the core cpus and mems
> >>>> control files to limit cpus and memory nodes, there are a bunch of
> >>>> additional features that can be controlled from the userspace. Some of
> >>>> the features are of doubtful usefulness and may not be actively used.
> >>> One rather important feature is the ability to dynamically partition a
> >>> box and isolate critical loads.  How does one do that with v2?
> >>>
> >>> In v1, you create two or more exclusive sets, one for generic
> >>> housekeeping, and one or more for critical load(s), RT in my case,
> >>> turning off load balancing in the critical set(s) for obvious reasons.
> >> This patch just serves as a foundation for cpuset support in v2. I am
> >> not excluding the fact that more v1 features will be added in future
> >> patches. We want to start with a clean slate and add on it after careful
> >> consideration. There are some v1 cpuset features that are not used or
> >> rarely used. We certainly want to get rid of them, if possible.
> > If v2 is to ever supersede v1, as is the normal way of things, core
> > functionality really should be on the v2 boat when it sails.  What you
> > left standing on the dock is critical core cpuset functionality.
> >
> > -Mike
> 
> From your perspective, what is the core functionality that should be
> included in cpuset v2 other than the ability to restrict cpus and memory
> nodes?

Exclusive sets are essential, no?  How else can you manage set wide
properties such as topology (and hopefully soonish nohz).  You clearly
can't have overlapping sets, one having scheduler topology, the other
having none.  Whatever the form, something as core as the capability to
dynamically partition and isolate should IMO be firmly aboard the v2
boat before it sails.

-Mike


Re: [PATCH net-next] modules: allow modprobe load regular elf binaries

2018-03-09 Thread Andy Lutomirski
On Fri, Mar 9, 2018 at 7:38 PM, Linus Torvalds
 wrote:
> On Fri, Mar 9, 2018 at 11:12 AM, Linus Torvalds
>  wrote:
>>
>> How are you going to handle five processes doing the same setup concurrently?
>
> Side note: it's not just serialization. It's also "is it actually up
> and running".
>

I think the right way to solve this would be to take a hint from
systemd's socket activation model.  The current patch had the module
load process kick off an ELF binary that goes and registers itself to
handle something.  We can turn that around.  Make the module init
function create the socket (or pipe or whatever) that receives requests
and pass it to the user program as stdin.  Then the kernel can start
queueing requests into the socket immediately, and the user program
will get to them whenever it finishes initializing.  Or it can write
some message to the socket saying "hey, I'm ready".

This also completely avoids the issue where some clever user manually
loads the "module" with exec() ("hey, I'm so clever, I can just run
the damn thing instead of using init_module()!") or writes an
out-of-tree program that uses whatever supposedly secret API the
in-kernel binary is supposed to use to register itself (and I know
people who would do exactly that!) and the kernel does
request_module() at roughly the same time.
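
A rough userspace model of the socket-activation idea, for illustration
only: the "kernel side" here is just the parent process, the channel is an
AF_UNIX socketpair (an assumption, the mail leaves the transport open), and
activate_and_query() is a made-up name. The request is queued before the
helper exists, and the helper finds it on stdin once it starts.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Queue a request, then start the helper with the socket as stdin.
 * The model helper simply echoes the request back as its "reply".
 * Returns 0 and fills reply (NUL-terminated) on success, -1 on error.
 */
static int activate_and_query(const char *req, char *reply, size_t len)
{
	int sv[2];
	pid_t pid;
	ssize_t n;

	if (req[0] == '\0' || len < 2)
		return -1;
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;

	/* Queued immediately; nobody is listening yet. */
	if (send(sv[0], req, strlen(req), 0) < 0)
		goto fail;

	pid = fork();
	if (pid < 0)
		goto fail;

	if (pid == 0) {
		char buf[64];
		ssize_t got;

		close(sv[0]);
		if (sv[1] != STDIN_FILENO) {
			dup2(sv[1], STDIN_FILENO);	/* socket becomes stdin */
			close(sv[1]);
		}
		/* "Initialization" would happen here; the request waits. */
		got = recv(STDIN_FILENO, buf, sizeof(buf), 0);
		if (got > 0 && send(STDIN_FILENO, buf, (size_t)got, 0) < 0)
			_exit(1);
		_exit(got > 0 ? 0 : 1);
	}

	assert(pid > 0);
	close(sv[1]);
	n = recv(sv[0], reply, len - 1, 0);	/* 0 means the helper died */
	close(sv[0]);
	waitpid(pid, NULL, 0);
	if (n <= 0)
		return -1;
	reply[n] = '\0';
	return 0;

fail:
	close(sv[0]);
	close(sv[1]);
	return -1;
}
```

Because the creator of the channel owns both ends from the start, nothing
else can register on it, and a helper that dies early is visible as EOF
on the socket.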


Re: [PATCH v2] exec: Set file unwritable before LSM check

2018-03-09 Thread Linus Torvalds
On Fri, Mar 9, 2018 at 11:30 AM, Kees Cook  wrote:
> The LSM check should happen after the file has been confirmed to be
> unchanging. Without this, we could have a race between the Time of Check
> (the call to security_kernel_read_file() which could read the file and
> make access policy decisions) and the Time of Use (starting with
> kernel_read_file()'s reading of the file contents). In theory, file
> contents could change between the two.

I'm going to assume I get this for 4.17 from the security tree.

Because I'm guessing there are actually no existing users that care?
selinux seems to just look at file state, not actually at contents or
anything that write access denial would care about.

And the only other security module that even registers this is
loadpin, and again it just seems to check things like "on the right
filesystem" that aren't actually impacted by write access (in fact,
the documented reason is to check that it's a read-only filesystem so
that write access is simply _irrelevant_).

So this issue seems to be mainly a cleanliness thing, not an actual bug.
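
For reference, the ordering the patch moves to can be modelled in plain
userspace C as below. These are stand-ins, not the kernel functions: the
names are reused only to mirror the diff, struct model_file and its
lsm_verdict field are invented for the model, and the write count is a
simplified version of the real inode accounting.

```c
#include <assert.h>
#include <errno.h>

/* Userspace model only; not the kernel's struct file. */
struct model_file {
	int writecount;		/* >0: open writers, <0: writes denied */
	int lsm_verdict;	/* hypothetical: what the policy hook returns */
};

static int deny_write_access(struct model_file *f)
{
	if (f->writecount > 0)
		return -ETXTBSY;	/* someone already has it open for write */
	f->writecount--;
	return 0;
}

static void allow_write_access(struct model_file *f)
{
	assert(f->writecount < 0);
	f->writecount++;
}

static int security_kernel_read_file(struct model_file *f)
{
	return f->lsm_verdict;
}

static int model_kernel_read_file(struct model_file *f)
{
	int ret;

	/* Freeze the contents first ... */
	ret = deny_write_access(f);
	if (ret)
		return ret;	/* nothing to undo yet */

	/* ... then let the policy look at a file that cannot change. */
	ret = security_kernel_read_file(f);
	if (ret)
		goto out;	/* must drop the write denial again */

	/* Reading the now-stable contents would happen here. */
	ret = 0;
out:
	allow_write_access(f);
	return ret;
}
```

The "goto out" in the diff is the part that matters once the order is
swapped: a policy rejection now has a write denial to release.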

 Linus


Re: [PATCH v3] input: bcm5974 - Add driver for Apple Magic Trackpad 2

2018-03-09 Thread Stephan Mueller
Am Dienstag, 27. Februar 2018, 17:37:45 CET schrieb Stephan Mueller:

Hi Jiri, Dimity, Henrik,

> Am Sonntag, 21. Januar 2018, 23:06:55 CET schrieb Stephan Müller:
> 
> Hi Jiri, Dimity, Henrik,
> 
> > Hi,
> > 
> > Changes v3:
> > * port to 4.15-rc8
> > * small code cleanups (isolation of type casts to functions pertaining
> >   to the Apple Magic Trackpad 2)
> > * clean up all checkpatch.pl errors and warnings (except those
> >   where the patch uses the structure of existing code fragments)
> > * updated horizontal and vertical limits to capture start of movements
> >   in the outer areas of the pad
> > 
> > ---8<---
> > 
> > Add support for Apple Magic Trackpad 2 in bcm5974 (MacBook Touchpad)
> > driver.
> > The Magic Trackpad 2 needs to be switched into the finger-reporting-mode,
> > just like the other MacBook touchpads. But the format differs from
> > the earlier ones: the header is 12 bytes long and each reported finger
> > adds another 9 bytes. The data order reported by the hardware is
> > different as well.
> 
> May I ask whether there is an issue in the patch? I have not received any
> feedback and would like to inquire about the status of the patch.
> 
> I would like to have Touchpad 2 properly supported.



Ciao
Stephan




Re: [PATCH v2] exec: Set file unwritable before LSM check

2018-03-09 Thread Kees Cook
On Fri, Mar 9, 2018 at 11:47 AM, Linus Torvalds
 wrote:
> On Fri, Mar 9, 2018 at 11:30 AM, Kees Cook  wrote:
>> The LSM check should happen after the file has been confirmed to be
>> unchanging. Without this, we could have a race between the Time of Check
>> (the call to security_kernel_read_file() which could read the file and
>> make access policy decisions) and the Time of Use (starting with
>> kernel_read_file()'s reading of the file contents). In theory, file
>> contents could change between the two.
>
> I'm going to assume I get this for 4.17 from the security tree.
>
> Because I'm guessing there are actually no existing users that care?
> selinux seems to just look at file state, not actually at contents or
> anything that write access denial would care about.
>
> And the only other security module that even registers this is
> loadpin, and again it just seems to check things like "on the right
> filesystem" that aren't actually impacted by write access (in fact,
> the documented reason is to check that it's a read-only filesystem so
> that write access is simply _irrelevant_).
>
> So this issue seems to be mainly a cleanliness thing, not an actual bug.

That is my assumption too (I left off the Cc: stable as a result). I'm
much less familiar with IMA, though, but it's a caller of
kernel_read_file(), not hooking it, etc.

-Kees

-- 
Kees Cook
Pixel Security


Re: [PATCH] x86, powerpc : pkey-mprotect must allow pkey-0

2018-03-09 Thread Ram Pai
On Fri, Mar 09, 2018 at 07:37:04PM +1100, Balbir Singh wrote:
> On Fri, Mar 9, 2018 at 7:12 PM, Ram Pai  wrote:
> > Once an address range is associated with an allocated pkey, it cannot be
> > reverted back to key-0. There is no valid reason for the above behavior.  On
> > the contrary applications need the ability to do so.
> >
> > The patch relaxes the restriction.
> 
> I looked at the code and my observation was going to be that we need
> to change mm_pkey_is_allocated. I still fail to understand what
> happens if pkey 0 is reserved? What is the default key? Is it the first
> available key? Assuming 0 is the default key may work and seems to
> work, but I am sure it's mostly by accident. It would be nice if we
> could have a notion of the default key. I don't like the special
> meaning given to key 0 here. Remember on powerpc if 0 is reserved and
> UAMOR/AMOR does not allow modification because it's reserved, setting
> 0 will still fail.

The Linux pkey API assumes pkey-0 is the default key. If no key is
explicitly associated with a page, the default key gets associated.
When the default key is associated with a page, the permissions on the
page are not dictated by the permissions of the default key, but by the
permissions of other bits in the pte, i.e. _PAGE_RWX.

On powerpc, and AFAICT on x86, neither the hardware nor the hypervisor
reserves key-0. Hence the OS is free to use the key value the
way it chooses. On Linux we choose to give key-0 the special status
of default key.

However I see your point. If some cpu architecture takes away key-0 from
Linux, then implementing the special status for key-0 on that
architecture can become challenging, though not impossible. That
architecture implementation can internally map the key-0 value to some
other available key, and associate that key with the page. And of course
make sure that the hardware/MMU uses the pte's RWX bits to enforce
permissions for that key.


-- 
Ram Pai



Re: [PATCH] x86, powerpc : pkey-mprotect must allow pkey-0

2018-03-09 Thread Ram Pai
On Fri, Mar 09, 2018 at 12:04:49PM +0100, Florian Weimer wrote:
> On 03/09/2018 09:12 AM, Ram Pai wrote:
> >Once an address range is associated with an allocated pkey, it cannot be
> >reverted back to key-0. There is no valid reason for the above behavior.
> 
> mprotect without a key does not necessarily use key 0, e.g. if
> protection keys are used to emulate page protection flag combination
> which is not directly supported by the hardware.
> 
> Therefore, it seems to me that filtering out non-allocated keys is
> the right thing to do.

I am not sure what you mean. Do you agree with the patch or otherwise?
RP



[PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()

2018-03-09 Thread Kees Cook
When max() is used in stack array size calculations from literal values
(e.g. "char foo[max(sizeof(struct1), sizeof(struct2))]"), the compiler
thinks this is a dynamic calculation due to the single-eval logic, which
is not needed in the literal case. This change removes several accidental
stack VLAs from an x86 allmodconfig build:

$ diff -u before.txt after.txt | grep ^-
-drivers/input/touchscreen/cyttsp4_core.c:871:2: warning: ISO C90 forbids variable length array ‘ids’ [-Wvla]
-fs/btrfs/tree-checker.c:344:4: warning: ISO C90 forbids variable length array ‘namebuf’ [-Wvla]
-lib/vsprintf.c:747:2: warning: ISO C90 forbids variable length array ‘sym’ [-Wvla]
-net/ipv4/proc.c:403:2: warning: ISO C90 forbids variable length array ‘buff’ [-Wvla]
-net/ipv6/proc.c:198:2: warning: ISO C90 forbids variable length array ‘buff’ [-Wvla]
-net/ipv6/proc.c:218:2: warning: ISO C90 forbids variable length array ‘buff64’ [-Wvla]

Based on an earlier patch from Josh Poimboeuf.
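
As a standalone illustration of the problem (not part of the patch, and
deliberately without the __builtin_choose_expr() dispatch), the two shapes
the macro has to choose between look like this. The macro names
single_eval_max/raw_max and the two structs are made up for the example;
it assumes a compiler with GNU statement expressions and __typeof__.

```c
#include <assert.h>
#include <stddef.h>

struct small { char a[3]; };
struct large { char b[17]; };

/*
 * Single-evaluation form (GNU statement expression): safe with side
 * effects, but never an integer constant expression.
 */
#define single_eval_max(x, y) ({		\
	__typeof__(x) max1_ = (x);		\
	__typeof__(y) max2_ = (y);		\
	max1_ > max2_ ? max1_ : max2_; })

/*
 * Raw form: evaluates the winning argument twice, but stays a constant
 * expression when both arguments are constants.
 */
#define raw_max(x, y) ((x) > (y) ? (x) : (y))

/* A file-scope array needs a constant size, so raw_max() must be one. */
static char fixed_buf[raw_max(sizeof(struct small), sizeof(struct large))];

static size_t fixed_buf_size(void)
{
	return sizeof(fixed_buf);
}

/* Counts how often the first argument is evaluated by each form. */
static int count_single(void)
{
	int calls = 0;
	int biggest = single_eval_max(++calls, 0);

	(void)biggest;
	return calls;
}

static int count_raw(void)
{
	int calls = 0;
	int biggest = raw_max(++calls, 0);

	(void)biggest;
	return calls;
}
```

The raw form is fine for sizeof() operands and wrong for anything with
side effects, which is why the patch only selects it when both arguments
are compile-time constants.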

Signed-off-by: Kees Cook 
---
v3:
- drop __builtin_types_compatible_p() (Rasmus, Linus)
v2:
- fix copy/paste-o max1_/max2_ (ijc)
- clarify "compile-time" constant in comment (Rasmus)
- clean up formatting on min_t()/max_t()
---
 include/linux/kernel.h | 48 ++--
 1 file changed, 30 insertions(+), 18 deletions(-)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 3fd291503576..a0fca4deb3ab 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -787,37 +787,55 @@ static inline void ftrace_dump(enum ftrace_dump_mode oops_dump_mode) { }
  * strict type-checking.. See the
  * "unnecessary" pointer comparison.
  */
-#define __min(t1, t2, min1, min2, x, y) ({ \
+#define __single_eval_min(t1, t2, min1, min2, x, y) ({ \
t1 min1 = (x);  \
t2 min2 = (y);  \
(void) (&min1 == &min2);\
min1 < min2 ? min1 : min2; })
 
+/*
+ * In the case of compile-time constant values, there is no need to do
+ * the double-evaluation protection, so the raw comparison can be made.
+ * This allows min()/max() to be used in stack array allocations and
+ * avoid the compiler thinking it is a dynamic value leading to an
+ * accidental VLA.
+ */
+#define __min(t1, t2, x, y)\
+   __builtin_choose_expr(__builtin_constant_p(x) &&\
+ __builtin_constant_p(y),  \
+ (t1)(x) < (t2)(y) ? (t1)(x) : (t2)(y),\
+ __single_eval_min(t1, t2, \
+   __UNIQUE_ID(min1_), \
+   __UNIQUE_ID(min2_), \
+   x, y))
+
 /**
  * min - return minimum of two values of the same or compatible types
  * @x: first value
  * @y: second value
  */
-#define min(x, y)  \
-   __min(typeof(x), typeof(y), \
- __UNIQUE_ID(min1_), __UNIQUE_ID(min2_),   \
- x, y)
+#define min(x, y)  __min(typeof(x), typeof(y), x, y)
 
-#define __max(t1, t2, max1, max2, x, y) ({ \
+#define __single_eval_max(t1, t2, max1, max2, x, y) ({ \
t1 max1 = (x);  \
t2 max2 = (y);  \
(void) (&max1 == &max2);\
max1 > max2 ? max1 : max2; })
 
+#define __max(t1, t2, x, y)\
+   __builtin_choose_expr(__builtin_constant_p(x) &&\
+ __builtin_constant_p(y),  \
+ (t1)(x) > (t2)(y) ? (t1)(x) : (t2)(y),\
+ __single_eval_max(t1, t2, \
+   __UNIQUE_ID(max1_), \
+   __UNIQUE_ID(max2_), \
+   x, y))
 /**
  * max - return maximum of two values of the same or compatible types
  * @x: first value
  * @y: second value
  */
-#define max(x, y)  \
-   __max(typeof(x), typeof(y), \
- __UNIQUE_ID(max1_), __UNIQUE_ID(max2_),   \
- x, y)
+#define max(x, y)  __max(typeof(x), typeof(y), x, y)
 
 /**
  * min3 - return minimum of three values
@@ -869,10 +887,7 @@ static inline void ftrace_dump(enum ftrace_dump_mode oops_dump_mode) { }
  * @x: first value
  * @y: second value
  */
-#define min_t(type, x, y)  \
-   __min(type, type,   \
- __UNIQUE_ID(min1_), __UNIQUE_ID(min2_),   \
- x, y)
+#define min_t(type, x, y)   __min(type, type, x, y)
 
 /**
  * max_t 
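
The double-evaluation hazard that the single-eval logic in this patch guards against can be sketched in standalone C. This is only an illustration, assuming a GNU-compatible compiler (statement expressions and __typeof__); the macro and function names below are made up for the sketch and are not the kernel's. It does not reproduce the __builtin_choose_expr() selection from the patch, only the two comparison forms that the patch chooses between.

```c
#include <assert.h>
#include <stddef.h>

/* Raw comparison: fine for constants, but evaluates an argument twice. */
#define NAIVE_MAX(x, y) ((x) > (y) ? (x) : (y))

/*
 * Single-evaluation form: each argument is copied into a local first.
 * Relies on GNU statement expressions; names are hypothetical.
 */
#define SINGLE_EVAL_MAX(x, y) ({		\
	__typeof__(x) max1_ = (x);		\
	__typeof__(y) max2_ = (y);		\
	max1_ > max2_ ? max1_ : max2_; })

/* Final value of i after NAIVE_MAX(i++, 3): i++ runs twice. */
static int naive_max_final_i(void)
{
	int i = 5;
	int r = NAIVE_MAX(i++, 3);

	(void)r;
	return i;
}

/* Final value of i after SINGLE_EVAL_MAX(i++, 3): i++ runs once. */
static int single_eval_max_final_i(void)
{
	int i = 5;
	int r = SINGLE_EVAL_MAX(i++, 3);

	(void)r;
	return i;
}

/*
 * With only sizeof() operands the raw comparison is an integer constant
 * expression, so this is an ordinary fixed-size array, not a VLA.
 */
static size_t literal_sized_array_bytes(void)
{
	char buf[NAIVE_MAX(sizeof(int), sizeof(long))];

	return sizeof(buf);
}
```

The raw form is what makes a literal-only max() usable as an array bound, while the statement-expression form is what protects callers whose arguments have side effects.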

Re: [PATCH] x86, powerpc : pkey-mprotect must allow pkey-0

2018-03-09 Thread Ram Pai
On Fri, Mar 09, 2018 at 09:19:53PM +1100, Michael Ellerman wrote:
> Ram Pai  writes:
> 
> > Once an address range is associated with an allocated pkey, it cannot be
> > reverted back to key-0. There is no valid reason for the above behavior.  On
> > the contrary applications need the ability to do so.
> 
> Please explain this in much more detail. Is it an ABI change?

Not necessarily an ABI change. Older binary applications will continue
to work. It can be considered a bug fix.

> 
> And why did we just notice this?

Yes. This was noticed by an application vendor.

> 
> > The patch relaxes the restriction.
> >
> > Tested on powerpc and x86_64.
> 
> Thanks, but please split the patch, one for each arch.

Will do.
RP



Re: [PATCH] x86, powerpc : pkey-mprotect must allow pkey-0

2018-03-09 Thread Ram Pai
On Fri, Mar 09, 2018 at 09:43:32AM +0100, Ingo Molnar wrote:
> 
> * Ram Pai  wrote:
> 
> > Once an address range is associated with an allocated pkey, it cannot be
> > reverted back to key-0. There is no valid reason for the above behavior.  On
> > the contrary applications need the ability to do so.
> > 
> > The patch relaxes the restriction.
> > 
> > Tested on powerpc and x86_64.
> > 
> > cc: Dave Hansen 
> > cc: Michael Ellerman
> > cc: Ingo Molnar 
> > Signed-off-by: Ram Pai 
> > ---
> >  arch/powerpc/include/asm/pkeys.h | 19 ++-
> >  arch/x86/include/asm/pkeys.h |  5 +++--
> >  2 files changed, 17 insertions(+), 7 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/pkeys.h 
> > b/arch/powerpc/include/asm/pkeys.h
> > index 0409c80..3e8abe4 100644
> > --- a/arch/powerpc/include/asm/pkeys.h
> > +++ b/arch/powerpc/include/asm/pkeys.h
> > @@ -101,10 +101,18 @@ static inline u16 pte_to_pkey_bits(u64 pteflags)
> >  
> >  static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
> >  {
> > -   /* A reserved key is never considered as 'explicitly allocated' */
> > -   return ((pkey < arch_max_pkey()) &&
> > -   !__mm_pkey_is_reserved(pkey) &&
> > -   __mm_pkey_is_allocated(mm, pkey));
> > +   /* pkey 0 is allocated by default. */
> > +   if (!pkey)
> > +  return true;
> > +
> > +   if (pkey < 0 || pkey >= arch_max_pkey())
> > +  return false;
> > +
> > +   /* reserved keys are never allocated. */
> > +   if (__mm_pkey_is_reserved(pkey))
> > +  return false;
> 
> Please capitalize in comments consistently, i.e.:

ok.

> 
>   /* Reserved keys are never allocated: */
> 
> > +
> > +   return(__mm_pkey_is_allocated(mm, pkey));
> 
> 'return' is not a function.

right. will fix.

Thanks,
RP
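
The check order in the quoted powerpc hunk (key 0 first, then the range check, then reserved keys, then the allocation map) can be sketched outside the kernel as follows. This is a rough model only: the two bitmaps, the key limit, and every sketch_* name are assumptions made for illustration and do not correspond to the real mm_context fields or helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_PKEY 32	/* hypothetical key limit for the sketch */

/* Hypothetical per-mm state: one bit per key. */
struct sketch_mm {
	uint32_t allocated_map;
	uint32_t reserved_map;
};

static bool sketch_pkey_is_allocated(const struct sketch_mm *mm, int pkey)
{
	/* pkey 0 is allocated by default. */
	if (pkey == 0)
		return true;

	if (pkey < 0 || pkey >= SKETCH_MAX_PKEY)
		return false;

	/* Reserved keys are never allocated. */
	if (mm->reserved_map & (UINT32_C(1) << pkey))
		return false;

	return (mm->allocated_map & (UINT32_C(1) << pkey)) != 0;
}

/* Example state: keys 3 and 5 allocated, keys 1 and 5 reserved. */
static const struct sketch_mm sketch_example_mm = {
	.allocated_map = (UINT32_C(1) << 3) | (UINT32_C(1) << 5),
	.reserved_map  = (UINT32_C(1) << 1) | (UINT32_C(1) << 5),
};
```

Doing the key-0 test before anything else is what lets a range be moved back to key 0 regardless of what the allocation map says.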



[PATCH v3] [media] Use common error handling code in 19 functions

2018-03-09 Thread SF Markus Elfring
From: Markus Elfring 
Date: Fri, 9 Mar 2018 21:00:12 +0100

Adjust jump targets so that a bit of exception handling can be better
reused at the end of these functions.

This issue was partly detected by using the Coccinelle software.

Signed-off-by: Markus Elfring 
---

v3:
Laurent Pinchart and Todor Tomov requested a few adjustments.
Updates were rebased on source files from Linux next-20180308.

v2:
Hans Verkuil insisted on patch squashing. Thus several changes
were recombined based on source files from Linux next-20180216.

The implementation of the function "tda8261_set_params" was improved
after a notification by Christoph Böhmwalder on 2017-09-26.

 drivers/media/dvb-core/dmxdev.c| 16 
 drivers/media/dvb-frontends/tda1004x.c | 20 ++
 drivers/media/dvb-frontends/tda8261.c  | 19 ++
 drivers/media/pci/bt8xx/dst.c  | 19 ++
 drivers/media/pci/bt8xx/dst_ca.c   | 30 +++
 drivers/media/pci/cx88/cx88-input.c| 17 +
 drivers/media/platform/omap3isp/ispvideo.c | 28 ++
 .../media/platform/qcom/camss-8x16/camss-csid.c| 19 +-
 drivers/media/tuners/tuner-xc2028.c| 30 +++
 drivers/media/usb/cpia2/cpia2_usb.c| 13 ---
 drivers/media/usb/gspca/gspca.c| 17 +
 drivers/media/usb/gspca/sn9c20x.c  | 17 +
 drivers/media/usb/pvrusb2/pvrusb2-ioread.c | 10 +++--
 drivers/media/usb/tm6000/tm6000-cards.c|  7 ++--
 drivers/media/usb/tm6000/tm6000-dvb.c  | 11 --
 drivers/media/usb/tm6000/tm6000-video.c| 13 ---
 drivers/media/usb/ttusb-budget/dvb-ttusb-budget.c  | 13 +++
 drivers/media/usb/ttusb-dec/ttusb_dec.c| 43 --
 18 files changed, 171 insertions(+), 171 deletions(-)

diff --git a/drivers/media/dvb-core/dmxdev.c b/drivers/media/dvb-core/dmxdev.c
index 61a750fae465..17d05b05fa9d 100644
--- a/drivers/media/dvb-core/dmxdev.c
+++ b/drivers/media/dvb-core/dmxdev.c
@@ -656,18 +656,18 @@ static int dvb_dmxdev_start_feed(struct dmxdev *dmxdev,
tsfeed->priv = filter;
 
ret = tsfeed->set(tsfeed, feed->pid, ts_type, ts_pes, timeout);
-   if (ret < 0) {
-   dmxdev->demux->release_ts_feed(dmxdev->demux, tsfeed);
-   return ret;
-   }
+   if (ret < 0)
+   goto release_feed;
 
ret = tsfeed->start_filtering(tsfeed);
-   if (ret < 0) {
-   dmxdev->demux->release_ts_feed(dmxdev->demux, tsfeed);
-   return ret;
-   }
+   if (ret < 0)
+   goto release_feed;
 
return 0;
+
+release_feed:
+   dmxdev->demux->release_ts_feed(dmxdev->demux, tsfeed);
+   return ret;
 }
 
 static int dvb_dmxdev_filter_start(struct dmxdev_filter *filter)
diff --git a/drivers/media/dvb-frontends/tda1004x.c b/drivers/media/dvb-frontends/tda1004x.c
index 58e3beff5adc..85ca111fc8c4 100644
--- a/drivers/media/dvb-frontends/tda1004x.c
+++ b/drivers/media/dvb-frontends/tda1004x.c
@@ -1299,20 +1299,22 @@ struct dvb_frontend* tda10045_attach(const struct tda1004x_config* config,
id = tda1004x_read_byte(state, TDA1004X_CHIPID);
if (id < 0) {
printk(KERN_ERR "tda10045: chip is not answering. Giving up.\n");
-   kfree(state);
-   return NULL;
+   goto free_state;
}
 
if (id != 0x25) {
printk(KERN_ERR "Invalid tda1004x ID = 0x%02x. Can't proceed\n", id);
-   kfree(state);
-   return NULL;
+   goto free_state;
}
 
/* create dvb_frontend */
memcpy(&state->frontend.ops, &tda10045_ops, sizeof(struct dvb_frontend_ops));
state->frontend.demodulator_priv = state;
return &state->frontend;
+
+free_state:
+   kfree(state);
+   return NULL;
 }
 
 static const struct dvb_frontend_ops tda10046_ops = {
@@ -1369,19 +1371,21 @@ struct dvb_frontend* tda10046_attach(const struct tda1004x_config* config,
id = tda1004x_read_byte(state, TDA1004X_CHIPID);
if (id < 0) {
printk(KERN_ERR "tda10046: chip is not answering. Giving up.\n");
-   kfree(state);
-   return NULL;
+   goto free_state;
}
if (id != 0x46) {
printk(KERN_ERR "Invalid tda1004x ID = 0x%02x. Can't proceed\n", id);
-   kfree(state);
-   return NULL;
+   goto free_state;
}
 
/* create dvb_frontend */
memcpy(&state->frontend.ops, &tda10046_ops, sizeof(struct dvb_frontend_ops));
state->frontend.demodulator_priv = state;
return &state->frontend;
+
+free_state:
+   kfree(state);
+   return NULL;
 }
 
 module_param(debug, int, 0644);
diff --git a/drivers/media/dvb-frontends/tda8261.c 
b/d
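
The shape of the change in these hunks (several failure branches jumping to one cleanup label instead of repeating the cleanup inline) can be sketched in plain C. This is a standalone illustration, not driver code: the state struct, the simulated chip-ID read, the release counter, and all sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

struct sketch_state {
	int chip_id;
};

/* Counts how often the shared cleanup path ran (for demonstration only). */
static int sketch_release_count;

static void sketch_release(struct sketch_state *state)
{
	sketch_release_count++;
	free(state);
}

/*
 * read_result stands in for a register read: negative means the chip did
 * not answer, any other value is the ID that was read back.
 * Returns 0 on success and a negative value on failure.
 */
static int sketch_attach(int read_result, int expected_id)
{
	struct sketch_state *state = malloc(sizeof(*state));
	int ret;

	if (!state)
		return -1;

	if (read_result < 0) {
		ret = -2;		/* chip is not answering */
		goto free_state;
	}

	if (read_result != expected_id) {
		ret = -3;		/* unexpected chip ID */
		goto free_state;
	}

	state->chip_id = read_result;
	/* A real driver would hand state to its caller; freed here to avoid a leak. */
	free(state);
	return 0;

free_state:
	sketch_release(state);
	return ret;
}
```

Each failure branch only sets its own error code; the release call exists in exactly one place, so a later change to the cleanup cannot miss one of the branches.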

[RFC PATCH v2 0/3] ima: namespacing IMA

2018-03-09 Thread Stefan Berger
This patch set implements an IMA namespace data structure that gets
created alongside a mount namespace with CLONE_NEWNS, and lays down the
foundation for namespacing the different aspects of IMA (e.g. IMA-audit,
IMA-measurement, IMA-appraisal).

The original PoC patches [1] created a new CLONE_NEWIMA flag to
explicitly control when a new IMA namespace should be created. Based on
comments, we elected to hang the IMA namespace off of existing namespaces,
and the mount namespace made the most sense. In this version of the patches
we are adding a pointer to the mnt_namespace pointing to the ima_namespace.
Both are now tied together and joining the mnt_namespace with setns()
also joins the ima_namespace that was created along with it.

The first patch creates the ima_namespace data, while the second patch
puts the iint->flags in the namespace. The third patch uses these flags
for namespacing the IMA-audit messages, enabling the same file to be
audited each time it is accessed in a new namespace.

Mehmet Kayaalp (2):
  ima: Add ns_status for storing namespaced iint data
  ima: mamespace audit status flags

Yuqiong Sun (1):
  ima: extend clone() with IMA namespace support

 fs/mount.h   |  14 --
 fs/namespace.c   |  29 +++-
 include/linux/ima.h  |  70 ++
 include/linux/mount.h|  20 ++-
 init/Kconfig |  10 ++
 kernel/nsproxy.c |   1 +
 security/integrity/ima/Makefile  |   3 +-
 security/integrity/ima/ima.h |  47 ++-
 security/integrity/ima/ima_api.c |   8 +-
 security/integrity/ima/ima_init.c|   4 +
 security/integrity/ima/ima_init_ima_ns.c |  44 ++
 security/integrity/ima/ima_main.c|  15 +-
 security/integrity/ima/ima_ns.c  | 230 +++
 13 files changed, 469 insertions(+), 26 deletions(-)
 create mode 100644 security/integrity/ima/ima_init_ima_ns.c
 create mode 100644 security/integrity/ima/ima_ns.c

-- 
2.13.6



[RFC PATCH v2 2/3] ima: Add ns_status for storing namespaced iint data

2018-03-09 Thread Stefan Berger
From: Mehmet Kayaalp 

This patch adds an rbtree to the IMA namespace structure that stores a
namespaced version of iint->flags in an ns_status struct. As with the
integrity_iint_cache, both the iint and the ns_status are looked up using
the inode pointer value. The lookup, allocation, and insertion code is
also similar, except that the ns_status is not freed when the inode is
freed. Instead, the lookup verifies that the i_ino and i_generation fields
also match. This could be replaced by a lazy cleanup of the rbtree.

Signed-off-by: Mehmet Kayaalp 
---
 include/linux/ima.h  |   3 +
 security/integrity/ima/ima.h |  19 +
 security/integrity/ima/ima_init_ima_ns.c |   6 ++
 security/integrity/ima/ima_ns.c  | 117 +++
 4 files changed, 145 insertions(+)

diff --git a/include/linux/ima.h b/include/linux/ima.h
index fd150dfde277..52c9d338819c 100644
--- a/include/linux/ima.h
+++ b/include/linux/ima.h
@@ -111,6 +111,9 @@ struct ima_namespace {
struct kref kref;
struct user_namespace *user_ns;
struct ima_namespace *parent;
+   struct rb_root ns_status_tree;
+   rwlock_t ns_status_lock;
+   struct kmem_cache *ns_status_cache;
 };
 
 extern struct ima_namespace init_ima_ns;
diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index e98c11c7cf75..e51a39ff75ff 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -128,6 +128,14 @@ static inline void ima_load_kexec_buffer(void) {}
  */
 extern bool ima_canonical_fmt;
 
+struct ns_status {
+   struct rb_node rb_node;
+   struct inode *inode;
+   ino_t i_ino;
+   u32 i_generation;
+   unsigned long flags;
+};
+
 /* Internal IMA function definitions */
 int ima_init(void);
 int ima_fs_init(void);
@@ -295,6 +303,17 @@ int ima_ns_init(void);
 struct ima_namespace;
 int ima_init_namespace(struct ima_namespace *ns);
 
+#ifdef CONFIG_IMA_NS
+struct ns_status *ima_get_ns_status(struct ima_namespace *ns,
+   struct inode *inode);
+#else
+static inline struct ns_status *ima_get_ns_status(struct ima_namespace *ns,
+ struct inode *inode)
+{
+   return NULL;
+}
+#endif /* CONFIG_IMA_NS */
+
 /* LSM based policy rules require audit */
 #ifdef CONFIG_IMA_LSM_RULES
 
diff --git a/security/integrity/ima/ima_init_ima_ns.c b/security/integrity/ima/ima_init_ima_ns.c
index 4b081dbfac07..9e0d5fafdfba 100644
--- a/security/integrity/ima/ima_init_ima_ns.c
+++ b/security/integrity/ima/ima_init_ima_ns.c
@@ -10,9 +10,15 @@
 #include 
 #include 
 #include 
+#include 
+
+#include "ima.h"
 
 int ima_init_namespace(struct ima_namespace *ns)
 {
+   ns->ns_status_tree = RB_ROOT;
+   rwlock_init(&ns->ns_status_lock);
+   ns->ns_status_cache = KMEM_CACHE(ns_status, SLAB_PANIC);
return 0;
 }
 
diff --git a/security/integrity/ima/ima_ns.c b/security/integrity/ima/ima_ns.c
index 7ab4322c88ae..03acf1431868 100644
--- a/security/integrity/ima/ima_ns.c
+++ b/security/integrity/ima/ima_ns.c
@@ -73,10 +73,24 @@ struct ima_namespace *copy_ima(struct user_namespace *user_ns,
return new_ns;
 }
 
+static void free_ns_status_cache(struct ima_namespace *ns)
+{
+   struct ns_status *status, *next;
+
+   write_lock(&ns->ns_status_lock);
+   rbtree_postorder_for_each_entry_safe(status, next,
+&ns->ns_status_tree, rb_node)
+   kmem_cache_free(ns->ns_status_cache, status);
+   ns->ns_status_tree = RB_ROOT;
+   write_unlock(&ns->ns_status_lock);
+   kmem_cache_destroy(ns->ns_status_cache);
+}
+
 static void destroy_ima_ns(struct ima_namespace *ns)
 {
put_user_ns(ns->user_ns);
put_ima_ns(ns->parent);
+   free_ns_status_cache(ns);
kfree(ns);
 }
 
@@ -89,3 +103,106 @@ void free_ima_ns(struct kref *kref)
 
destroy_ima_ns(ns);
 }
+
+/*
+ * __ima_ns_status_find - return the ns_status associated with an inode
+ */
+static struct ns_status *__ima_ns_status_find(struct ima_namespace *ns,
+ struct inode *inode)
+{
+   struct ns_status *status;
+   struct rb_node *n = ns->ns_status_tree.rb_node;
+
+   while (n) {
+   status = rb_entry(n, struct ns_status, rb_node);
+
+   if (inode < status->inode)
+   n = n->rb_left;
+   else if (inode > status->inode)
+   n = n->rb_right;
+   else
+   break;
+   }
+   if (!n)
+   return NULL;
+
+   return status;
+}
+
+/*
+ * ima_ns_status_find - return the ns_status associated with an inode
+ */
+static struct ns_status *ima_ns_status_find(struct ima_namespace *ns,
+   struct inode *inode)
+{
+   struct ns_status *status;
+
+   read_lock(&ns->ns_status_lock);
+   status = __ima_ns_sta
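
The lookup described in the commit message (search by inode pointer value, then confirm that i_ino and i_generation still match because entries can outlive their inode) can be sketched with a plain binary search tree. This is not the patch's code: the kernel rbtree is replaced by bare left/right pointers, locking is omitted, all sketch_* names are hypothetical, and returning NULL for a stale entry is an assumption, since the truncated patch does not show what happens on a mismatch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_inode {
	unsigned long i_ino;
	uint32_t i_generation;
};

struct sketch_status {
	struct sketch_status *left, *right;
	const struct sketch_inode *inode;	/* search key: pointer value */
	unsigned long i_ino;			/* identity recorded at insert */
	uint32_t i_generation;
	unsigned long flags;
};

static struct sketch_status *sketch_status_find(struct sketch_status *root,
						const struct sketch_inode *inode)
{
	struct sketch_status *n = root;

	while (n) {
		if ((uintptr_t)inode < (uintptr_t)n->inode)
			n = n->left;
		else if ((uintptr_t)inode > (uintptr_t)n->inode)
			n = n->right;
		else
			break;
	}
	if (!n)
		return NULL;

	/* Same address but different identity: the entry is stale. */
	if (n->i_ino != inode->i_ino || n->i_generation != inode->i_generation)
		return NULL;

	return n;
}

/* Returns 1 if fresh entries are found and a stale one is rejected. */
static int sketch_status_demo(void)
{
	static struct sketch_inode inodes[2] = { { 10, 1 }, { 11, 1 } };
	struct sketch_status a = { .inode = &inodes[0], .i_ino = 10,
				   .i_generation = 1, .flags = 0x1 };
	struct sketch_status b = { .inode = &inodes[1], .i_ino = 11,
				   .i_generation = 1 };
	int found;

	a.right = &b;	/* &inodes[1] sorts after &inodes[0] */
	found = sketch_status_find(&a, &inodes[0]) == &a &&
		sketch_status_find(&a, &inodes[1]) == &b;

	inodes[1].i_generation = 2;	/* same memory, reused for a new inode */
	return found && sketch_status_find(&a, &inodes[1]) == NULL;
}
```

The identity check is what makes it tolerable to leave entries in the tree after the inode is freed: a recycled inode address cannot silently inherit the old flags.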

[RFC PATCH v2 3/3] ima: mamespace audit status flags

2018-03-09 Thread Stefan Berger
From: Mehmet Kayaalp 

The iint cache stores whether the file is measured, appraised, audited
etc. This patch moves the IMA_AUDITED flag into the per-namespace
ns_status, enabling the IMA audit mechanism to audit the same file each
time it is accessed in a new namespace.

The ns_status is not looked up if the CONFIG_IMA_NS is disabled or if
any of the IMA_NS_STATUS_ACTIONS (currently only IMA_AUDIT) is not
enabled.

Read and write operations on the iint flags are replaced with function
calls. For reading, iint_flags() returns the bitwise AND of iint->flags
and ns_status->flags. The ns_status flags are masked with
IMA_NS_STATUS_FLAGS (currently only IMA_AUDITED). Similarly,
set_iint_flags() only writes the masked portion to the ns_status flags,
while the iint flags are set as before. The ns_status parameter added to
ima_audit_measurement() is used with the above functions to query and
set the ns_status flags.

Signed-off-by: Mehmet Kayaalp 
---
 init/Kconfig  |  4 +++-
 security/integrity/ima/ima.h  | 24 +++-
 security/integrity/ima/ima_api.c  |  8 +---
 security/integrity/ima/ima_main.c | 15 ---
 security/integrity/ima/ima_ns.c   | 22 ++
 5 files changed, 65 insertions(+), 8 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index a1ad5384e081..f792ae235424 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -938,7 +938,9 @@ config IMA_NS
help
  Allow the creation of IMA namespaces for each mount namespace.
  Namespaced IMA data enables having IMA features work separately
- for each mount namespace.
+ for each mount namespace. Currently, only the audit status flags
+ are stored in the namespace, which allows the same file to be
+ audited each time it is accessed in a new namespace.
 
 endif # NAMESPACES
 
diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index e51a39ff75ff..86fd3c1c0caf 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -210,7 +210,8 @@ void ima_store_measurement(struct integrity_iint_cache *iint, struct file *file,
   struct evm_ima_xattr_data *xattr_value,
   int xattr_len, int pcr);
 void ima_audit_measurement(struct integrity_iint_cache *iint,
-  const unsigned char *filename);
+  const unsigned char *filename,
+  struct ns_status *status);
 int ima_alloc_init_template(struct ima_event_data *event_data,
struct ima_template_entry **entry);
 int ima_store_template(struct ima_template_entry *entry, int violation,
@@ -299,6 +300,9 @@ static inline int ima_read_xattr(struct dentry *dentry,
 
 #endif /* CONFIG_IMA_APPRAISE */
 
+#define IMA_NS_STATUS_ACTIONS   IMA_AUDIT
+#define IMA_NS_STATUS_FLAGS IMA_AUDITED
+
 int ima_ns_init(void);
 struct ima_namespace;
 int ima_init_namespace(struct ima_namespace *ns);
@@ -306,12 +310,30 @@ int ima_init_namespace(struct ima_namespace *ns);
 #ifdef CONFIG_IMA_NS
 struct ns_status *ima_get_ns_status(struct ima_namespace *ns,
struct inode *inode);
+unsigned long iint_flags(struct integrity_iint_cache *iint,
+struct ns_status *status);
+unsigned long set_iint_flags(struct integrity_iint_cache *iint,
+struct ns_status *status, unsigned long flags);
 #else
 static inline struct ns_status *ima_get_ns_status(struct ima_namespace *ns,
  struct inode *inode)
 {
return NULL;
 }
+
+static inline unsigned long iint_flags(struct integrity_iint_cache *iint,
+  struct ns_status *status)
+{
+   return iint->flags;
+}
+
+static inline unsigned long set_iint_flags(struct integrity_iint_cache *iint,
+  struct ns_status *status,
+  unsigned long flags)
+{
+   iint->flags = flags;
+   return flags;
+}
 #endif /* CONFIG_IMA_NS */
 
 /* LSM based policy rules require audit */
diff --git a/security/integrity/ima/ima_api.c b/security/integrity/ima/ima_api.c
index c7e8db0ea4c0..ee55dfd6afdb 100644
--- a/security/integrity/ima/ima_api.c
+++ b/security/integrity/ima/ima_api.c
@@ -304,15 +304,17 @@ void ima_store_measurement(struct integrity_iint_cache *iint,
 }
 
 void ima_audit_measurement(struct integrity_iint_cache *iint,
-  const unsigned char *filename)
+  const unsigned char *filename,
+  struct ns_status *status)
 {
struct audit_buffer *ab;
char hash[(iint->ima_hash->length * 2) + 1];
const char *algo_name = hash_algo_name[iint->ima_hash->algo];
char algo_hash[sizeof(hash) + strlen(algo_name) + 2];
int i;
+   unsigned long flags = iint_flags(iint, status);
 
- 

[RFC PATCH v2 1/3] ima: extend clone() with IMA namespace support

2018-03-09 Thread Stefan Berger
From: Yuqiong Sun 

Add new CONFIG_IMA_NS config option.  Let clone() create a new IMA
namespace upon CLONE_NEWNS flag. Add ima_ns data structure in nsproxy.
ima_ns is allocated and freed upon IMA namespace creation and exit.
Currently, the ima_ns contains no useful IMA data but only a dummy
interface. This patch creates the framework for namespacing the different
aspects of IMA (e.g. IMA-audit, IMA-measurement, IMA-appraisal).

Changelog:
* Use CLONE_NEWNS instead of a new CLONE_NEWIMA flag
* Use existing ima.h headers
* Move the ima_namespace.c to security/integrity/ima/ima_ns.c
* Fix typo INFO->INO
* Each namespace free's itself, removed recursively free'ing
  until init_ima_ns from free_ima_ns()
* Moved ima_init_ns and related functions into own file that is
  always compiled
* Fixed putting of imans->parent
* Move IMA namespace creation from nsproxy into mount namespace
  code

Signed-off-by: Yuqiong Sun 
Signed-off-by: Mehmet Kayaalp 
Signed-off-by: Stefan Berger 
---
 fs/mount.h   | 14 -
 fs/namespace.c   | 29 --
 include/linux/ima.h  | 67 +++
 include/linux/mount.h| 20 ++-
 init/Kconfig |  8 +++
 kernel/nsproxy.c |  1 +
 security/integrity/ima/Makefile  |  3 +-
 security/integrity/ima/ima.h |  4 ++
 security/integrity/ima/ima_init.c|  4 ++
 security/integrity/ima/ima_init_ima_ns.c | 38 +
 security/integrity/ima/ima_ns.c  | 91 
 11 files changed, 260 insertions(+), 19 deletions(-)
 create mode 100644 security/integrity/ima/ima_init_ima_ns.c
 create mode 100644 security/integrity/ima/ima_ns.c

diff --git a/fs/mount.h b/fs/mount.h
index f39bc9da4d73..e19ebde97756 100644
--- a/fs/mount.h
+++ b/fs/mount.h
@@ -5,20 +5,6 @@
 #include 
 #include 
 
-struct mnt_namespace {
-   atomic_t count;
-   struct ns_common ns;
-   struct mount *  root;
-   struct list_head list;
-   struct user_namespace   *user_ns;
-   struct ucounts  *ucounts;
-   u64 seq;    /* Sequence number to prevent loops */
-   wait_queue_head_t poll;
-   u64 event;
-   unsigned int mounts; /* # of mounts in the namespace */
-   unsigned int pending_mounts;
-} __randomize_layout;
-
 struct mnt_pcp {
int mnt_count;
int mnt_writers;
diff --git a/fs/namespace.c b/fs/namespace.c
index 9d1374ab6e06..7f886c02278b 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "pnode.h"
 #include "internal.h"
@@ -2858,6 +2859,7 @@ static void dec_mnt_namespaces(struct ucounts *ucounts)
 
 static void free_mnt_ns(struct mnt_namespace *ns)
 {
+   put_ima_ns(ns->ima_ns);
ns_free_inum(&ns->ns);
dec_mnt_namespaces(ns->ucounts);
put_user_ns(ns->user_ns);
@@ -2873,11 +2875,13 @@ static void free_mnt_ns(struct mnt_namespace *ns)
  */
 static atomic64_t mnt_ns_seq = ATOMIC64_INIT(1);
 
-static struct mnt_namespace *alloc_mnt_ns(struct user_namespace *user_ns)
+static struct mnt_namespace *alloc_mnt_ns(struct user_namespace *user_ns,
+ struct ima_namespace *ima_ns)
 {
struct mnt_namespace *new_ns;
struct ucounts *ucounts;
int ret;
+   int err;
 
ucounts = inc_mnt_namespaces(user_ns);
if (!ucounts)
@@ -2894,6 +2898,20 @@ static struct mnt_namespace *alloc_mnt_ns(struct user_namespace *user_ns)
dec_mnt_namespaces(ucounts);
return ERR_PTR(ret);
}
+
+   if (ima_ns == NULL) {
+   new_ns->ima_ns = get_ima_ns(&init_ima_ns);
+   } else {
+   new_ns->ima_ns = copy_ima(user_ns, ima_ns);
+   if (IS_ERR(new_ns->ima_ns)) {
+   err = PTR_ERR(new_ns->ima_ns);
+   ns_free_inum(&new_ns->ns);
+   kfree(new_ns);
+   dec_mnt_namespaces(ucounts);
+   return ERR_PTR(err);
+   }
+   }
+
new_ns->ns.ops = &mntns_operations;
new_ns->seq = atomic64_add_return(1, &mnt_ns_seq);
atomic_set(&new_ns->count, 1);
@@ -2920,6 +2938,7 @@ struct mnt_namespace *copy_mnt_ns(unsigned long flags, struct mnt_namespace *ns,
int copy_flags;
 
BUG_ON(!ns);
+   BUG_ON(!ns->ima_ns);
 
if (likely(!(flags & CLONE_NEWNS))) {
get_mnt_ns(ns);
@@ -2928,7 +2947,7 @@ struct mnt_namespace *copy_mnt_ns(unsigned long flags, struct mnt_namespace *ns,
 
old = ns->root;
 
-   new_ns = alloc_mnt_ns(user_ns);
+   new_ns = alloc_mnt_ns(user_ns, ns->ima_ns);
if (IS_ERR(new_ns))
return new_ns;
 
@@ -2989,7 +3008,8 @@ struct mnt_namespace *copy_mnt_ns(unsigned l

Re: [PATCH] rcu: exp: Fix "must hold exp_mutex" comments for QS reporting functions

2018-03-09 Thread Paul E. McKenney
On Fri, Mar 09, 2018 at 02:57:00PM +0800, Boqun Feng wrote:
> On Thu, Mar 08, 2018 at 07:42:55AM -0800, Paul E. McKenney wrote:
> > On Thu, Mar 08, 2018 at 04:30:06PM +0800, Boqun Feng wrote:
> > > On Thu, Mar 08, 2018 at 12:54:29PM +0800, Boqun Feng wrote:
> > > > On Wed, Mar 07, 2018 at 08:30:17PM -0800, Paul E. McKenney wrote:
> > > > [...]
> > > > > >  
> > > > > > +/*
> > > > > > + * Like sync_rcu_preempt_exp_done(), but this function assumes the caller
> > > > > > + * doesn't hold the rcu_node's ->lock, and will acquire and release the lock
> > > > > > + * itself
> > > > > > + */
> > > > > > +static bool sync_rcu_preempt_exp_done_unlocked(struct rcu_node *rnp)
> > > > > > +{
> > > > > > +   unsigned long flags;
> > > > > > +   bool ret;
> > > > > > +
> > > > > > +   raw_spin_lock_irqsave_rcu_node(rnp, flags);
> > > > > > +   ret = sync_rcu_preempt_exp_done(rnp);
> > > > > 
> > > > > Let's see...  The sync_rcu_preempt_exp_done() function checks the
> > > > > > ->exp_tasks pointer and the ->expmask bitmask.  The number of bits in the
> > > > > > mask can only decrease, and the ->exp_tasks pointer can only transition
> > > > > from NULL to non-NULL when there is at least one bit set.  However,
> > > > > there is no ordering in sync_rcu_preempt_exp_done(), so it is possible
> > > > > that it could be fooled without the lock:
> > > > > 
> > > > > o CPU 0 in sync_rcu_preempt_exp_done() reads ->exp_tasks and
> > > > >   sees that it is NULL.
> > > > > 
> > > > > o CPU 1 blocks within an RCU read-side critical section, so
> > > > >   it enqueues the task and points ->exp_tasks at it and
> > > > >   clears CPU 1's bit in ->expmask.
> > > > > 
> > > > > o All other CPUs clear their bits in ->expmask.
> > > > > 
> > > > > o CPU 0 reads ->expmask, sees that it is zero, so incorrectly
> > > > >   concludes that all quiescent states have completed, despite
> > > > >   the fact that ->exp_tasks is non-NULL.
> > > > > 
> > > > > So it seems to me that the lock is needed.  Good catch!!!  The problem
> > > > > would occur only if the task running on CPU 0 received a spurious
> > > > > wakeup, but that could potentially happen.
> > > > 
> > > > Thanks for the analysis ;-)
> > 
> > The other limitation is that it occurs only on systems small enough
> > to have a single-node rcu_node tree.  But still...
> > 
> > > > > If lock contention becomes a problem, memory-ordering tricks could be
> > > > > applied, but the lock is of course simpler.
> > > > > 
> > > > 
> > > > Agreed.
> > > > 
> > > > > I am guessing that this is a prototype patch, and that you are planning
> > > > 
> > > > Yes, this is a prototype. And I'm preparing a proper patch to send
> > > > later.
> > 
> > Very good, thank you!
> > 
> > > > > to add lockdep annotations in more places, but either way please let
> > > > > me know.
> > > > 
> > > > Given it's a bug as per your analysis, I'd like to defer other lockdep
> > > > annotations and send this first. However, I'm currently getting other
> > > > lockdep splats after applying this, so I need to get that sorted first.
> > > 
> > > Hmm.. the other lockdep splat seems unrelated to my patch, I could
> > > observe it on mainline using rcutorture with CONFIG_PROVE_LOCKING=y. I'd
> > > spend some more time on it, in the meanwhile, send a proper patch for
> > > this sync_rcu_preempt_exp_done().
> > 
> > I am not seeing that one, but am very interested in getting it fixed!  ;-)
> 
> Found the root cause, and sent out the patch ;-)

Very good!  Still not sure why I don't see it, but as long as it is fixed!

Thanx, Paul
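
The fix discussed in this thread, wrapping a two-field "done" check in the node's lock so that both fields are read as one consistent snapshot, can be sketched in userspace C. This is a loose analogy only: a pthread mutex stands in for the rcu_node's raw spinlock, the struct is reduced to the two fields named in the analysis, and all sketch_* names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_node {
	pthread_mutex_t lock;
	void *exp_tasks;		/* non-NULL while a blocked task is queued */
	unsigned long expmask;		/* one bit per CPU still owing a QS */
};

/* Caller must hold np->lock, otherwise the two reads can be torn apart. */
static bool sketch_exp_done(const struct sketch_node *np)
{
	return np->exp_tasks == NULL && np->expmask == 0;
}

/* For callers that do not hold np->lock: acquire it around the check. */
static bool sketch_exp_done_unlocked(struct sketch_node *np)
{
	bool ret;

	pthread_mutex_lock(&np->lock);
	ret = sketch_exp_done(np);
	pthread_mutex_unlock(&np->lock);
	return ret;
}

/* Single-threaded walk through the three states; returns 1 if all agree. */
static int sketch_exp_demo(void)
{
	struct sketch_node n = { .lock = PTHREAD_MUTEX_INITIALIZER,
				 .expmask = 0x3 };
	int dummy_task;
	bool pending, done, blocked;

	pending = !sketch_exp_done_unlocked(&n);	/* bits still set */
	n.expmask = 0;
	done = sketch_exp_done_unlocked(&n);		/* nothing outstanding */
	n.exp_tasks = &dummy_task;
	blocked = !sketch_exp_done_unlocked(&n);	/* a task is queued */
	return pending && done && blocked;
}
```

Without the lock, a reader could see the old NULL ->exp_tasks and the new zero ->expmask, which is the interleaving described in the CPU 0 / CPU 1 scenario above.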



Re: [PATCH] ASoC: soc-core: Add missing NULL check

2018-03-09 Thread Pavel Machek
On Fri 2018-03-09 12:50:50, Mark Brown wrote:
> On Thu, Mar 08, 2018 at 12:06:53PM -0800, Kees Cook wrote:
> 
> > If a codec is not attached to the sound soc, a NULL deref is possible as a
> > regular user in /sys.
> 
> I can't parse this, sorry.  What is the "sound soc"?
> 
> > +++ b/sound/soc/soc-core.c
> > @@ -137,6 +137,9 @@ static ssize_t soc_codec_reg_show(struct snd_soc_codec 
> > *codec, char *buf,
> > size_t total = 0;
> > loff_t p = 0;
> >  
> > +   if (!codec || !codec->driver)
> > +   return 0;
> > +
> 
> How are we managing to create a sysfs file for a CODEC which doesn't
> have a CODEC struct associated with it?  That is obviously nonsensical
> and suggests we've got some more serious problem going on here - if
> there's no CODEC those sysfs attributes simply shouldn't be there.

Look for "linux-next on n900: oops in codec_reg_show() when grepping
sysfs" ... should be in your inbox.
Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html




Re: [PATCH] ASoC: soc-core: Add missing NULL check

2018-03-09 Thread Mark Brown
On Fri, Mar 09, 2018 at 10:45:16AM -0800, Kees Cook wrote:
> On Fri, Mar 9, 2018 at 4:50 AM, Mark Brown  wrote:
> > On Thu, Mar 08, 2018 at 12:06:53PM -0800, Kees Cook wrote:

> >> If a codec is not attached to the sound soc, a NULL deref is possible as a
> >> regular user in /sys.

> > I can't parse this, sorry.  What is the "sound soc"?

> SoC's sound component? I'm not sure either. :) I was just sending the
> patch that I mentioned from the thread where Pavel mentioned this
> Oops.

Oh, Pavel's thing.  I didn't look at that yet.  I'm afraid your
description still isn't making much sense to me - I'm guessing that
you're just papering over an immediate crack rather than having
analyzed the situation in any depth?

> >> + if (!codec || !codec->driver)
> >> + return 0;

> > How are we managing to create a sysfs file for a CODEC which doesn't
> > have a CODEC struct associated with it?  That is obviously nonsensical
> > and suggests we've got some more serious problem going on here - if
> > there's no CODEC those sysfs attributes simply shouldn't be there.

> No idea! Hopefully Pavel has more details...

That's where the fix should be, it implies that there's some larger data
corruption/confusion problem somewhere else.  If we've created the file
but left a NULL pointer I'd expect that there is a good chance that
there'll be other things that think we've got a CODEC and try to
dereference the pointer, it's an assumption that's present throughout
the code.

I think I might just remove the file though, it's been non-functional on
most systems for a while now as almost all the drivers migrated to
regmap and nobody complained so we should be safe.  There's still
something that ought to be investigated here.




Re: Warning from swake_up_all in 4.14.15-rt13 non-RT

2018-03-09 Thread Sebastian Andrzej Siewior
On 2018-03-09 18:46:05 [+0100], Peter Zijlstra wrote:
> On Fri, Mar 09, 2018 at 12:04:18PM +0100, Sebastian Andrzej Siewior wrote:
> > +void swake_add_all_wq(struct swait_queue_head *q, struct wake_q_head *wq)
> >  {
> > struct swait_queue *curr;
> >  
> > while (!list_empty(&q->task_list)) {
> >  
> > curr = list_first_entry(&q->task_list, typeof(*curr),
> > task_list);
> > list_del_init(&curr->task_list);
> > +   wake_q_add(wq, curr->task);
> > }
> >  }
> > +EXPORT_SYMBOL(swake_add_all_wq);
> >  
> >  void swake_up(struct swait_queue_head *q)
> >  {
> > @@ -66,25 +62,14 @@ EXPORT_SYMBOL(swake_up);
> >   */
> >  void swake_up_all(struct swait_queue_head *q)
> >  {
> > +   unsigned long flags;
> > +   DEFINE_WAKE_Q(wq);
> >  
> > +   raw_spin_lock_irqsave(&q->lock, flags);
> > +   swake_add_all_wq(q, &wq);
> > +   raw_spin_unlock_irqrestore(&q->lock, flags);
> >  
> > +   wake_up_q(&wq);
> >  }
> >  EXPORT_SYMBOL(swake_up_all);
> 
> This is fundamentally wrong. The whole point of wake_up_all() is that
> _all_ is unbounded and should not ever land in a single critical
> section, be it IRQ or PREEMPT disabled. The above does both.

Is it just about the irqsave() usage or something else? I doubt it is
the list walk. It is still unbound if not called from irq-off region.
But it is now possible, I agree. The wake_q usage should be cheaper
compared to IRQ off+on in each loop. And we wanted to do the wake ups
with enabled interrupts - there is still the list_splice() from that
attempt. Now it can be.
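
A simplified, single-threaded model of that pattern, as a sketch only: waiters are moved to a local wake list while the lock is held, and the wakeups themselves run after the unlock. The names used here (struct waiter, wait_queue_wake_all(), the "locked" flag standing in for raw_spin_lock_irqsave()) are made up for illustration and are not the kernel API. Note that the list walk still runs under the lock, so its length is not bounded.

```c
#include <assert.h>
#include <stddef.h>

struct waiter {
	struct waiter *next;
	int woken;		/* set once the waiter has been "woken" */
	int woken_under_lock;	/* was the queue lock held at that point? */
};

struct wait_queue {
	int locked;		/* stand-in for the raw spinlock + IRQ state */
	struct waiter *head;
};

static void wq_lock(struct wait_queue *q)
{
	assert(!q->locked);
	q->locked = 1;
}

static void wq_unlock(struct wait_queue *q)
{
	assert(q->locked);
	q->locked = 0;
}

void wait_queue_add(struct wait_queue *q, struct waiter *w)
{
	wq_lock(q);
	w->woken = 0;
	w->woken_under_lock = 0;
	w->next = q->head;
	q->head = w;
	wq_unlock(q);
}

/* Returns the number of waiters woken. */
int wait_queue_wake_all(struct wait_queue *q)
{
	struct waiter *wake_list = NULL;
	int n = 0;

	/* Phase 1: under the lock, only move waiters to a local list. */
	wq_lock(q);
	while (q->head) {
		struct waiter *w = q->head;

		q->head = w->next;
		w->next = wake_list;
		wake_list = w;
	}
	wq_unlock(q);

	/* Phase 2: the wakeups run with the lock dropped. */
	while (wake_list) {
		struct waiter *w = wake_list;

		wake_list = w->next;
		w->next = NULL;
		w->woken = 1;
		w->woken_under_lock = q->locked;
		n++;
	}
	return n;
}
```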

> Yes, wake_up_all() is crap, it is also fundamentally incompatible with
> in-*irq usage. Nothing to be done about that.
I still have (or need) completions which are swait based and do
complete_all(). There are complete_all() callers which wake more than one
waiter (that is PM and crypto from the reports I got once I added the
WARN_ON()).
The in-IRQ usage is !RT only and was there before.

> So NAK on this.
So I need completions to be swait based and do complete_all() from IRQ
(on !RT, not RT). I have this one call which breaks the usage on !RT and
has wake_up_all() in it in vanilla which needs an swait equivalent since
it calls its callback from an rcu-sched section.

Sebastian


Re: [PATCH v3 3/6] dt-bindings: soc: Add a binding for the Broadcom VCHIQ services. (v3)

2018-03-09 Thread Stefan Wahren
Hi Eric,

> Eric Anholt  hat am 9. März 2018 um 19:44 geschrieben:
> 
> 
> The VCHIQ communication channel can be provided by BCM283x and Capri
> SoCs, to communicate with the VPU-side OS services.
> 
> Signed-off-by: Eric Anholt 
> ---
> 
> v2: VCHI->VCHIQ, dropped firmware property, added cache-line-size
> v3: Dropped cache-line-size, s/vchi@/mailbox@/
> 
>  .../devicetree/bindings/soc/bcm/brcm,bcm2835-vchiq.txt   | 16 
> 
>  1 file changed, 16 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/soc/bcm/brcm,bcm2835-vchiq.txt
> 
> diff --git a/Documentation/devicetree/bindings/soc/bcm/brcm,bcm2835-vchiq.txt 
> b/Documentation/devicetree/bindings/soc/bcm/brcm,bcm2835-vchiq.txt
> new file mode 100644
> index ..8dd7b3a7de65
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/soc/bcm/brcm,bcm2835-vchiq.txt
> @@ -0,0 +1,16 @@
> +Broadcom VCHIQ firmware services
> +
> +Required properties:
> +
> +- compatible:Should be "brcm,bcm2835-vchiq"
> +- reg:   Physical base address and length of the doorbell 
> register pair
> +- interrupts:The interrupt number
> +   See bindings/interrupt-controller/brcm,bcm2835-armctrl-ic.txt
> +
> +Example:
> +
> +mailbox@7e00b840 {

just a question: do you think this is future-proof to claim the doorbell for 
VCHIQ?

Stefan

> + compatible = "brcm,bcm2835-vchiq";
> + reg = <0x7e00b840 0xf>;
> + interrupts = <0 2>;
> +};
> -- 
> 2.16.2
> 
> 
> ___
> linux-arm-kernel mailing list
> linux-arm-ker...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


[PATCH] powerpc/64: Fix section mismatch warnings for early boot symbols

2018-03-09 Thread Mauricio Faria de Oliveira
Some of the boot code located at the start of kernel text is "init"
class, in that it only runs at boot time, however marking it as normal
init code is problematic because that puts it into a different section
located at the very end of kernel text.

e.g., in case the TOC is not set up, we may not be able to tolerate a
branch trampoline to reach the init function.

Credits: code and message are based on a 2016 patch by Nicholas Piggin,
slightly modified so as not to rename the powerpc code/symbol names.

Subject: [PATCH] powerpc/64: quieten section mismatch warnings
From: Nicholas Piggin 
Date: Fri Dec 23 00:14:19 AEDT 2016

Signed-off-by: Mauricio Faria de Oliveira 
---
 scripts/mod/modpost.c | 22 --
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
index 9917f92..c65d5e2 100644
--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -1174,8 +1174,15 @@ static const struct sectioncheck *section_mismatch(
  *   fromsec = text section
  *   refsymname = *.constprop.*
  *
+ * Pattern 6:
+ *   powerpc64 has boot functions that reference init, but must remain in text.
+ *   This pattern is identified by
+ *   tosec   = init section
+ *   fromsym = 
+ *
  **/
-static int secref_whitelist(const struct sectioncheck *mismatch,
+static int secref_whitelist(const struct elf_info *elf,
+   const struct sectioncheck *mismatch,
const char *fromsec, const char *fromsym,
const char *tosec, const char *tosym)
 {
@@ -1212,6 +1219,17 @@ static int secref_whitelist(const struct sectioncheck *mismatch,
match(fromsym, optim_symbols))
return 0;
 
+   /* Check for pattern 6 */
+   if (elf->hdr->e_machine == EM_PPC64)
+   if (match(tosec, init_sections) &&
+   (!strncmp(fromsym, "__boot_from_prom",
+   strlen("__boot_from_prom")) ||
+!strncmp(fromsym, "start_here_multiplatform",
+   strlen("start_here_multiplatform")) ||
+!strncmp(fromsym, "start_here_common",
+   strlen("start_here_common"))))
+   return 0;
+
return 1;
 }
 
@@ -1552,7 +1570,7 @@ static void default_mismatch_handler(const char *modname, struct elf_info *elf,
tosym = sym_name(elf, to);
 
/* check whitelist - we may ignore it */
-   if (secref_whitelist(mismatch,
+   if (secref_whitelist(elf, mismatch,
 fromsec, fromsym, tosec, tosym)) {
report_sec_mismatch(modname, mismatch,
fromsec, r->r_offset, fromsym,
-- 
1.8.3.1



Re: [PATCH v4] cpuset: Enable cpuset controller in default hierarchy

2018-03-09 Thread Waiman Long
On 03/09/2018 02:40 PM, Mike Galbraith wrote:
>>>
>>> If v2 is to ever supersede v1, as is the normal way of things, core
>>> functionality really should be on the v2 boat when it sails.  What you
>>> left standing on the dock is critical core cpuset functionality.
>>>
>>> -Mike
>> From your perspective, what are core functionality that should be
>> included in cpuset v2 other than the ability to restrict cpus and memory
>> nodes.
> Exclusive sets are essential, no?  How else can you manage set wide
> properties such as topology (and hopefully soonish nohz).  You clearly
> can't have overlapping sets, one having scheduler topology, the other
> having none.  Whatever the form, something as core as the capability to
> dynamically partition and isolate should IMO be firmly aboard the v2
> boat before it sails.
>
>   -Mike

The isolcpus= parameter just reduces the cpus available to the rest of
the system. The cpuset controller does look at that value and makes
adjustments accordingly, but it has no dependence on the exclusive cpu/mem
features of cpuset.

-Longman




Re: [PATCH v3 3/6] dt-bindings: soc: Add a binding for the Broadcom VCHIQ services. (v3)

2018-03-09 Thread Eric Anholt
Stefan Wahren  writes:

> Hi Eric,
>
>> Eric Anholt  hat am 9. März 2018 um 19:44 geschrieben:
>> 
>> 
>> The VCHIQ communication channel can be provided by BCM283x and Capri
>> SoCs, to communicate with the VPU-side OS services.
>> 
>> Signed-off-by: Eric Anholt 
>> ---
>> 
>> v2: VCHI->VCHIQ, dropped firmware property, added cache-line-size
>> v3: Dropped cache-line-size, s/vchi@/mailbox@/
>> 
>>  .../devicetree/bindings/soc/bcm/brcm,bcm2835-vchiq.txt   | 16 
>> 
>>  1 file changed, 16 insertions(+)
>>  create mode 100644 
>> Documentation/devicetree/bindings/soc/bcm/brcm,bcm2835-vchiq.txt
>> 
>> diff --git 
>> a/Documentation/devicetree/bindings/soc/bcm/brcm,bcm2835-vchiq.txt 
>> b/Documentation/devicetree/bindings/soc/bcm/brcm,bcm2835-vchiq.txt
>> new file mode 100644
>> index ..8dd7b3a7de65
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/soc/bcm/brcm,bcm2835-vchiq.txt
>> @@ -0,0 +1,16 @@
>> +Broadcom VCHIQ firmware services
>> +
>> +Required properties:
>> +
>> +- compatible:   Should be "brcm,bcm2835-vchiq"
>> +- reg:  Physical base address and length of the doorbell 
>> register pair
>> +- interrupts:   The interrupt number
>> +  See bindings/interrupt-controller/brcm,bcm2835-armctrl-ic.txt
>> +
>> +Example:
>> +
>> +mailbox@7e00b840 {
>
> just a question: do you think this is future-proof to claim the doorbell for 
> VCHIQ?

There are 4 and this is the only one used so far, so it seems terribly
unlikely to get reused.  If the firmware did for some reason decide to
reuse it for something else, they'd surely go override the DT like they
have in the past.




Re: [PATCH] [v2] docs: clarify security-bugs disclosure policy

2018-03-09 Thread Alan Cox
On Wed, 07 Mar 2018 13:46:24 -0800
Dave Hansen  wrote:

> From: Dave Hansen 
> 
> I think we need to soften the language a bit.  It might scare folks
> off, especially the:
> 
>We prefer to fully disclose the bug as soon as possible.
> 
> which is not really the case.  Linus says:
> 
>   It's not full disclosure, it's not coordinated disclosure,
>   and it's not "no disclosure".  It's more like just "timely
>   open fixes".
> 
> I changed a bit of the wording in here, but mostly to remove the word
> "disclosure" since it seems to mean very specific things to people
> that we do not mean here.
> 

If you want to be taken seriously then I think at a minimum you also need to
- Give a GPG key for messages to the list
- State what security is in place (encryption etc) to protect the list
  itself

There are probably a lot more things people would ask, but given that the
policy now makes clear that it's basically just an 'early tip off'/'make sure
Linus doesn't miss this' list for very short notification periods, it
doesn't matter so much.

Alan




Re: [PATCH v2] perf machine: Fix load kernel symbol with '-k' option

2018-03-09 Thread Jiri Olsa
On Fri, Mar 09, 2018 at 02:05:23PM +0800, Leo Yan wrote:
> On Hikey arm64 octa A53 platform, when use command './perf report -v
> -k vmlinux --stdio' it outputs below error info, and it skips to load
> kernel symbol and doesn't print symbol for event:
> Failed to open [kernel.kallsyms]_text, continuing without symbols.
> 
> The regression is introduced by commit ("8c7f1bb37b29 perf machine: Move
> kernel mmap name into struct machine"), which changes the logic for
> machine mmap_name by removing function machine__mmap_name() and always
> use 'machine->mmap_name'.  Comparing difference between
> machine__mmap_name() and 'machine->mmap_name', the later one includes
> the string for specified kernel vmlinux string with option '-k' in
> command, but the old function machine__mmap_name() ignores vmlinux
> path string.  As result, event's mmap file name doesn't match with
> machine mmap file name anymore and it skips to load kernel symbol from
> vmlinux file.
> 
> To resolve this issue, this patch adds extra checking for
> 'symbol_conf.vmlinux_name', when it has been set string so we can know
> it includes vmlinux path string specified for option '-k'. For this
> case it sets 'is_kernel_mmap' to true and run into flow to load kernel
> symbol from vmlinux.
> 
> This patch has been verified with two commands: './perf report -v
> -k vmlinux --stdio' and './perf script -v -F cpu,event,ip,sym,symoff
> -k vmlinux'.
> 
> Suggested-by: Mathieu Poirier 
> Signed-off-by: Leo Yan 
> ---
>  tools/perf/util/machine.c | 15 ---
>  1 file changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> index 12b7427..3125871 100644
> --- a/tools/perf/util/machine.c
> +++ b/tools/perf/util/machine.c
> @@ -1299,9 +1299,18 @@ static int machine__process_kernel_mmap_event(struct 
> machine *machine,
>   else
>   kernel_type = DSO_TYPE_GUEST_KERNEL;
>  
> - is_kernel_mmap = memcmp(event->mmap.filename,
> - machine->mmap_name,
> - strlen(machine->mmap_name) - 1) == 0;
> + /*
> +  * When symbol_conf.vmlinux_name is not NULL, it includes the specified
> +  * kernel vmlinux path with option '-k'.  So set 'is_kernel_mmap' to
> +  * true for creating machine symbol map.
> +  */
> + if (symbol_conf.vmlinux_name)
> + is_kernel_mmap = true;
> + else
> + is_kernel_mmap = memcmp(event->mmap.filename,
> + machine->mmap_name,
> + strlen(machine->mmap_name) - 1) == 0;
> +
>   if (event->mmap.filename[0] == '/' ||
>   (!is_kernel_mmap && event->mmap.filename[0] == '[')) {
>   map = machine__findnew_module_map(machine, event->mmap.start,

right, the mmap gets confused with the vmlinux path, but I wonder whether
the fix should instead be to not include symbol_conf.vmlinux_name in the
mmap_name, like below.. untested

thanks,
jirka


---
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 43fbbee409ec..f0cb72022177 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -51,15 +51,9 @@ static void machine__threads_init(struct machine *machine)
 static int machine__set_mmap_name(struct machine *machine)
 {
if (machine__is_host(machine)) {
-   if (symbol_conf.vmlinux_name)
-   machine->mmap_name = strdup(symbol_conf.vmlinux_name);
-   else
-   machine->mmap_name = strdup("[kernel.kallsyms]");
+   machine->mmap_name = strdup("[kernel.kallsyms]");
} else if (machine__is_default_guest(machine)) {
-   if (symbol_conf.default_guest_vmlinux_name)
-   machine->mmap_name = strdup(symbol_conf.default_guest_vmlinux_name);
-   else
-   machine->mmap_name = strdup("[guest.kernel.kallsyms]");
+   machine->mmap_name = strdup("[guest.kernel.kallsyms]");
} else {
if (asprintf(&machine->mmap_name, "[guest.kernel.kallsyms.%d]",
 machine->pid) < 0)
@@ -794,9 +788,15 @@ static struct dso *machine__get_kernel(struct machine *machine)
struct dso *kernel;
 
if (machine__is_host(machine)) {
+   if (symbol_conf.vmlinux_name)
+   vmlinux_name = symbol_conf.vmlinux_name;
+
kernel = machine__findnew_kernel(machine, vmlinux_name,
 "[kernel]", DSO_TYPE_KERNEL);
} else {
+   if (symbol_conf.default_guest_vmlinux_name)
+   vmlinux_name = symbol_conf.default_guest_vmlinux_name;
+
kernel = machine__findnew_kernel(machine, vmlinux_name,
 "[guest.kernel]",
 DSO_TYPE_GUEST_KERNEL);


Re: [PATCH][RFC] rslib: Remove VLAs by setting upper bound on nroots

2018-03-09 Thread Kees Cook
On Fri, Mar 9, 2018 at 7:49 AM, Thomas Gleixner  wrote:
> On Fri, 9 Mar 2018, Kees Cook wrote:
>
>> Avoid VLAs[1] by always allocating the upper bound of stack space
>> needed. The existing users of rslib appear to max out at 32 roots,
>> so use that as the upper bound.
>
> I think 32 is plenty. Do we have actually a user with 32?

I found 24 as the max, but thought maybe 32 would be better?

drivers/md/dm-verity-fec.h:#define DM_VERITY_FEC_RSM 255
drivers/md/dm-verity-fec.h:#define DM_VERITY_FEC_MAX_RSN 253
drivers/md/dm-verity-fec.h:#define DM_VERITY_FEC_MIN_RSN 231 /* ~10% space overhead */
drivers/md/dm-verity-fec.c:

if (sscanf(arg_value, "%hhu%c", &num_c, &dummy) != 1
|| !num_c ||
num_c < (DM_VERITY_FEC_RSM - DM_VERITY_FEC_MAX_RSN) ||
num_c > (DM_VERITY_FEC_RSM - DM_VERITY_FEC_MIN_RSN)) {
ti->error = "Invalid " DM_VERITY_OPT_FEC_ROOTS;
return -EINVAL;
}
v->fec->roots = num_c;
...
drivers/md/dm-verity-fec.c: return init_rs(8, 0x11d, 0, 1, v->fec->roots);

So this can be as much as 24.
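
As a quick check of that arithmetic (a sketch only; the FEC_MIN_ROOTS/FEC_MAX_ROOTS names and fec_roots_valid() are made up here, and only the three DM_VERITY_FEC_* values come from the header quoted above):

```c
#include <assert.h>
#include <stdbool.h>

/* Values as quoted from drivers/md/dm-verity-fec.h. */
#define DM_VERITY_FEC_RSM	255
#define DM_VERITY_FEC_MAX_RSN	253
#define DM_VERITY_FEC_MIN_RSN	231

/* Hypothetical names for the two ends of the accepted range. */
#define FEC_MIN_ROOTS	(DM_VERITY_FEC_RSM - DM_VERITY_FEC_MAX_RSN)
#define FEC_MAX_ROOTS	(DM_VERITY_FEC_RSM - DM_VERITY_FEC_MIN_RSN)

static_assert(FEC_MIN_ROOTS == 2, "smallest accepted root count");
static_assert(FEC_MAX_ROOTS == 24, "largest accepted root count");

/* Hypothetical helper mirroring the range check in the quoted parser. */
bool fec_roots_valid(int roots)
{
	return roots >= FEC_MIN_ROOTS && roots <= FEC_MAX_ROOTS;
}
```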

drivers/mtd/nand/diskonchip.c:#define NROOTS 4
drivers/mtd/nand/diskonchip.c:  rs_decoder = init_rs(10, 0x409, FCR, 1, NROOTS);

4.

fs/pstore/ram.c:static int ramoops_ecc;
fs/pstore/ram.c:module_param_named(ecc, ramoops_ecc, int, 0600);
fs/pstore/ram.c:MODULE_PARM_DESC(ramoops_ecc,
fs/pstore/ram.c:dummy_data->ecc_info.ecc_size = ramoops_ecc ==
1 ? 16 : ramoops_ecc;
...
fs/pstore/ram.c:cxt->ecc_info = pdata->ecc_info;
...
fs/pstore/ram_core.c:   prz->rs_decoder =
init_rs(prz->ecc_info.symsize, prz->ecc_info.poly,
fs/pstore/ram_core.c- 0, 1, prz->ecc_info.ecc_size);

The default "ecc enabled" mode for pstore is 16, but was made dynamic
a while ago. However, I've only ever seen people use a smaller number
of roots.

>> Alternative: make init_rs() a true caller-instance and pre-allocate
>> the workspaces. Will this need locking or are the callers already
>> single-threaded in their use of librs?
>
> init_rs() is an init function which needs to be invoked _before_ the
> decoder/encoder can be used.
>
> The way it works today that it can share the rs_control between users to
> avoid duplicating the polynom arrays and the setup of them.
>
> So we might change how rs_control works and allocate rs_control for each
> invocation of init_rs(). That means we need two data structures:
>
> Rename rs_control to rs_poly and just use that internaly for sharing the
> polynom arrays.
>
> rs_control then becomes:
>
> struct rs_control {
>         struct rs_poly  *poly;
>         uint16_t        lamda[MAX_ROOTS + 1];
> 
>         uint16_t        loc[MAX_ROOTS];
> };
>
> But as you said that requires serialization or separation at the usage
> sites.

Right. Not my favorite idea. :P

> drivers/mtd/nand/* would either need a mutex or allocate one rs_control per
> instance. Simple enough to do.
>
> drivers/md/dm-verity-fec.c looks like it's allocating a dm control struct
> for each worker thread, so that should just require allocating one
> rs_control per worker then.
>
> pstore only has an issue in case of OOPS. A simple solution would be to
> allocate two rs_control structs, one for regular usage and one for the OOPS
> case. Not sure if that covers all possible problems, so that needs more
> thoughts.

Maybe I should just go with 24 as the max, and if we have a case where
we need more, address it then?
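
For reference, the fixed-upper-bound approach amounts to something like the sketch below. RS_MAX_ROOTS, struct rs_codec and both functions are hypothetical stand-ins, not the rslib API, and the cap of 24 is just the value floated above, not a settled one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RS_MAX_ROOTS 24	/* assumed cap, taken from the discussion above */

struct rs_codec {
	int nroots;
};

/* Reject configurations above the cap once, at init time. */
int rs_codec_init(struct rs_codec *rs, int nroots)
{
	if (nroots < 1 || nroots > RS_MAX_ROOTS)
		return -1;
	rs->nroots = nroots;
	return 0;
}

/*
 * Toy user of a scratch buffer: the array is sized by the constant,
 * not by rs->nroots, so the compiler sees a fixed-size stack array
 * instead of a VLA.
 */
unsigned int rs_parity_checksum(const struct rs_codec *rs, const uint16_t *par)
{
	uint16_t scratch[RS_MAX_ROOTS];
	unsigned int sum = 0;
	int i;

	assert(rs->nroots >= 1 && rs->nroots <= RS_MAX_ROOTS);
	memcpy(scratch, par, (size_t)rs->nroots * sizeof(scratch[0]));
	for (i = 0; i < rs->nroots; i++)
		sum += scratch[i];
	return sum;
}
```

The cost is that every caller pays for RS_MAX_ROOTS entries of stack even when it uses far fewer roots.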

-Kees

-- 
Kees Cook
Pixel Security


Re: [PATCH v5 3/4] arm64: dts: sdm845: Add minimal dts/dtsi files for sdm845 SoC and MTP

2018-03-09 Thread Doug Anderson
Hi,

On Wed, Feb 21, 2018 at 10:12 PM, Rajendra Nayak  wrote:
> +   gcc: clock-controller@10 {
> +   compatible = "qcom,gcc-sdm845";
> +   reg = <0x10 0x1f>;
> +   #clock-cells = <1>;
> +   #reset-cells = <1>;
> +   };

Seems like we need "#power-domain-cells = <1>;" in the gcc node.

It is true that the property is listed as "optional" in the bindings,
but we certainly know that the
"include/dt-bindings/clock/qcom,gcc-sdm845.h" that's posted [1]
contains several defines ending in "_GDSC" and once we start
referencing those we'll need "#power-domain-cells".  Seems like we
should just have it from the beginning.

NOTE: IMHO adding "#power-domain-cells" could be done as a follow-on
patch, but since (I think) this series still hasn't landed I guess we
could just send up v6?

[1] https://patchwork.kernel.org/patch/10267093/


-Doug


[PATCH v4 6/7] dt-bindings: Introduce interconnect consumers bindings

2018-03-09 Thread Georgi Djakov
Add documentation for the interconnect consumer bindings, which will allow
linking a device node (consumer) to its interconnect controller hardware.

The aim is to enable drivers to request a framework API to configure an
interconnect path by providing their struct device pointer and a name.

Signed-off-by: Georgi Djakov 
---
 .../bindings/interconnect/interconnect.txt | 23 ++
 1 file changed, 23 insertions(+)

diff --git a/Documentation/devicetree/bindings/interconnect/interconnect.txt b/Documentation/devicetree/bindings/interconnect/interconnect.txt
index 70612bb201e4..7935abf10c4b 100644
--- a/Documentation/devicetree/bindings/interconnect/interconnect.txt
+++ b/Documentation/devicetree/bindings/interconnect/interconnect.txt
@@ -45,3 +45,26 @@ Examples:
status = "okay";
};
 
+= interconnect consumers =
+
+The interconnect consumers are device nodes which consume the interconnect
+path(s) provided by the interconnect provider. There can be multiple
+interconnect providers on a SoC and the consumer may consume multiple paths
+from different providers depending on use case and the components it has to
+interact with.
+
+Required-properties:
+interconnects: Pairs of phandles and interconnect provider specifier to denote
+   the source and the destination port of the interconnect path.
+interconnect-names: List of interconnect path name strings sorted in the same
+   order as the interconnects property. Consumer drivers will use
+   interconnect-names to match interconnect paths with interconnect
+   specifiers.
+
+Example:
+
+   sdhci@7864000 {
+   ...
+   interconnects = <&pnoc 78 &bimc 512>;
+   interconnect-names = "memory";
+   };


Re: [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()

2018-03-09 Thread Linus Torvalds
On Fri, Mar 9, 2018 at 12:05 PM, Kees Cook  wrote:
> When max() is used in stack array size calculations from literal values
> (e.g. "char foo[max(sizeof(struct1), sizeof(struct2))]", the compiler
> thinks this is a dynamic calculation due to the single-eval logic, which
> is not needed in the literal case. This change removes several accidental
> stack VLAs from an x86 allmodconfig build:

Ok, looks good.

I just have a couple of questions about applying it.

In particular, if this will help people working on getting rid of
VLA's in the short term, I can apply it directly. But if people who
are looking at it (anybody else than Kees?) don't much care, then this
might be a 4.17 thing or at least "random -mm queue"?

The other unrelated reaction I had to this was that "we're passing
those types down very deep, even though nobody _cares_ about them all
that much at that deep level".

Honestly, the only case that really cares is the very top level, and
the rest could just take the properly cast versions.

For example, "max_t/min_t" really don't care at all, since they - by
definition - just take the single specified type.

So I'm wondering if we should just drop the types from __max/__min
(and everything they call) entirely, and instead do

#define __check_type(x,y) ((void)((typeof(x)*)1==(typeof(y)*)1))
#define min(x,y)   (__check_type(x,y),__min(x,y))
#define max(x,y)   (__check_type(x,y),__max(x,y))

#define min_t(t,x,y) __min((t)(x),(t)(y))
#define max_t(t,x,y) __max((t)(x),(t)(y))

and then __min/__max and friends are much simpler (and can just assume
that the type is already fine, and the casting has been done).

This is technically entirely independent of this VLA cleanup thing,
but the "passing the types around unnecessarily" just becomes more
obvious when there's now another level of macros, _and_ it's a more
complex expression too.

Yes, yes, the __single_eval_xyz() functions still end up wanting the
types for the declaration of the new single-evaluation variables, but
the 'typeof' pattern is the standard pattern, so

#define __single_eval_max(max1, max2, x, y) ({  \
typeof (x) max1 = (x);  \
typeof (y) max2 = (y);  \
max1 > max2 ? max1 : max2; })

actually looks more natural to me than passing the two types in as
arguments to the macro.

(That form also is amenable to things like "__auto_type" etc simplifications).
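
A user-space sketch of how those pieces could fit together, assuming GCC or Clang (typeof and statement expressions are GNU extensions). The sketch_ prefix and the trailing-underscore variable names are chosen here to avoid clashing with anything real; this is not the kernel's kernel.h.

```c
#include <assert.h>

/*
 * Type check only: typeof() does not evaluate its operand, and comparing
 * pointers to different types draws a compiler diagnostic.
 */
#define sketch_check_type(x, y) \
	((void)((__typeof__(x) *)1 == (__typeof__(y) *)1))

/* Each argument is evaluated exactly once. */
#define sketch_single_eval_max(x, y) ({		\
	__typeof__(x) max1_ = (x);		\
	__typeof__(y) max2_ = (y);		\
	max1_ > max2_ ? max1_ : max2_; })

/* Only the top level cares about the types matching. */
#define sketch_max(x, y) \
	(sketch_check_type(x, y), sketch_single_eval_max(x, y))

/* The _t variant casts first, so no type check is needed. */
#define sketch_max_t(t, x, y) \
	sketch_single_eval_max((t)(x), (t)(y))

static int calls;

static int next_value(void)
{
	return ++calls;
}

/* Returns how many times next_value() ran for one sketch_max() use. */
int count_evaluations(void)
{
	int m;

	calls = 0;
	m = sketch_max(next_value(), 0);
	assert(m == 1);
	return calls;
}
```

With mismatched argument types, for example sketch_max(1, 2L), the pointer comparison in sketch_check_type() is what triggers the diagnostic.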

Side note: do we *really* need the unique variable names? That's what
makes those things _really_ illegible. I think it's done just for a
sparse warning that we should probably ignore. It's off by default
anyway exactly because it doesn't work that well due to nested macro
expansions like this.

There is very real value to keeping our odd macros legible, I feel.
Even when they are complicated by issues like this, it would be good
to keep them as simple as possible.

Comments?

Linus


[PATCH v4 7/7] interconnect: Allow endpoints translation via DT

2018-03-09 Thread Georgi Djakov
Currently we support only platform data for specifying the interconnect
endpoints. As the endpoints are now hard-coded into the consumer driver,
this may lead to complications when a single driver is used by multiple
SoCs, which may have different interconnect topologies.
To avoid cluttering the consumer drivers, introduce a translation function
to help us get the board-specific interconnect data from device-tree.

Signed-off-by: Georgi Djakov 
---
 drivers/interconnect/core.c  | 38 ++
 include/linux/interconnect.h |  6 ++
 2 files changed, 44 insertions(+)

diff --git a/drivers/interconnect/core.c b/drivers/interconnect/core.c
index a06f752a6aaa..014993473763 100644
--- a/drivers/interconnect/core.c
+++ b/drivers/interconnect/core.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 static DEFINE_IDR(icc_idr);
@@ -297,6 +298,43 @@ static int constraints_apply(struct icc_path *path)
return 0;
 }
 
+struct icc_path *of_icc_get(struct device *dev, const char *name)
+{
+   struct device_node *np;
+   u32 src_id, dst_id;
+   int index, ret;
+
+   if (!dev || !name)
+   return NULL;
+
+   np = dev->of_node;
+
+   index = of_property_match_string(np, "interconnect-names", name);
+   if (index < 0)
+   return ERR_PTR(index);
+
+   /*
+* We use a combination of phandle and specifier for endpoint. For now
+* let's support only global ids and extend this in the future if needed
+* without breaking DT compatibility.
+*/
+   ret = of_property_read_u32_index(np, "interconnects", index * 4 + 1,
+&src_id);
+   if (ret) {
+   dev_err(dev, "interconnect src port is invalid (%d)\n", ret);
+   return ERR_PTR(ret);
+   }
+   ret = of_property_read_u32_index(np, "interconnects", index * 4 + 3,
+&dst_id);
+   if (ret) {
+   dev_err(dev, "interconnect dst port is invalid (%d)\n", ret);
+   return ERR_PTR(ret);
+   }
+
+   return icc_get(src_id, dst_id);
+}
+EXPORT_SYMBOL_GPL(of_icc_get);
+
 /**
  * icc_set() - set constraints on an interconnect path between two endpoints
  * @path: reference to the path returned by icc_get()
diff --git a/include/linux/interconnect.h b/include/linux/interconnect.h
index 5a7cf72b76a5..996c48ea67d5 100644
--- a/include/linux/interconnect.h
+++ b/include/linux/interconnect.h
@@ -16,6 +16,7 @@ struct device;
 #if IS_ENABLED(CONFIG_INTERCONNECT)
 
 struct icc_path *icc_get(const int src_id, const int dst_id);
+struct icc_path *of_icc_get(struct device *dev, const char *name);
 void icc_put(struct icc_path *path);
 int icc_set(struct icc_path *path, u32 avg_bw, u32 peak_bw);
 
@@ -26,6 +27,11 @@ static inline struct icc_path *icc_get(const int src_id, const int dst_id)
return NULL;
 }
 
+static inline struct icc_path *of_icc_get(struct device *dev, const char *name)
+{
+   return NULL;
+}
+
 static inline void icc_put(struct icc_path *path)
 {
 }


[PATCH v4 5/7] interconnect: qcom: Add msm8916 interconnect provider driver

2018-03-09 Thread Georgi Djakov
Add driver for the Qualcomm interconnect buses found in msm8916 based
platforms.

Signed-off-by: Georgi Djakov 
---
 drivers/interconnect/Kconfig|   5 +
 drivers/interconnect/Makefile   |   1 +
 drivers/interconnect/qcom/Kconfig   |  11 +
 drivers/interconnect/qcom/Makefile  |   2 +
 drivers/interconnect/qcom/msm8916.c | 482 
 include/linux/interconnect/qcom.h   | 350 ++
 6 files changed, 851 insertions(+)
 create mode 100644 drivers/interconnect/qcom/Kconfig
 create mode 100644 drivers/interconnect/qcom/msm8916.c
 create mode 100644 include/linux/interconnect/qcom.h

diff --git a/drivers/interconnect/Kconfig b/drivers/interconnect/Kconfig
index a261c7d41deb..07a8276fa35a 100644
--- a/drivers/interconnect/Kconfig
+++ b/drivers/interconnect/Kconfig
@@ -8,3 +8,8 @@ menuconfig INTERCONNECT
 
  If unsure, say no.
 
+if INTERCONNECT
+
+source "drivers/interconnect/qcom/Kconfig"
+
+endif
diff --git a/drivers/interconnect/Makefile b/drivers/interconnect/Makefile
index 5edf0ae80818..5971b811c2d7 100644
--- a/drivers/interconnect/Makefile
+++ b/drivers/interconnect/Makefile
@@ -1 +1,2 @@
 obj-$(CONFIG_INTERCONNECT) += core.o
+obj-$(CONFIG_INTERCONNECT_QCOM) += qcom/
diff --git a/drivers/interconnect/qcom/Kconfig b/drivers/interconnect/qcom/Kconfig
new file mode 100644
index ..86465dc37bd4
--- /dev/null
+++ b/drivers/interconnect/qcom/Kconfig
@@ -0,0 +1,11 @@
+config INTERCONNECT_QCOM
+   bool "Qualcomm Network-on-Chip interconnect drivers"
+   depends on INTERCONNECT
+   depends on ARCH_QCOM || COMPILE_TEST
+   default y
+
+config INTERCONNECT_QCOM_MSM8916
+   tristate "Qualcomm MSM8916 interconnect driver"
+   depends on INTERCONNECT_QCOM
+   help
+ This is a driver for the Qualcomm Network-on-Chip on msm8916-based platforms.
diff --git a/drivers/interconnect/qcom/Makefile b/drivers/interconnect/qcom/Makefile
index 095bdef1ee6e..a0c13a25e8db 100644
--- a/drivers/interconnect/qcom/Makefile
+++ b/drivers/interconnect/qcom/Makefile
@@ -1 +1,3 @@
 obj-y += smd-rpm.o
+
+obj-$(CONFIG_INTERCONNECT_QCOM_MSM8916) += msm8916.o
diff --git a/drivers/interconnect/qcom/msm8916.c b/drivers/interconnect/qcom/msm8916.c
new file mode 100644
index ..d5b54f8261c8
--- /dev/null
+++ b/drivers/interconnect/qcom/msm8916.c
@@ -0,0 +1,482 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2018 Linaro Ltd
+ * Author: Georgi Djakov 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "smd-rpm.h"
+
+#define RPM_MASTER_FIELD_BW 0x7762
+#define RPM_BUS_MASTER_REQ  0x73616d62
+#define RPM_BUS_SLAVE_REQ   0x766c7362
+
+#define to_qcom_provider(_provider) \
+   container_of(_provider, struct qcom_icc_provider, provider)
+
+#define DEFINE_QNODE(_name, _id, _port, _buswidth, _ap_owned,  \
+   _mas_rpm_id, _slv_rpm_id, _qos_mode,\
+   _numlinks, ...) \
+   static struct qcom_icc_node _name = {   \
+   .id = _id,  \
+   .name = #_name, \
+   .port = _port,  \
+   .buswidth = _buswidth,  \
+   .qos_mode = _qos_mode,  \
+   .ap_owned = _ap_owned,  \
+   .mas_rpm_id = _mas_rpm_id,  \
+   .slv_rpm_id = _slv_rpm_id,  \
+   .num_links = _numlinks, \
+   .links = { __VA_ARGS__ },   \
+   }
+
+enum qcom_qos_mode {
+   QCOM_QOS_MODE_BYPASS = 0,
+   QCOM_QOS_MODE_FIXED,
+   QCOM_QOS_MODE_MAX,
+};
+
+struct qcom_icc_provider {
+   struct icc_provider provider;
+   void __iomem*base;
+   struct clk  *bus_clk;
+   struct clk  *bus_a_clk;
+};
+
+#define MSM8916_MAX_LINKS  8
+
+/**
+ * struct qcom_icc_node - Qualcomm specific interconnect nodes
+ * @name: the node name used in debugfs
+ * @links: an array of nodes where we can go next while traversing
+ * @id: a unique node identifier
+ * @num_links: the total number of @links
+ * @port: the offset index into the masters QoS register space
+ * @buswidth: width of the interconnect between a node and the bus
+ * @ap_owned: the AP CPU does the writing to QoS registers
+ * @rpm: reference to the RPM SMD driver
+ * @qos_mode: QoS mode for ap_owned resources
+ * @mas_rpm_id:RPM id for devices that are bus masters
+ * @slv_rpm_id:RPM id for devices that are bus slaves
+ * @rate: current bus cloc

[PATCH v4 4/7] interconnect: qcom: Add RPM communication

2018-03-09 Thread Georgi Djakov
On some Qualcomm SoCs, there is a remote processor, which controls some of
the Network-On-Chip interconnect resources. Other CPUs express their needs
by communicating with this processor. Add a driver to handle communication
with this remote processor.

Signed-off-by: Georgi Djakov 
---
 .../devicetree/bindings/interconnect/qcom-smd.txt  | 31 
 drivers/interconnect/qcom/Makefile |  1 +
 drivers/interconnect/qcom/smd-rpm.c| 90 ++
 drivers/interconnect/qcom/smd-rpm.h| 15 
 4 files changed, 137 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/interconnect/qcom-smd.txt
 create mode 100644 drivers/interconnect/qcom/Makefile
 create mode 100644 drivers/interconnect/qcom/smd-rpm.c
 create mode 100644 drivers/interconnect/qcom/smd-rpm.h

diff --git a/Documentation/devicetree/bindings/interconnect/qcom-smd.txt b/Documentation/devicetree/bindings/interconnect/qcom-smd.txt
new file mode 100644
index ..14e83ed7019b
--- /dev/null
+++ b/Documentation/devicetree/bindings/interconnect/qcom-smd.txt
@@ -0,0 +1,31 @@
+Qualcomm SMD-RPM interconnect driver binding
+
+The RPM is a dedicated hardware engine for managing the shared
+SoC resources in order to keep the lowest power profile. It
+communicates with other hardware subsystems via shared memory
+and accepts requests for various resources.
+
+Required properties :
+- compatible : shall contain only one of the following:
+   "qcom,interconnect-smd-rpm"
+
+Example:
+   smd {
+   compatible = "qcom,smd";
+
+   rpm {
+   interrupts = <0 168 1>;
+   qcom,ipc = <&apcs 8 0>;
+   qcom,smd-edge = <15>;
+
+   rpm_requests {
+   compatible = "qcom,rpm-msm8916";
+   qcom,smd-channels = "rpm_requests";
+
+   interconnect-smd-rpm {
+   compatible = "qcom,interconnect-smd-rpm";
+   };
+
+   };
+   };
+   };
diff --git a/drivers/interconnect/qcom/Makefile b/drivers/interconnect/qcom/Makefile
new file mode 100644
index ..095bdef1ee6e
--- /dev/null
+++ b/drivers/interconnect/qcom/Makefile
@@ -0,0 +1 @@
+obj-y += smd-rpm.o
diff --git a/drivers/interconnect/qcom/smd-rpm.c b/drivers/interconnect/qcom/smd-rpm.c
new file mode 100644
index ..0cf772f51642
--- /dev/null
+++ b/drivers/interconnect/qcom/smd-rpm.c
@@ -0,0 +1,90 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * RPM over SMD communication wrapper for interconnects
+ *
+ * Copyright (C) 2018 Linaro Ltd
+ * Author: Georgi Djakov 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "smd-rpm.h"
+
+#define RPM_KEY_BW  0x7762
+
+static struct qcom_icc_rpm {
+   struct qcom_smd_rpm *rpm;
+} icc_rpm_smd;
+
+struct icc_rpm_smd_req {
+   __le32 key;
+   __le32 nbytes;
+   __le32 value;
+};
+
+bool qcom_icc_rpm_smd_available(void)
+{
+   if (!icc_rpm_smd.rpm)
+   return false;
+
+   return true;
+}
+
+int qcom_icc_rpm_smd_send(int ctx, int rsc_type, int id, u32 val)
+{
+   struct icc_rpm_smd_req req = {
+   .key = cpu_to_le32(RPM_KEY_BW),
+   .nbytes = cpu_to_le32(sizeof(u32)),
+   .value = cpu_to_le32(val),
+   };
+
+   return qcom_rpm_smd_write(icc_rpm_smd.rpm, ctx, rsc_type, id, &req,
+ sizeof(req));
+}
+EXPORT_SYMBOL(qcom_icc_rpm_smd_send);
+
+static int qcom_icc_rpm_smd_probe(struct platform_device *pdev)
+{
+   icc_rpm_smd.rpm = dev_get_drvdata(pdev->dev.parent);
+   if (!icc_rpm_smd.rpm) {
+   dev_err(&pdev->dev, "unable to retrieve handle to RPM\n");
+   return -ENODEV;
+   }
+
+   return 0;
+}
+
+static const struct of_device_id qcom_icc_rpm_smd_dt_match[] = {
+   { .compatible = "qcom,interconnect-smd-rpm", },
+   { },
+};
+
+MODULE_DEVICE_TABLE(of, qcom_icc_rpm_smd_dt_match);
+
+static struct platform_driver qcom_interconnect_rpm_smd_driver = {
+   .driver = {
+   .name   = "qcom-interconnect-smd-rpm",
+   .of_match_table = qcom_icc_rpm_smd_dt_match,
+   },
+   .probe = qcom_icc_rpm_smd_probe,
+};
+
+static int __init rpm_smd_interconnect_init(void)
+{
+   return platform_driver_register(&qcom_interconnect_rpm_smd_driver);
+}
+subsys_initcall(rpm_smd_interconnect_init);
+
+static void __exit rpm_smd_interconnect_exit(void)
+{
+   platform_driver_unregister(&qcom_interconnect_rpm_smd_driver);
+}
+module_exit(rpm_smd_interconnect_exit);
+
+MODULE_AUTHOR("Georgi Djakov ");
+MODULE_DESCRIPTION("Qualcomm SMD RPM interconnect driver");
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/interco

[PATCH v4 2/7] dt-bindings: Introduce interconnect provider bindings

2018-03-09 Thread Georgi Djakov
This binding is intended to represent the interconnect hardware present
in some of the modern SoCs. Currently it consists only of a binding for
the interconnect hardware devices (provider).

Signed-off-by: Georgi Djakov 
---
 .../bindings/interconnect/interconnect.txt | 47 ++
 1 file changed, 47 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/interconnect/interconnect.txt

diff --git a/Documentation/devicetree/bindings/interconnect/interconnect.txt b/Documentation/devicetree/bindings/interconnect/interconnect.txt
new file mode 100644
index ..70612bb201e4
--- /dev/null
+++ b/Documentation/devicetree/bindings/interconnect/interconnect.txt
@@ -0,0 +1,47 @@
+Interconnect Provider Device Tree Bindings
+==========================================
+
+The purpose of this document is to define a common set of generic interconnect
+providers/consumers properties.
+
+
+= interconnect providers =
+
+The interconnect provider binding is intended to represent the interconnect
+controllers in the system. Each provider registers a set of interconnect
+nodes, which expose the interconnect related capabilities of the interconnect
+to consumer drivers. These capabilities can be throughput, latency, priority
+etc. The consumer drivers set constraints on an interconnect path (or endpoints)
+depending on the use case. Interconnect providers can also be interconnect
+consumers, such as in the case where two network-on-chip fabrics interface
+directly.
+
+Required properties:
+- compatible : contains the interconnect provider vendor specific compatible
+  string
+- reg : register space of the interconnect controller hardware
+
+Examples:
+
+   snoc: snoc@58 {
+   compatible = "qcom,msm8916-snoc";
+   reg = <0x58 0x14000>;
+   clock-names = "bus_clk", "bus_a_clk";
+   clocks = <&rpmcc RPM_SMD_SNOC_CLK>, <&rpmcc RPM_SMD_SNOC_A_CLK>;
+   status = "okay";
+   };
+   bimc: bimc@40 {
+   compatible = "qcom,msm8916-bimc";
+   reg = <0x40 0x62000>;
+   clock-names = "bus_clk", "bus_a_clk";
+   clocks = <&rpmcc RPM_SMD_BIMC_CLK>, <&rpmcc RPM_SMD_BIMC_A_CLK>;
+   status = "okay";
+   };
+   pnoc: pnoc@50 {
+   compatible = "qcom,msm8916-pnoc";
+   reg = <0x50 0x11000>;
+   clock-names = "bus_clk", "bus_a_clk";
+   clocks = <&rpmcc RPM_SMD_PCNOC_CLK>, <&rpmcc RPM_SMD_PCNOC_A_CLK>;
+   status = "okay";
+   };
+


[PATCH v4 3/7] interconnect: Add debugfs support

2018-03-09 Thread Georgi Djakov
Add functionality to report the current constraints for each node and
provider via debugfs.
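
As a rough illustration of what one row of the summary looks like, the sketch
below reuses the "%-30s %12d %12d\n" format string from icc_summary_show_one()
in a plain userspace function. The function name icc_format_row() is
hypothetical and snprintf() stands in for seq_printf().

```c
#include <stdio.h>

/*
 * Format one summary row: the node name left-aligned in 30 columns, then
 * the average and peak bandwidth right-aligned in 12 columns each.
 * Returns the number of characters written, excluding the trailing NUL
 * (same convention as snprintf()).
 */
int icc_format_row(char *buf, size_t len, const char *name,
		   int avg_bw, int peak_bw)
{
	return snprintf(buf, len, "%-30s %12d %12d\n", name, avg_bw, peak_bw);
}
```

Each row is therefore 56 characters wide plus the newline, which is what the
header and separator lines printed by icc_summary_show() need to line up with.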

Signed-off-by: Georgi Djakov 
---
 drivers/interconnect/core.c | 70 +
 1 file changed, 70 insertions(+)

diff --git a/drivers/interconnect/core.c b/drivers/interconnect/core.c
index 6306e258b9b9..a06f752a6aaa 100644
--- a/drivers/interconnect/core.c
+++ b/drivers/interconnect/core.c
@@ -6,6 +6,7 @@
  * Author: Georgi Djakov 
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -15,11 +16,13 @@
 #include 
 #include 
 #include 
+#include 
 
 static DEFINE_IDR(icc_idr);
 static LIST_HEAD(icc_provider_list);
 static DEFINE_MUTEX(icc_provider_list_mutex);
 static DEFINE_MUTEX(icc_path_mutex);
+static struct dentry *icc_debugfs_dir;
 
 /**
  * struct icc_req - constraints that are attached to each node
@@ -46,6 +49,73 @@ struct icc_path {
struct icc_req reqs[0];
 };
 
+#ifdef CONFIG_DEBUG_FS
+
+static void icc_summary_show_one(struct seq_file *s, struct icc_node *n)
+{
+   if (!n)
+   return;
+
+   seq_printf(s, "%-30s %12d %12d\n",
+  n->name, n->avg_bw, n->peak_bw);
+}
+
+static int icc_summary_show(struct seq_file *s, void *data)
+{
+   struct icc_provider *provider;
+
+   seq_puts(s, " node                                   avg         peak\n");
+   seq_puts(s, "--------------------------------------------------------\n");
+
+   mutex_lock(&icc_provider_list_mutex);
+
+   list_for_each_entry(provider, &icc_provider_list, provider_list) {
+   struct icc_node *n;
+
+   mutex_lock(&provider->lock);
+   list_for_each_entry(n, &provider->nodes, node_list) {
+   icc_summary_show_one(s, n);
+   }
+   mutex_unlock(&provider->lock);
+   }
+
+   mutex_unlock(&icc_provider_list_mutex);
+
+   return 0;
+}
+
+static int icc_summary_open(struct inode *inode, struct file *file)
+{
+   return single_open(file, icc_summary_show, inode->i_private);
+}
+
+static const struct file_operations icc_summary_fops = {
+   .open   = icc_summary_open,
+   .read   = seq_read,
+   .llseek = seq_lseek,
+   .release= single_release,
+};
+
+static int __init icc_debugfs_init(void)
+{
+   struct dentry *file;
+
+   icc_debugfs_dir = debugfs_create_dir("interconnect", NULL);
+   if (!icc_debugfs_dir) {
+   pr_err("interconnect: error creating debugfs directory\n");
+   return -ENODEV;
+   }
+
+   file = debugfs_create_file("interconnect_summary", 0444,
+  icc_debugfs_dir, NULL, &icc_summary_fops);
+   if (!file)
+   return -ENODEV;
+
+   return 0;
+}
+late_initcall(icc_debugfs_init);
+#endif
+
 static struct icc_node *node_find(const int id)
 {
struct icc_node *node;


[PATCH v4 0/7] Introduce on-chip interconnect API

2018-03-09 Thread Georgi Djakov
Modern SoCs have multiple processors and various dedicated cores (video, GPU,
graphics, modem). These cores talk to each other and can generate a lot of
data flowing through the on-chip interconnects. These interconnect buses can
form different topologies, such as crossbars, point-to-point buses and
hierarchical buses, or use the network-on-chip concept.

These buses are usually sized to handle use cases with high data throughput,
but that capacity is not needed all the time and consumes a lot of power.
Furthermore, the priority between masters can vary depending on the running
use case, such as video playback or CPU-intensive tasks.

Having an API to express the bandwidth and QoS requirements of the system
lets us adapt the interconnect configuration to match them by scaling the
frequencies, setting link priority and tuning QoS parameters. This
configuration can be a static, one-time operation done at boot for some
platforms or a dynamic set of operations that happen at run-time.

This patchset introduces a new API to gather the requirements and configure
the interconnect buses across the entire chipset to fit the current demand.
The API is NOT for changing the performance of the endpoint devices, but only
the interconnect path in between them.

The API uses a consumer/provider-based model, where the providers are
the interconnect buses and the consumers can be various drivers.
The consumers request interconnect resources (a path) to an endpoint and set
the desired constraints on this data flow path. The provider(s) receive
requests from consumers and aggregate these requests for all master-slave
pairs on that path. The providers then configure each node participating in
the topology according to the requested data flow path, physical links and
constraints. The topology can be complicated and multi-tiered, and is SoC
specific.
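
To make the aggregation step more concrete, here is a small userspace sketch
of how requests from several consumers on one node might be combined. The
policy shown (sum the average bandwidth, take the maximum of the peak
bandwidth) is an assumption for illustration only, and struct bw_req and
bw_aggregate() are hypothetical names rather than part of the proposed API.

```c
#include <stddef.h>
#include <stdint.h>

/* One consumer's bandwidth constraint on a node (hypothetical type). */
struct bw_req {
	uint32_t avg_bw;
	uint32_t peak_bw;
};

/*
 * Combine all requests on a node. Assumed policy: average bandwidth adds
 * up across consumers, while the peak is the largest single peak.
 * An empty request list aggregates to zero for both values.
 */
struct bw_req bw_aggregate(const struct bw_req *reqs, size_t n)
{
	struct bw_req total = { 0, 0 };
	size_t i;

	for (i = 0; i < n; i++) {
		total.avg_bw += reqs[i].avg_bw;
		if (reqs[i].peak_bw > total.peak_bw)
			total.peak_bw = reqs[i].peak_bw;
	}

	return total;
}
```

A provider would then translate the aggregated values into a hardware
setting, for example a bus clock rate, in whatever way fits the platform.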

Below is a simplified diagram of a real-world SoC topology. The interconnect
providers are the NoCs.

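 +----------------+    +----------------+
 | HW Accelerator |--->|      M NoC     |<---------------+
 +----------------+    +----------------+                |
                         |      |                    +------------+
  +-----+  +-------------+      V       +------+     |            |
  | DDR |  |                +--------+  | PCIe |     |            |
  +-----+  |                | Slaves |  +------+     |            |
    ^ ^    |                +--------+     |         |   C NoC    |
    | |    V                               V         |            |
 +------------------+   +------------------------+   |            |   +-----+
 |                  |-->|                        |-->|            |-->| CPU |
 |                  |-->|                        |<--|            |   +-----+
 |     Mem NoC      |   |         S NoC          |   +------------+
 |                  |<--|                        |---------+    |
 |                  |<--|                        |<------+ |    |   +--------+
 +------------------+   +------------------------+       | |    +-->| Slaves |
   ^  ^    ^    ^          ^                             | |        +--------+
   |  |    |    |          |                             | V
 +------+  |  +-----+   +-----+  +---------+   +----------------+   +--------+
 | CPUs |  |  | GPU |   | DSP |  | Masters |-->|       P NoC    |-->| Slaves |
 +------+  |  +-----+   +-----+  +---------+   +----------------+   +--------+
           |
       +-------+
       | Modem |
       +-------+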

TODO:
 * Create icc_set_extended() to handle parameters such as latency and other QoS
   values.
 * Convert from using global node identifiers to local per provider identifiers.
 * Cache the path between the nodes instead of walking the graph on each get().
 * Sync interconnect requests with the idle state of the device.

Changes since patchset v3 (https://lkml.org/lkml/2017/9/8/544)
* Refactored the constraints aggregation.
* Use the IDR API.
* Split the provider and consumer bindings into separate patches and propose
  new bindings for consumers, which allow specifying the local source port.
* Adopted the icc_ prefix for API functions.
* Introduced separate API functions for creating interconnect nodes and links.
* Added DT lookup support in addition to platform data.
* Dropped the event tracing patch for now.
* Added a patch to provide summary via debugfs. 
* Use macro for the list of topology definitions in the platform driver.
* Various minor changes.

Changes since patchset v2 (https://lkml.org/lkml/2017/7/20/825)
* Split the aggregation into per node and per provider. Cache the
  aggregated values.
* Various small refactorings and cleanups in the framework.
* Added a patch introducing basic tracepoint support for monitoring
  the time required to update the interconnect nodes.

Changes since patchset v1 (https://lkml.org/lkml/2017/6/27/890)
* Updates in the documentation.
* Changes in request aggregation, locking.
* Dropped the aggregate() callback and u

[PATCH v4 1/7] interconnect: Add generic on-chip interconnect API

2018-03-09 Thread Georgi Djakov
This patch introduces a new API to gather requirements and configure the
interconnect buses across the entire chipset to fit the current
demand.

The API uses a consumer/provider-based model, where the providers are
the interconnect buses and the consumers can be various drivers.
The consumers request interconnect resources (a path) between endpoints and
set the desired constraints on this data flow path. The providers receive
requests from consumers and aggregate these requests for all master-slave
pairs on that path. The providers then configure each node participating in
the topology according to the requested data flow path, physical links and
constraints. The topology can be complicated and multi-tiered, and is SoC
specific.

Signed-off-by: Georgi Djakov 
---
 Documentation/interconnect/interconnect.rst |  96 ++
 drivers/Kconfig |   2 +
 drivers/Makefile|   1 +
 drivers/interconnect/Kconfig|  10 +
 drivers/interconnect/Makefile   |   1 +
 drivers/interconnect/core.c | 489 
 include/linux/interconnect-provider.h   | 109 +++
 include/linux/interconnect.h|  40 +++
 8 files changed, 748 insertions(+)
 create mode 100644 Documentation/interconnect/interconnect.rst
 create mode 100644 drivers/interconnect/Kconfig
 create mode 100644 drivers/interconnect/Makefile
 create mode 100644 drivers/interconnect/core.c
 create mode 100644 include/linux/interconnect-provider.h
 create mode 100644 include/linux/interconnect.h

diff --git a/Documentation/interconnect/interconnect.rst b/Documentation/interconnect/interconnect.rst
new file mode 100644
index ..23eba68e8424
--- /dev/null
+++ b/Documentation/interconnect/interconnect.rst
@@ -0,0 +1,96 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=====================================
+GENERIC SYSTEM INTERCONNECT SUBSYSTEM
+=====================================
+
+Introduction
+------------
+
+This framework is designed to provide a standard kernel interface to control
+the settings of the interconnects on a SoC. These settings can be throughput,
+latency and priority between multiple interconnected devices or functional
+blocks. This can be controlled dynamically in order to save power or provide
+maximum performance.
+
+The interconnect bus is hardware with configurable parameters, which can be
+set on a data path according to the requests received from various drivers.
+Examples of interconnect buses are the interconnects between various
+components or functional blocks in chipsets. There can be multiple interconnects
+on a SoC that can be multi-tiered.
+
+Below is a simplified diagram of a real-world SoC interconnect bus topology.
+
+::
+
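+ +----------------+    +----------------+
+ | HW Accelerator |--->|      M NoC     |<---------------+
+ +----------------+    +----------------+                |
+                         |      |                    +------------+
+  +-----+  +-------------+      V       +------+     |            |
+  | DDR |  |                +--------+  | PCIe |     |            |
+  +-----+  |                | Slaves |  +------+     |            |
+    ^ ^    |                +--------+     |         |   C NoC    |
+    | |    V                               V         |            |
+ +------------------+   +------------------------+   |            |   +-----+
+ |                  |-->|                        |-->|            |-->| CPU |
+ |                  |-->|                        |<--|            |   +-----+
+ |     Mem NoC      |   |         S NoC          |   +------------+
+ |                  |<--|                        |---------+    |
+ |                  |<--|                        |<------+ |    |   +--------+
+ +------------------+   +------------------------+       | |    +-->| Slaves |
+   ^  ^    ^    ^          ^                             | |        +--------+
+   |  |    |    |          |                             | V
+ +------+  |  +-----+   +-----+  +---------+   +----------------+   +--------+
+ | CPUs |  |  | GPU |   | DSP |  | Masters |-->|       P NoC    |-->| Slaves |
+ +------+  |  +-----+   +-----+  +---------+   +----------------+   +--------+
+           |
+       +-------+
+       | Modem |
+       +-------+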
+
+Terminology
+-----------
+
+Interconnect provider is the software definition of the interconnect hardware.
+The interconnect providers on the above diagram are M NoC, S NoC, C NoC and Mem
+NoC.
+
+Interconnect node is the software definition of the interconnect hardware
+port. Each interconnect provider consists of multiple interconnect nodes,
+which are connected to other SoC components including other interconnect
+providers. The point on the diagram where the CPUs connect to the memory is
+called an interconnect node, which belongs to the Mem NoC interconnect provider.
+
+Interconnect endpoints are the first or the last elemen

94d3a25408: kernel_BUG_at_kernel/fork.c

2018-03-09 Thread kernel test robot
FYI, we noticed the following commit (built with gcc-7):

commit: 94d3a254089a7cd4f11b7071b4323afd98eea0a6 ("Detect early free of a live mm")
url: https://github.com/0day-ci/linux/commits/Mark-Rutland/Detect-early-free-of-a-live-mm/20180303-144149


in testcase: boot

on test machine: qemu-system-x86_64 -enable-kvm -cpu host -smp 2 -m 4G

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):


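+------------------------------------------+-----------+------------+
|                                          | v4.16-rc3 | 94d3a25408 |
+------------------------------------------+-----------+------------+
| boot_successes                           | 18        | 6          |
| boot_failures                            | 0         | 10         |
| kernel_BUG_at_kernel/fork.c              | 0         | 10         |
| invalid_opcode:#[##]                     | 0         | 10         |
| RIP:__mmdrop                             | 0         | 10         |
| Kernel_panic-not_syncing:Fatal_exception | 0         | 10         |
+------------------------------------------+-----------+------------+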


[   47.208935] kernel BUG at kernel/fork.c:599!
[   47.210365] invalid opcode:  [#1] SMP PTI
[   47.211336] Modules linked in:
[   47.212145] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.16.0-rc3-1-g94d3a25 #1
[   47.213966] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[   47.215869] RIP: 0010:__mmdrop+0x136/0x170
[   47.216866] RSP: 0018:82803dd8 EFLAGS: 00010293
[   47.218160] RAX: 82818500 RBX: 88011577 RCX: 810ae876
[   47.219758] RDX:  RSI: 0001 RDI: 88011577
[   47.221306] RBP: 82803e00 R08: 0001 R09: 
[   47.223268] R10:  R11:  R12: 82818500
[   47.224961] R13: 82a8ce20 R14: 88013ff534c0 R15: 03e7
[   47.226716] FS:  () GS:88013b20() 
knlGS:
[   47.228550] CS:  0010 DS:  ES:  CR0: 80050033
[   47.229884] CR2: 7fbfc2cc0190 CR3: 02812000 CR4: 06f0
[   47.231580] Call Trace:
[   47.232144]  idle_task_exit+0x53/0x60
[   47.232947]  play_dead_common+0x9/0x20
[   47.233906]  native_play_dead+0x10/0xed
[   47.234804]  ? cpuhp_report_idle_dead+0x5a/0x70
[   47.236139]  arch_cpu_idle_dead+0xa/0x10
[   47.236954]  do_idle+0x14d/0x1d0
[   47.237834]  cpu_startup_entry+0x6e/0x70
[   47.238735]  rest_init+0xc7/0xd0
[   47.239612]  ? update_intr_gate+0x1b/0x1b
[   47.240516]  start_kernel+0x59f/0x5c2
[   47.241282]  x86_64_start_reservations+0x38/0x3a
[   47.242402]  x86_64_start_kernel+0x72/0x75
[   47.243328]  secondary_startup_64+0xa5/0xb0
[   47.244378] Code: 89 ff e8 06 32 07 00 eb 83 e8 f7 11 0d 00 4c 89 e7 e8 8f 
e7 0c 00 eb ba e8 e8 11 0d 00 0f 0b e8 e1 11 0d 00 0f 0b e8 da 11 0d 00 <0f> 0b 
e8 d3 11 0d 00 48 89 df e8 8b ed 15 00 e9 2e ff ff ff e8 
[   47.248938] RIP: __mmdrop+0x136/0x170 RSP: 82803dd8
[   47.250243] ---[ end trace 0f4bf1066c11d4ef ]---


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k  job-script  # job-script is attached in this email



Thanks,
lkp
#
# Automatically generated file; DO NOT EDIT.
# Linux/x86_64 4.16.0-rc3 Kernel Configuration
#
CONFIG_64BIT=y
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_MMU=y
CONFIG_ARCH_MMAP_RND_BITS_MIN=28
CONFIG_ARCH_MMAP_RND_BITS_MAX=32
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=8
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16
CONFIG_NEED_DMA_MAP_STATE=y
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
CONFIG_ZONE_DMA32=y
CONFIG_AUDIT_ARCH=y
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_X86_64_SMP=y
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_PGTABLE_LEVELS=4
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_EXTABLE_SORT=y
CONFIG_THREAD_INFO_IN_TASK=y

#
# General setup
#
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_CROSS_COMPILE=""
# CONFIG_COMPILE_TEST is not set
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZ

Re: [PATCH] [v2] docs: clarify security-bugs disclosure policy

2018-03-09 Thread Linus Torvalds
On Fri, Mar 9, 2018 at 12:45 PM, Alan Cox  wrote:
>
> If you want to be taken seriously then I think minimum you also need to
> - Give a GPG key for messages to the list

Oh, I don't want to be taken seriously by people who use gpg encrypted email.

It's garbage and should be shunned as such.

I keep quoting this:

   
https://motherboard.vice.com/en_us/article/vvbw9a/even-the-inventor-of-pgp-doesnt-use-pgp

and anybody who thinks pgp encrypted email is fine is a clown.

> - State what security is in place (encryption etc) to protect the list
>   itself

That could be stated, but it's worth noting the other rules.

If you have some long corrupt vendor disclosure period and are worried
about any good guys finding out (the bad guys probably already have
it), we're not the list for you anyway.

Keep your "we'll keep security problems under wraps so that they can
be exploited for a long time" emails to yourself, or send them to
/dev/null.

   Linus


Re: [PATCH v3] input: bcm5974 - Add driver for Apple Magic Trackpad 2

2018-03-09 Thread Henrik Rydberg

Hi Stephan,


I would like to have Touchpad 2 properly supported.
You will find proper prior support for Magic Trackpads in drivers/hid/hid-magicmouse.c.


Henrik



[PATCH] net/9p: avoid -ERESTARTSYS leak to userspace

2018-03-09 Thread Greg Kurz
If it was interrupted by a signal, the 9p client may need to send some
more requests to the server for cleanup before returning to userspace.

To prevent such a last-minute request from being interrupted right away,
the client remembers whether a signal is pending, clears TIF_SIGPENDING,
handles the request and calls recalc_sigpending() before returning.

Unfortunately, if the transmission of this cleanup request fails for any
reason, the transport returns an error and the client propagates it right
away, without calling recalc_sigpending().

This ends up with -ERESTARTSYS from the initially interrupted request
crawling up to syscall exit, with TIF_SIGPENDING cleared by the cleanup
request. The specific signal handling code, which is responsible for
converting -ERESTARTSYS to -EINTR is not called, and userspace receives
the confusing errno value:

open: Unknown error 512 (512)

This is really hard to hit in real life. I discovered the issue while
working on hot-unplug of a virtio-9p-pci device with an instrumented
QEMU that allows controlling request completion.

Both p9_client_zc_rpc() and p9_client_rpc() have this buggy error path.
Their code flow is a bit obscure and the best thing to do would probably
be a full rewrite, to really ensure that this situation of clearing
TIF_SIGPENDING and returning -ERESTARTSYS can never happen.

But given the general lack of interest in the 9p code, I won't risk
breaking more things. So this patch simply fixes the buggy paths in
both functions with a trivial label+goto.
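
The shape of the fix can be modelled in a few lines of userspace C. This is
only a sketch of the control flow, not the 9p code: struct task_model,
rpc_model() and the two example transports are hypothetical, and a plain bool
stands in for TIF_SIGPENDING.

```c
#include <errno.h>
#include <stdbool.h>

/* Stand-in for the task's TIF_SIGPENDING flag (hypothetical type). */
struct task_model {
	bool sig_pending;
};

/* Hypothetical transport hook: returns 0 or a negative errno. */
typedef int (*send_fn)(void);

/* Example transports, used to exercise both paths. */
int send_ok(void)   { return 0; }
int send_fail(void) { return -EIO; }

int rpc_model(struct task_model *t, send_fn send)
{
	bool sigpending = false;
	int err;

	/* Remember and clear the pending signal so the request can run. */
	if (t->sig_pending) {
		sigpending = true;
		t->sig_pending = false;
	}

	err = send();
	if (err < 0)
		goto recalc_sigpending;	/* previously jumped past the restore */

	/* ... wait for and process the response ... */

recalc_sigpending:
	/* Restore the pending state on every exit path. */
	if (sigpending)
		t->sig_pending = true;

	return err;
}
```

With the error path routed through the label, a failing transport can no
longer leave the flag cleared while an error propagates to the caller.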

Thanks to Laurent Dufour for his help and suggestions on how to find
the root cause and how to fix it.

Signed-off-by: Greg Kurz 
---
 net/9p/client.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/9p/client.c b/net/9p/client.c
index b433aff5ff13..e6cae8332e2e 100644
--- a/net/9p/client.c
+++ b/net/9p/client.c
@@ -769,7 +769,7 @@ p9_client_rpc(struct p9_client *c, int8_t type, const char *fmt, ...)
if (err < 0) {
if (err != -ERESTARTSYS && err != -EFAULT)
c->status = Disconnected;
-   goto reterr;
+   goto recalc_sigpending;
}
 again:
/* Wait for the response */
@@ -804,6 +804,7 @@ p9_client_rpc(struct p9_client *c, int8_t type, const char *fmt, ...)
if (req->status == REQ_STATUS_RCVD)
err = 0;
}
+recalc_sigpending:
if (sigpending) {
spin_lock_irqsave(&current->sighand->siglock, flags);
recalc_sigpending();
@@ -867,7 +868,7 @@ static struct p9_req_t *p9_client_zc_rpc(struct p9_client *c, int8_t type,
if (err == -EIO)
c->status = Disconnected;
if (err != -ERESTARTSYS)
-   goto reterr;
+   goto recalc_sigpending;
}
if (req->status == REQ_STATUS_ERROR) {
p9_debug(P9_DEBUG_ERROR, "req_status error %d\n", req->t_err);
@@ -885,6 +886,7 @@ static struct p9_req_t *p9_client_zc_rpc(struct p9_client *c, int8_t type,
if (req->status == REQ_STATUS_RCVD)
err = 0;
}
+recalc_sigpending:
if (sigpending) {
spin_lock_irqsave(&current->sighand->siglock, flags);
recalc_sigpending();



Re: [PATCH v2 2/2] riscv/atomic: Strengthen implementations with fences

2018-03-09 Thread Andrea Parri
On Fri, Mar 09, 2018 at 10:54:27AM -0800, Palmer Dabbelt wrote:
> On Fri, 09 Mar 2018 10:36:44 PST (-0800), parri.and...@gmail.com wrote:

[...]


> >This belongs to the "few style fixes" (in the specific, 80-chars lines)
> >mentioned in the cover letter; I could not resist ;-), but I'll remove
> >them in v3 if you like so.
> 
> No problem, just next time it's a bit easier to not mix the really complicated
> stuff (memory model changes) with the really simple stuff (whitespace changes).

Got it.


> >This proposal relies on the generic definition,
> >
> >   include/linux/atomic.h ,
> >
> >and on the
> >
> >   __atomic_op_acquire()
> >   __atomic_op_release()
> >
> >above to build the acquire/release atomics (except for the xchg,cmpxchg,
> >where the ACQUIRE_BARRIER is inserted conditionally/on success).
> 
> I thought we wanted to use the AQ and RL bits for AMOs, just not for LR/SC
> sequences.  IIRC the AMOs are safe with the current memory model, but I might
> just have some version mismatches in my head.

AMO.aqrl are "safe" w.r.t. the LKMM (as they provide "full-ordering"); OTOH,
AMO.aq and AMO.rl present weaknesses that LKMM (and some kernel developers)
do not "expect".  I was probing this issue in:

  https://marc.info/?l=linux-kernel&m=151930201102853&w=2

(c.f., e.g., test "RISCV-unlock-lock-read-ordering" from that post).

Quoting from the commit message of my patch 1/2:

  "Referring to the "unlock-lock-read-ordering" test reported below,
   Daniel wrote:

 I think an RCpc interpretation of .aq and .rl would in fact
 allow the two normal loads in P1 to be reordered [...]

 [...]

 Likewise even if the unlock()/lock() is between two stores.
 A control dependency might originate from the load part of
 the amoswap.w.aq, but there still would have to be something
 to ensure that this load part in fact performs after the store
 part of the amoswap.w.rl performs globally, and that's not
 automatic under RCpc.

   Simulation of the RISC-V memory consistency model confirmed this
   expectation."

I have just (re)checked these observations against the latest specification,
and my results _confirmed_ these verdicts.

  Andrea


Re: [PATCH 1/6] reset: qcom: AOSS (Always on subsystem) reset controller

2018-03-09 Thread Trilok Soni

Sibi,

One cosmetic comment below.

On 3/9/2018 6:55 AM, Sibi S wrote:

+
+This binding describes a reset-controller found on AOSS (Always on SubSysem)

+for Qualcomm SDM845 SoCs.


s/SubSysem/Subsystem/

---Trilok Soni




[PATCH v3 4/5] drm/dp_mst: Add drm_atomic_dp_mst_retrain_topology()

2018-03-09 Thread Lyude Paul
Retraining MST is rather difficult. In order to do it properly while
guaranteeing that we'll never run into a spot where we commit a
physically impossible configuration, we have to do a lot of checks on
atomic commits which affect MST topologies. All of this work is going to
need to be repeated for every driver at some point, so let's save
ourselves some trouble and just implement these atomic checks as
a single helper.
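
One of the checks involved is a recursive walk over the branch devices and
their ports, as drm_atomic_dp_mst_all_mstbs_disabled() does in the diff
below. The following userspace sketch shows only the shape of that walk; the
types mst_branch_model and mst_port_model and the has_crtc flag are
hypothetical simplifications, with no reference counting or atomic state.

```c
#include <stdbool.h>
#include <stddef.h>

struct mst_port_model;

/* A branch device with an array of ports (hypothetical model). */
struct mst_branch_model {
	struct mst_port_model *ports;
	size_t num_ports;
};

/* A port may lead to a downstream branch and may drive a CRTC. */
struct mst_port_model {
	struct mst_branch_model *child;	/* downstream branch, or NULL */
	bool has_crtc;			/* a CRTC is assigned to this port */
};

/*
 * Return true only if no port anywhere below this branch has a CRTC
 * assigned, descending into downstream branches first.
 */
bool mst_all_disabled(const struct mst_branch_model *b)
{
	size_t i;

	for (i = 0; i < b->num_ports; i++) {
		const struct mst_port_model *p = &b->ports[i];

		if (p->child && !mst_all_disabled(p->child))
			return false;
		if (p->has_crtc)
			return false;
	}

	return true;
}
```

The real helper additionally has to take a validated reference on each branch
and can fail while fetching connector state, which is why it returns an int
rather than a bool.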

Signed-off-by: Lyude Paul 
Cc: Manasi Navare 
Cc: Ville Syrjälä 
---
 drivers/gpu/drm/drm_dp_mst_topology.c | 223 ++
 include/drm/drm_dp_mst_helper.h   |   2 +
 2 files changed, 225 insertions(+)

diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c
index 0d6604500b29..c4a91b1ba61b 100644
--- a/drivers/gpu/drm/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/drm_dp_mst_topology.c
@@ -2167,6 +2167,229 @@ int drm_dp_mst_topology_mgr_lower_link_rate(struct drm_dp_mst_topology_mgr *mgr,
 }
 EXPORT_SYMBOL(drm_dp_mst_topology_mgr_lower_link_rate);
 
+static bool drm_atomic_dp_mst_state_only_disables_mstbs(struct drm_atomic_state *state,
+   struct drm_dp_mst_topology_mgr *mgr,
+   struct drm_dp_mst_branch *mstb)
+{
+   struct drm_dp_mst_branch *rmstb;
+   struct drm_dp_mst_port *port;
+   struct drm_connector *connector;
+   struct drm_connector_state *conn_state;
+   struct drm_crtc *crtc;
+   struct drm_crtc_state *crtc_state;
+   int ret;
+
+   list_for_each_entry(port, &mstb->ports, next) {
+   rmstb = drm_dp_get_validated_mstb_ref(mstb->mgr, port->mstb);
+   if (rmstb) {
+   ret = drm_atomic_dp_mst_state_only_disables_mstbs(
+   state, mgr, rmstb);
+   drm_dp_put_mst_branch_device(rmstb);
+   if (!ret)
+   return false;
+   }
+
+   connector = port->connector;
+   if (!connector)
+   continue;
+
+   conn_state = drm_atomic_get_new_connector_state(
+   state, connector);
+   if (!conn_state)
+   continue;
+
+   crtc = conn_state->crtc;
+   if (!crtc)
+   continue;
+
+   crtc_state = drm_atomic_get_new_crtc_state(state, crtc);
+   if (!crtc_state)
+   continue;
+
+   if (drm_atomic_crtc_needs_modeset(crtc_state))
+   return false;
+   }
+
+   return true;
+}
+
+static int drm_atomic_dp_mst_all_mstbs_disabled(struct drm_atomic_state *state,
+   struct drm_dp_mst_topology_mgr *mgr,
+   struct drm_dp_mst_branch *mstb)
+{
+   struct drm_dp_mst_branch *rmstb;
+   struct drm_dp_mst_port *port;
+   struct drm_connector *connector;
+   struct drm_connector_state *conn_state;
+   int ret;
+
+   list_for_each_entry(port, &mstb->ports, next) {
+   rmstb = drm_dp_get_validated_mstb_ref(mstb->mgr, port->mstb);
+   if (rmstb) {
+   ret = drm_atomic_dp_mst_all_mstbs_disabled(
+   state, mgr, rmstb);
+   drm_dp_put_mst_branch_device(rmstb);
+   if (ret <= 0)
+   return ret;
+   }
+
+   connector = port->connector;
+   if (!connector)
+   continue;
+
+   conn_state = drm_atomic_get_connector_state(
+   state, connector);
+   if (IS_ERR(conn_state))
+   return PTR_ERR(conn_state);
+
+   if (conn_state->crtc)
+   return false;
+   }
+
+   /* No enabled CRTCs found */
+   return true;
+}
+
+static int drm_atomic_dp_mst_retrain_mstb(struct drm_atomic_state *state,
+ struct drm_dp_mst_topology_mgr *mgr,
+ struct drm_dp_mst_branch *mstb,
+ bool full_modeset)
+{
+   struct drm_dp_mst_branch *rmstb;
+   struct drm_dp_mst_port *port;
+   struct drm_connector *connector;
+   struct drm_connector_state *conn_state;
+   struct drm_crtc *crtc;
+   struct drm_crtc_state *crtc_state;
+   int ret;
+
+   list_for_each_entry(port, &mstb->ports, next) {
+   rmstb = drm_dp_get_validated_mstb_ref(mstb->mgr, port->mstb);
+   if (rmstb) {
+   ret = drm_atomic_dp_mst_retrain_mstb(
+   state, mgr, rmstb, full_modeset);
+   drm_dp_put_mst_branch_device(rmstb);
+   if (ret)
+  
