Re: [PATCH net RESEND] PCI: fix oops when try to find Root Port for a PCI device

2017-08-16 Thread Michael Ellerman
Ding Tianhong  writes:

> Eric reported an oops when booting the system after applying
> the commit a99b646afa8a ("PCI: Disable PCIe Relaxed..."):

I'm seeing a similar oops on powerpc:

[0.177242] pci_bus 0015:70: root bus resource [bus 70-ff]
[0.178012] Unable to handle kernel paging request for data at address 0x0050
[0.178017] Faulting instruction address: 0xc05f84b4
[0.178022] Oops: Kernel access of bad area, sig: 11 [#1]
[0.178024] SMP NR_CPUS=2048 
[0.178025] NUMA 
[0.178028] pSeries
[0.178031] Modules linked in:
[0.178036] CPU: 0 PID: 1 Comm: swapper/0 Tainted: GW   4.13.0-rc4-gcc-6.3.1-00167-ga99b646afa8a #407
[0.178040] task: c003f740 task.stack: c003f748
[0.178043] NIP: c05f84b4 LR: c05f5ccc CTR: 
[0.178046] REGS: c003f74836d0 TRAP: 0380   Tainted: GW (4.13.0-rc4-gcc-6.3.1-00167-ga99b646afa8a)
[0.178050] MSR: 82009033 
[0.178057]   CR: 48000842  XER: 200f
[0.178061] CFAR: c05f840c SOFTE: 1 
[0.178061] GPR00: c05f5cb4 c003f7483950 c0fa 
 
[0.178061] GPR04: 0001 0028 c003f7483820 
f0ff6360 
[0.178061] GPR08: 0003fe2f  c003f5759000 
02001001 
[0.178061] GPR12: 0010 cfd8 c000db08 
 
[0.178061] GPR16:    
 
[0.178061] GPR20:    
 
[0.178061] GPR24:  c0c5f680 c003f756b678 
c003f5759000 
[0.178061] GPR28: 0030 c003f756b098 c003f5759000 
c003f756b000 
[0.178110] NIP [c05f84b4] pci_find_pcie_root_port+0xb4/0xd0
[0.178114] LR [c05f5ccc] pci_device_add+0x32c/0x470
[0.178117] Call Trace:
[0.178120] [c003f7483950] [c05f5cb4] pci_device_add+0x314/0x470 (unreliable)
[0.178126] [c003f74839f0] [c005b85c] of_create_pci_dev+0x35c/0x400
[0.178130] [c003f7483ab0] [c005ba14] __of_scan_bus+0x114/0x1e0
[0.178135] [c003f7483b20] [c0059a9c] pcibios_scan_phb+0x23c/0x270
[0.178140] [c003f7483bc0] [c0d8057c] pcibios_init+0x84/0xdc
[0.178144] [c003f7483c40] [c000d680] do_one_initcall+0x60/0x1c0
[0.178149] [c003f7483d00] [c0d74454] kernel_init_freeable+0x2c4/0x3a0
[0.178153] [c003f7483dc0] [c000db24] kernel_init+0x24/0x150
[0.178158] [c003f7483e30] [c000bc28] ret_from_kernel_thread+0x5c/0xb4

...


And the patch below fixes it. Thanks.

cheers

> == cut here =
>
> It looks like pci_find_pcie_root_port() was trying to
> find the Root Port for a PCI device which is already the Root
> Port; the lookup returns NULL and triggers the problem,
> so check highest_pcie_bridge to fix this problem.
>
> Fixes: a99b646afa8a ("PCI: Disable PCIe Relaxed Ordering if unsupported")
> Reported-by: Eric Dumazet 
> Signed-off-by: Eric Dumazet 
> Signed-off-by: Ding Tianhong 
> ---
>  drivers/pci/pci.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index af0cc34..7e2022f 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -522,7 +522,8 @@ struct pci_dev *pci_find_pcie_root_port(struct pci_dev *dev)
>   bridge = pci_upstream_bridge(bridge);
>   }
>  
> - if (pci_pcie_type(highest_pcie_bridge) != PCI_EXP_TYPE_ROOT_PORT)
> + if (highest_pcie_bridge &&
> + pci_pcie_type(highest_pcie_bridge) != PCI_EXP_TYPE_ROOT_PORT)
>   return NULL;
>  
>   return highest_pcie_bridge;
> -- 
> 1.8.3.1
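
The effect of the added NULL check can be modelled in userspace. This is only a rough sketch: struct model_dev, the port_type values and model_find_root_port() are hypothetical stand-ins, not the kernel API, and the loop shape is an assumption based on the two context lines visible in the hunk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel types; not the real API. */
enum port_type { TYPE_ENDPOINT, TYPE_ROOT_PORT, TYPE_SWITCH_PORT };

struct model_dev {
	struct model_dev *upstream;	/* NULL when there is no upstream bridge */
	int is_pcie;			/* non-zero for a PCIe device */
	enum port_type type;
};

/* Walk upstream and return the topmost PCIe bridge if it is a Root Port. */
static struct model_dev *model_find_root_port(struct model_dev *dev)
{
	struct model_dev *highest = NULL;
	struct model_dev *bridge;

	assert(dev != NULL);

	bridge = dev->upstream;
	while (bridge && bridge->is_pcie) {
		highest = bridge;
		bridge = bridge->upstream;
	}

	/* Patched form: only look at the type when a bridge was found. */
	if (highest && highest->type != TYPE_ROOT_PORT)
		return NULL;

	/* NULL here means dev had no PCIe bridge above it at all. */
	return highest;
}
```

With a device that has no upstream bridge, the model returns NULL instead of inspecting a NULL pointer, which is the case the oops above appears to hit.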


Re: [RFC PATCH v5 0/5] vfio-pci: Add support for mmapping MSI-X table

2017-08-16 Thread Benjamin Herrenschmidt
On Wed, 2017-08-16 at 10:56 -0600, Alex Williamson wrote:
> 
> > WTF  Alex, can you stop once and for all with all that "POWER is
> > not standard" bullshit please ? It's completely wrong.
> 
> As you've stated, the MSI-X vector table on POWER is currently updated
> via a hypercall.  POWER is overall PCI compliant (I assume), but the
> guest does not directly modify the vector table in MMIO space of the
> device.  This is important...

Well no. On qemu the guest doesn't always (but it can save/restore it),
but on PowerVM this is done by the FW running inside the partition
itself. And that firmware just does normal stores to the device table.

IE. The problem here isn't so much who does the actual stores to the
device table but where they get the address and data values from, which
isn't covered by the spec.
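
For reference, the "address and data values" in question are the fields of one MSI-X table entry. A minimal sketch of that 16-byte layout follows; struct msix_entry_model and msix_entry_fill() are hypothetical names used only for illustration, and a real driver would write the table through MMIO accessors rather than a plain struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One MSI-X table entry as laid out by the PCI specification (16 bytes). */
struct msix_entry_model {
	uint32_t msg_addr_lo;	/* Message Address, lower 32 bits */
	uint32_t msg_addr_hi;	/* Message Upper Address */
	uint32_t msg_data;	/* Message Data */
	uint32_t vector_ctrl;	/* Vector Control; bit 0 masks the vector */
};

/*
 * Hypothetical helper: fill an entry from an address/data pair. Where the
 * pair comes from (firmware, hypervisor, OS allocator) is platform policy.
 */
static void msix_entry_fill(struct msix_entry_model *e, uint64_t addr,
			    uint32_t data)
{
	assert(e != NULL);
	e->msg_addr_lo = (uint32_t)(addr & 0xffffffffu);
	e->msg_addr_hi = (uint32_t)(addr >> 32);
	e->msg_data = data;
	e->vector_ctrl = 0;	/* leave the vector unmasked */
}
```

The layout fixes where the values go; it says nothing about how the pair passed to the helper is obtained.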

The added fact that qemu hijacks the stores not just to "remap" them
but also to do the whole requesting of the interrupt etc... in the host
system is a qemu design choice which also hasn't any relation to the
spec (and arguably isn't a great choice for our systems).

For example, on PowerVM, the HV assigns a pile of MSIs to the guest to
assign to its devices. The FW inside the guest does a default
assignment but that can be changed.

Thus the interrupts are effectively "hooked up" at the HV level at the
point where the PCI bridge is mapped into the guest.

> > This has nothing to do with PCIe standard !
> 
> Yes, it actually does, because if the guest relies on the vector table
> to be virtualized then it doesn't particularly matter whether the
> vfio-pci kernel driver allows that portion of device MMIO space to be
> directly accessed or mapped because QEMU needs for it to be trapped in
> order to provide that virtualization.

And this has nothing to do with the PCIe standard... this has
everything to do with a combination of qemu design choices and
deficient FW interfaces on x86 platforms.

> I'm not knocking POWER, it's a smart thing for virtualization to have
> defined this hypercall which negates the need for vector table
> virtualization and allows efficient mapping of the device.  On other
> platforms, it's not necessarily practical given the broad base of legacy
> guests supported where we'd never get agreement to implement this as
> part of the platform spec... if there even was such a thing.  Maybe we
> could provide the hypercall and dynamically enable direct vector table
> mapping (disabling vector table virtualization) only if the hypercall
> is used.

No I think a better approach would be to provide the guest with a pile
of MSIs to use with devices and have FW (such as ACPI) tell the guest
about them.

> > The PCIe standard says strictly *nothing* whatsoever about how an OS
> > obtains the magic address/values to put in the device and how the PCIe
> > host bridge may do appropriate filtering.
> 
> And now we've jumped the tracks...  The only way the platform specific
> address/data values become important is if we allow direct access to
> the vector table AND now we're formulating how the user/guest might
> write to it directly.  Otherwise the virtualization of the vector
> table, or paravirtualization via hypercall provides the translation
> where the host and guest address/data pairs can operate in completely
> different address spaces.

They can regardless if things are done properly :-)

> > There is nothing on POWER that prevents the guest from writing the MSI-
> > X address/data by hand. The problem isn't who writes the values or even
> > how. The problem breaks down into these two things that are NOT covered
> > by any aspect of the PCIe standard:
> 
> You've moved on to a different problem, I think everyone aside from
> POWER is still back at the problem where who writes the vector table
> values is a forefront problem.
>  
> >   1- The OS needs to obtain address/data values for an MSI that will
> > "work" for the device.
> > 
> >   2- The HW+HV needs to prevent collateral damage caused by a device
> > issuing stores to incorrect address or with incorrect data. Now *this*
> > is necessary for *ANY* kind of DMA whether it's an MSI or something
> > else anyway.
> > 
> > Now, the filtering done by qemu is NOT a reasonable way to handle 2)
> > and whatever excuse about "making it harder" doesn't fly a meter when
> > it comes to security. Making it "harder to break accidentally" I also
> > don't buy, people don't just randomly put things in their MSI-X tables
> > "accidentally", that stuff works or doesn't.
> 
> As I said before, I'm not willing to preserve the weak attributes that
> blocking direct vector table access provides over pursuing a more
> performant interface, but I also don't think their value is absolute
> zero either.
> 
> > That leaves us with 1). Now this is purely a platform specific matter,
> > not a spec matter. Once the HW has a way to enforce you can only
> > generate "allowed" MSIs it becomes a matter of having some FW mechanism
> > that can be used to 

Re: [BUG][bisected 270065e] linux-next fails to boot on powerpc

2017-08-16 Thread Michael Ellerman
Bart Van Assche  writes:

> On Wed, 2017-08-16 at 22:30 +0530, Abdul Haleem wrote:
>> As of next-20170809, linux-next on powerpc boot hung with below trace
>> message.
>> 
>> [ ... ]
>> 
>> A bisection resulted in first bad commit (270065e92 - scsi: scsi-mq:
>> Always unprepare ...) in the merge branch 'scsi/for-next'
>> 
>> System booted fine when the below commit is reverted: 
>> 
>> commit 270065e92c317845d69095ec8e3d18616b5b39d5
>> Author: Bart Van Assche 
>> Date:   Thu Aug 3 14:40:14 2017 -0700
>> 
>> scsi: scsi-mq: Always unprepare before requeuing a request
>
> Hello Brian and Michael,
>
> Do you agree that this probably indicates a bug in the PowerPC block driver
> that is used to access the boot disk?

I don't know a scsi device from a block device, so I'm not much help sorry.

It seems likely it is a powerpc specific bug, as it seems no one else
has reported any problems with this commit.

> Anyway, since a solution is not yet available, I will submit a revert
> for this patch.

Thanks. Sorry I haven't been able to debug it further, there's about 10
things on fire right now - ie. situation normal :)

cheers


[PATCH 1/1] selftests/powerpc: Improve tm-resched-dscr

2017-08-16 Thread Sam Bobroff
The tm-resched-dscr self test can, in some situations, run for
several minutes before being successfully interrupted by the context
switch it needs in order to perform the test. This often seems to
occur when the test is being run in a virtual machine.

Improve the test by running it under eat_cpu() to guarantee
contention for the CPU and increase the chance of a context switch.

In practice this seems to reduce the test time, in some cases, from
more than two minutes to under a second.

Also remove the "progress dots" so that if the test does run for a
long time, it doesn't produce large amounts of unnecessary output.

Signed-off-by: Sam Bobroff 
---
 tools/testing/selftests/powerpc/tm/Makefile  |  1 +
 tools/testing/selftests/powerpc/tm/tm-resched-dscr.c | 12 
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/powerpc/tm/Makefile b/tools/testing/selftests/powerpc/tm/Makefile
index 958c11c14acd..7bfcd454fb2a 100644
--- a/tools/testing/selftests/powerpc/tm/Makefile
+++ b/tools/testing/selftests/powerpc/tm/Makefile
@@ -15,6 +15,7 @@ $(OUTPUT)/tm-syscall: tm-syscall-asm.S
 $(OUTPUT)/tm-syscall: CFLAGS += -I../../../../../usr/include
 $(OUTPUT)/tm-tmspr: CFLAGS += -pthread
 $(OUTPUT)/tm-vmx-unavail: CFLAGS += -pthread -m64
+$(OUTPUT)/tm-resched-dscr: ../pmu/lib.o
 
 SIGNAL_CONTEXT_CHK_TESTS := $(patsubst %,$(OUTPUT)/%,$(SIGNAL_CONTEXT_CHK_TESTS))
 $(SIGNAL_CONTEXT_CHK_TESTS): tm-signal.S
diff --git a/tools/testing/selftests/powerpc/tm/tm-resched-dscr.c b/tools/testing/selftests/powerpc/tm/tm-resched-dscr.c
index e79ccd6aada1..a7ac2e4c60d9 100644
--- a/tools/testing/selftests/powerpc/tm/tm-resched-dscr.c
+++ b/tools/testing/selftests/powerpc/tm/tm-resched-dscr.c
@@ -30,6 +30,7 @@
 
 #include "utils.h"
 #include "tm.h"
+#include "../pmu/lib.h"
 
 #define SPRN_DSCR   0x03
 
@@ -75,8 +76,6 @@ int test_body(void)
);
assert(rv); /* make sure the transaction aborted */
if ((texasr >> 56) != TM_CAUSE_RESCHED) {
-   putchar('.');
-   fflush(stdout);
continue;
}
if (dscr2 != dscr1) {
@@ -89,7 +88,12 @@ int test_body(void)
}
 }
 
-int main(void)
+static int tm_resched_dscr(void)
 {
-   return test_harness(test_body, "tm_resched_dscr");
+   return eat_cpu(test_body);
+}
+
+int main(int argc, const char *argv[])
+{
+   return test_harness(tm_resched_dscr, "tm_resched_dscr");
 }
-- 
2.14.1.2.g4274c698f



Re: [BUG][bisected 270065e] linux-next fails to boot on powerpc

2017-08-16 Thread Brian King
On 08/16/2017 12:21 PM, Bart Van Assche wrote:
> On Wed, 2017-08-16 at 22:30 +0530, Abdul Haleem wrote:
>> As of next-20170809, linux-next on powerpc boot hung with below trace
>> message.
>>
>> [ ... ]
>>
>> A bisection resulted in first bad commit (270065e92 - scsi: scsi-mq:
>> Always unprepare ...) in the merge branch 'scsi/for-next'
>>
>> System booted fine when the below commit is reverted: 
>>
>> commit 270065e92c317845d69095ec8e3d18616b5b39d5
>> Author: Bart Van Assche 
>> Date:   Thu Aug 3 14:40:14 2017 -0700
>>
>> scsi: scsi-mq: Always unprepare before requeuing a request
> 
> Hello Brian and Michael,
> 
> Do you agree that this probably indicates a bug in the PowerPC block driver
> that is used to access the boot disk? Anyway, since a solution is not yet
> available, I will submit a revert for this patch.

I've been looking at this a bit and can recreate the issue, but haven't
found the root cause yet. If I do a sysrq-w while the system is hung
during boot I see this:

[   25.561523] Workqueue: events_unbound async_run_entry_fn
[   25.561527] Call Trace:
[   25.561529] [c001697873f0] [c00169701600] 0xc00169701600 (unreliable)
[   25.561534] [c001697875c0] [c001ab78] __switch_to+0x2e8/0x430
[   25.561539] [c00169787620] [c091ccb0] __schedule+0x310/0xa00
[   25.561543] [c001697876f0] [c091d3e0] schedule+0x40/0xb0
[   25.561548] [c00169787720] [c0921e40] schedule_timeout+0x200/0x430
[   25.561553] [c00169787810] [c091db10] io_schedule_timeout+0x30/0x70
[   25.561558] [c00169787840] [c091e978] wait_for_common_io.constprop.3+0x178/0x280
[   25.561563] [c001697878c0] [c047f7ec] blk_execute_rq+0x7c/0xd0
[   25.561567] [c00169787910] [c0614cd0] scsi_execute+0x100/0x230
[   25.561572] [c00169787990] [c060d29c] scsi_report_opcode+0xbc/0x170
[   25.561577] [c00169787a50] [d4fe6404] sd_revalidate_disk+0xe04/0x1620 [sd_mod]
[   25.561583] [c00169787b80] [d4fe6d84] sd_probe_async+0xb4/0x230 [sd_mod]
[   25.561588] [c00169787c00] [c010fc44] async_run_entry_fn+0x74/0x210
[   25.561593] [c00169787c90] [c0102f48] process_one_work+0x198/0x480
[   25.561598] [c00169787d30] [c01032b8] worker_thread+0x88/0x510
[   25.561603] [c00169787dc0] [c010b030] kthread+0x160/0x1a0
[   25.561608] [c00169787e30] [c000b3a4] ret_from_kernel_thread+0x5c/0xb8

I was noticing that we are commonly in scsi_report_opcode. Since ipr RAID
arrays don't support the MAINTENANCE_IN / MI_REPORT_SUPPORTED_OPERATION_CODES,
I tried setting sdev->no_report_opcodes = 1 in ipr's slave configure. This
seems to eliminate the boot hang for me, but is only working around the
issue. Since this command is not supported by ipr, it should return with an
illegal request. When I'm hung at this point, there is nothing outstanding
to the adapter / driver. I'll continue debugging...

-Brian 

-- 
Brian King
Power Linux I/O
IBM Linux Technology Center



Re: [PATCH v2 1/1] rtc: rtctest: Improve support detection

2017-08-16 Thread Shuah Khan
On 08/15/2017 02:46 AM, Lukáš Doktor wrote:
> The rtc-generic and opal-rtc are failing to run this test as they do not
> support all the features. Let's treat the error returns and skip to the
> following test.
> 
> Theoretically the test_DATE should be also adjusted, but as it's enabled
> on demand I think it makes sense to fail in such case.
> 
> Signed-off-by: Lukáš Doktor 
> ---
>  tools/testing/selftests/timers/rtctest.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/testing/selftests/timers/rtctest.c b/tools/testing/selftests/timers/rtctest.c
> index f61170f..411eff6 100644
> --- a/tools/testing/selftests/timers/rtctest.c
> +++ b/tools/testing/selftests/timers/rtctest.c
> @@ -221,6 +221,11 @@ int main(int argc, char **argv)
>   /* Read the current alarm settings */
>   retval = ioctl(fd, RTC_ALM_READ, &rtc_tm);
>   if (retval == -1) {
> + if (errno == EINVAL) {
> + fprintf(stderr,
> + "\n...EINVAL reading current alarm setting.\n");
> + goto test_PIE;
> + }
>   perror("RTC_ALM_READ ioctl");
>   exit(errno);
>   }
> @@ -231,7 +236,7 @@ int main(int argc, char **argv)
>   /* Enable alarm interrupts */
>   retval = ioctl(fd, RTC_AIE_ON, 0);
>   if (retval == -1) {
> - if (errno == EINVAL) {
> + if (errno == EINVAL || errno == EIO) {
>   fprintf(stderr,
>   "\n...Alarm IRQs not supported.\n");
>   goto test_PIE;
> 

Applied to linux-kselftest next for 4.14-rc1

thanks,
-- Shuah

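
The skip-versus-fail decision the patch makes for RTC_AIE_ON can be sketched as a small helper. This is an illustration only: enum rtc_outcome and classify_alarm_enable() are hypothetical names and do not appear in rtctest.c, which uses goto test_PIE and exit(errno) inline.

```c
#include <assert.h>	/* only needed for the checks in the usage note */
#include <errno.h>

/* Hypothetical outcome of one ioctl step in the test. */
enum rtc_outcome { RTC_OK, RTC_SKIP, RTC_FAIL };

/*
 * Classify the result of enabling alarm interrupts: EINVAL and EIO are
 * treated as "not supported by this RTC" (skip), anything else is fatal.
 */
static enum rtc_outcome classify_alarm_enable(int retval, int err)
{
	if (retval != -1)
		return RTC_OK;
	if (err == EINVAL || err == EIO)
		return RTC_SKIP;
	return RTC_FAIL;
}
```

For example, assert(classify_alarm_enable(-1, EIO) == RTC_SKIP) holds with the patched condition, while an unrelated error such as EBADF still ends up as RTC_FAIL.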

Re: [PATCH v6 01/17] powerpc/vas: Define macros, register fields and structures

2017-08-16 Thread Sukadev Bhattiprolu
Michael Ellerman [m...@ellerman.id.au] wrote:
> Sukadev Bhattiprolu  writes:
> 
> > Nicholas Piggin [npig...@gmail.com] wrote:
> >> On Mon, 14 Aug 2017 15:21:48 +1000
> >> Michael Ellerman  wrote:
> >> 
> >> > Sukadev Bhattiprolu  writes:
> >> 
> >> > >  arch/powerpc/include/asm/vas.h   |  35 
> >> > >  arch/powerpc/include/uapi/asm/vas.h  |  25 +++  
> >> > 
> >> > I thought we weren't exposing VAS to userspace yet?
> >> > 
> >> > If we are then we need to get things straight WRT copy/paste abort.
> ...
> >
> > In the FTW case, there is no data transfer from user space to the hardware.

Sorry, that was focussed on the paste side.

> > i.e the copy/paste submit a NULL CRB and hardware will be configured (see
> > ->fifo_disable setting in winctx) to ignore any data they specify in the CRB.
> 
> I thought the copy did copy a cacheline, but then the paste to the VAS
> window just ignores the contents, and doesn't allow userspace to get the
> content in any way?

Yes, you are right. The copy instruction does read the CRB into its copy-
buffer but for the FTW, VAS ignores the copy-buffer contents on paste.
So, the CRB may be zeroed, but must be a valid buffer.

> 
> Which means we have two thirds of a covert channel, ie. something can be
> copied into the copy buffer by one process, and then a second process
> can paste it, but because it can only paste to foreign memory, and the
> only foreign memory it can get is a VAS FTW window, it can't actually
> see the content of the copy buffer.
> 
> > Would we be able to allow copy/paste from user space in that case?
> 
> Yeah I think so, but it is all a bit fragile.
> 
> cheers



Re: [BUG][bisected 270065e] linux-next fails to boot on powerpc

2017-08-16 Thread Bart Van Assche
On Wed, 2017-08-16 at 22:30 +0530, Abdul Haleem wrote:
> As of next-20170809, linux-next on powerpc boot hung with below trace
> message.
> [ ... ]
> System booted fine when the below commit is reverted: 

Hello Abdul,

Can you check whether applying the following commit on top of next-20170809
fixes this regression:

https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git/commit/?h=4.13/scsi-fixes&id=b0e17a9b0df29590c45dfb296f541270a5941f41

Thanks,

Bart.

Re: WARNING: CPU: 15 PID: 0 at block/blk-mq.c:1111 __blk_mq_run_hw_queue+0x1d8/0x1f0

2017-08-16 Thread Brian King
On 08/16/2017 01:15 PM, Bart Van Assche wrote:
> On Wed, 2017-08-16 at 23:37 +0530, Abdul Haleem wrote:
>> Linux-next booted with the below warnings on powerpc
>>
>> [ ... ]
>>
>> boot warnings:
>> --
>> kvm: exiting hardware virtualization
>> [ cut here ]
>> WARNING: CPU: 15 PID: 0 at block/blk-mq.c: __blk_mq_run_hw_queue
>> +0x1d8/0x1f0
>> Modules linked in: iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4
>> Call Trace:
>> [c0037990] [c088f7b0] __blk_mq_delay_run_hw_queue
>> +0x1f0/0x210
>> [c00379d0] [c088fcb8] blk_mq_start_hw_queue+0x58/0x80
>> [c00379f0] [c088fd40] blk_mq_start_hw_queues+0x60/0xb0
>> [c0037a30] [c0ae2b54] scsi_kick_queue+0x34/0xa0
>> [c0037a50] [c0ae2f70] scsi_run_queue+0x3b0/0x660
>> [c0037ac0] [c0ae7ed4] scsi_run_host_queues+0x64/0xc0
>> [c0037b00] [c0ae7f64] scsi_unblock_requests+0x34/0x60
>> [c0037b20] [c0b14998] ipr_ioa_bringdown_done+0xf8/0x3a0
>> [c0037bc0] [c0b12528] ipr_reset_ioa_job+0xd8/0x170
>> [c0037c00] [c0b18790] ipr_reset_timer_done+0x110/0x160
>> [c0037c50] [c024db50] call_timer_fn+0xa0/0x3a0
>> [c0037ce0] [c024e058] expire_timers+0x1b8/0x350
>> [c0037d50] [c024e2f0] run_timer_softirq+0x100/0x3e0
>> [c0037df0] [c0162edc] __do_softirq+0x20c/0x620
>> [c0037ee0] [c0163a80] irq_exit+0x230/0x290
>> [c0037f10] [c001d770] __do_irq+0x170/0x410
>> [c0037f90] [c003ea20] call_do_irq+0x14/0x24
>> [c007f84e3a70] [c001dae0] do_IRQ+0xd0/0x190
>> [c007f84e3ac0] [c0008c58] hardware_interrupt_common
>> +0x158/0x160
> 
> Hello Brian,
> 
> In the MAINTAINERS file I found the following:
> 
> IBM Power Linux RAID adapter
> M:  Brian King 
> S:  Supported
> F:  drivers/scsi/ipr.*
> 
> Is that information up-to-date? Do you agree that the above message indicates
> a bug in the ipr driver?

Yes. Can you try with this patch that is in 4.13/scsi-fixes:

https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git/commit/?h=4.13/scsi-fixes&id=b0e17a9b0df29590c45dfb296f541270a5941f41

Thanks,

Brian

-- 
Brian King
Power Linux I/O
IBM Linux Technology Center



Re: [PATCH] soc: Convert to using %pOF instead of full_name

2017-08-16 Thread Arnd Bergmann
On Thu, Aug 10, 2017 at 12:09 AM, Rob Herring  wrote:
> On Tue, Jul 18, 2017 at 4:43 PM, Rob Herring  wrote:
>> Now that we have a custom printf format specifier, convert users of
>> full_name to use %pOF instead. This is preparation to remove storing
>> of the full path string for each node.
>>
>> Signed-off-by: Rob Herring 
>> Cc: Scott Wood 
>> Cc: Qiang Zhao 
>> Cc: Matthias Brugger 
>> Cc: Simon Horman 
>> Cc: Magnus Damm 
>> Cc: Kukjin Kim 
>> Cc: Krzysztof Kozlowski 
>> Cc: Javier Martinez Canillas 

> Arnd, Olof,
>
> Can you please apply this one.

Applied to next/drivers with the various Acks, thanks!

  Arnd


Re: WARNING: CPU: 15 PID: 0 at block/blk-mq.c:1111 __blk_mq_run_hw_queue+0x1d8/0x1f0

2017-08-16 Thread Bart Van Assche
On Wed, 2017-08-16 at 23:37 +0530, Abdul Haleem wrote:
> Linux-next booted with the below warnings on powerpc
> 
> [ ... ]
> 
> boot warnings:
> --
> kvm: exiting hardware virtualization
> [ cut here ]
> WARNING: CPU: 15 PID: 0 at block/blk-mq.c: __blk_mq_run_hw_queue
> +0x1d8/0x1f0
> Modules linked in: iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4
> Call Trace:
> [c0037990] [c088f7b0] __blk_mq_delay_run_hw_queue
> +0x1f0/0x210
> [c00379d0] [c088fcb8] blk_mq_start_hw_queue+0x58/0x80
> [c00379f0] [c088fd40] blk_mq_start_hw_queues+0x60/0xb0
> [c0037a30] [c0ae2b54] scsi_kick_queue+0x34/0xa0
> [c0037a50] [c0ae2f70] scsi_run_queue+0x3b0/0x660
> [c0037ac0] [c0ae7ed4] scsi_run_host_queues+0x64/0xc0
> [c0037b00] [c0ae7f64] scsi_unblock_requests+0x34/0x60
> [c0037b20] [c0b14998] ipr_ioa_bringdown_done+0xf8/0x3a0
> [c0037bc0] [c0b12528] ipr_reset_ioa_job+0xd8/0x170
> [c0037c00] [c0b18790] ipr_reset_timer_done+0x110/0x160
> [c0037c50] [c024db50] call_timer_fn+0xa0/0x3a0
> [c0037ce0] [c024e058] expire_timers+0x1b8/0x350
> [c0037d50] [c024e2f0] run_timer_softirq+0x100/0x3e0
> [c0037df0] [c0162edc] __do_softirq+0x20c/0x620
> [c0037ee0] [c0163a80] irq_exit+0x230/0x290
> [c0037f10] [c001d770] __do_irq+0x170/0x410
> [c0037f90] [c003ea20] call_do_irq+0x14/0x24
> [c007f84e3a70] [c001dae0] do_IRQ+0xd0/0x190
> [c007f84e3ac0] [c0008c58] hardware_interrupt_common
> +0x158/0x160

Hello Brian,

In the MAINTAINERS file I found the following:

IBM Power Linux RAID adapter
M:  Brian King 
S:  Supported
F:  drivers/scsi/ipr.*

Is that information up-to-date? Do you agree that the above message indicates
a bug in the ipr driver?

Thanks,

Bart.

WARNING: CPU: 15 PID: 0 at block/blk-mq.c:1111 __blk_mq_run_hw_queue+0x1d8/0x1f0

2017-08-16 Thread Abdul Haleem
Hi,

Linux-next booted with the below warnings on powerpc

Test: Reboot
Machine Type : Power 8 bare-metal
Kernel version : 4.13.0-rc4-next-20170808
gcc : 4.8.5
config: Tul-NV-config file attached
Issue is rare to hit (found once for 3 retries)

A WARN_ON_ONCE is being triggered from function __blk_mq_run_hw_queue in
file block/blk-mq.c at line 

which is : 

static void __blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx)
{
	int srcu_idx;

	/*
	 * We should be running this queue from one of the CPUs that
	 * are mapped to it.
	 */
	WARN_ON(!cpumask_test_cpu(raw_smp_processor_id(), hctx->cpumask) &&
		cpu_online(hctx->next_cpu));

	/*
	 * We can't run the queue inline with ints disabled. Ensure that
	 * we catch bad users of this early.
	 */
   >>>  WARN_ON_ONCE(in_interrupt());

	if (!(hctx->flags & BLK_MQ_F_BLOCKING)) {
		rcu_read_lock();

boot warnings:
--
kvm: exiting hardware virtualization
[ cut here ]
WARNING: CPU: 15 PID: 0 at block/blk-mq.c: __blk_mq_run_hw_queue
+0x1d8/0x1f0
Modules linked in: iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4
iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp tun bridge
stp llc kvm_hv kvm iptable_filter vmx_crypto ipmi_powernv leds_powernv
led_class powernv_rng ipmi_devintf ipmi_msghandler rng_core
powernv_op_panel binfmt_misc nfsd ip_tables x_tables autofs4
CPU: 15 PID: 0 Comm: swapper/15 Not tainted 4.13.0-rc4-next-20170808 #4
task: c007f8439000 task.stack: c007f84e
NIP: c088f4d8 LR: c088f7b0 CTR: c0dafcc0
REGS: c00376d0 TRAP: 0700   Not tainted
(4.13.0-rc4-next-20170808)
MSR: 90029033 
  CR: 42004022  XER: 
CFAR: c088f3c4 SOFTE: 1
GPR00: c088f7b0 c0037950 c1d8ff00
c007eb707c00
GPR04:   c22fff00
c224ff00
GPR08: c224ff00 0001 0100
90001003
GPR12: 4400 cfd45280 c007f84e3f90
00200042
GPR16: 00019ad1 c0034000 
c15c4e80
GPR20: c1dc3b00 c15c4e80 000a
c0034000
GPR24:  c007eb3e1818 c0037a70
0001
GPR28: c007eb3e  c007eb707c00
c007eb707c00
NIP [c088f4d8] __blk_mq_run_hw_queue+0x1d8/0x1f0
LR [c088f7b0] __blk_mq_delay_run_hw_queue+0x1f0/0x210
Call Trace:
[c0037990] [c088f7b0] __blk_mq_delay_run_hw_queue
+0x1f0/0x210
[c00379d0] [c088fcb8] blk_mq_start_hw_queue+0x58/0x80
[c00379f0] [c088fd40] blk_mq_start_hw_queues+0x60/0xb0
[c0037a30] [c0ae2b54] scsi_kick_queue+0x34/0xa0
[c0037a50] [c0ae2f70] scsi_run_queue+0x3b0/0x660
[c0037ac0] [c0ae7ed4] scsi_run_host_queues+0x64/0xc0
[c0037b00] [c0ae7f64] scsi_unblock_requests+0x34/0x60
[c0037b20] [c0b14998] ipr_ioa_bringdown_done+0xf8/0x3a0
[c0037bc0] [c0b12528] ipr_reset_ioa_job+0xd8/0x170
[c0037c00] [c0b18790] ipr_reset_timer_done+0x110/0x160
[c0037c50] [c024db50] call_timer_fn+0xa0/0x3a0
[c0037ce0] [c024e058] expire_timers+0x1b8/0x350
[c0037d50] [c024e2f0] run_timer_softirq+0x100/0x3e0
[c0037df0] [c0162edc] __do_softirq+0x20c/0x620
[c0037ee0] [c0163a80] irq_exit+0x230/0x290
[c0037f10] [c001d770] __do_irq+0x170/0x410
[c0037f90] [c003ea20] call_do_irq+0x14/0x24
[c007f84e3a70] [c001dae0] do_IRQ+0xd0/0x190
[c007f84e3ac0] [c0008c58] hardware_interrupt_common
+0x158/0x160
--- interrupt: 501 at .L1^B42+0x0/0x4
LR = arch_local_irq_restore+0x124/0x160
[c007f84e3db0] [c001c9c8] arch_local_irq_restore+0xa8/0x160
(unreliable)
[c007f84e3dd0] [c0db5038] cpuidle_enter_state+0x238/0x6e0
[c007f84e3e30] [c0db5588] cpuidle_enter+0x38/0x60
[c007f84e3e50] [c01f22e4] call_cpuidle+0x74/0xe0
[c007f84e3e70] [c01f2a78] do_idle+0x4b8/0x5a0
[c007f84e3ee0] [c01f2f64] cpu_startup_entry+0x74/0x90
[c007f84e3f20] [c0068c14] start_secondary+0x4a4/0x550
[c007f84e3f90] [c000b16c] start_secondary_prolog+0x10/0x14
Instruction dump:
e9280e58 eba1ffe8 ebc1fff0 ebe1fff8 7c0803a6 39290001 f9280e58 4e800020 
3ce2004c e9270e38 39290001 f9270e38 <0fe0> 3d02004c e9280e40
39290001
---[ end trace 5632db71d3bf5b30 ]---

WARN_ON_ONCE(in_interrupt()) was first introduced in the commit :

commit b7a71e66d4d274d627cabc17c5e41330bcf47c2d
Author: Jens Axboe 
Date:   Tue Aug 1 09:28:24 2017 -0600

blk-mq: add warning to __blk_mq_run_hw_queue() for ints 
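
The "warn once" behaviour that produces a single splat like the one above can be imitated in userspace. This is a rough model under stated assumptions: MODEL_WARN_ON_ONCE, model_in_interrupt and model_run_hw_queue() are hypothetical names, and the real kernel macro is implemented differently.

```c
#include <assert.h>	/* only needed for the checks in the usage note */
#include <stdio.h>

static int warn_count;		/* how many warnings were actually printed */
static int model_in_interrupt;	/* stand-in for in_interrupt() */

/* Print a warning the first time cond is true at this call site only. */
#define MODEL_WARN_ON_ONCE(cond)					\
	do {								\
		static int warned;					\
		if ((cond) && !warned) {				\
			warned = 1;					\
			warn_count++;					\
			fprintf(stderr, "WARNING: %s\n", #cond);	\
		}							\
	} while (0)

/* Model of a function that must not be called from interrupt context. */
static void model_run_hw_queue(void)
{
	MODEL_WARN_ON_ONCE(model_in_interrupt);
}
```

Calling model_run_hw_queue() repeatedly with model_in_interrupt set prints one warning in total, so assert(warn_count == 1) holds afterwards; later offending calls stay silent.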

Re: [BUG][bisected 270065e] linux-next fails to boot on powerpc

2017-08-16 Thread Bart Van Assche
On Wed, 2017-08-16 at 22:30 +0530, Abdul Haleem wrote:
> As of next-20170809, linux-next on powerpc boot hung with below trace
> message.
> 
> [ ... ]
> 
> A bisection resulted in first bad commit (270065e92 - scsi: scsi-mq:
> Always unprepare ...) in the merge branch 'scsi/for-next'
> 
> System booted fine when the below commit is reverted: 
> 
> commit 270065e92c317845d69095ec8e3d18616b5b39d5
> Author: Bart Van Assche 
> Date:   Thu Aug 3 14:40:14 2017 -0700
> 
> scsi: scsi-mq: Always unprepare before requeuing a request

Hello Brian and Michael,

Do you agree that this probably indicates a bug in the PowerPC block driver
that is used to access the boot disk? Anyway, since a solution is not yet
available, I will submit a revert for this patch.

Bart.

[BUG][bisected 270065e] linux-next fails to boot on powerpc

2017-08-16 Thread Abdul Haleem
Hi Bart,

As of next-20170809, linux-next on powerpc boot hung with below trace
message.

Test : Boot
Machine Type : Power 8 bare-metal
Kernel version : 4.13.0-rc4-next-2017081
gcc : 4.8.5
config: Tul-NV-config file attached


Boot logs:
-
oprofile: using timer interrupt.
ipip: IPv4 and MPLS over IPv4 tunneling driver
NET: Registered protocol family 17
Key type dns_resolver registered
registered taskstats version 1
ima: No TPM chip found, activating TPM-bypass! (rc=-19)
console [netcon0] enabled
netconsole: network logging started
rtc-opal opal-rtc: setting system clock to 2017-08-16 06:34:56 UTC
(1502865296)
.
ready
sd 0:2:0:0: [sda] 272646144 512-byte logical blocks: (140 GB/130 GiB)
sd 0:2:0:0: [sda] 4096-byte physical blocks
sd 0:2:0:0: [sda] Write Protect is off 
INFO: task swapper/5:1 blocked for more than 120 seconds.
  Not tainted 4.13.0-rc4-next-20170810-autotest #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
swapper/5   D 9936 1  0 0x0800   
Call Trace:
[c007f8483a10] [c007f8483a80] 0xc007f8483a80 (unreliable)
[c007f8483be0] [c001b358] __switch_to+0x2e8/0x430
[c007f8483c40] [c09d134c] __schedule+0x38c/0xaf0
[c007f8483d20] [c09d1af0] schedule+0x40/0xb0
[c007f8483d50] [c0110bd4] async_synchronize_cookie_domain
+0xd4/0x150
[c007f8483dc0] [c000d8f8] kernel_init+0x28/0x140
[c007f8483e30] [c000bc60] ret_from_kernel_thread+0x5c/0x7c

A bisection resulted in first bad commit (270065e92 - scsi: scsi-mq:
Always unprepare ...) in the merge branch 'scsi/for-next'

System booted fine when the below commit is reverted: 

commit 270065e92c317845d69095ec8e3d18616b5b39d5
Author: Bart Van Assche 
Date:   Thu Aug 3 14:40:14 2017 -0700

scsi: scsi-mq: Always unprepare before requeuing a request

One of the two scsi-mq functions that requeue a request unprepares a
request before requeueing (scsi_io_completion()) but the other
function
not (__scsi_queue_insert()). Make sure that a request is unprepared
before requeuing it.

Fixes: commit d285203cf647 ("scsi: add support for a blk-mq based
I/O path.")
Signed-off-by: Bart Van Assche 
Cc: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Damien Le Moal 
Cc: Johannes Thumshirn 
Cc: 
Tested-by: Damien Le Moal 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Johannes Thumshirn 
Signed-off-by: Martin K. Petersen 

-- 
Regards

Abdul Haleem
IBM Linux Technology Centre


#
# Automatically generated file; DO NOT EDIT.
# Linux/powerpc 4.13.0-rc2 Kernel Configuration
#
CONFIG_PPC64=y

#
# Processor support
#
CONFIG_PPC_BOOK3S_64=y
# CONFIG_PPC_BOOK3E_64 is not set
# CONFIG_POWER7_CPU is not set
CONFIG_POWER8_CPU=y
CONFIG_PPC_BOOK3S=y
CONFIG_PPC_FPU=y
CONFIG_ALTIVEC=y
CONFIG_VSX=y
# CONFIG_PPC_ICSWX is not set
CONFIG_PPC_STD_MMU=y
CONFIG_PPC_STD_MMU_64=y
CONFIG_PPC_RADIX_MMU=y
CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION=y
CONFIG_PPC_MM_SLICES=y
CONFIG_PPC_HAVE_PMU_SUPPORT=y
CONFIG_PPC_PERF_CTRS=y
CONFIG_FORCE_SMP=y
CONFIG_SMP=y
CONFIG_NR_CPUS=2048
CONFIG_PPC_DOORBELL=y
# CONFIG_CPU_BIG_ENDIAN is not set
CONFIG_CPU_LITTLE_ENDIAN=y
CONFIG_PPC64_BOOT_WRAPPER=y
CONFIG_64BIT=y
CONFIG_ARCH_PHYS_ADDR_T_64BIT=y
CONFIG_ARCH_DMA_ADDR_T_64BIT=y
CONFIG_MMU=y
CONFIG_ARCH_MMAP_RND_BITS_MAX=29
CONFIG_ARCH_MMAP_RND_BITS_MIN=14
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=13
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=7
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NR_IRQS=512
CONFIG_NMI_IPI=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_TRACE_IRQFLAGS_SUPPORT=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_HAS_DMA_SET_COHERENT_MASK=y
CONFIG_PPC=y
# CONFIG_GENERIC_CSUM is not set
CONFIG_EARLY_PRINTK=y
CONFIG_PANIC_TIMEOUT=180
CONFIG_COMPAT=y
CONFIG_SYSVIPC_COMPAT=y
CONFIG_SCHED_OMIT_FRAME_POINTER=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_PPC_UDBG_16550=y
# CONFIG_GENERIC_TBSYNC is not set
CONFIG_AUDIT_ARCH=y
CONFIG_GENERIC_BUG=y
CONFIG_EPAPR_BOOT=y
# CONFIG_DEFAULT_UIMAGE is not set
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
# CONFIG_PPC_DCR_NATIVE is not set
# CONFIG_PPC_DCR_MMIO is not set
# CONFIG_PPC_OF_PLATFORM_PCI is not set
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_PPC_EMULATE_SSTEP=y
CONFIG_ZONE_DMA32=y
CONFIG_PGTABLE_LEVELS=4
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_EXTABLE_SORT=y

#
# General setup
#
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_CROSS_COMPILE=""
# CONFIG_COMPILE_TEST is not set
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_KERNEL_GZIP=y

Re: [RFC PATCH v5 0/5] vfio-pci: Add support for mmapping MSI-X table

2017-08-16 Thread Alex Williamson
On Wed, 16 Aug 2017 10:35:49 +1000
Benjamin Herrenschmidt  wrote:

> On Tue, 2017-08-15 at 10:37 -0600, Alex Williamson wrote:
> > Of course I don't think either of those are worth imposing a
> > performance penalty where we don't otherwise need one.  However, if we
> > look at a VM scenario where the guest is following the PCI standard for
> > programming MSI-X interrupts (ie. not POWER), we need some mechanism to
> > intercept those MMIO writes to the vector table and configure the host
> > interrupt domain of the device rather than allowing the guest direct
> > access.  This is simply part of virtualizing the device to the guest.
> > So even if the kernel allows mmap'ing the vector table, the hypervisor
> > needs to trap it, so the mmap isn't required or used anyway.  It's only
> > when you define a non-PCI standard for your guest to program
> > interrupts, as POWER has done, and can therefore trust that the
> > hypervisor does not need to trap on the vector table that having that
> > mmap'able vector table becomes fully useful.  AIUI, ARM supports 64k
> > pages too... does ARM have any strategy that would actually make it
> > possible to make use of an mmap covering the vector table?  Thanks,  
> 
> WTF  Alex, can you stop once and for all with all that "POWER is
> not standard" bullshit please ? It's completely wrong.

As you've stated, the MSI-X vector table on POWER is currently updated
via a hypercall.  POWER is overall PCI compliant (I assume), but the
guest does not directly modify the vector table in MMIO space of the
device.  This is important...

> This has nothing to do with PCIe standard !

Yes, it actually does, because if the guest relies on the vector table
to be virtualized then it doesn't particularly matter whether the
vfio-pci kernel driver allows that portion of device MMIO space to be
directly accessed or mapped because QEMU needs for it to be trapped in
order to provide that virtualization.

I'm not knocking POWER, it's a smart thing for virtualization to have
defined this hypercall which negates the need for vector table
virtualization and allows efficient mapping of the device.  On other
platforms, it's not necessarily practical given the broad base of legacy
guests supported where we'd never get agreement to implement this as
part of the platform spec... if there even was such a thing.  Maybe we
could provide the hypercall and dynamically enable direct vector table
mapping (disabling vector table virtualization) only if the hypercall
is used.

> The PCIe standard says strictly *nothing* whatsoever about how an OS
> obtains the magic address/values to put in the device and how the PCIe
> host bridge may do appropriate filtering.

And now we've jumped the tracks...  The only way the platform specific
address/data values become important is if we allow direct access to
the vector table AND now we're formulating how the user/guest might
write to it directly.  Otherwise the virtualization of the vector
table, or paravirtualization via hypercall provides the translation
where the host and guest address/data pairs can operate in completely
different address spaces.

> There is nothing on POWER that prevents the guest from writing the MSI-
> X address/data by hand. The problem isn't who writes the values or even
> how. The problem breaks down into these two things that are NOT covered
> by any aspect of the PCIe standard:

You've moved on to a different problem.  I think everyone aside from
POWER is still back at the problem where who writes the vector table
values is the foremost concern.
 
>   1- The OS needs to obtain address/data values for an MSI that will
> "work" for the device.
> 
>   2- The HW+HV needs to prevent collateral damage caused by a device
> issuing stores to incorrect address or with incorrect data. Now *this*
> is necessary for *ANY* kind of DMA whether it's an MSI or something
> else anyway.
> 
> Now, the filtering done by qemu is NOT a reasonable way to handle 2)
> and whatever excuse about "making it harder" doesn't fly a meter when
> it comes to security. Making it "harder to break accidentally" I also
> don't buy, people don't just randomly put things in their MSI-X tables
> "accidentally", that stuff works or doesn't.

As I said before, I'm not willing to preserve the weak attributes that
blocking direct vector table access provides over pursuing a more
performant interface, but I also don't think their value is absolute
zero either.

> That leaves us with 1). Now this is purely a platform-specific matter,
> not a spec matter. Once the HW has a way to enforce you can only
> generate "allowed" MSIs it becomes a matter of having some FW mechanism
> that can be used to inform the OS what address/values to use for a
> given interrupt.
> 
> This is provided on POWER by a combination of device-tree and RTAS. It
> could be that x86/ARM64 doesn't provide good enough mechanisms via ACPI
> but this is no way a problem of standard 

Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-08-16 Thread Paul E. McKenney
On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote:
> On Wed, Aug 16, 2017 at 10:43:52PM +1000, Michael Ellerman wrote:
> > "Paul E. McKenney"  writes:
> > ...
> > >
> > > commit 33103e7b1f89ef432dfe3337d2a6932cdf5c1312
> > > Author: Paul E. McKenney 
> > > Date:   Mon Aug 14 08:54:39 2017 -0700
> > >
> > > EXP: Trace tick return from tick_nohz_stop_sched_tick
> > > 
> > > Signed-off-by: Paul E. McKenney 
> > >
> > > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > > index c7a899c5ce64..7358a5073dfb 100644
> > > --- a/kernel/time/tick-sched.c
> > > +++ b/kernel/time/tick-sched.c
> > > @@ -817,6 +817,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct 
> > > tick_sched *ts,
> > >* (not only the tick).
> > >*/
> > >   ts->sleep_length = ktime_sub(dev->next_event, now);
> > > + trace_printk("tick_nohz_stop_sched_tick: %lld\n", (tick - ktime_get()) 
> > > / 1000);
> > >   return tick;
> > >  }
> > 
> > Should I be seeing negative values? A small sample:
> 
> Maybe due to hypervisor preemption delays, but I confess that I am
> surprised to see them this large.  1,602,250,019 microseconds is something
> like a half hour, which could result in stall warnings all by itself.
> 
> >   <idle>-0 [015] d...  1602.039695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250019
> >   <idle>-0 [009] d...  1602.039701: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
> >   <idle>-0 [007] d...  1602.039702: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
> >   <idle>-0 [048] d...  1602.039703: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: 9973
> >   <idle>-0 [006] d...  1602.039704: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250027
> >   <idle>-0 [001] d...  1602.039730: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250053
> >   <idle>-0 [008] d...  1602.039732: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250055
> >   <idle>-0 [006] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> >   <idle>-0 [009] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> >   <idle>-0 [001] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> > 
> > 
> > I have a full trace, I'll send it to you off-list.
> 
> I will take a look!

And from your ps output, PID 9 is rcu_sched, which is the RCU grace-period
kthread that stalled.  This kthread was starved, based on this from your
dmesg:

[ 1602.067008] rcu_sched kthread starved for 2603 jiffies! g7275 c7274 f0x0 
RCU_GP_WAIT_FQS(3) ->state=0x1

The RCU_GP_WAIT_FQS says that this kthread is periodically scanning for
idle-CPU and offline-CPU quiescent states, which means that its waits
will be accompanied by short timeouts.  The "starved for 2603 jiffies"
says that it has not run for one good long time.  The ->state is its
task_struct ->state field.

The immediately preceding dmesg line is as follows:

[ 1602.063851]  (detected by 53, t=2603 jiffies, g=7275, c=7274, q=608)

In other words, the rcu_sched grace-period kthread has been starved
for the entire duration of the current grace period, as shown by the
t=2603.

Let's turn now to the trace output, looking for the last bit of the
rcu_sched task's activity:

   rcu_sched-9 [054] d...  1576.030096: timer_start: 
timer=c007fae1bc20 function=process_timeout expires=4295094922 [timeout=1] 
cpu=54 idx=0 flags=
ksoftirqd/53-276   [053] ..s.  1576.030097: rcu_invoke_callback: rcu_sched 
rhp=c00fcf8c4eb0 func=__d_free
   rcu_sched-9 [054] d...  1576.030097: rcu_utilization: Start context 
switch
ksoftirqd/53-276   [053] ..s.  1576.030098: rcu_invoke_callback: rcu_sched 
rhp=c00fcff74ee0 func=proc_i_callback
   rcu_sched-9 [054] d...  1576.030098: rcu_grace_period: rcu_sched 
7275 cpuqs
   rcu_sched-9 [054] d...  1576.030099: rcu_utilization: End context 
switch

So this task set up a timer ("timer_start:") for one jiffy ("[timeout=1]",
but what is with "expires=4295094922"?)  and blocked ("rcu_utilization:
Start context switch" and "rcu_utilization: End context switch"),
recording its CPU's quiescent state in the process ("rcu_grace_period:
rcu_sched 7275 cpuqs").

Of course, the timer will have expired in the context of some other task,
but a search for "c007fae1bc20" (see the "timer=" in the first trace
line above) shows nothing (to be painfully accurate, the search wraps back
to earlier uses of this timer by rcu_sched).  So the timer never did fire.
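
As a rough cross-check of the numbers above (a sketch, not part of the
original analysis): the last rcu_sched trace line is stamped 1576.030099
and the stall report is stamped 1602.063851, about 26.03 seconds apart.
If the 2603 jiffies in the report span that same interval, the tick rate
works out to roughly 100 Hz.  That HZ value is an inference from the
timestamps, not something the logs state, and the helper names below are
hypothetical.

```c
/* Timestamps copied from the dmesg and trace lines above, in microseconds. */
#define LAST_RCU_SCHED_EVENT_US 1576030099LL /* "End context switch" trace line */
#define STALL_REPORT_US         1602063851LL /* "(detected by 53, ...)" dmesg line */
#define STARVED_JIFFIES         2603LL       /* "starved for 2603 jiffies" */

/* Elapsed time between two microsecond timestamps. */
static long long elapsed_us(long long start_us, long long end_us)
{
	return end_us - start_us;
}

/*
 * Tick rate implied by a jiffies count over an interval, rounded to the
 * nearest Hz.  Assumes the jiffies count and the interval cover the same
 * span of time, which is only approximately true here.
 */
static long long implied_hz(long long jiffies, long long interval_us)
{
	return (jiffies * 1000000LL + interval_us / 2) / interval_us;
}
```

With the values above, elapsed_us() gives 26,033,752 microseconds and
implied_hz() rounds to 100.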

The next question is "what did CPU 054 do next?"  We find it entering idle:

  <idle>-0 [054] d...  1576.030167: tick_stop: success=1 dependency=NONE
  <idle>-0 [054] d...  1576.030167: hrtimer_cancel: 

[PATCH] powerpc/perf: Fix usage of nest_imc_refc

2017-08-16 Thread Madhavan Srinivasan
nest_imc_refc is a reference count struct used to track the number of
active perf sessions using the nest units.

Access nest_imc_refc through the per-cpu pointer 'local_nest_imc_refc',
because nest_imc_refc is not initialized using node_id as the array
index. Fix the remaining accesses that still index it by node_id.

Fixes: 885dcd709ba91 ('powerpc/perf: Add nest IMC PMU support')
Reported-by: Dan Carpenter 
Signed-off-by: Madhavan Srinivasan 
---
 arch/powerpc/perf/imc-pmu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/perf/imc-pmu.c b/arch/powerpc/perf/imc-pmu.c
index a8f95f96d54b..9ccac86f3463 100644
--- a/arch/powerpc/perf/imc-pmu.c
+++ b/arch/powerpc/perf/imc-pmu.c
@@ -404,7 +404,7 @@ static void nest_imc_counters_release(struct perf_event 
*event)
rc = opal_imc_counters_stop(OPAL_IMC_COUNTERS_NEST,

get_hard_smp_processor_id(event->cpu));
if (rc) {
-   mutex_unlock(&nest_imc_refc[node_id].lock);
+   mutex_unlock(>lock);
pr_err("nest-imc: Unable to stop the counters for core 
%d\n", node_id);
return;
}
@@ -487,7 +487,7 @@ static int nest_imc_event_init(struct perf_event *event)
rc = opal_imc_counters_start(OPAL_IMC_COUNTERS_NEST,
 
get_hard_smp_processor_id(event->cpu));
if (rc) {
-   mutex_unlock(&nest_imc_refc[node_id].lock);
+   mutex_unlock(>lock);
pr_err("nest-imc: Unable to start the counters for node 
%d\n",

node_id);
return rc;
-- 
2.7.4



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-08-16 Thread Nicholas Piggin
On Wed, 16 Aug 2017 05:56:17 -0700
"Paul E. McKenney"  wrote:

> On Wed, Aug 16, 2017 at 10:43:52PM +1000, Michael Ellerman wrote:
> > "Paul E. McKenney"  writes:
> > ...  
> > >
> > > commit 33103e7b1f89ef432dfe3337d2a6932cdf5c1312
> > > Author: Paul E. McKenney 
> > > Date:   Mon Aug 14 08:54:39 2017 -0700
> > >
> > > EXP: Trace tick return from tick_nohz_stop_sched_tick
> > > 
> > > Signed-off-by: Paul E. McKenney 
> > >
> > > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > > index c7a899c5ce64..7358a5073dfb 100644
> > > --- a/kernel/time/tick-sched.c
> > > +++ b/kernel/time/tick-sched.c
> > > @@ -817,6 +817,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct 
> > > tick_sched *ts,
> > >* (not only the tick).
> > >*/
> > >   ts->sleep_length = ktime_sub(dev->next_event, now);
> > > + trace_printk("tick_nohz_stop_sched_tick: %lld\n", (tick - ktime_get()) 
> > > / 1000);
> > >   return tick;
> > >  }  
> > 
> > Should I be seeing negative values? A small sample:  
> 
> Maybe due to hypervisor preemption delays, but I confess that I am
> surprised to see them this large.  1,602,250,019 microseconds is something
> like a half hour, which could result in stall warnings all by itself.
> 
> >   <idle>-0 [015] d...  1602.039695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250019
> >   <idle>-0 [009] d...  1602.039701: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
> >   <idle>-0 [007] d...  1602.039702: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
> >   <idle>-0 [048] d...  1602.039703: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: 9973
> >   <idle>-0 [006] d...  1602.039704: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250027
> >   <idle>-0 [001] d...  1602.039730: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250053
> >   <idle>-0 [008] d...  1602.039732: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250055
> >   <idle>-0 [006] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> >   <idle>-0 [009] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> >   <idle>-0 [001] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> > 
> > 
> > I have a full trace, I'll send it to you off-list.  
> 
> I will take a look!

I found this, I can't see that it would cause our symptoms, but it's
worth someone who knows the code taking a look at it.

--
cpuidle: fix broadcast control when broadcast can not be entered

When failing to enter broadcast timer mode for an idle state that
requires it, a new state is selected that does not require broadcast,
but the broadcast variable remains set. This causes
tick_broadcast_exit to be called despite not having entered broadcast
mode.

This causes the WARN_ON_ONCE(!irqs_disabled()) to trigger in some
cases, but otherwise does not appear to cause problems.
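
The control flow described above can be sketched in isolation.  This is
a simplified model with hypothetical names, not the cpuidle code itself:
it only shows why a flag left set after the fallback leads to an
unmatched exit call, and how clearing it avoids that.

```c
#include <stdbool.h>

/* Counts how often the (fake) broadcast exit hook runs. */
static int broadcast_exit_calls;

static void fake_broadcast_exit(void)
{
	broadcast_exit_calls++;
}

/*
 * needs_broadcast:   the selected idle state requires broadcast mode.
 * enter_ok:          entering broadcast mode succeeded.
 * clear_on_fallback: apply the fix, i.e. clear the flag when falling
 *                    back to a state that does not need broadcast.
 */
static void enter_state(bool needs_broadcast, bool enter_ok,
			bool clear_on_fallback)
{
	bool broadcast = needs_broadcast;

	if (broadcast && !enter_ok) {
		/* A new state that does not need broadcast is picked here. */
		if (clear_on_fallback)
			broadcast = false;
	}

	/* ... the CPU would idle here ... */

	if (broadcast)
		fake_broadcast_exit();
}
```

Without the fix, a failed enter still results in one call to the exit
hook; with it, the exit hook runs only when broadcast mode was entered.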

Signed-off-by: Nicholas Piggin 
---
 drivers/cpuidle/cpuidle.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index 60bb64f4329d..4453e27f855e 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -208,6 +208,7 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct 
cpuidle_driver *drv,
return -EBUSY;
}
target_state = &drv->states[index];
+   broadcast = false;
}
 
/* Take note of the planned idle state. */
-- 
2.13.3



Re: [PATCH v2 2/3] livepatch: send a fake signal to all blocking tasks

2017-08-16 Thread Petr Mladek
On Thu 2017-08-10 12:48:14, Miroslav Benes wrote:
> Live patching consistency model is of LEAVE_PATCHED_SET and
> SWITCH_THREAD. This means that all tasks in the system have to be marked
> one by one as safe to call a new patched function. Safe means when a
> task is not (sleeping) in a set of patched functions. That is, no
> patched function is on the task's stack. Another clearly safe place is
> the boundary between kernel and userspace. The patching waits for all
> tasks to get outside of the patched set or to cross the boundary. The
> transition is completed afterwards.
> 
> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> index 79022b7eca2c..a359340c924d 100644
> --- a/kernel/livepatch/core.c
> +++ b/kernel/livepatch/core.c
> @@ -452,7 +452,7 @@ EXPORT_SYMBOL_GPL(klp_enable_patch);
>  static ssize_t force_show(struct kobject *kobj,
> struct kobj_attribute *attr, char *buf)
>  {
> - return sprintf(buf, "No operation is currently permitted.\n");
> + return sprintf(buf, "signal\n");

This invalidates the "NOTE:" above this function ;-)

Best Regards,
Petr


Re: [PATCH] powerpc/xmon: Exclude all of xmon/ from ftrace

2017-08-16 Thread Naveen N. Rao
Hi Michael,
Sorry -- was off since last week.

On 2017/08/15 08:04PM, Michael Ellerman wrote:
> Michael Ellerman  writes:
> 
> > "Naveen N. Rao"  writes:
> >
> >> diff --git a/arch/powerpc/xmon/Makefile b/arch/powerpc/xmon/Makefile
> >> index 0b2f771593eb..5f95af64cb8f 100644
> >> --- a/arch/powerpc/xmon/Makefile
> >> +++ b/arch/powerpc/xmon/Makefile
> >> @@ -7,6 +7,19 @@ UBSAN_SANITIZE := n
> >>  
> >>  ccflags-$(CONFIG_PPC64) := $(NO_MINIMAL_TOC)
> >>  
> >> +ifdef CONFIG_FUNCTION_TRACER
> >> +CFLAGS_REMOVE_xmon.o  = -mno-sched-epilog $(CC_FLAGS_FTRACE)
> >> +CFLAGS_REMOVE_nonstdio.o = -mno-sched-epilog $(CC_FLAGS_FTRACE)
> >> +ifdef CONFIG_XMON_DISASSEMBLY
> >> +CFLAGS_REMOVE_ppc-dis.o   = -mno-sched-epilog $(CC_FLAGS_FTRACE)
> >> +CFLAGS_REMOVE_ppc-opc.o   = -mno-sched-epilog $(CC_FLAGS_FTRACE)
> >> +ifdef CONFIG_SPU_BASE
> >> +CFLAGS_REMOVE_spu-dis.o   = -mno-sched-epilog $(CC_FLAGS_FTRACE)
> >> +CFLAGS_REMOVE_spu-opc.o   = -mno-sched-epilog $(CC_FLAGS_FTRACE)
> >> +endif
> >> +endif
> >> +endif
> >
> > Urk.
> >
> > We want to disable it for everything in the directory, so can you do
> > something like:
> >
> >   ORIG_CFLAGS := $(KBUILD_CFLAGS)
> >   KBUILD_CFLAGS = $(subst $(CC_FLAGS_FTRACE),,$(ORIG_CFLAGS))
> 
> Yes:
> 
>   # Disable ftrace for the entire directory
>   ORIG_CFLAGS := $(KBUILD_CFLAGS)
>   KBUILD_CFLAGS = $(subst -mno-sched-epilog,,$(subst 
> $(CC_FLAGS_FTRACE),,$(ORIG_CFLAGS)))
> 
> Seems to work.

Nice -- I had looked for a generic CFLAGS_REMOVE variant, but didn't 
find that. This is much nicer.

Thanks,
Naveen



Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-08-16 Thread Paul E. McKenney
On Wed, Aug 16, 2017 at 10:43:52PM +1000, Michael Ellerman wrote:
> "Paul E. McKenney"  writes:
> ...
> >
> > commit 33103e7b1f89ef432dfe3337d2a6932cdf5c1312
> > Author: Paul E. McKenney 
> > Date:   Mon Aug 14 08:54:39 2017 -0700
> >
> > EXP: Trace tick return from tick_nohz_stop_sched_tick
> > 
> > Signed-off-by: Paul E. McKenney 
> >
> > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > index c7a899c5ce64..7358a5073dfb 100644
> > --- a/kernel/time/tick-sched.c
> > +++ b/kernel/time/tick-sched.c
> > @@ -817,6 +817,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct 
> > tick_sched *ts,
> >  * (not only the tick).
> >  */
> > ts->sleep_length = ktime_sub(dev->next_event, now);
> > +   trace_printk("tick_nohz_stop_sched_tick: %lld\n", (tick - ktime_get()) 
> > / 1000);
> > return tick;
> >  }
> 
> Should I be seeing negative values? A small sample:

Maybe due to hypervisor preemption delays, but I confess that I am
surprised to see them this large.  1,602,250,019 microseconds is something
like a half hour, which could result in stall warnings all by itself.
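
For scale, a small sketch of the conversion (helper names are
hypothetical): 1,602,250,019 microseconds is 1602 whole seconds, or
26 minutes 42 seconds.  The magnitude is also within about 0.2 seconds
of the trace timestamps on the sample lines quoted below (1602.039695
and later), which would be consistent with the returned tick value being
close to zero; that reading is an assumption, not something established
in this thread.

```c
/* Microseconds per second and per minute. */
#define US_PER_SEC 1000000LL
#define US_PER_MIN (60LL * US_PER_SEC)

/* Whole seconds contained in a microsecond count (truncating). */
static long long us_to_whole_seconds(long long us)
{
	return us / US_PER_SEC;
}

/* Whole minutes contained in a microsecond count (truncating). */
static long long us_to_whole_minutes(long long us)
{
	return us / US_PER_MIN;
}

/* Seconds left over after removing the whole minutes. */
static long long us_leftover_seconds(long long us)
{
	return (us % US_PER_MIN) / US_PER_SEC;
}
```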

>   <idle>-0 [015] d...  1602.039695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250019
>   <idle>-0 [009] d...  1602.039701: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
>   <idle>-0 [007] d...  1602.039702: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
>   <idle>-0 [048] d...  1602.039703: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: 9973
>   <idle>-0 [006] d...  1602.039704: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250027
>   <idle>-0 [001] d...  1602.039730: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250053
>   <idle>-0 [008] d...  1602.039732: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250055
>   <idle>-0 [006] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
>   <idle>-0 [009] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
>   <idle>-0 [001] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
> 
> 
> I have a full trace, I'll send it to you off-list.

I will take a look!

Thanx, Paul



Re: [patch net-next 0/3] net/sched: Improve getting objects by indexes

2017-08-16 Thread Chris Wilson
Quoting Christian König (2017-08-16 08:49:07)
> On 16.08.2017 at 04:12, Chris Mi wrote:
> > Using current TC code, it is very slow to insert a lot of rules.
> >
> > In order to improve the rules update rate in TC,
> > we introduced the following two changes:
> >  1) changed cls_flower to use IDR to manage the filters.
> >  2) changed all act_xxx modules to use IDR instead of
> > a small hash table
> >
> > But IDR has a limitation that it uses int. TC handle uses u32.
> > To make sure there is no regression, we also changed IDR to use
> > unsigned long. All clients of IDR are changed to use new IDR API.
> 
> WOW, wait a second. The idr change is touching a lot of drivers and to 
> be honest doesn't look correct at all.
> 
> Just look at the first chunk of your modification:
> > @@ -998,8 +999,9 @@ int bsg_register_queue(struct request_queue *q, struct 
> > device *parent,
> >   
> >   mutex_lock(_mutex);
> >   
> > - ret = idr_alloc(_minor_idr, bcd, 0, BSG_MAX_DEVS, GFP_KERNEL);
> > - if (ret < 0) {
> > + ret = idr_alloc(_minor_idr, bcd, _index, 0, BSG_MAX_DEVS,
> > + GFP_KERNEL);
> > + if (ret) {
> >   if (ret == -ENOSPC) {
> >   printk(KERN_ERR "bsg: too many bsg devices\n");
> >   ret = -EINVAL;
> The condition "if (ret)" will now always be true after the first 
> allocation and so we always run into the error handling after that.

ret is now purely the error code, so it doesn't look that suspicious.
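
The disagreement above comes down to two return conventions.  A minimal
sketch, using hypothetical stand-in functions rather than the real idr
API: under the old convention the return value is the allocated id or a
negative errno, so only "ret < 0" signals failure; under the convention
the quoted hunk appears to introduce, the return value is only an error
code and the id comes back through an extra pointer argument, so
"if (ret)" is a valid failure check.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for the old convention: returns the new id
 * (>= 0) on success or a negative errno on failure.
 */
static int old_style_alloc(int *next_id, int max_ids)
{
	if (*next_id >= max_ids)
		return -ENOSPC;
	return (*next_id)++;
}

/*
 * Hypothetical stand-in for the proposed convention: returns 0 on
 * success or a negative errno, and hands the id back through *index.
 */
static int new_style_alloc(int *next_id, int max_ids, unsigned long *index)
{
	if (*next_id >= max_ids)
		return -ENOSPC;
	*index = (unsigned long)(*next_id)++;
	return 0;
}
```

With old_style_alloc(), the second successful call returns 1, so a bare
"if (ret)" would misread it as an error; with new_style_alloc() every
successful call returns 0.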

> I've never read the bsg code before, but that's certainly not correct. 
> And that incorrect pattern repeats over and over again in this code.
> 
> Apart from that why the heck do you want to allocate more than 1<<31 
> handles?

And more to the point, arbitrarily changing the maximum to ULONG_MAX
where the ABI only supports U32_MAX is dangerous. Unless you do the
analysis otherwise, you have to replace all the end=0 with end=INT_MAX
to maintain existing behaviour.
-Chris


Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?

2017-08-16 Thread Michael Ellerman
"Paul E. McKenney"  writes:
...
>
> commit 33103e7b1f89ef432dfe3337d2a6932cdf5c1312
> Author: Paul E. McKenney 
> Date:   Mon Aug 14 08:54:39 2017 -0700
>
> EXP: Trace tick return from tick_nohz_stop_sched_tick
> 
> Signed-off-by: Paul E. McKenney 
>
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index c7a899c5ce64..7358a5073dfb 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -817,6 +817,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct 
> tick_sched *ts,
>* (not only the tick).
>*/
>   ts->sleep_length = ktime_sub(dev->next_event, now);
> + trace_printk("tick_nohz_stop_sched_tick: %lld\n", (tick - ktime_get()) 
> / 1000);
>   return tick;
>  }

Should I be seeing negative values? A small sample:

  <idle>-0 [015] d...  1602.039695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250019
  <idle>-0 [009] d...  1602.039701: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
  <idle>-0 [007] d...  1602.039702: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250025
  <idle>-0 [048] d...  1602.039703: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: 9973
  <idle>-0 [006] d...  1602.039704: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250027
  <idle>-0 [001] d...  1602.039730: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250053
  <idle>-0 [008] d...  1602.039732: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602250055
  <idle>-0 [006] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
  <idle>-0 [009] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018
  <idle>-0 [001] d...  1602.049695: __tick_nohz_idle_enter: tick_nohz_stop_sched_tick: -1602260018


I have a full trace, I'll send it to you off-list.

cheers


Re: [PATCH 0/3] ALSA: make snd_kcontrol_new const

2017-08-16 Thread Takashi Iwai
On Wed, 16 Aug 2017 10:44:08 +0200,
Bhumika Goyal wrote:
> 
> Make these structures const. Done using Coccinelle.
> 
> Bhumika Goyal (3):
>   ALSA: aoa: make snd_kcontrol_new const
>   ALSA: pcxhr: make snd_kcontrol_new const
>   ALSA: hda: make snd_kcontrol_new const

Applied all three patches now.  Thanks.


Takashi


[PATCH] powerpc: powernv: Fix build error on const discarding

2017-08-16 Thread Corentin Labbe
When building a random powerpc kernel I hit this build error:
  CC  arch/powerpc/platforms/powernv/opal-imc.o
arch/powerpc/platforms/powernv/opal-imc.c: In function « disable_nest_pmu_counters »:
arch/powerpc/platforms/powernv/opal-imc.c:130:13: error : assignment discards « const » qualifier from pointer target type [-Werror=discarded-qualifiers]
   l_cpumask = cpumask_of_node(nid);
             ^
This patch simply adds const to l_cpumask to fix this issue.
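
The same diagnostic can be reproduced in miniature.  The sketch below
uses a hypothetical mask_of_node() in place of cpumask_of_node() and a
made-up struct; the point is only that a pointer to read-only data must
be stored in a pointer-to-const variable.

```c
/* Made-up mask type for illustration. */
struct mask {
	unsigned long bits;
};

/* Read-only per-node masks. */
static const struct mask node_masks[2] = { { 0x0fUL }, { 0xf0UL } };

/* Hypothetical stand-in for cpumask_of_node(): returns read-only data. */
static const struct mask *mask_of_node(int nid)
{
	return &node_masks[nid];
}

static unsigned long node_bits(int nid)
{
	/*
	 * Declaring this as "struct mask *m" would discard the const
	 * qualifier on assignment and trigger the warning shown above.
	 */
	const struct mask *m;

	m = mask_of_node(nid);
	return m->bits;
}
```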

Signed-off-by: Corentin Labbe 
---
 arch/powerpc/platforms/powernv/opal-imc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/opal-imc.c 
b/arch/powerpc/platforms/powernv/opal-imc.c
index b903bf5e6006..21f6531fae20 100644
--- a/arch/powerpc/platforms/powernv/opal-imc.c
+++ b/arch/powerpc/platforms/powernv/opal-imc.c
@@ -123,7 +123,7 @@ static int imc_pmu_create(struct device_node *parent, int 
pmu_index, int domain)
 static void disable_nest_pmu_counters(void)
 {
int nid, cpu;
-   struct cpumask *l_cpumask;
+   const struct cpumask *l_cpumask;
 
get_online_cpus();
for_each_online_node(nid) {
-- 
2.13.0



Re: powerpc/mm/nohash: add definition of PGALLOC_GFP

2017-08-16 Thread Michael Ellerman
On Tue, 2017-08-15 at 03:46:36 UTC, Balbir Singh wrote:
> fixes
> (de3b876 powerpc/mm/book(e)(3s)/64: Add page table accounting)
> 
> I missed adding PGALLOC_GFP for nohash/64
> 
> Reported-by: Michael Ellerman 
> Signed-off-by: Balbir Singh 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/5b6c133e0801007117cf4a466cb56d

cheers


Re: powerpc: fix invalid use of register expressions

2017-08-16 Thread Michael Ellerman
On Mon, 2017-08-14 at 18:42:43 UTC, Andreas Schwab wrote:
> This fixes another invalid use of register expressions.
> 
> Signed-off-by: Andreas Schwab 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/6c80d3164ece26e55dc2dbfceba948

cheers


Re: Fix for supporting nest events on muti socket system

2017-08-16 Thread Michael Ellerman
On Mon, 2017-08-14 at 11:42:23 UTC, Anju T wrote:
> In a multi node system with discontinuous node id, nest event values
> are not showing up properly. That is,
> 
> snip from lscpu output:
> 
> ..
> NUMA node0 CPU(s): 0-15
> NUMA node8 CPU(s): 16-31
> ..
> 
> Nest event values on such systems are broken:
> 
> $./perf stat -e 'nest_powerbus0_imc/PM_PB_CYC/' -C 0-14 -I 1000 sleep 1000
> #           time             counts unit events
>      1.000294577    30,17,24,42,880      nest_powerbus0_imc/PM_PB_CYC/
>      2.000528938    29,92,08,53,760      nest_powerbus0_imc/PM_PB_CYC/
>      3.000713925    29,92,08,00,000      nest_powerbus0_imc/PM_PB_CYC/
>      4.000901944    29,95,08,63,360      nest_powerbus0_imc/PM_PB_CYC/
>      5.001089119    29,92,07,92,320      nest_powerbus0_imc/PM_PB_CYC/
>      6.001276106    29,92,08,11,520      nest_powerbus0_imc/PM_PB_CYC/
> 
> $./perf stat -e 'nest_powerbus0_imc/PM_PB_CYC/' -C 16-28 -I 1000 sleep 1000
> #   time counts unit events
>  1.49902 nest_powerbus0_imc/PM_PB_CYC/
>  2.000147269 nest_powerbus0_imc/PM_PB_CYC/
>  3.000219730 nest_powerbus0_imc/PM_PB_CYC/
>  4.000288098 nest_powerbus0_imc/PM_PB_CYC/
>  5.000358716 nest_powerbus0_imc/PM_PB_CYC/
>  6.000435615 nest_powerbus0_imc/PM_PB_CYC/
>  7.000508481 nest_powerbus0_imc/PM_PB_CYC/
> 
> This is because, when fetching for the reference count, node id is used
> as the array index which is not how this is done when initializing the
> structure. Patch to fix the same by using the right index to get the
> nest_imc_refc.
> 
> $./perf stat -e 'nest_powerbus0_imc/PM_PB_CYC/' -C 16-28 -I 1000 sleep 1000
> #           time             counts unit events
>      1.000241961    26,12,35,28,704      nest_powerbus0_imc/PM_PB_CYC/
>      2.000451678    25,95,72,48,512      nest_powerbus0_imc/PM_PB_CYC/
>      3.000634963    25,93,13,96,608      nest_powerbus0_imc/PM_PB_CYC/
>      4.000821186    25,95,74,38,208      nest_powerbus0_imc/PM_PB_CYC/
>      5.001005221    25,93,13,30,048      nest_powerbus0_imc/PM_PB_CYC/
> 
> Signed-off-by: Anju T Sudhakar 
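
The indexing problem described in the quoted patch can be illustrated
with a small sketch.  It assumes, as the description suggests, that the
reference-count array holds one densely packed entry per node; the
struct, array, and lookup below are hypothetical, and the linear search
is a stand-in for illustration, not the method the patch uses.

```c
#include <stddef.h>

/* Hypothetical per-node reference count entry. */
struct nest_ref {
	int node_id;
	int refc;
};

/* Two nodes with discontiguous ids, as in the lscpu snippet (0 and 8). */
static struct nest_ref refs[] = { { 0, 0 }, { 8, 0 } };

#define NR_REFS (sizeof(refs) / sizeof(refs[0]))

/*
 * Look an entry up by its stored node id.  Using refs[node_id] instead
 * would read past the end of the two-entry array for node 8.
 */
static struct nest_ref *find_ref(int node_id)
{
	size_t i;

	for (i = 0; i < NR_REFS; i++)
		if (refs[i].node_id == node_id)
			return &refs[i];
	return NULL;
}
```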

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/7efbae90892b7858f1d4873d34

cheers


Re: powerpc/8xx: fix two CONFIG_8xx left behind

2017-08-16 Thread Michael Ellerman
On Mon, 2017-08-14 at 07:14:19 UTC, Christophe Leroy wrote:
> Commit 968159c0031ac ("powerpc/8xx: Getting rid of remaining
> use of CONFIG_8xx") removed all but 2 references to 8xx in
> Kconfigs.
> 
> This patch removes the two remaining ones.
> 
> Signed-off-by: Christophe Leroy 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/ab2675d6acec03aec4a866f7d293fd

cheers


Re: powerpc: store the intended structure

2017-08-16 Thread Michael Ellerman
On Sun, 2017-08-13 at 13:24:23 UTC, Julia Lawall wrote:
> Normally the values in the resource field and the argument to ARRAY_SIZE
> in the num_resources are the same.  In this case, the value in the resource
> field is the same as the one in the previous platform_device structure, and
> appears to be a copy-paste error.  Replace the value in the resource field
> with the argument to the local call to ARRAY_SIZE.
> 
> Signed-off-by: Julia Lawall 
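
The pattern the quoted patch describes can be shown with made-up data.
Everything in this sketch (struct names, device names, addresses) is
hypothetical; it only illustrates a resource pointer and an ARRAY_SIZE()
argument that disagree, and the corrected form where both name the same
array.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Made-up stand-ins for a resource and a platform device. */
struct res {
	unsigned long start, end;
};

struct pdev {
	const char *name;
	const struct res *resource;
	size_t num_resources;
};

static const struct res uart_res[] = { { 0x100, 0x10f } };
static const struct res timer_res[] = { { 0x200, 0x20f }, { 0x300, 0x30f } };

/* Copy-paste error: the pointer names the previous device's array. */
static const struct pdev timer_dev_buggy = {
	"timer", uart_res, ARRAY_SIZE(timer_res)
};

/* Fixed: the pointer and the ARRAY_SIZE() argument name the same array. */
static const struct pdev timer_dev_fixed = {
	"timer", timer_res, ARRAY_SIZE(timer_res)
};

/* True when a device's pointer and count both match the given array. */
static int resources_match(const struct pdev *dev, const struct res *arr,
			   size_t n)
{
	return dev->resource == arr && dev->num_resources == n;
}
```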

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/36992606eee8016c36ad2576687e97

cheers


Re: powerpc/perf: double unlock bug in imc_common_cpuhp_mem_free()

2017-08-16 Thread Michael Ellerman
On Fri, 2017-08-11 at 20:05:41 UTC, Dan Carpenter wrote:
> There is a typo so we call unlock instead of lock.
> 
> Fixes: 885dcd709ba9 ("powerpc/perf: Add nest IMC PMU support")
> Signed-off-by: Dan Carpenter 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/b3376dcc6c62452fe24e76d8fc35bb

cheers


Re: drivers/macintosh: make wf_control_ops and wf_pid_param const

2017-08-16 Thread Michael Ellerman
On Fri, 2017-08-11 at 17:38:45 UTC, Bhumika Goyal wrote:
> Make wf_control_ops const as they are only stored in the ops field of a
> wf_control structure, which is const.
> Make wf_pid_param const as they are only used during a copy operation.
> Done using Coccinelle.
> 
> Signed-off-by: Bhumika Goyal 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/1ad35f6e2864d9b52fe9705ede1730

cheers


Re: powerpc/iommu: Avoid undefined right shift in iommu_range_alloc()

2017-08-16 Thread Michael Ellerman
On Tue, 2017-08-08 at 07:06:32 UTC, Michael Ellerman wrote:
> In iommu_range_alloc() we generate a mask by right shifting ~0,
> however if the specified alignment is 0 then we right shift by 64,
> which is undefined. UBSAN tells us so:
> 
>   UBSAN: Undefined behaviour in ../arch/powerpc/kernel/iommu.c:193:35
>   shift exponent 64 is too large for 64-bit type 'long unsigned int'
> 
> We can avoid it by instead generating the mask with:
> 
>   align_mask = (1ull << align_order) - 1;
> 
> That will also generate an undefined shift if align_order is 64 or
> greater, but that shouldn't be a problem for a while.
> 
> Signed-off-by: Michael Ellerman 

Applied to powerpc next.

https://git.kernel.org/powerpc/c/63b85621d9aa6bdc410f01b22f7821

cheers
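
The two ways of building the mask can be compared in a small user-space sketch. The exact old expression is not quoted in the commit message, so the right-shift form below (shifting all-ones by 64 minus the order) is an assumption; the function names are made up for the example.

```c
#include <stdint.h>

#define DEMO_BITS_PER_LONG 64

/*
 * Assumed old form: shifting by 64 - align_order becomes a shift by 64
 * when align_order is 0, which C leaves undefined. The explicit guard
 * here exists only so that the sketch itself stays well defined.
 */
static uint64_t mask_by_right_shift(unsigned int align_order)
{
	if (align_order == 0)
		return 0;
	return ~UINT64_C(0) >> (DEMO_BITS_PER_LONG - align_order);
}

/* Form from the commit message: defined for 0 <= align_order < 64. */
static uint64_t mask_by_left_shift(unsigned int align_order)
{
	return (UINT64_C(1) << align_order) - 1;
}
```

Both forms agree for orders 1 through 63; only the left-shift form handles order 0 without a special case, at the price of being undefined at 64 and above, as the commit message notes.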


Re: [v3,2/2] powerpc/xmon: Disable tracing when entering xmon

2017-08-16 Thread Michael Ellerman
On Wed, 2017-08-02 at 20:14:06 UTC, Breno Leitao wrote:
> If tracing is enabled and you get into xmon, the tracing buffer
> continues to be updated, causing possible loss of data and unnecessary
> tracing information coming from xmon functions.
> 
> This patch simply disables tracing when entering xmon, and re-enables it
> if the kernel is resumed (with 'x').
> 
> Signed-off-by: Breno Leitao 
> Acked-by: Naveen N. Rao 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/ed49f7fd6438dcc8c93fa7d1d7d815

cheers


Re: [v3, 1/2] powerpc/xmon: Dump ftrace buffers for the current CPU only

2017-08-16 Thread Michael Ellerman
On Wed, 2017-08-02 at 20:14:05 UTC, Breno Leitao wrote:
> Current xmon 'dt' command dumps the tracing buffer for all the CPUs,
> which makes it very hard to read due to the fact that most of
> powerpc machines currently have many CPUs. Other than that, the CPU
> lines are interleaved in the ftrace log.
> 
> This new option just dumps the ftrace buffer for the current CPU.
> 
> Signed-off-by: Breno Leitao 
> Acked-by: Naveen N. Rao 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/4125d012ff9dafe6624197d8dbd237

cheers


Re: powerpc/xmon: Exclude all of xmon/ from ftrace

2017-08-16 Thread Michael Ellerman
On Wed, 2017-08-02 at 18:25:38 UTC, "Naveen N. Rao" wrote:
> Exclude core xmon files from ftrace (along with an xmon xive helper
> outside of xmon/) to minimize impact of ftrace while within xmon.
> 
> Before patch:
>   root@ubuntu:/sys/kernel/debug/tracing# cat available_filter_functions | grep -i xmon
>   xmon_xive_do_dump
>   xmon_dbgfs_get
>   xmon_print_symbol
>   xmon_show_stack
>   xmon_dbgfs_ops_open
>   xmon_init.part.2
>   xmon_dbgfs_set
>   sysrq_handle_xmon
>   xmon_fault_handler
>   cpus_are_in_xmon
>   xmon_core
>   xmon
>   xmon_irq
>   xmon_break_match
>   xmon_iabr_match
>   xmon_sstep
>   xmon_bpt
>   xmon_ipi
>   xmon_write
>   xmon_start_pagination
>   xmon_end_pagination
>   xmon_set_pagination_lpp
>   xmon_putchar
>   xmon_gets
>   xmon_printf
>   xmon_puts
>   root@ubuntu:/sys/kernel/debug/tracing#
> 
> After patch:
>   root@ubuntu:/sys/kernel/debug/tracing# cat available_filter_functions | grep -i xmon
>   root@ubuntu:/sys/kernel/debug/tracing#
> 
> Signed-off-by: Naveen N. Rao 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/e12d94f80614475b07d046eb095e6b

cheers


Re: [v3, 1/5] powerpc/mm: Ensure change_page_attr() doesn't invalidate pinned TLBs

2017-08-16 Thread Michael Ellerman
On Wed, 2017-08-02 at 13:51:01 UTC, Christophe Leroy wrote:
> __change_page_attr() uses flush_tlb_page().
> flush_tlb_page() uses tlbie instruction, which also invalidates
> pinned TLBs, which is not what we expect.
> 
> This patch modifies the implementation to use flush_tlb_kernel_range()
> instead. This will make use of tlbia which will preserve pinned TLBs.
> 
> Signed-off-by: Christophe Leroy 

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/e611939fc8ec13387018df88083de7

cheers


Re: powerpc/hugetlb: fix page rights verification in gup_hugepte()

2017-08-16 Thread Michael Ellerman
On Wed, 2017-07-12 at 15:03:42 UTC, Christophe Leroy wrote:
> gup_hugepte() checks if pages are present and readable, and
> when  'write' is set, also checks if the pages are writable.
> 
> Initially this was done by checking if _PAGE_PRESENT and
> _PAGE_READ were set. In addition, _PAGE_WRITE was verified for write
> accesses.
> 
> The problem is that we have to handle the three following cases:
> 1/ The target defines __PAGE_READ and __PAGE_WRITE
> 2/ The target defines __PAGE_RW
> 3/ The target defines __PAGE_RO
> 
> In case 1/, this is obvious
> In case 2/, __PAGE_READ is defined as 0 and __PAGE_WRITE as __PAGE_RW
> so it works as well.
> But in case 3, __PAGE_RW is defined as 0, which means __PAGE_WRITE is 0
> and then the test returns true (page writable) in all cases.
> 
> A first correction was attempted in commit 6b8cb66a6a7cc ("powerpc: Fix
> usage of _PAGE_RO in hugepage"), but that fix is wrong:
> instead of checking that the page is writable when write is requested,
> it checks that the page is NOT writable when write is NOT requested.
> 
> This patch adds a new pte_read() helper to check whether a page is
> readable or not. This avoids handling all possible cases in
> gup_hugepte().
> 
> Then gup_hugepte() is modified to use pte_present(), pte_read()
> and pte_write() instead of the raw flags.
> 
> Signed-off-by: Christophe Leroy 
> Reviewed-by: Aneesh Kumar K.V 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/ca8afd4046255ac046f8229d5159c6

cheers
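
The failure mode in case 3 can be reproduced in a few lines of user-space C. The bit values and helper names below are hypothetical, and the raw test is assumed to be a mask comparison; the sketch only illustrates why a flag defined as 0 makes such a test pass unconditionally, and how per-target helpers avoid that.

```c
#include <stdbool.h>

/* Hypothetical bit values for a target that only defines a read-only bit. */
#define DEMO_PAGE_PRESENT 0x1u
#define DEMO_PAGE_RO      0x2u
#define DEMO_PAGE_READ    0x0u   /* not provided by this target */
#define DEMO_PAGE_WRITE   0x0u   /* not provided by this target */

/* Raw-flag test: degenerates when DEMO_PAGE_WRITE is 0. */
static bool raw_flags_allow(unsigned int pte, bool write)
{
	unsigned int mask = DEMO_PAGE_PRESENT | DEMO_PAGE_READ;

	if (write)
		mask |= DEMO_PAGE_WRITE;
	return (pte & mask) == mask;
}

/* Helper-based test: each helper knows how this target encodes the right. */
static bool demo_pte_present(unsigned int pte) { return pte & DEMO_PAGE_PRESENT; }
/* Assumption for this sketch: every present page is readable here. */
static bool demo_pte_read(unsigned int pte)    { (void)pte; return true; }
static bool demo_pte_write(unsigned int pte)   { return !(pte & DEMO_PAGE_RO); }

static bool helpers_allow(unsigned int pte, bool write)
{
	if (!demo_pte_present(pte) || !demo_pte_read(pte))
		return false;
	return !write || demo_pte_write(pte);
}
```

With a present, read-only entry (0x3), the raw test grants a write request while the helper-based test refuses it.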


Re: [1/7] powerpc/8xx: Ensures RAM mapped with LTLB is seen as block mapped on 8xx.

2017-08-16 Thread Michael Ellerman
On Wed, 2017-07-12 at 10:08:45 UTC, Christophe Leroy wrote:
> On the 8xx, the RAM mapped with LTLBs must be seen as block mapped,
> just like areas mapped with BATs on standard PPC32.
> 
> Signed-off-by: Christophe Leroy 

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/eef784bbe775e66d2c21773a8c8263

cheers


Re: [PATCH v6 01/17] powerpc/vas: Define macros, register fields and structures

2017-08-16 Thread Michael Ellerman
Sukadev Bhattiprolu  writes:

> Nicholas Piggin [npig...@gmail.com] wrote:
>> On Mon, 14 Aug 2017 15:21:48 +1000
>> Michael Ellerman  wrote:
>> 
>> > Sukadev Bhattiprolu  writes:
>> 
>> > >  arch/powerpc/include/asm/vas.h   |  35 
>> > >  arch/powerpc/include/uapi/asm/vas.h  |  25 +++  
>> > 
>> > I thought we weren't exposing VAS to userspace yet?
>> > 
>> > If we are then we need to get things straight WRT copy/paste abort.
...
>
> In the FTW case, there is no data transfer from user space to the hardware.
> i.e the copy/paste submit a NULL CRB and hardware will be configured (see
> ->fifo_disable setting in winctx) to ignore any data they specify in the CRB.

I thought the copy did copy a cacheline, but then the paste to the VAS
window just ignores the contents, and doesn't allow userspace to get the
content in any way?

Which means we have two thirds of a covert channel, ie. something can be
copied into the copy buffer by one process, and then a second process
can paste it, but because it can only paste to foreign memory, and the
only foreign memory it can get is a VAS FTW window, it can't actually
see the content of the copy buffer.

> Would we be able to allow copy/paste from user space in that case?

Yeah I think so, but it is all a bit fragile.

cheers


Re: [PATCH 05/11] powerpc/topology: Remove the unused parent_node() macro

2017-08-16 Thread Michael Ellerman
Dou Liyang  writes:

> Hi Michael,
>
> At 07/27/2017 10:21 AM, Michael Ellerman wrote:
>> Dou Liyang  writes:
>>
>>> Commit a7be6e5a7f8d ("mm: drop useless local parameters of
>>> __register_one_node()") removes the last user of parent_node().
>>>
>>> The parent_node() macro in POWERPC platform is unnecessary.
>>>
>>> Remove it for cleanup.
>>>
>>> Reported-by: Michael Ellerman 
>>> Signed-off-by: Dou Liyang 
>>> Cc: Benjamin Herrenschmidt 
>>> Cc: Paul Mackerras 
>>> Cc: Michael Ellerman 
>>> Cc: linuxppc-dev@lists.ozlabs.org
>>> ---
>>>  arch/powerpc/include/asm/topology.h | 2 --
>>>  1 file changed, 2 deletions(-)
>>
>> Thanks for doing this series.
>
> It's my pleasure. :)

Seems other arch maintainers are merging these patches individually, so
I grabbed this one and put it in the powerpc tree.

cheers


Re: [PATCH] powerpc: Use reg.h values for program check reason codes

2017-08-16 Thread Cyril Bur
On Wed, 2017-08-16 at 10:52 +0200, Christophe LEROY wrote:
> Hi,
> 
> On 16/08/2017 at 08:50, Cyril Bur wrote:
> > Small amount of #define duplication, makes sense for these to be in
> > reg.h.
> > 
> > Signed-off-by: Cyril Bur 
> 
> Looks similar to the following applied commit, doesn't it?
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/commit/?h=merge&id=d30a5a5262ca64d58aa07fb2ecd7f992df83b4bc
> 

Oops, I think I'm based off Linus' tree. Sorry for the noise.


Cyril

*starts writing patch to rename to PROGTMBAD*... because clearly haha
;)

> Christophe
> 
> > ---
> >   arch/powerpc/include/asm/reg.h |  1 +
> >   arch/powerpc/kernel/traps.c| 10 +-
> >   2 files changed, 6 insertions(+), 5 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
> > index a3b6575c7842..c22b1ae5ad03 100644
> > --- a/arch/powerpc/include/asm/reg.h
> > +++ b/arch/powerpc/include/asm/reg.h
> > @@ -675,6 +675,7 @@
> >   * may not be recoverable */
> >   #define SRR1_WS_DEEPER0x0002 /* Some resources not 
> > maintained */
> >   #define SRR1_WS_DEEP  0x0001 /* All resources maintained 
> > */
> > +#define   SRR1_PROGTMBAD   0x0020 /* TM Bad Thing */
> >   #define   SRR1_PROGFPE0x0010 /* Floating Point Enabled */
> >   #define   SRR1_PROGILL0x0008 /* Illegal instruction */
> >   #define   SRR1_PROGPRIV   0x0004 /* Privileged instruction */
> > diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> > index 1f7ec178db05..0a5ddaea8bf1 100644
> > --- a/arch/powerpc/kernel/traps.c
> > +++ b/arch/powerpc/kernel/traps.c
> > @@ -416,11 +416,11 @@ static inline int check_io_access(struct pt_regs 
> > *regs)
> >  exception is in the MSR. */
> >   #define get_reason(regs)  ((regs)->msr)
> >   #define get_mc_reason(regs)   ((regs)->msr)
> > -#define REASON_TM  0x20
> > -#define REASON_FP  0x10
> > -#define REASON_ILLEGAL 0x8
> > -#define REASON_PRIVILEGED  0x4
> > -#define REASON_TRAP0x2
> > +#define REASON_TM  SRR1_PROGTMBAD
> > +#define REASON_FP  SRR1_PROGFPE
> > +#define REASON_ILLEGAL SRR1_PROGILL
> > +#define REASON_PRIVILEGED  SRR1_PROGPRIV
> > +#define REASON_TRAPSRR1_PROGTRAP
> >   
> >   #define single_stepping(regs) ((regs)->msr & MSR_SE)
> >   #define clear_single_step(regs)   ((regs)->msr &= ~MSR_SE)
> > 


Re: [PATCH] powerpc: Use reg.h values for program check reason codes

2017-08-16 Thread Christophe LEROY

Hi,

On 16/08/2017 at 08:50, Cyril Bur wrote:

Small amount of #define duplication, makes sense for these to be in
reg.h.

Signed-off-by: Cyril Bur 


Looks similar to the following applied commit, doesn't it?

https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/commit/?h=merge&id=d30a5a5262ca64d58aa07fb2ecd7f992df83b4bc

Christophe


---
  arch/powerpc/include/asm/reg.h |  1 +
  arch/powerpc/kernel/traps.c| 10 +-
  2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index a3b6575c7842..c22b1ae5ad03 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -675,6 +675,7 @@
  * may not be recoverable */
  #define SRR1_WS_DEEPER0x0002 /* Some resources not 
maintained */
  #define SRR1_WS_DEEP  0x0001 /* All resources maintained 
*/
+#define   SRR1_PROGTMBAD   0x0020 /* TM Bad Thing */
  #define   SRR1_PROGFPE0x0010 /* Floating Point Enabled */
  #define   SRR1_PROGILL0x0008 /* Illegal instruction */
  #define   SRR1_PROGPRIV   0x0004 /* Privileged instruction */
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 1f7ec178db05..0a5ddaea8bf1 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -416,11 +416,11 @@ static inline int check_io_access(struct pt_regs *regs)
 exception is in the MSR. */
  #define get_reason(regs)  ((regs)->msr)
  #define get_mc_reason(regs)   ((regs)->msr)
-#define REASON_TM  0x20
-#define REASON_FP  0x10
-#define REASON_ILLEGAL 0x8
-#define REASON_PRIVILEGED  0x4
-#define REASON_TRAP0x2
+#define REASON_TM  SRR1_PROGTMBAD
+#define REASON_FP  SRR1_PROGFPE
+#define REASON_ILLEGAL SRR1_PROGILL
+#define REASON_PRIVILEGED  SRR1_PROGPRIV
+#define REASON_TRAPSRR1_PROGTRAP
  
  #define single_stepping(regs)	((regs)->msr & MSR_SE)
  #define clear_single_step(regs)   ((regs)->msr &= ~MSR_SE)



[PATCH 1/3] ALSA: aoa: make snd_kcontrol_new const

2017-08-16 Thread Bhumika Goyal
Make these const as they are only used during a copy operation.
Done using Coccinelle.

Signed-off-by: Bhumika Goyal 
---
 sound/aoa/codecs/onyx.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/sound/aoa/codecs/onyx.c b/sound/aoa/codecs/onyx.c
index a04edff..d2d96ca 100644
--- a/sound/aoa/codecs/onyx.c
+++ b/sound/aoa/codecs/onyx.c
@@ -167,7 +167,7 @@ static int onyx_snd_vol_put(struct snd_kcontrol *kcontrol,
return 1;
 }
 
-static struct snd_kcontrol_new volume_control = {
+static const struct snd_kcontrol_new volume_control = {
.iface = SNDRV_CTL_ELEM_IFACE_MIXER,
.name = "Master Playback Volume",
.access = SNDRV_CTL_ELEM_ACCESS_READWRITE,
@@ -229,7 +229,7 @@ static int onyx_snd_inputgain_put(struct snd_kcontrol 
*kcontrol,
return n != v;
 }
 
-static struct snd_kcontrol_new inputgain_control = {
+static const struct snd_kcontrol_new inputgain_control = {
.iface = SNDRV_CTL_ELEM_IFACE_MIXER,
.name = "Master Capture Volume",
.access = SNDRV_CTL_ELEM_ACCESS_READWRITE,
@@ -284,7 +284,7 @@ static int onyx_snd_capture_source_put(struct snd_kcontrol 
*kcontrol,
return 1;
 }
 
-static struct snd_kcontrol_new capture_source_control = {
+static const struct snd_kcontrol_new capture_source_control = {
.iface = SNDRV_CTL_ELEM_IFACE_MIXER,
/* If we name this 'Input Source', it properly shows up in
 * alsamixer as a selection, * but it's shown under the
@@ -348,7 +348,7 @@ static int onyx_snd_mute_put(struct snd_kcontrol *kcontrol,
return !err ? (v != c) : err;
 }
 
-static struct snd_kcontrol_new mute_control = {
+static const struct snd_kcontrol_new mute_control = {
.iface = SNDRV_CTL_ELEM_IFACE_MIXER,
.name = "Master Playback Switch",
.access = SNDRV_CTL_ELEM_ACCESS_READWRITE,
@@ -476,7 +476,7 @@ static int onyx_spdif_mask_get(struct snd_kcontrol 
*kcontrol,
return 0;
 }
 
-static struct snd_kcontrol_new onyx_spdif_mask = {
+static const struct snd_kcontrol_new onyx_spdif_mask = {
.access =   SNDRV_CTL_ELEM_ACCESS_READ,
.iface =SNDRV_CTL_ELEM_IFACE_PCM,
.name = SNDRV_CTL_NAME_IEC958("",PLAYBACK,CON_MASK),
@@ -533,7 +533,7 @@ static int onyx_spdif_put(struct snd_kcontrol *kcontrol,
return 1;
 }
 
-static struct snd_kcontrol_new onyx_spdif_ctrl = {
+static const struct snd_kcontrol_new onyx_spdif_ctrl = {
.access =   SNDRV_CTL_ELEM_ACCESS_READWRITE,
.iface =SNDRV_CTL_ELEM_IFACE_PCM,
.name = SNDRV_CTL_NAME_IEC958("",PLAYBACK,DEFAULT),
-- 
1.9.1



[PATCH 3/3] ALSA: hda: make snd_kcontrol_new const

2017-08-16 Thread Bhumika Goyal
Make these const as they are only passed as the 3rd argument to the
function snd_hda_gen_add_kctl, which is of type const.
Done using Coccinelle.

Signed-off-by: Bhumika Goyal 
---
 sound/pci/hda/patch_analog.c   | 4 ++--
 sound/pci/hda/patch_sigmatel.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/sound/pci/hda/patch_analog.c b/sound/pci/hda/patch_analog.c
index e0fb8c6..7578573 100644
--- a/sound/pci/hda/patch_analog.c
+++ b/sound/pci/hda/patch_analog.c
@@ -505,7 +505,7 @@ static int ad1983_auto_smux_enum_put(struct snd_kcontrol 
*kcontrol,
return 1;
 }
 
-static struct snd_kcontrol_new ad1983_auto_smux_mixer = {
+static const struct snd_kcontrol_new ad1983_auto_smux_mixer = {
.iface = SNDRV_CTL_ELEM_IFACE_MIXER,
.name = "IEC958 Playback Source",
.info = ad1983_auto_smux_enum_info,
@@ -788,7 +788,7 @@ static int ad1988_auto_smux_enum_put(struct snd_kcontrol 
*kcontrol,
return 1;
 }
 
-static struct snd_kcontrol_new ad1988_auto_smux_mixer = {
+static const struct snd_kcontrol_new ad1988_auto_smux_mixer = {
.iface = SNDRV_CTL_ELEM_IFACE_MIXER,
.name = "IEC958 Playback Source",
.info = ad1988_auto_smux_enum_info,
diff --git a/sound/pci/hda/patch_sigmatel.c b/sound/pci/hda/patch_sigmatel.c
index 6cefdf6..63d15b5 100644
--- a/sound/pci/hda/patch_sigmatel.c
+++ b/sound/pci/hda/patch_sigmatel.c
@@ -961,7 +961,7 @@ static int stac_smux_enum_put(struct snd_kcontrol *kcontrol,
 &spec->cur_smux[smux_idx]);
 }
 
-static struct snd_kcontrol_new stac_smux_mixer = {
+static const struct snd_kcontrol_new stac_smux_mixer = {
.iface = SNDRV_CTL_ELEM_IFACE_MIXER,
.name = "IEC958 Playback Source",
/* count set later */
-- 
1.9.1



[PATCH 2/3] ALSA: pcxhr: make snd_kcontrol_new const

2017-08-16 Thread Bhumika Goyal
Make these const as they are only used during a copy operation.
Done using Coccinelle.

Signed-off-by: Bhumika Goyal 
---
 sound/pci/pcxhr/pcxhr_mixer.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/sound/pci/pcxhr/pcxhr_mixer.c b/sound/pci/pcxhr/pcxhr_mixer.c
index 36875df..d9a1c6c 100644
--- a/sound/pci/pcxhr/pcxhr_mixer.c
+++ b/sound/pci/pcxhr/pcxhr_mixer.c
@@ -185,7 +185,7 @@ static int pcxhr_analog_vol_put(struct snd_kcontrol 
*kcontrol,
return changed;
 }
 
-static struct snd_kcontrol_new pcxhr_control_analog_level = {
+static const struct snd_kcontrol_new pcxhr_control_analog_level = {
.iface =SNDRV_CTL_ELEM_IFACE_MIXER,
.access =   (SNDRV_CTL_ELEM_ACCESS_READWRITE |
 SNDRV_CTL_ELEM_ACCESS_TLV_READ),
@@ -409,7 +409,7 @@ static int pcxhr_pcm_vol_put(struct snd_kcontrol *kcontrol,
return changed;
 }
 
-static struct snd_kcontrol_new snd_pcxhr_pcm_vol =
+static const struct snd_kcontrol_new snd_pcxhr_pcm_vol =
 {
.iface =SNDRV_CTL_ELEM_IFACE_MIXER,
.access =   (SNDRV_CTL_ELEM_ACCESS_READWRITE |
-- 
1.9.1



[PATCH 0/3] ALSA: make snd_kcontrol_new const

2017-08-16 Thread Bhumika Goyal
Make these structures const. Done using Coccinelle.

Bhumika Goyal (3):
  ALSA: aoa: make snd_kcontrol_new const
  ALSA: pcxhr: make snd_kcontrol_new const
  ALSA: hda: make snd_kcontrol_new const

 sound/aoa/codecs/onyx.c| 12 ++--
 sound/pci/hda/patch_analog.c   |  4 ++--
 sound/pci/hda/patch_sigmatel.c |  2 +-
 sound/pci/pcxhr/pcxhr_mixer.c  |  4 ++--
 4 files changed, 11 insertions(+), 11 deletions(-)

-- 
1.9.1



Re: [patch net-next 0/3] net/sched: Improve getting objects by indexes

2017-08-16 Thread Christian König

On 16.08.2017 at 04:12, Chris Mi wrote:

Using current TC code, it is very slow to insert a lot of rules.

In order to improve the rules update rate in TC,
we introduced the following two changes:
 1) changed cls_flower to use IDR to manage the filters.
 2) changed all act_xxx modules to use IDR instead of
a small hash table

But IDR has a limitation that it uses int. TC handle uses u32.
To make sure there is no regression, we also changed IDR to use
unsigned long. All clients of IDR are changed to use new IDR API.


WOW, wait a second. The idr change is touching a lot of drivers and to 
be honest doesn't look correct at all.


Just look at the first chunk of your modification:

@@ -998,8 +999,9 @@ int bsg_register_queue(struct request_queue *q, struct device *parent,
  
  	mutex_lock(_mutex);
  
-	ret = idr_alloc(_minor_idr, bcd, 0, BSG_MAX_DEVS, GFP_KERNEL);
-   if (ret < 0) {
+   ret = idr_alloc(_minor_idr, bcd, _index, 0, BSG_MAX_DEVS,
+   GFP_KERNEL);
+   if (ret) {
 if (ret == -ENOSPC) {
 printk(KERN_ERR "bsg: too many bsg devices\n");
 ret = -EINVAL;

The condition "if (ret)" will now always be true after the first 
allocation and so we always run into the error handling after that.


I've never read the bsg code before, but that's certainly not correct. 
And that incorrect pattern repeats over and over again in this code.


Apart from that why the heck do you want to allocate more than 1<<31 
handles?


Regards,
Christian.
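
The concern about the error check comes down to which return convention the allocator follows. The sketch below contrasts the two with a toy allocator; the function names are hypothetical, and which convention the reworked idr_alloc() actually uses is not shown in this message.

```c
#include <errno.h>

#define DEMO_MAX_IDS 4

static int demo_next_id;

/* Convention A: return the new id (>= 0) or a negative errno. */
static int demo_alloc_returns_id(void)
{
	if (demo_next_id >= DEMO_MAX_IDS)
		return -ENOSPC;
	return demo_next_id++;
}

/* Convention B: return 0 or a negative errno; the id is stored in *out. */
static int demo_alloc_out_param(unsigned long *out)
{
	if (demo_next_id >= DEMO_MAX_IDS)
		return -ENOSPC;
	*out = (unsigned long)demo_next_id++;
	return 0;
}
```

Under convention A the caller has to test "ret < 0", because every successful call after the first returns a non-zero id and a bare "if (ret)" would treat it as a failure. Under convention B a bare "if (ret)" is the right test. A conversion is only correct if each call site's check matches the convention of the function it now calls.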



Chris Mi (3):
   idr: Use unsigned long instead of int
   net/sched: Change cls_flower to use IDR
   net/sched: Change act_api and act_xxx modules to use IDR

  block/bsg.c |   8 +-
  block/genhd.c   |  12 +-
  drivers/atm/nicstar.c   |  11 +-
  drivers/block/drbd/drbd_main.c  |  31 +--
  drivers/block/drbd/drbd_nl.c|  22 ++-
  drivers/block/drbd/drbd_proc.c  |   3 +-
  drivers/block/drbd/drbd_receiver.c  |  15 +-
  drivers/block/drbd/drbd_state.c |  34 ++--
  drivers/block/drbd/drbd_worker.c|   6 +-
  drivers/block/loop.c|  17 +-
  drivers/block/nbd.c |  20 +-
  drivers/block/zram/zram_drv.c   |   9 +-
  drivers/char/tpm/tpm-chip.c |  10 +-
  drivers/char/tpm/tpm.h  |   2 +-
  drivers/dca/dca-sysfs.c |   9 +-
  drivers/firewire/core-cdev.c|  18 +-
  drivers/firewire/core-device.c  |  15 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c |   8 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c |   9 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c |   6 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c |   2 +-
  drivers/gpu/drm/drm_auth.c  |   9 +-
  drivers/gpu/drm/drm_connector.c |  10 +-
  drivers/gpu/drm/drm_context.c   |  20 +-
  drivers/gpu/drm/drm_dp_aux_dev.c|  11 +-
  drivers/gpu/drm/drm_drv.c   |   6 +-
  drivers/gpu/drm/drm_gem.c   |  19 +-
  drivers/gpu/drm/drm_info.c  |   2 +-
  drivers/gpu/drm/drm_mode_object.c   |  11 +-
  drivers/gpu/drm/drm_syncobj.c   |  18 +-
  drivers/gpu/drm/exynos/exynos_drm_ipp.c |  25 ++-
  drivers/gpu/drm/i915/gvt/display.c  |   2 +-
  drivers/gpu/drm/i915/gvt/kvmgt.c|   2 +-
  drivers/gpu/drm/i915/gvt/vgpu.c |   9 +-
  drivers/gpu/drm/i915/i915_debugfs.c |   6 +-
  drivers/gpu/drm/i915/i915_gem_context.c |   9 +-
  drivers/gpu/drm/qxl/qxl_cmd.c   |   8 +-
  drivers/gpu/drm/qxl/qxl_release.c   |  14 +-
  drivers/gpu/drm/sis/sis_mm.c|   8 +-
  drivers/gpu/drm/tegra/drm.c |  10 +-
  drivers/gpu/drm/tilcdc/tilcdc_slave_compat.c|   3 +-
  drivers/gpu/drm/vgem/vgem_fence.c   |  12 +-
  drivers/gpu/drm/via/via_mm.c|   8 +-
  drivers/gpu/drm/virtio/virtgpu_kms.c|   5 +-
  drivers/gpu/drm/virtio/virtgpu_vq.c |   5 +-
  drivers/gpu/drm/vmwgfx/vmwgfx_resource.c|   9 +-
  drivers/i2c/i2c-core-base.c |  19 +-
  drivers/infiniband/core/cm.c|   8 +-
  drivers/infiniband/core/cma.c   |  12 +-
  drivers/infiniband/core/rdma_core.c |   9 +-
  drivers/infiniband/core/sa_query.c  |  23 +--
  drivers/infiniband/core/ucm.c   |   7 +-
  drivers/infiniband/core/ucma.c  |  14 +-
  drivers/infiniband/hw/cxgb3/iwch.c  |   4 +-
  drivers/infiniband/hw/cxgb3/iwch.h  |   4 +-
  

[PATCH] powerpc: Use reg.h values for program check reason codes

2017-08-16 Thread Cyril Bur
Small amount of #define duplication, makes sense for these to be in
reg.h.

Signed-off-by: Cyril Bur 
---
 arch/powerpc/include/asm/reg.h |  1 +
 arch/powerpc/kernel/traps.c| 10 +-
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index a3b6575c7842..c22b1ae5ad03 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -675,6 +675,7 @@
  * may not be recoverable */
 #define  SRR1_WS_DEEPER0x0002 /* Some resources not maintained */
 #define  SRR1_WS_DEEP  0x0001 /* All resources maintained */
+#define   SRR1_PROGTMBAD   0x0020 /* TM Bad Thing */
 #define   SRR1_PROGFPE 0x0010 /* Floating Point Enabled */
 #define   SRR1_PROGILL 0x0008 /* Illegal instruction */
 #define   SRR1_PROGPRIV0x0004 /* Privileged instruction */
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 1f7ec178db05..0a5ddaea8bf1 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -416,11 +416,11 @@ static inline int check_io_access(struct pt_regs *regs)
exception is in the MSR. */
 #define get_reason(regs)   ((regs)->msr)
 #define get_mc_reason(regs)((regs)->msr)
-#define REASON_TM  0x20
-#define REASON_FP  0x10
-#define REASON_ILLEGAL 0x8
-#define REASON_PRIVILEGED  0x4
-#define REASON_TRAP0x2
+#define REASON_TM  SRR1_PROGTMBAD
+#define REASON_FP  SRR1_PROGFPE
+#define REASON_ILLEGAL SRR1_PROGILL
+#define REASON_PRIVILEGED  SRR1_PROGPRIV
+#define REASON_TRAPSRR1_PROGTRAP
 
 #define single_stepping(regs)  ((regs)->msr & MSR_SE)
 #define clear_single_step(regs)((regs)->msr &= ~MSR_SE)
-- 
2.14.1



Re: should "linux-phandle" be "linux,phandle"?

2017-08-16 Thread Michael Ellerman
rpj...@crashcourse.ca writes:

>pedantic nitpickery but, in arch/powerpc/kernel/prom_init.c, line 2426,
> should that diagnostic message print "<linux,phandle>" and not
> "<linux-phandle>"?

Yeah I guess.

AFAICS it can't happen though, we created the "linux,phandle" string
just prior, around line 2488.

So it could just be:

soff = dt_find_string("linux,phandle");
if (soff == 0)
prom_printf("WARNING: Wat?!\n");
else
...

cheers


[PATCH 5/5] powerpc: Remove more redundant VSX save/tests

2017-08-16 Thread Benjamin Herrenschmidt
__giveup_vsx/save_vsx are completely equivalent to testing MSR_FP
and MSR_VEC and calling the corresponding giveup/save function so
just remove the spurious VSX cases. Also add WARN_ONs checking that
we never have VSX enabled without the other two.

Signed-off-by: Benjamin Herrenschmidt 
---
 arch/powerpc/kernel/process.c | 33 -
 1 file changed, 8 insertions(+), 25 deletions(-)

diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index fc285fab7118..7093b46b3603 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -355,14 +355,6 @@ static void giveup_vsx(struct task_struct *tsk)
msr_check_and_clear(MSR_FP|MSR_VEC|MSR_VSX);
 }
 
-static void save_vsx(struct task_struct *tsk)
-{
-   if (tsk->thread.regs->msr & MSR_FP)
-   save_fpu(tsk);
-   if (tsk->thread.regs->msr & MSR_VEC)
-   save_altivec(tsk);
-}
-
 void enable_kernel_vsx(void)
 {
unsigned long cpumsr;
@@ -412,7 +404,6 @@ static int restore_vsx(struct task_struct *tsk)
 }
 #else
 static inline int restore_vsx(struct task_struct *tsk) { return 0; }
-static inline void save_vsx(struct task_struct *tsk) { }
 #endif /* CONFIG_VSX */
 
 #ifdef CONFIG_SPE
@@ -492,6 +483,8 @@ void giveup_all(struct task_struct *tsk)
msr_check_and_set(msr_all_available);
check_if_tm_restore_required(tsk);
 
+   WARN_ON((usermsr & MSR_VSX) && !((usermsr & MSR_FP) && (usermsr & MSR_VEC)));
+
 #ifdef CONFIG_PPC_FPU
if (usermsr & MSR_FP)
__giveup_fpu(tsk);
@@ -500,10 +493,6 @@ void giveup_all(struct task_struct *tsk)
if (usermsr & MSR_VEC)
__giveup_altivec(tsk);
 #endif
-#ifdef CONFIG_VSX
-   if (usermsr & MSR_VSX)
-   __giveup_vsx(tsk);
-#endif
 #ifdef CONFIG_SPE
if (usermsr & MSR_SPE)
__giveup_spe(tsk);
@@ -562,19 +551,13 @@ void save_all(struct task_struct *tsk)
 
msr_check_and_set(msr_all_available);
 
-   /*
-* Saving the way the register space is in hardware, save_vsx boils
-* down to a save_fpu() and save_altivec()
-*/
-   if (usermsr & MSR_VSX) {
-   save_vsx(tsk);
-   } else {
-   if (usermsr & MSR_FP)
-   save_fpu(tsk);
+   WARN_ON((usermsr & MSR_VSX) && !((usermsr & MSR_FP) && (usermsr & MSR_VEC)));
 
-   if (usermsr & MSR_VEC)
-   save_altivec(tsk);
-   }
+   if (usermsr & MSR_FP)
+   save_fpu(tsk);
+
+   if (usermsr & MSR_VEC)
+   save_altivec(tsk);
 
if (usermsr & MSR_SPE)
__giveup_spe(tsk);
-- 
2.13.4
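
The condition inside the new WARN_ON() can be checked in isolation. The sketch below uses hypothetical bit positions (the real MSR_* values live in the kernel's reg.h) and a made-up function name; it only evaluates the same boolean expression over all combinations.

```c
#include <stdbool.h>

/* Hypothetical bit positions, chosen only for this sketch. */
#define DEMO_MSR_FP  (1ul << 0)
#define DEMO_MSR_VEC (1ul << 1)
#define DEMO_MSR_VSX (1ul << 2)

/* True when the state would trip the warning: VSX set without both FP and VEC. */
static bool demo_vsx_invariant_violated(unsigned long msr)
{
	return (msr & DEMO_MSR_VSX) &&
	       !((msr & DEMO_MSR_FP) && (msr & DEMO_MSR_VEC));
}
```

Of the eight combinations of the three bits, only VSX alone, VSX with FP, and VSX with VEC are flagged; once the invariant holds, testing FP and VEC separately covers everything a dedicated VSX branch would have done.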



[PATCH 2/5] powerpc: Fix missing CR before {

2017-08-16 Thread Benjamin Herrenschmidt
Signed-off-by: Benjamin Herrenschmidt 
---
 arch/powerpc/kernel/process.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 883216b4296a..14b9a3c46c5d 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -230,7 +230,8 @@ void enable_kernel_fp(void)
 }
 EXPORT_SYMBOL(enable_kernel_fp);
 
-static int restore_fp(struct task_struct *tsk) {
+static int restore_fp(struct task_struct *tsk)
+{
if (tsk->thread.load_fp || msr_tm_active(tsk->thread.regs->msr)) {
load_fp_state(&current->thread.fp_state);
current->thread.load_fp++;
-- 
2.13.4



Re: [PATCH 1/5] powerpc: Test MSR_FP and MSR_VEC when enabling/flushing VSX

2017-08-16 Thread Benjamin Herrenschmidt
On Wed, 2017-08-16 at 16:01 +1000, Benjamin Herrenschmidt wrote:
> VSX uses a combination of the old vector registers, the old FP registers
> and new "second halves" of the FP registers.
> 
> Thus when we need to see the VSX state in the thread struct
> (flush_vsx_to_thread) or when we'll use the VSX in the kernel
> (enable_kernel_vsx) we need to ensure they are all flushed into
> the thread struct if either of them is individually enabled.
> 
> Unfortunately we only tested if the whole VSX was enabled, not
> if they were individually enabled.
> 
> Signed-off-by: Benjamin Herrenschmidt 

And CC stable.

> ---
>  arch/powerpc/kernel/process.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> index 9f3e2c932dcc..883216b4296a 100644
> --- a/arch/powerpc/kernel/process.c
> +++ b/arch/powerpc/kernel/process.c
> @@ -362,7 +362,8 @@ void enable_kernel_vsx(void)
>  
>   cpumsr = msr_check_and_set(MSR_FP|MSR_VEC|MSR_VSX);
>  
> - if (current->thread.regs && (current->thread.regs->msr & MSR_VSX)) {
> + if (current->thread.regs &&
> + (current->thread.regs->msr & (MSR_VSX|MSR_VEC|MSR_FP))) {
>   check_if_tm_restore_required(current);
>   /*
>* If a thread has already been reclaimed then the
> @@ -386,7 +387,7 @@ void flush_vsx_to_thread(struct task_struct *tsk)
>  {
>   if (tsk->thread.regs) {
>   preempt_disable();
> - if (tsk->thread.regs->msr & MSR_VSX) {
> + if (tsk->thread.regs->msr & (MSR_VSX|MSR_VEC|MSR_FP)) {
>   BUG_ON(tsk != current);
>   giveup_vsx(tsk);
>   }


[PATCH 1/5] powerpc: Test MSR_FP and MSR_VEC when enabling/flushing VSX

2017-08-16 Thread Benjamin Herrenschmidt
VSX uses a combination of the old vector registers, the old FP registers
and new "second halves" of the FP registers.

Thus when we need to see the VSX state in the thread struct
(flush_vsx_to_thread) or when we'll use the VSX in the kernel
(enable_kernel_vsx) we need to ensure they are all flushed into
the thread struct if either of them is individually enabled.

Unfortunately we only tested if the whole VSX was enabled, not
if they were individually enabled.

Signed-off-by: Benjamin Herrenschmidt 
---
 arch/powerpc/kernel/process.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 9f3e2c932dcc..883216b4296a 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -362,7 +362,8 @@ void enable_kernel_vsx(void)
 
cpumsr = msr_check_and_set(MSR_FP|MSR_VEC|MSR_VSX);
 
-   if (current->thread.regs && (current->thread.regs->msr & MSR_VSX)) {
+   if (current->thread.regs &&
+   (current->thread.regs->msr & (MSR_VSX|MSR_VEC|MSR_FP))) {
check_if_tm_restore_required(current);
/*
 * If a thread has already been reclaimed then the
@@ -386,7 +387,7 @@ void flush_vsx_to_thread(struct task_struct *tsk)
 {
if (tsk->thread.regs) {
preempt_disable();
-   if (tsk->thread.regs->msr & MSR_VSX) {
+   if (tsk->thread.regs->msr & (MSR_VSX|MSR_VEC|MSR_FP)) {
BUG_ON(tsk != current);
giveup_vsx(tsk);
}
-- 
2.13.4
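
The difference between the old and new tests is easiest to see on a task that has only FP enabled. The sketch below is a user-space illustration with hypothetical bit positions and function names; it is not the kernel code.

```c
#include <stdbool.h>

/* Hypothetical bit positions, chosen only for this sketch. */
#define DEMO_MSR_FP  (1ul << 0)
#define DEMO_MSR_VEC (1ul << 1)
#define DEMO_MSR_VSX (1ul << 2)

/* Old test: flush only when the VSX bit itself is set. */
static bool demo_flush_needed_old(unsigned long msr)
{
	return (msr & DEMO_MSR_VSX) != 0;
}

/* New test: flush when any of the three facilities holds live state. */
static bool demo_flush_needed_new(unsigned long msr)
{
	return (msr & (DEMO_MSR_VSX | DEMO_MSR_VEC | DEMO_MSR_FP)) != 0;
}
```

With only the FP bit set, the old test skips the flush even though half of the VSX register file is live in the FP registers; the new test catches that case.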



[PATCH 3/5] powerpc: Remove redundant fp/altivec giveup code

2017-08-16 Thread Benjamin Herrenschmidt
__giveup_vsx already calls those two functions

Signed-off-by: Benjamin Herrenschmidt 
---
 arch/powerpc/kernel/process.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 14b9a3c46c5d..bfbd6083f841 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -375,10 +375,6 @@ void enable_kernel_vsx(void)
 */
if(!msr_tm_active(cpumsr) && msr_tm_active(current->thread.regs->msr))
return;
-   if (current->thread.regs->msr & MSR_FP)
-   __giveup_fpu(current);
-   if (current->thread.regs->msr & MSR_VEC)
-   __giveup_altivec(current);
__giveup_vsx(current);
}
 }
-- 
2.13.4



[PATCH 4/5] powerpc: Remove redundant clear of MSR_VSX in __giveup_vsx()

2017-08-16 Thread Benjamin Herrenschmidt
__giveup_fpu() already does it and we cannot have MSR_VSX set
without having MSR_FP also set.

This also adds a warning to check that we indeed do.

Signed-off-by: Benjamin Herrenschmidt 
---
 arch/powerpc/kernel/process.c | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index bfbd6083f841..fc285fab7118 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -331,11 +331,19 @@ static inline int restore_altivec(struct task_struct *tsk) { return 0; }
 #ifdef CONFIG_VSX
 static void __giveup_vsx(struct task_struct *tsk)
 {
-   if (tsk->thread.regs->msr & MSR_FP)
+   unsigned long msr = tsk->thread.regs->msr;
+
+   /*
+* We should never be setting MSR_VSX without also setting
+* MSR_FP and MSR_VEC
+*/
+   WARN_ON((msr & MSR_VSX) && !((msr & MSR_FP) && (msr & MSR_VEC)));
+
+   /* __giveup_fpu will clear MSR_VSX */
+   if (msr & MSR_FP)
__giveup_fpu(tsk);
-   if (tsk->thread.regs->msr & MSR_VEC)
+   if (msr & MSR_VEC)
__giveup_altivec(tsk);
-   tsk->thread.regs->msr &= ~MSR_VSX;
 }
 
 static void giveup_vsx(struct task_struct *tsk)
-- 
2.13.4