Re: [PATCH 37/61] perf: simplify getting .drvdata

2018-04-24 Thread Will Deacon
On Thu, Apr 19, 2018 at 04:06:07PM +0200, Wolfram Sang wrote:
> We should get drvdata from struct device directly. Going via
> platform_device is an unneeded step back and forth.
> 
> Signed-off-by: Wolfram Sang 
> ---
> 
> Build tested only. buildbot is happy. Please apply individually.

Thanks, Wolfram. I'll pick this up via the arm perf tree.

Will


Re: [PATCHv2 6/8] arm_pmu: explicitly enable/disable SPIs at hotplug

2018-02-26 Thread Will Deacon
On Mon, Feb 26, 2018 at 04:16:19PM +0100, Geert Uytterhoeven wrote:
> On Mon, Feb 5, 2018 at 5:42 PM, Mark Rutland  wrote:
> > To support ACPI systems, we need to request IRQs before CPUs are
> > hotplugged, and thus we need to request IRQs before we know their
> > associated PMU.
> >
> > This is problematic if a PMU IRQ is pending out of reset, as it may be
> > taken before we know the PMU, and thus the IRQ handler won't be able to
> > handle it, leaving it screaming.
> >
> > To avoid such problems, lets request all IRQs in a disabled state, and
> > explicitly enable/disable them at hotplug time, when we're sure the PMU
> > has been probed.
> >
> > Signed-off-by: Mark Rutland 
> 
> This is now commit 6de3f79112cc26bf in v4.16-rc3, and causes a BUG during
> CPU offlining (e.g. during system suspend, or during boot with
> CONFIG_ARM_PSCI_CHECKER=y).
> 
> With CONFIG_ARM_PSCI_CHECKER=y:
> 
> psci_checker: PSCI checker started using 6 CPUs
> psci_checker: Starting hotplug tests
> psci_checker: Trying to turn off and on again all CPUs
> BUG: sleeping function called from invalid context at kernel/irq/manage.c:112
> in_atomic(): 1, irqs_disabled(): 128, pid: 15, name: migration/1
> no locks held by migration/1/15.
> irq event stamp: 192
> hardirqs last  enabled at (191): [<803c2507>]
> _raw_spin_unlock_irq+0x2c/0x4c
> hardirqs last disabled at (192): [<7f57ad28>] 
> multi_cpu_stop+0x9c/0x140
> softirqs last  enabled at (0): [<04ee1b58>]
> copy_process.isra.77.part.78+0x43c/0x1504
> softirqs last disabled at (0): [<  (null)>]   (null)
> CPU: 1 PID: 15 Comm: migration/1 Not tainted 4.16.0-rc3-salvator-x #1651
> Hardware name: Renesas Salvator-X board based on r8a7796 (DT)
> Call trace:
>  dump_backtrace+0x0/0x140
>  show_stack+0x14/0x1c
>  dump_stack+0xb4/0xf0
>  ___might_sleep+0x1fc/0x218
>  __might_sleep+0x70/0x80
>  synchronize_irq+0x40/0xa8
>  disable_irq+0x20/0x2c

Given that these things are CPU-affine, I reckon this should be
disable_irq_nosync. Mark?

Will

--->8

diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 0c2ed11c0603..f63db346c219 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -638,7 +638,7 @@ static int arm_perf_teardown_cpu(unsigned int cpu, struct 
hlist_node *node)
if (irq_is_percpu_devid(irq))
disable_percpu_irq(irq);
else
-   disable_irq(irq);
+   disable_irq_nosync(irq);
}
 
per_cpu(cpu_armpmu, cpu) = NULL;


Re: arm64: unhandled level 0 translation fault

2017-12-15 Thread Will Deacon
On Fri, Dec 15, 2017 at 04:59:28PM +0100, Geert Uytterhoeven wrote:
> On Fri, Dec 15, 2017 at 3:27 PM, Will Deacon <will.dea...@arm.com> wrote:
> > On Fri, Dec 15, 2017 at 02:30:00PM +0100, Geert Uytterhoeven wrote:
> >> On Fri, Dec 15, 2017 at 12:23 PM, Dave Martin <dave.mar...@arm.com> wrote:
> >> > The two important differences here seem to be
> >> >
> >> > 1) Staging the state via current->thread.fpsimd_state instead of loading
> >> > directly:
> >> >
> >> > -   fpsimd_load_state(state);
> >> > +   current->thread.fpsimd_state = *state;
> >> > +   fpsimd_load_state(>thread.fpsimd_state);
> >>
> >> The change above introduces the breakage.
> >
> > I finally managed to reproduce this, but only by using the exact same
> > compiler as Geert:
> >
> > https://www.kernel.org/pub/tools/crosstool/files/bin/x86_64/4.9.0/x86_64-gcc-4.9.0-nolibc_aarch64-linux.tar.xz
> >
> > I then reliably see the problem if I run:
> >
> >   # /usr/bin/update-ca-certificates
> 
> /usr/sbin/... ?
> 
> > from Debian Jessie.
> 
> Funny, I've just got both
> 
> *** Error in `/bin/sh': free(): invalid pointer: 0xc17d4988 ***
> 
> and
> 
> mountall.sh[2172]: unhandled level 0 translation fault (11) at
> 0x004d, esr 0x9204, in dash[ce7e5000+1a000]
> 
> during boot up, but I can't get update-ca-certificates to fail...

Can you try the diff below, please?

Will

--->8

diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 540a1e010eb5..fae81f7964b4 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -1043,7 +1043,7 @@ void fpsimd_update_current_state(struct fpsimd_state 
*state)
 
local_bh_disable();
 
-   current->thread.fpsimd_state = *state;
+   current->thread.fpsimd_state.user_fpsimd = state->user_fpsimd;
if (system_supports_sve() && test_thread_flag(TIF_SVE))
fpsimd_to_sve(current);


Re: arm64: unhandled level 0 translation fault

2017-12-15 Thread Will Deacon
On Fri, Dec 15, 2017 at 02:30:00PM +0100, Geert Uytterhoeven wrote:
> On Fri, Dec 15, 2017 at 12:23 PM, Dave Martin  wrote:
> > The two important differences here seem to be
> >
> > 1) Staging the state via current->thread.fpsimd_state instead of loading
> > directly:
> >
> > -   fpsimd_load_state(state);
> > +   current->thread.fpsimd_state = *state;
> > +   fpsimd_load_state(>thread.fpsimd_state);
> 
> The change above introduces the breakage.

I finally managed to reproduce this, but only by using the exact same
compiler as Geert:

https://www.kernel.org/pub/tools/crosstool/files/bin/x86_64/4.9.0/x86_64-gcc-4.9.0-nolibc_aarch64-linux.tar.xz

I then reliably see the problem if I run:

  # /usr/bin/update-ca-certificates

from Debian Jessie.

Note that my normal toolchain (Linaro 7.1.1 build) works fine and also
if I use the toolchain above but disable CONFIG_ARM64_CRYPTO then things
work too.

So there's some toolchain-specific interaction between this change and the
crypto code...

Will


Re: arm64: unhandled level 0 translation fault

2017-12-14 Thread Will Deacon
Hi Geert,

On Thu, Dec 14, 2017 at 03:34:50PM +0100, Geert Uytterhoeven wrote:
> On Tue, Dec 12, 2017 at 11:20 AM, Geert Uytterhoeven
>  wrote:
> > During userspace (Debian jessie NFS root) boot on arm64:
> >
> > rpcbind[1083]: unhandled level 0 translation fault (11) at 0x0008,
> > esr 0x9204, in dash[adf77000+1a000]
> > CPU: 0 PID: 1083 Comm: rpcbind Not tainted
> > 4.15.0-rc3-arm64-renesas-02176-g14f9a1826e48e355 #51
> > Hardware name: Renesas Salvator-X 2nd version board based on r8a7795 ES2.0+ 
> > (DT)
> 
> This is a quad Cortex A57.

It's so bizarre that nobody else is running into this!

> > pstate: 8000 (Nzcv daif -PAN -UAO)
> > pc : 0xadf8a51c
> > lr : 0xadf8ac08
> > sp : cffeac00
> > x29: cffeac00 x28: adfa1000
> > x27: cffebf7c x26: cffead20
> > x25: cea1c5f0 x24: 
> > x23: adfa1000 x22: adfa1000
> > x21:  x20: 0008
> > x19:  x18: cffeb500
> > x17: a22babfc x16: adfa1ae8
> > x15: a2363588 x14: 
> > x13: 0020 x12: 0010
> > x11: 0101010101010101 x10: adfa1000
> > x9 : ff81 x8 : adfa2000
> > x7 :  x6 : 
> > x5 : adfa2338 x4 : adfa2000
> > x3 : adfa2338 x2 : 
> > x1 : adfa28b0 x0 : adfa4c30
> >
> > Sometimes it happens with other processes, but the main address, esr, and
> > pstate values are always the same.
> >
> > I regularly run arm64/for-next/core (through bi-weekly renesas-drivers
> > releases, so the last time was two weeks ago), but never saw the issue
> > before until today, so probably v4.15-rc1 is OK.
> > Unfortunately it doesn't happen during every boot, which makes it
> > cumbersome to bisect.
> >
> > My first guess was UNMAP_KERNEL_AT_EL0, but even after disabling that,
> > and even without today's arm64/for-next/core merged in, I still managed to
> > reproduce the issue, so I believe it was introduced in v4.15-rc2 or
> > v4.15-rc3.
> >
> > Once, when the kernel message above wasn't shown, I got an error from
> > userspace, which may be related:
> > *** Error in `/bin/sh': free(): invalid pointer: 0xdd970988 ***
> 
> With more boots (10 instead of 6) to declare a kernel good, I bisected this
> to commit 9de52a755cfb6da5 ("arm64: fpsimd: Fix failure to restore FPSIMD
> state after signals").
> 
> Reverting that commit on top of v4.15-rc3 fixed the issue for me.

Thanks for persevering with the bisect. We'll get this fixed ASAP, but we'll
be relying on you to test the patch we come up with.

Cheers,

Will


Re: arm64: unhandled level 0 translation fault

2017-12-13 Thread Will Deacon
Hi Geert,

Thanks for trying to bisect this.

On Tue, Dec 12, 2017 at 09:54:05PM +0100, Geert Uytterhoeven wrote:
> On Tue, Dec 12, 2017 at 5:57 PM, Will Deacon <will.dea...@arm.com> wrote:
> > Do you reckon you can bisect between -rc1 and -rc2? We've been unable to
> > reproduce this on any of our systems, unfortunately.
> 
> I've tried, but ended up on an unrelated XFS merge commit. Probably I
> marked a few commits good due to not seeing this heisenbug.
> 
> For reference, here's the bisect log.
> 
> Bad commits showed one or both of "unhandled level 0 translation fault" and
> "invalid pointer". Good commits didn't show any during 6 tries.
> 
> git bisect start
> # bad: [ae64f9bd1d3621b5e60d7363bc20afb46aede215] Linux 4.15-rc2
> git bisect bad ae64f9bd1d3621b5e60d7363bc20afb46aede215
> # good: [4fbd8d194f06c8a3fd2af1ce560ddb31f7ec8323] Linux 4.15-rc1
> git bisect good 4fbd8d194f06c8a3fd2af1ce560ddb31f7ec8323
> # good: [9e0600f5cf6cecfcab5046d1453a9538c054d8a7] Merge tag
> 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
> git bisect good 9e0600f5cf6cecfcab5046d1453a9538c054d8a7
> # good: [503505bfea19b7d69e2572297e6defa0f9c2404e] Merge branch
> 'drm-fixes-4.15' of git://people.freedesktop.org/~agd5f/linux into
> drm-fixes
> git bisect good 503505bfea19b7d69e2572297e6defa0f9c2404e
> # good: [ae753ee2771a1bacade56411bb98037b2545c929] Merge tag
> 'afs-fixes-20171201' of
> git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
> git bisect good ae753ee2771a1bacade56411bb98037b2545c929
> # good: [e1ba1c99dad92c5917b22b1047cf36e4426b124a] Merge tag
> 'riscv-for-linus-4.15-rc2_cleanups' of
> git://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux
> git bisect good e1ba1c99dad92c5917b22b1047cf36e4426b124a

^^ This one is the first "good" commit containing the arm64-fixes pull.
Maybe try stressing it a bit more and see if it also fails?

That said, I'm still suspicious that nobody else is seeing this -- I also
checked the various build/boot farms and everything looks ok.

Will


Re: arm64: unhandled level 0 translation fault

2017-12-12 Thread Will Deacon
On Tue, Dec 12, 2017 at 05:00:33PM +0100, Geert Uytterhoeven wrote:
> On Tue, Dec 12, 2017 at 4:11 PM, Geert Uytterhoeven
> <ge...@linux-m68k.org> wrote:
> > On Tue, Dec 12, 2017 at 11:36 AM, Will Deacon <will.dea...@arm.com> wrote:
> >> On Tue, Dec 12, 2017 at 11:20:09AM +0100, Geert Uytterhoeven wrote:
> >>> During userspace (Debian jessie NFS root) boot on arm64:
> >>>
> >>> rpcbind[1083]: unhandled level 0 translation fault (11) at 0x0008,
> >>> esr 0x9204, in dash[adf77000+1a000]
> >>> CPU: 0 PID: 1083 Comm: rpcbind Not tainted
> >>> 4.15.0-rc3-arm64-renesas-02176-g14f9a1826e48e355 #51
> >>> Hardware name: Renesas Salvator-X 2nd version board based on r8a7795 
> >>> ES2.0+ (DT)
> >>> pstate: 8000 (Nzcv daif -PAN -UAO)
> >>> pc : 0xadf8a51c
> >>> lr : 0xadf8ac08
> >>> sp : cffeac00
> >>> x29: cffeac00 x28: adfa1000
> >>> x27: cffebf7c x26: cffead20
> >>> x25: cea1c5f0 x24: 
> >>> x23: adfa1000 x22: adfa1000
> >>> x21:  x20: 0008
> >>> x19:  x18: cffeb500
> >>> x17: a22babfc x16: adfa1ae8
> >>> x15: a2363588 x14: 
> >>> x13: 0020 x12: 0010
> >>> x11: 0101010101010101 x10: adfa1000
> >>> x9 : ff81 x8 : adfa2000
> >>> x7 :  x6 : 
> >>> x5 : adfa2338 x4 : adfa2000
> >>> x3 : adfa2338 x2 : 
> >>> x1 : adfa28b0 x0 : adfa4c30
> >>>
> >>> Sometimes it happens with other processes, but the main address, esr, and
> >>> pstate values are always the same.
> >>>
> >>> I regularly run arm64/for-next/core (through bi-weekly renesas-drivers
> >>> releases, so the last time was two weeks ago), but never saw the issue
> >>> before until today, so probably v4.15-rc1 is OK.
> >>> Unfortunately it doesn't happen during every boot, which makes it
> >>> cumbersome to bisect.
> >>>
> >>> My first guess was UNMAP_KERNEL_AT_EL0, but even after disabling that,
> >>> and even without today's arm64/for-next/core merged in, I still managed to
> >>> reproduce the issue, so I believe it was introduced in v4.15-rc2 or
> >>> v4.15-rc3.
> >>
> >> Urgh, this looks nasty. Thanks for the report! A few questions:
> >>
> >>  - Can you share your .config somewhere please?
> >
> > I managed to reproduce it on plain v4.15-rc3 using both arm64_defconfig, and
> > renesas_defconfig (from Simon's repo).
> 
> v4.15-rc2 is affected, too.

Do you reckon you can bisect between -rc1 and -rc2? We've been unable to
reproduce this on any of our systems, unfortunately.

Will


Re: arm64: unhandled level 0 translation fault

2017-12-12 Thread Will Deacon
Hi Geert,

On Tue, Dec 12, 2017 at 11:20:09AM +0100, Geert Uytterhoeven wrote:
> During userspace (Debian jessie NFS root) boot on arm64:
> 
> rpcbind[1083]: unhandled level 0 translation fault (11) at 0x0008,
> esr 0x9204, in dash[adf77000+1a000]
> CPU: 0 PID: 1083 Comm: rpcbind Not tainted
> 4.15.0-rc3-arm64-renesas-02176-g14f9a1826e48e355 #51
> Hardware name: Renesas Salvator-X 2nd version board based on r8a7795 ES2.0+ 
> (DT)
> pstate: 8000 (Nzcv daif -PAN -UAO)
> pc : 0xadf8a51c
> lr : 0xadf8ac08
> sp : cffeac00
> x29: cffeac00 x28: adfa1000
> x27: cffebf7c x26: cffead20
> x25: cea1c5f0 x24: 
> x23: adfa1000 x22: adfa1000
> x21:  x20: 0008
> x19:  x18: cffeb500
> x17: a22babfc x16: adfa1ae8
> x15: a2363588 x14: 
> x13: 0020 x12: 0010
> x11: 0101010101010101 x10: adfa1000
> x9 : ff81 x8 : adfa2000
> x7 :  x6 : 
> x5 : adfa2338 x4 : adfa2000
> x3 : adfa2338 x2 : 
> x1 : adfa28b0 x0 : adfa4c30
> 
> Sometimes it happens with other processes, but the main address, esr, and
> pstate values are always the same.
> 
> I regularly run arm64/for-next/core (through bi-weekly renesas-drivers
> releases, so the last time was two weeks ago), but never saw the issue
> before until today, so probably v4.15-rc1 is OK.
> Unfortunately it doesn't happen during every boot, which makes it
> cumbersome to bisect.
> 
> My first guess was UNMAP_KERNEL_AT_EL0, but even after disabling that,
> and even without today's arm64/for-next/core merged in, I still managed to
> reproduce the issue, so I believe it was introduced in v4.15-rc2 or
> v4.15-rc3.

Urgh, this looks nasty. Thanks for the report! A few questions:

 - Can you share your .config somewhere please?
 - What was your last known-good kernel?
 - Have you seen it on any other Soc?
 - What's the CPU in your SoC?

If I can reproduce the failure here, then I should be able to debug ASAP.

Cheers,

Will


Re: [PATCH V8 07/11] iommu: of: Handle IOMMU lookup failure with deferred probing or error

2017-05-15 Thread Will Deacon
Hi Sricharan,

On Wed, May 03, 2017 at 03:54:59PM +0530, Sricharan R wrote:
> On 5/3/2017 3:24 PM, Robin Murphy wrote:
> > On 02/05/17 19:35, Geert Uytterhoeven wrote:
> >> On Fri, Feb 3, 2017 at 4:48 PM, Sricharan R <sricha...@codeaurora.org> 
> >> wrote:
> >>> From: Laurent Pinchart <laurent.pinchart+rene...@ideasonboard.com>
> >>>
> >>> Failures to look up an IOMMU when parsing the DT iommus property need to
> >>> be handled separately from the .of_xlate() failures to support deferred
> >>> probing.
> >>>
> >>> The lack of a registered IOMMU can be caused by the lack of a driver for
> >>> the IOMMU, the IOMMU device probe not having been performed yet, having
> >>> been deferred, or having failed.
> >>>
> >>> The first case occurs when the device tree describes the bus master and
> >>> IOMMU topology correctly but no device driver exists for the IOMMU yet
> >>> or the device driver has not been compiled in. Return NULL, the caller
> >>> will configure the device without an IOMMU.
> >>>
> >>> The second and third cases are handled by deferring the probe of the bus
> >>> master device which will eventually get reprobed after the IOMMU.
> >>>
> >>> The last case is currently handled by deferring the probe of the bus
> >>> master device as well. A mechanism to either configure the bus master
> >>> device without an IOMMU or to fail the bus master device probe depending
> >>> on whether the IOMMU is optional or mandatory would be a good
> >>> enhancement.
> >>>
> >>> Tested-by: Marek Szyprowski <m.szyprow...@samsung.com>
> >>> Signed-off-by: Laurent Pichart <laurent.pinchart+rene...@ideasonboard.com>
> >>> Signed-off-by: Sricharan R <sricha...@codeaurora.org>
> >>
> >> This patch broke Renesas R-Car Gen3 platforms in renesas-drivers.
> >> As the IOMMU nodes in DT are not yet enabled, all devices having iommus
> >> properties in DT now fail to probe.
> > 
> > How exactly do they fail to probe? Per d7b0558230e4, if there are no ops
> > registered then they should merely defer until we reach the point of
> > giving up and ignoring the IOMMU. Is it just that you have no other
> > late-probing drivers or post-init module loads to kick the deferred
> > queue after that point? I did try to find a way to explicitly kick it
> > from a suitably late initcall, but there didn't seem to be any obvious
> > public interface - anyone have any suggestions?
> > 
> > I think that's more of a general problem with the probe deferral
> > mechanism itself (I've seen the same thing happen with some of the
> > CoreSight stuff on Juno due to the number of inter-component
> > dependencies) rather than any specific fault of this series.
> > 
> 
> I was thinking of an additional check like below to avoid the
> situation ?
> 
> From 499b6e662f60f23740b8880882b0a16f16434501 Mon Sep 17 00:00:00 2001
> From: Sricharan R <sricha...@codeaurora.org>
> Date: Wed, 3 May 2017 13:16:59 +0530
> Subject: [PATCH] iommu: of: Fix check for returning EPROBE_DEFER
> 
> While returning EPROBE_DEFER for iommu masters
> take in to account of iommu nodes that could be
> marked in DT as 'status=disabled', in which case
> simply return NULL and let the master's probe
> continue rather than deferring.
> 
> Signed-off-by: Sricharan R <sricha...@codeaurora.org>
> ---
>  drivers/iommu/of_iommu.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
> index 9f44ee8..e6e9bec 100644
> --- a/drivers/iommu/of_iommu.c
> +++ b/drivers/iommu/of_iommu.c
> @@ -118,6 +118,7 @@ static bool of_iommu_driver_present(struct device_node 
> *np)
> 
> ops = iommu_ops_from_fwnode(fwnode);
> if ((ops && !ops->of_xlate) ||
> +   !of_device_is_available(iommu_spec->np) ||
> (!ops && !of_iommu_driver_present(iommu_spec->np)))
> return NULL;

Without this patch, v4.12-rc1 hangs on my Juno waiting to mount the root
filesystem. The problem is that the USB controller is behind an SMMU which
is marked as 'status = "disabled"' in the devicetree. Whilst there was a
separate thread with Ard about exactly what this means in terms of the DMA
ops used by upstream devices, your patch above fixes the regression and
I think should go in regardless. The DMA ops issue will likely require
an additional DT binding anyway, to advertise the behaviour of the
IOMMU when it is disabled.

Tested-by: Will Deacon <will.dea...@arm.com>
Acked-by: Will Deacon <will.dea...@arm.com>

Could you resend it as a proper patch, please?

Will


Re: [PATCH/RESEND] arm64: defconfig: Enlarge CMA alignment to 2 MiB

2017-02-13 Thread Will Deacon
On Mon, Feb 13, 2017 at 12:04:18PM +0100, Geert Uytterhoeven wrote:
> Some IOMMUs (e.g. Renesas IPMMU/VMSA) support only page sizes of 4 KiB,
> 2 MiB, and 1 GiB.
> 
> With the default setting of CONFIG_CMA_ALIGNMENT = 8, allocations larger
> than 1 MiB are aligned to a 1 MiB boundary only.  Hence a 2 MiB
> allocation may not be aligned, leading to a mapping of 512 4 KiB pages.
> 
> Increase CONFIG_CMA_ALIGNMENT to allow mapping a 2 MiB buffer using a
> single PTE, decreasing memory usage and TLB pressure.
> 
> Signed-off-by: Geert Uytterhoeven 
> ---
> Is this useful?

I assume you're proposing it because you see an improvement? :)

> Should there instead be different defaults in Kconfig, depending on
> enabled platform support?

I don't object to updating defconfig as a quick hack, but the right solution
is probably to make the core Kconfig default value overridable by the
architecture.

Will


Re: [PATCH] drivers: firmware: psci: Use __pa_symbol for cpu_resume

2017-01-24 Thread Will Deacon
On Tue, Jan 24, 2017 at 03:33:51PM +, Mark Rutland wrote:
> On Tue, Jan 24, 2017 at 04:30:19PM +0100, Geert Uytterhoeven wrote:
> > If CONFIG_DEBUG_VIRTUAL=y, during s2ram:
> > 
> > virt_to_phys used for non-linear address: ff80085db280 
> > (cpu_resume+0x0/0x20)
> > [ cut here ]
> > WARNING: CPU: 0 PID: 1628 at arch/arm64/mm/physaddr.c:14 
> > __virt_to_phys+0x28/0x60
> > ...
> > [] __virt_to_phys+0x28/0x60
> > [] psci_system_suspend+0x20/0x44
> > [] cpu_suspend+0x3c/0x68
> > [] psci_system_suspend_enter+0x18/0x20
> > [] suspend_devices_and_enter+0x3f8/0x7e8
> > [] pm_suspend+0x544/0x5f4
> > 
> > Fixes: 1a08e3d9e0ac4577 ("drivers: firmware: psci: Use __pa_symbol for 
> > kernel symbol")
> > Signed-off-by: Geert Uytterhoeven 
> 
> Argh, I should have spotted this. :(
> 
> Acked-by: Mark Rutland 
> 
> This should go via the arm64 tree -- Will, could you pick this up? 

Yup, I'll queue this along with the other __pa_symbol fix.

Cheers,

Will


Re: [PATCH] arm64: Add support for DMA_ATTR_SKIP_CPU_SYNC attribute to swiotlb

2017-01-12 Thread Will Deacon
On Wed, Jan 11, 2017 at 11:11:17AM +0100, Geert Uytterhoeven wrote:
> From: Takeshi Kihara 
> 
> This patch adds support for DMA_ATTR_SKIP_CPU_SYNC attribute for
> dma_{un}map_{page,sg} functions family to swiotlb.
> 
> DMA_ATTR_SKIP_CPU_SYNC allows platform code to skip synchronization of
> the CPU cache for the given buffer assuming that it has been already
> transferred to 'device' domain.
> 
> Ported from IOMMU .{un}map_{sg,page} ops.
> 
> Signed-off-by: Takeshi Kihara 
> Acked-by: Laurent Pinchart 
> Signed-off-by: Geert Uytterhoeven 
> ---
> v2:
>   - Add Acked-by.
> 
> Support for DMA_ATTR_SKIP_CPU_SYNC was included when porting the IOMMU
> ops from arm to arm64 in commit 13b8629f651164d7 ("arm64: Add IOMMU
> dma_ops").
> 
> Presumably it was an oversight that the existing swiotlb based
> implementation didn't have support for DMA_ATTR_SKIP_CPU_SYNC yet?
> ---
>  arch/arm64/mm/dma-mapping.c | 12 
>  1 file changed, 8 insertions(+), 4 deletions(-)

Thanks. Applied for 4.11, with Robin's Reviewed-by.

Will


Re: [PATCH v2] arm64: do not set dma masks that device connection can't handle

2017-01-10 Thread Will Deacon
On Mon, Jan 09, 2017 at 10:30:02AM +0300, Nikita Yushchenko wrote:
> It is possible that device is capable of 64-bit DMA addresses, and
> device driver tries to set wide DMA mask, but bridge or bus used to
> connect device to the system can't handle wide addresses.
> 
> With swiotlb, memory above 4G still can be used by drivers for streaming
> DMA, but *dev->mask and dev->dma_coherent_mask must still keep values
> that hardware handles physically.
> 
> This patch enforces that. Based on original version by
> Arnd Bergmann , extended with coherent mask hadnling.
> 
> Signed-off-by: Nikita Yushchenko 
> CC: Arnd Bergmann 
> ---
> Changes since v1:
> - fixed issues noted by Sergei Shtylyov 
>   - save mask, not size
>   - remove doube empty line
> 
>  arch/arm64/Kconfig  |  3 +++
>  arch/arm64/include/asm/device.h |  1 +
>  arch/arm64/mm/dma-mapping.c | 51 
> +
>  3 files changed, 55 insertions(+)

I still don't think this patch is general enough. The problem you're seeing
with swiotlb seems to be exactly the same problem reported by Feng Kan over
at:

  
http://lkml.kernel.org/r/cal85gma_sscwm80tkdkzqee+s1bewzdevdki1kpkmutdrms...@mail.gmail.com

[read on; it was initially thought to be a hardware erratum, but it's
 actually the inability to restrict the DMA mask of the endpoint that's
 the problem]

The point here is that an IOMMU doesn't solve your issue, and the
IOMMU-backed DMA ops need the same treatment. In light of that, it really
feels to me like the DMA masks should be restricted in of_dma_configure
so that the parent mask is taken into account there, rather than hook
into each set of DMA ops to intercept set_dma_mask. We'd still need to
do something to stop dma_set_mask widening the mask if it was restricted
by of_dma_configure, but I think Robin (cc'd) was playing with that.

Will


Re: [PATCH 1/2] arm64: dma_mapping: allow PCI host driver to limit DMA mask

2017-01-03 Thread Will Deacon
On Thu, Dec 29, 2016 at 11:45:03PM +0300, Nikita Yushchenko wrote:
> It is possible that PCI device supports 64-bit DMA addressing, and thus
> it's driver sets device's dma_mask to DMA_BIT_MASK(64), however PCI host
> bridge has limitations on inbound transactions addressing. Example of
> such setup is NVME SSD device connected to RCAR PCIe controller.
> 
> Previously there was attempt to handle this via bus notifier: after
> driver is attached to PCI device, bridge driver gets notifier callback,
> and resets dma_mask from there. However, this is racy: PCI device driver
> could already allocate buffers and/or start i/o in probe routine.
> In NVME case, i/o is started in workqueue context, and this race gives
> "sometimes works, sometimes not" effect.
> 
> Proper solution should make driver's dma_set_mask() call to fail if host
> bridge can't support mask being set.
> 
> This patch makes __swiotlb_dma_supported() to check mask being set for
> PCI device against dma_mask of struct device corresponding to PCI host
> bridge (one with name "pci:YY"), if that dma_mask is set.
> 
> This is the least destructive approach: currently dma_mask of that device
> object is not used anyhow, thus all existing setups will work as before,
> and modification is required only in actually affected components -
> driver of particular PCI host bridge, and dma_map_ops of particular
> platform.
> 
> Signed-off-by: Nikita Yushchenko 
> ---
>  arch/arm64/mm/dma-mapping.c | 11 +++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
> index 290a84f..49645277 100644
> --- a/arch/arm64/mm/dma-mapping.c
> +++ b/arch/arm64/mm/dma-mapping.c
> @@ -28,6 +28,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  
> @@ -347,6 +348,16 @@ static int __swiotlb_get_sgtable(struct device *dev, 
> struct sg_table *sgt,
>  
>  static int __swiotlb_dma_supported(struct device *hwdev, u64 mask)
>  {
> +#ifdef CONFIG_PCI
> + if (dev_is_pci(hwdev)) {
> + struct pci_dev *pdev = to_pci_dev(hwdev);
> + struct pci_host_bridge *br = pci_find_host_bridge(pdev->bus);
> +
> + if (br->dev.dma_mask && (*br->dev.dma_mask) &&
> + (mask & (*br->dev.dma_mask)) != mask)
> + return 0;
> + }
> +#endif

Hmm, but this makes it look like the problem is both arm64 and swiotlb
specific, when in reality it's not. Perhaps another hack you could try
would be to register a PCI bus notifier in the host bridge looking for
BUS_NOTIFY_BIND_DRIVER, then you could proxy the DMA ops for each child
device before the driver has probed, but adding a dma_set_mask callback
to limit the mask to what you need?

I agree that it would be better if dma_set_mask handled all of this
transparently, but it's all based on the underlying ops rather than the
bus type.

Will


Re: [PATCH 4/7] bus: arm-cci: add missing of_node_put after calling of_parse_phandle

2016-07-04 Thread Will Deacon
On Mon, Jul 04, 2016 at 02:48:08AM +, Peter Chen wrote:
> >Please route this via arm-soc once you've addressed Suzuki's comment.
> 
> Hi Will, how to route patch via arm-soc?
> 
> Does it mean sending patch by adding a...@kernel.org at To or Cc list?

Yes. I just wanted to make it clear that I don't intend to pick this up
(since I'd just send it to arm-soc anyway!).

Will


Re: [PATCH 4/7] bus: arm-cci: add missing of_node_put after calling of_parse_phandle

2016-07-01 Thread Will Deacon
On Fri, Jul 01, 2016 at 11:29:58AM +0100, Suzuki K Poulose wrote:
> On 01/07/16 10:41, Peter Chen wrote:
> >of_node_put needs to be called when the device node which is got
> >from of_parse_phandle has finished using.
> >
> >Cc: Will Deacon <will.dea...@arm.com>
> >Cc: Suzuki K Poulose <suzuki.poul...@arm.com>
> >Signed-off-by: Peter Chen <peter.c...@nxp.com>
> 
> Thanks for the fix.
> 
> >---
> >  drivers/bus/arm-cci.c | 5 -
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> >
> >diff --git a/drivers/bus/arm-cci.c b/drivers/bus/arm-cci.c
> >index a49b283..e7b0b8c 100644
> >--- a/drivers/bus/arm-cci.c
> >+++ b/drivers/bus/arm-cci.c
> >@@ -1912,9 +1912,12 @@ static int __cci_ace_get_port(struct device_node *dn, 
> >int type)
> > cci_portn = of_parse_phandle(dn, "cci-control-port", 0);
> > for (i = 0; i < nb_cci_ports; i++) {
> > ace_match = ports[i].type == type;
> >-if (ace_match && cci_portn == ports[i].dn)
> >+if (ace_match && cci_portn == ports[i].dn) {
> >+of_node_put(cci_portn);
> > return i;
> >+}
> > }
> >+of_node_put(cci_portn);
> 
> nit: Could we please do some thing like this ?
>   if (ace_match && cci_portn == ports[i].dn)
>   break;
>   }
> 
>   of_node_put(cci_portn);
>   return (i < nb_cci_ports) ? i : -ENODEV ;
> 
> Either way,
> 
> Reviewed-by: Suzuki K Poulose <suzuki.poul...@arm.com>

Please route this via arm-soc once you've addressed Suzuki's comment.

Will


Re: [PATCH 2/2] iommu/io-pgtable-arm: use __dma_sync_single_for_device()

2016-05-09 Thread Will Deacon
On Sun, May 08, 2016 at 12:59:56PM +0200, Niklas Söderlund wrote:
> The call to dma_sync_single_for_device() can be reached from
> dma_map_single(). If CONFIG_DMA_API_DEBUG is enabled this would result
> in a check that the mapping being synced is valid. Since the call to
> dma_map_single is not yet completed the mapping is not recorded in
> dma-debug and the check fails and a warning is printed. Avoid this
> warning by calling __dma_sync_single_for_device() which don't preform
> this check.

Hmm, I don't understand why this would trigger that warning. The memory
being sync'd here is the page table memory, not the buffer being mapped.
The page table memory is "mapped" using dma_map_single in
__arm_lpae_alloc_pages, so it sounds like the issue something else.

Will